Wednesday, April 29, 2009

R library som

Self-organizing maps for clustering; they need a large number of iterations.
A filtering function in the package,

filtering()

is very helpful for flooring and capping the values in the input data table.
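To show what flooring and capping means here, a minimal base-R sketch follows. It is not the som package's own code: `clamp_table`, `lt`, and `ut` are hypothetical names picked for this illustration, and the thresholds are arbitrary.

```r
# Base-R illustration of flooring and capping a numeric table.
# This is NOT the som package's implementation; clamp_table, lt and ut
# are hypothetical names used only for this sketch.
clamp_table <- function(x, lt, ut) {
  # pmin() caps values above `ut`; pmax() then floors values below `lt`.
  # Both keep the attributes of their first argument, so a matrix stays a matrix.
  pmax(pmin(x, ut), lt)
}

expr <- matrix(c(5, 150, 30000, 42, 8, 900), nrow = 2)
clamped <- clamp_table(expr, lt = 20, ut = 16000)
print(clamped)
```

Clamping like this keeps extreme cells from dominating the distances the map is trained on.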

Wednesday, April 22, 2009

Corr

mantel.rtest {ade4}

This provides a comparison of two distance matrices.
Still looking for a good way of translating correlation to distance.
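For what it's worth, a few conversions are commonly used, sketched below in base R. Which one is "good" depends on whether negative correlation should count as far apart or as close, so treat this as a list of options rather than a recommendation. The data and variable names are made up for the example.

```r
# Sketch: three common ways to turn a correlation matrix into distances.
# The data are simulated; nothing here comes from ade4.
set.seed(1)
x <- matrix(rnorm(50), nrow = 10)      # 10 observations, 5 variables
colnames(x) <- paste0("v", 1:5)
r <- cor(x)

# 1 - r: negatively correlated variables end up far apart (range 0 to 2).
d_signed <- as.dist(1 - r)

# 1 - |r|: strong correlation of either sign counts as close (range 0 to 1).
d_abs <- as.dist(1 - abs(r))

# sqrt(2 * (1 - r)): Euclidean distance between the centered, unit-length
# variables. pmax() guards against tiny negative values from rounding.
d_euclid <- as.dist(sqrt(2 * pmax(1 - r, 0)))
```

Each result is a "dist" object, which is the kind of input mantel.rtest() takes for its two matrices.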

mahalanobis {stats}

Calculates the Mahalanobis distance.

Calculating the Mahalanobis distance requires a covariance estimate, which can be either the classical cov or a robust covariance estimate.
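A small sketch of both routes, on simulated data with one planted outlier. The classical version uses only stats. For the robust version I am assuming MASS::cov.rob as the estimator, which is just one possible choice. Note that mahalanobis() returns squared distances.

```r
# Sketch: Mahalanobis distances with a classical and a robust covariance.
# Data are simulated; the outlier in row 1 is planted on purpose.
set.seed(42)
x <- matrix(rnorm(200), ncol = 2)
x[1, ] <- c(8, -8)

# Classical estimate: sample mean and sample covariance.
# mahalanobis() returns SQUARED distances.
d2_classic <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# Robust estimate (assumption: MASS::cov.rob with the MCD method;
# other robust estimators would work the same way).
if (requireNamespace("MASS", quietly = TRUE)) {
  rob <- MASS::cov.rob(x, method = "mcd")
  d2_robust <- mahalanobis(x, center = rob$center, cov = rob$cov)
}

which.max(d2_classic)   # index of the most distant point
```

With a robust estimate the outlier does not inflate the covariance, so its distance usually stands out more clearly than with the classical one.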

Monday, April 20, 2009

Procrustes Analysis

Procrustes analysis: procrustes() in vegan provides Procrustes analysis. The package also provides functions for ordination; further information on that area is given in the Environmetrics task view. Generalised Procrustes analysis via GPA() is available from FactoMineR.
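To make the idea concrete, here is a base-R sketch of the rotation step only: find the orthogonal matrix that best maps one centered configuration onto another. It is not what vegan or FactoMineR run internally (they also handle scaling and more), and `procrustes_rotation` is a hypothetical name for this example.

```r
# Sketch of orthogonal Procrustes rotation in base R (rotation only,
# no scaling). procrustes_rotation is a hypothetical helper, not vegan's API.
procrustes_rotation <- function(X, Y) {
  # Center both configurations so translation is removed.
  Xc <- scale(X, center = TRUE, scale = FALSE)
  Yc <- scale(Y, center = TRUE, scale = FALSE)
  # With t(Yc) %*% Xc = U D t(V), the orthogonal matrix minimising
  # the sum of squares of Xc - Yc %*% R is R = U t(V).
  s <- svd(crossprod(Yc, Xc))
  R <- s$u %*% t(s$v)
  list(rotation = R, fitted = Yc %*% R, target = Xc)
}

# Toy check: X is Y rotated by 30 degrees, so the rotation should be recovered.
theta <- pi / 6
rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2)
Y <- cbind(c(0, 1, 2, 3, 5), c(1, 0, 4, 2, 3))
X <- Y %*% rot
fit <- procrustes_rotation(X, Y)
max(abs(fit$fitted - fit$target))   # residual after rotation
```

The solution is allowed to include a reflection; restricting it to pure rotations needs an extra sign correction that is left out here.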

Wednesday, April 15, 2009

clues package for clustering evaluation

clues calculates five different agreement indexes for comparing two clusterings or classifications.

adjustedRand(cl1, cl2, randMethod = c("Rand", "HA", "MA", "FM", "Jaccard"))

By contrast, adjustedRandIndex() from mclust computes only one index, the adjusted Rand index.
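For reference, two of these indexes are easy to compute by hand from the contingency table of the two labelings. The base-R sketch below covers the plain Rand index and the adjusted Rand index in the Hubert and Arabie form, which I take to be what "Rand" and "HA" stand for. `rand_indexes` is a hypothetical helper, not part of clues or mclust.

```r
# Sketch: Rand and Hubert-Arabie adjusted Rand index in base R.
# rand_indexes is a hypothetical helper, not the clues implementation.
rand_indexes <- function(cl1, cl2) {
  tab <- table(cl1, cl2)
  n_pairs   <- choose(sum(tab), 2)                  # all pairs of objects
  same_both <- sum(choose(as.vector(tab), 2))       # together in both labelings
  same_1    <- sum(choose(rowSums(tab), 2))         # together in cl1
  same_2    <- sum(choose(colSums(tab), 2))         # together in cl2

  # Rand: share of pairs on which the two labelings agree.
  rand <- (n_pairs + 2 * same_both - same_1 - same_2) / n_pairs

  # Adjusted Rand: corrected for the agreement expected by chance.
  expected <- same_1 * same_2 / n_pairs
  adjusted <- (same_both - expected) / ((same_1 + same_2) / 2 - expected)

  c(Rand = rand, HA = adjusted)
}

same  <- rand_indexes(c(1, 1, 2, 2), c("a", "a", "b", "b"))
cross <- rand_indexes(c(1, 1, 2, 2), c(1, 2, 1, 2))
print(rbind(same, cross))
```

Degenerate inputs, such as both labelings putting everything in one cluster, give 0/0 in the adjusted index and are not handled here.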

Saturday, April 4, 2009

ggplot2: Great plot libs

Just put up a post first. More details coming up soon.

Wednesday, April 1, 2009

Robust correlations

The traditional correlation measures are not suitable for noisy data or data with outliers. Non-parametric methods like Kendall and Spearman do a slightly better job than Pearson, but not enough.
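A toy illustration with base R's cor(): 19 perfectly correlated points plus one gross outlier. The data are made up for the example. The single outlier flips the sign of Pearson, while the rank-based measures stay positive but are still pulled well below the value of 1 that the other 19 points would give.

```r
# Sketch: effect of one gross outlier on three correlation measures.
# The data are constructed by hand for illustration.
x <- 1:20
y <- as.numeric(x)   # perfectly correlated with x ...
y[20] <- -100        # ... except for one gross outlier

r <- sapply(c("pearson", "kendall", "spearman"),
            function(m) cor(x, y, method = m))
print(round(r, 3))
```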

The R package robust provides nice robust correlation methods, covRob:

R LINK

Other robust methods can be found here:
Robust Task View

For outlier removal, you may refer to the outliers package. Honestly, it doesn't do a good job at all.