Sammon
Purpose
Computes the Sammon projection of multivariate data to a lower dimension.
Synopsis
- p = sammon(x, ncolout, options)
- p = sammon(x, p, options)
Description
The Sammon algorithm maps points from a higher dimensional space to a lower dimensional space in such a way as to preserve the relative interpoint distances.
SAMMON is based on the algorithm described in: Sammon, JW (1969): A nonlinear mapping for data structure analysis, IEEE Transactions on Computers 18: 401–409. Note that the original paper uses the term "y" for the projections. In this function, the projections are referred to as "p" to avoid confusion with regression functions.
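For reference, the quantity minimized in the cited paper can be written as follows. This is a sketch of the published formulation using this page's "p" notation, not a statement of how this function is implemented internally. The symbols d*_ij (distance between samples i and j in the original space) and d_ij (distance between their projections) follow the paper; identifying the paper's "magic factor" with the alpha option is an assumption based on that option's description.

```latex
% Sammon's stress: mismatch between original and projected interpoint distances
E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}}

% Iterative update of coordinate k of projected point i at iteration m,
% with step size \alpha (the "magic factor")
p_{ik}(m+1) = p_{ik}(m) - \alpha \, \Delta_{ik}(m),
\qquad
\Delta_{ik}(m) = \frac{\partial E(m) / \partial p_{ik}(m)}{\left| \partial^{2} E(m) / \partial p_{ik}(m)^{2} \right|}
```

Because each squared difference is divided by d*_ij, errors in small original distances are weighted more heavily, which is why the mapping tends to preserve local structure.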
Inputs
- x = (matrix or dataset) data to be projected.
- ncolout = number of output dimensions in the Sammon projection, OR
- p = the initial projection matrix, with size nrow by ncolout.
Optional Inputs
- options = (structure) optional settings controlling the projection; see Options below.
Outputs
- p = (matrix or dataset) projected coordinates of each data point. Dimension >= 2
Options
options = a structure array with one or more of the following fields:
- niterations: number of iterations to perform.
- maxseconds: Maximum number of seconds allowed. Overrides niterations.
- alpha: [0.2] Sammon's "magic factor"
- D: (matrix) Intersample Euclidean distance, size nrow x nrow.
- plots: [ 'none' | {'final'} ] governs level of plotting.
- display: [ 'on' | {'off'} ] governs level of display to command window.
- maxsamples: [ 2000 ] Maximum number of samples for which the Sammon projection will be calculated. If an array or dataset has more than this number of samples, the Sammon projections are returned as all NaNs, because the algorithm can be quite slow with many samples.
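The options above can be combined in a single structure. The following is a minimal sketch, not a definitive recipe: it assumes the toolbox providing sammon and the 'arch' data is on the MATLAB path, that a plain struct with only some fields set is accepted (as in the example below), and that a distance matrix supplied in D is used in place of one computed internally. The maxseconds value is purely illustrative.

```matlab
% Sketch only: assumes sammon and the 'arch' demo data are on the path.
load arch

x = arch.data;                 % numeric data block of the 'arch' dataset

% Pairwise Euclidean distances using base MATLAB only (no extra toolboxes).
sq = sum(x.^2, 2);             % squared norm of each row (nrow x 1)
D  = sqrt(max(bsxfun(@plus, sq, sq') - 2*(x*x'), 0));  % nrow x nrow; max(...,0) guards against round-off

opts = struct();
opts.alpha      = 0.2;         % documented default for the "magic factor"
opts.maxseconds = 30;          % illustrative time limit; overrides niterations
opts.D          = D;           % assumption: reused instead of being recomputed
opts.plots      = 'none';      % suppress plotting
opts.display    = 'on';        % report progress in the command window

p = sammon(x, 2, opts);        % project to 2 dimensions
```

Supplying D this way is mainly useful when the same distance matrix is reused across several runs, for example when comparing different alpha values.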
Example
This example code shows the use of SAMMON on the 'arch' dataset, reducing the dimensionality from 10 to 2 while trying to preserve the interpoint spacing. The resulting plot shows the four classes.
load arch
opts.plots = 'final';
p = sammon(arch.data,2,opts)
The resulting 75x2 array contains the Sammon projection of the 75x10 input array.
If the input data is a dataset with sample classes, then the plot produced shows the projected points colored according to their class. Here we see five classes, the four 'arch' classes plus class 0, the unknowns.
p = sammon(arch,2,opts)
colorbar