Sammon

From Eigenvector Research Documentation Wiki

Revision as of 19:44, 21 October 2013

Purpose

Computes the Sammon projection of multivariate data onto a lower-dimensional space.

Synopsis

p = sammon(x, ncolout, options)
p = sammon(x, p, options)

Description

The Sammon algorithm maps points from a higher dimensional space to a lower dimensional space in such a way as to preserve the relative interpoint distances.

SAMMON is based on the algorithm described in: Sammon, JW (1969): "A nonlinear mapping for data structure analysis", IEEE Transactions on Computers 18: 401–409. Note that the original paper uses the term "y" for the projections. In this function, the projections are referred to as "p" to avoid confusion with regression functions.
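
For reference, the objective in Sammon (1969) can be written in this page's notation as follows. Here d*_ij is the distance between samples i and j in the original space and d_ij is the distance between their projections p_i and p_j. The page does not state the objective this function uses, so treating it as exactly this form is an assumption. Identifying the paper's "magic factor" with the alpha option follows the Options list.

```latex
% Sammon stress: mismatch between original and projected interpoint distances
E = \frac{1}{\sum_{i<j} d^{*}_{ij}}
    \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}},
\qquad d_{ij} = \lVert p_i - p_j \rVert

% Iterative update of coordinate k of point i at iteration m,
% with alpha playing the role of the "magic factor"
p_{ik}(m+1) = p_{ik}(m) - \alpha \,
    \frac{\partial E(m) / \partial p_{ik}(m)}
         {\left| \partial^{2} E(m) / \partial p_{ik}(m)^{2} \right|}
```

Dividing each squared difference by d*_ij weights small original distances more heavily, which is why the mapping tends to preserve local structure.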

Inputs

  • x = (matrix or dataset) data to be projected.
  • ncolout = number of output dimensions in the Sammon projection, OR
  • p = the initial projection matrix, with size nrow by ncolout.
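
The second calling form takes a starting configuration instead of a dimension count. The sketch below builds one from the first two principal component scores, computed with base MATLAB `svd`. This initialization is only an illustration, not something this page prescribes, and it assumes `arch.data` contains no missing values. The variable names `xc` and `p0` are hypothetical.

```matlab
load arch                              % 'arch' demo dataset used in the Example section
x = arch.data;                         % numeric data matrix, one row per sample

% Assumed starting point: scores on the first two principal components.
xc = bsxfun(@minus, x, mean(x, 1));    % mean-center each column
[u, s, ~] = svd(xc, 'econ');           % economy-size singular value decomposition
p0 = u(:, 1:2) * s(1:2, 1:2);          % nrow x ncolout initial projection (ncolout = 2)

opts.plots = 'none';                   % suppress plotting for this sketch
p = sammon(x, p0, opts);               % second synopsis form: p = sammon(x, p, options)
```

Because `p0` already has ncolout columns, no separate dimension argument is passed.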

Outputs

  • p = (matrix or dataset) projected coordinates of each data point. Dimension >= 2

Options

options = a structure array with one or more of the following fields:

  • niterations: number of iterations performed.
  • maxseconds: Maximum number of seconds allowed. Overrides niterations.
  • alpha: [0.2] Sammon's "magic factor"
  • D: (matrix) Intersample Euclidean distance, size nrow x nrow.
  • plots: [ 'none' | {'final'} ] governs level of plotting.
  • display: [ 'on' | {'off'} ] governs level of display to command window.
  • maxsamples: [ 2000 ] Maximum number of samples for which the Sammon projection will be calculated. If an array or dataset has more than this number of samples, the Sammon projections are returned as all NaNs, because the algorithm can be quite slow with many samples.
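
As a minimal sketch, the fields above can be set by plain field assignment, as the Example section does for `plots`. The values for `niterations` and `maxseconds` are illustrative, not documented defaults. Whether supplying `D` makes the function skip its own distance calculation is an assumption. The distance matrix here is computed with base MATLAB only.

```matlab
load arch                          % 'arch' demo dataset used in the Example section
x = arch.data;                     % numeric data matrix, one row per sample

% Pairwise Euclidean distances (nrow x nrow) from squared row norms.
sq = sum(x.^2, 2);
D  = sqrt(max(bsxfun(@plus, sq, sq') - 2*(x*x'), 0));  % clamp round-off negatives to 0
D(1:size(D,1)+1:end) = 0;          % force exact zeros on the diagonal

opts = struct();
opts.niterations = 200;            % illustrative value, not a documented default
opts.maxseconds  = 30;             % illustrative time limit; overrides niterations
opts.alpha       = 0.2;            % Sammon's "magic factor" (default listed above)
opts.D           = D;              % precomputed intersample distances
opts.plots       = 'none';         % no plotting
opts.display     = 'off';          % no command-window output
opts.maxsamples  = 2000;           % above this many samples the result is all NaN

p = sammon(x, 2, opts);

% Detect the all-NaN result described for maxsamples.
if all(isnan(p(:)))
    warning('Projection is all NaN; the data may exceed opts.maxsamples.');
end
```

Checking for an all-NaN result before plotting or further analysis avoids silently carrying an empty projection forward.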

Example

This example code shows the use of SAMMON on the 'arch' dataset, reducing the dimensionality from 10 to 2 while trying to preserve the interpoint spacing. The resulting plot shows the four classes:

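load arch                        % load the 'arch' demo dataset
opts.plots='final';              % plot the final projection
p = sammon(arch.data,2,opts);    % project the 75x10 data matrix to 2 dimensions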

The resulting 75x2 array contains the Sammon projection of the 75x10 input array.
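
One way to judge how well the interpoint spacing was preserved is to compare pairwise distances before and after projection. The sketch below computes a normalized stress: the squared distance differences, each divided by the original distance, summed over all pairs and divided by the total original distance. Lower values mean better preservation. The helper `pd` and the other variable names are hypothetical, and the function itself is not documented here as returning this quantity.

```matlab
load arch                          % 'arch' demo dataset
x = arch.data;                     % 75x10 data matrix per this page
opts.plots = 'none';               % suppress plotting for this sketch
p = sammon(x, 2, opts);            % 75x2 projection

% Hypothetical helper: pairwise Euclidean distances between the rows of a.
pd = @(a) sqrt(max(bsxfun(@plus, sum(a.^2,2), sum(a.^2,2)') - 2*(a*a'), 0));

Dx = pd(x);                        % distances in the original space
Dp = pd(p);                        % distances in the projected space

mask = triu(true(size(Dx)), 1);    % select each pair once (i < j)
dx = Dx(mask);
dp = Dp(mask);

keep = dx > 0;                     % skip coincident samples to avoid division by zero
E = sum((dx(keep) - dp(keep)).^2 ./ dx(keep)) / sum(dx(keep));
fprintf('Normalized stress: %g\n', E);
```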

If the input data is a dataset with sample classes then the produced plot shows the projected points colored according to their class. Here we see five classes, the four 'arch' classes plus class 0, the unknowns.

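load arch                   % load the 'arch' demo dataset
opts.plots='final';         % plot the final projection
p = sammon(arch,2,opts);    % pass the dataset itself so class information is available
colorbar                    % add a color scale for the class colors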

See Also

PCA