Sammon and Release Notes Version 8 6 1: Difference between pages

From Eigenvector Research Documentation Wiki
===Purpose===
==Changes and Bug Fixes in Version 8.6.1==


Computes the Sammon projection of multivariate data to a lower dimension.
{| {{table}}
| align="center" style="background:#f0f0f0;"|'''File'''
| align="center" style="background:#f0f0f0;"|'''Comment'''


===Synopsis===


|----valign="top"
|'''Variable Selection'''
|
* Fixes for using variable selections with more than one window open.


:p = sammon(x, ncolout, ''options'')
|----valign="top"
:p = sammon(x, p, ''options'')
|'''Context Menus'''
|
* Fixes for context menu positioning on high DPI systems.


===Description===


The Sammon algorithm maps points from a higher dimensional space to a lower dimensional space in such a way as to preserve the relative interpoint distances.
|----valign="top"
|'''[[asca]]'''
|
* Update ssq_tot calculation to use included samples only.


SAMMON is based on the algorithm described in:
|----valign="top"
Sammon, JW (1969): A nonlinear mapping for data structure analysis, IEEE Transactions on Computers 18: 401–409.
|'''[[dendrogram]]'''
Note that in the original paper, the term "y" is used for the projections. In this function, the projections are referred to as "p" to avoid confusion with regression functions.
|
* User can now choose to add the created cluster class to the x-block instead of only being allowed to overwrite an existing class.


====Inputs====
|----valign="top"
*'''x''' = (matrix or dataset) data to be projected.
|'''[[estimatefactors]]'''
*'''ncolout''' = number of output dimensions in the Sammon projection, OR
|
*'''p''' = the initial projection matrix, with size nrow by ncolout.
* Add check to avoid columns which have NaN std dev.


====Outputs====
|----valign="top"
* '''p''' = (matrix or dataset) projected coordinates of each data point. Dimension >= 2
|'''[[matchrows]]'''
|
* Add option for requiring unique labels.  


===Options===
|----valign="top"
|'''[[splitcaltest]]'''
|
* Fix bug that occurred when the replicates classset was not the first classset.


options = a structure array with one or more of the following fields:
|----valign="top"
 
|'''[[xlsreadr]]'''
* '''niterations''': number of iterations performed.
|
* '''maxseconds''': Maximum number of seconds allowed. Overrides niterations.
* Update to avoid using 'basic' mode to better handle date conversion from Excel to Matlab.
* '''alpha''': [0.2] Sammon's "magic factor"
|----
* '''D''': (matrix) Intersample Euclidean distance, size nrow x nrow.
|}
* '''plots''': [ 'none' | {'final'} ] governs level of plotting.
* '''display''': [ 'on' | {'off'} ] governs level of display to command window.
* '''maxsamples''': [ 2000 ] Maximum number of samples for which the Sammon projection will be calculated. If an array or dataset has more than this number of samples, the Sammon projections are returned as all NaNs, because the algorithm can be quite slow with many samples.
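
The fields above can be combined in a single options structure. The following is a minimal sketch that uses only the field names listed above; the specific values (such as the 30 second limit) are illustrative assumptions, not recommended settings:
<pre>
load arch
opts = struct;
opts.alpha      = 0.2;     % Sammon's "magic factor" (the listed default)
opts.maxseconds = 30;      % illustrative time limit; overrides niterations
opts.display    = 'on';    % enable display to the command window
opts.plots      = 'none';  % suppress the final plot
p = sammon(arch.data, 2, opts);
</pre>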
 
===Example===
 
This example code shows the use of SAMMON on the 'arch' dataset, reducing the dimensionality from 10 to 2 while trying to preserve the interpoint spacing. The resulting plot shows the four classes:
<pre>
load arch
opts.plots='final';
p = sammon(arch.data,2,opts)
</pre>
The resulting 75x2 array contains the Sammon projection of the 75x10 input array.
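
How well the interpoint distances were preserved can be checked with the stress measure from Sammon (1969): the sum over sample pairs of the squared difference between the original and projected distances, divided by the original distance, and normalized by the sum of the original distances. The following is a minimal sketch using only base MATLAB. It is not taken from the SAMMON source, the variables <code>x</code> and <code>p</code> are stand-ins for illustration, and it assumes no two samples are identical (which would put a zero distance in the denominator):
<pre>
% Stand-in data and projection (replace with real data and SAMMON output)
x = rand(20,5);          % 20 samples by 5 variables
p = x(:,1:2);            % hypothetical 2-D projection

n  = size(x,1);
Dx = zeros(n);           % interpoint distances in the original space
Dp = zeros(n);           % interpoint distances in the projected space
for i = 1:n
  for j = 1:n
    Dx(i,j) = norm(x(i,:) - x(j,:));
    Dp(i,j) = norm(p(i,:) - p(j,:));
  end
end

mask   = triu(ones(n),1) > 0;                 % count each pair i<j once
dx     = Dx(mask);
dp     = Dp(mask);
stress = sum((dx - dp).^2 ./ dx) / sum(dx);   % Sammon's stress
</pre>
A stress of 0 would mean that every interpoint distance is reproduced exactly. To evaluate an actual projection, replace the stand-ins with the data matrix and the array returned by SAMMON.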
 
If the input data is a dataset with sample classes, the plot shows the projected points colored according to their class. Here we see five classes: the four 'arch' classes plus class 0, the unknowns.
<pre>
load arch
opts.plots='final';
p = sammon(arch,2,opts)
colorbar
</pre>
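
The second calling form in the Synopsis takes an initial projection matrix in place of ncolout. A minimal sketch, assuming that any numeric nrow by ncolout matrix is accepted as a starting point (the random start below is an illustrative choice, not a documented recommendation):
<pre>
load arch
p0 = randn(size(arch.data,1), 2);   % hypothetical initial projection, nrow by 2
opts.plots = 'none';
p = sammon(arch.data, p0, opts);    % start the projection from p0
</pre>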
 
===See Also===
 
[[PCA]]

Revision as of 11:28, 22 February 2018

Changes and Bug Fixes in Version 8.6.1

File Comment


Variable Selection
  • Fixes for using variable selections with more than one window open.
Context Menus
  • Fixes for context menu positioning on high DPI systems.


asca
  • Update ssq_tot calculation to use included samples only.
dendrogram
  • User can now choose to add the created cluster class to the x-block instead of only being allowed to overwrite an existing class.
estimatefactors
  • Add check to avoid columns which have NaN std dev.
matchrows
  • Add option for requiring unique labels.
splitcaltest
  • Fix bug that occurred when the replicates classset was not the first classset.
xlsreadr
  • Update to avoid using 'basic' mode to better handle date conversion from Excel to Matlab.