Exteriorpts and Manhattandist: Difference between pages

From Eigenvector Research Documentation Wiki
imported>Neal
 
imported>Benjamin
===Purpose===

* '''exteriorpts''': Finds points on the exterior of a normalized data space.
* '''manhattandist''': Calculates Manhattan Distance between Samples (rows) of a Dataset Object (DSO) or a matrix.


===Synopsis===


:[isel,loads] = exteriorpts(x,ncomp,options)
:distances = manhattandist(x)
:distances = manhattandist(x,basis)
:distances = manhattandist(x,options)
:distances = manhattandist(x,basis,options)


===Description===

'''exteriorpts''': Given a two-way or higher-order data set (X), the most exterior samples or variables are identified and their indices returned.

For a two-way data set, the data (X) are assumed to be modelable as:

<tt>X = CS' + E</tt>

The method works as follows. First, note that non-negative data all lie in a multivariate analog of the upper right-hand quadrant and that, given sufficient selectivity in the data, the pure-component spectra (a.k.a. end-members) must lie at the exterior of the data cloud.
:A) First, normalize each data point to unit 1-norm, which constrains the responses to a hyper-plane, and
:B) remove data points with low norm (those most likely to be affected by noise) [see options.minnorm]. (An alternative is to add a small offset to all the data to 'push them' towards the center of the data cloud.)
At this point the data are transformed from looking like a "snow-cone" with its point at the origin to looking like a "hyper-pyramid" with the end-members corresponding to the corners.
:C) Next, the 1-normed data are mean-centered so that the hyper-plane has its center at [0,0,...]. This procedure transforms the problem from finding points on the exterior of a data cloud to finding points at the vertices of a hyper-polygon, which is done using the DISTSLCT function (called from EXTERIORPTS).

'''manhattandist''': Calculates the Manhattan distance (the sum of the absolute differences) from each row to every other row in the supplied matrix or, optionally, from all rows of (x) to all rows of a second matrix (basis).


====Inputs====

'''exteriorpts''':
* '''x''' = MxN matrix.
* '''ncomp''' = number of components to extract.

'''manhattandist''':
* '''x''' = A DSO or a matrix.


====Optional Inputs====

'''exteriorpts''':
* '''options''' = a standard options structure containing one or more of the fields discussed in the Options section below.

'''manhattandist''':
* '''basis''' = A second DSO/matrix to compare the first DSO/matrix against when calculating Manhattan distance.
* '''options''' = Discussed below.
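
Steps A) to C) of the exteriorpts description can be sketched in a few lines. This is only an illustration in plain Python, not the toolbox code: the function name <tt>normalize_for_exterior_search</tt> is hypothetical, the low-norm threshold is assumed to be <tt>minnorm</tt> times the largest row 1-norm (the toolbox may define it differently), and the vertex selection performed by DISTSLCT is not reproduced.

```python
def normalize_for_exterior_search(x, minnorm=0.03):
    """Sketch of steps A to C for the rows of a non-negative matrix x.

    Returns (centered_rows, kept_indices). Assumes at least one row of x
    is non-zero and that all rows have the same number of columns.
    """
    # 1-norm (sum of absolute values) of every row
    norms = [sum(abs(v) for v in row) for row in x]

    # Step B: drop low-norm rows. ASSUMPTION: the threshold is relative
    # to the largest row norm; all-zero rows are always dropped.
    threshold = minnorm * max(norms)
    kept = [i for i, n in enumerate(norms) if n > 0 and n >= threshold]

    # Step A: scale each kept row to unit 1-norm (rows now lie on a hyper-plane)
    scaled = [[v / norms[i] for v in x[i]] for i in kept]

    # Step C: mean-center so the hyper-plane is centered at [0, 0, ...]
    ncols = len(x[0])
    means = [sum(row[k] for row in scaled) / len(scaled) for k in range(ncols)]
    centered = [[row[k] - means[k] for k in range(ncols)] for row in scaled]
    return centered, kept


# Two "pure" rows, one 50/50 mixture, and one near-zero (noisy) row
data = [[2, 0, 0], [0, 3, 0], [1, 1, 0], [0.01, 0, 0]]
centered, kept = normalize_for_exterior_search(data)
print(kept)  # [0, 1, 2]
```

In this small example the near-zero row is discarded, the mixture lands at the center of the transformed data, and the two pure rows end up farthest from it, which is the situation a vertex search would then exploit.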


====Outputs====

'''exteriorpts''':
* '''isel''' = if the selectdim option is non-empty, isel is a vector of the selected indices. Otherwise, isel is a cell array with the indices selected on each mode of the data.
* '''loads''' = cell array with the extracted points/factors. Modes other than selectdim are determined via projection.

'''manhattandist''':
* '''distances''' = An m-by-m matrix containing the Manhattan distances calculated between all pairs of samples.


===Options===

options = a structure array with the following fields:

'''exteriorpts''':
* '''selectdim''': [1] mode of the data from which items should be selected (i.e. 1=rows, 2=columns, ...). If empty [], all modes are analyzed and the mode with the largest sum-squared captured value is used.
* '''waitbar''': [ 'off' | 'on' | {'auto'} ] governs use of the waitbar while processing. 'auto' uses the waitbar only if multiple modes are being analyzed with n-way data.
* '''minnorm''': [ 0.03 ] approximate noise level; points with unit area smaller than this (as a fraction of the maximum value in x) are ignored during selection.
* '''usepca''': [{'no'}| 'yes' ] governs use of PCA as a pre-filtering step on the data prior to selection.
* '''usennls''': [{'no'}| 'yes' ] governs use of non-negative least squares when calculating loadings for other-than-sample modes. Only used when the (loads) output is requested.
* '''distmeasure''': [ {'Euclidian'} | 'Mahalanobis' ] governs the type of distance measurement to use. Mahalanobis requires the usepca option to be 'yes'.
* '''samplemode''': [ 1 ] mode that contains variance (factors for other modes are normalized to unit 2-norm). Only used when the (loads) output is requested.

'''manhattandist''':
* '''waitbar''': [{'auto'}| 'on' | 'off' ] governs display of the waitbar. The 'auto' setting displays a waitbar automatically if the computation takes longer than 3 seconds.
* '''diag''': {default: 0} defines the value used when comparing a sample to itself. Technically this distance is zero; however, in some instances an alternate value (e.g., inf) is useful for flagging these self-calculated distances.


===See Also===
[[als]], [[distslct]], [[mcr]], [[parafac]], [[purity]], [[purityengine]]

Revision as of 13:16, 15 August 2017

Purpose

Calculates Manhattan Distance between Samples (rows) of a Dataset Object (DSO) or a matrix.

Synopsis

distances = manhattandist(x)
distances = manhattandist(x,basis)
distances = manhattandist(x,options)
distances = manhattandist(x,basis,options)

Description

Calculates the Manhattan distance (the sum of the absolute differences) from each row to every other row in the supplied matrix or, optionally, from all rows of (x) to all rows of a second matrix (basis).
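
To make the definition concrete, the following plain-Python sketch computes the same quantity for small lists of rows. It is an illustration only, not the toolbox implementation: the name manhattan_distances is hypothetical, and the shape of the result when a basis is supplied (one row per row of x, one column per row of basis) is an assumption, since the page only documents the m-by-m case.

```python
def manhattan_distances(x, basis=None):
    """Return D with D[i][j] = sum over k of |x[i][k] - basis[j][k]|.

    If basis is omitted, rows of x are compared with each other.
    Assumes all rows have the same number of columns.
    """
    rows_b = x if basis is None else basis
    return [[sum(abs(a - b) for a, b in zip(row_x, row_b)) for row_b in rows_b]
            for row_x in x]


x = [[0, 0], [1, 2], [3, 1]]
print(manhattan_distances(x))            # [[0, 3, 4], [3, 0, 3], [4, 3, 0]]
print(manhattan_distances(x, [[1, 1]]))  # [[2], [1], [2]]
```

Without a basis the result is symmetric with zeros on the diagonal, because every row is at distance zero from itself.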

Inputs

  • x = A DSO or a matrix.

Optional Inputs

  • basis = A second DSO/matrix to compare the first DSO/matrix against when calculating Manhattan distance.
  • options = Discussed below.

Outputs

  • distances = An m-by-m matrix containing the Manhattan distances calculated between all pairs of samples.

Options

options = A structure array with the following fields:

  • waitbar: [{'auto'}| 'on' | 'off' ] Governs display of the waitbar. The 'auto' setting displays a waitbar automatically if the computation takes longer than 3 seconds.
  • diag: {default: 0} Defines the value used when comparing a sample to itself. Technically this distance is zero; however, in some instances an alternate value (e.g., inf) is useful for flagging these self-calculated distances.
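
One situation where a non-zero diag value helps is a nearest-neighbor search, where a self-distance of zero would otherwise always win. The sketch below mimics that behavior in plain Python; manhattan_with_diag is a hypothetical stand-in written for this example, not the toolbox function, and it only assumes that diag replaces the self-distance entries as described above.

```python
import math


def manhattan_with_diag(x, diag=0):
    """Pairwise Manhattan distances between rows of x, with the
    self-distance entries (i == j) replaced by the value diag."""
    n = len(x)
    return [[diag if i == j else sum(abs(a - b) for a, b in zip(x[i], x[j]))
             for j in range(n)]
            for i in range(n)]


x = [[0, 0], [1, 2], [4, 1]]
d = manhattan_with_diag(x, diag=math.inf)

# With inf on the diagonal, the row-wise minimum is the closest *other* sample
nearest = [row.index(min(row)) for row in d]
print(nearest)  # [1, 0, 1]
```

With the default diag of 0, the same minimum search would return each sample's own index instead.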

See Also