Auto

From Eigenvector Research Documentation Wiki
Revision as of 14:24, 3 September 2008 by imported>Jeremy (Importing text file)

Purpose

Autoscales a matrix to mean zero and unit variance.

Synopsis

[ax,mx,stdx,msg] = auto(x,options)
[ax,mx,stdx,msg] = auto(x,offset)
options = auto('options')

Description

[ax,mx,stdx] = auto(x) autoscales the matrix x and returns the resulting matrix ax, whose columns have zero mean and unit variance, together with a vector of means mx and a vector of standard deviations stdx used in the scaling. Output msg returns any warning messages. If missing data (NaNs) are found, the available data are autoscaled, provided the fraction missing does not exceed the thresholds specified below. mx and stdx can be used to scale new data (see SCALE).
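
As a rough illustration of the computation described above, the following sketch autoscales a small matrix using only base MATLAB functions and then applies the same means and standard deviations to new data. It is not the toolbox implementation: it assumes the standard deviation is normalized by n-1, ignores missing data and all options, and the names xnew and axnew are hypothetical.

```matlab
% Sketch of autoscaling with base MATLAB only (not the toolbox code).
x = [1 10; 2 20; 3 60];          % 3 samples (rows) by 2 variables (columns)

mx   = mean(x,1);                % row vector of column means
stdx = std(x,0,1);               % row vector of column standard deviations (n-1)

m  = size(x,1);
ax = (x - repmat(mx,m,1)) ./ repmat(stdx,m,1);   % mean-zero, unit-variance columns

% Apply the calibration means and standard deviations to new data.
xnew  = [2 30];                  % hypothetical new sample (equal to the column means)
axnew = (xnew - mx) ./ stdx;     % scaled with the calibration statistics
```

Because xnew equals the column means of x, axnew comes out as zeros. New data are scaled with the calibration mx and stdx rather than with their own statistics.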

Options

options = a structure array with the following fields:

  • offset: scaling can use standard deviation plus an offset {default = 0}.
  • display: [ {'off'} | 'on' ] governs level of display to the command window.
  • matrix_threshold: fraction of missing data allowed based on the entire matrix (x) {default = 0.15}.
  • column_threshold: fraction of missing data allowed based on a single column {default = 0.25}.
  • algorithm: [ {'standard'} | 'robust' ] scaling algorithm. 'robust' uses MADC for scaling and the median instead of the mean. It should be used for robust techniques.
  • stdthreshold: [ 0 ] scalar or vector of standard deviation threshold values. If a standard deviation is below its corresponding threshold value, the threshold value is used in lieu of the actual value. Note that the actual standard deviation is always returned, whether or not it exceeds the threshold. A scalar value is used as the threshold for all variables.
  • badreplacement: [ 0 ] value to use in place of standard deviation values of 0 (zero). Typical values and their effects:
      • 0 = Any value in the given variable is set to zero. The variable is effectively excluded (but still expected by the model). This is also the behavior when badreplacement = inf.
      • 1 = Values different from the mean of the given variable are flagged in Q residuals with no reweighting.
      • Values >0 and <inf give the variable a different weighting in the Q residuals (values >1 down-weight the bad variables for Q residual calculations; values <1 up-weight them).
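
The matrix_threshold and column_threshold options can be pictured with the sketch below, which uses base MATLAB only. It assumes one plausible reading of the two thresholds (the data are acceptable when neither fraction is exceeded) and computes column statistics from the non-missing values. How the toolbox itself treats the missing entries is not described here, so this should not be taken as its implementation.

```matlab
% Sketch of the missing-data check (an assumed reading of the two thresholds).
x = [1 4; NaN 5; 3 6; 5 7; 7 8; 9 9; 11 10; 13 11];  % one NaN out of 16 values

matrix_threshold = 0.15;         % documented default
column_threshold = 0.25;         % documented default

miss        = isnan(x);
frac_matrix = sum(miss(:)) / numel(x);     % fraction missing in the whole matrix
frac_column = sum(miss,1) / size(x,1);     % fraction missing in each column

ok = (frac_matrix <= matrix_threshold) && all(frac_column <= column_threshold);

if ok
  % Column statistics from the available (non-NaN) values only.
  n    = sum(~miss,1);                     % number of available values per column
  x0   = x;  x0(miss) = 0;                 % zero-fill so sums skip missing entries
  mx   = sum(x0,1) ./ n;
  d    = x - repmat(mx,size(x,1),1);  d(miss) = 0;
  stdx = sqrt(sum(d.^2,1) ./ (n-1));
end
```

In this example 1 of 16 values is missing overall and 1 of 8 in the first column, so both fractions are under the default thresholds.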

If the second input is a scalar (offset), it is used as the offset value and all other options are set to their default values.

The optional input offset is added to the standard deviations before scaling and can be used to suppress low-level variables that would otherwise have standard deviations near zero.
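
The effect of the offset can be seen in the following sketch, written with base MATLAB functions rather than the toolbox routine. The data and the offset value of 1 are hypothetical and chosen only to make the contrast visible.

```matlab
% Sketch of the offset effect (base MATLAB, not the toolbox code).
x = [1.00 10; 1.01 20; 0.99 30];   % column 1 is a low-level, nearly constant variable
offset = 1;                        % hypothetical offset value

mx   = mean(x,1);
stdx = std(x,0,1);                 % approximately [0.01 10]
m    = size(x,1);
xc   = x - repmat(mx,m,1);         % mean-centered data

ax0 = xc ./ repmat(stdx,m,1);            % no offset: both columns span about -1 to 1
ax1 = xc ./ repmat(stdx + offset,m,1);   % offset added to the standard deviations
```

Without the offset, the tiny variation in column 1 is inflated to the same scale as column 2. With the offset, column 1 stays near zero while column 2 is only slightly reduced.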

The default options can be retrieved using: options = auto('options');

See Also

gscale, medcn, mncn, normaliz, npreprocess, regcon, rescale, scale, snv