Autoscale Settings GUI

From Eigenvector Research Documentation Wiki
Revision as of 12:05, 13 October 2008 by imported>Jeremy (Autoset moved to Autoscale Settings GUI)

The Autoscale Settings GUI allows you to select options relevant to the use of autoscaling with your data. Autoscaling is an operation in which the mean of each variable (column) of your data is subtracted from that column, and the column is then divided by its standard deviation. The result is that each variable has a mean of zero and a standard deviation of one. The settings in the GUI primarily control how autoscale handles zero or near-zero standard deviations. There are three controls, which can be used alone or together as described below.
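The basic operation can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the code behind the GUI; the function name `autoscale` and the use of the sample (n-1) standard deviation are assumptions made for the example.

```python
from statistics import mean, stdev

def autoscale(rows):
    """Mean-center each column, then divide it by its standard deviation.

    Hypothetical helper for illustration; assumes the sample (n-1)
    standard deviation and at least two rows of data.
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [stdev(c) for c in columns]
    scaled = [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds

# Two variables (columns) measured on three samples (rows).
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled, means, stds = autoscale(data)
```

Note that a column with a standard deviation of zero would cause a division by zero in this sketch, which is the situation the three controls below are meant to handle.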

Scaling Offset:

The value specified in this control is added to every standard deviation, regardless of its magnitude. The net effect is that variables with a zero standard deviation are divided by the offset itself, and variables with a near-zero standard deviation are divided by a value only slightly larger than the offset. Variables with standard deviations much larger than the offset are essentially unaffected by the offset value. This setting is usually used alone, without any of the other settings in the GUI. It is applied first, thereby causing the settings for the other controls to be ignored.
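A small numeric sketch may make the effect of the offset easier to see. The function name `add_scaling_offset` and the example values are hypothetical and chosen only for illustration.

```python
def add_scaling_offset(stds, offset):
    """Return the divisors obtained by adding the offset to every standard deviation."""
    return [s + offset for s in stds]

# One zero, one near-zero, and one large standard deviation.
stds = [0.0, 0.001, 50.0]
divisors = add_scaling_offset(stds, 0.1)

# Relative change of the divisor for the large-variance variable.
relative_change = (divisors[2] - stds[2]) / stds[2]
```

With these values the first two variables end up divided by roughly the offset (0.1), while the divisor of the third changes by about 0.2 percent.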

Bad Scale Replacement:

The value specified is used in place of any standard deviation that is exactly zero. Any variable with a zero standard deviation (a "bad scale" variable) is assigned this value instead. Any variable with a non-zero standard deviation, even a near-zero one, is not affected by this setting. The magnitude of this value indicates how strongly (if at all) future variations in "bad scale" variables should be weighted when calculating Q residuals. If this value is either "inf" (infinite) or 0 (zero), any future variation in "bad scale" variables (e.g. in future measurements and test data) is ignored completely. Effectively, the variable is excluded. Any finite, non-zero value indicates how strongly future variations in "bad scale" variables are included. A value of 1 indicates that the raw variable value will be used without scaling when calculating Q.

Bad Scale Replacement can be used with Scaling Threshold. See below for details.
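The replacement rule can be sketched as follows. The names `replace_bad_scales` and `scale_value` are hypothetical, and mapping a setting of 0 to an infinite divisor is an assumption used here to model "ignored completely"; the actual software may exclude such variables by other means.

```python
import math

def replace_bad_scales(stds, replacement):
    """Substitute the replacement value for standard deviations that are exactly zero."""
    # Assumption: 0 and inf both mean "ignore the variable", modeled as
    # an infinite divisor so that every scaled value becomes zero.
    if replacement == 0:
        replacement = math.inf
    return [replacement if s == 0 else s for s in stds]

def scale_value(x, mean, divisor):
    """Scale one new measurement with a stored mean and divisor."""
    return (x - mean) / divisor

stds = [0.0, 0.001, 2.0]             # only the first variable is "bad scale"
kept = replace_bad_scales(stds, 1.0)
ignored = replace_bad_scales(stds, 0)

# A future sample deviates by 2 units in the "bad scale" variable (mean 3).
weighted = scale_value(5.0, 3.0, kept[0])       # deviation kept at raw size
dropped = scale_value(5.0, 3.0, ignored[0])     # deviation has no effect
```

The near-zero standard deviation (0.001) is left untouched in both cases, matching the statement that only exactly-zero values are replaced.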

Scaling Threshold:

This setting determines the minimum allowable scaling for the variables. If the standard deviation of a given variable is below the threshold value, the threshold is used in place of the determined scaling. A value of zero disables this feature. The threshold can be either a scalar (single value), which sets the lower threshold for all variables, or a vector equal in length to the total number of variables in the original data, in which case each element of the vector specifies the scaling threshold for the corresponding variable. The "Load" button allows you to load a vector of scaling values into this field.

If a vector is supplied and some of its threshold values are zero, the Bad Scale Replacement value is used for any of the corresponding variables that have a standard deviation of zero. Thus, the Bad Scale Replacement value is essentially a default value to use if no Threshold value is supplied for a given variable.
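The interaction between the threshold and the Bad Scale Replacement value can be sketched as below. The function `resolve_divisors` is a hypothetical helper written for this example; it follows the rules stated above and is not taken from the software itself.

```python
def resolve_divisors(stds, threshold, bad_replacement):
    """Pick the divisor for each variable from its standard deviation,
    its threshold, and the bad scale replacement value."""
    # A scalar threshold applies to every variable.
    if isinstance(threshold, (int, float)):
        threshold = [threshold] * len(stds)
    if len(threshold) != len(stds):
        raise ValueError("threshold vector length must match the number of variables")

    divisors = []
    for s, t in zip(stds, threshold):
        if t > 0 and s < t:
            divisors.append(t)                # threshold replaces a too-small scale
        elif s == 0:
            divisors.append(bad_replacement)  # no threshold given: use the default
        else:
            divisors.append(s)                # scale is acceptable as determined
    return divisors

stds = [0.0, 0.0, 0.02, 3.0]
threshold = [0.5, 0.0, 0.1, 0.1]
divisors = resolve_divisors(stds, threshold, bad_replacement=1.0)
```

Here the first variable takes its threshold (0.5), the second has no threshold and falls back to the replacement value (1.0), the third is raised to its threshold (0.1), and the fourth keeps its own standard deviation (3.0).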

Note that, if a vector is supplied as the Scaling Threshold, its length (number of values) must equal either the number of included variables in the data or the total number of variables (before any variables are excluded). In the latter case, Autoscale automatically discards the values that correspond to excluded variables.
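The length rule can be illustrated with a short sketch. The function `select_thresholds`, its arguments, and the zero-based index list `included` are all hypothetical; they only model the behavior described in the note.

```python
def select_thresholds(threshold, n_total, included):
    """Return one threshold per included variable.

    threshold: vector of threshold values
    n_total:   total number of variables before any exclusion
    included:  zero-based indices of the variables that are included
    """
    if len(threshold) == len(included):
        # Already one value per included variable.
        return list(threshold)
    if len(threshold) == n_total:
        # One value per original variable: drop those for excluded variables.
        return [threshold[i] for i in included]
    raise ValueError("threshold length matches neither included nor total variables")

included = [0, 2, 4]   # variables 1 and 3 are excluded
from_full = select_thresholds([0.1, 0.2, 0.3, 0.4, 0.5], 5, included)
from_short = select_thresholds([0.1, 0.3, 0.5], 5, included)
```

Both calls yield the same three thresholds, while a vector of any other length is rejected.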

See Also

auto