Choosecomp
See Also
- crossval, estimatefactors, pca, pls, plsda
Latest revision as of 21:26, 20 June 2013
Purpose
Returns a suggestion for the number of components to include in a model.
Synopsis
- lvs = choosecomp(model)
- lvs = choosecomp(model,options)
Description
Suggests a number of factors automatically, based on the information available in a given model. The approach depends on the model type:
- PCA
  - Without cross-validation: the selection looks for a "knee" (drop) in the eigenvalues. The PC just before the drop is selected.
  - With cross-validation: an initial suggestion is made from the eigenvalues (as described above). The suggestion is then refined by examining how RMSECV changes when factors are added or removed. If changing the number of factors improves RMSECV by more than a given percentage (relative to the maximum RMSECV observed), the suggestion is changed. The threshold for change is defined by the options (see below).
- PLS/PCR
  - Without cross-validation: no suggestion is made.
  - With cross-validation: a "knee" in RMSEC is searched for (if none is found, 1 LV is suggested). The suggestion is then refined using the RMSECV values and the search algorithm described above for PCA.
- PLSDA
  - Without cross-validation: no suggestion is made.
  - With cross-validation: an initial suggestion is determined by searching for a "knee" in the mean RMSECV (note the difference from PLS/PCR). This suggestion is then refined based on the mean misclassification error reported from cross-validation.
In all cases, a suggestion is offered only for models with more than 7 factors, and only if that suggestion includes less than 50% of the estimated rank of the data. No suggestion is made for unlisted model types.
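The two-stage idea described above (find a knee, then refine with RMSECV) can be sketched in plain MATLAB. This is only an illustration under stated assumptions, not the code choosecomp actually runs: the knee criterion (largest ratio between adjacent eigenvalues), the percent scaling of the relative CV, the 5% threshold, and all numbers are hypothetical.

```matlab
% Illustrative sketch only; values and criteria below are assumptions.
eigvals   = [5.2 3.1 0.40 0.30 0.25 0.20 0.15 0.10];  % hypothetical eigenvalues
rmsecv    = [1.00 0.80 0.50 0.48 0.47 0.47 0.46 0.46]; % hypothetical RMSECV curve
threshold = 5;  % assumed threshold, in percent of the maximum RMSECV

% Stage 1: take the "knee" to be the largest relative drop between
% adjacent eigenvalues (assumed criterion) and keep the component
% just before that drop.
drops = eigvals(1:end-1) ./ eigvals(2:end);
[~, lvs] = max(drops);

% Stage 2: express RMSECV relative to its maximum (in percent), then
% move the suggestion while adding or removing one factor improves the
% relative RMSECV by more than the threshold.
cv = 100 * rmsecv / max(rmsecv);
changed = true;
while changed
  changed = false;
  if lvs < numel(cv) && (cv(lvs) - cv(lvs+1)) > threshold
    lvs = lvs + 1;      % adding a factor helps enough
    changed = true;
  elseif lvs > 1 && (cv(lvs) - cv(lvs-1)) > threshold
    lvs = lvs - 1;      % removing a factor helps enough
    changed = true;
  end
end

fprintf('Suggested number of components: %d\n', lvs);
```

With these made-up numbers the knee gives an initial suggestion of 2, and the refinement moves it to 3 because the third factor lowers the relative RMSECV by well over the assumed threshold.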
Inputs
- model = standard model structure.
Optional Inputs
- options = options structure defined below.
Outputs
- lvs = suggested number of components. Empty ([ ]) if no suggestion can be made.
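Because lvs can come back empty, calling code may want to guard against that case before using the value. The snippet below is a minimal sketch: the empty value stands in for a real call to choosecomp so the example runs without the toolbox, and the fallback of 1 component is a hypothetical choice made by the caller, not something choosecomp defines.

```matlab
% Stand-in for:  lvs = choosecomp(model);
% An empty result means no suggestion could be made.
lvs = [];

fallback = 1;   % hypothetical default chosen by the caller

if isempty(lvs)
  ncomp = fallback;
  fprintf('No suggestion available; using %d component(s).\n', ncomp);
else
  ncomp = lvs;
  fprintf('Using suggested %d component(s).\n', ncomp);
end
```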
Options
The options structure can contain one or more of the following fields:
- plscvthreshold : [ ] Percent improvement in relative RMSECV required to change the number of LVs from the initial suggestion (for PLS models only). If not specified (i.e., passed as empty [ ]), the algorithm uses a threshold equal to the mean absolute difference between adjacent RMSECV values, i.e., mean(abs(diff(CV))), where CV is the relative CV.
- plsdacvthreshold : [ ] Same as above but used for PLSDA models.
- pcacvthreshold : [ ] Same as above but used for PCA models.
The default values for these options can also be set using the setplspref command or the preferences expert interface.
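As a rough illustration of the default threshold described above, the following evaluates mean(abs(diff(CV))) on a made-up RMSECV curve and builds an options structure that overrides it. The RMSECV values, the percent scaling used for the relative CV, and the override value of 2 are all assumptions for the example; the call to choosecomp is left commented out because it needs the toolbox and a cross-validated PLS model.

```matlab
% Hypothetical RMSECV values for 1..5 LVs (illustrative only).
rmsecv = [1.00 0.80 0.50 0.48 0.47];

% Relative CV, here taken as percent of the maximum RMSECV (assumed scaling).
cv = 100 * rmsecv / max(rmsecv);

% Default threshold used when the option is passed as empty [ ].
defaultThreshold = mean(abs(diff(cv)));
fprintf('Default threshold: %.2f\n', defaultThreshold);

% Overriding the default for PLS models with an explicit value.
options = struct('plscvthreshold', 2);
% lvs = choosecomp(model, options);   % requires a cross-validated PLS model
```

A smaller explicit threshold makes the suggestion easier to move away from the initial knee, while the data-driven default scales with how much the RMSECV curve varies overall.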