IPLS Variable Selection Interface
imported>Scott No edit summary |
imported>Jeremy No edit summary |
Revision as of 11:03, 20 June 2011
Stepwise Interval Variable Selection Panel
The Interval Variable Selection panel in the Analysis Window provides access to the interval variable selection functionality.
Variable selection provides a simple way to discard variables which may be adding complexity to a model. Discarding such variables may improve performance of a final model. Variable selection can also be used as an exploratory tool to help identify variables which are the most interesting for a given task. For a detailed description of how interval variable selection works and practical considerations, see Interval PLS for Variable Selection. The following documentation describes the basic use of the controls on the Interval Variable Selection Panel.
Note that in addition to the controls on the panel, the current preprocessing and cross-validation settings (see the Preprocess and Tools menus in the Analysis Window) are also used by the Interval Variable Selection algorithm.
Mode:
Defines the variable selection direction. "Forward" starts with no variables selected and looks for the best single interval to add. "Reverse" starts with all variables selected and looks for the worst single interval to discard. See the "No. of Intervals" option below to add or remove more than one interval. See the "Interval Size" option to add or remove "blocks" of variables in each interval.
No. of Intervals:
Defines the total number of intervals to be added or removed. When a specific number is entered, the algorithm continues to add or remove intervals until it reaches this number. A value of 1 limits the algorithm to a single interval, a value of 2 selects the two best intervals to add (or the two worst to remove), and so on. When "Automatic" is checked, the algorithm continues to add or remove intervals until the cross-validation results cease to improve.
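The interaction between Mode and No. of Intervals can be illustrated with a minimal Python sketch of a greedy stepwise search. This is not the software's own code. The function name select_intervals, its arguments, and the cv_error callable (a stand-in for the cross-validated error of a model built on a given set of variables, lower being better) are all hypothetical, and skipping the empty variable set in reverse mode is an assumption made for the sketch.

```python
def select_intervals(intervals, cv_error, mode="forward", n_intervals=None):
    """Greedy stepwise interval selection (illustrative sketch only).

    intervals   : list of lists of variable indices
    cv_error    : hypothetical callable, sorted variable list -> error (lower is better)
    mode        : "forward" adds intervals, "reverse" discards them
    n_intervals : number of intervals to add/remove, or None for "Automatic"

    Returns the interval indices that were added (forward) or removed
    (reverse), in the order they were chosen.
    """
    if mode not in ("forward", "reverse"):
        raise ValueError("mode must be 'forward' or 'reverse'")

    def variables(interval_ids):
        # Union of all variables covered by the given intervals.
        return sorted({v for i in interval_ids for v in intervals[i]})

    remaining = list(range(len(intervals)))
    chosen = []
    if mode == "forward":
        in_use = set()                 # start with no variables selected
        best_error = float("inf")
    else:
        in_use = set(remaining)        # start with all variables selected
        best_error = cv_error(variables(in_use))

    limit = len(intervals) if n_intervals is None else min(n_intervals, len(intervals))
    while len(chosen) < limit:
        trials = []
        for i in remaining:
            trial = in_use | {i} if mode == "forward" else in_use - {i}
            if not trial:
                continue               # assumption: never evaluate an empty variable set
            trials.append((cv_error(variables(trial)), i))
        if not trials:
            break
        error, pick = min(trials)      # best interval to add / worst interval to discard
        if n_intervals is None and error >= best_error:
            break                      # "Automatic": stop when results cease to improve
        best_error = error
        chosen.append(pick)
        remaining.remove(pick)
        in_use = in_use | {pick} if mode == "forward" else in_use - {pick}
    return chosen
```

With a fixed n_intervals the loop runs for that many steps even if the error gets worse, whereas passing None mirrors the "Automatic" checkbox by stopping at the first step that fails to improve on the best error seen so far.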
Interval Size (vars):
Specifies how many adjacent variables should be included in each interval. For example, a value of 5 will cause each interval to contain a set of 5 adjacent variables which must be added or removed as a group. A value of one (the default) will cause each interval to contain a single variable.
Step Size:
Specifies the number of variables between the start of one interval and the start of the next. When the Step Size is less than the Interval Size, intervals "overlap" and variables may belong to more than one interval (this allows a "sliding window" style of variable selection). When it is greater than the Interval Size, there is a gap of unused variables between intervals (useful for quick, coarse selection of variable windows). The automatic setting produces neither overlaps nor gaps between intervals.
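As a rough illustration of how Interval Size and Step Size combine, the following Python sketch builds the list of intervals for a given number of variables. The function make_intervals and its argument names are hypothetical, indices are 0-based, and shortening the trailing interval at the end of the variable range is an assumption; the software may treat the edges differently.

```python
def make_intervals(n_vars, interval_size=1, step_size=None):
    """Return intervals as lists of 0-based variable indices (illustrative sketch).

    step_size=None plays the role of the automatic setting: the step equals
    the interval size, so intervals neither overlap nor leave gaps.
    """
    if interval_size < 1:
        raise ValueError("interval_size must be at least 1")
    if step_size is None:
        step_size = interval_size
    if step_size < 1:
        raise ValueError("step_size must be at least 1")

    intervals = []
    for start in range(0, n_vars, step_size):
        # Assumption: an interval running past the last variable is truncated.
        stop = min(start + interval_size, n_vars)
        intervals.append(list(range(start, stop)))
    return intervals


# Step smaller than the interval size: overlapping "sliding window" intervals.
sliding = make_intervals(6, interval_size=2, step_size=1)

# Step larger than the interval size: gaps of unused variables between intervals.
coarse = make_intervals(10, interval_size=2, step_size=5)
```

Here sliding contains variable 1 in both of its first two intervals, while coarse leaves variables 2 to 4 and 7 to 9 out of every interval.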
Algorithm:
Specifies the regression algorithm to use during variable selection. By default this will match the current Analysis method (PLS, PLSDA, PCR, or MLR).
Max LVs:
For IPLS, IPLSDA, and IPCR, this setting defines the maximum number of Latent Variables or Principal Components which can be used for each variable subset mode