IPLS Variable Selection Interface
Stepwise Interval Variable Selection Panel
The Interval Variable Selection panel in the Analysis Window provides access to the interval variable selection functionality.
Variable selection provides a simple way to discard variables which may be adding complexity to a model. Discarding such variables may improve performance of a final model. Variable selection can also be used as an exploratory tool to help identify variables which are the most interesting for a given task. For a detailed description of how interval variable selection works and practical considerations, see Interval PLS for Variable Selection. The following documentation describes the basic use of the controls on the Interval Variable Selection Panel.
Note that in addition to the controls on the panel, the Interval Variable Selection algorithm also uses the current preprocessing and cross-validation settings of the Analysis window.
MustUse
Indices of variables which must be used in all models.
Mode
(Note: Mode is only visible if the "All Options" checkbox has been checked.) Defines the variable selection direction. "Forward" starts with no variables selected and looks for the best single interval to add. "Reverse" starts with all variables selected and looks for the worst single interval to discard. See the "No. of Intervals" option below to add or remove more than one interval. See the "Interval Size" option to add or remove "blocks" of variables in each interval.
Step Size
(Note: Step Size is only visible if the "All Options" checkbox has been checked.) Specifies the number of variables between the start of one interval and the start of the next. When the Step Size is less than the Interval Size, intervals will "overlap" and variables may belong to more than one interval (this allows a "sliding window" style of variable selection). When it is greater than the Interval Size, there will be a gap of unused variables between intervals (this can be used for quick, coarse selection of variable windows). The automatic setting produces intervals with neither overlap nor gaps.
Algorithm
Defines the regression algorithm to use.
No. of Intervals
Defines the total number of intervals to be added or removed. When a specific number is entered, the algorithm will continue to add or remove intervals up to this number. For example, a value of 1 limits the algorithm to selecting only one interval to add or remove, a value of 2 looks for the two best (or worst) intervals to add (or remove), and so on. When "Automatic" is checked, the algorithm will continue to add or remove intervals until the cross-validation results cease to improve.
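The interaction between Mode, No. of Intervals, and MustUse can be illustrated with a short greedy-search sketch. This is not the toolbox's code: the function select_intervals, its arguments, and the toy score function are hypothetical names introduced here, and score merely stands in for a cross-validated error (lower is better). How the real algorithm handles ties or an empty variable set is not described above, so the choices made below are assumptions.

```python
def select_intervals(intervals, score, mode="forward", n_intervals=None, must_use=()):
    """Greedy interval selection (illustrative sketch only).

    intervals   : list of lists of variable indices
    score       : callable(variables) -> number, lower is better
                  (hypothetical stand-in for a cross-validated error)
    mode        : "forward" adds the best interval each step,
                  "reverse" discards the worst interval each step
    n_intervals : number of intervals to add/remove; None mimics "Automatic"
    must_use    : variable indices kept in every candidate model
    Returns (selected_variables, chosen_interval_indices).
    """
    all_ids = list(range(len(intervals)))

    def active_after(chosen):
        # Forward: chosen intervals are in use. Reverse: chosen intervals are discarded.
        if mode == "forward":
            return chosen
        return [i for i in all_ids if i not in chosen]

    def variables(active):
        used = set(must_use)
        for i in active:
            used.update(intervals[i])
        return sorted(used)

    chosen, remaining = [], list(all_ids)
    start = variables(active_after(chosen))
    # Assumption: an empty starting set has no score, so any first interval improves on it.
    best = score(start) if start else float("inf")
    limit = len(intervals) if n_intervals is None else n_intervals

    while remaining and len(chosen) < limit:
        trials = []
        for i in remaining:
            candidate = variables(active_after(chosen + [i]))
            if candidate:  # assumption: never evaluate a model with no variables
                trials.append((score(candidate), i))
        if not trials:
            break
        trial_score, pick = min(trials)
        # "Automatic": stop as soon as the score ceases to improve.
        if n_intervals is None and trial_score >= best:
            break
        chosen.append(pick)
        remaining.remove(pick)
        best = trial_score

    return variables(active_after(chosen)), chosen


# Toy example: variables 2, 3, 6 and 7 are "useful"; the score penalizes
# each missing useful variable by 10 and each extra variable by 1.
USEFUL = {2, 3, 6, 7}

def toy_score(variables):
    used = set(variables)
    return 10 * len(USEFUL - used) + len(used - USEFUL)

toy_intervals = [[0, 1], [2, 3], [4, 5], [6, 7]]
forward_vars, forward_added = select_intervals(toy_intervals, toy_score, mode="forward")
reverse_vars, reverse_removed = select_intervals(toy_intervals, toy_score, mode="reverse")
```

In this toy case both directions arrive at the same variables, but by different routes: the forward search adds two intervals, while the reverse search discards the other two. With a real cross-validated error the two modes need not agree.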
Interval Size (vars)
Specifies how many adjacent variables should be included in each interval. For example, a value of 5 will cause each interval to contain a set of 5 adjacent variables which must be added or removed as a group. A value of one (the default) will cause each interval to contain a single variable.
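How Interval Size and Step Size together determine the intervals can be sketched in a few lines. The helper build_intervals below is a hypothetical illustration, not part of the toolbox. It uses 0-based indices and assumes that an interval running past the last variable is simply truncated, which the description above does not specify.

```python
def build_intervals(n_vars, interval_size=1, step=None):
    """Return a list of intervals, each a list of 0-based variable indices.

    step=None mimics the automatic setting: the step equals the interval
    size, so intervals neither overlap nor leave gaps.
    """
    if step is None:
        step = interval_size
    if interval_size < 1 or step < 1:
        raise ValueError("interval_size and step must be positive")
    intervals = []
    for start in range(0, n_vars, step):
        # Assumption: a final interval that would run past the end is truncated.
        stop = min(start + interval_size, n_vars)
        intervals.append(list(range(start, stop)))
    return intervals


contiguous = build_intervals(10, interval_size=5)        # two blocks of 5 variables
sliding = build_intervals(6, interval_size=3, step=2)    # step < size: overlapping windows
coarse = build_intervals(10, interval_size=2, step=5)    # step > size: gaps between windows
```

Here contiguous covers every variable exactly once, sliding places variables 2 and 4 in two intervals each, and coarse leaves variables 2 through 4 and 7 through 9 outside every interval.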
Max LVs
Max LVs is set via CrossVal; see CrossVal Interface.