Rpls variable selection interface

From Eigenvector Research Documentation Wiki

Recursive Variable Selection Panel

The Recursive Variable Selection panel in the Analysis Window provides access to the recursive variable selection functionality.

Variable selection provides a simple way to discard variables that may be adding complexity and/or noise to a model. Discarding such variables may improve the performance of the final model. Variable selection can also be used as an exploratory tool to help identify the variables that are most interesting for a given task. For a detailed description of how recursive variable selection works, along with practical considerations, see Recursive PLS for Variable Selection. The following documentation describes the basic use of the controls on the Recursive Variable Selection Panel.



Currently, three modes are available for recursive variable selection: "specified", "suggested", and "surveyed". "Specified" performs recursive variable selection only at the designated Max LVs. "Suggested" first performs PLS with cross-validation on the entire dataset to determine an appropriate number of latent variables, then proceeds with the RPLS algorithm. "Surveyed" performs RPLS for every model size from 1 LV up to Max LVs. Note that "Surveyed" is slower than "Suggested" but might produce slightly better results.
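
The difference between the three modes comes down to which numbers of latent variables are tried. The following Python sketch is only an illustration of that idea and is not the toolbox's implementation: the function name candidate_lv_counts and the argument cv_choice (standing in for the LV count found by the initial PLS and cross-validation pass) are hypothetical, and capping the suggested value at Max LVs is an assumption.

```python
def candidate_lv_counts(mode, max_lvs, cv_choice=None):
    """Return the LV counts a run would try under each mode (illustrative only)."""
    if max_lvs < 1:
        raise ValueError("max_lvs must be at least 1")
    if mode == "specified":
        # Only the designated Max LVs is used.
        return [max_lvs]
    if mode == "suggested":
        # cv_choice is a placeholder for the LV count picked by an initial
        # PLS + cross-validation pass on the full dataset (not computed here).
        if cv_choice is None:
            raise ValueError("suggested mode needs a cross-validated LV count")
        # Assumption: the suggested count is capped at Max LVs.
        return [min(cv_choice, max_lvs)]
    if mode == "surveyed":
        # Every model size from 1 up to Max LVs is tried, hence the extra run time.
        return list(range(1, max_lvs + 1))
    raise ValueError(f"unknown mode: {mode!r}")


print(candidate_lv_counts("specified", 5))
print(candidate_lv_counts("suggested", 5, cv_choice=3))
print(candidate_lv_counts("surveyed", 4))
```

Because "surveyed" repeats the selection once per candidate LV count, its run time grows roughly in proportion to Max LVs, which is consistent with it being the slower option.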

Max LVs

This setting defines the maximum number of Latent Variables or Principal Components which can be used for each variable subset model.

Max Iterations

Defines the maximum number of RPLS iterations to perform before terminating. When a specific number is entered, the algorithm continues to remove variables with each iteration until either the number of variables equals the number of LVs or the maximum number of iterations is reached.
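
The two stopping conditions can be sketched as a simple loop. This Python example is a minimal illustration under stated assumptions, not the RPLS algorithm itself: run_selection and remove_step are hypothetical names, and remove_step is a caller-supplied placeholder for whatever rule decides which variables to drop in one iteration. The guard that rejects a step leaving fewer variables than LVs is also an assumption.

```python
def run_selection(variables, n_lvs, max_iterations, remove_step):
    """Apply remove_step repeatedly under the two stopping rules (illustrative only)."""
    current = list(variables)
    iterations = 0
    # Stop when the variable count reaches the LV count
    # or when the iteration limit is hit, whichever comes first.
    while iterations < max_iterations and len(current) > n_lvs:
        reduced = list(remove_step(current))
        # Assumption: a step that would leave fewer variables than LVs is rejected.
        if len(reduced) < n_lvs:
            break
        current = reduced
        iterations += 1
    return current, iterations


def drop_last(variables):
    """Hypothetical removal rule: drop one variable per iteration."""
    return variables[:-1]


# Generous iteration limit: the run ends when 3 variables remain (equal to the LVs).
kept, n_iter = run_selection(range(10), n_lvs=3, max_iterations=100, remove_step=drop_last)
print(kept, n_iter)

# Tight iteration limit: the run ends after 4 iterations with 6 variables left.
kept_short, n_iter_short = run_selection(range(10), n_lvs=3, max_iterations=4, remove_step=drop_last)
print(kept_short, n_iter_short)
```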

Use Selected Intervals

Once a variable selection has been performed, the list of selected variable(s) will appear in the box at the bottom of the interface. To change the X-block to include only the selected variables, click the "Use" button. This will set the include field on the calibration X-block to the selected variables. To discard the list of selected variables and start variable selection over, click the "Discard" button.