FAQ: Use custom cross-validation option

From Eigenvector Research Documentation Wiki


How do I use the "custom" cross-validation option?

Possible Solutions:

In PLS_Toolbox, the user can manually specify which samples are used as test sets in the cross-validation procedure by using the Custom cross-validation method. In the GUI, choosing the Custom method in the Cross-Validation window prompts the user for a custom cross-validation vector, which assigns objects to the test subset of each sub-validation. This vector must be a 1 x n vector of integers, where n is the total number of objects in the currently loaded data set. The values in this vector must follow these rules:

  • A value of -2 indicates that the object is placed in every test set (never in a model-building set)
  • A value of -1 indicates that the object is placed in every model-building set (never in a test set)
  • A value of 0 indicates that the object is used for neither model-building nor model testing
  • Positive values (1, 2, 3, and so on) give the test set number for each object (for those objects that are used in the cross-validation procedure)
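The rules above can be checked before a vector is entered. The following is a minimal Python sketch for illustration only; it is not PLS_Toolbox code, and the function name `check_custom_cv_vector` is hypothetical. It assumes the vector is given as a flat list of plain Python integers and that any positive integer is an acceptable test set number.

```python
def check_custom_cv_vector(cv, n_objects):
    """Return a list of problems found in a custom cross-validation vector.

    Hypothetical helper, not part of PLS_Toolbox. An empty list means the
    vector follows the rules: one integer per object, each value >= -2.
    """
    problems = []
    if len(cv) != n_objects:
        problems.append(f"expected {n_objects} entries, got {len(cv)}")
    for position, value in enumerate(cv, start=1):
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"object {position}: {value!r} is not an integer")
        elif value < -2:
            problems.append(f"object {position}: {value} is below -2")
    return problems
```

For instance, `check_custom_cv_vector([1, 2, -3], 3)` reports one problem for object 3, because -3 is not one of the allowed codes.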

For example, for a data set containing 9 objects, a custom cross-validation vector of:

[-1 0 1 1 -2 2 2 0 -1] 

will result in two sub-validation experiments (objects 2 and 8, marked 0, are left out of both):

  • one using objects 3, 4 and 5 in the test set and objects 1, 6, 7 and 9 in the model-building set,
  • and one using objects 5, 6 and 7 in the test set and objects 1, 3, 4 and 9 in the model-building set.
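To make the bookkeeping in this example explicit, here is a short Python sketch that expands a custom cross-validation vector into its test and model-building sets. It only illustrates the rules described on this page and is not how PLS_Toolbox implements them; the function name `expand_custom_cv` is hypothetical, and objects are numbered from 1 to match the text.

```python
def expand_custom_cv(cv):
    """Expand a custom cross-validation vector into (test, model) pairs.

    Hypothetical illustration, not PLS_Toolbox code. Returns one pair of
    1-based object lists per positive test set number found in cv.
    """
    numbered = list(enumerate(cv, start=1))
    always_test = [i for i, v in numbered if v == -2]
    always_model = [i for i, v in numbered if v == -1]
    subset_ids = sorted({v for v in cv if v > 0})

    splits = []
    for subset in subset_ids:
        # Test set: objects assigned to this subset, plus every -2 object
        test = sorted(always_test + [i for i, v in numbered if v == subset])
        # Model set: objects from the other subsets, plus every -1 object;
        # objects marked 0 appear in neither list
        model = sorted(
            always_model + [i for i, v in numbered if v > 0 and v != subset]
        )
        splits.append((test, model))
    return splits


splits = expand_custom_cv([-1, 0, 1, 1, -2, 2, 2, 0, -1])
for number, (test, model) in enumerate(splits, start=1):
    print(f"sub-validation {number}: test={test}, model={model}")
```

Run on the 9-object vector above, the sketch yields test objects 3, 4, 5 with model objects 1, 6, 7, 9 for the first sub-validation, and test objects 5, 6, 7 with model objects 1, 3, 4, 9 for the second.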

This method can be particularly useful in cases where the data sets have relatively few objects, or they are generated from a designed experiment. In such cases, it might be necessary to force some objects to always be either in the test set, in the model-building set, or out of the cross-validation procedure entirely.

Still having problems? Please contact our helpdesk at helpdesk@eigenvector.com