Splitcaltest: Difference between revisions
imported>Donal |
imported>Donal No edit summary |
Revision as of 13:37, 4 February 2013
Purpose
Splits data into calibration and test sets.
Synopsis
- z = splitcaltest(model,options); %splits data into calibration and test sets
- Also available in the Analysis interface via the data context menu
Description
The split is based on the scores from the input model. If a matrix or DataSet object is passed in place of a model, it is assumed to contain the scores for the data. The splitting process is randomized, so no assumption about the data acquisition order is necessary. The usereplicates option can be set to keep replicate samples together during the split.
If the usereplicates option is enabled and the repidclass option indicates which sample class set identifies replicates, the split will not separate replicates from each other. Replicates are first combined using classcenter, and splitcaltest is then applied to the class-centered data. Replicates contribute to the class-centered result only if they were not excluded in the input DataSet or model. The split of these combined samples is then mapped back to the original replicates, so replicates are never separated in the resulting calibration and test sets.
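The combine, split, and map-back sequence described above can be illustrated with a minimal sketch in plain MATLAB. This is not the splitcaltest implementation: a group mean stands in for the classcenter combining step, a random split stands in for the 'onion' algorithm, and the names scores, repid, combined, and iscal are hypothetical.

```matlab
% Illustrative sketch only; it does not reproduce splitcaltest or classcenter.
% Hypothetical data: 12 samples, 2 score columns, 4 replicate groups of 3.
scores   = [(1:12)', (12:-1:1)'];
repid    = [1 1 1 2 2 2 3 3 3 4 4 4]';   % replicate identifier per sample
fraction = 0.66;                         % fraction assigned to calibration

% 1) Combine replicates: one row per replicate group (group mean used
%    here as a stand-in for the combining step).
[groups, ~, gidx] = unique(repid);
gidx     = gidx(:);                      % force column orientation
ngroups  = numel(groups);
combined = zeros(ngroups, size(scores, 2));
for g = 1:ngroups
    combined(g, :) = mean(scores(gidx == g, :), 1);
end

% 2) Split the combined rows. A score-based method would select rows
%    using 'combined'; this random stand-in only needs the group count.
ncal       = round(fraction * ngroups);
order      = randperm(ngroups);
groupiscal = false(ngroups, 1);
groupiscal(order(1:ncal)) = true;

% 3) Map the group-level assignment back to the original samples, so
%    every replicate inherits the assignment of its group.
iscal = groupiscal(gidx);
```

Because the assignment is made once per group and then copied to each member, no replicate group can end up partly in the calibration set and partly in the test set.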
Inputs
- model = standard model structure from a factor-based model OR a double or DataSet object containing the scores to analyze.
Outputs
- z = a structure containing the class and classlookup table.
Options
- options = structure array with the following fields:
- plots: [ 'none' | {'final'} ] governs level of plotting
- algorithm: [ {'onion'} ]
- nonion: [ {3} ] the number of 'external layers'
- fraction: [ {0.66} ] fraction of data to be set as calibration samples.
- usereplicates: [ {0} | 1 ] keep replicates together (1) or not (0).
- repidclass: [ {1} ] the X-block class set used to identify sample replicates.
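As a rough illustration, the fields listed above could be collected in an options structure as follows. The values are arbitrary examples rather than recommendations, and it is an assumption that splitcaltest accepts a hand-built structure like this one; the call itself is left as a comment because it requires the toolbox and an existing model or scores matrix.

```matlab
% Example option values only; field names are taken from the list above.
options = struct();
options.plots         = 'none';    % suppress plotting
options.algorithm     = 'onion';   % the only algorithm listed
options.nonion        = 3;         % number of 'external layers'
options.fraction      = 0.75;      % fraction of data used as calibration samples
options.usereplicates = 1;         % keep replicates together
options.repidclass    = 1;         % class set that identifies replicates

% With the toolbox on the path and a model (or scores) in the workspace,
% the call would follow the synopsis:
%   z = splitcaltest(model, options);
```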
See Also
crossval, pca, pcr, preprocess, classcenter.