Simca: Difference between revisions
imported>Jeremy (Importing text file)
Revision as of 14:26, 3 September 2008
Purpose
Create Soft Independent Modeling of Class Analogy (SIMCA) models for classification.
Synopsis
- model = simca(x,ncomp,options) %creates simca model on dataset x
- model = simca(x,classid,labels) %models double x with class id
- pred = simca(x,model,options); %predictions on x with model
- options = simca('options'); %returns the default options structure
Description
The function SIMCA develops a SIMCA model for supervised pattern recognition. A SIMCA model is a collection of PCA models, one for each class of data in the data set.
SIMCA cross-validates the PCA model of each class using leave-one-out cross-validation if the number of samples in the class is <= 20. If there are more than 20 samples, the data is split into 10 contiguous blocks.
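The split rule described above can be sketched in a few lines of MATLAB. This is only an illustration of the stated rule, not the toolbox's own code: the variable names (n, nblocks, blockid) are hypothetical, and the exact block boundaries SIMCA uses are not given on this page, so the even split below is an assumption.

```matlab
% Hypothetical sketch of the per-class cross-validation split rule.
n = 25;                      % number of samples in one class (example value)

if n <= 20
    nblocks = n;             % leave-one-out: each sample forms its own block
else
    nblocks = 10;            % more than 20 samples: 10 contiguous blocks
end

% ASSUMPTION: samples are divided as evenly as possible, in their original
% order, so that neighbouring samples fall into the same block.
blockid = ceil((1:n) * nblocks / n);   % block index (1..nblocks) per sample
```

Each value in blockid marks the block that is left out together; with n = 25 the first block holds samples 1 and 2, the second holds samples 3 to 5, and so on.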
INPUTS
- x = M x N matrix of class "dataset" where class information is extracted from x.class{1,1} and labels from x.label{1,1}, or
- x = M x N data matrix of class "double" and
- classid = M x 1 vector of class identifiers where each element is an integer identifying the class number of the corresponding sample.
- model = when making predictions, input model is a SIMCA model structure.
OPTIONAL INPUTS:
- ncomp = integer, number of PCs to use in each model. This is rarely known a priori. When ncomp=[] {default} the user is queried for the number of PCs for each class.
- labels = a character array with M rows that is used to label samples on Q vs. T^2 plots; otherwise the class identifiers are used.
- options = a structure array discussed below.
OUTPUT:
- model = model structure array with the following fields:
- modeltype: 'SIMCA',
- datasource: structure array with information about input data,
- date: date of creation,
- time: time of creation,
- info: additional model information,
- description: cell array with text description of model,
- submodel: structure array with each record containing the PCA model of each class (see PCA), and
- detail: sub-structure with additional model details and results.
- pred = a structure, similar to model, that contains the SIMCA predictions. Additional, or other, fields in pred are:
- rtsq: the reduced T^2 (T^2 divided by its 95% confidence limit), where each column corresponds to one class in the SIMCA model,
- rq: the reduced Q (Q divided by its 95% confidence limit), where each column corresponds to one class in the SIMCA model,
- nclass: the predicted class number (the class to which the sample was closest when considering T^2 and Q combined), and
- submodelpred: structure array with each record containing the PCA model predictions for each class (see PCA).
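To make the reduced statistics concrete, the sketch below computes them for a few made-up numbers and picks the nearest class. The T^2, Q, and limit values are invented for illustration only. The page does not say how T^2 and Q are combined for nclass; the Euclidean combination used here is an assumption, not a statement of what SIMCA does internally.

```matlab
% Illustrative values only: 3 samples (rows) by 2 classes (columns).
tsq    = [2.0 9.0; 8.0 1.5; 3.0 3.0];   % T^2 of each sample in each class model
q      = [0.5 4.0; 6.0 0.4; 1.0 2.0];   % Q residual of each sample in each class model
tsqlim = [4.0 3.0];                     % 95% confidence limit on T^2, per class
qlim   = [1.0 0.8];                     % 95% confidence limit on Q, per class

% Reduced statistics: each column is divided by that class's limit,
% so a value below 1 means the sample lies inside the limit.
m    = size(tsq, 1);
rtsq = tsq ./ repmat(tsqlim, m, 1);
rq   = q   ./ repmat(qlim,   m, 1);

% ASSUMPTION: combine the two reduced statistics as a Euclidean distance
% and assign each sample to the class with the smallest combined value.
d = sqrt(rtsq.^2 + rq.^2);
[dmin, nclass] = min(d, [], 2);         % nclass is a 3 x 1 vector of class indices
```

Because both statistics are scaled by their own limits, they are on a comparable footing before being combined, which is the reason for working with reduced rather than raw values.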
Note: Calling simca with no inputs starts the graphical user interface (GUI) for this analysis method.
Options
- options = a structure array with the following fields:
- display: [ {'on'} | 'off' ], governs level of display,
- plots: ['none' | {'final'} ], governs level of plotting,
- staticplots: ['no' | {'yes'} ], produce old-style "static" plots,
- rule: [{'combined'} | 'final' | 'T2' | 'Q'], decision rule,
- preprocessing: { [ ] }, a preprocessing structure (see PREPROCESS) that is used to preprocess data in each class.
The default options can be retrieved using: options = simca('options');
Note: with display='off', plots='none', ncomp set to a positive integer, and preprocessing specified, SIMCA can be run without command line interaction.
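A non-interactive run along the lines of the note above might look like the following sketch. It uses only the call forms listed in the Synopsis. The random data, the class assignment, the dataset constructor call, and the preprocess('default','autoscale') call are assumptions that are not documented on this page; check DATASET and PREPROCESS before relying on them.

```matlab
% Sketch only: requires the toolbox that provides simca, dataset and preprocess.
x       = randn(30, 5);                    % made-up data: 30 samples, 5 variables
classid = [ones(15,1); 2*ones(15,1)];      % made-up class numbers (two classes)

xd = dataset(x);                           % ASSUMPTION: dataset object constructor
xd.class{1,1} = classid;                   % class information is read from x.class{1,1}

options = simca('options');                % start from the default options
options.display = 'off';                   % suppress command line display
options.plots   = 'none';                  % suppress plotting
% ASSUMPTION: this PREPROCESS call form is not described on this page.
options.preprocessing = preprocess('default', 'autoscale');

ncomp = 2;                                 % positive integer, so no query for PCs
model = simca(xd, ncomp, options);         % build one PCA model per class
pred  = simca(xd, model, options);         % predictions on the same data

pred.nclass                                % predicted class number per sample
```

Setting ncomp explicitly is what avoids the per-class query for the number of PCs; in practice the value would be chosen from cross-validation results rather than fixed in advance.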
See Also
cluster, crossval, pca, plsdthres, discrimprob, plsdaroc