Pcr: Difference between revisions

From Eigenvector Research Documentation Wiki
imported>Jeremy
(Importing text file)
 

Revision as of 19:57, 2 September 2008

Purpose

Principal components regression: multivariate inverse least squares regression.

Synopsis

model = pcr(x,y,ncomp,options) %calibration
pred = pcr(x,model,options) %prediction
valid = pcr(x,y,model,options) %validation
options = pcr('options')

Description

PCR calculates a single principal components regression model using the given number of components ncomp to predict y from measurements x. To construct a PCR model, the inputs are x, the predictor x-block (2-way array, class "double" or "dataset"); y, the predicted y-block (2-way array, class "double" or "dataset"); ncomp, the number of components to be calculated (positive integer scalar); and the optional structure options. The output is a standard model structure, model, with the following fields (see MODELSTRUCT):

  • modeltype: 'PCR',
  • datasource: structure array with information about input data,
  • date: date of creation,
  • time: time of creation,
  • info: additional model information,
  • reg: regression vector,
  • loads: cell array with model loadings for each mode/dimension,
  • pred: 2 element cell array with model predictions for each input block (when options.blockdetails='standard', x-block predictions are not saved and the first element will be an empty array) and the y-block predictions,
  • tsqs: cell array with T^2 values for each mode,
  • ssqresiduals: cell array with sum of squares residuals for each mode,
  • description: cell array with text description of model, and
  • detail: sub-structure with additional model details and results.
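To make the reg, loads, tsqs, and ssqresiduals fields more concrete, here is a minimal sketch of what a PCR calibration computes, written in base MATLAB only. It is not the PLS_Toolbox implementation: the synthetic data, the choice of mean-centering as preprocessing, and the variable names (T, P, reg, tsq, qres) are all assumptions made for illustration, and the toolbox may scale or store these quantities differently.

```matlab
% Minimal PCR sketch in base MATLAB (illustrative only, not the toolbox code).
% Deterministic synthetic data: 20 samples, 6 predictor variables.
[a,c] = meshgrid(1:6,1:20);
x = sin(a.*c);
y = x*[1; -2; 0.5; 0; 0; 3];
ncomp = 3;
n = size(x,1);

% Mean-center both blocks (an assumed preprocessing choice).
xmean = mean(x,1);
ymean = mean(y,1);
xc = x - repmat(xmean,n,1);
yc = y - repmat(ymean,n,1);

% PCA of the x-block via the economy-size SVD: xc = U*S*V'.
[U,S,V] = svd(xc,'econ');
P = V(:,1:ncomp);                 % x-block loadings
T = xc*P;                         % x-block scores

% Regress the centered y-block on the scores, then map the
% coefficients back to the original x-variable space.
q   = (T'*T)\(T'*yc);
reg = P*q;                        % regression vector (Nx by Ny)

% Predictions for the calibration samples.
ypred = xc*reg + repmat(ymean,n,1);

% Common per-sample diagnostics.
lambda = diag(S(1:ncomp,1:ncomp)).^2/(n-1);     % variance of each score
tsq    = sum((T.^2)./repmat(lambda',n,1),2);    % Hotelling T^2
E      = xc - T*P';                             % x-block residuals
qres   = sum(E.^2,2);                           % sum of squared residuals (Q)
```

Because the scores are orthogonal, the regression on T is well conditioned even when the original x-variables are collinear, which is the usual reason for choosing PCR over ordinary least squares.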

To make predictions, the inputs are x, the new predictor x-block (2-way array, class "double" or "dataset"), and model, the PCR model. The output pred is a structure, similar to model, that contains scores, predictions, etc. for the new data.

If new y-block measurements are also available, the inputs are x, the new predictor x-block (2-way array, class "double" or "dataset"); y, the new predicted block (2-way array, class "double" or "dataset"); and model, the PCR model. The output valid is a structure, similar to model, that contains scores, predictions, and additional y-block statistics for the new data.

In prediction and validation modes, the same model structure is used but predictions are provided in the model.detail.pred field.

Note: Calling pcr with no inputs starts the graphical user interface (GUI) for this analysis method.
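The calibration, prediction, and validation calls described above can be strung together as in the following sketch. It requires PLS_Toolbox on the MATLAB path. The data and the variable names xcal, ycal, xtest, and ytest are hypothetical placeholders, and the choice of three components is arbitrary. Only the call signatures from the Synopsis and the field names listed in the Description are used.

```matlab
% Sketch of the three calling modes (requires PLS_Toolbox).
% Hypothetical stand-in data; replace with your own x- and y-blocks.
[a,c] = meshgrid(1:6,1:30);
x = sin(a.*c);                       % 30 samples by 6 variables
y = x*[1; -2; 0.5; 0; 0; 3];         % single y-variable
xcal  = x(1:20,:);   ycal  = y(1:20,:);
xtest = x(21:30,:);  ytest = y(21:30,:);

% Start from the default options and silence display and plotting.
options = pcr('options');
options.display = 'off';
options.plots   = 'none';

% Calibration: build a model with 3 principal components.
model = pcr(xcal,ycal,3,options);

% Prediction: apply the model to a new x-block.
pred = pcr(xtest,model,options);

% Validation: apply the model when reference y-values are also known.
valid = pcr(xtest,ytest,model,options);

% Fields documented above.
modeltype = model.modeltype;         % 'PCR'
b         = model.reg;               % regression vector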

Options

  • options = a structure array with the following fields:
  • display: [ 'off' | {'on'} ], governs level of display to command window,
  • plots: [ 'none' | {'final'} ], governs level of plotting,
  • outputversion: [ 2 | {3} ], governs output format (discussed below),
  • preprocessing: {[] []}, two element cell array containing preprocessing structures (see PREPROCESS) defining preprocessing to use on the x- and y-blocks (first and second elements respectively),
  • algorithm: [ {'svd'} | 'robustpcr' | 'correlationpcr' ], governs which algorithm to use. 'svd' is the standard algorithm. 'robustpcr' is a robust algorithm with automatic outlier detection. 'correlationpcr' is standard PCR with factors re-ordered by the y-variance they capture,
  • blockdetails: ['compact' | {'standard'} | 'all'], extent of predictions and raw residuals included in model. 'standard' = only y-block, 'all' x and y blocks.
  • confidencelimit: [ {'0.95'} ], confidence level for Q and T^2 limits. A value of zero (0) disables calculation of confidence limits,
  • roptions: structure of options to pass to rpcr (robust PCR engine from the Libra Toolbox). Only used when algorithm is 'robustpcr'. It has the following fields:
      • alpha: [ {0.75} ], (1-alpha) measures the fraction of outliers the algorithm should resist. Any value between 0.5 and 1 may be specified.
      • intadjust: [ {0} ], if equal to one, the intercept adjustment for the LTS-regression will be calculated. See ltsregres.m for details (Libra Toolbox).
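As a sketch of how the fields above might be set before calling pcr, the following starts from the default structure and changes a few entries. It requires PLS_Toolbox. The specific values are arbitrary examples, and the nesting of alpha under options.roptions is inferred from the list above rather than confirmed against the toolbox source.

```matlab
% Sketch: customizing the options structure (requires PLS_Toolbox).
options = pcr('options');             % start from the defaults

options.display      = 'off';         % no command-window output
options.plots        = 'none';        % no plots
options.algorithm    = 'robustpcr';   % robust PCR with outlier detection
options.blockdetails = 'all';         % keep x- and y-block predictions

% Assumed nesting: alpha is a field of the roptions sub-structure and is
% only used when algorithm is 'robustpcr'. It must lie between 0.5 and 1.
options.roptions.alpha = 0.8;
```

Starting from pcr('options') rather than building the structure by hand keeps any fields not listed here at their default values.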

The default options can be retrieved using: options = pcr('options');

OUTPUTVERSION

By default (options.outputversion = 3) the output of the function is a standard model structure model. If options.outputversion = 2, the output format is:

[b,ssq,t,p] = pcr(x,y,ncomp,options)

where the outputs are

  • b = matrix of regression vectors or matrices for each number of principal components up to ncomp,
  • ssq = the sum of squares information,
  • t = x-block scores, and
  • p = x-block loadings.

Note: The regression matrices are ordered in b such that each Ny (number of y-block variables) rows correspond to the regression matrix for that particular number of principal components.
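The row ordering described in the note can be illustrated with plain indexing. The sketch below uses a placeholder matrix in place of a real b output, and it assumes that the columns of b correspond to the Nx x-block variables; the sizes and the name bk are hypothetical.

```matlab
% Sketch: pulling the regression matrix for k components out of b.
% Placeholder sizes and values stand in for a real outputversion = 2 result.
Ny = 2;  Nx = 5;  ncomp = 3;
b = reshape(1:(ncomp*Ny*Nx), ncomp*Ny, Nx);   % (ncomp*Ny) rows, Nx columns (assumed)

k    = 2;                        % number of principal components wanted
rows = (k-1)*Ny + (1:Ny);        % the k-th block of Ny rows
bk   = b(rows,:);                % Ny by Nx regression matrix for k components
```

With Ny = 1 this reduces to bk = b(k,:), so each row of b is the regression vector for one model size.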

See Also

analysis, crossval, frpcr, modelstruct, pca, pls, preprocess, ridge