Pcr: Difference between revisions

From Eigenvector Research Documentation Wiki
imported>Jeremy
(Importing text file)

Revision as of 14:26, 3 September 2008

Purpose

Principal components regression: multivariate inverse least squares regression.

Synopsis

model = pcr(x,y,ncomp,options) %calibration
pred = pcr(x,model,options) %prediction
valid = pcr(x,y,model,options) %validation
options = pcr('options')

Description

PCR calculates a single principal components regression model using the given number of components ncomp to predict y from measurements x.
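
As a rough illustration of what a principal components regression computes, the following sketch builds one in base MATLAB from a singular value decomposition. It is not the toolbox implementation: the data, the variable names, and the explicit mean-centering step are all assumptions made for the example, and pcr itself applies only the preprocessing requested in options.

```matlab
% PCR sketch in base MATLAB/Octave; no toolbox functions are called.
s = (1:20)';
x = [sin(s) cos(2*s) s/20 sin(3*s).^2 mod(s,4)];  % 20 samples x 5 variables (made-up data)
y = x*[1; -2; 0.5; 0; 3];                         % made-up response
ncomp = 3;                                        % number of components to keep

% Mean-center both blocks (an assumption of this sketch).
xmean = mean(x,1);
ymean = mean(y,1);
xc = bsxfun(@minus, x, xmean);
yc = bsxfun(@minus, y, ymean);

% Principal components of x from the SVD: loadings P and scores T.
[~,~,V] = svd(xc,'econ');
P = V(:,1:ncomp);
T = xc*P;

% Regress y on the scores, then map back to the original x variables.
q = (T'*T)\(T'*yc);
b = P*q;                                          % regression vector, Nx by Ny

% Predictions for the calibration data: apply b, then add the y mean back.
yhat = bsxfun(@plus, xc*b, ymean);
```

Because the scores are mutually orthogonal, keeping fewer components simply drops terms from the full least squares solution, which is what makes ncomp a convenient complexity setting.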

To construct a PCR model, the inputs are x, the predictor x-block (2-way array, class "double" or "dataset"); y, the predicted y-block (2-way array, class "double" or "dataset"); ncomp, the number of components to be calculated (positive integer scalar); and the optional structure options. The output is a standard model structure, model, with the following fields (see MODELSTRUCT):

  • modeltype: 'PCR',
  • datasource: structure array with information about input data,
  • date: date of creation,
  • time: time of creation,
  • info: additional model information,
  • reg: regression vector,
  • loads: cell array with model loadings for each mode/dimension,
  • pred: 2 element cell array with model predictions for each input block (when options.blockdetails='standard', x-block predictions are not saved and this will be an empty array) and the y-block predictions,
  • tsqs: cell array with T2 values for each mode,
  • ssqresiduals: cell array with sum of squares residuals for each mode,
  • description: cell array with text description of model, and
  • detail: sub-structure with additional model details and results.

To make predictions the inputs are x the new predictor x-block (2-way array class "double" or "dataset"), and model the PCR model. The output pred is a structure, similar to model, that contains scores, predictions, etc. for the new data.

If new y-block measurements are also available then the inputs are x the new predictor x-block (2-way array class "double" or "dataset"), y the new predicted block (2-way array class "double" or "dataset"), and model the PCR model. The output valid is a structure, similar to model, that contains scores, predictions, and additional y-block statistics etc. for the new data.

In prediction and validation modes, the same model structure is used but predictions are provided in the model.detail.pred field.
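
The three calling modes might be combined as in the sketch below. It assumes PLS_Toolbox is on the MATLAB path. The data are placeholders, and the field accesses at the end follow the field list above rather than a verified session, so treat them as assumptions.

```matlab
% Requires PLS_Toolbox on the MATLAB path; the data below are placeholders.
s = (1:30)';
x = [sin(s) cos(2*s) s/30 sin(3*s).^2 mod(s,4)];
y = x*[1; -2; 0.5; 0; 3];
xcal = x(1:20,:);   ycal = y(1:20,:);   % calibration samples
xnew = x(21:30,:);  ynew = y(21:30,:);  % new samples

options = pcr('options');               % default options structure
options.display = 'off';                % no command window output
options.plots   = 'none';               % no figures

model = pcr(xcal,ycal,3,options);       % calibration with 3 components
pred  = pcr(xnew,model,options);        % prediction
valid = pcr(xnew,ynew,model,options);   % validation (new y available)

b     = model.reg;                      % regression vector
ypred = pred.pred{2};                   % y-block predictions (second cell, per the field list)
```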

Note: Calling pcr with no inputs starts the graphical user interface (GUI) for this analysis method.

Options

options = a structure array with the following fields:

  • display: [ 'off' | {'on'} ], governs level of display to command window,
  • plots: [ 'none' | {'final'} ], governs level of plotting,
  • outputversion: [ 2 | {3} ], governs output format (discussed below),
  • preprocessing: {[] []}, two element cell array containing preprocessing structures (see PREPROCESS) defining preprocessing to use on the x- and y-blocks (first and second elements respectively),
  • algorithm: [ {'svd'} | 'robustpcr' | 'correlationpcr' ], governs which algorithm to use. 'svd' is the standard algorithm. 'robustpcr' is a robust algorithm with automatic outlier detection. 'correlationpcr' is standard PCR with the factors re-ordered by y-variance captured,
  • blockdetails: [ 'compact' | {'standard'} | 'all' ], extent of predictions and raw residuals included in model. 'standard' = only y-block, 'all' = x- and y-blocks,
  • confidencelimit: [ {0.95} ], confidence level for Q and T2 limits. A value of zero (0) disables calculation of confidence limits,
  • roptions: structure of options to pass to rpcr (robust PCR engine from the Libra Toolbox). Only used when algorithm is 'robustpcr',
      • alpha: [ {0.75} ], (1-alpha) measures the number of outliers the algorithm should resist. Any value between 0.5 and 1 may be specified. These options are only used when algorithm is 'robustpcr'.
      • intadjust: [ {0} ], if equal to one, the intercept adjustment for the LTS-regression will be calculated. See ltsregres.m for details (Libra Toolbox).
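
To make the 'correlationpcr' description concrete, the sketch below re-orders principal components by the fraction of y-variance each one captures. It is written in base MATLAB with made-up data and a single y column. The exact criterion used by the toolbox is not stated here, so this is an assumed reading of "y-variance captured", not the toolbox code.

```matlab
% Re-ordering principal components by y-variance captured (illustrative only).
s = (1:20)';
x = [sin(s) cos(2*s) s/20 sin(3*s).^2 mod(s,4)];  % made-up x-block
y = x*[0; 0; 0.5; 0; 3];                          % made-up single-column y-block
xc = bsxfun(@minus, x, mean(x,1));                % mean-centered x
yc = bsxfun(@minus, y, mean(y,1));                % mean-centered y

[~,~,V] = svd(xc,'econ');
T = xc*V;                                         % scores for all components

% Fraction of y variance explained by each (mutually orthogonal) score vector.
yvar = ((T'*yc).^2) ./ sum(T.^2,1)' / sum(yc.^2);

% Keep the ncomp components that explain the most y variance.
[yvarSorted, order] = sort(yvar,'descend');
ncomp = 2;
keep  = order(1:ncomp);
P = V(:,keep);                                    % loadings of the retained components
```

With the standard 'svd' ordering the first ncomp columns of V would be kept instead, whether or not they relate to y.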

The default options can be retrieved using: options = pcr('options');

OUTPUTVERSION

By default (options.outputversion = 3) the output of the function is a standard model structure model. If options.outputversion = 2, the output format is:

[b,ssq,t,p] = pcr(x,y,ncomp,options)

where the outputs are

  • b = matrix of regression vectors or matrices for each number of principal components up to ncomp,
  • ssq = the sum of squares information,
  • t = x-block scores, and
  • p = x-block loadings.

Note: The regression matrices are ordered in b such that each Ny (number of y-block variables) rows correspond to the regression matrix for that particular number of principal components.
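
The row ordering described in the note can be illustrated with a stand-in matrix. The sketch below does not call pcr: b is filled with fake values, and it assumes the blocks are stacked in order of increasing number of components with one column per x-block variable.

```matlab
% Stand-in for the b output of [b,ssq,t,p] = pcr(x,y,ncomp,options) with outputversion = 2.
Ny    = 2;                                   % number of y-block variables
Nx    = 4;                                   % number of x-block variables (assumed column count)
ncomp = 3;                                   % maximum number of components
b = reshape(1:ncomp*Ny*Nx, ncomp*Ny, Nx);    % fake values, (ncomp*Ny) by Nx

k    = 2;                                    % number of components of interest
rows = (k-1)*Ny + (1:Ny);                    % rows belonging to the k-component model
bk   = b(rows,:);                            % Ny by Nx regression matrix for k components
```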

See Also

analysis, crossval, frpcr, modelstruct, pca, pls, preprocess, ridge