Pcr

From Eigenvector Research Documentation Wiki
imported>Chuck
imported>Jeremy

Revision as of 13:38, 10 October 2008

Purpose

Principal Components Regression: multivariate inverse least squares regression.

Synopsis

model = pcr(x,y,ncomp,options) %identifies model (calibration step)
pred = pcr(x,model,options) %applies model to a new X-block
valid = pcr(x,y,model,options) %applies model to a new X-block, with corresponding new Y values

Description

PCR calculates a single principal components regression model using the given number of components ncomp to predict y from measurements x, or applies an existing PCR model to a new set of data x.
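
As a rough illustration of what a principal components regression computes, the following plain-MATLAB sketch mean-centers the data, takes the first ncomp principal components of x from a singular value decomposition, and regresses y on the resulting scores. It is a minimal sketch under stated assumptions and not the toolbox implementation: the synthetic data, the mean-centering step, and all variable names other than x, y, and ncomp are illustrative only.

```matlab
% Illustrative data: 20 samples, 5 x-variables, 1 y-variable (all made up)
t     = linspace(0, 1, 20)';
x     = [t, cos(3*t), sin(7*t), t.^2, exp(-t)];
y     = x * [1; -2; 0.5; 0; 3];
ncomp = 3;

% Mean-center both blocks (an assumed preprocessing choice for this sketch)
xmean = mean(x, 1);
ymean = mean(y, 1);
xc    = bsxfun(@minus, x, xmean);
yc    = bsxfun(@minus, y, ymean);

% Principal components of the centered x-block via economy-size SVD
[U, S, V] = svd(xc, 'econ');
T = U(:, 1:ncomp) * S(1:ncomp, 1:ncomp);   % x-block scores
P = V(:, 1:ncomp);                         % x-block loadings

% Least squares regression of y on the scores, mapped back to x-variables
q    = T \ yc;                             % regression coefficients on scores
b    = P * q;                              % regression vector for centered x
yhat = bsxfun(@plus, xc * b, ymean);       % fitted y values
```

Because the regression is carried out on ncomp orthogonal score vectors rather than on the original, possibly collinear, x-variables, the least squares step stays well conditioned.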

To make predictions, the inputs are x, the new predictor x-block (2-way array of class "double" or "dataset"), and model, the PCR model. The output pred is a structure, similar to model, that contains scores, predictions, etc. for the new data.

If new y-block measurements are also available for the new data, then the inputs are x, the new x-block (2-way array of class "double" or "dataset"), y, the new y-block (2-way array of class "double" or "dataset"), and model, the PCR model to apply. The output valid is a structure, similar to model, that contains scores, predictions, and additional y-block statistics for the new data.

In prediction and validation modes, the same model structure is used but predictions are provided in the model.detail.pred field.
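
The three calling modes described above might be exercised as in the following sketch. It uses only the call signatures from the Synopsis and requires the toolbox that provides pcr. The synthetic data and the variable names xcal, ycal, xtest, and ytest are illustrative assumptions, and reading the y-block predictions from the second element of the pred field is an assumption based on the Outputs list.

```matlab
% Illustrative calibration and test data (made up for this sketch)
tc    = linspace(0, 1, 40)';
tt    = linspace(0.05, 0.95, 10)';
btrue = [1; -2; 0.5; 0; 3];
xcal  = [tc, cos(3*tc), sin(7*tc), tc.^2, exp(-tc)];
ycal  = xcal * btrue;
xtest = [tt, cos(3*tt), sin(7*tt), tt.^2, exp(-tt)];
ytest = xtest * btrue;

% Start from the default options and silence display and plotting
options         = pcr('options');
options.display = 'off';
options.plots   = 'none';

model = pcr(xcal, ycal, 3, options);         % calibration with ncomp = 3
pred  = pcr(xtest, model, options);          % prediction for a new x-block
valid = pcr(xtest, ytest, model, options);   % validation with known y values

% Assumption: y-block predictions are the second element of the pred field
ypred = pred.pred{2};
```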

Note: Calling pcr with no inputs starts the graphical user interface (GUI) for this analysis method.

Inputs

  • x = X-block data (2-way array or DataSet Object)
  • y = Y-block data (2-way array or DataSet Object)
  • ncomp = number of components to be calculated (positive integer scalar).

Optional Inputs

  • options = options structure, discussed below.

Outputs

The output is a standard model structure with the following fields (see MODELSTRUCT):

  • modeltype: 'PCR',
  • datasource: structure array with information about input data,
  • date: date of creation,
  • time: time of creation,
  • info: additional model information,
  • reg: regression vector,
  • loads: cell array with model loadings for each mode/dimension,
  • pred: 2 element cell array containing
    • model predictions for each input block (when options.blockdetails='standard', x-block predictions are not saved and this will be an empty array), and
    • the y-block predictions.
  • tsqs: cell array with T^2 values for each mode,
  • ssqresiduals: cell array with sum of squares residuals for each mode,
  • description: cell array with text description of model, and
  • detail: sub-structure with additional model details and results.

Options

options = a structure array with the following fields:

  • display: [ 'off' | {'on'} ], governs level of display to command window,
  • plots: [ 'none' | {'final'} ], governs level of plotting,
  • outputversion: [ 2 | {3} ], governs output format (discussed below),
  • preprocessing: {[] []}, two element cell array containing preprocessing structures (see PREPROCESS) defining preprocessing to use on the x- and y-blocks (first and second elements respectively),
  • algorithm: [ {'svd'} | 'robustpcr' | 'correlationpcr' ], governs which algorithm to use. 'svd' is the standard algorithm. 'robustpcr' is a robust algorithm with automatic outlier detection. 'correlationpcr' is standard PCR with the factors re-ordered by y-variance captured,
  • blockdetails: [ 'compact' | {'standard'} | 'all' ], extent of predictions and raw residuals included in the model. 'standard' = y-block only, 'all' = x- and y-blocks,
  • confidencelimit: [ {0.95} ], confidence level for Q and T^2 limits. A value of zero (0) disables calculation of confidence limits,
  • roptions: structure of options to pass to rpcr (robust PCR engine from the Libra Toolbox). Only used when algorithm is 'robustpcr'. Its fields are:
    • alpha: [ {0.75} ], (1-alpha) measures the fraction of outliers the algorithm should resist. Any value between 0.5 and 1 may be specified.
    • intadjust: [ {0} ], if equal to one, the intercept adjustment for the LTS-regression will be calculated. See ltsregres for details (Libra Toolbox).

The default options can be retrieved using: options = pcr('options');
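
A typical pattern is to retrieve the defaults and then override individual fields before calibrating. The sketch below uses only field names and values from the list above and requires the toolbox that provides pcr. The synthetic data and the particular values chosen are illustrative assumptions, not recommendations.

```matlab
% Illustrative data (made up for this sketch)
t     = linspace(0, 1, 40)';
x     = [t, cos(3*t), sin(7*t), t.^2, exp(-t)];
y     = x * [1; -2; 0.5; 0; 3];
ncomp = 3;

options                 = pcr('options');     % default options structure
options.display         = 'off';              % suppress command-window output
options.plots           = 'none';             % suppress plotting
options.algorithm       = 'correlationpcr';   % re-order factors by y-variance captured
options.blockdetails    = 'all';              % keep x- and y-block details
options.confidencelimit = 0.99;               % confidence level for Q and T^2 limits

model = pcr(x, y, ncomp, options);
```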

OUTPUTVERSION

By default (options.outputversion = 3), the output of the function is a standard model structure, model. If options.outputversion = 2, the output format is:

[b,ssq,t,p] = pcr(x,y,ncomp,options)

where the outputs are

  • b = matrix of regression vectors or matrices for each number of principal components up to ncomp,
  • ssq = the sum of squares information,
  • t = x-block scores, and
  • p = x-block loadings.

Note: The regression matrices are stacked in b so that each consecutive block of Ny rows (Ny = number of y-block variables) is the regression matrix for one particular number of principal components.
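
The row ordering described in the note can be illustrated with plain MATLAB indexing. In this sketch, b is a stand-in matrix rather than real pcr output, its size of ncomp*Ny rows by Nx columns is an assumption, and the names Nx, k, rows, and bk are hypothetical.

```matlab
Ny    = 2;   % number of y-block variables
Nx    = 4;   % number of x-block variables (assumed column count of b)
ncomp = 3;   % number of principal components requested

% Stand-in for the b output: ncomp blocks of Ny rows stacked vertically
b = reshape(1:(ncomp*Ny*Nx), ncomp*Ny, Nx);

% Extract the regression matrix for the k-component model
k    = 2;
rows = (k-1)*Ny + (1:Ny);   % rows belonging to the k-th block
bk   = b(rows, :);          % Ny-by-Nx regression matrix for k components
```

With Ny = 2 and k = 2, the selected rows are 3 and 4.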

See Also

analysis, crossval, frpcr, modelstruct, pca, pls, preprocess, ridge