Pcr
Latest revision as of 13:54, 6 February 2020
Purpose
Principal Components Regression: multivariate inverse least squares regression.
Synopsis
- model = pcr(x,y,ncomp,options) %identifies model (calibration step)
- pred = pcr(x,model,options) %applies model to a new X-block
- valid = pcr(x,y,model,options) %applies model to a new X-block, with corresponding new Y values
- pcr % Launches an Analysis window with PCR as the selected method.
Note that the recommended way to build and apply a PCR model from the command line is to use the Model Object. See the EVRIModel_Objects wiki page on building and applying models using the Model Object.
Description
PCR calculates a single principal components regression model using the given number of components ncomp to predict y from measurements x, or applies an existing PCR model to a new set of data x.
To make predictions, the inputs are x, the new predictor x-block (2-way array class "double" or "dataset"), and model, the PCR model. The output pred is a structure, similar to model, that contains scores, predictions, etc. for the new data.
If new y-block measurements are also available for the new data, then the inputs are x, the new x-block (2-way array class "double" or "dataset"), y, the new y-block (2-way array class "double" or "dataset"), and model, the PCR model to apply. The output valid is a structure, similar to model, that contains scores, predictions, and additional y-block statistics for the new data.
In prediction and validation modes, the same model structure is used but predictions are provided in the model.detail.pred field.
Note: Calling pcr with no inputs starts the graphical user interface (GUI) for this analysis method.
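The three calling modes described above can be sketched as follows. This is a minimal illustration, assuming PLS_Toolbox is on the MATLAB path; the data are synthetic placeholders, and reading the y-block predictions from pred.pred{2} is an assumption based on the Outputs list on this page, not a verified recipe.

```matlab
% Synthetic placeholder data (deterministic, for illustration only).
t = (1:30)';
x = [sin(t) cos(t) sin(2*t) cos(3*t) t/30];   % 30 samples, 5 x-variables
y = x*[1; 0; 2; 0; -1];                       % single y-variable

xcal = x(1:20,:);   ycal = y(1:20);           % calibration set
xnew = x(21:30,:);  ynew = y(21:30);          % new data with known y

% Start from the default options and silence display and plotting.
options = pcr('options');
options.display = 'off';
options.plots   = 'none';

ncomp = 3;
model = pcr(xcal,ycal,ncomp,options);   % calibration step
pred  = pcr(xnew,model,options);        % apply model to a new x-block
valid = pcr(xnew,ynew,model,options);   % apply model, new y values supplied

% Assumption: per the Outputs list, the second cell of the pred field
% holds the y-block predictions.
yhat = pred.pred{2};
```

The calibration and new-data sets are kept separate here so that valid reports statistics on samples the model has not seen.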
Inputs
- x = X-block data (2-way array or DataSet Object)
- y = Y-block data (2-way array or DataSet Object)
- ncomp = number of components to be calculated (positive integer scalar).
Optional Inputs
- options discussed below
Outputs
The output is a standard model structure with the following fields (see Standard Model Structure):
- modeltype: 'PCR',
- datasource: structure array with information about input data,
- date: date of creation,
- time: time of creation,
- info: additional model information,
- reg: regression vector,
- loads: cell array with model loadings for each mode/dimension,
- pred: 2 element cell array containing model predictions for each input block: the x-block predictions (left empty when options.blockdetails = 'standard', because x-block predictions are then not saved) and the y-block predictions,
- tsqs: cell array with T^2 values for each mode,
- ssqresiduals: cell array with sum of squares residuals for each mode,
- description: cell array with text description of model, and
- detail: sub-structure with additional model details and results.
Options
options = a structure array with the following fields:
- display: [ 'off' | {'on'} ], governs level of display to command window,
- plots: [ 'none' | {'final'} ], governs level of plotting,
- outputversion: [ 2 | {3} ], governs output format (discussed below),
- preprocessing: {[] []}, two element cell array containing preprocessing structures (see PREPROCESS) defining preprocessing to use on the x- and y-blocks (first and second elements respectively),
- algorithm: [ {'svd'} | 'robustpcr' | 'correlationpcr' | 'frpcr' ], governs which algorithm to use.
  - 'svd' = standard singular value decomposition algorithm.
  - 'robustpcr' = robust algorithm with automatic outlier detection.
  - 'correlationpcr' = standard PCR with re-ordering of factors in order of y-variance captured.
  - 'frpcr' = full-ratio PCR (a.k.a. optimized scaling) with automatic sample scale correction. Note that with FRPCR, models generally perform better without mean-centering on the x-block.
- blockdetails: [ 'compact' | {'standard'} | 'all' ], level of detail (predictions, raw residuals, and calibration data) included in the model.
  - 'standard' = the predictions and raw residuals for the X-block, as well as the X-block itself, are not stored in the model, to reduce its size in memory. Specifically, these fields in the model object are left empty: 'model.pred{1}', 'model.detail.res{1}', 'model.detail.data{1}'.
  - 'compact' = for this function, 'compact' is identical to 'standard'.
  - 'all' = keep predictions and raw residuals for both X- and Y-blocks, as well as the X- and Y-blocks themselves.
- confidencelimit: [ {0.95} ], confidence level for Q and T^2 limits. A value of zero (0) disables calculation of confidence limits,
- roptions: structure of options to pass to rpcr (robust PCR engine from the Libra Toolbox). Only used when algorithm is 'robustpcr'. Its fields are:
  - alpha: [ {0.75} ], (1-alpha) measures the fraction of outliers the algorithm should resist. Any value between 0.5 and 1 may be specified.
  - intadjust: [ {0} ], if equal to one, the intercept adjustment for the LTS-regression will be calculated. See ltsregres for details (Libra Toolbox).
The default options can be retrieved using: options = pcr('options');
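As a sketch of how the options listed above might be set before calibration (assuming PLS_Toolbox is on the MATLAB path; the data are synthetic placeholders and the particular values chosen are for illustration only):

```matlab
% Synthetic placeholder data (illustration only).
t = (1:20)';
x = [sin(t) cos(t) sin(2*t) t/20];
y = x*[1; -1; 0.5; 2];

% Retrieve the defaults, then override only the fields of interest.
options = pcr('options');
options.display         = 'off';             % no command-window output
options.plots           = 'none';            % no plots
options.algorithm       = 'correlationpcr';  % order factors by y-variance captured
options.blockdetails    = 'all';             % keep X- and Y-block predictions, residuals, data
options.confidencelimit = 0.99;              % assumed numeric; 0 would disable the limits

model = pcr(x,y,2,options);                  % calibrate with 2 components
```

Starting from pcr('options') rather than building the structure by hand keeps every field that is not overridden at its default.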
OUTPUTVERSION
By default (options.outputversion = 3) the output of the function is a standard model structure, model. If options.outputversion = 2, the output format is:
- [b,ssq,t,p] = pcr(x,y,ncomp,options)
where the outputs are
- b = matrix of regression vectors or matrices for each number of principal components up to ncomp,
- ssq = the sum of squares information,
- t = x-block scores, and
- p = x-block loadings.
Note: The regression matrices are stacked in b such that each successive block of Ny rows (Ny = number of y-block variables) corresponds to the regression matrix for that particular number of principal components.
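The row layout of b can be illustrated with a small stand-alone sketch in base MATLAB. This is not the toolbox's code: it rebuilds a b-like matrix from a plain SVD-based PCR on mean-centered synthetic data, assuming that columns of b correspond to x-variables, and then extracts the block for a chosen number of components. All variable names are hypothetical.

```matlab
% Synthetic, deterministic data: 20 samples, 6 x-variables, 2 y-variables.
n = 20;
t = (1:n)';
x = [sin(t) cos(t) sin(2*t) cos(3*t) t/n (t/n).^2];
y = [x*[1; 0; 2; 0; -1; 0.5], x*[0; 1; 0; 1; 0; 0]];

% Mean-center both blocks by hand (stands in for preprocessing).
xc = bsxfun(@minus, x, mean(x,1));
yc = bsxfun(@minus, y, mean(y,1));

% PCR via SVD: regress yc on the first k principal components of xc.
[U,S,V] = svd(xc,'econ');
s     = diag(S);
ncomp = 3;
Ny    = size(yc,2);
b     = zeros(ncomp*Ny, size(x,2));
for k = 1:ncomp
    Bk = V(:,1:k) * diag(1./s(1:k)) * U(:,1:k)' * yc;   % Nx-by-Ny
    b((k-1)*Ny+1 : k*Ny, :) = Bk';                      % store as Ny rows
end

% Extract the regression matrix for a 2-component model.
k    = 2;
bk   = b((k-1)*Ny+1 : k*Ny, :);   % Ny-by-Nx block for k components
yhat = xc * bk';                  % predictions of the centered y-block
```

The same index expression, (k-1)*Ny+1 : k*Ny, should pick out the k-component block from a b returned with options.outputversion = 2, provided the layout matches the note above.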
See Also
analysis, crossval, frpcr, mlr, modelstruct, pca, pls, preprocess, ridge, EVRIModel_Objects