Simpls
Latest revision as of 13:14, 23 September 2016
Purpose
Partial Least Squares regression using the SIMPLS algorithm.
Synopsis
- [reg,ssq,xlds,ylds,wts,xscrs,yscrs,basis] = simpls(x,y,ncomp,options)
Description
SIMPLS performs PLS regression using the SIMPLS algorithm.
Inputs
- x = X-block (predictor block) class "double" or "dataset", and
- y = Y-block (predicted block) class "double" or "dataset".
Optional Inputs
- ncomp = integer, number of latent variables to use {default = rank of X-block}, and
- options = a structure array discussed below.
Outputs
- reg = matrix of regression vectors in which each row is the regression vector for a given number of latent variables. If the Y-block contains multiple columns, the rows of reg are grouped by number of latent variables: the regression vectors for all columns of Y at 1 latent variable come first, followed by the regression vectors for all columns of Y at 2 latent variables, and so on. For example, with two Y columns and three latent variables:
- <math>\begin{bmatrix}{b_{y1,1}}\\ {b_{y2,1}}\\ {b_{y1,2}}\\ {b_{y2,2}}\\ {b_{y1,3}}\\ {b_{y2,3}}\end{bmatrix}</math>
- where b_{yn,k} is the regression vector for column "n" of the Y-block calculated from "k" latent variables.
- ssq = the sum of squares captured, with the columns:
- Column 1 = Number of latent variables (LVs)
- Column 2 = Variance captured (as a percent) in the X-block by this LV
- Column 3 = Total variance captured (%) in the X-block by all LVs up to this row
- Column 4 = Variance captured (as a percent) in the Y-block by this LV
- Column 5 = Total variance captured (%) in the Y-block by all LVs up to this row
- xlds = X-block loadings (size: x-block columns by LVs),
- ylds = Y-block loadings (size: y-block columns by LVs),
- wts = X-block weights (size: x-block columns by LVs),
- xscrs = X-block scores (size: samples by LVs),
- yscrs = Y-block scores (size: samples by LVs),
- basis = the basis of X-block loadings (size: x-block columns by LVs).
NOTE: In previous versions of SIMPLS, the X-block scores were unit length and the X-block loadings contained the variance. As of Version 3.0, the algorithm uses the standard convention, in which the X-block scores contain the variance.
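The row ordering of reg described under Outputs can be made concrete with a small index helper. This is an illustrative Python sketch, not toolbox code: the function name reg_row_index and the zero-based row convention are assumptions made here (MATLAB itself indexes rows from 1).

```python
def reg_row_index(k, n, n_y):
    """Return the zero-based row of reg that holds b_{yn,k}.

    k   : number of latent variables (1-based)
    n   : Y-block column (1-based)
    n_y : total number of Y-block columns

    Hypothetical helper: rows are grouped by latent variable count,
    and within each group they run over the Y columns.
    """
    if k < 1 or not (1 <= n <= n_y):
        raise ValueError("k must be >= 1 and n must lie in 1..n_y")
    return (k - 1) * n_y + (n - 1)


# Reproduce the ordering for two Y columns and three latent variables.
n_y, ncomp = 2, 3
layout = [(n, k) for k in range(1, ncomp + 1) for n in range(1, n_y + 1)]
rows = [reg_row_index(k, n, n_y) for (n, k) in layout]
```

Here layout lists the (Y column, LV count) pairs in row order, and rows confirms that the helper assigns them consecutive row positions. For the one-based MATLAB row, add 1 to the returned value.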
The calculations for variance captured are shown here:
- Xlds = ((X*wts)'*X)'
- Ylds = Y'*X*wts
- ssqX = ΣΣ(X.^2)
- ssqY = ΣΣ(Y.^2)
- VarX = diag(Xlds'*Xlds)/ssqX
- VarY = diag(Ylds'*Ylds)/ssqY
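The variance-captured formulas above can be checked numerically. The following NumPy sketch evaluates them literally, reading Ylds as Y' times the scores X*wts (the dimensionally consistent form). Two things are assumptions made for this example and are not stated by the toolbox documentation: the score columns X*wts are taken to be orthonormal, so that each diagonal entry is a fraction of the block's total sum of squares, and the weights are built from arbitrary random directions rather than by SIMPLS itself. The function name variance_captured is hypothetical.

```python
import numpy as np


def variance_captured(X, Y, wts):
    """Evaluate the Xlds/Ylds/VarX/VarY formulas for given weights.

    Assumption of this sketch: the columns of X @ wts are orthonormal.
    Returns per-LV fractions (not percents) for the X- and Y-blocks.
    """
    scores = X @ wts                    # samples x LVs
    xlds = (scores.T @ X).T             # x-block columns x LVs
    ylds = Y.T @ scores                 # y-block columns x LVs
    ssq_x = np.sum(X ** 2)              # total sum of squares of X
    ssq_y = np.sum(Y ** 2)              # total sum of squares of Y
    var_x = np.diag(xlds.T @ xlds) / ssq_x
    var_y = np.diag(ylds.T @ ylds) / ssq_y
    return var_x, var_y


rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
Y = rng.standard_normal((20, 2))
X -= X.mean(axis=0)                     # mean-center both blocks
Y -= Y.mean(axis=0)

# Illustrative weights only (NOT SIMPLS weights): start from random
# directions and rescale them so that X @ wts has orthonormal columns.
W0 = rng.standard_normal((5, 3))
Q, R = np.linalg.qr(X @ W0)             # X @ W0 = Q @ R, Q orthonormal
wts = W0 @ np.linalg.inv(R)             # hence X @ wts == Q

var_x, var_y = variance_captured(X, Y, wts)

# Arrange the results in the five-column layout described for ssq.
ssq_like = np.column_stack([
    np.arange(1, 4),
    100 * var_x, 100 * np.cumsum(var_x),
    100 * var_y, 100 * np.cumsum(var_y),
])
```

Under the orthonormality assumption the cumulative columns cannot exceed 100 percent, because the scores span a subspace onto which each block is projected. With real SIMPLS weights the Y-block columns would typically be much larger than with the random directions used here.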
Options
options = a structure array with the following fields:
- display: [ {'on'} | 'off' ], governs level of display, and
- ranktest: [ 'none' | 'data' | 'scores' | {'auto'} ], governs type of rank test to perform.
- 'data' = single test on X-block (faster with smaller data blocks and more components),
- 'scores' = test during regression on scores matrix (faster with larger data matrices),
- 'auto' = automatic selection, or
- 'none' = assumes X-block has sufficient rank.
See Also
crossval, modelstruct, pcr, pls, preprocess, nippls, analysis