Simpls: Difference between revisions

Latest revision as of 14:14, 23 September 2016

Purpose

Partial Least Squares regression using the SIMPLS algorithm.

Synopsis

[reg,ssq,xlds,ylds,wts,xscrs,yscrs,basis] = simpls(x,y,ncomp,options)

Description

SIMPLS performs PLS regression using the SIMPLS algorithm.

Inputs

x = X-block (predictor block) class "double" or "dataset", and
y = Y-block (predicted block) class "double" or "dataset".

Optional Inputs

ncomp = integer, number of latent variables to use {default = rank of X-block}, and
options = a structure array discussed below.

Outputs

reg = matrix of regression vectors where each row corresponds to a regression vector for a given number of latent variables. If the Y-block contains multiple columns, the rows of reg will be in groups of latent variables (so that the regression vectors for all columns of Y at 1 latent variable will come first, followed by the regression vectors for all columns of Y at 2 latent variables, etc.). For example, with two Y-block columns and three latent variables:

\[
\begin{bmatrix} b_{y1,1} \\ b_{y2,1} \\ b_{y1,2} \\ b_{y2,2} \\ b_{y1,3} \\ b_{y2,3} \end{bmatrix}
\]

where b_{yn,k} is the regression vector for column "n" of the Y-block calculated from "k" latent variables.

ssq = the sum of squares captured (ssq) with the columns:

Column 1 = Number of latent variables (LVs)

Column 2 = Variance captured (as a percent) in the X-block by this LV

Column 3 = Total variance captured (%) in the X-block by all LVs up to this row

Column 4 = Variance captured (as a percent) in the Y-block by this LV

Column 5 = Total variance captured (%) in the Y-block by all LVs up to this row

xlds = X-block loadings (size: x-block columns by LVs),
ylds = Y-block loadings (size: y-block columns by LVs),
wts = X-block weights (size: x-block columns by LVs),
xscrs = X-block scores (size: samples by LVs),
yscrs = Y-block scores (size: samples by LVs),
basis = the basis of X-block loadings (size: x-block columns by LVs).
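
The row ordering of reg and the column layout of ssq described above can be sketched as follows. This is a minimal illustration rather than toolbox code: the placeholder matrices stand in for real simpls output, the ssq numbers are made up, and the names ny, k, n, row, b, Bk, tot, and nlv are hypothetical.

```matlab
% Placeholder for reg: 6 rows (2 Y-columns x 3 LVs) by 4 X-variables.
% Real values would come from simpls; these are for illustration only.
ny  = 2;                          % number of Y-block columns (assumed)
reg = reshape(1:24, 6, 4);        % stand-in for the reg output

k   = 3;                          % number of latent variables in the model
n   = 1;                          % Y-block column of interest
row = (k-1)*ny + n;               % rows are grouped by LV, then by Y column
b   = reg(row,:);                 % regression vector b_{y1,3}
Bk  = reg((k-1)*ny + (1:ny), :);  % all regression vectors for the k-LV model

% Made-up ssq values, columns 1 to 3 only (LV number, X this LV, X total).
ssq = [1 60 60
       2 25 85
       3 10 95];
tot = cumsum(ssq(:,2));           % reproduces the running total in column 3
nlv = ssq(find(ssq(:,3) >= 80, 1), 1);  % fewest LVs with X total >= 80%
```

With two Y-block columns, the model using three latent variables occupies rows 5 and 6 of reg, so a prediction for a single Y column needs only one of those rows.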

NOTE: In previous versions of SIMPLS, the X-block scores were unit length and the X-block loadings contained the variance. As of Version 3.0, this algorithm uses the standard convention, in which the X-block scores contain the variance.

The calculations for variance captured are shown here:

Xlds = ((X*wts)'*X)'

Ylds = Y'*X*wts

ssqX = ΣΣ(X.^2)

ssqY = ΣΣ(Y.^2)

VarX = diag(Xlds'*Xlds)/ssqX

VarY = diag(Ylds'*Ylds)/ssqY

Options

options = a structure array with the following fields:

display: [ {'on'} | 'off' ], governs level of display, and
ranktest: [ 'none' | 'data' | 'scores' | {'auto'} ], governs type of rank test to perform.

'data' = single test on X-block (faster with smaller data blocks and more components),

'scores' = test during regression on scores matrix (faster with larger data matrices),

'auto' = automatic selection, or

'none' = assumes X-block has sufficient rank.
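
A call using these options might look like the sketch below. The data are random and purely illustrative, and it is an assumption that a structure holding only the two fields listed above is accepted as-is. The call is guarded so that the snippet also runs when simpls is not on the MATLAB path.

```matlab
% Small random data set, for illustration only.
x = rand(20,5);                   % 20 samples, 5 X-block variables
y = rand(20,2);                   % 20 samples, 2 Y-block variables
ncomp = 3;                        % number of latent variables to use

% Options fields as listed above. Assumption: no other fields are required.
options = struct('display','off','ranktest','data');

% Only call simpls when the function is available on the path.
if exist('simpls','file')
  [reg,ssq,xlds,ylds,wts,xscrs,yscrs,basis] = simpls(x,y,ncomp,options);
end
```

Setting ranktest to 'data' follows the guidance above for smaller data blocks; 'scores' would be the choice for larger matrices.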

@@ Line 23: / Line 23: @@
 ====Outputs====
-* '''reg''' = matrix of regression vectors,
+* '''reg''' = matrix of regression vectors where each row corresponds to a regression vector for a given number of latent variables. If the Y-block contains multiple columns, the rows of '''reg''' will be in groups of latent variables (so that the regression vectors for all columns of Y at 1 latent variable will come first, followed by the regression vectors for all columns of Y at 2 latent variables, etc)
-* '''ssq''' = the sum of squares captured (ssq),
+::<math>\begin{bmatrix}{b_{y1,1}}\\ {b_{y2,1}}\\ {b_{y1,2}}\\ {b_{y2,2}}\\ {b_{y1,3}}\\ {b_{y2,3}}\end{bmatrix}</math>
-* '''xlds''' = X-block loadings,
+:where b<sub>yn,k</sub> is the regression vector for column "n" of the Y-block calculated from "k" latent variables.
-* '''ylds''' = Y-block loadings,
+* '''ssq''' = the sum of squares captured (ssq) with the columns:
-* '''wts''' = X-block weights,
+::Column 1 = Number of latent variables (LVs)
-* '''xscrs''' = X-block scores,
+::Column 2 = Variance captured (as a percent) in the X-block by this LV
-* '''yscrs''' = Y-block scores, and
+::Column 3 = Total variance captured (%) by all LVs up to this row
-* '''basis''' = the basis of X-block loadings.
+::Column 4 = Variance captured (as a percent) in the X-block by this LV
+::Column 5 = Total variance captured (%) by all LVs up to this row
+* '''xlds''' = X-block loadings (size: x-block columns by LVs),
+* '''ylds''' = Y-block loadings (size: y-block columns by LVs),
+* '''wts''' = X-block weights  (size: x-block columns by LVs),
+* '''xscrs''' = X-block scores (size: samples by LVs),
+* '''yscrs''' = Y-block scores (size: samples by LVs),
+* '''basis''' = the basis of X-block loadings (size: x-block columns by LVs).
-'''NOTE:''' The regression matrices are ordered in reg such that each ''Ny'' (number of Y-block variables) rows correspond to the regression matrix for that particular number of latent variables.
+'''NOTE:''' in previous versions of SIMPLS, the X-block scores were unit length and the X-block loadings contained the variance. As of Version 3.0, this algorithm now uses standard convention in which the X-block scores contain the variance.
-'''NOTE:''' in previous versions of SIMPLS, the X-block scores were unit length and the X-block loadings contained the variance. As of Version 3.0, this algorithm now uses standard convention in which the X-block scores contain the variance.
+The calculations for Variance Captured is shown here:
+::Xlds = ((X*wts)’X)’
+::Ylds = Y’YX*wts
+::ssqX = ΣΣ(X.^2)
+::ssqY = ΣΣ(Y.^2)
+::VarX = diag(Xlds’*Xlds)/ssqX
+::VarY = diag(Ylds’*Ylds)/ssqY
 ===Options===
@@ Line 49: / Line 65: @@
 ===See Also===
-[[crossval]], [[modelstruct]], [[pcr]], [[plsnipal]], [[preprocess]], [[analysis]]
+[[crossval]], [[modelstruct]], [[pcr]], [[pls]], [[preprocess]], [[nippls]], [[analysis]]
