Simpls

From Eigenvector Research Documentation Wiki

Latest revision as of 13:14, 23 September 2016

Purpose

Partial Least Squares regression using the SIMPLS algorithm.

Synopsis

[reg,ssq,xlds,ylds,wts,xscrs,yscrs,basis] = simpls(x,y,ncomp,options)

Description

SIMPLS performs PLS regression using the SIMPLS algorithm.

Inputs

  • x = X-block (predictor block) class "double" or "dataset", and
  • y = Y-block (predicted block) class "double" or "dataset".

Optional Inputs

  • ncomp = integer, number of latent variables to use {default = rank of X-block}, and
  • options = a structure array discussed below.

Outputs

  • reg = matrix of regression vectors where each row corresponds to a regression vector for a given number of latent variables. If the Y-block contains multiple columns, the rows of reg will be in groups of latent variables (so that the regression vectors for all columns of Y at 1 latent variable will come first, followed by the regression vectors for all columns of Y at 2 latent variables, etc.):
reg = [b_y1,1; b_y2,1; ... ; b_yn,1; b_y1,2; b_y2,2; ... ; b_yn,k]
where b_yn,k is the regression vector for column "n" of the Y-block calculated from "k" latent variables.
  • ssq = the sum of squares captured (ssq) with the columns:
Column 1 = Number of latent variables (LVs)
Column 2 = Variance captured (as a percent) in the X-block by this LV
Column 3 = Total variance captured (%) in the X-block by all LVs up to this row
Column 4 = Variance captured (as a percent) in the Y-block by this LV
Column 5 = Total variance captured (%) in the Y-block by all LVs up to this row
  • xlds = X-block loadings (size: x-block columns by LVs),
  • ylds = Y-block loadings (size: y-block columns by LVs),
  • wts = X-block weights (size: x-block columns by LVs),
  • xscrs = X-block scores (size: samples by LVs),
  • yscrs = Y-block scores (size: samples by LVs),
  • basis = the basis of X-block loadings (size: x-block columns by LVs).

NOTE: In previous versions of SIMPLS, the X-block scores were unit length and the X-block loadings contained the variance. As of Version 3.0, the algorithm uses the standard convention in which the X-block scores contain the variance.
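
The row ordering of reg described above can be illustrated with a short sketch. The data below are random and purely hypothetical, x and y are assumed to be mean-centered before the call, and the index variables (ny, k, n, row) are names introduced here for illustration only.

```matlab
% Hypothetical example data; both blocks are assumed to be mean-centered already.
rng(0);
x = randn(20,6);  x = bsxfun(@minus, x, mean(x,1));
y = randn(20,2);  y = bsxfun(@minus, y, mean(y,1));

ncomp = 3;                  % number of latent variables to calculate
reg   = simpls(x,y,ncomp);  % request only the first output (regression vectors)

ny  = size(y,2);            % number of Y-block columns
k   = 2;                    % number of latent variables of interest
n   = 1;                    % Y-block column of interest
row = (k-1)*ny + n;         % rows are grouped by LV, then by Y column within each group

b    = reg(row,:)';         % regression vector b_yn,k as a column vector
yhat = x*b;                 % fitted values for Y column n using k latent variables
```

With two Y-block columns, rows 1 and 2 of reg hold the one-LV vectors, so the two-LV vector for the first Y column is found in row 3.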


The calculations for Variance Captured are shown here:

Xlds = ((X*wts)'X)'
Ylds = Y'YX*wts
ssqX = ΣΣ(X.^2)
ssqY = ΣΣ(Y.^2)
VarX = diag(Xlds'*Xlds)/ssqX
VarY = diag(Ylds'*Ylds)/ssqY

Options

options = a structure array with the following fields:

  • display: [ {'on'} | 'off' ], governs level of display, and
  • ranktest: [ 'none' | 'data' | 'scores' | {'auto'} ], governs type of rank test to perform.
'data' = single test on X-block (faster with smaller data blocks and more components),
'scores' = test during regression on scores matrix (faster with larger data matrices),
'auto' = automatic selection, or
'none' = assumes X-block has sufficient rank.
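
As a minimal sketch of how the options input might be supplied, the structure below sets both documented fields explicitly. It assumes simpls accepts a plain structure built with struct; the data are random and hypothetical, and are assumed to be mean-centered.

```matlab
% Hypothetical example data, assumed mean-centered.
rng(0);
x = randn(20,6);  x = bsxfun(@minus, x, mean(x,1));
y = randn(20,1);  y = y - mean(y);

% Options structure with the two fields described above.
options = struct('display','off', ...   % suppress display
                 'ranktest','data');    % single rank test on the X-block

% Fit 3 latent variables and return the regression vectors and the ssq table.
[reg,ssq] = simpls(x,y,3,options);
```

Here 'data' is chosen because the X-block is small; for larger data matrices the description above suggests 'scores', or 'auto' to let the function decide.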

See Also

crossval, modelstruct, pcr, pls, preprocess, nippls, analysis