Plsda
Purpose
Partial least squares discriminant analysis.
Synopsis
- model = plsda(x,y,ncomp,options)
- model = plsda(x,ncomp,options)
- pred = plsda(x,model,options)
- valid = plsda(x,y,model,options)
Description
PLSDA is a multivariate inverse least squares discrimination method used to classify samples. The y-block in a PLSDA model indicates which samples are in the class(es) of interest through either:
(A) a column vector of class numbers indicating class assignments:
y = [1 1 3 2]';
NOTE: If class assignments are already stored with the input (x), y can be omitted; option (A) is then assumed, using the first class set of the x-block rows.
(B) a matrix of one or more columns containing a logical zero (= not in class) or one (= in class) for each sample (row):
y = [1 0 0; 1 0 0; 0 0 1; 0 1 0];
NOTE: When a vector of class numbers is used (case A, above), class zero (0) is reserved for "unknown" samples and, thus, samples of class zero are never used when calibrating a PLSDA model. The model will include predictions for these samples.
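The relationship between the two y-block forms can be sketched in base MATLAB. This is only an illustration of the encoding, not the toolbox's internal code; the variable names (classes, ylogical, usedincal) and the extra class-zero sample are hypothetical.

```matlab
% Class vector (case A); the last sample is class 0, i.e. "unknown".
y = [1 1 3 2 0]';

% Classes to model: every distinct class number except the reserved zero.
classes = unique(y(y ~= 0))';

% Logical matrix (case B): one row per sample, one column per class.
ylogical = zeros(numel(y), numel(classes));
for k = 1:numel(classes)
    ylogical(:, k) = (y == classes(k));
end

% Class-zero samples have an all-zero row and would not be used
% when calibrating the model.
usedincal = (y ~= 0);
disp(ylogical(usedincal, :))
```

For the four known samples, the rows of ylogical match the matrix shown in case (B).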
The prediction from a PLSDA model is a value of nominally zero or one. A value closer to zero indicates that the sample is NOT in the modeled class; a value closer to one indicates that it is. In practice, a threshold between zero and one is determined: a sample with a prediction above the threshold is assigned to the class, and a sample below it is not (see, for example, PLSDTHRES). Similarly, the probability of a sample being inside or outside the class can be calculated using DISCRIMPROB. The predicted probability of each class is included in the output model structure in the field:
- model.details.predprobability
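The thresholding step described above can be sketched as follows. The predicted values and the fixed threshold of 0.5 are hypothetical, chosen only for illustration; the threshold estimation performed by PLSDTHRES and the probability calculation in DISCRIMPROB are not reproduced here.

```matlab
% Hypothetical predicted y-values for five samples and one modeled class.
% Predictions are only nominally 0 or 1, so values outside [0, 1] can occur.
ypred = [0.92 0.08 0.55 -0.10 0.47]';

% Assumed fixed threshold; in practice it would be determined from the
% calibration data rather than set by hand.
threshold = 0.5;

% true = sample assigned to the modeled class, false = not in the class.
inclass = ypred > threshold;
disp([ypred inclass])
```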
Inputs
- x = X-block (predictor block), class "double" or "dataset".
- y = Y-block. OPTIONAL if x is a dataset containing classes for the sample mode (mode 1); otherwise, y is one of:
- (A) a column vector of sample classes, one for each sample in x,
- (B) a logical array with 1 indicating class membership for each sample (rows) in one or more classes (columns)
- or (C) a cell array of groupings of the classes in the x-block data. For example, {[1 2] [3]} would model classes 1 and 2 as a single group against class 3.
- ncomp = the number of latent variables to be calculated (positive integer scalar).
- options = an optional input options structure (see Options below)
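One way to read the cell-array form (C) is as a recipe for a logical y-block with one column per group. The sketch below shows that interpretation in base MATLAB; it is an assumption about the meaning of the grouping, not the toolbox's own implementation, and classvec, groups and ygroup are hypothetical names.

```matlab
% Hypothetical class numbers stored with the x-block samples.
classvec = [1 2 3 3 1 2]';

% Grouping from the example: classes 1 and 2 together, against class 3.
groups = {[1 2] [3]};

% One column per group; a 1 marks membership of the sample in that group.
ygroup = zeros(numel(classvec), numel(groups));
for g = 1:numel(groups)
    ygroup(:, g) = ismember(classvec, groups{g});
end
disp(ygroup)
```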
Outputs
- model = standard model structure containing the PLSDA model (See MODELSTRUCT).
- pred = structure array with predictions
- valid = structure array with predictions
Note: Calling plsda with no inputs starts the graphical user interface (GUI) for this analysis method.
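The call forms listed in the Synopsis could be combined as in the following sketch. The data are made up, and it is assumed that an options structure containing only some fields is accepted, with the remaining options left at their defaults. The calls run only if plsda is found on the MATLAB path, so the block is otherwise harmless.

```matlab
% Hypothetical data: 6 samples (rows), 4 variables (columns), two classes.
x = [1 2 3 4; 2 3 4 5; 1 3 3 5; 9 8 7 6; 8 8 6 6; 9 7 7 5];
y = [1 1 1 2 2 2]';
ncomp = 2;                      % number of latent variables

% Option names taken from the Options list; values chosen to keep the
% example quiet (assumption: a partial structure is accepted).
options = struct('display', 'off', 'plots', 'none');

% Only attempt the calls when the toolbox function is available.
if exist('plsda', 'file')
    model = plsda(x, y, ncomp, options);   % calibrate a model
    pred  = plsda(x, model, options);      % apply the model to data
    valid = plsda(x, y, model, options);   % apply the model with known y
end
```

Here the calibration data are reused for the pred and valid calls only to keep the sketch short; new samples would normally be passed instead.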
Options
- display: [ 'off' | {'on'} ] governs level of display to command window.
- plots: [ 'none' | {'final'} ] governs level of plotting.
- preprocessing: {[] []} preprocessing structures for x and y blocks (see PREPROCESS).
- algorithm: [ 'nip' | {'sim'} ] PLS algorithm to use: NIPALS ('nip') or SIMPLS ('sim').
- blockdetails: [ 'compact' | {'standard'} | 'all' ] extent of detail included in the model. 'standard' keeps only the y-block; 'all' keeps both the x- and y-blocks.