Plsdthres

From Eigenvector Research Documentation Wiki

Purpose

Bayesian threshold determination for PLS Discriminant Analysis (PLSDA).

Synopsis

[threshold,misclassed,prob] = plsdthres(model,options)
[threshold,misclassed,prob] = plsdthres(y,ypred,options)

Description

PLSDTHRES uses the distribution of calibration-sample predictions from a PLS model built for two or more logical classes to automatically determine the threshold value that best separates those classes, that is, the value giving the lowest probability of false classification for future predictions. The predicted values for each class are assumed to be approximately normally distributed. If the calibration contains more than two classes, thresholds are determined to distinguish all of them; in that case the primary misclassification threat is assumed to come from the adjacent class(es).
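
The two-class case of the idea described above can be sketched with base MATLAB alone. This is an illustration under stated assumptions, not the PLSDTHRES source: the synthetic predictions, the equal priors, the grid search, and all variable names (ypred0, ypred1, npdf, ygrid, post1) are hypothetical. It fits a normal distribution to each class's predictions and takes as the threshold the point between the class means where the prior-weighted densities are equal.

```matlab
% Illustrative two-class sketch (not the PLSDTHRES implementation).
% Assumes the predictions for each class are normally distributed.
rng(0);                                  % reproducible synthetic data
ypred0 = 0.1 + 0.20*randn(200,1);        % predictions for class 0 samples
ypred1 = 0.9 + 0.25*randn(200,1);        % predictions for class 1 samples

m0 = mean(ypred0);  s0 = std(ypred0);    % normal fit for class 0
m1 = mean(ypred1);  s1 = std(ypred1);    % normal fit for class 1
p0 = 0.5;           p1 = 0.5;            % equal priors (assumed)

% Normal density written out so that no additional toolbox is needed
npdf = @(x,m,s) exp(-0.5*((x-m)./s).^2) ./ (s*sqrt(2*pi));

% Search between the class means for the point where the
% prior-weighted densities are closest to equal
ygrid = linspace(m0, m1, 10001);
d     = p1*npdf(ygrid,m1,s1) - p0*npdf(ygrid,m0,s0);
[~,i] = min(abs(d));
threshold = ygrid(i);

% Posterior probability of class 1 as a function of predicted y
post1 = @(x) p1*npdf(x,m1,s1) ./ (p0*npdf(x,m0,s0) + p1*npdf(x,m1,s1));

fprintf('threshold = %.3f, P(class 1 | y = threshold) = %.3f\n', ...
        threshold, post1(threshold));
```

At the threshold found this way the two posterior probabilities are approximately equal, which is what makes it the minimum-error split under the normality assumption and equal costs.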

Inputs

  • y = measured Y-block values used in PLS, and
  • ypred = PLS predicted Y values for calibration samples, or
  • model = a PLS/PLSDA model structure from which both y and ypred can be obtained automatically.

Outputs

  • threshold = [], vector of thresholds. If y consists of more than two classes, threshold will be a vector giving the upper bound y-value for each class.
  • misclassed = [], array containing the fraction of misclassifications for each class (rows): Column 1 = false negatives and Column 2 = false positives.
  • prob = lookup matrix of predicted y (column 1) vs. probability of each class (columns 2 to end).
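
As a rough usage sketch of the (y,ypred,options) calling form, the lines below build synthetic two-class data with base MATLAB functions and pass it to plsdthres. The data, the variable names ynew, inclass, and pnew, and the choice to supply an options structure holding only the three documented fields are assumptions made for illustration. The sketch also assumes that a scalar threshold is returned for two classes and that column 1 of prob is sorted with unique values, so that interp1 can be applied to it.

```matlab
% Synthetic two-class example (illustrative data, not from a real model)
rng(1);
y     = [zeros(50,1); ones(50,1)];     % measured class membership (0 or 1)
ypred = y + 0.2*randn(100,1);          % stand-in for PLS calibration predictions

options.plots = 'none';                % suppress plotting
options.cost  = [];                    % default: equal cost
options.prior = [];                    % default: equal priors

[threshold,misclassed,prob] = plsdthres(y,ypred,options);

% With a model structure, the other documented form would be:
%   [threshold,misclassed,prob] = plsdthres(model,options);

% Compare hypothetical new predictions against the threshold
ynew    = [0.15; 0.48; 0.95];
inclass = ynew > threshold;            % true where the prediction exceeds the threshold

% Look up class probabilities for the same values from the prob matrix
% (column 1 = predicted y, columns 2 to end = probability of each class)
pnew = interp1(prob(:,1), prob(:,2:end), ynew);
```

Values of ynew that fall outside the range covered by column 1 of prob come back as NaN from interp1 with its default settings.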

Options

options is a structure array with the following fields:

  • plots: [ 'none' | 'final' | {'auto'} ], governs plotting behavior
    • 'auto' makes plots if no output is requested {default}
  • cost: [], vector of logarithmic cost biases, one for each class in y. cost is used to bias against misclassification of a particular class or classes {default = [] uses all zeros, i.e. equal cost}.
  • prior: [], vector of prior probabilities of observing each class. If any class prior is Inf, the frequency of observation of that class in the calibration set is used as its prior probability. If all priors are Inf, the result is the fewest incorrect predictions, assuming that the probability of observing a given class in future samples is similar to the frequency of that class in the calibration set {default = [] uses all ones, i.e. equal priors}.
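
A short sketch of the prior option follows. Setting every prior to Inf requests priors taken from the class frequencies in the calibration set, as described above. The unbalanced synthetic data and the names opts, thr_equal, and thr_freq are assumptions for illustration, as is passing an options structure that holds only the three documented fields. The cost field is left at its default because its numeric scale is not detailed here.

```matlab
% Unbalanced synthetic calibration set: 80 samples of class 0, 20 of class 1
rng(2);
y     = [zeros(80,1); ones(20,1)];
ypred = y + 0.25*randn(100,1);         % stand-in for PLS calibration predictions

opts.plots = 'none';                   % suppress plotting
opts.cost  = [];                       % default: equal cost

opts.prior = [];                       % default: equal priors
thr_equal  = plsdthres(y,ypred,opts);

opts.prior = [Inf Inf];                % use calibration-set class frequencies as priors
thr_freq   = plsdthres(y,ypred,opts);

disp(thr_equal);
disp(thr_freq);
```

Comparing the two thresholds shows how much the class balance of the calibration set influences the split for a given data set.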

See Also

class2logical, crossval, discrimprob, plsda, plsdaroc, simca