Savgolcv

From Eigenvector Research Documentation Wiki

Purpose

Cross-validation for Savitzky-Golay smoothing and differentiation.

Synopsis

cumpress = savgolcv(x,y,lv,width,order,deriv,ind,rm,cvi,pre); %for x class "double"
cumpress = savgolcv(x,y,lv,width,order,deriv,[],rm,cvi,pre); %for x class "dataset"

Description

SAVGOLCV performs cross-validation of the Savitzky-Golay parameters: filter width, polynomial order, and derivative order.
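To make the three parameters concrete, the following is a minimal sketch of a Savitzky-Golay filter in base MATLAB: a polynomial of the given order is fitted by least squares to each window of the given width, and the requested derivative of that fit is evaluated at the window centre. This is an illustration only and is not taken from SAVGOL; in particular it assumes unit channel spacing and simply drops the points at the ends of the row, which may differ from how SAVGOL treats edges.

```matlab
% Illustrative parameter values (one width, one order, one derivative)
width = 11;                 % number of points in the filter window (odd)
order = 2;                  % polynomial order
deriv = 1;                  % derivative order

half = (width-1)/2;
t    = (-half:half)';       % window positions relative to the centre
A    = t.^(0:order);        % width x (order+1) polynomial design matrix
C    = pinv(A);             % row d+1 gives the fitted coefficient of t^d
coef = factorial(deriv)*C(deriv+1,:);   % weights for the deriv-th derivative at t = 0

% Test signal: x(k) = 0.5*k^2, whose first derivative at channel k is k
x  = 0.5*(1:50).^2;
xd = conv(x, fliplr(coef), 'valid');    % apply the weights to every full window

% Channels at which a full window is available
centres = (1+half):(numel(x)-half);
```

Because the test signal is itself a quadratic, an order-2 filter reproduces its derivative exactly, so xd matches centres to within rounding error.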

Inputs

  • x = M by N matrix of predictor variables with ROW vectors to be smoothed (e.g. spectra), and
  • y = M by P matrix of predicted variables.

Optional Inputs

  • ind = indices of columns of x to be used for calibration {default ind = 1:N, i.e., all columns of x}.

The following are optional Savitzky-Golay parameters (calls SAVGOL). Entering a vector instead of a scalar for any of these parameters cross-validates over the values in that vector.

  • width = number of points in filter {default width = [11 17 23]}.
  • order = polynomial order {default order = [2 3]}.
  • deriv = derivative order {default deriv = [0 1 2]}.

The following are optional cross-validation parameters (calls CROSSVAL).

  • lv = maximum number of LVs {default lv = min(size(x))}.
  • rm = regression method. Options are: rm = 'nip', PLS via NIPALS algorithm; rm = 'sim', PLS via SIMPLS algorithm {default}, and rm = 'pcr', uses PCR.
  • cvi = cross-validation method. Options are: cvi = 'loo', leave-one-out, cvi = 'vet', venetian blinds {default}, cvi = 'con', contiguous blocks, and cvi = 'rnd', repeated random test sets.
  • split = number of subsets to split the data into {default = 5}; required for cvi = 'vet', 'con', or 'rnd'.
  • iter = number of iterations {default = 5}; required for cvi = 'rnd'.
  • mc = 0 suppresses mean centering of subsets {default mc = 1}.
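A call that cross-validates over several widths, orders, and derivatives might look like the sketch below. It requires PLS_Toolbox on the MATLAB path. The data are synthetic placeholders, and the sketch assumes that the trailing pre input may be omitted so that its default is used; this page does not confirm that. Leave-one-out is chosen because it is the one cvi option listed above that needs neither split nor iter.

```matlab
% Synthetic placeholder data: 30 samples (rows) by 100 channels (columns)
rng(0);
x = cumsum(randn(30,100),2);        % smooth-ish random "spectra"
y = x(:,40) + 0.1*randn(30,1);      % single predicted variable

lv    = 5;                          % maximum number of latent variables
width = [11 17 23];                 % filter widths to cross-validate
order = [2 3];                      % polynomial orders to cross-validate
deriv = [0 1 2];                    % derivative orders to cross-validate
ind   = 1:size(x,2);                % use all columns of x

% ASSUMPTION: the final input (pre) is left out and takes its default
cumpress = savgolcv(x,y,lv,width,order,deriv,ind,'sim','loo');
```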

Outputs

The output, cumpress, is a 4-dimensional array of cumulative Predictive Residual Error Sum of Squares (PRESS) values, with each dimension corresponding to one of the parameters cross-validated over:

cumpress(i,:,:,:) = derivative dimension,
cumpress(:,j,:,:) = latent variable dimension,
cumpress(:,:,k,:) = window width dimension, and
cumpress(:,:,:,l) = polynomial order dimension.
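Given this layout, the parameter combination with the lowest cumulative PRESS can be located with min and ind2sub. The sketch below uses a synthetic array in place of real savgolcv output and assumes that index j along the second dimension corresponds to a model with j latent variables. The variable names best_deriv, best_lv, best_width, and best_order are illustrative.

```matlab
% Parameter vectors as they would have been passed to savgolcv
deriv = [0 1 2];
lv    = 4;
width = [11 17 23];
order = [2 3];

% Synthetic stand-in for cumpress: deriv x lv x width x order
sz       = [numel(deriv) lv numel(width) numel(order)];
cumpress = reshape(1:prod(sz), sz) + 10;
cumpress(2,3,1,2) = 0.5;            % planted minimum for the demonstration

% Find the smallest PRESS and convert its linear index to four subscripts
[minpress, idx] = min(cumpress(:));
[i,j,k,l]       = ind2sub(size(cumpress), idx);

best_deriv = deriv(i);              % derivative dimension
best_lv    = j;                     % latent variable dimension (assumed j LVs)
best_width = width(k);              % window width dimension
best_order = order(l);              % polynomial order dimension
```

The global minimum is only one possible selection rule; a model with fewer latent variables and a nearly equal PRESS may be preferable.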

See Also

baseline, crossval, lamsel, mscorr, savgol, specedit, stdfir