Estimatefactors

From Eigenvector Research Documentation Wiki

Purpose

Estimate number of significant factors in multivariate data.

Synopsis

S = estimatefactors(x,options)

Description

Given a bilinear dataset, ESTIMATEFACTORS estimates the number of significant factors required to describe the data. The algorithm uses PCA bootstrapping (resampling) of the data. The PCA loadings determined for each resampling are compared for changes. Principal components which change significantly from one resampling to the next are probably due mostly to noise rather than signal.
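The resampling idea described above can be illustrated with a minimal sketch. This is not the Henry, Park, and Spiegelman algorithm that ESTIMATEFACTORS implements; it is a simplified stand-in, assuming synthetic data, row-wise resampling with replacement, and the absolute cosine between loadings as the measure of change. All variable names (`Vref`, `agree`, `stability`, etc.) are hypothetical.

```matlab
% Illustrative sketch only: NOT the algorithm used by ESTIMATEFACTORS.
rng(0);                                    % reproducible synthetic data
n = 100; p = 20; k = 3;                    % samples, variables, true factors
x = randn(n,k)*randn(k,p) + 0.05*randn(n,p);   % bilinear signal plus noise

xc = bsxfun(@minus, x, mean(x,1));         % mean-center the columns
[~,~,Vref] = svd(xc,'econ');               % reference PCA loadings

nboot = 40;                                % number of resamplings
ncomp = 8;                                 % number of components to compare
agree = zeros(nboot,ncomp);
for b = 1:nboot
    idx = randi(n,n,1);                    % draw rows with replacement
    xb  = x(idx,:);
    xb  = bsxfun(@minus, xb, mean(xb,1));
    [~,~,Vb] = svd(xb,'econ');             % loadings for this resampling
    % absolute cosine between matching loadings (the sign is arbitrary)
    agree(b,:) = abs(sum(Vref(:,1:ncomp).*Vb(:,1:ncomp),1));
end
stability = mean(agree,1);                 % near 1 = stable, lower = unstable
disp(stability)
```

In this sketch, loadings that stay nearly unchanged across resamplings give values close to 1, while loadings that vary from one resampling to the next give lower values, which is the behavior the description attributes to noise-dominated components.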

The output is an estimate of the signal-to-noise ratio (SNR) for each principal component. Components with ratios of 2 or below are dominated by noise, those with ratios above 3 are acceptable, and those between 2 and 3 are a judgment call. The number of factors needed to describe the data is the number of eigenvectors with signal-to-noise ratios greater than about 2.
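As a small sketch of how these thresholds might be applied, the following assumes a made-up SNR vector `S` (the values are invented for illustration and do not come from a real call) and counts the components in each range. The names `nfactors`, `nclear`, and `borderline` are hypothetical.

```matlab
% Hypothetical SNR values, one per principal component (not real output)
S = [45.2 12.7 3.4 2.5 1.1 0.8];

nfactors   = sum(S > 2);            % components with SNR greater than 2
nclear     = sum(S > 3);            % components clearly above the noise
borderline = find(S > 2 & S <= 3);  % components that are a judgment call

fprintf('Factors with SNR > 2: %d\n', nfactors);
fprintf('Factors with SNR > 3: %d\n', nclear);
fprintf('Borderline components: %s\n', mat2str(borderline));
```

Here the fourth component falls between 2 and 3, so whether to keep three or four factors is left to the analyst.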

This function is based on an algorithm developed and copyrighted in 1997 by Ronald C. Henry, Eun Sug Park, and Clifford H. Spiegelman, and is used by permission of the authors. For reference see:

  • Henry, R.C., Park, E.S., & Spiegelman, C.H. (1999). Comparing A New Algorithm With The Classic Methods For Estimating The Number Of Factors. Chemometrics and Intelligent Laboratory Systems, 48(1), 91-97.
  • Park, E.S., Henry, R.C., & Spiegelman, C.H. (2000). Estimating The Number Of Factors To Include In A High Dimensional Multivariate Bilinear Model. Communications in Statistics-Theory and Methods, 29(3), 723-746.

The ESTIMATEFACTORS function is called when the user selects Estimate Factor SNR from the Tools menu in Analysis. See Estimate Factor SNR for more information on SNR estimation with this tool.

Inputs

  • x = bilinear data, either in the form of a double array or dataset object

Outputs

  • S = vector containing an estimate of the signal to noise ratio for each principal component

Options

options = a structure array with the following fields:

  • plots: ['none' | {'final'} ] Governs plotting.
  • resample: [ {42} ] number of times the data is to be resampled. Generally, values of 40 or 50 are sufficient. Values greater than several hundred are not required.
  • maxfactors: [ {30} ] maximum number of factors to plot (if plots are selected by options.plots).
  • preprocessing: {[]} Preprocessing structure or keyword (see PREPROCESS), to apply before analyzing data.

The default options can be retrieved using: options = estimatefactors('options');
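A typical call might look like the sketch below, which uses only the calling forms and option fields listed on this page. It requires PLS_Toolbox on the MATLAB path; the data matrix `x` is synthetic and the chosen option values are assumptions for illustration.

```matlab
% Synthetic bilinear data: 50 samples, 20 variables, 3 underlying factors
x = randn(50,3)*randn(3,20) + 0.01*randn(50,20);

options = estimatefactors('options');  % start from the default options
options.plots    = 'none';             % suppress the final plot
options.resample = 50;                 % number of resamplings

S = estimatefactors(x,options);        % SNR estimate for each principal component
nfactors = sum(S > 2);                 % apply the "greater than about 2" guideline
```

Starting from the default structure and changing only the fields of interest keeps the remaining options (such as maxfactors and preprocessing) at their documented defaults.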

See Also

choosecomp, pca, pcaengine, preprocess, Estimate Factor SNR