Exploratory Analysis: Difference between revisions

From Eigenvector Research Documentation Wiki
imported>Jeremy
No edit summary
 
(17 intermediate revisions by 3 users not shown)
Line 1: Line 1:
__TOC__
Exploratory analysis methods examine data for trends, correlations, or other relationships. Sometimes the resulting models are used later to identify when new data does not follow the same trend as previous data (for example, [[pca|principal components analysis]] in multivariate statistical process control; see [[Process Control and Statistics]]) or to predict an amount of material or a property (also discussed in [[Quantitative_Regression_Analysis|Quantitative Regression Analysis]]). Often, however, these methods are used simply to learn more about the data.


 
==Main Exploratory Analysis Tools==
====Top-Level Exploratory Analysis Functions====
These functions provide high-level analysis of data. Most have various options and output model structures.


:[[analysis]] - Graphical user interface for data analysis.
:[[pca]] - Principal components analysis.
:[[mcr]] - Multivariate curve resolution with constraints.
:[[als_sit]] - Alternating Least Squares with Shift Invariant Trilinearity.
:[[purity]] - Self-modeling mixture analysis method based on purity of variables or spectra.
:[[corrspec]] - Resolves correlation spectroscopy maps.
:[[calccvbias]] - Calculate the Cross-Validation Bias from a cross-validated model.
:[[cluster]] - Cluster analysis with dendrograms using various algorithms.
:[[corrspec]] - Correlation spectroscopy maps.
:[[crossval]] - Cross-validation for decomposition and linear regression.
 
* Also See: [[Multiway Exploratory Analysis]]
 
==Evolving and Windowed Factor Analysis==
These functions provide moving and "evolving" (growing) windowed analysis of data.
:[[evolvfa]] - Evolving factor analysis (forward and reverse).
:[[ewfa]] - Evolving window factor analysis.
:[[estimatefactors]] - Estimate number of significant factors in multivariate data.
:[[wtfa]] - Window target factor analysis.
:[[mlpca]] - Maximum likelihood principal components analysis.
 
==Other Exploratory Tools==
These are data-exploration tools, some of which provide interfaces to analyze the data or other medium-level analysis functionality.
 
:[[anglemapper]] - Classification based on angle measures between signals.  
:[[coda_dw]] - Calculates values for the Durbin-Watson criterion for the columns of a data set.
:[[coda_dw_interactive]] - Interactive version of CODA_DW.
:[[comparelcms_sim_interactive]] - Interactive interface for COMPARELCMS.
:[[estimatefactors]] - Estimate number of significant factors in multivariate data.
:[[manrotate]] - Graphical interface to manually rotate model loadings.
:[[mlpca]] - Maximum likelihood principal components analysis.
:[[trendtool]] - Univariate trend analysis tool.
:[[cluster]] - KNN and K-means cluster analysis with dendrograms.


See Also [[Multiway Exploratory Analysis]]
==Application of Models to New Data==
 
====Application of Models to New Data====
In most cases, the function used to create a model (e.g., PCA, PLS, etc.) is also used to make predictions with that model. See the documentation for that function for details. In addition, the following utilities may be useful in certain applications.


Line 39: Line 43:
:[[pcapro]] - Projects new data on old principal components model.


====Model Analysis and Calculation Utilities====
==Model Analysis and Calculation Utilities==
:[[manrotate]] - Graphical interface to manually rotate model loadings.
Low-level engine and calculation functions.
 
:[[knnscoredistance]] - Calculate the average distance to the k-Nearest Neighbors in score space.
:[[qconcalc]] - Calculate Q residuals contributions for predictions on a model.
:[[residuallimit]] - Estimates confidence limits for sum squared residuals.
Line 57: Line 63:
:[[comparelcms_simengine]] - Computational engine for comparelcms.


====Plotting Utilities====
==Plotting Utilities==
:[[modlrder]] - Displays model info for standard model structures.
:[[plotloads]] - Extract and display loadings information from a model structure.
Line 64: Line 70:
:[[ssqtable]] - Displays variance captured table for model.
   
   
(Sub topic of [[Qualitative_Exploratory_Analysis_and_Classification|Qualitative_Exploratory_Analysis_and_Classification]])
 
(Sub topic of [[PLS_Toolbox_Topics|PLS_Toolbox_Topics]])

Latest revision as of 08:27, 11 December 2023

Exploratory analysis methods examine data for trends, correlations, or other relationships. Sometimes the resulting models are used later to identify when new data does not follow the same trend as previous data (for example, principal components analysis in multivariate statistical process control; see Process Control and Statistics) or to predict an amount of material or a property (also discussed in Quantitative Regression Analysis). Often, however, these methods are used simply to learn more about the data.

Main Exploratory Analysis Tools

These functions provide high-level analysis of data. Most have various options and output model structures.

analysis - Graphical user interface for data analysis.
pca - Principal components analysis.
mcr - Multivariate curve resolution with constraints.
als_sit - Alternating Least Squares with Shift Invariant Trilinearity.
purity - Self-modeling mixture analysis method based on purity of variables or spectra.
calccvbias - Calculate the Cross-Validation Bias from a cross-validated model.
cluster - Cluster analysis with dendrograms using various algorithms.
corrspec - Correlation spectroscopy maps.
crossval - Cross-validation for decomposition and linear regression.

Evolving and Windowed Factor Analysis

These functions provide moving and "evolving" (growing) windowed analysis of data.

evolvfa - Evolving factor analysis (forward and reverse).
ewfa - Evolving window factor analysis.
wtfa - Window target factor analysis.

Other Exploratory Tools

These are data-exploration tools, some of which provide interfaces to analyze the data or other medium-level analysis functionality.

anglemapper - Classification based on angle measures between signals.
coda_dw - Calculates values for the Durbin-Watson criterion for the columns of a data set.
coda_dw_interactive - Interactive version of CODA_DW.
comparelcms_sim_interactive - Interactive interface for COMPARELCMS.
estimatefactors - Estimate number of significant factors in multivariate data.
manrotate - Graphical interface to manually rotate model loadings.
mlpca - Maximum likelihood principal components analysis.
trendtool - Univariate trend analysis tool.
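
As an illustration of the kind of quantity coda_dw reports, the textbook Durbin-Watson statistic can be computed per column as sketched below. This is a minimal Python sketch, not PLS_Toolbox code: the function names durbin_watson and durbin_watson_columns are hypothetical, and whether coda_dw applies any preprocessing (for example, differentiation or scaling) before computing the criterion is not stated on this page.

```python
def durbin_watson(values):
    """Textbook Durbin-Watson statistic for one sequence of values.

    Ratio of the summed squared successive differences to the summed
    squared values. Smooth sequences give small values; rapidly
    alternating (noise-like) sequences give large values.
    """
    denominator = sum(v * v for v in values)
    if denominator == 0:
        raise ValueError("sequence has zero sum of squares")
    numerator = sum((b - a) ** 2 for a, b in zip(values, values[1:]))
    return numerator / denominator


def durbin_watson_columns(rows):
    """Apply durbin_watson to every column of a list-of-rows matrix."""
    return [durbin_watson(list(column)) for column in zip(*rows)]


# Illustrative data (made up): column 0 is a smooth peak, column 1 alternates.
smooth = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
noisy = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
data = [[s, n] for s, n in zip(smooth, noisy)]

dw = durbin_watson_columns(data)
print(dw)
```

In this toy data set the smooth column scores well below the alternating one, which is the property that makes the criterion useful for ranking variables by how noise-like they are.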

Application of Models to New Data

In most cases, the function used to create a model (e.g., PCA, PLS, etc.) is also used to make predictions with that model. See the documentation for that function for details. In addition, the following utilities may be useful in certain applications.

modelselector - Create or apply a model selector model.
compressmodel - Remove references to unused variables from a model.
matchvars - Align variables of a dataset to allow prediction with a model.
pcapro - Projects new data on old principal components model.
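
The build-then-apply pattern described in this section can be sketched generically as follows. This is a plain-Python illustration under stated assumptions, not the toolbox's own code or calling syntax: fit_one_component and project are hypothetical helpers, the model is limited to a single principal component found by power iteration, and mean-centering is assumed to be the only preprocessing. The point it shows is that new data must be centered with the calibration means stored in the model before being multiplied by the stored loadings.

```python
import math


def column_means(rows):
    """Mean of each column of a list-of-rows matrix."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def center(rows, means):
    """Subtract the given means from every row."""
    return [[v - m for v, m in zip(row, means)] for row in rows]


def fit_one_component(rows, iterations=100):
    """Fit a one-component PCA model by power iteration (hypothetical helper)."""
    means = column_means(rows)
    x = center(rows, means)
    nvars = len(means)
    loading = [1.0 / math.sqrt(nvars)] * nvars  # arbitrary unit-length start
    for _ in range(iterations):
        scores = [sum(v * p for v, p in zip(row, loading)) for row in x]  # t = X p
        update = [sum(t * row[j] for t, row in zip(scores, x))
                  for j in range(nvars)]                                  # X' t
        norm = math.sqrt(sum(u * u for u in update))
        loading = [u / norm for u in update]
    return {"means": means, "loading": loading}


def project(model, rows):
    """Project new rows onto the stored loading using the calibration means."""
    x = center(rows, model["means"])
    return [sum(v * p for v, p in zip(row, model["loading"])) for row in x]


# Illustrative calibration data (made up): second variable is twice the first.
calibration = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
model = fit_one_component(calibration)

# Apply the existing model to a new sample without refitting.
new_scores = project(model, [[5.0, 10.0]])
print(model["loading"], new_scores)
```

Storing the means alongside the loadings is the design choice that matters here: recentering new data with its own mean would place it at the origin of the score space and hide exactly the drift that the projection is meant to reveal.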

Model Analysis and Calculation Utilities

Low-level engine and calculation functions.

knnscoredistance - Calculate the average distance to the k-Nearest Neighbors in score space.
qconcalc - Calculate Q residuals contributions for predictions on a model.
residuallimit - Estimates confidence limits for sum squared residuals.
reviewmodel - Examines a standard model structure for typical problems.
tconcalc - Calculate Hotelling's T^2 contributions for predictions on a model.
tsqlim - Confidence limits for Hotelling's T^2.
varcap - Variance captured for each variable in PCA model.
varimax - Orthogonal rotation of loadings.


als - Alternating Least Squares computational engine.
datahat - Calculates the model estimate and residuals of the data.
dispmat - Calculates the dispersion matrix of two spectral data sets.
pcaengine - Principal Components Analysis computational engine.
tsqmtx - Calculates matrix for T^2 contributions for PCA.
comparelcms_simengine - Computational engine for comparelcms.
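
Several of the utilities above deal with Q residuals and Hotelling's T^2. The sketch below shows the standard textbook definitions of both statistics for one sample and a PCA model with orthonormal loadings. It is a hedged Python illustration, not the toolbox implementation: q_and_t2 is a hypothetical name, the model values are made up, and returning the signed residuals as per-variable Q contributions is one common convention that is assumed here rather than confirmed for qconcalc.

```python
def q_and_t2(sample, means, loadings, eigenvalues):
    """Q residual, Hotelling's T^2, and per-variable residuals for one sample.

    loadings    : list of orthonormal loading vectors, one per component
    eigenvalues : variance of the calibration scores on each component
    """
    centered = [v - m for v, m in zip(sample, means)]

    # Scores: projection of the centered sample onto each loading vector.
    scores = [sum(c * p for c, p in zip(centered, loading))
              for loading in loadings]

    # Model estimate of the centered sample, and what the model leaves over.
    estimate = [sum(t * loading[j] for t, loading in zip(scores, loadings))
                for j in range(len(centered))]
    residual = [c - e for c, e in zip(centered, estimate)]

    q = sum(e * e for e in residual)                            # sum of squared residuals
    t2 = sum(t * t / lam for t, lam in zip(scores, eigenvalues))  # variance-scaled scores
    return q, t2, residual


# Made-up one-component model on three variables (illustration only).
means = [0.0, 0.0, 0.0]
loadings = [[1.0, 0.0, 0.0]]
eigenvalues = [4.0]

q, t2, residual = q_and_t2([2.0, 1.0, -1.0], means, loadings, eigenvalues)
print(q, t2, residual)
```

Q measures variation the model does not describe, while T^2 measures how far the sample lies from the model center within the model space, so a sample can be unusual in one statistic and unremarkable in the other.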

Plotting Utilities

modlrder - Displays model info for standard model structures.
plotloads - Extract and display loadings information from a model structure.
plotscores - Extract and display score information from a model.
ploteigen - Builds dataset object of eigenvalues/RMSECV information.
ssqtable - Displays variance captured table for model.


(Sub topic of PLS_Toolbox_Topics)