Exploratory Analysis
Latest revision as of 08:27, 11 December 2023
Exploratory analysis methods examine data for trends, correlations, or other relationships. Sometimes the resulting models are used later to flag new data that do not follow the same trend as previous data (for example, principal components analysis in multivariate statistical process control; see Process Control and Statistics) or to predict an amount of material or a property (also discussed in Quantitative Regression Analysis). Often, however, these methods are used simply to learn more about the data.
Main Exploratory Analysis Tools
These functions provide high-level analysis of data. Most have various options and output model structures.
- analysis - Graphical user interface for data analysis.
- pca - Principal components analysis.
- mcr - Multivariate curve resolution with constraints.
- als_sit - Alternating Least Squares with Shift Invariant Trilinearity.
- purity - Self-modeling mixture analysis method based on purity of variables or spectra.
- calccvbias - Calculate the Cross-Validation Bias from a cross-validated model.
- cluster - Cluster analysis with dendrograms using various algorithms.
- corrspec - Correlation spectroscopy maps.
- crossval - Cross-validation for decomposition and linear regression.
- Also See: Multiway Exploratory Analysis
Evolving and Windowed Factor Analysis
These functions provide moving and "evolving" (growing) windowed analysis of data.
- evolvfa - Evolving factor analysis (forward and reverse).
- ewfa - Evolving window factor analysis.
- wtfa - Window target factor analysis.
Other Exploratory Tools
These are data-exploration tools, some of which provide interfaces to analyze the data or other medium-level analysis functionality.
- anglemapper - Classification based on angle measures between signals.
- coda_dw - Calculates values of the Durbin-Watson criterion for the columns of a data set.
- coda_dw_interactive - Interactive version of CODA_DW.
- comparelcms_sim_interactive - Interactive interface for COMPARELCMS.
- estimatefactors - Estimate number of significant factors in multivariate data.
- manrotate - Graphical interface to manually rotate model loadings.
- mlpca - Maximum likelihood principal components analysis.
- trendtool - Univariate trend analysis tool.
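As a rough illustration of the Durbin-Watson criterion mentioned for coda_dw above, the sketch below computes the textbook statistic (sum of squared successive differences divided by the sum of squares) for each column of a small table. It is written in plain Python rather than MATLAB, the function names `durbin_watson` and `durbin_watson_columns` are hypothetical, and it is an assumption that this matches what coda_dw computes internally; coda_dw may apply additional preprocessing.

```python
def durbin_watson(signal):
    """Textbook Durbin-Watson statistic of a 1-D sequence.

    Smooth signals give values near 0; rapidly alternating or
    noise-like signals give larger values.
    """
    denom = sum(v * v for v in signal)
    if denom == 0:
        raise ValueError("signal has zero energy")
    # Sum of squared differences between neighbouring points.
    numer = sum((b - a) ** 2 for a, b in zip(signal, signal[1:]))
    return numer / denom


def durbin_watson_columns(rows):
    """Apply durbin_watson to every column of a row-major table."""
    columns = zip(*rows)
    return [durbin_watson(list(col)) for col in columns]


# Illustrative data: column 0 is a smooth ramp, column 1 alternates in sign.
data = [
    [1.0, 1.0],
    [2.0, -1.0],
    [3.0, 1.0],
    [4.0, -1.0],
    [5.0, 1.0],
]
dw_values = durbin_watson_columns(data)
print(dw_values)
```

Columns with low values would be the candidates for "good quality" variables under this criterion; the smooth ramp scores far lower than the alternating column.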
Application of Models to New Data
In most cases, the function used to create a model (e.g., PCA or PLS) is also used to make predictions with that model; see the documentation for that function for details. In addition, the following utilities may be useful for certain applications.
- modelselector - Create or apply a model selector model.
- compressmodel - Remove references to unused variables from a model.
- matchvars - Align variables of a dataset to allow prediction with a model.
- pcapro - Projects new data on old principal components model.
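Projecting new data onto an existing principal components model, as the pcapro entry above describes, amounts to centering the new sample with the model's mean and taking inner products with the model's loadings. The sketch below shows that idea in plain Python. It does not reproduce the interface of pcapro or of any PLS_Toolbox function; the name `project_onto_pca`, the argument layout, and the example model values are all hypothetical, and the loadings are assumed to be orthonormal.

```python
def project_onto_pca(sample, mean, loadings):
    """Return the scores of one sample on an existing PCA model.

    sample   : list of measured values, one per variable
    mean     : list of calibration means, one per variable
    loadings : list of components, each a list with one weight per
               variable (assumed orthonormal)
    """
    # Center the new sample with the calibration mean, not its own mean.
    centered = [x - m for x, m in zip(sample, mean)]
    # Each score is the inner product of the centered sample and a loading.
    return [sum(c * p for c, p in zip(centered, component))
            for component in loadings]


# Hypothetical two-component model on three variables.
model_mean = [1.0, 1.0, 1.0]
model_loadings = [
    [0.6, 0.8, 0.0],
    [-0.8, 0.6, 0.0],
]
new_sample = [4.0, 5.0, 3.0]

scores = project_onto_pca(new_sample, model_mean, model_loadings)
print(scores)
```

The key design point is that the new data reuse the preprocessing (here, only mean-centering) stored with the old model; re-centering on the new data would shift the scores.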
Model Analysis and Calculation Utilities
Low-level engine and calculation functions.
- knnscoredistance - Calculate the average distance to the k-Nearest Neighbors in score space.
- qconcalc - Calculate Q residuals contributions for predictions on a model.
- residuallimit - Estimates confidence limits for sum squared residuals.
- reviewmodel - Examines a standard model structure for typical problems.
- tconcalc - Calculate Hotelling's T^2 contributions for predictions on a model.
- tsqlim - Confidence limits for Hotelling's T^2.
- varcap - Variance captured for each variable in PCA model.
- varimax - Orthogonal rotation of loadings.
- als - Alternating Least Squares computational engine.
- datahat - Calculates the model estimate and residuals of the data.
- dispmat - Calculates the dispersion matrix of two spectral data sets.
- pcaengine - Principal Components Analysis computational engine.
- tsqmtx - Calculates matrix for T^2 contributions for PCA.
- comparelcms_simengine - Computational engine for comparelcms.
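Several of the utilities above deal with Q residuals and Hotelling's T^2. A minimal plain-Python sketch of the usual definitions follows: T^2 is the sum of squared scores, each divided by the variance (eigenvalue) of its component, and Q is the sum of squared residuals left after reconstructing the sample from its scores. The function names are hypothetical and do not mirror qconcalc, tconcalc, or datahat; treating the Q contributions as the signed per-variable residuals is also an assumption made here for illustration.

```python
def pca_scores(centered, loadings):
    """Scores of an already-centered sample on orthonormal loadings."""
    return [sum(c * p for c, p in zip(centered, comp)) for comp in loadings]


def hotelling_t2(scores, eigenvalues):
    """Sum of squared scores, each scaled by its component variance."""
    return sum(t * t / lam for t, lam in zip(scores, eigenvalues))


def q_contributions(centered, loadings):
    """Per-variable residuals after reconstruction from the scores."""
    scores = pca_scores(centered, loadings)
    estimate = [sum(t * comp[j] for t, comp in zip(scores, loadings))
                for j in range(len(centered))]
    return [c - e for c, e in zip(centered, estimate)]


def q_residual(centered, loadings):
    """Sum of squared residuals (Q) for one sample."""
    return sum(r * r for r in q_contributions(centered, loadings))


# Hypothetical one-component model on three variables.
loadings = [[0.6, 0.8, 0.0]]
eigenvalues = [25.0]           # variance captured by the component
centered_sample = [3.0, 4.0, 2.0]

t2 = hotelling_t2(pca_scores(centered_sample, loadings), eigenvalues)
q = q_residual(centered_sample, loadings)
contributions = q_contributions(centered_sample, loadings)
print(t2, q, contributions)
```

In this example the third variable carries essentially all of the residual, which is the kind of information a contribution plot is meant to expose.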
Plotting Utilities
- modlrder - Displays model info for standard model structures.
- plotloads - Extract and display loadings information from a model structure.
- plotscores - Extract and display score information from a model.
- ploteigen - Builds dataset object of eigenvalues/RMSECV information.
- ssqtable - Displays variance captured table for model.
(Sub topic of PLS_Toolbox_Topics)