Batchmaturity

From Eigenvector Research Documentation Wiki

Revision as of 14:37, 1 October 2012

Purpose

Batch process model and monitoring, identifying outliers.

Synopsis

model = batchmaturity(x,ncomp_pca,options);
model = batchmaturity(x,y,ncomp_pca,options);
model = batchmaturity(x,y,ncomp_pca,ncomp_reg,options);
pred = batchmaturity(x,model,options);
pred = batchmaturity(x,model);

Description

Analyzes multivariate batch process data to quantify the acceptable variability of the process variables during normal processing conditions. The resulting model can be used on new batch process data to identify measurements which indicate abnormal processing behavior. See the pred.inlimits field for this indicator.

Methodology:

Given multivariate X data and a Y variable which represents the corresponding state of batch maturity (BM), build a model by:

  1. Build a PLS model on X and Y using specified preprocessing. Use its self-prediction of Y, ypred, as the indicator of BM.
  2. Simplify the X data by performing PCA (with specified preprocessing). Each sample now has PC scores and a measure of BM (ypred).
  3. Sort the samples in order of increasing BM. For each PC, calculate a running mean of the ordered scores (the "smoothed score") and the deviation of each score from that smoothed mean.
  4. Form a set of equi-spaced BM values over the range (BMstart, BMend). For each BM point, find the n samples whose BM is closest to that value (n is set by the nearestpts option).
  5. For each BM point and each PC, calculate low and high score limit values from the (1-cl)/2 and 1-(1-cl)/2 quantiles of the n score deviations just selected (the 5th and 95th percentiles for cl = 0.90, since cl is defined as 1 minus the expected fraction of outliers). Add the smoothed scores to these limits to get the actual limits for each PC at each BM point. These BM points and corresponding low/high score limits constitute a lookup table for score limits (for each PC) in terms of BM value.
  6. The score limits lookup table contains upper and lower score limits for each PC, for every equi-spaced BM point over the BM range.
  7. The batch maturity model contains the PLS and PCA sub-models and the score limits lookup table. It is applied to a new batch processing dataset, X1, by applying the PLS sub-model to get BM (ypred), then applying the PCA sub-model to get scores. The upper and lower score limits (for each PC) for each sample are obtained by using the sample's BM value and querying the score limits lookup table. A sample is considered to be an inlier if its score values are within the score limits for each PC.
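Steps 3 to 5 can be sketched in base MATLAB for a single PC. This is only an illustration under stated assumptions, not the toolbox's implementation: the data are synthetic, all variable names are hypothetical, a plain running mean stands in for the smoother, the quantiles use a simple nearest-rank rule, and cl is treated as 1 minus the expected outlier fraction (as the cl option defines it), so each tail holds (1-cl)/2.

```matlab
% Illustrative sketch of steps 3-5 for ONE principal component.
% All names and data below are hypothetical stand-ins.
n  = 500;
bm = 100*rand(n,1);                    % batch maturity per sample (e.g. PLS ypred)
t  = sin(bm/20) + 0.1*randn(n,1);      % scores on one PC

cl = 0.90; nearestpts = 25; nsmooth = 25; npts = 101;

% Step 3: sort by BM, smooth the ordered scores, take deviations.
[bms, order] = sort(bm);
ts  = t(order);
k1  = ones(nsmooth,1);
% Running mean; dividing by the local window count avoids edge bias.
tsm = conv(ts, k1, 'same') ./ conv(ones(n,1), k1, 'same');
dev = ts - tsm;

% Step 4: equi-spaced BM grid over the observed BM range.
bmgrid = linspace(min(bms), max(bms), npts);
low    = zeros(1, npts);
high   = zeros(1, npts);
plo    = (1 - cl)/2;                   % tail probability, 0.05 for cl = 0.90
ilo    = max(1, round(plo*nearestpts));                 % nearest-rank index
ihi    = min(nearestpts, round((1 - plo)*nearestpts));

for k = 1:npts
    % Samples whose BM is closest to this grid point.
    [~, idx] = sort(abs(bms - bmgrid(k)));
    near = idx(1:nearestpts);
    % Step 5: empirical quantiles of the deviations, then add back the
    % local smoothed score (here the mean of the smoothed trace nearby).
    d       = sort(dev(near));
    center  = mean(tsm(near));
    low(k)  = center + d(ilo);
    high(k) = center + d(ihi);
end
% bmgrid, low and high now form a lookup table for this PC.
```

Repeating the loop for every PC would give low and high arrays of size nPC x npts, matching the shape described for the score limits lookup table.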

Inputs

  • x = X-block (2-way array class "double" or "dataset").
  • y = Y-block (vector class "double" or "dataset").
  • ncomp_pca = Number of components to be calculated in PCA model (positive integer scalar).
  • ncomp_reg = Number of latent variables for regression method.

Outputs

  • model = standard model structure containing the PCA and Regression model (See MODELSTRUCT).
  • pred = prediction structure containing the scores from the PCA model for the input test data as pred.t.

Model and pred contain the following fields which relate to score limits and whether samples are within normal ranges or not:

  • limits : struct with fields:
      • cl: value used for cl option
      • bm: (1 x bmlookuppts) bm values for score limits
      • low: (nPC x bmlookuppts) lower score limit of inliers
      • high: (nPC x bmlookuppts) upper score limit of inliers
  • inlimits : (nsample x nPC) logical indicating if samples are inliers.
  • t : (nsample x nPC) scores
  • t_reduced : (nsample x nPC) scores scaled by limits, with limits -> +/- 1
  • submodelreg : regression model built to predict bm. Only PLS currently.
  • submodelpca : PCA model used to calculate X-block scores.
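How these fields relate to each other can be illustrated with a small hand-made example in base MATLAB. Everything here is an assumption made for illustration: the numbers are invented, the upper-limit field is called high, the lookup uses linear interpolation, and the scaling used for t_reduced is only one plausible mapping that sends the lower limit to -1 and the upper limit to +1. The toolbox may compute these differently.

```matlab
% Hypothetical limits struct with the documented layout (2 PCs, 5 BM points).
limits.bm   = [0 25 50 75 100];
limits.low  = [-1 -1 -1 -1 -1; -2 -2 -2 -2 -2];   % nPC x bmlookuppts
limits.high = [ 1  1  1  1  1;  2  2  2  2  2];

% Hypothetical new samples: BM values and PCA scores (nsample x nPC).
bmnew = [10; 50; 90];
t     = [0.5 0.0; 1.5 1.0; -0.2 -2.5];

% Look up the limits at each sample's BM value (interpolation is an assumption).
lo = interp1(limits.bm, limits.low',  bmnew);      % nsample x nPC
hi = interp1(limits.bm, limits.high', bmnew);

% Per-PC indicator, and the overall inlier test (within limits on every PC).
inlimits = (t >= lo) & (t <= hi);
inlier   = all(inlimits, 2);

% Assumed scaling: lower limit -> -1, upper limit -> +1.
t_reduced = (2*t - (hi + lo)) ./ (hi - lo);
```

In this example the second sample exceeds the upper limit on PC 1 and the third falls below the lower limit on PC 2, so only the first sample is an inlier.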

Options

options = a structure array with the following fields:

  • regression_method : [ {'pls'} ] A string indicating type of regression method to use. Currently, only 'pls' is supported.
  • preprocessing : { [] } preprocessing structure applied to both the PCA and PLS models. PLS Y-block preprocessing is always autoscale.
  • zerooffsety : [ 0 | {1} ] transform y by resetting it to zero for each batch.
  • stretchy : [ 0 | {1} ] transform y to have a range of 100 for each batch.
  • cl : [ {0.90} ] confidence limit (2-sided) for moving limits, defined as 1 - expected fraction of outliers.
  • nearestpts : [ {25} ] number of nearby scores used to calculate the limits.
  • nsmooth : [ {25} ] number of points (odd) used in savgol smoothing.
  • bmlookuppts : [{1001}] number of equi-spaced points in bm lookup table
  • plots : [ 'none' | 'detailed' | {'final'} ] governs production of plots when model is built. 'final' shows standard scores and loadings plots. 'detailed' gives individual scores plots with limits for all PCs.
  • waitbar : [ 'off' | {'auto'} ] governs display of waitbar when calculating confidence limits ('auto' shows waitbar only when the calculation will take longer than 15 seconds)
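A typical call sequence with non-default options might look like the sketch below. It requires PLS_Toolbox and is not verified against a particular release. The variables x, y, and xnew are placeholders for data already in the workspace, the component counts are arbitrary, and retrieving defaults with batchmaturity('options') is assumed from the usual PLS_Toolbox convention rather than stated on this page.

```matlab
% Requires PLS_Toolbox. x, y and xnew are placeholders for existing data.
options = batchmaturity('options');   % assumed convention for default options
options.cl         = 0.95;            % expect about 5% of normal samples outside limits
options.nearestpts = 51;              % use more neighbours per BM point
options.plots      = 'none';          % suppress plots during model building
options.waitbar    = 'off';

% Build the model: 2 PCA components, 3 PLS latent variables (arbitrary choices).
model = batchmaturity(x, y, 2, 3, options);

% Apply the model to new batch data.
pred = batchmaturity(xnew, model, options);

% Rows of xnew that fall outside the score limits on at least one PC.
outliers = find(~all(pred.inlimits, 2));
```

Raising cl widens the limits, so fewer samples are flagged. Raising nearestpts bases each limit on more neighbouring samples, which should give smoother limits at the cost of local detail.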

See Also

batchfold, batchdigester