Batchdigester

From Eigenvector Research Documentation Wiki
imported>Jeremy
(Importing text file)

Revision as of 14:24, 3 September 2008

Purpose

Parse wafer or batch data into MPCA or Summary PCA form.

Synopsis

[out,options] = batchdigester(data,options);
batchdigester %prompt user for input and output

Description

Rearranges and optionally summarizes a two-way dataset of batch or wafer data. The input data must be a DataSet object containing labels that identify the individual wafers or batches to be split out of the data. Classes in the data are optionally used to split each batch/wafer time profile into steps, which can then be selected for inclusion in the output.

MPCA mode: When the data is rearranged into MPCA form, each wafer/batch becomes one slab of a 3-way matrix. Within a slab, each row is a time point and each column is one of the original variables. Only selected steps are included in the output.
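The rearrangement described for MPCA mode can be illustrated in plain MATLAB. This is only a sketch of the idea for the equal-length ('fixed') case, not the code batchdigester itself runs; the matrix X and the sizes used here are made up for illustration.

```matlab
% Hypothetical two-way data: 3 batches stacked row-wise,
% 4 time points per batch, 2 variables.
nbatches = 3;
ntime    = 4;
nvars    = 2;
X = reshape(1:(nbatches*ntime*nvars), nbatches*ntime, nvars);

% Cut the rows into equal-length blocks, then reorder the dimensions so
% that each slab X3(:,:,k) is time points (rows) by variables (columns).
X3 = permute(reshape(X, ntime, nbatches, nvars), [1 3 2]);

disp(size(X3))      % time x variables x batches
```

Here X3(:,:,2) holds rows 5 through 8 of X, i.e. the second batch.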

Summary PCA mode: When the data is summarized into Summary PCA form, all time points for a given step in a given wafer/batch are summarized using one or more statistics:

  • Mean
  • Standard Deviation
  • Minimum
  • Maximum
  • Range
  • Slope
  • Length (of step)
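As a rough illustration of these statistics, the sketch below computes each one for a single made-up time profile using core MATLAB functions. How batchdigester computes the slope is not stated here; fitting a least-squares line against the sample index is an assumption made for this example.

```matlab
% Hypothetical profile of one variable within one step of one batch.
x = [2 4 6 8 10]';
t = (1:numel(x))';            % assumption: slope is taken against sample index

p = polyfit(t, x, 1);         % least-squares line; p(1) is the slope

stats.mean   = mean(x);
stats.std    = std(x);
stats.min    = min(x);
stats.max    = max(x);
stats.range  = max(x) - min(x);
stats.slope  = p(1);
stats.length = numel(x);      % number of time points in the step
```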

The time profile for each original variable is summarized using the given statistic(s) and turned into a single variable (column) of the output data. If steps are used, this is repeated for each step segment (each creating a new, separate variable in the output). Each wafer/batch is thus a single row of the output data with all of the steps and original variables summarized as new variables.
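The way one batch collapses into a single row can be sketched as follows. The data, the step assignment, and the choice of mean and maximum are invented for the example, and the column ordering shown (statistics grouped within each step) is an assumption; the ordering batchdigester actually produces may differ.

```matlab
% Hypothetical single batch: 5 time points (rows) by 2 variables (columns).
X     = [1 10; 3 20; 5 30; 7 40; 9 50];
steps = [1 1 2 2 2]';         % step class for each time point

row = [];
for s = unique(steps)'
    seg = X(steps == s, :);                     % time points in this step
    row = [row, mean(seg, 1), max(seg, [], 1)]; % one new column per variable per statistic
end

disp(row)
```

With 2 steps, 2 variables, and 2 statistics, the batch becomes one row of 8 summary variables.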

Outputs are the digested data (out) and the options structure (options), which can be used to reproduce the digestion process (see below).

Options

  • options = structure with one or more of the following fields:
  • display : [ 'off' | {'on'} ] governs level of display to command window.
  • object : { 'batch' | 'wafer' } A string specifying the type of object being digested. This is used for display ONLY. The same algorithms are used in both cases but this option allows customization of the wording in the user prompts.
  • stepclassname : A string specifying the name of the class which should be used to indicate steps in the process.
  • stepsdesired : A vector of steps which should be included in the digestion.
  • labelname : A string specifying the name of the label set which should be used to split data into batches/wafers. Use the keyword 'fixed' to specify that the batches are of fixed length and can be split using the nbatches option.
  • nbatches : The number of equally-sized batches to split the data into. Used ONLY when labelname is 'fixed'.
  • digestiontype : [ 'mpca' | 'spca' ] Specifies which digestion algorithm to use on the data.
  • statistics : A cell array specifying the statistics to be used on the data. Used ONLY when digestiontype is 'spca'.
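A possible way to fill in these fields for a Summary PCA digestion of equal-length wafers is sketched below. The field names come from the list above, but the values are illustrative only: in particular, the strings accepted in the statistics cell are not listed on this page, so 'mean' and 'std' are assumed spellings, and data is a placeholder for a DataSet object you have already loaded.

```matlab
options = struct();
options.display       = 'off';
options.object        = 'wafer';          % affects prompt wording only
options.labelname     = 'fixed';          % split by fixed length, not by labels
options.nbatches      = 10;               % hypothetical number of wafers
options.digestiontype = 'spca';
options.statistics    = {'mean' 'std'};   % assumed spellings, check your version

% Only attempt the call when the toolbox function and a DataSet named
% "data" (placeholder) are actually available in this session.
if exist('batchdigester', 'file') && exist('data', 'var')
    [out, options] = batchdigester(data, options);
end
```

Fields left unset, such as stepclassname and stepsdesired here, may still trigger prompts.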

If sufficient information is provided in these options, the data is processed automatically and the user does not have to respond to any prompts in the GUIs. Otherwise, prompts are given only for the missing information. The options needed to re-process data using a given digestion "recipe" are returned as the second output of any digestion request.
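One way to use the returned recipe is sketched below: digest a first data set with partially specified options, answer any prompts once, then apply the returned options to a second data set. The variable names cal_data and new_data are placeholders for DataSet objects and are not part of the toolbox.

```matlab
% Partial options: anything not specified here may be prompted for.
partial = struct('digestiontype', 'mpca');

% Run only when the toolbox function and both placeholder DataSets exist.
if exist('batchdigester', 'file') && exist('cal_data', 'var') && exist('new_data', 'var')
    [cal_out, recipe] = batchdigester(cal_data, partial);  % recipe = completed options
    new_out = batchdigester(new_data, recipe);             % same digestion, new data
end
```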

See Also

mpca, pca