Rhist img and Bspcgui: Difference between pages

From Eigenvector Research Documentation Wiki
(Difference between pages)
imported>Scott
(New page: ===Purpose=== Locates regions and calculates region-size histogram for an image. ===Synopsis=== :[dx,dy,szmap] = rhist_img(im,options) ===Description=== The region-size is defined the ...)
 
imported>Scott
 
===Purpose===
__TOC__


Locates regions and calculates region-size histogram for an image.
=Introduction=
Batch Statistical Process Control (BSPC) is the analysis of process data where the process is subdivided into "batches" (experiments) and may be further subdivided into "Steps" (segments of a batch that mark processing stages or other divisions). BSPC goes by many names: process monitoring, fault detection, anomaly detection, and target detection. Methods generally rely on a model that describes normal and/or desirable operation, and much is often learned from the process of creating that model.


===Synopsis===
=Getting Started=
Models are derived directly from process data, with the goal of summarizing high-dimensional data with a handful of factors that capture the important directions in the data. Success depends heavily on the quantity and quality of the process data.


:[dx,dy,szmap] = rhist_img(im,options)
Raw data is presumed to be in a two-dimensional dataset with variables as columns.


===Description===
[[Image:bspc_data_config.png|200px|Data Configuration]]
The region size is defined as the radius of the largest circle that will fit inside a given space. This function returns the histogram of region sizes for all spaces in an image. The histogram can be calculated for either the signal (positive values) or the voids (zero values) in an image and plotted as a function of radius, diameter, or area.


The algorithm used starts with thresholding the input image, followed by searching the image for the largest region which can be identified. Starting at that size, the regions into which the given radius circle can be inscribed are filled in. The radius is then decreased and the image is searched for the new size circle, repeating the fill-in process. This is continued until the smallest circle size is reached. The fill-in process can also be optionally "dilated" (see options) to help fill non-circular regions and avoid in-filling of edges with small circles.
===Model Types===


====Inputs====
{| class="wikitable" border="1"
* '''im''' = grayscale image (spatial x spatial 2D image).
|+ BSPC Model Types
! Model !! Modes (Dimensions) !! Equal Length Batches !! Steps Aligned !! Data Shape !! Model Comments
|-
| Summary PCA || 2 || No || No || Batch x (Step/Summary) || PCA on summary statistics of variables over time
|-
| [[Batchmaturity|Batch Maturity]] || 2 || No || No || (Batch/Step) x Variable, Can have Y-Block to indicate maturity || PCA with heterogeneous confidence limits
|-
| [[Mpca|MPCA]] || 3 || Yes || Yes || Time (step) x Variable x Batch || Multiway PCA
|-
| [[Parafac|PARAFAC]] || 3 || Yes || Yes || Batch x Variable x Time (step) || Parallel Factor Analysis (multiway)
|-
| Summary PARAFAC || 3 || No || No ||  Batch x Step x Summary || PARAFAC on summary statistics of variables over time
|-
| [[Parafac2|PARAFAC2]] || 3 || No || No ||  Cell Array of Batches || PARAFAC with relaxed multiway structures (only available at PLS_Toolbox command line)
|}
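
The "Data Shape" column can be illustrated with a small folding sketch. The Python below is not PLS_Toolbox code; it is a minimal illustration, assuming the raw data is a list of rows with one batch label per row, of how a two-dimensional (Batch/Step) x Variable table could be folded into the Time x Variable x Batch layout listed for MPCA. The function name <code>fold_batches</code> and its arguments are hypothetical.

```python
def fold_batches(rows, batch_ids):
    """Fold a 2-D (samples x variables) table into Time x Variable x Batch.

    rows      : list of equal-length lists, one per time sample
    batch_ids : batch label for each row (same length as rows)
    Returns (cube, labels) where cube[t][v][b] is variable v at time t in batch b.
    """
    if len(rows) != len(batch_ids):
        raise ValueError("need exactly one batch label per row")

    # Group rows by batch label, keeping batches in first-seen order.
    groups = {}
    for row, label in zip(rows, batch_ids):
        groups.setdefault(label, []).append(list(row))

    # Three-way folding only works when every batch has the same length.
    lengths = {len(g) for g in groups.values()}
    if len(lengths) != 1:
        raise ValueError("batches differ in length; align them first")

    n_time = lengths.pop()
    n_vars = len(rows[0])
    batches = list(groups.values())
    cube = [[[batch[t][v] for batch in batches] for v in range(n_vars)]
            for t in range(n_time)]
    return cube, list(groups.keys())


if __name__ == "__main__":
    data = [[1, 10], [2, 20], [3, 30], [4, 40]]
    cube, labels = fold_batches(data, ["A", "A", "B", "B"])
    print(labels, cube)
```

The length check mirrors the "Equal Length Batches" column: models marked "Yes" there need batches of the same length before the data can be folded this way.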


====Outputs====
See Also: [[batchmaturity|Batch Maturity]], [[mpca|MPCA]], [[MSPC_and_Identification_of_Finite_Impulse_Response_Models|MSPC]], [[parafac|PARAFAC]], [[parafac2|PARAFAC2]]
* '''dy''' = vector of the estimated number of domains found at each domain size specified in dx.
* '''dx''' = vector of the examined domain size radius, in pixels (see "units" option to change to other measures).
* '''szmap''' = a "size map" of the same dimensions of the input image. Each domain is color-coded to indicate its relative size.


===Options===
=Batch Processor Window=


options =  a structure array with the following fields:
The goal of the Batch Processor interface is to make it easier to assemble “batch” data for multivariate analysis. Because different analyses and conditions require different data manipulation, assembling data for batch analysis can be very difficult and [[media:Bspc_diagram_roadmap.png|complicated]].


* '''plots''' : [ 'none' | 'detailed' |{'final'}] governs the plots created by the function. 'final' shows a histogram of domain sizes. 'detailed' includes the histogram, the original image, and a "size image" which shows the size of each region and its spatial location as a color-coded image.
[[Image:BSPCGUI main.png| BSPC GUI]]
* '''units''' : [{'radius'}| 'diameter' | 'area' ] defines the units in which the measured regions are reported in the plotted results.
* '''space''' : [ 'void' | 'signal' ] governs whether the algorithm searches positive 'signal' space, in which domains of interest are indicated by signal, or 'void' space, in which domains of interest are indicated by the lack of signal.
* '''minsize''' : [2] governs the smallest region size to be identified in an image (as defined by the radius of the circle).
* '''stepsize''' : [0.5] interval for histogram resolution (in circle radius units; the inscribing circle is stepped down by this number of units in each iteration).
* '''pthreshold''' : [0.05] the maximum projection onto the circle which will still permit a region to be a "fit" to the circle.
* '''imthreshold''' : [] the intensity threshold used to separate signal and void spaces. If empty, the median of all values is used as the threshold (unless the image is already thresholded into a binary image).
* '''dilatefill''' : [ 'off' |{'on'}] specifies whether the filled spaces should be dilated before the region is marked as accounted for. This allows a slightly better fill of non-circular regions without using additional "small" regions to fill in the edges. However, it also slightly reduces the accuracy of the region count.
* '''buffer''' : [10] the number of pixels to add around the image to avoid spill-over of regions (necessary because of the convolution algorithm used).
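
The interaction of the '''imthreshold''' and '''space''' options can be sketched as follows. This Python is not the rhist_img implementation. It is a minimal illustration that assumes "signal" means values strictly above the threshold and that an image containing only 0 and 1 counts as already thresholded; both are assumptions, and the function name <code>threshold_image</code> is hypothetical.

```python
from statistics import median


def threshold_image(im, imthreshold=None, space="signal"):
    """Return a boolean mask of the pixels that belong to the searched space.

    im          : 2-D image as a list of rows of numbers
    imthreshold : intensity threshold; None means "use the median"
    space       : 'signal' (pixels above the threshold) or 'void' (the rest)
    """
    if space not in ("signal", "void"):
        raise ValueError("space must be 'signal' or 'void'")

    pixels = [p for row in im for p in row]
    if imthreshold is None and set(pixels) <= {0, 1}:
        # Assumption: a 0/1 image is treated as already thresholded.
        mask = [[bool(p) for p in row] for row in im]
    else:
        t = median(pixels) if imthreshold is None else imthreshold
        # Assumption: signal is strictly greater than the threshold.
        mask = [[p > t for p in row] for row in im]

    if space == "void":
        mask = [[not p for p in row] for row in mask]
    return mask


if __name__ == "__main__":
    image = [[1, 5, 9], [2, 6, 7], [3, 4, 8]]
    for row in threshold_image(image):
        print(row)
```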


The workflow of the interface runs from left to right. Loading data and choosing an Analysis Type enables the relevant tabs. Clicking the '''Next''' button opens the next enabled tab. Batches and steps are defined first, then alignment and summary information is added. When finished, the "folded" data can be saved or exported to the [[Analysis GUI|analysis]] interface, and/or a model for folding new data can be saved.


===See Also===
==Start==
Load, append, edit, and/or clear data. Selecting the Analysis Type automatically enables or disables the relevant tabs.


[[morph_img]]
* Dropping data onto the status area will load data. If previously loaded data exists, a prompt for overwrite or augment will appear.
** If augment is chosen, two options are given: augment as a new batch or not. Augmenting as a new batch adds a class for the data being augmented; otherwise a "normal" augment occurs, and if the new dataset has a matching class it is merged.
* Dragging and dropping multiple-selected (Excel) files from the system browser (e.g., Windows Explorer or Finder) will pre-augment the files and create a label indicating file name. This label can be used to identify batches in the '''Batches''' tab.
* Data can be edited in the [[DataSet Editor]] by clicking the '''Edit''' button. Editing will cause the model to be cleared.
 
==Batch==
Indicate the source of batch information in the loaded dataset. Sources can be Class, Label, or Axisscale sets, or a single Variable (column). If batch information is loaded manually, a class is created. If the dataset contains a class with the default name "BSPC Batch", it is selected automatically after loading.
 
* If variable is used, data for that column will be excluded (not deleted) so other mechanisms (preprocessing) can work.
* Once Batches have been identified, one or more batches can be plotted in the lower plot.
 
==Steps==
Steps (subdivisions of batches) can be indicated on the '''Steps''' tab. Steps can be created in the same manner as '''Batches''' or indicated manually.
 
Manual selection is done by selecting a primary variable and batch to align '''to''', then designating '''steps''' for the primary variable/batch. After the steps are set, the [[batchalign]] function is used to "map" the step locations (as a dataset class) onto each batch.
 
===Manually Selecting Steps===
 
[[Image:bspc_manual_select.png|500px|Manual Selection Interface]]
 
To manually select steps:
 
# Select the variable and batch to use from the plot list boxes at the bottom of the interface. These become the variable and batch to which all others are aligned (designated by a "*" next to the list item).
# Click the '''Select''' button and the interface will switch.
# Click the '''Add''' button to place the first step marker.
# Drag this marker to the first step location.
# Repeat until all steps are placed.
# Select a different batch from the list menu to display the "aligned" step positions.
# Adjust the alignment algorithm as needed using the toolbar button.
# Click the check-mark button to finish and save the steps.
 
===Selected Steps Menu===
 
[[Image:bspc_selected_steps.png|300px|]]
 
Once steps have been designated, they appear in the '''Step Selection''' list. If one or more steps should be ignored, they can be deselected in this menu. Selected steps appear in the batch plot as solid green lines and unselected steps appear as red dashed lines.
 
==Align==
 
Methods that require equal-length batches use the tools on the '''Align''' tab, which are provided by the [[batchalign]] function.
 
[[Image:bspc_align_settings.png|Align Settings ]]
 
NOTE: In the image above, the alignment batch is Class 0 (the default) which has no members. This must be changed before alignment will work.
 
# Select the type of alignment.
# Select the Batch and Variable or Load a vector.
# Select COW settings if using COW.
# Click Update Plot to see the results.
 
Alignment Types:
 
* '''Linear''' - Linear interpolation based on selected variable and batch.
* '''COW''' - [[cow|Correlation Optimized Warping]] with Alignment Settings values.
* '''Pad With NaN''' - Infill with NaN to make equal length.
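
Two of the alignment types above can be sketched in a few lines. The Python below is not the [[batchalign]] implementation. It is a minimal illustration, assuming that linear alignment means resampling one batch trajectory onto an evenly spaced grid with the target number of points, and that padding appends NaN to the end of a shorter batch. The function names are hypothetical.

```python
import math


def resample_linear(values, n_out):
    """Stretch or compress a trajectory to n_out points by linear interpolation."""
    n_in = len(values)
    if n_in == 0 or n_out < 1:
        raise ValueError("need at least one input and one output point")
    if n_in == 1 or n_out == 1:
        return [float(values[0])] * n_out

    out = []
    for i in range(n_out):
        # Fractional position of output sample i on the input index axis.
        pos = i * (n_in - 1) / (n_out - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def pad_with_nan(values, n_out):
    """Append NaN so the trajectory has n_out points (assumes end padding)."""
    if len(values) > n_out:
        raise ValueError("trajectory is longer than the target length")
    return [float(v) for v in values] + [math.nan] * (n_out - len(values))


if __name__ == "__main__":
    short_batch = [1, 2, 3]
    print(resample_linear(short_batch, 5))
    print(pad_with_nan(short_batch, 5))
```

Resampling changes the time axis of every variable in the batch, while padding leaves the measured values untouched and relies on the downstream method to handle missing data.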
 
Plots switch to displaying the selected variables and batches, pre-alignment on top and post-alignment on the bottom. Click the '''Update Plots''' button to refresh the plot.
 
==Summarize==
 
The available summary statistics are calculated by the [[summary]] function.
 
[[Image:Bspc_summarize.png|Summary Options]]
 
All stats summarize each column except for:
* '''Length''' - Length of the step; a single number per step.
* '''Five-Number Summary''' - 10th, 25th, 50th, 75th, and 90th percentiles; 5 values per step.
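
How a step is reduced to one row of summary values can be sketched as below. This Python does not reproduce the [[summary]] function. It is a minimal illustration that assumes the sample standard deviation (n-1 denominator) and a least-squares slope against the sample index, and it covers only mean, std, slope, and length. The names <code>slope</code> and <code>summarize_step</code> are hypothetical.

```python
from statistics import mean, stdev


def slope(values):
    """Least-squares slope of values against sample index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def summarize_step(columns):
    """Summarize one step of one batch.

    columns : list of variable columns, each a list of samples in the step
    Returns [mean, std, slope] for each column, followed by the step length.
    """
    row = []
    for col in columns:
        spread = stdev(col) if len(col) > 1 else 0.0
        row.extend([mean(col), spread, slope(col)])
    # Length is a single number per step, not one value per column.
    row.append(len(columns[0]))
    return row


if __name__ == "__main__":
    print(summarize_step([[1, 2, 3], [2, 2, 2]]))
```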
 
For example, with the [[Demonstration_Datasets | Dupont]] demo calibration data (dupont_cal), if you choose mean, std, slope, skewness, and length, the size of your folded Summary PCA data will be:
 
10 variables x 4 stats + length = 41 values per step * 5 steps = 205 columns
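
The same arithmetic can be written as a small helper for other choices of statistics. This is only a sketch of the column count described above; the function name and arguments are hypothetical, and it does not account for the Five-Number Summary.

```python
def folded_summary_columns(n_variables, n_column_stats, n_steps, include_length=True):
    """Number of columns in folded summary data.

    Each per-column statistic contributes one value per variable per step;
    Length contributes a single value per step.
    """
    per_step = n_variables * n_column_stats + (1 if include_length else 0)
    return per_step * n_steps


if __name__ == "__main__":
    # 10 variables, 4 per-column stats (mean, std, slope, skewness), 5 steps
    print(folded_summary_columns(10, 4, 5))
```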
 
==Finish==
 
When completed there are 4 options:
 
* Send data directly to a new [[Analysis]] window.
* Save the data to the workspace.
* Save a model for future data application. NOTE: In some more complicated instances (loading outside information) the model may not be able to fully capture each step taken in the interface.
* Cancel and close the window.

Revision as of 09:59, 3 October 2012

Introduction

Batch Statistical Process Control (BSPC) is the analysis of process data where the process is subdivided into "batches" (experiments) and may be further subdivided into "Steps" (segments of a batch that mark processing stages or other divisions). BSPC goes by many names: process monitoring, fault detection, anomaly detection, and target detection. Methods generally rely on a model that describes normal and/or desirable operation, and much is often learned from the process of creating that model.

Getting Started

Models are derived directly from process data, with the goal of summarizing high-dimensional data with a handful of factors that capture the important directions in the data. Success depends heavily on the quantity and quality of the process data.

Raw data is presumed to be in a two-dimensional dataset with variables as columns.

Data Configuration

Model Types

{| class="wikitable" border="1"
|+ BSPC Model Types
! Model !! Modes (Dimensions) !! Equal Length Batches !! Steps Aligned !! Data Shape !! Model Comments
|-
| Summary PCA || 2 || No || No || Batch x (Step/Summary) || PCA on summary statistics of variables over time
|-
| Batch Maturity || 2 || No || No || (Batch/Step) x Variable, Can have Y-Block to indicate maturity || PCA with heterogeneous confidence limits
|-
| MPCA || 3 || Yes || Yes || Time (step) x Variable x Batch || Multiway PCA
|-
| PARAFAC || 3 || Yes || Yes || Batch x Variable x Time (step) || Parallel Factor Analysis (multiway)
|-
| Summary PARAFAC || 3 || No || No || Batch x Step x Summary || PARAFAC on summary statistics of variables over time
|-
| PARAFAC2 || 3 || No || No || Cell Array of Batches || PARAFAC with relaxed multiway structures (only available at PLS_Toolbox command line)
|}

See Also: Batch Maturity, MPCA, MSPC, PARAFAC, PARAFAC2

Batch Processor Window

The goal of the Batch Processor interface is to make it easier to assemble “batch” data for multivariate analysis. Because different analyses and conditions require different data manipulation, assembling data for batch analysis can be very difficult and complicated.

BSPC GUI

The workflow of the interface runs from left to right. Loading data and choosing an Analysis Type enables the relevant tabs. Clicking the Next button opens the next enabled tab. Batches and steps are defined first, then alignment and summary information is added. When finished, the "folded" data can be saved or exported to the analysis interface, and/or a model for folding new data can be saved.

Start

Load, append, edit, and/or clear data. Selecting the Analysis Type automatically enables or disables the relevant tabs.

  • Dropping data onto the status area will load data. If previously loaded data exists, a prompt for overwrite or augment will appear.
    • If augment is chosen, two options are given: augment as a new batch or not. Augmenting as a new batch adds a class for the data being augmented; otherwise a "normal" augment occurs, and if the new dataset has a matching class it is merged.
  • Dragging and dropping multiple-selected (Excel) files from the system browser (e.g., Windows Explorer or Finder) will pre-augment the files and create a label indicating file name. This label can be used to identify batches in the Batches tab.
  • Data can be edited in the DataSet Editor by clicking the Edit button. Editing will cause the model to be cleared.

Batch

Indicate the source of batch information in the loaded dataset. Sources can be Class, Label, or Axisscale sets, or a single Variable (column). If batch information is loaded manually, a class is created. If the dataset contains a class with the default name "BSPC Batch", it is selected automatically after loading.

  • If variable is used, data for that column will be excluded (not deleted) so other mechanisms (preprocessing) can work.
  • Once Batches have been identified, one or more batches can be plotted in the lower plot.

Steps

Steps (subdivisions of batches) can be indicated on the Steps tab. Steps can be created in the same manner as Batches or indicated manually.

Manual selection is done by selecting a primary variable and batch to align to, then designating steps for the primary variable/batch. After the steps are set, the batchalign function is used to "map" the step locations (as a dataset class) onto each batch.

Manually Selecting Steps

Manual Selection Interface

To manually select steps:

  1. Select the variable and batch to use from the plot list boxes at the bottom of the interface. These become the variable and batch to which all others are aligned (designated by a "*" next to the list item).
  2. Click the Select button and the interface will switch.
  3. Click the Add button to place the first step marker.
  4. Drag this marker to the first step location.
  5. Repeat until all steps are placed.
  6. Select a different batch from the list menu to display the "aligned" step positions.
  7. Adjust the alignment algorithm as needed using the toolbar button.
  8. Click the check-mark button to finish and save the steps.

Selected Steps Menu

Bspc selected steps.png

Once steps have been designated, they appear in the Step Selection list. If one or more steps should be ignored, they can be deselected in this menu. Selected steps appear in the batch plot as solid green lines and unselected steps appear as red dashed lines.

Align

Methods that require equal-length batches use the tools on the Align tab, which are provided by the batchalign function.

Align Settings

NOTE: In the image above, the alignment batch is Class 0 (the default) which has no members. This must be changed before alignment will work.

  1. Select the type of alignment.
  2. Select the Batch and Variable or Load a vector.
  3. Select COW settings if using COW.
  4. Click Update Plot to see the results.

Alignment Types:

  • Linear - Linear interpolation based on selected variable and batch.
  • COW - Correlation Optimized Warping with Alignment Settings values.
  • Pad With NaN - Infill with NaN to make equal length.

Plots switch to displaying the selected variables and batches, pre-alignment on top and post-alignment on the bottom. Click the Update Plots button to refresh the plot.

Summarize

The available summary statistics are calculated by the summary function.

Summary Options

All stats summarize each column except for:

  • Length - Length of the step; a single number per step.
  • Five-Number Summary - 10th, 25th, 50th, 75th, and 90th percentiles; 5 values per step.

For example, with the Dupont demo calibration data (dupont_cal), if you choose mean, std, slope, skewness, and length, the size of your folded Summary PCA data will be:

10 variables x 4 stats + length = 41 values per step * 5 steps = 205 columns

Finish

When completed there are 4 options:

  • Send data directly to a new Analysis window.
  • Save the data to the workspace.
  • Save a model for future data application. NOTE: In some more complicated instances (loading outside information) the model may not be able to fully capture each step taken in the interface.
  • Cancel and close the window.