Hierarchical Model Builder: Difference between revisions

From Eigenvector Research Documentation Wiki
Jump to navigation Jump to search
imported>Jeremy
No edit summary
imported>Jeremy
No edit summary
Line 9: Line 9:
* '''Input and Output Filtering:''' In some cases, prior to applying a model, the input data should be tested for appropriate conditioning (e.g. insufficient signal or an "invalid data" flag variable being set). In others, the output of a model should be tested for being outside appropriate limits before returning the result. Hierarchical models can be used to first test the raw input data for specific conditions or to test the output of a regression, classification, or projection (PCA, PARAFAC, etc.) model.
* '''Input and Output Filtering:''' In some cases, prior to applying a model, the input data should be tested for appropriate conditioning (e.g. insufficient signal or an "invalid data" flag variable being set). In others, the output of a model should be tested for being outside appropriate limits before returning the result. Hierarchical models can be used to first test the raw input data for specific conditions or to test the output of a regression, classification, or projection (PCA, PARAFAC, etc.) model.


===Heirarchical Model Builder Interface===
===Hierarchical Model Builder Interface===


The Hierarchical Model Builder interface comprises a set of toolbar buttons, a set of node tools containing different nodes used to build a model, a model canvas on which the hierarchical model is built, and a Model Cache interface from which models and data can be added to the hierarchical model canvas.
The Hierarchical Model Builder interface comprises a set of toolbar buttons, a set of node tools containing different nodes used to build a model, a model canvas on which the hierarchical model is built, and a Model Cache interface from which models and data can be added to the hierarchical model canvas.
Line 15: Line 15:
[[Image:Hierarchicalmodelbuilder.png||| ]]
[[Image:Hierarchicalmodelbuilder.png||| ]]


===Building a Hierarchical Model===
 
===Introduction to Node Types===
 
Hierarchical models are built from one or more rules (decision nodes) plus end nodes for each condition tested by the rule nodes. Rule nodes decide which of the branches underneath it should be selected. End nodes are at the ends of each branch and define what should be returned as a "prediction" for any sample that arrives at the end of that branch.
 
====Rule Nodes====
 
Rule nodes (also known as decision nodes) can be one of several types depending on what is being tested to decide which branch to follow:
 
* '''Classification Test-Based Rule''' - chooses a branch based on the class predicted by a classification model (PLSDA, KNN, SIMCA, SVMDA) from the input data. The number of branches of a node of this type is defined by the number of classes defined in the classification model. There will be one branch for each class defined, plus one branch which is used if none of the classes are selected. [[Image:Triggercls.png]]
 
* '''Variable Test-Based Rule''' - chooses a branch based on the state of the raw input data (rather than outputs of models.) Each branch is given a test condition that defines: a variable to test, a comparison operator, and a value to compare to. Variables to test can be defined using either a string (in double-quotes, e.g. "Toggle_A") which will select the variable of the raw data that has a matching label or an axisscale (given as a value inside square brackets, e.g. [123.4] ) [[Image:Triggervalue.png]]
 
* '''Regression or Projection Test-Based Rule''' - chooses a branch based on the predictions of the given (non-classification) model from the input data. Any model type that outputs a DataSet object or a standard prediction including, but not limited to regression models (PLS, PCR, MLR, CLS), decomposition models (PCA, MCR, PARAFAC), or other "transformation" models ([[trendtool]]). [[Image:Triggerreg.png]] [[Image:Triggerpro.png]]

Revision as of 17:37, 11 November 2013

Introduction

The Hierarchical Model Builder is used to create Model Selector models which can be used to perform decision tree operations. These models are used for applications like:

  • Local Regression Models: Badly non-linear data, or data which contains separate "domains" may require models which are specific to the each of the different sub-domains in the data (For example, when different solvents or operation conditions each require a specific "local" model to achieve the necessary accuracy.) A hierarchical model can be used to first determine which sub-domain you are in, then apply the appropriate local model.
  • Classification Decision Trees: Complicated classification problems often can not be solved using a single classification model (PLSDA, SVM, SIMCA) but instead require the problem to be broken down into sub-groups with individual models handling the increased detail of the problem. A hierarchical model might start by selecting between general categories, then sub-models (possibly including additional hierarchical models) would sub-divide those general categories using increasingly specific classification models.
  • Input and Output Filtering: In some cases, prior to applying a model, the input data should be tested for appropriate conditioning (e.g. insufficient signal or an "invalid data" flag variable being set). In others, the output of a model should be tested for being outside appropriate limits before returning the result. Hierarchical models can be used to first test the raw input data for specific conditions or to test the output of a regression, classification, or projection (PCA, PARAFAC, etc.) model.

Hierarchical Model Builder Interface

The Hierarchical Model Builder interface comprises a set of toolbar buttons, a set of node tools containing different nodes used to build a model, a model canvas on which the hierarchical model is built, and a Model Cache interface from which models and data can be added to the hierarchical model canvas.

Hierarchicalmodelbuilder.png


Introduction to Node Types

Hierarchical models are built from one or more rules (decision nodes) plus end nodes for each condition tested by the rule nodes. Rule nodes decide which of the branches underneath it should be selected. End nodes are at the ends of each branch and define what should be returned as a "prediction" for any sample that arrives at the end of that branch.

Rule Nodes

Rule nodes (also known as decision nodes) can be one of several types depending on what is being tested to decide which branch to follow:

  • Classification Test-Based Rule - chooses a branch based on the class predicted by a classification model (PLSDA, KNN, SIMCA, SVMDA) from the input data. The number of branches of a node of this type is defined by the number of classes defined in the classification model. There will be one branch for each class defined, plus one branch which is used if none of the classes are selected. Triggercls.png
  • Variable Test-Based Rule - chooses a branch based on the state of the raw input data (rather than outputs of models.) Each branch is given a test condition that defines: a variable to test, a comparison operator, and a value to compare to. Variables to test can be defined using either a string (in double-quotes, e.g. "Toggle_A") which will select the variable of the raw data that has a matching label or an axisscale (given as a value inside square brackets, e.g. [123.4] ) Triggervalue.png
  • Regression or Projection Test-Based Rule - chooses a branch based on the predictions of the given (non-classification) model from the input data. Any model type that outputs a DataSet object or a standard prediction including, but not limited to regression models (PLS, PCR, MLR, CLS), decomposition models (PCA, MCR, PARAFAC), or other "transformation" models (trendtool). Triggerreg.png Triggerpro.png