Modelselector

From Eigenvector Research Documentation Wiki
Revision as of 15:33, 7 October 2008 by imported>Scott (→‎Description)

Purpose

Create or apply a model selector model.

Synopsis

model = modelselector(triggermodel,target_1,target_2,...,target_default);
[target_model,applymodel] = modelselector(data,model)

Description

A Selector Model is a special model type which, when applied to new data, selects between two or more "target" models based on a "trigger" model. It is used to implement discrete local models when a single global model is not sufficient for all possible scenarios.

For example, if a single PCA or PLS model does not perform adequately under all operating conditions, but those conditions can be split into two or more easier-to-model subsets, a selector model can be used to choose between the subset models when applying them to new data.

Selector models consist of a trigger model (trigger) and a set of two or more target models (target_1, target_2, etc.). The trigger can be either a PLSDA model or a set of one or more logical test strings. Each target can be any type of standard model structure, or an empty array [ ] to indicate a null model.

Guidelines and rules for trigger models:

  • (A) A PLSDA trigger model can be created using the PLSDA function. The model should be built with data representative of the sample types to which each target model can be applied. The number of classes separated by the PLSDA model dictates the number of target models which can be selected from. The target models should be in the same order as the numerical class numbers used with PLSDA (e.g. if classes 1, 2 and 3 are used in PLSDA, the target models should be ordered so that target_1 is appropriate if the PLSDA model finds that a sample is class 1, target_2 is for class 2, and target_3 is for class 3).
  • (B) Logical test strings are specified as a trigger model by passing a cell containing one or more strings, each of which performs a logical test on a variable from the data set. Variables are specified using either a label in double quotes (e.g. "flowrate"), or an axisscale value in double quotes and square brackets (e.g. "[1530]"). The variable can be used in any interpretable Matlab expression (including function calls) that returns a logical result. The simplest test involves one of the Matlab logical comparison operators ( < > <= >= == and ~= ) and a value to which the given variable should be compared. For example, the trigger model:
{'"Fe">1100' '"Fe"<500'}

tests whether the variable labeled "Fe" is greater than 1100. If so, target_1 is selected. If not, "Fe" is tested for being less than 500 and, if that test is true, target_2 is selected. If neither test is true, the "default" target model (i.e. target_3) is selected.
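The evaluation order described above can be sketched in plain Matlab. This is only an illustration of the logic, not the toolbox's implementation: the names labels, row and selected, and the sample values, are hypothetical, and the substitution of labels by strrep is an assumption about how a test string could be resolved.

```matlab
% Hypothetical sketch of ordered test evaluation (not the toolbox code).
labels = {'Fe', 'Cr', 'Ni'};          % variable labels (made-up)
row    = [800 17.5 9.2];              % one sample, same order as labels (made-up)
tests  = {'"Fe">1100', '"Fe"<500'};   % trigger test strings from the example

selected = numel(tests) + 1;          % start at the default target
for k = 1:numel(tests)
  expr = tests{k};
  % Replace each "label" with the numeric value of that variable.
  for j = 1:numel(labels)
    expr = strrep(expr, ['"' labels{j} '"'], sprintf('%.15g', row(j)));
  end
  if eval(expr)
    selected = k;                     % the first test that is true wins
    break
  end
end
disp(selected)                        % 3 here: Fe = 800 fails both tests
```

With Fe set to 1200 the loop would stop at the first test and leave selected equal to 1.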

Example 2:

{'"[1745.3]"<=500'}

tests whether variable 1745.3 (on the variable axisscale) is less than or equal to 500. If so, target_1 is selected; otherwise the default target model is selected. If variable 1745.3 does not exist, it is interpolated from the provided data.
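As a rough illustration of the interpolation case, the sketch below looks up an axisscale value that is not present in the data. The text does not say which interpolation method is used, so linear interpolation with interp1 is an assumption here; the axis values, responses and variable names are made up.

```matlab
% Hypothetical sketch: resolve "[1745.3]" against an axisscale.
axisscale = [1740 1742 1744 1746 1748];   % made-up axisscale values
row       = [410 450 480 520 560];        % made-up responses for one sample
target    = 1745.3;                       % value named in the test string

idx = find(axisscale == target, 1);
if isempty(idx)
  % Not on the axisscale: linear interpolation (assumed method).
  value = interp1(axisscale, row, target);
else
  value = row(idx);
end

if value <= 500
  selected = 1;     % target_1
else
  selected = 2;     % default target
end
```

For these made-up numbers the interpolated value lies between 480 and 520 and exceeds 500, so the default target is chosen.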

When creating a selector model, there must be at least as many target models passed as there are classes (when trigger is a PLSDA model) or strings (when trigger is a cell of logical test strings). There may also be an additional target model (i.e. the "default" model) which is used if none of the classes or tests were positive.
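The counting rule can be sketched as follows, using strings as stand-ins for model structures. The variable names and the trigger result are hypothetical, and returning an empty result when no default exists is an assumption, since the text does not describe that case.

```matlab
% Hypothetical sketch of the target-count rule and default fallback.
targets   = {'model_class1', 'model_class2', 'model_class3', 'model_default'};
n_classes = 3;                       % classes (or test strings) in the trigger
class_hit = [false false false];     % made-up trigger result: nothing positive

if numel(targets) < n_classes
  error('Need at least %d target models, got %d.', n_classes, numel(targets));
end
has_default = numel(targets) > n_classes;

k = find(class_hit, 1);              % first positive class or test, if any
if ~isempty(k)
  chosen = targets{k};
elseif has_default
  chosen = targets{n_classes + 1};   % extra target acts as the default
else
  chosen = [];                       % assumed: nothing selected without a default
end
```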

Note that target models may be any standard model structure including another selector model (thus allowing multi-layer selector trees).

To apply a selector model, a single row of new data is passed as a dataset along with the selector model itself. The output is the selected target model (target_model) along with a unique description of the branch(es) taken to select the target model, given as a vector of branch numbers (applymodel). For example, given a multi-layer selector model containing:

selector_model -> target_1 = PCA_model_A1
                  target_2 = Selector_model -> target_1 = PCA_model_B1
                                               target_2 = PCA_model_B2
                  target_3 = PCA_model_A2

a returned value for applymodel of [2 1] implies that the second target model was selected from the first layer of target models, and this model was another selector model. From that second selector model, the first target model (PCA_model_B1) was selected and that is what was returned.
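The way a branch vector such as [2 1] arises can be sketched with nested structures. The struct fields (is_selector, targets) and the choices vector are hypothetical stand-ins for real selector models and trigger outcomes; this is not how the toolbox stores its models.

```matlab
% Hypothetical sketch: walk a two-layer selector tree and record the branch.
inner.is_selector = true;
inner.targets     = {'PCA_model_B1', 'PCA_model_B2'};

outer.is_selector = true;
outer.targets     = {'PCA_model_A1', inner, 'PCA_model_A2'};

choices = [2 1];      % pretend trigger outcome at each layer (made-up)

node   = outer;
branch = [];
layer  = 0;
while isstruct(node) && node.is_selector
  layer = layer + 1;
  k = choices(layer);         % target picked by this layer's trigger
  branch(end+1) = k;          % record the branch number
  node = node.targets{k};     % descend into the selected target
end
% node is now 'PCA_model_B1' and branch is [2 1]
```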

Note that if there are multiple "branches" (trigger models), the data passed to modelselector must contain all the variables needed by every trigger model within the selector model. If some of those variables are not used by a given trigger model, modelselector will automatically discard the unneeded variables before applying that trigger model.
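A minimal sketch of this kind of variable subsetting, assuming variables are matched by label, is shown below. The labels, values and the needed list are made up, and the toolbox may match variables differently.

```matlab
% Hypothetical sketch: keep only the variables one trigger model needs.
labels = {'Fe', 'Cr', 'Ni', 'flowrate'};   % labels in the supplied data (made-up)
row    = [800 17.5 9.2 3.1];               % one sample (made-up values)
needed = {'Fe', 'flowrate'};               % variables used by one trigger (made-up)

[found, pos] = ismember(needed, labels);
if ~all(found)
  error('Missing variables: %s', strjoin(needed(~found), ', '));
end
trigger_row = row(pos);                    % unneeded variables are dropped
```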

See Also

lwrpred, plsda, simca