EVRIModel Objects and Variableselectiongui: Difference between pages

From Eigenvector Research Documentation Wiki
(Difference between pages)
imported>Jeremy
 
imported>Scott
No edit summary
 
==Introduction==
==Introduction==


EVRIModel Objects provide access to the [[Standard Model Structure]] content of all models and provide some easy-to-use methods and properties for building, manipulating, and reviewing models from Matlab's command line, scripts, and functions. In addition, these properties and methods are available from [[Solo_Predictor_Script_Construction|Solo Scripting]] when using [[Solo_Predictor_User_Guide|Solo_Predictor]] and Solo_Server. This page describes the various modes, methods, and properties of EVRIModel objects, here shortened to just "model objects".
The Variable Selection panel contains an interface to several methods for performing variable selection. The goal is to find subsets of variables that improve predictions when compared to using all variables. This interface has several different methods available. Finding the best method and options settings will take some experimentation. Use links below for more information on particular methods.


Model objects have three distinct states:
==Methods==


# '''Empty Models''' - Empty models can be populated with data to analyze, "meta parameters" (model building settings), and other modeling options, then models can be calibrated or built from those settings.
* Automatic (VIP or sRatio)
# '''Calibrated Models''' - Calibrated models contain all the model results and parameters necessary to apply that model to new data. Plots and other information can be obtained from calibrated models.
* GA - Genetic Algorithm
# '''Applied Models''' - When a calibrated model is applied to new data, the result is a prediction or "applied model". This object contains all the results from applying the model to the new data. Plots and other information can be obtained from applied models.
* iPLS - Interval PLS
* rPLS - Recursive PLS
* sRatio - Selectivity Ratio
* VIP - Variable Importance in Projection


==Working with Model Objects in Matlab and Solo Scripting==
==Work Flow==


EVRIModels are standard Matlab objects which are manipulated using the dot notation to access properties and methods. For example, to retrieve the "model type" (modeltype) property from a model, you give the object (a.k.a. variable) name followed by .modeltype. All examples here will assume that the model is stored in a variable named "model".
* <u>Select a Method</u> - Select a method from the drop-down menu. Options for the method will be displayed. If a previous calculation has been done, the results of it will be displayed.  
 
* <u>Adjust Options</u> - By default, a simplified set of options is displayed. If the "Show All Options" checkbox is selected, all available options are displayed. Depending on the options set, a method can take an extended amount of time to complete. For example, decreasing the window width in GA will increase the amount of time it takes to complete. See documentation for more details on optional settings.
<pre>model.modeltype</pre>
* <u>Run Variable Selection</u> - Clicking the "Execute" button runs the current variable selection method with the values specified in the options. A waitbar is displayed to indicate that the method is running. Some methods display a waitbar with a message indicating that it can be closed to cancel execution. NOTE: It can take some time for the method to finish a calculation loop and detect that the user has canceled. If "Show Plots" is checked, any additional plots are displayed in separate windows. This is useful for GA because it shows the progress of the calculation.
 
* <u>View Results</u> - When a calculation is complete the selected variables will be displayed under a plot of the data mean as green bars.
Most object methods can be accessed in the same way:
 
<pre>model.plotscores</pre>
 
Some methods (<tt>.apply</tt> and <tt>.crossvalidate</tt>, for example) also require additional inputs. These are passed in parentheses after the method name:
 
<pre>model.apply(newdata)</pre>
 
===Displaying Contents===
 
At the Matlab command line (but not in Solo Scripting), you can view the contents of a model object by simply typing its name or by using the <tt>.disp</tt> method. The display has two sections:
# By Description (Desc.) : this view shows you a text description of the type of model, how it was built, and a summary of its results.
# By Contents : this view contains the raw field information from the model. Users of previous versions of PLS_Toolbox will recognize this as the previous standard display.
 
At the Matlab command window, you can turn either of these sections on or off by clicking the [on] or [off] hyperlinks in the top display line (shown in blue below):
 
<tt>  PCA Model Object (Desc. ON/<font color="#0000ee"><u>[off]</u></font>  Contents ON/<font color="#0000ee"><u>[off]</u></font>)</tt>
 
 
==Building from Empty Model Object==
 
When an EVRIModel object has been initially created, it contains no data and no results. Many model objects' properties can then be populated with data, meta-parameters, and other settings (options) which can then be used with the <tt>.calibrate</tt> method to build a calibrated model. The <tt>.inputs</tt> property lists the specific properties that can be set for a given model type.
 
:'''NOTE:''' Some model types do NOT support calibration in this manner. Check the <tt>.cancalibrate</tt> property to determine whether a model allows calibration directly or requires a call to the function named in ''modeltype''. In addition, the model clearly shows this state in its display at the command line with a statement to "See _____ function to calibrate." In these cases, the only way to create a calibrated model is to call the named function directly.
 
===Properties===
 
The properties of an uncalibrated model depend on the model type. Typically, a value can be provided for the data to model, plus some number of "meta-parameters" which define aspects of how the model will be built. The list of values available is indicated by the .inputs property defined in the General Model Properties section below.
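As a minimal sketch of populating an empty model, the following checks <tt>.inputs</tt> before assigning values. It assumes that <tt>model</tt> is an empty model object already in the workspace and that <tt>x</tt> and <tt>ncomp</tt> are among the settable properties for its model type; <tt>mydata</tt> is hypothetical example data, and direct assignment of a plain matrix to <tt>.x</tt> is an assumption.

<pre>
% Sketch only: assumes "model" is an empty model object and that
% x and ncomp are listed by .inputs for its model type.
settable = model.inputs;             % cell array of assignable property names
disp(settable)

mydata = rand(20, 5);                % hypothetical data: 20 samples, 5 variables
if ismember('x', settable)
    model.x = mydata;                % data to model
end
if ismember('ncomp', settable)
    model.ncomp = 2;                 % meta-parameter: number of components
end
</pre>

Checking <tt>.inputs</tt> first avoids assigning a property that the current model type does not accept.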
 
&nbsp;
 
===Methods===
 
Both of the methods below return a model object. In Matlab, when no output is requested, the result is stored back into the object on which the method was invoked. In Solo Scripting, these methods require an output variable, usually the same variable that holds the model being built. For example: <tt>m = m.calibrate</tt>
 
{| border="1" cellpadding="5" cellspacing="0" align="left" style="margin-left:3em"
|-
|valign="top" |
<tt>.calibrate</tt>
| Build the model based on the current meta-parameters and options.
|-
|valign="top" |
<tt>.crossvalidate(''cvi'',''ncomp'')</tt>
| Build the model and cross-validate with the supplied conditions. ''cvi'' is the cross-validation splitting as described for cvi in [[crossval]] (default = venetian blinds with square-root of the number of samples as splits). ''ncomp'' is the number of components (default = maximum number available).
|}
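The two methods above might be combined as in the following sketch. It assumes that <tt>model</tt> is an empty model whose data and meta-parameters have already been assigned, and that calling <tt>.crossvalidate</tt> without arguments falls back to the defaults described in the table. To override them, pass ''cvi'' and ''ncomp'' in parentheses, using the ''cvi'' format described in [[crossval]].

<pre>
% Sketch only: assumes "model" is an empty model with its data and
% meta-parameters already assigned.
if model.cancalibrate
    model = model.calibrate;         % explicit output also works in Solo Scripting

    % Cross-validate using the documented defaults for cvi and ncomp.
    model = model.crossvalidate;
else
    disp(['Calibrate using the function named: ' model.modeltype])
end
</pre>

Assigning the result back to <tt>model</tt> keeps the script valid in both Matlab and Solo Scripting.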
 
&nbsp;
 
==Working With Calibrated Models==
 
Once calibrated, a model object contains all the results (relevant to the model type) derived from the modeled data. The object also has all the information necessary to apply that model to new data. For many models, methods exist for plotting parts of the model (scores, loadings, eigenvalues, etc.)
 
===Properties===
 
The following properties are available for most models once they have been calibrated.
 
{| border="1" cellpadding="5" cellspacing="0" align="left" style="margin-left:3em"
|-
|valign="top" |
<tt>datasource</tt>
|
|-
|valign="top" |
<tt>date</tt>
|
 
|-
|valign="top" |
<tt>detail</tt>
|
|-
|valign="top" |
<tt>loadings</tt>
|
|-
|valign="top" |
<tt>ncomp</tt>
|
|-
|valign="top" |
<tt>parent</tt>
|
|-
|valign="top" |
<tt>prediction</tt>
|
|-
|valign="top" |
<tt>q</tt>
|
|-
|valign="top" |
<tt>scores</tt>
|
|-
|valign="top" |
<tt>t2</tt>
|
|-
|valign="top" |
<tt>time</tt>
|
|-
|valign="top" |
<tt>x</tt>
|
|-
|valign="top" |
<tt>xhat</tt>
|
|-
|valign="top" |
<tt>y</tt>
|
|-
|valign="top" |
<tt>yhat</tt>
|
|}
 
:&nbsp;
 
===Methods===
 
The following methods are available when a model has been calibrated.
 
 
{| border="1" cellpadding="5" cellspacing="0" align="left" style="margin-left:3em"
|-
|valign="top" |
<tt>.apply()</tt>       
||
|-
|valign="top" |
<tt>.crossvalidate()</tt>       
||
|-
|valign="top" |
<tt>.ploteigen</tt>   
||
|-
|valign="top" |
<tt>.plotloads</tt>   
||
|-
|valign="top" |
<tt>.plotscores</tt>   
|| We will populate this table once we've got the others done.
|}
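A minimal sketch of applying a calibrated model follows. It assumes that <tt>model</tt> is calibrated, that <tt>newdata</tt> (a hypothetical variable) contains the same variables as the calibration data, and that the model type supports the <tt>scores</tt> property and the <tt>.plotscores</tt> method.

<pre>
% Sketch only: assumes "model" is calibrated and "newdata" (hypothetical)
% has the same variables as the calibration data.
if model.iscalibrated
    pred = model.apply(newdata);     % pred is an "applied model" object
    disp(pred.isprediction)          % expected to be 1 for an applied model
    newscores = pred.scores;         % results for the new samples
    pred.plotscores                  % plot the applied results, if supported
end
</pre>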
 
:&nbsp;
 
 
==General Model Properties and Methods==
 
The following properties and methods are always available in a model independent of the model state or model type:
 
===Informational Properties (Read-Only)===
{| border="1" cellpadding="5" cellspacing="0" align="left" style="margin-left:3em"
|-
|valign="top" |
<tt>.author</tt>
| String describing the author and computer on which this model was created. Usually ''user@computername''. Given a system with assigned usernames and computer names, this is equivalent to an electronic signature on a model.
|-
|valign="top" |
<tt>.content</tt>
| Returns the "raw" model information in a form that is most similar to the model structures from previous versions of PLS_Toolbox and Solo. Generally, users need not access this field directly except to provide a model in a form more similar to old models.
|-
|valign="top" |
<tt>.downgradeinfo</tt>
| Informational string explaining the purpose of the <tt>.content</tt> field.
|-
|valign="top" |
<tt>.evrimodelversion</tt><br>
<tt>.modelversion</tt>
| Returns a string containing the model version description. The model version is almost always linked to the version of PLS_Toolbox or Solo that created the given model. The two field names here are synonymous.
|-
|valign="top" |
<tt>.info</tt>
| Returns (or displays with no outputs) the text description of the model. This is the same description shown at the Matlab command line when the model is viewed with content "on". With an output, the results are returned as a cell array of strings.
|-
|valign="top" |
<tt>.inputs</tt>
| Returns a cell array of strings indicating which properties can be set for the model in its current state. Most often this is used when a model is in an uncalibrated state and this property will indicate what parameters and data fields are available to the user to assign before calibrating the model.
|-
|valign="top" |
<tt>.uniqueid</tt>
| Returns a string which uniquely identifies this model including the author, author's computer, and a date/time stamp. This uniqueid can be used to safely discriminate between different models.
|-
|valign="top" |
<tt>.validmodeltypes</tt>
| Returns a cell array of strings listing the model types which are currently valid for assignment to the <tt>.modeltype</tt> field.
|}
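For example, the informational properties can be used to record where a model came from. The sketch below assumes only that <tt>model</tt> is a model object in the workspace; the label text is illustrative.

<pre>
% Sketch only: print identifying information for a model object.
fprintf('Type:    %s\n', model.modeltype);
fprintf('Author:  %s\n', model.author);
fprintf('Version: %s\n', model.modelversion);
fprintf('ID:      %s\n', model.uniqueid);
</pre>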
 
&nbsp;
 
===Model Status Properties (Read-Only)===
{| border="1" cellpadding="5" cellspacing="0" align="left" style="margin-left:3em"
|-
|valign="top" |
<tt>.cancalibrate</tt>
| Returns (1) if the model contains a model-building definition (see the Empty Model description, above), or (0) if the model does not contain a definition and must be calibrated using the function defined in the modeltype property.
|-
|valign="top" |
<tt>.iscalibrated</tt>
| Returns (1) if the model has been calibrated or applied and (0) if the model is in the "empty" state and has not been calibrated.
|-
|valign="top" |
<tt>.isclassification</tt>
| Returns (1) if the model is a classification model that returns class assignments for unknowns or (0) if it is a decomposition or regression model type.
|-
|valign="top" |
<tt>.isprediction</tt>
| Returns (1) if the model contains a prediction from applying a calibrated model to new data and (0) if the model is just "calibrated" or "empty".
|}
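The status properties can be combined to determine which of the three states a model is in. The helper below is hypothetical (it is not part of PLS_Toolbox or Solo) and is only a sketch of the logic described in the table.

<pre>
function state = modelstate(model)
% MODELSTATE  Hypothetical helper (not part of PLS_Toolbox) that names
% the state of a model object from its status properties.
if model.isprediction
    state = 'applied';       % result of applying a calibrated model
elseif model.iscalibrated
    state = 'calibrated';    % built, but not applied to new data
else
    state = 'empty';         % not yet calibrated
end
end
</pre>

The <tt>.isprediction</tt> test comes first because <tt>.iscalibrated</tt> also returns (1) for applied models.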
 
&nbsp;
 
===Modifiable Properties===
{| border="1" cellpadding="5" cellspacing="0" align="left" style="margin-left:3em"
|-
|valign="top" |
<tt>.modeltype</tt>
| Returns the short "keyword" model type of the current model (or empty if the model type has not been set). This keyword most often is linked to the PLS_Toolbox function that created the given model. Can be assigned (within the limits defined by the <tt>.validmodeltypes</tt> property).
|-
|valign="top" |
<tt>.display</tt>
| String property indicating 'on' if command-line display should be given when calibrating or applying a model and 'off' if no display should be given.
* ''''on'''' : Display command-line output
* ''''off'''' : Do not display any output
|-
|valign="top" |
<tt>.plots</tt>
| String property indicating 'final' if plots should be displayed after calibrating or applying a model and 'none' if no plots should be displayed.
* ''''final'''' : Generate plots (if possible)
* ''''none'''' : Do not generate any plots
|}
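In a batch script it is often useful to suppress all output. The following sketch sets both properties before calibrating; it assumes that <tt>model</tt> is an empty model that is otherwise ready to calibrate.

<pre>
% Sketch only: silence output before calibrating in a batch script.
model.display = 'off';     % no command-line output
model.plots   = 'none';    % no figures
model = model.calibrate;
</pre>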
 
&nbsp;
===General Methods===
{| border="1" cellpadding="5" cellspacing="0" align="left" style="margin-left:3em"
|-
|valign="top" |
<tt>.disp</tt>
| Displays the contents of the model. This method has no output variable; it only displays the information. To access the content as text, see the <tt>.info</tt> property.
|-
|valign="top" |
<tt>.help</tt>
| Alone without any additional sub-indexing, this method brings up the help which is most relevant for the particular model type. With the <tt>.predictions</tt> sub-field, this method returns [[Solo_Predictor_Script_Construction#Common_Return_Properties|a structure array of possible sub-fields]] that may be requested for certain properties of the current model.
|}
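As a short sketch of the difference between displaying and capturing the description, the example below requests an output from <tt>.info</tt> and prints it line by line. It assumes only that <tt>model</tt> is a model object in the workspace.

<pre>
% Sketch only: capture the description as text instead of displaying it.
desc = model.info;               % cell array of strings
for k = 1:numel(desc)
    fprintf('%s\n', desc{k});    % print or write to a log file
end

model.help                       % open the help relevant to this model type
</pre>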
 
 
&nbsp;
==Alphabetical List of General Properties and Methods==
 
===Properties===
 
<pre>
author       
cancalibrate 
content       
datasource   
date         
detail       
display       
downgradeinfo 
evrimodelversion
info         
iscalibrated 
isclassification
isprediction 
loadings     
modeltype     
ncomp         
parent       
plots         
prediction   
q             
scores       
settings     
t2           
time         
uniqueid     
validmodeltypes
x             
xhat         
y             
yhat   
</pre>
 
===Methods===
 
<pre>
apply         
calibrate     
crossvalidate 
help         
ploteigen     
plotloads     
plotscores   
</pre>

Revision as of 14:24, 11 January 2018

Introduction

The Variable Selection panel contains an interface to several methods for performing variable selection. The goal is to find subsets of variables that improve predictions when compared to using all variables. This interface has several different methods available. Finding the best method and options settings will take some experimentation. Use links below for more information on particular methods.

Methods

  • Automatic (VIP or sRatio)
  • GA - Genetic Algorithm
  • iPLS - Interval PLS
  • rPLS - Recursive PLS
  • sRatio - Selectivity Ratio
  • VIP - Variable Importance in Projection

Work Flow

  • Select a Method - Select a method from the drop-down menu. Options for the method will be displayed. If a previous calculation has been done, the results of it will be displayed.
  • Adjust Options - By default, a simplified set of options is displayed. If the "Show All Options" checkbox is selected, all available options are displayed. Depending on the options set, a method can take an extended amount of time to complete. For example, decreasing the window width in GA will increase the amount of time it takes to complete. See documentation for more details on optional settings.
  • Run Variable Selection - Clicking the "Execute" button runs the current variable selection method with the values specified in the options. A waitbar is displayed to indicate that the method is running. Some methods display a waitbar with a message indicating that it can be closed to cancel execution. NOTE: It can take some time for the method to finish a calculation loop and detect that the user has canceled. If "Show Plots" is checked, any additional plots are displayed in separate windows. This is useful for GA because it shows the progress of the calculation.
  • View Results - When a calculation is complete the selected variables will be displayed under a plot of the data mean as green bars.