EVRIModel Objects: Difference between revisions

From Eigenvector Research Documentation Wiki
imported>Jeremy
Revision as of 05:20, 10 September 2012

Introduction

EVRIModel Objects provide access to the Standard Model Structure content of all models and provide some easy-to-use methods and properties for building, manipulating, and reviewing models from Matlab's command line, scripts, and functions. In addition, these properties and methods are available from Solo Scripting when using Solo_Predictor and Solo_Server. This page describes the various modes, methods, and properties of EVRIModel objects, here shortened to just "model objects".

Model objects have three distinct states:

  1. Empty Models - Empty models can be populated with data to analyze, "meta-parameters" (model-building settings), and other modeling options; the model can then be calibrated (built) from those settings.
  2. Calibrated Models - Calibrated models contain all the model results and parameters necessary to apply that model to new data. Plots and other information can be obtained from calibrated models.
  3. Applied Models - When a calibrated model is applied to new data, the result is a prediction or "applied model". This object contains all the results from applying the model to the new data. Plots and other information can be obtained from applied models.

Working with Model Objects in Matlab and Solo Scripting

EVRIModels are standard Matlab objects which are manipulated using the dot notation to access properties and methods. For example, to retrieve the "model type" (modeltype) property from a model, you give the object (a.k.a. variable) name followed by .modeltype. All examples here will assume that the model is stored in a variable named "model".

model.modeltype

Most object methods can be accessed in the same way:

model.plotscores

Some methods (.apply and .crossvalidate, for example) also require additional inputs. These are passed in parentheses after the method name:

model.apply(newdata)

Displaying Contents

At the Matlab command line (but not in Solo Scripting), you can view the contents of a model object by simply typing its name or by using the .disp method. The display offers two views of the model:

  1. By Description (Desc.) : this view shows you a text description of the type of model, how it was built, and a summary of its results.
  2. By Contents : this view contains the raw field information from the model. Users of previous versions of PLS_Toolbox will recognize this as the previous standard display.

At the Matlab command window, you can turn either of these sections on or off by clicking the [on] or [off] hyperlinks in the top display line (shown as underlined blue text below):

PCA Model Object (Desc. ON/[off] Contents ON/[off])

Building from Uncalibrated Model Objects

When a model object has been initially created, it contains no data and no results. Many model objects' properties can then be populated with data, meta-parameters, and other settings (options) which can then be used with the .calibrate method to build a calibrated model. The .inputs property lists the specific properties that can be set for a given model type.

NOTE: Some model types do NOT support calibration in this manner. Use the .cancalibrate property to determine whether a given model allows calibration directly or requires a call to the function named in modeltype. In addition, the model will clearly show this state in its display at the command line with a statement to "See _____ function to calibrate." In these cases, the only way to create a calibrated model is to call the named function directly.
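
As a minimal sketch of the check described in the note above, the following tests .cancalibrate before deciding how to proceed. It uses only the properties documented on this page; the 'pca' model type is taken from the example on this page, and the messages printed are illustrative only.

```matlab
% Sketch: decide how a model must be calibrated, using only documented properties.
model = evrimodel('pca');   % empty (uncalibrated) model object

if model.cancalibrate
    % Direct calibration is supported: list the properties that can be set.
    disp('Properties that can be set before calling .calibrate:');
    disp(model.inputs);     % cell array of strings
else
    % No model-building definition: the named function must be called instead.
    disp(['Calibrate by calling the function: ' model.modeltype]);
end
```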

Example

The following is an example which would build a PCA model from the data stored in the data variable with 3 principal components:

model = evrimodel('pca');
model.x = data;
model.ncomp = 3;
model.calibrate;


Uncalibrated Model Properties

The properties of an uncalibrated model depend on the model type. Typically, a value can be provided for the data to model, plus some number of "meta-parameters" which define aspects of how the model will be built. The list of values available is indicated by the .inputs property. All models which are calibratable allow modification of the .display and .plots properties.

The properties which are generally available for all model types are listed below.

Model Status Properties (Read-Only)

.cancalibrate

Returns (1) if the model contains a model-building definition (see the Empty Models description in the introduction), or (0) if the model does not contain a definition and must be calibrated using the function defined in the modeltype property.

.inputs

Returns a cell array of strings indicating which properties can be set for the model in its current state. Most often this is used when a model is in an uncalibrated state and this property will indicate what parameters and data fields are available to the user to assign before calibrating the model.

.validmodeltypes

Returns a cell array of strings listing the model types which are currently valid for assignment to the .modeltype field.

 

Modifiable Properties

.modeltype

Returns the short "keyword" model type of the current model (or empty string if the model type has not been set). This keyword most often is linked to the PLS_Toolbox function that created the given model. This can be assigned to any model type listed in the .validmodeltypes property.

.display

String property indicating 'on' if command-line display should be given when calibrating or applying a model and 'off' if no display should be given.
  • 'on' : Display command-line output
  • 'off' : Do not display any output

.plots

String property indicating 'final' if plots should be displayed after calibrating or applying a model and 'none' if no plots should be displayed.
  • 'final' : Generate plots (if possible)
  • 'none' : Do not generate any plots
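
For instance, a model can be built silently, for example inside a script or batch job, by switching off both properties before calibrating. This is a sketch only: it assumes a variable named data already holds the data to model, as in the PCA example above.

```matlab
% Sketch: calibrate a PCA model with no command-line output and no plots.
% Assumes "data" already exists in the workspace.
model = evrimodel('pca');
model.display = 'off';    % suppress command-line display
model.plots   = 'none';   % suppress plots after calibration
model.x       = data;
model.ncomp   = 3;
model = model.calibrate;  % explicit output form, also valid in Solo Scripting
```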

 

Uncalibrated Model Methods

Both of the methods below return a model object. In Matlab, when no output is requested, the result is stored back into the object on which the method was invoked. In Solo Scripting, these methods require an output variable, usually the same model object that is being built. For example: m = m.calibrate

.calibrate

Build the model based on the current meta-parameters and options.

.crossvalidate(cvi,ncomp)

Build the model and cross-validate with the supplied conditions. cvi is the cross-validation splitting as described for cvi in crossval (default = venetian blinds with square-root of the number of samples as splits). ncomp is the number of components (default = maximum number available).
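
A hedged sketch of a cross-validation call follows. The cell-array form of cvi shown here (a method keyword followed by the number of splits) is an assumption based on the conventions of the crossval function; consult the crossval page for the forms actually accepted. The data variable is likewise assumed to exist.

```matlab
% Sketch: build and cross-validate a PCA model in one step.
% ASSUMPTION: cvi = {'vet', 5} requests venetian blinds with 5 splits,
% following the cvi conventions described for crossval.
model = evrimodel('pca');
model.x = data;                        % "data" assumed to exist
cvi   = {'vet', 5};                    % assumed cvi form; see crossval
model = model.crossvalidate(cvi, 3);   % cross-validate up to 3 components
```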

 

Working With Calibrated Models

Once calibrated, a model object contains all the results (relevant to the model type) derived from the modeled data. The object also has all the information necessary to apply that model to new data. For many models, methods exist for plotting parts of the model (scores, loadings, eigenvalues, etc.).

Calibrated Model Properties

The following properties are available for most models once they have been calibrated.

.iscalibrated

Returns (1) if the model has been calibrated or applied and (0) if the model is in the "empty" state and has not been calibrated.

.datasource

.date

.detail

.loadings

.ncomp

.parent

.prediction

.q

.scores

.t2

.time

.x

.xhat

.y

.yhat

 

Calibrated Model Methods

The following methods are available when a model has been calibrated.


.apply()

.crossvalidate()

.ploteigen

With no outputs, this method generates a plot of the eigenvalues or other statistics associated with changing the number of components in the model (e.g. RMSEC, misclassification rates) for the given model. With an output, no plot is generated but the DataSet object containing the data that would have been plotted is returned.

.plotloads

With no outputs, this method generates a plot of the loadings (including all variable-specific statistics and results) for the given model. With an output, no plot is generated but the DataSet object containing the loadings is returned.

.plotscores

With no outputs, this method generates a plot of the scores (including all sample-specific statistics and results) for the given model. With an output, no plot is generated but the DataSet object containing the scores is returned.
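
The two calling forms of the plotting methods can be sketched as follows, assuming model is an already calibrated model for which these plots are available. The output variable names are illustrative.

```matlab
% Sketch: plotting methods with and without an output.
model.plotscores;                 % no output: a scores plot is generated

scoresData = model.plotscores;    % with output: DataSet object returned, no plot
loadsData  = model.plotloads;     % DataSet object containing the loadings
eigenData  = model.ploteigen;     % DataSet object with eigenvalues or related statistics
```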

 

Working With Applied Models (Predictions)

When a model is applied to new data, the output is an applied model, also known as a prediction object. The object type itself is still an EVRIModel Object and nearly all of the methods and properties that were available when working with a calibrated model are available with an applied model. The most notable difference is that any plots or sample-specific results extracted from the model will be for the data to which it was applied instead of the calibration data. For example, when a model which calculates scores is applied to new data, the resulting EVRIModel Object will contain a .scores property that is the scores calculated for the new data.

Whether a model object is a calibrated model or a model prediction can be determined by looking at the .isprediction field.

Applied Model Properties

.isprediction

Returns (1) if the model contains a prediction from applying a calibrated model to new data and (0) if the model is just "calibrated" or "empty".
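
As a brief sketch, the following applies a calibrated model to new data and confirms the state of the result. It assumes model is calibrated, that its type calculates scores, and that newdata holds the new samples; pred and newScores are illustrative names.

```matlab
% Sketch: apply a calibrated model and inspect the resulting prediction object.
pred = model.apply(newdata);      % applied model (prediction)

if pred.isprediction
    newScores = pred.scores;      % scores calculated for newdata, not the calibration data
end

% The original object is unchanged: it is calibrated but is not a prediction.
disp(model.iscalibrated);
disp(model.isprediction);
```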

 

General Model Properties and Methods

In addition to the properties and methods described above, the following properties and methods are always available in a model independent of the model state or model type:

Informational Properties (Read-Only)

.author

String describing the author and computer on which this model was created. Usually user@computername. Given a system with assigned usernames and computer names, this is equivalent to an electronic signature on a model.

.content

Returns the "raw" model information in a form that is most similar to the model structures from previous versions of PLS_Toolbox and Solo. Generally, users need not access this field directly except to provide a model in a form more similar to old models.

.downgradeinfo

Informational string explaining the purpose of the .content field.

.evrimodelversion
.modelversion

Returns a string containing the model version description. The model version is almost always linked to the version of PLS_Toolbox or Solo that created the given model. The two field names here are synonymous.

.info

Returns (or displays with no outputs) the text description of the model. This is the same description shown at the Matlab command line when the model is viewed with the description (Desc.) section turned on. With an output, the results are returned as a cell array of strings.

.uniqueid

Returns a string which uniquely identifies this model including the author, author's computer, and a date/time stamp. This uniqueid can be used to safely discriminate between different models.

.validmodeltypes

Returns a cell array of strings listing the model types which are currently valid for assignment to the .modeltype field.
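
A short sketch of reading these informational properties, for example to log which model was used for a prediction. It assumes model is an existing model object; the output formatting is illustrative only.

```matlab
% Sketch: record identifying information about a model.
fprintf('Author:    %s\n', model.author);        % usually user@computername
fprintf('Unique ID: %s\n', model.uniqueid);      % author, computer, and date/time stamp
fprintf('Version:   %s\n', model.modelversion);  % synonymous with .evrimodelversion

desc = model.info;    % with an output: description returned as a cell array of strings
```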

 

Model Status Properties (Read-Only)

.isclassification

Returns (1) if the model is a classification model that returns class assignments for unknowns or (0) if it is a decomposition or regression model type.

 

General Methods

.disp

Displays the contents of the model. This method has no output variable; it only displays the information. For access to the content, see the .info property.

.help

Alone without any additional sub-indexing, this method brings up the help which is most relevant for the particular model type. With the .predictions sub-field, this method returns a structure array of possible sub-fields that may be requested for certain properties of the current model.