Workspace Browser: Data Icons 2 and Model Exporter Interpreter: Difference between pages

From Eigenvector Research Documentation Wiki
(Difference between pages)
imported>Jeremy
No edit summary
 
imported>Jeremy
No edit summary
 
Line 1: Line 1:
The Model_Exporter Interpreter implements several classes which can be used in Microsoft .NET environments to apply Model_Exporter models to new data. The following describes the classes implemented and their use. The classes include the ModelInterpreter, Workspace, and Matrix classes. The source code for these classes is supplied and can be used, compiled, and redistributed without restriction. Note that such redistribution permission does '''not''' permit the user to redistribute the Model_Exporter product itself - only the interpreter.
__TOC__
__TOC__
[[TableOfContents|Table of Contents]] | [[WorkspaceBrowser_DataIcons|Previous]] | [[DataSetEditorWindow_Layout|Next]]


===Manipulating items===
==ModelInterpreter Class==


After you have imported or loaded data items into the Workspace Browser, a variety of options are available for manipulating the item, including:
The primary class in the Model_Exporter C# Interpreter is a '''ModelInterpreter''' object. The object's interface implements the ability to assign input data, specify a model to interpret and apply, and retrieve results from the application. Input data and results are all stored using the Matrix class described below.


{|
===Constructors===


{| border="1" cellpadding="5" cellspacing="0" align="left"
|-
|-
 
|valign="top" |
|
<tt>ModelInterpreter(String filename)</tt>
* [[WorkspaceBrowser_DataIcons_2#"Viewing information about the item"|"Viewing information about the item"]] below.
| This form of the constructor method allows passing a filename of a Model_Exporter XML file which should be read and parsed for application.
 
|-
|-
|valign="top" |
<tt>ModelInterpreter(XmlDocument xdoc)</tt>
| This form of the constructor method allows passing an XmlDocument object which contains the contents of a Model_Exporter XML file. This method may be preferred when the content of an exported model has been previously read and stored in another location, such as a database. In such cases, any of the alternative ways to create an XmlDocument (see the XmlDocument documentation) can be used to pass the parsed XML contents directly to the ModelInterpreter class.
|}


|
===Methods===
* [[WorkspaceBrowser_DataIcons_2#"Opening the item for viewing or editing."|"Opening the item for viewing or editing."]]


{| border="1" cellpadding="5" cellspacing="0"
|-
|-
 
|valign="top" |
|
<tt>apply()</tt>
* [[WorkspaceBrowser_DataIcons_2#Dragging and dropping items|Dragging and dropping items]].
| Applies the model to the ''inputdata'' and stores results in the ''results'' property. This method has no return value itself. An error is thrown if ''inputdata'' has not already been assigned.
 
|}
|}


====Viewing information about the item====
===Properties===
 
You can right-click on an icon and on the context menu that opens, select from options for viewing information about the item, renaming the item, and viewing details about the item. For a data item, options are also available for plotting the imported data, editing the item, and analyzing the data. In addition, if multiple data items are selected, then an option to combine the data items is also available.
 
''Context menu for an item in the Workspace Browser''
 
[[Image:Imported_data_context_menu.png|200x154px]]
 
====Opening the item for viewing or editing====
 
Double-click an icon to open the item for viewing only or for editing.
 
{|


{| border="1" cellpadding="5" cellspacing="0"
|-
|valign="top" colspan="2"|
'''Read-Only Properties'''
|-
|valign="top" |
<tt>inputDataSize</tt>
| (Int32) The number of columns expected for the inputdata matrix.
|-
|valign="top" |
<tt>modeltype</tt>       
| (String) The model type of the model. This is one of the supported model types, such as "PCA", "PLS", "PCR", "CLS", or "PLSDA". The model type often helps identify which variables are of interest in the ''results'' workspace after application of the model.
|-
|valign="top" |
<tt>information</tt>       
| (XmlDocument) The raw, unparsed XML contained in the information section of the model. This may be of interest to help identify where the model came from, what the model is predicting, how many outputs to expect, etc. See the exported model's &lt;information&gt; tag for details included.
|-
|valign="top" |
<tt>results</tt>       
| (Workspace) The workspace object that contains all the results after applying the model. This contains all the variables (see Workspace class below) that were used during model application as well as the variables which hold the results of interest such as ''yhat'', ''T2'', or ''Q''.
|-
|valign="top" colspan="2"|
'''Read/Write Properties'''
|-
|-
 
|valign="top" |
|
<tt>inputdata</tt>
* If the item is not editable, when you double-click the icon for it, an Information dialog box opens. You can view information about the non-editable item in this dialog box.
| (Matrix) Specifies the data to which the model should be applied. Type is ''Matrix'' as defined using the Matrix class described below. This field is not modified by applying the model. Assigning a new value to this field erases the results from a previous model application.
 
|}
|}


''Information dialog box''
===Examples===


[[Image:Information_dialog_box_view_only_item.png|276x231px]]
The following C# example shows use of the ModelInterpreter class along with the Matrix and Workspace classes (described below). It starts by creating a new ModelInterpreter object, loading the model contained in plsexample.xml. It then creates a Matrix object holding a vector of values from 1 to the number of variables expected by the model, stores this matrix in inputdata, and applies the model. Finally, it retrieves the results from that model by listing ALL workspace contents. Alternatively, the commented-out code shows how the typical results of a PLS (or other regression) model would be retrieved specifically.


{|
  ModelInterpreter test = new ModelInterpreter("plsexample.xml");
  //display some of the common model information
  Console.WriteLine("Model type: " + test.modeltype);
  Console.WriteLine("Expected Data Size:" + test.inputDataSize);
  //create data to pass to the model
  Matrix inmatrix = new Matrix(1, test.inputDataSize);
  for (int i = 0; i < inmatrix.NoCols; i++) { inmatrix[0, i] = i + 1; }
  //set data and apply model
  test.inputdata = inmatrix;    //assign input data to object
  test.apply();  //apply model
  //Typical outputs for a PLS or other regression model:
  //Console.WriteLine("yhat = " + test.results.getVar("yhat"));
  //Console.WriteLine("T2 = " + test.results.getVar("T2"));
  //Console.WriteLine("Q = " + test.results.getVar("Q"));
               
  //List ALL contents of workspace     
  foreach (String name in test.results.varList)
  {
    Console.WriteLine(name + " = \n" + test.results.getVar(name));
  }


|-


|
==Workspace Class==
* If the item is a DataSet, when you double-click its icon, the DataSet Editor window opens. You can edit the DataSet (data, row labels, column labels, and so on) as needed in this window. (See [[DataSetEditorWindow_Layout|DataSet Editor Window]].)


|}
The Workspace class serves as a way to store any number of matrices (Matrix objects) each associated with a name. These names and their corresponding matrix are referred to as a "variable". A variable is stored in the workspace using the setVar() method and retrieved using the getVar() method.


''DataSet Editor window''
Note that although variable names can contain upper- and lower-case characters, the case is ignored when setting or retrieving. Thus, a variable set using the name "Pred" can be set or retrieved using "Pred", "pred", or any other combination of upper- and lower-case. Since all of these are considered the same variable, you cannot set two different variables whose names are identical except for case.


[[Image:Dataset_editor_window.png|349x297px]]
===Constructors===
 
{|


{| border="1" cellpadding="5" cellspacing="0"
|-
|-
 
|valign="top" |
|
<tt>Workspace()</tt>
* If the item is not a DataSet, but another type of editable data, when you double-click its icon, the Open Item dialog box opens, asking you how you want to open the item-either as a new dataset or as a raw data (which means editing the data as a simple matrix without adding labels or other DataSet information).
|Creates an instance of the Workspace class.


|}
|}


''Open Item dialog box''
===Methods===


[[Image:Open_Variable_dialog_box.png|314x87px]]
The following methods are defined for the Workspace class. These are the primary means for assigning and retrieving values from a Workspace.
 
Because, in general, any data that is loaded into Solo must ultimately be converted to a DataSet object, you can click Create Dataset to proactively carry out this conversion; otherwise, you can click As Raw Data to edit the item "as is." The same window opens whether you click Create DataSet or As Raw Data; however, as shown in the figure below, if you click Create DataSet, all of the tabs are enabled for the Dataset Editor window while if you click As Raw Data, only the Data tab is open for the Data Editor window.
 
''DataSet Editor window and Data Editor window''
 
[[Image:WorkspaceBrowser_DataIcons_2.09.1.5.jpg|533x223px]]
 
With some exceptions, if you edit a data item, you must explicitly request to overwrite the data item in the Workspace Browser with the changes. To save changes to a data item:
 
{|


{| border="1" cellpadding="5" cellspacing="0"
|-
|valign="top" |
<tt>setVar(String name, Matrix value)</tt>
|Sets the variable specified by ''name'' with the matrix ''value''. If ''name'' already exists in the workspace, the previous value is overwritten without any warning. This method returns nothing. ''void setVar(String name, Matrix value)''
|-
|valign="top" |
<tt>setVar(Workspace toadd)</tt>
|Copies all variables in the ''toadd'' workspace into the workspace being addressed by the method. Any variables in the current workspace which have the same name as a variable in the ''toadd'' workspace are overwritten without warning. This method returns nothing. ''void setVar(Workspace toadd)''
|-
|valign="top" |
<tt>getVar(String name)</tt>
|(Matrix) Retrieves the specified variable ''name'' from the workspace. Returned value is a Matrix object. ''Matrix getVar(String name)''
|-
|valign="top" |
<tt>isSet(String name)</tt>
|(Boolean) Returns boolean ''TRUE'' if the given variable ''name'' is currently set in the workspace. ''Boolean isSet(String name)''   
|-
|-
 
|valign="top" |
|1.
<tt>clearAll()</tt>
 
|Clears all values from the workspace. This method returns nothing. ''void clearAll()'' 
|In the appropriate Editor window, edit the item as needed, and then on the Editor window menu, click File &gt; Save.
 
|}
|}


:The Save dialog box opens. The Variable name field is automatically populated with the name of the data item.
===Properties===
 
''Save dialog box''
 
[[Image:Save_dialogbox_ToFileshowing.png|291x206px]]
 
{|


{| border="1" cellpadding="5" cellspacing="0"
|-
|-
 
|valign="top" |
|2.
<tt>varList</tt>
 
|(List<String>) Returns the alphabetically sorted list of names for all variables currently set in the Workspace as a List<String> type. These names can be used with the getVar method to retrieve the values.
|Do one of the following:
 
|}
|}


:
{|
|-
|
* Click Save to override the selected data item in the Workspace Browser with the modified data item.


|}
===Examples===


:
The following example creates a new empty Workspace called "ws" and a new Matrix (in this case, a vector with 1 row and 5 columns) called "myvalue", then stores myvalue in ws under the variable name "x".


{|
  Workspace ws = new Workspace();    //create new workspace
  Matrix myvalue = new Matrix(1,5);  //create all-zeros matrix "myvalue" as example
  ws.setVar("x",myvalue);            //assign variable "x" the value myvalue in workspace


|-
Example of retrieving a variable from a workspace "ws"


|
  Matrix retrieved;                 //declare the variable that will hold the result
  if (ws.isSet("y"))                //is the variable "y" assigned?
* In the Variable field, enter a new name for the data item, and then click Save to save the modified data item as a new item in the Workspace Browser.
  {
 
    retrieved = ws.getVar("y");      //get value assigned in y
|}
  }
  else
  {
    retrieved = new Matrix(0,0);    //no "y" was set? use empty matrix instead
  }


====Dragging and dropping items====
==Matrix Class==


{|
The Matrix class is defined through a public-domain class implemented in the cMatrixLib. This class allows storing and retrieval of data in a simple two-dimensional matrix as well as performing various matrix calculations on that data. For the purposes of the Model_Exporter Interpreter, the main use is to provide a convenient method to exchange data between the client and the interpreter. Thus, the details of the class are not given here, but only the main creation and access information. This class is used for the ''inputdata'' property and the variables stored in the ''results'' Workspace of the '''ModelInterpreter'''.


|-
===Constructors===


|
A new Matrix object is constructed either by specifying the size (in rows and columns) of the new Matrix, or by passing a two-dimensional array of Double to the constructor:
* You can drag a data icon to a shortcut icon to open the Analysis window and analyze the data.


{| border="1" cellpadding="5" cellspacing="0"
|-
|-
 
|valign="top" |
|
<tt>Matrix(int rows, int columns)</tt>
* You can drag a model icon to an shortcut icon to open the Analysis window to load a model, and optionally, apply it to new data.
|Create a Matrix with the specified number of rows and columns. Matrix is initialized to all zeros. Note that rows = 0 and columns = 0 can be used to create an "empty" matrix. <tt>Matrix mymatrix = new Matrix(int rows, int columns)</tt>
 
|-
|-
 
|valign="top" |
|
<tt>Matrix(Double[,] data)</tt>
* If the size of data items matches in at least one dimension, (either the same number of rows or the same number of columns), or if data items are identical in size, you can drag a data icon onto another data icon or onto an open Editor window to combine these two data items and create a single data item. You can repeat this step as many times as needed to combine all of the necessary data items.
|Create a Matrix from the Double array ''data'' using the sizes of that array. <tt>Matrix mymatrix = new Matrix(Double[,] data)</tt>
 
|}
|}


Note: You cannot join data items that do not match in at least one dimension.
===Methods===
Changing and accessing values in a matrix can be accomplished using standard (zero-based) indexing.


Consider the following:
  mymatrix[0,0] = value;  //replace element 0,0 with specified Double "value"
  value = mymatrix[0,0];  //retrieve value at element 0,0


{|
===Properties===
To determine the size of a given matrix, use the ''NoRows'' and ''NoCols'' properties:


{| border="1" cellpadding="5" cellspacing="0"
|-
|-
 
|valign="top" |
|
<tt>NoRows</tt>
* DataSet item: A, 300 rows x 20 cols
|(int) Returns the total number of rows the matrix object is currently dimensioned to contain.
 
|-
|-
 
|valign="top" |
|
<tt>NoCols</tt>
* DataSet item: B, 200 rows x 20 cols
|(int) Returns the total number of columns the matrix object is currently dimensioned to contain.
 
|-
 
|
* DataSet item: C, 300 rows x 1 col
 
|-
 
|
* DataSet item: A_copy, 300 rows x 20 cols
 
|}
|}


You can join A with B because these DataSets have the same number of columns, or you can join A with C because these DataSets have the same number of rows. For example, when you join A with B, you are given two options:
===Examples===
 
''Overwrite existing data dialog box''


[[Image:Overwrite_Existing_Data_dialogbox.png|302x87px]]
Below are some general examples of using the constructors, methods, and properties specified above. Other methods exist which are not indicated in this documentation. See also the examples given for the Workspace class above.


You can overwrite A with the B data, or you can add the B data to A. In this case, the data is automatically joined as additional rows, and a 500 row x 20 column dataset is created. Similarly, if you join A with C, you can overwrite A with the C data, or you can add the C data to A. In this case, the data is automatically joined as additional columns, and a 300 row x 21 column dataset is created.
Create a matrix and populate it with some numbers:


You can also join A with A_copy because these two data items are identical. You are again given two options for joining the data:
  Matrix mat = new Matrix(3,5);
  for (int row=0; row<3; row++)
  {
    for (int col=0; col<5; col++)
    {
      mat[row,col] = row*10 + col;  //assigns values to indicated row/column
    }
  }


''Overwrite existing data dialog box''
Retrieve value from row 2 column 4:


[[Image:Overwrite_Existing_Data_dialogbox.png|302x87px]]
  Double value = mat[1,3];    //remember this is zero indexed


You can overwrite A with the A_copy data, or you can add the A_copy data to A. If you choose to add the A_copy data to A, you have three options for joining the data:
Display all elements of the first row of a matrix on the console:


''Augment data dialog box''
  int ncols = mat.NoCols;
 
  for (int col=0; col<ncols; col++) Console.Write(mat[0,col] + "  ");
[[Image:Augment_Data_dialog_box.png|254x87px]]
 
{|
 
|-
 
|
* You can join the data by rows. In this case, a 600 row x 20 column DataSet is created. The 300 new rows are considered as new samples.
 
|-
 
|
* You can join the data by columns. In this case, a 300 row x 40 column DataSet is created. The 20 new columns are considered as new variables for the same samples.
 
|-
 
|
* You can join the data as slabs. In this case, one DataSet is essentially placed behind the other to create a 300 column x 20 row x 2 DataSet. (You typically join data as slabs if the data is to be used in multi-way data analysis methods.)
 
|}

Revision as of 16:02, 29 March 2011

The Model_Exporter Interpreter implements several classes which can be used in Microsoft .NET environments to apply Model_Exporter models to new data. The following describes the classes implemented and their use. The classes include the ModelInterpreter, Workspace, and Matrix classes. The source code for these classes is supplied and can be used, compiled, and redistributed without restriction. Note that such redistribution permission does not permit the user to redistribute the Model_Exporter product itself - only the interpreter.

ModelInterpreter Class

The primary class in the Model_Exporter C# Interpreter is a ModelInterpreter object. The object's interface implements the ability to assign input data, specify a model to interpret and apply, and retrieve results from the application. Input data and results are all stored using the Matrix class described below.

Constructors

ModelInterpreter(String filename)

This form of the constructor method allows passing a filename of a Model_Exporter XML file which should be read and parsed for application.

ModelInterpreter(XmlDocument xdoc)

This form of the constructor method allows passing an XmlDocument object which contains the contents of a Model_Exporter XML file. This method may be preferred when the content of an exported model has been previously read and stored in another location, such as a database. In such cases, any of the alternative ways to create an XmlDocument (see the XmlDocument documentation) can be used to pass the parsed XML contents directly to the ModelInterpreter class.
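As a minimal sketch of the XmlDocument form, the snippet below loads exported model XML from a string and hands the parsed document to the constructor. It assumes the interpreter source files are compiled into the same project. The helper FetchModelXml is hypothetical: it stands in for whatever storage (for example, a database) holds the XML text, and here it simply reads plsexample.xml from disk.

```csharp
using System;
using System.IO;
using System.Xml;

class XmlDocumentConstructorSketch
{
    // Hypothetical helper (not part of the interpreter): returns the exported
    // model XML as text. A real application might query a database instead.
    static string FetchModelXml()
    {
        return File.ReadAllText("plsexample.xml");
    }

    static void Main()
    {
        // Parse the XML text into an XmlDocument using the standard .NET API
        XmlDocument xdoc = new XmlDocument();
        xdoc.LoadXml(FetchModelXml());

        // Pass the parsed document directly to the interpreter
        ModelInterpreter model = new ModelInterpreter(xdoc);
        Console.WriteLine("Model type: " + model.modeltype);
    }
}
```

Passing an XmlDocument avoids writing the model to a temporary file only so that the filename constructor can read it back.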

Methods

apply()

Applies the model to the inputdata and stores results in the results property. This method has no return value itself. An error is thrown if inputdata has not already been assigned.
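The following sketch illustrates the error case described above by calling apply() before inputdata is assigned. It assumes the interpreter source files are compiled into the same project. The specific exception type is not documented here, so the sketch catches the base Exception class; that choice is an assumption, not a statement about what the interpreter throws.

```csharp
using System;

class ApplyGuardSketch
{
    static void Main()
    {
        ModelInterpreter model = new ModelInterpreter("plsexample.xml");

        try
        {
            // inputdata has not been assigned, so apply() is documented to throw
            model.apply();
        }
        catch (Exception ex)
        {
            // Exception type is undocumented; the base class is caught as a precaution
            Console.WriteLine("apply() failed: " + ex.Message);
        }

        // Assign input data of the expected width, then apply again
        model.inputdata = new Matrix(1, model.inputDataSize);
        model.apply();
    }
}
```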

Properties

Read-Only Properties

inputDataSize

(Int32) The number of columns expected for the inputdata matrix.

modeltype

(String) The model type of the model. This is one of the supported model types, such as "PCA", "PLS", "PCR", "CLS", or "PLSDA". The model type often helps identify which variables are of interest in the results workspace after application of the model.

information

(XmlDocument) The raw, unparsed XML contained in the information section of the model. This may be of interest to help identify where the model came from, what the model is predicting, how many outputs to expect, etc. See the exported model's <information> tag for details included.

results

(Workspace) The workspace object that contains all the results after applying the model. This contains all the variables (see Workspace class below) that were used during model application as well as the variables which hold the results of interest such as yhat, T2, or Q.

Read/Write Properties

inputdata

(Matrix) Specifies the data to which the model should be applied. Type is Matrix as defined using the Matrix class described below. This field is not modified by applying the model. Assigning a new value to this field erases the results from a previous model application.

Examples

The following C# example shows use of the ModelInterpreter class along with the Matrix and Workspace classes (described below). It starts by creating a new ModelInterpreter object, loading the model contained in plsexample.xml. It then creates a Matrix object holding a vector of values from 1 to the number of variables expected by the model, stores this matrix in inputdata, and applies the model. Finally, it retrieves the results from that model by listing ALL workspace contents. Alternatively, the commented-out code shows how the typical results of a PLS (or other regression) model would be retrieved specifically.

 ModelInterpreter test = new ModelInterpreter("plsexample.xml");

 //display some of the common model information
 Console.WriteLine("Model type: " + test.modeltype);
 Console.WriteLine("Expected Data Size:" + test.inputDataSize);

 //create data to pass to the model
 Matrix inmatrix = new Matrix(1, test.inputDataSize);
 for (int i = 0; i < inmatrix.NoCols; i++) { inmatrix[0, i] = i + 1; }

 //set data and apply model
 test.inputdata = inmatrix;     //assign input data to object
 test.apply();  //apply model

 //Typical outputs for a PLS or other regression model:
 //Console.WriteLine("yhat = " + test.results.getVar("yhat"));
 //Console.WriteLine("T2 = " + test.results.getVar("T2"));
 //Console.WriteLine("Q = " + test.results.getVar("Q"));
               
 //List ALL contents of workspace      
 foreach (String name in test.results.varList)
 {
   Console.WriteLine(name + " = \n" + test.results.getVar(name));
 }


Workspace Class

The Workspace class serves as a way to store any number of matrices (Matrix objects) each associated with a name. These names and their corresponding matrix are referred to as a "variable". A variable is stored in the workspace using the setVar() method and retrieved using the getVar() method.

Note that although variable names can contain upper- and lower-case characters, the case is ignored when setting or retrieving. Thus, a variable set using the name "Pred" can be set or retrieved using "Pred", "pred", or any other combination of upper- and lower-case. Since all of these are considered the same variable, you cannot set two different variables whose names are identical except for case.
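The case-insensitive behavior described above can be sketched as follows, using only the constructor, setVar, isSet, and getVar members documented on this page. It assumes the interpreter source files are compiled into the same project; the class name and the matrix sizes are chosen only for illustration.

```csharp
using System;

class WorkspaceCaseSketch
{
    static void Main()
    {
        Workspace ws = new Workspace();

        // Store a 1 x 3 matrix under the name "Pred"
        ws.setVar("Pred", new Matrix(1, 3));

        // Per the note above, a lookup with different case refers to the same variable
        Console.WriteLine("isSet(\"pred\"): " + ws.isSet("pred"));

        // Setting "PRED" addresses that same variable, so the 1 x 3 matrix is overwritten
        ws.setVar("PRED", new Matrix(2, 2));

        // Retrieve with yet another case combination and report the stored size
        Matrix current = ws.getVar("pReD");
        Console.WriteLine("Size now: " + current.NoRows + " x " + current.NoCols);
    }
}
```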

Constructors

Workspace()

Creates an instance of the Workspace class.

Methods

The following methods are defined for the Workspace class. These are the primary means for assigning and retrieving values from a Workspace.

setVar(String name, Matrix value)

Sets the variable specified by name with the matrix value. If name already exists in the workspace, the previous value is overwritten without any warning. This method returns nothing. void setVar(String name, Matrix value)

setVar(Workspace toadd)

Copies all variables in the toadd workspace into the workspace being addressed by the method. Any variables in the current workspace which have the same name as a variable in the toadd workspace are overwritten without warning. This method returns nothing. void setVar(Workspace toadd)

getVar(String name)

(Matrix) Retrieves the specified variable name from the workspace. Returned value is a Matrix object. Matrix getVar(String name)

isSet(String name)

(Boolean) Returns boolean TRUE if the given variable name is currently set in the workspace. Boolean isSet(String name)

clearAll()

Clears all values from the workspace. This method returns nothing. void clearAll()

Properties

varList

(List<String>) Returns the alphabetically sorted list of names for all variables currently set in the Workspace as a List<String> type. These names can be used with the getVar method to retrieve the values.
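As a brief sketch of how setVar(Workspace toadd) and varList fit together, the snippet below merges one workspace into another and then lists the sizes of every variable in the merged result. It assumes the interpreter source files are compiled into the same project; the variable names "x", "y", and "shared" are illustrative only.

```csharp
using System;

class WorkspaceMergeSketch
{
    static void Main()
    {
        // Target workspace with two variables
        Workspace target = new Workspace();
        target.setVar("x", new Matrix(1, 5));
        target.setVar("shared", new Matrix(1, 1));

        // Second workspace; "shared" collides with a name in the target
        Workspace extra = new Workspace();
        extra.setVar("y", new Matrix(2, 2));
        extra.setVar("shared", new Matrix(3, 3));

        // Copy every variable from extra into target; the colliding
        // "shared" entry is overwritten without warning, as documented
        target.setVar(extra);

        // Walk the sorted list of names and report each variable's size
        foreach (String name in target.varList)
        {
            Matrix m = target.getVar(name);
            Console.WriteLine(name + ": " + m.NoRows + " x " + m.NoCols);
        }
    }
}
```

Because overwriting is silent, checking isSet for names of interest before merging is one way to detect collisions.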


Examples

The following example creates a new empty Workspace called "ws" and a new Matrix (in this case, a vector with 1 row and 5 columns) called "myvalue", then stores myvalue in ws under the variable name "x".

 Workspace ws = new Workspace();    //create new workspace
 Matrix myvalue = new Matrix(1,5);  //create all-zeros matrix "myvalue" as example
 ws.setVar("x",myvalue);            //assign variable "x" the value myvalue in workspace

Example of retrieving a variable from a workspace "ws"

 Matrix retrieved;                  //declare the variable that will hold the result
 if (ws.isSet("y"))                 //is the variable "y" assigned?
 {
   retrieved = ws.getVar("y");      //get value assigned in y
 }
 else
 {
   retrieved = new Matrix(0,0);     //no "y" was set? use empty matrix instead
 }

Matrix Class

The Matrix class is defined through a public-domain class implemented in the cMatrixLib. This class allows storing and retrieval of data in a simple two-dimensional matrix as well as performing various matrix calculations on that data. For the purposes of the Model_Exporter Interpreter, the main use is to provide a convenient method to exchange data between the client and the interpreter. Thus, the details of the class are not given here, but only the main creation and access information. This class is used for the inputdata property and the variables stored in the results Workspace of the ModelInterpreter.

Constructors

A new Matrix object is constructed either by specifying the size (in rows and columns) of the new Matrix, or by passing a two-dimensional array of Double to the constructor:

Matrix(int rows, int columns)

Create a Matrix with the specified number of rows and columns. Matrix is initialized to all zeros. Note that rows = 0 and columns = 0 can be used to create an "empty" matrix. Matrix mymatrix = new Matrix(int rows, int columns)

Matrix(Double[,] data)

Create a Matrix from the Double array data using the sizes of that array. Matrix mymatrix = new Matrix(Double[,] data)
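Both constructor forms can be sketched as follows. The snippet assumes the interpreter source files (including the Matrix class) are compiled into the same project; the array contents are arbitrary illustration values.

```csharp
using System;

class MatrixConstructorSketch
{
    static void Main()
    {
        // A 2-row, 3-column array of Double; the Matrix takes its size from the array
        Double[,] data = { { 1.0, 2.0, 3.0 }, { 4.0, 5.0, 6.0 } };
        Matrix fromArray = new Matrix(data);
        Console.WriteLine("From array: " + fromArray.NoRows + " x " + fromArray.NoCols);

        // Size-based form: all elements start at zero
        Matrix zeros = new Matrix(2, 3);
        Console.WriteLine("zeros[0,0] = " + zeros[0, 0]);

        // An "empty" matrix, as described above
        Matrix empty = new Matrix(0, 0);
        Console.WriteLine("Empty: " + empty.NoRows + " x " + empty.NoCols);
    }
}
```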

Methods

Changing and accessing values in a matrix can be accomplished using standard (zero-based) indexing.

 mymatrix[0,0] = value;   //replace element 0,0 with specified Double "value"
 value = mymatrix[0,0];   //retrieve value at element 0,0

Properties

To determine the size of a given matrix, use the NoRows and NoCols properties:

NoRows

(int) Returns the total number of rows the matrix object is currently dimensioned to contain.

NoCols

(int) Returns the total number of columns the matrix object is currently dimensioned to contain.

Examples

Below are some general examples of using the constructors, methods, and properties specified above. Other methods exist which are not indicated in this documentation. See also the examples given for the Workspace class above.

Create a matrix and populate it with some numbers:

 Matrix mat = new Matrix(3,5);
 for (int row=0; row<3; row++)
 {
   for (int col=0; col<5; col++)
   {
     mat[row,col] = row*10 + col;   //assigns values to indicated row/column
   }
 }

Retrieve value from row 2 column 4:

 Double value = mat[1,3];     //remember this is zero indexed

Display all elements of the first row of a matrix on the console:

 int ncols = mat.NoCols;
 for (int col=0; col<ncols; col++) Console.Write(mat[0,col] + "  ");