Solo Predictor Reference Manual: Difference between revisions

From Eigenvector Research Documentation Wiki
imported>Jeremy
Revision as of 20:19, 1 March 2011

Introduction

Solo_Predictor, from Eigenvector Research, Inc. (EVRI), is a stand-alone model application engine that applies models created by PLS_Toolbox or Solo. Solo_Predictor features a simple and flexible scripting language, a platform- and operating-system-independent interface, and an inherent distributed-computation design.

This documentation describes the setup and use of Solo_Predictor and explains the script language used to issue commands.

System Requirements

Solo_Predictor requires the following:

  • Operating system:
    • Windows 2000, XP, 2003 Server, or Vista
    • Mac OS X (Intel only)
    • Linux (Intel only)
  • 32- or 64-bit processor
  • 200 MB disk space (32-bit) / 400 MB disk space (64-bit)
  • Minimum recommended RAM: 100 MB (32-bit) / 200 MB (64-bit)

Features and Supported Methods

Solo_Predictor is a prediction engine which supports importing of data and models from an external source, application of those models to the data, and retrieval of the values from the prediction. It supports predictions for all methods which produce standard model structures in PLS_Toolbox and Solo. This includes all methods in the Analysis GUI (including, but not limited to, PCA, PARAFAC, MCR, Purity, PLS, PCR, MLR, PLSDA, SIMCA), Calibration Transfer GUI, and any other PLS_Toolbox command-line functions which produce standard model structures.

Solo_Predictor also supports:

  • All preprocessing methods available in the custom Preprocessing GUI.
  • Missing data replacement (where supported by the model type)
  • Variable pre-alignment to model (handles resampling, extra variables, missing variables)
  • Importing all data types supported by the Analysis GUI and Workspace Browser including, but not limited to:
    • Comma-, tab-, space-, and other delimited text files (.csv, .dat)
    • X,Y… delimited files (.xy)
    • Excel spreadsheets (.xls, .xlst)
    • XML (Eigenvector XML data format) (.xml)
  • Importing vendor formats (the list may be incomplete, as some importers may have been added after this writing):
    • Analytical Spectral Devices (ASD) Indico (Versions 6 and 7)
    • Hamilton Sundstrand files (.asf, .aif, .pdf)
    • Horiba JY files (various)
    • JCAMP (simple single-record formats) (.jcamp .jdx)
    • Matlab .mat files (.mat)
    • Thermo-Galactic SPC files (single and multifile formats) (.spc)

Note that Solo_Predictor does not support execution of custom, user-defined MATLAB® scripts or commands. Such functionality requires a full MATLAB license. Please contact Eigenvector Research for more information on using Solo_Predictor in a MATLAB environment or for creating a custom version of Solo_Predictor for your application.

Solo_Predictor can be connected through a socket interface using TCP/IP, through an ActiveX or .NET object, or operate in a wait-for-file mode. It can send results to a client and/or write to an output file. Solo_Predictor also maintains a text-based log file to aid with diagnosis of problems.

Interface Specifications

In this description of the Solo_Predictor interface, the term "client" refers to a user-specified application which is requesting a prediction and the term "server" refers to Solo_Predictor. The client is often a distributed control system (DCS) or other data collection software (instrumentation software, etc.) but can be any application which needs to apply a multivariate model to data. In general, the client issues one or more commands to Solo_Predictor either by passing data or by describing where the data can be retrieved. Additional commands are passed to instruct Solo_Predictor how to process that data and what results should be returned. See the Scripting Language section for details on the scripting language used for the instructions.

Introduction to Socket Interfaces

Solo_Predictor operates using standard TCP/IP (Transmission Control Protocol/Internet Protocol) communications over "socket" connections. Sockets are available on all operating system platforms (Windows, Mac, Linux) and are the same technology used in most Intranet and Internet communications including http, ftp, and other familiar inter-computer systems. They are also used for some "plug and play" hardware devices. Simply put, sockets are a general method to pass messages between two programs.

Although socket connections are most often used between computers, they can also be used when the client and server reside on the same computer (and even when that computer is not networked). When connecting two programs on the same computer, sockets are similar to other familiar inter-program communication systems (e.g. DDE or ActiveX) with these added advantages:

  1. Sockets are completely platform independent. The same communication methods are used on all operating systems and hardware. They can also be used across mixed operating systems and platforms (e.g. Windows to Linux).
  2. Most modern languages have some sort of provision for socket communication and require no proprietary technology to implement.
  3. Socket technology allows the client and server to be located on the same computer or separate computers connected by a network. The identical software and setup are used in both cases. The only modification needed is to provide a remote IP address or name for the server. As a result, sockets also inherently allow for distributed computation.

The procedure of communication over sockets is well described in many places. The basic procedure is:

  1. The client opens a socket connection between the client and server. This requires knowing the IP address of the server's computer (use "loopback" or "127.0.0.1" if the server and client are on the same computer) and the port number on which the server is "listening."
  2. The client sends a command to the server. The end of the message is indicated when no additional characters are available.
  3. The server receives the command and performs some operation.
  4. The server returns a response to the client often containing either a simple acknowledgement of the message or possibly some additional data or results.
  5. The socket connection is closed.
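
The five steps above can be sketched in Python roughly as follows. This is a minimal illustration, not code shipped with Solo_Predictor: the function name send_command, the UTF-8 encoding, and the assumption that the server closes the connection once its reply is sent are all hypothetical. Port 2211 is the default described under Server Connection Options.

```python
import socket


def send_command(command: str, host: str = "127.0.0.1", port: int = 2211,
                 timeout: float = 10.0) -> str:
    """Open a socket, send one command, and return the reply as text."""
    # Step 1: open the connection to the server's address and port.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Step 2: send the whole command (encoding is an assumption).
        sock.sendall(command.encode("utf-8"))
        # Steps 3-4: the server works, then replies. Read until the server
        # closes its side, or until the timeout if it keeps the socket open.
        chunks = []
        try:
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                chunks.append(chunk)
        except socket.timeout:
            if not chunks:
                raise  # nothing arrived at all: report the timeout
    # Step 5: leaving the "with" block closes the socket.
    return b"".join(chunks).decode("utf-8")
```

A call such as send_command("data='[1 2 3]';") would send a single script line to a server on the local computer. If the server is configured with an end-of-message string, the reply should be read up to that string rather than until the connection closes.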

The messages passed to Solo_Predictor are passed in plain text, but the ability to pass XML to describe some more complicated data types also exists. The response from Solo_Predictor can be in any of a number of formats including plain text, XML, or HTML. In addition, Solo_Predictor also permits some standard HTTP-format (i.e. web browser-style) input and output messages. For more information on the message format, see the "Scripting Language" section in this manual.

See Appendix C Solo Predictor Example Connection Code for socket-connection coding examples.

End-of-Message Indicator Option

In some cases, a system has a high load (many programs running) or the messages being transferred are large. In these cases, the message transferred by the client may be broken up into smaller pieces. This may cause Solo_Predictor to believe the message is complete before it has received the entire message. In these cases, Solo_Predictor can be told to expect an end-of-message (EOM) character or string (e.g. "[EOM]") and it will wait to process a message until it sees that string arrive. See Incoming Message Format and Timeout Settings for how to set an EOM string.
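
The same idea applies on the client side, because the configured EOM string is also appended to messages returned by the server. The sketch below shows one way to accumulate pieces until the marker arrives. The function name recv_until_eom is hypothetical, and the "[EOM]" default is only the example string used above.

```python
import socket


def recv_until_eom(sock: socket.socket, eom: bytes = b"[EOM]",
                   bufsize: int = 4096) -> bytes:
    """Read from sock until the EOM marker arrives; return the text before it."""
    buffer = bytearray()
    while True:
        # Search the accumulated buffer so that a marker split across
        # two received pieces is still found.
        idx = buffer.find(eom)
        if idx != -1:
            return bytes(buffer[:idx])
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before the EOM marker arrived")
        buffer.extend(chunk)
```

Searching the whole buffer on every pass is simple rather than fast, which is adequate for messages of modest size.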

POST Protocol Option

Solo_Predictor also accepts the common HTTP POST protocol for incoming messages. This format specifies the expected length of the message and thus allows messages to be split into segments, because Solo_Predictor will not process the message until the received content reaches that length. See this external page for a simple example of the POST protocol format. Although the standard POST format allows specification of different content types, the only Content-Type header which Solo_Predictor currently supports is text/plain. The following gives an example of a valid POST message for Solo_Predictor:

  POST . HTTP/1.0
  Content-Length: 15
  Content-Type: text/plain

  data='[1 2 3]';

Also note that outgoing messages from Solo_Predictor are never "chunked" (split into several pieces) nor do they ever use the POST format.
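
A client can compute the Content-Length header from the script itself rather than counting characters by hand. The following is a sketch only: build_post_message is a hypothetical helper, and the CRLF line endings follow general HTTP convention rather than anything stated in this manual.

```python
def build_post_message(script: str) -> bytes:
    """Wrap a script in the POST format shown above."""
    body = script.encode("utf-8")
    header = (
        "POST . HTTP/1.0\r\n"
        f"Content-Length: {len(body)}\r\n"  # length of the body in bytes
        "Content-Type: text/plain\r\n"      # the only supported content type
        "\r\n"                              # blank line separates header and body
    ).encode("ascii")
    return header + body
```

For the script data='[1 2 3]'; this produces a Content-Length of 15, matching the example above.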

ActiveX and .NET Interfaces

For client applications which cannot or do not want to use sockets, Solo_Predictor provides both an ActiveX and .NET suite of objects called EigenvectorTools which can communicate with Solo_Predictor without the client having to implement socket interface code. EigenvectorTools must be installed on the same computer as the client application, but Solo_Predictor can still be located on the same computer or on a separate computer (if the socket option is used). Please note that EigenvectorTools are only available on Windows. Other platforms must use sockets to communicate with Solo_Predictor.

For information on using EigenvectorTools, see the help page EigenvectorTools. Note that although the EigenvectorTools page makes reference to accessing graphical user interfaces (GUIs), Solo_Predictor does not allow access to the GUIs. It allows only the creation of data objects and the application of models.

Wait-For-File Interface

Solo_Predictor also offers a basic wait-for-file method of interface. This feature is designed for compatibility with legacy systems which may not offer flexible interfacing and allows a client to trigger an analysis by simply dropping a readable file into a specified folder. Solo_Predictor can be configured to write a response file for the client to read the results of the analysis. For more information on this option, see the Installation and Configuration section and the Script Construction section.

Single- and Multi-Client Servers

Upon starting up, Solo_Predictor will automatically identify itself ("imprint") with the first client computer that makes contact with it. After imprinting, only that computer will be able to send commands to the server. This is true whether the client and server are on the same computer or on separate computers. Solo_Predictor can only be reset to respond to another client by restarting the server.

Some licenses will permit more than one client computer to access the predictor simultaneously. Thus, a single predictor can be installed on a centrally-located, networked computer and serve a number of clients on different computers (or multiple clients on the same local computer). Note that although multiple clients can make connections and request predictions, the following conditions are put into place:

  1. Each client normally has its own workspace to store data and results. That is, one client cannot normally access the workspace of other clients. This can be disabled if, for example, multiple clients are contributing to the data used to make a prediction or when a remote client will be used to interrogate the workspace of another client. See the Installation and Configuration section for more information on workspace options.
  2. In order to assure the fastest response for a given client, Solo_Predictor will only execute one client's request at a time.

Please contact Eigenvector Research, Inc. for more information on multi-client licenses.

Installation and Configuration

The following section describes the options available for configuring Solo_Predictor.

Installation

Solo_Predictor is packaged in several different ways depending on the platform on which it is being installed. Follow the instructions provided with the downloaded software to install on the appropriate platform.

Solo_Predictor is typically run by a start-up process so that it is always available. However, it can also be started "on-demand" by simply executing the Solo_Predictor file or shortcut (again depending on the operating system). The options for stopping or restarting the server depend on the configuration of the Status Window (see below).

Prior to the first start of Solo_Predictor, the user must enter their license code into the configuration file. See the License Code section below for how to add the license code to the configuration file.

If connecting to Solo_Predictor via sockets, the server's IP address depends on the local network setup. If the client is running on the same system as Solo_Predictor, then the loopback address (127.0.0.1) can be used for both client and server. The port number is configured as described below. If, however, Solo_Predictor is on a different computer than the client, the client must make a connection into the computer running Solo_Predictor. Normally this is done by IP address but most sockets provide some means for looking up an IP address based on the computer name. If dynamic IP addresses are being used, it is recommended that the Solo_Predictor computer be set up with a preference for a given IP address. However, if the IP address does get changed, the client will need to be pointed to the new address.
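
As an illustration of the name lookup mentioned above, Python's standard socket module can translate a computer name into an IP address before connecting. The function name resolve_server is hypothetical, and the sketch assumes an IPv4 network.

```python
import socket


def resolve_server(name: str, port: int = 2211) -> str:
    """Return the first IPv4 address found for a computer name or IP string."""
    infos = socket.getaddrinfo(name, port,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return infos[0][4][0]
```

Resolving the name each time a connection is opened, instead of storing the address, lets a client follow the server if its dynamic IP address changes. A name that cannot be resolved raises socket.gaierror.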

For more information on programming socket connections, see Appendix C: Solo_Predictor_Example_Connection_Code.

Configuration

All configuration of Solo_Predictor is accomplished through the defaults.xml file which is located in the program's main folder. This XML file contains a number of tags which can be edited by the user. Note that changes in this file will not be read by Solo_Predictor until the server is stopped and restarted.

The tags within the <socketserver> tag control the server settings. In each case, an option value is provided using standard XML notation:

 <optionname>value</optionname>

In addition, inside each opening tag, several attributes are set:

 <optionname class="numeric" size="[1,1]">1</optionname>

The "class" attribute should not be changed from the given value. The "size" attribute is informational only and can be omitted.
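
A script that audits a configuration can read options in this notation with a standard XML parser. The sketch below uses a made-up fragment (SAMPLE) rather than a real defaults.xml, whose full layout is not reproduced here, and the helper name read_option is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the notation described above.
SAMPLE = """<socketserver>
  <port class="numeric" size="[1,1]">2211</port>
  <controls class="string">status</controls>
</socketserver>"""


def read_option(xml_text: str, name: str):
    """Return the value of option `name`, converted according to its class."""
    root = ET.fromstring(xml_text)
    node = root.find(f".//{name}")
    if node is None:
        raise KeyError(name)
    text = (node.text or "").strip()
    if node.get("class") == "numeric":
        return float(text)  # float() also accepts the value "inf"
    return text             # other classes are returned as plain text
```

Here read_option(SAMPLE, "port") gives the number 2211.0 and read_option(SAMPLE, "controls") gives the string "status".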

The following are the user-modifiable options. The expected class attribute is included in parentheses.

License Code

At the bottom of the configuration file is the licensecode tag which is empty in a new installation. Entering a code into this tag allows Solo_Predictor to start up without asking the user for the code. The license code, provided by Eigenvector Research, can be added to the file by simply entering it between the <licensecode></licensecode> tags. The next time the server is restarted, the code (if valid) will be used and Solo_Predictor will not prompt for a code. If the license code is a demonstration code and expires, or if the code is invalid, Solo_Predictor will display a dialog indicating the error when it starts up. When done, the license code tag would look like that shown below:

<licensecode>123456-3456789-12-34ab-cdef</licensecode>

Status Window and Controls Options

These options control functionality of the Solo_Predictor status window.

  • controls (class="string"): Manages the display and functionality of the status window. Valid settings include:
    • none: no status window will be given and all controls are hidden.
    • status: status window is shown, but all server controls are disabled.
    • limited: status window is shown and only the "restart" control is enabled.
    • full: status window is shown and all controls (stop/start/restart/exit) are enabled.
    Except when the "full" setting is used, the only means to stop and/or restart the server is an operating-system-specific process kill command (Task Manager in Windows, Activity Monitor in OS X, or the kill command in Linux or Unix). Default is "status".
  • max_screen_lines (class="numeric"): Defines the total number of past message lines displayed on the (on-screen) status window. Default is 20 lines.
  • pulseperiod (class="numeric"): Defines the number of seconds between "pulse" messages in the status window. Default is 15 seconds.


Log File

These options control the log file and the level of detail and age of messages retained.

  • log_severity (class="numeric"): Defines the minimum message "severity" which will be reported in the log file (on disk). The level must be one of the following:
    • 0 = log all messages
    • 1 = log all startup, shutdown, rejected connection, and fatal error messages
    • 2 = log fatal error messages only
    • 3 = log no messages (disable logging)
    The default level is 1 (one).
  • max_log_size (class="numeric"): Defines the maximum log file size (in bytes). Solo_Predictor will discard old messages to keep the log file from exceeding this size. Default is 50000 (50 KB).
  • logfile (class="string"): Gives the path and filename to use for the log file. By default, this is solo_pred.log in the user's temporary directory. The exact location of the temporary folder depends on the operating system. For example, this is usually:
    • Windows XP: \Documents and Settings\username\Local Settings\Temp
    • Windows Vista: \Users\username\AppData\Local\Temp
  • log_backups (class="numeric"): Indicates how many "backup" copies of the log file are allowed to exist at any one time. If zero, the single log file is rolled over in place and old messages are removed. If greater than zero, the existing log file will be rolled into a backup file when it reaches max_log_size bytes. Old backup files are renumbered in increasing order (allowing up to this number of backup files) and the oldest file is deleted. For example, a value of 2 will allow two backups of the log file (each max_log_size in bytes).
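
The renumbering described for log_backups can be pictured with the short sketch below. It is an illustration only: the ".1", ".2" suffix scheme and the function name rotate_log are assumptions, since the manual does not state how Solo_Predictor names its backup files.

```python
import os


def rotate_log(logfile: str, log_backups: int) -> None:
    """Roll logfile into numbered backups, where logfile.1 is the newest."""
    if log_backups <= 0:
        return  # single-file mode (trimming old messages) is not shown here
    # Delete the oldest backup so the count never exceeds log_backups.
    oldest = f"{logfile}.{log_backups}"
    if os.path.exists(oldest):
        os.remove(oldest)
    # Renumber the remaining backups in increasing order: .1 -> .2, and so on.
    for n in range(log_backups - 1, 0, -1):
        src = f"{logfile}.{n}"
        if os.path.exists(src):
            os.replace(src, f"{logfile}.{n + 1}")
    # The current log becomes the newest backup.
    if os.path.exists(logfile):
        os.replace(logfile, f"{logfile}.1")
```

With log_backups set to 2, three successive rotations leave only the two most recent logs on disk as backups.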

Server Connection Options

These options control the behavior of the socket server and the kind of connections it will accept.

  • port (class="numeric"): Defines the computer port on which the socket server will respond to requests. This value should be changed with great care as some ports are used by the operating system and other software. The default port value of 2211 is selected to minimize conflict with known port uses. Additional ports which might be of use include: 2210, 2212, and 2005. Contact Eigenvector Research for more information on valid ports.
  • loopbackonly (class="numeric"): If set to 1 (one), the server will only respond to a client which is located on the same computer as the server. All external requests will be ignored. A value of 0 (zero) will respond to any IP address (see also validip option). Default is 1 (one).
  • validip (class="cell"): Gives a list of valid IP addresses to which the server may respond. If empty, a client at any IP address is permitted to contact the server (unless the loopbackonly option is set to 1 (one)). Remember that the server is limited to a given number of clients (usually 1 (one)) and once it has been contacted by that many clients, it cannot respond to any other clients. This setting only limits the clients who can contact the server before it has imprinted on a given client.
    The IP addresses must be supplied as separate items, each inside a set of tags, with all tags enclosed in a set of tags. For example:

 <validip class="cell">

10.0.0.1 10.0.0.2 </validip>

  • privateworkspace (class="numeric"): If set to 1 (one), each client will have its own workspace to store objects and no client can access another client's objects. If set to 0 (zero), all clients share the same workspace, and a client may access and/or overwrite other clients' objects. This may lead to unexpected results (for example, if a given client expects a model to stay loaded but other clients use the same object name and overwrite the model). Default is 1 (one).
  • maxclients (class="numeric"): Maximum number of clients allowed to connect to the server. If 1 (one), the first client to connect to the server will be the only client ever allowed to connect. If 0 (zero), no socket connections will be allowed. If set to the numeric value inf (infinity), there will be no limit to the number of clients allowed to connect. This final setting is used when Solo_Predictor is being used as a web server, for example.
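
The way loopbackonly, validip, and maxclients interact can be summarized as a decision function. This is one reading of the rules above written as a Python sketch, not the server's actual code. The function client_allowed and its arguments are hypothetical, and it assumes that a client on the same computer connects through a loopback address.

```python
import ipaddress


def client_allowed(client_ip, loopbackonly, validip, known_clients, maxclients):
    """Decide whether a connection from client_ip would be accepted."""
    # A client the server has already imprinted on stays accepted.
    if client_ip in known_clients:
        return True
    # Once maxclients clients are known, no new client is accepted.
    if len(known_clients) >= maxclients:
        return False
    # loopbackonly = 1: ignore every request from another computer.
    if loopbackonly and not ipaddress.ip_address(client_ip).is_loopback:
        return False
    # A non-empty validip list restricts which new clients may connect.
    if validip and client_ip not in validip:
        return False
    return True
```

Passing float("inf") as maxclients models the unlimited setting, and passing 0 rejects every connection.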

Incoming Message Format and Timeout Settings

  • eomstring (class="string"): End-of-message character or string. If non-empty, this character or string must be passed to indicate the end of a message. The same string will be appended to any messages returned by the server. The use of an EOM string allows Solo_Predictor to function on higher-load systems or with large messages where the entire contents of the message may not be queued and delivered all at once. See Introduction to Socket Interfaces for more information. It is best to choose a distinctive string which will never show up in a normal message, for example: **EndOfMessage**
  • tickletimeout (class="numeric"): Number of seconds allowed between the socket opening and receipt of the first character before the server sends the client a space character. This is required to tickle some clients into responding.
  • emptytimeout (class="numeric"): Number of seconds allowed before the first message is received from the client. At timeout, the server throws an empty packet message.
  • eomtimeout (class="numeric"): Number of seconds after which, if no more characters have been received, the message is considered complete (generally for use with POST messages and eomstring messages only).

Wait-For-File Options

Wait-for-file options control the optional Solo_Predictor wait-for-file engine. This engine will watch a given folder for a new file (optionally of a specific file type). When a new file appears, the file will be automatically loaded as the object "data" and a specific script (stored in a disk file) will be executed. This script can use a :writefile command (see Script Construction section) to store results of an analysis in an output file.

  • waitforfile (class="string"): either "on" or "off" (the default). When "on", the wait-for-file functionality is enabled (although the waitfolder and waitscript must also be non-empty strings for wait-for-file to operate).
  • waitfolder (class="string"): defines the folder (local or networked) in which Solo_Predictor should look for new files.
  • waitfilespec (class="string"): defines the file specifications (if any) to which the wait-for-file should be limited. For example, waitfilespec = "*.dat" will only recognize .dat files appearing in the wait folder.
  • waitscript (class="string"): defines the filename containing the script to execute when a new file is found. This option must contain the entire path to the file. Note that the indicated script should expect to find the loaded data in the object named "data" in the current workspace.
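
From the client's side, the wait-for-file exchange amounts to placing a file in the watched folder and waiting for the output file that the wait script writes. The sketch below is hypothetical throughout: the function drop_and_wait, the ".part" staging name, and the polling interval are illustrative choices. It assumes that waitfilespec (for example "*.dat") keeps the server from picking up the staging file, and that the response file is complete when it first appears.

```python
import os
import time


def drop_and_wait(wait_folder, filename, payload, response_path,
                  timeout=30.0, poll=0.5):
    """Place a data file in the wait folder, then wait for the response file."""
    target = os.path.join(wait_folder, filename)
    staging = target + ".part"
    # Write under a temporary name first, then rename, so the server
    # never sees a half-written data file under its final name.
    with open(staging, "wb") as fh:
        fh.write(payload)
    os.replace(staging, target)
    # Poll for the output file written by the wait script.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(response_path):
            with open(response_path, "rb") as fh:
                return fh.read()
        time.sleep(poll)
    raise TimeoutError(f"no response file at {response_path}")
```

The location of the response file depends entirely on the :writefile command in the wait script, so response_path must be agreed between the script author and the client.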

Output Format Options

  • default_format (class="string"): Defines the default response format. This is the output format used by the server if no format type is included in the request script. Valid types are: "xml", "plain" or "html". See Scripting Language for more information on these formats. Default is "xml".
  • writefilefolder (class="string"): Defines the top-level folder to which writefile is allowed to write. The writefile command can ONLY write to this folder and its sub-folders. If writefilefolder is an empty string, writefile is NOT permitted at all.
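
The writefilefolder restriction is a path containment rule. The sketch below shows how such a check can be expressed in Python. It illustrates the rule only and says nothing about how Solo_Predictor enforces it internally. The function name write_allowed is hypothetical.

```python
import os


def write_allowed(writefilefolder: str, requested: str) -> bool:
    """Return True if `requested` lies inside writefilefolder or a sub-folder."""
    if not writefilefolder:
        return False  # empty setting: writefile is not permitted at all
    root = os.path.realpath(writefilefolder)
    # Resolve ".." segments and symbolic links before comparing paths.
    target = os.path.realpath(os.path.join(root, requested))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        return False  # for example, paths on different Windows drives
```

Resolving the path before comparing matters, because a request such as "../other/file.txt" starts inside the permitted folder textually but ends outside it.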

User Timers

User timers allow you to schedule particular scripts to be run at specified intervals. These scripts could perform cleanup, or trigger hardware, or even system restarts (to clean up system resources). User timers primarily consist of specifying a script to run, a time interval at which the script should be run, and the recurrence of the event (one time, repeating).

User timers are created by adding one or more <usertimer> tags to the configuration file (within the socketserver tag) with the following possible properties set as tags within each outer tag:

Required Usertimer Properties

  • script : (class="string") Any valid Solo_Predictor script to execute. Often uses an :include command to read a script.
  • name : (class="string") REQUIRED descriptive name for the timer object, should be unique.

Recommended Usertimer Properties

  • ExecutionMode : (class="string") [ 'fixedDelay' | 'fixedRate' | 'fixedSpacing' |{'singleShot'}] type of timer execution. See Mathworks timer object documentation for more information.
  • Period : (class="numeric") [default = 1] seconds between executions if any mode other than 'singleShot'
  • BusyMode : (class="string") [{'drop'}| 'error' | 'queue' ] controls overlapping timer executions. 'drop' ignores timer requests which occur while another timer process is executing. 'queue' keeps a queued list of executions which occurred while another timer was executing and executes these missed actions in sequence. 'error' throws an error if two timers conflict.
  • StartDelay : (class="numeric") [default = 0] number of seconds to wait before initial execution.

Other Usertimer Properties:

  • error : (class="string") defines how the timer should respond if an untrapped error occurs during the script. One of the following strings:
    'die' Let the timer die without action except log messages.
    'restart' [default] Attempt to restart the timer (see maxrestart setting below)
    'reboot' Reboot the computer (requires "shutdown" (Windows) or "reboot" (Linux) functions be on the system path)
    'stop' Stops Solo_Predictor (often used to trigger an alarm on the watchdog program.)
  • maxrestart : (class="numeric") number of times a timer can be restarted before switching to "errorrestart" error mode [default = 5]
  • errorrestart : (class="string") error mode (see error above) to use if restarts failed maxrestart times [default = 'die']
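
The relationship between error, maxrestart, and errorrestart can be stated as a small decision rule. The sketch below is one interpretation of the descriptions above, using the documented defaults. The function next_error_action and the restart counter are hypothetical.

```python
VALID_MODES = ("die", "restart", "reboot", "stop")


def next_error_action(error="restart", restarts_so_far=0,
                      maxrestart=5, errorrestart="die"):
    """Return the action taken after an untrapped error in a timer script."""
    if error not in VALID_MODES or errorrestart not in VALID_MODES:
        raise ValueError("unknown error mode")
    # Modes other than 'restart' are acted on directly.
    if error != "restart":
        return error
    # 'restart' is honored until the timer has been restarted maxrestart times.
    if restarts_so_far < maxrestart:
        return "restart"
    # After that, the timer switches to the errorrestart mode.
    return errorrestart
```

With the defaults, a timer is restarted after each of its first five failures and is allowed to die on the sixth.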

Script Construction

Solo_Predictor provides a simple, flexible scripting language with which clients can send instructions to load data, apply a model to that data ("make a prediction"), and retrieve results. For details, see the page: Solo_Predictor Script Construction

Appendices

The following additional information is available about using Solo_Predictor: