SiPAT Interface and Evri faq: Difference between pages

From Eigenvector Research Documentation Wiki
(Difference between pages)
imported>Jeremy
 
imported>Lyle
 
==Introduction==
__TOC__
Eigenvector Research's [[Function_Reference_Manual|PLS_Toolbox]] and [http://www.siemens.com/ Siemens' SiPAT product] can be used together to deploy PLS_Toolbox or Solo multivariate analysis models in process control applications. This integration uses Mathworks' [http://www.mathworks.com/ Matlab] and functionality built into SiPAT to run custom Matlab functions. These functions can include calls to PLS_Toolbox functions.
==Importing / Exporting==


The following discusses the PLS_Toolbox-specific configuration and provides example m-files which can be used to quickly get SiPAT and PLS_Toolbox integrated.
[[faq_concatenate_multiple_files|How do I concatenate multiple files into a single DataSet?]]
__TOC__


====Licensing====
[[faq_create_multivariate_image_from_separate_images|How do I create a multivariate image from separate images?]]


Note that the [http://www.eigenvector.com/software/license_evri.html PLS_Toolbox license] requires that, unless special site licensing arrangements are made, you must obtain a separate PLS_Toolbox license for each instance of any PLS_Toolbox code you wish to deploy. Special licensing options are available. See the [http://www.eigenvector.com Eigenvector Website] for more information.
[[faq_export_PCA_scores_and_loadings_to_text_file|How do I export PCA scores and loadings to a text file (to read into MS Excel, for example)?]]


==Installation and Basic Configuration==
[[faq_import_three-way_data|How do I import three-way data into Solo or PLS_Toolbox?]]
===SiPAT Configuration===


The user is directed to the Siemens documentation for details on configuration of SiPAT for use with Matlab. The discussion below describes some of the PLS_Toolbox-specific considerations of this configuration.
[[faq_import_horiba_NGC_64bit |Why can't I import a Horiba NGC file on my 64-bit computer?]]


===Matlab and PLS_Toolbox Configuration===
[[faq_SPCREADR_cant_read_multiple_files |Why can't SPCREADR read multiple files I've selected?]]


With Matlab already installed (as per [http://www.mathworks.com The Mathworks installation instructions]), PLS_Toolbox can be copied onto the target computer using any of the available file types as described in the [[Installation|standard installation instructions]]. The primary difference from the standard installation is that you must create two special files and place them in the PLS_Toolbox/utilities folder on the target computer:
[[faq_some_EXCEL_files_fail_to_import |Why do some Excel files fail to import?]]
# Create a text file named '''evrilicense.lic''' and put your license code (available from the [http://download.eigenvector.com Eigenvector Research download page]) as a single line in that file. You can also obtain an evrilicense.lic file from the [mailto:helpdesk@eigenvector.com Eigenvector Helpdesk].
# Create an ''empty'' text file named '''evrinetwork.lic'''. This file instructs PLS_Toolbox to ignore installation errors and run without warnings (necessary when calling PLS_Toolbox functions from within SiPAT).
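
The two files described above can also be created from the Matlab command line. The following is a minimal sketch only: the folder path and the license code string are placeholders (assumptions, not values from this page) and must be replaced with the actual PLS_Toolbox/utilities folder and your own license code.

```matlab
% Placeholder folder: replace with the real PLS_Toolbox/utilities folder
% on the target computer. A temporary folder is used here so the sketch
% can be run without touching an existing installation.
utilDir = fullfile(tempdir,'PLS_Toolbox_demo','utilities');
if ~exist(utilDir,'dir')
    mkdir(utilDir);
end

% Placeholder license code: replace with the code from the download page.
licenseCode = 'YOUR-LICENSE-CODE';

% evrilicense.lic: the license code as a single line
fid = fopen(fullfile(utilDir,'evrilicense.lic'),'w');
if fid == -1
    error('Could not create evrilicense.lic in %s',utilDir);
end
fprintf(fid,'%s',licenseCode);
fclose(fid);

% evrinetwork.lic: an empty file (open for writing and close immediately)
fid = fopen(fullfile(utilDir,'evrinetwork.lic'),'w');
if fid == -1
    error('Could not create evrinetwork.lic in %s',utilDir);
end
fclose(fid);
```

After running this, both files should be present in the chosen folder, with evrinetwork.lic having a size of zero bytes.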


==Specific Model and M-file Configuration==
==General==


The SiPAT interface into Matlab provides for the specification of a data file to load and a Matlab function to execute (along with specific input and output configuration information).
[[faq_PARALIND_in_PLS_Toolbox |Can I do PARALIND in PLS_Toolbox?]]


In the examples given here, the data file will be in the Matlab MAT format and will contain the model to apply. The Matlab function will be given in the Matlab m-file format and will contain the specific instructions for applying the model and returning the results to SiPAT. These examples assume that only one model is applied per method (no special calibration transfer or other pre-transformation steps are used).
[[faq_install_on_more_than_one_PC | Can I install PLS_Toolbox (or Solo) on more than one PC, such as on my desktop and laptop computer?]]


===Model MAT File Creation===
[[faq_multiple_class_sets_together_in_SIMCA_PLSDA_LDA | Can I use multiple class sets (categorical variables) together in a SIMCA, PLSDA, or LDA model?]]


The model you wish to apply to new data should be saved from Matlab or Solo into a MAT file as the one and only item stored in the MAT file. In the [[Analysis Window]], this is done by selecting the menu item: File > Save Model and using the [[WorkspaceBrowser_ImportingData#To_save_imported_data_to_a_.mat_file|save dialog]] to specify a filename and an item name. This will save the model into the given filename with the specified item name. Although the MAT file name can be any standard filename, it will make m-file construction easier if the item name used is always the same in all saved models. In the example here, we will assume that the item is called "model". If a different name is used, the SiPAT-specific configuration comments in the m-file (see below) will need to be modified to indicate to SiPAT the name used for the model.
[[faq_more_info_on_R_Squared_statistic | Can you give me more information on the R-Squared statistic?]]


If saving the model from the Matlab command line, the following command can be used (assuming the model to use is currently named "model" in your Matlab workspace):
[[faq_how_RMSEC_and_RMSECV_related to R2Y_and_Q2Y_seen_other_software | How are RMSEC and RMSECV related to R2Y and Q2Y I see in other software?]]
  save myfile.mat model
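
Because the MAT file must contain the model as its one and only item, it can be worth checking the file after saving. The following is a minimal sketch; the <code>model</code> structure here is a stand-in placeholder (a real model would come from PLS_Toolbox or Solo), and the file is written to a temporary folder purely for illustration.

```matlab
% Stand-in for a real PLS_Toolbox/Solo model (placeholder for illustration)
model = struct('modeltype','PLS');

% Save only the variable named "model" into the MAT file
matFile = fullfile(tempdir,'myfile.mat');
save(matFile,'model');

% List the contents of the MAT file without loading it
info = whos('-file',matFile);

% The file should hold exactly one item, and it should be called "model"
if numel(info) ~= 1 || ~strcmp(info(1).name,'model')
    error('MAT file must contain a single item named "model".');
end
```

If the check fails, re-save the file listing only the model variable in the <code>save</code> command.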


===Function M-file Creation===
[[faq_convergence_of_PARAFAC| Convergence of PARAFAC. How much variation between models is expected when a particular PARAFAC model is fit multiple times with the same settings?]]


The second part of the SiPAT/PLS_Toolbox interface is a Matlab m-file which contains a function definition. Note: the contents of this m-file must actually be a Matlab function, meaning it must contain a function header line as shown in the examples below. It cannot be a "script" in the strict Matlab sense of that term, which means code that is not wrapped inside a function definition.
[[faq_does_software_stop_working_if_maintenance_expires | Does the software stop working if my maintenance expires?]]


Three example m-files are given below: one for use with any of the regression model types (PLS, PCR, MLR, CLS, SVM, LWR, etc.), one for use with classification model types (PLSDA, KNN, SIMCA, SVMDA/SVM-C), and one for use with principal component analysis (PCA) models. Each of these m-files assumes that the input is a single vector of values (passed by SiPAT) and each returns two or three values which correspond to the predictions from the model.
[[faq_report_a_problem_with_PLS_Toolbox | How and where do I report a problem with PLS_Toolbox?]]


All of these functions also assume that the input data has two columns: the first column contains axis scale information for the variables (such as wavelength, m/z, time, etc.) and the second column contains the actual measured data. If this does not fit the type of data being passed by SiPAT (e.g. no axisscale information), the initial lines:
[[faq_how_are_T_contributions_calculated | How are T-contributions calculated?]]


<pre>
    %convert second column of input data into a dataset (if not appropriate column, change next line)
    x = data(:,2);
    x = dataset(double(x'));


    %Assume first column is axisscale information. If not true, comment out the next line
    x.axisscale{2} = double(data(:,1));
</pre>


should be modified as necessary to handle the input data. For example, if only a single column of values is being passed, the following can be substituted for the above code:
[[faq_improve_performance_with_PLS_Toolbx_and_Matlab_on_Mac | How can I improve performance with PLS_Toolbox and Matlab on the Mac platform?]]


<pre>
    %convert column of input data into a dataset
    x = dataset(double(data'));
</pre>
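
If the same m-file may receive either layout, the two cases above could be distinguished by checking the number of columns. The following is a minimal sketch using plain Matlab only; the sample <code>data</code> values and the variable names <code>vals</code> and <code>ax</code> are made up for illustration. The resulting row vector would then be passed to <code>dataset</code> as in the snippets above.

```matlab
% Made-up example input: axis scale in column 1, measurements in column 2
data = single([1000 0.10; 1001 0.25; 1002 0.40]);

if size(data,2) >= 2
    % Two-column case: split axis scale and measured values
    ax   = double(data(:,1))';   % axis scale as a row vector
    vals = double(data(:,2))';   % measurements as a row vector
else
    % Single-column case: no axis scale information available
    ax   = [];
    vals = double(data(:))';
end

% With PLS_Toolbox on the path, the DataSet would then be built with:
%   x = dataset(vals);
%   if ~isempty(ax), x.axisscale{2} = ax; end
```

Either way, <code>vals</code> ends up as a single row (one sample), which is the orientation the example m-files rely on.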


'''Errors:''' These functions make use of try/catch statements to trap errors and save pertinent information into a text file in the root C: directory. The location of this file can be changed as desired. This code is added to help diagnose configuration problems. It is recommended that, in a final deployment, specific error codes be used such as setting all outputs to "inf" to trigger an alarm in SiPAT.
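
The pattern described above (default outputs of <code>inf</code>, with error details written to a text file) can be sketched in plain Matlab as follows. This is an illustration only: the log file location is a placeholder in the temporary folder, the failure is simulated, and the example m-files below use <code>lasterror</code> with the PLS_Toolbox <code>encode</code> function rather than the <code>MException</code> object shown here.

```matlab
% Placeholder log file location (change as desired)
logFile = fullfile(tempdir,'sipat_error_demo.txt');

% Sentinel defaults: these values are returned if anything goes wrong
y   = inf;
q_x = inf;
H_x = inf;

try
    % Simulated failure standing in for the model application step
    error('demo:predictionFailed','Simulated failure while applying the model');
catch err
    % Record what went wrong; skip logging if the file cannot be opened
    fid = fopen(logFile,'w');
    if fid ~= -1
        fprintf(fid,'%s: %s\n',err.identifier,err.message);
        fclose(fid);
    end
end
```

Because the outputs are assigned before the <code>try</code> block, they keep their <code>inf</code> values whenever an error is caught, which is what allows SiPAT to raise an alarm on them.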
[[faq_build_a_classification_model_from_class_set_other_than_the_first | How do I build a classification model from a class set other than the first?]]


====Regression Model Predictions====
[[faq_choose_between_different_cross_validation_leave_out_options | How do I choose between the different cross-validation leave-out options?]]


The following m-file contents are appropriate for use with regression model types:
[[faq_reference_Eigenvector| How do I cite/reference Eigenvector?]]


<pre>
%START SIPAT
%<CONFIG>
%  <MODEL>REGRESSION</MODEL>
%  <PREFIX></PREFIX>
%  <SUFFIX></SUFFIX>
%  <INPUTS>
%      <INPUT Name="data" XDataType="MultiValue" YDataType="Single" />
%  </INPUTS>
%  <OUTPUTS>
%      <OUTPUT Name="y" XDataType="SingleValue" YDataType="Double" />
%      <OUTPUT Name="q_x" XDataType="SingleValue" YDataType="Double" />
%      <OUTPUT Name="H_x" XDataType="SingleValue" YDataType="Double" />
%  </OUTPUTS>
%  <FUNCTION>[y,q_x,H_x]=regpred(data,model)</FUNCTION>
%</CONFIG>
%END SIPAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Regression using PLS_Toolbox from Eigenvector
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%This function calculates the responses y and the Q-residuals
%(q_x) and Hotellings T2 (H_x) corresponding to the input variables data
%(data) using a standard model structure.
%
%Before using this function, make sure you loaded the following data into
%the workspace:
%  model: Model structure used in PLS_Toolbox/Solo including all pretreatment.


function [y,q_x,H_x] = regpred(data,model)


%defaults returned if something goes wrong (y) or if we can't get Q or T^2 from this model type
y   = inf;
q_x = inf;
H_x = inf;


try
    %convert second column of input data into a dataset (if not appropriate column, change next line)
    x = data(:,2);
    x = dataset(double(x'));


    %Assume first column is axisscale information. If not true, comment out the next line
    x.axisscale{2} = double(data(:,1));


    %make a prediction
    opts =[];
    opts.plots='none';
    opts.display='off';
    pred_x = feval(lower(model.modeltype),x,model,opts);


    %return prediction in y 
    y = double(pred_x.pred{2});
    if isfield(pred_x,'ssqresiduals')
        q_x = double(pred_x.ssqresiduals{1,1}./pred_x.detail.reslim{1});
        H_x = double(pred_x.tsqs{1}./pred_x.detail.tsqlim{1});
    end


catch
    %errors are saved to the following file (change location as desired)
    fid=fopen('C:\sipat_error_sout.txt','w');
    fwrite(fid,encode(lasterror));
    fwrite(fid,encode(evalin('base','whos')));
    fclose(fid);
end
</pre>
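
Before configuring SiPAT, it may help to call the function once from the Matlab command line with input shaped like the data SiPAT will pass. The following is a sketch under several assumptions: the axis values and measurements are synthetic, the file name <code>myfile.mat</code> matches the save example above, and the call is only attempted if <code>regpred.m</code> and the MAT file are actually found.

```matlab
% Synthetic two-column input: axis scale (e.g. wavelength) and measurements.
% Cast to single because the SiPAT configuration declares YDataType="Single".
axisValues = linspace(1000,2000,50)';
measured   = rand(50,1);
data       = single([axisValues measured]);

% Only attempt the call if the function and model file are available
if exist('regpred','file') == 2 && exist('myfile.mat','file') == 2
    S = load('myfile.mat');              % expects a single item named "model"
    [y,q_x,H_x] = regpred(data,S.model);
    fprintf('y = %g, q_x = %g, H_x = %g\n',y(1),q_x,H_x);
else
    disp('regpred.m or myfile.mat not found; skipping test call.');
end
```

If all three outputs come back as <code>inf</code>, check the error text file written by the catch block for the cause. For a meaningful test, the synthetic input should be replaced by a real measurement with the same number of variables as the model's calibration data.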


====Classification Model Predictions====
[[faq_what_are_reduced_T^2_and_Q_Statistics | What are the "Reduced" T^2 and Q Statistics?]]


The following m-file contents are appropriate for use with classification model types:
[[faq_units_for_RMSEC_and_RMSECV_for_PLSDA | What are the units used for RMSEC and RMSECV when cross-validating PLSDA models?  Why do the cross-validation curves look strange for PLSDA?]]


<pre>
%START SIPAT
%<CONFIG>
%  <MODEL>CLASSIFICATION</MODEL>
%  <PREFIX></PREFIX>
%  <SUFFIX></SUFFIX>
%  <INPUTS>
%      <INPUT Name="data" XDataType="MultiValue" YDataType="Single" />
%  </INPUTS>
%  <OUTPUTS>
%      <OUTPUT Name="y" XDataType="SingleValue" YDataType="Double" />
%      <OUTPUT Name="q_x" XDataType="SingleValue" YDataType="Double" />
%      <OUTPUT Name="H_x" XDataType="SingleValue" YDataType="Double" />
%  </OUTPUTS>
%  <FUNCTION>[y,q_x,H_x]=classpred(data,model)</FUNCTION>
%</CONFIG>
%END SIPAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Classification using PLS_Toolbox from Eigenvector
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%This function calculates the numerical class assignment, the Q-residuals
%(q_x) and Hotellings T2 (H_x) corresponding to the input variables data
%(data) using a standard model structure.
%
%Before using this function, make sure you loaded the following data into
%the workspace:
%  model: Model structure used in PLS_Toolbox/Solo including all pretreatment.


function [y,q_x,H_x] = classpred(data,model)


%defaults returned if something goes wrong (y) or if we can't get Q or T^2 from this model type
y   = inf;
q_x = inf;
H_x = inf;


try
    %convert second column of input data into a dataset (if not appropriate column, change next line)
    x = data(:,2);
    x = dataset(double(x'));


    %Assume first column is axisscale information. If not true, comment out the next line
    x.axisscale{2} = double(data(:,1));


    %make a prediction
    opts =[];
    opts.plots='none';
    opts.display='off';
    pred_x = feval(lower(model.modeltype),x,model,opts);


    %return prediction in y 
    y = double(pred_x.classification.mostprobable);
    if isfield(pred_x,'ssqresiduals')
        q_x = double(pred_x.ssqresiduals{1,1}./pred_x.detail.reslim{1});
        H_x = double(pred_x.tsqs{1}./pred_x.detail.tsqlim{1});
    end


catch
    %errors are saved to the following file (change location as desired)
    fid=fopen('C:\sipat_error_sout.txt','w');
    fwrite(fid,encode(lasterror));
    fwrite(fid,encode(evalin('base','whos')));
    fclose(fid);
end
</pre>




====PCA Model Predictions====


The following m-file contents are appropriate for use with a PCA model type. As written, it returns only TWO values, the Q and T2 values indicating if the sample belongs in the PCA model or not. This construction can be used when using a PCA model to detect process anomalies.
[[Category:FAQ]]
 
<pre>
%START SIPAT
%<CONFIG>
%  <MODEL>PCA</MODEL>
%  <PREFIX></PREFIX>
%  <SUFFIX></SUFFIX>
%  <INPUTS>
%      <INPUT Name="data" XDataType="MultiValue" YDataType="Single" />
%  </INPUTS>
%  <OUTPUTS>
%      <OUTPUT Name="q_x" XDataType="SingleValue" YDataType="Double" />
%      <OUTPUT Name="H_x" XDataType="SingleValue" YDataType="Double" />
%  </OUTPUTS>
%  <FUNCTION>[q_x,H_x]=pcapred(data,model)</FUNCTION>
%</CONFIG>
%END SIPAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% PCA Projection using PLS_Toolbox from Eigenvector
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%This function calculates the Q-residuals (q_x) and Hotellings T2 (H_x)
%corresponding to the input variables data (data) using a standard model
%structure.
%
%Before using this function, make sure you loaded the following data into
%the workspace:
%  model: Model structure used in PLS_Toolbox/Solo including all pretreatment.
 
function [q_x,H_x] = pcapred(data,model)
 
%defaults if something goes wrong or we can't get Q or T^2 from this model type
q_x = inf;
H_x = inf;
 
try
    %convert second column of input data into a dataset (if not appropriate column, change next line)
    x = data(:,2);
    x = dataset(double(x'));
 
    %Assume first column is axisscale information. If not true, comment out the next line
    x.axisscale{2} = double(data(:,1));
 
    %make a prediction
    opts =[];
    opts.plots='none';
    opts.display='off';
    pred_x = feval(lower(model.modeltype),x,model,opts);
 
    %return prediction of Q and T^2
    if isfield(pred_x,'ssqresiduals')
        q_x = double(pred_x.ssqresiduals{1,1}./pred_x.detail.reslim{1});
        H_x = double(pred_x.tsqs{1}./pred_x.detail.tsqlim{1});
    end
 
    %NOTE: if output of scores is desired, add a "y" output to
    %the function and SiPAT definition and use the next line:
    %  y = pred_x.loads{1,1};
    %which would return all the scores (as a row vector) for the given data
 
catch
    %errors are saved to the following file (change location as desired)
    fid=fopen('C:\sipat_error_sout.txt','w');
    fwrite(fid,encode(lasterror));
    fwrite(fid,encode(evalin('base','whos')));
    fclose(fid);
end
</pre>
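
In the example m-files above, <code>q_x</code> and <code>H_x</code> are returned as the Q and T^2 values divided by their respective limits stored in the model, so a value above 1 indicates that the limit was exceeded, and <code>inf</code> indicates that the values could not be calculated. The following is a minimal sketch of how these outputs might be interpreted; the example numbers and the variable names <code>limitExceeded</code>, <code>evalFailed</code> and <code>alarm</code> are made up for illustration, and the actual alarm logic would normally be configured in SiPAT.

```matlab
% Made-up example outputs from pcapred (ratios of statistic to its limit)
q_x = 0.42;
H_x = 1.70;

% A ratio above 1 means the corresponding limit was exceeded
limitExceeded = (q_x > 1) || (H_x > 1);

% inf (or NaN) means the statistic could not be calculated
evalFailed = ~isfinite(q_x) || ~isfinite(H_x);

% Either condition is treated as an anomaly here
alarm = limitExceeded || evalFailed;
```

Keeping the two conditions separate makes it possible to tell a genuine process anomaly apart from a configuration or calculation failure.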

Revision as of 10:48, 30 November 2018


Importing / Exporting

How do I concatenate multiple files into a single DataSet?

How do I create a multivariate image from separate images?

How do I export PCA scores and loadings to a text file (to read into MS Excel, for example)?

How do I import three-way data into Solo or PLS_Toolbox?

Why can't I import a Horiba NGC file on my 64-bit computer?

Why can't SPCREADR read multiple files I've selected?

Why do some Excel files fail to import?

General

Can I do PARALIND in PLS_Toolbox?

Can I install PLS_Toolbox (or Solo) on more than one PC, such as on my desktop and laptop computer?

Can I use multiple class sets (categorical variables) together in a SIMCA, PLSDA, or LDA model?

Can you give me more information on the R-Squared statistic?

How are RMSEC and RMSECV related to R2Y and Q2Y I see in other software?

Convergence of PARAFAC. How much variation between models is expected when a particular PARAFAC model is fit multiple times with the same settings?

Does the software stop working if my maintenance expires?

How and where do I report a problem with PLS_Toolbox?

How are T-contributions calculated?

How are the ROC curves calculated for PLSDA?

How are the error bars calculated for a regression model and can they be related to a confidence limit (confidence in the prediction)?

How can I improve performance with PLS_Toolbox and Matlab on the Mac platform?

How do I assign classes for samples in a DataSet?

How do I build a classification model from a class set other than the first?

How do I choose between the different cross-validation leave-out options?

How do I cite/reference Eigenvector?

How do I interpret the ROC curves and Sensitivity / Specificity plots from PLSDA?

How do I make a DataSet backwards compatible?

How do I obtain or use a recompilation license for PLS_Toolbox?

How do I use the "custom" cross-validation option?

I keep getting "out of memory" errors when analyzing my data. What can I do?

What can I do if I get a java.lang.OutOfMemoryError error?

Nonnegativity (PARAFAC, PARAFAC2, Tucker): Why do I get negative scores when all modes are set to nonnegativity?

What are "Relative Contributions"?

What are the "Reduced" T^2 and Q Statistics?

What are the units used for RMSEC and RMSECV when cross-validating PLSDA models? Why do the cross-validation curves look strange for PLSDA?

What do the four Fit/Unique Fit statistics mean in MCR and PARAFAC models?

What internal tests are used to select "suggested" number of PCs?

What is PLS1 vs PLS2 and how do I create separate PLS1 models when I have a multi-column y-block?

Command Line

Manual

GUI

Installation