Preprocessiterator and Faq more info on R Squared statistic: Difference between pages

From Eigenvector Research Documentation Wiki
(Difference between pages)
imported>Scott
imported>Lyle
===Purpose===
===Issue:===


Creates an array of preprocessing combinations for use with [[modeloptimizer]].
Can you give me more information on the R-Squared statistic?


===Synopsis===
===Possible Solutions:===


: pplist = preprocessiterator(inpp); % Shows GUI for iterator settings.
R-Squared (R<sup>2</sup>) is an assessment of how well the model's predictions agree with the measured values. It is similar to RMSEC, except that it does not reveal whether there is a bias.
: pplist = preprocessiterator(inpp,imatrix); % Command line call.


===Description===
You can access the R<sup>2</sup> by right-clicking on a scores plot of predicted vs. measured. It is one of the items shown in the information box; selecting "Show on figure" places it on the figure.


For a given input preprocessing structure (inpp), creates combinations of preprocessing based on PP methods whose parameters can be iterated over using simple min/step/max values. If the iteration matrix (imatrix) is not provided, a window appears that allows the user to specify the iterations. Some of the methods are discussed in [[Advanced_Preprocessing | Advanced Preprocessing]].
Note: In other software, R<sup>2</sup> is for the MODELED data only. In PLS_Toolbox we calculate it for the DISPLAYED data. That means that if you show excluded data, or if you show predicted/test data together with calibration data ("Show Cal with Test"), the R<sup>2</sup> is for what is shown and will differ from the value for the calibration data alone. Turn off the "Show Cal with Test" checkbox on the Plot Controls window to view the R<sup>2</sup> for only the test data.


Supported Preprocessing Methods:
R<sup>2</sup> is calculated as the square of the correlation coefficient between the values plotted on the X and Y axes of the figure. If the only data shown are the estimated calibration Y values vs. the actual calibration Y values, this is nearly the same as the standard R<sup>2</sup> for a model as defined by, e.g., Martens and Naes.
# Derivative ([[Savgol]])
# [[Normaliz | Normalize]]
# [[glsw | GLS Weighting]]
# [[glsw | EPO Filter]]
# [[wlsbaseline | Baseline (Automatic Whittaker Filter)]]
# [[baseline | Detrend]]
# [[Gapsegment | Gap Segment Derivative]]
# [[Auto | Autoscale]]
# [[poissonscale | Poisson (Sqrt Mean) Scaling]]




Iterator matrix (imatrix) example. The matrix is an n x 9 cell array with the following columns:
'''Still having problems? Please contact our helpdesk at [mailto:helpdesk@eigenvector.com helpdesk@eigenvector.com]'''


# Relative Index - Relative index of given method.
[[Category:FAQ]]
# Preprocess Name - Name of preprocess method.
# Parameter Name - Name of .userdata parameter.
# Parameter Variable - Name of .userdata field.
# Data Type - Allowed values for Min and Max.
# Min - First value.
# Step - Size of interval of each step.
# Max - Last value.
# Use Log - Use a log scale to create values.
 
 
<pre>
inpp = preprocess('default','mean center','derivative','normalize','mean center','sqmnsc','normalize','log10','whittaker');

% Columns: relative index, preprocess name, parameter name, parameter variable,
%          data type, min, step, max, use log.
imatrix = { 1 'derivative'    'Width'      'width'    'int(1:inf)'   1 1 1 0;...
            1 'derivative'    'Derivative' 'deriv'    'int(1:inf)'   1 1 1 0;...
            1 'derivative'    'Order'      'order'    'int(1:inf)'   1 1 1 0;...
            2 'Normalize'     'Norm Type'  'normtype' 'int(1:inf)'   1 2 2 0;...
            1 'GLS Weighting' 'Alpha'      'a'        'float(0:inf)' 1 1 1 1};

pplist = preprocessiterator(inpp,imatrix)</pre>
 
NOTE: The Normalize row in the example above has a Relative Index of 2. If the original preprocess structure contains 2 Normalize steps, the second Normalize will be iterated over.
 
===See Also===
 
[[preprocess]], [[preprouser]]

Latest revision as of 13:23, 2 January 2019

Issue:

Can you give me more information on the R-Squared statistic?

Possible Solutions:

R-Squared (R<sup>2</sup>) is an assessment of how well the model's predictions agree with the measured values. It is similar to RMSEC, except that it does not reveal whether there is a bias.

You can access the R<sup>2</sup> by right-clicking on a scores plot of predicted vs. measured. It is one of the items shown in the information box; selecting "Show on figure" places it on the figure.

Note: In other software, R<sup>2</sup> is for the MODELED data only. In PLS_Toolbox we calculate it for the DISPLAYED data. That means that if you show excluded data, or if you show predicted/test data together with calibration data ("Show Cal with Test"), the R<sup>2</sup> is for what is shown and will differ from the value for the calibration data alone. Turn off the "Show Cal with Test" checkbox on the Plot Controls window to view the R<sup>2</sup> for only the test data.

R<sup>2</sup> is calculated as the square of the correlation coefficient between the values plotted on the X and Y axes of the figure. If the only data shown are the estimated calibration Y values vs. the actual calibration Y values, this is nearly the same as the standard R<sup>2</sup> for a model as defined by, e.g., Martens and Naes.
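
The calculation described above can be sketched in a few lines of base MATLAB. This is only an illustration under stated assumptions: the data values are made up, the variable names are hypothetical, and the code uses the standard corrcoef function rather than any PLS_Toolbox plotting routine. It also shows why a correlation-based R-Squared does not reveal a bias: adding a constant offset to the predictions leaves it unchanged, while the root mean square error grows.

```matlab
% Made-up example data (assumption): measured and predicted Y values.
ymeas = [1.0 2.0 3.0 4.0 5.0]';
ypred = [1.1 1.9 3.2 3.9 5.1]';

% R-Squared as the squared correlation coefficient of the two plotted axes.
c  = corrcoef(ymeas, ypred);   % 2x2 correlation matrix
r2 = c(1,2)^2;

% Root mean square error of the same points, for comparison.
rmse = sqrt(mean((ypred - ymeas).^2));

% Add a constant bias to the predictions: the correlation-based
% R-Squared is unchanged, while the RMSE increases.
ybias     = ypred + 2;
cb        = corrcoef(ymeas, ybias);
r2_bias   = cb(1,2)^2;
rmse_bias = sqrt(mean((ybias - ymeas).^2));

fprintf('R2 = %.4f, RMSE = %.4f\n', r2, rmse);
fprintf('With bias: R2 = %.4f, RMSE = %.4f\n', r2_bias, rmse_bias);
```

Running the same calculation on a different set of points (for example, calibration and test samples together rather than calibration samples alone) will generally give a different value, which is the reason the displayed R-Squared depends on what is shown in the figure.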


Still having problems? Please contact our helpdesk at helpdesk@eigenvector.com