Corrspecgui and Svm: Difference between pages

From Eigenvector Research Documentation Wiki
(Difference between pages)
imported>Willem
No edit summary
 
imported>Benjamin
No edit summary
 
==Correlation Spectroscopy Interface==
===Purpose===
Corrspecgui is the interactive version of corrspec. The principles of correlation spectroscopy are shown in the corrspec demo and in the manual chapter 'Tutorial for the correlation spectroscopy function corrspec'. For in-depth information see: W. Windig, D.E. Margevich, W.P. McKenna, A novel tool for two-dimensional (2D) correlation spectroscopy, Chemometrics and Intelligent Laboratory Systems, 28, 1995, 108-128.


After starting the program, the files are read in through the file menu. The cursor is not displayed until the 'Move cursor to max' button, described below, is pressed. The setting 'Z Origin Equal Zero', also described below, is often turned on to simplify the dispersion matrix.
SVM Support Vector Machine (LIBSVM) for regression. Use SVMDA for SVM classification ([[Svmda]]). See also the [[Svmda]] page, which has more detailed information, much of which also applies to SVM for regression.


Below the buttons, the number of pure variables selected is displayed, followed by editable boxes that define the number of contour levels used to display the dispersion matrix and the offsets. The cursor position is displayed below the pure variables.
===Synopsis===


The buttons and menus of corrspecgui are described below. Placing the cursor on top of a button displays its name.
:model = svm(x,y,options);          %identifies model (calibration step).
''Toggle settings display'': turns the settings, displayed on the right, on or off.
:pred = svm(x,model,options);      %makes predictions with a new X-block
''Set pure variable'': the components defined by the pure variables, indicated by the cross-hair cursor, are eliminated.
:pred = svm(x,y,model,options);    %performs a "test" call with a new X-block and known y-values
''Reset last variable'': undoes the action of the previous 'Set pure variable'.
:svm % Launches an Analysis window with SVM as the selected method.
''Cursor'': enables manual change of cursor position, indicating the pure variables.
''Move cursor to max'': moves the cursor to the maximum in the matrix defined by the setting 'Max Algorithm'.
''Plot At Cursor'': plots the data at the cursor position in the following forms:
a) Cursor Variable Plots: a plot of the (pure) variable of the X data set and a plot of the (pure) variable of the Y data set, both as a function of axisscale{1}, plus a plot of both variables against each other as a visual tool to judge the correlation between them.
b) Cursor Spectrum Plots: a plot of the spectra in the displayed map.
''Inactivate'': Areas that should not be used for pure variables can be inactivated, after which these areas are excluded when the cursor is set to the maximum (by the program or by the 'Move cursor to max' button). After clicking one of these buttons, the mouse is used to indicate the area to be inactivated. The different inactivate options define whether only the x data, only the y data, or both x and y data are inactivated.
''Reactivate last selection'': Reactivation of the previous inactivated area.
''Plot map in 3D'': a 3D presentation of the map. Works best with the 'Plot Settings' 'False Color' on. The display can be rotated with the mouse. The button toggles between 3D on and off.
''Resolve spectra'': resolves the data using the pure variables. The resolved results are displayed in the following form:
a) X Spec, X Con, Y Spec, Y Con: the resolved spectra and contributions for both X and Y data.
b) X vs Y Con: the resolved X contributions versus the resolved Y contributions, as a tool to judge the correlations between the variables.
c) Maps: the resolved correlation maps.
d) Diagnostics: the original dispersion matrix, the dispersion matrix reconstructed from the resolved components, the sum of the maps shown under the tab 'Maps' and an overlay of these maps, each in a different color.
e) Current plot: clicking on any of the plots mentioned above will display it enlarged under 'Current plot'.


The settings, displayed at the right, have the following functions:
===Description===
''Load demo data'': loads the data used for the demo in corrspec.
 
''Plot type'': determines plots along the z and y axes of the correlation matrix.
The SVM function or analysis method performs calibration and application of Support Vector Machine (SVM) regression models. SVM models can be used for regression problems. The model consists of a number of support vectors (essentially samples selected from the calibration set) and non-linear model coefficients which define the non-linear mapping of variables in the input x-block. The model allows prediction of the continuous y-block variable. It is recommended that classification be done through the svmda function.
''Max Algorithm'': defines the matrix whose maximum determines the pure variables.
 
''Dispersion Algorithm'': defines the matrix displayed.
Svm is implemented using the LIBSVM package which provides both epsilon-support vector regression (epsilon-SVR) and nu-support vector regression (nu-SVR). Linear and Gaussian Radial Basis Function kernel types are supported by this function.
''Plot Settings'': determines different aspects of the plots.
 
a) Axis type: defines the continuous or discrete character of plots.
Note: Calling svm with no inputs starts the graphical user interface (GUI) for this analysis method.
b) False color: creates false coloring of the dispersion matrix.
 
c) Z Origin Equal Zero: only displays positive values in dispersion matrix.
====Inputs====
d) Grid: toggle for grid in dispersion matrix.
 
e) Set X/Y direction: plots the x/y axis regular or in reverse.
* '''x''' = X-block (predictor block) class "double" or "dataset", containing numeric values,
f) Color bar: inserts color-bar to indicate z values in plot.  
* '''y''' = Y-block (predicted block) class "double" or "dataset", containing numeric values,
g) Colormap: determines which colormap is used.
* '''model''' = previously generated model (when applying model to new data).
 
====Outputs====
 
* '''model''' = a standard model structure with the following fields (see [[Standard Model Structure]]):
** '''modeltype''': 'SVM',
** '''datasource''': structure array with information about input data,
** '''date''': date of creation,
** '''time''': time of creation,
** '''info''': additional model information,
** '''pred''': 2 element cell array with
*** model predictions for each input block (when options.blockdetails='standard' x-block predictions are not saved and this will be an empty array)
** '''detail''': sub-structure with additional model details and results, including:
*** model.detail.svm.model: Matlab version of the libsvm svm_model (Java). Note that the number of support vectors used is given by model.detail.svm.model.l. It is useful to check this value: if most of the calibration samples are used as support vectors, the model may be overfit; if there are no support vectors, the model could not be fit and all prediction values will equal a constant (a weighted mean).
*** model.detail.svm.cvscan: Results of CV parameter scan
*** model.detail.svm.svindices: Indices of X-block samples which are support vectors.
 
* '''pred''' = a structure, similar to '''model''', for the new data.
 
===Options===
''options'' =  a structure array with the following fields:
 
* '''display''': [ 'off' | {'on'} ], governs level of display to command window,
* '''plots''': [ 'none' | {'final'} ], governs level of plotting,
* '''preprocessing''': {[] []}  preprocessing structures for x and y blocks (see PREPROCESS).
* '''compression''': [{'none'}| 'pca' | 'pls' ] type of data compression to perform on the x-block prior to calculating or applying the SVM model. 'pca' uses a simple PCA model to compress the information. 'pls' uses either a pls or plsda model (depending on the svmtype). Compression can make the SVM more stable and less prone to overfitting.
* '''compressncomp''': [1]  Number of latent variables (or principal components) to include in the compression model.
* '''blockdetails''': [ {'standard'} | 'all' ], extent of predictions and residuals included in model, 'standard' = only y-block, 'all' x- and y-blocks.
* '''algorithm''': [ 'libsvm' ] algorithm to use. libsvm is default and currently only option.
* '''kerneltype''': [ 'linear' | {'rbf'} ], SVM kernel to use. 'rbf' is default.
* '''svmtype''': [ {'epsilon-svr'} | 'nu-svr' ] Type of SVM to apply. The default is 'epsilon-svr' for regression.
* '''probabilityestimates''': [0| {1} ], whether to train the SVR model for probability estimates, 0 or 1 (default 1).
 
* '''cvtimelimit''': Time limit (seconds) on each individual cross-validation sub-calculation when searching over supplied SVM parameter ranges for optimal parameters. Only relevant if parameter ranges are used for SVM parameters such as cost, epsilon, gamma or nu. Default is 10.
* '''splits''': Number of subsets to divide data into when applying n-fold cross validation. Default is 5. This option is only used when the "cvi" option is empty.
* '''cvi''': {{}} Standard cross-validation cell (see crossval) defining a split method, number of splits, and number of iterations. This cross-validation is used both for parameter optimization and for the error estimate on the final selected parameter values. If empty (the default), then random cross-validation is done based on the "splits" option.
 
* '''gamma''': Value(s) to use for LIBSVM kernel gamma parameter. Default is 15 values from 10^-6 to 10, spaced uniformly in log.
* '''cost''': Value(s) to use for LIBSVM 'c' parameter. Default is 11 values from 10^-3 to 100, spaced uniformly in log.
* '''epsilon''': Value(s) to use for LIBSVM 'p' parameter (epsilon in loss function). Default is the set of values [1.0, 0.1, 0.01].
* '''nu''': Value(s) to use for LIBSVM 'n' parameter (nu of nu-SVC, and nu-SVR). Default is the set of values [0.2, 0.5, 0.8].
 
===Algorithm===
Svm calls the LIBSVM implementation with the user-specified values for the LIBSVM parameters (see ''options'' above). See [http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf] for further details of these options.
 
The default SVM parameters cost, epsilon, nu and gamma have value ranges rather than single values. This svm function uses a search over the grid of appropriate parameters using cross-validation to select the optimal SVM parameter values and builds an SVM model using those values. This is the recommended usage. The user can avoid this grid-search by passing in single values for these parameters, however. If you are using the command line SVM function to build a model then the optimal SVM parameters are shown in model.detail.svm.cvscan.best. If you are using the graphical Analysis SVM then the optimal parameters are reported in the summary window which is shown when you mouse-over the model icon, once the model is built.
 
====Model building performance====
Building a single SVM model can sometimes be slow, especially if the calibration dataset is large. Using ranges for the SVM parameters to search for the optimal parameter combination increases the final model building time significantly. If cross-validation is used, the calculation time increases again, possibly dramatically if the number of CV subsets is large. Some suggestions for faster SVM building include:
:1) Turning CV off ("none") during preliminary analyses. This is much faster, and cross-validation is still performed for the parameter search using a default "Random Subsets" with 5 data splits and 1 iteration,
:2) Using a coarse grid of SVM parameter values to search over for optimal values,
:3) Choosing the CV method carefully, at least initially. For example, use "Random Subsets" with a small number of data splits (e.g. 5) and a small "Number of Iterations" (e.g. 1).
:4) Using the compression option if the number of variables is large.
 
====epsilon-SVR and nu-SVR====
There are two commonly used versions of SVM regression, 'epsilon-SVR' and 'nu-SVR'. The original SVM formulation for regression (SVR) used parameters C (0, inf) and epsilon [0, inf) to apply a penalty to the optimization for points which were not correctly predicted. An alternative version of SVM regression was later developed where the epsilon penalty parameter was replaced by an alternative parameter, nu (0, 1], which applies a slightly different penalty. The main motivation for the nu version of SVM is that it has a more meaningful interpretation: nu represents an upper bound on the fraction of training samples which are errors (badly predicted) and a lower bound on the fraction of samples which are support vectors. Some users feel nu is more intuitive to use than C or epsilon.
Epsilon and nu are just different versions of the penalty parameter. The same optimization problem is solved in either case. Thus it should not matter which form of SVM you use, epsilon or nu. PLS_Toolbox uses epsilon by default since this was the original formulation and is the most commonly used form. For more details on 'nu' SVM regression see [http://www.csie.ntu.edu.tw/~cjlin/papers/newsvr.pdf]
 
The user must provide parameters (or parameter ranges) for SVM regression as:
:*'epsilon-SVR':
::'''epsilon''', '''C''' (using linear kernel), or
::'''epsilon''', '''C''', '''gamma''' (using radial basis function kernel),
 
:*'nu-SVR':
::'''nu''', '''C''' (using linear kernel), or
::'''nu''', '''C''', '''gamma''' (using radial basis function kernel).
 
====SVM Parameters====
 
* '''cost''': Cost (0 -> inf) represents the penalty associated with errors larger than epsilon. Increasing the cost value causes closer fitting to the calibration/training data.
* '''gamma''': The kernel ''gamma'' parameter controls the shape of the separating hyperplane. Increasing gamma usually increases the number of support vectors.
* '''epsilon''': In training the regression function there is no penalty associated with points which are predicted within distance epsilon from the actual value. Decreasing epsilon forces closer fitting to the calibration/training data.
* '''nu''': Nu (0 -> 1] indicates a lower bound on the number of support vectors to use, given as a fraction of total calibration samples, and an upper bound on the fraction of training samples which are errors (poorly predicted).
 
===See Also===
 
[[analysis]], [[ann]], [[mlr]], [[lwr]], [[pls]], [[pcr]], [[svmda]], [[preprocess]]

Revision as of 10:32, 12 January 2018

Purpose

SVM Support Vector Machine (LIBSVM) for regression. Use SVMDA for SVM classification (Svmda). See also the Svmda page, which has more detailed information, much of which also applies to SVM for regression.

Synopsis

model = svm(x,y,options); %identifies model (calibration step).
pred = svm(x,model,options); %makes predictions with a new X-block
pred = svm(x,y,model,options); %performs a "test" call with a new X-block and known y-values
svm % Launches an Analysis window with SVM as the selected method.
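
The three command-line call forms above can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic, the variable names are hypothetical, and it is assumed (not confirmed on this page) that a partial options structure is accepted with missing fields taking their defaults. The calls are guarded so the snippet does nothing when svm is not on the path.

```matlab
% Synthetic, illustrative data (not taken from any toolbox demo).
xcal  = linspace(0, 1, 30)';
ycal  = sin(2*pi*xcal);
xtest = linspace(0.05, 0.95, 10)';
ytest = sin(2*pi*xtest);

% Assumption: omitted option fields fall back to their defaults.
options = struct('display', 'off', 'plots', 'none');

if ~isempty(which('svm'))                        % run only if svm is available
    model = svm(xcal, ycal, options);            % calibration step
    pred  = svm(xtest, model, options);          % prediction, y unknown
    tst   = svm(xtest, ytest, model, options);   % "test" call, y known
end
```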

Description

The SVM function or analysis method performs calibration and application of Support Vector Machine (SVM) regression models. SVM models can be used for regression problems. The model consists of a number of support vectors (essentially samples selected from the calibration set) and non-linear model coefficients which define the non-linear mapping of variables in the input x-block. The model allows prediction of the continuous y-block variable. It is recommended that classification be done through the svmda function.

Svm is implemented using the LIBSVM package which provides both epsilon-support vector regression (epsilon-SVR) and nu-support vector regression (nu-SVR). Linear and Gaussian Radial Basis Function kernel types are supported by this function.

Note: Calling svm with no inputs starts the graphical user interface (GUI) for this analysis method.

Inputs

  • x = X-block (predictor block) class "double" or "dataset", containing numeric values,
  • y = Y-block (predicted block) class "double" or "dataset", containing numeric values,
  • model = previously generated model (when applying model to new data).

Outputs

  • model = a standard model structure with the following fields (see Standard Model Structure):
    • modeltype: 'SVM',
    • datasource: structure array with information about input data,
    • date: date of creation,
    • time: time of creation,
    • info: additional model information,
    • pred: 2 element cell array with
      • model predictions for each input block (when options.blockdetails='standard' x-block predictions are not saved and this will be an empty array)
    • detail: sub-structure with additional model details and results, including:
      • model.detail.svm.model: Matlab version of the libsvm svm_model (Java). Note that the number of support vectors used is given by model.detail.svm.model.l. It is useful to check this value: if most of the calibration samples are used as support vectors, the model may be overfit; if there are no support vectors, the model could not be fit and all prediction values will equal a constant (a weighted mean).
      • model.detail.svm.cvscan: Results of CV parameter scan
      • model.detail.svm.svindices: Indices of X-block samples which are support vectors.
  • pred = a structure, similar to model, for the new data.
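
The support vector check described above might look like the sketch below. The model here is a mocked structure containing only the field path named on this page, and the sample count and the 0.8 threshold are illustrative assumptions, not toolbox rules.

```matlab
% Stand-in for a calibrated model: only the documented field path is mocked.
model = struct();
model.detail.svm.model.l = 27;     % hypothetical number of support vectors
nCal = 30;                         % hypothetical number of calibration samples

nSV    = model.detail.svm.model.l;
svFrac = nSV / nCal;               % fraction of samples used as support vectors

% The 0.8 threshold is an arbitrary illustration, not a toolbox setting.
if nSV == 0
    status = 'no support vectors: predictions collapse to a constant';
elseif svFrac > 0.8
    status = 'most samples are support vectors: possible overfitting';
else
    status = 'support vector count looks reasonable';
end
```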

Options

options = a structure array with the following fields:

  • display: [ 'off' | {'on'} ], governs level of display to command window,
  • plots: [ 'none' | {'final'} ], governs level of plotting,
  • preprocessing: {[] []} preprocessing structures for x and y blocks (see PREPROCESS).
  • compression: [{'none'}| 'pca' | 'pls' ] type of data compression to perform on the x-block prior to calculating or applying the SVM model. 'pca' uses a simple PCA model to compress the information. 'pls' uses either a pls or plsda model (depending on the svmtype). Compression can make the SVM more stable and less prone to overfitting.
  • compressncomp: [1] Number of latent variables (or principal components) to include in the compression model.
  • blockdetails: [ {'standard'} | 'all' ], extent of predictions and residuals included in model, 'standard' = only y-block, 'all' x- and y-blocks.
  • algorithm: [ 'libsvm' ] algorithm to use. libsvm is default and currently only option.
  • kerneltype: [ 'linear' | {'rbf'} ], SVM kernel to use. 'rbf' is default.
  • svmtype: [ {'epsilon-svr'} | 'nu-svr' ] Type of SVM to apply. The default is 'epsilon-svr' for regression.
  • probabilityestimates: [0| {1} ], whether to train the SVR model for probability estimates, 0 or 1 (default 1).
  • cvtimelimit: Time limit (seconds) on each individual cross-validation sub-calculation when searching over supplied SVM parameter ranges for optimal parameters. Only relevant if parameter ranges are used for SVM parameters such as cost, epsilon, gamma or nu. Default is 10.
  • splits: Number of subsets to divide data into when applying n-fold cross validation. Default is 5. This option is only used when the "cvi" option is empty.
  • cvi: {{}} Standard cross-validation cell (see crossval) defining a split method, number of splits, and number of iterations. This cross-validation is used both for parameter optimization and for the error estimate on the final selected parameter values. If empty (the default), then random cross-validation is done based on the "splits" option.
  • gamma: Value(s) to use for LIBSVM kernel gamma parameter. Default is 15 values from 10^-6 to 10, spaced uniformly in log.
  • cost: Value(s) to use for LIBSVM 'c' parameter. Default is 11 values from 10^-3 to 100, spaced uniformly in log.
  • epsilon: Value(s) to use for LIBSVM 'p' parameter (epsilon in loss function). Default is the set of values [1.0, 0.1, 0.01].
  • nu: Value(s) to use for LIBSVM 'n' parameter (nu of nu-SVC, and nu-SVR). Default is the set of values [0.2, 0.5, 0.8].
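
To get a feel for the size of the default search, the default grids listed above can be reproduced with base MATLAB. This is only a sketch: whether svm builds its defaults with logspace internally is an assumption, and the fit count assumes one model fit per cross-validation split.

```matlab
% Default parameter grids as described in the options list.
gammaGrid   = logspace(-6, 1, 15);   % 15 values from 10^-6 to 10
costGrid    = logspace(-3, 2, 11);   % 11 values from 10^-3 to 100
epsilonGrid = [1.0 0.1 0.01];
nuGrid      = [0.2 0.5 0.8];

% Parameter combinations searched for epsilon-SVR.
nCombosRbf    = numel(gammaGrid) * numel(costGrid) * numel(epsilonGrid);  % 495
nCombosLinear = numel(costGrid) * numel(epsilonGrid);                     % 33

% Assuming one fit per split, the default 5 splits multiply the work by 5.
nSplits  = 5;
nFitsRbf = nCombosRbf * nSplits;                                          % 2475
```

The linear kernel has no gamma parameter, which is why its default grid is fifteen times smaller than the RBF grid.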

Algorithm

Svm calls the LIBSVM implementation with the user-specified values for the LIBSVM parameters (see options above). See http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf for further details of these options.

The default SVM parameters cost, epsilon, nu and gamma have value ranges rather than single values. This svm function uses a search over the grid of appropriate parameters using cross-validation to select the optimal SVM parameter values and builds an SVM model using those values. This is the recommended usage. The user can avoid this grid-search by passing in single values for these parameters, however. If you are using the command line SVM function to build a model then the optimal SVM parameters are shown in model.detail.svm.cvscan.best. If you are using the graphical Analysis SVM then the optimal parameters are reported in the summary window which is shown when you mouse-over the model icon, once the model is built.
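
The idea of a cross-validated grid search can be illustrated in base MATLAB. The sketch below deliberately uses RBF kernel ridge regression as a stand-in learner, not the LIBSVM solver, so that it runs without the toolbox; the data, the fold assignment, the grids, and all variable names are hypothetical. Only the search pattern (fit on each grid point, score by cross-validation, keep the minimum) corresponds to the description above.

```matlab
% Grid search with k-fold cross-validation, using RBF kernel ridge
% regression as a stand-in learner (NOT the LIBSVM solver).
x = linspace(0, 1, 40)';
y = sin(2*pi*x);
n = numel(y);
k = 5;
foldId = mod((0:n-1)', k) + 1;           % deterministic interleaved folds

gammaGrid = logspace(-2, 2, 5);
costGrid  = logspace(-1, 3, 5);

% RBF kernel matrix between the rows of A and the rows of B.
rbf = @(A, B, g) exp(-g * (sum(A.^2, 2) + sum(B.^2, 2)' - 2*(A*B')));

cvErr = zeros(numel(gammaGrid), numel(costGrid));
for i = 1:numel(gammaGrid)
    for j = 1:numel(costGrid)
        sse = 0;
        for f = 1:k
            tr = foldId ~= f;            % training samples for this fold
            te = ~tr;                    % held-out samples
            K  = rbf(x(tr,:), x(tr,:), gammaGrid(i));
            % The ridge term 1/cost plays the role of the cost penalty here.
            alpha = (K + eye(sum(tr)) / costGrid(j)) \ y(tr);
            pred  = rbf(x(te,:), x(tr,:), gammaGrid(i)) * alpha;
            sse   = sse + sum((y(te) - pred).^2);
        end
        cvErr(i, j) = sqrt(sse / n);     % RMSECV for this grid point
    end
end

% Keep the grid point with the lowest cross-validated error.
[bestErr, idx] = min(cvErr(:));
[iBest, jBest] = ind2sub(size(cvErr), idx);
best = struct('gamma', gammaGrid(iBest), 'cost', costGrid(jBest), ...
              'rmsecv', bestErr);
```

Passing single values instead of ranges reduces both loops to one iteration, which is why fixing the parameters avoids the search cost.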

Model building performance

Building a single SVM model can sometimes be slow, especially if the calibration dataset is large. Using ranges for the SVM parameters to search for the optimal parameter combination increases the final model building time significantly. If cross-validation is used, the calculation time increases again, possibly dramatically if the number of CV subsets is large. Some suggestions for faster SVM building include:

1) Turning CV off ("none") during preliminary analyses. This is much faster, and cross-validation is still performed for the parameter search using a default "Random Subsets" with 5 data splits and 1 iteration,
2) Using a coarse grid of SVM parameter values to search over for optimal values,
3) Choosing the CV method carefully, at least initially. For example, use "Random Subsets" with a small number of data splits (e.g. 5) and a small "Number of Iterations" (e.g. 1).
4) Using the compression option if the number of variables is large.
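
Suggestions 2 and 4 could be expressed at the command line roughly as follows. The grid values and the number of components are illustrative choices only, and it is assumed that the resulting structure would be passed as the options input to svm with any omitted fields taking their defaults.

```matlab
% Hypothetical coarse settings for a preliminary run (values are illustrative).
options = struct();
options.gamma   = logspace(-4, 0, 3);    % 1e-4, 1e-2, 1
options.cost    = logspace(-1, 2, 4);    % 0.1, 1, 10, 100
options.epsilon = 0.1;                   % single value, so not searched
options.splits  = 5;                     % small number of random CV subsets
options.compression   = 'pca';           % compress a wide x-block first
options.compressncomp = 10;              % illustrative number of components

% Grid points to evaluate, compared with 495 for the default RBF grids.
nCombos = numel(options.gamma) * numel(options.cost) * numel(options.epsilon);  % 12
```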

epsilon-SVR and nu-SVR

There are two commonly used versions of SVM regression, 'epsilon-SVR' and 'nu-SVR'. The original SVM formulation for regression (SVR) used parameters C (0, inf) and epsilon [0, inf) to apply a penalty to the optimization for points which were not correctly predicted. An alternative version of SVM regression was later developed where the epsilon penalty parameter was replaced by an alternative parameter, nu (0, 1], which applies a slightly different penalty. The main motivation for the nu version of SVM is that it has a more meaningful interpretation: nu represents an upper bound on the fraction of training samples which are errors (badly predicted) and a lower bound on the fraction of samples which are support vectors. Some users feel nu is more intuitive to use than C or epsilon. Epsilon and nu are just different versions of the penalty parameter. The same optimization problem is solved in either case. Thus it should not matter which form of SVM you use, epsilon or nu. PLS_Toolbox uses epsilon by default since this was the original formulation and is the most commonly used form. For more details on 'nu' SVM regression see http://www.csie.ntu.edu.tw/~cjlin/papers/newsvr.pdf

The user must provide parameters (or parameter ranges) for SVM regression as:

  • 'epsilon-SVR':
      epsilon, C (using linear kernel), or
      epsilon, C, gamma (using radial basis function kernel),
  • 'nu-SVR':
      nu, C (using linear kernel), or
      nu, C, gamma (using radial basis function kernel).
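
For reference, the two formulations are commonly written as the primal problems below, following the notation of the LIBSVM paper linked above. The symbols w, b, \phi, \xi_i, \xi_i^* and l (the number of training samples) are not defined on this page and are introduced here only for illustration; y_i denotes the y-block value of sample i.

```latex
% epsilon-SVR: epsilon is fixed by the user
\min_{w,\,b,\,\xi,\,\xi^*} \quad
  \tfrac{1}{2}\, w^{T} w + C \sum_{i=1}^{l} \left( \xi_i + \xi_i^* \right)
\quad \text{s.t.} \quad
\begin{cases}
  w^{T}\phi(x_i) + b - y_i \le \epsilon + \xi_i, \\
  y_i - w^{T}\phi(x_i) - b \le \epsilon + \xi_i^*, \\
  \xi_i,\ \xi_i^* \ge 0, \quad i = 1, \dots, l .
\end{cases}

% nu-SVR: epsilon becomes a variable of the optimization, weighted by nu
\min_{w,\,b,\,\xi,\,\xi^*,\,\epsilon} \quad
  \tfrac{1}{2}\, w^{T} w
  + C \left( \nu\,\epsilon + \frac{1}{l} \sum_{i=1}^{l} \left( \xi_i + \xi_i^* \right) \right)
\quad \text{s.t.} \quad
\begin{cases}
  w^{T}\phi(x_i) + b - y_i \le \epsilon + \xi_i, \\
  y_i - w^{T}\phi(x_i) - b \le \epsilon + \xi_i^*, \\
  \xi_i,\ \xi_i^* \ge 0, \quad \epsilon \ge 0 .
\end{cases}
```

In the first problem the user fixes the tube width epsilon directly; in the second, nu controls it indirectly, which is why only one of the two parameters is supplied for a given svmtype.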

SVM Parameters

  • cost: Cost (0 -> inf) represents the penalty associated with errors larger than epsilon. Increasing the cost value causes closer fitting to the calibration/training data.
  • gamma: The kernel gamma parameter controls the shape of the separating hyperplane. Increasing gamma usually increases the number of support vectors.
  • epsilon: In training the regression function there is no penalty associated with points which are predicted within distance epsilon from the actual value. Decreasing epsilon forces closer fitting to the calibration/training data.
  • nu: Nu (0 -> 1] indicates a lower bound on the number of support vectors to use, given as a fraction of total calibration samples, and an upper bound on the fraction of training samples which are errors (poorly predicted).
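
The roles of epsilon and gamma can be made concrete with two small base MATLAB functions. This is a sketch of the standard epsilon-insensitive loss and of the RBF kernel in the LIBSVM convention exp(-gamma*||u-v||^2); the function names and the sample values are hypothetical.

```matlab
% Epsilon-insensitive loss: zero inside the tube |y - yhat| <= epsilon,
% growing linearly outside it.
epsLoss = @(y, yhat, epsilon) max(0, abs(y - yhat) - epsilon);

% Gaussian RBF kernel in the LIBSVM convention: exp(-gamma*||u - v||^2).
rbfKernel = @(u, v, gamma) exp(-gamma * sum((u - v).^2));

y    = [1.0; 2.0; 3.0];
yhat = [1.05; 2.5; 2.0];
lossWide   = epsLoss(y, yhat, 0.1);    % first sample falls inside the tube
lossNarrow = epsLoss(y, yhat, 0.01);   % smaller epsilon penalizes all three

u = [0 0];
v = [1 1];                             % squared distance between u and v is 2
kSmallGamma = rbfKernel(u, v, 0.1);    % exp(-0.2): the samples still "see" each other
kLargeGamma = rbfKernel(u, v, 10);     % exp(-20): the samples are effectively unrelated
```

With a large gamma each sample influences only its immediate neighbourhood, which is consistent with the note above that increasing gamma usually increases the number of support vectors.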

See Also

analysis, ann, mlr, lwr, pls, pcr, svmda, preprocess