SVM Function Settings: Difference between revisions

From Eigenvector Research Documentation Wiki
Jump to navigation Jump to search
imported>Donal
Line 1: Line 1:
==Support Vector Machines ==
==Support Vector Machines ==


SVMs are non-linear models which can be used for regression or classification problems. The following settings can be access from the SVM Function Settings panel in the Analysis GUI or from the [[Working With Options|Options GUI]].  
SVMs are non-linear models which can be used for regression or classification problems. Settings can be adjusted from the in the Analysis GUI or from the [[Working With Options|Options GUI]]. For SVMDA, the probability estimates can be adjusted from the Function Settings panel.


<br>
<br>

Revision as of 21:51, 4 October 2023

Support Vector Machines

SVMs are non-linear models which can be used for regression or classification problems. Settings can be adjusted from the in the Analysis GUI or from the Options GUI. For SVMDA, the probability estimates can be adjusted from the Function Settings panel.


svm function settings

SVM Type

Classification (SVMDA)

  • Nu-SVM optimizes a model with an adjustable parameter Nu (0 -> 1] which indicates the upper bound on the number of misclassifications allowed.
  • C-SVC optimizes a model with an adjustable cost function C [0 -> inf] which indicates how strongly misclassifications should be penalized.

Regression (SVM)

(shown above)

  • Epsilon-SVR optimizes a model using the adjustable parameters epsilon (upper tolerance on prediction errors) and C (cost of prediction errors larger than epsilon.)
  • Nu-SVR optimizes a model using the adjustable parameter Nu (0 -> 1] which indicates a lower bound on the number of support vectors to use given as a fraction of total calibration samples.

Kernel Type

Two kernel types are supported, Linear and Gaussian Radial Basis Function (RBF). Linear kernel has no parameters but the RBF has a ‘gamma’ parameter (> 0). This parameter if available through the Options GUI.

Compression

X-block data may be compressed by applying PCA or PLS to the data to reduce the number of X-block variables. SVM is then applied to the resulting scores values. The number of latent variables (principal components) used in the compression model must be supplied. The compression model is included as part of the SVM model and will be automatically applied to new data.

Probability Estimates

For a given sample, estimates the per-class probability of the sample’s class membership. Results can be seen in the Scores Plot. This feature is only available for classification (SVMDA) type models.

Cross-validation

Performs n-fold cross validation. Data are randomly divided into n groups. Each group is excluded in turn and an svm trained on the remaining groups (which are separately preprocessed) and validated against the excluded group. Note that cross-validation results from this algorithm can be notably different from the cross-validation results from other model types and care should be taken in comparing results. It is strongly recommended that a separate validation set be used for comparison to other model types.