Lreg
Purpose
Predictions based on a softmax multinomial logistic regression model.
Synopsis
- lreg - Launches an Analysis window with LREG as the selected method.
- [model] = lreg(x,options);
- [model] = lreg(x,y,options);
- [pred] = lreg(x,model,options);
- [valid] = lreg(x,y,model,options);
- [options] = lreg('options');
Please note that the recommended way to build and apply an LREG model from the command line is to use the Model Object. Please see this wiki page on building and applying models using the Model Object.
Description
Build an LREG model from input X-block and Y-block data. Alternatively, if a model is passed in, LREG makes a Y prediction for an input test X-block.
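The calling modes described above can be sketched as follows. This is a minimal illustration, assuming PLS_Toolbox is on the MATLAB path. The random data, the variable names (xcal, ycal, xtest, ytest), and the chosen option values are made up for the example; only the call forms listed in the Synopsis and option fields listed under Options are used.

```matlab
% Hypothetical calibration data: 60 samples, 5 variables, 3 classes.
rng(0);
xcal  = [randn(20,5); randn(20,5)+2; randn(20,5)-2];
ycal  = [ones(20,1); 2*ones(20,1); 3*ones(20,1)];   % sample class values

% Hypothetical test data: 10 samples with known classes.
xtest = [randn(5,5); randn(5,5)+2];
ytest = [ones(5,1); 2*ones(5,1)];

% Start from the default options and adjust documented fields.
options           = lreg('options');
options.display   = 'off';
options.plots     = 'none';
options.algorithm = 'ridge';   % L2 regularization
options.lambda    = 0.1;       % regularization parameter

% Build the model from the X- and Y-blocks.
model = lreg(xcal, ycal, options);

% Apply the model to a new X-block to obtain predictions.
pred = lreg(xtest, model, options);

% Apply the model to new data with known classes (validation form).
valid = lreg(xtest, ytest, model, options);
```

As noted in the Synopsis, the Model Object is the recommended route for command-line work; the direct calls above are shown only because they map one-to-one onto the listed call forms.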
LREG solves for the logistic regression model parameters using the minFunc software:
- M. Schmidt. minFunc: unconstrained differentiable multivariate optimization in Matlab. http://www.cs.ubc.ca/~schmidtm/Software/minFunc.html, 2005.
Inputs
- x = X-block (predictor block) class "double" or "dataset", containing numeric values,
- y = Y-block (optional) class "double" sample class values,
- model = previously generated model (when applying model to new data).
Outputs
- model = a standard model structure with the following fields (see Standard Model Structure):
- modeltype: 'LREG',
- datasource: structure array with information about input data,
- date: date of creation,
- time: time of creation,
- info: additional model information,
- pred: 2 element cell array with
- model predictions for each input block (when options.blockdetails='standard', x-block predictions are not saved and this will be an empty array)
- detail: sub-structure with additional model details and results, including:
- model.detail.lreg: Structure containing 'lreg' matrix of model coefficients.
- pred = a structure, similar to model, for the new data.
Algorithm
The 'algorithm' option selects logistic regression with no regularization ('none'), L2 regularization ('ridge'), L1 regularization ('lasso'), or equally weighted L1 and L2 regularization ('elasticnet').
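For reference, a standard way to write a regularized softmax objective is sketched below. This is the textbook form, not necessarily the exact objective or scaling that LREG passes to minFunc. The symbols (weights w_k, offsets b_k, N samples, K classes, penalty Ω) are notation introduced here, and the unit weighting of the two elastic net terms is an assumption based on the phrase "equally weighted".

```latex
% Class probability for sample x_i under the softmax model
P(y_i = k \mid x_i) = \frac{\exp\left(w_k^{T} x_i + b_k\right)}
                           {\sum_{j=1}^{K} \exp\left(w_j^{T} x_i + b_j\right)}

% Penalized negative log-likelihood minimized over W = [w_1, \dots, w_K] and b
J(W, b) = -\sum_{i=1}^{N} \log P(y_i \mid x_i) + \lambda \, \Omega(W)

% Penalty term for each value of the 'algorithm' option
\Omega(W) =
\begin{cases}
0                                    & \text{'none'} \\
\lVert W \rVert_2^2                  & \text{'ridge'} \\
\lVert W \rVert_1                    & \text{'lasso'} \\
\lVert W \rVert_1 + \lVert W \rVert_2^2 & \text{'elasticnet'}
\end{cases}
```

Here λ corresponds to the lambda option: larger values shrink the coefficients more strongly, and the L1 term tends to drive some coefficients exactly to zero.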
Cross-validation
Cross-validation can be applied to LREG when using either the LREG Analysis window or the command line. From the Analysis window, specify the cross-validation method in the usual way (clicking on the model icon's red check-mark, or the "Choose Cross-Validation" link in the flowchart). From the command line, use the cvmethod, cvsplits, and cvi options described under Options.
Options
options = a structure array with the following fields:
- display : [ 'off' |{'on'}] Governs display
- plots: [ {'none'} | 'final' ] governs plotting of results.
- blockdetails : [ {'standard'} | 'all' ] extent of detail included in model. 'standard' keeps only y-block, 'all' keeps both x- and y- blocks.
- waitbar : [ 'off' |{'auto'}| 'on' ] governs use of waitbar during analysis. 'auto' shows waitbar if delay will likely be longer than a reasonable waiting period.
- algorithm : [ {'ridge'} | 'none' | 'lasso' | 'elasticnet'] specify the LREG implementation to use:
- 'none' has no regularization,
- 'ridge' uses L2 regularization,
- 'lasso' uses L1 regularization, and
- 'elasticnet' uses equally weighted L1 and L2 regularization.
- maxIter : [400] Maximum number of iterations allowed in the minFunc optimization solver.
- lambda : [{0.1}] Regularization parameter
- preprocessing: {[] []} preprocessing structures for x and y blocks (see PREPROCESS).
- compression: [{'none'}| 'pca' | 'pls' ] type of data compression to perform on the x-block prior to calculating or applying the LREG model. 'pca' uses a simple PCA model to compress the information. 'pls' uses a PLS model. Compression can make the LREG model more stable and less prone to overfitting.
- compressncomp: [1] Number of latent variables (or principal components) to include in the compression model.
- compressmd: [{'yes'} | 'no'] Use Mahalanobis distance correction.
- cvmethod : [{'con'} | 'vet' | 'loo' | 'rnd'] CV method, OR [] for Kennard-Stone single split.
- cvsplits : [{5}] Number of CV subsets.
- cvi : M element vector with integer elements allowing user defined subsets. (cvi) is a vector with the same number of elements as x has rows i.e., length(cvi) = size(x,1). Each cvi(i) is defined as:
- cvi(i) = -2 the sample is always in the test set,
- cvi(i) = -1 the sample is always in the calibration set,
- cvi(i) = 0 the sample is never used, and
- cvi(i) = 1,2,3... defines each test subset.
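The cvi coding above can be illustrated with a small, self-contained sketch. The data size, the number of subsets, and the variable names (nsamples, test2) are hypothetical and chosen only for the example; no toolbox function is called.

```matlab
% Hypothetical data set with 12 samples (rows of x).
nsamples = 12;
x = rand(nsamples, 4);

% Assign samples to 3 test subsets in a repeating pattern: 1 2 3 1 2 3 ...
cvi = mod((0:nsamples-1)', 3) + 1;

cvi(1)   = -1;   % sample 1 is always in the calibration set
cvi(2)   = -2;   % sample 2 is always in the test set
cvi(end) =  0;   % last sample is never used

% The vector must have one element per row of x.
assert(length(cvi) == size(x,1));

% Samples in the test set when subset 2 is left out:
% those labeled 2, plus any sample labeled -2.
test2 = find(cvi == 2 | cvi == -2);
```

Assuming the vector is supplied through the cvi option listed above, it would then be assigned with options.cvi = cvi before calling lreg.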
Usage from LREG Analysis window
See the lregdademo.m function for a command-line usage example.