Ann: Difference between revisions
Revision as of 14:18, 13 June 2014
Purpose
Predictions based on Artificial Neural Network (ANN) regression models.
Synopsis
- [model] = ann(x,y,options);
- [model] = ann(x,y, nhid, options);
- [pred] = ann(x,model,options);
- [valid] = ann(x,y,model,options);
Description
Build an ANN model from input X- and Y-block data using the specified number of layers and layer nodes. Alternatively, if a model is passed in, ANN makes a Y prediction for an input test X-block. The ANN model contains quantities (weights, etc.) calculated from the calibration data. When a model structure is passed in to ANN, these weights do not need to be recalculated.
There are two implementations of ANN available referred to as 'BPN' and 'Encog'.
- BPN is a feedforward ANN using backpropagation training and is implemented in Matlab.
- Encog is a feedforward ANN using Resilient Backpropagation training. See Rprop for further details.
Encog is implemented using the Encog framework provided by Heaton Research, Inc., under the Apache 2.0 license. Further details of Encog Neural Network features are available in the Encog documentation. BPN is the default ANN implementation, but the user can set the option 'algorithm' = 'encog' to use Encog instead. Both implementations should give similar results, but one may be faster than the other depending on the dataset. BPN is currently the only implementation that calculates RMSECV.
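A minimal sketch of calibrating and then applying a model with the Encog implementation, following the calling forms in the Synopsis. The data are made up for illustration, and passing a partial options structure (with the remaining fields falling back to their defaults) is an assumption rather than documented behavior. The calls to ann are skipped when the function is not on the path.

```matlab
% Made-up calibration and test data, for illustration only.
xcal  = rand(40, 5);                 % 40 calibration samples, 5 variables
ycal  = sum(xcal.^2, 2);             % nonlinear response to model
xtest = rand(10, 5);                 % 10 new samples to predict

% Options named on this page. Assumption: fields left out keep their defaults.
options = struct();
options.algorithm = 'encog';         % default is 'bpn'
options.display   = 'off';
options.plots     = 'none';

if exist('ann', 'file')              % only run when ann is on the path
    model = ann(xcal, ycal, 3, options);   % calibrate with 3 hidden nodes
    pred  = ann(xtest, model, options);    % apply the model to new data
end
```

Switching options.algorithm back to 'bpn' leaves the rest of the sketch unchanged, which makes it easy to compare the run time of the two implementations on a given dataset.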
Inputs
- x = X-block (predictor block) class "double" or "dataset", containing numeric values,
- y = Y-block (predicted block) class "double" or "dataset", containing numeric values,
- nhid = number of nodes in a single hidden layer ANN, or a vector of two numbers giving the number of nodes in each hidden layer of a two hidden layer ANN (this takes precedence over options nhid1 and nhid2),
- model = previously generated model (when applying model to new data).
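The relationship between nhid and the options nhid1 and nhid2 can be sketched as below. This is only an illustration of the precedence rule stated above, not the function's internal code. Treating a scalar nhid as "no second hidden layer" is an assumption, based on the default nhid2 = 0.

```matlab
% Illustration only: how an nhid input would override options.nhid1/nhid2.
opts = struct('nhid1', 2, 'nhid2', 0);   % defaults listed under Options
nhid = [4 2];                            % two hidden layers: 4 nodes, then 2

opts.nhid1 = nhid(1);                    % nhid takes precedence over nhid1
if numel(nhid) > 1
    opts.nhid2 = nhid(2);                % second hidden layer size
else
    opts.nhid2 = 0;                      % assumption: scalar nhid means one hidden layer
end
```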
Outputs
- model = a standard model structure with the following fields (see Standard Model Structure):
- modeltype: 'ANN',
- datasource: structure array with information about input data,
- date: date of creation,
- time: time of creation,
- info: additional model information,
- pred: 2 element cell array with
- model predictions for each input block (when options.blockdetails='standard', x-block predictions are not saved and this will be an empty array)
- detail: sub-structure with additional model details and results, including:
- model.detail.ann.W: Structure containing details of the ANN, including the ANN type, number of hidden layers and the weights.
- pred = a structure, similar to model, for the new data.
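A short sketch of reading results out of the returned structure, using the field names listed above. The data are made up, and reading the y-block predictions from the second cell of pred is an assumption inferred from the description of the 2 element cell array. The toolbox calls are skipped when ann is not on the path.

```matlab
% Made-up data, for illustration only.
x = rand(30, 4);
y = x(:, 1) .* x(:, 2);

if exist('ann', 'file')              % only run when ann is on the path
    model = ann(x, y, 2);            % calibrate with 2 hidden nodes
    disp(model.modeltype);           % 'ANN'
    yhat = model.pred{2};            % assumption: second cell holds y-block predictions
    W    = model.detail.ann.W;       % ANN type, hidden layers and weights
end
```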
Training Termination
The ANN is trained on a calibration dataset to minimize the prediction error, RMSEC. It is important not to overtrain, however, so some criteria for ending training are needed.
BPN determines the optimal number of learning iteration cycles by selecting the minimum RMSEP for a test subset over a range of learning iterations.
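The selection rule for BPN can be illustrated with a few lines of MATLAB. The RMSEP values here are invented to show a typical curve that falls and then rises as the network starts to overtrain. This is a sketch of the idea, not the toolbox's code.

```matlab
% Made-up RMSEP of a test subset after 1..8 learning cycles.
rmsep = [0.92 0.61 0.44 0.37 0.35 0.36 0.40 0.47];

% Keep the number of cycles giving the lowest test-set error.
[minrmsep, bestcycle] = min(rmsep);
fprintf('Best number of learning cycles: %d (RMSEP = %.2f)\n', bestcycle, minrmsep);
```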
Encog training terminates whenever a) RMSE becomes smaller than the option 'terminalrmse' value, or b) the rate of improvement of RMSE per 100 training iterations becomes smaller than the option 'terminalrmserate' value, or c) training time exceeds the option 'maxseconds' value (though results are not optimal if training is stopped prematurely by this time limit). Note that these RMSE values refer to the internal preprocessed and scaled y values.
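The three Encog stopping rules can be sketched as a loop. The RMSE curve below is a stand-in for real training, and measuring the rate of improvement as the drop in RMSE over the last 100 iterations is an assumption about how that quantity is computed. The thresholds are the default option values listed on this page.

```matlab
% Default option values from this page.
terminalrmse     = 0.05;   % stop when RMSE (of scaled y) falls below this
terminalrmserate = 1e-9;   % stop when the improvement per 100 iterations falls below this
maxseconds       = 20;     % stop when training time exceeds this

% Stand-in for the RMSE after k training iterations (made up for illustration).
rmsefun = @(k) 0.04 + 0.5 * exp(-k / 300);

tstart = tic;
k      = 0;
reason = '';
while isempty(reason)
    k    = k + 1;
    rmse = rmsefun(k);
    if rmse < terminalrmse
        reason = 'terminalrmse';
    elseif k > 100 && (rmsefun(k - 100) - rmse) < terminalrmserate
        reason = 'terminalrmserate';   % assumed definition of the rate of improvement
    elseif toc(tstart) > maxseconds
        reason = 'maxseconds';
    end
end
fprintf('Stopped after %d iterations by %s\n', k, reason);
```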
Options
options = a structure array with the following fields:
- display : [ 'off' |{'on'}] Governs display
- plots: [ {'none'} | 'final' ] governs plotting of results.
- blockdetails : [ {'standard'} | 'all' ] extent of detail included in model. 'standard' keeps only y-block, 'all' keeps both x- and y- blocks
- waitbar : [ 'off' |{'auto'}| 'on' ] governs use of waitbar during analysis. 'auto' shows waitbar if delay will likely be longer than a reasonable waiting period.
- algorithm : [{'bpn'} | 'encog'] ANN implementation to use.
- nhid1 : [{2}] Number of nodes in first hidden layer.
- nhid2 : [{0}] Number of nodes in second hidden layer.
- learnrate : [0.125] ANN backpropagation learning rate (bpn only).
- learncycles : [20] Number of ANN learning iterations (bpn only).
- terminalrmse : [0.05] Termination RMSE value (of scaled y) for ANN iterations (encog only).
- terminalrmserate : [1.e-9] Termination rate of change of RMSE per 100 iterations (encog only).
- maxseconds : [{20}] Maximum duration of ANN training in seconds (encog only).
- preprocessing: {[] []} preprocessing structures for x and y blocks (see PREPROCESS).
- compression: [{'none'}| 'pca' | 'pls' ] type of data compression to perform on the x-block prior to calculating or applying the ANN model. 'pca' uses a simple PCA model to compress the information. 'pls' uses a PLS model. Compression can make the ANN more stable and less prone to overfitting.
- compressncomp: [1] Number of latent variables (or principal components) to include in the compression model.
- compressmd: [{'yes'} | 'no'] Use Mahalanobis Distance corrected.
- cvmethod : [{'con'} | 'vet' | 'loo' | 'rnd'] CV method, OR [] for Kennard-Stone single split.
- cvsplits : [{5}] Number of CV subsets.
- cvi : M element vector with integer elements allowing user defined subsets. (cvi) is a vector with the same number of elements as x has rows i.e., length(cvi) = size(x,1). Each cvi(i) is defined as:
- cvi(i) = -2 the sample is always in the test set,
- cvi(i) = -1 the sample is always in the calibration set,
- cvi(i) = 0 the sample is never used, and
- cvi(i) = 1,2,3... defines each test subset.
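As an illustration of the coding above, a user-defined cvi vector for a 12-row x-block could be built as follows. The sample assignments are arbitrary and chosen only to show each code once.

```matlab
m   = 12;                          % number of rows in x (made up)
cvi = zeros(1, m);                 % 0 = sample never used

cvi(1:2)  = -1;                    % always in the calibration set
cvi(3)    = -2;                    % always in the test set
cvi(4)    = 0;                     % never used
cvi(5:12) = repmat(1:4, 1, 2);     % remaining samples cycle through test subsets 1..4

options     = struct();
options.cvi = cvi;                 % length(cvi) must equal size(x,1)
```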