Ann

From Eigenvector Research Documentation Wiki
Revision as of 15:41, 27 June 2014 by imported>Donal (→‎Cross-validation)

Purpose

Predictions based on Artificial Neural Network (ANN) regression models.

Synopsis

[model] = ann(x,y,options);
[model] = ann(x,y, nhid, options);
[pred] = ann(x,model,options);
[valid] = ann(x,y,model,options);

Description

Build an ANN model from input X- and Y-block data using the specified number of hidden layers and nodes per layer. Alternatively, if a model is passed in, ANN makes a Y prediction for an input test X-block. The ANN model contains quantities (weights, etc.) calculated from the calibration data. When a model structure is passed in to ANN, these weights do not need to be recalculated.
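The two modes described above (calibrate, then apply) might look like the following sketch. It uses only the call forms listed in the Synopsis. The data are synthetic, the call ann('options') for retrieving default options is assumed from the usual toolbox convention, and reading the y-block predictions from the second cell of pred.pred is an assumption based on the Outputs description. The toolbox calls are skipped when ann is not on the MATLAB path.

```matlab
% Sketch only: the ann calls require PLS_Toolbox; all data here are synthetic.
rng(0);                              % reproducible random numbers
xcal  = rand(40,5);                  % calibration X-block (40 samples, 5 variables)
ycal  = sum(xcal.^2,2);              % nonlinear response to model
xtest = rand(10,5);                  % new X-block to predict

if exist('ann','file')               % skip when the toolbox is not installed
  opts = ann('options');             % assumed: returns the default options structure
  opts.display = 'off';
  opts.plots   = 'none';
  model = ann(xcal,ycal,3,opts);     % calibrate: one hidden layer with 3 nodes
  pred  = ann(xtest,model,opts);     % apply: weights are taken from the model
  yhat  = pred.pred{2};              % assumed: second cell holds y-block predictions
end
```

Passing nhid as a two-element vector, for example [4 2], would request two hidden layers in the same call form.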

There are two implementations of ANN available referred to as 'BPN' and 'Encog'.

BPN is a feedforward ANN using backpropagation training and is implemented in Matlab.
Encog is a feedforward ANN using Resilient Backpropagation training. See Rprop for further details.

Encog is implemented using the Encog framework provided by Heaton Research, Inc., under the Apache 2.0 license. Further details of Encog neural network features are available in the Encog documentation. BPN is the ANN version used by default, but the user can specify the option 'algorithm' = 'encog' to use Encog instead. Both implementations should give similar results, but either one may be faster depending on the dataset. BPN is currently the only version which calculates RMSECV.

Inputs

  • x = X-block (predictor block) class "double" or "dataset", containing numeric values,
  • y = Y-block (predicted block) class "double" or "dataset", containing numeric values,
  • nhid = number of nodes in a single hidden layer ANN, or a vector of two numbers giving the number of nodes in each layer of a two hidden layer ANN (this takes precedence over options nhid1 and nhid2),
  • model = previously generated model (when applying model to new data).

Outputs

  • model = a standard model structure with the following fields (see Standard Model Structure):
    • modeltype: 'ANN',
    • datasource: structure array with information about input data,
    • date: date of creation,
    • time: time of creation,
    • info: additional model information,
    • pred: 2 element cell array with
      • model predictions for each input block (when options.blockdetails='standard', x-block predictions are not saved and this will be an empty array)
    • detail: sub-structure with additional model details and results, including:
      • model.detail.ann.W: Structure containing details of the ANN, including the ANN type, number of hidden layers and the weights.
  • pred = a structure, similar to model, for the new data.

Training Termination

The ANN is trained on a calibration dataset to minimize the prediction error, RMSEC. It is important not to overtrain, however, so some criteria for ending training are needed.

BPN determines the optimal number of learning iteration cycles by selecting the minimum RMSECV, calculated from the calibration data, over a range of learning iteration values. The cross-validation used is determined by option cvi, or else by cvmethod. If neither of these is specified, then the minimum RMSEP for a single subset of samples from a 5-fold random split of the calibration data is used. This value is not saved in the model.rmsecv field. Apply cross-validation (see below) to add this information to the model.
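The selection rule for BPN can be illustrated with a minimal sketch. The error curve below is made up for illustration (it falls at first and then rises as overtraining sets in) and is not output from the toolbox. The variable names are hypothetical.

```matlab
% Hypothetical validation error at each candidate number of learning cycles.
cycles = 1:20;
rmsecv = 0.5*exp(-cycles/4) + 0.01*cycles;   % made-up curve, not toolbox output

% Keep the number of cycles giving the smallest validation error.
[minrmse, idx] = min(rmsecv);
bestcycles = cycles(idx);
fprintf('selected %d learning cycles\n', bestcycles);
```

Training beyond the selected number of cycles would continue to lower RMSEC while the validation error grows, which is the overtraining the rule is meant to avoid.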

Encog training terminates whenever a) RMSE becomes smaller than the option 'terminalrmse' value, or b) the rate of improvement of RMSE per 100 training iterations becomes smaller than the option 'terminalrmserate' value, or c) training time exceeds the option 'maxseconds' value (though results are not optimal if training is stopped prematurely by this time limit). Note that these RMSE values refer to the internal preprocessed and scaled y values.
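The three Encog stopping criteria can be sketched as a simple loop. This is an illustration of the rules as described above, not the Encog implementation: the training curve trainrmse is a made-up function, and the three thresholds simply reuse the default values listed under Options.

```matlab
terminalrmse     = 0.05;   % a) stop when RMSE falls below this value
terminalrmserate = 1e-9;   % b) stop when improvement per 100 iterations falls below this
maxseconds       = 20;     % c) stop when training time exceeds this

trainrmse = @(it) 0.04 + 0.5*exp(-it/300);   % hypothetical RMSE after "it" iterations

tstart   = tic;
it       = 0;
prevrmse = trainrmse(0);
reason   = '';
while isempty(reason)
  it   = it + 100;                  % stand-in for running 100 training iterations
  rmse = trainrmse(it);
  if rmse < terminalrmse
    reason = 'terminalrmse';
  elseif (prevrmse - rmse) < terminalrmserate
    reason = 'terminalrmserate';
  elseif toc(tstart) > maxseconds
    reason = 'maxseconds';
  end
  prevrmse = rmse;
end
fprintf('stopped after %d iterations (%s), RMSE = %.4f\n', it, reason, rmse);
```

With this particular curve the loop ends on criterion a). A curve that levels off above 0.05 would end on criterion b) instead.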

Cross-validation

Cross-validation can be applied to ANN when using either the ANN Analysis window or the command line. From the Analysis window, specify the cross-validation method in the usual way (clicking on the model icon's red check-mark, or the "Choose Cross-Validation" link in the flowchart). In the cross-validation window, the "Maximum Number of Nodes" setting specifies how many hidden-layer 1 nodes to test over. Viewing RMSECV versus number of hidden-layer 1 nodes (toolbar icon to the left of Scores Plot) is useful for choosing the number of layer 1 nodes. From the command line, use the crossval method to add cross-validation information to an existing model.

Options

options = a structure array with the following fields:

  • display : [ 'off' |{'on'}] governs level of display to the command window.
  • plots: [ {'none'} | 'final' ] governs plotting of results.
  • blockdetails : [ {'standard'} | 'all' ] extent of detail included in model. 'standard' keeps only y-block, 'all' keeps both x- and y- blocks.
  • waitbar : [ 'off' |{'auto'}| 'on' ] governs use of waitbar during analysis. 'auto' shows waitbar if delay will likely be longer than a reasonable waiting period.
  • algorithm : [{'bpn'} | 'encog'] ANN implementation to use.
  • nhid1 : [{2}] Number of nodes in first hidden layer.
  • nhid2 : [{0}] Number of nodes in second hidden layer.
  • learnrate : [{0.125}] ANN backpropagation learning rate (bpn only).
  • learncycles : [{20}] Number of ANN learning iterations (bpn only).
  • terminalrmse : [{0.05}] Termination RMSE value (of scaled y) for ANN iterations (encog only).
  • terminalrmserate : [{1.e-9}] Termination rate of change of RMSE per 100 iterations (encog only).
  • maxseconds : [{20}] Maximum duration of ANN training in seconds (encog only).
  • preprocessing: {[] []} preprocessing structures for x and y blocks (see PREPROCESS).
  • compression: [{'none'}| 'pca' | 'pls' ] type of data compression to perform on the x-block prior to calculating or applying the ANN model. 'pca' uses a simple PCA model to compress the information. 'pls' uses a PLS model. Compression can make the ANN more stable and less prone to overfitting.
  • compressncomp: [{1}] Number of latent variables (or principal components) to include in the compression model.
  • compressmd: [{'yes'} | 'no'] Use Mahalanobis distance corrected scores from the compression model.
  • cvmethod : [{'con'} | 'vet' | 'loo' | 'rnd'] CV method, OR [] for Kennard-Stone single split.
  • cvsplits : [{5}] Number of CV subsets.
  • cvi : M element vector with integer elements allowing user defined subsets. cvi is a vector with the same number of elements as x has rows, i.e., length(cvi) = size(x,1). Each cvi(i) is defined as:
    • cvi(i) = -2 the sample is always in the test set,
    • cvi(i) = -1 the sample is always in the calibration set,
    • cvi(i) = 0 the sample is never used, and
    • cvi(i) = 1,2,3... defines each test subset.
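
As an illustration of the cvi coding, the following sketch builds a vector for a hypothetical x-block with 12 rows. The sample count and the choice of which samples to pin are made up for the example. Assigning the result to options.cvi before calling ann follows from the option description above.

```matlab
nsamples = 12;                        % hypothetical number of rows in x
cvi = repmat(1:3, 1, nsamples/3);     % samples cycle through test subsets 1, 2, 3
cvi(1)   = -1;                        % sample 1 is always in the calibration set
cvi(2)   = -2;                        % sample 2 is always in the test set
cvi(end) = 0;                         % the last sample is never used

% cvi must have one element per row of x.
assert(length(cvi) == nsamples);
disp(cvi);
```

Here samples 3, 6 and 9 form test subset 3, samples 4, 7 and 10 form subset 1, and samples 5, 8 and 11 form subset 2.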

See Also

analysis, crossval, modelselector