Xgb
Revision as of 17:57, 17 December 2018
Purpose
Gradient Boosted Tree (XGBoost) for regression or classification.
Synopsis
- model = xgb(x,y,options); %identifies model (calibration step)
- pred = xgb(x,model,options); %makes predictions with a new X-block
- valid = xgb(x,y,model,options); %performs a "test" call with a new X-block and known y-values
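The three calling forms above can be sketched as follows. This is a minimal illustration, not a reference implementation: the synthetic data and the variable names (xcal, ycal, xtest, ytest) are made up for the example, and passing only a few option fields assumes the function fills in defaults for the rest. The calls are guarded so the block runs only where xgb is on the path.

```matlab
% Synthetic data for illustration only (not from the documentation).
xcal  = randn(40, 5);                                 % calibration X-block: 40 samples, 5 variables
ycal  = xcal(:,1) - 2*xcal(:,2) + 0.1*randn(40, 1);   % calibration y-block
xtest = randn(10, 5);                                 % new X-block
ytest = xtest(:,1) - 2*xtest(:,2);                    % known y-values for the test call

% Partial options structure (assumption: unspecified fields take their defaults).
options = struct('xgbtype', 'xgbr', 'display', 'off', 'plots', 'none');

if exist('xgb', 'file')                        % skip the calls when xgb is not installed
    model = xgb(xcal, ycal, options);          % calibration step
    pred  = xgb(xtest, model, options);        % predictions for a new X-block
    valid = xgb(xtest, ytest, model, options); % test call with known y-values
end
```

The same model structure is passed to both the prediction and the test call; only the presence of the y-block distinguishes them.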
Description
To choose between regression and classification, use the xgbtype option:
- regression : xgbtype = 'xgbr'
- classification : xgbtype = 'xgbc'
It is recommended that classification be done through the xgbda function.
Inputs
- x = X-block (predictor block), class "double" or "dataset"
- y = Y-block (predicted block), class "double" or "dataset"
- model = previously generated model (when applying model to new data)
Outputs
- model = standard model structure containing the xgboost model (see Standard Model Structure). Feature scores are contained in model.detail.xgb.featurescores.
- pred = structure array with predictions
- valid = structure array with predictions
Options
options = a structure array with the following fields:
- display: [ 'off' | {'on'} ] governs level of display to command window.
- plots: [ 'none' | {'final'} ] governs level of plotting.
- waitbar: [ 'off' | {'on'} ] governs display of waitbar during optimization and predictions.
- preprocessing: {[] []}, two-element cell array containing preprocessing structures (see PREPROCESS) defining the preprocessing to use on the x- and y-blocks (first and second elements, respectively).
- algorithm: [ 'xgboost' ] algorithm to use. 'xgboost' is the default and currently the only option.
- classset : [ 1 ] indicates which class set in x to use when no y-block is provided.
- xgbtype : [ {'xgbr'} | 'xgbc' ] Type of XGB to apply: 'xgbr' for regression (the default), 'xgbc' for classification.
- compression : [ {'none'} | 'pca' | 'pls' ] type of data compression to perform on the x-block prior to calculating or applying the XGB model. 'pca' uses a simple PCA model to compress the information. 'pls' uses either a PLS or PLSDA model (depending on the xgbtype). Compression can make the XGB model more stable and less prone to overfitting.
- compressncomp : [ 1 ] Number of latent variables (or principal components) to include in the compression model.
- compressmd : [ 'no' | {'yes'} ] Use Mahalanobis distance corrected scores from the compression model.
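As an illustration of how these fields fit together, the sketch below builds an options structure for a regression model with PCA compression. The field names and allowed values come from the list above; the choice of 3 components and the synthetic data are hypothetical, and the call is guarded so the block runs only where xgb is available.

```matlab
% Option values below are taken from the field list above.
options = struct();
options.display       = 'off';     % suppress command-window output
options.plots         = 'none';    % suppress plotting
options.waitbar       = 'off';     % no waitbar during optimization and predictions
options.preprocessing = {[] []};   % no preprocessing on the x- or y-block
options.xgbtype       = 'xgbr';    % regression
options.compression   = 'pca';     % compress the x-block with a PCA model first
options.compressncomp = 3;         % hypothetical choice: keep 3 principal components
options.compressmd    = 'yes';     % use Mahalanobis distance corrected scores

if exist('xgb', 'file')            % skip the call when xgb is not installed
    x = randn(40, 8);                       % synthetic X-block (illustration only)
    y = x(:,1) + 0.1*randn(40, 1);          % synthetic y-block
    model  = xgb(x, y, options);            % calibration with compression enabled
    scores = model.detail.xgb.featurescores; % feature scores, as described under Outputs
end
```

With compression enabled, the tree model is fitted to the compressed scores rather than to the original variables, so compressncomp is worth choosing with the same care as in a standalone PCA or PLS model.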