eXtreme Gradient Boosting classification. Calls xgboost::xgb.train() from package xgboost.

If not specified otherwise, the evaluation metric is set to the default "logloss" for binary classification problems and set to "mlogloss" for multiclass problems. This was necessary to silence a deprecation warning.
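As a minimal sketch of overriding this default, the metric can be passed at construction. The choice of "error" (classification error rate) is only an illustrative assumption; any metric accepted by xgboost could be used instead.

```r
library(mlr3)
library(mlr3learners)

# Default construction: eval_metric is left unset, so the learner
# falls back to "logloss" / "mlogloss" as described above.
learner_default = lrn("classif.xgboost")

# Explicit choice (illustrative assumption): classification error rate.
learner_error = lrn("classif.xgboost", eval_metric = "error")

# The explicitly set value is stored in the parameter set.
learner_error$param_set$values$eval_metric
```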

Note

To compute on GPUs, you first need to compile xgboost yourself and link against CUDA. See https://xgboost.readthedocs.io/en/stable/build.html#building-with-gpu-support.
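A hedged sketch of requesting GPU training, assuming xgboost was built with CUDA support as described above. The levels "gpu_hist" and "gpu_predictor" are taken from the parameter table on this page; on a CPU-only build, training with these settings is expected to fail.

```r
library(mlr3)
library(mlr3learners)

# Assumes a CUDA-enabled xgboost build; constructing the learner does not
# touch the GPU, only training and prediction do.
learner_gpu = lrn("classif.xgboost",
  tree_method = "gpu_hist",
  predictor = "gpu_predictor"
)

learner_gpu$param_set$values$tree_method
```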

Custom mlr3 defaults

  • nrounds:

    • Actual default: no default.

    • Adjusted default: 1.

    • Reason for change: Without a default, construction of the learner would error. The value 1 is an arbitrary placeholder to work around this; nrounds needs to be tuned by the user.

  • nthread:

    • Actual value: Undefined, triggering auto-detection of the number of CPUs.

    • Adjusted value: 1.

    • Reason for change: Auto-detection conflicts with parallelization via future.

  • verbose:

    • Actual default: 1.

    • Adjusted default: 0.

    • Reason for change: Reduce verbosity.
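The adjusted defaults listed above can be overridden at construction. The following is a minimal sketch; the concrete values (100 rounds, 2 threads) are illustrative assumptions, not recommendations.

```r
library(mlr3)
library(mlr3learners)

# Construction with the mlr3 defaults (nrounds, nthread, verbose are pre-set).
learner = lrn("classif.xgboost")
str(learner$param_set$values)

# Override the placeholder values; the numbers are illustrative only.
learner_custom = lrn("classif.xgboost",
  nrounds = 100L,  # number of boosting rounds, should be tuned
  nthread = 2L,    # use with care when also parallelizing via future
  verbose = 1L     # restore the upstream verbosity
)
str(learner_custom$param_set$values)
```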

Dictionary

This Learner can be instantiated via the dictionary mlr_learners or with the associated sugar function lrn():

mlr_learners$get("classif.xgboost")
lrn("classif.xgboost")

Meta Information

  • Task type: “classif”

  • Predict Types: “response”, “prob”

  • Feature Types: “logical”, “integer”, “numeric”

  • Required Packages: mlr3, mlr3learners, xgboost
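To illustrate the predict types and feature types above, here is a small sketch that trains on the "sonar" task shipped with mlr3 (a binary task with numeric features) and requests probabilities. The task and nrounds = 10 are assumptions chosen for illustration.

```r
library(mlr3)
library(mlr3learners)

task = tsk("sonar")
learner = lrn("classif.xgboost", predict_type = "prob", nrounds = 10L)

# Check that every feature type of the task is supported by the learner.
all(task$feature_types$type %in% learner$feature_types)

learner$train(task)
prediction = learner$predict(task)

# With predict_type = "prob", the prediction carries a probability matrix
# (one column per class) in addition to the response.
head(prediction$prob)
```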

Parameters

Id                           Type       Default          Range                  Levels
alpha                        numeric    0                \([0, \infty)\)        -
approxcontrib                logical    FALSE            -                      TRUE, FALSE
base_score                   numeric    0.5              \((-\infty, \infty)\)  -
booster                      character  gbtree           -                      gbtree, gblinear, dart
callbacks                    list       NULL             -                      -
colsample_bylevel            numeric    1                \([0, 1]\)             -
colsample_bynode             numeric    1                \([0, 1]\)             -
colsample_bytree             numeric    1                \([0, 1]\)             -
disable_default_eval_metric  logical    FALSE            -                      TRUE, FALSE
early_stopping_rounds        integer    NULL             \([1, \infty)\)        -
eta                          numeric    0.3              \([0, 1]\)             -
eval_metric                  list       -                -                      -
feature_selector             character  cyclic           -                      cyclic, shuffle, random, greedy, thrifty
feval                        list       NULL             -                      -
gamma                        numeric    0                \([0, \infty)\)        -
grow_policy                  character  depthwise        -                      depthwise, lossguide
interaction_constraints      list       -                -                      -
iterationrange               list       -                -                      -
lambda                       numeric    1                \([0, \infty)\)        -
lambda_bias                  numeric    0                \([0, \infty)\)        -
max_bin                      integer    256              \([2, \infty)\)        -
max_delta_step               numeric    0                \([0, \infty)\)        -
max_depth                    integer    6                \([0, \infty)\)        -
max_leaves                   integer    0                \([0, \infty)\)        -
maximize                     logical    NULL             -                      TRUE, FALSE
min_child_weight             numeric    1                \([0, \infty)\)        -
missing                      numeric    NA               \((-\infty, \infty)\)  -
monotone_constraints         list       0                -                      -
normalize_type               character  tree             -                      tree, forest
nrounds                      integer    -                \([1, \infty)\)        -
nthread                      integer    1                \([1, \infty)\)        -
ntreelimit                   integer    NULL             \([1, \infty)\)        -
num_parallel_tree            integer    1                \([1, \infty)\)        -
objective                    list       binary:logistic  -                      -
one_drop                     logical    FALSE            -                      TRUE, FALSE
outputmargin                 logical    FALSE            -                      TRUE, FALSE
predcontrib                  logical    FALSE            -                      TRUE, FALSE
predictor                    character  cpu_predictor    -                      cpu_predictor, gpu_predictor
predinteraction              logical    FALSE            -                      TRUE, FALSE
predleaf                     logical    FALSE            -                      TRUE, FALSE
print_every_n                integer    1                \([1, \infty)\)        -
process_type                 character  default          -                      default, update
rate_drop                    numeric    0                \([0, 1]\)             -
refresh_leaf                 logical    TRUE             -                      TRUE, FALSE
reshape                      logical    FALSE            -                      TRUE, FALSE
seed_per_iteration           logical    FALSE            -                      TRUE, FALSE
sampling_method              character  uniform          -                      uniform, gradient_based
sample_type                  character  uniform          -                      uniform, weighted
save_name                    list       NULL             -                      -
save_period                  integer    NULL             \([0, \infty)\)        -
scale_pos_weight             numeric    1                \((-\infty, \infty)\)  -
sketch_eps                   numeric    0.03             \([0, 1]\)             -
skip_drop                    numeric    0                \([0, 1]\)             -
single_precision_histogram   logical    FALSE            -                      TRUE, FALSE
strict_shape                 logical    FALSE            -                      TRUE, FALSE
subsample                    numeric    1                \([0, 1]\)             -
top_k                        integer    0                \([0, \infty)\)        -
training                     logical    FALSE            -                      TRUE, FALSE
tree_method                  character  auto             -                      auto, exact, approx, hist, gpu_hist
tweedie_variance_power       numeric    1.5              \([1, 2]\)             -
updater                      list       -                -                      -
verbose                      integer    1                \([0, 2]\)             -
watchlist                    list       NULL             -                      -
xgb_model                    list       NULL             -                      -
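The parameters above are exposed through the learner's param_set. The sketch below reads an upstream default and sets a few values; the chosen values are illustrative assumptions. Values outside the listed Range are expected to be rejected by the parameter set, which the last step probes defensively with tryCatch().

```r
library(mlr3)
library(mlr3learners)

learner = lrn("classif.xgboost")

# Upstream default as listed in the Default column.
learner$param_set$default$eta

# Set several hyperparameters while keeping the existing ones.
learner$param_set$values = mlr3misc::insert_named(
  learner$param_set$values,
  list(eta = 0.1, max_depth = 4L, subsample = 0.8)
)
str(learner$param_set$values)

# eta is restricted to [0, 1]; an out-of-range value should raise an error.
result = tryCatch(
  {
    learner$param_set$values$eta = 2
    "accepted"
  },
  error = function(e) "rejected"
)
result
```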

References

Chen, Tianqi, Guestrin, Carlos (2016). “Xgboost: A scalable tree boosting system.” In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 785–794. ACM. doi: 10.1145/2939672.2939785.

See also

Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost, mlr_learners_surv.cv_glmnet, mlr_learners_surv.glmnet, mlr_learners_surv.ranger, mlr_learners_surv.xgboost

Super classes

mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifXgboost

Methods


Method new()

Creates a new instance of this R6 class.

Usage

LearnerClassifXgboost$new()

Method importance()

The importance scores are calculated with xgboost::xgb.importance().

Usage

LearnerClassifXgboost$importance()

Returns

Named numeric().
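A short sketch of retrieving importance scores after training. The "sonar" task from mlr3 and nrounds = 10 are assumptions made for illustration.

```r
library(mlr3)
library(mlr3learners)

task = tsk("sonar")
learner = lrn("classif.xgboost", nrounds = 10L)

# importance() requires a trained model.
learner$train(task)

# Named numeric vector: names are feature ids, values are importance scores.
imp = learner$importance()
head(imp, 5)
```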


Method clone()

The objects of this class are cloneable with this method.

Usage

LearnerClassifXgboost$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
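As a sketch of why a deep clone matters: the learner holds R6 objects such as its parameter set, so a deep clone is the safe way to obtain an independent copy. The value 50 is an arbitrary illustration.

```r
library(mlr3)
library(mlr3learners)

learner = lrn("classif.xgboost")

# Deep clone: nested R6 objects (e.g. the parameter set) are copied as well.
copy = learner$clone(deep = TRUE)
copy$param_set$values$nrounds = 50L

# The original learner keeps its own nrounds value.
learner$param_set$values$nrounds
copy$param_set$values$nrounds
```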

Examples

if (requireNamespace("xgboost", quietly = TRUE)) {
  learner = mlr3::lrn("classif.xgboost")
  print(learner)

  # available parameters:
  learner$param_set$ids()
}
#> <LearnerClassifXgboost:classif.xgboost>
#> * Model: -
#> * Parameters: nrounds=1, nthread=1, verbose=0
#> * Packages: mlr3, mlr3learners, xgboost
#> * Predict Type: response
#> * Feature types: logical, integer, numeric
#> * Properties: hotstart_forward, importance, missings, multiclass,
#>   twoclass, weights
#>  [1] "alpha"                       "approxcontrib"              
#>  [3] "base_score"                  "booster"                    
#>  [5] "callbacks"                   "colsample_bylevel"          
#>  [7] "colsample_bynode"            "colsample_bytree"           
#>  [9] "disable_default_eval_metric" "early_stopping_rounds"      
#> [11] "eta"                         "eval_metric"                
#> [13] "feature_selector"            "feval"                      
#> [15] "gamma"                       "grow_policy"                
#> [17] "interaction_constraints"     "iterationrange"             
#> [19] "lambda"                      "lambda_bias"                
#> [21] "max_bin"                     "max_delta_step"             
#> [23] "max_depth"                   "max_leaves"                 
#> [25] "maximize"                    "min_child_weight"           
#> [27] "missing"                     "monotone_constraints"       
#> [29] "normalize_type"              "nrounds"                    
#> [31] "nthread"                     "ntreelimit"                 
#> [33] "num_parallel_tree"           "objective"                  
#> [35] "one_drop"                    "outputmargin"               
#> [37] "predcontrib"                 "predictor"                  
#> [39] "predinteraction"             "predleaf"                   
#> [41] "print_every_n"               "process_type"               
#> [43] "rate_drop"                   "refresh_leaf"               
#> [45] "reshape"                     "seed_per_iteration"         
#> [47] "sampling_method"             "sample_type"                
#> [49] "save_name"                   "save_period"                
#> [51] "scale_pos_weight"            "sketch_eps"                 
#> [53] "skip_drop"                   "single_precision_histogram" 
#> [55] "strict_shape"                "subsample"                  
#> [57] "top_k"                       "training"                   
#> [59] "tree_method"                 "tweedie_variance_power"     
#> [61] "updater"                     "verbose"                    
#> [63] "watchlist"                   "xgb_model"