Extreme Gradient Boosting Regression Learner
Source: R/LearnerRegrXgboost.R
mlr_learners_regr.xgboost.Rd
eXtreme Gradient Boosting regression. Calls xgboost::xgb.train() from package xgboost.
To compute on GPUs, you first need to compile xgboost yourself and link against CUDA. See https://xgboost.readthedocs.io/en/stable/build.html#building-with-gpu-support.
Dictionary
This Learner can be instantiated via the dictionary mlr_learners or with the associated sugar function lrn():
mlr_learners$get("regr.xgboost")
lrn("regr.xgboost")
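The two construction routes can be sketched as follows. This is a minimal illustration that assumes mlr3, mlr3learners and xgboost are installed; the hyperparameter values passed to lrn() are arbitrary and are not recommendations.

```r
library(mlr3)
library(mlr3learners)

# Route 1: retrieve the learner from the dictionary.
learner_a = mlr_learners$get("regr.xgboost")

# Route 2: use the sugar function, optionally setting hyperparameters
# at construction time (values are arbitrary, for illustration only).
learner_b = lrn("regr.xgboost", nrounds = 50, eta = 0.1)

# Both objects are LearnerRegrXgboost instances.
class(learner_a)[1]
learner_b$param_set$values$nrounds
```

Setting hyperparameters in lrn() keeps construction and configuration in one call.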
Meta Information
* Task type: “regr”
* Predict Types: “response”
* Feature Types: “logical”, “integer”, “numeric”
* Required Packages: mlr3, mlr3learners, xgboost
Parameters
|Id |Type |Default |Levels |Range |
|:---------------------------|:---------|:----------------|:----------------------------------------|:------------------------------------|
|alpha |numeric |0 | |\([0, \infty)\) |
|approxcontrib |logical |FALSE |TRUE, FALSE |- |
|base_score |numeric |0.5 | |\((-\infty, \infty)\) |
|booster |character |gbtree |gbtree, gblinear, dart |- |
|callbacks |untyped |list | |- |
|colsample_bylevel |numeric |1 | |\([0, 1]\) |
|colsample_bynode |numeric |1 | |\([0, 1]\) |
|colsample_bytree |numeric |1 | |\([0, 1]\) |
|disable_default_eval_metric |logical |FALSE |TRUE, FALSE |- |
|early_stopping_rounds |integer |NULL | |\([1, \infty)\) |
|eta |numeric |0.3 | |\([0, 1]\) |
|eval_metric |untyped |rmse | |- |
|feature_selector |character |cyclic |cyclic, shuffle, random, greedy, thrifty |- |
|feval |untyped | | |- |
|gamma |numeric |0 | |\([0, \infty)\) |
|grow_policy |character |depthwise |depthwise, lossguide |- |
|interaction_constraints |untyped |- | |- |
|iterationrange |untyped |- | |- |
|lambda |numeric |1 | |\([0, \infty)\) |
|lambda_bias |numeric |0 | |\([0, \infty)\) |
|max_bin |integer |256 | |\([2, \infty)\) |
|max_delta_step |numeric |0 | |\([0, \infty)\) |
|max_depth |integer |6 | |\([0, \infty)\) |
|max_leaves |integer |0 | |\([0, \infty)\) |
|maximize |logical |NULL |TRUE, FALSE |- |
|min_child_weight |numeric |1 | |\([0, \infty)\) |
|missing |numeric |NA | |\((-\infty, \infty)\) |
|monotone_constraints |untyped |0 | |- |
|normalize_type |character |tree |tree, forest |- |
|nrounds |integer |- | |\([1, \infty)\) |
|nthread |integer |1 | |\([1, \infty)\) |
|ntreelimit |integer |NULL | |\([1, \infty)\) |
|num_parallel_tree |integer |1 | |\([1, \infty)\) |
|objective |untyped |reg:squarederror | |- |
|one_drop |logical |FALSE |TRUE, FALSE |- |
|outputmargin |logical |FALSE |TRUE, FALSE |- |
|predcontrib |logical |FALSE |TRUE, FALSE |- |
|predictor |character |cpu_predictor |cpu_predictor, gpu_predictor |- |
|predinteraction |logical |FALSE |TRUE, FALSE |- |
|predleaf |logical |FALSE |TRUE, FALSE |- |
|print_every_n |integer |1 | |\([1, \infty)\) |
|process_type |character |default |default, update |- |
|rate_drop |numeric |0 | |\([0, 1]\) |
|refresh_leaf |logical |TRUE |TRUE, FALSE |- |
|reshape |logical |FALSE |TRUE, FALSE |- |
|sampling_method |character |uniform |uniform, gradient_based |- |
|sample_type |character |uniform |uniform, weighted |- |
|save_name |untyped | | |- |
|save_period |integer |NULL | |\([0, \infty)\) |
|scale_pos_weight |numeric |1 | |\((-\infty, \infty)\) |
|seed_per_iteration |logical |FALSE |TRUE, FALSE |- |
|sketch_eps |numeric |0.03 | |\([0, 1]\) |
|skip_drop |numeric |0 | |\([0, 1]\) |
|strict_shape |logical |FALSE |TRUE, FALSE |- |
|subsample |numeric |1 | |\([0, 1]\) |
|top_k |integer |0 | |\([0, \infty)\) |
|training |logical |FALSE |TRUE, FALSE |- |
|tree_method |character |auto |auto, exact, approx, hist, gpu_hist |- |
|tweedie_variance_power |numeric |1.5 | |\([1, 2]\) |
|updater |untyped |- | |- |
|verbose |integer |1 | |\([0, 2]\) |
|watchlist |untyped | | |- |
|xgb_model |untyped | | |- |
Custom mlr3 defaults
* nrounds:
  * Actual default: no default.
  * Adjusted default: 1.
  * Reason for change: Without a default, construction of the learner would error. The value 1 is only a placeholder to work around this; nrounds needs to be tuned by the user.
* nthread:
  * Actual value: Undefined, triggering auto-detection of the number of CPUs.
  * Adjusted value: 1.
  * Reason for change: Auto-detection conflicts with parallelization via future.
* verbose:
  * Actual default: 1.
  * Adjusted default: 0.
  * Reason for change: Reduce verbosity.
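Because the adjusted nrounds is only a placeholder, it should be replaced before training. The following is a minimal sketch, assuming mlr3, mlr3learners and xgboost are installed and using the built-in mtcars task; the value 20 is arbitrary and stands in for a properly tuned value. The score is computed on the training data, so it only demonstrates the workflow and is not a performance estimate.

```r
library(mlr3)
library(mlr3learners)

task = tsk("mtcars")  # small built-in regression task with numeric features
learner = lrn("regr.xgboost")

# Inspect the values that are set on a freshly constructed learner.
learner$param_set$values[c("nrounds", "nthread", "verbose")]

# Replace the placeholder with a deliberate value (arbitrary here).
learner$param_set$values$nrounds = 20

# Train and predict on the same task (in-sample, for illustration only).
learner$train(task)
prediction = learner$predict(task)
prediction$score(msr("regr.rmse"))
```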
References
Chen, Tianqi, Guestrin, Carlos (2016). “XGBoost: A scalable tree boosting system.” In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 785-794. ACM. doi:10.1145/2939672.2939785.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
Package mlr3extralearners for more learners.
Dictionary of Learners: mlr_learners
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.ranger, mlr_learners_regr.svm
Super classes
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrXgboost
Methods
Method importance()
The importance scores are calculated with xgboost::xgb.importance().
Returns
Named numeric().
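Calling importance() requires a trained model. The following is a minimal sketch, assuming mlr3, mlr3learners and xgboost are installed; the mtcars task and nrounds = 20 are arbitrary choices for illustration.

```r
library(mlr3)
library(mlr3learners)

task = tsk("mtcars")
learner = lrn("regr.xgboost", nrounds = 20)  # arbitrary illustrative value
learner$train(task)

# Named numeric vector; the names are feature names of the task.
imp = learner$importance()
head(imp)
```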
Examples
if (requireNamespace("xgboost", quietly = TRUE)) {
  learner = mlr3::lrn("regr.xgboost")
  print(learner)

  # available parameters:
  learner$param_set$ids()
}
#> <LearnerRegrXgboost:regr.xgboost>
#> * Model: -
#> * Parameters: nrounds=1, nthread=1, verbose=0
#> * Packages: mlr3, mlr3learners, xgboost
#> * Predict Type: response
#> * Feature types: logical, integer, numeric
#> * Properties: hotstart_forward, importance, missings, weights
#> [1] "alpha" "approxcontrib"
#> [3] "base_score" "booster"
#> [5] "callbacks" "colsample_bylevel"
#> [7] "colsample_bynode" "colsample_bytree"
#> [9] "disable_default_eval_metric" "early_stopping_rounds"
#> [11] "eta" "eval_metric"
#> [13] "feature_selector" "feval"
#> [15] "gamma" "grow_policy"
#> [17] "interaction_constraints" "iterationrange"
#> [19] "lambda" "lambda_bias"
#> [21] "max_bin" "max_delta_step"
#> [23] "max_depth" "max_leaves"
#> [25] "maximize" "min_child_weight"
#> [27] "missing" "monotone_constraints"
#> [29] "normalize_type" "nrounds"
#> [31] "nthread" "ntreelimit"
#> [33] "num_parallel_tree" "objective"
#> [35] "one_drop" "outputmargin"
#> [37] "predcontrib" "predictor"
#> [39] "predinteraction" "predleaf"
#> [41] "print_every_n" "process_type"
#> [43] "rate_drop" "refresh_leaf"
#> [45] "reshape" "sampling_method"
#> [47] "sample_type" "save_name"
#> [49] "save_period" "scale_pos_weight"
#> [51] "seed_per_iteration" "sketch_eps"
#> [53] "skip_drop" "strict_shape"
#> [55] "subsample" "top_k"
#> [57] "training" "tree_method"
#> [59] "tweedie_variance_power" "updater"
#> [61] "verbose" "watchlist"
#> [63] "xgb_model"