Extreme Gradient Boosting Survival Learner
Source: R/LearnerSurvXgboost.R
eXtreme Gradient Boosting for survival analysis. Calls xgboost::xgb.train() from package xgboost.
Note
To compute on GPUs, you first need to compile xgboost yourself and link against CUDA. See https://xgboost.readthedocs.io/en/stable/build.html#building-with-gpu-support.
Custom mlr3 defaults
nrounds:
  Actual default: no default.
  Adjusted default: 1.
  Reason for change: Without a default, construction of the learner would error. The value 1 is only a placeholder to work around this; nrounds needs to be tuned by the user.
nthread:
  Actual value: Undefined, triggering auto-detection of the number of CPUs.
  Adjusted value: 1.
  Reason for change: Auto-detection conflicts with parallelization via future.
verbose:
  Actual default: 1.
  Adjusted default: 0.
  Reason for change: Reduce verbosity.
objective:
  Actual default: reg:squarederror.
  Adjusted default: survival:cox.
  Reason for change: Changed to a survival objective.
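As a minimal sketch of how the adjusted defaults above can be overridden at construction time, the following sets nrounds, nthread and verbose explicitly. It assumes an installed mlr3learners version that still ships surv.xgboost, together with mlr3proba and xgboost; the chosen values (100, 2, 1) are purely illustrative and not recommendations.

```r
# Sketch only: assumes mlr3, mlr3proba, mlr3learners and xgboost are installed
# and that this mlr3learners version provides "surv.xgboost".
if (requireNamespace("xgboost", quietly = TRUE) &&
    requireNamespace("mlr3proba", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3proba)
  library(mlr3learners)

  # Override the placeholder nrounds = 1 and the quiet, single-threaded defaults.
  # Illustrative values; nrounds should normally be tuned.
  learner = lrn("surv.xgboost", nrounds = 100, nthread = 2, verbose = 1)

  # Inspect the values that are now set on the learner
  print(learner$param_set$values)
}
```

Raising nthread above 1 is only sensible when the learner is not also parallelized through future, which is the reason the default was adjusted in the first place.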
Dictionary
This Learner can be instantiated via the dictionary mlr_learners or with the associated sugar function lrn():
mlr_learners$get("surv.xgboost")
lrn("surv.xgboost")
Meta Information
Task type: “surv”
Predict Types: “crank”, “lp”
Feature Types: “integer”, “numeric”
Required Packages: mlr3, mlr3proba, mlr3learners, xgboost
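The meta information above implies that only integer and numeric features are accepted and that predictions carry crank and lp. A hedged sketch of a train/predict round trip follows. The "rats" task and its feature names litter and rx are assumptions taken from mlr3proba, not from this page; the task is restricted to those two columns on the assumption that its remaining feature is not numeric.

```r
# Sketch only: assumes mlr3proba provides the "rats" task with
# integer features "litter" and "rx" (an assumption, not stated on this page).
if (requireNamespace("xgboost", quietly = TRUE) &&
    requireNamespace("mlr3proba", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3proba)
  library(mlr3learners)

  task = tsk("rats")
  # Keep only integer/numeric features, the types this learner supports
  task$select(c("litter", "rx"))

  learner = lrn("surv.xgboost", nrounds = 20)
  learner$train(task)

  prediction = learner$predict(task)
  # Both predict types listed above are available on the prediction object
  print(head(prediction$crank))
  print(head(prediction$lp))
}
```

For tasks with factor features, an encoding step (for example via mlr3pipelines) would be needed before this learner can be trained.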
Parameters
Id | Type | Default | Range | Levels |
aft_loss_distribution | character | normal | - | normal, logistic, extreme |
aft_loss_distribution_scale | numeric | - | \((-\infty, \infty)\) | - |
alpha | numeric | 0 | \([0, \infty)\) | - |
base_score | numeric | 0.5 | \((-\infty, \infty)\) | - |
booster | character | gbtree | - | gbtree, gblinear, dart |
callbacks | list | NULL | - | - |
colsample_bylevel | numeric | 1 | \([0, 1]\) | - |
colsample_bynode | numeric | 1 | \([0, 1]\) | - |
colsample_bytree | numeric | 1 | \([0, 1]\) | - |
disable_default_eval_metric | logical | FALSE | - | TRUE, FALSE |
early_stopping_rounds | integer | NULL | \([1, \infty)\) | - |
eta | numeric | 0.3 | \([0, 1]\) | - |
feature_selector | character | cyclic | - | cyclic, shuffle, random, greedy, thrifty |
feval | list | NULL | - | - |
gamma | numeric | 0 | \([0, \infty)\) | - |
grow_policy | character | depthwise | - | depthwise, lossguide |
interaction_constraints | list | - | - | - |
iterationrange | list | - | - | - |
lambda | numeric | 1 | \([0, \infty)\) | - |
lambda_bias | numeric | 0 | \([0, \infty)\) | - |
max_bin | integer | 256 | \([2, \infty)\) | - |
max_delta_step | numeric | 0 | \([0, \infty)\) | - |
max_depth | integer | 6 | \([0, \infty)\) | - |
max_leaves | integer | 0 | \([0, \infty)\) | - |
maximize | logical | NULL | - | TRUE, FALSE |
min_child_weight | numeric | 1 | \([0, \infty)\) | - |
missing | numeric | NA | \((-\infty, \infty)\) | - |
monotone_constraints | integer | 0 | \([-1, 1]\) | - |
normalize_type | character | tree | - | tree, forest |
nrounds | integer | - | \([1, \infty)\) | - |
nthread | integer | 1 | \([1, \infty)\) | - |
ntreelimit | integer | - | \([1, \infty)\) | - |
num_parallel_tree | integer | 1 | \([1, \infty)\) | - |
objective | character | survival:cox | - | survival:cox, survival:aft |
one_drop | logical | FALSE | - | TRUE, FALSE |
predictor | character | cpu_predictor | - | cpu_predictor, gpu_predictor |
print_every_n | integer | 1 | \([1, \infty)\) | - |
process_type | character | default | - | default, update |
rate_drop | numeric | 0 | \([0, 1]\) | - |
refresh_leaf | logical | TRUE | - | TRUE, FALSE |
sampling_method | character | uniform | - | uniform, gradient_based |
sample_type | character | uniform | - | uniform, weighted |
save_name | list | - | - | - |
save_period | integer | - | \([0, \infty)\) | - |
scale_pos_weight | numeric | 1 | \((-\infty, \infty)\) | - |
seed_per_iteration | logical | FALSE | - | TRUE, FALSE |
sketch_eps | numeric | 0.03 | \([0, 1]\) | - |
skip_drop | numeric | 0 | \([0, 1]\) | - |
single_precision_histogram | logical | FALSE | - | TRUE, FALSE |
strict_shape | logical | FALSE | - | TRUE, FALSE |
subsample | numeric | 1 | \([0, 1]\) | - |
top_k | integer | 0 | \([0, \infty)\) | - |
tree_method | character | auto | - | auto, exact, approx, hist, gpu_hist |
tweedie_variance_power | numeric | 1.5 | \([1, 2]\) | - |
updater | list | - | - | - |
verbose | integer | 1 | \([0, 2]\) | - |
watchlist | list | NULL | - | - |
xgb_model | list | - | - | - |
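The table lists survival:aft as an alternative objective, along with the aft_loss_distribution and aft_loss_distribution_scale parameters. The sketch below only configures such a learner and does not train it. The specific values ("logistic", scale 1, 50 rounds) are illustrative assumptions, and how the learner handles the AFT objective internally is not described on this page.

```r
# Sketch only: configures the AFT objective using parameter ids from the table.
# The chosen values are illustrative, not recommendations.
if (requireNamespace("xgboost", quietly = TRUE) &&
    requireNamespace("mlr3proba", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3proba)
  library(mlr3learners)

  aft_learner = lrn("surv.xgboost",
    objective = "survival:aft",
    aft_loss_distribution = "logistic",   # one of: normal, logistic, extreme
    aft_loss_distribution_scale = 1,
    nrounds = 50
  )

  # Show the configured hyperparameters
  print(aft_learner$param_set$values)
}
```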
References
Chen, Tianqi, Guestrin, Carlos (2016). “Xgboost: A scalable tree boosting system.” In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 785-794. ACM. doi:10.1145/2939672.2939785.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/basics.html#learners
Package mlr3extralearners for more learners.
Dictionary of Learners: mlr_learners
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost, mlr_learners_surv.cv_glmnet, mlr_learners_surv.glmnet, mlr_learners_surv.ranger
Super classes
mlr3::Learner -> mlr3proba::LearnerSurv -> LearnerSurvXgboost
Methods
Method importance()
The importance scores are calculated with xgboost::xgb.importance().
Returns
Named numeric().
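A hedged sketch of calling importance() on a trained learner is shown below. It again assumes the "rats" task from mlr3proba with the integer features litter and rx (names not taken from this page). The model must be trained before importance scores can be extracted.

```r
# Sketch only: task name and feature names are assumptions from mlr3proba.
if (requireNamespace("xgboost", quietly = TRUE) &&
    requireNamespace("mlr3proba", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3proba)
  library(mlr3learners)

  imp_task = tsk("rats")
  imp_task$select(c("litter", "rx"))  # integer/numeric features only

  imp_learner = lrn("surv.xgboost", nrounds = 20)
  imp_learner$train(imp_task)

  # Named numeric vector of importance scores, one entry per feature used
  imp = imp_learner$importance()
  print(imp)
}
```

Features that never appear in a split may be absent from the result, so the vector can be shorter than the number of features in the task.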
Examples
if (requireNamespace("xgboost", quietly = TRUE)) {
  learner = mlr3::lrn("surv.xgboost")
  print(learner)

  # available parameters:
  learner$param_set$ids()
}
#> <LearnerSurvXgboost:surv.xgboost>
#> * Model: -
#> * Parameters: nrounds=1, nthread=1, verbose=0
#> * Packages: mlr3, mlr3proba, mlr3learners, xgboost
#> * Predict Type: crank
#> * Feature types: integer, numeric
#> * Properties: importance, missings, weights
#> [1] "aft_loss_distribution" "aft_loss_distribution_scale"
#> [3] "alpha" "base_score"
#> [5] "booster" "callbacks"
#> [7] "colsample_bylevel" "colsample_bynode"
#> [9] "colsample_bytree" "disable_default_eval_metric"
#> [11] "early_stopping_rounds" "eta"
#> [13] "feature_selector" "feval"
#> [15] "gamma" "grow_policy"
#> [17] "interaction_constraints" "iterationrange"
#> [19] "lambda" "lambda_bias"
#> [21] "max_bin" "max_delta_step"
#> [23] "max_depth" "max_leaves"
#> [25] "maximize" "min_child_weight"
#> [27] "missing" "monotone_constraints"
#> [29] "normalize_type" "nrounds"
#> [31] "nthread" "ntreelimit"
#> [33] "num_parallel_tree" "objective"
#> [35] "one_drop" "predictor"
#> [37] "print_every_n" "process_type"
#> [39] "rate_drop" "refresh_leaf"
#> [41] "sampling_method" "sample_type"
#> [43] "save_name" "save_period"
#> [45] "scale_pos_weight" "seed_per_iteration"
#> [47] "sketch_eps" "skip_drop"
#> [49] "single_precision_histogram" "strict_shape"
#> [51] "subsample" "top_k"
#> [53] "tree_method" "tweedie_variance_power"
#> [55] "updater" "verbose"
#> [57] "watchlist" "xgb_model"