Random classification forest.
Calls ranger::ranger() from package ranger.
Custom mlr3 parameters
mtry: This hyperparameter can alternatively be set via our hyperparameter mtry.ratio as mtry = max(ceiling(mtry.ratio * n_features), 1). Note that mtry and mtry.ratio are mutually exclusive.
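A minimal sketch of how mtry.ratio resolves to mtry, assuming the sonar task with its 60 features; the resolved value here follows directly from the formula above:

```r
library(mlr3)
library(mlr3learners)

# Set mtry indirectly via mtry.ratio; at train time it is resolved as
# mtry = max(ceiling(mtry.ratio * n_features), 1).
learner = lrn("classif.ranger", mtry.ratio = 0.1)

task = tsk("sonar")  # 60 features, so mtry resolves to ceiling(0.1 * 60) = 6
learner$train(task)
```

Setting mtry directly alongside mtry.ratio raises an error, since the two are mutually exclusive.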
Initial parameter values
num.threads: Actual default: 2, using two threads, while also respecting the environment variable R_RANGER_NUM_THREADS, options(ranger.num.threads = N), or options(Ncpus = N), with precedence in that order.
Adjusted value: 1.
Reason for change: conflicts with parallelization via future.
Dictionary
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
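Both routes mentioned above can be sketched as follows; they yield an equivalent learner object:

```r
library(mlr3)
library(mlr3learners)

# Via the dictionary ...
learner = mlr_learners$get("classif.ranger")

# ... or, equivalently, via the sugar function
learner = lrn("classif.ranger")
```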
Meta Information
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, ranger
Parameters
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| always.split.variables | untyped | - | - | - |
| class.weights | untyped | NULL | - | - |
| holdout | logical | FALSE | TRUE, FALSE | - |
| importance | character | - | none, impurity, impurity_corrected, permutation | - |
| keep.inbag | logical | FALSE | TRUE, FALSE | - |
| local.importance | logical | FALSE | TRUE, FALSE | - |
| max.depth | integer | NULL | - | \([1, \infty)\) |
| min.bucket | untyped | 1L | - | - |
| min.node.size | untyped | NULL | - | - |
| mtry | integer | - | - | \([1, \infty)\) |
| mtry.ratio | numeric | - | - | \([0, 1]\) |
| na.action | character | na.learn | na.learn, na.omit, na.fail | - |
| node.stats | logical | FALSE | TRUE, FALSE | - |
| num.random.splits | integer | 1 | - | \([1, \infty)\) |
| num.threads | integer | 1 | - | \([1, \infty)\) |
| num.trees | integer | 500 | - | \([1, \infty)\) |
| oob.error | logical | TRUE | TRUE, FALSE | - |
| regularization.factor | untyped | 1 | - | - |
| regularization.usedepth | logical | FALSE | TRUE, FALSE | - |
| replace | logical | TRUE | TRUE, FALSE | - |
| respect.unordered.factors | character | - | ignore, order, partition | - |
| sample.fraction | numeric | - | - | \([0, 1]\) |
| save.memory | logical | FALSE | TRUE, FALSE | - |
| scale.permutation.importance | logical | FALSE | TRUE, FALSE | - |
| seed | integer | NULL | - | \((-\infty, \infty)\) |
| split.select.weights | untyped | NULL | - | - |
| splitrule | character | gini | gini, extratrees, hellinger | - |
| verbose | logical | TRUE | TRUE, FALSE | - |
| write.forest | logical | TRUE | TRUE, FALSE | - |
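Parameters from the table above can be passed at construction or set afterwards on the parameter set; a small sketch (the chosen values are illustrative, not recommendations):

```r
library(mlr3)
library(mlr3learners)

# Pass parameters at construction ...
learner = lrn("classif.ranger",
  num.trees = 1000,
  splitrule = "extratrees",
  importance = "impurity"
)

# ... or update them later on the parameter set
learner$param_set$set_values(max.depth = 10)
```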
References
Wright, Marvin N., Ziegler, Andreas (2017). “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software, 77(1), 1–17. doi:10.18637/jss.v077.i01.
Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
Super classes
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRanger
Methods
Inherited methods
Method importance()
The importance scores are extracted from the model slot variable.importance.
Parameter importance must be set to "impurity", "impurity_corrected", or
"permutation".
Returns
Named numeric().
Examples
# Define the Learner and set parameter values
learner = lrn("classif.ranger")
learner$param_set$set_values(importance = "permutation")
print(learner)
#>
#> ── <LearnerClassifRanger> (classif.ranger): Random Forest ──────────────────────
#> • Model: -
#> • Parameters: importance=permutation, num.threads=1
#> • Packages: mlr3, mlr3learners, and ranger
#> • Predict Types: [response] and prob
#> • Feature Types: logical, integer, numeric, character, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: hotstart_backward, importance, missings, multiclass, oob_error,
#> selected_features, twoclass, and weights
#> • Other settings: use_weights = 'use'
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
# Print the model
print(learner$model)
#> Ranger result
#>
#> Call:
#> ranger::ranger(dependent.variable.name = task$target_names, data = task$data(), probability = self$predict_type == "prob", importance = "permutation", num.threads = 1L)
#>
#> Type: Classification
#> Number of trees: 500
#> Sample size: 139
#> Number of independent variables: 60
#> Mtry: 7
#> Target node size: 1
#> Variable importance mode: permutation
#> Splitrule: gini
#> OOB prediction error: 18.71 %
# Importance method
print(learner$importance())
#> V11 V12 V9 V48 V10
#> 2.907732e-02 2.081168e-02 1.416521e-02 1.013412e-02 9.932664e-03
#> V49 V36 V45 V37 V46
#> 9.218636e-03 7.653640e-03 7.491513e-03 6.358383e-03 5.262950e-03
#> V13 V5 V6 V28 V47
#> 4.958936e-03 4.894211e-03 4.724120e-03 4.619224e-03 4.013001e-03
#> V18 V31 V52 V7 V21
#> 3.949882e-03 3.902219e-03 3.875056e-03 3.814926e-03 3.583567e-03
#> V8 V20 V27 V4 V17
#> 3.404986e-03 3.207793e-03 2.849060e-03 2.683960e-03 2.550693e-03
#> V23 V33 V32 V35 V39
#> 2.433713e-03 2.352639e-03 2.339197e-03 2.055994e-03 1.730373e-03
#> V43 V22 V2 V16 V19
#> 1.703165e-03 1.532948e-03 1.523019e-03 1.436040e-03 1.301010e-03
#> V15 V42 V44 V51 V25
#> 1.252977e-03 1.247813e-03 1.242833e-03 1.235211e-03 1.210964e-03
#> V34 V24 V30 V55 V1
#> 1.164864e-03 1.029063e-03 9.988362e-04 9.534820e-04 9.399139e-04
#> V38 V53 V59 V60 V41
#> 9.338938e-04 7.314214e-04 5.471997e-04 5.172327e-04 4.846357e-04
#> V14 V26 V3 V29 V56
#> 4.457893e-04 4.084666e-04 3.791453e-04 3.259506e-04 2.045161e-04
#> V54 V58 V57 V40 V50
#> 5.106627e-05 -6.653969e-05 -1.476340e-04 -3.734763e-04 -8.806703e-04
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> classif.ce
#> 0.1594203