
Random classification forest. Calls ranger::ranger() from package ranger.

Custom mlr3 parameters

  • mtry:

    • This hyperparameter can alternatively be set via our hyperparameter mtry.ratio as mtry = max(ceiling(mtry.ratio * n_features), 1). Note that mtry and mtry.ratio are mutually exclusive.
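The mapping above can be checked directly in base R. A minimal sketch; `mtry_from_ratio()` is a hypothetical helper that mirrors the formula, not a function exported by mlr3:

```r
# Hypothetical helper mirroring the mtry.ratio -> mtry formula above;
# not part of the mlr3 API.
mtry_from_ratio <- function(mtry.ratio, n_features) {
  max(ceiling(mtry.ratio * n_features), 1)
}

mtry_from_ratio(0.5, 60)    # half of 60 features -> 30
mtry_from_ratio(0.001, 60)  # never drops below 1 -> 1
```

The `max(..., 1)` guard is why very small ratios are still valid: the learner always samples at least one candidate variable per split.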

Initial parameter values

  • num.threads:

    • Actual default: 2, using two threads, while also respecting environment variable R_RANGER_NUM_THREADS, options(ranger.num.threads = N), or options(Ncpus = N), with precedence in that order.

    • Adjusted value: 1.

    • Reason for change: Conflicting with parallelization via future.
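The precedence described above can be sketched in base R. `ranger_default_threads()` is a hypothetical illustration of the lookup order, not code from the ranger package:

```r
# Hypothetical sketch of ranger's default-thread lookup order:
# env var R_RANGER_NUM_THREADS, then options(ranger.num.threads),
# then options(Ncpus), falling back to 2. Not part of ranger's API.
ranger_default_threads <- function() {
  env <- Sys.getenv("R_RANGER_NUM_THREADS", unset = "")
  if (nzchar(env)) return(as.integer(env))
  opt <- getOption("ranger.num.threads")
  if (!is.null(opt)) return(as.integer(opt))
  ncpus <- getOption("Ncpus")
  if (!is.null(ncpus)) return(as.integer(ncpus))
  2L
}
```

mlr3learners overrides this default to `num.threads = 1` so that parallelization is handled by future; set `num.threads` explicitly on the learner if you want ranger's own threading back.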

Dictionary

This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():

mlr_learners$get("classif.ranger")
lrn("classif.ranger")

Meta Information

  • Task type: “classif”

  • Predict Types: “response”, “prob”

  • Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”

  • Required Packages: mlr3, mlr3learners, ranger

Parameters

| Id                           | Type      | Default  | Levels                                          | Range   |
|------------------------------|-----------|----------|-------------------------------------------------|---------|
| always.split.variables       | untyped   | -        | -                                               | -       |
| class.weights                | untyped   | NULL     | -                                               | -       |
| holdout                      | logical   | FALSE    | TRUE, FALSE                                     | -       |
| importance                   | character | -        | none, impurity, impurity_corrected, permutation | -       |
| keep.inbag                   | logical   | FALSE    | TRUE, FALSE                                     | -       |
| max.depth                    | integer   | NULL     | -                                               | [1, ∞)  |
| min.bucket                   | untyped   | 1L       | -                                               | -       |
| min.node.size                | untyped   | NULL     | -                                               | -       |
| mtry                         | integer   | -        | -                                               | [1, ∞)  |
| mtry.ratio                   | numeric   | -        | -                                               | [0, 1]  |
| na.action                    | character | na.learn | na.learn, na.omit, na.fail                      | -       |
| num.random.splits            | integer   | 1        | -                                               | [1, ∞)  |
| node.stats                   | logical   | FALSE    | TRUE, FALSE                                     | -       |
| num.threads                  | integer   | 1        | -                                               | [1, ∞)  |
| num.trees                    | integer   | 500      | -                                               | [1, ∞)  |
| oob.error                    | logical   | TRUE     | TRUE, FALSE                                     | -       |
| regularization.factor        | untyped   | 1        | -                                               | -       |
| regularization.usedepth      | logical   | FALSE    | TRUE, FALSE                                     | -       |
| replace                      | logical   | TRUE     | TRUE, FALSE                                     | -       |
| respect.unordered.factors    | character | -        | ignore, order, partition                        | -       |
| sample.fraction              | numeric   | -        | -                                               | [0, 1]  |
| save.memory                  | logical   | FALSE    | TRUE, FALSE                                     | -       |
| scale.permutation.importance | logical   | FALSE    | TRUE, FALSE                                     | -       |
| local.importance             | logical   | FALSE    | TRUE, FALSE                                     | -       |
| seed                         | integer   | NULL     | -                                               | (-∞, ∞) |
| split.select.weights         | untyped   | NULL     | -                                               | -       |
| splitrule                    | character | gini     | gini, extratrees, hellinger                     | -       |
| verbose                      | logical   | TRUE     | TRUE, FALSE                                     | -       |
| write.forest                 | logical   | TRUE     | TRUE, FALSE                                     | -       |

References

Wright, Marvin N., Ziegler, Andreas (2017). “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software, 77(1), 1–17. doi:10.18637/jss.v077.i01.

Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.

See also

Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost

Super classes

mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRanger

Methods

Inherited methods


Method new()

Creates a new instance of this R6 class.

Usage

LearnerClassifRanger$new()

Method importance()

The importance scores are extracted from the model slot variable.importance. Parameter importance.mode must be set to "impurity", "impurity_corrected", or "permutation".

Usage

LearnerClassifRanger$importance()

Returns

Named numeric().


Method oob_error()

The out-of-bag error, extracted from model slot prediction.error.

Usage

LearnerClassifRanger$oob_error()

Returns

numeric(1).


Method selected_features()

The set of features used for node splitting in the forest.

Usage

LearnerClassifRanger$selected_features()

Returns

character().


Method clone()

The objects of this class are cloneable with this method.

Usage

LearnerClassifRanger$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

# Define the Learner and set parameter values
learner = lrn("classif.ranger")
learner$param_set$set_values(importance = "permutation")
print(learner)
#> 
#> ── <LearnerClassifRanger> (classif.ranger): Random Forest ──────────────────────
#> • Model: -
#> • Parameters: importance=permutation, num.threads=1
#> • Packages: mlr3, mlr3learners, and ranger
#> • Predict Types: [response] and prob
#> • Feature Types: logical, integer, numeric, character, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: hotstart_backward, importance, missings, multiclass, oob_error,
#> selected_features, twoclass, and weights
#> • Other settings: use_weights = 'use'

# Define a Task
task = tsk("sonar")

# Create train and test set
ids = partition(task)

# Train the learner on the training ids
learner$train(task, row_ids = ids$train)

# Print the model
print(learner$model)
#> Ranger result
#> 
#> Call:
#>  ranger::ranger(dependent.variable.name = task$target_names, data = task$data(),      probability = self$predict_type == "prob", importance = "permutation",      num.threads = 1L) 
#> 
#> Type:                             Classification 
#> Number of trees:                  500 
#> Sample size:                      139 
#> Number of independent variables:  60 
#> Mtry:                             7 
#> Target node size:                 1 
#> Variable importance mode:         permutation 
#> Splitrule:                        gini 
#> OOB prediction error:             18.71 % 

# Importance method
print(learner$importance())
#>           V11           V12            V9           V48           V10 
#>  2.907732e-02  2.081168e-02  1.416521e-02  1.013412e-02  9.932664e-03 
#>           V49           V36           V45           V37           V46 
#>  9.218636e-03  7.653640e-03  7.491513e-03  6.358383e-03  5.262950e-03 
#>           V13            V5            V6           V28           V47 
#>  4.958936e-03  4.894211e-03  4.724120e-03  4.619224e-03  4.013001e-03 
#>           V18           V31           V52            V7           V21 
#>  3.949882e-03  3.902219e-03  3.875056e-03  3.814926e-03  3.583567e-03 
#>            V8           V20           V27            V4           V17 
#>  3.404986e-03  3.207793e-03  2.849060e-03  2.683960e-03  2.550693e-03 
#>           V23           V33           V32           V35           V39 
#>  2.433713e-03  2.352639e-03  2.339197e-03  2.055994e-03  1.730373e-03 
#>           V43           V22            V2           V16           V19 
#>  1.703165e-03  1.532948e-03  1.523019e-03  1.436040e-03  1.301010e-03 
#>           V15           V42           V44           V51           V25 
#>  1.252977e-03  1.247813e-03  1.242833e-03  1.235211e-03  1.210964e-03 
#>           V34           V24           V30           V55            V1 
#>  1.164864e-03  1.029063e-03  9.988362e-04  9.534820e-04  9.399139e-04 
#>           V38           V53           V59           V60           V41 
#>  9.338938e-04  7.314214e-04  5.471997e-04  5.172327e-04  4.846357e-04 
#>           V14           V26            V3           V29           V56 
#>  4.457893e-04  4.084666e-04  3.791453e-04  3.259506e-04  2.045161e-04 
#>           V54           V58           V57           V40           V50 
#>  5.106627e-05 -6.653969e-05 -1.476340e-04 -3.734763e-04 -8.806703e-04 

# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)

# Score the predictions
predictions$score()
#> classif.ce 
#>  0.1594203