Random regression forest.
Calls ranger::ranger()
from package ranger.
Dictionary
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
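For example, either constructor can be used (a minimal sketch using the standard mlr3 accessors; not copied from the original page):

library(mlr3)
library(mlr3learners)

# via the dictionary
mlr_learners$get("regr.ranger")
# via the sugar function
lrn("regr.ranger")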
Meta Information
Task type: “regr”
Predict Types: “response”, “se”, “quantiles”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, ranger
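As a brief illustration of the extra predict types, the sketch below requests standard-error predictions; it assumes mlr3 and mlr3learners are loaded and that keep.inbag = TRUE is required for the jackknife-based standard errors (an assumption, not stated on the original page):

# Request standard-error predictions; keep.inbag = TRUE retains the inbag
# counts used by the jackknife / infinitesimal-jackknife estimators.
learner = lrn("regr.ranger", predict_type = "se", keep.inbag = TRUE)
learner$train(tsk("mtcars"))
learner$predict(tsk("mtcars"))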
Parameters
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| alpha | numeric | 0.5 | - | \((-\infty, \infty)\) |
| always.split.variables | untyped | - | - | - |
| holdout | logical | FALSE | TRUE, FALSE | - |
| importance | character | - | none, impurity, impurity_corrected, permutation | - |
| keep.inbag | logical | FALSE | TRUE, FALSE | - |
| max.depth | integer | NULL | - | \([0, \infty)\) |
| min.bucket | integer | 1 | - | \([1, \infty)\) |
| min.node.size | integer | 5 | - | \([1, \infty)\) |
| minprop | numeric | 0.1 | - | \((-\infty, \infty)\) |
| mtry | integer | - | - | \([1, \infty)\) |
| mtry.ratio | numeric | - | - | \([0, 1]\) |
| node.stats | logical | FALSE | TRUE, FALSE | - |
| num.random.splits | integer | 1 | - | \([1, \infty)\) |
| num.threads | integer | 1 | - | \([1, \infty)\) |
| num.trees | integer | 500 | - | \([1, \infty)\) |
| oob.error | logical | TRUE | TRUE, FALSE | - |
| regularization.factor | untyped | 1 | - | - |
| regularization.usedepth | logical | FALSE | TRUE, FALSE | - |
| replace | logical | TRUE | TRUE, FALSE | - |
| respect.unordered.factors | character | ignore | ignore, order, partition | - |
| sample.fraction | numeric | - | - | \([0, 1]\) |
| save.memory | logical | FALSE | TRUE, FALSE | - |
| scale.permutation.importance | logical | FALSE | TRUE, FALSE | - |
| se.method | character | infjack | jack, infjack | - |
| seed | integer | NULL | - | \((-\infty, \infty)\) |
| split.select.weights | untyped | NULL | - | - |
| splitrule | character | variance | variance, extratrees, maxstat | - |
| verbose | logical | TRUE | TRUE, FALSE | - |
| write.forest | logical | TRUE | TRUE, FALSE | - |
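Parameter values can be supplied at construction or changed afterwards via the param_set. A minimal sketch (assuming mlr3 and mlr3learners are loaded; parameter names as listed above):

# Set hyperparameters at construction ...
learner = lrn("regr.ranger", num.trees = 1000, min.node.size = 3)
# ... or change them later on the existing learner
learner$param_set$values$num.trees = 250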
Custom mlr3 parameters
mtry
: This hyperparameter can alternatively be set via the added hyperparameter mtry.ratio as mtry = max(ceiling(mtry.ratio * n_features), 1). Note that mtry and mtry.ratio are mutually exclusive.
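For example (a sketch, assuming mlr3 and mlr3learners are loaded; the mtcars task has 10 features, so mtry.ratio = 0.5 resolves to mtry = max(ceiling(0.5 * 10), 1) = 5):

learner = lrn("regr.ranger", mtry.ratio = 0.5)
learner$train(tsk("mtcars"))
learner$model$mtry  # resolved to 5 for the 10-feature mtcars task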
Initial parameter values
num.threads
: Actual default: NULL, which triggers auto-detection of the number of CPUs.
Adjusted value: 1.
Reason for change: conflicts with parallelization via the future package.
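To restore ranger's internal multithreading when not parallelizing via future, the value can simply be raised (a sketch, assuming mlr3 and mlr3learners are loaded):

learner = lrn("regr.ranger")
learner$param_set$values$num.threads = 4  # use 4 threads inside ranger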
References
Wright, Marvin N., Ziegler, Andreas (2017). “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software, 77(1), 1–17. doi:10.18637/jss.v077.i01.
Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.svm, mlr_learners_regr.xgboost
Super classes
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrRanger
Methods
Method importance()
The importance scores are extracted from the model slot variable.importance.
Parameter importance must be set to "impurity", "impurity_corrected", or "permutation".
Returns
Named numeric().
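A minimal sketch (not taken from the original page, assuming mlr3 and mlr3learners are loaded): request an importance mode at construction so that $importance() returns scores.

learner = lrn("regr.ranger", importance = "permutation")
learner$train(tsk("mtcars"))
learner$importance()  # named numeric vector of importance scores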
Examples
if (requireNamespace("ranger", quietly = TRUE)) {
# Define the Learner and set parameter values
learner = lrn("regr.ranger")
print(learner)
# Define a Task
task = tsk("mtcars")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
# print the model
print(learner$model)
# importance method: this prints the method object itself; calling
# learner$importance() requires the "importance" hyperparameter to be set
if ("importance" %in% learner$properties) print(learner$importance)
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
}
#> <LearnerRegrRanger:regr.ranger>: Random Forest
#> * Model: -
#> * Parameters: num.threads=1
#> * Packages: mlr3, mlr3learners, ranger
#> * Predict Types: [response], se, quantiles
#> * Feature Types: logical, integer, numeric, character, factor, ordered
#> * Properties: hotstart_backward, importance, oob_error, weights
#> Ranger result
#>
#> Call:
#> ranger::ranger(dependent.variable.name = task$target_names, data = task$data(), case.weights = task$weights$weight, num.threads = 1L)
#>
#> Type: Regression
#> Number of trees: 500
#> Sample size: 21
#> Number of independent variables: 10
#> Mtry: 3
#> Target node size: 5
#> Variable importance mode: none
#> Splitrule: variance
#> OOB prediction error (MSE): 6.165724
#> R squared (OOB): 0.8537504
#> function ()
#> .__LearnerRegrRanger__importance(self = self, private = private,
#> super = super)
#> <environment: 0x555e26e45300>
#> regr.mse
#> 4.799664