GLM with Elastic Net Regularization Regression Learner
Source: R/LearnerRegrCVGlmnet.R
mlr_learners_regr.cv_glmnet.Rd
Generalized linear models with elastic net regularization.
Calls glmnet::cv.glmnet() from package glmnet.
Supported family values are "gaussian" and "poisson".
The default for the hyperparameter family is "gaussian".
Dictionary
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners
or with the associated sugar function mlr3::lrn():
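A minimal sketch of both routes, assuming the mlr3 and mlr3learners packages are installed and attached; the key "regr.cv_glmnet" is the one used in the Examples section:

```r
library(mlr3)
library(mlr3learners)

# Retrieve the learner from the dictionary by its key
learner_dict = mlr_learners$get("regr.cv_glmnet")

# Equivalent shortcut via the sugar function
learner_sugar = lrn("regr.cv_glmnet")
```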
Meta Information
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, glmnet
Parameters
| Id | Type | Default | Levels | Range |
| lambda | untyped | NULL | | - |
| type.measure | character | deviance | deviance, mse, mae | - |
| nfolds | integer | 10 | | \([3, \infty)\) |
| foldid | untyped | NULL | | - |
| alignment | character | lambda | lambda, fraction | - |
| grouped | logical | TRUE | TRUE, FALSE | - |
| keep | logical | FALSE | TRUE, FALSE | - |
| parallel | logical | FALSE | TRUE, FALSE | - |
| gamma | untyped | c(0, 0.25, 0.5, 0.75, 1) | | - |
| relax | logical | FALSE | TRUE, FALSE | - |
| trace.it | integer | 0 | | \([0, 1]\) |
| family | character | - | gaussian, poisson | - |
| alpha | numeric | 1 | | \([0, 1]\) |
| nlambda | integer | 100 | | \([1, \infty)\) |
| lambda.min.ratio | numeric | - | | \([0, 1]\) |
| standardize | logical | TRUE | TRUE, FALSE | - |
| intercept | logical | TRUE | TRUE, FALSE | - |
| exclude | untyped | NULL | | - |
| penalty.factor | untyped | - | | - |
| lower.limits | untyped | -Inf | | - |
| upper.limits | untyped | Inf | | - |
| type.gaussian | character | - | covariance, naive | - |
| maxp | integer | - | | \([1, \infty)\) |
| path | logical | FALSE | TRUE, FALSE | - |
| fdev | numeric | 1e-05 | | \([0, 1]\) |
| devmax | numeric | 0.999 | | \([0, 1]\) |
| eps | numeric | 1e-06 | | \([0, 1]\) |
| big | numeric | 9.9e+35 | | \((-\infty, \infty)\) |
| mnlam | integer | 5 | | \((-\infty, \infty)\) |
| pmin | numeric | 1e-09 | | \([0, 1]\) |
| exmx | numeric | 250 | | \((-\infty, \infty)\) |
| prec | numeric | 1e-10 | | \((-\infty, \infty)\) |
| mxit | integer | 100 | | \([1, \infty)\) |
| epsnr | numeric | 1e-06 | | \([0, 1]\) |
| mxitnr | integer | 25 | | \([1, \infty)\) |
| thresh | numeric | 1e-07 | | \([0, \infty)\) |
| maxit | integer | 100000 | | \([1, \infty)\) |
| dfmax | integer | NULL | | \((-\infty, \infty)\) |
| pmax | integer | NULL | | \((-\infty, \infty)\) |
| s | numeric | lambda.1se | | \([0, \infty)\) |
| predict.gamma | numeric | gamma.1se | | \([0, 1]\) |
| exact | logical | FALSE | TRUE, FALSE | - |
| use_pred_offset | logical | - | TRUE, FALSE | - |
| seed | integer | - | | \((-\infty, \infty)\) |
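Hyperparameters from the table can be passed at construction or changed later through the learner's parameter set. The following is a minimal sketch, assuming mlr3 and mlr3learners are attached; the chosen values (alpha = 0.5, nfolds = 5, standardize = FALSE) are purely illustrative and not recommendations.

```r
library(mlr3)
library(mlr3learners)

# Set hyperparameters at construction: an equal mix of lasso and ridge
# penalties (alpha = 0.5) and 5-fold cross-validation (illustrative values)
learner = lrn("regr.cv_glmnet", alpha = 0.5, nfolds = 5L)

# Change a single value afterwards via the parameter set
learner$param_set$values$standardize = FALSE

# Inspect the values that are currently set
print(learner$param_set$values)
```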
Custom mlr3 parameters
seed: Optional integer used to seed the call to glmnet::cv.glmnet(), making its random fold assignment, and therefore the selected lambda, reproducible.
The global random state is reset afterwards, so it is left unchanged.
Defaults to NA, in which case no seed is set and the global random state is used.
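As a sketch of the reproducibility this provides, the snippet below trains two independent learners with the same seed and compares the lambda each one selects. It assumes an installed mlr3learners version that includes the seed parameter; the seed value 42 and the mtcars task are arbitrary illustrative choices.

```r
library(mlr3)
library(mlr3learners)

task = tsk("mtcars")

# Two independent learners sharing the same seed (42 is arbitrary)
learner_a = lrn("regr.cv_glmnet", seed = 42L)
learner_b = lrn("regr.cv_glmnet", seed = 42L)

learner_a$train(task)
learner_b$train(task)

# With identical fold assignments, both fits should select the same lambda
same_lambda = isTRUE(all.equal(
  learner_a$model$lambda.1se,
  learner_b$model$lambda.1se
))
print(same_lambda)
```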
Offset
If a Task contains a column with the offset role,
it is automatically incorporated during training via the offset argument in glmnet::glmnet().
During prediction, the offset column of the test set is passed via the newoffset argument of glmnet::predict.glmnet() only if use_pred_offset = TRUE (the default).
If the user sets use_pred_offset = FALSE, a zero offset is applied instead,
effectively disabling the offset adjustment during prediction.
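The behaviour described above can be sketched on simulated count data. Everything in the data set below (the columns x1, x2, log_exposure and the response y) is invented for illustration, and the snippet assumes an mlr3 version whose tasks support the offset column role.

```r
library(mlr3)
library(mlr3learners)

# Simulated counts with a known log-exposure term (hypothetical data)
set.seed(1)
n = 200
dat = data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  log_exposure = log(runif(n, 1, 10))
)
dat$y = rpois(n, lambda = exp(0.3 * dat$x1 + dat$log_exposure))

# Mark the exposure column as the offset instead of a feature
task = as_task_regr(dat, target = "y")
task$set_col_roles("log_exposure", roles = "offset")

ids = partition(task)
learner = lrn("regr.cv_glmnet", family = "poisson", use_pred_offset = TRUE)
learner$train(task, row_ids = ids$train)

# Prediction that uses the offset column of the test rows
pred_with = learner$predict(task, row_ids = ids$test)

# Prediction with a zero offset applied instead
learner$param_set$values$use_pred_offset = FALSE
pred_without = learner$predict(task, row_ids = ids$test)
```

Comparing pred_with$response and pred_without$response shows how much of each prediction is driven by the offset rather than by the fitted coefficients.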
References
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
Super classes
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrCVGlmnet
Methods
LearnerRegrCVGlmnet$selected_features()
Returns the set of selected features as reported by glmnet::predict.glmnet()
with type set to "nonzero".
Arguments
lambda (numeric(1))
Custom lambda, defaults to the active lambda depending on parameter set.
Returns
(character()) of feature names.
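A minimal usage sketch, assuming mlr3 and mlr3learners are attached; the mtcars task and the custom lambda of 0.5 are arbitrary illustrative choices:

```r
library(mlr3)
library(mlr3learners)

task = tsk("mtcars")
learner = lrn("regr.cv_glmnet")
learner$train(task)

# Features with non-zero coefficients at the active lambda
selected_default = learner$selected_features()

# Features with non-zero coefficients at a custom lambda (0.5 is arbitrary)
selected_custom = learner$selected_features(lambda = 0.5)

print(selected_default)
print(selected_custom)
```

A larger lambda penalizes coefficients more strongly, so it typically yields a shorter list of selected features.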
Examples
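# Attach the packages that provide lrn(), tsk(), partition() and the learner
library(mlr3)
library(mlr3learners)
# Define the Learner and set parameter values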
learner = lrn("regr.cv_glmnet")
print(learner)
#>
#> ── <LearnerRegrCVGlmnet> (regr.cv_glmnet): GLM with Elastic Net Regularization ─
#> • Model: -
#> • Parameters: family=gaussian, use_pred_offset=TRUE, seed=NA
#> • Packages: mlr3, mlr3learners, and glmnet
#> • Predict Types: [response]
#> • Feature Types: logical, integer, and numeric
#> • Encapsulation: none (fallback: -)
#> • Properties: offset, selected_features, and weights
#> • Other settings: use_weights = 'use', predict_raw = 'FALSE'
# Define a Task
task = tsk("mtcars")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
# Print the model
print(learner$model)
#>
#> Call: glmnet::cv.glmnet(x = data, y = target, family = "gaussian")
#>
#> Measure: Mean-Squared Error
#>
#> Lambda Index Measure SE Nonzero
#> min 0.2398 35 9.745 3.084 7
#> 1se 1.8570 13 12.761 4.231 3
# Importance method
if ("importance" %in% learner$properties) print(learner$importance())
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
#> regr.mse
#> 4.415262