Most Relevant Features Using Lasso Regression

Use Lasso regression to identify the variables most relevant for predicting another variable. You might want to compare the results with those of corr_var() and/or x2y() to complement the analysis. There is no need to standardize, center, or scale your data beforehand. Tidyverse friendly.

Usage

lasso_vars(
  df,
  variable,
  ignore = NULL,
  nlambdas = 100,
  nfolds = 10,
  top = 20,
  quiet = FALSE,
  seed = 123,
  ...
)

Arguments

df: Dataframe. Any dataframe is valid, as ohse() will be applied to process categorical values and values will be standardized automatically.
variable: Variable. Dependent variable or response.
ignore: Character vector. Variables to exclude from study.
nlambdas: Integer. Number of lambdas to be used in a search.
nfolds: Integer. Number of folds for K-fold cross-validation (>= 2).
top: Integer. Plot top n results only.
quiet: Boolean. Keep quiet? If not, informative messages will be shown.
seed: Numeric. Seed for reproducible results.
...: Additional parameters passed to ohse().
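
The arguments above describe a pipeline of one-hot encoding, standardization, and a lasso fit. As a rough illustration of that idea only, here is a minimal base R sketch; it is not the code behind lasso_vars(), whose internals this page does not describe. The helpers soft_threshold() and lasso_cd() are hypothetical, the built-in mtcars data stands in for your dataframe, and a single fixed lambda replaces the cross-validated search controlled by nlambdas and nfolds.

```r
# Illustrative sketch only: lasso via coordinate descent in base R.
# soft_threshold() and lasso_cd() are hypothetical helpers, not package functions.
soft_threshold <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Minimizes (1 / (2n)) * ||y - x %*% beta||^2 + lambda * sum(abs(beta))
lasso_cd <- function(x, y, lambda, iters = 200) {
  n <- nrow(x)
  beta <- rep(0, ncol(x))
  for (it in seq_len(iters)) {
    for (j in seq_len(ncol(x))) {
      # Partial residual that leaves out feature j
      r <- y - x[, -j, drop = FALSE] %*% beta[-j]
      rho <- sum(x[, j] * r) / n
      beta[j] <- soft_threshold(rho, lambda) / (sum(x[, j]^2) / n)
    }
  }
  setNames(beta, colnames(x))
}

# One-hot encode a categorical column and drop the intercept column
df <- transform(mtcars, cyl = factor(cyl))
x <- model.matrix(mpg ~ ., data = df)[, -1]

# Standardize the predictors and center the response
x <- scale(x)
y <- df$mpg - mean(df$mpg)

# Fit with one fixed penalty (assumed value) and rank the surviving variables
coefs <- lasso_cd(x, y, lambda = 1)
ranked <- sort(abs(coefs[coefs != 0]), decreasing = TRUE)
print(ranked)
```

Variables whose coefficients are shrunk to exactly zero drop out of the ranking, which is what makes lasso usable as a feature selector. A larger lambda removes more variables.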

Value

List. Contains lasso model coefficients, performance metrics, the actual model fitted and a plot.

Examples

if (FALSE) { # \dontrun{
# CRAN
Sys.unsetenv("LARES_FONT") # Temporary
data(dft) # Titanic dataset

m <- lasso_vars(dft, Survived, ignore = c("Cabin"))
print(m$coef)
print(m$metrics)
plot(m$plot)
} # }
