
This function automatically splits a dataframe into train and test datasets. You can set a seed to get the same split every time; if you don't, a default seed is used. You can also prevent it from printing the split counts.

Usage

msplit(df, size = 0.7, seed = 0, print = TRUE)

Arguments

df

Dataframe

size

Numeric. Split rate, between 0 and 1: the proportion of rows assigned to the train set. If set to 1, the train and test sets will be the same.

seed

Integer. Seed for the random split. Using the same seed gives the same split.

print

Boolean. Print summary results?

Value

List with both datasets (train and test), a summary of the split counts, the split rate, and the train index.
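To make the roles of size and seed concrete, here is a minimal base R sketch of a seeded train/test split. It is an illustration under stated assumptions, not the package's implementation: the helper name simple_split, the floor() rounding, and the use of sample() are all hypothetical choices. The rounding is at least consistent with the counts in the example on this page (623 of 891 rows at size = 0.7).

```r
# Hypothetical helper: NOT the implementation behind msplit(), only a sketch
# of how a seeded random train/test split can be done in base R.
simple_split <- function(df, size = 0.7, seed = 0) {
  stopifnot(is.data.frame(df), is.numeric(size), size > 0, size <= 1)
  set.seed(seed)                               # same seed, same split
  n <- nrow(df)
  n_train <- floor(n * size)                   # assumed rounding rule
  train_index <- sample(seq_len(n), n_train)   # rows drawn for the train set
  test_index <- setdiff(seq_len(n), train_index)
  train <- df[train_index, , drop = FALSE]
  # Mirror the documented behavior: with size = 1, test equals train
  test <- if (size == 1) train else df[test_index, , drop = FALSE]
  list(
    train = train,
    test = test,
    summary = c(train_size = nrow(train), test_size = nrow(test)),
    split_size = size,
    train_index = train_index
  )
}

parts <- simple_split(mtcars, size = 0.7, seed = 123)
parts$summary
again <- simple_split(mtcars, size = 0.7, seed = 123)
identical(parts$train_index, again$train_index)
```

With the 32-row mtcars dataset this puts 22 rows in train and 10 in test, and repeating the call with the same seed selects the same rows.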

Examples

data(dft) # Titanic dataset
splits <- msplit(dft, size = 0.7, seed = 123)
#> train_size  test_size 
#>        623        268 
names(splits)
#> [1] "train"       "test"        "summary"     "split_size"  "train_index"
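
As a further illustration of the seed and print arguments, the sketch below calls msplit() twice with the same seed and print = FALSE, then compares the train_index element listed above. It assumes the package providing msplit() and the dft dataset is lares and that it may not be installed, so the call is guarded. The variable name same_split is hypothetical. If the seed fully determines the split, as the description states, the comparison should be TRUE.

```r
# Assumption: msplit() and dft come from the lares package.
same_split <- NA  # stays NA if the package is not available
if (requireNamespace("lares", quietly = TRUE)) {
  data("dft", package = "lares")  # Titanic dataset
  a <- lares::msplit(dft, size = 0.7, seed = 123, print = FALSE)
  b <- lares::msplit(dft, size = 0.7, seed = 123, print = FALSE)
  # Compare which rows were selected for the train set in each run
  same_split <- identical(a$train_index, b$train_index)
}
same_split
```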