Remove/Drop Columns in which ALL or SOME values are NAs

Usage

removenacols(df, all = TRUE, ignore = NULL)

removenarows(df, all = TRUE)

numericalonly(df, dropnacols = TRUE, logs = FALSE, natransform = NA)

Arguments

df: Data.frame
all: Boolean. If TRUE, remove only the columns (or rows) in which every value is NA. If FALSE, remove those that contain at least one NA.
ignore: Character vector. Names of columns to exclude from the NA check.
dropnacols: Boolean. Drop columns with only NA values?
logs: Boolean. Calculate log(x)+1 for numerical columns?
natransform: String. Set to "mean" or 0 to impute NA values. If set to NA, no imputation is run.
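
The difference between all = TRUE and all = FALSE can be sketched in base R. This is only an illustration of the behavior described above, not the package's implementation; the toy data frame and the variable names are made up for the example.

```r
# Toy data: column `b` is entirely NA, column `c` has a single NA.
df <- data.frame(
  a = c(1, 2, 3),
  b = c(NA, NA, NA),
  c = c("x", NA, "z")
)

# Like all = TRUE: drop columns in which every value is NA.
cols_all_na <- colSums(is.na(df)) == nrow(df)
drop_all <- df[, !cols_all_na, drop = FALSE]

# Like all = FALSE: drop columns that contain at least one NA.
cols_any_na <- colSums(is.na(df)) > 0
drop_any <- df[, !cols_any_na, drop = FALSE]

# The same idea applied to rows: drop rows with at least one NA.
rows_any_na <- rowSums(is.na(drop_all)) > 0
rows_kept <- drop_all[!rows_any_na, , drop = FALSE]

names(drop_all)
names(drop_any)
nrow(rows_kept)
```

Here drop_all keeps columns a and c, drop_any keeps only a, and rows_kept retains the two rows of drop_all that have no NA.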

Value

removenacols: data.frame with the NA columns removed.

removenarows: data.frame with the NA rows removed.

numericalonly: data.frame with only the numerical columns selected.

Examples

data(dft) # Titanic dataset
str(dft)
#> 'data.frame':	891 obs. of  11 variables:
#>  $ PassengerId: int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ Survived   : logi  FALSE TRUE TRUE TRUE FALSE FALSE ...
#>  $ Pclass     : Factor w/ 3 levels "1","2","3": 3 1 3 1 3 3 1 3 3 2 ...
#>  $ Sex        : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 2 1 1 ...
#>  $ Age        : num  22 38 26 35 35 NA 54 2 27 14 ...
#>  $ SibSp      : int  1 1 0 1 0 0 0 3 0 1 ...
#>  $ Parch      : int  0 0 0 0 0 0 0 1 2 0 ...
#>  $ Ticket     : chr  "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
#>  $ Fare       : num  7.25 71.28 7.92 53.1 8.05 ...
#>  $ Cabin      : Factor w/ 148 levels "","A10","A14",..: 1 83 1 57 1 1 131 1 1 1 ...
#>  $ Embarked   : Factor w/ 4 levels "","C","Q","S": 4 2 4 4 4 3 4 4 4 2 ...
numericalonly(dft) %>% head()
#>   PassengerId Age SibSp Parch    Fare Survived Sex
#> 1           1  22     1     0  7.2500        0   1
#> 2           2  38     1     0 71.2833        1   0
#> 3           3  26     0     0  7.9250        1   0
#> 4           4  35     1     0 53.1000        1   0
#> 5           5  35     0     0  8.0500        0   1
#> 6           6  NA     0     0  8.4583        0   1
numericalonly(dft, natransform = "mean") %>% head()
#>   PassengerId      Age SibSp Parch    Fare Survived Sex
#> 1           1 22.00000     1     0  7.2500        0   1
#> 2           2 38.00000     1     0 71.2833        1   0
#> 3           3 26.00000     0     0  7.9250        1   0
#> 4           4 35.00000     1     0 53.1000        1   0
#> 5           5 35.00000     0     0  8.0500        0   1
#> 6           6 29.69912     0     0  8.4583        0   1
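
In the second output above, the missing Age in row 6 is replaced by the column mean. A minimal base-R sketch of that kind of imputation follows; it is an assumption about the general technique, not the package's code. It uses only the six Age values shown, so its mean (31.2) differs from the 29.69912 computed over all 891 rows.

```r
# First six Age values from the example; the last one is missing.
age <- c(22, 38, 26, 35, 35, NA)

# Mean imputation: replace NA with the mean of the observed values.
age_mean <- mean(age, na.rm = TRUE)
age_imputed <- ifelse(is.na(age), age_mean, age)

# Zero imputation: replace NA with 0 instead.
age_zero <- ifelse(is.na(age), 0, age)

age_imputed
age_zero
```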
