Inspects a data.table, identifies character columns that appear to hold numeric data, and converts them to numeric in place. Such columns can arise, for example, when one or more fields of an otherwise numeric column contain non-numeric characters, so the whole column is read in as character.
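The idea can be illustrated with a minimal sketch. This is not the package's implementation: the helper name `sketch_autotypes` and the `min_numeric_frac` threshold are hypothetical, and the actual rule used by `dt_set_autotypes()` to decide which columns to convert may differ.

```r
library(data.table)

# Hypothetical helper, for illustration only (not part of the package).
# Converts a character column in place when more than `min_numeric_frac`
# of its non-missing values parse as numbers. The threshold is an assumption.
sketch_autotypes <- function(x, cols = NULL, min_numeric_frac = 0.5) {
  stopifnot(is.data.table(x))
  if (is.null(cols)) cols <- names(x)
  for (col in cols) {
    values <- x[[col]]
    if (!is.character(values)) next
    # Strings that are not numbers become NA; the coercion warning is expected
    parsed <- suppressWarnings(as.numeric(values))
    n_present <- sum(!is.na(values))
    if (n_present == 0L) next
    frac_numeric <- sum(!is.na(parsed)) / n_present
    if (frac_numeric > min_numeric_frac) {
      # data.table::set() replaces the column by reference, without copying x
      set(x, j = col, value = parsed)
    }
  }
  invisible(x)
}

dt <- data.table(
  a = c("3", "5", "undefined", "21", "4", NA),
  b = c("mango", "banana", "tangerine", NA, "apple", "kiwi")
)
sketch_autotypes(dt)
str(dt)
```

In this sketch, column `a` is converted because four of its five non-missing values parse as numbers, while column `b` is left as character because none do.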

Usage

dt_set_autotypes(x, cols = NULL, verbosity = 1L)

Arguments

x

data.table: Input data.table. Will be modified in-place, if needed.

cols

Character vector: Columns to work on. If NULL (the default), all columns are considered.

verbosity

Integer: Verbosity level.

Value

data.table, invisibly.
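Because the input is modified in place, any other name bound to the same data.table sees the change as well. The sketch below shows this general data.table behavior using `data.table::set()` as a stand-in for the conversion step (an assumption made so the example runs without the package); `copy()` is the usual way to keep an untouched version.

```r
library(data.table)

x <- data.table(a = c("1", "2", "3"))
alias <- x          # second name for the same object, no copy is made
backup <- copy(x)   # independent copy, unaffected by later in-place changes

# Stand-in for an in-place type conversion
set(x, j = "a", value = as.numeric(x[["a"]]))

class(x[["a"]])
class(alias[["a"]])
class(backup[["a"]])
```

Here `x` and `alias` both end up with a numeric column `a`, while `backup` keeps the original character column.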

Author

EDG

Examples

library(data.table)
x <- data.table(
  id = 8001:8006,
  a = c("3", "5", "undefined", "21", "4", NA),
  b = c("mango", "banana", "tangerine", NA, "apple", "kiwi"),
  c = c(1, 2, 3, 4, 5, 6)
)
str(x)
#> Classes ‘data.table’ and 'data.frame':	6 obs. of  4 variables:
#>  $ id: int  8001 8002 8003 8004 8005 8006
#>  $ a : chr  "3" "5" "undefined" "21" ...
#>  $ b : chr  "mango" "banana" "tangerine" NA ...
#>  $ c : num  1 2 3 4 5 6
#>  - attr(*, ".internal.selfref")=<externalptr> 
# ***in-place*** operation means no assignment is needed
dt_set_autotypes(x)
#> 2026-02-22 18:59:26 
#> Converting a to numeric
#>  [dt_set_autotypes]
str(x)
#> Classes ‘data.table’ and 'data.frame':	6 obs. of  4 variables:
#>  $ id: int  8001 8002 8003 8004 8005 8006
#>  $ a : num  3 5 NA 21 4 NA
#>  $ b : chr  "mango" "banana" "tangerine" NA ...
#>  $ c : num  1 2 3 4 5 6
#>  - attr(*, ".internal.selfref")=<externalptr> 

# Try excluding column 'a' from autotyping
x <- data.table(
  id = 8001:8006,
  a = c("3", "5", "undefined", "21", "4", NA),
  b = c("mango", "banana", "tangerine", NA, "apple", "kiwi"),
  c = c(1, 2, 3, 4, 5, 6)
)
str(x)
#> Classes ‘data.table’ and 'data.frame':	6 obs. of  4 variables:
#>  $ id: int  8001 8002 8003 8004 8005 8006
#>  $ a : chr  "3" "5" "undefined" "21" ...
#>  $ b : chr  "mango" "banana" "tangerine" NA ...
#>  $ c : num  1 2 3 4 5 6
#>  - attr(*, ".internal.selfref")=<externalptr> 
# pass all column names except 'a' to `cols`; 'a' stays character
dt_set_autotypes(x, cols = setdiff(names(x), "a"))
str(x)
#> Classes ‘data.table’ and 'data.frame':	6 obs. of  4 variables:
#>  $ id: int  8001 8002 8003 8004 8005 8006
#>  $ a : chr  "3" "5" "undefined" "21" ...
#>  $ b : chr  "mango" "banana" "tangerine" NA ...
#>  $ c : num  1 2 3 4 5 6
#>  - attr(*, ".internal.selfref")=<externalptr>