Create resamples of your data, e.g. for model building or validation. "KFold" creates stratified folds, "StratSub" creates stratified subsamples, "Bootstrap" gives the standard bootstrap, i.e. random sampling with replacement, and "StratBoot" first draws a stratified subsample as in "StratSub" and then randomly duplicates some of the training cases to reach the original length of the input (default) or the length defined by target_length.
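The differences between these resampling types can be illustrated with a minimal base R sketch. This is not the package's implementation: the helpers make_strata, strat_sub, boot, and strat_boot are hypothetical names, and stratifying a continuous outcome by quantile bins (4 bins, 75% training fraction) is an assumption chosen to mirror the defaults printed in the Examples.

```r
# Hypothetical base R helpers for illustration only; not the package internals.
set.seed(2024)
y <- rnorm(200)

# Assumption: a continuous outcome is stratified by cutting it into quantile bins.
make_strata <- function(x, n_bins = 4L) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = n_bins + 1L))
  cut(x, breaks = unique(breaks), include.lowest = TRUE, labels = FALSE)
}

# Stratified subsample: draw a fraction train_p of the indices within each stratum,
# without replacement.
strat_sub <- function(x, train_p = 0.75, n_bins = 4L) {
  strata <- make_strata(x, n_bins)
  idx <- lapply(split(seq_along(x), strata), function(i) {
    i[sample.int(length(i), size = round(train_p * length(i)))]
  })
  sort(unlist(idx, use.names = FALSE))
}

# Standard bootstrap: sample all indices with replacement.
boot <- function(x) sort(sample.int(length(x), replace = TRUE))

# Stratified bootstrap: start from a stratified subsample, then duplicate
# randomly chosen training cases until target_length is reached.
strat_boot <- function(x, train_p = 0.75, n_bins = 4L,
                       target_length = length(x)) {
  base <- strat_sub(x, train_p, n_bins)
  n_extra <- max(0L, target_length - length(base))
  extra <- base[sample.int(length(base), size = n_extra, replace = TRUE)]
  sort(c(base, extra))
}

ss <- strat_sub(y)   # unique indices, shorter than y
bt <- boot(y)        # same length as y, duplicates likely
sb <- strat_boot(y)  # same length as y, drawn only from a stratified subsample
```

In this sketch, a stratified bootstrap differs from a standard bootstrap in that the cases left out of the initial subsample are never drawn, so a held-out set of known proportion remains available for testing.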

Usage

resample(x, config = setup_Resampler(), verbosity = 1L)

Arguments

x

Vector or data.frame: Usually the outcome; NROW(x) defines the sample size.

config

Resampler object created by setup_Resampler.

verbosity

Integer: Verbosity level.

Value

Resampler object.

Details

Note that option "KFold" may produce resamples of slightly different lengths, for example when the sample size is not an exact multiple of the number of folds. Avoid operations that rely on equal-length vectors: resamples cannot be stored as columns of a data.frame and must be kept in a list instead.
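The unequal-length issue can be reproduced in base R without the package. The following is a minimal sketch under simplifying assumptions: fold assignment is purely random (no stratification), each resample holds the training indices for its fold, and the names fold_id, train_idx, and df_attempt are illustrative only.

```r
set.seed(1)
n <- 203L  # sample size that is not a multiple of k
k <- 10L

# Assign each case to one held-out fold; rep_len() recycles 1..k to length n,
# so the first three folds hold out one more case than the rest.
fold_id <- sample(rep_len(seq_len(k), n))

# Training indices per fold: every case not held out in that fold.
train_idx <- lapply(seq_len(k), function(f) which(fold_id != f))
names(train_idx) <- paste0("Fold_", seq_len(k))

# Lengths differ across folds, so a list is the appropriate container.
fold_lengths <- lengths(train_idx)

# Coercing to a data.frame fails because the columns have different lengths.
df_attempt <- try(as.data.frame(train_idx), silent = TRUE)
inherits(df_attempt, "try-error")
```

Iterating with lapply() or a for loop over the list works regardless of fold length, which is why a list is the safer container here.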

Author

EDG

Examples

y <- rnorm(200)
# 10-fold (stratified)
y_10fold <- resample(y, setup_Resampler(10L, "kfold"))
y_10fold
#> <Resampler>
#>      type: KFold
#> resamples: 
#>             Fold_1: 1, 2, 3, 4...
#>             Fold_2: 1, 2, 3, 4...
#>             Fold_3: 1, 2, 3, 4...
#>             Fold_4: 3, 4, 5, 6...
#>             Fold_5: 1, 2, 3, 4...
#>             Fold_6: 1, 2, 3, 4...
#>             Fold_7: 1, 2, 3, 4...
#>             Fold_8: 1, 2, 4, 5...
#>             Fold_9: 1, 2, 3, 4...
#>            Fold_10: 1, 2, 3, 5...
#>    config:
#>            <KFoldConfig>
#>                       n: 10
#>            stratify_var: NULL
#>            strat_n_bins: 4
#>                id_strat: NULL
#>                    seed: NULL
# 25 stratified subsamples
y_25strat <- resample(y, setup_Resampler(25L, "stratsub"))
y_25strat
#> <Resampler>
#>      type: StratSub
#> resamples: 
#>            Showing first 12 of 25 items.
#>             Subsample_1: 1, 2, 3, 5...
#>             Subsample_2: 1, 2, 3, 5...
#>             Subsample_3: 2, 3, 4, 5...
#>             Subsample_4: 2, 4, 5, 6...
#>             Subsample_5: 1, 2, 4, 5...
#>             Subsample_6: 2, 3, 5, 8...
#>             Subsample_7: 1, 2, 3, 5...
#>             Subsample_8: 1, 3, 4, 6...
#>             Subsample_9: 1, 3, 4, 6...
#>            Subsample_10: 1, 2, 3, 4...
#>            Subsample_11: 1, 2, 3, 4...
#>            Subsample_12: 1, 2, 4, 7...
#>            ...13 more items not shown.
#>    config:
#>            <StratSubConfig>
#>                       n: 25
#>                 train_p: 0.75
#>            stratify_var: NULL
#>            strat_n_bins: 4
#>                id_strat: NULL
#>                    seed: NULL
# 100 stratified bootstraps
y_100strat <- resample(y, setup_Resampler(100L, "stratboot"))
y_100strat
#> <Resampler>
#>      type: StratBoot
#> resamples: 
#>            Showing first 12 of 100 items.
#>              StratBoot_1: 1, 1, 2, 2...
#>              StratBoot_2: 1, 2, 3, 4...
#>              StratBoot_3: 1, 3, 4, 4...
#>              StratBoot_4: 1, 2, 3, 3...
#>              StratBoot_5: 1, 2, 3, 5...
#>              StratBoot_6: 1, 4, 4, 5...
#>              StratBoot_7: 1, 4, 5, 6...
#>              StratBoot_8: 1, 2, 3, 4...
#>              StratBoot_9: 1, 2, 2, 5...
#>             StratBoot_10: 1, 1, 2, 2...
#>             StratBoot_11: 1, 1, 2, 3...
#>             StratBoot_12: 2, 2, 3, 3...
#>            ...88 more items not shown.
#>    config:
#>            <StratBootConfig>
#>                        n: 100
#>             stratify_var: NULL
#>                  train_p: 0.75
#>             strat_n_bins: 4
#>            target_length: NULL
#>                 id_strat: NULL
#>                     seed: NULL
# LOOCV
y_loocv <- resample(y, setup_Resampler(type = "LOOCV"))
y_loocv
#> <Resampler>
#>      type: LOOCV
#> resamples: 
#>            Showing first 12 of 200 items.
#>              Fold_1: 2, 3, 4, 5...
#>              Fold_2: 1, 3, 4, 5...
#>              Fold_3: 1, 2, 4, 5...
#>              Fold_4: 1, 2, 3, 5...
#>              Fold_5: 1, 2, 3, 4...
#>              Fold_6: 1, 2, 3, 4...
#>              Fold_7: 1, 2, 3, 4...
#>              Fold_8: 1, 2, 3, 4...
#>              Fold_9: 1, 2, 3, 4...
#>             Fold_10: 1, 2, 3, 4...
#>             Fold_11: 1, 2, 3, 4...
#>             Fold_12: 1, 2, 3, 4...
#>            ...188 more items not shown.
#>    config:
#>            <LOOCVConfig>
#>            n: 200