Skip to contents

Advanced Machine Learning & Visualization made efficient, accessible, reproducible

Online Documentation and Vignettes

https://rdocs.rtemis.org

System Setup

There are some options you can define in your .Rprofile (usually found in your home directory), so you do not have to define each time you execute a function.

rtemis_theme

General plotting theme; set to e.g. "whiteigrid" or "darkgraygrid"

rtemis_font

Font family to use in plots.

rtemis_palette

Name of default palette to use in plots. See options by running get_palette()

Visualization

Graphics are handled using the draw family, which produces interactive plots primarily using plotly and other packages.

Supervised Learning

By convention, the last column of the data is the outcome variable, and all other columns are predictors. Convenience function set_outcome can be used to move a specified column to the end of the data. Regression and Classification is performed using train(). This function allows you to preprocess, train, tune, and test models on multiple resamples. Use available_supervised to get a list of available algorithms

Classification

For training of binary classification models, the outcome should be provided as a factor, with the second level of the factor being the 'positive' class.

Clustering

Clustering is performed using cluster(). Use available_clustering to get a list of available algorithms.

Decomposition

Decomposition is performed using decomp(). Use available_decomposition to get a list of available algorithms.

Type Documentation

Function documentation includes input type (e.g. "Character", "Integer", "Float"/"Numeric", etc). When applicable, value ranges are provided in interval notation. For example, Float: [0, 1) means floats between 0 and 1 including 0, but excluding 1. Categorical variables may include set of allowed values using curly braces. For example, Character: {"future", "mirai", "none"}.

Tabular Data

rtemis internally uses methods for efficient handling of tabular data, with support for data.frame, data.table, and tibble. If a function is documented as accepting "tabular data", it should work with any of these data structures. If a function is documented as accepting only one of these, then it should only be used with that structure. For example, some optimized data.table operations that perform in-place modifications only work with data.table objects.

Author

Maintainer: E.D. Gennatas [email protected] (ORCID) [copyright holder]