This Cheat Sheet is a handy reference for Base R statistical functions, interactive applications, machine learning, databases, and images.
Base R statistical functions
Here’s a selection of statistical functions that come with the standard R installation. You’ll find many others in R packages.
Central Tendency and Variability
Function | What it calculates |
mean(x) | Mean of the numbers in vector x |
median(x) | Median of the numbers in vector x |
var(x) | Estimated variance of the population from which the numbers in vector x are sampled |
sd(x) | Estimated standard deviation of the population from which the numbers in vector x are sampled |
scale(x) | Standard scores (z-scores) for the numbers in vector x |
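Here's a quick sketch of these functions at work. The vector x is made-up sample data (it isn't from any real study), so treat the numbers as illustration only.

```r
# Hypothetical sample data, invented for illustration
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

m  <- mean(x)    # arithmetic mean
md <- median(x)  # middle value of the sorted numbers
v  <- var(x)     # variance estimate (divides by n - 1)
s  <- sd(x)      # standard deviation estimate, the square root of var(x)
z  <- scale(x)   # z-scores, returned as a one-column matrix

print(c(mean = m, median = md, var = v, sd = s))
print(as.vector(z))  # as.vector() drops the matrix structure
```

Note that scale() returns a matrix rather than a plain vector, which is why the sketch wraps it in as.vector() before printing.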
Relative Standing
Function | What it calculates |
sort(x) | The numbers in vector x in increasing order |
sort(x)[n] | The nth smallest number in vector x |
rank(x) | Ranks of the numbers (in increasing order) in vector x |
rank(-x) | Ranks of the numbers (in decreasing order) in vector x |
rank(x, ties.method = "average") | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the average of the ranks that the ties would have attained |
rank(x, ties.method = "min") | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the minimum of the ranks that the ties would have attained |
rank(x, ties.method = "max") | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the maximum of the ranks that the ties would have attained |
quantile(x) | The 0th, 25th, 50th, 75th, and 100th percentiles (the quartiles, in other words) of the numbers in vector x. (That’s not a misprint: quantile(x) returns the quartiles of x.) |
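The following sketch shows how the sorting, ranking, and quantile functions treat ties. The vector x is hypothetical data with one deliberate tie (the two 8s).

```r
# Hypothetical data with a tie (two 8s)
x <- c(15, 8, 22, 8, 30)

sorted          <- sort(x)                        # increasing order
second_smallest <- sort(x)[2]                     # nth smallest, here n = 2
r_avg           <- rank(x)                        # default ties.method is "average"
r_min           <- rank(x, ties.method = "min")   # ties share the lowest rank
r_max           <- rank(x, ties.method = "max")   # ties share the highest rank
r_desc          <- rank(-x)                       # rank 1 goes to the largest number
q               <- quantile(x)                    # 0%, 25%, 50%, 75%, 100%

print(sorted)
print(rbind(average = r_avg, min = r_min, max = r_max, decreasing = r_desc))
print(q)
```

Calling rank(x) with no ties.method argument gives the same result as ties.method = "average".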
t-tests
Function | What it calculates |
t.test(x, mu = n, alternative = "two.sided") | Two-tailed t-test that the mean of the numbers in vector x is different from n. |
t.test(x, mu = n, alternative = "greater") | One-tailed t-test that the mean of the numbers in vector x is greater than n. |
t.test(x, mu = n, alternative = "less") | One-tailed t-test that the mean of the numbers in vector x is less than n. |
t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided") | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The variances in the two vectors are assumed to be equal. |
t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE) | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The vectors represent matched samples. |
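As a minimal sketch, here are three of these tests run on simulated scores. The data, the seed, and the comparison value of 100 are all assumptions made up for the example.

```r
# Simulated data (hypothetical); set.seed() makes the run repeatable
set.seed(42)
x <- rnorm(20, mean = 105, sd = 15)
y <- rnorm(20, mean = 100, sd = 15)

# One-sample, one-tailed: is the mean of x greater than 100?
one_sample <- t.test(x, mu = 100, alternative = "greater")

# Two independent samples, equal variances assumed
two_sample <- t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided")

# Matched samples: x[i] is paired with y[i]
paired <- t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE)

# Each result is a list; pull out the pieces you need
print(one_sample$statistic)   # t value
print(one_sample$parameter)   # degrees of freedom
print(one_sample$p.value)     # p-value
print(two_sample)
print(paired)
```

The degrees of freedom tell you which test you ran: n - 1 for the one-sample and paired tests, and n1 + n2 - 2 for the equal-variance two-sample test.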
Analysis of Variance (ANOVA)
Function | What it calculates |
aov(y~x, data = d) | Single-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vector x as the levels of the independent variable. The data are in data frame d. |
aov(y~x + Error(w/x), data = d) | Repeated Measures ANOVA, with the numbers in vector y as the dependent variable and the elements in vector x as the levels of an independent variable. Error(w/x) indicates that each element in vector w experiences all the levels of x. (In other words, x is a repeated measure.) The data are in data frame d. |
aov(y~x*z, data = d) | Two-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. The data are in data frame d. |
aov(y~x*z + Error(w/z), data = d) | Mixed ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. Error(w/z) indicates that each element in vector w experiences all the levels of z. (In other words, z is a repeated measure.) The data are in data frame d. |
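To make the formulas concrete, here's a sketch of a single-factor ANOVA and a repeated measures ANOVA on the same simulated data frame. The data frame d, its column values, and the group means are invented for illustration; w plays the role of the subject identifier.

```r
# Simulated data (hypothetical): 6 subjects, each measured at 3 levels of x
set.seed(1)
d <- data.frame(
  w = factor(rep(1:6, times = 3)),               # subject identifier
  x = factor(rep(c("a", "b", "c"), each = 6)),   # levels of the independent variable
  y = rnorm(18, mean = rep(c(10, 12, 15), each = 6))  # dependent variable
)

# Single-factor ANOVA: treats the 18 scores as independent
one_way <- aov(y ~ x, data = d)
print(summary(one_way))

# Repeated measures ANOVA: each subject in w experiences every level of x
repeated <- aov(y ~ x + Error(w/x), data = d)
print(summary(repeated))
```

Both w and x are stored as factors here. If a grouping variable is left as a number, aov() treats it as a numeric predictor rather than as a set of levels.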
Correlation and regression
Function | What it calculates |
cor(x,y) | Correlation coefficient between the numbers in vector x and the numbers in vector y |
cor.test(x,y) | Correlation coefficient between the numbers in vector x and the numbers in vector y, along with a t-test of the significance of the correlation coefficient. |
lm(y~x, data = d) | Linear regression analysis with the numbers in vector y as the dependent variable and the numbers in vector x as the independent variable. Data are in data frame d. |
coefficients(a) | Slope and intercept of linear regression model a. |
confint(a) | Confidence intervals of the slope and intercept of linear regression model a. |
lm(y~x+z, data = d) | Multiple regression analysis with the numbers in vector y as the dependent variable and the numbers in vectors x and z as the independent variables. Data are in data frame d. |
When you carry out an ANOVA or a regression analysis, store the result in an object, for example: a <- lm(y~x, data = d). Then, to see the tabled results, pass that object to the summary() function: summary(a)
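Here's that store-then-summarize pattern as a short sketch. The data frame d holds five made-up (x, y) pairs, so the fitted values mean nothing beyond the example.

```r
# Hypothetical data, invented for illustration
d <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

r <- cor(d$x, d$y)         # correlation coefficient
a <- lm(y ~ x, data = d)   # store the analysis in a

print(r)
print(summary(a))          # tabled results
print(coefficients(a))     # intercept and slope
print(confint(a))          # 95% confidence intervals by default
```

Because the analysis lives in a, you can hand it to other functions (coefficients(), confint(), and so on) without refitting the model.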
Interacting with a user
R provides the shiny package and the shinydashboard package for developing interactive applications. Here are selected functions from these packages.
Functions from the shiny package
Function | What it does |
shinyApp() | Ties a user interface and a server into a shiny application |
fluidPage() | Creates a browser page that changes with the width of the browser |
sliderInput() | Defines a slider and its input for a shiny user interface |
plotOutput() | Reserves a shiny user interface area for a plot |
renderPlot() | Draws the plot on a shiny user interface |
textOutput() | Reserves a shiny user interface area for text |
renderText() | Adds text to a shiny user interface |
selectInput() | Creates a drop-down menu on a shiny user interface |
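The functions in this table fit together roughly as in the following sketch of a small app. The input and output IDs ("n", "col", "hist", "caption"), the labels, and the color choices are all invented for the example, and the code assumes the shiny package is installed.

```r
library(shiny)

# User interface: two inputs, plus reserved areas for a plot and some text
ui <- fluidPage(
  sliderInput("n", "Sample size:", min = 10, max = 500, value = 100),
  selectInput("col", "Bar color:", choices = c("gray", "steelblue", "tomato")),
  plotOutput("hist"),
  textOutput("caption")
)

# Server: fills the reserved areas and reacts when the inputs change
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), col = input$col, main = NULL, xlab = "Value")
  })
  output$caption <- renderText({
    paste("Histogram of", input$n, "random normal values")
  })
}

# Tie the user interface and the server together
app <- shinyApp(ui = ui, server = server)

# In an interactive session, launch the app with: runApp(app)
```

Each output function pairs with a render function: plotOutput("hist") reserves the space that renderPlot() fills via output$hist, and textOutput("caption") does the same for renderText().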
Functions from the shinydashboard package
Function | What it creates for a shinydashboard page |
dashboardPage() | The page |
dashboardHeader() | Page header |
dashboardSidebar() | Page sidebar |
sidebarMenu() | A menu for a sidebar |
menuItem() | An item for a menu |
dashboardBody() | Page body |
fluidRow() | A variable-width row inside the dashboard body |
box() | A box inside a row |
valueBoxOutput() | A reserved space for a value box |
renderValueBox() | Reactive context for a value box |
valueBox() | A value box |
column() | A column within a fluid row |
tabBox() | A box that holds a set of tabbed panels |
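As a rough sketch, here's how several of these functions nest to form a one-page dashboard. The titles, the IDs ("count", "hist"), and the use of the built-in mtcars data set are assumptions made for the example; the code assumes that both shiny and shinydashboard are installed.

```r
library(shiny)
library(shinydashboard)

# Page = header + sidebar + body
ui <- dashboardPage(
  dashboardHeader(title = "Demo"),
  dashboardSidebar(
    sidebarMenu(
      menuItem("Overview", tabName = "overview")
    )
  ),
  dashboardBody(
    fluidRow(
      valueBoxOutput("count"),                        # reserved space for a value box
      box(title = "Histogram", plotOutput("hist"))    # a box inside the row
    )
  )
)

server <- function(input, output) {
  # renderValueBox() supplies the reactive context; valueBox() builds the box
  output$count <- renderValueBox({
    valueBox(value = nrow(mtcars), subtitle = "Cars in mtcars")
  })
  output$hist <- renderPlot({
    hist(mtcars$mpg, main = NULL, xlab = "Miles per gallon")
  })
}

app <- shinyApp(ui = ui, server = server)

# In an interactive session, launch the dashboard with: runApp(app)
```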
Machine learning
R provides a number of packages and functions for machine learning. Here are some of them.
Machine learning packages and functions
Package | Function | What it does |
rattle | rattle() | Opens the Rattle graphical user interface |
rpart | rpart() | Creates a decision tree |
rpart.plot | prp() | Draws a decision tree |
randomForest | randomForest() | Creates a random forest of decision trees |
rattle | printRandomForests() | Prints the rules of a forest’s individual decision trees |
e1071 | svm() | Trains a support vector machine |
e1071 | predict() | Creates a vector of predicted classifications based on a support vector machine |
kernlab | ksvm() | Trains a support vector machine |
base R | kmeans() | Creates a k-means clustering analysis |
nnet | nnet() | Creates a neural network with one hidden layer |
NeuralNetTools | plotnet() | Draws a neural network |
nnet | predict() | Creates a vector of predictions based on a neural network |
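To give a feel for two of these functions, the sketch below runs a k-means clustering analysis and grows a decision tree on R's built-in iris data set. The choice of iris, three clusters, and the seed are assumptions made for the example. The rpart package is one of the recommended packages that normally ship with R, but the code assumes it's available on your system.

```r
library(rpart)

# k-means: cluster the four flower measurements into 3 groups
set.seed(123)                       # k-means starts from random centers
features <- iris[, 1:4]
km <- kmeans(features, centers = 3, nstart = 20)

# Compare the clusters with the known species
print(table(cluster = km$cluster, species = iris$Species))

# Decision tree: predict Species from the four measurements
tree <- rpart(Species ~ ., data = iris, method = "class")
pred <- predict(tree, iris, type = "class")

# Confusion table of predicted versus actual species
print(table(predicted = pred, actual = iris$Species))
```

The predictions here are made on the same rows used to grow the tree, so the table flatters the model. For a real analysis, you'd hold out a test set.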
Databases
Created for statistical analysis, R has a wide array of packages and functions for dealing with large amounts of data. This selection is the tip of the iceberg’s tip.
Packages and functions for exploring databases
Package | Function | What it does |
didrooRFM | findRFM() | Performs a recency, frequency, money analysis on a database of retail transactions |
vcd | assocstats() | Calculates statistics for tables of categorical data |
vcd | assoc() | Creates a graphic that shows deviations from independence in a table of categorical data |
tidyverse | glimpse() | Provides a partial view of a data frame with the columns appearing onscreen as rows |
plotrix | std.error() | Calculates the standard error of the mean |
dplyr | inner_join() | Joins two data frames, keeping only the rows that have a match in both |
lubridate | wday() | Returns day of the week of a calendar date |
lubridate | ymd() | Returns a date in R date-format |
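Here's a sketch that combines a few of these functions on two tiny, made-up tables. The data frames, their column names, and the customer names are hypothetical. The sketch loads inner_join() and glimpse() from the dplyr package (part of the tidyverse) and assumes that dplyr and lubridate are installed.

```r
library(dplyr)
library(lubridate)

# Hypothetical retail data, invented for illustration
orders <- data.frame(
  customer_id = c(1, 2, 2, 3),
  order_date  = c("2023-01-02", "2023-01-07", "2023-02-14", "2023-03-01")
)
customers <- data.frame(
  customer_id = c(1, 2),
  name        = c("Ana", "Ben")
)

# Keep only the orders whose customer_id appears in both data frames
joined <- inner_join(orders, customers, by = "customer_id")

# Turn the text dates into R dates, then find each day of the week
joined$order_date <- ymd(joined$order_date)
joined$weekday    <- wday(joined$order_date)   # by default, 1 = Sunday

glimpse(joined)   # columns appear onscreen as rows
```

Customer 3 has no row in customers, so the inner join drops that order.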
Images
Here are some functions to help you get started using R to process images. They all live in the magick package.
Functions from the magick package
Function | What it does |
image_read() | Reads an image into R and turns it into a magick object |
image_resize() | Resizes an image |
image_rotate() | Rotates an image |
image_flip() | Rotates an image on a horizontal axis |
image_flop() | Rotates an image on a vertical axis |
image_annotate() | Adds text to an image |
image_background() | Sets the background for an image |
image_composite() | Combines images |
image_morph() | Makes one image appear to gradually become (morph into) another |
image_animate() | Turns a set of frames into an animation, which RStudio shows in the Viewer window |
image_apply() | Applies a function to every frame in an animated GIF |
image_write() | Saves an image or an animation to a file (for example, as a reusable GIF) |
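The sketch below chains several of these functions together. It assumes the magick package is installed and uses "logo:", one of the sample images built into ImageMagick, so that no image file is needed. The sizes, the caption text, and the temporary output file are choices made for the example.

```r
library(magick)

# Read ImageMagick's built-in sample logo into a magick object
img <- image_read("logo:")

# Resize to 320 pixels wide (height scales to keep the proportions)
small <- image_resize(img, "320x")

# Rotate 90 degrees clockwise, and mirror the image top to bottom
rotated <- image_rotate(small, 90)
flipped <- image_flip(small)

# Add a caption near the bottom edge
labeled <- image_annotate(small, "Hello from R", size = 24,
                          color = "red", gravity = "south")

# Save the result to a temporary PNG file
out <- tempfile(fileext = ".png")
image_write(labeled, path = out, format = "png")

print(image_info(small))
print(image_info(rotated))
```

Every function returns a new magick object and leaves the original alone, so img still holds the full-size logo at the end.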