This tutorial walks through how to create jittered logo plots using {ggplot2} and {cbbplotR} for college basketball. We will be plotting performance against seed expectation – the cumulative number of wins above or below seed expectation – from 2000-2024 for every team with at least five tournament appearances.
The How
For this plot, we will need the following packages:
library(tidyverse)
library(cbbdata)
library(cbbplotR)
library(vipor)
The Data
For this visualization, we will be pulling data from Barttorvik using the {cbbdata} package.
data <- cbd_torvik_ncaa_results(2000, 2024) %>%
  filter(r64 >= 5) %>%
  select(team, pase) %>%
  mutate(pase_rk = dense_rank(-pase))
All we’re doing here is pulling tournament performance data, filtering for five or more appearances (r64), and calculating PASE rank – which will be used to highlight top teams.
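To make the ranking step concrete, here is a minimal sketch on made-up data (the team names and PASE values are hypothetical). It shows that dense_rank(-pase) gives the highest PASE a rank of 1, lets tied teams share a rank, and does not skip the next rank after a tie:

```r
library(dplyr)

# Hypothetical toy data; not real tournament results.
toy <- data.frame(
  team = c("A", "B", "C", "D"),
  pase = c(4.2, -1.0, 4.2, 0.5)
)

# Negating pase ranks the largest value first.
toy_ranked <- toy %>%
  mutate(pase_rk = dense_rank(-pase))

toy_ranked$pase_rk
# 1 3 1 2
```

Teams A and C tie for first, D is second, and B is third, so a filter such as pase_rk <= 20 can return more than 20 teams when ties exist.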
Calculate the jitter
We want to “jitter” our plot, which is something made easy by using the {ggbeeswarm} package. “Jittering” broadly refers to offsetting points to minimize overlap. Unfortunately, {cbbplotR} does not yet support jittering points, so we need to do it ourselves.
Behind the scenes, {ggbeeswarm} uses the {vipor} package and its offsetSingleGroup function to calculate new x-values for plotting. With this knowledge, we can create a small wrapper around offsetSingleGroup to achieve similar results.
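calculate_quasirandom_jitter <- function(y, x, width = 0.2) {
  # one quasirandom offset per value of y
  jittered_offset <- offsetSingleGroup(y, method = "quasirandom")
  # scale the offsets to control how far points spread from the center
  jittered_offset <- jittered_offset * width
  # shift the offsets so they are centered on x
  x + jittered_offset
}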
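As a quick sanity check, the wrapper can be tried on made-up values before touching the real data. This is only a sketch: toy_y is random noise standing in for PASE, and the function is repeated so the snippet runs on its own.

```r
library(vipor)

# Same wrapper as above, repeated so this snippet is self-contained.
calculate_quasirandom_jitter <- function(y, x, width = 0.2) {
  x + offsetSingleGroup(y, method = "quasirandom") * width
}

set.seed(1)
toy_y <- rnorm(50)  # made-up values standing in for PASE

# Center every point on x = 1 and spread them sideways.
toy_x <- calculate_quasirandom_jitter(toy_y, x = 1)

length(toy_x)  # one x-value per y-value
range(toy_x)   # should sit close to 1, spread by roughly +/- width
```

Points in dense regions of y are pushed further from the center than points in sparse regions, which is what gives the plot its violin-like shape.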
Next, we’ll apply this function to our data.
data <- data %>%
  mutate(x = calculate_quasirandom_jitter(pase, x = 1))
Plotting
Now, time to plot! Let’s briefly go over some things:
geom_mean_lines
This is a utility function to add mean (or median) lines to any plot. Notice that you must refer to your values as either y0 or x0, not y or x.
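For intuition, a horizontal mean line can be approximated in plain {ggplot2} with geom_hline. This is a sketch on made-up data, not a description of how geom_mean_lines is implemented:

```r
library(ggplot2)

# Hypothetical toy data.
toy <- data.frame(x = c(1, 2, 3, 4), y = c(2, 4, 6, 8))

# Draw a reference line at the mean of y, then the points on top of it.
p <- ggplot(toy, aes(x, y)) +
  geom_hline(yintercept = mean(toy$y), color = "grey70") +
  geom_point()

# The first layer holds the line; its intercept is the mean of y.
mean_line <- layer_data(p, 1)$yintercept
```

With geom_mean_lines, the same idea is expressed by mapping the column to y0 (or x0 for a vertical line) inside aes, so the mean does not have to be computed by hand.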
scale_*_identity
Inside geom_cbb_teams, you might notice that we are conditionally defining width (logo size) and alpha (logo transparency) values. The scale_*_identity family of functions is used when “your data is already scaled such that the data and aesthetic spaces are the same.” That is, whenever you pass literal values for a scaled aesthetic inside aes, you must add the matching _identity function (here, scale_alpha_identity) so that ggplot treats those values as literal representations instead of mapping them onto a default scale.
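The effect of an identity scale is easy to check with ordinary points instead of logos. The sketch below uses made-up data and geom_point as a stand-in for geom_cbb_teams:

```r
library(ggplot2)

# Hypothetical toy data with a rank column.
toy <- data.frame(x = 1:4, y = c(3, 1, 4, 2), rk = 1:4)

# Top-two ranks are fully opaque; everything else is faded.
p <- ggplot(toy, aes(x, y)) +
  geom_point(aes(alpha = ifelse(rk <= 2, 1, 0.15)), size = 5) +
  scale_alpha_identity()

# The alpha values reach the layer exactly as written.
alpha_used <- layer_data(p)$alpha
alpha_used
# 1.00 1.00 0.15 0.15
```

Without scale_alpha_identity, ggplot would treat 1 and 0.15 as data values and rescale them onto its default alpha range, so the faded points would not be drawn at the transparency you asked for.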
plot.margin
This is how you add padding to your plot. Sometimes padding makes your graph look a bit cleaner.
Using ggpreview with logo plots
If you are plotting numerous team logos, you might notice that RStudio can be slow to return the plot itself – which can even cause your R session to abort. To fix this, {cbbplotR} borrows a function from the {ggpath} package called ggpreview – which saves a temporary image of your plot and returns it in the Viewer pane. It is recommended to then expand that window in your browser.
To use ggpreview, you need to store your plot as a variable and then pass it to the ggpreview function. The function also takes arguments for plot dimensions.
For example, if we were to draw a plot showing every team’s adjusted efficiencies, that would require rendering 362 logos, which would definitely cause us some problems. But with ggpreview, we can store our plot as a variable and view a temporary image of it! This entire process takes fewer than 10 seconds.
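The idea behind this workflow can be sketched in a few lines of plain {ggplot2}. The helper below, preview_plot, is hypothetical and is not the ggpreview source; it only illustrates the "save a temporary image, then open it" pattern on a made-up plot:

```r
library(ggplot2)

# Hypothetical helper: write the plot to a temporary PNG and return the path,
# so the image can be viewed without rendering in the Plots pane.
preview_plot <- function(plot, width = 6, height = 6.5, dpi = 72) {
  path <- tempfile(fileext = ".png")
  ggsave(path, plot = plot, width = width, height = height, dpi = dpi)
  # Only try to open the file when someone is there to look at it.
  if (interactive()) utils::browseURL(path)
  invisible(path)
}

# Made-up plot stored as a variable, as the workflow requires.
toy_plot <- ggplot(data.frame(x = 1:3, y = c(2, 1, 3)), aes(x, y)) +
  geom_point()

preview_path <- preview_plot(toy_plot)
```

Because the image is rendered once to disk at fixed dimensions, what you see in the preview should match what a later ggsave call with the same width and height produces.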
The plot
plot <- data %>%
  ggplot(aes(x, pase)) +
  geom_mean_lines(aes(y0 = pase), color = "grey70") +
  geom_cbb_teams(aes(team = team,
                     width = ifelse(pase_rk <= 20, 0.07, 0.055),
                     alpha = ifelse(pase_rk <= 20, 1, 0.15))) +
  scale_alpha_identity() +
  scale_y_continuous(breaks = seq(-10, 20, 5),
                     labels = c("- 10", as.character(seq(-5, 15, 5)), "+ 20"),
                     limits = c(-10, 20)) +
  theme_minimal() +
  theme(plot.title.position = "plot",
        plot.title = element_text(family = "RadioCanadaBig-Bold", hjust = 0.5, size = 14),
        plot.subtitle = element_text(family = "RadioCanadaBig-Regular", hjust = 0.5,
                                     vjust = 2.7, size = 10),
        plot.caption.position = "plot",
        plot.caption = ggtext::element_markdown(family = "RadioCanadaBig-Regular",
                                                lineheight = 1.2, size = 8),
        axis.text = element_text(family = "RadioCanadaBig-Regular"),
        axis.title = element_text(family = "RadioCanadaBig-SemiBold"),
        axis.title.y = element_text(vjust = 2),
        axis.text.x = element_blank(),
        plot.margin = margin(t = 20, r = 20, b = 20, l = 20, unit = "pt"),
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.minor.y = element_blank(),
        plot.background = element_rect(fill = "#F6F7F2")) +
  labs(title = "The programs who routinely outperform March expectations",
       subtitle = "Sorted by PASE (performance against seed expectation) from 2000-2024.\nMin. five tournament appearances.",
       caption = "Data by cbbdata<br>Viz by @andreweatherman + cbbplotR",
       y = "Aggregate wins +/- seed expectation",
       x = NULL)
Saving the plot
When you’re using custom fonts, as we are, sometimes ggsave won’t properly render them. To sidestep this, you need to specify a device, shown below.
ggsave("pase_graph.png", plot = plot, height = 6.5, width = 6, dpi = 600, device = grDevices::png)