Exploring dialogue trends in Bob’s Burgers with the power of {ggplot2} and {gghalves}, showcasing half-violin and half-beeswarm scatter plots for storytelling.
#TidyTuesday
{gghalves}
Author
Aditya Dahiya
Published
November 23, 2024
About the Data
This dataset explores the dialogue metrics of episodes from the animated TV series Bob’s Burgers. Compiled by Steven Ponce, the data is made available through the {bobsburgersR} R package, which includes episode transcripts and additional metadata. A blog post by Steven Ponce provides insights into visualizing this dataset. Curated for this week’s Tidy Tuesday challenge by Jon Harmon, the dataset offers an opportunity to analyze dialogue trends and patterns across seasons. Key variables include dialogue density, average dialogue length, sentiment variance, unique word counts, and punctuation ratios (questions and exclamations). You can load the dataset using the tidytuesdayR package or directly from GitHub.
The visualization explores trends in dialogue punctuation in Bob’s Burgers episodes across seasons, focusing on the proportion of dialogue lines containing question marks (question_ratio) and exclamation points (exclamation_ratio). The graphic splits each season in two: a scatter plot showing the ratio for each individual episode, and a violin plot showing the overall distribution of those ratios within the season. Both ratios decline noticeably in the later seasons, meaning questions and exclamations appear in a smaller share of dialogue lines. This shift could indicate changes in the show’s tone, writing style, or character interactions as the series progresses.
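A minimal sketch of how the per-season trend could be checked numerically before plotting. The toy values below are made up for illustration and are not the real episode_metrics data; the column names season and episode are assumptions, while question_ratio and exclamation_ratio are the variables described above.

```r
library(dplyr)
library(tidyr)

# Made-up toy values standing in for episode_metrics (NOT the real data)
toy_metrics <- tibble(
  season            = c(1, 1, 2, 2, 3, 3),
  episode           = c(1, 2, 1, 2, 1, 2),
  question_ratio    = c(0.20, 0.22, 0.18, 0.16, 0.12, 0.14),
  exclamation_ratio = c(0.30, 0.28, 0.25, 0.27, 0.20, 0.18)
)

# Reshape to one row per episode and metric, then take the median per season
season_summary <- toy_metrics |>
  pivot_longer(
    cols      = c(question_ratio, exclamation_ratio),
    names_to  = "metric",
    values_to = "ratio"
  ) |>
  group_by(metric, season) |>
  summarise(median_ratio = median(ratio), .groups = "drop")

print(season_summary)
```

The long format (one row per episode and metric) is also convenient for faceting the two ratios in a single ggplot2 call.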
How I made this graphic
To create this graphic, I used the {ggplot2} package, a foundational tool for data visualization in R, in combination with the {gghalves} package authored by Frederik Tiedemann and hosted on GitHub. The {gghalves} package simplifies the creation of “half-half” plots by extending {ggplot2} with geoms such as gghalves::geom_half_violin() and gghalves::geom_half_point(). In this graphic, I used geom_half_violin() to represent the distribution of punctuation ratios across seasons and geom_half_point() to plot individual episode data points, combining them to show detailed insights. To highlight trends across seasons, I added smoothed lines using ggplot2::geom_smooth(). {gghalves} is a versatile package that makes it easy to compose compact visualizations by combining plot types, and more details can be found on its GitHub repository.
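The combination described above can be sketched roughly as follows. This is a simplified illustration, not the code behind the final graphic: the data are simulated, the season column name is an assumption, and a linear trend (method = "lm") stands in for the smoother to keep the toy example stable.

```r
library(ggplot2)
library(gghalves)

# Simulated stand-in for episode_metrics (NOT the real data)
set.seed(42)
toy_metrics <- data.frame(
  season = factor(rep(1:6, each = 20)),
  question_ratio = rnorm(
    120,
    mean = rep(seq(0.20, 0.10, length.out = 6), each = 20),
    sd   = 0.02
  )
)

p <- ggplot(toy_metrics, aes(x = season, y = question_ratio)) +
  # Left half: distribution of the ratio within each season
  geom_half_violin(side = "l", fill = "#9ac7e8", colour = NA) +
  # Right half: one jittered point per episode
  geom_half_point(side = "r", size = 1, alpha = 0.6) +
  # Trend across seasons; group = 1 fits one line over all seasons
  geom_smooth(
    aes(group = 1),
    method  = "lm",
    formula = y ~ x,
    se      = FALSE,
    colour  = "#f0262a"
  ) +
  labs(x = "Season", y = "Share of lines with a question mark")

print(p)
```

Setting side = "l" for the violin and side = "r" for the points is what places the two halves back to back within each season.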
Steps
Loading required libraries, data import & creating custom functions.
Code
# Data Import and Wrangling Tools
library(tidyverse)            # All things tidy

# Final plot tools
library(scales)               # Nice Scales for ggplot2
library(fontawesome)          # Icons display in ggplot2
library(ggtext)               # Markdown text support for ggplot2
library(showtext)             # Display fonts in ggplot2
library(colorspace)           # Lighten and Darken colours
library(patchwork)            # Compiling Plots

# Half-half geoms
# devtools::install_github('erocoar/gghalves')
library(gghalves)             # Half violin plot

# Load the data
episode_metrics <- readr::read_csv(
  'https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-11-19/episode_metrics.csv'
)
Visualization Parameters
Code
# Font for titles
font_add_google("Rampart One",
  family = "title_font"
)

# Font for the caption
font_add_google("Saira Extra Condensed",
  family = "caption_font"
)

# Font for plot text
font_add_google("Atma",
  family = "body_font"
)

showtext_auto()

bobs_burgers_palette <- c("#f0262a", "#f172a9", "#95d244", "#fcdd60", "#9ac7e8")

# A base Colour
bg_col <- "#FEDD00FF"
seecolor::print_color(bg_col)

# Colour for highlighted text
text_hil <- "#BA0C2FFF"
seecolor::print_color(text_hil)

# Colour for the text
text_col <- "#000000FF"
seecolor::print_color(text_col)

# Define Base Text Size
bts <- 90

# Caption stuff for the plot
sysfonts::font_add(
  family = "Font Awesome 6 Brands",
  regular = here::here("docs", "Font Awesome 6 Brands-Regular-400.otf")
)
github <- ""
github_username <- "aditya-dahiya"
xtwitter <- ""
xtwitter_username <- "@adityadahiyaias"
social_caption_1 <- glue::glue("<span style='font-family:\"Font Awesome 6 Brands\";'>{github};</span> <span style='color: {text_col}'>{github_username} </span>")
social_caption_2 <- glue::glue("<span style='font-family:\"Font Awesome 6 Brands\";'>{xtwitter};</span> <span style='color: {text_col}'>{xtwitter_username}</span>")

# Add text to plot--------------------------------------------------------------
plot_title <- "Fading Punctuation:\nThe Changing Voice of Bob's Burgers"

plot_subtitle <- "Each dot represents an Episode of Bob's Burgers. Over the seasons, there is a decline in the use of both questions and exclamations in its dialogue, suggesting a subtle shift in the show's tone: moving toward less dramatic and inquisitive interactions."

plot_caption <- paste0(
  "**Data:** {bobsburgersR} by Steven Ponce",
  " | **Code:** ",
  social_caption_1,
  " | **Graphics:** ",
  social_caption_2
)

rm(
  github, github_username, xtwitter,
  xtwitter_username, social_caption_1,
  social_caption_2
)
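A caption built from markdown and HTML spans, like plot_caption above, only renders as formatted text if the theme element uses {ggtext}. A minimal sketch, assuming the caption is drawn with ggtext::element_markdown(); the demo plot on mtcars and the shortened demo_caption string are placeholders, not part of the original graphic.

```r
library(ggplot2)
library(ggtext)

# Placeholder caption mixing markdown bold and an HTML colour span
demo_caption <- paste0(
  "**Data:** {bobsburgersR} by Steven Ponce",
  " | **Code:** <span style='color:#BA0C2F'>aditya-dahiya</span>"
)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(caption = demo_caption) +
  theme(
    # element_markdown() parses the markdown / HTML in the caption text
    plot.caption = element_markdown(hjust = 0.5)
  )

print(p)
```

With the default element_text(), the asterisks and span tags would be printed literally instead of being interpreted.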
# Saving a thumbnail
library(magick)

# Saving a thumbnail for the webpage
image_read(here::here("data_vizs", "tidy_bobs_burgers.png")) |>
  image_resize(geometry = "400") |>
  image_write(
    here::here(
      "data_vizs", "thumbnails",
      "tidy_bobs_burgers.png"
    )
  )
Session Info
Code
# Data Import and Wrangling Tools
library(tidyverse)            # All things tidy

# Final plot tools
library(scales)               # Nice Scales for ggplot2
library(fontawesome)          # Icons display in ggplot2
library(ggtext)               # Markdown text support for ggplot2
library(showtext)             # Display fonts in ggplot2
library(colorspace)           # Lighten and Darken colours
library(patchwork)            # Compiling Plots

# Half-half geoms
# devtools::install_github('erocoar/gghalves')
library(gghalves)             # Half violin plot

# Table of loaded packages and their versions
sessioninfo::session_info()$packages |>
  as_tibble() |>
  select(package,
    version = loadedversion,
    date, source
  ) |>
  arrange(package) |>
  janitor::clean_names(
    case = "title"
  ) |>
  gt::gt() |>
  gt::opt_interactive(
    use_search = TRUE
  ) |>
  gtExtras::gt_theme_espn()