Fertility Rate: A Bump Chart of Changing Rankings

This graphic depicts the rankings of the top 10 most populous countries based on their Total Fertility Rate (TFR) from the 1950s to 2021. It showcases the dynamic shifts in fertility rates among these nations over seven decades.

A4 Size Viz

Our World in Data

Public Health

Author

Aditya Dahiya

Published

July 13, 2024

The data used in this graphic is sourced from the United Nations’ World Population Prospects (2022) and processed by Our World in Data. It encompasses the Total Fertility Rate (TFR) for the top 10 most populous countries from the 1950s to 2021. TFR is the average number of children a woman is expected to have during her lifetime. The graphic employs a bump chart to illustrate changes in fertility rate rankings over seven decades, highlighting significant trends such as the consistently high fertility rates in Pakistan, Bangladesh, and Nigeria, and the dramatic decline in China due to the “One Child Policy.” This visualization underscores the dynamic nature of fertility rates among the world’s most populous nations. (Inspired from David Sjoberg’s example).

Figure 1: This graphic presents the rankings of the top 10 most populous countries based on their Total Fertility Rate (TFR) from the 1950s to 2021. The x-axis represents the decades and the year 2021, while the y-axis shows the rankings from 1 to 14. Each colored line represents a different country, with country flags and population figures shown at the ends of the lines.

How I made this graphic?

Getting the data

Code

# Data Import and Wrangling Tools
library(tidyverse)            # All things tidy
library(owidR)                # Get data from Our World in R

# Final plot tools
library(scales)               # Nice Scales for ggplot2
library(fontawesome)          # Icons display in ggplot2
library(ggtext)               # Markdown text support
library(showtext)             # Display fonts in ggplot2
library(colorspace)           # To lighten and darken colours
library(patchwork)            # Combining plots

# ggbump package for Bump-Charts and Sigmoid lines
# install.packages("ggbump")
library(ggbump)               # For bump charts 

search1 <- owidR::owid_search("fertility")

df1 <- owid("children-per-woman")

popdf <- owid("population-with-un-projections")

Visualization Parameters

Code

# Font for titles
font_add_google("News Cycle",
  family = "title_font"
) 

# Font for the caption
font_add_google("Saira Extra Condensed",
  family = "caption_font"
) 

# Font for plot text
font_add_google("Roboto Condensed",
  family = "body_font"
) 

showtext_auto()

# Colour Palette
mypal <- rev(paletteer::paletteer_d("LaCroixColoR::Lime")[c(1,2,4:6)])

# Background Colour
bg_col <- "grey95"
text_col <- "grey10"
text_hil <- "grey25"

# Base Text Size
bts <- 80

plot_title <- "Fertility Rate Rankings among most populous Nations"

plot_subtitle <- "The rankings of most populous countries based on their Total Fertility Rate (TFR) from the 1950 to 2021. TFR is the average number of children a woman is expected to have during her lifetime. Notice the consistently high fertility rates in Pakistan, Bangladesh, and Nigeria, and the significant decline in China's fertility due to its One Child Policy. The number in the coloured bands shows the TFR. Below the flags are country's name and population."

data_annotation <- "About the Data: This data on fertility rates, sourced from the United Nations' World Population Prospects (2022) and processed by Our World in Data, spans from 1950 to 2021. It reflects the average number of live births per woman, with age-specific rates, providing insights into global fertility trends across decades."

# Caption stuff for the plot
sysfonts::font_add(
  family = "Font Awesome 6 Brands",
  regular = here::here("docs", "Font Awesome 6 Brands-Regular-400.otf")
)
github <- "&#xf09b"
github_username <- "aditya-dahiya"
xtwitter <- "&#xe61b"
xtwitter_username <- "@adityadahiyaias"
social_caption_1 <- glue::glue("<span style='font-family:\"Font Awesome 6 Brands\";'>{github};</span> <span style='color: {text_hil}'>{github_username}  </span>")
social_caption_2 <- glue::glue("<span style='font-family:\"Font Awesome 6 Brands\";'>{xtwitter};</span> <span style='color: {text_hil}'>{xtwitter_username}</span>")

plot_caption <- paste0(
  "**Data:** United Nations & Our World in Data  |  ",
  "**Code:** ", 
  social_caption_1, 
  " |  **Graphics:** ", 
  social_caption_2
  )
rm(github, github_username, xtwitter, 
   xtwitter_username, social_caption_1, social_caption_2)

Data Wrangling

Code

# A clean tibble for the fertility levels in each year
df2 <- df1 |> 
  as_tibble() |> 
  janitor::clean_names() |> 
  rename(fertility = fertility_rate_sex_all_age_all_variant_estimates) |> 
  filter(!is.na(code)) |> 
  filter(entity != "World") |> 
  select(-countries_continents) |> 
  filter(!str_detect(code, "OWID"))

# A clean tibble of populations for each coutnry in each year
popdf1 <- popdf |> 
  as_tibble() |> 
  janitor::clean_names() |> 
  filter(!is.na(code)) |> 
  filter(entity != "World") |> 
  mutate(
    population = ifelse(
      is.na(population_sex_all_age_all_variant_estimates),
      population_sex_all_age_all_variant_medium,
      population_sex_all_age_all_variant_estimates
    )
  ) |> 
  select(-c(population_sex_all_age_all_variant_estimates,
            population_sex_all_age_all_variant_medium))

# Tibble for names of continents and ISO-2 country codes
# to be used for flags
df_continents <- rnaturalearth::ne_countries() |> 
  as_tibble() |> 
  janitor::clean_names() |> 
  select(iso_a3, iso_a2, continent) |> 
  rename(code = iso_a3) |> 
  mutate(country = str_to_lower(iso_a2), .keep = "unused")

x_axis_labels <- c("1950s", "1960s", "1970s", "1980s", "1990s", "2000s", "2010s", "2021")

# A tibble to use for final computation and country selection
df3 <- df2 |> 
  left_join(popdf1) |>
  left_join(df_continents) |> 
  drop_na() |> 
  filter(year >= 1951 & year <= 2020) |> 
  mutate(
    year = cut(
      year, 
      breaks = seq(1950, 2020, 10), 
      labels = 1:7
      )
  ) |> 
  group_by(continent, entity, country, year) |> 
  summarise(
    fertility = weighted.mean(fertility, w = population, na.rm = T),
    population = mean(population, na.rm = T)
  ) |> 
  mutate(year = as.numeric(year)) |>
  ungroup() |> 
  bind_rows(
    df2 |> 
      left_join(popdf1) |>
      left_join(df_continents) |> 
      drop_na() |> 
      filter(year == 2021) |> 
      mutate(year = 8)
  ) |> 
  select(-code)

# Countries to display
sel_cons <- df3 |> 
  group_by(year) |> 
  slice_max(population, n = 10) |> 
  pull(entity) |> 
  unique()

# Levels of countries for colour scales
levels_sel_cons <- df3 |> 
  filter(entity %in% sel_cons) |> 
  filter(year == 1) |> 
  arrange(desc(fertility)) |> 
  pull(entity)

# Tibble for final plotting with ggplot2
df4 <- df3 |> 
  filter(entity %in% sel_cons) |> 
  group_by(year) |> 
  arrange(desc(fertility)) |> 
  mutate(top_rank = row_number()) |> 
  ungroup() |> 
  mutate(entity = fct(entity, levels = levels_sel_cons))

# Adding dummy data to extend bump lines along x-axis
df4 <- df4 |> 
  bind_rows(df4 |> filter(year == 1) |> mutate(year = 0.5)) |> 
  bind_rows(df4 |> filter(year == 8) |> mutate(year = 8.5))

Visualization: Credits to {ggbump} by (Sjoberg 2020)

Code

g <- df4 |> 
  ggplot() +
  
  # Panel visualization
  geom_bump(
    mapping = aes(
      x = year,
      y = top_rank,
      group = entity,
      colour = entity
    ),
    linewidth = 7, 
    alpha = 0.5
  ) +
  geom_text(
    mapping = aes(
      label = paste0(
        round(fertility, 1)
        ),
      x = year,
      y = top_rank
      ),
    colour = text_col,
    lineheight = 0.6,
    family = "caption_font",
    size = bts / 6,
    fontface = "bold"
  ) +
  
  # Left Y-Axis Ranks
  geom_text(
    data = tibble(
      rank_var = 1:14,
      y_var = 1:14
    ),
    mapping = aes(
      x = -0.1,
      y = y_var,
      label = rank_var
    ),
    size = bts / 2.2,
    family = "title_font",
    fontface = "bold",
    colour = text_col
  ) +
  
  # Left Y-Axis country labels
  geom_text(
    data = df4 |> filter(year == 1),
    mapping = aes(
      x = 0.5,
      y = top_rank,
      label = paste0(
        entity,
        " (", 
        scales::number(
          population,
          accuracy = 0.1,
          scale_cut = cut_short_scale()
        ),
        ")"
      )
    ),
    nudge_y = -0.4,
    lineheight = 0.3,
    hjust = 0.5,
    family = "caption_font",
    size = bts / 6
  ) +
  
  # Right Y-Axis country labels
  geom_text(
    data = df4 |> filter(year == 8),
    mapping = aes(
      x = 8.5,
      y = top_rank,
      label = paste0(
        entity,
        " (", 
        scales::number(
          population,
          accuracy = 0.1,
          scale_cut = cut_short_scale()
        ),
        ")"
      )
    ),
    nudge_y = -0.4,
    lineheight = 0.3,
    hjust = 0.5,
    family = "caption_font",
    size = bts / 6
  ) +
  
  # Adding country flags on top of chart - Right-end
  ggflags::geom_flag(
    data = df4 |> filter(year == 8),
    mapping = aes(
      x = 8.5, y = top_rank,
      country = country
    ),
    size = 19
  ) +
  # Adding country flags on top of chart - Left-end
  ggflags::geom_flag(
    data = df4 |> filter(year == 1),
    mapping = aes(
      x = 0.5, y = top_rank,
      country = country
    ),
    size = 19
  ) +
  
  # Scales and Coordinates
  paletteer::scale_colour_paletteer_d(
    "awtools::bpalette"
  ) +
  scale_y_reverse() +
  scale_x_continuous(
    breaks = 1:8,
    labels = x_axis_labels,
    expand = expansion(c(0.02, 0.1))
  ) +
  scale_size_continuous(
    range = c(5, 20)
  ) +
  coord_cartesian(
    clip = "off"
  ) +
  
  # Labels and Themes
  labs(
    title = plot_title,
    subtitle = str_wrap(plot_subtitle, 95),
    caption = plot_caption,
    x = NULL,
    y = "Ranking of these Countries  (on Total Fertility Rate)",
    colour = NULL
  ) +
  theme_minimal(
    base_family = "body_font",
    base_size = bts
  ) +
  theme(
    legend.position = "none",
    panel.grid = element_blank(),
    axis.text.y = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_text(
      margin = margin(0,0,0,0, "mm"),
      colour = text_hil,
      hjust = 0.5
    ),
    axis.ticks.length = unit(0, "mm"),
    plot.margin = margin(10,10,10,10, "mm"),
    axis.text.x = element_text(
      colour = text_hil,
      margin = margin(0,0,0,0, "mm"),
      size = bts * 1.2,
      face = "bold"
    ),
    plot.title.position = "plot",
    plot.caption = element_textbox(
      hjust = 0.5,
      family = "caption_font",
      colour = text_hil,
      margin = margin(5,0,0,0, "mm")
    ),
    plot.subtitle = element_text(
      lineheight = 0.28,
      colour = text_hil,
      hjust = 0.5,
      margin = margin(5,0,0,0, "mm")
    ),
    plot.title = element_text(
      family = "caption_font",
      colour = text_hil,
      size = 2.5 * bts,
      face = "bold",
      hjust = 0.5,
      margin = margin(10,0,0,0, "mm")
    )
  )

Save the graphic and a thumbnail

Code

ggsave(
  filename = here::here("data_vizs", "owid_fertility_bumps.png"),
  plot = g,
  width = 400,
  height = 500,
  units = "mm",
  bg = bg_col
)

library(magick)
# Saving a thumbnail for the webpage
image_read(here::here("data_vizs", 
                      "owid_fertility_bumps.png")) |> 
  image_resize(geometry = "400") |> 
  image_write(here::here("data_vizs", "thumbnails", 
                         "owid_fertility_bumps.png"))

An A4 size version, with highlight on China, added QR code and about the data shown in Figure 2

Code

levels(df4$entity)

cn_col <- paletteer::paletteer_d("awtools::bpalette")[1]

# A colour palette to highlight China
mypal <- c(
 paletteer::paletteer_d("awtools::bpalette")[2:7],
 
 # Red colour for China
 paletteer::paletteer_d("awtools::bpalette")[1],
 paletteer::paletteer_d("awtools::bpalette")[8:14]
)

# Improved markdown subtitle
plot_subtitle <- glue::glue("The rankings of most populous countries based on their Total Fertility Rate (TFR) from the<br>1950 to 2021. TFR is the average number of children a woman is expected to have during her<br>lifetime. Notice the consistently high fertility rates in Pakistan, Bangladesh, and Nigeria,<br>and <b style='color:{cn_col}'>the significant decline in China's fertility due to its One Child Policy</b>. The number in<br>the coloured bands shows the TFR. Below the flags are country's name and population.")

g <- df4 |> 
  ggplot() +
  
  # Panel visualization
  geom_bump(
    mapping = aes(
      x = year,
      y = top_rank,
      group = entity,
      colour = entity,
      alpha = (entity == "China")
    ),
    linewidth = 7
  ) +
  scale_alpha_manual(
    values = c(0.3, 0.9)
  ) +
  geom_label(
    mapping = aes(
      label = paste0(
        round(fertility, 1)
        ),
      x = year,
      y = top_rank
      ),
    colour = text_col,
    lineheight = 0.6,
    family = "caption_font",
    size = bts / 6,
    fontface = "bold",
    fill = "white",
    alpha = 0.3,
    label.size = NA,
    label.padding = unit(0.1, "lines")
  ) +
  
  # Left Y-Axis Ranks
  geom_text(
    data = tibble(
      rank_var = 1:14,
      y_var = 1:14
    ),
    mapping = aes(
      x = -0.1,
      y = y_var,
      label = rank_var
    ),
    size = bts / 2.2,
    family = "title_font",
    fontface = "bold",
    colour = text_col
  ) +
  
  # Left Y-Axis country labels
  geom_text(
    data = df4 |> filter(year == 1),
    mapping = aes(
      x = 0.5,
      y = top_rank,
      label = paste0(
        entity,
        " (", 
        scales::number(
          population,
          accuracy = 0.1,
          scale_cut = cut_short_scale()
        ),
        ")"
      )
    ),
    nudge_y = -0.4,
    lineheight = 0.3,
    hjust = 0.5,
    family = "caption_font",
    size = bts / 6
  ) +
  
  # Right Y-Axis country labels
  geom_text(
    data = df4 |> filter(year == 8),
    mapping = aes(
      x = 8.5,
      y = top_rank,
      label = paste0(
        entity,
        " (", 
        scales::number(
          population,
          accuracy = 0.1,
          scale_cut = cut_short_scale()
        ),
        ")"
      )
    ),
    nudge_y = -0.4,
    lineheight = 0.3,
    hjust = 0.5,
    family = "caption_font",
    size = bts / 6
  ) +
  
  # Adding country flags on top of chart - Right-end
  ggflags::geom_flag(
    data = df4 |> filter(year == 8),
    mapping = aes(
      x = 8.5, y = top_rank,
      country = country
    ),
    size = 19
  ) +
  # Adding country flags on top of chart - Left-end
  ggflags::geom_flag(
    data = df4 |> filter(year == 1),
    mapping = aes(
      x = 0.5, y = top_rank,
      country = country
    ),
    size = 19
  ) +
  
  # Scales and Coordinates
  scale_colour_manual(values = mypal) +
  scale_y_reverse() +
  scale_x_continuous(
    breaks = 1:8,
    labels = x_axis_labels,
    expand = expansion(c(0.02, 0.1))
  ) +
  scale_size_continuous(
    range = c(5, 20)
  ) +
  coord_cartesian(
    clip = "off"
  ) +
  
  # Labels and Themes
  labs(
    title = plot_title,
    subtitle = str_wrap(plot_subtitle, 95),
    caption = plot_caption,
    x = NULL,
    y = "Ranking of these Countries  (on Total Fertility Rate)",
    colour = NULL
  ) +
  theme_minimal(
    base_family = "body_font",
    base_size = bts
  ) +
  theme(
    legend.position = "none",
    panel.grid = element_blank(),
    axis.text.y = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_text(
      margin = margin(0,0,0,0, "mm"),
      colour = text_hil,
      hjust = 0.5
    ),
    axis.ticks.length = unit(0, "mm"),
    plot.margin = margin(10,10,10,10, "mm"),
    axis.text.x = element_text(
      colour = text_hil,
      margin = margin(0,0,0,0, "mm"),
      size = bts * 1.2,
      face = "bold"
    ),
    plot.title.position = "plot",
    plot.caption = element_textbox(
      hjust = 0.5,
      family = "caption_font",
      colour = text_hil,
      margin = margin(5,0,0,0, "mm")
    ),
    plot.subtitle = element_markdown(
      lineheight = 0.28,
      colour = text_hil,
      hjust = 0,
      margin = margin(5,0,0,0, "mm")
    ),
    plot.title = element_text(
      family = "caption_font",
      colour = text_hil,
      size = 2.5 * bts,
      face = "bold",
      hjust = 0,
      margin = margin(10,0,0,0, "mm")
    )
  )

# QR Code for the plot
url_graphics <- paste0(
  "https://aditya-dahiya.github.io/projects_presentations/projects/",
  # The file name of the current .qmd file
  "owid_fertility_bumps",
  ".qmd"
)
# remotes::install_github('coolbutuseless/ggqr')
# library(ggqr)
plot_qr <- ggplot(
  data = NULL, 
  aes(x = 0, y = 0, label = url_graphics)
  ) + 
  ggqr::geom_qr(
    colour = text_hil, 
    fill = bg_col,
    size = 1.6
    ) +
  # labs(caption = "Scan for the Interactive Version") +
  coord_fixed() +
  theme_void() +
  theme(plot.background = element_rect(
    fill = "transparent", 
    colour = "transparent"
    )
  )

g_full <- g +
  inset_element(
    p = plot_qr,
    left = 0.83, right = 0.98,
    top = 1.125, bottom = 0.97,
    align_to = "panel",
    clip = FALSE
  ) +
  plot_annotation(
    theme = theme(
      plot.background = element_rect(
        fill = "transparent", colour = "transparent"
      ),
      panel.background = element_rect(
        fill = "transparent", colour = "transparent"
      )
    )
  )

ggsave(
  filename = here::here("data_vizs", "a4_owid_fertility_bumps.png"),
  plot = g_full,
  width = 210 * 2,
  height = 297 * 2,
  units = "mm",
  bg = bg_col
)

Figure 2: This graphic presents the rankings of the most populous countries based on their Total Fertility Rate (TFR) from the 1950s to 2021. The x-axis represents the decades and the year 2021, while the y-axis shows the rankings from 1 to 14. Each colored line represents a different country, with country flags and population figures shown at the ends of the lines. The red colour focuses on China, showing the rapidly falling TFR.

References

Sjoberg, David. 2020. “Ggbump: Bump Chart and Sigmoid Curves.” https://CRAN.R-project.org/package=ggbump.