Chapter 7

Networks

Author

Aditya Dahiya

Published

December 9, 2023

Note

This chapter has no exercises to solve. Instead, we create network graphs of Star Wars characters using the Star Wars Social Networks data.

Code
# Loading the required libraries
library(tidyverse)
library(ggraph)
library(tidygraph)
library(jsonlite)

7.1 Network Data

7.1.1 Tidygraph: A tidy network manipulation API

I demonstrate how to import a .json file using the Star Wars Social Networks data (Gabasova 2016). Then, I manipulate the data using tidygraph and tidyverse functions.

  • Creating a tbl_graph object with tidygraph (Pedersen 2023) from a .json file read with the jsonlite package (Ooms 2014).
Code
# The url for the data
url <- "https://raw.githubusercontent.com/evelinag/star-wars-network-data/master/starwars-full-interactions.json"

# Importing json data
starwars <- jsonlite::read_json(url, simplifyVector = TRUE)

# Getting the nodes (i.e., Star-Wars characters)
sw_nodes <- starwars[[1]] |> 
  as_tibble() |> 
  # An id to match nodes and links
  mutate(id = row_number()) |> 
  relocate(id) |> 
  # Conventional naming
  rename(
    firstname = name,
    name = id
  )

# Getting the links (i.e. connections between characters)
sw_links <- starwars[[2]] |> 
  as_tibble() |> 
  # The JSON uses 0-based node indices, but tidygraph matches edges
  # to nodes by 1-based row position, so we add 1 to both columns.
  mutate(
    source = source + 1,
    target = target + 1
  ) |> 
  # Conventional Naming
  rename(
    from = source,
    to = target,
    weightage = value
  )

# Creating a tbl_graph object
sw_graph <- tbl_graph(
  nodes = sw_nodes,
  edges = sw_links,
  node_key = "name",
  directed = FALSE
) 

sw_graph
# A tbl_graph: 110 nodes and 398 edges
#
# An undirected simple graph with 2 components
#
# A tibble: 110 × 4
   name firstname   value colour 
  <int> <chr>       <int> <chr>  
1     1 QUI-GON        61 #4f4fb1
2     2 NUTE GUNRAY    24 #808080
3     3 PK-4            3 #808080
4     4 TC-14           4 #808080
5     5 OBI-WAN       147 #48D1CC
6     6 DOFINE          3 #808080
# ℹ 104 more rows
#
# A tibble: 398 × 3
   from    to weightage
  <int> <int>     <int>
1     1     2         1
2     3     4         1
3     4     5         1
# ℹ 395 more rows
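
The index shift above is easy to miss, so here is a minimal sketch of the same import steps on a tiny, made-up JSON string. The names json_txt, toy and toy_graph and the three characters A, B and C are hypothetical; only the nodes/links layout and the 0-based source/target columns mirror the real file.

```r
library(jsonlite)
library(tidygraph)
library(dplyr)

# Hypothetical miniature of the Star Wars file: 3 nodes, 2 links (0-based)
json_txt <- '{
  "nodes": [{"name": "A", "value": 3},
            {"name": "B", "value": 2},
            {"name": "C", "value": 1}],
  "links": [{"source": 0, "target": 1, "value": 5},
            {"source": 1, "target": 2, "value": 2}]
}'

# simplifyVector = TRUE turns the arrays of records into data frames
toy <- fromJSON(json_txt, simplifyVector = TRUE)

toy_graph <- tbl_graph(
  nodes = toy$nodes |>
    as_tibble() |>
    rename(firstname = name),
  edges = toy$links |>
    as_tibble() |>
    # Shift the 0-based indices to the 1-based row positions of the nodes
    transmute(from = source + 1L, to = target + 1L, weightage = value),
  directed = FALSE
)

toy_graph
```

Without the shift, the edge that starts at index 0 would point to a node that does not exist.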

7.1.3 Algorithms

The real benefit of tbl_graph data is that we can perform many operations on it (for example, centrality calculation, ranking and grouping) while preserving the underlying graph structure.

We can find groups among the Star Wars characters with group_leading_eigen(), which groups nodes based on the leading eigenvector of the modularity matrix using igraph::cluster_leading_eigen(). The new variable col_var stores each node's group and can later be mapped to colour or used for faceting.

sw_graph <- sw_graph |> 
  # Creating groups with group_leading_eigen()
  activate(nodes) |> 
  mutate(col_var = tidygraph::group_leading_eigen())

group_leading_eigen() creates the groups and, as Figure 1 shows, each group consists of related characters who often appear together in the movies.

Code
sw_graph |> 
  activate(nodes) |> 
  as_tibble() |> 
  select(col_var, firstname, value) |> 
  mutate(
    Group = paste0("Gp. ", col_var),
    firstname = snakecase::to_title_case(firstname)
  ) |> 
  group_by(Group) |>
  arrange(desc(value)) |> 
  summarise(Characters = paste(firstname, collapse = ", ")) |> 
  gt::gt() |> 
  gtExtras::gt_theme_espn()
Group Characters
Gp. 1 Obi Wan, Emperor, Yoda, Nute Gunray, Bail Organa, Mace Windu, General Grievous, Rune, Count Dooku, Sio Bibble, Lama Su, Ki Adi Mundi, Darth Maul, Jango Fett, Tc 14, Tey How, General Ceel, Odd Ball, Clone Commander Gree, Clone Commander Cody, Pk 4, Dofine, Valorum, Senator Ask Aak, Orn Free Taa, Sun Rit, Poggle, Plo Koon, Tion Medon, Rabe
Gp. 2 Anakin, Padme, Qui Gon, Jar jar, Captain Panaka, Ric Olie, Shmi, Fode Beed, Owen, Watto, Jabba, Captain Typho, Bravo Two, Boss Nass, Kitster, Ruwee, Tarpals, Sebulba, Jira, Wald, Greedo, Bravo Three, Sola, Jobal, Cliegg, Fang Zar, Giddean Danu
Gp. 3 Luke, C 3 Po, Leia, Lando, Red Leader, Wedge, Biggs, Gold Leader, Admiral Ackbar, Red Ten, Rieekan, Beru, Zev, Mon Mothma, Dack, Janson, Captain Antilles, Camie, Dodonna, Derlin, Bib Fortuna, Boushh
Gp. 4 Han, Finn, Rey, Poe, Kylo Ren, General Hux, Captain Phasma, Maz, Unkar Plutt, Snap, Lor San Tekka, Bala Tik, Snoke, Ello Asty, Jess, Lieutenant Mitaka, Admiral Statura, Yolo Ziff, Colonel Datoo, Niv Lek, Bb 8
Gp. 5 Darth Vader, Tarkin, Piett, Boba Fett, Taun We, Motti, Ozzel, Needa, Jerjerrod
Gp. 6 Gold Five
Figure 1: Groups of Star Wars characters as formed by group_leading_eigen()
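
Grouping is only one of the algorithm families mentioned above. As a minimal sketch of the centrality case, the code below uses a small star graph made with create_star() rather than the Star Wars data, so that the result is easy to check by eye. The names star_graph, star_nodes, degree and component are hypothetical.

```r
library(tidygraph)
library(dplyr)

# A star graph: one hub connected to four leaves
star_graph <- create_star(5, directed = FALSE) |>
  activate(nodes) |>
  mutate(
    # Number of edges attached to each node
    degree = centrality_degree(),
    # Connected component that each node belongs to
    component = group_components()
  )

# Extract the node table to inspect the new columns
star_nodes <- as_tibble(star_graph)
star_nodes
```

The hub gets a degree of 4 and every leaf a degree of 1. The same mutate() call should work on sw_graph, because these functions are evaluated in the context of the active graph component.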

7.2 Visualizing Networks

We can use the tbl_graph object with ggraph to create visualizations easily.

  • In Figure 2, we visualize the network among the most important characters (importance defined as more than 40 scene appearances). As we can see, the tidygraph function group_leading_eigen() has successfully grouped the characters based on their interactions.
Code
sw_graph |> 
  
  activate(nodes) |> 
  filter(value > 40) |> 
  
  ggraph(layout = "stress") +
  geom_edge_link(
    mapping = aes(width = weightage),
    alpha = 0.5,
    color = "grey"
  ) +
  geom_node_point(
    mapping = aes(size = value, 
                  colour = as.factor(col_var))
  ) +
  geom_node_text(
    mapping = aes(label = firstname),
    check_overlap = TRUE,
    repel = TRUE
  ) +
  scale_size_area(max_size = 10) +
  scale_colour_brewer(palette = "Dark2") +
  theme_void() + 
  theme(legend.position = "none")

Figure 2: Linkages (defined by simultaneous appearance in a scene) among the most important characters of the Star Wars movies
  • Below, we try out different layouts in Figure 3 by passing the layout argument to ggraph(). ggraph provides well over 20 layouts to choose from (see footnote 1), including the following (credits: R documentation for layout_tbl_graph_igraph):
  • Hierarchical layouts

    • tree: Uses the Reingold-Tilford algorithm to place the nodes below their parent with the parent centered above its children.

    • sugiyama: Designed for directed acyclic graphs (that is, hierarchies where multiple parents are allowed) it minimizes the number of crossing edges.

  • Standard layouts

    • bipartite: Minimize edge-crossings in a simple two-row (or column) layout for bipartite graphs.

    • star: Place one node in the center and the rest equidistantly around it.

    • circle: Place nodes in a circle in the order of their index. Consider using layout_tbl_graph_linear() with circular=TRUE for more control.

    • nicely: Tries to pick an appropriate layout. See igraph::nicely() for a description of the simple decision tree it uses.

    • dh: Uses Davidson and Harel's simulated annealing algorithm to place nodes.

    • gem: Place nodes on the plane using the GEM force-directed layout algorithm.

    • graphopt: Uses the Graphopt algorithm based on alternating attraction and repulsion to place nodes.

    • grid: Place nodes on a rectangular grid.

    • mds: Perform a multidimensional scaling of nodes using either the shortest path or a user supplied distance.

    • sphere: Place nodes uniformly on a sphere - less relevant for 2D visualizations of networks.

    • randomly: Places nodes uniformly at random.

    • fr: Places nodes according to the force-directed algorithm of Fruchterman and Reingold.

    • kk: Uses the spring-based algorithm by Kamada and Kawai to place nodes.

    • drl: Uses the force directed algorithm from the DrL toolbox to place nodes.

    • lgl: Uses the algorithm from Large Graph Layout to place nodes.

Code
sw_graph_fun <- function(my_layout){
  sw_graph |>
  activate(nodes) |>
  filter(value > 40) |> 
  ggraph(layout = my_layout) +
  geom_edge_link(
    mapping = aes(width = weightage),
    alpha = 0.5,
    color = "grey"
  ) +
  geom_node_point(mapping = aes(size = value, 
                                col = as.factor(col_var))) +
  geom_node_text(
    mapping = aes(label = firstname),
    check_overlap = TRUE, 
    repel = TRUE,
    fontface = "bold"
  ) +
  labs(title = paste0("layout = \"", my_layout, "\"")) +
  scale_size_area(max_size = 10) +
  theme_void() +
  theme(
    legend.position = "none",
    plot.title = element_text(
      face = "bold",
      family = "mono",
      size = 20,
      hjust = 0
    )
  )
}

sw_graph_fun("drl")

sw_graph_fun("circle")

sw_graph_fun("nicely")

sw_graph_fun("dh")

sw_graph_fun("gem")

sw_graph_fun("graphopt")

sw_graph_fun("grid")

sw_graph_fun("kk")

Panels: (a) drl, (b) circle, (c) nicely, (d) dh, (e) gem, (f) graphopt, (g) grid, (h) kk

Figure 3: Different layouts for ggraph()
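
A layout is essentially a table of x and y coordinates for the nodes. As a minimal sketch, the code below computes the "circle" layout for a small ring graph with create_layout(), without drawing anything. The names ring and circle_layout are hypothetical, and the toy graph stands in for sw_graph.

```r
library(tidygraph)
library(ggraph)

# A ring of six nodes, used here only as a stand-in graph
ring <- create_ring(6)

# Compute the layout without plotting: one row per node
circle_layout <- create_layout(ring, layout = "circle")

# The x and y columns hold the node positions that ggraph() would use
circle_layout[, c("x", "y")]
```

Inspecting these coordinates can help when a layout looks odd, and the same call should accept any of the layout names listed above.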

  • We can further customize edge widths, curvature and colours to represent the strength of links between nodes using ggraph, as depicted in Figure 4.
Code
library(showtext)
library(fontawesome)

font_add_google("Poller One",
                family = "font_sw")

font_add_google("PT Sans Narrow", "font_title")
font_add_google("Nova Square", "font_body")
text_col <- "black"
# Caption stuff
github <- "&#xf09b"
github_username <- "aditya-dahiya"
xtwitter <- "&#xe61b"
xtwitter_username <- "adityadahiyaias"
linkedin <- "&#xf08c"
linkedin_username <- "dr-aditya-dahiya-ias"
social_caption <- glue::glue("<span style='font-family:\"Font Awesome 6 Brands\";'>{github};</span> <span style='color: {text_col}'>{github_username}  </span> <span style='font-family:\"Font Awesome 6 Brands\";'>{xtwitter};</span> <span style='color: {text_col}'>{xtwitter_username}</span> <span style='font-family:\"Font Awesome 6 Brands\";'>{linkedin};</span> <span style='color: {text_col}'>{linkedin_username}</span>")


showtext_auto()
set.seed(4)
sw_graph |>
  activate(nodes) |>
  mutate(col_var = as.character(col_var)) |>
  filter(value > 40) |>
  
  # Start plotting network graph
  ggraph(layout = "nicely") +
  
  # Edges
  geom_edge_bend2(
    aes(
      colour = node.col_var,
      width = weightage
    ),
    lineend = "round"
  ) +
  
  # Nodes
  geom_node_point(
    aes(
      size = value,
      fill = col_var
    ),
    pch = 21,
    colour = "white",
    stroke = 2
  ) +
  
  # Labels
  geom_node_text(
    aes(label = firstname),
    check_overlap = TRUE,
    repel = TRUE,
    fontface = "bold"
  ) +
  labs(title = "Network showing appearances of main characters in Star Wars films",
       subtitle = "Size of dot indicates screen-time; width of link indicates common appearances in scenes",
       caption = social_caption) +
  scale_size_area(max_size = 15) +
  theme_void() +
  theme(
    legend.position = "none",
    plot.title = element_text(
      family = "font_title",
      face = "bold",
      size = 21,
      hjust = 0
    ),
    plot.subtitle = element_text(
      family = "font_body",
      hjust = 0
    ),
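    # The caption contains HTML <span> tags, so it needs a markdown-aware
    # theme element (from the ggtext package) to be rendered
    plot.caption = ggtext::element_markdown(
      hjust = 1,
      family = "font_title"
    )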
  )

Figure 4: Customizing the links between the nodes with geom_edge_bend2()
  • Faceting: Lastly, we can also facet on the nodes or on the edges to produce multiple network graphs, as shown in Figure 5 for the entire Star Wars cast (not just the most important characters):
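
In the code for Figure 4, the edge colour is mapped to node.col_var. The node. prefix is how the "2" variants of the edge geoms refer to attributes of the nodes at either end of an edge, interpolating between them along the edge. Here is a minimal sketch on a toy ring graph; the names toy, grp and p are hypothetical.

```r
library(tidygraph)
library(ggraph)
library(dplyr)

# Four nodes in a ring, with a hypothetical node attribute `grp`
toy <- create_ring(4) |>
  activate(nodes) |>
  mutate(grp = c("a", "a", "b", "b"))

p <- ggraph(toy, layout = "circle") +
  # node.grp refers to the `grp` attribute of each edge's end nodes
  geom_edge_link2(aes(colour = node.grp), edge_width = 2) +
  geom_node_point(aes(fill = grp), shape = 21, size = 6) +
  theme_void()

p
```

Edges joining an "a" node to a "b" node should change colour partway along, which the plain geom_edge_link() cannot do because it only sees edge attributes.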
Code
labels_facet <- c(
  "Jedi & Sith Lords",
  "Prequel Trilogy", 
  "Original Trilogy",
  "Sequel Trilogy",
  "Villains",
  "An outlier!"
)
labels_facet <- str_to_upper(labels_facet)
# Name the labels by group number so that as_labeller() can match them
names(labels_facet) <- 1:6

sw_graph |>
  activate(nodes) |> 
  mutate(firstname = snakecase::to_title_case(firstname)) |> 
  ggraph() +
  geom_edge_link(
    mapping = aes(width = weightage),
    alpha = 0.5,
    color = "grey"
  ) +
  geom_node_point(
    aes(size = value, 
        col = as.factor(col_var))) +
  geom_node_text(
    aes(label = firstname,
        size = value),
    repel = TRUE
  ) +
  # A single size scale: a second size scale would simply replace the first
  scale_size_continuous(range = c(6, 15)) +
  scale_color_brewer(palette = "Set2") +
  facet_nodes(~ col_var, 
              scales = "free",
              labeller = as_labeller(labels_facet),
              ncol = 2) +
  theme_void() +
  theme(
    legend.position = "none", 
    strip.text = element_text(
      family = "font_sw",
      hjust = 0.5,
      size = 36
    ),
    panel.border = element_rect(colour = "darkgrey",
                                fill = NA),
    strip.background = element_rect(colour = "darkgrey",
                                    fill = NA)
  )

Figure 5: Different networks of characters in Star Wars: a faceted ggraph
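
Figure 5 facets on a node attribute. Faceting on an edge attribute works in a similar way with facet_edges(). Here is a minimal sketch on a toy ring graph, where the edge attribute kind and the names toy and p are hypothetical.

```r
library(tidygraph)
library(ggraph)
library(dplyr)

# A ring of six nodes has six edges; label them alternately
toy <- create_ring(6) |>
  activate(edges) |>
  mutate(kind = rep(c("strong", "weak"), times = 3))

p <- ggraph(toy, layout = "circle") +
  geom_edge_link() +
  geom_node_point(size = 4) +
  # One panel per value of the edge attribute
  facet_edges(~ kind) +
  theme_void()

p
```

Each panel shows only the edges of one kind, while the nodes are repeated in every panel.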

References

Gabasova, Evelina. 2016. “Star Wars Social Network.” Zenodo. https://doi.org/10.5281/ZENODO.1411479.
Ooms, Jeroen. 2014. “The Jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” https://arxiv.org/abs/1403.2805.
Pedersen, Thomas Lin. 2023. “Tidygraph: A Tidy API for Graph Manipulation.” https://CRAN.R-project.org/package=tidygraph.

Footnotes

  1. The layout descriptions are copied from the RDocumentation page for ggraph (version 2.1.0), layout_tbl_graph_igraph: using igraph layout algorithms for layout_tbl_graph.