Computing shortest routes in the sea that avoid land

Using data from Killer Whales encounters in Salish Sea to plot routes of their recorded encounters, and showing those routes that dont intersect land - i.e., removing erroneous data.

Routes

Background Map

Raster Map

{sf}

Author

Aditya Dahiya

Published

October 20, 2024

Dataset used: This #TidyTuesday dataset comes from the Center for Whale Research (CWR), which monitors Southern Resident killer whales in the Salish Sea, part of the Pacific Northwest. The dataset, scraped by Jadey Ryan and documented here, contains information on encounters from 2017 to 2024. Each encounter involves photographing and identifying individual whales. The data can be accessed via the {orcas} R package and includes variables like encounter duration, location, and pod. While the dataset is mostly tidy, some inconsistencies such as missing values and negative durations remain. | Source | Data.

Here’s my #TidyTuesday Visualization for this project in Figure 1, and the code used and the visualization webpage.

Figure 1: This map visualizes the movements of Southern Resident killer whales, with arrows marking the starting and ending points of each recorded encounter. The concentration of arrows within a small area highlights the key regions in the Salish Sea where these encounters occur most frequently. Background map images provided by StadiaMaps.

Method 1: Using {geosphere} and {sf}

Step 1: Loading libraries, getting the data and cleaning it

Code

# Loading the Libraries
library(tidyverse)       # Data wrangling
library(sf)              # SF objects
library(showtext)        # Using google fonts in R
library(geosphere)       # Spherical trigonometry for geography

# Set fonts for including in all graphics
font_add_google("Saira Semi Condensed", "body_font")
font_add_google("Saira Extra Condensed","caption_font")
showtext_auto()

# Load orcas data
orcas <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-10-15/orcas.csv')

sessioninfo::session_info()$packages |> 
  as_tibble() |> 
  dplyr::select(package, 
         version = loadedversion, 
         date, source) |>
  filter(package %in% .packages()) |> 
  arrange(package) |> 
  janitor::clean_names(
    case = "title"
  ) |> 
  gt::gt() |> 
  gt::opt_interactive(
    use_search = TRUE
  ) |> 
  gtExtras::gt_theme_espn()

Table 1: List of packages used during this analysis and their versions

Step 2: Clean the data, make it in {`sf`} format

Code

# A cleaner tibble to use for our visualization
df_sf <- orcas |> 
  as_tibble() |>
  dplyr::select(year, duration, 
         begin_latitude, begin_longitude,
         end_latitude, end_longitude) |> 

  # Convert the duration of encounter into seconds
  mutate(
    # remove parenthesis content from duration
    duration = str_remove(duration, "\\s*\\(.*\\)"),
    
    # remove the "s" for seconds
    duration = str_extract(duration, "-?\\d+"),
    
    # convert the duration into number
    duration = as.numeric(duration)
  ) |> 
  
  # Remove aberrant observation with durations less than zero
  filter(duration >= 0) |> 
  
  # Remove observations with missingness
  drop_na() |> 
  
  # Allot an ID number to each finally selected observation
  mutate(id = row_number()) |> 
  dplyr::relocate(id, .before = everything())

########################################################
# Get a bounding box on the coordinates of encounters
bbox_orcas <- bind_rows(
  
  # Geometry column of starting coordiantes
    sf::st_as_sf(
      df_sf |> dplyr::select(begin_latitude, begin_longitude),
      coords = c("begin_longitude","begin_latitude"),
      crs = 'EPSG:4326'
    ) |>
    mutate(type = "start_coords"),
    

  # Geometry column of starting coordiantes
    sf::st_as_sf(
      df_sf |> dplyr::select(end_latitude, end_longitude),
      coords = c("end_longitude","end_latitude"),
      crs = 'EPSG:4326'
    ) |>
    mutate(type = "end_coords")
) |> 
  st_bbox()

######################################################
# Display cleaned data

df_sf |> 
  gt::gt() |> 
  gt::opt_interactive() |> 
  gtExtras::gt_theme_espn() |> 
  gt::fmt_number(
    columns = -c(year, id),
    decimals = 2
  ) |> 
  gt::tab_header(
    title = "Cleaned Data on starting and ending coordinates",
    subtitle = "Recorded Encounters of Orcas (Southern Killer Whales) in the Salish Sea (2017-2024)"
  )

Table 2: A table of clean data that shows id, year, duration and starting and ending coordinates of each Killer Whale encounter as an {sf} class column

Cleaned Data on starting and ending coordinates

Recorded Encounters of Orcas (Southern Killer Whales) in the Salish Sea (2017-2024)

Step 3: Computing routes using `geosphere::gcIntermediate()`

Code

# Technique Credits: https://github.com/xmc811/flightplot/blob/master/R/main.R
# Credits: Mingchu Xu
# https://www.linkedin.com/in/mingchu-xu-467a0946/
# On Twitter: @xmc811

routes <- geosphere::gcIntermediate(
  p1 = df_sf |> dplyr::select(begin_longitude, begin_latitude),
  p2 = df_sf |> dplyr::select(end_longitude, end_latitude),
  n = 100,
  addStartEnd = TRUE,
  sp = TRUE) |> 
  sf::st_as_sf()

# Check whether it works
ggplot() +
  geom_sf(data = routes)

Figure 2: The routes computed by the {geosphere} package’s gcIntermediate() function

Step 4: Getting background maps for the Salish Sea area: USA and Canada

For this, we first convert our sf object on Map of USA and Canada, shown in Figure 3 (a) into a SpatVector. In the {terra} package in R, a SpatVector is the main class used to represent and work with vector data in geographic information system (GIS) contexts. A SpatVector can store points, lines, polygons, or any combination of these geometries, along with associated attributes (data linked to these geometries). We can create a SpatVector from:

Shapefiles (widely used for vector data in GIS)
Other vector file formats like GeoJSON, KML, or GPKG.
R objects such as data frames or matrices that contain coordinate information.

The reason to create a SpatVector is for performing spatial operations like buffering, intersecting, or spatial joins.The terra::vect() is the method to create these objects from various file formats or other R objects.

Then, we use the terra::crop() to crop USA and Canada map to a specified geographic extent Figure 3 (b) defined by bounding box of the sf objects of orcas created in previous step. When applied to a SpatVector object (vector data), terra::crop() trims the geometries (points, lines, or polygons) so that only the portions within a specified spatial extent remain. Lastly, we re-convert the cropped SpatVector back into an sf object, shown in ?@fig-basemap1-3

Code

# Extract country borders data
base_map <- rgeoboundaries::gb_adm0(country = c("USA", "Canada")) %>% 
  rmapshaper::ms_simplify(0.5)

# checking the size
# object.size(base_map) |> print(units = "Mb")
# 4.3 Mb

ggplot(base_map) +
  geom_sf() +
  coord_sf(
    crs = usmap::usmap_crs()
  )

sea <- terra::crop(terra::vect(base_map), bbox_orcas)

# Finally, reconvert the Cropped area back into an sf object
sea_sf <- sea |> 
  st_as_sf()

ggplot(sea_sf) +
  geom_sf()

(a) Map of entire USA and Canada from {rgeoboundaries}

Step 5: Computing and Removing routes intersecting with land (i.e., erroneous data)

Code

# Test if path is only sea. Each logical test if for each ID in the 
# df1 tibble
test_intersect <- lengths(
  st_intersects(
    st_transform(routes, st_crs(base_map)), base_map
  )
) > 0

# Compute distance for each 
dist_encounter <- st_length(routes)

# Create a second tibble of distance & paths for each encounter
df_routes <- routes |> 
  st_transform(st_crs(base_map)) |> 
  bind_cols(id = df_sf$id) |> 
  bind_cols(whether_intersect_land = test_intersect) |> 
  bind_cols(dist_encounter = as.numeric(dist_encounter)) |> 
  left_join(df_sf |> dplyr::select(year, id, duration)) |> 
  mutate(speed = dist_encounter / duration)
  
# A vector of IDs whose paths dont intersect land
ids_to_plot <- df_routes |> 
  filter(whether_intersect_land == FALSE) |> 
  pull(id)

df_routes |> 
  dplyr::relocate(geometry, .after = everything()) |> 
  gt::gt() |> 
  # gt::cols_hide(geometry) |> 
  gt::fmt_number(
    columns = c(dist_encounter, speed),
    decimals = 1
  ) |> 
  gt::fmt(
    columns = geometry,
    fns = function(x) {str_sub(x, 1, 20)}
  ) |> 
  gt::opt_interactive() |> 
  gtExtras::gt_theme_espn()

Table 3: A table of intersecting and non-intersecting routes

Step 6: Plotting the non-intersecting routes in Figure 4

Code

# ggmap::register_stadiamaps("YOUR-KEY-HERE")

bbox_stadiamap <- c(left = bbox_orcas["xmin"],
  right = bbox_orcas["xmax"],
  top = bbox_orcas["ymax"],
  bottom = bbox_orcas["ymin"])
names(bbox_stadiamap) <- c("left", "right", "top", "bottom")

# Getting the Stamen Maps for the background tiles as raster
stamen_tiles_lowres <- ggmap::get_stadiamap(
   bbox_stadiamap,
   zoom = 8,
   maptype = "stamen_terrain"
)

# Step: 1: 
# Credits: https://stackoverflow.com/questions/47749078/how-to-put-a-geom-sf-produced-map-on-top-of-a-ggmap-produced-raster by andyteucher on StackOverFlow (https://stackoverflow.com/users/1736291/andyteucher)

# Define a function to fix the bbox to be in CRS EPSG:3857
ggmap_bbox <- function(map) {
  # Extract the bounding box (in lat/lon) from the ggmap
  # to a numeric vector, and set the names to what
  # sf::st_bbox expects:
  map_bbox <- setNames(
    unlist(attr(map, "bb")),
    c("ymin", "xmin", "ymax", "xmax")
  )

  # Coonvert the bbox to an sf polygon, transform it to 3857,
  # and convert back to a bbox (convoluted, but it works)
  bbox_3857 <- st_bbox(
    st_transform(
      st_as_sfc(
        st_bbox(map_bbox, crs = 4326)
        ), 
      3857
    )
  )

  # Overwrite the bbox of the ggmap object with the transformed coordinates
  attr(map, "bb")$ll.lat <- bbox_3857["ymin"]
  attr(map, "bb")$ll.lon <- bbox_3857["xmin"]
  attr(map, "bb")$ur.lat <- bbox_3857["ymax"]
  attr(map, "bb")$ur.lon <- bbox_3857["xmax"]
  map
}

# Use the function to convert our downloaded Raster Files into 
# the new CRS and new bounding box CRS
stamen_tiles_lowres2 <- ggmap_bbox(stamen_tiles_lowres)

g <- ggmap::ggmap(stamen_tiles_lowres2) +
  geom_sf(
    data = df_routes |> 
            filter(!whether_intersect_land) |> 
            st_transform(crs = 3857),
    mapping = aes(
      geometry = geometry
    ),
    color = "grey10",
    alpha = 0.5,
    inherit.aes = FALSE,
    arrow = arrow(
      angle = 20,
      length = unit(0.5, "mm")
    ),
    linewidth = 0.2
  ) +
  coord_sf(
    expand = F
  ) +
  labs(
    x = "Longitude", y = "Latitude"
  ) +
  theme_minimal()

ggsave(
  plot = g,
  filename = here::here("geocomputation", "images", "computing_sea_routes_1.png"),
  width = 900,
  height = 700,
  units = "px",
  bg = "white"
)

Figure 4: This map visualizes the movements of Southern Resident killer whales, with arrows marking the starting and ending points of each recorded encounter. The concentration of arrows within a small area highlights the key regions in the Salish Sea where these encounters occur most frequently. Background map images provided by StadiaMaps.

Method 2: Using raster and `gdistance` package

The following code for computing sea routes is inspired from Code by Benjamin Nowak hosted on GitHub as a part of #TidyTuesday visualizations.

Step 1: Getting the data and cleaning it

The packages used in this analysis are shown in the Table 4

Code

# Loading the Libraries
library(tidyverse)       # Data wrangling
library(rgeoboundaries)  # Getting country & admin boundaries.
library(rmapshaper)      # For multi-polygon simplification
library(sf)              # SF objects
library(terra)           # Spatial data analysis
library(tidyterra)       # Tidyverse methods for terra objects
library(gdistance)       # Distances and Routes on Geographical Grids
library(showtext)        # Using google fonts in R

# Set fonts for including in all graphics
font_add_google("Saira Semi Condensed", "body_font")
font_add_google("Saira Extra Condensed","caption_font")
showtext_auto()

# Load orcas data
orcas <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-10-15/orcas.csv')

sessioninfo::session_info()$packages |> 
  as_tibble() |> 
  dplyr::select(package, 
         version = loadedversion, 
         date, source) |> 
  filter(package %in% .packages()) |> 
  arrange(package) |> 
  janitor::clean_names(
    case = "title"
  ) |> 
  gt::gt() |> 
  gt::opt_interactive(
    use_search = TRUE
  ) |> 
  gtExtras::gt_theme_espn()

Table 4: List of packages used during this analysis and their versions

Step 2: Clean the data

Code

# A cleaner tibble to use for our visualization
df1 <- orcas |> 
  as_tibble() |>
  dplyr::select(year, duration, 
         begin_latitude, begin_longitude,
         end_latitude, end_longitude) |> 

  # Convert the duration of encounter into seconds
  mutate(
    # remove parenthesis content from duration
    duration = str_remove(duration, "\\s*\\(.*\\)"),
    
    # remove the "s" for seconds
    duration = str_extract(duration, "-?\\d+"),
    
    # convert the duration into number
    duration = as.numeric(duration)
  ) |> 
  
  # Remove aberrant observation with durations less than zero
  filter(duration >= 0) |> 
  
  # Remove observations with missingness
  drop_na() |> 
  
  # Allot an ID number to each finally selected observation
  mutate(id = row_number()) |> 
  dplyr::relocate(id, .before = everything())

df1 |> 
  gt::gt() |> 
  gt::opt_interactive() |> 
  gtExtras::gt_theme_espn() |> 
  gt::fmt_number(
    columns = -c(year, id),
    decimals = 2
  )

Table 5: A table of clean data that shows id, year, duration and starting and ending coordinates of each Killer Whale encounter

Step 3: Creating data is specific formats needed by {`sf`} `{terra}`

Code

# The starting Coordinates as an sf object
start_coordinates <- df1 |> 
  dplyr::select(id, begin_latitude, begin_longitude) |> 
  sf::st_as_sf(
    coords = c("begin_longitude","begin_latitude"),
    crs = 'EPSG:4326'
  ) |> 
  mutate(type = "start")

# The ending Coordinates as an sf object  
end_coordinates <- df1 |> 
  dplyr::select(id, end_latitude, end_longitude) |> 
  sf::st_as_sf(
    coords = c("end_longitude","end_latitude"),
    crs = 'EPSG:4326'
  ) |> 
  mutate(type = "end")

# Compiling starting and ending coordinates into a tibble
orcas_sf <- start_coordinates |> 
  bind_rows(end_coordinates)

# Extracting the bounding box of that tibble to get our map
bb_orcas <- st_bbox(orcas_sf)

orcas_sf |> 
  print(n = 10)

Simple feature collection with 1168 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -125.6233 ymin: 47.85333 xmax: -122.0445 ymax: 49.55733
Geodetic CRS:  WGS 84
# A tibble: 1,168 × 3
      id             geometry type 
 * <int>          <POINT [°]> <chr>
 1     1  (-124.6925 48.5105) start
 2     2  (-123.1578 48.4765) start
 3     3    (-123.2 48.57167) start
 4     4 (-123.3367 49.09267) start
 5     5 (-123.0368 48.79683) start
 6     6 (-122.9295 48.43633) start
 7     7 (-123.8992 48.84733) start
 8     8  (-123.592 48.02783) start
 9     9    (-123.4358 48.87) start
10    10 (-123.3302 48.80333) start
# ℹ 1,158 more rows

Step 4: Getting background maps for the Salish Sea area: USA and Canada

For description of the actions performed, please see Step 4 of the Method 1 above.

Code

# Extract country borders data
base_map <- rgeoboundaries::gb_adm0(country = c("USA", "Canada")) %>% 
  rmapshaper::ms_simplify(0.5)

# checking the size
# object.size(base_map) |> print(units = "Mb")
# 4.3 Mb

ggplot(base_map) +
  geom_sf() +
  coord_sf(
    crs = usmap::usmap_crs()
  )

sea <- terra::crop(terra::vect(base_map), bb_orcas)

ggplot(sea) +
  geom_sf()

# Finally, reconvert the Cropped area back into an sf object
sea_sf <- sea |> 
  st_as_sf()

ggplot(sea_sf) +
  geom_sf()

Step 5: Converting into rasters to plot compute distances and intersections

terra::rast() function is used to create a raster object or load an existing raster dataset (e.g., GeoTIFF, ASCII, or other raster formats). A raster is a grid of cells (pixels) where each cell has a value representing information such as elevation, temperature, land cover, etc. We use it here to Create an empty raster, defining the number of rows, columns, extent, and coordinate reference system (CRS) to create a raster template.

terra::rasterize() function is then used to convert vector data (points, lines, polygons) into raster format. This process assigns values from vector geometries to raster cells, typically based on whether the geometries overlap with the cells or using attributes from the vector data. For example, here we are Rasterizing polygons: i.e., for each land types: USA, Canada or other, we can rasterize it so that each raster cell represents one of these three.

Code

# Convert vector to raster and set highly diffferent pixel values based on whether an area is sea or land (i.e. not sea)

# Step 5.1: Generate an empty raster defining the resolution by
#           number of rows and columns, and a CRS from sea_sf
r <- terra::rast(sea_sf, ncols = 500, nrows = 500)

ggplot() +
  geom_spatraster(data = r)

rr <- terra::rasterize(sea_sf, r, "shapeISO") %>%
  mutate(shapeISO = case_when(
    shapeISO %in% c('CAN', 'USA') ~ 1,  # assign value 1 to land
    TRUE ~ 1000                         # assign value 1000 to sea
  ))

ggplot() +
  geom_spatraster(data = rr)

Figure 6: The empty raster of 500 X 500 created using terra::rast()

Figure 7: The raster is enhanced by adding the polygons data from the salish sea area map we cropped in the last step.

Step 6: Computing the distance between starting and ending coordinates

The gdistance::transition() function is used to create a transition matrix from raster data. The Transition Matrix is a sparse matrix where each element represents the movement “cost” or “resistance” from one cell to its neighboring cells. The transition matrix enables the calculation of the most efficient path (e.g., the least-cost path) from one location to another. Thus, it is used for calculating least-cost paths, commute distances, and other kinds of spatial movement analyses.

The transition matrix helps in Movement Modeling: It is used to model movement across a landscape, such as wildlife migration, water flow, or human navigation, where each raster cell’s value might represent an obstacle or ease of travel. Note that we had assigned different value to land and sea raster points.
After creating a transition layer using gdistance::transition(), the gdistance::geoCorrection() function is used to apply geographic corrections to account for the varying distances between raster cells due to the curvature of the Earth or grid layout. This step is crucial when working with spatial data in a geographic coordinate system (e.g., latitude and longitude) where distances between cells are not uniform.
The gdistance::shortestPath() is then used to compute the shortest (or least-cost) path between two points on a raster grid, based on a transition matrix that describes the “cost” or “resistance” of moving from one cell to another. The function calculates this path by minimizing the total cost or resistance, taking into account the values in the transition matrix, which typically represent the difficulty or ease of moving through each cell.

Code

# For quick rendering of this .qmd file, I have not evaluated 
# this chunk of code, and rather saved the results of "distance"
# as an .rds file and reloaded it.

# Compute transition matrix from raster pixels
r_trans <- gdistance::transition(
  x = raster(rr), 
  transitionFunction = mean, 
  directions = 16
)
# object.size(r_trans) |> print(units = "Mb")
# 24.7 Mb

r_trans <- geoCorrection(r_trans)
# object.size(r_trans) |> print(units = "Mb")
# 24.7 Mb

# Compute the shortest path between start and end for the 
# first line of the transition matrix, and convert into sf object:

distance <- gdistance::shortestPath(
  r_trans, 
  c(df1 |> filter(id == 1) |> pull(begin_longitude),
    df1 |> filter(id == 1) |> pull(begin_latitude)), 
  c(df1 |> filter(id == 1) |> pull(end_longitude),
    df1 |> filter(id == 1) |> pull(end_latitude)), 
  output = "SpatialLines"
) |> 
  st_as_sf()

# Repeat the process for the other points / IDs
for (i in 2:nrow(df1)) {
  
  temp <- gdistance::shortestPath(
  r_trans, 
  c(df1 |> filter(id == i) |> pull(begin_longitude),
    df1 |> filter(id == i) |> pull(begin_latitude)), 
  c(df1 |> filter(id == i) |> pull(end_longitude),
    df1 |> filter(id == i) |> pull(end_latitude)), 
  output = "SpatialLines"
) |> 
  st_as_sf()
  
  distance <- distance |> 
    bind_rows(temp)

}

# Add a CRS to the newly created sf object 
distance <- distance |> 
  st_set_crs(st_crs(base_map))

saveRDS(distance, file = here::here("data", "orcas_distance.rds"))

Code

distance <- readRDS(here::here("data", "orcas_distance.rds"))

# Displaying the shortest paths
ggplot() +
  
  geom_sf(
    data = sea_sf, 
    alpha = 0.75, 
    fill = "#725428") +

  geom_sf(
    data = distance
  ) +
  coord_sf(expand = FALSE) +
  
  theme(
    panel.background = element_rect(fill = "#b6e3db")
  )

Figure 8: The shortest paths computed using {gdistance} show us that many of them are passing over land - these seem to be errors in the data

Step 7: Check whether paths intersect land, and retain only non-intersecting routes

The sf::st_intersects() function is used to determine whether two spatial geometries intersect. It checks if any part of one geometry touches or overlaps with another.

The sf::st_length() function is then used to calculate the length of geometries represented in sf (simple features) objects. It returns the length of each geometry in the specified unit of measurement.

Code

# Test if path is only sea. Each logical test if for each ID in the 
# df1 tibble
test_intersect <- lengths(st_intersects(distance, base_map)) > 0

# Compute distance for each 
dist_encounter <- st_length(distance)

# Create a second tibble of distance & paths for each encounter
df2 <- distance |> 
  bind_cols(id = start_coordinates$id) |> 
  bind_cols(whether_intersect_land = test_intersect) |> 
  bind_cols(dist_encounter = as.numeric(dist_encounter)) |> 
  left_join(df1 |> dplyr::select(year, id, duration)) |> 
  mutate(speed = dist_encounter / duration)
  
# A vector of IDs whose paths dont intersect land
ids_to_plot <- df2 |> 
  filter(whether_intersect_land == FALSE) |> 
  pull(id)

df2 |> 
  dplyr::relocate(geometry, .after = everything()) |> 
  slice_head(n = 10) |> 
  mutate(geometry = as.character(geometry)) |> 
  gt::gt() |> 
  # gt::cols_hide(geometry) |> 
  gt::fmt_number(
    columns = c(dist_encounter, speed),
    decimals = 2
  ) |> 
  gt::fmt(
    columns = geometry,
    fns = function(x) {str_sub(x, 1, 50)}
  )

id	whether_intersect_land	dist_encounter	year	duration	speed	geometry
1	FALSE	12,981.52	2024	5580	2.33	c(-124.689260978699, -124.703576316833, -124.71789
2	FALSE	757.91	2024	2460	0.31	c(-123.157519798279, -123.157519798279, -123.15751
3	FALSE	1,845.91	2024	9900	0.19	c(-123.200465812683, -123.193308143616, -123.18615
4	FALSE	11,256.39	2024	5460	2.06	c(-123.336461524963, -123.336461524963, -123.33646
5	FALSE	6,932.37	2024	2460	2.82	c(-123.035839424133, -123.035839424133, -123.03583
6	FALSE	11,305.02	2024	6300	1.79	c(-122.928474388123, -122.921316719055, -122.91415
7	TRUE	7,742.05	2024	6360	1.22	c(-123.901917381287, -123.901917381287, -123.90191
8	TRUE	130,922.24	2024	2340	55.95	c(-123.594137611389, -123.594137611389, -123.59413
9	FALSE	7,738.62	2024	3900	1.98	c(-123.436668891907, -123.443826560974, -123.45098
10	FALSE	2,816.18	2024	840	3.35	c(-123.329303855896, -123.329303855896, -123.32930

Step 8: Plotting the final routes with ggplot2

Code

land <- "#725428"
sea <- "#b6e3db"
orc <- "grey10"

ggplot() +
  geom_sf(
    data = sea_sf,
    mapping = aes(geometry = geometry),
    fill = land, color = NA,
    alpha = 0.9
  ) +
  geom_sf(
    df2 |> filter(!whether_intersect_land),
    mapping = aes(
      alpha = speed,
      geometry = geometry
    ),
    color = orc
  ) + 
  coord_sf(
    expand = F
  ) +
  scale_alpha(range = c(0.5, 0.9)) +
  guides(alpha = "none") +
  theme_minimal() +
  theme(
    panel.background = element_rect(fill = sea, color = NA)
  )

Figure 9: The final map with routes shown that don’t intersect land. The alpha (transparency) of each route is mapped to speed.

Method 1: Using {geosphere} and {sf}

Step 1: Loading libraries, getting the data and cleaning it

Step 2: Clean the data, make it in {sf} format

Step 3: Computing routes using geosphere::gcIntermediate()

Step 4: Getting background maps for the Salish Sea area: USA and Canada

Step 5: Computing and Removing routes intersecting with land (i.e., erroneous data)

Method 2: Using raster and gdistance package

Step 1: Getting the data and cleaning it

Step 2: Clean the data

Step 3: Creating data is specific formats needed by {sf} {terra}

Step 4: Getting background maps for the Salish Sea area: USA and Canada

Step 5: Converting into rasters to plot compute distances and intersections

Step 6: Computing the distance between starting and ending coordinates

Step 7: Check whether paths intersect land, and retain only non-intersecting routes

Step 8: Plotting the final routes with ggplot2

Step 2: Clean the data, make it in {`sf`} format

Step 3: Computing routes using `geosphere::gcIntermediate()`

Method 2: Using raster and `gdistance` package

Step 3: Creating data is specific formats needed by {`sf`} `{terra}`