Key Learnings from, and Solutions to the exercises in Chapter 8 of the book Geocomputation with R by Robin Lovelace, Jakub Nowosad and Jannes Muenchow.
Geocomputation with R
Textbook Solutions
Author
Aditya Dahiya
Published
January 19, 2025
Code
library(sf) # Simple Features in Rlibrary(terra) # Handling rasters in Rlibrary(tidyterra) # For plotting rasters in ggplot2library(magrittr) # Using pipes with raster objectslibrary(tidyverse) # All things tidy; Data Wranglinglibrary(spData) # Spatial Datasetslibrary(patchwork) # Composing plotslibrary(gt) # Display GT tables with R
8.1 Introduction
Importance of Data I/O: Geographic data input/output (I/O) is critical for geocomputation, ensuring accurate real-world applications and enabling data sharing.
Impact of Errors: Mistakes in data I/O (e.g., using outdated datasets) can cause significant issues later in projects.
8.2 File Formats
Geographic data storage: Stored as files or spatial databases. File formats can store either vector or raster data; databases can handle both, and the most common types are shown in Table 1.
Code
# A Code to fetch icons of different file formats# Get file icons for common file typesfile_format =c("Shapefile", "GeoJSON", "KML", "GPKG", "CSV", "DXF", "MapInfo File", "GML", "PostGIS", "TopoJSON", "GeoTIFF", "NetCDF", "JPEG", "PNG", "HDF4", "HDF5", "GRIB", "Erdas Imagine", "ASCII Grid", "MBTiles")temp_df <-tibble(file_format = file_format,id =1:20)# Get a custom google search engine and API key# Tutorial: https://developers.google.com/custom-search/v1/overview# Tutorial 2: https://programmablesearchengine.google.com/# google_api_key <- "LOAD YOUR GOOGLE API KEY HERE"# my_cx <- "GET YOUR CUSTOM SEARCH ENGINE ID HERE"# Load necessary packageslibrary(httr)library(magick)# Define function to download and save movie posterdownload_icons <-function(file_format_name, id) { api_key <- google_api_key cx <- my_cx# Build the API request URL url <-paste0("https://www.googleapis.com/customsearch/v1?q=", URLencode(paste0(file_format_name, " file icon svg")), "&cx=", cx, "&searchType=image&key=", api_key)# Make the request response <-GET(url) result <-content(response, "parsed")# Get the URL of the first image result image_url <- result$items[[1]]$link im <- magick::image_read(image_url) |>image_resize("x100")# set background as whiteimage_write(image = im,path = here::here("book_solutions", "images", "temp_icons", paste0("temp_icon_", id,".png")),format ="png" )}for (i in1:20) {download_icons(file_format_name = temp_df$file_format[i],id = i )}# for (i in c(1:3, 5:14, 16:17, 19)) {# download_icons(# file_format_name = temp_df$file_format[i],# id = i# )# }# Some custom workimage_url_4 <-"https://icons.veryicon.com/png/o/construction-tools/supermap-gis-product-color-system-function/postgis-workspace.png"image_url_15 <-"https://cdn.iconscout.com/icon/premium/png-256-thumb/hdf-file-2538306-2129659.png"image_url_18 <-"https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQA0Pb2JqaMO5UGuYCgZzRi4BXQCw9lbmTDvg&s"image_url_20 <-"https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRlucEC8-sm9U0IXIfY-wh_c3nBFKLyg0JJdQ&s"image_url_19 <-"https://cdn.iconscout.com/icon/premium/png-256-thumb/ascii-file-1934516-1634566.png?f=webp&w=256"download_icon_custom <-function(id) { im <- magick::image_read(get(paste0("image_url_", id))) |>image_resize("x100")# set background as whiteimage_write(image = im,path = here::here("book_solutions", "images", "temp_icons",paste0("temp_icon_", id,".png")),format ="png" )}for (i inc(4, 15, 18, 20)) {download_icon_custom(i)}image_url_7 <-"https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcR0Zb1IW-Ui71fJQy4G5kj3Ue2zQop38Kn1mA&s"image_url_12 <-"https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTr8HP5xS8im3nK5Zm8l4iP8zDrGhoiakFbCA&s"for (i inc(7, 12, 19)) {download_icon_custom(i)}image_url_11 <-"https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSHPryNCuawMSnTpre5hHZZiGuWqAhRQsjWQg&s"download_icon_custom(11)
Code
# Get file icons for common file typesfile_format =c("Shapefile", "GeoJSON", "KML", "GPKG", "CSV", "DXF", "MapInfo File", "GML", "PostGIS", "TopoJSON", "GeoTIFF", "NetCDF", "JPEG", "PNG", "HDF4", "HDF5", "GRIB", "Erdas Imagine", "ASCII Grid", "MBTiles")temp_df <-tibble(file_format = file_format,id =1:20)# Raster formats tibbleraster_formats <-tibble(Format =c("GeoTIFF", "NetCDF", "JPEG", "PNG", "HDF4", "HDF5", "GRIB", "Erdas Imagine", "ASCII Grid", "MBTiles" ),Description =c("Tagged Image File Format for georeferenced raster data.","Network Common Data Form for array-oriented scientific data.","Compressed image format commonly used for photographic images.","Portable Network Graphics format for raster graphics.","Hierarchical Data Format v4 for scientific data.","Hierarchical Data Format v5, successor to HDF4.","Gridded Binary format for meteorological data.","Raster format for Erdas Imagine software.","ASCII representation of raster data.","SQLite-based raster tile storage format." ),Extension =c(".tif/.tiff", ".nc", ".jpg/.jpeg", ".png", ".hdf", ".h5", ".grb/.grib", ".img", ".asc", ".mbtiles" ),GeoReferencing =c("Yes", "Yes", "No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes" )) |>mutate(group ="Raster File Formats")# Vector formats tibblevector_formats <-tibble(Format =c("Shapefile", "GeoJSON", "KML", "GPKG", "CSV", "DXF", "MapInfo File", "GML", "PostGIS", "TopoJSON"),Description =c("Esri Shapefile format for vector geometries.","GeoJSON format for encoding a variety of geographic data structures.","Keyhole Markup Language for geographic data in XML.","Geopackage format for spatial data storage.","Comma-Separated Values for point data with attributes.","Drawing Exchange Format for CAD data.","MapInfo TAB format for vector geometries.","Geography Markup Language for XML-based vector data.","PostGIS spatial extension for PostgreSQL database.","TopoJSON format for topology-preserving JSON data." ),Extension =c(".shp", ".geojson", ".kml", ".gpkg", ".csv", ".dxf", ".tab", ".gml", "PostgreSQL", ".topojson"),GeoReferencing =c("Yes", "Yes", "Yes", "Yes", "No", "No", "Yes", "Yes", "Yes", "Yes" )) |>mutate(group ="Vector File Formats" )# Display vector formats tablebind_rows( vector_formats, raster_formats) |>left_join(temp_df, by =join_by(Format == file_format)) |>mutate(file_path =paste0("images/temp_icons/temp_icon_", id, ".png")) |>select(-id) |>group_by(group) |>gt() |>cols_label(Format ="File Format",Description ="Description",Extension ="File Extension",GeoReferencing ="Geo-referencing",file_path ="File Icon" ) |>tab_footnote(md("_**Source**: GDAL documentation (https://gdal.org/)_") ) |>tab_header(title ="Different kinds of file formats for geographical data" ) |> gtExtras::gt_theme_espn() |>fmt_image(columns = file_path ) |>tab_style(style =list(cell_text(weight ="bold") ),locations =cells_body(columns = Format ) ) |>tab_style(style =list(cell_text(color ="darkgreen", weight ="bold") ),locations =cells_body(rows = GeoReferencing =="Yes",columns = GeoReferencing ) ) |>tab_style(style =list(cell_text(color ="red", weight ="bold") ),locations =cells_body(rows = GeoReferencing =="No",columns = GeoReferencing ) ) |>tab_style(style =list(cell_text(align ="center") ),locations =cells_body(columns =c(Extension, GeoReferencing) ) ) |>tab_style(style =list(cell_text(align ="left") ),locations =cells_body(columns = Description ) ) |>tab_style(style =list(cell_fill(color ="lightgrey"), cell_text(weight ="bold", align ="center") ),locations =cells_row_groups() ) |>tab_style(style =list(cell_text(align ="center") ),locations =cells_title() )
Table 1: Different kinds of File Formats dealing with Geo-Spatial Data
Different kinds of file formats for geographical data
File Format
Description
File Extension
Geo-referencing
File Icon
Vector File Formats
Shapefile
Esri Shapefile format for vector geometries.
.shp
Yes
GeoJSON
GeoJSON format for encoding a variety of geographic data structures.
.geojson
Yes
KML
Keyhole Markup Language for geographic data in XML.
.kml
Yes
GPKG
Geopackage format for spatial data storage.
.gpkg
Yes
CSV
Comma-Separated Values for point data with attributes.
.csv
No
DXF
Drawing Exchange Format for CAD data.
.dxf
No
MapInfo File
MapInfo TAB format for vector geometries.
.tab
Yes
GML
Geography Markup Language for XML-based vector data.
.gml
Yes
PostGIS
PostGIS spatial extension for PostgreSQL database.
PostgreSQL
Yes
TopoJSON
TopoJSON format for topology-preserving JSON data.
.topojson
Yes
Raster File Formats
GeoTIFF
Tagged Image File Format for georeferenced raster data.
.tif/.tiff
Yes
NetCDF
Network Common Data Form for array-oriented scientific data.
.nc
Yes
JPEG
Compressed image format commonly used for photographic images.
.jpg/.jpeg
No
PNG
Portable Network Graphics format for raster graphics.
.png
No
HDF4
Hierarchical Data Format v4 for scientific data.
.hdf
Yes
HDF5
Hierarchical Data Format v5, successor to HDF4.
.h5
Yes
GRIB
Gridded Binary format for meteorological data.
.grb/.grib
Yes
Erdas Imagine
Raster format for Erdas Imagine software.
.img
Yes
ASCII Grid
ASCII representation of raster data.
.asc
Yes
MBTiles
SQLite-based raster tile storage format.
.mbtiles
Yes
Source: GDAL documentation (https://gdal.org/)
GDAL (Geospatial Data Abstraction Library):
Unified library for reading/writing raster and vector formats since the year 2000.
Powers GIS software like GRASS GIS, ArcGIS, and QGIS.
Supports over 200 vector and raster formats.
Common vector file formats:
ESRI Shapefile: Multi-file format; limited by 2GB size, 255 columns, and 10-character names.
GeoJSON: JSON-based, supports coordinates; open format.
KML/KMZ: XML-based; used for Google Earth.
GPX: GPS data exchange.
FlatGeobuf: Fast, single-file with streaming capabilities.
Common raster file formats:
GeoTIFF: Popular raster format embedding CRS; supports COG (Cloud Optimized GeoTIFF) for efficient online access.
Arc ASCII: Text-based raster format with header and cell values.
Database formats:
SQLite/SpatiaLite: Lightweight relational database with spatial extensions.
ESRI FileGDB: Proprietary database format supporting feature classes and topology.
GeoPackage: Open standard SQLite-based lightweight spatial database for vector, raster, and non-spatial data.
Open standards by OGC (Open Geospatial Consortium):
Promotes transparency and user-driven development.
Formats like GML, KML, and GeoPackage adhere to OGC standards.
ShapeFiles and their Limitations
ESRI Shapefiles have been a cornerstone in geospatial data exchange since their introduction in the early 1990s.
Reasons for Shapefile Popularity:
Historical Prevalence: Introduced by ESRI in the early 1990s, Shapefiles became widely adopted due to their early entry into the GIS market.
Simplicity and Compatibility: Shapefiles are straightforward to use and are supported by a vast array of GIS software, facilitating easy sharing and exchange of geographic information across different platforms.
Open Specification: Despite being a proprietary format, ESRI published the technical specifications for Shapefiles, allowing other software developers to implement support for them, which contributed to their widespread adoption.
Contents of a ShapeFile Folder
.shp: Stores geometric shape data (points, lines, polygons).
.shx: Index file for quick spatial queries of shapes.
.dbf: Attribute table with associated data in tabular format.
.prj: Coordinate system and projection information.
.cpg: Encoding of character data (e.g., UTF-8).
.sbn/.sbx: Optional spatial index for faster searches.
.qpj: Alternative projection format used by some software.
Limitations of Shapefiles:
Multi-File Structure: A single Shapefile consists of multiple files (e.g., .shp, .shx, .dbf), which can complicate file management and increase the risk of losing associated files.
Attribute Constraints: Field names in the attribute table are limited to 10 characters, and the maximum number of fields is 255, restricting the complexity of attribute data that can be stored.
File Size Limitation: Shapefiles have a maximum size limit of 2 GB, which can be restrictive for large datasets.
Lack of Topological Information: Shapefiles do not store topological relationships between features, limiting their utility in analyses that require such information.
Limited Data Types and Encoding: Shapefiles have poor support for Unicode, limiting international character representation, and they cannot store null values for numeric or text fields, which can lead to data inaccuracies.
Emerging Alternatives:
GeoPackage: An open, standards-based format developed by the Open Geospatial Consortium (OGC), GeoPackage uses a single SQLite database file to store both vector and raster data, overcoming many of the limitations associated with Shapefiles.
File Geodatabase: ESRI’s File Geodatabase offers a more robust solution with no size limitations, support for complex data types, and improved performance, making it a suitable alternative for users within the ESRI ecosystem.
GeoJSON: A format based on JavaScript Object Notation (JSON), GeoJSON is widely used for web applications and supports a variety of geometry types, making it a flexible alternative for sharing geospatial data, especially in web environments.
FlatGeobuf: A modern, open format designed for performant reading and writing of geospatial data, FlatGeobuf supports spatial indexing and is suitable for web and cloud environments.
Other new emerging formats:
LAS/LAZ: For lidar point clouds.
NetCDF/HDF: For multidimensional arrays.
Emerging formats: GeoArrow, Zarr.
Non-spatial formats:
Tabular formats (e.g., CSV, Excel) used for simple spatial data sharing.
Lack support for complex geometries and spatial metadata like CRS.
Loading vector data: Use sf::read_sf() to load vector data into R.
Loading raster data: Use terra::rast() to read raster data.
Data storage: Imported data is stored in RAM and becomes accessible in the .GlobalEnv of the session.
Workspace management: Imported objects can be listed with ls() and are visible in IDE ‘Environment’ panels.
8.3.1 Vector Data
Spatial vector data formats: Common formats include .geojson, .gpkg, .shp. Import these with sf::read_sf() or sf::st_read(), which rely on GDAL drivers.
Driver overview: Use sf::st_drivers() to see available GDAL drivers, their features, and file format capabilities (e.g., vector, raster support).
Import Geospatial Data into DuckDB: This command reads the specified shapefile and creates a table named spatial_data containing the geospatial information.
Perform Spatial Queries in DuckDB: With the spatial extension, you can execute spatial functions such as ST_Intersects, ST_Within, and ST_Contains directly within DuckDB.
Export Data to R for Use with {sf}: After processing, you can export the geospatial data back to R and convert it into an {sf} object for further analysis or visualization.
spatial_df <-dbGetQuery(con, "SELECT * FROM spatial_data;")library(sf)sf_object <-st_as_sf(spatial_df, wkt ="geometry_column")
8.3.2 Raster data
Raster File Formats:
Raster data supports single-layer and multi-layer files.
Use the rast() function from the terra package to load raster files.
Reading Online Raster Data: Use the /vsicurl/ prefix to read raster files from web resources (e.g., HTTP, FTP).
Note
Cloud Optimized GeoTIFF (COG) and GeoTIFF Files
GeoTIFF:
GeoTIFF is a public domain metadata standard that allows geo-referencing information to be embedded within a TIFF file.
Stores raster data with spatial reference, essential for GIS and remote sensing applications.
Includes metadata like coordinate system, map projections, and georeferencing tags.
A specialized format of GeoTIFF designed for efficient access and processing over the web.
COG files can be accessed using URLs with the /vsicurl/ prefix in GDAL-supported libraries. Functions like terra::crop() or terra::extract() fetch required data portions on demand, without downloading the full dataset.
Thus, /vsicurl/ prefix establishes a connection to the COG file without loading it into RAM. Values are accessed only during specific operations, e.g., crop() or extract().
Advantages of COG:
Streaming Access: Reads only the necessary portions of data (e.g., specific coordinates), minimizing memory usage.
Compatibility: Readable by tools like GDAL, Rasterio, and R packages like terra and stars.
Virtual File Systems (VFS): GDAL provides several Virtual File System (VFS) prefixes (shown in Table 3) that allow direct access to various file systems, including compressed archives, network resources, and in-memory files. These prefixes can be utilized in R using functions like sf::read_sf() for vector data and terra::rast() for raster data. Source GDAL Virtual File Systems.
Code
# Create the tibblefile_access_table <-tibble(Prefix =c("/vsicurl/","/vsizip/","/vsi7z/","/vsis3/","/vsigs/","/vsicurl_streaming/" ),Description =c("Enables GDAL to access files over [HTTP](https://developer.mozilla.org/en-US/docs/Web/HTTP), [HTTPS](https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview), or [FTP](https://en.wikipedia.org/wiki/File_Transfer_Protocol) protocols, allowing direct reading of remote datasets such as [GeoTIFFs](https://gdal.org/drivers/raster/gtiff.html) or [Shapefiles](https://gdal.org/drivers/vector/shapefile.html) hosted online.","Allows reading files directly from [ZIP archives](https://en.wikipedia.org/wiki/Zip_(file_format)) without extraction. ZIP files can contain various geospatial data formats like [Shapefiles](https://gdal.org/drivers/vector/shapefile.html), [GeoJSON](https://gdal.org/drivers/vector/geojson.html), or [KML files](https://gdal.org/drivers/vector/kml.html).","Supports reading files directly from [7z archives](https://en.wikipedia.org/wiki/7z), a high-compression format. These archives can include geospatial files such as [GeoTIFFs](https://gdal.org/drivers/raster/gtiff.html) or [Shapefiles](https://gdal.org/drivers/vector/shapefile.html).","Facilitates access to files stored in [Amazon S3](https://aws.amazon.com/s3/) buckets, enabling operations on remote datasets like [GeoTIFFs](https://gdal.org/drivers/raster/gtiff.html) or [Shapefiles](https://gdal.org/drivers/vector/shapefile.html) stored in the cloud.","Provides access to files stored in [Google Cloud Storage](https://cloud.google.com/storage/), allowing GDAL to read datasets such as [GeoTIFFs](https://gdal.org/drivers/raster/gtiff.html) or [Shapefiles](https://gdal.org/drivers/vector/shapefile.html) directly from cloud storage.","Enables streaming of files over [HTTP](https://developer.mozilla.org/en-US/docs/Web/HTTP), [HTTPS](https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview), or [FTP](https://en.wikipedia.org/wiki/File_Transfer_Protocol) without seeking capabilities, suitable for sequential reading of large remote datasets like [GeoJSON](https://gdal.org/drivers/vector/geojson.html) files." ))# Display the tibble using gtfile_access_table |>gt() |>fmt_markdown(columns = Description) |>tab_header(title ="Virtual File(s) Access Methods with GDAL",subtitle =md("Prefixes, Descriptions, and Example Usage for `sf::read_sf()` & `terra::rast()`") ) |> gtExtras::gt_theme_espn() |>tab_footnote(footnote =md("GDAL virtual file system prefixes, their descriptions, and example usage with `read_sf()` and `rast()` functions in R for accessing remote and compressed files. For more details, refer to the [GDAL Virtual File Systems documentation](https://gdal.org/user/virtual_file_systems.html) and the [terra package reference](https://rspatial.org/terra/).") ) |>tab_style(style =cell_text(size ="smaller" ),locations =cells_footnotes() ) |>tab_style(style =list(cell_text(font ="monospace",weight ="bold" ) ),locations =cells_body(columns =c(Prefix) ) )
Table 3: Table showcasing various GDAL virtual file system prefixes for accessing remote and compressed files
Virtual File(s) Access Methods with GDAL
Prefixes, Descriptions, and Example Usage for sf::read_sf() & terra::rast()
Prefix
Description
/vsicurl/
Enables GDAL to access files over HTTP, HTTPS, or FTP protocols, allowing direct reading of remote datasets such as GeoTIFFs or Shapefiles hosted online.
/vsizip/
Allows reading files directly from ZIP archives without extraction. ZIP files can contain various geospatial data formats like Shapefiles, GeoJSON, or KML files.
/vsi7z/
Supports reading files directly from 7z archives, a high-compression format. These archives can include geospatial files such as GeoTIFFs or Shapefiles.
/vsis3/
Facilitates access to files stored in Amazon S3 buckets, enabling operations on remote datasets like GeoTIFFs or Shapefiles stored in the cloud.
Enables streaming of files over HTTP, HTTPS, or FTP without seeking capabilities, suitable for sequential reading of large remote datasets like GeoJSON files.
GDAL virtual file system prefixes, their descriptions, and example usage with read_sf() and rast() functions in R for accessing remote and compressed files. For more details, refer to the GDAL Virtual File Systems documentation and the terra package reference.
8.4 Data Output (O)
8.4.1 Vector Data
Writing vector data: The counterpart of read_sf() is write_sf(). It writes sf objects to various geographic vector file formats, shown in Table 4
Automatic driver selection: write_sf() automatically determines the file driver based on the file name.
Overwriting files: By default, write_sf() overwrites files if the same file name is used.
Code
library(tibble)library(gt)file_types <-tibble(`File Type`=c("ESRI Shapefile", "GeoJSON", "GPKG (GeoPackage)", "KML", "CSV with WKT geometry"),Extension =c(".shp", ".geojson", ".gpkg", ".kml", ".csv"),`Use Cases`=c("Widely used for vector data storage and exchange","Web applications and data sharing","Efficient storage and transfer of spatial data","Sharing geographic data with Google Earth and similar applications","Simple data exchange and interoperability" ),Advantages =c("Broad compatibility with many GIS applications","Human-readable; easy integration with web mapping libraries","Single-file SQLite-based format; supports both vector and raster data","Suitable for visualization in Google Earth","Easy to create and read; compatible with non-GIS software" ),Disadvantages =c("Limited attribute name lengths; consists of multiple files","Larger file sizes; slower read/write performance","Less widespread adoption compared to Shapefile","Limited attribute support; not ideal for large datasets","Lacks spatial indexing; larger file sizes; limited to simple geometries" ))file_types |>gt() |>tab_header(title =md("Comparison of File Formats Supported by `sf::st_write()`") ) |>tab_footnote(footnote =md("*Sources: R Spatial Documentation ([r-spatial.github.io](https://r-spatial.github.io/sf/reference/st_write.html)) and GDAL Vector Drivers Documentation ([gdal.org](https://gdal.org/drivers/vector/index.html)).*") ) |> gtExtras::gt_theme_espn() |>tab_style(style =list(cell_text(weight ="bold") ),locations =cells_body(columns =`File Type` ) ) |>tab_style(style =list(cell_text(font ="monospace") ),locations =cells_body(columns = Extension ) )
Table 4: The key file formats compatible with the st_write() function in the sf package, including their use cases, advantages, and disadvantages.
Comparison of File Formats Supported by sf::st_write()
File Type
Extension
Use Cases
Advantages
Disadvantages
ESRI Shapefile
.shp
Widely used for vector data storage and exchange
Broad compatibility with many GIS applications
Limited attribute name lengths; consists of multiple files
GeoJSON
.geojson
Web applications and data sharing
Human-readable; easy integration with web mapping libraries
Larger file sizes; slower read/write performance
GPKG (GeoPackage)
.gpkg
Efficient storage and transfer of spatial data
Single-file SQLite-based format; supports both vector and raster data
Less widespread adoption compared to Shapefile
KML
.kml
Sharing geographic data with Google Earth and similar applications
Suitable for visualization in Google Earth
Limited attribute support; not ideal for large datasets
CSV with WKT geometry
.csv
Simple data exchange and interoperability
Easy to create and read; compatible with non-GIS software
Lacks spatial indexing; larger file sizes; limited to simple geometries
Adding layers: Use the layer argument to add new layers to formats like GeoPackage instead of overwriting existing files.
Example: layer = "second_layer" for additional layers in a .gpkg file.
Writing spatial data as text: Use the layer_options argument to write spatial data to text files:
GEOMETRY=AS_XY: Creates two columns for coordinates (useful for point datasets).
GEOMETRY=AS_WKT: Creates a column with well-known text (WKT) representations for complex spatial data.
Alternative function:st_write() is equivalent to write_sf(). Key differences:
Does not overwrite files (throws an error instead).
Provides a summary of the written file format and object.
8.4.2 Raster Data
Saving Raster Data:
Use writeRaster() from the terra package to save SpatRaster objects. It supports custom file formats, shown in Table 5, data types, and GDAL options.
Code
library(tibble)library(gt)raster_formats <- tibble::tibble(`File Type`=c("GeoTIFF", "COG (Cloud Optimized GeoTIFF)", "NetCDF", "ESRI ASCII", "ENVI"),`Extension`=c(".tif", ".tif", ".nc", ".asc", ".envi"),`Use Cases`=c("Widely used for geospatial data storage and exchange","Efficient for web-based applications and cloud storage","Ideal for climate and atmospheric data; supports multidimensional data","Simple text format; useful for data exchange and readability","Commonly used in remote sensing applications" ),`Advantages`=c("Supports multiple bands; allows compression; retains georeferencing info","Optimized for cloud storage; supports streaming and efficient access","Handles large datasets efficiently; self-describing format","Human-readable; easy to edit and share","Supports multiple bands; retains metadata" ),`Disadvantages`=c("Larger file sizes if not compressed","Requires specific software to fully leverage streaming benefits","Complexity in structure; may require specific software to read","Larger file sizes; slower read/write operations","Requires accompanying header file; less widely supported" ))raster_formats |>gt() |>tab_header(title =md("Comparison of Raster File Formats Supported by `terra::writeRaster()`") ) |>tab_footnote(footnote =md("Sourced from the [terra package documentation](https://rdrr.io/cran/terra/man/writeRaster.html) and other geospatial data resources.") ) |> gtExtras::gt_theme_espn() |>tab_style(style =list(cell_text(weight ="bold") ),locations =cells_body(columns =`File Type` ) ) |>tab_style(style =list(cell_text(font ="monospace") ),locations =cells_body(columns = Extension ) )
Table 5: This table summarizes key details about various raster file formats supported by the terra::writeRaster() function in R
Comparison of Raster File Formats Supported by terra::writeRaster()
File Type
Extension
Use Cases
Advantages
Disadvantages
GeoTIFF
.tif
Widely used for geospatial data storage and exchange
Supports multiple bands; allows compression; retains georeferencing info
Larger file sizes if not compressed
COG (Cloud Optimized GeoTIFF)
.tif
Efficient for web-based applications and cloud storage
Optimized for cloud storage; supports streaming and efficient access
Requires specific software to fully leverage streaming benefits
NetCDF
.nc
Ideal for climate and atmospheric data; supports multidimensional data
Handles large datasets efficiently; self-describing format
Complexity in structure; may require specific software to read
ESRI ASCII
.asc
Simple text format; useful for data exchange and readability
Human-readable; easy to edit and share
Larger file sizes; slower read/write operations
ENVI
.envi
Commonly used in remote sensing applications
Supports multiple bands; retains metadata
Requires accompanying header file; less widely supported
Use unsigned integers (INT1U, INT2U, INT4U) for categorical data.
Use floats (FLT4S, FLT8S) for continuous data.
Default: FLT4S, but consider smaller types for efficiency based on data range.
Note
Optimizing GeoTIFFs for Efficient Map Serving
In his article “GeoTIFF Compression for Dummies,” Paul Ramsey discusses effective strategies for preparing GeoTIFF images for map serving. He recommends using JPEG compression, internal tiling, the YCBCR color space, and internal overviews to significantly reduce file sizes without noticeable loss in visual quality. These techniques enhance performance in open-source rendering engines like MapServer, GeoServer, and Mapnik. Additionally, Ramsey highlights the advantages of Cloud Optimized GeoTIFFs (COGs) for efficient access in cloud storage environments. Some key advantages of Cloud Optimized GeoTIFFs (COGs) over traditional GeoTIFF files:
Direct Access to Data: COGs allow efficient partial reads, enabling users to access specific portions of the file (e.g., particular regions or zoom levels) without downloading the entire file.
Optimized for Cloud Storage: They are structured to work seamlessly with cloud storage systems, enabling faster access and reducing storage and retrieval costs.
Internal Overviews and Tiling: COGs come with built-in overviews (lower-resolution versions of the dataset) and tiling, improving performance for applications requiring multi-resolution access, such as web mapping.
Compatibility with Web-Based Tools: Many open-source tools and cloud platforms, such as AWS S3 and Google Earth Engine, directly support COGs, facilitating their integration into cloud-based workflows.
Improved Performance for Large Datasets: By accessing only the needed data chunks, COGs reduce network bandwidth requirements and improve responsiveness for large raster datasets.
Interoperability: COGs are compatible with standard GeoTIFF readers while offering enhanced functionality when used with tools that understand their optimized format.
8.5 Geoportals
Interactive exploration of datasets in a browser is an effective way to understand available layers.
Downloading data with code is preferable for reproducibility and efficiency. Command-line techniques can initiate downloads using URLs or APIs.
Code
library(tidyverse)library(gt)geoportals <-tibble(`Portal Name`=c("[Data.gov](https://catalog.data.gov/dataset?metadata_type=geospatial)","[GEOSS Portal](https://www.geoportal.org/)","[Copernicus Data Space Ecosystem](https://dataspace.copernicus.eu/)","[SEDAC](https://sedac.ciesin.columbia.edu/)","[INSPIRE Geoportal](http://inspire-geoportal.ec.europa.eu/)","[EarthExplorer](https://earthexplorer.usgs.gov/)" ),`Description`=c("Offers a vast array of geospatial datasets, including environmental data, climate records, and transportation networks.","Provides access to global Earth observation data, including satellite imagery and in-situ measurements across agriculture, biodiversity, and climate.","Delivers free access to Copernicus Sentinel satellite data, including high-resolution optical and radar imagery for environmental monitoring.","Focuses on integrating socioeconomic and environmental data, offering population distribution, urban extents, and climate impact indicators.","Facilitates access to European spatial data, including administrative boundaries, hydrography, and transportation networks adhering to INSPIRE standards.","Provides access to remote sensing data like Landsat imagery and digital elevation models managed by the U.S. Geological Survey." ),Type =c("Raster & Vector","Raster & Vector","Raster","Vector","Raster & Vector","Raster" ))# Print the tibblegeoportals |>gt() |>fmt_markdown(columns =`Portal Name`) |>cols_label_with(fn = snakecase::to_title_case) |>tab_header(title ="Geospatial Data Portals for Raster and Vector Datasets" ) |>tab_footnote(footnote =md("Data sources: Adapted from the textbook `Geocomputation with R`, *Section 8.5 Geoportals*.") ) |> gtExtras::gt_theme_espn()
Table 6: Some Common Geoportals available freely on the internet.
Geospatial Data Portals for Raster and Vector Datasets
Facilitates access to European spatial data, including administrative boundaries, hydrography, and transportation networks adhering to INSPIRE standards.
Provides access to remote sensing data like Landsat imagery and digital elevation models managed by the U.S. Geological Survey.
Raster
Data sources: Adapted from the textbook Geocomputation with R, Section 8.5 Geoportals.
8.6 Geographic data packages
Code
library(tidyverse)library(gt)geographic_data_packages <-tibble(Package =c("[climateR](https://cran.r-project.org/package=climateR)","[elevatr](https://cran.r-project.org/package=elevatr)","[FedData](https://cran.r-project.org/package=FedData)","[geodata](https://cran.r-project.org/package=geodata)","[osmdata](https://cran.r-project.org/package=osmdata)","[osmextract](https://cran.r-project.org/package=osmextract)","[rnaturalearth](https://cran.r-project.org/package=rnaturalearth)","[rnoaa](https://cran.r-project.org/package=rnoaa)","[giscoR](https://cran.r-project.org/package=giscoR)" ),Description =c("Provides access to over 100,000 gridded climate and landscape datasets from more than 2,000 data providers, allowing users to retrieve data by specifying an area of interest.","Facilitates access to point and raster elevation data from various sources, enabling users to obtain elevation information for specific locations or regions.","Allows users to download datasets maintained by the U.S. federal government, including elevation, land cover, and other geospatial data.","Simplifies the download and import of administrative boundaries, elevation models, WorldClim climate data, and more for various regions worldwide.","Enables downloading and importing of small to medium-sized datasets from OpenStreetMap, suitable for detailed geographic analyses.","Designed for downloading and importing large OpenStreetMap datasets, making it suitable for extensive geographic analyses.","Provides access to Natural Earth vector and raster data, including country boundaries, rivers, and urban areas, useful for creating maps and visualizations.","Interfaces with the National Oceanic and Atmospheric Administration (NOAA) to import climate and weather data, including forecasts, historical data, and severe weather alerts.","Allows users to download and use geospatial data from GISCO (Geographical Information System of the European Commission), including administrative boundaries and other geographic features." ),Important_Functions =c("`getDataAOI()` – Retrieve data for a specified area of interest.","`get_elev_point()` – Get elevation data for specific points.","`get_ned()` – Download National Elevation Dataset data.","`worldclim_global()` – Access global climate data.","`opq()` – Build an Overpass query; `add_osm_feature()` – Specify features to retrieve.","`oe_read()` – Read OpenStreetMap data for a specified region.","`ne_countries()` – Retrieve country boundaries; `ne_download()` – Download Natural Earth data.","`meteo_pull_monitors()` – Retrieve meteorological data from specified monitors.","`gisco_get_countries()` – Retrieve country boundaries; `gisco_get_nuts()` – Access NUTS regions." ))geographic_data_packages |>gt() |>fmt_markdown(columns =c(Package, Important_Functions)) |>cols_label_with(fn = snakecase::to_title_case) |>tab_header(title ="R Packages for Geographic Data" ) |> gtExtras::gt_theme_espn() |>tab_style(style =list(cell_text(font ="monospace") ),locations =cells_body(columns = Package ) )
Table 7: Selected R Packages for Geographic Data Retrieval and Analysis
Provides access to over 100,000 gridded climate and landscape datasets from more than 2,000 data providers, allowing users to retrieve data by specifying an area of interest.
getDataAOI() – Retrieve data for a specified area of interest.
Facilitates access to point and raster elevation data from various sources, enabling users to obtain elevation information for specific locations or regions.
get_elev_point() – Get elevation data for specific points.
Provides access to Natural Earth vector and raster data, including country boundaries, rivers, and urban areas, useful for creating maps and visualizations.
Interfaces with the National Oceanic and Atmospheric Administration (NOAA) to import climate and weather data, including forecasts, historical data, and severe weather alerts.
meteo_pull_monitors() – Retrieve meteorological data from specified monitors.
Allows users to download and use geospatial data from GISCO (Geographical Information System of the European Commission), including administrative boundaries and other geographic features.
gisco_get_countries() – Retrieve country boundaries; gisco_get_nuts() – Access NUTS regions.