library(DBI)
library(dbplyr)
library(tidyverse)
Chapter 22
Databases
# Reading in the data as a database
<- DBI::dbConnect(duckdb::duckdb())
connection dbWriteTable(connection, name = "db_diamonds", value = ggplot2::diamonds)
<- tbl(connection, "db_diamonds") db_df
22.5.10 Exercises
Question 1
What is distinct()
translated to? How about head()
?
As we can see below, the dplyr
function distinct()
translates into SELECT DISTINCT <variable name>
.
Similarly, the function head()
translates to SELECT * FROM <table name> LIMIT 6
.
|>
db_df distinct(cut) |>
show_query()
<SQL>
SELECT DISTINCT cut
FROM db_diamonds
|>
db_df head() |>
show_query()
<SQL>
SELECT *
FROM db_diamonds
LIMIT 6
Question 2
Explain what each of the following SQL queries do and try recreate them using dbplyr.
SELECT *
FROM flights
WHERE dep_delay < arr_delay
The code above is equivalent to dplyr
’s flights |> filter(dep_delay < arr_delay)
.
SELECT *, distance / (airtime / 60) AS speed
FROM flights
The code above is equivalent to dplyr
’s flights |> mutate(speed = distance / (airtime / 60))
.