ch_21_solutions

Prerequisites:


library(DBI)
library(tidyverse)
con <- DBI::dbConnect(duckdb::duckdb())
dbplyr::copy_nycflights13(con)
flights <- tbl(con, "flights")

21.5.10 Exercises:

  1. The equivalent of R’s distinct function is the word distinct in the select statement. The equivalent of the headfunction for duckdb is LIMIT, but the syntax varies depending on the SQL version used.

        flights |> 
          distinct(tailnum) |> 
          show_query()
    ## <SQL>
    ## SELECT DISTINCT tailnum
    ## FROM flights
    
        flights |> 
          head(5) |> 
          show_query()
    ## <SQL>
    ## SELECT flights.*
    ## FROM flights
    ## LIMIT 5
  2. The first query filters for rows where dep_delay is less than arr_delay, this can be thought of as flights that gained time after departing. The second query creates a column for miles per hour (since air_time is in minutes we need to divide by 60).

    flights |> 
      filter(dep_delay < arr_delay) |> 
      show_query()
    ## <SQL>
    ## SELECT flights.*
    ## FROM flights
    ## WHERE (dep_delay < arr_delay)
    
    flights |> 
      mutate(speed = distance / (air_time / 60)) |> 
      show_query()
    ## <SQL>
    ## SELECT flights.*, distance / (air_time / 60.0) AS speed
    ## FROM flights