Visualizing and Analyzing Temporal Data

Fundamentals of Data Science

MTH-391A | Spring 2025 | University of Portland

March 14, 2025

Objectives

Previously… (1/2)

Date/Time Ambiguity and Formats

Example Dates

# year-month-day
ymd("2017-01-31")
## [1] "2017-01-31"
# month-day-year
mdy("January 31st, 2017")
## [1] "2017-01-31"
# day-month-year
dmy("31-Jan-2017")
## [1] "2017-01-31"

Converting String Dates into date/time format
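
As a brief refresher, the sketch below extends the parsers above to strings that also carry a time of day, and to a whole character vector. The input strings and the `raw` and `parsed` names are made up for illustration.

```r
library(lubridate)

# Date plus time of day: append _h, _hm, or _hms to the parser name
ymd_hms("2017-01-31 20:11:59")
mdy_hm("01/31/2017 08:01")

# Parsers are vectorized; entries that cannot be parsed become NA (with a warning)
raw <- c("2013-01-15", "2023-07-04", "not a date")
parsed <- ymd(raw)
parsed

# Supply tz to attach a time zone other than the default UTC
ymd_hm("2013-11-07 06:00", tz = "America/New_York")
```

The parser name only needs to match the order of the components in the string; separators and month spellings are detected automatically.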

Previously… (2/2)

Time Granularity and Time-Zones
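
A minimal sketch of both ideas, assuming a single made-up departure time (`dep`) in New York: `floor_date()` coarsens the granularity, while `with_tz()` and `force_tz()` handle time zones in two different ways.

```r
library(lubridate)

# Hypothetical departure time, used only for illustration
dep <- ymd_hms("2013-11-07 06:45:00", tz = "America/New_York")

# Coarsen granularity: round down to the start of the hour, day, or month
floor_date(dep, "hour")
floor_date(dep, "day")
floor_date(dep, "month")

# Extract single components
year(dep); month(dep); hour(dep)

# Same instant, displayed on a different clock
dep_utc <- with_tz(dep, "UTC")

# Same clock reading, reinterpreted as a different instant
dep_pdx <- force_tz(dep, "America/Los_Angeles")
```

Use `with_tz()` when the stored instant is correct and only the display should change; use `force_tz()` when the time zone label itself was wrong.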

Case Study I

New York City Flights

Load Packages

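library(tidyverse)     # dplyr, ggplot2, lubridate (attached by tidyverse >= 2.0.0)
library(nycflights13)  # flights departing NYC in 2013
library(nycflights23)  # flights departing NYC in 2023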

Data Frames

# bind the two data frames
flights <- nycflights13::flights %>% 
  rbind(nycflights23::flights)

# show size
dim(flights)
## [1] 772128     19

Variables of Interest

Each row in the flights tibble is a unique flight from NYC.
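
The sample on this slide is drawn from `flights_sub`, whose definition is not shown in these notes. One plausible construction, assuming that `date_time` is simply the `time_hour` column of `flights` renamed and that the remaining columns are kept unchanged:

```r
library(dplyr)
library(nycflights13)
library(nycflights23)

# Combined 2013 and 2023 flights, as on the previous slide
flights <- rbind(nycflights13::flights, nycflights23::flights)

# Assumed construction: keep the nine columns shown in the sample and
# rename time_hour (scheduled departure date and hour) to date_time
flights_sub <- flights |>
  select(date_time = time_hour, carrier, origin, dest,
         dep_delay, year, month, day, hour)
```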

# view random sample
flights_sub |> 
  sample_n(3)
## # A tibble: 3 × 9
##   date_time           carrier origin dest  dep_delay  year month   day  hour
##   <dttm>              <chr>   <chr>  <chr>     <dbl> <int> <int> <int> <dbl>
## 1 2013-11-07 06:00:00 WN      LGA    ATL           0  2013    11     7     6
## 2 2013-10-30 12:00:00 AA      JFK    BOS           2  2013    10    30    12
## 3 2023-08-24 21:00:00 9E      LGA    PWM          34  2023     8    24    21

How to Get Started on Exploring Temporal Data

The main goal of time-series analysis is to understand how a variable, or a set of variables, evolves over time. This typically involves identifying patterns and trends.

How many flights occurred in 2013 and 2023?
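
One way to answer this, sketched with `dplyr::count()` on the combined `flights` tibble (the `per_year` name is introduced here for illustration):

```r
library(dplyr)
library(nycflights13)
library(nycflights23)

flights <- rbind(nycflights13::flights, nycflights23::flights)

# One row per year, with the number of flights in column n
per_year <- flights |>
  count(year)

per_year
```

The two counts should add up to the 772128 rows reported by `dim(flights)`.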

Increase Granularity

What is the number of flights per month?
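
A possible sketch: adding `month` to the grouping refines the yearly count to one row per year-month pair (`per_month` is an illustrative name).

```r
library(dplyr)
library(nycflights13)
library(nycflights23)

flights <- rbind(nycflights13::flights, nycflights23::flights)

# One row per (year, month), with the number of flights in column n
per_month <- flights |>
  count(year, month)

per_month
```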

Incorporate Other Variables

What is the median departure delay per month at each NYC airport?
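
A sketch of one approach, grouping by `origin` in addition to year and month. Flights without a recorded `dep_delay` (for example, cancelled flights) are assumed to be ignored via `na.rm = TRUE`; `monthly_delay` is an illustrative name.

```r
library(dplyr)
library(nycflights13)
library(nycflights23)

flights <- rbind(nycflights13::flights, nycflights23::flights)

# Median departure delay (minutes) per year, month, and origin airport
monthly_delay <- flights |>
  group_by(year, month, origin) |>
  summarise(median_delay = median(dep_delay, na.rm = TRUE), .groups = "drop")

monthly_delay
```

The median is less sensitive than the mean to a handful of very long delays, which is one reason to prefer it here.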

More Precision

What is the median daily delay during the summer months?
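
A sketch under the assumption that "summer months" means June through August; adjust the filter if the course uses a different definition. `summer_daily` is an illustrative name.

```r
library(dplyr)
library(lubridate)
library(nycflights13)
library(nycflights23)

flights <- rbind(nycflights13::flights, nycflights23::flights)

# Median departure delay (minutes) for each calendar day in June-August
summer_daily <- flights |>
  filter(month %in% 6:8) |>                      # assumed definition of summer
  mutate(date = make_date(year, month, day)) |>  # daily granularity
  group_by(date) |>
  summarise(median_delay = median(dep_delay, na.rm = TRUE), .groups = "drop")

summer_daily
```

Building a proper `Date` column, rather than grouping on `month` and `day` separately, keeps the 2013 and 2023 summers distinct and gives plotting functions a real time axis.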

A Strategy for Visualizing Time-Series Data

Effective Methods for Analyzing Trends Over Time

  1. Define the Objective
  2. Choose the Right Visualization Type
  3. Data Preprocessing
  4. Enhance Readability
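
The four steps above can be sketched on the 2013 flights as follows. The objective, the choice of a line chart, and the labels are assumptions made for illustration, not the only reasonable choices; `monthly` and `p` are illustrative names.

```r
library(dplyr)
library(ggplot2)
library(lubridate)
library(nycflights13)

# Objective (assumed): how does the typical departure delay at each
# NYC airport change from month to month in 2013?

# Preprocessing: collapse individual flights to one value per airport and month
monthly <- nycflights13::flights |>
  mutate(month_start = make_date(year, month, 1)) |>
  group_by(origin, month_start) |>
  summarise(median_delay = median(dep_delay, na.rm = TRUE), .groups = "drop")

# Visualization type: a line chart suits a continuous time axis
p <- ggplot(monthly, aes(x = month_start, y = median_delay, color = origin)) +
  geom_line() +
  geom_point() +
  # Readability: legible date axis, informative labels, light theme
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  labs(
    title = "Median departure delay by month, NYC airports, 2013",
    x = NULL,
    y = "Median departure delay (minutes)",
    color = "Origin"
  ) +
  theme_minimal()

p
```

In practice the preprocessing step usually has to be written before the plotting code, even though choosing the chart type comes first when planning.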

Activity: Visualize Time-Series

  1. Log in to Posit Cloud and open the RStudio assignment MA15: Visualize Time-Series.
  2. Make sure you are in the current working directory. Rename the .Rmd file by replacing [name] with your name using the format [First name][Last initial]. Then, open the .Rmd file.
  3. Change the author in the YAML header.
  4. Read the provided instructions.
  5. Answer all exercise problems in the designated sections.