Visualizing Temperature Trends Over Time with ggplot2: A Step-by-Step Guide

Understanding Time Series Data and Plotting with ggplot2

Introduction

Time series data is a collection of observations recorded in time order, usually at regular intervals. In this article, we’ll explore how to plot a graph comparing temperature trends over time using the ggplot2 package in R.

What is Time Series Data?

A time series dataset typically consists of multiple variables, such as temperature, precipitation, or stock prices, recorded at different times. Each observation is associated with a specific date and time.

For example, let’s consider a temperature dataset like the one provided:

DATE        N   TEMP       min   max   YEAR
2012-09-01  24  16.116667  15.9  16.4  2012
2012-09-02  24  16.433333  16.3  16.8  2012

Each row represents a single observation, with the date, number of observations (N), temperature value, minimum and maximum temperatures, and year.
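As a minimal sketch of how rows like these could be loaded into R, assuming the columns are separated by whitespace (the names raw and obs are illustrative, not part of the original data source):

```r
# The two example rows from the table above, as whitespace-separated text
raw <- "DATE N TEMP min max YEAR
2012-09-01 24 16.116667 15.9 16.4 2012
2012-09-02 24 16.433333 16.3 16.8 2012"

# Parse the text into a data frame; the first line supplies the column names
obs <- read.table(text = raw, header = TRUE, stringsAsFactors = FALSE)

# Dates arrive as character strings, so convert them to the Date class
obs$DATE <- as.Date(obs$DATE)

# Inspect the column types
str(obs)
```

A quick sanity check on data like this is that every TEMP value lies between its min and max.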

Understanding the Problem

The question asks us to plot a graph comparing the temperature trend over time, specifically looking at how much the temperature differs from one year to the next. To make the years comparable, the x-axis should run over the months only (January to December) rather than over the full dates, so that every year shares the same axis.
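One common way to put several years on a shared January-to-December axis, different from the month-averaging approach this article uses, is to re-stamp every date into a single reference year. The sketch below assumes daily dates in a Date column; the data frame temps, the column COMMON_DATE, the reference year 2000, and the temperature values are all made up for illustration.

```r
library(ggplot2)

# Illustrative values only, not real measurements
temps <- data.frame(
  DATE = as.Date(c("2012-01-15", "2012-07-15", "2013-01-15", "2013-07-15")),
  TEMP = c(4.2, 18.9, 3.1, 19.6)
)

# Keep the real year as a label for grouping and colour
temps$YEAR <- format(temps$DATE, "%Y")

# Move every observation into the same reference year (2000 is a leap year,
# so a 29 February in the data would still be a valid date)
temps$COMMON_DATE <- as.Date(paste0("2000-", format(temps$DATE, "%m-%d")))

# One line per year, with the axis labelled by abbreviated month name only
p <- ggplot(temps, aes(x = COMMON_DATE, y = TEMP, colour = YEAR, group = YEAR)) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b")
print(p)
```

This keeps the daily resolution of the data, whereas averaging by month trades that detail for a smoother comparison.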

Setting Up ggplot2

To solve this problem, we’ll use the ggplot2 package in R. First, let’s load the necessary libraries and create a sample dataset:

# Load required libraries
library(ggplot2)

# Create a small sample dataset (substitute your own data in practice)
DailyTemp <- data.frame(
  DATE = c("2012-09-01", "2012-09-02", "2012-09-03", "2012-09-04"),
  N = c(24, 24, 24, 24),
  TEMP = c(16.116667, 16.433333, 16.300000, 16.508333),
  min = c(15.9, 16.3, 16.2, 16.3),
  max = c(16.4, 16.8, 16.5, 16.8),
  YEAR = c("2012", "2012", "2012", "2012")
)

# Convert the 'DATE' column from character strings to R's Date class
DailyTemp$DATE <- as.Date(DailyTemp$DATE)
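as.Date() parses ISO-style strings (YYYY-MM-DD) without further help, but other layouts need an explicit format. A brief sketch, assuming a hypothetical day/month/year string of the kind some data exports use:

```r
# ISO-style strings are parsed directly
iso <- as.Date("2012-09-01")

# Other layouts need a format string describing the order of the parts
dmy <- as.Date("01/09/2012", format = "%d/%m/%Y")

class(iso)   # "Date"
iso == dmy   # TRUE: both strings describe the same day

# Date objects support arithmetic, which plain strings do not
as.numeric(as.Date("2012-09-02") - iso)   # 1 (day)

# Individual parts can be pulled out again with format()
format(iso, "%m")   # "09"
```

Converting early matters because month and year extraction and date axes all rely on the Date class rather than on character strings.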

Modifying the X-Axis

To show only months on the x-axis, we’ll extract the month from each date with the month() function from the lubridate package and plot that instead of the full date. We’ll also create a new column for the year value, using lubridate’s year(), so that each year can be drawn separately. The pipe (%>%), group_by(), and summarise() come from the dplyr package, so it needs to be loaded as well.

# Load required libraries (if not already loaded)
library(dplyr)
library(lubridate)

# Create a new column 'YEAR_VALUE' with the year of each observation
DailyTemp$YEAR_VALUE <- year(DailyTemp$DATE)

# Average the temperature per year and month
# (grouping by year as well as month keeps the years separate)
monthlyData <- DailyTemp %>%
  group_by(YEAR_VALUE, MONTH = month(DATE)) %>%
  summarise(AVG_TEMP = mean(TEMP), .groups = "drop")

# Plot the temperature trend using ggplot2
ggplot(monthlyData, aes(x = MONTH, y = AVG_TEMP, group = YEAR_VALUE, colour = factor(YEAR_VALUE))) +
  geom_line() +
  geom_point() +  # keeps a year visible when it has only one month of data
  scale_x_continuous(breaks = 1:12, labels = month.abb, limits = c(1, 12)) +
  labs(colour = "Year") +
  facet_grid(YEAR_VALUE ~ .)

How It Works

Here’s a step-by-step explanation of how the modified code works:

  1. We first load the necessary libraries: ggplot2, dplyr, and lubridate.
  2. We create a sample dataset DailyTemp with the given data.
  3. We convert the ‘DATE’ column from character strings to R’s Date class using the as.Date() function.
  4. We create a new column 'YEAR_VALUE' holding the year of each observation, extracted from the date with the year() function from lubridate.
  5. We group the data by year and month and calculate the average temperature for each group using group_by(), summarise(), and mean().
  6. We plot the temperature trend using ggplot2, mapping the month number to the x-axis (labelled Jan to Dec through scale_x_continuous()), the average temperature to the y-axis, and the year to both the grouping and the colour.
  7. Finally, we use the facet_grid() function to draw one panel per year value.
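Grouping by year as well as month matters once the data spans more than one year, because grouping by month alone would merge, say, every September into a single average. The sketch below shows the same aggregation in base R with aggregate(), as an alternative to the dplyr pipeline; the data frame temps and its values are made up for illustration.

```r
# Illustrative values only, spanning two years
temps <- data.frame(
  DATE = as.Date(c("2012-09-01", "2012-09-02", "2012-10-01",
                   "2013-09-01", "2013-09-02", "2013-10-01")),
  TEMP = c(16, 18, 12, 15, 17, 11)
)

# Derive integer year and month columns from the dates
temps$YEAR  <- as.integer(format(temps$DATE, "%Y"))
temps$MONTH <- as.integer(format(temps$DATE, "%m"))

# One mean temperature per (year, month) combination
monthly <- aggregate(TEMP ~ YEAR + MONTH, data = temps, FUN = mean)
monthly <- monthly[order(monthly$YEAR, monthly$MONTH), ]
print(monthly)
```

Each year keeps its own row per month here, which is what lets a plot draw one line per year.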

Output

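The resulting plot will show the average monthly temperature, with one coloured line per year and each year drawn in its own panel. The x-axis will only show months (January to December), making it easier to compare temperature trends between years. Note that the four-row sample above covers only September 2012, so it yields a single monthly average; a line only appears once a year has data for several months.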
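Month numbers (1 to 12) can be turned into month names in more than one way. A small sketch of one option, assuming integer month values: convert them to a factor whose levels follow the calendar, so that a plot orders them January to December rather than alphabetically. The vector month_numbers is made up for illustration; month.abb is a constant built into R.

```r
# Hypothetical month numbers, deliberately out of calendar order
month_numbers <- c(9, 10, 1, 12)

# Look up the abbreviated names and fix the level order to the calendar
month_labels <- factor(month.abb[month_numbers], levels = month.abb)

as.character(month_labels)        # "Sep" "Oct" "Jan" "Dec"
as.character(sort(month_labels))  # "Jan" "Sep" "Oct" "Dec"
```

With a factor on the x-axis, geom_line() needs an explicit group aesthetic (such as the year) to know which points to connect.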

By following these steps and using ggplot2, we can effectively modify the x-axis to display only months while still showing the overall temperature trend over time.


Last modified on 2025-02-08