Display Formatted Values as Numeric in Y-Axis of ggplot2
In this article, we will explore how to abbreviate thousands with a k suffix (for example, showing 1000 as 1k) in the y-axis labels of a ggplot2 plot while keeping the underlying values numeric.
Introduction
ggplot2 is a powerful data visualization library for R. It provides a simple and efficient way to create high-quality visualizations. One of its strengths is its ability to customize the appearance of plots, including the formatting of axis labels. In this article, we will delve into how to format values in the y-axis of a ggplot2 plot.
Background
The ggplot2 package is based on the grammar of graphics: a plot is assembled from components such as data, aesthetic mappings, geoms, and scales rather than chosen from a list of predefined chart types. The scale_y_continuous function customizes a continuous y-axis, including its breaks, limits, and labels.
Problem Statement
Suppose we have a dataset whose values run into the thousands. We want the y-axis to remain a continuous numeric scale, but instead of printing the tick labels in full, we want to abbreviate them according to a specific pattern (e.g., 1234 becomes 1.2k).
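One reason to keep the values numeric is that abbreviated strings no longer order by magnitude. The following base R sketch illustrates this with made-up values; the variable names are hypothetical, and method = "radix" is used only so that the ordering does not depend on the locale.

```r
# Values already converted to text, as they would be if the data
# column itself were formatted instead of the axis labels.
as_text <- c("1k", "2k", "10k")

# Text sorts character by character, so "10k" lands before "1k".
sorted <- sort(as_text, method = "radix")
print(sorted)
# [1] "10k" "1k"  "2k"
```

A discrete axis built from such strings would place 10k before 1k, which is why the approach here formats only the labels and leaves the data untouched.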
Solution
Instead of formatting the values directly, we can use the labels argument of the scale_y_continuous function to format the labels in the y-axis.
library(plotly)  # also attaches ggplot2; provides ggplotly()
# Dummy data
data <- data.frame(
  day = as.Date("2017-06-14") - 0:364,
  value = runif(365) + seq(-140, 224)^2 / 10
)
p <- ggplot(data, aes(x = day, y = value)) +
  geom_line() +
  # Abbreviate the tick labels, keeping one decimal place
  scale_y_continuous(labels = scales::label_number_si(accuracy = 0.1)) +
  xlab("")  # remove the x-axis title
ggplotly(p)
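In the code above, scales::label_number_si builds the labelling function for the y-axis. The accuracy argument sets the number to round to, so accuracy = 0.1 keeps one decimal place. Note that this function writes the thousands suffix as an uppercase K, and that it is deprecated as of scales 1.2.0 in favor of label_number with scale_cut = cut_short_scale().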
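To preview what a labeller from the scales package returns, it can be called directly on a numeric vector without drawing a plot. The following is a minimal sketch, assuming scales 1.2.0 or later is installed; it uses label_number with scale_cut = cut_short_scale(), which the scales documentation names as the replacement for the deprecated label_number_si. The input values are made up for illustration.

```r
library(scales)

# Build the labelling function once; it can then be passed to
# scale_y_continuous(labels = fmt) or called directly.
fmt <- label_number(accuracy = 0.1, scale_cut = cut_short_scale())

# Apply it to a few candidate axis breaks and inspect the strings.
out <- fmt(c(500, 1234, 20000))
print(out)
```

Checking the strings this way makes it easier to tune accuracy before rendering the full plot.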
Customizing the Formatting Pattern
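We can customize the formatting pattern by passing a custom function to the labels argument. For example, if we want to display values of 1000 or more in thousands with a k suffix and smaller values as plain numbers:
library(plotly)  # also attaches ggplot2; provides ggplotly()
# Dummy data
data <- data.frame(
  day = as.Date("2017-06-14") - 0:364,
  value = runif(365) + seq(-140, 224)^2 / 10
)
p <- ggplot(data, aes(x = day, y = value)) +
  geom_line() +
  # paste0 avoids the space that paste would put before "k"
  scale_y_continuous(labels = function(x) ifelse(x >= 1000, paste0(round(x / 1000), "k"), round(x))) +
  xlab("")  # remove the x-axis title
ggplotly(p)
In the code above, ifelse checks each break value against 1000. Values of 1000 or more are divided by 1000, rounded, and given a k suffix; smaller values are shown as rounded numbers. Because the thousands are rounded to whole numbers, a break such as 1500 would lose its decimal part, so this pattern suits breaks that fall on multiples of 1000.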
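A labelling function can be checked outside of ggplot2 by calling it on a vector of candidate break values. The following is a minimal sketch in base R; the name label_k is hypothetical, and it uses >= and paste0 so that 1000 itself is abbreviated and no space precedes the suffix.

```r
# Hypothetical helper: thousands become "<n>k", smaller values stay as-is.
label_k <- function(x) {
  ifelse(x >= 1000, paste0(round(x / 1000), "k"), as.character(round(x)))
}

labels_out <- label_k(c(250, 999, 1000, 5300))
print(labels_out)
# [1] "250" "999" "1k"  "5k"
```

A named function like this can be passed as scale_y_continuous(labels = label_k), which keeps the plotting code short and lets the rule be tested on its own.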
Multiple Formatting Rules
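We can apply multiple formatting rules by combining them in a single function passed to the labels argument, for example with nested ifelse calls:
library(plotly)  # also attaches ggplot2; provides ggplotly()
# Dummy data
data <- data.frame(
  day = as.Date("2017-06-14") - 0:364,
  value = runif(365) + seq(-140, 224)^2 / 10
)
p <- ggplot(data, aes(x = day, y = value)) +
  geom_line() +
  scale_y_continuous(labels = function(x) {
    # Rule 1: 1000 or more -> thousands with "k"
    # Rule 2: below 100 -> tens with "x"; otherwise a rounded number
    ifelse(x >= 1000, paste0(round(x / 1000), "k"),
           ifelse(x < 100, paste0(round(x / 10), "x"), round(x)))
  }) +
  xlab("")  # remove the x-axis title
ggplotly(p)
In the code above, a single function applies the rules in order: values of 1000 or more are shown in thousands with a k suffix, values below 100 are divided by 10 and given an x suffix, and everything else is shown as a rounded number. The labels argument accepts one function, not a list of functions, so the rules have to be combined in this way.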
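When several rules are involved, writing them as separate assignments inside one function can be easier to read than nested conditions. The following is a minimal base R sketch of that idea; the name label_by_threshold is hypothetical, and the explicit handling of NA is an assumption made because a labelling function may be handed missing break values.

```r
# Hypothetical helper: one function, several rules, NA-safe.
label_by_threshold <- function(x) {
  out <- as.character(round(x))   # default rule: rounded number
  big <- !is.na(x) & x >= 1000    # rule 1: thousands with "k"
  small <- !is.na(x) & x < 100    # rule 2: tens with "x"
  out[big] <- paste0(round(x[big] / 1000), "k")
  out[small] <- paste0(round(x[small] / 10), "x")
  out                             # NA inputs stay NA
}

rules_out <- label_by_threshold(c(50, 500, 2000, NA))
print(rules_out)
# [1] "5x"  "500" "2k"  NA
```

Each rule occupies one line, so adding a threshold (for example, millions) means adding one more logical index and one more assignment.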
Conclusion
In this article, we explored how to display abbreviated labels on a numeric y-axis in ggplot2. We covered the labels argument of the scale_y_continuous function, a ready-made labeller from the scales package, and custom labelling functions that combine several formatting rules. By applying these techniques, you can create high-quality visualizations that effectively communicate your data insights.
Last modified on 2023-10-22