Using the Power of rlang: A Step-by-Step Guide to Parsing Expressions with dplyr's case_when Function

Understanding the case_when Function in dplyr and rlang

Introduction

The case_when function from dplyr is a vectorised way to express multi-branch conditional logic in R. It lets you pair several conditions with the value to return when each one holds. In this article, we explore how to combine case_when with the rlang package so that rules stored as character strings can be parsed and applied.

Background on case_when

The case_when function is part of the dplyr package, which provides data manipulation functions for R. It takes a series of two-sided formulas of the form condition ~ value, checks the conditions in order and, for each element of the input, returns the value paired with the first condition that is true.

For example:

# Conditions are checked in order; TRUE acts as the catch-all
case_when(
  x > 10 ~ "x is greater than 10",
  y == 5 ~ "y is equal to 5",
  TRUE ~ "unknown"
)

For each element of x and y, this returns one of three values: “x is greater than 10”, “y is equal to 5”, or “unknown”.
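To make the element-wise behaviour concrete, here is a minimal sketch. The vectors x and y hold made-up values chosen for illustration; they are not taken from the original question.

```r
library(dplyr)

# Illustrative inputs (assumed values, not from the original question)
x <- c(12, 3, 7)
y <- c(1, 5, 2)

# Conditions are checked in order for each element; TRUE is the fallback
labels <- case_when(
  x > 10 ~ "x is greater than 10",
  y == 5 ~ "y is equal to 5",
  TRUE ~ "unknown"
)

print(labels)
```

The first element matches the first rule, the second element matches the second rule, and the third falls through to the catch-all.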

Working with rlang and parse_exprs

The rlang package provides a range of functions for working with R expressions. One such function is parse_exprs, which takes a character vector of R code and returns a list of unevaluated expressions. Those expressions can then be spliced into a case_when call.
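As a rough sketch of what parse_exprs does, the snippet below parses two rule strings and inspects the result. The rule strings are illustrative; note that Age does not need to exist, because nothing is evaluated at this stage.

```r
library(rlang)

# Two rule strings (illustrative)
rules <- c("Age > 40 ~ 1", "TRUE ~ 0")

# Parse the strings into unevaluated expressions
parsed <- parse_exprs(rules)

length(parsed)        # number of parsed expressions
is.call(parsed[[1]])  # each element is an unevaluated call
```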

The motivating Stack Overflow question involves a Shiny application that shows a preview of a selection policy expressed as a set of rules. The rules arrive as a string written in case_when syntax, and the user wants to substitute that string into a case_when call.

Using parse_exprs to Parse Expressions

One way to achieve this is with the parse_exprs function from the rlang package. Given a character vector with one rule per element, it returns a list with one unevaluated expression per rule.

Here’s an example:

# Load the necessary libraries
library(dplyr)
library(rlang)

# Define the expression as a character vector
cond <- "Age > 40 ~ 1, TRUE ~ 0"

# Split the rules on commas (the input is comma-separated)
cond_vec <- strsplit(cond, ",")[[1]]

# Parse each rule into an unevaluated expression
exprs <- rlang::parse_exprs(cond_vec)

This returns a list of two unevaluated two-sided formulas: Age > 40 ~ 1 and TRUE ~ 0.
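If it helps to see what a parsed rule looks like, the following sketch pulls one apart. It assumes the two rules have already been separated into individual strings; indexing a call with [[2]] and [[3]] is base R behaviour for accessing the arguments of the ~ call.

```r
library(rlang)

exprs <- parse_exprs(c("Age > 40 ~ 1", "TRUE ~ 0"))

first_rule <- exprs[[1]]

is_call(first_rule, "~")  # the rule is a call to `~`
first_rule[[2]]           # left-hand side: the condition
first_rule[[3]]           # right-hand side: the value to return
```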

Using cond to Define Multiple Conditions

The rule string in the question is defined as “cond = ‘Age > 40 ~ 1, TRUE ~ 0’”. The rules are separated by commas, so the string has to be split on commas to obtain the individual conditions. It is also worth trimming surrounding whitespace first.

Here’s how you can do it:

# Load the necessary libraries
library(dplyr)
library(rlang)

# Define the expression as a character vector
cond <- "Age > 40 ~ 1, TRUE ~ 0"

# Remove any leading or trailing spaces from the string
cond <- trimws(cond)

# Split the rules on commas
cond_vec <- strsplit(cond, ",")[[1]]

# Parse each rule into an unevaluated expression
exprs <- rlang::parse_exprs(cond_vec)

As before, this returns a list of two unevaluated two-sided formulas: Age > 40 ~ 1 and TRUE ~ 0.
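A comma-separated rule string is awkward to split when a condition itself contains a comma, as in c(50, 51). One possible alternative, assuming you control the input format, is to separate the rules with semicolons: parse_exprs treats semicolons like line breaks, so no manual splitting is needed. The rule string below is a hypothetical variant of the one in the question.

```r
library(rlang)

# Hypothetical rule string using semicolons as separators
cond_semi <- "Age %in% c(50, 51) ~ 1; TRUE ~ 0"

# parse_exprs splits on semicolons (and newlines) by itself
exprs_semi <- parse_exprs(cond_semi)

length(exprs_semi)    # one expression per rule
exprs_semi[[1]][[2]]  # the comma inside c(50, 51) is left intact
```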

Putting it All Together

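Now that we have the individual rules as expressions, we can splice them into the case_when call with the !!! operator. Each parsed rule is already a complete condition ~ value formula, so it is passed as-is rather than wrapped in another formula.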

Here’s an example:

# Load the necessary libraries
library(dplyr)
library(rlang)

# Define the data
repdata <- tibble::tribble(
  ~Age, 23, 26, 32, 50, 51, 52, 25, 49, 34, 54
)

# Define the expression as a character vector
cond <- "Age > 40 ~ 1, TRUE ~ 0"

# Remove any leading or trailing spaces from the string
cond <- trimws(cond)

# Split the rules on commas
cond_vec <- strsplit(cond, ",")[[1]]

# Parse each rule into an unevaluated two-sided formula
exprs <- rlang::parse_exprs(cond_vec)

# Splice the parsed rules into case_when with !!!
result <- repdata %>%
  mutate(result = case_when(!!!exprs))

This returns a tibble with an additional column “result” that is 1 where Age is greater than 40 and 0 otherwise.
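In a Shiny setting the rule string comes from user input and may be malformed. The sketch below wraps the steps in a helper that returns NULL instead of stopping with an error. The name apply_rules and the choice to return NULL are assumptions made for illustration, not part of the original question. Because the rules are evaluated as R code, this approach is only suitable for trusted input.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: apply a comma-separated rule string to a data frame.
# Returns NULL when the rules cannot be parsed or evaluated.
apply_rules <- function(data, rule_string) {
  tryCatch(
    {
      rules <- parse_exprs(strsplit(trimws(rule_string), ",")[[1]])
      mutate(data, result = case_when(!!!rules))
    },
    error = function(e) NULL
  )
}

# Small illustrative data set
repdata <- tibble::tibble(Age = c(23, 50, 34, 54))

ok  <- apply_rules(repdata, "Age > 40 ~ 1, TRUE ~ 0")  # adds the result column
bad <- apply_rules(repdata, "Age > > 40 ~ 1")          # malformed rule: NULL
```

A calling app could check for NULL and show a validation message rather than a preview.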

Conclusion

In this article, we explored how to combine dplyr's case_when with rlang to apply rules stored as character strings. We split the comma-separated rule string into individual rules, used parse_exprs to turn each rule into an unevaluated expression, and spliced the result into case_when with !!!.

We also discussed trimming leading and trailing whitespace from the input string before splitting it.


Last modified on 2025-04-14