Subset a Vector of Lists in R
Introduction
In this article, we will explore how to subset a vector of lists in R. This involves understanding the data types and structures involved in R and using the relevant functions to achieve our desired outcome.
What are Vectors and Lists?
R has two basic one-dimensional data structures: atomic vectors and lists. An atomic vector stores a collection of values of the same type, whereas a list can store a mixture of different data types, including other vectors and lists.
The key difference is how they treat their elements. Atomic vectors are homogeneous, so values of different types are coerced to a single common type when combined. Lists are heterogeneous, so each element keeps its own type and length.
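The difference can be seen in a minimal base R sketch. The values and the names v and l are made up for illustration only.

```r
# Atomic vector: mixed inputs are coerced to a single common type
v <- c(1, "a", TRUE)
class(v)          # "character"

# List: each element keeps its own type
l <- list(1, "a", TRUE)
sapply(l, class)  # "numeric" "character" "logical"

# A list is itself a vector, just not an atomic one
is.vector(l)      # TRUE
is.atomic(l)      # FALSE
```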
Understanding the tribble Function
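The tribble function, from the tibble package that is loaded with the tidyverse, is a convenient way to create a small data frame row by row. It returns a tibble (a type of data frame) that can be used as input for various data manipulation functions. When a cell holds a value that is not a scalar, such as a vector of several numbers, the column becomes a list column.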
library(tidyverse)
# Create a tibble with a list column x using tribble
d <- tribble(
  ~x,
  c(10, 20, 64),
  c(22, 11),
  c(5, 9, 99),
  c(55, 67),
  c(76, 65)
)
# Print the resulting tibble
print(d)
Output:
x |
---|
(10, 20, 64) |
(22, 11) |
(5, 9, 99) |
(55, 67) |
(76, 65) |
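For comparison, the same list column can be built with tibble() and an explicit list(), then inspected. This is only a sketch: the name d2 is introduced here for illustration and is not part of the original example.

```r
library(tibble)

# Same values as above, built with tibble() and an explicit list()
d2 <- tibble(
  x = list(c(10, 20, 64), c(22, 11), c(5, 9, 99), c(55, 67), c(76, 65))
)

is.list(d2$x)   # TRUE: the column is a list with one element per row
d2$x[[1]]       # the numeric vector stored in the first row
lengths(d2$x)   # one length per row
```

Double brackets ([[1]]) pull out the vector itself, whereas single brackets ([1]) would return a list of length one.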
Subset a Vector of Lists
The goal is to keep only the rows whose list element has a length greater than 2. We will use the lengths function, which is part of base R, together with filter from dplyr.
Using lengths
The column x in our tibble is a list column, so lengths(x) returns one length per row. We can then compare each of those lengths with a threshold value (in this case, 2).
library(dplyr)
# Keep only the rows whose element of x has more than 2 values
subset_d <- d %>%
  filter(lengths(x) > 2)
print(subset_d)
Output:
x |
---|
(10, 20, 64) |
(5, 9, 99) |
In this example, we have kept only those rows where the element stored in x has a length greater than 2.
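The same idea works without dplyr. The sketch below applies it to a plain list in base R. The names x, keep, and x_long are illustrative and the values simply mirror the column above.

```r
# A plain list holding the same values as the list column
x <- list(c(10, 20, 64), c(22, 11), c(5, 9, 99), c(55, 67), c(76, 65))

# Logical index: TRUE where the element has more than 2 values
keep <- lengths(x) > 2

# Single-bracket subsetting with the logical index returns a shorter list
x_long <- x[keep]
length(x_long)  # 2
```

The same logical index can also be used for row subsetting on the tibble, for example d[lengths(d$x) > 2, ], assuming d is the tibble created earlier.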
Understanding Why lengths Works
The key to understanding why lengths works lies in how it operates on lists. When you apply lengths to a list in R, it returns an integer vector containing the length of each element in the original list.
For example:
# Create a list with multiple elements
list_of_elements <- list(1:3, 4:6)
# Apply lengths function
lengths_list <- lengths(list_of_elements)
print(lengths_list)
Output:
[1] 3 3
As you can see, lengths returns a vector where each value is the length of the corresponding element in the original input.
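It is easy to confuse lengths with length. The following base R sketch contrasts the two on a hypothetical list named items, whose elements deliberately differ in length.

```r
# Hypothetical list whose elements differ in length
items <- list(1:3, 4:5, letters)

length(items)   # 3: the number of elements in the list itself
lengths(items)  # 3 2 26: the length of each element

# lengths() gives the same result as applying length() to every element
identical(lengths(items), sapply(items, length))  # TRUE
```

Because lengths returns one value per element, its result lines up with the rows of a list column, which is what makes it usable inside filter.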
Conclusion
In this article, we explored how to subset a vector of lists in R. We used the base R lengths function inside dplyr's filter to achieve our desired outcome and understood why it works by examining how it operates on lists.
Using lengths provides a concise way to filter data based on the size of individual elements within a list, which can be particularly useful when dealing with complex or nested data structures.
Last modified on 2023-06-17