Introduction
The problem presented in the Stack Overflow post is about obtaining data frames from a list of objects created with the stargazer function in R. stargazer formats and prints a table, here a table of summary statistics for a dataset, but what it returns is the formatted table as text rather than the statistics themselves. This makes the output difficult to work with directly.
Background
The stargazer function creates formatted tables from model objects and from datasets such as data frames and matrices. The base R split function divides a data frame into a list of data frames, one for each level of a grouping variable or combination of grouping variables. Applying stargazer to each element of such a list therefore yields a list of formatted tables, not a list of data frames holding the summary statistics.
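As a minimal sketch of what split returns, the following uses only base R and the built-in mtcars data. The variable name groups is illustrative and does not come from the original post.

```r
# mtcars ships with R; am is 0 (automatic) or 1 (manual)
groups <- split(mtcars, mtcars$am)

names(groups)          # one list element per value of am
class(groups[["0"]])   # each element is an ordinary data frame
sapply(groups, nrow)   # number of cars in each group
```

Each element keeps every column of the original data frame, so any per-group summary still has to be computed in a later step.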
Solution
There are several ways to solve this problem, but we’ll explore two approaches here: computing the statistics with the dplyr package (plus tidyr for reshaping), and using the map function from the purrr package on the stargazer output.
Approach 1: Using dplyr
The most straightforward way to achieve the desired output is to skip stargazer and compute the statistics with the dplyr package. We can use select, group_by, and summarise from dplyr, together with pivot_longer from tidyr, to create new data frames that contain only the summary statistics.
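# Install once if needed: install.packages(c("dplyr", "tidyr"))
library(dplyr)
library(tidyr)  # provides pivot_longer()
# Build one data frame of summary statistics per value of am
mtcars_sumstat <- mtcars %>%
  select(mpg:qsec, am) %>%
  pivot_longer(-am) %>%                  # long format: columns am, name, value
  group_by(am, name) %>%
  summarise(across(value, .fns = list(mean = mean, sd = sd, n = length), .names = "{.fn}"),
            .groups = "drop_last") %>%   # result stays grouped by am
  group_split()                          # one tibble per value of am
# View the output
mtcars_sumstat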
This approach produces a list of two data frames, one per value of am, each with one row per variable and the columns am, name, mean, sd, and n.
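For comparison, the same statistics can be sketched in base R alone, which avoids the package dependencies. This is an alternative technique, not the method from the original post, and the names vars, by_am, and sumstat_base are illustrative.

```r
# Columns to summarise: the first seven columns of mtcars (mpg through qsec)
vars <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec")

# One data frame per transmission type
by_am <- split(mtcars[vars], mtcars$am)

# For each group, build a data frame with one row per variable
sumstat_base <- lapply(by_am, function(d) {
  data.frame(
    name = names(d),
    mean = vapply(d, mean, numeric(1)),
    sd   = vapply(d, sd, numeric(1)),
    n    = vapply(d, length, integer(1)),
    row.names = NULL
  )
})

sumstat_base[["0"]]
```

The rows here follow the column order of mtcars, whereas the dplyr pipeline sorts them by variable name, so the two results hold the same numbers in a different row order.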
Approach 2: Using map
Another way to work with the output is to use the map function from the purrr package. We can call stargazer on each group and then apply map again to convert each result to a data frame.
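# Load the packages used below
library(dplyr)
library(purrr)      # provides map()
library(stargazer)
# Create a list of stargazer summary tables, one per value of am
mtcars_sumstat <- mtcars %>%
  select(mpg:qsec, am) %>%
  as.data.frame() %>%
  split(.$am) %>%
  # stargazer() prints each table and returns its lines as a character vector
  map(~ stargazer(., type = "text", summary.stat = c("n", "mean", "sd"))) %>%
  # wrap each character vector in a one-column tibble; "line" is an illustrative column name
  map(~ tibble(line = .x))
# View the output
mtcars_sumstat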
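This approach also produces a list of data frames, but each one holds the lines of the formatted text table rather than numeric columns, so the statistics would still need to be parsed out of the text.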
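To see why a conversion step is needed at all, it can help to inspect what stargazer hands back. The sketch below assumes the stargazer package is installed. The names groups and tables are illustrative, and capture.output is used only to keep the printed tables off the console.

```r
library(stargazer)

# Split the same columns as above by transmission type
cols <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "am")
groups <- split(mtcars[cols], mtcars$am)

# stargazer() prints each table and invisibly returns its lines of text
invisible(capture.output(
  tables <- lapply(groups, function(d) {
    stargazer(d, type = "text", summary.stat = c("n", "mean", "sd"))
  })
))

class(tables[["0"]])    # a character vector, not a data frame
head(tables[["0"]])     # the first few lines of the text table
```

Because each element is plain text, wrapping it in a data frame changes the container but not the content.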
Discussion
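The two approaches do not produce identical results. The dplyr pipeline returns the statistics as numeric columns that can be used directly in further analysis, and it is generally the more readable of the two. The stargazer route returns formatted text. The map function remains useful when working with complex pipelines or when you need to apply multiple functions to each element of a list.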
However, it’s worth noting that using stargazer directly might not be the best approach in this case, since it is designed to produce presentation-ready tables rather than data structures holding the statistics. Nevertheless, if you’re already familiar with stargazer, you can still use it to create the summary tables and then convert them to data frames.
Conclusion
In conclusion, obtaining data frames from a list of objects created with stargazer requires some creativity and familiarity with pipe-based workflows in R. Both approaches presented here return a list of data frames, but computing the statistics with dplyr is generally more readable and gives numeric results that are ready to use.
Last modified on 2023-05-23