Calculating Average and Median on Monthly Data and Converting to an HTML Table in R
In this article, we will explore how to calculate the average and median of monthly data using the R programming language. We'll also cover how to convert the output into an HTML table.
Introduction
R is a popular programming language used for statistical computing, data visualization, and data analysis. The dplyr library provides a grammar of data manipulation, which makes it easy to group, summarise, and transform data.
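As a minimal sketch of that grammar, the snippet below groups a small data frame by month and computes the mean and median of each group. The toy data frame and the Average and Median column names are illustrative assumptions, not part of the original example.

```r
library(dplyr)

# Hypothetical toy data: two months with a few values each
toy <- data.frame(
  Month = c("2017-12", "2017-12", "2017-12", "2018-01", "2018-01"),
  Val1  = c(18, 21, 28, 10, 15)
)

# group_by() defines the groups; summarise() collapses each group to one row
monthly <- toy %>%
  group_by(Month) %>%
  summarise(
    Average = mean(Val1),
    Median  = median(Val1)
  )

print(monthly)
```

The same group_by() and summarise() pattern underlies the monthly summary built later in the article.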
In this article, we'll focus on calculating the average and median of monthly data using the dplyr library. We'll also cover how to convert the output into an HTML table using the tableHTML package.
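To give a feel for the conversion step, here is a small sketch that turns a data frame into an HTML table with tableHTML. The summary_df data frame and its numbers are made up for illustration; only the rownames and caption arguments are used, and the tableHTML package is assumed to be installed.

```r
library(tableHTML)

# Hypothetical summary; the numbers are illustrative only
summary_df <- data.frame(
  Month   = c("2017-12", "2018-01"),
  Average = c(22.33, 12.5),
  Median  = c(21, 12.5)
)

# Build the HTML table without row names and with a caption
html_tbl <- tableHTML(summary_df, rownames = FALSE, caption = "Monthly summary")

# The object carries the class "tableHTML" and holds the HTML markup as text
html_text <- as.character(html_tbl)
```

The resulting object can be printed in an interactive session or passed on to the package's styling functions such as add_css_caption().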
The Problem
The problem presented in the question is a real-world scenario where we have a dataset with multiple variables, including Date1, T1, and Val1. We want to calculate the average and median of Val1 for each month of the year. We also want to add two new columns, Median of A and Median of B, which hold the median values of Val1 for T1 = "A" and T1 = "B", respectively.
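Grouping by month first requires a month key derived from Date1. The sketch below shows one way to do that in base R, assuming the timestamps are stored as text in "YYYY-MM-DD HH:MM:SS" form. Taking only the first ten characters means a malformed time part does not affect the result.

```r
# Timestamps in the same shape as Date1; the last one has an invalid
# seconds value (84), which is ignored because only the date part is parsed
stamps <- c("2018-01-10 15:05:24", "2017-12-28 19:45:10", "2017-10-07 08:02:84")

# Keep the "YYYY-MM-DD" prefix, parse it as a Date, and format as "YYYY-MM"
month_key <- format(as.Date(substr(stamps, 1, 10)), "%Y-%m")

print(month_key)
```

A "YYYY-MM" key sorts chronologically as plain text, which keeps the months in order after grouping.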
We’ll use the following code as an example:
library(dplyr)
library(lubridate)
library(tableHTML)

# Sample data. Row A-18 carries an out-of-range seconds value (08:02:84),
# so only the date part of Date1 is parsed below.
my_data <- read.table(header = TRUE, stringsAsFactors = FALSE, text =
"ID Date1 T1 Date2 Val1
A-1 '2018-01-10 15:05:24' A 2018-01-15 10
A-2 '2018-01-05 14:15:22' B 2018-01-14 12
A-3 '2018-01-04 13:20:21' A 2018-01-13 15
A-4 '2018-01-01 18:35:45' B 2018-01-12 22
A-5 '2017-12-28 19:45:10' A 2018-01-11 18
A-6 '2017-12-10 08:03:29' A 2018-01-10 21
A-7 '2017-12-06 20:55:55' A 2018-01-09 28
A-8 '2018-01-10 10:02:12' A 2018-01-15 10
A-9 '2018-01-05 17:15:14' B 2018-01-14 12
A-10 '2018-01-04 18:35:58' A 2018-01-13 15
A-11 '2018-01-01 21:09:25' B 2018-01-12 22
A-12 '2017-12-28 02:12:22' A 2018-01-11 18
A-13 '2017-12-10 03:45:44' A 2018-01-10 21
A-14 '2017-12-06 07:15:25' A 2018-01-09 28
A-18 '2017-10-07 08:02:84' B 2017-11-05 20
A-21 '2017-10-01 06:04:04' A 2017-10-20 15
A-51 '2017-09-20 08:07:07' A 2017-09-19 12
A-52 '2017-09-21 08:05:04' B 2017-09-18 16
A-53 '2017-09-22 08:03:06' B 2017-09-19 14
A-54 '2017-09-23 08:01:08' B 2017-09-18 13
A-55 '2017-09-24 07:59:10' B 2017-09-17 12
A-56 '2017-09-25 07:57:12' B 2017-09-16 11")

# Derive a "YYYY-MM" month key from the date part of Date1
my_data <- my_data %>%
  mutate(Month = format(ymd(substr(Date1, 1, 10)), "%Y-%m"),
         Val1 = as.numeric(Val1))

# One row per month, with count, sum, average, and median split by T1
table_2 <- my_data %>%
  group_by(Month) %>%
  summarise(
    `# of A` = sum(T1 == "A"),
    `sum of A` = sum(Val1[T1 == "A"], na.rm = TRUE),
    `Average of A` = round(mean(Val1[T1 == "A"], na.rm = TRUE), 2),
    `Median of A` = median(Val1[T1 == "A"], na.rm = TRUE),
    `# of B` = sum(T1 == "B"),
    `sum of B` = sum(Val1[T1 == "B"], na.rm = TRUE),
    `Average of B` = round(mean(Val1[T1 == "B"], na.rm = TRUE), 2),
    `Median of B` = median(Val1[T1 == "B"], na.rm = TRUE)
  ) %>%
  arrange(Month) %>%
  # Month-over-month growth as the ratio to the previous row; the first
  # month and any month following a zero yield Inf, which is replaced below
  mutate(
    `MOM Growth # of A` = round(`# of A` / lag(`# of A`, default = 0), 2),
    `MOM Growth sum of A` = round(`sum of A` / lag(`sum of A`, default = 0), 2),
    `MOM Growth # of B` = round(`# of B` / lag(`# of B`, default = 0), 2),
    `MOM Growth sum of B` = round(`sum of B` / lag(`sum of B`, default = 0), 2)
  ) %>%
  mutate(across(starts_with("MOM Growth"), ~ if_else(is.infinite(.x), 100, .x))) %>%
  # Drop months in which either type has no observations
  filter(!is.na(`Median of A`) & !is.na(`Median of B`)) %>%
  # Keep the A columns together, then the B columns, to match second_headers
  select(Month, ends_with(" of A"), ends_with(" of B"))

# Convert to an HTML table: 1 Month column + 6 A columns + 6 B columns = 13
html_table <- table_2 %>%
  as.data.frame() %>%
  tableHTML(rownames = FALSE,
            widths = rep(100, 13),
            second_headers = list(c(1, 6, 6), c("", "Status of A", "Status of B")),
            caption = "A & B consolidated") %>%
  add_css_caption(css = list(c("font-weight", "border"), c("bold", "1px solid black")))
The code above derives a month key from Date1, groups the rows by month, and calculates the count, sum, average, and median of Val1 separately for T1 = "A" and T1 = "B". The Median of A and Median of B columns hold the per-month median of Val1 for each type.
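One way to cross-check the per-type medians is base R's tapply(), which is a different technique from the dplyr pipeline and is shown here only as a sanity check. The sketch uses the September 2017 and January 2018 rows of the sample data, with the month key already attached; the toy data frame name is an assumption.

```r
# September 2017 and January 2018 rows from the sample data
toy <- data.frame(
  Month = c(rep("2017-09", 6), rep("2018-01", 8)),
  T1    = c("A", "B", "B", "B", "B", "B",
            "A", "B", "A", "B", "A", "B", "A", "B"),
  Val1  = c(12, 16, 14, 13, 12, 11,
            10, 12, 15, 22, 10, 12, 15, 22)
)

# Median of Val1 for every Month x T1 combination, returned as a matrix
med <- tapply(toy$Val1, list(toy$Month, toy$T1), median)

print(med)
```

The matrix has one row per month and one column per type, so med["2018-01", "A"] should agree with the Median of A value for January.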
The code then computes month-over-month (MOM) growth as the ratio of each month's value to the previous month's, and converts the result into an HTML table using the tableHTML package. The resulting table shows the per-month statistics of Val1 along with the MOM Growth # of A, MOM Growth sum of A, MOM Growth # of B, and MOM Growth sum of B columns.
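The growth ratio itself can be illustrated in a few lines of base R. The sketch below uses the per-month counts of type A rows from the sample data and, as an alternative design choice, leaves the first month as NA instead of substituting 100, since there is no previous month to compare against.

```r
# Number of type A rows per month in the sample data
counts <- c("2017-09" = 1, "2017-10" = 1, "2017-12" = 6, "2018-01" = 4)

# Shift the series by one position so each month lines up with its predecessor
previous <- c(NA, head(counts, -1))

# Ratio of each month to the previous one; the first month has no predecessor
growth <- round(counts / previous, 2)

print(growth)
```

Whether the undefined first value should appear as NA, 100, or something else depends on how the table will be read; the article's code uses 100.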
Conclusion
In this article, we demonstrated how to calculate the average and median of monthly data using the R programming language. We also covered how to convert the output into an HTML table using the tableHTML package.
The code provided can be used as a starting point for similar calculations. The dplyr library provides a powerful framework for data manipulation, which makes grouped summaries like these straightforward to write.
Last modified on 2024-09-22