Bootstrapping the Result of Arithmetic Operation of Regression Coefficients of Two Models Using R and the `boot` Function
=====================================================================

In this article, we will discuss how to bootstrap the result of an arithmetic operation between regression coefficients of two models. We’ll provide a step-by-step guide on how to achieve this using R and the boot function.

Creating Sample Data


To start, we need some sample data. Let’s create a small grouped dataset with three variables: ldose, numdead, and sex. The ldose variable is the dose level of a substance, numdead is the number of deaths observed at that dose, and sex indicates whether the group was male or female. Each row summarizes a group of 20 subjects, so we also compute numalive = 20 - numdead; together, numdead and numalive form the two-column response that a binomial glm expects.

# Creating sample data
library(boot) # provides boot() and boot.ci()

ldose <- rep(0:5, 2)
numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
sex <- factor(rep(c("M", "F"), c(6, 6)))
# One row per sex/dose group of 20 subjects; boot supplies the row
# indices itself, so no separate row id column is needed
dat <- data.frame(ldose, sex, numdead, numalive = 20 - numdead)
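A quick descriptive check can help before any modelling. The sketch below rebuilds the grouped data from the listing above under the illustrative name `grp` and computes the proportion of deaths per sex and their ratio. It ignores dose entirely, so it is only a rough reference point for the kind of ratio that an exponentiated sex coefficient represents in a log-link model, not an estimate that the models are expected to reproduce.

```r
# Rebuild the grouped data (the name `grp` is illustrative)
ldose <- rep(0:5, 2)
numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
sex <- factor(rep(c("M", "F"), c(6, 6)))
grp <- data.frame(ldose, sex, numdead, numalive = 20 - numdead)

# Total deaths and total subjects per sex
deaths <- tapply(grp$numdead, grp$sex, sum)
total  <- tapply(grp$numdead + grp$numalive, grp$sex, sum)

risk <- deaths / total                            # proportion dead per sex
crude_rr <- as.numeric(risk["M"] / risk["F"])     # crude ratio, male vs female

risk
crude_rr
```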

Creating the Function to be Bootstrapped


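Next, we need a function that computes the statistic of interest: the result of an arithmetic operation on regression coefficients from two models. boot calls this function once on the original data and once per resample, so it must accept two arguments: the data and a vector of row indices. Both models must be fitted to the resampled rows (data = d), and the two models must differ, otherwise the ratio is always 1. Here we use binomial regression with a log link (rather than the default logit link) and take the ratio of the exponentiated sex coefficient from the two models.

# Creating the function to be bootstrapped
out <- function(data, indices) {
  d <- data[indices, ] # allows boot to select sample
  # Model 1: sex, dose, and their interaction
  fit1 <- glm(cbind(numdead, numalive) ~ sex * ldose, data = d,
              family = binomial(link = "log"), start = c(-1, 0, 0, 0))
  # Model 2: main effects only (an illustrative choice of second model)
  fit2 <- glm(cbind(numdead, numalive) ~ sex + ldose, data = d,
              family = binomial(link = "log"), start = c(-1, 0, 0))
  numer <- exp(coef(fit1)[2]) # exponentiated sexM coefficient, model 1
  denom <- exp(coef(fit2)[2]) # exponentiated sexM coefficient, model 2
  unname(numer / denom)       # a single number per resample
}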

Doing Bootstrap


Now that we have our function, we can use the boot function to perform the bootstrap. Its three main arguments are data (the dataset), statistic (the function to be bootstrapped), and R (the number of bootstrap replicates). Setting a seed first makes the resampling reproducible.

# Doing bootstrap
set.seed(123)
results <- boot(data = dat, statistic = out, R = 1000)
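The calling convention is easier to see on a toy statistic. The following minimal sketch uses a made-up numeric vector `x` and a hypothetical helper `boot_mean` instead of the regression models of this article. It shows that boot hands the statistic the data plus a vector of resampled indices, and stores the results in `t0` and `t`.

```r
library(boot)

set.seed(1)                                      # reproducible resampling
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9)   # illustrative data

# Hypothetical statistic: must accept the data and an index vector
boot_mean <- function(data, indices) {
  mean(data[indices])                            # statistic on the resampled values
}

toy <- boot(data = x, statistic = boot_mean, R = 200)

toy$t0       # statistic on the original data, i.e. mean(x)
dim(toy$t)   # one row per replicate, one column per returned value
```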

Error Message


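If the statistic function is declared with a single argument, for example out <- function(dat), the call to boot fails with this error:

Error in statistic(data, original, ...): unused argument (original)

The message comes from inside boot, which calls statistic(data, original, ...) to compute the statistic on the original data. With the default settings, original is the vector of row indices 1 to n, passed as the second argument. A function that accepts only one argument has nowhere to put that index vector, so R reports it as unused. The fix is to give the statistic function a second argument and use it to subset the data.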
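The failure is easy to reproduce in isolation. In this small sketch, the vector `x` and the function names `one_arg` and `two_arg` are made up for illustration. The one-argument statistic triggers the same kind of error, while the two-argument version runs.

```r
library(boot)

x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.8)   # illustrative data

one_arg <- function(data) mean(data)                     # no index argument
two_arg <- function(data, indices) mean(data[indices])   # signature boot expects

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    boot(x, one_arg, R = 50)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg

set.seed(1)
ok <- boot(x, two_arg, R = 50)   # runs without error
```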

Using Resample for Bootstrap


A tempting workaround is to swap boot for a call to a function named resample. This does not help: the boot package does not export a resample function, so unless another attached package happens to define one, the call below stops with a "could not find function" error:

# Not a fix: the boot package has no resample() function
results <- resample(out, dat, 1000)

Even where a function of that name is available from another package, there is no reason to expect it to share boot’s interface. The error above is about the signature of the statistic function, so that is where the fix belongs.

Using Boot Function with Custom Arguments


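To get a working bootstrap, the statistic passed to boot must take the data and the index vector as its two arguments, fit both models on the resampled rows, and return the quantity of interest. The number of replicates goes in the R argument. Here’s an updated version of our code, with the statistic written inline:

# Doing bootstrap using boot function with a two-argument statistic
set.seed(123)
results <- boot(data = dat, statistic = function(data, indices) {
  d <- data[indices, ] # allows boot to select sample
  # Model 1: sex, dose, and their interaction
  fit1 <- glm(cbind(numdead, numalive) ~ sex * ldose, data = d,
              family = binomial(link = "log"), start = c(-1, 0, 0, 0))
  # Model 2: main effects only (an illustrative choice of second model)
  fit2 <- glm(cbind(numdead, numalive) ~ sex + ldose, data = d,
              family = binomial(link = "log"), start = c(-1, 0, 0))
  numer <- exp(coef(fit1)[2])
  denom <- exp(coef(fit2)[2])
  unname(numer / denom)
}, R = 1000)

The anonymous function subsets the data with the indices supplied by boot, refits both models on that resample via data = d, and returns the ratio of the exponentiated sex coefficients. boot evaluates it once on the original rows (stored in results$t0) and once for each of the 1000 resamples (stored in results$t). Log-link binomial models can be numerically fragile, so some resamples may produce convergence warnings or errors; the start values are there to help the fit get going.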
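The same pattern can be tried on a case that fits quickly and reliably. The sketch below is an illustration under stated assumptions, not the analysis of this article: it uses simulated data (`sim`), ordinary linear models via lm in place of the log-link glm, and a hypothetical statistic named `coef_ratio`. What it keeps is the structure: two different models refitted on the same resample, and one arithmetic result returned per replicate.

```r
library(boot)

set.seed(42)
n <- 100
# Simulated data (purely illustrative): y depends on x1 and x2
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 + 0.5 * sim$x2 + rnorm(n)

# Hypothetical statistic: ratio of the x1 slope in two different models
coef_ratio <- function(data, indices) {
  d <- data[indices, ]                  # resampled rows
  fit1 <- lm(y ~ x1, data = d)          # model 1: x1 only
  fit2 <- lm(y ~ x1 + x2, data = d)     # model 2: adjusted for x2
  unname(coef(fit1)["x1"] / coef(fit2)["x1"])
}

res <- boot(data = sim, statistic = coef_ratio, R = 500)

res$t0           # ratio on the original data
sd(res$t[, 1])   # bootstrap standard error of the ratio
```

Fitting both models inside one statistic function means each replicate uses the same resampled rows for both fits, so the dependence between the two coefficients carries through to the bootstrap distribution of the ratio.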

Conclusion


In conclusion, we’ve discussed how to bootstrap the result of an arithmetic operation between regression coefficients of two models using R and the boot function. The key requirement is that the statistic function accepts both the data and a vector of indices, and that every model inside it is fitted to the resampled rows. The error we encountered along the way came from a statistic function that accepted only one argument.

Additional Considerations


Here are some additional considerations when working with bootstrap:

  • Bootstrap Sampling: Bootstrap sampling is a resampling technique that draws new datasets of the same size from the observed data, sampling rows with replacement, and recalculates the statistic on each one. Repeating this many times gives an estimate of the variability of the statistic.
  • Bootstrapping for Confidence Intervals: Bootstrapping can be used to construct confidence intervals from the distribution of the bootstrap replicates, for example by taking its 2.5% and 97.5% quantiles as a 95% percentile interval. In the boot package, boot.ci computes several interval types from a boot object.
  • Bootstrapping for Hypothesis Testing: Bootstrapping can also be used for hypothesis testing by comparing the observed test statistic to a distribution of test statistics obtained by resampling in a way that is consistent with the null hypothesis.
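As a small illustration of the confidence interval point, the following sketch computes a 95% percentile interval two ways: by hand from the replicates, and with boot.ci. The sample `x` and the statistic `med_stat` are made up for the example. The two intervals use slightly different interpolation rules, so they should be close but need not match exactly.

```r
library(boot)

set.seed(7)
x <- rexp(60, rate = 0.5)   # illustrative skewed sample

# Hypothetical statistic: median of the resampled values
med_stat <- function(data, indices) median(data[indices])

res <- boot(data = x, statistic = med_stat, R = 1000)

# Percentile interval by hand: middle 95% of the bootstrap replicates
by_hand <- quantile(res$t[, 1], probs = c(0.025, 0.975))

# Percentile interval from boot.ci()
ci <- boot.ci(res, conf = 0.95, type = "perc")

by_hand
ci$percent[4:5]             # lower and upper limits
```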

By understanding how to use bootstrap effectively, you can gain more insights into your data and make more informed decisions.


Last modified on 2023-11-02