Refactoring Code for Subset Generation: A Step-by-Step Approach in R
Based on your original code and the provided solution, here is a refactored version that produces the desired subsets:
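# subset 20 rows before and 20 rows after each run of consecutive rows where lon == -180
n <- nrow(df)                     # number of rows; length(df) would count columns
inPlay <- which(df$lon == -180)   # row numbers where lon is -180
# Sample Size: rows to keep on each side of a run
S <- 20
diffPlay <- diff(inPlay)
# positions within inPlay where each run of consecutive rows ends / begins
stop <- c(which(diffPlay != 1), length(inPlay))
start <- c(1, which(diffPlay != 1) + 1)
# translate those positions into actual row numbers of df
inPlayStart <- inPlay[start]
inPlayStop <- inPlay[stop]
names(inPlayStart) <- ifelse(inPlayStop > inPlayStart,
                             paste0("Rows", inPlayStart, "_to_", inPlayStop),
                             paste0("Row", inPlayStart))
subsetsList <- lapply(seq_along(inPlayStart), function(i) {
  # clamp the range so it never leaves the data frame
  from <- max(1, inPlayStart[[i]] - S)
  to <- min(n, inPlayStop[[i]] + S)
  cat("i is ", i, "\trun length=", inPlayStop[[i]] - inPlayStart[[i]] + 1, "\t(from, to) = (", from, ", ", to, ")\tDIFF=", to - from, "\n")
  # subset the rows
  df[from:to, ]
})
names(subsetsList) <- names(inPlayStart)
# have a look at the results
subsetsList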
In this refactored code:
- We use `which` on `df$lon == -180` to identify the row indices where -180s occur. These indices tell us where each subset should be centered.
- To find where each run of consecutive -180 rows begins and ends, we take `diff(inPlay)`: wherever the difference is not 1, one run ends and the next begins. `start` and `stop` hold those positions within `inPlay`, and `inPlayStart` holds the row number at which each run begins.
- The ending point of each subset is the last row of the run plus `S`, so the whole run is included even when it spans more than one row.
- For runs at the beginning or end of the data, `max(1, ...)` and `min(n, ...)` clamp the range so it never falls outside the data frame. Such subsets simply contain fewer than `S` rows on that side.
- We use `lapply` to apply this subsetting process to every run.
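To make the steps above easier to check, here is a minimal, self-contained sketch on made-up data. The helper name `subset_around_runs`, its arguments, and the `toy` data frame are all hypothetical and chosen only for illustration; the logic follows the same `which` / `diff` / `lapply` approach, with an added guard for the case where no row matches.

```r
# Hypothetical helper (not part of the original code): returns one data frame
# per run of consecutive rows where `lon` equals `value`, padded by up to `S`
# rows on each side.
subset_around_runs <- function(df, value = -180, S = 20) {
  n <- nrow(df)
  hits <- which(df$lon == value)              # row numbers that match
  if (length(hits) == 0) return(list())       # nothing to subset
  breaks <- which(diff(hits) != 1)            # positions in `hits` where a run ends
  run_start <- hits[c(1, breaks + 1)]         # first row of each run
  run_stop  <- hits[c(breaks, length(hits))]  # last row of each run
  lapply(seq_along(run_start), function(i) {
    from <- max(1, run_start[i] - S)          # clamp at the first row
    to   <- min(n, run_stop[i] + S)           # clamp at the last row
    df[from:to, , drop = FALSE]
  })
}

# Toy data, invented for this example: -180 occurs in rows 4-5 and in row 9
toy <- data.frame(
  id  = 1:10,
  lon = c(170, 175, 179, -180, -180, -179, -175, -178, -180, -177)
)

res <- subset_around_runs(toy, S = 2)
lapply(res, function(x) x$id)
# first subset covers ids 2..7, second covers ids 7..10 (clamped at row 10)
```

With `S = 2`, the second subset would nominally extend to row 11, but `min(n, ...)` stops it at the last row, which is the clamping behavior described in the list above.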
Last modified on 2023-10-17