Understanding Spatial Autocorrelation in Mixed-Effect Models
Background and Introduction
Spatial autocorrelation is a common phenomenon in geospatial data where the values of a variable are not randomly distributed across space. This means that nearby observations tend to be similar, either because they share environmental conditions or because of other spatial structures. In the context of ecological or biological studies, spatial autocorrelation violates the assumption that observations are independent, which can lead to biased estimates and overly narrow confidence intervals if not properly accounted for.
Mixed-effect models are a class of statistical models that combine fixed and random effects to model complex relationships between variables. By including both, mixed-effect models can account for variation in the data between the levels of a grouping variable (e.g., year, region). In this article, we will explore how to perform a spatial autocorrelation test on residuals from a mixed-effect model.
Mixed-Effect Models and Spatial Autocorrelation
To understand why spatial autocorrelation is relevant in mixed-effect models, let’s first look at the basics of mixed-effect models. A mixed-effect model is defined as:
\[y_{ij} = \beta_0 + X_{i}\beta_1 + Z_{ij}\gamma + u_i\]
where $y_{ij}$ is the response for observation $j$ in group $i$, $\beta_0$ is the fixed intercept, $X_i$ is a vector of predictor variables with fixed-effect coefficients $\beta_1$, $Z_{ij}$ is the random-effects design term for observation $j$ in group $i$ with coefficient $\gamma$, and $u_i$ is the group-level random term for group $i$.
In our case, we have:
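\[\operatorname{logit}\big(\Pr(P_A = 1)\big) = \beta_0 + \beta_1\,Tmax + u_{year}\]
where $P_A$ is the presence-absence observation, $Tmax$ is the maximum temperature of the block in which the observation was made, $\beta_1$ is its fixed-effect coefficient, and $u_{year}$ is the random intercept for the year, written $(1 | year)$ in the model formula. The logit link matches the binomial family used to fit the model.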
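To make the structure of this model concrete, the following sketch simulates presence-absence data of the same form. It assumes a logit link, as in the glmer call used later in this article. The sample sizes, the parameter values, and the names n_years, n_per_year, beta0, beta1, sigma_year, and sim_data are illustrative assumptions, not estimates from the original data.

```r
set.seed(42)

# Illustrative (assumed) dimensions and parameter values
n_years    <- 5      # number of years (groups)
n_per_year <- 40     # observations per year
beta0      <- -0.5   # fixed intercept
beta1      <- 0.8    # fixed slope for standardized Tmax
sigma_year <- 0.6    # standard deviation of the year random intercept

year <- rep(seq_len(n_years), each = n_per_year)
Tmax <- rnorm(n_years * n_per_year)                # standardized maximum temperature
u    <- rnorm(n_years, mean = 0, sd = sigma_year)  # one random intercept per year

# Linear predictor on the logit scale, then presence probability
eta <- beta0 + beta1 * Tmax + u[year]
p   <- plogis(eta)

# Presence-absence draws (0/1)
P_A <- rbinom(length(p), size = 1, prob = p)

sim_data <- data.frame(P_A = P_A, Tmax = Tmax, year = factor(year))
head(sim_data)
```

Every observation in a given year shares the same value of u, which is what the random intercept for year expresses.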
Spatial Autocorrelation Tests
There are several tests available to detect spatial autocorrelation, including:
- Moran’s I: This test measures the correlation between neighboring observations. A positive value indicates that neighboring observations tend to be more similar than expected by chance.
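- Geary’s C: Similar to Moran’s I, but it is based on the squared differences between neighboring values rather than on their cross-products. Values below 1 indicate positive spatial autocorrelation, and the statistic is more sensitive to local differences than Moran’s I.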
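For reference, the standard definitions of the two statistics are given below, for values $x_i$ observed at $n$ locations, spatial weights $w_{ij}$, and $S_0 = \sum_i \sum_j w_{ij}$. The notation is the conventional one and is not taken from the original text.

```latex
\[
I = \frac{n}{S_0}\,
    \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}
         {\sum_{i}(x_i-\bar{x})^2},
\qquad
C = \frac{n-1}{2S_0}\,
    \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i-x_j)^2}
         {\sum_{i}(x_i-\bar{x})^2}
\]
```

With no spatial autocorrelation, the expected value of $I$ is $-1/(n-1)$ and that of $C$ is 1. An $I$ above its expectation, or a $C$ below 1, indicates positive autocorrelation.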
Extracting Residuals and Performing Spatial Autocorrelation Tests
The question you posed is whether extracting the residuals from your mixed-effect model using resid(modele) and performing a Moran’s I test on them is correct. Before we dive into this, let’s first look at why this might be problematic.
Temporal Lag Between Observations
Spatial autocorrelation tests work best when the observation points are relatively close to each other, both in space and in time. When residuals from observations separated by a significant temporal lag (here, different years) are pooled into a single test, points that are neighbors in space may not be comparable, and this can lead to biased estimates.
Spatial Autocorrelation Tests for Mixed-Effect Models
So, how do we perform spatial autocorrelation tests on mixed-effect models? There isn’t really a straightforward way to test the spatial autocorrelation of the fixed effects or random effects themselves. However, one approach is to compute the spatial autocorrelation of the residuals.
Computing Spatial Autocorrelation of Residuals
To do this, you can use the following code:
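# Load the packages: lme4 fits the model, ape provides Moran.I()
library(lme4)
library(ape)
# Load the data and fit the model
Data_Species_std_ <- read.csv("data.csv")
modele <- glmer(P_A ~ Tmax + (1 | year),
                data = Data_Species_std_, family = binomial(link = "logit"),
                control = glmerControl(optimizer = "bobyqa",
                                       optCtrl = list(maxfun = 1500000)))
# Compute the Pearson residuals (named res so that resid() is not masked)
res <- resid(modele, type = "pearson")
# Build inverse-distance weights from the coordinates
# (assumes longitude/latitude columns and that no rows were dropped for missing values,
# so that the residuals line up with the rows of the data)
coords <- cbind(Data_Species_std_$longitude, Data_Species_std_$latitude)
d <- as.matrix(dist(coords))
w <- 1 / d
w[!is.finite(w)] <- 0  # zero weight for self-pairs and duplicated coordinates
# Perform Moran's I test on the residuals
Moran.I(res, w)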
However, please note that performing a spatial autocorrelation test on the residuals of a mixed-effect model can be problematic when there is a temporal lag between observations, because the test treats residuals from different years as if they had been observed at the same time.
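To make explicit what a Moran’s I test on a vector of residuals computes, here is a minimal base R sketch with a permutation test. It runs on simulated stand-in values, not on the article's data. The function name moran_i and all variable names are hypothetical, and the inverse-distance weights are one assumed choice among many.

```r
# Moran's I for a numeric vector x and a square weight matrix w
moran_i <- function(x, w) {
  z  <- x - mean(x)
  n  <- length(x)
  s0 <- sum(w)
  (n / s0) * sum(w * outer(z, z)) / sum(z^2)
}

set.seed(1)

# Hypothetical stand-in data: 60 random locations and a residual-like vector
n      <- 60
coords <- cbind(runif(n), runif(n))
res    <- rnorm(n)

# Inverse-distance weights with a zero diagonal
d <- as.matrix(dist(coords))
w <- 1 / d
diag(w) <- 0

observed <- moran_i(res, w)

# Permutation test: shuffle the values over the locations to build the
# null distribution of I under no spatial structure
n_perm   <- 999
permuted <- replicate(n_perm, moran_i(sample(res), w))

# One-sided p-value for positive autocorrelation
p_value <- (sum(permuted >= observed) + 1) / (n_perm + 1)

c(observed = observed, expected = -1 / (n - 1), p_value = p_value)
```

The permutation approach avoids relying on a normal approximation, at the cost of recomputing the statistic many times.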
Temporal Lags in Spatial Autocorrelation Tests
When observations are separated by a significant temporal lag, pooling their residuals into one test can lead to biased estimates. This is because the spatial autocorrelation structure may change over time, so we cannot simply apply Moran’s I or Geary’s C to the pooled residuals without considering the temporal component.
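One way to respect the temporal component is to run the test separately within each year, so that only values observed in the same year are compared. The sketch below uses simulated stand-in data and a small hand-written Moran’s I. The helper moran_i_idw and the column names x, y, year, and res are assumptions for illustration.

```r
# Moran's I with inverse-distance weights for one set of points
moran_i_idw <- function(values, coords) {
  w <- 1 / as.matrix(dist(coords))
  w[!is.finite(w)] <- 0          # self-pairs and duplicated coordinates
  z <- values - mean(values)
  (length(z) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

set.seed(7)

# Hypothetical stand-in for model residuals with coordinates and a year label
toy <- data.frame(
  x    = runif(90),
  y    = runif(90),
  year = rep(2019:2021, each = 30),
  res  = rnorm(90)
)

# Split by year and compute Moran's I within each year
by_year   <- split(toy, toy$year)
i_by_year <- sapply(by_year, function(d) moran_i_idw(d$res, d[, c("x", "y")]))

i_by_year
```

Comparing the yearly values also shows whether the spatial structure is stable over time or changes from one year to the next.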
Alternatives for Temporally Lagged Data
So what alternatives do we have when there is a significant temporal lag between observations? One approach is to perform a spatial autocorrelation test on the original data, rather than on the residuals. This can be done by computing a spatial weight matrix that reflects the strength of the spatial relationships in the data.
Computing Spatial Weight Matrix
To do this, we can use the following code:
# Moran.I() comes from the ape package
library(ape)
# Create a distance matrix for the observations
# (uses the data frame loaded earlier; assumes longitude/latitude columns)
coords <- cbind(Data_Species_std_$longitude, Data_Species_std_$latitude)
distance_matrix <- as.matrix(dist(coords))
# Compute the inverse of the square root of the distance matrix
inverse_distance_matrix <- 1 / sqrt(distance_matrix)
# Zero out self-pairs and duplicated coordinates, where the distance is 0
inverse_distance_matrix[!is.finite(inverse_distance_matrix)] <- 0
# Row-normalize the inverse distance matrix so that each row sums to 1
spatial_weight_matrix <- inverse_distance_matrix / rowSums(inverse_distance_matrix)
# Perform Moran's I test on the original data using the spatial weight matrix
morans_i_test <- Moran.I(Data_Species_std_$P_A, spatial_weight_matrix)
morans_i_test
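By performing a spatial autocorrelation test on the original data using the spatial weight matrix, we can measure the spatial structure of the data directly. To account for the temporal component as well, the weights can be restricted to pairs of observations from the same year.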
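As a sketch of how a weight matrix can carry the temporal component, the example below sets the weight to zero for every pair of observations from different years and then row-standardizes. It uses simulated stand-in data. The variable names and the same-year rule are assumptions for illustration, not the only way to build space-time weights.

```r
set.seed(3)

# Hypothetical stand-in data: coordinates, year labels, and 0/1 observations
n      <- 40
coords <- cbind(runif(n), runif(n))
year   <- rep(c(2020, 2021), each = n / 2)
P_A    <- rbinom(n, size = 1, prob = 0.5)

# Inverse-distance weights with a zero diagonal
w <- 1 / as.matrix(dist(coords))
diag(w) <- 0

# Keep weights only for pairs observed in the same year
same_year <- outer(year, year, "==")
w_st <- w * same_year

# Row-standardize so that each row sums to 1
w_st <- w_st / rowSums(w_st)

# Moran's I with the space-time weights
z <- P_A - mean(P_A)
moran_st <- (n / sum(w_st)) * sum(w_st * outer(z, z)) / sum(z^2)
moran_st
```

With these weights, an observation is only ever compared with its spatial neighbors from the same year, so a single test statistic summarizes within-year spatial structure.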
Conclusion
In conclusion, when fitting mixed-effect models to spatially structured data, it’s essential to consider both the spatial and temporal components of the data. While performing a spatial autocorrelation test on residuals is possible, this approach has limitations due to the potential for biased estimates when residuals from observations separated by a temporal lag are pooled into a single test.
An alternative is to perform Moran’s I test using the original data and a spatial weight matrix that reflects the strength of the spatial relationships in the data. Restricting the comparisons to observations from the same year lets you account for both the spatial and temporal components of your mixed-effect model data.
Last modified on 2024-09-14