
Jillian Kingston

The Impact of Spatial Autocorrelation on Confidence Intervals In Group (Cluster) Sampling for Accuracy Assessment © 2004

Accuracy assessment is an important part of the image classification process to ensure that the accuracy of an image or a map is sufficient to fulfill its intended use. Survey methodologies, where most of the sampling statistics have been developed, are primarily based on the perfect mixture model (i.e. urns and balls), so spatial autocorrelation is not addressed in most accuracy assessments. Ignoring positive spatial autocorrelation often results in an underestimation of the variance of accuracy statistics, leading to overly optimistic (narrow) confidence intervals. Since very high positive spatial autocorrelation of error is frequently observed in nature, its effects should be considered when group (i.e. cluster) sampling is used for accuracy assessment.

A review of the literature pertaining to accuracy assessment, with emphasis on group sampling and spatial autocorrelation, is presented. A stochastic simulation has been designed to investigate the bias of variance estimates in a non-random landscape. Using a Gauss-Markov process, a binary (i.e. match/mismatch) landscape of 512 by 512 pixels is produced according to pre-selected levels of both overall match (0-100%) and spatial autocorrelation (0-1). Within this landscape, group samples of equal size (i.e. nearly square blocks which together total 2625 pixels, or approximately 1% of the image pixels) are systematically selected through one-stage cluster sampling according to twelve fixed sampling designs (consisting of 5, 7, 15, 21, 25, 35, 75, 105, 125, 175, 375, and 525 groups, respectively).
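The simulation described above can be illustrated with a minimal sketch. It is not the thesis's generator: it assumes a separable first-order autoregressive (Gauss-Markov) Gaussian field that is thresholded at a quantile to reach the target match level, so `rho` is the lag-one correlation of the latent field rather than of the binary landscape. The function names (`ar1_filter`, `simulate_landscape`, `systematic_block_counts`), the fixed 7 by 15 grid without a random start, and the exactly square 5 by 5 blocks are all assumptions made for illustration; they correspond to the 105-group design only.

```python
import numpy as np


def ar1_filter(noise, rho, axis):
    """Apply a stationary AR(1) recursion along one axis (requires 0 <= rho < 1)."""
    x = np.moveaxis(noise.copy(), axis, 0)
    # Scale the first slice to the stationary variance 1 / (1 - rho^2).
    x[0] /= np.sqrt(1.0 - rho ** 2)
    for i in range(1, x.shape[0]):
        x[i] += rho * x[i - 1]
    return np.moveaxis(x, 0, axis)


def simulate_landscape(size=512, match=0.8, rho=0.9, rng=None):
    """Return a boolean match/mismatch landscape (True = match).

    Assumption: the latent field is a separable AR(1) Gaussian field,
    thresholded so that a fraction `match` of the pixels are matches.
    """
    rng = np.random.default_rng(rng)
    field = rng.standard_normal((size, size))
    field = ar1_filter(field, rho, axis=0)
    field = ar1_filter(field, rho, axis=1)
    threshold = np.quantile(field, match)
    return field <= threshold


def systematic_block_counts(landscape, grid=(7, 15), block=(5, 5)):
    """Count matched pixels in equal-size blocks placed on a regular grid."""
    n_rows, n_cols = grid
    height, width = block
    rows = np.linspace(0, landscape.shape[0] - height, n_rows).astype(int)
    cols = np.linspace(0, landscape.shape[1] - width, n_cols).astype(int)
    return np.array(
        [landscape[r:r + height, c:c + width].sum() for r in rows for c in cols]
    )


if __name__ == "__main__":
    landscape = simulate_landscape(size=512, match=0.8, rho=0.9, rng=42)
    counts = systematic_block_counts(landscape)  # 105 groups of 25 pixels
    print(counts.size * 25, "pixels sampled; overall match:", landscape.mean())
```

The default grid and block size give 105 groups of 25 pixels, which totals the 2625 sampled pixels mentioned above; other designs would need their own grid and block dimensions.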

Within each group, the number of matched pixels is counted to give the group's accuracy (the response variable), and the variance of this accuracy among groups is calculated for each of the twelve designs. The sampled variances are then compared with binomial variance estimates for various combinations of levels of accuracy and spatial autocorrelation. The number of simulations (n = 1000) is sufficient for rigorous statistical inference.
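The comparison between a cluster-based variance and a binomial variance can be sketched as follows. This is an illustration under stated assumptions, not the thesis's procedure: it uses the textbook estimator for one-stage cluster sampling with equal-size groups (between-group variance of the group proportions divided by the number of groups, ignoring the finite population correction) and the binomial variance that would apply if all sampled pixels were independent. The function name `variance_comparison` and the example counts are hypothetical.

```python
import numpy as np


def variance_comparison(counts, group_size):
    """Compare cluster-based and binomial variance estimates of overall accuracy.

    counts     : matched pixels per group (equal-size groups assumed)
    group_size : number of pixels in each group
    Returns (p_hat, var_cluster, var_binomial).
    """
    counts = np.asarray(counts, dtype=float)
    k = counts.size
    p_groups = counts / group_size           # accuracy within each group
    p_hat = p_groups.mean()                  # overall accuracy estimate
    # Between-group variance of the mean (finite population correction ignored).
    var_cluster = p_groups.var(ddof=1) / k
    # Variance if every sampled pixel were an independent Bernoulli trial.
    var_binomial = p_hat * (1.0 - p_hat) / (k * group_size)
    return p_hat, var_cluster, var_binomial


if __name__ == "__main__":
    # Hypothetical counts with strongly clustered errors: groups are either
    # entirely matched or entirely mismatched.
    example = [25, 25, 25, 0, 25, 25, 25, 0, 25, 25]
    p_hat, v_cluster, v_binom = variance_comparison(example, group_size=25)
    print("accuracy:", p_hat)
    print("variance ratio (cluster / binomial):", v_cluster / v_binom)
```

A ratio well above one, as in this deliberately extreme example, indicates that the binomial formula understates the variance and would yield confidence intervals that are too narrow.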

Although landscapes displaying uniform spatial autocorrelation will rarely occur in nature, this simulation exercise is designed to draw attention to the bias in variance estimation that results from spatial autocorrelation. It will contribute to the general understanding of the behaviour of spatial data under given conditions, and it is envisioned that the results can be applied primarily in accuracy assessments of remotely sensed data.
