At the Dino09 conference, Anne de Vernal and Taoufik Radi gave a workshop “Dinocyst assemblages as proxy in late Cenozoic paleoceanography: towards quantitative reconstructions using transfer functions”.
The training manual for the workshop contains this:
When the database is very large, the leave-one-out method may result in underestimation of the error because of spatial autocorrelation (e.g., Telford and Birks, 2005). However, in the case of dinocyst database characterized by large environmental gradients, spatial autocorrelation is not a major issue and it does not affect more MAT than other transfer function techniques (Guiot and de Vernal, 2011).
Some of this is true. It is easy to demonstrate that leave-one-out cross-validation underestimates uncertainty in spatially autocorrelated environments: by h-block cross-validation (where the test sample, and all samples within h km of it, are omitted from the training set), by using spatially independent test sets, or by showing how well artificial spatially autocorrelated variables can be reconstructed.
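The h-block idea is simple enough to sketch in a few lines. Here is a minimal Python illustration; the function names, the great-circle distance helper, and the `predict` interface are all mine, invented for this sketch rather than taken from any particular software.

```python
# Sketch of h-block cross-validation on hypothetical data.
# For each test sample, every training sample within h km of it is
# excluded, so spatial autocorrelation cannot leak information into
# the prediction. With h = 0 this reduces to ordinary leave-one-out.
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

def h_block_rmsep(species, env, lats, lons, h_km, predict):
    """RMSEP with each test sample's h-km neighbourhood left out.

    `predict(train_species, train_env, test_species)` can be any
    transfer function (MAT, WA, WAPLS, ...) returning one estimate.
    """
    errors = []
    for i in range(len(env)):
        d = haversine_km(lats[i], lons[i], lats, lons)
        keep = d > h_km  # drop the test sample and its neighbours
        est = predict(species[keep], env[keep], species[i])
        errors.append(est - env[i])
    return np.sqrt(np.mean(np.square(errors)))
```

Plotting RMSEP against a range of h values then shows how much of a method's apparent skill depends on having near neighbours in the training set.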
But the database does not have to be large for spatial autocorrelation to be a potential problem. I would worry about a foram sea-level training set of 20 samples on a transect across a salt marsh being influenced by spatial autocorrelation. With regard to ocean-scale training sets, the more samples there are, the more serious the potential problem. I doubt that autocorrelation is a problem for Imbrie & Kipp's (1971) 61-sample foram-SST training set, but when there are an order of magnitude more samples, there is a clear problem. It is the density of samples, rather than simply their number, that causes problems.
It is not immediately obvious why large environmental gradients should save dinocyst training sets from autocorrelation: this sounds like special pleading. Certainly the large gradients don't help the foram-SST training set, where autocorrelation makes the modern analogue technique (MAT) underestimate the uncertainty by a factor of ~2.
Contrary to what de Vernal and Radi write here, we should expect autocorrelation to be a more severe problem for MAT than for methods like weighted averaging or weighted averaging partial least squares (WAPLS). MAT considers only the most taxonomically similar analogues, and completely ignores the remainder of the dataset. If there is strong spatial autocorrelation, the best analogues will tend to be geographically close to the test site during cross-validation, and so will have apparently good estimates for any spatially structured environmental variable, whether meaningful or not. In contrast, WAPLS considers the entire training set when calculating the species optima, so it is much less sensitive to local conditions.
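MAT's reliance on a handful of nearest analogues is easy to see in code. A minimal sketch follows; the squared chord distance and the default of five analogues are conventional choices for percentage assemblage data, not details from the workshop manual, and the function name is mine.

```python
# Minimal sketch of the modern analogue technique (MAT).
# Only the k taxonomically closest training samples contribute to the
# estimate; everything else in the training set is ignored, which is
# why spatially close near-duplicates dominate under leave-one-out.
import numpy as np

def mat_predict(train_species, train_env, test_species, k=5):
    """Average the environment of the k closest analogues.

    Dissimilarity is the squared chord distance, commonly used for
    proportion/percentage assemblage data.
    """
    d = np.sum((np.sqrt(train_species) - np.sqrt(test_species)) ** 2,
               axis=1)
    nearest = np.argsort(d)[:k]
    return train_env[nearest].mean()
```

Nothing in the estimate depends on samples outside the k nearest, so if those happen to be geographical neighbours of the test site, the cross-validated error says little about performance on genuinely independent samples.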
This contrast between WAPLS and MAT can be demonstrated by using h-block cross-validation on a spatially autocorrelated training set. With MAT, the performance tends to be very high when h is small, but falls rapidly as h increases. WAPLS tends to perform less well than MAT when h is small, but because its performance degrades less as h increases, it performs better when h is large.
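The shape of that experiment can be sketched with a toy simulation. Everything below is invented for illustration: the one-dimensional transect, the smoothed-noise environment, the unimodal species responses, and the use of simple weighted averaging with inverse deshrinking as a stand-in for WAPLS. Whether the qualitative pattern emerges clearly depends on the noise and smoothing parameters; this is a sketch of the method, not a reproduction of any published analysis.

```python
# Toy h-block comparison of MAT and weighted averaging (WA) on a
# synthetic, spatially autocorrelated transect. All data are invented.
import numpy as np

rng = np.random.default_rng(42)

# --- synthetic transect: spatially smooth environment ---------------
n = 150
pos = np.arange(n, dtype=float)             # site positions, "km"
kernel = np.ones(41) / 41
env = np.convolve(rng.normal(size=n + 40), kernel, mode="valid") * 10

# --- species with unimodal responses to env, plus noise -------------
m = 30
optima = np.linspace(env.min(), env.max(), m)
abund = np.exp(-((env[:, None] - optima[None, :]) ** 2) / 2.0)
abund += rng.uniform(0, 0.3, size=abund.shape)
abund /= abund.sum(axis=1, keepdims=True)   # to proportions

def mat(train_y, train_x, test_y, k=5):
    """MAT: mean env of the k closest analogues (squared chord)."""
    d = np.sum((np.sqrt(train_y) - np.sqrt(test_y)) ** 2, axis=1)
    return train_x[np.argsort(d)[:k]].mean()

def wa(train_y, train_x, test_y):
    """Simple WA with inverse deshrinking (stand-in for WAPLS)."""
    u = train_y.T @ train_x / train_y.sum(axis=0)   # species optima
    fitted = train_y @ u / train_y.sum(axis=1)      # initial estimates
    b, a = np.polyfit(fitted, train_x, 1)           # deshrink
    return a + b * (test_y @ u / test_y.sum())

def rmsep_at_h(predict, h):
    """h-block RMSEP along the transect."""
    err = [predict(abund[np.abs(pos - pos[i]) > h],
                   env[np.abs(pos - pos[i]) > h],
                   abund[i]) - env[i]
           for i in range(n)]
    return np.sqrt(np.mean(np.square(err)))

# With a spatially smooth environment, MAT's h = 0 score is typically
# flattered by near-duplicate neighbours; the exact numbers depend on
# the seed and parameters chosen above.
for h in (0, 10, 30):
    print(h, round(rmsep_at_h(mat, h), 3), round(rmsep_at_h(wa, h), 3))
```

Because WA estimates its species optima from the whole training set, removing a test site's neighbourhood changes its fit only slightly, while MAT is forced to reach for analogues that are no longer near-duplicates.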