In eight-dimensional space, no one can hear your data scream

Mauri et al (2015) make a gridded climate reconstruction for Europe over the Holocene based on almost 900 pollen stratigraphies. The hope is that the reconstruction will be useful for evaluating climate models. This is a useful goal – if the climate models can generate realistic Holocene climates, our confidence in their predictions for the future should increase. This hope will only be realised if the reconstructions are reliable and unbiased, and come with realistic estimates of reconstruction uncertainty.

Mauri et al reconstruct eight climatic variables: mean summer (JJA), winter (DJF) and annual temperature and precipitation, mean annual GDD5 (growing degree days over 5 °C) and mean annual P–E (precipitation minus evaporation).

Mauri et al are not the first to try to reconstruct many environmental variables simultaneously, nor do they have the record for the most variables (I seem to recall one paper reconstructing over 20 variables from pollen, but cannot find the reference). But like so many papers, Mauri et al do not consider whether they really can reconstruct so many variables. I do not dispute that all the variables reconstructed by Mauri et al are important to plants (actually I do – I don’t think that plants care about annual mean temperature per se, but are instead sensitive to seasonal temperatures and their combination), but I am sceptical that all can be reconstructed at all sites. It would seem more plausible to me if some variables could be satisfactorily reconstructed in some sites, and other variables in other sites depending on which variables are limiting species abundances.

Mauri et al have this to say about their choice of environmental variables:

Here we do not attempt to justify our choice of parameters, other than to point to the extensive peer reviewed literature in which these parameters have already been applied.

Not an entirely satisfactory justification. Would it not have been better to try to determine which variables can be reconstructed, and where, before providing gridded reconstructed climate? One can hardly expect the climate modeller to do this.

Mauri et al use the modern analogue technique (MAT, aka k-nearest neighbours) based on 4700 modern pollen and climate observations. MAT is sensitive to spatial autocorrelation, which makes the transfer function model appear more precise than can be justified. Mauri et al are aware of the problem and describe it accurately:

The performance of a training-set is often estimated using cross-validation techniques, but performance can be over estimated as a result of spatial autocorrelation from geographically close analogues.

But then comes this:

The extent of this problem has not generally been considered to be significant enough to limit the application of the MAT technique, and indeed the spatial structure in the data may still be an important function of the climatic response, especially at regional scales (Bartlein et al., 2010).

It is perhaps true that spatial autocorrelation has not “been considered … enough”. Telford and Birks (2009) demonstrate that spatial autocorrelation can greatly bias the performance estimates of pollen transfer functions that use MAT. If the true uncertainty is, say, 50% higher than apparent, we risk finding that climate models do not agree with the data when the agreement is actually reasonably good. I’ve no idea what Mauri et al mean by the second part of this sentence – time to look at Bartlein et al (2011) who have a paragraph about spatial autocorrelation.

Standard goodness-of-fit statistics such as R2 may overestimate the predictive power of climate reconstructions (especially those made with the modern analogue technique) due to unaccounted-for spatial autocorrelation in the response variables (e.g. Telford and Birks 2005).


The extent of this effect in published pollen-based climate reconstructions cannot easily be quantified.

Mainly because no one has tried.

However, it should be noted that the spatial autocorrelation of vegetation composition at a regional scale derives almost entirely from its causal relation to climate, provided that attention is confined to variables that influence the growth, establishment and regeneration of plants (Harrison et al. 2009).

Harrison et al do not appear to have tested this conjecture. This conjecture can only be true if non-climate factors such as soil are not important and there is no dispersal limitation. Evidence for dispersal limitation in European trees can be found in Nogués-Bravo et al (2014) and in the ability of European trees to naturalise far outside their native range (e.g., Abies alba in Denmark).

Spatial pattern in pollen data thus constitutes valuable information for the reconstruction, to be retained rather than rejected (Legendre 1993; Legendre and Legendre 1998).

This is true only if you make the assumption that the spatial structure of climate did not change in the past. This assumption is very unlikely to be valid.

In any case, spatial autocorrelation in pollen data becomes non-significant at length scales of 200–300 km, and is slight at any scale when full taxon lists are used (Sawada et al. 2004).

Sawada et al did not examine the autocorrelation in the pollen data but the autocorrelation in MAT residuals. Since MAT uses the spatial structure in the data to artificially improve its fit, the autocorrelation in the residuals will be smaller than that in the data. In any case, 200-300 km is sufficiently large to contain many potential analogues in cross-validation and bias the performance estimates.

Guiot and de Vernal (2007) showed that the goodness-of-fit (as measured by the R2 statistic) is an appropriate measure when spatial autocorrelation in the pollen data arises from the underlying climate and not from processes internal to the vegetation system.

Guiot and de Vernal (2007) is simply wrong. The authors were equally wrong in 2011.

Bartlein et al. appears not to give any robust support to the argument in Mauri et al.
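The bias at issue is easy to illustrate with a toy simulation. What follows is a minimal sketch under loud assumptions, not taken from any of the papers discussed: the sites, the "climate" field and the three nuisance fields are all synthetic, the function names are hypothetical, Euclidean distance stands in for the squared-chord distance normally used with pollen, and k = 5 and h = 300 km are arbitrary. The pseudo-assemblages respond only to the nuisance fields, so any apparent skill at predicting the climate field comes from shared spatial structure. The sketch compares ordinary leave-one-out cross-validation with a version that also drops every site within h km of the test site (h-block cross-validation, Telford and Birks 2009).

```python
import math
import random

def smooth_field(rng, n_waves=4):
    """Return f(x, y): a sum of long-wavelength cosine waves with random
    direction and phase, i.e. a spatially autocorrelated surface."""
    waves = []
    for _ in range(n_waves):
        wavelength = rng.uniform(1500.0, 3000.0)  # km
        angle = rng.uniform(0.0, 2.0 * math.pi)
        phase = rng.uniform(0.0, 2.0 * math.pi)
        k = 2.0 * math.pi / wavelength
        waves.append((k * math.cos(angle), k * math.sin(angle), phase))
    return lambda x, y: sum(math.cos(kx * x + ky * y + p) for kx, ky, p in waves)

def mat_rmsep(coords, assemblages, target, k=5, h=0.0):
    """RMSEP of a k-analogue MAT under leave-one-out cross-validation.
    With h > 0, sites within h km of the test site are also excluded
    (h-block cross-validation)."""
    n = len(coords)
    squared_errors = []
    for i in range(n):
        candidates = []
        for j in range(n):
            if j == i or math.dist(coords[i], coords[j]) <= h:
                continue  # the test site itself, or too close to it
            # Assumption: Euclidean dissimilarity, not squared-chord distance
            dissimilarity = math.dist(assemblages[i], assemblages[j])
            candidates.append((dissimilarity, target[j]))
        candidates.sort()
        # Prediction is the unweighted mean of the k closest analogues
        prediction = sum(t for _, t in candidates[:k]) / k
        squared_errors.append((prediction - target[i]) ** 2)
    return math.sqrt(sum(squared_errors) / n)

# Synthetic calibration set: 400 sites in a 2000 x 2000 km region
rng = random.Random(2015)
coords = [(rng.uniform(0, 2000), rng.uniform(0, 2000)) for _ in range(400)]
climate = smooth_field(rng)                        # variable to "reconstruct"
nuisance = [smooth_field(rng) for _ in range(3)]   # what the "taxa" respond to
target = [climate(x, y) for x, y in coords]
assemblages = [[f(x, y) + rng.gauss(0, 0.05) for f in nuisance]
               for x, y in coords]

loo = mat_rmsep(coords, assemblages, target)
hblock = mat_rmsep(coords, assemblages, target, h=300.0)
print("leave-one-out RMSEP:", round(loo, 2))
print("h-block RMSEP (h = 300 km):", round(hblock, 2))
```

In this setup the leave-one-out RMSEP should come out well below the h-block value, even though the assemblages know nothing about the target variable except where they are.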

Back to Mauri et al. They attempt a solution to spatial autocorrelation:

In evaluating our transfer function, we have tried to take account of the auto-correlation problem by adopting an n-fold-leave-one-out cross validation which provides a more reliable estimate of the model performance than simple leave-one-out cross-validation (Barrows and Juggins, 2005).

[n-fold cross-validation provides] a more reliable estimate of the model performance than is provided by simple leave-one-out cross-validation, especially using MAT where the effect of spatial autocorrelation can otherwise cause uncertainty to be under-estimated (Barrows and Juggins, 2005).

Barrows and Juggins (2005) contains nothing about spatial autocorrelation; it cannot be used to justify n-fold cross-validation as a solution to spatial autocorrelation.

With MAT, spatially close observations in the calibration set are often selected as analogues. If they were being selected just because they are similar in the environmental variable of interest, there would be no problem. However, they may be being selected as analogues because they are similar for many environmental variables. One way to deal with spatial autocorrelation is to exclude observations that are spatially close to the test observation during cross-validation – h-block cross-validation (Telford and Birks 2009). n-fold cross-validation will remove some spatially close observations, on average 1/n of them, but will leave (n-1)/n of them to affect the analysis. If you think n-fold cross-validation is a solution to spatial autocorrelation, I have a sieve you can use as an umbrella.
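The 1/n arithmetic can be checked by counting. The sketch below is only an illustration under assumed values: 300 randomly placed hypothetical sites, a 250 km neighbourhood, random 10-fold groups, and function names of my own invention. For every test site it counts how many sites within h km are still in the training set under each scheme.

```python
import math
import random

def assign_folds(n_sites, n_folds, rng):
    """Randomly deal sites into n_folds groups of near-equal size."""
    order = list(range(n_sites))
    rng.shuffle(order)
    fold = [0] * n_sites
    for position, site in enumerate(order):
        fold[site] = position % n_folds
    return fold

def nfold_training(i, fold):
    """n-fold CV: train on every site outside the test site's fold."""
    return [j for j in range(len(fold)) if fold[j] != fold[i]]

def hblock_training(i, coords, h):
    """h-block CV: train on every site more than h from the test site."""
    return [j for j in range(len(coords))
            if j != i and math.dist(coords[i], coords[j]) > h]

def fraction_near_retained(coords, h, n_folds, seed=1):
    """Fraction of near neighbours (within h) left in the training set,
    pooled over all test sites, for n-fold and for h-block CV."""
    rng = random.Random(seed)
    fold = assign_folds(len(coords), n_folds, rng)
    near = kept_nfold = kept_hblock = 0
    for i in range(len(coords)):
        close = {j for j in range(len(coords))
                 if j != i and math.dist(coords[i], coords[j]) <= h}
        near += len(close)
        kept_nfold += len(close.intersection(nfold_training(i, fold)))
        kept_hblock += len(close.intersection(hblock_training(i, coords, h)))
    return kept_nfold / near, kept_hblock / near

# Hypothetical calibration set: 300 sites in a 2000 x 2000 km region
rng = random.Random(42)
sites = [(rng.uniform(0, 2000), rng.uniform(0, 2000)) for _ in range(300)]
nfold_frac, hblock_frac = fraction_near_retained(sites, h=250, n_folds=10)
print(f"10-fold keeps {nfold_frac:.0%} of neighbours within 250 km")
print(f"h-block keeps {hblock_frac:.0%}")
```

With ten folds, roughly nine in ten of the nearby sites remain available as analogues, whereas h-block cross-validation removes all of them by construction.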

Mauri et al plot their reconstruction uncertainties, which is commendable, but they don’t tell the reader how they calculated the uncertainties, nor how many analogues were used. The uncertainties resulting from the calibration set being dominated by moss-polsters while the reconstructions are from lakes and bogs are also not discussed. Perhaps these are terribly tedious methodological details, but they are needed to be able to properly evaluate the paper.

Mauri et al is certainly a better analysis than Davis et al (2003), which it supersedes, but it falls short of what could have been achieved.


About richard telford

Ecologist with interests in quantitative methods and palaeoenvironments

3 Responses to In eight-dimensional space, no one can hear your data scream

  1. Eli Rabett says:

    Any gardener can tell you that perennials are most sensitive to extreme winter and summer temps. Don’t know if there is a paleo way to get at this.

    • We generally hope that extreme winter cold correlates well with mean winter temperature. Data for means are readily available as gridded data, for example WorldClim. I don’t know if data for extremes are available, except as station data. It might be interesting to test how good the correlation is with station data – I suspect some places have higher variance than others.

  2. Pingback: Assessing performance and seasonal bias of pollen-based climate reconstructions in a perfect model world | Musings on Quantitative Palaeoecology
