The ‘New York’ principle of site selection

If I can make it there,
I’ll make it anywhere.
It’s up to you, New York, New York.

Palaeoecologists typically try to choose sites where the environmental variable they want to reconstruct is likely an important, ideally the most important, variable determining microfossil assemblages in the past. If other environmental variables are important, the basic assumptions of transfer functions risk being violated and the reconstruction may be spurious, driven by the other variables.

5. Other environmental variables than the one(s) of interest (Xf) have had negligible influence on Yf during the time window of interest, the joint distribution of these variables of interest in the past was the same as today, or their effect on Yf did not lead to past changes in assemblage states resembling shifts indicative in the modern environment of changes in the variable of interest

Most palaeoecologists also try to minimise non-analogue problems by choosing sites that are similar to the calibration set that the transfer function uses.

These two site-selection guidelines make Speke Hall Lake, a polluted eutrophic lake near Liverpool, a curious choice for reconstructing July air temperature from chironomid head capsules using the Norwegian chironomid calibration set. But this is what Lang et al (2017) have done. They find a statistically significant correlation between the reconstruction and instrumental records of July temperature from Anglesey (r = 0.620; n = 16; p = 0.01) and declare that

This study demonstrated that a chironomid-based temperature inference model can produce reliable estimates of mean July air temperature, even from a lake that has experienced large changes in heavy metal and sulphur inputs, and trophic status.

Or in other words, if you can reconstruct temperature in Speke Hall Lake, you can reconstruct temperature anywhere.

I would not be so hasty to ignore the assumptions of transfer functions, lest we exemplify the “sick science” problem (curiously, Juggins (2013) is not cited despite its relevance). Given the enormous ecological, chronological, and taphonomic difficulties that high-resolution chironomid reconstructions face (insurmountable at annual resolution, challenging at decadal resolution), I would judge it far more likely that the reported correlation is due to chance than that everything we know about the limitations of transfer functions is wrong. No single study at p = 0.01 is going to change my mind (you can find homoeopathy studies with lower p-values), and the review of high-resolution reconstructions that I am writing shows there are serious problems with many of the ten sub-decadal chironomid-temperature reconstructions that I have found.

I am entirely happy to ascribe the key result from Speke Hall Lake to chance, but there are some other aspects of the paper which merit attention.


Lang et al use the constant rate of supply (CRS) model to produce a chronology from their 210Pb data. The CRS model is

$$t = \frac{1}{\lambda} \ln\left(\frac{A(0)}{A}\right)$$

where t is the age of the sample, λ is the 210Pb decay constant, A(0) is the total unsupported 210Pb inventory and A is the inventory below the sample being dated. The resulting age-depth model will always be monotonic, as the inventory below the sample being dated will always decline with depth. The CRS model shown in Lang et al is not monotonic (fig 1b).
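The monotonicity argument can be checked with a minimal sketch of the standard CRS calculation. The inventories below are invented for illustration and are not the Speke Hall Lake data; the function name `crs_ages` and the half-life of about 22.3 years are my assumptions, not details taken from Lang et al.

```python
import math

PB210_HALF_LIFE_YEARS = 22.3  # commonly quoted value; an assumption here
DECAY_CONSTANT = math.log(2) / PB210_HALF_LIFE_YEARS  # lambda, per year


def crs_ages(section_inventories):
    """Return CRS ages (years before coring) at the top of each section.

    section_inventories: unsupported 210Pb inventory of each sediment
    section, ordered from the surface downwards (hypothetical units).
    """
    total = sum(section_inventories)  # A(0): total unsupported inventory
    remaining = total                 # A: inventory below the current depth
    ages = []
    for inventory in section_inventories:
        ages.append(math.log(total / remaining) / DECAY_CONSTANT)
        remaining -= inventory        # move down one section
    return ages


# Invented inventories, surface first
ages = crs_ages([400.0, 300.0, 150.0, 100.0, 50.0])
for depth_index, age in enumerate(ages):
    print(depth_index, round(age, 1))
```

Because every section holds a positive inventory, the remaining inventory can only shrink with depth, so the ages can only increase. An age reversal cannot come out of this calculation on its own.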


Lang et al Fig 1. Speke Hall Lake location (a), chronology (b), and core matching with magnetic susceptibility measurements (c).

From the timing of the impossible wiggle, it looks like the 137Cs peak from atmospheric bomb testing might have been included as an age rather than a check on the CRS model. I hope this is simply a plotting problem and that the ages of the chironomid samples are unaffected.

The 210Pb dates are on a different core from the chironomid samples. The chronology is transferred to the chironomid stratigraphy by aligning the magnetic susceptibility record. The overall agreement between the two mag sus records is excellent (Fig 1c), but the details are not perfectly reproduced. Since these details are used to align the records, there will inevitably be some error in the alignment. It is not clear from the paper if this uncertainty is accounted for (even an error of 2 years would seriously degrade the expected correlation between the reconstruction and the instrumental record).
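A rough sketch of why a small misalignment matters: if year-to-year temperature variability has little persistence, a record misdated by a couple of years is being compared with the wrong years. The simulation below assumes an AR(1) temperature series with a weak persistence parameter (`phi = 0.2`, an invented value) and ignores any long-term trend, which would survive misalignment. It is an illustration, not an analysis of the Speke Hall data.

```python
import random


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def misdated_correlation(series, lag):
    """Correlation between a series and a copy of itself misdated by `lag` years."""
    if lag == 0:
        return pearson_r(series, series)
    return pearson_r(series[lag:], series[:-lag])


rng = random.Random(7)
phi = 0.2  # assumed weak year-to-year persistence (hypothetical)
temperature = []
x = 0.0
for _ in range(2000):
    x = phi * x + rng.gauss(0, 1)
    temperature.append(x)

for lag in (0, 1, 2):
    print(lag, round(misdated_correlation(temperature, lag), 2))
```

For an AR(1) process the expected correlation at a lag of k years is phi to the power k, so with weak persistence a perfect proxy misdated by two years would be expected to correlate only weakly with the instrumental record.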

The ordination

Lang et al do a constrained ordination and find that their variables explain 68% of the variance in the chironomid stratigraphy. This seems impressive until you realise that they used seven predictor variables and have fourteen fossil samples. Given the strong autocorrelation, especially in the geochemical variables, I suspect this result is little better than chance. Had 13 variables been used, they would have explained 100% of the variance!
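The overfitting problem can be illustrated with pure noise. The sketch below uses a linear, RDA-style calculation (projection of centred responses onto the centred predictors) rather than the CCA used by Lang et al, and it uses independent Gaussian noise for both the "taxa" and the "predictors", so it understates the problem when the data are autocorrelated. All names and the choice of 20 taxa are assumptions for illustration.

```python
import random


def centre(column):
    mean = sum(column) / len(column)
    return [v - mean for v in column]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(columns, tol=1e-10):
    """Gram-Schmidt on centred predictor columns."""
    basis = []
    for col in columns:
        v = centre(col)
        for q in basis:
            proj = dot(v, q)
            v = [vi - proj * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > tol:  # skip predictors that add no new direction
            basis.append([vi / norm for vi in v])
    return basis


def proportion_explained(responses, predictors):
    """Fraction of total response variance captured by the predictors."""
    basis = orthonormal_basis(predictors)
    explained = total = 0.0
    for col in responses:
        y = centre(col)
        total += dot(y, y)
        explained += sum(dot(y, q) ** 2 for q in basis)
    return explained / total


def random_columns(n_cols, n_rows, rng):
    return [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_cols)]


rng = random.Random(1)
n_samples, n_taxa = 14, 20  # 14 samples as in the post; 20 taxa is arbitrary
for n_predictors in (1, 7, 13):
    fractions = [
        proportion_explained(random_columns(n_taxa, n_samples, rng),
                             random_columns(n_predictors, n_samples, rng))
        for _ in range(200)
    ]
    print(n_predictors, round(sum(fractions) / len(fractions), 2))
```

For independent noise the expected proportion explained is the number of predictors divided by n - 1, so seven random predictors on fourteen samples would be expected to "explain" about 7/13 of the variance, and thirteen would explain all of it.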


Lang et al Fig. 5. Canonical correspondence analysis (CCA) for the upper sections (1932–2005) of the Speke Hall record. Anglesey is the July temperature data.

Note that in the ordination the temperature arrow is inversely correlated with most of the pollution indicators.

Reconstruction diagnostics

Lang et al include some reconstruction diagnostics, a plot of residual squared distances and a timetrack plot. Unfortunately, they conflate their residual squared distances (goodness-of-fit) with analogue quality making it difficult to be sure of what they have done. It is possible to have fossil samples that have excellent analogues (short squared chord distance) in the calibration set but a poor goodness-of-fit, and vice versa. What I would like to have seen is a plot of the fossil abundance against calibration set abundance.
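For concreteness, here is a minimal sketch of the analogue-quality diagnostic (squared chord distance to the nearest calibration-set sample) and of the abundance comparison, reduced to a table of taxa whose maximum fossil abundance exceeds their maximum in the calibration set. The goodness-of-fit diagnostic needs an ordination and is not shown. The taxa and proportions are invented, and the function names are mine.

```python
import math


def squared_chord_distance(a, b):
    """Squared chord distance between two assemblages given as proportions."""
    return sum((math.sqrt(x) - math.sqrt(y)) ** 2 for x, y in zip(a, b))


def nearest_analogue_distance(fossil, calibration_set):
    """Distance from one fossil sample to its closest modern analogue."""
    return min(squared_chord_distance(fossil, modern) for modern in calibration_set)


def taxa_exceeding_calibration_maximum(fossil_samples, calibration_set, taxa):
    """List taxa that are more abundant in the fossil data than anywhere in the calibration set."""
    flagged = []
    for i, taxon in enumerate(taxa):
        fossil_max = max(sample[i] for sample in fossil_samples)
        modern_max = max(sample[i] for sample in calibration_set)
        if fossil_max > modern_max:
            flagged.append((taxon, fossil_max, modern_max))
    return flagged


# Invented proportions for three hypothetical taxa
taxa = ["taxon_a", "taxon_b", "taxon_c"]
calibration_set = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]]
fossil_samples = [[0.2, 0.5, 0.3], [0.9, 0.1, 0.0]]

for fossil in fossil_samples:
    print(nearest_analogue_distance(fossil, calibration_set))
print(taxa_exceeding_calibration_maximum(fossil_samples, calibration_set, taxa))
```

In practice the nearest-analogue distances are usually judged against the distribution of distances among the calibration-set samples themselves; the cut-off used is a matter of convention.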

Interpreting the correlation

There is a strong trend in the instrumental temperature data (r² = 0.5) and the assemblage composition is autocorrelated. It would therefore seem prudent to correct the p-value of the correlation between the reconstruction and the instrumental record for autocorrelation. Of course, with only 16 fossil data points covered by the Anglesey record, this will be difficult, but the corrected p-value is bound to be higher.
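One way to see how much autocorrelation could matter is a Monte Carlo null: simulate pairs of independent autocorrelated series of length 16 and count how often their correlation is at least as strong as the reported r = 0.62. The sketch assumes equally spaced observations, an AR(1) process with the same coefficient in both series, and illustrative values of `phi` that are not estimated from the Speke Hall or Anglesey data.

```python
import random


def ar1_series(n, phi, rng):
    """Simulate n values from a stationary AR(1) process with coefficient phi."""
    x = rng.gauss(0, 1) / (1 - phi ** 2) ** 0.5  # draw from the stationary distribution
    series = []
    for _ in range(n):
        series.append(x)
        x = phi * x + rng.gauss(0, 1)
    return series


def pearson_r(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def null_exceedance(r_observed, n, phi, n_sim=5000, seed=42):
    """Fraction of independent AR(1) pairs with |r| >= |r_observed| (two-sided)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        r = pearson_r(ar1_series(n, phi, rng), ar1_series(n, phi, rng))
        if abs(r) >= abs(r_observed):
            hits += 1
    return hits / n_sim


for phi in (0.0, 0.5, 0.8):  # hypothetical autocorrelation strengths
    print(phi, null_exceedance(0.62, 16, phi))
```

With `phi = 0` the simulated fraction should sit close to the conventional p-value of about 0.01; as `phi` increases, correlations this strong between unrelated series become much more common.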

The apparent inverse correlation between the temperature and pollution indicators could also help to inflate the correlation between the reconstruction and the instrumental record.

The correlation with the longer CET series is only 0.25. No explanation for this much weaker (and non-significant) correlation is given.

Final questions for the authors

Had the correlation between the reconstruction and the instrumental record not appeared significant, would you have published (and would the editors/reviewers have let you publish) a paper which could be summarised as ‘unpromising ponds cannot be used for high-resolution climate reconstructions’? I wonder if there are any failed high-resolution reconstructions decorating the interior of filing cabinets.

As I have started asking in all my reviews: where are the data going to be archived?

About richard telford

Ecologist with interests in quantitative methods and palaeoenvironments
This entry was posted in Age-depth modelling, Peer reviewed literature, transfer function.

6 Responses to The ‘New York’ principle of site selection

  1. Eli Rabett says:

    Have they compared their result against the CET which reliably goes back to 1780 and for this purpose maybe 1700?

  2. I did wonder whether you might comment on this paper. When I saw it in my research alerts I had a WTF moment and was half tempted to comment on the issues you raise here.

    JoPL has become a cesspool of poor statistical practice; it beggars belief that, in light of Steve’s Sick Science paper, any study like this gets published in this state.

    You’d hope that other palaeolimnologists would just ignore this rubbish, but no doubt we’ll see this cited in support of further gibberish in the literature regarding transfer functions.

  3. Andrew Scott says:

    Ah I made the blog! I was wondering when one of my papers would. That being said, some of the comments are a bit skewed.

    I actually did look at the number of specific taxa found in the fossil dataset that were not in the calibration set used for the transfer function; that number would be 0. The calibration set is quite taxa rich, and this lake is not. There are some taxa that are found in higher abundances in the fossil data compared to the calibration set, especially at specific intervals (e.g., Chironomus plumosus), which was expected and supported periods of eutrophication in the core. The passive plot of the fossil data on top of the calibration data is in the supplementary of JoPL (or at least it should be). I sent the code of the goodness-of-fit analysis to Richard prior to using it to double check if I was interpreting it correctly, so what was done should be pretty transparent! Although reference to the supplementary figure may not have been made in an adequate spot of the paper.

    The CET correlation is much less than from the met-station near this lake, but that is expected: we are talking about comparing temperatures at a specific site in one location of north-west England to that of a mean Central England Temperature average (which in itself only has a 0.75 correlation to that of the local met-station). Likewise, as Richard has correctly pointed out, dating uncertainty increases with age/depth. For this particular lake, that is especially true beyond the ~24cm mark. The larger amount of variation and uncertainty further back in time is not surprising. I for one was fairly amazed there was any correlation let alone similar trends, which I thought I was fairly transparent about in the discussion.

    The point of this paper was not to suggest doing a temperature reconstruction on eutrophic lakes, or whether that itself was wise at all (it’s really not), but to show that chironomids are responsive to trends in temperature, even in an ecosystem that is depressed by multiple other factors. Would I trust these estimates of temperature? No; again, not the point. The thought process behind why temperature was reconstructed at all on this lake? Well, I remember a comment by a reviewer on one of my other works along the lines of ‘well next thing you know they will be reconstructing temperature on eutrophic lakes’. My first thought was: why not? If temperature is a driver of the assemblage, then surely you should see some temperature-indicators shift, and we did. This contribution was made to directly address comments that imply there is no relationship between temperature and chironomids, which defies all science and logic behind insect ecology, but are becoming mainstream review comments in chironomid-based studies.

    I did not reference Steve Juggins’ paper for a number of reasons, the primary one being that I have no intention of defending whatever temperature estimates come from this reconstruction; it is not the point of this paper. Transfer functions have their place (regardless of the ‘cesspool science’ comments), and we should be interpreting those trends found in the biological data, as well as the numbers that are spit out of our tests. That is exactly what we are highlighting here: the trends in this lake cannot be simply explained by pollution factors, and the transfer function allows us to see that and interpret it. This is a point I have made clearly in several recent papers that examine reconstructions that work, and those that do not make sense. I find the ones that do not make sense more interesting; such is science. My next project is to actually dissect the specific indicator taxa within the datasets (which taxa ARE responding? which are not? why?). It involves re-identifying literally everything in all of the datasets I can get my hands on (to ensure consistency in specific taxa across multiple regions and datasets)… hopefully I don’t collapse or go blind before I finish.


    • I had not realised that you had sent me the code for the goodness of fit analysis – I didn’t see the attachment before. Sorry about that. You appear to be correctly calculating the residual distance, but are then conflating it with analogue quality (from your emails I was sure you were working on the latter). Analogue quality (distance to nearest neighbour) and goodness of fit (residual distance in an ordination) are two distinct reconstruction diagnostics. Both should be checked as it is possible to have fossil observations that have good analogues but a poor goodness of fit and vice versa.

      The time track plot is useful, but it can only show if the fossil data are anomalous on the first two dimensions. If they are really weird on axis 13, we would never know from this plot.

      The ordination with seven predictor variables for fourteen observations is a monstrous overfit (I am sure you will be able to find equally bad examples in the literature). This is not an esoteric problem, so the fact that the editors and reviewers (presumably) did not stop to think about it does not reflect well on the journal.

      By my calculation the correlation of CET with Valley is 0.87 and the correlation with Bidston (the closest weather station to Speke Hall) is 0.96. This correlation is certainly not the problem.

      Steve’s work is absolutely critical, especially when the environmental variable being reconstructed has limited variance (~1°C range) and other variables (e.g. eutrophication) are ecologically important. Steve showed that these secondary gradients can drive apparent trends in the reconstruction. Sometimes these trends will, by chance, correlate well with instrumental temperature. I don’t know how you determined that pollution did not drive the trends in Speke Hall Lake.

      I don’t doubt that chironomids are sensitive to temperature: that is clear from the many calibration sets. What I am interested in is whether they are sensitive enough to relatively small (roughly equal to the RMSEP) short-term temperature variability to be useful proxies on subdecadal scales in the late Holocene. This I believe to be highly unlikely for a multitude of ecological, taphonomic and chronological reasons; my review is almost ready for submission. Speke Hall Lake provides no convincing evidence to the contrary.

      Can you confirm that the inversion in the age-depth model is simply a plotting problem and does not affect the remainder of the analyses?

  4. Andrew Scott says:

    I did not create Figure 1 of the paper, although I was fairly confident in it as the data came from Peter Appleby himself. I will have to take a look at recreating the figure from the data myself to understand how the plotting problem may have been generated. I re-checked all of the actual analysis myself before submission, but admit I did not recreate that particular figure. The ordination was something I had asked to remove, but it was re-inserted in the paper; I believe this was part of the review process… There is difficulty in trying to match a paleo record to meteorological data: assumptions made about dating, matching intervals of varying sedimentation, etc. We approached those as best we could, and were fairly clear about the challenges and assumptions in the paper. The Anglesey record was used in the paper even though I had access to 4 other meteorological stations because the other stations did not have records that went back as far. The manuscript became too chaotic from my perspective when Barb Lang was using records from four. The other stations I have data from are “Ness, Bradford, and Ringway” though. I am not sure about Bidston.

    Some of the language in the paper was moved around a bit in review, ultimately it became a factor of just accepting some of the editorial comments. I can see your point about analogue quality and goodness of fit, although I do think that the timetrack is useful in this regard.

    I have a hard time believing that the trends seen here can be a function of pollution factors alone, it is not logical to assume a ‘coincidence’ between these trends. There are very clear eutrophication trends in the core, but those are for select intervals which we identify. The assemblage is still shifting beyond those, to suggest a response to pollution alone would be very dubious. So what is left? well, that is kind of the point we were getting at.. the transfer function clarifies that link.

    In terms of Steve’s work pointing out secondary gradients, I agree. I wrote an entire paper on it. I also made the very clear point that the Speke Hall Lake record is indicating a temperature response even under a very clear (and obvious) pollution-related response. To me, it is an obvious point. Unfortunately it has not been obvious to others that chironomids respond to temperature. I have far better records I have worked on (am working on) and would intend to work on in the future if my goal was to reconstruct late-Holocene temperature accurately!
