Diatoms, running correlations and solar variability

Almost two years ago, I wrote a post about running correlations and their problems. It is still a well read post. I wish it was better read.

It is not that running correlations cannot be a useful tool. If a good correlation has been found between two variables, it can be useful to test how consistent this correlation is over time. But if the correlation between two variables is weak and non-significant, then running correlations risk being a data-dredging technique.

Case in point: Jiang et al (2015) who reconstruct Holocene sea-surface temperatures (SST) just north of Iceland using diatoms and relate the variability in SST to cosmogenic isotopes (an indicator of solar variability) using running correlation.

The abstract starts

Mounting evidence from proxy records suggests that variations in solar activity have played a significant role in triggering past climate changes.

As readers of my critical evaluations of papers reporting solar-palaeoecology links will know, much of this “mounting evidence” is weak. How robust is Jiang et al?

Let’s start with a detour into the transfer function that Jiang et al use to reconstruct SST from their fossil diatom assemblage. For reasons I don’t understand, they cite me (Telford and Birks 2009) when reporting that they test six transfer function methods. We didn’t, and testing so many methods risks a model selection bias (Telford et al 2004). In a previous paper, Jiang et al (2005) cite Juggins and ter Braak (1992) for an identical phrase about six methods.

Jiang et al (2015) settle on a four component weighted average-partial least squares model. They claim this is a parsimonious choice having used a five component model in Jiang et al (2005). I suspect using so many components (rare to need more than two) means that they have a spatial autocorrelation problem although the modern analogue technique (which normally does well in such cases) performs surprisingly badly relative to the other methods. It would have been good if they had tested if there was a spatial autocorrelation problem – code is available.

The choice of a four component WAPLS model won’t bias the results, but it might make the model performance appear better than it really is and make the reconstruction more variable. As it is, almost all the high frequency variability in the SST reconstruction is less than ±1°C, about the same as the root mean square error of prediction, so potentially a lot of this variability is just noise.

The chronology is based on tephra layers, circumventing any problems with a variable radiocarbon reservoir effect, and the sedimentation rate is fairly linear over most of the Holocene. The chronology is as good as chronologies get for marine cores, but still the chronological uncertainty on the pre-settlement tephras is about 100 years, enough to matter for a high resolution correlation.

What about the relationship between the reconstruction and solar activity? Jiang et al start by showing that the long term trends match the orbitally driven decline in summer insolation, as do many proxy records of summer temperature in the North Atlantic region. Next they compare the reconstruction with cosmogenic isotopes, detrending both records with a 6th order polynomial and then using a 50 year lowpass filter to remove high frequency variability. (There must also be an undocumented interpolation step to even temporal spacing.)
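The preprocessing described above can be sketched roughly as follows. This is a minimal Python sketch, not Jiang et al’s code: the linear interpolation and the running-mean filter are assumptions, since neither step is documented in the paper, and `preprocess`, `step` and `cutoff_years` are hypothetical names. The data are synthetic.

```python
import numpy as np
from numpy.polynomial import Polynomial


def preprocess(age, value, step=10.0, poly_order=6, cutoff_years=50.0):
    """Interpolate to even spacing, remove a polynomial trend, then smooth."""
    age = np.asarray(age, dtype=float)
    value = np.asarray(value, dtype=float)
    order = np.argsort(age)
    age, value = age[order], value[order]

    # Assumption: linear interpolation onto an evenly spaced age grid.
    grid = np.arange(age[0], age[-1] + 1e-9, step)
    evenly = np.interp(grid, age, value)

    # Remove the long-term trend with a least-squares polynomial fit.
    trend = Polynomial.fit(grid, evenly, poly_order)(grid)
    detrended = evenly - trend

    # Assumption: a running mean stands in for the undocumented low-pass filter.
    # mode="same" pads with zeros, so values near the ends are damped.
    width = max(1, int(round(cutoff_years / step)))
    kernel = np.ones(width) / width
    smoothed = np.convolve(detrended, kernel, mode="same")
    return grid, smoothed


# Synthetic, unevenly sampled record with a slow trend plus noise.
rng = np.random.default_rng(1)
age = np.sort(rng.uniform(0, 9500, 400))
sst = 8 - 2e-4 * age + rng.normal(0, 0.5, age.size)
grid, sst_smooth = preprocess(age, sst)
```

Each of these choices (interpolation method, grid spacing, filter type) changes the autocorrelation of the series, which is why the undocumented details matter when trying to reproduce the result.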

Jiang et al Figure 4. Comparison of the 14C production rate (Muscheler et al., 2005; Reimer et al., 2009) and the reconstructed summer sea-surface temperature (SST) data from core MD99–2275 (50 yr averages). Both records were detrended by removing a 6th order polynomial fitted to the data and low-pass filtered to remove high-frequency variations on time scales shorter than 50 yr.

For the period 9500-4500 BP, there is no obvious correlation between SST and the solar proxy. For the period 4500-0 BP at least some of the wiggles align, as would be expected for smoothed data. Jiang et al don’t report or test the overall correlations between the solar activity proxy and the SST reconstruction; instead they proceed directly to a running correlation. Jiang et al base the significance level of their running correlation on a Monte Carlo test using surrogate time series with the same temporal autocorrelation as the SST reconstruction (they use Ebisuzaki’s (1997) phase randomisation method). This is good: often correlations are either not quantified or autocorrelation is ignored (eg Jiang et al 2005). However, Jiang et al (2015) do not take account of the multiple testing inherent in a running correlation.
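A rough illustration of why multiple testing matters. Assume (wrongly, but conveniently) that the 9500-year record behaves like about m = 5 independent 2000-year windows, each tested at α = 0.05; the value of m is my assumption, not something from the paper. The chance of at least one spuriously significant window is then

$$
P(\text{at least one false positive}) = 1 - (1 - \alpha)^{m} = 1 - 0.95^{5} \approx 0.23
$$

Overlapping windows and autocorrelation change the exact figure, so this number is illustrative only, but it shows how far the real error rate can drift from the nominal 5%.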

Jiang et al Figure 5. Comparison of proxy records of solar forcing and reconstructed summer sea-surface temperatures (SSTs) from core MD99–2275. A: Direct comparison of band-pass filtered (1/1800 yr to 1/500 yr) and linearly detrended 14C and SST data. B: The same comparison between summer SST and 10Be fluxes to Summit, Greenland. C: Running correlation coefficient between 14C production rate and SST reconstruction shown in Figure 4 (2000-yr long windows moved in steps of 100 yr). D: Result of a significance analysis indicating highly significant negative correlations for the past ~4000 yr. The analysis included a random phase test that takes into account the autocorrelations present in the time series (Ebisuzaki, 1997).

How serious a problem is multiple testing for Jiang et al? I’ve repeated their analysis as well as I can (they leave several details undocumented: how was SST interpolated, given that sampling resolution varies from 2 to >50 years, and what filter did they use for the low pass?). For each of 1000 phase-randomised surrogates of the detrended SST, I find the maximum absolute correlation in a running correlation with a window width of 2000 years and a step size of 100 years. The 95th percentile of this null distribution is 0.44, almost exactly the same as the maximum absolute correlation of Jiang et al’s running correlation. Rather than suggesting a strong link between solar activity and SST over the last 4000 years, Jiang et al’s result is on the cusp of statistical significance at the p=0.05 level. Not the worst result possible, but it makes their story less persuasive. My choice of methodological details may have affected the significance threshold somewhat.
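The idea behind this test can be sketched in a few lines of Python. This is not the code used for the analysis above: it is a minimal illustration that uses synthetic AR(1) series in place of the SST reconstruction and the solar proxy, and all function names (`phase_randomise`, `max_abs_running_cor`, `max_stat_threshold`, `ar1`) are hypothetical. The point is that the null distribution is built from the largest |r| across all windows, not from a single window.

```python
import numpy as np


def phase_randomise(x, rng):
    """Surrogate with the same power spectrum as x but random Fourier phases."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x - x.mean())
    phases = rng.uniform(0, 2 * np.pi, spectrum.size)
    phases[0] = 0.0          # zero-frequency term stays real
    if n % 2 == 0:
        phases[-1] = 0.0     # Nyquist term must stay real for even n
    surrogate = np.fft.irfft(np.abs(spectrum) * np.exp(1j * phases), n=n)
    return surrogate + x.mean()


def max_abs_running_cor(x, y, window, step):
    """Largest absolute correlation over all windows of a running correlation."""
    starts = range(0, x.size - window + 1, step)
    return max(abs(np.corrcoef(x[s:s + window], y[s:s + window])[0, 1])
               for s in starts)


def max_stat_threshold(x, y, window, step, n_surrogates=1000, alpha=0.05, seed=0):
    """(1 - alpha) quantile of max |r| when x is replaced by surrogates."""
    rng = np.random.default_rng(seed)
    null = [max_abs_running_cor(phase_randomise(x, rng), y, window, step)
            for _ in range(n_surrogates)]
    return np.quantile(null, 1 - alpha)


def ar1(n, phi, rng):
    """Synthetic AR(1) series, standing in for real proxy data."""
    out = np.zeros(n)
    noise = rng.normal(size=n)
    for i in range(1, n):
        out[i] = phi * out[i - 1] + noise[i]
    return out


# Two unrelated series at 50-year spacing: 190 points is about 9500 years,
# a 40-point window is 2000 years and a 2-point step is 100 years.
rng = np.random.default_rng(42)
sst = ar1(190, 0.7, rng)
solar = ar1(190, 0.7, rng)
observed = max_abs_running_cor(sst, solar, window=40, step=2)
threshold = max_stat_threshold(sst, solar, window=40, step=2, n_surrogates=200)
print(f"observed max |r| = {observed:.2f}, 95% threshold = {threshold:.2f}")
```

Only an observed maximum above this threshold would count as significant once all the windows have been taken into account. In practice the surrogates should go through the same detrending and filtering as the real data, and more than 200 surrogates would be used.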

Jiang et al also run a spectral analysis that finds several peaks that are close to some of the solar cycle frequencies, but not others.

Of course Jiang et al have an explanation of why their reconstruction is only sensitive to solar variability some of the time (more sensitive in cool climates). However plausible these explanations are, without supporting evidence we have to ask whether a more parsimonious explanation is that the on-off correlation between solar activity and the SST reconstruction is due to chance.

(hat tip to Kaustubh Thirumalai @holy_kau)

About richard telford

Ecologist with interests in quantitative methods and palaeoenvironments
This entry was posted in Peer reviewed literature, solar variability, transfer function.

2 Responses to Diatoms, running correlations and solar variability

  1. I still don’t know why their RMSEP is so low. Jiang et al (2002) have an almost identical calibration set (I understand Jiang et al 2005 & 2015 have one extra observation) and use a WAPLS 2 model with an RMSEP of 1.26°C. This is competitive with the 4-analogue MAT model reported by Jiang et al (2015) – in such a small calibration set, I would not expect MAT to outperform all other methods in the way it does in the 1000-observation foram or dinocyst transfer functions.

    Jiang et al have not simply reported non-crossvalidated performance estimates by mistake (this has been done by other palaeoecologists) as Jiang et al (2002) shows that these are much lower.

  2. Kaustubh says:

    Excellent post! Thanks…
