The Sun on the Nile: how many degrees of freedom?

In 1978, A. B. Pittock wrote a critical review of long-term Sun-weather relationships, complaining of the low quality of papers reporting solar effects on weather. One of the paper’s recommendations is that authors should

3. Critically examine the statistical significance of the result, making proper allowance for spatial coherence, autocorrelations and smoothing, and data selection

Statistical analysis of climate-sun relationships has, of course, improved greatly since 1978 and the statistical significance of the results will be critically examined. Unfortunately, not always by the authors, reviewers or editors. Today it is your turn.

Hennekam et al (2014) investigate the Holocene palaeoceanography of the Eastern Mediterranean and seek to explain the variability they find with solar forcing. Yes, this is another addition to my critical review of palaeoclimate evidence of solar-climate relationships.

The paper focuses on a high-resolution δ18O record from the planktonic foraminifera Globigerinoides ruber and the Δ14C record of solar variability from Stuiver et al (1998). It uses a running correlation and finds some strong and apparently significant correlations between solar activity and the proxy data.

Figure 1. Comparison of detrended and filtered (0.256–3.333 kyr) time series. Top to bottom: Solar activity Δ14Cres [Stuiver et al., 1998], PS009PC δ18Oruber (this study), Gulf of Guinea G. ruber Ba/Ca [Weldeab et al., 2007], and Oman δ18Ospeleothem [Fleitmann et al., 2003]. Results of a running correlation are indicated in the same colour (window width = 1005 year, shift increment = 5 year) of the proxy time series to Δ14Cres. The 99% confidence threshold is indicated by black horizontal dashed lines (note that these are sensitive to the resampling). The running correlation of the Gulf of Guinea G. ruber Ba/Ca has a reversed y axis; for this record a negative correlation indicates a high coherence between increased solar activity and increased monsoon activity. The periods of simultaneous higher Ba/Al, higher V/Al, and negative G. ruber oxygen isotope values, during sapropel S1 formation in core PS009PC, are marked I–V.

The time series shown in the figure are not the raw data: the plots and the running correlation are of two heavily smoothed time series. What could possibly go wrong? Have the authors followed Pittock’s (1978) advice and critically examined the statistical significance of the result, making proper allowance for spatial coherence, autocorrelations and smoothing, and data selection?

I’m going to ask two questions. 

  • How many degrees of freedom were assumed when calculating the p=0.01 significance threshold of the running correlation in figure 1?
  • How many degrees of freedom should have been allowed?

The methods in the paper are generally well described, but the procedure for estimating the significance threshold is not described, nor is it obvious. The cryptic comment, “note that [the significance thresholds] are sensitive to the resampling”, is not explained.

Fortunately, we can work out what has been done. The significance threshold is at r ≈ 0.2. Plugging numbers into a Pearson’s correlation significance calculator shows that, for a two-sided test with 201 observations (df = n − 2 = 199), the p = 0.01 threshold is r = 0.18. 201? The Δ14C data have 5 year resolution during the Holocene, so there are 201 observations in the 1005 year window used in the running correlation.
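For anyone without a significance calculator to hand, the threshold can be checked with a few lines of Python. This is only a sketch under the usual textbook assumptions (independent, bivariate-normal observations with no true correlation); the function names `two_sided_p` and `critical_r` are my own, and this is not necessarily how the paper's threshold was computed.

```python
import math


def two_sided_p(r0, n, steps=2000):
    """P(|r| > r0) for n independent pairs with no true correlation.

    Under the null, Pearson's r has density proportional to
    (1 - r^2)^((n - 4) / 2) on [-1, 1]; integrate it with Simpson's rule.
    """
    if r0 <= 0.0:
        return 1.0
    # Normalising constant: the beta function B(1/2, (n - 2) / 2).
    log_beta = (math.lgamma(0.5) + math.lgamma((n - 2) / 2)
                - math.lgamma((n - 1) / 2))
    norm = math.exp(log_beta)

    def density(r):
        return (1.0 - r * r) ** ((n - 4) / 2) / norm

    h = r0 / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(r0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    inside = 2.0 * total * h / 3.0  # P(|r| <= r0), using symmetry about zero
    return 1.0 - inside


def critical_r(n, alpha=0.01):
    """Two-sided critical |r| at level alpha, found by bisection."""
    if n < 4:
        raise ValueError("this sketch needs at least 4 observations")
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if two_sided_p(mid, n) > alpha:
            lo = mid  # not yet significant: threshold lies higher
        else:
            hi = mid
    return (lo + hi) / 2.0


# 201 observations in a 1005 year window at 5 year resolution.
print(f"n = 201: critical r = {critical_r(201):.2f}")
```

The same function can be called with any other number of observations (four or more) to see how quickly the threshold rises as the sample shrinks.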

Is this the correct number of degrees of freedom for the running correlation? It might be if the resolution of the foram δ18O was 5 years. It isn’t. The forams are sampled every centimetre, which given the sedimentation rate of this core represents ~46 years. About 22 such samples can fit into a 1005 year window. So rather than 201-2 = 199 degrees of freedom, we have 22-2 = 20. With this many degrees of freedom, the p = 0.01 significance threshold is just above r = 0.5. No problem. The running correlation between foram δ18O and Δ14C exceeds this new threshold.

The estimate of 20 degrees of freedom assumes that the observations are independent. If the observations are not independent (that is, if the time series is autocorrelated), then the effective number of observations will be smaller and the significance threshold higher. The Δ14C record is strongly autocorrelated. I’m not sure about the foram δ18O record, but it doesn’t really matter: both time series are low-pass filtered to remove variability at periods shorter than 256 years. The filtered time series are very strongly autocorrelated; there are very few effective observations. I’m not sure how few. My guess is four per 1005 year window (i.e. 1005/256), but it might be a little more. Let’s be generous and assume there are eight effective observations. The p = 0.01 significance threshold is now over r = 0.8, and little if any of the running correlation exceeds this new threshold. If my guess of four effective observations is correct, the significance threshold is r = 0.99!
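A rough way to see what smoothing does to the threshold is to simulate it. The sketch below correlates pairs of independent white-noise series over a 201-point window, with and without smoothing, and reads off the empirical 99th percentile of |r|. The 51-point moving average (255 years at 5 year steps) is only my stand-in for the paper's filter, white noise is an assumed null model, and the function names are hypothetical.

```python
import random


def moving_average(x, width):
    """Simple running mean; the output has len(x) - width + 1 points."""
    running = sum(x[:width])
    out = [running / width]
    for i in range(width, len(x)):
        running += x[i] - x[i - width]
        out.append(running / width)
    return out


def pearson_r(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def null_threshold(smooth_width, window=201, trials=2000, alpha=0.01, seed=1):
    """Empirical (1 - alpha) quantile of |r| between unrelated series."""
    rng = random.Random(seed)
    n_raw = window + smooth_width - 1  # so the smoothed series fills the window
    abs_r = []
    for _ in range(trials):
        x = moving_average([rng.gauss(0, 1) for _ in range(n_raw)], smooth_width)
        y = moving_average([rng.gauss(0, 1) for _ in range(n_raw)], smooth_width)
        abs_r.append(abs(pearson_r(x, y)))
    abs_r.sort()
    return abs_r[int((1 - alpha) * trials)]


raw_thr = null_threshold(smooth_width=1)      # no smoothing
smooth_thr = null_threshold(smooth_width=51)  # ~256 year smoothing (assumed)
print(f"unsmoothed 99% threshold: {raw_thr:.2f}")
print(f"smoothed   99% threshold: {smooth_thr:.2f}")
```

The unsmoothed threshold should land near the textbook value for 201 independent observations, while the smoothed one should be far higher, even though the two series are unrelated by construction.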

So rather than having fantastically strong correlations between solar variability and the proxy, we have little or no evidence of any relationship. And we still have not discussed the problem of multiple testing in running correlations which will widen the significance thresholds further. How many degrees of freedom will be left?
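The multiple-testing problem can also be illustrated by simulation. The sketch below takes two unrelated, unsmoothed white-noise series of 2000 points (roughly 10,000 years at 5 year steps), slides a 201-point window along them, and counts how often at least one window crosses the per-window p = 0.01 threshold of r = 0.18. The series length, the 40-point shift (chosen for speed, not the paper's 5 year increment) and the white-noise null are all my assumptions.

```python
import random


def pearson_r(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def familywise_rate(threshold=0.18, length=2000, window=201, step=40,
                    trials=300, seed=1):
    """Fraction of null simulations in which any window exceeds threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(length)]
        y = [rng.gauss(0, 1) for _ in range(length)]
        starts = range(0, length - window + 1, step)
        if any(abs(pearson_r(x[s:s + window], y[s:s + window])) > threshold
               for s in starts):
            hits += 1
    return hits / trials


rate = familywise_rate()
print(f"chance of at least one 'significant' window: {rate:.2f}")
```

If the threshold really controlled the error rate of the whole running correlation, this fraction would be about 0.01. It should come out well above that, and this is before any smoothing is applied.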

Somehow, I don’t think that Pittock’s recommendations were followed.

 

I find it rather sad that the authors feel they need to combine their high-quality palaeoclimate data with low-quality statistical analysis to generate a publishable story. It is a Van Gogh in a tawdry frame, sold on the value of the frame.

About richard telford

Ecologist with interests in quantitative methods and palaeoenvironments