Numerical methods are methods

The methods section of a paper should describe the methods in enough detail that the results can be replicated and that potential problems can be spotted by reviewers and readers. Most papers do this, but some describe the methods used to collect the data carefully yet omit the numerical methods used to analyse them. This is not good practice.

For example, Chu et al. (2014) have a biomarker record from a maar lake in northeastern China and use wavelet analysis to identify solar cycles in the data. The methods section describes the coring and the geochemical analyses, but not the wavelet analysis.

Wavelet analysis of the δ13C27–31 time series during the past 9.0 ka BP. High/low power is indicated by red/blue colours. The black line shows the 95% confidence level and the dark shaded area indicates the cone-of-influence.

Spectral methods, such as wavelet analysis, are prone to biases and artefacts. I’ve been exploring some of these problems this summer: these methods are tricksy, so they need to be used with caution and described in detail.

There are several methodological details that Chu et al. should have reported. I’m most interested in three:

  • How they decided which proxy record to subject to wavelet analysis
  • How they coped with their unevenly spaced proxy data
  • What the null hypothesis was

It is impossible to evaluate the report of significant solar frequencies without knowing how the analysis was done. The wavelet analysis is not an incidental part of the paper: the results of this analysis are reflected in the second word of the title, “Holocene cyclic climatic variations and the role of the Pacific Ocean as recorded in varved sediments from northeastern China”. The reviewers failed; they should have returned the paper for revision.

We can make some inferences about what Chu et al. did to their data and how this is likely to affect their results.

Fig. 6. Comparison of lipid biomarker proxies. The wavelet analysis is based on curve (E), the weighted δ13C27–31 values of the n-C27–31 alkanes.


I can find no indication in Chu et al. as to why the δ13C27–31 record is analysed rather than any of the other proxies. Whenever there are multiple proxies, it is tempting to analyse all of them and report only the most interesting results. Of course, such a strategy greatly inflates the risk of a Type I error: reporting significant periodicities where none exist.
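
As a back-of-the-envelope illustration (the count of five proxies is my assumption, based on Fig. 6 running to panel E), the chance of at least one spurious “significant” result if five independent records were each tested at the 5% level is

 1 - (1 - 0.05)^5   # about 0.23, rather than 0.05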

The 9000-year compound-specific stable-isotope record is unevenly sampled, with higher resolution in the last 2000 years. The mean sample spacing is probably about 50 years, but the wavelet plot includes periods below 16 years. The shortest period on the plot is about 10 years, so the data must have been interpolated to at least a 5-year spacing. My guess is that the data have been interpolated to an annual spacing. The effects of interpolating unevenly spaced data on the power spectrum are well known: over-sampling the interpolated data, as apparently done here, greatly increases the autocorrelation in the data. The most obvious effect is that there is very little wavelet power at high frequencies, shown by the dark blue in the wavelet plot.

It is easy to demonstrate the effect of interpolation with a simulation. Here I start with white noise sampled every ten time units and linearly interpolate it to unit spacing. The white noise has an AR1 coefficient near zero; the interpolated white noise has an AR1 coefficient in excess of 0.97. The effect on the spectrum will be dramatic.

 xt <- seq(10, 1000, 10)                      # original sample times, spacing 10
 res <- replicate(100, {
   x  <- rnorm(100)                           # white noise
   x2 <- approx(xt, x, xout = 10:1000)$y      # linear interpolation to unit spacing
   c(ar(x,  order.max = 1, aic = FALSE)$ar,   # AR1 coefficient of the white noise
     ar(x2, order.max = 1, aic = FALSE)$ar)   # and of the interpolated series
 })

 quantile(res[1,])#white noise
#         0%         25%         50%         75%        100%
#-0.27134488 -0.07788977 -0.02006099  0.05662604  0.20494047
 quantile(res[2,])#interpolated white noise
#       0%       25%       50%       75%      100%
#0.9772944 0.9813582 0.9829961 0.9853674 0.9887854
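
To see what this does to the spectrum, here is a quick follow-up to the simulation (my illustration, not a figure from the paper): the interpolated series is strongly reddened and has very little power above the Nyquist frequency of the original 10-unit spacing.

 xt <- seq(10, 1000, 10)
 x  <- rnorm(100)                             # white noise at the original spacing
 x2 <- approx(xt, x, xout = 10:1000)$y        # interpolated to unit spacing

 par(mfrow = c(1, 2))
 spectrum(ts(x,  deltat = 10), taper = 0, main = "white noise")
 spectrum(ts(x2, deltat = 1),  taper = 0, main = "interpolated white noise")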


The standard null hypothesis is that the proxy data come from a white or a red noise (AR1) process. It looks like Chu et al. assumed a red noise background. This can cause problems if the data are not from an AR1 process (for example if the data are the sum of two different AR1 processes, or AR1 plus white noise), but I’m going to ignore this potential problem for now. The key question is how the red noise background was estimated. I can see two options.

The easy option is to fit an AR1 model to the interpolated data. Because the interpolation greatly increases the autocorrelation, the estimated AR1 coefficient will be badly biased and the estimated background spectrum wrong. It is easy to demonstrate that this procedure will greatly inflate the risk of a Type I error.
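
Here is a minimal sketch of that demonstration (my reconstruction of the “easy option”, not the authors’ documented method): fit an AR1 model to interpolated white noise and count how often the raw periodogram exceeds the pointwise 95% level of the fitted red-noise background. If the background were appropriate, roughly 5% of the ordinates would exceed it.

 set.seed(42)
 xt <- seq(10, 1000, 10)
 x  <- rnorm(length(xt))                          # white noise at 10-unit spacing
 x2 <- approx(xt, x, xout = 10:1000)$y            # interpolated to unit spacing

 phi   <- ar(x2, order.max = 1, aic = FALSE)$ar   # inflated AR1 coefficient
 pgram <- spec.pgram(x2, taper = 0, plot = FALSE) # raw periodogram

 # theoretical AR1 spectral density with the fitted coefficient, scaled to the
 # variance of the interpolated series (matching R's spectral density scaling)
 ar1 <- var(x2) * (1 - phi^2) /
   (1 - 2 * phi * cos(2 * pi * pgram$freq) + phi^2)

 crit <- ar1 * qchisq(0.95, df = 2) / 2           # pointwise 95% red-noise level
 mean(pgram$spec > crit)                          # ~0.05 only if the background is right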

The alternative would be to use a Monte Carlo procedure, generating many surrogate time series with the same autocorrelation as the observed record, interpolating them and using the mean of their wavelet power as the background. With this procedure, even if the interpolation has biased the wavelet plot, the significance levels will be approximately correct.
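
A sketch of that Monte Carlo approach (my illustration, not the authors’ code), using the ordinary periodogram rather than a wavelet to keep it short; the sample times, AR1 coefficient and interpolation grid in the example call are assumptions.

 mc_background <- function(t, t_out, phi, nsim = 1000, prob = 0.95) {
   sims <- replicate(nsim, {
     z  <- as.vector(arima.sim(list(ar = phi), n = length(t)))  # AR1 surrogate at the sample times
     z2 <- approx(t, z, xout = t_out)$y       # interpolated exactly as the data were
     spec.pgram(z2, taper = 0, plot = FALSE)$spec
   })
   apply(sims, 1, quantile, probs = prob)     # frequency-wise 95% level
 }

 # e.g. surrogates with AR1 coefficient 0.3, sampled every 10 units, resampled to unit spacing
 bkg <- mc_background(t = seq(10, 1000, 10), t_out = 10:1000, phi = 0.3)

Peaks in the periodogram of the observed, interpolated record would then be judged against this background at the same frequencies, so the interpolation bias is built into the null.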

We cannot tell how Chu et al. estimated the red noise null, but my guess is that they fitted an AR1 model to the interpolated data. If they did this, the results of their wavelet analysis cannot be trusted and their report of solar-driven variability would be unreliable.

With the numerical methods left undescribed, Chu et al. (2014) is yet another paper reporting solar variability in proxy data that cannot be trusted.


Chu et al. 2014. Holocene cyclic climatic variations and the role of the Pacific Ocean as recorded in varved sediments from northeastern China. Quaternary Science Reviews 102: 85–95.



3 Responses to Numerical methods are methods

  1. An update after emailing the authors.

    I still have no idea how the wavelet analysis was performed, but the significant periodicities reported are not, as the paper suggests, from the wavelet analysis:

    “Wavelet power spectrum of the δ13C27–31 data yields highly significant (>99% confidence level) quasi-periodicities of 205–212; and quasi-periodicities of 87–89, 113–114, 136–137, 146–147, 1020–1050, and 1750–2041 years with a confidence level >95%”.

    Instead the periodicities are from REDFIT, which is neither mentioned nor cited by Chu et al. Reviewers or readers could not possibly have known this. Some of the concerns raised above are therefore not relevant.

    REDFIT is only appropriate if the data are from an AR1 process; if they are not, the significance levels can be badly biased. There is a test in REDFIT of whether the data are consistent with an AR1 process, but I have never seen it used except in the original REDFIT paper.
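
    For anyone wanting to try this, here is a hedged sketch of how REDFIT can be run from R via the dplR package, which takes the unevenly spaced data directly, with no interpolation (the object names d13C and age are invented placeholders):

    library(dplR)                        # assumed: dplR's port of REDFIT

    # d13C: proxy values at the sample ages in 'age' (placeholder names);
    # nsim Monte Carlo simulations are used for the bias-corrected red-noise background
    rf <- redfit(x = d13C, t = age, tType = "age", nsim = 1000)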

  2. Kevin O'Neill says:

    “The mean sample spacing is probably about 50 years, but the wavelet plot includes periods below 16 years. The shortest period on the plot is about 10 years, so the data must have been interpolated to at least a 5-year spacing.”

    I’m not sure I understand the rationale behind the second sentence quoted here. It almost sounds like you’re implying some Nyquist limit, but that’s not a limitation of time series analysis.

    Tamino dealt with this in his posts “Sampling Rate” and “Sampling Rate, part 2”.

    • If we use a Fourier-type method directly on the unevenly spaced time series (for example the Lomb-Scargle transform), then sure, there is no Nyquist limit and there is information about frequencies above the nominal limit set by the mean sample spacing.

      But if we interpolate the time series to an even spacing first, we start to bias the spectrum, making it redder. The Nyquist limit is now set by the arbitrary resolution we have resampled at. If we over-sample the interpolated time series, the bias can become very large at high frequencies and will drown out most or all of the signal that might have been present (see the sketch after this reply).

      Chu et al. have certainly resampled at a finer spacing than the mean sample resolution. Even at the top of the core the mean resolution is 10–15 years, so there must be at least a 3-fold over-sampling. At the base of the core, where the mean resolution is about 70 years, there is at least a 14-fold over-sampling.
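
      To illustrate both points, here is a small sketch (the sampling times and the 14-unit cycle are invented, nothing from Chu et al.): the Lomb-Scargle periodogram of the uneven samples can still pick out a period shorter than the mean spacing, whereas interpolating to a fine grid first reddens the ordinary periodogram.

      library(lomb)                              # assumed: lomb::lsp()

      set.seed(1)
      t <- sort(runif(100, 0, 1000))             # uneven sampling, mean spacing ~10 units
      y <- sin(2 * pi * t / 14) + rnorm(100)     # 14-unit cycle, shorter than the mean spacing

      # Lomb-Scargle on the raw, uneven samples: the 14-unit peak stands out
      lsp(y, times = t, type = "period", ofac = 4)

      # interpolate to unit spacing and take an ordinary periodogram:
      # the spectrum is reddened; compare the two plots
      y2 <- approx(t, y, xout = ceiling(min(t)):floor(max(t)))$y
      spectrum(y2, taper = 0)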
