From the price of wheat (Herschel 1801) to childhood mortality (Skjærvø et al 2015), there seems to be no end to papers reporting spurious correlations with solar variability. As both these examples were published at solar maxima (±5.5 years), I conclude there is a correlation between the incidence of such publications and solar activity. Further evidence for this revolutionary hypothesis is provided by the publication of Wing et al (2015) near a solar maximum. Wing et al find “highly significant” correlations between the incidence of two types of arthritis and solar variability.
I don’t really need to do any more than link to XKCD to show that Wing et al is almost certainly spurious.
Wing et al analyse the incidence of giant cell arteritis (GCA) and rheumatoid arthritis (RA) in Olmsted County, Minnesota, over five decades. They correlate the 3-year smoothed incidence data with the F10.7 index (solar radiation at 10.7 cm wavelength) and the AL index (a proxy for the westward auroral electrojet), allowing for lags of up to 14 years.
The first problem is in interpreting the p-value. It measures how likely a correlation at least as large as the one observed would be under the null hypothesis of no correlation. It does not tell us how likely it is that the alternative hypothesis, that there is a relationship, is true. To know that, we need some idea of how plausible the alternative hypothesis was before looking at the data. As the XKCD cartoon above shows, if the hypothesis is unlikely to be true, it is more likely that a highly significant correlation is a fluke than a genuine finding.
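To make that reasoning concrete, here is a minimal sketch using Bayes' rule. The function name, the prior probabilities, the power of 0.8 and the significance threshold of 0.001 are all illustrative assumptions of mine, not values taken from Wing et al.

```python
def posterior_probability(prior, power, alpha):
    """Probability that a 'significant' result reflects a real relationship.

    prior: assumed probability that the hypothesis is true before the study
    power: assumed probability of a significant result if it is true
    alpha: probability of a significant result if it is false
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)


# Hypothetical priors, from a coin toss down to a very long shot.
for prior in (0.5, 0.1, 0.01, 0.001, 0.0001):
    post = posterior_probability(prior, power=0.8, alpha=0.001)
    print(f"prior = {prior:<7} posterior = {post:.3f}")
```

The lower the prior plausibility, the more of the “significant” results are flukes, even with a threshold far stricter than the usual 0.05.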
Since Wing et al is a single study without a strong theoretical expectation of a relationship between solar variability and arthritis, even a highly significant p-value is not strong evidence. Even if there were no other problems, this would be enough to be fairly certain that the correlations in Wing et al are spurious. And there are other problems.
Hypothesis tests are only fully valid if they are designed before the data are observed. According to the press release, it was the observation of a 10-year cycle in the incidence data that inspired the study. If data have a 10-year cycle, they are virtually certain to correlate with solar variability with a lag of 0-14 years. This is data-snooping and inflates the risk of finding a “significant” p-value when there is no relationship. A better strategy would be to use these data to help develop a hypothesis and then use independent data from another region to test this hypothesis.
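The claim that any 10-year cycle will correlate with solar variability at some lag can be checked with a toy example. This is only a sketch under strong assumptions of mine: both series are pure sinusoids, the solar cycle is exactly 11 years, the record is 50 years long, and the function names are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_lagged_correlation(phase_years, n_years=50, max_lag=14):
    """Largest |r| between a 10-year cycle and an 11-year cycle over lags 0..max_lag."""
    years = range(n_years + max_lag)
    solar = [math.sin(2 * math.pi * t / 11.0) for t in years]
    incidence = [math.sin(2 * math.pi * (t - phase_years) / 10.0) for t in years]
    target = incidence[max_lag:max_lag + n_years]
    best = 0.0
    for lag in range(max_lag + 1):
        # Incidence in year t is compared with the solar proxy in year t - lag.
        lagged_solar = solar[max_lag - lag:max_lag - lag + n_years]
        best = max(best, abs(pearson(lagged_solar, target)))
    return best


# Whatever the starting phase of the 10-year cycle, some lag gives a large |r|.
for phase in range(10):
    print(f"phase = {phase} years, best |r| = {best_lagged_correlation(phase):.2f}")
```

Because the lag is chosen after seeing which one fits best, a sizeable correlation turns up for every phase, with no causal link anywhere in the simulation.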
The p-value is valid if a single correlation is analysed. If multiple correlations are analysed, there are multiple chances of finding a significant p-value, just as buying several lottery tickets increases your chances of winning a prize. Wing et al test the correlation between the incidence of arthritis and two solar proxies at lags of 0-14 years. This does not get them 30 tickets to the p-value lottery because, for example, the solar proxies at lag 0 and lag 1 are highly correlated, but it does give them several chances to win. It is possible to correct for multiple testing, and the paper should at least have shown that the authors are aware of the problem. I wouldn’t dream of suggesting that the authors might have examined other solar variability proxies before settling on the two they report, as the westward auroral electrojet is such an obvious place to start.
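The lottery arithmetic is easy to sketch. The code below assumes the tests are independent, which, as noted, they are not here, so the true number of tickets lies somewhere between 1 and 30. The Bonferroni correction is one simple, conservative option that I have picked for illustration, and the function names are mine.

```python
def familywise_error_rate(alpha, n_independent_tests):
    """Chance of at least one false positive among independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_independent_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


# From one ticket up to the 30 that two proxies at 15 lags would buy if independent.
for n_tests in (1, 2, 5, 10, 30):
    fwer = familywise_error_rate(0.05, n_tests)
    threshold = bonferroni_threshold(0.05, n_tests)
    print(f"{n_tests:>2} tests: P(at least one fluke) = {fwer:.2f}, "
          f"Bonferroni threshold = {threshold:.4f}")
```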
The incidence of both types of arthritis is temporally autocorrelated: if one year has a high incidence, the next year is likely to have one too, and likewise for low incidence. The statistical test used by Wing et al assumes that the observations are independent, that is, that there is no autocorrelation. Violating this assumption makes the statistical test more liberal: it is more likely to report a significant result than is justified by the data. The autocorrelation inherent in the incidence data is enhanced by the 3-year smooth used, making the problem worse. Wing et al should have corrected for the autocorrelation in the (smoothed) data. There are several strategies that could be used, all of which would result in a less impressive p-value.
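A small simulation shows how much autocorrelation and smoothing can inflate the false-positive rate. It is a sketch under assumptions of mine, not a reanalysis of the paper: the two series are independent AR(1) processes with a made-up coefficient of 0.5, each is 50 years long, only the stand-in for the incidence data gets the 3-year smooth, and significance is judged with a Fisher z test at roughly the 5% level that treats every point as independent.

```python
import math
import random


def ar1_series(n, phi, rng):
    """AR(1) series x[t] = phi * x[t-1] + noise; phi = 0 gives white noise."""
    values, prev = [], 0.0
    for _ in range(n + 20):  # the first 20 values are burn-in and are discarded
        prev = phi * prev + rng.gauss(0.0, 1.0)
        values.append(prev)
    return values[20:]


def smooth3(x):
    """Centred 3-point running mean (the result is two values shorter)."""
    return [(x[i - 1] + x[i] + x[i + 1]) / 3.0 for i in range(1, len(x) - 1)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def naive_significant(x, y, z_crit=1.96):
    """Fisher z test that wrongly treats all observations as independent."""
    r = pearson(x, y)
    return abs(math.atanh(r)) * math.sqrt(len(x) - 3) > z_crit


def false_positive_rate(phi, smooth, n_years=50, n_sims=1000, seed=1):
    """Fraction of unrelated series pairs that the naive test calls significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        incidence = ar1_series(n_years, phi, rng)
        solar = ar1_series(n_years, phi, rng)  # independent of incidence
        if smooth:
            incidence = smooth3(incidence)
            solar = solar[1:-1]  # trim to match the smoothed series
        hits += naive_significant(incidence, solar)
    return hits / n_sims


print("white noise:          ", false_positive_rate(phi=0.0, smooth=False))
print("AR(1), phi = 0.5:     ", false_positive_rate(phi=0.5, smooth=False))
print("AR(1) + 3-year smooth:", false_positive_rate(phi=0.5, smooth=True))
```

With white noise the naive test should misfire about as often as advertised. With autocorrelated series, and more so after smoothing, it should misfire noticeably more often even though the two series are unrelated by construction.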
Even though I don’t find this paper in the least plausible, I do agree with the authors’ conclusions that those afflicted by arthritis should move to lower latitudes. I’ll start packing now.
(Andrew Alden @aboutgeology alerted me to this paper.)