Dissembling with graphs: Murry Salby edition

Perhaps the easiest way to mislead your audience, or indeed yourself, is to generate deceptive graphics. Murry Salby is to be saluted for his mastery of this art, with several fine examples in his recent London lecture. Here are the best two.

Fig 1. Bomb 14C by Salby (0:30:31)

The first example is a figure showing the changes in atmospheric 14C due to atmospheric nuclear bomb testing and the subsequent absorption of this 14C into the oceans and biomass. Salby is arguing that this occurs quickly.

“Within two decades the nuclear surplus of 14C was history”

Bomb radiocarbon is a useful dating tool for the last 60 years, so I am familiar with the general shape of the curve, and something is wrong with Salby’s figure: it shows atmospheric 14C declining to its original value within 20 years of the 1963 test ban treaty. In reality, 14C concentrations remain above background 50 years later. Comparing Salby’s version with the correct data, it becomes obvious (OK, Cathy worked it out first) what he has done.

Fig 2. Bomb 14C curves for different geographical regions from Hua et al (2013)

Rather than showing the whole curve, Salby starts his curve in 1958, after many bomb tests had already been conducted. His background level of 14C is therefore much higher than the true background level, which makes it appear that the bomb 14C moved out of the atmosphere much faster than it really did. These shenanigans would have been obvious had he extended his figure to the present day: his curve would have implied negative bomb-14C values, and the poor fit of his exponential would have been exposed.
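
A minimal R sketch of the trick (all numbers are invented for illustration; this is not Salby’s curve or the Hua et al data): if the “background” is taken to be the already-elevated 1958 level, the apparent excess crosses zero within about 20 years, while the excess over the true pre-bomb background is still large.

## Toy bomb-14C curve: rise to a peak in 1964, then a crude exponential decline
year <- 1955:2010
peak_year <- 1964
excess <- ifelse(year < peak_year,
                 700 * (year - 1955) / (peak_year - 1955),  # crude rise during testing
                 700 * exp(-(year - peak_year) / 16))       # crude post-1964 decline
background_1958 <- excess[year == 1958]     # "background" already inflated by early tests

excess[year == 1983]                        # excess over the true pre-bomb background: still large
excess[year == 1983] - background_1958      # excess over the 1958 "background": roughly zero

matplot(year, cbind(excess, excess - background_1958), type="l", lty=1:2,
        xlab="Year", ylab="Apparent bomb 14C excess")
abline(h=0, lty=3)
legend("topright", c("relative to true background", "relative to 1958 level"), lty=1:2)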

Salby’s claim that 14C levels declined to background within 20 years is bogus; his graph is exceptionally deceptive.

The second example is a figure Salby uses to argue that emissions are a function of population, because the two curves have overlain each other since 1860. If we accept this claim, it implies that population control rather than emissions abatement is needed.

Fig 3. Fossil fuel emissions and population growth since 1850 (0:40:42)

There is just the little problem of the scales of the two curves: the emissions scale starts at zero, while the population scale starts at about one billion. The two curves only overlie because of this scaling trick. A fair plot would show that emissions have risen much faster than population over this period. Does Salby really think that per-capita emissions in 1860 (horse-drawn ploughs, etc.) were equal to those of 2010? The data suggest otherwise.
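
The scaling trick is easy to reproduce in R (a sketch with rough, invented numbers, not the data behind Salby’s figure): stretch the population series so it starts and ends at the same points on the page as the emissions curve and the two lines appear to track each other, while a per-capita plot tells a different story.

## Invented, roughly-shaped series: steadily growing population, faster-growing emissions
year <- seq(1850, 2010, 10)
population <- seq(1.2, 6.9, length.out=length(year))                 # billions (rough)
emissions  <- exp(seq(log(0.1), log(9), length.out=length(year)))    # GtC/yr (rough)

par(mfrow=c(2,1), mar=c(4,4,1,4))

## The trick: the emissions axis starts at zero, the population axis near one billion,
## and both series are stretched to fill the same vertical range
plot(year, emissions, type="l", ylab="Emissions", xlab="")
scaled_pop <- (population - min(population)) / diff(range(population)) *
              diff(range(emissions)) + min(emissions)
lines(year, scaled_pop, lty=2)
axis(4, at=(c(2,4,6) - min(population)) / diff(range(population)) *
           diff(range(emissions)) + min(emissions), labels=c(2,4,6))
mtext("Population (billions)", side=4, line=2, cex=0.8)

## A fairer comparison: per-capita emissions have risen several-fold since 1850
plot(year, emissions / population, type="l", ylab="Emissions per capita", xlab="Year")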

Is Salby aware of his skill in constructing deceptive figures? Certainly his audiences are not.


The most interesting part of Murry Salby’s lecture

I watched Murry Salby’s London lecture: it was awful. Salby addresses what he calls the core issue of climate change (0:2:30), “Why is atmospheric CO2 increasing?” The answer is obvious – because of CO2 emissions from fossil fuel burning and land-use changes – but Salby does not like that answer, so he repeats oft-rebutted fallacies in a hopeless attempt to prove the increase is almost all natural.

First he shows that the annual increase in atmospheric CO2 concentrations is correlated with temperature. This is the well-known effect of El Niño, which induces global temperature increases, drought over south-east Asia and changes in Pacific Ocean productivity. This relationship explains the year-to-year variability in the increase in atmospheric CO2, not the trend. This is not a novel error.

Salby’s second argument is that the atmospheric lifetime of a CO2 molecule is short, less than five years. This is true but irrelevant. What matters is how long a pulse of CO2 stays in the atmosphere, even though the individual molecules may be exchanged between the atmosphere and the ocean or vegetation. This crucial difference has been explained many times: Salby is either wantonly ignoring facts that refute his mad hypothesis, or hoping that his audience is ignorant.
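
A toy numerical sketch (round numbers chosen purely for illustration, not a carbon-cycle model) makes the distinction concrete: large gross exchange fluxes flush individual molecules out of the atmosphere within a decade or so, but the excess CO2 itself is removed only by the much smaller net uptake.

years      <- 0:50
atm        <- 850    # GtC in the atmosphere (round illustrative number)
gross_exch <- 170    # GtC/yr exchanged each way with ocean and vegetation (round number)

## Fraction of today's molecules still in the atmosphere: roughly a 5-year residence time
original_molecules <- exp(-(gross_exch / atm) * years)

## Excess CO2 above the background: removed far more slowly
## (a crude multi-decadal decay standing in for the slow net uptake)
excess_co2 <- exp(-years / 50)

matplot(years, cbind(original_molecules, excess_co2), type="l", lty=1:2,
        xlab="Years after a pulse", ylab="Fraction remaining")
legend("topright", c("original molecules", "excess CO2"), lty=1:2)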

This brings me to the interesting part of the lecture – the first question from the audience (1:13:57) and its answer. Our favourite Viscount, Christopher Monckton, offers this fulsome praise, demonstrating that he either does not realise or does not care that the lecture was nonsense.

Professor Salby, I think we all want to start by just saying thank you. You are one of a tiny band of immensely courageous genuine scientists who have had their livelihood and their professional career stolen from them, not because their science was bad, but because it was socially inconvenient, politically uncongenial, and financially unprofitable to the governing class. Your bravery with persisting with your research for so many years after this was done to you is commendable. The clarity, breadth and depth of your presentation, which has grown since I last saw it only a year ago, and grown exponentially, is breath-taking, and my question therefore is this: when are you going to publish in a journal that they cannot ignore?

Salby replies

Thank you for your gracious remarks. I am not worthy of them, but thank you nonetheless. The immediate answer to your question is that this material will not be published, until the material from which it is derived is published. That won’t be published until I have recovered my research files and been reinstated in the field.

If Salby really believed that his work proved the rise in CO2 to be natural, he would rush to publish, saving the world from unnecessary action to abate climate change, and receive the accolades not only of a lunatic lord but of the entire population. A Nobel Prize awaits.

Yet it would seem that Salby would rather play the martyr to a tiny audience of climate sceptics (perhaps 12) than submit his research to scrutiny. His conditions for publication are pathetic. He does not need his research files: none of the material Salby presented was based on his own data, and the atmospheric CO2 concentration, global temperature and other datasets he used can all be downloaded within an hour. None of the analyses Salby presented were complicated: it should be possible to repeat them within a few days. Nor does he need to be reinstated: if his research were valid, he would not want for employment at any institute of his choice.

By refusing to publish, does Salby believe he is holding the world to ransom to get his job back, or is he too embarrassed to face the reality that his errors are not even novel?


Expressions in R

expression() and related functions, including bquote(), are powerful tools for annotating figures with mathematical notation in R. This functionality is not obvious from their respective help files. demo(plotmath) nicely shows the huge potential of expression(), but does not help that much with getting the code needed for many real cases.

I tend to get my expressions to work by trial and lots of errors (although having put this together, I now understand them, at least temporarily). I’ve just searched through my code library and extracted and annotated some examples of expression() being used. I hope someone finds them useful.

I’m going to use expression() with title(), but the same expressions can be used with any of the functions (text(), title(), mtext(), legend(), etc) used for putting text on plots.

x11(width=4, height=5, pointsize=14)
par(mar=rep(0,4), cex.main=0.8)
plot(1, type="n", axes=FALSE, ann=FALSE)

The simplest use of expression() is to give it a character string, which will be added to the plot. If the string contains spaces, it must be enclosed in quotes (alternatively, the spaces can be replaced by a tilde ~, which probably gives better code – see the comment from Gavin below).

title(line=-1, main=expression(fish))

This use of expression is entirely pointless, but is a useful starting point. Some strings have special meanings, for example infinity will draw the infinity symbol. If for some reason you want to have “infinity” written on your plot, it must be in quotes. Greek letters can be used by giving their name in lower-case or with the first letter capitalised to get the lower or upper case character respectively.

title(line=-2, main=expression(infinity))
title(line=-3, main=expression(pi))
title(line=-4, main=expression(Delta))

Superscripts and subscripts can be added to a string using the ^ and [] notation respectively.

title(line=-5, main=expression(r^2))
title(line=-6, main=expression(beta[1]))

If the string we want to have as sub- or superscript contains a space, the string must be in quotes. Braces can be used to force multiple elements to all be superscript.
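
For example (an illustrative pair of calls, shown commented out because they are not part of the figure built up here), the braces control how much of the expression is raised:

#title(main=expression(x^{y+z}))   # braces: all of y+z is superscripted
#title(main=expression(x^y+z))     # no braces: only y is superscripted, +z sits at normal height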

Strings can be separated by mathematical operators.

title(line=-7, main=expression(N[high]-N[low]))
title(line=-8, main=expression(N[2]==5))

To make more complicated expressions, build them up from separate parts by either using * or paste to join them together (if you want a multiplication symbol, use %*%). The * notation gives nicer code.

title(line=-9, main=expression(Delta*"R yr"))
title(line=-10, main=expression(paste(Delta,"R yr")))
title(line=-11, main=expression(paste("Two Year Minimum ",O[2])))
#title(line=-11, main=expression(Two~Year~Minimum~O[2]))
title(line=-12, main=expression(paste("Coefficient ", beta[1])))
#title(line=-12, main=expression(Coefficient~beta[1]))
title(line=-13, main=expression(paste("TP ", mu,"g l"^-1)))
#title(line=-13, main=expression(TP~mu*g~l^-1))
title(line=-14, main=expression(paste(delta^18,"O")))
#title(line=-14, main=expression(delta^18*O))
title(line=-15, main=expression(paste("Foram ", exp(H*minute[bc]))))
#title(line=-15, main=expression(Foram~exp(H*minute[bc])))

To start an expression() with a superscript (or subscript), I use an empty string (you can also use phantom()).

title(line=-16, main= expression(""^14*C*" years BP"))
#title(line=-16, main= expression(phantom()^14*C~years~BP))

So far so good. But sometimes, you want to use the value of an R-object in plot annotation.

For example, if we want to label a point with its x value, this will not work:

x<-5
title(line=-17, main= expression(x==x))

Instead of expression(), we have to use bquote(), with the object whose value we want wrapped in .().

title(line=-18, main= bquote(x==.(x)))
title(line=-19, main= bquote(x==.(x)~mu*g~l^-1))

Plot annotations with expression and bquote

If you understand these examples, you should be able to use the remainder of the functionality demonstrated by demo(plotmath) and at ?plotmath.


Is there robust evidence of solar variability in palaeoclimate proxy data?

This is my EGU 2015 poster which I am presenting this evening. Poster B25 if any readers are at EGU and want to see it nailed to the board.

With my coauthors Kira Rehfeld and Scott St George, I have done a systematic review of high-resolution proxy data to detect possible solar signals. It is an attempt to avoid the publication bias and methodological problems in the existing literature on solar-palaeoproxy relationships. A manuscript is in preparation.

There is no prize for finding any typos.


A deliberately misleading title?

How many readers at WUWT will read today’s headline

Strong evidence for ‘rapid climate change’ found in past millenia

as

Strong evidence for ‘rapid climate change’ found in past millennium,

a subtle difference with a very different meaning? (Yes, there is a typo in his title – I only mention it so certain readers do not think it is mine.)

Watts’ introduction to the press release

From the University of South Carolina, comes this paper that offers strong evidence of ‘rapid climate change’ occurring within less than a thousand years, with some occurring over just decades to centuries, near the same scale that proponents of man-made climate change worry so greatly about today.

doesn’t give much away.

The paper Watts is referring to investigates δ15N in the Cariaco Basin, off the coast of Venezuela, during Marine Isotope Stage 3, more than 36 thousand years ago. Not surprisingly, the record responds to Dansgaard-Oeschger events, which have been known about for thirty years and are absolutely not analogous to current warming, as Watts strives to imply.


All age-depth models are wrong, but getting better

Today at EGU, Mathias Trachsel presented an update to my 2005 paper “All age-depth models are wrong: but how badly?“. He looked at the performance of the Bayesian age-depth models that have been developed over the last decade. Generally, they perform better than the classical age-depth models, but there are some problems with setting their parameters.

His presentation can be downloaded here.

A manuscript based on the same analyses is almost ready for submission.


Limits to transfer function precision

Transfer functions are widely used to reconstruct past environmental conditions from fossil assemblages using the relationship between species and the environment in a modern calibration set. Naturally, palaeoecologists want to generate reconstructions that are as precise as possible, and take steps to achieve this:

  • taxonomic resolution can be improved in the hope that the new taxa will have narrower ecological niches than the aggregate taxa they replaced
  • larger calibration sets can be generated, which can improve precision but can also worsen it if the new observations are not comparable with the old
  • maximising the environmental gradient of interest while minimising nuisance environmental variables will usually improve calibration set performance (but not necessarily the reconstructions)
  • developing and using new transfer function methods
  • increasing the spatial density of observations in an autocorrelated environment (and using transfer function methods, such as the modern analogue technique, that are not robust to autocorrelation)

I want to suggest that there are limits to the precision that can be achieved in practice due to the inherent noise in species-environment relationships and that papers that report transfer functions with exceptionally good performance should be treated with caution. Temperature is one of the most commonly reconstructed environmental variables as it is a key climatic variable and is ecologically important, so I am going to focus on this.

With all the certainty of a hunch, I am going to place my threshold for dubious precision (measured as the root mean squared error of prediction, RMSEP) at 1°C for transfer functions with long temperature gradients (i.e. equator to pole), and somewhat lower if the temperature gradient is small.
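
For concreteness, the RMSEP is simply the square root of the mean squared difference between the observed values and the cross-validated predictions; a minimal R helper (the function name is mine) is:

## Root mean squared error of prediction from cross-validated predictions
rmsep <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}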

Several transfer functions have been declared to have performance better than this threshold. I’m going to focus on the planktonic-foraminifera sea-surface temperature (SST) transfer functions as I know these fairly well; the system is relatively simple (compared with diatoms in lakes at least); and there are some interesting issues to explore.

Pflaumann et al (2003) reported a planktonic foraminifera-SST transfer function with a standard deviation of residuals (similar to RMSEP if bias is low) of 0.75°C for winter and 0.82°C for summer using the SIMMAX method. SIMMAX was (hopefully I am correct in using the past tense) a version of the modern analogue technique (MAT) that weighted analogues by their geographic proximity to the test site during cross-validation. Since SST is spatially autocorrelated, giving high weights to close analogues will tend to make the predictions appear more precise. But this is a spurious precision, bought at the expense of the independence of the test observation, otherwise known as cheating. Since Telford et al (2004) described the problem with SIMMAX, it has been little used.

Waelbroeck et al (1998) introduced the revised analogue method (RAM), another version of MAT that attempted to merge the properties of MAT and response surfaces. Unfortunately, the response surface was only calculated once rather than repeatedly during cross-validation. This means that the impressive performance of their planktonic foraminifera-SST transfer function, with a standard deviation of residuals of 0.7°C for winter and 0.91°C for summer, is biased by the failure to ensure that the test observation is independent of the calibration set during cross-validation. I’ve not seen RAM used much since Telford et al (2004) described the problem with it.

Artificial neural networks (ANN) were used by Malmgren et al (2001), with a reported RMSEP of 0.99°C for winter and 1.07°C for summer. ANNs learn by iteratively adjusting a large set of parameters, initially set to random values, to minimise the error between the predicted and actual output. If trained for too long, ANNs can over-fit the data, learning particular features of the modelling set rather than the general rules. This is normally controlled by splitting the data, training the models on one portion, testing them on a second portion, and stopping the training when the RMSEP of this second portion stops decreasing. Typically, many ANN models are generated from different random initial conditions and configurations, and the best model is selected using the second portion. By selecting the model that gives the lowest RMSEP for the second data partition, the RMSEP is biased low. A third data partition is needed to give an unbiased estimate of model performance (again, see Telford et al (2004)). Malmgren et al did not use this independent test set, so their results are biased low.
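
The size of this selection bias is easy to demonstrate with simulated data (a hedged R sketch, not a re-analysis of Malmgren et al; every candidate “model” below is just truth plus noise with a true RMSEP of about 1): the model that happens to score best on the validation partition reports an optimistically low RMSEP there, while an untouched test partition recovers the true error.

set.seed(1)
rmsep <- function(observed, predicted) sqrt(mean((observed - predicted)^2))

n_obs    <- 50     # observations in each data partition
n_models <- 100    # candidate models (e.g. ANNs started from different random weights)

obs_valid <- rnorm(n_obs)   # validation partition
obs_test  <- rnorm(n_obs)   # independent test partition

## every candidate model predicts truth plus independent noise (true RMSEP ~ 1)
valid_rmsep <- test_rmsep <- numeric(n_models)
for (m in 1:n_models) {
  valid_rmsep[m] <- rmsep(obs_valid, obs_valid + rnorm(n_obs))
  test_rmsep[m]  <- rmsep(obs_test,  obs_test  + rnorm(n_obs))
}

best <- which.min(valid_rmsep)
valid_rmsep[best]   # biased low: this model only looked best on the validation data by chance
test_rmsep[best]    # close to the true error of ~1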

MAT is perhaps the most widely used transfer function method for reconstructing SST from planktonic foraminifera. Telford & Birks (2005) report an RMSEP of 0.89°C for winter SST in the North Atlantic (Kucera et al (2005) report a larger RMSEP of 1.32°C for winter and 1.42°C for summer – I don’t know what causes the difference). As Telford & Birks (2005) show, this low RMSEP is biased by spatial autocorrelation in the calibration set, which means that the test observation is not independent of the calibration set during cross-validation.

All of these low RMSEPs are demonstrably biased. To have an RMSEP of 1°C, species need to have very clean responses to temperature; nuisance variables and noise make this unlikely. With short gradients, the magnitude of error that is possible decreases, so lower RMSEPs are expected (but also a lower r²). So, for example, the Norwegian pollen-July temperature RMSEP of just over 1°C is plausible. This model has none of the problems outlined above and uses methods that are reasonably robust to autocorrelation.

In reality, different thresholds are needed for different proxies. When the relationship between the organisms and the environmental variable being reconstructed is less direct (for example, between chironomids and air temperature), or there are large nuisance gradients (again, e.g. chironomids), the threshold at which I start to wonder is raised.

The same logic outlined here holds for transfer functions for reconstructing other variables – if the results look too good to be true, there might be problems. For example, there is at least one transfer function where I suspect that the authors have forgotten to cross-validate their model, so good is its performance. Unfortunately, short of acquiring the data and re-running the analyses, there is little that can be done to check such cases.

A question for readers: do you know of any transfer functions with suspiciously good performance that ought to be examined?
