Dinocysts, transfer functions and spatial autocorrelation: part 1207

I don’t always comment on papers that use transfer functions but neglect to consider how spatial autocorrelation in the modern calibration set might make the reconstructions spuriously precise. It gets tedious, especially when the same authors make the same mistakes time and again. But sometimes I am asked to review such papers, and I oblige.

One such paper was Wary et al (2017), published yesterday. The paper suggests that when Greenland and the North Atlantic cool during Dansgaard–Oeschger oscillations, the surface of the Norwegian Sea warms, and vice versa. The warmth in the Norwegian Sea is reconstructed from the cysts of dinoflagellates, which live near the surface. Cold subsurface conditions in the Norwegian Sea are reconstructed from planktic foraminifera.

Since Wary et al was published in Climate of the Past, the complete peer review process – reviews, editor’s comments and author replies – is publicly available. This now includes the second round of reviews and editor’s comments, which were previously hidden.

Lacking the expertise to critique the physical plausibility of this regional see-saw, my review focused on the dinocyst reconstructions. The paper reconstructs summer and winter sea surface temperatures and salinities, together with sea-ice duration. I really doubt that all five variables can be reconstructed independently, especially since most dinoflagellates overwinter in cysts on the sea floor.

I criticised the paper for reporting model performance statistics from a cross-validation scheme (either leave-one-out or k-fold cross-validation – the paper is not clear) that ignores the considerable spatial autocorrelation in the calibration set, and suggested that the true uncertainty was severely underestimated. I also criticised the lack of reconstruction diagnostics to help the reader evaluate the reconstructions.

The editor agreed these were important concerns. So how did the authors respond?

The authors added – as they had promised – a plot showing the taxonomic distance of each fossil sample to the nearest analogue in the modern calibration set. They claim this plot will “ensure that one can assess by his own the reliability and robustness of our reconstructions”.


Wary et al (2017) Figure S5. Distance to the nearest analogue in the four studied cores

Well, good luck with that. Usually, plots of the distance to the nearest analogue show some reference levels (often the 5th and 10th percentile of all distances in the calibration set) that the distances can be compared with. Wary et al do not, so there is no way to know if the distances are high or low (a problem exacerbated by the absence of information on which distance metric was used, hampering replication). This figure is almost useless.
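To illustrate what such reference levels look like, here is a minimal R sketch. Everything in it is an assumption for illustration: the data are random toy assemblages, not dinocyst counts, and the squared chord distance is simply a common choice for analogue matching, since the paper does not say which metric was used.

```r
# Toy data only: random "assemblages" standing in for real counts
set.seed(1)
n_taxa <- 10
modern <- matrix(rexp(50 * n_taxa), ncol = n_taxa)
fossil <- matrix(rexp(5 * n_taxa), ncol = n_taxa)
modern <- modern / rowSums(modern) # convert to proportions
fossil <- fossil / rowSums(fossil)

# Squared chord distance between two vectors of proportions (assumed metric)
sq_chord <- function(x, y) sum((sqrt(x) - sqrt(y))^2)

# Reference levels: 5th and 10th percentiles of all pairwise distances
# within the modern calibration set
modern_d <- as.matrix(dist(sqrt(modern)))^2
ref <- quantile(modern_d[upper.tri(modern_d)], probs = c(0.05, 0.10))

# Distance from each fossil sample to its nearest modern analogue
nearest <- apply(fossil, 1, function(f) {
  min(apply(modern, 1, sq_chord, y = f))
})

plot(nearest, type = "b", ylim = range(0, nearest, ref),
     xlab = "Fossil sample",
     ylab = "Squared chord distance to nearest analogue")
abline(h = ref, lty = 2:3) # dashed = 5th percentile, dotted = 10th
```

Fossil samples plotting above the dashed and dotted lines would be the ones to treat with suspicion.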


Wary et al rely on Guiot and de Vernal (2011a, b) (a paper and their response to a comment) in support of their assertion that

parallel studies equally based on cross-validation schemes showed that this spatial autocorrelation has in fact relatively low impact on the calculation of the error of prediction of the MAT transfer function applied to dinocyst assemblages.

Unfortunately, Guiot and de Vernal (2011a, b) is a strong contender for one of the worst papers ever published in Quaternary Science Reviews, managing to simultaneously demonstrate and deny that autocorrelation is a problem, and use an irrelevant test to prove nothing. It is absolutely not evidence that autocorrelation is not a serious problem for transfer functions.

The authors also cite de Vernal et al (2013a) and de Vernal et al (2013b) as further evidence that autocorrelation is not a problem for dinocyst transfer functions. However, neither paper even attempts to test if autocorrelation leads to an overestimation of model performance. Both papers use k-fold cross-validation. This is only minutely less sensitive to autocorrelation than leave-one-out cross-validation: it is a solution to autocorrelation to the same extent that a sieve makes a good boat.

The authors graciously cite several of my papers which demonstrate that autocorrelation is a problem and suggest means to identify it and deal with it. However, I would much rather that instead of contributing towards increasing my h-index, the authors had engaged with the h-block cross-validation scheme I proposed. In conclusion, Wary et al is yet another wasted opportunity to determine the true utility of dinocyst-based transfer functions.
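For readers who have not met h-block cross-validation, the mechanics can be sketched in a few lines of R. This is a toy under stated assumptions, not the analysis from any of the papers discussed: the sites, the environmental variable and the species data are all simulated, coordinates are planar kilometres, and the modern analogue technique is reduced to an unweighted mean of the k closest analogues.

```r
set.seed(42)
n <- 200
xy <- cbind(x = runif(n, 0, 1000), y = runif(n, 0, 1000)) # site coordinates, km
env <- xy[, "x"] / 100 + rnorm(n, sd = 0.5) # spatially structured variable

# Toy species data: Gaussian responses along the gradient plus noise
optima <- seq(0, 10, length.out = 8)
spp <- sapply(optima, function(o) exp(-(env - o)^2 / 4)) + runif(n * 8, 0, 0.05)
spp <- spp / rowSums(spp)

geo_d <- as.matrix(dist(xy))          # geographic distance between sites, km
tax_d <- as.matrix(dist(sqrt(spp)))^2 # squared chord distance between assemblages

# Predict each site from its k closest analogues, excluding every site
# within h km of it. h = 0 excludes only the site itself (leave-one-out).
mat_cv_rmsep <- function(h, k = 5) {
  pred <- sapply(seq_len(n), function(i) {
    allowed <- which(geo_d[i, ] > h)
    nearest <- allowed[order(tax_d[i, allowed])[seq_len(k)]]
    mean(env[nearest])
  })
  sqrt(mean((pred - env)^2))
}

rmsep <- sapply(c(loo = 0, h100 = 100, h300 = 300), mat_cv_rmsep)
rmsep
```

The only difference from leave-one-out is the `geo_d[i, ] > h` line. How much the error grows with h depends on how strongly the real calibration set is autocorrelated, which is exactly what needs testing.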


Posted in Uncategorized | 1 Comment

Critical perspectives in climate reconstructions from pollen

This week I am at Caux, high above Lac Léman, Switzerland, attending a Pollen Climate Model Intercomparison Project workshop.

The invited talk I gave this morning was on critical issues in pollen climate reconstructions. (I’ve made the ioslides presentation into a pdf – it looked much prettier before).


“Fossil Insect Study Suggests That Los Angeles Climate Has Been Relatively Stable for at Least 50,000 Years”

So sayeth the press release. But what about the paper, and the 182 beetles sampled from La Brea tar pits?

Fossil preservation in the tar pits is exceptional, but the constant stream of gas through the tar deposits mixes the fossils – there is no stratigraphy in the tar. Holden et al have to radiocarbon date each and every beetle they analyse. Naturally, this limits the number of beetles they can analyse, both because of the financial cost and the destruction of the beetles. It also limits the number of species that can be analysed (to seven), as only species with abundant fossils can be vaporised (we are not told what other species are found – I would have liked this information).

Contrast the situation with the tar pits with a more typical site for beetle analysis, say a section through a peat bog, where large volumes of sediment in stratigraphic order can be collected, with dozens or more beetle fossils from many species in each sample, and only a few dates needed to constrain the chronology. La Brea is a challenging site.

These are dates of the seven beetle species at La Brea.


Holden et al figure 2. Median calibrated age of each beetle with 2-sigma ranges. Cases where only the error bar is shown are “greater than” ages where the results were very close to background and only a lower limit could be specified. The low quality of the image is because Elsevier are hopeless.

The first thing to note is that no beetles were dated to the last glacial maximum. Holden et al ascribe this to either the lack of insect collections from the pits containing LGM mammal fossils, or a cooler LGM climate making the tar less sticky so fewer insects were caught. If the former explanation is correct, the press release claim that “Los Angeles Climate has been relatively stable for at least 50,000 years” cannot be substantiated as there is no evidence from a critical interval. If the latter explanation is correct, the press release is refuted. A third explanation not considered by Holden et al is that the climate changed such that the seven beetles they use were not present at La Brea. Whatever the reason for the lack of LGM beetles, the headline of the press release is wrong.

There is also a beetle gap in the early Holocene thermal maximum. Again, the lack of evidence precludes a conclusion that the climate was stable.

It is always going to be difficult to reconstruct climate from just seven species of beetles, selected in part because they were common. So it is not greatly surprising that the reconstructed climate, for the intervals where there are data, is similar to modern. Holden et al report “mean summer temperatures within ±5 °C of today’s conditions”, a range wide enough that only the LGM (for which there are no data) could reasonably be expected to fall outside it.

The method for reconstructing climate is unclear. The paper appears to assume that the assemblage composition is constant through time and hence that the climatic conditions must have remained similar to modern. I would have liked to see a figure showing the assemblage in each time window. Something like this.


Number of beetles by time interval. ?Modern beetles have an age of < 200 cal BP, and so might represent accidental modern contamination of the tar pit.

There are distinct shifts in the species composition through time that Holden et al do not explore. I don’t know if these shifts have possible climatic interpretations. Although Holden et al collate modern records for their species, they don’t present the results in an easy-to-interpret way (they present violin plots, each species in a separate file, and scatter plots).

While the species distribution modelling could certainly have been done better, and the method for climate reconstruction made explicit, it is always going to be difficult to work on the stratigraphically-mixed deposits at La Brea, and without a huge amount of money for dating, and a willingness to atomise any beetles for dates, it will be impossible to get the data and reconstruction that could be expected from a more typical site. But that is no excuse for a press release that is unsupported by the paper.






Are Tibetan chironomids mesmerised by solar variability?

It’s been a while since I examined a paper that purports to present palaeoecological evidence for a climate response to solar variability. But last night, flicking through the recently published papers in Quaternary Science Reviews, I came across Zhang et al who suggest in their abstract that “solar activity could be an important mechanism driving the centennial-scale variability”.

Zhang et al present a high-resolution (50-yr) Holocene chironomid stratigraphy from the small Lake Tiancai on the south-east margin of the Qinghai-Tibetan Plateau from which they use a transfer function to infer Holocene temperature evolution. Unlike many authors, they present a range of transfer function diagnostics. These suggest there may be some problems with poor analogues for the fossil samples, especially in the late glacial/early Holocene, in the modern calibration set.

So far, so good. But what of the solar variability?

Alas, it is a correlation-by-eye.


Zhang et al Fig. 5. Multi-proxy records compared from Tiancai Lake (a) the percentage abundance of a pollen genus Tsuga; (b) December insolation at 30 °N; (c) chironomid-based mean July temperature reconstruction from this study (solid black line) with 5-point-running average (red solid line) highlighting the overall trend and (d) the percentage abundance of a diatom species Aulacoseira alpigena. Highlighted areas in blue correspond to the recognized Hallstatt solar cycle minima centered at 8200, 5500, 2500 and 500 cal yr BP.

Although the total solar irradiance reconstruction from Steinhilber et al. is shown in figure 6, figure 5 shows the totality of evidence for a link between chironomid-inferred temperatures and solar activity at Tiancai Lake. No statistical tests are used to test if this relationship is better than expected by chance.

The 5-point running-average has obviously been miscalculated – it should never reach the extremes of the data – it appears to be joining every fifth sample instead (it is done correctly in figure 6). This makes comparison of the smoothed record with the solar minima more difficult. Only one of the solar minima convincingly aligns with a temperature minimum and other large temperature minima do not align.
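The difference between the two is easy to show in R. This is a sketch on a simulated series, not the Tiancai Lake data: a centred 5-point running mean computed with `stats::filter()`, next to the line obtained by joining every fifth sample.

```r
set.seed(3)
temp <- 10 + cumsum(rnorm(50, sd = 0.3)) # toy temperature series

# Centred 5-point running mean; the first and last two values are NA
run5 <- stats::filter(temp, rep(1 / 5, 5), sides = 2)

# What the figure appears to show instead: every fifth sample joined up
idx <- seq(1, length(temp), by = 5)

plot(temp, type = "l", xlab = "Sample", ylab = "Temperature")
lines(run5, col = "red")                     # stays inside the data range
lines(idx, temp[idx], col = "blue", lty = 2) # touches the raw data
```

Each value of the running mean is an average of five samples, so it cannot lie outside the range of the raw data, whereas the sub-sampled line passes through raw values and hits an extreme whenever one falls on a fifth sample.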

I would argue that this paper provides absolutely no evidence for a solar-temperature link, and that the purported link is a distraction to an otherwise good paper.

So, fulfilling Betteridge’s law of headlines, no, the chironomids were not mesmerised by solar variability. But the authors, reviewers and editors might well have been.




Downloading Polish weather data

I needed some temperature data from Poland for some work I am doing on the phenology of understorey plants in Białowieża Forest.

The easy source of data is the GHCN which can be accessed through the rnoaa package. This code loads the necessary packages and finds stations in GHCN within 200 km of Białowieża Forest.

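#Download Bialowieza and regional temperature data
library("tidyverse") # dplyr and ggplot2
library("lubridate") # year()
library("rnoaa")     # access to GHCN
library("ggrepel")   # geom_label_repel()

##regional climate
#find nearby sites
dat <- read.table(header = TRUE, sep = ",", strip.white = TRUE, text = "
longitude, latitude, id
23.894614, 52.744313,Białowieża")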

stations <- meteo_nearby_stations(dat, lat_colname = "latitude", lon_colname = "longitude", station_data = ghcnd_stations(), var = "all",  year_min = 1960, year_max = 2000, radius = 200)

mp <- map_data("world", xlim = c(16, 30), ylim = c(48, 58))

ggplot(stations$Białowieża, aes(x = longitude, y = latitude, label = name)) +
geom_map(data = mp, mapping = aes(map_id = region), map = mp, fill = "grey80", colour = "black", inherit.aes = FALSE) +
geom_point() +
geom_label_repel() +
geom_point(data = dat, aes(x = longitude, y = latitude), colour = "red", size = 3, inherit.aes = FALSE)

Climate stations near Bialowieza Forest (red dot)

Now I can download the data for the closest few stations.

#download data
regionalData <- stations$Białowieża %>%
  filter(distance < 70) %>% # stations within 70 km
  group_by(name, id) %>%
  do(ghcnd_search(.$id, var = "TAVG")$tavg) %>% # download
  mutate(tavg = tavg / 10, # GHCN stores tenths of °C
         variable = "tavg") %>%
  rename(value = tavg)

#plot regional data
g <- regionalData %>% filter(year(date) == 2000) %>%
ggplot(aes(x = date, y = value, colour = name, group = name)) +
geom_line() +
labs(x = "Date", y = "Daily mean temperature °C", colour = "Station")


Mean daily temperature for the year 2000 for the four nearest GHCN stations

Unfortunately, only a small proportion of the Polish weather data are available through GHCN. Until last year, these data were difficult to access. Now they are available to download from https://dane.imgw.pl/.

It is only possible to download seven days’ data at once. This would become tedious if you wanted data for several years. I wanted 50 years’ data, so I wrote an R function to hit the server ~2500 times slowly (server limit of 1000 queries per client per ten minutes). To use the script, you need to know the site ID for the weather station (map), the variable name (hope your Polish is good), and a registration. What I haven’t found are meta-data showing which stations have which variables and for how long.
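The chunking and throttling logic is simple enough to sketch. What follows is not my actual function: `make_windows()`, `download_all()` and the `fetch_window` argument are hypothetical names for illustration, the 50-year date range is arbitrary, and the real request to the server is left out entirely.

```r
# Split a date range into consecutive windows of at most `width` days
make_windows <- function(start, end, width = 7) {
  from <- seq(start, end, by = paste(width, "days"))
  data.frame(from = from, to = pmin(from + width - 1, end))
}

# An arbitrary 50-year range, for illustration only
windows <- make_windows(as.Date("1967-01-01"), as.Date("2016-12-31"))
nrow(windows) # number of queries needed

# fetch_window() is a stand-in for whatever performs one request and
# returns a data frame; it is an assumption, not a real API call.
download_all <- function(windows, fetch_window, pause = 1) {
  out <- vector("list", nrow(windows))
  for (i in seq_len(nrow(windows))) {
    out[[i]] <- fetch_window(windows$from[i], windows$to[i])
    Sys.sleep(pause) # 1 s per query is 600 per ten minutes, under the limit
  }
  do.call(rbind, out)
}
```

With a one-second pause, a run of this size takes a little under 45 minutes, which is slow but keeps well clear of the server limit.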

## Bialowieza data from

#authentic <- "richard.telford@uib.no:Pa55w0rd"#not real password
#save(authentic, file = "data/authentic.Rdata")
startDate <- as.Date("2000-1-1")
endDate <- as.Date("2000-12-31")
siteCode <- "252230120"

BialowiezaDaily <- get_Polish_weather_data(
  siteCode = siteCode,
  variableCode = "B100B007CD",
  startDate = startDate,
  endDate = endDate,
  authentic = authentic
)

BialowiezaDaily <- BialowiezaDaily %>%
mutate(name = "Białowieża", variable = "tavg")

And a plot to compare the Białowieża data with the GHCN data

## compare Białowieża with regional data
all_temperatures <- bind_rows(BialowiezaDaily, regionalData)

g %+% filter(all_temperatures, year(date) == 2000)

The four nearest GHCN stations and Bialowieza

Not surprisingly, all the data are in good agreement, but note that the GHCN-daily data are not homogenised to account for station moves etc. I suspect the IMGW data are not homogenised either.


Chironomid vs pollen: Holocene climate change in southern Europe

Pollen-inferred summer temperature reconstructions from southern Europe show cool early-Holocene summers and warmer late-Holocene summers (Davis et al 2003, Mauri et al 2015). In contrast, warm early-Holocene summers are reconstructed elsewhere in Europe and most of the mid-high latitude Northern Hemisphere, as expected from the increased summer insolation due to orbital forcing.

Since one important use of palaeoclimate reconstructions is to validate climate models by testing whether model simulations of past climate match the climate reconstructed from proxies, it is important to understand whether these cool early-Holocene summer reconstructions are valid. Uncritical use of palaeodata to validate models is not going to be useful.

Time to call in the cavalry chironomids.

Samartin et al 2017 (which we covered in our reading group last week – not all ideas here are my own) reports chironomid-inferred summer temperature reconstructions from two lakes in the Italian mountains. Both (yes there is replication) lakes show a warm early Holocene, consistent with the insolation changes, marine proxy records and climate models, but not the pollen-based reconstructions.


Samartin et al figure 3. Chironomid-inferred mean July air temperature for Lakes Gemini and Verdarolo compared with other palaeotemperature records.

Samartin et al use a combined Swiss-Norwegian chironomid calibration set and report that all the fossil samples have fair/good analogues in this calibration set. I would have liked to have seen a plot of the analogue quality in the supplementary material. Otherwise, I like this paper.

Samartin et al suggest that the pollen-based reconstructions are so different because temperature is not the only control on vegetation in southern Europe. Moisture availability is important, and several millennia of disturbance by humans have replaced the natural vegetation with “humanized vegetation types” such as Garrigue. Thus pollen assemblages from the early Holocene may lack appropriate modern analogues and so pick inappropriate analogues from higher elevations where there are remnant forests. (In June, I am going to a PAGES workshop on improving pollen-based reconstruction. I must remember that in some cases the best reconstruction is no reconstruction.)

There is one problem with Samartin et al. Nature has strict guidelines about the availability of data etc.

A condition of publication in a Nature journal is that authors are required to make materials, data, code, and associated protocols promptly available to readers without undue qualifications.

A data availability statement is to be included in papers which

will report the availability of the ‘minimal data set’ necessary to interpret, replicate and build on the findings reported in the paper.

This is the full data availability statement from Samartin et al:

The chironomid-inferred mean July air temperature records from Gemini and Verdarolo are available at the NOAA National Centers for Environmental Information webpage (https://www.ncdc.noaa.gov/paleo/study/21030). Data from reference 1 (shown in Fig. 3) have been digitized from the original publication. Data from ref. 3 (online resource http://ncdc.noaa.gov/paleo/study/18317), ref. 21 (IGBP PAGES/World Data Center for Paleoclimatology Data Contribution Series no. 2006-106) and ref. 46 (IGBP PAGES/World Data Center for Paleoclimatology Data Contribution Series no. 92-007) have been downloaded from the NOAA National Centers for Environmental Information webpage.

Climate model data from refs 36,37 (shown in Fig. 3) are available at the NCAR climate data webpage (https://www.earthsystemgrid.org/dataset/ucar.cgd.ccsm3.trace.html). Climate model data from ref. 6 can be obtained from H.R. on request.

Notice anything missing?

There is no mention of where the chironomid calibration set and fossil data are archived or how they can be accessed. I do hope this omission was just an oversight.

There is no better explanation of why the data should be made available than this sentence from the statement.

Data from reference 1 (shown in Fig. 3) have been digitized from the original publication.


Resilience workshop in Finse: deadline soon

At the end of March, there is a workshop on Measuring Components of Resilience in Long-term Ecological Datasets

The workshop is at Finse, high in the mountains between Bergen and Oslo. Along with debate about how ecosystem resilience can be estimated from palaeoenvironmental data, and work to implement our ideas in R, I can promise there will be lots of snow, sauna and skiing. If you have size 39 feet, I can lend you some skis (other skis will be available). Northern lights are possible if the weather is good (we saw them last year). The food is excellent, and the train line to Finse from Bergen or Oslo is one of the most scenic anywhere.


Not just ptarmigan and arctic hares & foxes at Finse

There are still a few days left to apply. All travel within Norway (i.e. trains to Finse and back), and food and accommodation at the Finse Research Station, are provided. Some funds are available to support travel of early-career researchers.
