## Calibration-in-time at Seebergsee

Most transfer functions for reconstructing palaeoenvironmental variables from microfossil assemblages use a modern calibration set of paired microfossil and environmental data from many sites. This is sometimes known as the calibration-in-space approach. An alternative approach adopted by a few papers is to use paired microfossil and environmental data from different times at one site, the so-called calibration-in-time approach.

In principle, there are advantages to a calibration-in-time approach: there is no need to collect an extensive modern calibration set; and the calibration set is tailored for the site being studied, which might be especially useful if the site is unusual. Despite these advantages, I’ve always been worried that the recent microfossil assemblages used as the calibration set might be poor analogues for the older assemblages, especially if there has been considerable environmental change. Having now completed my review of sub-decadal palaeoenvironmental reconstructions from microfossil assemblages, I think the demands of a near-perfect chronology, simple taphonomy, and a strong and simple relationship to the environmental variable reconstructed are very difficult to meet.

How then to explain the good performance of the calibration-in-time reconstruction of July air temperature from chironomids from Seebergsee by Larocque-Tobler et al (2011)?

Larocque-Tobler et al (2011) extract chironomid assemblages from a sediment core from Seebergsee and pair these with air temperature data from a climate station in the region for the period 1900–2005 CE. The calibration-in-time WAPLS model is reported as having a bootstrap r2 of 0.56 and an RMSEP of 0.84 °C. According to their figure 6, the reconstruction for the full period 1900–2005 CE has a correlation of 0.73 with the instrumental temperature. This is approximately sqrt(0.56): so far so good.

No data are archived for Larocque-Tobler et al (2011), but there is not much to see as the California Zephyr trundles through Nebraska in the dark, so I digitised the fossil assemblages and climate data.

When I rerun the WAPLS-2 model, my reconstruction matches the original and I get similar performance statistics – r2 = 0.53, RMSE = 0.86 °C. The niggle is that these are the apparent performance statistics, not the cross-validated statistics. The leave-one-out cross-validated r2 is only 0.02.
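To see how dramatic the gap between apparent and cross-validated performance can be, here is a minimal sketch in Python (not WAPLS itself – just ordinary least squares on invented noise predictors, which overfits in the same way a species-rich calibration set with few samples can):

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 30, 20  # few samples, many predictors: a recipe for overfitting
X = rng.normal(size=(n, p))
y = rng.normal(size=n)  # response entirely unrelated to the predictors

def ols_predict(X_train, y_train, X_new):
    """Least-squares fit with intercept, then predict for new samples."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef

def r2(obs, pred):
    return 1 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Apparent performance: predict the same samples used for fitting
apparent = r2(y, ols_predict(X, y, X))

# Leave-one-out cross-validation: refit without each sample in turn
loo = np.array([ols_predict(np.delete(X, i, 0), np.delete(y, i), X[i:i + 1])[0]
                for i in range(n)])
cv = r2(y, loo)
print(f"apparent r2 = {apparent:.2f}, leave-one-out r2 = {cv:.2f}")
```

With pure noise, the apparent r2 is high simply because the model has many free parameters, while the leave-one-out r2 collapses – exactly the pattern at Seebergsee.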

It appears that Larocque-Tobler et al (2011) are reporting the apparent performance statistics but claiming that they are the bootstrap cross-validated performance statistics. While this could easily just be a silly mistake (they appear to report the calibration-in-space model performance correctly), the utter lack of skill of the calibration-in-time model means that both Larocque-Tobler et al (2011) and Larocque-Tobler et al (2012), which uses the model to reconstruct air temperature from Seebergsee for the last 1000 years, are invalid.

In a subsequent post, I’ll take a closer look at the rather curious chironomid assemblage data from Seebergsee.

## Data archiving in palaeoecology

Perhaps the main impediment in trying to reproduce the results from Dr Larocque-Tobler’s papers was the incomplete archiving of data for several papers. So I had a look at the requirements for data archiving in journals that commonly publish palaeoecological research. Here are the data availability requirements (my emphasis except for Nature).

### Journal of Paleolimnology (Springer)

Authors may deposit long tables, species lists, protocols, and additional figures in the Publisher’s Electronic Supplementary Material (ESM) system, which is directly linked online to the published article. Alternatively, any official repository may be used (e.g., World Data Center-A for Paleoclimatology at NOAA/NGDC).

### Quaternary Science Reviews; Palaeogeography, Palaeoclimatology, Palaeoecology (Elsevier)

This journal encourages and enables you to share data that supports your research publication where appropriate, and enables you to interlink the data with your published articles. Research data refers to the results of observations or experimentation that validate research findings. To facilitate reproducibility and data reuse, this journal also encourages you to share your software, code, models, algorithms, protocols, methods and other useful materials related to the project.

Data statement
To foster transparency, we encourage you to state the availability of your data in your submission. This may be a requirement of your funding body or institution. If your data is unavailable to access or unsuitable to post, you will have the opportunity to indicate why during the submission process, for example by stating that the research data is confidential. The statement will appear with your published article on ScienceDirect. For more information, visit the Data Statement page.

### The Holocene (SAGE)

The Holocene requests all authors submitting any primary data used in their research articles [“alongside their article submissions” or “if the articles are accepted”] to be published in the online version of the journal, or provide detailed information in their articles on how the data can be obtained. This information should include links to third-party data repositories or detailed contact information for third-party data sources. Data available only on an author-maintained website will need to be loaded onto either the journal’s platform or a third-party platform to ensure continuing accessibility. Examples of data types include but are not limited to statistical data files, replication code, text files, audio files, images, videos, appendices, and additional charts and graphs necessary to understand the original research. [The editor(s) may consider limited embargoes on proprietary data.] The editor(s) [can/will] also grant exceptions for data that cannot legally or ethically be released. All data submitted should comply with Institutional or Ethical Review Board requirements and applicable government regulations. For further information, please contact the editorial office at [email address].

### Boreas (Wiley)

Data that is integral to the paper must be made available in such a way as to enable readers to replicate, verify and build upon the conclusions published in the paper. Any restriction on the availability of this data must be disclosed at the time of submission.

Data may be included as part of the main article where practical. We recommend that data for which public repositories are widely used, and are accessible to all, should be deposited in such a repository prior to publication. The appropriate linking details and identifier(s) should then be included in the publication and where possible the repository, to facilitate linking between the journal article and the data. If such a repository does not exist, data should be included as supporting information to the published paper or authors should agree to make their data available upon reasonable request.

### Climate of the Past (EGU)

Copernicus Publications recommends depositing data that correspond to journal articles in reliable (public) data repositories, assigning digital object identifiers, and properly citing data sets as individual contributions. Please find your appropriate data repository in the registry for research data repositories re3data.org. A data citation in a publication should resemble a bibliographic citation and be located in the publication’s reference list. To foster the proper citation of data, Copernicus Publications requires all authors to provide a statement on the availability of underlying data as the last paragraph of each article (see section data availability). In addition, Copernicus Publications provides with Earth System Science Data (ESSD) a journal dedicated to the publication of data papers including peer review on data sets. Authors might consider submitting a data paper to ESSD in addition to their research paper in CP.

Best practice following the Joint Declaration of Data Citation Principles initiated by FORCE 11:

### COPDESS

In addition to promoting these data citation principles, Copernicus Publications is a signatory of the Coalition on Publishing Data in the Earth and Space Sciences (COPDESS) commitment statement.

### Statement on the availability of underlying data

Authors are required to provide a statement on how their underlying research data can be accessed. This must be placed as the section “Data availability” at the end of the manuscript before the acknowledgements. Please see the manuscript composition for the correct sequence. If the data are not publicly accessible, a detailed explanation of why this is the case is required.

The best way to provide access to data is by depositing them (as well as related metadata) in reliable public data repositories, assigning digital object identifiers, and properly citing data sets as individual contributions. If different data sets are deposited in different repositories, this needs to be indicated in the data availability section. If data from a third party were used, this needs to be explained (including a reference to these data). DataCite recommends the following elements for a data citation:

Creators: Title, Publisher/Repository, Identifier, Publication Year (e.g.: Loew, A., Bennartz, R., Fell, F., Lattanzio, A., Doutriaux-Boucher, M., and Schulz, J.: Surface Albedo Validation Sites, EUMETSAT, http://dx.doi.org/10.15770/EUM_SEC_CLM_1001, 2015).

Copernicus Publications also accepts supplements containing smaller amounts of data. However, please note that this is not the preferred way of making data available.

### Other underlying material

Data do not comprise the only information which is important in the context of reproducibility. Therefore, Copernicus Publications encourages authors to also deposit software, algorithms, model code, and other underlying material on suitable repositories/archives whenever possible. These materials should be referenced in the article and preferably cited via a persistent identifier such as a DOI.

### Nature Geoscience (Nature Group)

An inherent principle of publication is that others should be able to replicate and build upon the authors’ published claims. A condition of publication in a Nature Research journal is that authors are required to make materials, data, code, and associated protocols promptly available to readers without undue qualifications. Any restrictions on the availability of materials or information must be disclosed to the editors at the time of submission. Any restrictions must also be disclosed in the submitted manuscript.

After publication, readers who encounter refusal by the authors to comply with these policies should contact the chief editor of the journal. In cases where editors are unable to resolve a complaint, the journal may refer the matter to the authors’ funding institution and/or publish a formal statement of correction, attached online to the publication, stating that readers have been unable to obtain necessary materials to replicate the findings. [Emphasis in original]

### PAGES-acknowledged publications

Prior to publication, all essential input and output data must be archived in a community recognized, publicly accessible, long-term data repository to allow for a proper “data citation” (see below).

There is a huge variation in what the journals require, from “may” to “must”, and whether the data need to be archived or only promised on request. Only Nature explains what will happen if the authors renege on their agreement.

Of course, what really matters is how well these policies are enforced.

## Abisko, 2003

In the summer of 2003, our first summer in Bergen, Cathy and I took a grand tour of Scandinavia on the way to the Paleolimnology Symposium in Espoo. On the way, we stopped in Abisko, northern Sweden, for a few days to hike through the forests and alpine grassland. It was beautiful, but that is not what I have to write about today.

Larocque and Hall (2003) correlate high-resolution (1–7 years) reconstructions of July air temperature inferred from chironomid assemblages in four lakes near Abisko with instrumental temperature data. As these sub-decadal resolution palaeoenvironmental reconstructions are within the scope of a review paper I am writing, I thought I ought to finally have a closer look at them.

The paper reports that

Strong, statistically significant correlations (p <= 0.05) were observed between chironomid-inferred mean July air temperatures and mean meteorological data at all four sites.

The correlations and their associated significance values are given on one of the figures; I’ve put them into a table below.

Part of Larocque and Hall (2003) Figure 5. Comparison between meteorological data and chironomid-inferred temperatures at one of the four study sites. The line represents the 5-year running means of the lapse-rate corrected meteorological data from Abisko. Open squares represent the 5-year running means of the meteorological data corresponding to the date obtained at each level. Solid circles are the chironomid-inferred temperatures with the estimated errors as vertical bars (mean +/- SSE). Horizontal error bars represent an estimated error in dating. Open stars indicate sediment intervals where the instrumental values fall outside the range of chironomid-inferred temperature (mean +/- SSE). The Pearson correlation coefficient, r, and associated p-values are presented and indicate statistically significant correlation between measured and chironomid-inferred mean July air temperatures at all study sites. The arrows indicate the climate normals (mean 1960–1999).

| Lake | Number of reconstructions | r | p |
| --- | --- | --- | --- |
| Njulla | 19 | 0.390 | 0.05 |
| Lake 850 | 23 | 0.365 | 0.10 |
| Vuoskkujavri | 14 | 0.350 | 0.05 |
| Alanen | 24 | 0.370 | 0.05 |

Although the text reports that all the p-values are below 0.05, the figure reports that Lake 850 has a p-value greater than this threshold. However, since Lake 850’s correlation is comparable to those of the other lakes and its number of reconstructed values is relatively high, this must be an error.

It is easy to recalculate the p-value from the correlations and the number of reconstructed values. Using a one-sided test (which is reasonable as we expect a positive correlation), Lake 850 has a p-value below 0.05, but Vuoskkujavri has a p-value of 0.11. The other two lakes have p-values below 0.05.
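The recalculation is straightforward: convert each correlation to a t statistic with t = r·√(n−2)/√(1−r²) and look up the one-sided tail probability on n−2 degrees of freedom. A sketch in Python, using only the standard library (the tail probability of Student’s t is obtained by numerical integration of its density rather than from scipy):

```python
import math

def t_sf(t, df, steps=20000, upper=60.0):
    """One-sided tail probability P(T > t) for Student's t, by trapezoidal
    integration of the density from t out to a point where the tail is negligible."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    h = (upper - t) / steps
    total = 0.0
    for k in range(steps + 1):
        x = t + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * c * (1 + x * x / df) ** (-(df + 1) / 2)
    return total * h

def one_sided_p(r, n):
    """One-sided p-value for a Pearson correlation r from n pairs."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return t_sf(t, n - 2)

for lake, r, n in [("Njulla", 0.390, 19), ("Lake 850", 0.365, 23),
                   ("Vuoskkujavri", 0.350, 14), ("Alanen", 0.370, 24)]:
    print(f"{lake}: p = {one_sided_p(r, n):.3f}")
```

Run on the figure’s values, this reproduces the pattern described above: Lake 850 slips under 0.05 while Vuoskkujavri, with only 14 reconstructed values, does not.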

Intrigued by these discrepancies, and noticing some other oddities, such as Lake 850 having a reconstruction but no assemblage for 1999, I want to recalculate the correlations. I also really want to see a plot of reconstructed against instrumental temperature, not just plots against time. It is time to use xyscan again.

Reconstructed against instrumental temperatures for four lakes near Abisko

With the digitised data, I can only replicate the reported correlation for one of the lakes. One of the lakes has far worse correlation than reported, and one far better.

| Lake | r |
| --- | --- |
| Njulla | 0.04 |
| Lake 850 | 0.45 |
| Vuoskkujavri | 0.69 |
| Alanen | 0.37 |

Correlations are not the most difficult of statistics to calculate (most people can estimate them fairly well by eye), so if the correlations have been blatantly miscalculated, I am disinclined to trust the reconstructions, as these involve more complicated calculations (which the first author has been known to get wrong – see, for example, the corrigendum to Larocque-Tobler et al (2015)). Since the chironomid assemblage data have not yet been archived (hope springs eternal), I cannot tell whether the reconstructions are correct.

A few weeks ago, I wrote to the first author, advising them of the problems with this and several other papers, and suggesting that they should correct or retract non-reproducible papers and archive data for the remainder so that the community can be assured that the results are reproducible. I have had no reply.

Three of the lakes studied by Larocque and Hall are also analysed by Bigler and Hall (2003) who reconstruct July air temperature from diatom assemblages. They do not quantify the agreement between the reconstruction and the instrumental data smoothed with a 13-year running mean, but write in the abstract that the reconstructions “correspond in general closely with the meteorological records”. The results section is still optimistic, but lists several differences. It is not clear from the method section whether the lakes for which reconstructions were made were included in the calibration set. If so, there is a lack of independence between the reconstruction and the observed temperature. Because of the 13-year smooth, this paper falls outside the scope of my review.

## The ‘New York’ principle of site selection

If I can make it there,
I’ll make it anywhere.
It’s up to you, New York, New York.

Palaeoecologists typically try to choose sites where the environmental variable they want to reconstruct is likely an important, ideally the most important, variable determining microfossil assemblages in the past. If other environmental variables are important, the basic assumptions of transfer functions risk being violated and the reconstruction may be spurious, driven by the other variables.

5. Other environmental variables than the one(s) of interest (Xf) have had negligible influence on Yf during the time window of interest, or the joint distribution of these variables with the variable(s) of interest in the past was the same as today, or their effect on Yf did not lead to past changes in assemblage states resembling shifts indicative, in the modern environment, of changes in the variable of interest

Most palaeoecologists also try to minimise non-analogue problems by choosing sites that are similar to those in the calibration set that the transfer function uses.

These two site-selection guidelines make Speke Hall Lake, a polluted eutrophic lake near Liverpool, a curious lake to choose to try to reconstruct July air temperature from chironomid head capsules using the Norwegian chironomid calibration set. But this is what Lang et al (2017) have done. They find a statistically significant correlation between the reconstruction and instrumental records of July temperature from Anglesey (r = 0.620; n = 16; p = 0.01) and declare that

This study demonstrated that a chironomid-based temperature inference model can produce reliable estimates of mean July air temperature, even from a lake that has experienced large changes in heavy metal and sulphur inputs, and trophic status.

Or in other words, if you can reconstruct temperature in Speke Hall Lake, you can reconstruct temperature anywhere.

I would not be so hasty to ignore the assumptions of transfer functions, lest we exemplify the “sick science” problem (curiously, Juggins (2013) is not cited despite its relevance). Given the enormous ecological, chronological, and taphonomic difficulties that high-resolution chironomid reconstructions face (insurmountable at annual resolution, challenging at decadal resolution), I would judge it far more likely that the reported correlation is due to chance than that everything we know about the limitations of transfer functions is wrong. No single study at p = 0.01 is going to change my mind (you can find homoeopathy studies with lower p-values), and the review of high-resolution reconstructions that I am writing shows there are serious problems with many of the ten sub-decadal chironomid-temperature reconstructions that I have found.

I am entirely happy to ascribe the key result from Speke Hall Lake to chance, but there are some other aspects of the paper which merit attention.

### Chronology

Lang et al use the constant rate of supply (CRS) model to produce a chronology from their 210Pb data. The CRS model is

$t = \frac{1}{\lambda}\ln\frac{A(0)}{A}$

where A(0) is the total unsupported 210Pb inventory and A is the inventory below the sample being dated. The resulting age-depth model will always be monotonic, as the inventory below the sample being dated will always decline with depth. The CRS model shown in Lang et al is not monotonic (fig 1b).
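The monotonicity falls straight out of the formula. A quick sketch with invented inventory values (the 22.3-year half-life of 210Pb is the only real number here):

```python
import math

DECAY = math.log(2) / 22.3  # 210Pb decay constant; half-life 22.3 years

# Hypothetical unsupported 210Pb inventory remaining below each depth
# (arbitrary units), listed from the sediment surface downwards.
# A_below[0] is the total inventory A(0).
A_below = [10.0, 6.0, 3.5, 2.0, 1.0]

# CRS age at each depth: t = (1/lambda) * ln(A(0)/A)
ages = [math.log(A_below[0] / A) / DECAY for A in A_below]

# Because A can only decline downwards, the ages can only increase downwards.
print([round(a, 1) for a in ages])
```

Any wiggle in a published CRS age-depth curve therefore means something other than the plain CRS model is being plotted.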

Lang et al Fig 1. Speke Hall Lake location (a), chronology (b), and core matching  with magnetic susceptibility measurements (c).

From the timing of the impossible wiggle, it looks like the 137Cs peak from atmospheric bomb testing might have been included as an age rather than a check on the CRS model. I hope this is simply a plotting problem and that the ages of the chironomid samples are unaffected.

The 210Pb dates are on a different core from the chironomid samples. The chronology is transferred to the chironomid stratigraphy by aligning the magnetic susceptibility record. The overall agreement between the two mag sus records is excellent (Fig 1c), but the details are not perfectly reproduced. Since these details are used to align the records, there will inevitably be some error in the alignment. It is not clear from the paper if this uncertainty is accounted for (even an error of 2 years would seriously degrade the expected correlation between the reconstruction and the instrumental record).
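To get a feel for how sensitive the expected correlation is to a small alignment error, here is a toy simulation: a synthetic annual series with modest year-to-year persistence (the persistence of 0.3 is an assumption for illustration, not an estimate for Speke Hall), correlated against a copy of itself offset by two years.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
temp = np.empty(n)
temp[0] = rng.normal()
for i in range(1, n):
    # AR(1) series: modest persistence between consecutive years (assumed 0.3)
    temp[i] = 0.3 * temp[i - 1] + rng.normal()

def corr_with_offset(k):
    """Correlation between the series and a copy misaligned by k years."""
    return float(np.corrcoef(temp[:n - k], temp[k:])[0, 1])

print(corr_with_offset(0), corr_with_offset(2))
```

Even for this perfect “proxy”, a two-year misalignment reduces the correlation from 1 to roughly the lag-2 autocorrelation of the climate itself, which for weakly persistent annual data is close to zero.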

### The ordination

Lang et al do a constrained ordination and find that their variables explain 68% of the variance in the chironomid stratigraphy. This seems impressive until you realise that they used seven predictor variables and have fourteen fossil samples. Given the strong autocorrelation, especially in the geochemical variables, I suspect this result is little better than chance. Had 13 variables been used, they would have explained 100% of the variance!
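The chance-explanation point is easy to illustrate. A sketch using plain least squares rather than CCA (the inflation of explained variance with the number of predictors works the same way in both):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 14  # number of fossil samples, as in Lang et al

def chance_r2(k, reps=500):
    """Mean R^2 from regressing pure noise on k random predictors."""
    vals = []
    for _ in range(reps):
        X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
        y = rng.normal(size=n)
        fitted = X @ np.linalg.lstsq(X, y, rcond=None)[0]
        ss_res = np.sum((y - fitted) ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        vals.append(1 - ss_res / ss_tot)
    return float(np.mean(vals))

print(chance_r2(7))   # ~7/13: over half the variance explained by chance alone
print(chance_r2(13))  # 1.0: a perfect fit carrying no information at all
```

With 14 samples, seven entirely random predictors explain on average about 54% of the variance of pure noise, so 68% from seven autocorrelated predictors is not much to boast about.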

Lang et al Fig. 5. Canonical correspondence analysis (CCA) for the upper sections (1932–2005) of the Speke Hall record. Anglesey is the July temperature data.

Note that in the ordination the temperature arrow is inversely correlated with most of the pollution indicators.

### Reconstruction diagnostics

Lang et al include some reconstruction diagnostics: a plot of residual squared distances and a timetrack plot. Unfortunately, they conflate their residual squared distances (goodness-of-fit) with analogue quality, making it difficult to be sure of what they have done. It is possible to have fossil samples that have excellent analogues (short squared chord distance) in the calibration set but a poor goodness-of-fit, and vice versa. What I would like to have seen is a plot of fossil abundances against calibration set abundances.

### Interpreting the correlation

There is a strong trend in the instrumental temperature data (r2 = 0.5) and the assemblage composition is autocorrelated. It would therefore seem prudent to correct the p-value of the correlation between the reconstruction and the instrumental record for autocorrelation. Of course, with only 16 fossil data points covered by the Anglesey record, this will be difficult, but the corrected p-value is bound to be higher.
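A rough sketch of the kind of correction I have in mind, using the Bretherton et al. (1999) effective sample size for a correlation between two autocorrelated series. The lag-1 autocorrelations of 0.5 are invented for illustration – the real values would have to be estimated from the two series:

```python
import math

# Reported correlation and sample size from Lang et al
r, n = 0.620, 16

# Hypothetical lag-1 autocorrelations of the reconstruction and the
# instrumental record (assumed values, not estimates from the paper)
r1_recon, r1_instr = 0.5, 0.5

# Bretherton et al. (1999): effective sample size for a correlation
n_eff = n * (1 - r1_recon * r1_instr) / (1 + r1_recon * r1_instr)

# t statistics with the naive and the effective sample sizes
t_naive = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
t_eff = r * math.sqrt(n_eff - 2) / math.sqrt(1 - r * r)
print(f"n_eff = {n_eff:.1f}, t (naive) = {t_naive:.2f}, t (corrected) = {t_eff:.2f}")
```

Under these assumed autocorrelations, the effective sample size drops from 16 to under 10 and the t statistic shrinks accordingly, so the corrected p-value would be well above the reported 0.01.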

The apparent inverse correlation between the temperature and pollution indicators could also help to inflate the correlation between the reconstruction and the instrumental record.

The correlation with the longer CET series is only 0.25. No explanation for this much weaker (and non-significant) correlation is given.

### Final questions for the authors

Had the correlation between the reconstruction and the instrumental record not appeared significant, would you have published (and would the editors/reviewers have let you publish) a paper that could be summarised as ‘unpromising ponds cannot be used for high-resolution climate reconstructions’? I wonder if there are any failed high-resolution reconstructions decorating the interiors of filing cabinets.

As I have started asking in all my reviews: where are the data going to be archived?

## Pattern obfuscation of ocean pH

I noticed that my blog had been cited by a couple of papers, so I went to have a look.

Albert Parker has a paper in Nonlinear Engineering. I’m sure this journal wasn’t chosen for the relevant expertise of the editors and usual pool of reviewers. More likely the converse: Parker (2016) is neither good nor original, a Gish gallop of a paper, recycling bad ideas from climate denialist blogs.

The paper returns repeatedly to the Monterey Bay pH time series, contrasting this series with the Hawaii Ocean Time-series (HOT) from the North Pacific gyre. Parker prefers the Monterey Bay series, which does not show a decline in pH, to the HOT data, which show significant acidification (or, as Parker would probably prefer, dealkalinisation).

Whereas the HOT data were explicitly designed to test for ocean acidification and  changes to the carbon inventory, the Monterey Bay series has a more immediate goal. The Monterey Bay Aquarium measures the chemistry of the water being pumped into their tanks from an intake 17m below the sea surface in the bay. Most of the time, the water at this depth is well-oxygenated surface water, but during upwelling events, cold, hypoxic, low-pH water is drawn into the pipe. The correlation between oxygen concentration and pH is high – 0.675 according to this Masters thesis which examines the time series in detail. The upwelling events, which occur more frequently in summer, add a large amount of variability to the data, obscuring any trend in ocean surface pH. The Monterey Bay pH series is in the perfect location if you want to avoid killing your fish with hypoxic water, but an utterly useless place if you want to test for ocean acidification: pH trends could be caused by ocean acidification or changes in upwelling. A better place would have much simpler hydrography, and so a much higher signal to noise ratio, perhaps an ocean gyre north of Hawaii. Parker complains that the IPCC ignores the Monterey Bay series. Of course they do. Anyone who uses the lack of trend in the Monterey Bay series as evidence against ocean acidification is clueless.

A couple of years ago Mike Wallace produced some excitement on climate denier blogs with a compilation of all historical pH measurements into a global mean which showed variability on multidecadal time scales. The problem is that the seasonal and geographical coverage of the pH measurement varies over time. Imagine if temperature data were most abundant in summer in northern Europe in the 1920s and then more common in east Asia and in winter in the 1990s, and then back to Europe after 2000, and that climatologists simply took the mean of the raw data. Obviously this would be junk: it is no less junk when done with pH measurements. Parker is unable or unwilling to understand this point. He has even read – or at least he cites – my blog, writing that I excuse the dismissal of the historical data because of “poor sampling protocols and non-gridded data”.  I think he should read it again.
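The problem is easy to demonstrate with two invented regions whose pH never changes:

```python
# Two regions with constant pH (no trend anywhere), differing only in mean
PH_A, PH_B = 8.10, 8.05  # invented values for illustration

def raw_global_mean(year):
    """Unweighted mean of all measurements when the sampling effort
    shifts gradually from region A (early) to region B (late)."""
    frac_b = (year - 1920) / 79  # fraction of samples from region B
    return (1 - frac_b) * PH_A + frac_b * PH_B

print(raw_global_mean(1920), raw_global_mean(1999))
```

The raw mean drifts from 8.10 to 8.05 over eighty years even though neither region changed at all: the “trend” is pure sampling artefact. Gridding and weighting exist precisely to stop this from happening.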

Parker is not a fan of the HOT time series. pH was measured for two periods; between these measurements, pH was estimated by a “model”. The horror, the horror.

Closing the significant gaps with computer model results or extending the time range to 1988 by computer model results is not a legitimate procedure when there is no validation for the models. No model can be trusted if not validated first against accurate measurements, and if the measurements are not available or not accurate, than the use of the model is pointless. The time series of the actually measured pH does not satisfy the minimal quality requirements to infer any trend.

Since the HOT data are archived, we can, for the very first time ever, validate the “model” against the measurements.

Calculated against measured pH. The r2 is 0.85.

I think this “model” can be trusted. Nevertheless, Parker, who should not be trusted, ignores the model data, but finds a significant decline in pH of −0.00157±0.00015 pH/year even though his model ignores the strong seasonality in the data. He wants to add extra uncertainty for measurement error: I don’t think he understands how linear models work. Because of some non-linearity in the trend, Parker decides that a 60-year sine wave would be a better model. Not on statistical grounds of course. Certainly not on physical grounds. His main attraction to the model seems to be that sine waves produce oscillations and that oscillations imply natural cycles.
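For what it is worth, ignoring seasonality does not necessarily bias a linear trend, but it does inflate the residual variance and hence the trend’s standard error. A sketch with synthetic monthly pH (all numbers invented, chosen only to resemble the magnitudes in play):

```python
import numpy as np

rng = np.random.default_rng(0)
months = np.arange(240)  # 20 years of monthly observations
t = months / 12.0        # time in years
# invented pH series: weak downward trend + strong annual cycle + noise
y = (8.10 - 0.0016 * t + 0.05 * np.sin(2 * np.pi * months / 12)
     + rng.normal(0, 0.01, months.size))

def slope_and_se(X):
    """OLS slope on the second column (time) and its standard error."""
    coef, ss_res, *_ = np.linalg.lstsq(X, y, rcond=None)
    s2 = ss_res[0] / (len(y) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return coef[1], float(np.sqrt(cov[1, 1]))

X_lin = np.column_stack([np.ones(240), t])
X_seas = np.column_stack([X_lin,
                          np.sin(2 * np.pi * months / 12),
                          np.cos(2 * np.pi * months / 12)])

slope_lin, se_lin = slope_and_se(X_lin)
slope_seas, se_seas = slope_and_se(X_seas)
print(f"trend only:        {slope_lin:+.5f} ± {se_lin:.5f} pH/yr")
print(f"trend + seasonal:  {slope_seas:+.5f} ± {se_seas:.5f} pH/yr")
```

Under these assumptions the trend estimate barely moves, but its standard error shrinks several-fold once the seasonal cycle is modelled – which is why fitting a trend without seasonal terms, and then fretting about measurement error on top, misses the point.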

There’s more. Much more. But it doesn’t get any better. For example

The measured surface CO2 from the NASA OCO-2 project differ considerably from the NOAA/PMEL estimated ocean CO2 and pH as proposed by Takahashi et al (2014)

Parker (2016) spent 10 months in review and revision. And still it is dross. But Parker obviously enjoyed the experience of publishing in Nonlinear Engineering as he has since published a second paper there under his alternative name Albert Boretti.

## Palaeo papers can be retracted

Via Retraction Watch, I find that the conjecture, cherished by so many, that, unlike all other fields of science, there is absolutely no misconduct in palaeoecology may need revising ever so slightly, as L & O have retracted Zou et al (2017).

The article by Youjia Zou, Xiangying Xi, and Chaoyang Zhang entitled “Southward migrations of the Atlantic Equatorial Currents during the Younger Dryas” (doi 10.1002/lno.10529) published in Limnology & Oceanography has been retracted by journal Editor in Chief, Robert W. Howarth, the Association for the Sciences of Limnology & Oceanography, and Wiley Periodicals, Inc. Author Zhang had no affiliation with the Georgia Institute of Technology as claimed, and the authors have been unable to provide information that would allow verification of critical aspects of how the research was conducted, leaving the journal unable to verify the intellectual integrity of the data and work.

Left pondering what “critical aspects of the research” are in question, since forgetting which institute one works for hardly seems like a hanging offence, I’ve read the paper. It is perhaps the only “high-resolution (∼ 75 yr/sample)” isotope stratigraphy with only about 20 samples in the last 14000 years. I’m sure the authors meant to write “low-resolution (∼ 750 yr/sample)”, but keyboards are such tricky things.

Zou et al report oxygen isotope, Mg/Ca and calcareous dinocyst data from two cores in the Atlantic, one on either side of the thermal equator. I would have expected a sentence like ‘cores SAU1702 and SAD1006 were collected from HMS Pinafore in 2027 with a Glew corer’, but it’s not there. I cannot find anything about these cores via Google either.

The acknowledgements should surely give some hint: who funded ship time; who gave access to cores; who paid the salaries of the palynologist and geochemists. Quick scroll to the end of the paper. No acknowledgements. Oh.

The methods section is thin and confusing, almost like the authors don’t really know what they did:

Stable isotope analyses for the cores were mainly based on foraminifers G. ruber (white variety) from the > 255 μm size fraction, using approximately 50 (80 for replicated samples) gently crushed shells, splitting into aliquots for Mg/Ca and δ18O analyses, and transferring to clean vials. Any visible coarse grains were removed prior to transfer to the vials. Samples were wet-sieved using a 75 μm mesh. Sediment from each core interval was dried overnight at ∼ 50°C, and then disaggregated in ultra clean water for 6–8 h on a shaker table.

This is a recipe to bake a loaf of bread, then knead the dough, and finally mix in water.

Some of the referencing is innovative. Take, for example,

autotrophic dinoflagellates whose distributions in geography are controlled primarily by seawater temperature and nutrient availability, more in cooler waters but less in warmer waters [Madsen et al. 2001; Barker et al. 2009])

Neither Madsen et al nor Barker et al have anything to say about dinocysts, calcareous or otherwise. This is a paper that should never have passed peer review.

But what of other papers by the authors? Zou (who, except for a coauthor, seems to be the only person using the affiliation “Department of Meteorology and Oceanography, Shanghai Maritime University”) and Xi, from the Management Faculty at Wuhan University of Technology, have many talents. A previous paper models tropical Atlantic currents during the Little Ice Age. The text on the model set-up reads thus:

an oceanic general circulation model (OGCM) proposed by Kim et al. (2004) has been employed and modified after considering the geometry of the continental shelf and bottom topography.

That is it. I can find no information in the paper about how the model was forced. Contrast this with the first paragraph of the model description in Kim et al (2004).

[7] The model used for this study is the Japan Marine Science and Technology Center (JAMSTEC) OGCM [Ishida et al., 1998], based on Modular Ocean Model version 2 [Pacanowski, 1995]. It covers a global domain except for the Arctic Ocean extending from 75°S to 75°N, and has realistic coastline and bottom topography based on the National Geophysical Data Center data set (ETOPO5). The model has a horizontal resolution of 0.25° both in longitude and latitude and has 55 levels in the vertical. The vertical grid spacing increases smoothly from 10 m at the surface to about 50 m near 500 m, about 70 m near 1000 m depth, and about 400 m at 6000 m. Data of the upper 32 levels (0–1007.29 m) are used for this study.

Zou and Xi also have papers, also lacking acknowledgements, on the capsizing of bulk carriers carrying nickel ores, and on the motions of anchored Capesize ships (Capesize – my new-word-of-the-day – ships are really big, too huge to fit through the Suez Canal). There are not many people who can count tiny calcareous dinocysts and numerically model gigantic ships.

## The chironomid triennial

Over at the Subfossil Chironomid group on Facebook, Dr Larocque-Tobler posted a link to Zhang et al. (2017), describing it as impressive. I hadn’t seen the published version of Zhang et al, so I popped over fully prepared to be impressed.

Zhang et al report a transfer function for reconstructing July air temperature from chironomid assemblages using a 100-lake calibration set from the south-eastern margin of the Tibetan Plateau. The authors then apply this transfer function to a high-resolution chironomid stratigraphy from the high-elevation (3900 m) Tiancai Lake. They report that the correlation between the reconstruction and the instrumental record from Lijiang (55 km away and 1500 m lower) is statistically significant (r = 0.45, p < 0.05, n = 31).
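The reported significance can be checked from r and n alone, using the standard t-test for a Pearson correlation. A minimal sketch in Python – note that this test assumes the 31 annual observations are independent, which annually resolved climate series rarely are:

```python
from math import sqrt

r, n = 0.45, 31                              # values reported by Zhang et al
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)  # t-statistic with n - 2 df

# The two-sided 5 % critical value for t with 29 degrees of freedom is
# about 2.05, so t ≈ 2.71 is indeed significant at p < 0.05 -- provided
# the years really are independent observations.
print(round(t_stat, 2))  # → 2.71
```

So the arithmetic behind the headline p-value is at least internally consistent; the question is what series it was computed against.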

This is the key figure.

Zhang et al Figure 6. (a) Chironomid-based mean July temperature (MJT) reconstruction results from Tiancai Lake based on two transfer function models: the solid black line is the reconstruction based on the weighted-average partial least squares (WA-PLS) bootstrap model with two components and the dashed black line is the reconstruction based on the weighted-average with inverse deshrinking (WAinv) bootstrap model. Red solid line is the instrumental data from Lijiang weather station, corrected applying the lapse rate and solid grey line is the three-sample moving average of the data set. Reconstruction of diagnostic statistics for the 100 lake data set where (b) displays the goodness-of-fit statistics of the fossil samples with MJT. Dashed lines are used to identify samples with “poor fit” (> 95th percentile) and “very poor fit” (> 90th percentile) with temperature [note: the percentiles are the wrong way round and should be poor fit >90th and very poor fit >95th]. (c) Nearest modern analogues for the fossil samples in the calibration data set, where the dashed line is used to show fossil samples with “no good” (5 %) modern analogues. (d) Percentage of chironomid taxa in fossil samples that are rare in the modern calibration data set (Hill’s N2< 2). (e) Comparison between the chironomid-based transfer function reconstructed trends (represented by MJT anomalies) with the instrumental data from Lijiang weather station (in red solid line, with three-sample moving average). The black solid line represents the reconstruction based on the WA-PLS bootstrapped model with two components using 100-lake calibration set.

The problem is obvious once you have found it – it helped immensely to have some archived data. Compare the grey line in panel a with the red line in panel e. These are both supposed to be the three-sample moving average of the temperature data from Lijiang (lapse-rate corrected in panel a; anomalies in panel e) and should therefore have exactly the same shape, but the resemblance is limited. (Panel a was not shown in the Climate of the Past Discussions paper, so the reviewers at least can be absolved of any responsibility for not noticing this.)

Using the archived data from Dropbox, I can confirm that panel a is correct, except that the curve should extend back before 1960, as the instrumental temperature series starts in 1951. It took a while to work out what the authors had done in panel e: at least a couple of minutes.

If, instead of taking the three-year moving average, you take every third year, you get a plot that looks almost exactly like panel e. Interpolate the triennial data and the correlation with the reconstruction is very similar to what the authors report. This is obviously a very lucky error: nobody thinks that chironomids are only sensitive to July temperature in every third year. Worse, this analysis implicitly assumes that the chironomids can predict temperature up to two years ahead! Now that would be impressive.
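The difference between the two smoothers is easy to demonstrate. A minimal sketch in Python with synthetic data (I am not redistributing the Lijiang series here, so the numbers are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(1951, 1982)               # 31 years, as in Zhang et al
temp = 15 + rng.normal(0, 0.8, years.size)  # stand-in annual July temperatures

# What the caption promises: a three-sample moving average,
# centred on years[1:-1]
moving_avg = np.convolve(temp, np.ones(3) / 3, mode="valid")

# What panel e appears to show: every third year, linearly
# interpolated back to annual resolution
triennial = np.interp(years, years[::3], temp[::3])

# The interpolated series passes exactly through every third observation;
# the moving average does not, so the two series differ
assert np.allclose(triennial[::3], temp[::3])
assert not np.allclose(moving_avg, triennial[1:-1])
```

Correlate a reconstruction against one series or the other and you will, in general, get different answers – which is the crux of the problem.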

Reconstruction from Zhang et al (black) and every-third-year temperature data from Lijiang (red). Both series are presented as anomalies. Cf. panel e above.

With the promised three-year moving average, the correlation is much weaker (r = 0.21, p = 0.28). I am not impressed.