The evaluation of reconstruction diagnostics is an essential part of the process of generating palaeoenvironmental reconstructions from microfossil assemblages using transfer functions. If the reconstruction diagnostics are bad, we should be especially cautious about interpreting the reconstruction. The problem is that “good” and “bad” are not well defined and we rely on various rules of thumb to guide us.

Since the chironomid-based reconstruction of August air temperature from Lake Żabińskie presented by Larocque-Tobler et al. (2015; hereafter LT15) is so remarkably good, it should be an interesting case for testing how well reconstruction diagnostics work.

LT15 use analogue quality as their main diagnostic method:

For the combined transfer function, to determine whether the modern calibration models had adequate analogues for the fossil assemblages, the modern analogue technique (MAT) was performed using C2 (Juggins, 2005), with squared chord distance as the dissimilarity coefficient (DC) (Overpeck et al., 1985). Confidence intervals were based on minimum DC distance within the calibration sets (Laing et al., 1999). Fossil assemblages above the 95% confidence interval were considered to have no analogues in the calibration set; samples between 75% and 95% were considered to have fair analogues (Francis et al., 2006).

This text from LT15 is not entirely clear – what confidence intervals? Time to read Francis et al. (2006).

In order to determine whether the modern calibration model had adequate analogs for the fossil assemblages, modern analog testing (MAT) was performed using the computer program C2, with squared chord distance as the dissimilarity coefficient (Overpeck et al., 1985). Confidence intervals were based on minimum DC distance within the calibration set following Laing et al. (1999). Fossil assemblages above the 95% confidence interval were considered to have no analogs in the calibration set, samples between 75% and 95% were considered to have fair analogs.

I get a slight sense that I’ve read this before somewhere. I’m sure it is just a coincidence. But having read it twice, I understand what is being done. The squared-chord distance between each fossil sample and its closest analogue in the calibration set is being compared with the 75th and 95th percentiles of the distribution of distances between each calibration-set sample and its nearest neighbour.
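As a minimal sketch of that comparison, here is the same logic in base R on invented toy data. The function `sq_chord()` and the `calib` and `fossil` matrices are hypothetical and are not taken from LT15. The sketch assumes the distance is computed on proportions.

```r
# Squared chord distance between two assemblages given as percentages
# (hypothetical helper; converts to proportions first)
sq_chord <- function(x, y) {
  sum((sqrt(x / 100) - sqrt(y / 100))^2)
}

# Invented toy data: rows are samples, columns are taxa (percentages)
calib <- rbind(
  c(50, 30, 20,  0),
  c(45, 35, 20,  0),
  c(10, 10, 40, 40),
  c( 5, 15, 40, 40),
  c(25, 25, 25, 25)
)
fossil <- rbind(
  c(48, 32, 20,   0),  # resembles the first two calibration samples
  c( 0,  0,  0, 100)   # unlike anything in the calibration set
)

# Distance from each calibration sample to its nearest neighbour,
# excluding the sample itself
n <- nrow(calib)
calib_nn <- sapply(seq_len(n), function(i) {
  others <- setdiff(seq_len(n), i)
  min(sapply(others, function(j) sq_chord(calib[i, ], calib[j, ])))
})

# Thresholds: 75th and 95th percentiles of those nearest-neighbour distances
thresholds <- quantile(calib_nn, probs = c(0.75, 0.95))

# Distance from each fossil sample to its closest calibration sample
fossil_nn <- apply(fossil, 1, function(f) {
  min(apply(calib, 1, function(cal) sq_chord(f, cal)))
})

# Classify each fossil sample against the thresholds
quality <- cut(fossil_nn,
               breaks = c(-Inf, thresholds, Inf),
               labels = c("Good", "Fair", "None"))
print(data.frame(distance = fossil_nn, quality = quality))
```

With only five calibration samples the percentiles are very coarse. The toy data are there to show the mechanics, not realistic thresholds.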

LT15 report that

No sample had chironomid assemblages outside the 95% confidence interval suggesting that the transfer function can be applied to the downcore samples.

but don’t show this with a figure (no complaint here, I would do the same if the analogues were good). I want to see a figure showing the analogue distances.

First we need to load the data, which can be downloaded from NOAA.

    library(readxl)

    # Workbook downloaded from NOAA
    fname <- "zabinskie2015cit.xls"
    excel_sheets(fname)

    # read_excel() returns tibbles; convert to data.frames so row names can be set
    spp <- as.data.frame(read_excel(fname, sheet = "Training species"))
    env <- as.data.frame(read_excel(fname, sheet = "Training temperature"))
    fos <- as.data.frame(read_excel(fname, sheet = "Chironomids Zabinsk percentages"))
    recon <- as.data.frame(read_excel(fname, sheet = "Reconstruction "))
    names(recon) <- c("date", "temperature")

    # Use the first column (sample identifiers) as row names
    rownames(spp) <- spp[, 1]
    spp[, 1] <- NULL
    rownames(env) <- env[, 1]
    env <- env[, 2, drop = FALSE]

    # Drop the samples listed in lowCount from both calibration tables
    lowCount <- c("SAL", "LEK", "TRZ", "WAS", "SZOS", "GOR", "KOS", "ZAB")
    spp <- spp[!rownames(spp) %in% lowCount, ]
    env <- env[!rownames(env) %in% lowCount, , drop = FALSE]
    identical(rownames(spp), rownames(env))
    env <- env$Temp

    # First column of the fossil sheet is the chronology;
    # drop it and the last column to leave the taxa
    chron <- fos[, 1]
    fos <- fos[, -c(1, ncol(fos))]

    #### check names ####
    setdiff(names(fos), names(spp))
    setdiff(names(spp), names(fos))

Distances to the nearest analogue are easily calculated with the rioja package, which should give the same result as C2.

    library(rioja)
    library(ggplot2)

    # Fit MAT to the calibration set and find analogues for the fossil samples
    matmod <- MAT(spp, env)
    matpred <- predict(matmod, fos)

    # 75th and 95th percentiles of the calibration-set nearest-neighbour distances
    goodpoorbad <- quantile(matmod$dist.n[, 1], probs = c(0.75, 0.95))

    # Background bands marking the analogue-quality classes
    qualitybands <- data.frame(
      xmin = rep(-Inf, 3),
      xmax = rep(Inf, 3),
      ymax = c(goodpoorbad, Inf),
      ymin = c(-Inf, goodpoorbad),
      fill = factor(c("Good", "Fair", "None"), levels = c("None", "Fair", "Good"))
    )
    fillscale <- scale_fill_manual(
      values = c("salmon", "lightyellow", "skyblue"),
      name = "Analogue Quality"
    )

    # Distance from each fossil sample to its nearest analogue, plotted against date
    g <- ggplot(data.frame(chron, analogue = matpred$dist.n[, 1])) +
      geom_point(aes(x = chron, y = analogue)) +
      labs(x = "Date CE", y = "Squared chord distance to nearest analogue") +
      geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = fill),
                qualitybands, alpha = .5) +
      fillscale
    print(g)

Thirty-eight of the 89 samples have no modern analogues under the definition used by LT15. Only 11 samples have good analogues. This is difficult to reconcile with the claim by LT15 that

No sample had chironomid assemblages outside the 95% confidence interval

The reconstruction of August air-temperature in LT15 is remarkably good, almost as good as what would be expected if chironomids were perfect thermometers, yet the analogue quality is fairly awful (squared residual length is also fairly awful). Does this mean that these diagnostics are utterly useless guides to the utility of reconstructions? Or is this another remarkable feature of the Lake Żabińskie chironomid reconstruction?

Perhaps some ordinations would be useful to investigate what is going on. I’ll show some in a future post.


I have been fiddling with this all day, trying to replicate this example of analogue matching with a different core I have been working on, when I realized there were two issues that you may need to consider: 1) the number of rare taxa in the fossil dataset, and 2) the transformation of the dataset. I assume that in the Polish lakes paper some kind of transformation was applied, but in your example you do not apply any transformation to the data. When I use this script and add a Hellinger distance transformation to the data, it has a much better analogue for the dataset than is shown. Likewise, if I remove rare taxa it has a better analogue. While even with those two factors considered there are non-analogue intervals, it is quite a bit different from what is shown by the example output. I remember quite some time ago you posted a different kind of chord distance analogue matching example, which I have used in the past (it's not as pretty as this visual though). That example also gives a different output for these imported data, with the difference in how it computed the quantile range with “quantile(paldist(spp)”

Rare taxa inclusion rules should not make that much difference to the distance calculations, unless an inner join is used when comparing the calibration and fossil data.
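To illustrate why the join matters, here is a small base-R sketch with invented taxa and proportions. Everything in it is hypothetical, including `sq_chord()`, `pad()` and the taxon names. It does not show what any particular package does internally.

```r
# Squared chord distance on proportions (hypothetical helper)
sq_chord <- function(x, y) sum((sqrt(x) - sqrt(y))^2)

# Invented samples: TaxonD occurs only in the fossil sample
calib_sample  <- c(TaxonA = 0.60, TaxonB = 0.40, TaxonC = 0.00)
fossil_sample <- c(TaxonA = 0.30, TaxonB = 0.20, TaxonD = 0.50)

# Pad a sample with zeros so that it covers a given set of taxa
pad <- function(x, taxa) {
  out <- setNames(numeric(length(taxa)), taxa)
  out[names(x)] <- x
  out
}

# Outer join: keep the union of taxa, treating absences as zero
all_taxa <- union(names(calib_sample), names(fossil_sample))
d_outer <- sq_chord(pad(calib_sample, all_taxa), pad(fossil_sample, all_taxa))

# Inner join: keep only shared taxa, so TaxonD silently drops out
shared <- intersect(names(calib_sample), names(fossil_sample))
d_inner <- sq_chord(calib_sample[shared], fossil_sample[shared])

print(c(outer = d_outer, inner = d_inner))
```

In this toy case the inner join ignores half of the fossil assemblage, so the sample looks much closer to the calibration sample than it is.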

Of course, you can choose what data transformation to use, but that changes the distance metric used. Untransformed data will give the squared-chord distance by default with `rioja`, but of course other distance metrics could be used as well.
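A small sketch of why the transformation changes the metric, using invented proportions and hypothetical helpers `sq_chord()` and `hellinger()`. The Hellinger transformation takes the square root of relative abundances. Squared Euclidean distance on Hellinger-transformed data therefore equals squared chord distance on the untransformed proportions. Applying the squared chord distance on top of a Hellinger transformation takes the square root twice.

```r
# Hypothetical helpers operating on a single sample
sq_chord  <- function(x, y) sum((sqrt(x) - sqrt(y))^2)
hellinger <- function(x) sqrt(x / sum(x))

# Invented assemblages, expressed as proportions
a <- c(0.70, 0.20, 0.10, 0.00)
b <- c(0.10, 0.20, 0.30, 0.40)

# Squared chord distance on untransformed proportions
d_raw <- sq_chord(a, b)

# Squared Euclidean distance on Hellinger-transformed data: same value
d_euclid_hell <- sum((hellinger(a) - hellinger(b))^2)

# Squared chord distance on Hellinger-transformed data: a fourth-root
# comparison, which is a different metric on a different scale
d_double <- sq_chord(hellinger(a), hellinger(b))

print(c(raw = d_raw, euclid_hellinger = d_euclid_hell, double_sqrt = d_double))
```

Because the distances are on a different scale, values from the two approaches are not directly comparable. The percentile thresholds would need to be recomputed with the same metric.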

`paldist(spp)` will return all distances (including the zero distances on the diagonal). I would normally use the lower triangle of this, but here the authors are explicit that they only consider the minimum distances, so this is what I extract from the `MAT` output.
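The difference between the two sets of distances can be sketched in base R on simulated data. This assumes that `dist()` on square-root-transformed proportions is an acceptable stand-in for a squared-chord distance matrix. The simulated `calib` matrix is invented and has nothing to do with the Polish calibration set.

```r
set.seed(1)

# Invented calibration set: 30 samples x 6 taxa, rows rescaled to sum to 1
raw <- matrix(rexp(30 * 6), nrow = 30)
calib <- raw / rowSums(raw)

# Squared chord distance = squared Euclidean distance on sqrt(proportions)
d <- as.matrix(dist(sqrt(calib)))^2

# All pairwise distances: lower triangle only, so the zero diagonal is excluded
all_pairs <- d[lower.tri(d)]

# Nearest-neighbour distances: smallest off-diagonal value in each row
diag(d) <- Inf
nearest <- apply(d, 1, min)

q_all     <- quantile(all_pairs, probs = c(0.75, 0.95))
q_nearest <- quantile(nearest,   probs = c(0.75, 0.95))
print(rbind(all_pairs = q_all, nearest = q_nearest))
```

Thresholds taken from all pairwise distances are far more lenient than those taken from nearest-neighbour distances. The same fossil sample can therefore be classed quite differently depending on which distribution is used.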
