The potential for using palaeoclimate reconstructions to validate climate models is a popular argument in proposals for funding research – if models can accurately predict past climate, our confidence in their ability to predict future climate should be enhanced.
Klein et al (2013), currently under review at Climate of the Past Discussions, compare mid-Holocene Arctic sea-ice reconstructions with several climate models from the PMIP3 project. They conclude that the models have little or no skill. A model has skill if it reproduces the reconstructed sea-ice concentration better than a baseline null model; in this case, the baseline is one of no change between the mid and late Holocene.
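To make the idea of skill against a null model concrete, here is a minimal sketch. It assumes the common RMSE-ratio form of a skill score; the exact metric used by Klein et al may differ. The function name `skill_score` and all the numbers are made up for illustration.

```python
import math

def skill_score(model, null, recon):
    """Skill of `model` relative to `null`, both judged against `recon`.

    1 means a perfect match, 0 means no better than the null model,
    and negative values mean worse than the null model.
    """
    n = len(recon)
    rmse_model = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, recon)) / n)
    rmse_null = math.sqrt(sum((b - o) ** 2 for b, o in zip(null, recon)) / n)
    return 1.0 - rmse_model / rmse_null

# Hypothetical sea-ice concentration anomalies (percentage points), not real data.
recon_anomaly = [5.0, -10.0, 0.0, 12.0]
model_anomaly = [3.0, -4.0, 2.0, 6.0]
no_change = [0.0] * len(recon_anomaly)  # null model: no change since the mid-Holocene

print(skill_score(model_anomaly, no_change, recon_anomaly))
```

Note that this form treats the reconstructions as error-free: every mismatch is charged to the model.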
I would not be in the least surprised if the models’ predictions of mid-Holocene sea-ice concentration had low skill:
- the difference in climate forcings between the mid and late Holocene is small, at least relative to the difference since the Last Glacial Maximum, or that expected over the coming century
- the models have a hard time reproducing the dramatic decline in sea-ice concentration during the last decade
But before writing off the models’ sea-ice predictions, we should consider the alternative possibility, that the reconstructions have little or no skill. The 18 reconstructions are all based on dinocyst assemblages using a transfer function. It would be easy to point out that the uncertainty on the reconstructions is probably 50-100% higher than the figure of 11% reported by Klein et al, because of the uneven distribution of observations along the environmental gradient, autocorrelation and pseudoreplication, and incomplete sampling of the dinocysts’ niches.
However, estimating the real uncertainty on these reconstructions needs more thought than this. As the mid-Holocene values used are anomalies, we need to consider the uncertainty in both the modern and the mid-Holocene estimates of sea-ice concentration. If these uncertainties are independent, they need to be added in quadrature. However, if they are not independent, but instead have a shared site-specific error, then the uncertainty in the anomalies may be reduced. Further reductions in the uncertainty might accrue because the reconstructions used are the mean of several individual reconstructions within 500 years of the 6000 yr BP target. If these reconstructions are at least partially independent, the uncertainty on the mean will be smaller than on any of the individual reconstructions.
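The effects described above can be sketched with standard error-propagation formulas. This is only an illustration: the correlation values and the choice of four reconstructions per site are assumptions of mine, not figures from Klein et al, and the function names are hypothetical.

```python
import math

def anomaly_sd(sd_modern, sd_past, corr=0.0):
    """SD of the anomaly (past minus modern) when the two errors have correlation `corr`.

    corr = 0 gives simple addition in quadrature; a shared site-specific
    error makes corr positive and shrinks the anomaly uncertainty.
    """
    return math.sqrt(sd_modern ** 2 + sd_past ** 2 - 2.0 * corr * sd_modern * sd_past)

def mean_sd(sd, n, rho=0.0):
    """SD of the mean of n estimates with equal SD and pairwise correlation `rho`."""
    return math.sqrt(sd ** 2 * (1.0 + (n - 1) * rho) / n)

sd = 11.0  # reported reconstruction uncertainty, in percent

print(anomaly_sd(sd, sd))            # independent errors: larger than 11
print(anomaly_sd(sd, sd, corr=0.5))  # half the error variance shared: back to 11
print(mean_sd(sd, n=4))              # four independent reconstructions: halved
print(mean_sd(sd, n=4, rho=1.0))     # fully dependent reconstructions: no gain
```

Depending on the assumed correlations, the anomaly uncertainty can end up well above or well below the headline figure, which is why it is hard to pin down.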
It may be possible to quantify some of these uncertainty multipliers, others perhaps not, so for simplicity I am going to assume that all these factors magically cancel out, and that the real uncertainty is 11%.
The mean of the mid-Holocene sea-ice concentration anomalies is 0%, and the standard deviation is 9%. All but four of the eighteen reconstructions have anomalies of less than 11% and only one exceeds (just) twice this. The reconstructions are consistent with an assumption of no change in sea-ice concentration since the mid-Holocene. It is difficult to make good estimates when the uncertainty is large relative to the change.
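As a quick check on that statement, the sketch below compares the observed counts with what a no-change world would give if the reconstruction errors were normally distributed with a standard deviation of 11%. The normality assumption is mine.

```python
import math

def frac_within(k):
    """Fraction of a normal distribution within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

n_sites = 18

# Expected number of sites with anomalies inside 11% and 22% if nothing changed.
expected_within_1sd = n_sites * frac_within(1)
expected_within_2sd = n_sites * frac_within(2)

# Observed counts, taken from the text: all but four, and all but one.
observed_within_1sd = n_sites - 4
observed_within_2sd = n_sites - 1

print(round(expected_within_1sd, 1), observed_within_1sd)
print(round(expected_within_2sd, 1), observed_within_2sd)
```

The expected counts are about 12 and about 17, against 14 and 17 observed, so the anomalies are, if anything, slightly less scattered than pure noise at 11% would produce.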
Surprisingly, Klein et al ignore the reconstruction uncertainty, essentially assuming that the reconstructions are correct, and that any errors lie in the modelled sea-ice concentrations.
Hargreaves et al (2013), which is cited by Klein et al, introduce a method for determining model skill when there are uncertainties in the reconstruction, and note that the difference between the model and reconstructions should be at least as large as the error on the reconstructions. Curiously, five of the eleven models tested by Klein et al have a model-reconstruction difference that is smaller than the reconstruction uncertainty. One of these has been forced by the reconstructions, so its difference should be smaller, but the others? Hargreaves et al suggest that this can only happen if the reconstruction errors have been overestimated (we can neglect the possibility that the models have been tuned to these data).
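A sketch of a skill score that allows for reconstruction uncertainty is given below. It is written in the spirit of Hargreaves et al, subtracting the summed error variance from both the model and null mismatches; the exact form should be checked against their paper. The function name and all numbers are hypothetical.

```python
import math

def skill_with_uncertainty(model, null, recon, recon_sd):
    """Skill relative to `null`, discounting the reconstruction error variance.

    Returns None when the model or the null model is closer to the
    reconstructions than their stated errors allow, since the score is
    then undefined.
    """
    err2 = sum(e ** 2 for e in recon_sd)
    model_excess = sum((m - o) ** 2 for m, o in zip(model, recon)) - err2
    null_excess = sum((b - o) ** 2 for b, o in zip(null, recon)) - err2
    if model_excess < 0 or null_excess <= 0:
        return None
    return 1.0 - math.sqrt(model_excess / null_excess)

# Hypothetical anomalies (percentage points), not real data.
recon_anomaly = [5.0, -10.0, 0.0, 12.0]
model_anomaly = [3.0, -4.0, 2.0, 6.0]
no_change = [0.0] * 4

print(skill_with_uncertainty(model_anomaly, no_change, recon_anomaly, [3.0] * 4))
print(skill_with_uncertainty(model_anomaly, no_change, recon_anomaly, [11.0] * 4))
```

With small stated errors the score is well defined. With errors of 11 on anomalies of this size, even the no-change baseline sits closer to the reconstructions than the errors permit, and the score cannot be computed.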
I read Klein et al expecting to conclude that their results were flawed because the reconstruction uncertainty was underestimated, but instead I find that the results can only be explained if the reconstruction uncertainty is overestimated.
One possibility is that the inclusion of reconstructions from south of the sea-ice margin in the models is biasing the results. These sites might have an uncertainty that is lower than the overall 11% uncertainty because of the uneven sampling of the calibration set. Alternatively, taking the mean of several individual reconstructions might be responsible for reducing the uncertainties.
Final score for the dinocyst-model derby: I think it is a draw, probably nil-nil, as neither has clearly demonstrated skill. The main conclusion is that estimating the uncertainty on palaeoclimate reconstructions is both critical – model-data comparisons are meaningless without good uncertainty estimates – and hard.
Klein, F, Goosse, H, Mairesse, A & de Vernal, A (2013) Model-data comparison and data assimilation of mid-Holocene Arctic sea-ice concentration. Clim. Past Discuss., 9, 6515–6549 doi:10.5194/cpd-9-6515-201