Simplistic and Dangerous Models

A few weeks ago there were none. Three weeks ago, with an entirely inadequate search strategy, ten cases were found. Last Saturday there were 43! With three inaccurate data points, there is enough information to fit an exponential curve: the prevalence is doubling every seven days. Armchair epidemiologists should start worrying that by Christmas there will be 10¹² preprints relating COVID-19 to weather and climate unless an antidote is found.

Fortunately, there is a first version of an antidote to one form of the preprint plague that is sweeping the planet, known as the SDM (apparently this is not an acronym for Simplistic and Dangerous Model). A second version is due to be published soon.

So why are SDMs such a bad tool for trying to model the spread of COVID-19?

1) The system is not at equilibrium

The first case of what has become known as COVID-19 was reported on 17th November 2019 in Wuhan. The weather was a pleasant 17°C. An SDM fitted to the occurrences of COVID-19 at this time would show it was an extreme specialist, with a preferred temperature of 17°C. Of course it would be absurd to fit an SDM.

[Figure: animated_cases]

By mid-January, the virus had spread across China and an SDM would find that Chinese winter temperatures were ideal. By mid-March, outbreaks were developing in Europe and America – places with good transport links to China and each other – and an SDM would tell a different story. A month later, there are COVID-19 outbreaks in almost all countries. An SDM fitted now to presence-absence data would show no evidence of a niche.
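The sequence above can be mimicked with a toy simulation. This is only a sketch under invented assumptions, not the analysis behind this post: the regions, their temperatures, the travel-link weights and the `simulate_spread` function are all hypothetical, and temperature deliberately has no effect on spread. Even so, the apparent "preferred temperature" of the occupied regions starts as a spike at 17°C and then broadens until nothing resembling a niche is left.

```python
import random
import statistics

def simulate_spread(n_regions=200, n_steps=12, seed=1):
    """Spread presence between regions by travel alone; temperature has no effect."""
    rng = random.Random(seed)
    # Hypothetical regions: a mean temperature (deg C) and a travel-link weight in [0, 1)
    temps = [rng.uniform(-10.0, 30.0) for _ in range(n_regions)]
    links = [rng.random() for _ in range(n_regions)]
    temps[0] = 17.0  # the single seed region, at 17 deg C as in the text
    infected = {0}
    history = []
    for step in range(n_steps):
        present = [temps[i] for i in sorted(infected)]
        # Record the "niche" a naive model would see: mean and spread of occupied temperatures
        history.append((step, len(present),
                        statistics.mean(present), statistics.pstdev(present)))
        # Arrival depends only on travel links and on how widespread the disease already is
        pressure = min(1.0, 0.05 + 2.0 * len(infected) / n_regions)
        newly = [i for i in range(n_regions)
                 if i not in infected and rng.random() < links[i] * pressure]
        infected.update(newly)
    return history

for step, n, mean_t, sd_t in simulate_spread():
    print(f"step {step:2d}: {n:3d} regions, mean {mean_t:5.1f} C, sd {sd_t:4.1f} C")
```

Any model fitted to an early step of this output would report a narrow thermal optimum, even though the simulation contains no climatic effect at all.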

We can visualise this by fitting a GAM to the presence of >5 cases of COVID-19 in each country/region for each date in the Johns Hopkins dataset. This is similar to the data Araujo & Naimi (2020) used in the first version of their preprint.

[Figure: covid-gam]
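For readers who want to try something similar, here is a minimal sketch of how the presence response could be built: a region counts as present on a date when it has more than 5 cumulative cases. The case counts and region names are made up, and `presence_by_date` is a hypothetical helper rather than the code used for the figure. It only prepares the 0/1 response; fitting a smooth of presence against temperature for each date is a separate step.

```python
from datetime import date

THRESHOLD = 5  # presence means more than 5 cumulative cases, as in the text

def presence_by_date(cases):
    """Map {region: {date: cumulative_cases}} to {date: {region: 0 or 1}}."""
    out = {}
    for region, series in cases.items():
        for day, count in series.items():
            out.setdefault(day, {})[region] = int(count > THRESHOLD)
    return out

# Made-up cumulative counts for three hypothetical regions
cases = {
    "A": {date(2020, 1, 15): 40, date(2020, 3, 15): 900},
    "B": {date(2020, 1, 15): 0, date(2020, 3, 15): 120},
    "C": {date(2020, 1, 15): 5, date(2020, 3, 15): 6},
}
presence = presence_by_date(cases)
for day in sorted(presence):
    print(day, presence[day])
```

Region C shows why the threshold matters: exactly 5 cases counts as an absence, and 6 as a presence.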

Only for a short time period is there an apparent climatic niche in the distribution of COVID-19 cases. It would be foolish to assert that that brief window gives us insight into the climatic niche of COVID-19 rather than the volume of passenger transport from China and subsequently other countries with outbreaks as the disease spreads.

Almost 40,000 cases of COVID-19 in Brazil testify to the utter failure of the prediction in Araujo & Naimi (2020) that tropical climates are less favourable for the spread of COVID-19.

For simplicity, I fitted the GAM using February mean temperatures from CHELSA. If you think it would make any difference to use observed temperatures rather than a climatology, you are missing the point.

Problems with fitting SDMs to invasive species, where the range is not in approximate equilibrium with climate, are well known. Anyone working in the field should be aware of these issues.

2) Microclimate

Temperature and humidity can affect the lifetime of droplets and the time that virus particles on surfaces remain viable. But the 2 m outside air temperature is a poor predictor of the actual conditions where many people are exposed to infection. It’s all about microclimate: many people are inside most of the time, especially when it is cold outside.

Qian et al (2020) examined outbreaks of COVID-19 in China (excluding Hubei) that could be traced to their source. They found that only one of 318 outbreaks involved outdoor transmission. Most outbreaks occurred at home. One proviso about this study: it might be more difficult to trace transmissions that occurred outside, causing an undercount.

3) Ignores other factors

Climate and seasonality will probably play some role in COVID-19 dynamics, as they do in influenza for reasons that are not entirely clear. Several hypotheses have been suggested for influenza, including the influence of seasonality on human behaviour (more time inside in cold weather) and on our immune systems. But while they may play some role, climate and seasonality are unlikely to be the dominant factors controlling COVID-19 transmission. The initial spread around the world reflects patterns of human movement. The number of infections in a location depends on human behaviour, including policies to minimise spread, implemented successfully (South Korea) or less so (United Kingdom). A standard SDM takes none of these factors into consideration, nor any other component of an epidemiological model.

Dangerous

So, certainly simplistic. But what about dangerous?

If the effect of climate and seasonality on COVID-19 is small, and policy makers are influenced by these preprints to prematurely relax (or belatedly apply) restrictions on movement and businesses, people will die.

Conclusions

Araujo & Naimi (2020) is not the only preprint using an SDM to try to find a relationship between COVID-19 and climate. Harbert et al (2020) use a somewhat more cautious approach, concluding that while they cannot exclude an influence of climate, it is less important than population size in explaining county-level COVID-19 statistics in the United States. Bariotakis et al (2020) use the full suite of 19 WorldClim bioclimatic variables as predictors in MaxEnt. They find …. Actually, I don’t really care what they find as the premise is nonsense so any result can only be an artefact.

Having a bag of hammers does not make every problem a nail.

3 Responses to Simplistic and Dangerous Models

chaamjamal says:

21/04/2020 at 1:28 am

Thank you for this injection of rationality into a season of research gone mad.

ecoquant says:

27/04/2020 at 5:09 pm

Reblogged this on hypergeometric and commented:
Nice to see Generalized Additive Models used.

Eli Rabett says:

01/06/2020 at 12:25 am

Thanks
