Why what we think we know is wrong

by | Apr 12, 2020 | Latest News

 

A recent Guardian article explained why many of the data that are reported about the spread of Coronavirus are bound to be wrong.

As discussed in previous posts – here, for example – a difficulty when studying data from the Coronavirus pandemic is the reliability and completeness of the data. This is especially true when looking at the number of confirmed cases, since this measure depends very much on the protocol used for testing. However, it’s also true for the number of fatalities. The following tweet leads to a thread by James Tozer of the Economist who has collated evidence from journalists across Europe that suggests the number of deaths due to COVID-19, at least indirectly, might be around double the officially reported number.

The thread is also available in complete form here.

The analysis is pretty much summarised in the following graphic:

The diagrams take different forms, but in each case there’s a black or grey level that corresponds to the number of deaths that would be expected in that period based on previous years’ data. The red level then shows the number of reported deaths due to COVID-19 in that period this year. And the pink region shows the total number of deaths due to any cause, again in the same period this year.
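To make the three levels concrete, here is a minimal sketch of the arithmetic behind the diagrams. The function name and the counts are hypothetical, chosen purely for illustration; they are not taken from the Economist's data.

```python
def decompose_deaths(expected, reported_covid, total):
    """Split one period's deaths into the levels described in the diagrams.

    expected       -- deaths expected from previous years (black/grey level)
    reported_covid -- officially reported COVID-19 deaths (red level)
    total          -- deaths from any cause this year (pink region)
    """
    excess = total - expected                    # deaths above the usual level
    unreported_excess = excess - reported_covid  # excess not in official COVID-19 counts
    # Ratio of excess deaths to officially reported COVID-19 deaths;
    # undefined when no COVID-19 deaths were reported.
    ratio = excess / reported_covid if reported_covid > 0 else float("nan")
    return {
        "excess": excess,
        "unreported_excess": unreported_excess,
        "excess_to_reported_ratio": ratio,
    }


# Hypothetical counts for a single region and period (illustration only).
example = decompose_deaths(expected=1000, reported_covid=500, total=2000)
print(example)
```

A ratio of about 2 is what "around double the officially reported number" would look like in these terms.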

And it’s similar in the United States. The following graph – provided on Twitter by @Tangotiger – shows the excess number of deaths per month in New York, compared to the long-term average, over a period from 2000 onwards.

There is a large spike in September 2001 caused by the 9/11 tragedy, but there is a much larger spike for March/April 2020. Yet only around 60% of that excess is due to officially recorded COVID-19 related causes. So what explains the other 40%? It’s too large to be explained by random variation – compare its size to the variations that you see in other months over the same period – so it must be due to some specific effect in 2020, for which the only plausible explanation is COVID-19. That’s not to say that all of these deaths are directly attributable to the Coronavirus, though almost certainly many are of people who were infected but never tested. Others, though, are likely to be people who died from illnesses that in normal circumstances would have been treatable with medical support.
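The "too large to be random variation" argument can be sketched as a simple comparison of the unexplained excess against the month-to-month scatter in ordinary months. Everything below is an assumption made for illustration: the function name, the historical values and the size of the spike are invented, and only the 60/40 split mirrors the figures quoted above.

```python
import statistics


def unexplained_excess_z(history, spike_excess, official_covid):
    """Compare the unexplained part of a spike with ordinary monthly variation.

    history        -- monthly excess deaths in ordinary months (hypothetical)
    spike_excess   -- total excess deaths in the spike period
    official_covid -- part of that excess officially recorded as COVID-19
    """
    unexplained = spike_excess - official_covid
    mean = statistics.mean(history)
    sd = statistics.stdev(history)  # sample standard deviation of ordinary months
    z = (unexplained - mean) / sd   # how many standard deviations from the usual level
    return unexplained, z


# Invented monthly excesses that scatter around zero, as ordinary months would.
history = [-120, 80, 40, -60, 150, -90, 30, -30]

# Invented spike in which 60% of the excess is officially attributed to COVID-19.
spike_excess = 10_000
official_covid = 6_000

unexplained, z = unexplained_excess_z(history, spike_excess, official_covid)
print(f"Unexplained excess: {unexplained}, z-score: {z:.1f}")
```

With numbers of this shape, the unexplained 40% sits many standard deviations above anything seen in ordinary months, which is the sense in which random variation is ruled out. A real analysis would also need to allow for seasonality and long-term trends.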

So, as with all data that are generated by this pandemic, what we think we know about fatality counts is almost certainly wrong. A reasonable rule of thumb is to take the officially published numbers and double them.

 

Stuart Coles

Author

I joined Smartodds in 2004, having previously been a lecturer in Statistics at universities in the UK and Italy. A famous quote about statistics is that “Statistics is the art of lying by means of figures”. In writing this blog I’m hoping to provide evidence that this is wrong.