COVID-19: Five Common Statistics Errors – and How to Avoid Them

Featured image: Covid-19 cases in New York, United States (9.04.2020) from the Center for Systems Science and Engineering (CSSE) at JHU. Photo: KOBU Agency/Unsplash, (CC BY-SA)

If we don’t analyse statistics for a living, it’s easy to be taken in by misinformation about COVID-19 statistics on social media, especially if we don’t have the right context.

For instance, we may cherry-pick statistics supporting our viewpoint and ignore statistics showing we are wrong. Even with the right numbers in hand, we still need to interpret them correctly.

It’s easy for us to share this misinformation. Many of these statistics are also interrelated, so misunderstandings can quickly multiply.

Here’s how we can avoid five common errors, and impress friends and family by getting the statistics right.

1. It’s the infection rate that’s scary, not the death rate

Social media posts comparing COVID-19 to other causes of death, such as the flu, imply COVID-19 isn’t really that deadly.

But these posts miss COVID-19’s infectiousness. Before turning to that, we need a fair measure of how deadly it is: the infection fatality rate (IFR), which is the number of COVID-19 deaths divided by the number of all those infected (a number we can only estimate at this stage; see also point 3 below).

While the jury is still out on the exact figure, COVID-19 has a higher IFR than the flu. Posts implying a low IFR for COVID-19 almost certainly underestimate it. They also miss two other points.

First, if we compare the typical flu IFR of 0.1% with the most optimistic COVID-19 estimate of 0.25%, then COVID-19 remains more than twice as deadly as the flu.

Second, and more importantly, we need to look at the basic reproduction number (R₀) for each virus. This is the number of extra people one infected person is estimated to infect.

Flu’s R₀ is about 1.3. Although COVID-19 estimates vary, its R₀ sits around a median of 2.8. Because of the way infections grow exponentially (see below), the jump from 1.3 to 2.8 means COVID-19 is vastly more infectious than flu.
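To see why that gap matters, here is a minimal sketch in Python. It assumes a deliberately simplified model in which every case infects exactly R₀ others in each generation, with no immunity and no public health measures. The function name and the chosen generations are illustrative only, and real outbreaks do not follow such a clean pattern.

```python
def cases_in_generation(r0: float, generation: int) -> float:
    """New infections in a given generation, starting from one case.

    Simplifying assumption: each case infects exactly r0 others,
    with no immunity and no interventions.
    """
    return r0 ** generation


FLU_R0 = 1.3    # approximate value quoted in the article
COVID_R0 = 2.8  # median estimate quoted in the article

for generation in (1, 5, 10):
    flu = cases_in_generation(FLU_R0, generation)
    covid = cases_in_generation(COVID_R0, generation)
    print(f"generation {generation}: flu ~{flu:,.0f}, COVID-19 ~{covid:,.0f}")
```

Under these assumptions, by the tenth generation one flu case has led to about 14 new infections, while one COVID-19 case has led to nearly 30,000. The R₀ values differ by a factor of about two, but the outcomes differ by a factor of thousands.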

When you combine all these statistics, you can see the motivation behind our public health measures to “limit the spread”. It’s not only that COVID-19 is deadly; it’s both deadly and highly infectious.

2. Exponential growth and misleading graphs

A simple graph might plot the number of new COVID cases over time. But as new cases might be reported erratically, statisticians are more interested in the rate of growth of total cases over time. The steeper the upwards slope on the graph, the more we should be worried.

For COVID-19, statisticians look to track exponential growth in cases. Put simply, if left unrestrained, each COVID case leads to more cases, which in turn lead to still more. This gives us a graph that tracks slowly at the start, but then sharply curves upwards with time. This is the curve we want to flatten, as shown below.

“Flattening the curve” is another way of saying “slowing the spread”. The epidemic is lengthened, but we reduce the number of severe cases, causing less burden on public health systems. Photo: The Conversation/CC BY ND
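The effect of slowing the spread can be sketched with a little arithmetic. The Python example below assumes a constant daily growth rate in total cases, which is a simplification, and the 20% and 10% rates are hypothetical figures chosen for illustration rather than estimates for any real outbreak.

```python
import math


def projected_cases(initial_cases: float, daily_growth: float, days: int) -> float:
    """Total cases after `days`, assuming a constant daily growth rate."""
    return initial_cases * (1 + daily_growth) ** days


def doubling_time(daily_growth: float) -> float:
    """Days needed for total cases to double at a constant daily growth rate."""
    return math.log(2) / math.log(1 + daily_growth)


# Hypothetical scenarios: 100 initial cases, followed for 30 days.
for label, rate in (("unrestrained", 0.20), ("slowed", 0.10)):
    total = projected_cases(100, rate, 30)
    print(f"{label}: ~{total:,.0f} cases after 30 days, "
          f"doubling every {doubling_time(rate):.1f} days")
```

Halving the growth rate in this sketch does far more than halve the total after 30 days, because the difference compounds every day. This is why a small change in the slope of the curve matters so much.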

However, social media posts routinely compare COVID-19 figures with those of other causes of death that show: