Featured image: A medic looks at a patient who has shown positive symptoms for COVID-19 at an isolation ward in Hyderabad, March 10, 2020. Photo: PTI.
When it comes to COVID-19, the gulf between what we would really like to know and what is routinely measured is vast. The temptation to twist what data we have to fit into a convenient framework is both natural and dangerous. Nothing illustrates this better than the narratives that have taken shape around COVID-19 fatality rates in India.
As the epidemic grew, public discussion shifted gradually from how fast the virus was spreading to how many patients were dying. This shift is understandable as deaths mount, but it coincided with a new narrative. Local and central governments were pointing to India’s relatively low death toll and saying, “Yes, disease is widespread but not so many are dying.” In late May, the Union health ministry claimed, dubiously, that India’s “fatality rate” was the lowest in the world. The purported reasons were sometimes given explicitly: “timely lockdown”, “early case detection”, “strategic intervention”, etc. There were also hints of exceptionalism – a suggestion that somehow Indians might be less at risk of dying if they contracted the virus. None of these claims were backed by any credible data.
CFR, IFR and all that
As the focus shifted to fatalities, the government and its supporters were quoting one statistic more and more often: the total recorded deaths as a percentage of ‘cases’, i.e. the total recorded infections. This number is known popularly as the case fatality rate (CFR), but strictly speaking it is an approximation of the true CFR, which is precisely this quantity but calculated at the end of an epidemic, when we know the outcomes of all cases. There are a number of ways of improving the approximation but that is another story.
Dig through the news archives and you will find that comparisons of CFR between localities have been used to make misleading claims about the “success” and “failure” of COVID-19 control strategies. The government has also used CFR, problematically, as evidence of how well cases are being detected. Most of all, it has often been treated as if it represents the likelihood of dying if you contract COVID-19.
However, the fraction of COVID-19 infections that result in death is not CFR. It is by definition IFR – the infection fatality rate of the disease. We don’t know this important number for India, although estimates from around the world place the average value of this number between 0.5% and 1%.
We’d love to know and track IFR as even small decreases in its value would mean many lives saved. The available data tells us that IFR should depend heavily on the age-structure of a population and the levels of comorbidities, and more marginally on other factors like quality of medical care. Yes, despite a weaker healthcare system, a young population like India’s very likely does mean a lower IFR than in Europe or North America.
COVID-19’s CFR is a very poor approximation of COVID-19’s IFR because, simply put, both the numerator and the denominator are inaccurate. The denominator – the number of cases – is generally a small fraction of the actual number of infections, most of which go unrecorded. The numerator – the number of recorded deaths – may also only be a fraction of the actual number of deaths.
There are systemic problems with death registration in India. There is also extensive evidence of deaths being undercounted – in several Indian states, and in other countries too. And yet COVID-19 death undercounting is often ignored or, worse still, relegated to a deceptively simple caveat like “barring a few undiagnosed deaths”.
To further complicate matters, there is a good chance that the errors in the numerator and the denominator that distinguish CFR from IFR change over time. Evidence from evolving outbreaks often points to case detection and death detection declining as an epidemic grows.
Thus competing, time-varying errors, all hard to measure, make CFR a poor substitute for IFR. Low case detection means we overestimate the fatality rate; possible death undercounting, and delays between a case being recorded and a death being recorded, mean we underestimate it. And how these errors add up varies dramatically between locations and over time.
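To see how sharply these errors can pull CFR away from IFR, consider a short sketch in Python. All the numbers here are entirely hypothetical, chosen only to illustrate the arithmetic, not drawn from any real dataset:

```python
# Hypothetical illustration: how under-detection distorts CFR relative to IFR.
# Every figure below is made up for demonstration purposes.

true_infections = 1_000_000
true_ifr = 0.005                                    # assume 0.5% of infections are fatal
true_deaths = true_infections * true_ifr            # 5,000 deaths

case_detection = 0.05                               # only 5% of infections recorded as cases
death_detection = 0.60                              # only 60% of deaths recorded as COVID-19

recorded_cases = true_infections * case_detection   # 50,000 recorded cases
recorded_deaths = true_deaths * death_detection     # 3,000 recorded deaths

crude_cfr = recorded_deaths / recorded_cases        # 0.06

print(f"True IFR:     {true_ifr:.1%}")
print(f"Apparent CFR: {crude_cfr:.1%}")             # 6.0% - twelve times the true IFR
```

With these assumed detection rates, the crude CFR overstates the true IFR twelvefold, because missing cases inflate the ratio far more than missing deaths deflate it. Different assumptions can just as easily push the distortion the other way.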
Narratives around CFR
While CFR is hard to interpret, the discussion around it is often so careless that even the term itself gets obscured. Often it is mislabelled as the mortality rate, which actually refers to the total deaths as a fraction of the population. More subtly, and perhaps most misleadingly, it is frequently just called “fatality rate”.
Terminology may not seem like the highest priority during a pandemic – but misnaming CFR is not necessarily neutral. Doing so could give this statistic a weight it does not have. For example, just calling CFR “fatality rate” legitimises both the numerator and the denominator. It suggests that we should take recorded cases as being representative of all infections and recorded deaths as being representative of all deaths.
Once legitimised in this way, CFR becomes a dangerous metric. States with low but rapidly rising CFR are “failing”, even though this behaviour is natural in the early stages of an epidemic. Even worse, states which bring down CFR by undercounting fatalities become the new heroes.
Can CFR ever be useful?
So should we junk all discussion of CFR? No. If we acknowledge its limitations, it can be quite useful. If CFR behaves in unexpected ways, we should ask why. For example, a falling CFR during an epidemic goes against our expectations, so we should respond with further scrutiny using modelling and journalistic approaches, not clap our hands. A drop could be the result of several causes, acting separately or together:
- Deaths being increasingly underreported
- Testing picking up an increasing fraction of infections
- A genuine drop in IFR caused, for example, by better treatment of severe cases, or the virus spreading through a younger population
Then again, we shouldn’t accept any of these as the cause of a dropping CFR without proper investigation. For example, if better case detection is responsible, we should also see concomitant changes in the testing numbers and in metrics like the test positivity rate¹.
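That sanity check is simple to carry out on weekly testing data. A minimal sketch, using made-up weekly figures purely for illustration:

```python
# Hypothetical weekly testing figures, invented to illustrate the check.
tests_per_week = [10_000, 20_000, 40_000]
positives_per_week = [2_000, 2_800, 3_600]

for tests, positives in zip(tests_per_week, positives_per_week):
    positivity = positives / tests
    print(f"{tests:>6} tests -> positivity {positivity:.0%}")

# Rising testing alongside falling positivity (20% -> 14% -> 9%) is
# consistent with improving case detection; flat or rising positivity
# alongside a falling CFR should invite more scepticism.
```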
If a part of the story is a demographic shift in the virus’s spread, then we should look for age-related data to back this up. This reason would be most likely in contexts where age-wise segregation is common. For example, in Europe, we saw disease sweeping through care homes (and then sometimes being controlled, leading to probable fluctuations in the IFR).
If anyone claims that hospitals are recording fewer deaths by providing better care, we should check whether the deaths that did occur were at least delayed. Notably, no such delays were seen in Mumbai, discrediting a claim that a sustained fall in Mumbai’s CFR was a consequence of better medical care.
It would be great to see news reports and analyses that refer to CFR correctly and also acknowledge its complexities. If any writer or reporter is discussing CFR, they should consider possible changes in case detection rates and whether COVID-19 deaths are accurately classified and recorded as such. Comparisons between regions should generally be avoided, unless there is evidence that the regions are in similar stages of the epidemic, and have similar testing rates and response strategies, and similarly efficacious death registration systems.
CFR should, in effect, be used as a signal that points to stories worth investigating – and not be treated as the story itself.
Murad Banaji is a mathematician with an interest in disease modelling.
1. The ratio of ‘positive’ tests to total tests