Does Bihar’s COVID-19 Data Seem Too Good To Be True?

02/11/2020

A medical professor puts on PPE as his wife looks on, at their home in Bhagalpur, Bihar, July 2020. Photo: Reuters/Danish Siddiqui.

The COVID-19 situation in Bihar has improved lately, but in July and August things were bad. There were reports of hospitals at breaking point, for example in Patna and Bhagalpur. A large number of healthcare workers were reportedly infected with COVID-19, and many were dying. There were also reports of dysfunctional crematoria and bodies burnt in the open.

However, such reports were at odds with Bihar’s relatively low total count of COVID-19 cases and (especially) deaths. How could this be?

In a previous article, I examined two possible explanations for Bihar’s COVID-19 data: slow spread of disease, or poor disease surveillance. Preliminary reports of seroprevalence data from Bihar, which I examined later, suggest a dominant role for the second explanation: weak disease surveillance. The data paints a stark picture of very poor detection of infections, and probably very many missed deaths. It also challenges the narrative of slower spread of the coronavirus in rural areas.

The results are from six districts – Arwal, Begusarai, Buxar, Madhubani, Muzaffarpur and Purnia – surveyed as part of the second national seroprevalence survey in late August. These results were not widely reported. A Google search for key terms and direct searches on news websites on October 31 found no English-language reports of the results. A few Hindi outlets, including Dainik Bhaskar, had reported the findings around October 12.

The limited coverage is perplexing, given the high interest in Bihar in the run-up to the state’s legislative assembly elections, currently underway, and given how inconsistent the data is with the NDA narrative of Bihar’s successful handling of COVID-19.

Bihar versus Chhattisgarh

Bihar’s headline figures from the survey were as follows: About 16% of the population in the surveyed districts had developed antibodies to SARS-CoV-2 by late August, implying about 34 lakh infections in these districts. By the end of August, roughly 24,000 cases had been reported from these districts – which means only around 1 out of every 140 infections had been detected. Moreover, the six surveyed districts had recorded only 72 deaths. This yields a naïve infection fatality rate (IFR) – the ratio of recorded COVID-19 deaths to estimated infections – of about 0.002%. Put another way, only about 1 out of every 47,000 of those infected had died – officially.
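The arithmetic behind these headline figures can be checked in a few lines. The sketch below uses only the rounded numbers quoted above (34 lakh estimated infections, 24,000 reported cases, 72 recorded deaths); the function and variable names are illustrative and are not taken from any published analysis.

```python
LAKH = 100_000

def infections_per_reported_case(estimated_infections, reported_cases):
    """Estimated infections for each case picked up by testing."""
    return estimated_infections / reported_cases

def naive_ifr_percent(recorded_deaths, estimated_infections):
    """Naive IFR: recorded deaths as a percentage of estimated infections."""
    return 100 * recorded_deaths / estimated_infections

# Rounded figures quoted in the article for Bihar's six surveyed districts.
bihar_infections = 34 * LAKH
bihar_cases = 24_000
bihar_deaths = 72

print(round(infections_per_reported_case(bihar_infections, bihar_cases)))  # 142
print(round(naive_ifr_percent(bihar_deaths, bihar_infections), 4))         # 0.0021
print(round(bihar_infections / bihar_deaths))                              # 47222
```

With rounded inputs the results land on roughly 1 detected infection in 140, a naïve IFR of about 0.002%, and about 1 recorded death per 47,000 infections.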

These numbers are best viewed in contrast to numbers from nearby Chhattisgarh. After early success in controlling its epidemic, by mid-September Chhattisgarh had become a COVID hotspot recording around 3,000 new cases daily. The Indian Council of Medical Research carried out a seroprevalence survey in 10 districts in the second half of September: Baloda Bazar, Balrampur, Bilaspur, Janjgir Champa, Jashpur, Korba, Mungeli, Rajnandgaon, Raipur and Durg. After correcting for district populations, an estimated 6.7% of the people in these districts had reportedly developed antibodies to SARS-CoV-2. This amounts to roughly 10 lakh individuals.

By that time, the surveyed districts had reported a total of 66,000 cases, meaning that about 6.4% of the infections had been detected in testing – almost 10 times higher than in Bihar. This figure drops to 4.6% if the predominantly urban districts of Durg and Raipur are omitted.

According to official figures, how many of those infected in Chhattisgarh were dying? Around the time of the survey there had been nearly 600 deaths in the surveyed districts. This gives a naïve IFR of about 0.06% – which is comparable to that of Delhi. The epidemic was still raging, and total deaths in these districts rose by over 20% during the following week, so taking delays in reporting deaths into account could push IFR values up. If we omit Raipur and Durg from the calculations, the naïve IFR goes down to 0.03%, suggesting that although infections were detected almost equally in rural and urban districts, death recording could have been weaker in the more rural districts of Chhattisgarh.

Taking reported deaths at face value, Chhattisgarh’s death rate was 30 times higher than Bihar’s, or 15 times higher if we omit Raipur and Durg. We have to ask: is COVID-19 really so much deadlier in Chhattisgarh than in Bihar? Or was Bihar failing to detect or report most of its COVID-19 fatalities?
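The Bihar versus Chhattisgarh comparison can be reproduced roughly from the rounded figures in the text. This is a minimal sketch: the dictionary layout and function names are illustrative, and because the inputs are rounded, the Chhattisgarh detection rate comes out near 6.6% rather than the 6.4% quoted, which presumably reflects unrounded estimates.

```python
LAKH = 100_000

# Rounded figures quoted in the article for the surveyed districts.
bihar = {"infections": 34 * LAKH, "cases": 24_000, "deaths": 72}
chhattisgarh = {"infections": 10 * LAKH, "cases": 66_000, "deaths": 600}

def detection_percent(d):
    """Share of estimated infections that were reported as cases."""
    return 100 * d["cases"] / d["infections"]

def naive_ifr_percent(d):
    """Recorded deaths as a percentage of estimated infections."""
    return 100 * d["deaths"] / d["infections"]

detection_ratio = detection_percent(chhattisgarh) / detection_percent(bihar)
ifr_ratio = naive_ifr_percent(chhattisgarh) / naive_ifr_percent(bihar)

print(f"Detection: Chhattisgarh is about {detection_ratio:.0f} times Bihar")
print(f"Naive IFR: Chhattisgarh is about {ifr_ratio:.0f} times Bihar")
```

The two ratios come out at roughly 9 and 28, in line with the "almost 10 times" and "30 times" figures given in the text.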

Let’s start with Chhattisgarh. The naïve IFR in its surveyed districts is comparable to Delhi’s but lower than estimates from Mumbai’s and Chennai’s seroprevalence surveys. It is also much less than 0.16%, the lowest value expected from internationally measured age-stratified fatality data. There could be two to three missed deaths for every recorded death, perhaps predominantly from more rural districts. It is also possible that the disease had spread faster in younger populations, lowering the death rate – data is not available to check this hypothesis.

But Bihar’s numbers are in a different league altogether – so much so that either the people in Bihar were mysteriously protected from severe forms of the disease or almost none of Bihar’s COVID-19 deaths had been recorded. From international data, we expect 70 to 140 times as many deaths as recorded in the surveyed districts. From Mumbai’s data, we expect about 60 times the recorded deaths. From Delhi’s or Chhattisgarh’s data, we expect about 30 times the recorded deaths. From Chhattisgarh’s data, omitting the predominantly urban districts, we expect about 15 times the recorded deaths. So however we look at it, Bihar’s death count doesn’t add up. And taking reporting delays into account makes little difference to this conclusion.
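These multiples follow from applying a reference IFR to Bihar's estimated infections and dividing by the 72 recorded deaths. The sketch below does this for the three reference values that the text states explicitly (0.16%, 0.06% and 0.03%). The Mumbai figure and the upper end of the international range are left out because the underlying IFRs are not quoted here. All names are illustrative.

```python
LAKH = 100_000

# Rounded figures quoted in the article for Bihar's surveyed districts.
bihar_infections = 34 * LAKH
bihar_recorded_deaths = 72

# Reference IFRs (in percent) stated in the text.
reference_ifr_percent = {
    "international lower bound": 0.16,
    "Delhi or Chhattisgarh": 0.06,
    "Chhattisgarh without Raipur and Durg": 0.03,
}

def expected_death_multiple(ifr_percent, infections, recorded_deaths):
    """Deaths expected at the given IFR, as a multiple of recorded deaths."""
    expected_deaths = ifr_percent / 100 * infections
    return expected_deaths / recorded_deaths

for label, ifr in reference_ifr_percent.items():
    multiple = expected_death_multiple(ifr, bihar_infections, bihar_recorded_deaths)
    print(f"{label}: about {multiple:.0f} times the recorded deaths")
```

With these rounded inputs the multiples are roughly 76, 28 and 14, close to the figures quoted above; the small differences come from rounding.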

The urban-rural divide

Aside from highlighting poor detection of COVID infections and probable death underreporting, Bihar’s seroprevalence survey data also indicates rapid and undetected rural spread.

To illustrate this, consider Madhubani, a large district whose population was, at the 2011 census, 96% rural. This makes it the most rural of the surveyed districts, and almost the most rural in the state. And yet, at 23%, Madhubani recorded the highest seroprevalence of Bihar’s six surveyed districts. With just 4,555 cases and six deaths recorded in the district at the end of August, the true number of infected people was more than 200 times greater than the number identified through testing. Moreover, officially, only about 1 in 170,000 of those infected had died.
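The Madhubani figures can be cross-checked against each other. The text does not state the district's estimated infections directly, so the sketch below back-calculates them from the rounded "1 in 170,000" figure and the six recorded deaths; treat the result as an approximation implied by those rounded numbers, not as a survey estimate. Variable names are illustrative.

```python
# Figures quoted in the article for Madhubani at the end of August.
reported_cases = 4_555
recorded_deaths = 6
infections_per_recorded_death = 170_000  # "about 1 in 170,000" (rounded)

# Infections implied by the deaths figure (an approximation).
implied_infections = recorded_deaths * infections_per_recorded_death

# Implied infections for each case identified through testing.
infections_per_case = implied_infections / reported_cases

print(implied_infections)          # 1020000
print(round(infections_per_case))  # 224
```

The implied ratio of about 224 infections per reported case is consistent with the "more than 200 times" statement.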

Although Madhubani is the most extreme case, there was a clear negative association between urbanisation and seroprevalence in Bihar’s surveyed districts: disease spread was in general greater in more rural districts. This remains true even after removing Madhubani from the data, although the trend is weaker. On the other hand, there was a fairly strong positive association between urbanisation and infection detection: an infection was most likely to be missed in a rural area. The level of urbanisation explained about 45% of the variation in infection detection between districts.

This trend persisted even if Madhubani was removed from the data. Poor rural detection and rapid rural spread are very likely linked – if infections are not being detected, most people have little idea that COVID-19 is spreading in their localities.

In Chhattisgarh, by contrast, there was a clear positive association between urbanisation and the reported seroprevalence – disease levels were, in general, higher in more urban districts, with the highest infection levels in Durg and Raipur. The fraction of infections being detected did, as in Bihar, increase in more urban districts. But the trend was less sharp – a 10% increase in urbanisation was associated with a less pronounced estimated change in detection than in Bihar. Urbanisation explained only about 10% of the variation in infection detection between districts.
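Statements such as "urbanisation explained about 45% of the variation" are usually read as the R-squared of a straight-line fit, although the text does not spell out the method used. The sketch below shows how that quantity is computed for a simple linear regression, where it equals the squared correlation coefficient. The district-level data is not reproduced in this article, so the inputs are toy numbers chosen only to make the arithmetic easy to follow; they are not survey data.

```python
def r_squared(x, y):
    """Share of the variation in y explained by a straight-line fit on x.

    For a simple linear regression this equals the squared Pearson
    correlation between x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)

# Toy inputs (hypothetical, NOT the survey data).
toy_x = [1, 2, 3, 4]
toy_y = [1, 3, 2, 4]

print(r_squared(toy_x, toy_y))  # 0.64
```

In the toy example, 64% of the variation in the second list is explained by the first. With the real district data, the first list would hold each district's level of urbanisation and the second its fraction of infections detected.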

Summary

Comparing Bihar with Chhattisgarh highlights some striking differences, summarised in the following table. Bihar shows much weaker detection of infections, implausibly few deaths, and rapid rural spread.

The analysis suggests that we should not rush to interpret low COVID-19 case loads and deaths in predominantly rural states as being a consequence of low infection levels. Although more rural states do in general report fewer cases and deaths, it seems that transmission can be rapid in rural settings. The picture probably varies considerably from state to state.

The data strongly suggests that Bihar’s ‘success story’ is founded on poor disease and death surveillance, and that Bihar’s COVID-19 fatality data in particular should be treated with extreme caution. Without a proper survey of mortality in Bihar during the pandemic so far, people in the state may never know its true toll.

All data on cases and deaths at the state and district levels obtained from here. Population estimates were scaled up using 2020 projections here. Many thanks to Nikhil Rampal for pointing me to reports on Bihar’s seroprevalence survey.

Murad Banaji is a mathematician with an interest in disease modelling.