A view of the emergency ward in Jawaharlal Nehru Medical College and Hospital, Bhagalpur, July 27, 2020. Photo: Reuters/Danish Siddiqui.
Let’s start with a contrast. To date:
* Bihar, with an estimated population of 120 million, has had 2.1 lakh recorded COVID-19 cases and just a thousand recorded COVID-19 deaths
* Maharashtra, with a similar estimated population, has had 16.4 lakh recorded COVID-19 cases and 43,000 recorded COVID-19 deaths
What explains such huge differences? Has Bihar “defeated” the novel coronavirus? Or has Bihar’s data been manipulated in the run-up to an election? We’ll see that there are some signs the data has been massaged, but there are several other factors in play.
A continuing epidemic
First, Bihar’s COVID-19 epidemic is by no means over. Average daily cases rose from about 500 in the first week of July and peaked at about 3,800 in the second week of August. But the subsequent decline effectively halted by the end of September and daily cases have been steady at about 1,200 through October so far. Daily recorded COVID-19 deaths peaked at around the same time as cases, came down, and have recently been rising again.
When we look at individual districts the picture is more messy – as we might expect. Not all districts have seen a clear peak in cases, while several have had more than one apparent peak. Cases in Patna district, by far the biggest contributor to Bihar’s case and death data, peaked in early August, dropped sharply, but have been rising gradually through much of September and October.
A big increase in testing magnified the peak in cases
In mid-July Bihar was coming under fire for low testing and the high ratio of cases to tests. There followed a very sharp rise in tests in late July and early August: the daily total rose more than ten-fold from about 10,000 near July 22 to over one lakh one month later. Testing has stayed high since then, although with some fluctuations.
It seems that after the climb in testing a very large proportion of tests were rapid antigen tests (RATs). This percentage is not regularly shared, but by August 11 it was already at 86%, and at the last available count (Sept 21) about 92% of all of Bihar’s tests were RATs. The ramp-up increased the apparent speed at which Bihar’s epidemic was growing in late July and early August – detection was almost certainly improving even as infections were also rising.
The rise in testing and the fact that most tests were relatively low-sensitivity RATs led to a dramatic fall in test positivity (here defined as the ratio of cases to total tests) – this dropped from over 15% in late July to under 2% a month later, and is currently at around 1%. For two weeks during late July and early August test positivity was falling fast, even as cases were rising fast.
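A rough way to see how positivity can fall while cases rise: positivity is cases divided by tests, so the growth in cases equals the growth in tests multiplied by the change in positivity. The Python sketch below plugs in the rounded figures quoted above (a roughly ten-fold rise in daily tests, positivity going from about 15% to about 2%). These are boundary values rather than exact daily counts, and the variable names are illustrative, so the result is only indicative.

```python
# Rounded figures from the text; not exact daily counts.
tests_growth = 10.0        # daily tests rose roughly ten-fold in a month
positivity_before = 0.15   # "over 15%" in late July
positivity_after = 0.02    # "under 2%" a month later

# positivity = cases / tests, so cases = positivity * tests,
# and the growth factor in cases is the product of the two changes.
positivity_change = positivity_after / positivity_before
cases_growth = tests_growth * positivity_change

print(f"Positivity fell to {positivity_change:.0%} of its earlier level")
print(f"Implied change in daily cases: x{cases_growth:.2f}")
```

With these inputs the implied growth factor in daily cases stays above 1, even though positivity has dropped to a small fraction of its earlier level.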
Unclear targeting of tests
What has the testing strategy been? There were reports of doctors struggling to meet targets for rapid testing without any clear messaging about who should be tested. We also know that RATs miss a significant fraction of infections. They can still play an important role in disease surveillance; but they are also convenient from the point of view of making test positivity fall!
The most striking thing about district-level data is that there is only a weak correlation between cases in a district and tests conducted in that district. This results in wide fluctuations in test positivity between districts. For example, as of October 24, Patna had seen 16% of the state’s cases and 25% of the state’s deaths, but only 4% of the state’s tests.
This data adds force to the accusation that testing was ramped up in order to make test positivity fall – if testing was focussed on developing hotspots we would expect a stronger correlation between tests and cases.
Testing alone can’t explain Bihar’s cases and deaths
Could the big increase in testing have helped to control Bihar’s epidemic? Despite poor targeting, improved detection must have allowed some people with the infection to isolate early and thus slowed the spread. Even a modest increase in detection would break some chains of transmission and help control the epidemic. But this seems unlikely to fully explain the drop in COVID numbers and the state’s relatively low total count of cases and deaths. Delhi’s per capita testing, for example, is almost twice Bihar’s, and yet Delhi’s cases and deaths per million are many times higher than Bihar’s.
Logically speaking, there are two possible explanations for Bihar’s low per capita case and death count: poor disease surveillance, or low spread of infection. Unclear targeting of tests means that we cannot rush to assume that spread has been low. Seroprevalence data could help to clarify this picture. Preliminary reports on seroprevalence from six of Bihar’s 38 districts suggest that detection of infections in Bihar prior to the survey was very poor, and that the great majority of COVID-19 deaths in the state may have been missed.
An alternative approach is to try to infer how many have had COVID-19 by looking at recorded fatalities. To do this, we first need to consider possible fatality undercounting, and estimate the infection fatality rate (IFR) in the state – the fraction of infections which result in death.
Estimating IFR and prevalence
Bihar is a young state. At the time of the 2011 census, 37.3% of the state’s population was 0-14 years old, as opposed to 27.2% in Maharashtra and 29.5% nationwide. On the other hand, 7.0% were over 60, as against 9.3% in Maharashtra and 8.0% nationwide. As we know, age makes a very large difference to expected COVID-19 fatality rates.
Using Bihar’s 2011 age pyramid, we can calculate the state’s expected COVID-19 IFR using various sources of age-adjusted IFR for COVID. We find a range of expected values for Bihar’s IFR from about 0.14% to 0.28%. By comparison, the same data gives an IFR range of 0.21% to 0.41% for Maharashtra – 50% higher than Bihar’s. For India as a whole we get a range of 0.18% to 0.35%.
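The comparison between these ranges can be checked with a few lines of Python. The sketch uses only the ranges quoted above; the simple mid-point is one convenient summary (an assumption for illustration, not a statistical estimate), and the names are hypothetical.

```python
# IFR ranges (in percent) quoted in the text.
ifr_ranges = {
    "Bihar": (0.14, 0.28),
    "Maharashtra": (0.21, 0.41),
    "India": (0.18, 0.35),
}


def midpoint(low, high):
    """Simple mid-point of a range."""
    return (low + high) / 2


midpoints = {name: midpoint(*rng) for name, rng in ifr_ranges.items()}

# How much higher is Maharashtra's expected IFR than Bihar's?
bihar_low, bihar_high = ifr_ranges["Bihar"]
maha_low, maha_high = ifr_ranges["Maharashtra"]
ratio_low = maha_low / bihar_low
ratio_high = maha_high / bihar_high

for name, mid in midpoints.items():
    print(f"{name}: mid-point IFR ~ {mid:.3f}%")
print(f"Maharashtra / Bihar: {ratio_low:.2f} (low end), {ratio_high:.2f} (high end)")
```

Both ratios come out close to 1.5, which is where the "50% higher" figure comes from.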
Note that these estimates are very rough and rely on partial data. The population will have aged somewhat since 2011, pushing expected IFR up; meanwhile, IFR would be decreased by lower prevalence amongst the elderly, as can be inferred, for example, from Mumbai’s seroprevalence survey.
In order to estimate prevalence via IFR we need either to take fatalities at face value, or to make some assumptions about fatality underreporting. Some outcomes of this process for Bihar and Maharashtra are summarised in the following table.
For example, if we assume that about one in three COVID-19 deaths in Bihar has been counted, and take a mid-point IFR estimate of 0.21%, then by early October only about 1.2% of Bihar’s population would have had COVID. This is much lower than the national estimate of 6.6% from the August-September seroprevalence survey. In this scenario, a surprisingly high 14% of total infections would have been detected.
But these estimates of low levels of COVID-19 and good infection detection are inconsistent with reports of high seroprevalence from Bihar. This discrepancy provides a warning sign that detection of infections and deaths in Bihar may have been very poor.
By contrast, let’s consider Maharashtra again, and suppose that, with its better death surveillance, the state has recorded about half of its fatalities. Using the mid-point IFR estimate of 0.31%, about 23% of people in the state would have had COVID. In fact all calculations put the levels of prior infection in Maharashtra at much above the national estimate.
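The arithmetic behind these two scenarios can be sketched in Python. The function names are illustrative, and the inputs are assumptions taken from the rounded totals quoted at the start of this article (about 1,000 recorded deaths and 2.1 lakh cases in Bihar, 43,000 recorded deaths in Maharashtra, and a population of roughly 120 million for each state), so the results will differ slightly from figures computed with exact early-October counts.

```python
def implied_infections(recorded_deaths, reporting_fraction, ifr):
    """Infections implied by recorded deaths, an assumed share of
    deaths that get recorded, and an assumed infection fatality rate."""
    estimated_deaths = recorded_deaths / reporting_fraction
    return estimated_deaths / ifr


def implied_prevalence(recorded_deaths, reporting_fraction, ifr, population):
    """Share of the population implied to have been infected."""
    return implied_infections(recorded_deaths, reporting_fraction, ifr) / population


POPULATION = 120e6  # rounded estimate, used here for both states

# Bihar: one in three deaths counted, mid-point IFR of 0.21%
bihar_prev = implied_prevalence(1_000, 1 / 3, 0.0021, POPULATION)
bihar_detected = 210_000 / implied_infections(1_000, 1 / 3, 0.0021)

# Maharashtra: one in two deaths counted, mid-point IFR of 0.31%
maha_prev = implied_prevalence(43_000, 1 / 2, 0.0031, POPULATION)

print(f"Bihar: prevalence ~ {bihar_prev:.1%}, infections detected ~ {bihar_detected:.0%}")
print(f"Maharashtra: prevalence ~ {maha_prev:.0%}")
```

Changing `reporting_fraction` shows how sensitive the conclusion is to the undercounting assumption: halving it doubles the implied prevalence.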
The urban-rural divide
Bihar’s young population means that we should expect fewer deaths and more mild/asymptomatic infections which could go undetected. But the calculations above suggest that, in addition, disease spread in Bihar may have been quite limited. If so, then why?
Other than the youthful population, a second factor stands out in Bihar. According to 2020 projections, Bihar’s population is 88% rural. This makes Bihar one of the most rural states in the country, second only to Himachal Pradesh. By contrast, Maharashtra’s population is about 52% rural, while India’s as a whole is 66%.
Does this matter? Yes, practically speaking, very much. If we look across Indian states and territories, we find that urbanisation is a key predictor of the number of COVID-19 cases and deaths in a region. In fact, the relationship, shown in the next two plots, is surprisingly pronounced given different epidemic timings, and probably very big differences in disease surveillance, mitigation measures and fatality undercounting.
(In the plots above 2020 projected populations and urbanisation levels are taken from National Commission on Population data; case and death data is from here. States/regions are coded as follows: JK = Jammu & Kashmir, HP = Himachal Pradesh, PB = Punjab, CH = Chandigarh, UT = Uttarakhand, HY = Haryana, DL = NCT of Delhi, RJ = Rajasthan, UP = Uttar Pradesh, BR = Bihar, SK = Sikkim, AR = Arunachal Pradesh, NL = Nagaland, MN = Manipur, TR = Tripura, ML = Meghalaya, AS = Assam, WB = West Bengal, JH = Jharkhand, OR = Odisha, CT = Chhattisgarh, MP = Madhya Pradesh, GJ = Gujarat, MH = Maharashtra, AP = Andhra Pradesh, KA = Karnataka, GA = Goa, KL = Kerala, TN = Tamil Nadu, PY = Puducherry, AN = Andaman & Nicobar Islands, and TG = Telangana.
Three territories have been omitted: Mizoram, which has reported no deaths; Dadra and Nagar Haveli and Daman and Diu, with two reported deaths; and Lakshadweep, the only territory to report no cases so far.)
Once we take into account the urban-rural divide, Bihar no longer appears as an outlier in India’s COVID-19 data. In fact, both its cases and deaths lie close to the lines given by regression analysis for the country as a whole. Even within the state the urban-rural divide seems clear. Patna is by far Bihar’s most urban district: with 5.6% of the state’s population, it has reported 16% of its cases and 25% of its deaths.
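To put Patna's figures on a common footing, one can divide its share of cases, deaths and tests by its share of the relevant denominator. The Python sketch below does this with the percentages quoted in this article (including the 4% share of tests as of October 24). The dictionary keys and variable names are illustrative, and the ratios compare Patna with the state as a whole, Patna included.

```python
# Patna's share of state totals, in percent, as quoted in the article.
patna_share = {"population": 5.6, "tests": 4.0, "cases": 16.0, "deaths": 25.0}

# A share of cases divided by a share of population gives cases per
# capita in Patna relative to the state average (1.0 = same as the state).
cases_per_capita_ratio = patna_share["cases"] / patna_share["population"]
deaths_per_capita_ratio = patna_share["deaths"] / patna_share["population"]

# Same idea for test positivity: share of cases over share of tests.
positivity_ratio = patna_share["cases"] / patna_share["tests"]

print(f"Cases per capita vs state:  x{cases_per_capita_ratio:.1f}")
print(f"Deaths per capita vs state: x{deaths_per_capita_ratio:.1f}")
print(f"Test positivity vs state:   x{positivity_ratio:.1f}")
```

All three ratios are well above 1, so on these figures Patna has more cases and deaths per person than the state as a whole, and fewer tests per case.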
Maharashtra, on the other hand, does stand out, particularly when it comes to high fatalities. We should remember that Maharashtra has had several long and harsh city epidemics. It is also possible that fatality reporting in Maharashtra is above the national average, given relatively good indicators for death surveillance.
While the relationship between urbanisation and cases/deaths is strong, it is important to stress that we don’t know what drives it. Is the spread of disease in rural areas slowed by lower population density and different modes of interaction? Could urbanisation be connected with other factors which accelerate transmission and increase mortality? To what extent are fewer recorded cases and deaths a consequence of poorer surveillance and less access to healthcare in rural settings?
As we saw above, examining Bihar’s fatalities, even assuming that two in three have been missed, leads to the conclusion that spread has been quite limited in the state so far. Alternatively, Bihar’s IFR is lower than estimated from international data for some as yet unknown reason; or fatality undercounting is considerably higher than guessed above.
Turning to this last possibility: could a very large fraction of Bihar’s COVID-19 deaths have never shown up in official figures? It cannot be ruled out. We know that infectious disease deaths can be missed: for example, the WHO estimates about 4.4 lakh tuberculosis deaths in India every year but only about one fifth of these get recorded as such. It is also worth noting that medical certification of deaths is very low in Bihar.
If we look at district-wise data, case fatality rates (CFRs – the ratio of confirmed deaths to confirmed cases) fluctuate widely by district.
Is this explained by the wide fluctuations in testing and hence case detection discussed earlier? Probably not. For example, by October 24, Bhagalpur had recorded 8,307 cases from 281K tests and had 65 confirmed deaths; Gopalganj had recorded 5,055 cases from 210K tests, but had only 7 confirmed deaths. Test positivity in these two districts is comparable, but case fatality is not. Overall, there is only a very weak correlation between case fatality rates and test positivity rates, suggesting possibly uneven death reporting.
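The two-district comparison can be reproduced directly from the figures quoted above. This Python sketch (with an illustrative data layout and names) computes test positivity and CFR for each district; test counts are the rounded values given in the text.

```python
# Figures as of October 24, as quoted in the text (tests are rounded).
districts = {
    "Bhagalpur": {"cases": 8_307, "tests": 281_000, "deaths": 65},
    "Gopalganj": {"cases": 5_055, "tests": 210_000, "deaths": 7},
}


def positivity(d):
    """Test positivity: confirmed cases per test."""
    return d["cases"] / d["tests"]


def cfr(d):
    """Case fatality rate: confirmed deaths per confirmed case."""
    return d["deaths"] / d["cases"]


for name, d in districts.items():
    print(f"{name}: positivity {positivity(d):.2%}, CFR {cfr(d):.2%}")

# How far apart are the two districts on each measure?
positivity_ratio = positivity(districts["Bhagalpur"]) / positivity(districts["Gopalganj"])
cfr_ratio = cfr(districts["Bhagalpur"]) / cfr(districts["Gopalganj"])
print(f"Positivity ratio: x{positivity_ratio:.1f}, CFR ratio: x{cfr_ratio:.1f}")
```

The positivity ratio is close to 1 while the CFR ratio is several times larger, which is the mismatch the paragraph above describes.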
Bihar’s COVID-19 epidemic is not over: it has risen and fallen, and then stabilised. The state’s big increase in testing, with a high volume of rapid antigen tests, led to a dramatic fall in test positivity and exaggerated the apparent rise in infections. Even though there are large question marks over how testing was targeted, it could still have played a part in controlling the epidemic.
Has Bihar’s data been manipulated to manufacture a success story? As is often the case, crucial data that could help answer this question is lacking. There are signs that testing was geared more towards reducing test positivity than catching a higher proportion of infections. There are also indications that death recording is not uniform across districts, with some districts reporting suspiciously few deaths.
To get a clearer picture of what lies behind the recorded numbers, excess deaths data and a proper investigation into mortality in Bihar during the COVID-19 pandemic are needed. Bihar’s low numbers of cases and deaths must be explained by some combination of:
* Relatively low spread in a predominantly rural population,
* Low IFR with more mild cases as a consequence of a youthful population, and
* Poor detection of infections and fatality undercounting
The available data suggests that the last of these may be the dominant effect.
Finally, Bihar’s data highlights a strong connection between low urbanisation and low COVID-19 numbers that warrants further exploration. If, indeed, spread has been low, then the great majority are still susceptible. There is no reason to rule out a resurgence driven, for example, by easing restrictions, the festival season and huge election rallies.
More complete references, calculations and data sources are available here.
Note: This article was updated at 4:20 pm on October 30, 2020, to account for aspects of Bihar’s seroprevalence data that had been overlooked at the time this article was first published.
Murad Banaji is a mathematician with an interest in disease modelling.