A man walks past a graffiti of people wearing protective masks, in Navi Mumbai, January 21, 2021. Photo: Reuters/Francis Mascarenhas.
Headline figures from the third national seroprevalence survey, carried out between December 17, 2020 and January 8, 2021, have now been released. Antibodies to SARS-CoV-2, the virus that causes COVID-19, were detected in 21.4% of adults surveyed, and in 25.3% of 10- to 17-year-olds surveyed. This implies that only about 3.5% of infections have been detected and recorded as “cases”, which currently number over one crore.
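The 3.5% figure can be roughly checked with a back-of-envelope calculation. The sketch below is only illustrative: the population of about 1.35 billion is an assumption not stated in this article, and the adult seroprevalence is applied to the whole population for simplicity.

```python
# Back-of-envelope check of the share of infections recorded as cases.
# ASSUMPTION: a population of about 1.35 billion (not given in the article).
# ASSUMPTION: the adult seroprevalence is applied to the whole population.
POPULATION = 1.35e9
SEROPREVALENCE = 0.214   # adults, third national serosurvey
RECORDED_CASES = 1e7     # "over one crore"


def detection_fraction(cases, seroprevalence, population):
    """Return recorded cases as a share of estimated infections."""
    estimated_infections = seroprevalence * population
    return cases / estimated_infections


frac = detection_fraction(RECORDED_CASES, SEROPREVALENCE, POPULATION)
print(f"Estimated share of infections detected: {frac:.1%}")
```

Under these assumptions the result lands close to the 3.5% quoted above; a different population figure would shift it slightly.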
Seropositivity was highest in urban slums at 31.7%, dipped to 26.2% in urban non-slum areas, and was lowest in rural areas at 19.1%. Confidence intervals – measures of the levels of uncertainty in these figures – have been given, but the technical details of the calculations are not yet available. What does it all mean?
The first two national surveys
To put the numbers in context, let’s consider the first two national serosurveys. In the first, adults were surveyed during the second half of May to early June, and 0.73% seropositivity was reported.
But this figure comes with a great deal of uncertainty – interpretation of the data was fraught with difficulties, which were never adequately addressed. It is best not to read too much into the first survey beyond the broad conclusion that prevalence was still low outside of hotspots at this point. Ironically, the hotspot data was not released.
The second survey was carried out during August and September as the country approached and passed its peak in daily cases and deaths. The survey found 6.6% seropositivity amongst all those aged 10 years and above. The figure was 16.9% in urban slums, 9.0% in urban non-slum areas, and just 5.2% in rural areas.
The antibody test used in the second survey, which was also used in Mumbai’s serosurveys, is known to fare somewhat poorly at detecting older infections. As a result, the level of prior infection may have been underestimated in the second national serosurvey. But this should have mainly affected areas which saw major early spread, namely urban slums.
Is the new data consistent with case and death data?
Although the full details are not yet available, it appears that the test used in the third survey was less vulnerable to the problem of insensitivity to older infections than the one used in the second survey. We thus have no reason to believe, in advance, that the latest data greatly underestimates spread.
At face value, between the second and third surveys, seroprevalence roughly doubled in urban slums, roughly tripled in urban non-slum areas, and almost quadrupled in rural areas. Overall, seroprevalence increased more than three-fold.
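These ratios follow directly from the published seropositivity figures. A minimal sketch of the arithmetic is below; note that the overall comparison sets the second survey's 6.6% (ages 10 and above) against the third survey's 21.4% (adults), so it is not strictly like-for-like.

```python
# Seropositivity (%) by setting, as reported for the second and third surveys.
second_survey = {"urban slum": 16.9, "urban non-slum": 9.0, "rural": 5.2}
third_survey = {"urban slum": 31.7, "urban non-slum": 26.2, "rural": 19.1}


def fold_increase(before, after):
    """Return how many times larger `after` is than `before`."""
    return after / before


fold = {
    setting: fold_increase(second_survey[setting], third_survey[setting])
    for setting in second_survey
}
# Overall: 6.6% (ages 10+) versus 21.4% (adults), an approximate comparison.
overall = fold_increase(6.6, 21.4)

for setting, value in fold.items():
    print(f"{setting}: {value:.1f}-fold")
print(f"overall (approximate): {overall:.1f}-fold")
```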
Factoring in a delay of perhaps three weeks between an infection occurring and antibodies being picked up in a survey, this would mean that between mid-August and early December total infections increased about three-fold.
This is actually fairly consistent with case data, assuming that cases typically get recorded a couple of weeks after infection occurs: cases almost tripled between early September and late December. Deaths on the other hand only doubled during this period, a fact we’ll return to.
Poorly detected rural spread?
The figures imply that detection of infections has not improved overall since September – if anything it may have fallen. But we know there have been big improvements in detection in major cities such as Mumbai and Delhi. And we know that, nationally, test-positivity, here taken to mean the ratio of positive tests to total tests, has been falling. How can we resolve this apparent contradiction?
The key could lie in the shift in infection towards rural areas. Both national and state-level serosurveys tell us fairly convincingly that the rural epidemic came later than the urban one, perhaps as a consequence of national lockdown. However, once it arrived it was of significant scale and, indeed, continued to smoulder as city epidemics petered out.
How does this explain what appears to be decreased detection? The answer is that detection of COVID-19 in rural areas is, on the whole, poorer than in urban areas. The declines in cases from big, better-surveilled urban epidemics masked a continued flow of rural infections. It may seem like a contradiction, but it is quite possible that detection improved in both urban and rural areas while decreasing overall, as a consequence of a shift in infection from urban to rural areas.
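A toy calculation shows how this can happen. All numbers in the sketch below are hypothetical, chosen only to illustrate the effect; they are not estimates for India.

```python
def overall_detection(infections, detection):
    """Return detected infections as a share of all infections across settings."""
    total = sum(infections.values())
    detected = sum(infections[k] * detection[k] for k in infections)
    return detected / total


# HYPOTHETICAL figures: infections in arbitrary units, detection as a fraction.
early_infections = {"urban": 80, "rural": 20}
early_detection = {"urban": 0.05, "rural": 0.01}

# Later, detection is better in BOTH settings, but most infections are rural.
late_infections = {"urban": 30, "rural": 70}
late_detection = {"urban": 0.06, "rural": 0.02}

early = overall_detection(early_infections, early_detection)
late = overall_detection(late_infections, late_detection)
print(f"early overall detection: {early:.1%}")
print(f"late overall detection:  {late:.1%}")
```

In this made-up example, detection rises in each setting, yet the overall figure falls because the mix of infections has shifted towards the setting with weaker surveillance.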
What about fatalities?
Taking the prevalence values at face value, the third survey implies a naive infection fatality rate (IFR), namely the ratio of recorded COVID-19 deaths to estimated COVID-19 infections, of around 0.05%, or one in 2000.
This is down somewhat from the second survey estimate of around 0.08%. Put another way, infections have increased faster than recorded deaths. Before we rush to paint this as good news, this drop would be consistent with a shift of disease to rural areas where death surveillance is likely weaker.
If India’s true IFR is roughly in line with expectations based on international data and India’s age profile, then at least three or four COVID-19 fatalities have been missed for each recorded fatality. Is this plausible?
Serosurveys have thrown up huge geographical variations in naive IFR, with some badly hit regions generating almost no deaths. These variations almost certainly reflect highly uneven COVID-19 death recording. Taking such data alongside reports of fatality undercounting, three or more missed deaths for each recorded death does not seem implausible at all.
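The link between the naive IFR and the number of missed deaths can be sketched as follows. The naive IFR of 0.05% comes from the figures above; the "true" IFR values of 0.20% and 0.25% are purely illustrative assumptions, picked because they correspond to three and four missed deaths per recorded death, and are not taken from any particular study.

```python
def missed_per_recorded(naive_ifr, assumed_true_ifr):
    """Return missed deaths per recorded death if the true IFR were `assumed_true_ifr`.

    If recorded deaths imply `naive_ifr` but the true value is higher,
    true deaths are (assumed_true_ifr / naive_ifr) times recorded deaths.
    """
    return assumed_true_ifr / naive_ifr - 1


NAIVE_IFR_THIRD_SURVEY = 0.0005  # 0.05%, as estimated in the article

# ASSUMPTION: illustrative "true" IFR values, not sourced estimates.
for assumed_true_ifr in (0.0020, 0.0025):
    missed = missed_per_recorded(NAIVE_IFR_THIRD_SURVEY, assumed_true_ifr)
    print(f"true IFR {assumed_true_ifr:.2%}: about {missed:.1f} missed per recorded death")
```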
Why is COVID declining nationally?
If less than a quarter of the country has, to date, been infected, what explains the fairly consistent decline in daily cases and deaths since mid-September?
The question assumes that the decline is real, i.e., that daily infections, and not just cases, have been declining. This seems very likely to be true, although with data skewed towards urban areas, the scale of the decline may not be as dramatic as cases suggest. It also tacitly assumes that the decline must be driven by the country reaching herd immunity.
Could the country really have reached herd immunity? We should not rush to conclude this, but it is possible. We cannot entirely rule out that the third national serosurvey has significantly underestimated prevalence. It is also possible that the herd immunity threshold, namely the fraction of individuals who need to be infected for a natural slowing in new infections to occur, has been over-estimated.
But equally, if effective mitigation continues to play an important role in keeping infections low, the country could still be far from herd immunity. In this case, if mitigation were suddenly to halt, a significant new upswing could occur.
It’s important to note that there could be some truth in all the possibilities: altered behaviours are still reducing transmission; total spread could have been underestimated; and the herd immunity threshold could be lower than initially believed. How much each of these contributes to the total picture is important, because it determines vulnerability to new surges as vaccination is rolled out.
The herd immunity threshold
Could India’s ‘R0’ value, the typical number of people infected by one person with COVID-19 if behaviour is “normal” and everyone is susceptible, have been overestimated at the start of the epidemic? If so, this would imply that the herd immunity threshold for the disease in the country has also been overestimated.
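In the simplest textbook model, where everyone mixes evenly, the herd immunity threshold is 1 − 1/R0. The sketch below uses that simplified formula to show how a lower R0 lowers the threshold; the R0 values are hypothetical and are not estimates for India.

```python
def herd_immunity_threshold(r0):
    """Return the classical herd immunity threshold, 1 - 1/R0.

    Assumes a homogeneously mixing population; real thresholds can differ.
    For R0 <= 1 an outbreak does not grow, so the threshold is taken as 0.
    """
    if r0 <= 1:
        return 0.0
    return 1 - 1 / r0


# HYPOTHETICAL R0 values, for illustration only.
for r0 in (1.5, 2.0, 2.5, 3.0):
    print(f"R0 = {r0}: threshold about {herd_immunity_threshold(r0):.0%}")
```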
With most early spread occurring in cities and, additionally, better surveillance in cities, the early case and death data on COVID-19 nationwide was predominantly urban data. It is this urban data on which national estimates of R0 are based. Perhaps if we had been able to adequately disentangle rural data from the total we would have found that rural spread is naturally slower.
Although such questions are scientifically important, it goes without saying that relying on a lower-than-expected national herd immunity threshold to limit disease would be foolish. We have seen cities surging after high levels of infection. Even the natural speed at which the disease spreads in rural areas is probably highly variable given different patterns of rural development. Moreover, mitigation is almost certainly still playing a part in limiting transmission, in some parts of the country at least.
Some conclusions
The results of the third national serosurvey are broadly consistent with data on cases and deaths once we factor in some shift in spread from urban to rural settings. There is no very strong reason to believe that the current survey has greatly underestimated the extent of infection nationally, although this can’t be ruled out entirely.
The data tells us that we should be a little cautious in celebrating the decline in daily cases since mid-September. As I discussed in an earlier piece, case and death data magnify what is happening in a few well-surveilled areas. Meanwhile, the relative increase in infections between the last two surveys has been greatest in rural areas. A very significant number of people were infected with COVID-19 in rural areas, even as the national situation was improving.
We can only hypothesise about what lies behind the national wind-down. It is possible that rural spread is, on the whole, slower than inferred from early urban data. Alongside some continuing mitigation, this could help explain the relatively low daily cases and deaths we are seeing at the moment. This said, the latest seroprevalence data is no cause for complacency, and new surges should not be ruled out.
Finally, the survey paints a picture around fatality consistent with previous surveys. We see far fewer recorded fatalities than expected from international data. The increase in “missing” fatalities between the second and third surveys would be consistent with a higher fraction of infections occurring in areas with poor death surveillance. Seroprevalence values at district level would shed more light on questions of fatality, but this data has not, so far, been made available.
Murad Banaji is a mathematician with an interest in disease modelling.