COVID-19 Pandemic: Should You Believe What the Models Say About India?

The 21-day national lockdown that Prime Minister Narendra Modi announced in his address to the nation on March 24 ends on April 15. What happens next?

Should we anticipate a rise in cases once the lockdown is lifted, so that we’re back to about where we started before the lockdown? Will a further set of lockdowns of varying intervals, interspersed with ‘open periods’, help? The mathematical models that epidemiologists use to predict how infectious diseases spread can help answer such questions.

The simplest epidemiological models are called SIR models. They divide a population into ‘compartments’ depending on how individual people can be described in relation to the disease. The compartments are: ‘susceptible’, ‘infected’ and ‘recovered’ (S, I and R, for short). The rules that determine how the numbers of S, I and R change define the model.
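
To make these 'rules' concrete, here is a minimal sketch of an SIR model in Python. The transmission rate `beta`, the recovery rate `gamma` and the starting fractions are illustrative assumptions, not estimates for COVID-19 and not values taken from any of the studies discussed here.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Step the SIR rules forward in time (simple Euler steps).

    s, i and r are fractions of the population. Returns one (s, i, r)
    tuple per day, starting with day 0.
    """
    s, i, r = s0, i0, r0
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # S -> I
            new_recoveries = gamma * i * dt      # I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Assumed, purely illustrative parameters (beta / gamma = 2).
history = simulate_sir(beta=0.4, gamma=0.2, s0=0.999, i0=0.001, r0=0.0, days=200)
infected = [i for _, i, _ in history]
peak_day = infected.index(max(infected))
print(f"Infected fraction peaks at {max(infected):.1%} on day {peak_day}")
```

Changing `beta` or `gamma` changes when the peak arrives and how high it is, which is why the choice of these numbers matters so much in the studies described below.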

More complicated models of the same type, which researchers use to describe different diseases, typically have more compartments. For examples of other types of models, read the companion article.

Of the many models projecting the spread of COVID-19 through India, I’ll describe four.

The first is a model developed by scientists at the Indian Council of Medical Research (ICMR) and their collaborators. I’ll call this the ICMR study.

The second is a model produced by a group of epidemiologists and statisticians largely from the University of Michigan. I’ll call this the Michigan study.

The third is a set of reports published by the Centre for Disease Dynamics, Economics and Policy (CDDEP) at Johns Hopkins University. I’ll call this the Hopkins study.

Finally, there is a recent study from scientists at Cambridge University, with one of the authors also affiliated with the Institute of Mathematical Sciences, Chennai. I’ll refer to this as the Cambridge study.


India’s coronavirus case trajectories as on April 3, 10:30 pm.

The ICMR study has two parts; the second is relevant here. It assumes that the pandemic has begun and asks whether quarantining just those people who test positive for COVID-19 after showing symptoms would be an effective way to control the virus’s spread. The study partitions the population into S, I and R compartments but adds an ‘exposed’ (E) state between the S and I states. Such models are called SEIR models. Among those infected, some fraction may be quarantined if they can be identified. If infected individuals are not identified, they can infect susceptible people before they themselves recover or die. Those who have been quarantined can’t infect anyone.

This study explores a number of scenarios based on different values of the basic reproductive ratio [1] and different levels of intervention. In an optimistic scenario, the ICMR group suggests that the disease will peak when there are about 100 cases for every 10,000 people. In their most pessimistic scenario, the peak numbers range between 200 and 1,000 cases for every 10,000 people. Effectively, this model suggests that between 1% and 10% of the population will be infected at the peak of the epidemic, depending on its severity.
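
The step from 'cases per 10,000 people' to a percentage of the population is simple arithmetic, sketched here in Python. The helper name `per_10k_to_percent` is made up for this illustration; the case counts are the ones quoted above.

```python
def per_10k_to_percent(cases_per_10k):
    """Convert 'cases per 10,000 people' into a percentage of the population."""
    return cases_per_10k * 100 / 10_000


scenarios = [
    ("optimistic peak", 100),
    ("pessimistic peak, lower end", 200),
    ("pessimistic peak, upper end", 1000),
]
for label, cases in scenarios:
    print(f"{label}: {cases} per 10,000 = {per_10k_to_percent(cases):g}% of the population")
```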

The Michigan study uses a model closely related to the classic SIR compartmental model, not an SEIR model. There are, however, sound epidemiological reasons to expect that an SEIR model is more appropriate than an SIR model for forecasting the spread of the new coronavirus. The Michigan model is constrained by the observed number of cases. It can then be used to predict future events, including the effects of interventions, quarantines and lockdowns. In the absence of any interventions, the Michigan model predicts there will be about 16 cases per 10,000 people.

The authors of the Hopkins study say it applies both standard epidemiological modelling and agent-based modelling, which is a more computationally intensive method. This model originates in an older model called IndiaSim developed by the same group. The Hopkins model provides state-level information for the number of infected people as a function of time. A brief description of the model says it takes in information such as when cases were first seen, the presence of metros where the disease could spread more rapidly and demographic variables describing the population.

IndiaSim, as of March 24, had predicted that the total number of people with COVID-19 in India could peak at 1 crore to 2.5 crore [2] between March and August 2020, with the difference in numbers arising from assumptions about the spread of the disease. The April 2 version provides the number of those expected to be hospitalised in each state, across each age bracket, assuming that only 0.5%, 1% or 5% of the population is infected; it doesn’t provide India-level estimates.

According to the March 24 version of IndiaSim, the maximum number of people who could be hospitalised with COVID-19 in Uttar Pradesh ranges from 2 to 5 lakh. According to the April 2 version, the range widens to 0.8 to 9 lakh. The authors say they are “continually updating the model as new data on parameters and the Indian population become available”.

The Cambridge model, the last, is again an SIR model. The authors of this model split each compartment into multiple age brackets. What is new here is the way different age brackets interact with each other. In India, the common presence of three generations or more within a single family unit implies greater social – and presumably physical – contact across generations.
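
To see why contact across generations matters, here is a rough sketch in Python of how an age-structured model can turn a table of contacts into a risk of infection for each age group. This is not the Cambridge group's code. The three age groups, the population sizes, both contact tables and the value of `beta` are all invented for illustration.

```python
def force_of_infection(beta, contacts, infected, population):
    """Daily infection risk for a susceptible person in each age group.

    contacts[a][b] is the assumed number of daily contacts a person in
    group a has with people in group b; infected[b] / population[b] is
    the chance that any one of those contacts is infected.
    """
    return [
        beta * sum(c_ab * i_b / n_b for c_ab, i_b, n_b in zip(row, infected, population))
        for row in contacts
    ]


groups = ["children", "adults", "elderly"]
population = [300, 600, 100]
infected = [0, 6, 0]        # assume infections are so far confined to adults
beta = 0.05                 # assumed chance of transmission per contact

# Hypothetical contact tables: the elderly meet few adults when generations
# live apart, and many more in three-generation households.
separate_households = [[8, 3, 0.5], [3, 8, 0.5], [0.5, 0.5, 2]]
joint_households = [[8, 3, 2], [3, 8, 2], [2, 2, 2]]

risk_separate = force_of_infection(beta, separate_households, infected, population)
risk_joint = force_of_infection(beta, joint_households, infected, population)
for name, a, b in zip(groups, risk_separate, risk_joint):
    print(f"{name}: daily risk {a:.5f} (separate) vs {b:.5f} (joint households)")
```

In this toy setup, the only thing that changes between the two runs is how often the elderly meet the other age groups, and that alone raises their risk.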

The Cambridge model describes the effects of a lockdown by changing parameters to represent a total ‘switching off’ of infections. The authors find that a single lockdown of 21 days has little effect beyond a temporary suppression of the case growth rate. Instead, they recommend a single 48-day lockdown for a more long-lasting effect.

Note that the Cambridge model is an SIR model and not an SEIR model. As a result, it predicts the numbers of people with COVID-19 in India will decline immediately after a lockdown is imposed. In contrast, an SEIR model would have predicted that the case load would continue to increase before beginning to drop. This is why the daily data on new infections, published by the Union health ministry, disagrees sharply with the Cambridge model’s predictions. Another problem with this model is that it ignores asymptomatic infections – people who have the virus but show no outward signs of it.
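
The difference between the two model types can be sketched in a few lines of Python. This is a toy comparison, not a reproduction of any of the studies. The rates `beta`, `sigma` and `gamma`, the lockdown day and the assumption that a lockdown stops all new transmission are illustrative, and 'case load' is taken here to mean the fraction of people who are currently infectious.

```python
def infectious_curve(model, lockdown_day, days=40,
                     beta=0.5, sigma=0.2, gamma=0.2, dt=0.05):
    """Daily infectious fraction for a toy SIR or SEIR model (Euler steps).

    From lockdown_day onwards the transmission rate is set to zero.
    """
    s, e, i = 0.999, 0.0, 0.001
    steps_per_day = round(1 / dt)
    curve = [i]
    for day in range(days):
        rate = 0.0 if day >= lockdown_day else beta
        for _ in range(steps_per_day):
            newly_infected = rate * s * i * dt
            if model == "SEIR":
                # Newly infected people wait in the exposed state first.
                turning_infectious = sigma * e * dt
                e += newly_infected - turning_infectious
            else:
                # SIR: newly infected people are infectious straight away.
                turning_infectious = newly_infected
            s -= newly_infected
            i += turning_infectious - gamma * i * dt
        curve.append(i)
    return curve


LOCKDOWN_DAY = 12
sir = infectious_curve("SIR", LOCKDOWN_DAY)
seir = infectious_curve("SEIR", LOCKDOWN_DAY)
for name, curve in (("SIR", sir), ("SEIR", seir)):
    peak_day = curve.index(max(curve))
    print(f"{name}: infectious fraction peaks on day {peak_day} "
          f"(lockdown starts on day {LOCKDOWN_DAY})")
```

In the SEIR run, people who were already exposed when the lockdown began keep becoming infectious for some days afterwards, so the curve turns over later than it does in the SIR run.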


Time to take the temperature. Photo: Markus Spiske/Unsplash

Which one of these models should we believe or, indeed, should we believe any of them?

The answer isn’t straightforward. Models like SIR and SEIR are simple and intuitive, even though only a computer could chart out their detailed predictions. However, the ICMR, Michigan and Cambridge models ignore many features we know are important to the spread of an infectious disease.

Perhaps most importantly, thinking of the entire population of India as a single unit ignores important differences, such as the high population density of Mumbai and the relative sparsity of Arunachal Pradesh. And we do know that infectious diseases that require close contact to spread from one person to another also spread more easily in more densely packed areas.

The simplest models don’t account for different mobility patterns either, or the role of public places such as schools, public transport and crowded workplaces – all of which help infectious diseases spread faster. There are also crucial differences between rural and urban India. Isolated ‘super spreaders’, a label for people from whom the virus spreads to a lot more people than the average basic reproductive ratio indicates, aren’t accounted for.

Agent-based modelling is capable of accounting for some of these variations. However, there is a tradeoff: if you make a model more complex, you also need to make more assumptions to see the calculations through. As a result, it’s important to ensure the model’s output does not depend very sensitively on the assumptions. If it did, even slightly different assumptions could yield very different outcomes.
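
Even the simplest model shows how sensitive an output can be to a single assumption. The sketch below, in Python, uses the standard closed-form result for the peak infected fraction in a basic SIR model when almost everyone starts out susceptible. The values of the basic reproductive ratio are arbitrary choices for illustration, not estimates for COVID-19.

```python
import math


def sir_peak_fraction(r0):
    """Peak infected fraction in a basic SIR model, assuming nearly the
    whole population is susceptible at the start."""
    if r0 <= 1:
        return 0.0  # the outbreak never takes off
    return 1 - (1 + math.log(r0)) / r0


# Arbitrary illustrative values of the basic reproductive ratio.
for r0 in (1.2, 1.5, 2.0, 2.5):
    print(f"basic reproductive ratio {r0}: peak infected fraction {sir_peak_fraction(r0):.1%}")
```

Here, raising the assumed ratio from 1.2 to 1.5 multiplies the peak by more than four. A modeller would want to run checks of this kind on every important assumption before trusting a complex model's output.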

For these reasons, none of the numbers in the models should be taken at face value. Whether the models can be trusted to provide an idea of what specific interventions are likely to be more powerful than others is a different, and more delicate, matter.

A good point of view from which to address modelling studies today is that they are, to use the words of Wolfgang Pauli, “not even wrong”. There is simply too little trustworthy data right now to suggest the predictions of one model (that might fit the data better now) should be trusted more than the predictions of another (which appears to do a worse job). Overreacting to a particular modelling claim in the here-and-now does more harm than good.

The only useful way to compare models with each other at the moment is on the basis of what we know about the disease and what each model’s inputs are, together with some scepticism about the assumptions that accompany each model.

Second, local models are more useful than global models. A model that shows state-wise behaviour is better than a model that purports to be India-wide. It is indeed more rational to think of different policies at the level of individual states or districts than to demand a single nationwide policy be applied uniformly across India, regardless of local circumstances.

Third, researchers are constantly improving their models. The best modelling studies are those that build on previous work, incorporate feedback and can easily be updated.

While the authors of the ICMR model, the Michigan model and the Cambridge model have all explained their work in sufficient detail, and/or have made their code available in the public domain so that their results can be checked by others, the moving parts of the Hopkins model remain out of view. We don’t know what went into the current version of the Hopkins model; without this information, it’s impossible for a modeller to understand the ‘how’, much less critique it.

Older papers about IndiaSim are not particularly relevant to the current work on COVID-19 either since none of them modelled a respiratory pathogen that spreads very fast.

This brings us to a simple but important point about modelling: the guiding philosophy of responsible modelling is transparency and honesty. This requires, among other things, descriptions detailed enough to enable someone else to question the approach intelligently, making software programs available to those who might want to run them, and publishing the methods used on a public preprint server so other experts can check them. Broad summaries on organisational websites don’t count.

In fact, a lot of the work of scientists around the world studying COVID-19 is currently available on preprint servers. Such a ‘knowledge commons’ is a relatively new feature of how scientists share their work with the world. And it certainly brings more democracy and sunlight into the practice of science.

Gautam I. Menon is a professor at Ashoka University, Sonepat, and at the Institute of Mathematical Sciences, Chennai. The views expressed here are his own.

  1. The average number of susceptible people to whom one infected person can transmit the virus.

  2. 1 crore = 100 lakh
