How Many People Will Get Covid-19?

As a member of the editorial board of Significance, I’ve been very pleased by the rapid work from statisticians to help the magazine arrive at a guide for understanding statistics about Covid-19. One of our offerings is a list of definitions, key points, and best practices for talking about this pandemic responsibly.

As HRDAG’s executive director, I’m even more pleased by my own colleagues’ lightning-speed work to contribute to Significance’s guide. Today, the magazine published two articles that add depth and knowledge to the many discussions circling the pandemic.

Tarak Shah, our data scientist, wrote “How many people are infected with Covid-19?,” in which he discusses current, reliable research by three different teams, one of which is our own HRDAG team composed of James Johndrow, Kristian Lum, and Patrick Ball. From Tarak’s article: “We’ve seen three different studies that touch on the question of how many people are infected. Perkins and Johndrow both rely on existing research about the fatality rate of Covid-19 and the observed number of deaths to get to an estimate of the total number of infections. Verity et al., for whom the fatality rate is itself the quantity they want to estimate, rely on some key assumptions about susceptibility for different age groups in order to estimate rates of under-reporting, and thus arrive at the total size of the infected population. These papers also reveal how researchers evaluate the quality of data as it relates to their research question. The authors of Johndrow rely heavily on observed deaths by Covid-19 because they believe that data is more complete than numbers of positive Covid-19 tests. Verity et al. rely on the Wuhan repatriation data, one of the few instances where the entirety of a susceptible population was tested, for information about asymptomatic cases.”
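The core arithmetic behind the deaths-based approach in that excerpt can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any of the three studies: the function name, the example death count, and the fatality rate below are all hypothetical, and the sketch ignores the delay between infection and death that real analyses have to account for.

```python
def estimate_infections(observed_deaths: float, infection_fatality_rate: float) -> float:
    """Back out a rough infection count from deaths and an assumed fatality rate.

    Hypothetical helper for illustration only: if a fraction `infection_fatality_rate`
    of infected people eventually die, then infections ~= deaths / rate.
    """
    if not 0 < infection_fatality_rate <= 1:
        raise ValueError("infection_fatality_rate must be in (0, 1]")
    if observed_deaths < 0:
        raise ValueError("observed_deaths must be non-negative")
    return observed_deaths / infection_fatality_rate


if __name__ == "__main__":
    # Made-up inputs: 100 observed deaths and an assumed 1% fatality rate.
    print(estimate_infections(100, 0.01))
```

The estimate is only as good as the two inputs, which is why the excerpt stresses how researchers judge the completeness of death counts against the completeness of positive-test counts.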

The second article, written by our director of research Patrick Ball, is “How do epidemiologists know how many people will get Covid-19?” In this article, Patrick explains the foundation of epidemiological modeling, known as the SIR model, which enables us to estimate the progression, day by day, of the sizes of three sub-populations: those who are Susceptible, Infectious, and Removed. This explainer is especially useful for people who want to logically and responsibly examine and report on the pandemic. From Patrick’s article: “In the long term, we’ll learn which models were best. However, time is too short for more than a tiny number of these models to be subject to formal peer review in time to be relevant. That means it is more important than ever that engaged laypeople (especially journalists) have at least a minimum sense of how to read these essential studies.”
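For readers who want to see the day-by-day bookkeeping, here is a minimal sketch of a discrete-time SIR model in Python. It is an assumption-laden illustration rather than the model from Patrick's article: the function name, the parameter names `beta` (transmission rate per day) and `gamma` (removal rate per day), and the example values are all hypothetical.

```python
def simulate_sir(population, initial_infectious, beta, gamma, days):
    """Step a simple SIR model forward one day at a time.

    Returns a list of (susceptible, infectious, removed) tuples, one per day,
    starting with day 0. All parameter values are illustrative assumptions.
    """
    if not 0 <= gamma <= 1:
        raise ValueError("gamma must be in [0, 1]")
    s = float(population - initial_infectious)
    i = float(initial_infectious)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # New infections depend on contact between susceptible and infectious people.
        new_infections = min(beta * s * i / population, s)
        # A fixed fraction of infectious people recover or die each day.
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Made-up scenario: 1,000,000 people, 10 initially infectious.
    for day, (s, i, r) in enumerate(simulate_sir(1_000_000, 10, 0.3, 0.1, 100)):
        if day % 20 == 0:
            print(f"day {day:3d}: S={s:,.0f} I={i:,.0f} R={r:,.0f}")
```

The three compartments always sum to the total population, so the only things a modeler can vary are the starting conditions and the two rates, which is where most of the disagreement between published models comes from.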

As this crisis continues to unfold, we will continue to generate and illuminate good science to answer the critical questions facing us today and tomorrow.


Support for this project was provided by grants from the John D. and Catherine T. MacArthur Foundation and the Oak Foundation. (For more information about HRDAG’s supporters, please see our Funding page.)

Find more articles about HRDAG research and resources regarding Covid-19.
