Protecting the Privacy of Whistle-Blowers: The Staten Island Files

HRDAG built a machine-learning tool to strip the raw data of any potentially identifying information such as names and court case numbers. There was no "acceptable error rate."

El Salvador

Some of the earliest large-scale human rights information projects happened in El Salvador. One was developed by Patrick Ball at the Salvadoran non-governmental Human Rights Commission, also known as Comision de Derechos Humanos de El Salvador (CDHES-ng). Between 1977 and 1990, more than 9,000 testimonies were taken in an effort to document the nature and scope of the bloody conflict between the army and the Farabundo Marti National Liberation Front (FMLN). Starting in 1991, Patrick worked with CDHES staff to organize the information in an early computer database. They linked reported human rights violations with the career structures of individual ...

How we go about estimating casualties in Syria—Part 1

I spent the two weeks over Easter working with Patrick and Megan in San Francisco, trying to work out how best to estimate the number of casualties the Syrian civil war has claimed in the past two years. In January, HRDAG published a report on the number of fully identified casualties reported in the Syrian Arab Republic between March 2011 and November 2012. The number of de-duplicated records of killings for this period was 59,648, which is likely an undercount, since we know that many incidents of lethal violence in conflict go unreported and that the unreported cases are not missing at random.

HRDAG is hiring – technical lead

If this could be you, let us know. Also, please feel free to pass on this link to great people.

Job Title. Technical lead with a hacker's heart

Location. A cool office in SOMA, San Francisco. You need to be on-site with us.

What we do. The Human Rights Data Analysis Group (HRDAG) develops statistical techniques to measure human rights atrocities. Our work helps bring dictators to justice through data analysis of human rights atrocities around the world. Over more than 20 years, our small team has developed technology and statistical techniques to take disjoint, incomplete, and inaccurate information from conflict zones and process it to identify ...

Estimating the Number of SARS-CoV-2 Infections and the Impact of Mitigation Policies in the United States

James Johndrow, Patrick Ball, Maria Gargiulo, and Kristian Lum. (2020). Estimating the Number of SARS-CoV-2 Infections and the Impact of Mitigation Policies in the United States. Harvard Data Science Review. 24 November, 2020. © The Authors, 2020, CC BY 4.0. https://doi.org/10.1162/99608f92.7679a1ed

FAQs on Predictive Policing and Bias

Last month Significance magazine published an article on the topic of predictive policing and police bias, which I co-authored with William Isaac. Since then, we've published a blogpost about it and fielded a few recurring questions. Here they are, along with our responses. Do your findings still apply given that PredPol uses crime reports rather than arrests as training data? Because this article was meant for an audience that is not necessarily well-versed in criminal justice data and we were under a strict word limit, we simplified language in describing the data. The data we used is a version of the Oakland Police Department’s crime report...

Lessons at HRDAG: Holding Public Institutions Accountable

Principled Data Processing is a way to prove to someone, usually yourself, that what you did was right.

To Count the Uncounted: An Estimation of Lethal Violence in Casanare

Tamy Guberek, Daniel Guzmán, Megan Price, Kristian Lum and Patrick Ball, “To Count the Uncounted: An Estimation of Lethal Violence in Casanare,” A Report by the Benetech Human Rights Program. 10 February 2010. (Available in Spanish) © 2010 Benetech. Creative Commons BY-NC-SA.

Criminality registration and measurement

Patrick Ball and Michael Reed Hurtado. 2016. El registro y la medición de la criminalidad. El problema de los datos faltantes y el uso de la ciencia para producir estimaciones en relación con el homicidio en Colombia, demostrado a partir de un ejemplo: el departamento de Antioquia (2003-2011). Revista Criminalidad, 58 (1): 9-23.

HRDAG Adds Three New Board Members

HRDAG's advisory board has added three new members.

Measuring the Mortality Consequences of Armed Conflict in Amritsar, India: A New Approach to the Indirect Sampling of Conflict-Related Mortality

Romesh Silva and Jeff Klingner. “Measuring the Mortality Consequences of Armed Conflict in Amritsar, India: A New Approach to the Indirect Sampling of Conflict-Related Mortality.” Poster presented at the Population Association of America 2011 Annual Meeting. © 2011 Benetech. Creative Commons BY-NC-SA.

500 Tamils forcibly disappeared in three days, after surrendering to army in 2009

A new study has estimated that over 500 Tamils were forcibly disappeared in just three days, after surrendering to the Sri Lankan army in May 2009.

The study, carried out by the Human Rights Data Analysis Group and the International Truth and Justice Project and based on compiled lists identifying those known to have surrendered, estimated that 503 people were forcibly disappeared between the 17th and 19th of May 2009.

Counting The Dead: How Statistics Can Find Unreported Killings

Ball analyzed the data reporters had collected from a variety of sources – including on-the-ground interviews, police records, and human rights groups – and used a statistical technique called multiple systems estimation to roughly calculate the number of unreported deaths in three areas of the capital city Manila.

The team discovered that the number of drug-related killings was much higher than police had reported. The journalists, who published their findings last month in The Atlantic, documented 2,320 drug-linked killings over an 18-month period, approximately 1,400 more than the official number. Ball’s statistical analysis, which estimated the number of killings the reporters hadn’t heard about, found that close to 3,000 people could have been killed – more than three times the police figure.
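To give a feel for the idea behind multiple systems estimation, here is a minimal sketch of its simplest special case: two lists and the Lincoln-Petersen capture-recapture estimator, alongside Chapman's small-sample variant. This is not the model used in the analysis described above, which drew on several sources; the function names and all counts below are hypothetical and chosen only for illustration. The sketch also assumes the two lists are independent and that records have already been matched across lists, assumptions that real conflict data rarely satisfy.

```python
def lincoln_petersen(n_a: int, n_b: int, n_both: int) -> float:
    """Estimate the total number of victims from two overlapping lists.

    n_a, n_b : number of distinct victims on list A and on list B
    n_both   : number of victims who appear on both lists
    Assumes the lists are independent samples of the same population.
    """
    if n_both <= 0:
        raise ValueError("the lists must share at least one record")
    return n_a * n_b / n_both


def chapman(n_a: int, n_b: int, n_both: int) -> float:
    """Chapman's variant of the two-list estimator, less biased for small overlaps."""
    return (n_a + 1) * (n_b + 1) / (n_both + 1) - 1


# Hypothetical counts, not taken from any real data set.
list_a = 600   # victims documented by source A
list_b = 400   # victims documented by source B
both = 200     # victims documented by both sources

documented = list_a + list_b - both                        # distinct victims seen at least once
estimated_total = lincoln_petersen(list_a, list_b, both)   # documented plus undocumented
estimated_unreported = estimated_total - documented

print(f"documented: {documented}")
print(f"estimated total: {estimated_total:.0f}")
print(f"estimated unreported: {estimated_unreported:.0f}")
```

The intuition: if source B finds half of the victims that source A already knows about, B is probably finding about half of all victims, so the total is roughly twice B's count. With more than two lists, analysts typically fit richer models that relax the independence assumption.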

Ball said there are both moral and technical reasons for making sure everyone who has been killed in mass violence is counted.

“The moral reason is because everyone who has been murdered should be remembered,” he said. “A terrible thing happened to them and we have an obligation as a society to justice and to dignity to remember them.”

HRDAG Retreat 2014

Ten data nerds gathered in a large hilltop beach house to analyze counts of killings from several war-torn countries. The time was January 16-20, 2014, the place was near San Francisco, the agenda was packed, and I was excited to be there. Having defended my dissertation at Carnegie Mellon University just days before, I had often supposed that my thesis on a generalization of log-linear models for capture-recapture might serve little other purpose than to fill a line on my curriculum vitae. This perception faded after a mid-2013 discussion with Patrick convinced me that HRDAG's data challenges could easily be the best match to my research ...

The impact of overbooking on a pre-trial risk assessment tool

Kristian Lum, Chesa Boudin and Megan Price (2020). The impact of overbooking on a pre-trial risk assessment tool. FAT* '20: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. January 2020. Pages 482–491. https://doi.org/10.1145/3351095.3372846 ©ACM, Inc., 2020.

Assessing Claims of Declining Lethal Violence in Colombia

Patrick Ball, Tamy Guberek, Daniel Guzmán, Amelia Hoover, and Meghan Lynch (2007). “Assessing Claims of Declining Lethal Violence in Colombia.” Benetech. Also available in Spanish – “Para Evaluar Afirmaciones Sobre la Reducción de la Violencia Letal en Colombia.”

BJS Report on Arrest-Related Deaths: True Number Likely Much Greater

(This post is co-authored by Patrick Ball and Kristian Lum.) Today the Bureau of Justice Statistics (BJS) released a report on their effort to document “all deaths that occur during the process of arrest in the United States.” The analysis estimates that the Arrest-Related Deaths (ARD) program covers only 34-49% of these deaths. A parallel program by the FBI (the Supplementary Homicide Reports, SHR) is estimated to cover approximately the same proportion of deaths. Even taking into consideration both programs, 28% of all police homicides remain unreported. In order to estimate the total number of homicides that appear on neither the ARD or ...

HRDAG Retreat 2015

I look at the beach and then at the table surrounded by nerds, deep in thought and conversation about Dirichlet priors, matching algorithms, and armed conflicts. This peculiar (in the best way) environment catalyzes a moment of reflection: how did I get here? Four years ago, as a second-year statistics PhD student, I watched "Guatemala: The Secret Files" on PBS Frontline World. I listened to stories of family members who disappeared without answers or justice. Then the story shifted to the work being done by archivists and data experts at Guatemala's Historic Archive of the National Police. The scientists' pursuit of the truth energized me. I ...

Measuring lethal counterinsurgency violence in Amritsar District, India using a referral-based sampling technique

Romesh Silva, Jeff Klingner, and Scott Weikart. “Measuring lethal counterinsurgency violence in Amritsar District, India using a referral-based sampling technique.” In JSM Proceedings, Social Statistics Section. Alexandria, VA: American Statistical Association, 2010. © 2010 JSM. All rights reserved.

Press Release, Timor-Leste, February 2006

SILICON VALLEY GROUP USES TECHNOLOGY TO HELP THE TRUTH COMMISSION ANSWER DISPUTED QUESTIONS ABOUT MASSIVE POLITICAL VIOLENCE IN TIMOR-LESTE

Palo Alto, CA, February 9, 2006 – The Benetech® Initiative today released a statistical report detailing widespread and systematic violations in Timor-Leste during the period 1974-1999. Benetech's statistical analysis establishes that at least 102,800 (+/- 11,000) Timorese died as a result of the conflict. Approximately 18,600 (+/- 1,000) Timorese were killed or disappeared, while the remainder died due to hunger and illness in excess of what would be expected due to peacetime mortality. The magnitude of deaths ...

Our work has been used by truth commissions, international criminal tribunals, and non-governmental human rights organizations. We have worked with partners on projects on five continents.
