HRDAG and the Digital Commons

One of the three main goals of HRDAG is education and outreach, and to that end we use Creative Commons licenses for all of our blogposts and, whenever possible, for our publications. Using a Creative Commons license makes it clear that educators are free to use HRDAG's publications, in their entirety, and with the peace of mind that they are doing so with our blessing. Also, the use of the Creative Commons license allows us to participate in and encourage the creation of a digital commons, which we feel helps to advance another one of our goals, the creation of knowledge. We feel that it’s important to offer up our publications for use and reuse ...

FAQs on Predictive Policing and Bias

Last month Significance magazine published an article on the topic of predictive policing and police bias, which I co-authored with William Isaac. Since then, we've published a blogpost about it and fielded a few recurring questions. Here they are, along with our responses. Do your findings still apply given that PredPol uses crime reports rather than arrests as training data? Because this article was meant for an audience that is not necessarily well-versed in criminal justice data and we were under a strict word limit, we simplified language in describing the data. The data we used is a version of the Oakland Police Department’s crime report...

Multiple Systems Estimation: Does it Really Work?

<< Previous post, MSE: Stratification and Estimation Q15. Are there other MSE models one might use with human rights data? Q16. Is it possible to use MSE to model non-lethal human rights violations? Q17. I am concerned about using MSE with my data, because the datasets were gathered by opposing organizations. Victims who were reported to an NGO were very unlikely to be reported to state sources, but also very likely to be reported to religious organizations. Won't that cause the overlaps between the NGO list and the state list to be artificially low, and the overlaps between the NGO list and the church list to be artificially high? Does ...

Convenience Samples: What they are, and what they should (and should not) be used for

As noted on our Core Concepts page, we spend a lot of time worrying about the ways data are used to make claims about human rights violations. This is because inaccurate statistics can damage the credibility of human rights claims. Analyses of records of human rights violations are used to guide policy decisions, determine resource allocation for interventions, and inform transitional justice mechanisms. It is vital that such analyses are accurate. Unfortunately, all too often these decisions are based, inappropriately, on analyses of a single convenience sample.

Strong Crypto Safeguards Human Rights Data

Strong cryptography can safeguard critical human rights data from repressive governments that steal data in order to persecute citizens. When vulnerable citizens dare to bear witness by naming perpetrators, their crimes, and their victims, the sensitive identifying information about those witnesses must be protected. In the late 1990s, HRDAG’s Director of Research, Patrick Ball, began his work with encrypted data while documenting crimes committed by the Guatemalan national police, and strong cryptography has remained critical to all of HRDAG’s work.

Reflections on Data Science for Real-World Problems

"Data science can lay out evidence to engage the state about systemic issues and assure people that they are not alone."

The task is a quantum of workflow

This post describes how we organize our work over ten years, twenty analysts, dozens of countries, and hundreds of projects: we start with a task. A task is a single chunk of work, a quantum of workflow. Each task is self-contained and self-documenting; I'll talk about these ideas at length below. We try to keep each task as small as possible, which makes it easy to understand what the task is doing, and how to test whether the results are correct. In the example I'll describe here, I'm going to describe work from our Syria database matching project, which includes about 100 tasks. I'll start with the first thing we do with files we receive ...


HRDAG’s analysis and expertise continue to deepen the national conversation about police violence and criminal justice reform in the United States. In 2015 we began by considering undocumented victims of police violence, relying on the same methodological approach we’ve tested internationally for decades. Shortly after, we examined “predictive policing” software, and demonstrated the ways that racial bias is baked into the algorithms. Following our partners’ lead, we next considered the impact of bail, and found that setting bail increases the likelihood of a defendant being found guilty. We then broadened our investigations to examine ...

Killings of Social Movement Leaders in Colombia

Using multiple systems estimation, we estimate the total number of social movement leaders killed in Colombia during 2018.

India FAQs

Violent Deaths and Enforced Disappearances During the Counterinsurgency in Punjab, India: A Preliminary Quantitative Analysis Frequently Asked Questions If there is so much data available, why can't you make claims about the number of people killed by security forces during the Punjab counterinsurgency campaign? Haven't Punjab Police and government bodies already documented the number of people killed and "illegally cremated?" Why doesn't this suffice? What has been the impact of quantitative studies of human rights violations in other regions? What impact do these findings have in the Punjab context? Why did you undertake this study? What are the ...

Liberian TRC Data and Data Dictionary

The files linked on this page contain the data used in the calculations presented in Benetech's report to the Liberian Truth and Reconciliation Commission entitled "Descriptive Statistics From Statements to the Liberian Truth and Reconciliation Commission." In accordance with Benetech's Memorandum of Understanding with the TRC, these data are published on the Internet so that others can use the material to replicate our findings and continue research on past human rights violations in Liberia. In order to protect the privacy of the people who suffered, the information in the files below contains no personal identifying information about the victims or ...

New results for the identification of municipalities with clandestine graves in Mexico

The goal of this project is to identify Mexican municipalities with a high probability of having clandestine graves. Knowing where to search will help to create better public programs regarding missing persons in Mexico.

Multiple Systems Estimation: Collection, Cleaning and Canonicalization of Data

<< Previous post: MSE: The Basics Q3. What are the steps in an MSE analysis? Q4. What does data collection look like in the human rights context? What kind of data do you collect? Q5. [In depth] Do you include unnamed or anonymous victims in the matching process? Q6. What do you mean by "cleaning" and "canonicalization?" Q7. [In depth] What are some of the challenges of canonicalization?

Multiple Systems Estimation: Stratification and Estimation

<< Previous post, MSE: The Matching Process Q10. What is stratification? Q11. [In depth] How do HRDAG analysts approach stratification, and why is it important? Q12. How does MSE find the total number of violations? Q13. [In depth] What are the assumptions of two-system MSE (capture-recapture)? Why are they not necessary with three or more systems? Q14. What statistical model(s) does HRDAG typically use to calculate MSE estimates?

Amnesty International Reports Organized Murder Of Detainees In Syrian Prison

Reports of torture and disappearances in Syria are not new. But the Amnesty International report says the magnitude and severity of abuse has “increased drastically” since 2011. Citing the Human Rights Data Analysis Group, the report says “at least 17,723 people were killed in government custody between March 2011 and December 2015, an average of 300 deaths each month.”

Syria: No word on four abducted activists

Razan Zaitouneh is an esteemed colleague of ours, and we are one of 57 organizations demanding immediate release for her and the three other human rights defenders still missing. A year on, no information on Douma Four. The prominent Syrian human rights defenders Razan Zaitouneh, Samira Khalil, Wa’el Hamada and Nazem Hamadi, known as the Douma Four, remain missing a year after their abduction, 57 organizations said today. The four were abducted in Douma, a city near Damascus under the control of armed opposition groups. They should be released immediately, the groups said. On 9 December 2013, at about 10:40 pm, a group of armed men stormed into the ...

Reflections: Some Stories Shape You

The first time I met anyone at HRDAG, I was a journalist. It was 2006. I was working on a story about a graduate student at Carnegie Mellon who’d collaborated with the organization on a survey in Sierra Leone, and I contacted Patrick Ball to discuss the work. At the time, I found him challenging. But I thought his work—trying to estimate how many people were killed, or, in that study, otherwise injured, during wars—was fascinating. Over the next few years, I got to know other researchers working on similar questions. In 2008, as the war in Iraq ramped up, I spoke with epidemiologists from Johns Hopkins University, the World Health Organiz...

Frequently Asked Questions

Multiple Systems Estimation What is MSE? What do you mean by statistical inference? What is an overlap, and how do we know when lists overlap? How does MSE find the total number of violations? How was MSE originally developed? How does the Benetech Human Rights Program use MSE? 1. What is MSE? A: Multiple Systems Estimation, or MSE, is a family of techniques for statistical inference. MSE uses the overlaps between several incomplete lists of human rights violations to determine the total number of violations. 2. What do you mean by statistical inference? A: ...

How many police homicides in the US? A reconsideration

(This post is co-authored by Patrick Ball and Kristian Lum.) In early March, the Bureau of Justice Statistics published a report that estimated that in the period 2003-2009 and 2011, there were approximately 7427 homicides committed by police in the US. We responded that the method the analysts used, capture-recapture with two databases, is vulnerable to underestimation if the databases exhibit positive dependence. We conducted a thorough sensitivity analysis on the original independence model as applied to the police homicides databases. We used information from several other countries where our partners created multiple databases of homicides. We ...
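The underestimation concern in the excerpt above can be illustrated with a small sketch. It uses the textbook two-list (Lincoln-Petersen) estimator, which is an assumption here and not necessarily the exact model used in the report or in HRDAG's sensitivity analysis. The population size, list sizes, and overlap counts are hypothetical numbers chosen only for illustration.

```python
def lincoln_petersen(n_a, n_b, n_both):
    """Textbook two-list capture-recapture estimate of the total population.

    n_a, n_b: number of records on each list; n_both: records found on both.
    """
    if n_both <= 0:
        raise ValueError("estimate is undefined when the lists do not overlap")
    return n_a * n_b / n_both


# Hypothetical setup: 1000 true events, each list documents 400 of them.
TRUE_N = 1000
n_a = n_b = 400

# If the lists are independent, the expected overlap is
# 1000 * (400 / 1000) * (400 / 1000) = 160 records.
independent = lincoln_petersen(n_a, n_b, 160)

# Under positive dependence, events on one list are more likely to appear
# on the other, so the overlap is larger (250 here, a hypothetical value).
dependent = lincoln_petersen(n_a, n_b, 250)

print(independent)  # 1000.0, matches the true total
print(dependent)    # 640.0, well below the true total
```

With the same list sizes, a larger overlap shrinks the estimate, which is why an independence model applied to positively dependent lists tends to undercount.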

Reflections: A Love Letter to HRDAG

On the anniversary of the Universal Declaration of Human Rights, HRDAG executive director Megan Price tells us why she loves her work, and why she feels hopeful about the future.

Our work has been used by truth commissions, international criminal tribunals, and non-governmental human rights organizations. We have worked with partners on projects on five continents.