In July 2009, the Human Rights Data Analysis Group (HRDAG) concluded a three-year project with the Liberian Truth and Reconciliation Commission to help clarify Liberia's violent history and hold perpetrators of human rights abuses accountable for their actions. (This work was conducted by HRDAG while with Benetech.) In the course of this work, HRDAG analyzed more than 17,000 victim and witness statements collected by the Liberian Truth and Reconciliation Commission and compiled the data into a report entitled "Descriptive Statistics From Statements to the Liberian Truth and Reconciliation Commission." The report is included as an annex to the final ...

CIIDH Data – Dictionary

Version date: 2000.01.29. Current version: ATV20.1. Patrick Ball & Herbert F. Spirer. The unit of analysis for each record in this structure is VIOLATION. Each violation was of a particular type, happened at a particular time and place, and was committed by zero, one, or several organizational perpetrators. The violation was committed against zero or one named (individually identified) victim, and zero or more anonymous (unidentified) additional victims. The violation was reported one or more times in one, two, or three source types. Note that to count the number of times individuals suffered particular violations, users should sum either the ...

Letter from Alejandro Valencia Villa

Alejandro Valencia Villa is a former commissioner of the Colombian Truth Commission. (Letter in English, and letter in Spanish.)

Introduction

One of the most obvious and most difficult questions to answer when analyzing an armed conflict is determining the number of victims. In a conflict like Colombia’s, prolonged and with complex characteristics due to the different nature of the armed actors and because they committed a great variety and quantity of human rights violations and breaches of humanitarian law, the challenge is even greater. As if this were not enough, Colombia also had a large number of records of these violations and infract...

The task is a quantum of workflow

This post describes how we organize our work over ten years, twenty analysts, dozens of countries, and hundreds of projects: we start with a task. A task is a single chunk of work, a quantum of workflow. Each task is self-contained and self-documenting; I'll talk about these ideas at length below. We try to keep each task as small as possible, which makes it easy to understand what the task is doing and how to test whether the results are correct. The example I'll describe here comes from our Syria database matching project, which includes about 100 tasks. I'll start with the first thing we do with files we receive ...

How many police homicides in the US? A reconsideration

(This post is co-authored by Patrick Ball and Kristian Lum.) In early March, the Bureau of Justice Statistics published a report that estimated that in the period 2003-2009 and 2011, there were approximately 7,427 homicides committed by police in the US. We responded that the method the analysts used, capture-recapture with two databases, is vulnerable to underestimation if the databases exhibit positive dependence. We conducted a thorough sensitivity analysis on the original independence model as applied to the police homicides databases. We used information from several other countries where our partners created multiple databases of homicides. We ...
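The underestimation concern can be illustrated with the classic two-list (Lincoln-Petersen) estimator, which assumes the two databases are independent. The following is a minimal sketch, not the analysis from the post or the BJS report: the function name and every count are hypothetical, chosen only to show how a larger-than-expected overlap pulls the estimate down.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Two-list capture-recapture estimate of the total population size.

    n1, n2: number of records in each list; m: records found in both.
    Assumes the lists are independent, which is the assumption at issue.
    """
    if m <= 0:
        raise ValueError("the two lists must share at least one record")
    return n1 * n2 / m


# Hypothetical setting: a true total of 1000 events,
# list A documents 400 of them and list B documents 300.
n_list_a, n_list_b = 400, 300

# Under independence the expected overlap is 400 * 300 / 1000 = 120,
# and the estimator recovers the true total.
independent_estimate = lincoln_petersen(n_list_a, n_list_b, 120)

# With positive dependence (events on one list are more likely to be
# on the other), the overlap is larger, here a hypothetical 200,
# and the same formula lands well below the true total.
dependent_estimate = lincoln_petersen(n_list_a, n_list_b, 200)

print(independent_estimate, dependent_estimate)
```

Because the overlap sits in the denominator, any mechanism that inflates it relative to independence lowers the estimate, which is why a two-database analysis cannot by itself rule out an undercount.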


In 2009, as Indians debated institutional reform of their security forces in the wake of the previous year's Mumbai attacks, HRDAG issued a groundbreaking report about the human cost of suspending the rule of law during a violent counterinsurgency campaign in the Indian state of Punjab. Together with our partner Ensaaf, HRDAG released findings that cast substantial doubt on the Indian government's past explanations and justifications for disappearances and extrajudicial killings during the height of the Punjab counterinsurgency in the early 1990s. These findings contribute to an increasing body of knowledge that informs policy questions about the ...

Casanare, Colombia

Estimates of Killings and Disappearances in Casanare

Casanare is a large, rural department, or state, in Colombia that includes 19 municipalities and a population of almost 300,000 inhabitants. Located in the foothills of the Andes and on the eastern plains, Casanare has a history of violence. Multiple armed groups have operated in Casanare, including paramilitaries, guerrillas, and the Colombian military. Many Casanare citizens have suffered violent deaths and disappearances. But how many people have been killed or disappeared? For reasons of policy, accountability, and historical clarification, this question deserves a valid answer. In February ...

In Colombia: HRDAG and Dejusticia on the Importance of Missing Data

It’s inevitable that databases will have information gaps, and special care must be taken to account for these gaps.

14 Questions about Counting Casualties in Syria

In early 2012, HRDAG was commissioned by the UN Office of the High Commissioner for Human Rights (OHCHR) to do an enumeration project, essentially a count of all of the reported casualties in the Syrian conflict. HRDAG has published two analyses so far, the first in January 2013, and the second in June 2013. In this post, HRDAG scientists Anita Gohdes, Megan Price, and Patrick Ball answer questions about that project.

So, how many people have been killed in the Syrian conflict?

This is a complicated question. As of our last report, in June 2013, we know that there have been at least 93,000 reported, identifiable conflict-related casualties. The ...

The CVR's Figures in 2019

The estimates were stratified by location and perpetrator.

Lessons at HRDAG: Making More Syrian Records Usable

If we could glean key missing information from those fields, we would be able to use more records.

Chad – Photo Essay

Hissène Habré was president of the former French colony of Chad from 1982 to 1990. Credible allegations of systematic torture and crimes against humanity have been made against Habré’s state security force, the Documentation and Security Directorate (DDS), which pursued political opponents and operated notorious prisons during his regime. One prison where the DDS is alleged to have tortured prisoners is the “Piscine,” a former swimming pool covered by a concrete roof. Prisoners were held in ten dank cells where witnesses say they were starved and abused. After being forced from power in 1990, Habré went into exile ...

Analyzing Patterns of Violence in Colombia with More Than 100 Databases

The goal of this temporary institution is to learn the truth about what happened in the context of the armed conflict.

How we go about estimating casualties in Syria—Part 1

I spent the two weeks over Easter working with Patrick and Megan in San Francisco, trying to figure out a strategy of how best to estimate the number of casualties the Syrian civil war has claimed in the past two years. In January, HRDAG published a report on the number of fully identified casualties reported in the Syrian Arab Republic between March 2011 and November 2012. The number of de-duplicated records of killings for this period was 59,648, a number that is likely to be an undercount since we know that many incidences of lethal violence in conflict go unreported, and that the unreported cases are not missing at random.

The Statistics of Mortality Due to Conflict in Peru

A key point is that human rights data collection prior to the TRC largely ignored violence by the Shining Path.

12 Questions about Using Data Analysis to Bring Guatemalan War Criminals to Justice

When people talk about war criminals in Guatemala, which war are they talking about? They’re talking about the Guatemalan civil war, which began in 1960 and ended in 1996. That’s thirty-six years of civil war. Even though it ended almost two decades ago, Guatemala is still recovering from it. At its simplest, this civil war story was right-wing government forces fighting leftist rebels. But it went deeper than that, of course. The majority of the rebel forces was composed of indigenous peoples, primarily the Maya, ...


During the violence in Timor-Leste in June 2006, armed gangs broke into the offices of the Commission for Reception, Truth and Reconciliation (CAVR) in Dili and stole their motorbikes. The Human Rights Data Analysis Group, then at Benetech®, and other human rights observers wondered whether the mobs would soon return to loot the irreplaceable paper records used by the CAVR to compile their definitive report entitled "Chega!" The Benetech Initiative contributed to the CAVR findings and released a separate statistical report (PDF) establishing that at least 102,800 (+/- 11,000) Timorese died as a result of human rights violations in Timor-Leste ...

Multiple Systems Estimation: Collection, Cleaning and Canonicalization of Data

Q3. What are the steps in an MSE analysis?
Q4. What does data collection look like in the human rights context? What kind of data do you collect?
Q5. [In depth] Do you include unnamed or anonymous victims in the matching process?
Q6. What do you mean by "cleaning" and "canonicalization"?
Q7. [In depth] What are some of the challenges of canonicalization?

Frequently Asked Questions

Multiple Systems Estimation

What is MSE?
What do you mean by statistical inference?
What is an overlap, and how do we know when lists overlap?
How does MSE find the total number of violations?
How was MSE originally developed?
How does the Benetech Human Rights Program use MSE?

1. What is MSE?
A: Multiple Systems Estimation, or MSE, is a family of techniques for statistical inference. MSE uses the overlaps between several incomplete lists of human rights violations to determine the total number of violations.

2. What do you mean by statistical inference?
A: ...
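To make "the overlaps between several incomplete lists" concrete, the sketch below tabulates how many records appear on each combination of lists. It is an illustration under stated assumptions, not HRDAG's pipeline: the list names, the record identifiers, and the helper `overlap_counts` are all hypothetical, and the records are assumed to have already been matched so that the same victim carries the same identifier on every list.

```python
from collections import Counter
from typing import Dict, Set, Tuple


def overlap_counts(lists: Dict[str, Set[str]]) -> Counter:
    """Count records by inclusion pattern across the given lists.

    Each pattern is a tuple of 0/1 flags, one per list, in sorted
    order of the list names. (1, 1, 0) means "on the first two
    lists but not on the third".
    """
    names = sorted(lists)
    all_records = set().union(*lists.values())
    patterns: Counter = Counter()
    for record in all_records:
        pattern: Tuple[int, ...] = tuple(int(record in lists[n]) for n in names)
        patterns[pattern] += 1
    return patterns


# Hypothetical, already de-duplicated lists of victim identifiers.
documented = {
    "A": {"v1", "v2", "v3"},
    "B": {"v2", "v3", "v4"},
    "C": {"v3", "v5"},
}

counts = overlap_counts(documented)
for pattern, n in sorted(counts.items(), reverse=True):
    print(pattern, n)
```

The pattern of all zeros never appears in the table, because a record on no list is by definition undocumented. Estimating the size of that unobserved cell from the observed overlap counts is the inference step that MSE models perform.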

Data Archaeology for Human Rights in Central America: HRDAG Collaborates with UWCHR

Patrick Ball is kicking himself for a decision he made almost 25 years ago. “I was clever, but I wasn’t smart,” he says ruefully, as he considers the labyrinth of tables and ASCII-encoded keystrings he used to design a database of human rights violations for the pioneering Salvadoran non-governmental Human Rights Commission (CDHES). Now I’m sitting in his office in San Francisco’s Mission District watching over his shoulder, and trying to keep up, as he bangs out code to decipher the priceless data contained in these old files. Created in 1991 and 1992, during the last days of El Salvador’s internal armed conflict, the files detail ...

Our work has been used by truth commissions, international criminal tribunals, and non-governmental human rights organizations. We have worked with partners on projects on five continents.
