13 results for author: Patrick Ball


Using MSE to Estimate Unobserved Events

At HRDAG, we worry about what we don't know. Specifically, we worry about how we can use statistical techniques to estimate homicides that are not observed by human rights groups. Based on what we've seen studying many conflicts over the last 25 years, what we don't know is often quite different from what we do know. The technique we use most often to estimate what we don't know is called "multiple systems estimation." In this medium-technical post, I explain how to organize data and use three R packages to estimate unobserved events. The full post is Computing Multiple Systems Estimation in R.

Clustering and Solving the Right Problem

In our database deduplication work, we’re trying to figure out which records refer to the same person, and which other records refer to different people. We write software that looks at tens of millions of pairs of records. We calculate a model that assigns each pair of records a probability that the pair of records refers to the same person. This step is called pairwise classification. However, there may be more than just one pair of records that refer to the same person. Sometimes three, four, or more reports of the same death are recorded. So once we have all the pairs classified, we need to decide which groups of records refer to the ...
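
To make the step from classified pairs to groups concrete, here is a minimal sketch in Python. It is not the method from the post: it assumes the simplest possible rule, which is to link every pair whose match probability clears a threshold and then take the connected groups (transitive closure via union-find). The function name `cluster_pairs`, the record IDs, the probabilities, and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict


def cluster_pairs(record_ids, pair_probs, threshold=0.5):
    """Group records by linking every pair with probability >= threshold.

    Assumption: simple transitive closure, used here only for illustration.
    """
    # Each record starts as its own cluster root.
    parent = {r: r for r in record_ids}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), prob in pair_probs.items():
        if prob >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    groups = defaultdict(list)
    for r in record_ids:
        groups[find(r)].append(r)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical pairwise classification output.
records = ["r1", "r2", "r3", "r4", "r5"]
probs = {
    ("r1", "r2"): 0.97,
    ("r2", "r3"): 0.91,
    ("r1", "r3"): 0.40,  # below threshold, yet r1 and r3 still end up together
    ("r4", "r5"): 0.05,
}
clusters = cluster_pairs(records, probs)
print(clusters)
```

In this made-up example, r1 and r3 land in the same group through r2 even though their own pair scored only 0.40. That kind of conflict is why grouping is a separate problem from pairwise classification, and why a rule this simple may not be the right one.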

The task is a quantum of workflow

This post describes how we organize our work over ten years, twenty analysts, dozens of countries, and hundreds of projects: we start with a task. A task is a single chunk of work, a quantum of workflow. Each task is self-contained and self-documenting; I'll talk about these ideas at length below. We try to keep each task as small as possible, which makes it easy to understand what the task is doing, and how to test whether the results are correct. The example here comes from our Syria database matching project, which includes about 100 tasks. I'll start with the first thing we do with files we receive ...

A geeky deep-dive: database deduplication to identify victims of human rights violations

In our work, we merge many databases to figure out how many people have been killed in violent conflict. Merging is a lot harder than you might think. Many of the database records refer to the same people--the records are duplicated. We want to identify and link all the records that refer to the same victims so that each victim is counted only once, and so that we can use the structure of overlapping records to do multiple systems estimation. Merging records that refer to the same person is called entity resolution, database deduplication, or record linkage. For definitive overviews of the field, see Scheuren, Herzog, and Winkler, Data Quality ...

Focus on Good Science, not Scientists

We recently learned about an article by Dr Nafeez Ahmed that criticizes the methods and conclusions of the Iraq Body Count (IBC) and the work of Professor Michael Spagat. Dr Ahmed cites our work extensively in support of his arguments, so we think it’s useful for us to reply. We welcome Dr Ahmed’s summary of various points of scientific debate about mortality due to violence, specifically in Iraq and Colombia. We think these are very important questions for the analysis of data about violent conflict, and indeed, about data analysis more generally. We appreciate his exploration of the technical nuances of this difficult field. Unfortunately, ...

How many police homicides in the US? A reconsideration

(This post is co-authored by Patrick Ball and Kristian Lum.) In early March, the Bureau of Justice Statistics published a report that estimated that in the period 2003-2009 and 2011, there were approximately 7427 homicides committed by police in the US. We responded that the method the analysts used, capture-recapture with two databases, is vulnerable to underestimation if the databases exhibit positive dependence. We conducted a thorough sensitivity analysis on the original independence model as applied to the police homicides databases. We used information from several other countries where our partners created multiple databases of homicides. We ...
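
The underestimation concern can be illustrated with the standard two-list (Lincoln-Petersen) capture-recapture estimator, N = n1 × n2 / m, where n1 and n2 are the list sizes and m is the number of records on both lists. The sketch below is in Python, and every count in it is hypothetical; none comes from the BJS report or from our analysis. It assumes a true population of 1000 and compares the overlap expected under independence with a larger overlap produced by positive dependence.

```python
def lincoln_petersen(n1, n2, m):
    """Two-list capture-recapture estimate of total population size.

    n1, n2: number of records on each list; m: records found on both lists.
    Assumes the two lists are independent.
    """
    if m <= 0:
        raise ValueError("the estimator needs at least one overlapping record")
    return n1 * n2 / m


# Hypothetical true population: 1000 events.
# Under independence, the expected overlap is 400 * 500 / 1000 = 200.
independent = lincoln_petersen(400, 500, 200)

# Positive dependence: events on one list are more likely to be on the other,
# so the overlap is larger (here 300) and the estimate drops.
dependent = lincoln_petersen(400, 500, 300)

print(round(independent, 1), round(dependent, 1))
```

With the same list sizes, the larger overlap pulls the estimate from 1000 down to roughly 667, well below the assumed true total. That is the direction of bias described above.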

Yezidi Activists Teach HRDAG about Human Rights – updated

UPDATE (21 Dec 2014): Juan Cole is reporting that the Kurdish militia (the peshmerga) have retaken Shingal (also known as Sinjar) mountain where many Yezidi people have been trapped since 3 August 2014. They are now moving to liberate other Yezidi towns south of the mountain. The Yezidi people trapped on the mountain are now free. There is no word yet on the thousands of Yezidi people enslaved by ISIS. ORIGINAL (19 Nov 2014): Farhad (not his real name) got the call from ISIS on his personal cell phone just after lunch: we have your sister, and we will give her back if you pay us $6000, plus $1500 for the driver. Carrying little more than his ...

Revisiting the analysis of event size bias in the Iraq Body Count

(This post is co-authored by Patrick Ball and Megan Price) In a recent article in the SAIS Review of International Affairs, we wrote about "event size bias," the problem that events of different sizes have different probabilities of being reported. In this case, the size of an event is defined by the number of reported victims. Our concern is that not all violent (in this case homicide) events are recorded, that is, some events will have zero sources. Our theory is that events with fewer victims will receive less coverage than events with more victims, and that a higher proportion of small events will have zero sources relative to large events. The ...

The Day We Fight Back

Today, February 11, is the day of national protests against the National Security Agency. The critical threat is mass surveillance. In the words of The Day We Fight Back, “Together we will push back against powers that seek to observe, collect, and analyze our every digital action. Together, we will make it clear that such behavior is not compatible with democratic governance. Together, if we persist, we will win this fight.”

Ouster of Guatemala’s Attorney General

We were surprised and disappointed to learn that our colleague Claudia Paz y Paz has had her term as Guatemala’s attorney general cut short. The nation’s Constitutional Court ruled on 6 February that her four-year term will end this May, instead of in December. (She was appointed in December 2010, replacing an attorney general who was appointed in May 2010.) During her term, she put four generals from Guatemala’s civil war on the stand for charges of crimes against humanity and genocide, including General José Efraín Ríos Montt, who ruled from 1982 to 1983. We were fortunate to work with her on that trial and to witness the handing down of a ...

Why raw data doesn't support analysis of violence

This morning I got a query from a journalist asking for our data from the report we published yesterday. The journalist was hoping to create an interactive infographic to track the number of deaths in the Syrian conflict over time. Our data would not support an analysis like the one proposed, so I wrote this reply. We can't send you these data because they would be misleading—seriously misleading—for the purpose you describe. Here's why: What we have is a list of documented deaths, in essence, a highly non-random sample, though a very big one. We like bigger samples because we think that they must be closer to true. The mathematical justificat...

Historic verdict in Guatemala: Gen. Efraín Ríos Montt found guilty

I've been working with various projects in Guatemala to document mass violence since 1993, so in 2011, when Claudia Paz y Paz asked me to revisit the analysis I did for the Commission for Historical Clarification examining the differential mortality rates due to homicide for indigenous and non-indigenous people in the Ixil region, I was delighted. We have far better data processing and statistical methods than we had in 1998, plus much more data. I think the resulting analysis is a conservative lower bound on total homicides of indigenous people.

Welcome!

As of today, the Human Rights Data Analysis Group (HRDAG) is an independent* non-profit! It's been a long time coming, and we're delighted to have gotten to this point. HRDAG is a non-profit, non-partisan organization that applies rigorous science to the analysis of human rights violations around the world; for more information, see our About Us page. Benetech has spun out the scientific and statistical part of the Human Rights Program to HRDAG. The spinout includes (as staff) me -- Patrick Ball -- and Dr Megan Price, as well as our many part-time scientific and field consultants (a list is here). The software and technology component of our work -- ...