In 2009, as Indians debated institutional reform of their security forces in the wake of the previous year's Mumbai attacks, HRDAG issued a groundbreaking report about the human cost of suspending the rule of law during a violent counterinsurgency campaign in the Indian state of Punjab. Together with our partner Ensaaf, HRDAG released findings that cast substantial doubt on the Indian government's past explanations and justifications for disappearances and extrajudicial killings during the height of the Punjab counterinsurgency in the early 1990s. These findings contribute to an increasing body of knowledge that informs policy questions about the ...
(This post is co-authored by Patrick Ball and Megan Price)
In a recent article in the SAIS Review of International Affairs, we wrote about "event size bias," the problem that events of different sizes have different probabilities of being reported. Here, the size of an event is defined by the number of reported victims. Our concern is that not all violent events (in this case, homicides) are recorded; that is, some events will have zero sources. Our theory is that events with fewer victims will receive less coverage than events with more victims, and that a higher proportion of small events than of large events will have zero sources.
The ...
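To make the intuition behind event size bias concrete, here is a minimal sketch in Python. It is not the model from the SAIS Review article. It assumes, purely for illustration, that each victim independently gives a source a fixed chance of recording the event, and that sources act independently of one another. The function names, the 10% per-victim rate, and the three sources are all hypothetical.

```python
def prob_source_reports(size: int, per_victim_rate: float = 0.1) -> float:
    """Hypothetical chance that a single source records an event with `size` victims."""
    # Each victim independently draws the source's attention with `per_victim_rate`.
    return 1.0 - (1.0 - per_victim_rate) ** size


def prob_zero_sources(size: int, n_sources: int = 3,
                      per_victim_rate: float = 0.1) -> float:
    """Chance that none of `n_sources` independent sources records the event."""
    p = prob_source_reports(size, per_victim_rate)
    return (1.0 - p) ** n_sources


# Compare how often events of different sizes go entirely unrecorded.
for size in (1, 2, 5, 10, 20):
    print(f"{size:>2} victims: P(zero sources) = {prob_zero_sources(size):.3f}")
```

Under these assumptions, a one-victim event goes entirely unrecorded far more often than a twenty-victim event, which is the pattern the theory predicts.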
Violent Deaths and Enforced Disappearances During the Counterinsurgency in Punjab, India: A Preliminary Quantitative Analysis
Frequently Asked Questions
If there is so much data available, why can't you make claims about the number of people killed by security forces during the Punjab counterinsurgency campaign?
Haven't Punjab Police and government bodies already documented the number of people killed and "illegally cremated"? Why doesn't this suffice?
What has been the impact of quantitative studies of human rights violations in other regions?
What impact do these findings have in the Punjab context? Why did you undertake this study?
What are the ...
Version date: 2000.01.29
Current version: ATV20.1
Patrick Ball & Herbert F. Spirer
v_ind
-------------+-----------
Victim       |
Ethnic       |
category     |      Freq.
-------------+-----------
1 Indigenous |      2,722
2 Ladino     |      1,014
3 Unknown    |     13,687
             |
Total        |     17,423
-------------+-----------
v_sex
----------+-----------
Victim    |
Sex       |      Freq.
----------+-----------
4 F       |      2,001
5 M       |     11,445
6 d       |      3,977
          |
Total     |     17,423
----------+-----------
v_eth
-------------+-----------
Victim       |
Maternal     |
language ...
The HRDAG Tech Corner is where we collect the deeper and geekier content that we create for the website. Click the accordion blocks below to reveal each of the Tech Corner entries.
Sifting Massive Datasets with Machine Learning
Principled Data Processing
Over the last few years, we've tried to make the data organized in our projects publicly accessible. We have encouraged our partners to publish the data at the completion of the project. We continue to believe it is important to offer access to the data used in our projects, both for the sake of transparency and to encourage further research and analysis. However, we are increasingly concerned about how raw data are used. Data limited to what we are able to observe are what statisticians call a convenience sample, which is subject to selection bias.
We're keeping these datasets available for researchers who want to use them for simulation or estimation ...
Patrick Ball and Miguel Cruz (2003). “Human freedom and free software: Why choices about technology matter to human rights advocates.”
Patrick Ball (2005). “Free Software,” in The Encyclopedia of Science, Technology, and Ethics. ed. by Carl Mitcham. Farmington Hills, MI: Thomson Gale.
Kristian Lum, lead statistician at the Human Rights Data Analysis Group (and letter signatory), fears that “in order to flag even a small proportion of future terrorists, this tool will likely flag a huge number of people who would never go on to be terrorists,” and that “these ‘false positives’ will be real people who would never have gone on to commit criminal acts but will suffer the consequences of being flagged just the same.”
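The arithmetic behind this concern can be sketched with a short calculation. All numbers here are hypothetical and chosen only to show how a low base rate interacts with an imperfect classifier; they do not describe any real tool or population.

```python
def flag_outcomes(true_cases: int, non_cases: int,
                  sensitivity: float, false_positive_rate: float):
    """Return expected (true positives, false positives, share of flags that are correct)."""
    true_positives = true_cases * sensitivity
    false_positives = non_cases * false_positive_rate
    share_correct = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_correct


# Hypothetical inputs: 10 future offenders among 1,000,000 people screened,
# and a tool that catches half of them while wrongly flagging 1% of everyone else.
tp, fp, share = flag_outcomes(true_cases=10, non_cases=999_990,
                              sensitivity=0.5, false_positive_rate=0.01)
print(f"expected true positives:  {tp:.1f}")
print(f"expected false positives: {fp:.1f}")
print(f"share of flags that are correct: {share:.4%}")
```

With these made-up inputs, roughly 10,000 people are flagged in error for every 5 flagged correctly.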
Collecting and Protecting Human Rights Data in Guatemala (1991-2013)
In 1996, a peace accord brokered by the United Nations ended 36 years of internal armed conflict in Guatemala. During the hostilities, non-governmental organizations asked the scientific community for technical support in gathering the experiences of witnesses and victims in databases.
From 1993 to 1999, Dr. Patrick Ball, then at the American Association for the Advancement of Science (AAAS), worked with the International Center for Human Rights Research in Guatemala (CIIDH) to collect and organize evidence of more than 43,000 human rights violations. The ...
HRDAG’s analysis and expertise continue to deepen the national conversation about police violence and criminal justice reform in the United States. In 2015 we began by considering undocumented victims of police violence, relying on the same methodological approach we’ve tested internationally for decades. Shortly after, we examined “predictive policing” software, and demonstrated the ways that racial bias is baked into the algorithms. Following our partners’ lead, we next considered the impact of bail, and found that setting bail increases the likelihood of a defendant being found guilty. We then broadened our investigations to examine ...
This post describes how we organize our work over ten years, twenty analysts, dozens of countries, and hundreds of projects: we start with a task. A task is a single chunk of work, a quantum of workflow. Each task is self-contained and self-documenting; I'll talk about these ideas at length below. We try to keep each task as small as possible, which makes it easy to understand what the task is doing, and how to test whether the results are correct.
The example I'll describe here comes from our Syria database matching project, which includes about 100 tasks. I'll start with the first thing we do with files we receive ...