During the violence in Timor-Leste in June 2006, armed gangs broke into the offices of the Commission for Reception, Truth and Reconciliation (CAVR) in Dili and stole their motorbikes.
The Human Rights Data Analysis Group, then at Benetech®, and other human rights observers wondered whether the mobs would soon return to loot the irreplaceable paper records used by the CAVR to compile their definitive report entitled "Chega!"
The Benetech Initiative contributed to the CAVR findings and released a separate statistical report (PDF) establishing that at least 102,800 (+/- 11,000) Timorese died as a result of human rights violations in Timor-Leste ...
Last month Significance magazine published an article on predictive policing and police bias, which I co-authored with William Isaac. Since then, we've published a blog post about it and fielded a few recurring questions. Here they are, along with our responses.
Do your findings still apply given that PredPol uses crime reports rather than arrests as training data?
Because this article was meant for an audience that is not necessarily well-versed in criminal justice data and we were under a strict word limit, we simplified language in describing the data. The data we used is a version of the Oakland Police Department’s crime report...
I spent the two weeks over Easter working with Patrick and Megan in San Francisco, trying to figure out how best to estimate the number of casualties the Syrian civil war has claimed in the past two years. In January, HRDAG published a report on the number of fully identified casualties reported in the Syrian Arab Republic between March 2011 and November 2012. The number of de-duplicated records of killings for this period was 59,648, a number that is likely to be an undercount, since we know that many incidents of lethal violence in conflict go unreported and that the unreported cases are not missing at random.
The data on killings in Kosovo are in four files. All of the files are comma-delimited ASCII. The fields in each file are described below.
If you use these data on Kosovo killings, please cite them with the following citation, as well as this note:
“These are convenience sample data, and as such they are not a statistically representative sample of events in this conflict. These data do not support conclusions about patterns, trends, or other substantive comparisons (such as over time, space, ethnicity, age, etc.).”
Patrick Ball, Wendy Betts, Fritz Scheuren, Jana Dudukovich, and Jana Asher. (2002). AAAS/ABA-CEELI/Human Rights Data ...
Reports of torture and disappearances in Syria are not new. But the Amnesty International report says the magnitude and severity of abuse has “increased drastically” since 2011. Citing the Human Rights Data Analysis Group, the report says “at least 17,723 people were killed in government custody between March 2011 and December 2015, an average of 300 deaths each month.”
Kilómetro Cero is making a comparison of police killings in Puerto Rico and police killings in the non-territorial United States, and HRDAG is helping to organize the data.
Violent Deaths and Enforced Disappearances During the Counterinsurgency in Punjab, India: A Preliminary Quantitative Analysis
Frequently Asked Questions
If there is so much data available, why can't you make claims about the number of people killed by security forces during the Punjab counterinsurgency campaign?
Haven't Punjab Police and government bodies already documented the number of people killed and "illegally cremated"? Why doesn't this suffice?
What has been the impact of quantitative studies of human rights violations in other regions?
What impact do these findings have in the Punjab context? Why did you undertake this study?
What are the ...
I will use the skills and culture I learned from HRDAG’s team to understand how the conflict has affected the people in my country.
Patrick Ball. “A Definition of Database Design Standards for Human Rights Agencies.” © 1994 American Association for the Advancement of Science. [pdf]
The HRDAG Tech Corner is where we collect the deeper and geekier content that we create for the website. Click the accordion blocks below to reveal each of the Tech Corner entries.
Sifting Massive Datasets with Machine Learning
Principled Data Processing
(This post is co-authored by Patrick Ball and Kristian Lum.)
Today the Bureau of Justice Statistics (BJS) released a report on their effort to document “all deaths that occur during the process of arrest in the United States.” The analysis estimates that the Arrest-Related Deaths (ARD) program covers only 34-49% of these deaths. A parallel program by the FBI (the Supplementary Homicide Reports, SHR) is estimated to cover approximately the same proportion of deaths. Even taking into consideration both programs, 28% of all police homicides remain unreported.
In order to estimate the total number of homicides that appear on neither the ARD nor ...
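To make the idea of estimating deaths that appear on neither list more concrete, here is a minimal sketch of a two-list capture-recapture estimate using Chapman's bias-corrected form of the Lincoln-Petersen estimator. This is one common approach to this kind of problem, not necessarily the exact method used in the BJS report. The counts are hypothetical, chosen only for illustration, and the sketch assumes the two lists are independent and that every death is equally likely to be recorded, assumptions that rarely hold exactly in practice.

```python
def chapman_estimate(n_a: int, n_b: int, n_both: int) -> float:
    """Chapman's bias-corrected two-list estimate of the total population size.

    n_a    -- records on list A
    n_b    -- records on list B
    n_both -- records matched on both lists
    """
    return (n_a + 1) * (n_b + 1) / (n_both + 1) - 1


# Hypothetical counts, NOT taken from the BJS report.
n_ard = 500    # deaths recorded by the first program
n_shr = 450    # deaths recorded by the second program
n_both = 300   # deaths matched across both programs

# Deaths seen by at least one program.
observed = n_ard + n_shr - n_both

# Estimated total, including deaths that appear on neither list.
total = chapman_estimate(n_ard, n_shr, n_both)

# Estimated share of deaths missed by both programs.
unreported_share = 1 - observed / total

print(f"observed on either list: {observed}")
print(f"estimated total:         {total:.0f}")
print(f"estimated unreported:    {unreported_share:.1%}")
```

The smaller the overlap between the two lists relative to their sizes, the larger the estimated number of deaths that neither program recorded.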
It took me a while to realize I had become part of the HRDAG incubator—at least that’s what it felt like to me—for young data analysts who wanted to use statistical knowledge to make a real impact on human rights debates.
Ten years after the war ended in Sri Lanka, we still don’t know to the nearest ten thousand how many people perished. The estimates for the death toll for the last five months of the war alone vary between 7,000 and 147,000. In 2011, the UN said it thought approximately 40,000 civilians had died; then in 2012 an internal UN report estimated it was at least 70,000. Population data from World Bank and UN sources indicated that more than 100,000 Tamils living in the conflict areas in the north have not returned home after the war.
HRDAG has provided technical assistance to a broad range of non-governmental human rights organizations in Sri ...
The Historic Archive of the Guatemalan National Police (hereafter the Archive) was discovered, quite by accident, in July 2005. Researchers immediately recognized both the importance and the fragility of the Archive's contents. As a result, in early 2006 the Archive team invited Patrick to evaluate the documents and help them answer a seemingly simple question: How can we learn about the contents of the Archive in a shorter period of time than is needed to systematically examine each individual document?
After inspecting the Archive, Patrick designed a multi-stage random sample of documents. In May 2006, Tamy Guberek, Daniel Guzmán, and ...
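As a rough illustration of what a multi-stage random sample looks like, the sketch below draws a two-stage sample: first a random set of storage units, then a random set of documents within each selected unit. The archive layout, unit names, and sample sizes here are all hypothetical, and the actual design used for the Archive may have had more stages and different selection probabilities.

```python
import random


def two_stage_sample(archive, n_units, n_docs, seed=0):
    """Draw a two-stage simple random sample.

    archive -- dict mapping a storage unit name to its list of document ids
    n_units -- number of storage units to select in stage one
    n_docs  -- number of documents to select per chosen unit in stage two
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    # Stage one: choose storage units without replacement.
    units = rng.sample(sorted(archive), min(n_units, len(archive)))
    sample = {}
    for unit in units:
        docs = archive[unit]
        # Stage two: choose documents within the selected unit.
        sample[unit] = rng.sample(docs, min(n_docs, len(docs)))
    return sample


# Hypothetical archive: 10 rooms with 100 documents each.
archive = {
    f"room-{r}": [f"room-{r}/doc-{d}" for d in range(100)]
    for r in range(10)
}

sample = two_stage_sample(archive, n_units=3, n_docs=5)
for unit, docs in sample.items():
    print(unit, docs)
```

Sampling in stages means reviewers only need to open a handful of storage units, yet each document's selection probability is known, so results from the sample can be weighted to describe the whole collection.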