Valentina Rozo Ángel has joined our team as our new visiting analyst this fall.
We’re pleased to announce that Camille Fassett has joined our team as our new data science fellow.
Over the last few years, we've tried to make the data organized in our projects publicly accessible. We have encouraged our partners to publish the data at the completion of the project. We continue to believe it is important to offer access to the data used in our projects for the sake of transparency as well as to encourage further research and analysis. However, we are increasingly concerned about how raw data are used. Data collected by what we can observe is what statisticians call a convenience sample, which is subject to selection bias.
We're keeping these datasets available for researchers who want to use them for simulation or estimation ...
Today Guatemala’s former national police chief Colonel Héctor Rafael Bol de la Cruz was convicted and sentenced to 40 years in prison for his role in the 1984 kidnapping and disappearance of 27-year-old student union leader Fernando Garcia, who was last seen when officers detained him outside his home. Along with Bol de la Cruz, former senior police officer Jorge Gomez was also tried; he received a sentence of 40 years in prison. That verdict comes in part because of testimony this month by HRDAG’s Patrick Ball, who served as an expert witness and presented data analysis done with colleague Daniel Guzmán to assess the flow of thousands of ...
Shira Mitchell, Al Ozonoff, Alan Zaslavsky, Bethany Hedt-Gauthier, Kristian Lum and Brent Coull (2013). A Comparison of Marginal and Conditional Models for Capture-Recapture Data with Application to Human Rights Violations Data. Biometrics, Volume 69, Issue 4, pages 1022–1032, December 2013. © 2013, The International Biometric Society. DOI: 10.1111/biom.12089.
This rigorous estimate shows that 1-2 percent of the country’s population was killed or disappeared during the civil war.
We have accomplished so much in the last 10 years at the Historical Archive of the National Police (AHPN). And yet, despite the efforts, dedication, and commitment of each person who has worked in the AHPN since 2006, we still cannot say “mission accomplished.”
In 10 years the environment at the Archive has changed greatly and become full of life. Where the building once sheltered unknown stories, over time some of those stories have been revealed. But Guatemala still has a long way to go in letting the world learn more deeply about the secrets within the documents stored there.
Guatemalans and the rest of the world have a very important ...
Tamy Guberek, Daniel Guzmán, Megan Price, Kristian Lum and Patrick Ball (2010). Benetech/Human Rights Data Analysis Group database of lethal violence in Casanare.
Estimates of Homicides and Disappearances in Casanare
Casanare is a large, rural department of Colombia with 19 municipalities and a population of almost 300,000. Located between the foothills of the Andes and the eastern plains, Casanare has a long history of violence. Various armed groups have been present in Casanare, among them paramilitaries, guerrillas, and the Colombian army. Many inhabitants of Casanare have been victims ...
“Patrick Ball, HRDAG’s Director of Research and the statistician behind the code, explained that the Random Forest classifier was able to predict with 100% accuracy which counties that would go on to have mass graves found in them in 2014 by using the model against data from 2013. The model also predicted the counties that did not have mass hidden graves found in them, but that show a high likelihood of the possibility. This prediction aspect of the model is the part that holds the most potential for future research.”
Lum, Kristian, Megan Emily Price, and David Banks. 2013. Applications of Multiple Systems Estimation in Human Rights Research. The American Statistician 67, no. 4: 191-200. doi: 10.1080/00031305.2013.821093. © 2013 The American Statistician. All rights reserved. [free eprint may be available].
HRDAG is joining Partnership on AI to Benefit People and Society (PAI).
HRDAG associate William Isaac is quoted in this article about how predictive policing algorithms such as PredPol exacerbate the problem of racial bias in policing.
Q3. What are the steps in an MSE analysis?
Q4. What does data collection look like in the human rights context? What kind of data do you collect?
Q5. [In depth] Do you include unnamed or anonymous victims in the matching process?
Q6. What do you mean by "cleaning" and "canonicalization?"
Q7. [In depth] What are some of the challenges of canonicalization?
Solving for X documents Patrick's team as they travel to Guatemala, Kosovo, and Liberia, helping human rights supporters apply sophisticated computer analysis to human rights events.
The World According to Artificial Intelligence: Targeted by Algorithm (Part 1)
The Big Picture: The World According to AI explores how artificial intelligence is being used today, and what it means to those on its receiving end.
Patrick Ball is interviewed: “Machine learning is pretty good at finding elements out of a huge pool of non-elements… But we’ll get a lot of false positives along the way.”
Today we celebrate the 65th anniversary of the Universal Declaration of Human Rights, which was adopted by the UN General Assembly on 10 December 1948. At HRDAG, we are non-partisan: we do not favor any party or government in conflicts. But we are not neutral: we are always in favor of human rights. We believe in the power and value of data; as we see it, data distills human actions and existence, all of which have power and value. With this in mind, we propose these seven articles that comprise our declaration of a few data rights (click through the links for some examples).
Preamble
Whereas data represents the suffering of human beings,
Whereas ...
Kristian Lum and William Isaac (2016). To predict and serve? Significance. October 10, 2016. © 2016 The Royal Statistical Society.
Sarah L. Desmarais and Evan M. Lowder (2019). Pretrial Risk Assessment Tools: A Primer for Judges, Prosecutors, and Defense Attorneys. Safety and Justice Challenge, February 2019. © 2019 Safety and Justice Challenge. <<HRDAG's Kristian Lum and Tarak Shah served as Project Members and made contributions to the primer.>>
In our work, we merge many databases to figure out how many people have been killed in violent conflict. Merging is a lot harder than you might think.
Many of the database records refer to the same people; that is, the records are duplicated. We want to identify and link all the records that refer to the same victims so that each victim is counted only once, and so that we can use the structure of overlapping records to do multiple systems estimation.
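To make the idea concrete, here is a minimal Python sketch of linking duplicate records and then using the overlap between two lists. It is a toy illustration, not HRDAG's method: the records, the field names (`name`, `date`, `source`), and the helper functions are all invented for this example. Matching is done on an exact normalized key, which is far cruder than real record linkage, and the two-list Lincoln-Petersen estimator stands in for the simplest possible case of multiple systems estimation.

```python
from collections import defaultdict

# Invented toy records from two hypothetical lists, "A" and "B".
records = [
    {"source": "A", "name": "Ana  Perez", "date": "1984-02-18"},
    {"source": "A", "name": "ana perez",  "date": "1984-02-18"},  # duplicate within A
    {"source": "A", "name": "Luis Gomez", "date": "1984-03-01"},
    {"source": "A", "name": "Maria Ruiz", "date": "1984-03-05"},
    {"source": "A", "name": "Jose Diaz",  "date": "1984-04-10"},
    {"source": "B", "name": "ANA PEREZ",  "date": "1984-02-18"},
    {"source": "B", "name": "Luis Gomez", "date": "1984-03-01"},
    {"source": "B", "name": "Rosa Leon",  "date": "1984-05-02"},
]


def normalize(record):
    """Build a crude matching key: lowercased, whitespace-collapsed name plus date."""
    name = " ".join(record["name"].lower().split())
    return (name, record["date"])


def link_records(records):
    """Group records by key and return {key: set of lists that documented that person}."""
    sources_by_person = defaultdict(set)
    for record in records:
        sources_by_person[normalize(record)].add(record["source"])
    return dict(sources_by_person)


def lincoln_petersen(linked, list_a, list_b):
    """Two-list capture-recapture estimate of the total population size."""
    n_a = sum(1 for sources in linked.values() if list_a in sources)
    n_b = sum(1 for sources in linked.values() if list_b in sources)
    overlap = sum(
        1 for sources in linked.values() if list_a in sources and list_b in sources
    )
    if overlap == 0:
        raise ValueError("no overlap between lists; the estimate is undefined")
    return n_a * n_b / overlap


linked = link_records(records)
estimate = lincoln_petersen(linked, "A", "B")

print(len(linked))  # unique documented people after linking: 5
print(estimate)     # 4 * 3 / 2 = 6.0
```

After linking, the eight records collapse to five distinct people, two of whom appear on both lists. The estimate of six suggests that roughly one person was documented by neither list. In practice, names and dates rarely match exactly, which is a large part of why merging is hard.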
Merging records that refer to the same person is called entity resolution, database deduplication, or record linkage. For definitive overviews of the field, see Scheuren, Herzog, and Winkler, Data Quality ...