Open Source Summit 2018
On October 23, 2018, Patrick Ball keynoted at the Open Source Summit in Edinburgh, Scotland.
CIIDH Data – Variables List
Version date: 2000.01.29
Current version: ATV20.1
Patrick Ball & Herbert F. Spirer
Below are listed the 19 files that constitute the CIIDH database. We have noted those that include data that might be analytically useful in future versions of ATV. File names and brief definitions are in bold, and variable summaries are in bulleted points.
CXTOV2 (Context; links to VLCNV2)
Additional detail on geographic location of case
Narrative summary
CXTOV2ex (Context extension; links to CXTOV2)
Fine breakdown on the age category & sex of anonymous victims
CXTOV2lg (Context extension; links to CXTOV2)
Legal procedures taken on behalf of the ...
RustConf 2019, and systems programming as a data scientist
It could make sense to use Rust as a data journalist for in-browser computations, and other thoughts from RustConf.
How Data Analysis Confirmed the Bias in a Family Screening Tool
In Pittsburgh …
BJS Report on Arrest-Related Deaths: True Number Likely Much Greater
(This post is co-authored by Patrick Ball and Kristian Lum.)
Today the Bureau of Justice Statistics (BJS) released a report on their effort to document “all deaths that occur during the process of arrest in the United States.” The analysis estimates that the Arrest-Related Deaths (ARD) program covers only 34-49% of these deaths. A parallel program by the FBI (the Supplementary Homicide Reports, SHR) is estimated to cover approximately the same proportion of deaths. Even taking into consideration both programs, 28% of all police homicides remain unreported.
In order to estimate the total number of homicides that appear on neither the ARD nor ...
How we go about estimating casualties in Syria—Part 1
I spent the two weeks over Easter working with Patrick and Megan in San Francisco, trying to figure out how best to estimate the number of casualties the Syrian civil war has claimed in the past two years. In January, HRDAG published a report on the number of fully identified casualties reported in the Syrian Arab Republic between March 2011 and November 2012. The number of de-duplicated records of killings for this period was 59,648, a number that is likely to be an undercount since we know that many incidents of lethal violence in conflict go unreported, and that the unreported cases are not missing at random.
How we make sure that nobody is counted twice: A peek into HRDAG's record de-duplication
HRDAG is currently evaluating the quality and completeness of the Kosovo Memory Book of the Humanitarian Law Center (HLC) in Belgrade, Serbia. The objective of the Kosovo Memory Book (KMB) is to commemorate every single person who fell victim to armed conflict in Kosovo from 1998 to 2000, either through death or disappearance.
While building and reviewing their database, one of the things that HLC has to do is “record linkage,” a process also known as “matching.” Matching determines whether two records refer to the same person (“a match”) or to different people (“a non-match”). Matching helps to identify whether two existing records refer ...
Pulling Back the Curtain on LLMs & Policing Data
Structural Zero Issue 04
September 30, 2025
Artificial intelligence is transforming how we work with information. At HRDAG, that changes how I do my job every day. My most recent project was using LLMs to explore and parse vast quantities of data about police abuses in California.
In this newsletter, I’ll pull back the curtain on that work. I’ll describe how a diverse coalition gathered more than a million pages of documents about police misconduct in California and how LLMs helped us make sense of them in ways that wouldn’t have been possible before the advent of this technology.
In addition to understanding my work, I hope that this ...
.Rproj Considered Harmful
We aim to produce code that is clear, replicable across machines and operating systems, and that leaves an easy-to-follow audit trail.
Learning a Modular, Auditable and Reproducible Workflow
The modular nature of the workflow and use of Git allowed us to work on different parts of the project from across the country.
Identifiers of Detained Children Have Implications for Data Security and Estimation
Sequential identifiers could make it possible to estimate the population of detained children.
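To illustrate why sequential identifiers matter for estimation, here is a minimal sketch of the classic "German tank problem" estimator. This excerpt does not say which method the post uses, so treat this as an assumption chosen for illustration. The function name and the sample values are hypothetical, and the sketch assumes identifiers start at 1, are assigned without gaps, and are observed as a simple random sample without replacement.

```python
def estimate_population(observed_ids):
    """Estimate the total number of issued IDs from a sample of sequential IDs.

    Uses the standard frequentist estimator N ~ m + m/k - 1, where m is the
    largest observed ID and k is the number of distinct IDs observed.
    Assumes IDs run 1..N with no gaps (an illustrative assumption).
    """
    ids = set(observed_ids)  # ignore repeated sightings of the same ID
    k = len(ids)
    if k == 0:
        raise ValueError("need at least one observed identifier")
    m = max(ids)
    return m + m / k - 1


# Hypothetical sample of four observed identifiers.
sample = [19, 40, 42, 60]
estimate = estimate_population(sample)
print(estimate)
```

The estimate always exceeds the largest observed identifier by roughly the average gap between observed identifiers, which is why even a handful of leaked sequential IDs can reveal the approximate size of a population.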
Herb Spirer, 1925 – 2018
Herb led and mentored a generation of statisticians working in human rights.
Can the Armed Conflict Become Part of Colombia’s History?
Paula Amado and María Juliana Durán Fedullo reflect on how the Truth Commission may change Colombia’s history, finally officially acknowledging the 50-year conflict and its casualties, and reckoning with who did what to whom.
How Predictive Policing Reinforces Bias
Algorithmic tools like PredPol were supposed to reduce bias. But HRDAG has found that racial bias is baked into the data used to train the tools.
How Pretrial Risk Assessment Tools Perpetuate Unfairness
Tools like COMPAS allegedly help judges predict future criminal activities and eliminate bias. HRDAG and partners showed how the tools recycle bias.
Partners
How we work with partners is how we relate to the whole human rights community. We work with human rights advocates and defenders to support their goals by complementing their substantive expertise with our technical expertise. To date, partners have included truth commissions, international criminal tribunals, United Nations missions, and non-governmental human rights organizations on five continents.
Here are a few stories that illustrate how we work with our partners:
HRDAG partner stories:
Quantifying Police Misconduct in Louisiana (2023)
Scraping for Pattern: Protecting Immigrant Rights in Washington State (2022)
Police Violence ...
Data on Kosovo Killings
The data on killings in Kosovo are in four files. All of the files are comma-delimited ASCII. The fields in each file are described below.
If you use these data on Kosovo killings, please cite them with the following citation, as well as this note:
“These are convenience sample data, and as such they are not a statistically representative sample of events in this conflict. These data do not support conclusions about patterns, trends, or other substantive comparisons (such as over time, space, ethnicity, age, etc.).”
Patrick Ball, Wendy Betts, Fritz Scheuren, Jana Dudukovich, and Jana Asher. (2002). AAAS/ABA-CEELI/Human Rights Data ...