Scanning Documents to Uncover Police Violence

Administrative paperwork generated by abusive regimes has long been an integral component of human rights investigations. Documents obtained through public records law can provide important evidence of state violence, as well as the paper trail needed to demonstrate responsibility for human rights violations. Among the many formats such records can take, scanned and redacted documents saved as PDF files are common, and they present unique challenges for data processing.

This post reviews the methods and software tools we use to process such collections into an analysis-ready format. It also provides running examples from our work with the Invisible Institute’s Citizens Police Data Project to design and maintain a data pipeline, with the ACLU of Massachusetts to review Boston Police Department SWAT reports, and with the University of Washington Center for Human Rights to determine whether ICE and CBP are detaining people at sensitive locations such as prisons and hospitals.

Read the full post here: Processing scanned documents for investigations of police violence.


