How LLMs Made the Police Records Access Project Possible
In 2018, California passed the Right to Know Act, a police transparency law intended to increase police accountability by requiring law enforcement agencies to release records, when requested, related to officer misconduct, use of force, sexual assault, and more.
On the heels of this legislation, a coalition formed to put the new law into practice: the Community Law Enforcement Accountability Network (CLEAN), which built the Police Records Access Project. The coalition is a nationwide collaborative effort involving journalists, data scientists, public defenders, community advocates, human rights defenders, and First Amendment lawyers, and its mission is to obtain and distribute law enforcement records to the public. It comprises 18 organizations, among them the Berkeley Institute for Data Science, the Innocence Project, the National Association of Criminal Defense Lawyers, and the Human Rights Data Analysis Group (HRDAG).
HRDAG collaborated with CLEAN to launch the Police Records Access Project, which went live in August 2025. It's a searchable database with millions of pages of documents related to tens of thousands of police use-of-force and misconduct cases. The records available through the Police Records Access Project will help researchers document patterns in police violence and build legal cases for many years to come.
Helping to build the database was a heavy lift. There are millions of documents in the database, disclosed by hundreds of different agencies throughout California, including local police and sheriff's departments, correctional facilities, university police, transit police, highway patrol, local probation offices, the California Department of Justice, and many others. The coalition requests records from each of the nearly 700 entities every year.
Records arrived disconnected and without helpful metadata or documentation. The team used large language models (LLMs) to extract key facts, which, with continuous manual supervision, were used to cluster documents into "cases" about the same incident or investigation. From there, the team used LLMs again to pull out the case details needed to compare disclosed cases against other state databases and measure how complete the disclosures were. LLMs also extracted structured fields, such as dates and incident types, that make it easier to sift through the cases in the database.
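As a rough illustration of how this kind of extraction and grouping can work, the sketch below asks a model to return a few structured fields as JSON and then groups pages that share an agency and a case number (or incident date) into candidate cases. The prompt wording, the field names, and the generic `complete` callable are assumptions made for illustration; they are not the project's actual pipeline.

```python
import json
from collections import defaultdict
from typing import Callable

EXTRACTION_PROMPT = """Extract the following fields from the police record excerpt below.
Respond with JSON only, using null for any field not stated in the text.
Fields: agency, incident_date (YYYY-MM-DD), incident_type, case_number.

Excerpt:
{text}
"""

def extract_fields(text: str, complete: Callable[[str], str]) -> dict:
    """Ask an LLM for structured fields; `complete` is any prompt-in, text-out function."""
    raw = complete(EXTRACTION_PROMPT.format(text=text))
    try:
        fields = json.loads(raw)
    except json.JSONDecodeError:
        fields = {}  # malformed model output gets routed to manual review
    return {k: fields.get(k) for k in ("agency", "incident_date", "incident_type", "case_number")}

def group_into_cases(pages: list[dict]) -> dict:
    """Group pages that share an agency and a case number (or, failing that, an
    incident date) into candidate 'cases' for a human reviewer to confirm or split."""
    cases = defaultdict(list)
    for page in pages:
        fields = page["fields"]
        key = (fields.get("agency"), fields.get("case_number") or fields.get("incident_date"))
        cases[key].append(page)
    return dict(cases)
```

In practice, as noted above, groupings like these were continuously checked by people; an automated pass only produces candidates for review.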
Within the search tool itself, LLMs are used to rank and sort search results, surfacing the most relevant documents first, improving the search experience, and cutting the time spent finding specific records. Finally, LLMs were key to identifying information that might be sensitive, such as Social Security numbers, the names and addresses of civilians, and medical details, and to flagging those records to be withheld from the public site.
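To make the ranking step concrete, here is a minimal sketch of an LLM re-ranker, assuming a generic `complete` function that sends a prompt to a model and returns its text; the prompt wording, the 0-to-10 scale, and the result fields are illustrative assumptions, not the project's actual implementation.

```python
from typing import Callable

RANK_PROMPT = """On a scale from 0 to 10, how relevant is this record summary to the
search query? Reply with a single number and nothing else.

Query: {query}
Summary: {summary}
"""

def rerank(query: str, results: list[dict], complete: Callable[[str], str]) -> list[dict]:
    """Score each candidate result with the LLM and return the list sorted
    best-first; unparseable scores default to 0 and sink to the bottom."""
    def score(result: dict) -> float:
        raw = complete(RANK_PROMPT.format(query=query, summary=result["summary"]))
        try:
            return float(raw.strip())
        except ValueError:
            return 0.0
    return sorted(results, key=score, reverse=True)
```

The sensitive-information step could look something like the sketch below: a classification prompt asks whether a page contains Social Security numbers, civilian names or addresses, or medical details, and any flagged page is withheld pending review. The labels and prompt are again illustrative assumptions.

```python
import json
from typing import Callable

SENSITIVE_LABELS = ("ssn", "civilian_name_or_address", "medical_detail")

FLAG_PROMPT = """Does the excerpt below contain any of the following: Social Security
numbers, names or addresses of civilians, or medical details? Reply with JSON like
{{"ssn": false, "civilian_name_or_address": true, "medical_detail": false}}.

Excerpt:
{text}
"""

def flag_sensitive(text: str, complete: Callable[[str], str]) -> dict:
    """Return per-category flags; output that cannot be parsed is treated as
    sensitive so that doubtful pages default to being withheld."""
    raw = complete(FLAG_PROMPT.format(text=text))
    try:
        flags = json.loads(raw)
    except json.JSONDecodeError:
        return {label: True for label in SENSITIVE_LABELS}
    return {label: bool(flags.get(label, False)) for label in SENSITIVE_LABELS}

def should_withhold(text: str, complete: Callable[[str], str]) -> bool:
    """A record is flagged to be withheld from the public site if any category trips."""
    return any(flag_sensitive(text, complete).values())
```

Flagged records can then be routed to reviewers rather than published, matching the workflow described above of withholding potentially sensitive records from the public site.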
Further reading
Prison Policy Initiative. Wendy Sawyer + Emily Widra. 26 January, 2026.
Data spotlight: Data projects tracking police misconduct, use of force, and employment histories.
Stanford Report. 9 September, 2025.
New database makes once-secret police records accessible to the public
Los Angeles Times. 4 August, 2025.
Police misconduct cases and investigations into police shootings in California are now available online.
San Francisco Chronicle. Megan Cassidy. 4 August, 2025.
Thousands of once-secret California police files made public in searchable database
KQED. Sukey Lewis + Mike Kessler. 4 August, 2025.
Thousands of Once-Secret Police Records Are Now Public. Here’s How You Can Use Them
Related publications
HRDAG. Tarak Shah. 30 September, 2025.
Pulling Back the Curtain on LLMs and Policing Data (Structural Zero 04).
HRDAG. Rainey Reitman. 14 August, 2025.
Millions of Pages of Police Use-of-Force Files Available Through New Searchable Database
Related videos and podcasts
HRDAG. 3 February, 2026.
Watch Now: The Wandering Officer—New Databases for Police Accountability
Acknowledgments
This work was supported by the Filecoin Foundation for the Decentralized Web, Ford Foundation, Heising-Simons Foundation, Hewlett Foundation, and MacArthur Foundation.
Image: David Peters.
