Innocence Discovery Lab – Harnessing large language models to surface data buried in wrongful conviction case documents

The Wrongful Conviction Law Review

The recent advent of commercial artificial intelligence (AI), especially in natural language processing (NLP), introduces transformative possibilities for wrongful conviction research. NLP, a pivotal branch of AI that forms the basis for Large Language Models (LLMs), enables computers to interpret human language with a nuanced understanding. This technological advancement is particularly valuable for analyzing the complex language found in case documents associated with wrongful convictions. This paper explores the effectiveness of LLMs in analyzing and extracting data from case documents collected by the Innocence Project New Orleans and the National Registry of Exonerations. The diverse and comprehensive nature of these datasets makes them ideal for assessing the capabilities of LLMs. The findings of this study advance our understanding of how LLMs can be utilized to make wrongful conviction case documents easily accessible by automating the extraction of relevant data.

Creative Commons Attribution 4.0 International License.

Ayyub Ibrahim, Huy Dao, and Tarak Shah (2024). Innocence Discovery Lab – Harnessing Large Language Models to Surface Data Buried in Wrongful Conviction Case Documents. The Wrongful Conviction Law Review 5 (1):103-25. 31 May, 2024. https://doi.org/10.29173/wclawr112. © 2024 Ayyub Ibrahim, Huy Dao, Tarak Shah.


