The Data are Clear: Public Engagement Improves AI Science

Structural Zero Issue 06

Here’s what I’ve noticed: a lot of scientists and AI developers think that involving affected communities in AI research is probably the right thing to do, ethically. But they hesitate because they fear the messy unpredictability of human participation. They worry that even if human participation is the moral choice, it could degrade the science.

But what if that isn’t true? What if input from affected communities could make AI more reliable, more nuanced, and more precise?

At a MacArthur convening in New Mexico in 2023, Nathan Matias, the founder of CAT Lab at Cornell University, and I realized we had a similar set of experiences: too many scientists were dismissing human participation in AI projects due to the belief that it would degrade the science. Nathan approached the problem from the academic side, working closely with researchers and scholars, while my vantage point in San Francisco meant grappling more often with tech companies in Silicon Valley. Nathan and I began brainstorming whether we could produce a wide-reaching research paper examining the impact of community engagement. Our hope was to show compelling evidence that human involvement doesn’t weaken AI science. Indeed, we hoped to show that well-designed participatory models can dramatically strengthen it.

That meeting kicked off a two-year research and writing project that culminated in a peer-reviewed article recently published in Proceedings of the National Academy of Sciences of the United States of America (PNAS). The paper includes:

  • An overview of the common models of public engagement in AI research, coupled with the common concerns raised about these participatory modes
  • A sweeping literature review of participatory science
  • Case studies from AI in healthcare
  • Specific examples of participatory models informing AI that Nathan and I have worked on, from policing to social media
  • An explanation of the five domains where AI science can be improved by public participation: equipoise, explanation, measurement, inference, and interpretation

One of the most dramatic examples I’ve experienced of public participation improving the precision of AI was in the Beneath the Surface project, spearheaded by Invisible Institute and long supported by HRDAG. After years of abuse and misconduct allegations, the Chicago Police Department was placed under a consent decree by a federal judge in 2017. This meant the department faced extra oversight and transparency requirements with the goal of reform. As part of that, CPD had to publicly report on citizens’ complaints of inappropriate conduct by police. But for a long time, CPD didn’t actually release the text of the complaints. Instead, it published only summaries of what the complaints were about, based on its own interpretation.

Finally, in 2020, a federal judge forced the CPD to make the complaints available publicly in response to a lawsuit brought by the Invisible Institute.

When Invisible Institute finally received the complaints, they were a big, unstructured data blob—unfortunately typical for how data is shared from police to citizen groups and nonprofits. Invisible Institute organized hundreds of volunteers in reading groups to help make sense of the documents, and HRDAG’s data scientist Tarak Shah was one of the many working on the project. Tarak and Invisible Institute’s data director trina reynolds-tyler worked together to code up a machine learning model trained on the hand-categorized data provided by the volunteers. This model was then used to categorize the larger data set.
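For readers curious about what "train on the hand-categorized data, then categorize the rest" looks like in practice, here is a minimal sketch in Python. It is not the Beneath the Surface model: the classifier (a small naive Bayes written from scratch), the function names, the category names, and the example texts are all hypothetical and chosen only to illustrate the general workflow.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Very simple tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(labeled_docs):
    """Count words per category from (text, label) pairs supplied by human readers."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the most probable category under naive Bayes with add-one smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in the labeled data
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-categorized examples (stand-ins for the volunteers' labels).
labeled = [
    ("officer searched my car without consent", "search"),
    ("officers searched my bag without a warrant", "search"),
    ("no one responded to my missing person report", "no_response"),
    ("police never followed up on the missing person report", "no_response"),
]

# Hypothetical uncategorized documents (stand-ins for the larger data set).
unlabeled = [
    "they searched my home without a warrant",
    "nobody followed up on my report",
]

model = train(labeled)
predictions = [predict(model, text) for text in unlabeled]
```

The point of the sketch is the division of labor: people who read the documents decide what the categories are and supply the labeled examples, and the model only extends those human judgments to documents nobody has had time to read.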

One of the project’s most significant findings was that the Chicago Police Department often downplayed the severity of complaints in its published summaries. For example, incidents that volunteers categorized as sexual assault or rape were labeled as something more routine, such as an inappropriate search.

Another issue community volunteers flagged in the complaints was how often individuals complained about lack of police response to a missing persons report. This prompted trina reynolds-tyler and her co-researcher Sarah Conway to pursue another in-depth investigation into how CPD handles missing persons reports, a project that ultimately won a Pulitzer Prize.

That’s just one of the projects HRDAG has worked on where direct community engagement has improved the quality of the science.

With the recent research Nathan and I published offering clear examples and methods of community engagement, I hope the science community and Silicon Valley will take note. Scientific funders should be backing projects that center community involvement, not only out of a moral obligation but because it makes the results more reliable and precise. Our research can also serve as a tool for scholars and scientists who are on the fence about participatory research, or who are facing resistance from within their companies or research institutes. The data are clear: public participation improves the science of AI.

— mep

This article was written by Megan Price, Executive Director of the Human Rights Data Analysis Group (HRDAG), a nonprofit organization using scientific data analysis to shed light on human rights violations.

Structural Zero is a free monthly newsletter that helps explore what scientific and mathematical concepts teach us about the past and the present. Appropriate for scientists as well as anyone who is curious about how statistics can help us understand the world, Structural Zero is edited by Rainey Reitman and written by five data scientists who use their skills in support of human rights. Subscribe today to get our next installment. You can also follow us on Bluesky, Mastodon, LinkedIn, and Threads.

If you get value out of these articles, please support us by subscribing, telling your friends about the newsletter, and recommending Structural Zero to others.


Our work has been used by truth commissions, international criminal tribunals, and non-governmental human rights organizations. We have worked with partners on projects on five continents.
