This month Muskan Walia will talk about using generative AI to redact race information from police reports.
Thank you to NYU for hosting us.
Everybody attending must RSVP through the registration form at nyhackr. In-person tickets carry a charge; virtual tickets are free.
Space is extremely limited and in-person registration closes at 3 PM the day of the talk.
About the Talk: Generative AI systems are rapidly moving from prototypes to embedded components of government decision-making infrastructure. For instance, California mandates that all prosecutors implement a new review procedure called “race-blind charging,” in which prosecutors review case documents with race-related information redacted. To make this feasible, the state encouraged prosecutors to use AI-based redaction. We designed and tested one such system that uses generative AI to automatically redact race-related information from police reports. Our solution is now used in over 50% of California prosecutor offices, which together serve a constituent population of nearly 18 million people. In the first public validation of generative AI used for race-related redaction, we assess algorithmic performance by drawing on a corpus of ~10,000 police reports we collected from 253 jurisdictions across nearly every U.S. state. We present these validation results, demonstrating that our algorithm reliably removes all race-related indicators required by law, reduces the ability to predict an arrestee’s race from redacted narratives, and performs at the top of its class relative to existing alternatives. This work demonstrates the feasibility of using large language models at scale, while highlighting the importance of rigorous validation when deploying algorithms in high-stakes legal contexts.
About Muskan: Muskan Walia is a PhD student in Statistics and Computational Social Science. She works at the Harvard Kennedy School’s Computational Policy Lab, supporting the development of an algorithm that uses generative AI to help prosecutors avoid implicit bias in charging decisions. Her research focuses on developing artificial intelligence, machine learning, and scientific computing methods in collaboration with government agencies to tackle pressing social issues and improve institutional decision-making. In particular, Muskan designs automated systems that integrate statistical and computational methods (including benchmark data curation, uncertainty propagation, and LLM-as-a-judge) to support the validation of generative AI in the public interest.
Venue doors open at 6:30 PM Eastern Time, when we will enjoy pizza together (we encourage the virtual audience to have pizza as well). The talk and livestream begin at 7:00 PM Eastern Time.