Tales from the Wild West: Crafting Scenarios to Audit Bias in LLMs

May 12, 2024 • White Paper

By
Katherine-Marie Robinson, Violet Turri, Carol J. Smith, and Shannon Gallagher

This work introduces a scenario-based audit using RPG-style prompts where LLMs role-play characters to reveal bias in descriptions of individuals around them.

Publisher

Software Engineering Institute

Topic or Tag

Abstract

While large language models (LLMs) introduce opportunities to build exciting new kinds of human-computer interactions, they also present a host of risks such as the unintended perpetuation of harmful biases. To better identify and mitigate biases in LLMs, new evaluation and auditing methods are needed that circumvent safeguards and reveal underlying learned behaviors. In this work, we present a scenario-based auditing approach to uncovering biases in which the LLM plays the role of a character and describes individuals living in the world around them in the context of a role-playing game (RPG). Through a scenario centered around a cowboy named Jett, we elicit open-ended responses from ChatGPT that reveal ethnic and gender biases. Our findings demonstrate the importance of taking an exploratory approach to identifying bias in LLMs and suggest paths for future investigation.

Tales from the Wild West: Crafting Scenarios to Audit Bias in LLMs

May 12, 2024 • White Paper

By Katherine-Marie Robinson, Violet Turri, Carol J. Smith, and Shannon Gallagher

Publisher

Topic or Tag

Abstract

SHARE

Part of a Collection

By
Katherine-Marie Robinson, Violet Turri, Carol J. Smith, and Shannon Gallagher