
Can Generative AI Replace Humans in Qualitative Research Studies?

CMU Study Finds Limitations, Ethical Concerns With LLMs as Study Participants

Recruiting human participants for a study can be time-consuming and expensive for researchers stretching limited budgets on strict deadlines. Sophisticated generative large language models (LLMs) can complete many tasks, so some researchers and companies have explored using them in studies instead of human participants.

Researchers from Carnegie Mellon University's School of Computer Science identified fundamental limitations to using LLMs in qualitative research focused on human perspectives, including how the models gather and aggregate information and issues surrounding consent and data collection.

"We looked into this question of if LLM-based agents can replace human participation in qualitative research, and the high-level answer was no," said Hoda Heidari(opens in new window), the K&L Gates Career Development Assistant Professor in Ethics and Computational Technologies in CMU's Software and Societal Systems Department(opens in new window) (S3D) and Machine Learning Department(opens in new window). "There are all sorts of nuances that human participants contribute that you cannot possibly get out of LLM-based agents, no matter how good the technology is."

The team's paper, "Simulacrum of Stories: Examining Large Language Models as Qualitative Research Participants," received an honorable mention award at the Association for Computing Machinery's Conference on Human Factors in Computing Systems last week in Yokohama, Japan. Team members from SCS included Heidari; Shivani Kapania, a doctoral student in the Human-Computer Interaction Institute (HCII); William Agnew, the Carnegie Bosch Postdoctoral Fellow in the HCII; Motahhare Eslami, an assistant professor in the HCII and S3D; and Sarah Fox, an assistant professor in the HCII.

LLMs are used as training tools across a variety of fields. In the medical and legal professions, these tools allow professionals to simulate and practice real-life scenarios, such as a therapist training to identify mental health crises. In qualitative research, which is often interview-based, LLMs are being trained to mimic human behavior in their responses to questions and prompts.

In the study, the CMU team interviewed 19 people with experience in qualitative research. Participants interacted with an LLM-based, chatbot-style tool, typing messages back and forth. The tool allowed the researchers to compare LLM-generated data with human-generated data and to reflect on ethical concerns.

The researchers identified several ways that using LLMs as study participants limits scientific inquiry, including in how the models gather and interpret knowledge. Study participants noted that the LLM tool often compiled its answers from multiple sources and fit them, sometimes unnaturally, into a single response. For example, in a study about factory working conditions, a worker on the floor and a manager would likely answer questions about the work and the workplace quite differently. Yet an LLM participant might merge those two perspectives into one answer, conflating attitudes in ways that do not reflect reality.

Consent posed another problem for the scientific inquiry process. In the paper, the researchers note that LLMs trained on publicly available data from social media platforms raise questions about informed consent and whether the people whose data the models are trained on have the option to opt out.

Overall, the study raises doubts about using LLMs as study participants, citing both ethical concerns and questions about the validity of the data these tools produce.

"These models are encoded with the biases, assumptions and power dynamics of model producers and the data and contexts from which they are derived," the researchers wrote. "As such, their use in research reshapes the nature of the knowledge produced, often in ways that reinforce existing hierarchies and exclusions."

CMU research shows that replacing humans with LLMs has limitations and presents ethical concerns.