
As artificial intelligence continues to evolve and integrate into healthcare communication, a recent study highlights a critical concern: large language models (LLMs) may perpetuate harmful stereotypes by using stigmatizing language. Researchers at Mass General Brigham found that more than 35% of these models' responses to questions about alcohol and substance use disorders contained stigmatizing language. However, the study, published in the Journal of Addiction Medicine, also notes that targeted prompts can significantly reduce such language.
“Using patient-centered language can build trust and improve patient engagement and outcomes. It tells patients we care about them and want to help,” said Dr. Wei Zhang, the study’s corresponding author and an assistant professor of Medicine at Mass General Hospital. “Stigmatizing language, even through LLMs, may make patients feel judged and could cause a loss of trust in clinicians.”
Understanding the Impact of Stigmatizing Language
LLMs generate responses from everyday language, which often includes biased or harmful phrasing about patients. The study emphasizes that prompt engineering, the practice of strategically crafting input instructions, can steer model outputs toward non-stigmatizing language. This approach has been shown to reduce the likelihood of stigmatizing language by 88%.
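As a rough illustration of what such prompt engineering can look like in practice, the sketch below prepends a system-style instruction asking for person-first, non-stigmatizing language before a clinical question is sent to a model. The instruction wording, chat-message format, and example question are assumptions for illustration only, not the study's actual prompts.

```python
# Minimal sketch of prompt engineering for non-stigmatizing language.
# The guard instruction and message format are illustrative assumptions,
# not the prompts used in the Mass General Brigham study.

STIGMA_GUARD = (
    "Respond using person-first, non-stigmatizing language consistent with "
    "NIDA and NIAAA guidance (e.g., 'person with alcohol use disorder' rather "
    "than 'alcoholic'; 'substance use' rather than 'substance abuse')."
)

def build_messages(question: str, engineered: bool = True) -> list[dict]:
    """Build a chat-style message list, optionally adding the guard instruction."""
    messages = []
    if engineered:
        messages.append({"role": "system", "content": STIGMA_GUARD})
    messages.append({"role": "user", "content": question})
    return messages

if __name__ == "__main__":
    q = "What should I tell a patient newly diagnosed with alcohol-associated liver disease?"
    print(build_messages(q, engineered=False))  # plain prompt
    print(build_messages(q, engineered=True))   # prompt-engineered version
```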
For their research, the authors tested 14 LLMs using 60 clinically relevant prompts related to alcohol use disorder (AUD), alcohol-associated liver disease (ALD), and substance use disorder (SUD). Physicians from Mass General Brigham evaluated the responses for stigmatizing language, using guidelines from the National Institute on Drug Abuse and the National Institute on Alcohol Abuse and Alcoholism.
Without prompt engineering, 35.4% of LLM responses contained stigmatizing language, compared with just 6.3% when prompt engineering was applied.
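To make that comparison concrete, here is a small, hypothetical tally showing how rates like these would be computed from physician reviewer flags. The records below are invented; only the with/without-prompt-engineering comparison mirrors the study's analysis.

```python
# Hypothetical tally of reviewer flags: share of responses containing
# stigmatizing language, with and without prompt engineering. The data
# are made up; only the computation reflects the comparison in the study.

from dataclasses import dataclass

@dataclass
class Review:
    engineered: bool    # was the prompt-engineered instruction used?
    stigmatizing: bool  # did physician reviewers flag the response?

def stigma_rate(reviews: list[Review], engineered: bool) -> float:
    """Fraction of responses in one condition flagged as stigmatizing."""
    group = [r for r in reviews if r.engineered == engineered]
    return sum(r.stigmatizing for r in group) / len(group)

reviews = [
    Review(engineered=False, stigmatizing=True),
    Review(engineered=False, stigmatizing=False),
    Review(engineered=True, stigmatizing=False),
    Review(engineered=True, stigmatizing=False),
]
print(f"without prompt engineering: {stigma_rate(reviews, False):.1%}")
print(f"with prompt engineering:    {stigma_rate(reviews, True):.1%}")
```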
Challenges and Opportunities in AI Communication
The study’s findings underscore the importance of language in healthcare communication, especially as LLMs become more prevalent. Longer responses were found to be more likely to contain stigmatizing language, a trend observed across all 14 models tested. Some models were more prone than others to use stigmatizing terms, highlighting the need for careful model selection and prompt engineering.
The authors suggest developing chatbots that avoid stigmatizing language to enhance patient engagement and outcomes. They also recommend that clinicians proofread LLM-generated content to ensure it is free from stigmatizing language before using it in patient interactions. Furthermore, involving patients and family members with lived experience in future research could refine definitions and lexicons of stigmatizing language, aligning LLM outputs with the needs of those most affected.
Looking Forward: The Role of AI in Patient Care
This study reinforces the necessity of prioritizing language in patient care, as LLMs are increasingly utilized in healthcare communication. The authors, including Yichen Wang, Kelly Hsu, Christopher Brokus, Yuting Huang, Nneka Ufere, Sarah Wakeman, and James Zou, stress the importance of developing AI tools that support rather than hinder patient trust and engagement.
Funded by grants from the Mayo Clinic Center for Digital Health in partnership with the Mayo Clinic Office of Equity, Inclusion, and Diversity and Dalio Philanthropies, this research marks a significant step towards more inclusive and patient-centered AI communication in healthcare.
As the field progresses, the integration of AI in healthcare presents both challenges and opportunities. By addressing the issue of stigmatizing language, researchers and clinicians can work together to improve patient outcomes and foster a more supportive healthcare environment.