Using Hand Gestures To Evaluate AI Biases
LTI researchers have created a model to help generative AI systems understand the cultural nuance of gestures.
By Marylee Williams
Humans routinely use hand gestures to communicate, from a wave in traffic to a thumbs-up at a sporting event. But what might be an innocent gesture in one culture could be offensive in another. Crossing your index and middle fingers is associated with good luck in America, for example, but it's not something you'd want to do in Vietnam.
As the use of generative AI systems increases globally, it's imperative that they understand these cultural nuances. It's easier than ever to have an AI system generate an image of a hand gesture. Simply type some text and the program creates an image that can be used in a company's marketing materials. But without cross-cultural awareness, the company might face repercussions for generating what they assumed was an innocent image.
Researchers from Carnegie Mellon University's School of Computer Science have found that AI systems have a limited ability to flag hand gestures that are offensive in other cultures, which can lead them to generate images of offensive gestures.
"What's perfectly acceptable in one culture can be deeply offensive in another, leading to misunderstandings, harm or even significant business risks," said Akhila Yerukola, a doctoral student in the Language Technologies Institute (LTI). "While no single person can be aware of every global cultural nuance, AI systems intended for global interaction should be expected to understand these differences. For example, AI systems shouldn't accidentally generate a culturally offensive image or provide culturally insensitive advice. For AI to be a truly beneficial global tool, it must interact in ways that are respectful and appropriate, and not offensive to the diverse cultures it encounters."
In their study, researchers created a dataset called Multi-Cultural Set of Inappropriate Gestures and Nonverbal Signs (MC-SIGNS), which catalogs hand gestures and their interpretations across 85 countries. The team created gesture-country pairs, taking a set of 25 hand gestures and pairing them with their country-specific interpretations. The dataset focuses on global coverage rooted in local knowledge. Researchers built on existing anthropological research and used insights from local experts for detailed explanations of each offensive gesture.
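The article doesn't describe the exact format of MC-SIGNS, but a gesture-country pair could be represented along the lines of the following minimal sketch. The field names and the example entry are illustrative assumptions, not the released dataset schema.

```python
from dataclasses import dataclass

# Illustrative sketch only: field names and the example entry are assumptions,
# not the actual MC-SIGNS schema.
@dataclass
class GestureCountryPair:
    gesture: str          # one of the 25 hand gestures, e.g. "crossed fingers"
    country: str          # one of the 85 countries, e.g. "Vietnam"
    is_offensive: bool    # whether the gesture is considered offensive there
    interpretation: str   # local meaning, informed by anthropological sources and local experts

example = GestureCountryPair(
    gesture="crossed fingers",
    country="Vietnam",
    is_offensive=True,
    interpretation="Read as an obscene gesture rather than a sign of good luck.",
)
```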
Researchers said they wanted to focus on hand gestures because they're a perfect window into AI system biases.
"When globally deployed AI systems blindly default to one or a few cultural interpretations — typically the ones from the creators of those AI systems — the consequences can be significant," said LTI Assistant Professor Maarten Sap. "At best, the system might simply misunderstand the user. At worst, it could deeply offend and alienate them. Understanding these risks is crucial, which is why studying how AI handles offensive gestures allows us to identify and address these vulnerabilities before they cause real harm in sensitive situations like international diplomacy, cross-cultural business or global marketing campaigns."
The researchers used this dataset to test different AI systems, prompting them to produce images of various offensive gestures and noting whether each system flagged the request or generated the image. The team found that popular AI models over-flag gestures as offensive and exhibit a U.S.-centric bias, showing low accuracy when identifying hand gestures that are offensive in countries outside the U.S. Researchers found that DALLE-3, a popular image generator, blocks only 10% of prompts involving offensive gestures.
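The evaluation code isn't included in the article, but the test loop it describes might look roughly like the sketch below. The generate_image wrapper and its refused attribute are hypothetical stand-ins for whatever model-specific APIs and refusal checks the team actually used.

```python
# Rough sketch of the evaluation described above, assuming a hypothetical
# generate_image(prompt) wrapper that returns an object with a .refused flag.
# The real study's prompts, models, and refusal detection will differ.
def evaluate_blocking(pairs, generate_image):
    offensive_pairs = [p for p in pairs if p.is_offensive]
    blocked = 0
    for pair in offensive_pairs:
        prompt = f"A person making the {pair.gesture} gesture in {pair.country}"
        result = generate_image(prompt)
        if result.refused:  # the model declined to produce the image
            blocked += 1
    # Fraction of offensive-gesture prompts the model blocked
    # (the article reports roughly 10% for DALLE-3).
    return blocked / len(offensive_pairs)
```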
Researchers noted that while text-to-image generation has improved in some regards, it still falls short in others, such as generating images of hand gestures with cultural sensitivity. MC-SIGNS gives researchers a means to test and refine these AI systems, reducing the risk of offensive AI-generated content and supporting the development of more culturally aware AI models.
Along with researchers from CMU, the team included Saadia Gabriel and Nanyun Peng from the University of California, Los Angeles.
Moving forward, the researchers aim to improve AI's ability to directly identify culturally offensive or inappropriate actions or gestures within an image. For example, in India it's considered disrespectful to sit with the soles of one's feet directly facing an idol of a deity. The team hopes that one day, when AI systems generate images of an Indian family praying, they can ensure those images don't include any disrespectful postures.
For more on MC-SIGNS, read the team's paper, which will be presented in July at the Association for Computational Linguistics Annual Meeting in Vienna.