Put simply, AI bias refers to discrimination in the output churned out by Artificial Intelligence (AI) systems.
According to Bogdan Sergiienko, Chief Technology Officer at Master of Code Global, AI bias occurs when AI systems produce biased outcomes that mirror societal biases, such as those related to gender, race, culture, or politics. These biases often reinforce existing social inequalities.
Drilling down, Adnan Masood, UST’s Chief AI Architect and AI scholar, says that demographic biases are among the most pressing concerns in current Large Language Models (LLMs). These, he says, lead to disparate performance across racial and gender groups. Then there are ideological biases that mirror dominant political viewpoints, and temporal biases that anchor models to outdated information.
“Additionally, more subtle cognitive biases, such as anchoring effects and availability bias, can influence LLM outputs in nuanced and potentially harmful ways,” says Masood.
Owing to this bias, AI models may generate text or images that reinforce stereotypes about gender roles. For instance, Sergiienko says when generating images of professionals, men are often depicted as doctors, while women are shown as nurses.
He also points to a Bloomberg analysis of over 5000 AI-generated images, where people with lighter skin tones were disproportionately featured in high-paying job roles.
“AI-generated outputs also may reflect cultural stereotypes,” says Sergiienko. “For instance, when asked to generate an image of ‘a Barbie from South Sudan,’ the result included a woman holding a machine gun, which doesn’t reflect everyday life in the region.”
How do biases creep into LLMs?
Sergiienko says there are several avenues for biases to make their way into LLMs.
1. Biased training data: When the data used for training LLMs contains societal biases, the AI learns and replicates them in its responses.
2. Biased labels: In supervised learning, if labels or annotations are incorrect or subjective, the AI may produce biased predictions.
3. Algorithmic bias: The methods used in AI model training may amplify pre-existing biases in the data.
4. Implicit associations: Unintended biases in the language or context within the training data can lead to flawed outputs.
5. Human influence: Developers, data annotators, and users can unintentionally introduce their own biases during model training or interaction.
6. Lack of context: In the example of “Barbie from South Sudan,” the AI may associate images of people from South Sudan with machine guns because many photos labeled as such include this attribute.
Similarly, a “Barbie from IKEA” might be generated holding a bag of home accessories, based on common associations with the brand.
Can AI ever be free of bias?
Our experts believe the complete transcendence of human biases may be an elusive goal for AI. “Given its inherent connection to human-created data and objectives, AI systems can be designed to be more impartial than humans in specific domains by consistently applying well-defined fairness criteria,” believes Masood.
He says the key to reducing bias lies in striving for AI that complements human decision-making. This will help leverage the strengths of both while implementing robust safeguards against the amplification of harmful biases.
However, before bias can be removed from LLMs, it is important to first identify it. Masood says this calls for a varied approach that uses numerical data, expert analysis, and real-world testing.
“By using advanced techniques such as counterfactual fairness analysis and intersectional bias probing, we can uncover hidden biases that may disproportionately impact specific demographic groups or surface in particular contexts,” says Masood.
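One way to picture the counterfactual idea is to build pairs of prompts that differ only in a demographic term and compare how the system under test scores each. The Python sketch below is a minimal illustration, not a full counterfactual fairness analysis. The swap table, the `score_fn` parameter, and the deliberately biased `toy_score` function are all assumptions made up for the example; in practice `score_fn` would wrap the model or classifier being audited.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical swap table (lowercase only); a real audit needs a larger, reviewed list.
SWAPS: Dict[str, str] = {"he": "she", "his": "her", "man": "woman"}


def counterfactual(text: str, swaps: Dict[str, str] = SWAPS) -> str:
    """Replace each whole-word term found in `swaps` with its counterpart."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, swaps)) + r")\b")
    return pattern.sub(lambda m: swaps[m.group(0)], text)


def counterfactual_gaps(
    prompts: List[str], score_fn: Callable[[str], float]
) -> List[Tuple[str, str, float]]:
    """Score each prompt and its counterfactual twin; return (prompt, twin, gap)."""
    results = []
    for prompt in prompts:
        twin = counterfactual(prompt)
        results.append((prompt, twin, score_fn(prompt) - score_fn(twin)))
    return results


def toy_score(text: str) -> float:
    """Deliberately biased stand-in scorer, for illustration only."""
    return 1.0 if re.search(r"\bhe\b", text) else 0.5


if __name__ == "__main__":
    for prompt, twin, gap in counterfactual_gaps(["he is a doctor"], toy_score):
        print(f"{prompt!r} vs {twin!r}: gap = {gap}")
```

A gap that stays near zero across many such pairs suggests the swapped attribute is not driving the score; consistently large gaps are a signal worth investigating.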
Identifying bias is not a one-time task but an ongoing process. As LLMs are deployed in novel and dynamic environments, new and unforeseen biases may emerge that were not apparent during controlled testing.
Masood points to various research efforts and benchmarks that address different aspects of bias, toxicity, and harm.
These include StereoSet, CrowS-Pairs, WinoBias, BBQ (Bias Benchmark for QA), BOLD (Bias in Open-Ended Language Generation Dataset), CEAT (Contextualized Embedding Association Test), WEAT (Word Embedding Association Test), Datasets for Social Bias Detection (DBS), SEAT (Sentence Encoder Association Test), RealToxicityPrompts, and Gender Bias NLP.
Mitigating the effects of bias
To effectively govern AI and mitigate bias, businesses need to implement practices that ensure diverse representation within AI development teams, suggests Masood. Furthermore, businesses must create ethical review boards to scrutinize training data and model outputs. Finally, they should also invest in conducting third-party audits to independently verify fairness claims.
“It’s also crucial to define clear metrics for fairness and to continually benchmark models against these standards,” advises Masood. He also suggests businesses collaborate with AI researchers, ethicists, and domain experts. This, he believes, can help surface potential biases that may not be immediately apparent to technologists alone.
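To make the idea of a clear fairness metric concrete, the sketch below computes one common candidate, the demographic parity gap: the difference between the highest and lowest rate of positive outcomes across groups. This is only one of several possible metrics, and the choice is an assumption of the example rather than something the experts prescribe. The group labels, the sample records, and the 0.1 threshold are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, int]  # (group label, outcome), where outcome is 1 or 0


def selection_rates(records: Iterable[Record]) -> Dict[str, float]:
    """Share of positive outcomes per group."""
    totals: Dict[str, int] = defaultdict(int)
    positives: Dict[str, int] = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        positives[group] += outcome
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(records: Iterable[Record]) -> float:
    """Difference between the highest and lowest group selection rate."""
    rates = selection_rates(records)
    return max(rates.values()) - min(rates.values())


def passes(records: Iterable[Record], threshold: float = 0.1) -> bool:
    """True if the gap is within the (arbitrary, illustrative) threshold."""
    return demographic_parity_gap(records) <= threshold


if __name__ == "__main__":
    sample = [("a", 1), ("a", 1), ("a", 0), ("a", 0),
              ("b", 1), ("b", 0), ("b", 0), ("b", 0)]
    print(selection_rates(sample), demographic_parity_gap(sample), passes(sample))
```

Running a check like this on every model release is one way to benchmark against a fixed standard over time.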
While Sergiienko also believes that AI results may never be entirely free of bias, he offers several strategies businesses can implement to minimize bias.
1. Use diverse and representative datasets: The data used to train AI models should represent a wide range of perspectives and demographics.
2. Implement retrieval-augmented generation (RAG): This model architecture combines retrieval-based techniques with generation-based techniques. It pulls relevant data from external sources before generating a response, providing more accurate and contextually grounded answers.
3. Pre-generate and store responses: For highly sensitive topics, businesses can pre-generate and review answers to ensure they are accurate and appropriate.
4. Fine-tune with task-specific datasets: Supplying domain-specific knowledge to the large language model can reduce bias by improving its contextual understanding and producing more accurate outputs.
5. Review and refine system prompts: This can help prevent models from unintentionally generating biased or inaccurate outputs.
6. Evaluate and test regularly: Businesses must continuously monitor AI outputs and run test cases to identify biases. For example, prompts like “Describe a strong leader” or “Describe a successful entrepreneur” can help reveal gender, ethnicity, or cultural biases.
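The regular testing described in the last strategy can be automated in a simple form. The sketch below sends probe prompts to a model several times and counts gendered words in the replies. It is a rough illustration only: the `generate` parameter is a hypothetical stand-in for a call to whichever model is being tested, `fake_generate` is a canned response used to keep the example runnable, and the two word lists are far too small for a real audit.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable

PROBES = ["Describe a strong leader", "Describe a successful entrepreneur"]

# Tiny illustrative lexicons (an assumption); real audits need reviewed word lists.
MALE = {"he", "him", "his", "man", "men"}
FEMALE = {"she", "her", "hers", "woman", "women"}


def gender_counts(text: str) -> Counter:
    """Count male- and female-coded words in a piece of text."""
    counts: Counter = Counter()
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in MALE:
            counts["male"] += 1
        elif word in FEMALE:
            counts["female"] += 1
    return counts


def run_probes(
    generate: Callable[[str], str],
    probes: Iterable[str] = PROBES,
    samples: int = 5,
) -> Dict[str, Counter]:
    """Call `generate` several times per probe and total the gendered words."""
    report: Dict[str, Counter] = {}
    for prompt in probes:
        total: Counter = Counter()
        for _ in range(samples):
            total += gender_counts(generate(prompt))
        report[prompt] = total
    return report


def fake_generate(prompt: str) -> str:
    """Canned reply standing in for a real model call."""
    return "He is decisive and his team trusts him."


if __name__ == "__main__":
    for prompt, counts in run_probes(fake_generate, samples=2).items():
        print(prompt, dict(counts))
```

A heavy skew toward one set of words across many samples does not prove bias on its own, but it flags prompts that deserve a closer human review.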
“Businesses can start by encoding ethical and responsible standards into the Gen AI system they build and use,” says Babak Hodjat, CTO of Cognizant. He says AI itself can help here, for instance, by leveraging multiple AI agents to monitor and correct each other’s outputs. LLMs can be set up so that one model “checks” another, reducing the risk of biases or fabricated responses.
As an example of such a system, he points to Cognizant’s Neuro AI agent framework, which is designed to create a cross-validating system between models before it presents outputs to humans.
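The general pattern of one model checking another can be sketched in a few lines. The example below is a generic illustration and makes no claim about how Cognizant’s framework works internally. The `generate` and `review` callables are hypothetical stand-ins for two separate models, and the toy versions here exist only so the loop can run.

```python
from typing import Callable, Optional, Tuple

Generator = Callable[[str, Optional[str]], str]   # (prompt, feedback) -> draft
Reviewer = Callable[[str, str], Optional[str]]    # (prompt, draft) -> feedback, or None if approved


def checked_answer(
    prompt: str, generate: Generator, review: Reviewer, max_attempts: int = 3
) -> Tuple[Optional[str], int]:
    """Draft, review, and redraft with feedback until the reviewer approves.

    Returns (approved draft, attempts used), or (None, max_attempts) if no
    draft was approved, in which case a human should take over.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        draft = generate(prompt, feedback)
        feedback = review(prompt, draft)
        if feedback is None:
            return draft, attempt
    return None, max_attempts


def toy_generate(prompt: str, feedback: Optional[str]) -> str:
    """Stand-in generator: stereotyped first draft, corrected after feedback."""
    return "Nurses are women." if feedback is None else "Nurses can be of any gender."


def toy_review(prompt: str, draft: str) -> Optional[str]:
    """Stand-in reviewer: flags one hard-coded stereotype."""
    return "Avoid gender stereotypes." if "are women" in draft else None


if __name__ == "__main__":
    print(checked_answer("Describe a nurse", toy_generate, toy_review))
```

Capping the number of attempts and returning nothing on failure keeps an unapproved draft from reaching users by default.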
But mitigating bias is like walking a tightrope. Beatriz Sanz Saiz, EY Consulting Data and AI Leader, points to some recent attempts to eliminate bias that have translated into a view of the world that does not necessarily reflect the truth.
For instance, she says when some existing LLMs were asked to provide an image of World War II German soldiers, the algorithm responded with an image with equally balanced numbers of women and men, and of Caucasians and people of color. The system tried its best to remain unbiased, but in the process, the results weren’t entirely true.
Saiz says this poses a question: should LLMs be trained for truth-seeking? Or is there potential in building an intelligence that doesn’t know of, or learn from, past mistakes?
“There are pros and cons to both approaches,” says Saiz. “Ideally the answer is not one or the other, but a combination of the two.”