What is algorithmic bias?
20 September 2024
Authors
Alexandra Jonker, Editorial Content Lead

Algorithmic bias occurs when systematic errors in machine learning algorithms produce unfair or discriminatory outcomes. It often reflects or reinforces existing socioeconomic, racial and gender biases.

Artificial intelligence (AI) systems use algorithms to discover patterns and insights in data, or to predict output values from a given set of input variables. Biased algorithms can impact these insights and outputs in ways that lead to harmful decisions or actions, promote or perpetuate discrimination and inequality, and erode trust in AI and the institutions that use it. These impacts can create legal and financial risks for businesses. For example, under the EU AI Act, engaging in prohibited AI practices can result in fines of up to EUR 35,000,000 or 7% of worldwide annual turnover, whichever is higher.

Algorithmic bias is especially concerning when found within AI systems that support life-altering decisions in areas such as healthcare, law enforcement and human resources. Bias can enter algorithms in many ways, such as skewed or limited training input data, subjective programming decisions or result interpretation.

Mitigating algorithmic bias starts with applying AI governance principles, including transparency and explainability, across the AI lifecycle.

What causes algorithmic bias?

Algorithmic bias stems less from the mathematics of the algorithm itself than from the human choices around it: how the data science team collects and codes training data, how it designs the model and how it interprets the results. Specific causes include:

  • Biases in training data
  • Biases in algorithm design
  • Biases in proxy data
  • Biases in evaluation
Biases in training data

Flawed data is non-representative, incomplete, historically biased or otherwise “bad” data.1 It leads to algorithms that produce unfair outcomes and amplify any biases already in the data. AI systems that feed biased results back in as input data for later decisions create a feedback loop: the algorithm keeps learning and perpetuating the same biased patterns, and the results grow increasingly skewed over time.
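As a purely hypothetical illustration of this feedback loop, the Python sketch below gives two districts identical true incident rates but starts one with more recorded incidents; an allocation rule trained on those records then keeps directing more attention to that district, so the skew in the data sustains itself. All names and numbers are invented.

```python
import random

random.seed(0)

# Hypothetical scenario: both districts have the SAME true incident rate,
# but district_a starts with more recorded incidents in the historical data.
true_rate = {"district_a": 0.05, "district_b": 0.05}
recorded = {"district_a": 120, "district_b": 60}

for year in range(5):
    total = sum(recorded.values())
    # The "algorithm": allocate 100 units of attention in proportion to past records.
    allocation = {d: round(100 * recorded[d] / total) for d in recorded}
    # More attention means more incidents observed, even at equal true rates,
    # so this year's skewed observations become next year's training data.
    for d in recorded:
        recorded[d] += sum(
            random.random() < true_rate[d] for _ in range(allocation[d] * 20)
        )
    print(year, allocation, recorded)

# The roughly 2:1 skew in the records persists across every iteration,
# sustained by the feedback loop rather than by any real difference in rates.
```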

Bias can also arise during the training phase if data is incorrectly categorized or assessed. Algorithms can also “learn” from correlation rather than causation, because they have no inherent way to tell the two apart. When this happens, the model’s output can be biased because it has overlooked other factors in the data that matter more.

A commonly cited example of correlation bias is a hypothetical model that infers a causal relationship between higher ice cream sales and increased shark attacks. In reality, both simply tend to occur during summer; the relationship is only a correlation.
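A short Python sketch (with invented numbers) makes the example concrete: both ice cream sales and shark attacks are driven by a hidden seasonal factor, so they correlate strongly even though neither causes the other. It uses statistics.correlation, available in Python 3.10 and later.

```python
import random
import statistics

random.seed(1)

# Hypothetical monthly data: both variables depend on temperature (season),
# not on each other.
temps = [random.uniform(5, 35) for _ in range(120)]            # average °C
ice_cream_sales = [200 + 15 * t + random.gauss(0, 40) for t in temps]
shark_attacks = [0.3 * t + random.gauss(0, 2) for t in temps]

print("sales vs attacks:",
      round(statistics.correlation(ice_cream_sales, shark_attacks), 2))

# Restricting to summer months (a crude way of holding the confounder fixed)
# shrinks the apparent relationship considerably.
summer = [i for i, t in enumerate(temps) if t > 25]
print("summer months only:",
      round(statistics.correlation([ice_cream_sales[i] for i in summer],
                                   [shark_attacks[i] for i in summer]), 2))
```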

Biases in algorithm design

Algorithm design can also introduce bias. Programming errors, such as a designer unfairly weighting factors in the decision-making process, can carry over into the system unnoticed. Weighting is often used as a technique to avoid bias: it adjusts the data so that it better reflects the actual population. However, it can require assumptions from designers, and inaccurate assumptions introduce bias of their own. Developers might also embed subjective rules in the algorithm based on their own conscious or unconscious biases.
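As a minimal sketch of the weighting technique mentioned above (groups and proportions invented), the snippet below rebalances a skewed sample toward an assumed population mix. The assumed mix is itself a designer judgment: if it is wrong, the weights introduce bias rather than remove it.

```python
from collections import Counter

# Hypothetical training sample that over-represents one group.
sample = ["group_a"] * 800 + ["group_b"] * 200
sample_share = {g: n / len(sample) for g, n in Counter(sample).items()}

# The designer must ASSUME the true population mix; an inaccurate assumption
# here introduces bias instead of correcting it.
assumed_population = {"group_a": 0.6, "group_b": 0.4}

weights = {g: assumed_population[g] / sample_share[g] for g in sample_share}
print(weights)  # {'group_a': 0.75, 'group_b': 2.0}
```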

Biases in proxy data

AI systems sometimes use proxies as stand-ins for attributes that are difficult to measure or that they are prohibited from using directly, such as race or gender. However, proxies can be unintentionally biased because they might correlate with the sensitive attributes they were meant to avoid. For example, if an algorithm uses postal codes as a proxy for economic status, it might unfairly disadvantage certain groups in areas where postal codes are associated with specific racial demographics.
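The postal-code example can be sketched in a few lines of Python (groups, zones and rates are all invented): a decision rule that never sees the protected attribute still produces very different approval rates across groups, because the proxy it relies on is correlated with that attribute.

```python
import random

random.seed(2)

# Hypothetical town where residential patterns tie postal-code zone to group.
def make_applicant():
    group = random.choice(["group_x", "group_y"])
    weights = [80, 20] if group == "group_x" else [20, 80]
    zone = random.choices(["A", "B"], weights=weights)[0]
    return group, zone

applicants = [make_applicant() for _ in range(10_000)]

# A rule that never looks at group membership, only at the postal-code zone.
def approve(zone):
    return zone == "A"

for group in ("group_x", "group_y"):
    members = [zone for g, zone in applicants if g == group]
    rate = sum(approve(z) for z in members) / len(members)
    print(group, "approval rate:", round(rate, 2))   # roughly 0.80 vs 0.20
```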

Biases in evaluation

Biases in evaluation occur when algorithm results are interpreted through the preconceptions of the people involved rather than the objective findings. Even if the algorithm itself is neutral and data-driven, how an individual or business understands and applies its output can still lead to unfair outcomes.


The risks of algorithmic bias

When algorithmic bias goes unaddressed, it can perpetuate discrimination and inequality, create legal and reputational damage and erode trust.

Discrimination and inequality

Biased algorithmic decisions reinforce the societal disparities that marginalized groups already face, leading to unfair and potentially harmful outcomes from AI systems. While many common AI applications might seem low-stakes (such as search engines, chatbots and social media sites), other applications influence life-altering decisions. Biased AI tools in areas like criminal justice, healthcare and hiring can yield devastating results.

For example, the marginalization of African American people in the past is reflected in historical arrest data from Oakland, California in the United States. If this data is used to train a current predictive policing algorithm (PPA), the decisions made by the PPA are likely to reflect and reinforce those past racial biases.

Legal and reputational damage

Organizations that use biased AI systems could face legal consequences and reputational damage, as biased recommendations can have what’s known as a disparate impact. This is a legal term referring to situations where seemingly neutral policies and practices can disproportionately affect individuals from protected classes, such as those susceptible to discrimination based on race, religion, gender and other characteristics.
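One common way to quantify disparate impact is to compare selection rates between groups; in US employment practice, a ratio below roughly 0.8 (the “four-fifths rule”) is often treated as a warning sign. The sketch below uses hypothetical counts:

```python
# Hypothetical outcomes from an automated screening tool.
outcomes = {
    "group_x": {"selected": 90, "applicants": 300},   # 30% selection rate
    "group_y": {"selected": 45, "applicants": 300},   # 15% selection rate
}

rates = {g: o["selected"] / o["applicants"] for g, o in outcomes.items()}
impact_ratio = min(rates.values()) / max(rates.values())

print(rates)                   # {'group_x': 0.3, 'group_y': 0.15}
print(round(impact_ratio, 2))  # 0.5, well below the 0.8 "four-fifths" threshold
```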

Protected groups adversely affected by biased AI decisions might file lawsuits, potentially leading to significant financial liabilities, long-term reputational damage and condemnation from stakeholders. Organizations could also face financial penalties if they are found to be in violation of any applicable antidiscrimination laws.

Erosion of trust

Biased results from AI tools erode trust in AI in multiple ways. An organization found to have biased AI systems might lose the trust of stakeholders within the business, who no longer have confidence in its algorithmic decision-making. Those stakeholders might also conclude that the optimization value of AI no longer outweighs its risks and lose confidence in the technology overall.

Algorithmic bias can also cost an organization the trust of its customers. It takes only one case of discrimination to damage brand reputation, especially in the era of fast-spreading news. Trust in AI is especially important to retain with marginalized groups, such as people of color, who already experience bias and discrimination in the physical world.

Real-world examples of algorithmic bias

Algorithmic bias can occur in any scenario or sector that uses an AI system to make decisions. Here are some real-world examples of algorithmic bias:

  • Bias in criminal justice
  • Bias in predictive policing
  • Bias in healthcare
  • Bias in recruitment
  • Bias in financial services
  • Bias in image generation
  • Bias in facial recognition systems
  • Bias in pricing
Bias in criminal justice

United States courts use the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) tool to assess defendants’ recidivism risk. A ProPublica study found that the tool’s algorithm might have classified white and Black defendants differently. For example, Black defendants were twice as likely as white defendants to be misclassified as being at higher risk of violent recidivism. The company that created the tool disputes this analysis; however, it does not disclose the methods used to arrive at its risk scores.2

Bias in predictive policing

Researchers built their own predictive policing algorithm trained on victim report data from Bogotá, Colombia. When they compared the model’s predictions with actual crime data, they found major errors. For example, in districts with a high volume of reports, the model predicted roughly 20% more high-crime locations than actually existed. The error reflected a social bias in the underlying reports: Black people are more likely than white people to be reported for a crime.3

Bias in healthcare

In healthcare, underrepresentation of minority groups in data can skew predictive AI algorithms. For example, computer-aided diagnosis (CAD) systems have been found to return lower accuracy results for Black patients than white patients.

Bias in recruitment

Amazon abandoned an AI recruiting tool after discovering that it systematically discriminated against female job applicants. Developers had trained the hiring algorithm on resumes from past hires, who were predominantly male. As a result, the algorithm unfairly favored keywords and characteristics found in men’s resumes.4

Bias in financial services

Bias within financial services can have severe consequences for people’s livelihoods, because historical data can contain demographic biases that affect creditworthiness, loan approvals and more. For example, a study from the University of California, Berkeley showed that an AI mortgage system routinely charged minority borrowers higher rates than white borrowers for the same loans.5

Bias in image generation

Academic researchers found gender bias in the AI image generator Midjourney. In an analysis of more than 100 generated images, they also found instances of racial, class and age bias. For example, when asked to create images of people in specialized professions, the tool depicted both younger and older people, but the older people were always men, reinforcing gender bias about women in the workplace.6

Bias in facial recognition systems

Research from MIT found that some general-purpose commercial facial recognition systems, such as those used for matching faces in photos, were unable to recognize darker-skinned individuals. Recognition was even worse for darker-skinned women. Training data that misrepresented real demographics skewed the results.7

Bias in pricing

After a Chicago law forced ride-hailing companies to disclose their fares, researchers discovered that the pricing algorithms used by Uber and Lyft charged more for drop-offs in neighborhoods with larger non-white populations.8

How to avoid algorithmic bias

Mitigating bias from AI systems starts with AI governance, which refers to the guardrails that make sure AI tools and systems are and remain safe and ethical. It establishes the frameworks, rules and standards that direct AI research, development and application to help ensure safety, fairness and respect for human rights.

Organizations might consider the following AI governance principles to avoid potential AI bias across the system lifecycle:

  • Diverse and representative data
  • Bias detection and mitigation
  • Transparency and interpretability
  • Inclusive design and development
Diverse and representative data

A machine learning model is only as good as the data that trains it. For AI to better reflect the diverse communities it serves, a far wider variety of human data must be represented in models. Data fed into machine learning models and deep learning systems must be comprehensive and balanced, representative of all groups of people and reflective of the actual demographics of society.
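A simple starting point is a representation check that compares the group mix in a training dataset with a reference distribution, such as census figures. The sketch below uses invented groups, shares and a hypothetical 80% threshold for flagging under-representation.

```python
from collections import Counter

# Hypothetical group labels attached to a training dataset.
training_groups = ["group_a"] * 700 + ["group_b"] * 250 + ["group_c"] * 50

# Reference shares the dataset is meant to reflect (assumed, e.g. census-based).
reference = {"group_a": 0.55, "group_b": 0.30, "group_c": 0.15}

counts = Counter(training_groups)
total = sum(counts.values())
for group, target in reference.items():
    actual = counts[group] / total
    flag = "UNDER-REPRESENTED" if actual < 0.8 * target else "ok"
    print(f"{group}: dataset {actual:.0%} vs reference {target:.0%} -> {flag}")
```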

Bias detection and mitigation

No computer system is ever fully “trained” or “finished.” Ongoing monitoring and testing (through initiatives such as impact assessments, algorithmic auditing and causation tests) can help detect and correct potential biases before they create problems. Human-in-the-loop processes, in which a person reviews the system’s recommendations before a decision is made, provide another layer of quality assurance.
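In practice, an algorithmic audit often comes down to computing error rates separately for each group and flagging large gaps, in the spirit of the COMPAS analysis cited earlier. A minimal sketch with made-up counts:

```python
# Hypothetical audit of a deployed risk classifier: compare false positive
# rates (flagged high risk but no reoffense) across groups.
audit = {
    "group_x": {"false_pos": 44, "true_neg": 156},
    "group_y": {"false_pos": 20, "true_neg": 180},
}

for group, c in audit.items():
    fpr = c["false_pos"] / (c["false_pos"] + c["true_neg"])
    print(group, "false positive rate:", round(fpr, 2))   # 0.22 vs 0.10

# A gap this large would be escalated for human review (the human-in-the-loop
# step described above) before the model's recommendations are acted on.
```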

Transparency and interpretability

AI systems can be “black boxes,” which makes it difficult to understand their outcomes. Transparent AI systems clearly document and explain the underlying algorithm’s methodology and who trained it. The more people understand how AI systems are trained and tuned and how they make decisions, the more individual stakeholders and society at large can trust AI’s accuracy and fairness.

Inclusive design and development

Inclusive AI starts with a diverse and interdisciplinary team of AI programmers, developers, data scientists, ML engineers and others who vary by race, economic background, education level, gender, job description and other demographic characteristics. Diversity within design and development brings different perspectives that help identify and mitigate biases that might otherwise go unnoticed.

Algorithmic bias regulation

Governments and policymakers are creating AI frameworks and regulations to help guide—and in some cases, enforce—the safe and responsible use of AI. For example:

  • The European Union introduced the EU AI Act, which sets specific requirements for high-risk AI systems, including measures to prevent and mitigate bias.

  • The Algorithmic Impact Assessments Report from New York University’s AI Now Institute is a practical framework. Like an environmental impact assessment, it guides public agencies in assessing AI systems to ensure public accountability.9

  • The White House’s Blueprint for an AI Bill of Rights has a principle dedicated to algorithmic discrimination protections. It includes expectations and guidance on how to put this principle into practice.

  • The Biden Administration’s Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence sets guidelines for AI development and use, including addressing algorithmic discrimination through training, technical assistance and coordination between the US Department of Justice and federal civil rights offices.
Footnotes

1. “Algorithmic Bias: A New Legal Frontier,” International Association of Defense Counsel, 2019.

2. “How We Analyzed the COMPAS Recidivism Algorithm,” ProPublica, 23 May 2016.

3. “Predictive policing is still racist—whatever data it uses,” MIT Technology Review, 5 February 2021.

4. “Why Amazon’s Automated Hiring Tool Discriminated Against Women,” ACLU, 12 October 2018.

5. “AI is Making Housing Discrimination Easier Than Ever Before,” The Kreisman Initiative for Housing Law and Policy, University of Chicago, 12 February 2024.

6. “Ageism, sexism, classism and more: 7 examples of bias in AI-generated images,” The Conversation, 9 July 2023.

7. “Algorithmic bias detection and mitigation: Best practices and policies to reduce consumer harms,” Brookings, 22 May 2019.

8. “Algorithmic Bias Explained,” The Greenlining Institute, February 2021.

9. “Algorithmic Impact Assessments Report: A Practical Framework for Public Agency Accountability,” AI Now Institute, 9 April 2018.
