
What is trustworthy AI?

24 October 2024

5 min read

Authors

Alice Gomstyn

IBM Content Contributor

Alexandra Jonker

Editorial Content Lead

Amanda McGrath

Writer, IBM

What is trustworthy AI?

Trustworthy AI refers to artificial intelligence systems that are explainable, fair, interpretable, robust, transparent, safe and secure. These qualities create trust and confidence in AI systems among stakeholders and end users.

Trustworthy artificial intelligence, or TAI, can mitigate the potential risks that are associated with the deployment of AI models. These AI risks include harm to people, organizations and ecosystems. When such harms take place, they can undermine trust not only in specific AI models but in artificial intelligence overall.

Trustworthy AI frameworks can help guide organizations in their development, adoption and evaluation of AI technologies. Several government and intergovernmental organizations have established such frameworks, including the National Institute of Standards and Technology (NIST) in the United States, the European Commission’s High-Level Expert Group on AI and the Organisation for Economic Co-operation and Development (OECD).

Additionally, businesses can implement different strategies and tools to improve the trustworthiness of their AI systems. For example, continuous monitoring, documentation and AI governance frameworks can all help minimize risk.

 

Why is trustworthy AI important?

Understanding how a technology works is often key to trusting its efficacy. But many AI and machine learning (ML) systems, such as deep learning models, operate as veritable black boxes; they ingest data and create outputs, with little to no transparency into how they arrive at those outputs.

As a result, trust shortfalls abound. A 2023 survey found that more than 40% of business leaders cited concerns about AI trustworthiness.1 Meanwhile, consumers have also demonstrated AI distrust: a 2024 study found that including the term “artificial intelligence” in a product's labeling can make shoppers less likely to buy that product.2

Real-world examples of AI systems producing errant or harmful results in high-stakes use cases further fuel AI trust concerns. In one well-known healthcare example, an AI model failed to reliably diagnose sepsis. While the model performed well in a training setting, it missed more than two-thirds of actual sepsis cases among hospital patients.3

In other cases, AI models have demonstrated biased algorithmic decision-making, including predictive policing systems that disproportionately target minority communities and applicant tracking systems that favor male candidates over female ones. And then there are security concerns, such as AI chatbots inadvertently revealing sensitive, personal data and hackers exploiting vulnerabilities in AI models to steal proprietary corporate information.

When AI models underperform or produce harmful results, it can undermine trust not only in those models, but in artificial intelligence in general, potentially hampering future development and adoption of AI. Achieving trustworthy AI systems and supporting future AI development means shedding light inside the metaphorical AI black box. This enables stakeholders to count on their AI applications to deliver reliable, accurate results while minimizing the risks of outcomes that are biased or not aligned with original intent.

 

What are the principles of trustworthy AI?

Different organizations and frameworks emphasize various guiding principles and goals for trustworthy AI. Frequently cited principles of trustworthy AI include:

  • Accountability
  • Explainability
  • Fairness
  • Interpretability and transparency
  • Privacy
  • Reliability
  • Robustness and security
  • Safety

Accountability

Accountability in AI entails holding AI actors responsible for the proper functioning of AI systems throughout their lifecycles. These actors include the individuals and organizations involved in developing, deploying or operating AI technology.4

 

Explainability

AI explainability is about providing justifications for a model's outputs so that they can be verified. Various explainability methods, collectively known as explainable AI, enable human users to comprehend and trust the results and output created by machine learning algorithms.
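
One common post-hoc explainability technique is permutation feature importance, which measures how much a model's performance drops when the values of a single feature are shuffled. The following minimal sketch uses scikit-learn on synthetic data; the dataset, model choice and parameters are illustrative assumptions rather than a prescribed method.

    # Post-hoc explainability sketch: permutation feature importance.
    # The synthetic dataset and model choice are illustrative assumptions.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

    # Shuffle each feature in turn and measure the drop in held-out accuracy.
    result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
    for i, importance in enumerate(result.importances_mean):
        print(f"feature_{i}: mean importance {importance:.3f}")

Features whose shuffling causes a large drop in accuracy are the ones the model relies on most, giving users a rough justification for its outputs.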

 

Fairness

Fairness in AI refers to the equitable treatment of individuals and groups. It encompasses the mitigation of algorithmic and data biases. Algorithmic bias occurs when systemic errors in machine learning algorithms produce unfair or discriminatory outcomes, while data bias refers to the skewed or unrepresentative nature of the training data used in an AI model.
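
Fairness can also be checked quantitatively. One simple metric is the demographic parity gap: the difference in favorable-outcome rates between groups. The sketch below computes it with NumPy on made-up predictions; the data, group labels and the 0.1 review threshold are illustrative assumptions.

    # Fairness sketch: demographic parity gap between two groups.
    # The predictions, group labels and threshold are illustrative assumptions.
    import numpy as np

    y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])                    # 1 = favorable decision
    group = np.array(["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"])  # protected attribute

    rate_a = y_pred[group == "a"].mean()  # favorable-outcome rate for group a
    rate_b = y_pred[group == "b"].mean()  # favorable-outcome rate for group b
    parity_gap = abs(rate_a - rate_b)

    print(f"group a: {rate_a:.2f}, group b: {rate_b:.2f}, gap: {parity_gap:.2f}")
    if parity_gap > 0.1:  # assumed review threshold
        print("Potential disparate impact -- review the training data and model for bias.")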

 

Interpretability and transparency

AI interpretability helps people better understand and explain the decision-making processes of AI models. Interpretability is about transparency, allowing users to comprehend a model's architecture, the features it uses and how it combines them to deliver predictions. While some models are inherently interpretable, others require the use of interpretation methods.
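
For instance, a linear model such as logistic regression is inherently interpretable: each learned coefficient states how strongly, and in which direction, a feature pushes the prediction. The sketch below trains one with scikit-learn; the synthetic dataset and the feature names are illustrative assumptions.

    # Interpretability sketch: an inherently interpretable (linear) model.
    # The synthetic dataset and feature names are illustrative assumptions.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=500, n_features=4, random_state=1)
    model = LogisticRegression().fit(X, y)

    # Each coefficient directly shows a feature's effect on the log-odds of the
    # positive class, so no separate interpretation method is required.
    for name, coef in zip(["income", "tenure", "age", "balance"], model.coef_[0]):
        print(f"{name}: {coef:+.3f}")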

 

Privacy

AI privacy refers to the protection of personal or sensitive information that is collected, used, shared or stored by AI. AI privacy is closely linked to data privacy. Data privacy, also known as information privacy, is the principle that a person should have control over their personal data. AI and data privacy can be maintained through a number of methods, ranging from cryptography to federated learning.
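
Federated learning, one of the methods mentioned above, trains a shared model without centralizing raw data: each party computes an update locally and only model parameters are aggregated. Below is a minimal, framework-free sketch of federated averaging in NumPy; the number of clients, their data and the simple linear model are illustrative assumptions.

    # Privacy sketch: federated averaging -- raw data never leaves each client.
    # The clients, their data and the linear model are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    # Each client holds its own private dataset (features X and targets y).
    clients = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(4)]

    global_weights = np.zeros(3)
    for _ in range(20):  # federated training rounds
        local_updates = []
        for X, y in clients:
            w = global_weights.copy()
            grad = X.T @ (X @ w - y) / len(y)  # local gradient step on private data
            local_updates.append(w - 0.1 * grad)
        # The server aggregates only parameters, never the underlying records.
        global_weights = np.mean(local_updates, axis=0)

    print("aggregated model weights:", np.round(global_weights, 3))

Because only the averaged parameters travel to the server, the raw personal records stay with each client; in practice, this approach is often combined with techniques such as differential privacy.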

 

Reliability

Reliability can be defined as the ability to function as intended or required, without failure, for a given period of time under certain conditions. Reliable AI systems, when used under expected conditions, should deliver correct results over a given period, which might include the full lifetime of those systems.5

 

Robustness and security

Secure, robust AI systems have protection mechanisms against adversarial attacks and unauthorized access, minimizing cybersecurity risks and vulnerabilities. They can perform under abnormal conditions without causing unintended harm and return to normal function after an unexpected event.
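
One way to probe robustness is to test how a model behaves under small adversarial perturbations. The sketch below applies a fast-gradient-sign-style perturbation to a hand-built logistic model in NumPy; the weights, input, label and epsilon value are illustrative assumptions, not an attack on any particular system.

    # Robustness sketch: a fast-gradient-sign-style adversarial perturbation
    # against a simple logistic model. All values are illustrative assumptions.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    w = np.array([2.0, -1.0, 0.5])  # assumed model weights
    x = np.array([0.4, 0.3, -0.2])  # a benign input
    y = 1.0                         # its true label

    # Gradient of the logistic loss with respect to the input.
    grad_x = (sigmoid(w @ x) - y) * w

    # Nudge the input in the direction that most increases the loss.
    epsilon = 0.2
    x_adv = x + epsilon * np.sign(grad_x)

    print(f"clean prediction:       {sigmoid(w @ x):.3f}")
    print(f"adversarial prediction: {sigmoid(w @ x_adv):.3f}")

If a tiny perturbation like this flips the prediction, the model is fragile; robust systems pair such testing with access controls, input validation and anomaly detection.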

 

Safety

Safe AI systems do not endanger human life, health, property or the environment. They are proactively designed to protect people from harm and incorporate measures that mitigate unsafe outcomes, including the possibility of removing a system from use.6

 

What risks can trustworthy AI mitigate?

AI systems that lack trustworthy qualities pose a wide array of risks. The National Institute of Standards and Technology (NIST), which is part of the US Department of Commerce, developed a framework that’s become a benchmark for AI risk management. It organizes the risks of potential harms from AI systems into the following categories:7

  • Harm to people
  • Harm to an organization
  • Harm to an ecosystem

Harm to people

This category includes harms posed to individuals’ civil liberties, rights, physical or psychological safety or economic opportunity. It also encompasses impacts on groups through discrimination and impacts on societies in the form of harms to democratic participation or educational access.

 

Harm to an organization

This category refers to harm to an organization’s business operations, harm from security breaches or monetary loss, and harm to its reputation.

 

Harm to an ecosystem

This category encompasses harm to “interconnected and interdependent elements and resources.” NIST specifically cites harms to the global financial system, supply chain or “interrelated systems” as well as to natural resources, the environment and the planet.

 

Biased or inaccurate outputs from AI systems can result in multiple harms. Returning to an earlier example, biased applicant-tracking systems can damage individuals’ economic opportunities while also hurting an organization’s reputation. If a large language model (LLM) is tricked into running malware that paralyzes a company’s operations, that could cause harm to both the company and the supply chain to which it belongs.

Trustworthy AI systems might help prevent such dire scenarios and consequences. According to NIST, “[t]rustworthy AI systems and their responsible use can mitigate negative risks and contribute to benefits for people, organizations and ecosystems.”

 

Trustworthy AI frameworks

In recent years, different frameworks have emerged to guide AI providers and users in the development, deployment and operation of trustworthy AI systems. These frameworks include:

 

The NIST AI Risk Management Framework

Published in January 2023, the NIST AI Risk Management Framework (AI RMF) includes an overview of AI risks across AI lifecycles and the characteristics of trustworthy AI systems. The framework also outlines specific actions to help organizations manage such systems, including testing, evaluation, verification and validation tasks.

The voluntary framework applies to any company or geography, but NIST acknowledges that not all trustworthy AI characteristics apply in every setting. The framework encourages using human judgment when choosing applicable trustworthiness metrics and recognizes that tradeoffs are usually involved when optimizing for one trustworthy AI characteristic over another. In July 2024, NIST released a companion resource to the AI RMF focused on generative AI.

 

The Organisation for Economic Co-operation and Development (OECD) AI Principles

The OECD AI Principles promote the respect of human rights and democratic values in the use of AI. Adopted in May 2019 and updated in May 2024, the OECD framework includes both values-based principles and recommendations for policymakers. The OECD touts the recommendations as the first intergovernmental standards for AI, with 47 adherents around the world, including the United States, European Union countries and countries in South America and Asia.

 

The EU’s Ethics Guidelines for Trustworthy Artificial Intelligence

The European Union’s guidelines, which were published in April 2019 by the European Commission’s High-Level Expert Group on AI, focus on AI ethics and emphasize a “human-centric” approach to AI development in the EU. The guidelines include 7 ethical principles, such as “human agency and oversight” and “societal and environmental well-being.” The following year, the group released the Assessment List for Trustworthy AI, which helps organizations evaluate their AI systems.

While the guidelines themselves are non-binding, they were later cited in the landmark EU AI Act, a law that governs the development or use of artificial intelligence in the European Union. The text of the act states that the EU AI ethical principles “should be translated, when possible, in the design and use of AI models.”8

 

Other organizations have also released frameworks and guidelines encouraging trustworthy AI, including the White House Office of Science and Technology Policy (through its Blueprint for an AI Bill of Rights) and companies such as Deloitte and IBM.

 

Trustworthy AI vs. ethical AI vs. responsible AI

The terms trustworthy AI, ethical AI and responsible AI are often used interchangeably. And because the definitions of each concept can vary by source and often include significant overlaps, drawing conclusive distinctions among the 3 can be challenging.

For example, common definitions of trustworthy AI and ethical AI list principles such as fairness and privacy as foundational to each concept. Likewise, accountable and transparent are attributes that are often associated with both trustworthy AI and responsible AI.

One way to discern among the 3 AI-based concepts is to look beyond their core principles and instead focus on how they’re used:

  • Trustworthy AI is often framed as something that is achieved; it is AI that establishes trust with its users.
  • Ethical AI, in contrast, has been described as AI systems that have ethical considerations—reflecting human values and moral standards—embedded in them during their design and development.
  • Responsible AI can be interpreted as encompassing the practical means of embedding those ethics in AI applications and workflows.

Strategies for achieving trustworthy AI

Organizations can take important steps to help ensure their artificial intelligence systems, including AI algorithms and data sets, are operating in alignment with the principles of trustworthy AI.

Assessment: Assessing AI-enabled business processes can help companies determine where there’s room for improvement on different trustworthiness metrics.

Continuous monitoring: Through continuous monitoring for problems such as AI bias and model drift, organizations can proactively address unfair or inaccurate processes or outputs, supporting fairness and reliability. (A simple drift check is sketched after these strategies.)

Risk management: Implementing a risk management framework and tools allows organizations to detect and minimize security breaches and privacy violations, strengthening AI robustness.

Documentation: Automated documentation across the data science and AI lifecycle can be used for industry and regulatory audits, enabling accountability and transparency.

AI governance frameworks: AI governance frameworks include procedures on data and model management, helping to ensure that developers and data scientists within an organization are following both internal standards and government regulations.
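
As a small illustration of the continuous-monitoring strategy above, the sketch below uses a two-sample Kolmogorov-Smirnov test from SciPy to flag drift between a feature's training distribution and its recent production values; the synthetic data and the 0.05 significance threshold are illustrative assumptions.

    # Continuous-monitoring sketch: detect data drift in a single feature with a
    # two-sample Kolmogorov-Smirnov test. Data and threshold are illustrative.
    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(42)
    training_values = rng.normal(loc=0.0, scale=1.0, size=5000)    # reference data
    production_values = rng.normal(loc=0.4, scale=1.0, size=1000)  # shifted live data

    result = ks_2samp(training_values, production_values)
    if result.pvalue < 0.05:  # assumed significance threshold
        print(f"Drift detected (KS statistic {result.statistic:.3f}) -- review the model and its data.")
    else:
        print("No significant drift detected for this feature.")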

AI governance software and open source toolkits can help organizations take these and other steps to improve trustworthiness in their AI systems. With the right measures and safeguards in place, businesses can minimize risks as they harness the power of AI.

 
Footnotes

Links reside outside ibm.com.

1 “Workday Global Survey: 98% of CEOs Say Their Organizations Would Benefit from Implementing AI, But Trust Remains a Concern.” Workday. 14 September 2023.

2 “Adverse impacts of revealing the presence of “Artificial Intelligence (AI)” technology in product and service descriptions on purchase intentions: the mediating role of emotional trust and the moderating role of perceived risk.” Journal of Hospitality Marketing & Management. 19 June 2024.

3 “From theory to practice: Harmonizing taxonomies of trustworthy AI.” Health Policy OPEN. 5 September 2024.

4 “OECD AI Principles: Accountability (Principle 1.5).” OECD. Accessed 17 October 2024.

5, 7 “Artificial Intelligence Risk Management Framework (AI RMF 1.0).” National Institute of Standards and Technology, US Department of Commerce. January 2023.

6 “Blueprint for an AI Bill of Rights: Safe and Effective Systems.” The White House Office of Science and Technology Policy. Accessed 17 October 2024.

8 “EU Artificial Intelligence Act: Recital 27.” The European Union. 13 June 2024.
