Recently, I was asked this question: How do I measure success in my AI ethics and governance engagements? I measure success by whether ethical AI principles are embedded into strategy, workflows and decision-making, not just written down. If teams are empowered to question, impacted communities are included, and models are built with accountability at every stage, you're on the right path.
For starters, the deeper failure is not considering human behavior and the experience of the person interacting with AI whose intelligence is supposedly being “augmented.” It’s important to know how you measure people’s behavior. A foundational principle of psychology is that you get far more of the human behaviors you measure.
What human behaviors are we measuring around the use of AI? What human behaviors do our clients want to see more of in the use of AI?
To begin, here's a story about one of my all-time favorite clients: a large police department. This client’s culture stood apart from that of any other I had previously encountered. When we began our engagement with members of the police department’s AI governing council, many of them didn’t know why they were there.
The reason was that they all had extensive domain expertise in policing. They’d say that they were not experts when it came to artificial intelligence (AI) or machine learning (ML). They would ask, ‘Why do you need me on this governing council?’
For that reason, our effort started with explaining why they were there and why it was important to have their wisdom and domain expertise in the topic at hand. But we did not stop there; we had to demonstrate why they were needed. They needed to learn and understand that what they brought to these AI solutions was critical to doing AI right. For AI implementation to succeed, it’s essential to put the people first.
The way we connected with them was by actively demonstrating that the hard part of getting AI right is not strictly technical at all. Having domain experts, true experts in the policing domain, is essential. These people understand the data, the context in which it was gathered and the relationships between data points, ensuring that the right AI is built, maintained and governed responsibly.
Some of the tactics we used were design thinking exercises that were born out of IBM’s own AI design guild. These design thinking exercises address questions like:
Do we have the right people in the room?
What is the core problem that we are trying to solve?
Do we have the right data, and the right understanding of that data according to the domain experts, to make this AI initiative come alive?
Which tactical AI principles must be reflected in our AI systems to earn public trust? And how do we define the functional and nonfunctional requirements necessary to bring those principles to life?
What are the unintended effects of these AI models?
How would you approach mitigating risk in a way that is intentional?
Who are all the personas that we need to be building for?
What do we need to communicate about the intended and unintended use of this AI?
This introspective work must be done in an environment that prioritizes humility, inclusivity and psychological safety, and that includes people with varied lived world experiences. The outcome of this work is that teams finally have the language they’ve long needed. It allows them to clearly communicate to builders or buyers what must be developed or acquired to responsibly curate AI solutions.
Again, you get the human behaviors that you measure. So, the question is, what are those human behaviors that we needed to measure to determine the success of a governance project?
Common questions to ask are:
What human behaviors are being measured? Which human behaviors are emphasized in communications efforts, and are they the same as the governance-related behaviors that are being measured?
Oftentimes, employees are pushed to use AI in the workplace (sometimes under threat of losing their jobs if they don't) while in parallel being told that they are accountable for responsible outcomes. Yet they are often strictly measured on how fast they can do their work and how productive they can be, without being measured at all on the outcomes themselves.
There have been several documented cases where the domain expert knew the AI was wrong and was expressly told to use their best judgment. Yet, instead of contradicting the AI, they did nothing—because, once again, their performance was evaluated according to entirely different metrics. Some people are indeed punished for contradicting an AI at work.
Returning to the question of what behaviors we are measuring, let’s explore how we can measure whether AI actually enhances a person’s intelligence, and whether that person actively engages as a critical consumer of AI. These individuals should be measurably incentivized to help train the AI, or at least to alert others when something is wrong, in order to drive more responsible outcomes.
To do that, we must measure whether employees who use AI truly understand AI risk: the real nature of data, of bias, of disparate impact. Do they know what it means to be accountable? Are they measured in a way that encourages that accountability?
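To make "measure the behaviors you want to see" concrete, here is a minimal sketch, purely illustrative, of one way such a behavior could be tracked. It assumes a hypothetical review log in which each record notes whether an employee accepted, challenged or escalated an AI recommendation; the field names and the "challenge rate" metric are assumptions for illustration, not a prescribed implementation or any specific product's API.

```python
from dataclasses import dataclass

# Hypothetical record of one human review of an AI recommendation.
# Field names are illustrative assumptions, not a standard schema.
@dataclass
class AIReview:
    reviewer_id: str
    accepted_ai_output: bool    # did the person accept the AI's recommendation as-is?
    challenged_ai_output: bool  # did the person question, correct or override it?
    escalated_concern: bool     # did the person flag a risk or unintended effect?

def challenge_rate(reviews: list[AIReview]) -> float:
    """Share of reviews in which the human actively questioned the AI.

    If a number like this is tracked (and rewarded) alongside speed and
    volume, employees are measured on critical engagement with the AI,
    not just on throughput.
    """
    if not reviews:
        return 0.0
    challenged = sum(1 for r in reviews if r.challenged_ai_output or r.escalated_concern)
    return challenged / len(reviews)

# Example: two reviews, one of which challenged the AI -> 0.5
sample = [
    AIReview("a1", accepted_ai_output=True, challenged_ai_output=False, escalated_concern=False),
    AIReview("a2", accepted_ai_output=False, challenged_ai_output=True, escalated_concern=True),
]
print(challenge_rate(sample))  # 0.5
```

The point of the sketch is not the code itself but the design choice it represents: if contradicting or escalating an AI recommendation never shows up in any metric, it will rarely show up in behavior.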
These leadership behaviors are just a few examples of the things that we know to look for as we consider setting up the right organizational culture.
A core component of knowing whether you’ve been successful in an AI governance engagement is whether you’ve earned people’s trust. To be clear: there is nothing easy about building AI in a way that earns trust. The leaders I look up to are able to show their work.
What I’d love for you to take away from this article is that you belong in this conversation even though you might think you don’t. You don’t need a degree in data science or a PhD in AI, I promise you. The fact that you bring a different lived world experience means that you belong in this conversation about AI.
This conversation doesn’t matter only because AI has been popularized and is likely to show up in your workplace. You’re going to want to know that the AI models being used, even by your children, are safe from adversaries and grounded in truth. They should be fair, avoid spewing toxic hate speech or going off the rails, and protect personal data with strong privacy safeguards.
Let’s be honest: this AI transition is hard. Leading in the AI era means getting comfortable with ambiguity. It means choosing to do what’s right, not just what’s easy or profitable in the short term. It requires developing a deeper understanding of AI technologies—how they work, what they can achieve, the risks they carry. And it requires imparting that knowledge and understanding throughout the workforce. It means investing in AI literacy, AI governance and a capable AI governance leader.
The leaders who step up now—who ask the brave questions, resource the right systems and center human values—not only avoid the pitfalls but actively shape a more responsible AI future. They earn the trust that becomes a lasting competitive advantage. This leadership moment is yours. Make it count.