An AI management system (AIMS) provides a framework for the development, deployment and continuous monitoring of AI systems. Thoughtful AI management systems are an essential component of responsible AI technologies, providing a sound procedural and organizational basis integrating risk management, regulatory compliance and operational efficiency into the AI development lifecycle.
Artificial intelligence systems behave differently from traditional software. A conventional software program follows explicit instructions to produce a deterministic output: providing the same input under the same circumstances to a traditional program will always yield the same output. An AI-driven system produces probabilistic outputs driven by machine learning algorithms: its output might vary even if the input and circumstances stay the same.
That dynamic logic is uniquely powerful, but the unpredictability it entails brings potential risks. For instance, organizations face reputational risks derived from biased or toxic outputs. Unchecked or unnoticed AI hallucinations present a their own broad suite of risks, from simply annoying or confusing users to inaccuracies in mission-critical scenarios with major consequences. AI-powered adversarial attacks or compromised datasets can introduce cybersecurity risks. Failure to comply with AI regulations in a rapidly changing environment carries financial and legal risks.
An AIMS is designed to anticipate and account for those risks. In more tangible terms, an AIMS can be understood as policies and protocols for embedding sound AI governance into any and all workflows relevant to an organization’s use of AI-based tools and products. This includes measures such as standardized proactive risk assessments, reporting mechanisms, routine audits and real-time monitoring of model outputs and performance.
Executed well, AI management systems not only mitigate risks, but also optimize operational efficiency and build trust between developers, end users and other stakeholders whose buy-in is essential to effective AI adoption.
Industry newsletter
Get curated insights on the most important—and intriguing—AI news. Subscribe to our weekly Think newsletter. See the IBM Privacy Statement.
Your subscription will be delivered in English. You will find an unsubscribe link in every newsletter. You can manage your subscriptions or unsubscribe here. Refer to our IBM Privacy Statement for more information.
A comprehensive artificial intelligence management system should address the following:
Who might be affected by issues of fairness and bias, and how would they be affected? Could your AI project inadvertently result in discrimination? Regulatory requirements aside, what are your ethical responsibilities regarding data privacy and accurately conveying the system’s capabilities and limitations? Earnest transparency on such ethical issues sustains stakeholder confidence and improves decision-making later on.
What specific threats are inherent to the development or deployment of your AI application? For instance, the use of AI tools in healthcare or legal domains must reckon with regulatory consequences; AI misfires in customer service contexts run the risk of reputational consequences. Given that the mitigation of these threats is one of the primary purposes of your AIMS, identifying specific AI risks entailed by your use case should be one of the earliest and most important steps in the process.
Proper data governance is essential to maintaining the ongoing quality and security of data used to train AI models (and data generated by them). It’s particularly important in domains that entail compliance with regulatory frameworks. For example, if a user asserts their GDPR-derived “right to be forgotten,” your system must allow for that individual’s data to be easily located and deleted. Systems operating in healthcare contexts must safeguard sensitive information in adherence to HIPAA requirements.
Protocols and policies to address these considerations must be embedded not only in the development and initial deployment of AI systems, but in their ongoing maintenance. Proper AI lifecycle management is necessary to spot and subdue risks introduced by model drift or other concerns that may emerge over time.
A strategy for actively interrogating the health of your AI system over time is among the most important means to maintain the objectives your AIMS is meant to help achieve. The specific structure and cadence of routine audits are among the most important operational considerations of AI management.
Whose job is it to enact and oversee all of this? Clear delineation of individual roles and responsibilities is necessary to take an AI management strategy from something that sounds nice on paper to something that actively safeguards AI innovation within your organization.
Establishing these principles and processes early in the AI development process helps to optimize and streamline decision-making later on.
There are many ways to implement effective AI management, and some organizations may choose to pursue entirely bespoke approaches to their AI management initiatives. But many will be well served by adhering to (or at least borrowing inspiration from) established frameworks such as ISO 42001, the NIST AI Risk Management Framework or the tenets prescribed by the EU AI Act.
ISO 42001—short for ISO/IEC 42001:2023—is the world’s first international standard for establishing, implementing, maintaining and continuing to improve AIMS. Developed in 2023, ISO 42001’s stated objective is to offer organizations “the comprehensive guidance they need to use AI responsibly and effectively, even as the technology is rapidly evolving.” Organizations can request formal ISO 42001 certification, which entails a third party audit of systems and processes to evaluate their adherence to ISO 42001’s prescribed AI practices.
More specifically, ISO 42001 prescribes a management system standard (MSS) for sound organizational governance of AI using the Plan-Do-Check-Act methodology.
The Plan stage is where an AIMS defines the rules of engagement for your AI system.
Context and scope: Define exactly what your AI tools do and do not do. For instance, you might specify that a customer support chatbot can answer support queries or escalate issues to a human, but it cannot directly authorize refunds.
Risk assessment: Map out the specific threats entailed by that context and scope. For instance, the customer support bot might hallucinate a company policy, causing confusion or misguided expectations. IBM’s AI risk atlas can help get you started.
Objective setting: Define tangible metrics for what qualifies as “success.” This will generally depend on the problem you’re trying to solve by implementing AI tools. If the goal of your AI project is to free up employee bandwidth for more complex queries, for example, you might define success as a certain percentage of cases being resolved without need for further human intervention.
In the Do stage, the principles and objectives defined in the prior stage are operationalized in an actual AI development workflow.
Potential threats should be accounted for in data governance policies.
Guardrails and redundancies should be established to guard against known risks.
The automation of system performance evaluation, where possible, minimizes the legwork required to objectively measure success and subsequently improve the system.
The AI system should be actively supervised. Every action taken by the system should be logged, to facilitate system explainability. If a given outcome of the AI system is troubling or unsatisfactory in some way, it can only be improved if you can identify exactly where in the process errors have occurred.
Routine audits should ensure that teams are following AIMS protocols—and that they’re doing so meaningfully, not just nominally.
When incidents occur, a system is in place to stop work and make necessary improvements. If audits turn up inadequate safeguards, unforeseen issues or emerging security vulnerabilities, new measures are put in place to address them.
The National Institute of Standards and Technology (NIST) is an agency housed within the United States Department of Commerce. NIST’s AI Risk Management Framework (AI RMF) is an adaptable playbook for responsible use of AI, supplemented by extensive literature of supporting materials including:
The AI RMF Playbook, which provides suggested actions for achieving outcomes laid out in the AI RMF proper.
The AI RMF Roadmap, which aims to “help fill gaps in knowledge, practice, or guidance,” with the goal of identifying ways to bolster the AI RMF itself and NIST’s ability to assist public and private sector organizations enact it.
AI RMF Crosswalks, which are designed to help organizations map existing practices to concepts and terms in a variety of AI management frameworks, including (but not limited to) ISO 42001.
Use Cases in government, industry and academia.
The AI RMF Playbook uses a Govern, Map, Measure, Manage methodology.
An “AI Risk Committee” is established. This committee defines the organization’s “risk tolerance,” and produces an AI Risk Policy document, to be signed by C-suite executives, that explicitly specifies who is accountable for any AI failures.
A discovery phase identifies specific risks pertaining to both the AI tools and the way people will use them. This process eventually yields the risk profile, itemizing every identified threat.
Each mapped threat is quantified in terms of how it might impact the company or its customers. This typically entails red teaming and stress teaming, yielding a TEVV Report (Test, Evaluation, Verification and Validation) that functions as a “scorecard” for system accuracy and trustworthiness.
Problems and potential vulnerabilities found during the measure phase are addressed and accounted for.
Whereas ISO 42001 and the NIST AI RMF are voluntary frameworks, the provisions of the EU AI Act are mandatory prescriptions for any organizations involved in the development and deployment of AI systems within the EU market.
The provisions of the EU AI Act include:
Mandatory technical documentation for model training, testing and evaluation.
The furnishing of information about the capabilities and limitations of relevant AI models to any downstream providers who will use it.
Compliance with the EU Copyright Directive
Public summaries of training data
If a given AI model exceeds 1,025 floating point operations (FLOPs) in training, it’s considered to present systemic risks and becomes subject to additional safeguards.
The EU provides tools and resources to help companies comply with the AI Act, including an interactive compliance checker that specifies which provisions apply to your business, a small business guide for small- and medium-sized enterprises (SMES) and an array of issue-specific articles.