
Building Ethically Aligned AI


The more AI agents are deployed in scenarios where unexpected situations may arise, the more flexible, adaptive, and creative they need to be in achieving their goals. Thus, a certain level of freedom to choose the best path to a specific goal is necessary to make AI robust and flexible enough to be deployed successfully in real-life scenarios.

This is especially true when AI systems tackle difficult problems whose solutions cannot be accurately defined by a traditional rule-based approach but instead require the data-driven and/or learning approaches increasingly being used in AI. Indeed, data-driven AI systems, such as those using machine learning, are very successful in terms of accuracy and flexibility, and they can be very “creative” in solving a problem, finding solutions that could positively surprise humans and teach them innovative ways to resolve a challenge.

However, creativity and freedom without boundaries can sometimes lead to undesired actions: the AI system could achieve its goal in ways that are not considered acceptable according to the values and norms of the impacted community. Thus, there is a growing need to understand how to constrain the actions of an AI system by providing boundaries within which the system must operate. This is usually referred to as the “value alignment” problem, since such boundaries should model the values and principles required for the specific AI application scenario.

At IBM Research, we have studied and assessed two ways to align AI systems to ethical principles:

  • The first uses the same formalism to model and combine subjective preferences (to achieve service personalization) and ethical priorities (to achieve value alignment) [3]. A notion of distance between preferences and ethical priorities is used to decide whether actions can be determined by the preferences alone or whether, because the preferences diverge too far from the ethical priorities, those priorities must also be taken into account.
  • The second employs a reinforcement learning approach (within the bandit problem setting) for reward maximization and learns the ethical guidelines from positive and negative examples [2]. We tested this approach on movie recommendations with parental guidance, as well as drug dosage selection with quality of life considerations.
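As a rough illustration of the distance-based decision in the first approach, here is a minimal sketch. The work in [3] defines distances over CP-nets; this sketch instead assumes plain rankings of actions and uses the normalized Kendall tau distance as a stand-in. The function names, the threshold value, and the position-sum merge used when the two rankings diverge are all hypothetical choices made for illustration, not the method from the papers.

```python
from itertools import combinations


def kendall_tau_distance(rank_a, rank_b):
    """Normalized Kendall tau distance between two rankings of the same items.

    Returns 0.0 for identical rankings and 1.0 for fully reversed ones.
    """
    if len(rank_a) != len(rank_b) or set(rank_a) != set(rank_b):
        raise ValueError("rankings must order the same items")
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    pairs = list(combinations(rank_a, 2))
    if not pairs:
        return 0.0
    # A pair is discordant when the two rankings order it differently.
    discordant = sum(
        1 for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / len(pairs)


def choose_ranking(user_prefs, ethical_priorities, threshold=0.5):
    """Hypothetical decision rule (an assumption, not the method in [3]).

    If the user's preferences are close enough to the ethical priorities,
    follow the preferences alone. Otherwise merge the two rankings by
    summed position, breaking ties in favor of the ethical priorities.
    """
    distance = kendall_tau_distance(user_prefs, ethical_priorities)
    if distance <= threshold:
        return list(user_prefs), distance
    user_pos = {item: i for i, item in enumerate(user_prefs)}
    ethics_pos = {item: i for i, item in enumerate(ethical_priorities)}
    merged = sorted(
        user_prefs,
        key=lambda item: (user_pos[item] + ethics_pos[item], ethics_pos[item]),
    )
    return merged, distance


# Illustrative data only: three made-up movie genres.
ethics = ["documentary", "comedy", "thriller"]
close_user = ["comedy", "documentary", "thriller"]
far_user = ["thriller", "comedy", "documentary"]

print(choose_ranking(close_user, ethics))  # close enough: preferences are kept
print(choose_ranking(far_user, ethics))    # too divergent: rankings are merged
```

The threshold controls how much personalization is tolerated before the ethical priorities are brought in. In a real system it would have to be chosen for the specific application scenario.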

The paper that describes our overall approach and these two ways to solve the value alignment problem will be presented at the upcoming AAAI 2019 conference and will receive the AAAI 2019 Blue Sky Idea award [1].

This work is part of a long-term effort, in collaboration with MIT, to understand how to embed ethical principles into AI systems. While the research in [2] and [3] models ethical priorities as deontological constraints, the IBM-MIT team is currently gathering data on human preferences to model how humans follow, and switch between, different ethical theories (such as utilitarian, deontological, and contractualist). The goal is then to engineer both the ethical theories and the switching mechanisms, suitably adapted, into AI systems. Such systems will be better aligned with the way people reason about and act upon ethics while making decisions, and thus better equipped to interact naturally and compactly with humans in an augmented intelligence approach to AI.


  1. “Building Ethically Bounded AI”, Francesca Rossi and Nicholas Mattei, to appear in Proceedings of AAAI 2019, senior member presentation track, Blue Sky idea award paper.
  2. “Incorporating Behavioral Constraints in Online AI Systems”, Avinash Balakrishnan, Djallel Bouneffouf, Nicholas Mattei, Francesca Rossi, to appear in Proceedings of AAAI 2019.
  3. “On the Distance Between CP-nets”, Andrea Loreggia, Nicholas Mattei, Francesca Rossi, K. Brent Venable. In Proc. AAMAS 2018, Stockholm, July 2018.

AI Ethics Global Leader, Distinguished Research Staff Member, IBM Research
