What does AI look like? You might say it looks like a robot, or flashing LEDs, or a waveform on a screen. But what would AI say AI looks like? To find out, IBM Research asked AI to draw us a picture… of itself. AI’s self-portrait was published in The New York Times today and, looking at the image, I am amazed not only by the result but also by the journey we took to get there.
The New York Times contacted IBM Research in late September asking for our help in using AI in a clever way to create art for its upcoming special section on AI. With a very short timeline and no guarantee of success, we set out to teach AI to create original art. We were given only a high-level task: identify an important concept in AI, create an original image that captures it, and present it in a way that fits the visual style of The New York Times. To accomplish it, we developed a new process that combines AI and human creativity.
Why would drawing a self-portrait be such a challenge for AI? After all, AI can drive cars, play video games, even produce a movie trailer. The difference is that these tasks don’t require AI to create new material, just to analyze the information at hand and make decisions or selections based on its training. We already know that AI can perform exceptionally well at language and image analysis. Creating new content, on the other hand, is a much more experimental activity.
To take on this challenge, we quickly assembled a multidisciplinary team within IBM Research that included Alfio Gliozzo, Mauro Martino, Michele Merler, and Cicero Nogueira dos Santos. The range of expertise required speaks to the nature of the task: deep science thinking, hands-on technical and engineering skills, and design and visualization talent were essential to our effort. In essence, we needed to explicitly define the creative process. The result is a nuanced pipeline in which AI performs critical functions in both analysis and synthesis to create something truly novel and captivating.
The process included the following three major steps:
- Identify a core visual concept in AI:
  - Ingested ~3,000 past articles on “AI” from The New York Times (NYT)
  - Applied natural language processing tools to identify the top-30 discriminative semantic concepts for “AI”
  - Trained a neural network for visual recognition based on images for these top-30 concepts
  - Applied the network to score images from NYT articles on how strongly they depict or represent “AI”
  - Selected one of the top-10 images: an image of a human and robot shaking hands
- Create an original image that captures the AI concept:
  - Built a training dataset of >1,000 images of human and robot hands
  - Trained a generative adversarial network (GAN) to draw new images of human and robot hands, which it did day and night for nearly a week
- Present it in a way that fits NYT visual style:
  - Collected a sample of cover art from NYT and trained a style transfer network
  - Applied the network to automatically produce stylized versions of the AI-generated hand images to match the NYT “visual language” for cover art
  - Chose the final image shown here based on overall concept clarity and artistic style
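To give a feel for the first step, here is a minimal sketch of one way to rank "discriminative" terms: compare how often each word appears in a target set of articles against a background set, using a smoothed log-odds score. This is an illustration only, not the tooling the team used; the function names, the `min_count` cutoff, and the toy documents are all assumptions made up for the example, and a real system would work with semantic concepts rather than single words.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_discriminative_terms(target_docs, background_docs, k=30, min_count=2):
    """Rank terms by smoothed log-odds of appearing in target vs. background.

    Hypothetical helper for illustration; not the method described in the post.
    """
    target = Counter(t for d in target_docs for t in tokenize(d))
    background = Counter(t for d in background_docs for t in tokenize(d))
    n_target = sum(target.values())
    n_background = sum(background.values())
    vocab_size = len(set(target) | set(background))

    scores = {}
    for term, count in target.items():
        if count < min_count:
            continue  # skip terms too rare in the target set to trust
        # Add-one smoothing keeps the ratio finite for terms that never
        # occur in the background set.
        p_target = (count + 1) / (n_target + vocab_size)
        p_background = (background[term] + 1) / (n_background + vocab_size)
        scores[term] = math.log(p_target / p_background)

    # Highest score first: terms most characteristic of the target set.
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Toy stand-ins for "AI" articles and general articles (invented data).
    ai_docs = [
        "The robot learns. The robot shakes a hand.",
        "A robot and a human hand.",
    ]
    other_docs = [
        "The market fell. The human traders sold.",
        "A hand of cards and a market.",
    ]
    print(top_discriminative_terms(ai_docs, other_docs, k=2))
```

Common words such as "the" score near zero because they are equally frequent in both sets, while words concentrated in the target articles rise to the top. The ranked terms could then be used to gather images for training a visual recognition model, as the list above describes.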
This pipeline gives us a compelling new capability for collaborative creativity that could be applied to other tasks as well. Imagine using AI to design artwork for a new album based on the musicians’ songs, lyrics, and history.
More importantly, the results show how AI and humans can work hand in hand to explore entirely new territory. We’ve seen this synergy in diverse settings from drug discovery to financial market prediction to malware detection. Extending this paradigm to the realm of creativity underscores the many ways that AI can augment human abilities.