What is physical AI?


Physical AI, explained

Physical AI refers to artificial intelligence (AI) systems that operate in and interact with the physical world, rather than existing only in software or digital environments.

Physical AI typically combines AI models with sensors, actuators and other control systems that allow those models to act on real-world environments, taking AI from the realm of bits to the realm of atoms. With AI, advanced physical systems can now perceive the environment, reason with the power of a large language model (LLM), act accordingly, and then learn from the outcome of that action.
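
In code terms, this perceive-reason-act-learn loop can be sketched in a few lines of Python. The sketch below is purely illustrative: the sensor, planner, actuator and memory objects are hypothetical placeholders, not any specific robotics API.

```python
# A minimal sketch of the perceive-reason-act-learn loop described above.
# Sensor, planner, actuator and memory are hypothetical placeholders,
# not a specific product or robotics API.

class PhysicalAIAgent:
    def __init__(self, sensor, planner, actuator, memory):
        self.sensor = sensor      # cameras, lidar, force sensors
        self.planner = planner    # e.g., an LLM- or policy-based reasoner
        self.actuator = actuator  # motors, wheels, grippers
        self.memory = memory      # stores outcomes for later learning

    def step(self):
        observation = self.sensor.read()                   # perceive
        action = self.planner.decide(observation)          # reason
        outcome = self.actuator.execute(action)            # act
        self.memory.record(observation, action, outcome)   # learn from the result
```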

Another way of thinking about physical AI is that it is simply AI-powered models applied to systems in physical space. For example, robotics focuses on mechanics and control of physical machines. Before AI, robot behavior was typically rule-based or scripted, and robots could only perform narrow tasks within specifically engineered environments. Think of a robotic arm that welds the same seam 1,000 times a day on an automotive production line, or an early-generation robotic vacuum that follows preset navigation rules.

In contrast, robotic AI agents equipped with general understanding from LLMs have a limited but still powerful “common sense” about the world. These models can be paired with reinforcement learning techniques in high-performance hybrid architectures so that robots can possess both general knowledge and a specialized understanding of a specific use case.

What’s more, physical AI goes far beyond individual robots to entire AI-powered factories, energy-efficient smart grids or fleets of automated vehicles. Many systems that exist in physical space can be augmented with AI.

Why is physical AI a hot topic?

Several bottlenecks that previously prevented a physical AI revolution are being broken at the same time. The first and most important is the arrival of generative AI, powered by foundation models. Today’s large computer vision and multimodal models can recognize objects, understand spatial relationships and generalize across settings. This reduces the amount of task-specific training required and allows systems to reuse intelligence across tasks.

The second bottleneck is being overcome by the power of modern simulation, which combines high-fidelity physics modeling, photorealistic rendering and parallelization. This dramatically reduces model training times and makes simulation useful not just for testing but as a primary training ground. A related trend is the explosion of available compute: breakthroughs in GPUs and data centers have made training at scale feasible.

Finally, hardware is better than ever. Modern robots have better sensors and lighter materials. They can take advantage of recent edge AI breakthroughs and better communications capabilities. These innovations have made experimentation viable, even for small startups. The result is a renaissance for physical automation initiatives, from autonomous vehicles to industrial robots and healthcare bots that perform surgery and other complicated procedures.

Jensen Huang, CEO of Nvidia, is widely credited with popularizing the term “physical AI” and framing it as the next major wave of AI-driven innovation. During a January 2026 podcast interview, Huang predicted a future with “a billion robots.”1 This vision involves a new global economy built around developing and maintaining all these robots, which could become one of the largest industries on the planet: nothing less than a second industrial revolution.

That same month, Nvidia released a collection of open models, frameworks and advanced AI infrastructure for physical AI.2 The release touted new technologies to speed up workflows across “the entire robot development lifecycle.”

“The ChatGPT moment for robotics is here,” Huang said.

The release includes open, fully customizable world models that enable physically based synthetic data generation and robot policy evaluation in simulation, an open reasoning vision language model and an open reasoning vision language action model, alongside new simulation and compute frameworks.


How does physical AI work?

Imagine the goal is to train a fleet of autonomous mobile robots (AMRs) that can pick up litter from sidewalks, parks and streets without harming people or themselves. The task is not simply defined as “picking up objects,” but as distinguishing litter from non-litter, navigating crowded environments, choosing safe paths and grasping objects of variable shape and size, among other concerns.

Once the goals are defined, the robot must be designed with the proper morphology. Should it be a humanoid robot or something else? Does it use wheels or legs? Does it need a gripper that pinches objects or a vacuum that sucks them up? What sort of cameras and sensors does it need to navigate its environment?

Then, a simulated environment is typically created. Such an environment might include terrain, litter, random objects (rocks, benches, fences, etc.), people, lighting effects and various weather conditions.

In this simulated training environment, the model governing the robot’s behavior learns what litter looks like, from bottles and cans to scraps of paper and tiny candy wrappers. It learns how to maintain balance on uneven terrain and in strong winds. It learns how to avoid bumping into people and how to grasp glass bottles firmly enough to pick them up but not so hard that they shatter.

Each training run changes the qualities of the components involved: bigger pieces of trash, different weather conditions, more people walking around. The robot “never sees the same sidewalk twice.”
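
One way to picture this per-run variation is as a randomized scene configuration that is re-sampled for every training run. The sketch below is written against a hypothetical configuration object rather than any particular simulator; the field names and ranges are illustrative assumptions.

```python
import random
from dataclasses import dataclass

# Hypothetical scene parameters for one simulated training run.
# Field names and ranges are illustrative, not tied to a specific simulator.
@dataclass
class SceneConfig:
    litter_count: int
    max_litter_size_cm: float
    pedestrian_count: int
    weather: str
    wind_speed_mps: float

def sample_scene() -> SceneConfig:
    """Sample a new randomized scene so the robot 'never sees the same sidewalk twice'."""
    return SceneConfig(
        litter_count=random.randint(5, 50),
        max_litter_size_cm=random.uniform(2.0, 40.0),
        pedestrian_count=random.randint(0, 30),
        weather=random.choice(["clear", "rain", "fog", "snow"]),
        wind_speed_mps=random.uniform(0.0, 15.0),
    )

# Each training run gets its own randomized configuration.
configs = [sample_scene() for _ in range(1000)]
```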

When the robot gets a defined task right, its behavior is “rewarded” with a high score, which reinforces the best behaviors. Across many iterations, the robot learns how to do its job.
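
A reward signal of this kind is typically expressed as a scoring function over what happened during a step or episode. Here is a simplified, hypothetical example for the litter-collecting robot; the event names and weights are illustrative only.

```python
# A simplified, hypothetical reward function for the litter-collecting robot.
# Event names and weights are illustrative assumptions.
def compute_reward(events: dict) -> float:
    reward = 0.0
    reward += 10.0 * events.get("litter_collected", 0)        # main objective
    reward -= 50.0 * events.get("collisions_with_people", 0)  # safety penalty
    reward -= 5.0 * events.get("objects_dropped", 0)          # sloppy grasps
    reward -= 0.01 * events.get("energy_used_joules", 0.0)    # encourage efficiency
    return reward

# Example: one step in which the robot picked up two items but dropped one.
print(compute_reward({"litter_collected": 2, "objects_dropped": 1}))  # 15.0
```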

Once the robot surpasses a certain success threshold, it is deployed to a real-world training environment, like a quiet street without too many people. The robot is fine-tuned to handle unexpected new conditions that weren’t present in the simulation, like wind blowing small bits of trash.

This information is used to improve the simulated training environment for additional training. The robot can then be stress-tested in more complex environments with dense crowds, in poor lighting or on wet, slippery surfaces.

Reinforcement learning

The reward mechanism described above is part of reinforcement learning, a type of machine learning in which autonomous agents learn to make decisions through trial-and-error interactions with their environment. Reinforcement learning is crucial for robotics because agents learn behavior through interaction over time, which is exactly what robots must do in the physical world.

The world is messy: surfaces differ, objects deform, sensor data is noisy and humans behave unpredictably. Writing hard-coded rules for every situation doesn’t scale. Reinforcement learning allows robots to discover strategies on their own by experimenting within constraints. Instead of being told how to move, the robot learns which behaviors work best under real conditions.

Reinforcement learning excels where other machine learning methods fall short. For example, grasping litter involves approaching it, aligning a manipulator, adjusting force and lifting, all while responding to real-time feedback. Supervised learning can theoretically label what a “good grasp” looks like, but it cannot easily teach a robot how to recover from a slip or adapt mid-motion. Reinforcement learning, by contrast, optimizes entire action sequences based on long-term outcomes.
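
As a concrete, minimal illustration of this interaction loop, the sketch below uses the open-source Gymnasium library. The CartPole environment stands in for a real robotics simulator, and random actions stand in for a trained policy; a real training run would update the policy based on the collected rewards.

```python
# A minimal reinforcement learning interaction loop using the open-source
# Gymnasium library. CartPole stands in for a real robotics environment,
# and random actions stand in for a learned policy.
import gymnasium as gym

env = gym.make("CartPole-v1")

for episode in range(5):
    observation, info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = env.action_space.sample()  # placeholder for a trained policy
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        done = terminated or truncated
    print(f"Episode {episode}: reward = {total_reward}")

env.close()
```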

This is just one example of how a robot might be trained. There are many other training methods for physical AI systems, such as supervised and unsupervised learning, imitation learning and learning from demonstration (LfD).


Challenges in training physical AI

Training physical AI works differently from training nonphysical autonomous systems for a few reasons.

  • Data is expensive
  • Physics is hard
  • Time is of the essence
  • Real stakes

Data is expensive

While traditional AI models are trained on static datasets of text, images and audio, physical AI usually requires data from robots interacting with real environments. In traditional machine learning, training data can be cheaply scraped, copied and reused. Not so with physical AI. One typically can’t just “download a dataset.”

Data collection takes time. Every data point requires a robot to move its body, manipulate objects or simply observe its environment in continuous time. And in the real world, machines break down: a blown gasket can stall the gathering of good training data.

Physics is hard

Physical AI must contend with physics. Gravity, friction, temperature, torque, balance, timing, momentum, wear, noise, lag—the real world is infinitely complex, which is why models that look great in simulated environments often fail when tested in the field.

To grapple with the uncertainties and complexities of physics, training might incorporate physics-informed models or hybrid systems in which simpler control algorithms ensure stability and learning models are limited to handling perception and decision-making.
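
The sketch below illustrates that hybrid pattern in a toy setting: a classical proportional-derivative (PD) controller keeps the low-level motion stable, while a learned model (represented by a placeholder function) only proposes high-level targets. All names, gains and dynamics are illustrative assumptions.

```python
# Hybrid control sketch: a classical PD controller handles stability,
# while a (placeholder) learned model handles perception and decision-making.

def learned_policy(observation):
    """Placeholder for a trained perception/decision model."""
    return {"target_position": 1.2}  # e.g., "move toward the detected bottle at 1.2 m"

def pd_controller(target, position, velocity, kp=4.0, kd=0.8):
    """Classical proportional-derivative control law."""
    error = target - position
    return kp * error - kd * velocity

# Toy unit-mass dynamics integrated at 20 Hz.
position, velocity, dt = 0.0, 0.0, 0.05
for _ in range(400):
    target = learned_policy({"position": position})["target_position"]
    force = pd_controller(target, position, velocity)
    velocity += force * dt
    position += velocity * dt

print(f"Final position: {position:.2f} m")  # settles near the 1.2 m target
```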

Time is of the essence

Physical systems operate in continuous time. In many use cases, tight feedback loops with minimal latency are required between perception, decision and action. Small delays can cause failures. Speed is often just as important as accuracy, or even more so. In other AI domains, the focus is usually on producing the most accurate output; factoring in the need for speed introduces a major engineering challenge.
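
In practice, this often looks like a fixed-rate control loop with a hard per-cycle budget. The sketch below assumes a hypothetical 50 Hz controller; the sense, decide and act functions are placeholders, not a specific framework's API.

```python
import time

# A fixed-rate control loop with a latency budget. The 50 Hz rate and the
# sense/decide/act placeholders are illustrative assumptions.
CONTROL_PERIOD_S = 0.02  # 50 Hz: perception, decision and action must fit in 20 ms

def sense():
    return {}          # placeholder: read cameras and other sensors

def decide(observation):
    return None        # placeholder: run the perception/decision model

def act(action):
    pass               # placeholder: send commands to the actuators

for _ in range(250):   # run for roughly 5 seconds
    start = time.monotonic()
    act(decide(sense()))
    elapsed = time.monotonic() - start
    if elapsed > CONTROL_PERIOD_S:
        print(f"Deadline missed by {elapsed - CONTROL_PERIOD_S:.4f} s")
    else:
        time.sleep(CONTROL_PERIOD_S - elapsed)  # hold the loop at a steady rate
```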

Real stakes

In most AI training environments, errors are harmless and easily discarded. But stakes are high in the real world. If an LLM makes a wrong prediction in a digital environment, a human can choose to act on it or not. In contrast, if a self-driving car incorrectly predicts the speed of the car in front of it, the results can be catastrophic. Training often involves constraints and gradual increases in autonomy, sometimes requiring human oversight and other forms of monitoring.

The role of synthetic data

To address these drawbacks, researchers rely heavily on simulated environments and synthetic data generated by robots, often virtual ones, interacting with virtual environments.

The use of world foundation models (WFMs) is increasingly common in robotics. A WFM is a powerful AI system that has learned the dynamics of the physical world (geometry, motion, physics) from vast amounts of real-world data, enabling it to generate realistic, physics-aware scenarios for training physical AI.

This simulation often involves the creation of a digital twin of a system or environment, such as a factory. In this virtual space, autonomous machines perform tasks, generating synthetic data about how they performed.

Techniques like domain randomization, in which the characteristics of simulated environments are intentionally varied at random, can help produce more useful synthetic data, resulting in more robust models that can transfer their skills to messy, highly variable reality. However, an overreliance on synthetic data can lead to models that overfit to the simulation rather than the real world.
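
As a small illustration, domain randomization can also be applied to physics and sensor properties rather than scene contents. The parameter names and ranges below are illustrative assumptions, not taken from any particular simulator.

```python
# A minimal sketch of domain randomization applied to physics and sensor
# parameters. Parameter names and ranges are illustrative assumptions.
import random

def randomize_physics():
    return {
        "ground_friction": random.uniform(0.3, 1.2),    # icy to grippy surfaces
        "object_mass_scale": random.uniform(0.5, 2.0),  # lighter or heavier litter
        "sensor_noise_std": random.uniform(0.0, 0.05),  # camera/lidar noise
        "actuator_delay_s": random.uniform(0.0, 0.1),   # lag between command and motion
    }

# Apply a fresh randomization at the start of every simulated episode so the
# learned policy cannot overfit to one specific version of "reality".
for episode in range(3):
    params = randomize_physics()
    print(f"Episode {episode}: {params}")
```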

Author:

Cole Stryker

Staff Editor, AI Models

IBM Think

Footnotes:
  1. Jensen Huang, January 2026 podcast interview (video), No Priors: AI, Machine Learning, Tech, & Startups, YouTube.com, January 8, 2026.
  2. NVIDIA Releases New Physical AI Models as Global Partners Unveil Next-Generation Robots, NVIDIA Newsroom, Nvidia.com, January 5, 2026.