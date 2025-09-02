Is that tennis announcer narrating a thrilling rally a human or a bot? Soon, thanks to advances that combine computer vision and text-to-speech language models in new ways, it may be difficult for fans to tell the difference. But that could be a good thing: for events like the US Open, which this year hosts more than 300 matches over 15 days, it is logistically tricky and prohibitively expensive to staff human announcers to cover all the action on the courts.

IBM researchers from the MIT-IBM Watson AI Lab are combining AI models in this way to adjust speech elements like intonation and volume in AI-generated sports commentary so that it sounds more lifelike and engaging. For example, the models can detect when fans and players get particularly excited after a big point, and the AI voice can grow more animated in response, instead of delivering all commentary with the same robotic level of enthusiasm.

“The idea of AI-generated commentary is not to replace humans,” said Rogerio Feris, a Principal Scientist and Senior Manager of the MIT-IBM Watson AI Lab. “It’s to augment humans and provide more coverage for courts that currently lack commentary.”