April 4, 2023 By IBM Data and AI Team

For more than two decades, IBM has helped the Augusta National Golf Club gather, analyze and use an ocean of data to deliver engaging insights and content to the Masters® app. In addition to traditional scores and stats, state-of-the-art cameras, microphones and lasers capture video, sound, and the precise x, y and z coordinates of more than 20,000 shots over the four days of the tournament. On top of the real-time data, there’s a trove of historical data to consider: player performance coming into the tournament and in previous Masters tournaments, and player statistics such as greens in regulation and putting efficiency.

This data is the raw material of the Masters digital experience. It’s routed to multiple clouds, integrated and fed into AI models to build a wide variety of features in the Masters app. Fans can create custom video feeds, get automated video highlights, see projected scores and build out their Masters Fantasy roster with the help of AI-generated player insights.

“The Masters is a uniquely compelling sporting event, steeped in tradition and respect for the beauty of the game,” says Noah Syken, Vice President of Sports and Entertainment Partnerships at IBM. “Working with the digital teams, we’re able to deliver data-driven insights to patrons through an app that maintains the simplicity and design aesthetic they’ve come to expect from every experience at the tournament.”

Training AI on the language of golf

This year, thanks to AI-driven natural language processing, some of the data can literally speak for itself. At the 2023 Masters, IBM is using generative AI technology to add audio descriptions to video of every shot. The essence of this generative AI project is reverse engineering: discovering rules and patterns within huge data sets to inform a system that can generate useful solutions on demand, such as shot-by-shot golf commentary.

For the large language model, or foundation model, the “huge data set” was C4 (the Colossal Clean Crawled Corpus), drawn from petabytes of web data and metadata collected over 12 years and used to train a model with millions of parameters. The next step was to fine-tune the model to the appropriate domain of expertise.

“We had to teach the system the language of golf,” says Stephen Hammer, Sports CTO and IBM Distinguished Engineer. “Which begins with creating a set of concepts and categories—an ontology—that draws relationships between golf rules, how it’s organized around holes, tees, par numbers and distances. Then we fill in the ontology with details from actual game play. That helps the system query and figure out the game in a way it understands, extracting information from the ontology that’s used to further train itself on how to write base sentences.” These base sentences (essentially templates that convey the concept of people playing golf) are the root of the data-to-text process.
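The ontology-to-sentence idea can be illustrated with a minimal Python sketch. It assumes a toy ontology with two concepts and a single base sentence; the class names, fields, template wording and example values are all hypothetical and are not IBM's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hole:
    """One ontology concept: a hole with its par and length (illustrative fields)."""
    number: int
    par: int
    yards: int


@dataclass(frozen=True)
class Shot:
    """A detail from game play that fills in the ontology (illustrative fields)."""
    player: str
    hole: Hole
    stroke: int
    lie: str  # where the ball came to rest, e.g. "green"


# Hypothetical base sentence: a template whose slots are filled from the ontology.
BASE_SENTENCE = "{player}'s shot number {stroke} on the par-{par} hole {number} finds the {lie}."


def render(shot: Shot) -> str:
    """Fill the base sentence with values queried from the shot and its hole."""
    return BASE_SENTENCE.format(
        player=shot.player,
        stroke=shot.stroke,
        par=shot.hole.par,
        number=shot.hole.number,
        lie=shot.lie,
    )


# Example values chosen for illustration only.
example_hole = Hole(number=12, par=3, yards=155)
example_shot = Shot(player="Player A", hole=example_hole, stroke=1, lie="green")
print(render(example_shot))
```

Keeping the template separate from the data means new sentence forms can be added without touching the ontology.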

The system was also trained to use the language of the Masters, which has its own vocabulary. For example, a sand trap is a “bunker” and the rough is the “second cut.”
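One simple way to picture this vocabulary step is a glossary applied to generated text. The sketch below is an assumption about how such a substitution could work, not a description of IBM's system; the glossary holds only the two pairs named above, and the function name is hypothetical.

```python
import re

# Hypothetical glossary: generic golf term -> preferred Masters term.
# Only the two pairs named in the article are included.
MASTERS_TERMS = {
    "sand trap": "bunker",
    "rough": "second cut",
}

# Try longer phrases first so multi-word terms are matched whole.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(term) for term in sorted(MASTERS_TERMS, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)


def to_masters_vocabulary(sentence: str) -> str:
    """Replace generic golf terms with their Masters equivalents."""
    return _PATTERN.sub(lambda match: MASTERS_TERMS[match.group(1).lower()], sentence)


print(to_masters_vocabulary("His approach lands in the sand trap."))
print(to_masters_vocabulary("The drive drifts into the rough."))
```

The word-boundary match keeps the substitution from firing inside longer words such as "roughly".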

The next step is to generate candidate sentences with syntactical variability, such as “His shot finds the pine straw” and “That drive ends up in the pine straw.” These are then weighted by the likelihood of their occurrence. “We encode the system to use the most common ways of saying things,” says Hammer. “And once the system phrases something in a certain way, it will be less likely to say it the same way again. Every single shot, no matter how similar, could produce a different sentence.”
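The weighting and the reduced chance of repetition can be sketched as weighted random sampling with a penalty applied after each use. This is an illustrative approach under stated assumptions, not the method IBM describes in detail: the class name, the numeric weights and the penalty factor are all hypothetical.

```python
import random
from typing import Dict, Optional


class PhraseSelector:
    """Pick candidate sentences by weight, down-weighting each one after use."""

    def __init__(self, weights: Dict[str, float], penalty: float = 0.25,
                 seed: Optional[int] = None) -> None:
        if not 0 < penalty < 1:
            raise ValueError("penalty must be strictly between 0 and 1")
        self._weights = dict(weights)   # copy so the caller's dict is untouched
        self._penalty = penalty         # assumed factor; smaller means less repetition
        self._rng = random.Random(seed)

    def weight(self, phrase: str) -> float:
        """Return the current weight of a phrase."""
        return self._weights[phrase]

    def pick(self) -> str:
        """Sample one phrase in proportion to its weight, then penalize it."""
        phrases = list(self._weights)
        current = [self._weights[p] for p in phrases]
        choice = self._rng.choices(phrases, weights=current, k=1)[0]
        self._weights[choice] *= self._penalty
        return choice


# The sentences come from the article; the weights are made up for illustration.
selector = PhraseSelector(
    {
        "His shot finds the pine straw.": 3.0,
        "That drive ends up in the pine straw.": 1.0,
    },
    seed=7,
)
for _ in range(3):
    print(selector.pick())
```

Because the penalty multiplies the weight rather than zeroing it, a common phrasing can still recur later, just less often.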

Because the AI system is, in effect, “watching” the action in real time, instructions were required around its verbosity, so it wouldn’t be talking nonstop. “A golf shot is made up of three items. A setup, so we create a sentence for that. Then he hits the ball, and it’s moving—we don’t create a sentence for that. And then the ball stops, and we create a sentence for that outcome,” says Hammer.
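The verbosity rule Hammer describes, speaking at setup and at the outcome but staying silent while the ball is moving, can be expressed as a simple filter over shot phases. The phase names, the function and the example sentences below are hypothetical, a sketch of the rule rather than the production logic.

```python
from enum import Enum
from typing import Iterable, List, Tuple


class Phase(Enum):
    SETUP = "setup"          # player addresses the ball
    IN_FLIGHT = "in_flight"  # ball is moving
    OUTCOME = "outcome"      # ball has stopped


# Only these phases produce a spoken sentence; the in-flight phase stays silent.
SPOKEN_PHASES = {Phase.SETUP, Phase.OUTCOME}


def narrate(events: Iterable[Tuple[Phase, str]]) -> List[str]:
    """Keep the sentences for phases that should be spoken, in order."""
    return [sentence for phase, sentence in events if phase in SPOKEN_PHASES]


# Example sentences are invented for illustration.
shot_events = [
    (Phase.SETUP, "He lines up his tee shot."),
    (Phase.IN_FLIGHT, "The ball is in the air."),
    (Phase.OUTCOME, "His shot finds the pine straw."),
]
for line in narrate(shot_events):
    print(line)
```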

The team equipped the system with semantic variability so it can describe possible situations, from whiffing a tee shot (unlikely at the Masters) to making a hole-in-one, as well as things that could happen in between.

Lastly, the speaking persona, selected in collaboration with Augusta National, had to be taught to pronounce domain-specific words and phrases. “We couldn’t have the voice saying, ‘minus one’ when it meant ‘one under,’ and we had to ensure it could pronounce players’ names correctly,” says Syken. A team listened to the sentences to flag things the system was getting wrong, then entered phonetic spellings into the dictionary and retrained the system.
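A small sketch shows the two fixes mentioned here: phrasing a score relative to par the way a golf commentator would, and swapping in phonetic spellings before text reaches the voice. Only "one under" comes from the article; the "over" and "even par" wording, the function names and the fictional name in the dictionary are assumptions for illustration.

```python
# Spoken forms for small numbers; larger values fall back to digits.
NUMBER_WORDS = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five"}

# Hypothetical pronunciation dictionary: written form -> phonetic spelling.
# The name below is fictional and stands in for a real player's name.
PHONETIC_OVERRIDES = {
    "Examplesson": "eg-ZAM-pul-sun",
}


def spoken_score(strokes_to_par: int) -> str:
    """Turn a score relative to par into golf phrasing, e.g. -1 -> 'one under'."""
    if strokes_to_par == 0:
        return "even par"  # assumed wording for a score of zero
    magnitude = abs(strokes_to_par)
    words = NUMBER_WORDS.get(magnitude, str(magnitude))
    return f"{words} under" if strokes_to_par < 0 else f"{words} over"


def apply_phonetics(text: str) -> str:
    """Replace each listed word with its phonetic spelling."""
    for written, phonetic in PHONETIC_OVERRIDES.items():
        text = text.replace(written, phonetic)
    return text


print(apply_phonetics(f"Examplesson moves to {spoken_score(-1)}."))
```

Flagged mispronunciations would be handled by adding entries to the dictionary rather than by changing the sentence generator.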

“The Masters is a great example of how powerful AI and machine learning models can transform data into world-class digital experiences,” says Syken.

