Scaling Wimbledon’s video production of highlight reels through AI technology

Continuing the innovation that takes place around its major sporting events, IBM Research and IBM iX are teaming up to provide “Cognitive Highlights” to The Championships, Wimbledon, the oldest tennis tournament in the world. The aim is to demonstrate how AI technology can scale and accelerate the video production process for any media, sports or entertainment company.

IBM Research and IBM iX first explored this project in April, creating the first-ever multi-modal system for analyzing golf video for the 2017 Masters Golf Tournament. The proof-of-concept brought together computer vision and other leading AI technologies to listen, watch and learn from a live video feed of the golf tournament, and to automatically identify and curate the most exciting moments and shots into segments that could be used in online highlight packages.

The solution for Wimbledon will go beyond selecting and curating individual segments for a video editor to choose from. It will automatically create a one- to two-minute highlights package of each match, available shortly after the match ends, for the Wimbledon editorial team to use across the Wimbledon Digital Platforms.

Moreover, instead of a four-day tournament at The Masters, IBM will contend with a 13-day championship at The All England Lawn Tennis & Croquet Club that starts next Monday July 3rd. Video from the matches can quickly add up to hundreds of hours of footage to sift through. This sheer volume of data requires a mix of cognitive technologies and advanced engineering techniques that can integrate multiple data points, audio and visual components to search, discover and extract key scenes or moments.

A key advantage for Wimbledon is that the IBM system is scalable, enabling the tournament content team to automatically capture highlight packages for matches played outside of the most popular courts, which traditionally may not have had moments curated. Now, with the help of Cognitive Highlights, producers have a tool that can quickly deliver highlight suggestions from six courts, expanding the number of matches that can be turned into timely highlight videos for fans to watch and share.

A fusion of metadata and multimedia

Flowchart showing how a mix of data and cognitive technologies allow IBM and Wimbledon to auto-curate the production of highlight videos for The Championships, Wimbledon 2017

For this year’s Championships, the production of highlight reels will rely on a number of steps and technologies. Video of the matches will be collected by IBM soon after their completion. Initial highlight candidates are identified using information from the on-court statistician and other sensors that provide data on, for example, the speed of the ball, the number of aces and saved break points. The system then continues to capture segments of a match that could be predictors of an exciting moment, using a combination of audio and video AI tools that analyze crowd cheering and recognize player actions (that is, visuals of player behavior), together with scoring data. Based on these different modalities, the video segments are rank-ordered and selected to produce the final highlight video for each match.

The combination of this data and these modalities helps the system get the full picture of a match’s most exciting moments, and demonstrates the value of audio and video techniques in helping rank or discover moments that might ordinarily be passed over using pure metadata analysis. For example, a match may have a number of break points worthy of being considered a highlight, so in order to select the best ones, the system will choose the segments with the highest excitement score measured from the audio and video of the moment. Having more eyes and ears on a match, able to distinguish and rank all of its exciting moments, allows editorial teams to go straight to the ‘winning’ moments that fans can re-watch after the match.
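The rank-and-select step described above can be illustrated with a minimal Python sketch. It is not IBM's implementation: the `Segment` fields, the equal audio/video weights, the greedy selection and the 120-second budget are all assumptions made for illustration, loosely based on the "one to two minute" package length mentioned in this post.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A hypothetical highlight candidate flagged from match metadata."""
    label: str           # e.g. "break point", taken from scoring data
    start_s: float       # start time in seconds within the match video
    end_s: float         # end time in seconds
    crowd_score: float   # assumed audio score for crowd cheering, in [0, 1]
    action_score: float  # assumed video score for player reaction, in [0, 1]

    @property
    def duration(self) -> float:
        return self.end_s - self.start_s


def excitement(seg: Segment, w_audio: float = 0.5, w_video: float = 0.5) -> float:
    """Fuse the two modality scores; the equal weights are an assumption."""
    return w_audio * seg.crowd_score + w_video * seg.action_score


def build_highlights(candidates: list[Segment], max_total_s: float = 120.0) -> list[Segment]:
    """Greedily keep the most exciting segments that fit the time budget,
    then return them in chronological order for the final reel."""
    chosen: list[Segment] = []
    total = 0.0
    for seg in sorted(candidates, key=excitement, reverse=True):
        if total + seg.duration <= max_total_s:
            chosen.append(seg)
            total += seg.duration
    return sorted(chosen, key=lambda s: s.start_s)


# Made-up candidates: two break points, an ace and the match point.
candidates = [
    Segment("break point", 600, 650, crowd_score=0.9, action_score=0.8),
    Segment("break point", 1500, 1560, crowd_score=0.4, action_score=0.3),
    Segment("ace", 2400, 2420, crowd_score=0.7, action_score=0.9),
    Segment("match point", 5400, 5450, crowd_score=0.95, action_score=0.95),
]
reel = build_highlights(candidates)
for seg in reel:
    print(f"{seg.label}: {seg.start_s:.0f}s to {seg.end_s:.0f}s")
```

In this made-up example both break points qualify as candidates on metadata alone, but only the one with the louder crowd and stronger player reaction makes the reel, which mirrors the break-point case described above.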

Once the highlights for the match are determined, the system uses metadata from the match to generate the graphics, facilitating the production of the one- to two-minute highlight packages that fans can watch on any of Wimbledon’s digital channels.

The IBM Research team trained the system to recognize crowd cheering and players’ reactions using videos of matches from previous tournaments, with associated contextual metadata from the on-court statistician used to filter out specific content. The technology relies on state-of-the-art deep learning models that, combined with active learning techniques, provide effective methods for learning new classifiers from a few manually annotated training examples.
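The post does not say which active learning strategy was used. As one common possibility, the sketch below shows uncertainty sampling: the current classifier scores unlabeled clips, and the clips it is least sure about are sent to a human annotator first. The function name, the clip identifiers and the probabilities are hypothetical.

```python
def select_for_annotation(probabilities: dict[str, float], budget: int) -> list[str]:
    """Return the `budget` clip ids whose predicted probability of
    'crowd cheering' is closest to 0.5, i.e. the most uncertain ones."""
    by_uncertainty = sorted(probabilities, key=lambda clip: abs(probabilities[clip] - 0.5))
    return by_uncertainty[:budget]


# Hypothetical classifier outputs for four unlabeled clips.
predicted = {"clip_a": 0.97, "clip_b": 0.52, "clip_c": 0.08, "clip_d": 0.41}
to_label = select_for_annotation(predicted, budget=2)
print(to_label)  # ['clip_b', 'clip_d']
```

Labeling the uncertain clips and retraining lets each manual annotation carry more information than a randomly chosen one, which is how a classifier can be learned from only a few annotated examples.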

The value of a technology solution such as Cognitive Highlights is that its utility can go beyond sports. Media and entertainment companies have amassed archives holding hundreds of thousands of hours of program material and other footage that is not easily searched, and this kind of solution could simplify their production process. We also believe this technology could be extended to provide summarization tools for consumer videos captured by mobile phones or wearable cameras, for example.

Starting Monday July 3rd you can view highlight videos produced by our technology on, and let us know what you think!
