IBM Cognos Analytics

Driving towards predicting the US Open champion with IBM Cognos Analytics



Predicting the US Open champion is a challenge, much like actually winning it. With its grueling course setup and traditionally punishing rough, the US Open is quite possibly golf’s toughest major. Just ask Bobby Jones, who claimed, “Nobody ever wins an Open. Everybody else just loses it.” Or Jack Nicklaus, who famously said, “You can’t win the Open on Thursday and Friday, but you can lose it.” And both of these guys share the record for most US Open victories!

The fun began early this morning at Erin Hills in southeast Wisconsin. Although the 117th U.S. Open officially started at 6:45 AM, the chatter around the golf course began days before. (See Kevin Na’s video describing the impossibility of playing out of the thick fescue rough.) Like Tiger’s 2008 US Open victory on a torn ACL and a double stress fracture in his left leg, this tournament will certainly provide drama for golf fans around the world.

If I learned anything from trying to correctly predict the 2017 Masters winner, it’s that the task is nearly impossible. So many measurable and unmeasurable variables come into play. How many people out there can honestly say they expected Sergio Garcia to take home the Green Jacket in his 74th major start, having never won one before?

However, after reviewing my methods upon conclusion of the Masters, I learned some things. I think I’ve got a better handle on what’s needed for predicting the US Open champion. So, here goes.

Data visualization for 5 statistical categories

Once again, I turned to the PGA Tour website to collect player statistics dating back to the 1980s. This time around, I compiled 2017 data on just the top 45 players in the Official World Golf Ranking. Only four times since 1986 has a golfer ranked outside the top 45 won the US Open. First, I reviewed the statistics I used in my Masters prediction. Then I read about the challenges Erin Hills poses to the field. As a result, I chose five statistical categories for my analysis. Digesting and gaining insights from the data in raw form is difficult for me, so I uploaded the spreadsheet to IBM’s business intelligence platform, Cognos Analytics. With this data visualization tool, I was immediately able to design charts, graphs and tables that made the data much easier to comprehend.

1. Top 45 players, world rank and home country

I began with an initial overview dashboard (shown below) that includes a couple key metrics for the Top 45 players, as well as a grid showing their world rank and home country.

[Image: initial overview dashboard for the top 45 players]

2. Average driving distance

Going deeper into the analysis, I focused on how each player stacks up in average driving distance.

[Image: average driving distance chart]

This statistical category played a role in the 2017 Masters outcome, as 80% of the top 15 finishers averaged longer drives throughout the long weekend than the rest of the field! Many expert analysts are predicting the average driving distance to play an even larger role at Erin Hills due to the official course length of 7,741 yards, the longest in U.S. Open history. This course certainly sets up well for the PGA Tour’s biggest hitters.
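For readers who want to check a claim like the 80% figure against their own spreadsheet, here is a minimal Python sketch. It assumes the comparison is "each top finisher versus the average of everyone else," which is my reading of the sentence above and not necessarily how the original number was computed. The function name, player names and yardages are placeholders, not real tournament data.

```python
from statistics import mean

def share_above_rest(values_by_player, top_finishers):
    """Fraction of top finishers whose value beats the mean of all other players."""
    top = set(top_finishers)
    rest = [v for name, v in values_by_player.items() if name not in top]
    if not top or not rest:
        raise ValueError("need at least one top finisher and one other player")
    rest_mean = mean(rest)
    hits = sum(1 for name in top_finishers if values_by_player[name] > rest_mean)
    return hits / len(top_finishers)

# Placeholder yardages for illustration only (not real player statistics).
driving_distance = {
    "Player A": 305.0, "Player B": 298.0, "Player C": 289.0,
    "Player D": 292.0, "Player E": 287.0, "Player F": 291.0,
}
top_finishers = ["Player A", "Player B", "Player C"]

share = share_above_rest(driving_distance, top_finishers)
print(f"{share:.1%} of top finishers out-drove the rest of the field")
```

The same function works for any "higher is better" category, such as driving accuracy percentage.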

3. Driving accuracy

The next statistical category I included as part of predicting the US Open champion was driving accuracy. Interestingly, driving accuracy was not strongly associated with a top-15 finish at the Masters (only 33.3% of those finishers averaged a higher accuracy percentage than the rest of the field). But many believe it will matter at the U.S. Open because of the thick fescue rough mentioned earlier. Although the fairways are fairly wide, making them easier to hit, players who do miss might find themselves in severe trouble. The chart below shows how the top 45 rank in driving accuracy percentage in descending order.

[Image: driving accuracy percentage chart]

4. Strokes Gained: Tee to Green

Another statistical category that has been growing in popularity with golf analysts is strokes gained: tee to green (SG TTG). This statistic attempts to compare each individual player’s performance to the rest of the field more accurately. SG TTG covers all the strokes a player takes that are not on the putting surface. I think it’s a very important statistic for predicting the US Open champion. While the PGA Tour also tracks strokes gained: putting, I chose to ignore it because the greens at Erin Hills are easier than those at Augusta National. Experts have described the greens at Erin Hills as Augusta-like, but not as tough: the hybrid bentgrass putting surfaces are smoother and larger, and do not have as many significant contours. The smooth area chart below shows how each player’s average SG TTG compares.
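To make the statistic concrete, here is a small Python sketch of the arithmetic as I understand the PGA Tour's published definitions: strokes gained total for a round is the field's average score minus the player's score, and tee to green is what remains after strokes gained: putting is subtracted. The function names and the numbers are illustrative assumptions, not real player data.

```python
def strokes_gained_total(player_score, field_average_score):
    """Strokes gained versus the field for one round (positive is better)."""
    return field_average_score - player_score

def strokes_gained_tee_to_green(sg_total, sg_putting):
    """Portion of strokes gained that did not come from putting."""
    return sg_total - sg_putting

# Illustrative round: the player shoots 69 when the field averages 71.5.
sg_total = strokes_gained_total(player_score=69, field_average_score=71.5)
sg_ttg = strokes_gained_tee_to_green(sg_total, sg_putting=0.75)
print(sg_total, sg_ttg)
```

In this made-up round, most of the player's advantage came from shots played before reaching the green, which is the kind of profile this category rewards.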

[Image: strokes gained: tee to green area chart]

5. Par-5 scoring average

Although the U.S. Open is typically played as a par 70, with the occasional par 71 at Pebble Beach, this year the scorecard will show the more traditional par 72 (the first time at a U.S. Open since 1992). Erin Hills will feature four incredibly important par-5s. So, naturally, the last statistical category I included in my analysis was par-5 scoring average for each golfer, as these holes routinely give competitors the opportunity to climb the leaderboard if played well. This visualization shows how each golfer has taken advantage of par-5s thus far in 2017 tournaments. Keep in mind that the lower the diamond is on the graph, the better the golfer has performed.

[Image: par-5 scoring average chart]

Making my U.S. Open champion pick

The data visualization capabilities in Cognos Analytics allowed me to see how the top 45 ranked players stack up in some of golf’s most important statistical categories. As mentioned in my previous Masters article, many of the players’ numbers are extremely close, but slight differences always set the champion apart from the rest of the field.

Just like at the Masters, much of the media attention will focus on some of the more popular players in the world, like Dustin Johnson, Rory McIlroy and Jordan Spieth. But these stars could struggle with that fescue rough if their driving accuracy percentages stay where they are. There are, however, two players who consistently show up near the top of all of the charts above: Rickie Fowler and Sergio Garcia!

Here’s how these two golfers compare to the rest of the top 45:

[Image: Rickie Fowler and Sergio Garcia compared with the rest of the top 45]
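The idea of "consistently showing up near the top of every chart" can also be approximated outside of a dashboard. The Python sketch below ranks each player within each category and averages those ranks. It is only one possible way to combine the categories and is not how the charts above were produced. The column names, player names and values are hypothetical placeholders, and ties are broken arbitrarily by sort order.

```python
# Direction of each category: True means a higher value is better.
HIGHER_IS_BETTER = {
    "driving_distance": True,
    "driving_accuracy_pct": True,
    "sg_tee_to_green": True,
    "par5_scoring_avg": False,  # lower scoring average is better
}

def average_rank(stats):
    """Map each player to their mean rank (1 = best) across all categories."""
    players = list(stats)
    totals = {p: 0 for p in players}
    for category, higher_better in HIGHER_IS_BETTER.items():
        ordered = sorted(players, key=lambda p: stats[p][category],
                         reverse=higher_better)
        for position, player in enumerate(ordered, start=1):
            totals[player] += position
    return {p: totals[p] / len(HIGHER_IS_BETTER) for p in players}

# Placeholder values for illustration only (not real player statistics).
stats = {
    "Player A": {"driving_distance": 305.0, "driving_accuracy_pct": 58.0,
                 "sg_tee_to_green": 1.2, "par5_scoring_avg": 4.55},
    "Player B": {"driving_distance": 298.0, "driving_accuracy_pct": 64.0,
                 "sg_tee_to_green": 1.5, "par5_scoring_avg": 4.50},
    "Player C": {"driving_distance": 289.0, "driving_accuracy_pct": 61.0,
                 "sg_tee_to_green": 0.4, "par5_scoring_avg": 4.70},
}

ranks = average_rank(stats)
best_first = sorted(ranks, key=ranks.get)
print(best_first)
```

An unweighted average treats every category as equally important. Weighting driving distance or accuracy more heavily for a course like Erin Hills would be a reasonable variation.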

And the winner of the 117th U.S. Open at Erin Hills is…Rickie Fowler! This year I’m sticking with my Masters pick, who had a chance to win late in that tournament. I’m hoping the 28-year-old takes home his first major title.

My sleeper picks this year (I’m making two) are Jason Dufner and Jordan Niebrugge (who?). First, the more realistic sleeper pick: Dufner. He has finished in the top 10 in the last three U.S. Opens and is coming off a Memorial Tournament win in early June. While he isn’t the longest hitter on tour, Dufner is accurate and a great iron player! As for Jordan Niebrugge, the ultimate sleeper pick…he finished in the top 10 at the 2015 British Open as an amateur, he is from the state of Wisconsin and he knows Erin Hills very well. If any young player is to come out of nowhere and shake up the leaderboard, my pick is Jordan Niebrugge.

Final remarks about predicting the US Open champion

Erin Hills will provide numerous tests for the field, even with the wider fairways and the fescue rough that the USGA cut down on Tuesday. If you, too, are going through the process of predicting the US Open champion, keep in mind a few factors that may be different this year. While they aren’t calling it a links-style course, Erin Hills does have similar features, including no trees, brutal fescue, windy fairways and difficult bunkers. This could give European players a slight advantage (maybe I should have chosen Sergio instead of Rickie). Adam Scott might be a player to keep your eye on, too. He finished in the top 10 in four of the last five British Opens (links style).

Creativity will certainly come into play this year with the number of blind shots on the golf course. USGA Executive Mike Davis confirmed that “14 of 18 holes have a blindness element.” This will lead to the competitors taking chances often. Is there a more creative golfer than someone like Bubba Watson?

The bunkers at Erin Hills are sure to cause issues throughout the long weekend. The course has 138 bunkers in total, with hardly any flat bottoms and extremely tough lies, so take a look at some of the better bunker players. They may have a leg up on the competition. Matt Kuchar, Justin Rose and Jason Day are all great bunker players who could steal a stroke or two out of the sand.

Conclusion

Predicting the US Open champion is just as difficult as predicting the Masters champion! Like Augusta National, Erin Hills will provide grueling, yet very different, elements for the field. In many respects, it’s more closely related to the British Open venues.

While it’s difficult to accurately pick a winner, Cognos Analytics has empowered me to quickly build visualizations based on data. The result is a more educated prediction. So, although it’s certainly anyone’s tournament, it appears that, when it comes to predicting the US Open champion, Rickie Fowler and Sergio Garcia are winners!

If you have any questions about Cognos Analytics or are interested in seeing a demonstration, reach out via email at adam.joyce@ibm.com!
