Skip to main content

Google Code baseball hacks: Display batting stats in a Google Gadget

Create a baseball hack with Google Gadgets, the Google Spreadsheet API, and the Google Chart API

Paul D. Reiners, Software Developer, IBM
Paul Reiners
Paul Reiners was an Oakland Athletics fan back in the days of Catfish Hunter, Rollie Fingers, and Vida Blue. Paul is the developer of several open source programs, including Automatous Monk, Twisted Life, and Leipzig. He received his M.S. in Applied Mathematics (Theory of Computation) at the University of Illinois at Urbana-Champaign. He lives in Minnesota and, in his spare time, practices the electric bass and competes at TopCoder.
John S. Mysak, Software Developer, IBM
John Mysak
John is a former catcher and has been using Google as his primary search engine for as long as he can remember. John is the developer of PowerWorship and the owner of the world's smartest beagle.
Paul DiCarlo, Software Developer, IBM
Paul DiCarlo
Paul DiCarlo liked Catfish Hunter much better when he pitched for the New York Yankees. He lives in Minnesota where he is desperately trying to learn the ukulele.

Summary:  This article demonstrates how to use several Google Code APIs using a baseball hack as an example. We will create a Google Gadget that displays Major League Baseball batting statistics. You will learn about Google Gadgets, the Google Spreadsheet API, and the Google Chart API. After reading this article, you'll have a good idea of the sorts of applications you can build using these APIs, know enough to get started writing your own applications, and know where to get more detailed information. This is the first article in the Google Code baseball hacks series.

View more content in this series

Date:  12 Aug 2008
Level:  Intermediate PDF:  A4 and Letter (126KB | 14 pages)Get Adobe® Reader®
Activity:  3223 views

Introduction

In this article, we illustrate how to use three of the Google Code APIs by coding an interesting baseball hack. The format of this article and the application domain of our code are inspired by Joseph Adler's excellent book, Baseball Hacks (see Resources). We use three Google Code APIs: the Google Chart API, the Google Spreadsheets Data API, and the Google Gadgets API. We won't cover the APIs comprehensively. We hope to show enough to pique your interest. You can consult the Google documentation and tutorials for more details. Our goal is to build a Google Gadget that displays up-to-date batting statistics of your favorite Major League Baseball team. We do this in three steps.

First, we download current baseball statistics from a baseball Web site and store them in comma-delimited text files. Second, we upload these stats to a Google Spreadsheet using the Google Spreadsheets Data API. Third, we write a Google Gadget that reads the stats from the spreadsheet and displays them as a bar chart using the Google Chart API.


Scraping statistics from the Web

The first step is to download the stats. Our code for this was originally based on "Hack #25: Spider Baseball Sites for Data" in Baseball Hacks. We won't go into any details because it's explained well in the book. (Our code is in the get_all_teams_hitting.pl Perl program and GetStats.pm Perl module included in the sample code that you can download for this article.) A few points are worth mentioning, though.

First, while there are plenty of RSS and Atom feeds for the results of games (box scores, that is), there do not seem to be any RSS or Atom feeds for up-to-date player stats. If there were, obtaining the stats from such Web feeds would be the way to go. So, instead, we were required to scrape HTML pages for our data. Having to do this is never pretty and is fragile. If the format of the HTML is suddenly changed, your code could easily be broken. The second point is that you shouldn't scrape data from a site without first checking whether it's okay with the site owner. Check out the "Terms of Use" of the site. Third, if it is okay to scrape data from the site, don't be greedy. Put a pause of a second or two between successive page hits. Not only is this the polite thing to do, but, if you don't, you may find yourself locked out of the site. (This happened to us when testing some code that didn't have pauses in it.)


Programmatically uploading statistics to a Google Spreadsheet

After we've scraped HTML pages, downloaded the stats, and stored them in comma-delimited text files, one for each team, we want to store them in a Google Spreadsheet that will be accessed later by our Google Gadget. The Google Spreadsheets Data API lets you add to, change, and delete the contents of a Google Spreadsheet either at the row level or the cell level. The Google Spreadsheet Data API is available for Java™, .NET, PHP, and Python. We used Python because we happen to like Python.

In Listing 1, our Python code loops over the set of comma-delimited text files and then calls the code that actually uploads the stats to the spreadsheet. (We also use another data file, MLBTeams.csv, that contains a list of the teams.)


Listing 1. Partial listing of upload_all_teams_hitting.py
				
mlb_teams = open('../../data/MLBTeams.csv')
try:
      lines = mlb_teams.readlines( );
finally:
      mlb_teams.close( )             
  
# Skip header.
for line in lines[1:]:
    line = line.rstrip();
    if line != '':
        parts = line.split(",")  
        curr_wksht_id = parts[1].rstrip()
        teamAbbrev = parts[0].rstrip()
        print "Updating: " + teamAbbrev
        infile = "../../tempData/" + teamAbbrev + "Hitting.txt"
                
        uploader = Uploader(user, pw, infile)
        if curr_key != '':
            uploader.set_spreadsheet(curr_key)
        if curr_wksht_id != '':
            uploader.set_worksheet(curr_wksht_id)
        uploader.prompt_delete_old_rows_and_add_new_rows()

You can programmatically interact with Google Spreadsheets in one of two ways: a row-based feed or a cell-based feed. In this application, since we're usually just deleting the rows of old data and creating rows with the new data, we mainly use a row-based feed.

Here are the gdata.spreadsheet methods and fields we use:

  • SpreadsheetsListFeed: Lets you work with spreadsheets at the row level
  • SpreadsheetsList: A single row in a worksheet
  • SpreadsheetsCellsFeed: Lets you work with spreadsheets at the cell level
    • row_count.text: For getting the number of rows in a worksheet
  • service.SpreadsheetsService: The client API object for accessing the spreadsheet
    • email: A Gmail e-mail address
    • password: Gmail password
    • source: The name of the application accessing the API
    • ProgrammaticLogin(): Logs you into your Google account
    • GetCellsFeed(key, wksht_id): Gives you a cell feed (SpreadsheetsCellsFeed) of a spreadsheet. You must specify the ID of the spreadsheet and the ID of the worksheet within the spreadsheet.
    • GetListFeed(key, wksht_id): Gives you a list feed (SpreadsheetsListFeed) of a spreadsheet. You must specify the ID of the spreadsheet and the ID of the worksheet within the spreadsheet.
    • InsertRow(row_data, key, wksht_id): Adds a new row to the spreadsheet
    • DeleteRow(entry): Deletes a SpreadsheetsList (which is a row)

Listing 2 contains the code that uploads our batting stats to a Google Spreadsheet. First, we delete all the rows in the current worksheet (in _delete_and_add_rows), and then we add new rows with data from a .csv file (in _fill_in_rows).


Listing 2. Partial listing of upload_stats.py
				
import gdata.spreadsheet.service

class Uploader(CRUD):
    """
    Deletes old contents of a Google Worksheet and adds new contents from a .csv
    file.
    """

    def __init__(self, email, password, infile):
        "Initialize attributes of an Uploader."
        self.gd_client = gdata.spreadsheet.service.SpreadsheetsService()
        self.gd_client.email = email
        self.gd_client.password = password
        self.gd_client.source = 'Upload Batting Stats'
        self.gd_client.ProgrammaticLogin()
        self.curr_key = ''
        self.curr_wksht_id = ''
        self.list_feed = None
        self.infile = infile
    
    def prompt_delete_old_rows_and_add_new_rows(self):
        """
        Delete old rows and add new ones, but prompt user for spreadsheet and 
        worksheet IDs if they're not already set.
        """
        if not self.curr_key:
            self._PromptForSpreadsheet()
        if not self.curr_wksht_id:
            self._PromptForWorksheet()

        self._delete_and_add_rows()

    def _delete_and_add_rows(self):
        """
        First, delete all the rows (except the header row) in the spreadsheet 
        and then call method to add rows.
        """
        cells_feed = self.gd_client.GetCellsFeed(self.curr_key, self.curr_wksht_id)
        row_count = int(cells_feed.row_count.text)
        while (row_count > 1):
            self._list_delete_action(0)
            cells_feed = self.gd_client.GetCellsFeed(self.curr_key, self.curr_wksht_id)
            row_count = int(cells_feed.row_count.text)
      
        self._fill_in_rows()
  
    def _list_delete_action(self, index):
        "Delete the row with the given index."
        self.list_feed = self.gd_client.GetListFeed(self.curr_key, self.curr_wksht_id)
        self.gd_client.DeleteRow(self.list_feed.entry[index])
        print 'Deleted!'
  
    def _fill_in_rows(self):
        "Create new rows with the new data."
        infile = open(self.infile)
        header_line = infile.readline().strip()
        headers = header_line.split(',')
        col_count = len(headers)
        while infile:
            line = infile.readline().strip()
            if len(line) > 0:
                row_data = {}
                parts = line.split(',')
                for col in range(col_count):
                    row_data[headers[col].lower()] = parts[col].strip()
                try:
                    self._list_insert_action(row_data)
                except (gdata.service.RequestError, SyntaxError):
                    print 'Error inserting row:', row_data
                    print 'Will skip that row.'
            else:
                break;
    
    def _list_insert_action(self, row_data):
        "Insert a new row with data."
        entry = self.gd_client.InsertRow(row_data,
            self.curr_key, self.curr_wksht_id)
        if isinstance(entry, gdata.spreadsheet.SpreadsheetsList):
            print 'Inserted!'

Next, we will display this data in a Google bar chart.


Creating a Google bar chart with the Google Chart API

The Google Chart API is easy to use and is accessed via a particular google.com URL. You specify the required arguments and the Google Chart API returns a custom chart.

Let's look at an example. Figure 1 shows what we want our batting stats chart to look like. The chart shows batting average (AVG), on-base percentage (OBP), and slugging percentage (SLG).


Figure 1. Example Google bar chart
Example Google bar chart

Take a quick look at our example chart-generating URL in Listing 3, and then we'll break it down and explain its component parts.


Listing 3. Example URL for Google Chart API
                
var chartURL = "http://chart.apis.google.com/chart?cht=bvg&" +
    "chd=t:0.284,0.299,0.271|0.328,0.37,0.298|0.355,0.494,0.4&chs=298x180&" +
    "chco=c6d9fd,4d89f9,8a31fb&chdl=AVG|OBP|SLG&chds=0,0.700&" + 
    "chtt=Minnesota%20Twins+Batting&chxt=y,x&" +
    "chxl=0:|0.000|0.100|0.200|0.300|0.400|0.500|0.600|0.700|" + 
    "1:|D+Young|J+Morneau|C+Gomez&chbh=15"

Here are what the different arguments specify:

  • cht: The chart type. bvg means we want a bar chart that is vertically-oriented and the bars are grouped (as opposed to stacked).
  • chd: The chart data
  • chs: The chart size (in pixels)
  • chco: The colors of the bars
  • chdl: The chart data labels
  • chds: The chart scaling parameter
  • chtt: The chart title
  • chxt: The orientation of multiple axes
  • chxl: The chart labels
  • chbh: The height of the bars (Bars are, by default, horizontal, but ours are vertical. So, in our case, this is actually the width of the bars.)


Creating a Google Gadget

Finally, we create our Google Gadget. Google Gadgets are widgets that you can place within Web pages, such as your iGoogle page. They're made up of a combination of XML, HTML, and JavaScript.

Our Google Gadget contains our chart and is shown in Figure 2.


Figure 2. Batting stats Google Gadget
Batting Stats Google Gadget

As you can see, Google Gadgets allow you to prompt for user preferences. In ours, we let the users select the team and how many players they want to display.


Figure 3. Gadget with preferences shown
Gadget with preferences shown

Listing 4 shows the code for our gadget. The code basically does three things:

  1. Gets the batting stats from the Google Spreadsheet
  2. Constructs the URL for the Google Chart
  3. Adds the chart to the HTML in the gadget

We retrieve a JSON (JavaScript Object Notation) feed using the Spreadsheets Data API to get the data from the spreadsheet. This JSON feed is handed back to our listStats call-back function, and we parse the spreadsheet data within that function.



Listing 4. Partial listing of mlb-batting-stats.xml
                
<Module>
 <ModulePrefs 
  title="MLB Batting Stats"
  screenshot="http://mrsabermetrics.sourceforge.net/images/MLBBattingScreenShot.PNG"
  description="Displays up-to-date MLB batting stats."
  height="200"
  width="675"
  thumbnail="http://mrsabermetrics.sourceforge.net/images/Babe_Ruth.jpg"
  title_url="http://mrsabermetrics.sourceforge.net/"
  scrolling="true"
 >
 </ModulePrefs>
 <UserPref name="num_players" display_name="How many players to show?" 
           datatype="enum" default_value="9" required="true">
  <EnumValue value="2" display_value="2"/>
  <EnumValue value="3" display_value="3"/>
  <EnumValue value="4" display_value="4"/>
    .
    .
    .
  <EnumValue value="15" display_value="15"/>
 </UserPref>
 <UserPref name="team" 
     display_name="Team"
     datatype="enum"
     required="true"
 >
  <EnumValue value="ARI" display_value="Arizona Diamondbacks"/>
  <EnumValue value="ATL" display_value="Atlanta Braves"/>
  <EnumValue value="BAL" display_value="Baltimore Orioles"/>
    .
    .
    .
  <EnumValue value="MON" display_value="Washington Nationals"/>
 </UserPref>
 <Content type="html">
<![CDATA[
<div id="statsdiv"></div>
<script>
    function listStats(root) {
        var feed = root.feed;
        var entries = feed.entry || [];
        var html = [''];
        var avgs = [];
        var obps = [];
        var slgs = [];
        var names = [];
            
        // Construct the URL for the Google Chart.             
        var cht = 'bvg';            
        html.push(
            '<img src="http://chart.apis.google.com/chart?cht=' + cht + '&chd=t:');
        for (var i = 0; i < feed.entry.length; ++i) {
            var entry = feed.entry[i];
            var title = entry.title.$t;
            names.push(title);
            var content = entry.content.$t;
            var parts = content.split(',');

            var avg = extractStat(parts[0]);
            avgs.push(avg);
            var obp = extractStat(parts[1]);
            obps.push(obp);
            var slg = extractStat(parts[2]);
            slgs.push(slg);
        }
        
        var numPlayersPref = getStringPref("num_players");

        // Show highest AB players
        var numPlayers = parseInt(numPlayersPref);
        if (numPlayers > names.length) {
            numPlayers = names.length;
        }
        
        for (var i = 0; i < numPlayers; ++i) {
            html.push(avgs[i]);
            if (i < numPlayers - 1) {
                html.push(",");
            }
        }
        html.push("|");
        
        for (var i = 0; i < numPlayers; ++i) {
            html.push(obps[i]);
            if (i < numPlayers - 1) {
                html.push(",");
            }
        }
        html.push("|");
        
        for (var i = 0; i < numPlayers; ++i) {
            html.push(slgs[i]);
            if (i < numPlayers - 1) {
                html.push(",");
            }
        }
        var width = 1000;
        if (numPlayers <= 9) {
            var width = 58 * numPlayers + 124;
        }
        var chs = width + 'x180';
        var chds = '0,0.700';
        html.push(
            "&chs=" + chs + 
                "&chco=c6d9fd,4d89f9,8a31fb&chdl=AVG|OBP|SLG&chds=" + chds);

        var teamAbbrevToName = new Object();
        for (var i = 0; i < teamInfoArray.length; i += 3) {
            teamAbbrevToName[teamInfoArray[i]] = teamInfoArray[i + 2];
        }  
        var teamName = teamAbbrevToName[getStringPref("team")]; 
        var chxt = 'y,x';
        html.push('&chtt=' + teamName + '+Batting&chxt=' + chxt + 
            '&chxl=0:|0.000|0.100|0.200|0.300|0.400|0.500|0.600|0.700|1:');
        
        for (var i = 0; i < numPlayers; ++i) {
            var parts = names[i].split(' ');
            name = parts.join("+");
            html.push('|' + name)
        }
        
        html.push('&chbh=15"/>');
        
        var generatedHTML = html.join("");
    
        // Add the chart to the HTML in the gadget.         
        document.getElementById("statsdiv").innerHTML = generatedHTML;
    }
    
    function extractStat(nameAndStat) {
        var parts = nameAndStat.split(":");
        var stat = parts[1];
        var fStat = parseFloat(stat);
        
        return fStat;
    }
  
    function getStringPref(prefName) {
        var prefs = new _IG_Prefs();
        var stringPref = prefs.getString(prefName);

        return stringPref;
    }
  
    var teamInfoArray = 
        new Array('MIN', 'od6', "Minnesota Twins", 'ANA', 'od7', "Los Angeles Angels", 
                  'ARI', 'od4', "Arizona Diamondbacks", 'ATL', 'od5', "Atlanta Braves", 
                  'BAL', 'oda', "Baltimore Orioles", 'BOS', 'odb', "Boston Red Sox", 
                    .
                    .
                    .
                  'TEX', 'ocm', "Texas Rangers", 'TOR', 'ocn', "Toronto Blue Jays");  
</script>
  
<script language="javascript">
    var teamToWkshtID = new Object(  );
    for (var i = 0; i < teamInfoArray.length; i += 3) {
        teamToWkshtID[teamInfoArray[i]] = teamInfoArray[i + 1];
    }  
    var wkshtID = teamToWkshtID[getStringPref("team")];
    
    // Get the batting stats from the Google spreadsheet.
    // Retrieve a JSON feed using the Spreadsheets Data API
    var parmsrc = 
        "http://spreadsheets.google.com/feeds/list/" + 
            "o15075496074116042190.8006901556347913051/" + wkshtID + 
            "/public/basic?alt=json-in-script&callback=listStats"; 
    scriptNode = document.createElement('script');
    scriptNode.src = parmsrc;
    scriptNode.type = 'text/javascript';
    document.getElementsByTagName('head')[0].appendChild(scriptNode);
</script>
]]>

 </Content>
</Module>
        


Hacking the hack

Some improvements could be made to this gadget. Here are a few:

  • It would be nice to be able to see the exact numbers represented by the bars. For instance, when your mouse pointer hovers over a bar, the exact stat could appear (like a tool tip does).
  • It would be nice to have a feature that lets you compare team AVG/OBP/SLG stats.
  • There are lists of team information in the Gadget code which we might want to keep as data in a text file somewhere, rather than having it in the code itself. On the other hand, this data isn't likely to change often (although the Devil Rays did shorten their name to the "Rays"), so keeping this data in the code isn't the end of the world.

You could argue that we should obtain the stats within the gadget code directly from a baseball Web site without going through the intermediate step of using a spreadsheet. We used a spreadsheet as an intermediate step for a couple of reasons. First, these stats are only going to change once a day and they're the same for every user. So, there's no point in needlessly pounding the baseball site that we're scraping. Second, the main point of this article is to show you what's available in the Google Code APIs, and this is a nice example of the Google Spreadsheet API.

There is really no limit to the number of baseball Google Gadgets you could come up with. It would be fun to create a whole 'dashboard' of gadgets that would help you to manage fantasy baseball teams.

Google Gadgets are not the only type of widgets out there. Mac OS X Dashboard Widgets are another popular type of widget.


Conclusion

In this article, we've given you some idea of the type of Web applications you can build using Google Gadgets, the Google Spreadsheet API, and the Google Chart API. With our code samples, we've also given you enough to get started writing your own applications using these Google Code APIs. In the Resources section, you'll find links to more detailed information. We also encourage you to take a look at Joseph Adler's excellent Baseball Hacks book. Even if you think you don't like baseball, you'll find this book interesting.


Download

Note

  1. The Google Gadget built in this article.

Resources

Learn

Discuss

About the authors

Paul Reiners

Paul Reiners was an Oakland Athletics fan back in the days of Catfish Hunter, Rollie Fingers, and Vida Blue. Paul is the developer of several open source programs, including Automatous Monk, Twisted Life, and Leipzig. He received his M.S. in Applied Mathematics (Theory of Computation) at the University of Illinois at Urbana-Champaign. He lives in Minnesota and, in his spare time, practices the electric bass and competes at TopCoder.

John Mysak

John is a former catcher and has been using Google as his primary search engine for as long as he can remember. John is the developer of PowerWorship and the owner of the world's smartest beagle.

Paul DiCarlo

Paul DiCarlo liked Catfish Hunter much better when he pitched for the New York Yankees. He lives in Minnesota where he is desperately trying to learn the ukulele.

Comments (Undergoing maintenance)



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Web development
ArticleID=330438
ArticleTitle=Google Code baseball hacks: Display batting stats in a Google Gadget
publish-date=08122008
author1-email=reiners@us.ibm.com
author1-email-cc=
author2-email=mysak@us.ibm.com
author2-email-cc=
author3-email=pdicarlo@us.ibm.com
author3-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Rate a product. Write a review.

Special offers