In this series, we look at how to get started with the Google App Engine (GAE). Here in Part 1, we look at how to set up a development environment so you can start creating an application that will run on the GAE. We will see how we can use Eclipse to make developing and debugging your application easier. In Part 2, we build an Ajax mashup using Eclipse and deploy it to the GAE. Finally, in Part 3, we give back to the ecosystem by creating RESTful Web services for our application, so other folks can use it to create their own mashups.
The GAE is a platform for creating Web applications. The biggest prerequisite for it is knowledge of Python, as this is the programming language used on it (currently, Python V2.5.2). For this series, it would be helpful to have some typical Web development skills (e.g., knowledge of HTML, JavaScript, and CSS). To develop for the GAE, you need to download three software packages.
- Eclipse Classic: I used Eclipse Classic V3.3.2. Later versions will work, too.
- Google App Engine SDK: Read the official documentation on the GAE site and find links to download the SDK.
- PyDev: PyDev, which turns Eclipse into a Python IDE, can be installed from within Eclipse using the update site at http://pydev.sourceforge.net/updates/.
Installing the latter two software packages is discussed in detail below. If you are new to Eclipse, see Resources to get started.
If you have developed Web applications for any amount of time, you're probably used to downloading libraries, Web servers, and databases when you want to get started with a new application stack. Sometimes all of these can be bundled in nice installers to make it a little easier to work with all the moving parts. Once you get everything in place, there are usually even more hoops to jump through to get everything working with your favorite development environment. Fortunately, this is not the case when working with the GAE. Let's take a look at how to get it set up and at how to get it working with Eclipse.
The first thing you need to do to set up the GAE is download the SDK, available for Microsoft® Windows®, Mac OS X, and Linux®. For Windows and Mac OS X, the SDK comes as an installer that will not only install the SDK on your system but will also put a couple of key executable scripts on your path for easy use. Figure 1 shows the directory structure of the SDK.
Figure 1. GAE SDK directory structure
At the root of the directory, you should see two Python scripts: appcfg.py and dev_appserver.py. The dev_appserver.py script is what you will use to launch a development application server. There is no separate install and no deployment needed for developing and testing your application. The appcfg.py script is what you will use when you are ready to deploy your application to the GAE.
The google directory is where you will find all the APIs that are the foundation of the GAE platform. You will inevitably use and extend classes in this directory. Thus, anything that runs GAE application code needs to know about this directory, because it contains the APIs the application code uses. To see an example, let's take a look at how to set up Eclipse to develop GAE code.
Eclipse is known as the premier IDE for developing Java™ programming language applications. However, it is not just for Java developers and is also used for many other languages, including C++, PHP, Ruby, and Python. In fact, there are multiple Eclipse plug-ins available that transform Eclipse into a Python IDE. The most popular of those is PyDev. It can be installed from within Eclipse using its update site: http://pydev.sourceforge.net/updates/.
Once you have PyDev installed, you need to configure it. Open Eclipse and go to Preferences > PyDev.
Figure 2. Configuring PyDev
You need to tell PyDev where your Python installation is. Go to Interpreter > Python and click New, as shown above. Simply navigate to your Python V2.5+ installation, and Eclipse should do the rest. Click OK, and you will be ready to develop Python from within Eclipse.
We will start by creating a new PyDev project in Eclipse. First, switch to the PyDev perspective by selecting Window > Open Perspective > Other and picking PyDev from the list of available perspectives.
Figure 3. Open PyDev perspective
Now you should be able to create a new application by selecting File > New > PyDev Project. In this example, we will create a separate src folder for our source code, but this is optional and mostly a matter of taste. So far, everything we have done is generic to Python development on Eclipse. Now we will start doing some GAE-specific things.
GAE projects are simple in layout; everything is one directory with no subdirectories. There are three files that you must have: app.yaml, index.yaml, and main.py. The first of these, app.yaml, is shown below.
Listing 1. app.yaml
application: aggrogator
version: 1
runtime: python
api_version: 1

handlers:
- url: .*
  script: main.py
This is the main configuration file for a GAE application. Most of the things in here you will only need to set once, like the name of your application, version, and what version of the GAE API it was built using. The part you will deal more with is the handlers section. This is a mapping between the URLs of HTTP requests and Python scripts in your project. In the case above, we map everything to the same main.py script. That file can then do additional routing to match URLs to methods.
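To make the handler mapping concrete, here is a minimal sketch in plain Python, assuming handlers are tried from top to bottom and the first pattern that matches the whole request path wins. The HANDLERS list, the /static/ entry, and the resolve_script function are illustrative names only; none of them is part of the SDK.

```python
import re

# Hypothetical in-memory copy of a handlers section.
# Each entry pairs a URL regular expression with a script name.
HANDLERS = [
    (r'/static/.*', 'static.py'),  # illustrative extra entry, not in Listing 1
    (r'.*', 'main.py'),            # the catch-all from Listing 1
]


def resolve_script(path, handlers=HANDLERS):
    """Return the script of the first handler whose pattern matches the whole path."""
    for pattern, script in handlers:
        if re.match('^(?:%s)$' % pattern, path):
            return script
    return None
```

With only the catch-all entry from Listing 1, every path resolves to main.py, which is why that script has to do its own routing.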
The other YAML file, index.yaml, is used to help the GAE fetch data efficiently. You can actually get away without having one, as GAE will generate it for you if you leave it out. We do not need to do anything with this for now, and Listing 2 shows a blank one.
Listing 2. index.yaml
indexes:

# AUTOGENERATED

# This index.yaml is automatically updated whenever the dev_appserver
# detects that a new type of query is run. If you want to manage the
# index.yaml file manually, remove the above marker line (the line
# saying "# AUTOGENERATED"). If you want to manage some indexes
# manually, move them above the marker line. The index.yaml file is
# automatically uploaded to the admin console when you next deploy
# your application using appcfg.py.
Finally, we come to the meat of our application: the main.py script. You can actually call this whatever you want, as long as it is in sync with app.yaml. It is common to use main.py, so that's what we will use. Before we look at that file, let's describe the type of application we will write.
Our application is called aggroGator. It will allow users to associate themselves with
various Web services they already use. It will then fetch their data feeds from those
services and aggregate them chronologically. For our application, we will use the
popular FeedParser library to parse the feeds. We will also
use GAE's built-in identity management via Google, so we do not have to write our own
registration/login/logout features. Users will just log in using the Google identity.
With all of that in mind, let's take a look at main.py.
Listing 3. main.py
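# Imports used by main(); they belong at the top of main.py.
import wsgiref.handlers
from google.appengine.ext import webapp

def main():
    # Map URL paths to the handler classes defined earlier in main.py.
    application = webapp.WSGIApplication(
        [('/', MainPage),
         ('/add', AddService)],
        debug=True)
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
    main()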
So this is actually just the main function of our script. All it does is set up some more routing. In this case, it will route requests for "/" to a class called MainPage and requests for "/add" to a class called AddService. Now let's look at the MainPage class.
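Before turning to the real MainPage, the routing idea can be sketched in plain Python. This is a toy model, not the webapp framework: the Handler base class, the handle function, and the status codes it returns are assumptions made for illustration. It shows a route list of the same shape as the one in Listing 3, and what happens when a handler class has no method for the HTTP verb that was requested.

```python
ALLOWED_METHODS = ('get', 'post', 'put', 'delete', 'head')


class Handler(object):
    """Toy base class: HTTP methods a subclass does not define are rejected."""

    def dispatch(self, method):
        name = method.lower()
        func = getattr(self, name, None) if name in ALLOWED_METHODS else None
        if func is None:
            return 405, 'Method Not Allowed'
        return 200, func()


class MainPage(Handler):
    def get(self):
        return 'main page'


class AddService(Handler):
    def post(self):
        return 'service added'


# Same shape as the list passed to webapp.WSGIApplication in Listing 3.
ROUTES = [('/', MainPage), ('/add', AddService)]


def handle(method, path, routes=ROUTES):
    """Find the handler class registered for path and call its method."""
    for route, handler_class in routes:
        if route == path:
            return handler_class().dispatch(method)
    return 404, 'Not Found'
```

In this sketch, handle('GET', '/') reaches MainPage.get, while handle('POST', '/') is rejected because the toy MainPage defines no post method.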
Listing 4. MainPage
import os
from operator import attrgetter

from google.appengine.api import users
from google.appengine.ext import webapp
from google.appengine.ext.webapp import template

class MainPage(webapp.RequestHandler):
    def get(self):
        user = users.get_current_user()
        # Offer a logout link to signed-in users and a login link otherwise.
        if user:
            url = users.create_logout_url(self.request.uri)
            url_linktext = 'Logout'
        else:
            url = users.create_login_url(self.request.uri)
            url_linktext = 'Login'
        updates = []
        account = None
        if user:
            account_query = Account.all()
            account_query.filter('user = ', user)
            result_set = account_query.fetch(1)
            if len(result_set) > 0:
                account = result_set[0]
            if account:
                # Each dynamic property holds the feed URL of one service.
                for service in account.dynamic_properties():
                    # Use a separate name so the login/logout url is not overwritten.
                    feed_url = getattr(account, service)
                    feed = GenericFeed(feed_url, service)
                    updates.extend(feed.entries())
            else:
                # First visit: create an empty account for this user.
                account = Account()
                account.user = user
                account.put()
            updates.sort(key=attrgetter('timestamp'), reverse=True)
        template_values = {
            'account': account,
            'updates': updates,
            'url': url,
            'url_linktext': url_linktext,
        }
        path = os.path.join(os.path.dirname(__file__), 'index.html')
        self.response.out.write(template.render(path, template_values))
This class does a lot, as it is the main controller for our application. The first thing to notice is that it only has a get method, so it supports only HTTP GET requests; a POST to "/" will result in an error. Next, it checks for identity. The users module is an API from the GAE SDK that leverages Google's identity management. We use it to check whether the user is logged in and, if so, who the user is (that is, the Google ID). We then create either a login or a logout link, depending on that check. If the user is not logged in, we just show the login link; otherwise, we keep processing by looking up the user's account. Account is another class we have written, and it is shown below.
Listing 5. The Account class
from google.appengine.ext import db

class Account(db.Expando):
    user = db.UserProperty()
This class leverages Google's data storage APIs (Google's famous Bigtable datastore). Our Account entity has a user property and is an Expando model. That allows us to create dynamic properties, one for the URL of each service. For each service, we retrieve a list of entries using the GenericFeed class, shown below.
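Before looking at GenericFeed, the idea of dynamic properties can be illustrated without the datastore. The class below is a plain-Python stand-in, not the db.Expando API: ExpandoSketch, its declared tuple, and the sample user name are all assumptions for illustration. It only shows how attributes added at run time can be told apart from the ones declared on the class.

```python
class ExpandoSketch(object):
    """Plain-Python stand-in for an Expando-style model (not the datastore API)."""

    # Properties declared up front, like `user` on Account.
    declared = ('user',)

    def __init__(self, user=None):
        self.user = user

    def dynamic_properties(self):
        """Return the names of attributes that were added after construction."""
        return sorted(name for name in vars(self) if name not in self.declared)


account = ExpandoSketch(user='test@example.com')
# Attributes assigned at run time play the role of dynamic properties.
account.twitter = 'http://twitter.com/statuses/user_timeline/someuser.rss'
account.last_fm = 'http://ws.audioscrobbler.com/1.0/user/someuser/recenttracks.rss'

# Collect service name -> feed URL, much as MainPage walks the account.
feeds = dict((name, getattr(account, name))
             for name in account.dynamic_properties())
```

The loop in Listing 4 follows the same pattern: it asks the account for its dynamic property names and reads each feed URL with getattr.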
Listing 6. GenericFeed class
import feedparser
from google.appengine.api import urlfetch

class GenericFeed:
    def __init__(self, url, name):
        self.url = url
        self.name = name

    def entries(self):
        result = urlfetch.fetch(self.url)
        updates = []
        # Only parse the body when the fetch succeeded.
        if result.status_code == 200:
            feed = feedparser.parse(result.content)
            for entry in feed['entries']:
                x = Entry()
                x.service = self.name
                x.title = entry['title']
                x.link = entry['link']
                # Not every feed supplies a summary; fall back to the title.
                if entry.get('summary'):
                    x.content = entry['summary']
                else:
                    x.content = entry['title']
                x.timestamp = entry.updated_parsed
                updates.append(x)
        return updates
This is the class that uses the FeedParser library. Before we call that library, we use another GAE API: the urlfetch module, which allows outbound HTTP requests, but only to ports 80 and 443 (for secure requests). We simply use it to do an HTTP GET on the URL we stored, then pass the response body to the FeedParser library. For each entry in the feed, we then create an instance of an Entry class, shown below.
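The fetch-then-parse flow can be tried outside the GAE with nothing but the standard library. The sketch below is a stand-in, not what the application uses: it replaces FeedParser with xml.etree.ElementTree, handles only simple RSS 2.0 items, and takes the status code and body as arguments instead of fetching anything. SAMPLE_RSS and parse_items are made-up names for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 document standing in for the body returned by a fetch.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description>Summary of the first post</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def parse_items(status_code, content):
    """Return one dict per <item>; an unsuccessful fetch yields an empty list."""
    if status_code != 200:
        return []
    items = []
    for item in ET.fromstring(content).iter('item'):
        title = item.findtext('title')
        items.append({
            'title': title,
            'link': item.findtext('link'),
            # Fall back to the title when the item has no description.
            'content': item.findtext('description') or title,
        })
    return items
```

The second sample item has no description, so its content falls back to the title, mirroring the summary check in Listing 6.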
Listing 7. Entry class
from time import strftime

class Entry:
    def __init__(self, title=None, link=None, timestamp=None,
                 content=None, service=None):
        self.title = title
        self.link = link
        self.content = content
        self.service = service
        self.timestamp = timestamp

    def printTime(self):
        # timestamp is a time tuple, as provided by FeedParser.
        return strftime('%B %d,%Y at %I:%M:%S %p', self.timestamp)
This class is mostly just a simple data structure. Its only logic is a method for printing a timestamp. Our GenericFeed class returns a list of Entry instances for each service associated with the user. We then sort the entries by timestamp in descending order (newest Entry first). Going back to MainPage, we then pass several objects to a template: the user's Account, the sorted list of entries, and the login/logout link. GAE uses a templating system similar to the one used by the popular Python framework Django. In this case, we pass the data to a template called index.html.
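Before the template itself, the merge-and-sort step is easy to try in isolation. The sketch below uses only the standard library; EntrySketch, the parsed helper, and the sample titles and dates are invented for illustration, and the timestamp format is a simplified one rather than the format used by printTime.

```python
import time
from operator import attrgetter


class EntrySketch(object):
    """Minimal stand-in for the Entry class in Listing 7."""

    def __init__(self, title, timestamp):
        self.title = title
        self.timestamp = timestamp  # a time.struct_time

    def print_time(self):
        return time.strftime('%Y-%m-%d %H:%M:%S', self.timestamp)


def parsed(text):
    """Build a struct_time from a 'YYYY-MM-DD HH:MM:SS' string (sample-data helper)."""
    return time.strptime(text, '%Y-%m-%d %H:%M:%S')


twitter_updates = [EntrySketch('tweet', parsed('2008-06-08 05:00:00'))]
youtube_updates = [EntrySketch('old video', parsed('2008-06-01 12:30:00')),
                   EntrySketch('new video', parsed('2008-06-08 09:15:00'))]

# Aggregate the per-service lists, then order them newest first.
updates = []
updates.extend(twitter_updates)
updates.extend(youtube_updates)
updates.sort(key=attrgetter('timestamp'), reverse=True)
titles = [u.title for u in updates]
```

Because time tuples compare field by field (year, month, day, and so on), sorting on the timestamp attribute is enough to interleave entries from different services chronologically.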
Listing 8. The index.html template
<html>
  <body>
    <a href="{{ url }}">{{ url_linktext }}</a>
    <ol>
    {% for update in updates %}
      <li>
        From {{update.service}}:
        <a href="{{update.link}}">{{update.content}}</a>
        posted at: {{update.printTime}}
      </li>
    {% endfor %}
    </ol>
    {% if account %}
    <form action="/add" method="post">
      <label for="service">Service: </label>
      <select name="service">
        <option>twitter</option>
        <option>del.icio.us</option>
        <option>last.fm</option>
        <option>YouTube</option>
      </select><br/>
      <label for="username">Username: </label>
      <input type="text" name="username"/>
      <input type="submit" value="Add"/>
    </form>
    {% endif %}
  </body>
</html>
This is a simple template. It is mostly just HTML, with a couple of dynamic parts.
First, it creates the appropriate login/logout link. Next, it iterates over the list of
entries to display them to the user. Finally, if the user is logged in, it will create
a form that will allow the user to add a service. That form does an HTTP POST to the /add URL. As we saw in Listing
3, that request will get routed to the AddService controller
class.
Listing 9. AddService controller
class AddService(webapp.RequestHandler):
    def post(self):
        # check if user already exists
        account_query = Account.all()
        account_query.filter('user = ', users.get_current_user())
        result_set = account_query.fetch(1)
        if len(result_set) > 0:
            account = result_set[0]
        else:
            account = Account()
            account.user = users.get_current_user()
        service = self.request.get('service')
        username = self.request.get('username')
        # Build the feed URL and store it as a dynamic property.
        if service == 'twitter':
            service = 'http://twitter.com/statuses/user_timeline/' + username + '.rss'
            account.twitter = service
        elif service == 'del.icio.us':
            service = 'http://del.icio.us/rss/' + username
            account.del_icio_us = service
        elif service == 'last.fm':
            service = ('http://ws.audioscrobbler.com/1.0/user/' + username +
                       '/recenttracks.rss')
            account.last_fm = service
        elif service == 'YouTube':
            service = 'http://www.youtube.com/rss/user/' + username + '/videos.rss'
            account.you_tube = service
        account.put()
        self.redirect('/')
This class looks up our user's account. It then creates the appropriate URL based on
what service was used. It adds that URL to the account's services, using an Expando
property, and saves everything back to Bigtable. Finally, it
redirects to MainPage.
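The chain of service checks in Listing 9 can also be written as a lookup table, which keeps the feed URLs in one place. This is only a sketch of an alternative layout, not the article's code: FEED_TEMPLATES, PROPERTY_NAMES, and feed_for are names introduced here, while the URL patterns and property names are copied from Listing 9.

```python
# Feed URL patterns taken from Listing 9; %s is replaced by the user name.
FEED_TEMPLATES = {
    'twitter': 'http://twitter.com/statuses/user_timeline/%s.rss',
    'del.icio.us': 'http://del.icio.us/rss/%s',
    'last.fm': 'http://ws.audioscrobbler.com/1.0/user/%s/recenttracks.rss',
    'YouTube': 'http://www.youtube.com/rss/user/%s/videos.rss',
}

# Attribute names used for the dynamic properties in Listing 9.
PROPERTY_NAMES = {
    'twitter': 'twitter',
    'del.icio.us': 'del_icio_us',
    'last.fm': 'last_fm',
    'YouTube': 'you_tube',
}


def feed_for(service, username):
    """Return (property_name, feed_url), or None for an unknown service."""
    template = FEED_TEMPLATES.get(service)
    if template is None:
        return None
    return PROPERTY_NAMES[service], template % username
```

Inside a handler, the returned pair could be applied with setattr(account, property_name, feed_url) before calling account.put(). An unknown service returns None, so nothing is stored for it.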
Now we have seen all of the code for the application, and we are ready to run it. But how? Once again, Eclipse makes this easy.
The GAE SDK provides command-line tools for running your project locally. However, we
want to take advantage of Eclipse, so we want to run things from inside Eclipse. This
allows us to debug the application, as we will see later. The first step in running
the application is editing the PYTHONPATH for the project.
The easiest way to do this is to right-click on your project and select
Properties. This will bring up the project properties.
Figure 4. Project properties
As you can see, you want to select PyDev - PYTHONPATH in the left menu. Then you want to select Add source folder and navigate to where the GAE SDK has been installed. This differs depending on your OS and is customizable. On Windows, the default (set by the installer) is C:\Program Files\Google\AppEngine, and on OS X, the default is /usr/local/google_appengine. If you are on Linux or you downloaded the ZIP instead of the OS-specific installer, you picked where you wanted to put the SDK. It can be anywhere, but you just need to let Eclipse know where it is. We will refer to this location as $APP_ENGINE_HOME.
Now we need to do just one last thing to run our project: We need to create a Run profile for it. To do this, select Run > Open Run dialog.
Figure 5. Run dialog
We will call the Run profile aggroGator. Under Main
Module browse to $APP_ENGINE_HOME and select the dev_appserver.py script. This is a Python
application server that mimics the GAE production environment. Next, go to the
Arguments tab, as shown in Figure 6.
Figure 6. Arguments tab
In the Program arguments box, enter ${project_loc}/src. The
Eclipse variable ${project_loc} simply points to the
physical location of the current project. We want to pass the directory of our
application to the dev_appserver.py script, hence the /src. If you did not put your
code in a src directory, adjust the argument accordingly.
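The Run profile amounts to running dev_appserver.py with the application directory as its argument. As a rough sketch of the same thing outside Eclipse, the function below assembles that command as an argument list. The function name, the default src folder, and the sample paths are assumptions for illustration; substitute your own $APP_ENGINE_HOME and project location.

```python
import os
import sys


def dev_server_command(app_engine_home, project_loc, source_dir='src'):
    """Build the argument list for starting dev_appserver.py on a project."""
    script = os.path.join(app_engine_home, 'dev_appserver.py')
    # Pass the project folder itself when the code is not in a subfolder.
    app_dir = os.path.join(project_loc, source_dir) if source_dir else project_loc
    return [sys.executable, script, app_dir]


# Hypothetical locations; substitute your own SDK and workspace paths.
command = dev_server_command('/usr/local/google_appengine',
                             '/home/me/workspace/aggroGator')
```

The resulting list could be handed to subprocess.call(command), provided the interpreter named by sys.executable is a Python version the SDK supports.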
Now we're ready to run the application. If you click Run, you should see the output in Listing 10 in the Eclipse console.
Listing 10. Console output
INFO 2008-06-08 05:00:29,236 appcfg.py] Server: appengine.google.com
INFO 2008-06-08 05:00:29,283 appcfg.py] Checking for updates to the SDK.
WARNING 2008-06-08 05:00:29,581 datastore_file_stub.py] Could not read datastore data from /var/folders/oo/ooKE4ln2HqC9exSMWxwprk+++TI/-Tmp-/dev_appserver.datastore
WARNING 2008-06-08 05:00:29,582 datastore_file_stub.py] Could not read datastore data from /var/folders/oo/ooKE4ln2HqC9exSMWxwprk+++TI/-Tmp-/dev_appserver.datastore.history
INFO 2008-06-08 05:00:29,606 dev_appserver_main.py] Running application aggrogator on port 8080: http://localhost:8080
Notice that it says the application is running at http://localhost:8080. Go to that in your browser and give it a try, as shown below.
Figure 7. aggroGator welcome screen
Clicking on Login will bring up Figure 8.
Figure 8. Login screen
This is a mock login screen, obviously. You can use any e-mail address, as this is not going to actually access the Google authentication service. In fact, test@example.com will work just fine. Logging in will allow you to start adding services.
Figure 9. Adding a service
Now you can start having fun with the application by adding services. If you go back to Listing 9, the AddService controller, you will see that the URLs for the feeds of the various services are hard-coded in this class. Of course, these could change in the future, and you might get an error. This is when a debugger comes in handy. Let's take a look at how to use the Eclipse debugger with our GAE project.
A major advantage of using an IDE like Eclipse is that it makes it much easier to debug applications, even complex Web applications. The first thing we need to do for our GAE project is create a Debug profile for it. This is similar to our Run profile, so we simply select the project and click Run > Open Debug dialog.
Figure 10. Debug dialog
Eclipse is smart enough to default to the Run settings we created previously. There is no modification needed, and you can simply click Debug. Take a look at your Eclipse console, and you should see some output similar to Listing 11.
Listing 11. Debug output
pydev debugger: warning: psyco not available for debugger speedups
pydev debugger: starting
INFO 2008-06-08 05:18:37,704 appcfg.py] Server: appengine.google.com
INFO 2008-06-08 05:18:37,755 appcfg.py] Checking for updates to the SDK.
INFO 2008-06-08 05:18:38,196 dev_appserver_main.py] Running application aggrogator on port 8080: http://localhost:8080
The first two lines show output from the PyDev debugger. Now you can set breakpoints in the project and start debugging. In Figure 11, we are debugging the AddService controller.
Figure 11. Debugging AddService
Now we can start stepping through the code and inspecting variables. If any of our services change, or we need to add new ones, this will make it easy to find and squash bugs.
In this article, we have gone from start to a full application in no time. The Google App Engine SDK has been installed and hooked up to Eclipse. This allowed us to quickly write, test, and debug code. We took a look at many key concepts in GAE projects, including URL routing, interacting with external sites via URL fetching, using presentation templates, and working with Bigtable. Our application is ready to be deployed to the GAE, which we will take a look at in Part 2.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample code | os-eclipse-mashup-google-pt1-aggrogator.zip | 74KB | HTTP |
Learn
- Read "Charming Python: Python elegance and warts" to learn about the latest goodies in Python.
- Read more of the "Charming Python" series on developerWorks.
- The SDK uses the Web app framework that is similar to Django. You can actually use Django, so you might want to learn about Django in the developerWorks article "Python Web frameworks, Part 1."
- Check out "Get started with open source CMS, Part 6: Build a Python WebDAV client for Jakarta Slide" to see PyDev in action.
- Read all about Google's Bigtable: A Distributed Storage System for Structured Data.
- With a dynamic language like Python, it is always good to have the official Python documentation handy.
- Doing Web development with Eclipse? You might want to read "Discover the Ajax Toolkit Framework for Eclipse."
- Check out the "Recommended Eclipse reading list."
- Browse all the Eclipse content on developerWorks.
- New to Eclipse? Read the developerWorks article "Get started with Eclipse Platform" to learn its origin and architecture, and how to extend Eclipse with plug-ins.
- Expand your Eclipse skills by checking out IBM developerWorks' Eclipse project resources.
- To listen to interesting interviews and discussions for software developers, check out developerWorks podcasts.
- Stay current with developerWorks' Technical events and webcasts.
- Watch and learn about IBM and open source technologies and product functions with the no-cost developerWorks On demand demos.
- Check out upcoming conferences, trade shows, webcasts, and other Events around the world that are of interest to IBM open source developers.
- Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.
Get products and technologies
- Download the Google App Engine SDK and read official documentation from the Google App Engine site.
- The application created in this article used Mark Pilgrim's Universal Feed Parser. This awesome library can parse RSS, Atom, you name it.
- DjangoProject.com is the home page for the Django framework.
- This article uses Eclipse Classic V3.3.2.
- The PyDev plug-in is available from http://pydev.sourceforge.net/updates/. It can be installed from within Eclipse using this update site.
- Check out the latest Eclipse technology downloads at IBM alphaWorks.
- Download Eclipse Platform and other projects from the Eclipse Foundation.
- Download IBM product evaluation versions, and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
- Innovate your next open source development project with IBM trial software, available for download or on DVD.
Discuss
- The Eclipse Platform newsgroups should be your first stop to discuss questions regarding Eclipse. (Selecting this will launch your default Usenet news reader application and open eclipse.platform.)
- The Eclipse newsgroups have many resources for people interested in using and extending Eclipse.
- Participate in developerWorks blogs and get involved in the developerWorks community.




