Level: Intermediate
Michael Galpin (mike.sr@gmail.com), Developer, eBay
05 Aug 2008
Tapping into social software can be a great way to add value to your
application. Social networks are making it easier to take data and mash it up to
create innovative new Web applications. However, you must still deal with all the
usual issues of creating a scalable Web application. Now the Google App Engine (GAE)
makes that easier, as well. With the GAE, you can forget all about managing pools of
application servers. You do not have to worry about storing huge amounts of static
content and dynamic data. Instead, you can concentrate on creating a great mashup. In
this article, the first of a three-part "Creating mashups on the Google App Engine using
Eclipse" series, we see how to get started developing GAE applications, and we will
take a look at how to use Eclipse to make GAE development even
easier.
About this series
In this series, we look at how to get started with the Google App Engine (GAE). Here in
Part 1, we look at how to get a development environment set up so you can start
creating an application that will run on the GAE. We will see how we can use Eclipse to
make developing and debugging your application easier. In Part 2, we build an Ajax
mashup using Eclipse and deploy it to the GAE. Finally, in Part 3, we give back to the
ecosystem by creating RESTful Web services to our application, so other folks can use
it to create their own mashups.
The GAE is a platform for creating Web applications. The biggest prerequisite for it is
knowledge of Python, as this is the programming language used on it (currently, Python
V2.5.2). For this series, it would be helpful to have some typical Web development
skills (e.g., knowledge of HTML, JavaScript, and CSS). To develop for the GAE, you need
to download three software packages.
- Eclipse Classic
- I used Eclipse Classic V3.3.2. Later versions will work, too.
- Google App Engine SDK
- Read official documentation from the GAE site and find links to download the SDK.
- PyDev
- PyDev, which turns Eclipse into a Python IDE, can be installed from within Eclipse
using the update site at http://pydev.sourceforge.net/updates/.
Installing the latter two software packages is discussed in detail below. If you are
new to Eclipse, see Resources to get started.
Setting up the GAE
If you have developed Web applications for any amount of time, you're probably used to
downloading libraries, Web servers, and databases when you want to get started with a
new application stack. Sometimes all of these can be bundled in nice
installers to make it a little easier to work with all the moving parts. Once you
get everything in place, there are usually even more hoops to jump through to get
everything working with your favorite development environment. Fortunately, this is not
the case when working with the GAE. Let's take a look at how to get it set up and at how to get it working with Eclipse.
The first thing you need to do to set up the GAE is download the SDK, available for
Microsoft® Windows®, Mac OS X, and Linux®.
For Windows and Mac OS X, the SDK comes as an installer that will not only install the
SDK on your system but will also put a couple key executable scripts on your
path for easy use. Figure 1 shows a picture of what the directory structure of the SDK looks like.
Figure 1. GAE SDK directory structure
At the root of the directory, you should see two Python scripts: appcfg.py and
dev_appserver.py. The dev_appserver.py script is what you will use to launch a
development application server. There is no separate install and no deployment needed
for developing and testing your application. The appcfg.py script is what you will use
when you are ready to deploy your application to the GAE.
The Google directory is where you will find all the APIs that are the foundation of
the GAE platform. You will inevitably use and extend classes in this directory. Thus, if
something wants to run GAE application code, it needs to know about this directory
because it will need to know about the APIs used by the application code. To see an
example, let's take a look at how to set up Eclipse to develop GAE code.
Setting up Eclipse
Eclipse is known as the premier IDE for developing Java™ programming
language applications. However, it is not just for Java developers and is also used for
many other languages, including C++, PHP, Ruby, and Python. In fact, there are multiple
Eclipse plug-ins available that transform Eclipse into a Python IDE. The most popular
of those is PyDev. It can be installed from within Eclipse using its update site: http://pydev.sourceforge.net/updates/.
Once you have PyDev installed, you need to configure it. Open Eclipse and go to
Preferences > PyDev.
Figure 2. Configuring PyDev
You need to tell PyDev where your Python installation is. Go to Interpreter >
Python and click New, as shown above. Simply navigate to your Python V2.5+
installation, and Eclipse should do the rest. Click OK, and you will be ready to develop Python from within Eclipse.
Create an application
We will start by creating a new PyDev project in Eclipse. First, switch to the PyDev
perspective by selecting Window > Open Perspective > Other and picking
PyDev from the list of available perspectives.
Figure 3. Open PyDev perspective
Now you should be able to create a new application by selecting File > New >
PyDev Project. In this example, we will create a separate src folder for our source
code, but this is optional and mostly a matter of taste. So far, everything we have
done is generic to Python development on Eclipse. Now we will start doing some GAE-specific things.
GAE template project
GAE projects are simple in layout; everything is in one directory with no subdirectories.
There are three files that you will normally have: app.yaml, index.yaml, and main.py. The first
of these, app.yaml, is shown below.
Listing 1. app.yaml
application: aggrogator
version: 1
runtime: python
api_version: 1
handlers:
- url: .*
  script: main.py
This is the main configuration file for a GAE application. Most of the things in here
you will only need to set once, like the name of your application, version, and what
version of the GAE API it was built using. The part you will deal more with is the
handlers section. This is a mapping between the URLs of HTTP requests and Python
scripts in your project. In the case above, we map everything to the same main.py
script. That file can then do additional routing to match URLs to methods.
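The URL-to-script mapping described above can be pictured with a few lines of plain Python. This is only a sketch of the idea, not the GAE implementation: it assumes that each pattern is treated as a regular expression matched against the whole request path and that the first matching handler wins. The names HANDLERS and resolve_script, and the admin.py entry, are hypothetical and do not appear in the sample application.

```python
import re

# Hypothetical handler table mirroring the "handlers" section of app.yaml:
# (URL pattern, script) pairs, checked in order. The admin.py entry is
# invented for illustration; the article's app.yaml maps everything to main.py.
HANDLERS = [
    (r"/admin/.*", "admin.py"),
    (r".*", "main.py"),
]


def resolve_script(path, handlers=HANDLERS):
    """Return the script of the first handler whose pattern matches the whole path."""
    for pattern, script in handlers:
        if re.fullmatch(pattern, path):
            return script
    return None


print(resolve_script("/"))
print(resolve_script("/add"))
print(resolve_script("/admin/stats"))
```

With a single catch-all pattern such as `.*`, every request ends up in main.py, which is why that script has to do its own routing.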
The other YAML file, index.yaml, is used to help the GAE fetch data efficiently. You
can actually get away without having one, as GAE will generate it for you if you leave
it out. We do not need to do anything with this for now, and Listing 2 shows a blank one.
Listing 2. index.yaml
indexes:
# AUTOGENERATED
# This index.yaml is automatically updated whenever the dev_appserver
# detects that a new type of query is run. If you want to manage the
# index.yaml file manually, remove the above marker line (the line
# saying "# AUTOGENERATED"). If you want to manage some indexes
# manually, move them above the marker line. The index.yaml file is
# automatically uploaded to the admin console when you next deploy
# your application using appcfg.py.
Finally, we come to the meat of our application: the main.py script. You can actually
call this whatever you want, as long as it is in sync with app.yaml. It is common to
use main.py, so that's what we will use. Before we look at that file, let's describe the type of application we will write.
The aggroGator
Our application is called aggroGator. It will allow users to associate themselves with
various Web services they already use. It will then fetch their data feeds from those
services and aggregate them chronologically. For our application, we will use the
popular FeedParser library to parse the feeds. We will also
use GAE's built-in identity management via Google, so we do not have to write our own
registration/login/logout features. Users will just log in using the Google identity.
With all of that in mind, let's take a look at main.py.
Listing 3. main.py
def main():
    application = webapp.WSGIApplication(
        [('/', MainPage),
         ('/add', AddService)],
        debug=True)
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
    main()
So this is actually just the main method of our script. All it does is set up some more
routing. In this case, it will route requests for "/" to a class called MainPage and requests for "/add" to a class called AddService. Now let's look at the MainPage class.
Listing 4. MainPage
class MainPage(webapp.RequestHandler):
    def get(self):
        # Build a logout link for signed-in users, a login link otherwise.
        user = users.get_current_user()
        if user:
            url = users.create_logout_url(self.request.uri)
            url_linktext = 'Logout'
        else:
            url = users.create_login_url(self.request.uri)
            url_linktext = 'Login'
        updates = []
        account = None
        if user:
            account_query = Account.all()
            account_query.filter('user = ', user)
            result_set = account_query.fetch(1)
            if len(result_set) > 0:
                account = result_set[0]
            if account:
                # Each dynamic property holds the feed URL of one service.
                # A separate name keeps the login/logout url intact.
                for service in account.dynamic_properties():
                    feed_url = getattr(account, service)
                    feed = GenericFeed(feed_url, service)
                    updates.extend(feed.entries())
            else:
                # First visit: create an account for this user.
                account = Account()
                account.user = user
                account.put()
        updates.sort(key=attrgetter('timestamp'), reverse=True)
        template_values = {
            'account': account,
            'updates': updates,
            'url': url,
            'url_linktext': url_linktext,
        }
        path = os.path.join(os.path.dirname(__file__), 'index.html')
        self.response.out.write(template.render(path, template_values))
This class does a lot, as it is the main controller for our application. The first
thing to notice is that it only has a get method. That means
it will only support an HTTP GET request. A POST to "/" will result in an error. Next, it checks for identity.
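The combination of path routing (Listing 3) and per-verb handler methods can be sketched with nothing but the standard library. The code below is not the webapp framework; it is a small WSGI application written for illustration, with stand-in MainPage and AddService classes that return plain strings. The ROUTES table and the call helper are hypothetical names, and the 405 status for an unsupported verb is an assumption based on common HTTP practice, since the article only says that such a request results in an error.

```python
from wsgiref.util import setup_testing_defaults


class MainPage(object):
    """Stand-in handler that only implements GET."""

    def get(self):
        return "main page"


class AddService(object):
    """Stand-in handler that only implements POST."""

    def post(self):
        return "service added"


# Hypothetical routing table: request path -> handler class.
ROUTES = {"/": MainPage, "/add": AddService}


def application(environ, start_response):
    """Route by path, then dispatch to the method named after the HTTP verb."""
    plain = [("Content-Type", "text/plain")]
    handler_class = ROUTES.get(environ.get("PATH_INFO", "/"))
    if handler_class is None:
        start_response("404 Not Found", plain)
        return [b"not found"]
    method = getattr(handler_class(), environ["REQUEST_METHOD"].lower(), None)
    if method is None:
        # Assumed status for a verb the handler does not implement.
        start_response("405 Method Not Allowed", plain)
        return [b"method not allowed"]
    start_response("200 OK", plain)
    return [method().encode("utf-8")]


def call(path, verb):
    """Invoke the WSGI app in-process and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    environ["REQUEST_METHOD"] = verb
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")


print(call("/", "GET"))
print(call("/", "POST"))
print(call("/add", "POST"))
```

Because the stand-in MainPage has no post method, the dispatcher finds nothing to call for a POST to "/" and reports an error instead.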
The users module is an API from the GAE SDK that leverages
Google's identity management. We use it to check whether the user is logged in and,
if so, who the user is (that is, the user's Google ID). We then create either a
login or a logout link, depending on that state. If the user is not logged in, we just
show the login link; otherwise, we keep processing by looking up the account. Account is
another class we have written, and it is shown below.
Listing 5. The Account class
class Account(db.Expando):
    user = db.UserProperty()
This class leverages Google's data storage APIs (Google's famous Bigtable datastore).
Our Account entity has a User
property and is an Expando Model. That allows us to create dynamic properties
— one for the URL of each service. For each service, we retrieve a list
of entries using the GenericFeed class, shown below.
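As a brief aside before the GenericFeed listing, the idea of dynamic properties can be illustrated in plain Python. The sketch below is not the GAE Expando class and stores nothing; SketchAccount and its declared tuple are hypothetical names invented here, and the feed URLs use a placeholder username. It only shows the pattern the application relies on: one declared property, plus any number of attributes added later that can be listed by name.

```python
class SketchAccount(object):
    """Plain-Python stand-in for an Expando-style model (not the GAE class)."""

    # Properties declared up front, like the user property of Account.
    declared = ("user",)

    def __init__(self, user=None):
        self.user = user

    def dynamic_properties(self):
        """Return the names of attributes that were added after construction."""
        return sorted(name for name in vars(self) if name not in self.declared)


account = SketchAccount(user="test@example.com")
# Attributes assigned here play the role of dynamic properties.
account.twitter = "http://twitter.com/statuses/user_timeline/example.rss"
account.last_fm = "http://ws.audioscrobbler.com/1.0/user/example/recenttracks.rss"

for service in account.dynamic_properties():
    print(service, getattr(account, service))
```

The loop at the end has the same shape as the one in MainPage: iterate over the dynamic property names and read each value with getattr.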
Listing 6. GenericFeed class
class GenericFeed:
    def __init__(self, url, name):
        self.url = url
        self.name = name

    def entries(self):
        result = urlfetch.fetch(self.url)
        updates = []
        if result.status_code == 200:
            feed = feedparser.parse(result.content)
            for entry in feed['entries']:
                x = Entry()
                x.service = self.name
                x.title = entry['title']
                x.link = entry['link']
                # Fall back to the title when the entry has no summary.
                if entry.get('summary'):
                    x.content = entry['summary']
                else:
                    x.content = entry['title']
                x.timestamp = entry.updated_parsed
                updates.append(x)
        return updates
This is the class that uses the FeedParser library. Before we
use that library, we use another one of the GAE APIs: the urlfetch module. It allows outbound HTTP requests, but only to port
80 and port 443 (for secure requests). We simply use it to do an HTTP GET on the URL we stored, then pass the results to the FeedParser library. For each feed entry, we then create an instance of an Entry class, shown below.
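Before moving on to the Entry class, here is a rough, SDK-free illustration of the parsing step. It makes several assumptions: the standard library's xml.etree.ElementTree stands in for FeedParser, an inline sample RSS document stands in for the result of the fetch, and every item is assumed to carry a pubDate. The names SAMPLE_RSS and parse_items are hypothetical, and the result is a list of dictionaries rather than Entry objects.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate

# Inline sample feed so the sketch needs no network access.
SAMPLE_RSS = """<rss version="2.0"><channel><title>example</title>
<item><title>First post</title><link>http://example.com/1</link>
<description>Hello</description>
<pubDate>Sun, 08 Jun 2008 05:00:29 GMT</pubDate></item>
<item><title>No summary</title><link>http://example.com/2</link>
<pubDate>Sun, 08 Jun 2008 06:00:00 GMT</pubDate></item>
</channel></rss>"""


def parse_items(xml_text, service):
    """Turn RSS item elements into dicts shaped like the fields of an entry."""
    updates = []
    for item in ET.fromstring(xml_text).iter("item"):
        title = item.findtext("title", default="")
        updates.append({
            "service": service,
            "title": title,
            "link": item.findtext("link", default=""),
            # Fall back to the title when the item has no description.
            "content": item.findtext("description") or title,
            # A time tuple, usable for sorting and with time.strftime.
            "timestamp": parsedate(item.findtext("pubDate")),
        })
    return updates


for update in parse_items(SAMPLE_RSS, "twitter"):
    print(update["service"], update["title"], update["content"])
```

The fallback from description to title mirrors the summary check in GenericFeed, so that every update has something to display.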
Listing 7. Entry class
class Entry:
    def __init__(self, title=None, link=None, timestamp=None,
                 content=None, service=None):
        self.title = title
        self.link = link
        self.content = content
        self.service = service
        self.timestamp = timestamp

    def printTime(self):
        return strftime('%B %d,%Y at %I:%M:%S %p', self.timestamp)
This class is mostly just a simple data structure. Its only logic is that it has a
method for printing a timestamp. Our GenericFeed class
returns a list of Entry instances for each service
associated with the user. We then sort the Entries based on
timestamp, in descending order (newest Entry first). Going
back to MainPage, we then pass several objects (the user's
Account, the sorted list of Entries, and the login/logout link) to a template. GAE uses a
templating system similar to the one used by the popular Python framework Django. In
this case, we pass the data to a template called index.html.
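Before looking at the template, the merge-and-sort step described above can be tried in isolation. The following is a minimal sketch under stated assumptions: Update is a hypothetical stand-in for the Entry class that keeps only a service name and a timestamp, the when helper and the sample dates are invented for illustration, and the timestamps are time tuples like the ones the application sorts on.

```python
from operator import attrgetter
from time import strptime


class Update(object):
    """Minimal stand-in for an entry: just a service and a timestamp."""

    def __init__(self, service, timestamp):
        self.service = service
        self.timestamp = timestamp


def when(text):
    """Parse 'YYYY-MM-DD HH:MM' into a time tuple."""
    return strptime(text, "%Y-%m-%d %H:%M")


# Two hypothetical per-service lists, as separate feeds would produce.
twitter = [Update("twitter", when("2008-06-08 05:00")),
           Update("twitter", when("2008-06-07 09:30"))]
last_fm = [Update("last_fm", when("2008-06-08 04:15"))]

# Collect everything into one list, then sort newest first.
updates = []
updates.extend(twitter)
updates.extend(last_fm)
updates.sort(key=attrgetter("timestamp"), reverse=True)

for update in updates:
    print(update.service, tuple(update.timestamp)[:5])
```

Time tuples compare field by field (year, month, day, and so on), which is why sorting on the timestamp attribute yields chronological order without any extra conversion.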
Listing 8. The index.html template
<html>
<body>
<a href="{{ url }}">{{ url_linktext }}</a>
<ol>
{% for update in updates %}
<li>
From {{update.service}}:
<a href="{{update.link}}">{{update.content}}</a>
posted at: {{update.printTime}}
</li>
{% endfor %}
</ol>
{% if account %}
<form action="/add" method="post">
<label for="service">Service: </label>
<select name="service">
<option>twitter</option>
<option>del.icio.us</option>
<option>last.fm</option>
<option>YouTube</option>
</select><br/>
<label for="username">Username: </label>
<input type="text" name="username"/>
<input type="submit" value="Add"/>
</form>
{% endif %}
</body>
</html>
This is a simple template. It is mostly just HTML, with a couple of dynamic parts.
First, it creates the appropriate login/logout link. Next, it iterates over the list of
entries to display them to the user. Finally, if the user is logged in, it will create
a form that will allow the user to add a service. That form does an HTTP POST to the /add URL. As we saw in Listing
3, that request will get routed to the AddService controller
class.
Listing 9. AddService controller
class AddService(webapp.RequestHandler):
    def post(self):
        # check if user already exists
        account_query = Account.all()
        account_query.filter('user = ', users.get_current_user())
        result_set = account_query.fetch(1)
        if len(result_set) > 0:
            account = result_set[0]
        else:
            account = Account()
            account.user = users.get_current_user()
        service = self.request.get('service')
        username = self.request.get('username')
        # Store the feed URL as a dynamic property named after the service.
        if service == 'twitter':
            service = 'http://twitter.com/statuses/user_timeline/'+username+'.rss'
            account.twitter = service
        elif service == 'del.icio.us':
            service = 'http://del.icio.us/rss/' + username
            account.del_icio_us = service
        elif service == 'last.fm':
            service = ('http://ws.audioscrobbler.com/1.0/user/'+username+
                       '/recenttracks.rss')
            account.last_fm = service
        elif service == 'YouTube':
            service = 'http://www.youtube.com/rss/user/'+username+'/videos.rss'
            account.you_tube = service
        account.put()
        self.redirect('/')
This class looks up our user's account. It then creates the appropriate URL based on
what service was used. It adds that URL to the account's services, using an Expando
property, and saves everything back to Bigtable. Finally, it
redirects to MainPage.
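One possible way to keep the service-to-URL logic in a single place is a lookup table. The sketch below is an alternative arrangement, not the code from the sample application: SERVICES and feed_for are hypothetical names, while the URL patterns and property names are copied from Listing 9. Those URLs reflect the services as they were when the article was written and may no longer resolve.

```python
# Service name (as sent by the form) -> (property name, feed URL pattern).
# Patterns are copied from Listing 9 and may be out of date.
SERVICES = {
    "twitter": ("twitter",
                "http://twitter.com/statuses/user_timeline/%s.rss"),
    "del.icio.us": ("del_icio_us",
                    "http://del.icio.us/rss/%s"),
    "last.fm": ("last_fm",
                "http://ws.audioscrobbler.com/1.0/user/%s/recenttracks.rss"),
    "YouTube": ("you_tube",
                "http://www.youtube.com/rss/user/%s/videos.rss"),
}


def feed_for(service, username):
    """Return (property_name, feed_url), or None for an unknown service."""
    if service not in SERVICES:
        return None
    property_name, pattern = SERVICES[service]
    return property_name, pattern % username


print(feed_for("twitter", "example"))
print(feed_for("unknown", "example"))
```

In a handler, the returned pair could be applied with setattr(account, property_name, feed_url), which has the same effect as the explicit attribute assignments in the listing.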
Now we have seen all of the code for
the application, and we are ready to run it. But how? Once again, Eclipse makes this easy.
Testing locally
The GAE SDK provides command-line tools for running your project locally. However, we
want to take advantage of Eclipse, so we want to run things from inside Eclipse. This
allows us to debug the application, as we will see later. The first step in running
the application is editing the PYTHONPATH for the project.
The easiest way to do this is to right-click on your project and select
Properties. This will bring up the project properties.
Figure 4. Project properties
As you can see, you want to select PyDev - PYTHONPATH in the left menu. Then you
want to select Add source folder and navigate to where the GAE SDK has been
installed. This differs depending on your OS and is customizable. On Windows, the
default (set by the installer) is C:\Program Files\Google\AppEngine, and on OS X, the
default is /usr/local/google_appengine. If you are on Linux or you downloaded the ZIP
instead of the OS-specific installer, you picked where you wanted to put the SDK. It
can be anywhere, but you just need to let Eclipse know where it is. We will refer to this location as $APP_ENGINE_HOME.
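To see why the SDK directory has to be on the PYTHONPATH, it helps to watch what the import system does with and without an entry. The sketch below does not touch the real SDK: it builds a throwaway directory containing a tiny package (the hypothetical name sketch_sdk stands in for the SDK's google package, and the temporary directory for $APP_ENGINE_HOME) and shows that the package becomes importable only once its parent directory is added to sys.path.

```python
import importlib
import os
import sys
import tempfile

# Throwaway directory with a tiny package, standing in for the SDK location.
sdk_home = tempfile.mkdtemp()
package_dir = os.path.join(sdk_home, "sketch_sdk")
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write("GREETING = 'imported from the stand-in SDK'\n")


def can_import(name):
    """Report whether the named module can currently be imported."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


before = can_import("sketch_sdk")   # directory not on the path yet
sys.path.insert(0, sdk_home)        # what a PYTHONPATH entry amounts to
importlib.invalidate_caches()
after = can_import("sketch_sdk")

print(before, after)
```

Adding the SDK folder in the PyDev - PYTHONPATH page plays the same role for the project as the sys.path.insert call does here.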
Now we need to do just one last thing to run our project: We need to create a Run
profile for it. To do this, select Run > Open Run dialog.
Figure 5. Run dialog
We will call the Run profile aggroGator. Under Main
Module browse to $APP_ENGINE_HOME and select the dev_appserver.py script. This is a Python
application server that mimics the GAE production environment. Next, go to the
Arguments tab, as shown in Figure 6.
Figure 6. Arguments tab
In the Program arguments box, enter ${project_loc}/src. The
Eclipse variable ${project_loc} simply points to the
physical location of the current project. We want to pass the directory of our
application to the dev_appserver.py script, hence the /src. If you did not put your
code in a src directory, adjust the argument accordingly.
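The Run profile amounts to launching dev_appserver.py with the application directory as its argument. As a hedged illustration, the helper below assembles that command as a list; dev_appserver_command is a hypothetical name, the workspace path is a placeholder, and the interpreter name "python" is an assumption that should point at the Python installation configured for PyDev. The sketch only builds and prints the command and does not start a server.

```python
import os


def dev_appserver_command(app_engine_home, project_loc,
                          source_dir="src", interpreter="python"):
    """Build the argument list for launching the development server."""
    script = os.path.join(app_engine_home, "dev_appserver.py")
    app_dir = os.path.join(project_loc, source_dir)
    return [interpreter, script, app_dir]


# Placeholder locations; substitute your own SDK and project paths.
command = dev_appserver_command("/usr/local/google_appengine",
                                "/path/to/workspace/aggroGator")
print(" ".join(command))
```

Passing the resulting list to subprocess.call would run the server from a script, which can be handy outside Eclipse.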
Now we're ready to run the
application. If you click Run, you should see the output in Listing 10 in the Eclipse console.
Listing 10. Console output
INFO 2008-06-08 05:00:29,236 appcfg.py] Server: appengine.google.com
INFO 2008-06-08 05:00:29,283 appcfg.py] Checking for updates to the SDK.
WARNING 2008-06-08 05:00:29,581 datastore_file_stub.py] Could not read datastore data
from /var/folders/oo/ooKE4ln2HqC9exSMWxwprk+++TI/-Tmp-/dev_appserver.datastore
WARNING 2008-06-08 05:00:29,582 datastore_file_stub.py] Could not read datastore data
from /var/folders/oo/ooKE4ln2HqC9exSMWxwprk+++TI/-Tmp-/dev_appserver.datastore.history
INFO 2008-06-08 05:00:29,606 dev_appserver_main.py] Running application aggrogator
on port 8080: http://localhost:8080
Notice that it says the application is running at http://localhost:8080. Go to that in
your browser and give it a try, as shown below.
Figure 7. aggroGator welcome screen
Clicking on Login will bring up Figure 8.
Figure 8. Login screen
This is a mock login screen, obviously. You can use any e-mail address, as this is not
going to actually access the Google authentication service. In fact, test@example.com
will work just fine. Logging in will allow you to start adding services.
Figure 9. Adding a service
Now you can start having fun with the application by adding services. If you go back to
Listing 9, the AddService controller, you will see that the URLs for the feeds
of the various services are hard-coded in this class. Of course, these could change in
the future, and you might get an error. This is when a debugger comes in handy. Let's
take a look at how to use the Eclipse debugger with our GAE project.
Debugging
A major advantage of using an IDE like Eclipse is that it makes it much
easier to debug applications, even complex Web applications. The first thing we need to
do for our GAE project is create a Debug profile for it. This is similar to our Run
profile, so we simply select the project and click Run > Open Debug dialog.
Figure 10. Debug dialog
Eclipse is smart enough to default to the Run settings we created previously. There is
no modification needed, and you can simply click Debug. Take a look at your
Eclipse console, and you should see some output similar to Listing 11.
Listing 11. Debug output
pydev debugger: warning: psyco not available for debugger speedups
pydev debugger: starting
INFO 2008-06-08 05:18:37,704 appcfg.py] Server: appengine.google.com
INFO 2008-06-08 05:18:37,755 appcfg.py] Checking for updates to the SDK.
INFO 2008-06-08 05:18:38,196 dev_appserver_main.py] Running application aggrogator
on port 8080: http://localhost:8080
The first two lines show output from the PyDev debugger. Now you can set breakpoints
in the project and start debugging. In Figure 11, we are debugging the AddService controller.
Figure 11. Debugging AddService
Now we can start stepping through the code and inspecting variables. If any of our
services change, or we need to add new ones, this will make it easy to find and squash bugs.
Summary
In this article, we have gone from start to a full application in no time. The Google
App Engine SDK
has been installed and hooked up to Eclipse. This allowed us to quickly write, test,
and debug code. We took a look at many key concepts in GAE projects, including URL
routing, interacting with external sites via URL fetching, using presentation
templates, and working with Bigtable. Our application is
ready to be deployed to the GAE, which we will take a look at in Part 2.
Download
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample code | os-eclipse-mashup-google-pt1-aggrogator.zip | 74KB | HTTP |
Resources
Learn
- Read "Charming Python: Python elegance and warts" to learn about the latest goodies in Python.
- Read more of the "Charming Python" series on developerWorks.
- The SDK uses the Web app framework that is similar to Django. You can actually use Django, so you might want to learn about Django in the developerWorks article "Python Web frameworks, Part 1."
- Check out "Get started with open source CMS, Part 6: Build a Python WebDAV client for Jakarta Slide" to see PyDev in action.
- Read all about Google's Bigtable: A Distributed Storage System for Structured Data.
- With a dynamic language like Python, it is always good to have the official Python documentation handy.
- Doing Web development with Eclipse? You might want to read "Discover the Ajax Toolkit Framework for Eclipse."
- Check out the "Recommended Eclipse reading list."
- Browse all the Eclipse content on developerWorks.
- New to Eclipse? Read the developerWorks article "Get started with Eclipse Platform" to learn its origin and architecture, and how to extend Eclipse with plug-ins.
- Expand your Eclipse skills by checking out IBM developerWorks' Eclipse project resources.
- To listen to interesting interviews and discussions for software developers, check out developerWorks podcasts.
- Stay current with developerWorks' Technical events and webcasts.
- Watch and learn about IBM and open source technologies and product functions with the no-cost developerWorks On demand demos.
- Check out upcoming conferences, trade shows, webcasts, and other Events around the world that are of interest to IBM open source developers.
- Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.
Get products and technologies
Discuss
- The Eclipse Platform newsgroups should be your first stop to discuss questions regarding Eclipse. (Selecting this will launch your default Usenet news reader application and open eclipse.platform.)
- The Eclipse newsgroups have many resources for people interested in using and extending Eclipse.
- Participate in developerWorks blogs and get involved in the developerWorks community.
About the author
Michael Galpin has been developing Java software professionally since 1998. He currently works for eBay. He holds a degree in mathematics from the California Institute of Technology.