Cloud services for your virtual infrastructure, Part 2: Platform as a Service (PaaS) and AppScale

This series explores the major types of cloud services and related software you can use to build Web-scale systems. In this article, learn about AppScale and Platform-as-a-Service (PaaS) cloud computing. Explore AppScale's features and architecture, and see how it lets you test your Google App Engine applications on your local resources or on virtualized cloud infrastructures, such as Amazon EC2 or Eucalyptus.

Prabhakar Chaganti, CTO, Ylastic, LLC

Prabhakar Chaganti is the CTO of Ylastic, a start-up that is building a single unified interface to architect, manage, and monitor a user's entire AWS Cloud computing environment: EC2, S3, RDS, AutoScaling, ELB, Cloudwatch, SQS, and SimpleDB. He is the author of Xen Virtualization and GWT Java AJAX Programming, and is also the winner of the community choice award for the most innovative virtual appliance in the VMware Global Virtual Appliance Challenge. He is currently working on a book about Amazon SimpleDB, and can be found on Twitter as @pchaganti.



26 January 2010


Introduction

In this "Cloud services for your virtual infrastructure" series, learn about the three major types of cloud services: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS).

Part 1 explores how IaaS provides a set of building blocks, or services, such as virtual servers, data storage, and databases. Using these services, you can create a platform to deploy and run your applications. Part 1 also introduces Eucalyptus, an open source software infrastructure for implementing cloud computing with clusters or workstation farms.

In this article, learn about PaaS platforms. Get an introduction to Google App Engine and to AppScale architecture. Explore how to use AppScale to run Google App Engine applications using your own clusters or on virtualized infrastructures, such as Amazon EC2 and Eucalyptus.


Platform as a Service (PaaS)

PaaS is a type of cloud service in which the provider hosts software and product development tools on its own hardware infrastructure. The term PaaS is commonly used for cloud-based platforms you can leverage to build and run custom applications. PaaS platforms provide everything you need to build and deploy Web applications and services that are accessible from anywhere on the Internet. Your end users do not have to download, install, or maintain the system.

PaaS offerings provide a set of basic services, including virtual servers, storage, and databases. You can use these services to build an application on top of the PaaS platform with the tools, or APIs, provided by the PaaS platform. Following are some of the most popular PaaS platforms.

Google App Engine
Lets you run your Web applications on Google's infrastructure. You can use Python or Java™ technology to create your Web applications.
Microsoft® Windows® Azure
A Windows-based environment to create cloud applications and services. You can use Microsoft Visual Studio® for the development and deployment of applications on the Azure platform. Azure was available in Community Technology Preview (CTP) for free evaluation through January 2010.
Force.com
A recent initiative from Salesforce.com that provides a development platform for quickly building scalable applications. An Eclipse-based IDE can be used to build, version, and deploy Force.com components and applications on the platform.
Morph
Provides a complete environment called the Morph Application Platform (MAP) for hosting multiple Web applications over its cloud computing infrastructure.
Bungee Connect
A complete application development and deployment platform for building Web applications for the cloud.

See Resources for links to these platforms.

Figure 1. Popular PaaS platforms
Image shows popular PaaS platforms

PaaS platforms, such as Google App Engine or Microsoft Azure, usually provide scalable abstractions through an interface to the underlying infrastructure. Use the Software Development Kit (SDK) provided by the PaaS provider to create and debug applications.


Google App Engine

Google App Engine is a cloud-based platform that lets you run your Web applications on Google's own infrastructure. The applications that run on App Engine can be written in Python or the Java programming language. The ability to run Java Virtual Machine (JVM)-based applications opens up myriad possibilities. You can create applications in several languages other than Java that can run within the JVM:

  • JRuby
  • Scala
  • Clojure
  • Groovy
  • Jython
  • Beanshell

You can scale your applications easily as your traffic grows. There are two different tiers of pricing available for App Engine developers.

Free account
Can be used to create applications that use up to 500 MB of storage and serve up to 5 million page views a month.
Paid account
For applications that will consume more resources than the free version. You can allocate your budget according to your needs. You always control the maximum amount of resources your application can consume and can set limits for the resource consumption.
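
As a rough illustration of the two tiers, the following minimal sketch checks whether an application's monthly usage fits the free-account limits quoted above. The `MonthlyUsage` class and `fits_free_account` function are hypothetical names invented for this example; they are not part of the App Engine SDK, and the limits are simply the figures from this article.

```python
from dataclasses import dataclass

# Free-account limits as quoted in this article; treat them as illustrative.
FREE_STORAGE_MB = 500
FREE_PAGE_VIEWS_PER_MONTH = 5_000_000


@dataclass
class MonthlyUsage:
    """Hypothetical summary of one application's usage for a month."""
    storage_mb: float
    page_views: int


def fits_free_account(usage: MonthlyUsage) -> bool:
    """Return True when usage stays within both free-account limits."""
    return (usage.storage_mb <= FREE_STORAGE_MB
            and usage.page_views <= FREE_PAGE_VIEWS_PER_MONTH)


if __name__ == "__main__":
    small_blog = MonthlyUsage(storage_mb=120, page_views=300_000)
    busy_site = MonthlyUsage(storage_mb=480, page_views=7_500_000)
    print(fits_free_account(small_blog))  # True
    print(fits_free_account(busy_site))   # False
```

An application like `busy_site`, which exceeds the page-view limit, would be a candidate for a paid account with a budget cap.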

App Engine provides scalable abstractions: well-defined APIs that are backed by proprietary Google implementations. These services are available to Python or Java-based programs through an API/SDK.

Service          Provider
Datastore        Google's BigTable, a high-performance proprietary database system for storing large amounts of data in a semi-structured manner
Caching          Memcache, a high-performance distributed memory object-caching system
Authentication   Google accounts for authentication and user management
Mail             Google Mail (Gmail) for sending e-mail

Figure 2. Google App Engine services implementation
Image shows Google AppEngine services implementation

There are some additional restrictions placed on your applications by the App Engine environment, as follows:

  • You are allowed to use only a subset of the standard libraries available with either Python or Java technology.
  • Quotas are enforced for CPU requests, memory, size of files, etc.
  • Any request made to your application must return within 30 seconds.
  • You do not have any access to the file system and can only read static files uploaded as a part of the application.
  • You cannot spawn threads or processes in the App Engine environment.
  • The storage back end used in App Engine is BigTable, a schema-less key-value data store.
  • App Engine can only execute code that is triggered from an HTTP request.

These restrictions may or may not be limiting to your application. App Engine is a great way to build scalable Web applications, and AppScale provides a framework for emulating the Google App Engine environment. AppScale lets you execute and debug App Engine applications locally and transparently over cloud-based infrastructures, such as Amazon EC2 and Eucalyptus.
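
Two of the restrictions listed above, request-triggered execution and the 30-second limit, can be sketched with plain WSGI from the Python standard library. This is a minimal sketch under stated assumptions, not App Engine code: the `process_items` helper and the `DeadlineExceeded` exception are hypothetical names for this example, and the real SDK enforces its deadline itself.

```python
import time
from wsgiref.util import setup_testing_defaults

REQUEST_DEADLINE_SECONDS = 30  # limit quoted in the restrictions list


class DeadlineExceeded(Exception):
    """Hypothetical error raised when work would overrun the deadline."""


def process_items(items, started_at, deadline=REQUEST_DEADLINE_SECONDS,
                  clock=time.monotonic):
    """Process items one at a time, checking the time budget before each."""
    done = []
    for item in items:
        if clock() - started_at >= deadline:
            raise DeadlineExceeded("request ran longer than the deadline")
        done.append(item.upper())
    return done


def application(environ, start_response):
    """WSGI entry point: all work starts from an HTTP request.

    No threads or child processes are spawned; everything happens inline.
    """
    started_at = time.monotonic()
    try:
        result = process_items(["hello", "app", "engine"], started_at)
        status, body = "200 OK", ",".join(result)
    except DeadlineExceeded:
        status, body = "500 Internal Server Error", "deadline exceeded"
    start_response(status, [("Content-Type", "text/plain")])
    return [body.encode("utf-8")]


if __name__ == "__main__":
    # Call the handler directly with a synthetic request environment.
    environ = {}
    setup_testing_defaults(environ)
    print(application(environ, lambda status, headers: None))
```

Checking the budget between small units of work is one way to fail cleanly before the platform cuts the request off.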


AppScale

AppScale (see Resources) is an open source implementation of the Google App Engine APIs from the RACELab at University of California, Santa Barbara. It is a cloud computing platform that makes it easy to execute Google App Engine applications over IaaS clouds, such as Amazon's Elastic Compute Cloud (EC2) or Eucalyptus, which were explored in Part 1.

AppScale brings you the power of App Engine and lets you run App Engine applications using your own clusters. It can also run transparently over IaaS platforms. According to the RACELab team,

"Our goal with AppScale is to provide a Platform-As-A-Service (PaaS) cloud infrastructure that enables users to not only deploy, test, debug, measure, and monitor their GAE applications prior to deployment on Google's proprietary resources, but also to facilitate investigation and extension of the PaaS implementation: services, runtime, interoperation with lower-level cloud fabric, etc."

Figure 3 shows the implementation of services with AppScale.

Figure 3. AppScale services implementation
Image shows AppScale services implementation

Architecture of AppScale

The AppScale environment consists of four primary components. AppScale complements the functions provided by Google App Engine by building upon and extending the SDK from the Google App Engine and implementing the open API exposed by the SDK. The multiple components within AppScale automate the deployment, management, scaling, and fault tolerance of the system for executing App Engine applications.

You can deploy and run Google App Engine applications in AppScale without any modifications to the application. AppScale is not meant to replace or compete with Google's App Engine. It is a framework for experimentation with cloud infrastructures and is not meant to scale up as much as Google's vaunted infrastructure.

The four AppScale components, as shown in Figure 4, are:

AppServer
The main component for executing an App Engine application. It is an extension to the Google App Engine SDK for executing App Engine applications locally. Each AppServer can execute only one application at a time. You can add multiple AppServers in order to host multiple applications.
AppLoadBalancer
The component responsible for distributing the initial requests from users. After a user logs in successfully, the load balancer routes that user to the appropriate AppServer, which handles all further requests for the application; the load balancer is no longer involved. In this sense, it is somewhat different from a traditional load balancer. The load balancer is a Ruby on Rails application, and the load balancing function is provided by the open source Web proxy nginx.
Database Master
The main interface to the data store. It provides access to the various available data store implementations for MySQL, Cassandra, Voldemort, MongoDB, HBase, and HyperTable. Support for other databases, including CouchDB, is forthcoming.
Database slaves
One or more database slaves provide the distributed, scalable, and fault-tolerant data management capability.
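
The idea of a Database Master that fronts several interchangeable data stores can be sketched as a small interface with pluggable back ends. The sketch below is illustrative only: `DatastoreBackend`, `InMemoryBackend`, and `DatabaseMasterSketch` are hypothetical names, and the in-memory dictionary stands in for a real store such as Cassandra or HBase. It is not AppScale's actual code.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class DatastoreBackend(ABC):
    """Hypothetical interface that every data store implementation follows."""

    @abstractmethod
    def put(self, key: str, value: dict) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> Optional[dict]:
        ...


class InMemoryBackend(DatastoreBackend):
    """Stand-in for a real store; keeps entities in a dictionary."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def put(self, key: str, value: dict) -> None:
        self._rows[key] = dict(value)  # copy so callers cannot mutate storage

    def get(self, key: str) -> Optional[dict]:
        row = self._rows.get(key)
        return dict(row) if row is not None else None


class DatabaseMasterSketch:
    """Single entry point that delegates to whichever back end is plugged in."""

    def __init__(self, backend: DatastoreBackend) -> None:
        self._backend = backend

    def put(self, key: str, value: dict) -> None:
        self._backend.put(key, value)

    def get(self, key: str) -> Optional[dict]:
        return self._backend.get(key)


if __name__ == "__main__":
    master = DatabaseMasterSketch(InMemoryBackend())
    master.put("greeting:1", {"author": "alice", "text": "hello"})
    print(master.get("greeting:1"))
    print(master.get("missing"))
```

Because the application only talks to the master, swapping one back end for another does not require changes to application code.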

The components communicate with each other using the AppController, which controls the setup, initialization, and tear-down of all the AppScale instances within the deployment environment. The AppController is also responsible for the deployment of, and authentication for, App Engine applications.

Figure 4. AppScale components
Figure 4 shows the AppScale components

Users of App Engine applications interact with AppServers using SSL. The first login request to the AppScale environment will always go to the load balancer, which will route it to the appropriate application upon successful login.
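
The login-then-route behavior described above can be sketched as follows. This is a simplified model, not AppScale's implementation: the class name, the server addresses, and the round-robin choice among an application's AppServers are all assumptions made for this example.

```python
import itertools
from typing import Dict, Iterator, List


class AppLoadBalancerSketch:
    """Hypothetical router: sends a logged-in user to an AppServer."""

    def __init__(self, app_servers: Dict[str, List[str]]) -> None:
        self._cycles: Dict[str, Iterator[str]] = {}
        for app_id, servers in app_servers.items():
            if not servers:
                raise ValueError(f"no AppServers registered for {app_id!r}")
            # Round-robin over the AppServers of one application
            # (an assumption; the article does not specify the policy).
            self._cycles[app_id] = itertools.cycle(servers)

    def route(self, app_id: str, logged_in: bool) -> str:
        """Return the AppServer address that should handle this user."""
        if not logged_in:
            raise PermissionError("login is required before routing")
        if app_id not in self._cycles:
            raise KeyError(f"unknown application {app_id!r}")
        return next(self._cycles[app_id])


if __name__ == "__main__":
    balancer = AppLoadBalancerSketch(
        {"guestbook": ["https://10.0.0.2", "https://10.0.0.3"]}
    )
    for _ in range(3):
        print(balancer.route("guestbook", logged_in=True))
```

Once `route` has returned an address, the user talks to that AppServer directly, which mirrors the way the load balancer drops out of the request path.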

Developers creating applications to be accessed by users will interact with AppScale using the AppScale Tools toolset. This toolset provides functions that let you set up an AppScale instance and deploy App Engine applications into AppScale. It is conceptually similar to the Amazon EC2 tools. Some of the scripts in this toolset are summarized below.

Script                         Action
appscale-run-instances         Deploy an AppScale instance along with an App Engine application
appscale-upload-app            Upload an App Engine application into a running AppScale instance
appscale-describe-instances    Retrieve utilization statistics, such as CPU and memory utilization, from the AppController and AppServers
appscale-reset-pwd             Reset developer password for the root user/developer
appscale-terminate-instances   Clean up and destroy all the AppScale instances

Figure 5. AppScale architecture
Image shows AppScale architecture

AppScale uses the concept of nodes. A node is a running instance of the AppScale image. An AppScale deployment consists of at least one node and usually several nodes. Each node contains an AppController for communicating with other nodes, along with one or more AppScale components. The node that implements the AppLoadBalancer is called the head node. There is only one head node in an AppScale deployment.

The AppController on the head node is the main controller and has some additional responsibilities:

  • Monitors the AppScale deployment for failed nodes.
  • Grows and shrinks the AppScale deployment according to system demand and developer preferences.
  • Collects application information and resource usage from the other nodes periodically.
  • Is responsible for restarting failed components and respawning nodes if needed.

Figure 6 shows an example.

Figure 6. AppScale nodes
Figure 6 shows AppScale nodes
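
The head node's extra responsibilities listed earlier (detecting failed nodes and growing or shrinking the deployment) can be sketched as two small functions. Everything here is an assumption made for illustration: the heartbeat timeout, the CPU thresholds, and the names `NodeStatus`, `find_failed_nodes`, and `scaling_decision` do not come from AppScale.

```python
from dataclasses import dataclass
from typing import List

HEARTBEAT_TIMEOUT_SECONDS = 60.0  # hypothetical value


@dataclass
class NodeStatus:
    """Hypothetical report a node sends to the head node."""
    name: str
    last_heartbeat: float   # seconds, on the same clock as `now`
    cpu_utilization: float  # fraction between 0.0 and 1.0


def find_failed_nodes(nodes: List[NodeStatus], now: float,
                      timeout: float = HEARTBEAT_TIMEOUT_SECONDS) -> List[str]:
    """Return names of nodes whose last heartbeat is older than the timeout."""
    return [n.name for n in nodes if now - n.last_heartbeat > timeout]


def scaling_decision(nodes: List[NodeStatus], now: float,
                     high: float = 0.8, low: float = 0.2,
                     timeout: float = HEARTBEAT_TIMEOUT_SECONDS) -> str:
    """Return "grow", "shrink", or "hold" based on healthy nodes' CPU load."""
    healthy = [n for n in nodes if now - n.last_heartbeat <= timeout]
    if not healthy:
        return "grow"  # nothing is serving; new nodes are needed
    average = sum(n.cpu_utilization for n in healthy) / len(healthy)
    if average > high:
        return "grow"
    if average < low and len(healthy) > 1:
        return "shrink"
    return "hold"


if __name__ == "__main__":
    reports = [
        NodeStatus("node-1", last_heartbeat=100.0, cpu_utilization=0.9),
        NodeStatus("node-2", last_heartbeat=10.0, cpu_utilization=0.1),
    ]
    print(find_failed_nodes(reports, now=120.0))
    print(scaling_decision(reports, now=120.0))
```

A real controller would run checks like these periodically and also restart failed components, which this sketch leaves out.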

The AppScale image instance is also called a guest virtual machine (GVM). It can execute over the open source IaaS cloud Eucalyptus (discussed in Part 1) or it can execute over the Amazon Web Services EC2 environment. It can also be used on non-virtualized systems using the latest Ubuntu distributions. In the case of Eucalyptus, you can use Xen, KVM, or VMware as the underlying virtualization layer. Figure 7 shows AppScale deployed on Eucalyptus.

Figure 7. AppScale deployed on Eucalyptus
Image shows AppScale deployed on Eucalyptus

Figure 8 shows AppScale deployed on Amazon EC2, which uses Xen as the underlying virtualization framework.

Figure 8. AppScale deployed on EC2
Image shows AppScale deployed on EC2

AppScale was designed with fault tolerance in mind. It can sustain failures in the AppServer, Database Slave, AppLoadBalancer, and AppController components.

The AppScale team is researching various failure scenarios and the effect on the system. They are working on ways to build tolerance for these faults in a deployed AppScale environment.

AppScale opens up the ability to create a complete end-to-end open source cloud stack that can further the utility computing model. This has recently been given the moniker LEAP (for Linux, Eucalyptus, AppScale and Python) by Tim O’Reilly. There are also upcoming projects that are trying to provide a similar solution. The most promising, called TyphoonAE (see Resources), is an open source project for executing Python-based Google App Engine applications with pluggable modules for data storage, messaging, and caching. As of this writing, TyphoonAE is in beta, and the project seems to be developing rapidly.


Benefits of AppScale

AppScale is a great way to test and debug Google App Engine applications locally. AppScale and Eucalyptus together provide a fantastic platform for exploring and researching cloud computing.

Open source
AppScale was created to research cloud computing platforms. It is freely available in source form, making it easy to peek beneath the covers or to easily create extensions of the platform to meet your needs. The pace of development on the platform is very rapid. Features and improvements are being added at a fast clip.
Great for experimentation
AppScale is a great platform for experimenting with cloud fabrics and ideas. It's easy to try out new ways of running cloud-based applications. The platform is also easy to use and easily extensible.
Private cloud
AppScale, along with Eucalyptus, can be installed within your data center behind your firewall as a private test cloud running on your own infrastructure. You get the benefits of complete control over security and the environment.
App Engine compatibility
The App Engine applications you create and run on the AppScale framework can be easily deployed to the actual Google App Engine environment once your testing is complete.

Conclusion

In this article, you learned how Platform as a Service (PaaS) cloud computing offerings provide a set of basic services, including virtual servers, storage, and databases. You can use these services to build an application on top of the PaaS platform with the tools or APIs provided by the PaaS platform. You explored how to use AppScale for executing Google App Engine applications on your own clusters or on Infrastructure as a Service (IaaS) systems, such as Amazon EC2 and Eucalyptus.

Resources

Learn

  • Explore the world of Amazon Web Services, which provides companies of all sizes with an infrastructure Web services platform in the cloud.
  • Learn more about IaaS platforms, such as the Amazon Web Services Elastic Compute Cloud (EC2) and Eucalyptus.
  • Follow Eucalyptus updates from @eucalyptuscloud on Twitter.
  • For AppScale:
  • For Google App Engine:
    • The applications gallery for App Engine shows sample applications and types of applications that you can build and deploy.
    • TyphoonAE is an open source project providing a full-featured and productive serving environment to run Google App Engine (Python) applications.
    • The Getting Started guide for Python and Java for App Engine provides comprehensive documentation for building applications using the GAE SDK. It also has a description of the API.
  • Check out Force.com, the recent initiative from Salesforce.com, which provides a development platform for quickly building scalable applications.
  • Morph provides a complete environment called the Morph Application Platform (MAP) for hosting multiple Web applications over its cloud computing infrastructure.
  • Learn more about Bungee Connect, the complete application development and deployment platform for building Web applications for the cloud.
  • Use the Ubuntu Launchpad for tracking changes to the AppScale packages in the Ubuntu distribution.
  • Memcached is a high-performance distributed memory object-caching system.
  • To listen to interesting interviews and discussions for software developers, check out developerWorks podcasts.
  • Stay current with developerWorks' Technical events and webcasts.
  • Follow developerWorks on Twitter.
  • Check out upcoming conferences, trade shows, webcasts, and other Events around the world that are of interest to IBM open source developers.
  • Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products, as well as our most popular articles and tutorials.
  • The My developerWorks community is an example of a successful general community that covers a wide variety of topics.
  • Watch and learn about IBM and open source technologies and product functions with the no-cost developerWorks On demand demos.
