In this "Cloud services for your virtual infrastructure" series, learn about the three major types of cloud services: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS).
Part 1 explores how IaaS provides a set of building blocks, or services, such as virtual servers, data storage, and databases. Using these services, you can create a platform to deploy and run your applications. Also learn about Eucalyptus, an open source software infrastructure for implementing cloud computing with clusters or workstation farms.
In this article, learn about PaaS platforms. Get an introduction to Google App Engine and to AppScale architecture. Explore how to use AppScale to run Google App Engine applications using your own clusters or on virtualized infrastructures, such as Amazon EC2 and Eucalyptus.
Platform as a Service (PaaS)
PaaS is a type of cloud service that provides software and product development tools hosted by the provider on the hardware infrastructure. The term PaaS is commonly used for cloud-based platforms you can leverage to build and run custom applications. PaaS applications provide everything you need to build and deploy Web applications and services accessible from anywhere on the Internet. Your end users do not have to download, install, or maintain the system.
PaaS offerings provide a set of basic services, including virtual servers, storage, and databases. You can use these services to build an application on top of the PaaS platform with the tools, or APIs, provided by the PaaS platform. Following are some of the most popular PaaS platforms.
- Google App Engine
- Lets you run your Web applications on Google's infrastructure. You can use Python or Java™ technology to create your Web applications.
- Microsoft® Windows® Azure
- A Windows-based environment to create cloud applications and services. You can use Microsoft Visual Studio® for the development and deployment of applications on the Azure platform. Azure was available in Community Technology Preview (CTP) for free evaluation through January 2010.
- Force.com
- A recent initiative from Salesforce.com that provides a development platform for quickly building scalable applications. An Eclipse-based IDE can be used to build, version, and deploy Force.com components and applications on the platform.
- Morph
- Provides a complete environment called the Morph Application Platform (MAP) for hosting multiple Web applications over its cloud computing infrastructure.
- Bungee Connect
- A complete application development and deployment platform for building Web applications for the cloud.
See Resources for links to these platforms.
Figure 1. Popular PaaS platforms
PaaS platforms such as Google App Engine and Microsoft Azure usually expose their underlying infrastructure through scalable abstractions. You use the Software Development Kit (SDK) supplied by the PaaS provider to create and debug applications.
Google App Engine
Google App Engine is a cloud-based platform that lets you run your Web applications on Google's own infrastructure. The applications that run on App Engine can be written in Python or the Java programming language. The ability to run Java Virtual Machine (JVM)-based applications opens up myriad possibilities: you can also create applications in languages other than Java, as long as they run within the JVM.
You can scale your applications easily as your traffic grows. There are two different tiers of pricing available for App Engine developers.
- Free account
- Can be used to create applications that consume up to 500 MB of storage and up to 5 million page views a month.
- Paid account
- For applications that will consume more resources than the free version. You can allocate your budget according to your needs. You always control the maximum amount of resources your application can consume and can set limits for the resource consumption.
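As a rough illustration of the free-account limits quoted above, the following Python sketch checks whether an estimated workload stays within them. The constant and function names are hypothetical, and the figures are the ones stated in this article, not values read from any App Engine API.

```python
# Limits for the free App Engine account as quoted in this article.
FREE_STORAGE_MB = 500
FREE_PAGE_VIEWS_PER_MONTH = 5_000_000


def fits_free_account(storage_mb, page_views_per_month):
    """Return True if the estimated usage stays within the free-account limits."""
    return (storage_mb <= FREE_STORAGE_MB
            and page_views_per_month <= FREE_PAGE_VIEWS_PER_MONTH)


def daily_page_view_budget(days_in_month=30):
    """Average page views per day that still fit the monthly free limit."""
    return FREE_PAGE_VIEWS_PER_MONTH // days_in_month
```

For example, an application that stores 200 MB and serves one million page views a month fits the free account, while one that stores 600 MB does not, regardless of traffic.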
App Engine provides scalable abstractions of well-defined APIs to proprietary Google implementations. They are available to Python or Java-based programs using an API/SDK.
|Datastore||Google's BigTable, a high-performance proprietary database system for storing large amounts of data in a semi-structured manner|
|Caching||Memcache, a high-performance distributed memory object-caching system|
|Authentication||Google accounts for authentication and user management|
|Mail||Google Mail (Gmail) for sending e-mail|
Figure 2. Google App Engine services implementation
There are some additional restrictions placed on your applications by the App Engine environment, as follows:
- You are allowed to use only a subset of the standard libraries available with either Python or Java technology.
- Quotas are enforced for CPU requests, memory, size of files, etc.
- Any request made to your application must return within 30 seconds.
- You do not have any access to the file system and can only read static files uploaded as a part of the application.
- You cannot spawn threads or processes in the App Engine environment.
- The storage back end used in App Engine is BigTable, a schema-less key value data store.
- App Engine can only execute code that is triggered from an HTTP request.
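A minimal sketch of what code written within these restrictions can look like, using only Python's standard `wsgiref` module rather than the App Engine SDK. The names `application`, `DEADLINE_SECONDS`, `remaining_budget`, and `call_once` are assumptions made for illustration; the SDK's own request-handling API is not shown here.

```python
import time
from wsgiref.util import setup_testing_defaults

DEADLINE_SECONDS = 30  # request time limit quoted in the list above


def remaining_budget(started_at, now=None):
    """Seconds left before the request would exceed the deadline."""
    now = time.monotonic() if now is None else now
    return DEADLINE_SECONDS - (now - started_at)


def application(environ, start_response):
    """Handle one HTTP request: no threads, no file writes, no state kept."""
    started_at = time.monotonic()
    name = environ.get("QUERY_STRING", "") or "world"
    body = ("Hello, %s" % name).encode("utf-8")
    if remaining_budget(started_at) <= 0:
        start_response("500 Internal Server Error",
                       [("Content-Type", "text/plain")])
        return [b"deadline exceeded"]
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_once(query_string=""):
    """Invoke the handler directly with a synthetic request, for local testing."""
    environ = {}
    setup_testing_defaults(environ)
    environ["QUERY_STRING"] = query_string
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

All work happens inside the handler, which is only ever entered through a request. Nothing is written to disk and no background thread outlives the response, which is the shape the restrictions above push an application toward.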
These restrictions may or may not be limiting to your application. App Engine is a great way to build scalable Web applications, and AppScale provides a framework for emulating the Google App Engine environment. AppScale lets you execute and debug App Engine applications locally and transparently over cloud-based infrastructures, such as Amazon EC2 and Eucalyptus.
AppScale (see Resources) is an open source implementation of the Google App Engine APIs from the RACELab at University of California, Santa Barbara. It is a cloud computing platform that makes it easy to execute Google App Engine applications over IaaS clouds such as Amazon's Elastic Compute Cloud (EC2) or Eucalyptus, both of which were explored in Part 1.
AppScale brings you the power of App Engine and lets you run App Engine applications using your own clusters. It can also run transparently over IaaS platforms. According to the RACELab team,
"Our goal with AppScale is to provide a Platform-As-A-Service (PaaS) cloud infrastructure that enables users to not only deploy, test, debug, measure, and monitor their GAE applications prior to deployment on Google's proprietary resources, but also to facilitate investigation and extension of the PaaS implementation: services, runtime, interoperation with lower-level cloud fabric, etc."
Figure 3 shows the implementation of services with AppScale.
Figure 3. AppScale services implementation
Architecture of AppScale
The AppScale environment consists of four primary components. AppScale complements the functions provided by Google App Engine by building upon and extending the SDK from the Google App Engine and implementing the open API exposed by the SDK. The multiple components within AppScale automate the deployment, management, scaling, and fault tolerance of the system for executing App Engine applications.
You can deploy and run Google App Engine applications in AppScale without any modifications to the application. AppScale is not meant to replace or compete with Google's App Engine. It is a framework for experimentation with cloud infrastructures and is not meant to scale up as much as Google's vaunted infrastructure.
The four AppScale components, as shown in Figure 4, are:
- AppServer
- The main component for executing an App Engine application. It is an extension to the Google App Engine SDK for executing App Engine applications locally. Each AppServer can execute only one application at a time. You can add multiple AppServers in order to host multiple applications.
- AppLoadBalancer
- The component responsible for distributing the initial requests from users. After the user logs in successfully, the load balancer routes the request to the appropriate AppServer, which handles all further requests for that application; the load balancer is no longer involved. In this sense, it is somewhat different from a traditional load balancer. The load balancer is a Ruby on Rails application, and the load balancing function is provided by the open source Web proxy nginx.
- Database Master
- The main interface to the data store. It provides access to the various available data store implementations for MySQL, Cassandra, Voldemort, MongoDB, HBase, and HyperTable. Support for other databases, including CouchDB, is forthcoming.
- Database Slaves
- One or more database slaves provide the distributed, scalable, and fault-tolerant data management capability.
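The idea of one interface in front of several interchangeable data stores can be sketched in Python as follows. This is an illustration only: the class names `DatastoreBackend`, `InMemoryBackend`, and `DatabaseMasterSketch` are hypothetical, and an in-memory dictionary stands in for real back ends such as Cassandra or HBase. It does not reflect AppScale's actual code.

```python
from abc import ABC, abstractmethod


class DatastoreBackend(ABC):
    """Minimal key-value contract every back end is assumed to satisfy."""

    @abstractmethod
    def put(self, key, value):
        """Store value under key."""

    @abstractmethod
    def get(self, key):
        """Return the value stored under key, or None if absent."""


class InMemoryBackend(DatastoreBackend):
    """Stand-in for a real data store; keeps entities in a dict."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)


class DatabaseMasterSketch:
    """Single entry point that forwards calls to the configured back end."""

    def __init__(self):
        self._backends = {}
        self._active = None

    def register(self, name, backend):
        self._backends[name] = backend

    def use(self, name):
        if name not in self._backends:
            raise KeyError("unknown back end: %s" % name)
        self._active = self._backends[name]

    def _require_active(self):
        if self._active is None:
            raise RuntimeError("no back end selected; call use() first")
        return self._active

    def put(self, key, value):
        self._require_active().put(key, value)

    def get(self, key):
        return self._require_active().get(key)
```

Because callers only see `put` and `get`, swapping the back end does not require changing the application, which is the property the Database Master description relies on.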
The components communicate with each other using the AppController, which controls the setup, initialization, and tear-down of all the AppScale instances within the deployment environment. The AppController is also responsible for the deployment of, and authentication for, App Engine applications.
Figure 4. AppScale components
Users of App Engine applications interact with AppServers using SSL. The first login request to the AppScale environment will always go to the load balancer, which will route it to the appropriate application upon successful login.
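The login-then-route flow can be modeled with a short Python sketch. Everything here is hypothetical: the class names, the addresses, and in particular the round-robin choice among AppServers hosting the same application, which is an assumption made for the example and not a description of how nginx is configured in AppScale.

```python
class AppServer:
    """Hosts exactly one application, as described for AppScale AppServers."""

    def __init__(self, address, app_name):
        self.address = address
        self.app_name = app_name


class LoadBalancerSketch:
    """Handles the first request only: checks login, then hands out an AppServer."""

    def __init__(self, app_servers, valid_users):
        self._servers_by_app = {}
        for server in app_servers:
            self._servers_by_app.setdefault(server.app_name, []).append(server)
        self._valid_users = set(valid_users)
        self._next_index = {}

    def first_request(self, user, app_name):
        """Return the AppServer address the user should talk to from now on."""
        if user not in self._valid_users:
            raise PermissionError("login failed for %s" % user)
        servers = self._servers_by_app.get(app_name)
        if not servers:
            raise LookupError("no AppServer hosts %s" % app_name)
        # Round-robin among AppServers hosting the same application (an assumption).
        index = self._next_index.get(app_name, 0)
        self._next_index[app_name] = (index + 1) % len(servers)
        return servers[index].address
```

Once `first_request` has returned an address, the sketch has no further role for that user, mirroring the way the load balancer steps aside after the initial login.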
Developers creating applications to be accessed by users will interact with AppScale using the AppScale Tools toolset. This toolset provides functions that let you set up an AppScale instance and deploy App Engine applications into AppScale. It is conceptually similar to the Amazon EC2 tools. Some of the scripts in this toolset are summarized below.
|appscale-run-instances||Deploy an AppScale instance along with an App Engine application|
|appscale-upload-app||Upload an App Engine application into a running AppScale instance|
|appscale-describe-instances||Retrieve utilization statistics, such as CPU and memory utilization, from the AppController and AppServers|
|appscale-reset-pwd||Reset developer password for the root user/developer|
|appscale-terminate-instances||Clean up and destroy all the AppScale instances|
Figure 5. AppScale architecture
AppScale uses the concept of nodes. A node is a running instance of the AppScale image. An AppScale deployment consists of at least one node and usually several. Each node contains an AppController for communicating with other nodes, along with one or more AppScale components. The node that implements the AppLoadBalancer is called the head node. There is only one head node in an AppScale deployment.
The AppController on the head node is the main controller and has some additional responsibilities:
- Monitors the AppScale deployment for failed nodes.
- Grows and shrinks the AppScale deployment according to system demand and developer preferences.
- Collects application information and resource usage from the other nodes periodically.
- Is responsible for restarting failed components and respawning nodes if needed.
Figure 6 shows an example.
Figure 6. AppScale nodes
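The head node's extra responsibilities (collecting periodic reports, spotting failed nodes, and deciding whether to grow or shrink) can be sketched in Python as follows. The class name, the heartbeat timeout, and the load thresholds are all assumptions chosen for illustration; AppScale's real AppController is not implemented this way as far as this article states.

```python
class HeadControllerSketch:
    """Tracks node reports and decides which nodes have failed and how to scale."""

    def __init__(self, heartbeat_timeout=30.0):
        self.heartbeat_timeout = heartbeat_timeout
        self._last_seen = {}  # node id -> time of last report
        self._usage = {}      # node id -> last reported CPU load (0.0 to 1.0)

    def report(self, node_id, cpu_load, now):
        """Record a periodic status report from a node."""
        self._last_seen[node_id] = now
        self._usage[node_id] = cpu_load

    def failed_nodes(self, now):
        """Nodes that have not reported within the timeout, sorted by id."""
        return sorted(node for node, seen in self._last_seen.items()
                      if now - seen > self.heartbeat_timeout)

    def scaling_decision(self, now, high=0.8, low=0.2):
        """Return 'grow', 'shrink', or 'hold' from the average load of live nodes."""
        failed = set(self.failed_nodes(now))
        loads = [load for node, load in self._usage.items() if node not in failed]
        if not loads:
            return "grow"  # nothing alive: capacity is needed
        average = sum(loads) / len(loads)
        if average > high:
            return "grow"
        if average < low and len(loads) > 1:
            return "shrink"
        return "hold"
```

A controller built this way would respawn whatever `failed_nodes` returns and then act on `scaling_decision`; passing `now` explicitly keeps the logic easy to test without real clocks.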
The AppScale image instance is also called a guest virtual machine (GVM). It can execute over the open source IaaS cloud Eucalyptus (discussed in Part 1) or it can execute over the Amazon Web Services EC2 environment. It can also be used on non-virtualized systems using the latest Ubuntu distributions. In the case of Eucalyptus, you can use Xen, KVM, or VMware as the underlying virtualization layer. Figure 7 shows AppScale deployed on Eucalyptus.
Figure 7. AppScale deployed on Eucalyptus
As shown in Figure 8, Amazon EC2 uses Xen as the underlying virtualization framework.
Figure 8. AppScale deployed on EC2
AppScale was designed with fault tolerance in mind. It can sustain failures in the AppServer, Database Slave, AppLoadBalancer, and AppController components.
The AppScale team is researching various failure scenarios and the effect on the system. They are working on ways to build tolerance for these faults in a deployed AppScale environment.
AppScale opens up the ability to create a complete end-to-end open source cloud stack that can further the utility computing model. This has recently been given the moniker LEAP (for Linux, Eucalyptus, AppScale and Python) by Tim O’Reilly. There are also upcoming projects that are trying to provide a similar solution. The most promising, called TyphoonAE (see Resources), is an open source project for executing Python-based Google App Engine applications with pluggable modules for data storage, messaging, and caching. As of this writing, TyphoonAE is in beta, and the project seems to be developing rapidly.
Benefits of AppScale
AppScale is a great way to test and debug Google App Engine applications locally. AppScale and Eucalyptus together provide a fantastic platform for exploring and researching cloud computing.
- Open source
- AppScale was created to research cloud computing platforms. It is freely available in source form, making it easy to peek beneath the covers or to easily create extensions of the platform to meet your needs. The pace of development on the platform is very rapid. Features and improvements are being added at a fast clip.
- Great for experimentation
- AppScale is a great platform for experimenting with cloud fabrics and ideas. It's easy to try out new ways of running cloud-based applications. The platform is also easy to use and easily extensible.
- Private cloud
- AppScale, along with Eucalyptus, can be installed within your data center behind your firewall as a private test cloud running on your own infrastructure. You get the benefits of complete control over security and the environment.
- App Engine compatibility
- The App Engine applications you create and run on the AppScale framework can be easily deployed to the actual Google App Engine environment once your testing is complete.
In this article, you learned how the Platform as a Service (PaaS) cloud computing offerings provide a set of basic services, including virtual servers, storage, and databases. You can use these services to build an application on top of the PaaS platform with the tools or APIs provided by the PaaS platform. You explored how to use AppScale for executing Google App Engine applications on virtualized infrastructures, or on Infrastructure as a Service (IaaS) systems such as Amazon EC2 and Eucalyptus.
- Explore the world of Amazon Web Services, which provides companies of all sizes with an infrastructure Web services platform in the cloud.
- Learn more about IaaS platforms, such as the Amazon Web Services Elastic Compute Cloud (EC2) and Eucalyptus.
- Follow Eucalyptus updates from @eucalyptuscloud on Twitter.
- The AppScale project wiki contains pointers for deploying and running AppScale on various virtualized infrastructures.
- View the presentation "AppScale: Scalable and Open AppEngine Development and Deployment" at Cloudcomp 2009.
- On Google Groups, the AppScale Community is a discussion forum.
- The Apache Hadoop project is the home for some of the technologies, such as HBase and HDFS, used by AppScale.
- AppScale is an open source implementation of the Google App Engine APIs from the RACELab at University of California, Santa Barbara.
For Google App Engine:
- TyphoonAE is an open source project providing a full-featured and productive serving environment to run Google App Engine (Python) applications.
- The Getting Started guide for Python and Java for App Engine provides comprehensive documentation for building applications using the GAE SDK. It also has a description of the API.
- Check out Force.com, the recent initiative from Salesforce.com, which provides a development platform for quickly building scalable applications.
- Morph provides a complete environment called the Morph Application Platform (MAP) for hosting multiple Web applications over their cloud computing infrastructure.
- Use the Ubuntu Launchpad for tracking changes to the AppScale packages in the Ubuntu distribution.
- Memcached is a high-performance distributed memory object-caching system.
- To listen to interesting interviews and discussions for software developers, check out developerWorks podcasts.
- Stay current with developerWorks' Technical events and webcasts.
- Follow developerWorks on Twitter.
- Check out upcoming conferences, trade shows, webcasts, and other Events around the world that are of interest to IBM open source developers.
- Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.
- The My developerWorks community is an example of a successful general community that covers a wide variety of topics.
- Watch and learn about IBM and open source technologies and product functions with the no-cost developerWorks On demand demos.
Get products and technologies
- Get the code for AppScale, the open source platform for Google App Engine apps.
- Sign up for Google App Engine and explore the Python or Java SDK.
- You can download tools and the SDK for building Azure-based applications.
- Innovate your next open source development project with IBM trial software, available for download or on DVD.
- Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
- Participate in developerWorks blogs and get involved in the developerWorks community.