Realities of open source cloud computing, Part 1: Not all clouds are equal

Picking from a profusion of platforms

Your CTO wants to know your cloud computing strategy — and wants to know it tomorrow. There are a lot of choices, with many differences and similarities. This article explores some of the options for an organization that wants to leverage the power and promise of cloud computing, with a focus on open source technologies. Learn about several of the providers, such as Amazon, Microsoft®, Google, IBM®, Aptana, Heroku, Mosso, Ning, and Salesforce. Review the relative strengths and weaknesses of each platform, and what types of open source and proprietary technologies are supported on each platform. Learn how to pick the platform that fits your needs.

Michael Galpin, Software Architect, eBay

Michael Galpin has been developing Java software professionally since 1998. He currently works for eBay. He holds a degree in mathematics from the California Institute of Technology.



07 April 2009


About this series

In this three-part "Realities of open source cloud computing" series, learn how to determine if cloud computing can help you and how to plan your cloud computing strategy. In this first part, learn the benefits of cloud computing, types of clouds, and high-level choices in cloud computing platforms. Upcoming articles will explore designing and developing for the cloud and how to manage an application in the cloud.


Cloud computing: More than marketing hype?

If you've been working in technology for a while, you might be thinking, "Haven't we heard all of this before?" Is cloud computing just another over-hyped technology, the latest silver bullet that is supposed to solve all of your problems? Or is there more to it? The answer is "Yes" and "Yes." There's no denying the hype around cloud computing. This article discusses several available cloud computing options, but the list is far from exhaustive, because many vendors are trying to cash in on the hype. Still, hype or not, there are very real and tangible benefits to cloud computing.

There are some aspects of cloud computing to be wary of before moving to the cloud. In this article, learn about the benefits and challenges of cloud computing.

About cloud computing

Types of clouds

Wikipedia defines cloud computing as "Internet-based development and use of computer technology." That's a broad description, and many types of offerings can be classified as cloud computing. A large group of cloud offerings are variations of Software as a Service (SaaS). Examples of SaaS include Web applications, such as Zoho (word processing, spreadsheet), Salesforce (CRM), and SlideRocket (presentations), and Web services, such as Google Search, Yahoo! Weather, and PayPal. These are all great examples of cloud computing, but they're probably not what an enterprise has in mind when it wants to run its own applications in the cloud. They can, however, be complementary to other types of cloud computing.

The type of cloud computing you might be looking for is a type of infrastructure often described as Platform as a Service (PaaS). Some of the most common examples of PaaS are various types of cloud data storage, such as unstructured data storage with Amazon's Simple Storage Service (S3) or IBM's Scale out File Service (SOFS). Both of these are distributed file systems. S3 is accessible through a Web services interface, while SOFS is accessible through file protocols, such as NFS and FTP. Amazon also offers structured data storage with its SimpleDB service. SimpleDB allows structured data to be saved and queried through a Web services interface.

Computing is certainly more than just storage, and that's where cloud computing platforms come in. The platforms provide a way for you to take code and execute it on a cloud platform. This can certainly be combined with cloud storage and cloud Web services. There are many platforms available, with a wide variety of relative advantages and disadvantages.

Benefits

Why would you want to run your code on a cloud platform instead of your own computers? There are several simple, practical reasons. You don't have to buy and set up all of those computers. If that were the only aspect of cloud computing, it would be no different from a hosting service. The major advantage of cloud computing is being able to quickly turn applications on ("spin up") or off and to elastically grow your computing power on an as-needed basis. At the very least, any cloud computing platform can seamlessly provide greater and greater computing resources on demand. Some platforms also provide common, general-usage development platforms on top of on-demand computing.
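The idea of growing and shrinking capacity with demand can be sketched in a few lines of Python. This is only an illustration: the function name, the per-instance capacity of 100 requests per second, and the load figures are all hypothetical, and no particular platform's scaling API is implied.

```python
import math


def instances_needed(requests_per_second: float,
                     capacity_per_instance: float,
                     minimum: int = 1) -> int:
    """Return how many instances are needed to serve the given load.

    capacity_per_instance is an assumed figure: the number of requests
    per second that a single instance can handle.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    return max(minimum, math.ceil(requests_per_second / capacity_per_instance))


# Hypothetical load samples (requests per second) with a spike in the middle.
hourly_load = [40, 35, 120, 900, 450, 60]

# Instances to run in each period: capacity follows the load up and back down.
plan = [instances_needed(load, capacity_per_instance=100) for load in hourly_load]
print(plan)
```

With a fixed set of your own machines, you would have to provision for the peak in every period. With elastic capacity, you would run the larger number of instances only while the spike lasts.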

In short, cloud computing allows your organization to quickly deploy applications and grow them to meet the needs of your business. Sounds great, but there are some challenges associated with cloud computing you should be aware of.

Challenges

It's easy to focus entirely on the benefits of cloud computing, but there are downsides. One of the most obvious issues is that the data that powers your application lives in the cloud, along with your application. Your data could be sensitive, such as personally identifiable information about your customers or their financial instruments and transaction records. You could also have nonsensitive data that is still extremely valuable, such as aggregate information about your users and how they use your application. With critical information being stored in the cloud, you must understand whether the platform is secure.

Who accesses your data in the cloud is not the only thing to worry about. The integrity of that data is just as important. Machine failure has to be expected, so it's crucial that your data can be backed up and restored in case of failures. Does a platform offer data backup and recovery, or at least make it possible for customers who need this? The reliability of your application is obviously very important. What kinds of service-level agreements are offered by a particular platform? These, and other important questions, are explored as this article examines some of the available platforms.
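One common way to check the integrity of backed-up data is to record a checksum when the backup is made and verify it on restore. The following Python sketch shows the idea under stated assumptions: a plain dictionary stands in for whatever cloud storage you use, and the function names are hypothetical rather than part of any platform's API.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def backup(store: dict, key: str, data: bytes) -> str:
    """Copy data into the backup store and return its checksum.

    The dict is a stand-in for a real storage service.
    """
    store[key] = bytes(data)
    return sha256_hex(data)


def restore(store: dict, key: str, expected_checksum: str) -> bytes:
    """Fetch data from the store, refusing to return it if it was altered."""
    data = store[key]
    if sha256_hex(data) != expected_checksum:
        raise ValueError(f"backup {key!r} failed its integrity check")
    return data


# Usage: keep the checksum somewhere separate from the backup itself.
storage = {}
records = b"customer-id,balance\n1001,250.00\n"
checksum = backup(storage, "accounts-2009-04-07", records)
assert restore(storage, "accounts-2009-04-07", checksum) == records
```

A checksum only detects corruption; it does not repair it. You still need more than one copy of the data to recover from a failure.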

Platforms

There are a lot of cloud computing platforms to choose from. The list here is far from complete, but should give you an idea of the more popular choices and the fundamental differences among them. We'll pay special attention to the programming languages and open source technologies supported on each platform, and how each platform addresses some of the thornier issues of cloud computing. To help you navigate such a large list, the platforms are loosely classified as basic or specialized.

Basic platforms are minimal offerings — just (virtual) hardware and maybe an operating system. They tend to be more flexible, as they have fewer limitations.

Specialized platforms provide some type of programming environment and services on top of a basic platform. Specialized platforms are usually simpler and often provide some unique services.


Basic platforms

If you want the greatest freedom to configure your systems in the cloud, you probably want a basic platform. You can specify some hardware-like specifications, such as a type of processor, possibly of a certain speed, with a certain amount of memory, etc. From there, you're free to create whatever you need. It is very much like a hosting service, but one that grows and shrinks to meet your needs. This section discusses four providers: Amazon, IBM, Joyent, and Mosso.

Amazon Elastic Compute Cloud

Amazon's Elastic Compute Cloud (EC2) was one of the first cloud computing platforms and is still one of the most popular. It's a common saying that "you will never get fired for going with Amazon." EC2 is a great example of a basic platform.

To start working with EC2, you need an Amazon Machine Image (AMI). An AMI is a full machine image, with operating system, applications, etc. There are many common AMIs available from Amazon and the EC2 community, with either Microsoft Windows® or Linux®, plus various suites of open source software, such as the Apache Web server, MySQL, and the Python interpreter. If you do not find an AMI that suits your needs, Amazon provides tools for creating your own AMI that you can either keep private or share with the community.

An AMI can be deployed to an "instance" of various sizes. As of this writing, a small instance has a single 1-GHz core with 1.7 GB of memory and 160 GB of disk space. At the other end of the spectrum is an extra-large instance with four cores running at 2 GHz each, 15 GB of memory, and 1.6 TB of disk space. There are also more specialized sizes designed for computationally heavy tasks. You just pick the size you need and deploy your AMI. All administration and control of your instance is done through Web services. A large ecosystem has grown around these Web services to make it easier to manage EC2 instances. For example, there is a Firefox extension called Elasticfox that can be used to manage and launch AMIs straight from Firefox.
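Picking a size comes down to matching your requirements against what each size offers. The Python sketch below uses only the two sizes and figures quoted above (treating 1.6 TB as 1600 GB); the class and function names are hypothetical, and a real catalog would contain more sizes and would change over time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceSize:
    name: str
    cores: int
    memory_gb: float
    disk_gb: float


# Figures as quoted in the text, listed from smallest to largest.
SIZES = [
    InstanceSize("small", cores=1, memory_gb=1.7, disk_gb=160),
    InstanceSize("extra-large", cores=4, memory_gb=15, disk_gb=1600),
]


def smallest_fit(cores: int, memory_gb: float, disk_gb: float) -> Optional[InstanceSize]:
    """Return the first (smallest) size that meets every requirement, or None."""
    for size in SIZES:
        if (size.cores >= cores
                and size.memory_gb >= memory_gb
                and size.disk_gb >= disk_gb):
            return size
    return None


# A modest Web server fits on the small size; a larger database server does not.
print(smallest_fit(cores=1, memory_gb=1.0, disk_gb=100))
print(smallest_fit(cores=2, memory_gb=8.0, disk_gb=500))
```

When nothing fits, the function returns None, which in practice would mean splitting the workload across several instances.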

EC2 is powered by the open source Xen virtualization software. With EC2, you can run virtually any type of software you want. Various flavors of Linux are commonly used as the operating system for AMIs. Whatever programming language you want is available: the Java™ programming language, PHP, Python, etc. It's possible to use proprietary software on EC2, but the elastic nature of EC2 makes open source software very attractive. With open source software, you do not have to be concerned about licensing when you move to bigger instances or add more of them.

Amazon provides a wide range of infrastructure services to go along with EC2, which you can use to address issues like data reliability and backup. Amazon's S3 service is a great option for backing up your data. Backup on EC2 is, however, very much a do-it-yourself model. Administration of and access to the Amazon cloud are done exclusively through its Web services, which require two-factor authentication.

IBM Blue Cloud

When Amazon first entered the cloud computing space, many people were surprised. When IBM entered the space, nobody was surprised. The Blue Cloud was announced in late 2007 and promises to provide all of the basics of cloud computing. Customers can pick from the more-common x86 hardware or higher-end hardware based on POWER®. The Blue Cloud leverages IBM's Tivoli® software for automatically provisioning systems of various capabilities (CPU/RAM/disk), which lets your organization potentially tap into huge computing power but pay for it only as needed. IBM is also pioneering "private" clouds, bringing the benefits of cloud computing to internal, inside-the-firewall applications.

IBM's Blue Cloud is an emerging technology, so you'll want to check the latest information about what types of technology it supports. IBM is one of the greatest backers of open source technology, making IBM an attractive choice for applications that highly leverage open source technology.

Joyent Accelerator

Joyent might not be a household name like Amazon or IBM, but it has quickly earned an impressive reputation as a cloud computing platform provider for Web-based start-ups. Joyent Accelerator gives you much of the flexibility of traditional hosting providers, but with the on-demand computing that is key to cloud computing. With it, you can quickly spin up an instance complete with PHP, the Java language, or Ruby on Rails preconfigured and ready to use. You pick how much computing power you need. Everything runs on OpenSolaris, so you can use all of the usual tools for accessing and managing the assets deployed to it, such as SSH and FTP.

Joyent's cloud computing is designed with scalability in mind. Even its most affordable offerings are designed to handle bursts of use. This has made Joyent very popular for organizations creating Facebook applications that don't usually need much power, but can experience dramatic spikes in usage.

With Joyent, any technology compatible with OpenSolaris is supported. This includes any of the open source LAMP technologies and programming languages, and other programming languages such as the Java language and Ruby. Joyent allows you to leverage any existing Linux or UNIX® tools available for securing and maintaining your site and data.

Mosso

Mosso, a subsidiary of the well-known hosting provider Rackspace, has a few different offerings in cloud computing. Mosso's Cloud Sites straddle the line between a basic and specialized platform. There are two basic Cloud Site configurations available. One is powered by open source software: the classic LAMP setup. The other configuration is a Windows Server with the IIS Web server and the SQL Server database. You pick the configuration and pay for bandwidth, storage, and CPU cycles as needed.

Mosso has announced that it will also offer a new product called Cloud Servers, which will be Linux systems, but will allow for complete flexibility in their configuration. Mosso's Cloud Sites are popular, as they provide the basic building blocks needed by many applications. You could describe them as a basic specialized platform. With that in mind, the next section looks at the more specialized platforms that are available.


Specialized platforms

The term "specialized" is obviously somewhat subjective. What exactly makes a cloud computing platform specialized? The platforms in this section all offer extra features on top of the basic platforms described above. Sometimes, the features are unique development environments; sometimes, they're extra services that are integrated into the platform; and sometimes, they're convenience features. This section explores the following specialized platforms: Microsoft Azure, Google App Engine, Aptana Cloud, Heroku, Ning, and Salesforce.

Microsoft Azure

The Azure platform was announced by Microsoft in the fourth quarter of 2008. The platform is tied to its operating system, which is a specialized flavor of Windows. It includes a "hypervisor" for provisioning machine instances dynamically. It is designed to run any .NET application. Of course, server-based .NET applications would be the natural pick to move to this cloud. Microsoft has begun offering many of its server-based products, such as Exchange, running in the cloud on Azure.

However, Azure is not simply a Windows and .NET platform. The Azure platform also offers numerous other services, including SQL Services, which is a highly scalable SQL Server database, and Live Services, which are Web services into many popular Microsoft applications for searching, photo sharing, instant messaging, etc. Azure also offers tight integration with Microsoft's IDE, Visual Studio®, making it easy to run, test, and deploy applications to the Azure platform.

Azure is one of the most proprietary cloud platforms available, but has some obvious attractions if you are already using proprietary Microsoft technologies. You are limited to proprietary technologies from Microsoft, such as the .NET languages and a SQL Server-based database. In return, you can leverage many Windows technologies for securing access to and managing any applications running on Azure.

Google App Engine

The App Engine, launched by Google in the second quarter of 2008, is quite a bit different from many of the other cloud platforms. There is no provisioning of hardware on it; you simply deploy your application to it, and you can do this for free. Free usage is capped, however, and you can buy additional CPU usage, storage, and bandwidth as needed, similar to other cloud platforms. There are some convenience features to the Google App Engine, but that is just the beginning of its specialized feature set.
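The "free up to a cap, then pay for what you use" model can be worked through with a small Python sketch. Every number here is hypothetical: the quota sizes and unit prices are made up for illustration and are not Google's actual quotas or rates, and the function name is likewise an assumption.

```python
def billable_cost(usage: dict, free_quota: dict, unit_price: dict) -> float:
    """Return the charge for usage above the free quota.

    Only the amount over each resource's free allowance is billed.
    """
    total = 0.0
    for resource, used in usage.items():
        overage = max(0.0, used - free_quota.get(resource, 0.0))
        total += overage * unit_price[resource]
    return total


# Hypothetical free allowances and prices per unit (not real rates).
FREE_QUOTA = {"cpu_hours": 6.0, "storage_gb": 1.0, "bandwidth_gb": 10.0}
UNIT_PRICE = {"cpu_hours": 0.10, "storage_gb": 0.15, "bandwidth_gb": 0.12}

# A day in which CPU and bandwidth exceed the free cap but storage does not.
usage = {"cpu_hours": 10.0, "storage_gb": 0.5, "bandwidth_gb": 15.0}
print(f"{billable_cost(usage, FREE_QUOTA, UNIT_PRICE):.2f}")
```

An application that stays under every cap costs nothing, which is what makes this model attractive for trying out an idea before it has traffic.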

The Google App Engine provides a robust development environment that only supports Python. It provides numerous services on top of Python. User management is integrated with Google. For example, people log in to your app with the same credentials they would use to log in to Google Mail. There is a data-store API for storing structured data. Storage and retrieval from the data store are similar to using a relational database, but they're entirely proprietary to Google. It is based on Google's proprietary distributed file system, GFS.

In short, Google supports Python only, which is open source, but everything else involved is effectively proprietary (though Google is likely using many open source technologies behind the scenes). The Google App Engine does not offer any type of data backup solutions, though the underlying data store is designed to be highly fault-tolerant.

Aptana Cloud

Aptana may be best known for its product Aptana Studio, an Eclipse-based IDE for working with dynamic programming languages, such as JavaScript, PHP, Python, and Ruby. Aptana announced its cloud platform in the second quarter of 2008. The Aptana Cloud is actually a set of features on top of the cloud computing platform from Joyent.

With Aptana Cloud, you can easily deploy to a Linux or MySQL environment with PHP or Jaxer, Aptana's server-side JavaScript implementation, or Ruby on Rails. An Aptana Cloud deployment has all the characteristics of any Joyent Accelerator deployment, but with extra features from Aptana. Deployment and management of cloud applications are handled directly through Aptana Studio. Everything from provisioning hardware for your application to monitoring log files can be done from Aptana Studio. With Aptana, you get an unparalleled level of convenience. Development, testing, deployment, and management are all handled in one place.

Aptana inherits a lot of support for open source technologies and programming from Joyent. It also inherits open source tooling for management and backup. Many of the management aspects are integrated into Aptana Studio, but more sophisticated systems are possible, as well.

Heroku

You could say that "What the Google App Engine is to Python, Y-Combinator startup Heroku is to Ruby on Rails." But that would not do justice to Heroku. It is not just a cloud platform where Ruby on Rails is available. Heroku only supports Rails, and, as such, it is heavily tailored to Rails. With Heroku, you simply add a Ruby gem to your local setup, and you can immediately issue commands to deploy and run your application on the Heroku cloud. Alternatively, you can deploy from a Git repository. You can even access and edit your code directly from a Web browser. You can use any Ruby gem or Rails plug-in you want with your application.

Heroku is all about convenience. It runs on top of Amazon EC2, so computing power can expand elastically. Heroku offers free services with its Heroku Garden. There you can deploy and test your application in the cloud for free. Once you are ready to take on more traffic or need fault tolerance, you can graduate your application to the main Heroku platform.

Ning

The cloud platforms discussed in this article thus far are pretty general-purpose. Whatever your application is going to be, they can handle it. Some of them are focused on Web applications, but that is still a pretty general classification. The popular site Ning allows users to create their own social networks. This is usually through pure configuration, adding pages, adding widgets to pages, configuring widgets, etc. With Ning, you can also download the source code of your network, modify it as you see fit, and run it on the Ning cloud. The network code is in simple PHP, so that's all you need to know to start creating your own social-networking application.

Ning is similar to the Google App Engine in that it provides a data-store API instead of a relational database. It also provides many Ning APIs that give access to the social-networking infrastructure. You can deploy by simply uploading your code, and there is no provisioning of hardware. Ning monetizes your network with ads, and by capping your storage and bandwidth. You can remove the ads and add more storage and bandwidth capacity for a fee.

Ning is obviously a very specialized cloud platform. However, if you plan on building social-networking features into your application (even if they're secondary to the main features), and you are comfortable with programming in PHP, Ning can be a very attractive option. As with the Google App Engine, you get only one choice of programming language (PHP) and cannot simply install additional software as needed. However, you get to leverage a highly scalable but proprietary system.

Salesforce

Another very specialized cloud computing platform is available from Salesforce, best known for revolutionizing customer relationship management (CRM) software by using an SaaS model. With the Force.com platform, you can create your own applications that run on the same type of cloud infrastructure used by Salesforce for its CRM application. Enterprises use the AppExchange to find and "install" these applications to make them available to their users. This is similar to Facebook applications, where the application runs seamlessly as part of the main Salesforce applications.

Alternatively, a custom Force.com site can be created out of one or more applications. This is more like the cloud computing paradigm. With a Force.com site, you do not pay for hardware, but, instead, pay for the number of users. There are also different price tiers depending on how much storage per user is needed. To create an application for running on Salesforce, you program in Apex, which is a proprietary language similar to the Java programming language. This is the same language used by Salesforce engineers to create their CRM applications.

Salesforce also provides many platform-specific services for managing users, accounts, roles, and data access. For business applications, especially those unique to a particular enterprise, a Force.com site can be an attractive option. Salesforce is quite limited in its open source technology and programming options. But, like Google App Engine and Ning, Salesforce provides highly scalable proprietary technology.


Summary

This article explored some of the important benefits of cloud computing. You learned about a broad range of cloud computing platforms, and about how they are alike and different. The information will help you pick which kinds of platforms make sense for your organization.
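As a starting point for that choice, the platforms covered here can be summarized as a small lookup table and filtered by programming language. The Python sketch below encodes only what this article says about each platform. IBM Blue Cloud and Mosso are left out because the article does not list specific languages for them, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Platform:
    name: str
    kind: str                            # "basic" or "specialized"
    languages: Optional[FrozenSet[str]]  # None: no language restriction described


PLATFORMS = [
    Platform("Amazon EC2", "basic", None),
    Platform("Joyent Accelerator", "basic", None),
    Platform("Microsoft Azure", "specialized", frozenset({".NET"})),
    Platform("Google App Engine", "specialized", frozenset({"Python"})),
    Platform("Aptana Cloud", "specialized", frozenset({"PHP", "JavaScript", "Ruby"})),
    Platform("Heroku", "specialized", frozenset({"Ruby"})),
    Platform("Ning", "specialized", frozenset({"PHP"})),
    Platform("Salesforce Force.com", "specialized", frozenset({"Apex"})),
]


def candidates(language: str, kind: Optional[str] = None) -> List[str]:
    """Return the names of platforms that can run the given language."""
    return [
        p.name
        for p in PLATFORMS
        if (p.languages is None or language in p.languages)
        and (kind is None or p.kind == kind)
    ]


print(candidates("Python"))
print(candidates("Ruby", kind="specialized"))
```

Language support is only one criterion. Security, backup, and service-level agreements, discussed under Challenges, would need their own columns in a fuller comparison.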

Stay tuned for the upcoming installments in this "Realities of open source cloud computing" series, which will take a look at what it's like to develop, deploy, and manage applications on a cloud computing platform.
