It's a given that testing cloud-based applications before deployment is critical to verifying an application's functionality, security, scalability, and reliability. And with an increasing number of applications needing to be deployed, especially with the trend to adapt all types of applications for mobile devices, effective testing can become a bottleneck. That's where automated cloud app testing comes into play.
This article introduces the effect of adding grid computing and peer-to-peer collaboration to automated testing in the cloud, discusses some of the key concepts of each as they relate to cloud computing, illustrates the considerations for integrating each into automated testing in the cloud, and provides a real-world scenario and software as an example.
But first, a basic backgrounder on grid computing and peer-to-peer functionality:
Everyone remembers when the concept of "grid computing" first became popular. It was the early 2000s when projects like SETI@home and the Human Genome Project harnessed the power of thousands of computers to work on complex problems. Grid computing refers to combining resources from multiple administrative domains in order to reach a common goal; the grid can be considered to be a distributed system with non-interactive workloads that involve a large number of files.
Grids tend to be more loosely coupled, heterogeneous, and physically dispersed than clusters. A grid can also be constructed on a LAN; in fact a private cloud is really a kind of virtual private grid.
Another concept that emerged at the same time as grid computing was called peer-to-peer (P2P), immortalized by the short-lived Napster project and later made resoundingly successful by Skype with its collaborative peer-to-peer model. P2P is a distributed application architecture that partitions tasks or workloads between peers (peers being equally privileged participants in the application).
Peers form a peer-to-peer network of nodes. Peers make a portion of their resources, such as processing power, disk storage, or network bandwidth, directly available to other network participants without the need for central coordination by servers or stable hosts. Peers can be suppliers and consumers of resources (in the client-server model, only servers supply and clients consume).
Peer-to-peer systems are often implemented as an abstract overlay network built at the application layer on top of the native or physical network. These overlays are used for peer discovery and indexing and are what makes a P2P system independent from the network topology. The P2P overlay network consists of all the participating peers as network nodes.
Since nodes share their resources as well as their demands with the overall system in a P2P network, the more nodes that are engaged, the larger the total capacity of the system to serve demands. Also, the decentralized nature of the P2P network means that it is more robust since no single points of failure are present.
So what happens if you integrate both grid computing and collaborative peer-to-peer concepts into the cloud test environment?
- Adding grid functionality gives you the ability to spin up an automated test execution grid on the cloud that can run one test in a small fraction of the time or many instances of the same test in the same period of time. Any data-driven functional test, regardless of the functional test tool being used, can be executed in parallel the same way a large problem like the Human Genome Project is divided between many computers.
- Adding P2P functionality allows the test execution grid to be shared via the browser session with as many collaborators as desired simply by passing along the link. It allows for a built-in chat window detailing all the online participants and those with proper access can run their own tests. (And since we're describing an automated test execution system instead of a manual one, participants rarely need to establish a direct remote desktop protocol (RDP) session with running instances.)
Let's talk about adding these functionalities to the cloud environment and some of the details you should consider with each before we introduce the real-world software that resulted from this integration effort.
What makes grid computing unique is the ability to divide a large task into many smaller ones that can be carried out in parallel. In the classic case of the SETI@home project, this task was about analyzing radio signals from deep space to identify patterns that could indicate the presence of extraterrestrial intelligence. Other well-known examples of grid computing include the sequencing of the human genome and calculating a million digits of pi.
Besides the grid itself, a central server or server cluster for distributing the task and collecting the results is a necessary component of grid computing architecture. Depending on the task being performed, the design and architecture of the server component can vary from simply sending and receiving data to the node computers to performing complex control and synchronization activities or both. So both the size of the grid itself and the complexity of the task are largely determined by the command and control server.
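The distribute-and-collect role of the central server can be sketched in a few lines. This is only an illustration under stated assumptions: the function names (`split_work`, `run_on_node`, `coordinate`) are hypothetical, local threads stand in for remote grid nodes, and the "task" is a trivial sum rather than a real workload.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(items, node_count):
    """Deal items into node_count nearly equal chunks (round-robin)."""
    return [items[i::node_count] for i in range(node_count)]


def run_on_node(node_id, chunk):
    """Stand-in for a remote node: here it simply sums its chunk."""
    return node_id, sum(chunk)


def coordinate(items, node_count):
    """Distribute chunks to the nodes, then collect and combine the partial results."""
    chunks = split_work(items, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(run_on_node, range(node_count), chunks))
    return sum(value for _, value in partials)


total = coordinate(list(range(1, 101)), node_count=4)
print(total)
```

In a real grid the coordinator would also have to handle node failures, retries, and synchronization, which is where most of the server's complexity comes from.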
Earlier in this decade, I developed a grid computing project called Capacity Calibration. In the project, people around the world downloaded an agent program and allowed their computers to be used for testing the capacity and performance of websites. One notable example of this was testing NASA's website used to stream video of the space shuttle launch. More than 1,500 agents worldwide simulated thousands of hits per second to verify service levels before NASA took the site live.
Capacity Calibration (or CapCal) has been migrated to the cloud where the grid itself can be spun up and torn down on demand. Since there is a considerable amount of control and synchronization required, in addition to a large amount of data being transferred, this makes a huge difference. In particular, the variable network connectivity and availability of random agents on the web made it an extra challenge to coordinate and control massive load tests. With IBM® SmartCloud Enterprise, this same thing can be done with different data centers using precise and accurate metrics and 99.99 percent availability.
With my current project, Cloud Lab Grid Automation (which I'll discuss later in the article), the challenge and the approach are similar — how to harness the power of a number of virtual machines and apply them to a common task (in this case, functional test automation). I needed the Cloud Lab server to be able to quickly spin up new instances and add them to the virtual lab, tearing them down when no longer needed.
With each of these instances being a full-blown desktop Windows® environment, of course there are all kinds of interesting possibilities that may arise. Take for example the problem of testing the performance of a legacy client-server application using a thick client written in Java™ Swing or .NET — these are notoriously difficult to test because there is no way to simulate multiple clients on a single machine. With IBM SmartCloud Enterprise, the answer is simple — just spin up multiple machines instead!
In the early part of this decade, the terms grid computing and peer-to-peer were often mentioned in the same sentence and consequently generated some confusion. Taking Napster as the first popular example of peer-to-peer computing, the key differentiator was the fact that the nodes in the grid (the users) could exchange data (music) directly with one another. Because of the piracy issues involved, this paradigm didn't last long even though it was wildly successful for a while. Nowadays Skype is an example of a P2P application that is both legal and very widely used.
I needed the Cloud Lab Grid Automation system to allow each virtual machine in the grid to be assigned to a particular user or shared by a group of users. The administrator needed to be able to control this function in a number of ways, from sending a link to the browser console itself to granting access to just an individual machine via RDP. With today's testing and development teams being geographically dispersed as a rule, the ability to share computing resources like this for both unattended automated testing and manual testing is invaluable.
For example, the administrator might assign one or two people the role of test lead and thereby pass them a link to the browser console of the grid. Anyone with this level of access can see all the machines and can copy files to and from them or start and stop tests. Just like ordinary computers in a physical test lab, virtual machines might be used by individuals during work hours and assigned to nightly regression tests during off hours.
My team decided that the browser console should include a built-in chat window that supports basic collaboration between the administrator and the team members, each of whom may take control of individual machines or may use groups of them for tasks like the one below. A user can then just add a voice conference bridge or a Skype chat and have everything he'd have in a test lab except for the physical lab itself!
Something that test automation tools are especially useful for is populating databases, either for test purposes or converting data from one system to another. The advantage of doing it this way rather than writing a query to handle it is that the business rules get applied via the UI so data integrity is ensured. In any case, it can take quite a while to enter hundreds or thousands of records and this is where grid computing can speed things up enormously.
In this article and in the subsequent real-world system used as an example, I am using the .NET Pet Shop application (see Resources). I want to populate 10,000 new users from an ASCII comma-delimited text file. Each row contains a user ID, an email address, first and last name, phone number, and mailing address. For a single workstation to populate the entire list takes about 3 hours and 20 minutes. With a grid of only 12 cloud instances, you can reduce that to roughly 15 minutes.
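As a rough sanity check on those figures, the arithmetic can be worked through directly. The sketch below assumes an ideal linear split with no startup or coordination overhead, which is a simplification; the helper name `ideal_parallel_minutes` is hypothetical.

```python
def ideal_parallel_minutes(single_machine_minutes, instances):
    """Elapsed time if the work divides perfectly evenly with no overhead."""
    return single_machine_minutes / instances


records = 10_000
single_minutes = 3 * 60 + 20                        # 3 hours 20 minutes = 200 minutes
seconds_per_record = single_minutes * 60 / records  # average time per record on one machine
ideal_minutes = ideal_parallel_minutes(single_minutes, 12)

print(f"seconds per record: {seconds_per_record:.1f}")
print(f"records per instance: about {records // 12}")
print(f"ideal time on 12 instances: {ideal_minutes:.1f} minutes")
```

An even split works out to a little under 17 minutes, which is in the same range as the quoted figure; the timings in the text are approximate.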
The real-world software example provided in this article is Grid Robotics's Cloud Lab Grid Automation solution, a system in which both grid computing and collaborative peer-to-peer concepts have found their way to the cloud.
A Cloud Lab server runs on the cloud and can support any number of Cloud Lab Appliances. A Cloud Lab Appliance is a Windows or Linux® virtual machine instance running on a public cloud, a private cloud, a virtual machine or a physical machine. A Cloud Lab Appliance can be created by installing the Cloud Lab Agent on an existing machine or by starting with the standard Cloud Lab Agent machine instance on a public cloud and configuring it.
The emergence of public and private clouds has created an environment where virtually unlimited resources can be summoned on demand, and nowhere is this more significant than in test automation. In particular, regression testing derives from a matrix of supported environments and configurations, with browsers and their different versions being perhaps the most common.
A Cloud Lab Grid Automation Server can run anywhere, whether in a public or private cloud or just a server in a lab or data center. From there it can manage any number of client or agent computers which can be spun up automatically on public clouds like AWS, IBM, or Rackspace or private clouds like VMware and Citrix. No matter how it is configured, it is distributed computing at its very best — physical machines, virtual machines, and cloud machines are all the same to a Cloud Lab server.
The standard Cloud Lab agent comes with Selenium (see Resources), the leading open source web-testing solution, already installed along with the five major browsers (Internet Explorer, Firefox, Chrome, Safari, and Opera). So "out of the box," you can begin running your Selenium tests against all the major browsers in parallel, shaving hours (if not days) off your usual regression testing cycle. But if you have a license for another functional test tool, you can install that on the Cloud Lab agent (or install the Cloud Lab agent software on your own machine) and do the same thing.
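To make the idea of a browser matrix concrete, here is a minimal sketch of how test and browser combinations might be enumerated and dealt out to agents. It is not Cloud Lab's scheduler: the test names, agent names, and the functions `build_matrix` and `assign_to_agents` are all hypothetical, and no Selenium calls are made.

```python
from itertools import cycle, product

BROWSERS = ["Internet Explorer", "Firefox", "Chrome", "Safari", "Opera"]


def build_matrix(tests, browsers):
    """Every (test, browser) combination that needs a run."""
    return list(product(tests, browsers))


def assign_to_agents(jobs, agent_names):
    """Deal jobs out to agents round-robin so the load stays balanced."""
    plan = {name: [] for name in agent_names}
    for job, name in zip(jobs, cycle(agent_names)):
        plan[name].append(job)
    return plan


tests = ["login_test", "checkout_test"]       # hypothetical test names
agents = [f"agent-{n}" for n in range(1, 6)]  # hypothetical agent names
plan = assign_to_agents(build_matrix(tests, BROWSERS), agents)

for agent, jobs in plan.items():
    print(agent, jobs)
```

With five agents and five browsers, the round-robin deal happens to give each agent a single browser, so every agent runs both tests against one browser while the others run in parallel.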
Whether you are running the same regression test in many environments or using different data in the same environment, Cloud Lab Grid Automation is the way to "supercharge" your testing and realize the benefits of combining grid computing with the cloud for maximum productivity. Cloud Lab Grid Automation is available on the IBM and Amazon clouds.
Cloud Lab Grid Automation makes it easy to access grid and collaborative peer-to-peer functions and features to facilitate automated testing in the cloud. (There is a video demonstration of this process available.)
Step One: Spin up the grid on the cloud
- Go to the Cloud Lab Setup page at www.cloudlabsetup.com. If this is your first time, you will need to register for a 14-day trial.
Figure 1. Register to try Cloud Lab
You'll be using the open source Selenium web-testing solution for this demo which comes pre-installed on the default Cloud Lab Agent. (You can also install your functional tool of choice on the default agent and make a customized version for this purpose.)
- Let's specify 12 default agents and wait for an email that tells us when our grid is up and running.
Figure 2. Waiting for grid to start
This can take a few minutes, so it might be a good time to get a refill on your coffee.
- Once the email arrives, click the link and you'll see your Cloud Lab test execution grid ready to take your commands:
Figure 3. The test execution grid is ready
Figure 4. Close-up of the test execution grid
Since collaboration is involved, you might want to plan ahead and have some colleagues join you. Sharing the environment simply means inviting others to join, hence the "collaborative peer-to-peer" aspect. Use the built-in chat window or any voice conferencing bridge to allow everyone in the team to take part.
Step Two: Upload and run the test
There are only three steps necessary to kick off the test.
- Upload the test itself to all the instances using the Upload Local File command. To upload multiple files and/or folders you'll need to zip them up first. Cloud Lab will take care of the rest automatically.
- Use the Dispense File command to distribute the test data file evenly between the instances.
- Use the Execute OS command to kick off the test.
And that's all there is to it!
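The even distribution of the data file in the second step can be pictured with a short sketch. This is not the implementation of the Dispense File command, only an illustration of splitting rows into slices whose sizes differ by at most one; the function name `dispense_rows` and the generated sample rows are assumptions.

```python
def dispense_rows(rows, instance_count):
    """Split rows into instance_count contiguous slices whose sizes differ by at most one."""
    base, extra = divmod(len(rows), instance_count)
    slices, start = [], 0
    for i in range(instance_count):
        size = base + (1 if i < extra else 0)  # the first `extra` slices get one more row
        slices.append(rows[start:start + size])
        start += size
    return slices


# Hypothetical stand-in for the 10,000-row user file (only two columns shown).
rows = [[f"user{n:05d}", f"user{n:05d}@example.com"] for n in range(10_000)]
slices = dispense_rows(rows, 12)
sizes = [len(part) for part in slices]
print(sizes)
```

Because 10,000 does not divide evenly by 12, four instances receive 834 rows and the other eight receive 833, and no row is duplicated or dropped.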
Step Three: Download the results
After 15 minutes or so, all 10,000 new records have been added to the database; now it's time to shut down the grid. But if any of the instances logged errors, it's important to download them before shutting down the grid or the test results will be lost.
Figure 5. Downloading the results
Use the Download Agent Files command for this. Enter the name of the folder where the test results are stored and the Cloud Lab server will consolidate them into a single ZIP file and send you an email with a link to download it.
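The consolidation step can be approximated with the Python standard library. The sketch below is an assumption about what such consolidation might look like, not Cloud Lab's actual behavior: the function `consolidate_results`, the folder layout, and the file names are hypothetical, and temporary local folders stand in for the agents' result folders.

```python
import tempfile
import zipfile
from pathlib import Path


def consolidate_results(agent_dirs, archive_path):
    """Write every file under each agent's results folder into one ZIP, prefixed by agent name."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for agent_name, folder in agent_dirs.items():
            for path in sorted(Path(folder).rglob("*")):
                if path.is_file():
                    relative = path.relative_to(folder).as_posix()
                    archive.write(path, arcname=f"{agent_name}/{relative}")
    return archive_path


# Demo with temporary folders standing in for two agents' result folders.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    agent_dirs = {}
    for name in ("agent-1", "agent-2"):
        folder = root / name / "results"
        folder.mkdir(parents=True)
        (folder / "errors.log").write_text(f"log from {name}\n")
        agent_dirs[name] = folder
    archive_file = consolidate_results(agent_dirs, root / "results.zip")
    with zipfile.ZipFile(archive_file) as zf:
        names = sorted(zf.namelist())

print(names)
```

Prefixing each entry with the agent name keeps identically named log files from different instances from overwriting one another in the archive.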
Cloud computing revolutionizes software test automation in many ways. The ability to spin up any number of pre-configured test machines out of thin air and use them independently or in parallel can bring the power of grid computing to bear on your testing challenges. The ability to collaborate with any member of the team, some of whom may be located in other parts of the world, offers the advantages of a traditional test lab but without the limitations of physical machines in a test lab environment.
This article demonstrated how you can integrate grid computing and collaborative peer-to-peer functionality in order to make it easier to automate cloud-based testing. I hope you'll explore automated cloud testing further.
A classic, "Choosing a test automation framework," shows you five basic frameworks: sets of assumptions, concepts, and practices that provide support for automated software testing.
"Automated web testing with Selenium" provides more information on learned best practices for common issues during testing based on Selenium.
"Functional testing for web applications" demonstrates actual application testing on the cloud using Selenium.
To understand the concepts of a cloud-oriented overlay network (and see a real-world example), read "Deliver cloud network control to the user."
See the video demonstration of Cloud Lab Grid Automation facilitating automated testing in the cloud.
In the developerWorks cloud developer resources, discover and share knowledge and experience of application and services developers building their projects for cloud deployment.
The next steps: Find out how to access IBM SmartCloud Enterprise.
Get products and technologies
See the product images available for IBM SmartCloud Enterprise.
The PetShop .NET Sample Application demonstrates an enterprise architecture for building .NET Web Applications and highlights the structure of an enterprise application through examination of its architecture, programming model, productivity, performance, scalability, and reliability.
See the IBM product images available for the Amazon EC2 cloud.
Join a cloud computing group on developerWorks.
Read all the great cloud blogs on developerWorks.
Join the developerWorks community, a professional network and unified set of community tools for connecting, sharing, and collaborating.
Randy cut his teeth writing diagnostic tests for the Altair 8080 in 1979. He created the first automated test tool for the PC called AutoTester for DOS in 1983 and was a co-founder of Worksoft, a vendor of application life cycle management solutions for enterprise applications. Grid Robotics (formerly Capacity Calibration) was founded in November 2008 and is a leading supplier of PaaS (Platform as a Service) solutions for public and private clouds.