Comment lines: Defeat image sprawl, once and for all

Virtualization and cloud computing make it very easy to create new virtual images, but as image catalogs grow, finding the right image gets harder. New images are created because it is easier to create a new image than it is to figure out which existing image might be reusable, and the result is "image sprawl." Unless you address how to more effectively build and manage your virtual images, you will not realize the full benefits of the cloud. Two new IBM® capabilities, the Virtual Image Library and the Image Construction and Composition Tool, can help you quickly understand the content of your images and build reusable, parameterized images. This content is part of the IBM WebSphere Developer Technical Journal.


Ruth Willenborg (rewillen@us.ibm.com), Distinguished Engineer, IBM

Ruth Willenborg is a Distinguished Engineer in IBM Software Group Tivoli. Ruth is currently responsible for Image Management, including image construction, conversion, and management capabilities. Prior to joining Tivoli, Ruth was a founder of the WebSphere CloudBurst Appliance (now called IBM Workload Deployer) and the IBM Hypervisor Edition pre-built virtual images. Ruth has 25 years of experience in software development at IBM. She is co-author of Performance Analysis for Java Web Sites (Addison-Wesley, 2002) and numerous articles on both WebSphere performance and using WebSphere with virtualization technologies.



07 December 2011


Is there an image sprawl monster under your bed?

This May, I switched divisions within IBM® and moved from the WebSphere® organization to Tivoli®. My first week, I met the monster: 11,000 virtual images across just two customers. At that point, I really started looking forward to this month, when I knew we would be releasing two new technologies, the IBM Image Construction and Composition Tool and the IBM Virtual Image Library. I'd like to introduce you to these new technologies and explain how they can help you control and eliminate the dreaded image sprawl monster.


What causes image sprawl?

Virtual image sprawl is a reasonably new industry phenomenon that stems from the simplicity of the "black box" virtual image. Virtualization and cloud computing make it very easy to create new virtual images, but very hard to know what is in an image and how to manage it. As image catalogs grow, finding the right image gets harder; new images are created because it is easier to create a new image than it is to figure out which existing image might be reusable.

Most customers I see fall into one of two categories. Either:

  • They have only standard operating system images, and use scripts and manual installations to add software content to each instance, or:
  • They put more content in the images and have an image sprawl problem.

Very few have found an effective balance. The result: virtual image technology is not being used to its full benefit.


Right-sizing your image catalog

Perhaps it is because of the time I spent working in performance, but I am a firm believer that to improve the performance of any path, you must first look at what you can completely eliminate, and then look for ways to make the remaining path faster and simpler. The same concepts apply to creating a well-balanced image catalog:

  • If you don’t need to do something – don’t do it.
  • If you only need to do something once, don’t do it every time.
  • If you can automate it – automate it.

The advantages of creating a virtual image by installing and configuring once, and then sharing that image, are tremendous. Only a small number of individuals in your organization need to have installation and configuration skills. The installation and configuration process is executed once and tested. The image is then just copied as needed. This is repeatable and eliminates steps (often manual, error-prone steps), thereby improving quality. The performance advantages are also significant: copying and instantiating an image is faster on its own than rerunning installation programs, and when combined with image caching technologies, the savings grow further.

These advantages are why I believe virtual image templates with more than just the OS make sense; I like to see image catalogs with content. However, when I see image catalogs with thousands and thousands of images, the advantages I just cited are lost: each unique image requires installation and configuration, caching benefits are lost, maintenance is a nightmare, and sharing is often non-existent. You cannot afford to create new images for every variant. There needs to be a balance that supports image reuse and sharing.

There is no one answer to finding this balance. The more you put into an image, the more images you will have to maintain. However, the less you put into images, the more scripts and manual steps you will have, and the more that can go wrong. My three rules of thumb for finding a balance between what to “burn” into the image template once, versus what to leave for instantiation time, are:

  • If you want it in every instance, burn it into the image. This applies to the operating system and to your organization standards, such as monitoring agents, required security, and auditing software. All these are great candidates to go directly into the image.
  • If it is big or slow, burn it into the image. This applies to things like large binaries and long-running configurations. For example, middleware products such as application and database servers are great candidates: burn the binaries and some level of configuration directly into the image.
  • If it changes frequently, script it at instantiation time. This applies to fast-running configuration steps and to configuration options such as port numbers and passwords. It also applies to frequently changing software content such as OS emergency fixes and applications under development.
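The three rules of thumb can be written down as a small decision function. This is only a sketch: the `Component` fields and the `placement` function are hypothetical names made up for illustration. Two choices in it are assumptions rather than part of the rules: "changes frequently" wins when rules conflict (which matches OS emergency fixes being scripted even though every instance needs them), and anything that matches no rule is scripted.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """One piece of content that could go into an image (hypothetical model)."""
    name: str
    in_every_instance: bool = False   # OS, monitoring agents, security, auditing
    big_or_slow: bool = False         # large binaries, long-running configuration
    changes_frequently: bool = False  # ports, passwords, fixes, apps in development


def placement(component: Component) -> str:
    """Return "image" (burn it in) or "instantiation" (script it)."""
    # Assumption: frequent change takes precedence over the other two rules.
    if component.changes_frequently:
        return "instantiation"
    if component.in_every_instance or component.big_or_slow:
        return "image"
    # Assumption: content that matches no rule is left to instantiation time.
    return "instantiation"
```

For example, `placement(Component("monitoring agent", in_every_instance=True))` yields `"image"`, while a `Component` flagged only with `changes_frequently=True` yields `"instantiation"`.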

Implementing the balance using Image Construction and Composition Tool

IBM follows these rules of thumb when creating our IBM Hypervisor Edition images and IBM Workload Deployer (formerly WebSphere CloudBurst™ Appliance) virtual system patterns. The virtual images pre-install the binaries along with multiple configurations. Each image also exposes a set of common configuration parameters. This enables a single virtual image to support many different environments and applications. The image is just copied (cached) and instantiated with a different set of parameters for the desired personality. Pattern scripts are used for cross-configuration between the instantiated instances, because this is fast and there are many specific pattern configurations. Users also add IBM Workload Deployer script packages for frequently changing content, such as their applications.
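To make the "one image, many personalities" idea concrete, here is a minimal sketch of a template that exposes a fixed set of parameters with defaults. It does not reflect how Hypervisor Edition images or IBM Workload Deployer store their parameters; `ImageTemplate`, `instantiate`, and the parameter names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ImageTemplate:
    """A single image plus the parameters it exposes (hypothetical model)."""
    name: str
    defaults: dict = field(default_factory=dict)  # exposed parameter -> default


def instantiate(template: ImageTemplate, **overrides) -> dict:
    """Return the effective configuration for one instance of the template."""
    # Only parameters the template exposes may be set at instantiation time.
    unknown = set(overrides) - set(template.defaults)
    if unknown:
        raise ValueError(
            f"parameters not exposed by {template.name}: {sorted(unknown)}"
        )
    # Later keys win, so overrides replace the defaults they name.
    return {**template.defaults, **overrides}


template = ImageTemplate("appserver", {"http_port": 9080, "profile": "standalone"})
dev = instantiate(template, http_port=9081)
prod = instantiate(template, profile="cluster-member")
```

Both `dev` and `prod` come from the same cached image; only the parameter set differs, which is what keeps the catalog small.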

You can now apply these same techniques to build your own images, by using the IBM Image Construction and Composition Tool. The design of the image construction tool helps you:

  • Automate the one-time installation of content into images.
  • Create instantiation-time configuration options to reduce the number of images required.
  • Recreate an image with a touch of a button.
  • Rebuild an image for a different cloud (private <-> public) or on a different OS or version.
  • Identify specific software versions and dependencies of an image.

As shown in Figure 1, you can set up one IBM Image Construction and Composition Tool environment and build images for IBM Workload Deployer, IBM SmartCloud Provisioning, IBM SmartCloud Enterprise, or VMware ESX. The bundles and image definitions are reusable across the different cloud environments. The tool ships with IBM Workload Deployer 3.1 and is available as part of the SmartCloud Provisioning 1.2 open beta program.

Figure 1. Overview of IBM Image Construction and Composition Tool

Understanding and controlling your monster with IBM Virtual Image Library

If you already have the image sprawl monster in your environment, the Virtual Image Library capabilities offer a fast and easy approach to help you start eliminating sprawl. First, connect the image library to your existing VMware environments. There is no need to move or copy your images. You can immediately start searching for images with specific software in them, as well as perform similarity and difference reporting. For example, your first step might be to reduce the number of different Windows® or Linux® images that different organizations might have created.

Begin by selecting one of the master operating system images on which you want everyone standardized. Perform a “similarity search” to identify all the similar images. Images that are similar (or identical) to the master image are great candidates for removal. Figure 2 shows an example similarity report, where two images (w2k3-db2client-9.1.4-eap and jeff-w2k3-db2client-9.1.4) are actually identical to the selected image. You also see images with high similarity. For these, you can run both software-level and file-level comparisons against the master to determine exactly what is different. You will likely find that many of the variants are not necessary, and you can start eliminating images.
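The idea behind a similarity report can be illustrated with a few lines of Python. This is not the Virtual Image Library's algorithm, which is not described here; it is a sketch that assumes each image has been reduced to a set of installed software names and uses Jaccard similarity as a stand-in measure. The function names and sample inventories are hypothetical.

```python
from __future__ import annotations


def similarity(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two software inventories (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_report(
    master: set[str], catalog: dict[str, set[str]]
) -> list[tuple[str, float, set[str]]]:
    """Rank catalog images by similarity to the master.

    Each row is (image name, score, software present in only one of the two).
    """
    rows = [
        (name, similarity(master, software), master ^ software)
        for name, software in catalog.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


master = {"windows-2003", "db2-client", "monitoring-agent"}
catalog = {
    "team-a-copy": {"windows-2003", "db2-client", "monitoring-agent"},
    "team-b-variant": {"windows-2003", "db2-client"},
}
report = similarity_report(master, catalog)
```

In this made-up catalog, `team-a-copy` scores 1.0 with an empty difference set, so it is a removal candidate; `team-b-variant` differs only by the monitoring agent, which is the kind of variant worth questioning.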

Figure 2. Example IBM Virtual Image Library Similarity Report

Once you have identified the golden template images, you can use the Virtual Image Library reference repository and version control capabilities to manage them. First, check the images into the Virtual Image Library as Version 1. Then, check the images out to each operational repository so the library can track your golden templates and their locations. When you need to make changes to a golden template, check the image out, apply the changes, and check it in as a new version. Figure 3 shows an example of the version tree for an image, as well as the family tree, which shows the explicit lineage.

Figure 3. Example IBM Virtual Image Library Version Chain and Family Tree

You can then check this new version out to each location. If possible, remove the previous version from the operational repositories at this time. By doing this, you always have the latest version in use. You can quickly make comparisons between versions to know what has changed, and you can always go back to an older version in the repository, if necessary.
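The check-in and check-out cycle amounts to keeping a version chain per golden template and remembering which version each operational repository holds. The sketch below models that bookkeeping in memory; it is not the Virtual Image Library API, and `GoldenTemplate`, its methods, and the repository names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GoldenTemplate:
    """Version chain for one golden template (hypothetical in-memory model)."""
    name: str
    versions: list = field(default_factory=list)  # version n is versions[n - 1]
    deployed: dict = field(default_factory=dict)  # repository -> version number

    def check_in(self, content: str) -> int:
        """Store content as the next version and return its number."""
        self.versions.append(content)
        return len(self.versions)

    def check_out(self, repository: str, version: Optional[int] = None) -> str:
        """Record that a repository holds a version (the latest by default)."""
        latest = len(self.versions)
        chosen = latest if version is None else version
        if not 1 <= chosen <= latest:
            raise LookupError(f"{self.name} has no version {chosen}")
        self.deployed[repository] = chosen
        return self.versions[chosen - 1]

    def stale_repositories(self) -> list:
        """Repositories still holding something other than the latest version."""
        latest = len(self.versions)
        return sorted(r for r, v in self.deployed.items() if v != latest)
```

After checking in a second version and checking it out to only some repositories, `stale_repositories()` lists the ones that still need the update, and `check_out(repository, version=1)` shows how an older version stays retrievable.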


Putting it all together

In addition to using the image library to control the monster, you want to start using IBM Image Construction and Composition Tool to build your image templates. Construct software bundles for content you use across your master images so you can easily automate construction of your master images. Also, start using the instantiation-time parameterization capabilities so your images are more reusable, enabling further consolidation of the number of master templates.

To avoid having to change your golden templates for frequently changing content, I recommend that each image template contain a dynamic component that retrieves this content on instantiation. There are many techniques for achieving this. IBM has an excellent solution in IBM Tivoli Endpoint Manager (previously BigFix®), which even provides pre-built content for operating system patches. IBM Workload Deployer users often use a virtual system pattern script to retrieve frequently changing content (such as applications under development) and install the content at instantiation time.
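One simple way to picture such a dynamic component is a first-boot step that compares what is burned into the image against a manifest of desired content and installs only the difference. The sketch below shows just that comparison. It is not how Tivoli Endpoint Manager or Workload Deployer script packages work; the function name, the manifest shape, and the sample versions are assumptions for illustration.

```python
from __future__ import annotations


def pending_updates(
    installed: dict[str, str], manifest: dict[str, str]
) -> dict[str, str]:
    """Return manifest entries that are missing from the image or out of date.

    Both arguments map a content name to a version string. How the manifest
    is fetched and how content is installed is left out of this sketch.
    """
    return {
        name: version
        for name, version in manifest.items()
        if installed.get(name) != version
    }


burned_in = {"os-fixpack": "3", "app-server": "8.0"}
desired = {"os-fixpack": "4", "app-server": "8.0", "my-app": "1.2-dev"}
todo = pending_updates(burned_in, desired)
```

Here `todo` contains only the newer fix pack and the application under development; the application server binaries that were burned into the image are left alone, so the golden template does not need a new version.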


Conclusion

Virtualization and cloud computing are creating image sprawl problems. Unless you address how to more effectively build and manage your virtual images, you will not realize the full benefits of the cloud. IBM has two new capabilities to help: the Virtual Image Library and the Image Construction and Composition Tool. With Virtual Image Library, you can quickly understand the content of your images, search them, and run comparison reports for both differences and similarities. Image Construction and Composition Tool enables you to build reusable, parameterized images.

Check out these capabilities with IBM SmartCloud Provisioning, IBM SmartCloud Enterprise, and IBM Workload Deployer 3.1.
