Using open source software to design, develop, and deploy a collaborative Web site, Part 12: Hosting and deploying

Realize the benefits of virtualization technologies

Alister Lewis-Bowen (alister.lewisbowen@gmail.com), Senior Software Engineer, IBM, Software Group
Alister Lewis-Bowen is a senior software engineer in IBM's Internet Technology Group. He has worked on Internet and Web technologies as an IBM UK employee since 1993. Alister was brought to the U.S. to work on the Web sites for the IBM-sponsored sports events, then as senior Webmaster for ibm.com. He is currently helping create semantic Web prototypes.
Stephen Evanchik (evanchsa@gmail.com), Software Engineer, IBM, Software Group
Stephen Evanchik is a software engineer in IBM's Internet Technology Group. He has been a contributor to many open source software projects, the most notable being his IBM TrackPoint driver in the Linux kernel. Stephen is currently working with emerging semantic Web technologies.
Louis Weitzman (louis.weitzman@gmail.com), Senior Software Engineer, IBM, Software Group
Louis Weitzman is a senior software engineer in IBM's Internet Technology Group. For 30 years he has worked at the intersection of design and computation. He helped develop an XML, fragment-based content management system in use by ibm.com, and currently is involved with bringing the design process to emerging projects.

Summary:  Follow along in this series of articles as the IBM® Internet Technology Group designs, develops, and deploys an extranet Web site for a fictitious company, International Business Council (IBC), using a suite of freely available software. In this article, investigate the issues surrounding deployment of a Drupal site using virtualization technologies. Discover why the team chose to use virtualization, what technologies were considered, and how the final production environment was set up.

Date:  19 Dec 2006
Level:  Intermediate

Introduction

When you reach the final phase of development of your Drupal-powered site, it's time to start thinking about how you are going to deploy it for your client. In our case, the International Business Council (IBC) Web site supports several hundred users who will periodically access content throughout the year. During business conferences, we expect increased usage, with many simultaneous users accessing content and downloading large files, and new content creation by the site operations staff. This pattern of usage dictates several constraints that we had to address when we deployed the final IBC Web site.

Your Web site will have its own usage pattern, which you must understand in order to select an appropriate deployment environment.

Two items in particular -- concentrated usage during conferences and large file transfers -- forced our deployment environment to have a substantial amount of bandwidth and dedicated system resources (memory, CPU cycles, and disk space) that are typically reserved for co-located environments. If that wasn't complicated (and expensive!) enough, we didn't want to own any of the hardware that the IBC Web site used. We were already under time constraints during development and could not afford to spend time upgrading, troubleshooting, or monitoring hardware.

There are basically three options (each with variations) for hosting a Web site: simple shared hosting, co-location of a physical server, and virtualization. Each has advantages and disadvantages based on cost and complexity of management. With a clear understanding of the usage patterns and time constraints, we decided to host our Drupal environment using virtualization. This allowed us to build and test the Drupal environment locally and easily ship the entire virtual machine to the production hosting platform.

This article discusses why we chose to use virtualization, what technologies we considered, and a description of the final production environment.

Information in this article should not be interpreted as a rigid set of development guidelines that must be followed, but rather as a place to start when considering hosting options for your own Web site.

Hosting options

You can find many places to host your Drupal site if you enter Web hosting in your favorite search engine. Your choice of providers is simplified if you expect your site's traffic to be very low or very high. If you expect your site's traffic to be low, then just about any Web hosting provider that is compatible with Drupal will work. If you expect your traffic to be very high, then the number of acceptable Web hosting providers is drastically reduced to the top providers.

The problem occurs when you are certain your site's traffic will be somewhere in between the two extremes. We don't recommend any particular hosting provider, but advise you to carefully evaluate how much traffic you expect your site to receive. Pick a Web hosting provider that will support your initial site's traffic and will also provide enough resources to grow your site without breaking your budget. We didn't need to use an external Web hosting provider for the IBC site. We had our own set of requirements that are described in detail in this article.

The three basic options for hosting are:

Shared hosting
A low-cost solution for operating a small Drupal Web site. Your Web site resides on a server that is shared with other Web sites. This means that your site is always competing with the other sites for resources, such as memory and CPU time.
Co-location
Co-location of a physical server is often expensive and requires you to manage and maintain physical hardware. Some providers let you "rent" a complete physical server, relieving you of the burden of physical hardware maintenance. Co-location gives you dedicated resources and a high degree of flexibility in how Web sites are deployed.

There are trade-offs to this approach. For example, if you want redundancy of physical hardware, you need to buy or rent two machines and configure them properly. In short, if you have time to maintain a complete environment from the physical hardware to your software, then co-location is a good choice.

Virtualization
A relatively new approach to hosting a Web site. With virtualization you realize many of the benefits that co-location offers, such as dedicated resources, without the burden of managing physical hardware.

Virtualization is the abstraction of physical computer resources into equivalent virtual resources. For example, a single physical server can be subdivided into multiple servers using virtualization. These subdivisions are called virtual machines. A virtual machine behaves just like a physical machine from a user or administrator's perspective. Virtual machines, like physical machines, are isolated from each other. If software on one virtual machine malfunctions, it does not affect the execution of the other virtual machines located on the same physical machine. Figure 1 shows a typical combination of physical host and virtual machines.


Figure 1. Physical machine with virtual machines

There are many advantages to using virtualization technologies for development and deployment. The advantages we found most attractive are:

  • Control of operating system
  • Single unit of distribution
  • Resource procurement and upgrades

Because we are creating a complete computer system virtually, we can choose to install an operating system with which we are familiar. We chose Red Hat Enterprise Linux® 4 to run the IBC Web site. Familiarity with the operating system running the IBC Web site let us concentrate on setting up Drupal and our custom code without worrying about the peculiarities of a less-familiar operating system.

After installing the operating system and the IBC site in the virtual machine, we are left with a neat package of our software and all of its prerequisites. The physical manifestation of a virtual machine is just a large file. The virtual machine essentially becomes an appliance with three buttons: power on / off, suspend, and reset. This makes it very simple to distribute the IBC site to the hosting environment and other developers, and to transport for on-site demo purposes.

Finally, we weren't sure how much memory or disk space was needed to support the IBC site. We had a general idea, but no precise numbers. We didn't want to buy hardware that we wouldn't use, and we didn't want to buy too little hardware and then have to upgrade as the site usage increased. With virtual machines we can make an educated guess about our initial resource allocation; if it proves inadequate, we can correct it with a few keystrokes.

There are many different virtualization platforms that you can use. Some are open source, some are commercial products but free to download and use, and others require a commercial license. We considered two options, Xen and VMware, described below.

Xen

Xen is an open source virtualization platform originally developed at the University of Cambridge Computer Laboratory. Xen is fast and completely free. It is actively developed by a large and knowledgeable open source community, similar to how the Linux kernel is developed. We are very excited about the potential that Xen has, but were (and still are) disappointed that we couldn't use it to host the IBC site.

We did not use Xen because its management tools are not as robust and polished as we would like. We're comfortable editing virtual machine descriptor files and working through a Linux machine's command line, but these tasks were not in line with our goal of saving as much time as possible so that we could be free to concentrate on developing and testing the IBC site. We opted for a more complete and refined virtualization platform from VMware.

VMware Server

VMware offers a mix of free and commercial virtualization products, used by everyone from developers to data center administrators. We used two specific products from VMware:

  • VMware Server -- Used to build and test the virtual machine we will eventually deploy to the production environment. VMware Server is a free product from VMware that is ideally suited to building and executing virtual machines on a developer's workstation. It is unobtrusive but retains the easy-to-use graphical interface found in some of VMware's other products.
  • VMware ESX Server -- Used to host the IBC virtual machine in the production environment. VMware ESX Server offers the necessary performance, scalability, and management tools that a business-critical Web site requires.

See Resources for information about these products.


Creating a virtual machine

This section walks you through the process for creating a virtual machine.

Start the VMware Server Console program and connect to your local machine. You should see a window similar to the one in Figure 2.


Figure 2. Initial VMware Server Console

Click the New Virtual Machine icon to start creating a new virtual machine. A dialog should pop up, as shown in Figure 3.


Figure 3. New Virtual Machine Wizard dialog

Click Next to proceed to the first step. We aren't going to be doing anything out of the ordinary with this virtual machine, so we can safely choose the Typical machine configuration, as in Figure 4. The Custom configuration option exposes advanced settings, such as virtual disk adapter types. Choose Custom only if you are an experienced VMware user. Click Next to continue.


Figure 4. Select machine configuration type

VMware Server records the operating system that your virtual machine will execute. We used Red Hat Enterprise Linux 4 for the IBC site.

In Figure 5, choose the operating system that most closely matches the one you will install. This setting provides information to VMware about how the virtual machine will operate. Click Next.


Figure 5. Select virtual machine operating system

The physical manifestation of a virtual machine is a large file, so we need to identify the file by choosing an appropriate name and directory where it should reside on your local system, as shown in Figure 6. Click Next to continue.


Figure 6. Virtual machine name and directory selection

The networking type is a matter of personal preference. As shown in Figure 7, we chose network address translation (NAT) because it allows our virtual machine to access networked resources without unnecessarily exposing it to other network machines. We will have to repeat this process when we deploy the virtual machine to the production environment, so it is safe to choose the option with which you are most comfortable. Click Next after making your choice.


Figure 7. Select networking type

The next step is to specify the virtual hard disk that the virtual machine will use to store its data, as shown in Figure 8. We chose an initial size of 8 GB because it is small enough to be manageable but not so large that it wastes disk space.

Make sure you uncheck the option Allocate disk space now. If this option is checked, the full 8 GB is allocated up front, and the transfer to the production environment will include all of that space whether or not it holds data. We don't want to transfer any more data than is absolutely necessary. The trade-off is that a disk that grows on demand performs somewhat worse than a preallocated one. We'll revisit this decision when we deploy the virtual machine to the production environment. Click Next to continue.


Figure 8. Virtual disk creation

You now have an empty virtual machine that is ready for an operating system, as shown in Figure 9. Before you install your operating system, we recommend adding another 256 MB of memory to the virtual machine for a total of 512 MB.


Figure 9. Completed virtual machine

Click VM on the menu and choose Settings, as shown in Figure 10.


Figure 10. Modifying the virtual machine's settings

Once selected, you'll see a dialog like the one in Figure 11. This dialog lets you modify all of the attributes of the virtual machine. Increase the memory to 512 MB, and click OK.


Figure 11. Upgrading memory for the virtual machine

Now that you have finished creating your virtual machine, it's time to install an operating system. We chose Red Hat Enterprise Linux 4 because it was familiar; choose an operating system that you are familiar with. (It is outside the scope of this article to discuss the complete installation process we followed.)

Be sure to install Apache, PHP, and MySQL as described in Part 3 and Part 4, as they form the foundation of a Drupal site. If you have begun to populate your Drupal database with initial site content, you should export the database from your development environment and re-import it to the database server on the virtual machine. The following features saved us time and smoothed out the deployment process.
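Before turning to those features, here is a minimal sketch of the database export and re-import step mentioned above. It is not the exact procedure we followed. The database name, user name, and dump file path are hypothetical placeholders, and the sketch assumes an empty database with the same name already exists on the virtual machine.

```shell
#!/bin/sh
# Sketch: copy a Drupal database from the development machine into the
# virtual machine. DB_NAME, DB_USER, and DUMP_FILE are hypothetical
# placeholders; substitute the values for your own site.

DB_NAME="${DB_NAME:-drupal}"
DB_USER="${DB_USER:-drupal}"
DUMP_FILE="${DUMP_FILE:-/tmp/drupal-dump.sql}"

# Run on the development machine: write the whole database to a SQL file.
# The -p flag makes mysqldump prompt for the password.
export_db() {
    mysqldump -u "$DB_USER" -p "$DB_NAME" > "$DUMP_FILE"
}

# Run on the virtual machine after copying the dump file over (for
# example, with scp). Assumes the target database already exists.
import_db() {
    mysql -u "$DB_USER" -p "$DB_NAME" < "$DUMP_FILE"
}
```

Call export_db on the development machine, copy the dump file to the virtual machine, then call import_db there.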

Logical volume management

Logical volume management is a disk abstraction layer that allows you to group physical disks into a single entity called a volume group. Volume groups are divided into one or more logical volumes where your files eventually reside. Figure 12 shows an example disk layout using logical volume management.


Figure 12. Logical volume management

The physical disks Disk 1 and Disk 3 are bundled into a volume group named Main, while Disk 2 and Disk 4 are bundled into the Data volume group. Each of these volume groups can be subdivided into many logical volumes whenever necessary, as long as there is unallocated space left in the volume group. The logical volumes act like a normal disk partition, and are depicted by LV1 - LV6.

If a volume group becomes full, another physical disk can be added to it to create a new logical volume or enlarge a preexisting logical volume. Using logical volume management lets you quickly add more disk capacity to your system without having to shuffle your data around. Once you've finished configuring your disk layout, proceed to install your operating system as normal.

Once you have the base operating system installed and it has successfully booted, log in and configure Apache, PHP, and MySQL using the techniques discussed in Part 4. After completing those steps, install the VMware Tools.

VMware Tools

The VMware Tools package allows your virtual machine to take advantage of a high-speed network driver and advanced memory management. It also allows your virtual machine to reliably synchronize its internal clock to the host machine. For your Drupal site to be useful, the underlying server must keep an accurate clock. In a traditional environment, the Network Time Protocol (NTP) daemon is used to keep the internal clock accurate. Unfortunately, NTP in a virtual machine can introduce problems, making VMware Tools a necessity.

First, choose Install VMware Tools from the VM menu, as shown in Figure 13.


Figure 13. Install VMware Tools

When you select the menu item, another dialog like the one in Figure 14 will pop up. Click Install to continue.


Figure 14. Information dialog prior to VMware Tools installation

Log in to your virtual machine as the root user to begin the VMware Tools installation process. Listing 1 shows how we installed VMware Tools using the command line. When you run the command /usr/bin/vmware-config-tools.pl, be sure to follow the onscreen directions carefully to ensure you correctly configure all of the components.


Listing 1. VMware Tools installation
$ mount /dev/hdc /mnt
$ cd /mnt
$ rpm -ivh VMwareTools-1.0.1-29996.i386.rpm
$ /usr/bin/vmware-config-tools.pl
$ cd ~
$ umount /mnt
	

At this point, the VMware Tools package is installed and ready to be used. We recommend that you reboot your virtual machine to make sure everything is working properly. When your virtual machine is finished booting, continue to the next section that discusses how to get the Drupal code onto the virtual machine.


Deploying Drupal

Now that you have an operating system installed in your virtual machine, it's time to deploy the code you have been writing and testing. Our goal in this section is to develop a method for moving code from your revision control system to the production server in an orderly fashion. The last thing we want (and you want) is to have the production site using modifications to the code that are not present in your revision control system. We also want to prevent haphazard updates to the code on the production site.

Creating a snapshot of the code

We'll begin by using Eclipse's excellent CVS support to apply a version tag to the files we are going to deploy. A version tag applies a meaningful name of your choice to a specific revision of each file in the repository, essentially creating a named snapshot of your development project. You can then use this meaningful tag to retrieve the contents of the CVS repository in the future, even if you have subsequently modified one or more of the files in the list of files affected by the version tag.

To apply a version tag, right-click the top-level project entry (for example, ibc_site) in Eclipse's Navigator pane and choose Team > Tag as version, as shown in Figure 15.


Figure 15. Creating a version tag in CVS

Once you click Tag as version, another dialog box pops up, asking you to create the meaningful tag name, as shown in Figure 16.


Figure 16. Naming your version tag in CVS

Enter a tag name that is meaningful to you. We chose ibc-0-1-0 for our first deployment of the site. Click OK and your tag name is applied. If you ever need to revert to this version of your code, you can right-click the project directory and choose Replace With > Another Branch or Version to replace the code with the saved version.
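If you prefer the command line to Eclipse, the same tag can be applied with the cvs client. The following is a sketch rather than the procedure we used. The function name tag_release is hypothetical, and the sketch assumes you run it from the top of a checked-out working copy.

```shell
#!/bin/sh
# Sketch: apply a version tag from the command line instead of Eclipse.
# tag_release is a hypothetical helper name. Run it from the top of a
# checked-out working copy, passing the tag name as the only argument.

tag_release() {
    tag="$1"
    # CVS tag names must start with a letter and may contain only
    # letters, digits, hyphens, and underscores. Dots are not allowed,
    # which is why we write ibc-0-1-0 rather than ibc-0.1.0.
    case "$tag" in
        [A-Za-z]*) ;;
        *) echo "invalid tag name: $tag" >&2; return 1 ;;
    esac
    case "$tag" in
        *[!A-Za-z0-9_-]*) echo "invalid tag name: $tag" >&2; return 1 ;;
    esac
    # -c makes cvs refuse to tag if any file has uncommitted changes,
    # so the tag always names code that is actually in the repository.
    cvs tag -c "$tag"
}
```

For example, tag_release ibc-0-1-0 creates the same snapshot as the Eclipse dialog.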

Installing the snapshot on the virtual machine

Now that you have a version tag in your CVS repository, it's time to deploy it to the virtual machine we have been building. We assume that the production Drupal site will reside in the /var/www/htdocs directory. Listing 2 shows the commands we used to retrieve the ibc-0-1-0 version of our source code.


Listing 2. Deploying a specific version of the IBC site code
$ cd /var/www/htdocs
# Tell CVS to reach the repository over ssh
$ export CVS_RSH=ssh
# Options such as -r must come before the module name
$ cvs -d :ext:username@cvs.yourserver.com:/cvsroot/drupal checkout -r ibc-0-1-0 ibc_site
	

The CVS command checkout downloads all of the source code from the CVS server to the local machine. This downloaded code can then easily be updated in the future from the CVS repository. The checkout command has several options that control the checkout process. We used the -r switch to select the appropriate version tag. You can experiment with other switches as you become more familiar with CVS and your deployment needs.

Configuring within the virtual machine

Now that you have the source code checked out in the virtual machine, you can begin to configure and test your production Drupal site. Refer to Part 4 and the Drupal documentation during the configuration process. At the very least, you will need to change the site settings file under the sites directory or, preferably, add a new one for the production site.

Production maintenance

During your testing you may uncover bugs or other unexpected problems. We strongly recommend that you never edit your site's source code on the production machine. The production virtual machine should be a resting place for thoroughly tested code -- not an additional development tool.

Make all of your changes on your local development workstation, then commit those changes to your CVS repository. Once you are satisfied that the changes in the repository are tested and ready for production, you can use CVS to update the deployed code. For example, suppose you found a bug and fixed it and now need to push that update out to production. The first thing to do is tag the bug-fixed code. We'll use the tag name ibc-0-1-1 to indicate that this code is an update to the tag ibc-0-1-0. Once the tag exists in the repository, issue the command in Listing 3 on the production server.


Listing 3. Updating deployed code to a new version
$ cd /var/www/htdocs
$ cvs update -Pd -r ibc-0-1-1
	

There are three options specified: -P instructs the update process to prune empty directories; -d builds new directories, as the checkout command does; and -r uses the tag ibc-0-1-1 as the basis for this update. Given that you aren't making any modifications to the code on the production server, the update process simply replaces the previous revision of the code, tagged ibc-0-1-0, with the new revision, tagged ibc-0-1-1.
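Because the update relies on the production code being unmodified, it can be worth confirming that before you run it. The following is a sketch under our own assumptions, not part of the original procedure. The function name check_clean is hypothetical. It uses the cvs global options -n (make no changes) and -q (quiet) to list what an update would do, then looks for lines that indicate local changes.

```shell
#!/bin/sh
# Sketch: detect local modifications in a deployed CVS working copy.
# check_clean is a hypothetical helper name; pass it the working copy
# directory (defaults to the current directory).

check_clean() {
    workdir="${1:-.}"
    # -n: change nothing on disk; -q: suppress per-directory messages.
    # If cvs itself fails, report that rather than a false all-clear.
    output=$(cd "$workdir" && cvs -n -q update) || {
        echo "could not query the repository" >&2
        return 2
    }
    # Status letters: M = locally modified, A = locally added,
    # R = locally removed, C = conflict. Lines starting with U or P are
    # repository updates and do not indicate local edits.
    changes=$(printf '%s\n' "$output" | grep '^[MARC] ' || true)
    if [ -n "$changes" ]; then
        echo "Local modifications found:"
        echo "$changes"
        return 1
    fi
    echo "No local modifications."
}
```

If check_clean /var/www/htdocs reports local modifications, move those changes into the repository from a development workstation before updating production.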

We've purposely left out a substantial amount of information about using CVS within your development process. The items detailed here are a small window into the full functions that CVS provides to the developer. These points are to serve as a jumping-off point for you to consider how to better integrate CVS (or your revision control system of choice) with your entire development workflow.


Production environment

Now that you have installed, configured, and deployed Drupal to the virtual machine, it's time to ship it off to its final hosting environment. We are fortunate to have access to an enterprise virtualization platform, VMware ESX Server. The hosting environment is on a fast Internet connection in a typical data center environment and is managed by professional IT staff. All we need to provide to the IT staff is the virtual disk files and resource requirements for our virtual machine.

Providing the resource requirements necessary to execute the IBC site is easy; a simple e-mail was all that was necessary. Transporting and installing the virtual disk files was more complicated and involved a small amount of trial and error.

Migrating the virtual machine to the production server

The two VMware products we used do not produce directly compatible virtual disk files. As an added complication, we preallocated our virtual disk files, which made transferring the files across a network problematic because of the large file size (8 GB). If you have followed along and created your virtual machine as we did in Figure 8, then you will not experience this problem. Unfortunately, importing the virtual disk into VMware ESX Server did not go as smoothly as we anticipated. It turns out that the virtual disk files created in the free VMware Server cannot be directly imported without a little help. We aren't sure if this is the most appropriate method for accomplishing the task, but it worked well for us.

In Listing 4 you see a line in the virtual disk descriptor file that was created for your virtual machine. In our case the virtual disk descriptor file is named IBC.vmdk; your file name will be similar to the name you gave to the virtual machine.


Listing 4. Details of incompatible virtual disk file
...
ddb.virtualHWVersion = "4"
...
	

This line records the virtual hardware version of the virtual disk, and the value written by VMware Server made the disk incompatible with our version of VMware ESX Server. We reduced the version number until we were able to import the disk and test it. The simple change is shown in Listing 5.


Listing 5. Compatible virtual disk file
...
ddb.virtualHWVersion = "3"
...
	

Once we made that change, we were able to import the virtual disk into VMware ESX Server and finish the deployment procedure.
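The edit can be made in any text editor. If you want to script it, the sketch below shows one way, keeping a backup of the original file. The function name set_hw_version is hypothetical, and the sketch assumes the descriptor is a small plain-text file, as ours was; do not run it against a binary disk file.

```shell
#!/bin/sh
# Sketch: change ddb.virtualHWVersion in a plain-text virtual disk
# descriptor. set_hw_version is a hypothetical helper name.
# Usage: set_hw_version IBC.vmdk 3

set_hw_version() {
    descriptor="$1"
    version="$2"
    # Keep the original so the change can be undone.
    cp "$descriptor" "$descriptor.bak" || return 1
    # Rewrite only the version line; every other line passes through
    # unchanged.
    sed "s/^ddb\.virtualHWVersion = \"[0-9]*\"/ddb.virtualHWVersion = \"$version\"/" \
        "$descriptor.bak" > "$descriptor"
}
```

The original descriptor remains available as IBC.vmdk.bak if the import still fails and you need to try another value.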

Storage expansion

Now that the base virtual machine has been imported into its final hosting environment, we need to add another disk to handle the many files that will be uploaded. This is a simple task because we used logical volume management. We requested that the IT staff managing the VMware ESX Server install an additional 8 GB disk in our virtual machine. We booted the virtual machine and added the new disk to the volume group and formatted it for use. Listing 6 shows the commands we used.


Listing 6. Adding the new disk to a volume group
$ pvcreate /dev/sdb
$ vgextend /dev/VolGroup00 /dev/sdb
$ lvextend -L +8GB /dev/VolGroup00/LogVol00
$ ext2online /dev/VolGroup00/LogVol00 16g
	

The first three commands in Listing 6 format the new disk for use in the logical volume management system, add it to the volume group VolGroup00, and then extend the logical volume LogVol00 to use the new space. A logical volume is analogous to a disk partition for our purposes. The last command, ext2online, resizes the ext3 file system on the fly while the system is running.
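It can help to look at the volume group before and after running these commands, because the usable space on a new disk is typically slightly less than its raw size. The sketch below wraps the standard LVM reporting commands; the function name show_storage is hypothetical, and the commands need to be run as root.

```shell
#!/bin/sh
# Sketch: report LVM and file system state for one volume group.
# show_storage is a hypothetical helper name.
# Usage (as root): show_storage VolGroup00 /

show_storage() {
    vg="$1"
    mount_point="$2"
    pvs                  # physical volumes and the group each belongs to
    vgs "$vg"            # total and free space in the volume group
    lvs "$vg"            # logical volumes carved out of the group
    df -h "$mount_point" # space seen by the mounted file system
}
```

Running it before lvextend shows how much free space the volume group really has; running it after ext2online confirms that the file system has grown.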

At this point we've built a virtual machine that contains our Drupal system, installed it in the production environment, and increased the disk space for the files our users will upload. It is now a good time to verify that the uploaded files will be protected by Drupal's access control system, rather than be publicly available. Navigate to the admin/settings URL and you should see something similar to Figure 17. Set the download method to Private, and be sure to set the file system path to a location that is not inside the directory /var/www/htdocs.
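A quick way to double-check the chosen file system path is a small shell test like the sketch below. The function name is_outside_docroot and the example path /var/www/files are hypothetical. The check is a plain string comparison, so it does not resolve symbolic links or ".." components.

```shell
#!/bin/sh
# Sketch: confirm that a files directory is not inside the Web server's
# document root. is_outside_docroot is a hypothetical helper name.
# Returns 0 if the directory is outside the document root, 1 otherwise.

is_outside_docroot() {
    files_dir="${1%/}"   # strip one trailing slash, if present
    docroot="${2%/}"
    # Append a slash so that /var/www/htdocs-private is not mistaken
    # for a subdirectory of /var/www/htdocs.
    case "$files_dir/" in
        "$docroot"/*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example with a hypothetical private files directory.
if is_outside_docroot /var/www/files /var/www/htdocs; then
    echo "/var/www/files is outside the document root"
fi
```

The directory you choose must also be writable by the user that the Web server runs as, or uploads will fail.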


Figure 17. Configuring files for private download

Summary

In this article you learned how, and why, we deployed our Drupal site using virtualization technologies. We derived our deployment requirements from a thorough understanding of our users' access patterns, and from the realization that we could not spend time managing physical hardware. There is no one-size-fits-all approach to deploying a Drupal-powered site. We chose virtualization because it fit our needs, and it has been a lasting success.

Stay tuned for the next article, which covers how to use the Eclipse IDE to develop your Drupal Web site.


Resources

Learn

Get products and technologies

  • Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss
