The first article in this series followed the migration of a single physical server to a single cloud server. Despite all the work you did, however, the application was no better off, largely because more single points of failure were introduced.
Even on a single physical server, you can have redundant power supplies, error-correcting RAM, redundant disks, and copious monitoring of pre-fault indicators. On a cloud server, you don't know what you've got—or, more to the point, what you have access to. Cloud servers are generally reliable, but taking precautions is smart, especially since Amazon provides extra services to enhance reliability.
When deploying to a cloud environment, it is wise to assume that you could lose a virtual instance at any point. This is not to say that cloud services are unreliable, only that the types of failures you might run into are unlike what you're used to in a physical environment. Therefore, you should push intelligence into your application to deal with communications loss and to scale across multiple servers. This type of thinking will build a better application regardless of the type of environment you're building for.
In this article, you learn how to improve on the database's ephemeral storage using Amazon Elastic Block Store (EBS). You further improve your data protection by setting up backups. You protect against application server loss by load balancing across multiple instances, allowing you to recover from various failures.
Figure 1 shows the architecture of the application where you left off last time.
Figure 1. The current architecture
Everything is on one Amazon Elastic Compute Cloud (Amazon EC2) instance. The front-end Web server, nginx, proxies requests to multiple mongrel instances or serves static files itself. The mongrel application servers access a PostgreSQL database on the same host.
Instance storage is the biggest difference between Amazon EC2 and virtualization technologies like VMware and Xen. Recall that an Amazon EC2 instance gives you a fixed 10GB root partition and an instance disk that's sized according to the type of instance launched. The root partition is cloned from the Amazon Machine Image (AMI) on boot, and the instance store is empty. When you shut down your server, your instance store is lost.
Amazon's initial position was to tell people to back up their servers frequently to Amazon Simple Storage Service (Amazon S3). If your server crashed, then you would have other servers pick up the load, or you'd get your data from Amazon S3. Eventually, Amazon came out with EBS, which is a service that provides persistent disks. If your server crashes, you can reattach the EBS volume to another server. Amazon even built a snapshot mechanism to ease backups.
The prime problem with the database server in the SmallPayroll application is that it is a single point of failure. There are two general approaches to fixing this. One is to build two database servers that can take over for each other; the other approach is to reduce the potential downtime to something reasonable. The first option has the least downtime but is complex. The second option is much more practical in this situation. If the database server should crash, a new instance will be started to replace it. EBS takes care of keeping the data safe. Total time to launch a new database server and re-point the clients should be under 10 minutes from the time the fault is noticed. As a final bonus, EBS storage has a higher I/O capacity than the instance storage.
To use EBS, you perform the following steps:
- Create the volume with the ec2-create-volume command.
- Attach the volume to a running instance with the ec2-attach-volume command.
- Create a file system on the volume.
- Mount the file system to a directory.
Setting up EBS for the first time
The first step in setting up EBS is to tell Amazon you want to create a volume. You need to know two things: the size of the volume (in gigabytes) and the availability zone in which you want to use it. An availability zone is Amazon's term for the location of a server. Zones starting with us-east are in northern Virginia and are collectively called the us-east region. There are three such zones in the us-east region at this time: us-east-1a, us-east-1b, and us-east-1c. Each availability zone is designed to be isolated from failures in other availability zones. Zones in the same region are still close to each other and therefore have low latency between them.
One restriction of EBS is that the volume can only be mounted in the availability zone in which it was created. There are ways to move it, but you must create your volumes in the same availability zone as your server.
Run the command:
ec2-create-volume -s 20 -z us-east-1a
to create a 20GB volume in the us-east-1a zone. If you don't know where your server is, the ec2-describe-instances command tells you. You can use the same -z parameter to ec2-run-instances to specify where your server is to be launched. Listing 1 shows this command and its output.
Listing 1. Creating the EBS volume
$ ec2-create-volume -s 20 -z us-east-1a
VOLUME  vol-c8791ca1  20  us-east-1a  creating  2010-07-01T02:52:52+0000
The output of Listing 1 shows that the volume is being created and that the
volume ID is vol-c8791ca1. Knowing this, you
can attach the volume to a running Amazon EC2 instance if you know the
instance identifier of the server and the device you want the server to
see the volume as. Run the command:
ec2-attach-volume vol-c8791ca1 -i i-fd15e097 -d /dev/sdj
to attach this newly created volume to the service instance
i-fd15e097. Remember that you can find your
instance identifier through the
ec2-describe-instances command and see the list
of volumes with ec2-describe-volumes.
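The create-and-attach sequence lends itself to scripting. The following is a minimal sketch rather than part of the original procedure: the function names volume_id_from_output and create_and_attach are hypothetical, and the parsing assumes the output layout shown in Listing 1, where the volume ID is the second field of the VOLUME line.

```shell
#!/bin/sh
# Hypothetical helpers; assumes the EC2 API tools are installed and configured.

# Print the volume ID from ec2-create-volume output
# (the second field of the line that starts with VOLUME).
volume_id_from_output() {
    printf '%s\n' "$1" | awk '$1 == "VOLUME" { print $2; exit }'
}

# Create a volume of SIZE gigabytes in ZONE, wait for it to become
# available, and attach it to INSTANCE as DEVICE.
create_and_attach() {
    size=$1 zone=$2 instance=$3 device=$4
    output=$(ec2-create-volume -s "$size" -z "$zone") || return 1
    volume=$(volume_id_from_output "$output")
    [ -n "$volume" ] || return 1
    # Poll until the volume leaves the "creating" state.
    until ec2-describe-volumes "$volume" | grep -q available; do
        sleep 5
    done
    ec2-attach-volume "$volume" -i "$instance" -d "$device"
}

# Example (not run here):
#   create_and_attach 20 us-east-1a i-fd15e097 /dev/sdj
```

Capturing the ID from the command output avoids retyping it by hand, which is where transcription mistakes tend to creep in.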
Your virtual server now has a disk called /dev/sdj that looks to it like a normal hard disk. As with any disk, you need to create a file system on the raw disk. You have several choices, depending on your needs:
- Create a standard third extended (ext3) file system.
- Create an XFS file system. Doing so allows you to freeze the file system to take a snapshot for backup purposes.
- Layer the Logical Volume Manager (LVM) between the disk and the file system. Doing so allows you to extend the EBS volume at a later time.
- Use Linux® software RAID to stripe multiple EBS volumes, and put either XFS or ext3 on top of the RAID set. Doing so gives even higher disk performance.
Even though RAID and LVM provide interesting features, XFS is the simplest option for a relatively small EBS volume. You will be able to use the freezing features of XFS along with the EBS snapshots to make consistent backups. Listing 2 shows how to create an XFS file system and mount it on the host.
Listing 2. Creating and mounting the XFS file system
# mkfs.xfs /dev/sdj
meta-data=/dev/sdj isize=256 agcount=8, agsize=32768 blks
= sectsz=512 attr=0
data = bsize=4096 blocks=262144, imaxpct=25
= sunit=0 swidth=0 blks, unwritten=1
naming =version 2 bsize=4096
log =internal log bsize=4096 blocks=2560, version=1
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=4096 blocks=0, rtextents=0
# mkdir /ebsvol
# mount /dev/sdj /ebsvol
Listing 2 runs the mkfs.xfs command to format
/dev/sdj. (Run yum install -y xfsprogs if you
do not have the mkfs.xfs command.) The output
of this command describes the parameters of the file system. As long as
there are no errors in the output, it can be ignored. The last two
commands in Listing 2 create a mount point called /ebsvol, and
then mount the file system on the mount point.
The file system is now usable. Any files under /ebsvol will persist even when the server is down.
You have an EBS volume mounted on /ebsvol and need to move the PostgreSQL data over. The most direct way to do so is to copy over the existing data store and fix things up with a symlink. Although this technique would work, a cleaner option is to move the data onto the EBS volume and then mount that directory back onto /var/lib/pgsql. Listing 3 shows this procedure.
Listing 3. Moving the PostgreSQL data to the EBS volume
# service postgresql stop
# mv /var/lib/pgsql /ebsvol
# mkdir /var/lib/pgsql
# chown postgres:postgres /var/lib/pgsql
# mount /ebsvol/pgsql /var/lib/pgsql -o bind
# service postgresql start
The sequence of commands in Listing 3 is as follows:
- Stop the PostgreSQL daemon to ensure data consistency.
- Move the whole directory tree to the EBS store.
- Recreate the PostgreSQL directory.
- Reset the ownership of the PostgreSQL directory.
- Mount /ebsvol/pgsql on top of /var/lib/pgsql using mount's bind option.
- Restart the database.
The bind option to
mount clones the first directory onto the
second. Changes to one appear on the other—after all, it's the
same blocks on the same disk. Using bind
differs from mounting the same device twice in that you can mount
subdirectories instead of the whole file system.
If your server crashes, perform the following steps:
- Start a new instance using your AMI.
- Attach the EBS volume to the instance with ec2-attach-volume.
- Mount the EBS device on /ebsvol.
- Perform the last four commands from Listing 3.
As long as you've bundled your AMI recently, your database system will be up to date.
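The instance-side part of this recovery can be collected into a script so that it is not typed from memory during an outage. The sketch below is one possible arrangement, not the article's own tooling: run and remount_pgsql are hypothetical names, the default device /dev/sdj matches the earlier attach command, and the script assumes that PostgreSQL is not already running on the new instance. It uses mkdir -p so that existing directories do not cause an error.

```shell
#!/bin/sh
# Hypothetical recovery helper. Run as root on the replacement instance
# after ec2-attach-volume has attached the EBS volume.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount the EBS device, bind the PostgreSQL directory into place,
# and start the database. Stops at the first command that fails.
remount_pgsql() {
    device=${1:-/dev/sdj}
    run mkdir -p /ebsvol /var/lib/pgsql &&
    run mount "$device" /ebsvol &&
    run chown postgres:postgres /var/lib/pgsql &&
    run mount /ebsvol/pgsql /var/lib/pgsql -o bind &&
    run service postgresql start
}

# Example (not run here):
#   DRY_RUN=1; remount_pgsql /dev/sdj    # preview the commands
```

Running it once with DRY_RUN=1 is a cheap way to confirm the device name and paths before anything is mounted.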
Addressing the application server
One of the benefits that cloud computing offers is that you have easy access to server capacity. Right now, the SmallPayroll.ca environment has both the database and the application server on the same virtual instance, which is the same as prior to the migration to Amazon EC2. The next step is to separate the application server from the database server.
The term scaling is generally associated with capacity. If an application is said to scale, then the application can be grown to handle the load of more users. If this scaling is accomplished by adding servers, it's called scaling horizontally. If you're replacing a server with a larger server to handle the load, then the application scales vertically.
You can use horizontal and vertical scaling in combination. It may be easier to outrun database capacity problems with bigger servers and faster disks and to spread computations across more servers. Being able to scale horizontally or vertically is mostly a function of application design. Some applications cannot be spread across multiple computers, and some operations take a certain amount of time no matter how fast the computer is. Furthermore, some applications may scale horizontally to a certain point, at which a bottleneck drives the marginal gain of adding a server to nothing.
When you spread the application out across multiple servers, the problem arises of how to distribute incoming requests. The device most often used to do this is a load balancer, which is an appliance that accepts requests from the outside world and hands them off to the next available application server. Because this is not an intensive task, a single device can handle a large number of connections, or this function can be handled in software.
Amazon EC2 provides a cloud load balancer called Elastic Load Balancing that is adequate for most purposes. It distributes requests, can be reconfigured to add or remove servers through an API, and performs health checks on the back-end servers.
An alternative to using Elastic Load Balancing is to run your own load-balancing software on an Amazon EC2 instance, such as HAProxy or Varnish (see Resources for links to more information). This process would be more complex than using Elastic Load Balancing but would provide a higher level of control over your own traffic. Elastic Load Balancing is more than adequate for an application like SmallPayroll.ca.
Figure 2 shows the new design of the SmallPayroll.ca application.
Figure 2. The SmallPayroll.ca application with separate application servers
Incoming requests land on Elastic Load Balancing and are sent to one of two servers. The servers themselves run nginx, which handles static requests and proxies any dynamic requests to mongrel instances. The mongrels attach to a single database server.
In the event that one of the application servers becomes incapacitated, Elastic Load Balancing redirects all traffic to the other server.
To build the separate application servers, you need to launch two more instances. You can use the same AMI as before, because it should have all the necessary software. It is possible to launch more than one instance at a time: Listing 4 shows two instances being launched with one command.
Listing 4. Launching two instances at once
$ ec2-run-instances ami-147f977d -k main -z us-east-1a \
-d 'role=web,db=10.201.207.180' -n 2
RESERVATION r-9cc240f7 223410055806 default
INSTANCE i-81ee2eeb ami-147f977d pending main ...
INSTANCE i-87ee2eed ami-147f977d pending main ...
The ec2-run-instances command is similar to those used in the past. The availability zone is chosen with -z us-east-1a, because the database server is in that zone. At the moment, you want to keep the database server and application servers in the same availability zone to reduce latency and bandwidth charges.
However, the -d and
-n parameters are new. The
-n 2 parameter simply tells Amazon to launch
two instances, which is confirmed in the output. The
-d parameter allows you to pass information to
the instance. Listing 5, taken from the new instance, shows how to
retrieve this information.
Listing 5. Retrieving instance metadata
[root@domU-12-31-39-0C-C5-B2 ~]# DATA=`curl -s http://169.254.169.254/latest/user-data`
[root@domU-12-31-39-0C-C5-B2 ~]# echo $DATA
role=web,db=10.201.207.180
The curl command retrieves a Web page from the
Amazon EC2 services containing the user data. This is similar to how the
server retrieved its Secure Shell (SSH) key in the previous article in this series.
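Once the user data is on the instance, it still has to be split into its fields. The following sketch shows one way to do that with standard tools; the function name user_data_field is hypothetical, and the key=value,key=value layout is simply the convention chosen in Listing 4, not something Amazon imposes.

```shell
#!/bin/sh
# Print the value of KEY from a comma-separated list of key=value pairs.
# Prints nothing if the key is absent.
user_data_field() {
    printf '%s\n' "$1" | tr ',' '\n' |
        awk -F= -v key="$2" '$1 == key { print $2; exit }'
}

# Example (on an instance, not run here):
#   DATA=$(curl -s http://169.254.169.254/latest/user-data)
#   ROLE=$(user_data_field "$DATA" role)
#   DB=$(user_data_field "$DATA" db)
```

A boot script could use the role value to decide which services to start and the db value to locate the database server.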
Configuring the application servers
There isn't much to do on the application servers, because the AMI they were cloned from is already capable of running the application against a local database. A Rails application reads its database configuration from config/database.yml, which tells the application the database server to use. By default, the application connects to localhost.
First, create a DNS alias by adding an entry to /etc/hosts. For example,
10.201.207.180 dbserver aliases the name
dbserver to the address 10.201.207.180. It is important to
use the private address of the database, which is the address assigned to
eth0, instead of the public address you connect to. Traffic between the
private addresses of Amazon EC2 instances in the same availability zone is
free, but traffic from one Amazon EC2 instance to the public address of
another instance is billable.
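Editing /etc/hosts by hand works for two servers, but a small helper keeps the entry consistent when instances are relaunched. This is a sketch under stated assumptions: set_hosts_entry is a hypothetical name, and the optional third argument exists only so that the function can be tried against a scratch file before it touches /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: point NAME at IP in a hosts file, replacing any
# earlier entry for NAME so that repeated runs do not add duplicates.
set_hosts_entry() {
    ip=$1 name=$2 file=${3:-/etc/hosts}
    tmp=$(mktemp) || return 1
    awk -v name="$name" '
        /^[ \t]*#/ { print; next }      # keep comment lines untouched
        {
            for (i = 2; i <= NF; i++)
                if ($i == name) next    # drop lines that already define NAME
            print
        }' "$file" > "$tmp" &&
    printf '%s %s\n' "$ip" "$name" >> "$tmp" &&
    cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (as root, not run here):
#   set_hosts_entry 10.201.207.180 dbserver
```

One limitation of this sketch is that a line listing several names, one of which is NAME, is dropped whole, so it suits entries that carry a single name.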
Next, edit your database.yml file to point your application to the DNS alias you created previously. Listing 6 shows such a configuration.
Listing 6. Pointing the application to the database server
production:
  adapter: postgresql
  encoding: utf8
  database: payroll_prod
  pool: 5
  username: payroll
  password: secret
  host: dbserver
You should be able to launch your Rails application and connect to it over the public IP address of the application server. If you get an error, check the following:
- Is PostgreSQL listening on all interfaces? The postgresql.conf file must have a line like listen_addresses = '*'.
- Does pg_hba.conf allow 10/8 addresses to connect using MD5 authentication?
- Does your Amazon security group allow the connection to the database server?
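The first two checks can be scripted on the database server. The sketch below assumes that the pg_hba.conf rule is written with the network 10.0.0.0/8 in CIDR form and the md5 method at the end of the line. The function names are hypothetical, and you supply the paths to your own configuration files.

```shell
#!/bin/sh
# Hypothetical checks; pass the paths to your own configuration files.

# Succeeds if postgresql.conf has an active (uncommented)
# listen_addresses = '*' line.
listens_on_all_interfaces() {
    grep -Eq "^[[:space:]]*listen_addresses[[:space:]]*=[[:space:]]*'\*'" "$1"
}

# Succeeds if pg_hba.conf has a host rule for 10.0.0.0/8 that uses md5.
allows_private_md5() {
    grep -Eq '^[[:space:]]*host[[:space:]].*[[:space:]]10\.0\.0\.0/8[[:space:]]+md5' "$1"
}

# Example (on the database server, not run here):
#   listens_on_all_interfaces /var/lib/pgsql/data/postgresql.conf || echo "check listen_addresses"
#   allows_private_md5 /var/lib/pgsql/data/pg_hba.conf || echo "check pg_hba.conf"
```

For an end-to-end test from an application server, a command such as psql -h dbserver -U payroll payroll_prod -c 'select 1' exercises the security group, the listener, and the authentication rule together.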
Elastic Load Balancing is a fairly simple load balancer. Requests come in to the load balancer and are directed to an available server in the pool. Elastic Load Balancing can do some basic health checking of the Web servers to avoid sending requests to servers that are down. It also has some basic affinity mechanisms that let you keep users on the same back-end servers. More advanced features, such as redirecting based on the URL, are not currently supported.
Configuring Elastic Load Balancing is a three-step process:
- Create the load balancing instance.
- Define your health checks.
- Configure DNS to point to the Elastic Load Balancing name.
Listing 7 shows the first two steps in action.
Listing 7. Configuring an Elastic Load Balancing instance
$ elb-create-lb smallpayroll-http \
--listener "lb-port=80,instance-port=80,protocol=HTTP" \
--availability-zones us-east-1a
DNS_NAME DNS_NAME
DNS_NAME smallpayroll-http-706414765.us-east-1.elb.amazonaws.com
$ elb-configure-healthcheck smallpayroll-http --target "HTTP:80/" \
--interval 30 --timeout 3 --unhealthy-threshold 2 --healthy-threshold 2
HEALTH_CHECK TARGET INTERVAL TIMEOUT HEALTHY_THRESHOLD UNHEALTHY_THRESHOLD
HEALTH_CHECK HTTP:80/ 30 3 2 2
Listing 7 shows two commands. The first command,
elb-create-lb, creates the load balancer. The
first parameter is the name of the load balancer, which is unique to you.
The --listener parameter dictates that the
public-facing port is 80, that it is also to be connected to port 80 on
the instance, and that the protocol in use is HTTP. The output of this
command is a DNS name—in this case,
smallpayroll-http-706414765.us-east-1.elb.amazonaws.com.
Unlike with most load balancers, you are not given a public IP address to
connect to. Amazon assigns its own IP addresses, and you connect through a
DNS alias.
The second command, elb-configure-healthcheck,
first references the name of the load balancer, and then specifies that
the health check be performed with the HTTP protocol on port 80 using the
root URL. It is also possible to write a separate controller and action to
handle the checks, such as /status, but in this case, the root URL
provides enough assurance that the application is running properly.
The second line of parameters specifies, in order, the following:
- --interval 30. Test every 30 seconds.
- --timeout 3. Wait up to 3 seconds for a response before failing the test.
- --unhealthy-threshold 2. Two consecutive failed tests mark the server as out of service.
- --healthy-threshold 2. A server that is out of service requires two consecutive successful checks before it is brought back into the pool.
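Before relying on the health check, it can help to run an equivalent probe by hand against each application server. The following sketch approximates the check with curl, using the same 3-second timeout. The function names are hypothetical, and the rule that only a 200 response counts as a pass reflects how the HTTP health check is understood to behave. If your root URL redirects to a login page, the check would fail.

```shell
#!/bin/sh
# Hypothetical probe that approximates the configured health check.
# Assumes curl is installed; the load balancer's own checks are what count.

# An HTTP health check passes only on a 200 response.
is_healthy_status() {
    [ "$1" = "200" ]
}

# Request URL with a 3-second timeout and report the result.
probe() {
    code=$(curl -s -o /dev/null -m 3 -w '%{http_code}' "$1")
    if is_healthy_status "$code"; then
        echo "healthy"
    else
        echo "unhealthy ($code)"
        return 1
    fi
}

# Example (not run here), using the private address of an application server:
#   probe http://10.201.207.181:80/
```

The address in the example is a placeholder. With the settings shown, a server that stops responding is taken out of the pool after roughly two intervals, or about a minute.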
The next step is to attach instances to the load balancer (see Listing 8). You can add and remove instances at will.
Listing 8. Adding two instances to the load balancer
$ elb-register-instances-with-lb smallpayroll-http --instances i-87f232ed,i-85f232ef
INSTANCE_ID  INSTANCE_ID
INSTANCE_ID  i-85f232ef
INSTANCE_ID  i-87f232ed
$ elb-describe-instance-health smallpayroll-http --headers
INSTANCE_ID  INSTANCE_ID  STATE      DESCRIPTION  REASON-CODE
INSTANCE_ID  i-85f232ef   InService  N/A          N/A
INSTANCE_ID  i-87f232ed   InService  N/A          N/A
Listing 8 first shows two instances being added to the
smallpayroll-http load balancer. Run the
elb-describe-instance-health command to see the
status of each server in the pool. InService
means that the server is able to handle requests through the load
balancer.
Finally, browse to the DNS name of your load balancer. You should see the
application working across two servers. To make the load balancer work for
the real DNS name of your application, change your application's DNS
record from an A record to a
CNAME pointing at the DNS name of the load
balancer. See Resources for more details about
the DNS requirements, including some caveats. Although the DNS method is
cumbersome, it allows you to handle orders of magnitude more requests than
you could by building a load balancer on an Amazon EC2 instance. The DNS
change can happen at any time, because there would be no disruption of
service.
The application is now spread across two nodes, and the database server can be started from scratch in less than half an hour. This is good for availability but doesn't help if an administrator accidentally destroys critical data or if the EBS volume fails. Fortunately, solutions are available to address these problems.
EBS provides a snapshot feature that stores a copy of the volume in Amazon S3. To be precise, an EBS snapshot stores the differences since the last snapshot. A database complicates the matter, because it caches some disk writes, which may result in an inconsistent snapshot. Therefore, you must make sure that everything is on disk in a consistent state (see Listing 9). The backup proceeds in the following order:
- Tell PostgreSQL to enter backup mode.
- Freeze the file system.
- Request a snapshot from Amazon.
- Unfreeze the file system.
- Tell PostgreSQL that the backup is complete.
The procedure itself takes only a minute or two; Amazon continues spooling the snapshot to Amazon S3 in the background. Changes made after step 3 are not reflected in the snapshot, however.
Listing 9. Backing up the database
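#!/bin/sh
# Environment needed by the EC2 API tools when run from cron
export EC2_HOME=/usr/local/
export JAVA_HOME=/usr
export EC2_CERT="/root/.ec2/cert.pem"
export EC2_PRIVATE_KEY="/root/.ec2/pk.pem"
# Put PostgreSQL into backup mode so that its on-disk state is consistent
echo "select pg_start_backup('snapshot')" | su - postgres -c psql
# Freeze the file system, request the snapshot, then unfreeze
/usr/sbin/xfs_freeze -f /ebsvol/
/usr/local/bin/ec2-create-snapshot vol-93f77ffa --description "`date`"
/usr/sbin/xfs_freeze -u /ebsvol/
# End backup mode (pg_stop_backup takes no label argument)
echo "select pg_stop_backup()" | su - postgres -c psql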
You can verify the status of the snapshot with the
ec2-describe-snapshots command, as in Listing
10.
Listing 10. Showing the EBS snapshots (condensed)
$ ec2-describe-snapshots --headers
SnapshotId VolumeId Status StartTime
SNAPSHOT snap-298cb741 vol-93f77ffa completed 2010-06-29T02:50:55
SNAPSHOT snap-a2b959c9 vol-93f77ffa completed 2010-07-13T15:14:54
Listing 10 shows two completed snapshots along with their times.
You should automate the creation of snapshots by running Listing 9 from cron. You should also
periodically prune your list of snapshots with the
ec2-delete-snapshot command.
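Pruning is easier to automate if the choice of snapshots to delete is separated from the deletion itself. The sketch below reads ec2-describe-snapshots output and prints the IDs of the older snapshots. It assumes the column order shown in Listing 10 (snapshot ID, volume ID, status, start time). The function name snapshots_to_prune, the retention count, and the script path in the crontab example are all hypothetical.

```shell
#!/bin/sh
# Hypothetical pruning helper. Reads ec2-describe-snapshots output on
# stdin and prints the IDs of completed snapshots of VOLUME, skipping
# the KEEP newest. ISO timestamps sort correctly as plain text.
snapshots_to_prune() {
    volume=$1 keep=$2
    awk -v vol="$volume" \
        '$1 == "SNAPSHOT" && $3 == vol && $4 == "completed" { print $5, $2 }' |
        sort -r |
        awk -v keep="$keep" 'NR > keep { print $2 }'
}

# Example (not run here): keep the 7 newest snapshots of the volume.
#   ec2-describe-snapshots | snapshots_to_prune vol-93f77ffa 7 |
#       while read -r snap; do ec2-delete-snapshot "$snap"; done
#
# Example crontab entry that runs the backup script nightly at 02:00
# (the script path is an assumption):
#   0 2 * * * /usr/local/bin/backup-db.sh
```

Printing the candidate IDs first lets you review the list before piping it to ec2-delete-snapshot.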
If your EBS volume fails or if you need to restore old data from EBS, you will need to restore from your last snapshot. The procedure to restore an EBS volume is almost identical to creating a new one. Listing 11 shows how to restore the last snapshot from Listing 10.
Listing 11. Restoring an EBS snapshot
$ ec2-create-volume --snapshot snap-a2b959c9 -z us-east-1a -s 20
VOLUME  vol-d06b06b9  20  snap-a2b959c9  us-east-1a  creating
You can then mount this volume on any instance to restore your data.
Backing up and restoring files
A simple way to back up files from your servers is to copy them into Amazon
S3 or make them part of your stock AMI. The latter method is more
effective for binaries and software packages, while copying to Amazon S3
is more for user data. The S3Sync tool provides
some command-line Amazon S3 tools along with a handy
rsync-like utility.
Download the S3Sync utilities (see Resources for the link). Listing 12 shows how to create a bucket for backups and how to upload files to Amazon S3.
Listing 12. Backing up your data to Amazon S3
$ s3cmd.rb createbucket smallpayroll-backup
$ s3cmd.rb listbuckets
ertw.com
smallpayroll-backup
$ s3sync.rb -r /var/uploads smallpayroll-backup:`date +%Y-%m-%d`
$ s3cmd.rb list smallpayroll-backup:2010-07-12
--------------------
2010-07-12/uploads
2010-07-12/uploads/file1.txt
Listing 12 starts by creating a bucket called smallpayroll-backup. You can safely store different backups from different times in the same bucket, so you perform this step only once. The second command verifies that the bucket was created; you can see the bucket that was just created and the ertw.com bucket, where the AMIs reside.
The s3sync.rb command recursively copies the
/var/uploads directory into the backup bucket, prefixing all the files
with the current date. The final command shows all the files inside that
bucket.
Restoring files is just as simple. You can either use
S3Sync with the parameters reversed or retrieve
an individual file through another tool, like Amazon S3 File Manager (see
Resources).
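The backup and restore directions can be wrapped in two small functions so that the argument order is not left to memory. This is a sketch under stated assumptions: backup_dir, restore_dir, and run are hypothetical names, and the exact directory layout that s3sync.rb produces on restore is not confirmed here. Restore into a scratch directory first and inspect the result.

```shell
#!/bin/sh
# Hypothetical wrappers around s3sync.rb.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy DIR into BUCKET under a prefix named for today's date.
backup_dir() {
    run s3sync.rb -r "$1" "$2:$(date +%Y-%m-%d)"
}

# Copy the backup stored under BUCKET:PREFIX into DIR
# (the same command with source and destination swapped).
restore_dir() {
    run s3sync.rb -r "$1:$2" "$3"
}

# Example (not run here):
#   backup_dir /var/uploads smallpayroll-backup
#   restore_dir smallpayroll-backup 2010-07-12 /tmp/restore
```

Keeping the date prefix as an explicit argument to restore_dir makes it clear which day's backup is being retrieved.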
The SmallPayroll application is running in the cloud and is better designed for future growth. Even though the mean time between failures of the hardware hasn't changed, the backups and scripts put in place mean that the data is safe and that you can quickly rebuild the environment, if needed.
Most of the original shortcomings of a straight migration to the cloud have been addressed. There is little visibility into the health of the environment, however, and it would be helpful to be able to scale server resources in response to demand. These issues will be addressed in the next two articles in this series.
Resources
Learn
- RightScale on EBS: The RightScale guys have given an in-depth look at EBS, providing more information about snapshots and availability zones.
- EBS performance: You can attach multiple EBS volumes to one server. The MySQL performance blog has an excellent comparison of using RAID to increase EBS performance that is relevant to other databases, too.
- PostgreSQL online backups: These backups are invaluable, but they take some understanding to ensure consistency.
- Using Instance Data: An Amazon EC2 instance has several pieces of instance metadata that can help the instance learn about the environment. Browse through this chapter of the Amazon EC2 documentation to get ideas on what you can do.
- Introduction to Elastic Load Balancing: The Elastic Load Balancing service operates differently than you may be used to, especially because it needs you to use a DNS CNAME to alias your application to the Amazon hostname.
- ELB and CNAMEs to the root apex discussion: If you're trying to use Elastic Load Balancing to balance the root of your domain, such as example.com (also called the root apex), this discussion in the Amazon Web Services discussion forum highlights the problems and provides workarounds.
- In the Cloud Computing area
on developerWorks, get the resources you need to develop and
deploy applications in the cloud and keep on top of recent cloud
developments.
- In the developerWorks Linux zone, find hundreds of how-to articles and tutorials, as well as downloads, discussion
forums, and a wealth of other resources for Linux developers and
administrators.
- Stay current with developerWorks technical events and webcasts focused on a variety
of IBM products and IT industry topics.
- Attend a free developerWorks Live! briefing to get up-to-speed quickly on
IBM products and tools, as well as IT industry trends.
- Watch developerWorks on-demand demos ranging from product installation
and setup demos for beginners, to advanced functionality for experienced
developers.
- Follow developerWorks on
Twitter, or subscribe to a feed of Linux tweets on developerWorks.
Get products and technologies
- HAProxy and Varnish: If you are looking for alternatives to Elastic Load Balancing, look at these two projects. HAProxy is an event-driven reverse proxy, while Varnish uses a threaded model and also performs caching.
- Amazon S3 File Manager: Now that you've got multiple AMIs inside Amazon S3, you might want to prune some old ones. The Amazon S3 File Manager is a Web-based file manager that rivals the features of many standalone applications or browser plug-ins. If you delete an AMI, don't forget to ec2-deregister it.
- S3Sync: S3Sync is a helpful tool for copying files to and from Amazon S3 as well as manipulating your buckets.
- Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the SOA Sandbox learning how to implement Service Oriented Architecture efficiently.
Discuss
- Get involved in the My developerWorks
community. Connect with other developerWorks users while exploring
the developer-driven blogs, forums, groups, and wikis.

Sean Walberg has been working with Linux and UNIX systems since 1994 in academic, corporate, and Internet Service Provider environments. He has written extensively about systems administration over the past several years. You can contact him at sean@ertw.com.




