Migrate your Linux application to the Amazon cloud, Part 4: Conquering administrative challenges

Prevent headaches as you grow

Sean A. Walberg, Senior Network Engineer
Sean Walberg has been working with Linux and UNIX systems since 1994 in academic, corporate, and Internet Service Provider environments. He has written extensively about systems administration over the past several years. You can contact him at sean@ertw.com.

Summary:  Up to now, you have moved your application to the cloud and can enable and disable resources automatically in response to demand. In this article, the fourth in a series on migrating a Linux application to the Amazon cloud, learn how to keep this changing environment under control so that it supports your application and business.

Date:  27 Oct 2010
Level:  Intermediate

You migrated the SmallPayroll.ca application to the Amazon cloud in Part 1 of this series and made it more robust in Part 2. The application can even add and remove servers on its own, depending on load, as you saw in Part 3. At any given time, you can no longer predict the number or the IP addresses of the active servers, which makes connecting to them a challenge. In this respect, the cloud environment differs from a traditional data center.

The dynamic nature of the cloud also makes application deployment difficult. Your list of servers will be different between deployments, so how do you update the application? For that matter, how do you monitor your servers for faults?

This isn't your normal data center

In a "normal" data center, you can name your computers whatever you want, give them IP addresses that suit you, and—if you want—go and look at your servers to make sure they're still there. Maybe you keep a spreadsheet to track your servers, maybe you have software, and maybe you just keep information in your head or in a text file. Do you have configuration management to make sure your configuration is consistent?

The cloud environment is much different from a traditional data center, because you're ceding control of many functions. You can't predict IP addresses or even ensure that two servers will be on the same subnet. If you progress to automatic scaling of resources, all the hard work you did in manual configurations might be lost when a new node is launched. Your scripts that rely on knowing you have 20 Web servers with predictable names won't work in the cloud.

Fortunately, a bit of discipline can work around these problems and even improve your uptime in the physical data center!

IP addressing and naming

People tend to spend a great deal of time worrying about what to name their servers and how to come up with a sensible IP addressing scheme. Amazon Elastic Compute Cloud (Amazon EC2) instances come up with a fairly random IP address and a name based on this address. You could certainly rename your server, but that often requires knowledge of the rest of the environment. For example, to call a server webprd42, you have to know that the last server you launched was webprd41.

The better solution is not to rely on names or IP addresses and to build your software such that these names don't matter.

Configuration management

In a physical environment, you can usually get away with making manual configuration changes to servers. When servers are launched automatically, manual changes won't be applied. You can re-bundle your Amazon Machine Image (AMI) after each change, but doing so doesn't solve the problem of how to push updates to the other servers that are already running. Fortunately, excellent software packages, such as Puppet and Cfengine, can automate these changes for you (see Resources).

Deploying application changes is another aspect of configuration management that deserves a separate look. Generic configuration management tools can do the job, but using them to reproduce the specific steps in deploying an application and managing migrations and configuration rollbacks is difficult. The Rails community has come out with other tools, such as Capistrano, to handle the task of application deployment (see Resources).

It is helpful to look at configuration management as two separate problems. The first is how to manage the server—from the installation of software packages to the configuration of various daemons. The second is how to deploy new versions of software in a controlled manner.

System monitoring

It's important to know what your servers are doing. CPU, disk resources, memory, and the network are vital components you need to monitor. Daemons running on your system, including the application itself, may have other metrics to watch. For example, watching application response time and the number of connections to the Web server and application server can warn you of problems before they happen.

Many tools are available to monitor servers and graph the results. The challenge is how to monitor new servers as they come online and how to stop monitoring them as they are taken offline.

Patterns as applied to cloud architecture

Three general patterns emerge when you look at how to manage a dynamic environment such as Amazon EC2:

  • Client poll. Each cloud server acts as a client and queries a central server for resources. You don't need to know the addresses of all your servers using this pattern, but the servers operate on their own schedule, so you can't control the timing of the client's polling.
  • Server push. This pattern first queries the cloud provider's application programming interface (API) to find the current list of servers, then a central server contacts each server to do the work. This pattern is slower and requires that the management tool understand the dynamic nature of the environment, but it has the benefit of allowing you to synchronize updates.
  • Client registration. As each server comes online, it registers itself with a central server. Before the server is terminated, it de-registers itself. This method is more complex but lets you use non-cloud-aware tools in a cloud environment.

Client polling for configuration management

This pattern is easy to implement: A client simply polls a well-known server for instructions on a predetermined schedule. If the server doesn't have anything for the client to do, it informs the client of such. The downside is that instructions can only be issued if the client polls the server; if the change is urgent, it must wait for the next poll.

An excellent use for polling is configuration management of the server. The Puppet package from Reductive Labs is a popular configuration management tool. A process, called the Puppetmaster, runs on a central server. Clients run the Puppet daemon, which polls the Puppetmaster for the appropriate configuration manifest. These configuration manifests specify the desired end state of a particular component, such as "make sure that the NTP daemon is installed and running." Puppet reads these manifests and corrects any problems.

Your distribution may come with Puppet, or you can quickly install it with gem install puppet facter. Puppet implements a security system that complicates matters, however. Clients must have a signed key to talk to the Puppetmaster. You can tell the Puppetmaster to automatically sign keys for clients that connect, but doing so would allow anyone to download your configuration files. An alternative solution is to ignore the Puppetmaster, distribute your manifests yourself, and run the Puppet tools locally.

The sequence of events to have the client run the Puppet manifests is as follows:

  1. Download an updated copy of the manifests and any associated files from the server.
  2. Run Puppet against the manifest.

For step 1, the tool of choice is rsync, which only downloads changed files. For step 2, the puppet command (part of the puppet installation) executes the manifest. Note that there are two caveats to this approach:

  • The server must accept the client's Secure Shell (SSH) public key. This key can be distributed in the AMI.
  • Any configuration files you specify in the manifest must be copied with the manifest. The built-in Puppet file server also requires certificates, so you can't use this file transfer method.

The sample manifest ensures that the client has the correct Network Time Protocol (NTP) configuration. This involves making sure that the software is installed, the configuration file is in place, and the daemon is running. Listing 1 shows the top-level manifest.


Listing 1. The top-level manifest
	
import "classes/*"

node default {
  include ntpclient
}

Getting started with Puppet

One of Puppet's benefits is that you can start small and put more things under Puppet's control as you learn the language. For example, you can start with NTP, as shown in this article, and grow to support other services. Each time you improve your Puppet manifest, you're making your systems that much more reliable and saving yourself future work.

Listing 1 first imports all the files in the classes directory; each file contains information about a single component. All nodes then include the ntpclient class, which is defined in Listing 2.


Listing 2. The ntpclient class
	
class ntpclient {
  package {
    ntp:
      ensure => installed
  }
  service {
    ntpd:
      ensure => true,
      enable => true,
      subscribe => [ Package["ntp"], File["ntp.conf"] ],
  }
  file {
    "ntp.conf":
      mode => 644,
      owner => root,
      group => root,
      path => "/etc/ntp.conf",
      source => "/var/puppetstage/files/etc/ntp.conf",
      before => Service["ntpd"]
  }
}

A detailed look at the Puppet language is outside the scope of this article, but at a high level, Listing 2 defines a class called ntpclient that is composed of a package called ntp, a service called ntpd, and a file in /etc called ntp.conf. If the ntp package is not installed, Puppet uses the appropriate tool, such as yum or apt-get, to install it. If the service is not running or is missing from the startup scripts, Puppet fixes it. If the ntp.conf file differs from the copy in /var/puppetstage/files/etc, the file is updated. The before and subscribe lines make sure that the daemon gets restarted if the configuration changes.

The server stores the manifests and files in /var/puppetdist, and clients copy that tree to /var/puppetstage. The outline of the directory tree is shown in Listing 3.


Listing 3. Contents of /var/puppetdist
	
/var/puppetdist/
|-- files
|   `-- etc
|       `-- ntp.conf
`-- manifests
    |-- classes
    |   `-- ntp.conf
    `-- site.pp

Finally, Listing 4 synchronizes the files and runs the manifest on the client.


Listing 4. Client code to synchronize and run the manifest
	
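#!/bin/bash
# Pull the latest manifests and files; --delete drops files removed on the server
/usr/bin/rsync -avz --delete puppetserver:/var/puppetdist/ /var/puppetstage/ || exit 1
# Apply the manifest only if the synchronization succeeded
/usr/bin/puppet /var/puppetstage/manifests/site.pp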

This code, when run from cron periodically, picks up any changes in the manifests and applies them to the cloud server. If the server's configuration somehow gets changed, Puppet takes steps to put the server back into compliance.
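
As a rough sketch of the cron side, the helper below prints an entry in the /etc/cron.d format (which includes a user field). The function name cron_entry, the script path /usr/local/bin/puppet_sync.sh, and the 30-minute interval are assumptions made for illustration; they do not come from the article.

```shell
#!/bin/bash
# Hypothetical helper: print an /etc/cron.d style entry that runs the
# synchronization script every N minutes as root.
# The default script path is an assumed location for Listing 4.
cron_entry() {
  local minutes=$1
  local script=${2:-/usr/local/bin/puppet_sync.sh}
  printf '*/%s * * * * root %s\n' "$minutes" "$script"
}

# Print an entry that polls every 30 minutes
cron_entry 30
```

Redirecting the output to a file under /etc/cron.d on the AMI would schedule the poll on every server launched from that image. The interval is a trade-off: a shorter one applies changes sooner but generates more traffic to the server holding the manifests.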


Pushing application updates

Configuration updates on servers rarely require synchronization between servers. If a package needs to be upgraded, a half-hour window is usually enough. For application updates, however, you want to roll out your changes at once, and you want control over the timing. A popular tool for accomplishing this is Capistrano. You write a script that uses Capistrano's domain-specific language and run various tasks. Listing 5 shows a minimal Capistrano script to push an application to a known set of servers.


Listing 5. A simple Capistrano script
	
set :application, "payroll"
set :repository,  "https://svn.smallpayroll.ca/svn/app/trunk/"
set :user, 'payroll'
set :home, '/home/payroll'
set :deploy_to, "#{home}"
set :rails_env, "production"

role :db, "174.129.174.213", :primary => true
role :web, "174.129.174.213", "184.73.3.169"

Most of the lines in Listing 5 set variables that alter the default behavior of Capistrano, which is to use SSH to access all the servers and use a source code management tool to check out a copy of the application. The last two lines define the servers in use—in particular, the database and Web servers. These roles are known to Capistrano (and can be extended for your own purposes).

The problem with Listing 5 is that the servers must be predefined. It is possible to have Capistrano determine the list of servers at run time using the Amazon Web Services (AWS) APIs, however. First, run:

gem install amazon-ec2

to install a library that implements the API. Then, modify your Capistrano recipe (deploy.rb) as shown in Listing 6.


Listing 6. Modifying Capistrano to dynamically load the list of servers at run time
	
# Put this at the beginning of your deploy.rb
require 'AWS'

# Change your role :web definition to this
role(:web) { my_instances }

# This goes at the bottom of the recipe
def my_instances
  @ec2 = AWS::EC2::Base.new(  :access_key_id => ENV['AWS_ACCESS_KEY_ID'],
                              :secret_access_key => ENV['AWS_SECRET_ACCESS_KEY'])
  servers = @ec2.describe_instances.reservationSet.item.collect do |itemgroup|
    itemgroup.instancesSet.item.collect {|item| item.ipAddress}
  end
  servers.flatten
end

Listing 6 changes the Web role from a static definition to a dynamic list of servers returned from the my_instances function. The function uses the Amazon EC2 API DescribeInstances call to return a list of servers. The API returns data in a format that groups instances that were launched together under the same reservation identifier. The outer collect loop iterates over these reservation groups, and the inner collect loop iterates over the servers contained within each reservation group. The result is an array of arrays, which is flattened to a single-dimensional array of server IP addresses and passed back to the caller.

It is fortunate that Capistrano has provided a way to operate on a dynamic list of servers. If it did not provide such hooks, then you would need to take another approach.


Registering with a management server

For applications that don't easily allow you to use a dynamic list of servers, you can work around the problem by having the cloud server register itself with other applications. This process generally takes one of two forms:

  • The cloud server connects to another server and runs a script, which updates the management application directly.
  • The cloud server drops a file with some metadata in a common place, such as Amazon Simple Storage Service (Amazon S3), where other scripts look to rebuild their configuration files.
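
The second form is not developed further in this article, so here is a minimal sketch of what the dropped file might contain. The function name registration_record, the role field, and the key=value layout are all assumptions; the upload step is left as a comment because it depends on which Amazon S3 client you use.

```shell
#!/bin/bash
# Hypothetical helper: build a small key=value registration record
# describing one cloud server. The field names are illustrative only.
registration_record() {
  local instance_id=$1 public_ip=$2 role=$3
  printf 'instance_id=%s\npublic_ip=%s\nrole=%s\n' \
    "$instance_id" "$public_ip" "$role"
}

# Example values; on a real instance they would come from the
# Amazon EC2 metadata service.
registration_record i-12345678 1.2.3.4 web

# Uploading the record is left to your S3 client of choice, for example
# (bucket name is a placeholder):
#   registration_record ... > /tmp/record
#   s3cmd put /tmp/record s3://your-bucket/servers/i-12345678
```

A script on the management server could then list the objects under that prefix and rebuild its configuration from them, removing entries whose files have disappeared.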

Direct updates

Cacti is a popular performance management tool that can graph various metrics through Simple Network Management Protocol (SNMP) or scripts and combine these graphs into dashboards or meta-graphs (see Resources). The limitation with Cacti is that you have to configure the server for management within the Cacti Web interface or through command-line scripts. In this example, the cloud server connects back to the Cacti server and configures itself.

Cacti is based on a system of templates, which makes mass changes to graphs much easier. All the command-line tools operate on the template identifier, though, so you must first figure out which identifiers to use. Listing 7 shows how to find the host template, which pre-populates some data elements for you.


Listing 7. Listing the host templates
	
$ php -q /var/lib/cacti/cli/add_device.php --list-host-templates
Valid Host Templates: (id, name)
0       None
1       Generic SNMP-enabled Host
3       ucd/net SNMP Host
4       Karlnet Wireless Bridge
5       Cisco Router
6       Netware 4/5 Server
7       Windows 2000/XP Host
8       Local Linux Machine

Template number 3 is for a host running the Net-SNMP daemon, which is available with most Linux® distributions out there. Using this specific daemon rather than a more generic version allows you to monitor some Linux-specific counters easily.

Knowing that you are using host template 3, the list of available graphs is shown in Listing 8.


Listing 8. Listing the graph templates
	
$ php -q /var/lib/cacti/cli/add_graphs.php --list-graph-templates --host-template-id=3
Known Graph Templates:(id, name)
4       ucd/net - CPU Usage
11      ucd/net - Load Average
13      ucd/net - Memory Usage

The three graphs in Listing 8 are what you get with the default Cacti distribution for this host template. Many more are available: leave off the --host-template-id option to see them all, or import graphs from sources on the Internet.

Listing 9 shows how to add a new device, and then a CPU graph.


Listing 9. Adding a new device with a graph
	
$ php -q /var/lib/cacti/cli/add_device.php --description="EC2-1.2.3.4" \
  --ip=1.2.3.4 --template=3
Adding EC2-1.2.3.4 (1.2.3.4) as "ucd/net SNMP Host" using SNMP v1 with community "public"
Success - new device-id: (5)
$ php -q /var/lib/cacti/cli/add_graphs.php --host-id=5 --graph-type=cg \
  --graph-template-id=4
Graph Added - graph-id: (6) - data-source-ids: (11, 12, 13)

Listing 9 first adds a host with the IP address 1.2.3.4. The device ID returned is 5, which is then used to add a graph for CPU usage (graph type of cg and template 4). The results are the ID of the graph and the IDs of the various data sources that are now being monitored.

It is now fairly easy to script the procedure in Listing 9. Listing 10 shows such a script.


Listing 10. add_to_cacti.sh
	
#!/bin/bash

IP=$1

# Add a new device and parse the output to only return the id
DEVICEID=`php -q /var/lib/cacti/cli/add_device.php --description="EC2-$IP" \
  --ip=$IP --template=3 | grep device-id | sed 's/[^0-9]//g'`
# CPU graph
php -q /var/lib/cacti/cli/add_graphs.php --host-id=$DEVICEID --graph-type=cg \
    --graph-template-id=4

The first parameter to the script is saved to a variable called $IP. The add_device.php script is run with this IP address, with the output filtered to only the line containing the ID using the grep command. The output of this is fed into a sed script that only prints numbers. This value is saved in a variable called $DEVICEID.
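
To see the filtering in isolation, the sketch below runs the same grep and sed pipeline against the sample output from Listing 9, so no Cacti installation is needed. The function name parse_device_id and the sample_output variable are introduced here for illustration only.

```shell
#!/bin/bash
# Sample output copied from Listing 9
sample_output='Adding EC2-1.2.3.4 (1.2.3.4) as "ucd/net SNMP Host" using SNMP v1 with community "public"
Success - new device-id: (5)'

# Same pipeline as Listing 10: keep only the device-id line,
# then delete every character that is not a digit.
parse_device_id() {
  grep device-id | sed 's/[^0-9]//g'
}

DEVICEID=$(printf '%s\n' "$sample_output" | parse_device_id)
echo "$DEVICEID"
```

If add_device.php fails, no line contains device-id and the variable ends up empty, so a production script would probably want to check for an empty $DEVICEID before calling add_graphs.php.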

With the device ID stored, adding a graph is as simple as calling the add_graphs.php script. Note that the CPU graph is the simplest case and that some other types of graphs require more parameters.

With the add_to_cacti.sh script on the Cacti server, all it takes is for the cloud server to run it. Listing 11 shows how to call the script.


Listing 11. Calling the cacti script from the cloud server
	
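#!/bin/bash

# Ask the Amazon EC2 metadata service for this instance's public IP address
MYIP=`/usr/bin/curl -s http://169.254.169.254/2007-01-19/meta-data/public-ipv4`
# Run the script from Listing 10 on the Cacti server to register this instance
ssh cacti@cacti.example.com "/usr/local/bin/add_to_cacti.sh $MYIP"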

Listing 11 first calls the Amazon EC2 meta-data server to return the public IP address, and then runs the command remotely on the Cacti server.
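
Because the value returned by the metadata server is passed straight into a remote command, it may be worth checking that it looks like an IP address first. The following is a sketch of one such check, not part of the original script; the function name is_ipv4 is hypothetical, and MYIP is set to a stand-in value so the example runs anywhere.

```shell
#!/bin/bash
# Hypothetical check: accept only dotted-quad strings whose four
# octets are each between 0 and 255.
is_ipv4() {
  local ip=$1 octet
  [[ $ip =~ ^[0-9]{1,3}(\.[0-9]{1,3}){3}$ ]] || return 1
  local IFS=.
  for octet in $ip; do
    # 10# forces base 10 so that leading zeros are not read as octal
    (( 10#$octet <= 255 )) || return 1
  done
}

MYIP="1.2.3.4"   # stand-in for the value curl returns in Listing 11
if is_ipv4 "$MYIP"; then
  echo "would register $MYIP"
else
  echo "metadata lookup returned an unexpected value" >&2
fi
```

In the real script, the echo in the first branch would be replaced by the ssh call, so an empty or malformed response never reaches the Cacti server.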

Back to Puppet

If you want to use the Puppetmaster and its certificate authority instead of the rsync workaround detailed earlier, this registration pattern will be useful.


Conclusion

This series has followed the migration of an application from a single server to the AWS cloud. Improvements were made incrementally to take advantage of the Amazon EC2 offerings, from launching new servers to load balancers. This final article looked at managing a dynamic cloud environment and offered some patterns for you to use.

Given the low cost of entry to using cloud resources, you should have a look and try to conduct a practice migration. Even if you decide not to run the application in production using the cloud, you will learn a lot about what can be done in the cloud and perhaps improve your systems management skills.


Resources

Get products and technologies

  • Now that you've got multiple AMIs inside Amazon S3, you might want to prune some old ones. Amazon S3 File Manager is a Web-based file manager that rivals the features of many stand-alone applications or browser plug-ins. If you delete an AMI, don't forget to ec2-deregister it.

  • Capistrano is a popular deployment package that acts in a similar manner to Rake.

  • Cfengine is the most popular configuration management tool for UNIX®. It is lightweight and can operate on a large number of machines.

  • Cacti is a network graphing tool built around RRDTool. You can graph almost anything imaginable. If it's in your data center, there's a good chance that someone has already written a plug-in to graph it.

  • Puppet is a configuration management tool written in Ruby and built to overcome some limitations in Cfengine. If you're looking for a good way to start, Pulling Strings with Puppet by James Turnbull (Apress, 2008) is a book that the author enjoyed.

  • Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the SOA Sandbox learning how to implement Service Oriented Architecture efficiently.
