Agile DevOps: Infrastructure automation

Treat infrastructure as code with Chef or Puppet

How many times have you manually applied the same steps when creating an infrastructure, or relied on another team to set up an environment for you? What if all of these actions were scripted and versioned just like the rest of the software system? In this Agile DevOps installment, DevOps expert Paul Duvall shows how Chef and Puppet enable you to automate infrastructure provisioning. He covers the basics of each of these tools — along with their similarities, use cases, and differences — and provides a video demo of scripting with Puppet.


Paul Duvall, CTO, Stelligent

Paul Duvall is the CTO of Stelligent. A featured speaker at many leading software conferences, he has worked in virtually every role on software projects: developer, project manager, architect, and tester. He is the principal author of Continuous Integration: Improving Software Quality and Reducing Risk (Addison-Wesley, 2007) and a 2008 Jolt Award Winner. He is also the author of Startup@Cloud and DevOps in the Cloud LiveLessons (Pearson Education, June 2012). He's contributed to several other books as well. Paul authored the 20-article Automation for the people series on developerWorks. He is passionate about getting high-quality software to users quicker and more often through continuous delivery and the cloud. Read his blog at

11 September 2012

Infrastructure automation is the process of scripting environments — from installing an operating system, to installing and configuring servers on instances, to configuring how the instances and software communicate with one another, and much more. By scripting environments, you can apply the same configuration to a single node or to thousands.

Infrastructure automation also goes by other names: configuration management, IT management, provisioning, scripted infrastructures, system configuration management, and many other overlapping terms. The point is the same: you are describing your infrastructure and its configuration as a script or set of scripts so that environments can be replicated in a much less error-prone manner. Infrastructure automation brings agility to both development and operations because any authorized team member can modify the scripts while applying good development practices — such as automated testing and versioning — to your infrastructure.

About this series

Developers can learn a lot from operations, and operations can learn a lot from developers. This series of articles is dedicated to exploring the practical uses of applying an operations mindset to development, and vice versa — and of considering software products as holistic entities that can be delivered with more agility and frequency than ever before.

In the past decade, several open source and commercial tools have emerged to support infrastructure automation. The open source tools include Bcfg2, CFEngine, Chef, and Puppet. They can be used in the cloud and in virtual and physical environments. In this article, I'll focus on the most popular open source infrastructure automation tools: Chef and Puppet. Although you won't learn the intricacies of either tool, you'll get an understanding of the similarities and differences between them, along with some representative examples. For a more detailed example of setting up and using an infrastructure automation tool, this article provides a companion video that shows how to run Puppet in a cloud environment.

Traditional approaches

Not all teams apply infrastructure automation tools, along with their practices and patterns, so what are they doing instead? Traditional approaches, which do not scale, include configuring environments manually or writing and running combinations of scripts that a human must execute. This leads to error-prone processes that increase cycle times, preventing teams from regularly releasing software.

Chef and Puppet both use a Ruby domain-specific language (DSL) for scripting environments. Chef is expressed as an internal Ruby DSL, and Puppet users primarily use its external DSL, which is itself implemented in Ruby. These tools tend to be used more often in Linux® system automation, but they support Windows as well. Puppet has a larger user base than Chef, and it offers more support for older, outdated operating systems. With Puppet, you can set dependencies on other tasks. Both tools are idempotent, meaning that you get the same result with the same configuration no matter how many times you run it.
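To make idempotence concrete, here is a minimal plain-Ruby sketch. It is not Chef or Puppet code, and the helper names ensure_line and append_line are hypothetical. The first helper describes a desired end state, so a second run changes nothing; the second is the kind of naive script that drifts on every run.

```ruby
require "tempfile"

# Hypothetical helper: make sure `line` is present in the file at `path`.
# It checks the current state first, so repeated runs change nothing.
def ensure_line(path, line)
  lines = File.exist?(path) ? File.readlines(path, chomp: true) : []
  return :unchanged if lines.include?(line)

  File.open(path, "a") { |f| f.puts(line) }
  :changed
end

# A naive, non-idempotent script: appends the line on every run.
def append_line(path, line)
  File.open(path, "a") { |f| f.puts(line) }
end

config = Tempfile.new("example_config")
config.close

first  = ensure_line(config.path, "PermitRootLogin no")
second = ensure_line(config.path, "PermitRootLogin no")
puts "first run: #{first}, second run: #{second}"
# prints "first run: changed, second run: unchanged"
```

Real tools apply the same check-then-act idea to packages, services, users, and files rather than to single lines of text.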


Chef

Chef has been around since 2009. It was influenced by Puppet and CFEngine. Chef supports multiple platforms including Ubuntu, Debian, RHEL/CentOS, Fedora, Mac OS X, Windows 7, and Windows Server. It is often described as easier to use, particularly for Ruby developers, because everything in Chef is defined as a Ruby script and follows a model that developers are used to working in. Chef has a passionate user base, and the Chef community is rapidly growing while developing cookbooks for others to use.

How it works

In Chef, three core components interact with one another — Chef server, nodes, and Chef workstation. Chef runs cookbooks, which consist of recipes that perform automated steps — called actions — on nodes, such as installing and configuring software or adding files. The Chef server contains configuration data for managing multiple nodes. The configuration files and resources stored on the Chef server are pulled down by nodes when requested. Examples of resources include file, package, cron, and execute.

Users interact with the Chef server using Chef's command-line interface, called Knife. Nodes can have one or more roles. A role defines attributes (node-specific settings) and recipes for a node and can apply them across multiple nodes. Recipes can run other recipes. The recipes in a node, called a run list, are executed in the order they are listed. A Chef workstation is an instance with a local Chef repository and Knife installed on it.
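The way roles and recipes combine into an ordered run list can be sketched in plain Ruby. This is an illustration only, not Chef's implementation: the role names, recipe names, and the expand_run_list helper are hypothetical, and the sketch assumes that a recipe reached more than once runs only the first time.

```ruby
# Hypothetical role definitions: each role maps to its own run list.
ROLES = {
  "base"      => ["recipe[ntp]", "recipe[users]"],
  "webserver" => ["role[base]", "recipe[java]", "recipe[tomcat]"]
}.freeze

# Expand a run list into an ordered list of recipe names.
# Roles are expanded in place; a recipe that was already seen is skipped.
# (Cyclic role references are not handled in this sketch.)
def expand_run_list(run_list, roles = ROLES, seen = [])
  run_list.each do |item|
    kind, name = item.match(/\A(role|recipe)\[(.+)\]\z/)&.captures
    case kind
    when "role"
      expand_run_list(roles.fetch(name), roles, seen)
    when "recipe"
      seen << name unless seen.include?(name)
    else
      raise ArgumentError, "unrecognized run list item: #{item}"
    end
  end
  seen
end

puts expand_run_list(["role[webserver]", "recipe[ntp]", "recipe[jenkins]"]).join(", ")
# prints "ntp, users, java, tomcat, jenkins"
```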

Table 1 describes the core components of Chef:

Table 1. Chef components
Attributes: Describe node data, such as the IP address and hostname.
Chef client: Does work on behalf of a node. A single Chef client can run recipes for multiple nodes.
Chef Solo: Allows you to run Chef cookbooks in the absence of a Chef server.
Cookbooks: Contain all the resources you need to automate your infrastructure and can be shared with other Chef users. Cookbooks typically consist of multiple recipes.
Data bags: Contain globally available data used by nodes and roles.
Knife: Used by system administrators to upload configuration changes to the Chef server. Knife can also run commands on nodes over SSH.
Management console: Chef server's web interface for managing nodes, roles, cookbooks, data bags, and API clients.
Node: A host that runs the Chef client. The primary features of a node, from Chef's point of view, are its attributes and its run list. Nodes are the component to which recipes and roles are applied.
Ohai: Detects data about your operating system. It can be used stand-alone, but its primary purpose is to provide node data to Chef.
Recipe: The fundamental configuration in Chef. Recipes encapsulate collections of resources that are executed in the order defined to configure the nodes.
Repository (Chef repository): The place where cookbooks, roles, configuration files, and other artifacts for managing systems with Chef are hosted.
Resource: A cross-platform abstraction of something you're configuring on a node. For example, users and packages can be configured differently on different OS platforms; Chef abstracts the complexity in doing this away from the user.
Role: A mechanism for grouping similar features of similar nodes.
Server (Chef server): Centralized repository of your server's configuration.


Listing 1 demonstrates the use of the service resource within a recipe that's part of a Tomcat cookbook. The case statement shows how a recipe can apply platform-specific settings while managing server configuration.

Listing 1. Chef recipes
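service "tomcat" do
  service_name "tomcat6"
  # Declare which init-script commands are available on each platform
  case node["platform"]
  when "centos", "redhat", "fedora"
    supports :restart => true, :status => true
  when "debian", "ubuntu"
    supports :restart => true, :reload => true, :status => true
  end
  # Enable the service at boot and start it now
  action [:enable, :start]
end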
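The platform branching in Listing 1 can be tried outside Chef with a plain-Ruby stand-in. The supported_actions helper below is hypothetical, and returning an empty hash for unknown platforms is my assumption rather than Chef behavior.

```ruby
# Plain-Ruby stand-in for the case statement in Listing 1.
# Returns the init-script commands assumed to be available on a platform.
def supported_actions(platform)
  case platform
  when "centos", "redhat", "fedora"
    { restart: true, status: true }
  when "debian", "ubuntu"
    { restart: true, reload: true, status: true }
  else
    {} # assumption: unknown platforms advertise nothing
  end
end

%w[centos ubuntu solaris].each do |platform|
  puts "#{platform}: #{supported_actions(platform).keys.join(', ')}"
end
```

Because `when` accepts a comma-separated list, one branch covers a whole family of platforms.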

Listing 2 defines the attributes for the Tomcat cookbook. In this example, I'm defining some external ports for the Tomcat server to make available. Other types of attributes you might see include values for directories, options, users, and other configurations.

Listing 2. Chef attributes
default["tomcat"]["port"] = 8080
default["tomcat"]["ssl_port"] = 8443
default["tomcat"]["ajp_port"] = 8009
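Cookbook defaults like these are meant to be overridden for particular nodes or roles. The following plain-Ruby sketch shows the general idea as a recursive hash merge. It is a simplification: Chef has several attribute precedence levels, the two levels here stand in for all of them, and the deep_merge helper and the 9090 override are hypothetical.

```ruby
# Cookbook defaults, mirroring Listing 2.
DEFAULTS = {
  "tomcat" => { "port" => 8080, "ssl_port" => 8443, "ajp_port" => 8009 }
}.freeze

# Recursively merge `overrides` on top of `defaults`. Nested hashes are
# merged key by key; for any other value, the one from `overrides` wins.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Hypothetical role-level setting that moves Tomcat to port 9090.
role_overrides = { "tomcat" => { "port" => 9090 } }

node = deep_merge(DEFAULTS, role_overrides)
puts "port=#{node['tomcat']['port']} ssl_port=#{node['tomcat']['ssl_port']}"
# prints "port=9090 ssl_port=8443"
```

The override replaces only the key it names; the SSL and AJP ports keep their cookbook defaults.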

Chef extends the Ruby language — as compared to an external DSL — to provide a model for applying configuration to many nodes at once. Chef uses an imperative model without explicit dependency management, so people with more of a development background tend to gravitate toward Chef when they are scripting environments.
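To see what an internal DSL means in practice, consider this toy sketch. It is not Chef's implementation, and ToyRecipe and ToyService are hypothetical names. It shows that a resource declaration such as the one in Listing 1 is ordinary Ruby: a method call that takes a block, with the block evaluated against a resource object.

```ruby
# Toy resource: records whatever the block sets on it.
class ToyService
  attr_reader :name, :actions

  def initialize(name)
    @name = name
    @service_name = name
    @actions = []
  end

  # With an argument, sets the value; without one, reads it.
  def service_name(value = nil)
    @service_name = value if value
    @service_name
  end

  def action(*names)
    @actions = names.flatten
  end
end

# Toy recipe context: `service "x" do ... end` is a method call with a block.
class ToyRecipe
  attr_reader :resources

  def initialize
    @resources = []
  end

  def service(name, &block)
    resource = ToyService.new(name)
    resource.instance_eval(&block) if block # run the block against the resource
    @resources << resource
    resource
  end
end

recipe = ToyRecipe.new
recipe.instance_eval do
  service "tomcat" do
    service_name "tomcat6"
    action [:enable, :start]
  end
end

recipe.resources.each { |r| puts "#{r.service_name}: #{r.actions.join(', ')}" }
# prints "tomcat6: enable, start"
```

Resources are collected in the order the Ruby code declares them, which matches the order-of-definition execution described for recipes.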


Puppet

Puppet has been in use since 2005. Many organizations, including Google, Twitter, Oracle, and Rackspace, use it to manage their infrastructure. Puppet, which tends to have a steeper learning curve than Chef, supports a variety of Windows and *nix environments. Puppet has a large and active user community. It has been used in thousands of organizations with installations running tens of thousands of instances.

How it works

Puppet uses the concept of a master server — called the Puppet master — which centralizes the configuration among nodes and groups them together based on type. For example, if you had a set of web servers that were all running Tomcat with a Jenkins WAR, you'd group them together on the Puppet master. The Puppet agent runs as a daemon on systems. This enables you to deploy infrastructure changes to multiple nodes simultaneously. It functions the same way as a deployment manager, but instead of deploying applications, it deploys infrastructure changes.

Puppet includes a tool called facter. Facter holds metadata about the system and can be used to filter among servers. For example, you can use facter to determine a node's hostname. MCollective is a deployment tool that integrates with Puppet. You can use MCollective to deploy infrastructure changes across nodes.
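Filtering servers by their facts can be pictured with a small plain-Ruby sketch. It does not call Facter; the inventory is made-up data and the nodes_matching helper is hypothetical. The hostname and operatingsystem keys follow Facter's fact names, while role stands in for a custom fact.

```ruby
# Hypothetical inventory: the facts each node reported.
NODES = [
  { "hostname" => "web01", "operatingsystem" => "CentOS", "role" => "web" },
  { "hostname" => "web02", "operatingsystem" => "Ubuntu", "role" => "web" },
  { "hostname" => "db01",  "operatingsystem" => "CentOS", "role" => "db" }
].freeze

# Return the nodes whose facts match every key/value pair in `criteria`.
def nodes_matching(nodes, criteria)
  nodes.select do |facts|
    criteria.all? { |fact, value| facts[fact] == value }
  end
end

centos_web = nodes_matching(NODES, "operatingsystem" => "CentOS", "role" => "web")
puts centos_web.map { |facts| facts["hostname"] }.join(", ")
# prints "web01"
```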

Table 2 lists the key components of Puppet:

Table 2. Key Puppet components
Agent: A daemon process running on a node that collects information about the node and sends it to the Puppet master.
Catalog: Compiled from manifests and the node's facts; specifies how to configure the node.
Facts: Data about a node, sent by the node to the Puppet master.
Manifest: Describes resources and the dependencies among them.
Module: Groups related manifests (in a directory). For example, a module might define how a database like MySQL gets installed, configured, and run.
Node: A host that is managed by the Puppet master. Nodes are defined like classes but contain the host name or fully qualified domain name.
Puppet master: The server that manages all the Puppet nodes.
Resource: A unit of configuration, for example, a package, file, or service.


In the example in Listing 3, a Puppet manifest describes the packages to install on a node. Puppet determines the best approach and order of execution for installing these packages.

Listing 3. Puppet manifest for package installation
class system {
  package { "rubygems": ensure => "installed" }

  # Build tools and development headers
  package { "make": ensure => "installed" }
  package { "gcc": ensure => "installed" }
  package { "gcc-c++": ensure => "installed" }
  package { "ruby-devel": ensure => "installed" }
  package { "libcurl-devel": ensure => "installed" }
  package { "zlib-devel": ensure => "installed" }
  package { "openssl-devel": ensure => "installed" }
  package { "libxml2-devel": ensure => "installed" }
  package { "libxslt-devel": ensure => "installed" }
}

The Puppet manifest snippet in Listing 4 shows examples of different resource types — package and service — that can be used in scripting an infrastructure:

Listing 4. Puppet manifest for httpd
class httpd {
  package { 'httpd-devel':
    ensure => installed,
  }
  service { 'httpd':
    ensure    => running,
    enable    => true,
    # Apply after the package, and restart the service when it changes
    subscribe => Package['httpd-devel'],
  }
}

Puppet employs a declarative model with explicit dependency management. Because of this, it tends to be one of the first tools considered by engineers who have more of a systems administration background and are looking to script their environments.
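Explicit dependencies let a tool work out a safe order of application instead of relying on the order in which resources are written. The sketch below illustrates the idea with a topological sort from Ruby's standard TSort module. It is not how Puppet is implemented, and the dependency data, loosely modeled on Listing 4 with an added configuration file, and the apply_order helper are hypothetical.

```ruby
require "tsort"

# Each resource maps to the resources it depends on (hypothetical data).
DEPENDENCIES = {
  "Service[httpd]"       => ["Package[httpd-devel]", "File[httpd.conf]"],
  "File[httpd.conf]"     => ["Package[httpd-devel]"],
  "Package[httpd-devel]" => []
}.freeze

# Return the resources ordered so that every dependency comes before the
# resources that need it. TSort raises TSort::Cyclic on circular dependencies.
def apply_order(deps)
  each_node  = ->(&block) { deps.each_key(&block) }
  each_child = ->(node, &block) { deps.fetch(node).each(&block) }
  TSort.tsort(each_node, each_child)
end

puts apply_order(DEPENDENCIES).join(" -> ")
# prints "Package[httpd-devel] -> File[httpd.conf] -> Service[httpd]"
```

Although the service is declared first in the hash, it is applied last because everything it depends on must be in place beforehand.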

Infrastructure as code

In this article, you learned through examples that your infrastructure no longer needs to be a manual effort uniquely applied to individual nodes. By automating your infrastructure, you can scale it up and down with far less manual effort. Because your infrastructure is modeled in scripts, you can version and test them just like the application code.

In the next article, you'll learn patterns and techniques for creating ephemeral (or transient) environments — environments that are created and destroyed in 24 hours and embrace the abundance mindset (that is, lack of scarcity) that comes with Agile DevOps.



