Discover OpenStack: The Storage components Swift and Cinder

This article presents OpenStack Object (Swift) and Block (Cinder) storage, explains how they fit into the overall architecture, and shows how they operate. It illustrates the tools with insight into what it takes to install, configure, and use the components.

John Rhoton (john.rhoton@gmail.com), Cloud Computing Strategist, Recursive

John Rhoton is a technology strategist specializing in consulting to global enterprise customers, with a focus on public, private, and hybrid cloud computing. He speaks regularly at industry events on emerging technologies such as mobility, social networking, and virtualization and is the author of seven books, including Cloud Computing Explained (2009) and Cloud Computing Architected (2011).



12 December 2013

This article describes the OpenStack storage components, which offer persistent storage for other OpenStack projects.

As noted in the article on OpenStack Compute, computation is the core of a computational workload. In some cases, a computational instance may be all that's required, but often, there is a need for durable storage that persists beyond the life of an instance. Or there may be a requirement to share large amounts of data between running services.

In fact, there may be cases in which applications running outside an OpenStack environment depend on replicated, scalable, and reliable storage, and OpenStack Storage meets those specifications. But before I evaluate the alternatives, it's important to realize that OpenStack and many other cloud services have two fundamentally different storage services:

  • OpenStack Swift is an example of object storage which is similar in concept to Amazon Simple Storage Service.
  • In contrast, OpenStack Cinder represents block storage, similar to Amazon Elastic Block Store.

Block storage (Cinder)

Cinder is the project name for OpenStack Block Storage; it provides persistent block storage to guest virtual machines (VMs). Block storage is often necessary for expandable file systems, maximum performance, and integration with enterprise storage services as well as applications that require access to raw block-level storage.

The system can expose and connect devices, subsequently managing the creation, attachment to, and detachment from servers. The application programming interface (API) also facilitates snapshot management which can back up volumes of block storage.


Swift or Cinder? When to use which

So which should you use: Swift or Cinder? The answer depends on your application. If you need to run commercial or legacy applications, you will rarely have a choice. They are unlikely to be coded to take advantage of the Swift APIs, but you can easily mount a Cinder disk that will behave just like direct attached storage to most applications.

You can certainly use Cinder for new applications, too, but you wouldn't get the benefits of resiliency and redundancy that automatically accompany Swift. If your programmers are up to the challenge, the distributed and scalable architecture of Swift is definitely a feature worth considering.


Object store (Swift)

Swift is the more mature of the two offerings: It has been a core project since the inception of OpenStack. Swift functions as a distributed, API-accessible storage platform that can be integrated directly into applications or used to store VM images, backups, and archives as well as smaller files, such as photos and email messages.

There are two main concepts to the Object Store: objects and containers.

An object is the primary storage entity. It includes the content and any optional metadata associated with the files stored in the OpenStack Object Storage system. The data is saved in uncompressed and unencrypted format and consists of the object's name, its container, and any metadata in the form of key-value pairs. Objects are spread across multiple disks throughout the data center, whereby Swift ensures data replication and integrity. The distributed operation can leverage low-cost commodity hardware while enhancing scalability, redundancy, and durability.

A container is similar to a Windows® folder in that it is a storage compartment for a set of files. Containers cannot be nested, but it is possible for a tenant to create an unlimited number of containers. Objects must be stored in a container, so you must have at least one container to use the Object store.
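
As a rough illustration of the object and container model described above, the following Python sketch keeps everything in memory. The class names (StoredObject, Container, Account) are hypothetical and are not part of Swift; the sketch only mirrors the rules stated in the text: metadata is a set of key-value pairs, containers are flat, and every object must live in a container.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object: its content plus optional key-value metadata."""
    name: str
    content: bytes
    metadata: dict = field(default_factory=dict)


@dataclass
class Container:
    """A flat compartment of objects; containers cannot hold containers."""
    name: str
    objects: dict = field(default_factory=dict)

    def put(self, obj):
        self.objects[obj.name] = obj

    def listing(self):
        return sorted(self.objects)


class Account:
    """A tenant's namespace; it may hold any number of containers."""

    def __init__(self, name):
        self.name = name
        self.containers = {}

    def create_container(self, name):
        # Creating a container needs nothing but a name.
        return self.containers.setdefault(name, Container(name))

    def put_object(self, container, obj):
        if container not in self.containers:
            raise KeyError("objects must be stored in an existing container")
        self.containers[container].put(obj)


account = Account("demo")
account.create_container("photos")
account.put_object("photos", StoredObject("cat.jpg", b"...", {"camera": "phone"}))
```

Attempting to store an object in a container that does not exist raises an error in this sketch, which reflects the requirement that at least one container must exist before the object store can be used.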

Unlike a traditional file server, Swift is distributed across multiple systems. It automatically stores redundant copies of each object to maximize availability and scalability. Object versioning offers additional protection against inadvertent loss or overwriting of data.


Swift architecture

The Swift architecture consists of three components — servers, processes, and rings.

Servers

The Swift architecture is distributed to prevent any single point of failure as well as to scale horizontally. It includes the following four servers:

  • A proxy server
  • Object servers
  • Container servers
  • Account servers

The proxy server presents a unified interface to the remainder of the OpenStack Object Storage architecture. It accepts requests to create containers, upload files, or modify metadata and can also provide container listings or present stored files. When it receives a request, it determines the location of the account, container, or object in the ring and forwards the request to the relevant server.

An object server is a simple server that can upload, modify, and retrieve objects (usually files) stored on the devices it manages. The objects are stored in the local file system using extended attributes to hold any metadata. The path is based on the object name's hash and a timestamp.
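
To make the idea of a path "based on the object name's hash and a timestamp" more concrete, here is a simplified sketch. The function name and the exact directory layout are assumptions made for illustration; a real deployment also involves device and partition directories and other details that are left out here.

```python
import hashlib


def object_file_path(account, container, obj, timestamp):
    """Build a simplified storage path from the name's hash and a timestamp."""
    name = f"/{account}/{container}/{obj}"
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Group hash directories by the last three hex characters so that no
    # single directory grows too large (an assumption of this sketch).
    suffix = digest[-3:]
    return f"{suffix}/{digest}/{timestamp:.5f}.data"


# Two versions of the same object share a directory; the newer file name
# sorts after the older one because it carries a later timestamp.
older = object_file_path("AUTH_demo", "photos", "cat.jpg", 1386806400.0)
newer = object_file_path("AUTH_demo", "photos", "cat.jpg", 1386806500.0)
```

Because the directory depends only on the hashed name, every version of an object lands in the same place, and the timestamp alone decides which version is the latest.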

A container server is essentially a directory of objects. It handles the assignment of objects to a specific container and provides listings of a container's objects on request. The listings are replicated across the cluster to provide redundancy.

An account server manages accounts by using the object storage services. It operates similarly to a container server in that it provides listings, in this case enumerating the containers that are assigned to a given account.

Processes

Several scheduled housekeeping processes manage the data store, including replication services, auditors, and updaters.

Replication services are the essential processes: They ensure consistency and availability throughout the cluster. Because one of the primary draws of the object store is its distributed storage, OpenStack must ensure a consistent state in the case of transient error conditions, such as power outages or component failures. It does so by regularly comparing local data with its remote copies and ensuring that all replicas contain the latest version.

To minimize the amount of network traffic needed for comparison, the services create a hash of each partition subsection and compare these lists. Container and account replication also use hashes but supplement them with shared high-water marks. The actual updates are pushed, generally using rsync to copy objects, containers, and accounts.
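
The hash comparison can be sketched as follows. This is a minimal in-memory model, not the replicator's actual code: a partition is represented as a dictionary that maps a subsection name to the file names it holds, and the function names are hypothetical.

```python
import hashlib


def subsection_hashes(partition):
    """Map each subsection name to one hash covering all of its file names."""
    hashes = {}
    for subsection, filenames in partition.items():
        md5 = hashlib.md5()
        for filename in sorted(filenames):
            md5.update(filename.encode("utf-8"))
        hashes[subsection] = md5.hexdigest()
    return hashes


def subsections_to_push(local, remote):
    """Return the subsections whose hash differs from (or is missing on) the remote."""
    local_hashes = subsection_hashes(local)
    remote_hashes = subsection_hashes(remote)
    return sorted(
        name for name, digest in local_hashes.items()
        if remote_hashes.get(name) != digest
    )


local = {"a1f": ["100.00000.data"], "b2c": ["200.00000.data"]}
remote = {"a1f": ["100.00000.data"], "b2c": ["150.00000.data"]}
stale = subsections_to_push(local, remote)
```

Only the short hash lists need to cross the network; the data itself is copied just for the subsections that differ (here, b2c).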

The replicator also performs garbage collection to enforce consistent data removal when objects, containers, or accounts are deleted. On deletion, the system marks the latest version with a tombstone, which signals to the replicator to remove the item from all replicated nodes.
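
A small sketch may help show why a tombstone makes deletion consistent. It assumes that each version of an item is a file named after its timestamp, with the extension .data for content and .ts for a tombstone; the resolve function is hypothetical.

```python
def resolve(versions):
    """Return the live file among the versions, or None if the item is deleted.

    The version with the newest timestamp wins; if that version is a
    tombstone (.ts), the item counts as deleted.
    """
    if not versions:
        return None
    newest = max(versions, key=lambda name: float(name.rsplit(".", 1)[0]))
    return None if newest.endswith(".ts") else newest


deleted = resolve(["100.00000.data", "200.00000.ts"])
recreated = resolve(["100.00000.data", "200.00000.ts", "300.00000.data"])
```

Because the tombstone is just another timestamped version, a node that missed the delete request still converges on the same answer once replication delivers the marker.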

Even the best replication design is only as effective as the components that implement it, however. Production environments need to be able to cope with failure, whether it's the result of hardware or software failures or merely the product of insufficient capacity. In Swift, this is accomplished with updaters and auditors.

Updaters are responsible for ensuring the integrity of the system in the face of failure. When the replication services encounter a problem and cannot update a container or account, a period of inconsistency occurs during which the object exists in storage but is not listed on all the container or account servers. In this case, the system queues the update on the local file system, and an updater process regularly retries the updates.
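
The queue-and-retry behavior can be modeled in a few lines. In this sketch the queue is an in-memory deque rather than the local file system, and the Updater class and its send callback are hypothetical stand-ins for the real process and the call to a container or account server.

```python
from collections import deque


class Updater:
    """Retries listing updates that could not be delivered the first time."""

    def __init__(self, send):
        self.send = send        # callable(update) -> True if delivered
        self.pending = deque()

    def queue(self, update):
        self.pending.append(update)

    def run_once(self):
        """Try every queued update once; keep the ones that still fail."""
        delivered = 0
        for _ in range(len(self.pending)):
            update = self.pending.popleft()
            if self.send(update):
                delivered += 1
            else:
                self.pending.append(update)
        return delivered


server = {"up": False}
updater = Updater(lambda update: server["up"])
updater.queue(("photos", "cat.jpg"))
first_pass = updater.run_once()    # server is down, update stays queued
server["up"] = True
second_pass = updater.run_once()   # server is back, update is delivered
```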

Auditors provide an additional level of protection against inconsistency. They regularly scan the local repository, verifying the integrity of the accounts, containers, and objects. When they identify any corruption, they quarantine the element and replace it with a copy from another replica. If they discover an inconsistency that they are not able to reconcile (for example, objects that do not belong to any container), they record the error in a log file.
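
The audit cycle described above might look roughly like this. The sketch assumes that an MD5 checksum was recorded for each item when it was written and that a healthy replica is available as a plain dictionary; the audit function and its arguments are illustrative only.

```python
import hashlib


def audit(store, checksums, replica):
    """Verify each local item; quarantine corrupt ones and restore from a replica.

    Returns the quarantined items and a list of errors that could not be fixed.
    """
    quarantined, errors = {}, []
    for name in list(store):
        if hashlib.md5(store[name]).hexdigest() == checksums[name]:
            continue
        # Move the corrupt copy out of the way before repairing it.
        quarantined[name] = store.pop(name)
        if name in replica:
            store[name] = replica[name]
        else:
            errors.append(f"no healthy replica for {name}")
    return quarantined, errors


good = b"hello"
checksums = {
    "a": hashlib.md5(good).hexdigest(),
    "b": hashlib.md5(b"world").hexdigest(),
}
store = {"a": b"hellX", "b": b"worlX"}   # both local copies are corrupt
quarantined, errors = audit(store, checksums, {"a": good})
```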

Rings

Users and other OpenStack projects reference storage entities by their logical name, but ultimately, all requests, whether for reading or for writing, need to map to a physical location. To accomplish this, the proxy server and the background processes, including replication services, need to be able to map logical names to physical locations. This mapping is called a ring. Accounts, containers, and objects are each assigned a separate ring. The ring describes this mapping in terms of devices, partitions, replicas, and zones.

The term partition, in this context, refers to logical subsets of the content stored in the ring. The recommendation is to allocate 100 partitions for each participating device. The partitions are distributed evenly among all the devices assigned to OpenStack Object Storage. If a cluster uses drives of varying sizes, it is also possible to assign weights that will balance the distribution of partitions across devices.
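
The arithmetic behind partitions can be sketched as follows. The sketch assumes that the total number of partitions is a power of two (so that a name can be mapped to a partition by taking the top bits of its hash) and rounds the recommended 100 partitions per device up to the next power. All function names are hypothetical.

```python
import hashlib
import math


def partition_power(device_count, partitions_per_device=100):
    """Smallest power of two that gives each device at least the target count."""
    return math.ceil(math.log2(device_count * partitions_per_device))


def partitions_by_weight(weights, part_power):
    """Share out 2**part_power partitions in proportion to each device's weight."""
    total_parts = 2 ** part_power
    total_weight = sum(weights.values())
    return {dev: round(total_parts * w / total_weight) for dev, w in weights.items()}


def partition_for(name, part_power):
    """Map a logical name to a partition using the top bits of its MD5 hash."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - part_power)


power = partition_power(5)    # 5 devices * 100 = 500, rounded up to 2**9 = 512
shares = partitions_by_weight({"sda": 1, "sdb": 1, "sdc": 2}, power)
part = partition_for("/AUTH_demo/photos/cat.jpg", power)
```

With these weights, the drive that is twice as large (sdc) receives 256 of the 512 partitions and the other two receive 128 each. With less convenient weights, the rounded shares may not add up exactly to the total, which a real ring builder would have to correct.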

By default, each partition is replicated three times. It's possible to use a higher number to optimize availability, but obviously this also increases storage consumption. The ring also specifies which devices to use for hand-off in failure scenarios and how to redistribute partitions when devices are added to or removed from the cluster.

The last element of the ring mapping is the zone which is used to enable data affinity and anti-affinity. A zone can represent a storage device, a physical server, or a location, such as a rack, aisle, or data center. It is a logical concept that users can employ to suit their needs, but it usually reflects physical elements such as location, power source, and network connectivity.
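
Anti-affinity across zones can be illustrated with a simple placement function. This is a naive sketch under the assumption that devices are given as (name, zone) pairs in order of preference; it is not the algorithm the ring builder uses.

```python
def place_replicas(devices, replicas=3):
    """Pick devices for the replicas of one partition, spreading across zones.

    devices is a list of (device_name, zone) pairs.
    """
    chosen, used_zones = [], set()
    # First pass: at most one device per zone.
    for name, zone in devices:
        if zone not in used_zones:
            chosen.append(name)
            used_zones.add(zone)
        if len(chosen) == replicas:
            return chosen
    # Fewer zones than replicas: fill up with the remaining devices.
    for name, _zone in devices:
        if name not in chosen:
            chosen.append(name)
        if len(chosen) == replicas:
            break
    return chosen


spread = place_replicas([("d1", 1), ("d2", 1), ("d3", 2), ("d4", 3)])
crowded = place_replicas([("d1", 1), ("d2", 1), ("d3", 2)])
```

In the first call each replica lands in a different zone (d1, d3, d4), so the loss of one rack or power source affects only one copy. In the second call there are only two zones, so two replicas have to share zone 1.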


The Cinder architecture

Cinder is significantly simpler than Swift since it doesn't provide automatic object distribution and replication. Figure 1 shows the Cinder architecture.

Figure 1. Cinder architecture
Image showing the Cinder architecture

Similar to other OpenStack projects, Cinder's functionality is exposed to both the dashboard and the command line via an API. It can access the object store via a RESTful HTTP API and handles authentication against OpenStack Keystone with a Python class called Auth Manager.

The API parses and forwards all incoming requests to the message queue, where the scheduler and volume service perform the actual work. When a new volume is created, the scheduler decides which host should be responsible for it. By default, it selects the node with the most space available.
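
The default scheduling rule is easy to express in code. The following sketch assumes that the scheduler knows the free capacity of each host in gigabytes; the function name and the host names are hypothetical.

```python
def pick_host(free_gb_by_host, size_gb):
    """Choose the host with the most free space that can still fit the volume."""
    candidates = {
        host: free for host, free in free_gb_by_host.items() if free >= size_gb
    }
    if not candidates:
        raise ValueError("no host has enough free space")
    return max(candidates, key=candidates.get)


hosts = {"storage-1": 120, "storage-2": 400, "storage-3": 250}
selected = pick_host(hosts, size_gb=100)
```

Picking the emptiest host spreads volumes across the cluster instead of filling one node first.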

The volume manager manages the dynamically attachable block storage devices, called volumes. They can be used as the boot device of virtual instances or attached as secondary storage. Cinder also provides a facility for snapshots which are read-only copies of a volume. These snapshots can then be used to create new volumes for read-write use.

Volumes are usually attached to the Compute nodes via iSCSI. The block storage also requires some form of back-end storage which by default is logical volume management on a local volume group but can be extended via drivers to external storage arrays or appliances.


Setting it up

The actual installation instructions vary greatly between distributions and OpenStack releases. Generally, they are available as part of the distribution. Nonetheless, the same basic tasks must be completed. This section gives you an idea of what's involved.

System requirements

OpenStack relies on a 64-bit x86 architecture; otherwise, it's designed for commodity hardware, so the minimal system requirements are modest. It is possible to run the entire suite of OpenStack projects on a single system with 8GB of RAM. However, for large workloads, it makes sense to use dedicated systems for storage. Because the focus is on commodity equipment, there is no need for redundant array of independent disks (RAID) functionality, but it's advisable to use at least dual quad-core CPUs, 8-12GB of RAM, and a 1Gbps network adapter. Obviously the size of the hard disk drive or solid-state disk depends on the amount of data to be stored and the level of redundancy you want.

Installation

The installation instructions depend on the distribution and, more specifically, on the package-management utility you select. In many cases, it's necessary to declare the repository. So, for example, in the case of Zypper, you announce to libzypp with zypper ar:

# zypper ar -f http://download.opensuse.org/repositories/Cloud:/OpenStack:/Grizzly/SLE_11_SP3/Cloud:OpenStack:Grizzly.repo

You then install the required Swift and/or Cinder packages. The package-management utility should automatically install any dependencies. The full installation procedure depends on the desired configuration and on the exact release of OpenStack. Be sure to look at the installation guide for the authoritative instructions. For the purpose of illustration, below are the primary commands for Debian (for example, Ubuntu), Red Hat (for example, Red Hat Enterprise Linux®, CentOS, Fedora), and openSUSE.

  • Debian: Install the base Swift packages on all hosts:
    sudo apt-get install python-swift
    sudo apt-get install swift
    and the server-specific packages on the hosts that will be running them:
    sudo apt-get install swift-auth
    sudo apt-get install swift-proxy
    sudo apt-get install swift-account
    sudo apt-get install swift-container
    sudo apt-get install swift-object

    The Cinder packages include the API, scheduler, and volume manager:

    sudo apt-get install cinder-api
    sudo apt-get install cinder-scheduler
    sudo apt-get install cinder-volume
  • Red Hat: On Red Hat systems, the commands are:
    sudo yum install openstack-swift
    sudo yum install openstack-swift-proxy
    sudo yum install openstack-swift-account
    sudo yum install openstack-swift-container
    sudo yum install openstack-swift-object
    sudo yum install openstack-swift-doc
    sudo yum install openstack-cinder
    sudo yum install openstack-cinder-doc
  • openSUSE: Use the following commands:
    sudo zypper install openstack-swift
    sudo zypper install openstack-swift-auth
    sudo zypper install openstack-swift-account
    sudo zypper install openstack-swift-container
    sudo zypper install openstack-swift-object
    sudo zypper install openstack-swift-proxy
    sudo zypper install openstack-cinder-api
    sudo zypper install openstack-cinder-scheduler
    sudo zypper install openstack-cinder-volume

Configuration

Configuring your OpenStack Object Storage installation involves tailoring the configuration files for each of the four packages:

  • account-server.conf
  • container-server.conf
  • object-server.conf
  • proxy-server.conf

The configuration files are installed in /etc/swift/. A default set of options works fine for a standard installation, but it will be necessary to edit the configuration for any special requirements.
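
The configuration files use an INI-style layout, so they can be inspected with standard tools. The fragment below is an illustrative, heavily shortened stand-in for proxy-server.conf, not a complete or recommended configuration; the values are assumptions, and the real files for your release contain many more options. The sketch reads the fragment with Python's configparser module.

```python
import configparser

# Shortened, illustrative fragment; check the installation guide for your
# release before relying on any of these values.
SAMPLE_PROXY_CONF = """
[DEFAULT]
bind_port = 8080

[pipeline:main]
pipeline = healthcheck cache tempauth proxy-server

[app:proxy-server]
use = egg:swift#proxy
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_PROXY_CONF)

bind_port = config.getint("DEFAULT", "bind_port")
# The pipeline lists the middleware a request passes through, in order,
# ending with the proxy application itself.
pipeline = config.get("pipeline:main", "pipeline").split()
```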


Usage scenarios

To get an idea of how you might use OpenStack storage, imagine a scenario in which you have a service that runs legacy software using a file system and new code where you want to use distributed object storage. The environment for this project should include both Swift and Cinder.

Let's start with Cinder.

  1. Log in to the OpenStack Dashboard as a user with a Member role. In the navigation pane, beneath Manage Compute, click Volumes > Create Volume.
    Figure 2. Create a volume
    Image showing how to create a volume
  2. The volume should appear in the list for your project.
    Figure 3. The volumes in your project
    Image showing the volumes for your project

  3. Edit attachments to connect the volume to one of your compute instances.
    Figure 4. Manage volume attachments
    Image showing how to edit attachments

OpenStack creates a unique iSCSI qualified name and exposes it to the compute node which now has an active iSCSI session. The instance can use the Cinder volume as if it were local storage (usually a /dev/sdX disk).

To use Swift with your project, you must first create a container.

  1. Log in to the OpenStack Dashboard as a user with a Member role. In the navigation pane, beneath Object Store, click Containers > Create Container.
    Figure 5. Create a container
    Image showing how to create a container

    This is a simple operation that doesn't involve supplying any data at all. It is just a name.

  2. When you have the container, it is usually up to the application to populate it with objects and retrieve them as needed using a programmatic interface.
    Figure 6. The populated container
    Image showing the populated container

  3. However, you can also upload objects from the dashboard. Beneath Object Store, click Containers > Upload Object and supply a file with the stored content.
    Figure 7. Uploading an object
    Image showing how to upload an object

You can also use the interface to download, copy, or delete objects.
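
For the programmatic route mentioned in step 2, an application typically talks to the object store's REST API. The sketch below only builds an upload request and does not send it. It assumes that you have already obtained a storage URL and an authentication token from the identity service; the host name, account name, and token shown are placeholders, and the helper function is hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request


def build_upload_request(storage_url, token, container, name, data):
    """Build (but do not send) an HTTP PUT that stores one object."""
    url = f"{storage_url}/{quote(container)}/{quote(name)}"
    return Request(
        url,
        data=data,
        method="PUT",
        headers={"X-Auth-Token": token},
    )


request = build_upload_request(
    "http://swift.example.com:8080/v1/AUTH_demo",  # placeholder storage URL
    "example-token",                               # placeholder token
    "photos",
    "cat.jpg",
    b"...image bytes...",
)
```

Passing the request to urllib.request.urlopen would perform the upload against a real endpoint; retrieving the object again is a GET on the same URL.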


Conclusion

As you can see, OpenStack provides an intuitive interface for setting up private cloud storage and making it available to workloads. The scenarios shown here are only the tip of the iceberg in terms of what's possible. Many customers use Ceph or GlusterFS, for instance, as a back-end storage mechanism, but even in these cases, end users only need to interact with the user interface. As you have seen in previous articles in this series, OpenStack is simply an abstraction layer that integrates a pluggable set of components.
