Improve file sharing and file locking in a cloud

Modify block storage to provide more efficient Infrastructure-as-a-Service

Ramanathan Sundarrajan (raman123@in.ibm.com), System Operations Senior Specialist, IBM
Ramanathan Sundarrajan (MydW profile) is an active member of IBM's Cloud Computing work group and performed lots of research on cloud innovations. Ramanathan monitored final-year student interns from the College of Engineering Guindy at Anna University; one result of that project is this article.
Kishorekumar Neelamegam (kineelam@in.ibm.com), Systems Engineer/IT Architect, IBM
Kishorekumar Neelamegan brings more than 13 years of software development experience with a strong focus on integrating software into the Rational platform. A passionate evangelist on cloud, Kishore is a frequent participant on developerWorks: You can follow his activities through his MydW profile and MydW group, dW India IBMers.
V. T. Prabagaran, Intern, IBM
V.T. Prabagaran is a final-year student at College of Engineering Guindy, Anna University, Chennai, India.

Summary:  Block storage is a key foundation to most file systems. File sharing and file locking are important processes for sharing cloud data resources, and for eliminating race conditions. Efficient implementation can make a significant mark on your system's and applications' performance levels. In this article, we use an open source example -- cloud platform Eucalyptus and its storage component Walrus -- to demonstrate how to modify block storage to improve the file-sharing and -locking mechanisms. Learn how to install Eucalyptus so you can deliver a top-notch Infrastructure-as-a-Service platform.

Date:  28 Jul 2010
Level:  Intermediate

This article introduces some modifications to the source code of Walrus, the storage service component included with the Eucalyptus open source framework for cloud computing that implements an IaaS environment (Infrastructure-as-a-Service). Learn how to modify the Walrus source code, and how to recompile and run it, in order to improve the file-sharing and file-locking mechanisms in the Eucalyptus environment.

For consumers of cloud services, and for developers and designers of cloud applications and services that rely on file sharing or locking, the best reason we can think of for doing this is that it improves how your application or service behaves. That in turn can improve performance and might reduce the time, bandwidth, and compute power your resource consumes, which will likely reduce cost.

We show you, step by step, how to install Eucalyptus in a cluster: in this case, an IBM® blade server; this technique can also be used on a personal or laptop computer.

To get the most from this article, you should have a good understanding of cloud computing concepts, Java™ technology and coding, and UNIX® commands, plus a basic understanding of how to work with clusters. To use the sample code, you need a basic understanding of the Eclipse framework. There are links to background information on these technologies in the Resources section.

Installing Eucalyptus in a cluster

For this article, we used Eclipse 3.4.2, with CentOS 5.4 as the operating system.

Before the installation

IBM blade servers support a wide selection of processor technologies and operating systems to allow clients to run all of their diverse workloads inside a single architecture. The blade servers reduce complexity, improve systems management, and increase energy efficiency while driving down total cost of ownership. We used an IBM LS20 BladeCenter® Server (see Resources).

This article generally refers to a single-cluster installation: all the components except the node controllers are located on one machine, which we refer to as the front end. (In other words, the cloud controller, cluster controller, and storage controller run on the front-end machine.) The machines running only node controllers are referred to as "nodes."

Installing as the admin

It's pretty simple to install Eucalyptus 1.6.1 on CentOS. As the admin:

  1. Extract (untar) the file eucalyptus-1.6.1-centos-i386.tar.gz.
  2. Log in as any user other than root and install it as shown in the developerWorks wiki.

After following those steps, Eucalyptus should be installed. Alternative installation directions are available from the Eucalyptus web site. Next, download the Eucalyptus management tools to manage virtual images. The wiki explains bundle image usage.

Figure 1 shows the four high-level components, each with its own web service interface, that make up a Eucalyptus installation:


Figure 1. The four high-level components of Eucalyptus

Those components are the node controller, cluster controller, storage controller (Walrus), and the cloud controller.

  • Each node controller controls the execution, inspection, and termination of VM instances on the host where it runs.
  • Each cluster controller gathers information about VM execution, schedules it on specific node controllers, and manages the virtual instance network.
  • The storage controller (Walrus) is a put/get storage service that implements Amazon's S3 interface, providing a mechanism for storing and accessing virtual machine images and user data.
  • The cloud controller is the entry point into the cloud for users and administrators. It queries node managers for information about resources, makes high-level scheduling decisions, and implements them by making requests to the cluster controllers.

About Walrus, the storage component

Walrus is a storage service included with Eucalyptus that is interface-compatible with Amazon's S3. Walrus lets you store persistent data, organized as buckets and objects.

Walrus does not provide locking for object writes; however, as is the case with S3, you are guaranteed that a consistent copy of the object is saved if there are concurrent writes to the same object. If a write to an object is encountered while there is a previous write to the same object in progress, the previous write is invalidated.
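The invalidation rule described above can be pictured with a small sketch. This is not Walrus source code: the class LastWriteWinsStore, its methods, and the ticket scheme are hypothetical names chosen only to illustrate the idea that a write which starts later invalidates one that is still in progress.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not Walrus source code.
public class LastWriteWinsStore {
    private final Map<String, Long> latestTicket = new HashMap<String, Long>();
    private final Map<String, String> committed = new HashMap<String, String>();
    private long nextTicket = 1;

    // A writer announces that it is starting to write the object.
    public synchronized long beginWrite(String key) {
        long ticket = nextTicket++;
        latestTicket.put(key, ticket); // a newer write invalidates any older one
        return ticket;
    }

    // The write is saved only if no newer write started in the meantime.
    public synchronized boolean commit(String key, long ticket, String data) {
        Long latest = latestTicket.get(key);
        if (latest == null || latest.longValue() != ticket) {
            return false; // invalidated by a later write
        }
        committed.put(key, data);
        return true;
    }

    public synchronized String read(String key) {
        return committed.get(key);
    }

    public static void main(String[] args) {
        LastWriteWinsStore store = new LastWriteWinsStore();
        long first = store.beginWrite("bucket/object");
        long second = store.beginWrite("bucket/object"); // starts before the first finishes
        System.out.println(store.commit("bucket/object", first, "from first writer"));   // false
        System.out.println(store.commit("bucket/object", second, "from second writer")); // true
        System.out.println(store.read("bucket/object")); // from second writer
    }
}
```

The saved copy is always one complete write, which is the consistency guarantee, but the first writer's data is silently discarded. That loss is what a locking mechanism is meant to prevent.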

How Walrus works now

The current version of Walrus offers no object locking, so data that several users modify can end up inconsistent. To run an image in the cloud, you must produce a bundled image and upload it to the cloud. Walrus acts as a storage manager: It receives the image from you and stores it as buckets and objects. When you want to access the image from the cloud, Walrus is entrusted with the task of verifying and decrypting images that have been uploaded by users.

When you want to store an image, a separate bucket with a unique bucket name is created for each user. Using S3cmd, create a bucket with a name of your choice:

$ s3cmd mb s3://my-new-bucket-name

Once a bucket has been created, you can upload the file, referred to as the object, into the bucket:

$ s3cmd put filename s3://my-new-bucket-name/filename

To learn more about the inner workings of Walrus, you can study S3cmd, a command-line client for Amazon S3 (see Resources).

Introducing file locking to Walrus

To overcome the drawbacks in Walrus, we have introduced a file-locking mechanism: To maintain data consistency, we've provided the ability to access the file in read/write mode.

When user1 wants to access any file in write mode, the corresponding object will be locked so that other users can't access it until it is released by user1. But other users can access the file in read mode.

We designed a separate queue that holds each user's write request in the order in which the object was requested, so that the system can process the requests in that order.

Image management in Walrus

Before running VM instances in Eucalyptus, you should add the VM images that you downloaded or created: bundle each image with your Eucalyptus credentials, then upload and register it.

To enable a VM image as an executable entity, the Eucalyptus administrator must add a root filesystem image and a kernel/ramdisk pair to Walrus (bucket storage) and register the uploaded data with Eucalyptus. Each image is added to Walrus and registered with Eucalyptus separately, using the following EC2-compatible commands:

  • To add the root filesystem image to Walrus:
    1. Bundle the image:
      $ euca-bundle-image -i <vm image file>
    2. Upload the bundle:
      $ euca-upload-bundle -b <image bucket> -m /tmp/<vm image file>.manifest.xml
    3. Register the image:
      $ euca-register <image bucket>/<vm image file>.manifest.xml

  • To add the kernel to Walrus and register it with Eucalyptus:
    1. Bundle the kernel:
      $ euca-bundle-image -i <kernel file> --kernel true
    2. Upload the bundle:
      $ euca-upload-bundle -b <kernel bucket> -m /tmp/<kernel file>.manifest.xml
    3. Register the kernel:
      $ euca-register <kernel bucket>/<kernel file>.manifest.xml

Behind the modified mechanism

At present, Eucalyptus doesn't support a file-sharing mechanism, but we'll show you how to implement file sharing in Eucalyptus. We focus on maintaining data consistency.

For each user, a separate virtual machine (VM) instance is created. In its current incarnation, Eucalyptus also doesn't support sharing files among different VM instances. If two or more users open the same file in write mode concurrently and modify it, only the content that was saved last is kept in the file.

First, let's look at how a volume is created and attached to an instance.

Working with volumes

Before creating a new volume, view information about current availability zones:

$ euca-describe-availability-zones

Create a new volume:

$ euca-create-volume --size <size of volume> -x <name of availability zone>

where --size denotes the size of volume you wish to create and -x denotes the name of the availability zone where you want the volume to reside.

Attach a volume to an instance with the following command:

$ euca-attach-volume -i <instance id> -d <device> <volume id>

For example, to attach the volume vol-12345678 to the instance i-98765432 at /dev/sdb:

$ euca-attach-volume -i i-98765432 -d /dev/sdb vol-12345678

When the VM instance starts running, you can see two IP addresses assigned to it. Log in to the instance at its IP address using the SSH key:

$ ssh -i mykey.private root@<ip-address>

Let's look at this as a scenario

Let's assume that user A and user B log in to two different systems, say System 1 and System 2, with the same username and password and try to access a file from both systems.

Both A and B try to access the same VM instance through Elastic Fox concurrently (at the same time) in write mode. Using the IP address of the instance, both of them access the instance with the ssh command. If A modifies the file and B then modifies it too, B's modification is the one that is saved, and A's changes are lost. The state of the file is not consistent.
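The scenario can be reproduced in a few lines. The sketch below is a hypothetical, single-threaded simulation (the class LostUpdateDemo and the file name notes.txt are made up for illustration) in which both users open the same content and each saves a full copy of the file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the lost-update problem; not Eucalyptus code.
public class LostUpdateDemo {
    private final Map<String, String> files = new HashMap<String, String>();

    public String open(String name) {
        String content = files.get(name);
        return content == null ? "" : content;
    }

    // Saving replaces the whole file; nothing checks who else has it open.
    public void save(String name, String content) {
        files.put(name, content);
    }

    public static String run() {
        LostUpdateDemo storage = new LostUpdateDemo();
        storage.save("notes.txt", "original;");
        String copyA = storage.open("notes.txt"); // user A opens the file
        String copyB = storage.open("notes.txt"); // user B opens the same content
        storage.save("notes.txt", copyA + "edit by A;");
        storage.save("notes.txt", copyB + "edit by B;"); // overwrites A's save
        return storage.open("notes.txt");
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "original;edit by B;"
    }
}
```

User A's edit never appears in the final content, because user B's copy was taken before A saved.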

The modified Walrus architecture helps to keep modifications to the data file consistent.


Modifying the mechanism

Let's look at the architecture of the cloud and its virtual network.


Figure 2. The architecture of the cloud and its virtual network

The components are:

  • The CLC, or cloud controller, is the interface to the clients and does high-level scheduling; it forms the management platform.
  • The ccX are the cluster controllers, which schedule incoming requests to specific node controllers and gather and report information about a set of node controllers.
  • The ncX are the node controllers, the machines that host VM instances.
  • Walrus is the persistent secondary storage that the node controllers use to store their VM images and sometimes to store data.

Figure 3 shows how a user shares files with other users.


Figure 3. Flow diagram of how a user shares files

In the flow diagram (follow the numbers):

  1. The client logs in with a login ID and password.
  2. The CLC checks the user ID in the database and creates a new session for a valid user.
  3. The CLC returns the status message to the client.
  4. The user shares a file that he owns.
  5. The CLC checks whether the user really owns the file and, upon successful authentication, adds the new user's identity to the shared file access list.
  6. The CLC forwards this message to the corresponding CC.
  7. The CC finds the NC hosting the virtual machine instance for the user and forwards this message.
  8. The NC transfers this file to a persistent shared medium (Walrus) to enable sharing between users.
  9. The file is transferred to Walrus through the CC and CLC.
  10. The file is transferred to Walrus.
  11. The CLC delivers the success message to the client.
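The ownership check and access-list update in steps 4 and 5 can be sketched as follows. This is a minimal illustration under our own assumptions, not Eucalyptus code: the class SharedFileDirectory and its method names are hypothetical, and users and files are identified by plain strings.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the ownership check in steps 4 and 5; not Eucalyptus code.
public class SharedFileDirectory {
    private final Map<String, String> ownerOf = new HashMap<String, String>();
    private final Map<String, Set<String>> accessList = new HashMap<String, Set<String>>();

    public void registerFile(String file, String owner) {
        ownerOf.put(file, owner);
        Set<String> users = new HashSet<String>();
        users.add(owner); // the owner can always access the file
        accessList.put(file, users);
    }

    // Only the owner may share; returns false if the request is rejected.
    public boolean share(String file, String requester, String newUser) {
        if (!requester.equals(ownerOf.get(file))) {
            return false; // unknown file, or requester is not the owner
        }
        accessList.get(file).add(newUser);
        return true;
    }

    // The lookup a later file request would need.
    public boolean canAccess(String file, String user) {
        Set<String> users = accessList.get(file);
        return users != null && users.contains(user);
    }

    public static void main(String[] args) {
        SharedFileDirectory directory = new SharedFileDirectory();
        directory.registerFile("F1", "userA");
        System.out.println(directory.share("F1", "userB", "userC")); // false
        System.out.println(directory.share("F1", "userA", "userB")); // true
        System.out.println(directory.canAccess("F1", "userB"));      // true
        System.out.println(directory.canAccess("F1", "userC"));      // false
    }
}
```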

Figure 4 shows how a client requests access to a file.


Figure 4. Flow diagram of how a client requests access to a file

In this flow diagram (follow the numbers):

  1. The user logs in using a login and password.
  2. The CLC checks it against the user database and creates a new session for a valid user.
  3. The CLC returns the login status message to the client.
  4. The client requests a file.
  5. The CLC sends a request to the user directory to verify the user's access to the file. The user directory stores file details and user access data.
  6. The CLC forwards the request to the corresponding CC.
  7. The CC finds the NC that hosts the virtual machine instance created for the user.

In steps 8, 9, and 10, the NC transmits data to the user using a secure channel through the CC and CLC.

It's time to look inside a node controller. Every node controller runs a hypervisor, which is platform-virtualization software. We are using a type 1 hypervisor, which interacts directly with the host hardware, runs guest operating systems above it, and allocates system resources across LPARs to share physical resources such as CPUs, direct access storage devices, and memory. (Type 1 hypervisors were introduced by IBM in the early 1970s with the IBM System/370 processors.) Figure 5 demonstrates how the request flow works with the NC and its hypervisor.


Figure 5. Inside the node controller

In this flow diagram (follow the numbers):

  1. A request arrives at the NC from the CC.
  2. The node controller module running on that node forwards it to the hypervisor.
  3. The hypervisor does the job with the help of the guest operating system.
  4. The guest OS instructs the hypervisor what to do.
  5. The hypervisor now interacts with the hardware and completes the job.

We've looked at how file sharing introduced into Eucalyptus can help; now let's look at ensuring data consistency via the concept of accessing files in read/write mode.

Figure 6 demonstrates how a write mode time queue for file access can improve data consistency:


Figure 6. Improving consistency using a write mode time queue

Figure 6 compares user B's request for the file F1 in write mode at time t with user C's request for the same file in write mode at time t+1. To keep the file consistent, we have designed a queue that holds the requests on a first-come, first-served basis.

Since B requested the file before C, B is placed at the top of the queue and C is placed directly behind B.

In general, every request to access the file in write mode is placed in the queue in order of the time at which it was made. The user who asked first is at the top of the queue, the user who asked next is directly behind, and so on.
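A queue ordered by request time could look like the following sketch. It assumes one queue per file and uses hypothetical names (WriteRequestQueue, WriteRequest, submit, nextWriter); the request time is passed in as a plain number so that the ordering is easy to follow.

```java
import java.util.PriorityQueue;

// Hypothetical sketch of a per-file write-request queue; not the authors' code.
public class WriteRequestQueue {
    static final class WriteRequest implements Comparable<WriteRequest> {
        final String user;
        final long requestTime;

        WriteRequest(String user, long requestTime) {
            this.user = user;
            this.requestTime = requestTime;
        }

        // Earlier requests sort first, which gives first-come, first-served order.
        public int compareTo(WriteRequest other) {
            return Long.compare(requestTime, other.requestTime);
        }
    }

    private final PriorityQueue<WriteRequest> queue = new PriorityQueue<WriteRequest>();

    public synchronized void submit(String user, long requestTime) {
        queue.add(new WriteRequest(user, requestTime));
    }

    // Returns the user whose request is oldest, or null if nobody is waiting.
    public synchronized String nextWriter() {
        WriteRequest head = queue.poll();
        return head == null ? null : head.user;
    }

    public static void main(String[] args) {
        WriteRequestQueue f1Queue = new WriteRequestQueue();
        f1Queue.submit("C", 2L); // C asks at time t+1
        f1Queue.submit("B", 1L); // B asked at time t
        System.out.println(f1Queue.nextWriter()); // B
        System.out.println(f1Queue.nextWriter()); // C
    }
}
```

Requests with identical timestamps have no defined order in this sketch; a real implementation would need a tie-breaker such as a sequence number.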

Figure 7 adds the type of each user request (write or read) as a further factor in deciding the sharing and locking level, in order to improve data consistency.


Figure 7. Improving consistency using a read/write determiner

In Figure 7, in addition to the time at which a user requests the file, we've added a field that records the mode of access: whether the user is accessing the file for write or for read.

In the write access queue, user B is at the top because B asked for write access to the file before user C did. User B is given write access. User C gets write access once user B releases the file lock, but C can access the file in read mode while B still has it locked in write mode.

In general, if two or more users request the file in write mode concurrently, the first user is granted write access and the remaining users' write requests are queued. Read access is still given to all the other users. When the user with write access releases the file, the user at the top of the queue is given write access next.
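The policy can be summarized in a short sketch for a single file. The class FileAccessManager and its methods are hypothetical names, and the sketch only tracks who may write; it does not move any data.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the read/write policy for one file; not the authors' code.
public class FileAccessManager {
    private String currentWriter; // null when nobody holds the write lock
    private final Deque<String> waitingWriters = new ArrayDeque<String>();

    // Read access is always granted, even while a writer holds the lock.
    public synchronized boolean requestRead(String user) {
        return true;
    }

    // Returns true if write access is granted now, false if the user was queued.
    public synchronized boolean requestWrite(String user) {
        if (currentWriter == null) {
            currentWriter = user;
            return true;
        }
        waitingWriters.addLast(user);
        return false;
    }

    // The writer releases the lock; the longest-waiting user becomes the writer.
    // Returns the new writer, or null if nobody was waiting.
    public synchronized String releaseWrite(String user) {
        if (!user.equals(currentWriter)) {
            throw new IllegalStateException(user + " does not hold the write lock");
        }
        currentWriter = waitingWriters.pollFirst();
        return currentWriter;
    }

    public static void main(String[] args) {
        FileAccessManager f1 = new FileAccessManager();
        System.out.println(f1.requestWrite("B")); // true: B writes
        System.out.println(f1.requestWrite("C")); // false: C is queued
        System.out.println(f1.requestRead("C"));  // true: C may still read
        System.out.println(f1.releaseWrite("B")); // C: write access passes to C
    }
}
```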


Modifying, recompiling, and running the modified code

We know you've waded through all the concepts just to get to this part: the actual modification steps. They're pretty simple.

  1. Create a workspace and copy the folder clc from the Eucalyptus source into it.

    Figure 8. Choose your workspace folder

  2. Import the source by clicking File > Import.

    Figure 9. Choose your import source

  3. Select General > Existing Projects into Workspace.

    Figure 10. Select Existing Projects

  4. Select the root directory path as root/java/workspace/clc.

    Figure 11. Select the root directory path

  5. Click Finish.

    Figure 12. When the root directory and projects have been successfully added, click Finish

  6. On the left-hand side is the Package Explorer tab, which lists the content of the project. Right-click build.xml.

    Figure 13. Ready to build ...

  7. Select Run As > Ant Build to run the Ant build.

    Figure 14. ... and it's a success!

You should see the build is successful. That was easy.


Implementing the application

The application itself has several files, but we cover only the main highlights, leaving it to you to build on this to create your own applications.

To implement the file-sharing and file-locking mechanism, we created a class called WalrusVirtualBlockManager, which implements file locking in Eucalyptus. Listing 1 is the source code.


Listing 1. WalrusVirtualBlockManager

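package edu.ucsb.eucalyptus.cloud.ws;

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

import org.apache.log4j.Logger;

import edu.ucsb.eucalyptus.cloud.entities.ObjectInfo;

// Keeps one lock per Walrus object so that only one writer at a time can
// modify it. ObjectInfo is the map key, so this assumes that ObjectInfo
// implements equals() and hashCode() consistently.
public class WalrusVirtualBlockManager
{
    private static Logger LOG = Logger.getLogger(WalrusVirtualBlockManager.class);
    // Thread-safe map, because many request threads use it at the same time.
    public static ConcurrentMap<ObjectInfo,ReentrantLock>
           storagelockMap = new ConcurrentHashMap<ObjectInfo,ReentrantLock>();
    // Created once so that every caller shares the same manager.
    private static final WalrusVirtualBlockManager virtualBlockMgr =
           new WalrusVirtualBlockManager();

    private WalrusVirtualBlockManager()
    {
    }

    public static WalrusVirtualBlockManager getInstance()
    {
        return virtualBlockMgr;
    }

    // Blocks until the calling thread holds the lock for this object.
    public ReentrantLock lock(ObjectInfo info)
    {
        // Fair lock: waiting writers are served in the order they arrived.
        ReentrantLock lck = new ReentrantLock(true);
        ReentrantLock existing = storagelockMap.putIfAbsent(info, lck);
        if (existing != null)
        {
            lck = existing; // reuse the lock that other writers already use
        }
        lck.lock();
        return lck;
    }

    // Releases the lock; call it from the thread that called lock().
    // The entry stays in the map so that waiting writers keep using the
    // same lock; call clear() once the object itself is gone.
    public void unlock(ObjectInfo info)
    {
        ReentrantLock lck = storagelockMap.get(info);
        if (lck != null && lck.isHeldByCurrentThread())
        {
            lck.unlock();
        }
    }

    public void clear(ObjectInfo info)
    {
        storagelockMap.remove(info);
    }

    // Releases every lock held by the calling thread and empties the map.
    public void clearAll()
    {
        for (ObjectInfo info : storagelockMap.keySet())
        {
            unlock(info);
        }
        storagelockMap.clear();
    }
}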
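
A per-object lock of this kind is normally acquired just before a write and released in a finally block, so that a failed write cannot leave the object locked. The stand-alone sketch below illustrates that pattern under our own assumptions: it uses String keys in place of ObjectInfo so that it compiles without the Eucalyptus sources, and the class ObjectLockDemo and its methods are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone sketch; String keys stand in for ObjectInfo.
public class ObjectLockDemo {
    private static final ConcurrentMap<String, ReentrantLock> LOCKS =
            new ConcurrentHashMap<String, ReentrantLock>();
    private static int sharedCounter = 0; // stands in for the contents of one object

    // Returns the single lock shared by everyone who writes this key.
    static ReentrantLock lockFor(String key) {
        ReentrantLock fresh = new ReentrantLock(true);
        ReentrantLock existing = LOCKS.putIfAbsent(key, fresh);
        return existing != null ? existing : fresh;
    }

    static void writeObject(String key) {
        ReentrantLock lock = lockFor(key);
        lock.lock();
        try {
            sharedCounter++; // the critical section: one writer at a time
        } finally {
            lock.unlock(); // always release, even if the write throws
        }
    }

    // Starts several writer threads and returns the final counter value.
    public static int run(int writers, final int writesEach) {
        sharedCounter = 0;
        Thread[] threads = new Thread[writers];
        for (int i = 0; i < writers; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < writesEach; j++) {
                    writeObject("bucket/object");
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return sharedCounter;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000)); // 4000: no update is lost
    }
}
```

Without the lock, the unsynchronized increments could interleave and the final count could fall short of 4000.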


What if I'm not using Walrus?

This modified block storage technique can be adapted for other cloud platforms as well. For example, in Cassandra, the data is replicated. That is, the latest version of a data resource is sitting on some node in the cluster, but older versions are still out there on other nodes. The goal is that eventually, all nodes will access the latest version. File object locking is not available, but the modified block storage technique can be introduced here the way we did in this article to maintain data consistency. You've seen Cassandra in action at Digg, Facebook, Twitter, and other sites.


In conclusion

Now you know how to install Eucalyptus in a cluster and how to modify the Walrus source code to implement or improve the file-sharing and file-locking mechanism on the cloud.


Resources

Learn

Get products and technologies

Discuss

  • Follow the Eucalyptus chatter on Twitter; you can follow developerWorks too.

  • The Developer Cloud group on My developerWorks is the community for the Smart Business Development and Test on the IBM Cloud.

  • Get involved in the developerWorks community (developer blogs, groups, forums, podcasts, profiles, newsletters, wikis, and community topics) through My developerWorks, a professional network and unified set of community tools for connecting, sharing, and collaborating.
