GT4 development: Integrate Storage Resource Broker with Jakarta Commons Virtual File System

Level: Intermediate

Vladimir Silva (vladimir_silva@hotmail.com), Software Engineer, Consultant

12 Dec 2006

Storage Resource Broker (SRB), developed by the San Diego Supercomputer Center (SDSC), is a software platform that provides an interface for connecting heterogeneous data sources over a network. SRB is middleware designed to build data grids and access replicated data sets distributed over long distances. In this article, you'll learn the internals of SRB, plus the tools required to interface SRB with the popular open source Jakarta Commons Virtual File System (VFS).

SRB components and features

SRB is a distributed file system designed for data management in a grid environment. It provides features such as:

  • Transparent replication
  • Archiving, caching, data synchronization, and backups
  • Heterogeneous storage features
  • Container and aggregated data movement
  • Bulk data ingestion
  • Third-party operations, such as copy and move
  • Version control and partitioned data management

SRB is divided into three logical layers: Client (Presentation), Metadata Catalog (MCAT) (Logic), and SRB Agents (Data). At the heart of SRB is MCAT.



MCAT

MCAT is a database system that keeps track of namespaces and the mapping of data objects to storage resources within the federation. MCAT is used to determine where a given data object is located and to look up its file attributes, metadata, access control lists, storage resource information, and user data.

By querying MCAT, clients can find distributed data objects; replicate, transfer, or synchronize data; and perform sophisticated queries, among other functions. MCAT stores information about resources, users, domain machines, and data. This information is usually grouped in logical entities called collections.

Collections and storage resources

A collection is much like a folder or a directory in a file system. It is an object that contains other collections or data objects. A collection is used to organize data into a logical hierarchy that users can easily access and understand. The mapping between data objects and the ultimate physical storage is done through resources. There are three types of resources:

  • Physical resources represent places where data objects are physically stored, for example a UNIX®/Linux® file system, a relational database such as Oracle or DB2®, tape drives, and FTP or Web servers.
  • Logical resources group one or more physical resources, which makes the physical location of the data transparent to users.
  • Cluster resources enable the efficient management of cluster file systems behind the scenes. A cluster file system can have zero or more physical resources attached. Cluster resources are useful for failover conditions where collections can be transferred from a failing server to another, provided that they share the same data.
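
The relationship between logical paths, logical resources, and physical resources can be pictured with a small toy model. The sketch below is not SRB or JARGON code: the class ToyCatalog, its methods, and the resource names "unix-fs-1" and "tape-1" are all hypothetical, and it assumes for illustration that storing an object through a logical resource records one copy on each physical member.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: these types are hypothetical and are not part of SRB or JARGON. */
final class ToyCatalog {
    // Logical resource name -> names of the physical resources it groups.
    private final Map<String, List<String>> logicalResources = new LinkedHashMap<>();
    // Logical path of a data object -> physical resources that hold a copy.
    private final Map<String, List<String>> replicas = new LinkedHashMap<>();

    void defineLogicalResource(String name, List<String> physicalResources) {
        logicalResources.put(name, new ArrayList<>(physicalResources));
    }

    /** Records an object stored through a logical resource (assumed: one copy per member). */
    void ingest(String logicalPath, String logicalResource) {
        List<String> members = logicalResources.get(logicalResource);
        if (members == null) {
            throw new IllegalArgumentException("Unknown logical resource: " + logicalResource);
        }
        replicas.put(logicalPath, new ArrayList<>(members));
    }

    /** Answers the catalog question: which physical resources hold this object? */
    List<String> locate(String logicalPath) {
        List<String> found = replicas.get(logicalPath);
        return found == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(found);
    }

    public static void main(String[] args) {
        ToyCatalog catalog = new ToyCatalog();
        catalog.defineLogicalResource("nccResc", Arrays.asList("unix-fs-1", "tape-1"));
        catalog.ingest("/nccZone/home/srbAdmin.NCC/report.dat", "nccResc");
        System.out.println(catalog.locate("/nccZone/home/srbAdmin.NCC/report.dat"));
    }
}
```

The point of the model is only that clients address data by logical path and resource name, while the catalog resolves those names to physical locations.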


Enter Jakarta Commons

Mix the powerful features of SRB with the single API for accessing various file systems from Jakarta and what do you get? Quite simply: the ability to query and transfer files from virtually anywhere. Imagine a usage scenario where users have disparate storage resources, such as FTP, SFTP, SSH, HTTP, and SRB servers. With the single API provided by Commons VFS, an organization can leverage its own resources with little or no change to the existing infrastructure. Files can be transferred among servers transparently. Discovery services can be easily built to query for user and file metadata. All of this translates to significant savings in software, hardware, and development resources, maximizing the organization's return on investment.

Commons VFS-SRB provider implementation


Figure 1. VFS-SRB provider implementation classes

Your first step in implementing the VFS-SRB provider is to create a configuration XML file. This file tells the core API the provider class name, as well as the protocol scheme. The file must be created in the META-INF folder of your Java™ project so that the core can read it. This step is critical; otherwise, VFS will fail to recognize the new protocol scheme. The configuration file for SRB looks like this:


Listing 1. SRB configuration file
                
<providers>
  <provider class-name="org.apache.commons.vfs.provider.srb.SrbFileProvider">
    <scheme name="srb"/>
  </provider>
</providers>

With the configuration file in place, it's time to implement the required interfaces. The first class to implement is a file provider whose role is to provide information on the new file system capabilities, as well as to create a new instance of the file system.



SRB file provider

The role of the SrbFileProvider class is twofold. First, it provides information about the capabilities of the new file system, such as the ability to:

  1. Create, delete, or rename files.
  2. Get file information, such as type, URI, and last modification time.
  3. Read from and write to the file.
  4. List child information.

Second, it creates an instance of the SRB file system by returning a new object of type SrbFileSystem.


Listing 2. SrbFileProvider class
                
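package org.apache.commons.vfs.provider.srb;

import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.commons.vfs.Capability;
import org.apache.commons.vfs.FileName;
import org.apache.commons.vfs.FileSystem;
import org.apache.commons.vfs.FileSystemException;
import org.apache.commons.vfs.FileSystemOptions;
import org.apache.commons.vfs.provider.AbstractOriginatingFileProvider;
import org.apache.commons.vfs.provider.GenericFileName;

/**
 * A file system provider for the SRB protocol scheme.
 *
 * @author Vladimir Silva
 */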
public class SrbFileProvider
    extends AbstractOriginatingFileProvider
{
    private Log log = LogFactory.getLog(SrbFileProvider.class);

    public final static Collection capabilities =
            Collections.unmodifiableCollection
              (Arrays.asList(new Capability[]
    {
        Capability.CREATE,
        Capability.DELETE,
        Capability.RENAME,
        Capability.GET_TYPE,
        Capability.GET_LAST_MODIFIED,
        Capability.SET_LAST_MODIFIED_FILE,
        Capability.SET_LAST_MODIFIED_FOLDER,
        Capability.LIST_CHILDREN,
        Capability.READ_CONTENT,
        Capability.URI,
        Capability.WRITE_CONTENT
    }));

    public SrbFileProvider()
    {
        super();
        setFileNameParser(SrbFileNameParser.getInstance());
    }

    /**
     * Creates the filesystem.
     */
    protected FileSystem doCreateFileSystem(final FileName name
    , final FileSystemOptions fileSystemOptions)
        throws FileSystemException
    {
        // Create the file system
        String path = name.getPath();

        log.debug("Creating file system with name=" + name
                  + " URI=" + name.getRootURI());
        return
          new SrbFileSystem((GenericFileName)name
            , path, fileSystemOptions);
    }

    public Collection getCapabilities()
    {
        return capabilities;
    }
}



SRB file system

The SRB login information is controlled by two environment files: .MdasEnv and .MdasAuth. They reside in the .srb directory under the user's home directory. The file .MdasEnv contains the user's name, home collection, domain, SRB host, port, and authentication scheme. See Listing 3 for the syntax of .MdasEnv. The file .MdasAuth contains one line representing the user's password.


Listing 3. MCAT/Agent environment file $HOME/.srb/.MdasEnv
                
# SRB MCAT log in parameters
# The password resides in $HOME/.srb/.MdasAuth

# User's Collection (home directory name)
mdasCollectionName '/nccZone/home/srbAdmin.NCC'

# Location of the home directory
mdasCollectionHome '/nccZone/home/srbAdmin.NCC'

# SRB domain
mdasDomainName 'NCC'
mdasDomainHome 'NCC'

# User name
srbUser 'srbAdmin'

# Meta data catalog MCAT Host
srbHost 'ebony.rtpnc.epa.gov'

# Default port
#srbPort '5544'

# Default storage resource (machine) name
defaultResource 'nccResc'

# Authorization Schemes
# SRB supports Grid Security Infrastructure authentication! 
# However the server must be
# recompiled with the GSI libraries
#AUTH_SCHEME 'PASSWD_AUTH'
#AUTH_SCHEME 'GSI_AUTH'
AUTH_SCHEME 'ENCRYPT1'
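
To make the file syntax concrete, here is a minimal sketch of a reader for this format. It is not how JARGON parses the file; the class MdasEnvSketch and its methods are hypothetical, and the sketch assumes only what Listing 3 shows: one setting per line, a key followed by a single-quoted value, with # starting a comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of JARGON: reads "key 'value'" settings. */
final class MdasEnvSketch {

    /** Parses the text of an environment file into an ordered key/value map. */
    static Map<String, String> parse(String text) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String rawLine : text.split("\\r?\\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments, including commented-out settings
            }
            String[] parts = line.split("\\s+", 2);
            if (parts.length < 2) {
                continue; // a key with no value is ignored in this sketch
            }
            settings.put(parts[0], stripQuotes(parts[1].trim()));
        }
        return settings;
    }

    /** Removes one pair of surrounding single quotes, if present. */
    private static String stripQuotes(String value) {
        boolean quoted = value.length() >= 2
                && value.startsWith("'") && value.endsWith("'");
        return quoted ? value.substring(1, value.length() - 1) : value;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "# User name",
                "srbUser 'srbAdmin'",
                "#srbPort '5544'",
                "AUTH_SCHEME 'ENCRYPT1'");
        System.out.println(parse(sample));
    }
}
```

Because the srbPort line is commented out in Listing 3, a reader like this one reports no port at all, and the caller has to fall back to a default.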

SRB access through the Java programming language is provided by the Java API for Real Grids On Networks (JARGON). JARGON has been designed from the ground up to make programming for the grid as straightforward as possible. Its capabilities include transparent replication, archiving, caching, heterogeneous storage, aggregated data movement, shadow objects, and more. Connecting to SRB using JARGON is easy.


Listing 4. Connecting to SRB using JARGON
                
edu.sdsc.grid.io.GeneralFileSystem
    srbFileSystem = FileFactory.newFileSystem(
       new SRBAccount(host     // SRB host
          , port   // SRB port (5544)
          , user   // User
          , pwd    // Pwd
          , home   // User's home dir
          , domain // SRB domain
          , defRes  // SRB default resource (machine)
          , mcatZone)); // MCAT Zone (for federations)

This code creates an SRB connection that can be used for file system operations or metadata queries. An easier way is to let the API read the account information from the configuration files $HOME/.srb/.MdasEnv and $HOME/.srb/.MdasAuth by simply calling:

  srbFileSystem = FileFactory.newFileSystem( new SRBAccount());

Listing 5 shows the Commons VFS implementation of the SRB file system provider.


Listing 5. SrbFileSystem class
                
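package org.apache.commons.vfs.provider.srb;

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.commons.vfs.FileName;
import org.apache.commons.vfs.FileObject;
import org.apache.commons.vfs.FileSystem;
import org.apache.commons.vfs.FileSystemException;
import org.apache.commons.vfs.FileSystemOptions;
import org.apache.commons.vfs.provider.AbstractFileSystem;
import org.apache.commons.vfs.provider.GenericFileName;

import edu.sdsc.grid.io.FileFactory;
import edu.sdsc.grid.io.GeneralFileSystem;
import edu.sdsc.grid.io.srb.SRBAccount;

/**
 * An SRB file system.
 *
 * @author Vladimir Silva
 */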
public class SrbFileSystem extends AbstractFileSystem
        implements FileSystem
{
    private Log log = LogFactory.getLog(SrbFileSystem.class);

    private final GeneralFileSystem srbFileSystem;
    private Map attribs = new HashMap();

    public static final String HOME_DIRECTORY = "HOME_DIRECTORY";

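    /**
     * Constructor
     * @param rootName Root file name; supplies the SRB host, port, user, and password
     * @param rootFile Path of the root file (not used by this constructor)
     * @param opts File system options
     * @throws FileSystemException if the SRB connection cannot be created
     */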
    public SrbFileSystem(final GenericFileName rootName,
                         final String rootFile,
                         final FileSystemOptions opts) throws
            FileSystemException {
        super(rootName, null, opts);

        String host = rootName.getHostName();
        int port = rootName.getPort();
        String user = rootName.getUserName();
        String pwd = rootName.getPassword();

        try {
            // Load defaults from the ~/.srb/.MdasEnv user info file
            SRBAccount srbAccount = new SRBAccount();

            String domain;

            // SRB user names have the form [USER-NAME].[DOMAIN]
            if (user.indexOf(".") != -1) {
                domain = user.substring(user.indexOf(".") + 1);
                user = user.substring(0, user.indexOf("."));
            } else {
                domain = srbAccount.getDomainName();
            }

            String home = srbAccount.getHomeDirectory();
            String mcatZone = srbAccount.getMcatZone();
            String defRes = srbAccount.getDefaultStorageResource();

            // Create the SRB connection; most of the data comes from
            // $HOME/.srb/.MdasEnv
            srbFileSystem = FileFactory.newFileSystem(
              new SRBAccount(host     // SRB host
                  , port   // SRB port (5544)
                  , user   // User
                  , pwd    // Pwd
                  , home   // User's home dir
                  , domain // SRB domain
                  , defRes, // SRB default resource (machine)
                  mcatZone)); // MCAT Zone (for federations)

            // save SRB home dir
            attribs.put(HOME_DIRECTORY, home);

            log.debug("Constructor: Created new SRB FileSystem " +
                      srbFileSystem + " with home=" +
                      attribs.get(HOME_DIRECTORY));

        } catch (Exception e) {
            throw new FileSystemException(e.getMessage(), e);
        }
    }

    /**
     * Creates a file object.
     */
    protected FileObject createFile(final FileName name)
    throws  FileSystemException {
        // Create the file
        String fileName = name.getPath();

        log.debug("Creating new file="
          + fileName + " Name=" + name
          + " Parent=" + name.getParent());

        return new SrbFileObject(this, name);
    }

    /**
     * Adds the capabilities of this file system.
     */
    protected void addCapabilities(final Collection caps) {
        caps.addAll(SrbFileProvider.capabilities);
    }

    protected GeneralFileSystem getSRBFileSystem() {
        return srbFileSystem;
    }

    public Object getAttribute(final String attrName)
     throws FileSystemException {
        return attribs.get(attrName);
    }
}
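
The user and domain handling in the constructor can be tried in isolation with the standard java.net.URI class. The sketch below is illustrative only: the class SrbUrlSketch is hypothetical, the URL layout srb://user.domain:password@host:port/path is an assumption based on the generic VFS URL pattern, and the fallback to port 5544 is taken from the default named in Listing 3. Unlike Listing 5, it also guards against a URL that carries no user name.

```java
import java.net.URI;

/** Hypothetical holder for the parts of an srb:// URL; not part of VFS or JARGON. */
final class SrbUrlSketch {
    static final int DEFAULT_PORT = 5544; // default port named in the environment file

    final String host;
    final int port;
    final String user;     // null when the URL carries no user info
    final String password; // null when the URL carries no password
    final String domain;
    final String path;

    private SrbUrlSketch(String host, int port, String user,
                         String password, String domain, String path) {
        this.host = host;
        this.port = port;
        this.user = user;
        this.password = password;
        this.domain = domain;
        this.path = path;
    }

    /** Splits the URL; fallbackDomain is used when the user name has no ".DOMAIN" suffix. */
    static SrbUrlSketch parse(String url, String fallbackDomain) {
        URI uri = URI.create(url);
        if (!"srb".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("Not an srb URL: " + url);
        }
        String user = null;
        String password = null;
        String domain = fallbackDomain;
        String userInfo = uri.getUserInfo(); // "user.domain:password", or null
        if (userInfo != null) {
            int colon = userInfo.indexOf(':');
            String name = colon < 0 ? userInfo : userInfo.substring(0, colon);
            password = colon < 0 ? null : userInfo.substring(colon + 1);
            int dot = name.indexOf('.');
            if (dot < 0) {
                user = name;
            } else {
                user = name.substring(0, dot);    // text before the first dot
                domain = name.substring(dot + 1); // text after the first dot
            }
        }
        int port = uri.getPort() < 0 ? DEFAULT_PORT : uri.getPort();
        return new SrbUrlSketch(uri.getHost(), port, user, password, domain, uri.getPath());
    }

    public static void main(String[] args) {
        SrbUrlSketch parts = parse(
                "srb://srbAdmin.NCC:secret@ebony.rtpnc.epa.gov/nccZone/home/srbAdmin.NCC",
                "NCC");
        System.out.println(parts.user + "@" + parts.domain + " on "
                + parts.host + ":" + parts.port + parts.path);
    }
}
```

In the real provider these values come from GenericFileName and SRBAccount rather than from java.net.URI; the sketch only mirrors the splitting rule.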



SRB-VFS file object

SrbFileObject is where all the file operations get executed, including:

  • Attaching the file object to its resource (doAttach).
  • Creating, deleting, and renaming child folders or files (doCreateFolder, doDelete, doRename).
  • Getting file information: name, size, attributes, etc.
  • Providing I/O streams to handle reads and writes.

Listing 6 shows the SrbFileObject class.



Conclusion

SRB is a software platform that provides an interface for connecting heterogeneous data sources over a network. SRB is designed to build data grids and access replicated data sets distributed over long distances. The main purpose of this article has been to present a software interface between Jakarta Commons VFS and SRB.




Download

Description       Name             Size   Download method
Sample scripts    gr-jvfssrb.zip   3MB    HTTP


About the author

Vladimir Silva was born in Quito, Ecuador. He received a Systems Analyst degree from the Polytechnic Institute of the Army in 1994. In the same year, he came to the United States as an exchange student pursuing an M.S. degree in Computer Science at Middle Tennessee State University. After graduation, he joined the IBM "Web-Ahead" technology think tank. His interests include grid computing, neural nets, and artificial intelligence. He also holds numerous IT certifications including OCP, MCSD, and MCP.



