Grid Job submission using the Java CoG Kit

Writing Java-enabled Grid applications

Vladimir Silva, Software engineer, EMC
Vladimir Silva was born in Quito, Ecuador in 1969. He received a Systems Analyst degree from the Polytechnic Institute of the Army in 1994. The same year, he came to the United States as an exchange student pursuing a career in computer science at Middle Tennessee State University. After graduation, he joined the IBM "Web-Ahead" technology think tank. His interests include Grid computing, neural nets, and artificial intelligence. He also holds numerous IT certifications, including OCP, MCSD, and MCP.

Summary:  Grids are environments that enable software applications to integrate instruments, displays, computational and information resources that are managed by diverse organizations in widespread locations. Grid computing is all about sharing resources that are located in different places based on different architectures and belonging to different management domains. Computer Grids will create a powerful pool of computing resources capable of running the most demanding scientific and engineering applications required by researchers and businesses today. The objective of this article is to describe the basics of job submission against computer Grids using the Java CoG Kit.

Date:  01 Feb 2003
Level:  Intermediate


Introduction

Today's de facto Grid technologies are based on the Globus Toolkit, an open-architecture, open source software package. It focuses on the following areas: resource management, data management and access, application development environments, information services, and security.

Resource management focuses on providing uniform and scalable mechanisms for naming and locating computational and communication resources on remote systems, and for incorporating these resources into parallel and distributed computations.

Data management focuses on handling large amounts of data (terabytes or petabytes). Next generation applications may also require access to distributed data applications, such as collaborative environments, Grid-enabled RDBMS systems, etc.

Grid information services require high-performance execution in distributed computing environments with careful selection and configuration not only of computers, networks, and other resources, but also of the protocols and algorithms used by applications.

Grid security establishes secure relationships between a large number of dynamically created objects and across a range of administrative domains, each with its own local security policy.

If you are unfamiliar with Grid technologies, the links in the Resources section are a good place to start. The code in this article is based on the Java CoG (Commodity Grid) Kit, which combines Java technology with Grid computing to develop advanced Grid services and to access basic Globus resources (see Resources). Other CoG implementations include CORBA, Perl, and Python.


Job submission terminology

What is a job?

In Globus terminology, a job is a binary executable or command to be run on a remote resource (machine). For this job to run, the remote server must have the Globus Toolkit installed. This remote server is also referred to as a "contact" or "gatekeeper". Currently, toolkit implementations exist for all major flavors of UNIX and Linux. An example of a job string would be "/bin/ls", which produces a listing of the current working directory (similar to the "dir" command in Windows).

Job submission modes

When a job is submitted to a remote gatekeeper (server) for execution, it can run in two different modes: batch and non-batch. When a job runs in batch mode, the remote submission call will return immediately with a job-id string. Job ids use the following format: https://server.com:39374/15621/1021382777/. This job id can later be used to obtain the output of the call using standard Globus toolkit commands. In non-batch job submission, the client will wait for the remote gatekeeper to follow through with the execution and return the output. Batch mode submission is useful for jobs that take a long time, such as process-intensive computations.
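Because a job id has the shape of an ordinary URL, it can be taken apart with the standard library before it is handed to other tools. The following is a minimal sketch only: the JobIdParts class is a hypothetical helper and is not part of the Java CoG Kit, and the article does not say what the two numeric path segments mean, so the sketch leaves the path untouched.

```java
import java.net.URI;

// Hypothetical helper (not part of the Java CoG Kit): splits a job id
// of the form https://server.com:39374/15621/1021382777/ into its parts.
final class JobIdParts {
    final String host;
    final int port;
    final String path;

    private JobIdParts(String host, int port, String path) {
        this.host = host;
        this.port = port;
        this.path = path;
    }

    // Parses the job id as a URI and rejects ids without an explicit host and port.
    static JobIdParts parse(String jobId) {
        URI uri = URI.create(jobId);
        if (uri.getHost() == null || uri.getPort() < 0) {
            throw new IllegalArgumentException("Not a host:port job id: " + jobId);
        }
        return new JobIdParts(uri.getHost(), uri.getPort(), uri.getPath());
    }

    public static void main(String[] args) {
        JobIdParts id = parse("https://server.com:39374/15621/1021382777/");
        System.out.println("host=" + id.host + " port=" + id.port + " path=" + id.path);
    }
}
```

A client that stores batch job ids could use a check like this to reject malformed ids early, before attempting to query the gatekeeper.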

Resource Specification Language (RSL)

RSL provides a common interchange language to describe resources. The various components of the Globus Resource Management architecture manipulate RSL strings to perform their management functions in cooperation with the other components in the system. An example of an RSL string would be "& (executable = /bin/ls)(directory=/bin)(arguments=-l)". On a UNIX-like system, this RSL string produces a long listing of the job's working directory, which the directory attribute sets to /bin.
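RSL strings of this simple kind are just a leading "&" followed by parenthesized attribute=value pairs, so they can be assembled programmatically. The sketch below assumes exactly that shape and nothing more: RslBuilder is a hypothetical helper, not a Java CoG Kit class, and it does no quoting or escaping of values that contain spaces or parentheses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the Java CoG Kit): assembles an RSL
// conjunction "&(name=value)..." from attribute/value pairs.
final class RslBuilder {
    // LinkedHashMap keeps the attributes in insertion order.
    private final Map<String, String> attributes = new LinkedHashMap<>();

    RslBuilder set(String name, String value) {
        attributes.put(name, value);
        return this;
    }

    // Joins the attributes as (name=value) pairs after a leading '&'.
    String build() {
        StringBuilder rsl = new StringBuilder("&");
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            rsl.append('(').append(e.getKey()).append('=')
               .append(e.getValue()).append(')');
        }
        return rsl.toString();
    }

    public static void main(String[] args) {
        String rsl = new RslBuilder()
            .set("executable", "/bin/ls")
            .set("directory", "/bin")
            .set("arguments", "-l")
            .build();
        System.out.println(rsl);
    }
}
```

Building the string from named attributes keeps the executable, directory, and arguments separate in client code, which makes it harder to produce unbalanced parentheses by hand.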


Security infrastructure

The Globus Toolkit uses the Grid Security Infrastructure (GSI) for enabling secure authentication and communication over an open network. GSI provides a number of useful services for Grids, including mutual authentication and single sign-on.

GSI is based on public key encryption, X.509 certificates, and the Secure Sockets Layer (SSL) communication protocol. Extensions to these standards have been added for single sign-on and delegation. The Globus Toolkit's implementation of the GSI adheres to the Generic Security Service API (GSS-API), which is a standard API for security systems promoted by the Internet Engineering Task Force (IETF).

For this article's implementation, a guest certificate has been set up for clients to access Grid resources. The system administrator of the remote machine can then control access to those resources. It is assumed that the Java CoG Kit has been installed on the client and that a user certificate has been requested and set up properly. For instructions on obtaining a user certificate for authentication, see the Globus web site (see Resources).


Job submission basics in Java technology

A minimal Job submission Java class

The following Java class implements a minimal job submission method. This code requires the Java CoG kit libraries to compile properly. (See the Resource listing below for installation instructions.) User/host certificates are required for authentication (see Resources).

We will begin our implementation by creating a Java class to encapsulate a "GRAM Job" (see Listing 1). This class will be used to submit a job request against a machine (gatekeeper) in either batch or non-batch mode. Output will be returned on completion.


Listing 1. A Java class to encapsulate job submission
		
package globus.services;

import org.globus.security.*;
import org.globus.gram.*;
import org.globus.io.gass.server.*;
import org.globus.util.deactivator.Deactivator;

/**
 * Java CoG Job submission class
 */
public class GridJob implements GramJobListener, JobOutputListener
{
    private GassServer m_gassServer;     // GASS server: required to get job output
    private String m_gassURL = null;     // URL of the GASS server
    private GramJob m_job = null;        // GRAM job to be executed
    private String m_jobOutput = "";     // job output as string

    // Submission modes:
    //   batch     = do not wait for output
    //   non-batch = wait for output
    private boolean m_batch = false;

    private String m_remoteHost = null;  // host where job will run

    // Globus proxy used for authentication against gatekeeper
    private GlobusProxy m_proxy = null;

    // Job output variables:
    // Used for non-batch mode jobs to receive output from
    // gatekeeper through the GASS server
    private JobOutputStream m_stdoutStream = null;
    private JobOutputStream m_stderrStream = null;

    // Globus job id of the form:
    // https://server.com:39374/15621/1021382777/
    private String m_jobid = null;

    public GridJob(String Contact, boolean batch) {
        m_remoteHost = Contact;  // remote host
        m_batch = batch;         // submission mode
    }

Note the interfaces implemented: GramJobListener and JobOutputListener. The GramJobListener interface is required if our class is to wait for job status. Job status can be one of PENDING, ACTIVE, DONE, FAILED, SUSPENDED, or UNSUBMITTED. The JobOutputListener interface is required for the actual job output notifications.

The constructor of this class takes two arguments: a "Contact" or remote host where the job will run, and a boolean value representing the submission mode: true=batch, false=non-batch. Batch means "do not wait for output". In this mode, the GRAM request will return immediately with a unique job id that can be used later on to retrieve that output. Batching is useful for long-lived jobs (jobs that take hours or even days). If the request is to take a short time (seconds or minutes), set the batch flag to false (non-batch).

Listening for job output

Globus uses the GASS (Global Access to Secondary Storage) service to listen for output. To transfer this output from the server back to the client, the code in Listing 2 can be used.


Listing 2. Starting the GASS server for output transfer
		
		
    /**
     * Start the Globus GASS Server. Used to get the output from the server
     * back to the client.
     */
    private boolean startGassServer(GlobusProxy proxy) {
        if (m_gassServer != null) return true;
        try {
            m_gassServer = new GassServer(proxy, 0);
            m_gassURL = m_gassServer.getURL();
        } catch(Exception e) {
            System.err.println("gass server failed to start!");
            e.printStackTrace();
            return false;
        }
        m_gassServer.registerDefaultDeactivator();
        return true;
    }

    /**
     * Init job out listeners for non-batch mode jobs.
     */
    private void initJobOutListeners() throws Exception {
        if (m_stdoutStream != null) return;
        // job output vars
        m_stdoutStream = new JobOutputStream(this);
        m_stderrStream = new JobOutputStream(this);
        m_jobid = String.valueOf(System.currentTimeMillis());

        // register output listeners
        m_gassServer.registerJobOutputStream("err-" + m_jobid, m_stderrStream);
        m_gassServer.registerJobOutputStream("out-" + m_jobid, m_stdoutStream);
    }

A GASS server must be started before sending the GRAM request. The initJobOutListeners method is used to register stdout/stderr streams so that the output can be received from the GRAM protocol. For this to work, an id value is generated using System.currentTimeMillis(), and the streams are registered with the GASS server using that value.

Handling received output

The GramJobListener and JobOutputListener interfaces implemented by our class require several methods to be implemented to handle output sent by the server. Thus the methods shown in Table 1 and Listing 3 must be implemented.

Table 1. GramJobListener and JobOutputListener events

Method | Description
public void statusChanged(GramJob job) | Notifies the implementer when the status of a job has changed
public void outputChanged(String output) | Called whenever the job's output is updated
public void outputClosed() | Called when the job has finished and no more output will be generated

Listing 3. Handling output received from the remote server through the GASS protocol
			
			
    /**
     * This method is used to notify the implementer when the status of a
     * GramJob has changed.
     *
     * @param job The GramJob whose status has changed.
     */
    public void statusChanged(GramJob job) {
        try {
            if (job.getStatus() == GramJob.STATUS_DONE) {
                // notify waiting thread when job ready
                m_jobOutput = "Job sent. url=" + job.getIDAsString();
                // if notify enabled return URL as output
                synchronized(this) {
                    notify();
                }
            }
        }
        catch (Exception ex) {
            System.out.println("statusChanged Error:" + ex.getMessage());
        }
    }

    /**
     * It is called whenever the job's output
     * has been updated.
     *
     * @param output new output
     */
    public void outputChanged(String output) {
        m_jobOutput += output;
    }

    /**
     * It is called whenever job finished
     * and no more output will be generated.
     */
    public void outputClosed() {
    }

Note the synchronization mechanism implemented by statusChanged. When the job completes execution (status equals DONE), a notification is sent to the waiting submission thread so that it can return. We will see in the actual request implementation that, if the job is submitted in non-batch mode, the current thread must wait for the gatekeeper to complete and return the output to the client. This scheme is recommended for jobs that take a short time to complete. If your job is to run for hours or days, use batch mode.
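The hand-off between the callback thread and the waiting thread can be tried out without a Grid. The following is a sketch under stated assumptions, not the article's class: CompletionLatchSketch is a hypothetical stand-in that uses a plain thread in place of the GRAM and GASS callbacks, and it adds a boolean done flag (which the listings do not have) so that the waiter is not lost if the notification arrives before wait() is called, and so that spurious wakeups are ignored.

```java
// Stand-in for the listener/waiter hand-off, using a plain thread instead
// of GRAM callbacks. The "done" flag is an addition for illustration.
final class CompletionLatchSketch {
    private final StringBuilder output = new StringBuilder();
    private boolean done = false;

    // Plays the role of outputChanged(): accumulate output as it arrives.
    synchronized void outputChanged(String chunk) {
        output.append(chunk);
    }

    // Plays the role of statusChanged() on DONE: record completion, wake waiters.
    synchronized void jobDone() {
        done = true;
        notifyAll();
    }

    // Plays the role of the non-batch wait in the submitting thread.
    synchronized String awaitOutput() {
        try {
            while (!done) {   // loop guards against early and spurious wakeups
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return output.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        CompletionLatchSketch job = new CompletionLatchSketch();
        // Simulated callback thread delivering output, then completion.
        Thread callback = new Thread(() -> {
            job.outputChanged("total 80\n");
            job.jobDone();
        });
        callback.start();
        System.out.print(job.awaitOutput());
        callback.join();
    }
}
```

Checking a flag inside a loop around wait() is the usual Java idiom for this kind of hand-off; a bare wait() can miss a notification that was sent before the waiting thread reached it.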

Sending the GRAM job request

The main part of the GRAM request is implemented by the following method. The steps required for job submission include:

  • Loading a proxy. The GlobusProxy class provides a convenient method: GlobusProxy.getDefaultUserProxy(). Note that the Java CoG Kit and user certificates must be set up properly.
  • Starting the GASS server.
  • Setting up job output listeners for the client to receive output/error streams.
  • Formatting the RSL string properly according to the submission mode.
  • Sending the actual GRAM job request and waiting for output (if non-batch).

It is critical that the Java CoG Kit be installed properly and the user certificates set up correctly; otherwise, the proxy load call will fail and throw an exception. A host certificate must also be installed on the remote server (gatekeeper). Listing 4 shows the code.


Listing 4. Sending the GRAM request
			
    public synchronized String GlobusRun(String RSL) {
        try {
            // load default Globus proxy. Java CoG kit must be installed
            // and a user certificate set up properly
            m_proxy = GlobusProxy.getDefaultUserProxy();

            // Start GASS server
            if (!startGassServer(m_proxy)) {
                throw new Exception("Unable to start GASS server.");
            }

            // set up job output listeners
            initJobOutListeners();

            System.out.println("proxy Issuer=" + m_proxy.getIssuer() +
                "\nsubject=" + m_proxy.getSubject());
            System.out.println("Strength=" + m_proxy.getStrength() +
                " Gass: " + m_gassServer.toString());
            System.out.println("GASS URL: " + m_gassURL);

            // Append GASS URL to job string so we can get some output back
            String newRSL = null;

            // if non-batch, then get some output back
            if (!m_batch) {
                newRSL = "&" + RSL.substring(0, RSL.indexOf('&')) +
                    "(rsl_substitution=(GLOBUSRUN_GASS_URL " + m_gassURL + "))" +
                    RSL.substring(RSL.indexOf('&') + 1, RSL.length()) +
                    "(stdout=$(GLOBUSRUN_GASS_URL)/dev/stdout-" + m_jobid + ")" +
                    "(stderr=$(GLOBUSRUN_GASS_URL)/dev/stderr-" + m_jobid + ")";
            }
            else {
                // format batching RSL so output can be retrieved later on
                // using any GTK commands
                newRSL = RSL +
                    "(stdout=x-gass-cache://$(GLOBUS_GRAM_JOB_CONTACT)stdout anExtraTag)" +
                    "(stderr=x-gass-cache://$(GLOBUS_GRAM_JOB_CONTACT)stderr anExtraTag)";
            }

            m_job = new GramJob(newRSL);

            // set proxy. CoG kit and user credentials must be installed and set
            // up properly
            m_job.setCredentials(m_proxy);

            // if non-batch then listen for output
            if (!m_batch) m_job.addListener(this);

            System.out.println("Sending job request to: " + m_remoteHost);
            m_job.request(m_remoteHost, m_batch, false);

            // Wait for job to complete
            if (!m_batch) {
                synchronized(this) {
                    try {
                        wait();
                    } catch(InterruptedException e) {
                    }
                }
            }
            else {
                // do not wait for job. Return immediately
                m_jobOutput = "Job sent. url=" + m_job.getIDAsString();
            }
        }
        catch (Exception ex) {
            if (m_gassServer != null) {
                // unregister from gass server
                m_gassServer.unregisterJobOutputStream("err-" + m_jobid);
                m_gassServer.unregisterJobOutputStream("out-" + m_jobid);
            }
            m_jobOutput = "Error submitting job: " + ex.getClass() + ":"
                + ex.getMessage();
        }
        // cleanup
        Deactivator.deactivateAll();
        return m_jobOutput;
    }

This method is declared synchronized, so no two threads can run it on the same GridJob instance at the same time. If the request is sent in non-batch mode, the submitting thread must wait for the server to complete execution and return the output. When the job completes, the statusChanged method fires with the status DONE and calls notify(), which wakes the waiting thread so that it can return the output to the client.
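The RSL rewriting step is plain string manipulation, so it can be examined in isolation. The sketch below mirrors the string handling of Listing 4 in a hypothetical RslRewriter class that uses no CoG classes; the GASS URL is the one from the sample output, and the job id value is a placeholder. Unlike the listing, it rejects an RSL string without "&" up front, because indexOf would return -1 and substring would then throw.

```java
// Hypothetical helper mirroring the string handling in Listing 4 so that
// it can be run without a Grid. No CoG classes are involved.
final class RslRewriter {
    // Non-batch: insert the GASS URL substitution after '&' and append
    // stdout/stderr attributes that point at the client's GASS server.
    static String forNonBatch(String rsl, String gassUrl, String jobId) {
        int amp = rsl.indexOf('&');
        if (amp < 0) {
            throw new IllegalArgumentException("RSL must contain '&': " + rsl);
        }
        return "&" + rsl.substring(0, amp)
            + "(rsl_substitution=(GLOBUSRUN_GASS_URL " + gassUrl + "))"
            + rsl.substring(amp + 1)
            + "(stdout=$(GLOBUSRUN_GASS_URL)/dev/stdout-" + jobId + ")"
            + "(stderr=$(GLOBUSRUN_GASS_URL)/dev/stderr-" + jobId + ")";
    }

    // Batch: append the x-gass-cache stdout/stderr attributes from Listing 4.
    static String forBatch(String rsl) {
        return rsl
            + "(stdout=x-gass-cache://$(GLOBUS_GRAM_JOB_CONTACT)stdout anExtraTag)"
            + "(stderr=x-gass-cache://$(GLOBUS_GRAM_JOB_CONTACT)stderr anExtraTag)";
    }

    public static void main(String[] args) {
        String rsl = "& (executable = /bin/ls)(directory=/bin)(arguments=-l)";
        // GASS URL taken from the sample output; the job id is a placeholder.
        System.out.println(forNonBatch(rsl, "https://9.45.188.210:4294", "1021382777"));
        System.out.println(forBatch(rsl));
    }
}
```

Printing the rewritten string before sending it is a cheap way to check that the parentheses balance and that the stdout and stderr labels carry the same id that was registered with the GASS server.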

Ready to begin with job submission

Now we are ready to begin sending jobs to the "gatekeepers" in our Grid. This can be done by running the class with the arguments shown in Listing 5.


Listing 5. Running a job against a machine on the Grid
		
    public static void main(String[] args) {
        String RSL = "& (executable = /bin/ls)(directory=/bin)(arguments=-l)";
        GridJob Job1 = new GridJob("myserver.mygrid.com", false);

        String jobOut = Job1.GlobusRun(RSL);

        System.out.println(jobOut);
    }
}   // end of class GridJob

Note the RSL string "& (executable = /bin/ls)(directory=/bin)(arguments=-l)". It contains the actual job arguments to be executed on the remote server. These arguments can be changed to represent any executable/program present on the server. The particular execution of this class will produce the output shown in Listing 6.


Listing 6. Sample output of a Grid job execution
		
proxy Issuer=CN=BlueGrid Guest,OU=Guest_Project,O=BlueGrid,O=IBM
subject=CN=proxy,CN=BlueGrid Guest,OU=Guest_Project,O=BlueGrid,O=IBM
Strength=512 Gass: GassServer: https://9.45.188.210:4294 options 
  (r:+ w:+ so:+ se:+ rc:-)
GASS URL: https://9.45.188.210:4294 
Sending job request to: 9.45.124.126

total 80
-rw-r--r--  1 root   root      3998 Jun 18  2002 anaconda-ks.cfg
drwx------ 3 root   root      4096 Sep 16 11:29 Desktop
drwx------ 5 root   root      4096 Oct 10 10:57 evolution
-rw-r--r--  1 root   root      25472 Jun 18  2002 install.log
-rw-r--r--  1 root   root      0 Jun 17  2002 install.log.syslog
drwxr-xr-x 2 root root     4096 Oct 18 09:04 logbr
drwx------  2 root root      4096 Sep  5 08:28 mail


Conclusion

This class was written specifically to implement a minimal Grid Job submission application using the Java CoG Kit. Further enhancement ideas include encapsulating Grid logic into a web service. Such a service has the advantage of allowing clients to access Grid resources without requiring the Grid API libraries to be installed. Furthermore, clients can be written in any computer language that supports the SOAP protocol, such as Java, Visual Basic, HTML, etc. Currently, there are efforts to develop an open Grid services infrastructure by the Globus project.


Resources


