Implementing reconnection logic in a Java EE application

Enterprise JavaBeans and web-based applications that need to reconnect automatically when a queue manager fails must implement their own reconnection logic.

The following options give more information on how you might achieve this:

Allow the application to fail

This approach requires no application changes, but it does require an administrative reconfiguration of the connection factory definition to include the CONNECTIONNAMELIST property. It also requires the invoker to handle a failure appropriately. Note that the invoker must do this anyway for failures, such as MQRC_Q_FULL, that are not related to connection failure.

Example code for this process:


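import java.io.IOException;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class SimpleServlet extends HttpServlet {
  public void doGet(HttpServletRequest request,
                    HttpServletResponse response)
       throws ServletException, IOException {
    try {
      // get connection factory/ queue
      InitialContext ic = new InitialContext();
      ConnectionFactory cf =
          (ConnectionFactory) ic.lookup("java:comp/env/jms/WMQCF");
      Queue q = (Queue) ic.lookup("java:comp/env/jms/WMQQueue");

      // send a message
      Connection c = cf.createConnection();
      try {
        Session s = c.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer p = s.createProducer(q);
        Message m = s.createTextMessage();
        p.send(m);
      }
      finally {
        // done, release the connection even if the send failed
        c.close();
      }
    }
    catch (JMSException je) {
      // process exception, for example by reporting the failure to the invoker
    }
    catch (NamingException ne) {
      // the JNDI lookup failed, so the request cannot be processed
      throw new ServletException("JNDI lookup failed", ne);
    }
  }
}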

The preceding code assumes that the connection factory used by this servlet has the CONNECTIONNAMELIST property defined.

When the servlet first runs, a new connection is created using the CONNECTIONNAMELIST property, assuming that no pooled connections are available from other applications connecting to the same queue manager.

When the connection is released following a close() call, it is returned to the pool and reused the next time the servlet runs, without referring to the CONNECTIONNAMELIST. This continues until a connection failure occurs, at which point a CONNECTION_ERROR_OCCURRED event is generated. This event prompts the pool to destroy the failed connection.

When the application next runs, no pooled connection is available and the CONNECTIONNAMELIST is used to connect to the first available queue manager. If a queue manager failover has taken place (for example, because the failure was not a transitory network failure), the servlet connects to the backup instance once it is available.
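
As a rough illustration of connecting to the first available queue manager, the following sketch parses a connection name list of the form host(port),host(port) and returns the first entry that a caller-supplied check reports as available. It is a toy model, not the IBM MQ or application server implementation. The ConnectionNameList class, the Endpoint type, and the isAvailable check are hypothetical names, and port 1414 is assumed for entries that omit a port.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical toy model of choosing an entry from a connection name list. */
final class ConnectionNameList {

    /** Port assumed when an entry does not specify one. */
    static final int DEFAULT_PORT = 1414;

    /** One host(port) entry from the list. */
    static final class Endpoint {
        final String host;
        final int port;

        Endpoint(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return host + "(" + port + ")";
        }
    }

    private ConnectionNameList() {}

    /** Splits "hostA(1414),hostB(1415)" into endpoints, keeping the listed order. */
    static List<Endpoint> parse(String connectionNameList) {
        List<Endpoint> endpoints = new ArrayList<>();
        for (String raw : connectionNameList.split(",")) {
            String entry = raw.trim();
            if (entry.isEmpty()) {
                continue;
            }
            int open = entry.indexOf('(');
            if (open < 0) {
                endpoints.add(new Endpoint(entry, DEFAULT_PORT));
                continue;
            }
            int close = entry.indexOf(')', open);
            if (close < 0) {
                throw new IllegalArgumentException("missing ')' in entry: " + entry);
            }
            String host = entry.substring(0, open).trim();
            int port = Integer.parseInt(entry.substring(open + 1, close).trim());
            endpoints.add(new Endpoint(host, port));
        }
        return endpoints;
    }

    /** Returns the first endpoint, in list order, that the check accepts. */
    static Optional<Endpoint> firstAvailable(String connectionNameList,
                                             Predicate<Endpoint> isAvailable) {
        return parse(connectionNameList).stream().filter(isAvailable).findFirst();
    }
}
```

Because the entries are tried in the order in which they are listed, the primary instance would normally be listed first and the backup instance after it.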

If other resources, such as databases, are involved in the application, it might be appropriate to indicate that the application server should roll back the transaction.

Handle reconnection within the application

If the invoker is unable to process a failure from the servlet, then reconnection must be handled within the application. As shown in the following example, this requires the application to cache the connection factory that it looked up from JNDI, so that it can request a new connection, and to handle a JMSException such as JMSCMQ0001: WebSphere MQ call failed with compcode '2' ('MQCC_FAILED') reason '2009' ('MQRC_CONNECTION_BROKEN').


public void doGet(HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException {

  try {
    // get connection factory/ queue and cache them in fields
    // so that setupResources() can reuse them
    InitialContext ic = new InitialContext();
    cf = (ConnectionFactory) ic.lookup("java:comp/env/jms/WMQCF");
    destination = (Destination) ic.lookup("java:comp/env/jms/WMQQueue");
  }
  catch (NamingException ne) {
    throw new ServletException("JNDI lookup failed", ne);
  }

  setupResources();

  // loop sending messages
  while (!sendComplete) {
    try {
      // create the next message to send
      msg.setText("message sent at " + new Date());
      // and send it
      producer.send(msg);
    }
    catch (JMSException je) {
      // drive reconnection
      setupResources();
    }
  }
}

The setupResources() method, shown in the following example, creates the JMS objects and includes a sleep and retry loop to handle non-instantaneous reconnection. In practice, the sleep prevents a rapid succession of reconnect attempts. Note that exit conditions have been omitted from the example for clarity.


private void setupResources() {

  boolean connected = false;
  while (!connected) {
    try {
      connection = cf.createConnection(); // cf cached from JNDI lookup
      session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
      msg = session.createTextMessage();
      producer = session.createProducer(destination); // destination cached from JNDI lookup
      // no exception? then we connected ok
      connected = true;
    }
    catch (JMSException je) {
      // sleep and then have another attempt
      try {
        Thread.sleep(30 * 1000);
      }
      catch (InterruptedException ie) {
        // interruption is ignored here; exit conditions are omitted
      }
    }
  }
}
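
The exit conditions that the example omits could take the form of a bounded number of attempts. The following sketch shows one way to do that, assuming a hypothetical BoundedRetry helper that is not part of JMS or IBM MQ. The attempt limit and the delay are illustrative values chosen by the caller.

```java
/** Hypothetical helper: retries an action a limited number of times with a fixed delay. */
final class BoundedRetry {

    /** An action that may fail with any exception, for example creating JMS objects. */
    @FunctionalInterface
    interface Attempt {
        void run() throws Exception;
    }

    private BoundedRetry() {}

    /**
     * Runs the action until it succeeds or maxAttempts is reached.
     * Returns the number of attempts used. Throws IllegalStateException,
     * with the last failure as its cause, if every attempt fails.
     */
    static int run(Attempt action, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return attempt; // no exception, so the action succeeded
            } catch (Exception e) {
                last = e;       // remember the failure and try again
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    // exit condition: stop retrying when the thread is interrupted
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }
}
```

In setupResources(), the statements inside the try block would become the action, with the 30 second sleep passed as the delay. The caller can then report the final failure to the invoker instead of looping indefinitely.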

If the application manages reconnection, it is important that the application releases any connections that it holds to other resources, whether these resources are other IBM® MQ queue managers or other back-end services such as databases. You must reestablish these connections when reconnection to a new IBM MQ queue manager instance is complete. If you do not release the connections first, application server resources are held unnecessarily during the reconnection attempt, and might have timed out by the time they are reused.
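
The release step can be sketched with plain AutoCloseable handles. The HeldResources class and its method names are hypothetical. In a real application the handles would be, for example, database connections or connections to other queue managers, and the application would open them again after the reconnection completes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical tracker for resources that must be released before reconnecting. */
final class HeldResources {

    private final Deque<AutoCloseable> held = new ArrayDeque<>();

    /** Records a resource that the application currently holds. */
    void hold(AutoCloseable resource) {
        held.push(resource);
    }

    /** Number of resources currently held. */
    int size() {
        return held.size();
    }

    /**
     * Closes every held resource, most recently acquired first, and returns
     * the number of handles processed. A failure to close one resource does
     * not stop the others from being closed.
     */
    int releaseAll() {
        int processed = 0;
        while (!held.isEmpty()) {
            AutoCloseable resource = held.pop();
            try {
                resource.close();
            } catch (Exception e) {
                // a real application would log this; the handle is dropped either way
            }
            processed++;
        }
        return processed;
    }
}
```

Calling releaseAll() before entering the reconnect loop means that nothing is held while the loop sleeps between attempts.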

Use of the WorkManager

For long-lived applications (for example, batch processing) where processing time is greater than a few tens of seconds, the WebSphere® Application Server WorkManager can be used. A code fragment example for WebSphere Application Server follows:


public class BatchSenderServlet extends HttpServlet  {

  private WorkManager workManager = null;
  private MessageSender sender; // background sender Work implementation

  public void init() throws ServletException {
    try {
      InitialContext ctx = new InitialContext();
      workManager = (WorkManager) ctx.lookup("java:comp/env/wm/default");
      sender = new MessageSender(5000);
      workManager.startWork(sender);
    }
    catch (NamingException | WorkException e) {
      throw new ServletException(e);
    }
  }

  public void destroy() {
    // ask the background sender to stop
    sender.release();
  }

  public void doGet(HttpServletRequest req, HttpServletResponse res)
                               throws ServletException, IOException {
    res.setContentType("text/plain");
    PrintWriter out = res.getWriter();
    if (sender.isRunning()) {
      // getStatus() is assumed to be provided by MessageSender (not shown)
      out.println(sender.getStatus());
    }
  }
}

where web.xml contains:


<resource-ref>
  <description>WorkManager</description>
  <res-ref-name>wm/default</res-ref-name>
  <res-type>com.ibm.websphere.asynchbeans.WorkManager</res-type>
  <res-auth>Container</res-auth>
  <res-sharing-scope>Shareable</res-sharing-scope>
</resource-ref>

and the batch is now implemented through the Work interface:


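import com.ibm.websphere.asynchbeans.Work;

public class MessageSender implements Work {

  private final int numberOfMessages;
  private int sendCount = 0;
  private volatile boolean sendComplete = false;

  // JMS objects shared with setupResources(), which is the same as shown earlier
  private ConnectionFactory cf;
  private Destination destination;
  private Connection connection;
  private Session session;
  private MessageProducer producer;
  private TextMessage msg;

  public MessageSender(int messages) {numberOfMessages = messages;}

  public void run() {
    try {
      // get connection factory/ queue
      InitialContext ic = new InitialContext();
      cf = (ConnectionFactory) ic.lookup("java:comp/env/jms/WMQCF");
      destination = (Destination) ic.lookup("java:comp/env/jms/WMQQueue");
    }
    catch (NamingException ne) {
      // nothing can be sent without the JNDI resources
      sendComplete = true;
      return;
    }

    setupResources();

    // loop sending messages
    while (!sendComplete) {
      try {
        // create the next message to send
        msg.setText("message sent at " + new Date());
        // and send it
        producer.send(msg);
        sendCount++;
        // are we finished?
        if (sendCount >= numberOfMessages) {sendComplete = true;}
      }
      catch (JMSException je) {
        // drive reconnection
        setupResources();
      }
    }
  }

  public boolean isRunning() {return !sendComplete;}

  public void release() {sendComplete = true;}
}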

If the batch processing takes a long time to run (for example, because of large messages, a slow network, or extensive database access, especially when coupled with slow failover), the server starts to output hung thread warnings, similar to the following example:

WSVR0605W: Thread "WorkManager.DefaultWorkManager : 0" (00000035) has been active for 694061 milliseconds and may be hung. There is/are 1 thread(s) in total in the server that may be hung.

These warnings can be minimized by reducing the batch size or by increasing the hung thread timeout. However, it is generally preferable to implement this processing in an EJB (for batch send) or in a message-driven bean (for consume, or consume and reply, processing).

Note that application-managed reconnection does not provide a general solution to handling runtime errors, and the application must still handle errors that are not related to connection failure. Examples include attempting to put a message to a queue that is full (2053 MQRC_Q_FULL), or attempting to connect to a queue manager using security credentials that are not valid (2035 MQRC_NOT_AUTHORIZED).

The application must also handle 2059 MQRC_Q_MGR_NOT_AVAILABLE errors, which occur when no instances are immediately available while a failover is in progress. This can be achieved by the application reporting the JMS exceptions as they occur, instead of silently attempting to reconnect.
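
One possible way to organize this handling is to map each reason code to an action. The following sketch uses only the reason codes named in this topic. The ReasonCodePolicy class, the Action names, and the policy itself are assumptions made for illustration, and how the numeric reason code is obtained from the JMSException is outside the sketch.

```java
/** Hypothetical policy that maps an IBM MQ reason code to an application action. */
final class ReasonCodePolicy {

    // Reason codes named in this topic
    static final int MQRC_CONNECTION_BROKEN = 2009;
    static final int MQRC_NOT_AUTHORIZED = 2035;
    static final int MQRC_Q_FULL = 2053;
    static final int MQRC_Q_MGR_NOT_AVAILABLE = 2059;

    /** What the application does after a failure (names are illustrative). */
    enum Action {
        RECONNECT,            // connection lost: drive reconnection
        REPORT_AND_RECONNECT, // no instance available yet: report, then retry
        REPORT_AND_FAIL       // not a connection failure: reconnecting cannot help
    }

    private ReasonCodePolicy() {}

    /** Chooses an action for the given reason code. */
    static Action actionFor(int reasonCode) {
        switch (reasonCode) {
            case MQRC_CONNECTION_BROKEN:
                return Action.RECONNECT;
            case MQRC_Q_MGR_NOT_AVAILABLE:
                return Action.REPORT_AND_RECONNECT;
            default:
                // includes MQRC_Q_FULL and MQRC_NOT_AUTHORIZED
                return Action.REPORT_AND_FAIL;
        }
    }
}
```

Treating every unlisted code as REPORT_AND_FAIL is a deliberately conservative default, so that an unexpected error is surfaced to the invoker and not hidden by a retry loop.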