IBM Support

HWSJ029E (timeout)

Question & Answer


Question

We need assistance with VisualAge for Java failures. With IMS Connector at 3.5.3, the application invokes a transaction and recycles the connection, and we are having problems; we go into production in 3 weeks. The transaction uses no synchronization, but some of the Connector code was trying to do a synchronization.

Answer

Subject: IMS Connector problem - ticket/PMR # 07809,550
p.s. 65262,227 was also intermittent so I think it's a good match for the symptoms you are getting.

I've found two other references to HWSJ029E in our PMR archives. Extracts for your information.

PMR 65262,227
HWSJ029E: the most likely cause is that either the IMS Connect or the client timeout value was set too low. Increase the timeout value to see if the problem improves.
PMR 41487,379
This is from a VAJ 3.0 PMR, I think some of the component names have changed since then.

ITOC [IMS TOC] has a configuration parameter, timeout, on the TCPIP configuration statement that controls how long ITOC will wait for a reply from IMS. If timeout is 0, ITOC will wait forever. So, if a message from a client (e.g., a servlet) is placed on an IMS message queue, and there isn't an IMS message region for the IMS application program to run in, and timeout is 0, ITOC will wait forever and the client will appear to hang. You can see the value of timeout using the ITOC command VIEWHWS. Unfortunately, timeout is a configuration parameter, so it cannot be controlled programmatically.
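The "0 means wait forever" behaviour described above can be illustrated with a small sketch. This is not ITOC code: the class name ReplyWaiter and the in-memory queue standing in for the reply path from IMS are assumptions made purely for illustration, and the code targets a current JDK rather than the JDK levels mentioned in this document.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: a queue stands in for the reply path from IMS.
class ReplyWaiter {
    /**
     * Waits for a reply. A timeout of 0 means "wait forever"; a positive
     * value bounds the wait in milliseconds. Returns null if the bounded
     * wait expires with no reply, or if the waiting thread is interrupted.
     */
    static String awaitReply(BlockingQueue<String> replies, long timeoutMillis) {
        try {
            if (timeoutMillis == 0) {
                return replies.take();   // blocks until a reply arrives
            }
            return replies.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> replies = new LinkedBlockingQueue<>();
        // No message region is available, so no reply is ever queued.
        String reply = awaitReply(replies, 200);
        System.out.println(reply == null ? "timed out" : reply);
        // awaitReply(replies, 0) would block here indefinitely,
        // which is how the client comes to look hung.
    }
}
```

With a bounded wait the caller gets control back and can report the failure; with 0 it never does.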


We were experiencing a problem trying to run our Java application that uses IMS Connectors.


diagnosing IMS Connector error HWSJ029E

My project is using the IBM Common Connector Framework 3.5.3 for Java to run some IMS transactions from a Web application. When the application is hosted under the Apache Tomcat Test Environment 3.2.3 under VisualAge 3.5.3 (JDK 1.2.2) on a Windows 2000 environment, the application seems to work fine. When the application is moved to a Solaris 2.8 under Oracle 9iAS 903 under JDK 1.3, the IMS transactions occasionally generate an EOFException with the following stack trace:
com.ibm.connector.CommunicationException: HWSJ029E: com.ibm.connector.imstoc.IMSAdapter@6a91a5.receive(InteractionSpec) error. A communciation error occured while receiving the output message. [java.io.EOFException]
  at com.ibm.connector.imstoc.IMSAdapter.receive(IMSAdapter.java:1267)
  at com.ibm.connector.imstoc.IMSConnection.call(IMSConnection.java:518)
  at com.ibm.connector.imstoc.IMSCommunication.execute(IMSCommunication.java:473)
  at com.ibm.ivj.eab.command.CommandCommunicationPrimitive.execute(CommandCommunicationPrimitive.java:193)
  at com.ibm.ivj.eab.command.CommunicationCommand.connEtoM10(CommunicationCommand.java:239)
  at com.ibm.ivj.eab.command.CommunicationCommand.internalExecutionStarting(CommunicationCommand.java:898)
  at com.ibm.ivj.eab.command.Command.fireInternalExecutionStarting(Command.java:401)
  at com.ibm.ivj.eab.command.CommunicationCommand.internalExecute(CommunicationCommand.java:886)
  at com.ibm.ivj.eab.command.Command.execute(Command.java:283)
  at com.ibm.ivj.eab.command.Command.execute(Command.java:246)
  at com.ibm.ivj.eab.command.CommunicationCommand.execute(CommunicationCommand.java:705)


The message explanation states 3 possible causes, as follows:

1) The client on IMS Connect was closed. (For example, the IMS Connect STOPCLNT command was issued.)
2) The connection with the host has been reset or the TCP/IP connection is down.
3) The output message is incomplete or corrupted.

We would appreciate suggestions on steps that can be used to determine which of the 3 possible causes is our root cause. Please note that a recent change to our code involves an attempt to re-use connections in the multi-thread servlet environment:

try {
    ...
    runtimeContext = (JavaRuntimeContext) Environment.getIMSRuntimeContext();
    cmd = new com.customer.etrac.services.PTRCUPTLCommand();
    // Populate the command bean with data to connect to the host
    cmd.setHostName(Environment.getImsHostName());
    cmd.setPortNumber(31221);
    cmd.setDataStoreName(Environment.getImsDataStoreName());
    cmd.setReapTime(1000);
    cmd.setMaxConnections(150);
    cmd.setMinConnections(0);
    cmd.setUnusedTimeout(5000);
    // Populate the command bean with input data for the IMS transaction.
    cmd.setUPTL__TRAN__CODE("TTRCUPTL");
    cmd.setUPTL__LL((short) ((PTRCUPTLInputMsg) cmd.getInput()).getSize());
    cmd.setUPTL__ZZ((short) 0);
    cmd.setFill_0(" ");
    cmd.setUPTL__SCREEN__MODE("I");
    cmd.setUPTL__USER__ID(user.getRacfUserId());
    cmd.setUPTL__USER__ROLE(user.getRole().getMastRevByCode());
    ...
    cmd.execute();
    ((JavaCoordinator) runtimeContext.getCoordinator()).commit();
}
catch (Exception e)
{
    e.printStackTrace(System.err);
    ((JavaCoordinator) runtimeContext.getCoordinator()).rollback();
    if (runtimeContext != null)
    {
        runtimeContext.close();
        runtimeContext.removeCurrent();
    }
    throw new ServicesConfigurationException(
        "problem with IMSConnector.retrieveLookAheadData() on thread "
            + Thread.currentThread().toString(),
        e);
}
Object bean = cmd.getOutput();

...



We contacted the customer. Their IMS Connector for Java application had been running without problems for a few months with VAJ 3.5 and WAS 3.5.3 under Windows 2000 (JDK 1.2.2) and Sun Solaris (JDK 1.3.1). They recently added code to reuse the connection and to use commit and rollback in the application. The changes ran without problems under Windows 2000 but intermittently produced message HWSJ029E when running under Sun Solaris. They suspect that they need some kind of synchronization in the application, but would like to talk to someone with in-depth knowledge of IMS Connector for Java.

He will send me their application timeout settings for three situations (i.e., before the changes, after the changes with the error, and after backing out the changes). I also asked for the IMS Connect timeout value, and I understand they are running under one IMS Connect without any changes to the TIMEOUT value.

Here is some of the connector code that we have had trouble with. Our "Current connector" code seems to work fine, but we would like confirmation that the changes that we have made are appropriate. I have also included the "Previous connector" code that threw the errors that prompted us to open this ticket. The "Old connector" is also in this e-mail that does not reuse the IMS Connections.
The IMS Connection timeout value is 30,000 centiseconds (5 minutes).
Please note that there was not a long delay before we saw this problem. We got an error right after we invoked a transaction. We would like to know:
1) Can the error that we were getting be explained by our call to commit() on a transaction that was not synchronized?
2) Why did we see this error on the execute() call and not the commit() call, since removing the commit() seems to have resolved the problem?
3) Why was the problem intermittent, instead of happening on each transaction call?
4) Does our new version of our connector code look reasonable for a servlet (multi-threaded) environment?

---------------------------------------------------------
public static com.ibm.connector.connectionmanager.ConnectionManager getImsConnManager()
{
    return imsConnManager;
}

The imsConnManager value is set by a static initializer in our main servlet:

import com.customer.etrac.exceptions.EtracSystemException;
import java.util.Properties;
import javax.servlet.*;
import com.ibm.connector.connectionmanager.ConnectionManager;
import java.io.IOException;
import javax.servlet.http.*;
import com.customer.etrac.services.Environment;
import com.customer.etrac.services.DbConnect;
import com.customer.etrac.util.EtracListCacheManager;
/**
 * The one and only servlet (per the Apache Struts framework).
 * Creation date: (4/16/2002 3:53:59 PM)
 * @author:
 */
public class EtracServlet extends org.apache.struts.action.ActionServlet
{
    static ConnectionManager imsConnManager;
    static
    {
        imsConnManager = new ConnectionManager();
    }
}

===================================================================
He also sent in the details and the stack trace:
My project is using the IBM Common Connector Framework 3.5.3 for Java to run some IMS transactions from a Web application. When the application is hosted under the Apache Tomcat Test Environment 3.2.3 under VisualAge 3.5.3 (JDK 1.2.2) on a Windows 2000 environment, the application seems to work fine. When the application is moved to Solaris 2.8 under Oracle 9iAS 903 under JDK 1.3, the IMS transactions occasionally fail. The stack trace was sent by e-mail to imslvl2@us.ibm.com.


SUMMARY OF CUSTOMER'S PROBLEM -
Regarding the EOFException the customer is getting, our thinking is that they were setting up their applications to use SyncLevel=None but were then sending in a confirm using the JavaCoordinator after receiving the transaction response (i.e., reusing the connection and using commit and rollback). IMS Connect would consider this a protocol violation and disconnect the socket. IMS Connector for Java would then try to use that connection for the next interaction and receive an EOFException on the receive because the socket had disappeared. TCP/IP does not complain during the send about a socket being disconnected at the IMS Connect end; it only issues the exception on the subsequent receive.

To verify that this is what is happening, we would need a recorder trace that includes the time window when the failure occurs, and possibly an IC4J Level 3 trace that covers the same time period. They should be able to find the error in the IC4J trace and then select the IMS Connect recorder trace records that have the same client ID. Also, the message data in the two traces should match up.
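The send-succeeds, receive-fails behaviour described in this summary can be reproduced with plain sockets. The sketch below is an illustration only: a local ServerSocket plays the part of a server that drops the connection, the class name StaleSocketDemo is made up, and no IMS Connect or IMS Connector for Java code is involved. The exact exception type on the receive can vary by platform.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Illustration only: a loopback server accepts a connection and closes it,
// then the client tries to reuse the now-stale socket.
class StaleSocketDemo {
    /** Returns a short description of what the send and the receive each did. */
    static String run() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setSoTimeout(5000);    // never let the demo hang
            server.accept().close();      // the server end disconnects

            String sendResult;
            try {
                OutputStream out = client.getOutputStream();
                out.write(new byte[] {0, 4, 0, 0});   // small request
                out.flush();
                sendResult = "send ok";
            } catch (IOException e) {
                sendResult = "send failed: " + e.getClass().getSimpleName();
            }

            String receiveResult;
            try {
                new DataInputStream(client.getInputStream()).readInt();
                receiveResult = "receive ok";
            } catch (IOException e) {
                // Typically java.io.EOFException; some platforms report a reset instead.
                receiveResult = "receive failed: " + e.getClass().getSimpleName();
            }
            return sendResult + ", " + receiveResult;
        } catch (IOException e) {
            return "setup failed: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The write is accepted by the local TCP stack even though the peer has gone away, so the failure surfaces only when the client waits for the reply.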

===================================================================
A) Please escalate this ticket to severity 2. (This is not a production
outage, but we have a production release date of 5/17 and have not
pinned down exactly why we are getting these failures.)
B) After our teleconference yesterday (5/1), we ran a load test on our
preproduction (Solaris 280R, 2 cpu) machine and had no connector
errors. As discussed in our teleconference, the code that we are
using does not call commit()/rollback() for our IMS transaction that
is a "SYNC_LEVEL_NONE" transaction.
C) Today we are performing load tests on our future production machine
(Solaris V480, 4 cpu) with much higher loads. The initial results
are that we were getting several (14) of the HWSJ029E errors with
the same stack trace that we saw before we removed the commit()/
rollback() calls. Strangely, subsequent tests had fewer of these
errors (2) and our latest test had none. This may indicate that this
problem is somehow related to the mainframe or network load.
(This might also explain why no failures were found in yesterday's
load test, because it was run in the late afternoon.)
What we need from IBM is:
1) Please review our code that we supplied in a previous e-mail to
ensure that it looks okay for how we are invoking the IMS transaction
and cleaning up the IMS connections for reuse. Each thread (servlet
request) gets its own connection from the pool and uses it to invoke
the transaction. Please feel free to suggest code changes.
2) Please advise us on both Java IMS connector trace settings and
mainframe trace settings that might help to indicate the root cause
of this problem. We will make these changes and re-run our tests.
3) If the IMS connections are going bad for some reason, is there a way
to detect a bad connection through code tests? (This would allow us
to "sprinkle" this check in our code to pin down when the connection
goes bad. It also might be a way to detect the need to get a new
connection before running the transaction.)
4) If the mainframe experiences problems, then does the Java pool of IMS
Connections need to be re-initialized?
5) Please suggest any work-arounds that might be reasonable to get our
code working under load. For example, Java synchronized code sections,
not re-using the connections, retrying (readonly) transactions once
we get this error, etc.
6) Please respond by e-mail with any suggestions or code changes ASAP.
(We will be working this weekend doing performance testing, so please
let us know if you would like to talk to us tomorrow (5/3).)

1) ANSWER : I don't see a problem with the code as it is written.
2) ANSWER : Change your getIMSRuntimeContext() to use
((JavaRASService) runtimeContext.getRASService())                  
              .setTraceLevel(RASService.RAS_TRACE_INTERNAL);

Please turn on the IMS Connect Recorder trace(mainframe) and the
IMS Connector for Java level 3 trace when you re-run your tests.
It will be a large output and you need to find the error in the
IC4J (IMS Connector for Java) level 3 trace and then select the
IMS Connect recorder trace records that have the same client ID.
Also the message data in the two traces should match up.
3) ANSWER : Unfortunately, the only way to detect a bad TCP/IP
connection is to try to use that connection (actually, you can't
just do a send on that connection, you need to do a send and a
receive - not sure if a receive alone will do the trick - I know
a send without a subsequent receive will not.) So re-trying read
only transactions after an error as you suggest below might be a
good work-around. This would help in the case of an isolated error
which causes IMS Connect to disconnect from a connection.
4) ANSWER : If IMS Connect goes down, all of the existing connections
go bad. Unfortunately, the connection manager does not monitor
connections and clean them up if they go bad. As a result, you
only find out that a connection is bad when you try to re-use it
after it has gone bad. WAS 5.0 allows you to set a property on a
connection factory that allows all connections in a pool to be
removed if any connection in the pool is marked dirty but that
function is not available in WAS 4.0. The net result is that, if
IMS Connect goes down, each existing connection will need to be
tried once (and fail) before it is removed from the connection
pool. If an individual connection is disconnected by IMS Connect
due to an error of some sort (such as a protocol violation or a
timeout), only that connection needs to be removed from the pool.
5) ANSWER : Not knowing what is causing the problem, I can't suggest
using synchronization to eliminate the problem. Not using
connection pooling would eliminate any problems associated with
re-using connections that have been disconnected at the IMS
Connect end due to some kind of error, but would not address the
actual error that is occurring. You might just substitute one
type of exception for another. Re-trying transactions that fail
would most likely provide you a usable work-around. It would not
address the problem of IMS Connect going down and killing all of
the connections in a pool, but I don't think that is your problem.
It's worth noting that a connection pool would get cleaned up
faster if you were to re-try connections when a failure occurs
under these circumstances (IMS Connect having gone down,
disconnecting from all of the existing connections in a pool.)
6) ANSWER : I am sending the answers back now, and we will not be available
tomorrow. Please send me an e-mail to let me know if you
need a conference call on Monday after you have reviewed the
answers above and done your test tomorrow. The conference call
will probably be close to mid-day Monday, to allow time to set up the
conference call number. Please also let me know how many lines you require
at your end.
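The re-try work-around discussed in answers 3) to 5) can be sketched as follows. The types HostConnection, HostConnectionPool and DequePool are hypothetical stand-ins invented for this example; they are not Common Connector Framework or IMS Connector for Java classes. The sketch assumes that a failed connection is dropped and that the re-try runs on a different one, and it is only appropriate for read-only transactions.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-ins for a pooled connection and its pool.
interface HostConnection {
    String call(String request) throws IOException;
}

interface HostConnectionPool {
    HostConnection reserve();
    void release(HostConnection c);   // healthy connection goes back for reuse
    void discard(HostConnection c);   // failed connection is dropped, never reused
}

// Minimal in-memory pool used only to exercise the retry logic.
class DequePool implements HostConnectionPool {
    final Deque<HostConnection> idle = new ArrayDeque<>();
    int discarded = 0;
    public HostConnection reserve() { return idle.pop(); }
    public void release(HostConnection c) { idle.push(c); }
    public void discard(HostConnection c) { discarded++; }
}

class ReadOnlyRetry {
    /** Runs a read-only request, moving to another connection after each failure. */
    static String call(HostConnectionPool pool, String request, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            HostConnection c = pool.reserve();
            try {
                String reply = c.call(request);
                pool.release(c);
                return reply;
            } catch (IOException e) {
                pool.discard(c);   // it was found to be bad only by using it
                last = e;
            }
        }
        throw last;
    }

    /** Two stale connections followed by a good one. */
    static String demo() {
        DequePool pool = new DequePool();
        HostConnection stale = req -> { throw new EOFException("socket closed by server"); };
        HostConnection good = req -> "reply to " + req;
        pool.idle.add(stale);
        pool.idle.add(stale);
        pool.idle.add(good);
        try {
            return call(pool, "INQUIRY", 3) + " after " + pool.discarded + " discarded";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "reply to INQUIRY after 2 discarded"
    }
}
```

Each failed attempt also removes one bad connection, which is why re-trying cleans up a pool faster after the server end has gone down.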


Action taken : They have had the IC4J level 3 trace turned on
on the server side, and the recorder trace turned on
under IMS Connect, since this morning. They asked whether there is an IMS Connect
command to monitor the number of sockets being used. Informed the
customer that the only command, VIEWHWS, takes a snapshot of IMS
Connect's current connections with all the clients, but there is no command
to monitor how many sockets are being used.
They received a NoConnectionAvailableException, which required recycling
the application to clean it up. They wonder whether it will corrupt the
connection pool. He told the customer that it is not IC4J code that issues
the exception but most likely the VisualAge for Java code. We need
assistance from VisualAge for Java support to explain this. Requested that the
customer send in the trace, which we will forward to VisualAge for
Java support.
They will rerun the test to try to correlate the trace information
with the recorder trace.
Action plan : We will review the trace when it arrives and forward it to
VisualAge for Java support to answer the question. We will then wait
for a further update from the customer's test (scheduled for tonight).

Action taken : Received the exception output and requested VisualAge for Java
support to help us understand how this connection limit
exception affects the connection pool. The customer says they noticed that
once they started getting this exception, they would get lots of
them and would start seeing "unpredictable behavior" from the
application. They didn't really define what they meant by
"unpredictable behavior" as I recall. Once they "bounce" the
application (stop and restart the application on the Web server),
these errors go away and everything is fine again.
They were wondering if this meant that the connections in the
pool were being corrupted once they reached the maximum number of sockets.


We have made some minor changes to our code and we have not seen an IMS Connector error in the last few days. We will continue to test and review the logs to see if the problem has been corrected.

Received a response from VAJ support explaining why the following exception is being thrown:
ConnectionManager.reserve() error: Maximum coordinated connections reached, wait timed out, throwing NoConnectionAvailable exception.
com.ibm.connector.NoConnectionAvailableException
  at com.ibm.connector.connectionmanager.ConnectionManager.addCoordinatedConnection(ConnectionManager.java:118)

Explanation :
This exception is coming from IBM Common Connector Framework (CCF).
This exception is thrown if the communication is unable to create a
connection. The possible reason could be that server has reached limit
of number of connections in the pool and all connections are being used.
One possible solution is to increase the number of
connections in the pool. A second possible solution is for the
application code to catch this exception and, upon catching it,
re-try obtaining the connection. Upon re-trying, a connection
may have become available in the pool and can be used.
Another possible reason for this exception could be related to the network
connection, in which case even a re-try may fail. A counter therefore needs to be
set on the number of re-tries; otherwise, the application may go into an
infinite loop.
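A counted re-try of the kind described in this explanation might look like the sketch below. PoolExhaustedException is a hypothetical stand-in for com.ibm.connector.NoConnectionAvailableException, and the reserve step is passed in as a plain Supplier; neither is CCF API, and the pause between tries is an assumption added for illustration.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for com.ibm.connector.NoConnectionAvailableException.
class PoolExhaustedException extends RuntimeException {
    PoolExhaustedException(String message) { super(message); }
}

class BoundedReserve {
    /**
     * Calls reserve until it succeeds or maxRetries re-tries have been used,
     * pausing between tries. The counter guarantees that the loop terminates.
     */
    static <T> T reserveWithRetry(Supplier<T> reserve, int maxRetries, long pauseMillis) {
        int retries = 0;
        while (true) {
            try {
                return reserve.get();
            } catch (PoolExhaustedException e) {
                if (retries++ >= maxRetries) {
                    throw e;   // give up: pool still full, or the network is down
                }
                try {
                    Thread.sleep(pauseMillis);   // give other threads time to release
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice, then succeeds on the third call.
        String conn = reserveWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new PoolExhaustedException("maximum connections reached");
            }
            return "connection";
        }, 5, 10L);
        System.out.println(conn + " obtained on call " + calls[0]);
    }
}
```

With maxRetries set to n, the reserve step is attempted at most n + 1 times before the original exception is passed back to the caller.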

We consulted with the developer, passed VAJ support's explanation to the customer, and asked VAJ support the following additional question:

A further question in our minds is why reaching the connection limit exception causes the customer's system to slow down (poor performance). We would like to understand whether any further consideration in the code would relieve this situation.

When the connection limit is reached, it means that no connection is available in the pool when the application attempts to get a new one. Therefore, the application has to wait until a connection is returned to the pool, which slows everything down.
To cope with this kind of situation, we should make sure the connection is returned to the pool as soon as an application has finished using it. Ideally, an idle timeout parameter should be set on the connection, so that the connection is pulled back to the pool if the application keeps it for a while without using it. To be more sophisticated, sometimes we add a waiting timeout for the application that is waiting for a connection. When the timeout is reached, a WaitTimeoutException is thrown to the application, and the application can catch this and present a message to the end user that the network is busy, please wait, etc.
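Two of the ideas in this reply, returning the connection as soon as the work is done and bounding the time a caller waits for one, can be sketched with a semaphore. This is an illustration under stated assumptions: BoundedPool is a made-up class in which permits stand in for pooled connections, and WaitTimeoutException here is a local class that only echoes the name used above. The idle timeout is not modelled.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical class echoing the WaitTimeoutException named in the text.
class WaitTimeoutException extends RuntimeException {
    WaitTimeoutException(String message) { super(message); }
}

// Illustration only: permits stand in for pooled connections.
class BoundedPool {
    private final Semaphore permits;
    private final long maxWaitMillis;

    BoundedPool(int maxConnections, long maxWaitMillis) {
        this.permits = new Semaphore(maxConnections, true);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Runs work while holding one connection slot; always gives the slot back. */
    <T> T withConnection(Supplier<T> work) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            acquired = false;
        }
        if (!acquired) {
            throw new WaitTimeoutException(
                "no connection free within " + maxWaitMillis + " ms");
        }
        try {
            return work.get();
        } finally {
            permits.release();   // returned as soon as the work finishes, even on error
        }
    }

    int available() { return permits.availablePermits(); }

    public static void main(String[] args) {
        BoundedPool pool = new BoundedPool(1, 100L);
        String result = pool.withConnection(() -> {
            try {
                // The only slot is already held, so this inner request times out.
                return pool.withConnection(() -> "second connection");
            } catch (WaitTimeoutException e) {
                return "busy, please wait";
            }
        });
        System.out.println(result);   // prints "busy, please wait"
    }
}
```

Releasing in a finally block keeps a failed transaction from leaking its slot, which is one way a pool can end up permanently at its limit.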


They have coded retries for when an IMS Connector exception is generated, and this seems to be working properly. They feel the next move should be one that gives them more control over the connection pool.

[{"Product":{"code":"SSV7D2","label":"IMS Tools"},"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Component":"IMS Connect","Platform":[{"code":"PF025","label":"Platform Independent"},{"code":"PF035","label":"z\/OS"}],"Version":"1.1.0;1.2.0;2.1.0","Edition":"","Line of Business":{"code":"LOB35","label":"Mainframe SW"}}]

Historical Number

PMR 07809;550;000

Document Information

Modified date:
28 November 2022

UID

swg21145757