java.net.ConnectException: Connection refused

I am getting errors in the logs related to connection refusal.

Symptoms

Along with the errors, I am also seeing the following behavior intermittently:
  1. CPU usage spikes to over 100%.
  2. The application user owning the install is unable to log into the server; other users are working fine.
  3. The downstream systems are unable to interact with the application.
The product works normally upon an application restart. Why am I getting this and how can I prevent it?

Causes

When a user is created, resource limits are set for that user and for the processes it spawns. These limits prevent a single user from crashing the whole server if something fails. However, if a limit is too low, it can cause problems for the user and for the processes the user owns.
A common reason for the java.net.ConnectException is that the application user cannot create the process requested by the application, so the connection to the back-end server cannot be made. The limit involved is the "ulimit -u" setting, which specifies the maximum number of processes available to the user; on Linux, each thread counts toward this limit. If the user exhausts all available processes, it cannot make new connections, which results in the issues mentioned above and exceptions like the following:

Exception:Connection refused to host: 10.15.66.164; nested exception is:
java.net.ConnectException: Connection refused
java.rmi.ConnectException: Connection refused to host: 10.15.66.164; nested exception is:
java.net.ConnectException: Connection refused
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:210)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:196)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:122)
at com.ibm.ccd.scheduler.common.Scheduler_Stub.getRunningInfo(Scheduler_Stub.java:236)
at com.ibm.ccd.scheduler.common.JobStatus._updateStatus(JobStatus.java:71)
at com.ibm.ccd.scheduler.common.JobStatus.getCached(JobStatus.java:147)
at com.ibm.ccd.scheduler.common.JobStatus.getRunningByJobId(JobStatus.java:158)
at com.ibm.ccd.scheduler.threads.MasterThread.checkJobNotCurrentlyRunningOnAnyJVM(MasterThread.java:266)
at com.ibm.ccd.scheduler.threads.MasterThread.fuzaoRun(MasterThread.java:418)
at com.ibm.ccd.common.util.FuzaoRunnableAdapter.run(FuzaoRunnableAdapter.java:54)
at com.ibm.ccd.common.util.FuzaoThread.run(FuzaoThread.java:123)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:381)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:243)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:230)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:377)
at java.net.Socket.connect(Socket.java:539)
at java.net.Socket.connect(Socket.java:488)
at java.net.Socket.<init>(Socket.java:385)
at java.net.Socket.<init>(Socket.java:199)
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:34)
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:140)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:607)
... 11 more
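
To check whether the application user is close to the process limit described above, the number of tasks it currently owns can be compared with the "ulimit -u" value. The following is a minimal sketch, not part of the product: it assumes a Linux server with the procps version of ps, it must be run as the application user, and the helper name usage_pct is made up for illustration.

```shell
#!/bin/sh
# Sketch: compare the tasks owned by the current user with the "ulimit -u" value.
# Assumes Linux with procps "ps"; run it as the application user.

# usage_pct USED LIMIT: print USED as a whole-number percentage of LIMIT,
# or "n/a" when LIMIT is not a positive number (for example "unlimited").
usage_pct() {
    case "$2" in
        ''|0|*[!0-9]*) echo "n/a" ;;
        *) echo $(( $1 * 100 / $2 )) ;;
    esac
}

# Soft limit on processes for this user; "unknown" if the shell lacks "ulimit -u".
limit=$(ulimit -u 2>/dev/null || echo unknown)

# One output line per thread (LWP) owned by the current user.
used=$(ps -L -u "$(id -un)" -o pid= 2>/dev/null | wc -l | tr -d ' ')

echo "tasks in use: $used, limit: $limit, usage percent: $(usage_pct "$used" "$limit")"
```

A usage figure that approaches 100 while the symptoms are occurring would be consistent with the cause described here; a low figure suggests the connection refusals have another origin.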

Resolving the problem

As the application user, check the maximum number of processes allowed for that user on the server using the following command: ulimit -u
If this is set to a low value, say 1024, then increase it to 131072 or to unlimited using one of the following commands:

ulimit -u 131072
ulimit -u unlimited

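Once this value is increased and the application is restarted from a session that has the new limit, the application user should work normally. Note that ulimit changes the limit only for the current shell and the processes started from it. On Linux systems that use pam_limits, a persistent per-user value is typically set with an nproc entry in /etc/security/limits.conf.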
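
Because a process keeps the limits it was started with, it can help to confirm which value the running application is actually using. The following is a minimal sketch, assuming a Linux server where /proc/<pid>/limits is available; the helper name soft_nproc_from_limits is made up for illustration, and the PID of the application's JVM has to be found separately.

```shell
#!/bin/sh
# Sketch: show the process limit that a running process is actually using.
# Assumes Linux, where /proc/<pid>/limits has a "Max processes" row.

# Read limits text on stdin and print the soft "Max processes" value.
soft_nproc_from_limits() {
    awk '/^Max processes/ { print $3 }'
}

# PID to inspect: first argument, or this shell when none is given.
pid=${1:-$$}

if [ -r "/proc/$pid/limits" ]; then
    echo "PID $pid soft process limit: $(soft_nproc_from_limits < "/proc/$pid/limits")"
else
    echo "Cannot read /proc/$pid/limits" >&2
fi
```

Pass the PID of the application's JVM as the first argument. If the value printed is still the old one after the limit was raised, the application was most likely started from a session that did not have the new limit.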


Last updated: 21 May 2017