Topic
  • 16 replies
  • Latest Post - ‏2017-02-20T23:18:12Z by MoragH
michaesi
michaesi
17 Posts

Pinned topic SVRCONN channel appears as running while there is no connection

‏2017-01-24T09:50:06Z |

We have a client application that polls a queue for messages through a SVRCONN channel.

In some circumstances the application cannot retrieve messages from the queue, and the channel appears in RUNNING state even when there is no active connection to the queue manager.

When that happens I issue a "display  chstatus (CHANNEL2) ALL" command; here is the output.

AMQ8417: Display Channel Status details.
   CHANNEL(CHANNEL2)                       CHLTYPE(SVRCONN)
   BUFSRCVD(6)                             BUFSSENT(5)
   BYTSRCVD(1020)                          BYTSSENT(860)
   CHSTADA(2017-01-24)                     CHSTATI(09.20.51)
   COMPHDR(NONE,NONE)                      COMPMSG(NONE,NONE)
   COMPRATE(0,0)                           COMPTIME(0,0)
   CONNAME(10.250.5.150)                   CURRENT
   EXITTIME(0,0)                           HBINT(1832481381)
   JOBNAME(0000235C0000053B)               LOCLADDR( )
   LSTMSGDA( )                             LSTMSGTI( )
   MCASTAT(RUNNING)                        MCAUSER(mqm)
   MONCHL(OFF)                             MSGS(0)
   RAPPLTAG( )                             RQMNAME(QM_BOAP)
   SSLCERTI( )                             SSLKEYDA( )
   SSLKEYTI( )                             SSLPEER( )
   SSLRKEYS(0)                             STATUS(RUNNING)
   STOPREQ(NO)                             SUBSTATE(RECEIVE)
   XMITQ( )

 

The value of HBINT(1832481381) is very strange, and I am wondering whether this could be the problem.

Could the client application be setting this value incorrectly, or could this be a bug?

The MQ server is v6.0 and the client is v7.5. I know v6.0 is no longer supported, but there is no immediate possibility of upgrading it.

Any ideas welcome.

  • MoragH
    MoragH
    131 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-01-24T23:00:46Z  

    With such an old version of the MQ queue manager, you don't have the benefit of being able to use a full-duplex TCP/IP socket like you have in V7+. Therefore, if there is a TCP/IP connection problem and your socket is lost by one end, the other end may take a very long time to discover that the socket has gone away. Your only available solution to this at such an old version of MQ is to ensure you have a sensible value for TCP KeepAlive and that you actually have keepalive turned on for MQ. This will mean that orphaned connections such as you describe are cleaned up more quickly.

    When the client end of the application notices the failure, is it at least able to reconnect successfully, even though there is still the orphaned connection as far as the queue manager is concerned, or are you also suffering from an input-exclusive lock on the queue held by the orphaned connection?

    The HBINT(1832481381) looks like a display defect. Are you at least on the latest fix pack for V6?

    Cheers
    Morag
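As a rough illustration of what TCP keepalive means at the operating-system level, the Python sketch below switches it on for a single socket. This is not how MQ is configured (MQ requests the option itself once keepalive is turned on in its configuration); the function name and the timer values are assumptions made up for the example, and the per-socket timer options exist only on some platforms.

```python
import socket


def make_keepalive_socket(idle_s=300, interval_s=30, probes=5):
    """Create a TCP socket with keepalive probing enabled (illustrative only)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the TCP stack to probe an idle connection so a dead peer is detected.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timers are platform specific; skip any the platform lacks.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            pass  # the system-wide defaults stay in effect for this timer
    return sock


if __name__ == "__main__":
    s = make_keepalive_socket()
    print("keepalive on:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

Without keepalive, an idle socket whose peer has vanished can sit in a receive call indefinitely, which is the orphaned-connection situation described above.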

  • michaesi
    michaesi
    17 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-01-25T07:54:52Z  

    Morag,

    Thanks for your reply. I will try to upgrade as soon as possible.

    There are a few things that look rather strange.

    1. Yesterday I was connected to the queue manager through a server-connection channel, using MQ Explorer v9 to monitor the channel status for a whole day, and I did not have a single problem or disconnection.

    2. In normal operation the HBINT value is 300; when "THE problem" occurs and the channel appears as RUNNING, HBINT is set to a ridiculously large number.

    3. From what I have been told the application does not understand any disconnection. It is like it thinks there are no more messages in the queue (while they do exist), and the only way to recover is to end the application and restart it. At that time the queue manager shows no connection to the channel, yet the channel appears as RUNNING. Supposedly, when there are no more messages to read, the application issues a disconnect and reconnects after a while.

    Is it possible that there is a problem on the application side in the way it handles connects and disconnects?

  • MoragH
    MoragH
    131 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-01-25T19:04:55Z  

    What is the return code (MQRC) that the application receives that makes it think there are no more messages? The actual MQRC is a very important number.

    Also, you didn't say anything about TCP Keepalive. Do you have that turned on in MQ?

    Cheers
    Morag

  • michaesi
    michaesi
    17 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-01-26T13:15:06Z  

    Today I got this error from the application, but I suspect it occurred because at one point I had terminated the channel with

     STOP CHANNEL(CHANNEL2) MODE(TERMINATE)

    in order to clear it in case it was hung (it had the strange HBINT value).

    Where should I look for MQ client related errors?

    An error occured in windows service. Error:Error Code: NOTFOUND: errorNo=2538 Error DateTime:26/01/2017 14:57:55 Error Source:CustomLibraries.MQAdapter StackTrace: at CustomLibraries.MQAdapter.MessageQueueManager.ConnectMQbyAdapter(String strQueueManagerName, String strChannelInfo, String getQueue, String putQueue, Int32 waitInterval, Int32 charset) at BAConfService.BAConfManagerBase.InitMessageQueue(String getQueue, String putQueue, String queueManager, String connectionInfo) at BAConfService.MQPollingServiceConfiscations.PollingPass(Int32 messagesToSkip)

  • michaesi
    michaesi
    17 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-01-26T14:13:08Z  

    I was able to check the error log from the client

    26/01/2017 09:54:03 - Process(248.1) User(egarnusr) Program(onfiscations.exe)
                          Host(EGARNSRV) Installation(Installation1)
                          VRMF(7.5.0.2)
    AMQ9259: Connection timed out from host '10.250.2.185(1417)'.

    EXPLANATION:
    A connection from host '10.250.2.185(1417)' over TCP/IP timed out.
    ACTION:
    The select() [TIMEOUT] 360 seconds call timed out. Check to see why data was
    not received in the expected time. Correct the problem. Reconnect the channel,
    or wait for a retrying channel to reconnect itself.
    ----- amqccita.c : 4172 -------------------------------------------------------
    26/01/2017 14:57:54 - Process(8760.1) User(egarnusr) Program(onfiscations.exe)
                          Host(EGARNSRV) Installation(Installation1)
                          VRMF(7.5.0.2)
    AMQ9209: Connection to host '10.250.2.185 (10.250.2.185)(1417)' for channel
    'CHANNEL2' closed.

    EXPLANATION:
    An error occurred receiving data from '10.250.2.185 (10.250.2.185)(1417)' over
    TCP/IP.  The connection to the remote host has unexpectedly terminated.

    The channel name is 'CHANNEL2'; in some cases it cannot be determined and so is
    shown as '????'.
    ACTION:
    Tell the systems administrator.
    ----- amqccita.c : 3843 -------------------------------------------------------

     

  • MoragH
    MoragH
    131 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-01-26T19:46:44Z  

    You should look for the MQRC reason code inside the application itself. If it's a Java application you'll need to catch the exception to see it, but otherwise it's just returned to you on the MQ API call. You said earlier, "the application does not understand any disconnection, It is like it thinks there are no more messages in the queue". What is the evidence that causes the application to believe this?

    You show a reason code 2538 in one of your posts. Does the application see this error code? This reason code occurs at connect time and indicates that the connect failed because the host was unavailable.

    The timeout in your error log is the connection being tidied up after nothing happened for 360 seconds. The fact that you see this in the client error log and that your queue manager side of the connection is also still there suggests that the connection is still live.

    I think the most important thing to understand here is why the application believes that there are no more messages on the queue. Does it just get the next message or does it look for specific message ids? Is it browsing? Have any messages been rolled back? What language is your application written in?

    Cheers
    Morag

  • michaesi
    michaesi
    17 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-01T13:01:35Z  

    Today I noticed strange behaviour on the SVRCONN channel: I was getting multiple entries in the channel status. What does this mean? Are there hung/stuck threads for this channel? I notice that two of them have the weird HBINT value; CONNAME is the system with the MQ client, BUT there is no RAPPLTAG.

    Any idea?

    display  chstatus (CHANNEL2) ALL
         1 : display  chstatus (CHANNEL2) ALL
    AMQ8417: Display Channel Status details.
       CHANNEL(CHANNEL2)                       CHLTYPE(SVRCONN)
       BUFSRCVD(6)                             BUFSSENT(5)
       BYTSRCVD(1020)                          BYTSSENT(860)
       CHSTADA(2017-01-27)                     CHSTATI(14.42.11)
       COMPHDR(NONE,NONE)                      COMPMSG(NONE,NONE)
       COMPRATE(0,0)                           COMPTIME(0,0)
       CONNAME(10.250.5.150)                   CURRENT
       EXITTIME(0,0)                           HBINT(1734624111)
       JOBNAME(000019660000010B)               LOCLADDR( )
       LSTMSGDA( )                             LSTMSGTI( )
       MCASTAT(RUNNING)                        MCAUSER(mqm)
       MONCHL(OFF)                             MSGS(0)
       RAPPLTAG( )                             RQMNAME(QM_BOAP)
       SSLCERTI( )                             SSLKEYDA( )
       SSLKEYTI( )                             SSLPEER( )
       SSLRKEYS(0)                             STATUS(RUNNING)
       STOPREQ(NO)                             SUBSTATE(RECEIVE)
       XMITQ( )
    AMQ8417: Display Channel Status details.
       CHANNEL(CHANNEL2)                       CHLTYPE(SVRCONN)
       BUFSRCVD(6)                             BUFSSENT(5)
       BYTSRCVD(1020)                          BYTSSENT(860)
       CHSTADA(2017-01-31)                     CHSTATI(11.16.53)
       COMPHDR(NONE,NONE)                      COMPMSG(NONE,NONE)
       COMPRATE(0,0)                           COMPTIME(0,0)
       CONNAME(10.250.5.150)                   CURRENT
       EXITTIME(0,0)                           HBINT(1214734918)
       JOBNAME(0000196600002A0B)               LOCLADDR( )
       LSTMSGDA( )                             LSTMSGTI( )
       MCASTAT(RUNNING)                        MCAUSER(mqm)
       MONCHL(OFF)                             MSGS(0)
       RAPPLTAG( )                             RQMNAME(QM_BOAP)
       SSLCERTI( )                             SSLKEYDA( )
       SSLKEYTI( )                             SSLPEER( )
       SSLRKEYS(0)                             STATUS(RUNNING)
       STOPREQ(NO)                             SUBSTATE(RECEIVE)
       XMITQ( )
    AMQ8417: Display Channel Status details.
       CHANNEL(CHANNEL2)                       CHLTYPE(SVRCONN)
       BUFSRCVD(15)                            BUFSSENT(13)
       BYTSRCVD(2860)                          BYTSSENT(2468)
       CHSTADA(2017-02-01)                     CHSTATI(14.17.52)
       COMPHDR(NONE,NONE)                      COMPMSG(NONE,NONE)
       COMPRATE(0,0)                           COMPTIME(0,0)
       CONNAME(10.250.5.150)                   CURRENT
       EXITTIME(0,0)                           HBINT(300)
       JOBNAME(000019660000480E)               LOCLADDR( )
       LSTMSGDA(2017-02-01)                    LSTMSGTI(14.17.52)
       MCASTAT(RUNNING)                        MCAUSER(mqm)
       MONCHL(OFF)                             MSGS(12)
       RAPPLTAG(ConfServiceConfiscations.exe)
       RQMNAME(QM_BOAP)                        SSLCERTI( )
       SSLKEYDA( )                             SSLKEYTI( )
       SSLPEER( )                              SSLRKEYS(0)
       STATUS(RUNNING)                         STOPREQ(NO)
       SUBSTATE(MQGET)                         XMITQ( )

     

  • MoragH
    MoragH
    131 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-01T18:32:02Z  

    Multiple channel status instances mean that you have made multiple connections from the client application, perhaps because earlier ones failed and another connection was made without a disconnect first.

    No RAPPLTAG may just mean that they didn't get far enough through the start-up process to exchange that data before deciding it wasn't good.

    Did you have any luck finding the MQRC return code yet? That is a very important part of this puzzle.

    Cheers
    Morag
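When several instances are listed, it can help to pick out the suspect ones mechanically. The following Python sketch parses DISPLAY CHSTATUS text like the output posted above and flags instances that have no RAPPLTAG, no messages, and are waiting in SUBSTATE(RECEIVE). The function names and the flagging rule are assumptions for illustration, and the sample is an abbreviated copy of the posted output rather than anything a real tool produces.

```python
import re

# NAME(value) pairs as printed by runmqsc; bare keywords such as CURRENT are skipped.
ATTRIBUTE = re.compile(r"([A-Z]+)\(([^)]*)\)")


def parse_chstatus(text):
    """Return one dict of attributes per AMQ8417 block."""
    blocks = text.split("AMQ8417:")[1:]
    return [{name: value.strip() for name, value in ATTRIBUTE.findall(block)}
            for block in blocks]


def looks_orphaned(instance):
    """Heuristic (an assumption): no application tag, no messages, waiting to receive."""
    return (instance.get("RAPPLTAG", "") == ""
            and instance.get("MSGS") == "0"
            and instance.get("SUBSTATE") == "RECEIVE")


# Abbreviated from the status output posted in this thread.
SAMPLE = """
AMQ8417: Display Channel Status details.
   CHANNEL(CHANNEL2)                       CHLTYPE(SVRCONN)
   CONNAME(10.250.5.150)                   CURRENT
   HBINT(1734624111)                       JOBNAME(000019660000010B)
   MSGS(0)                                 RAPPLTAG( )
   STATUS(RUNNING)                         SUBSTATE(RECEIVE)
AMQ8417: Display Channel Status details.
   CHANNEL(CHANNEL2)                       CHLTYPE(SVRCONN)
   CONNAME(10.250.5.150)                   CURRENT
   HBINT(300)                              JOBNAME(000019660000480E)
   MSGS(12)                                RAPPLTAG(ConfServiceConfiscations.exe)
   STATUS(RUNNING)                         SUBSTATE(MQGET)
"""

if __name__ == "__main__":
    for inst in parse_chstatus(SAMPLE):
        flag = "SUSPECT" if looks_orphaned(inst) else "ok"
        print(inst["JOBNAME"], inst["HBINT"], flag)
```

A flagged instance is only a candidate for closer inspection; a connection that is genuinely in the middle of starting up would match the same rule for a moment.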

  • michaesi
    michaesi
    17 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-02T08:15:08Z  

    Can the RC be found in any log file? Are they displayed anywhere, or must the application catch and store them? I'll probably have a meeting with the developer next week and can ask for details.

    I will try to enable TCP keepalive to see if this makes any difference.

    On the server side I have edited the qm.ini file and added

    TCP:

    Keepalive=yes

    and also changed the channel property KeepAlive interval = 300.

    I have restarted the queue manager, but the KAINT value is not reflected in the channel properties.

         2 : display  chstatus (CHANNEL2) ALL
    AMQ8417: Display Channel Status details.
       CHANNEL(CHANNEL2)                       CHLTYPE(SVRCONN)
       BUFSRCVD(6)                             BUFSSENT(5)
       BYTSRCVD(1020)                          BYTSSENT(860)
       CHSTADA(2017-02-02)                     CHSTATI(10.03.18)
       COMPHDR(NONE,NONE)                      COMPMSG(NONE,NONE)
       COMPRATE(0,0)                           COMPTIME(0,0)
       CONNAME(10.250.5.150)                   CURRENT
       EXITTIME(0,0)                           HBINT(794979442)
       JOBNAME(0000537200000008)               LOCLADDR( )
       LSTMSGDA( )                             LSTMSGTI( )
       MCASTAT(RUNNING)                        MCAUSER(mqm)
       MONCHL(OFF)                             MSGS(0)
       RAPPLTAG( )                             RQMNAME(QM_BOAP)
       SSLCERTI( )                             SSLKEYDA( )
       SSLKEYTI( )                             SSLPEER( )
       SSLRKEYS(0)                             STATUS(RUNNING)
       STOPREQ(NO)                             SUBSTATE(RECEIVE)
       XMITQ( )

     

    Am I doing something wrong?

    Also, I have reviewed APAR IC98704 and I was wondering whether it could be related to our problem, and whether upgrading the client from v7.5.0.2 to the latest level would make any difference.

     

  • MoragH
    MoragH
    131 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-02T10:23:23Z  

    The MQRC is given to the application, so only the application can tell you what it is.

    Per channel KeepAlive (aka KAINT) is only applicable on z/OS. I don't think you are running on z/OS. You can only set system-wide keepalive settings, which will apply to all channels. Make sure KeepAlive is also on in the TCP network settings for your platform. SupportPac MD0C has some more information on this.

    APAR IC98704 is applicable if a CCDT is in use. Are you using a CCDT? You have not mentioned this in your question so I could not tell you whether it applies to you or not. The client at V7.5.0.2 does not have the fix for this APAR, as it is fixed in V7.5.0.4.

    Cheers
    Morag
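One quick sanity check on the queue manager side is to confirm that the keepalive line really sits under the TCP stanza of qm.ini. The Python sketch below does that for a simplified view of the file (stanza name ending in a colon, Name=Value lines beneath it). The helper names and the sample contents are made up for the example, and repeated stanzas of the same name are simply merged, which a real qm.ini reader would not necessarily do.

```python
def parse_ini_stanzas(text):
    """Parse 'Stanza:' headers followed by 'Name=Value' lines into nested dicts."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.endswith(":") and "=" not in line:
            current = line[:-1]
            stanzas.setdefault(current, {})
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            stanzas[current][key.strip().lower()] = value.strip()
    return stanzas


def keepalive_enabled(text):
    """True when the TCP stanza contains KeepAlive=Yes (compared case-insensitively)."""
    tcp = parse_ini_stanzas(text).get("TCP", {})
    return tcp.get("keepalive", "no").lower() == "yes"


# Hypothetical file contents, not taken from a real queue manager.
SAMPLE_QM_INI = """
Log:
   LogPrimaryFiles=3
TCP:
   KeepAlive=Yes
"""

if __name__ == "__main__":
    print("keepalive enabled:", keepalive_enabled(SAMPLE_QM_INI))
```

This only confirms what MQ has been asked to do; the probe timing itself still comes from the operating system's TCP settings, as noted above.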

  • michaesi
    michaesi
    17 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-02T13:20:02Z  

    I found out that the application error codes are kept in a table. I am not sure exactly what is kept. Today's entries are attached below.

    I noticed 2 major errors:

    Error Code: NOTFOUND: errorNo=2537 and

    Error Code: MQRC_Q_MGR_NOT_AVAILABLE

    I checked against the MQ server log, and it appears that at those times I had issued a

    STOP CHANNEL(CHANNEL2) MODE(TERMINATE)

    When the program is not able to poll messages of the queue there are no RCs in this table, which makes me think that the application does not realize that there are still pending messages.

    I noticed that when it happens no application appears to be connected in MQ, while there is an established TCP connection with the client on the system:

     TCP    10.250.5.150:60797     10.250.2.185:1417      ESTABLISHED

    The connection does NOT go away when the application is stopped; it only disappears when the queue manager is stopped.

     

    ErrorLogId: 81521
    ExceptionName: System.Threading.ThreadAbortException
    ErrorCode: NULL
    Message: Thread was being aborted.
    StackTrace:
       at System.Threading.Thread.SleepInternal(Int32 millisecondsTimeout)
       at System.Threading.Thread.Sleep(Int32 millisecondsTimeout)
       at BAConfServiceAccRegisters.MQPollingServiceAccRegistersForwarder.PollingPass(Int32 messagesToSkip)
    DateCreated: 2017-02-02 08:20:43.100
    ParentErrorId: NULL
    ApplicationName: MQPollingServiceAccRegistersForwarder

    ErrorLogId: 81522
    ExceptionName: System.Threading.ThreadAbortException
    ErrorCode: NULL
    Message: Thread was being aborted.
    StackTrace:
       at CustomLibraries.MQAdapter.MessageQueueManager.ReadLocalQMsgByAdapter(String strQueueName)
       at BAConfServiceAccRegisters.MQPollingServiceAccRegistersOrchestrator.PollingPass(Int32 messagesToSkip)
    DateCreated: 2017-02-02 08:37:56.697
    ParentErrorId: NULL
    ApplicationName: MQPollingServiceAccRegistersOrchestrator

    ErrorLogId: 81523
    ExceptionName: System.Threading.ThreadAbortException
    ErrorCode: NULL
    Message: Thread was being aborted.
    StackTrace:
       at System.Threading.Thread.SleepInternal(Int32 millisecondsTimeout)
       at System.Threading.Thread.Sleep(Int32 millisecondsTimeout)
       at BAConfServiceAccRegisters.MQPollingServiceAccRegistersForwarder.PollingPass(Int32 messagesToSkip)
    DateCreated: 2017-02-02 08:49:42.050
    ParentErrorId: NULL
    ApplicationName: MQPollingServiceAccRegistersForwarder

    ErrorLogId: 81524
    ExceptionName: CustomLibraries.MQAdapter.MQAdapterException
    ErrorCode: NULL
    Message: Error Code: NOTFOUND: errorNo=2538
    StackTrace:
       at CustomLibraries.MQAdapter.MessageQueueManager.ConnectMQbyAdapter(String strQueueManagerName, String strChannelInfo, String getQueue, String putQueue, Int32 waitInterval, Int32 charset)
       at BAConfService.BAConfManagerBase.InitMessageQueue(String getQueue, String putQueue, String queueManager, String connectionInfo)
       at BAConfService.MQPollingServiceConfiscations.PollingPass(Int32 messagesToSkip)
    DateCreated: 2017-02-02 09:35:12.960
    ParentErrorId: NULL
    ApplicationName: MQPollingServiceConfiscations

    ErrorLogId: 81525
    ExceptionName: CustomLibraries.MQAdapter.MQAdapterException
    ErrorCode: NULL
    Message: Error Code: NOTFOUND: errorNo=2538
    StackTrace:
       at CustomLibraries.MQAdapter.MessageQueueManager.ConnectMQbyAdapter(String strQueueManagerName, String strChannelInfo, String getQueue, String putQueue, Int32 waitInterval, Int32 charset)
       at BAConfService.BAConfManagerBase.InitMessageQueue(String getQueue, String putQueue, String queueManager, String connectionInfo)
       at BAConfService.MQPollingServiceConfiscations.PollingPass(Int32 messagesToSkip)
    DateCreated: 2017-02-02 09:42:49.543
    ParentErrorId: NULL
    ApplicationName: MQPollingServiceConfiscations

    ErrorLogId: 81526
    ExceptionName: CustomLibraries.MQAdapter.MQAdapterException
    ErrorCode: NULL
    Message: Error Code: NOTFOUND: errorNo=2538
    StackTrace:
       at CustomLibraries.MQAdapter.MessageQueueManager.ConnectMQbyAdapter(String strQueueManagerName, String strChannelInfo, String getQueue, String putQueue, Int32 waitInterval, Int32 charset)
       at BAConfService.BAConfManagerBase.InitMessageQueue(String getQueue, String putQueue, String queueManager, String connectionInfo)
       at BAConfService.MQPollingServiceConfiscations.PollingPass(Int32 messagesToSkip)
    DateCreated: 2017-02-02 09:59:11.487
    ParentErrorId: NULL
    ApplicationName: MQPollingServiceConfiscations

    ErrorLogId: 81527
    ExceptionName: CustomLibraries.MQAdapter.MQAdapterException
    ErrorCode: NULL
    Message: Error Code: NOTFOUND: errorNo=2538
    StackTrace:
       at CustomLibraries.MQAdapter.MessageQueueManager.ConnectMQbyAdapter(String strQueueManagerName, String strChannelInfo, String getQueue, String putQueue, Int32 waitInterval, Int32 charset)
       at BAConfService.BAConfManagerBase.InitMessageQueue(String getQueue, String putQueue, String queueManager, String connectionInfo)
       at BAConfService.MQPollingServiceConfiscations.PollingPass(Int32 messagesToSkip)
    DateCreated: 2017-02-02 12:00:13.580
    ParentErrorId: NULL
    ApplicationName: MQPollingServiceConfiscations

    ErrorLogId: 81528
    ExceptionName: CustomLibraries.MQAdapter.MQAdapterException
    ErrorCode: NULL
    Message: Error Code: NOTFOUND: errorNo=2537
    StackTrace:
       at CustomLibraries.MQAdapter.MessageQueueManager.ConnectMQbyAdapter(String strQueueManagerName, String strChannelInfo, String getQueue, String putQueue, Int32 waitInterval, Int32 charset)
       at BAConfService.BAConfManagerBase.InitMessageQueue(String getQueue, String putQueue, String queueManager, String connectionInfo)
       at BAConfService.MQPollingServiceConfiscations.PollingPass(Int32 messagesToSkip)
    DateCreated: 2017-02-02 12:04:14.663
    ParentErrorId: NULL
    ApplicationName: MQPollingServiceConfiscations

    ErrorLogId: 81529
    ExceptionName: CustomLibraries.MQAdapter.MQAdapterException
    ErrorCode: NULL
    Message: Error Code: MQRC_Q_MGR_NOT_AVAILABLE
    StackTrace:
       at CustomLibraries.MQAdapter.MessageQueueManager.ConnectMQbyAdapter(String strQueueManagerName, String strChannelInfo, String getQueue, String putQueue, Int32 waitInterval, Int32 charset)
       at BAConfService.BAConfManagerBase.InitMessageQueue(String getQueue, String putQueue, String queueManager, String connectionInfo)
       at BAConfService.MQPollingServiceConfiscations.PollingPass(Int32 messagesToSkip)
    DateCreated: 2017-02-02 14:15:05.517
    ParentErrorId: NULL
    ApplicationName: MQPollingServiceConfiscations

  • MoragH
    MoragH
    131 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-02T22:13:00Z  

    The MQRCs 2537 (MQRC_CHANNEL_NOT_AVAILABLE) and 2059 (MQRC_Q_MGR_NOT_AVAILABLE) would make perfect sense in relation to you issuing an MQ command to STOP the channel. I do not believe this is related to your MQGET problem.

    You have now mentioned a few times that the "program is not able to poll messages of the queue". How does the application determine this? You say there are no MQRCs at the time when this determination is made, so something else must be telling the application that it cannot pull messages off the queue. What is that? In other words, how does it know? Is it just that it doesn't find any messages? Is it because a message processing piece of code doesn't get driven? Or something else? This appears to be the crux of the problem and yet we know almost nothing about it.

    You say "I noticed that when it happens no application appears to be connected in MQ". What do you look at in MQ to see that no application is connected? Is it DISPLAY CHSTATUS or DISPLAY CONN or some other command?

    There shouldn't be an established TCP connection if the application has disconnected. If the application doesn't explicitly disconnect, you should get the application corrected to do that, but while there is an established socket, you should be able to see evidence of it in MQ using DISPLAY CHSTATUS.

    I assume the log snippet you provide is of the time when you stopped the channel forcibly? Or is there something else you are showing there?

    Cheers
    Morag
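Because the reason code is what separates "the queue is empty" from "the connection is gone", an application log is easier to read when it records that decision explicitly. The Python sketch below shows one possible mapping for the codes mentioned in this thread plus 2009 and 2033. It does not call any MQ library; the function names and the three action labels are assumptions for illustration only.

```python
# Reason codes named in this thread, plus the usual "no message" and "broken" codes.
MQRC_NAMES = {
    2009: "MQRC_CONNECTION_BROKEN",
    2033: "MQRC_NO_MSG_AVAILABLE",
    2059: "MQRC_Q_MGR_NOT_AVAILABLE",
    2537: "MQRC_CHANNEL_NOT_AVAILABLE",
    2538: "MQRC_HOST_NOT_AVAILABLE",
}

# Codes that indicate the connection is unusable and must be re-established.
RECONNECT_CODES = {2009, 2059, 2537, 2538}


def classify(reason):
    """Map a reason code to a coarse action label (labels are hypothetical)."""
    if reason == 2033:
        return "queue-empty"   # nothing to get right now; poll again later
    if reason in RECONNECT_CODES:
        return "reconnect"     # drop the connection handle and connect again
    return "investigate"       # anything else deserves a closer look


def describe(reason):
    """Build a log line that keeps both the number and the symbolic name."""
    name = MQRC_NAMES.get(reason, "unknown reason code")
    return f"MQRC {reason} ({name}): {classify(reason)}"


if __name__ == "__main__":
    for code in (2033, 2537, 2538, 2059, 2035):
        print(describe(code))
```

Logging the number together with the symbolic name avoids entries such as "NOTFOUND: errorNo=2538", where the meaning has to be looked up afterwards.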

  • michaesi
    michaesi
    17 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-03T14:19:41Z  

    You are right, the main issue is what the application is doing when the messages are NOT consumed. I am not able to answer that yet; hopefully next week I can have a meeting with the developer to clarify those questions.

    All I can do is observe, mostly from the server side, what is happening. I see the following things:

    1. The output of DISPLAY CONN does not include any connected application.

    2. The output of DISPLAY CHSTATUS shows a status of RUNNING with a connection from the client, without RAPPLTAG and with a huge HBINT.

    3. The above status returns to INACTIVE when the client application is terminated (and HBINT returns to 300).

    It looks like the channel is in a "strange" condition with a TCP/IP session active, but without an active application connection.

    I was wondering whether this behaviour could be caused by some connection option in the application that triggers strange behaviour in QM v6. Is there a safe set of connection options to use with v6?

    As soon as I have details from the application I will post an update.

    Thanks

  • MoragH
    MoragH
    131 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-06T22:04:48Z  

    I will wait to hear what you get from the application side of things.

    My best guess as to the reasons for your odd behaviour seen on the queue manager side is as follows:-

    • DISPLAY CHSTATUS examples you have shown comparing a 'good' status and a 'bad' status show that the bad status (with the odd HBINT and the missing RAPPLTAG) has not done as many network flows (look at numbers in BUFSRCVD, BUFSSENT, BYTSRCVD, and BYTSSENT). The channel is in a SUBSTATE(RECEIVE) waiting (for a long time) for the other end to send the next flow and it never arrived.
    • You indicated that what little you could see in the application log reported a 2538 (MQRC_HOST_NOT_AVAILABLE) which is a connection time error suggesting the network connection got broken (and only the client side noticed) during connection time - this would align well with the previous bullet point.
    • TCP/IP is notoriously bad at noticing that one end of a socket has died, and can leave the other end hanging for ages - TCP Keepalive is useful at cleaning up such orphaned sockets. MQ relies on TCP/IP so if TCP/IP does not inform the user of the socket (MQ) that the socket is dead, MQ continues to wait for data on it. Over the years newer releases of MQ have enhanced the way MQ can spot these orphaned connections itself and not rely on waiting for TCP/IP to tell it. This is the "strange" condition you refer to.
    • DISPLAY CHSTATUS has no RAPPLTAG because the processing of the connection has not proceeded far enough yet
    • DISPLAY CONN does not show any connected application because the processing of the connection has not proceeded far enough yet
    • When your client application is terminated, the socket is cleaned up and all is well again. You say the status returns to INACTIVE and the HBINT returns to 300. What is actually happening is that the tool you are using to display the channel reverts to showing you the HBINT value from the channel definition, because when the status says INACTIVE there is no status at all. (Try issuing the MQSC command DISPLAY CHSTATUS to see what I mean.)
    • The behaviour is not caused by any option that the application used, but simply a network blip, and an old version of MQ. There are no unsafe connection options.
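
    To make the TCP keepalive point above concrete, here is a minimal sketch of what "turning keepalive on" means for a single socket. It is a generic sockets illustration, not MQ code: the function name open_keepalive_socket is made up for this example, and for MQ itself keepalive is switched on through the queue manager and client configuration rather than in application code.

```python
import socket


def open_keepalive_socket() -> socket.socket:
    """Create a TCP socket with keepalive probing switched on."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE asks the OS to probe an idle connection and to report
    # an error on the socket if the peer has silently disappeared.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


if __name__ == "__main__":
    s = open_keepalive_socket()
    enabled = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
    print("keepalive enabled:", enabled != 0)
    s.close()
```

    How long the OS waits before probing is a system-wide setting on most platforms, so an orphaned socket may still take a while to be noticed.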

    Cheers
    Morag

    Updated on 2017-02-06T22:05:30Z by MoragH
  • michaesi
    michaesi
    17 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-20T13:30:45Z  

    We have moved to a new MQ Server v8 (the client is still at v7.5.0.7).

    The problem remains, but MQ now appears to handle it better: the channel does not remain RUNNING, the TCP connection is cleared, and there are no strange HBINT values.

    At the time of the problem we had the following error in the QMgr log:

    02/20/17 10:57:29 - Process(22216780.1446) User(mqm) Program(amqrmppa)
                        Host(mqprod) Installation(Installation1)
                        VRMF(8.0.0.5) QMgr(QM_BOAP)                  
    AMQ9208: Error on receive from host 10.250.5.150.

    EXPLANATION:
    An error occurred receiving data from 10.250.5.150 over TCP/IP. This may be due
    to a communications failure.
    ACTION:
    The return code from the TCP/IP read() call was 73 (X'49'). Record these values
    and tell the systems administrator

    Tomorrow we will have a meeting with the developers to discuss this further. It appears that the application that causes the problem uses browse as well as get.

    I'll keep you posted with further updates.

  • MoragH
    MoragH
    131 Posts

    Re: SVRCONN channel appears as running while there is no connection

    ‏2017-02-20T23:18:12Z  

    The error you have mentioned in your error log is an error that the TCP/IP socket noticed and passed to the MQ channel using that socket. The error message includes the TCP/IP error number, 73 (X'49'). You can look this number up in the TCP/IP documentation for your platform. I think it is ECONNRESET, which means that the other end of the socket was ended abruptly rather than disconnected properly. Does your application do this?
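
    As a quick way to look up the error number, the sketch below asks the operating system for the name and text of an errno value. The helper name describe_errno is made up for this example. Errno numbers differ between platforms, so it only gives a meaningful answer when run on the same operating system as the queue manager that logged the AMQ9208 message.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for an errno value on the current platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # X'49' in the AMQ9208 message is hexadecimal for decimal 73.
    code = int("49", 16)
    print(code, describe_errno(code))
```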

    If you use browse and get at the same time on a get call, you will simply get a bad return code. That will not cause this problem in itself. How your application reacts to the bad return code may be worth looking into.
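
    One way to think about "how the application reacts" is as a small decision table keyed on the MQ reason code. The sketch below is only an illustration under assumptions: the function next_action and its action names are hypothetical, 2046 (MQRC_OPTIONS_ERROR) is used as an example of a bad return code from a get call, and which codes your application actually sees is something to confirm with the developers.

```python
# MQ reason codes; the action names returned below are hypothetical.
MQRC_NONE = 0
MQRC_CONNECTION_BROKEN = 2009
MQRC_NO_MSG_AVAILABLE = 2033
MQRC_OPTIONS_ERROR = 2046
MQRC_HOST_NOT_AVAILABLE = 2538

# Codes that indicate the connection itself is gone or unusable.
CONNECTION_CODES = {MQRC_CONNECTION_BROKEN, MQRC_HOST_NOT_AVAILABLE}


def next_action(reason: int) -> str:
    """Decide what a polling client should do after a get call."""
    if reason == MQRC_NONE:
        return "process"    # a message was returned
    if reason == MQRC_NO_MSG_AVAILABLE:
        return "wait"       # queue is empty; poll again later
    if reason in CONNECTION_CODES:
        return "reconnect"  # disconnect cleanly, then connect again
    return "fail"           # e.g. an options error: fix the code, do not retry


if __name__ == "__main__":
    for rc in (0, 2033, 2538, 2046):
        print(rc, next_action(rc))
```

    The point of the sketch is that a programming error such as conflicting get options should not be retried in a tight loop, and that the application should not drop the socket without disconnecting when it sees a code it does not expect.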

    Cheers
    Morag