IBM Support

QRadar: connectionsPerHost[10] maximum [10] reached - for host [/XXX.XXX.XXX.XXX] ... dropping connection - no events from log source

Troubleshooting


Problem

Some devices, or applications running on them, fail to maintain an established TCP session with the QRadar collector host and repeatedly drop and reconnect, typically because of an underlying networking issue. Another common cause is a corporate firewall on the client (device) side that is configured to time out idle TCP connections. If many of the devices that connect to the same collector show this behavior, investigate the collector side as well.

Symptom

In such cases, the Max TCP Syslog Connections Per Host limit (10 by default) is reached on the QRadar collector because old, stale sessions are never properly closed from the sending device side. This usually happens when a firewall times out idle sessions but is not configured to send an RST (reset) packet when it does so, so the collector never learns that the session is gone.
Up to 10 stale connections can then be observed on the collector host, and /var/log/qradar.error contains errors similar to:
[ecs-ec-ingress.ecs-ec-ingress] [TcpSyslog(0.0.0.0/514) Protocol Provider Thread: class com.q1labs.semsources.sources.tcpsyslog.TcpSyslogProvider0] com.q1labs.semsources.sources.tcpsyslog.TcpSyslogProvider: [WARN] [NOT:0000004000][10.10.10.20/- -] [-/- -]connectionsPerHost[10] maximum [10] reached for host [IP ADDRESS] ... dropping connection
The collector no longer accepts new connections from the device on TCP port 514, so no events are received from that point on.
In rare cases, a device might legitimately use more than one active session to send syslog traffic. If you suspect that this is the case, check the device configuration and its documentation.
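To see which hosts hit the limit and how often, the warnings in /var/log/qradar.error can be tallied per host. The following is a minimal sketch, not an IBM-provided tool: the function name count_dropped_hosts is hypothetical, and the pattern assumes the message format shown in the excerpt above (with or without a leading slash before the address).

```shell
#!/bin/sh
# Hypothetical helper: read QRadar error log lines on stdin and print
# "<count> <host>" for every host that hit the per-host connection limit,
# most frequent first. Assumes the message format shown in this document.
count_dropped_hosts() {
    sed -n 's|.*reached[^[]*for host \[/\{0,1\}\([^]]*\)\].*|\1|p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# Usage on a collector (log path taken from this document):
#   count_dropped_hosts < /var/log/qradar.error
```

A host that appears with a high count is a good first candidate for the connection checks described under Diagnosing The Problem.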

Diagnosing The Problem

Example troubleshooting
If you have access to the device and know how to check its active sessions, you will probably find a single session that uses one source port. It is usually the most recently opened one, which typically has the highest source port number among the connections seen on the collector side. This confirms that only one connection is required.

If you have no access to the device, or do not know how to check its sessions, you can investigate on the collector instead:
Use the netstat or ss command to list the established TCP connections from a specific device IP address.
For example:
netstat -topan | grep 'IP ADDRESS'
tcp6       0      0 10.10.10.20:514       IP ADDRESS:52358      ESTABLISHED 62358/java           off (0.00/0/0)
tcp6       0      0 10.10.10.20:514       IP ADDRESS:53351      ESTABLISHED 62358/java           off (0.00/0/0)
tcp6       0      0 10.10.10.20:514       IP ADDRESS:53455      ESTABLISHED 62358/java           off (0.00/0/0)
tcp6       0      0 10.10.10.20:514       IP ADDRESS:54882      ESTABLISHED 62358/java           off (0.00/0/0)
tcp6       0      0 10.10.10.20:514       IP ADDRESS:62333      ESTABLISHED 62358/java           off (0.00/0/0)
tcp6       0      0 10.10.10.20:514       IP ADDRESS:62534      ESTABLISHED 62358/java           off (0.00/0/0)
tcp6       0      0 10.10.10.20:514       IP ADDRESS:63137      ESTABLISHED 62358/java           off (0.00/0/0)
Or with the ss command:
ss -topan | grep 'IP ADDRESS'
ESTAB      0      0       [::ffff:10.10.10.20]:514                  [::ffff:IP ADDRESS]:52358               users:(("java",pid=62358,fd=605))
ESTAB      0      0       [::ffff:10.10.10.20]:514                  [::ffff:IP ADDRESS]:53351               users:(("java",pid=62358,fd=605))
ESTAB      0      0       [::ffff:10.10.10.20]:514                  [::ffff:IP ADDRESS]:53455               users:(("java",pid=62358,fd=605))
ESTAB      0      0       [::ffff:10.10.10.20]:514                  [::ffff:IP ADDRESS]:54882               users:(("java",pid=62358,fd=605))
ESTAB      0      0       [::ffff:10.10.10.20]:514                  [::ffff:IP ADDRESS]:57112               users:(("java",pid=62358,fd=605))
ESTAB      0      0       [::ffff:10.10.10.20]:514                  [::ffff:IP ADDRESS]:62333               users:(("java",pid=62358,fd=605))
ESTAB      0      0       [::ffff:10.10.10.20]:514                  [::ffff:IP ADDRESS]:63137               users:(("java",pid=62358,fd=605))
Typically, only one of the ESTABLISHED connections is active and carries traffic, and the others are silent. However, if the maximum of 10 connections has already been reached, all of them might be stale.
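To see at a glance how close each sender is to the per-host limit, the listing can be reduced to a count per remote address. The following is a minimal sketch under stated assumptions: the function name count_syslog_peers is hypothetical, and it expects the column layout that ss prints with the -tan options (state, queues, local address, peer address).

```shell
#!/bin/sh
# Hypothetical helper: read `ss -tan` output on stdin and print
# "<count> <peer address>" for established connections to local port 514,
# highest count first.
count_syslog_peers() {
    awk '
        $1 == "ESTAB" && $4 ~ /:514$/ {
            peer = $5
            sub(/:[0-9]+$/, "", peer)   # drop the remote source port
            sub(/^\[/, "", peer)        # drop the opening IPv6 bracket, if any
            sub(/\]$/, "", peer)        # drop the closing IPv6 bracket, if any
            sub(/^::ffff:/, "", peer)   # drop the IPv4-mapped prefix
            count[peer]++
        }
        END { for (p in count) print count[p], p }
    ' | sort -rn
}

# Usage on a collector:
#   ss -tan | count_syslog_peers
```

A peer whose count is at or near 10 is the one to examine for stale sessions.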
To find the active session (if there are currently fewer than 10 connections), you can use tcpdump.
The active session is most likely the last one established, which usually has the highest source port number.
To be certain, you can check the connections port by port. Example:
[root@QradarVM1 ~]# tcpdump -i any -nnA src port 65394
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type EN10MB (Ethernet), capture size 262144 bytes
21:02:40.065149 IP ADDRESS > 10.10.10.20.514: Flags [P.], seq 284908037:284909575, ack 1393589379, win 65472, length 1538
E..*[email protected].|.P.......<13>Nov 10 21:02:47 SERVERNAME AgentDevice=WindowsLog    AgentLogFile=Security    PluginVersion=WC.MSEVEN6.10.0.2.62    Source=Microsoft-Windows-Security-Auditing    Computer=SERVERNAME    OriginatingComputer=SERVERNAME    User=    Domain=    EventID=5379    EventIDCode=5379    EventType=8    EventCategory=13824    RecordNumber=2355065    TimeGenerated=1668114134    TimeWritten=1668114134    Level=LogAlways    Keywords=AuditSuccess    Task=SE_ADT_ACCOUNTMANAGEMENT_USERACCOUNT    Opcode=Info    Message=Credential Manager credentials were read.  Subject:  Security ID:  SERVERNAME\john  Account Name:  john  Account Domain:  SERVERNAME  Logon ID:  0xBEB8CF  Read Operation:  Read Credential  This event occurs when a user performs a read operation on stored credentials in Credential Manager.
If the events are easy to recognize, you can deduce which IP address or host is reconnecting and focus your investigation on that source.
If the device needs more than one connection to transmit syslog events, the vendor documentation might confirm it.
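Checking the connections port by port can be shortened by watching several source ports in one capture. The following is a minimal sketch, not part of QRadar: the function name syslog_port_filter is hypothetical, and the address and ports in the usage line are placeholders for the values seen in the netstat or ss listing.

```shell
#!/bin/sh
# Hypothetical helper: print a tcpdump filter that matches traffic from one
# device (first argument) on any of the given source ports (remaining
# arguments) towards TCP port 514.
syslog_port_filter() {
    if [ "$#" -lt 2 ]; then
        echo "usage: syslog_port_filter DEVICE_IP PORT [PORT...]" >&2
        return 1
    fi
    device=$1
    shift
    ports=""
    for p in "$@"; do
        if [ -n "$ports" ]; then
            ports="$ports or "
        fi
        ports="${ports}src port $p"
    done
    printf 'src host %s and tcp dst port 514 and (%s)\n' "$device" "$ports"
}

# Usage on a collector (address and ports are placeholders):
#   tcpdump -i any -nn "$(syslog_port_filter 192.0.2.10 52358 53351)"
```

With -nn, tcpdump prints the numeric source port of every captured packet, so the ports that still carry traffic show up in the output and the remaining ones can be treated as stale candidates.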

Resolving The Problem

Once you confirm that at most one of the connections is active, decide how to deal with the likely incorrect TCP session management.
However, if more than one device experiences the same problem, examine the collector side first, and consider restarting the ecs-ec-ingress service or even rebooting the appliance.

Support cases show that the most likely cause is one or more firewalls between the sending device and QRadar. If you have access to them, make sure that any firewall that times out idle TCP sessions is also configured to send an RST (reset) packet when it terminates a stale session. This will probably resolve the issue.
If you cannot fix the firewall configuration for some reason, one last option remains: switch the syslog log source to use UDP instead of TCP on port 514. UDP is a connectionless protocol, so by design it has no such issue.
Note: Increasing the Max TCP Syslog Connections Per Host value might temporarily resolve the issue, but it is not a sustainable resolution. Without a network investigation to find the root cause, this change only masks and postpones the underlying problem.

Document Location

Worldwide

Document Information

Modified date:
07 December 2022

UID

ibm16838625