IBM Support

QRadar: Windows Log Sources Not Processing

Troubleshooting


Problem

In IBM QRadar, a Windows log source might show a status of ERROR with one of the following messages:
  • "Too many open files"
  • "Connection error"
  • "File not found"
  • "Login failed"
In addition, the ecs-ec-ingress service can show a status of restarting, failed, or running with time stamps from hours or days ago.

Symptom

  • In some cases, the /var/log partition fills because the SMB errors are written to qradar.log and qradar.error faster than log rotation can cycle the files. In this case, instead of an out of memory error occurring, hostcontext shuts down services, by design, to protect the system.
  • The ecs-ec-ingress service produces an out of memory error. It might take time to process the dump file and reload the rpms on a service restart. As a result, events might not be processed for some time during the service restart.
    NOTE: To confirm that the ecs-ec-ingress service is hung because of SMB, look for smb in the out of memory errors:
    grep -i -e ingress -e OutOfMemoryError -e smb /var/log/qradar.error | less +G
    [ecs-ec-ingress.ecs-ec-ingress] [Folder Monitor [{HOSTNAME}][smb://X.X.X.X/{FOLDER_PATH}]] java.lang.OutOfMemoryError: Java heap space
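To get a count instead of paging through the output, the check above can be wrapped in a small shell function. This is a minimal sketch, not a QRadar command: the function name count_smb_oom is hypothetical, and it assumes that OutOfMemoryError and the smb:// path appear on the same log line, as in the sample entry above.

```shell
# Hypothetical helper (not part of QRadar): count log lines where an
# OutOfMemoryError is reported together with an smb:// path.
# Assumes both strings appear on the same line, as in the sample entry.
count_smb_oom() {
    # First argument: log file to scan; defaults to qradar.error.
    grep -i 'OutOfMemoryError' "${1:-/var/log/qradar.error}" | grep -ic 'smb://'
}
```

Running count_smb_oom with no argument scans /var/log/qradar.error. A nonzero result suggests that SMB Folder Monitor threads are involved in the out of memory condition.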

Cause

Too Many Open Files or Out of Memory errors result from poorly configured remote polling log sources.
When many Windows and SMB protocol log sources are in an error state, they might need to be disabled until the errors are resolved, to reduce the impact on the system. Remote polling protocols, such as SMB, need to be configured with a polling interval of 900 seconds, not the default of 10 seconds. The longer interval gives the system time to process retry attempts, so that during a connectivity issue it does not fork off many new threads while an existing thread is still processing retry attempts.
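As a rough illustration of why the interval matters, the arithmetic below compares how often a single log source is polled per hour at each setting. This is a sketch under a simplifying assumption: one poll attempt per interval, with retries ignored. The function name polls_per_hour is hypothetical.

```shell
# Hypothetical helper: polls per hour for one log source at a given
# polling interval in seconds (assumes one attempt per interval).
polls_per_hour() {
    echo $(( 3600 / $1 ))
}

polls_per_hour 10    # default interval: prints 360
polls_per_hour 900   # recommended interval: prints 4
```

Under that assumption, each failing log source left at the default makes 90 times as many connection attempts per hour as one set to 900 seconds.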

Diagnosing The Problem

Before the out of memory error occurs or the /var/log partition fills, several SMB log sources produce errors:
grep -i smb /var/log/qradar.log | less +G

[ecs-ec-ingress.ecs-ec-ingress] [Folder Monitor [{HOSTNAME}][smb://{FOLDER_PATH}]] java.lang.OutOfMemoryError: Java heap space
[ecs-ec-ingress.ecs-ec-ingress] [ReceiveThread] com.q1labs.frameworks.core.ThreadExceptionHandler: [INFO] [NOT:0000006000][X.X.X.X/- -] [-/- -]38,Finalizer thread in Native Code, WAITING, blocked-count: 162, blocked-time: N ms, wait-count: N, wait-time: N ms, user cpu: N nanos, sys/user cpu time: N nanos, Folder Monitor [{HOSTNAME}][smb://{FOLDER_PATH}] locked on java.util.concurrent.locks.ReentrantLock$NonfairSync@35f5122c
[ecs-ec-ingress.ecs-ec-ingress] [Folder Monitor [{HOSTNAME}][smb://{FOLDER_PATH}]] com.q1labs.semsources.sources.windowsiis.WindowsIisTailProvider: [ERROR] [NOT:0000003000][X.X.X.X/- -] [-/- -]TailingException: null
[ecs-ec-ingress.ecs-ec-ingress] [Folder Monitor [{HOSTNAME}][smb://{FOLDER_PATH}]] com.q1labs.semsources.sources.smbtail.foldermonitor.IFilesystemObject$FilesystemObjectException
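To see which log sources produce the most of these entries, the Folder Monitor identifier can be extracted from each line and tallied. The following is a minimal sketch; the function name smb_errors_by_source is hypothetical, and the pattern assumes the "Folder Monitor [...][smb://...]" layout shown in the sample lines above.

```shell
# Hypothetical helper (not part of QRadar): tally log lines per SMB
# Folder Monitor source, most frequent first.
# Assumes the "Folder Monitor [host][smb://path]" layout from the samples.
smb_errors_by_source() {
    # First argument: log file to scan; defaults to qradar.log.
    grep -o 'Folder Monitor \[[^]]*\]\[smb://[^]]*\]' "${1:-/var/log/qradar.log}" |
        awk '{ count[$0]++ } END { for (src in count) printf "%d\t%s\n", count[src], src }' |
        sort -rn
}
```

Each output line is a count, a tab, and the matching Folder Monitor identifier, so the noisiest log sources appear at the top.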

Resolving The Problem

  1. All of the Windows and SMB log sources might be in an error state. To confirm which ones are affected, compile a list of the erroneous log sources:
    grep -i smb /var/log/qradar.error | less +G
    
    [ecs-ec-ingress.ecs-ec-ingress] [Folder Monitor [{LOG_SOURCE_IDENTIFIER|IP|HOSTNAME}][smb://X.X.X.X/{FILE_PATH}]] Caused by: com.q1labs.semsources.sources.smbtail.io.jnq.JNQException: Unable to create/open - {FILE_PATH} (0xC0000043)
    [ecs-ec-ingress.ecs-ec-ingress] [Folder Monitor [{LOG_SOURCE_IDENTIFIER|IP|HOSTNAME}][smb://X.X.X.X/{FILE_PATH}]] com.q1labs.semsources.sources.smbtail.io.SmbFileWithRetries: [ERROR] [NOT:0000003000][X.X.X.X/- -] [-/- -][smb://X.X.X.X/LogFiles/W3SVC13] exists(): Failed: Access error for file W3SVC13 status = -1073741790 (0xc0000022) (0xC0000022)
    [ecs-ec-ingress.ecs-ec-ingress] [Folder Monitor [{LOG_SOURCE_IDENTIFIER|IP|HOSTNAME}][smb://X.X.X.X/{FILE_PATH}]] com.q1labs.semsources.sources.windowsdhcp.WindowsDHCPTailProvider: [ERROR] [NOT:0000003000][X.X.X.X/- -] [-/- -]TailingException: Unable to create/open - j50.log status = -1073741757 (0xc0000043) (0xC0000043)
    NOTE: The LOG_SOURCE_IDENTIFIER, IP, or HOSTNAME value identifies the log source that needs either a longer polling interval, to allow retry attempts to be processed, or to be disabled.
  2. Disable any SMB log sources in an error state that are not necessary.
  3. Increase the polling interval to 900 seconds (15 minutes) or higher for any log source that throws an error and needs to remain enabled.
  4. If ecs-ec-ingress is hung, restart ingress and hostcontext with the explicit commands so that other monitored services are not affected.
    WARNING: Restarting ecs-ec-ingress temporarily stops event collection. Administrators with strict outage policies are advised to complete the next step during a scheduled maintenance window for their organization.
    /opt/qradar/init/hostcontext -q restart 
    systemctl restart ecs-ec-ingress
    NOTE: The hostcontext command tells hostcontext to reload its configuration, not to restart itself. The command throws some errors stating that the command was decommissioned, but the service still reloads the new configuration, and the errors can safely be ignored.
  5. If the /var/log partition was full, resolve the log rotation issues.
  6. After about 2 minutes, hostcontext starts any services that it stopped.
  7. Any log sources that were disabled, or left in an error state, need to be investigated by your server administrator. Common causes of SMB log sources in an error state are that the folder or file no longer exists, permissions changed, the server was decommissioned, or the server is intermittently offline.
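When handing the list to a server administrator, it can help to translate the hexadecimal status codes from the step 1 output into their Windows NTSTATUS names. The sketch below is an illustration only. The function names are hypothetical, the name mapping comes from Microsoft's NTSTATUS reference rather than from QRadar, and only a few codes are covered; anything else is reported as unknown.

```shell
# Hypothetical helper: map an NTSTATUS code to its Microsoft-defined name.
# Only a few codes are listed; extend the case statement as needed.
nt_status_name() {
    # Normalize to lowercase so 0xC0000022 and 0xc0000022 both match.
    normalized=$(printf '%s' "$1" | tr 'A-FX' 'a-fx')
    case "$normalized" in
        0xc0000022) echo "STATUS_ACCESS_DENIED" ;;
        0xc0000034) echo "STATUS_OBJECT_NAME_NOT_FOUND" ;;
        0xc0000043) echo "STATUS_SHARING_VIOLATION" ;;
        0xc000006d) echo "STATUS_LOGON_FAILURE" ;;
        *) echo "unknown" ;;
    esac
}

# Hypothetical helper: list the distinct status codes found in a log file,
# each followed by a tab and its name. Assumes codes appear in parentheses
# as eight hex digits, for example (0xC0000043), as in the step 1 samples.
smb_status_codes() {
    grep -o '(0x[0-9A-Fa-f]\{8\})' "${1:-/var/log/qradar.error}" |
        tr -d '()' | tr 'A-FX' 'a-fx' | sort -u |
        while read -r code; do
            printf '%s\t%s\n' "$code" "$(nt_status_name "$code")"
        done
}
```

For the sample entries in step 1, 0xC0000022 maps to STATUS_ACCESS_DENIED, which points to a permissions change, and 0xC0000043 maps to STATUS_SHARING_VIOLATION, which points to a file that another process holds open.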

Document Location

Worldwide

[{"Type":"MASTER","Line of Business":{"code":"LOB24","label":"Security Software"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSBQAC","label":"IBM Security QRadar SIEM"},"ARM Category":[{"code":"a8m0z000000cwt0AAA","label":"Log Source"}],"ARM Case Number":"TS004608033","Platform":[{"code":"PF016","label":"Linux"}],"Version":"7.4.3;7.5.0"}]

Document Information

Modified date:
03 May 2023

UID

ibm16589593