A red bar with the "An IO Error occurred on server(s). Please try again." message is displayed while running searches.
When running historical searches, a red bar is displayed on the results page. It displays the message: "An IO Error occurred on server(s). Please try again." The hostname or IP address that is displayed in the message is likely that of the console appliance. Applying filters to the search by using the "Event Processor" parameter might eliminate the error even when including the console appliance.
The "An IO Error occurred on server(s). Please try again." message while running searches indicates that the Ariel database is not accessible on one or more managed hosts. The hostname or IP address that is displayed in the error message does not always match the host or hosts that are experiencing the problem, because several different causes can produce this error message.
Note: Whether a hostname or an IP address is included in the message can depend on your name resolution configuration. Both possibilities are included in this document.
Diagnosing The Problem
When you are running a historical search, the console proxies your search requests to other managed hosts involved depending on your specified filters. An IO Error indicates that one or more of your managed hosts are not responding to these search requests. Identifying the correct host or hosts that are experiencing the issue is the first step in effectively troubleshooting this problem. You can use a combination of the methods below to help you identify the managed host that is experiencing the issue. If you are aware of managed hosts with issues, you can skip to the Checking the Security Data Distribution tab section for the verification step only.
By clicking the More Details link in the results section, you can get a better picture of how your managed hosts are responding to your search.
In this example, the host test-ep ran its search on no data files and the search duration was also zero.
Based on this result, test-ep is the host that is experiencing the problem. In a more realistic example, it can be necessary to verify your findings.
Reviewing the QRadar logs
If your managed hosts are unencrypted, or your environment has a mix of encrypted and unencrypted managed hosts, QRadar logs can help identify the managed host that is having problems. Run a search with the following filters to find the actual error message:
Event Processor: your console
Time window: A time range that includes the last time that you received the IO Error
If the managed host experiencing the Ariel database issue is not encrypted, the raw event includes its hostname. Drill into the event and find the raw text:
Sep 14 14:56:23 127.0.0.1 [aqw_remote_2:4bac31ad-5cb4-47c8-8f0d-280d3bcb3d10] com.q1labs.ariel.searches.tasks.ServiceTaskBase: [ERROR] [NOT:0000003000][198.51.100.2/- -] [-/- -]Can't communicate to server [test-ep:32006] executing query:Id:4bac31ad-5cb4-47c8-8f0d-280d3bcb3d10, DB:<events@/store/ariel/events/records, /store/ariel/events/payloads>, Time:<16-09-14,14:55:23 to 16-09-14,14:56:00>, Criteria=<DeviceType:[368,368]>, MappingFactory=com.q1labs.core.types.event.mapping.NormalizedEventMappingFactory@4ee, processedRecordsLimit=2147483647, executionTimeLimit=9223372036854775807, collectedRecordsLimit=2147483647, prio=NORMAL
If the issue is on an encrypted event processor, the raw event contains localhost as the hostname:
Sep 14 14:23:23 127.0.0.1 [aqw_remote_2:dd380d0d-ad31-4497-a9d3-81224cbd4b6b] com.q1labs.ariel.searches.tasks.ServiceTaskBase: [ERROR] [NOT:0000003000][198.51.100.2/- -] [-/- -]Can't communicate to server [localhost:32006] executing query:Id:dd380d0d-ad31-4497-a9d3-81224cbd4b6b, DB:<events@/store/ariel/events/records, /store/ariel/events/payloads>, Time:<16-09-14,14:22:23 to 16-09-14,14:23:00>, Criteria=<DeviceType:[368,368]>, MappingFactory=com.q1labs.core.types.event.mapping.NormalizedEventMappingFactory@4ee, processedRecordsLimit=2147483647, executionTimeLimit=9223372036854775807, collectedRecordsLimit=2147483647, prio=NORMAL
Note: Regardless of the encryption setting of your managed host, you should make a note of hostname information from these raw events, as it is useful when verifying connectivity as described in the Resolving the Problem section.
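As a rough illustration of noting the hostname information from these raw events, the following Python sketch extracts the server and port from the "Can't communicate to server" fragment. The names parse_unreachable_server and UnreachableServer are hypothetical and not part of QRadar. The pattern is based only on the two sample events shown above, and the encrypted flag simply applies the rule from this section that an encrypted host is reported as localhost.

```python
import re
from typing import NamedTuple, Optional

# Matches the "Can't communicate to server [host:port]" fragment of a raw event.
# Whitespace (including a line break) is tolerated before the closing bracket.
_SERVER_RE = re.compile(r"Can't communicate to server \[([^\]:\s]+):(\d+)\s*\]")


class UnreachableServer(NamedTuple):
    host: str
    port: int
    encrypted: bool  # True when the event reports "localhost" (assumption from this article)


def parse_unreachable_server(raw_event: str) -> Optional[UnreachableServer]:
    """Return the host and port named in the raw event, or None if absent."""
    match = _SERVER_RE.search(raw_event)
    if match is None:
        return None
    host, port = match.group(1), int(match.group(2))
    return UnreachableServer(host, port, host == "localhost")
```

Passing the first sample event to parse_unreachable_server would yield the host test-ep and port 32006, which are the values to use in the connectivity checks in the Resolving the Problem section.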
Eliminating Event Processors
The IO Error is displayed only when you are searching on the managed host experiencing the issue. In our example, setting a filter to show events only from the console eliminates the IO error.
Therefore, you can set filters on your search to help you identify which managed hosts are experiencing the issue. Try filtering on the managed hosts that you previously identified when checking the search details. If the Ariel database of the managed host or hosts that you are filtering on is not accessible, you receive the same IO error.
Checking the Security Data Distribution tab
When you have an idea about which managed host or hosts are experiencing an issue, you can verify your conclusion by checking the Security Data Distribution tab of the System Information window. Open this tab by clicking Admin > System Configuration > System and License Management > Systems. When the System and License Management window opens, select the suspected managed host, and click Actions > View and Manage System. The System and License Details window opens with the Security Data Distribution tab selected by default. If the Ariel database is not accessible, a warning is displayed on this tab.
Resolving The Problem
Note: For a video guide on how to troubleshoot IO errors, follow this article: How to troubleshoot IO errors when searching on QRadar
After you identify which Event Processor is experiencing the issue, you need to restore access to its Ariel database. This is not always trivial. Below are some basic resolution steps that can help address the most common causes before contacting IBM support for further assistance.
Warning: Deploying the full configuration, which is recommended in some of the steps below, restarts the services on all of your managed hosts and results in a brief service interruption. Take this interruption into consideration before you deploy the full configuration. Perform a Full Deploy by going to the Admin tab on the UI and clicking Advanced > Deploy Full Configuration.
- Open an SSH connection to your Console with the root account.
- Create an SSH connection to the managed host that you identified in the Diagnosing the Problem section.
[root@test-console ~]# ssh test-ep
- Verify that the Ariel Query Server is running on this managed host:
[root@qradarep750 ~]# systemctl status ariel_query_server
● ariel_query_server.service - Ariel Query Server
   Loaded: loaded (/usr/lib/systemd/system/ariel_query_server.service; static; vendor preset: disabled)
  Drop-In: /etc/systemd/system/ariel_query_server.service.d
           └─ulimit.conf
   Active: active (running) since Thu 2022-09-15 08:50:36 EDT; 3h 19min ago
  Process: 24057 ExecStartPre=/opt/qradar/systemd/bin/generate_environment.sh ariel_query_server ariel (code=exited, status=0/SUCCESS)
  Process: 17885 ExecStartPre=/opt/qradar/systemd/bin/console_check.sh -r (code=exited, status=0/SUCCESS)
 Main PID: 29483 (java)
- If the Ariel Query Server is running, verify that it is listening on the port that is identified in the Diagnosing the Problem section:
[root@test-ep ~]# netstat -nalp | grep 32006
tcp        0      0 :::32006        :::*        LISTEN      13732/ariel
- If your Ariel Query Server is listening on the specified port, verify the connectivity from the console to the managed host on that specific port. For unencrypted hosts, use the hostname or IP address of the managed host. For encrypted hosts, use localhost.
Example for unencrypted hosts:
[root@test-console ~]# telnet test-ep 32006
Example for encrypted hosts:
[root@test-console ~]# telnet localhost 32006
If you do not receive a message indicating a successful connection, the most likely reason is a firewall blocking the traffic for the Ariel port.
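The verification steps above can be sketched in Python, for cases where you want to script the checks rather than read the command output by eye. This is a minimal sketch, not an IBM-provided tool: the function names are hypothetical, the two parsers assume output shaped like the systemctl and netstat samples in this article, and can_connect uses a plain TCP connection attempt as a stand-in for the telnet test.

```python
import re
import socket


def service_is_running(status_output: str) -> bool:
    """Return True if `systemctl status` output reports 'active (running)'."""
    return re.search(r"\bActive:\s+active \(running\)", status_output) is not None


def port_is_listening(netstat_output: str, port: int) -> bool:
    """Return True if `netstat -nalp` output shows a TCP listener on the port."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local address, foreign address, state, pid/program
        if (
            len(fields) >= 6
            and fields[0].startswith("tcp")
            and fields[3].endswith(f":{port}")
            and fields[5] == "LISTEN"
        ):
            return True
    return False


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

From the console, can_connect("test-ep", 32006) would correspond to the unencrypted example and can_connect("localhost", 32006) to the encrypted one. A False result is consistent with, but does not prove, a firewall blocking the Ariel port.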
15 September 2022