IBM Support

QRadar: How to troubleshoot peak Events Per Second

Troubleshooting


Problem

Event rate is one of the most important metrics to measure in QRadar. Events Per Second (EPS) is where the license is applied, and the first place where raw events are processed is the ecs-ec-ingress service. Understanding the EPS rate is essential to determine whether a system is licensed properly and whether it is receiving an appropriate rate for the hardware capacity.

Resolving The Problem

When administrators use the graphs in QRadar, the calculations for Events Per Second (EPS) might not be accurate. For example, the Quick Search Event Rate (EPS) does not always provide accurate results. For more accurate EPS calculations, use an AQL search or a command-line method.

Before you begin
To use the advanced search (AQL) queries mentioned in this technical note:
 
  • Health metrics custom properties require QRadar 7.3.3 Fix Pack 6 or 7.4.1 and later. If you use older versions of QRadar, you can query for the peak EPS rate from the command line.
  • All Custom Event Properties for the Health Metrics log source must be enabled to display the incoming event rate from the ecs-ec-ingress service.
    image 8518
  • The command-line options require root access to the QRadar Console appliance. 

Determining peak EPS by using the Advanced Log Activity search

  1. Log in to the QRadar Console.
  2. Click the Log Activity tab.
  3. Click Advanced Search.
  4. Copy and paste the AQL statement into the search box.
    SELECT "Hostname" AS 'Hostname (custom)', MAX("Value") AS 'Value (custom) (Maximum)', 
    COUNT(*) AS 'Count' from events where ( "Metric ID"='EventRate' AND "deviceType"='368' ) 
    GROUP BY "Hostname" order by "Count" desc
  5. Click Search.

    Results
    The AQL query pulls peak EPS information from the Console, Event Collectors, and Event Processors. In the example, the graph displays the peak EPS for a Console and an Event Processor. Administrators can use the graph to see whether hosts exceed the licensed EPS rate and to investigate the high EPS.
    image 8630

Determining average EPS by using the Advanced Log Activity search

  1. Log in to the QRadar Console.
  2. Click the Log Activity tab.
  3. Click Advanced Search.
  4. Copy and paste the AQL statement into the search box.
    SELECT "Hostname" AS 'Hostname (custom)', AVG("Value") AS 'Value (custom) (Average)', COUNT(*) AS
     'Count' from events where ( "Metric ID"='EventRate' AND "deviceType"='368' )
     GROUP BY "Hostname" order by "Count" desc
  5. Click Search.


    Results
    The AQL query pulls average EPS information from the Console, Event Collectors, and Event Processors. In the example, the graph displays the average EPS for a Console and an Event Processor. Administrators can use the graph to see when hosts exceed the licensed EPS rate and to investigate EPS spikes.
    image 8620
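
The peak and average statements above differ only in the aggregate function and its column label. As a minimal sketch, the following Python builds either statement and appends a time window. The helper name build_eps_query is hypothetical, and appending the window as an AQL LAST clause is an assumption about how you want to bound the search; adjust it to match your own searches.

```python
# Sketch only: build the health-metrics EPS statement from this note.
# build_eps_query is a hypothetical helper, not part of QRadar.

_TEMPLATE = " ".join([
    """SELECT "Hostname" AS 'Hostname (custom)',""",
    """{agg}("Value") AS 'Value (custom) ({label})',""",
    """COUNT(*) AS 'Count' FROM events""",
    """WHERE ( "Metric ID"='EventRate' AND "deviceType"='368' )""",
    """GROUP BY "Hostname" ORDER BY "Count" DESC""",
    # Assumption: the time window is expressed with a trailing LAST clause.
    """LAST {hours} HOURS""",
])

_LABELS = {"MAX": "Maximum", "AVG": "Average"}


def build_eps_query(aggregate="MAX", last_hours=24):
    """Return the AQL text for peak (MAX) or average (AVG) EPS per host."""
    agg = aggregate.upper()
    if agg not in _LABELS:
        raise ValueError("aggregate must be MAX or AVG")
    if last_hours < 1:
        raise ValueError("last_hours must be at least 1")
    return _TEMPLATE.format(agg=agg, label=_LABELS[agg], hours=last_hours)


if __name__ == "__main__":
    print(build_eps_query("MAX", 24))
    print(build_eps_query("AVG", 1))
```

Paste the printed statement into the Advanced Search box in the same way as the statements above.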

Investigating Peak EPS spikes by using Filtered searches

After the AQL searches identify which Event Processor has a high load, use these two filtered searches to determine:
  • Which Log Source Type is causing the spike.
  • Within that Log Source Type, which Log Sources are causing the highest EPS spikes.
  • Which Log Source Type is causing an EPS spike
    1. Log in to the QRadar Console.
    2. Click the Log Activity tab.
    3. Click Add Filter.
    4. Use Parameter: Event Processor, Operator: Equals, Value: the Event Processor identified in the AQL search.
      image 9343
    5. Click Add Filter.
    6. Click Search > Edit Search.
    7. Enter Time Range.
    8. Scroll down to Column Definition.
    9. In the text box, enter Log Source Type and add it to Group By.
      image 9350
    10. Add a Name for Column Layout.
    11. Click Save Column Layout.
    12. Click Search.

    Results
    A search by Log Source Type is created to further investigate EPS spikes.

    image 9425

  • Searching the Log Sources generating the highest EPS Spikes
    1. Use the search results from the previous search for Log Source Type.
    2. For the Log Source Type generating the highest event count, click Log Source (Unique Count).
      image 9426
    3. Another graph is displayed with a list of the top Log Sources generating high EPS spikes or counts.
    4. Click Event Name (Unique Count) to further investigate what is causing the EPS spike.
      image 9427
    5. A new graph is displayed with events that are causing high EPS spikes or counts.
      image 9442

Results
These searches can be used for any Event Processor or Log Source Type to help with investigations of high EPS spikes or counts.
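
The drill-down in the two filtered searches follows a simple pattern: group by Log Source Type, pick the busiest type, group its events by Log Source, then group the busiest source by Event Name. As a rough illustration of that logic only, the following Python applies the same grouping to a small in-memory list. The sample tuples and the drill_down function are hypothetical and do not come from QRadar.

```python
from collections import Counter


def drill_down(events):
    """events: iterable of (log_source_type, log_source, event_name) tuples.

    Returns the busiest type, the busiest source within that type, and the
    event-name counts for that source, highest count first.
    """
    events = list(events)
    if not events:
        raise ValueError("no events to analyze")

    # Step 1: which Log Source Type has the highest event count?
    by_type = Counter(e[0] for e in events)
    top_type = by_type.most_common(1)[0][0]

    # Step 2: within that type, which Log Source is the busiest?
    by_source = Counter(e[1] for e in events if e[0] == top_type)
    top_source = by_source.most_common(1)[0][0]

    # Step 3: which event names does that Log Source generate most?
    by_event = Counter(
        e[2] for e in events if e[0] == top_type and e[1] == top_source
    )
    return top_type, top_source, by_event.most_common()


if __name__ == "__main__":
    # Hypothetical sample data for illustration.
    sample = (
        [("TypeA", "src-1", "Login failed")] * 3
        + [("TypeA", "src-1", "Session opened")]
        + [("TypeA", "src-2", "Login failed")]
        + [("TypeB", "src-3", "Connection denied")] * 2
    )
    print(drill_down(sample))
```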
 

Determining peak EPS by using the command line

The Source Monitor collects the metrics from MBeans and writes the statistics to /var/log/qradar.log.
To display the metrics from the logs, use the following procedure. The important values to watch in the logs are:
  • Peak in the last 60s: The peak event rate in the last minute, used for tracking over-license issues and spikes that cause performance issues.
  • Max Seen: The maximum event rate seen since the last restart of ecs-ec-ingress.
  • Appliance Threshold: When the event rate exceeds the appliance threshold, the system starts to use the license queue.
  1. Use SSH to log in to the Console as root user.
    Note: QRadar on Cloud (QRoC) users can SSH to the Data Gateway appliance to view metrics for the ecs-ec-ingress service.
  2. From the Console, SSH to the appliance where you want to view your EPS.
  3. Type the command:
    1. For QRadar versions 7.4.2 or later, run the command:
      grep -i 'ecs-ec-ingress\].*SourceMonitor.*event' /var/log/qradar.log | sed -n 's/^\(.\{15\} \).*\((60s: [0-9\.]\{1,\} eps)\).*\(Peak.*60s: [0-9\.]\{1,\} eps\).*\(Appliance Threshold.*$\)$/\1 \2 \3 \4 /p' 
      
      Jan 25 14:01:29  (60s: 117.73 eps) Peak in the last 60s: 128.40 eps Appliance Threshold: 5020.00
      Jan 25 14:02:29  (60s: 117.57 eps) Peak in the last 60s: 127.20 eps Appliance Threshold: 5020.00
      Jan 25 14:03:29  (60s: 117.47 eps) Peak in the last 60s: 128.20 eps Appliance Threshold: 5020.00
      Jan 25 14:04:29  (60s: 117.52 eps) Peak in the last 60s: 127.40 eps Appliance Threshold: 5020.00
      Jan 25 14:05:34  (60s: 117.53 eps) Peak in the last 60s: 127.20 eps Appliance Threshold: 5020.00
      
    2. For QRadar 7.3.3 through 7.4.1 run the command:
       grep -i 'ecs-ec-ingress\].*SourceMonitor.*event' /var/log/qradar.log | sed -n 's/^\(.\{15\} \).*\((60s: [0-9\.]\{1,\} eps)\).*\(Peak.*60s: [0-9\.]\{1,\} eps\).*\(License Threshold.*$\)$/\1 \2 \3 \4 /p'
      
      Jan 22 14:55:38  (60s: 131.32 eps) Peak in the last 60s: 140.80 eps License Threshold: 5020.00
      Jan 22 14:56:38  (60s: 131.37 eps) Peak in the last 60s: 139.40 eps License Threshold: 5020.00
      Jan 22 14:57:38  (60s: 131.15 eps) Peak in the last 60s: 140.00 eps License Threshold: 5020.00
      Jan 22 14:58:38  (60s: 131.13 eps) Peak in the last 60s: 139.80 eps License Threshold: 5020.00
      Jan 22 14:59:38  (60s: 131.30 eps) Peak in the last 60s: 140.60 eps License Threshold: 5020.00
      Jan 22 15:00:38  (60s: 131.30 eps) Peak in the last 60s: 140.40 eps License Threshold: 5020.00
      Jan 22 15:01:38  (60s: 131.58 eps) Peak in the last 60s: 142.00 eps License Threshold: 5020.00
      Jan 22 15:02:43  (60s: 131.15 eps) Peak in the last 60s: 139.80 eps License Threshold: 5020.00
      
Results
Review the peak event rate in the last 60 seconds and the maximum event rate seen since ecs-ec-ingress was last restarted.
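
The filtered lines printed by the commands above have a fixed shape, so they are easy to post-process. As a minimal sketch, assuming the command output was saved as a list of strings, the following Python parses each line and reports the highest 60-second peak and any samples where the peak exceeded the threshold. The names parse_line and summarize are hypothetical and not part of QRadar.

```python
import re

# Matches the filtered output format shown above, for example:
# Jan 25 14:01:29  (60s: 117.73 eps) Peak in the last 60s: 128.40 eps Appliance Threshold: 5020.00
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2})\s+"
    r"\(60s: (?P<avg>[0-9.]+) eps\) "
    r"Peak in the last 60s: (?P<peak>[0-9.]+) eps "
    r"(?:Appliance|License) Threshold: (?P<threshold>[0-9.]+)"
)


def parse_line(line):
    """Return a dict for one filtered log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "ts": m.group("ts"),
        "avg": float(m.group("avg")),
        "peak": float(m.group("peak")),
        "threshold": float(m.group("threshold")),
    }


def summarize(lines):
    """Report the highest peak and the timestamps where peak > threshold."""
    rows = [r for r in (parse_line(line) for line in lines) if r is not None]
    if not rows:
        return None
    worst = max(rows, key=lambda r: r["peak"])
    return {
        "samples": len(rows),
        "max_peak": worst["peak"],
        "max_peak_ts": worst["ts"],
        "over_threshold": [r["ts"] for r in rows if r["peak"] > r["threshold"]],
    }


if __name__ == "__main__":
    sample = [
        "Jan 25 14:01:29  (60s: 117.73 eps) Peak in the last 60s: 128.40 eps Appliance Threshold: 5020.00",
        "Jan 25 14:02:29  (60s: 117.57 eps) Peak in the last 60s: 127.20 eps Appliance Threshold: 5020.00",
    ]
    print(summarize(sample))
```

The pattern accepts both the Appliance Threshold and License Threshold labels, so the same sketch works with the output of either command.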

Determining maximum EPS by using an MBean query

A hardware restriction is applied on 1501 and 1599 appliances and on QRadar on Cloud Data Gateways to prevent the incoming EPS rate from exceeding the capability of the hardware.

  1. Use SSH to log in to the Console.
  2. From the Console, SSH to the appliance with high EPS.
  3. Copy the query to the command line:
    /opt/qradar/support/jmx.sh -p 7787 -b 'com.q1labs.sem:application=ecs-ec-ingress.ecs-ec-ingress,type=sources,name=Source Monitor'
    com.q1labs.sem:application=ecs-ec-ingress.ecs-ec-ingress,type=sources,name=Source Monitor
    -----------------------------------------------------------------------------------------
    LongWindowLengthInSecs: 900
    EventImmediateWindowAverage: 118.18450125218368
    FlowRate: 0.0
    FlowImmediateWindowAverage: 0.0
    FlowLongWindowAverage: 0.0
    ImmediateWindowLengthInSecs: 300
    MaximumFlowRateSinceStartup: 0.0
    EPSThreshold: 5020.0
    EventLongWindowAverage: 118.18450125218368
    FPSThreshold: 0.0
    EventRate: 145.2
    MaximumEventRateSinceStartup: 254.4
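
To compare the reported maximum with the threshold, the key and value pairs in the output above can be read into a dictionary. The following Python is a minimal sketch that assumes the jmx.sh output was captured as text. The function names parse_mbean_output and threshold_usage are hypothetical, and treating EPSThreshold as the ceiling for MaximumEventRateSinceStartup is an interpretation of the fields shown above.

```python
def parse_mbean_output(text):
    """Collect 'Name: number' lines into a dict; ignore headers and rulers."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # bean name line, dashed ruler, or blank line
        try:
            metrics[key] = float(value)
        except ValueError:
            continue  # non-numeric value, skip it
    return metrics


def threshold_usage(metrics):
    """Return (headroom in EPS, percent of threshold used by the maximum)."""
    threshold = metrics["EPSThreshold"]
    peak = metrics["MaximumEventRateSinceStartup"]
    if threshold <= 0:
        raise ValueError("EPSThreshold must be positive")
    return threshold - peak, 100.0 * peak / threshold


if __name__ == "__main__":
    # Values copied from the example output above.
    sample = """\
com.q1labs.sem:application=ecs-ec-ingress.ecs-ec-ingress,type=sources,name=Source Monitor
-----------------------------------------------------------------------------------------
EPSThreshold: 5020.0
EventRate: 145.2
MaximumEventRateSinceStartup: 254.4
"""
    headroom, used = threshold_usage(parse_mbean_output(sample))
    print(f"headroom: {headroom:.1f} eps, used: {used:.1f}%")
```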
    
    
    

 

Creating custom EPS graphs in the Pulse application

Before you begin

  1. Click the Pulse Dashboard tab.
  2. Click the Dashboard Configuration icon.
    image-20221021151351-1
  3. Click Create new widget.
    image-20221021151546-2
  4. Type a Name and Description.
  5. Under Data Source, select AQL.
  6. Set the Refresh Time to your preference.
    Note: The default value for Refresh Time is 60 seconds.
  7. Create your graph with one of the following AQL statements:
    • AQL statement 1: Average EPS
      SELECT "Hostname" AS 'Hostname (custom)', AVG("Value") AS 'Value (custom) (Average)', COUNT(*) AS 'Count' from events where ( "Metric ID"='EventRate' AND "deviceType"='368' )  GROUP BY "Hostname" order by "Count" desc

      image-20221021153608-1

    • AQL statement 2: Peak EPS
      SELECT "Hostname" AS 'Hostname (custom)', MAX("Value") AS 'Value (custom) (Maximum)', COUNT(*) AS 'Count' from events where ( "Metric ID"='EventRate' AND "deviceType"='368' ) GROUP BY "Hostname" order by "Count" desc
      image-20221021153250-5
  8. Enter a Results Limit.
    Note: The default result limit is 1000.
  9. Click Run Query.
    image-20221021154806-2
  10. Under Views, create a View Name.
    image-20221021161358-3
  11. Under Chart Type, select the Time Series Chart.
  12. Under Time (x-axis), select Value.
  13. Under Values (y-axis), select Hostname.
    image-20221021161516-4
  14. Optional: Enable Area Chart.
    Note: The default is Off.
  15. Optional: Enable Show legend.
    Note: The default is Yes.
  16. Optional: Select Legend Orientation.
  17. Click Save.
  18. Confirm the graph data is correct.
  19. Repeat the procedure to create a graph with AQL statement 2.

    Results
    A Dashboard Widget is created that you can add to your Pulse Dashboard.
    image-20221021162145-6

Document Location

Worldwide

Document Information

Modified date:
26 October 2022

UID

ibm16406002