IBM Support

QRadar: High Availability FAQ

Question & Answer


Question

How do I work with QRadar High Availability (HA) and are there common processes I need to be aware of?

Answer

Administrators are encouraged to read the QRadar High Availability guide and other available documentation to familiarize themselves with these deployments. The information in this article is supplemental to the QRadar High Availability guide and provides common questions and answers about High Availability (HA) in QRadar.

Upon installation, are the primary and secondary hosts configured by using the same appliance type?

No. The primary is always installed as a standard stand-alone appliance type (for example, 31xx, 18xx, 17xx, or 15xx). The secondary is always installed as appliance type 500. This rule applies to all hosts (including the console) that support HA.
For example, two 31xx hosts cannot be paired.

Do both hosts in an HA cluster need to be installed at the same time?

No. A console or managed host can run as a stand-alone host for any amount of time before the creation of an HA cluster.
 

What is the VIP?

The VIP is the “virtual” IP address controlled by whichever host in the HA pair is active. The VIP is brought up as an alias of the active host’s management interface, and therefore uses the same physical interface and MAC address.
Here is an example that uses the "ifconfig" command from the command line on the active host.

Figure 1: "ifconfig" output on the active host
In the previous output, the management interface is ens192, and the VIP is ens192:0 with IP address 192.168.0.2.
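A minimal sketch of reading the VIP out of "ifconfig"-style output. The sample text below is hypothetical (the primary's new management address 192.168.0.3 is an assumption), but the interface name and VIP address follow the example in this article:

```shell
# Hypothetical "ifconfig" output from the active host; ens192 is the
# management interface and ens192:0 is the VIP alias.
sample='ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.3  netmask 255.255.255.0  broadcast 192.168.0.255
ens192:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.2  netmask 255.255.255.0  broadcast 192.168.0.255'

# The VIP address is the "inet" entry on the line after the ":0" alias.
vip=$(printf '%s\n' "$sample" | awk '/^ens192:0/{getline; print $2}')
echo "VIP: $vip"
```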

How is the VIP created?

In an HA pair, the IP address originally assigned to the primary host’s management interface at installation becomes the VIP when HA is configured.
A new IP address is then needed for the primary’s management interface, because the VIP is shared by both the primary and secondary hosts. The VIP is up and reachable only on the host in the HA pair that is active.
This design preserves connectivity to the newly created HA cluster for anything (for example, log sources) that previously connected to the host while it ran as a stand-alone device.
Figure: VIP assignment when an HA cluster is created

Upon installation, do I need to configure each hostname to specify which is the primary and which is the secondary?

No. The initial HA setup automatically appends the HA designation to the hostname of each server in the cluster, with the secondary inheriting the short name of the primary.
For example, the following are the fully qualified domain names (FQDN) for the primary and secondary hosts before HA is configured:
 
host.domain.com
host2.domain.com
After HA is configured:
 
host-primary.domain.com
host-secondary.domain.com
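The renaming scheme above can be sketched as follows (the short name and domain are the hypothetical values from the example):

```shell
# Hypothetical pre-HA short name and domain from the example above.
short=host
domain=domain.com

# The HA setup appends the role designation to the primary's short name;
# the secondary inherits that same short name.
primary_fqdn="${short}-primary.${domain}"
secondary_fqdn="${short}-secondary.${domain}"
echo "$primary_fqdn"
echo "$secondary_fqdn"
```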

My HA cluster is already configured and includes “primary” or “secondary” in the hostname. Is this an issue?

Yes. Hosts named this way might encounter issues in QRadar when certain processes or scripts misinterpret the secondary as the primary. For example, if the primary is installed with the hostname “host-primary.domain.com”, after HA is configured the hostnames for the pair look like this:
host-primary-primary.domain.com
host-primary-secondary.domain.com
Administrators with HA clusters that have hostnames similar to the previous example can use the following technotes as guidance to resolve the issue:
  1. Follow the "Recommended practices for hostname creation" technote guidelines.
  2. Follow the "Changing the network settings of a QRadar High Availability Cluster" technote for a step-by-step procedure.

Can a crossover cable be connected and configured while sync is in progress between the HA pair?

Yes. For physical appliances, a crossover connection can be configured at any time during or after HA setup. If an HA cluster is set up and the secondary is still running its initial sync with the primary, the crossover configuration can be completed during this time.
Note: For systems running QRadar 7.4.3 and later, configuring a crossover in the HA wizard can cause the cluster to reboot. Administrators are advised to schedule a maintenance window to enable the crossover until this issue is fixed in a newer version.

Can the sync speed be adjusted while sync is in progress?

Yes. The sync speed is managed by the DRBD service, and this setting is administered through the QRadar HA wizard. To access this menu, follow these steps:
  1. On the navigation menu ( Navigation menu icon ), click Admin.
  2. Click System and License Management.
  3. Select the host for which you want to configure HA.
    1. When the HA cluster is created for the first time: From the Actions menu, select Add HA Host and click OK.
    2. When the crossover is enabled on an existing HA Cluster: From the High Availability menu, select Edit HA Host and click OK.
  4. Read the introductory text.
  5. Click Next.
  6. Adjust the value in the “Disk Synchronization Rate (MB/s)” field.
    Note: It is a common assumption that this value can be set to 1100 MB/s when 10Gbps fiber interfaces are used for the crossover; however, that assumption is wrong. An excessive sync speed setting might cause the sync to halt. It is important to stay within the capabilities of the hardware when this setting is adjusted. Administrators can use the following values as a reference:
    1. If the cluster uses 1GE interfaces, use 100 MB/s.
    2. If the cluster uses 10GE interfaces, use 300 – 500 MB/s.
  7. Click Finish to exit.
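The reference values above can be expressed as a small helper; note that the threshold logic below is an assumption for illustration, not a QRadar tool:

```shell
# Hypothetical crossover link speed in Gbps.
link_gbps=10

# Map the link speed to the recommended Disk Synchronization Rate
# from the guidance above.
if [ "$link_gbps" -ge 10 ]; then
  rate="300-500 MB/s"   # 10GE interfaces
else
  rate="100 MB/s"       # 1GE interfaces
fi
echo "recommended sync rate: $rate"
```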

I have 10Gbps fiber interfaces and set the sync rate to 1100 MB/s. Why isn’t sync speed hitting 1100 MB/s?

Although some interfaces are theoretically capable of 1100 MB/s, the DRBD sync process is limited by the capabilities of the system, disk I/O, and the DRBD process itself. Typically, the highest speeds with this setup are roughly 300 – 500 MB/s.

How do I check the transfer rate while sync is in progress?

The sync speed is managed by the DRBD service. From the command line of either of the HA hosts in the cluster, this command displays DRBD sync status:
cat /proc/drbd
Sample output:
Figure 2: Sample "cat /proc/drbd" output
The previous output shows the current and average speed in KB/sec; there is also an estimated time to completion in the “finish:” field.
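As a sketch, the two key numbers can be pulled out of "cat /proc/drbd" output like this. The sample text and its values below are hypothetical, loosely following the DRBD 8.4 status layout:

```shell
# Sample /proc/drbd status (hypothetical values) captured during a sync.
sample='version: 8.4.11 (api:1/proto:86-101)
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:1045876 nr:0 dw:0 dr:1046792 al:0 bm:0 lo:0 pe:4 ua:0 ap:0 oos:6509164
        [==>.................] sync'\''ed: 13.9% (6356/7378)M
        finish: 0:00:59 speed: 108,224 (108,224) K/sec'

# Current transfer rate in KB/sec (thousands separators stripped).
speed=$(printf '%s\n' "$sample" | sed -n 's/.*speed: \([0-9,]*\).*/\1/p' | tr -d ',')
# Remaining out-of-sync data in KB.
oos=$(printf '%s\n' "$sample" | grep -o 'oos:[0-9]*' | cut -d: -f2)
echo "speed_kb_s=$speed oos_kb=$oos"
```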

I ran a deploy in QRadar while sync was in progress and the percentage complete is back to 0%. Is sync starting all over again?

No. A configuration deploy in QRadar resumes the sync process from where it left off, with the completion counter reset to 0%. In this scenario, DRBD does not count the disk blocks already synced; it counts only the blocks that remain to be synced. DRBD picks up where it left off, and any data blocks already copied are still valid.
The "oos:" field in the "cat /proc/drbd" output is key. It notes the remaining data in KB to be synced. Compare the post-deploy "oos:" value with the value seen initially when sync first started to see the difference.
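That comparison can be sketched as follows; both "oos:" values below are hypothetical:

```shell
# Hypothetical "oos:" values (KB) recorded when sync first started
# and again just after the deploy reset the displayed percentage.
oos_initial=73788416
oos_now=6509164

# Percentage of the original sync that is actually complete,
# despite the counter showing 0%.
done_pct=$(( (oos_initial - oos_now) * 100 / oos_initial ))
echo "actually complete: ${done_pct}%"
```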
See QRadar: HA synchronization progress resets to 0% for more information on mid-sync interruptions.

The output for “cat /proc/drbd” shows “ro:Primary/Secondary”. Does this field correspond with the QRadar primary and secondary?

No. The Primary/Secondary designation in DRBD is not the same as the QRadar primary and secondary designation. In DRBD, this designation indicates the role of the host the command is run on. If the output shows “ro:Primary/Secondary” then the host is the primary DRBD disk, and the far end of the DRBD connection is the secondary.
This role status means the DRBD primary has control over /store and is replicating to the secondary, which is not to be confused with the QRadar HA primary and secondary. When an HA failover occurs, the secondary host becomes the active host and takes over /store resources. After the failover, the secondary’s DRBD then shows as “ro:Primary/Secondary”, meaning it is the DRBD primary node.
Figure 3: "cat /proc/drbd" output on an active secondary host
The previous output is an example of "cat /proc/drbd" output from an HA secondary host that is the active node. Note the "ro:" field.

What command can be used from the CLI to check QRadar HA status?

A useful command to check HA status is:
/opt/qradar/ha/bin/ha cstate
This command displays the status of the primary and secondary and can be run from the secondary host as well, which is useful if there are any discrepancies in reporting between each host.
Here is an example:
[root@host-primary ~]# /opt/qradar/ha/bin/ha cstate
Local: R:PRIMARY S:ACTIVE/ONLINE CS:NONE P:1.0 HBC:UP RTT:82 I:0 SI:91065664
Remote: R:SECONDARY S:STANDBY/ONLINE CS:NONE P:1.0 HBC:UP RTT:50 I:11314 SI:62462923
HBC: ALIVE/1881
LSN: drbd_status => 1.0 I:0
LSN: ha_services => 1.0 I:0
LSN: drbd_sync => 1.0 I:0
LSN: mount_status => 1.0 I:0
LSN: app_services => 1.0 I:0
LSN: drbd_io_perf => 1.0 I:0
LSN: link_status => 1.0 I:0
LSN: drbd_network_perf => 1.0 I:0
LSN: cluster_ip => 1.0 I:0
RSN: drbd_status => 1.0 I:0
RSN: drbd_sync => 1.0 I:0
RSN: mount_status => 1.0 I:0
RSN: drbd_io_perf => 1.0 I:0
RSN: link_status => 1.0 I:0
Peer nodes status
Local: The node where the script is run.  When run on the secondary, the secondary is displayed as Local.
Remote: The HA node status.
 
Sensor status
 
LSN: Status of the sensors of the peer where the script is run (Local).
RSN: Status of the sensors of the other peer (Remote).
 
Expected Results
The primary must have an ACTIVE/ONLINE status and the secondary STANDBY/ONLINE.
Local and Remote sensors must have a value of 1.0 (optimal status).
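A quick way to script the check for the expected statuses is to match the status strings in the "ha cstate" output; the two lines below are copied from the earlier example:

```shell
# Status lines copied from the "ha cstate" example above.
local_line='Local: R:PRIMARY S:ACTIVE/ONLINE CS:NONE P:1.0 HBC:UP RTT:82 I:0 SI:91065664'
remote_line='Remote: R:SECONDARY S:STANDBY/ONLINE CS:NONE P:1.0 HBC:UP RTT:50 I:11314 SI:62462923'

# A healthy cluster: ACTIVE/ONLINE on the primary, STANDBY/ONLINE on the secondary.
case "$local_line"  in *ACTIVE/ONLINE*)  l_ok=yes ;; *) l_ok=no ;; esac
case "$remote_line" in *STANDBY/ONLINE*) r_ok=yes ;; *) r_ok=no ;; esac
echo "primary_ok=$l_ok secondary_ok=$r_ok"
```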

How do QRadar patches work on High Availability hosts?

The primary host must have an ACTIVE/ONLINE status and the secondary STANDBY/ONLINE for the patch to run successfully. See QRadar: Patch upgrade fails with error "HA configuration does not appear to be correct" for more details.
The patch file must be mounted on the primary host only. When the patch runs on the primary, it is run on the secondary automatically.


The words LINSTOR®, DRBD®, LINBIT®, and the logo LINSTOR®, DRBD®, and LINBIT® are trademarks or registered trademarks of LINBIT in Austria, the United States, and other countries.


Document Information

Modified date:
20 April 2022

UID

ibm16565347