Question & Answer
What is the role of Distributed Replicated Block Device in synchronizing the data across a High Availability (HA) appliance pair?
Distributed Replicated Block Device is one of the ways QRadar synchronizes HA peers. The Real-time data synchronization section of the QRadar Knowledge Center discusses Distributed Replicated Block Device and its function. This article provides further information about disk synchronization in HA QRadar clusters.
- Impact of Networking on Distributed Replicated Block Device
- Crossover and Disk Synchronization Rate Configuration
- HA States and Distributed Replicated Block Device
In the example below, when we run the disk free command (df) on an HA host, the Filesystem column for the /store partition lists the block device drbd0.
[root@qradar-primary /]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda7        20G  8.4G   11G  46% /
/dev/sda5       9.8G  1.2G  8.1G  13% /var/log
/dev/drbd0       31T   41G   31T   1% /store
/dev/sda6       9.7G   24M  9.2G   1% /store/tmp
/dev/sda9       8.0T   34M  8.0T   1% /store/transient
/dev/sda3       6.0G  3.4G  2.3G  60% /recovery
This is because HA is enabled and the /store file system on the QRadar HA peers (Active and Standby) replicates using the Distributed Replicated Block Device feature. Distributed Replicated Block Device is a replicated storage feature for Linux that layers logical block devices over existing logical block devices on participating cluster nodes. It uses "synchronous mode" replication: every write request is replicated to the peer node in real time, and control returns to the Linux kernel only after that write request has completed.
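The df check described above can also be scripted. The following is a minimal sketch rather than a QRadar tool: the function names store_device and is_drbd_device are hypothetical, and the sketch assumes POSIX-format `df -P` output in which the device is the first column and the mount point is the last.

```shell
# Hypothetical helper: print the device that backs a mount point.
# Reads `df -P` style output on stdin; $1 is the mount point to look up.
store_device() {
  awk -v mp="$1" 'NR > 1 && $NF == mp { print $1 }'
}

# Hypothetical helper: succeed only if the device name looks like /dev/drbdN.
is_drbd_device() {
  case "$1" in
    /dev/drbd[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}
```

On an HA host, this could be used as `dev=$(df -P /store | store_device /store); is_drbd_device "$dev" && echo "/store is replicated on $dev"`.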
2. Impact of networking on Distributed Replicated Block Device
Because Distributed Replicated Block Device replicates in synchronous mode, disk IO is bound by your network speed and latency. The upper limit on latency for Distributed Replicated Block Device to operate effectively is about 2 milliseconds. Latency on fiber links (not including switching and so on) can be as high as 0.82 ms per 160 km (99.42 miles), which makes Distributed Replicated Block Device unsuitable for geographically separated clusters. This is why HA and DR are separate concepts for QRadar.
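To see how quickly distance uses up the latency budget, the figures above can be turned into a rough estimate. This is a sketch under stated assumptions: the function names are hypothetical, the 0.82 ms per 160 km figure is treated as scaling linearly with distance, and switching or other equipment delays are ignored, so real latency is likely to be higher.

```shell
# Hypothetical helper: estimate fiber latency in ms for a distance in km,
# using the worst-case figure of 0.82 ms per 160 km quoted in this article.
# Assumes linear scaling; switching and equipment delays are not included.
fiber_latency_ms() {
  awk -v km="$1" 'BEGIN { printf "%.2f\n", km / 160 * 0.82 }'
}

# Hypothetical helper: succeed if the estimate is within the roughly 2 ms limit.
within_drbd_latency() {
  awk -v km="$1" 'BEGIN { exit !(km / 160 * 0.82 <= 2.0) }'
}
```

For example, `fiber_latency_ms 400` prints 2.05, which is already over the 2 ms limit before any switching delay is counted. Measured latency between the actual peers is a better guide than this estimate.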
If your HA configuration does not utilize a dedicated crossover connection, all HA traffic, including Distributed Replicated Block Device, uses the management interfaces of the HA hosts. The management interface is usually the 1 Gbps Copper eth0 interface, which puts the physical limit of the synchronization rate (and thus Disk IO) at approximately 120 MB/s. This interface is also used for normal QRadar traffic, such as configuration updates, event searches, patch uploads, and incoming events from your log sources. Because of this, the expected synchronization performance is below the theoretical 120 MB/s physical limit. The default Disk Synchronization Rate for HA is set to 100 MB/s to reflect this.
If disk synchronization becomes saturated, your system suffers performance-related issues that, in extreme cases, result in services becoming unavailable. To avoid such problems, you can use crossover connections, which support 10 Gbps fiber Ethernet connections. The Disk Synchronization Rate setting is always bound by the physical limit of the interface in use for HA operations. Therefore, when using 10 Gbps crossover interfaces, the Disk Synchronization Rate setting can be increased to as high as 1000 MB/s. Note that the actual performance might not always meet the configured rate.
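The relationship between link speed and the Disk Synchronization Rate can be checked with simple arithmetic. The sketch below uses hypothetical function names and decimal units (1 Gbps = 1000 Mbit/s, 8 bits per byte). It computes the raw ceiling only, so protocol overhead and competing traffic are not modeled, which is why the article quotes roughly 120 MB/s rather than 125 MB/s for a 1 Gbps link.

```shell
# Hypothetical helper: raw ceiling in MB/s for a link speed given in Gbps.
# Decimal units; protocol overhead is ignored, so real throughput is lower.
link_ceiling_mbs() {
  awk -v gbps="$1" 'BEGIN { printf "%d\n", gbps * 1000 / 8 }'
}

# Hypothetical helper: succeed if a proposed Disk Synchronization Rate
# ($1, in MB/s) does not exceed the raw ceiling of the link ($2, in Gbps).
rate_fits_link() {
  [ "$1" -le "$(link_ceiling_mbs "$2")" ]
}
```

With these helpers, `rate_fits_link 100 1` and `rate_fits_link 1000 10` both succeed, while `rate_fits_link 1000 1` fails, matching the default and maximum rates described above.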
3. Crossover and Disk Synchronization Rate Configuration
When first creating an HA cluster, you have the option to configure a crossover connection, as discussed in the Creating an HA cluster section of the QRadar Knowledge Center. You can also edit the configuration of an existing cluster to add a crossover connection by selecting the relevant cluster from Admin > System and License Management and clicking Edit HA Host under the High Availability menu.
Regardless of how you access it, the HA Wizard has the settings for configuring the Disk Synchronization Rate and Crossover interfaces.
4. HA States and Distributed Replicated Block Device
The HA state information is available on the System and License Management window. In a fully functional HA environment, one host is in an Active state and the other in a Standby state. From a Distributed Replicated Block Device perspective, this means that the /store partition is fully synchronized and the updates are happening as expected.
If the /store partition on a host is not yet fully synchronized (usually seen when an HA cluster is first created), its status appears as Synchronizing. When a host is shown as synchronizing, the data in the /store file system is inconsistent and the host with the Synchronizing state is copying data from the Active node. The percentage value next to the Synchronizing status indicates the progress of the Distributed Replicated Block Device synchronization.
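The same information can usually be read from the DRBD status text on the command line. The sketch below is an assumption-heavy illustration: this article does not name a status file, the helper name drbd_summary is hypothetical, and the parsing assumes the DRBD 8.x status format, in which each resource line carries a cs: (connection state) field and a ds: (local/peer disk state) field. Your QRadar version might expose status differently.

```shell
# Hypothetical helper: summarize DRBD 8.x style status text read from stdin.
# For each resource line, prints the cs: (connection state) value followed by
# the ds: (local/peer disk state) value.
drbd_summary() {
  awk '/cs:/ {
    cs = ""; ds = ""
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^cs:/) cs = substr($i, 4)
      if ($i ~ /^ds:/) ds = substr($i, 4)
    }
    print cs, ds
  }'
}
```

Assuming the status is available at /proc/drbd, as on DRBD 8.x systems, `drbd_summary < /proc/drbd` would print a line such as `SyncTarget Inconsistent/UpToDate` on a host that is still copying data from its peer, and `Connected UpToDate/UpToDate` once both sides are in sync.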
Other possible states for HA include Offline and Unknown. These states may be encountered while performing a manual failover operation, during maintenance when one of the hosts is rebooted or turned off, or during patching. In most cases, a system in an Offline state can be restored to normal operation by selecting the peer and clicking the Set System Online option available in the High Availability menu. The Unknown state, however, can indicate an issue, and IBM Support should be contacted.
16 June 2018