IBM Support

QRadar: High Availability (HA) may fail over if an NFS mount becomes read-only

Troubleshooting


Problem

If an NFS volume or mount point becomes read-only on an HA appliance, a failover can occur from the primary (active) appliance to the standby appliance.

Cause

An NFS mount is read-only in the environment, which causes a high-availability failover. Administrators need to review the status of their NFS partitions to confirm that the partitions are not in read-only mode and that the permissions on the NFS storage have not changed.
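
As a rough illustration of that review, the following shell sketch lists NFS mount points whose client-side mount options include ro. The function name nfs_ro_mounts is hypothetical and not part of QRadar. The sketch assumes the standard /proc/mounts layout (device, mount point, file system type, options). An export that is read-only on the server side may still appear as rw on the client, so an empty result does not rule out the problem.

```shell
#!/bin/sh
# Hypothetical helper (not a QRadar command): print the mount point of every
# NFS file system whose mount options contain the exact option "ro".
# Argument: path to a mounts table; defaults to /proc/mounts.
nfs_ro_mounts() {
    awk '$3 ~ /^nfs4?$/ {
        # Field 4 is the comma-separated option list, for example "rw,sync,soft".
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "${1:-/proc/mounts}"
}

# Example: nfs_ro_mounts
```

If the function prints nothing, no NFS mount in the table carries the ro option.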

Resolving The Problem

When an NFS volume becomes read-only, the active appliance can fail over, and the peer node is also unable to write to the NFS volume. This article provides steps for administrators to investigate the cause of a read-only NFS partition.
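
One way to confirm whether a node can write to the volume is a small write probe. The sketch below is an illustration only: probe_writable is a hypothetical helper, and /store/nfs in the usage comment is the mount point from the example output in this article. The probe creates and immediately removes an empty temporary file. On an unresponsive mount, the command may block until the NFS timeout expires.

```shell
#!/bin/sh
# Hypothetical helper (not a QRadar command): try to create and remove a
# small temporary file in the given directory.
# Prints "<dir>: writable" and returns 0, or "<dir>: not writable" and returns 1.
probe_writable() {
    dir=$1
    if probe=$(mktemp "$dir/.rw_probe.XXXXXX" 2>/dev/null); then
        rm -f "$probe"
        echo "$dir: writable"
        return 0
    fi
    echo "$dir: not writable"
    return 1
}

# Example: probe_writable /store/nfs
```

A "not writable" result does not identify the cause. It can indicate a read-only export, changed permissions, or a missing mount.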

Procedure
  1. As the root user, use SSH to log in to the HA appliance that failed over.
    Note: This is typically the appliance where the Status column displays Standby.
  2. Verify that your NFS volume is mounted, and check whether the mount options list rw (read-write) or ro (read-only), by typing:
    grep nfs /proc/mounts
    Example output:
    sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
    192.168.0.58:/home/exports /store/nfs nfs4 rw,sync,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,soft,noac,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.95,local_lock=none,addr=192.168.0.58 0 0
    
  3. To verify that NFS is operational on your NFS server and the QRadar HA appliance, type:
    rpcinfo -p server_IP_address
    Example output:
       program vers proto   port  service
        100000    4   tcp    111  portmapper
        100000    3   tcp    111  portmapper
        100000    2   tcp    111  portmapper
        100000    4   udp    111  portmapper
        100000    3   udp    111  portmapper
        100000    2   udp    111  portmapper
        100005    1   udp  20048  mountd
        100024    1   udp  37017  status
        100005    1   tcp  20048  mountd
        100024    1   tcp  47199  status
        100005    2   udp  20048  mountd
        100005    2   tcp  20048  mountd
        100005    3   udp  20048  mountd
        100005    3   tcp  20048  mountd
        100003    3   tcp   2049  nfs
        100003    4   tcp   2049  nfs
        100227    3   tcp   2049  nfs_acl
        100003    3   udp   2049  nfs
        100003    4   udp   2049  nfs
        100227    3   udp   2049  nfs_acl
        100021    1   udp  50408  nlockmgr
        100021    3   udp  50408  nlockmgr
        100021    4   udp  50408  nlockmgr
        100021    1   tcp  40114  nlockmgr
        100021    3   tcp  40114  nlockmgr
        100021    4   tcp  40114  nlockmgr
    
  4. Review /var/log/messages on the QRadar appliance for NFS messages to determine whether your NFS server is responding and writable. For example:
    kernel: nfs: server server.domain.name not responding, still trying
    kernel: nfs: task 10754 can't get a request slot
    kernel: nfs: server server.domain.name OK
  5. On your NFS server, verify that the directory is exported to the QRadar appliance with read-write (rw) permissions. The export entry, typically in /etc/exports, resembles the following:
    /export/dir QRadar_IP_address(rw,sync)
    Note: For more information on NFS export options, see: NFS Export Options.
  6. To check for I/O latency on the NFS partition, type:
    nfsiostat
    Example output:
    192.168.0.58:/home/exports mounted on /store/nfs:
    op/s         rpc bklog
    0.33         0.00
    
    read:       ops/s     kB/s    kB/op    retrans      avg RTT (ms)    avg exe (ms)
                0.000     0.000   0.000    0 (0.0%)     0.000           0.000
    write:      ops/s     kB/s    kB/op    retrans      avg RTT (ms)   avg exe (ms)
                0.000     0.000   0.000    0 (0.0%)     0.000          0.000
    
    Note: If the nfs service is not running, the nfsiostat command does not display results.
Results
These steps are intended to assist with troubleshooting the issue. If you continue to experience issues, contact your storage administrators to determine whether the permissions on your remote storage changed or whether the NFS partitions were set to read-only.
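
The log review in step 4 can be condensed into a quick count of how often the NFS server stopped responding and how often it recovered. This is a minimal sketch: nfs_log_summary is a hypothetical helper, and the patterns are taken only from the sample kernel messages shown in step 4, so other NFS messages are not counted.

```shell
#!/bin/sh
# Hypothetical helper (not a QRadar command): count kernel NFS client
# messages that report an unresponsive server and messages that report recovery.
# Argument: path to a syslog-style file; defaults to /var/log/messages.
nfs_log_summary() {
    awk '/kernel: nfs: server .* not responding/    { down++ }
         /kernel: nfs: server .* OK[[:space:]]*$/   { up++ }
         END { printf "not responding: %d, recovered: %d\n", down, up }' \
        "${1:-/var/log/messages}"
}

# Example: nfs_log_summary /var/log/messages
```

More "not responding" events than recoveries suggests that the server was still unreachable at the end of the log. Confirm this against the timestamps in the file.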

Document Location

Worldwide

Business Unit: IBM Software w/o TPS
Product: IBM Security QRadar SIEM
Platform: Linux
Version: All Versions
Line of Business: Security Software

Document Information

Modified date:
18 December 2020

UID

ibm16236988