Linux-UNIX: Planning the integration with Hortonworks and Apache Ranger

Complete and verify the tasks in this topic before configuring the integration.

Topology of S-TAPs and collectors

Determine the required topology:
  • Number of collectors needed
  • Components monitored by each S-TAP
Some customers prefer to have one S-TAP for each component. At a minimum, we recommend one S-TAP for HBase and one S-TAP for everything else.
Tip: An S-TAP does not have to reside on the same node as any particular component. It is possible, and even advisable when supporting Hadoop HA, to set up a dedicated Linux system for the S-TAP.
When configuring the number of connections for an S-TAP, use the following rule of thumb:
  • HBase: one plus the number of region servers
  • Everything else: one plus one for each component monitored
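The rule of thumb above can be sketched as simple arithmetic. The following Python snippet is only an illustration: the function names and the example component list are hypothetical and are not part of any Guardium tool or API.

```python
from typing import Sequence


def hbase_connections(region_servers: int) -> int:
    """Rule of thumb for the HBase S-TAP: one plus the number of region servers."""
    if region_servers < 0:
        raise ValueError("region_servers must not be negative")
    return 1 + region_servers


def other_connections(components: Sequence[str]) -> int:
    """Rule of thumb for the other S-TAP: one plus one for each component monitored."""
    return 1 + len(components)


# Hypothetical cluster: 4 HBase region servers, plus HDFS and Hive
# monitored by a second S-TAP.
hbase_total = hbase_connections(4)
other_total = other_connections(["HDFS", "Hive"])
print(f"HBase S-TAP connections: {hbase_total}")
print(f"Other S-TAP connections: {other_total}")
```

For this assumed cluster, the sketch yields 5 connections for the HBase S-TAP and 3 for the S-TAP that monitors HDFS and Hive. Treat the results as a starting point for sizing, not as a required setting.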
Attention:
  • For blocking, verify access to all HBase region servers, since you will need to copy the Guardium plugin JAR file to each of these region servers.
  • For configuring high availability failover scenarios, record the failover node IP addresses or host names.

High availability and failover

Hadoop uses secondary nodes for high availability to handle data requests should the primary node fail. There are several options for S-TAP deployment so that you can continue to collect audit data in a failover scenario.
Install the S-TAP and set it up on a system that is not part of the Hadoop cluster
This provides a simple configuration where, when the components fail over, the new node automatically uses the S-TAP as a remote logger. No changes are needed to any configurations or S-TAPs.
Hybrid approach (recommended)
Install an S-TAP for HDFS and Hive using localhost in the S-TAP host field, then use a separate system such as an edge node for HBase. This provides an alternative to installing S-TAPs on all nodes and region servers and is the recommended approach.
Install the S-TAP on the nodes in the cluster
In this model, you install an S-TAP on the primary and standby node for each component.

Using localhost in the S-TAP host field, install an S-TAP on every node in the cluster and on every region server for HBase. This approach is not recommended.

Guardium load balancing

Guardium S-TAP and enterprise load balancing options are supported when Ranger integration is enabled.