MapR DB
The MapR DB destination writes data to MapR DB binary tables. The destination can write data to MapR DB as text, binary data, or JSON strings. You can define the data format for each column written to MapR DB.
When you configure the MapR DB destination, you specify the MapR DB configuration properties, including the table name. You specify the row key for the table, and then map fields from the pipeline to MapR DB columns.
When necessary, you can enable Kerberos authentication and specify an HBase user. You can also use HDFS configuration files and add other HDFS configuration properties as needed.
Before you use any MapR stage in a pipeline, you must perform additional steps to enable Data Collector to process MapR data. For more information, see MapR Prerequisites.
Field Mappings
When you configure the MapR DB destination, you map fields from records to MapR DB columns.
You can map fields to columns in the following ways:
- Explicit field mappings
- By default, the MapR DB destination uses explicit field mappings. You select the fields from records to map to MapR DB columns. Specify the MapR DB columns using the following format: <column-family>:<qualifier>. You then define the storage type for the column in MapR DB.
- Implicit field mappings
- When you configure the MapR DB destination to use implicit field mappings, the destination writes data based on the matching field names. You can use implicit field mappings when the field paths use the following format: <column-family>:<qualifier>
- Both implicit and explicit field mappings
- You can configure the destination to use implicit field mappings and then you can override the mappings by defining explicit mappings for specific fields.
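The interaction between implicit and explicit mappings can be sketched as follows. This is an illustrative model only, not the destination's implementation: the function names, the flat dictionary record, and the choice to skip fields that have no valid mapping are all assumptions made for the example.

```python
def parse_column(name):
    """Split '<column-family>:<qualifier>' into (family, qualifier), or return None."""
    family, sep, qualifier = name.partition(":")
    if not sep or not family or not qualifier:
        return None
    return family, qualifier


def resolve_mappings(record, explicit=None, implicit=False):
    """Return {(family, qualifier): value} for one record.

    record   -- hypothetical flat record: field name -> value
    explicit -- field name -> '<column-family>:<qualifier>'
    implicit -- when True, fields already named in the column format map by name
    """
    explicit = explicit or {}
    columns = {}
    for field, value in record.items():
        if field in explicit:
            # An explicit mapping overrides the implicit one for this field.
            target = parse_column(explicit[field])
        elif implicit:
            target = parse_column(field)
        else:
            target = None
        if target is not None:
            columns[target] = value  # assumption: unmapped fields are skipped
    return columns


record = {"cf:name": "Ada", "cf:city": "London", "id": 7}
print(resolve_mappings(record, explicit={"cf:city": "geo:town"}, implicit=True))
```

In this sketch, cf:name is mapped implicitly, cf:city is redirected to geo:town by the explicit mapping, and id is left out because its name does not use the column format.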
Time Basis
The time basis determines the timestamp value added for each column written to MapR DB.
You can use the following times as the time basis:
- Processing Time
- When you use processing time as the time basis, the destination uses the Data Collector processing time as the timestamp value. The processing time is calculated once per batch.
- Record Time
- When you use the time associated with a record as the time basis, you specify a Date or Datetime field in the record. The destination uses the field value as the timestamp value.
- System Time
- When you leave the Time Basis field empty, the destination uses the timestamp value automatically generated by MapR when the column is written to MapR DB.
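The three time basis options can be modeled with a short sketch. It is a simplification under stated assumptions: the option names "processing" and "record", the epoch-millisecond representation, and the helper functions are hypothetical and do not reflect the destination's actual configuration values or code.

```python
from datetime import datetime, timezone


def to_millis(dt):
    """Convert a timezone-aware datetime to epoch milliseconds (assumed unit)."""
    return int(dt.timestamp() * 1000)


def column_timestamps(records, time_basis=None, time_field=None, clock=None):
    """Return one timestamp per record, or None where MapR should assign it."""
    clock = clock or (lambda: datetime.now(timezone.utc))
    if time_basis == "processing":
        # Processing time is calculated once per batch, then reused.
        stamp = to_millis(clock())
        return [stamp] * len(records)
    if time_basis == "record":
        # Record time comes from a Date or Datetime field in each record.
        return [to_millis(r[time_field]) for r in records]
    if time_basis is None:
        # Empty time basis: leave the timestamp for MapR to generate on write.
        return [None] * len(records)
    raise ValueError(f"unknown time basis: {time_basis!r}")


batch = [
    {"event_time": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"event_time": datetime(2024, 1, 2, tzinfo=timezone.utc)},
]
print(column_timestamps(batch, "record", time_field="event_time"))
print(column_timestamps(batch, "processing"))
print(column_timestamps(batch))
```

With processing time, every record in the batch shares one value. With record time, each record carries its own value.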
Kerberos Authentication
You can use Kerberos authentication to connect to MapR DB. By default, Data Collector connects using the user account that started it. When you use Kerberos authentication, Data Collector instead uses the Kerberos principal and keytab to connect to MapR DB.
The Kerberos principal and keytab are defined in the Data Collector configuration properties. To use Kerberos authentication, configure all Kerberos properties in the Data Collector configuration properties.
For more information about enabling Kerberos authentication for Data Collector, see Kerberos Authentication.
Using an HBase User
Data Collector can either use the currently logged in Data Collector user or a user configured in the destination to write to MapR DB.
You can set a Data Collector configuration property that requires using the currently logged in Data Collector user. When this property is not set, you can specify a user in the destination. For more information about Hadoop impersonation and the Data Collector property, see Hadoop Impersonation Mode.
Note that the destination uses a different user account to connect. By default, Data Collector uses the user account that started it to connect to external systems. When using Kerberos, Data Collector uses the Kerberos principal.
To use a user configured in the destination, perform the following tasks:
- On MapR, configure the user as a proxy user and authorize the user to impersonate the HBase user. For more information, see the HBase documentation.
- In the MapR DB destination, enter the HBase user name.
HDFS Properties and Configuration File
You can configure the MapR DB destination to use individual HDFS properties or HDFS configuration files:
- HBase configuration file
- You can use the following HBase configuration file with the MapR DB destination:
- hbase-site.xml
- Individual properties
- You can configure individual HBase properties in the MapR DB destination. To add an HBase property, you specify the exact property name and the value. The MapR DB destination does not validate the property names or values.
Note: Individual properties override properties defined in the HBase configuration file.
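The override rule in the note can be illustrated with a minimal sketch. It assumes the standard Hadoop-style layout of hbase-site.xml (property elements with name and value children). The function names are hypothetical, and the property names and values are illustrative only. As in the destination, nothing here validates names or values.

```python
import xml.etree.ElementTree as ET


def load_site_xml(text):
    """Read name/value pairs from Hadoop-style configuration XML."""
    root = ET.fromstring(text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def effective_properties(site_xml_text, individual):
    """Merge file properties with individual ones; individual values win."""
    merged = load_site_xml(site_xml_text)
    merged.update(individual)  # no validation of names or values
    return merged


# Illustrative file content; the values are assumptions for the example.
SITE = """<configuration>
  <property><name>hbase.client.retries.number</name><value>10</value></property>
  <property><name>hbase.rpc.timeout</name><value>60000</value></property>
</configuration>"""

print(effective_properties(SITE, {"hbase.rpc.timeout": "30000"}))
```

Here hbase.rpc.timeout takes the individually configured value, while hbase.client.retries.number keeps the value from the file.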
Configuring a MapR DB Destination
Configure a MapR DB destination to write data as text, binary data, or JSON strings to MapR DB binary tables.