Field Hasher
The Field Hasher processor uses a hashing algorithm to encode data. Use the processor to encode highly sensitive data. For example, you might use the Field Hasher processor to encode social security or credit card numbers.
Field Hasher provides several methods for hashing individual fields or the entire record. You can hash any field that can be converted to a string. The resulting hash is a string value.
You can configure the Field Hasher processor to use MD5, SHA1, SHA-256, SHA-512, or MurmurHash3 128 to hash field values. You can optionally add a single field separator character to fields before hashing.
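As a rough illustration of what hashing a field value involves, the following Python sketch converts a value to a string and hashes it with the standard hashlib module. The helper name hash_value, the UTF-8 encoding, and the hexadecimal output format are assumptions made for the example, not a description of how Field Hasher is implemented. MurmurHash3 is omitted because it is not available in the Python standard library.

```python
import hashlib

# Hypothetical helper for illustration only; not part of the Field Hasher processor.
def hash_value(value, algorithm="sha256"):
    """Convert a value to a string and return its hash as a hex string."""
    text = str(value)  # any value that can be converted to a string can be hashed
    # Assumption: the string is encoded as UTF-8 and the digest is rendered as hex.
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()

# The hashed result is always a string, whatever the input type was.
for name in ("md5", "sha1", "sha256", "sha512"):
    print(name, len(hash_value("123-45-6789", name)))
# Hex digest lengths: md5 32, sha1 40, sha256 64, sha512 128
```

The same input always yields the same hash, so hashed values can still be compared or joined on without exposing the original data.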
Hash Methods
Field Hasher provides several methods to hash data. When you hash a field more than once, Field Hasher uses the existing hash when generating the next hash.
- Hash in Place - Field Hasher replaces the original data in a field with hashed values.
You can specify multiple fields to be hashed with the same algorithm. You can also use different algorithms to hash different sets of fields.
- Hash to Target - Field Hasher hashes data in a field and writes it to the specified field, header attribute, or both. It leaves the original data in place.
If the specified target field or attribute does not exist, Field Hasher creates it.
If you specify multiple fields to be hashed with the same algorithm, Field Hasher hashes the fields together.
If any of the fields are already hashed, Field Hasher uses existing hash values to generate the new hash value.
- Hash Record - Field Hasher hashes the record and writes it to the specified field, header attribute, or both. You can include the record header in the hash.
If the specified target field or attribute does not exist, Field Hasher creates it.
If the record includes fields that are already hashed, Field Hasher uses the hash values when hashing the record.
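The three methods above can be sketched in Python, with a plain dictionary standing in for a record. This is a simplified model under stated assumptions: the function names are hypothetical, fields hashed together are assumed to be concatenated in the listed order, and header attributes and the record header are not modeled. The actual processor may combine values differently.

```python
import hashlib

def digest(text, algorithm="sha256"):
    """Return the hex hash of a string (hex output is an assumption)."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()

def hash_in_place(record, fields, algorithm="sha256"):
    # Each listed field is replaced by its own hash.
    for name in fields:
        record[name] = digest(str(record[name]), algorithm)

def hash_to_target(record, fields, target, algorithm="sha256"):
    # Listed fields are hashed together and written to the target field,
    # which is created if it does not exist. The source fields are unchanged.
    # Assumption: values are concatenated in the order listed.
    combined = "".join(str(record[name]) for name in fields)
    record[target] = digest(combined, algorithm)

def hash_record(record, target, algorithm="sha256"):
    # All field values are hashed together and written to the target field.
    # Assumption: values are concatenated in record order.
    combined = "".join(str(value) for value in record.values())
    record[target] = digest(combined, algorithm)

record = {"ssn": "123-45-6789", "card": "4111111111111111"}
hash_in_place(record, ["ssn"])                       # ssn now holds a hash
hash_to_target(record, ["ssn", "card"], "combined")  # uses the existing ssn hash
hash_record(record, "record_hash")                   # uses all current values
print(sorted(record))
# ['card', 'combined', 'record_hash', 'ssn']
```

Because hash_in_place runs first, the later calls see the hashed ssn value rather than the original, which mirrors the statement that Field Hasher uses existing hash values when generating the next hash.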
Field Separator
You can configure the Field Hasher processor to add a field separator character to the end of all fields to be hashed. You might want to add a field separator character when you hash multiple fields to a single field or when you hash an entire record.
When you use a field separator, the Field Hasher processor appends the character to each field before hashing, so the field separator character is hashed along with the field. Because the separator is added to every field, the last field in a set of fields or in a record also includes the field separator character in the hash.
When you enable the use of a field separator, you can select one of the predefined character options (Tab, Semicolon, Comma, or Space), or you can select Other and enter the code for any UTF-8 character.
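The following Python sketch shows why a separator can matter when several fields are hashed together. It assumes that field values are concatenated before hashing and uses a hypothetical helper, hash_fields. Without a separator, different field values that concatenate to the same string produce the same hash. With a separator appended to every field, including the last one, the inputs stay distinct.

```python
import hashlib

# Hypothetical helper for illustration; concatenation order is an assumption.
def hash_fields(values, separator=None, algorithm="sha256"):
    parts = []
    for value in values:
        text = str(value)
        if separator is not None:
            # The separator is appended to every field, including the last one.
            text += separator
        parts.append(text)
    return hashlib.new(algorithm, "".join(parts).encode("utf-8")).hexdigest()

# Without a separator, both inputs concatenate to "abc" and collide.
print(hash_fields(["ab", "c"]) == hash_fields(["a", "bc"]))
# True

# With a comma separator, the inputs become "ab,c," and "a,bc," and differ.
print(hash_fields(["ab", "c"], ",") == hash_fields(["a", "bc"], ","))
# False
```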
List, Map, and List-Map Fields
Field Hasher does not hash list, map, or list-map fields, but can hash field data within the list, map, and list-map fields. To hash data within a list, map, or list-map field, select the field that contains the actual data to be hashed.
When hashing the entire record, Field Hasher hashes the data within list, map, and list-map fields.
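To illustrate selecting data inside a list or map, the sketch below models a record as nested Python dictionaries and lists. The helper hash_nested and its list-of-keys path are hypothetical and are not the processor's field path syntax. The helper hashes only a field that holds actual data and rejects a path that stops at a list or map.

```python
import hashlib

# Hypothetical helper for illustration; the path format is an assumption.
def hash_nested(record, path, algorithm="sha256"):
    """Hash the value at the given path of map keys and list indexes, in place."""
    parent = record
    for key in path[:-1]:
        parent = parent[key]  # walk down through maps and lists
    leaf = parent[path[-1]]
    if isinstance(leaf, (list, dict)):
        # The list or map itself is not hashed; select a field inside it instead.
        raise TypeError("select the field that contains the actual data")
    parent[path[-1]] = hashlib.new(algorithm, str(leaf).encode("utf-8")).hexdigest()

record = {"customer": {"ssn": "123-45-6789", "phones": ["555-0100", "555-0101"]}}
hash_nested(record, ["customer", "ssn"])        # a field inside a map
hash_nested(record, ["customer", "phones", 0])  # an element inside a list
print(record["customer"]["phones"][1])
# 555-0101
```

Only the selected fields change. The second phone number is untouched because it was not selected.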