Using the class APT_HashPartitioner

The class APT_HashPartitioner partitions a data set by applying a hash function to one or more fields of each record.

APT_HashPartitioner uses a dynamic interface schema that allows you to specify one or more numeric or string fields as input to the partitioner.
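The idea of hash partitioning on key fields can be sketched in standalone C++. This is only an illustration of the general technique, not the framework's implementation: the Record type and the choosePartition() function are hypothetical, and std::hash stands in for whatever hash function APT_HashPartitioner actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical record type for illustration; not part of the APT framework.
struct Record {
    std::int32_t field1;
    std::int32_t field2;
    std::string field3;
};

// Combine the hashes of the two key fields (field1 and field2), then map
// the combined value onto one of numPartitions partitions.
// Assumption: std::hash and the "* 31 +" combiner are placeholders chosen
// for this sketch only.
std::size_t choosePartition(const Record& rec, std::size_t numPartitions) {
    assert(numPartitions > 0);
    std::size_t h = std::hash<std::int32_t>{}(rec.field1);
    h = h * 31 + std::hash<std::int32_t>{}(rec.field2);
    return h % numPartitions;
}
```

Records with equal key values always land in the same partition, regardless of the values of their non-key fields. For example, the records {1, 10, "x"} and {1, 10, "y"} are assigned to the same partition because only field1 and field2 contribute to the hash.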

The following figure shows an operator using APT_HashPartitioner:

Figure 1. APT_HashPartitioner

APT_HashPartitioner does not define any interface schema; you use the APT_HashPartitioner constructor or the member function APT_HashPartitioner::setKey() to specify the key fields.

The constructor for APT_HashPartitioner has two overloads:

APT_HashPartitioner();
APT_HashPartitioner(const APT_FieldList& fList);

The first overload creates an APT_HashPartitioner object without specifying any key fields. You must then use setKey() to specify key fields.

The second overload creates an APT_HashPartitioner object using a list of key fields from the input interface schema for the operator. These fields can be any field type, including raw, date, and timestamp. APT_HashPartitioner determines the data type of each field from the input interface schema.

SortOperator requires three fields as input: two integer fields and a string field. You can specify the interface schema of the partitioner within the describeOperator() function, as the code in the following table shows:

Table 1. Partitioner Interface Schema in describeOperator()

APT_Status SortOperator::describeOperator()
{
	setKind(APT_Operator::eParallel);

	setInputDataSets(1);
	setOutputDataSets(1);

	setInputInterfaceSchema("record(field1:int32; field2:int32; field3:string; in:*;)", 0);
	setOutputInterfaceSchema("record(out:*;)", 0);
	declareTransfer("in", "out", 0, 0);

	APT_HashPartitioner * hashPart = new APT_HashPartitioner;
	hashPart->setKey("field1", "int32");
	hashPart->setKey("field2", "int32");

	setPartitionMethod(hashPart, APT_ViewAdapter(), 0);

	return APT_StatusOk;
}

9: APT_HashPartitioner * hashPart = new APT_HashPartitioner;

Use the default constructor to dynamically allocate an APT_HashPartitioner object.

Partitioner objects must be dynamically allocated within describeOperator(). The framework deletes the partitioner for you when it is no longer needed.

You must call setKey() to specify the key fields for this APT_HashPartitioner object.

10: hashPart->setKey("field1", "int32");

Use APT_HashPartitioner::setKey() to specify field1 as a key field for this APT_HashPartitioner object.

Use setKey() to specify both a field name and a data type for the field. The order in which key fields are listed is unimportant.

11: hashPart->setKey("field2", "int32");

Specify field2 as a key field for this APT_HashPartitioner object.

12: setPartitionMethod(hashPart, APT_ViewAdapter(), 0);

Use APT_Operator::setPartitionMethod() to specify hashPart as the partitioner for this operator. After calling this function, do not delete the partitioner, because the framework takes ownership of its memory.

Because you do not need a view adapter with this partitioner, the call passes a default-constructed view adapter, APT_ViewAdapter().

An application developer using this operator can use adapters to translate the name of a data set field and its data type in the input data set schema to match the input interface schema. In the previous figure, the data set myDS is input to the sort operator. An application developer could translate field a and field b of myDS to field1 and field2 of the operator. The hash partitioner would then partition the records by fields a and b.
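The effect of such a translation can be pictured with a small standalone sketch. It does not use or model the framework's adapter classes: DataSetRecord, FieldRenames, and readInterfaceField() are hypothetical names introduced here only to show how an interface field name such as field1 can resolve to a differently named data set field such as a.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration; none of these are APT classes.
// DataSetRecord maps a data set field name to its value.
using DataSetRecord = std::map<std::string, std::int32_t>;
// FieldRenames maps an interface field name to a data set field name.
using FieldRenames = std::map<std::string, std::string>;

// Read the value of an interface field by first translating its name.
// If no translation is registered, the interface name is used unchanged.
// Throws std::out_of_range if the resolved field is not in the record.
std::int32_t readInterfaceField(const DataSetRecord& rec,
                                const FieldRenames& renames,
                                const std::string& interfaceName) {
    auto it = renames.find(interfaceName);
    const std::string& actual =
        (it != renames.end()) ? it->second : interfaceName;
    return rec.at(actual);
}
```

With a record holding a = 7 and b = 9 and the renames field1 to a and field2 to b, reading field1 yields 7 and reading field2 yields 9, so a partitioner keyed on field1 and field2 would in effect hash the values of a and b.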