Head operator

The head operator selects the first n records from each partition of an input data set and copies the selected records to an output data set. By default, n is 10.

You can use options to control the following:

  • The number of records to copy
  • The partition from which the records are copied
  • The location of the records to copy
  • The number of records to skip before the copying operation begins

These options are helpful when testing and debugging jobs with large data sets. For example, the -part option lets you see data from a single partition to check whether the data is being partitioned as you intended. The -skip option lets you skip past the first records of each partition to reach a later portion of the data set.
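
The selection behavior described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the operator's implementation: the function name head and the parameters nrecs, skip, and part are hypothetical, a data set is represented as a list of partitions, and the sketch assumes that partitions other than the selected one yield no output records.

```python
from typing import List, Optional, Sequence


def head(
    partitions: Sequence[Sequence[object]],
    nrecs: int = 10,             # hypothetical name: records to copy per partition
    skip: int = 0,               # hypothetical name: records to skip before copying
    part: Optional[int] = None,  # hypothetical name: copy from this partition only
) -> List[List[object]]:
    """Model of head-style selection over a partitioned data set."""
    output: List[List[object]] = []
    for index, records in enumerate(partitions):
        if part is not None and index != part:
            # Assumption: an unselected partition contributes no records.
            output.append([])
            continue
        # Skip the leading records, then copy at most nrecs records.
        output.append(list(records[skip:skip + nrecs]))
    return output


# Two partitions: one with 25 records, one with only 3.
data = [list(range(25)), [100, 101, 102]]

default_result = head(data)
single_partition = head(data, nrecs=3, skip=5, part=0)
```

With the defaults, each partition contributes at most its first 10 records, so a partition holding fewer than 10 records is copied in full. Restricting the call to one partition and skipping records narrows the output to a small window, which is the kind of inspection that is useful when debugging a large data set.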

The tail operator performs a similar operation, copying the last n records from each partition. See Tail operator.