INT_TO_INT comparison

Compares an interval from a data source to an interval from a reference source. The intervals match if the interval from one source overlaps, or is fully contained in, the interval from the other source.

You might use this match comparison to compare hospital admission dates to see whether hospital stays are partially concurrent. You might also use it to match two geographic reference files that contain ranges of addresses.

You can use this comparison with reverse matching.

This match comparison does not take frequency information into account. However, a two-source match requires four input streams. If you use this match comparison with a Two-source Match stage job, create two dummy file inputs in place of the files that would contain frequency information.

Required Columns

The following data source and reference source columns are required:

  • Data. The data column that contains the beginning value of the interval.
  • Data. The data column that contains the ending value of the interval.
  • Reference. The reference column that contains the beginning value of the interval.
  • Reference. The reference column that contains the ending value of the interval.

Required Modes

A mode is required. Choose one of the following modes:

  • ZERO_VALID. Indicates that a value of 0 in a Data or Reference column is valid. A blank value in a Reference column means that the range value that it represents is the same as the range value in the companion Reference column. A blank value in a Data column means that the range value that it represents is the same as the range value in the companion Data column.
  • ZERO_NULL. Indicates that a value of 0 in the Data column is missing data. A value of 0 or a blank value in the Reference column that represents the ending range means that the ending range value is the same as the beginning range value in the Reference column that represents the beginning range. A value of 0 or a blank value in the Data column that represents the ending range means that the ending range value is the same as the beginning range value in the Data column that represents the beginning range.
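
The following Python sketch shows one way to read the two modes. It is an illustration only, not the stage's implementation. The function name normalize_interval is hypothetical, a blank value is represented as None, and the handling of a blank or 0 beginning value under ZERO_NULL (treated here as missing data) is an assumption that the text above does not spell out.

```python
from typing import Optional, Tuple


def normalize_interval(
    begin: Optional[int], end: Optional[int], mode: str
) -> Optional[Tuple[int, int]]:
    """Return (begin, end) after applying the mode rules, or None if missing.

    Hypothetical helper: None stands for a blank column value.
    """
    if mode == "ZERO_VALID":
        # 0 is an ordinary value; a blank borrows the companion column's value.
        if begin is None and end is None:
            return None  # nothing to borrow from
        if begin is None:
            begin = end
        if end is None:
            end = begin
        return (begin, end)

    if mode == "ZERO_NULL":
        # A 0 or blank ending value means the interval ends where it begins.
        if end is None or end == 0:
            end = begin
        # Assumption: a 0 or blank beginning value is treated as missing data.
        if begin is None or begin == 0:
            return None
        return (begin, end)

    raise ValueError(f"unknown mode: {mode}")
```

Under this reading, the pair (7, 0) becomes the single-point interval (7, 7) in ZERO_NULL mode, but stays the literal pair (7, 0) in ZERO_VALID mode.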

Example

The following example illustrates interval-to-interval comparisons.

Assume that the interval from the data source is 19931023 to 19931031.

An interval from the reference source matches if it overlaps the data source interval, and does not match if the two intervals have no values in common.

  • 19931025 to 19931102, matches because 19931031 falls within the interval on the reference source
  • 19930901 to 19931225, matches because the interval from the data source falls within the interval on the reference source
  • 19930920 to 19931025, matches because 19931023 falls within the interval on the reference source
  • 19931030 to 19940123, matches because 19931031 falls within the interval on the reference source
  • 19930901 to 19930922, does not match because the interval from the data source does not overlap the interval on the reference source
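
The overlap rule behind these results can be sketched in Python as follows. This is a minimal sketch, assuming closed intervals (both endpoints are included) and values that compare correctly as integers, such as dates in YYYYMMDD form. The function name intervals_overlap is hypothetical, and the actual comparison produces match weights rather than a simple true or false value.

```python
from typing import Tuple

Interval = Tuple[int, int]


def intervals_overlap(data: Interval, reference: Interval) -> bool:
    """Return True if two closed intervals share at least one value."""
    data_begin, data_end = data
    ref_begin, ref_end = reference
    # Two closed intervals overlap when each one begins
    # no later than the other one ends.
    return data_begin <= ref_end and ref_begin <= data_end


data_interval = (19931023, 19931031)
reference_intervals = [
    (19931025, 19931102),
    (19930901, 19931225),
    (19930920, 19931025),
    (19931030, 19940123),
    (19930901, 19930922),
]

results = [intervals_overlap(data_interval, ref) for ref in reference_intervals]
for ref, matched in zip(reference_intervals, results):
    print(ref, "matches" if matched else "does not match")
```

The first four reference intervals are reported as matches and the last one as a non-match, in line with the list above. Full containment needs no separate test, because a contained interval always satisfies the overlap condition.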