z/OS DFSORT: Getting Started
Previous topic | Next topic | Contents | Contact z/OS | Library | PDF


Joining records

z/OS DFSORT: Getting Started
SC23-6880-00

Often, you have two data sets with common key fields and different data fields, and you want to join their records, that is, for records with matching keys, you want to create output records with some fields from one data set and some fields from the other data set. You might want a one record to one record join, a one record to many records join, a many records to one record join, or even a many records to many records join.

A JOINKEYS application helps you to perform various "join" applications on two data sets by one or more keys. You can do an inner join, full outer join, left outer join, right outer join and unpaired combinations. The two data sets can be of different types (fixed, variable, VSAM, and so on) and lengths, and have keys in different locations.

The records from the input data sets can be processed in a variety of ways before and after they are joined using most of the DFSORT control statements you learned about previously including SORT or COPY, INCLUDE or OMIT, INREC, OUTREC and OUTFIL.

Suppose you have two input data sets, REGION.IN1 and REGION.IN2 as shown in Table 1 and Table 2. REGION.IN1 has RECFM=FB and LRECL=35 and REGION.IN2 has RECFM=FB and LRECL=27, so their LRECLs are different.

Table 1. REGION.IN1 data set for JOINKEYS application
Region Headquarters Regional Director

1   5

6         20

21         35

East
West
North
South

Philadelphia
San Jose
Boston
Charlotte

C. Kent
B. Wayne
P. Parker
D. Prince

Table 2. REGION.IN2 data set for JOINKEYS application
Office Region Employees Evaluation Established

1  4

5   9

10  13

14      23

24  27

0001
0001
0002
0002
0001
0003
0002
0003
0001
0004

East
South
South
East
West
East
West
South
North
South

0050
0023
0068
0125
0052
0028
0105
0054
0200
0070

Fair
Good
Fair
Excellent
Good
Good
Excellent
Fair
Fair
Good

1983
1976
1978
1986
1995
1994
2001
1992
1991
2002

From these two input data sets, you want to create an output data set, REGION.OUT. For each record in REGION.IN2, you want to look up the corresponding Region in REGION.IN1, and combine fields from the two records into one output record in REGION.OUT, as shown in Table 3.

Table 3. REGION.OUT data set for JOINKEYS application
Office Region Regional Director Employees Evaluation Headquarters

1  4

5   9

10        24

25  28

29        38

39        53

0001
0002
0003
0001
0001
0002
0003
0004
0001
0002

East
East
East
North
South
South
South
South
West
West

C. Kent
C. Kent
C. Kent
P. Parker
D. Prince
D. Prince
D. Prince
D. Prince
B. Wayne
B. Wayne

0050
0125
0028
0200
0023
0068
0054
0070
0052
0105

Fair
Excellent
Good
Fair
Good
Fair
Fair
Good
Good
Excellent

Philadelphia
Philadelphia
Philadelphia
Boston
Charlotte
Charlotte
Charlotte
Charlotte
San Jose
San Jose

Write the following DFSORT JCL and control statements to use a JOINKEYS application to create REGION.OUT from REGION.IN1 and REGION.IN2.

//JN1 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//IN1 DD DSN=DSN=A123456.REGION.IN1,DISP=SHR
//IN2 DD DSN=DSN=A123456.REGION.IN2,DISP=SHR
//SORTOUT DD DSN=A123456.REGION.OUT,DISP=(NEW,CATLG,DELETE),UNIT=3390,
//  SPACE=(CYL,(5,5))
//SYSIN DD *
  JOINKEYS F1=IN1,FIELDS=(1,5,A)  F1 has ddname IN1 and key in 1-5
  JOINKEYS F2=IN2,FIELDS=(5,5,A)  F2 has ddname IN2 and key in 5-9
  REFORMAT FIELDS=(F2:1,4,  Office from F2
      F1:1,5,21,15,         Region and Regional Director from F1
      F2:10,4,14,10,        Employees and Evaluation from F2
      F1:6,15)              Headquarters from F1
  OPTION COPY               Copy joined records
/*

Two JOINKEYS statements are required: one for the F1 data set and another for the F2 data set. In this case, the first JOINKEYS statement identifies IN1 as the ddname for the F1 data set and indicates an ascending key (Region) in positions 1-5 of that data set. The second JOINKEYS statement identifies IN2 as the ddname for the F2 data set and indicates an ascending key (Region) is at positions 5-9 of that data set. Each key in the F1 data set must be of the same length and order (ascending or descending) as the corresponding key in the F2 data set, but does not have to be in the same location. The keys are always treated as unsigned binary (INREC can be used to "normalize" the keys in each data set before the records are joined, if necessary).

The F1 data set will be sorted by the key in positions 1-5. The F2 data set will be sorted by the key in positions 5-9. If the records in a data set are already sorted by the key, you can specify SORTED on the JOINKEYS statement to tell DFSORT to copy the records of that data set rather than sorting them.

Records with the same key in both data sets are joined and constructed as directed by the REFORMAT statement using F1: for fields from the F1 record and F2: for fields from the F2 record. This REFORMAT statement creates joined records from the following fields:
  • Output positions 1-4: Office from F2
  • Output positions 5-9: Region from F1
  • Output positions 10-24: Regional Director from F1
  • Output positions 25-28: Employees from F2
  • Output positions 29-38: Evaluation from F2
  • Output positions 39-53: Headquarters from F1

The resulting joined records are 53 bytes long and are copied to the SORTOUT data set (REGION.OUT).

If we wanted to sort the resulting joined records on the Headquarters and Office fields, we could replace the OPTION COPY statement with the following SORT statement:
  SORT FIELDS=(39,15,CH,A,1,4,CH,A)

The resulting sorted output records in REGION.OUT are shown in Table 4.

Table 4. REGION.OUT data set Sorted by Headquarters and Office
Office Region Regional Director Employees Evaluation Headquarters

1  4

5   9

10        24

25  28

29        38

39        53

0001
0001
0002
0003
0004
0001
0002
0003
0001
0002

North
South
South
South
South
East
East
East
West
West

P. Parker
D. Prince
D. Prince
D. Prince
D. Prince
C. Kent
C. Kent
C. Kent
B. Wayne
B. Wayne

0200
0023
0068
0054
0070
0050
0125
0028
0052
0105

Fair
Good
Fair
Fair
Good
Fair
Excellent
Good
Good
Excellent

Boston
Charlotte
Charlotte
Charlotte
Charlotte
Philadelphia
Philadelphia
Philadelphia
San Jose
San Jose

You can also use a JOINKEYS application to match records from two different input data sets in various ways. Suppose you have two input data sets, CITIES.IN1 and CITIES.IN2 as shown in Table 5 and Table 6.

Table 5. CITIES.IN1 data set
City State District

1     20

21         35

36    37

GILROY
GILROY
MORGAN HILL
PALO ALTO
PALO ALTO
PALO ALTO
SACRAMENTO
SAN JOSE
SAN JOSE
SAN MARTIN
AUSTIN
DALLAS
DALLAS
BARRE
BARRE
STOWE

CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
TEXAS
TEXAS
TEXAS
VERMONT
VERMONT
VERMONT

05
10
03
15
08
21
05
02
10
12
21
15
25
07
12
09

Table 6. CITIES.IN2 data set
State City

1         15

16         35

TEXAS
CALIFORNIA
CALIFORNIA
VERMONT
CALIFORNIA
TEXAS
VERMONT
CALIFORNIA
CALIFORNIA

AUSTIN
SACRAMENTO
GILROY
BURLINGTON
MODESTO
LAREDO
BARRE
LOS ANGELES
SAN JOSE

From these two data sets, you want to create a CITIES.OUT data set with the records for Cities that appear in CITIES.IN1, but not in CITIES.IN2, as shown in Table 7.

Table 7. CITIES.OUT data set
City State District

1     20

21         35

36    37

MORGAN HILL
PALO ALTO
PALO ALTO
PALO ALTO
SAN MARTIN
DALLAS
DALLAS
STOWE

CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
TEXAS
TEXAS
VERMONT

03
15
08
21
12
15
25
09

Write the following DFSORT JCL and control statements to use a JOINKEYS application to create CITIES.OUT from CITIES.IN1 and CITIES.IN2.
//JN2 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTJNF1 DD DSN=DSN=A123456.CITIES.IN1,DISP=SHR
//SORTJNF2 DD DSN=DSN=A123456.CITIES.IN2,DISP=SHR
//SORTOUT DD DSN=A123456.CITIES.OUT,DISP=OLD
//SYSIN DD *
* F1 keys are State and City - F1 is already sorted
* by those keys.
  JOINKEYS FILE=F1,FIELDS=(21,15,A,1,20,A),SORTED
* F2 keys are State and City 
  JOINKEYS FILE=F2,FIELDS=(1,15,A,16,20,A)
* Keep the records in F1 that do not have a match in F2
* for the State and City.
  JOIN UNPAIRED,F1,ONLY
  OPTION COPY
/*

The first JOINKEYS statement identifies SORTJNF1 as the ddname for the F1 data set and indicates ascending keys (State and City) in positions 21-35 and 1-20 of that data set. Since the F1 records are already in order by the State and City fields, SORTED is used to do a Copy of the F1 records rather than a Sort.

The second JOINKEYS statement identifies SORTJNF2 as the ddname for the F2 data set and indicates ascending keys (State and City) in positions 1-15 and 16-35 of that data set. Since the F2 records are not already in order by the State and City fields, SORTED is not used and a Sort in performed for the F2 records on the indicated keys.

JOIN UNPAIRED,F1,ONLY is used to restrict the output (SORTOUT) to the records in F1 that do not have matching keys in F2. Since we want the entire F1 record, we do not need a REFORMAT statement.

Now suppose we want to use the CITIES.IN1 and CITIES.IN2 data sets again, but this time we want to produce the following output from these two data sets:
  • BOTH.OUT: CALIFORNIA and TEXAS cities that appear in both CITIES.IN1 and CITIES.IN2.
  • F1ONLY.OUT: CALIFORNIA and TEXAS cities that appear in CITIES.IN1, but not in CITIES.IN2.
  • F2ONLY.OUT: CALIFORNIA and TEXAS cities that appear in CITIES.IN2, but not in CITIES.IN1.
Write the following DFSORT JCL and control statements to use a JOINKEYS application to create BOTH.OUT, F1ONLY.OUT and F2ONLY.OUT from CITIES.IN1 and CITIES.IN2.
//JN3 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//CITIES1 DD DSN=A123456.CITIES.IN1,DISP=SHR
//CITIES2 DD DSN=A123456.CITIES.IN2,DISP=SHR
//JNF1CNTL DD *                                            
  OMIT COND=(21,15,CH,EQ,C'VERMONT')                       
//JNF2CNTL DD *                                            
  OMIT COND=(1,15,CH,EQ,C'VERMONT') 
//BOTH DD DSN=A123456.BOTH.OUT,DISP=OLD
//F1ONLY DD DSN=A123456.F1ONLY.OUT,DISP=OLD
//F2ONLY DD DSN=A123456.F2ONLY.OUT,DISP=OLD
//SYSIN DD * 
  JOINKEYS F1=CITIES1,FIELDS=(21,15,A,1,20,A),SORTED          
  JOINKEYS F2=CITIES2,FIELDS=(1,15,A,16,20,A)                 
  JOIN UNPAIRED,F1,F2                                      
  REFORMAT FIELDS=(?,F1:1,37,F2:1,35)                      
  OPTION COPY                                              
  OUTFIL FNAMES=BOTH,INCLUDE=(1,1,CH,EQ,C'B'),     
    BUILD=(2,37)                                   
  OUTFIL FNAMES=F1ONLY,INCLUDE=(1,1,CH,EQ,C'1'),   
    BUILD=(2,37)                                   
  OUTFIL FNAMES=F2ONLY,INCLUDE=(1,1,CH,EQ,C'2'),   
    BUILD=(54,20,39,15)                            
/*

The first JOINKEYS statement identifies CITIES1 as the ddname for the F1 data set and indicates ascending keys (State and City) in positions 21-35 and 1-20 of that data set. The JNF1CNTL data set contains an OMIT statement to remove the VERMONT records from the F1 data set so they will not be joined. Since the F1 records are already in order by the State and City fields, SORTED is used to do a Copy rather than a Sort for the F1 data set.

The second JOINKEYS statement identifies CITIES2 as the ddname for the F2 data set and indicates ascending keys (State and City) in positions 1-15 and 16-35 of that data set. The JNF2CNTL data set contains an OMIT statement to remove the VERMONT records from the F2 data set so they will not be joined. Since the F2 records are not already in order by the State and City fields, SORTED is not used and a Sort in performed for the F2 data set.

Since we want to separate out the BOTH, F1ONLY and F2ONLY joined records, we use a JOIN statement with UNPAIRED,F1,F2 to keep the unpaired joined records as well as the paired join records. In the REFORMAT statement, we use ? as the first field to give us an indicator of whether each key was found in both records ('B' indicator), only in F1 ('1' indicator) or only in F2 ('2' indicator). After the indicator, we put positions 1-37 (City, State, District) from F1 and 1-35 (State, City) from F2. After the records are joined, they will look as shown in Table 8.

Table 8. Joined City records
Indicator F1 City F1 State F1 District F2 State F2 City

1  1

2   21

22        36

37  38

39        53

54        73

B
B
2
2
1
1
1
1
B
B
B
1
B
1
1
2

GILROY
GILROY


MORGAN HILL
PALO ALTO
PALO ALTO
PALO ALTO
SACRAMENTO
SAN JOSE
SAN JOSE
SAN MARTIN
AUSTIN
DALLAS
DALLAS

CALIFORNIA
CALIFORNIA


CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
TEXAS
TEXAS
TEXAS

05
10


03
15
08
21
05
02
10
12
21
15
25

CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA




CALIFORNIA
CALIFORNIA
CALIFORNIA

TEXAS


TEXAS

GILROY
GILROY
LOS ANGELES
MODESTO




SACRAMENTO
SAN JOSE
SAN JOSE

AUSTIN


LAREDO

Now we can use OUTFIL statements to create our three different output data sets. We write positions 2-38 (F1 City, F1 State, F1 District) of the joined records with an indicator of 'B' to positions 1-37 of the BOTH.OUT data set. We write positions 2-38 (F1 City, F1 State, F1 District) of the joined records with an indicator of '1' to positions 1-37 of the F1ONLY.OUT data set. We write positions 54-73 (F2 City) and 39-53 (F2 State) of the joined records with an indicator of '2' to positions 1-35 of the F2ONLY.OUT data set. The resulting records are shown in Table 9, Table 10 and Table 11.

Table 9. BOTH.OUT data set
City State District

1     20

21         35

36    37

GILROY
GILROY
SACRAMENTO
SAN JOSE
SAN JOSE
AUSTIN

CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
TEXAS

05
10
05
02
10
21

Table 10. F1ONLY.OUT data set
City State District

1     20

21         35

36    37

MORGAN HILL
PALO ALTO
PALO ALTO
PALO ALTO
SAN MARTIN
DALLAS
DALLAS

CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
CALIFORNIA
TEXAS
TEXAS

03
15
08
21
12
15
25

Table 11. F2ONLY.OUT data set
City State

1     20

21         35

LOS ANGELES
MODESTO
LAREDO

CALIFORNIA
CALIFORNIA
TEXAS

These are just a few of the many types of join operations you can do with JOINKEYS. See z/OS DFSORT Application Programming Guide for complete details of using the JOINKEYS, JOIN and REFORMAT statements along with the other DFSORT statements to perform different JOINKEYS applications.

Summary

This chapter covered the use of JOINKEYS, JOIN and REFORMAT to join records from two data sets.

The next chapter will cover methods of calling DFSORT from a program.

Go to the previous page Go to the next page




Copyright IBM Corporation 1990, 2014