SORT statement

The SORT statement causes a set of records or table elements to be arranged in a user-specified sequence.

For sorting files, the SORT statement accepts records from one or more files, sorts them according to the specified keys, and makes the sorted records available either through an output procedure or in an output file.

For sorting tables, the SORT statement sorts table elements according to specified table keys.

Format 1: SORT statement

SORT file-name-1
    { [ON] {ASCENDING | DESCENDING} [KEY] {data-name-1}... }...
    [ [WITH] DUPLICATES [IN] [ORDER] ]
    [ [COLLATING] SEQUENCE [IS] alphabet-name-1 ]
    { USING {file-name-2}...
    | INPUT PROCEDURE [IS] procedure-name-1 [ {THROUGH | THRU} procedure-name-2 ] }
    { GIVING {file-name-3}...
    | OUTPUT PROCEDURE [IS] procedure-name-3 [ {THROUGH | THRU} procedure-name-4 ] }

Format 1 SORT statements can appear anywhere in the PROCEDURE DIVISION except in the declarative portion. This format of the SORT statement is not supported for programs that are compiled with the THREAD option. See also MERGE statement.

Format 2: Table SORT statement

SORT data-name-2
    [ [ON] {ASCENDING | DESCENDING} [KEY] [data-name-1]... ]...
    [ [WITH] DUPLICATES [IN] [ORDER] ]
    [ [COLLATING] SEQUENCE [IS] alphabet-name-1 ]

Format 2 SORT statements can appear anywhere in the PROCEDURE DIVISION. This format of the SORT statement can be used with programs that are compiled with the THREAD option.

file-name-1
The name given in the SD entry that describes the records to be sorted.

No pair of file-names in a SORT statement can be specified in the same SAME SORT AREA clause or the SAME SORT-MERGE AREA clause. File-names associated with the GIVING clause (file-name-3, ...) cannot be specified in the SAME AREA clause; however, they can be associated with the SAME RECORD AREA clause.

data-name-2
Specifies a table data-name that is subject to the following rules:
  • data-name-2 must have an OCCURS clause in the data description entry.
  • data-name-2 can be qualified.
  • data-name-2 can be subscripted. The rightmost or only subscript of the table must be omitted or replaced with the word ALL.

The number of occurrences of table elements that are referenced by data-name-2 is determined by the rules in the OCCURS clause. The sorted table elements are placed in the same table that is referenced by data-name-2.

ASCENDING KEY and DESCENDING KEY phrases (format 1)

This phrase specifies that records are to be processed in ascending or descending sequence (depending on the phrase specified), based on the specified sort keys.

data-name-1
Specifies a KEY data item on which the SORT statement will be based. Each such data-name must identify a data item in a record associated with file-name-1. The data-names following the word KEY are listed from left to right in the SORT statement in order of decreasing significance without regard to how they are divided into KEY phrases. The leftmost data-name is the major key, the next data-name is the next most significant key, and so forth. The following rules apply:
  • A specific KEY data item must be physically located in the same position and have the same data format in each input file. However, it need not have the same data-name.
  • If file-name-1 has more than one record description, the KEY data items need be described in only one of the record descriptions.
  • If file-name-1 contains variable-length records, all of the KEY data items must be contained within the first n character positions of the record, where n equals the minimum record size specified for file-name-1.
  • KEY data items must not contain an OCCURS clause or be subordinate to an item that contains an OCCURS clause.
  • KEY data items cannot be:
    • Variably located
    • Group items that contain variable-occurrence data items
    • Category numeric described with usage NATIONAL (national decimal item)
    • Category external floating-point described with usage NATIONAL (national floating-point item)
    • Category DBCS
  • KEY data items can be qualified.
  • KEY data items can belong to any of the following data categories:
    • Alphabetic, alphanumeric, alphanumeric-edited
    • Numeric (except numeric with usage NATIONAL)
    • Numeric-edited (with usage DISPLAY or NATIONAL)
    • Internal floating-point or display floating-point
    • National or national-edited

If file-name-3 references an indexed file, the first specification of data-name-1 must be associated with an ASCENDING phrase, and the data item referenced by that data-name-1 must occupy the same character positions in its record as the data item associated with the prime record key for that file.

The direction of the sorting operation depends on the specification of the ASCENDING or DESCENDING keywords as follows:

  • When ASCENDING is specified, the sequence is from the lowest key value to the highest key value.
  • When DESCENDING is specified, the sequence is from the highest key value to the lowest.
  • If the KEY data item is described with usage NATIONAL, the sequence of the KEY values is based on the binary values of the national characters.
  • If the KEY data item is internal floating-point, the sequence of key values is in numeric order.
  • When the COLLATING SEQUENCE phrase is not specified, the key comparisons are performed according to the rules for comparison of operands in a relation condition. See General relation conditions.
  • When the COLLATING SEQUENCE phrase is specified, the indicated collating sequence is used for key data items of alphabetic, alphanumeric, alphanumeric-edited, external floating-point, and numeric-edited categories. For all other key data items, the comparisons are performed according to the rules for comparison of operands in a relation condition.
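
The key rules above can be illustrated with a minimal sketch. All names in it (SORTKEYS, IN-FILE, OUT-FILE, WORK-FILE, the ddnames, and the record fields) are hypothetical, and the 80-byte record layout is an assumption made only for illustration. The sketch lists three keys across two KEY phrases, so WK-REGION is the major key, WK-DEPT is the next most significant key, and WK-SALARY is the least significant key.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTKEYS.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * Hypothetical assignment names.
           SELECT IN-FILE   ASSIGN TO INDD.
           SELECT OUT-FILE  ASSIGN TO OUTDD.
           SELECT WORK-FILE ASSIGN TO SORTWK.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC             PIC X(80).
       FD  OUT-FILE.
       01  OUT-REC            PIC X(80).
      * The SD entry names the sort file (file-name-1).
       SD  WORK-FILE.
       01  WORK-REC.
           05  WK-REGION      PIC X(4).
           05  WK-DEPT        PIC X(3).
           05  WK-SALARY      PIC 9(7)V99.
           05  FILLER         PIC X(64).
       PROCEDURE DIVISION.
      * Keys are significant in left-to-right order, regardless
      * of how they are divided between the two KEY phrases.
           SORT WORK-FILE
               ON ASCENDING KEY WK-REGION WK-DEPT
               ON DESCENDING KEY WK-SALARY
               USING IN-FILE
               GIVING OUT-FILE
           GOBACK.
```

In this sketch, records are grouped by region and department in ascending order, and within each department the highest salary comes first.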

ASCENDING KEY and DESCENDING KEY phrases (format 2)

This phrase specifies that table elements are to be processed in ascending or descending sequence, based on the specified phrase and sort keys.

data-name-1
Specifies a KEY data name that is subject to the following rules:
  • The data item that is identified by a key data-name must be the same as, or subordinate to, the data item that is referenced by data-name-2.
  • KEY data items can be qualified.
  • KEY data items can belong to any of the following data categories:
    • Alphabetic, alphanumeric, alphanumeric-edited
    • Numeric (except numeric with usage NATIONAL)
    • Numeric-edited (with usage DISPLAY or NATIONAL)
    • Internal floating-point or display floating-point
    • National or national-edited
  • KEY data items cannot be:
    • Variably located
    • Group items that contain variable-occurrence data items
    • Category numeric that is described with usage NATIONAL (national decimal item)
    • Category external floating-point that is described with usage NATIONAL (national floating-point item)
    • Category DBCS
    • Class object or pointer
    • Subscripted
  • If the data item that is identified by a KEY data-name is subordinate to data-name-2, the following rules apply:
    • The data item cannot be described with an OCCURS clause.
    • The data item cannot be subordinate to an entry that is also subordinate to data-name-2 and that contains an OCCURS clause.

The KEY phrase can be omitted only if the description of the table that is referenced by data-name-2 contains a KEY phrase.

The words ASCENDING and DESCENDING are transitive across all occurrences of data-name-1 until another word ASCENDING or DESCENDING is encountered.

The data items that are referenced by data-name-1 are key data items, and these data items determine the order in which the sorted table elements are stored. The order of significance of the keys is the order in which data items are specified in the SORT statement, without regard to the association with ASCENDING or DESCENDING phrases.

The SORT statement sorts the table that is referenced by data-name-2 and presents the sorted table in data-name-2. The sorting order is determined by either the ASCENDING and DESCENDING phrases (if specified), or by the KEY phrase that is associated with data-name-2.

The direction of the sorting operation depends on the specification of the ASCENDING or DESCENDING keywords:
  • When ASCENDING is specified, the sequence is from the lowest key value to the highest one.
  • When DESCENDING is specified, the sequence is from the highest key value to the lowest one.
  • If the KEY data item is described with usage NATIONAL, the sequence of the KEY values is based on the binary values of the national characters.
  • If the KEY data item is internal floating-point, the sequence of key values is in numeric order.
  • When the COLLATING SEQUENCE phrase is not specified, the EBCDIC sequence is used for key data items of alphabetic, alphanumeric, alphanumeric-edited, external floating-point, and numeric-edited categories. For all the other key data items, the comparisons are performed according to the rules for comparison of operands in a relation condition.
  • When the COLLATING SEQUENCE phrase is specified, the indicated collating sequence is used for key data items of alphabetic, alphanumeric, alphanumeric-edited, external floating-point, and numeric-edited categories. For all the other key data items, the comparisons are performed according to the rules for comparison of operands in a relation condition.

To determine the relative order in which table elements are stored, the contents of corresponding key data items are compared according to the rules for comparison of operands in a relation condition. Sorting starts with the most significant key data item and applies the following rules:
  • If the contents of the corresponding key data items are not equal and the key is associated with the ASCENDING phrase, the table element that contains the key data item with the lower value has the lower occurrence number.
  • If the contents of the corresponding key data items are not equal and the key is associated with the DESCENDING phrase, the table element that contains the key data item with the higher value has the lower occurrence number.
  • If the contents of the corresponding key data items are equal, the determination is based on the contents of the next most significant key data item.

If the KEY phrase is not specified, the sequence is determined by the KEY phrase in the data description entry of the table that is referenced by data-name-2.

If the KEY phrase is specified, it overrides any KEY phrase specified in the data description entry of the table that is referenced by data-name-2.

If data-name-1 is omitted, the data item that is referenced by data-name-2 is the key data item.
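
The following sketch shows how the format 2 rules might look in practice. The program name, table names, and sample data are hypothetical. It sorts the same table twice: once with an explicit KEY phrase, which overrides the KEY phrase in the table description, and once with the KEY phrase omitted, so that the KEY phrase on the OCCURS clause determines the sequence.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TBLSORT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SCORE-TABLE.
      * SCORE-ENTRY plays the role of data-name-2.
           05  SCORE-ENTRY OCCURS 4 TIMES
                   ASCENDING KEY IS PLAYER-NAME.
               10  PLAYER-NAME   PIC X(8).
               10  PLAYER-SCORE  PIC 9(3).
       01  IDX                   PIC 9(2).
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE 'CAROL   070' TO SCORE-ENTRY (1)
           MOVE 'ALICE   095' TO SCORE-ENTRY (2)
           MOVE 'DAVE    082' TO SCORE-ENTRY (3)
           MOVE 'BOB     061' TO SCORE-ENTRY (4)
      * Explicit KEY phrase: order by score, highest first.
           SORT SCORE-ENTRY ON DESCENDING KEY PLAYER-SCORE
           PERFORM SHOW-TABLE
      * KEY phrase omitted: the ASCENDING KEY IS PLAYER-NAME
      * phrase in the table description is used instead.
           SORT SCORE-ENTRY
           PERFORM SHOW-TABLE
           GOBACK.
       SHOW-TABLE.
           PERFORM VARYING IDX FROM 1 BY 1 UNTIL IDX > 4
               DISPLAY PLAYER-NAME (IDX) ' ' PLAYER-SCORE (IDX)
           END-PERFORM.
```

After the first SORT the entries run from the highest score to the lowest; after the second they are in ascending order by name. In both cases the sorted elements are stored back into SCORE-ENTRY itself.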

DUPLICATES phrase (format 1)

If the DUPLICATES phrase is specified, and the contents of all the key elements associated with one record are equal to the corresponding key elements in one or more other records, the order of return of these records is as follows:

  • The order of the associated input files as specified in the SORT statement. Within a given file the order is that in which the records are accessed from that file.
  • The order in which these records are released by an input procedure, when an input procedure is specified.

If the DUPLICATES phrase is not specified, the order of these records is undefined.
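
As a minimal sketch of the format 1 DUPLICATES phrase, the program below sorts two input files on one key. The file names, ddnames, and record layout are hypothetical. With WITH DUPLICATES IN ORDER specified, records that have equal WK-CUST-ID values are returned with those from IN-FILE-A ahead of those from IN-FILE-B, and within each file in the order in which they were read.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTDUP.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * Hypothetical assignment names.
           SELECT IN-FILE-A ASSIGN TO INDDA.
           SELECT IN-FILE-B ASSIGN TO INDDB.
           SELECT OUT-FILE  ASSIGN TO OUTDD.
           SELECT WORK-FILE ASSIGN TO SORTWK.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE-A.
       01  IN-REC-A           PIC X(80).
       FD  IN-FILE-B.
       01  IN-REC-B           PIC X(80).
       FD  OUT-FILE.
       01  OUT-REC            PIC X(80).
       SD  WORK-FILE.
       01  WORK-REC.
           05  WK-CUST-ID     PIC X(6).
           05  FILLER         PIC X(74).
       PROCEDURE DIVISION.
      * Records with equal keys keep their input order:
      * IN-FILE-A records first, then IN-FILE-B records.
           SORT WORK-FILE
               ON ASCENDING KEY WK-CUST-ID
               WITH DUPLICATES IN ORDER
               USING IN-FILE-A IN-FILE-B
               GIVING OUT-FILE
           GOBACK.
```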

DUPLICATES phrase (format 2)

When both of the following conditions are met, the contents of the table elements keep the same relative order that they had before the sorting operation:
  • The DUPLICATES phrase is specified.
  • The contents of all the key data items that are associated with one table element are equal to the contents of corresponding key data items that are associated with one or more other table elements.

If the DUPLICATES phrase is not specified and the second condition exists, the relative order of the contents of these table elements is undefined.
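
A small sketch of the format 2 DUPLICATES phrase follows. The table, its field names, and the sample values are hypothetical. Two pairs of elements share the same ORD-PRIORITY value, so the DUPLICATES phrase decides their relative order.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TBLDUP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ORDER-TABLE.
           05  ORDER-ENTRY OCCURS 4 TIMES.
               10  ORD-PRIORITY  PIC 9.
               10  ORD-LABEL     PIC X(5).
       01  IDX                   PIC 9(2).
       PROCEDURE DIVISION.
           MOVE '2ALPHA' TO ORDER-ENTRY (1)
           MOVE '1BRAVO' TO ORDER-ENTRY (2)
           MOVE '2CHARL' TO ORDER-ENTRY (3)
           MOVE '1DELTA' TO ORDER-ENTRY (4)
      * Elements with equal ORD-PRIORITY values keep the
      * relative order they had before the sort.
           SORT ORDER-ENTRY
               ON ASCENDING KEY ORD-PRIORITY
               WITH DUPLICATES IN ORDER
           PERFORM VARYING IDX FROM 1 BY 1 UNTIL IDX > 4
               DISPLAY ORD-PRIORITY (IDX) ' ' ORD-LABEL (IDX)
           END-PERFORM
           GOBACK.
```

With the phrase specified, BRAVO stays ahead of DELTA and ALPHA stays ahead of CHARL, because that was their order before the sort. Without the phrase, the order within each pair would be undefined.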

COLLATING SEQUENCE phrase (both formats)

This phrase specifies the collating sequence to be used in alphanumeric comparisons for the KEY data items in this sorting operation.

The COLLATING SEQUENCE phrase has no effect for keys that are not alphabetic or alphanumeric.

alphabet-name-1
Must be specified in the ALPHABET clause of the SPECIAL-NAMES paragraph. alphabet-name-1 can be associated with any one of the ALPHABET clause phrases, with the following results:
STANDARD-1
The ASCII collating sequence is used for all alphanumeric comparisons. (The ASCII collating sequence is shown in EBCDIC and ASCII collating sequences.)
STANDARD-2
The International Reference Version of ISO/IEC 646, 7-bit coded character set for information processing interchange is used for all alphanumeric comparisons.
NATIVE
The EBCDIC collating sequence is used for all alphanumeric comparisons. (The EBCDIC collating sequence is shown in EBCDIC and ASCII collating sequences.)
EBCDIC
The EBCDIC collating sequence is used for all alphanumeric comparisons. (The EBCDIC collating sequence is shown in EBCDIC and ASCII collating sequences.)
literal
The collating sequence established by the specification of literals in the alphabet-name clause is used for all alphanumeric comparisons.

When the COLLATING SEQUENCE phrase is omitted, the PROGRAM COLLATING SEQUENCE clause (if specified) in the OBJECT-COMPUTER paragraph specifies the collating sequence to be used. When both the COLLATING SEQUENCE phrase and the PROGRAM COLLATING SEQUENCE clause are omitted, the EBCDIC collating sequence is used.
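
The sketch below shows one way the COLLATING SEQUENCE phrase might be used with a table sort. The alphabet-name ASCII-ORDER, the table names, and the sample values are hypothetical. The alphabet-name is defined in the SPECIAL-NAMES paragraph with the STANDARD-1 phrase, so the alphanumeric key is compared in ASCII sequence rather than in EBCDIC sequence.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COLLSORT.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      * ASCII-ORDER is a hypothetical alphabet-name.
           ALPHABET ASCII-ORDER IS STANDARD-1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CODE-TABLE.
           05  CODE-ENTRY OCCURS 3 TIMES.
               10  CODE-VALUE    PIC X(2).
       01  IDX                   PIC 9(2).
       PROCEDURE DIVISION.
           MOVE 'a1' TO CODE-ENTRY (1)
           MOVE 'A1' TO CODE-ENTRY (2)
           MOVE '11' TO CODE-ENTRY (3)
      * Compare the alphanumeric key in ASCII sequence.
           SORT CODE-ENTRY
               ON ASCENDING KEY CODE-VALUE
               COLLATING SEQUENCE IS ASCII-ORDER
           PERFORM VARYING IDX FROM 1 BY 1 UNTIL IDX > 3
               DISPLAY CODE-VALUE (IDX)
           END-PERFORM
           GOBACK.
```

In ASCII sequence digits collate before uppercase letters, which collate before lowercase letters, so the sorted order is 11, A1, a1. In EBCDIC sequence the order is reversed (a1, A1, 11), because lowercase letters collate before uppercase letters and digits collate last.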

USING phrase

file-name-2, ...
The input files.

When the USING phrase is specified, all the records in file-name-2, ..., (that is, the input files) are transferred automatically to file-name-1. At the time the SORT statement is executed, these files must not be open. The compiler opens, reads, makes records available, and closes these files automatically. If EXCEPTION/ERROR procedures are specified for these files, the compiler makes the necessary linkage to these procedures.

All input files must be described in FD entries in the DATA DIVISION.

If the USING phrase is specified and if file-name-1 contains variable-length records, the size of the records contained in the input files (file-name-2, ...) must be neither less than the smallest record nor greater than the largest record described for file-name-1. If file-name-1 contains fixed-length records, the size of the records contained in the input files must not be greater than the largest record described for file-name-1. For more information, see Describing the input to sorting or merging in the Enterprise COBOL Programming Guide.

INPUT PROCEDURE phrase

This phrase specifies the name of a procedure that is to select or modify input records before the sorting operation begins.

procedure-name-1
Specifies the first (or only) section or paragraph in the input procedure.
procedure-name-2
Identifies the last section or paragraph of the input procedure.

The input procedure can consist of any procedure needed to select, modify, or copy the records that are made available one at a time by the RELEASE statement to the file referenced by file-name-1. The range includes all statements that are executed as the result of a transfer of control by CALL, EXIT, GO TO, PERFORM, and XML PARSE statements in the range of the input procedure, as well as all statements in declarative procedures that are executed as a result of the execution of statements in the range of the input procedure. The range of the input procedure must not cause the execution of any MERGE, RETURN, or format 1 SORT statement.

If an input procedure is specified, control is passed to the input procedure before the file referenced by file-name-1 is sequenced by the SORT statement. The compiler inserts a return mechanism at the end of the last statement in the input procedure. When control passes the last statement in the input procedure, the records that have been released to the file referenced by file-name-1 are sorted.
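
As an illustration of an input procedure, the sketch below reads an input file itself and releases only selected records to the sort file. The file names, ddnames, record layouts, and the status value 'A' are hypothetical. Because the input procedure is used instead of the USING phrase, the program opens and closes IN-FILE explicitly.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTINP.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * Hypothetical assignment names.
           SELECT IN-FILE   ASSIGN TO INDD.
           SELECT OUT-FILE  ASSIGN TO OUTDD.
           SELECT WORK-FILE ASSIGN TO SORTWK.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC.
           05  IN-CUST-ID     PIC X(6).
           05  IN-STATUS      PIC X.
           05  FILLER         PIC X(73).
       FD  OUT-FILE.
       01  OUT-REC            PIC X(80).
       SD  WORK-FILE.
       01  WORK-REC.
           05  WK-CUST-ID     PIC X(6).
           05  FILLER         PIC X(74).
       WORKING-STORAGE SECTION.
       01  EOF-FLAG           PIC X VALUE 'N'.
           88  END-OF-INPUT   VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           SORT WORK-FILE
               ON ASCENDING KEY WK-CUST-ID
               INPUT PROCEDURE IS SELECT-ACTIVE
               GIVING OUT-FILE
           GOBACK.
      * Input procedure: release only records with status 'A'.
       SELECT-ACTIVE.
           OPEN INPUT IN-FILE
           PERFORM UNTIL END-OF-INPUT
               READ IN-FILE
                   AT END
                       SET END-OF-INPUT TO TRUE
                   NOT AT END
                       IF IN-STATUS = 'A'
                           RELEASE WORK-REC FROM IN-REC
                       END-IF
               END-READ
           END-PERFORM
           CLOSE IN-FILE.
```

When control passes the last statement of SELECT-ACTIVE, the released records are sorted and then written to OUT-FILE by the GIVING phrase.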

GIVING phrase

file-name-3, ...
The output files.

When the GIVING phrase is specified, all the sorted records in file-name-1 are automatically transferred to the output files (file-name-3, ...).

All output files must be described in FD entries in the DATA DIVISION.

If the output files (file-name-3, ...) contain variable-length records, the size of the records contained in file-name-1 must be neither less than the smallest record nor greater than the largest record described for the output files. If the output files contain fixed-length records, the size of the records contained in file-name-1 must not be greater than the largest record described for the output files. For more information, see Describing the output from sorting or merging in the Enterprise COBOL Programming Guide.

At the time the SORT statement is executed, the output files (file-name-3, ...) must not be open. For each of the output files, the execution of the SORT statement causes the following actions to be taken:

  • The processing of the file is initiated. The initiation is performed as if an OPEN statement with the OUTPUT phrase had been executed.
  • The sorted logical records are returned and written onto the file. Each record is written as if a WRITE statement without any optional phrases had been executed.

    For a relative file, the relative key data item for the first record returned contains the value '1'; for the second record returned, the value '2'. After execution of the SORT statement, the content of the relative key data item indicates the last record returned to the file.

  • The processing of the file is terminated. The termination is performed as if a CLOSE statement without optional phrases had been executed.

These implicit functions are performed such that any associated USE AFTER EXCEPTION/ERROR procedures are executed; however, the execution of such a USE procedure must not cause the execution of any statement manipulating the file referenced by, or accessing the record area associated with, file-name-3. On the first attempt to write beyond the externally defined boundaries of the file, any USE AFTER STANDARD EXCEPTION/ERROR procedure specified for the file is executed. If control is returned from that USE procedure or if no such USE procedure is specified, the processing of the file is terminated.

OUTPUT PROCEDURE phrase

This phrase specifies the name of a procedure that is to select or modify output records from the sorting operation.

procedure-name-3
Specifies the first (or only) section or paragraph in the output procedure.
procedure-name-4
Identifies the last section or paragraph of the output procedure.

The output procedure can consist of any procedure needed to select, modify, or copy the records that are made available one at a time by the RETURN statement in sorted order from the file referenced by file-name-1. The range includes all statements that are executed as the result of a transfer of control by CALL, EXIT, GO TO, PERFORM, and XML PARSE statements in the range of the output procedure. The range also includes all statements in declarative procedures that are executed as a result of the execution of statements in the range of the output procedure. The range of the output procedure must not cause the execution of any MERGE, RELEASE, or format 1 SORT statement.

If an output procedure is specified, control passes to it after the file referenced by file-name-1 has been sequenced by the SORT statement. The compiler inserts a return mechanism at the end of the last statement in the output procedure. When control passes the last statement in the output procedure, the return mechanism terminates the sort and then passes control to the next executable statement after the SORT statement. Before entering the output procedure, the sort procedure reaches a point at which it can select the next record in sorted order when requested. The RETURN statements in the output procedure are the requests for the next record.

The INPUT PROCEDURE and OUTPUT PROCEDURE phrases are similar to those for a basic PERFORM statement. For example, if you name a procedure in the OUTPUT PROCEDURE phrase, that procedure is executed during the sorting operation just as if it were named in a PERFORM statement. As with the PERFORM statement, execution of the procedure is terminated after the last statement completes execution. The last statement in an input or output procedure can be the EXIT statement (see EXIT statement).
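
An output procedure could be sketched as follows. The file names, ddnames, record layout, and the threshold of 1000 are hypothetical. The procedure requests each sorted record with a RETURN statement and writes only the records it selects, so the program opens and closes OUT-FILE explicitly instead of using the GIVING phrase.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTOUT.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * Hypothetical assignment names.
           SELECT IN-FILE   ASSIGN TO INDD.
           SELECT OUT-FILE  ASSIGN TO OUTDD.
           SELECT WORK-FILE ASSIGN TO SORTWK.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC             PIC X(80).
       FD  OUT-FILE.
       01  OUT-REC            PIC X(80).
       SD  WORK-FILE.
       01  WORK-REC.
           05  WK-CUST-ID     PIC X(6).
           05  WK-AMOUNT      PIC 9(7)V99.
           05  FILLER         PIC X(65).
       WORKING-STORAGE SECTION.
       01  EOF-FLAG           PIC X VALUE 'N'.
           88  END-OF-SORT    VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           SORT WORK-FILE
               ON ASCENDING KEY WK-CUST-ID
               USING IN-FILE
               OUTPUT PROCEDURE IS WRITE-LARGE
           GOBACK.
      * Output procedure: each RETURN requests the next record
      * in sorted order; only larger amounts are written.
       WRITE-LARGE.
           OPEN OUTPUT OUT-FILE
           PERFORM UNTIL END-OF-SORT
               RETURN WORK-FILE
                   AT END
                       SET END-OF-SORT TO TRUE
                   NOT AT END
                       IF WK-AMOUNT > 1000
                           WRITE OUT-REC FROM WORK-REC
                       END-IF
               END-RETURN
           END-PERFORM
           CLOSE OUT-FILE.
```

When control passes the last statement of WRITE-LARGE, the sort ends and execution continues with the GOBACK statement that follows the SORT statement.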