cut — Cut out selected fields from each line of a file

Format

cut -b list [-Bn] [-W option[,option]...] [file...]
cut -c list [-B] [-W option[,option]...] [file...]
cut -f list [-d char] [-Bs] [-W option[,option]...] [file...]

Description

cut reads input from files, each specified with the file argument, and selectively copies sections of the input lines to the standard output (stdout). If you do not specify any file, or if you specify a file named –, cut reads from standard input (stdin).
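
As a minimal sketch of the two ways to read standard input described above, the following uses a small, hypothetical tab-separated sample (the variable names are illustrative only) and runs cut once with no file operand and once with the operand -:

```shell
# Hypothetical sample data: two tab-separated fields per line.
sample=$(printf 'one\tuno\ntwo\tdos')

# No file operand: cut reads standard input.
from_stdin=$(printf '%s\n' "$sample" | cut -f 1)

# A file operand of "-" also means standard input.
from_dash=$(printf '%s\n' "$sample" | cut -f 1 -)

printf '%s\n' "$from_stdin" "$from_dash"
```

Both invocations should select the same first field from each line.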

Options

–B

Disables the automatic conversion of tagged files. This option is ignored if the filecodeset or pgmcodeset options (-W option) are specified.

–b list

Invokes byte-position mode. The list that follows names the byte positions to display. It can contain multiple byte positions separated by commas (,) or blanks, and ranges of positions joined by a hyphen (-). Because the list must be a single argument, quote it for the shell if it contains blanks. You can combine individual positions and ranges to select any byte positions of the input.

Guideline: When using the –b option with double-byte characters, you must also specify the –n option if you want to ensure that entire characters are displayed. If you do not specify the –n option, cut assumes that the low byte of a range is the first byte of a character and that the high byte of a range is the last byte of a double-byte character, possibly resulting in the misinterpretation of the characters represented by those byte positions.

–c list

Invokes character-position mode. The list that follows names the character positions to retain in the output. It can contain multiple character positions separated by commas (,) or blanks, and ranges of positions joined by a hyphen (-). Because the list must be a single argument, quote it for the shell if it contains blanks. You can combine individual positions and ranges to select any character positions of the input.
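
The list syntax can be illustrated with a short sketch. The input string and variable names here are hypothetical, and the input is plain single-byte text, so byte and character positions coincide:

```shell
line='abcdefghij'

# Comma-separated positions and a range: characters 1, 3, and 5 through 7.
picked=$(printf '%s\n' "$line" | cut -c 1,3,5-7)

# The same list separated by blanks must be quoted so that the shell
# passes it to cut as a single argument.
picked_blank=$(printf '%s\n' "$line" | cut -c '1 3 5-7')

printf '%s\n' "$picked" "$picked_blank"
```

Both forms are expected to select the same characters; only the quoting differs.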

–d char

Specifies char as the character that separates fields in the input data; by default, this is the horizontal tab.

–f list

Invokes field delimiter mode. After this comes a list of the fields you want to display. You specify ranges of fields and multiple field numbers in the same way you specify ranges of character positions and multiple character positions in –c mode.
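
A minimal sketch of field mode with a non-default delimiter follows. The record is a made-up, passwd-style line and the variable names are illustrative only:

```shell
# Hypothetical colon-separated record.
record='alice:x:1000:100:Alice:/home/alice:/bin/sh'

# Select fields 1 and 5 through 7, using ":" instead of the default tab.
fields=$(printf '%s\n' "$record" | cut -d : -f 1,5-7)

printf '%s\n' "$fields"
```

The selected fields are written joined by the same delimiter character that separated them in the input.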

–n

Does not split characters. If the low byte in a selected range is not the first byte of a character, cut extends the range downward to include the entire character; if the high byte in a selected range is not the last byte of a character, cut limits the range to include only the last entire character before the high byte selected. If –n is selected, cut does not list ranges that do not encompass an entire character, and these ranges do not cause an error.

–s

Does not display lines that do not contain a field separator character. Normally, cut displays lines that do not contain a field separator character in their entirety.
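
The effect of –s can be sketched by running the same field selection with and without it. The sample input below is hypothetical; its second line deliberately contains no tab:

```shell
# Hypothetical input: the second line has no field separator.
input=$(printf 'key1\tvalue1\nno separator here\nkey2\tvalue2')

# Without -s, the line that lacks a separator is copied through whole.
without_s=$(printf '%s\n' "$input" | cut -f 2)

# With -s, that line is dropped from the output.
with_s=$(printf '%s\n' "$input" | cut -s -f 2)

printf 'without -s:\n%s\n' "$without_s"
printf 'with -s:\n%s\n' "$with_s"
```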

-W option[,option]...

Specifies z/OS-specific options. The option keywords are case-sensitive. Possible options are:

filecodeset=codeset

Performs text conversion from one code set to another when reading from the file. The coded character set of the file is codeset. codeset can be a code set name known to the system or a numeric coded character set identifier (CCSID). Note that the command iconv -l lists existing CCSIDs along with their corresponding code set names. The filecodeset and pgmcodeset options can be used on files with any file tag.

If pgmcodeset is specified but filecodeset is omitted, then the default file code set is ISO8859-1 even if the file is tagged with a different code set. If neither filecodeset nor pgmcodeset is specified, text conversion will not occur unless automatic conversion is enabled or the _TEXT_CONV environment variable indicates text conversion. For more information about text conversion, see Controlling text conversion for z/OS UNIX shell commands.

If filecodeset or pgmcodeset is specified, then automatic conversion is disabled for this command invocation and the -B option is ignored if it is also specified. See z/OS UNIX System Services Planning for more information about automatic conversion.

When specifying values for filecodeset, use the values that Unicode Services supports. For more information about supported code sets, see z/OS Unicode Services User's Guide and Reference.

pgmcodeset=codeset

Performs text conversion from one code set to another when reading from the file. The coded character set of the program (command) is codeset. codeset can be a code set name known to the system or a numeric coded character set identifier (CCSID). Note that the command iconv -l lists existing CCSIDs along with their corresponding code set names. The filecodeset and pgmcodeset options can be used on files with any file tag.

If filecodeset is specified but pgmcodeset is omitted, then the default program code set is IBM-1047. If neither filecodeset nor pgmcodeset is specified, text conversion will not occur unless automatic conversion is enabled or the _TEXT_CONV environment variable indicates text conversion. For more information about text conversion, see Controlling text conversion for z/OS UNIX shell commands.

Restriction: The only supported values for pgmcodeset are IBM-1047 and 1047.

Examples

To print a list of the dates on which the files in the working directory were last modified, together with their file names (the column positions shown depend on the layout of the ls output on your system):
```
ls -al | cut -c 42-48,54-66
```
To display the first field of each line of a file containing ASCII characters to the standard output (stdout), assuming that
- The text file is untagged and you do not want to tag it or enable automatic conversion, and
- You cannot alter the tag (for example, you are displaying an untagged public text file or a read-only text file)
then issue:
```
cut -f 1 -W filecodeset=ISO8859-1,pgmcodeset=IBM-1047 myAsciiFile
```
To display the second byte of each line of a file containing EBCDIC characters to the standard output (stdout), assuming that automatic conversion has been enabled but the text file is incorrectly tagged as UTF-8:
```
cut -b 2 -B myMisTaggedFile
```

Localization

cut uses the following localization environment variables:

LANG
LC_ALL
LC_CTYPE
LC_MESSAGES
NLSPATH

See Localization for more information.

Environment variables

cut uses the following environment variable:

_TEXT_CONV: Contains text conversion information for the command. The text conversion information is not used when either the -B option or the filecodeset or pgmcodeset option (-W option) is specified. For more information about text conversion, see Controlling text conversion for z/OS UNIX shell commands.

Exit values

0

Successful completion

1

Failure due to any of the following reasons:

Cannot open the input file
Out of memory
The code set is not valid
Could not turn off automatic conversion
Could not perform requested text conversion

2

Failure due to any of the following reasons:

An incorrect command-line argument
You did not specify any of –b, –c, or –f
You omitted the list argument
Badly formed list argument
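
A script can branch on these exit values. The following is a rough sketch only: it provokes a usage error by omitting all of –b, –c, and –f, and the messages are illustrative. The mapping of values to causes is taken from the list above; other implementations of cut may report usage errors with a different nonzero value.

```shell
# Run cut without -b, -c, or -f and capture its exit status.
status=0
printf 'a\tb\n' | cut >/dev/null 2>&1 || status=$?

case $status in
  0) echo "cut completed successfully" ;;
  1) echo "cut failed: input, memory, or text conversion problem" ;;
  2) echo "cut failed: incorrect or missing command-line argument" ;;
  *) echo "cut failed with unexpected status $status" ;;
esac
```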

Portability

POSIX.2, X/Open Portability Guide, UNIX System V.

The –B and -W options are extensions of the POSIX standard.

Related information

paste, uname