sed - Start the sed noninteractive stream editor

Format

sed [-BEn] [-W option[,option] ...] script [file ...]
sed [-BEn] [-e script] ... [-f scriptfile] ... [-W option[,option] ... ] [file ...]

Description

The sed command applies a set of editing subcommands that are contained in script to each argument input file.

If more than one file is specified, they are concatenated and treated as a single large file. script is the arguments of all -e and -f options and the contents of all script files. You can specify multiple -e and -f options; commands are added to script in the order specified.

If you do not specify file, sed reads the standard input.

sed reads each input line into a special area that is known as the pattern buffer. Certain subcommands [gGhHx] use a second area called the hold buffer. By default, after each pass through the script, sed writes the final contents of the pattern buffer to the standard output.
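As a minimal sketch of how multiple -e scripts are combined in the order given, the following pipeline applies two substitutions to each input line. The input text is made up for illustration.

```shell
# The two -e scripts form one script, applied in order to every line,
# so the second substitution sees the result of the first.
out=$(printf 'one\ntwo\n' | sed -e 's/one/two/' -e 's/two/three/')
printf '%s\n' "$out"
```

Both input lines end up as three: the first line is rewritten twice, the second once.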

Options

-B

Disables the automatic conversion of tagged files. This option is ignored if the filecodeset or pgmcodeset options (-W option) are specified.

-E

Uses extended regular expressions. Normally, sed uses basic regular expressions.

-e script

Adds the editing subcommands script to the end of the script.

-f scriptfile

Adds the subcommands in the file scriptfile to the end of the script.

-n

Suppresses all output except that generated by explicit subcommands in the sed script [acilnpPr].

-W option[,option]...

Specifies options that are specific to z/OS. The option keywords are case-sensitive. Possible options are:

filecodeset=codeset

Performs text conversion from one code set to another when reading from the file. The coded character set of the file is codeset. codeset can be a code set name known to the system or a numeric coded character set identifier (CCSID). Note that the command iconv -l lists existing CCSIDs along with their corresponding code set names. The filecodeset and pgmcodeset options can be used on files with any file tag.

If pgmcodeset is specified but filecodeset is omitted, then the default file code set is ISO8859-1 even if the file is tagged with a different code set. If neither filecodeset nor pgmcodeset is specified, text conversion will not occur unless automatic conversion is enabled or the _TEXT_CONV environment variable indicates text conversion. For more information about text conversion, see Controlling text conversion for z/OS UNIX shell commands.

If filecodeset or pgmcodeset is specified, then automatic conversion is disabled for this command invocation and the -B option is ignored if it is also specified. For more information about automatic conversion, see Converting files between code pages in z/OS UNIX System Services Planning.

When specifying values for filecodeset, use the values that Unicode Service supports.

pgmcodeset=codeset

Performs text conversion from one code set to another when reading from the file. The coded character set of the program (command) is codeset. codeset can be a code set name known to the system or a numeric coded character set identifier (CCSID). Note that the command iconv -l lists existing CCSIDs along with their corresponding code set names. The filecodeset and pgmcodeset options can be used on files with any file tag.

If filecodeset is specified but pgmcodeset is omitted, then the default program code set is IBM-1047. If neither filecodeset nor pgmcodeset is specified, text conversion will not occur unless automatic conversion is enabled or the _TEXT_CONV environment variable indicates text conversion. For more information about text conversion, see Controlling text conversion for z/OS UNIX shell commands.

Restriction: The only supported values for pgmcodeset are IBM-1047 and 1047.

Note:

If the source string has non-convertible characters, the command is terminated. In this case, set the environment variable _BPXK_UNICODE_SUB to YES to specify the substitution action for the non-convertible characters. For more information about the environment variable, see Commonly used environment variables in z/OS UNIX System Services Planning.

If you need only one script argument, you can omit the -e and use the first form of the command.

sed subcommands are similar to those of the interactive text editor ed, except that sed subcommands necessarily view the input text as a stream rather than as a directly addressable file.

Each line of a sed script consists of one or more editing commands. The commands can be preceded by either semicolons or blanks, or both. Each editing command contains up to two addresses, a single letter command, and possible command arguments. The last editing command is followed with a terminating newline. The newline is optional in script strings that are typed on the command line.

[addr[,addr]] command [arguments]

Subcommands

Script subcommands can begin with zero, one, or two addresses, as in ed.

Zero-address subcommands refer to every input line.
One-address subcommands select only those lines matching that address.
Two-address subcommands select the input line ranges starting with a match on the first address up to an input line matching the second address, inclusive. If the second address is a number less than or equal to the line number first selected, only one line is selected.

Permissible addressing constructions are:

n: The number n matches only the nth input line.
$: This address matches the last input line.
/regexp/: This address selects an input line that matches the specified regular expression regexp. If you do not want to use slash (/) characters around the regular expression, use a different character (but not backslash or newline) and put a backslash (\) before the first one. For example, if you want to use % to enclose the regular expression, write \%regexp%.
If a regexp is empty (that is, no pattern is specified), sed behaves as if the last regexp used in the last command applied (either as an address or as part of a substitute command) was specified.
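The addressing forms above can be sketched as follows. The four-line input and the variable names are invented for this illustration, and -n with p is used so that only the selected lines are printed.

```shell
input=$(printf 'alpha\nbeta\ngamma\ndelta\n')

# Numeric address: select only the 2nd input line.
second=$(printf '%s\n' "$input" | sed -n '2p')

# $ address: select only the last input line.
last=$(printf '%s\n' "$input" | sed -n '$p')

# Two addresses: from the first match of /beta/ through line 3, inclusive.
range=$(printf '%s\n' "$input" | sed -n '/beta/,3p')

# Alternate delimiter: \% opens the regexp and % closes it.
alt=$(printf '%s\n' "$input" | sed -n '\%gam%p')

# Empty regexp in s reuses the last regexp applied (here, /eta/).
reuse=$(printf '%s\n' "$input" | sed -n '/eta/s//ETA/p')

printf '%s\n' "$second" "$last" "$range" "$alt" "$reuse"
```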

A command can be preceded by a '!' character, in which case the command is applied if the addresses do not select the pattern space. When the variable _UNIX03=YES is set, one or more '!' characters are allowed, but a '!' character must not be followed by <blank>s. When the variable _UNIX03 is unset or is not set to YES, only one '!' character is allowed, and it must not be followed by a <blank>.
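For instance, a sketch of negating an address with '!' (the input lines are invented; no blank follows the '!'):

```shell
# Delete every line that does NOT match /keep/.
out=$(printf 'keep 1\ndrop\nkeep 2\n' | sed '/keep/!d')
printf '%s\n' "$out"
```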

The following sed subcommand summary shows the subcommands with the maximum number of legitimate addresses. A subcommand can be given fewer than the number of addresses specified, but not more. A subcommand with the form [a] command supports up to one address and a subcommand with the form [a[,b]] command supports up to two addresses. All other subcommands do not support any addresses.

[a]a\

Appends subsequent text lines from the script to the standard output. sed writes the text after completing all other script operations for that line and before reading the next record. Text lines are ended by the first line that does not end with a backslash (\). Characters after the backslash (\) must be inserted on a new line after you press Enter. sed does not treat the \ characters on the end of lines as part of the text.

[a[,b]]b [label]

Branches to :label. If you omit label, sed branches to the end of the script.

[a[,b]]c\

Changes the addressed lines by deleting the contents of the pattern buffer (input line) and sending subsequent text (similar to the a command) to the standard output. When you specify two addresses, sed delays text output until the final line in the range of addresses; otherwise, the behavior would surprise many users. The rest of the script is skipped for each addressed line except the last.

[a[,b]]d

Deletes the contents of the pattern buffer (input line) and restarts the script with the next input line.

[a[,b]]D

Deletes the pattern buffer only up to and including the first newline. Then it restarts the script from the beginning and applies it to the text left in the pattern buffer.

[a[,b]]g

Grabs a copy of the text in the hold buffer and places it in the pattern buffer, overwriting the original contents.

[a[,b]]G

Grabs a copy of the text in the hold buffer and appends it to the end of the pattern buffer after appending a newline.

[a[,b]]h

Holds a copy of the text in the pattern buffer by placing it in the hold buffer, overwriting its original contents.

[a[,b]]H

Holds a copy of the text in the pattern buffer by appending it to the end of the hold buffer after appending a newline.
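One common way to see the hold buffer at work is to reverse the order of the input lines. This is a sketch using G and h with made-up input, not the only way to do it.

```shell
# 1!G : on every line except the first, append the hold buffer to the line
# h   : save the accumulated text back into the hold buffer
# $p  : on the last line, print the accumulated (reversed) text
out=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')
printf '%s\n' "$out"
```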

[a]i\

Inserts text. This subcommand is similar to the a subcommand, except that its text is output immediately.
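The difference in timing between i and a can be sketched with a script that uses both on the same address. The input and the inserted text are invented for illustration.

```shell
# i\ writes its text before the addressed line is output;
# a\ queues its text until the end of the cycle for that line.
out=$(printf 'aaa\nbbb\nccc\n' | sed '/bbb/i\
before
/bbb/a\
after')
printf '%s\n' "$out"
```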

[a[,b]]l

Lists the pattern buffer (input line) to the standard output so that nonprintable characters are visible. The end-of-line is represented by $, and the characters \\, \a, \b, \f, \r, \t, and \v are printed as escape sequences. Each byte of a nonprintable double-byte character appears as an escape sequence or as a 3-digit octal number. This subcommand is analogous to the l subcommand in ed.

sed folds long lines to suit the output device, indicating the point of folding with a backslash (\).
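A short sketch of the l subcommand on a line that contains a tab character (the input is made up):

```shell
# -n suppresses the default output, so only the l listing is written.
# The tab appears as the escape sequence \t and the end of line as $.
out=$(printf 'a\tb\n' | sed -n 'l')
printf '%s\n' "$out"
```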

[a[,b]]n

Prints the pattern space on standard output if the default printing of the pattern space is not suppressed (because of the -n option). The next line of input is then read, and the processing of the line continues from the location of the n command in the script.
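As an illustration, n can be combined with p to print every second line. The numbered input lines are invented for this sketch.

```shell
# With -n, the n subcommand prints nothing; it only reads the next line.
# p then prints that line, so lines 2 and 4 are written.
out=$(printf '1\n2\n3\n4\n' | sed -n 'n;p')
printf '%s\n' "$out"
```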

[a[,b]]N

Appends the next line of input to the end of the pattern buffer, using a newline to separate the appended material from the original. The current line number changes.

[a[,b]]p

Prints the text in the pattern buffer to the standard output. The -n option does not disable this form of output. If you do not use -n, the pattern buffer is printed twice.
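The interaction between p and -n can be sketched as follows, using made-up input:

```shell
# With -n, only the explicit p output appears.
with_n=$(printf 'x\ny\n' | sed -n '/x/p')

# Without -n, the matching line is written twice: once by p and once by
# the default output at the end of the cycle.
without_n=$(printf 'x\ny\n' | sed '/x/p')

printf '%s\n' "$with_n" "$without_n"
```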

[a[,b]]P

Operates like the p subcommand, except that it prints the text in the pattern buffer only up to and including the first newline character.
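The N, P, and D subcommands are often used together to work on a sliding two-line window. The following sketch, with invented input, removes consecutive duplicate lines.

```shell
# $!N : append the next line, unless this is the last line
# /^\(.*\)\n\1$/!P : print the first line unless both lines are identical
# D   : delete through the first newline and rerun the script on the rest
out=$(printf 'a\na\nb\nb\nb\nc\n' | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$out"
```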

[a]q

Quits sed, skipping the rest of the script and reading no more input lines.
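For example, a sketch that stops after the second input line (the input is made up). Because -n is not in effect, the line on which q runs is still written before sed exits.

```shell
# Print the first two lines, then quit without reading more input.
out=$(printf '1\n2\n3\n4\n' | sed '2q')
printf '%s\n' "$out"
```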

[a]r file

Reads text from file and writes it to the standard output before it reads the next input line. The text conversion specified for the sed command (for example, the -B and -W options) is used. The timing of this operation is the same as for the a subcommand. If file does not exist or cannot be read, sed treats it as an empty file.

[a[,b]]s/reg/sub/[gpn] [w file]

Substitutes the new text string sub for text matching the regular expression, reg. Normally, the s subcommand replaces only the first such matching string in each input line. You can use any single printable character other than space or newline instead of the slash (/) to delimit reg and sub. The delimiter itself may appear as a literal character in reg or sub if you precede it with a backslash (\). You can omit the trailing delimiter.

If an ampersand (&) appears in sub, sed replaces it with the string matching reg. The characters \d, where d is a digit, are replaced by the text that is matched by the corresponding back-reference expression. A \n in reg matches an embedded newline in the pattern buffer (resulting, for example, from an N subcommand). The subcommand can be followed by a combination of the following:

n: Substitutes only the nth occurrence of reg.
g: Replaces all non-overlapping occurrences of reg rather than the default first occurrence. If both g and n are specified, the last one specified takes precedence.
p: Executes the print (p) subcommand only if a successful substitution occurs.
w file: Writes the contents of the pattern buffer to the end of file, if a substitution occurs. The text conversion that is specified for the sed command (for example, the -B and -W options) is used. When the variable _UNIX03=YES is set, the file must be preceded with one or more <blank>s. When the variable _UNIX03 is unset or is not set to YES, zero <blank> separation between w and file is allowed.
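The substitution forms described above can be sketched as follows. The sample strings and variable names are invented for illustration.

```shell
line='cake cake cake'

# Default: only the first match is replaced.
first=$(printf '%s\n' "$line" | sed 's/cake/pie/')

# g flag: every non-overlapping match is replaced.
all=$(printf '%s\n' "$line" | sed 's/cake/pie/g')

# Numeric flag: only the 2nd match is replaced.
second=$(printf '%s\n' "$line" | sed 's/cake/pie/2')

# & in the replacement stands for the matched text.
amp=$(printf '%s\n' "$line" | sed 's/cake/[&]/')

# \1 and \2 refer to the first and second \( ... \) groups.
swap=$(printf 'key=value\n' | sed 's/\(.*\)=\(.*\)/\2=\1/')

# A different delimiter avoids escaping slashes in path names.
alt=$(printf '/usr/bin\n' | sed 's%/usr%/opt%')

printf '%s\n' "$first" "$all" "$second" "$amp" "$swap" "$alt"
```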

[a[,b]]t [label]

Branches to the indicated label if a successful substitution occurred since either reading the last input line or running the last t subcommand. If you do not specify label, sed branches to the end of the script.
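A sketch of a loop built from a label and t: the substitution is repeated until it no longer succeeds. The label name a and the input are made up, and each subcommand is passed in its own -e so that the label ends at the end of its script fragment.

```shell
# Repeatedly replace two spaces with one until no pair of spaces remains.
out=$(printf 'a    b\n' | sed -e ':a' -e 's/  / /' -e 'ta')
printf '%s\n' "$out"
```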

[a[,b]]w file

Writes the text in the pattern buffer to the end of file. The text conversion specified for the sed command (for example, the -B and -W options) is used.
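As a sketch, the w subcommand can copy selected lines to a separate file while the input is processed. The file name sed_w_demo.txt and the input lines are hypothetical; the file is created in the current directory and removed afterward.

```shell
# Write only the lines that match /ERROR/ to sed_w_demo.txt.
# -n suppresses the normal output to standard output.
printf 'ok 1\nERROR 2\nok 3\n' | sed -n '/ERROR/w sed_w_demo.txt'
saved=$(cat sed_w_demo.txt)
rm -f sed_w_demo.txt
printf '%s\n' "$saved"
```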

[a[,b]]x

Exchanges the text in the hold buffer with that in the pattern buffer.

[a[,b]]y/set1/set2/

Transliterates any input character occurring in set1 to the corresponding element of set2. The sets must be the same length. You can use any character other than backslash or newline instead of the slash to delimit the strings.

If the variable _UNIX03=YES is set and a backslash followed by an 'n' appear in set1 or set2, the two characters are handled as a single newline character. If the variable _UNIX03 is unset or is not set to YES, the two characters are handled as a single character 'n'.

If the delimiter is not n, within set1 and set2, the delimiter itself can be used as a literal character if it is preceded by a backslash. If a backslash character is immediately followed by a backslash character in set1 or set2, the two backslash characters are counted as a single literal backslash character.
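For example, a sketch that maps three lowercase letters to uppercase (the input is invented). Each character in set1 is replaced by the character at the same position in set2.

```shell
out=$(printf 'aabbcc cab\n' | sed 'y/abc/ABC/')
printf '%s\n' "$out"
```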

[a[,b]]{

Groups all commands until the next matching } subcommand, so that sed runs the entire group only if the { subcommand is selected by its addresses.
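A sketch of grouping: both substitutions run only for lines in the range from BEGIN to END. The marker words and input are made up, and each subcommand is placed on its own line of the script.

```shell
out=$(printf 'x\nBEGIN\nx\nEND\nx\n' | sed '/BEGIN/,/END/{
s/x/y/
s/$/!/
}')
printf '%s\n' "$out"
```

Lines outside the range pass through unchanged, so the first and last x are left as they are.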

:label

Designates a label, which can be the destination of a b or t subcommand.

#

Treats the script line as a comment unless it is the first line in the script. Including the first line in a script as #n is equivalent to specifying -n on the command line. An empty script line is also treated as a comment.

[a]=

Writes the decimal value of the current line number to the standard output.
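For instance, combining = with the $ address and -n gives a sketch of counting input lines (the input is made up):

```shell
# Print only the line number of the last line, that is, the line count.
count=$(printf 'a\nb\nc\n' | sed -n '$=')
printf '%s\n' "$count"
```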

Examples

This filter switches desserts in a menu:
```
sed 's/cake\(s\)*/cookies/g'
```
To substitute a pattern in a text file that contains ASCII characters, using the sed stream-oriented text editor and assuming that
- The text file is untagged and you do not want to tag it or enable automatic conversion, and
- You cannot alter the tag (for example, you are processing an untagged public text file or a read-only text file)
then issue:
```
sed -W filecodeset=819,pgmcodeset=1047 's/pattern1/pattern2/w myOutFile' myAsciiFile
```
To substitute a pattern in a text file using the sed stream-oriented text editor, assuming that automatic conversion was enabled but the text file is incorrectly tagged as UTF-8:
```
sed -B 's/pattern1/pattern2/w myOutputFile' myMisTaggedFile
```
Some sed subcommands require the use of a backslash ( \ ) as part of their syntax. The backslash must be followed by a newline, and the z/OS shell lets you continue entering the command on the subsequent line. Some emulators may require two backslashes. The following example uses the a (append) subcommand to insert a new line containing zzz after the line matching bbb.
```
$ sed '/bbb/a\
> zzz' a.txt
aaa
bbb
zzz
ccc
```
where a.txt contains
```
aaa
bbb
ccc
```

Environment variables

sed uses the following environment variables:

COLUMNS: Contains the width of the screen in columns. If set, sed uses this value to fold long lines on output. Otherwise, sed uses a default screen width of 80.
_TEXT_CONV: Contains text conversion information for the command. The text conversion information is not used when either the -B option or the filecodeset or pgmcodeset option (-W option) is specified. For more information about text conversion, see Controlling text conversion for z/OS UNIX shell commands.
_UNIX03: For more information about the effect of _UNIX03 on this command, see Shell commands changed for UNIX03.

Localization

sed uses the following localization environment variables:

LANG
LC_ALL
LC_COLLATE
LC_CTYPE
LC_MESSAGES
LC_SYNTAX
NLSPATH

Exit values

0

Successful completion

1

Failure due to any of the following:

Missing script.
Too many script arguments.
Too few arguments.
Unknown option.
Inability to open script file.
No noncomment subcommand.
Label not found in script.
Unknown subcommand.
Nesting ! subcommand not permitted.
No \ at end of subcommand.
End-of-file in subcommand.
No label in subcommand.
Badly formed file name.
Inability to open file.
Insufficient memory to compile subcommand.
Bad regular expression delimiter.
No remembered regular expression.
Regular expression error.
Insufficient memory for buffers.
y subcommand not followed by a printable character as a separator.
The strings are not the same length.
Nonmatching { and } subcommands.
Garbage after command.
Too many addresses for command.
Newline or end-of-file found in pattern.
Input line too long.
Pattern space overflow during G subcommand.
Hold space overflow during H subcommand.
Inability to chain subcommand.
The code set is not valid.
Could not turn off automatic conversion.
Could not perform requested text conversion.

Messages

Possible error messages include:

badly formed filename for command command: The given subcommand required a file name, but its operand did not have the syntax of a file name.
subcommand command needs a label: The specified subcommand required a label, but you did not supply one.
must have at least one (noncomment) command: The input to sed must contain at least one active subcommand (that is, a subcommand that is not a comment).
No remembered regular expression: You issued a subcommand that tried to use a remembered regular expression; for example, s//abc. However, there is no remembered regular expression yet. Change the subcommand to use an explicit regular expression.

Limits

sed accepts at most 28000 lines per file. It does not allow the NUL character.

Portability

POSIX.2, X/Open Portability Guide, UNIX systems.

The -B, -E, and -W options are extensions of the POSIX standard.

Related information

awk, diff, ed, grep, vi

For more information about regexp, see Regular expressions (regexp).