tr - Translate characters

Format

tr [-c | -C] [-s] string1 string2
tr -s [-c | -C] string1
tr -d [-c | -C] string1
tr -ds [-c | -C] string1 string2

Description

tr copies data from the standard input to the standard output, substituting or deleting characters as specified by the options and by string1 and string2. string1 and string2 are treated as sets of characters. In its simplest form, tr translates each character in string1 into the character at the corresponding position in string2.
Note: tr works on a character basis, not on a collation element basis. Thus, for example, a range that would include the multicharacter collation element ch in a regular expression does not include it here.
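
As a minimal illustration of the positional mapping described above (the input word is arbitrary and not part of the original text), each character of string1 is replaced by the character at the same position in string2:

```shell
# Map e -> i and l -> p; every other character passes through unchanged.
translated=$(printf 'hello\n' | tr 'el' 'ip')
printf '%s\n' "$translated"    # prints: hippo
```

Both strings have the same length here, so no padding question arises.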

Options

-c
If the variable _UNIX03 is unset or is not set to YES, the -c option complements the set of characters specified by string1. tr constructs a new set of characters, consisting of all the characters not found in string1, and uses this new set in place of string1.

If the variable _UNIX03 is set to YES, the -c option complements the set of values specified by string1. tr constructs a new set containing the complement of the values specified by string1 (all possible binary values except those specified in the string1 operand), placed in ascending order by binary value. The new set is used in place of string1.

-C
Complements the set of characters specified by string1. This means that tr constructs a new set and the complements of the characters that are specified by string1 (the set of all characters in the current character set, as defined by the current setting of LC_CTYPE, except for those that are specified in the string1 operand) are placed in this new set in ascending collation sequence, as defined by the current setting of LC_COLLATE. This behaves the same as -c when the variable _UNIX03 is unset or is not set to YES.
-d
Deletes input characters that are found in string1 from the output.
-s
tr checks for sequences of a string1 character repeated several consecutive times. When this happens, tr replaces the sequence of repeated characters with one occurrence of the corresponding character from string2. If string2 is not specified, the sequence is replaced with one occurrence of the repeated character itself. For example:
tr -s abc xyz
translates the input string aaaabccccb into the output string xyzy.
If you specify both the -d and -s options, you must specify both string1 and string2. In this case, string1 contains the characters to be deleted, whereas string2 contains characters that are to have multiple consecutive appearances replaced with one appearance of the character itself. For example:
tr -ds a b
translates the input string abbbaaacbb into the output string bcb.

The actions of the -s option take place after all other deletions and translations.
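
The options above can be tried from a shell as follows. This is a sketch rather than a definitive reference: the -s and -ds inputs are the ones used in the option descriptions, while the 'hello world' and 'abc123' inputs are made up for illustration.

```shell
# -s with two strings: translate, then squeeze runs of the translated characters.
squeezed=$(printf 'aaaabccccb\n' | tr -s 'abc' 'xyz')
printf '%s\n' "$squeezed"      # prints: xyzy

# -d: delete every occurrence of the characters in string1.
deleted=$(printf 'hello world\n' | tr -d 'lo')
printf '%s\n' "$deleted"       # prints: he wrd

# -ds: delete the characters in string1, then squeeze those in string2.
both=$(printf 'abbbaaacbb\n' | tr -ds 'a' 'b')
printf '%s\n' "$both"          # prints: bcb

# -c: act on everything NOT in string1 (digits and newline are kept here).
# [#*] repeats '#' so that string2 covers the whole complemented set.
masked=$(printf 'abc123\n' | tr -c '[:digit:]\n' '[#*]')
printf '%s\n' "$masked"        # prints: ###123
```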

String options

You can use the following conventions to represent elements of string1 and string2:
character
Any character not described by the conventions that follow represents itself.
\ooo
An octal representation of a character with a specific coded value. It can consist of one, two, or three octal digits (01234567). Double-byte characters require multiple, concatenated escape sequences of this type, including the leading \ for each byte.
\character
The \ (backslash) character is used as an escape to remove the special meaning of characters. It also introduces escape sequences for nonprinting characters, in the manner of C character constants: \b, \f, \n, \r, \t, and \v.
c1-c2
In the POSIX locale, as long as neither endpoint is an octal sequence of the form \ooo, this represents all characters between characters c1 and c2 (in the current locale's collating sequence), including the end values. For example, 'a-z' represents all the lowercase letters in the POSIX locale, whereas 'A-Z' represents all that locale's uppercase letters. One way to convert lowercase to uppercase is with the following filter:
tr 'a-z' 'A-Z'
This is not, however, the recommended method; use the [:class:] construct instead.

If the second endpoint precedes the starting endpoint in the collation sequence, it causes an error.

If either or both of the range endpoints are octal sequences of the form \ooo, this represents the range of specific coded values between the two range endpoints, inclusive.

The c1-c2 construct applies only in the POSIX locale.
Note: The current locale has a significant effect on results when this method is used to specify subranges. If the command is required to give consistent results irrespective of locale, the use of construct c1-c2 should be avoided.
[c*n]
This represents n repeated occurrences of character c. (If n has a leading zero, tr assumes it is octal; otherwise, it is assumed to be decimal.) You can omit the number for the last character in a subset. This representation is valid only in string2.
[:class:]
This represents all characters that belong to the character class class in the locale indicated by LC_CTYPE. When the class [:upper:] or [:lower:] appears in string1 and the opposite class, [:lower:] or [:upper:], appears in string2, tr uses the LC_CTYPE tolower or toupper mappings in the same relative positions.
[=c=]
This represents all characters that belong to the same equivalence class as the character c in the locale indicated by LC_COLLATE. Only international versions of the code support this format.
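
A few of these conventions in use, as a hedged sketch with made-up input strings. The \ooo form is not shown because the octal value of a given character depends on the code page in effect.

```shell
# [:class:] in both strings: map lowercase to uppercase through LC_CTYPE.
upper=$(printf 'hello, world\n' | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$upper"         # prints: HELLO, WORLD

# \character escape: \t stands for the tab character; replace it with a space.
untabbed=$(printf 'a\tb\n' | tr '\t' ' ')
printf '%s\n' "$untabbed"      # prints: a b

# [c*n] in string2: repeat '_' three times, so x, y, and z all map to '_'.
filled=$(printf 'xyz=1\n' | tr 'xyz' '[_*3]')
printf '%s\n' "$filled"        # prints: ___=1
```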

Usage notes

When string2 is shorter than string1, tr does not pad string2. The remaining characters in string1 will not be translated. For example:
tr '0123456789' 'd'
translates only '0' to 'd'; the characters '123456789' remain unchanged.
Coding the example in the following way:
tr '0123456789' '[d*]'
translates all digits to the letter 'd'.
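
The [d*] form can be checked with a short pipeline. The date-like input below is an arbitrary sample. Other tr implementations may treat a short string2 differently, so spelling out the repetition avoids depending on padding behavior.

```shell
# '[d*]' repeats 'd' as often as needed to match the length of string1.
masked_digits=$(printf '2024-01-15\n' | tr '0123456789' '[d*]')
printf '%s\n' "$masked_digits"    # prints: dddd-dd-dd
```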

Examples

This example creates a list of all words (strings of letters) found in file1 and puts it in file2:
tr -cs "[:alpha:]" "[\n*]" <file1 >file2
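
To see the effect without creating files, the same command can be fed a single line. The sample sentence is hypothetical. -c selects every non-letter, each is translated to a newline, and -s collapses each run of newlines into one.

```shell
# Each run of non-letters (spaces, punctuation, newline) becomes one newline.
words=$(printf 'The quick, brown fox.\n' | tr -cs '[:alpha:]' '[\n*]')
printf '%s\n' "$words"
# prints:
# The
# quick
# brown
# fox
```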

Environment variables

tr uses the following environment variable: _UNIX03.

For more information about the effect of _UNIX03 on this command, see Shell commands changed for UNIX03.

Localization

tr uses the following localization environment variables:
  • LANG
  • LC_ALL
  • LC_COLLATE
  • LC_CTYPE
  • LC_MESSAGES
  • LC_SYNTAX
  • NLSPATH

Exit values

0
Successful completion.
1
Failure because of an unknown command-line option or too few arguments.
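
A script can branch on these values. The sketch below uses illustrative variable names and triggers the too-few-arguments case by requesting a translation without string2.

```shell
# Normal run: the exit status of the pipeline is that of tr.
printf 'abc\n' | tr 'a' 'x' >/dev/null
ok_status=$?

# Translation requested but string2 is missing: too few arguments.
bad_status=0
tr 'a' </dev/null >/dev/null 2>&1 || bad_status=$?

printf 'ok=%s bad=%s\n' "$ok_status" "$bad_status"
```

The first status is 0 and the second is non-zero.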

Portability

POSIX.2, X/Open Portability Guide.

tr is compatible with earlier versions of both the UNIX Version 7 and System V variants of this command, but with extensions (C escapes, handling of ASCII NUL, globalization).