tr - Translate characters
Format
tr [-c | -C] [-s] string1 string2
tr -s [-c | -C] string1
tr -d [-c | -C] string1
tr -ds [-c | -C] string1 string2
Description
ch
in regular expressions, does not include it here.
Options
- -c
- If the variable _UNIX03 is unset or is not set to YES, the -c option complements the set of characters that are specified by string1. tr constructs a new set of characters, consisting of all the characters not found in string1, and uses this new set in place of string1.
If _UNIX03=YES is set, the -c option complements the set of values that are specified by string1. tr constructs a new set that contains the complement of the values specified by string1 (the set of all possible binary values, except for those that are specified in the string1 operand), placed in ascending order by binary value. The new set is used in place of string1.
- -C
- Complements the set of characters specified by string1. This means that tr constructs a new set and the complements of the characters that are specified by string1 (the set of all characters in the current character set, as defined by the current setting of LC_CTYPE, except for those that are specified in the string1 operand) are placed in this new set in ascending collation sequence, as defined by the current setting of LC_COLLATE. This behaves the same as -c when the variable _UNIX03 is unset or is not set to YES.
- -d
- Deletes input characters that are found in string1 from the output.
- -s
- tr checks for sequences of a string1 character repeated several consecutive times. When this happens, tr replaces the sequence of repeated characters with one occurrence of the corresponding character from string2. If string2 is not specified, the sequence is replaced with one occurrence of the repeated character itself. For example:
tr -s abc xyz
translates the input string aaaabccccb into the output string xyzy.
If you specify both the -d and -s options, you must specify both string1 and string2. In this case, string1 contains the characters to be deleted, whereas string2 contains characters that are to have multiple consecutive appearances replaced with one appearance of the character itself. For example:
tr -ds a b
translates the input string abbbaaacbb into the output string bcb.
The actions of the -s option take place after all other deletions and translations.
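As a quick check of the option descriptions above, the following shell sketch feeds short sample strings through tr and captures the results. Apart from the two examples taken from the text, the sample inputs and variable names are illustrative assumptions. LC_ALL=C is set only to keep the run independent of the caller's locale.

```shell
#!/bin/sh
# Illustrative sketch: sample inputs and variable names are assumptions.
LC_ALL=C
export LC_ALL

# -s with string2: translate a->x, b->y, c->z, then squeeze repeated output.
squeezed=$(printf 'aaaabccccb\n' | tr -s abc xyz)

# -s with string1 only: collapse each run of spaces to a single space.
spaces=$(printf 'one   two    three\n' | tr -s ' ')

# -d: delete every occurrence of the characters in string1.
no_vowels=$(printf 'translate\n' | tr -d aeiou)

# -ds: delete "a" (string1), then squeeze runs of "b" (string2).
deleted=$(printf 'abbbaaacbb\n' | tr -ds a b)

# -cd: delete everything that is NOT a digit (the newline is removed too).
digits=$(printf 'tel: 555-0100\n' | tr -cd 0123456789)

printf '%s\n' "$squeezed" "$spaces" "$no_vowels" "$deleted" "$digits"
```

Run as written, this should print xyzy, one two three, trnslt, bcb, and 5550100, one per line.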
String options
- character
- Any character not described by the conventions that follow represents itself.
- \ooo
- An octal representation of a character with a specific coded value. It can consist of one, two, or three octal digits (01234567). Double-byte characters require multiple, concatenated escape sequences of this type, including the leading \ for each byte.
- \character
- The \ (backslash) character is used as an escape to remove the special meaning of characters. It also introduces escape sequences for nonprinting characters, in the manner of C character constants: \b, \f, \n, \r, \t, and \v.
- c1-c2
- In the POSIX locale, as long as neither endpoint is an octal sequence of the form
\ooo, this represents all characters between characters
c1 and c2 (in the current locale's collating
sequence) including the end values. For example, 'a-z' represents all the lowercase letters in the POSIX locale, whereas 'A-Z' represents all that locale's uppercase letters. One way to convert lowercase to uppercase is with the following filter:
tr 'a-z' 'A-Z'
This is not, however, the recommended method; use the [:class:] construct instead.
If the second endpoint precedes the starting endpoint in the collation sequence, it causes an error.
If either or both of the range endpoints are octal sequences of the form \ooo, this represents the range of specific coded values between the two range endpoints, inclusive.
The c1-c2 construct applies only in the POSIX locale.
Note: The current locale has a significant effect on results when this method is used to specify subranges. If the command is required to give consistent results irrespective of locale, avoid the c1-c2 construct.
- [c*n]
- This represents n repeated occurrences of character c. (If n has a leading zero, tr assumes it is octal; otherwise, it is assumed to be decimal.) You can omit the number for the last character in a subset. This representation is valid only in string2.
- [:class:]
- This represents all characters that belong to the character class class in the locale indicated by LC_CTYPE. When the class [:upper:] or [:lower:] appears in string1 and the opposite class, [:lower:] or [:upper:], appears in string2, tr uses the LC_CTYPE tolower or toupper mappings in the same relative positions.
- [=c=]
- This represents all characters that belong to the same equivalence class as the character c in the locale indicated by LC_COLLATE. Only international versions of the code support this format.
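The string conventions above can be tried out with a short shell sketch. The sample inputs and variable names are assumptions made for illustration. The octal example additionally assumes an ASCII-based code page, where \011 is the tab character; on a system with a different code page the octal value would differ.

```shell
#!/bin/sh
# Illustrative sketch: sample inputs and variable names are assumptions.
LC_ALL=C
export LC_ALL

# \character: the C-style escape \t stands for the tab character.
untabbed=$(printf 'a\tb\tc\n' | tr '\t' ' ')

# \ooo: octal 011 is also the tab character, ASSUMING an ASCII code page.
untabbed_octal=$(printf 'a\tb\tc\n' | tr '\011' ' ')

# [:class:]: the recommended way to convert lowercase to uppercase.
upper=$(printf 'hello, world\n' | tr '[:lower:]' '[:upper:]')

# c1-c2: the range form of the same conversion (POSIX locale only).
upper_range=$(printf 'hello, world\n' | tr 'a-z' 'A-Z')

# [c*n]: string2 expands to "xxz", so a->x, b->x, c->z.
repeated=$(printf 'abcabc\n' | tr 'abc' '[x*2]z')

# [c*]: with n omitted, "#" is repeated to cover the rest of string1.
masked=$(printf 'call 555-0100\n' | tr '0123456789' '[#*]')

printf '%s\n' "$untabbed" "$upper" "$repeated" "$masked"
```

In the POSIX locale the class form and the range form give the same result here; the class form is the one that should keep working when the locale changes.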
Usage notes
tr '0123456789' 'd'
translates only '0' to 'd'; the characters '123456789' remain unchanged.
tr '0123456789' '[d*]'
translates all digits to the letter 'd'.
Examples
tr -cs "[:alpha:]" "[\n*]" <file1 >file2
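Going by the option descriptions, the command above maps every character that is not a letter to a newline and then squeezes runs of newlines, which should leave one word per line in file2. A minimal sketch of the same command, reading from a pipe instead of file1 (the sample sentence and the variable name are assumptions):

```shell
#!/bin/sh
# Illustrative sketch: the sample sentence and variable name are assumptions.
LC_ALL=C
export LC_ALL

# -c complements [:alpha:], so every non-letter is translated to a newline;
# -s then squeezes each run of newlines into a single newline.
words=$(printf 'Hello, world! 42 times\n' | tr -cs '[:alpha:]' '[\n*]')

printf '%s\n' "$words"
```

With this input, the output should be the three words Hello, world, and times, each on its own line; the digits and punctuation are dropped because they fall in the complemented set.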
Environment variables
tr uses the following environment variable: _UNIX03.
For more information about the effect of _UNIX03 on this command, see Shell commands changed for UNIX03.
Localization
tr uses the following localization environment variables:
- LANG
- LC_ALL
- LC_COLLATE
- LC_CTYPE
- LC_MESSAGES
- LC_SYNTAX
- NLSPATH
Exit values
0
- Successful completion.
1
- Failure because of unknown command-line option, or too few arguments.
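A script can branch on these exit values. The sketch below is a minimal illustration, with assumed variable names; it checks only for zero versus nonzero and does not depend on the exact failure code.

```shell
#!/bin/sh
# Illustrative sketch: variable names are assumptions.

# A valid invocation: the status of the pipeline is the status of tr.
printf 'abc\n' | tr abc xyz >/dev/null
ok_status=$?

# No operands at all (too few arguments): expect a nonzero status.
tr </dev/null >/dev/null 2>&1
bad_status=$?

echo "ok=$ok_status bad=$bad_status"
```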
Portability
POSIX.2, X/Open Portability Guide.
tr is compatible with earlier versions of both the UNIX Version 7 and System V variants of this command, but with extensions (C escapes, handling of ASCII NUL, globalization).