tr Command

Purpose

Transforms characters or ranges of characters.

Syntax

To transform characters or sequences of characters:

tr [ -c | -cds | -cs | -C | -Cds | -Cs | -ds | -s ] [ -A ] String1 String2

To delete characters or sequences of characters:

tr { -cd | -cs | -Cd | -Cs | -d | -s } [ -A ] String1

To define a range of characters for a specific locale:

[LANG=ll_RR] tr  -L {c1-cn C1-Cn} {c1c2c3c4c5c6…cn C1C2C3C4C5C6…Cn}

To delete a range of characters that is specified by the user:

[LANG=ll_RR] tr  -L {c1-cn C1-Cn}

Description

The tr command deletes or substitutes characters from standard input and writes the result to standard output. The tr command can also define a range of characters for a specific locale. Depending on the strings specified by the String1 and String2 variables and on the flags specified by the user, the tr command performs the following types of operations.

Transforming Characters

If String1 and String2 are both specified and the -d flag is not specified, the tr command replaces each character contained in String1 from the standard input with the character in the same position in String2.
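
The positional mapping described above can be sketched as follows. This is a minimal illustration, not part of the original reference; the sample input and the variable name translated are assumptions chosen for the example.

```shell
# Map each character of String1 to the character at the same position
# in String2: a -> x, b -> y, c -> z. All other characters pass through.
translated=$(printf 'aabbcc-abc\n' | tr 'abc' 'xyz')
printf '%s\n' "$translated"   # prints xxyyzz-xyz
```

The hyphen in the input is not listed in String1, so it is copied to standard output unchanged.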

Deleting Characters Using the -d Flag

If the -d flag is specified, the tr command deletes each character contained in String1 from standard input.
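
As a short sketch of deletion, assuming a made-up input string and a hypothetical variable name stripped:

```shell
# Delete every occurrence of the characters listed in String1 (l and o).
stripped=$(printf 'hello world\n' | tr -d 'lo')
printf '%s\n' "$stripped"   # prints he wrd
```

No String2 is needed here because nothing is substituted; the listed characters are simply dropped from the output.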

Removing Sequences Using the -s Flag

If the -s flag is specified, the tr command removes all but the first character in any sequence of a repeated character that is represented in String1 or String2. For each character represented in String1, the tr command removes all but the first occurrence in a sequence of occurrences of that character from standard input. For each character represented in String2, the tr command removes all but the first occurrence in a sequence of occurrences of that character in the standard output.
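
The two uses of the -s flag can be sketched as follows. The inputs and the variable names squeezed and merged are illustrative assumptions, not taken from the original text.

```shell
# With String1 only: collapse runs of 'a' and runs of spaces to one character.
# Runs of 'b' and 'c' are untouched because they are not in String1.
squeezed=$(printf 'aaabbbccc   x\n' | tr -s 'a ')
printf '%s\n' "$squeezed"   # prints abbbccc x

# With String1 and String2: translate a and b to x, then collapse the
# resulting run of x characters in the output to a single x.
merged=$(printf 'aabbcc\n' | tr -s 'ab' 'xx')
printf '%s\n' "$merged"     # prints xcc
```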

Defining a character range in the current locale environment

You can specify a character range for specific locales, code sets, and users by using the -L flag of the tr command. The tr command retains and recognizes the character range that you specified. The tr command uses this information to transform the characters to the mapped characters in the range. The character range c1-cn that is specified with the -L flag is mapped to characters c1c2c3c4c5c6...cn, and the specified character range C1-Cn is mapped to characters C1C2C3C4C5C6...Cn.

Special Sequences for Expressing Strings

The strings contained in the String1 and String2 variables can be expressed using the following conventions:

Item Description
C1-C2 Specifies the string of characters that collate between the character specified by C1 and the character specified by C2, inclusive. The character specified by C1 must collate before the character specified by C2.
Note: The current locale has a significant effect on results when specifying subranges using this method. If the command is required to give consistent results irrespective of locale, the use of subranges should be avoided.
[C*Number] Number is an integer that specifies the number of repetitions of the character specified by C. Number is considered a decimal integer unless the first digit is a 0; then it is considered an octal integer.
[C*] Fills out the string with the character specified by C. This option, used only at the end of the string contained within String2, forces the string within String2 to have the same number of characters as the string specified by the String1 variable. Any characters specified after the * (asterisk) are ignored.
[:ClassName:] Specifies all of the characters in the character class named by ClassName in the current locale. The class name can be any of the following names:
alnum      lower
alpha      print
blank      punct
cntrl      space
digit      upper
graph      xdigit

Except for the [:lower:] and [:upper:] conversion character classes, the characters specified by a character class are placed in an array in an unspecified order. Because that order is undefined, use such character classes only if the intent is to map several characters into one.

For more information on character classes, see the ctype subroutines.

[=C=] Specifies all of the characters with the same equivalence class as the character specified by C.
\Octal Specifies the character whose encoding is represented by the octal value specified by Octal. An Octal value can be a one-digit, two-digit, or three-digit octal integer. The NULL character can be expressed by using the '\0' expression, and is processed like any other character.
\ControlCharacter Specifies the control character that corresponds to the value specified by ControlCharacter. The following values can be represented:
\a
Alert
\b
Backspace
\f
Form-feed
\n
New line
\r
Carriage return
\t
Tab
\v
Vertical tab
\\ Specifies the \ (backslash) as itself, without any special meaning as an escape character.
\[ Specifies the [ (left bracket) as itself, without any special meaning as the beginning of a special string sequence.
\- Specifies the - (minus sign) as itself, without any special meaning as a range separator.
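
A few of the conventions in the table above can be tried out with the following sketch. The sample inputs and the variable names upper, spaced, and repeated are assumptions made for illustration; ASCII input is used so that the results do not depend on the locale.

```shell
# Conversion character classes: map lowercase letters to uppercase.
upper=$(printf 'tr command\n' | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$upper"      # prints TR COMMAND

# Octal escape: \011 is the tab character; replace it with a space.
spaced=$(printf 'a\tb\n' | tr '\011' ' ')
printf '%s\n' "$spaced"     # prints a b

# [C*Number]: [x*2] stands for two x characters, so String2 is "xxy"
# and the mapping is a -> x, b -> x, c -> y.
repeated=$(printf 'abc\n' | tr 'abc' '[x*2]y')
printf '%s\n' "$repeated"   # prints xxy
```

The strings are enclosed in single quotation marks so that the shell passes the brackets, asterisks, and backslashes to the tr command unchanged.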

If a character is specified more than once in String1, the character is transformed into the character in String2 that corresponds to the last occurrence of the character in String1.

If the strings specified by String1 and String2 are not the same length, the tr command ignores the extra characters in the longer string.
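
Because other implementations of tr may treat strings of unequal length differently, one way to make the intent explicit is to pad String2 with the [C*] sequence. The following is a sketch under that assumption; the input and the variable name padded are illustrative.

```shell
# String1 has five characters; String2 is "xy" followed by as many z
# characters as needed, so a -> x, b -> y, and c, d, e -> z.
padded=$(printf 'abcde\n' | tr 'abcde' 'xy[z*]')
printf '%s\n' "$padded"   # prints xyzzz
```

With the padding in place, both strings have the same effective length and no characters are left over to be ignored.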

Flags

Item Description
-A Performs all operations on a byte-by-byte basis using the ASCII collation order for ranges and character classes, instead of the collation order for the current locale.
-C Specifies that the value of String1 be replaced by the complement of the string specified by String1. The complement of String1 is all of the characters in the character set of the current locale, except the characters specified by String1. If the -A and -C flags are both specified, characters are complemented with respect to the set of all 8-bit character codes. If the -C and -s flags are both specified, the -s flag applies to characters in the complement of String1.

If the -d option is not specified, the complements of the characters specified by String1 will be placed in the array in ascending collation sequence as defined by the current setting of LC_COLLATE.

-c Specifies that the value of String1 be replaced by the complement of the string specified by String1. The complement of String1 is all of the characters in the character set of the current locale, except the characters specified by String1. If the -A and -c flags are both specified, characters are complemented with respect to the set of all 8-bit character codes. If the -c and -s flags are both specified, the -s flag applies to characters in the complement of String1.

If the -d option is not specified, the complement of the values specified by String1 will be placed in the array in ascending order by binary value.

-d Deletes each character from standard input that is contained in the string specified by String1.
Note:
  1. When the -C option is specified with the -d option, all characters except those specified by String1 will be deleted. The contents of String2 are ignored unless the -s option is also specified.
  2. When the -c option is specified with the -d option, all values except those specified by String1 will be deleted. The contents of String2 are ignored unless the -s option is also specified.
-L Adds a user-defined character range in the current locale environment to the $HOME/.trregexrc/$CODESET file. The character range c1-cn is mapped to c1c2c3c4c5c6...cn, and the character range C1-Cn is mapped to C1C2C3C4C5C6...Cn.

The -L flag is user-specific (depends on the $HOME variable), code-set specific, and locale-specific (depends on the $LANG variable). This means that you must define the character range separately for each user, code set, and locale. If the $HOME/.trregexrc/$CODESET file does not exist for a specific user or locale, the file is automatically generated when you specify the -L flag.

-s Removes all but the first character in a sequence of repeated characters. Character sequences specified by String1 are removed from standard input before translation, and character sequences specified by String2 are removed from standard output.
String1 Specifies a string of characters.
String2 Specifies a string of characters.
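
The effect of complementing String1 can be sketched with the -c flag as follows. Only the -c and -d flags are exercised here; the -A, -C, and -L flags are not shown. The inputs and the variable names digits and masked are assumptions for the example.

```shell
# -c with -d: delete every character that is NOT a digit
# (including the trailing newline of the input).
digits=$(printf 'Order 4521, qty 3\n' | tr -cd '[:digit:]')
printf '%s\n' "$digits"   # prints 45213

# -c without -d: replace every character that is neither alphanumeric
# nor a newline with an underscore. [_*] pads String2 to the needed length.
masked=$(printf 'a b-c\n' | tr -c '[:alnum:]\n' '[_*]')
printf '%s\n' "$masked"   # prints a_b_c
```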

Exit Status

This command returns the following exit values:

0
All input was processed successfully.
>0
An error occurred.
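
A script can check the exit value before using the result, as in this sketch. The variable names out and status and the error message are hypothetical.

```shell
# The exit value of the assignment is that of the pipeline inside the
# command substitution, which is the exit value of tr (its last command).
out=$(printf 'a{b}\n' | tr '{}' '()')
status=$?

if [ "$status" -eq 0 ]; then
    printf '%s\n' "$out"   # prints a(b)
else
    printf 'tr failed with exit value %s\n' "$status" >&2
fi
```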

Examples

  1. To transform braces into parentheses, enter the following command:
    tr '{}' '()' < textfile > newfile
    This transforms each { (left brace) to ( (left parenthesis) and each } (right brace) to ) (right parenthesis). All other characters remain unchanged.
  2. To transform braces into brackets, enter the following command:
    tr '{}' '\[]' < textfile > newfile
    This transforms each { (left brace) to [ (left bracket) and each } (right brace) to ] (right bracket). The left bracket must be entered with a \ (backslash) escape character.
  3. To transform lowercase characters to uppercase, enter the following command:
    tr 'a-z' 'A-Z' < textfile > newfile
  4. To create a list of words in a file, enter the following command:
    tr -cs '[:lower:][:upper:]' '[\n*]' < textfile > newfile
    This transforms each sequence of characters other than lowercase letters and uppercase letters into a single newline character. The * (asterisk) causes the tr command to repeat the newline character enough times to make the second string as long as the first string.
  5. To delete all NULL characters from a file, enter the following command:
    tr -d '\0' < textfile > newfile
  6. To replace every sequence of one or more new lines with a single new line, enter the following command:
    tr -s '\n' < textfile > newfile
    OR
    tr -s '\012' < textfile > newfile
  7. To replace every nonprinting character, other than valid control characters, with a ? (question mark), enter the following command:
    tr -c '[:print:][:cntrl:]' '[?*]' < textfile > newfile
    This scans a file created in a different locale to find characters that are not printable characters in the current locale.
  8. To replace every sequence of characters in the <space> character class with a single # character, enter the following command:
    tr -s '[:space:]' '[#*]'
  9. To define a character range for a specific locale, enter the following command:
    LANG=ES_ES tr -L a-z A-Z abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
    This command defines the a-z and A-Z character ranges for the ES_ES (Spanish_Spain) locale.
  10. To delete a character range for a specific locale, enter the following command:
    LANG=ES_ES tr -L a-z A-Z