Tip: Filtering files with tr

Get to know your textutils

Jacek Artymiak (jacek@artymiak.com), Freelance author and consultant
Jacek Artymiak works as a freelance consultant, developer, and writer. Since 1991 he's been developing software for many commercial and free variants of UNIX and BSD operating systems (AIX, HP-UX, IRIX, Solaris, Linux, FreeBSD, NetBSD, OpenBSD, and others), as well as MS-DOS, Microsoft Windows, Mac OS, and Mac OS X. Jacek specializes in business and financial application development, Web design, network security, computer graphics, animation, and multimedia. He's a prolific writer on technology subjects and the coauthor of "Install, Configure, and Customize Slackware Linux" (Prima Tech, 2000) and "StarOffice for Linux Bible" (IDG Books, 2000). Many of Jacek's software projects can be found at SourceForge. You can learn more about him at his personal Web site and contact him at jacek@artymiak.com.

Summary:  Nobody ever said sed was easy -- and it isn't! But you can get a lot of sed's most basic functionality very easily by using tr instead. Jacek Artymiak shows how.

Date:  12 Mar 2003
Level:  Introductory


You can think of tr as a (very) simplified variant of sed: it replaces one character with another or removes some characters altogether. You can also use it to squeeze a run of repeated characters down to a single one. And that's about all tr can do.

So why use it instead of sed? To simplify your life, of course. For example, to replace all occurrences of the letter "a" with the letter "z," we can use tr a z, which is undoubtedly simpler than sed -e 's/a/z/g', especially in a script, where quote-escaping can give us a lot of headaches. With tr, we also avoid writing those pesky regular expressions.
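As a quick check of the comparison above, here is a minimal sketch that runs both commands on the same input. The sample text and the variable names (with_tr, with_sed) are made up for illustration.

```shell
# Replace every "a" with "z", once with tr and once with sed.
with_tr=$(printf 'a banana\n' | tr a z)
with_sed=$(printf 'a banana\n' | sed -e 's/a/z/g')

# Both lines printed here should be identical.
printf '%s\n%s\n' "$with_tr" "$with_sed"
```

Note that the sed expression is quoted here; tr a z needs no quoting because neither argument contains characters the shell treats specially.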

Using tr is simple: to replace all occurrences of one character with another, use the notation shown in the previous paragraph. When you need to replace more than one character, use something like tr abc xyz, which replaces all occurrences of the letter "a" with the letter "x," all letters "b" with the letter "y," and all letters "c" with the letter "z." The two groups do not have to contain the same number of characters: GNU tr, for instance, pads a shorter second group by repeating its last character and ignores any extra characters in a longer one, though other implementations may behave differently.
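The character-by-character mapping can be tried out with a short sketch like the one below. The input strings and variable names are invented for the example, and the second command relies on GNU tr behavior, which other implementations are not guaranteed to share.

```shell
# Each character in the first group maps to the character at the
# same position in the second group: a->x, b->y, c->z.
mapped=$(printf 'a cab, a back\n' | tr abc xyz)
printf '%s\n' "$mapped"

# Unequal groups (assumption: GNU tr). The shorter second group is
# padded with its last character, so a, b, and c all become x.
padded=$(printf 'abc\n' | tr abc x)
printf '%s\n' "$padded"
```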

You can also specify ranges of characters. For example, tr 'a-z' 'A-Z' will replace all lowercase letters with their uppercase equivalents (for example, it will turn "no smoking" into "NO SMOKING"). This particular trick comes in handy when you want to emphasize a section of the text you are editing in the vi editor. Simply hit the Escape key, press :, type 2,4!tr 'a-z' 'A-Z', and hit the Return key. Lines 2 through 4 are now in uppercase.
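The range example can be run outside vi as well. The sketch below also shows the POSIX character classes [:lower:] and [:upper:], which are not mentioned in the tip itself but are a common alternative when the meaning of a-z may depend on the locale. The variable names are made up.

```shell
# Uppercase with explicit ranges, as in the tip.
shout=$(printf 'no smoking\n' | tr 'a-z' 'A-Z')

# The same conversion using POSIX character classes.
shout_class=$(printf 'no smoking\n' | tr '[:lower:]' '[:upper:]')

printf '%s\n%s\n' "$shout" "$shout_class"
```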

More on tr

The GNU manual says that tr is used to "translate, squeeze, and/or delete characters" by copying standard input to standard output while performing the operation of your choice. You'll learn about those choices in this tip; to learn even more, follow along with the man or info page for tr.

Open a new terminal window and type either man tr or info tr -- or you can open a new browser window and link to the tr man page at gnu.org (see Resources for a link).

Another time you will find tr very helpful is when somebody sends you a text file created on a Mac OS or DOS/Windows machine. Unless such files were saved with UNIX newline line endings, they need to be converted into the native UNIX format, or some command-line utilities will not process them correctly. Classic Mac OS (before Mac OS X) ends lines with the carriage return character, and many text processing tools treat such files as a single line. To fix that, we can use the following tricks:

  • Mac -> UNIX: tr '\r' '\n' < macfile > unixfile
  • UNIX -> Mac: tr '\n' '\r' < unixfile > macfile
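To see the two Mac conversions in action without a real Mac file, a sketch like the following builds a small sample in a temporary directory. The directory, the file contents, and the variable names are assumptions made for the example.

```shell
# Build a tiny file that uses carriage returns as line endings.
workdir=$(mktemp -d)
printf 'first line\rsecond line\r' > "$workdir/macfile"

# Mac -> UNIX: every carriage return becomes a newline.
tr '\r' '\n' < "$workdir/macfile" > "$workdir/unixfile"
unix_lines=$(wc -l < "$workdir/unixfile" | tr -d ' ')

# UNIX -> Mac: converting back should reproduce the original bytes.
tr '\n' '\r' < "$workdir/unixfile" > "$workdir/roundtrip"
if cmp -s "$workdir/macfile" "$workdir/roundtrip"; then same=yes; else same=no; fi

printf 'lines: %s, round trip identical: %s\n' "$unix_lines" "$same"
rm -rf "$workdir"
```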

The Microsoft DOS/Windows convention is to end each line of text with the carriage return character followed by the newline character. To convert such files, use the following commands:

  • DOS -> UNIX: tr -d '\r' < dosfile > unixfile
  • UNIX -> DOS: in this case we need to use awk, because tr cannot insert two characters in place of one. The command to use is awk '{ print $0"\r" }' < unixfile > dosfile
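The DOS conversions can be checked the same way. The sketch below creates a two-line sample with carriage return plus newline endings, strips the carriage returns with tr, and then restores them with the awk command from the list above. File and variable names are hypothetical.

```shell
# Build a tiny file with DOS (CR LF) line endings.
dosdir=$(mktemp -d)
printf 'alpha\r\nbeta\r\n' > "$dosdir/dosfile"

# DOS -> UNIX: delete every carriage return.
tr -d '\r' < "$dosdir/dosfile" > "$dosdir/unixfile"
unix_text=$(cat "$dosdir/unixfile")

# UNIX -> DOS: awk appends a carriage return before each newline.
awk '{ print $0 "\r" }' < "$dosdir/unixfile" > "$dosdir/dosagain"
if cmp -s "$dosdir/dosfile" "$dosdir/dosagain"; then dos_same=yes; else dos_same=no; fi

printf 'round trip identical: %s\n' "$dos_same"
rm -rf "$dosdir"
```

Keep in mind that tr -d '\r' deletes every carriage return in the input, not only those at the ends of lines.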

tr is also handy for simple cleanup of a text file: removing tabs with tr -d '\t', squeezing runs of spaces into a single space with tr -s ' ', or joining separate lines into one with tr -d '\n'. All these commands can be used inside vi as well; just remember to precede them with the range of lines you wish to process and the exclamation mark (!), as in 1,$!tr -d '\t' (the dollar sign represents the last line).
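The three cleanup commands can be tried on small strings, as in this sketch. The sample inputs and variable names are invented for illustration.

```shell
# Delete tabs.
no_tabs=$(printf 'a\tb\tc\n' | tr -d '\t')

# Squeeze each run of spaces into a single space.
squeezed=$(printf 'too    many   spaces\n' | tr -s ' ')

# Join lines by deleting every newline.
joined=$(printf 'one\ntwo\nthree\n' | tr -d '\n')

printf '%s\n%s\n%s\n' "$no_tabs" "$squeezed" "$joined"
```

Because tr -d '\n' removes the final newline too, the joined text ends without one; add it back with echo or printf if the next tool expects it.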

Questions or comments? I'd love to hear from you -- send mail to jacek@artymiak.com.

Next time, we'll take a look at uniq. See you then!


Resources

