Discover Python, Part 8: Reading and writing data using Python's input and output functionality

Robert Brunner (rb@ncsa.uiuc.edu), NCSA Research Scientist, Assistant Professor of Astronomy, University of Illinois, Urbana-Champaign
Robert J. Brunner
Robert J. Brunner is a Research Scientist at the National Center for Supercomputing Applications and an Assistant Professor of Astronomy at the University of Illinois, Urbana-Champaign. He has published several books and a number of articles and tutorials on a range of topics. You can reach him at rb@ncsa.uiuc.edu.

Summary:  In this article, you learn how to work with files. First, we review a simple way to output data in Python, using the print statement, then learn about the file object, which is used by Python programs to read and write data to a file. The different modes with which a file can be opened are demonstrated, and the article concludes by showing how to read and write a binary file.

Date:  03 Jan 2006
Level:  Intermediate

Reading, writing, and Python

In the previous articles in the "Discover Python" series, you learned about the basic Python data types and some of the container data types, such as the tuple, string, and list. Other articles discussed the conditional and looping features of the Python language and how they work together with the container data types to simplify programming tasks. The last basic step involved in writing programs is to read data from and write data to a file. After reading this article, you'll be able to check learning this skill off your to-do list.

Simple output

Throughout this series, you've written (output) data using the print statement, which by default writes the expression as a string to the screen (or console window). This is demonstrated in Listing 1, which repeats your first "Hello, World!" Python program with some minor tweaks.


Listing 1. Simple output

>>> print "Hello World!"
Hello World!
>>> print "The total value is = $", 40.0*45.50
The total value is = $ 1820.0
>>> print "The total value = $%6.2f" % (40.0*45.50)
The total value = $1820.00
>>> myfile = file("testit.txt", 'w')
>>> print >> myfile, "Hello World!"
>>> print >> myfile, "The total value = $%6.2f" % (40.0*45.50)
>>> myfile.close()

As this example shows, writing data is easy with the print statement. First, the example outputs a simple string. Then it outputs a string followed by a computed value, and finally a single string built with the string formatting operator (%).

After that, however, things change from the earlier version of the code. The next line creates a file object, passing in the name "testit.txt" and a 'w' character (to let you write to the file). You then use a modified print statement -- with two greater-than symbols followed by the variable holding the file object -- to write the same strings. This time, however, the data isn't displayed on the screen. The natural question is, where did the data go? And, what is this file object?

The first question is easy to answer. Look for the testit.txt file, and display its contents as shown below.

% more testit.txt 
Hello World!
The total value = $1820.00

As you can see, the data was written to the file exactly as it would have been written to the screen previously.

Now, notice the last line in Listing 1, which calls the close method on the file object. This is important in Python programs because file input and output are, by default, buffered; data isn't written as soon as a print statement executes but is instead written in chunks. The simplest mechanism for telling Python to write your data to the file is to explicitly call the close method.
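The same idea can be sketched for a current interpreter. The block below assumes Python 3, where print is a function that takes a file argument and the file constructor no longer exists, so open is used instead. The write_report helper and the temporary directory are hypothetical names introduced only for this illustration. The with statement closes the file at the end of the block, which flushes the buffered data.

```python
import os
import tempfile


def write_report(path):
    """Write two lines to path and report whether the file ended up closed."""
    # Leaving the with block closes the file, which flushes buffered output.
    with open(path, "w") as myfile:
        print("Hello World!", file=myfile)
        print("The total value = $%6.2f" % (40.0 * 45.50), file=myfile)
    return myfile.closed


# A scratch directory (an assumption of this sketch) keeps the example tidy.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "testit.txt")
    was_closed = write_report(path)
    with open(path) as myfile:
        contents = myfile.read()

print(was_closed)      # True
print(repr(contents))  # 'Hello World!\nThe total value = $1820.00\n'
```

Closing through a with block is one option among several; calling close explicitly, as Listing 1 does, has the same effect on the data.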

The file object

The file object is the basic mechanism by which you interact with files on your computer. You can use the file object to read data, to write data, to append data to a file, and to work with either binary or textual data.

The simplest technique for learning more about the file object is to ask for help, as shown in Listing 2.


Listing 2. Getting help for the file object

>>> help(file)
Help on class file in module __builtin__:

class file(object)
 |  file(name[, mode[, buffering]]) -> file object
 |  
 |  Open a file.  The mode can be 'r', 'w' or 'a' for reading (default),
 |  writing or appending.  The file will be created if it doesn't exist
 |  when opened for writing or appending; it will be truncated when
 |  opened for writing.  Add a 'b' to the mode for binary files.
 |  Add a '+' to the mode to allow simultaneous reading and writing.
 |  If the buffering argument is given, 0 means unbuffered, 1 means line
 |  buffered, and larger numbers specify the buffer size.
 |  Add a 'U' to mode to open the file for input with universal newline
 |  support.  Any line ending in the input file will be seen as a '\n'
 |  in Python.  Also, a file so opened gains the attribute 'newlines';
 |  the value for this attribute is one of None (no newline read yet),
 |  '\r', '\n', '\r\n' or a tuple containing all the newline types seen.
 |  
 |  'U' cannot be combined with 'w' or '+' mode.
 |  
 |  Note:  open() is an alias for file().
 |  
 |  Methods defined here:
...

As the help facility indicates, working with a file object is simple. You create a file object using the file constructor or the open function, which is an alias for the file constructor. The second parameter, which is optional, specifies how the file will be used:

  • 'r' (the default) indicates that you want to read data from the file.
  • 'w' indicates that you want to write data to the file, truncating the previous contents.
  • 'a' indicates that you want to write data to the file, appending to the end.
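  • 'r+' indicates that you'll read from and write to the file. The file must already exist; it isn't truncated, and writes overwrite the existing contents starting at the current position.
  • 'a+' indicates that you'll read from and write to the file, with every write appended to the end. (A combination such as 'r+a' isn't a documented mode.)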
  • 'b' indicates that you'll be reading or writing binary data.
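The difference between the 'w' and 'a' modes in the list above can be sketched as follows. This is a minimal illustration that assumes Python 3; the show_modes helper, the sample strings, and the temporary directory are hypothetical and not part of the original listings.

```python
import os
import tempfile


def show_modes(path):
    """Return the file contents after an append and after a truncating write."""
    results = {}
    with open(path, "w") as f:   # 'w' creates the file (or truncates it)
        f.write("first\n")
    with open(path, "a") as f:   # 'a' adds to the end of the file
        f.write("second\n")
    with open(path) as f:        # 'r' is the default mode
        results["after_append"] = f.read()
    with open(path, "w") as f:   # reopening with 'w' discards the old contents
        f.write("third\n")
    with open(path) as f:
        results["after_truncate"] = f.read()
    return results


with tempfile.TemporaryDirectory() as workdir:
    results = show_modes(os.path.join(workdir, "modes.txt"))

print(repr(results["after_append"]))    # 'first\nsecond\n'
print(repr(results["after_truncate"]))  # 'third\n'
```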

The first code listing in this article wrote data to a file. Now, Listing 3 shows how to read this data into a Python program and parse the contents of the file.


Listing 3. Reading data from a file

>>> myfile = open("testit.txt")
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\n'
>>> str = myfile.read()
>>> print str

>>> myfile.seek(0)
>>> str = myfile.read()
>>> print str
Hello World!
The total value = $1820.00

>>> str.split()
['Hello', 'World!', 'The', 'total', 'value', '=', '$1820.00']
>>> str.split('\n')
['Hello World!', 'The total value = $1820.00', '']
>>> for line in str.split('\n'):
...     print line
... 
Hello World!
The total value = $1820.00

>>> myfile.close()

To read the data, you first create an appropriate file object -- in this case, one that opens the testit.txt file -- and read the contents using the read method. This method reads the entire file into a string, which is printed to the console in this program. The second call to the read method, where you try to assign the value to the str variable, returns an empty string. This happens because the first read operation read the whole file. When you try to read the contents again, you're at the end of the file, so nothing can be read.

The solution to this problem is also easy: Tell the file object to go back to the beginning of the file. You do so via the seek method, which takes a parameter that indicates where in the file you want to start reading or writing (for example, zero indicates the start of the file). The seek method also accepts an optional second parameter that makes the offset relative to the current position or to the end of the file, but these more complex operations are easy to get wrong. For now, let's stick with the simple usage.

Now that you're back to the start of the file, you can read the file contents into a string variable and parse the contents of the string appropriately. Notice that the lines in the file are separated by a newline (or end-of-line) character. If you call the split method with no argument, it splits on any run of whitespace, including spaces and newlines, so you get individual words rather than lines. To have the method split the string into lines, you have to explicitly specify the newline character. You can then iterate over the lines in the file within a for loop. Because the file ends with a newline, the resulting list ends with an empty string.
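The read, seek, and split steps described above can be sketched like this. The block assumes Python 3 and writes its own copy of the two-line file into a temporary directory (both assumptions of this sketch). It also uses the string method splitlines, which is not used in the article, as one way to avoid the trailing empty string that split('\n') produces.

```python
import os
import tempfile

TEXT = "Hello World!\nThe total value = $1820.00\n"

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "testit.txt")
    with open(path, "w") as myfile:
        myfile.write(TEXT)

    with open(path) as myfile:
        first = myfile.read()    # reads the whole file
        second = myfile.read()   # already at the end, so this is ''
        myfile.seek(0)           # go back to the start of the file
        third = myfile.read()    # the whole file again

words = third.split()            # splits on any whitespace
lines = third.splitlines()       # splits on line boundaries, no trailing ''

print(repr(second))  # ''
print(words)         # ['Hello', 'World!', 'The', 'total', 'value', '=', '$1820.00']
print(lines)         # ['Hello World!', 'The total value = $1820.00']
```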

That seems like a lot of work just to read and process a single line from a file. Python makes simple things easy, so you're probably wondering if any shortcuts are available for this task. As shown in Listing 4, the answer is yes.


Listing 4. Reading and parsing lines

>>> myfile = open("testit.txt")
>>> for line in myfile.readlines():
...     print line
... 
Hello World!

The total value = $1820.00

>>> myfile.close()
>>> for line in open("testit.txt").readlines():
...     print line
... 
Hello World!

The total value = $1820.00

>>> for line in open("testit.txt"):
...     print line
... 
Hello World!

The total value = $1820.00

Listing 4 demonstrates three techniques for reading and parsing the lines in a text file. First, you open a file and assign it to a variable. You then call the readlines method, which reads the entire file into memory and splits the contents into a list of strings. The for loop iterates over the list of strings, printing them out one at a time. Each string keeps its trailing newline, and the print statement adds another one, which is why a blank line follows each line of output.

The second for loop simplifies this process a little by using an implicit variable (that is, one that isn't explicitly created) for the file object. You open the file and read its contents all at once, producing the same result as the first, explicit example. The last example simplifies things even more and demonstrates the ability to iterate directly over a file object (file objects have been iterable since Python 2.2, so this may not work on older interpreters). In this case, you create an implicit file object, and Python does the rest, allowing you to iterate over all the lines in the file one at a time without first reading the whole file into a list.
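Direct iteration over a file object can be sketched as below, assuming Python 3. The read_lines helper and the temporary file are hypothetical. Stripping the trailing newline from each line is one way to avoid the doubled line breaks seen in Listing 4.

```python
import os
import tempfile


def read_lines(path):
    """Collect the lines of a text file without their trailing newlines."""
    lines = []
    with open(path) as myfile:
        # Iterating over the file object yields one line at a time.
        for line in myfile:
            lines.append(line.rstrip("\n"))
    return lines


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "testit.txt")
    with open(path, "w") as myfile:
        myfile.write("Hello World!\nThe total value = $1820.00\n")
    lines = read_lines(path)

for line in lines:
    print(line)
```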

Sometimes, however, you may want a finer level of control when you're reading data from a file. In this case, you should use the readline method, as shown in Listing 5.


Listing 5. Reading data

>>> myfile = open("testit.txt")
>>> myfile.readline()
'Hello World!\n'
>>> myfile.readline()
'The total value = $1820.00\n'
>>> myfile.readline()
''
>>> myfile.seek(0)
>>> myfile.readline()
'Hello World!\n'
>>> myfile.tell()
13L
>>> myfile.readline()
'The total value = $1820.00\n'
>>> myfile.tell()
40L
>>> myfile.readline()
''
>>> myfile.tell()
40L
>>> myfile.seek(0)
>>> myfile.read(17)
'Hello World!\nThe '
>>> myfile.seek(0)
>>> myfile.readlines(23)
['Hello World!\n', 'The total value = $1820.00\n']
>>> myfile.close()

This example demonstrates how to move through a file reading one line at a time or explicitly move the file position indicator using the seek method. You first step through the file line by line using the readline method. When you reach the end of the file, the readline method returns an empty string. Attempting to continue reading past the end of the file in this way doesn't cause an error, but returns an empty string.

You then jump back to the start of the file and read another line. The tell method displays where you are in the file (which should be after the first line of text) -- in this case, at offset 13. By using this knowledge, you can pass in a parameter to the read method or the readlines method to control how much data is read. For the read method, this parameter (17 in this example) is the number of characters that will be read from the file. For the readlines method, the parameter is only a size hint: The method reads at least that many characters and then continues reading until the end of the current line, so it always returns whole lines. In this example, it reads the first and second lines of text.
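The tell, read, and readlines calls from Listing 5 can be sketched for Python 3 as follows. The file is opened in binary mode here (an assumption of this sketch, not something Listing 5 does) so that tell reports plain byte offsets on every platform; as a result, the values read are bytes objects rather than strings. The hint of 10 is an extra case added for contrast.

```python
import os
import tempfile

DATA = b"Hello World!\nThe total value = $1820.00\n"

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "testit.txt")
    with open(path, "wb") as myfile:
        myfile.write(DATA)

    with open(path, "rb") as myfile:
        first_line = myfile.readline()      # reads through the first newline
        offset = myfile.tell()              # position just after that line
        myfile.seek(0)
        chunk = myfile.read(17)             # exactly 17 bytes
        myfile.seek(0)
        short_hint = myfile.readlines(10)   # hint is reached within line one
        myfile.seek(0)
        long_hint = myfile.readlines(23)    # hint falls inside line two

print(offset)      # 13
print(chunk)       # b'Hello World!\nThe '
print(short_hint)  # [b'Hello World!\n']
print(long_hint)   # [b'Hello World!\n', b'The total value = $1820.00\n']
```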

Writing data

So far, the examples have focused on reading data, not writing data. As shown in Listing 6, however, writing is easy once you know the basics of working with the file object.


Listing 6. Writing data
  
>>> mydata = ['Hello World!', 'The total value = $1820.00']
>>> myfile = open('testit.txt', 'w')
>>> for line in mydata:
...     myfile.write(line + '\n')
... 
>>> myfile.close()
>>> myfile = open("testit.txt")
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\n'
>>> myfile.close()
>>> myfile = open("testit.txt", "r+")
>>> for line in mydata:
...     myfile.write(line + '\n')
... 
>>> myfile.seek(0)
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\n'
>>> myfile.close()
>>> myfile = open("testit.txt", "a+")
>>> myfile.seek(0)
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\n'
>>> for line in mydata:
...     myfile.write(line + '\n')
... 
>>> myfile.seek(0)
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\nHello World!\nThe total value = $1820.00\n'
>>> myfile.close()

To write data to a file, you have to first create the file object. But in this case, you must specify that you want to write to the file by using the 'w' mode flag. In this example, you write the contents of the mydata list to the file, close the file, and then reopen the file so you can read the contents.

Often, however, you'll want to read from and write to a file at the same time, so the next part of this example reopens the file using the 'r+' mode. This mode doesn't truncate the file: Writing starts at the beginning and overwrites the existing contents in place. You write the contents of the mydata list to the file, then reposition the file pointer to the start of the file and read the contents; because you wrote the same data that was already there, the file looks unchanged. This example then closes the file and reopens it for reading and appending. The documented mode for that is 'a+'; a combination such as 'r+a' isn't a documented mode, and its behavior may depend on the platform. As the example code demonstrates, the file contents are now the result of two write operations (the text is repeated).
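To make the difference between overwriting and appending visible, the following sketch writes different text in each step instead of repeating the same lines. It assumes Python 3; the file name and the sample strings are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "testit.txt")
    with open(path, "w") as myfile:
        myfile.write("Hello World!\n")

    # 'r+' keeps the existing contents; the write starts at offset 0
    # and replaces only the characters it covers.
    with open(path, "r+") as myfile:
        myfile.write("J")
        myfile.seek(0)
        after_rplus = myfile.read()

    # 'a+' also allows reading, but every write goes to the end of the file.
    with open(path, "a+") as myfile:
        myfile.write("Goodbye!\n")
        myfile.seek(0)
        after_aplus = myfile.read()

print(repr(after_rplus))  # 'Jello World!\n'
print(repr(after_aplus))  # 'Jello World!\nGoodbye!\n'
```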

Working with binary data

All of the previous examples have dealt with textual or character data: You wrote and read character strings. In certain situations, however -- for example, when you're working with integers or compressed files -- you need to be able to read and write binary data. You can easily do so in Python by appending 'b' to the file mode when you create the file object, as shown in Listing 7.


Listing 7. Working with binary data

>>> myfile = open("testit.txt", "wb")
>>> for c in range(50, 70):
...     myfile.write(chr(c))
... 
>>> myfile.close()
>>> myfile = open("testit.txt", "rb")
>>> myfile.read()
'23456789:;<=>?@ABCDE'
>>> myfile.close()

In this example, you create an appropriate file object, then write the binary characters with ASCII values from 50 to 69. You convert the integers created by the call to the range function to characters using the chr function. After you've written all the data, you close the file and reopen it for reading, again using the binary mode flag. Reading the file demonstrates that you clearly didn't write the integers to the file; instead, you wrote their character values.

When you're reading and writing binary data, you must be careful because different platforms store binary data in different ways (for example, they may use different byte orders for multi-byte integers). If you must work with binary data, it's best to use an appropriate module from the Python library, such as the standard struct module (or one from a third-party developer).
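As one possible illustration of that advice, the sketch below uses the standard struct module to write two integers and a floating-point number with an explicit byte order, so the file has the same layout on every platform. It assumes Python 3; the format string "<iid" (little-endian, two 4-byte integers and one 8-byte double), the sample values, and the file name are choices made for this example only.

```python
import os
import struct
import tempfile

RECORD_FORMAT = "<iid"        # little-endian: int32, int32, float64
values = (1820, 40, 45.5)     # sample data for this sketch

packed = struct.pack(RECORD_FORMAT, *values)

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "record.bin")
    with open(path, "wb") as myfile:      # 'b' avoids newline translation
        myfile.write(packed)
    with open(path, "rb") as myfile:
        raw = myfile.read()

unpacked = struct.unpack(RECORD_FORMAT, raw)

print(len(packed))  # 16
print(unpacked)     # (1820, 40, 45.5)
```

Stating the byte order in the format string is the design choice that matters here; without the "<" prefix, struct uses the native byte order and alignment of the machine running the code.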


Reading and writing: Fun for the masses

This article discussed how to read and write data to a file from a Python program. Overall, the process is simple: Create an appropriate file object, then read or write as necessary. However, be careful with the write mode ('w'), which truncates an existing file as soon as you open it. If you need to add data to an existing file, use the append mode when creating the file object.

