In the previous articles in the "Discover Python" series, you learned about the basic Python data types and some of the container data types, such as the list. Other articles discussed the conditional and looping features of the Python language and how they work together with the container data types to simplify programming tasks. The last basic step involved in writing programs is reading data from and writing data to a file. After reading this article, you'll be able to check this skill off your to-do list.
Throughout this series, you've written (output) data using the print statement, which by default writes a string to the screen (or console window). This is demonstrated in Listing 1, which repeats your first "Hello, World!" Python program with some minor tweaks.
Listing 1. Simple output
>>> print "Hello World!"
Hello World!
>>> print "The total value is = $", 40.0*45.50
The total value is = $ 1820.0
>>> print "The total value = $%6.2f" % (40.0*45.50)
The total value = $1820.00
>>> myfile = file("testit.txt", 'w')
>>> print >> myfile, "Hello World!"
>>> print >> myfile, "The total value = $%6.2f" % (40.0*45.50)
>>> myfile.close()
As this example shows, writing data is easy with the print statement. The example first outputs a simple string. Then it creates and outputs a compound string, created using the string formatting technique.
After that, however, things change from the earlier version of the code. The next line creates a file object, passing in the name "testit.txt" and a 'w' character (to let you write to the file). You then use a modified print statement, which redirects its output to the file object with the >> operator, to write the same strings. This time, however, the data isn't displayed on the screen. The natural questions are: where did the data go, and what is this file object?
The first question is easy to answer. Look for the testit.txt file, and display its contents as shown below.
% more testit.txt
Hello World!
The total value = $1820.00
As you can see, the data was written to the file exactly as it would have been written to the screen previously.
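For readers on a current interpreter, the redirected output in Listing 1 can be sketched in Python 3 syntax, where print is a function that accepts a file argument and open replaces the file constructor. This is only an illustration under those assumptions: the write_report helper, its parameters, and the temporary directory are hypothetical names that don't appear in the original listing.

```python
import os
import tempfile


def write_report(path, hours, rate):
    """Write two lines to path, mirroring the redirected print calls in Listing 1."""
    # The with block closes the file on exit, which also flushes buffered output.
    with open(path, "w") as out:
        print("Hello World!", file=out)
        # %6.2f formats the product with two decimal places, as in the article.
        print("The total value = $%6.2f" % (hours * rate), file=out)


# Hypothetical usage: write into a temporary directory rather than the working directory.
demo_path = os.path.join(tempfile.mkdtemp(), "testit.txt")
write_report(demo_path, 40.0, 45.50)

with open(demo_path) as src:
    contents = src.read()
```

Using a with block here is a design choice rather than a requirement; calling close explicitly, as the article does, has the same effect.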
Now, notice the last line in Listing 1, which calls the close method on the file object. This is important in Python programs because file input and output are, by default, buffered; data isn't necessarily written to the file as soon as you call a print statement. Closing the file flushes any buffered data to disk.
The file object is the basic mechanism by which you interact with files on your computer. You can use the file object to read data, to write data, to append data to a file, and to work with either binary or textual data.
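The effect of buffering can be observed with a small experiment. The sketch below is written in Python 3 syntax (an assumption, since the article's listings use Python 2), and the helper name sizes_before_and_after_flush and the file name buffered.txt are hypothetical. It compares the on-disk size of a file before and after an explicit flush; the size before the flush depends on the buffering in effect and is often zero for a small write.

```python
import os
import tempfile


def sizes_before_and_after_flush(path, text):
    """Return the on-disk size of path before and after flushing a write."""
    out = open(path, "w")
    try:
        out.write(text)
        # The text may still sit in an in-memory buffer at this point.
        size_before = os.path.getsize(path)
        out.flush()  # push buffered data to the operating system
        size_after = os.path.getsize(path)
    finally:
        out.close()  # close() also flushes, so nothing is lost
    return size_before, size_after


demo_path = os.path.join(tempfile.mkdtemp(), "buffered.txt")
before, after = sizes_before_and_after_flush(demo_path, "Hello World!")
```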
The simplest technique for learning more about the file object is to ask for help, as shown in Listing 2.
Listing 2. Getting help for the file object
>>> help(file)
Help on class file in module __builtin__:

class file(object)
 | file(name[, mode[, buffering]]) -> file object
 |
 | Open a file. The mode can be 'r', 'w' or 'a' for reading (default),
 | writing or appending. The file will be created if it doesn't exist
 | when opened for writing or appending; it will be truncated when
 | opened for writing. Add a 'b' to the mode for binary files.
 | Add a '+' to the mode to allow simultaneous reading and writing.
 | If the buffering argument is given, 0 means unbuffered, 1 means line
 | buffered, and larger numbers specify the buffer size.
 | Add a 'U' to mode to open the file for input with universal newline
 | support. Any line ending in the input file will be seen as a '\n'
 | in Python. Also, a file so opened gains the attribute 'newlines';
 | the value for this attribute is one of None (no newline read yet),
 | '\r', '\n', '\r\n' or a tuple containing all the newline types seen.
 |
 | 'U' cannot be combined with 'w' or '+' mode.
 |
 | Note: open() is an alias for file().
 |
 | Methods defined here:
...
As the help facility indicates, working with a file object is simple. You create a file object using the file constructor or the open function, which is an alias for the file constructor. The second parameter, which is optional, specifies how the file will be used:
'r' (the default) indicates that you want to read data from the file.
'w' indicates that you want to write data to the file, truncating the previous contents.
'a' indicates that you want to write data to the file, appending to the end.
'r+' indicates that you'll read from and write to the file without truncating it; writes start at the beginning and overwrite existing data.
'w+' indicates that you'll read from and write to the file, erasing any previous data.
'a+' indicates that you'll read from and write to (appending) the file.
'b', added to any of the modes above, indicates that you'll be reading or writing binary data.
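The practical difference between the write-capable modes is easiest to see by applying them to the same file in turn. The following is a minimal sketch in Python 3 syntax (an assumption; the article's listings are Python 2), using a hypothetical modes.txt file and a hypothetical read_text helper. In this sketch, 'w' truncates, 'a' appends, and 'r+' leaves existing data in place and overwrites from the start of the file.

```python
import os
import tempfile


def read_text(path):
    """Return the full contents of path as a string."""
    with open(path) as src:
        return src.read()


demo_path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# 'w' creates the file, or truncates it if it already exists.
with open(demo_path, "w") as out:
    out.write("first\n")
after_w = read_text(demo_path)

# 'a' keeps the existing contents and writes at the end.
with open(demo_path, "a") as out:
    out.write("second\n")
after_a = read_text(demo_path)

# 'r+' keeps the contents too, but writing starts at offset 0 and overwrites in place.
with open(demo_path, "r+") as out:
    out.write("FIRST")
after_r_plus = read_text(demo_path)

# Opening with 'w' again discards everything that was there.
with open(demo_path, "w") as out:
    out.write("third\n")
after_second_w = read_text(demo_path)
```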
The first code listing in this article wrote data to a file. Now, Listing 3 shows how to read this data into a Python program and parse the contents of the file.
Listing 3. Reading data from a file
>>> myfile = open("testit.txt")
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\n'
>>> str = myfile.read()
>>> print str

>>> myfile.seek(0)
>>> str = myfile.read()
>>> print str
Hello World!
The total value = $1820.00

>>> str.split()
['Hello', 'World!', 'The', 'total', 'value', '=', '$1820.00']
>>> str.split('\n')
['Hello World!', 'The total value = $1820.00', '']
>>> for line in str.split('\n'):
...     print line
...
Hello World!
The total value = $1820.00

>>> myfile.close()
To read the data, you first create an appropriate file object (in this case, one that opens the testit.txt file) and read the contents using the read method. This method reads the entire file into a string, which is printed to the console in this program. The second call to the read method, where you try to assign the value to the str variable, returns an empty string. This happens because the first read operation read the whole file. When you try to read the contents again, you're at the end of the file, so nothing can be read.
The solution to this problem is also easy: Tell the file object to go back to the beginning of the file. You do so via the seek method, which in its simplest form takes a single parameter that indicates where in the file you want to start reading or writing (for example, zero indicates the start of the file). The seek method also accepts an optional second parameter that makes the offset relative to the current position or to the end of the file, but it's easy to land in the wrong place that way. For now, let's stick with the simple usage.
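The read, rewind, and read-again sequence described above can be sketched as follows. The example uses Python 3 syntax (an assumption, since the article's listings are Python 2) and writes its own copy of the two-line file into a temporary directory so that it is self-contained; the variable names are illustrative only.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "testit.txt")
with open(demo_path, "w") as out:
    out.write("Hello World!\nThe total value = $1820.00\n")

with open(demo_path) as src:
    first = src.read()   # reads everything; the position is now at end of file
    second = src.read()  # nothing is left to read, so this is an empty string
    src.seek(0)          # rewind to the start of the file
    third = src.read()   # the whole file again
```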
Now that you're back to the start of the file, you can read the file contents into a string variable and parse the contents of the string appropriately. Notice that the lines in the file are distinguished by a newline (or end-of-line) character. If you call the split method on your string with no arguments, it splits on any run of whitespace, spaces and newlines alike. To have the method split the lines based on a newline character, you have to explicitly specify the newline character. You can then split the string and iterate over the lines in the file within a for loop.
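The difference between the splitting options is easiest to see side by side. This short sketch works on a string with the same contents as testit.txt; it also shows the string method splitlines, which the article doesn't use but which avoids the trailing empty entry that split('\n') produces. The variable names are illustrative.

```python
text = "Hello World!\nThe total value = $1820.00\n"

# With no argument, split() breaks on any run of whitespace, newlines included.
words = text.split()

# Splitting on "\n" keeps an empty string after the final newline.
by_newline = text.split("\n")

# splitlines() splits on line boundaries and leaves no trailing empty entry.
lines = text.splitlines()
```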
That seems like a lot of work just to read and process a single line from a file. Python makes simple things easy, so you're probably wondering if any shortcuts are available for this task. As shown in Listing 4, the answer is yes.
Listing 4. Reading and parsing lines
>>> myfile = open("testit.txt")
>>> for line in myfile.readlines():
...     print line
...
Hello World!

The total value = $1820.00

>>> myfile.close()
>>> for line in open("testit.txt").readlines():
...     print line
...
Hello World!

The total value = $1820.00

>>> for line in open("testit.txt"):
...     print line
...
Hello World!

The total value = $1820.00

Listing 4 demonstrates three techniques for reading and parsing the lines in a text file. First, you open a file and assign it to a variable. You then call the readlines method, which reads the entire file into memory and splits the contents into a list of strings. The for loop iterates over the list of strings, printing them out one at a time.
The second for loop simplifies this process a little by using an implicit variable (that is, one that isn't explicitly created) for the file object. You open the file and read its contents all at once, producing the same result as the first, explicit example. The last example simplifies things even more and demonstrates the ability to iterate directly over a file object (this has been available since Python 2.2, so it may not work on very old interpreters). In this case, you create an implicit file object, and Python does the rest, allowing you to iterate over all the lines in the file.
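Direct iteration is usually combined with stripping the newline that each line keeps. The sketch below, in Python 3 syntax (an assumption; the article's listings are Python 2), wraps that pattern in a hypothetical read_lines helper and opens the file in a with block so that it is closed explicitly rather than left to the interpreter.

```python
import os
import tempfile


def read_lines(path):
    """Return the lines of path without their trailing newline characters."""
    lines = []
    with open(path) as src:
        for line in src:                     # one line at a time, not the whole file
            lines.append(line.rstrip("\n"))  # drop the newline that each line keeps
    return lines


demo_path = os.path.join(tempfile.mkdtemp(), "testit.txt")
with open(demo_path, "w") as out:
    out.write("Hello World!\nThe total value = $1820.00\n")

result = read_lines(demo_path)
```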
Sometimes, however, you may want a finer level of control when you're reading data from a file. In this case, you should use the readline method, as shown in Listing 5.
Listing 5. Reading data
>>> myfile = open("testit.txt")
>>> myfile.readline()
'Hello World!\n'
>>> myfile.readline()
'The total value = $1820.00\n'
>>> myfile.readline()
''
>>> myfile.seek(0)
>>> myfile.readline()
'Hello World!\n'
>>> myfile.tell()
13L
>>> myfile.readline()
'The total value = $1820.00\n'
>>> myfile.tell()
40L
>>> myfile.readline()
''
>>> myfile.tell()
40L
>>> myfile.seek(0)
>>> myfile.read(17)
'Hello World!\nThe '
>>> myfile.seek(0)
>>> myfile.readlines(23)
['Hello World!\n', 'The total value = $1820.00\n']
>>> myfile.close()
This example demonstrates how to move through a file by reading one line at a time or by explicitly moving the file position indicator with the seek method. You first step through the file line by line using the readline method. When you reach the end of the file, the readline method returns an empty string. Attempting to continue reading past the end of the file in this way doesn't cause an error; it simply returns another empty string.
You then jump back to the start of the file and read another line. The tell method reports where you are in the file (which should be just after the first line of text), in this case at offset 13. By using this knowledge, you can pass in a parameter to the read method or the readlines method to control how many characters are read. For the read method, this parameter (17 in this example) is the number of characters that will be read from the file. The readlines method, however, treats its parameter (23 in this example) as a size hint: it reads whole lines totalling approximately that many characters, possibly rounded up to an internal buffer size. In this example, it reads the first and second lines of text.
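The positioning calls from Listing 5 can be sketched for a current interpreter as follows. This version uses Python 3 syntax and opens the file in binary mode so that tell returns plain byte offsets (both are assumptions that differ from the article's Python 2 text-mode listing). A third line, "Goodbye!", is added purely for illustration so that the size hint has something to exclude; Python 3 documents that readlines stops once the total size of the lines read so far exceeds the hint.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "testit.txt")
with open(demo_path, "wb") as out:
    out.write(b"Hello World!\nThe total value = $1820.00\nGoodbye!\n")

with open(demo_path, "rb") as src:
    first_line = src.readline()      # reads up to and including the first newline
    offset_after_first = src.tell()  # byte offset just past the first line

    src.seek(0)
    first_17 = src.read(17)          # exactly 17 bytes, crossing the line boundary

    src.seek(0)
    hinted = src.readlines(23)       # whole lines until the running total exceeds 23
```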
So far, the examples have focused on reading data, not writing data. As shown in Listing 6, however, writing is easy once you know the basics of working with the file object.
Listing 6. Writing data
>>> mydata = ['Hello World!', 'The total value = $1820.00']
>>> myfile = open('testit.txt', 'w')
>>> for line in mydata:
...     myfile.write(line + '\n')
...
>>> myfile.close()
>>> myfile = open("testit.txt")
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\n'
>>> myfile.close()
>>> myfile = open("testit.txt", "r+")
>>> for line in mydata:
...     myfile.write(line + '\n')
...
>>> myfile.seek(0)
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\n'
>>> myfile.close()
>>> myfile = open("testit.txt", "a+")
>>> myfile.seek(0)
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\n'
>>> for line in mydata:
...     myfile.write(line + '\n')
...
>>> myfile.seek(0)
>>> myfile.read()
'Hello World!\nThe total value = $1820.00\nHello World!\nThe total value = $1820.00\n'
>>> myfile.close()
To write data to a file, you have to first create the file object. But in this case, you must specify that you want to write to the file by using the 'w' mode flag. In this example, you write the contents of the mydata list to the file, close the file, and then reopen the file so you can read the contents.
Often, however, you'll want to read from and write to a file at the same time, so the next part of this example reopens the file using the 'r+' mode. This mode doesn't truncate the file; writing starts at the beginning and overwrites the existing data. First you write the contents of the mydata list to the file, then you reposition the file pointer to the start of the file and read the contents. Because the new data is identical to the old, the contents look unchanged. This example then closes the file and reopens it using the read and append mode, 'a+'. The initial read position in this mode can differ between platforms, so the example seeks to the start of the file before reading. As the example code demonstrates, the file contents are now the result of two write operations (the text is repeated).
All of the previous examples have dealt with textual or character data: You wrote and read character strings. In certain situations, however (for example, when you're working with integers or compressed files), you need to be able to read and write binary data. You can easily do so in Python by appending 'b' to the file mode when you create the file object, as shown in Listing 7.
Listing 7. Working with binary data
>>> myfile = open("testit.txt", "wb")
>>> for c in range(50, 70):
...     myfile.write(chr(c))
...
>>> myfile.close()
>>> myfile = open("testit.txt", "rb")
>>> myfile.read()
'23456789:;<=>?@ABCDE'
>>> myfile.close()
In this example, you create an appropriate file object, then write the binary characters with ASCII values from 50 to 69. You convert the integers created by the call to the range function to characters using the chr function. After you've written all the data, you close the file and reopen it for reading, again using the binary mode flag. Reading the file demonstrates that you clearly didn't write the integers to the file; instead, you wrote their character values.
When you're reading and writing binary data, you must be careful because different platforms store binary data in different ways (byte order and integer size, for example, can differ). If you must work with binary data, it's best to use an appropriate module from the Python library, such as struct, or one from a third-party developer.
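One library option for portable binary data is the standard struct module, which lets you state the byte order and field sizes explicitly instead of relying on the platform's defaults. The sketch below uses Python 3 syntax (an assumption; the article's listings are Python 2), and the file name values.bin, the sample values, and the chosen format are all hypothetical.

```python
import os
import struct
import tempfile

values = (1, 2, 65535)

# "<" requests little-endian byte order; "3H" means three unsigned 16-bit integers.
packed = struct.pack("<3H", *values)

demo_path = os.path.join(tempfile.mkdtemp(), "values.bin")
with open(demo_path, "wb") as out:
    out.write(packed)

with open(demo_path, "rb") as src:
    raw = src.read()

# Unpacking with the same format string recovers the original integers.
unpacked = struct.unpack("<3H", raw)

# The same values in big-endian order produce a different byte sequence.
packed_big_endian = struct.pack(">3H", *values)
```

Because the format string fixes both the byte order and the width of each field, a file written this way reads back the same on any platform.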
This article discussed how to read and write data to a file from a Python program. Overall, the process is simple: Create an appropriate file object, then read or write as necessary. However, you must be careful about truncation when using the write mode to create a file object for writing data to a file. If you need to append data to a file, you should use the append mode when creating the file object.
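The caution about truncation versus appending can be checked with a short sketch. It uses Python 3 syntax (an assumption; the article's listings are Python 2) and a hypothetical log.txt file. The 'a+' mode opens the file for both reading and appending; because the initial read position in this mode can vary, the sketch seeks to the start before reading.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "log.txt")

# Create the file with some initial contents.
with open(demo_path, "w") as out:
    out.write("one\n")

# 'a+' allows reading and appending without discarding what is already there.
with open(demo_path, "a+") as both:
    both.seek(0)           # make the read position explicit
    before = both.read()
    both.write("two\n")    # in append mode, writes land at the end of the file
    both.seek(0)
    after = both.read()
```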
Robert J. Brunner is a Research Scientist at the National Center for Supercomputing Applications and an Assistant Professor of Astronomy at the University of Illinois, Urbana-Champaign. He has published several books and a number of articles and tutorials on a range of topics. You can reach him at firstname.lastname@example.org.