Thursday, 28 September 2017

Opening, Writing to and Reading from a File

Why bother with files? The main reason is persistence. When you close a Python IDLE session, you lose any information that you have created in memory. If it is in a file on the computer's storage (usually hard disk), it will survive until the next session.
Another reason is passing information between apps. A wide range of computer programs can handle a simple text file, but not so many can easily and seamlessly integrate with Python. So files (particularly plain text) can be useful intermediaries.

At the moment I will assume the files I will be opening, writing to, reading from and closing will be stored in the same folder/directory as the Python scripts. It is possible to navigate around a computer's folder structure, but I am not ready for that yet.

To open a file, I create a variable that holds a file object (sometimes called a file handle) using the built-in function open(). The first argument is the filename as it appears in the operating system. The second says why I am opening the file: 'w' for write, 'r' for read, 'a' for append.
For example,
#!/usr/bin/python3
foo = open('test.txt', 'w')
foo.write('Hello World!\n')
foo.write('This is a Test!')
foo.close() 
foo = open('test.txt', 'r')
print(foo.read())
foo.close()

gives the output:
 RESTART: C:/Users/John/Dropbox/Misc Programming/Python/python3/test07_filetest.py
Hello World!
This is a Test!
Furthermore, you can open up your folder explorer (in Windows 10 this is File Explorer) and you should find the file test.txt in the same folder as this script.
As you might be able to tell, the first part of the script opens up the file for writing.
If the file does not exist, then Python creates it.
If the file does exist, then opening it with 'w' empties it straight away, so whatever you write replaces the data that was in the file.
When you no longer need access to the file on the disk it is a good idea to close() the file object, which I have done once after writing and again on the last line of this script.
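Python also has a with statement that closes the file for you when the indented block ends, even if something goes wrong inside it. Here is a minimal sketch of the same write-then-read sequence done that way. The filename with_demo.txt and the variable contents are just names I made up for this example, so it does not touch test.txt.

```python
#!/usr/bin/python3
# Sketch only: 'with_demo.txt' and 'contents' are hypothetical names for this example.

# Open for writing; the file is closed automatically when the block ends.
with open('with_demo.txt', 'w') as foo:
    foo.write('Hello World!\n')
    foo.write('This is a Test!')

# Open for reading; again, no explicit close() is needed.
with open('with_demo.txt', 'r') as foo:
    contents = foo.read()

print(contents)
print(foo.closed)  # True: the with block has already closed the file
```

I have kept explicit close() calls in the rest of this post because they make it obvious when the file is released.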

Using 'a' for append when opening the file means that any data written to the file is added to the end, preserving the old data. For example, we can extend the script to append a couple of lines to test.txt:
#!/usr/bin/python3
foo = open('test.txt', 'w')
foo.write('Hello World!\n')
foo.write('This is a Test!\n')
foo.close()
foo = open('test.txt', 'r')
print(foo.read())
foo.close()
foo = open('test.txt', 'a')
foo.write('Here are some more lines \n')
foo.write('Appended to the end\n')
foo.close()
foo = open('test.txt', 'r')
print(foo.read())
foo.close()
This gives the output:
  RESTART: C:/Users/John/Dropbox/Misc Programming/Python/python3/test07_filetest.py
Hello World!
This is a Test!
Hello World!
This is a Test!
Here are some more lines
Appended to the end
>>>
And again you can check the text file in the folder with the python script. 
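To see the difference between 'w' and 'a' side by side, here is a small sketch that appends to a file and then reopens it with 'w'. The filename mode_demo.txt and the variable names are my own inventions for this example.

```python
#!/usr/bin/python3
# Sketch only: 'mode_demo.txt', 'after_append' and 'after_rewrite' are hypothetical names.

foo = open('mode_demo.txt', 'w')   # creates the file, or empties it if it exists
foo.write('first\n')
foo.close()

foo = open('mode_demo.txt', 'a')   # append: the old data is preserved
foo.write('second\n')
foo.close()

foo = open('mode_demo.txt', 'r')
after_append = foo.read()
foo.close()

foo = open('mode_demo.txt', 'w')   # write: the old data is thrown away
foo.write('third\n')
foo.close()

foo = open('mode_demo.txt', 'r')
after_rewrite = foo.read()
foo.close()

print(repr(after_append))    # 'first\nsecond\n'
print(repr(after_rewrite))   # 'third\n'
```

Using repr() makes the '\n' characters visible instead of printing them as line breaks.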

Calling read() on the file object, as in foo.read() above, reads all the data in the file at once. If you assign the result to a variable, that variable will contain the entire file contents as a single string.
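As a quick sketch of what that single string looks like, the script below writes two lines to a file, reads them back into a variable and pokes at the result. The filename read_demo.txt and the variable contents are names I chose for this example.

```python
#!/usr/bin/python3
# Sketch only: 'read_demo.txt' and 'contents' are hypothetical names for this example.

foo = open('read_demo.txt', 'w')
foo.write('Hello World!\n')
foo.write('This is a Test!\n')
foo.close()

foo = open('read_demo.txt', 'r')
contents = foo.read()          # the whole file as one string
foo.close()

print(type(contents))          # <class 'str'>
print(len(contents))           # 29 characters, counting the two newlines
print(contents.splitlines())   # ['Hello World!', 'This is a Test!']
```

The string method splitlines() is one way to break the big string into separate lines after the fact.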
Reading the whole file at once is useful sometimes. But more often than not, you want to go through the file line by line, and Python allows us to do that, using our old friend the for loop. Rather than treating the file contents as one big lump, Python can hand over the file one line at a time, as if it were a sequence of lines. And for loops are good for working their way through a sequence or list.
#!/usr/bin/python3
foo = open('test.txt', 'r')
linecount = 0
linelist = []
for line in foo:
    linecount += 1
    print(linecount, line)
    linelist.append(line)
foo.close()
print(linelist)
with the output:
 RESTART: C:/Users/John/Dropbox/Misc Programming/Python/python3/test07_filetest2.py
1 Hello World!

2 This is a Test!

3 Here are some more lines 

4 Appended to the end

['Hello World!\n', 'This is a Test!\n', 'Here are some more lines \n', 'Appended to the end\n']
>>>
Why the blank lines between the lines of output? Because write() does not automatically add newlines (end-of-line characters), I added them manually in the strings written to the file (you noticed the '\n' at the end?). Each line read back from the file still carries that '\n', and then print() automatically adds a newline of its own. Hey presto, two newlines per line.
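There are a couple of ways to get rid of the doubled newlines. One is to tell print() not to add its own, with end=''. The other is to strip the '\n' off each line with the string method rstrip(). Here is a minimal sketch showing both. The filename newline_demo.txt and the list cleaned are names I made up for this example.

```python
#!/usr/bin/python3
# Sketch only: 'newline_demo.txt' and 'cleaned' are hypothetical names for this example.

foo = open('newline_demo.txt', 'w')
foo.write('Hello World!\n')
foo.write('This is a Test!\n')
foo.close()

foo = open('newline_demo.txt', 'r')
linecount = 0
cleaned = []
for line in foo:
    linecount += 1
    # Option 1: stop print() adding a second newline.
    print(linecount, line, end='')
    # Option 2: remove the trailing newline from the string itself.
    cleaned.append(line.rstrip('\n'))
foo.close()

print(cleaned)   # ['Hello World!', 'This is a Test!']
```

Which one to use probably depends on whether you only want tidy output (end='') or want the stored strings themselves to be free of newlines (rstrip).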
