Thursday, 5 October 2017

CSV files and split()

A lot of data is in the form of tables, particularly regular tables that can be stored as CSV (Comma Separated Values) files.
As I explained in earlier posts, reading from files can be done one line at a time - this is effectively one table row at a time. And the columns are separated by commas or whatever punctuation. As long as there are a fixed number of commas, the columns can be distinguished. We can use the split() function to convert each line (which is read from the file as a string) into a list.

For example, here is the contents of a typical CSV file called groceries.csv .
Item Name, Price per Item, Number of Items
Butter (250g), 1.50, 2,
Flour (1kg), 0.50, 3
Milk (1 pint), 0.70, 5
Rice (1kg), 1.70, 1
Strawberry Jam (400g), 2.00, 1
Chocolate Biscuits (300g), 1.50, 2
Pork Sausages (500g), 4.50, 1

 And here is a script I have written to use the data in it:
#!/usr/bin/python3
fhandle = open('groceries.csv', 'r') # opens relevant file for reading
biglist = []  # Creates empty list
linecount = 0 # sets linecount variable
totalcost = 0.0
itemtotal = 0
for line in fhandle:
    if linecount > 0:
        linelist = line.split(',') # Splits each line
        linelist[1] = float(linelist[1])
        linelist[2] = int(linelist[2])
        subtotal = linelist[1] * linelist[2]
        totalcost += subtotal
        itemtotal += linelist[2]
        linelist.append(subtotal)
        print (linelist[0])
        print ('per item', linelist[1], 'no of items', linelist[2], 'subtotal', subtotal)
    linecount += 1
fhandle.close()
print ('Number of rows = ', linecount)
print ('Number of items = ', itemtotal)
print ('Total cost = £', totalcost)

I have used a few comments to explain to myself and anyone else reading the code what I am doing.
 And the results are:
>>>
 RESTART: C:/Users/John/Dropbox/Misc Programming/Python/python3/groceries01.py
Butter (250g)
per item 1.5 no of items 2 subtotal 3.0
Flour (1kg)
per item 0.5 no of items 3 subtotal 1.5
Milk (1 pint)
per item 0.7 no of items 5 subtotal 3.5
Rice (1kg)
per item 1.7 no of items 1 subtotal 1.7
Strawberry Jam (400g)
per item 2.0 no of items 1 subtotal 2.0
Chocolate Biscuits (300g)
per item 1.5 no of items 2 subtotal 3.0
Pork Sausages (500g)
per item 4.5 no of items 1 subtotal 4.5
Number of rows =  8
Number of items =  15
Total cost = £ 19.2

>>>
 
The format could do with some polishing up to make it look more like a table. But the point is we can read from a file and use split to turn it into a series of columns, and thus use each column for maths.
You might wonder "why not just use a spreadsheet?"
At this level of only simple arithmetic with the data, yes a spreadsheet could do the job. But Python is capable of doing so much more with the data than a spreadsheet.

 Actually, I have adjusted the output so that it looks more like a table. Here is the code, the first half being unaltered except for comments:
#!/usr/bin/python3
fhandle = open('groceries.csv', 'r') # opens relevant file for reading
biglist = []  # Creates empty list
linecount = 0 # sets linecount variable
totalcost = 0.0
itemtotal = 0
for line in fhandle:
    if linecount > 0: # Does not do this on linecount = 0 i.e. first line
        linelist = line.split(',') # Splits each line at each comma
        linelist[1] = float(linelist[1]) # converts second element into float
        linelist[2] = int(linelist[2]) # converts third element into integer
        subtotal = linelist[1] * linelist[2]
        totalcost += subtotal # adds subtotal variable to totalcost variable
        itemtotal += linelist[2]
        linelist.append(subtotal)
        space = ' ' * (25 - len(linelist[0])) # calculates space for formatting
        print (linelist[0], space, '£', linelist[1], 'X', linelist[2], '= £', subtotal)
 
    linecount += 1
   
fhandle.close()
print ('Number of rows = ', linecount)
print ('Number of items = ', itemtotal)
print ('Total cost = £', totalcost)

 And here is the output:
>>>
 RESTART: C:/Users/John/Dropbox/Misc Programming/Python/python3/groceries01.py

Butter (250g)              £ 1.5 X 2 = £ 3.0
Flour (1kg)                £ 0.5 X 3 = £ 1.5
Milk (1 pint)              £ 0.7 X 5 = £ 3.5
Rice (1kg)                 £ 1.7 X 1 = £ 1.7
Strawberry Jam (400g)      £ 2.0 X 1 = £ 2.0
Chocolate Biscuits (300g)  £ 1.5 X 2 = £ 3.0
Pork Sausages (500g)       £ 4.5 X 1 = £ 4.5
Number of rows =  8
Number of items =  15
Total cost = £ 19.2

>>>

No comments:

Post a Comment