Chapter 8: Storing Collections of Data
Notes
Lists and Tracking Sales
- Consider the following vignette
- The owner of an ice-cream stand wants a program to track sales
- There are ten stands, each selling multiple items
- The program should take sales data as input and then provide the following views on the data
- Sorted from lowest to highest
- Sorted from highest to lowest
- Show just the highest and the lowest
- Show the total number of sales
- Show the average number of sales
Getting the specification right: Storyboarding
Agreeing on the specification with your client is important. One useful technique is storyboarding, which is best done by sitting down with pen and paper (or a whiteboard)
A storyboard shows how the program should flow in response to various user inputs. For example, it can depict the menus the user might use, with a separate storyboard for each menu choice. The storyboard should also show how the program will respond from the user’s point of view
For bigger programs you can break different components out into their own storyboards, much in the same way we built up functions. Storyboards depict what needs to happen, but not how to do it.
Given the spec for the ice cream stand we can now outline the program
- Store the sales data in variables
- Implement a way to sort the data
- A way to print the output
- Store the data globally and pass it to functions to handle the work
We can construct the prototype interface, similar to the Ride Selector Program
Ice-Cream Sales
1: Print the Sales
2: Sort Low to High
3: Sort High to Low
4: Highest and Lowest
5: Total Sales
6: Average Sales
7: Enter Figures
Enter your command: 3
Limitations of Individual Variables
- We first need to store the sales
For ten stores, we could theoretically use ten variables, one for each store
But this method becomes clunky when we want to start analysing the variables
E.g. the following code (FindingLargestSales.py) only checks whether the first stand is the one with the greatest sales
# Example 8.1 Finding the Largest Sales
#
# Checks if sales1 has the largest sales. Demonstrates the difficulty of using
# individual named variables to deal with aggregate data
import BTCInput

sales1 = BTCInput.read_int("Enter the sales for stand 1: ")
sales2 = BTCInput.read_int("Enter the sales for stand 2: ")
sales3 = BTCInput.read_int("Enter the sales for stand 3: ")
sales4 = BTCInput.read_int("Enter the sales for stand 4: ")
sales5 = BTCInput.read_int("Enter the sales for stand 5: ")
sales6 = BTCInput.read_int("Enter the sales for stand 6: ")
sales7 = BTCInput.read_int("Enter the sales for stand 7: ")
sales8 = BTCInput.read_int("Enter the sales for stand 8: ")
sales9 = BTCInput.read_int("Enter the sales for stand 9: ")
sales10 = BTCInput.read_int("Enter the sales for stand 10: ")

if (
    sales1 > sales2
    and sales1 > sales3
    and sales1 > sales4
    and sales1 > sales5
    and sales1 > sales6
    and sales1 > sales7
    and sales1 > sales8
    and sales1 > sales9
    and sales1 > sales10
):
    print("Stand 1 had the best sales")
Problem: We would have to repeat this code for each individual sales variable
- If we add more stands, we have to add another named variable and another big if statement
- AND modify all the previous if statements
- Clearly this approach is not very maintainable
Lists in Python
- A collection is a composite type
- It stores multiple elements of another type
- We’ve already (briefly) seen one type of collection, the tuple
- The most common form of collection is the list
- What it sounds like, a list of items
Make Something Happen: Creating a List
Open a Python interpreter and work through the following steps to learn about lists
A list is created using square brackets, [], around the contents, e.g.
sales = []
- The above defines sales as an empty list
Items can be appended to a list using the append method
sales.append(99)
sales
[99]
- As we can see from above, sales now contains the value 99
Calling append again adds the new item to the end of the list
sales.append(100)
sales
[99, 100]
Observe from above that you can see the contents of a list by simply typing the variable name in the interpreter
In scripts we can also use an explicit print call
print(sales)
[99, 100]
You can access individual items of the list using the indexing operator []
sales[0]
99
- Syntax is list_name[index] where index is an integer giving the index of the item
- Python lists are zero-indexed, i.e. the first value is stored at index \(0\)
The indexing operator can be used to change the value of an item at a given index
sales[1] = 101
sales
[99, 101]
- The above changes the value of the second item in sales to \(101\)
Warning: Indexed elements must exist
Whenever we use the indexing operator the index must exist! For example, if we tried to view the (non-existent) third item, we would get an error, e.g.
example_list = [1, 2]
print(example_list[2])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[7], line 2
      1 example_list = [1, 2]
----> 2 print(example_list[2])
IndexError: list index out of range
The above illustrates the common off-by-one error, where we access the index one past the end of the list rather than the last element of the list. Here the type of exception raised is called an IndexError
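One way to avoid this error is to compare an index against the number of items in the list before using it. The sketch below uses the built-in len function, which returns the number of items in a list. The helper get_item_or_default is a hypothetical name introduced only for this illustration.

```python
def get_item_or_default(items, index, default=None):
    """Return items[index] if that index exists, otherwise default.

    Hypothetical helper for illustration only.
    """
    # valid indices run from 0 up to len(items) - 1
    if 0 <= index < len(items):
        return items[index]
    return default


example_list = [1, 2]

# the last valid index is always one less than the length
last_item = example_list[len(example_list) - 1]

# index 2 does not exist, so the default is returned instead of an error
missing_item = get_item_or_default(example_list, 2, default="no such item")

print(last_item)
print(missing_item)
```

Whether to return a default or let the IndexError happen is a design choice. For a program where a bad index indicates a bug, letting the error surface is often the better option.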
A single list can store values of different types, and items can be replaced with new items of a different type
sales.append("Rob")
sales[0] = "Python"
sales
['Python', 101, 'Rob']
- The above appends a new string "Rob", changes sales[0] from an int to the string "Python" and leaves the number \(101\) in sales[1] untouched
- Overall the list thus mixes string and integer types
Avoid Mixing Types in Lists
Just because you can mix types in lists doesn’t mean you should. Typically lists and list processing are much easier when a list stores items that are all of the same type
Read in a List
You can use loops to populate a list (see ReadAndDisplay.py)
# Example 8.2.1 Read and Display
#
# Demonstrates using a loop to populate a list
import BTCInput

# create an empty list to populate
sales = []
for count in range(1, 11):
    prompt = "Enter the sales for stand " + str(count) + ": "
    sales.append(BTCInput.read_int(prompt))
print(sales)
Code Analysis: Investigate a List Reading Loop
Examine the code given above and consider the following questions to understand how the list is processed
What is the purpose of the count variable?
- count tracks the number of the current stand as the loop runs. It is used to print the id of the stand we are collecting data for
Why does the range of count go from \(1\) to \(11\)?
- The range function returns a sequence with the start included but the stop excluded. Since we have stands \(1\) through \(10\), we want the range to go from \(1\) to \(11\) so the generated numbers are \(1\) through \(10\)
Which item in the list would hold the sales for stand number \(1\)?
- The first item in the list, i.e. the one at index zero, sales[0]
What part of the code would have to be changed if we instead had \(100\) stands?
- We simply change range(1, 11) to range(1, 101)
- The program below (ReadAndDisplay2.py) is a variant in which the user specifies the number of stands
# Example 8.2.2 Read and Display 2
#
# Improved version of Read and Display which allows the user to specify
# the number of stands
import BTCInput

# create an empty list to populate
sales = []
number_of_stands = BTCInput.read_int("Enter the number of stands: ")
for count in range(1, number_of_stands + 1):
    prompt = "Enter the sales for stand " + str(count) + ": "
    sales.append(BTCInput.read_int(prompt))
print(sales)
- The above is more flexible, but as a result it is more complicated. The trade-off between flexibility and ease of use is one that should be discussed with the users
If I got one sales value wrong, would it be possible to edit the list to put in a corrected version?
- This is not implemented in the current program, but we have already seen that you can reassign the value of a list at a given index, so we could implement this in a more complete program
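A correction feature along these lines might look like the following sketch. It is not part of the original program. The function name correct_sale and its parameters are assumptions made for illustration, and fixed test data stands in for values typed in by the user.

```python
# test data standing in for values typed in by the user
sales = [50, 54, 29, 33, 22]


def correct_sale(sales_list, stand_number, new_value):
    """Replace the sales figure for a 1-based stand number.

    Hypothetical helper; returns True if the correction was applied.
    """
    index = stand_number - 1  # stand 1 is stored at index 0
    if 0 <= index < len(sales_list):
        sales_list[index] = new_value
        return True
    return False  # no such stand, so the list is left unchanged


applied = correct_sale(sales, 3, 30)   # stand 3 should have been 30
rejected = correct_sale(sales, 9, 10)  # there is no stand 9
print(sales)
```

The subtraction of one converts between the stand numbers shown to the user, which start at 1, and the list indices, which start at 0.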
Display a list using a for Loop
We’ve already seen that print has a default way of displaying a list
We can use a for loop if we want custom printing for each item
# Example 8.3 Read and Display Loop
#
# Uses a for loop to provide custom list printing
import BTCInput

sales = []
for count in range(1, 11):
    prompt = "Enter the sales for stand " + str(count) + ": "
    sales.append(BTCInput.read_int(prompt))

# print a heading
print("Sales Figures")
count = 1
for sales_value in sales:
    print("Sales for stand", count, "are", sales_value)
    count = count + 1
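The manual counter in the printing loop can also be replaced by the built-in enumerate function, which pairs each item with a running number. This is an alternative to the approach in the notes rather than the method the notes use, and the short test list is made up for the example.

```python
sales = [50, 54, 29]

# enumerate yields (number, item) pairs; start=1 numbers the stands from 1
for stand_number, sales_value in enumerate(sales, start=1):
    print("Sales for stand", stand_number, "are", sales_value)

# the pairs can also be collected into a list for inspection
numbered = list(enumerate(sales, start=1))
```

Using enumerate removes the risk of forgetting to increment the counter inside the loop.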
Make Something Happen: Read the Names of Guests for a Party
Lists can hold any type of data that you need to store, including strings. You can change the ice-cream sales program to read and store the names of guests for a party or an event you’re planning. Make a modified version of the sales program that reads in some guest names and then displays them. Make your program handle between \(5\) and \(15\) guests
- We basically just copy the previous program with the following changes
- sales \(\rightarrow\) guests
- sales_value \(\rightarrow\) guest
- We change the prompts to refer to guests rather than sales
- The two main changes are
- We add an initial prompt for the number of guests
- We use BTCInput.read_int_ranged to ensure the value is from \(5\) to \(15\)
- We use BTCInput.read_text instead of BTCInput.read_int to get the guest names
# Exercise 8.1 Party Guests
#
# A program that receives and then prints a list of party guests
# Works for between 5 and 15 guests
import BTCInput

guests = []
number_of_guests = BTCInput.read_int_ranged(
    "Enter the number of guests (5-15): ", 5, 15
)
for count in range(1, number_of_guests + 1):
    prompt = "Enter the name of guest " + str(count) + ": "
    guests.append(BTCInput.read_text(prompt))

# print a heading
print("\nGuests attending:")
count = 1
for guest in guests:
    print("- ", guest)
    count = count + 1
Refactor Programs into Functions
The previous examples build up our program as one long chain of events
However, if we think about our program, this isn’t the cleanest structure
- There are two distinct responsibilities occurring
- First we read in the data
- Second we display the data
- These are natural candidates to be converted into functions
By coupling these behaviours the program locks us into one way of processing data
- What happens if we want to read in a second set of data?
- What if we want to print the data multiple times?
Refactoring is the process of restructuring existing code without changing what it does
- Specifically changing how its components are organised and interact
Refactoring avoids the problem of overcomplicating the design at the start of the process
- Instead we write the program in the simplest way we can
- Then once a structure emerges, or we need to add functionality, we can refactor the design
Let us factor out the two key components identified above into a new implementation (Functions.py)
# Example 8.4 Functions
#
# Demonstrates refactoring a program into component functions
import BTCInput

sales = []

def read_sales(number_of_sales):
    """
    Reads in the sales values and stores them in the sales list

    Parameters
    ----------
    number_of_sales : int
        Number of Stores to record sales values for

    Returns
    -------
    None
        Results are read into the sales list
    """
    sales.clear()  # remove existing sales values
    for count in range(1, number_of_sales + 1):
        prompt = "Enter the sales for stand " + str(count) + ": "
        sales.append(BTCInput.read_int(prompt))

def print_sales():
    """
    Prints the sales figures on the screen with a heading. Each figure is
    numbered in sequence

    Returns
    -------
    None
    """
    print("Sales Figures")
    count = 1
    for sales_value in sales:
        print("Sales for stand", count, "are", sales_value)
        count = count + 1

read_sales(10)
print_sales()
Code Analysis: Functions in the Sales Analysis Program
Our sales analysis program now consists of two functions, read_sales and print_sales
What does the parameter for the read_sales function do?
- We hinted in the previous section that the number of stands might change in a future implementation. To support this, read_sales takes as a parameter the number of sales values that it should read
What does clear do?
- We want to start with a fresh list every time we read the sales values
- clear is a method on list objects that removes all of the list’s contents
Why don’t we need to tell the print_sales function how many sales figures to print?
- The for loop goes through the contents of the sales list
- A list tracks its own size
- In some languages like C, arrays do not track their own sizes and we would need to specify them
Why didn’t we have to write global sales in the read_sales function?
- Python variable names are references to objects in memory
- These references are distinct from the objects themselves
- Assignments change what object a reference (variable) refers to
- e.g. sales = []
- However, calling a method on a variable does not change the reference, e.g. sales.append(99) (it changes the object’s contents)
- So we don’t need to use global, because read_sales only calls methods on the list and never reassigns the name sales
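The difference between changing an object and reassigning a name can be seen in a small experiment. The three function names below are hypothetical and exist only for this sketch. They show that a method call changes the shared list, that a plain assignment inside a function only creates a local name, and that global is needed when the module-level name itself must be reassigned.

```python
sales = [1, 2, 3]


def clear_by_method():
    # calls a method on the existing list object, so no global is needed
    sales.clear()


def broken_reset():
    # assignment creates a new LOCAL name; the module-level list is untouched
    sales = []
    return sales


def reset_with_global():
    # global makes the assignment rebind the module-level name
    global sales
    sales = []


broken_reset()
after_broken = list(sales)  # copy of the contents for comparison

clear_by_method()
after_clear = list(sales)

sales.append(99)
reset_with_global()
after_global = list(sales)

print(after_broken, after_clear, after_global)
```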
Create Placeholder Functions
- A development technique called stubs is where we write placeholder functions before we can provide a complete implementation for a given behaviour
- The placeholders are sometimes called stub functions e.g. the two below
def sort_high_to_low():
    """
    Sort the sales list from highest to lowest

    Returns
    -------
    None

    See Also
    --------
    sort_low_to_high : sorts from lowest to highest
    """
    pass

def sort_low_to_high():
    """
    Sort the sales list from lowest to highest

    Returns
    -------
    None

    See Also
    --------
    sort_high_to_low : sorts from highest to lowest
    """
    pass
- Placeholders let us model the flow of the program before we have all the behaviours specified
- Obviously this does not model the complete program since the functions are incomplete
- pass is a keyword for a statement that does nothing
- It is effectively a placeholder statement
Sort Using Bubble Sort
- Sorting is a common task for computing programs
- It can be time-intensive
- There are often multiple ways that we may wish to sort things, e.g.
- Alphabetically vs Numerically
- Increasing vs Decreasing
- Case-sensitive vs Case-insensitive
- Traditional sorts are done one item (or pair of items) at a time
- An algorithm is a sequence of steps that solves a problem
- Sorting Algorithms are algorithms that sort collections
- Programming is really the implementation of an algorithm
- Bubble Sort is a simple sorting algorithm
- Easy to follow and understand
- Not scalable to larger data sets
Initialise a list with Test Data
- Often when implementing an algorithm we want to use a fixed set of test data
- i.e. Data for which we can easily know the desired final state or output
- Allows us to check that our algorithm is correct
- We can define a list in Python with some initial contents,
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Sort a List from High to Low
Index |  0 |  1 |  2 |  3 |  4 |   5 |  6 |  7 |  8 |  9
Value | 50 | 54 | 29 | 33 | 22 | 100 | 45 | 54 | 89 | 75
- The above shows how the test data looks in a python list
- For a highest to lowest sort we want the largest value to be in index \(0\) and the lowest in index \(9\)
- The basic idea of Bubble sort is to compare neighbouring values, if the right value is larger we want to swap them so the larger value is on the left
- Thus closer to the top of the list
Swap Two Values in a Variable
The following code to swap two list items is broken,
if sales[0] < sales[1]:
    # the two items are in the wrong order and must be swapped
    sales[0] = sales[1]
    sales[1] = sales[0]
Why? Let’s work through what happens
- sales[0] is set to the value of sales[1]
- sales[1] is set to the current value of sales[0]
- But sales[0] has already been set to sales[1]
- So sales[1] is set to the same value it already has
The net result is that we only copy sales[1] to sales[0]
The correct implementation is given below,
if sales[0] < sales[1]:
    temp = sales[0]
    sales[0] = sales[1]
    sales[1] = temp
temp is used to store the value of sales[0] before it is overwritten
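Python also offers a shorter way to swap two values using tuple assignment. This is an alternative idiom, not the version used in these notes. It works because the whole right-hand side is evaluated before either item is assigned. The two-item list is made up for the example.

```python
sales = [50, 54]

if sales[0] < sales[1]:
    # both right-hand values are read first, so no temp variable is needed
    sales[0], sales[1] = sales[1], sales[0]

print(sales)
```

The temp version is still worth understanding, since many other languages have no equivalent shortcut.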
Obviously, we don’t want to write the code with explicit reference to indices. However, we can write this generically with a for loop as below
for count in range(0, len(sales) - 1):
    if sales[count] < sales[count + 1]:
        temp = sales[count]
        sales[count] = sales[count + 1]
        sales[count + 1] = temp
Code Analysis: Work through a List using a Loop
The above code uses some new python features. Work through the following questions to understand what’s going on
Why have you used a for loop, rather than a while loop?
- We could use either; the for loop is slightly shorter since we don’t have to manually increment count
- Additionally, range returns a lazy range object rather than a full list
- This is more memory efficient
- Rather than creating a full list of numbers in memory, it produces the next number each time the for loop requests it
What does the len function do on line \(1\)?
- len returns the length of a collection, i.e. the number of items in the collection
- This lets you write code that is independent of the size of the collection being worked with
- This means our sorting code can work on a list of any length
Why is the limit of count the length of the list minus 1?
- This is because bubble sort compares the current item to the item to its right, i.e. at the next index
- If the range went up to the last index, the program would try to access an element one past the end of the list, which doesn’t exist
- This will cause an error, e.g.
a_list = [1, 2]
for count in range(0, len(a_list)):
    if a_list[count] < a_list[count + 1]:
        temp = a_list[count]
        a_list[count] = a_list[count + 1]
        a_list[count + 1] = temp
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[9], line 3
      1 a_list = [1, 2]
      2 for count in range(0, len(a_list)):
----> 3     if a_list[count] < a_list[count + 1]:
      4         temp = a_list[count]
      5         a_list[count] = a_list[count + 1]
IndexError: list index out of range
- The complete implementation below performs one pass through the list
# Example 8.6 Bubble Sort First Pass
#
# Implements the first pass of bubble sort and shows the impact on the list

# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

def sort_high_to_low():
    """
    Perform a single bubble sort pass over the sales list, moving larger
    values towards the front

    Returns
    -------
    None
    """
    for count in range(0, len(sales) - 1):
        if sales[count] < sales[count + 1]:
            temp = sales[count]
            sales[count] = sales[count + 1]
            sales[count + 1] = temp

print("Input list:", sales)
sort_high_to_low()
print("Output list:", sales)
Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Output list: [54, 50, 33, 29, 100, 45, 54, 89, 75, 22]
after which the test data looks like this
Index |  0 |  1 |  2 |  3 |   4 |  5 |  6 |  7 |  8 |  9
Value | 54 | 50 | 33 | 29 | 100 | 45 | 54 | 89 | 75 | 22
- Notice that the list has been partially sorted
- Also notice that the smallest value \(22\) has been moved to the correct index (the end)
- On each pass a high number effectively bubbles one place to the left, past a value smaller than it
- Since after one pass the smallest value has been moved to the end, we expect that after a second pass the second-smallest value will also have been moved to its correct spot
- So we want to make len(sales) passes through the list
- The working bubble sort implementation is then,
# Example 8.7 Bubble Sort Multiple Pass
#
# Implements a complete working version of bubble sort

# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

def sort_high_to_low():
    """
    Sort the sales list from highest to lowest

    Returns
    -------
    None
    """
    for sort_pass in range(0, len(sales)):
        for count in range(0, len(sales) - 1):
            if sales[count] < sales[count + 1]:
                temp = sales[count]
                sales[count] = sales[count + 1]
                sales[count + 1] = temp

print("Input list:", sales)
sort_high_to_low()
print("Output list:", sales)
Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Output list: [100, 89, 75, 54, 54, 50, 45, 33, 29, 22]
Code Analysis: Improving Performance
As seen above, the sorting program now works correctly. Once you have a working implementation, it’s worth investigating whether there are changes you can make to improve the efficiency. Work through the following questions to get the idea
Is the program making more comparisons than necessary?
- Yes, as we mentioned before, after one pass the smallest item will always be at the end of the collection
- This means the inner loop no longer needs to compare anything against it
- After each pass the size of this sorted section increases by at least one
- An implementation taking this into account is,
for sort_pass in range(0, len(sales)):
    for count in range(0, len(sales) - 1 - sort_pass):
        if sales[count] < sales[count + 1]:
            temp = sales[count]
            sales[count] = sales[count + 1]
            sales[count + 1] = temp
Is the program performing more passes through the list than necessary?
- Probably. Unless the largest value starts at the end of the list, all values will be bubbled to their correct spots in fewer than len(sales) passes
- We can stop doing additional passes if we work out the list is already sorted
- How?
- We use a flag to track if any swaps occur in a pass
- If none do then the list is already sorted and we can stop
# Example 8.8 Efficient Bubble Sort
#
# A bubble sort implementation incorporating efficiency savings to the number
# of comparisons and passes through the list

# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

def sort_high_to_low():
    """
    Sort the sales list from highest to lowest

    Returns
    -------
    None
    """
    for sort_pass in range(0, len(sales)):
        done_swap = False
        for count in range(0, len(sales) - 1 - sort_pass):
            if sales[count] < sales[count + 1]:
                temp = sales[count]
                sales[count] = sales[count + 1]
                sales[count + 1] = temp
                done_swap = True
        if not done_swap:
            break

print("Input list:", sales)
sort_high_to_low()
print("Output list:", sales)
Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Output list: [100, 89, 75, 54, 54, 50, 45, 33, 29, 22]
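The effect of the two savings discussed above can be measured by counting how many comparisons each variant makes. The sketch below is an instrumented version written for this purpose, not code from the book. The function name bubble_sort_high_to_low and the shrink and early_exit switches are assumptions made for the illustration, and the function sorts a copy so the same test data can be reused.

```python
def bubble_sort_high_to_low(values, shrink=False, early_exit=False):
    """Sort a copy of values from high to low and count the comparisons.

    shrink and early_exit are hypothetical switches for the two savings
    discussed above. Returns (sorted_list, number_of_comparisons).
    """
    items = list(values)  # work on a copy so the input is left untouched
    comparisons = 0
    for sort_pass in range(0, len(items)):
        done_swap = False
        # with shrink, skip the already-sorted tail of the list
        limit = len(items) - 1
        if shrink:
            limit = limit - sort_pass
        for count in range(0, limit):
            comparisons = comparisons + 1
            if items[count] < items[count + 1]:
                temp = items[count]
                items[count] = items[count + 1]
                items[count + 1] = temp
                done_swap = True
        # with early_exit, stop as soon as a pass makes no swaps
        if early_exit and not done_swap:
            break
    return items, comparisons


sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

basic, basic_count = bubble_sort_high_to_low(sales)
shrunk, shrunk_count = bubble_sort_high_to_low(sales, shrink=True)
efficient, efficient_count = bubble_sort_high_to_low(
    sales, shrink=True, early_exit=True
)

print("Comparisons:", basic_count, shrunk_count, efficient_count)
```

For a list of ten items the basic version always makes ten passes of nine comparisons each, while the shrinking version makes at most 9 + 8 + ... + 1 comparisons. How much the early exit saves on top of that depends on how close to sorted the data already is.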
Make Something Happen: Sort Alphabetically
Bubble sort works for strings as well as integers. We saw in Chapter 5 that the Python relational operators also work for strings. See if you can modify the Party Guest Program to display the names in alphabetical order
We can basically just reuse our sort code, but renamed for the guest program.
def sort_alphabetical():
    """
    Sorts the guests list alphabetically

    Returns
    -------
    None
    """
    for sort_pass in range(0, len(guests)):
        done_swap = False
        for count in range(0, len(guests) - 1 - sort_pass):
            if guests[count] > guests[count + 1]:
                temp = guests[count]
                guests[count] = guests[count + 1]
                guests[count + 1] = temp
                done_swap = True
        if not done_swap:
            break
There is a second modification above, which is changing the direction of the relational operator, e.g.
guests[count] < guests[count + 1]
has been changed to,
guests[count] > guests[count + 1]
This is because, as originally written, the sort puts the smallest items last. For strings the relational operators compare alphabetically, so the original condition would put strings starting with a after those starting with z. We therefore need to flip the operator so that the list is ordered a, b, … , z.
Why don’t we have to make more modifications? Well the code as written only requires that the items being sorted are stored in a list, and that the items in the list can be compared with a relational operator. Both of these properties are satisfied by a collection of strings so the code effectively works out of the box
The complete code, including the integration with reading and printing the guest list is given in SortAlphabetically.py
Sort a List from Low to High
- To flip the direction of the sort, we just need to change the condition that determines whether two items are out of order
We do this by changing \(<\) to \(>\), i.e.
# Example 8.9 Bubble Sort Low to High
#
# Implementation of Bubble Sort that sorts from low to high

# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

def sort_low_to_high():
    """
    Sort the sales list from lowest to highest

    Returns
    -------
    None
    """
    for sort_pass in range(0, len(sales)):
        done_swap = False
        for count in range(0, len(sales) - 1 - sort_pass):
            if sales[count] > sales[count + 1]:
                temp = sales[count]
                sales[count] = sales[count + 1]
                sales[count + 1] = temp
                done_swap = True
        if not done_swap:
            break

print("Input list:", sales)
sort_low_to_high()
print("Output list:", sales)
Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Output list: [22, 29, 33, 45, 50, 54, 54, 75, 89, 100]
- The code above is given in BubbleSortLowToHigh.py
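Writing bubble sort by hand is a good exercise, but in everyday code Python’s built-in sorting is the usual choice. The sketch below is a comparison with the built-ins rather than part of the sales program. It uses the same test data with sorted, which returns a new list, and list.sort, which sorts in place.

```python
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

# sorted returns a new list and leaves the original unchanged
low_to_high = sorted(sales)
high_to_low = sorted(sales, reverse=True)

# list.sort sorts the list in place and returns None
sales_copy = list(sales)
sales_copy.sort()

print(low_to_high)
print(high_to_low)
```

A handy use for the built-ins while learning is as a check on your own sorting code: the hand-written bubble sort should produce the same result as sorted on any test data.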
Find the Highest and Lowest Sales Values
In comparison to sorting, finding a value is much easier
The basic outline for finding the highest is,
for each value in the collection:
    if value > highest seen so far:
        highest = value
We can then write the code for the highest and lowest in Python as,
highest = sales[0]
for sales_value in sales:
    if sales_value > highest:
        highest = sales_value

lowest = sales[0]
for sales_value in sales:
    if sales_value < lowest:
        lowest = sales_value
If we want to find both at the same time, then we can combine the code above, which means we only have to do one pass through the collection
# Example 8.10 Highest and Lowest
#
# Function that finds the highest and lowest value in a collection

# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

def highest_and_lowest():
    """
    Print out the highest and lowest elements of a sales list

    Returns
    -------
    None
    """
    highest = sales[0]
    lowest = sales[0]
    for sales_value in sales:
        if sales_value > highest:
            highest = sales_value
        elif sales_value < lowest:
            lowest = sales_value
    print("The highest is:", highest)
    print("The lowest is", lowest)

print("Input list:", sales)
highest_and_lowest()
Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
The highest is: 100
The lowest is 22
- The code above is given in HighestAndLowest.py
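Python also provides the built-in functions max and min, which perform the same search. The sketch below uses them on the same test data as a cross-check of the hand-written loop. It is an alternative, not the approach taken in the notes.

```python
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

# max and min each make their own pass through the list
highest = max(sales)
lowest = min(sales)

print("The highest is:", highest)
print("The lowest is", lowest)
```

Both the built-ins and the hand-written version assume the list is not empty: max and min raise a ValueError on an empty list, and sales[0] raises an IndexError.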
Evaluate Total and Average Sales
To evaluate the total we have to sum the contents of a list, which is simple using the for loops we’ve looked at (implementation in TotalSales.py)
# Example 8.11 Total Sales
#
# Calculate the Total Sales

# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

def total_sales():
    """
    Print out the total sales of a sales list

    Returns
    -------
    None
    """
    total = 0
    for sales_value in sales:
        total = total + sales_value
    print("Total sales are:", total)

print("Input list:", sales)
total_sales()
Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Total sales are: 551
It is a simple extra step to then calculate the average (divide the total by the number of elements in the collection)
# Example 8.12 Average Sales
#
# Calculate the Average Sales

# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

def average_sales():
    """
    Print out the average sales of a sales list

    Returns
    -------
    None
    """
    total = 0
    for sales_value in sales:
        total = total + sales_value
    # named average so the local variable does not reuse the function's name
    average = total / len(sales)
    print("Average sales are:", average)

print("Input list:", sales)
average_sales()
Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Average sales are: 55.1
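The same totals can be computed with the built-in sum function. The sketch below also guards against an empty list, which would otherwise cause a ZeroDivisionError when dividing by len. The helper name average_of, and the choice to return None for an empty list, are assumptions made for this illustration.

```python
def average_of(values):
    """Return the mean of values, or None for an empty list.

    Hypothetical helper for illustration only.
    """
    if len(values) == 0:
        return None  # dividing by zero items would raise ZeroDivisionError
    return sum(values) / len(values)


sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

total = sum(sales)
average = average_of(sales)

print("Total sales are:", total)
print("Average sales are:", average)
```

Unlike the functions in the notes, average_of returns its result instead of printing it, which makes the value available for further calculations.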
Complete the Program
- The previous Exercises have given us all the parts, now we want to put it together
- The crux of our program should be a loop around the menu through which the user selects different functions
- We first however need to read in the data from the user
- For usability we should add the ability to quit the program
- The final program implements this
# Example 8.13 Complete Program
#
# A Complete implementation of the Sales Program combining all the individual
# programs that we have implemented
import BTCInput

sales = []

def read_sales(number_of_sales):
    """
    Reads in the sales values and stores them in the sales list

    Parameters
    ----------
    number_of_sales : int
        Number of Stores to record sales values for

    Returns
    -------
    None
        Results are read into the sales list
    """
    sales.clear()  # remove existing sales values
    for count in range(1, number_of_sales + 1):
        prompt = "Enter the sales for stand " + str(count) + ": "
        sales.append(BTCInput.read_int(prompt))

def print_sales():
    """
    Prints the sales figures on the screen with a heading. Each figure is
    numbered in sequence

    Returns
    -------
    None
    """
    print("Sales Figures")
    count = 1
    for sales_value in sales:
        print("Sales for stand", count, "are", sales_value)
        count = count + 1

def sort_high_to_low():
    """
    Sort the sales list from highest to lowest

    Returns
    -------
    None

    See Also
    --------
    sort_low_to_high : sorts from lowest to highest
    """
    for sort_pass in range(0, len(sales)):
        done_swap = False
        for count in range(0, len(sales) - 1 - sort_pass):
            if sales[count] < sales[count + 1]:
                temp = sales[count]
                sales[count] = sales[count + 1]
                sales[count + 1] = temp
                done_swap = True
        if not done_swap:
            break

def sort_low_to_high():
    """
    Sort the sales list from lowest to highest

    Returns
    -------
    None

    See Also
    --------
    sort_high_to_low : sorts from highest to lowest
    """
    for sort_pass in range(0, len(sales)):
        done_swap = False
        for count in range(0, len(sales) - 1 - sort_pass):
            if sales[count] > sales[count + 1]:
                temp = sales[count]
                sales[count] = sales[count + 1]
                sales[count + 1] = temp
                done_swap = True
        if not done_swap:
            break

def highest_and_lowest():
    """
    Print out the highest and lowest elements of a sales list

    Returns
    -------
    None
    """
    highest = sales[0]
    lowest = sales[0]
    for sales_value in sales:
        if sales_value > highest:
            highest = sales_value
        elif sales_value < lowest:
            lowest = sales_value
    print("The highest is:", highest)
    print("The lowest is", lowest)

def total_sales():
    """
    Print out the total sales of a sales list

    Returns
    -------
    None
    """
    total = 0
    for sales_value in sales:
        total = total + sales_value
    print("Total sales are:", total)

def average_sales():
    """
    Print out the average sales of a sales list

    Returns
    -------
    None
    """
    total = 0
    for sales_value in sales:
        total = total + sales_value
    # named average so the local variable does not reuse the function's name
    average = total / len(sales)
    print("Average sales are:", average)

# Get initial sales list
read_sales(10)

menu = """
Ice Cream Sales
0. Quit the Program
1. Print the Sales
2. Sort High to Low
3. Sort Low to High
4. Highest and Lowest
5. Total Sales
6. Average Sales
7. Enter Figures
Enter your command: """

while True:
    command = BTCInput.read_int_ranged(menu, 0, 7)
    if command == 0:
        break
    if command == 1:
        print_sales()
    elif command == 2:
        sort_high_to_low()
    elif command == 3:
        sort_low_to_high()
    elif command == 4:
        highest_and_lowest()
    elif command == 5:
        total_sales()
    elif command == 6:
        average_sales()
    elif command == 7:
        read_sales(10)
    else:
        raise ValueError("Unexpected value " + str(command) + " found")
Keeping Information Synchronised when Sorting
Playing around with the program you might notice one thing. The stands are numbered in the order that they are printed. This works well for printing the original list, but once we sort the list the printed stand numbers no longer match the stands the figures came from. This is fine if we only care about the sales figures, but if we want to maintain the relationship between a stand and its sales the program would have to be modified.
This is something you would discuss with the client
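One possible modification, sketched here under the assumption that a second list is kept alongside sales: whenever the bubble sort swaps two sales figures, it applies the same swap to the matching stand numbers. The names stand_numbers and sort_low_to_high_with_stands are hypothetical and do not appear in the original program.

```python
# Hypothetical sketch: stand_numbers is an assumed companion list, not part
# of the original program. Index i of both lists describes the same stand.
sales = [50, 54, 29, 33, 22]
stand_numbers = [1, 2, 3, 4, 5]


def sort_low_to_high_with_stands():
    """Bubble sort sales into ascending order, keeping stand_numbers in step."""
    for sort_pass in range(len(sales)):
        done_swap = False
        for count in range(len(sales) - 1):
            if sales[count] > sales[count + 1]:
                # swap the sales figures
                sales[count], sales[count + 1] = sales[count + 1], sales[count]
                # apply the same swap to the stand numbers
                stand_numbers[count], stand_numbers[count + 1] = (
                    stand_numbers[count + 1],
                    stand_numbers[count],
                )
                done_swap = True
        if not done_swap:
            break


sort_low_to_high_with_stands()
for stand, sales_value in zip(stand_numbers, sales):
    print("Stand", stand, "sold", sales_value)
```

Whether the client wants this behaviour at all is still a question for the specification; the sketch only shows that it is cheap to add.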
Store Data in a File
A natural extension to the program would be the ability to read or store the sales data to a file
Files allow for persisting the data between sessions
To do this we’ll add two new options,
8. Save Sales
and
9. Load Sales
Let us start by stubbing out our functions (the complete integration is found in LoadAndSave.py),
def save_sales(file_path):
    """
    Saves the contents of the sales list to a file
    Parameters
    ----------
    file_path : str
        string giving the file path to save to
    Returns
    -------
    None
    Raises
    ------
    FileException
        Raised if the save fails
    See Also
    --------
    load_sales : load sales from a sales list file
    """
    print("Save the sales in:", file_path)
def load_sales(file_path):
    """
    loads the contents of a file into the sales list
    Parameters
    ----------
    file_path : str
        string giving the file path to load from
    Returns
    -------
    None
    Raises
    ------
    FileException
        Raised if the load fails
    See Also
    --------
    save_sales : save the sales list into a file
    """
    print("Load the sales in:", file_path)
We also add a basic integration to the user menu, where we use BTCInput.read_text to get a file name, then call the function
Observe that by adding the complete docstrings we’re also starting to document the requirements for these functions in-code
    elif command == 7:
        read_sales(10)
    elif command == 8:
        file_to_save_to = BTCInput.read_text("Enter file to save to: ")
        save_sales(file_to_save_to)
    elif command == 9:
        file_to_load_from = BTCInput.read_text("Enter file to load: ")
        load_sales(file_to_load_from)
    else:
        raise ValueError("Unexpected value " + str(command) + " found")
Write into a File
When interacting with a file, Python represents it as an object in memory
- Technically this object represents the connection to the file
open creates a connection to a file. The statement below opens a file, test.txt, in write mode (w) and stores it in the variable output_file
output_file = open('test.txt', 'w')
- The two arguments are called the file_path and the mode
- file_path is the file you want to open
- mode is what you want to do with it
It’s very easy to overwrite an existing file
The open function will not prevent you from modifying important files. For example files opened for write will first wipe the contents of any existing file that matches the path then write the new contents.
Python provides the os module, which has some extra functionality for handling files and directories. For example, you can check whether a file already exists, and so ask the user to confirm the overwrite, before you open it
import os.path
if os.path.isfile("test.txt"):
    print("The file exists")
If we’ve opened a file in write mode, we can use the write method on the file object to write to the file
output_file.write("First line\n")
output_file.write("Second line\n")
output_file.close()
Once you’re done with a file you need to call close
- Completes any unfinished writes (ensures data integrity)
- Releases the file so other programs or processes can use it
- On some operating systems a file open for writing is locked by that process, so nothing else can use it until it is closed
Putting everything together our simple file writing program is,
# Exercise 8.15 File Output
#
# A simple program to demonstrate opening and writing to a file
output_file = open("test.txt", "w")
output_file.write("line 1\n")
output_file.write("line 2\n")
output_file.close()
Code Analysis: File Writing
Consider the following questions about file writing
Why have you called the write function a method? Isn’t it a function?
- As discussed earlier, methods are functions associated with a specific object
- Typically when we say function we refer to a function that is defined outside of an object
- write is a method on the file object
- It is impossible to use write without there being a file object to use
- Methods allow us to work with multiple file objects without having to worry about making sure we pass the correct one to the function
What does the \n mean at the end of the strings?
- It’s the new line symbol
- write doesn’t automatically end the line after we call it
- We have to manually pass the new line
Where is the file test.txt actually created?
The file_path is relative to the current working directory, which is often the folder containing the running Python program
Hence the file is usually written to the same directory as the program
- E.g. if we had a folder called “My Programs” with a python program “MakeFiles.py”, when we run “MakeFiles.py” from that folder the files it makes are stored in “My Programs”
You can use more complicated file_paths
- path = "./data/test.txt" would look for test.txt in the data subdirectory of the current directory (relative path)
- path = "c:/data/test.txt" would look for test.txt in the data directory at the top level of the c drive (absolute path)
Note: Denoting a Directory Separator
On Windows \ is used to separate directories, but in Python you can always use /, which works on every operating system
Can any program use a file written from a Python program?
- Yes, Python uses the underlying operating system’s file handling services
- Any other program on the operating system can access files created or modified by Python
Can I add lines at the end of an existing file from Python?
- Yes, rather than open the file in write mode (w), you open the file in append mode (a)
- Any writes will then be appended to the end of the file
- A non-existent file will be created the same way as for write mode
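The difference between the two modes can be seen in a short sketch. The file name append_demo.txt and the use of a temporary folder are assumptions made so the example never touches real files.

```python
import os
import tempfile

# Hypothetical file in a throwaway folder, so no real data is overwritten
folder = tempfile.mkdtemp()
demo_path = os.path.join(folder, "append_demo.txt")

# "w" creates the file (or wipes an existing one) before writing
output_file = open(demo_path, "w")
output_file.write("line 1\n")
output_file.close()

# "a" keeps the existing contents and adds new writes at the end
output_file = open(demo_path, "a")
output_file.write("line 2\n")
output_file.close()

# read everything back to see both lines
input_file = open(demo_path, "r")
contents = input_file.read()
input_file.close()
print(contents)
```

Opening the same path with "w" a second time instead of "a" would leave only the second line in the file.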
Write the Sales Figures
Using the above discussion we can implement the save_sales function
# Example 8.16 Write Sales
#
# Implements the Write Sales function
# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
def save_sales(file_path):
    """
    Saves the contents of the sales list to a file
    Parameters
    ----------
    file_path : str
        string giving the file path to save to
    Returns
    -------
    None
    Raises
    ------
    FileException
        Raised if the save fails
    """
    print("Save the sales in: ", file_path)
    output_file = open(file_path, "w")
    for sale in sales:
        output_file.write(str(sale) + "\n")
    output_file.close()
save_sales("test_output.txt")
Code Analysis: The save_sales Function
The save_sales function combines several behaviours and is worth examining in detail. What is the purpose of the function? To take a list of sales figures and write those figures to a file (preferably in a format that is easy for a human to read and to load back into the program.) Consider the following questions
What does the str function do? Why are we using it?
- The str function converts the sales number to a string
- While print can handle non-string inputs, write can only take a string
Why can’t we just write out the sales list as one object?
- A list does not provide any built-in methods for writing an object out to a file
- We could try to write out its string representation (i.e. call str and output that)
- That doesn’t give us much control over the way the data is output
Read from a File
We can also use open to read from a file, we just use the read mode (r)
input_file = open("test.txt", "r")
We can then loop over the lines in a file using a for loop
for line in input_file:
    print(line)
We should still use close() when we’re done reading
input_file.close()
The complete sample program looks like,
# Example 8.17 File Input
#
# Demonstrates reading input from a file
input_file = open("test.txt", "r")
for line in input_file:
    print(line)
input_file.close()
Code Analysis: Reading from Files
Work through the following questions to understand how reading from files works
If you look at the following output, you’ll notice there are empty lines after each line of text. Why is that?
line 1

line 2

Every time we read a line from a file, we read the terminating new line
This is included in the string stored in line so when we call print we get that new line and the new line added by print
We could fix this by modifying our print call, to remove the new line
print(line, end='')
A more natural way to fix this is to remove the newline when we first read in the string
The strip method when called without arguments returns a copy of the string with all leading and trailing whitespace removed
line = line.strip()
- This is an example of conditioning input
- Process of making sure that an input does not contain any unexpected values
- E.g. we might also want to use strip to remove non-printable characters
- lstrip and rstrip are variants of strip that only work on the start or end of the string respectively
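A small sketch of the three variants, using a made-up line of the kind that might be read from a sales file. repr is used so that the remaining whitespace is visible in the output.

```python
raw_line = "  42 \n"  # hypothetical line as it might be read from a file

print(repr(raw_line.strip()))   # whitespace removed from both ends
print(repr(raw_line.lstrip()))  # whitespace removed from the start only
print(repr(raw_line.rstrip()))  # whitespace removed from the end only

# the conditioned string can then be converted to a number
sale = int(raw_line.strip())
print(sale)
```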
Why do we have to close the file we’re reading?
- When only reading, forgetting to close a file won’t cause issues for other programs or processes that also try to read from it
- However, closing it lets other programs write to that file
- It also releases the memory associated with holding the connection
- Your computer might not let you shut down if it thinks there are still unclosed files
What would happen if you tried to write to a file that had been opened for reading?
- An exception will be raised
- r+ is a mode that lets you read and write to a file
- You typically don’t want to read and write to a file at the same time
- Hard to ensure the integrity of the data and avoid corrupting it
- Such as by writing a line longer than the one previously written
- this may corrupt the next line
- A better pattern is to load data, update the data then write that back into the file
- A temporary file (often abbreviated as a tmp file) can be used if we need an intermediate file to write to
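The load, update, write-back pattern might look like the following sketch. The file names sales.txt and sales.tmp are hypothetical, and os.replace (a standard library call that renames one file over another) is used for the final swap; this is one reasonable way to do it rather than the approach prescribed by these notes.

```python
import os
import tempfile

folder = tempfile.mkdtemp()  # throwaway folder for the sketch
data_path = os.path.join(folder, "sales.txt")  # hypothetical data file
tmp_path = os.path.join(folder, "sales.tmp")   # hypothetical temporary file

# create a small data file to work on
output_file = open(data_path, "w")
output_file.write("50\n54\n29\n")
output_file.close()

# 1. load the data
figures = []
input_file = open(data_path, "r")
for line in input_file:
    figures.append(int(line.strip()))
input_file.close()

# 2. update the data in memory
figures[1] = 1000

# 3. write everything to the temporary file, then swap it into place
output_file = open(tmp_path, "w")
for figure in figures:
    output_file.write(str(figure) + "\n")
output_file.close()
os.replace(tmp_path, data_path)
```

Because the original file is only replaced after the temporary file has been written and closed, a failure part-way through the write leaves the original data untouched.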
Can a program read an entire file at once?
- Yes, the read method by default will try to read an entire file
- line endings are preserved
- Be careful with large files, as this may overwhelm your computer’s memory…
# Example 8.18 File Read
#
# Demonstrates the use of file_object.read to read
# the contents of a file in one go
input_file = open("test.txt", "r")
total_file = input_file.read()
print(total_file)
input_file.close()
Read the Sales Figures
Let’s now implement load_sales
# Example 8.19 Load Sales
#
# Implements the Load Sales function
sales = []
def load_sales(file_path):
    """
    loads the contents of a file into the sales list
    Parameters
    ----------
    file_path : str
        string giving the file path to load from
    Returns
    -------
    None
    Raises
    ------
    FileException
        Raised if the load fails
    """
    print("Load the sales in:", file_path)
    sales.clear()
    input_file = open(file_path, "r")
    for line in input_file:
        line = line.strip()
        sales.append(int(line))
    input_file.close()
Code Analysis: The load_sales Function
load_sales works as the opposite of save_sales: instead of taking a sales list and putting it into a text file, we pull the figures from a file and load them into the sales list. Consider the following questions
What does the int function do?
- The numbers pulled out of the file are initially stored as strings
- We need to convert them to numbers, so we call int
What happens if the input file was empty?
- The function works as one would hope
- The loop doesn’t iterate and we get an empty sales list
Deal with File Errors
- Dealing with files also means dealing with the errors they can introduce
- e.g. A file might have been deleted, a USB drive removed, or the user might simply pass the wrong name
- When an error occurs we want to ensure two things:
- No files are left open
- The user is aware that the error has occurred
- File objects typically raise exceptions when their methods fail
- Enables us to handle and report on their errors
- Use the try ... except syntax we’ve seen before
try:
    output_file = open(file_path, "w")
    for sale in sales:
        output_file.write(str(sale) + "\n")
    output_file.close()
    print("File Written Successfully")
except:
    print("Something went wrong with the file")
Code Analysis: Dealing with File Handling Exceptions
The code performing the file write is wrapped in a try...except block. If write, open or close causes an exception it will be caught and handled by the except clause. Let’s work through the following questions to see if this ensures that the file is closed and the user is informed
In what circumstances will the code in the except part be executed?
- If any of the file functions, write, open, or close raise an exception, the code in the except part will be executed
- An error message is thus only printed when an error occurs
In what circumstances will the “File Written Successfully” message be printed?
- This is only printed if every step in the file writing process is completed successfully
An error message is always printed if an error is thrown, but will the file always be closed?
- No, this is a problem, as we said that all files needed to be closed even when an error occurs!
- We could put the close statement in the exception handling section too, but a more general solution to this problem is to use a finally block
- A finally block contains code that is always executed after all of the try and/or except code has executed
- Good for code that we naturally want to run after the block no matter if the process succeeds or fails (such as clean-up)
output_file = None
try:
    output_file = open(file_path, "w")
    for sale in sales:
        output_file.write(str(sale) + "\n")
except:
    print("Something went wrong with writing to the file")
finally:
    # output_file is still None if open itself failed
    if output_file is not None:
        output_file.close()
Use the with Construction to Tidy up File Access
- It would be great if we didn’t have to remember to manually ensure a file gets closed
- Failing to properly close a file can lead to hard to pin down behaviour
Intermittent Faults are the Worst Kind to Fix
A piece of code that is broken all the time is annoying, but at least you can typically easily identify what is not working. If a program fails only some of the time this can be much harder to solve. Often you require precise directions as to the steps taken up to the point of failure in order to be able to attempt to replicate the problem. This adds significant overhead to fixing the problem
- The with construct allows the programmer to automatically manage the acquisition and release of resources
- More generic than just file access
- You can write your own services to work with with
- Advanced topic we can ignore for now
Figure: Breakdown of a with statement. The parts, in order, are: with (start of a with block), expression (expression generating the resource to use), as, name (name to represent the resource), a colon, and the statement block (statements).
- with is used to provide an object that provides a service
- as is used to assign a semantically meaningful name to the resource
- with activates an “enter” behaviour on its object
- For files, open has already opened the file, so entering simply hands the open file to the block
- When the block is finished, with calls some exit behaviour on the object
- For files this causes the file to be closed
- with allows us to ensure a few things
- The file is always closed
- The file only stays open for as long as we are using it (the name still exists afterwards, but it refers to a closed file)
# Example 8.20 Using with to Access Files
#
# Rewrites save_sales and load_sales to use the with functionality
# implemented in python
# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
def save_sales(file_path):
    """
    Saves the contents of the sales list to a file
    Parameters
    ----------
    file_path : str
        string giving the file path to save to
    Returns
    -------
    None
    Raises
    ------
    FileException
        Raised if the save fails
    See Also
    --------
    load_sales : load sales from a given file
    """
    print("Save the sales in:", file_path)
    try:
        with open(file_path, "w") as output_file:
            for sale in sales:
                output_file.write(str(sale) + "\n")
    except:  # noqa: E722
        print("Something went wrong with the file")
def load_sales(file_path):
    """
    loads the contents of a file into the sales list
    Parameters
    ----------
    file_path : str
        string giving the file path to load from
    Returns
    -------
    None
    Raises
    ------
    FileException
        Raised if the load fails
    See Also
    --------
    save_sales : save sales to a file
    """
    print("Load the sales in:", file_path)
    sales.clear()
    try:
        with open(file_path, "r") as input_file:
            for line in input_file:
                line = line.strip()
                sales.append(int(line))
    except:  # noqa: E722
        print("Something went wrong with the file")
print("Sales before save and load:", sales)
save_sales("test.txt")
load_sales("test.txt")
print("Sales after save and load:", sales)
Observe that we no longer have to explicitly include the close call
with does not handle exceptions however, so we still have to include a try...except block
When an exception occurs the with first releases the resource with its exit behaviour
- e.g. closes the file
- Then the execution moves to the except block
If we wanted to handle exceptions without releasing the resource, we would have to swap the order to,
with open(file_path, "w") as output_file:
    try:
        pass  # do standard thing here
    except:
        pass  # handle exception without releasing resource
    finally:
        pass  # do something regardless of success or fail without releasing resource
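The bare except used in these examples reports every failure in the same way. A possible refinement, offered as a sketch rather than as the implementation used in these notes, is to catch the built-in exceptions that file handling actually raises: FileNotFoundError when the path does not exist, the more general OSError for other file problems, and ValueError when int cannot convert a line. The function name load_sales_checked and its return-a-list design are assumptions made for illustration.

```python
import os
import tempfile


def load_sales_checked(file_path):
    """Return the sales figures stored in file_path, or None if loading fails."""
    loaded = []
    try:
        with open(file_path, "r") as input_file:
            for line in input_file:
                loaded.append(int(line.strip()))
    except FileNotFoundError:
        print("No such file:", file_path)
        return None
    except OSError:
        print("The file could not be read:", file_path)
        return None
    except ValueError:
        print("The file contains a line that is not a whole number")
        return None
    return loaded


# Build two small hypothetical files in a throwaway folder to try it out
folder = tempfile.mkdtemp()
good_path = os.path.join(folder, "good.txt")
bad_path = os.path.join(folder, "bad.txt")
with open(good_path, "w") as output_file:
    output_file.write("50\n54\n")
with open(bad_path, "w") as output_file:
    output_file.write("50\nfifty-four\n")

print(load_sales_checked(good_path))
print(load_sales_checked(bad_path))
print(load_sales_checked(os.path.join(folder, "missing.txt")))
```

FileNotFoundError is listed before OSError because it is a more specific kind of OSError; Python uses the first except clause that matches.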
Make Something Happen: Record a List with a save Function
Add a save function to your party guest program so that you can record a list of people who attended your party
We build off our version that generates a sorted list. We can basically copy the save_sales function, changing it to refer to the guests list instead of sales and giving the loop variable a more appropriate name.
def save(file_path):
"""
Saves the guest list to a file
Parameters
----------
file_path : str
string giving the file path to save to
Returns
-------
None
Raises
------
FileException
Raised if the save fails
"""
print("Save the guest list in:", file_path)
try:
with open(file_path, "w") as output_file:
for guest in guests:
output_file.write(str(guest) + "\n")
except: # noqa: E722
        print("Something went wrong with the file")
We then run the program as normal
- Ask for the number of guests
- Read in the guests
- Sort the guest list
- Display the guest list
We then ask the user if they want to save the guest list. For simplicity we use BTCInput.read_int_ranged to ask for a \(0\) or a \(1\), where \(1\) indicates the user wishes to save and \(0\) indicates they don’t. If the user wishes to save, we prompt them for a file name using BTCInput.read_text and then call save on the given file path
user_wants_to_save = BTCInput.read_int_ranged(
    "Would you like to save the list? (1 for yes, 0 for no): ", min_value=0, max_value=1
)
if user_wants_to_save:
    save_file_name = BTCInput.read_text("Enter file name to save as: ")
    save(save_file_name)
- The complete integrated code is given in GuestListWithSave.py
Store Tables of Data
- A list holds data in one dimension, i.e. its length
- Often data is multi-dimensional
- e.g. Our Ice Cream Sales client might now ask for the ability to track sales, by store and by day of the week
Data Table
          Monday  Tuesday  Wednesday  ...
Stand 1   50      80       10         ...
Stand 2   54      98       7          ...
Stand 3   29      40       80         ...
...
Our current implementation is effectively a vertical slice for one of the days
Can implement multiple lists, one per day of the week
- Effectively repeats the problem we had before of a distinct named variable for each item
We want a list of lists
mon_sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
tue_sales = [80, 98, 40, 43, 43, 80, 50, 60, 79, 30]
wed_sales = [10, 7, 80, 43, 48, 82, 33, 55, 83, 80]
thu_sales = [15, 20, 38, 10, 36, 50, 20, 26, 45, 20]
fri_sales = [20, 25, 47, 18, 56, 70, 30, 36, 65, 28]
sat_sales = [122, 140, 245, 128, 156, 163, 90, 140, 150, 128]
sun_sales = [100, 130, 234, 114, 138, 156, 107, 132, 134, 148]
week_sales = [mon_sales, tue_sales, wed_sales, thu_sales, fri_sales, sat_sales, sun_sales]
Think of lists of lists as a collection of rows and columns
We first specify the row we want, say tue_sales
Then the column, say Stand 1
print(week_sales[1][0])
80
Code Analysis: Inadequate Index Values
It can be difficult to get the hang of working with multiple indices. Which of the following indices would fail when the program runs?
Statement 1: week_sales[0][0] = 50
Statement 2: week_sales[8][7] = 88
Statement 3: week_sales[7][10] = 100
- Statement 1 is valid
- Statement 2 is invalid because the first index \(8\) corresponds to the day of the week
- The valid indices here are \(0\) to \(6\)
- Statement 3 is also invalid for the same reason
- Even though there are seven days of the week
- The list is zero indexed
- Its second index \(10\) is out of range too, since the ten stands have indices \(0\) to \(9\)
Let’s see this in action
- Statement 1:
week_sales[0][0]
50
- Statement 2:
week_sales[8][7]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[20], line 1
----> 1 week_sales[8][7]
IndexError: list index out of range
- Statement 3:
week_sales[7][10]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[21], line 1
----> 1 week_sales[7][10]
IndexError: list index out of range
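Rather than waiting for an IndexError, a program can compare the indices against len before using them. The helper below is a hypothetical sketch (is_valid_index is not part of the original code) and uses a cut-down table so the results are easy to check by hand.

```python
# Two days of hypothetical data: rows are days, columns are stands
week_sales = [
    [50, 54, 29],
    [80, 98, 40],
]


def is_valid_index(table, row, column):
    """Return True if table[row][column] exists (negative indices are rejected)."""
    if row < 0 or row >= len(table):
        return False
    return 0 <= column < len(table[row])


print(is_valid_index(week_sales, 0, 0))  # first day, first stand
print(is_valid_index(week_sales, 2, 0))  # there is no third day
print(is_valid_index(week_sales, 1, 3))  # there is no fourth stand
```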
Make it easy to test your program
Testing is important, but unless it’s easy or automatic it tends to fall by the wayside.
In a program one might use a function make_test_data or, for larger projects, a test framework to generate test data.
Whenever you find yourself repeating a pattern to test code, consider how you can automate or bypass that process
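As an illustration, a make_test_data function for the weekly sales table might look like the sketch below. The parameter names, the 0 to 250 range and the fixed seed are all assumptions; the seed is there so that every test run sees the same figures.

```python
import random


def make_test_data(days=7, stands=10, seed=1):
    """Return a days x stands table of made-up sales figures.

    Hypothetical helper: the defaults and value range are assumptions.
    """
    generator = random.Random(seed)  # private generator, repeatable results
    table = []
    for _ in range(days):
        day_sales = []
        for _ in range(stands):
            day_sales.append(generator.randint(0, 250))
        table.append(day_sales)
    return table


week_sales = make_test_data()
print(len(week_sales), "days of", len(week_sales[0]), "stands")
```

Calling make_test_data instead of typing seventy figures by hand makes it far more likely that the program actually gets tested after each change.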
Use Loops to Work with Tables
We can use nested for loops to work through individual values in a list of lists
E.g. if we want to calculate the total sales over a week, (full code given in TablesOfSaleData.py)
total_sales = 0
for day_sales in week_sales:
    for sales_value in day_sales:
        total_sales = total_sales + sales_value
print("Total sales for the week are", total_sales)
Total sales for the week are 5205
- day_sales in the outer loop iterates over each constituent list in the list of lists
- sales_value is then each value in the current list referenced by day_sales
Code Analysis: Loop Counting
Consider the code for summing the sales data in the previous example. Answer the following questions to make sure you understand how it works
How many times will the statements inside the two loops be obeyed?
- In total they will be run \(70\) times
- The outer loop runs seven times (once for each day of the week)
- The inner loop runs ten times (one for each stand)
- for each iteration of the outer loop
How would you change this program so that it could handle more than one week’s worth of sales?
- We can add more days to the list
- Rather than have them correspond to Monday - Sunday it might be Week 1 Day 1 etc.
- These would be additional rows in the list of lists
How would we add a day’s worth of sales to the list?
We have to read in a new list of values
Can then append it to the list of lists
read_sales(10)  # read ten values into sales list
week_sales.append(sales)  # append the values to the weekly sales list
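The same nested-loop idea also gives totals per stand rather than for the whole week, by swapping which loop is on the outside. The sketch below uses a small hypothetical table (three days, three stands) so the sums can be checked by hand.

```python
# Three days of hypothetical figures for three stands
week_sales = [
    [50, 54, 29],
    [80, 98, 40],
    [10, 7, 80],
]

stand_totals = []
# outer loop: one pass per stand (column)
for stand_index in range(len(week_sales[0])):
    stand_total = 0
    # inner loop: visit that stand's figure on every day (row)
    for day_sales in week_sales:
        stand_total = stand_total + day_sales[stand_index]
    stand_totals.append(stand_total)

print("Total per stand:", stand_totals)
```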
More than Two Dimensions
It is possible to work with higher dimensions
For example we might want to store multiple weeks of data
- Then we would have a list of lists of lists
Works just like two dimensions but with an extra index, for example we can append a week of sales like so,
annual_sales.append(week_sales)
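A minimal sketch of the three-index version, using made-up figures for two weeks of two days and three stands. The indices read from the outside in: week, then day, then stand.

```python
# Hypothetical data: 2 weeks x 2 days x 3 stands
week_1 = [[50, 54, 29], [80, 98, 40]]
week_2 = [[10, 7, 80], [15, 20, 38]]

annual_sales = []
annual_sales.append(week_1)
annual_sales.append(week_2)

# week index 1, day index 0, stand index 2
print(annual_sales[1][0][2])

# one extra loop is needed for the extra dimension
total = 0
for week_sales in annual_sales:
    for day_sales in week_sales:
        for sales_value in day_sales:
            total = total + sales_value
print("Total:", total)
```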
Keep your dimensions low
You should rarely have to use more than three dimensions. If you find yourself using highly nested / high-dimensional structures you might want to rethink how you’re representing your data
One technique we will see later is the use of classes, which can make it easier to create linear collections
The computer itself is perfectly happy working in higher dimensions. The real difficulty is that you probably aren’t and it can be hard to reason about high dimension data
Use Lists as Lookup Tables
Now we have the ability to manipulate weekly sales data, the next question is how to display that data and the prompts that ask for it.
When we enter the data we want to see something like,
Enter the Monday sales figures for stand 2:
Here we need to have a variable to control what day is printed
- Simplest implementation is an integer to track the day, implemented in DayNameIf.py
# Example 8.22 Day Name If
#
# Uses an if, elif, else construction to convert an integer
# to a string representation of the day of the week
import time
current_time = time.localtime()
day_number = current_time.tm_wday
if day_number == 0:
    day_name = "Monday"
elif day_number == 1:
    day_name = "Tuesday"
elif day_number == 2:
    day_name = "Wednesday"
elif day_number == 3:
    day_name = "Thursday"
elif day_number == 4:
    day_name = "Friday"
elif day_number == 5:
    day_name = "Saturday"
elif day_number == 6:
    day_name = "Sunday"
else:
    raise ValueError("Unexpected day_number " + str(day_number) + " encountered")
print(day_name)
Friday
This works, but is fragile. A cleaner way to do this is to use a lookup table
- i.e. we use day_number to index a list that stores the correct day
We use the time library for fun so the program prints the current day
# Example 8.23 Day Name List
#
# Uses a lookup table to correctly print the day
import time
current_time = time.localtime()
day_number = current_time.tm_wday
day_names = [
    "Monday",
    "Tuesday",
    "Wednesday",
    "Thursday",
    "Friday",
    "Saturday",
    "Sunday",
]
day_name = day_names[day_number]
print("Today is", day_name)
Today is Friday
Lookup tables are powerful for shrinking written code
They also are used to create data-driven applications
- Programs that use built-in or loaded data rather than fixed behaviour
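The lookup table can also produce the data-entry prompt shown at the start of this section. The function make_prompt below is a hypothetical helper written for illustration; only day_names comes from the example above.

```python
day_names = [
    "Monday",
    "Tuesday",
    "Wednesday",
    "Thursday",
    "Friday",
    "Saturday",
    "Sunday",
]


def make_prompt(day_number, stand_number):
    """Build the input prompt for one day and one stand (hypothetical helper)."""
    return (
        "Enter the "
        + day_names[day_number]
        + " sales figures for stand "
        + str(stand_number)
        + ": "
    )


print(make_prompt(0, 2))
```

Changing the wording for every day now means editing one function, and supporting another language would only mean swapping the contents of day_names.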
Tuples
- Lists are the standard collection type
- They are mutable, i.e. we can change the value of a given index or add new items
- Consider the day_names list, once defined we don’t want to change it
- We would also like to prevent such changes, to catch potential programming errors e.g.
day_names[5] = "Splatterday"
- A tuple is like a list, but the contents cannot be changed
- A tuple is said to be immutable
- If we attempt to change the tuple we get an error, (demonstrated in the implementation DayNameList.py)
- Specifically a TypeError
- Because the action we are trying to take (change the value at an index) is not supported by the object type (tuple)
# Example 8.24 Day Name Tuple
#
# Reimplements the Day Name lookup table with a tuple
# and demonstrates the immutability of the data structure
import time
current_time = time.localtime()
day_number = current_time.tm_wday
day_names = (
    "Monday",
    "Tuesday",
    "Wednesday",
    "Thursday",
    "Friday",
    "Saturday",
    "Sunday",
)
day_name = day_names[day_number]
print("Today is", day_name)
print("Attempting to change the lookup table...")
day_names[day_number] = "Splatterday"  # type: ignore
print("Today is", day_names[day_number])
Today is Friday
Attempting to change the lookup table...
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[25], line 27
     23 print("Today is", day_name)
     25 print("Attempting to change the lookup table...")
---> 27 day_names[day_number] = "Splatterday"  # type: ignore
     28 print("Today is", day_names[day_number])
TypeError: 'tuple' object does not support item assignment
- A tuple is created in the same way as a list, but using () to delimit the items rather than []
- Tuples are good for working with complicated values
- e.g. composite types
- For example, consider a pirate’s treasure map
- Treasure’s location is given by
- A reference landmark
- Number of steps north
- Number of steps east
- Strictly speaking, a function can return only one value
- We can return multiple values as a tuple
def get_treasure_location():
    # get the treasure's location
    return ("The old oak tree", 20, 30)
- This returns three values
- The string "The old oak tree"
- The number of steps north, 20
- The number of steps east, 30
- Like lists, tuples are zero-indexed
Take care with your tuple indices
When returning multiple items from a function via a tuple, we have to be clear to specify the order of what the items in the tuple correspond to. This is effectively a contract between the function and any caller (if you change the order, you will break the code of anyone who relies on the current order)
The order in which the values are returned should thus be clearly documented, e.g.
def get_treasure_location():
"""
Gets the location of the treasure
Returns
-------
str
Name of a landmark to start at
int
Number of paces north
int
Number of paces east
"""
    return ("The old oak tree", 20, 30)
- An alternative to explicitly referencing the index of a returned tuple is called tuple unpacking
We provide a comma-separated list of variables to assign the tuple values (in order) to, e.g.
landmark, north, east = get_treasure_location()
print("Start at", landmark, "walk", north, "paces north and", east, "paces east")
- The complete Pirate’s Treasure program implementation is given in PiratesTreasure.py
Summary
- Lists can be used to store large and arbitrarily sized data
- We refer to the individual elements of a list as items
- append lets us add new elements to a list (at the end)
- len returns the number of items in a list
- lists can contain different types of data in the same list
- list values are accessed via the indexing operator []
- lists are indexed from \(0\)
- The last index in a list is len(list) - 1
- Nested lists allow for multi-dimensional structures
- Files can be manipulated by python
- open is used to access a file
- files can be read from or written to
- for can be used to loop over lines from a file
- when using write to write to a file, newlines ('\n') must be added explicitly
- strip can be used to remove whitespace when reading lines from a file
- Files must be closed using the close method once they are no longer in use
- Files can raise exceptions which must be handled or notified to the user
- The handling code must ensure the file is still closed
- with can be used to automatically ensure a file is closed once it is no longer used, even in error scenarios
- Tuples are immutable collections
- Once they are defined we cannot modify or add values
- Tuples are suitable for lookup tables or other fixed collections
- Tuples can be used by functions that return more than one value
Questions and Answers
- Do we really need lists?
- Yes, any scenario with large or arbitrary data needs collections to meaningfully handle and manipulate them
- Do we really need tuples?
- No, technically we could just use lists instead. They are useful though because they enforce properties that lists don’t, such as immutability, which is useful in some cases
- How does the list actually work?
- When a list is created the program reserves memory to hold a few items
- The memory also tracks the number of items currently stored in the list
- Appending an item consumes part of the allocated memory
- If the list doesn’t have enough room, then more memory is allocated to the list
- When accessing a list item, the list checks if the item exists
- If the item doesn’t exist, an exception is thrown
- else, the item is found and returned
- Why are tuples called tuples?
- Tuples are ordered collections of elements in mathematics. Python adopted the terminology
- Should the sales program use a list to store the sales figures or a tuple?
- It depends on the operations we want to perform
- Once we have the list of sales figures, none of our operations strictly change it (except sorting)
- Can implement sorting as creating a new tuple
- Probably good to then use a tuple from a safety perspective
- However, this makes the code more complicated
- If we wanted to introduce an edit function later to modify sales data we might prefer a list for the cleaner implementation
- As opposed to the tuple approach
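Sorting by creating a new tuple can be sketched as follows. Note that this uses Python’s built-in sorted function instead of the bubble sort written earlier in the chapter; sorted accepts any collection and returns a new list, which is then converted back into a tuple.

```python
sales = (50, 54, 29, 33, 22)  # hypothetical figures stored as a tuple

# sorted returns a new list, so the original tuple is left untouched
low_to_high = tuple(sorted(sales))
high_to_low = tuple(sorted(sales, reverse=True))

print(low_to_high)
print(high_to_low)
print(sales)  # still in the original order
```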
- Can functions return lists instead of tuples?
- Yes, they can.
- However, callers typically have no need to change the values a function returns
- So a tuple is a natural fit
- Will my program run faster if I use tuples to store all the data in it?
- Potentially, tuples are slightly cheaper for Python to create than lists
- Depends on what the program does, if you’re mutating a lot of data, the cost of constantly recreating multiple tuples might be greater than the cost of creating and modifying a list
- The speed difference should hardly be noticeable in any case
- Does the with construction stop objects from throwing exceptions?
- No, with is designed to ensure that even if an object throws an exception the managed resource is released correctly
- with will still pass on the exception
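Both halves of that answer can be checked with a short sketch: an exception raised inside the with block still reaches the surrounding except clause, and the file has been closed by the time it gets there. The file name with_demo.txt and the deliberate failure are assumptions made for the demonstration.

```python
import os
import tempfile

# Hypothetical file in a throwaway folder
demo_path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

exception_seen = False
try:
    with open(demo_path, "w") as output_file:
        output_file.write("line 1\n")
        # int raises ValueError here, part-way through the with block
        output_file.write(str(int("not a number")) + "\n")
except ValueError:
    exception_seen = True

print("Exception passed on:", exception_seen)
print("File closed:", output_file.closed)
```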