Chapter 8: Storing Collections of Data

Notes

Lists and Tracking Sales

Consider the following vignette
The owner of an ice-cream stand wants a program to track sales
- There are ten stands, each selling multiple items
- The program should take sales data as input and then provide the following views on the data
  - Sorted from lowest to highest
  - Sorted from highest to lowest
  - Show just the highest and the lowest
  - Show the total number of sales
  - Show the average number of sales

Important

Getting the specification right: Storyboarding

Agreeing on the specification with your client is important. One technique for doing so is storyboarding, which is best done by sitting down with pen and paper (or a whiteboard)

A storyboard shows how the program should flow in response to various user inputs, e.g. by depicting the menus the user might use, with a storyboard for each menu choice. The storyboard should also show how the program will work from the user's point of view

For bigger programs you can break different components out into their own storyboards, much in the same way we built up functions. Storyboards depict what needs to happen, but not how to do it.

Given the spec for the ice cream stand we can now outline the program
1. Store the sales data in variables
2. Implement a way to sort the data
3. A way to print the output
4. Store the data globally and pass it to functions to handle the work

We can construct the prototype interface, similar to the Ride Selector Program

  Ice-Cream Sales

  1: Print the Sales
  2: Sort Low to High
  3: Sort High to Low
  4: Highest and Lowest
  5: Total Sales
  6: Average Sales
  7: Enter Figures

  Enter your command: 3

Limitations of Individual Variables

We first need to store the sales

For ten stores, we could theoretically use ten variables, one for each store
But this method becomes clunky when we want to start analysing the variables

E.g. the following code (FindingLargestSales.py), only handles finding if the first stand is the one with the greatest sales

  # Example 8.1 Finding the Largest Sales
  #
  # Checks if sales1 has the largest sales. Demonstrates the difficulty of using
  # individual named variables to deal with aggregate data

  import BTCInput

  sales1 = BTCInput.read_int("Enter the sales for stand 1: ")
  sales2 = BTCInput.read_int("Enter the sales for stand 2: ")
  sales3 = BTCInput.read_int("Enter the sales for stand 3: ")
  sales4 = BTCInput.read_int("Enter the sales for stand 4: ")
  sales5 = BTCInput.read_int("Enter the sales for stand 5: ")
  sales6 = BTCInput.read_int("Enter the sales for stand 6: ")
  sales7 = BTCInput.read_int("Enter the sales for stand 7: ")
  sales8 = BTCInput.read_int("Enter the sales for stand 8: ")
  sales9 = BTCInput.read_int("Enter the sales for stand 9: ")
  sales10 = BTCInput.read_int("Enter the sales for stand 10: ")

  if (
      sales1 > sales2
      and sales1 > sales3
      and sales1 > sales4
      and sales1 > sales5
      and sales1 > sales6
      and sales1 > sales7
      and sales1 > sales8
      and sales1 > sales9
      and sales1 > sales10
  ):
      print("Stand 1 had the best sales")

Problem: We would have to repeat the code for each individual sales variable
If we add more stands, we have to add another named variable and another big if statement
- AND modify all the previous if statements

Clearly this approach is not very maintainable
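
As a preview of where this chapter is heading, the sketch below keeps the figures in a single list so that one loop copes with any number of stands. The sample figures and the function name best_stand are illustrative assumptions and are not part of the original example.

```python
# Hypothetical sales figures; index 0 holds stand 1, index 1 holds stand 2, etc.
sample_sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


def best_stand(sales):
    """Return the stand number (counting from 1) with the highest sales."""
    best_index = 0
    for index in range(1, len(sales)):
        if sales[index] > sales[best_index]:
            best_index = index
    return best_index + 1


print("Stand", best_stand(sample_sales), "had the best sales")
```

Adding an eleventh stand now only means adding an eleventh value to the list; the comparison code does not change.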

Lists in Python

A collection is a composite type
- It stores multiple elements of another type
We’ve already (briefly) seen one type of collection: the tuple
The most common form of collection is the list
- What it sounds like, a list of items

Make Something Happen: Creating a List

Open a python interpreter and work through the following steps to learn about lists

A list is created using brackets around the contents [], e.g.
```
 sales = []
```
- The above defines sales as an empty list
Items can be appended to a list using the append method
```
 sales.append(99)
 sales
```
```
[99]
```
- As we can see from above sales now contains the value 99
Calling append again, adds the new item to the end of the list
```
 sales.append(100)
 sales
```
```
[99, 100]
```
Observe from above that you can see the contents of a list by simply typing the variable name in the interpreter
- In scripts we can also use the explicit print call
  print(sales)
```
[99, 100]
```
You can access individual items of the list, using the indexing operator []
```
 sales[0]
```
```
99
```
- Syntax is list_name[index] where index is an integer giving the index of the item
- Python lists are zero-indexed. i.e. the first value is stored at index \(0\)
The indexing operator can be used to change the value of an item at a given index
```
 sales[1] = 101
 sales
```
```
[99, 101]
```
- The above changes the value of the second item in sales to \(101\)
Warning
Indexed elements must exist

Whenever we use the indexing operator the index must exist! For example, if we tried to view the (non-existent) third item, we would get an error, e.g.

 example_list = [1, 2]
 print(example_list[2])

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[7], line 2
      1 example_list = [1, 2]
----> 2 print(example_list[2])

IndexError: list index out of range

The above illustrates the common off-by-one error, where we access the index one past the end of the list rather than the last element of the list. Here the type of exception thrown is called an IndexError
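
One way to avoid this error is to check the index against the length of the list before using it. The following is a minimal sketch; the helper name get_item_or_default is hypothetical, and it relies on the built-in len function, which returns the number of items in a list.

```python
def get_item_or_default(items, index, default=None):
    """Return items[index] if that index exists, otherwise return default."""
    if 0 <= index < len(items):
        return items[index]
    return default


example_list = [1, 2]

print(get_item_or_default(example_list, 1))  # the last valid index is len - 1
print(get_item_or_default(example_list, 2))  # index 2 does not exist

# Alternatively, catch the exception that Python raises
try:
    print(example_list[2])
except IndexError as error:
    print("Caught:", error)
```
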
A single list can store values of different types, and can replace items with new items of a different type
```
 sales.append("Rob")
 sales[0] = "Python"
 sales
```
```
['Python', 101, 'Rob']
```
- The above appends a new string "Rob", replaces the int in sales[0] with the string "Python" and leaves the number \(101\) in sales[1] untouched
- Overall, the list thus mixes string and integer types

Warning

Avoid Mixing Types in Lists

Just because you can mix types in lists doesn’t mean you should. Typically lists and list processing are much easier to work with when a list stores items that are all of the same type
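
To illustrate why mixed lists are awkward, the sketch below tries to find the largest item in each kind of list. It uses the built-in max function, which these notes have not introduced; the variable names are illustrative only.

```python
same_type = [99, 101, 45]
mixed = ["Python", 101, "Rob"]

print(max(same_type))  # comparing ints with ints works

mixed_failed = False
try:
    max(mixed)  # comparing a str with an int is not supported
except TypeError as error:
    mixed_failed = True
    print("Cannot compare mixed types:", error)
```

Any processing that compares or adds the items (sorting, totals, averages) runs into the same problem.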

Read in a List

You can use loops to populate a list (see ReadAndDisplay.py)

  # Example 8.2.1 Read and Display
  #
  # Demonstrates using a loop to populate a list

  import BTCInput

  # create an empty list to populate
  sales = []

  for count in range(1, 11):
      prompt = "Enter the sales for stand " + str(count) + ": "
      sales.append(BTCInput.read_int(prompt))

  print(sales)

Code Analysis: Investigate a List Reading Loop

Examine the code given above and consider the following questions to understand how the list is processed

What is the purpose of the count variable?
- count tracks the number of the stand currently being read in the loop. This is used to print the id for the sales stand we are collecting the data from
Why does the range of count go from \(1\) to \(11\)?
- The range function returns a collection with the start included but the stop excluded. Since we have stores \(1\) through \(10\), we want the range to go from \(1\) to \(11\) so the generated numbers are \(1\) through to \(10\)
Which item in the list would hold the sales for stand number \(1\)?
- The first item in the list, or the zeroth indexed, i.e. sales[0]

What part of the code would have to be changed if we instead had \(100\) stands?

We simply change range(1,11) through to range(1,101)
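
The range bounds and the offset between stand numbers and list indices can be checked directly. This is a small sketch with made-up figures; list(...) is used only to make the numbers produced by range visible.

```python
stand_numbers = list(range(1, 11))
print(stand_numbers)       # the stop value 11 is excluded
print(len(stand_numbers))  # ten stands

# Stand numbers count from 1 but list indices count from 0,
# so the figure for a given stand lives at index stand_number - 1.
sales = [50, 54, 29]  # made-up figures for three stands
for stand_number in range(1, len(sales) + 1):
    print("Stand", stand_number, "is stored at index", stand_number - 1,
          "with value", sales[stand_number - 1])
```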

The program below (ReadAndDisplay2.py) is a variant in which the user specifies the number of stands

  # Example 8.2.2 Read and Display 2
  #
  # Improved version of Read and Display which allows the user to specify
  # the number of stands

  import BTCInput

  # create an empty list to populate
  sales = []

  number_of_stands = BTCInput.read_int("Enter the number of stands: ")
  for count in range(1, number_of_stands + 1):
      prompt = "Enter the sales for stand " + str(count) + ": "
      sales.append(BTCInput.read_int(prompt))

  print(sales)

The above is more flexible, but as a result it is more complicated. The trade-off between flexibility and ease of use is one that should be considered with input from the users

If I got one sales value wrong, would it be possible to edit the list to put in a corrected version?
- This is not implemented in the current program, but we have already seen that you can reassign the value of a list item at a given index, so we could implement this in a more complete program
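
A correction feature along these lines might look like the sketch below. The function name correct_sale and its parameters are hypothetical; it takes the stand number as the user would type it (counting from 1) and converts it to a list index.

```python
def correct_sale(sales, stand_number, new_value):
    """Replace the figure for a stand (numbered from 1) with a corrected value."""
    index = stand_number - 1
    if index < 0 or index >= len(sales):
        raise ValueError("No stand numbered " + str(stand_number))
    sales[index] = new_value


sales = [50, 54, 29]  # made-up figures
correct_sale(sales, 2, 45)
print(sales)  # [50, 45, 29]
```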

Display a `list` using a `for` Loop

We’ve already seen that print has a default way of displaying a list

We can use a for loop if we want custom printing for each item

  # Example 8.3 Read and Display Loop
  #
  # Uses a for loop to provide custom list printing

  import BTCInput

  sales = []

  for count in range(1, 11):
      prompt = "Enter the sales for stand " + str(count) + ": "
      sales.append(BTCInput.read_int(prompt))

  # print a heading
  print("Sales Figures")
  count = 1
  for sales_value in sales:
      print("Sales for stand", count, "are", sales_value)
      count = count + 1
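
Python's built-in enumerate function can do the counting for us. The sketch below is an alternative to the manual counter, not the approach the example uses; the figures are made up so that it runs without user input, and the lines list exists only so the result can be inspected.

```python
sales = [50, 54, 29]  # made-up figures

lines = []
# enumerate pairs each item with a running number, here starting from 1
for stand_number, sales_value in enumerate(sales, start=1):
    lines.append("Sales for stand " + str(stand_number) + " are " + str(sales_value))

print("Sales Figures")
for line in lines:
    print(line)
```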

Make Something Happen: Read the Names of Guests for a Party

Lists can hold any type of data that you need to store, including strings. You can change the ice-cream sales program to read and store the names of guests for a party or an event you’re planning. Make a modified version of the sales program that reads in some guest names and then displays them. Make your program handle between \(5\) and \(15\) guests

We basically just copy the previous program with the following changes
- sales \(\rightarrow\) guests
- sales_value \(\rightarrow\) guest
- We change the prompts to appropriately refer to guests rather than sales
The two main changes are
1. We add an initial prompt for the number of guests
  - We use BTCInput.read_int_ranged to ensure the value is from \(5\) to \(15\)
2. We use BTCInput.read_text instead of BTCInput.read_int to get the guest names

    # Exercise 8.1 Party Guests
    #
    # A program that receives and then prints a list of party guests
    # Works for between 5 and 15 guests

    import BTCInput

    guests = []
    number_of_guests = BTCInput.read_int_ranged(
        "Enter the number of guests (5-15): ", 5, 15
    )

    for count in range(1, number_of_guests + 1):
        prompt = "Enter the name of guest " + str(count) + ": "
        guests.append(BTCInput.read_text(prompt))

    # print a heading
    print("\nGuests attending:")
    for guest in guests:
        print("-", guest)

Refactor Programs into Functions

The previous examples build up our program as one long chain of events
However, if we think about our program, this isn’t the cleanest structure
- There are two distinct responsibilities occurring
  1. First we read in the data
  2. Second we display the data
- These are natural candidates to be converted into functions
By pairing these behaviours the program locks us into one way of processing data
- What happens if we want to read in a second set of data?
- What if we want to print the data multiple times?
Refactoring is the process of restructuring existing code
- Specifically changing how its component parts (factors) are organised and interact, without changing what the program does
Refactoring avoids the problem of overcomplicating the design at the start of the process
- Instead we write the program the simplest way we can
- Then once a structure emerges, or we need to add functionality we can refactor the design

Let us factor out the two key components identified above into a new implementation (Functions.py)

  # Example 8.4 Functions
  #
  # Demonstrates refactoring a program into component functions

  import BTCInput

  sales = []


  def read_sales(number_of_sales):
      """
      Reads in the sales values and stores them in the sales list

      Parameters
      ----------
      number_of_sales : int
          Number of Stores to record sales values for

      Returns
      -------
      None
          Results are read into the sales list
      """
      sales.clear()  # remove existing sales values
      for count in range(1, number_of_sales + 1):
          prompt = "Enter the sales for stand " + str(count) + ": "
          sales.append(BTCInput.read_int(prompt))


  def print_sales():
      """
      Prints the sales figures on the screen with a heading.

      Each figure is numbered in sequence

      Returns
      -------
      None
      """
      print("Sales Figures")
      count = 1
      for sales_value in sales:
          print("Sales for stand", count, "are", sales_value)
          count = count + 1


  read_sales(10)
  print_sales()

Code Analysis: Functions in the Sales Analysis Program

Our sales analysis program now consists of two functions, read_sales and print_sales

What does the parameter for the read_sales function do?
- We hinted in the previous section that we might want to account for the number of stands changing in a future implementation. To support this behaviour, read_sales takes as its parameter the number of sales values that it should read
What does clear do?
- We want to start with a fresh list every time we read the sales values
- clear is a method on list objects that clears its contents
Why don’t we need to tell the print_sales function how many sales figures to print?
- The for loop goes through the contents of the sales list
- A list tracks its own size
- In some languages like C, containers do not naturally track their sizes and we would need to specify them
Why didn’t we have to write global sales in the read_sales function?
- Python variable names are references to objects in memory
- These are distinct from the objects themselves
- Assignments change what object a reference (variable) refers to
  - e.g. sales=[]
- However, calling a method on a variable does not change the reference, e.g. sales.append(99); it changes the contents of the object
  - So we don’t need global here: Python only treats a name as local to a function when the function assigns to it, and since read_sales only calls methods on sales, the name is looked up in the global scope
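
The difference between changing an object and rebinding a name can be seen in a short experiment. This is a sketch with hypothetical function names; the after_... variables take copies of the list so each stage can be inspected.

```python
sales = [1, 2, 3]


def mutate():
    # Calls a method on the object the global name refers to.
    # There is no assignment to the name, so no global statement is needed.
    sales.append(4)


def rebind_without_global():
    # Assignment creates a new local variable that happens to be called sales.
    # The global list is left untouched.
    sales = []
    return sales


def rebind_with_global():
    # The global statement makes the assignment rebind the global name.
    global sales
    sales = []


mutate()
after_mutate = list(sales)
rebind_without_global()
after_local_rebind = list(sales)
rebind_with_global()
after_global_rebind = list(sales)

print(after_mutate)         # [1, 2, 3, 4]
print(after_local_rebind)   # [1, 2, 3, 4]
print(after_global_rebind)  # []
```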

Create Placeholder Functions

A development technique called stubs is where we write placeholder functions before we can provide a complete implementation for a given behaviour
The placeholders are sometimes called stub functions e.g. the two below

def sort_high_to_low():
    """
    Print out a sales list from highest to lowest

    Returns
    -------
    None

    See Also
    --------
    sort_low_to_high : sorts from lowest to highest
    """
    pass


def sort_low_to_high():
    """
    Print out a sales list from lowest to highest

    Returns
    -------
    None

    See Also
    --------
    sort_high_to_low : sorts from highest to lowest
    """
    pass

Placeholders let us model the flow of the program before we have all the behaviours specified
- Obviously does not model the complete program since the functions are incomplete
pass is a keyword for a statement that does nothing
- It is effectively a placeholder statement

Create a User Menu

At the start of the Chapter we defined a user interface

By using the previous discussion on stubbing, and our initial functions we can implement this menu (see the full implementation in FunctionsAndMenu.py)

  menu = """
  Ice Cream Sales

  1. Print the Sales
  2. Sort High to Low
  3. Sort Low to High
  4. Highest and Lowest
  5. Total Sales
  6. Average Sales
  7. Enter Figures

  Enter your command: """

  command = BTCInput.read_int_ranged(menu, 1, 7)

  if command == 1:
      print_sales()
  elif command == 2:
      sort_high_to_low()
  elif command == 3:
      sort_low_to_high()
  elif command == 4:
      highest_and_lowest()
  elif command == 5:
      total_sales()
  elif command == 6:
      average_sales()
  elif command == 7:
      read_sales(10)
  else:
      raise ValueError("Unexpected value " + str(command) + " found")

We use stub functions for the unimplemented behaviour

Tip

Using Else Clauses to Guard Against Modification

In the example above the final else clause should never trip because we expect the result of BTCInput.read_int_ranged(menu, 1, 7) to be between \(1\) and \(7\) (inclusive) which is captured by the if..elif chain

Why then do we include the else clause? The reason is to protect against modification. This could include,

- The author of BTCInput introduces a bug in read_int_ranged that allows invalid input to leak through
- Someone editing the sales program changes the allowed range of input for read_int_ranged (perhaps to introduce new functions) but forgets to include them in the elif chain

In either case the else clause trips and an exception is raised, which immediately notifies us that there’s a problem in the code. Without it we could get a silent error, for example if we had relied on the else to catch a \(7\), or if there was no else at all

Guarding against potential modifications in this way is a simple technique for catching sources of errors and confirming your assumptions
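
The effect of the guard can be demonstrated with a cut-down dispatcher. This is a sketch only: describe_command is a hypothetical stand-in for the menu code, returning a description instead of calling the real functions.

```python
def describe_command(command):
    """Return the name of the menu action for a command number."""
    if command == 1:
        return "print"
    elif command == 2:
        return "sort high to low"
    elif command == 3:
        return "sort low to high"
    else:
        # Should be unreachable while callers only pass 1 to 3
        raise ValueError("Unexpected value " + str(command) + " found")


print(describe_command(2))

# Simulate someone widening the allowed range to 1-4 without adding a branch
guard_tripped = False
try:
    describe_command(4)
except ValueError as error:
    guard_tripped = True
    print("Guard tripped:", error)
```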

Use the `elif` keyword to simplify conditions

In many of the examples and exercises I’ve used elif to simplify cases where we would otherwise have a bunch of nested if...else conditions.
elif is short for else if. Its condition is only checked if the condition of the first if (and of every preceding elif) is False
- All elif conditions must come before the else
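
For comparison, here is the same decision written with nested if...else and with elif. The function names and the thresholds are made up for illustration; both versions return the same result for every input.

```python
def sales_band_nested(sales_value):
    if sales_value >= 70:
        return "high"
    else:
        if sales_value >= 40:
            return "medium"
        else:
            return "low"


def sales_band_elif(sales_value):
    if sales_value >= 70:
        return "high"
    elif sales_value >= 40:
        return "medium"
    else:
        return "low"


for value in [22, 54, 100]:
    print(value, sales_band_nested(value), sales_band_elif(value))
```

The elif version stays at one level of indentation however many conditions are added.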

Sort Using Bubble Sort

Sorting is a common task for computing programs
It can be time-intensive
There are often multiple ways that we may wish to sort things, e.g.
- Alphabetically vs Numerically
- Increasing vs Decreasing
- Case-sensitive vs Case-insensitive
Traditional sorts are done one item (or pair of items) at a time
An algorithm is a sequence of steps that solves a problem
- Sorting Algorithms are algorithms that sort collections
- Programming is really the implementation of an algorithm
Bubble Sort is a simple sorting algorithm
- Easy to follow and understand
- Not scalable to larger data sets

Initialise a list with Test Data

Often when implementing an algorithm we want to use a fixed set of test data
- i.e. Data for which we can easily know the desired final state or output
- Allows us to check our algorithm is correct
We can define a list in python with some contents,

sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

Sort a List from High to Low

  Index    0    1    2    3    4    5    6    7    8    9
  Value   50   54   29   33   22  100   45   54   89   75

The above shows how the test data looks in a python list
For a highest to lowest sort we want the largest value to be in index \(0\) and the lowest in index \(9\)
The basic idea of Bubble sort is to compare neighbouring values. If the right value is larger, we swap them so the larger value is on the left
- Thus closer to the start of the list

Important

Swap Two Values in a Variable

The following code to swap two variables is broken,

if sales[0] < sales[1]:
    # the two items are in the wrong order and must be swapped
    sales[0] = sales[1]
    sales[1] = sales[0]

Why? Let’s work through what happens

sales[0] is set to the value of sales[1]
sales[1] is set to the current value of sales[0]
But, sales[0] has already been set to sales[1]
- So sales[1] is set to the same value it already has

The net result is that we only copy sales[1] to sales[0]

The correct implementation is given below,

if sales[0] < sales[1]:
    temp = sales[0]
    sales[0] = sales[1]
    sales[1] = temp

temp is used to store the value of sales[0] before it was overwritten
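
The sketch below runs both versions on a two-item list so the difference is visible, and also shows Python's tuple assignment, which swaps two values without a named temporary. Tuple assignment is an alternative to the temp approach used in these notes; list(sales) is used here only to take a fresh copy for each experiment.

```python
sales = [50, 54]

# Broken: the original value of sales[0] is lost after the first assignment
broken = list(sales)
broken[0] = broken[1]
broken[1] = broken[0]
print(broken)  # [54, 54]

# Working: temp remembers the value that is about to be overwritten
with_temp = list(sales)
temp = with_temp[0]
with_temp[0] = with_temp[1]
with_temp[1] = temp
print(with_temp)  # [54, 50]

# Alternative: both right-hand values are read before either is assigned
swapped = list(sales)
swapped[0], swapped[1] = swapped[1], swapped[0]
print(swapped)  # [54, 50]
```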

Obviously, we don’t want to write the code with explicit reference to indices. However we can write this generically with a for loop as below

for count in range(0, len(sales) - 1):
    if sales[count] < sales[count + 1]:
        temp = sales[count]
        sales[count] = sales[count + 1]
        sales[count + 1] = temp

Code Analysis: Work through a List using a Loop

The above code uses some new python features. Work through the following questions to understand what’s going on

Why have you used a for loop, rather than a while loop?
- We could use either, the for loop is slightly smaller since we don’t have to manually increment count
- Additionally, range does not build a full list of numbers in memory; it returns a lazy range object
- This is more memory efficient
  - The range object works out the next number each time the for loop requests it
What does the len function do on line \(1\)?
- len returns the length of a collection, i.e. the number of items in the collection
- This lets you write code that is insensitive to the size of the collection being worked with
- Means our sorting code could work on any length list
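
The memory saving from range can be observed directly. The sketch below uses sys.getsizeof from the standard library, which these notes have not introduced; the exact sizes it reports vary between Python versions, so only the comparison is shown.

```python
import sys

numbers = range(0, 1_000_000)
as_list = list(numbers)  # forces every number to be stored in memory

print(len(numbers))   # a range knows its length without storing the items
print(numbers[10])    # and it supports indexing
print(sys.getsizeof(numbers) < sys.getsizeof(as_list))
```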

Why is the limit of count the length of the list minus 1?

This is because bubble sort compares the current item to the item to its right, i.e. at the next index
If the range goes to the last index, then the program will try to access an element one past the end of the list, which doesn’t exist
- This will cause an error. e.g.

 a_list = [1,2]
 for count in range(0, len(a_list)):
     if a_list[count] < a_list[count + 1]:
         temp = a_list[count]
         a_list[count] = a_list[count + 1]
         a_list[count + 1] = temp

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[9], line 3
      1 a_list = [1,2]
      2 for count in range(0, len(a_list)):
----> 3     if a_list[count] < a_list[count + 1]:
      4         temp = a_list[count]
      5         a_list[count] = a_list[count + 1]

IndexError: list index out of range

The complete implementation below follows the above discussion and performs one pass through the list

# Example 8.6 Bubble Sort First Pass
#
# Implements the first pass of bubble sort and shows the impact on the list

# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


def sort_high_to_low():
    """
    Print out a sales list from highest to lowest

    Returns
    -------
    None
    """

    for count in range(0, len(sales) - 1):
        if sales[count] < sales[count + 1]:
            temp = sales[count]
            sales[count] = sales[count + 1]
            sales[count + 1] = temp


print("Input list:", sales)

sort_high_to_low()

print("Output list:", sales)

Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Output list: [54, 50, 33, 29, 100, 45, 54, 89, 75, 22]

after which the test data looks like this

  Index    0    1    2    3    4    5    6    7    8    9
  Value   54   50   33   29  100   45   54   89   75   22

Notice that the list has been partially sorted
- Also notice that the smallest value \(22\) has been moved to the correct index (the end)
- Each larger number effectively bubbles one place to the left, past a smaller neighbour
Since the first pass moves the smallest value to the end, we expect a second pass to move the second smallest value to its correct spot, and so on
- So looping through len(sales) times is enough to sort the whole list
The working bubble sort implementation is then,

# Example 8.7 Bubble Sort Multiple Pass
#
# Implements a complete working version of bubble sort

# test data
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


def sort_high_to_low():
    """
    Print out a sales list from highest to lowest

    Returns
    -------
    None
    """
    for sort_pass in range(0, len(sales)):
        for count in range(0, len(sales) - 1):
            if sales[count] < sales[count + 1]:
                temp = sales[count]
                sales[count] = sales[count + 1]
                sales[count + 1] = temp


print("Input list:", sales)

sort_high_to_low()

print("Output list:", sales)

Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Output list: [100, 89, 75, 54, 54, 50, 45, 33, 29, 22]
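
With fixed test data we can check the result automatically rather than by eye. The sketch below compares a bubble sort against Python's built-in sorted function, which these notes have not covered. Unlike the example above, this version takes the list as a parameter and returns a sorted copy; that is an assumption made to keep the check self-contained.

```python
def bubble_sort_high_to_low(values):
    """Return a new list holding values sorted from highest to lowest."""
    result = list(values)  # copy so the caller's list is unchanged
    for sort_pass in range(0, len(result)):
        for count in range(0, len(result) - 1):
            if result[count] < result[count + 1]:
                temp = result[count]
                result[count] = result[count + 1]
                result[count + 1] = temp
    return result


test_data = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
expected = sorted(test_data, reverse=True)

print(bubble_sort_high_to_low(test_data) == expected)
```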

Code Analysis: Improving Performance

As seen above, the sorting program now works correctly. Once you have a working implementation it’s worth investigating if there are changes you can make to improve the efficiency. Work through the following questions to get the idea

Is the program making more comparisons than necessary?
- Yes, as we mentioned before, after one pass the smallest item will always be at the end of the collection
- This means we don’t need to check any swaps against it any more for the inner loop
- After each pass the size of this sorted section increases by at least one
- An implementation taking this into account is,
```
  for sort_pass in range(0, len(sales)):
      for count in range(0, len(sales) - 1 - sort_pass):
          if sales[count] < sales[count + 1]:
              temp = sales[count]
              sales[count] = sales[count + 1]
              sales[count + 1] = temp
```

Is the program performing more passes through the list than necessary?

Probably. Each pass moves a large value only one place to the left, so the worst case is when the largest value starts at the end of the list. Otherwise all values are bubbled to their correct spot in fewer passes
We can stop doing additional passes if we work out the list is already sorted

How?

We use a flag to track if any swaps occur in a pass
If none do then the list is already sorted and we can stop

  # Example 8.8 Efficient Bubble Sort
  #
  # A bubble sort implementation incorporating efficiency savings to the number
  # of comparisons and passes through the list

  # test data
  sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


  def sort_high_to_low():
      """
      Print out a sales list from highest to lowest

      Returns
      -------
      None
      """
      for sort_pass in range(0, len(sales)):
          done_swap = False
          for count in range(0, len(sales) - 1 - sort_pass):
              if sales[count] < sales[count + 1]:
                  temp = sales[count]
                  sales[count] = sales[count + 1]
                  sales[count + 1] = temp
                  done_swap = True
          if not done_swap:
              break

  print("Input list:", sales)

  sort_high_to_low()

  print("Output list:", sales)

Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Output list: [100, 89, 75, 54, 54, 50, 45, 33, 29, 22]
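
To get a feel for the savings, the sketch below counts the comparisons made by each variant. The function count_comparisons and its two switches are hypothetical, introduced only for this measurement; it sorts a copy of the data so the same input can be reused.

```python
def count_comparisons(values, shrink_range, stop_early):
    """Bubble sort a copy of values high to low; return (sorted, comparisons)."""
    data = list(values)
    comparisons = 0
    for sort_pass in range(0, len(data)):
        done_swap = False
        limit = len(data) - 1
        if shrink_range:
            limit = limit - sort_pass  # skip the already sorted tail
        for count in range(0, limit):
            comparisons = comparisons + 1
            if data[count] < data[count + 1]:
                temp = data[count]
                data[count] = data[count + 1]
                data[count + 1] = temp
                done_swap = True
        if stop_early and not done_swap:
            break
    return data, comparisons


test_data = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

basic = count_comparisons(test_data, False, False)
shrunk = count_comparisons(test_data, True, False)
efficient = count_comparisons(test_data, True, True)

print("Basic:", basic[1])  # 10 passes of 9 comparisons = 90
print("Shrinking range:", shrunk[1])  # 9 + 8 + ... + 0 = 45
print("Shrinking range and early exit:", efficient[1])
```

How much the early exit saves depends on the data: a list that is already in order needs only a single pass.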

Make Something Happen: Sort Alphabetically

Bubble sort works for strings as well as integers. We saw in Chapter 5 that the python relational operators also work for strings. See if you can modify the Party Guest Program to display the names in alphabetical order

We can basically just reuse our sort code, but renamed for the guest program.

def sort_alphabetical():
    """
    Sorts a list alphabetically

    Returns
    -------
    None
    """
    for sort_pass in range(0, len(guests)):
        done_swap = False
        for count in range(0, len(guests) - 1 - sort_pass):
            if guests[count] > guests[count + 1]:
                temp = guests[count]
                guests[count] = guests[count + 1]
                guests[count + 1] = temp
                done_swap = True
        if not done_swap:
            break

There is a second modification above, which is changing the sign of the relational operator, e.g.

guests[count] < guests[count + 1]

has been changed to,

guests[count] > guests[count + 1]

This is because, as originally written, the program puts the smallest values last. For strings the relational operators compare alphabetically, so strings starting with a, for example, count as smaller and would be placed after those starting with z. So we need to swap the sign so that the list is ordered a, b, … , z.

Why don’t we have to make more modifications? Well the code as written only requires that the items being sorted are stored in a list, and that the items in the list can be compared with a relational operator. Both of these properties are satisfied by a collection of strings so the code effectively works out of the box

The complete code, including the integration with reading and printing the guest list is given in SortAlphabetically.py
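
One caveat is that string comparison is case-sensitive, with all capital letters ordered before lower-case ones. If guests may type names in mixed case, one possible adjustment is to compare lower-case copies, as sketched below. The function name is hypothetical, and unlike the version above it takes the list as a parameter and returns a sorted copy.

```python
def sort_alphabetical_ignoring_case(names):
    """Return a new list of names in alphabetical order, ignoring case."""
    result = list(names)
    for sort_pass in range(0, len(result)):
        done_swap = False
        for count in range(0, len(result) - 1 - sort_pass):
            # compare lower-case copies so "alice" and "Alice" sort together
            if result[count].lower() > result[count + 1].lower():
                temp = result[count]
                result[count] = result[count + 1]
                result[count + 1] = temp
                done_swap = True
        if not done_swap:
            break
    return result


guests = ["bob", "Zoe", "alice", "Carol"]
print(sorted(guests))                           # ['Carol', 'Zoe', 'alice', 'bob']
print(sort_alphabetical_ignoring_case(guests))  # ['alice', 'bob', 'Carol', 'Zoe']
```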

Sort a List from Low to High

To flip the direction of the sort, we just need to change the condition that determines whether two items are out of order

We do this by changing \(<\) to \(>\), i.e.

  # Example 8.9 Bubble Sort Low to High
  #
  # Implementation of Bubble Sort that sorts from low to high

  # test data
  sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


  def sort_low_to_high():
      """
      Print out a sales list from lowest to highest

      Returns
      -------
      None
      """
      for sort_pass in range(0, len(sales)):
          done_swap = False
          for count in range(0, len(sales) - 1 - sort_pass):
              if sales[count] > sales[count + 1]:
                  temp = sales[count]
                  sales[count] = sales[count + 1]
                  sales[count + 1] = temp
                  done_swap = True
          if not done_swap:
              break


  print("Input list:", sales)

  sort_low_to_high()

  print("Output list:", sales)

Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Output list: [22, 29, 33, 45, 50, 54, 54, 75, 89, 100]

The code above is given in BubbleSortLowToHigh.py
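
Since the two sorts differ only in the comparison, one possible refactoring is a single function with a flag for the direction. This is a sketch under that assumption: the function name bubble_sort and the descending parameter are hypothetical, and the list is passed in rather than read from a global.

```python
def bubble_sort(values, descending=False):
    """Sort the list values in place, low to high unless descending is True."""
    for sort_pass in range(0, len(values)):
        done_swap = False
        for count in range(0, len(values) - 1 - sort_pass):
            if descending:
                out_of_order = values[count] < values[count + 1]
            else:
                out_of_order = values[count] > values[count + 1]
            if out_of_order:
                temp = values[count]
                values[count] = values[count + 1]
                values[count + 1] = temp
                done_swap = True
        if not done_swap:
            break


sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
bubble_sort(sales)
print("Low to high:", sales)
bubble_sort(sales, descending=True)
print("High to low:", sales)
```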

Find the Highest and Lowest Sales Values

In comparison to sorting, finding a value is much easier

The basic outline for finding the highest is,

  for values in collection
      if(new value > highest seen so far)
          highest = new value

We can write the code for the highest and lowest in python then as,

  highest = sales[0]
  for sales_value in sales:
      if sales_value > highest:
          highest = sales_value

  lowest = sales[0]
  for sales_value in sales:
      if sales_value < lowest:
          lowest = sales_value

If we want to find both at the same time, then we can combine the code above, which means we only have to do one pass through the collection

  # Example 8.10 Highest and Lowest
  #
  # Function that finds the highest and lowest value in a collection

  # test data
  sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


  def highest_and_lowest():
      """
      Print out the highest and lowest elements of a sales list

      Returns
      -------
      None
      """
      highest = sales[0]
      lowest = sales[0]

      for sales_value in sales:
          if sales_value > highest:
              highest = sales_value
          elif sales_value < lowest:
              lowest = sales_value
      print("The highest is:", highest)
      print("The lowest is", lowest)


  print("Input list:", sales)

  highest_and_lowest()

Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
The highest is: 100
The lowest is 22

The code above is given in HighestAndLowest.py
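
Python also provides the built-in functions max and min, which can be used to cross-check the loop. The sketch below additionally guards against an empty list, a case in which sales[0] would raise an IndexError. The function name find_highest_and_lowest is hypothetical, and it returns the values rather than printing them.

```python
def find_highest_and_lowest(values):
    """Return (highest, lowest) for values, or None if the list is empty."""
    if len(values) == 0:
        return None
    return max(values), min(values)


sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
result = find_highest_and_lowest(sales)
print("The highest is:", result[0])
print("The lowest is:", result[1])
print(find_highest_and_lowest([]))  # None, no figures entered yet
```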

Evaluate Total and Average Sales

To evaluate the total we have to sum the contents of a list, which is simple using the for loops we’ve looked at (implementation in TotalSales.py)

  # Example 8.11 Total Sales
  #
  # Calculate the Total Sales

  # test data
  sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


  def total_sales():
      """
      Print out the total sales of a sales list

      Returns
      -------
      None
      """
      total = 0
      for sales_value in sales:
          total = total + sales_value
      print("Total sales are:", total)


  print("Input list:", sales)

  total_sales()

Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Total sales are: 551

It is a simple extra step to then calculate the average (divide the total by the number of elements in the collection)

  # Example 8.12 Average Sales
  #
  # Calculate the Average Sales

  # test data
  sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


  def average_sales():
      """
      Print out the average sales of a sales list

      Returns
      -------
      None
      """
      total = 0
      for sales_value in sales:
          total = total + sales_value
      average_sales = total / len(sales)
      print("Average sales are:", average_sales)


  print("Input list:", sales)

  average_sales()

Input list: [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]
Average sales are: 55.1
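
The same totals can be cross-checked with the built-in sum function. This is a small sketch rather than the code in the examples above; the guard against an empty list is an added assumption, since dividing by len(sales) fails when no figures have been entered.

```python
# test data (same values as Example 8.12)
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

# sum is a Python built-in that adds up the items in a list
total = sum(sales)

# guard against dividing by zero when there are no sales yet
if len(sales) > 0:
    average = total / len(sales)
else:
    average = 0

print("Total sales are:", total)
print("Average sales are:", average)
```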

Complete the Program

The previous examples have given us all the parts; now we want to put them together
The crux of our program should be a loop around the menu through which the user selects different functions
However, we first need to read in the data from the user
For usability we should add the ability to quit the program
The final program implements this

# Example 8.13 Complete Program
#
# A Complete implementation of the Sales Program combining all the individual
# programs that we have implemented

import BTCInput

sales = []


def read_sales(number_of_sales):
    """
    Reads in the sales values and stores them in the sales list

    Parameters
    ----------
    number_of_sales : int
        Number of Stores to record sales values for

    Returns
    -------
    None
        Results are read into the sales list
    """
    sales.clear()  # remove existing sales values
    for count in range(1, number_of_sales + 1):
        prompt = "Enter the sales for stand " + str(count) + ": "
        sales.append(BTCInput.read_int(prompt))


def print_sales():
    """
    Prints the sales figures on the screen with a heading. Each figure is
    numbered in sequence

    Returns
    -------
    None
    """
    print("Sales Figures")
    count = 1
    for sales_value in sales:
        print("Sales for stand", count, "are", sales_value)
        count = count + 1


def sort_high_to_low():
    """
    Sort the sales list in place from highest to lowest

    Returns
    -------
    None

    See Also
    --------
    sort_low_to_high : sorts from lowest to highest
    """
    for sort_pass in range(0, len(sales)):
        done_swap = False
        for count in range(0, len(sales) - 1 - sort_pass):
            if sales[count] < sales[count + 1]:
                temp = sales[count]
                sales[count] = sales[count + 1]
                sales[count + 1] = temp
                done_swap = True
        if not done_swap:
            break


def sort_low_to_high():
    """
    Sort the sales list in place from lowest to highest

    Returns
    -------
    None

    See Also
    --------
    sort_high_to_low : sorts from highest to lowest
    """
    for sort_pass in range(0, len(sales)):
        done_swap = False
        for count in range(0, len(sales) - 1 - sort_pass):
            if sales[count] > sales[count + 1]:
                temp = sales[count]
                sales[count] = sales[count + 1]
                sales[count + 1] = temp
                done_swap = True
        if not done_swap:
            break


def highest_and_lowest():
    """
    Print out the highest and lowest elements of a sales list

    Returns
    -------
    None
    """
    highest = sales[0]
    lowest = sales[0]

    for sales_value in sales:
        if sales_value > highest:
            highest = sales_value
        elif sales_value < lowest:
            lowest = sales_value
    print("The highest is:", highest)
    print("The lowest is", lowest)


def total_sales():
    """
    Print out the total sales of a sales list

    Returns
    -------
    None
    """
    total = 0
    for sales_value in sales:
        total = total + sales_value
    print("Total sales are:", total)


def average_sales():
    """
    Print out the average sales of a sales list

    Returns
    -------
    None
    """
    total = 0
    for sales_value in sales:
        total = total + sales_value
    average_sales = total / len(sales)
    print("Average sales are:", average_sales)


# Get initial sales list
read_sales(10)


menu = """
Ice Cream Sales

0. Quit the Program
1. Print the Sales
2. Sort High to Low
3. Sort Low to High
4. Highest and Lowest
5. Total Sales
6. Average Sales
7. Enter Figures

Enter your command: """

while True:
    command = BTCInput.read_int_ranged(menu, 0, 7)
    if command == 0:
        break
    if command == 1:
        print_sales()
    elif command == 2:
        sort_high_to_low()
    elif command == 3:
        sort_low_to_high()
    elif command == 4:
        highest_and_lowest()
    elif command == 5:
        total_sales()
    elif command == 6:
        average_sales()
    elif command == 7:
        read_sales(10)
    else:
        raise ValueError("Unexpected value " + str(command) + " found")

Warning

Keeping Information Synchronised when Sorting

Playing around with the program you might notice one thing. The stands are numbered in the order that they are printed. This works great for printing the original list out, but once we start sorting these numbers don’t match their original value. This is fine if we only care about the sales figures, but if we want to maintain a relationship between a stand and its sales this is something that would have to be modified.

This is something you would discuss with the client
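
One possible modification, sketched below under the assumption that the client does want stand numbers kept with their figures, is to store each figure alongside its stand number and compare only the sales part when sorting. The name stand_sales is hypothetical, and the sketch uses a list of small lists, a structure covered later in this chapter.

```python
sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]

# Pair each figure with its stand number: [stand_number, sales_value]
stand_sales = []
count = 1
for sales_value in sales:
    stand_sales.append([count, sales_value])
    count = count + 1

# Bubble sort low to high, comparing only the sales part of each pair
for sort_pass in range(0, len(stand_sales)):
    done_swap = False
    for count in range(0, len(stand_sales) - 1 - sort_pass):
        if stand_sales[count][1] > stand_sales[count + 1][1]:
            temp = stand_sales[count]
            stand_sales[count] = stand_sales[count + 1]
            stand_sales[count + 1] = temp
            done_swap = True
    if not done_swap:
        break

# Each figure is still printed with the stand it came from
for pair in stand_sales:
    print("Sales for stand", pair[0], "are", pair[1])
```

Because whole pairs are swapped, the stand number travels with its sales figure no matter how the list is ordered.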

Store Data in a File

A natural extension to the program would be the ability to read or store the sales data to a file
Files allow for persisting the data between sessions
To do this we’ll add two new options, 8. Save Sales and 9. Load Sales

Let us start by stubbing out our functions (the complete integration is found in LoadAndSave.py),

  def save_sales(file_path):
      """
      Saves the contents of the sales list to a file

      Parameters
      ----------

      file_path : str
          string giving the file path to save to

      Returns
      -------
      None

      Raises
      ------
      FileException
          Raised if the save fails

      See Also
      --------
      load_sales : load sales from a sales list file
      """
      print("Save the sales in:", file_path)


  def load_sales(file_path):
      """
      loads the contents of a file into the sales list

      Parameters
      ----------

      file_path : str
          string giving the file path to load from

      Returns
      -------
      None

      Raises
      ------
      FileException
          Raised if the load fails

      See Also
      --------
      save_sales : save the sales list into a file
      """
      print("Load the sales in:", file_path)

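Observe that by writing complete docstrings we’re also starting to document the requirements for these functions in-code

We also add a basic integration to the user menu, where we use BTCInput.read_text to get a file name and then call the function. The menu text and the upper limit passed to BTCInput.read_int_ranged must also be extended from 7 to 9 so the new commands are accepted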

  elif command == 7:
      read_sales(10)
  elif command == 8:
      file_to_save_to = BTCInput.read_text("Enter file to save to: ")
      save_sales(file_to_save_to)
  elif command == 9:
      file_to_load_from = BTCInput.read_text("Enter file to load: ")
      load_sales(file_to_load_from)
  else:
      raise ValueError("Unexpected value " + str(command) + " found")

Write into a File

When interacting with a file, Python represents it as an object in memory
- Technically the object represents the connection to the file
The open function creates this connection. The statement below opens a file, test.txt, in write mode (w) and stores the file object in the variable output_file
```
  output_file = open('test.txt', 'w')
```
- The two arguments are called the file_path and the mode
  - file_path is the file you want to open
  - mode is what you want to do with it

Caution

It’s very easy to overwrite an existing file

The open function will not prevent you from modifying important files. For example files opened for write will first wipe the contents of any existing file that matches the path then write the new contents.

Python provides the os module, which has some extra functionality for handling files and directories. E.g. you can check whether a file exists before you open it, so that you can ask the user whether they want to overwrite it

import os.path
if os.path.isfile("test.txt"):
    print("The file exists")

If we’ve opened a file in write mode, we can use the write method on the file object to write to the file
```
  output_file.write("First line\n")
  output_file.write("Second line\n")
  output_file.close()
```
Once you’re done with a file you need to call close
- Completes any unfinished writes (ensures data integrity)
- Releases the file so other programs or processes can use it
  - On some systems a file open for writing is locked by that process, so other programs cannot use it until it is closed

Putting everything together our simple file writing program is,

  # Exercise 8.15 File Output
  #
  # A simple program to demonstrate opening and writing to a file

  output_file = open("test.txt", "w")
  output_file.write("line 1\n")
  output_file.write("line 2\n")
  output_file.close()

Code Analysis: File Writing

Consider the following questions about file writing

Why have you called the write function a method? Isn’t it a function?
- As discussed earlier, methods are functions associated with a specific object
- Typically when we say function we refer to a function that is defined outside of an object
- write is a method on the file object
  - It is impossible to use write without there being a file object to use
  - Methods allow us to work with multiple file objects without having to worry about making sure we pass the correct one to the function
What does the \n mean at the end of the strings?
- It’s the newline character; write doesn’t automatically end the line after we call it
- We have to manually pass the new line
Where is the file test.txt actually created?
- A relative file_path is resolved against the current working directory of the running Python program
- When the program is run from its own folder, the file is therefore written to that same directory
  - E.g. if we had a folder called “My Programs” with a python program “MakeFiles.py”, when we run “MakeFiles.py” the files it makes are stored in “My Programs”
- You can use more complicated file_paths
  1. path = "./data/test.txt" would look for test.txt in the data subdirectory of the current working directory (relative path)
  2. path = "c:/data/test.txt" would look for test.txt in the data subdirectory of the c drive (absolute path)
  Note
  
  Denoting a Directory Separator
  
  On Windows \ is used to separate directories, but in Python you can always use / instead
Can any program use a file written from a Python program?
- Yes, Python uses the underlying operating system’s file handling services
- Any other program on the operating system can access files created or modified by python
Can I add lines at the end of an existing file?
- Yes, rather than open the file in write w, you open the file in append (a).
- Any writes will then be appended to the end of the file.
- A non-existent file will be created the same way as for write mode
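
The difference between write mode and append mode can be seen in a short sketch. It works in a temporary folder (an assumption made here so that no real file is touched) and reads the file back at the end using read mode to show the result.

```python
import os
import tempfile

# Work in a temporary folder so no real file is overwritten
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "test.txt")

# "w" creates the file (or wipes an existing one)
output_file = open(file_path, "w")
output_file.write("line 1\n")
output_file.close()

# "a" keeps the existing contents and adds to the end
output_file = open(file_path, "a")
output_file.write("line 2\n")
output_file.close()

# Read the whole file back to check what was stored
input_file = open(file_path, "r")
contents = input_file.read()
input_file.close()

print(contents)
```

Had the second open used "w" instead of "a", the file would contain only line 2.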

Write the Sales Figures

Using the above discussion we can implement the save_sales function

  # Example 8.16 Write Sales
  #
  # Implements the Write Sales function

  # test data
  sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


  def save_sales(file_path):
      """
      Saves the contents of the sales list to a file

      Parameters
      ----------

      file_path : str
          string giving the file path to save to

      Returns
      -------
      None

      Raises
      ------
      FileException
          Raised if the save fails
      """
      print("Save the sales in: ", file_path)
      output_file = open(file_path, "w")
      for sale in sales:
          output_file.write(str(sale) + "\n")
      output_file.close()


  save_sales("test_output.txt")

Code Analysis: The `save_sales` Function

The save_sales function combines several behaviours and is worth examining in detail. What is the purpose of the function? To take a list of sales figures and write those figures to a file (preferably in a format that is easy for a human to read and to load back into the program.) Consider the following questions

What does the str function do? Why are we using it?
- The str function converts the sales number to a string
- While print can handle non-string inputs, write can only take a string
Why can’t we just write out the sales list as one object?
- A list does not provide any built-in methods for writing an object out to a file
- We could try to print out its string representation (i.e. call str and output that)
- Doesn’t give us great ability to control the way the data is output

Read from a File

We can also use open to read from a file; we just use the read mode (r)
```
  input_file = open("test.txt", "r")
```
We can then loop over the lines in a file using a for loop
```
  for line in input_file:
      print(line)
```
We should still use close() when we’re done reading
```
  input_file.close()
```

The complete sample program looks like,

  # Example 8.17 File Input
  #
  # Demonstrates reading input from a file

  input_file = open("test.txt", "r")
  for line in input_file:
      print(line)
  input_file.close()

Code Analysis: Reading from Files

Work through the following questions to understand how reading from files works

If you look at the following output, you’ll notice there are empty lines after each line of text. Why is that?
```
 line 1

 line 2
```
- Every time we read a line from a file, we read the terminating new line
- This is included in the string stored in line so when we call print we get that new line and the new line added by print
- We could fix this by modifying our print call, to remove the new line
```
  print(line, end='')
```
- A more natural way to fix this is to remove the newline when we first read in the string
- The strip method when called without arguments returns a copy of the string with all leading and trailing whitespace removed from the string
```
  line = line.strip()
```
- This is an example of conditioning input
- Process of making sure that an input does not contain any unexpected values
- E.g. we might also want to use strip to remove non-printable characters
  - lstrip and rstrip are variants of strip that only work on the start or end of the string respectively
Why do we have to close the file we’re reading?
- When only reading, forgetting to close a file won’t cause issues for other programs or processes that also try to read from it
- However, closing the file lets other programs write to it
- Closing also releases the memory associated with holding the connection
- Your computer might not let you shut down if it thinks there are still unclosed files
What would happen if you tried to write to a file that had been opened for reading?
- An exception will be raised
- r+ is a mode that lets you read and write to a file
- You typically don’t want to read and write to a file at the same time
  - Hard to ensure the integrity of the data and avoid corrupting it
  - Such as by writing a line longer than the one previously written
    - this may corrupt the next line
- A better pattern is to load data, update the data then write that back into the file
  - A temporary file (often abbreviated as a tmp file) can be used if we need an intermediate file to write to
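
The load, update, write-back pattern can be sketched as follows. The file names and the .tmp suffix are assumptions for illustration, and the sketch runs in a temporary folder; os.replace swaps the finished temporary file into place once it has been written completely.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "sales.txt")
temp_path = file_path + ".tmp"  # hypothetical naming choice

# Create a small file to update
output_file = open(file_path, "w")
output_file.write("50\n54\n")
output_file.close()

# 1. Load the data
values = []
input_file = open(file_path, "r")
for line in input_file:
    values.append(int(line.strip()))
input_file.close()

# 2. Update it in memory
values[0] = 5000

# 3. Write everything to a temporary file, then swap it into place
output_file = open(temp_path, "w")
for value in values:
    output_file.write(str(value) + "\n")
output_file.close()
os.replace(temp_path, file_path)

# Check the result
input_file = open(file_path, "r")
result = input_file.read()
input_file.close()
print(result)
```

Note that the first line grew from two characters to four without damaging the line after it, which is the risk with writing into a file in place.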

Can a program read an entire file at once?

Yes, by default the read method will try to read an entire file
Line endings are preserved
Be careful with large files, as this may overwhelm your computer’s memory

 # Example 8.18 File Read
 #
 # Demonstrates the use of file_object.read to read
 # the contents of a file in one go

 input_file = open("test.txt", "r")
 total_file = input_file.read()
 print(total_file)
 input_file.close()

Read the Sales Figures

Let’s now implement load_sales

  # Example 8.19 Load Sales
  #
  # Implements the Load Sales function

  sales = []


  def load_sales(file_path):
      """
      loads the contents of a file into the sales list

      Parameters
      ----------

      file_path : str
          string giving the file path to load from

      Returns
      -------
      None

      Raises
      ------
      FileException
          Raised if the load fails
      """
      print("Load the sales in:", file_path)
      sales.clear()
      input_file = open(file_path, "r")
      for line in input_file:
          line = line.strip()
          sales.append(int(line))
      input_file.close()

Code Analysis: The `load_sales` Function

load_sales works as the opposite of save_sales: instead of taking a sales list and putting it into a text file, we pull the figures from a file and load them into the sales list. Consider the following questions

What does the int function do?
- The numbers pulled out of the file are initially stored as a string
- We need to convert them to a number, so we call int
What happens if the input file was empty?
- The function works as one would hope
- The loop doesn’t iterate and we get an empty sales list
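
A related question is what happens when a line is not a whole number. int raises a ValueError for text such as "fifty" and for an empty string, so a file with a blank or damaged line would stop the load. The sketch below shows one way this might be handled, by skipping such lines and counting them; the skipped counter and the sample file contents are assumptions, and the notes' load_sales does not do this.

```python
import os
import tempfile

# Build a sample file containing a blank line and a non-numeric line
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "sales.txt")
output_file = open(file_path, "w")
output_file.write("50\n\nfifty\n 54 \n")
output_file.close()

sales = []
skipped = 0  # hypothetical counter of lines that could not be converted

input_file = open(file_path, "r")
for line in input_file:
    line = line.strip()
    try:
        sales.append(int(line))
    except ValueError:
        # int could not convert this line, so leave it out
        skipped = skipped + 1
input_file.close()

print("Loaded:", sales)
print("Lines skipped:", skipped)
```

Whether to skip bad lines or reject the whole file is a decision to agree with the client.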

Deal with File Errors

Dealing with files, also means dealing with the errors they can introduce
- e.g. A file might have been deleted, a USB removed, or simply the user might pass the wrong name
When an error occurs we want to ensure two things:
1. No files are left open
2. The user is aware that the error has occurred

File objects typically raise exceptions when their methods fail
This enables us to handle and report on their errors
We use the try ... except syntax we’ve seen before

  try:
      output_file = open(file_path, "w")
      for sale in sales:
          output_file.write(str(sale) + "\n")
      output_file.close()
      print("File Written Successfully")
  except:
      print("Something went wrong with the file")

Code Analysis: Dealing with File Handling Exceptions

The code performing the file write is wrapped in a try...except block. If open, write or close raises an exception it will be caught and handled by the except clause. Let’s work through the following questions to see whether this ensures that the file is closed and the user is informed

In what circumstances will the code in the except part be executed?
- If any of the file functions, write, open, or close raise an exception, the code in the except part will be executed
- An error message is thus only printed when an error occurs
In what circumstances will the “File Written Successfully” message be printed?
- This is only printed if every step in the file writing process is completed successfully
An error message is always printed if an error is thrown, but will the file always be closed?
- No, this is a problem, as we said that all files needed to be closed even when an error occurs!
- We could put the close statement in the exception handling section too, but a more general solution to this problem is to use a finally block
  - A finally block contains code that is always executed after all of the try and/or except code has executed
  - Good for code that we naturally want to run after the block no matter whether the process succeeds or fails (such as clean-up)
```
 output_file = None  # stays None if open itself fails
 try:
     output_file = open(file_path, "w")
     for sale in sales:
         output_file.write(str(sale) + "\n")
 except:
     print("Something went wrong with writing to the file")
 finally:
     if output_file is not None:
         output_file.close()
```
```
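
A bare except catches every kind of error. When a more helpful message is wanted, specific exception types can be named instead. The sketch below, which is one possible refinement and not the code used in these notes, tries to load from a file that does not exist; FileNotFoundError and OSError are built-in Python exceptions, while the path and message variable are made up for the example.

```python
import os
import tempfile

sales = []

# A path inside a fresh temporary folder, so the file is known to be missing
missing_path = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

message = ""
input_file = None  # stays None if open itself fails
try:
    input_file = open(missing_path, "r")
    for line in input_file:
        sales.append(int(line.strip()))
    message = "File Read Successfully"
except FileNotFoundError:
    # the most specific case is listed first
    message = "The file could not be found"
except OSError:
    # any other file-related problem
    message = "Something went wrong with the file"
finally:
    if input_file is not None:
        input_file.close()

print(message)
```

FileNotFoundError is a kind of OSError, so it has to be listed first or the more general handler would catch it.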

Use the `with` Construction to Tidy up File Access

It would be great if we didn’t have to remember to manually ensure a file gets closed
- Failing to properly close a file can lead to hard to pin down behaviour

Warning

Intermittent Faults are the Worst Kind to Fix

A piece of code that is broken all the time is annoying, but at least you can typically easily identify what is not working. If a program fails only some of the time this can be much harder to solve. Often you require precise directions as to the steps taken up to the point of failure in order to be able to attempt to replicate the problem. This adds significant overhead to fixing the problem

The with construct allows the programmer to automatically manage the acquisition and release of resources
- More generic than just file access
- You can write your own services to work with with
  - Advanced topic we can ignore for now

Breakdown of a with statement

  with expression as name:
      statement block

- with: start of a with block
- expression: expression generating the resource to use
- as name: name to represent the resource
- statement block: the statements

with is used to provide an object that provides a service
as is used to assign a semantically meaningful name to the resource
with activates an “enter” behaviour on its object
- For files this is open
When the block is finished, with calls some exit behaviour on the object
- For files this causes the file to be closed

with allows us to ensure a few things

The file is always closed
The file is only held open for as long as we are using it
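
These guarantees can be checked with the closed attribute that file objects provide. The following sketch uses a temporary folder and illustrative variable names; it shows the file is open inside the block and closed afterwards, even when the block ends because of an exception.

```python
import os
import tempfile

file_path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(file_path, "w") as output_file:
    output_file.write("line 1\n")
    open_inside = not output_file.closed

# The name can still be used here, but the file it refers to has been closed
closed_after = output_file.closed

# The same holds when the block is left because of an exception
try:
    with open(file_path, "w") as output_file:
        raise ValueError("simulated failure")
except ValueError:
    closed_after_error = output_file.closed

print("Open inside the block:", open_inside)
print("Closed after the block:", closed_after)
print("Closed after an error:", closed_after_error)
```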

  # Example 8.20 Using with to Access Files
  #
  # Rewrites save_sales and load_sales to use the with functionality
  # implemented in python

  # test data
  sales = [50, 54, 29, 33, 22, 100, 45, 54, 89, 75]


  def save_sales(file_path):
      """
      Saves the contents of the sales list to a file

      Parameters
      ----------

      file_path : str
          string giving the file path to save to

      Returns
      -------
      None

      Raises
      ------
      FileException
          Raised if the save fails

      See Also
      --------
      load_sales : load sales from a given file
      """
      print("Save the sales in:", file_path)
      try:
          with open(file_path, "w") as output_file:
              for sale in sales:
                  output_file.write(str(sale) + "\n")
      except:  # noqa: E722
          print("Something went wrong with the file")


  def load_sales(file_path):
      """
      loads the contents of a file into the sales list

      Parameters
      ----------

      file_path : str
          string giving the file path to load from

      Returns
      -------
      None

      Raises
      ------
      FileException
          Raised if the load fails

      See Also
      --------
      save_sales : save sales to a file
      """
      print("Load the sales in:", file_path)
      sales.clear()
      try:
          with open(file_path, "r") as input_file:
              for line in input_file:
                  line = line.strip()
                  sales.append(int(line))
      except:  # noqa: E722
          print("Something went wrong with the file")


  print("Sales before save and load:", sales)
  save_sales("test.txt")
  load_sales("test.txt")
  print("Sales after save and load:", sales)

Observe that we no longer have to explicitly include the close call
with does not handle exceptions however, so we still have to include a try...except block
When an exception occurs the with first releases the resource with its exit behaviour
- e.g. closes the file
- Then the execution moves to the except block

If we wanted to handle exceptions without releasing the resource, we would have to swap the order to,

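  with open("test.txt", "w") as output_file:
      try:
          # do the standard thing here
          output_file.write("line 1\n")
      except:  # noqa: E722
          # handle the exception without releasing the resource
          print("Something went wrong with the file")
      finally:
          # do something regardless of success or failure, still without
          # releasing the resource
          print("Finished with the file")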

Make Something Happen: Record a List with a `save` Function

Add a save function to your party guest program so that you can record a list of people who attended your party

We build off our version that generates a sorted list. We can basically copy the save_sales function, changing it to refer to the guests list instead of sales and giving a more appropriate name to the loop variable.

def save(file_path):
    """
    Saves the guest list to a file

    Parameters
    ----------

    file_path : str
        string giving the file path to save to

    Returns
    -------
    None

    Raises
    ------
    FileException
        Raised if the save fails
    """
    print("Save the guest list in:", file_path)
    try:
        with open(file_path, "w") as output_file:
            for guest in guests:
                output_file.write(str(guest) + "\n")
    except:  # noqa: E722
        print("Something went wrong with the file")

We then run the program as normal

Ask for the number of guests
Read in the guests
Sort the guest list
Display the guest list

We then ask the user if they want to save the guest list. For simplicity we use BTCInput.read_int_ranged to ask for a \(0\) or a \(1\), where \(1\) indicates the user wishes to save and \(0\) indicates they don’t. If the user wishes to save, we prompt them for a file name using BTCInput.read_text and then call save on the given file path

user_wants_to_save = BTCInput.read_int_ranged(
    "Would you like to save the list? (1 for yes, 0 for no): ", min_value=0, max_value=1
)

if user_wants_to_save:
    save_file_name = BTCInput.read_text("Enter file name to save as: ")
    save(save_file_name)

The complete integrated code is given in GuestListWithSave.py

Store Tables of Data

A list holds data in one dimension, i.e. its length
Often data is multi-dimensional
e.g. Our Ice Cream Sales client might now ask for the ability to track sales, by store and by day of the week

Data Table

            Monday   Tuesday   Wednesday   ...
  Stand 1   50       80        10          ...
  Stand 2   54       98        7           ...
  Stand 3   29       40        80          ...
  ...

Our current implementation is effectively a vertical slice for one of the days
Can implement multiple lists, one per day of the week
- Effectively repeats the problem we had before of a distinct named variable for each item

We want a list of lists

  mon_sales = [50, 54, 29, 33,  22, 100, 45, 54, 89, 75]
  tue_sales = [80, 98, 40, 43, 43, 80, 50, 60, 79, 30]
  wed_sales = [10, 7, 80, 43, 48, 82, 33, 55, 83, 80]
  thu_sales = [15, 20, 38, 10, 36, 50, 20, 26, 45, 20]
  fri_sales = [20, 25, 47, 18, 56, 70, 30, 36, 65, 28]
  sat_sales = [122, 140, 245, 128, 156, 163, 90, 140, 150, 128]
  sun_sales = [100, 130, 234, 114, 138, 156, 107, 132, 134, 148]

  week_sales = [mon_sales, tue_sales, wed_sales, thu_sales, fri_sales, sat_sales, sun_sales]

Think of lists of lists as a collection of rows and columns
- We first specify the row we want say tue_sales
- Then the column, say Stand 1
  print(week_sales[1][0])
```
80
```

Code Analysis: Inadequate Index Values

It can be difficult to get the hang of working with multiple indices. Which of the following indices would fail when the program runs?

Statement 1: week_sales[0][0] = 50
Statement 2: week_sales[8][7] = 88
Statement 3: week_sales[7][10] = 100

Statement 1 is valid
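Statement 2 is invalid because the first index \(8\) selects the day of the week
- There are seven days, so the valid indices here are \(0\) to \(6\)
Statement 3 is also invalid, for the same reason
- Even though there are seven days of the week, the list is zero indexed, so index \(7\) is out of range
- The second index \(10\) is out of range too, as the ten stands use indices \(0\) to \(9\)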

Let’s see this in action

Statement 1:

week_sales[0][0]

Statement 2:

week_sales[8][7]

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[20], line 1
----> 1 week_sales[8][7]

IndexError: list index out of range

Statement 3:

week_sales[7][10]

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[21], line 1
----> 1 week_sales[7][10]

IndexError: list index out of range

Tip

Make it easy to test your program

Testing is important, but unless it’s easy or automatic it commonly gets left by the wayside.

In a program one might use a function make_test_data or for larger projects a test framework that is used to generate test data.

Whenever you find yourself repeating a pattern to test code, consider how you can automate or bypass that process
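
As an illustration, a make_test_data function might look like the sketch below. The parameters, the value range of 0 to 250 and the use of a fixed seed are all assumptions made for this example; the fixed seed means every run produces the same table, which makes results easy to compare between runs.

```python
import random


def make_test_data(number_of_days, number_of_stands, seed=1):
    """
    Build a table (list of lists) of made-up sales figures

    Hypothetical helper: the seed is fixed so the same data is
    generated on every run
    """
    generator = random.Random(seed)
    table = []
    for day in range(number_of_days):
        day_sales = []
        for stand in range(number_of_stands):
            day_sales.append(generator.randint(0, 250))
        table.append(day_sales)
    return table


week_sales = make_test_data(7, 10)
print("Days:", len(week_sales))
print("Stands per day:", len(week_sales[0]))
```

Calling this at the start of a program avoids typing in seventy figures every time the code is tested.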

Use Loops to Work with Tables

We can use nested for loops to work through individual values in a list of lists

E.g. if we want to calculate the total sales over a week, (full code given in TablesOfSaleData.py)

  total_sales = 0
  for day_sales in week_sales:
      for sales_value in day_sales:
          total_sales = total_sales + sales_value

  print("Total sales for the week are", total_sales)

Total sales for the week are 5205

day_sales in the outer loop iterates over each constituent list in the list of lists
sales_value is then each value in the current list referenced by day_sales
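
The same nesting can be turned around to total each stand (a column) instead of the whole table. The sketch below uses a cut-down table of two days and three stands, and the name stand_totals is made up for the example; here the outer loop walks the stand positions and the inner loop visits every day.

```python
# Cut-down table: two days, three stands
mon_sales = [50, 54, 29]
tue_sales = [80, 98, 40]
week_sales = [mon_sales, tue_sales]

stand_totals = []
# The outer loop picks a column (stand), the inner loop visits every row (day)
for stand in range(0, len(week_sales[0])):
    stand_total = 0
    for day_sales in week_sales:
        stand_total = stand_total + day_sales[stand]
    stand_totals.append(stand_total)

count = 1
for stand_total in stand_totals:
    print("Total for stand", count, "is", stand_total)
    count = count + 1
```

This assumes every day's list has the same number of stands as the first one.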

Code Analysis: Loop Counting

Consider the code for summing the sales data in the previous example. Answer the following questions to make sure you understand how it works

How many times will the statements inside the two loops be obeyed?
- In total they will be run \(70\) times
- The outer loop runs seven times (once for each day of the week)
- The inner loop runs ten times (one for each stand)
  - for each iteration of the outer loop
How would you change this program so that it could handle more than one week’s worth of sales?
- We can add more days to the list
- Rather than have them correspond to Monday to Sunday, they might be Week 1 Day 1, etc.
- These would be additional rows in the list of lists
How would we add a day’s worth of sales to the list?
- We have to read in a new list of values
- Can then append it to the list of lists
```
  read_sales(10) # read ten values into sales list
  week_sales.append(sales.copy()) # append a copy, as read_sales clears and reuses the sales list
```

More than Two Dimensions

It is possible to work with higher dimensions
For example we might want to store multiple weeks of data
- Then we would have a list of lists of lists
Works just like two dimensions but with an extra index, for example we can append a week of sales like so,
```
  annual_sales.append(week_sales)
```
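
A small sketch of the extra index in use, with made-up data for two weeks of two days and two stands each. The indices are read from the outside in: week, then day, then stand.

```python
# Made-up data: each week holds two days, each day holds two stands
week_1 = [[50, 54], [80, 98]]
week_2 = [[10, 7], [15, 20]]

annual_sales = []
annual_sales.append(week_1)
annual_sales.append(week_2)

# Index order: week, then day, then stand
value = annual_sales[1][0][1]
print("Week 2, day 1, stand 2:", value)

# One more dimension means one more nested loop
total = 0
for week_sales in annual_sales:
    for day_sales in week_sales:
        for sales_value in day_sales:
            total = total + sales_value

print("Total sales are:", total)
```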

Tip

Keep your dimensions low

You should rarely have to use more than three dimensions. If you find yourself using highly nested / high-dimensional structures you might want to rethink how you’re representing your data

One technique we will see later is the use of classes, which can make it easier to create linear collections

The computer itself is perfectly happy working in higher dimensions. The real difficulty is that you probably aren’t and it can be hard to reason about high dimension data

Use Lists as Lookup Tables

Now that we have the ability to manipulate weekly sales data, the next question is how to display that data and the prompts that request it.
When we enter the data we want to see something like,
```
  Enter the Monday sales figures for stand 2:
```

Here we need to have a variable to control what day is printed

The simplest implementation uses an integer to track the day and an if, elif, else chain to convert it to a name, implemented in DayNameIf.py

  # Example 8.22 Day Name If
  #
  # Uses an if, elif, else construction to convert an integer
  # to a string representation of the day of the week

  import time

  current_time = time.localtime()
  day_number = current_time.tm_wday

  if day_number == 0:
      day_name = "Monday"
  elif day_number == 1:
      day_name = "Tuesday"
  elif day_number == 2:
      day_name = "Wednesday"
  elif day_number == 3:
      day_name = "Thursday"
  elif day_number == 4:
      day_name = "Friday"
  elif day_number == 5:
      day_name = "Saturday"
  elif day_number == 6:
      day_name = "Sunday"
  else:
      raise ValueError("Unexpected day_number " + str(day_number) + " encountered")

  print(day_name)

Friday

This works, but is fragile. A cleaner way to do this is to use a lookup table
- i.e. we use day_number to index a list that stores the correct day

We use the time library for fun, so the program prints the current day

  # Example 8.23 Day Name List
  #
  # Uses a lookup table to correctly print the day

  import time

  current_time = time.localtime()
  day_number = current_time.tm_wday

  day_names = [
      "Monday",
      "Tuesday",
      "Wednesday",
      "Thursday",
      "Friday",
      "Saturday",
      "Sunday",
  ]

  day_name = day_names[day_number]

  print("Today is", day_name)

Today is Friday

Lookup tables are powerful for shrinking written code
They also are used to create data-driven applications
- Programs that use built-in or loaded data rather than fixed behaviour
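Returning to the sales prompt from the start of this section, a lookup table lets one line of code produce the prompt for any day. This is a minimal sketch; make_prompt is a hypothetical helper name, not part of the example programs.

```python
day_names = [
    "Monday",
    "Tuesday",
    "Wednesday",
    "Thursday",
    "Friday",
    "Saturday",
    "Sunday",
]


def make_prompt(day_number, stand_number):
    # Hypothetical helper: look up the day name and build the prompt text
    day_name = day_names[day_number]
    return "Enter the " + day_name + " sales figures for stand " + str(stand_number) + ": "


print(make_prompt(0, 2))
```

Supporting a different set of day labels would then only require changing the data in day_names, not the code.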

Tuples

Lists are the standard collection type
- They are mutable, i.e. we can change the value of a given index or add new items
Consider the day_names list: once defined, we don’t want to change it
- We would also like to prevent any change, to catch potential programming errors, e.g.
```
  day_names[5] = "Splatterday"
```

A tuple is like a list, but the contents cannot be changed

A tuple is said to be immutable

If we attempt to change the tuple we get an error (demonstrated in Example 8.24 below)

Specifically, we get a TypeError, because the action we are trying to take (changing the value at an index) is not supported by the object type (tuple)

  # Example 8.24 Day Name Tuple
  #
  # Reimplements the Day Name lookup table with a tuple
  # and demonstrates the immutability of the data structure

  import time

  current_time = time.localtime()
  day_number = current_time.tm_wday

  day_names = (
      "Monday",
      "Tuesday",
      "Wednesday",
      "Thursday",
      "Friday",
      "Saturday",
      "Sunday",
  )

  day_name = day_names[day_number]

  print("Today is", day_name)

  print("Attempting to change the lookup table...")

  day_names[day_number] = "Splatterday"  # type: ignore
  print("Today is", day_names[day_number])

Today is Friday
Attempting to change the lookup table...

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[25], line 27
     23 print("Today is", day_name)
     25 print("Attempting to change the lookup table...")
---> 27 day_names[day_number] = "Splatterday"  # type: ignore
     28 print("Today is", day_names[day_number])

TypeError: 'tuple' object does not support item assignment

A tuple is created in the same way as a list, but using () to delimit the items rather than []
Tuples are good for working with complicated values
- e.g. composite types
For example, consider a pirate’s treasure map
- Treasure’s location is given by
  1. A reference landmark
  2. Number of steps north
  3. Number of steps east
Strictly speaking, a function can return only one value
- We can return multiple values as a tuple
```
  def get_treasure_location():
      # get the treasure's location
      return ("The old oak tree", 20, 30)
```
- This returns three values
  1. The string "The old oak tree"
  2. The number of steps north, 20
  3. The number of steps east, 30
Like lists, tuples are zero-indexed
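As a short illustration, the sketch below indexes the returned tuple directly. The variable name location is an assumption made for the example.

```python
def get_treasure_location():
    # get the treasure's location
    return ("The old oak tree", 20, 30)


location = get_treasure_location()

print("Landmark:", location[0])     # first item is at index 0
print("Paces north:", location[1])
print("Paces east:", location[2])   # last item is at index len(location) - 1
```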

Warning

Take care with your tuple indices

When returning multiple items from a function via a tuple, we have to state clearly what each position in the tuple holds. This is effectively a contract between the function and any caller (if you change the order, you will break the code of anyone who relies on the current order)

The order in which values are returned should thus be clearly documented, e.g.

def get_treasure_location():
    """
    Gets the location of the treasure

    Returns
    -------
    str
        Name of a landmark to start at
    int
        Number of paces north
    int
        Number of paces east
    """

    return ("The old oak tree", 20, 30)

An alternative to explicitly referencing the indices of a returned tuple is tuple unpacking
- We provide a comma-separated list of variables to assign the tuple values to (in order), e.g.
```
  landmark, north, east = get_treasure_location()
  print("Start at", landmark, "walk", north, "paces north and", east, "paces east")
```
The complete Pirate’s Treasure program implementation is given in PiratesTreasure.py

Summary

Lists can be used to store large and arbitrarily sized data
- We refer to the individual elements of a list as items
- append lets us add new elements to a list (at the end)
- len returns the number of items in a list
- lists can contain different types of data in the same list
- list values are accessed via the indexing operator []
  - lists are indexed from \(0\)
  - The last index in a list is len(list) - 1
- Nested lists allow for multi-dimensional structures
Files can be manipulated by Python
- open is used to access a file
- files can be read from or written to
- for can be used to loop over lines from a file
- when using write to write to a file, newlines ('\n') must be added explicitly
- strip can be used to remove whitespace when reading lines from a file
- Files must be closed using the close method once they are no longer in use
- File operations can raise exceptions, which must be handled or reported to the user
  - The handling code must ensure the file is still closed
- with can be used to automatically ensure a file is closed once it is no longer used, even in error scenarios
Tuples are immutable collections
- Once they are defined we cannot modify or add values
- Tuples are suitable for lookup tables or other fixed collections
- Tuples can be used by functions that return more than one value

Questions and Answers

Do we really need lists?
- Yes, any scenario with large or arbitrary data needs collections to meaningfully handle and manipulate them
Do we really need tuples?
- No, technically we could just use lists instead. They are useful, though, because they enforce properties that lists don’t, such as immutability, which is useful in some cases
How does the list actually work?
- When a list is created the program reserves memory to hold a few items
- The memory also tracks the number of items currently stored in the list
- Appending an item consumes part of the allocated memory
- If the list doesn’t have enough room, then more memory is allocated to the list
- When accessing a list item, the list checks if the item exists
  - If the item doesn’t exist, an exception (IndexError) is raised
  - Otherwise, the item is found and returned
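The existence check can be seen by asking for an index past the end of a list. This sketch uses made-up sales figures and catches the resulting IndexError.

```python
sales = [50, 54, 29]

try:
    value = sales[3]  # valid indices are 0, 1 and 2
except IndexError as error:
    value = None
    print("No such item:", error)
```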
Why are tuples called tuples?
- Tuples are ordered collections of elements in mathematics. Python adopted the terminology
Should the sales program use a list to store the sales figures or a tuple?
- It depends on the operations we want to perform
- Once we have the sales figures, none of our operations strictly change the data (except sorting)
  - Sorting can be implemented by creating a new, sorted tuple
  - A tuple would then protect the figures from accidental modification
  - However, this makes the code more complicated
- If we later wanted to add an edit function to modify sales data, a list gives the cleaner implementation
  - With a tuple, every edit would mean building a new tuple
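To make the tuple approach concrete, the sketch below sorts by building new tuples and leaves the original untouched. Note that it swaps in Python's built-in sorted function rather than the bubble sort developed in this chapter, and the figures are made up.

```python
sales = (50, 54, 29, 33)

# sorted() returns a new list, which we convert back into a tuple
sorted_low_to_high = tuple(sorted(sales))
sorted_high_to_low = tuple(sorted(sales, reverse=True))

print(sorted_low_to_high)
print(sorted_high_to_low)
print(sales)  # the original tuple is unchanged
```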
Can functions return lists instead of tuples?
- Yes, they can.
- However, the results of a function are typically not meant to be changed by the caller
  - So a tuple is a natural choice
Will my program run faster if I use tuples to store all the data in it?
- Potentially; tuples have a simpler implementation than lists, so they can be slightly faster to create
- It depends on what the program does. If you’re mutating a lot of data, the cost of constantly recreating tuples might be greater than the cost of creating and modifying a list
- The speed difference should hardly be noticeable in any case
Does the with construction stop objects from throwing exceptions?
- No, with is designed to ensure that even if an object throws an exception the managed resource is released correctly
- with will still pass on the exception
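A small experiment can show both behaviours at once. The sketch below deliberately raises an exception inside a with block; the temporary folder, the file name sales.txt and the ValueError are all assumptions made purely for the demonstration.

```python
import os
import tempfile

exception_seen = False

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sales.txt")
    try:
        with open(path, "w") as output_file:
            output_file.write("50\n")
            # Simulate something going wrong part way through
            raise ValueError("problem while writing")
    except ValueError:
        # with closed the file, then passed the exception on to us
        exception_seen = True

print("File closed:", output_file.closed)
print("Exception still reached us:", exception_seen)
```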
