    class Contact:
        """
        Stores Contact Information

        Attributes
        ----------
        name : str
            Contact Name
        address : str
            Contact's postal or street address.
        telephone : str
            Contact phone number (stored as a string).
        """
        pass

Chapter 9: Use Classes to Store Data
Notes
Make a Tiny Contacts App
Let’s develop a lightweight program to store contact details
- Names
- Addresses
- Telephone Numbers
We storyboard the interface

    Tiny Contacts
    1. New Contact
    2. Find Contact
    3. Exit Program
    Enter your command:

We then extend the storyboard to cover the different interface options

    Create new contact
    Enter the contact name: Rob Miles
    Enter the contact address: 18 Pussycat Mews, London, NE1 410S
    Enter the contact phone: +44(1234) 56789
    Contact record stored for Rob Miles

The matching storyboard for find is then,

    Find Contact
    Enter the contact name: Rob Miles
    Name: Rob Miles
    Address: 18 Pussycat Mews, London, NE1 410S
    Phone: +44(1234) 56789

With a matching storyboard for a contact that does not exist

    Find Contact
    Enter the contact name: Fred Bloggs
    This name was not found
Make a Prototype
- We start by making a prototype
- We stub out the functions with mock messages
- This is good for demoing to the customer for their feedback
- Also helps to start working out how the structure should flow
    # Example 9.1 Tiny Contacts Prototype
    #
    # Simple stub implementation of the Tiny Contacts Prototype

    import BTCInput


    def new_contact():
        """
        Creates and adds a new contact to the contact book

        Returns
        -------
        None
        """
        print("Create the new contact")
        # The entered values are deliberately discarded in the prototype
        BTCInput.read_text("Enter the contact name: ")
        BTCInput.read_text("Enter the contact address: ")
        BTCInput.read_text("Enter the contact phone: ")


    def find_contact():
        """
        Displays the contact matching a user-specified name

        Prompts the user for a name, and searches the contacts list.
        If the contact is found in the list, their full contact details
        are displayed

        Returns
        -------
        None
        """
        print("Find contact")
        name = BTCInput.read_text("Enter the contact name: ")
        # Only one hard-coded contact exists in the prototype
        if name == "Rob Miles":
            print("Name: Rob Miles")
            print("Address: 18 Pussycat Mews, London, NE1 410S")
            print("Phone: +44(1234) 56789")
        else:
            print("This name was not found.")


    menu = """Tiny Contacts
    1. New Contact
    2. Find Contact
    3. Exit Program
    Enter your command:"""

    while True:
        command = BTCInput.read_int_ranged(prompt=menu, min_value=1, max_value=3)
        if command == 1:
            new_contact()
        elif command == 2:
            find_contact()
        elif command == 3:
            break
        else:
            raise ValueError("Unexpected command id found: " + str(command))
Code Analysis: The Contacts Application Prototype
The code above doesn’t introduce any new concepts, but it’s worth examining in detail to make sure you understand how all the parts work. Work through the following questions.
Is this code familiar?
- It should be! It is very similar to the ride-selector and Ice Cream Sales programs
- This menu structure is very common for imperative programs
The values returned by the `read_text` calls are ignored by the program. Is this legal?
- Yes, it is perfectly legal.
- `read_text` is from the BTCInput library; it returns a user-provided string
- We have yet to decide how we store this, so we simply discard it
- We emulate the behaviour of getting a contact, but not the process of storing it yet
How does the program stop?
- The main loop contains a special option that is used for exiting the program. This is achieved by calling `break` to get out of the loop, after which the program will finish
Isn’t the prototype a bit basic? Why don’t you make it store data?
- The prototype is not designed to be functional
- We minimise the initial work so that if the customer backs out we haven’t wasted too much time
- We want to make it clear that the program is a prototype, so that the customer won’t immediately want to use it
How is the telephone number stored?
- Our plan is to store the number as a string
- While referred to as a number, telephone numbers typically have additional characters that make them much more like strings (e.g. +)
Store Contact Details in Separate Lists
Start with storing Contact Details
The simplest implementation is to maintain a list for each type of information we store
- the \(i\)-th contact then has its details at the \(i\)-th index of each list
    names = []
    addresses = []
    telephones = []


    def new_contact():
        """
        Creates and adds a new contact to the contact book

        Returns
        -------
        None
        """
        print("Create the new contact")
        names.append(BTCInput.read_text("Enter the contact name: "))
        addresses.append(BTCInput.read_text("Enter the contact address: "))
        telephones.append(BTCInput.read_text("Enter the contact phone: "))

To find items we then get the index from the `names` list and use that to access the corresponding entries in the `addresses` and `telephones` lists

    def find_contact():
        """
        Displays the contact matching a user-specified name

        Prompts the user for a name, and searches the contacts list.
        If the contact is found in the list, their full contact details
        are displayed

        Returns
        -------
        None
        """
        print("Find contact")
        search_name = BTCInput.read_text("Enter the contact name: ")
        search_name = search_name.strip()
        search_name = search_name.lower()
        # Walk the names list, counting the index as we go
        name_index = 0
        for name in names:
            name = name.strip()
            name = name.lower()
            if name == search_name:
                break
            name_index = name_index + 1
        # If no match was found, name_index equals len(names)
        if name_index < len(names):
            print("Name: ", names[name_index])
            print("Address: ", addresses[name_index])
            print("Telephone: ", telephones[name_index])
        else:
            print("This name was not found")

You can view the complete program all put together in TinyContactsParallelLists.py
Code Analysis: The find_contact Function
The find_contact function is probably one of the more sophisticated pieces of code we’ve written. Work through the following questions to make sure you understand what is going on.
How does this code work?
- We look through the `names` list until we find a match
- Once we’ve found it we can immediately stop looking
- We keep track of the index that we are currently looking at
What is the `name_index` variable used for?
- The `name_index` variable is used to track which index of the `names` list matches the name we’re trying to find
- We then use this index to grab the address and phone from the `addresses` and `telephones` lists
- This technique is called parallel lists
How does the function know if a name has been found?
- If we reach the end of the list without finding a match, then `name_index` ends the loop equal to `len(names)`, which is one past the last valid index of the list
- We use an `if` condition to compare `name_index` with `len(names)` and so check for this
What do the calls of `strip` and `lower` do?
- These methods normalise the input, so that leading and trailing whitespace and variations in upper and lower case are removed
Can we save the user from having to type in all the names when they search?
- Yes, we can. We could use `startswith` to find a name that starts with whatever the user inputs
- This means they might only need to put in the first name
- There are more sophisticated search techniques that we could use, but they are outside the scope of this discussion

    if name.startswith(search_name):
        break

- The above is integrated into the complete program in TinyContactsQuickSearch.py
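As a small, non-interactive illustration of the prefix search described above, the sketch below pulls the loop out into a function. The helper names `normalise` and `find_index` and the sample names are made up for this example; they are not from the book's code.

```python
def normalise(text):
    """Strip surrounding whitespace and convert to lower case."""
    return text.strip().lower()


def find_index(names, search_name):
    """Return the index of the first name starting with search_name.

    Returns len(names) when nothing matches, mirroring the
    'one past the end' convention used in the notes.
    """
    search_name = normalise(search_name)
    name_index = 0
    for name in names:
        if normalise(name).startswith(search_name):
            break
        name_index = name_index + 1
    return name_index


names = ["Fred Smith", "  Rob Miles ", "Joe Bloggs"]
rob_index = find_index(names, " rob")
missing_index = find_index(names, "Zoe")
print(rob_index, missing_index)
```

Note that an empty search string is a prefix of every name, so it would match the first entry; a real program might want to reject empty input.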
Use a Class to Store Contact Details
- An issue with this set-up is we have to ensure that the parallel lists stay aligned
- For example if we sort the `names` list alphabetically, we have to make the same rearrangement to the `addresses` and `telephones` lists
- We would instead prefer to have one object or container that holds all three values together
- One option is to use a tuple or a list
  - But then we have to remember the order in which the values are stored
- The alternative is a class
- In object-oriented programming we use classes to define and construct objects
- A class is a type, an object is the instance
Make Something Happen: Creating a Class
Open the Python interpreter and work through the following steps and questions to understand classes
Enter a definition for an empty `Contact` class: the line `class Contact:` followed by an indented `pass`
- The line `class Contact:` begins a class definition
- The class contents are given as an indented block
- We use `pass` to make an empty placeholder class
Why does the name `Contact` begin with a capital letter?
- It’s a convention in Python
- Variables and functions start with lowercase letters
- Classes start with uppercase letters
Why does the `Contact` class contain a Python `pass` statement?
- The class definition expects an indented block
- We haven’t yet decided the contents of the class so we use `pass` as a placeholder statement
We can create an instance of a `Contact` with `x = Contact()`
This looks like a function call. Are we calling a function here?
- Technically this is a call to a function called a constructor
- The constructor is responsible for creating an instance of a `Contact`
- By using capital letters it’s clear that this is an object instantiation
What’s an instance?
An instance is the realisation of a class
Class is the design, object is the actual thing
You can add data attributes to an instance
x.name = "Rob Miles"
What’s a data attribute?
- A data attribute provides information about a specific instance
- For a contact we would want it to have `name`, `address`, and `telephone`
- Methods can also be thought of as attributes
You can use and manipulate data attributes

    print(x.name)
    x.name = x.name + " is a star"
    print(x.name)

    Rob Miles
    Rob Miles is a star
Attributes in Python classes can be confusing
The ability to add attributes to an instance after it has been created is not common across programming languages. For example Java, C# and C++ all prevent this.
In these languages a class definition must be fully specified, including the attributes, before it can be instantiated.
Both this static definition and Python’s dynamic definition have their advantages and disadvantages. The latter is easier for prototyping and development, but the former lets misspelled or missing attributes be caught before the program runs
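A minimal sketch of what dynamic attributes mean in practice: two instances of the same empty class can end up with different sets of attributes. The `nickname` attribute is invented purely for this illustration.

```python
class Contact:
    pass


first = Contact()
second = Contact()

first.name = "Rob Miles"
first.nickname = "Rob"  # attribute made up on the spot; Python allows this

# vars() returns the dictionary of attributes stored on an instance
print(vars(first))
print(vars(second))

first_has_nickname = hasattr(first, "nickname")
second_has_nickname = hasattr(second, "nickname")
print(first_has_nickname, second_has_nickname)
```

Nothing in the class definition says which attributes a `Contact` should have, so it is up to the program to set them consistently.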
Use the Contact class in the Tiny Contacts Program
We can use the `Contact` class to eliminate the need for multiple lists (see TinyContactsClass.py highlighted below)

    contacts = []


    def new_contact():
        """
        Creates and adds a new contact to the contact book

        Returns
        -------
        None

        See Also
        --------
        Contact : class for storing contact information
        """
        print("Create the new contact")
        new_contact = Contact()
        new_contact.name = BTCInput.read_text("Enter the contact name: ")
        new_contact.address = BTCInput.read_text("Enter the contact address: ")
        new_contact.telephone = BTCInput.read_text("Enter the contact phone: ")
        contacts.append(new_contact)


    def find_contact():
        """
        Displays the contact matching a user-specified name

        Prompts the user for a name, and searches the contacts list.
        If the contact is found in the list, their full contact details
        are displayed

        Returns
        -------
        None

        Notes
        -----
        Matches any name prefixed by the search name
        """
        print("Find contact")
        search_name = BTCInput.read_text("Enter the contact name: ")
        search_name = search_name.strip()
        search_name = search_name.lower()
        result = None
        for contact in contacts:
            name = contact.name
            name = name.strip()
            name = name.lower()
            if name.startswith(search_name):
                result = contact
                break
Code Analysis: The class-based find_contact function
Answer the following questions about the new find_contact implementation
How does this code work?
- This works like the previous search: we look for a contact whose name matches the search name
- Rather than use the index of the match, we store a reference to the object itself in the variable `result`
- We use `None` to indicate no match was found
What does the value `None` mean?
- `None` in Python is used to represent the absence of a value
- Semantically here it is used to indicate that no match was found
Exercise: Duplicate Names
This program has a fault: if several contacts share the same name, only the first one is ever found. Modify the program to correct this problem
We have two solutions that we could use,
- When a duplicate name is encountered we simply replace the old one
  - This is the simplest approach, however it is quite common for people to have the same names
- The program returns all the valid matches
  - This is a bit more complicated
  - Rather than finding one `Contact`, our search now collects a list containing all matching contacts
We only have to change the `find_contact` function (the full code is given in TinyContactsDuplicates.py)

    def find_contact():
        """
        Displays the contacts matching a user-specified name

        Prompts the user for a name, and searches the contacts list.
        Every contact whose name matches has their full contact
        details displayed

        Returns
        -------
        None
        """
        print("Find contact")
        search_name = BTCInput.read_text("Enter the contact name: ")
        search_name = search_name.strip()
        search_name = search_name.lower()
        # Collect every match rather than stopping at the first
        results = []
        for contact in contacts:
            name = contact.name
            name = name.strip()
            name = name.lower()
            if name.startswith(search_name):
                results.append(contact)

        if len(results) > 0:
            for result in results:
                print("Name: ", result.name)
                print("Address: ", result.address)
                print("Telephone: ", result.telephone, "\n")
        else:
            print("This name was not found")
Look for problems when you receive the specification
When you discuss a specification there’s no guarantee that ambiguities, such as how to deal with duplicate names, will come up. You will need to consider cases like this yourself and define the behaviour for them. This behaviour will need to match what the client expects to happen. The best way to make sure of that is to have the behaviour written into the specification
Edit Contacts
It might be quite common for contacts to change their contact details
We would like to be able to update an existing contact
The new interface

    Tiny Contacts
    1. New Contact
    2. Find Contact
    3. Edit Contact
    4. Exit Program
    Enter your command:

We then storyboard out the program,
- Our storyboard will be slightly different to the book implementation to better handle duplicates

    Edit Contact
    Enter the contact name: Rob
    Found 1 match
    Name: Robert Miles
    Address: 18 Pussycat Mews, London, NE1 410S
    Telephone: +44(1234) 56789
    Edit this contact? (1 - Yes, 0 - No): 1
    Enter new name or . to leave unchanged: .
    Enter new address or . to leave unchanged: .
    Enter new telephone or . to leave unchanged: +44 (1482) 465079

- The edit feature first needs to search for the contact we wish to edit
- We then report the number of matches found
- For each match, we then print the current details and ask the user if this is the contact they want to edit
- We then give the user the option of editing each attribute or leaving it unchanged with `.`
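The "enter `.` to leave unchanged" rule from the storyboard can be sketched as a tiny helper. This is only an illustration: `apply_edit` is a hypothetical name, and a dictionary of canned answers stands in for what the user would type.

```python
class Contact:
    pass


def apply_edit(current_value, new_value):
    """Return the value to keep; a lone '.' means leave unchanged."""
    if new_value.strip() == ".":
        return current_value
    return new_value


contact = Contact()
contact.name = "Robert Miles"
contact.address = "18 Pussycat Mews, London, NE1 410S"
contact.telephone = "+44(1234) 56789"

# Canned answers matching the storyboard: only the telephone changes
answers = {"name": ".", "address": ".", "telephone": "+44 (1482) 465079"}
for attribute, answer in answers.items():
    current = getattr(contact, attribute)
    setattr(contact, attribute, apply_edit(current, answer))

print(contact.name)
print(contact.telephone)
```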
Refactor the Tiny Contacts Program
Our program is starting to get some structure
- Good time to consider a refactor
We now have two features that need to search for a contact by name
- Find and display a contact
- Find and edit a contact
One option is to copy `find_contact` into `edit_contact` and replace the display code with the edit code
- Now we have to maintain two different copies of the search functionality
- Easy for these to become desynchronised if in the future we want to change how the search works (or need to fix a bug)
For our refactor, we’ll do the following
- Factor out a core `find_contacts` function that takes a search name and returns the matches
- Rename the old `find_contact` function to `display_contacts`
Here’s the book’s implementation, (we’ll use something different in our implementation to account for duplicates)
    def find_contact(search_name):
        """
        Finds the first contact with a matching name

        Parameters
        ----------
        search_name : str
            Name to search for (uses prefix matching)

        Returns
        -------
        Contact or None
            first contact whose name starts with `search_name`,
            or None if no contact matches
        """
        search_name = search_name.strip().lower()
        for contact in contacts:
            name = contact.name.strip().lower()
            if name.startswith(search_name):
                # Found a match: hand back the contact itself
                return contact
        # Fell out of the loop without a match
        return None
Code Analysis: The refactored find_contact function
Answer the following questions, about this new version of find_contact
Why does the function contain two `return` statements?
- Only one return will actually be executed
- If a match is found then the match is returned
- If not then the program will exit the `for` loop, at which point it encounters the second return and returns `None`
What would happen if another program tried to use the return value of the `find_contact` function, and the `find_contact` function had returned `None`?
- It depends on what that program tries to do
- If it tried to do something with that value, such as read an attribute, then an exception is raised

    x = None #emulate failed find from find_contact
    print(x.address)

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    Cell In[5], line 2
          1 x = None #emulate failed find from find_contact
    ----> 2 print(x.address)

    AttributeError: 'NoneType' object has no attribute 'address'

- Since `find_contact` documents that it can return `None`, it is the responsibility of the caller to check for `None` before using the result
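A caller can guard against the `None` case with an `is None` test before touching any attributes. The sketch below is self-contained; `describe` is a hypothetical caller written for this example, and the `find_contact` shown is a compact stand-in for the refactored search.

```python
class Contact:
    pass


contacts = []
rob = Contact()
rob.name = "Rob Miles"
rob.address = "18 Pussycat Mews, London, NE1 410S"
contacts.append(rob)


def find_contact(search_name):
    """Return the first contact whose name starts with search_name, else None."""
    search_name = search_name.strip().lower()
    for contact in contacts:
        if contact.name.strip().lower().startswith(search_name):
            return contact
    return None


def describe(search_name):
    """Hypothetical caller that checks for None before using the result."""
    contact = find_contact(search_name)
    if contact is None:
        return "This name was not found"
    return "Address: " + contact.address


print(describe("rob"))
print(describe("Fred Bloggs"))
```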
Contact Objects and References
- `find_contact` searches through `contacts` for a match
- The returned value is a reference to the object in memory
- e.g. `rob = find_contact("Rob Miles")` graphically looks like,
---
config:
flowchart:
htmlLabels: false
---
flowchart TD
A@{shape: tag-doc, label: "rob"}
B@{shape: div-rect, label: "Name: Rob Miles
Address: 18 Pussycat Mews, London, NE1 410S
Telephone: +44(1234) 5678"}
A-->B
- We can have multiple references to an object, e.g. `test = rob` creates a new reference `test`
---
config:
flowchart:
htmlLabels: false
---
flowchart TD
A@{shape: tag-doc, label: "rob"}
B@{shape: div-rect, label: "Name: Rob Miles
Address: 18 Pussycat Mews ..
Telephone: +44(1234) 5678"}
C@{shape: tag-doc, label: "test"}
A-->B
C-->B
- Changes from one reference to the underlying memory object are seen in the other references
- e.g. `test.name = "Robert Miles Man of Mystery"` gives the state as,
---
config:
flowchart:
htmlLabels: false
---
flowchart TD
A@{shape: tag-doc, label: "rob"}
B@{shape: div-rect, label: "Name: Robert Miles Man ...
Address: 18 Pussycat Mews ..
Telephone: +44(1234) 5678"}
C@{shape: tag-doc, label: "test"}
A-->B
C-->B
- This behaviour is called aliasing: assignment copies the reference, not the object, so a change made through one reference is visible through all the others
- There is only one object in memory
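The difference between a second reference and an actual copy can be checked with the `is` operator. This is a small sketch using the standard library's `copy.copy`; the variable name `snapshot` is chosen for this example.

```python
import copy


class Contact:
    pass


rob = Contact()
rob.name = "Rob Miles"

test = rob                 # second reference to the same object
snapshot = copy.copy(rob)  # a separate object with the same attribute values

test.name = "Robert Miles Man of Mystery"

print(rob is test)      # both names refer to one object
print(rob.name)         # the change made through test is visible through rob
print(snapshot is rob)  # the copy is a different object
print(snapshot.name)    # and it kept the old name
```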
Code Analysis: Understanding Lists and References
The figure below illustrates how lists and references work. It shows a Tiny Contacts data store with three contacts registered. Each of the tags in the contacts list refers to a different Contact instance in the memory. Work through the following questions to develop your understanding of references
---
config:
flowchart:
htmlLabels: false
---
flowchart TD
subgraph Contacts
A@{shape: tag-doc, label: "0"}
B@{shape: tag-doc, label: "1"}
C@{shape: tag-doc, label: "2"}
end
A1@{shape: div-rect, label: "Name: Fred Smith
Address: 1605 Main St,
New York
Telephone: (560) 567-5209"}
B1@{shape: div-rect, label: "Name: Joe Bloggs
Address: 2312 Pine Street,
Seattle
Telephone: (453) 545-1232"}
C1@{shape: div-rect, label: "Name: Rob Miles
Address: 18 Pussycat Mews,
London, NE1 410S
Telephone: +44(1234) 5678"}
D@{shape: tag-doc, label: "rob"}
A-->A1
B-->B1
C-->C1
D-->C1
The diagram contains four references. How many data objects does it contain?
- There are three data objects, the `Contact` items themselves
- One (Rob Miles) is referenced by both the list index \(2\) and the variable `rob`
What would happen if the program performed the following statement?

    contacts[0] = contacts[1]

- The \(0\) index in the list now references the same memory object as that in the \(1\) index, the state now looks like,
flowchart TD
subgraph Contacts
A@{shape: tag-doc, label: "0"}
B@{shape: tag-doc, label: "1"}
C@{shape: tag-doc, label: "2"}
end
A1@{shape: div-rect, label: "Name: Fred Smith
Address: 1605 Main St,
New York
Telephone: (560) 567-5209"}
B1@{shape: div-rect, label: "Name: Joe Bloggs
Address: 2312 Pine Street,
Seattle
Telephone: (453) 545-1232"}
C1@{shape: div-rect, label: "Name: Rob Miles
Address: 18 Pussycat Mews,
London, NE1 410S
Telephone: +44(1234) 5678"}
D@{shape: tag-doc, label: "rob"}
A-->B1
B-->B1
C-->C1
D-->C1
- Looping through the list would thus refer to the Joe Bloggs Contact twice
- Note that we have now lost the reference to Fred Smith, we can never get it back!
- Unreferenced memory objects will be removed by python in a process called garbage collection
- References make it easy to work with large data objects
- Avoid the need to create expensive copies
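The list scenario from the diagrams can be reproduced in a few lines. The helper `make_contact` is made up for this sketch so that the three contacts can be built without user input.

```python
class Contact:
    pass


def make_contact(name):
    """Hypothetical helper: build a Contact with just a name."""
    contact = Contact()
    contact.name = name
    return contact


contacts = [
    make_contact("Fred Smith"),
    make_contact("Joe Bloggs"),
    make_contact("Rob Miles"),
]
rob = contacts[2]  # a fourth reference, to the same object as index 2

# Index 0 now refers to the Joe Bloggs object; nothing refers to
# Fred Smith any more, so Python is free to reclaim that object
contacts[0] = contacts[1]

names_seen = [contact.name for contact in contacts]
print(names_seen)
print(contacts[0] is contacts[1])
print(rob is contacts[2])
```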
Immutability
Everything in Python is an object
- `30` is an instance of an `int`
The following creates a reference `age` to `30`

    age = 30

Which we can visualise,
flowchart TD
A@{shape: tag-doc, label: "age"}
A1["`int
30`"]
A-->A1
and verify,

    type(age)

    int

- `type` is a built-in function
- Takes a reference as an argument
- Returns the type of the referenced object
Now, suppose we define another reference `temp` via

    temp = age

Which we can again visualise as,
flowchart TD
A@{shape: tag-doc, label: "age"}
B@{shape: tag-doc, label: "temp"}
A1["`int
30`"]
A-->A1
B-->A1
`age` and `temp` now refer to the same object instance
What happens if we assign `temp` a new value?

    temp = 99
    print(age)
    print(temp)

    30
    99

So we have the final state,
flowchart TD
A@{shape: tag-doc, label: "age"}
B@{shape: tag-doc, label: "temp"}
A1["`int
30`"]
B1["`int
99`"]
A-->A1
B-->B1
`age` has not been modified
- Instead `temp` now refers to a different `int` with a value of \(99\)
This is because `int` is an immutable type
- i.e. once an `int` has been created its value can’t be changed
- Assigning a new value thus makes the variable refer to a different instance of an `int`
`str` is also an immutable type

    name = "Rob"
    temp = name
    print("temp is", temp, "name is", name)
    temp = "Fred"
    print("temp is now", temp, "name is now", name)

    temp is Rob name is Rob
    temp is now Fred name is now Rob

- Why does Python use immutable data types?
  - For some procedures, like simple numerical calculations, treating variables as values is often the most desired approach, e.g.

    pi = 3.1415
    x = pi
    x = 99.99

  - We don’t want the above to accidentally change the value of the constant `pi`
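The `is` operator makes the rebinding visible, and trying to change a string in place shows what immutability means in practice. A short sketch:

```python
age = 30
temp = age
same_before = temp is age  # both names refer to the same int object

temp = 99                  # rebinds temp; the object 30 is untouched
same_after = temp is age

name = "Rob"
try:
    name[0] = "B"          # strings cannot be modified in place
    string_was_changed = True
except TypeError:
    string_was_changed = False

print(same_before, same_after)
print(age, temp)
print(string_was_changed, name)
```

A mutable type such as a list behaves differently: an in-place change made through one reference is seen through every other reference to the same list.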
Programming Languages work with values differently
Languages handle the distinction between references and values differently. References make it easy to work with large data, as the objects stay where they are in memory and only the reference is passed around. However value types make it easy to perform data manipulation with types such as int, bool, float and string
C# has a similar concept of value types, Java has primitive types, and C++ distinguishes values from references and pointers. In Python, int, bool, float and string are immutable types, and so behave like values
Remember that the tuple collection type is also immutable
Edit a Contact
Once we have found a reference we can read and modify the attributes
Our program implementation uses a simple interface to optionally modify each attribute one at a time
- Need to read a user string for each modifiable attribute
- Our duplicates implementation also needs to read an int to indicate if we want to modify a specific contact (and print the current contact)
The book implementation is,
    def edit_contact():
        """
        Reads in a name to search for and then allows the user
        to edit the details of that contact. If there is no
        contact, the function displays a message indicating
        that the name was not found
        """
        print("Edit Contact")
        search_name = read_text("Enter the contact name: ")
        contact = find_contact(search_name)
        if contact is not None:
            print("Name: ", contact.name)
            # A lone "." means keep the current value
            new_name = read_text("Enter new name or . to leave unchanged: ")
            if new_name != '.':
                contact.name = new_name
            new_address = read_text("Enter new address or . to leave unchanged: ")
            if new_address != '.':
                contact.address = new_address
            new_phone = read_text("Enter new phone or . to leave unchanged: ")
            if new_phone != '.':
                contact.telephone = new_phone
        else:
            print("This name was not found")

Editing as configured performs modifications of the live data
- Referred to as in-place because it occurs on the original object not a copy
Can’t easily roll back if there is an error or the user wishes to cancel
- To do so, `edit_contact` would need to work on a copy of the data
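One way to get rollback, sketched under the assumption that a shallow copy of the contact is enough (all its attributes are strings): make the edits on a working copy and only write them back if the user confirms. `edit_with_rollback` and its parameters are hypothetical names for this illustration, not part of the book's program.

```python
import copy


class Contact:
    pass


def edit_with_rollback(contact, changes, confirm):
    """Apply changes to a working copy; commit to the live contact only if confirm is True."""
    working = copy.copy(contact)
    for attribute, value in changes.items():
        setattr(working, attribute, value)
    if confirm:
        # Copy the edited attribute values back onto the live object
        vars(contact).update(vars(working))
    return confirm


rob = Contact()
rob.name = "Rob Miles"
rob.telephone = "+44(1234) 56789"

edit_with_rollback(rob, {"name": "Robert Miles"}, confirm=False)
name_after_cancel = rob.name

edit_with_rollback(rob, {"name": "Robert Miles"}, confirm=True)
name_after_commit = rob.name

print(name_after_cancel)
print(name_after_commit)
```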
Missing Attributes
`edit_contact` calls `find_contact` to match a given name. `find_contact` returns `None` if no match is found. However, another possible fault is a contact being returned without all of its attributes defined, e.g. a contact with a name but no address. The program would then fail, as the code below demonstrates
    class Contact:
        pass

    # fake contact "returned without address"
    contact = Contact()
    contact.name = "Hello"

    # Attempt to access address
    print("contact address is", contact.address)

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    Cell In[11], line 9
          6 contact.name = "Hello"
          8 # Attempt to access address
    ----> 9 print("contact address is", contact.address)

    AttributeError: 'Contact' object has no attribute 'address'
Some programming languages, e.g. Java, C# and C++, check for these errors before a program executes. Python does not. This means that minor typos, e.g. writing adress instead of address, can lead to runtime errors
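The built-in functions `hasattr` and `getattr` (with a default) let a program cope with a missing attribute instead of crashing. The sketch also shows why typos are dangerous when assigning: a misspelled attribute name is silently created as a new attribute. The placeholder text used as the default is made up for this example.

```python
class Contact:
    pass


contact = Contact()
contact.name = "Hello"

# Reading safely: supply a default for the missing attribute
address = getattr(contact, "address", "(no address recorded)")
has_address_before = hasattr(contact, "address")

# A typo on assignment raises no error; it just adds a new attribute
contact.adress = "1 Main St"
has_address_after = hasattr(contact, "address")

print(address)
print(has_address_before, has_address_after)
print(sorted(vars(contact)))
```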
Save Contacts in a File using pickle
We saw in Chapter 8 that we can save and load data to files
- There we used a human-readable text representation
We could replicate this for large class-based structures
- e.g. write out all attributes as comma-separated values
Python provides a process called pickling for storing large data structures
- pickled data is stored as binary
- Data is therefore computer-readable and usually more compact than human-readable text
Pickling is done through the `pickle` library, import it to use it

    import pickle

jpeg, mp3, zip are all different formats of binary data
- Commonly a file extension is used to identify what a binary file represents
- e.g.
  - myhouse.jpg
  - track1.mp3
- Different programs are written to work with different binary file formats
- `.txt` defines a generic text file
- `.py` defines a text file that is valid Python code
- Technically text is also a binary file, just with those binary values associated to human-readable characters
Python programs can use the `b` mode flag to open files as binary, e.g.

    out_file = open("contacts.pickle", "wb")

- The above opens the file `contacts.pickle` for writing (`w`) as a binary file (`b`)
- `.pickle` indicates the file is a pickled Python data object
pickle supplies the `dump` function to write a data structure to a file
- The file must be opened for writing in binary

    pickle.dump(contacts, out_file)

- `contacts` is the variable to pickle
- `out_file` is the variable holding the open file to save the data to
You can open and modify pickle files in a text editor
- While they may contain some readable text, they will also contain a mix of improperly rendered binary
- (Unless you have a specially set up editor like a hexadecimal viewer)
Be careful modifying pickle files by hand
Pickle files are not designed to be human-readable. While most text editors will happily let you edit and save a pickle file, doing so is very likely to break the binary format so that the file no longer loads properly
`save_contacts` below saves the `contacts` list into a given file (passed as a path name)
- As in Chapter 8 we use `with` to manage the file access

    def save_contacts(file_name):
        """
        Saves the contacts to the given file name

        Contacts are stored in binary as a pickled file

        Parameters
        ----------
        file_name : str
            string giving the path to the file to store the contacts data in

        Returns
        -------
        None

        Raises
        ------
        Exceptions are raised if contacts could not be saved

        See Also
        --------
        load_contacts : loads contacts from a pickled file
        """
        print("save contacts")
        with open(file_name, "wb") as out_file:
            pickle.dump(contacts, out_file)

- This function does not perform any exception handling
- This will cause the program to crash if the save fails
- Probably fine for a program of this size
- Note: You should never hide a failed save from the user!
- If we wanted to handle exceptions, we would do that in the code outside `save_contacts`
Load Contacts from a file using pickle
pickle provides `load` to read a pickle file
- returns the reconstructed data object
- As a result, needs only the file
- file needs to be open for reading (`r`) and in binary mode (`b`)

    def load_contacts(file_name):
        """
        Loads the contacts from the given file

        Contacts are stored in binary as a pickled file

        Parameters
        ----------
        file_name : str
            string giving the path to the file where the contacts data is stored

        Returns
        -------
        None
            Contact detail is loaded into the global `contacts` value

        Raises
        ------
        Exceptions if contacts failed to load

        See Also
        --------
        save_contacts : saves contacts to a pickled file
        """
        global contacts  # connect to the global contacts variable
        print("load contacts")
        with open(file_name, "rb") as input_file:
            contacts = pickle.load(input_file)
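To see save and load working together without touching real data, the sketch below does a round trip through a temporary folder. Unlike the functions above, these variants take and return the list explicitly rather than using the global `contacts`; that is a simplification made for this example. It assumes the code is run as an ordinary script or module so that pickle can locate the `Contact` class. As the Python documentation warns, only unpickle files you trust.

```python
import os
import pickle
import tempfile


class Contact:
    pass


def save_contacts(contacts, file_name):
    """Pickle the contacts list to file_name (binary write mode)."""
    with open(file_name, "wb") as out_file:
        pickle.dump(contacts, out_file)


def load_contacts(file_name):
    """Return the contacts list unpickled from file_name (binary read mode)."""
    with open(file_name, "rb") as input_file:
        return pickle.load(input_file)


rob = Contact()
rob.name = "Rob Miles"
rob.telephone = "+44(1234) 56789"

with tempfile.TemporaryDirectory() as folder:
    file_name = os.path.join(folder, "contacts.pickle")
    save_contacts([rob], file_name)
    loaded = load_contacts(file_name)

print(len(loaded), loaded[0].name)
print(loaded[0] is rob)  # loading builds a new object, not the original
```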
Code Analysis: Loading Data using pickle
Work through the following questions to make sure you understand how load_contacts works
What does the `global contacts` statement do? Why do we need it only in the `load` function and not the `save` function?
- The `load_contacts` function is used to modify the value of the `contacts` variable
- The `contacts` variable holds all the current contacts held in the program
- `save_contacts` needs the reference to find the list
  - Does not modify the list itself
- `load_contacts` does assign a new list to `contacts`
  - Need to explicitly link to the global variable to assign to it
How does the pickle `load` function know what kind of data to make when loading?
- The information is encoded in the pickle file
- In a pickle file you should be able to identify the data attributes (`name` etc.) and their values
- Also contains the class name
- `load` looks for matching classes in the program loading the data
- Constructs object instances based on those classes
- Means the class `Contact` must be defined before pickle is used to load any contact data
Version Control
Pickle is a tool called a serializer because it converts a data structure into a serial stream (i.e. an ordered sequence of data) that can be sent to another program and/or stored in a file.
This introduces the need for version control. If the design of a class, e.g. Contact, changes (say we add an email attribute) then previously pickled data may no longer load correctly, or may load as objects that lack the new attribute, since the stored data no longer matches the class definition. To resolve this you need to version both the class and the pickled files so that data can be migrated (or converted) between versions
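One possible way to handle migration, sketched here as an assumption rather than anything the book prescribes: store a version number alongside the contacts and upgrade old data after loading. The payload layout, `DATA_VERSION` and `migrate` are all hypothetical names; the payload dictionary is what would be passed to `pickle.dump` and returned by `pickle.load`.

```python
DATA_VERSION = 2  # hypothetical: version 2 added the email attribute


class Contact:
    pass


def migrate(payload):
    """Upgrade a loaded payload to DATA_VERSION and return it."""
    if payload["version"] == 1:
        for contact in payload["contacts"]:
            contact.email = ""  # give old contacts the new attribute
        payload["version"] = 2
    return payload


# Emulate data that was saved by the older version of the program
old_contact = Contact()
old_contact.name = "Rob Miles"
old_payload = {"version": 1, "contacts": [old_contact]}

payload = migrate(old_payload)
print(payload["version"], payload["contacts"][0].email == "")
```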
Add save and load to Tiny Contacts
Let’s add the `save` and `load` functionality to the Tiny Contacts Program
There are two options for how we implement this,
- The user manually declares they want to save and load
- We hardcode a data file
  - Load from this file on program start
  - Save to this file as part of the exit process
For a small contacts app the user probably doesn’t want to manually have to handle saving and loading files
- More likely to want to have it “just work”
- We’ll go with option 2
The new main program now looks like below (see our full implementation which contains an example pickle file)

    # Example 9.7 Tiny Contacts with Load and Save

    import pickle

    # Load contacts from file or create empty list if it fails to load
    file_name = "contacts.pickle"
    try:
        load_contacts(file_name)
    except Exception:
        print("Contacts file not found")
        contacts = []

    while True:
        command = BTCInput.read_int_ranged(prompt=menu, min_value=1, max_value=4)
        if command == 1:
            new_contact()
        elif command == 2:
            display_contact()
        elif command == 3:
            edit_contact()
        elif command == 4:
            save_contacts(file_name)  # save contacts on exit
            break
        else:
            raise ValueError("Unexpected value encountered")
Code Analysis: Saving and Loading Contacts
Consider the following questions about the code above
What happens if the `load_contacts` function raises an exception?
- `load_contacts` raises an exception if the contacts file can’t be found, or if the `load` function in `pickle` fails
- In this case the exception is caught, an error message is printed and an empty contacts list is created
Why does the program not catch the exceptions raised by `save_contacts`?
- You could add this if you wanted
- If the program crashes, the user should probably expect the save failed
- Since they were trying to quit anyway they probably don’t care
- My implementation adds a `try...except` block that prints an error message if the save fails (as the book suggests you consider)
Why does the program use a variable for the file name of the pickled file?
- The contacts are held in a file called `contacts.pickle`
- This file is used in two places, `load_contacts` and `save_contacts`
  - We could put the string literal in both places
  - Instead use a variable
  - Means we can change the file name in one place and the program works
Avoiding Magic Constants
A magic constant is a literal value that appears in code, often in multiple places, without an apparent reason. For example, if we wrote "contacts.pickle" directly wherever it was needed, that would be a magic constant. The string is a constant but there is no context to explain what the value means. Magic constants are problematic because they are hard to find: if we want to change one we have to find every place it is used, and then work out for each occurrence whether, say, this 2 corresponds to this magic constant or to another one
It is a good idea to instead put these constants in a variable so that we need only change the value in one place and the name clearly explains what the constant means
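The same idea applies to numbers as well as strings. As a small sketch, the menu command numbers could be given names. The constant names and the `describe_command` helper below are hypothetical, chosen only for illustration:

```python
# Hypothetical names for the menu commands, replacing the bare numbers 1 to 4
NEW_CONTACT = 1
FIND_CONTACT = 2
EDIT_CONTACT = 3
EXIT_PROGRAM = 4


def describe_command(command):
    """Return a description of a menu command (illustrative helper only)."""
    if command == NEW_CONTACT:
        return "create a new contact"
    elif command == FIND_CONTACT:
        return "find a contact"
    elif command == EDIT_CONTACT:
        return "edit a contact"
    elif command == EXIT_PROGRAM:
        return "save and exit"
    else:
        raise ValueError("Unexpected command id found: " + str(command))


print(describe_command(EXIT_PROGRAM))
```

If a new menu entry is later inserted, only the constant definitions change and every comparison still reads clearly.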
Setup Class Instances
Tiny Contacts builds the `Contact` instance after we create one
new_contact = Contact()
new_contact.name = BTCInput.read_text("Enter the contact name: ")
new_contact.address = BTCInput.read_text("Enter the contact address: ")
new_contact.telephone = BTCInput.read_text("Enter the contact phone: ")
This makes the program fragile
- We could misspell an attribute
- Forget to set one
Ideally want to create a `Contact` and ensure values are set as part of creation
We can do so with a Constructor, a special method called to create the object
- Also sometimes called an initializer method
A method attribute is like a data attribute, except that the attribute is a function attached to the object
The Python Initializer Method
- Held inside a python class
- Named `__init__`
  - python uses “dunder methods”, named `__function_name__`, to mark special functions defined by the language
Make Something Happen: Create an Initializer
Open the python interpreter and work through the following steps to create and understand an initializer, answering the questions
Type the below code in to define a class
class InitPrint:
    def __init__(self):
        print("you made an InitPrint instance")
The above defines the class InitPrint with an initializer method that prints a message. Note the double underscores before and after init are required, as is the parameter self. The last line of the class is an empty line
The initializer looks remarkably like a function why is that?
An initializer is a function that is called when a class instance is created.
Type in the code below, which creates an instance of `InitPrint` and assigns it to the variable `x`. Observe that the `__init__` method is called even without us explicitly calling it
x = InitPrint()
you made an InitPrint instance
How is the `__init__` function made to run?
- It is handled by the python interpreter as part of how objects are constructed
- It will run each time an instance of the `InitPrint` class is created
Now define the `InitName` class as below
class InitName:
    def __init__(self, new_name):
        self.name = new_name
The initializer can take arguments like any other function, here it takes `new_name`
The initializer no longer prints a message but rather sets a `name` attribute on the variable `self`
- `self` is a reference to the object running the method
- In the initializer this is the object being created
- `self` is always the first parameter of a method, and must be included
Now replicate the code below to see how the new `__init__` method works
x = InitName("Fred")
print(x.name)
Fred
When creating an `InitName` object we now have to pass the `new_name` parameter
Observe we don’t explicitly pass `self`
Once an initializer is defined, it is the only way to create an instance
- Attempting otherwise leads to an error, e.g. if we exclude the `new_name`
y = InitName()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[16], line 1
----> 1 y = InitName()
TypeError: InitName.__init__() missing 1 required positional argument: 'new_name'
This is a way of enforcing that an object is created with a full set of attributes
Our `Contact` class should accept three parameters
class Contact:
    def __init__(self, name, address, telephone):
        """
        Create a new `Contact` instance

        Parameters
        ----------
        name : str
            Contact Name
        address : str
            Contact's postal or street address
        telephone : str
            Contact phone number (stored as a string)
        """
        self.name = name
        self.address = address
        self.telephone = telephone
Code Analysis: Parameters and the __init__ method
Work through the following questions to ensure you understand the __init__ method
It looks like you’ve written the assignments in the initializer so that a value is assigned to itself. What’s going on?
Consider a statement
self.telephone = telephone
This looks like assigning `telephone` to `telephone`, but it does not
- The left is the `telephone` attribute on the `self` object
- The right is the `telephone` parameter passed to the initializer
Python variable names are namespaced
- namespaces are regions in which names are uniquely identified
- One namespace is the local namespace of the `__init__` method
- The other namespace is the attribute namespace of the `self` object
- namespaces allow different contexts to use the same variable name distinct from each other
Generally it is convention to give initializer parameters the same name as their associated object attributes
What happens if the user of the constructor supplies silly arguments to it?
- Currently the constructor doesn’t validate the input
  - e.g. we could pass `name` a number, empty string or even `None`
  - Still generates a `Contact`
- You can add error handling code to the constructor and raise exceptions if the provided values are invalid
  - For a more robust application this might be required
  - For a small toy program we can generally expect valid input
If we want to create a new `Contact` now we can just call,
rob = Contact(name="Rob Miles", address="18 Pussycat Mews, London, NE1 410S", telephone="+44(1234) 56789")
We can integrate this into our Tiny Contacts implementation
def new_contact():
    """
    Creates and adds a new contact to the contact book

    Returns
    -------
    None

    See Also
    --------
    Contact : class for storing contact information
    """
    print("Create new contact")
    name = BTCInput.read_text("Enter the contact name: ")
    address = BTCInput.read_text("Enter the contact address: ")
    telephone = BTCInput.read_text("Enter the contact phone: ")
    new_contact = Contact(name=name, address=address, telephone=telephone)
    contacts.append(new_contact)
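The code analysis above notes that the constructor could validate its arguments. A minimal sketch of what that might look like follows. The specific rules (every field must be a non-empty string) are an assumption made for illustration, not something the book prescribes:

```python
class Contact:
    def __init__(self, name, address, telephone):
        # The rules below are illustrative assumptions, not the book's
        fields = (("name", name), ("address", address), ("telephone", telephone))
        for label, value in fields:
            if not isinstance(value, str):
                raise TypeError(label + " must be a string")
            if value.strip() == "":
                raise ValueError(label + " must not be empty")
        self.name = name
        self.address = address
        self.telephone = telephone


# A silly argument is now rejected instead of silently producing a Contact
try:
    Contact(name="", address="18 Pussycat Mews", telephone="+44(1234) 56789")
except ValueError as error:
    print("Rejected:", error)
```

Code that calls the constructor, such as `new_contact`, would then need to decide whether to catch these exceptions and prompt the user again.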
Use Default Arguments in a Constructor
The `__init__` method supports default arguments
For example if we don’t want to make the telephone mandatory we could write
class Contact:
    def __init__(self, name, address, telephone="No Telephone"):
        """
        Create a new Contact instance

        Parameters
        ----------
        name : str
            Contact Name
        address : str
            Contact's postal or street address
        telephone : str
            Contact phone number (stored as a string)
        """
        self.name = name
        self.address = address
        self.telephone = telephone
We could then create a `Contact` as,
rob = Contact(name="Rob Miles", address="18 Pussycat Mews, London, NE1 410S")
print(rob.telephone)
No Telephone
- Observe that the `telephone` attribute still exists and has the default value `"No Telephone"`
Dictionaries
- A Dictionary is another collection type like `list` and `tuple`
- Dictionaries store data as key-value pairs
- A value is looked up by its key
- You can think of a `list` as a dictionary where the key is the index
Creating a Dictionary
Let’s consider creating a dictionary for a coffee shop
We want to key coffee products to their price (value)
We create an empty dictionary using `{}`
prices = {}
Items can be added using the indexing operator to assign a value,
prices["latte"] = 3.5
prices["latte"]
3.5
We can see `prices` has the key `latte` and the associated value of \(3.5\)
We can redefine dictionary values as for lists
prices["latte"] = 3.6
prices["latte"]
3.6
Keys are case-sensitive and must be spelled correctly, else a `KeyError` occurs,
prices["Latte"]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[24], line 1
----> 1 prices["Latte"]
KeyError: 'Latte'
We can search a dictionary for keys using the `in` operator
print('latte' in prices)
print('flat white' in prices)
True
False
We can print an entire dictionary just like with lists and tuples
prices["espresso"] = 3.0
prices["tea"] = 2.5
prices
{'latte': 3.6, 'espresso': 3.0, 'tea': 2.5}
Observe that the key-value pairs are printed with the format `key : value`
We can also create a dictionary with values using the same syntax as the printed output
prices = {'Latte' : 3.6, 'Espresso' : 3.0, "Tea" : 2.5, "Americano" : 2.5}
Dictionary Management
Dictionary elements use the `"key:item"` format
Keys and values can be a mix of types such as `str`, `int` and `float`, e.g.,
access_control = {1234 : "complete", 1111 : "limited", 4342 : "limited"}
Values can be duplicated
But keys must be unique
Consider a dictionary that controls access to a burglar alarm
- Users provide an access code
- Code keys a dictionary
- Access is determined by the value in the dictionary
- A missing key (code) indicates no access permissions
# Example 9.9 Alarm Access Control
#
# Demonstrates the use of a dictionary as a lookup table to translate
# keys into associated values
import BTCInput

access_control = {1234: "complete", 1111: "limited", 4342: "limited"}
access_code = BTCInput.read_int("Enter your access code: ")
if access_code in access_control:
    print("You have", access_control[access_code], "access")
else:
    print("You are not allowed access")
We can delete dictionary entries using the `del` keyword, e.g.
del(access_control[1111])
print(access_control[1111])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[29], line 2
      1 del(access_control[1111])
----> 2 print(access_control[1111])
KeyError: 1111
The `KeyError` above shows that the key-value pair has been eliminated
- `del` can also be used to remove elements of a list
- `del` will raise an exception if the object being deleted doesn’t exist
Return a Dictionary from a Function
- We’ve seen that programs can use dictionaries as lookup tables
- Can also return a dictionary from a function
- e.g. our Pirates Treasure map from Chapter 8 could use a `dict` instead
# Example 9.10: Pirate Treasure Dictionary
#
# Implementation of the Pirates Treasure map that uses a
# dictionary rather than a tuple to provide contextual
# key-value pairs
def get_treasure_location():
    """
    Get the location of the treasure

    Returns
    -------
    dict
        Dictionary containing the location of the treasure,
        containing the following keys
        `"start"` : str
            landmark to start at
        `"n"` : int
            number of paces to walk north relative to the start
        `"e"` : int
            number of paces to walk east relative to the start
    """
    return {"start": "The old oak tree", "n": 20, "e": 30}


location = get_treasure_location()
print(
    "Start at",
    location["start"],
    "walk",
    location["n"],
    "paces north, and",
    location["e"],
    "paces east",
)
Start at The old oak tree walk 20 paces north, and 30 paces east
- Dictionaries let us assign contextual meaning to the returned parameters
- Harder to work with than tuple unpacking though
Use a Dictionary to Store Contacts
- We could use dictionaries to store contacts in Tiny Contacts
Rather than use a class we could represent a contact with a dictionary like
rob_contact = {"name" : "Rob Miles", "address" : "18 Pussycat Mews", "telephone" : "+44(1234) 56789"}
- But then we lose some of the nice class behaviours
  - like attributes being accessible via `Contact.name` etc. and we would instead have to use the string literal keys everywhere
Another option is storing the contacts themselves in a dictionary rather than a list
contact_dictionary = {}
rob = Contact(name="Rob Miles", address="18 Pussycat Mews", telephone="+44(1234) 56789")
contact_dictionary[rob.name] = rob
print(contact_dictionary)
{'Rob Miles': <__main__.Contact object at 0x7fb688b72ed0>}
We can then search for a contact by just querying the key
contact_dictionary["Rob Miles"]
<__main__.Contact at 0x7fb688b72ed0>
However the user would have to type the correct full name
- Also case sensitive
- We could fix the case sensitivity by using a normalised key rather than the name directly
  - such as by using `strip().lower()` to strip excess whitespace and convert to lowercase
Our current implementation uses `startswith` to provide more flexible matching
In general though, dictionaries provide fast queries for finding objects when we can easily use the key as a unique identifier
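The normalised key idea can be sketched as below. The helper names `make_key`, `store_contact` and `lookup_contact` are hypothetical and introduced only for this illustration, which uses `dict.get` so that a missing contact gives `None` rather than a `KeyError`:

```python
class Contact:
    def __init__(self, name, address, telephone):
        self.name = name
        self.address = address
        self.telephone = telephone


contact_dictionary = {}


def make_key(name):
    """Normalise a name so lookups ignore case and surrounding whitespace."""
    return name.strip().lower()


def store_contact(contact):
    # The key is the normalised name; the stored Contact keeps the original
    contact_dictionary[make_key(contact.name)] = contact


def lookup_contact(name):
    # dict.get returns None when the key is missing
    return contact_dictionary.get(make_key(name))


rob = Contact(name="Rob Miles", address="18 Pussycat Mews", telephone="+44(1234) 56789")
store_contact(rob)

print(lookup_contact("  rob MILES ").name)
print(lookup_contact("Fred Bloggs"))
```

Note this still requires the full name, so it trades away the prefix matching that `startswith` provides in exchange for a direct lookup.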
Exercise: The Final Tiny Contacts Refactor
The Tiny contacts program is a useful template for any kind of program that stores data and lets a user work with it. You can even add some of the sorting and data-processing features from the ice-cream sales program to make applications that not only store data but let you do interesting things with it.
Expand on the Tiny Contacts Program to implement the following,
- The Tiny Contacts program will print all contacts if the `find_contacts` search string is blank, document this for the user
- Add the sorting features from the ice-cream sales program to print contacts in alphabetical order
- Identify more common code between functions in Tiny Contacts and see what further refactors you can make
The first step is straightforward. We update the program in two places. First we document the behaviour in the responsible function, find_contacts
def find_contacts(search_name):
    """
    Finds the contacts with the matching name

    If the empty string is given, all contacts
    are matched

    Parameters
    ----------
    search_name : str
        Name to search for (uses prefix matching)

    Returns
    -------
    list[Contact]
        list of contacts matching `search_name`, if no
        matches exist the list is empty
    """
    search_name = search_name.strip().lower()
    results = []
    for contact in contacts:
        name = contact.name.strip().lower()
        if name.startswith(search_name):
            results.append(contact)
    return results
This helps anyone who in the future has to edit or maintain our code. However it doesn’t provide much help to the user of the program. So we also document this in the user facing code (display_contacts)
def display_contacts():
    """
    Prompts the user for a contact name and
    displays all matching contacts

    Returns
    -------
    None

    See Also
    --------
    display_contact : displays a single contact
    """
    print("Find contact")
    contacts = find_contacts(
        BTCInput.read_text("Enter the contact name (Press enter to display all): ")
    )
    if len(contacts) > 0:
        for contact in contacts:
            display_contact(contact)
    else:
        print("This name was not found")
The relevant line being,
BTCInput.read_text("Enter the contact name (Press enter to display all): ")
To add the sorting functionality we first need to put the sorting code in,
def sort_contacts():
    """
    sorts the contacts list into alphabetical order

    Returns
    -------
    None
    """
    print("Sort contacts")
    for sort_pass in range(0, len(contacts)):
        done_swap = False
        for count in range(0, len(contacts) - 1 - sort_pass):
            if contacts[count].name > contacts[count + 1].name:
                temp = contacts[count]
                contacts[count] = contacts[count + 1]
                contacts[count + 1] = temp
                done_swap = True
        if not done_swap:
            break
This code is basically the same as the sorting code in the ice-cream stand example, except for one line that might look odd, namely
contacts[count].name > contacts[count + 1].name
This is because we want to sort the Contact objects alphabetically on the name field, so we have to compare on this field. contacts[count] is a reference to the Contact object stored at the index count so we can access the attributes on the underlying object
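As an aside, Python's built-in `list.sort` accepts a `key` function, which gives the same ordering without writing the bubble sort by hand. The sketch below is a variation, not the implementation used in these notes: here `sort_contacts` takes the list as a parameter, and the sample contacts use made-up placeholder details.

```python
class Contact:
    def __init__(self, name, address, telephone):
        self.name = name
        self.address = address
        self.telephone = telephone


def sort_contacts(contacts):
    """Sort the contacts list in place by name using the built-in list sort."""
    # The key function tells sort which value to compare for each Contact
    contacts.sort(key=lambda contact: contact.name)


# Placeholder data for illustration only
contacts = [
    Contact("Rob Miles", "address unknown", "phone unknown"),
    Contact("Fred Bloggs", "address unknown", "phone unknown"),
    Contact("Ada Lovelace", "address unknown", "phone unknown"),
]
sort_contacts(contacts)
print([contact.name for contact in contacts])
```

Like the comparison in the bubble sort, this compares the names as plain strings, so upper and lower case letters are not treated as equal.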
We then need to include the sorting option in the display menu
menu = """Tiny Contacts
1. New Contact
2. Find Contact
3. Edit Contact
4. Sort Contacts
5. Exit Program
Enter your command: """
file_name = "contacts.pickle"
try:
    load_contacts(file_name)
except:  # noqa: E722
    print("Contacts file not found")
    contacts = []

while True:
    command = BTCInput.read_int_ranged(prompt=menu, min_value=1, max_value=5)
    if command == 1:
        new_contact()
    elif command == 2:
        display_contacts()
    elif command == 3:
        edit_contacts()
    elif command == 4:
        sort_contacts()
    elif command == 5:
        try:
            save_contacts(file_name)
        except:  # noqa: E722
            print("Contacts failed to save")
        break
    else:
        raise ValueError("Unexpected command id found: " + str(command))
To answer the final question of what functionality we can pull out, we can see that both display_contacts and edit_contacts contain code for displaying an individual Contact. We can pull this code out into a distinct display_contact function responsible for displaying a single Contact. We can then update display_contacts and edit_contacts to defer the display functionality to display_contact
def display_contact(contact):
    """
    Displays the Contact details for the supplied contact

    Parameters
    ----------
    contact : Contact
        contact to display

    Returns
    -------
    None

    See Also
    --------
    display_contacts : Displays all contacts matching a search name
    """
    print("Name:", contact.name)
    print("Address:", contact.address)
    print("Telephone:", contact.telephone, "\n")
The new display_contacts now looks like,
def display_contacts():
    """
    Prompts the user for a contact name and
    displays all matching contacts

    Returns
    -------
    None

    See Also
    --------
    display_contact : displays a single contact
    """
    print("Find contact")
    contacts = find_contacts(
        BTCInput.read_text("Enter the contact name (Press enter to display all): ")
    )
    if len(contacts) > 0:
        for contact in contacts:
            display_contact(contact)
    else:
        print("This name was not found")
and edit_contacts looks similar
Further Exercises
This Chapter also contains two further examples which demonstrate building data storage applications that handle more functionality than tiny contacts.
Due to the size of the discussion required for each exercise they are linked on a separate page
Music Tracks
Focuses on demonstrating adding additional functionality to a basic storage application for data interrogation
- Develops an application that can store music tracks
- Implements searching and sorting based on length of the track
- Supports the ability to create and save playlists
- Playlists can be interrogated for their total length
- The program can suggest playlists that match the user’s target runtime
Recipe Register
Demonstrates working with larger more complicated objects in data storage
- Develops an application for storing recipes
- Users can search for recipes by their name or their ingredients
- Supports different mechanisms for viewing details about a recipe
- Viewing the ingredients
- Viewing the steps
- Stepping through the recipe, step by step
You are encouraged to work through or examine these exercises
Summary
- Python lets you define Classes
- Classes can have data attributes
- data attributes can be defined at construction via a constructor
- data attributes can be added dynamically at runtime
  - classes may define a constructor via `__init__` to control how they are created
- Python variables are references to memory objects
  - If there are multiple references to one object then changing the object via one reference will be propagated to the other references
- Some fundamental python types (`int`, `str`, `float`) are immutable
  - Assigning a value to a variable holding an immutable type creates a new memory object with that value
  - Other references to the original object are untouched
  - This allows them to be manipulated as simple values
- `pickle` is a library for serialising python objects as binary data
- Python provides a dictionary data object that can be used to store a collection of key-value pairs
Questions and Answers
- If an object has `name`, `address` and `telephone` attributes can a program treat it as a `Contact` instance?
  - Yes, Python uses what is called duck-typing
    - “If it walks like a duck, quacks like a duck, it is a duck”
    - Means that if it behaves like a `Contact` it can be used as a `Contact`
    - If the programmer makes a mistake however, a runtime exception is raised
    - This means we could define a `Customer` with the same fields and treat it as a `Contact`
  - Python does provide mechanisms for explicit type checking
    - We’ve seen one for example (`type`)
  - Other languages have different rules
    - Java, C# and C++ are statically typed (often loosely described as “Strongly Typed”)
    - The type of the variable is fixed and we can only work with objects, functions and operators that support that type
    - Such languages typically allow you to catch mismatched type errors at compile-time before a program is run
- Can an object contain a reference to itself?
- Yes, though this is usually not a good idea
- Typically objects are daisy-chained together
- Object A references Object B references Object C etc.
- The most basic structure is called a linked list
- More complex structures like trees have more complicated referential structures
- Is an object forced to have a constructor / initializer?
  - No, we saw this with the first `Contact` which was a simple blank class
  - `__init__` provides greater ability to ensure that objects are created properly though
- Can you stop a program from adding new attributes to an object?
- No
- This has the impact of allowing us to create incompatible and distinct instances of the same class where for some reason one has been augmented with an additional attribute
- Can you remove attributes from an object?
  - Yes, you can use the `del` operator to delete an attribute
del(rob.name)
print(rob.address)
rob.name
18 Pussycat Mews
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[34], line 3
      1 del(rob.name)
      2 print(rob.address)
----> 3 rob.name
AttributeError: 'Contact' object has no attribute 'name'
- What is immutable again?
- immutable means an unchangeable object
- When we try to change an immutable object python instead creates a copy with the new values
- immutability improves data storage especially for primitive types
- e.g. We might have a story of one million words
- The story string itself consists of words stored as strings which are immutable
- A list of words uses references to refer to each word
- Certain words are probably referred to multiple times (e.g. the)
- Since the words are immutable we can reference one instance of the string `"the"` rather than having a distinct memory object for each instance (which we would need if they were mutable)
- How does the operating system know it’s storing a binary file?
- It doesn’t
- Really all files are binary
- File system simply responsible for locating and retrieving files
- Programs are the ones that impose meaning on a file
- file extensions are there to help an operating system or user associate different file formats with their respective programs
- Can two items in a dictionary have the same key?
- Strictly no, if you wish to store multiple objects with the same key you would have to use a dictionary where the value was some form of collection e.g. a list or tuple.
- You would then access the list via the dictionary key
- Then have to search through the list to get the specific value you were interested in
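The last answer can be sketched as follows: a dictionary whose values are lists, so that one key can hold several items. The `phone_book` name, the `add_number` helper and the extra telephone numbers are all made up for illustration.

```python
phone_book = {}


def add_number(name, number):
    """Store a number under a name, allowing several numbers per name."""
    # setdefault creates the empty list the first time a name is seen
    phone_book.setdefault(name, []).append(number)


# Made-up numbers for illustration only
add_number("Rob Miles", "+44(1234) 56789")
add_number("Rob Miles", "+44(1234) 00000")
add_number("Fred Bloggs", "+44(1234) 11111")

# Access the list via the key, then search the list for the specific value
for number in phone_book["Rob Miles"]:
    if number.endswith("00000"):
        print("Found:", number)
```

The dictionary still has exactly one entry per key; it is the list stored as the value that holds the duplicates.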