Item 27: Prefer defaultdict over setdefault to Handle Missing Items in Internal State
Notes
- We’ve seen that in general `get` is preferable to `setdefault` (Item 26)
- However, we’ve also seen that `get` isn’t the cleanest interface when the type stored in the dictionary is complex
- For example, let’s keep track of the cities visited in countries around the world
- Here `setdefault` lets us add new cities regardless of whether the country key already exists

    visits = {"Mexico": {"Tulum", "Puerto Vallarta"}, "Japan": {"Hakone"}}

    # Short
    visits.setdefault("France", set()).add("Arles")

    # Long
    if (japan := visits.get("Japan")) is None:
        visits["Japan"] = japan = set()
    japan.add("Kyoto")

    print(visits)
    # Prints: {'Mexico': {'Tulum', 'Puerto Vallarta'}, 'Japan': {'Hakone', 'Kyoto'}, 'France': {'Arles'}}

- Here the `get` version is clearly longer and harder to read
- One might be tempted to wrap the code above in a class to hide the complexity from the user

    class Visits:
        def __init__(self):
            self.data = {}

        def add(self, country, city):
            city_set = self.data.setdefault(country, set())
            city_set.add(city)

    visits = Visits()
    visits.add("Russia", "Yekaterinburg")
    visits.add("Tanzania", "Zanzibar")
    print(visits.data)
    # Prints: {'Russia': {'Yekaterinburg'}, 'Tanzania': {'Zanzibar'}}

- This hides the `setdefault` call
- `add` provides a cleaner, more meaningful interface
- There are still downsides
  - The complexity is still present in the internals of the `add` method
  - `setdefault` constructs a `set` object on every call (even if it isn’t assigned in the end)
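
The allocation downside can be made visible with a small sketch. `CountingSet` and the city names are hypothetical, introduced here only to count constructions; they are not part of the original notes.

```python
class CountingSet(set):
    """Hypothetical helper: a set that counts how often it is constructed."""
    created = 0

    def __init__(self, *args):
        super().__init__(*args)
        CountingSet.created += 1


data = {}
for city in ("Tulum", "Puerto Vallarta", "Cancun"):
    # The default argument is evaluated before setdefault runs,
    # so a new CountingSet is built on every iteration,
    # even though only the first one is ever stored.
    data.setdefault("Mexico", CountingSet()).add(city)

print(CountingSet.created)   # 3 constructions for a single key
print(len(data["Mexico"]))   # 3 cities, all in the first set
```

Only the first `CountingSet` ends up in the dictionary; the other two are created and immediately discarded.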
- `defaultdict` simplifies this use case
  - Provided in the `collections` built-in module
  - Accepts a function that returns a default value whenever a key is missing
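
As a rough illustration of the "function that returns a default value" point, the sketch below passes a custom factory to `defaultdict` and records when it runs. The names `make_city_set` and `factory_calls` are assumptions made for this example only.

```python
from collections import defaultdict

factory_calls = []  # hypothetical log of every factory invocation


def make_city_set():
    """Hypothetical factory: builds the default value and records the call."""
    factory_calls.append("called")
    return set()


visits = defaultdict(make_city_set)
visits["Mexico"].add("Tulum")            # missing key: factory is called
visits["Mexico"].add("Puerto Vallarta")  # existing key: factory is not called
visits["Japan"].add("Hakone")            # missing key: factory is called

print(len(factory_calls))  # 2
print(dict(visits))
```

Any zero-argument callable works as the factory, which is why passing the `set` type itself is enough in the common case.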
- We can rewrite our `Visits` class from before using `defaultdict`

    from collections import defaultdict

    class Visits:
        def __init__(self):
            self.data = defaultdict(set)

        def add(self, country, city):
            self.data[country].add(city)

    visits = Visits()
    visits.add("Russia", "Yekaterinburg")
    visits.add("Tanzania", "Zanzibar")
    print(visits.data)
    # Prints: defaultdict(<class 'set'>, {'Russia': {'Yekaterinburg'}, 'Tanzania': {'Zanzibar'}})

- `add` is now succinct
- Code can assume that accessing any key in the data will return a `set` instance
- Only allocates a `set` when required
- Using `defaultdict` is better than using `setdefault` for this kind of internal state
- `defaultdict` doesn’t solve every problem, but it is useful to know about
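
One behaviour worth keeping in mind, offered here as an illustration rather than something taken from the notes above: indexing a missing key on a `defaultdict` stores the default value under that key, while `in` and `get` leave the dictionary unchanged. A minimal sketch:

```python
from collections import defaultdict

visits = defaultdict(set)
visits["Japan"].add("Hakone")

# Membership tests and get() do not trigger the default factory.
print("France" in visits)    # False
print(visits.get("France"))  # None

# Indexing a missing key does: an empty set is stored under "France".
visits["France"]
print("France" in visits)    # True
print(len(visits))           # 2
```

This is what lets `add` assume a `set` is always present, but it also means a plain read can grow the dictionary.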
Things to Remember
- If you are creating a dictionary to manage an arbitrary set of potential keys, prefer a `defaultdict` instance from `collections` if you need complex types as default values
- If a dictionary of arbitrary keys is passed to you (i.e. you don’t control its creation), then prefer `get`
  - Consider `setdefault` if it leads to shorter code and the cost of allocating the default object is low
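
For the case where the dictionary is handed to you, a sketch along these lines shows `get` reading a missing key without modifying the caller's data. The function `count_cities` and the variable `external` are hypothetical names used only for this example.

```python
def count_cities(visits, country):
    """Hypothetical helper: count cities without touching the caller's dict."""
    # get() returns the fallback for a missing key and inserts nothing.
    return len(visits.get(country, ()))


external = {"Mexico": {"Tulum", "Puerto Vallarta"}}

print(count_cities(external, "Mexico"))  # 2
print(count_cities(external, "France"))  # 0
print("France" in external)              # False, the dict is unchanged
```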