8: Dictionaries#
What are dictionaries and why should we care about them?#
Dictionaries are for associating data and quick lookup
Motivating example: I am making an index for a book, because I want to know which concepts show up on which pages, to make it easier to jump back to the right spots.
1# how to know which chapters talk about strings? or debugging?
2book = [
3 "Chapter 1: talks about strings and how they have the property of immutability also some basic debugging",
4 "Chapter 2: continues talking about advanced methods for strings and also introduces the concept of functions",
5 "Chapter 3: discusses iteration and lists and also debugging",
6]
Here’s what it might look like without dictionaries.
1concepts = ['strings', 'debugging', 'immutability']
2index = []
3
4for chapter in book:
5 # split into elements based on the colon
6 elements = chapter.split(":")
7 # first element is the chapter
8 chapter = elements[0]
9 # second element is the text
10 text = elements[1]
11
12 # parse the text
13 words = text.split()
14 for keyconcept in concepts:
15 if keyconcept in words:
16 index.append([keyconcept, chapter])
17index
[['strings', 'Chapter 1'],
['debugging', 'Chapter 1'],
['immutability', 'Chapter 1'],
['strings', 'Chapter 2'],
['debugging', 'Chapter 3']]
1# given this data structure, how to find
2# all the chapters that have strings in them?
3query = "strings"
4results = []
5# go through every item in the index
6for item in index:
7 concept, chapter = item
8 # if the concept matches the query
9 if query == concept:
10 results.append(chapter)
11
12print(results)
['Chapter 1', 'Chapter 2']
And here is what it might look like with dictionaries.
1# now consider if the concept index is in a dictionary (much more like what you might see in a book!)
2index_d = {
3 'strings': ['Chapter 1', 'Chapter 2'],
4 'debugging': ['Chapter 1', 'Chapter 3']
5}
1# given this data structure, how to find
2# all the chapters that have strings in them?
3query = "strings"
4index_d.get("strings")
['Chapter 1', 'Chapter 2']
Another common use case: attributes of data entries. For instance, attributes of a class, like credit hours, pre-reqs, instructor, location, hours, and so on.
1# without dictionaries
2courses = [
3 ["INST126", 3, "no", "Chan", "hybrid", "MWF"],
4 ["INST256", 4, "yes", "Kanishka", "in-person", "TR"]
5]
6
7# look up INST126 and check whether it has prereqs
8for course in courses:
9 if course[0] == "INST126":
10 print(course[2])
no
Rather than trying to remember which position we happened to have decided to use to store a particular attribute (if we used lists), we can use semantically meaningful indices for values, i.e., keys!
1# with dictionaries
2courses = {
3 "INST126": {
4 "credit hours": 3, "prereqs": "no", "instructor": "Chan", "location": "hybrid", "hours": "MWF"
5 },
6 "INST256": {
7 "credit hours": 4, "prereqs": "yes", "instructor": "Rony", "location": "in-person", "hours": "TR"
8 },
9}
10
11# look up INST126 and check whether it has prereqs
12courses.get("INST256").get("instructor")
'Rony'
It’s a lot easier to remember keys (if we name them useful things) compared to just indices. And Python can help us remember too!
If you’re interested, there are also formal technical reasons to prefer dictionaries over lists if you care about speed/efficiency and your computational task is checking if an item exists in a collection and you’re dealing with very large scale data. The short version: looking up a key in a dictionary is O(1) (constant time: just 1 operation, no matter how big the dictionary is), while checking if an item is in a list is O(n) (grows with the size of the list: even though computers are pretty fast, when you get to quite large database sizes, you’ll notice a difference!). See https://www.geeksforgeeks.org/python/difference-between-list-and-dictionary-in-python/ for a beginner-friendly explanation, or https://wiki.python.org/moin/TimeComplexity for the official reference.
In the future, you may learn the pandas library (and the dataframe data structure, which is sort of a hybrid of lists and dictionaries): you can do really fast lookup, but also sort stuff!
Anatomy of a dictionary#
Dictionaries are not so different from… our dictionaries in real life. :) Basically map a bunch of keys (e.g., a word) to corresponding values (e.g., a definition). Another example is indices in the back of print books that map key terms to pages where that term shows up, or tags on websites, that map tags to webpages that include those tags.
The key parts of a dictionary literal are:
The
{ }curly braces, which tell you and Python that it’s a dictionary (similar to""for strings, or[]for lists)At least one entry (but usually several), that maps a value on the right of a
:— which functions like the=expression — to a key on the left. For example, our first entry maps the value “apple” to the key “a”.Similar to lists, we include
,to separate multiple entries in the dictionary.
Let’s look at a simple example that maps letters to an example word that starts with the letter
1d = {
2 'a': 'apple', # an entry that maps the value apple to the letter a
3 'b': 'ball', # another entry that maps the value ball to the letter b
4 'c': 'crayon'
5}
6d
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}
Style note: the indentation inside the dictionary is more for readability. Python doesn’t care, as long as there are commas between the dictionary entries.
You can also write it out like this, but I find it harder to read:
1d = {'a': 'apple', 'b': 'ball', 'c': 'crayon'}
2d
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}
Here are some other examples.
1# map letters to numbers
2another = {
3 'a': 1,
4 'b': 2,
5 'c': 3
6}
7
8# map letter grades to list of scores
9grades = {
10 'A': [93, 100],
11 'B': [87, 93]
12}
Properties of a dictionary#
Here are some key properties of a dictionary in Python:
Dictionaries have length#
Similar to lists, dictionaries have length.
1d = {
2 'a': 'apple', # an entry that maps the value apple to the letter a
3 'b': 'ball', # another entry that maps the value ball to the letter b
4 'c': 'crayon'
5}
6len(d)
3
Dictionaries do not have an order or indexing by position#
Different from lists, dictionaries do not have an order. So you can’t really sort a dictionary, or grab things by position. You grab things by… key!
1# can't get things by position
2d[0]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[12], line 2
1 # can't get things by position
----> 2 d[0]
KeyError: 0
1# can't sort either
2d.sort()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[13], line 2
1 # can't sort either
----> 2 d.sort()
AttributeError: 'dict' object has no attribute 'sort'
1sorted(d) # this does *not* sort the dictionary; it returns a sorted list of the keys
['a', 'b', 'c']
Though as we will see next week, you can sneakily get some order, because Python now allows you to actually iterate through a dictionary in the order that stuff was put into it. It’s not reliable though! If you need your collection to have an actual sequence to it, don’t use dictionaries!
All keys in a dictionary are unique#
Also, all keys in a dictionary have to be unique. This makes dictionaries handy for keeping track of unique items (in contrast to say, lists, where you can have duplicate entries).
Values in the dictionary do not have to be unique, though: you can have different keys point to the same value, but not multiple values point to duplicate keys. There is a related data structure that has a similar property called sets if you’re interested.
1d = {
2 'a': 'apple', # an entry that maps the value apple to the letter a
3 'b': 'ball', # another entry that maps the value ball to the letter b
4 'c': 'crayon',
5 'a': 'animal',
6 'd': 'ball'
7}
8print(d) # oops, where did apple go? overwritten by the last a:animal entry since we can't have duplicate keys. d:ball is fine though, even though we already have b:ball.
{'a': 'animal', 'b': 'ball', 'c': 'crayon', 'd': 'ball'}
Dictionaries are mutable#
Like lists, dictionaries are also mutable: you can modify them directly (in contrast to strings, where you never modify them directly, but only ever create a new modified version of the string).
1print(f"d before modification: {d}")
2print("modifying d, by adding an entry mapping the key f to the value friend")
3d.update({"f": "friend"})
4print(f"d after modification: {d}")
d before modification: {'a': 'animal', 'b': 'ball', 'c': 'crayon', 'd': 'ball'}
modifying d, by adding an entry mapping the key f to the value friend
d after modification: {'a': 'animal', 'b': 'ball', 'c': 'crayon', 'd': 'ball', 'f': 'friend'}
What kinds of data can we put in a dictionary?#
Anything goes for values#
Basically anything goes for values. You can even nest a dictionary inside another dictionary, by mapping a dictionary value to some key.
1def hello():
2 print("hello world!")
3
4d = {
5 'a': ['apple', "animal"], # an entry that maps a list to the letter a
6 'b': 'ball', # another entry that maps the value ball to the letter b
7 'c': 2,
8 'd': [1, 3, "denizen"],
9 'e': hello # even a function is fine!
10}
11d
{'a': ['apple', 'animal'],
'b': 'ball',
'c': 2,
'd': [1, 3, 'denizen'],
'e': <function __main__.hello()>}
1students = {
2 'joel': {
3 'major': 'info sci',
4 'year': 'senior',
5 'interests': ['programming', 'football', 'dancing']
6 },
7}
8students
{'joel': {'major': 'info sci',
'year': 'senior',
'interests': ['programming', 'football', 'dancing']}}
Keys must be hashable#
But keys need to be hashable.
What does this mean?
From the Python glossary:
An object is hashable if it has a hash value which never changes during its lifetime (it needs a hash() method), and can be compared to other objects (it needs an eq() or cmp() method). Hashable objects which compare equal must have the same hash value. Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.
All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal, and their hash value is their id().
More info here: https://stackoverflow.com/questions/14535730/what-does-hashable-mean-in-python
1d = {['3']: 'apple'}
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[19], line 1
----> 1 d = {['3']: 'apple'}
TypeError: unhashable type: 'list'
I mention this because a common error when first working with dictionaries is to try to use an unhashable data structure as a key.
Understanding hashes is not something to worry about for now. We can focus on a basic rule of thumb for now: strings and numbers are ok as keys; everything else (that you’ll learn now) is not.
Working with dictionaries: basics#
Create a dictionary#
1d = {
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon',
5}
You can also start with an empty dictionary, and then add stuff later, programmatically or with other functions.
1emptyd = {} # dictionary with nothing in it
2emptyd = dict() # same thing
3print(emptyd)
4len(emptyd)
{}
0
PRACTICE: make a dictionary e that has the following entries: smith:A, chan:B, abuye:A+
1# your code here
Get the value associated with a key from a dictionary#
1d = {
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon'
5}
6d['c'] # put a key inside square brackets associated with a dictionary
'crayon'
This is called the “old style” or “indexing” pattern. Looks a little bit like lists.
1d = {
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon'
5}
6d.get('c') # use the get function to get the value for the key that we give it
'crayon'
This is the newer pattern that I prefer for clarity.
It also has the advantage of not breaking your program if you try to access a key that doesn’t exist.
1d = {
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon'
5}
6d['d'] # will crash the program with key error
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[25], line 6
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon'
5 }
----> 6 d['d'] # will crash the program with key error
KeyError: 'd'
.get() lets you specify a default value that should come back if the key doesn’t exist. This is very useful for writing clean and understandable dictionary patterns, such as indexing, which we’ll dig into next week.
1d = {
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon'
5}
6result = d.get('d', "Key not found") # will return "Key not found" as the default
7print(result)
Key not found
PRACTICE: how do you get the value associated with the key a?
1# your code here
Adding entries to a dictionary (or updating entries)#
Classic style, using indexing and assignment
1d = {
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon'
5}
6print(d)
7# add a new entry for e
8d['e'] = 'egg' # map the value egg to the key e
9print(d)
10# update the entry for b
11d['b'] = 'bread' # map the value bread to the key b (which happens to already exist, so we update it)
12print(d)
13# update the entries for a and c
14d['a'] = 'ashes'
15d['c'] = 'charming'
16print(d)
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}
{'a': 'apple', 'b': 'ball', 'c': 'crayon', 'e': 'egg'}
{'a': 'apple', 'b': 'bread', 'c': 'crayon', 'e': 'egg'}
{'a': 'ashes', 'b': 'bread', 'c': 'charming', 'e': 'egg'}
Newer style, using .update()
1d = {
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon'
5}
6print(d)
7# add a new entry for e
8d.update({'e': 'egg'}) # map the value egg to the key e
9print(d)
10# update the entry for b
11d.update({'b': 'bread'}) # map the value bread to the key b (which happens to already exist, so we update it)
12print(d)
13# update the entries for a and c
14new_for_a = "ashes"
15d.update({'a': new_for_a, 'c': 'charming'})
16print(d)
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}
{'a': 'apple', 'b': 'ball', 'c': 'crayon', 'e': 'egg'}
{'a': 'apple', 'b': 'bread', 'c': 'crayon', 'e': 'egg'}
{'a': 'ashes', 'b': 'bread', 'c': 'charming', 'e': 'egg'}
PRACTICE: how do you update d with new entry h:hello?
1# your code here
PRACTICE: how do you update d so that a maps to the value asteroid?
1# your code here
.update() has the advantage of being able to add multiple key-value pairs at once.
1d = {
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon'
5}
6print(d)
7d.update({'a': 'ashes', 'b':'bread', 'c':'charming', 'e': 'egg'})
8print(d)
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}
{'a': 'ashes', 'b': 'bread', 'c': 'charming', 'e': 'egg'}
PRACTICE: how do you update d with new entries f:friend and g:grapes in a single operation?
1# your code here
PRACTICE: how do you update d so that f now points to funny and b points to bundle, in a single operation?
1# your code here
You can use the pattern that is comfortable for you, but I prefer .get() and .update() for now because it’s more readable and robust.
List keys and values#
.keys() gives you all the keys in the dictionary
.values() gives you all the values in the dictionary
.items() gives you all the entries in the dictionary
Each of these is iterable.
1d
{'a': 'ashes', 'b': 'bread', 'c': 'charming', 'e': 'egg'}
1# list all the keys in the dictionary
2d.keys()
dict_keys(['a', 'b', 'c', 'e'])
1# list all the values in the dictionary
2d.values()
dict_values(['ashes', 'bread', 'charming', 'egg'])
1# list all the key-value pairs in the dictionary
2d.items()
dict_items([('a', 'ashes'), ('b', 'bread'), ('c', 'charming'), ('e', 'egg')])
1# this means you can iterate through the keys/values
2for key in d.keys():
3 print(key, d.get(key))
a ashes
b bread
c charming
e egg
1# iterate through the items
2for key, value in d.items():
3 print(f'{key} is associated with the value {value}')
a is associated with the value ashes
b is associated with the value bread
c is associated with the value charming
e is associated with the value egg
Check if a key is in a dictionary#
1d = {
2 'a': 'apple',
3 'b': 'ball',
4 'c': 'crayon'
5}
6print('h' in d)
False
Can only use in operator with keys.
You can also do this with .get()!
1# retrieve value for h, if it's not found, return False
2d.get("h", False)
False
PRACTICE: how do you check if the key “c” is in the dictionary d?
1# your code here
Reverse look up keys from values: YOU CAN’T! Not really…#
Dictionaries are very powerful transformations of your data that make it REALLY easy to do a specific kind of operation, but lock you out of doing other things. So design the structure of the dictionary carefully. For example, if you make an index, and find that you actually care a lot about grabbing the top N words, you probably want to map counts (as keys) to words (as values), not words to counts.
1s = "she sells sea shells by the sea shore in the sea and the shells and the sea sea sea"
2
3d = {} # define a dictionary to hold the index
4# go through word by word
5for word in s.split():
6 # get the current count for the word, default to 0 if we haven't seen it
7 current_count = d.get(word, 0)
8 # update the count
9 new_count = current_count + 1
10 # update the dictionary with the word and count
11 d.update({word: new_count})
12
13d
{'she': 1,
'sells': 1,
'sea': 6,
'shells': 2,
'by': 1,
'the': 4,
'shore': 1,
'in': 1,
'and': 2}
Could try to invert it, but….
1def reverse_dictionary(d):
2 return {v: k for k, v in d.items()}
You actually lose information, because remember: keys are unique! No duplicates! So we lose one of our “2” entries.
1d_invert = reverse_dictionary(d)
2d_invert
{1: 'in', 6: 'sea', 2: 'and', 4: 'the'}
1d_invert = {}
2for word, count in d.items():
3 words = d_invert.get(count, []) # get the current list of words associated with this count
4 words.append(word)
5 d_invert.update({count: words})
6d_invert
{1: ['she', 'sells', 'by', 'shore', 'in'],
6: ['sea'],
2: ['shells', 'and'],
4: ['the']}
Practice: Code Tracing with Dictionaries#
Predict the output of each code snippet before running the cell!
Trace 1#
1d = {"x": 10, "y": 20, "z": 30}
2print(d["y"])
A)
20B)
"y"C)
10D)
1
Answer:
A) 20
d["y"] retrieves the value associated with the key "y", which is 20.
Trace 2#
1d = {"a": 1, "b": 2}
2d["c"] = 3
3d["a"] = 99
4print(d)
5print(len(d))
A)
{'a': 1, 'b': 2, 'c': 3}then3B)
{'a': 99, 'b': 2, 'c': 3}then3C)
{'a': 99, 'b': 2}then2D)
{'a': 1, 'a': 99, 'b': 2, 'c': 3}then4
Answer:
B) {'a': 99, 'b': 2, 'c': 3} then 3
d["c"] = 3 adds a new entry. d["a"] = 99 updates the existing key "a" (keys are unique — you can’t have two "a" entries). So the dict has 3 entries.
Trace 3#
1d = {"name": "Joel", "age": 30}
2result = d.get("email", "not found")
3print(result)
4print(len(d))
A)
Nonethen2B)
"not found"then2C)
"not found"then3D) KeyError
Answer:
B) "not found" then 2
"email" is not a key in d, so .get() returns the default value "not found". Importantly, .get() does not add the key to the dictionary — d still has only 2 entries.
Trace 4#
1d = {"a": [1, 2], "b": [3]}
2d["a"].append(3)
3d["c"] = []
4d["c"].append(7)
5print(d)
A)
{'a': [1, 2], 'b': [3], 'c': [7]}B)
{'a': [1, 2, 3], 'b': [3], 'c': [7]}C)
{'a': [3], 'b': [3], 'c': [7]}D)
{'a': [1, 2, 3], 'b': [3], 'c': []}
Answer:
B) {'a': [1, 2, 3], 'b': [3], 'c': [7]}
Dictionary values can be lists (or any other type). d["a"].append(3) gets the list [1, 2] and appends 3 to it — lists are mutable, so this modifies the list in place. d["c"] = [] creates a new key with an empty list, then .append(7) adds to it.
Trace 5#
1d = {"x": 10, "y": 20}
2d.update({"y": 50, "z": 30})
3print(d)
4print("z" in d)
5print(30 in d)
A)
{'x': 10, 'y': 50, 'z': 30}thenTruethenTrueB)
{'x': 10, 'y': 50, 'z': 30}thenTruethenFalseC)
{'x': 10, 'y': 20, 'z': 30}thenTruethenFalseD)
{'x': 10, 'y': 20, 'y': 50, 'z': 30}thenTruethenTrue
Answer:
B) {'x': 10, 'y': 50, 'z': 30} then True then False
.update() adds "z": 30 and updates "y" from 20 to 50. "z" in d checks if "z" is a key → True. 30 in d checks if 30 is a key → False (30 is a value, not a key). The in operator only checks keys!
Working with nested dictionaries#
So far, most of our dictionary values have been simple: a string, a number, a boolean. But one of the most powerful things about dictionaries is that values can be dictionaries too. This lets us model structured data — things with multiple attributes — in a very natural way.
Why nest?#
Think about a student record. You could try to cram everything into a flat dictionary:
1# flat: one key per piece of info
2student_joel_name = "Joel"
3student_joel_major = "Info Sci"
4student_joel_year = "Senior"
5# this gets messy fast with multiple students...
Or you could use a nested dictionary where each student maps to a dictionary of their attributes:
1students = {
2 "joel": {
3 "major": "Info Sci",
4 "year": "Senior",
5 "gpa": 3.5
6 },
7 "sarah": {
8 "major": "Computer Science",
9 "year": "Junior",
10 "gpa": 3.8
11 }
12}
This is much cleaner. Each student has a “record” of attributes, and we can look up any attribute for any student.
Reading nested values: chain the lookups#
To get a value from a nested dictionary, you chain two lookups — first the outer key, then the inner key.
1# get Joel's major
2print(students["joel"]["major"])
Info Sci
Chaining: a familiar idea#
This idea of chaining should feel familiar from strings! Remember how we could chain string methods like .strip().lower().replace("#", "")? That worked because each method returned a string, so you could immediately call another method on it.
The same principle is at work here: each operation returns something, and you can immediately do another operation on that result.
" Hello ".strip()→ returns a string →.lower()works on itemail.split("@")[0]→.split()returns a list →[0]indexes into itstudents["joel"]["major"]→students["joel"]returns a dictionary →["major"]looks up a key in it
It’s the same pattern every time: the result of one operation becomes the input to the next.
Breaking it into steps#
If chaining feels confusing, you can always break it into separate steps — it does the exact same thing:
1# step 1: get Joel's record (this gives us a dictionary)
2joel_record = students["joel"]
3print(type(joel_record))
4print(joel_record)
5
6# step 2: get the major from that dictionary
7print(joel_record["major"])
<class 'dict'>
{'major': 'Info Sci', 'year': 'Senior', 'gpa': 3.5}
Info Sci
The one-liner students["joel"]["major"] just combines both steps. Think of it as reading left to right: “from students, get "joel", then from that, get "major".”
Safe chaining with .get()#
You can also chain .get() calls for safer access:
1# safe access with .get()
2print(students.get("joel", {}).get("major", "unknown"))
3
4# if the outer key doesn't exist, .get() returns {},
5# and then .get("major") on {} returns "unknown"
6print(students.get("pat", {}).get("major", "unknown"))
Info Sci
unknown
This works because .get("pat", {}) returns an empty dictionary {} when "pat" isn’t found, and then .get("major", "unknown") on that empty dictionary returns "unknown". Each step produces something the next step can work with.
Updating nested values#
To update a value inside a nested dictionary, chain the lookups on the left side of the assignment:
1# Joel's GPA went up
2students["joel"]["gpa"] = 3.6
3print(students["joel"])
{'major': 'Info Sci', 'year': 'Senior', 'gpa': 3.6}
1# Sarah changed her major
2students["sarah"]["major"] = "Data Science"
3print(students["sarah"])
{'major': 'Data Science', 'year': 'Junior', 'gpa': 3.8}
Adding a new entry to the outer dictionary#
To add a whole new record, assign a new dictionary to a new outer key:
1# add a new student
2students["rony"] = {
3 "major": "Info Sci",
4 "year": "Senior",
5 "gpa": 3.9
6}
7print(students["rony"])
{'major': 'Info Sci', 'year': 'Senior', 'gpa': 3.9}
Adding a new field to an existing inner dictionary#
You can also add new keys to an inner dictionary — it’s just a regular dictionary, after all:
1# add an email field to Joel's record
2students["joel"]["email"] = "joel@umd.edu"
3print(students["joel"])
{'major': 'Info Sci', 'year': 'Senior', 'gpa': 3.6, 'email': 'joel@umd.edu'}
A real-world example: course catalog#
Let’s put this together with a more realistic example. Here’s a course catalog where each course has multiple attributes:
1catalog = {
2 "INST126": {
3 "title": "Intro to Programming",
4 "instructor": "Joel",
5 "credits": 3,
6 "prereqs": []
7 },
8 "INST326": {
9 "title": "OO Programming",
10 "instructor": "Pat",
11 "credits": 3,
12 "prereqs": ["INST126"]
13 },
14 "INST414": {
15 "title": "Data Science",
16 "instructor": "Sarah",
17 "credits": 3,
18 "prereqs": ["INST126", "INST201"]
19 }
20}
Now we can answer questions by chaining lookups:
1# who teaches INST326?
2print(catalog["INST326"]["instructor"])
Pat
1# what are the prereqs for INST414?
2print(catalog["INST414"]["prereqs"])
['INST126', 'INST201']
1# does INST126 have any prereqs?
2prereqs = catalog["INST126"]["prereqs"]
3if len(prereqs) == 0:
4 print("INST126 has no prerequisites")
5else:
6 print(f"INST126 requires: {prereqs}")
INST126 has no prerequisites
And we can update it:
1# Pat is no longer teaching INST326, Rony is
2catalog["INST326"]["instructor"] = "Rony"
3
4# INST326 added a new prereq
5catalog["INST326"]["prereqs"].append("INST201")
6
7print(catalog["INST326"])
{'title': 'OO Programming', 'instructor': 'Rony', 'credits': 3, 'prereqs': ['INST126', 'INST201']}
Notice that last line: catalog["INST326"]["prereqs"] gives us a list, and since lists are mutable, we can .append() to it directly. This is the power of nesting — the inner values follow all the rules of their own type.
Common pattern: building nested dictionaries from data#
Often you’ll start with raw data (like a list of strings) and need to build up a nested dictionary. The pattern is the same as building a flat dictionary, but the value you create or update is itself a dictionary (or a list inside a dictionary).
1# raw roster data: "name,section,score"
2roster_data = [
3 "Joel,A,81",
4 "Sarah,B,95",
5 "Rony,A,98",
6 "Kacie,C,88"
7]
8
9roster = {}
10for entry in roster_data:
11 name, section, score = entry.split(",")
12 roster[name] = {
13 "section": section,
14 "score": int(score)
15 }
16
17print(roster)
{'Joel': {'section': 'A', 'score': 81}, 'Sarah': {'section': 'B', 'score': 95}, 'Rony': {'section': 'A', 'score': 98}, 'Kacie': {'section': 'C', 'score': 88}}
1# now we can look up any student's info
2print(f"Rony is in section {roster['Rony']['section']} with score {roster['Rony']['score']}")
Rony is in section A with score 98
Summary#
Working with nested dictionaries follows the same rules as flat dictionaries — you just chain the operations:
Operation |
Flat |
Nested |
|---|---|---|
Read |
|
|
Update |
|
|
Add outer entry |
|
|
Add inner field |
n/a |
|
Safe read |
|
|
Practice: Code Tracing with Nested Dictionaries#
Predict the output of each code snippet before running the cell!
Trace 1#
1users = {
2 "alice": {"role": "admin", "active": True},
3 "bob": {"role": "editor", "active": False}
4}
5print(users["alice"]["role"])
A)
aliceB)
adminC)
{"role": "admin", "active": True}D)
True
Answer:
B) admin
users["alice"] returns the inner dictionary {"role": "admin", "active": True}. Then ["role"] looks up the key "role" in that dictionary, which gives "admin".
Trace 2#
1menu = {
2 "burger": {"price": 12.99, "vegetarian": False},
3 "salad": {"price": 9.50, "vegetarian": True}
4}
5menu["burger"]["price"] = 13.99
6menu["salad"]["calories"] = 350
7print(menu["burger"]["price"])
8print(menu["salad"])
A)
12.99then{"price": 9.50, "vegetarian": True}B)
13.99then{"price": 9.50, "vegetarian": True, "calories": 350}C)
13.99then{"price": 9.50, "vegetarian": True}D)
12.99then{"price": 9.50, "vegetarian": True, "calories": 350}
Answer:
B) 13.99 then {"price": 9.50, "vegetarian": True, "calories": 350}
The first line updates burger’s price from 12.99 to 13.99. The second line adds a new key "calories" to salad’s inner dictionary — you can add fields to an inner dict just like any other dictionary.
Trace 3#
1catalog = {
2 "INST126": {"instructor": "Joel", "enrolled": 38},
3 "INST326": {"instructor": "Pat", "enrolled": 22}
4}
5catalog["INST326"]["enrolled"] += 3
6result = catalog.get("INST201", {}).get("instructor", "TBD")
7print(catalog["INST326"]["enrolled"])
8print(result)
A)
22thenTBDB)
25thenTBDC)
25then KeyErrorD)
3then{}
Answer:
B) 25 then TBD
+= 3 adds 3 to the current enrolled count (22 → 25). For the .get() chain: "INST201" isn’t in the catalog, so the first .get() returns {} (the default). Then .get("instructor", "TBD") on that empty dictionary returns "TBD". No crash — that’s the advantage of safe chaining with .get().
Trace 4#
1shows = {
2 "Breaking Bad": {"genre": "drama", "seasons": 5},
3 "The Office": {"genre": "comedy", "seasons": 9}
4}
5shows["Friends"] = {"genre": "comedy", "seasons": 10}
6print(len(shows))
7print(shows["Friends"]["seasons"])
A)
2then KeyErrorB)
3then10C)
3then{"genre": "comedy", "seasons": 10}D)
2then10
Answer:
B) 3 then 10
Assigning to shows["Friends"] adds a new outer key with a full dictionary as its value. The catalog now has 3 entries. shows["Friends"]["seasons"] chains into the new entry and returns 10.
Trace 5#
1profiles = {
2 "terp1": {"name": "Joel", "followers": 100},
3 "terp2": {"name": "Sarah", "followers": 250}
4}
5profiles["terp1"]["followers"] += 50
6profiles["terp2"]["name"] = profiles["terp2"]["name"].upper()
7print(profiles["terp1"]["followers"])
8print(profiles["terp2"]["name"])
A)
100thenSarahB)
150thenSARAHC)
50thenSARAHD)
150thenSarah
Answer:
B) 150 then SARAH
+= 50 adds 50 to terp1’s follower count (100 → 150). For terp2’s name: profiles["terp2"]["name"] returns "Sarah", then .upper() returns "SARAH", and that result is assigned back. Notice the chaining: we read a nested value, call a string method on it, and assign the result back to the same nested location.
Trace 6#
1data = {}
2entries = ["Joel:A:INST126", "Sarah:B:INST201", "Joel:A+:INST326"]
3for entry in entries:
4 name, grade, course = entry.split(":")
5 data[name] = {"grade": grade, "course": course}
6print(data["Joel"]["grade"])
7print(len(data))
A)
Athen3B)
A+then2C)
Athen2D)
A+then3
Answer:
B) A+ then 2
Joel appears twice. The second entry ("Joel:A+:INST326") overwrites the first — keys are unique, so data["Joel"] ends up as {"grade": "A+", "course": "INST326"}. There are only 2 unique names (Joel and Sarah), so len(data) is 2.
Practice: Dictionary Scenarios#
For each scenario, start by creating the dictionary, then complete the retrieval and update operations.
Scenario 1: Gradebook (basic)#
You’re building a simple gradebook that maps student names to their current grade (a single letter grade).
Setup#
Create a dictionary called gradebook with the following entries:
Student |
Grade |
|---|---|
Joel |
B |
Sarah |
A |
Rony |
A+ |
Kacie |
B+ |
Miles |
C |
1# create the gradebook dictionary here
Retrieve#
R1. Print Joel’s grade.
1# your code here
R2. Check if “Pat” is in the gradebook. If so, print their grade; otherwise print "Pat is not in the gradebook".
1# your code here
R3. Use .get() to look up “Zara” in the gradebook. If she’s not there, print "Zara: no grade on file".
1# your code here
Update#
U1. Miles turned in extra credit — update his grade to "B-".
1# your code here
U2. A new student, Pat, joined the class with a grade of "B". Add them to the gradebook.
1# your code here
U3. Sarah and Rony both got final grade adjustments: Sarah is now "A-" and Rony is now "A". Update both in a single operation.
1# your code here
Scenario 4: Course directory (nested)#
You’re building a course directory where each course code maps to a dictionary of course details.
Setup#
Create a dictionary called courses with the following structure:
courses = {
"INST126": {"title": "Intro to Programming", "instructor": "Joel", "capacity": 40, "enrolled": 38},
"INST201": {"title": "Intro to Info Science", "instructor": "Sarah", "capacity": 35, "enrolled": 35},
"INST326": {"title": "OO Programming", "instructor": "Pat", "capacity": 30, "enrolled": 22}
}
1# create the courses dictionary here
Retrieve#
R1. Print the instructor for INST126.
1# your code here
R2. Print how many open seats INST126 has (capacity minus enrolled), formatted as: "INST126: {n} seats available".
1# your code here
R3. A student wants to know the title and instructor for INST326. Print: "INST326: OO Programming (taught by Pat)".
1# your code here
Update#
U1. 3 more students enrolled in INST326. Update the enrolled count.
1# your code here
U2. INST201 got a new instructor: “Rony”. Update it.
1# your code here
U3. Add a new course: "INST314" with title "Statistics", instructor "Kacie", capacity 25, and enrolled 0.
1# your code here
Scenario 5: Streaming catalog (nested)#
You’re modeling a simple streaming catalog where each show title maps to a dictionary of details.
Setup#
Create a dictionary called catalog with the following structure:
catalog = {
"Stranger Things": {"genre": "sci-fi", "seasons": 4, "rating": 8.7},
"The Office": {"genre": "comedy", "seasons": 9, "rating": 8.9},
"Breaking Bad": {"genre": "drama", "seasons": 5, "rating": 9.5},
"Ted Lasso": {"genre": "comedy", "seasons": 3, "rating": 8.8}
}
1# create the catalog dictionary here
Retrieve#
R1. Print the rating of Breaking Bad.
1# your code here
R2. Print how many seasons The Office has, formatted as: "The Office: 9 seasons".
1# your code here
R3. Check if “Wednesday” is in the catalog. Print "Wednesday is available" or "Wednesday is not in the catalog".
1# your code here
Update#
U1. Stranger Things released a new season. Update its season count to 5.
1# your code here
U2. After the new season, the rating for Stranger Things was updated to 8.9. Update it.
1# your code here
U3. Add a new show: "Wednesday" with genre "comedy", seasons 1, and rating 8.1.
1# your code here
Scenario 6: Website config (nested)#
You’re working with a website configuration dictionary that stores settings for different sections of the site.
Setup#
Create a dictionary called config with the following structure:
config = {
"homepage": {"title": "Welcome", "show_banner": True, "max_posts": 10},
"blog": {"title": "Our Blog", "show_banner": False, "max_posts": 25},
"about": {"title": "About Us", "show_banner": True, "max_posts": 0}
}
1# create the config dictionary here
Retrieve#
R1. Print the title of the blog page.
1# your code here
R2. Check whether the blog page has the banner enabled. Print "Blog banner: on" or "Blog banner: off" accordingly.
1# your code here
R3. Print the homepage title and max posts, formatted as: "Homepage: 'Welcome' (max 10 posts)".
1# your code here
Update#
U1. Enable the banner on the blog page.
1# your code here
U2. The homepage should now show a max of 15 posts instead of 10. Update it.
1# your code here
U3. Add a new page: "contact" with title "Contact Us", show_banner set to False, and max_posts set to 0.
1# your code here
Dictionary Application: Indexing#
Now that we have the dictionary data structure, we can apply it in a program that can create an index.
Example#
Here’s an example from above that takes a list of book chapter strings and a list of key concepts, and produces an index that maps key concepts to chapters.
1# how to know which chapters talk about strings? or debugging?
2book = [
3 "Chapter 1: talks about strings and how they have the property of immutability also some basic debugging",
4 "Chapter 2: continues talking about advanced methods for strings and also introduces the concept of functions",
5 "Chapter 3: discusses iteration and lists and also debugging",
6]
1# define what we want to index as keys
2concepts = ['strings', 'debugging', "lists", "iteration"]
3
4# make a dictionary to hold the index
5index = {}
6
7# go through every chapter
8for chapter in book:
9
10 # split into chapter and descr based on the colon
11 chapter, descr = chapter.split(":")
12
13 # index the concepts
14
15 # for every concept we care about
16 for keyconcept in concepts:
17 # check if it's in this chapter descr
18 if keyconcept in descr:
19 # if so, get the current list of chapters associated with this keyconcept
20 chs = index.get(keyconcept, [])
21 # then update the list of chapters
22 # if we haven't seen the chapter
23 if chapter not in chs:
24 chs.append(chapter)
25 # and update the index to map the keyconcept to the updated list of assoc. chapters
26 index.update({keyconcept: chs})
27index
{'strings': ['Chapter 1', 'Chapter 2'],
'debugging': ['Chapter 1', 'Chapter 3'],
'lists': ['Chapter 3'],
'iteration': ['Chapter 3']}
What dictionary concepts do you recognize?
The indexing pattern#
Here’s the generic structure of an indexing pattern
# create an empty dictionary to hold the index
# for every item in a list of things you want to index
# (optional: parse out the keys and values you want to index from the item)
# get the current value associated with the key in the index
# update the value
# update the index with the key and its updated value
Connection to patterns you already know#
This pattern might look new, but the core idea is something you’ve already practiced! In the Iteration chapter, we learned two key loop patterns:
Counting: initialize a counter to
0, loop through items, increment the counter when a condition is metAccumulating (into a list): initialize an empty list
[], loop through items,.append()items that meet some criteria
Dictionary indexing is just these same patterns — but instead of one counter or one list, you have a counter or list per key. The dictionary is what lets you keep track of multiple accumulators at once, one for each unique key.
Here’s the side-by-side:
Simple (from Iteration) |
Dictionary version (Indexing) |
|
|---|---|---|
Counting |
|
|
Accumulating |
|
|
The key difference is .get(key, default) — it handles the “first time we see this key” case by providing a starting value (0 for counting, [] for accumulating).
Let’s see both variations in action.
Variation 1: Counting index#
Let’s look at a super simple example: making a word count index.
We’ll take in a list of words, and produce an index that maps words as keys to counts of occurrences as values.
1# the thing we want to index
2word_list = ['she', 'sells', 'sea', 'shells', 'by', 'the', 'sea', 'shore', 'in', 'the', 'sea', 'and', 'the', 'shells', 'and', 'the', 'sea', 'sea', 'sea']
3
4# create an empty dictionary to hold the index
5word_counts = {}
6
7# for every item in a list of things you want to index
8for word in word_list:
9
10 # (optional: parse out the keys and values you want to index from the item)
11 # we don't need this step
12
13 # get the current value associated with the key in the index
14 # here, we use .get(), and have a default count of 0 if the key isn't yet in the index
15 count = word_counts.get(word, 0)
16
17 # update the value
18 count += 1
19
20 # update the index with the key and its updated value
21 word_counts.update({word: count})
22
23word_counts
{'she': 1,
'sells': 1,
'sea': 6,
'shells': 2,
'by': 1,
'the': 4,
'shore': 1,
'in': 1,
'and': 2}
Practice: index grade counts#
Practice! Test your understanding by adapting the above code to count how many times each grade was earned in a course.
1grades_counts = {}
1# your code here
Variation 2: Accumulating index#
The counting version uses d.get(key, 0) to start each key’s count at 0. But what if instead of counting, you want to collect items — like building a list of all chapters where a concept appears, or all emails from a particular sender?
This is the accumulating variation. Instead of 0 as the default, you use [] (an empty list), and instead of += 1, you .append() the item.
Practice: which words are n characters long?#
Let’s try the accumulating variation. We’ll index word lengths: we want an index that can tell us “what words in the list were 2 characters long, or 5 characters long?” The key is the word length, and the value is a list of words with that length.
1# the thing we want to index
2word_list = ['she', 'sells', 'sea', 'shells', 'by', 'the', 'sea', 'shore', 'in', 'the', 'sea', 'and', 'the', 'shells', 'and', 'the', 'sea', 'sea', 'sea']
3
4# your code here
Practice: counts of email addresses from email records (counting)#
Let’s practice the counting variation again. Index how many times we got emails from each email address!
1email_records = ['From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008',
2 'From louis@media.berkeley.edu Fri Jan 4 18:10:48 2008',
3 'From zqian@umich.edu Fri Jan 4 16:10:39 2008',
4 'From rjlowe@iupui.edu Fri Jan 4 15:46:24 2008',
5 'From zqian@umich.edu Fri Jan 4 15:03:18 2008',
6 'From rjlowe@iupui.edu Fri Jan 4 14:50:18 2008',
7 'From cwen@iupui.edu Fri Jan 4 11:37:30 2008',
8 'From cwen@iupui.edu Fri Jan 4 11:35:08 2008',
9 'From gsilver@umich.edu Fri Jan 4 11:12:37 2008',
10 'From gsilver@umich.edu Fri Jan 4 11:11:52 2008',
11 'From zqian@umich.edu Fri Jan 4 11:11:03 2008',
12 'From gsilver@umich.edu Fri Jan 4 11:10:22 2008',
13 'From wagnermr@iupui.edu Fri Jan 4 10:38:42 2008',
14 'From zqian@umich.edu Fri Jan 4 10:17:43 2008',
15 'From antranig@caret.cam.ac.uk Fri Jan 4 10:04:14 2008',
16 'From gopal.ramasammycook@gmail.com Fri Jan 4 09:05:31 2008',
17 'From david.horwitz@uct.ac.za Fri Jan 4 07:02:32 2008',
18 'From david.horwitz@uct.ac.za Fri Jan 4 06:08:27 2008',
19 'From david.horwitz@uct.ac.za Fri Jan 4 04:49:08 2008',
20 'From david.horwitz@uct.ac.za Fri Jan 4 04:33:44 2008',
21 'From stephen.marquard@uct.ac.za Fri Jan 4 04:07:34 2008',
22 'From louis@media.berkeley.edu Thu Jan 3 19:51:21 2008',
23 'From louis@media.berkeley.edu Thu Jan 3 17:18:23 2008',
24 'From ray@media.berkeley.edu Thu Jan 3 17:07:00 2008',
25 'From cwen@iupui.edu Thu Jan 3 16:34:40 2008',
26 'From cwen@iupui.edu Thu Jan 3 16:29:07 2008',
27 'From cwen@iupui.edu Thu Jan 3 16:23:48 2008']
28
29# your code here
Practice: map email addresses to email records (accumulating)#
Now modify the previous program to use the accumulating variation. Instead of counting, collect the full email records (as a list of strings) for each email address. We want to ask questions like “can I see the emails I got from cwen@iupui.edu?”
1email_records = ['From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008',
2 'From louis@media.berkeley.edu Fri Jan 4 18:10:48 2008',
3 'From zqian@umich.edu Fri Jan 4 16:10:39 2008',
4 'From rjlowe@iupui.edu Fri Jan 4 15:46:24 2008',
5 'From zqian@umich.edu Fri Jan 4 15:03:18 2008',
6 'From rjlowe@iupui.edu Fri Jan 4 14:50:18 2008',
7 'From cwen@iupui.edu Fri Jan 4 11:37:30 2008',
8 'From cwen@iupui.edu Fri Jan 4 11:35:08 2008',
9 'From gsilver@umich.edu Fri Jan 4 11:12:37 2008',
10 'From gsilver@umich.edu Fri Jan 4 11:11:52 2008',
11 'From zqian@umich.edu Fri Jan 4 11:11:03 2008',
12 'From gsilver@umich.edu Fri Jan 4 11:10:22 2008',
13 'From wagnermr@iupui.edu Fri Jan 4 10:38:42 2008',
14 'From zqian@umich.edu Fri Jan 4 10:17:43 2008',
15 'From antranig@caret.cam.ac.uk Fri Jan 4 10:04:14 2008',
16 'From gopal.ramasammycook@gmail.com Fri Jan 4 09:05:31 2008',
17 'From david.horwitz@uct.ac.za Fri Jan 4 07:02:32 2008',
18 'From david.horwitz@uct.ac.za Fri Jan 4 06:08:27 2008',
19 'From david.horwitz@uct.ac.za Fri Jan 4 04:49:08 2008',
20 'From david.horwitz@uct.ac.za Fri Jan 4 04:33:44 2008',
21 'From stephen.marquard@uct.ac.za Fri Jan 4 04:07:34 2008',
22 'From louis@media.berkeley.edu Thu Jan 3 19:51:21 2008',
23 'From louis@media.berkeley.edu Thu Jan 3 17:18:23 2008',
24 'From ray@media.berkeley.edu Thu Jan 3 17:07:00 2008',
25 'From cwen@iupui.edu Thu Jan 3 16:34:40 2008',
26 'From cwen@iupui.edu Thu Jan 3 16:29:07 2008',
27 'From cwen@iupui.edu Thu Jan 3 16:23:48 2008']
28
29# your code here
1email_occurrences.get("david.horwitz@uct.ac.za")
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[125], line 1
----> 1 email_occurrences.get("david.horwitz@uct.ac.za")
NameError: name 'email_occurrences' is not defined
Investigate: integrate the book chapter indexing program into our generic structure#
What’s the same? What needs to be modified?
Practice: flag important emails#
Let’s say you have a list of ‘starred contacts’, and you want a separate index where you can ask the question “give me all the emails I got from this starred contact”. You can imagine this being sort of a rudimentary back-end of your “important” tab in Gmail.
How might we modify our email indexing program to do this?
1email_records = ['From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008',
2 'From louis@media.berkeley.edu Fri Jan 4 18:10:48 2008',
3 'From zqian@umich.edu Fri Jan 4 16:10:39 2008',
4 'From rjlowe@iupui.edu Fri Jan 4 15:46:24 2008',
5 'From zqian@umich.edu Fri Jan 4 15:03:18 2008',
6 'From rjlowe@iupui.edu Fri Jan 4 14:50:18 2008',
7 'From cwen@iupui.edu Fri Jan 4 11:37:30 2008',
8 'From cwen@iupui.edu Fri Jan 4 11:35:08 2008',
9 'From gsilver@umich.edu Fri Jan 4 11:12:37 2008',
10 'From gsilver@umich.edu Fri Jan 4 11:11:52 2008',
11 'From zqian@umich.edu Fri Jan 4 11:11:03 2008',
12 'From gsilver@umich.edu Fri Jan 4 11:10:22 2008',
13 'From wagnermr@iupui.edu Fri Jan 4 10:38:42 2008',
14 'From zqian@umich.edu Fri Jan 4 10:17:43 2008',
15 'From antranig@caret.cam.ac.uk Fri Jan 4 10:04:14 2008',
16 'From gopal.ramasammycook@gmail.com Fri Jan 4 09:05:31 2008',
17 'From david.horwitz@uct.ac.za Fri Jan 4 07:02:32 2008',
18 'From david.horwitz@uct.ac.za Fri Jan 4 06:08:27 2008',
19 'From david.horwitz@uct.ac.za Fri Jan 4 04:49:08 2008',
20 'From david.horwitz@uct.ac.za Fri Jan 4 04:33:44 2008',
21 'From stephen.marquard@uct.ac.za Fri Jan 4 04:07:34 2008',
22 'From louis@media.berkeley.edu Thu Jan 3 19:51:21 2008',
23 'From louis@media.berkeley.edu Thu Jan 3 17:18:23 2008',
24 'From ray@media.berkeley.edu Thu Jan 3 17:07:00 2008',
25 'From cwen@iupui.edu Thu Jan 3 16:34:40 2008',
26 'From cwen@iupui.edu Thu Jan 3 16:29:07 2008',
27 'From cwen@iupui.edu Thu Jan 3 16:23:48 2008']
28
29starred = ["cwen@iupui.edu", "david.horwitz@uct.ac.za"]
30# your code here
Practice: Code Tracing with Dictionary Indexing#
Predict the output of each code snippet before running the cell!
Trace 1 (counting)#
1letters = ["a", "b", "a", "c", "b", "a"]
2counts = {}
3for letter in letters:
4 counts[letter] = counts.get(letter, 0) + 1
5print(counts)
6print(counts["a"])
A)
{'a': 1, 'b': 1, 'c': 1}then1B)
{'a': 3, 'b': 2, 'c': 1}then3C)
{'a': 3, 'b': 2, 'a': 3, 'c': 1, 'b': 2, 'a': 3}then3D)
{'a': 2, 'b': 1, 'c': 1}then2
Answer:
B) {'a': 3, 'b': 2, 'c': 1} then 3
Each time we see a letter, .get(letter, 0) returns the current count (or 0 if it’s new), and we add 1. “a” appears 3 times, “b” twice, “c” once. Keys are unique — there’s only one entry per letter.
Trace 2 (accumulating)#
1words = ["cat", "car", "dog", "cup"]
2groups = {}
3for word in words:
4 first = word[0]
5 items = groups.get(first, [])
6 items.append(word)
7 groups[first] = items
8print(groups)
A)
{'c': ['cup'], 'd': ['dog']}B)
{'c': ['cat', 'car', 'cup'], 'd': ['dog']}C)
{'cat': 'c', 'car': 'c', 'dog': 'd', 'cup': 'c'}D)
{'c': 3, 'd': 1}
Answer:
B) {'c': ['cat', 'car', 'cup'], 'd': ['dog']}
This is the accumulating pattern. The key is the first letter, and the value is a list of words starting with that letter. .get(first, []) returns the existing list or an empty one for new keys. Each word is appended to the appropriate list.
Trace 3 (counting with parsing)#
1sales = ["apples:3", "bananas:2", "apples:1", "bananas:5"]
2totals = {}
3for sale in sales:
4 item, qty = sale.split(":")
5 totals[item] = totals.get(item, 0) + int(qty)
6print(totals)
A)
{'apples': 1, 'bananas': 5}B)
{'apples': 4, 'bananas': 7}C)
{'apples': 3, 'bananas': 2}D)
{'apples:3': 1, 'bananas:2': 1, 'apples:1': 1, 'bananas:5': 1}
Answer:
B) {'apples': 4, 'bananas': 7}
Each entry is parsed with .split(":"). The quantity is converted to int and added to the running total for that item. Apples: 3 + 1 = 4. Bananas: 2 + 5 = 7. This is the counting pattern but accumulating a sum rather than just counting occurrences.
Trace 4 (trap — overwrite, not accumulate!)#
1scores = [("Joel", 85), ("Sarah", 92), ("Joel", 90), ("Sarah", 88)]
2record = {}
3for name, score in scores:
4 record[name] = score
5print(record)
A)
{'Joel': [85, 90], 'Sarah': [92, 88]}B)
{'Joel': 90, 'Sarah': 88}C)
{'Joel': 85, 'Sarah': 92}D)
{'Joel': 175, 'Sarah': 180}
Answer:
B) {'Joel': 90, 'Sarah': 88}
This is a trap! The code just assigns record[name] = score — it overwrites the previous value each time. Joel’s second score (90) replaces 85. To keep all scores, you’d need the accumulating pattern: record[name] = record.get(name, []) then record[name].append(score).
Trace 5 (counting with conditional)#
1grades = ["A", "B", "A", "C", "A", "B", "F", "B", "A"]
2passing = {}
3for grade in grades:
4 if grade != "F":
5 passing[grade] = passing.get(grade, 0) + 1
6print(passing)
7print(len(passing))
A)
{'A': 4, 'B': 3, 'C': 1, 'F': 1}then4B)
{'A': 4, 'B': 3, 'C': 1}then3C)
{'A': 4, 'B': 3, 'C': 1, 'F': 0}then4D)
{'A': 3, 'B': 2, 'C': 1}then3
Answer:
B) {'A': 4, 'B': 3, 'C': 1} then 3
The if grade != "F" filter means F is never added to the dictionary at all — not even with a count of 0. Only passing grades are counted. There are 3 unique passing grades, so len(passing) is 3.
Trace 6 (accumulating with parsing)#
1data = ["INST126:Joel", "INST201:Sarah", "INST126:Rony"]
2index = {}
3for entry in data:
4 course, student = entry.split(":")
5 students = index.get(course, [])
6 students.append(student)
7 index[course] = students
8print(index["INST126"])
9print(len(index))
A)
['Joel']then3B)
['Joel', 'Rony']then2C)
['Rony']then2D)
['Joel', 'Rony']then3
Answer:
B) ['Joel', 'Rony'] then 2
INST126 appears twice. The first time, .get("INST126", []) returns [], and “Joel” is appended. The second time, it returns ['Joel'], and “Rony” is appended. The list grows — it doesn’t overwrite. There are 2 unique courses, so len(index) is 2.
Practice: Dictionary Indexing Problems#
P1: Count character types (counting)#
Given a string, count how many letters, digits, and spaces it contains. Store the results in a dictionary with keys "letters", "digits", and "spaces".
1text = "Hello World 123"
2char_counts = {}
3
4# your code here
Answer:
text = "Hello World 123"
char_counts = {}
for char in text:
if char.isalpha():
char_counts["letters"] = char_counts.get("letters", 0) + 1
elif char.isnumeric():
char_counts["digits"] = char_counts.get("digits", 0) + 1
elif char == " ":
char_counts["spaces"] = char_counts.get("spaces", 0) + 1
char_counts # {'letters': 10, 'digits': 3, 'spaces': 2}
P2: Group names by first letter (accumulating)#
Given a list of names, build a dictionary that maps each first letter to a list of names starting with that letter.
1names = ["Joel", "Sarah", "John", "Kacie", "Sam", "Jill", "Kelly"]
2groups = {}
3
4# your code here
Answer:
names = ["Joel", "Sarah", "John", "Kacie", "Sam", "Jill", "Kelly"]
groups = {}
for name in names:
first = name[0]
groups[first] = groups.get(first, [])
groups[first].append(name)
groups # {'J': ['Joel', 'John', 'Jill'], 'S': ['Sarah', 'Sam'], 'K': ['Kacie', 'Kelly']}
P3: Word frequency from a sentence (counting)#
Given a sentence, count how many times each word appears. Normalize to lowercase first so “The” and “the” are counted together.
1sentence = "the cat sat on the mat and the cat saw the dog"
2word_freq = {}
3
4# your code here
Answer:
sentence = "the cat sat on the mat and the cat saw the dog"
word_freq = {}
for word in sentence.lower().split():
word_freq[word] = word_freq.get(word, 0) + 1
word_freq # {'the': 4, 'cat': 2, 'sat': 1, 'on': 1, 'mat': 1, 'and': 1, 'saw': 1, 'dog': 1}
P4: Categorize scores (accumulating)#
Given a list of scores, build a dictionary that groups them into "high" (90+), "medium" (70-89), and "low" (below 70).
1scores = [95, 67, 82, 91, 55, 73, 88, 100, 42, 78]
2categories = {}
3
4# your code here
Answer:
scores = [95, 67, 82, 91, 55, 73, 88, 100, 42, 78]
categories = {}
for score in scores:
if score >= 90:
label = "high"
elif score >= 70:
label = "medium"
else:
label = "low"
categories[label] = categories.get(label, [])
categories[label].append(score)
categories # {'high': [95, 91, 100], 'medium': [82, 73, 88, 78], 'low': [67, 55, 42]}
P5: Count email domains (counting)#
Given a list of email addresses, count how many emails come from each domain (the part after @).
1emails = ["joel@umd.edu", "sarah@gmail.com", "rony@umd.edu", "pat@gmail.com", "kacie@umd.edu", "miles@yahoo.com"]
2domain_counts = {}
3
4# your code here
Answer:
emails = ["joel@umd.edu", "sarah@gmail.com", "rony@umd.edu", "pat@gmail.com", "kacie@umd.edu", "miles@yahoo.com"]
domain_counts = {}
for email in emails:
domain = email.split("@")[1]
domain_counts[domain] = domain_counts.get(domain, 0) + 1
domain_counts # {'umd.edu': 3, 'gmail.com': 2, 'yahoo.com': 1}
P6: Index course enrollments (accumulating)#
Given a list of enrollment records in the format "StudentName:CourseCode", build a dictionary that maps each course to a list of enrolled students.
1enrollments = [
2 "Joel:INST126", "Sarah:INST201", "Rony:INST126",
3 "Kacie:INST201", "Pat:INST126", "Miles:INST326",
4 "Sarah:INST126", "Joel:INST201"
5]
6course_roster = {}
7
8# your code here
Answer:
enrollments = [
"Joel:INST126", "Sarah:INST201", "Rony:INST126",
"Kacie:INST201", "Pat:INST126", "Miles:INST326",
"Sarah:INST126", "Joel:INST201"
]
course_roster = {}
for entry in enrollments:
student, course = entry.split(":")
course_roster[course] = course_roster.get(course, [])
course_roster[course].append(student)
course_roster
# {'INST126': ['Joel', 'Rony', 'Pat', 'Sarah'],
# 'INST201': ['Sarah', 'Kacie', 'Joel'],
# 'INST326': ['Miles']}