8: Dictionaries

Contents

8: Dictionaries#

What are dictionaries and why should we care about them?#

Dictionaries are for associating data and quick lookup

Motivating example: I am making an index for a book, because I want to know which concepts show up on which pages, to make it easier to jump back to the right spots.

1# how to know which chapters talk about strings? or debugging?
2book = [
3  "Chapter 1: talks about strings and how they have the property of immutability also some basic debugging",
4  "Chapter 2: continues talking about advanced methods for strings and also introduces the concept of functions",
5  "Chapter 3: discusses iteration and lists and also debugging",
6]

Here’s what it might look like without dictionaries.

 1concepts = ['strings', 'debugging', 'immutability']
 2index = []
 3
 4for chapter in book:
 5    # split into elements based on the colon
 6    elements = chapter.split(":")
 7    # first element is the chapter
 8    chapter = elements[0]
 9    # second element is the text
10    text = elements[1]
11
12    # parse the text
13    words = text.split()
14    for keyconcept in concepts:
15        if keyconcept in words:
16            index.append([keyconcept, chapter])
17index
[['strings', 'Chapter 1'],
 ['debugging', 'Chapter 1'],
 ['immutability', 'Chapter 1'],
 ['strings', 'Chapter 2'],
 ['debugging', 'Chapter 3']]
 1# given this data structure, how to find
 2# all the chapters that have strings in them?
 3query = "strings"
 4results = []
 5# go through every item in the index
 6for item in index:
 7    concept, chapter = item
 8    # if the concept matches the query
 9    if query == concept:
10        results.append(chapter)
11
12print(results)
['Chapter 1', 'Chapter 2']

And here is what it might look like with dictionaries.

1# now consider if the concept index is in a dictionary (much more like what you might see in a book!)
2index_d = {
3    'strings': ['Chapter 1', 'Chapter 2'],
4    'debugging': ['Chapter 1', 'Chapter 3']
5}
1# given this data structure, how to find
2# all the chapters that have strings in them?
3query = "strings"
4index_d.get("strings")
['Chapter 1', 'Chapter 2']

Another common use case: attributes of data entries. For instance, attributes of a class, like credit hours, pre-reqs, instructor, location, hours, and so on.

 1# without dictionaries
 2courses = [
 3    ["INST126", 3, "no", "Chan", "hybrid", "MWF"],
 4    ["INST256", 4, "yes", "Kanishka", "in-person", "TR"]
 5]
 6
 7# look up INST126 and check whether it has prereqs
 8for course in courses:
 9    if course[0] == "INST126":
10        print(course[2])
no

Rather than trying to remember which position we happened to have decided to use to store a particular attribute (if we used lists), we can use semantically meaningful indices for values, i.e., keys!

 1# with dictionaries
 2courses = {
 3    "INST126": {
 4      "credit hours": 3, "prereqs": "no", "instructor": "Chan", "location": "hybrid", "hours": "MWF"
 5      },
 6    "INST256": {
 7      "credit hours": 4, "prereqs": "yes", "instructor": "Rony", "location": "in-person", "hours": "TR"
 8      },
 9}
10
11# look up INST126 and check whether it has prereqs
12courses.get("INST256").get("instructor")
'Rony'

It’s a lot easier to remember keys (if we name them useful things) compared to just indices. And Python can help us remember too!

If you’re interested, there are also formal technical reasons to prefer dictionaries over lists if you care about speed/efficiency and your computational task is checking if an item exists in a collection and you’re dealing with very large scale data. The short version: looking up a key in a dictionary is O(1) (constant time: just 1 operation, no matter how big the dictionary is), while checking if an item is in a list is O(n) (grows with the size of the list: even though computers are pretty fast, when you get to quite large database sizes, you’ll notice a difference!). See https://www.geeksforgeeks.org/python/difference-between-list-and-dictionary-in-python/ for a beginner-friendly explanation, or https://wiki.python.org/moin/TimeComplexity for the official reference.

In the future, you may learn the pandas library (and the dataframe data structure, which is sort of a hybrid of lists and dictionaries): you can do really fast lookup, but also sort stuff!

Anatomy of a dictionary#

Dictionaries are not so different from… our dictionaries in real life. :) Basically map a bunch of keys (e.g., a word) to corresponding values (e.g., a definition). Another example is indices in the back of print books that map key terms to pages where that term shows up, or tags on websites, that map tags to webpages that include those tags.

The key parts of a dictionary literal are:

  1. The { } curly braces, which tell you and Python that it’s a dictionary (similar to "" for strings, or [] for lists)

  2. At least one entry (but usually several), that maps a value on the right of a : — which functions like the = expression — to a key on the left. For example, our first entry maps the value “apple” to the key “a”.

  3. Similar to lists, we include , to separate multiple entries in the dictionary.

Let’s look at a simple example that maps letters to an example word that starts with the letter

1d = {
2   'a': 'apple', # an entry that maps the value apple to the letter a
3   'b': 'ball', # another entry that maps the value ball to the letter b
4   'c': 'crayon'
5}
6d
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}

Style note: the indentation inside the dictionary is more for readability. Python doesn’t care, as long as there are commas between the dictionary entries.

You can also write it out like this, but I find it harder to read:

1d = {'a': 'apple', 'b': 'ball', 'c': 'crayon'}
2d
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}

Here are some other examples.

 1# map letters to numbers
 2another = {
 3    'a': 1,
 4    'b': 2,
 5    'c': 3
 6}
 7
 8# map letter grades to list of scores
 9grades = {
10    'A': [93, 100],
11    'B': [87, 93]
12}

Properties of a dictionary#

Here are some key properties of a dictionary in Python:

Dictionaries have length#

Similar to lists, dictionaries have length.

1d = {
2   'a': 'apple', # an entry that maps the value apple to the letter a
3   'b': 'ball', # another entry that maps the value ball to the letter b
4   'c': 'crayon'
5}
6len(d)
3

Dictionaries do not have an order or indexing by position#

Different from lists, dictionaries do not have an order. So you can’t really sort a dictionary, or grab things by position. You grab things by… key!

1# can't get things by position
2d[0]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[12], line 2
      1 # can't get things by position
----> 2 d[0]

KeyError: 0
1# can't sort either
2d.sort()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[13], line 2
      1 # can't sort either
----> 2 d.sort()

AttributeError: 'dict' object has no attribute 'sort'
1sorted(d) # this does *not* sort the dictionary; it returns a sorted list of the keys
['a', 'b', 'c']

Though as we will see next week, you can sneakily get some order, because Python now allows you to actually iterate through a dictionary in the order that stuff was put into it. It’s not reliable though! If you need your collection to have an actual sequence to it, don’t use dictionaries!

All keys in a dictionary are unique#

Also, all keys in a dictionary have to be unique. This makes dictionaries handy for keeping track of unique items (in contrast to say, lists, where you can have duplicate entries).

Values in the dictionary do not have to be unique, though: you can have different keys point to the same value, but not multiple values point to duplicate keys. There is a related data structure that has a similar property called sets if you’re interested.

1d = {
2   'a': 'apple', # an entry that maps the value apple to the letter a
3   'b': 'ball', # another entry that maps the value ball to the letter b
4   'c': 'crayon',
5   'a': 'animal',
6   'd': 'ball'
7}
8print(d) # oops, where did apple go? overwritten by the last a:animal entry since we can't have duplicate keys. d:ball is fine though, even though we already have b:ball.
{'a': 'animal', 'b': 'ball', 'c': 'crayon', 'd': 'ball'}

Dictionaries are mutable#

Like lists, dictionaries are also mutable: you can modify them directly (in contrast to strings, where you never modify them directly, but only ever create a new modified version of the string).

1print(f"d before modification: {d}")
2print("modifying d, by adding an entry mapping the key f to the value friend")
3d.update({"f": "friend"})
4print(f"d after modification: {d}")
d before modification: {'a': 'animal', 'b': 'ball', 'c': 'crayon', 'd': 'ball'}
modifying d, by adding an entry mapping the key f to the value friend
d after modification: {'a': 'animal', 'b': 'ball', 'c': 'crayon', 'd': 'ball', 'f': 'friend'}

What kinds of data can we put in a dictionary?#

Anything goes for values#

Basically anything goes for values. You can even nest a dictionary inside another dictionary, by mapping a dictionary value to some key.

 1def hello():
 2    print("hello world!")
 3
 4d = {
 5   'a': ['apple', "animal"], # an entry that maps a list to the letter a
 6   'b': 'ball', # another entry that maps the value ball to the letter b
 7   'c': 2,
 8   'd': [1, 3, "denizen"],
 9   'e': hello # even a function is fine!
10}
11d
{'a': ['apple', 'animal'],
 'b': 'ball',
 'c': 2,
 'd': [1, 3, 'denizen'],
 'e': <function __main__.hello()>}
1students = {
2    'joel': {
3        'major': 'info sci',
4        'year': 'senior',
5        'interests': ['programming', 'football', 'dancing']
6    },
7}
8students
{'joel': {'major': 'info sci',
  'year': 'senior',
  'interests': ['programming', 'football', 'dancing']}}

Keys must be hashable#

But keys need to be hashable.

What does this mean?

From the Python glossary:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a hash() method), and can be compared to other objects (it needs an eq() or cmp() method). Hashable objects which compare equal must have the same hash value. Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal, and their hash value is their id().

More info here: https://stackoverflow.com/questions/14535730/what-does-hashable-mean-in-python

1d = {['3']: 'apple'}
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[19], line 1
----> 1 d = {['3']: 'apple'}

TypeError: unhashable type: 'list'

I mention this because a common error when first working with dictionaries is to try to use an unhashable data structure as a key.

Understanding hashes is not something to worry about for now. We can focus on a basic rule of thumb for now: strings and numbers are ok as keys; everything else (that you’ll learn now) is not.

Working with dictionaries: basics#

Create a dictionary#

1d = {
2   'a': 'apple',
3   'b': 'ball',
4   'c': 'crayon',
5}

You can also start with an empty dictionary, and then add stuff later, programmatically or with other functions.

1emptyd = {} # dictionary with nothing in it
2emptyd = dict() # same thing
3print(emptyd)
4len(emptyd)
{}
0

PRACTICE: make a dictionary e that has the following entries: smith:A, chan:B, abuye:A+

1# your code here

Get the value associated with a key from a dictionary#

1d = {
2   'a': 'apple',
3   'b': 'ball',
4   'c': 'crayon'
5}
6d['c'] # put a key inside square brackets associated with a dictionary
'crayon'

This is called the “old style” or “indexing” pattern. Looks a little bit like lists.

1d = {
2   'a': 'apple',
3   'b': 'ball',
4   'c': 'crayon'
5}
6d.get('c') # use the get function to get the value for the key that we give it
'crayon'

This is the newer pattern that I prefer for clarity.

It also has the advantage of not breaking your program if you try to access a key that doesn’t exist.

1d = {
2   'a': 'apple',
3   'b': 'ball',
4   'c': 'crayon'
5}
6d['d'] # will crash the program with key error
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[25], line 6
      2    'a': 'apple',
      3    'b': 'ball',
      4    'c': 'crayon'
      5 }
----> 6 d['d'] # will crash the program with key error

KeyError: 'd'

.get() lets you specify a default value that should come back if the key doesn’t exist. This is very useful for writing clean and understandable dictionary patterns, such as indexing, which we’ll dig into next week.

1d = {
2   'a': 'apple',
3   'b': 'ball',
4   'c': 'crayon'
5}
6result = d.get('d', "Key not found") # will return "Key not found" as the default
7print(result)
Key not found

PRACTICE: how do you get the value associated with the key a?

1# your code here

Adding entries to a dictionary (or updating entries)#

Classic style, using indexing and assignment

 1d = {
 2   'a': 'apple',
 3   'b': 'ball',
 4   'c': 'crayon'
 5}
 6print(d)
 7# add a new entry for e
 8d['e'] = 'egg' # map the value egg to the key e
 9print(d)
10# update the entry for b
11d['b'] = 'bread' # map the value bread to the key b (which happens to already exist, so we update it)
12print(d)
13# update the entries for a and c
14d['a'] = 'ashes'
15d['c'] = 'charming'
16print(d)
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}
{'a': 'apple', 'b': 'ball', 'c': 'crayon', 'e': 'egg'}
{'a': 'apple', 'b': 'bread', 'c': 'crayon', 'e': 'egg'}
{'a': 'ashes', 'b': 'bread', 'c': 'charming', 'e': 'egg'}

Newer style, using .update()

 1d = {
 2   'a': 'apple',
 3   'b': 'ball',
 4   'c': 'crayon'
 5}
 6print(d)
 7# add a new entry for e
 8d.update({'e': 'egg'}) # map the value egg to the key e
 9print(d)
10# update the entry for b
11d.update({'b': 'bread'}) # map the value bread to the key b (which happens to already exist, so we update it)
12print(d)
13# update the entries for a and c
14new_for_a = "ashes"
15d.update({'a': new_for_a, 'c': 'charming'})
16print(d)
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}
{'a': 'apple', 'b': 'ball', 'c': 'crayon', 'e': 'egg'}
{'a': 'apple', 'b': 'bread', 'c': 'crayon', 'e': 'egg'}
{'a': 'ashes', 'b': 'bread', 'c': 'charming', 'e': 'egg'}

PRACTICE: how do you update d with new entry h:hello?

1# your code here

PRACTICE: how do you update d so that a maps to the value asteroid?

1# your code here

.update() has the advantage of being able to add multiple key-value pairs at once.

1d = {
2   'a': 'apple',
3   'b': 'ball',
4   'c': 'crayon'
5}
6print(d)
7d.update({'a': 'ashes', 'b':'bread', 'c':'charming', 'e': 'egg'})
8print(d)
{'a': 'apple', 'b': 'ball', 'c': 'crayon'}
{'a': 'ashes', 'b': 'bread', 'c': 'charming', 'e': 'egg'}

PRACTICE: how do you update d with new entries f:friend and g:grapes in a single operation?

1# your code here

PRACTICE: how do you update d so that f now points to funny and b points to bundle, in a single operation?

1# your code here

You can use the pattern that is comfortable for you, but I prefer .get() and .update() for now because it’s more readable and robust.

List keys and values#

.keys() gives you all the keys in the dictionary

.values() gives you all the values in the dictionary

.items() gives you all the entries in the dictionary

Each of these is iterable.

1d
{'a': 'ashes', 'b': 'bread', 'c': 'charming', 'e': 'egg'}
1# list all the keys in the dictionary
2d.keys()
dict_keys(['a', 'b', 'c', 'e'])
1# list all the values in the dictionary
2d.values()
dict_values(['ashes', 'bread', 'charming', 'egg'])
1# list all the key-value pairs in the dictionary
2d.items()
dict_items([('a', 'ashes'), ('b', 'bread'), ('c', 'charming'), ('e', 'egg')])
1# this means you can iterate through the keys/values
2for key in d.keys():
3    print(key, d.get(key))
a ashes
b bread
c charming
e egg
1# iterate through the items
2for key, value in d.items():
3    print(f'{key} is associated with the value {value}')
a is associated with the value ashes
b is associated with the value bread
c is associated with the value charming
e is associated with the value egg

Check if a key is in a dictionary#

1d = {
2   'a': 'apple',
3   'b': 'ball',
4   'c': 'crayon'
5}
6print('h' in d)
False

Can only use in operator with keys.

You can also do this with .get()!

1# retrieve value for h, if it's not found, return False
2d.get("h", False)
False

PRACTICE: how do you check if the key “c” is in the dictionary d?

1# your code here

Reverse look up keys from values: YOU CAN’T! Not really…#

Dictionaries are very powerful transformations of your data that make it REALLY easy to do a specific kind of operation, but lock you out of doing other things. So design the structure of the dictionary carefully. For example, if you make an index, and find that you actually care a lot about grabbing the top N words, you probably want to map counts (as keys) to words (as values), not words to counts.

 1s = "she sells sea shells by the sea shore in the sea and the shells and the sea sea sea"
 2
 3d = {} # define a dictionary to hold the index
 4# go through word by word
 5for word in s.split():
 6  # get the current count for the word, default to 0 if we haven't seen it
 7  current_count = d.get(word, 0)
 8  # update the count
 9  new_count = current_count + 1
10  # update the dictionary with the word and count
11  d.update({word: new_count})
12
13d
{'she': 1,
 'sells': 1,
 'sea': 6,
 'shells': 2,
 'by': 1,
 'the': 4,
 'shore': 1,
 'in': 1,
 'and': 2}

Could try to invert it, but….

1def reverse_dictionary(d):
2    return {v: k for k, v in d.items()}

You actually lose information, because remember: keys are unique! No duplicates! So we lose one of our “2” entries.

1d_invert = reverse_dictionary(d)
2d_invert
{1: 'in', 6: 'sea', 2: 'and', 4: 'the'}
1d_invert = {}
2for word, count in d.items():
3    words = d_invert.get(count, []) # get the current list of words associated with this count
4    words.append(word)
5    d_invert.update({count: words})
6d_invert
{1: ['she', 'sells', 'by', 'shore', 'in'],
 6: ['sea'],
 2: ['shells', 'and'],
 4: ['the']}

Practice: Code Tracing with Dictionaries#

Predict the output of each code snippet before running the cell!

Trace 1#

1d = {"x": 10, "y": 20, "z": 30}
2print(d["y"])
  • A) 20

  • B) "y"

  • C) 10

  • D) 1

Answer:

A) 20

d["y"] retrieves the value associated with the key "y", which is 20.

Trace 2#

1d = {"a": 1, "b": 2}
2d["c"] = 3
3d["a"] = 99
4print(d)
5print(len(d))
  • A) {'a': 1, 'b': 2, 'c': 3} then 3

  • B) {'a': 99, 'b': 2, 'c': 3} then 3

  • C) {'a': 99, 'b': 2} then 2

  • D) {'a': 1, 'a': 99, 'b': 2, 'c': 3} then 4

Answer:

B) {'a': 99, 'b': 2, 'c': 3} then 3

d["c"] = 3 adds a new entry. d["a"] = 99 updates the existing key "a" (keys are unique — you can’t have two "a" entries). So the dict has 3 entries.

Trace 3#

1d = {"name": "Joel", "age": 30}
2result = d.get("email", "not found")
3print(result)
4print(len(d))
  • A) None then 2

  • B) "not found" then 2

  • C) "not found" then 3

  • D) KeyError

Answer:

B) "not found" then 2

"email" is not a key in d, so .get() returns the default value "not found". Importantly, .get() does not add the key to the dictionary — d still has only 2 entries.

Trace 4#

1d = {"a": [1, 2], "b": [3]}
2d["a"].append(3)
3d["c"] = []
4d["c"].append(7)
5print(d)
  • A) {'a': [1, 2], 'b': [3], 'c': [7]}

  • B) {'a': [1, 2, 3], 'b': [3], 'c': [7]}

  • C) {'a': [3], 'b': [3], 'c': [7]}

  • D) {'a': [1, 2, 3], 'b': [3], 'c': []}

Answer:

B) {'a': [1, 2, 3], 'b': [3], 'c': [7]}

Dictionary values can be lists (or any other type). d["a"].append(3) gets the list [1, 2] and appends 3 to it — lists are mutable, so this modifies the list in place. d["c"] = [] creates a new key with an empty list, then .append(7) adds to it.

Trace 5#

1d = {"x": 10, "y": 20}
2d.update({"y": 50, "z": 30})
3print(d)
4print("z" in d)
5print(30 in d)
  • A) {'x': 10, 'y': 50, 'z': 30} then True then True

  • B) {'x': 10, 'y': 50, 'z': 30} then True then False

  • C) {'x': 10, 'y': 20, 'z': 30} then True then False

  • D) {'x': 10, 'y': 20, 'y': 50, 'z': 30} then True then True

Answer:

B) {'x': 10, 'y': 50, 'z': 30} then True then False

.update() adds "z": 30 and updates "y" from 20 to 50. "z" in d checks if "z" is a keyTrue. 30 in d checks if 30 is a keyFalse (30 is a value, not a key). The in operator only checks keys!

Working with nested dictionaries#

So far, most of our dictionary values have been simple: a string, a number, a boolean. But one of the most powerful things about dictionaries is that values can be dictionaries too. This lets us model structured data — things with multiple attributes — in a very natural way.

Why nest?#

Think about a student record. You could try to cram everything into a flat dictionary:

1# flat: one key per piece of info
2student_joel_name = "Joel"
3student_joel_major = "Info Sci"
4student_joel_year = "Senior"
5# this gets messy fast with multiple students...

Or you could use a nested dictionary where each student maps to a dictionary of their attributes:

 1students = {
 2    "joel": {
 3        "major": "Info Sci",
 4        "year": "Senior",
 5        "gpa": 3.5
 6    },
 7    "sarah": {
 8        "major": "Computer Science",
 9        "year": "Junior",
10        "gpa": 3.8
11    }
12}

This is much cleaner. Each student has a “record” of attributes, and we can look up any attribute for any student.

Reading nested values: chain the lookups#

To get a value from a nested dictionary, you chain two lookups — first the outer key, then the inner key.

1# get Joel's major
2print(students["joel"]["major"])
Info Sci

Chaining: a familiar idea#

This idea of chaining should feel familiar from strings! Remember how we could chain string methods like .strip().lower().replace("#", "")? That worked because each method returned a string, so you could immediately call another method on it.

The same principle is at work here: each operation returns something, and you can immediately do another operation on that result.

  • "  Hello ".strip() → returns a string → .lower() works on it

  • email.split("@")[0].split() returns a list → [0] indexes into it

  • students["joel"]["major"]students["joel"] returns a dictionary → ["major"] looks up a key in it

It’s the same pattern every time: the result of one operation becomes the input to the next.

Breaking it into steps#

If chaining feels confusing, you can always break it into separate steps — it does the exact same thing:

1# step 1: get Joel's record (this gives us a dictionary)
2joel_record = students["joel"]
3print(type(joel_record))
4print(joel_record)
5
6# step 2: get the major from that dictionary
7print(joel_record["major"])
<class 'dict'>
{'major': 'Info Sci', 'year': 'Senior', 'gpa': 3.5}
Info Sci

The one-liner students["joel"]["major"] just combines both steps. Think of it as reading left to right: “from students, get "joel", then from that, get "major".”

Safe chaining with .get()#

You can also chain .get() calls for safer access:

1# safe access with .get()
2print(students.get("joel", {}).get("major", "unknown"))
3
4# if the outer key doesn't exist, .get() returns {},
5# and then .get("major") on {} returns "unknown"
6print(students.get("pat", {}).get("major", "unknown"))
Info Sci
unknown

This works because .get("pat", {}) returns an empty dictionary {} when "pat" isn’t found, and then .get("major", "unknown") on that empty dictionary returns "unknown". Each step produces something the next step can work with.

Updating nested values#

To update a value inside a nested dictionary, chain the lookups on the left side of the assignment:

1# Joel's GPA went up
2students["joel"]["gpa"] = 3.6
3print(students["joel"])
{'major': 'Info Sci', 'year': 'Senior', 'gpa': 3.6}
1# Sarah changed her major
2students["sarah"]["major"] = "Data Science"
3print(students["sarah"])
{'major': 'Data Science', 'year': 'Junior', 'gpa': 3.8}

Adding a new entry to the outer dictionary#

To add a whole new record, assign a new dictionary to a new outer key:

1# add a new student
2students["rony"] = {
3    "major": "Info Sci",
4    "year": "Senior",
5    "gpa": 3.9
6}
7print(students["rony"])
{'major': 'Info Sci', 'year': 'Senior', 'gpa': 3.9}

Adding a new field to an existing inner dictionary#

You can also add new keys to an inner dictionary — it’s just a regular dictionary, after all:

1# add an email field to Joel's record
2students["joel"]["email"] = "joel@umd.edu"
3print(students["joel"])
{'major': 'Info Sci', 'year': 'Senior', 'gpa': 3.6, 'email': 'joel@umd.edu'}

A real-world example: course catalog#

Let’s put this together with a more realistic example. Here’s a course catalog where each course has multiple attributes:

 1catalog = {
 2    "INST126": {
 3        "title": "Intro to Programming",
 4        "instructor": "Joel",
 5        "credits": 3,
 6        "prereqs": []
 7    },
 8    "INST326": {
 9        "title": "OO Programming",
10        "instructor": "Pat",
11        "credits": 3,
12        "prereqs": ["INST126"]
13    },
14    "INST414": {
15        "title": "Data Science",
16        "instructor": "Sarah",
17        "credits": 3,
18        "prereqs": ["INST126", "INST201"]
19    }
20}

Now we can answer questions by chaining lookups:

1# who teaches INST326?
2print(catalog["INST326"]["instructor"])
Pat
1# what are the prereqs for INST414?
2print(catalog["INST414"]["prereqs"])
['INST126', 'INST201']
1# does INST126 have any prereqs?
2prereqs = catalog["INST126"]["prereqs"]
3if len(prereqs) == 0:
4    print("INST126 has no prerequisites")
5else:
6    print(f"INST126 requires: {prereqs}")
INST126 has no prerequisites

And we can update it:

1# Pat is no longer teaching INST326, Rony is
2catalog["INST326"]["instructor"] = "Rony"
3
4# INST326 added a new prereq
5catalog["INST326"]["prereqs"].append("INST201")
6
7print(catalog["INST326"])
{'title': 'OO Programming', 'instructor': 'Rony', 'credits': 3, 'prereqs': ['INST126', 'INST201']}

Notice that last line: catalog["INST326"]["prereqs"] gives us a list, and since lists are mutable, we can .append() to it directly. This is the power of nesting — the inner values follow all the rules of their own type.

Common pattern: building nested dictionaries from data#

Often you’ll start with raw data (like a list of strings) and need to build up a nested dictionary. The pattern is the same as building a flat dictionary, but the value you create or update is itself a dictionary (or a list inside a dictionary).

 1# raw roster data: "name,section,score"
 2roster_data = [
 3    "Joel,A,81",
 4    "Sarah,B,95",
 5    "Rony,A,98",
 6    "Kacie,C,88"
 7]
 8
 9roster = {}
10for entry in roster_data:
11    name, section, score = entry.split(",")
12    roster[name] = {
13        "section": section,
14        "score": int(score)
15    }
16
17print(roster)
{'Joel': {'section': 'A', 'score': 81}, 'Sarah': {'section': 'B', 'score': 95}, 'Rony': {'section': 'A', 'score': 98}, 'Kacie': {'section': 'C', 'score': 88}}
1# now we can look up any student's info
2print(f"Rony is in section {roster['Rony']['section']} with score {roster['Rony']['score']}")
Rony is in section A with score 98

Summary#

Working with nested dictionaries follows the same rules as flat dictionaries — you just chain the operations:

Operation

Flat

Nested

Read

d["key"]

d["outer"]["inner"]

Update

d["key"] = val

d["outer"]["inner"] = val

Add outer entry

d["new_key"] = val

d["new_key"] = {"inner": val}

Add inner field

n/a

d["outer"]["new_field"] = val

Safe read

d.get("key", default)

d.get("outer", {}).get("inner", default)

Practice: Code Tracing with Nested Dictionaries#

Predict the output of each code snippet before running the cell!

Trace 1#

1users = {
2    "alice": {"role": "admin", "active": True},
3    "bob": {"role": "editor", "active": False}
4}
5print(users["alice"]["role"])
  • A) alice

  • B) admin

  • C) {"role": "admin", "active": True}

  • D) True

Answer:

B) admin

users["alice"] returns the inner dictionary {"role": "admin", "active": True}. Then ["role"] looks up the key "role" in that dictionary, which gives "admin".

Trace 2#

1menu = {
2    "burger": {"price": 12.99, "vegetarian": False},
3    "salad": {"price": 9.50, "vegetarian": True}
4}
5menu["burger"]["price"] = 13.99
6menu["salad"]["calories"] = 350
7print(menu["burger"]["price"])
8print(menu["salad"])
  • A) 12.99 then {"price": 9.50, "vegetarian": True}

  • B) 13.99 then {"price": 9.50, "vegetarian": True, "calories": 350}

  • C) 13.99 then {"price": 9.50, "vegetarian": True}

  • D) 12.99 then {"price": 9.50, "vegetarian": True, "calories": 350}

Answer:

B) 13.99 then {"price": 9.50, "vegetarian": True, "calories": 350}

The first line updates burger’s price from 12.99 to 13.99. The second line adds a new key "calories" to salad’s inner dictionary — you can add fields to an inner dict just like any other dictionary.

Trace 3#

1catalog = {
2    "INST126": {"instructor": "Joel", "enrolled": 38},
3    "INST326": {"instructor": "Pat", "enrolled": 22}
4}
5catalog["INST326"]["enrolled"] += 3
6result = catalog.get("INST201", {}).get("instructor", "TBD")
7print(catalog["INST326"]["enrolled"])
8print(result)
  • A) 22 then TBD

  • B) 25 then TBD

  • C) 25 then KeyError

  • D) 3 then {}

Answer:

B) 25 then TBD

+= 3 adds 3 to the current enrolled count (22 → 25). For the .get() chain: "INST201" isn’t in the catalog, so the first .get() returns {} (the default). Then .get("instructor", "TBD") on that empty dictionary returns "TBD". No crash — that’s the advantage of safe chaining with .get().

Trace 4#

1shows = {
2    "Breaking Bad": {"genre": "drama", "seasons": 5},
3    "The Office": {"genre": "comedy", "seasons": 9}
4}
5shows["Friends"] = {"genre": "comedy", "seasons": 10}
6print(len(shows))
7print(shows["Friends"]["seasons"])
  • A) 2 then KeyError

  • B) 3 then 10

  • C) 3 then {"genre": "comedy", "seasons": 10}

  • D) 2 then 10

Answer:

B) 3 then 10

Assigning to shows["Friends"] adds a new outer key with a full dictionary as its value. The catalog now has 3 entries. shows["Friends"]["seasons"] chains into the new entry and returns 10.

Trace 5#

1profiles = {
2    "terp1": {"name": "Joel", "followers": 100},
3    "terp2": {"name": "Sarah", "followers": 250}
4}
5profiles["terp1"]["followers"] += 50
6profiles["terp2"]["name"] = profiles["terp2"]["name"].upper()
7print(profiles["terp1"]["followers"])
8print(profiles["terp2"]["name"])
  • A) 100 then Sarah

  • B) 150 then SARAH

  • C) 50 then SARAH

  • D) 150 then Sarah

Answer:

B) 150 then SARAH

+= 50 adds 50 to terp1’s follower count (100 → 150). For terp2’s name: profiles["terp2"]["name"] returns "Sarah", then .upper() returns "SARAH", and that result is assigned back. Notice the chaining: we read a nested value, call a string method on it, and assign the result back to the same nested location.

Trace 6#

1data = {}
2entries = ["Joel:A:INST126", "Sarah:B:INST201", "Joel:A+:INST326"]
3for entry in entries:
4    name, grade, course = entry.split(":")
5    data[name] = {"grade": grade, "course": course}
6print(data["Joel"]["grade"])
7print(len(data))
  • A) A then 3

  • B) A+ then 2

  • C) A then 2

  • D) A+ then 3

Answer:

B) A+ then 2

Joel appears twice. The second entry ("Joel:A+:INST326") overwrites the first — keys are unique, so data["Joel"] ends up as {"grade": "A+", "course": "INST326"}. There are only 2 unique names (Joel and Sarah), so len(data) is 2.

Practice: Dictionary Scenarios#

For each scenario, start by creating the dictionary, then complete the retrieval and update operations.

Scenario 1: Gradebook (basic)#

You’re building a simple gradebook that maps student names to their current grade (a single letter grade).

Setup#

Create a dictionary called gradebook with the following entries:

Student

Grade

Joel

B

Sarah

A

Rony

A+

Kacie

B+

Miles

C

1# create the gradebook dictionary here

Retrieve#

R1. Print Joel’s grade.

1# your code here

R2. Check if “Pat” is in the gradebook. If so, print their grade; otherwise print "Pat is not in the gradebook".

1# your code here

R3. Use .get() to look up “Zara” in the gradebook. If she’s not there, print "Zara: no grade on file".

1# your code here

Update#

U1. Miles turned in extra credit — update his grade to "B-".

1# your code here

U2. A new student, Pat, joined the class with a grade of "B". Add them to the gradebook.

1# your code here

U3. Sarah and Rony both got final grade adjustments: Sarah is now "A-" and Rony is now "A". Update both in a single operation.

1# your code here

Scenario 2: Restaurant menu (basic)#

You’re modeling a simple restaurant menu that maps dish names to their price.

Setup#

Create a dictionary called menu with the following entries:

Dish

Price

burger

12.99

salad

9.50

pasta

14.75

soup

7.25

fries

5.00

1# create the menu dictionary here

Retrieve#

R1. A customer asks how much the pasta costs. Print the price formatted as: "pasta: $14.75".

1# your code here

R2. A customer wants to know the price of fries. Print it formatted as "fries: $5.00".

1# your code here

R3. A customer asks for “milkshake”. Use .get() with a default to print "milkshake: not on the menu" if it doesn’t exist.

1# your code here

Update#

U1. The soup price increased to $8.50. Update the menu.

1# your code here

U2. Add a new item: "tacos" at $11.25.

1# your code here

U3. The burger and pasta prices both went up by $1.00. Update both in a single .update() call.

1# your code here

Scenario 3: Social media profile (basic)#

You’re modeling a simple social media profile where each key is a profile field.

Setup#

Create a dictionary called profile with the following entries:

Field

Value

username

terp_coder

display_name

Joel C.

followers

142

bio

INST126 instructor

verified

False

1# create the profile dictionary here

Retrieve#

R1. Print the display name and bio in the format: "{display_name} {bio}".

1# your code here

R2. Check if the account is verified. Print "Verified account" or "Not verified" accordingly.

1# your code here

R3. Print the follower count formatted as: "terp_coder has 142 followers" (use the values from the dictionary, don’t hardcode them).

1# your code here

Update#

U1. The user gained 8 new followers. Update the follower count (don’t just hardcode 150 — add 8 to the current value).

1# your code here

U2. The user changed their bio to "Python enthusiast | UMD". Update it.

1# your code here

U3. The account got verified, and they also want to add a new field "website" with the value "https://joelchan.me". Do both updates at once.

1# your code here

Scenario 4: Course directory (nested)#

You’re building a course directory where each course code maps to a dictionary of course details.

Setup#

Create a dictionary called courses with the following structure:

courses = {
    "INST126": {"title": "Intro to Programming", "instructor": "Joel", "capacity": 40, "enrolled": 38},
    "INST201": {"title": "Intro to Info Science", "instructor": "Sarah", "capacity": 35, "enrolled": 35},
    "INST326": {"title": "OO Programming", "instructor": "Pat", "capacity": 30, "enrolled": 22}
}
1# create the courses dictionary here

Retrieve#

R1. Print the instructor for INST126.

1# your code here

R2. Print how many open seats INST126 has (capacity minus enrolled), formatted as: "INST126: {n} seats available".

1# your code here

R3. A student wants to know the title and instructor for INST326. Print: "INST326: OO Programming (taught by Pat)".

1# your code here

Update#

U1. 3 more students enrolled in INST326. Update the enrolled count.

1# your code here

U2. INST201 got a new instructor: “Rony”. Update it.

1# your code here

U3. Add a new course: "INST314" with title "Statistics", instructor "Kacie", capacity 25, and enrolled 0.

1# your code here

Scenario 5: Streaming catalog (nested)#

You’re modeling a simple streaming catalog where each show title maps to a dictionary of details.

Setup#

Create a dictionary called catalog with the following structure:

catalog = {
    "Stranger Things": {"genre": "sci-fi", "seasons": 4, "rating": 8.7},
    "The Office": {"genre": "comedy", "seasons": 9, "rating": 8.9},
    "Breaking Bad": {"genre": "drama", "seasons": 5, "rating": 9.5},
    "Ted Lasso": {"genre": "comedy", "seasons": 3, "rating": 8.8}
}
1# create the catalog dictionary here

Retrieve#

R1. Print the rating of Breaking Bad.

1# your code here

R2. Print how many seasons The Office has, formatted as: "The Office: 9 seasons".

1# your code here

R3. Check if “Wednesday” is in the catalog. Print "Wednesday is available" or "Wednesday is not in the catalog".

1# your code here

Update#

U1. Stranger Things released a new season. Update its season count to 5.

1# your code here

U2. After the new season, the rating for Stranger Things was updated to 8.9. Update it.

1# your code here

U3. Add a new show: "Wednesday" with genre "comedy", seasons 1, and rating 8.1.

1# your code here

Scenario 6: Website config (nested)#

You’re working with a website configuration dictionary that stores settings for different sections of the site.

Setup#

Create a dictionary called config with the following structure:

config = {
    "homepage": {"title": "Welcome", "show_banner": True, "max_posts": 10},
    "blog": {"title": "Our Blog", "show_banner": False, "max_posts": 25},
    "about": {"title": "About Us", "show_banner": True, "max_posts": 0}
}
1# create the config dictionary here

Retrieve#

R1. Print the title of the blog page.

1# your code here

R2. Check whether the blog page has the banner enabled. Print "Blog banner: on" or "Blog banner: off" accordingly.

1# your code here

R3. Print the homepage title and max posts, formatted as: "Homepage: 'Welcome' (max 10 posts)".

1# your code here

Update#

U1. Enable the banner on the blog page.

1# your code here

U2. The homepage should now show a max of 15 posts instead of 10. Update it.

1# your code here

U3. Add a new page: "contact" with title "Contact Us", show_banner set to False, and max_posts set to 0.

1# your code here

Dictionary Application: Indexing#

Now that we have the dictionary data structure, we can apply it in a program that can create an index.

Example#

Here’s an example from above that takes a list of book chapter strings and a list of key concepts, and produces an index that maps key concepts to chapters.

1# how to know which chapters talk about strings? or debugging?
2book = [
3  "Chapter 1: talks about strings and how they have the property of immutability also some basic debugging",
4  "Chapter 2: continues talking about advanced methods for strings and also introduces the concept of functions",
5  "Chapter 3: discusses iteration and lists and also debugging",
6]
 1# define what we want to index as keys
 2concepts = ['strings', 'debugging', "lists", "iteration"]
 3
 4# make a dictionary to hold the index
 5index = {}
 6
 7# go through every chapter
 8for chapter in book:
 9
10    # split into chapter and descr based on the colon
11    chapter, descr = chapter.split(":")
12
13    # index the concepts
14
15    # for every concept we care about
16    for keyconcept in concepts:
17        # check if it's in this chapter descr
18        if keyconcept in descr:
19            # if so, get the current list of chapters associated with this keyconcept
20            chs = index.get(keyconcept, [])
21            # then update the list of chapters
22            # if we haven't seen the chapter
23            if chapter not in chs:
24                chs.append(chapter)
25            # and update the index to map the keyconcept to the updated list of assoc. chapters
26            index.update({keyconcept: chs})
27index
{'strings': ['Chapter 1', 'Chapter 2'],
 'debugging': ['Chapter 1', 'Chapter 3'],
 'lists': ['Chapter 3'],
 'iteration': ['Chapter 3']}

What dictionary concepts do you recognize?

The indexing pattern#

Here’s the generic structure of an indexing pattern

# create an empty dictionary to hold the index

# for every item in a list of things you want to index

    # (optional: parse out the keys and values you want to index from the item)

    # get the current value associated with the key in the index

    # update the value

    # update the index with the key and its updated value

Connection to patterns you already know#

This pattern might look new, but the core idea is something you’ve already practiced! In the Iteration chapter, we learned two key loop patterns:

  • Counting: initialize a counter to 0, loop through items, increment the counter when a condition is met

  • Accumulating (into a list): initialize an empty list [], loop through items, .append() items that meet some criteria

Dictionary indexing is just these same patterns — but instead of one counter or one list, you have a counter or list per key. The dictionary is what lets you keep track of multiple accumulators at once, one for each unique key.

Here’s the side-by-side:

Simple (from Iteration)

Dictionary version (Indexing)

Counting

count = 0count += 1

d[key] = d.get(key, 0) + 1

Accumulating

result = []result.append(item)

d[key] = d.get(key, [])d[key].append(item)

The key difference is .get(key, default) — it handles the “first time we see this key” case by providing a starting value (0 for counting, [] for accumulating).

Let’s see both variations in action.

Variation 1: Counting index#

Let’s look at a super simple example: making a word count index.

We’ll take in a list of words, and produce an index that maps words as keys to counts of occurrences as values.

 1# the thing we want to index
 2word_list = ['she', 'sells', 'sea', 'shells', 'by', 'the', 'sea', 'shore', 'in', 'the', 'sea', 'and', 'the', 'shells', 'and', 'the', 'sea', 'sea', 'sea']
 3
 4# create an empty dictionary to hold the index
 5word_counts = {}
 6
 7# for every item in a list of things you want to index
 8for word in word_list:
 9
10    # (optional: parse out the keys and values you want to index from the item)
11    # we don't need this step
12
13    # get the current value associated with the key in the index
14    # here, we use .get(), and have a default count of 0 if the key isn't yet in the index
15    count = word_counts.get(word, 0)
16
17    # update the value
18    count += 1
19
20    # update the index with the key and its updated value
21    word_counts.update({word: count})
22
23word_counts
{'she': 1,
 'sells': 1,
 'sea': 6,
 'shells': 2,
 'by': 1,
 'the': 4,
 'shore': 1,
 'in': 1,
 'and': 2}

Practice: index grade counts#

Practice! Test your understanding by adapting the above code to count how many times each grade was earned in a course.

1grades_counts = {}
1# your code here

Variation 2: Accumulating index#

The counting version uses d.get(key, 0) to start each key’s count at 0. But what if instead of counting, you want to collect items — like building a list of all chapters where a concept appears, or all emails from a particular sender?

This is the accumulating variation. Instead of 0 as the default, you use [] (an empty list), and instead of += 1, you .append() the item.

Practice: which words are n characters long?#

Let’s try the accumulating variation. We’ll index word lengths: we want an index that can tell us “what words in the list were 2 characters long, or 5 characters long?” The key is the word length, and the value is a list of words with that length.

1# the thing we want to index
2word_list = ['she', 'sells', 'sea', 'shells', 'by', 'the', 'sea', 'shore', 'in', 'the', 'sea', 'and', 'the', 'shells', 'and', 'the', 'sea', 'sea', 'sea']
3
4# your code here

Practice: counts of email addresses from email records (counting)#

Let’s practice the counting variation again. Index how many times we got emails from each email address!

 1email_records = ['From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008',
 2 'From louis@media.berkeley.edu Fri Jan  4 18:10:48 2008',
 3 'From zqian@umich.edu Fri Jan  4 16:10:39 2008',
 4 'From rjlowe@iupui.edu Fri Jan  4 15:46:24 2008',
 5 'From zqian@umich.edu Fri Jan  4 15:03:18 2008',
 6 'From rjlowe@iupui.edu Fri Jan  4 14:50:18 2008',
 7 'From cwen@iupui.edu Fri Jan  4 11:37:30 2008',
 8 'From cwen@iupui.edu Fri Jan  4 11:35:08 2008',
 9 'From gsilver@umich.edu Fri Jan  4 11:12:37 2008',
10 'From gsilver@umich.edu Fri Jan  4 11:11:52 2008',
11 'From zqian@umich.edu Fri Jan  4 11:11:03 2008',
12 'From gsilver@umich.edu Fri Jan  4 11:10:22 2008',
13 'From wagnermr@iupui.edu Fri Jan  4 10:38:42 2008',
14 'From zqian@umich.edu Fri Jan  4 10:17:43 2008',
15 'From antranig@caret.cam.ac.uk Fri Jan  4 10:04:14 2008',
16 'From gopal.ramasammycook@gmail.com Fri Jan  4 09:05:31 2008',
17 'From david.horwitz@uct.ac.za Fri Jan  4 07:02:32 2008',
18 'From david.horwitz@uct.ac.za Fri Jan  4 06:08:27 2008',
19 'From david.horwitz@uct.ac.za Fri Jan  4 04:49:08 2008',
20 'From david.horwitz@uct.ac.za Fri Jan  4 04:33:44 2008',
21 'From stephen.marquard@uct.ac.za Fri Jan  4 04:07:34 2008',
22 'From louis@media.berkeley.edu Thu Jan  3 19:51:21 2008',
23 'From louis@media.berkeley.edu Thu Jan  3 17:18:23 2008',
24 'From ray@media.berkeley.edu Thu Jan  3 17:07:00 2008',
25 'From cwen@iupui.edu Thu Jan  3 16:34:40 2008',
26 'From cwen@iupui.edu Thu Jan  3 16:29:07 2008',
27 'From cwen@iupui.edu Thu Jan  3 16:23:48 2008']
28
29# your code here

Practice: map email addresses to email records (accumulating)#

Now modify the previous program to use the accumulating variation. Instead of counting, collect the full email records (as a list of strings) for each email address. We want to ask questions like “can I see the emails I got from cwen@iupui.edu?”

 1email_records = ['From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008',
 2 'From louis@media.berkeley.edu Fri Jan  4 18:10:48 2008',
 3 'From zqian@umich.edu Fri Jan  4 16:10:39 2008',
 4 'From rjlowe@iupui.edu Fri Jan  4 15:46:24 2008',
 5 'From zqian@umich.edu Fri Jan  4 15:03:18 2008',
 6 'From rjlowe@iupui.edu Fri Jan  4 14:50:18 2008',
 7 'From cwen@iupui.edu Fri Jan  4 11:37:30 2008',
 8 'From cwen@iupui.edu Fri Jan  4 11:35:08 2008',
 9 'From gsilver@umich.edu Fri Jan  4 11:12:37 2008',
10 'From gsilver@umich.edu Fri Jan  4 11:11:52 2008',
11 'From zqian@umich.edu Fri Jan  4 11:11:03 2008',
12 'From gsilver@umich.edu Fri Jan  4 11:10:22 2008',
13 'From wagnermr@iupui.edu Fri Jan  4 10:38:42 2008',
14 'From zqian@umich.edu Fri Jan  4 10:17:43 2008',
15 'From antranig@caret.cam.ac.uk Fri Jan  4 10:04:14 2008',
16 'From gopal.ramasammycook@gmail.com Fri Jan  4 09:05:31 2008',
17 'From david.horwitz@uct.ac.za Fri Jan  4 07:02:32 2008',
18 'From david.horwitz@uct.ac.za Fri Jan  4 06:08:27 2008',
19 'From david.horwitz@uct.ac.za Fri Jan  4 04:49:08 2008',
20 'From david.horwitz@uct.ac.za Fri Jan  4 04:33:44 2008',
21 'From stephen.marquard@uct.ac.za Fri Jan  4 04:07:34 2008',
22 'From louis@media.berkeley.edu Thu Jan  3 19:51:21 2008',
23 'From louis@media.berkeley.edu Thu Jan  3 17:18:23 2008',
24 'From ray@media.berkeley.edu Thu Jan  3 17:07:00 2008',
25 'From cwen@iupui.edu Thu Jan  3 16:34:40 2008',
26 'From cwen@iupui.edu Thu Jan  3 16:29:07 2008',
27 'From cwen@iupui.edu Thu Jan  3 16:23:48 2008']
28
29# your code here
1email_occurrences.get("david.horwitz@uct.ac.za")
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[125], line 1
----> 1 email_occurrences.get("david.horwitz@uct.ac.za")

NameError: name 'email_occurrences' is not defined

Investigate: integrate the book chapter indexing program into our generic structure#

What’s the same? What needs to be modified?

Practice: flag important emails#

Let’s say you have a list of ‘starred contacts’, and you want a separate index where you can ask the question “give me all the emails I got from this starred contact”. You can imagine this being sort of a rudimentary back-end of your “important” tab in Gmail.

How might we modify our email indexing program to do this?

 1email_records = ['From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008',
 2 'From louis@media.berkeley.edu Fri Jan  4 18:10:48 2008',
 3 'From zqian@umich.edu Fri Jan  4 16:10:39 2008',
 4 'From rjlowe@iupui.edu Fri Jan  4 15:46:24 2008',
 5 'From zqian@umich.edu Fri Jan  4 15:03:18 2008',
 6 'From rjlowe@iupui.edu Fri Jan  4 14:50:18 2008',
 7 'From cwen@iupui.edu Fri Jan  4 11:37:30 2008',
 8 'From cwen@iupui.edu Fri Jan  4 11:35:08 2008',
 9 'From gsilver@umich.edu Fri Jan  4 11:12:37 2008',
10 'From gsilver@umich.edu Fri Jan  4 11:11:52 2008',
11 'From zqian@umich.edu Fri Jan  4 11:11:03 2008',
12 'From gsilver@umich.edu Fri Jan  4 11:10:22 2008',
13 'From wagnermr@iupui.edu Fri Jan  4 10:38:42 2008',
14 'From zqian@umich.edu Fri Jan  4 10:17:43 2008',
15 'From antranig@caret.cam.ac.uk Fri Jan  4 10:04:14 2008',
16 'From gopal.ramasammycook@gmail.com Fri Jan  4 09:05:31 2008',
17 'From david.horwitz@uct.ac.za Fri Jan  4 07:02:32 2008',
18 'From david.horwitz@uct.ac.za Fri Jan  4 06:08:27 2008',
19 'From david.horwitz@uct.ac.za Fri Jan  4 04:49:08 2008',
20 'From david.horwitz@uct.ac.za Fri Jan  4 04:33:44 2008',
21 'From stephen.marquard@uct.ac.za Fri Jan  4 04:07:34 2008',
22 'From louis@media.berkeley.edu Thu Jan  3 19:51:21 2008',
23 'From louis@media.berkeley.edu Thu Jan  3 17:18:23 2008',
24 'From ray@media.berkeley.edu Thu Jan  3 17:07:00 2008',
25 'From cwen@iupui.edu Thu Jan  3 16:34:40 2008',
26 'From cwen@iupui.edu Thu Jan  3 16:29:07 2008',
27 'From cwen@iupui.edu Thu Jan  3 16:23:48 2008']
28
29starred = ["cwen@iupui.edu", "david.horwitz@uct.ac.za"]
30# your code here

Practice: Code Tracing with Dictionary Indexing#

Predict the output of each code snippet before running the cell!

Trace 1 (counting)#

1letters = ["a", "b", "a", "c", "b", "a"]
2counts = {}
3for letter in letters:
4    counts[letter] = counts.get(letter, 0) + 1
5print(counts)
6print(counts["a"])
  • A) {'a': 1, 'b': 1, 'c': 1} then 1

  • B) {'a': 3, 'b': 2, 'c': 1} then 3

  • C) {'a': 3, 'b': 2, 'a': 3, 'c': 1, 'b': 2, 'a': 3} then 3

  • D) {'a': 2, 'b': 1, 'c': 1} then 2

Answer:

B) {'a': 3, 'b': 2, 'c': 1} then 3

Each time we see a letter, .get(letter, 0) returns the current count (or 0 if it’s new), and we add 1. “a” appears 3 times, “b” twice, “c” once. Keys are unique — there’s only one entry per letter.

Trace 2 (accumulating)#

1words = ["cat", "car", "dog", "cup"]
2groups = {}
3for word in words:
4    first = word[0]
5    items = groups.get(first, [])
6    items.append(word)
7    groups[first] = items
8print(groups)
  • A) {'c': ['cup'], 'd': ['dog']}

  • B) {'c': ['cat', 'car', 'cup'], 'd': ['dog']}

  • C) {'cat': 'c', 'car': 'c', 'dog': 'd', 'cup': 'c'}

  • D) {'c': 3, 'd': 1}

Answer:

B) {'c': ['cat', 'car', 'cup'], 'd': ['dog']}

This is the accumulating pattern. The key is the first letter, and the value is a list of words starting with that letter. .get(first, []) returns the existing list or an empty one for new keys. Each word is appended to the appropriate list.

Trace 3 (counting with parsing)#

1sales = ["apples:3", "bananas:2", "apples:1", "bananas:5"]
2totals = {}
3for sale in sales:
4    item, qty = sale.split(":")
5    totals[item] = totals.get(item, 0) + int(qty)
6print(totals)
  • A) {'apples': 1, 'bananas': 5}

  • B) {'apples': 4, 'bananas': 7}

  • C) {'apples': 3, 'bananas': 2}

  • D) {'apples:3': 1, 'bananas:2': 1, 'apples:1': 1, 'bananas:5': 1}

Answer:

B) {'apples': 4, 'bananas': 7}

Each entry is parsed with .split(":"). The quantity is converted to int and added to the running total for that item. Apples: 3 + 1 = 4. Bananas: 2 + 5 = 7. This is the counting pattern but accumulating a sum rather than just counting occurrences.

Trace 4 (trap — overwrite, not accumulate!)#

1scores = [("Joel", 85), ("Sarah", 92), ("Joel", 90), ("Sarah", 88)]
2record = {}
3for name, score in scores:
4    record[name] = score
5print(record)
  • A) {'Joel': [85, 90], 'Sarah': [92, 88]}

  • B) {'Joel': 90, 'Sarah': 88}

  • C) {'Joel': 85, 'Sarah': 92}

  • D) {'Joel': 175, 'Sarah': 180}

Answer:

B) {'Joel': 90, 'Sarah': 88}

This is a trap! The code just assigns record[name] = score — it overwrites the previous value each time. Joel’s second score (90) replaces 85. To keep all scores, you’d need the accumulating pattern: record[name] = record.get(name, []) then record[name].append(score).

Trace 5 (counting with conditional)#

1grades = ["A", "B", "A", "C", "A", "B", "F", "B", "A"]
2passing = {}
3for grade in grades:
4    if grade != "F":
5        passing[grade] = passing.get(grade, 0) + 1
6print(passing)
7print(len(passing))
  • A) {'A': 4, 'B': 3, 'C': 1, 'F': 1} then 4

  • B) {'A': 4, 'B': 3, 'C': 1} then 3

  • C) {'A': 4, 'B': 3, 'C': 1, 'F': 0} then 4

  • D) {'A': 3, 'B': 2, 'C': 1} then 3

Answer:

B) {'A': 4, 'B': 3, 'C': 1} then 3

The if grade != "F" filter means F is never added to the dictionary at all — not even with a count of 0. Only passing grades are counted. There are 3 unique passing grades, so len(passing) is 3.

Trace 6 (accumulating with parsing)#

1data = ["INST126:Joel", "INST201:Sarah", "INST126:Rony"]
2index = {}
3for entry in data:
4    course, student = entry.split(":")
5    students = index.get(course, [])
6    students.append(student)
7    index[course] = students
8print(index["INST126"])
9print(len(index))
  • A) ['Joel'] then 3

  • B) ['Joel', 'Rony'] then 2

  • C) ['Rony'] then 2

  • D) ['Joel', 'Rony'] then 3

Answer:

B) ['Joel', 'Rony'] then 2

INST126 appears twice. The first time, .get("INST126", []) returns [], and “Joel” is appended. The second time, it returns ['Joel'], and “Rony” is appended. The list grows — it doesn’t overwrite. There are 2 unique courses, so len(index) is 2.

Practice: Dictionary Indexing Problems#

P1: Count character types (counting)#

Given a string, count how many letters, digits, and spaces it contains. Store the results in a dictionary with keys "letters", "digits", and "spaces".

1text = "Hello World 123"
2char_counts = {}
3
4# your code here

Answer:

text = "Hello World 123"
char_counts = {}
for char in text:
    if char.isalpha():
        char_counts["letters"] = char_counts.get("letters", 0) + 1
    elif char.isnumeric():
        char_counts["digits"] = char_counts.get("digits", 0) + 1
    elif char == " ":
        char_counts["spaces"] = char_counts.get("spaces", 0) + 1
char_counts  # {'letters': 10, 'digits': 3, 'spaces': 2}

P2: Group names by first letter (accumulating)#

Given a list of names, build a dictionary that maps each first letter to a list of names starting with that letter.

1names = ["Joel", "Sarah", "John", "Kacie", "Sam", "Jill", "Kelly"]
2groups = {}
3
4# your code here

Answer:

names = ["Joel", "Sarah", "John", "Kacie", "Sam", "Jill", "Kelly"]
groups = {}
for name in names:
    first = name[0]
    groups[first] = groups.get(first, [])
    groups[first].append(name)
groups  # {'J': ['Joel', 'John', 'Jill'], 'S': ['Sarah', 'Sam'], 'K': ['Kacie', 'Kelly']}

P3: Word frequency from a sentence (counting)#

Given a sentence, count how many times each word appears. Normalize to lowercase first so “The” and “the” are counted together.

1sentence = "the cat sat on the mat and the cat saw the dog"
2word_freq = {}
3
4# your code here

Answer:

sentence = "the cat sat on the mat and the cat saw the dog"
word_freq = {}
for word in sentence.lower().split():
    word_freq[word] = word_freq.get(word, 0) + 1
word_freq  # {'the': 4, 'cat': 2, 'sat': 1, 'on': 1, 'mat': 1, 'and': 1, 'saw': 1, 'dog': 1}

P4: Categorize scores (accumulating)#

Given a list of scores, build a dictionary that groups them into "high" (90+), "medium" (70-89), and "low" (below 70).

1scores = [95, 67, 82, 91, 55, 73, 88, 100, 42, 78]
2categories = {}
3
4# your code here

Answer:

scores = [95, 67, 82, 91, 55, 73, 88, 100, 42, 78]
categories = {}
for score in scores:
    if score >= 90:
        label = "high"
    elif score >= 70:
        label = "medium"
    else:
        label = "low"
    categories[label] = categories.get(label, [])
    categories[label].append(score)
categories  # {'high': [95, 91, 100], 'medium': [82, 73, 88, 78], 'low': [67, 55, 42]}

P5: Count email domains (counting)#

Given a list of email addresses, count how many emails come from each domain (the part after @).

1emails = ["joel@umd.edu", "sarah@gmail.com", "rony@umd.edu", "pat@gmail.com", "kacie@umd.edu", "miles@yahoo.com"]
2domain_counts = {}
3
4# your code here

Answer:

emails = ["joel@umd.edu", "sarah@gmail.com", "rony@umd.edu", "pat@gmail.com", "kacie@umd.edu", "miles@yahoo.com"]
domain_counts = {}
for email in emails:
    domain = email.split("@")[1]
    domain_counts[domain] = domain_counts.get(domain, 0) + 1
domain_counts  # {'umd.edu': 3, 'gmail.com': 2, 'yahoo.com': 1}

P6: Index course enrollments (accumulating)#

Given a list of enrollment records in the format "StudentName:CourseCode", build a dictionary that maps each course to a list of enrolled students.

1enrollments = [
2    "Joel:INST126", "Sarah:INST201", "Rony:INST126",
3    "Kacie:INST201", "Pat:INST126", "Miles:INST326",
4    "Sarah:INST126", "Joel:INST201"
5]
6course_roster = {}
7
8# your code here

Answer:

enrollments = [
    "Joel:INST126", "Sarah:INST201", "Rony:INST126",
    "Kacie:INST201", "Pat:INST126", "Miles:INST326",
    "Sarah:INST126", "Joel:INST201"
]
course_roster = {}
for entry in enrollments:
    student, course = entry.split(":")
    course_roster[course] = course_roster.get(course, [])
    course_roster[course].append(student)
course_roster
# {'INST126': ['Joel', 'Rony', 'Pat', 'Sarah'],
#  'INST201': ['Sarah', 'Kacie', 'Joel'],
#  'INST326': ['Miles']}