Overview
A dictionary is a new kind of Python data type. Dictionaries are fantastically useful and are found in nearly all Python programs. Once you learn what dictionaries are and how they work, you won’t want to program without them.
Before we describe what dictionaries are, let’s describe a problem they can solve.
Example problem: phone number database
You want to keep track of your friends' phone numbers. But since you have so many friends, this is a difficult job! How can the computer help?
For each friend, you need to store:
- the name of the friend
- their phone number
Also, you want to be able to retrieve the phone number for a given friend. Given what you know now, how can you do this?
Using a list
You could create a list of names and phone numbers:
phone_numbers = ['Joe', '567-8901', 'Jane', '123-4567', ...]
...but it would not be easy to find the number corresponding to a particular name, because names and numbers are interleaved and nothing ties a name to the number that follows it. It would be better if a name and the corresponding phone number were associated in some way.
Using a list of tuples
You could have a list of (<name>, <phone number>) tuples:
phone_numbers = [('Joe', '567-8901'), ('Jane', '123-4567'), ...]
Let’s see what we would need to do in order to find the phone number corresponding to a particular name, e.g. 'Alex'.
We could write the code like this:
for (name, num) in phone_numbers:
    if name == 'Alex':
        print('Phone number: {}'.format(num))
This is not too bad, but:
- We can’t modify the phone number! (Tuples are immutable.) We might use lists instead of tuples, but...
- We might have to look through the entire list (in the worst case) to find one number, which is inefficient.
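To make the second point concrete, here is a minimal sketch of the lookup wrapped in a function. The function name `find_number` and the sample data are hypothetical, chosen only for illustration; the point is that the loop may have to visit every entry before it finds a match or gives up.

```python
# Hypothetical sample data in the list-of-tuples layout described above.
phone_numbers = [('Joe', '567-8901'), ('Jane', '123-4567'), ('Alex', '555-0000')]

def find_number(entries, wanted):
    """Return the number stored for `wanted`, or None if there is no match.

    Scans the list from the front, so the work grows with the list length.
    """
    for (name, num) in entries:
        if name == wanted:
            return num
    return None

print(find_number(phone_numbers, 'Alex'))  # found only after checking every entry
print(find_number(phone_numbers, 'Zed'))   # not found: every entry was checked
```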
Using a dictionary
The Right Thing™ to do in cases like this is to use a dictionary. So let’s talk about dictionaries and what makes them so awesome.
Keys and values
A dictionary (sometimes called a dict for short) is a Python data structure that associates keys with values. Each key is associated with exactly one value. (Sometimes this is called a mapping between keys and values.) Dictionaries allow you to do these things:
- find the value corresponding to a particular key
- change the value associated with a key
- add new key/value pairs
- delete key/value pairs
and they’re fast! (Much faster than a list of tuples, for instance.) [1]
Because we can add key/value pairs to a dictionary and delete key/value pairs from a dictionary, dictionaries, like lists, are mutable.
There are two rules for keys and values:
- The values in a dictionary can be any Python value.
- The keys in a dictionary can be any kind of immutable Python value. [2]
Since strings are immutable, we can use strings as dictionary keys. You can also use numbers, tuples, and other kinds of values we haven’t seen yet. [3] In the example above, we can use names as keys and phone numbers as values.
Dictionary syntax
We want to create a dictionary from our friends' names and phone numbers. First, we have to know the syntax of dictionaries.
Empty dictionary
Dictionaries use curly braces, and the simplest dictionary is the empty one, which looks like this:
{}
It’s a dictionary with no key/value pairs. Pretty exciting!
Actually, though, empty dictionaries, like empty lists, are very useful. Often you start with an empty dictionary and then fill it up element-by-element in a loop, adding a new key/value pair for every iteration of the loop.
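As a small illustration of that pattern, the sketch below starts from an empty dictionary and adds one key/value pair per loop iteration. The names `squares` and `n` are made up for this example, and the assignment syntax it uses is explained later in this reading.

```python
# Start with an empty dictionary and fill it in a loop.
squares = {}
for n in range(1, 6):
    squares[n] = n * n   # add the pair n : n*n

print(squares)  # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
```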
Non-empty dictionary
Alternatively, you can create a dictionary by writing out the key/value pairs inside of curly braces, separated in two ways:
- different key/value pairs are separated by commas
- the key and the value in a single key/value pair are separated by a colon (:) [4]
For our example, here is the dictionary we can create:
phone_numbers = { 'Joe' : '567-8901', 'Jane' : '123-4567' }
If there are more key/value pairs, we can add them too. You can see that the first key/value pair is 'Joe' : '567-8901' and the second is 'Jane' : '123-4567'. The keys are 'Joe' and 'Jane' and the values are '567-8901' and '123-4567'. The spaces in the dictionary are not required, but they help to keep it readable.
Most of the time, when we write out a dictionary like this (called a literal dictionary), the keys and values are Python values, but they can also be Python expressions. Here’s a contrived example.
phone_numbers = { 'J' + 'oe' : '567' + '-8901', 'Ja' + 'ne' : '123-' + '4567' }
This would give the same dictionary as the previous code.
In cases like this (which are very rare), the key expressions and the value expressions are evaluated before the dictionary is created. (It’s not that rare to have computed values, but computed keys are very unusual.)
Dictionary types
The only restriction on the types of keys or values in a dictionary is that the key must be immutable i.e. its type must be the type of an immutable Python object. Other than that, a dictionary can have any type of key or value.
In particular, a single dictionary can have different types of (immutable) keys, and different types of values. This is a bit unusual, but sometimes it’s quite useful. So this is legal:
mydict = { 1 : 'foo', 'bar' : [1, 2, 3], ('baz', 'boom') : 3.14159 }
The mydict dictionary has three different (immutable) key types, and three different value types.
You may have heard of the JSON data format, a way of formatting structured data that is used a lot by internet applications. A JSON object is almost identical to a Python dictionary with string keys and different types of values. Python, like most languages, has a JSON library (actually more than one).
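As a rough illustration of that similarity, the sketch below uses the `json` module from Python's standard library to turn a dictionary into JSON text and back again. The dictionary contents are invented for the example.

```python
import json

# A dictionary with string keys and values of several different types.
person = {'name': 'Joe', 'phones': ['567-8901', '314-1592'], 'age': 30}

text = json.dumps(person)      # dictionary -> JSON-formatted string
restored = json.loads(text)    # JSON-formatted string -> dictionary

print(text)
print(restored == person)      # True
```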
Getting a value given a key
The most common thing to do with a dictionary is to look up the value that corresponds to a particular key. We’ll assume this dictionary again:
phone_numbers = { 'Joe' : '567-8901', 'Jane' : '123-4567' }
To get Joe’s phone number, all we have to write is this:
phone_numbers['Joe']
which will evaluate to '567-8901'.
Notice that phone_numbers['Joe'] looks like indexing a list with an index of 'Joe'. Python is overloading the meaning of the square brackets! Before this, the value inside the brackets could only be an integer. But with a dictionary, it can be any key value (which means any immutable Python value). Python really likes to re-use its syntax for distinct but similar things!
Changing a value at a key
Another thing you commonly want to do with a dictionary is to change the value associated with a particular key. For instance, let’s say that Joe’s phone number changes. We can change the dictionary value too:
phone_numbers['Joe'] = '314-1592' # cool new phone number!
This is just like the syntax for changing a list value, except that the "index" is a string, not a number. Here’s the new dictionary:
>>> phone_numbers
{'Joe': '314-1592', 'Jane': '123-4567'}
Adding a new key/value pair
Another very common thing to do with dictionaries is to add new key/value pairs. The syntax for this is identical to the syntax for changing the values at existing keys, except that the keys are not in the dictionary until after you add them. For instance, let’s say that you just made a new friend named Bob, and you wanted to add his phone number. No problem!
phone_numbers['Bob'] = '000-0000'
Now when you look at the entire dictionary, you see this:
>>> phone_numbers
{'Joe': '314-1592', 'Jane': '123-4567', 'Bob': '000-0000'}
Even though the key/value pairs are displayed in the order they were added, Python dictionaries are not sequences: you can’t ask for the "first" or "third" pair by position. Since Python 3.7, dictionaries are guaranteed to keep their keys in "insertion order", but earlier versions of Python didn’t guarantee this, so code that has to run on older versions shouldn’t depend on it.
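The sketch below illustrates what "not a sequence" means in practice, using a small made-up dictionary `d`: two dictionaries with the same pairs compare equal regardless of the order in which the pairs were written, and an integer in square brackets is treated as a key, not as a position.

```python
d = {'Joe': '314-1592', 'Jane': '123-4567'}
same_pairs = {'Jane': '123-4567', 'Joe': '314-1592'}

# Equality compares the key/value pairs, not the order they were added in.
print(d == same_pairs)   # True

# d[0] does not mean "the first pair"; it looks for the key 0.
try:
    d[0]
except KeyError:
    print('0 is not a key in this dictionary')
```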
This is one way in which a dictionary is very different from e.g. a list. With a list, you can’t add new entries like this.
>>> lst = [0,1,2,3,4]
>>> lst[5] = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list assignment index out of range
Instead, you have to use lst.append(5).
Accessing a nonexistent key
What happens when you try to access a nonexistent key in a dictionary?
>>> phone_numbers['Mike']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'Mike'
Python raises a KeyError exception. This is the right thing to do. [5]
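If a missing key is a situation your program has to cope with, one option is to catch the exception. The following is a minimal sketch under that assumption; the helper name `lookup` and the message it returns are invented for illustration.

```python
phone_numbers = {'Joe': '314-1592', 'Jane': '123-4567'}

def lookup(name):
    """Return the number for `name`, or a message if the name is unknown."""
    try:
        return phone_numbers[name]
    except KeyError:
        # Raised when `name` is not one of the dictionary's keys.
        return 'no number stored for ' + name

print(lookup('Joe'))    # 314-1592
print(lookup('Mike'))   # no number stored for Mike
```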
Deleting a key/value pair: the del statement
It’s not that common, but sometimes we want to delete a key/value pair. Let’s say you had a falling out with your new friend Bob, and you decide you don’t ever want to talk to him again. You might want to delete his phone number from your phone_numbers dictionary. Here’s how to do it:
>>> phone_numbers
{'Joe': '314-1592', 'Jane': '123-4567', 'Bob': '000-0000'}
>>> del phone_numbers['Bob']
>>> phone_numbers
{'Joe': '314-1592', 'Jane': '123-4567'}
The new keyword del is short for "delete". Given a key, it removes the key/value pair that the key is part of from the dictionary. This is not a function or method call! del is actually a special Python statement. Because it isn’t a function call, you don’t have to put parentheses around its argument (and you shouldn’t).
del can remove elements from things other than dictionaries (e.g. lists) but it’s more useful with dictionaries than with lists. We will meet del again in future readings.
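A short sketch of both uses, with made-up data: `del` on a list removes the element at a given index, and `del` on a dictionary with a key that isn't there raises a KeyError, just as a lookup would.

```python
letters = ['a', 'b', 'c']
del letters[1]           # remove the element at index 1
print(letters)           # ['a', 'c']

phone_numbers = {'Joe': '314-1592', 'Bob': '000-0000'}
del phone_numbers['Bob']
print(phone_numbers)     # {'Joe': '314-1592'}

# Deleting a key that is not in the dictionary is an error.
try:
    del phone_numbers['Bob']
except KeyError:
    print('Bob was already deleted')
```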
Back to the example: tuples as keys
Let’s improve the example by using a tuple of first and last names as keys:
phone_numbers = {
    ('Joe', 'Smith') : '567-8910',
    ('Jane', 'Doe') : '123-4567',
    ('El', 'Hovik') : '000-0000',
    ('Mike', 'Vanier') : '111-1111',
}
(Fun fact: we don’t have to use the \<return> line continuation characters at the ends of the lines when writing out a dictionary like this.)
It’s OK to use a tuple of strings as a dictionary key, because both tuples and strings are immutable, so a tuple of strings is immutable too. If we had e.g. a tuple of lists, that would not be immutable, so you couldn’t use it as a key. Similarly, a list of strings is not an acceptable dictionary key. Let’s try it anyway:
>>> phone_numbers = { ['Joe', 'Smith'] : '567-8910', ['Jane', 'Doe'] : '123-4567' }
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
The error message unhashable type: 'list' means that since lists are mutable, they can’t be used as dictionary keys. [6]
OK, so we’ll use tuples. Once we’ve done this, we can access a value corresponding to a tuple:
>>> phone_numbers[('Joe', 'Smith')]
'567-8910'
We have to use the entire tuple; either the first or last name doesn’t work:
>>> phone_numbers['Joe']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'Joe'
>>> phone_numbers['Smith']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'Smith'
Dictionaries and for loops
We’ve seen many things that can be looped over using for loops:
- lists
- strings
- files
So it shouldn’t surprise you to learn that dictionaries can also be looped over in a for loop. The way it works is that when you have a dictionary in a for loop following the in keyword, you loop over the keys of the dictionary. For instance, we could write this loop:
for key in phone_numbers:
    print(f'key: {key}, value: {phone_numbers[key]}')
which would print:
key: ('Joe', 'Smith'), value: 567-8910
key: ('Jane', 'Doe'), value: 123-4567
key: ('El', 'Hovik'), value: 000-0000
key: ('Mike', 'Vanier'), value: 111-1111
We could use this to print out the phone numbers of every person in the dictionary whose first name is 'Joe':
for key in phone_numbers:
    (first_name, last_name) = key
    if first_name == 'Joe':
        print(f'name: {first_name} {last_name}, number: {phone_numbers[key]}')
Since there is only one 'Joe' in the dictionary, this will print:
name: Joe Smith, number: 567-8910
Dictionary methods
Dictionaries are objects in Python (like lists, strings, and files). Therefore, they have methods. In this section, we’ll discuss a few of the most important ones. For a full list of dictionary methods, consult the Python documentation.
get
If you try to get the value in a dictionary corresponding to a key which isn’t in the dictionary, normally this results in a KeyError exception:
>>> phone_numbers[('William', 'Shakespeare')]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: ('William', 'Shakespeare')
Instead of this, you can use the get method if you would rather return a default value:
>>> phone_numbers.get(('William', 'Shakespeare'), 'unknown')  # 'unknown' is the default value returned if the key isn't in the dictionary
'unknown'
This is usually not what you want to do, but we’ll see an example below where this method is useful.
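One detail worth knowing: the second argument to `get` is optional, and if you leave it out the method returns None for a missing key. A minimal sketch with a small made-up dictionary:

```python
phone_numbers = {('Joe', 'Smith'): '567-8910'}

# The key is present: get behaves like an ordinary lookup.
print(phone_numbers.get(('Joe', 'Smith')))                        # 567-8910

# The key is missing and no default is given: the result is None.
print(phone_numbers.get(('William', 'Shakespeare')))              # None

# The key is missing and a default is given: the default is returned.
print(phone_numbers.get(('William', 'Shakespeare'), 'unknown'))   # unknown

# get never adds the missing key to the dictionary.
print(len(phone_numbers))                                         # 1
```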
clear
If you want to empty out an existing dictionary, you can do that with the clear method:
>>> phone_numbers
{('Joe', 'Smith'): '567-8910', ('Jane', 'Doe'): '123-4567', ('El', 'Hovik'): '000-0000', ('Mike', 'Vanier'): '111-1111'}
>>> phone_numbers.clear()
>>> phone_numbers
{}
This is rarely needed; it’s actually easier to just do this:
>>> phone_numbers = {}
On the other hand, if the dictionary was passed in as an argument to a function or was part of a larger data structure, you might need to use the clear method if you need to empty it out.
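The difference shows up when a dictionary is passed to a function. In the sketch below (the function names are hypothetical), assigning `{}` to the parameter only changes what the local name refers to, while `clear` empties the dictionary the caller is still holding.

```python
def reset_by_assignment(d):
    d = {}        # rebinds the local name only; the caller's dictionary is untouched

def reset_by_clear(d):
    d.clear()     # empties the dictionary object itself

numbers = {'Joe': '567-8901', 'Jane': '123-4567'}

reset_by_assignment(numbers)
print(numbers)    # {'Joe': '567-8901', 'Jane': '123-4567'}

reset_by_clear(numbers)
print(numbers)    # {}
```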
keys and values
If you need the dictionary’s keys or values as a separate thing, you can use the keys or values methods. These return (respectively) a dict_keys or a dict_values object. These objects can be looped over directly, and they can easily be converted to lists:
phone_numbers = {
    ('Joe', 'Smith') : '567-8910',
    ('Jane', 'Doe') : '123-4567',
    ('El', 'Hovik') : '000-0000',
    ('Mike', 'Vanier') : '111-1111',
}
>>> phone_numbers.keys()
dict_keys([('Joe', 'Smith'), ('Jane', 'Doe'), ('El', 'Hovik'), ('Mike', 'Vanier')])
>>> list(phone_numbers.keys())
[('Joe', 'Smith'), ('Jane', 'Doe'), ('El', 'Hovik'), ('Mike', 'Vanier')]
>>> phone_numbers.values()
dict_values(['567-8910', '123-4567', '000-0000', '111-1111'])
>>> list(phone_numbers.values())
['567-8910', '123-4567', '000-0000', '111-1111']
It’s rare that you actually need these methods.
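One property of these objects that a list copy doesn't have: they reflect later changes to the dictionary. The sketch below uses a small made-up dictionary to show the difference.

```python
phone_numbers = {'Joe': '567-8901', 'Jane': '123-4567'}

keys_view = phone_numbers.keys()          # a dict_keys object tied to the dictionary
keys_list = list(phone_numbers.keys())    # an independent list, fixed at this moment

phone_numbers['Bob'] = '000-0000'         # add a pair afterwards

print(list(keys_view))   # ['Joe', 'Jane', 'Bob']
print(keys_list)         # ['Joe', 'Jane']
```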
items
The items method is like the keys and values methods combined: it returns a dict_items object which can be converted to a list of key/value pairs:
>>> phone_numbers.items()
dict_items([(('Joe', 'Smith'), '567-8910'), (('Jane', 'Doe'), '123-4567'), (('El', 'Hovik'), '000-0000'), (('Mike', 'Vanier'), '111-1111')])
>>> list(phone_numbers.items())
[(('Joe', 'Smith'), '567-8910'), (('Jane', 'Doe'), '123-4567'), (('El', 'Hovik'), '000-0000'), (('Mike', 'Vanier'), '111-1111')]
Sometimes the items method can be used to good effect in a for loop.
>>> for (key, value) in phone_numbers.items():
...     print(key, value)
...
('Joe', 'Smith') 567-8910
('Jane', 'Doe') 123-4567
('El', 'Hovik') 000-0000
('Mike', 'Vanier') 111-1111
You usually don’t need to convert the items return value into a list, and you normally shouldn’t. (In this respect, the items method is similar to the range function.)
update
The update method adds the key/value pairs from another dictionary into a dictionary, overwriting old values if the other dictionary has the same keys with different values.
>>> for (key, value) in phone_numbers.items():
...     print(key, value)
...
('Joe', 'Smith') 567-8910
('Jane', 'Doe') 123-4567
('El', 'Hovik') 000-0000
('Mike', 'Vanier') 111-1111
>>> new_phone_numbers = {
...     ('Bob', 'Johnson') : '543-9876',
...     ('Jane', 'Doe') : '7654-321'
... }
>>> phone_numbers
{('Joe', 'Smith'): '567-8910', ('Jane', 'Doe'): '123-4567', ('El', 'Hovik'): '000-0000', ('Mike', 'Vanier'): '111-1111'}
>>> phone_numbers.update(new_phone_numbers)
>>> phone_numbers
{('Joe', 'Smith'): '567-8910', ('Jane', 'Doe'): '7654-321', ('El', 'Hovik'): '000-0000', ('Mike', 'Vanier'): '111-1111', ('Bob', 'Johnson'): '543-9876'}
>>> for (key, value) in phone_numbers.items():
...     print(key, value)
...
('Joe', 'Smith') 567-8910
('Jane', 'Doe') 7654-321
('El', 'Hovik') 000-0000
('Mike', 'Vanier') 111-1111
('Bob', 'Johnson') 543-9876
We see that updating the phone_numbers dictionary with new_phone_numbers has provided a new phone number for Bob Johnson and has overwritten the old phone number for Jane Doe.
What about append?
There is no append method for dictionaries, because it’s not needed!
To add a new key/value pair, just use normal assignment syntax:
>>> phone_numbers[('Don', 'Knuth')] = '271-8281'
>>> for (key, value) in phone_numbers.items():
...     print(key, value)
...
('Joe', 'Smith') 567-8910
('Jane', 'Doe') 7654-321
('El', 'Hovik') 000-0000
('Mike', 'Vanier') 111-1111
('Bob', 'Johnson') 543-9876
('Don', 'Knuth') 271-8281
The in operator
Previously we’ve seen the in operator for sequences. We can also use in with dictionaries. <key> in <dictionary> means: is the key <key> one of the keys in the dictionary <dictionary>?
>>> ('Don', 'Knuth') in phone_numbers
True
>>> ('Bill', 'Gates') in phone_numbers
False
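Note that in looks only at the keys. The sketch below, using a small made-up dictionary, shows that a value is not found this way unless you search the values explicitly.

```python
phone_numbers = {'Joe': '567-8901', 'Jane': '123-4567'}

print('Joe' in phone_numbers)                 # True: 'Joe' is a key
print('567-8901' in phone_numbers)            # False: it is a value, not a key
print('567-8901' in phone_numbers.values())   # True: search the values explicitly
print('Mike' not in phone_numbers)            # True
```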
Example: creating a frequency table
OK, let’s do something useful!
We have a list of words. We want to create a frequency table for the list, which means that for each word, we want to record the number of times it occurs in the word list.
We will solve this by creating a dictionary:
- key: a word in the list
- value: the count of that word
Let’s write the code, and also print out the resulting table at the end.
words = ['to', 'be', 'or', 'not', 'to', 'be', 'that', 'is', 'the', 'question']
freqs = {}
for word in words:
    if word in freqs:
        freqs[word] += 1
    else:  # first time we've seen that word
        freqs[word] = 1
for (key, value) in freqs.items():
    print(f'Word: {key} occurs: {value} times')
This prints:
Word: to occurs: 2 times
Word: be occurs: 2 times
Word: or occurs: 1 times
Word: not occurs: 1 times
Word: that occurs: 1 times
Word: is occurs: 1 times
Word: the occurs: 1 times
Word: question occurs: 1 times
See how easy that was? Dictionaries can make many programming tasks much easier to accomplish.
Of course, it’s pretty rare to find any code that can’t be improved somewhere... What can we do here?
Remember the get method we described above? The idea there was that if the key wasn’t in the dictionary, we would supply a default value to return. Here, we have a similar situation, except that we are setting the values in a dictionary. But if you look closely, you’ll see that the line
freqs[word] += 1
is equivalent to:
freqs[word] = freqs[word] + 1
which means that this line is both getting a value from a dictionary at a particular key and setting the value at the same key.
The trick to making this code simpler is to realize that when the key isn’t in the dictionary, we can use the get method to just return a count of 0.
Then the code simplifies to this:
words = [...] # as before
freqs = {}
for word in words:
    freqs[word] = freqs.get(word, 0) + 1
for (key, value) in freqs.items():
    print(f'Word {key} occurs {value} times')
We aren’t using the += operator any more, so line 4 is longer, but we’ve eliminated the if statement entirely. This counts as a win.
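As an aside, Python's standard library includes `collections.Counter`, a dictionary subclass built for this kind of counting. It is a different tool from the get-based loop above and is shown here only as a pointer; the sketch checks that both approaches produce the same counts for the word list used in this example.

```python
from collections import Counter

words = ['to', 'be', 'or', 'not', 'to', 'be', 'that', 'is', 'the', 'question']

# The approach from this reading: get with a default of 0.
freqs = {}
for word in words:
    freqs[word] = freqs.get(word, 0) + 1

# Counter does the same counting in one call.
counts = Counter(words)

print(counts['to'])            # 2
print(dict(counts) == freqs)   # True
```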
Conclusion
You may think that this is just another reading, but if you continue programming in Python we guarantee you that dictionaries will be one of the most useful things you ever learn. They are used everywhere, and learning to use them effectively will take you a long way towards becoming a good Python programmer.
[End of reading]