CS 1 Reading 8: Lists

What a list is

In this reading, we’re going to talk about Python’s lists. As is typical for programming jargon, the word "list" means something very different in programming than it does in everyday life. In everyday life, a "list" is something we use to write down an ordered sequence of things. (For instance, on a "to-do list" we might write down all the things we have to do this week.) So you might write down this to-do list:

Work on CS 1 homework.
Work on Math homework.
Work on Phys homework.
Work on Chem homework.
Binge watch Netflix for 10 hours.

In a programming language, a list is a kind of data object that can store a group of other data objects in a sequence. The data objects that are stored inside the list are called the elements of the list, or alternatively the list items. Here’s an example of a Python list:

courses = ['CS 1', 'Ma 1', 'Ph 1', 'Chem 1']

The list is the collection of items inside the square brackets, separated by commas.

The items in the list can be retrieved if you know their position in the list, and you can change the item at a particular position. You can also make a list bigger by adding more items to the end.

Many programming languages restrict lists so that any particular list can only hold one kind of data object (for instance, a "list of integers"). Python is more flexible: you can have a list containing many different kinds of data (including other lists). Nevertheless, most of the time, Python lists only contain a single kind of data.

Often we will be using a list as a single Python object. Python objects that contain other Python objects, like lists, are referred to as data structures. There are a number of other data structures that we’ll meet soon, including tuples, dictionaries, and sets. One nice thing about Python is that working with these data structures is much more convenient than in many other languages, which makes programming easier.

A simple example

It’s very common when programming to have many related values that you’d like to store in a single object. For instance, you might be doing weather research and need to store the average temperature for each day of the week. You could define variables for each day...

temp_sunday    = 59.6
temp_monday    = 72.4
temp_tuesday   = 68.5
temp_wednesday = 79.0
temp_thursday  = 66.4
temp_friday    = 77.1
temp_saturday  = -126.0  # Whoa, what happened here?

...but this would quickly become tedious. Even something as simple as finding the average temperature for these days would be difficult:

avg_temp = (temp_sunday + temp_monday + temp_tuesday + temp_wednesday + \
            temp_thursday + temp_friday + temp_saturday) / 7

A better way would be to collect all the temperatures in a single data structure (a list):

temps = [59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]

Now, finding the average is easy:

avg_temp = sum(temps) / 7

(We cheated a bit by using the built-in sum function. Once we learn about loops we’ll be able to write sum ourselves.)

The point of lists is to make it as easy to work with groups of values as it is to work with single values. Lists are used everywhere in Python code, and there are lots of predefined functions and methods for working with them, some of which we’ll learn about below. ^[1]

Creating lists

A literal list is a list that you write directly by giving the values of all the elements. The temps list we defined above is an example of a literal list. A literal list is surrounded by square brackets, and the list elements are separated by commas.

You don’t have to have literal values in a list, however. You can make a list of any kind of Python expressions, which will be evaluated when the list is created. Examples:

temp_sunday    = 59.6
temp_monday    = 72.4
temp_tuesday   = 68.5
temp_wednesday = 79.0
temp_thursday  = 66.4
temp_friday    = 77.1
temp_saturday  = -126.0

# Make a list from variables.
temps = [temp_sunday, temp_monday, temp_tuesday, temp_wednesday, \
         temp_thursday, temp_friday, temp_saturday]

# Make a list of powers of a number.
x = 42.8
x_list = [x, x * x, x * x * x, x * x * x * x, x ** 5]

# Make a list of trigonometric functions applied to pi.
from math import sin, cos, tan, pi
vals = [sin(pi), cos(pi), tan(pi)]

# Make a list from other lists.
lol = [[1, 2], [3, 4], [5, 6]]
nested_lists = [[1], [[2, 3], [[4, 5, 6], [7, 8, 9, 10]]]]

# Make a list with different types of data.
hlist = [1, 2.5, True, 'foobar', [3, 4, 5]]

There are other ways to create lists, which we’ll see below.

Empty list

An empty list is written as open/close square brackets:

empty = []

We’ll see uses for this shortly.

Accessing list elements

Once you’ve created a list, you need to be able to extract (access) the elements of the list. Python also uses square brackets with a particular syntax to get the list elements:

temps = [59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]
t0 = temps[0]   # 59.6

The syntax to access a list element is:

the name of the list (here, temps)
the location (or index) of the element, as an integer in square brackets (here, [0])

One peculiarity of list indexing is that the first element is "element 0". In computer programming, unlike in math, we almost always start counting from 0, not 1. ^[2]

The index doesn’t have to be a literal integer; it can also be a Python expression which evaluates to an integer.

temps = [59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]
t3 = temps[3]  # 79.0
t3a = temps[2 + 1]  # same
t3b = temps[2 ** 8 - 253]  # same

In this case, Python evaluates the index expression to an integer before doing the list access. (If the expression doesn’t evaluate to an integer, it’s an error.)

Index errors

If a list index is too large, Python signals an error:

>>> temps = [59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]
>>> temps[6]
-126.0
>>> temps[7]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Most errors in Python give rise to exceptions. We will discuss exceptions and exception handling in detail later in the course. When an error situation arises which leads to an exception, we say that Python "raises an exception". We’ll use that terminology from now on. The IndexError in the error message is an example of a Python exception that is raised when you try to access a value off the end of a list. The Traceback stuff is also related to exception handling; for now, you can ignore it.

Note, though, that not all exceptions represent errors. We’ll see examples of this in later readings too. And, of course, not all errors end up raising exceptions. ^[3]

Negative indices

Python has a neat feature not found in most programming languages: you can use negative numbers as list indices. If you do, Python will access elements starting from the end of the list. So an index of -1 means the last element of the list, an index of -2 means the second-last, etc.

temps = [59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]
tlast = temps[-1]   # -126.0
t2last = temps[-2]  # 77.1

Note, though, that you can’t go too far back, or Python raises an IndexError exception:

>>> temps = [59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]
>>> temps[0]
59.6
>>> temps[-7]
59.6
>>> temps[-8]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Python will not "wrap around" negative list indices if they go past the first element.

Modifying list elements

When we talked about strings, we pointed out that in Python strings are immutable: you can’t change the characters inside a string once it’s been created. In contrast, lists are always mutable: you can change (modify) a list element at any time. To modify a list element, just put the list element on the left-hand side of an assignment:

>>> temps = [59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]
>>> temps[0]
59.6
>>> temps[0] = 75.0
>>> temps[0]
75.0

You can even change a list element to an element of a different type:

>>> temps = [59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]
>>> temps[6] = 'really cold'
>>> temps
[59.6, 72.4, 68.5, 79.0, 66.4, 77.1, 'really cold']

This is very rarely a good idea. When it happens, it’s usually because of a bug in your code.

List operators: `+` and `*`

Much like strings, lists have operators that you can use on them. Many of these operators behave like their string counterparts. For instance, you can concatenate lists using the + operator:

>>> temps1 = [59.6, 72.4, 68.5]
>>> temps2 = [79.0, 66.4, 77.1, -126.0]
>>> temps = temps1 + temps2
>>> temps
[59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]
>>> temps + []
[59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]
>>> [] + temps
[59.6, 72.4, 68.5, 79.0, 66.4, 77.1, -126.0]

You can use the * operator with an integer on either side to replicate the elements of a list a certain number of times:

>>> [0] * 10
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> 5 * [42]
[42, 42, 42, 42, 42]
>>> 0 * [100]
[]
>>> 100 * []
[]

Notice that both the + and * operators do the same kind of thing for both strings and lists. That’s because strings and lists are both Python sequences, and Python tries hard to make sequences behave in similar ways as much as possible.

There is another important list operator called in that we will meet in a later reading.

List functions

`len`

The len function applied to a list argument returns the length of the list:

>>> lst = [1, 2, 3, 4, 5]
>>> len(lst)
5
>>> len([])
0

`list`

The list function converts other kinds of sequences to lists. The only other sequence type we know about is strings; let’s see what list does to a string:

>>> s = 'Monty Python'
>>> list(s)
['M', 'o', 'n', 't', 'y', ' ', 'P', 'y', 't', 'h', 'o', 'n']

This is the easiest way to split a string into letters. It’s also useful for something else: a string is immutable but a list is mutable. So if you really want to change a letter in a string, you can convert it to a list, change the list, and then convert the list back to a string using the join method on strings. ^[4] That might look like this:

>>> s = 'Monty Python'
>>> lst = list(s)
['M', 'o', 'n', 't', 'y', ' ', 'P', 'y', 't', 'h', 'o', 'n']
>>> lst[6] = 'B'
>>> lst
['M', 'o', 'n', 't', 'y', ' ', 'B', 'y', 't', 'h', 'o', 'n']
>>> ''.join(lst)  # trust us, this works
'Monty Bython'

List methods

Lists are Python objects, so they can have methods like any other object. There are a lot of list methods; some of the most useful ones are briefly described here. (Note: we won’t describe every aspect of these methods; see the official Python documentation for that.)

`append`

To add an element to the end of a list, use the append method:

>>> lst = []
>>> lst.append(1)
>>> lst
[1]
>>> lst.append(2)
>>> lst
[1, 2]
>>> lst.append(3)
>>> lst
[1, 2, 3]

This is probably the most-used list method. It’s very common to use it the way we’ve used it above: start with an empty list and add elements to it until you end up with the desired list. Notice that the append method doesn’t return anything; it just changes the list.

`pop`

To remove elements from the end of a list, use the pop method:

>>> lst = [1, 2, 3]
>>> lst.pop()
3
>>> lst
[1, 2]

The pop method is unusual in that it returns a value (the element "popped" off the end of the list) and also changes the list itself (by removing the last element). ^[5]

`index`

To find an element’s location (also called its index) in a list, use the index method:

>>> s = 'Monty Python'
>>> lst = list(s)
>>> lst
['M', 'o', 'n', 't', 'y', ' ', 'P', 'y', 't', 'h', 'o', 'n']
>>> lst.index('y')
4
>>> lst.index('z')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 'z' is not in list

The first call to lst.index returns 4 because the first 'y' character is at index 4. (If there are more than one of a particular element in the list, index returns the index of the first one.) Since there are no 'z' characters in the list lst, lst.index('z') raises an exception.

A ValueError exception is usually raised when a function gets an argument that is of the correct type but isn’t valid for some other reason. Here, the reason is that the method is supposed to only be called with arguments that are elements of the list.

`remove`

To remove a value from a list, use the remove method. It only removes the first occurrence of the value from the list. It doesn’t return anything, but changes the list in place. If the value doesn’t exist in the list, a ValueError exception is raised.

>>> lst = [1, 2, 3, 1, 2, 3]
>>> lst.remove(2)
>>> lst
[1, 3, 1, 2, 3]
>>> lst.remove(2)
>>> lst
[1, 3, 1, 3]
>>> lst.remove(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list

`count`

The count method returns the number of times a value is found in a list. The list is not altered.

>>> lst = [1, 2, 3, 2, 4, 1, 5, 2, 1]
>>> lst.count(1)
3
>>> lst.count(2)
3
>>> lst.count(5)
1
>>> lst.count(100)
0

`sort` and `reverse`

These two methods change the list in place; they don’t return anything. The sort method sorts the list into ascending order. ^[6] The reverse method reverses the list.

>>> lst = [5, 1, 4, 2, 6, 3, 5, 1, 2]
>>> lst.sort()
>>> lst
[1, 1, 2, 2, 3, 4, 5, 5, 6]
>>> lst.reverse()
>>> lst
[6, 5, 5, 4, 3, 2, 2, 1, 1]

Pitfall: aliasing

When you assign a variable containing a list to another variable, it looks like you might be copying it. In fact, you aren’t — all you’re doing is giving another name to the same list. This can be a shock if you aren’t expecting it. For instance:

>>> nums = [4, 6, 1503, 2, -3]
>>> nums2 = nums  # copy of nums?
>>> nums2[0] = 0
>>> nums2
[0, 6, 1503, 2, -3]
>>> nums
[0, 6, 1503, 2, -3]   # !!!

When we assign the list nums to nums2, we are not making a copy of nums. We are just giving another name to the same list. So when we change a value in the list using one of the names, the other variable sees the change. This kind of phenomenon is called aliasing and it’s a real pain! The best way to avoid it is to make sure you understand what Python assignment really means, and copy a list if you really need two independent lists.

We will show you how to copy lists in a few readings. ^[7]

One reason this may confuse you is that you might think that a Python variable is a "location in memory" where you can copy a value. In fact, this isn’t true at all! (In some languages, like C, it is true, but not in Python.) In Python, assigning a value to a variable doesn’t copy anything; it just creates a name which refers to the value. If you understand this, aliasing will be much less of a problem for you.

To avoid confusion in the future, we won’t say that a variable "contains" a value; instead we’ll say that a variable "is bound to" a value or "refers to" a value.

Nested lists

You can have lists within lists; these are called "nested lists". To get values from one of the inner lists, you have to double-up on the square bracket syntax.

>>> lst = [[1, 2], [3, 4]]
>>> lst[0][0]
1
>>> lst[0][1]
2
>>> lst[1][0]
3
>>> lst[1][1]
4

When you read syntax like lst[0][0], read it like this: (lst[0])[0]. In other words, the lst[0] picks out the sublist [1, 2], and the final [0] picks out the number 1 from the sublist.

You can use the same syntax on the left-hand side of an assignment statement to change a value in an inner list.

>>> lst[0][0] = 100
>>> lst
[[100, 2], [3, 4]]

List puzzles

Because of aliasing, strange things can happen with nested lists. Can you explain why the following code entered into the Python interpreter behaves the way it does? (Feel free to ask a TA or the instructors if you’re not sure.)

>>> lst1 = [1, 2, 3]
>>> lst2 = [lst1, lst1]
>>> lst2
[[1, 2, 3], [1, 2, 3]]
>>> lst1[0] = 42
>>> lst1
[42, 2, 3]

Nothing odd so far. Let’s continue the same interpreter session.

>>> lst2
[[42, 2, 3], [42, 2, 3]]  # ?!?
>>> lst2[0][0] = 1
>>> lst2
[[1, 2, 3], [1, 2, 3]]    # ?!?
>>> lst1
[1, 2, 3]                 # ?!?

Here’s another puzzle.

>>> lst1 = [1, 2, 3]
>>> lst2 = lst1 * 3
>>> lst2
[1, 2, 3, 1, 2, 3, 1, 2, 3]
>>> lst3 = [lst1] * 3
>>> lst3
[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
>>> lst1[0] = 42
>>> lst1
[42, 2, 3]
>>> lst2
[1, 2, 3, 1, 2, 3, 1, 2, 3]
>>> lst3
[[42, 2, 3], [42, 2, 3], [42, 2, 3]]

Why do all the 1s in lst3 change to 42 but the 1s in lst2 don’t?

[End of reading]

1. If you’ve programmed in other languages, be aware that what Python calls a "list" is pretty much the same as what other languages call an "array". The main difference is that Python lists can easily be expanded, which is not always true for arrays in other languages.

2. If you learn the C language (e.g. by taking CS 3 or the CS 11 C track), you will find out why this is.

3. That’s kind of a shame. If errors always ended up raising exceptions, we would know that our programs were correct as long as there are no exceptions raised when we run them! Alas, it’s not that simple.

4. We haven’t talked about the join method yet, but you can look it up in the Python online documentation.

5. The word "pop" comes from the stack data structure, which you’ll learn about in CS 2. Python lists can be used as stacks.

6. There are other ways to sort using this method (e.g. in descending order). See the Python documentation for details.

7. There are actually two distinct ways to copy lists. The easy way works for most lists, and the harder way works for all lists. We’ll see both of them.