Data Structures and Algorithms with Object-Oriented Design Patterns in Python

Python Lists and Arrays

 

Probably the most common way to aggregate data in a Python program is to use a Python list. A Python list is an object that contains an ordered collection of objects. For example,

a = [0, 0, 0, 0, 0]
creates a Python list that comprises five plain integers (all zero) and assigns it to the variable a.

The elements of a Python list are accessed using integer-valued indices. The first or leftmost element of a Python list always has index zero. Thus, the five elements of list a are a[0], a[1], ..., a[4]. Python also supports the use of negative indices to index into a list from the right. The last or rightmost element of a Python list always has index -1. Thus, the five elements of list a can also be accessed as a[-5], a[-4], ..., a[-1].
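As a brief illustration of these indexing rules, the following sketch uses a list of five distinct values, chosen only so that each position can be told apart. The variable names are ours and are not part of the original example.

```python
# A list with five distinct values so each position is identifiable.
a = [10, 20, 30, 40, 50]

first = a[0]     # leftmost element
last = a[-1]     # rightmost element, the same element as a[4]

# For this five-element list, a[-k] names the same element as a[5 - k].
pairs_match = all(a[-k] == a[5 - k] for k in range(1, 6))
```

Here first is 10, last is 50, and pairs_match is True, since each negative index is simply another name for one of the positions 0 through 4.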

Python provides a built-in function called len that returns the length of any object (that has a length). When applied to a Python list, the len function returns the number of elements in that list. Thus, len(a) has the value 5.

Python checks at run-time that the index used to access a list element is valid. Valid indices are the integers from -n to n-1, inclusive, where n is the length of the list. If an index outside this range is used, an IndexError exception is raised.
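The run-time check can be observed directly. The sketch below probes one index on each side of both ends of the valid range; the helper is_valid_index is a hypothetical name introduced here only for illustration.

```python
a = [0, 0, 0, 0, 0]
n = len(a)

def is_valid_index(seq, i):
    """Return True if seq[i] succeeds, False if it raises IndexError.

    Hypothetical helper, used here only to demonstrate the bounds check.
    """
    try:
        seq[i]
    except IndexError:
        return False
    return True

# Probe the two boundary indices and the first index beyond each of them.
results = {i: is_valid_index(a, i) for i in (-n - 1, -n, n - 1, n)}
```

For the five-element list a, the indices -5 and 4 are accepted, whereas -6 and 5 each raise an IndexError.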

It is important to remember that in Python, an assignment statement assigns a name to an object. In particular, the sequence of statements

a = [0, 0, 0, 0, 0]
b = a
causes the variable b to refer to the same list object as variable a. Furthermore, the sequence of statements
x = 57
a[0] = x
a[1] = x
causes x, a[0], and a[1] all to refer to the same object.
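The consequences of this naming behavior can be checked with the is operator, which tests whether two expressions refer to the same object. The following is a minimal sketch; the names c and seen_through_a, and the use of list(a) to make an independent copy, are additions for illustration.

```python
a = [0, 0, 0, 0, 0]
b = a                    # b is a second name for the same list object

b[2] = 99                # a change made through b ...
seen_through_a = a[2]    # ... is visible through a

c = list(a)              # list(a) builds a new list holding the same elements
c[3] = -1                # changing the copy leaves a untouched

x = 57
a[0] = x
a[1] = x

same_list = a is b                      # True: one list, two names
separate_copy = c is not a              # True: c is a distinct list object
same_element = a[0] is x and a[1] is x  # True: both slots refer to x's object
```

Because b = a does not copy the list, a program that needs an independent list must create one explicitly, as is done for c above.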

How is a Python list represented in the memory of the computer? The specification of the Python language leaves many of the details up to the system implementers[49]. However, the figure below illustrates the typical implementation scenario.

Figure: Memory representation of a Python list.

A Python list represents a collection of objects. In this case, the objects are all plain integers. The list object actually contains an array of the identities (or addresses) of the objects in the collection. The array elements (the identities) typically occupy consecutive memory locations. That way, given an index i, it is possible to find the identity of the element at index i in constant time.

On the basis of the figure above, we can now estimate the total storage required to represent a Python list. Let S(n) be the total storage (memory) needed to represent a list of n objects. S(n) is given by

    S(n) = n × sizeof(object identity)

where the function sizeof(X) is the number of bytes used for the memory representation of an instance of an object of type X.

In the Python virtual machine, object identities (or addresses) are typically represented using fixed-size integers. Hence, sizeof(object identity) = O(1). In practice, a Python list may contain additional data items. For example, it is reasonable to expect that there is a datum that records the position in memory of the first array element. In any event, the overhead associated with a fixed number of additional data items is O(1). Therefore, S(n)=O(n).
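The linear growth of S(n) can be observed empirically. The sketch below assumes the CPython implementation, where sys.getsizeof reports the size of the list object together with its array of identities, but not the sizes of the objects those identities refer to. The helper name list_size_bytes and the sample sizes are our own choices, and the exact byte counts vary between implementations and platforms.

```python
import sys

def list_size_bytes(n):
    """Bytes reported by sys.getsizeof for a list holding n references.

    Hypothetical helper: under CPython this covers the list object and
    its array of references, not the referenced objects themselves.
    """
    return sys.getsizeof([0] * n)

sizes = {n: list_size_bytes(n) for n in (0, 1000, 2000, 4000)}

# Approximate cost in bytes of each additional element.
per_element = (sizes[4000] - sizes[2000]) / 2000

for n in sorted(sizes):
    print(n, sizes[n])
print("approximate bytes per element:", per_element)
```

The reported size for n = 0 corresponds to the fixed O(1) overhead, and the remainder grows in proportion to n, which is consistent with S(n)=O(n).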





Copyright © 2003, 2004 by Bruno R. Preiss, P.Eng. All rights reserved.