Data Structures and Algorithms with Object-Oriented Design Patterns in Python

Hash Function Implementations

 

The preceding section presents methods of hashing integer-valued keys. In reality, we cannot expect that the keys will always be integers. Depending on the application, the keys might be letters, character strings or even more complex data structures such as Associations or Containers.

In general, given a set of keys, K, and a positive constant, M, a hash function is a function of the form

$$h : K \to \{0, 1, \ldots, M-1\}$$

In practice, it is convenient to implement the hash function h as the composition of two functions f and g. The function f maps keys into integers:

$$f : K \to \mathbb{Z}$$

where $\mathbb{Z}$ is the set of integers. The function g maps non-negative integers into $\{0, 1, \ldots, M-1\}$:

$$g : \mathbb{Z}_{\ge 0} \to \{0, 1, \ldots, M-1\}$$

Given appropriate functions f and g, the hash function h is simply defined as the composition of those functions:

$$h = g \circ f$$

That is, the hash value of a key x is given by g(f(x)).
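As a minimal sketch of this decomposition, the following Python fragment composes a hypothetical f for character strings with a division-style g. The names f, g and h follow the text. The particular choice of f (reading the key's UTF-8 bytes as one large integer) and the table size M = 13 are assumptions made purely for illustration.

```python
M = 13  # assumed table size, chosen only for this illustration


def f(key):
    """Map a string key to a non-negative integer (hypothetical choice)."""
    # Interpret the key's UTF-8 bytes as a single base-256 number.
    return int.from_bytes(key.encode("utf-8"), "big")


def g(x):
    """Map a non-negative integer into {0, 1, ..., M-1} by division."""
    return x % M


def h(key):
    """The hash function h is the composition of g and f."""
    return g(f(key))
```

For example, f("a") is 97 and h("a") is 97 mod 13, that is, 6.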

By decomposing the function h in this way, we can separate the problem into two parts. The first involves finding a suitable mapping from the set of keys K to the non-negative integers. The second involves mapping non-negative integers into the interval [0,M-1]. Ideally, the two problems would be unrelated. That is, the choice of the function f would not depend on the choice of g and vice versa. Unfortunately, this is not always the case. However, if we are careful, we can design the functions in such a way that $h = g \circ f$ is a good hash function.
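A small, contrived illustration of how the two choices can interact: suppose a hypothetical f only ever produces even integers, and g reduces its argument modulo M. Both functions, the sample keys and the values of M below are assumptions chosen only to make the effect visible.

```python
def f_even(key):
    """Hypothetical f that always yields an even non-negative integer."""
    return 2 * len(key)


def g(x, M):
    """Division-style g: reduce x into {0, 1, ..., M-1}."""
    return x % M


keys = ["a", "bb", "ccc", "dddd", "eeeee", "ffffff", "ggggggg", "hhhhhhhh"]

# With an even M, even inputs can only ever land in even slots.
slots_even_M = {g(f_even(k), 8) for k in keys}

# With an odd M, the same f reaches every slot.
slots_odd_M = {g(f_even(k), 7) for k in keys}
```

With M = 8, only slots 0, 2, 4 and 6 are ever used, so half of the table is unreachable. With M = 7, the same keys cover all seven slots. Neither f nor g changed in isolation. What matters is how the two fit together.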

The hashing methods discussed in the preceding section deal with integer-valued keys. But this is precisely the domain of the function g. Consequently, we have already examined several different alternatives for the function g. On the other hand, the choice of a suitable function for f depends on the characteristics of its domain.

In the following sections, we consider several domains (sets of keys) and develop a suitable hash function for each of them. Each domain considered is implemented as a Python class that provides a method called __hash__. The __hash__ method corresponds to the function f, which maps keys into integers.

The Python built-in hash function automatically calls the __hash__ method of a class that provides one. Thus, if obj is an instance of a class that provides a __hash__ method, the following Python statements are equivalent:

hash(obj)
obj.__hash__()
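The following sketch shows the idea with a hypothetical key class, Point, which is not one of the classes developed in the following sections. Its __hash__ method plays the role of f. The multiplier 31 is an arbitrary assumption. Note also that in current versions of CPython the built-in hash may truncate very large return values to the platform's hash width, so the two calls agree only for values that fit in that range, as is the case here.

```python
class Point:
    """Hypothetical key type used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Keys that compare equal must also hash equal.
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Plays the role of f: map the key to an integer.
        return 31 * self.x + self.y


p = Point(3, 4)
same = hash(p) == p.__hash__()  # the built-in delegates to the method

# Because Point is hashable, it can be used as a dictionary key.
labels = {p: "corner"}
found = labels[Point(3, 4)]
```

Here both calls return 31 * 3 + 4 = 97, and the dictionary lookup succeeds for an equal but distinct Point instance.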





Copyright © 2003, 2004 by Bruno R. Preiss, P.Eng. All rights reserved.