Data Structures and Algorithms with Object-Oriented Design Patterns in Python
next up previous index

Floating-Point Keys

Dealing with floating-point number involves a little more work. In Python the floating-point numbers are represented using the IEEE double-precision format (64 bits). We seek a function f which maps a floating-point value into a non-negative integer. A characteristic of floating-point numbers that must be dealt with is the extremely wide range of values which can be represented. For example, when using IEEE floating-point, the smallest double precision quantity that can be represented is tex2html_wrap_inline61969 and the largest is tex2html_wrap_inline61971. Somehow we need to map values in this large domain into the range of an int.

Every non-zero floating-point quantity x can be written uniquely as

displaymath61965

where tex2html_wrap_inline61975, tex2html_wrap_inline61977 and tex2html_wrap_inline61979. The quantity s is called the sign , m is called the mantissa  or significant  and e is called the exponent . This suggests the following definition for the function f:

  equation10784

where tex2html_wrap_inline61989.

This hashing method is best understood by considering the conditions under which a collision occurs between two distinct floating-point numbers x and y. Let tex2html_wrap_inline61995 and tex2html_wrap_inline61997 be the mantissas of x and y, respectively. The collision occurs when f(x)=f(y).

eqnarray10792

Thus, x and y collide if their mantissas differ by less than 1/2W. Notice that the sign of the number is not considered. Thus, x and -x collide Also, the exponent is not considered. Therefore, if x and y collide, then so too do x and tex2html_wrap_inline62021 for all permissible values of k.

Program gif introduces the class Float which extends the built-in float class as well as the abstract Object class introduced in Program gif. In this case, the __hash__ method computes the hash function defined in Equation gif.

   program10803
Program: Float class __hash__ method.

This implementation uses the frexp function provided in the standard Python math module. This function returns the mantissa, m, and the exponent, e, of a given floating-point number. In the IEEE standard floating-point format the least-significant 52 bits of a 64-bit floating-point number represent the quantity tex2html_wrap_inline62025. Since tex2html_wrap_inline61989, we can rewrite Equation gif as follows:

eqnarray10821

Thus, we can compute the hash function by computing the integer m' and then shifting the binary representation of that integer 20 bits to the right as shown in Program gif. Clearly the running time of the __hash__ method is O(1).


next up previous index

Bruno Copyright © 2003, 2004 by Bruno R. Preiss, P.Eng. All rights reserved.