|
Data Structures and Algorithms
with Object-Oriented Design Patterns in Python |
Dealing with floating-point number involves a little more work.
In Python the floating-point numbers are represented
using the IEEE double-precision format (64 bits).
We seek a function f which maps a floating-point value
into a non-negative integer.
A characteristic of floating-point numbers that must be dealt with
is the extremely wide range of values which can be represented.
For example, when using IEEE floating-point,
the smallest double precision quantity that can be represented is
and the largest is
.
Somehow we need to map values in this large domain
into the range of an int.
Every non-zero floating-point quantity x can be written uniquely as
![]()
where
,
and
.
The quantity s is called the sign ,
m is called the mantissa
or significant
and e is called the exponent .
This suggests the following definition for the function f:
where
.
This hashing method is best understood by considering
the conditions under which a collision occurs
between two distinct floating-point numbers x and y.
Let
and
be the mantissas of x and y, respectively.
The collision occurs when f(x)=f(y).

Thus, x and y collide if their mantissas differ by less than 1/2W.
Notice that the sign of the number is not considered.
Thus, x and -x collide
Also, the exponent is not considered.
Therefore, if x and y collide,
then so too do x and
for all permissible values of k.
Program
introduces the class Float
which extends the built-in float class
as well as the abstract Object class
introduced in Program
.
In this case,
the __hash__ method computes the
hash function defined in Equation
.

Program: Float class __hash__ method.
This implementation uses the frexp function
provided in the standard Python math module.
This function returns the mantissa, m,
and the exponent, e,
of a given floating-point number.
In the IEEE standard floating-point format
the least-significant 52 bits of a 64-bit floating-point number
represent the quantity
.
Since
,
we can rewrite Equation
as follows:

Thus, we can compute the hash function by computing the integer m'
and then shifting the binary representation of that integer
20 bits to the right as shown in Program
.
Clearly the running time of the __hash__ method is O(1).