Data Structures and Algorithms with Object-Oriented Design Patterns in Python
next up previous index

Radix Sort

 

This section presents a sorting algorithm known as least-significant-digit-first radix sorting  . Radix sorting is based on the bucket sorting algorithm discussed in the preceding section. However, radix sorting is practical for much larger universal sets than it is practical to handle with a bucket sort.

Radix sorting can be used when each element of the universal set can be viewed as a sequences of digits (or letters or any other symbols). For example, we can represent each integer between 0 and 99 as a sequence of two, decimal digits. (For example, the number five is represented as ``05'').

To sort an array of two-digit numbers, the algorithm makes two sorting passes through the array. In the first pass, the elements of the array are sorted by the least significant decimal digit. In the second pass, the elements of the array are sorted by the most significant decimal digit. The key characteristic of the radix sort is that the second pass is done in such a way that it does not destroy the effect of the first pass. Consequently, after two passes through the array, the data is contained therein is sorted.

Each pass of the radix sort is implemented as a bucket sort. In the example we base the sort on decimal digits. Therefore, this is called a radix-10 sort and ten buckets are required to do each sorting pass.

Figure gif illustrates the operation of the radix-10 sort. The first radix sorting pass considers the least significant digits. As in the bucket sort, a single pass is made through the unsorted data, counting the number of times each decimal digit appears as the least-significant digit. For example, there are no elements that have a 0 as the least-significant digit; there are two elements that have a 1 as the least-significant digit; and so on.

   figure45420
Figure: Radix sorting.

After the counts have been determined, it is necessary to permute the input sequence so that it is sorted by the least-significant digits. To do this permutation efficiently, we compute the sequence of offsets given by

  equation46246

where R is the sorting radix. Note that tex2html_wrap_inline69821 is the position in the permuted sequence of the first occurrence of an element whose least significant digit is i. By making use of the offsets, it is possible to permute the input sequence by making a single pass through the sequence.

The second radix sorting pass considers the most significant digits. As above a single pass is made through the permuted data sequence counting the number of times each decimal digit appears as the most-significant digit. Then the sequence of offsets is computed as above. The sequence is permuted again using the offsets producing the final, sorted sequence.

In general, radix sorting can be used when the elements of the universe can be viewed as p-digit numbers with respect to some radix, R. That is, each element of the universe has the form

displaymath69809

where tex2html_wrap_inline69829 for tex2html_wrap_inline69831. In this case, the radix sort algorithm must make p sorting passes from the least significant digit, tex2html_wrap_inline69835, to the most significant digit, tex2html_wrap_inline69837, and each sorting pass uses exactly R counters.

Radix sorting can also be used when the universe can be viewed as the cross-product of a finite number of finite sets. That is, when the universe has the form

displaymath69810

where p>0 is a fixed integer constant and tex2html_wrap_inline68643 is a finite set for tex2html_wrap_inline69845. For example, each card in a 52-card deck of playing cards can be represented as an element of tex2html_wrap_inline69847, where tex2html_wrap_inline69849 and tex2html_wrap_inline69851.

Before we can sort over the universe U, we need to define what it means for one element to precede another in U. The usual way to do this is called lexicographic ordering . For example in the case of the playing cards we may say that one card precedes another if its suit precedes the other suit or if the suits are equal but the face value precedes that of the other.

In general, given the universe tex2html_wrap_inline69857, and two elements of U, say x and y, represented by the p-tuples tex2html_wrap_inline69867 and tex2html_wrap_inline69869, respectively, we say that x lexicographically precedes   y if there exists tex2html_wrap_inline69875 such that tex2html_wrap_inline69877 and tex2html_wrap_inline69879 for all tex2html_wrap_inline69881.

With this definition of precedence, we can radix sort a sequence of elements drawn from U by sorting with respect to the components of the p-tuples. Specifically, we sort first with respect to tex2html_wrap_inline69887, then tex2html_wrap_inline69889, and so on down to tex2html_wrap_inline69891. Notice that the algorithm does p sorting passes and in the tex2html_wrap_inline57847 pass it requires tex2html_wrap_inline69897 counters. For example to sort a deck of cards, two passes are required. In first pass the cards are sorted into 13 piles according to their face values. In the second pass the cards are sorted into four piles according to their suits.




next up previous index

Bruno Copyright © 2003, 2004 by Bruno R. Preiss, P.Eng. All rights reserved.