Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Algorithm Alley


Jan00: Algorithm Alley

Tim is a visiting associate professor in the Computer Science Department of Eastern Washington University. He can be contacted at [email protected].


Card shuffling is an example of putting a fixed number of items into completely random order. The method used to shuffle cards applies to any programming circumstance where you need to randomize the order of a fixed number of items, including Scrabble tiles, dominoes, standard playing cards, Tarot cards, or lottery numbers, for example.

You can view this randomization of items as randomizing the permutation of entries in an array. In this article, I'll examine a couple of randomizing algorithms -- one that does not generate all permutations with equal probability, and another that does. Both algorithms are based on working through the array, swapping entries in the process.

The first algorithm focuses on proceeding through the array, randomly positioning the value found in each successive position. With this algorithm, as you proceed through the N positions of the array (source positions), you choose the target position randomly from the N possible positions in the array. The values in the source and target positions are then interchanged. H.M. Deitel and P.J. Deitel refer to this as their "high-performance card shuffling" algorithm in their book C++: How to Program (Prentice-Hall, 1998). In C++, this process looks like Example 1 (presuming a function or macro Swap (L,R)). At first glance, it appears that this algorithm generates all permutations with equal probability. On examination, however, you can see that this generates NN rearrangements of elements -- each of the N iterations of the loop positions a value among the N available positions -- even though there are only N! possible permutations of N elements. For each permutation there are a number of ways to generate that permutation -- an average of NN/N! ways.

The second algorithm focuses on choosing a random value for each position in the array. Once the value is placed in that position, the position (and the value) no longer participates in the shuffling. While you could start at either end of the array, the code is simpler if the region for choosing a source value always begins with subscript 0. To do that, you start with the rightmost position in the array as the target and randomly select a source position from the front of the array to that target position. You interchange the values, then treat the array as shortened by one position and do the same thing over again. There are no choices left to be made when the remaining array has a single element. Example 2 is C++ code that implements this approach. Examination of the structure of this loop shows that it generates N! rearrangements of elements. All permutations are equally likely, aside from the minor deviation from uniform distribution by selecting a random value between 0 and Dest as (rand()%(Dest+1)).

A natural question is "How far from the uniform distribution are the permutations from the NN algorithm discussed first?" To answer this, I'll develop a method of numbering permutations. Then all of the permutations generated by the NN algorithm can be generated and counted. Since, in a permutation, each position has one fewer choice than the one before it, you can borrow from programming the notion of a multidimensional array, with dimensions: [N] [N-1] [N-2]...[3] [2] [1].

You can then map subscripts for such an N-dimensional array onto a one- dimensional offset. Each permutation can then be viewed as N times selecting elements from those remaining (as selecting from position [0], [1], and so on). At first, there are N elements to choose from, then (N-1), then (N-2), down to 1. Viewed as subscripts in an N-dimensional array, these positions of elements chosen will generate offsets (permutation indices) between 0 and (N!-1). A further advantage of the numbering is that, if the original permutation is in increasing order, it numbers the permutations in lexicographic order; that is, something like alphabetical or dictionary order. Thus the permutation with all elements in ascending order is numbered "0," while the permutation with all elements in descending order is numbered "N!-1."

Figure 1 illustrates how, in the permutation of eight elements, the 88(16777216) rearrangements from the first algorithm map onto the 8! (40320) possible permutations. The file mapperms.cpp (available electronically; see "Resource Center," page 5) implements this process. Table 1 shows that there are some individual permutations that are markedly more likely than others. While most appear to cluster between 200 and 600, the pattern is by no means random. Applying a moving average to these data, you can show some underlying regularities in the data, again indicating that the data are far from random. In Figure 2, a moving window of about 0.5 percent of all the data is applied (201 cells averaged and assigned to the central point). Another way that you can get a feel for the distribution is by sorting those 40320 data points; see Figure 3.

Another approach is to investigate, in the mapping of these NN reorderings onto the N! permutations, the probability (fraction of the total) for each position in the initial string to end up in each position in the shuffled string. Table 2 shows those probabilities. In it, each column and each row totals to 100 percent. Down the columns, each position contains one of the eight available characters. Across the rows, each character shows up on one of the eight positions. The first character in the source string shows equal probability of ending up in all eight positions of the rearranged string, while all other characters in the source string show varying probabilities for their positions in the rearranged string.

Also, you see that only the final position in the rearranged string has an equal probability of receiving all of the characters of the source string. Figure 4 displays Table 2 with the eight series taken from the rows.

Acknowledgment

Thanks to Dr. Ray Hamel of the Eastern Washington University Computer Science Department for his comments, specifically for the idea of examining the relationships between the initial and final positions of items in the rearrangements.

DDJ


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.