Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Parallel

Algorithm Alley: Unbiasing Random Bits


Listing Two implements the recursive variation. You can again define a function B(p) to represent the average number of fair bits you get per coin flip when the coin lands heads with probability p. For every two flips, you again get a fair bit with probability 2pq; this adds pq fair bits per coin flip, on average. If you don't get a fair bit, you get a new flip. Recall the coin gave two heads with probability p2 and two tails with probability q2, so on average you get (p2+q2)/2 additional recursive coin flips per original coin flip. Also, each coin flip at the new level is heads with probability p2/(p2+q2) and tails with probability q2/(p2+q2). (These values sum to 1, and preserve the proper ratio between two heads and two tails.)

Listing Two
Function ExtractBits ( Flips[0,NumFlips-1] )
    NumNewFlips = 0;
    for (j = 0; j < (n-1)/2; j++) {
        if ( Flips[2*j] == Heads ) and ( Flips[2*j+1] == Tails ) print 0;
        if ( Flips[2*j] == Tails ) and ( Flips[2*j+1] == Heads ) print 1;
        if ( Flips[2*j] == Heads ) and ( Flips[2*j+1] == Heads) {
             NewFlips[NumNewFlips++] = Heads;
        }
        if ( Flips[2*j] == Tails ) and ( Flips[2*j+1] == Tails) {
             NewFlips[NumNewFlips++] =Tails;
        }
    }
    if (NumNewFlips >= 2) ExtractBits (NewFlips[0,NumNewFlips-1]);
}

So now you can write the appropriate equation, as in >Figure 1(a). While this equation B(p) does not appear to have a simple closed form, you can use it to calculate specific values of B(p) recursively. Moreover, when p=q=1/2, you obtain B(p)=1/3, as shown in Figure 1(b). While this is an improvement for fair coins over the initial scheme, you are still far from the optimal of one fair bit per coin flip when the coin is fair. The recursive extraction removes some of the waste, but there is still more to be done.


Figure 1: Calculating B(p) for a simple recursive scheme.

Making Even More Use of Symmetry

There is another symmetry you can take advantage of. Consider when the coin lands heads-heads-heads-tails and heads-tails-heads-heads. Both sequences produce one fair bit and one flip for the next sequence in the simple recursive scheme. But you have not taken advantage of the order in which these two events happened; in the first sequence, the fair bit was produced second, and in the second sequence, it was produced first. The symmetry between these two situations can yield another fair bit.

An easy way to extract this extra bit is to again use the idea of creating an additional sequence of coin flips. From the original sequence, you can derive a new sequence that gives you a biased coin flip for each consecutive pair of original flips. If the two flips are the same, you get heads; if the two flips are different, you get tails. You then apply von Neumann's scheme to the new sequence as well as the original sequence; see Table 3. You can again glue together the two outputs to obtain 01010010.


Table 3: Introducing additional sequences yields more bits.

Now you can also use the other new sequence described previously. So when you start with an original sequence, use it to derive two further sequences as in Table 4. It turns out that if you glue all the bits together, you get independent and fair bits.


Table 4: Using the two sequences together.

Of course, you can go further by recursively extracting more bits from each sequence. That is, each new sequence should generate two further new sequences, and so on. This recursive construction is easy to code; see Listing Three.

Listing Three
Function ExtractBits ( Flips[0,NumFlips-1] )
    NumNewFlipsA = 0;
    NumNewFlipsB = 0;
    for (j = 0; j < (NumFlips-1)/2; j++) {
        if ( Flips[2*j] == Heads ) and ( Flips[2*j+1] == Tails ) {
            print 0;
            NewFlipsA[NumNewFlipsA++] = Heads;
        } 
        if ( Flips[2*j] == Tails) and ( Flips[2*j+1] == Heads ) {
            print 1;
            NewFlipsA[NumNewFlipsA++] = Heads;
        }
        if ( Flips[2*j] == Heads ) and ( Flips[2*j+1] == Heads) {
            NewFlipsB[NumNewFlipsB++] = Heads;
            NewFlipsA[NumNewFlipsA++] = Tails;
        }
        if ( Flips[2*j] == Tails ) and ( Flips[2*j+1] == Tails) {
            NewFlipsB[NumNewFlipsB++] = Tails;
            NewFlipsA[NumNewFlipsA++] = Tails;
        }
    }
    if (NumNewFlipsA >= 2) ExtractBits (NewFlipsA[0,NumNewFlipsA-1]);
    if (NumNewFlipsB >= 2) ExtractBits (NewFlipsB[0,NumNewFlipsB-1]);
}

All of the procedures to this point have been designed recursively, so the bits from one sequence are output before the bits from the derived sequences are output. Is this important? In fact, you must be somewhat careful to make sure that you do not introduce correlations by ordering the bits in an unusual fashion. The recursive approach is known to be safe, so I recommend sticking with it.

You can again write an equation that determines the long-term average number of bits produced per flip. The equation is similar to the previous case, except now for every two flips, we also get an additional flip in one of the derived sequences. This flip is heads with probability p2+q2, since it comes up heads whenever the pair of flips are the same. This results in the equation in Figure 2(a). As shown in Figure 2(b), this equation yields that B(1/2)=1. Now if you flip a fair coin, you expect in the long run to get out one fair bit per flip using this recursive procedure. You are finally doing essentially as good as you could hope for.


Figure 2: Calculating B(p) for the recursive scheme with two new sequences generated per sequence.

This process extracts the maximum number of fair bits possible for every value of p. To prove this requires knowing some information theory. You need to know that the maximum rate at which fair bits can be produced from biased bits is given by the entropy function; see Figure 3(a). The complex looking equation has a straightforward solution; B(p) equals the entropy H(p). You can verify this by plugging B(p)=H(p) into the recurrence. It is easiest to first calculate term by term, and to use the fact that when p+q=1, you have 1-p2-q2=2pq. Suppose B(p)=H(p). Then you first calculate easier expressions for the terms on the right side; see Figures 3(b) and 3(c). Now check that the right side comes out to H(p)=B(p); see Figure 3(d).


Figure 3: Verifying that B(p)=H(p), the entropy function.

Conclusion

Although I have presented some basic psuedocode, you might find it interesting to write more efficient versions on your own. There is a collection of implementations and related links at http://www.ciphergoth.org/software/unbiasing/, and Peres's work has been the subject of discussion on the sci.crypt newsgroup.

Acknowledgment

Michael's work is supported by NSF grant numbers CCR-9983832, CCR-0118701, and CCR-0121154.


Michael is a professor of computer science at Harvard University and can be contacted at [email protected] or http://www.eecs.harvard.edu/~michaelm/.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.