Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Security

Doing it Randomly: Probabilistic Algorithms in Programming


Finding Large Primes

Everyone has written a factorization program. It works great for small numbers, but tends to bog down with thousand-bit numbers. However, if you're only interested in whether or not a number is prime (and not what its prime factors are), there are ways to use probability and number theory to quickly check if a number is prime.

There are two probabilistic primality testers: Gary Miller and Michael Rabin's method and Robert Solovay and Volker Strassen's method. The Miller-Rabin method uses results from number theory to generate a series of "tests" of primality. The chance of a number passing a test and not being prime is less than one half. And each test is orthogonal, so the chance of a number passing n tests and not being price is 1/(2").

To test a number, n, for primality, first generate a random integer k, such that 2 < k < n -1. Perform the test on n with k. If n passes the test, k is said to be a "witness" of n's primality (the probability of this happening if n is not prime is less than 50 percent). If k fails the test, then n is definitely not prime. If there are 100 witness to k's primality, the probability of n not being prime is less than the probability of random quantum fluctuations crashing your program.

Listing 1 is the algorithm with 100 witnesses. Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest go into the number theory in detail. Remember, to generate prime numbers of any useful length for cryptography, you will need some kind of package that works with very large numbers.

Miller-Rabin (n)
     FOR i := 1 to 100
          DO
              a := (a random number between 1 and n-1)
              d := 1
              FOR j := (number of binary digits of n-1) DOWNTO 0
                   DO
	               x := d
                       d := (d*d) mod n
                       IF d = 1 AND x<>1 AND x<>n-1
                       THEN RETURN "n is definitely composite"
                       IF (the ith binary digit of n-1) = 1
                       THEN d := (d*a) mod n
             IF d <> 1 THEN RETURN "n is definitely composite"
   RETURN "n is almost definetely prime"
Listing 1

Pattern Matching

If you're searching for a long pattern string inside an even longer text string, comparing the two strings bit for bit can get tedious. Use a CRC funtion to hash the pattern into a 32-bit number. Then, hash each of the possible text substrings using the same CRC function. Now, all you have to do is compare the pattern's hash with each of the substrings' hashes. If the hashes are the same, then the text substring equals the pattern.

The chance of two different strings hashing to the same 32-bit value is 1 in 2^32. If you're really paranoid, you could compare matching strings bit for bit. But since you're more likely to get killed in a car accident on the drive home than to have two different strings hash to the same value, I would save my worrying for more important things.

The other benefit of this algorithm over other techniques is that there are no tables to precompute. You don't need to keep the entire text is memory. The algorithm can even run in real time, constantly checking text data as it comes in over a communications port or is read in from tape, possibly looking for some obscure high-energy physics event.

Still More Randomness

There are other applications of randomness in computer science. Cryptography is just teeming with them. Probabilistic algorithms are used in fault-tolerant systems. Some LAN protocols are based on them. They might seem like interlopers in the nice, deterministic world of programming, but they are here to stay.

Suggested Reading

Cormen, Thomas H., Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. New York, N.Y.: MIT Press/McGraw Hill, 1990.

Harel, David. The Science of Computing, New York, N.Y.: Addison-Wesley Publishing Co., 1990.

Lehman, D. and Michael O. Rabin. "On the advantages of free choice: A symmetric and fully distributed solution of the dining philosophers problem," Proceedings of the Eighth ACM Symposium on the Principles of Programming Languages. (1981): 133-138.

Miller, Gary. "Riemann's hypothesis and tests for primality," Journal of Computer and System Sciences, Vol. 13. (1976):300-317.

Rabin, Michael O. "Probabilistic algorithms for testing primality," Journal of Number Theory, Vol. 12. (1980):128-138.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.