July 09, 2007
More thoughts about binary searching
Although I didn't realize it at the time, the fellow who talked to me about binary searching at the Monk's Inn had an important idea: Binary searching is an example of an algorithm that is not only unusually useful but also unusually hard to test.
For example, a fundamental part of the nature of any binary-search algorithm is that it relies on its input being already sorted. However, it is not feasable for a binary-search program to test whether its input is sorted, because doing so would take so long as to defeat the whole purpose of using a binary search in the first place.
So we already have a reliability problem: The program will not work unless the input meets the requirements, but it must not check whether those requirements are actually met.
Suppose now that we are going to test a binary-search program. Will our tests include giving it unsorted input? If so, what does it mean to say that the program is working correctly? In other words, how does our test program distinguish between a search program that works correctly on incorrect input and one that doesn't?
It might be reasonable to require that a binary-search program not crash even when presented with unsorted input. But that notion just begs the question of what the program should do, because it's hard to check whether a program "does not crash." The trouble here is that programs can do things that the language does not define, and such actions might cause the program to crash on one occasion and not on another.
The only conclusion I can see is that if we are going to say that a program "must not crash," then we will have to look inside that program and insert code that verifies that it performs no undefined operations. In the case of a binary search, for example, we will have to instrument the code to verify that whenever we access an element of the array being searched, that element actually exists and is not out of bounds.
This conclusion is a special case of a more general notion: To test a program, it may be necessary to insert code into it that verifies that the program limits itself to well-defined operations.
Posted by Andrew Koenig at 10:34 AM Permalink
|