Site Archive (Complete)
C++
SELECTIVE IGNORANCE

Finding the Signal in the Noise

by Andrew Koenig

October 2006


October 25, 2006

Abstraction and performance -- a bit of history, part 2


As a demonstration, I had written a topological-sorting program in C++, using my newly written associative-array library. How did its performance compare with the C version that came with the system?

If the performance turned out to be as good, or even close, that would be wonderful. After all, the C version was much more code than the C++ version, and programmer time is more important than computer time.

However, the C++ version had two major arguments against it.

First, it used classes for all of its string processing, which meant that every time a string was allocated or deallocated, the string library made a trip to the system's memory allocator. The C version could allocate memory in big chunks, pick off what it needed, and then not bother to give it back to the system at all. After all, its authors knew that it would consume only a limited amount of string memory, and that all of that memory would automatically go back to the operating system when the program finished.

The second drawback of the C++ version was that the I/O library was quite unsophisticated at the time. In particular, it did no input buffering at all--instead, it called the operating system for each character of input. This system call executed hundreds of instructions; I don't remember how many. The effect of this strategy was to make any C++ program that did input run much more slowly on the input part than did its C counterpart.

I had known about these drawbacks of the C++ library, but hadn't thought that much about them when I was working on the associative-array library. After all, that was someone else's code, and the author was sure to improve it eventually, as users began clamoring for better performance.

Still, here I was, being asked about performance of a program that was being bogged down by two parts of the library that I did not write. Was there anything I could do to offset the disadvantage?

There was one thing: The C topological-sort program used singly-linked lists instead of associative arrays. Therefore, in the cases where I expected order n log(n) performance, the C program could be expected to run in quadratic time. In principle, therefore, if I could find a realistic excuse to give the two programs large enough input, the intrinsically slower performance of the C algorithm would overtake the less refined implementation of that early C++ library.

(to be continued)

Posted by Andrew Koenig at 02:56 PM  Permalink |


October 23, 2006

Abstraction and performance -- a bit of history, part 1


Programmers like to talk about "the abstraction penalty" -- the cost in performance for programming in an abstract style. The word "the" here is misleading, because it suggests that there is only one such penalty, independent of circumstances.

I have already observed that this point of view is too simplistic. Here's an example to illustrate that observation.

About 20 years ago, I was part of an organization that was trying to get C programmers to start (or at least consider) using C++. To do so, I would write C++ code to make everyday programming tasks easier; then I would give presentations that explained the code and showed examples of how to use it.

We already had list and string classes available, so I hit on the idea of writing an associative-array class. Because C++ did not yet have templates, I resorted to the C preprocessor to adapt this class to the index and object types. I believe that this is the first associative-array class ever to be written in C++.

In this class, I chose two ideas that persist in the C++ library to this day: Using an order-based, rather than a hash-based algorithm, and choosing an algorithm that was order log(n) in the worst case. Because I valued ease of implementation and robustness over micro-performance (after all, I was entirely responsible for implementation and support), I strove to make the code as clear as possible.

The result was an associative-array library that really did deliver its order log(n) performance claim, but with a fairly high constant overhead. Much of that overhead came from going to the operating system's memory allocator for each associative array element, and storing various pointers (used for debugging purposes) that were not strictly necessary.

When I looked around for an example to illustrate this library in context, I found the notion of topological sorting. A topological sort is an algorithm that takes in pairs of ordering constraints (e.g. A must precede B, and so on) and produces a list of items that meets all the constraints at once (if such a list exists). A common use of this algorithm was to reorder the modules in a library so that no module relied on any module that preceded it in the library. This strategy would allow a library to be searched in a single pass.

There was a system command that did topological sorting. It was written in C, and was a little more than 200 lines long. If I remember correctly, the C++ version using my library was about 30 lines long. So I was very pleased with myself until someone asked me: "How does the C++ version perform?"

(to be continued...)

Posted by Andrew Koenig at 11:33 AM  Permalink |


October 18, 2006

Am I being too fussy about trivialities?


Reading my last posting about how to add three integers may leave you thinking that I'm wasting my time on silly stuff. Here's why I'm not.

I have two examples.

Not too long ago, I entered an informal programming contest, which asked entrants to write a binary-search program. Like many such programs, my entry had a line somewhat like this:

    int mid = (low + high) / 2;

This line of code lost points because the computation of low+high might overflow. The solution in this case was

    int mid = low + (high - low) / 2;

This rewrite has the nice property of working for random-access iterators as well as for integers.

If you think this example is also trivial, here is one that is much less so. Once upon a time, I used a time-sharing system that integrated an interpretive programming language into its operating system. Because the language was based on an interpreter, the designers felt there was no need to protect the rest of the operating system from the interpreter. Instead, it relied on the interpreter to prevent user programs from doing nasty things to the operating system. This decision may seem foolhardy, but it looks less so when one realizes that this system was running on a computer that lacked the hardware that would have been necessary for such protection.

Anyway, I discovered that when I allocated a multidimensional array, the code that computed the array's size didn't check for overflow. So if I allocated an array with a size just slightly larger than all of memory, it looked to the interpreter like a small object, even though its dimensions were huge.

This anomaly allowed me to stay within the array's nominal dimensions while still accessing all of the machine's memory, which in turn allowed me to take over control of the operating system. I informed the system's proprietors that I could do this, and they refused to believe me until I demonstrated it for them.

In other words, what looks to an ordinary programmer like a silly error in an edge case may actually be a serious security vulnerability.

Bad guys like it when abstractions deviate from the reality they are intended to represent.

Posted by Andrew Koenig at 01:54 PM  Permalink |


October 16, 2006

More about abstraction quality


I would like to continue the discussion of abstraction quality by looking at just what it takes to add three integers correctly whenever it is possible to do so.

Let's keep it simple: We have three integers i, j, and k, and we wish to compute their sum, provided that it is possible to represent the sum as an integer. As with C and C++, we will assume that any sum, even an intermediate sum, that cannot fit in an integer causes undefined behavior.

If i, j, and k are all positive (or zero), there's no problem: We don't care if the final sum overflows, because if it does, we can't represent the result anyway. If the final sum doesn't overflow, the intermediate sum can't overflow either, so we don't need to worry about it. A similar argument applies if i, j, and k are all negative or zero.

The fun comes when the quantites to be summed have different signs, because if we add the two quantities that happen to have the same sign, that intermediate sum might overflow even when the final result doesn't.
Accordingly, we have to be sure that whenever two of the quantities have different signs, we add those two first, and only then add the third.

Our algorithm, then, looks like this: If i and j have different signs, the result is (i+j)+k. Because i+j cannot overflow (because i and j have different signs), the only possible overflow comes when the final result wouldn't have fit in an integer anyway.

If i and j have the same sign, then either k has the same sign as the other two or it has a different sign. If it has the same sign, it doesn't matter in what order we add the three quantities. If it has a different sign, then i+(j+k) will give the right answer whenever it is possible to do so, for the same reason that (i+j)+k worked before.

In other words, we can express the algorithm this way:

if (sign(i) == sign(j))
    sum = i + (j + k);
else
    sum = (i + j) + k;

This is the kind of code that deserves a substantial comment to explain what is going on.

This example should show you why high-quality abstractions are often hard to achieve: Even something as simple as the sum of three integers is surprisingly difficult to compute correctly. If you are still unconvinced of the difficulty of such problems, you might try computing the sum of an arbitrary number of integers.

Posted by Andrew Koenig at 05:34 PM  Permalink |


October 13, 2006

The abstraction quality dilemma


Library authors often face a tough choice: Is it worth the extra effort to get obscure cases right that may never be needed?

For example, if m and n are int variables, then m+n is the sum of the values of m and n whenever that sum can be represented as an int. However, if we introduce a third variable into the picture, say k, then we find that k+m+n might be undefined even though the mathematical sum fits in an int. For example, k+m might overflow, but n might be negative.

Now imagine that you are writing a library function to compute the sum of three integers. How would you go about it? Would you bother getting the edge cases right? What would you do if a user of your library function complained that it was failing to get the right answer in some circumstances?

Library authors use abstractions in order to provide abstractions to their users. Ideally, the abstractions they provide should be as robust as the ones they user. In practice, though, such a state of affairs is sometimes hard to achieve.

Posted by Andrew Koenig at 08:04 AM  Permalink |


October 11, 2006

Much ado about nothing


Have you ever noticed how important it is to make programs do the right thing when their input is empty? And how it's not always easy to know for sure what the right thing is?

Here's a simple example: Suppose you're writing a program that computes the sum of all of the elements in a sequence. What should it do when there aren't any elements?

A little reflection may convince you that the sum of an empty sequence is zero. But why? Why, for example, wouldn't it be equally correct to throw an exception?

The answer to the question hinges on whether the program's behavior on an empty sequence makes conceptual sense, or whether it is just a convenience. I claim that it makes conceptual sense to say that the sum of an empty sequence is zero. The reason is that if you remove one element from a nonempty sequence, sum the rest of the elements, and then add the element you removed, you get the sum of all of the elements.

In other words, sum(s[0]...s[n]) is equal to sum(s[0]...s[n-1])+s[n] whenever n>1. By defining the sum of an empty sequence to be zero, we preserve this property when n=1 (Of course, we can't preserve the property when n=0, because doing so would require us to ascribe a value to v[-1]).

Because defining the sum as zero preserves this property, we can be confident that the definition is consistent with the rest of the program's behavior.

To confirm that you get the point, you might think about how to define the product of the elements of an empty sequence.

Posted by Andrew Koenig at 06:23 PM  Permalink |


October 09, 2006

Why is debugging so hard?


If you're an experienced programmer, you've surely had the experience of becoming ever more frustrated while trying to understand why your program didn't do what you intended -- only to realize that once you found it, the answer was trivial.

In that sense, debugging seems to differ from other kinds of problem solving. What makes it so different?

I'd like to suggest a reason: We think about what our programs are supposed to do by forming a mental model--an abstraction--of the program's behavior. Whenever that model fails to match the program's actual behavior, our reasoning gives wrong results, even if the reasoning itself is correct.

If my suggestion is correct, then the most frustrating kinds of program misbehavior are those that violate our mental models of how the programs should work. As an example that supports this theory, consider what happens if you exceed the bounds of an array: Anything at all can happen, and sometimes the program seems to work correctly. How can one not be frustrated in modeling such flaky behavior?

By implication, we can make our programs easier to debug by programming in a way that assures us that our mental models are correct.

Posted by Andrew Koenig at 11:03 AM  Permalink |


October 06, 2006

Why compiler writers sometimes like undefined behavior


Undefined behavior in a programming language makes life harder for programmers than most people realize, because a program that evokes undefined behavior has an error that no amount of testing is assured of revealing.

If you do something that's formally undefined, but happens to do what you want, how can you ever find the problem short of happening to notice it?

Because undefined behavior is so problematic, one is tempted to wonder why any programming langauge ever fails to define behavior. Here is one example from real life.

The C and C++ language definitions both say that the result of a signed integer overflow is undefined. Just about every implementation treats integer overflow as modular arithmetic. In other words, if the result doesn't fit in a word, the implementation quietly throws away the extra high-order bits. Why not standardize that behavior, thus allowing programmers to count on it?

Here is one example of why not. Once upon a time, I knew someone who was working on a C compiler for a processor embedded in a large telephone switch. Like many such processors, this one had a "condition code" -- two bits that were automatically set based on the result of each computation. Two bits can represent four states; in this case, the states were positive, negative, zero, and overflow.

Consider the following C statement:

if ((x -= y) == 0) { /* ... */ }

Assuming that x and y are integer variables, what code should the compiler generate for this statement?

The obvious strategy is to subtract y from x and then generate an instruction to test the condition code. If the condition code indicates that the result was zero, it executes the code in the braces; otherwise it skips that code.

This strategy behaves anomalously if the values of x and y are such that the result is zero with an overflow. In that case, the value of x is zero but the condition code indicates overflow, not zero.

The question, then, is whether the compiler should be permitted to generate code that behaves this way. If you take the view that the effect of an overflow is undefined, then the answer is yes: Once your program has overflowed, the compiler can do as it pleases; and in the absence of overflow, the code is correct. If, on the other hand, you think that irrespective of what happened earlier, your program is asking about the value of x, then your compiler had better deal correctly with the case where x is zero, even if it got that way after an overflow.

The compiler writer pointed out that the language specification was clear, and that changing the compiler for the convenience of the user who had encountered the problem would make execution slower for all users. The reason was that in order to make the generated code work "correctly" after an overflow, it would have been necessary to generate an extra instruction to test the value of x after storing it. The processor would have to execute that instruction even when it wasn't needed.

When you see a language specification that says "in this case, the behavior is undefined," it is a reasonable bet that the language author(s) wanted to allow extra latitude for implementors in problematic cases such as this one. In effect, they're weakening the abstractions that the language provides as a trade for easier or more efficient implementation.

Posted by Andrew Koenig at 12:22 PM  Permalink |


October 05, 2006

Walter Brown's cool technique for reading files


How do you write a loop to do something with each line of a file? One common way:

string s;
while (getline(file, s)) { /* do something */ }

This technique has the small disadvantage of separating the definition of s from the loop that uses it, and therefore of having the value of s hang around longer than needed.

Walter Brown, from the Fermi National Accelerator Laboratory and a member of the C++ committee, once suggested to me the following elegant technique:

for (string s; getline(file, s);) { /* do something */ }

Once seen, this technique seems obvious; but good ideas often seem that way.

Posted by Andrew Koenig at 04:15 PM  Permalink |


October 04, 2006

The thinnest of threads


You have an iterator that refers to an element of an associative container. How do you erase that element and advance the iterator to the position that follows it?

For a vector, it's easy:

iter = container.erase(iter);

However, in general, erase does not return a value. How does one find the next position?

The obvious way doesn't work:

container.erase(iter); ++iter;

because after you've called erase, iter is invalid. Therefore you can't increment it.

If you had a second iterator, say iter2, you could do this:

iter2 = iter; ++iter; container.erase(iter2);

Because you incremented iter before you call erase, iter now refers to a different position in the container. Therefore the call to erase won't invalidate iter.

We can simplify this code by the clever observation that so long as we never care about using iter2 again, this example has exactly the same effect as

container.erase(iter++);

The reason is that the effect of iter++ is to copy iter somewhere else, then increment iter, and finally yield the copy of the previous value of iter as its result. This copy therefore behaves exactly the same as would iter2. Moreover, because a function's arguments must necessarily be evaluated before the function begins execution, there is no worry about order of evaluation.

This technique hangs by the thinnest of threads; I would ordinarily hesitate to recommend it for that reason. However, it is so much more succinct than any alternative I have found that I think it deserves wider recognition.

Posted by Andrew Koenig at 04:01 PM  Permalink |


October 03, 2006

The question behind the question


How often has someone asked you how to do something and your reaction has been "Why would anyone want to do that?" In my experience, such a reaction is often a signal that the questioner is asking the wrong question.

Such a question showed up today in one of the C++ newsgroups. The questioner wanted to know how to allocate memory, put its address into a pointer to const, and then use the pointer to change the memory.

That's almost surely a mistake. The whole purpose of having a pointer to const is to be able to promise that the pointer won't be used to change the memory; so using that pointer to change it defeats the purpose. It's misusing the abstraction.

The straightforward answer to the question would be to use a const_cast to cast the pointer into a plain pointer, then use that pointer to change the memory. But before suggesting such a course of action, I would first want to know what the original problem was.

The problem might come from a desire to avoid changing the memory in one part of the program, even though it's necessary to change it in another part. In that case, why not have two pointers?

T* p = new T; const T* cp = p;

This way, you can make cp available to the part of the program that modifies the memory, and use p when you want to promise that the memory won't change.

This suggestion is speculation, of course. My point is that when something feels wrong about a question, that's a good time to look for the question behind the question.

Posted by Andrew Koenig at 01:44 PM  Permalink |


October 02, 2006

The right amount of abstraction


In the comp.lang.c++ Usenet newsgroup today appeared a post by Marcus Kwok that included the following profound observation:

1) C is "easy" because it is a small language.
2) C is "hard" because it is a small language.

Every programming language is (among other things) an abstraction of the underlying hardware. Some such abstractions, such as assembly language, are closer to what they abstract than are others, such as C++.

An implication of Marcus Kwok's observation is that if you choose an abstraction that is too close to what you're abstracting, you don't gain much. On the other hand, if you choose an abstraction that is too far away from what you're abstracting, the abstraction itself may present new complexities.

Another context in which this phenomenon surfaces is in library design. A library that is too abstract may be more work to learn than it's worth; one that is not abstract enough may not offer enough power to be worthwhile.

Posted by Andrew Koenig at 01:26 PM  Permalink |



November 2007
Sun Mon Tue Wed Thu Fri Sat
        1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30  


BLOGROLL
 
INFO-LINK


Related Sites: DotNetJunkies, SD Expo, SqlJunkies