Site Archive (Complete)
C++
SELECTIVE IGNORANCE

Finding the Signal in the Noise

by Andrew Koenig

October 2007


October 16, 2007

The principle of least surprise can be surprising


It's hard to study software design for long without coming across the principle of least surprise, sometimes called the principle of least astonishment. Both terms refer to the idea that a system should cause as little surprise as possible for someone who doesn't already know how it behaves.

This idea is a good one most of the time. However, sometimes there is a good and simple reason for behavior that is surprising at first glance. In such cases, following the principle of least surprise may introduce extra complexity into the system and make its behavior more surprising in the long run.

I'm going to give you an example in APL. I think it's a language that most readers don't know, so they won't have preconceptions about how things ought to work.

One of APL's most fundamental ideas is that of an array. In fact, every value in APL is an array. An array can have any number of dimensions. Indeed, in APL programs, it is not uncommon to create three- and four-dimensional arrays as temporary parts of larger computations, and then compress them back down to one- or two-dimensional arrays.

What kind of array should represent an ordinary scalar (i.e. a number that isn't really an array), such as 42? There art two choices: Perhaps it should be a one-dimensional array, with a single element, or perhaps it should be an array with no dimensions at all.

At first glance, the idea of an array with zero dimensions may seem surprising. However, treating a scalar as a zero-element array makes a lot of sense. First of all, if an array can have any number of dimensions, it would surely be surprising if "any number" did not include zero. So the language should admit zero-dimensional arrays; if they are not equivalent to scalars, then we have to explain what the difference is between scalars and zero-dimensional arrays.

In this sense, then defining a scalar as a zero-dimensional array follows the principle of least surprise.

However, in my experience with APL, the fact that scalars are zero-dimensional arrays causes one of the first serious surprises that most people have when they learn APL.

Consider a program that computes the average of a vector. I don't have the APL character set in this blog, so I will express the average of v as sum(v)/size(v). APL has its own characters for sum, size, and division, but I don't need them to make my point.

In a language with multi-dimensional arrays, it is not immediately obvious how to define sum and size. In fact, APL does so by saying that the sum of an array is an array with one less dimension. For example, the sum of a matrix is a vector, each element of which is the sum of one row of the matrix. It has to make an exception for the sum of a scalar, which has no dimensions to begin with, so it defines the sum as being the same as the scalar itself.

Similarly, the size of an array is a vector, with one element for each dimension of the array.

So what is the size of a scalar?Here is where it gets interesting. We said earlier that a scalar is a zero-dimensional array--so the size of a scalar should be a zero-element vector. Given our definition of average, what do you suppose is the average of a scalar?

Well, we know that the average is sum(v)/size(v). sum(v) is the same as v, so our average is the result of dividing our scalar by an empty vector. We just need to figure out what that is.

Again, the principle of least surprise comes into play. Suppose v is a vector and we write v+1. Surely the result of that addition should be to add 1 to each element of v. (It cannot be concatenation because APL uses a comma for concatenation; binary + always means addition). So applying an arithmetic operation to a scalar and a vector applies that operation to the scalar and each element of the vector.

But this rule means that dividing a scalar by an empty vector yields an empty vector--which means in turn that the average of a scalar is an empty vector! Just about everyone who sets out to learn APL is surprised by this behavior; people expect the average of a scalar to be the scalar itself, not an empty vector.

So here we have a case where a surprising result comes from three applications of the principle of least surprise:

1) It should be possible to have an array with no dimensions, and a zero-dimensional array should be the same thing as a scalar.

2) The size of an array should be a vector with one element for each dimension of the array.

3) When you do arithmetic on a scalar and a vector, the result has the same size as the vector.

Choosing the least surprising behavior in these three contexts cuses the surprising behavior that the average of a scalar is an empty vector. Does this mean that there is something wrong with our choices?

I don't think so. Rather, I think this exmaple shows that it is not always possible to have ideas that are unsurprising both individually and in combination. When such surprising combinations occur, it can be tempting to remove the surprise by tweaking individual ideas--but doing so is often harder than it looks. The trouble is that because the surprise doesn't come from any single design decision, changing one decision might introduce another surprise somewhere else--and that one might be even more surprising and harder to see.

Posted by Andrew Koenig at 10:39 AM  Permalink |


October 02, 2007

The hazards of warning messages


Experienced programmers usually think that warning messages from compilers are a Good Thing. After all, such messages let us know when we have written programs that might not do what we expected, and sometimes save us from hours of debugging.

For example, if we intended to write

    if (x == 0)

and instead wrote

    if (x = 0)

the compiler would surely be providing a useful service by telling us that we had probably done something we didn't intend.

However, there is a dark side to compiler warnings that takes a while to appreciate. We can begin to see its shadows by asking what we should do when we compile someone else's program and see a dozen (or a hundred) warning messages.

The obvious answer is to locate the source of the warnings and change the program so that they no longer appear. After all, every warning represents a part of the program that might be hiding a serious bug, and if the program produces too many warnings, we have no way of knowing how many of those bugs might be waiting to bite us.

In effect, a conscientious programmer will treat compiler warning messages similarly to compiler error messages: as indicating problems that must be fixed before proceeding further. Indeed, it is not uncommon for commercial programming shops to decree that warning messages are unacceptable: All code must compile warning-free.

That's a Good Thing too, isn't it? After all, isn't it reasonable to insist that programmers refrain from writing programs that even a compiler can see are hazardous? Who could argue with such an aim?

Indeed, it is hard to argue with an aim expressed as this one is. However, if we take a step back, we shall see that the problem is not quite that simple. For if an organization insists on programs compiling warning-free, then the programmers are no longer writing their programs in the programming language that their manuals define. Instead, they are programming in the intersection of the language in the manual and the language that the compiler doesn't warn about. Suddenly, the language definition changes from one that is written down to one that must be inferred from the compiler's behavior.

The situation gets worse when we have to deal with more than one compiler--as we all do. Remember that each new release of a compiler is effectively a new compiler from the user's viewpoint. Suppose, for example, that a compiler developer wakes up one day and realizes that a class with a destructor and no constructor is usually asking for trouble. After suitable discussion and review, a new release of the compiler comes out that gives warning messages for such classes.

Now shift focus to a developer using that compiler. The developer wrote this code:

    class Thing {
    public:
        virtual ~Thing() { }
    };

The old compiler accepted it with nary a peep; the new compiler warns that there is a destructor but no constructor.

In a sense, that behavior is a compiler bug: If you treat warnings as errors, you've just lost the ability to have an empty virtual destructor in an abstract base class. In another sense, of course, one might argue that it is never a bug to issue a warning as long as it is warning about something that is actually happening.

This example shows why I think warning messages are a decidedly mixed blessing. They have certainly saved my hide, but equally certainly they have made it hard for me to do things I thought were entirely reasonable.

I'd love to see reader comments about their experiences with warning messages.

Posted by Andrew Koenig at 11:36 AM  Permalink |



November 2007
Sun Mon Tue Wed Thu Fri Sat
        1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30  


BLOGROLL
 

♦ sponsored
INFO-LINK


Related Sites: DotNetJunkies, SD Expo, SqlJunkies