April 2007
April 23, 2007
Why doesn't new return a reference?
A recent question on Usenet made me pause for thought: Why does new return a pointer rather than a reference?
Of course, one can use history as an excuse. Nevertheless, I still find it interesting to wonder whether it would be better for new to return a reference were it defined from scratch today.
The person who asked the question pointed out that new normally signals failure by throwing an exception, so, like a reference, the result of new cannot be null. Therefore, by returning a pointer, new implicitly offers an error-detection mechanism that it declines to use. If new were to return a reference instead, it would be clear that there is no need to check the return value for possible failures.
I started to compose a response about how pointers were easier to manage than references, when suddenly a thought occurred to me. Putting aside the new(nothrow) option, which does allow new to return a null pointer, there is still the common use of new to allocate an array.
In general, people use new to allocate arrays when they don't know in advance how many elements the array should contain. In that case, of course, the type of the array isn't known during compilation--because the type of an array includes its size. So in that case, new cannot return a reference to the array, because it's not possible to know the type to which the reference would refer.
So why not return a reference to the first element? Because the common use of new in such cases is to use the value returned as a way of accessing the elements of that array--in other words, to use it as an iterator.
Realizing that the new-array case really returned an iterator, I no longer felt that there was any reason for new to return a reference. The point is that pointers can serve as iterators and references can't--a fact that makes pointers a much better choice than references in this context.
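To make the iterator point concrete, here is a minimal sketch of my own (not code from the Usenet thread) in which the array size is known only at run time. The function name sum_of_first is hypothetical; the point is only that the pointer returned by new[] is advanced, compared, and handed to a standard algorithm exactly as an iterator would be.

```cpp
#include <cstddef>
#include <numeric>

// Allocates n ints (n is known only at run time), fills them with 1..n,
// adds them up, frees the array, and returns the sum.
int sum_of_first(std::size_t n) {
    int* first = new int[n];     // pointer to the first element
    int* last = first + n;       // one past the end, by pointer arithmetic
    std::iota(first, last, 1);   // the pointers serve directly as iterators

    int total = 0;
    for (int* p = first; p != last; ++p)   // ++ and != on a plain pointer
        total += *p;

    delete[] first;
    return total;
}
```

A reference to the first element would support none of first + n, ++p, or p != last.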
Posted by Andrew Koenig at 11:28 AM
April 12, 2007
A stray thought about intrusive smart pointers
I've seen a lot of smart-pointer classes in C++, and they tend to divide into two categories: Intrusive smart pointers and non-intrusive ones. An intrusive pointer requires the author of the class to which it will point to accommodate the pointer, typically by deriving that class from a class designed as a helper to the smart-pointer class. For that reason, the non-intrusive pointers are often more interesting: You can attach one to an object of just about any type.
This flexibility typically comes at a cost: Non-intrusive smart pointers usually work by allocating a reference count as a separate object, which must then be maintained at a cost. Nevertheless, believing that people are generally more important than machines, I have long maintained that non-intrusive smart pointers should generally be preferred over the other kind.
Recently I saw an example that changed my thinking.
I am still not prepared to say that intrusive smart pointers are always better than non-intrusive ones--because of the burden they impose on class authors--but I realized only recently that they can do something that their non-intrusive counterparts cannot do easily.
Suppose you have an object with a smart pointer attached to it:
Smart_pointer p = new T;
Then copying p will create a new smart pointer referring to the same object as the original:
Smart_pointer q = p;
No problem so far. Suppose, however, that all you have is the address of the object to which the smart pointer is already attached, and you want to attach a new smart pointer to that object. For example, you have executed:
Smart_pointer p = new T;
T* tp = &*p;
and now you want to attach a new smart pointer to the object to which tp points without having direct access to p.
Unless the smart pointer is intrusive, I don't see how to accomplish this, even in principle. For such a smart pointer must locate not only the object but also the reference count associated with the object, and I can't see an easy way to map from the object to the corresponding reference count. If the smart pointer is intrusive, the reference count is right there.
The solution to this problem, of course, is that if you are using non-intrusive smart pointers, you should be even more careful to avoid handing raw addresses around, because you cannot convert those addresses into additional smart pointers. In particular, the obvious technique
Smart_pointer q(tp);
will appear to work, but will fail horribly because it will delete the object while there is still a smart pointer attached to it.
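Here is a rough sketch of why the intrusive case is easy. Counted, Intrusive_ptr, and Widget are hypothetical names invented for illustration, not any particular library's classes, and the sketch ignores thread safety. Because the count lives inside the object, a raw address is all a new smart pointer needs.

```cpp
#include <cassert>

// Hypothetical helper base class: each object carries its own count.
class Counted {
public:
    Counted() : refs_(0) {}
    Counted(const Counted&) : refs_(0) {}                 // a copy starts unowned
    Counted& operator=(const Counted&) { return *this; }  // the count stays put
    void add_ref() { ++refs_; }
    int release() { return --refs_; }                     // returns the new count
    int use_count() const { return refs_; }
private:
    int refs_;
};

// Hypothetical intrusive smart pointer; T must derive from Counted.
template<class T>
class Intrusive_ptr {
public:
    explicit Intrusive_ptr(T* p = nullptr) : p_(p) { if (p_) p_->add_ref(); }
    Intrusive_ptr(const Intrusive_ptr& other) : p_(other.p_) { if (p_) p_->add_ref(); }
    Intrusive_ptr& operator=(const Intrusive_ptr& other) {
        if (other.p_) other.p_->add_ref();   // acquire first: self-assignment is safe
        drop();
        p_ = other.p_;
        return *this;
    }
    ~Intrusive_ptr() { drop(); }
    T& operator*() const { assert(p_); return *p_; }
    T* operator->() const { assert(p_); return p_; }
private:
    void drop() { if (p_ && p_->release() == 0) delete p_; }
    T* p_;
};

int widgets_destroyed = 0;   // lets us observe how many deletions happen

struct Widget : Counted {
    ~Widget() { ++widgets_destroyed; }
};

// Attaches a second smart pointer given only the raw address.
int count_after_reattach() {
    Intrusive_ptr<Widget> p(new Widget);
    Widget* tp = &*p;
    Intrusive_ptr<Widget> q(tp);   // finds the existing count inside *tp
    return tp->use_count();        // both p and q are counted
}
```

In this sketch the Widget is destroyed exactly once, when the last of p and q goes away, even though q was built from nothing but tp.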
I'm not sure there's any moral to this story except that working successfully with pointers, smart or otherwise, requires keeping one's wits about one.
Posted by Andrew Koenig at 12:57 PM
April 09, 2007
A stopped clock is right twice a day
Which is better, a program that works most of the time or one that fails most of the time?
Faced with this question, most people would probably prefer the program that mostly works. But what if mostly working isn't good enough? Then the program that mostly fails is easier to fix.
Let's make this discussion concrete. A recent Usenet discussion centered on writing a function to return a const char*:
const char* foo() {
    char result[100];
    // ...
    return result;   // wrong: result ceases to exist when foo returns
}
This strategy obviously doesn't work: result is a local variable, so by the time the caller gets around to doing anything with the pointer returned, it points at memory that is long gone.
Obviously the program's author could dynamically allocate the memory instead:
const char* foo() {
    char* result = new char[100];
    // ...
    return result;   // the caller must eventually delete[] this
}
Now the pointer returned is valid. In exchange, the caller has to take responsibility for freeing the memory if the program is to avoid memory leaks.
Is there a strategy that combines the convenience of not having to worry about freeing memory with the requirement to return a valid pointer? Of course there is. Let the pointer stick around until the next call:
const char* foo() {
    static char* result;   // null on the first call
    delete[] result;       // deleting a null pointer is harmless
    result = new char[100];
    // ...
    return result;
}
Now the caller doesn't have to delete the memory, but it goes away automatically at the next call. However, we now have a function that will fail if used twice in the same expression:
bar(foo(), foo());
Here, whichever call to foo is executed second will wipe out the memory addressed by the pointer that the first call returned. In other words, bar will receive one valid pointer and one invalid one.
We can solve this problem too: Instead of remembering the last pointer returned, we can remember the last n for a suitable value of n.
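One way that might look, as a sketch only: keep a small ring of buffers and free the oldest one on each call. The slot count kSlots, the buffer size kSize, and the text parameter (a stand-in for the elided body of foo) are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstring>

const std::size_t kSlots = 4;    // hypothetical value of n
const std::size_t kSize = 100;   // buffer size, as in the earlier versions

// Returns a pointer that stays valid for the next kSlots - 1 calls.
// The text parameter stands in for whatever foo really computes.
const char* foo(const char* text) {
    static char* slots[kSlots];     // static storage: all null to begin with
    static std::size_t next = 0;    // index of the slot to reuse

    delete[] slots[next];           // frees the buffer handed out kSlots calls ago
    char* result = new char[kSize];
    slots[next] = result;
    next = (next + 1) % kSlots;

    std::strncpy(result, text, kSize - 1);
    result[kSize - 1] = '\0';       // strncpy does not always terminate
    return result;
}
```

With kSlots equal to 4, a pointer obtained from one call is invalidated by the fourth call after it.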
Now we have transformed an uncommon problem into a dangerously obscure one. The trouble is that users of foo will think it is safe to use the value returned indefinitely; but sooner or later someone is going to misuse it. An attempt to make things easier has pushed the problem into the background where it is harder to detect.
When you have a choice between a program that fails subtly and one that fails spectacularly, you are often better off with the one that fails spectacularly.
Posted by Andrew Koenig at 02:33 PM
April 03, 2007
A "Duh" moment
I'm putting together solutions for homework exercises for this class I'm teaching. One of the exercises is to implement a simplified version of the standard list template, including erase.
To simplify the code, I decided to implement single-element erase in terms of generalized erase:
template<class T> class List {
    // ...
    List_iter erase(List_iter, List_iter); // Implemented elsewhere
    List_iter erase(List_iter it) {
        return this->erase(it, it);
    }
    // ...
};
It took surprisingly long for me to see the obvious mistake I had made. How long did it take you to spot it?
In case you haven't seen it yet, the problem is that the call to erase(it, it) is wrong: It erases an empty range. This fact should be obvious because the two arguments to erase are the same.
Having seen that, I can at least give myself credit for also seeing that
    List_iter erase(List_iter it) {
        return this->erase(it, ++it);
    }
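would have made matters worse, because it both reads and modifies it within a single argument list, and the order in which function arguments are evaluated is unspecified. If ++it happens to be evaluated first, both arguments are equal and the range is empty again.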
So I had to do it this way:
    List_iter erase(List_iter it) {
        List_iter next = it;
        return this->erase(it, ++next);
    }
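For comparison, the same delegation can be tried against std::list, whose erase has a range form. The helper names erase_empty_range and erase_one are hypothetical; the sketch just shows that the range [it, it) removes nothing, while [it, next) removes exactly one element.

```cpp
#include <cassert>
#include <list>

// The mistaken version: [it, it) is an empty range, so nothing is erased.
template<class T>
typename std::list<T>::iterator
erase_empty_range(std::list<T>& items, typename std::list<T>::iterator it) {
    return items.erase(it, it);
}

// The corrected version: copy the iterator, advance the copy, and erase
// the one-element range [it, next).
template<class T>
typename std::list<T>::iterator
erase_one(std::list<T>& items, typename std::list<T>::iterator it) {
    assert(it != items.end());   // there must be an element to erase
    typename std::list<T>::iterator next = it;
    ++next;
    return items.erase(it, next);
}
```

Since C++11, the copy-and-increment step can also be written as std::next(it), from the iterator header.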
Sometimes the simplest things cause the most trouble.
Posted by Andrew Koenig at 09:31 AM