Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


May 04: Much Ado About Nothing: A (True) Null Pointer Value for C++

This column is about (pointing to) nothing. More seriously, it's about why we need a better null value for pointers than C [1] or C++ [2] currently provide.

Last fall, Bjarne Stroustrup and I proposed the nullptr keyword and feature to the ISO C++ Standards committee as a feature we ought to include in C++0x, the next version of the C++ Standard (see my trip report in last month's issue [3]). This proposal was based on work we had already done as part of C++/CLI, the C++ extensions for ISO CLI (the standardized subset of .NET) [4]. The C++ evolution working group liked the initial proposal, asked for a few changes, and we updated our proposal for the Spring 2004 C++ Standards meeting; this article is based on the updated proposal [5]. (Thanks to the editorial lead time of magazines, I'm writing this article before the Standards meeting, but by the time you read this, the meeting will have already occurred. I'll include any updates in my next trip report column, which should appear in the June or July issue of this magazine.)

First, let's recap what C and C++ say today about null pointers, see what the issues are, and then look at the solution that we are contemplating for C++0x.

Null Pointers in Today's C and C++: 0 and NULL

The key question is this: What is the type of the expression 0 (zero)?

The key problem is that, in C and C++, there is more than one answer. Bear with me as we take a quick look at what the language Standards actually say about it, then we'll see what that means for real code.

Both languages have a special rule that says that 0 is both an integer constant and a null pointer constant. Here's what the C Standard has to say about it:

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant...If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

[1], 6.3.2.3

The C++ Standard's take is similar, but closer to the original K&R C definition. The main difference is that a zero cast to void* is not considered a null pointer constant:

A null pointer constant is an integral constant expression (expr.const) rvalue of integer type that evaluates to zero. A null pointer constant can be converted to a pointer type; the result is the null pointer value of that type and is distinguishable from every other value of pointer to object or pointer to function type.

[2], 4.10

Both C and C++ also provide the standard macro NULL, which expands to "an implementation-defined null pointer constant." (For those keeping score at home, see [1] 7.17 and [2] 18.1.)

This is one of the few cases where the answer to "What is the type of X?" gets a little blurry. That's unfortunate because, in a strongly typed language, it's a Bad Thing for the type system to have blurry answers to such basic type questions. When type answers get confused, then the language rules get confused.

And, ultimately, programmers get confused, too.

Let's consider some of the problems that stem from having a null pointer constant that behaves as though it had an ambiguous type. Some of these affect both C and C++ programs, but from now on, I'm going to focus on C++ only; it matters more in C++ because it interacts (poorly) with C++'s more type-sensitive features — notably, overloading.

Problem 1: Overloading And (Lack of) Ambiguity

This use of the value 0 to mean different things (a pointer constant and an int) has caused problems since at least 1985 in teaching, learning, and using C++. Here's an example first published in 1993's Effective C++ Item 25 [6] (and retained with slight changes in its second edition [7]):

// Example 1: Taken from Item 25 of [6]
//
void f(int x);
void f(char* ps);



f(0);    // calls f(int) or f(char*)?

You might think the call is ambiguous, because 0 is, after all, both a pointer and an int — right? And that is also what many programmers would want to have happen here.

Instead, what happens is that f(int) gets selected unambiguously. Why? Well, 0's type is, after all, both int and pointer, but when push comes to shove, it's an int first and foremost. Thus, Meyers adds the wry note:

This is a situation unique in the world of C++: a place where people think a call should be ambiguous, but compilers do not.

[7], Item 25

So, with this in mind, how should you best answer the question, "What's the type of 0"? Informally: 0 is always both an integer constant and a null pointer constant — except when it's not.

Before leaving this first problem, consider one last aspect: What if you want to call f(char*) with a null pointer? How do you write it? After all, if the overload f(int) wasn't hanging around, we'd just happily write f(0)...but with an f(int) in the mix, that doesn't work because 0 matches int better than it matches char*. So you have to be explicit:

f( (char*)0 );	// option 1: explicitly cast to char*



char* nil = 0;	// option 2: use a named variable
f( nil );       // of type char*

Is that asking a lot? Well, probably not. But it is unfortunate that we have no better way to utter a simple request to "pass in the null pointer value."
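To make the overload behavior concrete, here is a minimal sketch of the three calls discussed above. The function name which and the three call_with_* helpers are my own illustrative names, not part of Meyers' example; each overload simply reports which one was selected.

```cpp
#include <cassert>
#include <string>

// Hypothetical overloads that report which one overload resolution picked.
std::string which(int)   { return "int"; }
std::string which(char*) { return "char*"; }

// A bare 0 is an int first and foremost, so which(int) wins.
std::string call_with_zero() { return which(0); }

// Option 1: an explicit cast gives the argument pointer type.
std::string call_with_cast() { return which((char*)0); }

// Option 2: a named variable of type char* holding the null pointer value.
std::string call_with_named() {
    char* nil = 0;
    return which(nil);
}
```

Calling call_with_zero() should report the int overload, while the other two should report the char* overload.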

By the bye, the previous code option with its cute little nil variable gives us a nice segue to the second problem...

Problem 2: A Name for the Null

The second major problem is that the null pointer ought to have a name. And, no, NULL doesn't count.

Names are important. In particular, as Meyers laments:

It would be nice if you could somehow tiptoe around this problem by use of a symbolic name, say, NULL for null pointers, but that turns out to be a lot tougher than you might imagine.

[7], Item 25

Providing a convenient name is, indeed, one reason why the standard macro NULL exists, although that macro is insufficient for the reasons Meyers goes on to demonstrate. In a nutshell, the difficulty with the standard NULL macro is that it's no better than the type-ambiguous "null pointer constant," because that's exactly what it has to expand out to. No matter how you dress up a 0, it's still a 0 in the end. (NULL doesn't have to be spelled exactly "0", but the few alternatives are no better; see [7] for more gory details.)
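One way to see that NULL is only a dressed-up null pointer constant is to ask the compiler for its type. The sketch below assumes a compiler with decltype and <type_traits>, both of which postdate this column; the helper name null_macro_has_pointer_type is hypothetical.

```cpp
#include <cassert>
#include <cstddef>      // NULL
#include <type_traits>  // std::is_pointer

// Reports whether the implementation's NULL expands to an expression of
// pointer type. Assumption: decltype and <type_traits> are available.
bool null_macro_has_pointer_type() {
    return std::is_pointer<decltype(NULL)>::value;
}
```

On a conforming C++ implementation I would expect this to return false, because the macro must expand to a null pointer constant and not to an expression of pointer type.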

If the null pointer constant had a type-safe name, that would also solve the previous problem: You could easily distinguish it from the integer 0 for overload resolution by simply writing, say, f( nullptr );.

Did I say "nullptr"? This brings up a key naming question, to wit...

Interlude: What Should It Be Called?

The null pointer constant should be called nullptr. That says what it is; for example, it's not a null reference.

Of all the names we might choose for this little beast, nullptr is the one that's least likely to conflict with existing programs. For example, a Google search for "nullptr cpp" returns a total of merely 150 hits, only one of which appears to use nullptr in a C++ program. That's the lowest Google hit count of any of the alternatives that follow.

We certainly can't use NULL because that's already a standard macro in C and C++. Even if C++ defined NULL to be a keyword, it would still be replaced by macros lurking in older code. Also, there might be code "out there" that (unwisely) depended on NULL being 0. Finally, using such a name would go against the usual convention that identifiers in all caps are macro names (and, for example, testable by #ifdef), which this one isn't.

Similarly, we can't use null. That's appealingly simple and tantalizingly natural, but alas, it's nearly as bad as NULL because null is also commonly used in existing programs as an identifier name and (worse) as a macro name. If we took null as a reserved word, we'd break the many programs that already use that common word as an identifier, and we'd be stomped on by programs that already use it as a macro.

What's left? Not much. We also can't spell it 0P or 0p, where we'd be adding the letter as a constant type suffix. This alternative overlaps with a C extension that already uses P or p in a constant to write the binary exponent part of a hexadecimal floating-point constant (see [1] 6.4.4.2). For example, 0P occurs as a part of the constant 0x0P2. Although using 0P or 0p would not be ambiguous today (the C99 P or p must be preceded by 0x and a hex number and must be followed by a decimal number), it seems imprudent to reuse a constant type suffix already used for another type of constant. Also, using an obscure notation, such as 0P, would encourage people to rely on a NULL macro.

Our informal polling suggests that people seem to like nullptr. If nothing else, it is the spelling that has elicited the fewest strong objections to date in our experience. The evolution working group liked it, too, so that's what we've proposed.

A Workaround: The Best You Can Do In Today's C++

But does nullptr really require a language extension? C++ is a powerful language already; couldn't we somehow write nullptr as a library facility?

In short: Yes, we could, but it would be a poorer solution (the practical argument), and a concept this fundamental belongs in the language (the clean-design argument).

Meyers continued his treatise by presenting several unsuccessful attempts to simulate, within the language, a true null pointer that successfully divorces the notions of "zero integer value" and "null pointer value." Here is a slightly edited version of the final attempt he presents in Item 25 of [7]:

// Example 2: Adapted from Item 25 of [7]
//
const                    // this is a const object...
class {
public:
  template<class T>      // convertible to any type
    operator T*() const  // of null non-member
    { return 0; }        // pointer...
  template<class C, class T>  // or any type of null
    operator T C::*() const   // member pointer...
    { return 0; }
private:
  void operator&() const;  // whose address can't be taken
} nullptr = {};            // and whose name is nullptr

Note what he's doing: nullptr is a const object, the only object of an unnamed class that can be converted to any pointer or pointer-to-member type and whose address can't be taken.
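Here is a short sketch of how such an object behaves in use. It repeats the class from Example 2 under the hypothetical name null_ptr_emul (an assumption on my part, so that the sketch also compiles on compilers that already treat nullptr as a reserved word). The which overloads, Widget, and demo are illustrative names as well.

```cpp
#include <cassert>
#include <string>

// Stand-in for the library-based null pointer of Example 2, renamed so it
// does not collide with a compiler that reserves the word nullptr.
const class NullPtrEmul {
public:
    template<class T>
    operator T*() const { return 0; }        // any non-member pointer type
    template<class C, class T>
    operator T C::*() const { return 0; }    // any pointer-to-member type
private:
    void operator&() const;                  // address can't be taken
} null_ptr_emul = {};

std::string which(int)   { return "int"; }
std::string which(char*) { return "char*"; }

struct Widget { int size; };

// Exercises both conversions and the overload choice.
bool demo() {
    char* p = null_ptr_emul;             // converts to an object pointer
    int Widget::* pm = null_ptr_emul;    // converts to a pointer to member
    return p == 0 && pm == 0 && which(null_ptr_emul) == "char*";
}
```

Because the class offers no conversion to int, which(int) is not viable for this argument, and the call resolves to which(char*) without a cast.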

There's one real advantage to this workaround: It doesn't make nullptr a reserved word. (Of course, the whole point of having such a null pointer constant is that it actually be used widely and pervasively, so, in practice, the name is still effectively a reserved word for most purposes.) This means that it won't affect the meaning of existing programs that might use nullptr as an identifier. On the other hand, though, it also means that its name can be hidden by such an existing identifier.

There's one seeming advantage that we believe isn't an advantage in practice: It provides nullptr as a library extension, rather than a special value baked into the language and known to the compiler. We think this is a red herring because compiler implementations would likely bake in special knowledge of it anyway. Why? To produce quality warnings and errors. To see what I mean, try this experiment on whatever compiler(s) you're using today: Type in the aforementioned library implementation of nullptr and try writing code that uses it that should produce an error (for example, assigning nullptr to an int, or passing it to a template parameter of type int, or comparing it to an int, all of which shouldn't be allowed). My experiments with several popular compilers show that this library template implementation of nullptr generates poor and/or misleading error messages for common mistakes, including:

  • No conversion from const to int.
  • No suitable conversion function from const class <unnamed> to int exists.
  • A template argument may not reference an unnamed type.
  • No operator == matches these operands, operand types are: int == const class <unnamed>.

Not a word about "you can't do that with a null pointer" anywhere in sight. These diagnostics clearly aren't good enough for such a fundamental concept and such simple programming mistakes, so we believe that even if this approach were standardized (and it won't be), compilers would still need to add special knowledge of nullptr in order to provide quality error messages for common use cases. In short: Even if nullptr weren't technically a language extension, it might as well be.

That's the best we can do with a library-based solution, so it's time to consider what might become C++0x's first language extension.

Introducing nullptr

In draft Standard C++/CLI [4], and possibly soon in draft Standard C++0x [2], nullptr is a reserved word that designates a strongly typed null pointer constant that you can use in all the previously mentioned contexts that are good, and in none of the ones that are evil.

nullptr converts implicitly to any pointer or pointer-to-member type, but not to an integer type (it will have nothing to do with the ints, thank you very much just the same, we've just had far too many problems with that clan before). That means that if you have overloaded functions like these:

void f(int);
void f(char*);

then you can easily call the second with a null pointer by writing just:

f( nullptr );

You can throw nullptr; and compute sizeof(nullptr). But you can't do stuff that leads to trouble: You can't take nullptr's address, convert it to an integer, assign it to an integer variable, or compare it to an integer variable.

Here are a few examples to illustrate the kinds of things you can and can't do with 0 and nullptr, and in which cases nullptr is better:

// initialization and assignment
char* ch = nullptr;  // ch has the null pointer value
char* ch2 = 0;       // ch2 has the null pointer value
int n = nullptr;     // error
int n2 = 0;          // n2 is zero
// comparison
if( ch == 0 );        // evaluates to true
if( ch == nullptr );  // evaluates to true
if( ch );             // evaluates to false
if( n2 == 0 );        // evaluates to true
if( n2 == nullptr );  // error
if( nullptr );        // error
if( nullptr == 0 );   // error
// arithmetic
nullptr = 0;         // error, nullptr is not an lvalue
nullptr + 2;         // error

In particular, note that 0 can still be assigned to a pointer and compared with pointers. This is essential for compatibility. The nullptr proposal deliberately proposes no change to the existing meanings of 0 and NULL. Granted, it could be tempting to define the Standard Library macro NULL to nullptr "while we're at it," but doing that would break some people's existing code (even though, in many cases, it would be code that deserves to be broken because it treats the NULL pointer like an int). Rather, the current thinking is that it's better to preserve backward compatibility for current code, while strongly encouraging new code to use the cleaner and safer nullptr.
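The core conversions can also be checked mechanically. The following is a minimal sketch that assumes a compiler supporting the nullptr keyword together with std::nullptr_t and <type_traits> (the form in which the feature was standardized in C++11). It deliberately sticks to the pointer and integer conversions, and the which and call_with_* names are illustrative.

```cpp
#include <cassert>
#include <cstddef>      // std::nullptr_t
#include <string>
#include <type_traits>  // std::is_convertible

std::string which(int)   { return "int"; }
std::string which(char*) { return "char*"; }

// nullptr converts implicitly to pointer types but not to integer types.
static_assert(std::is_convertible<std::nullptr_t, char*>::value,
              "nullptr should convert to char*");
static_assert(!std::is_convertible<std::nullptr_t, int>::value,
              "nullptr should not convert to int");

// nullptr selects the pointer overload with no cast needed.
std::string call_with_nullptr() { return which(nullptr); }

// The meaning of a plain 0 is unchanged: it still selects which(int).
std::string call_with_zero() { return which(0); }
```

The second helper is there to underline the compatibility point: existing code that passes 0 keeps its existing meaning.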

Summary and Perspective

We need a type-safe and named null pointer, and the front-running candidate is the nullptr described here, which is part of the C++/CLI draft Standard [4] and may soon be part of the new draft Standard C++.

Having a distinct type for nullptr also turns out to be useful for library writers. In particular, while the nullptr proposal was being discussed, the library folks noticed that having a nullptr with a specific known type would be useful for several of the facilities in the new Standard Library extensions currently being added to C++ [8].

For example, the tr1::function facility generalizes the notion of a function or function object; it can bind (well, point) to anything function-like (see [9] and [10] for more on tr1::function and just how nifty and useful it is). Because it can be made to point to objects, it has to be able to deal with being made to point to nothing. Today, tr1::function relies on special tests to determine whether it is being given a null or nonnull pointer. If nullptr had a distinct type, tr1::function could instead add overloaded constructors and assignment operators that could take special action, specifically when a nullptr argument is passed, rather than relying on manual in-function checks.
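To illustrate the idea (and only the idea), here is a much-simplified sketch of a wrapper with a dedicated overload for the null case. Callback is a hypothetical class, not tr1::function and not its actual interface, and the sketch assumes a compiler that provides nullptr and std::nullptr_t.

```cpp
#include <cassert>
#include <cstddef>  // std::nullptr_t

// Hypothetical, much-simplified callable wrapper; not tr1::function itself.
class Callback {
public:
    typedef int (*Fn)(int);

    // Chosen only when the argument is nullptr: no run-time test is needed.
    Callback(std::nullptr_t) : fn_(nullptr), built_from_null_(true) {}

    // Chosen for ordinary function pointers.
    Callback(Fn fn) : fn_(fn), built_from_null_(false) {}

    bool empty() const { return fn_ == nullptr; }
    bool built_from_null() const { return built_from_null_; }
    int operator()(int x) const { return fn_(x); }  // caller must check empty()

private:
    Fn fn_;
    bool built_from_null_;  // records which constructor ran, for illustration
};

int twice(int x) { return 2 * x; }
```

Constructing a Callback from nullptr selects the first constructor at compile time, while constructing one from twice selects the second, so the "is it null?" decision moves out of the function body and into overload resolution.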

nullptr takes one small step toward making the world (or, at least, the world's C++ code) a little more safe, pleasant, and friendly. Better still, it's not just some theoretical extension, but one that's practical today: It's being standardized in C++/CLI and possibly C++0x, and it's currently being implemented by several C++ compiler vendors, so you can expect to start seeing it soon in some compilers near you. Check your local stations for delivery times, and enjoy.

Acknowledgments

Thanks to Bjarne Stroustrup (of Texas A&M University), P.J. Plauger (of Dinkumware), Steve Adamczyk, John Spicer, and Daveed Vandevoorde (all of Edison Design Group), Tom Plum (of Plum Hall), and Brandon Bray, Jonathan Caves, Mark Hall, and Jeff Peil (all of Microsoft) for their collaboration and feedback on the C++/CLI language design in general, and on the nullptr design in particular.

References

[1] ISO/IEC 9899:1999(E), Programming Languages — C.

[2] ISO/IEC 14882:2003(E), Programming Languages — C++.

[3] H. Sutter. "Trip Report: October-December 2003," C/C++ Users Journal, 22(4), April 2004.

[4] C++/CLI Language Specification Candidate Base Document, Microsoft, November 2003. This document has already been improved by Ecma committee TG5 (C++/CLI) over the course of the last few months, and another public draft should be available soon, but as of this writing, the November 2003 base document is the most current publicly available working draft. You can get at it online via a link on my web page http://www.gotw.ca/microsoft/.

[5] H. Sutter and B. Stroustrup. "A name for the null pointer: nullptr (revision 2)," ISO C++ committee paper ISO/IEC JTC1/SC22/WG21/N1601, February 2004. Available online at http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2004/n1601.pdf.

[6] S. Meyers. Effective C++, Addison-Wesley, 1993.

[7] S. Meyers. Effective C++, Second Edition, Addison-Wesley, 1997.

[8] (Draft) Technical Report on Standard Library Extensions, C++ Standards Committee Document, ISO/IEC JTC1/SC22/WG21/N1596, February 2004. Available online at http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2004/n1596.pdf.

[9] H. Sutter. "Generalized Function Pointers," C/C++ Users Journal, 21(8), August 2003. Available online at http://www.cuj.com/documents/s=8464/cujcexp0308sutter/.

[10] H. Sutter. "Generalizing Observer," C/C++ Users Journal, 21(9), September 2003. Available online at http://www.cuj.com/documents/s=8840/cujexp0309sutter/.


Herb Sutter (http://www.gotw.ca/) is convener of the ISO C++ Standards committee, author of Exceptional C++ and More Exceptional C++, and Visual C++ architect for Microsoft.


