Unmanaged Pointers in C++: Parameter Evaluation, auto_ptr, and Exception Safety

By Herb Sutter, December 01, 2002


Article adapted from Items 20 and 21 from the book More Exceptional C++ by H. Sutter, © 2002, Pearson Education, Inc. Reprinted by permission of Pearson Education, Inc. All rights reserved.

The Greek philosopher Socrates taught by asking his students questions — questions designed to guide them and help them draw conclusions from what they already knew, and to show them how the things they were learning related to each other and to their existing knowledge. This method has become so famous that we now call it the "Socratic method." From our point of view as students, Socrates' approach involves us, makes us think, and helps us relate and apply what we already know to new information. More Exceptional C++ [1] uses a Socratic problem-solution format to teach how to make effective use of Standard C++ and its standard library with a particular focus on sound software engineering in modern C++.

In this excerpt from More Exceptional C++ (Items 20 and 21, drawn from the section on "Exception Safety Issues and Techniques"), our focus is on a subtle issue related to parameter evaluation, why just using a smart pointer like auto_ptr doesn't entirely solve it, and what to do about it. The issue is subtle, but the analysis and ultimate solution are intriguing and simple, respectively, and something working C++ developers should know about.

Problem 1: Parameter Evaluation

1. In each of the following statements, what can you say about the order of evaluation of the functions f, g, and h and the expressions expr1 and expr2? Assume that expr1 and expr2 do not contain more function calls.
// Example 1-1(a)
//
f( expr1, expr2 );

// Example 1-1(b)
//
f( g( expr1 ), h( expr2 ) );
2. In your travels through the dusty corners of your company's code archives, you find the following code fragment:
// Example 1-2
//

// In some header file:
void f( T1*, T2* );

// In some implementation file:
f( new T1, new T2 );
Does this code have any potential exception-safety problems? Explain.

Solution

Recap: Evaluation Orders and Disorders

1. In each of the following statements, what can you say about the order of evaluation of the functions f, g, and h and the expressions expr1 and expr2? Assume that expr1 and expr2 do not contain more function calls.

Ignoring threads, which are not mentioned in the C++ Standard, the answer to the first question hinges on the following basic rules:

All of a function's arguments must be fully evaluated before the function is called. This includes the completion of any side effects of expressions used as function arguments.

Once the execution of a function begins, no expressions from the calling function begin or continue to be evaluated until execution of the called function has completed. Function executions never interleave with each other.

Expressions used as function arguments may generally be evaluated in any order, including interleaved, except as otherwise restricted by the other rules.

Given those rules, let's see what happens in our opening examples:
// Example 1-1(a)
//
f( expr1, expr2 );
In Example 1-1(a), all we can say is that both expr1 and expr2 must be evaluated before f() is called.

That's it. The compiler may choose to perform the evaluation of expr1 before, after, or interleaved with the evaluation of expr2. There are enough people who find this surprising that it comes up as a regular question on the newsgroups, but it's just a direct result of the C and C++ rules about sequence points.
// Example 1-1(b)
//
f( g( expr1 ), h( expr2 ) );
In Example 1-1(b), the functions and expressions may be evaluated in any order that respects the following rules:

expr1 must be evaluated before g() is called.

expr2 must be evaluated before h() is called.

Both g() and h() must complete before f() is called.

The evaluations of expr1 and expr2 may be interleaved with each other, but nothing may be interleaved with any of the function calls. For example, no part of the evaluation of expr2 nor the execution of h() may occur from the time g() begins until it ends.

That's it. For example, this means that any one or more of the following are possible:

Either g() or h() could be called first.

Evaluation of expr1 could begin, then be interrupted by h() being called, and then complete. (Likewise for expr2 and g().)
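To see these rules in action, here is a minimal sketch of Example 1-1(b). The trace log and the bodies of f, g, and h are assumptions made up for illustration; the article leaves them unspecified. The sketch records the order in which the three functions actually run on your compiler.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical trace log; not part of the original examples.
std::vector<std::string> trace;

int g( int x ) { trace.push_back( "g" ); return x + 1; }
int h( int x ) { trace.push_back( "h" ); return x * 2; }
int f( int a, int b ) { trace.push_back( "f" ); return a + b; }

// Runs Example 1-1(b) once and returns the order in which the calls ran.
std::vector<std::string> run_example()
{
  trace.clear();
  int result = f( g( 1 ), h( 2 ) );
  assert( result == 6 );  // g(1) == 2 and h(2) == 4, whatever the order
  (void)result;
  return trace;
}
```

Whichever compiler you try, "f" should be the last entry in the returned log. The first two entries may be "g" then "h" or "h" then "g", and portable code must not depend on which.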

Some Function Call Exception-Safety Problems

2. In your travels through the dusty corners of your company's code archives, you find the following code fragment:
// Example 1-2
//

// In some header file:
void f( T1*, T2* );

// In some implementation file:
f( new T1, new T2 );
Does this code have any potential exception-safety problems? Explain.

Yes, there are several potential exception safety problems.

Brief recap: an expression such as new T1 is called, simply enough, a new-expression. Recall what a new-expression really does (ignoring in-place and array forms for simplicity, because they're not very relevant here):

It allocates memory.

It constructs a new object in that memory.

If the construction fails because of an exception, the allocated memory is freed.

So each new-expression is essentially a series of two function calls: one call to operator new() (either the global one, or one provided by the type of the object being created), and then a call to the constructor.
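The third point, that memory is freed when construction fails, can be observed with a small sketch. Widget and its counters are hypothetical names introduced only for this illustration. The class supplies its own operator new and operator delete so that we can count the calls.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical type that counts calls to its own operator new/delete.
struct Widget
{
  static int allocations;
  static int deallocations;

  explicit Widget( bool fail )
  {
    if( fail ) throw std::runtime_error( "Widget construction failed" );
  }

  static void* operator new( std::size_t size )
  {
    ++allocations;
    return ::operator new( size );
  }

  static void operator delete( void* p )
  {
    ++deallocations;
    ::operator delete( p );
  }
};

int Widget::allocations = 0;
int Widget::deallocations = 0;

// Returns true if the constructor threw.
bool construction_throws()
{
  try
  {
    Widget* w = new Widget( true );  // allocate, then construct (throws)
    delete w;                        // not reached
    return false;
  }
  catch( const std::runtime_error& )
  {
    return true;
  }
}
```

After construction_throws() returns, both counters should read 1: the new-expression itself released the memory it had allocated, without any help from the caller.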

For Example 1-2, consider what happens if the compiler decides to generate code as follows:

1. Allocate memory for the T1.

2. Construct the T1.

3. Allocate memory for the T2.

4. Construct the T2.

5. Call f().

The problem is this: if either step 3 or step 4 fails because of an exception, the C++ Standard does not require that the T1 object be destroyed and its memory deallocated. This is a classic memory leak, and clearly Not a Good Thing.
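We can't force a compiler to pick this order, but we can spell the same sequence out by hand. The following sketch does that under some assumptions: T1 and T2 are hypothetical stand-ins, T2's constructor always throws, and T1 keeps a live-instance count plus a back-door pointer that exists only so the demonstration can clean up after itself.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for T1 and T2. T1 counts its live instances and
// remembers the last one created (a demo-only back door for cleanup).
struct T1
{
  static int live;
  static T1* last;
  T1() { ++live; last = this; }
  ~T1() { --live; }
};
int T1::live = 0;
T1* T1::last = nullptr;

struct T2
{
  T2() { throw std::runtime_error( "T2 construction failed" ); }
};

void f( T1*, T2* ) { }

// Spells out steps 1-5 in order, as one sequence the compiler may choose
// for f( new T1, new T2 ). Returns the number of T1 objects left alive
// once the exception has propagated.
int leaked_t1_count()
{
  try
  {
    T1* p1 = new T1;  // steps 1-2: allocate and construct the T1
    T2* p2 = new T2;  // steps 3-4: the T2 constructor throws
    f( p1, p2 );      // step 5: never reached
  }
  catch( const std::runtime_error& )
  {
    // p1 is out of scope here; ordinary code has no way to delete the T1.
  }
  int leaked = T1::live;
  delete T1::last;    // demo-only cleanup through the back door
  T1::last = nullptr;
  return leaked;
}
```

The function reports one leaked T1. In the real one-line call there is no back-door pointer, so the object would simply be lost.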

Another possible sequence of events is the following:

1. Allocate memory for the T1.

2. Allocate memory for the T2.

3. Construct the T1.

4. Construct the T2.

5. Call f().

This sequence has, not one, but two exception-safety problems with different effects:

a) If step 3 fails because of an exception, then the memory allocated for the T1 object is automatically deallocated (step 1 is undone), but the Standard does not require that the memory allocated for the T2 object be deallocated. The memory is leaked.

b) If step 4 fails because of an exception, then the T1 object has been allocated and fully constructed, but the Standard does not require that it be destroyed and its memory deallocated. The T1 object is leaked.
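Case (a) can be sketched the same way, by writing out the allocation and construction steps separately. Everything here is an assumption made for illustration: counted_alloc and counted_free are hypothetical helpers wrapping the global operator new and operator delete, T1's constructor always throws, and the demonstration keeps a handle on the T2 block only so that it can release it at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

int outstanding = 0;  // raw blocks allocated but not yet freed

// Hypothetical counting wrappers around the global allocation functions.
void* counted_alloc( std::size_t n )
{
  void* p = ::operator new( n );
  ++outstanding;
  return p;
}

void counted_free( void* p )
{
  ::operator delete( p );
  --outstanding;
}

struct T1 { T1() { throw std::runtime_error( "T1 construction failed" ); } };
struct T2 { };

// Hand-written version of the second sequence, up to the failure in step 3.
// Returns the number of raw blocks still outstanding after the exception.
int leaked_block_count()
{
  void* m2 = nullptr;  // demo-only handle; generated code keeps no such thing
  try
  {
    void* m1 = counted_alloc( sizeof( T1 ) );  // step 1
    m2 = counted_alloc( sizeof( T2 ) );        // step 2
    try
    {
      new( m1 ) T1;                            // step 3: throws
    }
    catch( ... )
    {
      counted_free( m1 );  // the T1 new-expression undoes its own step 1
      throw;
    }
    // steps 4 and 5 are never reached
  }
  catch( const std::runtime_error& )
  {
  }
  int leaked = outstanding;  // the T2 block from step 2
  counted_free( m2 );        // demo-only cleanup
  return leaked;
}
```

One block is still outstanding after the exception: the memory from step 2, which nothing in the original expression is responsible for releasing.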

"Hmm," you might wonder, "then why does this exception-safety loophole exist at all? Why doesn't the Standard just prevent the problem by requiring compilers to Do the Right Thing when it comes to cleanup?"

Following the spirit of C in the matter of efficiency, the C++ Standard allows the compiler some latitude with the order of evaluation of expressions, because this allows the compiler to perform optimizations that might not otherwise be possible. To permit this, the expression evaluation rules are specified in a way that is not exception-safe, so if you want to write exception-safe code you need to know about, and avoid, these cases. Fortunately, you can do just that and prevent this problem. Perhaps a managed pointer like auto_ptr could help? We'll see the answer in the following part.

Problem 2: What About auto_ptr
1. As you continue to root through the archives, you see that someone must not have liked Example 1-2 because later versions of the files in question were changed as follows:
// Example 2-1
//

// In some header file:
void f( auto_ptr<T1>, auto_ptr<T2> );

// In some implementation file:
f( auto_ptr<T1>( new T1 ), auto_ptr<T2>( new T2 ) );
What improvements does this version offer over Example 1-2, if any? Do any exception-safety problems remain? Explain.

2. Demonstrate how to write an auto_ptr_new facility that solves the safety problems in Question 1 and can be invoked as follows:
// Example 2-2
//

// In some header file:
void f( auto_ptr<T1>, auto_ptr<T2> );

// In some implementation file:
f( auto_ptr_new<T1>(), auto_ptr_new<T2>() );
Solution

1. As you continue to root through the archives, you see that someone must not have liked Example 1-2 because later versions of the files in question were changed as follows:
// Example 2-1
//

// In some header file:
void f( auto_ptr<T1>, auto_ptr<T2> );

// In some implementation file:
f( auto_ptr<T1>( new T1 ), auto_ptr<T2>( new T2 ) );
What improvements does this version offer over Example 1-2, if any? Do any exception-safety problems remain? Explain.

This code attempts to "throw [2] auto_ptr at the problem." Many people believe that a smart pointer like auto_ptr is an exception-safety panacea, a touchstone or amulet that by its mere presence somewhere nearby can help ward off compiler indigestion.

It is not. Nothing has changed. Example 2-1 is still not exception-safe, for exactly the same reasons as before.

Specifically, the problem is that the resources are safe only if they really make it into a managing auto_ptr, but the same problems already noted can still occur before either auto_ptr constructor is ever reached. This is because both of the two problematic execution orders mentioned earlier are still possible, but now with the auto_ptr constructors tacked onto the end before f(). For one example:

1. Allocate memory for the T1.

2. Construct the T1.

3. Allocate memory for the T2.

4. Construct the T2.

5. Construct the auto_ptr<T1>.

6. Construct the auto_ptr<T2>.

7. Call f().

In the above case, the same problems are still present if either of steps 3 or 4 throws. Similarly:

1. Allocate memory for the T1.

2. Allocate memory for the T2.

3. Construct the T1.

4. Construct the T2.

5. Construct the auto_ptr<T1>.

6. Construct the auto_ptr<T2>.

7. Call f().

Again, the same problems are present if either of steps 3 or 4 throws.

Fortunately, though, this is not a problem with auto_ptr; auto_ptr is being used the wrong way, that's all. In a moment, we'll see several ways to use it better.
Aside: A Non-Solution

Note that the following is not a solution:
// In some header file:
void f( auto_ptr<T1> = auto_ptr<T1>( new T1 ),
        auto_ptr<T2> = auto_ptr<T2>( new T2 ) );

// In some implementation file:
f();
Why is this code not a solution? Because it's identical to Example 2-1 in terms of expression evaluation. Default arguments are considered to be created in the function call expression, even though they're written somewhere else entirely (in the function declaration).
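The point that a default argument is evaluated as part of each call, rather than once at the declaration, can be checked with a small sketch. The counter and next_value() are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

int evaluations = 0;  // hypothetical counter

int next_value() { return ++evaluations; }

// The default argument is written here, in the declaration...
int f( int value = next_value() ) { return value; }

// ...but it is evaluated anew each time a call omits the argument.
int call_twice()
{
  int first = f();
  int second = f();
  return second - first;
}
```

Each call to f() runs next_value() again, so the two calls see different values. A new-expression written as a default argument behaves the same way: it runs inside the calling expression and is subject to the same ordering rules.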

A Limited Solution

2. Demonstrate how to write an auto_ptr_new facility that solves the safety problems in Question 1 and can be invoked as follows:
// Example 2-2
//

// In some header file:
void f( auto_ptr<T1>, auto_ptr<T2> );

// In some implementation file:
f( auto_ptr_new<T1>(), auto_ptr_new<T2>() );
The simplest solution is to provide a function template like the following:
// Example 2-2(a): Partial solution
//
template<typename T>
auto_ptr<T> auto_ptr_new()
{
  return auto_ptr<T>( new T );
}
This solves the exception-safety problems. No sequence of generated code can cause resources to be leaked, because now all we have is two functions, and we know that one must be executed entirely before the other. Consider the following evaluation order:

1. Call auto_ptr_new<T1>().

2. Call auto_ptr_new<T2>().

If step 1 throws, there are no leaks because the auto_ptr_new() template is itself strongly exception-safe.

If step 2 throws, then is the temporary auto_ptr<T1> created by step 1 guaranteed to be cleaned up? Yes, it is. One might wonder: isn't this pretty much the same as the new T1 object created in the corresponding case in Example 1-2, which isn't correctly cleaned up? No, this time it's not quite the same, because here the auto_ptr<T1> is actually a temporary object, and cleanup of temporary objects is correctly specified in the Standard. From the Standard, in 12.2/3:

Temporary objects are destroyed as the last step in evaluating the full-expression that (lexically) contains the point where they were created. This is true even if that evaluation ends in throwing an exception.

But Example 2-2(a) is a limited solution: it works only with default construction, so it cannot be used if a given type T has no default constructor, or if you don't want to use the default constructor. A more general solution is still needed.
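The cleanup guarantee for the helper-function approach can be checked with a sketch. One loud assumption: std::auto_ptr was removed from the standard library in C++17, so this sketch substitutes std::unique_ptr and calls the helper owner_new, a hypothetical name. T1 and T2 are stand-ins, with T2's constructor always throwing.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins; T1 counts its live instances.
struct T1
{
  static int live;
  T1() { ++live; }
  ~T1() { --live; }
};
int T1::live = 0;

struct T2
{
  T2() { throw std::runtime_error( "T2 construction failed" ); }
};

// Same shape as auto_ptr_new in Example 2-2(a), with std::unique_ptr
// substituted for auto_ptr (an assumption of this sketch).
template<typename T>
std::unique_ptr<T> owner_new()
{
  return std::unique_ptr<T>( new T );
}

void f( std::unique_ptr<T1>, std::unique_ptr<T2> ) { }

// Returns the number of T1 objects still alive after the call fails.
int live_t1_after_failed_call()
{
  try
  {
    f( owner_new<T1>(), owner_new<T2>() );
  }
  catch( const std::runtime_error& )
  {
  }
  return T1::live;
}
```

The result is zero whichever helper call runs first. If the T1 is created first, the object that owns it is destroyed as the exception propagates; if the T2 call runs first, no T1 is ever created.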

Generalizing the auto_ptr_new() Solution

As pointed out by Dave Abrahams, we can extend the solution to support non-default constructors by providing a family of overloaded function templates:
// Example 2-2(b): Improved solution
//
template<typename T>
auto_ptr<T> auto_ptr_new()
{
  return auto_ptr<T>( new T );
}

template<typename T, typename Arg1>
auto_ptr<T> auto_ptr_new( const Arg1& arg1 )
{
  return auto_ptr<T>( new T( arg1 ) );
}

template<typename T, typename Arg1, typename Arg2>
auto_ptr<T> auto_ptr_new( const Arg1& arg1,
                          const Arg2& arg2 )
{
  return auto_ptr<T>( new T( arg1, arg2 ) );
}

// etc.
Now auto_ptr_new fully and naturally supports non-default construction.
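Here is a sketch of how the overload family might be used. As an assumption for the sake of compiling on current toolchains, it substitutes std::unique_ptr for auto_ptr (which was removed in C++17) and names the helper owner_new; Account is a hypothetical type with zero-, one-, and two-argument constructors.

```cpp
#include <cassert>
#include <memory>
#include <string>

// The overload family from Example 2-2(b), with std::unique_ptr standing
// in for auto_ptr (an assumption of this sketch).
template<typename T>
std::unique_ptr<T> owner_new()
{
  return std::unique_ptr<T>( new T );
}

template<typename T, typename Arg1>
std::unique_ptr<T> owner_new( const Arg1& arg1 )
{
  return std::unique_ptr<T>( new T( arg1 ) );
}

template<typename T, typename Arg1, typename Arg2>
std::unique_ptr<T> owner_new( const Arg1& arg1,
                              const Arg2& arg2 )
{
  return std::unique_ptr<T>( new T( arg1, arg2 ) );
}

// Hypothetical type used only to exercise each overload.
struct Account
{
  std::string owner;
  int balance;

  Account() : owner( "nobody" ), balance( 0 ) { }
  explicit Account( const std::string& o ) : owner( o ), balance( 0 ) { }
  Account( const std::string& o, int b ) : owner( o ), balance( b ) { }
};
```

A call such as owner_new<Account>( std::string( "ann" ), 10 ) names T explicitly and lets the compiler deduce the argument types. Because the parameters are taken by reference to const, a constructor that needs a non-const reference argument would not fit this family as written.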

A Better Solution

Although auto_ptr_new() is nice, is there any way we could have avoided all the exception-safety problems without writing such helper functions? Could we have avoided the problems with better coding standards? Yes, and here is one possible standard that would have eliminated the problem: Never allocate resources (for example, via new) in the same expression as any other code that could throw an exception. This applies even if the new'd resource will immediately be managed (for example, passed to an auto_ptr constructor) in the same expression.

In the Example 2-1 code, the way to satisfy this guideline is to move one of the temporary auto_ptrs into a separate named variable:
// Example 2-1(a): A solution
//

// In some header file:
void f( auto_ptr<T1>, auto_ptr<T2> );

// In some implementation file:
{
  auto_ptr<T1> t1( new T1 );
  f( t1, auto_ptr<T2>( new T2 ) );
}
This satisfies the first guideline: although we are still allocating a resource inside the call expression, it can't be leaked by an exception, because it is not created in the same expression as any other code that could throw [3].

Here is another possible coding standard, which is even simpler and easier to get right (and easier to catch in code reviews): perform every explicit resource allocation (for example, new) in its own code statement, which immediately gives the new'd resource to a manager object (for example, auto_ptr).

In Example 2-1, the way to satisfy the second alternative guideline is to move both of the temporary auto_ptrs into separate named variables:
// Example 2-1(b): A simpler solution
//

// In some header file:
void f( auto_ptr<T1>, auto_ptr<T2> );

// In some implementation file:
{
  auto_ptr<T1> t1( new T1 );
  auto_ptr<T2> t2( new T2 );
  f( t1, t2 );
}
This satisfies the second guideline, and it required a lot less thought to get it right. Each new resource is created in its own statement and is immediately given to a managing object.
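The one-allocation-per-statement pattern can be exercised with a sketch. As before, std::unique_ptr is substituted for auto_ptr as an assumption (auto_ptr was removed in C++17), which is also why the call uses std::move where the auto_ptr version passes t1 and t2 directly. T1 and T2 are hypothetical stand-ins, with T2's constructor always throwing.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical stand-ins; T1 counts its live instances.
struct T1
{
  static int live;
  T1() { ++live; }
  ~T1() { --live; }
};
int T1::live = 0;

struct T2
{
  T2() { throw std::runtime_error( "T2 construction failed" ); }
};

void f( std::unique_ptr<T1>, std::unique_ptr<T2> ) { }

// Returns the number of T1 objects still alive after the setup fails.
int live_t1_after_failed_setup()
{
  try
  {
    std::unique_ptr<T1> t1( new T1 );  // owned before anything else can throw
    std::unique_ptr<T2> t2( new T2 );  // throws; t1's destructor frees the T1
    f( std::move( t1 ), std::move( t2 ) );
  }
  catch( const std::runtime_error& )
  {
  }
  return T1::live;
}
```

Because the two allocations sit in separate statements, their order is fixed, and the T1 is already owned by t1 when the T2 constructor throws. No T1 is left alive afterward.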

Summary

My recommendation is:

Guideline: Perform every explicit resource allocation (for example, new) in its own code statement, which immediately gives the new'd resource to a manager object (for example, auto_ptr).

This guideline is easy to understand and remember; it neatly avoids all the exception-safety problems in the original problem; and by mandating the use of manager objects, it helps to avoid many other exception-safety problems as well. This guideline is a good candidate for inclusion in your team's coding standards.

Acknowledgments

This Item was prompted by a discussion thread on comp.lang.c++.moderated. This solution draws on observations presented by James Kanze, Steve Clamage, and Dave Abrahams in that and other threads, and in private correspondence.

References and Notes

[1] H. Sutter. More Exceptional C++ (Addison-Wesley, 2002).

[2] Pun intended.

[3] I'm being deliberately, but only slightly, fuzzy, because although the body of f() is included in the expression evaluation, we don't care whether it throws.

Herb Sutter (<www.gotw.ca>) is convener of the ISO C++ standards committee, author of the acclaimed books Exceptional C++ and More Exceptional C++, and one of the instructors of The C++ Seminar (<www.gotw.ca/cpp_seminar>). In addition to his independent writing and consulting, he is also C++ community liaison for Microsoft.

