March 01, 2007
Bail, return, jump, or . . . throw?
By Dan Saks
The common techniques for handling run-time errors in C leave something to be desired, like maybe exception handling.
The exception handling machinery in C++ is designed to deal with program errors, such as a resource allocation failure or a value out of range. C++ exception handling provides a way to decouple error reporting from error handling. However, it's not designed to handle asynchronous events such as hardware interrupts. C++ exception handling is designed to address the limitations of error handling in C. In this installment, I'll look at some of the more common techniques for handling run-time errors in C programs and show you why these techniques leave something to be desired.
Error reporting via return values
(The macro ULONG_MAX is defined in the standard header <limits.h>. Macros ERANGE and errno are defined in <errno.h>.) If you want your C code to be reliable, you should write it so that it checks the return values from calls to all such functions. In some cases, adding code to check the return value isn't too burdensome. For example, a typical call to malloc such as:
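As a minimal sketch of what such checks might look like, the two functions below wrap a call to strtoul and a call to malloc. The names parse_ulong and copy_string are hypothetical, invented for this illustration; only strtoul, malloc, errno, ERANGE, and ULONG_MAX come from the standard library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse s as a decimal unsigned long.
   Returns 0 on success, -1 if no digits were found or the value overflowed. */
int parse_ulong(const char *s, unsigned long *out)
{
    char *end;
    unsigned long v;

    errno = 0;                      /* strtoul sets errno but never clears it */
    v = strtoul(s, &end, 10);
    if (end == s)
        return -1;                  /* no conversion was performed */
    if (v == ULONG_MAX && errno == ERANGE)
        return -1;                  /* value out of range */
    *out = v;
    return 0;
}

/* Hypothetical helper: copy s into freshly allocated memory.
   Returns NULL if malloc fails; the caller must free the result. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p == NULL)
        return NULL;                /* report the failure to the caller */
    memcpy(p, s, n);
    return p;
}
```

In both cases the check costs only a line or two, but every caller then has to check in turn, which is where the burden starts to add up.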
Now, suppose reality intrudes and function h has to check for a condition it can't handle. In that case, you might rewrite h so that it has a non-void return type, such as int, and appropriate return statements for error and normal returns. The function might look like:
int h(void)
{
    if (something really bad happened)
        return -1;
    // do h
    return 0;
}
Now g is responsible for heeding the return value of h and acting accordingly. However, more often than not, functions in the middle of a call chain, such as g and f, aren't in a position to handle the error. In that case, all they can do is look for error values coming from the functions they call and return them up the call chain. This means you must rewrite both f and g to have non-void return types along with appropriate return statements, as in:
int g(void)
{
    int status;
    if ((status = h()) != 0)
        return status;
    // do the rest of g
    return 0;
}

int f(void)
{
    int status;
    if ((status = g()) != 0)
        return status;
    // do the rest of f
    return 0;
}
Finally, the buck stops with main:
int main()
{
    if (f() != 0)
    {
        // handle the error
    }
    // do the rest of main
    return 0;
}
This approach--returning error codes via return values or arguments--effectively decouples error detection from error handling, but the costs can be high. Passing the error codes back up the call chain increases the size of both the source code and object code and slows execution time. It's been a while since I've used this approach to any extent, but my recollection is that the last time I did, it increased the non-comment source lines in my application by 15 to 20%, with a comparable increase in the object code. Other programmers have told me they've experienced increases to the tune of 30 to 40%.
This technique also increases coding effort and reduces readability. It's usually difficult to be sure that your code checks for all possible errors. Static analyzers, such as Lint, can tell you when you've ignored a function's return value, but as far as I know, they can't tell you when you've ignored the value of an argument passed by address. The consistent application of this technique can easily break down when the current maintainer of the code hands it off to a less experienced one.
Jumping
We could eliminate much of the error reporting code from the middle layers of the call chain by transferring control directly from the error-detection point to the error-handling point. Some languages let you do this with a non-local goto. If you could do this in C, it might look like:
int h(void)
{
    if (something really bad happened)
        goto error_handler;
    // do h
    return 0;
}
...
int main()
{
    f();
    // do the rest of main
    return 0;
error_handler:
    // handle the error
}
but you can't. It won't compile. However, you can do something similar using the facilities provided by the standard header <setjmp.h>. That header declares three components: a type named jmp_buf and two functions named setjmp and longjmp. (Actually, setjmp might be a function-like macro, but for the most part, you can think of it as a function.)
Calling setjmp(jb) stores a "snapshot" of the program's current calling environment into jmp_buf jb. That snapshot typically includes values such as the program counter, stack pointer, and possibly other CPU registers that characterize the current state of the calling environment. Subsequently, calling longjmp(jb, v) (I'll explain v shortly) effectively performs a non-local goto--it restores the calling environment from snapshot jb and causes the program to resume execution as if it were returning from the call to setjmp that took the snapshot previously. It's like déjà vu all over again.
The function calling setjmp can use setjmp's return value to determine whether the return from setjmp is really that, or actually a return from longjmp. When a function directly calls setjmp(jb) to take a snapshot, setjmp returns 0. A later call to longjmp(jb, v), where v is non-zero, causes program execution to resume as if the corresponding call to setjmp returned v. In the special case where v is equal to 0, longjmp(jb, v) causes setjmp to return 1, so that setjmp only returns 0 when called directly.
Listing 2 shows our hypothetical application with a longjmp from h to main. Since the longjmp bypasses g and f, these two functions no longer need to check for error return values, thus simplifying the source code and reducing the object code.
Using setjmp and longjmp eliminates most, if not all, of the clutter that accrues from checking and returning error codes. So what's not to like about them? The problem is that you must be extremely cautious with them to avoid accessing invalid data or mismanaging resources. A jmp_buf need not contain any more information than necessary to enable the program to resume execution as if it were returning from a setjmp call. It need not and probably will not preserve the state of any local or global objects, files, or floating-point status flags. Using setjmp and longjmp can easily lead to resource leaks.
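To make the return-value rules for setjmp and longjmp concrete, here is a minimal sketch of the call chain with a jump from h back to the top level. It rests on a few assumptions: the function run is a hypothetical stand-in for main so that the behavior can be exercised repeatedly, and the fail and code parameters are invented to trigger the jump on demand. The setjmp call appears as the controlling expression of a switch, which is one of the contexts the C standard permits.

```c
#include <setjmp.h>

jmp_buf jb;                 /* snapshot taken in run, used by h */

/* h detects the error and jumps straight back to run, passing code. */
void h(int fail, int code)
{
    if (fail)
        longjmp(jb, code);  /* does not return */
    /* do h */
}

/* g and f no longer check or return any status. */
void g(int fail, int code) { h(fail, code); /* do the rest of g */ }
void f(int fail, int code) { g(fail, code); /* do the rest of f */ }

/* run stands in for main. Returns 0 on normal completion, otherwise
   a value derived from what setjmp appeared to return after the longjmp. */
int run(int fail, int code)
{
    switch (setjmp(jb))
    {
    case 0:                 /* direct call: snapshot just taken */
        f(fail, code);
        return 0;
    case 1:                 /* longjmp(jb, 1), or longjmp(jb, 0) */
        return 1;
    case 2:                 /* longjmp(jb, 2) */
        return 2;
    default:                /* any other non-zero code */
        return -1;
    }
}
```

With these definitions, run(0, 2) completes normally and yields 0, run(1, 2) yields 2, and run(1, 0) yields 1 because a longjmp with 0 is reported as 1. The sketch is safe only because nothing between the setjmp and the longjmp owns a resource.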
For example, suppose functions g and f each allocate and deallocate a resource, as in:
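The sketch below is one way that situation might look, offered as an illustration under stated assumptions. The counters open_files and live_blocks and the driver run (standing in for main) are hypothetical additions that make the leak observable, and tmpfile stands in for whatever file f would really open.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

jmp_buf jb;
int open_files;             /* hypothetical: FILEs opened but not yet closed */
int live_blocks;            /* hypothetical: blocks allocated but not yet freed */

void h(int fail)
{
    if (fail)
        longjmp(jb, 1);     /* jump past g and f, straight to run */
    /* do h */
}

void g(int fail)
{
    char *p = malloc(64);
    if (p == NULL)
        return;
    ++live_blocks;
    h(fail);
    free(p);                /* skipped if h calls longjmp */
    --live_blocks;
}

void f(int fail)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return;
    ++open_files;
    g(fail);
    fclose(fp);             /* skipped if h calls longjmp */
    --open_files;
}

/* run stands in for main. Returns 1 if control arrived via longjmp. */
int run(int fail)
{
    open_files = live_blocks = 0;
    if (setjmp(jb) != 0)
        return 1;           /* f and g never got to clean up */
    f(fail);
    return 0;
}
```

Calling run(0) leaves both counters at zero. Calling run(1) leaves each counter at one, provided the file and the memory were actually acquired.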
A call to longjmp from h transfers control to main, completely bypassing the remaining portions of f and g. When this happens, f misses the opportunity to close its FILE, and g misses the opportunity to free its allocated memory.