April 01, 2001
Adding Exception Testing to Unit Tests
Ben Stanley
Introduction
Much water has gone under the bridge since Tom Cargill expressed reservations about the reliability of code that uses exception handling [1]. Tom pointed out that naive programming of exception handling typically leads to resource leaks and incoherent object states. Exceptions should arise only under extreme conditions such as low memory or lack of disk space, making exception-related problems difficult to reproduce.
Testing is no substitute for writing robust and complete code. Before attempting to write code that must be exception safe, you should read Herb Sutter's excellent book, Exceptional C++, for some robust exception handling techniques [2, Items 8-19]. Once the code is written, a solid testing regime can help flush out any remaining bugs, and increase confidence that the code operates correctly. Rigorous testing methods have been developed for normal execution paths [3, Chapter 25]. This article describes a simple method of adding exhaustive testing of the exception paths to the test suite.
Current testing methodology seeks to construct a set of tests to verify the integrity of a unit of software. Usually, this unit is a class, although it could be a group of collaborating classes. For the test to be thorough, there must be at least one test to exercise each path of execution through every function or class method in the unit. Thus, there are at least as many tests in the test suite as there are normal execution paths through the software, and usually more. If you seek to test exceptional paths in addition to normal execution paths, you will have to add more tests. How many? Consider the following code snippet:
String EvaluateSalaryAndReturnName(
  Employee e )
{
  if( e.Title()=="CEO" or
      e.Salary() > 100000 )
  {
    cout << e.First() << " "
         << e.Last()
         << " is overpaid" << endl;
  }
  return e.First()+" "+e.Last();
}
Herb Sutter claims that there are no fewer than 20 possible exceptional code paths through this function, compared to the three normal paths [2, Item 18]. That's a lot of extra paths to test! Even worse, it is less than obvious what these paths are, or how to cause them to execute. Luckily, it is possible to test these 20 extra exceptional paths using only the test suite for the three normal paths, and some extra template code, which is shared by all your tests.
A simplified outline of the method is as follows:
Design of the Exception Testing System
The design of the exception testing system is shown in Figure 1. Boxes indicate code modules. Boxes with single borders are generic and do not change for testing separate systems; these modules are supplied in the online source code that accompanies this article (see www.cuj.com/code). The boxes with double borders indicate code that must be supplied by the user. The arrows indicate function call dependencies between the code modules. The dotted arrow indicates a function call dependency that is usually not necessary.
The central idea in the design is the enumeration of the exception points. This is done by placing a call to the exception point counter class at each point in the code where an exception may directly arise. The exception counter class maintains two counts: the number of the current exception point, sExceptionPointCount, and the number of the exception point where a test exception should be thrown, sThrowCount. The code under test calls function CouldThrow at each point where an exception could be thrown. The CouldThrow method will throw an exception if it is at the right exception point. The test harness driver has a loop which enumerates all the exception points in turn, by first calling SetThrowCount and then calling the unit test in the test harness.
Since most exception points are due to memory management problems, we can take a giant shortcut by replacing the default implementation of operator new and operator delete with debugging versions that have been modified to call CouldThrow every time new is called. In many cases, this shortcut eliminates the need to write a debugging version of the classes used by the code under test.
Some exception points will be impractical to instrument with a call to CouldThrow. Such exception points may alternatively be exercised by designing the test suite so that it sets up the conditions that cause the exception as one of the test cases. This will still cause the exception code path to be tested. Any remaining exception points that are not exercised by calling CouldThrow or explicitly tested by the test suite will not be tested and are deemed to be outside the domain of the test.
Implementation
The TestDriver Function
The TestDriver template function causes all the exception points in a test to actually throw an exception, one at a time. It requires a functor class containing one method with the signature static void DoTest(). This function is assumed to perform a sequence of tests to exercise one aspect of a class's functionality:
template<class Func>
void TestDriver()
{
  int i = 0;
  do {
    // ...
    Counter::SetThrowCount( ++i );
    try {
      Func::DoTest();
    }
    catch( TestException& e ){}
    catch( bad_alloc& e ) {}
    catch( ... ) {
      Counter::Fail(
        "Bad Exception");
    }
    // ...
  } while( Counter::HasThrown() );
  Counter::SetThrowCount( -1 );
}
A more complete version of this function is shown in Listing
1 (Counter.h). The loop enumerates the exception points, using i
as the counter. The call to SetThrowCount at the top of the loop instructs
the Counter class to throw an exception at the specified exception point.
Testing starts at the first exception point. Then, the function under test is
called, inside a try/catch block. It is expected that the function will
throw an exception. The Counter::CouldThrow method throws a TestException.
However, operator new translates this into a bad_alloc exception
for testing purposes, so we have to be prepared to deal with that here. Any
other kind of exception that is triggered by bad inputs should be caught by
the test harness, so any other exception caught here causes a test failure.
This should be changed if it is unsuitable for your application.
An interesting point here is the loop termination condition. There must be a finite number of exception points in a test function. When they have all been tested, the test function called by TestDriver will return normally, testing the normal path of execution for that test as a by-product. In this case, no exception has been thrown. This condition is detected by the Counter::HasThrown method, which is used to terminate the loop.
At the end of the exception tests, the throw point is set to -1 to prevent exceptions from being thrown in the testing program.
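The termination condition can be illustrated with a minimal, self-contained sketch. This is not the code from Listing 1: the Counter below is a cut-down stand-in whose sHasThrown flag is an assumption about how HasThrown might be implemented, ToyTest is a hypothetical test with exactly three exception points, and CountPaths is a hypothetical variant of TestDriver that returns the number of paths it executed.

```cpp
#include <cassert>

struct TestException {};

// Cut-down stand-in for the article's Counter class (assumed layout).
class Counter {
    static int sExceptionPointCount;  // exception points seen in this run
    static int sThrowCount;           // exception point that should throw
    static bool sHasThrown;           // did the current run throw?
public:
    static void CouldThrow() {
        if (++sExceptionPointCount == sThrowCount) {
            sHasThrown = true;
            throw TestException();
        }
    }
    static void SetThrowCount(int c) {
        sExceptionPointCount = 0;
        sThrowCount = c;
        sHasThrown = false;
    }
    static bool HasThrown() { return sHasThrown; }
};
int Counter::sExceptionPointCount = 0;
int Counter::sThrowCount = -1;
bool Counter::sHasThrown = false;

// Hypothetical test with exactly three exception points.
struct ToyTest {
    static void DoTest() {
        Counter::CouldThrow();
        Counter::CouldThrow();
        Counter::CouldThrow();
    }
};

// Same loop shape as TestDriver, but returns the number of paths run.
template <class Func>
int CountPaths() {
    int i = 0;
    do {
        Counter::SetThrowCount(++i);
        try {
            Func::DoTest();
        } catch (TestException&) {
            // expected: exception point i threw
        }
    } while (Counter::HasThrown());
    Counter::SetThrowCount(-1);  // disable test exceptions again
    return i;
}

// Three exceptional paths plus the final run that returns normally.
void DemoCountPaths() { assert(CountPaths<ToyTest>() == 4); }
```

The fourth pass asks for a throw at exception point 4, which does not exist, so DoTest returns normally, HasThrown reports false, and the loop ends.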
The parts of Listing 1 marked with an ellipsis (...)
contain checks for memory leaks that may have occurred during a test. It is
assumed that each test is self-contained, and that no memory should remain allocated
after the test. You can also walk the heap and check that everything is okay,
if your debugging memory manager supports such things.
The ... areas also contain code to print out the number of exception paths that were tested.
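A leak check of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the memory manager from the online listings: gLiveBlocks and gAllocsUntilFail are hypothetical globals, the failure countdown stands in for the call to Counter::CouldThrow, and LeakyOnFailure is a deliberately faulty function invented for the demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bookkeeping, standing in for the debugging memory manager.
static long gLiveBlocks = 0;        // blocks allocated and not yet freed
static long gAllocsUntilFail = -1;  // fail the Nth allocation; -1 = never
static void* volatile gSink;        // keeps the optimizer from eliding new

// Replacement global operator new: counts blocks and fails on request.
void* operator new(std::size_t size) {
    if (gAllocsUntilFail > 0 && --gAllocsUntilFail == 0)
        throw std::bad_alloc();     // simulated out-of-memory condition
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    ++gLiveBlocks;
    return p;
}

void operator delete(void* p) noexcept {
    if (p) { --gLiveBlocks; std::free(p); }
}
void operator delete(void* p, std::size_t) noexcept { ::operator delete(p); }

// Deliberately leaky: if the second allocation fails, a is never freed.
void LeakyOnFailure() {
    int* a = new int(1);
    gSink = a;
    int* b = new int(2);  // may throw bad_alloc
    gSink = b;
    delete b;
    delete a;
}

// Returns how many blocks leaked when allocation number n is made to fail.
long LeakedWhenAllocFails(int n) {
    long before = gLiveBlocks;
    gAllocsUntilFail = n;
    try { LeakyOnFailure(); } catch (std::bad_alloc&) {}
    gAllocsUntilFail = -1;
    return gLiveBlocks - before;
}

void DemoLeakCheck() { assert(LeakedWhenAllocFails(2) == 1); }
```

Comparing the live-block count before and after each pass is what lets a driver report a leak against a specific exception point.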
Test Harness Functions
The test harness functions are supplied as functor objects. This allows automatic reporting of the name of the test set through the use of RTTI. Each test harness functor should have the following form:
class TestDefaultConstruct {
public:
  static void DoTest() {
    String s;
    Counter::Test( s == "", "correct value" );
  }
};
The above code just tests the default constructor of a String class. It does so by constructing a String using the default constructor, and then testing its value against the expected value. The static function Counter::Test accepts a Boolean result from a test as its first argument: true counts as a pass, false counts as a fail. The second argument is a description of the test. This is printed in the case of a failure, to help in diagnosing what happened.
A proper test harness will have one of these classes to test each aspect of functionality of your class or module. You could incorporate test data from a file, or just hard-code the expected results from simple hand-worked examples. The key point is to compare the actual state of the class against what you expect it should be, given the functions you called. If the test harness detects anything amiss, it will let you know.
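How a driver might obtain the test name from RTTI can be sketched in a few lines. TestName is a hypothetical helper, not part of the article's listings, and the DoTest body is left empty here. The string returned by typeid(...).name() is implementation-defined, so the sketch only checks that the class name appears somewhere in it.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

class TestDefaultConstruct {
public:
    static void DoTest() { /* the real checks would go here */ }
};

// Hypothetical helper: the implementation-defined RTTI name of a test class.
template <class Func>
std::string TestName() {
    return typeid(Func).name();
}

void DemoTestName() {
    // g++ typically prefixes the length of the name to the name itself.
    std::string name = TestName<TestDefaultConstruct>();
    assert(name.find("TestDefaultConstruct") != std::string::npos);
}
```

Because the name comes from the template argument, each new test class is reported under its own name without any extra registration code.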
Instrumenting the Code Under Test
This section refers to the code that you wish to demonstrate operates correctly in the presence of exceptions. You can test individual functions, simple classes, or template classes. If you are testing a class, it is useful to add an extra debugging method, named Consistent, that checks whether the data members of the class are in a consistent state. This is also known as checking the class invariants. For example, if a stack class has a null pointer to its element array, but its size member says it has five elements, then the class state is inconsistent.
Instrumenting Classes
If your code under test uses any other classes, you may need to create debugging versions of them that call Counter::CouldThrow at every exception point. However, if the only exception points are calls to the standard operator new and operator delete, then you don't have to change anything. By using overloaded versions of operator new and operator delete, the required calls to CouldThrow can be obtained without changing the code under test. Other exception points will have to be treated on an individual basis.
Exception Point Counting
The exception point counting code is at the bottom of the call graph diagram (Figure 1). It is really very simple:
class Counter {
private:
  static int sExceptionPointCount;
  static int sThrowCount;
  // ...
public:
  static void CouldThrow() {
    if( ++sExceptionPointCount ==
        sThrowCount )
      throw TestException();
  }
  static void SetThrowCount(int c){
    sExceptionPointCount = 0;
    sThrowCount = c;
  }
  // ...
};
The SetThrowCount method is used to set the number of the exception
point that will throw. The CouldThrow method will throw the exception
at the appropriate exception point. There is a lot of other functionality included
in this class in the full version (see Listing 1, Counter.h
and Listing 2, Counter.cpp); its operation should
be self-evident from an inspection of the code.
A Debugging Memory Manager
I have included a debugging memory manager with the online listings for two reasons: 1) To find simple memory errors in the program under test; and 2) To allow calls to operator new and operator delete to be instrumented to call Counter::CouldThrow. The overloading of operator new and operator delete is a convenience; these versions automatically call CouldThrow. This functionality is useful because the majority of exception points in a typical application are due to memory allocation.
The debugging memory manager used in this testing facility additionally maintains a linked list of all allocated blocks, and a count of how many have been allocated. This allows the program to check if any memory leaks have occurred within a single test, and to also check that pointers passed to delete are actually pointers to validly allocated blocks. This was the minimum necessary functionality to detect the bugs that are in the examples. A more professional debugging memory manager may be substituted, as long as its operator new and operator new[] functions can be modified to call Counter::CouldThrow. Techniques for writing debugging memory managers are discussed by Steve Maguire [4].
Test Examples
The following examples demonstrate how to use the testing method on a function and on a template class.
Testing a Function
The function to be tested is the EvaluateSalaryAndReturnName function mentioned in the introduction. In order to test it, you need a String class and an Employee class; these are provided with the online source listings. There are three normal paths of execution through this code, so three test cases are needed to exercise it. A single test case is shown here, which forms one part of the test harness box in the design in Figure 1:
class TestPath1 {
public:
  static void DoTest() {
    String s;
    s=EvaluateSalaryAndReturnName(
      Employee( "Homer", "Simpson",
        "Nuclear Plant Controller",
        25000 ) );
    Counter::Test(
      s == "Homer Simpson",
      "Correct return value" );
  }
};
Here I have constructed a test case which will fail to enter the if statement within the EvaluateSalaryAndReturnName function, so nothing should be printed. To fully test this function, the output would have to be redirected into an internal string buffer or file. This is possible on Unix by closing and reopening the standard output stream, but is left as an exercise for the reader. Our interest here is the exception paths that we can cause to execute through the function using the TestDriver function.
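One portable way to capture the output in an internal string buffer is to swap the stream buffer of std::cout, rather than closing and reopening the standard output stream. The sketch below takes that route. CoutCapture, Report, and CapturedReport are hypothetical names introduced for illustration, and Report merely imitates the printing branch of the function under test.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical RAII helper: while it is alive, std::cout writes into an
// internal buffer instead of the real standard output.
class CoutCapture {
    std::ostringstream mBuffer;
    std::streambuf* mOld;  // original buffer, restored by the destructor
public:
    CoutCapture() : mOld(std::cout.rdbuf(mBuffer.rdbuf())) {}
    ~CoutCapture() { std::cout.rdbuf(mOld); }
    CoutCapture(const CoutCapture&) = delete;
    CoutCapture& operator=(const CoutCapture&) = delete;
    std::string Text() const { return mBuffer.str(); }
};

// Stand-in for the printing branch of the function under test.
void Report(bool overpaid) {
    if (overpaid)
        std::cout << "Homer Simpson is overpaid" << std::endl;
}

// Runs Report and returns whatever it printed.
std::string CapturedReport(bool overpaid) {
    CoutCapture capture;
    Report(overpaid);
    return capture.Text();
}

void DemoCapture() { assert(CapturedReport(false).empty()); }
```

Because the destructor restores the original buffer, the redirection is undone even when a test exception propagates through the test. The capture buffer may itself allocate memory, so under an instrumented operator new it would contribute exception points of its own.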
After constructing the test cases, a suitable main is needed to call them:
int main()
{
  TestDriver<TestPath1>();
  // TestDriver<TestPath2>();
  // ...
  Counter::PrintTestSummary();
  return 0;
}
A full test harness will call further tests. When this program is built and run as shown, it prints the following output:
Doing test 9TestPath1
18 execution paths were tested.
Test results:
Total Tests: 19
Passed : 19
Failed : 0
(The strange test name is just how g++ prints out the class name when you use RTTI.) The important thing that the output shows here is that 18 execution paths were tested, even though there was only one normal path of execution through the code accessible to the test suite. This is due to all the exception points being tested, one by one. Each exception point gives rise to a separate execution path.
The number is different from the claimed 20 possible paths for several reasons: 1) The test harness contains extra exception points outside the code under test. Thus, some execution paths only differ outside the function being tested here. This is acceptable, since we still know these paths are being tested. 2) The path that is being tested does not cover all the possible exception points in the function, since half of the if test and the body of the if statement are not executed. Thus, not all of the exceptional paths are revealed yet.
At this point, I should point out that e.Salary() might return a user-defined type. The comparison between this user-defined type and 100000 would be performed by a user-defined operator>, which could throw. This possibility is included in Herb Sutter's claimed count of 20 exception points. However, this test program has not tested these things, because this particular program does not return a user-defined type from e.Salary(). The program has only tested what could actually throw.
A complete test set is supplied as part of the code with the article. While developing this test set, I found two paths of execution that neglect to call the String destructor. (This was verified by inspecting a function call trace.) These appear to be caused by bugs in the code generated by g++ 2.95.2! This testing methodology has also found bugs in other compilers.
Testing a Template Class
The template class to be tested here is the same one used in Tom Cargill's article on exception handling: David Reed's stack class [6]. This class is a good example for showing that the test method finds problems. The code is shown in Listing 3 (Stack_Reed.h). The original code assumed that new returned zero when it failed; I have changed the code to assume that new throws bad_alloc. I have changed a few method names to be consistent with Herb Sutter's implementations, to allow a common test harness. I have also added a Consistent method, which checks on the internal consistency of the Stack's data.
The template class, Stack, is instantiated with a template argument of type TestClass (see Listing 4, TestClass.h). TestClass behaves like a very temperamental integer: it can be copied, assigned to, added, and so on, all with the possibility of throwing an exception. The only method of TestClass that does not throw is the destructor. This throw capability is achieved by inserting a call to Counter::CouldThrow into each of the methods, thus allowing us to test the behavior of Stack under hostile conditions. The complete stack test suite is shown in Listing 5 (TestStack.cpp). The following Stack test from the suite exposes a fault in the copy constructor:
class TestCopyConstruct2 {
public:
  static void DoTest() {
    Stack<TestClass> a;
    a.Push( TestClass(1) );
    a.Push( TestClass(2) );
    {
      Stack<TestClass> b( a );
      // ...
    }
    // ...
  }
};
When this test is run, it produces the following output:
Doing test 18TestCopyConstruct2
**** Failed test Memory leak (1 block) at exception point 27.
**** Failed test Memory leak (1 block) at exception point 28.
29 execution paths tested.
It seems that there are two memory leaks when exceptions are thrown during the tests. To identify the cause of the leaks, the execution paths must be identified. The program may be converted so that only the faulty execution path is used, for the convenience of debugging. To do this, follow this procedure:
Testing Exceptions Thrown by the Class Under Test
Now that I have discussed how to test exceptions caused by memory faults, I will turn to exceptions that are thrown by the class under test. How do we test these? By setting up the conditions that cause the exception to be thrown. For example, the Pop method of the Stack class throws an exception if you try to pop an empty stack. We can test this exception path by writing a test such as the following:
class TestPop {
public:
  static void DoTest() {
    Stack<TestClass> a;
    try {
      a.Pop();
      Counter::Fail("Pop empty");
    } catch(const char* ) {
      Counter::Pass("Pop empty");
    }
    Counter::Test( a.Consistent(),
      "a internal state");
    Counter::Test( 0 == a.Size(),
      "a correct size");
  }
};
This particular test tests Sutter's version of the class, in which Pop has been modified not to return the popped element. The test sets up the stack class to be empty so that Pop will fail. The test is subsequently written so that Pop must throw an exception to pass the test. The stack must also subsequently have consistent internal state, and have zero size as well. It would also have been possible to test this path of execution by placing a call to Counter::CouldThrow at the point where the Pop could fail. This would not have required such a carefully designed test suite. However, it would have required modification of the code undergoing testing, which is usually undesirable.
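To show what the class side of such a test might look like, here is a toy stack with a Pop that throws on an empty stack and a Consistent method that checks the class invariants. IntStack is a hypothetical stand-in written for this illustration; it is neither Reed's nor Sutter's class, and the invariant it checks is an assumption about what a reasonable Consistent method would verify.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for the stack under test (not the code from the listings).
class IntStack {
    int* mData;
    std::size_t mSize;
    std::size_t mCapacity;
public:
    IntStack() : mData(nullptr), mSize(0), mCapacity(0) {}
    ~IntStack() { delete[] mData; }
    IntStack(const IntStack&) = delete;
    IntStack& operator=(const IntStack&) = delete;

    void Push(int v) {
        if (mSize == mCapacity) {
            std::size_t newCap = mCapacity ? 2 * mCapacity : 4;
            int* bigger = new int[newCap];  // may throw; state untouched
            for (std::size_t i = 0; i < mSize; ++i) bigger[i] = mData[i];
            delete[] mData;
            mData = bigger;
            mCapacity = newCap;
        }
        mData[mSize++] = v;
    }
    void Pop() {
        if (mSize == 0) throw "pop from empty stack";
        --mSize;
    }
    std::size_t Size() const { return mSize; }

    // Invariant: size never exceeds capacity, and storage exists exactly
    // when the capacity is non-zero.
    bool Consistent() const {
        return mSize <= mCapacity && ((mData != nullptr) == (mCapacity != 0));
    }
};

// True if popping an empty stack throws and leaves the stack usable.
bool PopEmptyBehavesWell() {
    IntStack s;
    bool threw = false;
    try { s.Pop(); } catch (const char*) { threw = true; }
    return threw && s.Consistent() && s.Size() == 0;
}

void DemoPopEmpty() { assert(PopEmptyBehavesWell()); }
```

The invariant rules out the inconsistent state of a null element array paired with a non-zero size, so a test can call Consistent after every exception to confirm the object is still usable.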
Discussion
Using this technique imposes the following requirements: 1) You must have the source code of the class or function under test. (Object code may be sufficient if the class does not use any templates.) 2) You must write an exhaustive test suite for the functionality of that class or function, including for any exceptions that the class or function itself may throw due to being misused in any way.
Meeting the above requirements allows us to: 1) Test whether the class or function under test leaks memory under any circumstances, including due to exception propagation; 2) Easily exercise exception handling code for exceptions that are caused by out-of-memory conditions; and 3) Exercise exception handling code for exceptions that are caused by misusing the class (and the misuse is included in the test suite). Thus, this testing method is applicable to class-based testing or unit testing.
It does not allow us to do any of the following: 1) Test pre-compiled binaries of libraries or complete programs; or 2) Test exceptions due to any reason other than memory failure or conditions deliberately set up by the test suite. An example of such conditions would be I/O faults. (I/O faults could be tested using this method if the I/O library were instrumented with calls to CouldThrow.) Therefore this method, as it stands, is not applicable to integrated system testing. This is not a perfect solution, but is much better than having no means of performing this testing at all.
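The remark about instrumenting an I/O library can be sketched as follows. Everything here is an assumption made for illustration: the Counter is a cut-down stand-in for the article's class, and CheckedWrite and WriteUntilFailure are hypothetical functions. The wrapper translates the test exception into the error type a caller would normally see, in the same spirit as the debugging operator new translating it into bad_alloc.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

struct TestException {};

// Cut-down stand-in for the article's Counter class.
class Counter {
    static int sExceptionPointCount;
    static int sThrowCount;
public:
    static void CouldThrow() {
        if (++sExceptionPointCount == sThrowCount) throw TestException();
    }
    static void SetThrowCount(int c) {
        sExceptionPointCount = 0;
        sThrowCount = c;
    }
};
int Counter::sExceptionPointCount = 0;
int Counter::sThrowCount = -1;

// Hypothetical I/O wrapper: every write is an exception point.
void CheckedWrite(std::ostream& out, const std::string& text) {
    try {
        Counter::CouldThrow();
    } catch (TestException&) {
        // Report the simulated fault as the type real callers expect.
        throw std::runtime_error("simulated write failure");
    }
    out << text;
    if (!out) throw std::runtime_error("write failed");
}

// Writes three lines and returns what reached the stream when write
// number failAt is made to fail (-1 means no simulated failure).
std::string WriteUntilFailure(int failAt) {
    std::ostringstream out;
    Counter::SetThrowCount(failAt);
    try {
        CheckedWrite(out, "one\n");
        CheckedWrite(out, "two\n");
        CheckedWrite(out, "three\n");
    } catch (std::runtime_error&) {
        // the code under test would clean up here
    }
    Counter::SetThrowCount(-1);
    return out.str();
}

void DemoIoFault() { assert(WriteUntilFailure(2) == "one\n"); }
```

With a wrapper like this in place, the same driver loop that enumerates allocation failures would also enumerate write failures, at the cost of routing the code under test through the instrumented calls.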
Conclusion
This method allows you to test a function, normal class, or template class for its exception handling integrity. It requires some extra effort over that required to write a standard test suite. The tests must be crafted with some care for the results of the tests to be meaningful. However, if you go to the trouble of writing good tests and fixing any problems found with your code, you can be correspondingly more confident in your code. This technology has already been used to improve the quality of student assignments for simple classes. It has also uncovered incorrect exception handling code output from some compilers. This article only scratches the surface of what can be done by automated unit testing.
Acknowledgements
I gratefully thank Herb Sutter for encouraging me to write this article in the first place, and for assisting with reviewing it. This technique was invented independently by Matt Arnold, and subsequently used by David Abrahams [7] to write a generic test suite for the STL [8].
References
[1] Tom Cargill. Exception Handling: A False Sense of Security, C++ Report, Vol. 6, No. 9, November-December 1994. Also available at http://meyerscd.awl.com/.
[2] Herb Sutter. Exceptional C++: 47 Engineering Puzzles, Programming Problems, and Solutions (Addison-Wesley, 2000).
[3] Steve McConnell. Code Complete (Microsoft Press, 1993).
[4] Steve Maguire. Writing Solid Code (Microsoft Press, 1993).
[5] Scott Meyers. More Effective C++ (Addison-Wesley, 1996). Also available as a CD; see http://www.meyerscd.awl.com.
[6] David Reed. Exceptions: Pragmatic Issues with a New Language Feature, C++ Report, October 1993.
[7] David Abrahams. Exception Safety in Generic Components, Dagstuhl Conference on Generic Programming, April 27 - May 1, 1998. Online at http://www.cs.rpi.edu/~musser/gp/dagstuhl/gpdag.html.
[8] David Abrahams and Boris Fomitchev. Exception Handling Test Suite, available at http://www.stlport.org/doc/eh_testsuite.html.
Ben Stanley graduated from the Australian National University with Honors in Theoretical Physics in 1994. He is now doing a PhD in Robotics at the University of Wollongong, Australia. He has lectured some first and second year C++ units. When he's not busy writing his thesis, he makes puzzles.