Generic Printable ENUM++


June2003/Generic Printable ENUM++

Mitch Besser


Introduction

Have you ever wished you could easily print an enum variable's value as formatted text rather than just printing its internal integer representation? Have you ever modified an enum's constants only to later rummage through mountains of source code looking for what broke from the change? Most programmers accept these shortcomings as an inevitable consequence of using the standard C++ enum.

The common, simple, and tedious nature of augmenting and maintaining enums makes them good candidates for applying automatic code generation and generic programming techniques.

When I think of generic programming in C++, I immediately think of C++ templates. While templates are a great feature, they have a few limitations. For one, they do not allow you to pass in string literals as template arguments. Luckily, templates are not the only tool available for generating code or writing generic C++. The Boost library [1] contains a framework of macros that help when C++ templates fall short.

Requirements and Goals

Before developing a standard C++ enum enhancement (hereafter referred to as ENUM++), I first created the following list of requirements:

  1. Allow conversion of values to formatted text
  2. Facilitate std::istream and std::ostream support
  3. Implement STL style iteration
  4. Detect out of range values
  5. Standardize the uninitialized state
  6. Allow compile-time configurable behavior

I also wanted to ensure that ENUM++ was applicable to the widest possible audience. For this purpose, I added the following goals:

  1. Make ENUM++ easy to use
  2. Make ENUM++ difficult to misuse
  3. Make ENUM++ a drop-in replacement for standard C++ enums

I wanted to support as many features of the standard C++ enum as possible, so that previously written C++ code could take advantage of ENUM++ with minimal source code modifications.

  4. Keep sizeof(ENUM++)==sizeof(enum)

This helps keep ENUM++ binary compatible with standard C++ enum types. This is often important to maintenance programmers attempting to replace an enum with ENUM++ functionality.

  5. Keep ENUM++ as efficient as standard C++ enums

Standard C++ enums were designed to be efficient, so a wholesale replacement needs to be just as efficient.

Basic Architecture

Standard C++ enums act like ints with some additional features and restrictions. In some ways, enums can be considered a subclass of the more primitive int data type. In much the same manner, it would be convenient to design ENUM++ as a subclass of the standard C++ enum type as shown in Listing 1.

Unfortunately, <enum E> is not a valid template parameter. Another problem is that you cannot inherit any C++ primitive data types, including enums. Given these imposed limitations, I instead chose to generate a wrapper class around a private standard C++ enum member. Access to the wrapped member is made available through generated member functions and overloaded operators.

A drawback with this approach is that it requires scoped access to the ENUM++ constants through the wrapping class. A solution to this problem is to generate a duplicate of the enum constants outside of the wrapping class. Listing 2 shows the generated MatrixMovie architecture.
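The listings are not reproduced here, so the following is only a minimal hand-written sketch of the shape just described: a wrapper class around a private enum member, with the constants duplicated outside the class. The nested type name Value and the conversion operator are assumptions made for illustration; the code that enum.h actually generates may differ.

```cpp
#include <cassert>

// Hypothetical stand-in for what a generated MatrixMovie class might look like.
class MatrixMovie {
public:
    // The wrapped standard enum; its constants are scoped inside the class.
    enum Value { Matrix, MatrixReloaded, MatrixRevolutions };

    MatrixMovie(Value v) : value_(v) {}        // accepts a constant, never a plain int
    operator Value() const { return value_; }  // read access to the wrapped member

    static int size() { return 3; }            // number of constants

private:
    Value value_;  // the only data member, so the wrapper stays enum-sized
};

// Duplicates of the constants outside the class, so that callers can write
// Matrix instead of MatrixMovie::Matrix.
const MatrixMovie::Value Matrix            = MatrixMovie::Matrix;
const MatrixMovie::Value MatrixReloaded    = MatrixMovie::MatrixReloaded;
const MatrixMovie::Value MatrixRevolutions = MatrixMovie::MatrixRevolutions;

// The size goal from the requirements list, checked at compile time.
static_assert(sizeof(MatrixMovie) == sizeof(MatrixMovie::Value),
              "wrapper should be the same size as the wrapped enum");
```

Because the class holds nothing but the enum member and has no virtual functions, the size check holds on mainstream compilers.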

This basic architecture is generated by a series of macros with help from the Boost library. While macros can take the drudgery out of generating code, they have some well known "evil" [2] properties. Therefore, I moved everything I could into a base class template named _EnumSuper. I only fell back to using macros for the chores that could not otherwise be done in the _EnumSuper template.

Usage

You will need to #include enum.h prior to any use of ENUM++. The enum.h header internally #includes StreamUDT.h [3], so you will need to make sure it is available in your include path as well. The enum.h header also relies on having the Boost preprocessor library [1] installed.

Next, consider coding an enum type called MatrixMovie as follows:

enum MatrixMovie {UNINITIALIZED=-1, Matrix,
 MatrixReloaded, MatrixRevolutions, SIZE};

Instead, replace the above standard C++ enum with the following ENUM++ macro:

ENUM3(MatrixMovie, Matrix, MatrixReloaded, MatrixRevolutions);

The enum type is the macro's first parameter. In this case there are three Matrix movies, so you must use the ENUM++ macro ENUM3(...). Unfortunately, standard C++ (C++98) does not allow macros with a variable number of arguments, so you must specify the number in the macro's name. (Variadic macros were only standardized later, in C++11.) The good news is that if you miscount, your compiler will let you know.

Notice the UNINITIALIZED and SIZE constants are missing from the ENUM++ version. You may omit these because ENUM++ automatically generates uninitialized state support and a size() member function.

Once declared, your ENUM++ type can be used just like a standard C++ enum, except variables of your ENUM++ type have additional capabilities. For example, you could write a print_matrix_movies() function that loops and prints all the matrix movies as shown in Listing 3. If a fourth Matrix movie is ever made, you will not need to change the function at all. All that is required is to change ENUM3(MatrixMovie, ...) to ENUM4(MatrixMovie, ..., MatrixOverkill, ...) and you are done.
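As a rough illustration of such a loop, here is a sketch that uses a small hand-written stand-in for the generated class. Only size() and the name print_matrix_movies() come from the text; begin(), end(), text(), and the stream parameter are assumed names and signatures, not the documented enum.h interface.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>  // convenient for capturing the output in a test

// Hypothetical stand-in for the class an ENUM3(MatrixMovie, ...) line would generate.
class MatrixMovie {
public:
    enum Value { Matrix, MatrixReloaded, MatrixRevolutions };

    MatrixMovie(Value v) : value_(v) {}

    static int size() { return 3; }
    static MatrixMovie begin() { return MatrixMovie(Matrix); }
    // One past the last constant, as with STL iterators.
    static MatrixMovie end() { return MatrixMovie(static_cast<Value>(size())); }

    MatrixMovie& operator++() {
        value_ = static_cast<Value>(static_cast<int>(value_) + 1);
        return *this;
    }
    bool operator!=(const MatrixMovie& other) const { return value_ != other.value_; }

    // Formatted text for the current constant (assumed accessor name).
    const char* text() const {
        static const char* const names[] = {
            "Matrix", "Matrix Reloaded", "Matrix Revolutions"};
        assert(static_cast<int>(value_) < size());
        return names[value_];
    }

private:
    Value value_;
};

// Prints every constant; nothing here needs to change when a constant is added.
void print_matrix_movies(std::ostream& os) {
    for (MatrixMovie mm = MatrixMovie::begin(); mm != MatrixMovie::end(); ++mm)
        os << mm.text() << '\n';
}
```

The loop never names an individual constant, so adding a fourth movie only means updating the class.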

Listing 4 contains a partial list of the generated MatrixMovie members. For a complete and detailed list of ENUM++ functionality, see the enum.h header file.

If you want to use std::iostreams with ENUM++ variables, you can additionally define the following:

// defines operator>>
DEFINE_ENUM_ISTREAM(MatrixMovie);

// defines operator<<
DEFINE_ENUM_OSTREAM(MatrixMovie);

These macros allow reading and writing ENUM++ values as formatted text via C++ streams. For example, by using the DEFINE_ENUM_OSTREAM macro, you could rewrite the print_matrix_movies() function without loops by using the STL copy algorithm as shown in Listing 5.
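A loop-free version might look roughly like the sketch below. It assumes the ENUM++ value can act as its own input iterator, and it writes operator<< by hand where DEFINE_ENUM_OSTREAM would normally generate it. The iterator member typedefs, begin(), end(), and the put_stream() signature are assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <ostream>
#include <sstream>  // convenient for capturing the output in a test

// Hypothetical stand-in for the generated class; the value doubles as an iterator.
class MatrixMovie {
public:
    enum Value { Matrix, MatrixReloaded, MatrixRevolutions };

    // Member types that let standard algorithms treat the value as an iterator.
    typedef std::input_iterator_tag iterator_category;
    typedef MatrixMovie             value_type;
    typedef std::ptrdiff_t          difference_type;
    typedef const MatrixMovie*      pointer;
    typedef const MatrixMovie&      reference;

    MatrixMovie(Value v) : value_(v) {}

    static MatrixMovie begin() { return MatrixMovie(Matrix); }
    static MatrixMovie end() { return MatrixMovie(static_cast<Value>(3)); }

    const MatrixMovie& operator*() const { return *this; }  // "points" at itself
    MatrixMovie& operator++() {
        value_ = static_cast<Value>(static_cast<int>(value_) + 1);
        return *this;
    }
    MatrixMovie operator++(int) { MatrixMovie old(*this); ++*this; return old; }
    bool operator==(const MatrixMovie& o) const { return value_ == o.value_; }
    bool operator!=(const MatrixMovie& o) const { return value_ != o.value_; }

    // Writes the formatted text; only called for valid constants here.
    void put_stream(std::ostream& os) const {
        static const char* const names[] = {
            "Matrix", "Matrix Reloaded", "Matrix Revolutions"};
        os << names[value_];
    }

private:
    Value value_;
};

// Hand-written equivalent of what DEFINE_ENUM_OSTREAM(MatrixMovie) is described as defining.
std::ostream& operator<<(std::ostream& os, const MatrixMovie& mm) {
    mm.put_stream(os);
    return os;
}

// Same output as an explicit loop, expressed with the STL copy algorithm.
void print_matrix_movies(std::ostream& os) {
    std::copy(MatrixMovie::begin(), MatrixMovie::end(),
              std::ostream_iterator<MatrixMovie>(os, "\n"));
}
```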

Formatting and Streaming

A generated ENUM++ class will need to convert its enum constants into formatted text in order to print them. C++ templates cannot do this without help because template parameters cannot be string literals. The C++ preprocessor stringize operator (#) provides barely enough help. The problem with the preprocessor stringize operator is that you probably do not want to print the enum constant exactly as it is typed in the source code. For example, it would be much nicer for your users to read the value MatrixReloaded formatted as "Matrix Reloaded" rather than the spaceless "MatrixReloaded".
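To make the role of the stringize operator concrete, here is a minimal sketch. SKETCH_ENUM3 and unformatted_text() are hypothetical names invented for this example and are not part of enum.h; the macro simply declares a plain enum and a parallel table of the constant names exactly as typed.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper macro: declares an enum and, using the stringize
// operator (#), a parallel table holding each constant's source spelling.
#define SKETCH_ENUM3(Type, a, b, c)                              \
    enum Type { a, b, c };                                       \
    static const char* const Type##_text_map[] = { #a, #b, #c }

SKETCH_ENUM3(MatrixMovie, Matrix, MatrixReloaded, MatrixRevolutions);

// Looks up the unformatted name, using the constant's value as the index.
inline const char* unformatted_text(MatrixMovie mm) {
    return MatrixMovie_text_map[mm];
}
```

The lookup yields the spaceless "MatrixReloaded", which is why a separate formatting step is still needed.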

In order to format the text, ENUM++ starts by first generating an unformatted text map lookup table something like the following:

const char* map[]={"Matrix",
 "MatrixReloaded", "MatrixRevolutions"};

In order to process the unformatted text, ENUM++ contains a Singleton [4] class template named _EnumText that creates all the formatting information during its one time only construction. There are two tables of information created.

The first table contains formatted text created from the unformatted text map. The table is filled in by a simple parser that inserts spacing where appropriate. The parser makes an educated guess regarding spacing by using capital letters and numbers as a guide. If the parser finds an underscore within a constant, it assumes the programmer explicitly decided where the spaces should go and instead just performs a simple character substitution of underscores ('_') with spaces (' ').
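The parsing rules above could be sketched as a single function. The exact heuristics in enum.h are not spelled out in the text, so the boundary rules below are an assumption: a space goes before a capital that follows a lowercase letter or digit, and before a digit that follows a letter. The name format_constant() is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Turns an unformatted constant name into display text.
std::string format_constant(const std::string& raw) {
    // An underscore means the programmer chose the spacing explicitly:
    // substitute spaces for underscores and do nothing else.
    if (raw.find('_') != std::string::npos) {
        std::string out = raw;
        for (std::string::size_type i = 0; i < out.size(); ++i)
            if (out[i] == '_') out[i] = ' ';
        return out;
    }

    // Otherwise guess word boundaries from capitals and digits.
    std::string out;
    for (std::string::size_type i = 0; i < raw.size(); ++i) {
        const unsigned char cur = static_cast<unsigned char>(raw[i]);
        if (i > 0) {
            const unsigned char prev = static_cast<unsigned char>(raw[i - 1]);
            const bool capital_starts_word =
                std::isupper(cur) && (std::islower(prev) || std::isdigit(prev));
            const bool digit_starts_word =
                std::isdigit(cur) && std::isalpha(prev);
            if (capital_starts_word || digit_starts_word) out += ' ';
        }
        out += raw[i];
    }
    return out;
}
```

With these rules, both MatrixReloaded and Matrix_Reloaded come out as "Matrix Reloaded".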

The second table contains the best matching order to be used when extracting streaming text and converting it into an ENUM++ instance (_EnumText::best_match_order_). The best matching order is defined as the descending order from the longest to the shortest formatted string. For example, a stream of incoming text containing "Matrix Revolutions" could match either the constant Matrix or MatrixRevolutions. By having the matching algorithm try the longest text first, the best possible match is found. If this were not done, the algorithm would stop short by matching "Matrix," leaving "Revolutions" behind in the input stream for the next stream reading operation to process.
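The longest-first idea can be sketched as below. The function names best_match_order() and match_longest() are hypothetical, and matching against a whole std::string instead of a live stream is a simplification made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the indices of texts ordered from the longest string to the shortest.
std::vector<std::size_t> best_match_order(const std::vector<std::string>& texts) {
    std::vector<std::size_t> order(texts.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::stable_sort(order.begin(), order.end(),
                     [&texts](std::size_t a, std::size_t b) {
                         return texts[a].size() > texts[b].size();
                     });
    return order;
}

// Returns the index of the longest entry that is a prefix of input,
// or texts.size() when nothing matches.
std::size_t match_longest(const std::vector<std::string>& texts,
                          const std::string& input) {
    const std::vector<std::size_t> order = best_match_order(texts);
    for (std::size_t i = 0; i < order.size(); ++i) {
        const std::string& candidate = texts[order[i]];
        if (input.compare(0, candidate.size(), candidate) == 0) return order[i];
    }
    return texts.size();
}
```

Trying "Matrix Revolutions" before "Matrix" is what keeps the shorter constant from consuming only part of the input.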

With both of these tables in hand, facilitating C++ stream support becomes a reasonably easy task. I used a modified version of code taken from the book Standard C++ IOStreams and Locales [5]. The basic technique used is to write two member functions named get_stream() and put_stream(). The functions convert between the value's internal representation and its formatted text representation. The function names are well known to a pair of function templates found in the file StreamUDT.h. The templates take care of all the gritty details encountered when dealing with stream insertion and extraction.
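The division of labor can be illustrated with a simplified sketch. The names insert_udt(), extract_udt(), and the Pill type are hypothetical, and the signatures chosen for put_stream() and get_stream() are assumptions. The real templates in StreamUDT.h handle more cases (exceptions, character types) than this example does.

```cpp
#include <istream>
#include <ostream>
#include <sstream>  // convenient for exercising the operators in a test
#include <string>

// Generic inserter: works for any type that provides put_stream(std::ostream&).
template <class T>
std::ostream& insert_udt(std::ostream& os, const T& value) {
    std::ostream::sentry guard(os);  // checks stream state, flushes tied streams
    if (guard) value.put_stream(os);
    return os;
}

// Generic extractor: works for any type that provides bool get_stream(std::istream&).
template <class T>
std::istream& extract_udt(std::istream& is, T& value) {
    std::istream::sentry guard(is);  // skips leading whitespace by default
    if (guard && !value.get_stream(is))
        is.setstate(std::ios_base::failbit);  // report unrecognized text
    return is;
}

// Small hypothetical enum wrapper that plugs into the two templates.
class Pill {
public:
    enum Value { Red, Blue };
    Pill(Value v) : value_(v) {}
    Value value() const { return value_; }

    void put_stream(std::ostream& os) const {
        os << (value_ == Red ? "Red" : "Blue");
    }
    bool get_stream(std::istream& is) {
        std::string word;
        if (!(is >> word)) return false;
        if (word == "Red")  { value_ = Red;  return true; }
        if (word == "Blue") { value_ = Blue; return true; }
        return false;  // value_ is left unchanged on a mismatch
    }

private:
    Value value_;
};

inline std::ostream& operator<<(std::ostream& os, const Pill& p) { return insert_udt(os, p); }
inline std::istream& operator>>(std::istream& is, Pill& p) { return extract_udt(is, p); }
```

The class only converts between its value and text, and the templates own the stream bookkeeping.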

All of this is well and good until you want to explicitly initialize enum constants like so:

ENUM3(MatrixMovie, Matrix,
 MatrixReloaded=2, MatrixRevolutions=2);

If you did this, the generated text map lookup table would require another level of indirection and therefore add a slight inefficiency. However, the most troublesome aspect of explicit initializers is the loss of a 1-to-1 relationship guarantee between the internal int representation and the programmer's abstraction. In other words, explicit initializers open the possibility that two enumeration constants will have the same value. It then becomes impossible to tell the difference between them at runtime.

ENUM++ solves this problem by forbidding the use of explicit enumeration initializers. As you will see, introducing this restriction makes other problems go away as well. The enum abstraction is primarily about a list of unique constants. The values they take are often of secondary importance. ENUM++ focuses its efforts on this primary usage.

Iteration

Imagine enums as compile-time containers of constants. Furthermore, imagine you want to perform some action with each contained constant. It then becomes easy to see why iteration is a desirable feature. Standard C++ enums contain no direct support for incrementing or decrementing between enum constants.

If ENUM++ constants were not sequential, iterating through an ENUM++ container would conflict with the efficiency goal. For example, it might be desirable to have the ++ operator skip over gaps as follows:

ENUM3(MatrixMovie, Matrix,
 MatrixReloaded=2, MatrixRevolutions);
MatrixMovie mm=Matrix;
mm++;
assert(mm==MatrixReloaded); // ?

The value of mm is now ambiguous in terms of MatrixMovie constants. Making the above assertion work would require the separation of the iterator from its value. In the STL, the separation of iterators from containers is fundamental, but doing that here would reduce efficiency. As with printing, disallowing explicit initialization sidesteps the entire issue. This is a tradeoff of drop-in compatibility for efficiency.

Using enum values for iteration now becomes a simple matter of overloading operators as needed. I loosely modeled the list of operators available after the STL random access iterator concept specification [6]. As with all STL iterators, a position one beyond the last element is supported, so the half-open interval [begin(), end()) can be used.
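A few of these operators might be sketched as follows, again on a hand-written stand-in. The selection of operators and the names begin(), end(), and count_movies() are assumptions for illustration; enum.h holds the actual list.

```cpp
#include <cassert>

// Hypothetical stand-in with a handful of random-access style operators.
class MatrixMovie {
public:
    enum Value { Matrix, MatrixReloaded, MatrixRevolutions };

    MatrixMovie(Value v) : value_(v) {}

    static int size() { return 3; }
    static MatrixMovie begin() { return MatrixMovie(Matrix); }
    static MatrixMovie end() { return MatrixMovie(static_cast<Value>(size())); }

    // Because the constants are sequential, stepping is plain integer arithmetic.
    MatrixMovie& operator+=(int n) {
        value_ = static_cast<Value>(static_cast<int>(value_) + n);
        return *this;
    }
    MatrixMovie& operator++() { return *this += 1; }
    MatrixMovie operator++(int) { MatrixMovie old(*this); ++*this; return old; }
    MatrixMovie& operator--() { return *this += -1; }

    friend int operator-(const MatrixMovie& a, const MatrixMovie& b) {
        return static_cast<int>(a.value_) - static_cast<int>(b.value_);
    }
    friend bool operator==(const MatrixMovie& a, const MatrixMovie& b) { return a.value_ == b.value_; }
    friend bool operator!=(const MatrixMovie& a, const MatrixMovie& b) { return a.value_ != b.value_; }
    friend bool operator<(const MatrixMovie& a, const MatrixMovie& b) { return a.value_ < b.value_; }

private:
    Value value_;
};

// Walks the half-open interval [begin(), end()) and counts the constants.
int count_movies() {
    int n = 0;
    for (MatrixMovie mm = MatrixMovie::begin(); mm != MatrixMovie::end(); ++mm) ++n;
    return n;
}
```

No lookup table is involved in stepping, which is the efficiency that sequential constants buy.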

Misuse Detection

It is desirable to automatically detect as much ENUM++ misuse as possible. Detecting misuse at compile-time is more effective than detecting it at run-time, because run-time detection only works when test cases are thorough enough to exercise the software completely. On the other hand, compile-time detection is intrinsically incomplete, since some misuse only shows up in values computed while the program runs. ENUM++ therefore performs both types of misuse detection.

ENUM++ detects three different compile-time misuses:

  1. Miscounted ENUM++ macro parameters
  2. Assignment of ints to ENUM++ variables
  3. Explicit constant initialization

The detection of explicit constant initialization is the most interesting of the three. ENUM++ does this while generating the unformatted text map by producing an entry that will not compile. For example, the ENUM++ macro:

ENUM3(MatrixMovie, Matrix,
 MatrixReloaded=2, MatrixRevolutions);

generates a text map like the following:

const char* map[]=
{
  {(Matrix,
   "Matrix")},
  {(MatrixReloaded=2, // error
   "MatrixReloaded=2")},
  {(MatrixRevolutions,
   "MatrixRevolutions")}
};

Normally, the left side of the comma operator is evaluated and then ignored. However, in the case of MatrixReloaded, the initializer is syntactically treated like an assignment to a constant. Your compiler is bound to see this as an error.
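The sketch below shows the comma-operator trick in a form that compiles, with the failing case left as a comment. SKETCH_ENTRY is a hypothetical macro, and the (void) cast is an addition to keep compilers from warning about the discarded left operand. The static_assert lines illustrate the second detection, rejecting plain ints, on a hand-written stand-in class.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

enum MatrixMovieValue { Matrix, MatrixReloaded, MatrixRevolutions };

// Hypothetical entry generator: the constant on the left of the comma is
// evaluated and discarded; the stringized text on the right is the entry.
#define SKETCH_ENTRY(constant) ((void)(constant), #constant)

static const char* const text_map[] = {
    SKETCH_ENTRY(Matrix),
    SKETCH_ENTRY(MatrixReloaded),
    SKETCH_ENTRY(MatrixRevolutions)
    // SKETCH_ENTRY(MatrixReloaded=2) would expand to
    // ((void)(MatrixReloaded=2), "MatrixReloaded=2") and fail to compile,
    // because an enumeration constant cannot be assigned to.
};

// Stand-in wrapper: constructible from a constant but not from an int.
class MatrixMovie {
public:
    MatrixMovie(MatrixMovieValue v) : value_(v) {}
    MatrixMovieValue value() const { return value_; }
private:
    MatrixMovieValue value_;
};

static_assert(std::is_convertible<MatrixMovieValue, MatrixMovie>::value,
              "enumeration constants are accepted");
static_assert(!std::is_convertible<int, MatrixMovie>::value,
              "plain ints are rejected at compile time");
static_assert(!std::is_assignable<MatrixMovie&, int>::value,
              "assigning an int is rejected at compile time");
```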

Run-time detection comes in two phases—during software development and after software release.

During development you want to detect and debug as many problems as possible. The speed of execution is not nearly as important as diagnosing bugs. Therefore, it is worth the small run-time cost of generating a default constructor that initializes ENUM++ variables to a known state and generating code that deems out of range values as bugs.

In order to help detect bugs, ENUM++ uses two distinct but related constants—ENUM_INVALID_VALUE and ENUM_DEFAULT_VALUE. ENUM_INVALID_VALUE defines an internal value that ENUM++ will consider a bad state when accessed. ENUM_DEFAULT_VALUE defines an internal value that the default constructor and the clear() method both use for initialization. As a default, both ENUM_INVALID_VALUE and ENUM_DEFAULT_VALUE are defined as -1. For example, the following code increments an invalid value causing an assertion:

MatrixMovie mm;
++mm; // asserts and increments.

Finding problems after software is released is too late. When mistakes happen, it is good to know that your software's behavior will not depend upon the stack's internal state or any other non-reproducible system state. Since mm is initially set to ENUM_DEFAULT_VALUE (-1), mm will still increment its internal value to MatrixMovie::Matrix (0). Unless configured otherwise, ENUM++ mimics and stabilizes the standard C++ enum behavior.

Configurable Behavior

Since the ENUM++ programmer's interface is a set of macros, using template parameters as policy settings [7] is not possible. Configuration of ENUM++ is instead accomplished by defining preprocessor values that are used during compilation. For example, you can redirect run-time assertions by appropriately defining the ENUM_ASSERT macro. If you do nothing, ENUM++ automatically chooses default preprocessor values that are reasonable for most projects. Read the enum.h header documentation to see all the configuration settings available.
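As an illustration of this style of configuration, the sketch below redirects ENUM_ASSERT to a counting function and then repeats the default-constructed increment discussed earlier. The three macro names come from the text, but their exact form is an assumption. The counter, record_enum_failure(), raw(), and the int storage are inventions to keep the example short.

```cpp
#include <cassert>

// Project-side configuration: record failed assertions instead of aborting.
static int g_enum_assert_failures = 0;  // hypothetical counter
inline void record_enum_failure(bool ok) { if (!ok) ++g_enum_assert_failures; }
#define ENUM_ASSERT(cond) record_enum_failure(cond)

// Header-side defaults: used only where the project has not chosen a value.
#ifndef ENUM_ASSERT
#define ENUM_ASSERT(cond) assert(cond)
#endif
#ifndef ENUM_INVALID_VALUE
#define ENUM_INVALID_VALUE (-1)
#endif
#ifndef ENUM_DEFAULT_VALUE
#define ENUM_DEFAULT_VALUE (-1)
#endif

// Hypothetical stand-in; it stores an int so that -1 is representable here.
class MatrixMovie {
public:
    enum Value { Matrix, MatrixReloaded, MatrixRevolutions };

    MatrixMovie() : value_(ENUM_DEFAULT_VALUE) {}  // known initial state
    MatrixMovie(Value v) : value_(v) {}
    void clear() { value_ = ENUM_DEFAULT_VALUE; }

    // Flags use of an invalid value, then increments anyway so that the
    // behavior stays reproducible.
    MatrixMovie& operator++() {
        ENUM_ASSERT(value_ != ENUM_INVALID_VALUE);
        ++value_;
        return *this;
    }
    int raw() const { return value_; }

private:
    int value_;
};
```

Incrementing a default-constructed value records one failure and still lands on Matrix (0).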

Summary

Even a fundamental C++ feature like enum can be improved. If your project has a large number of enumerations that require frequent modifications, ENUM++ will significantly reduce your effort. ENUM++'s only notable restriction is its lack of support for enum's explicit constant initialization syntax. This restriction is mitigated by ENUM++'s standardization of the uninitialized state and the generation of a size() member function. As is often the case with software, it is the small fundamental improvements that have the largest impacts.

References

[1] <http://www.boost.org>, "Preprocessor Metaprogramming"

[2] <http://www.parashift.com/c++-faq-lite/misc-technical-issues.html#faq-38.4>

[3] <http://www.cuj.com/code/>

[4] Gamma, Helm, Johnson and Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley, 1994)

[5] Langer, Kreft. Standard C++ IOStreams and Locales: Advanced Programmer's Guide and Reference, "3.1.5 Generic Inserters and Extractors" (Addison-Wesley, 2000)

[6] <http://www.sgi.com/tech/stl/Iterators.html>

[7] Alexandrescu. Modern C++ Design: Generic Programming and Design Patterns Applied, "1. Policy-Based Class Design" (Addison-Wesley, 2000)

About the Author

Mitch Besser is a Senior Software Engineer and Consultant with Solution Logic, Inc. based in Portland, Oregon. Mitch received a BSc in Wildlife Biology from the University of Montana in Missoula, MT and an MSc in Software Design and Development from the University of St. Thomas in St. Paul, MN. Mitch can be contacted at [email protected].

