
Not-So-Obvious Utility Macros


March 2000/Not-So-Obvious Utility Macros


During my career as a software developer, I have acquired a small collection of C and C++ preprocessor macros. Some I collected and some I created by myself. A few of them might be already familiar to you, but I am sure there are also ones here you have never thought of. I present a number of those macros here.

Repeating Macros

From time to time a programmer must write long sequences of repeating text. For example:

// 16 spaces:
char spaces[] = "                "; 

// 8 '1's:
int ones[] = {1,1,1,1,1,1,1,1};

The task becomes harder when the repetition count increases. A string of 1,000 spaces can be very tedious to write. In such a case, you might prefer using run-time rather than compile-time initialization code. This can make the code slower and more complicated — sometimes unnecessarily, as you will see.

The macro families REPEAT_xxx and REPEAT_WC_xxx (Listing 1) are designed to help you in solving this and similar problems. Macro REPEAT_xxx (where xxx is some number) expands to xxx repetitions of its argument. Similarly, REPEAT_WC_xxx expands to xxx repetitions, separated by commas.

Alternatively, there are the two-argument versions of the macros: REPEAT and REPEAT_WC. In them, the repetition count is supplied as first argument, and the text to be repeated as second.

With the REPEAT macros you can rewrite the above declarations as:

char spaces[] = REPEAT_16(" ");
int ones[] = { REPEAT_WC_8 (1) };

Note that only macros for repetition counts that are powers of two are defined, such as REPEAT_2, REPEAT_4, REPEAT_8, etc., up to 1,024. You can easily increase this limit.
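Listing 1 is not reproduced in this text, so the following is only a minimal sketch of how such a family could be built, assuming each macro simply doubles the previous one. The macro bodies, the array dashes, and the helper checkRepeatSizes are illustrative assumptions, not the author's definitions.

```cpp
#include <cassert>

// Hypothetical definitions: each macro is built by doubling the previous one.
#define REPEAT_1(x)  x
#define REPEAT_2(x)  REPEAT_1(x) REPEAT_1(x)
#define REPEAT_4(x)  REPEAT_2(x) REPEAT_2(x)
#define REPEAT_8(x)  REPEAT_4(x) REPEAT_4(x)
#define REPEAT_16(x) REPEAT_8(x) REPEAT_8(x)

// The "with commas" family inserts a comma between the two halves.
#define REPEAT_WC_1(x) x
#define REPEAT_WC_2(x) REPEAT_WC_1(x), REPEAT_WC_1(x)
#define REPEAT_WC_4(x) REPEAT_WC_2(x), REPEAT_WC_2(x)
#define REPEAT_WC_8(x) REPEAT_WC_4(x), REPEAT_WC_4(x)

// Two-argument forms paste the literal count onto the family name.
#define REPEAT(count, x)    REPEAT_##count(x)
#define REPEAT_WC(count, x) REPEAT_WC_##count(x)

char spaces[] = REPEAT_16(" ");       // 16 spaces plus the terminating '\0'
int  ones[]   = { REPEAT_WC_8(1) };   // eight elements, all equal to 1
char dashes[] = REPEAT(4, "-");       // expands to REPEAT_4("-")

// Checks the sizes produced by the expansions above.
void checkRepeatSizes()
{
    assert(sizeof spaces == 17);
    assert(sizeof ones / sizeof ones[0] == 8);
    assert(sizeof dashes == 5);
}
```

In this sketch the count passed to REPEAT must be a literal for which a REPEAT_xxx macro exists, because the count is pasted, not evaluated.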

To use a count that is not a power of two, decompose it into a sum of such numbers. For example, 100 is the sum 64 + 32 + 4. Use the corresponding macros in combination. So to print 100 asterisks, you can write:

#define REPEAT_100(x) \
REPEAT_64(x) \
REPEAT_32(x) \
REPEAT_4(x)

REPEAT_100 (std::cout << '*';)

This code expands to a sequence of 100 identical insertions. This is not as silly as it might look. An optimization technique known as "loop unrolling" replaces loops with sequential code. If you are a fan of such optimizations, these macros give you the chance to unroll loops by hand.

Another example defines a function returning the index of the most significant bit of its eight-bit argument:

int getLastOnBit8 (unsigned f)
{
assert (f <= 0xff);
// 256-character lookup table, indexed by f
return ("\xff" // -1 (where char is signed)
        "\0"
        REPEAT_2   ("\1")
        REPEAT_4   ("\2")
        REPEAT_8   ("\3")
        REPEAT_16  ("\4")
        REPEAT_32  ("\5")
        REPEAT_64  ("\6")
        REPEAT_128 ("\7")) [f];
// "\xff\0\1\1\2\2\2\2\3.."[f]
}

It may not be pretty, or space efficient, but it is very fast.

Unique Names

Macro UNIQUE_NAME (Listing 1) expands to a C++ identifier that is unique for each line of the source file. For this purpose, a prefix is concatenated with the result of the expansion of the predefined macro __LINE__. As a result, this code:

int UNIQUE_NAME,
    UNIQUE_NAME;

declares two different variables, whereas this one:

int UNIQUE_NAME, UNIQUE_NAME;

causes a translation error, such as:

'uniqueNameOnLine_65' : redefinition

Of course, a chance exists that two presumably different unique names could expand to one and the same identifier, probably causing a name conflict. This can happen if they appear on the same line number in different files (including included ones). To solve such conflicts, you can use the macro UNIQUE_NAME_WP (Listing 1). It lets you supply your own, more unique, prefix.

UNIQUE_NAME is good wherever you need an identifier that is different from all the others — a quite common case — provided that you will not use it anywhere else in your program. Such identifiers are effectively anonymous. They give you an easy way to avoid name conflicts, at least within one source file. You will see examples of using UNIQUE_NAME throughout the rest of this article.

As an implementation note, UNIQUE_NAME_1 and UNIQUE_NAME_2 (Listing 1) are internally used helper macros. It is not possible to define UNIQUE_NAME simply as:

#define UNIQUE_NAME \
   uniqueNameOnLine_##__LINE__

This does not work because of how the preprocessor handles ##. A macro name is not expanded if it is an operand of the stringizing (#) or token-pasting (##) operator, so the definition above would paste the name __LINE__ itself and always yield uniqueNameOnLine___LINE__. So you need to make __LINE__ an argument to yet another macro. Thus, the implementation of UNIQUE_NAME requires two additional macro calls. In our case they are UNIQUE_NAME_2, where the expansion of __LINE__ takes place, and UNIQUE_NAME_1, which concatenates the prefix argument with the already expanded line number.
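The two-level expansion can be sketched as follows. These definitions are an assumption about what Listing 1 might contain, and NAIVE_NAME, TO_STRING, goodName, naiveName, and the two functions are hypothetical helpers added only to make the difference visible.

```cpp
#include <cassert>
#include <string>

// Hypothetical definitions; the real Listing 1 may differ in detail.
// UNIQUE_NAME_1 pastes its arguments, which arrive already expanded.
#define UNIQUE_NAME_1(prefix, line) prefix##line
// UNIQUE_NAME_2 does no pasting itself, so __LINE__ is expanded here.
#define UNIQUE_NAME_2(prefix, line) UNIQUE_NAME_1(prefix, line)
#define UNIQUE_NAME_WP(prefix)      UNIQUE_NAME_2(prefix, __LINE__)
#define UNIQUE_NAME                 UNIQUE_NAME_WP(uniqueNameOnLine_)

// The one-step attempt pastes the name __LINE__ itself, not its value.
#define NAIVE_NAME uniqueNameOnLine_##__LINE__

// Helper (not from the article) that turns an expanded macro into a string.
#define TO_STRING_1(x) #x
#define TO_STRING(x)   TO_STRING_1(x)

const std::string goodName  = TO_STRING(UNIQUE_NAME);  // "uniqueNameOnLine_<line>"
const std::string naiveName = TO_STRING(NAIVE_NAME);   // "uniqueNameOnLine___LINE__"

// Two declarations on different lines get two different names.
void declareTwoAnonymousInts()
{
    int UNIQUE_NAME = 1;
    int UNIQUE_NAME = 2;   // would be a redefinition on the same line
}

// Compares the generated names with what the two definitions should produce.
void checkNames()
{
    assert(goodName.find("uniqueNameOnLine_") == 0);
    assert(naiveName == "uniqueNameOnLine___LINE__");
}
```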

Line Number as a String

The macro LINE_STRING (Listing 1) expands to a quoted string, such as "1234", corresponding to the line number in the source file where it is used. It can be used to produce diagnostic messages. Most compilers have a way of printing a user-supplied message during translation. For instance, Microsoft Visual C++ v6.0 has the pragma:

#pragma message ("Something to do")

The #pragma message expects a quoted string argument. This means that you cannot use the predefined __LINE__ macro directly in its argument. The reason is that __LINE__ expands to a decimal integer constant (like 1234), and not to a string (like "1234"). Writing:

#pragma message (__LINE__": TO DO")

would cause the compiler to print a warning about improper use of the pragma rather than your message. But if you convert __LINE__ to a string, using LINE_STRING, as in:

#pragma message(LINE_STRING ": TO DO")

you will get the desired result:

123: TO DO

You can go even further and define a macro that expands to a string in a format that your IDE (integrated development environment) understands. Clicking on such a message can then open the source file and mark the line where the macro was invoked, just as with normal compilation errors.

For instance, the Microsoft Visual Studio IDE understands compiler messages in the following format:

file_name(line_number):

The definition of the macro HERE (Listing 1) expands to a quoted string in exactly this format, so that you can use it — alone or combined with another string — in a pragma message. Compiling:

#pragma message (HERE " TO DO")

produces a message like:

test.cpp(44): TO DO

Clicking on this message brings you directly to the code above.
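A minimal sketch of how LINE_STRING and HERE could be defined is shown here, assuming the same two-level trick used for UNIQUE_NAME. The helper macros LINE_STRING_1 and LINE_STRING_2, the strings lineText and hereText, and the function checkLineStrings are hypothetical names, and Listing 1 may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical definitions; Listing 1 may differ.
// Two levels are needed so that __LINE__ is expanded before # is applied.
#define LINE_STRING_1(line) #line
#define LINE_STRING_2(line) LINE_STRING_1(line)
#define LINE_STRING         LINE_STRING_2(__LINE__)

// file_name(line_number): built by adjacent string literal concatenation.
#define HERE __FILE__ "(" LINE_STRING "):"

const std::string lineText = LINE_STRING;      // for example "12"
const std::string hereText = HERE " TO DO";    // for example "test.cpp(13): TO DO"

void checkLineStrings()
{
    // The line number consists of decimal digits only.
    assert(!lineText.empty());
    assert(lineText.find_first_not_of("0123456789") == std::string::npos);
    // The message ends with "): TO DO".
    assert(hereText.rfind("): TO DO") == hereText.size() - 8);
}
```

Because both macros expand to ordinary string literals, they can be used anywhere a literal is accepted, not only in a pragma.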

Controlling Code Execution

Quite often situations occur that require executing a code sequence only the first time the program reaches it. The most typical case is first-time initialization. One approach is to use a static initialized flag, as in:

static bool initialized = false;
if (!initialized)
    {
    initialized = true;
    << initialization_code >>
    }

The macro ONCE(execution_code) (Listing 1) does the same job as the code above, maintaining the flag automatically, and taking the <<initialization_code>> as an argument. It uses UNIQUE_NAME as flag name in order to avoid conflicts. For example:

ONCE (
     initialize_lookup_table();
     initialize_cache();
     )
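One way ONCE could be written is sketched below, assuming the flag is a function-local static named by UNIQUE_NAME. The macro body is an assumption, and initializations, useLookupTable, and checkOnce are hypothetical names used only for the demonstration. The sketch is not thread-safe.

```cpp
#include <cassert>

#define UNIQUE_NAME_1(prefix, line) prefix##line
#define UNIQUE_NAME_2(prefix, line) UNIQUE_NAME_1(prefix, line)
#define UNIQUE_NAME UNIQUE_NAME_2(uniqueNameOnLine_, __LINE__)

// Hypothetical definition; Listing 1 may differ.
// All three uses of UNIQUE_NAME expand within one invocation,
// so they name the same flag.
#define ONCE(execution_code)              \
    {                                     \
    static bool UNIQUE_NAME = false;      \
    if (!UNIQUE_NAME)                     \
        {                                 \
        UNIQUE_NAME = true;               \
        execution_code                    \
        }                                 \
    }

int initializations = 0;

void useLookupTable()
{
    ONCE (
         ++initializations;   // stands in for real initialization code
         )
    // ... normal work would follow here ...
}

void checkOnce()
{
    useLookupTable();
    useLookupTable();
    useLookupTable();
    assert(initializations == 1);   // the code ran on the first call only
}
```

Code passed to a macro like this must not contain a comma outside parentheses, since the preprocessor would treat it as an argument separator.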

Sometimes you need to execute some code only occasionally, such as every Nth time that flow of control reaches the code. This might be, for example, some slow code such as a display refresh that is called within a time-critical loop. For this purpose you may use a static counter:

static int passes_count = 0;
if (++passes_count >= skip_count)
    {
    passes_count = 0;
    <<execution code>>
    }

Macro SKIP(count, execution_code) (Listing 1) simplifies this solution, maintaining the counter for you. Its first argument defines the skip count, and the second supplies the executable code. UNIQUE_NAME, used as counter variable, reduces the risk of name conflicts. Here is an example:

for (int j = 0; j < 1000; j++)
    SKIP(100, cout << '.'; )

This code prints ten (1000 / 100 == 10) dots.

By contrast, the macro REV_SKIP(skip_count, execution_code) (Listing 1) does a job exactly opposite to the one that SKIP does. With this macro, the code is executed on almost every visit. It is skipped just once every skip_count times.
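The two macros could be sketched as follows, assuming the counter logic shown earlier in this section. The macro bodies are assumptions, and countSkipRuns, countRevSkipRuns, and checkSkip are hypothetical helpers.

```cpp
#include <cassert>

#define UNIQUE_NAME_1(prefix, line) prefix##line
#define UNIQUE_NAME_2(prefix, line) UNIQUE_NAME_1(prefix, line)
#define UNIQUE_NAME UNIQUE_NAME_2(uniqueNameOnLine_, __LINE__)

// Hypothetical definitions; Listing 1 may differ.
// SKIP runs the code once every `count` passes.
#define SKIP(count, execution_code)       \
    {                                     \
    static int UNIQUE_NAME = 0;           \
    if (++UNIQUE_NAME >= (count))         \
        {                                 \
        UNIQUE_NAME = 0;                  \
        execution_code                    \
        }                                 \
    }

// REV_SKIP runs the code on every pass except once every `count` passes.
#define REV_SKIP(count, execution_code)   \
    {                                     \
    static int UNIQUE_NAME = 0;           \
    if (++UNIQUE_NAME >= (count))         \
        UNIQUE_NAME = 0;                  \
    else                                  \
        {                                 \
        execution_code                    \
        }                                 \
    }

int countSkipRuns()
{
    int runs = 0;
    for (int j = 0; j < 1000; j++)
        SKIP(100, ++runs; )
    return runs;
}

int countRevSkipRuns()
{
    int runs = 0;
    for (int j = 0; j < 1000; j++)
        REV_SKIP(100, ++runs; )
    return runs;
}

void checkSkip()
{
    assert(countSkipRuns() == 10);      // 1000 / 100 executions
    assert(countRevSkipRuns() == 990);  // 1000 passes minus 10 skipped ones
}
```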

Macro LOOP_C(loop_count, execution_code) (Listing 1) executes the code loop_count times. Its first argument, presumably an integer expression, is evaluated only once and stored in a counter variable whose name is determined by UNIQUE_NAME. For example:

LOOP_C (100, cout << '*';)

Macro LOOP(loop_count) (Listing 1) expands to a header of a for loop that will be executed loop_count times. For example:

LOOP (askGodForStarsCount())
   printf (a, b, c);

Note that this macro and the ones that follow make use of language features in C++. They typically cannot be used with C.
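A possible shape for the two loop macros is sketched here, assuming a countdown so that the count expression is evaluated exactly once. Defining LOOP_C in terms of LOOP is a simplification chosen for this sketch, and evaluations, starsCount, countIterations, and checkLoops are hypothetical names.

```cpp
#include <cassert>

#define UNIQUE_NAME_1(prefix, line) prefix##line
#define UNIQUE_NAME_2(prefix, line) UNIQUE_NAME_1(prefix, line)
#define UNIQUE_NAME UNIQUE_NAME_2(uniqueNameOnLine_, __LINE__)

// Hypothetical definitions; Listing 1 may differ.
// The count is evaluated once, in the initializer, and then counted down.
#define LOOP(loop_count) \
    for (int UNIQUE_NAME = (loop_count); UNIQUE_NAME > 0; --UNIQUE_NAME)

#define LOOP_C(loop_count, execution_code) \
    LOOP(loop_count) { execution_code }

int evaluations = 0;

// Counts how often the loop count expression is evaluated.
int starsCount()
{
    ++evaluations;
    return 5;
}

int countIterations()
{
    int iterations = 0;
    LOOP (starsCount())
        ++iterations;
    LOOP_C (3, ++iterations;)
    return iterations;
}

void checkLoops()
{
    assert(countIterations() == 8);   // 5 passes from LOOP, 3 from LOOP_C
    assert(evaluations == 1);         // starsCount() was called only once
}
```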

In C, program execution begins with the function main. In C++, though, the constructors of the global and file static objects execute before main is called. This gives you the opportunity to execute arbitrary code before main is called. All you have to do is to put it in the definition of the constructor of some class, and declare a global or file static object of this class, as in:

static class Unimportant
{
public:
Unimportant()
    {
    << code before main >>
    }
} unimportantObject;

The macro AT_START(execution_code) (Listing 1) simplifies the use of this code pattern. It takes as its single argument the code to be executed when the global or file static object is initialized. The names of the declared class and the object (the "unimportant" ones from the example above) are chosen for you using UNIQUE_NAME. Here is an example:

AT_START (
         initializeUniverse();
         bigBang();
         )

Most often I use this macro in order to call (automatically!) some testing code, making only local changes in a single source file. This approach is fast, easy, and avoids polluting the rest of the program.

Similarly, you might want to execute something after main returns. For this purpose, a global object's destructor can be used. The macro AT_END(execution_code) (Listing 1) behaves as you might expect:

AT_END (
       printf ("Made in heaven");
       )

When using the macros AT_START and AT_END, it is important to remember the rules concerning the order of initialization and destruction of global and file static objects:

  • Objects defined in the same compilation unit are initialized in order of their appearance in this file.
  • The order of initialization of objects in different compilation units is unspecified.
  • The order of destruction is exactly opposite to the order of initialization.

These rules naturally apply to AT_START and AT_END invocations as well.
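The pattern behind both macros can be sketched as follows, assuming UNIQUE_NAME_WP is used to name the class and the object. The prefixes AtStart_, atStartObject_, AtEnd_, and atEndObject_, as well as startOrder, startCalls, and checkStartOrder, are hypothetical names; Listing 1 may differ.

```cpp
#include <cassert>
#include <cstdio>

#define UNIQUE_NAME_1(prefix, line) prefix##line
#define UNIQUE_NAME_2(prefix, line) UNIQUE_NAME_1(prefix, line)
#define UNIQUE_NAME_WP(prefix) UNIQUE_NAME_2(prefix, __LINE__)

// Hypothetical definitions; Listing 1 may differ.
// The code runs in the constructor of a file static object.
#define AT_START(execution_code)                         \
    static class UNIQUE_NAME_WP(AtStart_)                \
        {                                                \
    public:                                              \
        UNIQUE_NAME_WP(AtStart_) () { execution_code }   \
        } UNIQUE_NAME_WP(atStartObject_);

// The code runs in the destructor of a file static object.
#define AT_END(execution_code)                           \
    static class UNIQUE_NAME_WP(AtEnd_)                  \
        {                                                \
    public:                                              \
        ~UNIQUE_NAME_WP(AtEnd_) () { execution_code }    \
        } UNIQUE_NAME_WP(atEndObject_);

// Plain data, initialized statically before any constructor runs.
int startOrder[2] = { 0, 0 };
int startCalls = 0;

AT_START ( startOrder[startCalls++] = 1; )
AT_START ( startOrder[startCalls++] = 2; )
AT_END   ( std::puts ("Made in heaven"); )

// Called from main: both AT_START blocks have already run, in source order.
void checkStartOrder()
{
    assert(startCalls == 2);
    assert(startOrder[0] == 1 && startOrder[1] == 2);
}
```

The two AT_START invocations sit in the same compilation unit, so the first rule above guarantees the recorded order.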

Saving and Restoring Values

Imagine that you have some code that modifies, directly or by a side effect, a certain variable, and you must restore its initial value after the code completes execution. You might write:

// int var - to be restored

int var_saver = var;
<<some code>>
var = var_saver;

Things become more complicated if the code has branches:

int var_saver = var; // save
if (whatever)
   {
   <<some code>>
   var = var_saver; // restore
   return;
   }
else if (something_else)
   {
   <<more code>>
   var = var_saver; // restore
   throw 0;
   }
var = var_saver; // restore

In this example there are three places that restore the value of var upon leaving. In more complex code there can be even more. A common C++ idiom for simplifying this code is to declare an auto object whose destructor, called automatically upon leaving the current block, restores the saved value. The destructor is executed no matter how execution leaves the block.

The macro SAVE(var) (Listing 1) does the saving and restoring of the value of var for you. The macro makes use of a couple of template classes, a template function, and a macro definition. I present it here in several steps.

First consider the template class DelayedAssigner_T (Listing 1). It is able to:

  • store a reference to a variable of type T (the parameter type) that has to have a delayed assignment to it,
  • store the value of the same type that needs to be remembered,
  • upon destruction, assign the stored value to the referenced variable.

Now you can rewrite the example from above in this way:

DelayedAssigner_T<int>
    var_saver (var); // save
if (whatever)
   {
   <<some code>>
   return; // no restore
   }
else if (something_else)
   {
   <<more code>>
   throw 0; // no restore
   }
// no restore

The constructor of var_saver saves the initial value of var, and its destructor assigns it back. It replaces all the restores spelled out in the initial example above.
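A minimal sketch of such a class is given here, limited to the three capabilities listed above. It is an assumption about Listing 1, which may differ (for example, the final version needs a base class). The names depth, workAtDepth, restoredAfterThrow, and checkDelayedAssigner are hypothetical.

```cpp
#include <cassert>

// Hypothetical sketch; Listing 1 may differ.
template <class T>
class DelayedAssigner_T
{
public:
    // Remember the variable together with its current value.
    explicit DelayedAssigner_T (T& var)
        : var_ (var), value_ (var) {}
    // Remember the variable together with some other value.
    DelayedAssigner_T (T& var, const T& value)
        : var_ (var), value_ (value) {}
    // The delayed assignment happens when the object goes out of scope.
    ~DelayedAssigner_T () { var_ = value_; }
private:
    T& var_;     // variable to assign to
    T  value_;   // value to assign upon destruction
};

int depth = 0;

int workAtDepth (int newDepth, bool fail)
{
    DelayedAssigner_T<int> depth_saver (depth);   // save
    depth = newDepth;
    if (fail)
        throw 0;      // depth is restored during stack unwinding
    return depth;     // depth is restored after the result is computed
}

// Returns true if depth has its old value after an exception.
bool restoredAfterThrow ()
{
    try { workAtDepth (7, true); }
    catch (int) { return depth == 0; }
    return false;
}

void checkDelayedAssigner ()
{
    assert (workAtDepth (5, false) == 5);
    assert (depth == 0);
    assert (restoredAfterThrow ());
}
```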

The code can be simplified by using the macro SAVE_T(type, var) (Listing 1). Its first argument is the type of the variable to be saved, and its second argument is the variable itself. Using this macro, the save statement from above can be written as:

SAVE_T (int, var)

This is already quite simple. But if you are as lazy as I am, you might read further to see an even simpler solution. I will try to remove the need to supply the type of the variable to be saved (the first argument of SAVE_T). It can be difficult to remember, type, and read. There is always a chance to get it wrong. And it is redundant anyway, as you will see. For this purpose, I will use the template function makeDelayedAssigner_T (Listing 1) to create a DelayedAssigner_T<T> that deduces T (the type of the variable to be saved/restored) out of its argument (the variable itself).

Unfortunately, it doesn't do the job alone. It will indeed return a properly created DelayedAssigner_T object, but afterwards this object must stay alive until flow of control leaves the block. (Calling the function without using the returned value would cause immediate destruction of the returned value.) Therefore, I must declare a variable in which to store it. But in order to declare such a variable, I must (again) explicitly mention what type it is. This is exactly what I am trying to avoid.

There might be a simple solution if C++ had some kind of operator typeof that could be used in the declaration. typeof would yield the type of its argument without evaluating it, much like the operator sizeof. You could then write:

#define SAVE(var)                            \
   typeof (makeDelayedAssigner_T (var))      \
   UNIQUE_NAME (makeDelayedAssigner_T (var));

There were several discussions about typeof on the newsgroup comp.std.c++. But Standard C++ does not currently define such an operator.

Without typeof, one possible solution is to use a pointer to a non-template base class, containing the address of an object, dynamically allocated by a template function, and derived from class DelayedAssigner_T. We use a static type for this pointer, so we are able to declare a variable. We would also need a class capable of performing a delete upon destruction, such as template class auto_ptr in the Standard C++ library. And because our pointer would actually point to a derived class object, we would need the virtual mechanism to make sure that the proper destructor, that of the derived DelayedAssigner_T, gets called:

#define SAVE(var)                            \
   std::auto_ptr <DelayedAssigner_T_Base>    \
   UNIQUE_NAME (newDelayedAssigner_T (var));

This is already quite a workable solution. It has, though, a considerable drawback: it allocates memory in the process. Memory allocation has traditionally been considered to be slow. This means that such a solution might have a considerable negative impact on performance.

The "good" solution I am about to present has something in common with these ideas, so I hope that mentioning them will help you to better understand it. The memory allocation mentioned above can be avoided if "placement new" is used instead. Placement new lets you create an object in storage which you supply yourself rather than from the heap, such as in an auto variable. When using placement new, you are responsible for making sure that the storage is available and large enough.

The template class UniversalStorage_T (Listing 1) is designed to serve as such universal storage. It combines several useful features:

  • It is a template class, whose parameter SIZE is presumably the size of the object to be stored in it.
  • It has a template constructor, which deduces the type of the object being stored from its argument.
  • Its constructor relies on its argument's copy constructor to build the stored object in the right way.
  • Its destructor calls the stored object's base-class destructor, which, if virtual, calls the destructor of the derived class, as desired.

What we want to store in such an object is the DelayedAssigner_T object, returned by makeDelayedAssigner_T(var), as mentioned above. For this purpose, we will declare an auto variable of class UniversalStorage_T<Base, SIZE>. We are able to supply the right SIZE parameter, using the operator sizeof with a call to makeDelayedAssigner_T(var). The other parameter of UniversalStorage_T, Base, will always be one and the same, namely DelayedAssigner_T_Base.

As argument for the constructor of the declared variable we will use the returned value from yet another call to makeDelayedAssigner_T. This call actually happens, unlike the one in the argument to sizeof, which is never executed. Thus we have finally got rid of the type of the variable to be saved and restored:

UniversalStorage_T
  <DelayedAssigner_T_Base,
   sizeof (makeDelayedAssigner_T (var))>
  var_saver (makeDelayedAssigner_T (var));

To simplify using the code from above, we define the macro SAVE(var) (Listing 1). Using it is quite simple:

SAVE (var)

Compared to the other solutions mentioned so far, this macro offers several savings and improvements:

  • The names of the supporting classes, functions, and macros are not needed. They are always the same.
  • Mentioning the type of the variable is also not needed. It is deduced automatically by a template function.
  • The name of the temporary storage variable is not needed. UNIQUE_NAME supplies the name.
  • The macro's name suggests its intention, improving code readability.
  • No dynamic memory allocation is involved.
  • Considerably less typing is required.
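The pieces described in this section could fit together roughly as in the following sketch. It is written from the prose description only, so every class body here is an assumption and Listing 1 may differ. In particular, keeping a Base pointer next to the buffer, the union used to align the buffer, and the run-time size check are choices made for this sketch. The names indent, indentedDepth, and checkSave are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

#define UNIQUE_NAME_1(prefix, line) prefix##line
#define UNIQUE_NAME_2(prefix, line) UNIQUE_NAME_1(prefix, line)
#define UNIQUE_NAME UNIQUE_NAME_2(uniqueNameOnLine_, __LINE__)

// Non-template base, so that the storage can destroy any assigner.
class DelayedAssigner_T_Base
{
public:
    virtual ~DelayedAssigner_T_Base () {}
};

template <class T>
class DelayedAssigner_T : public DelayedAssigner_T_Base
{
public:
    explicit DelayedAssigner_T (T& var) : var_ (var), value_ (var) {}
    DelayedAssigner_T (const DelayedAssigner_T& other)
        : DelayedAssigner_T_Base (), var_ (other.var_), value_ (other.value_) {}
    virtual ~DelayedAssigner_T () { var_ = value_; }   // the delayed assignment
private:
    T& var_;     // variable to assign to
    T  value_;   // value remembered at construction
};

// Deduces T from its argument.
template <class T>
DelayedAssigner_T<T> makeDelayedAssigner_T (T& var)
{
    return DelayedAssigner_T<T> (var);
}

// Raw storage of SIZE bytes holding a copy of an object derived from Base.
template <class Base, std::size_t SIZE>
class UniversalStorage_T
{
public:
    template <class Derived>
    UniversalStorage_T (const Derived& object)
    {
        assert (sizeof (Derived) <= SIZE);
        // Placement new: copy-construct the object inside the buffer.
        object_ = new (storage_.bytes) Derived (object);
    }
    ~UniversalStorage_T () { object_->~Base (); }   // virtual call reaches Derived
private:
    UniversalStorage_T (const UniversalStorage_T&);             // not copyable
    UniversalStorage_T& operator= (const UniversalStorage_T&);  // not assignable
    // Assumption: the extra members give the buffer sufficient alignment.
    union { unsigned char bytes[SIZE]; long double d; void* p; long l; } storage_;
    Base* object_;
};

#define SAVE(var)                                              \
    UniversalStorage_T <DelayedAssigner_T_Base,                \
                        sizeof (makeDelayedAssigner_T (var))>  \
        UNIQUE_NAME (makeDelayedAssigner_T (var));

int indent = 0;

int indentedDepth (int extra)
{
    SAVE (indent)
    indent += extra;
    return indent;    // indent is restored after the result is computed
}

void checkSave ()
{
    assert (indentedDepth (4) == 4);
    assert (indent == 0);
}
```

The temporary returned by makeDelayedAssigner_T is destroyed at the end of the declaration and assigns the saved value once more. At that point the variable has not been modified yet, so for simple types this extra assignment has no visible effect.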

Here is one last variation on the same theme. Upon leaving the block you might want to assign to the target variable a value that is different from its initial one. The macro DELAYED_ASSIGN(var_ref, value) (Listing 1) offers such a service. Consider the following code that deletes a linked list of dynamically allocated nodes:

while (head)
   {
   DELAYED_ASSIGN(head, head->next)
   delete head;
   }

It does the same as:

while (head)
   {
   Node* next = head->next;
   delete head;
   head = next;
   }

Semicolons and Macros

As a final note, I would like to call attention to the way the macros have been used in all of the examples. Note the lack of a semicolon at the end of the macro "statements." This is a deliberate design decision, and is part of the philosophy I follow when I use and define macros. I know that many programmers write a semicolon in such cases, even if it's not needed, just because the code looks "nicer." Besides, the semicolon is "harmless" and "safer." As to what looks "nicer," there are as many opinions as there are programmers, or even more. The point is, though, that using semicolons with macros might be dangerous, and I will try to show why.

Let's consider a macro, taking a type name as argument, and defining some aliases for pointer and reference types:

#define TYPEDEF(Type)                 \
   typedef       Type* Type##Ptr;     \
   typedef const Type* Type##CPtr;    \
   typedef const Type& Type##_;

Now let's try using it this way:

TYPEDEF(int);

This compiles properly. Oddly, though, it sometimes does not do what you expect. How can this be?

The meaning of the code above depends on whether the compiler sees the macro definition or not:

  • If it sees it, it is an invocation of the TYPEDEF macro.
  • If not, it is (surprise!) a declaration of a function called TYPEDEF, implicitly returning int, and taking an int argument.

Both are quite valid constructs in any context, so the compiler will always be content and silent. Even worse, if you misspell the name of the macro, it will in any case be taken as a function declaration.

Such errors are extremely difficult to find, because in most cases the compilation error appears (if at all) at a place quite unrelated to where the actual error occurs (e.g., intPtr undefined). Traditionally, we expect the compiler to help us find such "small" errors as forgotten include or define directives, or mistyped identifiers. And here it just does not happen! Is this not a Very Bad Thing?

Now let's assume that when using TYPEDEF you just removed the semicolon:

TYPEDEF(int)

It is no longer a good function declaration, but it is still a good macro invocation. If such a macro does not exist, or is misspelled, or invisible, you will get a compilation error, most probably on the very next line.

I hope this convinces you not to use semicolons with macros anymore.

Last Words

Listing 2 shows a small test program that can be used to test the macros in the collection presented so far. To test a header containing these macros, just replace the name of the include file (UtilityMacros.h) with the name of the header containing definitions from Listing 1.

Some people assert that macros are no longer needed now that C++ includes templates. Although sometimes templates are much more appropriate than macros, there remain situations where macros are still the only alternative. As shown above, using both templates and macros in combination offers even greater possibilities.

Some of the ideas presented in this article might look like overkill for many simple tasks. Consider, however:

  • You will write them once but you will probably use them many times. This way, the implementation complexity is not an issue.
  • The additional performance overhead, if present at all, in most cases consists of a couple of inline function calls and can be neglected.
  • Using them can simplify your code, and code simplicity is the Most Important Thing.

Radoslav Getov works as senior R&D engineer at Ansoft Corporation, Pittsburgh, PA. He's been developing EDA CAD systems using C and C++ during most of his career as a software developer. You can reach him at [email protected] or visit his web page members.tripod.com/~radosoft.

