Compatibilities

By Herb Sutter, December 01, 2004

When are two variables really two variables? Ask the Guru.


Herb Sutter (http://www.gotw.ca/) chairs the ISO C++ Standards committee and is a software architect at Microsoft, where he is responsible for designing C++ language extensions for .NET programming (C++/CLI). His two most recent books are Exceptional C++ Style and C++ Coding Standards.

This article is about things our Standard never told us, and that most of us had to learn on our own the hard way, out in the streets.

I occasionally get mail from readers that starts with code like this:

// Example 1
//
void SomeFunction( std::string& );

You're probably scratching your head, thinking: "Well, nothing seems wrong with that. It's too simple to be wrong." And then, perhaps an anxious moment of: "Is C++ so weird that there's a pitfall in something this straightforward?" Relax. The issue isn't with C++. The issue is one that affects pretty much every language, and that is portability—specifically, the issue is link and binary portability.

The problem with Example 1 happens to programmer after programmer in company after company. When I get e-mail about it, or when people ask about it during conference or training sessions, the question usually goes something like this: "I'm in shared library (or DLL) Lib1, calling SomeFunction in shared library (or DLL) Lib2. It doesn't work. Why not?"

Here's the short answer: If you can guarantee that a function and all of the code that will ever call it will be compiled using:

the same compiler (including version),
compatible compiler switch settings, and
the same Standard Library implementation (including version) (if any Standard Library types appear in the function's interface, or any Standard Library facilities are shared),

only then can you use a user-defined type as a function parameter. And "user-defined types" include Standard Library types such as std::string, which brings us back to Example 1. (Aside: If you can't guarantee those things, you also can't reliably allocate memory in the function and deallocate it elsewhere, and vice versa, because the memory managers might be different. You can't reliably throw exceptions, either, because the exception-handling mechanisms might be different. I mention those for completeness, but in this article, I'm going to focus mainly on the issue of compatibility of user-defined types.)

Now you know (and you probably already knew) the main points of this column. But please read on anyway as we delve into the details, because the consequences can be a source of subtle surprises in the real world.

Object Layout

Clearly, functions can't communicate via an object (for example, as a function parameter) unless they can agree on how the bits of the object are laid out in memory. This means that they must agree on the object's size, the internal offsets of base classes and data members, the internal locations of virtual function call mechanics, and any other necessary details.

The C++ Standard, however, doesn't specify exactly how objects are laid out in memory.

Specifically, it does not fully specify three things. First, compilers are generally free to arrange the base and member objects of a class in whatever way they want. The layout of a base class B subobject in a derived class can even be different from a B object's layout when it is instantiated standalone. The only constraint the Standard imposes is that all the members that appear in the class's definition without an intervening access specifier (public:, protected:, or private:) must physically appear in the order the programmer declared them; this minor constraint exists for the sake of C compatibility.

For example, consider this class definition:

// Example 2
//
class C {
private:
  char c_;
private:
  int i_;
  short s_;
};

The compiler is required to generate an object layout for class C that puts i_ before s_, because there are no intervening access specifiers between the declarations of i_ and s_, but that's it. The compiler is free to put c_ before or after i_ and s_, or even to put it in between them.
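If you want to see which layout your own compiler picks, you can simply ask it. The following is a minimal sketch rather than anything from the Standard: the names Members and describe_layout are hypothetical, and the members are made public (unlike Example 2) so that the type is standard-layout and offsetof is well defined for it.

```cpp
#include <cstddef>   // offsetof
#include <string>    // std::string, std::to_string

// Hypothetical stand-in for Example 2's class C: the same three members,
// but all public, so the type is standard-layout and offsetof is allowed.
struct Members {
    char  c_;
    int   i_;
    short s_;
};

// Returns a one-line description of the layout this compiler chose
// under the switch settings it was invoked with.
std::string describe_layout() {
    return "c_@"   + std::to_string(offsetof(Members, c_)) +
           " i_@"  + std::to_string(offsetof(Members, i_)) +
           " s_@"  + std::to_string(offsetof(Members, s_)) +
           " size=" + std::to_string(sizeof(Members));
}
```

Because all three members share one access level here, the compiler must keep them in declaration order. What can still differ between compilers and switch settings is the size of the gaps between the members, and therefore the offsets of i_ and s_ and the total size.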

Second, the Standard doesn't specify what padding (if any) is placed between data members: Compilers are free to add extra padding for the sake of alignment. Certain processors, or certain instructions on some processors, want given data types to be aligned on certain byte or word boundaries (for example, Intel SSE instructions want float arrays to have 16-byte alignment). Depending on the processor, if data is improperly aligned, then either the operation is slower or else the processor halts the program outright. So it's the compiler's job to insert extra "blank spaces" of unused padding bytes inside objects so that all the members are aligned correctly, for either correctness or just for better performance. In addition to doing this automatically, most C++ compilers also provide switches and/or language extensions to control the alignment (aligned(n), __declspec(align(n)), for instance) and packing (for example, #pragma pack) on a class-by-class basis, or even on an object-by-object basis.

By the way, in the preceding paragraph, did you notice one reason why using different switch settings on the very same compiler can generate different and binary-incompatible layouts for the same type? I said that many processors only prefer, but don't require, a given alignment, and therefore an object layout that violates the processor's preferred alignment can still be perfectly legal; some instructions may just not execute as quickly as they would if the data had been properly aligned. Therefore, you may find that your compiler could generate a different object layout for your classes, depending on whether you choose to optimize for space (favoring smaller objects with less padding) or for speed (favoring better speed with more alignment padding).
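The same effect is easy to reproduce without touching the command line. The sketch below is an illustration, not part of the original column: it uses #pragma pack as a stand-in for a packing switch, the type names NaturalLayout and PackedLayout are hypothetical, and the pragma is a common extension that your compiler may or may not honor.

```cpp
#include <cstddef>   // std::size_t

// Default layout: the compiler may insert padding for alignment.
struct NaturalLayout {
    char  c;
    int   i;
    short s;
};

// Same members, but packed to 1-byte alignment. #pragma pack is a widely
// supported extension, not part of the C++ Standard (assumption: the
// compiler in use honors it).
#pragma pack(push, 1)
struct PackedLayout {
    char  c;
    int   i;
    short s;
};
#pragma pack(pop)

// Two sizes for what is, member for member, the same type.
constexpr std::size_t natural_size = sizeof(NaturalLayout);
constexpr std::size_t packed_size  = sizeof(PackedLayout);
```

On a compiler that honors the pragma, packed_size is the plain sum of the member sizes, while natural_size is usually larger. Two modules that compiled the same source under different packing would read each other's objects at the wrong offsets.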

Consider again Example 2. Assume a compiler is targeting a platform with 4-byte ints, 2-byte shorts, and 1-byte chars. Also assume that this particular platform prefers, but does not require, that ints be aligned on a 4-byte boundary. If the compiler just lays out the members in declaration order without padding, you get:

char c_;    // 1 byte (byte 1)
int i_;     // 4 bytes (bytes 2-5)
short s_;   // 2 bytes (bytes 6-7)
            // total size = 7

This layout is probably fine, but it fails to align the i_ member on a 4-byte boundary. What if the compiler lays out the members in declaration order, but aligns the int? Then you get:

char c_;    // 1 byte (byte 1)
            // 3 unused padding bytes (bytes 2-4)
int i_;     // 4 bytes (bytes 5-8)
short s_;   // 2 bytes (bytes 9-10)
            // 2 unused padding bytes (bytes 11-12)
            // total size = 12

Note that the compiler generates internal padding after c_ to ensure that i_ is at a 4-byte offset from the start of the object, and then it also adds external padding at the end of the object so that when you make an array, each object still has the right alignment because the whole object's size is now a multiple of 4 bytes.

Of course, a compiler is unlikely to generate something as inefficient as that in Example 2. More likely, the compiler would take advantage of its ability to reorder the members and generate a layout similar to this:

int i_;     // 4 bytes (bytes 1-4)
short s_;   // 2 bytes (bytes 5-6)
char c_;    // 1 byte (byte 7)
            // 1 unused byte (byte 8)
            // total size = 8

This achieves the best possible alignment with the smallest padding overhead, and ensures that i_ is always correctly aligned, including in every object in an array of this type. (If directed to optimize for space instead of speed, a compiler might omit the padding byte. If so, that would preserve i_'s alignment in all single objects but sacrifice i_'s alignment in three out of every four elements of an array of this type, because consecutive 7-byte elements start on a 4-byte boundary only once every four elements.)
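The array arithmetic is easy to check for yourself. This is a small sketch under two assumptions: the array itself starts at an aligned address, and i_ sits at offset 0 of each element, as in the reordered layout. The function name aligned_elements is hypothetical, and the constexpr loop needs C++14 or later.

```cpp
#include <cstddef>   // std::size_t

// Counts how many of the first `count` array elements begin at an offset
// that is a multiple of `alignment`, assuming the array starts on an
// aligned address and consecutive elements are `object_size` bytes apart.
constexpr std::size_t aligned_elements(std::size_t object_size,
                                       std::size_t alignment,
                                       std::size_t count) {
    std::size_t aligned = 0;
    for (std::size_t k = 0; k < count; ++k) {
        if ((k * object_size) % alignment == 0) {
            ++aligned;
        }
    }
    return aligned;
}
```

With 7-byte elements, only every fourth element starts on a 4-byte boundary; with 8-byte elements, every one does.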

Third, and finally, the Standard doesn't specify the virtual function mechanics, and any machinery for dealing with virtual base classes. The compiler may insert a pointer to the class's vtable anywhere it likes inside the object layout (perhaps taking advantage of some space that would otherwise be mostly padding), if it uses a vptr/vtable scheme at all.
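You can usually observe the extra machinery indirectly through sizeof. The following sketch assumes an implementation that stores some per-object bookkeeping (typically a vptr) for polymorphic classes, which is common but not something the Standard promises; the type names Plain and Polymorphic are hypothetical.

```cpp
#include <cstddef>   // std::size_t

// No virtual functions: the object holds only its data member.
struct Plain {
    int value;
};

// One virtual function: on typical implementations the compiler adds a
// hidden pointer to the class's vtable somewhere inside the object.
struct Polymorphic {
    int value;
    virtual ~Polymorphic() {}
};

// How much bigger the polymorphic object is on this compiler. Where the
// hidden data lives inside the object is up to the implementation.
constexpr std::size_t virtual_overhead = sizeof(Polymorphic) - sizeof(Plain);
```

The size of the difference, and where in the object the hidden pointer goes, are exactly the kinds of details on which two compilers are free to disagree.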

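Thus, the C++ Standard leaves these layout decisions to each compiler. Layout is not the only thing that varies, however. Consider again the function from Example 1: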

void SomeFunction( std::string& );

If you look in the object file, you probably won't see that name. Instead, you'll see an encoded ("mangled") name. The name of this particular function is encoded as follows on one popular compiler:

?SomeFunction@@YAXAAV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z

You can almost see the encoding of the string parameter type because std::string is a convenience typedef for std::basic_string<char, std::char_traits<char>, std::allocator<char> >.

But the C++ Standard doesn't say that the compiler should mangle the name this particular way, or any particular way. It is silent on the matter, and lets compilers do whatever they want. That's actually a Really Good Thing, because this is one case where you really do want different compilers to use link-incompatible names for objects that aren't binary compatible with each other, so that you're less likely to accidentally succeed in linking them together!

In Example 1, say that the function is compiled using a compiler that does the above name mangling, and you try to call it from a function that's being compiled with a different compiler that uses a different name-mangling convention—and therefore generates a different mangled name when it tries to name SomeFunction in order to call it. From the linker's point of view, the caller is asking for a function that doesn't exist, and the linker will refuse to link the caller to the target function. And that is a good thing.

When Mangling Is "Too Compatible"

The difficulty arises when two compilers, or two settings for the same compiler, use the same name-mangling scheme even though they use different object layouts for the parameter and/or return types. Then the linker finds the name and dutifully links the caller to the callee, and the program fails at runtime because the caller and callee don't agree on what the objects they're sharing mean: how big they are, how they're laid out, what the bits really store. The results are liable to manifest as a crash or, worse, as silently corrupted data.

So now we can see what's going on in Example 1:

// Example 1 (reprise)
//
void SomeFunction( std::string& );

To correctly call this function, a caller must be compiled using:

The same compiler (or one that guarantees binary compatibility for object layouts, which is rare).
Compatible switch settings, so that optimization or packing settings won't affect the layout of the object.
The same Standard Library implementation and version, because different Standard Library implementations can and do implement standard types very differently, especially string (see [4]), even across successive versions of the same library implementation.

Finally, there is one last aspect of this problem: When people encounter it, they sometimes shake their heads (thinking it's the fault of their compiler vendor and/or Standard Library vendor) and say: "Well, heck, I guess we'll just have to wrap our uses of std::string in our own StringWrapper so that we can use that safely across modules." That sometimes works, if you know the problems above and are disciplined about ensuring perfect fidelity in the way you compile StringWrapper everywhere, but usually it just leads to the same problem as before, to the programmer's great consternation and despair. Why? Because unless you are very careful and have compilers that support binary compatibility, the layout of StringWrapper itself will now vary from compiler to compiler, or even from option to option in the same compiler.
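To see why the wrapper buys nothing by itself, consider a sketch of what such a class tends to look like. This StringWrapper is a hypothetical illustration, not code from any reader's mail:

```cpp
#include <string>

// Hypothetical wrapper of the kind described above. It hides std::string
// from the interface's spelling, but not from the object's layout.
class StringWrapper {
public:
    explicit StringWrapper(const char* text) : value_(text) {}

    const char* c_str() const { return value_.c_str(); }

private:
    // The wrapper's size and layout are whatever this Standard Library
    // implementation, compiler, and set of switches make of std::string.
    std::string value_;
};
```

The wrapper contains a std::string by value, so its own size and layout change whenever std::string's do. The incompatibility has only been moved one level down, where it is harder to see.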

Summary: Guidelines for Module Boundaries

What applies to functions applies in particular to the boundary of any module (nonsource unit of release; for example, shared library, DLL, or object file), which tends to be where you have the least amount of control over perfect build agreement with all possible callers. In a module, if you want to do any of the following in an externally accessible function:

Mention a user-defined type in the module's external interface (for example, as a parameter of a public or exported API).
Allocate memory in the module and deallocate it outside the module, or vice versa. (See Item 60 of [1].)
Allow an exception to propagate out of the module to a caller. (See Item 62 of [1].)

then you must be able to guarantee that the module and all of the code that will ever call it will be compiled using the same compiler and compatible compiler switch settings. Just using the same compiler isn't enough: Popular compilers can and do generate incompatible code for all three of the above if you use different switch settings. You also have to use the same Standard Library implementation, including the same version, if any Standard Library facilities are mentioned in an externally visible function's signature or are shared outside the module.

Unless you are certain you can (and can afford to) guarantee and enforce that, you can't do the above-listed things. You have to use alternatives instead:

Instead of user-defined types, mention only fundamental types that all callers can agree on (for instance, int or float), or use opaque handles, usually typedefed void*s, along with plain nonmember functions that wrap construction and destruction so that callers have a way to create and destroy objects.
Instead of directly deallocating memory, provide plain nonmember functions that callers can use, which keeps the actual allocation and deallocation code inside the module. Otherwise, if your code must deallocate memory that was obtained in another module, consider using a callback whereby the caller provides a pointer to a function in the correct module to actually perform the deallocation.
Instead of allowing exceptions to propagate out of the module, put catch(...)-all firewalls around all externally accessible functions and use error codes to communicate status to callers.
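The three alternatives above can be combined in one small interface. What follows is a minimal sketch under stated assumptions, not a definitive design: the names TextHandle, TextStatus, text_create, text_copy, and text_destroy are all hypothetical, and the module happens to use std::string internally.

```cpp
#include <cstddef>   // std::size_t
#include <cstring>   // std::memcpy
#include <string>

extern "C" {

// Opaque handle: callers never see the layout of the object behind it.
typedef void* TextHandle;

// Status codes built from a fundamental type every caller agrees on.
enum TextStatus {
    TEXT_OK = 0,
    TEXT_BAD_ARGUMENT = 1,
    TEXT_BUFFER_TOO_SMALL = 2,
    TEXT_FAILURE = 3
};

// Allocation happens inside the module...
int text_create(const char* initial, TextHandle* out) {
    if (!initial || !out) return TEXT_BAD_ARGUMENT;
    try {
        *out = new std::string(initial);
        return TEXT_OK;
    } catch (...) {          // firewall: no exception leaves the module
        return TEXT_FAILURE;
    }
}

// ...data crosses the boundary only as chars in a caller-owned buffer...
int text_copy(TextHandle handle, char* buffer, std::size_t buffer_size) {
    if (!handle || !buffer) return TEXT_BAD_ARGUMENT;
    try {
        const std::string& text = *static_cast<const std::string*>(handle);
        if (text.size() + 1 > buffer_size) return TEXT_BUFFER_TOO_SMALL;
        std::memcpy(buffer, text.c_str(), text.size() + 1);
        return TEXT_OK;
    } catch (...) {
        return TEXT_FAILURE;
    }
}

// ...and deallocation happens inside the module too, so the memory goes
// back to the same memory manager that handed it out.
void text_destroy(TextHandle handle) {
    delete static_cast<std::string*>(handle);
}

}  // extern "C"
```

A caller creates a handle with text_create, reads through text_copy into its own buffer, and releases the handle with text_destroy. At no point does the caller need to agree with the module about what a std::string looks like, which memory manager is in use, or how exceptions are implemented.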

For more, see also Item 63 of [1], and all of the Items in that section on Namespaces and Modules.

Next time: A trip report from the fall 2004 ISO C++ Standards meeting. Stay tuned.

References

[1] Sutter, H. and A. Alexandrescu. C++ Coding Standards, Addison-Wesley, 2005.

[2] Lippman, S. The C++ Object Model, Addison-Wesley, 1998.

[3] ISO/IEC 14882:2003(E), Programming Languages—C++ (updated ISO and ANSI C++ Standard including the contents of the 1998 Standard plus errata corrections).

[4] Meyers, S. Effective STL, Addison-Wesley, 2001.
