Dr. Dobb's is part of the Informa Tech Division of Informa PLC

Interoperability & C++ Compilers


March 04: Interoperability & C++ Compilers

When new compilers are introduced on existing platforms, it is important that they work with other compilers on that same platform. Users need to be able to mix code generated by the new compiler with code generated by the existing ones. Interoperability is the ability to mix the object files and libraries generated by more than one compiler and expect the resulting executable image to run successfully.

Application Binary Interfaces (ABI) describe the contents of the object files and executable images emitted by compilers. Platform descriptions have included common C ABI documents for years. As a result, interoperable C compilers are a common occurrence. However, C++ compilers have not been able to achieve this level of interoperability due to the lack of a common C++ ABI.

Recently, a common C++ ABI was developed that has been adopted on multiple platforms. As a result, it is now possible to develop interoperable C++ compilers. The advantages of using ABI-conformant C++ compilers include:

  • Object files generated by the two compilers can be linked together to produce a working program. This lets users move a few source files at a time to the new compiler.
  • The C++ Language Support Runtime can be shared. Since all conformant C++ compilers are using the same interface, the C++ language support runtime can be provided for a platform instead of for each compiler.

  • Libraries can be shared. This includes libraries provided by third-party vendors (including the C++ Standard Library) and user-generated libraries. Third-party vendors will not have to provide different library versions for each compiler.

  • Debuggers and other tools relying on the details of the C++ compiler implementation work with all conformant compilers without retooling.

In this article, I'll describe what is required for C++ compilers to be interoperable, why this level of interoperability can be achieved today, and how interoperability benefits programmers.

What Does Compiler Interoperability Mean?

Two compilers are considered to be interoperable if:

  • The first source file can be compiled with one compiler.
  • The second source file can be compiled with the other compiler.

  • The resulting object files can be linked together to form an executable image that runs correctly.

This applies to any two source files that share common data structures and/or have at least one function call from one file to a function defined in the other file.

When the topic of interoperability between two compilers arises, some people believe that conforming to the programming language standard, emitting the same object file format, and emitting the same debug format as another compiler is sufficient. These are indeed important interoperability requirements. No further requirements would be needed if the two files did not share any data structures and neither file called a function defined in the other.

Few useful applications can be organized into files that do not share anything between them. Once information sharing starts happening, the format of the information shared must be specified. Therefore, language conformance, a common object file format, and a common debug format are only the beginning of what is required for compiler interoperability.

Programming language standards define the syntax and semantics of a language. They usually also define a set of standard library routines. The library routines constitute an API. This information is sufficient for users who develop an application on one platform, port that application to another platform, and expect their source to compile if standard language constructs are used.

However, programming language standards do not specify how two conforming compilers work together. For example, the C and C++ Standards specify a long type. However, these documents fix only a minimum range for long (at least 32 bits), not its exact size or alignment. One compiler could treat long as a 32-bit quantity and another compiler could treat it as a 64-bit quantity. Both compilers conform to the programming language standard, but the output of these compilers is not interoperable.

The ABI for a programming language and platform defines what it means for compilers supporting that language and platform to interoperate. This is the document that specifies whether long has a size of 32 bits or 64 bits, and whether it is 4-byte aligned or 8-byte aligned. Other details must be specified in this document for interoperability to become a reality.

C Interoperability Requirements

The first step in achieving C++ interoperability is achieving C interoperability. Interoperability between two C compilers adds the requirements that the two compilers must:

  • Observe the same data structure layout conventions. This includes the size and alignment of basic types, and the layout of struct and union members.
  • Observe the same calling conventions. This includes the layout of the arguments and the location of the return value.

  • Provide fully compatible system and language header files. This usually means that the two compilers are using the same system and language header files.

  • Take similar paths through the source files. The header file search algorithms must be the same. Additionally, any differences in preprocessor symbol definitions must not introduce an API or ABI compatibility issue.

  • Accept the same syntax and exhibit the same semantics for that syntax for all constructs in system and language header files.

Data structure layout and calling conventions are usually described in the ABI. An example of a C ABI is described in the IA-64 Software Conventions and Runtime Architecture Guide [4].

Note that it is not sufficient for the two compilers to provide their own standard header files. If the internal representations of the standard data structures do not match, it is likely that the result of the mixed compilation will not run correctly. This violates the definition of compiler interoperability. Therefore, it is customary that interoperable C compilers use the same set of system headers.

Similar preprocessor symbol definitions are required to deal with situations such as this:

#ifdef FOO
int foo(int a, int b, int c);
#else
int foo(int a, int b);
#endif

If this code is in a header file included by both source files and one compiler defines FOO while the other compiler does not define FOO, the generated code will not be interoperable. This is because the function foo has a different argument list depending on which compiler is used.

C++ Interoperability Requirements

In addition to the C requirements, C++ interoperability requires that interoperable compilers implement the same C++ object model. The C++ object model has received a lot less coverage than most C++ application programming issues. A detailed description of C++ object models is provided in Stanley Lippman's Inside the C++ Object Model [1]. In general, a C++ object model defines the following:

  • Name-mangling conventions. Interoperable C++ compilers must mangle external names the same way so that symbols generated by one compiler can be referenced in code generated by another compiler.
  • Object layout issues not addressed by the C ABI. This includes the location of the virtual function pointer, representation of multiple inheritance, and representation of virtual inheritance.

  • The format and naming convention for any tables required to resolve virtual function addresses or members of virtual bases. This includes the virtual function table.

  • The interface to the C++ language support runtime. The C++ Standard provides a library API intended to be referenced by users. The C++ language support runtime interface is one or more libraries containing entry points referenced by the compiler to implement C++ features. C++ features requiring runtime support include C++ exception handling, RTTI, stack unwinding, operator new, operator delete, and construction/destruction of static objects. The routines and data structures comprising the C++ language support runtime are described in the C++ ABI. Figure 1 illustrates the relationship between a C++ compiler, C++ Standard Library, and C++ language support runtime. An example of a C++ object model specification is contained in the Itanium C++ ABI [2].

Interoperable C++ compilers must also support template instantiation mechanisms that can work with each other. If one compiler requires a prelinking phase while the other compiler emits all instantiations and expects the linker to eliminate duplicates, link-time conflicts will likely result. Template instantiation alternatives are described in C++ Templates: The Complete Guide, by David Vandevoorde and Nicolai M. Josuttis [3].

It is also desirable to make sure that other external files generated by the compilers are compatible or do not interfere with each other. For example, if two compilers support precompiled headers and output differently formatted files using the same file naming convention, the application build will take longer. This is because the precompiled header files will keep conflicting and need to be regenerated. If the precompiled header file formats are different, the compilers should have different file-naming conventions. The situation is similar for source browser files.

Why C++ Interoperability Can Be Achieved Today

When a new platform is developed, it is traditional for the owner of the platform to specify the C ABI. Any compiler vendor developing a C compiler for that platform would then conform to that ABI. However, the C++ ABI is traditionally specified by the compiler vendor. This means that each C++ compiler for a given platform implements a different C++ object model. As a result, C++ interoperability between two compilers is a rare occurrence.

A few years ago, a consortium of compiler vendors realized that C++ interoperability could not happen with this model of development. The result of this consortium's work was the C++ ABI described in [2]. The first implementation of this ABI was for the Intel Itanium processor. Multiple C++ compiler vendors have since implemented this ABI for the Pentium, ARM, and other processors. The C++ ABI has also been endorsed by the Linux Standard Base (LSB) for use with C++ compilers on Linux systems.

Additionally, test suites have been developed to measure conformance to the C++ ABI [6]. These suites check that the code generated by the C++ compiler conforms to the C++ ABI specification.

The availability of multiple C++ compilers that conform to the C++ ABI has presented another validation opportunity. It is now possible to generate tests that are partially compiled by two C++ ABI-conformant compilers and compare the results. This approach finds problems that the conformance suites might miss, or finds issues in areas the conformance suites do not address.

The techniques I've described here have been used to create two compilers that conform to the C++ ABI and have established interoperability with each other. The Intel Compilers for the Pentium and Itanium families can interoperate with GCC 3.2 and its current successors on Linux platforms [7]. This demonstrates that C++ interoperability can indeed become a reality.

The existence of the C++ ABI represents a movement from compiler-specific C++ ABI specifications to a platform-specific ABI specification, as was done in [5]. The availability of conforming compilers and conformance suites allows the conformance levels of C++ compilers to be measured. Therefore, we have reached a point in the evolution of C++ where C++ interoperability can achieve the availability that C interoperability has enjoyed for years.

The Benefits of C++ Interoperability

C++ interoperability provides a significant benefit to many vendors in the C++ market. The fact that object files generated by different C++ compilers can be linked together to form an executable image that runs correctly creates several opportunities.

  • The C++ Language Support Runtime can be shared between compilers. A C ABI for a particular platform includes a description of the call stack. That document can be expanded to include C++ runtime constructs such as the exception-handling unwind mechanism. It could also include the functional interface and data structures required for the other language support items. (See [5] for an example of such a specification.)
  • Libraries can be shared between compilers. Third-party vendors, including vendors of the C++ Standard Library, will not have to provide different versions of their runtime libraries for each C++ compiler. Compiler vendors no longer have to work with several library vendors to make sure library versions for their compilers are available. Library vendors no longer have to support several versions of their libraries for a given platform. Users no longer have to be concerned about which compiler built a library they purchased.

  • Users can use more than one compiler when building their application. If users have a portion of their application that is performance sensitive, they could choose to build that part with a high-performance compiler and build the rest of the application with a different compiler. Additionally, users can migrate to a new compiler gradually.

  • Debuggers and other tools relying on the C++ object model will work with objects generated by different compilers without retooling. In theory, these tools should use a format that does not make C++ object model assumptions, but such assumptions occasionally creep into implementations. If all C++ compilers are making the same assumptions, these tools continue to work.

Conclusion

C compilers have been interoperable with each other for years. With the development of the C++ ABI specification [2] and the broad support it has received, interoperable C++ compilers have become a reality. C++ interoperability will benefit compiler vendors, third-party C++ vendors, and C++ users for years to come.

References

[1] Lippman, Stanley B., Inside the C++ Object Model, Addison-Wesley, 1996. ISBN 0-201-83454-5.

[2] Itanium C++ ABI, http://www.codesourcery.com/cxx-abi.

[3] Vandevoorde, David and Nicolai M. Josuttis, C++ Templates: The Complete Guide, Addison-Wesley, 2003. ISBN 0-201-73484-2.

[4] "Itanium Software Conventions and Runtime Architecture Reference Guide," http://developer.intel.com/design/itanium/downloads/245358.htm.

[5] Application Binary Interface for the ARM Architecture, http://www.armdevzone.com/EABI/bsabi.pdf.

[6] C++ ABI Test Suite, http://www.codesourcery.com/abi_testsuite.

[7] Intel Compilers for Linux: Compatibility with GNU Compilers, http://www.intel.com/software/products/compilers/techtopics/LinuxCompilersCompatibility.htm.


Joe Goodman is a member of the Compiler Lab at Intel. He has been working with C++ compilers for over 10 years. Joe can be contacted at [email protected].


