Mixed-Language Programming & External Linkage


December, 2005: Mixed-Language Programming & External Linkage

Giri Mandalika is an engineering consultant at Sun Microsystems. He can be contacted at [email protected].


It is common practice to call functions of a C library from a C++ program. This works well as long as you restrict yourself to the standard headers and libraries supplied with the operating system. But novice programmers may stumble upon link-time errors as soon as they try to call functions of their own C library from a C++ program. Potential causes range from unfamiliarity with linkage specifications to not knowing how C and C++ compilers handle symbols during compilation.

In this article, I introduce the concept of linkage and show how a C++ program fails without proper language linkage. Mixing code written in C++ with code written in C is relatively straightforward, as C++ is mostly a superset of C. Although mixing C++ modules [1] with modules in languages other than C is allowed, it is a bit more complicated; hence, I restrict my discussion to C and C++ modules.

The C++ Standard provides a mechanism called "linkage specification" for mixing code in the same program that was written in different programming languages and compiled by the respective compilers. Steve Clamage, chair of the ANSI C++ Committee and technical lead for the Sun Studio C++ compiler, comments: "Linkage specifications have been a part of C++ from its early days. The Annotated C++ Reference Manual (ARM) described linkage specifications in 1990. I have found over the years that misunderstandings about linkage specifications are fairly common."

The C++ Standard uses the term "linkage" to describe the accessibility of objects in one file from another, or even within the same file. There are three types of linkage:

  • No linkage
  • Internal linkage
  • External linkage

Names local to a function, such as its parameters and local variables, have "no linkage" and can be accessed only within that function.

Sometimes it is necessary to declare functions and other objects within a single file in a way that lets them reference each other, but keeps them inaccessible from outside that file. This is done with "internal linkage." Symbols with internal linkage refer to the same object only within a single source file. Prefixing a file-scope declaration with the keyword "static" changes its linkage from external to internal.

Objects that have "external linkage" are all considered to be located at the outermost level of the program. This is the default linkage for functions and anything declared outside of a function. All instances of a particular name with external linkage refer to the same object in the program. If two or more declarations of the same symbol have external linkage but with incompatible types (for example, mismatch of declaration and definition), then the program may either crash or show abnormal behavior. Here, I examine problems with mixed code and provide a solution using external linkage.
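
The three kinds of linkage can be sketched in a single C++ source file. This is only an illustration; the names counter, step, and next_id are hypothetical and do not come from the article's listings.

```cpp
#include <cassert>

// Internal linkage: 'static' at file scope hides these names from
// every other source file in the program.
static int counter = 0;
static int step() { return 1; }

// External linkage: the default for a function declared outside any
// other function. Other source files can call next_id() by name.
int next_id() {
    // No linkage: 'increment' is local and exists only inside next_id().
    int increment = step();
    assert(increment > 0);  // ids are expected to keep growing
    counter += increment;
    return counter;
}
```

Another source file could declare int next_id(); and call it, but any attempt to refer to counter or step from that file would not resolve to the definitions above.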

The Problem

In the real world, it is common for code written in one programming language to use the functionality of code written in another language. A trivial example is a C++ programmer relying on a Standard C library (libc) for sorting a series of integers with the quick-sort technique. It works because the C implementation takes care of the language linkage automatically, but you need to take additional care if you use your own libraries written in C from a C++ program. Otherwise, the compilation may fail with link errors caused by unresolved symbols.

Consider this example: Assume you are writing C++ code and wish to call a C function from it. Listing 1(a) is the code for the callee, a C routine. Now try to call the C function greet() from the C++ program in Listing 1(b). The extern keyword declares a variable or function and gives it external linkage; that is, its name is visible from files other than the one in which it's defined (see Figure 1). Though the C++ code is linked with libgreet.so, the dynamic library that holds the implementation of greet(), the link fails with an undefined-symbol error. What went wrong?

What Went Wrong

The link error results because the C++ compiler mangles (encodes) function names to support function overloading. So, the symbol greet is changed to something else, depending on the name-mangling algorithm implemented in the compiler. Hence, the object file mixedcode.o does not have the plain symbol greet anywhere in its symbol table. You can confirm this with the elfdump command [2], which displays the symbol tables of both libgreet.so and mixedcode.o; see Figure 2. (The dem command [3], included with Sun Studio software, "demangles" a name and prints the name as originally declared.) For instance, char *greet() has been mangled to __1cFgreet6F_pc_ by the Sun Studio C++ compiler. This is why the static linker (ld) couldn't match the reference in the object file with the symbol greet exported by the library. So how do you solve this problem?
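
To see why the compiler needs mangling at all, consider a minimal sketch with two overloads that share one source-level name. The overloads below are hypothetical and are not the article's listings; the exact symbol names a compiler generates for them depend on its mangling scheme.

```cpp
#include <cassert>
#include <string>

// Two functions named 'greet'. The linker works with flat symbol
// names, so the compiler must emit a distinct symbol for each one,
// typically by encoding the parameter types into the name.
std::string greet() {
    return "Hello, world";
}

std::string greet(const std::string &name) {
    assert(!name.empty());  // an empty name would give an odd greeting
    return "Hello, " + name;
}
```

Neither overload ends up in the object file under the plain symbol greet, which is exactly what a C library exports.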

Solution

The C++ Standard provides a mechanism called "linkage specification" to enable smooth compilation of mixed code. "Language linkage" creates a linkage between C++ and non-C++ code fragments. All function types, function names, and variable names have C++ language linkage by default. A different language linkage can be requested with a linkage specification, which takes one of these two forms:

extern string-literal { 
    function-declaration 
    function-declaration 
} 
extern string-literal function-declaration; 

The string-literal specifies the language linkage associated with the declared functions; for example, "C" or "C++". Every C++ implementation provides linkage to functions written in the C language ("C") and linkage to C++ ("C++").
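
Both forms of the linkage specification can be sketched as follows. The function names are hypothetical, and the definitions at the bottom merely stand in for a separately compiled library so that the fragment is complete in one file.

```cpp
#include <cassert>

// Block form: every declaration inside the braces gets C linkage.
extern "C" {
    int add_c(int a, int b);
    int div_c(int a, int b);
}

// Single-declaration form, equivalent for one function.
extern "C" int mul_c(int a, int b);

// "C++" linkage is the default; spelling it out changes nothing.
extern "C++" int add_cpp(int a, int b);

// Stand-in definitions. A definition that follows an extern "C"
// declaration of the same function keeps the C linkage.
int add_c(int a, int b) { return a + b; }
int div_c(int a, int b) {
    assert(b != 0);  // integer division by zero is undefined
    return a / b;
}
int mul_c(int a, int b) { return a * b; }
int add_cpp(int a, int b) { return a + b; }
```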

The solution then is to disable name mangling of the called external functions so you can use the functionality of external C functions from C++ code without any issues. You can do this using the linkage to C. The forward declaration of greet() in mixedcode.cpp should resolve the problem:

extern "C" char *greet(); 

Because you are calling C code from a C++ program, C linkage is specified for the greet() routine. The linkage directive extern "C" tells the compiler to inhibit the default encoding (name mangling) of that function's name and to use C calling conventions when it emits external information for the linker. In other words, the C linkage specification forces the C++ compiler to adopt C conventions, which are not the same as C++ conventions. Listing 2 shows the modified source of mixedcode.cpp; after recompiling, the program works. Figure 3 shows the symbol table of mixedcode.o. As expected, the function name greet() was not mangled by the C++ compiler. Consequently, the linker could find the symbol in the library and was able to build the executable.
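
A rough single-file sketch of the fix follows. It assumes the signature char *greet() quoted above; the message text and the helper greeting_length() are hypothetical, and the definition of greet() stands in for the C routine that would really live in libgreet.so.

```cpp
#include <cassert>
#include <cstring>

// The declaration added to the C++ caller: with C linkage, the
// compiler refers to the unmangled symbol 'greet'.
extern "C" char *greet();

// Stand-in for the library routine (assumption: the real body is in
// the C source behind libgreet.so and is not reproduced here).
static char message[] = "Hello from C";
extern "C" char *greet() { return message; }

// C++ caller that uses the C function.
std::size_t greeting_length() {
    char *text = greet();
    assert(text != nullptr);
    return std::strlen(text);
}
```

In a real build, only the extern "C" declaration and the caller would appear in the C++ file; the definition would come from the C library at link time.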

Here are some generic tips on mixed-code programming along with some warnings:

  • If you are mixing C and C++ code, use compilers that are compatible. For example, they must define basic types such as int, float, or pointer in the same way. Make sure that the data types in the different languages correspond.
  • While mixing code, avoid mismatching data types for parameters and return values.
  • Know the order in which arguments are expected on the stack.
  • Don't worry about language linkage while using standard header files because most of the C/C++ compiler vendors handle the linkage specifications inside their header files that work with both C and C++. This is why most existing C libraries can be called without explicit specification of C linkage.
  • Pay attention to case-sensitivity conventions for function names in the different languages.
  • Among a set of overloaded functions with the same name, at most one can be declared extern "C".
  • extern "C" declarations can appear only at global (namespace) scope; they cannot be applied to class member functions.
  • extern "C" declarations must always be after the last include.
  • Be aware that an extern "C" declaration only selects the linkage; it does not spell out everything that must be done to mix C and C++ code.
  • It is possible to use linkage directives with all the functions in a file. This is useful if you wish to use C library functions in a C++ program. For example:

        extern "C" {
            #include "mylibrary.h"
        }

  • When programming header files to be used for both C and C++ programs, use the convention with predefined macros in Listing 3. The system header files under the /usr/include directory also provide an example of correct usage of the predefined macros.
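
The header convention mentioned in the last tip usually looks something like the sketch below. This is not the article's Listing 3; the header name greet.h, the include guard GREET_H, and the helper first_letter() are hypothetical, and the header and its stand-in implementation are shown in one file only to keep the fragment complete.

```cpp
#include <cassert>

/* ---- contents of a header such as greet.h (hypothetical) ---- */
#ifndef GREET_H
#define GREET_H

/* A C++ compiler predefines __cplusplus, so only C++ sees the
   extern "C" wrapper; a C compiler sees plain declarations. */
#ifdef __cplusplus
extern "C" {
#endif

char *greet(void);

#ifdef __cplusplus
}
#endif

#endif /* GREET_H */
/* ---- end of header ---- */

/* Stand-in for the C implementation that would normally be compiled
   separately by a C compiler. */
static char message[] = "Hello";
char *greet(void) { return message; }

/* C++ caller that includes the header and uses the C function. */
char first_letter() {
    char *text = greet();
    assert(text != nullptr);
    return text[0];
}
```

With this arrangement the same header can be included unchanged from both C and C++ source files, and callers do not need to write extern "C" themselves.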

This list is not complete. Name mangling is only part of the problem to be solved. Other issues arise when mixing fragments of code written in different programming languages, and additional steps are needed to resolve them. For example, differences in argument passing between C++ and C functions may cause problems if an argument wider than the expected one is passed. Unfortunately, discussion of those issues is beyond the scope of this brief article. I strongly encourage the reader to consult the C++ Standard [5] for more examples and for the complete linkage specification.

References

  1. A source file contains one or more variables, function declarations, or similar items logically grouped together. From the source file, the compiler generates the object module, which is the machine code of the target system. Object modules will be linked with other modules to create the load module, a program in machine-language form, ready to run on the system.
  2. The elfdump utility can be used to dump selected parts of an object file, such as the symbol table, ELF header, and global offset table (http://docs.sun.com/app/docs/doc/816-5165/6mbb0m9fc?q=elfdump&a=view).
  3. The dem utility prints a demangled C++ name that closely resembles the name that was originally declared. Sun distributes this utility as part of the Sun Studio compiler collection suite (http://developers.sun.com/tools/cc/documentation/ss10_docs/mr/man1/dem.1.html).
  4. __cplusplus is a macro predefined by the C++ compiler. To see all the predefined macros of C/C++ compilers in Sun Studio 8 or later, compile a simple program with the -xdumpmacros flag. To learn more about -xdumpmacros, see the man page of CC (http://developers.sun.com/tools/cc/documentation/ss10_docs/mr/man1/CC.1.html).
  5. Programming Languages—C++, ISO/IEC 14882 International Standard.

CUJ

