Dr. Dobb's is part of the Informa Tech Division of Informa PLC




Porting Your C++ Code to .NET


Visual C++.Net Expert


I am not a fan of porting code because it produces, well, ported code. When you design your code, you generally have a target system in mind, and you use this system when you test your code. Your code is designed and tested on one system, and when you port it to another system, you attempt to make the code work in a situation for which it was not designed. However, there are situations when porting is required. .NET is clearly the future for Microsoft operating systems, but few software houses are likely to want to start afresh with their code: If you have hundreds of thousands of lines of code, it makes sense to try to reuse it. This is where managed C++ excels: It is designed to allow you to compile unmanaged code to Microsoft Intermediate Language (MSIL) so that it runs under the .NET runtime. In this article, I will outline the basic issues involved in porting your native C++ to .NET.

What Does Porting in .NET Mean?

The first point to make about porting C++ code is that, in general, you are porting the code and not the data to .NET. What do I mean by this? When you compile C++ code with the /clr compiler switch, the compiler will generate MSIL for the code (there are some exceptions, but this statement is generally true). This means that all code, whether it be global functions or class methods, will run under the .NET runtime. However, if your classes are unmanaged (they are not marked with the __gc modifier), then instances will be created on the stack or in the C++ free store. Only instances of .NET types will be created on the managed heap and managed by the .NET Garbage Collector. Compiling native C++ with /clr is not a solution to fix poorly written C++ code that does not call delete on free store allocated objects!

Porting means giving .NET code access to native C++ routines. If you only have one managed C++ client to your code, then porting is unnecessary because such code should be accessed directly and the managed C++ compiler provides a technology called "It Just Works" (IJW) to facilitate this. On the other hand, if you want to use those routines in other .NET languages, or reuse them in several managed C++ projects, then the code needs to be ported to a managed library. Here, there are two choices: Do you provide a wrapper around the native code or do you port the entire library to managed code?

Porting Libraries

Typically, a C++ library is offered in one of three ways: a source library, for example, a template library; a static library; or a DLL. If you have a library provided in source form, the entire library can be compiled as MSIL; however, this will mean that only code within the current assembly will have access to the classes because to export a class out of an assembly, the class has to be a public .NET class. Furthermore, the method parameters and public data members of your class may be unmanaged types and this means that they will not be accessible by .NET code written in a language other than managed C++. To address this, you must provide a wrapper class that provides a managed interface to your class.

If your library is supplied as a static library, then you'll have to supply a wrapper class in managed C++ to enable other .NET languages to access the library. If the static library is class based, then most of the work is already done for you because you do not have to identify the classes; you merely have to provide suitable wrappers.

If the library is DLL based, then in most cases it will be a collection of exported C functions. In such cases, you will have to do some work to identify the objects involved and design wrapper classes. In some cases, there may not be any objects and you can simply expose these functions through static members of your managed class. In other cases, the C API will represent a "flattening" of an object API. For example, in last month's column, I described the GDI+ unmanaged API, which is exported from the gdiplus.dll library using C functions. These functions use opaque handles to represent the this pointer of the object. The managed classes in the System::Drawing namespace are merely wrappers around this unmanaged library.
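As a rough illustration of such a "flattened" API, the following sketch shows a C-style interface that passes an opaque handle in place of the this pointer, together with a wrapper class that restores the object model. It is written in standard C++ so that it compiles anywhere; it is not managed C++ syntax, and the names CounterCreate, CounterIncrement, CounterDestroy, and Counter are hypothetical and do not come from GDI+ or any real library.

```cpp
#include <cassert>

// Hypothetical implementation type hidden behind the C API.
struct CounterImpl {
    int value;
    explicit CounterImpl(int start) : value(start) {}
};

// Callers see only an opaque handle; internally it is the this pointer.
typedef void* CounterHandle;

// The flattened API: every "method" takes the handle as its first argument.
extern "C" CounterHandle CounterCreate(int start) {
    return new CounterImpl(start);
}
extern "C" int CounterIncrement(CounterHandle handle) {
    return ++static_cast<CounterImpl*>(handle)->value;
}
extern "C" void CounterDestroy(CounterHandle handle) {
    delete static_cast<CounterImpl*>(handle);
}

// Wrapper class: one object per handle, one method per C function.
class Counter {
public:
    explicit Counter(int start) : handle_(CounterCreate(start)) {}
    ~Counter() { CounterDestroy(handle_); }
    int Increment() { return CounterIncrement(handle_); }
private:
    Counter(const Counter&);             // not copyable: the handle has one owner
    Counter& operator=(const Counter&);
    CounterHandle handle_;
};

// Usage: the client never sees the handle.
inline void CounterDemo() {
    Counter counter(5);
    assert(counter.Increment() == 6);
    assert(counter.Increment() == 7);
}
```

A managed wrapper would follow the same shape, with the handle held as a private field of the managed class.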

The major difference between a source code library and static or DLL libraries is that most of the code in a source code library will be compiled as MSIL and executed by the .NET runtime, whereas most of the code in a static library or a DLL will be x86 code running outside of the tentacles of the .NET runtime. This means that the code in a static library or DLL will be run in the environment where it was designed to run, so you don't have the issue of ported code being squeezed into an environment where it was never intended to run. Note also that Microsoft says that the thunking calls from the managed to the unmanaged world used in IJW or Platform Invoke add between 10 and 30 extra x86 instructions to the JITed code. A source code library may make many calls to other libraries (like the C Runtime library or the Win32 library), so compiling the entire code to MSIL will add these extra machine cycles to every unmanaged library call. On the other hand, when you call a static library, you pay for those extra machine cycles only when the static library methods are called from the wrapper class, and not when the library methods make CRT or Win32 calls.

Common Language Specification

The best library is one that can be used by the largest number of users. In .NET terms, this means that a library has to be Common Language Specification (CLS) compliant. The CLS rules for types and names apply to all types that are publicly accessible, so your private types can be noncompliant. A compliant class exposes only language features that are considered the baseline for all languages. For example, since some languages cannot handle unsigned integer types, a compliant class should use only signed integers for method parameters, public fields, and properties. A CLS-compliant library should be declared as such with the assembly attribute [CLSCompliant(true)]; types and members of types that are not compliant should also be marked with this attribute, but with false as the constructor parameter. The Tools Developers Guide installed as part of the .NET SDK provides a list of the CLS rules in the first part of the Common Language Infrastructure specification, so I won't go into details about them here. Instead, I will make a few general comments.

First, different languages have different rules about naming items. The CLS says that items should have unique names except when overloading is intended, so a single name cannot be used for both a method and a field, and two names must differ by more than just their case. Second, you can write a compliant class that has members that are targeted at a specific language (for example, methods that have unsigned integer parameters targeted at C# or managed C++) as long as it provides alternative members that are compliant. Providing CLS-compliant members of your wrapper class will involve extra work, but in the long run, it will be worthwhile because it will widen the potential market for the library.
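The idea of pairing a language-specific member with a compliant alternative can be sketched as follows. This is standard C++ rather than managed C++, so the attribute appears only in comments, and the class BufferWrapper and its members are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <stdexcept>

class BufferWrapper {
public:
    BufferWrapper() : size_(0) {}

    // Language-specific members: the unsigned types are not CLS compliant.
    // In a managed class these would carry [CLSCompliant(false)].
    void SetSize(unsigned int size) { size_ = size; }
    unsigned int Size() const { return size_; }

    // Compliant alternative: a signed parameter, validated before the
    // call is forwarded to the unsigned member.
    void SetSizeSigned(int size) {
        if (size < 0)
            throw std::invalid_argument("size must be non-negative");
        SetSize(static_cast<unsigned int>(size));
    }

    // Compliant alternative: a wider signed type holds every unsigned value.
    long long SizeSigned() const { return static_cast<long long>(size_); }

private:
    unsigned int size_;
};

// Usage: both routes reach the same state.
inline void BufferWrapperDemo() {
    BufferWrapper buffer;
    buffer.SetSizeSigned(10);
    assert(buffer.Size() == 10u);
    assert(buffer.SizeSigned() == 10);
}
```

The compliant members cost a range check and a cast, which is the "extra work" in question.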

Porting C++

If you have an existing C++ class and you want to expose this as a .NET class, the temptation is simply to write a wrapper class that uses composition to hold an instance of the native class. The wrapper provides a managed implementation of each of the object's methods, and each of these does the necessary conversions between managed and unmanaged types before delegating the call to the unmanaged object. Although this approach is straightforward, there are problems with it because native C++ has some fundamental differences from managed C++.
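A minimal sketch of the composition approach is shown here in standard C++, with std::string standing in for a managed string type. NativeGreeter and GreeterWrapper are hypothetical classes; in managed C++ the wrapper would be a __gc class holding a pointer to the native object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical native class with a C-style, buffer-based interface.
class NativeGreeter {
public:
    // Writes at most size - 1 characters plus a terminator into out
    // and returns the number of characters written.
    std::size_t Greet(const char* name, char* out, std::size_t size) const {
        if (size == 0) return 0;
        const std::string text = std::string("Hello, ") + name;
        const std::size_t count = text.size() < size - 1 ? text.size() : size - 1;
        std::memcpy(out, text.data(), count);
        out[count] = '\0';
        return count;
    }
};

// Wrapper: holds the native object by composition and converts types
// on the way in and on the way out.
class GreeterWrapper {
public:
    GreeterWrapper() : native_(new NativeGreeter) {}
    ~GreeterWrapper() { delete native_; }

    std::string Greet(const std::string& name) const {
        char buffer[64];  // results longer than 63 characters are truncated
        const std::size_t count =
            native_->Greet(name.c_str(), buffer, sizeof buffer);
        return std::string(buffer, count);
    }

private:
    GreeterWrapper(const GreeterWrapper&);             // not copyable
    GreeterWrapper& operator=(const GreeterWrapper&);
    NativeGreeter* native_;
};

// Usage: the caller deals only in the wrapper's string type.
inline void GreeterDemo() {
    GreeterWrapper greeter;
    assert(greeter.Greet("World") == "Hello, World");
}
```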

Perhaps the most important difference is the lifetime issue: A native C++ object is explicitly destroyed, which results in a call to the object's destructor and frees the memory used by the object. The native C++ developer knows that, when the object is destroyed, the destructor will be called. This is not the case with .NET objects. The managed C++ compiler generates a method called Finalize, which calls the code in the C++ class's destructor. The Finalize method is called by the garbage collector at some time in the future when a garbage collection occurs. For objects that hold on to scarce resources needed by other code, Finalize is unsatisfactory because cleanup does not occur when the developer determines that it is necessary.

Finalize is protected, which means that the code that creates the instance does not have direct access to this cleanup code, so you cannot get around this problem by calling it explicitly. The C++ compiler provides a public method called __dtor that calls the Finalize method, and since this method is public, client code can call it directly, or can call the delete operator on the object reference. Note that when you call delete on a managed C++ object, the result is merely to call __dtor: It has no effect on the memory occupied by the object. VB.NET and C# do not have the delete operator, but clients written in these languages can call the __dtor method directly. However, this is not natural to those languages, and .NET has an established pattern to handle this situation.

If you have a native C++ class that implements a destructor, it means that the C++ class has resources that should be freed. In many situations, these resources should be freed as soon as possible, in which case the managed wrapper class should implement the IDisposable interface to provide a Dispose() method that will free the resource. C# developers have the using statement that identifies a block of code that determines the effective lifetime of an object and ensures that IDisposable::Dispose() is called after the block of code has completed.
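The shape of such a wrapper can be sketched in standard C++ as follows. Here the wrapper's destructor plays the part of the finalizer (a safety net) and an explicit Dispose() method releases the native object early. NativeFile and FileWrapper are hypothetical names, and the open-handle counter exists only so that the effect can be observed.

```cpp
#include <cassert>

// Hypothetical native class: its destructor releases a resource.
class NativeFile {
public:
    explicit NativeFile(int* openCount) : openCount_(openCount) { ++*openCount_; }
    ~NativeFile() { --*openCount_; }
private:
    int* openCount_;
};

class FileWrapper {
public:
    explicit FileWrapper(int* openCount)
        : native_(new NativeFile(openCount)) {}

    // Plays the role of IDisposable::Dispose(): frees the native object
    // now, and is safe to call more than once.
    void Dispose() {
        delete native_;   // deleting a null pointer is a no-op
        native_ = 0;
    }

    bool IsDisposed() const { return native_ == 0; }

    // Plays the role of the finalizer: cleans up if Dispose() was never called.
    ~FileWrapper() { Dispose(); }

private:
    FileWrapper(const FileWrapper&);             // not copyable
    FileWrapper& operator=(const FileWrapper&);
    NativeFile* native_;
};

// Usage: the resource is released at the point the caller chooses.
inline void FileWrapperDemo() {
    int open = 0;
    FileWrapper file(&open);
    assert(open == 1);
    file.Dispose();
    assert(open == 0 && file.IsDisposed());
}
```

Making Dispose() idempotent matters because both the client and the fallback cleanup path may end up calling it.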

Another issue involves the initialization of objects. In .NET, constructors are used solely for initialization when an object is created. Native C++ is a little more forgiving because it allows you to call constructors explicitly within your class. The most common use of this facility is when you have overloaded constructors that have common code, in which case one constructor can call another constructor to use its initialization routine. The managed C++ compiler does not allow you to call a constructor like this, and instead you have to generate a separate private initialization method.
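The separate initialization method might look like the following sketch, written in standard C++; the same structure applies to a managed class. The Connection class, its Init method, and the default host and port values are hypothetical.

```cpp
#include <cassert>
#include <string>

class Connection {
public:
    // Each overloaded constructor forwards to the one private routine
    // instead of calling another constructor.
    Connection() { Init("localhost", 80); }
    explicit Connection(const std::string& host) { Init(host, 80); }
    Connection(const std::string& host, int port) { Init(host, port); }

    const std::string& Host() const { return host_; }
    int Port() const { return port_; }

private:
    // Common initialization code shared by all constructors.
    void Init(const std::string& host, int port) {
        host_ = host;
        port_ = port;
    }

    std::string host_;
    int port_;
};

// Usage: every constructor leaves the object fully initialized.
inline void ConnectionDemo() {
    Connection byDefault;
    assert(byDefault.Host() == "localhost" && byDefault.Port() == 80);
    Connection custom("example.org", 8080);
    assert(custom.Host() == "example.org" && custom.Port() == 8080);
}
```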

Finally, it is worth pointing out that native C++ provides a form of method (or constructor) overloading through default parameter values: Client code can omit the parameters that have defaults, and the compiler supplies the default values. .NET does not allow default parameters, but it does allow overloading, so the solution is to provide overloads that omit those parameters, each of which calls the most generic overload and explicitly passes the default values.
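As a small sketch of this technique in standard C++, the wrapper below replaces a native function's two default parameters with three overloads. NativeScale and ScalerWrapper are hypothetical names, and the default values 2 and 0 are chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical native function that relies on default parameter values.
inline int NativeScale(int value, int factor = 2, int offset = 0) {
    return value * factor + offset;
}

// Wrapper without default parameters: the shorter overloads pass the
// defaults explicitly to the most generic overload.
class ScalerWrapper {
public:
    static int Scale(int value) { return Scale(value, 2, 0); }
    static int Scale(int value, int factor) { return Scale(value, factor, 0); }
    static int Scale(int value, int factor, int offset) {
        return NativeScale(value, factor, offset);
    }
};

// Usage: each overload matches the native call with the same arguments omitted.
inline void ScalerDemo() {
    assert(ScalerWrapper::Scale(5) == NativeScale(5));
    assert(ScalerWrapper::Scale(5, 3) == NativeScale(5, 3));
    assert(ScalerWrapper::Scale(5, 3, 1) == NativeScale(5, 3, 1));
}
```

One cost of this approach is that the default values are now duplicated in the wrapper, so they must be kept in step with the native declaration.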

Data Marshalling

The public interface to your class must have managed types and preferably should have CLS-compliant types. The methods of your native C++ code will usually have unmanaged types, so you will need to convert your managed types into unmanaged types before calling the native code and convert the results into managed types. I will leave the details of data marshalling to another article, but in general, your friend here is the Marshal class in System::Runtime::InteropServices. This class provides static methods to convert unmanaged strings to managed strings and vice versa, and to allocate and deallocate arbitrary amounts of memory.
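The general pattern (copy the string into unmanaged memory, call the native routine, convert the result back, free the memory) can be sketched in standard C++ as follows. Here std::string stands in for the managed string and malloc/free stand in for the allocation methods of the Marshal class; in managed C++ the corresponding steps would use Marshal methods such as StringToHGlobalAnsi, PtrToStringAnsi, and FreeHGlobal. NativeUpperCase and UpperCase are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <string>

// Hypothetical native routine: upper-cases a NUL-terminated buffer in place.
inline void NativeUpperCase(char* text) {
    for (; *text != '\0'; ++text)
        *text = static_cast<char>(
            std::toupper(static_cast<unsigned char>(*text)));
}

// Wrapper method: converts in, calls the native code, converts out.
inline std::string UpperCase(const std::string& input) {
    // 1. Allocate unmanaged memory and copy the string, terminator included.
    const std::size_t size = input.size() + 1;
    char* buffer = static_cast<char*>(std::malloc(size));
    if (buffer == 0) throw std::bad_alloc();
    std::memcpy(buffer, input.c_str(), size);

    // 2. Call the native routine on the unmanaged copy.
    NativeUpperCase(buffer);

    // 3. Convert the result back (stops at the first NUL character).
    //    If this copy throws, the buffer leaks; a fuller version would guard it.
    const std::string result(buffer);

    // 4. Free the unmanaged memory; nothing else will do this for you.
    std::free(buffer);
    return result;
}

// Usage: the caller never sees the intermediate buffer.
inline void UpperCaseDemo() {
    assert(UpperCase("dobbs") == "DOBBS");
}
```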

Summary

Managed C++ is the only .NET language that allows you to mix managed and unmanaged code, and this facility is provided to allow you to use existing native C++ code in .NET classes. The C++ compiler gives you the option of compiling an entire library as managed code, or of compiling the bulk of the code as native x86 so that the code runs in the environment where it was designed to run. Next month, I will explain more about the porting process.


Richard Grimes is an author and speaker on .NET. His latest book, Programming with Managed Extensions for Microsoft Visual C++ .NET, updated for Visual C++.NET 2003, is available now from Microsoft Press. He can be contacted at [email protected].

 

