C++ in .NET

By Rex Jaeschke, September 01, 2002

Need a jumpstart on Managed C++? Here it is in the proverbial nutshell.

The .NET Framework includes a specification for a virtual machine instruction set, metadata, and an extensive class library. Compilers targeting this framework process source code that uses this class library. The output from each compiler is in the form of metadata, which contains the MSIL (Microsoft Intermediate Language) instructions needed to perform the task defined by the code and descriptive information about those instructions. Such output is known as “managed code” and is generated when the /clr compiler option is used. (Unmanaged code, which is compiled to the machine language of a physical processor, occurs when the /clr compiler option is omitted.) The metadata can be used by a variety of tools, such as browsers, profilers, and debuggers, as well as other compilers. Finally, the generated code is executed by the CLR (Common Language Runtime). In the simplest case, the intermediate code can be interpreted. However, a more efficient approach is to compile this code to native machine code, either when the application is installed or each time it is run.

The compilation model for Managed C++ code is different from that used by earlier Visual C++ versions. Historically, each source file was compiled to an .obj file, with all such files being combined into an .exe file by the linker. Now, there is only one step, “building,” rather than separate compilation and link steps.

The output from a single build is called an “assembly,” which has the form of an .exe or a .dll file. Typically, an executable application program is made up of one .exe assembly, zero or more application .dll assemblies, and the core library assembly mscorlib.dll.

Most existing C++ source can be compiled to managed code without any change in meaning simply by using the /clr compiler option. However, certain constructs still generate unmanaged code, resulting in an application that is part managed and part unmanaged. Simply compiling in managed mode does not magically cause the code to immediately take full advantage of the .NET Framework. It does, however, make available the facilities to do so if the programmer wishes to make the necessary changes or additions.

When any Managed C++ project template is used, the “Use Managed Extensions” property in the project’s “General” properties page is activated; the compiler option /clr is selected, so managed code is generated as much as possible. A consequence of selecting /clr is that it also selects multithreaded mode. A managed application runs on the main thread while the garbage collector runs on a lower-priority background thread; all managed applications are multithreaded.

A Simple Example

Here’s a simple application that consists of two separate source files, main.cpp and f.cpp, each of which is built into a separate assembly. The main function identifies itself and then calls function f, which also identifies itself.

// main.cpp

#using <mscorlib.dll>
#using <f.dll>
using namespace System;

int main()
{
  Console::Write(S"Hello from main\n");
  Lib::f();
}

Every source file for which managed code should be generated must contain the #using <mscorlib.dll> directive shown above. Directives such as this allow a program to directly import metadata from another assembly. Here, you import the metadata from the core class library mscorlib and the assembly f (which will be created shortly). This mechanism removes the need for headers.

.NET provides a complete I/O library. For example, Console::Write writes to the standard output device. Write (and its sibling WriteLine) support format specifiers that provide behavior equivalent to Standard C++’s manipulators. They also support currency formatting and, of course, internationalization.

Managed C++ introduces a new form of string literal, which has the prefix S and whose type is a pointer to the managed class String. Note carefully that String objects are immutable; their contents cannot be changed. Identically spelled S-type literals share the same memory.

// f.cpp

#using <mscorlib.dll>
using namespace System;

public __gc class Lib
{
public:
  static void f()
  {
    Console::WriteLine(S"Hello from f");
  }
};

Types are exported from an assembly by declaring them public. By defining f inside a public type such as Lib, that function can be called from another assembly. Since functions cannot be exported directly, any functions you wish to write in C++ and make publicly available to other .NET-compliant languages will have to be encapsulated inside a public class or struct. The __gc modifier declares Lib to be a managed type. (Unless this modifier is present, you cannot make the class public.)

For main.cpp to compile, the assembly f.dll must exist and be accessible to the compiler. To achieve this, you tell the compiler where to look for assemblies that are imported, via the project properties page for “C/C++” and the property “Resolve #using References.”

The Type System

.NET supports two type categories: value types and reference types. (Note carefully that the term “reference” as used here is unrelated to C++’s notion of a reference variable.)

The value types include the built-in types, such as int and double. The set of reference types includes all user-defined class types having the __gc modifier. A variable of a reference type can only be created on the managed heap. (The managed and unmanaged heaps are separate. The former is subject to automatic garbage collection while the latter is not. However, both are accessed via the new operator.) Such variables cannot have automatic or static storage duration.

Ordinarily, the value of an uninitialized automatic pointer is undefined. However, when that pointer is to a managed type, the compiler and runtime combine to make sure it is initialized to zero, as an arbitrary pointer value could play havoc in a garbage-collected heap environment.

Note that managed pointers cannot be dereferenced. As a result, managed objects cannot be passed or returned by value, they cannot be copied using assignment (instead a Clone function must be called), and you cannot take their size using sizeof. As such, a managed class has no need for a copy constructor or an assignment operator.

In the .NET Framework, every managed class type is (ultimately) derived from class Object.

Occasionally, it is useful to be able to deal with value-type values in a reference-type context. To achieve this, the value of any value-type expression can be converted implicitly to type Object. This process is called “boxing.” The resulting Object expression can also be converted back again explicitly in a process known as “unboxing.”
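
As a rough analogy in portable Standard C++17 (not Managed C++), the boxing round trip can be sketched with std::any standing in for Object. The helper names box and unbox_int are hypothetical, and this sketch says nothing about how the CLR actually implements boxing.

```cpp
#include <any>
#include <cassert>

// Hypothetical helpers: std::any plays the role of Object here.
// "Boxing": wrap a copy of a value-type value in a type-erased container.
inline std::any box(int value) { return std::any(value); }

// "Unboxing": an explicit conversion back to the value type.
// Throws std::bad_any_cast if the box does not hold an int.
inline int unbox_int(const std::any& boxed) { return std::any_cast<int>(boxed); }

inline void boxing_demo()
{
    int n = 42;
    std::any o = box(n);   // the box holds its own copy of the value
    n = 7;                 // changing n does not affect the boxed copy
    assert(unbox_int(o) == 42);
}
```

The point of the sketch is that the boxed copy is independent of the original variable, and that getting the value back requires naming the type explicitly.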

The Standard C++ built-in types are mapped directly onto .NET Framework types that are implemented as lightweight classes (which are allocated on the stack rather than the managed heap), so they are more efficient and are suitable for use as value types. For example, int is a synonym for System::Int32, so the members of that class are available to all int variables.

Garbage Collection

You can allocate any number of objects on the managed heap using new, yet you don’t have to worry about explicitly freeing them when you are done with them. At run time, the execution environment runs a garbage collector to gather up discarded memory and make it part of the managed heap again.

Conceptually, it is useful to think that each object allocated on the managed heap is associated with a reference count, which simply keeps track of the number of pointer or reference variables that currently refer to that object. When an object’s reference count is decremented to zero, that object becomes a candidate for garbage collection.
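
The counting model can be made concrete with std::shared_ptr in Standard C++17. This is only an illustration of the conceptual model described above, not of the garbage collector itself; in particular, shared_ptr destroys the object as soon as the count reaches zero, whereas a managed object merely becomes a candidate for collection. Widget is a hypothetical type.

```cpp
#include <cassert>
#include <memory>

struct Widget { int id = 0; };  // hypothetical payload type

inline void refcount_demo()
{
    std::shared_ptr<Widget> a = std::make_shared<Widget>();
    assert(a.use_count() == 1);          // one variable refers to the object

    std::shared_ptr<Widget> b = a;       // a second variable, same object
    assert(a.use_count() == 2);

    std::weak_ptr<Widget> watcher = a;   // observes without being counted
    b.reset();                           // count drops to 1
    a.reset();                           // count drops to 0
    assert(watcher.expired());           // no variable refers to it any more
}
```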

Managed Arrays

Arrays in Standard C++ are fixed size; their size is set at compile time. However, the .NET platform (and hence Managed C++) supports managed arrays, whose size can seemingly vary at run time. For example:

int totals __gc[];

I have declared a reference to a one-dimensional managed array of arbitrary size. And even though totals is an automatic variable and has no explicit initializer, its initial value is set to null. As such, it does not currently refer to an array. Based on its type, this reference can only be made to point to a one-dimensional managed array of int or have the value null. No dimension size is specified, nor is one permitted.

Let’s initialize totals to point to a managed array of 4 int, which, like all managed objects, is allocated on the managed heap:

totals = new int __gc[4];
totals[0] = 5;
totals[2] = 3;

You access array elements in the usual way. By default, elements take on a value of zero, false, null, etc., as appropriate for their type. Array bounds checking is done at run time.

All managed arrays are derived implicitly from class Array, and all managed arrays have a read-only property called Length, which indicates their size. (A property is a class member that is used like a field, yet it is implemented as a function. It is used to get and/or set the value of some logical property for an object.)

Note carefully that a managed array subscript expression cannot be rewritten as a pointer arithmetic expression.

By making totals refer to a different managed array of int whose size is smaller or larger, you can give the illusion that the size of the array totals has changed. But, of course, totals is not really an array.

totals = new int __gc[8];

Since totals was the only reference to the previously allocated managed array of 4 int, once totals has been made to refer to some other managed array, the original array can be garbage collected.
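
The behavior of totals can be imitated in Standard C++17 with a nullable shared handle to a heap-allocated vector. This is a sketch by analogy only: IntArrayRef and new_int_array are hypothetical names, size() stands in for the Length property, and at() stands in for run-time bounds checking.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical alias: a nullable, shared handle to a heap-allocated int array.
using IntArrayRef = std::shared_ptr<std::vector<int>>;

// Allocate an array of `length` elements, each zero by default.
inline IntArrayRef new_int_array(std::size_t length)
{
    return std::make_shared<std::vector<int>>(length);
}

inline void array_demo()
{
    IntArrayRef totals;                 // refers to no array yet (null)
    assert(totals == nullptr);

    totals = new_int_array(4);
    (*totals)[0] = 5;
    (*totals)[2] = 3;
    assert(totals->size() == 4);        // plays the role of Length
    assert((*totals)[1] == 0);          // default element value

    bool caught = false;
    try { totals->at(4) = 1; }          // at() checks the index at run time
    catch (const std::out_of_range&) { caught = true; }
    assert(caught);

    totals = new_int_array(8);          // rebind; the 4-element array is released
    assert(totals->size() == 8);
}
```

As in the managed case, the handle itself never changes size; it is simply made to refer to a different array.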

Managed C++ supports true multidimensional arrays, unlike the arrays of arrays used by unmanaged C++. (There is no such thing as a managed array of array.) For example:

double values __gc[,];
values = new double __gc[5,7];
values = new double __gc[20,17];
values = 0;

Unfortunately, Managed C++ does not permit multidimensional arrays to have initializer lists, so you must initialize each element using assignment.
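
To show what a rectangular (rather than array-of-arrays) layout and element-by-element initialization look like, here is a minimal Standard C++17 sketch. Grid and fill_with_index_sum are hypothetical names, and the row-major storage is an assumption of this sketch, not a statement about how the CLR lays out managed arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical rectangular array: one contiguous block indexed by (row, col).
class Grid
{
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0.0) {}

    // Checks both indices at run time before touching the element.
    double& at(std::size_t r, std::size_t c)
    {
        if (r >= rows_ || c >= cols_) throw std::out_of_range("Grid index");
        return cells_[r * cols_ + c];
    }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<double> cells_;
};

// No initializer list is available, so every element is set by assignment.
inline void fill_with_index_sum(Grid& g)
{
    for (std::size_t r = 0; r < g.rows(); ++r)
        for (std::size_t c = 0; c < g.cols(); ++c)
            g.at(r, c) = static_cast<double>(r + c);
}

inline void grid_demo()
{
    Grid values(5, 7);
    fill_with_index_sum(values);
    assert(values.at(2, 3) == 5.0);
}
```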

Since copying reference variables only does a shallow copy (both variables then refer to the same array), to make a deep copy of an array, you must use the function Array::Copy.
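
The difference between copying a reference and copying the elements can be sketched in Standard C++17 as follows. IntArrayRef and copy_elements are hypothetical names; copy_elements only plays the role that Array::Copy plays for managed arrays and does not reproduce its signature.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical alias: a shared handle to a heap-allocated int array.
using IntArrayRef = std::shared_ptr<std::vector<int>>;

// Copy the elements into a freshly allocated, separate array.
inline IntArrayRef copy_elements(const IntArrayRef& source)
{
    return std::make_shared<std::vector<int>>(*source);
}

inline void copy_demo()
{
    IntArrayRef a = std::make_shared<std::vector<int>>(3);  // three zeros
    IntArrayRef alias = a;                    // copies the reference only
    IntArrayRef separate = copy_elements(a);  // copies the elements

    (*a)[0] = 99;
    assert((*alias)[0] == 99);     // same array, so the change is visible
    assert((*separate)[0] == 0);   // independent elements, unchanged
}
```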

Since a function can return a single value, and an array reference is such a value, a function can return a reference to a managed array. For example, the following function returns a reference to a managed array of int:

int GetNewArray(int count, int initValue) __gc[];

Classes

Properties

As I mentioned earlier, a managed class can have read and/or write properties. For example:

public __gc class Point
{
  int xor;
  int yor;
public:
  __property int get_X() { return xor; }
  __property void set_X(int x) { xor = x; }
  __property int get_Y() { return yor; }
  __property void set_Y(int y) { yor = y; }

  Point()
  {
    X = 0; // use property [set_]X
    Y = 0; // use property [set_]Y
  }
};

Properties can be accessed like fields from languages that support them or by using functional notation from languages that don’t.
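
Portable Standard C++ is one of the languages without property syntax, so it makes a convenient illustration of the functional notation. The sketch below is Standard C++17, not Managed C++; the field names x_ and y_ are hypothetical, and the comments show the field-like spelling that a property-aware language would accept.

```cpp
#include <cassert>

// No property syntax here, so the accessor functions are called directly.
class Point
{
    int x_ = 0;
    int y_ = 0;
public:
    int  get_X() const { return x_; }
    void set_X(int x)  { x_ = x; }
    int  get_Y() const { return y_; }
    void set_Y(int y)  { y_ = y; }
};

inline void property_demo()
{
    Point p;
    p.set_X(3);                 // field-like form would be: p.X = 3
    p.set_Y(p.get_X() + 1);     // field-like form would be: p.Y = p.X + 1
    assert(p.get_X() == 3 && p.get_Y() == 4);
}
```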

Static Members

Static members in managed classes need not be defined (and initialized) separately from their class.

A managed class can have static properties. It can also have a static constructor. If present, this constructor is executed once, before any instances of this type are created, and before any static members of this type are accessed.
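
Standard C++ has no static constructor, but the run-once, run-first guarantee can be approximated. The following C++17 sketch uses a function-local static to run hypothetical initialization code exactly once, before the first instance is constructed or the static value is read. Counter, static_init, and ensure_initialized are illustrative names, and the mechanism is an analogy rather than what the CLR does.

```cpp
#include <cassert>

class Counter
{
public:
    Counter() { ensure_initialized(); }   // before any instance exists
    static int start_value()              // before any static member is read
    {
        ensure_initialized();
        return start_;
    }
    static int init_runs() { return init_runs_; }  // for observation only

private:
    // Stands in for the body of a static constructor.
    static void static_init() { start_ = 100; ++init_runs_; }

    // A function-local static is initialized exactly once.
    static void ensure_initialized()
    {
        static const bool done = (static_init(), true);
        (void)done;
    }

    // C++17 inline statics are defined inside the class itself.
    inline static int start_ = 0;
    inline static int init_runs_ = 0;
};

inline void static_ctor_demo()
{
    Counter a;
    Counter b;
    assert(Counter::init_runs() == 1);     // ran once despite two instances
    assert(Counter::start_value() == 100);
}
```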

Destruction

Unlike the destructor for an unmanaged class type, the destructor for a managed class type is automatically called by the system during garbage collection, which happens asynchronously. As such, a different approach to programming is required if object cleanup needs to occur at well-defined times.

Abstract Classes

To inhibit explicit instantiation of an unmanaged class type, at least one of its member functions must be declared as pure virtual. You achieve the same effect with a managed class type by declaring it abstract, as follows:

public __gc __abstract class Vehicle
{
  // ...
};

An abstract class need not have any pure virtual functions.

Sealed Classes

A class that is complete, either because it contains everything it needs or it inherits what it needs, can be defined such that it cannot be used as a base class. This is achieved by making that class sealed, as follows:

public __gc __sealed class String
{
  // ...
};

Inheritance

As stated earlier, all arrays are derived from System::Array. Likewise, all managed enumerated types are derived from System::Enum.

.NET supports single inheritance only.

Interfaces

.NET provides a way for multiple classes to implement a common set of capabilities through an “interface.” An interface is a set of function declarations. Note that the functions are only declared, not defined; an interface defines a type consisting of abstract functions, where those functions are implemented by client classes as they see fit. An interface allows unrelated classes to implement the same facilities with the same names and types without requiring those classes to share a common base class. For example:

// implicitly __abstract
// implicitly derived from System::Object
public __gc __interface ICollection
{
  // implicitly public, pure virtual
  void Put(Object *o); 
  // implicitly public, pure virtual
  Object *Get(); 
};

A class implements an interface using the same notation as it does for deriving from a base class; for example:

public __gc class List : public ICollection
{
public:
  void Put(Object *o)
  {
    // ...
  }
  // ...
};

A class can implement more than one interface, in which case, there is a comma-separated list of interfaces, whose order is arbitrary.
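
For comparison, the same shape can be written in Standard C++17 with abstract base classes, which is roughly what an interface corresponds to in unmanaged code. In this sketch Object is a hypothetical stand-in for the root type, ICountable is a hypothetical second interface, and the behavior of Get (return the most recently added item) is an assumption made only so the example can run.

```cpp
#include <cassert>
#include <vector>

struct Object { virtual ~Object() = default; };  // hypothetical root type

// Only declarations: every member is public and pure virtual.
struct ICollection
{
    virtual void Put(Object* o) = 0;
    virtual Object* Get() = 0;
    virtual ~ICollection() = default;
};

struct ICountable  // hypothetical second interface
{
    virtual int Count() const = 0;
    virtual ~ICountable() = default;
};

// One class implementing two interfaces, listed in arbitrary order.
class List : public ICollection, public ICountable
{
    std::vector<Object*> items_;  // non-owning pointers
public:
    void Put(Object* o) override { items_.push_back(o); }

    // Assumed behavior: remove and return the most recently Put item,
    // or a null pointer when the list is empty.
    Object* Get() override
    {
        if (items_.empty()) return nullptr;
        Object* last = items_.back();
        items_.pop_back();
        return last;
    }
    int Count() const override { return static_cast<int>(items_.size()); }
};

inline void interface_demo()
{
    List list;
    Object a, b;
    ICollection& c = list;   // use the object through the interface only
    c.Put(&a);
    c.Put(&b);
    assert(list.Count() == 2);
    Object* first = c.Get();
    assert(first == &b);
}
```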

Operator Overloading

While operators can be overloaded for a managed class, there are some differences from Standard C++:

public __gc class Point
{
public:
  static bool op_Equality(Point& p1, Point& p2)
  {
    // ...
  }
};

Point& pa = *new Point(1, 2);
Point& pb = *new Point(1, 2);

if (pa == pb) ...

An operator overload function must be static, so all arguments must be declared explicitly.

Because managed objects cannot be passed by value, all arguments of a managed class type must be passed by address. Unfortunately, as a result, when the operands are pointers to objects of a reference type, the operator can only be called using function notation. For example:

bool result = Point::op_Equality(p1, p2);
// can't use p1 == p2
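
The distinction between infix and function notation can be illustrated in Standard C++17. The sketch below is not Managed C++: it pairs a static op_Equality with an ordinary operator== that forwards to it, and shows that applying == to two pointers compares addresses rather than values. The Point members x_ and y_ are hypothetical.

```cpp
#include <cassert>

class Point
{
    int x_, y_;
public:
    Point(int x, int y) : x_(x), y_(y) {}

    // Static, so both operands appear as explicit parameters.
    static bool op_Equality(const Point& p1, const Point& p2)
    {
        return p1.x_ == p2.x_ && p1.y_ == p2.y_;
    }
};

// The infix form simply forwards to the named function.
inline bool operator==(const Point& p1, const Point& p2)
{
    return Point::op_Equality(p1, p2);
}

inline void equality_demo()
{
    Point a(1, 2), b(1, 2), c(3, 4);
    assert(a == b);                         // infix notation on objects
    assert(!Point::op_Equality(a, c));      // function notation

    const Point* pa = &a;
    const Point* pb = &b;
    assert(pa != pb);                       // compares addresses, not values
    assert(Point::op_Equality(*pa, *pb));   // value comparison via the function
}
```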

Limitations

Managed class types have the following limitations:

They cannot contain members of managed pointer type.
Neither the sizeof operator nor the offsetof macro can be applied to a managed type or instance thereof.
Member functions cannot have default arguments.
Member functions cannot have a const or volatile qualifier.
The destructor is implicitly virtual.
They cannot have friend classes, functions, or interfaces.
They cannot be used as the base class for an unmanaged class.
They cannot be derived from an unmanaged base class.
They cannot inherit privately from a class.
They cannot inherit from more than one managed class.
They cannot contain a using declaration.

Exception Handling

.NET defines its own set of exception classes. However, a try block can have catch clauses for both managed and unmanaged exceptions.

The __finally keyword provided as part of structured exception handling some years ago is very useful for handling managed exceptions. Consider the case in which a try block has several catch blocks followed by a __finally block. When the try block completes normally, or any of its catch blocks completes, the associated __finally block is executed. This is especially useful since the timing of the execution of destructors of managed types is unspecified. A __finally block allows cleanup to be synchronized.
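
Standard C++ has no __finally, so the ordering described above can only be imitated. The following C++17 sketch swaps in a different technique, a scope guard whose destructor runs after the try and catch blocks, to make the sequence of events visible. FinallyGuard and run_with_cleanup are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: runs its callback when the enclosing scope is left.
class FinallyGuard
{
    std::function<void()> action_;
public:
    explicit FinallyGuard(std::function<void()> action)
        : action_(std::move(action)) {}
    ~FinallyGuard() { action_(); }
    FinallyGuard(const FinallyGuard&) = delete;
    FinallyGuard& operator=(const FinallyGuard&) = delete;
};

// Returns the order of events so the timing of the cleanup can be inspected.
inline std::vector<std::string> run_with_cleanup(bool fail)
{
    std::vector<std::string> log;
    {
        FinallyGuard cleanup([&log] { log.push_back("cleanup"); });
        try
        {
            log.push_back("work");
            if (fail) throw std::runtime_error("boom");
            log.push_back("done");
        }
        catch (const std::runtime_error&)
        {
            log.push_back("caught");
        }
    }  // the guard's destructor runs here, after the try and catch blocks
    return log;
}

inline void finally_demo()
{
    assert(run_with_cleanup(false).back() == "cleanup");
    assert(run_with_cleanup(true).back() == "cleanup");
}
```

Either way the block ends, the cleanup step is the last entry in the log, which is the synchronization that a __finally block provides.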

Delegates

A delegate is an object that encapsulates one or more functions, and for instance functions, it also associates each one with a particular instance. Once a delegate has been made to encapsulate one or more functions, you can invoke those functions via that delegate without knowing which functions have been encapsulated. Although C++ supports a delegate-like facility via function pointers, delegates provide more capability, and they are supported across .NET languages.

A common use of delegates is in the creation of event-handler lists. Each function that wants to be called when some event occurs adds itself to a delegate list that is used when that event occurs. Functions can also be removed from delegates.

#using <mscorlib.dll>
using namespace System;

__delegate void Del(int value);

public __gc class A
{
public:
  static void f1(int i) { /* ... */ }
};

public __gc class B
{
public:
  void f2(int i) { /* ... */ }
};

int main()
{
/*1*/ Del *d = new Del(0, &A::f1);
      d(3);

/*2*/ B *b = new B;
      d += new Del(b, &B::f2);
      d(5);

/*3*/ d -= new Del(0, &A::f1);
      d(6);
}

In case 1, delegate d is made to point to the static function A::f1, which is then called indirectly with an argument of 3. In case 2, the instance function B::f2 is added to d’s delegate list, and then both functions are called indirectly with an argument of 5, in the order in which they were added to the list. Finally, in case 3, the static function A::f1 is removed from the delegate list leaving B::f2 only, which is then called indirectly with an argument of 6.
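
The three cases can be traced with a small multicast list written in Standard C++17. This is a sketch of the idea only: the class Del below is a hypothetical stand-in built on std::function, entries are removed by a caller-chosen key (an assumption, since std::function objects cannot be compared), and the lambdas stand in for A::f1 and B::f2.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical multicast delegate for functions of the form void(int).
class Del
{
    struct Entry { std::string key; std::function<void(int)> target; };
    std::vector<Entry> entries_;
public:
    void add(std::string key, std::function<void(int)> target)
    {
        entries_.push_back({std::move(key), std::move(target)});
    }
    // Remove the first entry with this key, if any (an assumed policy).
    void remove(const std::string& key)
    {
        auto it = std::find_if(entries_.begin(), entries_.end(),
                               [&key](const Entry& e) { return e.key == key; });
        if (it != entries_.end()) entries_.erase(it);
    }
    // Invoke every target in the order in which it was added.
    void operator()(int value) const
    {
        for (const Entry& e : entries_) e.target(value);
    }
};

// Replays cases 1 to 3 and returns the calls that were made, in order.
inline std::vector<std::string> delegate_demo()
{
    std::vector<std::string> log;
    struct B
    {
        std::vector<std::string>* out;
        void f2(int i) { out->push_back("f2:" + std::to_string(i)); }
    };
    auto f1 = [&log](int i) { log.push_back("f1:" + std::to_string(i)); };
    B b{&log};

    Del d;
    d.add("A::f1", f1);
    d(3);                                       // case 1: f1 only
    d.add("B::f2", [&b](int i) { b.f2(i); });   // bound to the instance b
    d(5);                                       // case 2: f1, then f2
    d.remove("A::f1");
    d(6);                                       // case 3: f2 only
    assert(log.size() == 4);
    return log;
}
```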

Lightweight Classes

A value class type is a value type; when you declare a variable of some value class type, memory is allocated for all the fields of that type at that time, and no pointer is involved. A value class can contain instance constructors, constants, fields, functions, properties, operators, and nested types. However, it cannot contain events or static constructors. Value classes are particularly useful for reasonably small data structures that have value semantics. Examples include points in a coordinate system and complex numbers.

public __value class Complex
{
  // ...
};
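
Value semantics of this kind are what an ordinary struct already has in Standard C++. The C++17 sketch below, which is an analogy rather than Managed C++, shows that copying a Complex copies its fields and that such values can be passed and returned by value. The field names re and im and the function add are hypothetical.

```cpp
#include <cassert>

// Plain value type: the fields live directly in the variable.
struct Complex
{
    double re = 0.0;
    double im = 0.0;
};

// Passed and returned by value; no pointer is involved.
inline Complex add(Complex a, Complex b)
{
    return Complex{a.re + b.re, a.im + b.im};
}

inline void value_demo()
{
    Complex a{1.0, 2.0};
    Complex b = a;          // copies the fields, not a pointer
    b.re = 10.0;
    assert(a.re == 1.0);    // the original is unaffected
    Complex sum = add(a, b);
    assert(sum.re == 11.0 && sum.im == 4.0);
}
```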

The Class Library

The .NET library is extensive. There are classes for threading, serialization, sockets, numerous collection types, database access, SQL, graphics, GUI, Web, and XML parsing, to name a few. And, of course, there is full support for internationalization.

Conclusion

.NET is very much a cornerstone of Microsoft’s strategy, so you can reasonably assume it’s here to stay. It’s certainly permeating their product line, and it’s no longer a proprietary platform. In December 2001, ECMA adopted an international standard on a subset of .NET, called CLI (Common Language Infrastructure). (It also adopted a standard for C#.) Co-sponsored by HP, Intel, and Microsoft, this standard was developed over a year, with participation from numerous other companies, including Fujitsu and IBM. That standard is now being considered for adoption as an ISO and ANSI standard.

Ximian, a software development company, has announced its intent to implement the complete .NET environment plus C# compiler for Linux. For more information, read about their “Mono” project at <www.ximian.com>.

For more information on Managed C++, see the Word file managedextensionsspec.doc in the directory Vc7 of the Visual Studio .NET installation directory. See also Essential Guide to Managed Extensions for C++ by Challa and Laksberg (Apress, 2002) and Developing Applications with Visual Studio.NET by Grimes (Addison-Wesley, 2002).

About the Author

Rex Jaeschke is an independent consultant, and developer and leader of seminars, specializing in programming languages and environments. He serves as editor of the C# Standard. Rex can be reached at [email protected].
