Control How COM Marshals Your Data

By Fran Heeran, March 01, 1998


One of the promises of COM is that your COM object can reside in the same address space (via a DLL), in a different process, or even in a process on a remote machine, without clients of your COM object being any the wiser. Of course, when your COM object is not in the same address space as its caller, someone must handle the problem of moving the function call parameters across separate address spaces or across the network. The standard solution is to describe your COM interface with a Microsoft Interface Definition Language (MIDL) file, and let the MIDL compiler generate the code needed to pack up the function call data and send it to the separate or remote process, an operation called marshaling.

However, the MIDL compiler does not know how to marshal every conceivable data type. Most programmers designing a new COM interface would stick to data types that can be described in IDL (Interface Definition Language), but legacy code or special circumstances may sometimes dictate a data type that IDL can’t handle. Another drawback of the MIDL compiler is that it cannot generate optimal code for every situation. Fortunately, IDL also gives you the ability to control precisely how your data gets marshaled, through the wire_marshal attribute in your IDL file. Controlling how your data is marshaled lets you marshal data types that cannot be completely described in an IDL file, and also lets you implement other features such as on-the-fly compression and decompression of data.

This article demonstrates the wire_marshal attribute and some of its applications. The accompanying example uses wire_marshal to implement compression of data passed from a client to an out-of-process COM server. This is particularly advantageous when the server is on another machine connected over a potentially slow network link. One of the great advantages of using wire_marshal is that it allows you to define how your data is transferred between client and server without either of them having to get involved. This allows you to keep your application code, your interfaces, and your data structures simple and easy-to-use while maintaining total control over how the data is transferred.

Introduction to Marshaling

When a process calls methods in a COM object that resides in another process, either on the same or a different machine, the parameters passed to the method need to be transferred between the two address spaces. To transfer the parameters, they must be serialized into a single block of memory which can then be transmitted to the destination process. The transfer mechanism may be either Local Remote Procedure Call (LRPC) if the destination process is on the same machine as the source, or Remote Procedure Call (RPC) if it is on a different machine. The process of serializing and de-serializing the parameters is known as marshaling.

Consider the situation where process A calls a method in an object residing in process B, passing a string pointer as a parameter. This pointer has no meaning to process B and would trigger an access violation (or worse, point to some random data in B’s address space) if used. Instead the string must be copied into process B’s address space before it can be used. The marshaling procedure copies the string into a block of memory which the RPC subsystem transmits to the target process. The string is then copied into a block of memory in the target process’s address space and a pointer to this memory passed to the method.

The responsibility for marshaling data between processes lies with what are called proxy and stub DLLs. These are automatically loaded by COM whenever a call is made to an out-of-process object. The responsibility for developing and distributing these DLLs lies with the COM server developer — clients know nothing about them. Figure 1 shows the relationship between all the components.

The proxy and stub code usually reside in the same DLL, which is separately loaded into each process space. The task of writing this DLL is greatly simplified by the Microsoft IDL compiler (MIDL) which reads an IDL file and produces a set of .c and .h files which, when compiled, produce the proxy/stub DLL — simple as that! Here is a sample IDL file that defines a simple interface with one method that takes two parameters — a string and a structure:
typedef struct {
    int     iDay;
    int     iMonth;
    int     iYear;
} DATE;
[
    object,
    uuid(c15c5681-e9db-11cf-a683-0020af4357e3),
]
interface ISimple : IUnknown
{
    import "unknwn.idl";
    import "oaidl.idl";
    HRESULT SetUserDetails([in] LPSTR pszUserName,
        [in] DATE  *pDateOfBirth);
}
The code generated by the IDL compiler will contain two functions named ISimple_SetUserDetails_Proxy() and ISimple_SetUserDetails_Stub(). The proxy function allocates a block of memory large enough to hold all the parameters, copies the parameters into the memory block, and passes this block to the stub using RPC. The stub extracts the parameters, allocating any memory required, and passes those parameters to the real SetUserDetails() method. When the real SetUserDetails() method returns, the stub code frees up any memory allocated for the parameters.
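
To make the division of labor concrete, here is a hand-written sketch of the kind of work the generated proxy and stub do for SetUserDetails(). It is not the code MIDL emits: the buffer layout (a length, the string bytes, then the structure) and the names pack_user_details() and unpack_user_details() are my own assumptions for illustration, and the real generated code uses the NDR wire format and hands the block to RPC rather than returning it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the DATE structure from the IDL sample. */
typedef struct {
    int iDay;
    int iMonth;
    int iYear;
} DATE;

/* Proxy side: copy both parameters into one contiguous block.
   Assumed layout for this sketch (not the NDR format):
   [size_t name length incl. NUL][name bytes][DATE]. */
unsigned char *pack_user_details(const char *name, const DATE *dob,
                                 size_t *out_size)
{
    size_t name_len, total;
    unsigned char *block, *p;

    assert(name != NULL && dob != NULL && out_size != NULL);
    name_len = strlen(name) + 1;
    total = sizeof name_len + name_len + sizeof *dob;

    block = malloc(total);
    if (block == NULL)
        return NULL;

    p = block;
    memcpy(p, &name_len, sizeof name_len); p += sizeof name_len;
    memcpy(p, name, name_len);             p += name_len;
    memcpy(p, dob, sizeof *dob);

    *out_size = total;
    return block;
}

/* Stub side: rebuild the parameters in the receiver's own memory.
   The caller frees *out_name once the real method has returned.
   Returns 1 on success, 0 if the allocation failed. */
int unpack_user_details(const unsigned char *block, char **out_name,
                        DATE *out_dob)
{
    size_t name_len;
    const unsigned char *p = block;

    assert(block != NULL && out_name != NULL && out_dob != NULL);
    memcpy(&name_len, p, sizeof name_len); p += sizeof name_len;

    *out_name = malloc(name_len);
    if (*out_name == NULL)
        return 0;
    memcpy(*out_name, p, name_len);        p += name_len;
    memcpy(out_dob, p, sizeof *out_dob);
    return 1;
}
```

The point to notice is that the string pointer itself never crosses the boundary. Only the bytes it refers to are copied, and the receiving side ends up with a pointer of its own that is valid in its address space.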

The final task in writing the proxy/stub code is to register it with COM so that COM knows how to load it when needed. Each interface that has an associated proxy/stub has the “ProxyStubClsid32” key added to its definition under
HKEY_CLASSES_ROOT
  \Interface
    \[iid]
where [iid] is the interface identifier. The value of this key is the CLSID that identifies the correct proxy/stub DLL to load.

For a complete description of marshaling, IDL files, and proxy/stub registration, I recommend reading Inside OLE by Kraig Brockschmidt and Inside COM by Dale Rogerson. Both are published by Microsoft Press.

The wire_marshal Attribute

The wire_marshal attribute is an IDL keyword that tells the IDL compiler that special marshaling is required for a given data type and that you will supply extra code to help in the marshaling. It is typically used when a data type cannot be completely described in the IDL file or when the data cannot be marshaled in its native form.

A good example of wire_marshal is the case where a client passes an HBITMAP as a parameter to a server. The bitmap handle has no meaning outside the client process, and HBITMAP is not a data type that IDL knows how to marshal. wire_marshal lets you intervene in the marshaling process and supply client-side code that packs the bitmap into a buffer for transfer to the server, as well as server-side code that reconstructs the bitmap and creates a valid HBITMAP for it in the server process. This custom marshaling code is independent of both the client calling code and the COM object being called in the server. In fact, it’s difficult for the client of a COM object to even determine whether the object is remote or in the same process. Microsoft provides marshaling for HBITMAPs in its standard IDL files using wire_marshal. The exact definition for this and other standard Windows types can be found in the file wtypes.idl. (See \include\wtypes.idl for Visual C++, or \sdktools\rpc\wtypes.idl for Borland C++.)

My sample code uses the wire_marshal attribute to compress an array of BOOLs being passed from a client to a server residing in a different process/machine. The array is compressed into bitfields, allowing me to transmit eight BOOLs in a single BYTE. On the server side, I convert these bitfields back into an array of BOOLs. Neither the client nor the server is aware that this compression and decompression is taking place, allowing the application code to use a convenient data representation while ensuring maximum efficiency when transmitting the data. If you imagine that the client was a monitoring process sending state information for some piece of hardware to the server several times a second, you can begin to see the benefits of implementing the compression.
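
The compression itself is ordinary bit packing. The following is a minimal sketch of the idea, not the code from Listing 2: the function names, the least-significant-bit-first ordering, and the BOOL typedef (a stand-in so the fragment compiles without the Windows headers) are assumptions of mine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int BOOL;                          /* stand-in for the Windows BOOL */

#define ARRAY_LEN  256                     /* element count used by the sample */
#define PACKED_LEN ((ARRAY_LEN + 7) / 8)   /* 32 bytes after packing */

/* Pack eight BOOLs into each output byte, least significant bit first
   (the bit order is an assumption of this sketch). */
void pack_bools(const BOOL *src, unsigned char *dst)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    memset(dst, 0, PACKED_LEN);
    for (i = 0; i < ARRAY_LEN; ++i) {
        if (src[i])
            dst[i / 8] |= (unsigned char)(1u << (i % 8));
    }
}

/* Expand the bitfields back into one BOOL (0 or 1) per element. */
void unpack_bools(const unsigned char *src, BOOL *dst)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < ARRAY_LEN; ++i)
        dst[i] = (src[i / 8] >> (i % 8)) & 1;
}
```

With a four-byte BOOL, the 256-element array occupies 1,024 bytes in memory but only 32 bytes once packed. Note that unpacking yields strictly 0 or 1, so any other non-zero value the client used for TRUE arrives as 1.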

Using wire_marshal

The first step in implementing wire_marshal for a data type is to declare the data type with the wire_marshal attribute in your IDL file. To do this, you must first define a wire representation for your data. This is a representation of how your data will be transmitted once you have marshaled it. Although not an absolute requirement, it is strongly advised that your wire type definition be an actual representation of the structure of the data. This allows your wire type definition to be embedded in other definitions in the IDL file. Next, associate your wire type with the actual data type you want to marshal, by using the wire_marshal attribute. The following is an extract from sobj.idl (Listing 1) showing these steps:
typedef struct {
short size;
[size_is(size)] unsigned char *pData;
} wireCOMPARR;
typedef [unique] wireCOMPARR * LPCOMPARR;
typedef [wire_marshal(LPCOMPARR)] void * LPMYARRAY;
Here, I first define the wire data type (wireCOMPARR) I will use to transmit data; this is a structure consisting of a two-byte size field followed by a pointer to an array of bytes that will contain the compressed Boolean data. The last line of this IDL code extract associates the wire pointer type, LPCOMPARR, with the data type, LPMYARRAY. This tells the IDL compiler that whenever it encounters a data type of LPMYARRAY it should call out to user-supplied marshaling code which will marshal and unmarshal the data type as an LPCOMPARR type.

The remainder of sobj.idl (Listing 1) defines the ISimpleObject interface containing the single method DisplayArray(), which takes a pointer to the array of BOOLs (LPMYARRAY) as its parameter. I used the LPMYARRAY type instead of BOOL * or LPBOOL because I don’t want to control the marshaling of every BOOL pointer that might appear in the IDL file, only this particular parameter (which I happen to know is a specific size). That’s why I created a unique typedef for this function’s argument, so that I could apply the wire_marshal attribute to just this particular BOOL * parameter.

After you define your wire data type and use the wire_marshal attribute to associate it with the data type whose marshaling you want to control, you must write four functions. These functions take care of sizing, marshaling, and unmarshaling the data, and also free the memory allocated during the unmarshaling process. The functions are linked in with the rest of the MIDL-generated code to form the proxy/stub DLL. The required functions are:
<type>_UserSize()
<type>_UserMarshal()
<type>_UserUnmarshal()
<type>_UserFree()
where <type> is the data type you have declared with the wire_marshal attribute. marshal.c (Listing 2) contains the four functions I’ve written to marshal my array of BOOLs.

<type>_UserSize()

This function is called on the proxy side when the data is being prepared for marshaling, and has the following prototype:
unsigned long __RPC_USER <type>_UserSize(
      ULONG __RPC_FAR *pFlags, ULONG StartingSize,
      <type> __RPC_FAR *pUserObject);
The IDL-generated proxy code allocates a single block of memory large enough to hold all the parameters being passed. It calls this function to see how much space your data will occupy.

The first parameter contains a set of flags passed in by COM; Table 1 shows the definitions for those flags. The upper half of the flags specifies machine-specific data characteristics. The lower half tells your code the context in which the marshaling is taking place (e.g., is the server on a different machine?). This lets you alter the behavior of your marshaling code depending on context. In my example, I could have decided to only compress the BOOL array when marshaling to a different machine.

The second parameter contains the number of bytes already required by the marshaling buffer due to previous parameters. Since the return value of this function is the new number of bytes required in the marshaling buffer, you will need to add this second parameter to the size you calculate.

The third parameter is a pointer to the object you are marshaling. LPMYARRAY_UserSize() in marshal.c (Listing 2) doesn’t need this parameter, since this particular data type has a fixed size (256 BOOLs).

The return value should be the size of your data as it will be when marshaled, plus the value of the StartingSize parameter that was passed in. If you can’t determine the exact size your data will be until you actually marshal it, you can return an over-estimate of the size required. The size of the data sent is determined by the data size after marshaling and not by the buffer’s allocation size.
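
The following sketch shows the shape of such a sizing function, including the context-dependent policy mentioned above (compress only when the server is on another machine). It is not the code from Listing 2, which always compresses. The typedefs are stand-ins so the fragment compiles without the Windows headers, the function names are hypothetical, and the context constants are local copies whose values match the MSHCTX enumeration in the Windows SDK. The sketch also ignores any alignment padding that a real implementation may need to add to StartingSize.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without the Windows headers. */
typedef unsigned long ULONG;
typedef int BOOL;
typedef void *LPMYARRAY;

#define ARRAY_LEN  256
#define PACKED_LEN ((ARRAY_LEN + 7) / 8)

/* Local copies of the marshaling contexts; the values match MSHCTX. */
enum {
    CTX_LOCAL = 0,
    CTX_NOSHAREDMEM = 1,
    CTX_DIFFERENTMACHINE = 2,
    CTX_INPROC = 3
};

/* The lower half (low 16 bits) of *pFlags carries the context. */
unsigned marshal_context(const ULONG *pFlags)
{
    assert(pFlags != NULL);
    return (unsigned)(*pFlags & 0xFFFFu);
}

/* Hypothetical policy: send bitfields only when crossing machines. */
ULONG wire_bytes_for(const ULONG *pFlags)
{
    if (marshal_context(pFlags) == CTX_DIFFERENTMACHINE)
        return (ULONG)(sizeof(short) + PACKED_LEN);            /* size + bits */
    return (ULONG)(sizeof(short) + ARRAY_LEN * sizeof(BOOL));  /* size + BOOLs */
}

/* Shape of LPMYARRAY_UserSize(): bytes already required plus our own. */
ULONG my_array_user_size(ULONG *pFlags, ULONG StartingSize,
                         LPMYARRAY *pUserObject)
{
    (void)pUserObject;   /* fixed-size data, so the object is not inspected */
    return StartingSize + wire_bytes_for(pFlags);
}
```

If you adopt a policy like this, the marshaling and unmarshaling functions must make the same decision from the same flags, otherwise the two sides will disagree about what is in the buffer.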

<type>_UserMarshal()

This function is called on the proxy side to marshal your data into the buffer supplied by the calling RPC code. It has the following prototype:
unsigned char __RPC_FAR * __RPC_USER <type>_UserMarshal(
   ULONG __RPC_FAR * pFlags, unsigned char __RPC_FAR * pBuffer,
   <type> __RPC_FAR * pUserObject);
The first argument contains the same set of flags passed to <type>_UserSize(). Table 1 explains the various bits.

The second argument is a pointer to the buffer into which you must copy the data you are marshaling. You should not, of course, copy more bytes than your <type>_UserSize() function estimated you would need (though you can copy fewer bytes).

The third argument is a pointer to the data you are being asked to marshal. LPMYARRAY_UserMarshal() in marshal.c (Listing 2) packs eight BOOL values into a single byte. You should also remember that your wire type definition should match the actual layout of the marshaled data.

The return value should be the address of the first byte after your marshaled data in the buffer. Just add the number of bytes used to the buffer pointer that was passed in.
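
A simplified stand-in for such a function might look like the sketch below. It writes a two-byte size followed by the packed bytes and returns the advanced pointer. This flat layout is an assumption chosen to keep the example short; it is not necessarily the layout Listing 2 produces, and the function name and typedefs are hypothetical stand-ins so the fragment compiles without the Windows headers.

```c
#include <assert.h>
#include <string.h>

typedef unsigned long ULONG;   /* stand-ins for the Windows types */
typedef int BOOL;

#define ARRAY_LEN  256
#define PACKED_LEN ((ARRAY_LEN + 7) / 8)

/* Simplified stand-in for LPMYARRAY_UserMarshal(). pUserObject points
   at the LPMYARRAY, which in turn points at the array of BOOLs. */
unsigned char *my_array_user_marshal(ULONG *pFlags, unsigned char *pBuffer,
                                     void **pUserObject)
{
    const BOOL *bools;
    short size = PACKED_LEN;
    int i;

    (void)pFlags;                          /* context not used in this sketch */
    assert(pBuffer != NULL && pUserObject != NULL && *pUserObject != NULL);
    bools = (const BOOL *)*pUserObject;

    /* Two-byte size field first. */
    memcpy(pBuffer, &size, sizeof size);
    pBuffer += sizeof size;

    /* Then eight BOOLs per byte, least significant bit first. */
    memset(pBuffer, 0, PACKED_LEN);
    for (i = 0; i < ARRAY_LEN; ++i) {
        if (bools[i])
            pBuffer[i / 8] |= (unsigned char)(1u << (i % 8));
    }

    /* Address of the first byte after the marshaled data. */
    return pBuffer + PACKED_LEN;
}
```

The function writes exactly sizeof(short) + PACKED_LEN bytes, which must not exceed what the sizing function added to the running total.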

<type>_UserUnmarshal()

This function is called on the stub side to unmarshal your data back into its original form before it is passed to the function being called. Its prototype is:
unsigned char __RPC_FAR * __RPC_USER <type>_UserUnmarshal(
   ULONG __RPC_FAR * pFlags, unsigned char __RPC_FAR * pBuffer,
   <type> __RPC_FAR * pUserObject);
This function takes the same parameters as <type>_UserMarshal(), the difference being that the buffer (pBuffer) now points to the data that the proxy side copied into it, while pUserObject now points to the data object that should receive the unmarshaled data.

The function should return a pointer to the new position in the RPC buffer, which should be the first byte after your marshaled data.

You should pay particular attention to how you allocate memory that you will be returning in the pUserObject pointer. For example, if the destination function passes this pointer on to COM or MAPI code, then you should use the appropriate COM or MAPI memory allocation mechanism. The sample code uses the COM function CoTaskMemAlloc() to allocate its memory.
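
The receiving side can be sketched as follows. It assumes a flat layout of a two-byte size followed by bitfields packed least significant bit first, which is my own simplification rather than the layout Listing 2 necessarily uses. malloc() stands in for CoTaskMemAlloc() so that the fragment compiles without the Windows headers, and the function name and typedefs are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long ULONG;   /* stand-ins for the Windows types */
typedef int BOOL;

/* Simplified stand-in for LPMYARRAY_UserUnmarshal(). On return,
   *pUserObject points at a newly allocated array of BOOLs, or is
   NULL if the allocation failed. */
unsigned char *my_array_user_unmarshal(ULONG *pFlags, unsigned char *pBuffer,
                                       void **pUserObject)
{
    short packed_len;
    BOOL *out;
    int i, count;

    (void)pFlags;                          /* context not used in this sketch */
    assert(pBuffer != NULL && pUserObject != NULL);

    /* Read the two-byte size field written by the proxy side. */
    memcpy(&packed_len, pBuffer, sizeof packed_len);
    pBuffer += sizeof packed_len;
    assert(packed_len > 0);
    count = packed_len * 8;

    /* malloc() stands in for CoTaskMemAlloc() here. */
    out = malloc((size_t)count * sizeof *out);
    if (out != NULL) {
        for (i = 0; i < count; ++i)
            out[i] = (pBuffer[i / 8] >> (i % 8)) & 1;
    }
    *pUserObject = out;

    /* First byte after our marshaled data. */
    return pBuffer + packed_len;
}
```

Whichever allocator you choose here must be matched by the corresponding release call in <type>_UserFree().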

<type>_UserFree()

This function is called on the stub side to free any memory you allocated in your <type>_UserUnmarshal(). After the call has been made to the interface method, the stub code will call this function to allow you to free the memory.

The first argument is the familiar set of flags documented in Table 1. The second argument is a pointer to the object that your <type>_UserUnmarshal() initialized. The sample uses CoTaskMemFree() to free the memory allocated by the call to CoTaskMemAlloc() in LPMYARRAY_UserUnmarshal().
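
For completeness, here is a sketch of the cleanup function. The prototype in the comment follows the pattern of the other three functions; the body uses free() as a stand-in for CoTaskMemFree() so that it compiles without the Windows headers, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned long ULONG;   /* stand-in for the Windows type */

/* Prototype pattern:
     void __RPC_USER <type>_UserFree(
         ULONG __RPC_FAR * pFlags, <type> __RPC_FAR * pUserObject);

   Simplified stand-in for LPMYARRAY_UserFree(). */
void my_array_user_free(ULONG *pFlags, void **pUserObject)
{
    (void)pFlags;                 /* context not needed to release memory */
    assert(pUserObject != NULL);

    free(*pUserObject);           /* stands in for CoTaskMemFree(); NULL is harmless */
    *pUserObject = NULL;          /* defensive only: guards against accidental reuse */
}
```

Clearing the pointer is not required by the stub; I do it in the sketch only so that a second call is harmless.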

Building the Proxy/Stub DLL

These four functions are usually placed in a separate .c file which is built along with the source generated by the IDL compiler to create the proxy/stub DLL. The file marshal.c (Listing 2) implements the four functions required for the sample. To build the proxy/stub DLL, run the MIDL compiler on your IDL file to generate the code for the proxy/stub DLL. Build it, making sure to include the file with your user marshaling code. The final step is to register the proxy/stub DLL in the registry so that COM knows how to load it when required. register.c (Listing 3) shows the registry settings required along with those required for the sample interface.

Besides the proxy/stub DLL, I’ve created a simple client application and server application to demonstrate the wire_marshal attribute. marsamp.h (Listing 4) contains the project header file, client.cpp (Listing 5) contains the console-based client application, and server.cpp (Listing 6) contains the console-based server application. makefile (Listing 7) contains nmake directives to build the project. This month’s code archive contains the complete source code.

The sample client program (client.exe) passes a 256-element array of BOOL values to the COM server (server.exe) for it to display. The client also displays the array before passing it to the server so that you can check the consistency of the data. The custom marshaling code compresses the array before transmission and decompresses it afterwards. The server program will show both the normal and compressed size of the data.

Building and Running the Sample

The sample source was built using Visual C++ v5.0. To build and run the sample:

1. Run nmake to build the project files. This will build client.exe, the client; server.exe, the server; and sobj.dll, the proxy/stub DLL.

2. Register the server and proxy/stub DLL by running the server with the following command line:
   server -register [path to server and proxy/stub location]
Note that the server and proxy/stub DLL must be in the same directory. This is due only to the registration code I wrote and is not a general COM requirement.

3. Run client.exe. This will create an instance of ISimpleObject, which will cause the server to run.

4. You can unregister the server and proxy/stub DLL by running server.exe with “-unregister” as a parameter.

Running the Sample across Machines

If you want to run the server on a different machine to the client, do the following:

1. Copy server.exe and sobj.dll to the server machine.

2. Repeat the previous step 2 on the server machine.

3. Run server.exe. You need to do this to see its output. The server program will display the string “Waiting...” and sit waiting for the client to connect to it.

4. Run client.exe on the client machine, passing the name of the server machine as the only parameter.

You must enable DCOM on both machines. The program dcomcnfg.exe can be used to verify this and also to ensure you have the correct access permissions enabled for the server. In the list of registered objects that DCOMCNFG displays, you will see an entry called “ISimpleObject - WDJ wire_marshal example”. Double clicking on this will bring up the access permissions for the object, which you can then change if required.

Debugging Proxy/Stub Code

When you are debugging code in the proxy/stub DLL, you should make sure that you have OLE RPC debugging enabled. This can be done in the “Options” dialog in Visual C++, under the “Debug” tab. Also note that you will not be able to set breakpoints in any of the source for the proxy/stub DLL until it is loaded by COM. This happens when you first create an object for which the proxy/stub is registered. Once the DLL is loaded you can set breakpoints in any of its source.

Summary

The wire_marshal attribute lets you define exactly how your data is represented when passed across process or machine boundaries. It also lets you locate your special marshaling code in one central place hidden from the upper layers that use it, keeping your application code and interfaces as simple as possible. The compression/decompression example used in this article is only one of many uses. Other uses include removing redundant information completely before transmission and marshaling data that cannot be marshaled in its native form, such as handles to resources or data structures that cannot be correctly defined in the IDL file.

Fran Heeran is a Senior Software Engineer at ISOCOR in Dublin, Ireland where he develops messaging clients and servers for Internet and Mobile messaging. He can be contacted at [email protected].
