Marshalling C++ Objects with XML


April 2002/Marshalling C++ Objects with XML

Marshalling is a technique for representing C++ (or Java) objects as XML text. The XML text can be transferred to a recipient and unmarshalled there back into a C++ object. The advantages of marshalling are described in “Professional XML, Second Edition, Chapter 15, XML Data Binding”. Marshalling is a kind of serialization, but it is more powerful and flexible, and it can be implemented so that its performance is nearly as high as that of plain serialization.

Recently I was involved in a client-server project where we used marshalling to transfer C++ objects between remote peers. At first I looked for publicly available code, but I did not find any that was both fast and flexible enough for marshalling C++ objects, so I decided to develop my own. The result is a marshalling subsystem with the following features:

  • Adding marshalling capability to an existing C++ class is relatively easy and straightforward for the average C/C++ developer.
  • High performance: a server application is able to marshal about 10^5 objects of average size per second; a client is able to unmarshal about 10^3 objects per second. If the application actually needs to marshal and unmarshal 100 times fewer objects per second than these 10^5 and 10^3 figures, the server and client each spend about 1 percent of their CPU resources on marshalling.
  • The marshalling subsystem supports the automatic propagation of changes in a C++ object to minimize data traffic between server and clients. Other means, like marshalling of some specific aspects of a C++ object, are also supported by the marshalling subsystem automatically.
  • The implementation is platform independent. However, it was compiled with Visual C++ 6.0 and tested only on Windows 9x and NT. For other platforms, you’ll have to check the implementation yourself.

The source code is available online and consists of three parts: the standard XML parser (expat; see www.jclark.com/xml/expat.html), the test program with various examples (cppmarsh.cpp, simple.*, ...), and the marshalling core (xmlbind.*, xmlprocessor.*).

Marshalling and Unmarshalling of Simple C++ Objects

Let us consider how to convert an existing C++ object into a marshallable object:

struct  CSimple {
    int count;
    string name;
    float pi;
};

Step 1. Derive the object from the Marshallable interface and insert DECLARE_MARSHALING string into the object declaration:

struct    CSimple : public Marshallable {
    int count;
    string name;
    float pi;
DECLARE_MARSHALING(CSimple)
};

Step 2. Define data description in cpp file:

BEGIN_IMPLEMENT_MARSHALING(CSimple)
    XDAT(int,     count,  "Count")
    XDAT(string,  name,   "Name")
    XDAT(float,   pi,     "PI")
END_IMPLEMENT_MARSHALING()

Step 3. To convert your object into its XML representation, call the Marshal method:

CSimple obj;
obj.count = 100;
obj.name = "test for simple object";
obj.pi = 3.14;

XMLDoc* pDoc = NULL;
obj.Marshal(pDoc);
const char* xmltext = pDoc->ReleaseText();

printf("%s\n", xmltext);

The result is as follows:

<CSimple>
    <Count>100</Count>
    <Name>test for simple object</Name>
    <PI>3.14</PI>
</CSimple>

Step 4. To convert the XML text into the object use XMLParser:

string message;
...
XMLParser parser;
Marshallable* pObj = parser.Parse(message);

The returned pointer is non-null if the parser was able to parse the XML string.

Step 5. Use the Marshallable interface methods to obtain type name and typeid:

printf("Object type name: %s\n", pObj->GetTypeName());
printf("Object type id: 0x%X\n", pObj->GetTypeId());

The result is as follows:

Object type name: CSimple
Object type id: 0x5AA8AF35

The type identifier is an unsigned 32-bit integer that is calculated from the type name string by the standard CRC32 algorithm:

CRC32("CSimple") = 0x5AA8AF35
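
The calculation can be sketched as follows. This is a minimal illustration that assumes "standard CRC32" means the common reflected variant (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF); the article does not say which variant the author's utility uses, so the constant above should be checked against the original tool. The helper make_tsig_define is hypothetical and only formats a header line in the style of the tsig_ constants.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Bitwise CRC-32, reflected form. Assumed variant: polynomial 0xEDB88320,
// initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF.
std::uint32_t crc32(const std::string& text)
{
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : text) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; fold in the polynomial when the low bit was set.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical helper: formats a "#define tsig_<Type> (0x...)" header line.
std::string make_tsig_define(const std::string& type_name)
{
    char line[128];
    std::snprintf(line, sizeof line, "#define tsig_%s    (0x%08X)",
                  type_name.c_str(), static_cast<unsigned>(crc32(type_name)));
    return line;
}
```

A bitwise loop is slow compared with a table-driven CRC, but type identifiers are computed once per type, so simplicity matters more here than speed.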

The GetTypeName and GetTypeId functions allow a developer to find an object’s type from the pointer to the Marshallable class. In the previous example, a developer can check the type name:

if(string("CSimple") == pObj->GetTypeName())

A real application may handle tens or hundreds of different marshallable types. In that case, it is more convenient to switch on the value returned by GetTypeId, for example:

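switch(pObj->GetTypeId())
{
case tsig_CSimple:
    {
        // The braces give pSimple its own scope, so later case labels
        // do not jump past its initialization.
        CSimple* pSimple = (CSimple*)pObj;
        ...;
    }
    break;
...
}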

where tsig_CSimple is a constant defined by the developer in a header file as:

#define tsig_CSimple    (0x5AA8AF35)

I developed a utility program called “Type Signature Calculator” that helps me to calculate a type identifier constant for a marshallable type, see Figure 1.

A Self-Describing Implementation

The method is based on adding a self-description capability to a C++ class. Unlike Java and some other modern programming languages, C++ does not support self-describing types, so the developer has to supply the class description by listing each marshalled data member with the XDAT macro. The format of the XDAT macro is simple:

XDAT(type,    member,    name)

In this format, type is the type of the class member, member is the name of the class member, and name is the name of the XML element. In most cases the XML element name and the class member name coincide, but they may differ if the developer wants them to. Because the table is this regular, a simple tool could generate the XDAT table automatically from a class definition.

The XDAT macro is defined in xmlbind.h as follows:

#define XDAT(type,field,fnm)                        \
    new _ElementDescription(#type, 0,               \
        ((int)&(((_alias*)0)->field)),              \
        sizeof(type), 1,                            \
        fnm, marshal_##type, unmarshal_##type,      \
        ANY_ASP),
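
To make the idea behind the macro concrete, here is a minimal, self-contained sketch of an offset-based description table. It is not the code from xmlbind.h: the names Simple, FieldDesc, FIELD, write_value, write_text, and marshal are all hypothetical, the standard offsetof macro replaces the null-pointer cast, and the name member is a char array so that the struct stays standard-layout.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Plain stand-in for the article's CSimple; a char array keeps the struct
// standard-layout, which offsetof requires.
struct Simple {
    int count;
    char name[32];
    float pi;
};

// One row of the description table: XML name, member offset, writer function.
struct FieldDesc {
    const char* xml_name;
    std::size_t offset;
    void (*write)(std::ostream& out, const void* member);
};

template <typename T>
void write_value(std::ostream& out, const void* member)
{
    out << *static_cast<const T*>(member);
}

void write_text(std::ostream& out, const void* member)
{
    out << static_cast<const char*>(member);
}

// Simplified, hypothetical counterpart of XDAT: records where the member lives.
#define FIELD(owner, member, name, writer) { name, offsetof(owner, member), writer }

const FieldDesc simple_fields[] = {
    FIELD(Simple, count, "Count", &write_value<int>),
    FIELD(Simple, name,  "Name",  &write_text),
    FIELD(Simple, pi,    "PI",    &write_value<float>),
};

// Generic marshaller: it knows nothing about Simple beyond the table.
// No XML escaping is done here, to keep the sketch short.
std::string marshal(const char* type_name, const void* object,
                    const FieldDesc* fields, std::size_t field_count)
{
    const char* base = static_cast<const char*>(object);
    std::ostringstream out;
    out << '<' << type_name << '>';
    for (std::size_t i = 0; i < field_count; ++i) {
        out << '<' << fields[i].xml_name << '>';
        fields[i].write(out, base + fields[i].offset);
        out << "</" << fields[i].xml_name << '>';
    }
    out << "</" << type_name << '>';
    return out.str();
}
```

The point of the design is that one generic routine can walk any object whose class supplies such a table, which is what lets a class gain marshalling by adding a few table rows.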

Another important macro, XATR, represents a class member as an XML attribute rather than as an XML element. The format of the XATR macro is the same as that of XDAT. For example, we can specify the description table for the CSimple class as follows:

BEGIN_IMPLEMENT_MARSHALING(CSimple)
    XATR(int,     count,  "Count")
    XATR(string,  name,   "Name")
    XDAT(float,   pi,     "PI")
END_IMPLEMENT_MARSHALING()

The only difference from the previous example is that the first two members use XATR instead of XDAT. The result of marshalling will be as follows:

<CSimple Count="100" Name="test for simple object" >
    <PI>3.14</PI>
</CSimple>

Representing a class data member as an XML attribute has an important advantage: attributes are available to the marshalling subsystem at the very beginning of unmarshalling the class, before any child element has been read. Attributes are used to support partial updates of objects, as described below in the “Publishing Differences” section.
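
The attribute/element split can be illustrated with a small sketch. The Field structure, escape_xml, and marshal_fields are hypothetical names introduced for this illustration; they are not part of xmlbind.h. Values are passed in as text to keep the example short.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct Field {
    std::string name;
    std::string value;
    bool as_attribute;   // true: XATR-like member, false: XDAT-like member
};

// Escapes the characters that are unsafe in attribute values and text.
std::string escape_xml(const std::string& text)
{
    std::string out;
    for (char c : text) {
        switch (c) {
        case '&': out += "&amp;";  break;
        case '<': out += "&lt;";   break;
        case '>': out += "&gt;";   break;
        case '"': out += "&quot;"; break;
        default:  out += c;
        }
    }
    return out;
}

std::string marshal_fields(const std::string& type, const std::vector<Field>& fields)
{
    std::ostringstream out;
    out << '<' << type;
    for (const Field& f : fields)      // first pass: attributes go into the start tag
        if (f.as_attribute)
            out << ' ' << f.name << "=\"" << escape_xml(f.value) << '"';
    out << '>';
    for (const Field& f : fields)      // second pass: the rest become child elements
        if (!f.as_attribute)
            out << '<' << f.name << '>' << escape_xml(f.value) << "</" << f.name << '>';
    out << "</" << type << '>';
    return out.str();
}
```

Because every attribute sits in the start tag, an event-driven parser hands all of them to the application in a single callback, which is why they suit values that identify the object.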

The marshalling subsystem is capable of marshalling the basic and STL types listed in Table 1. Marshalling of other types can be implemented analogously to these standard types. For this, you can copy and modify the marshal_type sample procedure from xmlbind.cpp for each new type.

Marshalling of Complex C++ Objects

Some C++ classes are complex: a class member can itself be a class, a pointer to a linked list, an array, and so on. Here, I consider how different complex members are marshalled. In the first example, the class member is a class or a pointer to a class.

class A : public Marshallable {
public:
    int a1;
    string a2;
DECLARE_MARSHALING(A)
};

class B : public Marshallable {
public:
    WORD b1;
    A* b2;
DECLARE_MARSHALING(B)
};

You can use the standard PMARSH pseudo-type as follows:

BEGIN_IMPLEMENT_MARSHALING(B)
    XDAT(WORD,    b1,  "b1")
    XDAT(PMARSH,  b2,  "b2")
END_IMPLEMENT_MARSHALING()

The result of marshalling will be:

<B>
    <b1>123</b1>
    <A>
        <a1>56</a1>
        <a2>the test</a2>
    </A>
</B>

No type named PMARSH is defined in any header file. Instead, the word PMARSH tells the marshalling subsystem to use the marshal_PMARSH procedure to marshal a member that is itself a marshallable class. The marshal_PMARSH procedure is implemented in xmlbind.cpp and is shown in Example 1.
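
The delegation performed for a pointer member can be sketched without the real subsystem. In the sketch below, MarshallableSketch, marshal_pointer, and to_xml are hypothetical stand-ins, and each class writes its own XML by hand instead of using a description table; only the nesting behaviour is meant to match the output shown above.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Minimal stand-in for the article's Marshallable interface (assumed shape).
struct MarshallableSketch {
    virtual ~MarshallableSketch() {}
    virtual void marshal(std::ostream& out) const = 0;
};

// Rough counterpart of marshal_PMARSH: follow the pointer and let the pointee
// write itself. A null pointer writes nothing.
void marshal_pointer(std::ostream& out, const MarshallableSketch* object)
{
    if (object != nullptr) object->marshal(out);
}

struct A : MarshallableSketch {
    int a1 = 0;
    std::string a2;
    void marshal(std::ostream& out) const override {
        out << "<A><a1>" << a1 << "</a1><a2>" << a2 << "</a2></A>";
    }
};

struct B : MarshallableSketch {
    unsigned short b1 = 0;   // WORD in the article
    A* b2 = nullptr;
    void marshal(std::ostream& out) const override {
        out << "<B><b1>" << b1 << "</b1>";
        marshal_pointer(out, b2);   // the nested object appears under its own type name
        out << "</B>";
    }
};

std::string to_xml(const MarshallableSketch& object)
{
    std::ostringstream out;
    object.marshal(out);
    return out.str();
}
```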

In our next example, the class member is a pointer to the head of the linked list.

// Node.h
class Node : public Marshallable {
public:
    int data;
    Node* next;
DECLARE_MARSHALING(Node)
};

// Container.h
class Container : public Marshallable {
public:
    string smth;
    Node* listhead;
DECLARE_MARSHALING(Container)
};

You should create a custom marshalling procedure for type Node by copying and modifying the marshal_type sample procedure from xmlbind.cpp. In the Container class description table, you use the following macro:

XDAT(Node,    listhead,   "NodeSList")

The marshal_Node procedure can be implemented as shown in Example 2. The XMLDoc, XMLElement, and ElementDesc are types defined in the xmlbind.h header file, see Listing 1.
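
Since Example 2 is not reproduced here, the following is only a sketch of what a list-marshalling procedure has to do: walk the chain from the head and write one element per node. ListNode, marshal_list, and the <Node>/<data> layout are assumptions made for the illustration, not the author's format.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct ListNode {
    int data;
    ListNode* next;
};

// Writes one <Node> per list element inside a wrapper element.
// Iterative on purpose: recursion depth would grow with the list length.
// A cyclic list would never terminate; the sketch assumes a null-terminated chain.
void marshal_list(std::ostream& out, const char* wrapper, const ListNode* head)
{
    out << '<' << wrapper << '>';
    for (const ListNode* node = head; node != nullptr; node = node->next) {
        out << "<Node><data>" << node->data << "</data></Node>";
    }
    out << "</" << wrapper << '>';
}

std::string list_to_xml(const ListNode* head)
{
    std::ostringstream out;
    marshal_list(out, "NodeSList", head);
    return out.str();
}
```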

Publishing Differences

The second parameter in the Marshal procedure is optional and can be used for publishing differences:

    CSimple objnew, objold;
    ...
    objnew.Marshal(&XMLDoc, &objold);

The marshalling subsystem automatically determines which members of objnew and objold differ and publishes only those. This powerful technique can be used to propagate only the changed data members of an object. For example, a server often sends updates of an object to a client. If only one data member has changed, it makes sense to send that member alone rather than the dozens that have not changed. Each time the object is about to change, the server uses the old state and the new state of the object to propagate the change as shown above. On the receiving side, a partial update has to be applied to an object that already exists, so you should implement the CreateInstance function, which is part of the Marshallable interface, for class CSimple; see Example 3. There, the pseudocode search_in_collection finds an existing object based on its attributes.
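
The sending side of this idea can be sketched in a few lines. SimpleState, write_if_changed, and marshal_diff are hypothetical names; the real subsystem drives the comparison from the description table rather than from hand-written calls, and the sketch leaves out the identifying attributes that the receiver needs to locate the existing object.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct SimpleState {   // stand-in for the article's CSimple
    int count;
    std::string name;
    float pi;
};

// Writes <tag>value</tag> only when there is no baseline or the value changed.
template <typename T>
void write_if_changed(std::ostream& out, const char* tag,
                      const T& now, const T* before)
{
    if (before == nullptr || !(now == *before)) {
        out << '<' << tag << '>' << now << "</" << tag << '>';
    }
}

// With before == nullptr every member is published; otherwise only differences.
std::string marshal_diff(const SimpleState& now, const SimpleState* before = nullptr)
{
    std::ostringstream out;
    out << "<CSimple>";
    write_if_changed(out, "Count", now.count, before ? &before->count : nullptr);
    write_if_changed(out, "Name",  now.name,  before ? &before->name  : nullptr);
    write_if_changed(out, "PI",    now.pi,    before ? &before->pi    : nullptr);
    out << "</CSimple>";
    return out.str();
}
```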

Marshalling a Class Aspect

The marshalling subsystem supports grouping of class members. Each group is an aspect of the class. It’s convenient to marshal one, several, or all aspects of a class. Currently, up to 32 aspects can be declared for a class, and each class member can be part of one or more aspects. The XDAT macro marks a data member as part of every aspect. The XASP macro is analogous to XDAT but takes an extra parameter: the identifier of the aspect group. You should define the aspect identifiers explicitly in code, for example:

    #define NAME_ASPECT        1
    #define AGE_ASPECT         2
    #define ADDRESS_ASPECT     4

Each aspect identifier has to be a power of two, so that identifiers can be combined as bits of a 32-bit mask. The only exception is the ANY_ASPECT constant, which is declared in xmlbind.h as (0xFFFFFFFF).

struct    AnswerList : public Marshallable {
    string last_name;
    string first_name;
    string address;
    int age;
DECLARE_MARSHALING(AnswerList)
};

Use the XASP macro instead of XDAT in a cpp file, as shown in Example 4. Use the Marshal method to publish the aspect only:

AnswerList lst;
lst.Marshal(&XMLDoc, 0, NAME_ASPECT);
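
Aspect selection amounts to a bit test per member. The sketch below shows that test in isolation; the Answer structure, the selected and marshal_aspects functions, the element names, and the assignment of members to aspects are assumptions for illustration, since Example 4 is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

const std::uint32_t NAME_ASPECT    = 1;
const std::uint32_t AGE_ASPECT     = 2;
const std::uint32_t ADDRESS_ASPECT = 4;
const std::uint32_t ANY_ASPECT     = 0xFFFFFFFFu;

struct Answer {   // stand-in for the article's AnswerList
    std::string last_name;
    std::string first_name;
    std::string address;
    int age;
};

// A member is written when its aspect mask shares at least one bit with the request.
bool selected(std::uint32_t member_aspects, std::uint32_t requested)
{
    return (member_aspects & requested) != 0;
}

std::string marshal_aspects(const Answer& a, std::uint32_t requested = ANY_ASPECT)
{
    std::ostringstream out;
    out << "<AnswerList>";
    if (selected(NAME_ASPECT, requested))
        out << "<LastName>" << a.last_name << "</LastName>"
            << "<FirstName>" << a.first_name << "</FirstName>";
    if (selected(ADDRESS_ASPECT, requested))
        out << "<Address>" << a.address << "</Address>";
    if (selected(AGE_ASPECT, requested))
        out << "<Age>" << a.age << "</Age>";
    out << "</AnswerList>";
    return out.str();
}
```

Because the identifiers are single bits, a caller can request several aspects at once by OR-ing them, for example NAME_ASPECT | AGE_ASPECT.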

Role of Serialization

For marshalling large objects with hundreds of data members, you may prefer serialization, which has two advantages: it is faster, and it does not require you to describe every object member with XDAT or XASP macros, which saves development time. You can use an ancillary stub class to transfer the serialized representation of a class; see Example 5.
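
One way such a stub could carry serialized data is to encode the raw bytes as hexadecimal text inside a single element. The sketch below is an assumption about the general approach, not the author's Example 5: BigRecord, the <Stub> element, and all function names are hypothetical, and copying raw bytes only works when both peers share the same data layout and byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical payload: trivially copyable, so its bytes can be copied as-is.
struct BigRecord {
    std::int32_t id;
    std::int32_t samples[3];
};

// Encodes raw bytes as uppercase hexadecimal text, which is safe inside XML.
std::string hex_encode(const unsigned char* data, std::size_t size)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string text;
    text.reserve(size * 2);
    for (std::size_t i = 0; i < size; ++i) {
        text += digits[data[i] >> 4];
        text += digits[data[i] & 0x0F];
    }
    return text;
}

// Returns the value of one uppercase hex digit, or -1 if the character is not one.
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Stub marshalling: the whole record travels as one opaque element.
std::string marshal_stub(const BigRecord& record)
{
    unsigned char bytes[sizeof(BigRecord)];
    std::memcpy(bytes, &record, sizeof bytes);
    return "<Stub>" + hex_encode(bytes, sizeof bytes) + "</Stub>";
}

// Restores the record; returns false if the text is not a well-formed stub.
bool unmarshal_stub(const std::string& xml, BigRecord& record)
{
    const std::string open = "<Stub>", close = "</Stub>";
    const std::size_t hex_size = sizeof(BigRecord) * 2;
    if (xml.size() != open.size() + hex_size + close.size()) return false;
    if (xml.compare(0, open.size(), open) != 0) return false;
    if (xml.compare(open.size() + hex_size, close.size(), close) != 0) return false;

    unsigned char bytes[sizeof(BigRecord)];
    for (std::size_t i = 0; i < sizeof bytes; ++i) {
        const int hi = hex_value(xml[open.size() + 2 * i]);
        const int lo = hex_value(xml[open.size() + 2 * i + 1]);
        if (hi < 0 || lo < 0) return false;
        bytes[i] = static_cast<unsigned char>((hi << 4) | lo);
    }
    std::memcpy(&record, bytes, sizeof bytes);
    return true;
}
```

The trade-off is that the payload is opaque: difference publishing and aspects cannot look inside it, so the technique fits objects that are always sent whole.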

Object Embedding

You can use the object embedding technique when transferring objects between remote peers. The main idea of embedding is to “wrap” the transferred object in a universal envelope, CSrvEnvelope. Every transferable class should be able to “insert” itself into an envelope in its CreateInstance procedure, as shown in Example 6.

The envelope class is also important because it allows a developer to specify the destination, the source, and some additional information.

Performance Estimations

I measured marshalling performance for C++ objects with different numbers of data members, ranging from two to 80 elements. The results are shown in Table 2 and graphed in Figure 2. The test computer had a 1.4-GHz Athlon CPU and 512 MB of RAM.


Konstantin Izmailov holds a Ph.D. in Computer Science. He is currently a Senior Software Engineer at Hypercom USA Inc. and can be reached at [email protected].

