Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

An Improved Variant Type Based on Member Templates


October 2000/An Improved Variant Type Based on Member Templates


This article presents class variant_t, which encapsulates a mechanism to hold values of arbitrary types. If the types used to initialize variant_t variables have full copy semantics, this variant_t can be stored in an array, returned from a function, and contained in any standard collection. As an example use of the variant_t, I present a class ArgumentList. It is a container adaptor to a vector<variant_t> that provides an interface akin to a list of arguments to a function.

Developing a variant_t

A preliminary definition for variant_t could be the following struct:

struct variant_t
{  
   variant_t (const void * value)
   : data ( value ) {}
   const void* data;
};

It would be used as follows:

variant_t array [2];
int n0 = 3;
double f0 = 3.14;
array[0] = variant_t((const void*)&n);
array[1] = variant_t((const void*)&f);
int n1 = *(const int*)(array[0].data);
double f1 = 
   *(const double*)(array[1].data);

One of the problems with this design is that it forces the user to explicitly cast to and from the variant_t object. A recent addition to the language, member templates, permits a more flexible design.

Version 0

Listing 1 shows the definition of variant0_t. The constructor for variant0_t is a member template function:

template<typename T>
variant0_t(const T& v){...}

A member template function is a member function that is instantiated (generated by the compiler) for a particular type when a call using this type is first seen by the translator. In the case of variant0_t, the compiler will generate a constructor according to the particular type of the argument. For example, the expression variant0_t(3) will generate a constructor that takes a const int & as an argument and produce a call to that constructor. Similarly, variant0_t has a generic conversion operator:

template<typename T>
operator T () const {...}

The compiler will try to generate code to convert from variant0_t to any type by instantiating this conversion operator for the particular return type. For example, the expression int n = variant0_t(3) will call an operator int (instantiated by the compiler) to provide an int that can be used to initialize n.

The following snippet shows how variant0_t can be used.

variant0_t array [2];
int n0 = 3;
double f0 = 3.14;
array[0] = n0;
array[1] = f0;
int n1 = array[0];
double f1 = array[1];

The conversion code uses reinterpret_cast<> to cast back the void pointer. I will add type checking to this cast later.

A problem with variant0_t is that it holds a pointer to the original value. If the original value goes out of scope, then variant0_t points to an invalid object. The solution is to hold a copy of the original value.

Version 1

Listing 2 shows variant1_t. It maintains a copy of the value it is initialized with in a separate memory block. Using variant1_t, the following operations are valid:

int* a0 = new int(3);
variant1_t v ( *a0 );
delete a0;
int a1 = v;  // a1 = 3

A problem with this new version is that it assumes that T can be safely copied as a bitwise copy operation.

Version 2

Listing 3 shows variant2_t. It is a considerable improvement over the last variant, so I'll explain it step by step.

Consider the following template:

template<typename T>
struct Impl
{
   Impl (T v) : data (v) {}
   T data;
};

This template can be used to safely obtain and store a copy of a given variable of an arbitrary type. Since it uses the type's copy constructor, the copy is guaranteed to be appropriate, as long as the copy constructor is properly defined. The next snippet extends Impl<> in a way that makes it particularly useful for our purposes:

struct ImplBase
{
   virtual ~ImplBase() {}
};
   
template<typename T>
struct Impl : ImplBase
{
   Impl (T v) : data (v) {}
   T data;
};

The big difference here is that Impl<T> is a class derived from ImplBase, which has a virtual destructor. Any instance of Impl<T> is a polymorphic type; that is, a pointer to an object of this type can be converted to a pointer to the base type, and vice versa. For instance, given a pointer to ImplBase, it is possible to safely cast it to Impl<T> using dynamic_cast<>. It is possible to cast a pointer to Impl<T> back to ImplBase using an implicit cast, since this is an upcast. This inheritance relationship enables arbitrary types to be stored in arrays and containers (via pointers), in a way that complements the variant_t.

Consider the following example:

ImplBase* array[2];
array[0] = new Impl<int>(3);
array[1] = new Impl<double>(3.14);
int n1 = (dynamic_cast<Impl<int>*>(array[0]))->data;
double f1 = (dynamic_cast<Impl<double>*>(array[1]))->data;
delete array[0];
delete array[1];

The above code implements an array of polymorphic types that holds values of types int and double. The expression

new Impl<int>(n0)

generates the type Impl<int> with the following important properties:

1) It contains a data member of type int.

2) This data member has been safely initialized as a copy of the int variable n0. Both the data member and the original variable are exactly of the same type so this initialization is type safe.

3) It is polymorphic, which means that it can safely be converted to and from BaseImpl *.

4) BaseImpl has a virtual destructor, so an Impl<int> * can be safely deleted from a BaseImpl *.

5) This type is uniquely bound to the type int. Different instances of Impl<> instantiated for different types will have similar properties, which will guarantee type-safe and copy-safe operations while maintaining polymorphic behavior.

The expression:

int n1 = (dynamic_cast<Impl<int>*>(array[0]))->data

is type-safe because the dynamic cast will fail if array[0] is not an instance of Impl<int> *. Therefore, if the dynamic cast succeeds, ->data is guaranteed to be of type int. Furthermore, because Impl<> is a polymorphic type, this dynamic cast is guaranteed to succeed when the result type matches the type of the n1.

variant2_t (Listing 3) is the combination of variant1_t and Impl<>. An additional method in variant_t encapsulates the casting operation shown above; CastFromBase enforces type checking by throwing an exception if the dynamic cast fails.

Dealing with Shared Representation

Since variant2_t holds a pointer to a value in a separate object, it creates a new problem.

Consider the following code:

variant_t foo() {return 3;}

The return statement is functionally equivalent to:

variant_t result ;
int _unnamed_int(3);
variant_t _unnamed_var_t(_unnamed_int); // temporary
result = _unamed_var_t;

As you can see, a temporary variant_t object is constructed and copied into the result object (which is constructed on the caller side and passed as a hidden parameter so the return statement can assign to it). This temporary holds a pointer to an instance of Impl<int> through its ImplBase pointer. This same instance is stored in result. But, when the temporary goes out of scope, this instance is deleted and result holds a pointer to an invalid object.

There are three basic solutions to this problem: "copy on copy," unique ownership, and reference counting.

By "copy on copy," I mean that every time the variant_t is initialized or copied, it makes a copy of the underlying value. I will not discuss this solution here, since it is relatively inefficient, but I provide an implementation on the CUJ website (www.cuj.com/code).

The unique ownership idiom mandates that only one instance of a (smart) pointer can point to a given object. std::auto_ptr<> is an example of this kind of smart pointer. Whenever a given instance of auto_ptr<> is copied, it passes a pointer to the owned object to the copy. In this manner, auto_ptr<> hands off ownership to the copy and relinquishes ownership for itself.

As an alternative, reference counting is the idiom for collaborative shared ownership. The reference counting idiom allows any number of smart pointers to own the same object without the risk of containing "dangling" pointers (i.e., pointers to deleted objects). Each smart pointer is guaranteed to contain a valid pointer because the jointly owned object can be destroyed only when all smart pointers to it have relinquished ownership.

There are two major drawbacks to reference counting:

1) The most straightforward implementations of the idiom require that additional data and operations be placed in the contained object. In this case, any class that will be reference counted must be slightly modified. (However, see [1] for an implementation that does not impose such requirements.)

2) Circular references might cause unpredictable behavior.

Unique ownership is a flexible and safe approach to providing smart pointer behavior in general; that's why auto_ptr<> uses this scheme. However, this scheme is unsuitable for general collections (including arrays and the standard containers), because general collections make extensive use of copying, possibly leading to situations where unowned objects are referenced. For this reason, I've chosen to implement variant_t as a reference counted object.

The Final Version

Listing 4 shows the complete code for the final variant_t type. It is functionally equivalent to variant2_t, except for the reference counting over Impl<>.

This final implementation of variant_t also includes a member template is_type(), with two overloaded versions. You can use is_type to test if variant_t holds a value of of some type T. The first version takes no argument and must be used as in:

v.is_type<MyClass>();

The second version takes a single argument and must be used as in:

v.is_type(MyObject);

where MyObject is of the type being tested for. I thought about adding a method of the form:

string type_name() const {return typeid(*data).name();}

but this method would return strings such as "variant_t::Impl<int>", revealing implementation details. Extracting "int" from the string above cannot be done portably because the exact string returned by typeid::name is compiler specific. So I just decided to leave this feature out.

Listing 5 shows a test program that demonstrates various uses of variant_t.

An Argument List Example

This example shows how to use a variant_t to create variable argument list. I think of an argument list as a collection with the following properties:

  • The list has a fixed size.
  • Elements are be added (or pushed) to the list before any of them are accessed.
  • Elements cannot be inserted in the middle of the list.
  • Elements cannot be removed from the list.
  • Elements can be randomly accessed.

I consider these properties to closely resemble the functioning of a formal parameter list to function.

With a properly defined variant_t class, designing the ArgumentList class is just a matter of determining its interface and the container it will use. Based on the above properties, I present class ArgumentList, shown in Listing 6. It is implemented in terms of a vector of variant_t.

Elements can be pushed on the back of the list. Just as you can with a variant_t, you can push elements directly to the list without explicit casting. You can use operator[] or the at method to access individual elements in the list, but you need to know the positions (indexes) of the specific arguments you are accessing. You cannot query the list about which type is at a given position.

Listing 7 shows a test program that demonstrates various uses of the ArgumentList class.

A Possible Extension

You will notice that this design of variant_t allows you to retrieve only a copy (or const reference) of the value. If you want to permit the user to modify the values held in a variant_t, you will need to add a copy-on-write mechanism. Beware, however: you must not add a method or operator that provides access to the value via reference or pointer to non-const. That will violate the basic assumption behind any copy-on-write mechanism, which is that only the class itself is allowed to modify the data.

Conclusion

The type variant_t is a special class with the ability to hold values of arbitrary types. variant_t itself is just a reference counted envelope for objects derived from variant_t::ImplBase. ImplBase forms a template-based polymorphic hierarchy, whose leaves are classes generated by the compiler according to the specific type of the value held by variant_t. These leaves are responsible for holding the copies of the values in a copy-safe and type-safe manner.

Reference

[1] Vladimir Batov. "Safe and Economical Reference Counting in C++," C/C++ Users Journal, June 2000.

Fernando Cacciola has been programming since 1984 and programming in C++ since 1990. He studied Biochemistry at John. F. Kennedy University. For the past five years, he has been developing computational geometry algorithms.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.