IOStreams Storage


January, 2006: IOStreams Storage

Maciej Sobczak is a Ph.D. student at the Institute of Computer Science, Warsaw University of Technology. You can contact him at http://www.msobczak.com/.


Much of the C++ code we write uses the IOStreams library. This is to be expected because input and output are common activities for most programs. The following code, for instance, illustrates the use of stream manipulators (hex, boolalpha, and setprecision here; oct works the same way as hex) that alter the behavior of a stream:

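// requires <iostream> and <iomanip>
cout << 123 << endl;              // prints 123
cout << hex << 123 << endl;       // prints 7b; hex stays in effect

bool b = true;
cout << boolalpha << b << endl;   // prints true instead of 1

cout << setprecision(3);
cout << 3.1415926536 << endl;     // prints 3.14
// ...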

These manipulators are objects that can be inserted into the stream (or extracted, because there are also manipulators for the input streams), changing the stream's subsequent behavior.

Altering a stream's behavior means that there is a state (that is, information kept by the stream) that influences what the stream does—formatting options involving the conversion base, precision, use of a plus sign with positive numbers, skipping whitespaces, and the like. This state is used by insertion/extraction operators [1], and is therefore responsible for how the stream behaves.

The set of standard manipulators (and the amount of state that the stream keeps) is limited to those that cover the input/output options for fundamental types. But what about custom-defined classes? Say, for instance, you write a class for representing complex numbers (admittedly, there's already one in the Standard Library, but this is a good example and you've likely written one). While IOStreams insertion/extraction operators are natural extensions for such a class, this example turns out to be a difficult exercise in design, in terms of flexibility and reuse: What type of brackets do you choose for the complex number? What type of separator between the real and imaginary part? How do you give the freedom of choice to users of your class?

One interesting property of standard IOStreams manipulators is that formatting options are managed independently for each stream object. Consequently, you can have many different streams with many different formatting settings; for example, one stream for decimal output and another for hexadecimal. Providing the equivalent functionality for the custom classes lets you smoothly integrate classes with the complete IOStreams framework, with obvious benefits in the form of conceptual consistency. The solution seems straightforward—you just have to find a way to store additional data in the stream objects so that you can retrieve this data anytime you perform stream insertion/extraction. The additional data can represent, for example, the chosen bracketing style for the objects of the complex class.

Formatting issues and stream insertion/extraction are only examples of situations where the additional storage in stream objects could be useful. That said, these examples are typical and are the main motivation for providing additional storage. So how do you go about storing additional data in the stream object?

Flexibility by Design

The basic idea behind storing additional data in the stream object is to provide a separate place in the stream object, so that programmers who want to extend the IOStreams library can store their own additional data. A simple approach would be to provide a single generic field:

class MyOstream {
   void *customdata_;
public:
   void setCustom(void *p);
   void * getCustom();
};

However, this code breaks down when two or more programmers independently want to provide their own formatting functionality. When one of them writes the complex class (with the style of bracketing as the point of variation) and the other writes, say, an Address class (with the level of detail as the changeable feature), both compete for the same field. In other words, we need a solution that lets individual programmers have their own slot in the stream object for private use; the additional storage has to be a container of slots.

This is exactly what the std::ios_base class, the base class of all stream classes in the IOStreams library, provides. Figure 1 presents the conceptual view of this container of slots, which is managed by the stream object. Each slot can be seen as a pair able to store two values: one long value, intended for small types (values that fit in a long), and one void * value for anything bigger. This justifies the dreaded void * instead of some template type: you can store different types of data in each slot, and the pointer to void is the only reasonable common denominator for bigger types.

The std::ios_base class provides separate member functions for manipulating the additional storage. The most fundamental functions are:

long &  iword(int index);
void *& pword(int index);

where the index identifies one particular slot in the container.

The first function, iword, gives you a reference to the slot containing the long value. The second function, pword, returns a reference to the slot containing the pointer value. You can use these functions for both reading and writing. For example, this code stores the long value in the third slot of the cout object:

cout.iword(2) = 12345L;

Similarly, this code retrieves the previously stored value and prints it using the same stream object:

cout << cout.iword(2) << endl;

But how big is the container? Well, an interesting property of the container of slots is that it is autoresizable—whenever you use a new index, the container resizes itself to accommodate a new index value. The newly allocated slots are automatically initialized to 0 (this applies to both long and void * slots).

Two issues need to be noted:

  • Both functions return references to memory managed by the resizable container. The two do not mix well: when the container resizes (which can happen during any call to these functions), it may need to reallocate all its memory, so a previously returned reference is no longer valid and should not be used any longer. This is important, but at the same time similar to the behavior of STL containers.
  • The resizing (which can mean reallocating the memory) can fail. In that case, badbit is set in the stream object, which may result in an ios_base::failure exception if the stream's exception mask includes badbit. If no exception is thrown, the functions still guarantee to return valid references to objects initialized to 0.

So how should you choose the slot index so as not to collide with other programmers? The stream helps you with this. The std::ios_base class has a static member function, static int xalloc(), that returns a unique value each time it is called. This value is intended to be used as a slot index in the iword and pword functions. Moreover, because this member function is static in the std::ios_base class, its return value is unique for all stream objects of any kind and can be safely used everywhere. The natural convention is that the allocated index number is associated with some aspect of the project that spans many stream objects. For example, the author of a complex class who wants to have his own private slot for storing bracketing-related information can use the once-allocated index number for all streams, both input and output.

The xalloc function is provided so that you don't have to choose your slot index at random, in order to avoid colliding with the choices of others (and even random guessing would not guarantee this safety). In fact, any arbitrary choice of slot index is error-prone and should be avoided. Keeping this in mind, you can change the previous example to more "civilized" code:

int myindex = ios_base::xalloc();   // static member; cout.xalloc() calls the same function
cout.iword(myindex) = 12345L;
cout << cout.iword(myindex) << endl;

Having these pieces, you can try to bring them together in the form of working code. Listing 1 is the skeleton of the complex class, with an inserter operator for output stream and a simple manipulator for changing the bracketing style. The important things to note in this code are:

  • The private index value (private in the sense that it is used only and always for the bracketing style of the complex class) is allocated on first use and kept until the end of the program by the static member function myindex().
  • The bracketing style is stored in the long slot, by packing the left and right brackets in a single word. I assume that a char takes 8 bits and that two chars can fit in one long value.
  • The inserter operator retrieves the currently set bracketing style from the stream object or uses default brackets if the slot is "empty."
  • The setbrackets() function returns a struct containing the bracketing style and a special inserter for this struct puts the brackets into the private slot.

Listing 1 shows how the long slots can be used when printing the following:

(3.14, 2.71)
[3.14, 2.71]

The long slot suffices for a state this small. If you want to store a bigger state in the stream object, you use the other slot, the one for void * values. But what if the state (the equivalent of the struct bracketing in Listing 1) is allocated on the free store? Its lifetime has to be managed; in particular, you would like to delete the object when the stream is destroyed to avoid leaking memory (and possibly other resources). The stream object will not do it for you (it cannot even know whether the object ever needs to be deleted), so you need to be notified when something crucial happens, such as the destruction of the whole stream object. Callback functions help with this.

The idea behind callback functions is that you can ask the stream object to call the specified function (usually, your function) when something important happens. There are three such situations:

  • When the stream object is destroyed.
  • When the new locale is imbued onto the stream.
  • When users call the copyfmt() stream member function.

All these situations can be interesting in the sense that they (can) affect the data that you store in the stream object's additional storage.

The callback function should have a specific signature, as in this code:

void
mycallback(ios_base::event e,
    ios_base &iob, int index);

and can be registered like this:

cout.register_callback(mycallback, myindexvalue);

The callback function is called whenever one of these events happens:

  • The stream is destroyed; the event is ios_base::erase_event.
  • A new locale is imbued; the event is ios_base::imbue_event.
  • Users request to copy the format from one stream object to another with copyfmt(). The destination stream object then performs a three-phase action: First, it calls the registered callback functions with the erase_event (because in the next step, the values in slots are overwritten). Second, it copies all slot values (which are long and pointer values) from the source stream object to the destination stream object (this is a "shallow copy"). Third, it calls the registered callbacks with the event ios_base::copyfmt_event; this lets you perform a "deep copy" (or reference counting, or whatever) of the objects pointed to by void * slots.

Moreover, the callback function is called with the parameters set appropriately:

  • The event parameter gets the correct value, as previously described, so that the callback function can decide what exactly should be done in each case (the most important event is the stream's destruction, in which case you can delete the objects pointed to by void * slots).
  • The iob parameter references the "current" stream object, so that you can reuse the same callback function for many different streams.
  • The index parameter gets the same value that was given during the registration. The natural convention is that you use this parameter to identify some specific slot in the container, so that you know which slot is affected by the current notification. (For this convention to work, you have to register your callback function for every slot you want to manage, even if it is the same callback function.)

However, there are a couple of things that still have to be taken care of; for example, callbacks are not allowed to emit exceptions. This makes error reporting from callbacks more complex, but still possible [2].

This detailed description may seem overwhelming, but I've presented the bare mechanics related to the additional IOStreams storage. The simple use of long slots is straightforward, and some care should be taken only when callback functions are involved to help us manage the lifetime of objects owned by pointer slots. Listing 2 presents an alternative solution that uses dynamically allocated objects and callback functions.

Reusable Component

After writing a couple of your own classes that use the additional storage in stream objects, you quickly discover that the way you use this functionality is repeatable to the extent that a special, reusable wrapper would be quite desirable. This is especially true when the dynamically allocated state is stored in the stream object and you are forced to implement the callback functions in a similar way. The code accompanying this article (see http://www.cuj.com/code/) implements wrapper classes that encapsulate the gory details previously described and can be helpful when writing custom classes that extend the IOStreams library with the use of additional stream storage.

The code presents two different approaches to encapsulate the additional storage:

  • The stream state (in other words, a private slot) is associated with the type of data we want to store in the stream object. Only one slot can be associated with a given type; for example, you can have only one object of struct bracketing stored in a single stream object. This approach is limiting, but at the same time it is easy to use; see Listing 3.
  • The stream state (and slot) is associated with the accessor object encapsulating the location information. Any type of data can be stored in the stream, including many instances of the same type (the location of each instance is encapsulated by a separate accessor object). This approach is more flexible, but more difficult to use—you have to keep the accessor object around because it is a magic key that gives you access to the associated slot. Listing 4 presents the possible implementation of the complex class following this approach.

References

  1. The stream state that is responsible for the formatting options for fundamental types is not used directly by the insertion and extraction operators, but rather by facets that perform the dirty work. See [2] for details.
  2. Langer, Angelika and Klaus Kreft. Standard IOStreams and Locales. Addison-Wesley, 2000. This book suggests that when the void * slot is used for storing the actual information, the long slot can be used as a means to communicate the error information, if exceptions are forbidden and there is no way to use setstate() at the same time. The code accompanying the article uses this strategy.

CUJ

