Dr. Dobb's is part of the Informa Tech Division of Informa PLC


Handling Parameter Sets in Member Initializer Lists: With A Little Help From MUMI!



As is widely known [1, 2, 3], use of member initializer lists (MILs) is preferred over assignment to instance members in constructor bodies. In [4] I identify seven types of things one might initialize from a constructor and discuss how all but one—arrays—should be done via MILs, as well as describing the advantages to be gained from doing so.

But sometimes this rosy picture can be hard to satisfy. Consider the classes from the C++ mapping of recls, my recursive filesystem searching library [5]. The original C++ mapping had one search class, FileSearch, which handled only searching of filesystems. The recls API creates a file search handle via the Recls_Search() function:

recls_rc_t Recls_Search(recls_char_t const  *searchRoot
                      , recls_char_t const  *pattern
                      , recls_uint32_t      flags
                      , hrecls_t            *phSrch);

Since the FileSearch class is designed to be independent of exceptions—for reasons of simplicity, and because it just serves as an enumerator object [6]—each instance maintains a member of type recls_rc_t, whose value is accessible via the GetLastError() method.

class FileSearch
{
public:
  FileSearch(char const *rootDir, char const *pattern, recls_uint32_t flags);

. . .

public:
  recls_rc_t  GetLastError() const;

. . .

private:
  hrecls_t    m_hSrch;
  recls_rc_t  m_lastError;
};

The problem with this, however, is that a single call to Recls_Search() produces both values, so the two members cannot each be given an independent initializer in the member initializer list (MIL [4]). The usual way around this is to skip the MIL altogether, and assign the members in the constructor body. With the FileSearch class we could get away with it, since hrecls_t is merely a pointer-based handle, and recls_rc_t is an unsigned integer. Hence, the implementation could be:

FileSearch::FileSearch(char const *rootDir, char const *pattern, recls_uint32_t flags)
{
  m_lastError = Recls_Search(rootDir, pattern, flags, &m_hSrch);
}

Although that works fine as is, I must come clean and admit that that's not how I implemented it. Instead, I used the following implementation, which clearly shows the influence of my instincts towards MILs:

FileSearch::FileSearch(char const *rootDir, char const *pattern, recls_uint32_t flags)
  : /* m_hSrch(NULL)
  , */ m_lastError(Recls_Search(rootDir, pattern, flags, &m_hSrch))
{}

I'm not saying either of these is particularly good, although they both do the job. Since the Recls_Search() function will set the search handle to NULL on failure, they're both perfectly safe, even if each gives qualms to the diligent MIL user.
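The MIL version works only because the handle is a built-in type that is left out of the MIL and written as a side effect of initializing m_lastError. A minimal sketch of that arrangement follows, using hypothetical stand-ins (handle_t, rc_t, open_search(), and the Enumerator class are all invented for illustration and are not part of recls):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for hrecls_t, recls_rc_t and Recls_Search();
// none of these names belong to the recls API.
typedef int*  handle_t;
typedef int   rc_t;

static int g_dummy = 0;

// Writes the handle through phSrch and returns a result code (0 == success).
// Like Recls_Search(), it sets the handle to NULL on failure.
inline rc_t open_search(bool succeed, handle_t* phSrch)
{
    *phSrch = succeed ? &g_dummy : NULL;
    return succeed ? 0 : -1;
}

class Enumerator
{
public:
    explicit Enumerator(bool succeed)
        // m_hSrch is deliberately absent from the MIL: it is written by
        // open_search() while m_lastError is being initialized.
        : m_lastError(open_search(succeed, &m_hSrch))
    {}

    handle_t handle() const    { return m_hSrch; }
    rc_t     lastError() const { return m_lastError; }

private:
    handle_t m_hSrch;      // built-in type, so no initializer runs for it
    rc_t     m_lastError;
};
```

Members are initialized in declaration order, not MIL order. If m_hSrch were declared after m_lastError and also given its own initializer, such as m_hSrch(NULL), that initializer would run second and silently discard the handle. This fragility is one reason the trick gives qualms.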

As of version 1.5, however, support for recursive searching of FTP sites was added to recls. A recls FTP recursive search is initiated via the Recls_SearchFtp() function:

recls_rc_t Recls_SearchFtp( recls_char_t const  *host
                          , recls_char_t const  *username
                          , recls_char_t const  *password
                          , recls_char_t const  *searchRoot
                          , recls_char_t const  *pattern
                          , recls_uint32_t      flags
                          , hrecls_t            *phSrch);

Everything else in the recls API remained the same. The consequent refactoring moved all the functionality of FileSearch into a base Search class, from which FileSearch and the new FtpSearch class are derived. Each of these derived classes does nothing more than provide an appropriate constructor, which passes a search handle and return code to the base class:

class Search
{
protected:
  Search( ??? );

. . .
  hrecls_t    m_hSrch;
  recls_rc_t  m_lastError;

};

class FileSearch
  : public Search
{
public:
  FileSearch(char const *rootDir, char const *pattern, recls_uint32_t flags);
};

class FtpSearch
  : public Search
{
public:
  FtpSearch(char const *host, char const *username, char const *password
          , char const *rootDir, char const *pattern, recls_uint32_t flags);
};

Here's where we hit the problem. What we'd like to do is be able to define the constructor of Search as

Search::Search(hrecls_t hSrch, recls_rc_t lastError)
  : m_hSrch(hSrch)
  , m_lastError(lastError)
{
  recls_assert(NULL != hSrch || RECLS_FAILED(lastError));
  recls_assert(NULL == hSrch || RECLS_SUCCEEDED(lastError));
}

But there's no way to call the function, capture both the handle and the return value, and then pass them as two separate arguments to a base class constructor. There's no equivalent in C++ of the following:

FileSearch::FileSearch(char const *rootDir, char const *pattern, recls_uint32_t flags)
  : { hrecls_t    hSrch;
      recls_rc_t  rc = Recls_Search(rootDir, pattern, flags, &hSrch);
    Search(hSrch, rc);
    }
{}

Hence, we are reduced to hackery of some form or another. The obvious tactic is to declare m_hSrch and m_lastError as protected, rather than private members, and to initialize them in the derived classes. Search would just be given a default constructor that would do nothing. The FileSearch constructor would look like this:

FileSearch::FileSearch(char const *rootDir, char const *pattern, recls_uint32_t flags)
{
  Search::m_lastError = Recls_Search(rootDir, pattern, flags, &(Search::m_hSrch));
}

(Naturally, we don't need to write Search:: for the m_lastError and m_hSrch members. I've just done so to emphasize the fact that they do not belong to the FileSearch class. Note the parentheses in &(Search::m_hSrch): without them, the expression would form a pointer to member rather than the address of this instance's m_hSrch.)

Now, I've such strong antipathy towards protected data members that I just can't bring myself to write such things. What's to stop the reckless/malicious user of the libraries from deriving their own class from Search, and doing it badly? As a wise man once said, there's no accounting for Machiavelli, and trying to bulletproof your code against deliberate abuse is a game you'll never finish, so maybe we shouldn't worry? However, there's a stronger, practical reason why this solution is not acceptable. You may have noticed in the original suggested definition of Search::Search() that I used a touch of Design by Contract [7, 4], in that the assertions verify that the m_hSrch member can only be NULL if m_lastError indicates an error, and vice versa. If we oblige our derived classes to initialize our data members in their constructors, we lose control over this invariant. Sure, in a modest body of code, such as the recls C++ mapping, we're unlikely to get it wrong, but that's hardly the point. Good habits are hard to break, and bad ones harder, and I for one am too lazy to try to maintain more rule sets than are necessary.

So what's the answer? Well, as demonstrated in [4], one technique for adding sophistication to MILs without sacrificing robustness is to use static methods to translate, validate, or otherwise manipulate incoming arguments into a form acceptable to the members to be initialized. The restriction on the use of such functions is still that they cannot span more than one initializable entity. What? ...sounds as clear as custard. Well, what it means is that such a function still cannot provide two or more arguments to the base class constructor. At first look, this doesn't seem much help, but have faith.
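As a minimal sketch of the static-method technique, consider a class whose constructor argument needs translating before it reaches a member. The Pattern class, its Normalise_() helper, and the choice of "*" as the default are all hypothetical, made up for illustration rather than taken from recls or from [4]:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical example: a search pattern whose NULL or empty form is
// translated to a match-everything default before the member is initialized.
class Pattern
{
public:
    explicit Pattern(char const* pattern)
        : m_pattern(Normalise_(pattern))   // static helper used inside the MIL
    {}

    std::string const& str() const { return m_pattern; }

private:
    // A static method has no access to the half-built instance, so it is
    // safe to call while members are still being initialized.
    static std::string Normalise_(char const* pattern)
    {
        return (NULL == pattern || '\0' == *pattern)
                    ? std::string("*")
                    : std::string(pattern);
    }

private:
    std::string const m_pattern;   // const: can only be set via the MIL
};
```

Note that Normalise_() supplies exactly one initializable entity, the single member m_pattern. That is the restriction just described: one helper call, one value, one thing initialized.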

The answer lies in this fact. If we cannot pass two things, let's only pass one. Search's constructor can be defined to take only a single parameter, of type std::pair<hrecls_t, recls_rc_t>. Now, the derived class FileSearch defines and uses a static private method CreateSearch_(), as follows:

class Search
{
protected:
  typedef std::pair<hrecls_t, recls_rc_t> ctor_args_type;

  Search(ctor_args_type args);

. . .
  hrecls_t    m_hSrch;
  recls_rc_t  m_lastError;
};

class FileSearch
  : public Search
{
public:
  FileSearch(char const *rootDir, char const *pattern, recls_uint32_t flags);

private:
  static ctor_args_type CreateSearch_(char const *rootDir, char const *pattern
                                    , recls_uint32_t flags);
};

FileSearch::FileSearch(char const *rootDir, char const *pattern, recls_uint32_t flags)
  : Search(CreateSearch_(rootDir, pattern, flags))
{}

The implementation of FileSearch::CreateSearch_() would be as follows:

Search::ctor_args_type FileSearch::CreateSearch_( char const      *rootDir
                                                , char const      *pattern
                                                , recls_uint32_t  flags)
{
  hrecls_t    hSrch;
  recls_rc_t  rc = Recls_Search(rootDir, pattern, flags, &hSrch);

  return Search::ctor_args_type(hSrch, rc);
}

Couldn't be simpler! The great advantage is now that Search does not need to give protected access to its member variables, and Search::Search() gets to keep its Design by Contract.

Search::Search(ctor_args_type args)
  : m_hSrch(args.first)
  , m_lastError(args.second)
{
  recls_assert(NULL != m_hSrch || RECLS_FAILED(m_lastError));
  recls_assert(NULL == m_hSrch || RECLS_SUCCEEDED(m_lastError));
}
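For readers who want to try the idiom without recls installed, the pieces above can be assembled into a compilable sketch. Everything recls-specific is replaced by hypothetical stand-ins: handle_t, rc_t, RC_OK, RC_FAIL, stub_search(), and the IsOpen() accessor are assumptions for illustration, and the constructor signature is cut down to a single argument.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-ins so the sketch compiles without recls; the real
// hrecls_t, recls_rc_t and Recls_Search() come from the recls headers.
typedef void const*  handle_t;
typedef int          rc_t;
rc_t const RC_OK   = 0;
rc_t const RC_FAIL = -1;

// Stub search: fails for a NULL or empty root and, like Recls_Search(),
// sets the handle to NULL on failure.
inline rc_t stub_search(char const* root, handle_t* phSrch)
{
    if (NULL == root || '\0' == *root)
    {
        *phSrch = NULL;
        return RC_FAIL;
    }
    *phSrch = root;   // any non-NULL value serves as a handle here
    return RC_OK;
}

class Search
{
protected:
    typedef std::pair<handle_t, rc_t> ctor_args_type;

    explicit Search(ctor_args_type args)
        : m_hSrch(args.first)
        , m_lastError(args.second)
    {
        // The invariant stays under the base class's control.
        assert(NULL != m_hSrch || RC_OK != m_lastError);
        assert(NULL == m_hSrch || RC_OK == m_lastError);
    }

public:
    bool IsOpen() const       { return NULL != m_hSrch; }
    rc_t GetLastError() const { return m_lastError; }

private:
    handle_t m_hSrch;       // private, not protected
    rc_t     m_lastError;
};

class FileSearch
    : public Search
{
public:
    explicit FileSearch(char const* rootDir)
        : Search(CreateSearch_(rootDir))
    {}

private:
    // Makes the single call and bundles both results into one argument.
    static ctor_args_type CreateSearch_(char const* rootDir)
    {
        handle_t hSrch;
        rc_t     rc = stub_search(rootDir, &hSrch);

        return ctor_args_type(hSrch, rc);
    }
};
```

Constructing FileSearch with a non-empty root leaves IsOpen() true and GetLastError() equal to RC_OK. An empty root gives a closed search with RC_FAIL. In both cases the assertions in Search::Search() hold.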

In the rarer cases where you need to handle more than two parameters, std::pair doesn't cut it: you'll need to write a custom structure or use tuples (such as those available in Boost). But the principle, which I call the MUltiple Member Initialization idiom (MUMI), is applicable to any number of parameters.
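A sketch of the custom-structure variant, assuming three values that must come out of a single pass over the input. The Stats and ArrayStats classes and the Scan_() helper are hypothetical and unrelated to recls; they only show the shape of the idiom with more than two parameters:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base class with three members established together.
class Stats
{
protected:
    // Custom structure playing the role std::pair plays for two values.
    struct ctor_args_type
    {
        int          minimum;
        int          maximum;
        std::size_t  count;
    };

    explicit Stats(ctor_args_type const& args)
        : m_min(args.minimum)
        , m_max(args.maximum)
        , m_count(args.count)
    {
        // Invariant: a non-empty set has minimum <= maximum.
        assert(0 == m_count || m_min <= m_max);
    }

public:
    int         minimum() const { return m_min; }
    int         maximum() const { return m_max; }
    std::size_t count() const   { return m_count; }

private:
    int          m_min;
    int          m_max;
    std::size_t  m_count;
};

class ArrayStats
    : public Stats
{
public:
    ArrayStats(int const* values, std::size_t n)
        : Stats(Scan_(values, n))
    {}

private:
    // One pass over the array produces all three constructor arguments.
    static ctor_args_type Scan_(int const* values, std::size_t n)
    {
        ctor_args_type args = { 0, 0, n };

        if (n > 0)
        {
            args.minimum = args.maximum = values[0];
        }
        for (std::size_t i = 1; i < n; ++i)
        {
            if (values[i] < args.minimum) { args.minimum = values[i]; }
            if (values[i] > args.maximum) { args.maximum = values[i]; }
        }
        return args;
    }
};
```

A named structure costs a few more lines than std::pair, but its fields describe themselves, which first and second do not.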

In the recls C++ mapping, the invariants are enforced only by assertions. But this technique may be applied generally, and you may well have even more reason than I did in this case to be unwilling to give up control of your class invariants. Now, with a little help from MUMI, you can have your flexibility without sacrificing the robustness you get from using MILs.

Notes and References

[1] Exceptional C++, Herb Sutter, Addison-Wesley, 2000.

[2] C++ Gotchas, Steve Dewhurst, Addison-Wesley, 2003.

[3] The C++ Programming Language, Special Edition, Bjarne Stroustrup, Addison-Wesley, 1997.

[4] Imperfect C++, Matthew Wilson, Addison-Wesley, 2004.

[5] The recls library (http://recls.org/) has been the exemplar for the first few installments of my Positive Integration column, for C/C++ Users Journal (http://www.cuj.com/). The changes described here happened in version 1.5.1, which is described in the September 2004 installment.

[6] In other words, one can use it as follows:

using reclspp::FileEntry;
using reclspp::FileSearch;

FileSearch fs("/home/matty", "recls.*", RECLS_F_FILES | RECLS_F_RECURSIVE);

for(; fs.HasMoreElements(); fs.GetNext())
{
  FileEntry entry = fs.GetCurrentEntry();
  . . . // use entry
}

This is one of those few cases where a Construct And Test idiom is valid. In almost all other cases, you're much better off to throw an exception during constructor failure ([1]), giving cleaner and safer object initialization.

[7] Object-Oriented Software Construction, Bertrand Meyer, Prentice Hall, 1997.

About the Author

Matthew Wilson is a software development consultant for Synesis Software, and creator of the STLSoft libraries. He is author of the book Imperfect C++ (Addison-Wesley, Sept 2004), and is currently working on his next two books, one of which is not about C++. Matthew can be contacted via http://imperfectcplusplus.com/.

