Hash Table Refresher
Hash tables were part of the Standard Template Library when the language standard was ratified in 1998. And while most of the Standard Template Library was incorporated into the standard library at that time, the hash_set
and hash_map
classes were excluded. According to Bjarne Stroustroup Evolving a Language In and For the Real World: C++ 1991-2006:
"they would have been in C++98 had we had the time to do a proper detailed design and specification job. There was no doubt that a hash_map was needed as an alternative to map for large tables where the key was a character string and we could design a good hash function. In 1995, Javier Barreirro, Robert Fraley and David Musser tried to get a proposal ready in time for the standard and their work became the basis for many of the later hash_maps [8]. The committee didn't have the time, though..."
While these containers will be added to the standard library under the awkward names unordered_map
and unordered_set
when TR1 is adopted, there is no reason we have to wait that long. Despite the fact that they are missing from the standard, virtually every modern C++ compiler has adopted a reasonable variant of hash_map
and hash_set
.
The hash_map
container does a great job of memoization as described in this article, with a caveat. As I said earlier, you'll need the key to your hash table to be some subset of the input parameters. That's all fine if your key is a single value using a built-in type, such as int, a std::string
, or a pointer. These key values are supported implicitly by most implementations of hash_map
.
Things get a little more complicated if you are using multiple values or your own structures as a key into the map. Owing to the lack of a standard, implementing non-standard keys requires slightly different techniques, depending on your compiler.
I'll give you examples here that work with the two compilers you are most likely to encounter: Visual C++ .NET 2003/2005, or gcc 3.x.
Let's say you're doing a study tracking peoples reading habits by age, and you want to implement the following code:
class Sample { public : Sample( int age=-1, const std::string &genre="" ) : mAge( age ) , mGenre( genre ){} int mAge; std::string mGenre; }; int main() { hash_map<Sample,int> AnnualBooksReadByAgeAndGenre; AnnualBooksReadByAgeAndGenre[ Sample( 15, "Fantasy" ) ] = 15; AnnualBooksReadByAgeAndGenre[ Sample( 18, "Sports" ) ] = 12; AnnualBooksReadByAgeAndGenre[ Sample( 35, "Mystery" ) ] = 17; AnnualBooksReadByAgeAndGenre[ Sample( 42, "Romance" ) ] = 125; return 1; }
You won't be able to compile this code as-is, for at least three reasons:
- You need to include the correct header file, which unfortunately differs depending on which compiler you are using.
- You need to hoist the
hash_map
name into your namespace. - You need to define the hash function and at least one comparison function for the key class,
Sample
.
Item 3 is only needed if you are using something other than a basic type for your class, and it's the trickiest. Defining the comparison is necessary so that the hashing library code can verify key equality during lookups or collisions.
In both cases, the function is usually pretty easy to write -- you just compose it using existing functions. Examples for Microsoft and gcc compilers circa 2007 are given below.