Dimension Checking of Physical Quantities

By Michael Kenniston, November 01, 2002

A practical yet powerful tool for catching unit errors at compile time.

November 2002/Dimension Checking of Physical Quantities

When you write a program to solve a scientific or engineering problem, you aren’t just dealing with numbers; most of those numbers actually correspond to physical quantities like density or area. These quantities have not only a magnitude but also dimensions (e.g., length, speed, or acceleration) and units (e.g., meters, feet, and inches). Explicitly including this additional information in source code makes programs easier to understand, and having the compiler verify consistency makes development faster and more reliable. But due to a lack of suitable tools, hardly anyone applies a systematic technique to associating dimensions and units with scientific variables.

This article describes one such tool: a C++ Quantity library which, by taking advantage of a number of generally useful template programming techniques, provides a system of types to keep track of dimensions automatically. Because this library uses templates, in most cases, the run-time performance is equivalent to that of primitive numerical types but with the benefit of compile-time consistency checking.

The Problem and the Constraints

Generations of high-school science students have checked the results of their calculations for sanity by verifying that the dimensions come out right. Since the rules are relatively simple, it would make a lot of sense to automate those rules. Any search engine will confirm that the Internet is awash with interactive calculators that automate the rules, but interactive tools are not particularly useful to a programmer.

What would really be helpful, at least to readers of this journal, is something that allows compilers to do the checking. The idea is certainly nothing new (see [1] for a detailed list of references). Variations on the idea have been implemented (e.g., [1, 2, 3, 4]), but in the past there hasn’t really been much support for this technique in commonly used languages. However, with the widespread availability of classes and templates in C++, it is now possible to build suitable tools for general use. This article examines the design of a simple, portable Quantity library, which uses templates to do compile-time checking of dimensional consistency. Although this library doesn’t check units explicitly, most unit errors will be caught at compile time as well.

If it isn’t obvious how helpful and useful this checking could be, simply consider the loss of the $125 million Mars Climate Orbiter [5]. Though there were many factors involved, the root cause was that one subsystem passed numbers to another subsystem, and the two subsystems did not agree on what the units were. The expected dimensions and units of the numbers were apparently not represented in a way that allowed automatic detection of this improper use. The result was both expensive and embarrassing. Of course, most software has far less spectacular failure modes, but, even for mundane work, it still makes sense to increase reliability by using as much automatic checking as possible.

Since the dimension-checking library described in this article is intended to solve a very practical problem, I felt it appropriate to impose a very practical constraint on it. In order to ensure that as many programmers as possible can use this code, it must not depend on C++ language features that don’t work (or don’t work well) on popular compilers — regardless of whether the ISO language specification says they should work. This constraint leads to some odd and inelegant implementation strategies, but C++ is tough enough to handle this constraint and still keep the library interface reasonably clean.

The Obvious Solution

Physical quantities can be represented as a floating-point magnitude and a vector of integer exponents (see sidebar, “The SI System of Units”), so the obvious solution is to define a class that contains magnitude and exponent values:
class quantity
{
  // ...
  private:
    double m_magnitude;
    int    m_length_exponent;
    int    m_mass_exponent;
    int    m_time_exponent;
    // ...
};
Appropriate constructors and arithmetic operators could then be defined for objects of this quantity class. The actual arithmetic is fairly easy. For example, to multiply two quantities, you simply multiply the magnitudes and add the exponents (e.g., meters * meters squared = meters cubed), and each member function can check for dimensional consistency as needed.

Such a class would work fine. It would even be necessary for certain applications, such as an interactive calculator, where the exact dimensions are not known until run time. It would also be an appropriate solution in Java. However, this approach does not provide compile-time checking of dimensional consistency, and it also imposes a serious run-time overhead in both space and time relative to arithmetic with primitive numeric types.

The Better Solution

A better design (for most problems) is to define a separate class for each possible combination of dimension exponents so that you have separate types for length, area, volume, time, speed, density, and so on. This solution makes the dimension exponents part of the type rather than part of the data, so dimension exponents do not consume space or processor overhead at run time. Since this technique specifies all dimensional information at compile time, it also allows the compiler to do consistency checking.

Of course, you must define many possible types. There are seven SI dimensions (length, mass, time, electric current, thermodynamic temperature, amount of substance, and luminous intensity). Even if you make the dubious assumption that no dimension will ever have an exponent larger than four, it still leaves nine possibilities (-4 through +4) for each of the seven exponents. A distinct type is required for each of the 97 possible combinations of exponents, since each represents a different sort of physical quantity. That’s over 4 million types, and then you need to define a multiplication operator for each possible pair of those types, which means (97)*(97) distinct operators. Now double that by adding division to the mix, and you have over 40 trillion operator functions.

Fortunately, you don’t have to define them all by hand. Templates can tame this complexity, and the most basic one just wraps all the dimension exponents into a single type:
template< int D1, int D2, int D3 >
struct dims
{
  enum { dim1 = D1,
         dim2 = D2,
         dim3 = D3 };
};
(For brevity, these examples include only the first three dimensions, but the code at <www.cuj.com/code> includes all seven.) struct and enum simply allow access to the template arguments from other places; this is a fairly standard technique in template programming.

The dims class really only has one purpose, to serve as a parameter for other template classes, for example, an improved quantity class:
template< typename D >
class quantity
{
  public:
    quantity() {}
    // default copy ctor is ok
    // default operator=() is ok
    // default dtor is ok
  private:
    explicit
      quantity( double m ) :
      m_magnitude( m ) {}
    double  m_magnitude;
};
The class definition provides a default constructor so that I can create arrays of quantities, so I can use quantities in STL containers. In order to be as efficient as a naked double, the default constructor leaves the actual value undefined. The copy constructor and assignment operator are as efficient as those of a double, and they provide exactly the correct semantics: only objects whose class is the same (i.e., whose dimensions are exactly the same) can be copied and assigned. When you need variables to hold accelerations, for example, you can now declare and use them like this:
typedef dims< 1, 0, -2 > accel_d;
quantity< accel_d > lead_accel;
quantity< accel_d > feather_accel;
feather_accel = lead_accel;
A full complement of arithmetic operators is also necessary. Most of these operators could be defined as member functions, but it works out better if they are just plain old (non-member) functions. These functions are quite easy to define, although naturally they must be templated (“lhs” and “rhs” are common abbreviations for “left-hand side” and “right-hand side”):
template< typename D >
quantity< D > operator+ (
  quantity< D > lhs,
  quantity< D > rhs )
{
  return quantity< D >(
    lhs.m_magnitude +
    rhs.m_magnitude );
}
At first, it might appear easier to use the whole quantity<> type as the operator function’s template parameter (rather than just the dimension type), but then the compiler would think this definition should be used for every addition, no matter what the operand types were. The definition above restricts this operator+ to apply only to quantity<>, so you can still have operator+ defined for other types.

You can define all the unary and binary addition and subtraction operators in a similar way, so long as you declare them all to be friends of the quantity<> class so they can use the constructor they need to create their results.

Type Generators

Multiplication is a bit messier because you can mix types. Even worse, the result type of a multiply is usually different from either of the operand types (e.g., speed * time = length). This presents something of a problem: in the definition of the operator* function, what is the return type?

The solution to this problem is a “type generator,” a class whose purpose is simply to define a type. An example will make this clearer:
template< typename D_LHS,
          typename D_RHS >
struct product
{
  enum { d1 = D_LHS::dim1 +
              D_RHS::dim1,
         d2 = D_LHS::dim2 +
              D_RHS::dim2,
         d3 = D_LHS::dim3 +
              D_RHS::dim3 };
  typedef quantity< dims<
    d1, d2, d3 > > result_type;
};

template< typename D_LHS,
          typename D_RHS >
typename
product< D_LHS,
         D_RHS >::result_type
  operator*(
    quantity< D_LHS > lhs,
    quantity< D_RHS > rhs )
{
  return typename
    product< D_LHS, D_RHS >::
    result_type(
      lhs.m_magnitude *
      rhs.m_magnitude );
}
Here struct product is the type generator. Its purpose is to define the result type of a multiplication of two quantities of differing dimensions. The actual calculation is trivial, which is good because all calculations in a type generator must be done at compile time. In other words, the expressions must be compile-time constants. The trick is to use the type generator to create a name that you can use for the return type in the declaration of the operator* function. (Whenever a type is pulled out of an instantiation of a templated class or struct like this — i.e., whenever you invoke a type generator — precede it with the keyword typename to let the compiler know what you’re doing.)

Division is handled in a similar way, with its own type generators. The complete Quantity library (available at <www.cuj.com/code>) also adds definitions of integer powers and roots of quantities.

Dimensionless Results

Multiplication and division introduce another twist into the situation: the dimension exponents of the results can be all zero. In other words, the result is just a number. For example, time * frequency or speed / speed just produce a dimensionless number.

The type generators described above actually create results of type quantity< dims< 0, 0, 0> >, which is going to create a problem. You cannot assign an object of that type to a double, or vice versa, because the types don’t match. C++ allows the creation of implicit conversion functions, but because implicit conversions tend to make a program — and its bugs — very hard to understand, it’s better in the long run to make the result of things like speed / speed really come out as a number instead of as a quantity< dims< 0, 0, 0 > >.

Another type generator can handle this job, though this one needs to be a little fancier, since it uses specialization to detect the interesting special case:
template< typename D >
class collapse
{
  typedef quantity< D >
    result_type;
}

template<>
class collapse< dims< 0, 0, 0 > >
{
  typedef double result_type;
}
In effect, this is the compile-time equivalent of an if statement. Invoking collapse< D >::result_type will produce the same thing as quantity< D >, except that, if D is dims< 0, 0, 0 >, the result collapses down to a double instead.

It’s fairly simple to insert this logic into the code, as the operator functions don’t change. All that’s needed is to replace the typedef inside each operation’s type generator with this new typedef:
typedef typename
  collapse< dims< d1, d2, d3 > >::
    result_type result_type;
Since there are no implicit conversions to or from double, you must add dedicated operator* and operator/ functions to handle the cases where one operand is a quantity<> and the other is a double.

This takes care of automatically generated types, but it’s still possible that the user might (incorrectly) instantiate quantity< dims< 0, 0, 0 > > explicitly. Another template, specialized for only one value of its bool template argument and deliberately left undefined for the other, can provide compile-time detection of this error:
template< bool B >
  struct static_assert;

template<> struct
  static_assert< true >
{
  typedef char dummy;
};
(In spite of any appearance to the contrary, this technique is not called “partial template specialization” — that term is used when you specify some but not all parameters of a template.) Now, inside the class dims<> enum, you add:
not_all_zero =
  D1 != 0 || D2 != 0 || D3 != 0
and inside class quantity<>, you add:
enum { ok = D::not_all_zero };
typedef

typename static_assert< ok >::dummy
  dimensions_must_not_all_be_zero;
With these additions, if quantity< dims< 0, 0, 0 > > is ever instantiated, the compiler will generate a reference to static_assert< false >::dummy, and since that is undefined, a compile-time error will result. If you’re lucky, the diagnostic will include the long name dimensions_must_not_all_be_zero, thereby giving the user a hint as to what actually went wrong. For more extensive examples of this technique, see the static_assert library in [7].

Input/Output

An output operator is straightforward and can produce something that is as close to the official standard as you can get in plain ASCII. Just output the magnitude, and then, for each dimension, output the official abbreviation and the exponent.

An input operator is harder to implement, and it is also less important (since you can just read a double and convert it as described below). For a simple library, an input operator isn’t necessary at all.

Units and Conversions

The last crucial piece of the puzzle is a set of constants for the base units (meters, kilograms, seconds), since without these constants there is no way to get a defined value into a quantity<> variable.

C++ provides various ways of defining these constants, but one of the simplest ways is to use inline functions. There are only a handful of these base units, and each must be declared as a friend of the quantity<> class to give it access to the private constructor:
inline quantity< dims< 1, 0, 0 > > meter() { return
quantity< dims< 1, 0, 0 > >( 1.0 ); }

inline quantity< dims< 0, 1, 0 > > kilogram() { return
quantity< dims< 0, 1, 0 > >( 1.0 ); }

inline quantity< dims< 0, 0, 1 > > second() { return
quantity< dims< 0, 0, 1 > >( 1.0 ); }
The constructor arguments supplied in these functions determine the internal representation that is being used. In this case, all internal values are stored as meters, kilograms, and seconds, but that association is hidden from the user, so it doesn’t normally matter what numbers are used internally.

Now the library is finally complete enough to initialize a quantity<> variable, for example:
quantity< accel_d > free_fall;
free_fall = 9.80665 * meter() /
  ( second() * second() );
The compile-time type checking forces the units to be stated explicitly this way, so there is no way that the programmer can simply forget to specify the units. Notice that there is an intermediate expression ( second() * second() ) that has a type of “second squared,” but the templates generate that type automatically. The only time a programmer has to specify dimensions is when declaring a variable.

Definitions of other units are easy:
inline quantity< dims< 1, 1, -2 > >
  newton() { return meter() *
    kilogram() /
    ( second() * second() ); }

inline quantity< dims< 1, 0, 0 > >
  yard() { return
    0.9144 * meter(); }
and so on.

One very nice feature that falls out of this design is that you don’t need any special conversion functions. To convert a quantity<> to a particular unit, just divide by that unit. Look at it this way: to find out how many 3’s there are in 12, divide 12 by 3. To find out how many yards there are in a quantity<> x, divide x by yard(). To verify at compile time that the dimensions match, just feed the result to the constructor for double:
cout << "x = " <<
  double( x / yard() ) <<
  " yards\n";
If you forget the double constructor, and x is not a length quantity<>, you will still find out at run time that something is wrong because the output will include other units in addition to "yards".

Portability Considerations

To maximize portability, the Quantity library implementation is tweaked to use only a subset of ISO-Standard C++. Some of these tweaks are minor and subtle, while others are glaring and pervasive.

The least noticeable technique is to make parsing easier by putting all compile-time arithmetic operations inside enum declarations. Since this approach is still ISO-Standard C++, it doesn’t create problems for compliant compilers, but it makes it possible for other compilers to parse the code.

A less subtle technique is to define the operator* and operator/ functions with two quantity<> operands as member functions of class quantity<>. Again, this solution is perfectly legal ISO C++, but it avoids problems that sometimes arise in non-compliant compilers.

Because MSVC (Microsoft Visual C++) doesn’t support partial template specializations at all, the library is structured to completely avoid using them. It was pure luck that I was able to get away with this; sometimes there are workarounds, but in many cases partial template specialization is essential to the design of the program, and you have no choice but to write off any compiler that doesn’t support it.

Last, but by no means least, the operator* and operator/ functions must be friends of all instantiations of class quantity<>. Getting friend declarations to work correctly when both the class and the function were templated proved very troublesome, so to make my life easier, I used an odd technique to avoid the need for friend declarations at all. Let me say right up front that this is ugly, but the alternatives I tried were even worse.

The trick is to define a “wrapper” class called permit:
struct permit
{
  permit() {}
  explicit permit( double v ) :
    m_value( v ) {}
  operator double() const
    { return m_value; }
  double m_value;
};
This class acts like a double, but it has a distinguishable type. Then I defined a constructor and an accessor function in class quantity<> that each take permit as an argument, thus allowing non-friends to create and read the internal representation of quantity<>.

Of course this technique violates encapsulation, but it does so in a disciplined way. To penetrate the encapsulation barrier, you must explicitly create a permit object, which the library actually places inside a detail namespace to make it even more obvious that you shouldn’t be messing with it. Thus it is impossible to accidentally violate the encapsulation. Since the philosophy of C++ itself is to prevent accidents but not to try to protect against malicious mischief, I felt this was a reasonable tradeoff in return for simple, robust code that would be easy to support. (Any programmer truly determined to access the representation could already do so with either a union or a reinterpret_cast anyway.) Of course as soon as compiler support for template friends improves sufficiently, I will rip this out in a heartbeat and replace it with the correct friend declarations.

To date, this code has been compiled and run under three compilers (MSVC, gnu g++, and Metrowerks) and three operating systems (Windows NT, SunOS, and Macintosh), but the library is still a work-in-progress and I intend to support it under other popular compilers as well.

Performance

When you compile a program using this library in debug mode, the performance will be much worse than if you use raw doubles. However, things get better when you turn on the optimizer. Exactly how much better depends on the compiler. This library is still new, and there isn’t much experience with it yet, but in preliminary testing, the gnu g++ compiler produced code whose execution time was so close to that of raw doubles that the difference was less than the measurement uncertainty, indicating that it is reasonable to expect zero cost at run time.

References

[1] Gordon S. Novak, Jr. “Conversion of Units of Measurement,” IEEE Transactions on Software Engineering, August 1995, pp. 651-661, <www.cs.utexas.edu/users/novak/units95.html>.

[2] Citrus is a library and suite of demo applications for converting numbers with units into numbers with different units. <http://citrus.sourceforge.net/>.

[3] The SI Library of Unit-Based Computation, <www.fnal.gov/fermitools/abstracts/siunits/abstract.html>. This is similar to the Quantity library discussed in this article, but it has more features and is less portable.

[4] John J. Barton and Lee R. Nackman. Scientific and Engineering C++: An Introduction with Advanced Techniques and Examples (Addison-Wesley, 1994).

[5] “NASA’s Metric Confusion Caused Mars Orbiter Loss,” <www.cnn.com/TECH/space/9909/30/mars.metric/index.html>

[6] Barry N. Taylor. “Guide for the Use of the International System of Units (SI),” National Institute of Standards and Technology Special Publication 811, <http://physics.nist.gov/Document/sp811.pdf>.

[7] The Boost website provides free peer-reviewed portable C++ source libraries. <www.boost.org>.

Michael Kenniston has a B.S. in Mathematics from Worcester Polytechnic Institute and a Ph.D. in Computer Science from Stanford University. He spent 15 years as a software engineer in industry, first at Bell Laboratories and later at FutureSource; he is now bringing his real-world experience back to the world of academia as a visiting assistant professor at DePaul University. He can be contacted at [email protected].

1 2 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Dimension Checking of Physical Quantities

The Problem and the Constraints

The Obvious Solution

The Better Solution

Type Generators

Dimensionless Results

Input/Output

Units and Conversions

Portability Considerations

Performance

References

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

Dimension Checking of Physical Quantities

The Problem and the Constraints

The Obvious Solution

The Better Solution

Type Generators

Dimensionless Results

Input/Output

Units and Conversions

Portability Considerations

Performance

References

Related Reading

News

Commentary

Slideshow

Video

Most Popular

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content