Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Typelists & C++ Polynomial Meta-Arithmetic


August, 2004: Typelists & C++ Polynomial Meta-Arithmetic

Volodymyr Myrnyy is a member of the mathematics faculty at the Brandenburg University of Technology. He can be contacted at [email protected].


Implementations of numerical methods should provide some variety of numerical schemes applied to the mathematical model of the problem. Computing the schemes at runtime can ruin the computational efficiency, which is often very important. In ideal cases, such programs must store some constants describing a particular scheme. Of course, they can be evaluated using mathematical tools such as Mathematica, but can't always give you the desired variety and flexibility. Consequently, the question that came to my mind is: Why can't C++ compilers handle formulas analytically like Mathematica, computing necessary constants at compile time? Studying template techniques such as Andrei Alexandrescu's typelists [1] led me to suspect that this is possible. Further thinking about this led me to the following: Data of a computationally intensive program can be divided into compile-time known data and runtime known data. In this way, compilers can treat and simplify the former, then build a bridge to the runtime data using expression templates.

Consider, for example, modeling shape functions (SF) that arise in the finite element method (FEM) as polynomials. A chosen finite element approximation corresponds to a specific set of SFs defined by elements of a computational grid. The discrete weak formulation of a partial differential equation (PDE) to be solved by the FEM includes integrals of some products of these SFs, their derivatives, or a mixture of the two. The usual FEM approach uses a mapping from a grid's element to the unit element: To the segment [0,1] in one-dimensional case (1D), to the unit triangle or unit square in 2D, and to the unit tetrahedron or unit hexahedron in 3D. Then the necessary integrals over grid elements are the same integrals over unit elements (compile-time data), multiplied by corresponding Jacobi determinants (runtime data). Therefore, the compiler can evaluate the integrals over unit elements if the dimension of a problem and order of the SFs are specified as constants.

Mathematical Background

The FEM numerically solves a PDE on some domain W Rd (d=1,2,3) discretized by a computational grid T. The solution u is interpolated using SFs Ni(x), as in Example 1(a), where n is a number of vertices in the grid T, and ui are values of the solution u in these vertices. Therefore, the SF Ni equals 1 in the vertex i, and 0 in the rest of the vertices. Figure 1 presents examples of such functions of the first and second order on triangles.

For the sample model Poisson equation written in the weak form in Example 1(b), the weighting function WN defines the most popular Galerkin FEM. Then the substitution of the solution u from Example 1(a) gives the discrete formulation of the problem in Example 1(c). The latter yields the kth row of the sparse stiffness matrix. Its elements aki are only nonzero if the vertices k and i are adjacent. Taking into account given boundary conditions, the constructed linear algebraic system of equations has a unique solution {ui}. However, we are only interested in the compile-time computation of the integrals in the left side of the equation in Example 1(c), which are evaluated on the grid elements KT separately. The linear mapping KK0 from a grid element K into the unit element K0 disjoins the compile-time and the runtime data; see Example 1(d), where J is a Jacobi determinant dependent on the grid.

The next theoretical question is an automated, recursive construction of the SFs on the unit element. Example 2(a) shows the general form of a second-order shape function on a triangle. In this case, there are exactly six uniformly distributed points (degrees of freedom) on a triangle to find the unknown coefficients a0,...,a5 (Figure 2). The function Ni(x,y) equals 1 in the point i and is zero in other points. But it is not obvious how you calculate the coefficients. Example 2(b) shows a more flexible approach to defining an SF as a product of linear polynomials [3], where Lk=0 are equations of "lines" that cross all points except i. To write Example 2(a) in this form, you first need to introduce a convenient enumeration of the degrees of freedom on the unit triangle. To do this, you can use a tuple {i,j}, where i and j mean the number of the point along the x- and y-axis, respectively. Such a method would suit the unit triangle and unit square, and can be used as a triplet to the 3D case as well. So, the second order SF number {1,0} (that is, x=1/2, y=0), for instance, on the unit triangle is Example 2(c), where x=0 crosses and makes the function N1,0 zero in the points {0,0}, {0,1}, and {0,2}, while the second term x-y=1 crosses {0,2}, {1,1}, and {2,0} (see Figure 2). The integer coefficient in front ensures that the SF equals 1 in the point (1/2,0). Keeping in mind the linear terms as "lines" over degrees of freedom lets you generalize and automate the SF building process. Defining the order of an SF as m, there are three types of the "lines" on the unit triangle:

  • Parallel to y-axis: mx, mx-1,..., mx-m.
  • Parallel to x-axis: my, my-1,..., my-m.
  • Diagonal: 1-mx-my, 2-mx-my,..., m-mx-my.

The SF N0,0m consists only of the terms of the third type. The first step along the x-axis is to replace the first term of the third type with the first term of the first type. The second step along the x-axis requires two terms of the first type instead of the third type, and so on. Moving on, the y-axis corresponds to the second type terms in the same way. Example 3(a) is the general formulation. This formula remains similar in the 3D case. The diagonal terms are defined by (1-mx-my-mz),...,(m-mx-my-mz). The linear terms parallel to the corresponding axes x, y, and z are the same. So, the generalization in Example 3(a) to the d-dimensional case is straightforward.

To automate the building of an SF for the unit square, consider the SF N0,0 in Example 3(b). Its groups of the linear terms are strictly separated by the coordinate. There are no mixed diagonal terms. Moving along the x-axis, for instance, forces you to replace the first terms in Example 3(b) with the terms mx, mx-1,..., mx-m+1, which are actually negatives of the terms in Example 3(b). Although tricky, the procedure preserves the correct sign of the SF. The SF coefficient is evaluated in the same way as in Example 3(a). The generalization of the procedure to the d-dimensional case is also clear.

The differentiation and integration of an SF would certainly be simpler in the polynomial standard form. This form appears after multiplication of the linear terms and represents a sum of the products cx1p1...xdpd. The integrals of such a product over the unit d-dimensional tetrahedron T0 and hexahedron H0 are shown in Example 3(c) and 3(d). These formulas can be proved due to the induction on d. They always result in the inverse of an integer number and are identical if d=1. These are all necessary formulas to compute an integral in the right side of Example 1(d) that I'll now implement using template metaprogramming.

Numlist: A Compile-Time Array

A numlist is like a typelist [1], where the first template parameter is an integer number.

template<int Number, class T>
struct Numlist
{
   enum { value = Number };
   typedef T Tail;
};

Numlist does not define any functionality or runtime data. It holds an integer compile-time constant value and a type Tail, which can be either a next Numlist or NullType ending an array. For instance, the compile-time array {3,-2,1} would be:

Numlist<3,Numlist<-2,Numlist<1,NullType> > >

It isn't difficult to implement all the typelist's operations for the Numlist distinguishing between template parameters as integer numbers and as numlists, which are types. The Append operation, for instance, disaggregates into an addition of a number Num:

template <class NList, int Num>
struct Append;

and an addition of a numlist List to the end of NList.

template <class NList, class List>
struct AppendList;

Aside from the typelist's imported operations and linearization by a macro, Numlist lets you define pure numeric operations, including addition, multiplication, Sum, Min, Max, Sort, and more complicated operations as well (see Numlist.h, available at http://www .cuj.com/code/). Everything related to numlist operations—including its definition—is implemented in the NL namespace. It has also its own empty class, NL::NullType, to define the end of a numlist. For example, consider the template algorithms in Listing 1, which generate a numlist of the length Len with the default element Value. This is actually a simple loop from Len to 0, which is always implemented in template metaprogramming using recursion. The last specialization (Len=0) is the loop's at-end condition. Listing 2, on the other hand, is more complicated because it adds an integer Num to the element i (starting with 0) of the numlist NList and returns the new numlist. Passing nonvalid index i or an empty numlist leads to a compilation error. The class template Numlist can be a basic structure to many compile-time computations. In the next approach, it defines a polynomial term and the enumeration of the degrees of freedom.

Polynomial as a Typelist of Numlists

A polynomial is a sum of powers in one or more variables multiplied by coefficients [4]. The summands with coefficients are called "monomials"; for instance, c x1p1x2p2...xdpd; without coefficient—"polynomial terms." Implementing a polynomial term, the only integer powers p1,...,pd bear real information at compile time, since this series is ordered: The first power corresponds to x1, the second to x2, and so on. If the coefficients are assumed to be integer, then a numlist of the length Dim+1 can define a monomial, where the first element is a coefficient and the others are powers. For instance, NUMLIST_4(-3,2,0,1) represents -3x12x3. This simple algorithm constructs a monomial using InitNumlist from Listing 1:

template<unsigned short Dim,
         int Coef=0, int Power=0>
struct Monomial
{
   typedef Numlist<Coef,
    typename InitNumlist<Dim,Power>
               ::Result> Result;
};

The class template Monomial is only a builder to obtain a default monomial of a given dimension. Since Numlist is a type, the simple representation of a sum of monomials can be a typelist of numlists, which has variable length ending by NullType. For example, the linear polynomials in Example 3(a) contain exactly two monomials, except for the third line where the length of the linear terms changes depending on the dimension. The template recursion from Listing 3 constructs a typelist that represents the linear polynomial -mxd-...-mx1 (d=Dim, m=Order). The auxiliary template parameter Num is the number of summands and used as a counter from Dim to 0.

After a general polynomial has been defined in such a way at compile time, it must be used in a program. You have to be able to get its value in the given point x, which is known only at runtime. This is a good reason to apply the expression templates technique [5]. The evaluation role can be referred to an extra class template called Evaluate. It receives a polynomial as a template parameter in the form of a typelist and goes through the typelist and numlists recursively by defining an operator() as:

Typeofx operator()(Typeofx* x)
{  //call to operator() of the
   //nested Evaluate<...> classes
}

(See the additional source code at http://www.cuj.com/code/ for the full implementation.)

The most important operation on polynomials is Mult, which must be able to multiply two general polynomials given as typelists of numlists. Therefore, this template algorithm (Listing 4) contains specializations to cover all possible pairs of types that appear in a typelist of numlists. The simplest specializations are commented briefly. The product of two monomials (Mult<Numlist,Numlist>) is implemented with the help of the operation Add, which adds two numlists and also returns Numlist.

Each specialization of Mult results in a typelist, which often contains some equal polynomial terms. Mathematically stated, the polynomial has to be simplified. The idea of the template algorithm Simplify (Listing 5) is not tricky. For the given Term, the auxiliary algorithm SumAndDel looks for monomials with the same term in TList, counts the sum of their coefficients, and erases them. The algorithm Simplify applies SumAndDel to a current Numlist and the following tail of the typelist modified by the previous call of SumAndDel. If the sum of the coefficients is zero, then the monomial must be omitted. This case is determined by the operation Loki::Select<T0,T1,T2> [1], which is just a metaimplementation of the constraint: if T0 then T1 else T2.

Shape Function Construction

An SF is a product of linear polynomials defined uniquely by a dimension, order, and specific number of the degree of freedom. Consequently, an SF can be obtained by a sequential application of the operation Mult (Listing 4). However, if you need to evaluate a value of an SF, then the cheapest way is to use the source form as a product. Since the length of such a product is variable, Typelist comes in handy again. In this case, the convention proposed by Alexandrescu [2] must be overridden. In other words, a typelist can receive a typelist as the first template parameter, although the nature of Typelist remains unchanged. A typelist can appear at the first place only when representing a polynomial. The reason for this involves how the Mult operation distinguishes a typelist as a sum from a typelist as a product at its end. The solution is to supplement the Loki library [1] with the new empty type, UnitType. It can only appear at the end of a typelist of typelists, which defines a product of polynomials. As a consequence, the Loki::TL::Append operation must be completed with the specializations for UnitType in the same way as the specializations with NullType. Then the operation Mult obtains some additional specializations (Listing 6), which transforms a product of polynomials to a polynomial in the form of a typelist of numlists ending with NullType. To multiply higher order SFs efficiently, the implementation can be optimized by a substitution of the algorithm Simplify applied to the inner operations Mult.

For brevity, I consider only the construction of an SF on the unit triangle. The construction of an SF on the unit square based on Example 3(b) is similar. An SF is created by two lists of linear terms. The first one—from the third line in Example 3(b)—is produced by the template BasicList3 from Listing 7, which adds a constant to the result of BasicTerm and repeats that process Num times recursively.

The algorithm ShapeBasis (Listing 8) builds the second list. Its linear terms contain exactly two summands: A linear monomial mxi (i=CurVar, m=Order) and a constant, which varies from 0 to -Num+1; see the first and second lines of Example 3(b).

The construction algorithm ShapeFunction3 (Listing 9) receives a point number in the form of the numlist NList with the length Dim. There are two predefined template parameters for internal use only. The first one is a counter for the numlist (N=1...Dim). The second one computes the sum of elements in the numlist. The algorithm recurses through this numlist, taking a current number of the linear terms from ShapeBasis concerning variable N. Keeping in mind that an SF on a triangle contains exactly m linear terms makes the end specialization clear: The created typelist of typelists receives (Order-Sum) terms from the BasicList3. Since all coefficients in Example 3(b) are fractions with a unit nominator, the only denominator is calculated using the algorithm Factorial. The latter is the first classic example in the papers on template metaprogramming [6]. But don't forget to specialize 0!=1; otherwise, Examples 3(a) though 3(c) fail.

Differentiation and Integration

Integration over a unit triangle reduces a complicated typelist of numlists to a compile-time constant using Example 3(c). But before starting to integrate, consider the differentiation of a polynomial required in Example 1(d). The corresponding algorithm (Listing 10) implements the usual differentiation rule of the polynomial TList with respect to the variable number Var (1...Dim). If the power pow of Var in the current monomial NList is zero, then the monomial does not come into Result, else the derivative is calculated by sequential use of AddAt and MultAt applied to the powers and the coefficient of the monomial, respectively.

The integration algorithm Integrate3 (Listing 11) is the first one, handling noninteger constants. This is implemented using static functions, because some compilers do not accept the compile-time calculation of static constants. IntegrateTerm3 realizes Example 3(c), integrates a polynomial term, and saves the fraction as two integers—numerator and denominator. Integrate3 goes through a polynomial and sums up the fractions as real numbers.

The template algorithms Differentiate and Integrate3 both receive a typelist of numlists. Therefore, to apply them to a single SF, its product form must be expanded to the polynomial standard form. The template Expand (Listing 12) is actually an interface to the algorithm Mult. The long approach ends with an example of computing the integral in Example 1(d) of two sample second-order SFs over the unit triangle. There is no runtime data in this program. The whole FEM matrix assembling requires, of course, the integration of all pairs of the SFs that can be automated due to template algorithms as well.

How long will such programs be compiled? The template algorithms I present here have been compiled and tested using GNU GCC versions 3.3.2 and 2.96 as well Intel C++ 7.1 under Linux Fedora 1. Since the complexity of Mult is at least O(2m) and of Simplify is in the worst case O(22m), the order of SFs must be relatively small if you don't want to see your powerful workstation crying (for m>3 by my experience).

Conclusion

C++ class templates can be convenient tools to model mathematical objects, schemes, and formulas. In this article, I've concentrated on building shape functions as polynomials in any dimension of any order, their differentiation and integration over a unit triangle and square. That provides a variety of finite element method approximation schemes without additional runtime costs, which is a basis of an efficient scientific code.

References

  1. [1] Alexandrescu, A. Modern C++ Design, Addison-Wesley, 2001.
  2. [2] Alexandrescu, A. "Typelists and Applications," CUJ, February 2002 (http://www.cuj.com/documents/s=7986/cujcexp2002alexandr/ alexandr.htm).
  3. [3] Felippa, C.A. Introduction to Finite Element Methods, Department of Aerospace Engineering Sciences, University of Colorado at Boulder, 2003 (http://caswww.colorado.edu/courses.d/IFEM.d/).
  4. [4] Wolfram's "Mathworld: Polynomial," October 3, 2004 (http://mathworld.wolfram.com/Polynomial.html).
  5. [5] Veldhuizen, T. "Expression Templates," C++ Report, June 1995 (http://osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html).
  6. [6] Veldhuizen, T. "Using C++ Template Metaprograms," C++ Report, May 1995 (http://osl.iu.edu/~tveldhui/papers/Template-Metaprograms/ meta-art.html).


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.