May 07, 2008
Performance Portable C++

Taking full advantage of new architectures

Jeff Keasler
Performance portability means that code can achieve good performance across a range of computer architectures while maintaining a single body of source code.
Jeff is a computer scientist at Lawrence Livermore National Laboratory where he contributes to several software projects managed through the ASC program.


Programmers have two basic ways of organizing arrays of data: as a separate array for each field (Array-like) or as a single array of structures (Struct-like); see Figure 1. The performance of each choice can vary greatly as code is ported from machine to machine and compiler to compiler.

It's easy to switch between the Array-like and Struct-like implementations in Figure 1 by hiding the array details behind a class API. Listing One shows how a coordinate array is implemented as a performance-portable Point class. There are two important features of the Point class implementation:

  • Methods are inlined.
  • Methods return direct references to the underlying data.

Together, these two features let almost all compilers optimize away most (if not all) of the class overhead, especially when interprocedural analysis has been enabled in the compiler optimization flags.
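As a rough illustration of why these two features matter to client code, the sketch below uses only the Array-like variant of a Point-style class and a hypothetical helper, scale(), that is not part of the original listing. Because the accessors return references, they can appear on either side of an assignment, and the client never names the underlying storage.

```cpp
#include <vector>

// Array-like variant only, for brevity; modeled on Listing One.
class Point {
public:
   Point(const int size) : m_x(size), m_y(size) {}
   // Inlined accessors that return direct references to the data.
   inline double &x(const int idx) { return m_x[idx] ; }
   inline double &y(const int idx) { return m_y[idx] ; }
private:
   std::vector<double> m_x ;
   std::vector<double> m_y ;
} ;

// Hypothetical client routine (an assumption for illustration).
// It names only x(i) and y(i), never m_x, m_y, or any struct field.
inline void scale(Point &p, const int n, const double factor)
{
   for (int i = 0 ; i < n ; ++i) {
      p.x(i) *= factor ;   // the returned reference is modified in place
      p.y(i) *= factor ;
   }
}
```

With inlining enabled, a compiler can typically reduce p.x(i) to a plain indexed load or store, although whether it actually does so depends on the compiler and its flags.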

If you use classes having the above form, you can quickly switch between array layouts as you port code. The easiest way to do this is to create a configuration header file with system-specific layout choices, and #include that configuration file at the top of each array class header file.
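One possible shape for such a configuration header is sketched below. The file name layout_config.h, the platform macro MY_VECTOR_MACHINE, and the constant point_layout are all assumptions made for illustration; only ML_STRUCT, ML_ARRAY, and POINT_MEM come from Listing One. The header and the top of the class header are shown in a single file so the sketch compiles on its own.

```cpp
// ---- layout_config.h (hypothetical file name) ----
#ifndef LAYOUT_CONFIG_H
#define LAYOUT_CONFIG_H

#define ML_STRUCT 0
#define ML_ARRAY  1

// MY_VECTOR_MACHINE is a made-up macro standing in for whatever
// symbol the build system defines on a given platform.
#if defined(MY_VECTOR_MACHINE)
#  define POINT_MEM ML_ARRAY
#else
#  define POINT_MEM ML_STRUCT
#endif

#endif /* LAYOUT_CONFIG_H */
// ---- end of layout_config.h ----

// ---- top of the Point class header ----
// In a real project this line would be: #include "layout_config.h"
#if POINT_MEM != ML_ARRAY && POINT_MEM != ML_STRUCT
#  error "POINT_MEM must be ML_ARRAY or ML_STRUCT"
#endif

// Hypothetical constant recording which layout was selected,
// handy for logging the choice in benchmark output.
const int point_layout = POINT_MEM ;
```

Keeping one layout macro per array class in this file means a port to a new machine touches a single header rather than every class.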

If you don't hide the array implementation as I describe here, you can end up completely rewriting your software when switching from one form of array layout to the other.
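To make the cost concrete, here is a small sketch of the same operation written directly against each layout, with no class API in between. The type and function names (ArrayLayout, StructLayout, shift_array_layout, shift_struct_layout) are hypothetical. Every loop body differs in its indexing syntax, so changing layouts means editing every such loop.

```cpp
#include <cstddef>
#include <vector>

// Array-like layout, exposed directly to client code.
struct ArrayLayout {
   std::vector<double> x ;
   std::vector<double> y ;
} ;

// Struct-like layout, exposed directly to client code.
struct Coord { double x, y ; } ;
typedef std::vector<Coord> StructLayout ;

// Written against the Array-like layout: field first, then index.
void shift_array_layout(ArrayLayout &p, double dx, double dy)
{
   for (std::size_t i = 0 ; i < p.x.size() ; ++i) {
      p.x[i] += dx ;
      p.y[i] += dy ;
   }
}

// The same operation against the Struct-like layout: index first,
// then field. Nothing from the function above can be reused as-is.
void shift_struct_layout(StructLayout &p, double dx, double dy)
{
   for (std::size_t i = 0 ; i < p.size() ; ++i) {
      p[i].x += dx ;
      p[i].y += dy ;
   }
}
```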


(a) double x[10000] ; double y[10000] ; double z[10000] ;

(b) struct { double x,y,z ; } point[10000] ;

Figure 1: (a) Array-like, (b) Struct-like.

  
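#include <vector>

#define ML_STRUCT 0
#define ML_ARRAY  1

/* POINT_MEM normally comes from the configuration header. */
/* If it is not set, fall back to the Struct-like layout.  */
#ifndef POINT_MEM
#define POINT_MEM ML_STRUCT
#endif

#if POINT_MEM == ML_ARRAY
/* Array-like layout: one contiguous array per coordinate. */
class Point {
public:
   Point(const int size) : m_x(size), m_y(size) {}
   inline double &x(const int idx) { return m_x[idx] ; }
   inline double &y(const int idx) { return m_y[idx] ; }
private:
  Point() ;   /* declared but not defined: no default construction */
  std::vector<double> m_x ;
  std::vector<double> m_y ;
} ;
#else /* ML_STRUCT */
/* Struct-like layout: one array of (x, y) records. */
class Point {
public:
   Point(const int size) : m_p(size) {}
   inline double &x(const int idx) { return m_p[idx].x ; }
   inline double &y(const int idx) { return m_p[idx].y ; }
private:
  struct Coord { double x, y ; } ;
  Point() ;   /* declared but not defined: no default construction */
  std::vector<Coord> m_p ;
} ;
#endif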
Listing One
