August 12, 2001
Winning the Passing Game

Max Fomitchev
One of the most common approaches to improved performance is to optimize the most frequent operations. Even though the benefit resulting from optimizing a single operation is small, if that operation is called frequently, it can result in substantial improvement. Although the idea is simple, finding the right operations to optimize can be a challenge.
Optimize to the Max

One frequent operation that is commonly overlooked is the subroutine or class method call. Calls to subroutines or class methods may make up 50 percent, or even as much as 80 percent, of the source code. The call itself is not very interesting, but other operations happen almost every time a subroutine is called: the operations related to parameter passing.

In the good old days, when procedural languages reigned supreme, parameter passing was simple. All it took was to push the values onto the stack and create a stack frame:

push A
push B
call proc
...
; proc
push ebp ; stack frame
mov ebp,esp ; created here
; do something
mov esp,ebp ; stack frame
pop ebp ; destruction
ret

This is still a very typical scenario for a C/C++ program because many routines operate on "simple" (i.e., non-object) parameters. Even if the routine is declared with the __inline modifier and inlined into the calling code, the parameters are still pushed onto the stack and a stack frame is created. The only change is that the call instruction is replaced by the subroutine code and the ret instruction is eliminated. Given the design of modern CPUs, it is easy to see why inlining by itself provides little performance improvement: in most cases, call and ret instructions are processed in effectively zero cycles thanks to successful branch prediction and instruction prefetching. Excessive inlining, on the other hand, may blow up the size of the code and ultimately reduce performance by increasing the likelihood of instruction cache misses. Perhaps the only reason to use inlining is when the routines are extremely compact (just a few operations) and called very frequently. Good examples are CString methods in C++ and complex-number arithmetic.
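As a rough illustration of the kind of compact, frequently called routine described above, here is a minimal sketch of complex multiplication. The Complex struct and the mul() function are hypothetical names, not taken from any library, and the standard inline keyword is used in place of the Visual C++ __inline spelling so that the sketch compiles with other compilers.

```cpp
// Hypothetical two-field complex number; not a library type.
struct Complex {
    double re;
    double im;
};

// Four multiplications and two additions: small enough that call overhead
// could rival the useful work. `inline` is only a request to the compiler.
inline Complex mul(const Complex& a, const Complex& b) {
    Complex r;
    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    return r;
}
```

Whether the compiler actually inlines mul() depends on its optimization settings, so the effect is best confirmed in the generated assembly.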

Making the Most of Your Registers

There is another modifier that can be helpful: __fastcall asks the compiler to pass subroutine parameters in registers. This eliminates memory operations such as pushing parameters onto the stack and reading them back through the stack frame. Also, instructions that operate solely on registers execute faster in the internal CPU pipeline. However, the x86 architecture often does not have enough registers to accommodate all the values: Visual C++, for example, passes only the first two arguments that fit in 32 bits in ECX and EDX and pushes the rest onto the stack.
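A minimal sketch of how __fastcall might be applied follows. The FASTCALL macro and the add3() function are hypothetical; the macro expands to __fastcall only for 32-bit x86 builds with Visual C++ and to nothing elsewhere, because the keyword is a Microsoft extension that has no effect on other targets.

```cpp
// Hypothetical portability macro: apply __fastcall only where it matters
// (Visual C++ targeting 32-bit x86); expand to nothing everywhere else.
#if defined(_MSC_VER) && defined(_M_IX86)
#  define FASTCALL __fastcall
#else
#  define FASTCALL
#endif

// With Visual C++ on x86, a and b arrive in ECX and EDX;
// c is still pushed onto the stack.
int FASTCALL add3(int a, int b, int c) {
    return a + b + c;
}
```

Callers use add3() like any other function; only the generated calling sequence changes.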

Visual C++ also has the /Oy compiler option (frame-pointer omission). It suppresses stack frame creation, which saves a few instructions and frees the EBP register for general use; /Oy- turns the optimization back off. Though the advantage is small, it is still an advantage: in the scarce pool of x86 registers, an extra register can be a big asset.

Simple parameters are only a part of the problem. Most programs use objects heavily and pass them as parameters frequently. Where there are objects, one finds constructors, destructors, and quite often memory allocation. And did I mention local variables? Consider what happens in the following code sample:

void foo(CString S)
{
    CString S2;
    ...
}
...
CString S1;
foo(S1);

First the constructor for S1 is called. Then the copy constructor for S is called (which also allocates memory using the new operator). Then the constructor for S2 is called, and the subroutine does its work. On exit, the destructor for S2 is called (which releases allocated memory using the delete operator), followed by the destructor for S (which again releases allocated memory using the delete operator). What if there are more parameters? And what if they are complex objects with complex constructors, or destructors that, among other things, allocate or free memory? And what about all those local variables? It is clear that the overhead can be quite substantial. Is there a workaround? Of course: pass objects by reference and avoid, minimize, or consolidate local variables, or make them static. Given these guidelines, the foo() routine can be rewritten as:

void foo(const CString& S)
{
    static CString S2;
    ...
}
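
The difference between passing by value and passing by reference can be observed directly. The sketch below uses a hypothetical Tracked class as a stand-in for CString; it does nothing except count how often its copy constructor runs. All names here are illustrative assumptions.

```cpp
// Hypothetical stand-in for a string class such as CString; it only counts
// how many times its copy constructor runs.
struct Tracked {
    static int copies;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

// By value: the argument is copy-constructed (and later destroyed) per call.
void by_value(Tracked t) { (void)t; }

// By const reference: no copy constructor and no extra destructor call.
void by_reference(const Tracked& t) { (void)t; }

// Number of copies made by n calls that pass the object by value.
int copies_by_value(int n) {
    Tracked s;
    Tracked::copies = 0;
    for (int i = 0; i < n; ++i) by_value(s);
    return Tracked::copies;
}

// Number of copies made by n calls that pass the object by reference.
int copies_by_reference(int n) {
    Tracked s;
    Tracked::copies = 0;
    for (int i = 0; i < n; ++i) by_reference(s);
    return Tracked::copies;
}
```

For a real string class each of those copies may also involve a heap allocation and a matching release, which is where the cost described above comes from.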

There is nothing wrong with using static local objects, though you must remember to initialize or clear them explicitly every time the routine is called, because their contents persist between calls. Local variable consolidation, on the other hand, is now considered bad practice because it violates code separability. For instance, if two routines foo() and faa() each rely on a local CString variable, the two locals can be consolidated into one by defining a single global CString, but both routines then depend on the same shared state.
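
The need to reset a static local on every call can be sketched as follows. The example uses std::string rather than CString so that it compiles without MFC, and make_greeting() is a hypothetical routine invented for illustration.

```cpp
#include <string>

// Hypothetical routine that reuses one static buffer across calls instead
// of constructing and destroying a local string each time.
const std::string& make_greeting(const std::string& name) {
    static std::string buffer;  // constructed once, on the first call
    buffer.clear();             // reset: contents survive between calls
    buffer += "Hello, ";
    buffer += name;
    return buffer;
}
```

Without the clear() call the second invocation would append to the text left by the first. Note also that the returned reference refers to the shared buffer, so its contents change on the next call.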

Also keep in mind that static and global variables are not thread safe. If several threads call the same function that uses a static variable, the value of that variable is indeterminate unless explicit synchronization (e.g., locks such as mutexes) is employed. Though global variables are out of favor and carry some risks, there is no reason not to consider them when performance really matters (or rather, there is no reason why compilers should not attempt to consolidate local object-type variables automatically).
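
One possible way to add the explicit synchronization mentioned above is to guard the static variable with a mutex. This is a minimal sketch using the C++11 standard library; next_id() and its variables are hypothetical names, and the locking cost has to be weighed against whatever the static variable was meant to save.

```cpp
#include <mutex>

// Hypothetical routine whose static state is shared by every calling thread.
int next_id() {
    static std::mutex guard;  // protects `last`
    static int last = 0;
    std::lock_guard<std::mutex> lock(guard);  // unlocked when `lock` is destroyed
    return ++last;
}
```

Each caller receives a distinct value even when several threads call next_id() at the same time, because the increment happens while the mutex is held.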

A Winning Strategy

To improve the performance of subroutine and method calls: pass parameters in registers (the __fastcall modifier in Visual C++); pass objects by reference; and reduce the use of expensive local variables, either by consolidating them into globals or, to preserve code separability, by making them static.
