April 26, 2006
Testing Times

Resolving temporal costs at the level of individual machine cycles

(Page 1 of 4)
Richard Vaughan
Using the RDTSC instruction, applications can determine the number of machine cycles that have elapsed since the CPU was last reset, which can be very useful for building a test harness.
Richard Vaughan provides development, training, and consultancy services in London. He can be contacted at http://www.dodeca.co.uk.

The film Fantastic Voyage demonstrates microscopy through a provocative premise: the miniaturization of the observer. Appealing as this is, however, it would not yield definitive knowledge of biological processes because the behaviour of, say, a single blood cell does not reflect the properties of blood as a whole. Similarly, the RDTSC machine instruction, supported by Intel processors from the Pentium onwards, proffers microscopy of time, in that it allows the number of cycles consumed during execution of a section of code to be determined exactly. However, the use of RDTSC is by no means simple, and a number of often-subtle issues must be observed if accurate, consistent, and representative results are to be gathered.

Using RDTSC

The RDTSC instruction reads the processor's time-stamp counter, which holds the number of machine cycles that have elapsed since the CPU was last reset, and places the 64-bit value in EDX:EAX. Subtracting the value obtained just before a code sequence from that obtained once the sequence has completed yields the number of cycles consumed between the two points, and the figure can then be divided by the clock frequency to yield elapsed real time if desired. Given this, Listing One should yield the number of cycles required to construct an instance of a simple class (note that it reads just the lower 32 bits of the counter).

class Simple
   {
   private: int Attrib;
   public:  Simple (int Value) : Attrib (Value) { }
   };

void F ()
   {
   unsigned StartTicks = 0;
   unsigned StopTicks  = 0;

   __asm
      {
      rdtsc;                  // Read timestamp counter
      mov StartTicks, eax;    // Copy counter value into variable,
                              // uses lower 32 counter-bits
      }

   Simple MySimple (0);

   __asm
      {
      rdtsc;                  // Read timestamp counter
      mov StopTicks, eax;     // Copy counter value into second variable
      }

   unsigned Cycles = StopTicks - StartTicks;
   ...
   }

Listing One: Determining the number of cycles required to construct an instance of a simple class.
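
Listing One keeps only the lower half of the counter. As a minimal sketch of the arithmetic described above, the helpers below show how the EDX:EAX pair might be joined into one 64-bit value, how two 32-bit readings subtract, and how a cycle count divides by the clock frequency. The function names are hypothetical and do not come from the original listing.

```cpp
#include <cstdint>

// Hypothetical helper: join the two 32-bit halves that RDTSC leaves in EDX:EAX.
inline std::uint64_t CombineCounter(std::uint32_t Edx, std::uint32_t Eax)
{
    return (static_cast<std::uint64_t>(Edx) << 32) | Eax;
}

// Hypothetical helper: cycles between two lower-32-bit readings, as in Listing One.
// Unsigned subtraction wraps modulo 2^32, so the result stays correct even if
// the lower half of the counter rolls over once between the readings.
inline std::uint32_t ElapsedCycles32(std::uint32_t StartTicks, std::uint32_t StopTicks)
{
    return StopTicks - StartTicks;
}

// Hypothetical helper: convert a cycle count to seconds for a given clock rate.
inline double CyclesToSeconds(std::uint64_t Cycles, double ClockHz)
{
    return static_cast<double>(Cycles) / ClockHz;
}
```

With an assumed 2 GHz clock, CyclesToSeconds (2000000000, 2.0e9) gives one second. The 32-bit form is only safe for intervals shorter than 2^32 cycles, which is roughly two seconds at that clock rate.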

However, while this will always give an accurate count of the cycles consumed, results can be inconsistent across repeated executions of the test, and additional steps must be taken to yield representative data. Firstly, RDTSC is a "non-serializing" instruction, which means that out-of-order instruction execution could cause it to return a value before the test sequence has completed. It is therefore necessary to force the CPU to complete all operations before executing RDTSC, and while there are several instructions that can be used to effect this, the most suitable by far is CPUID. This should be executed immediately before the first RDTSC, and immediately before the second.
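
The CPUID-then-RDTSC pairing can be sketched as follows. This assumes GCC or Clang on an x86 target, so it uses their extended inline-assembly syntax and not the __asm blocks of Listing One. ReadSerializedCounter and CountCycles are hypothetical names, and the non-x86 branch exists only so that the sketch compiles elsewhere.

```cpp
#include <chrono>
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
// Assumption: GCC or Clang on x86. Executes CPUID, then RDTSC, and returns
// the full 64-bit counter from EDX:EAX.
inline std::uint64_t ReadSerializedCounter()
{
    std::uint32_t Eax = 0, Ebx, Ecx = 0, Edx;
    // CPUID is serializing: earlier instructions must complete before it retires.
    asm volatile("cpuid"
                 : "+a"(Eax), "=b"(Ebx), "+c"(Ecx), "=d"(Edx)
                 :
                 : "memory");
    // RDTSC overwrites EAX and EDX with the low and high counter halves.
    asm volatile("rdtsc" : "=a"(Eax), "=d"(Edx) : : "memory");
    return (static_cast<std::uint64_t>(Edx) << 32) | Eax;
}
#else
// Fallback for other targets: a monotonic clock tick, NOT a cycle count.
inline std::uint64_t ReadSerializedCounter()
{
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
}
#endif

// Hypothetical harness: a serialized reading on either side of the test sequence.
template <typename Fn>
std::uint64_t CountCycles(Fn&& Test)
{
    const std::uint64_t Start = ReadSerializedCounter();
    Test();
    const std::uint64_t Stop = ReadSerializedCounter();
    return Stop - Start;
}
```

A call such as CountCycles ([] { Simple MySimple (0); }) would time the constructor from Listing One. The figure also includes the cost of the second CPUID, which the sketch makes no attempt to subtract.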

A second consideration is that interrupts may occur during a test, which will increase the apparent execution time for a test sequence when the CPU is temporarily redirected into an interrupt handler. It is therefore necessary to use the CLI instruction to disable interrupts before executing a test sequence, and to then re-enable them using STI once the test has completed (although you need the appropriate privilege level for this to work).
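
CLI and STI typically fault when executed by ordinary user-mode code, so they cannot be demonstrated in a sketch that runs as a normal application. A common substitute, which is not the technique described above, is to repeat the measurement and keep the smallest figure, on the assumption that the least-disturbed run is the one least likely to have been interrupted. LeastDisturbed is a hypothetical name.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: return the smallest of several cycle counts taken for
// the same test sequence. Samples inflated by an interrupt handler are larger
// than undisturbed ones, so the minimum filters them out.
// Returns 0 for an empty set of samples.
inline std::uint64_t LeastDisturbed(const std::vector<std::uint64_t>& Samples)
{
    if (Samples.empty())
        return 0;
    return *std::min_element(Samples.begin(), Samples.end());
}
```

Given the samples 120, 118, 9050, and 119, the helper returns 118 and discards the run that was evidently interrupted. Unlike disabling interrupts, this only reduces the chance of a disturbed result. It does not eliminate it.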
