June 20, 2006
Microarchitecture Performance

Improved throughput in energy-efficient designs

Ram Ramanathan
Improved throughput in energy-efficient designs for multicore processors and other high-performance systems
In his 11 years with Intel, Ram Ramanathan has held positions ranging from engineering to management. He has received four patents and has 10 patents pending in the areas of networking and security. Ram holds a master's degree in mathematics from Madurai Kamaraj University in India.


The computer industry continually needs smaller, faster, more efficient, and more capable servers. However, the fundamental challenge is in delivering that improved performance without increasing the system's power requirements. Today, the challenge is compounded as CMOS manufacturing processes scale toward physical, atomic limits. Complicated physics and breakthrough manufacturing processes are now required to approach the performance-power problem from two directions:

  • Increase the amount of work being done per clock cycle.
  • Deliver this increase in performance without increasing power requirements.

The industry has tried a variety of techniques to resolve those challenges. The best of these implementations are a combination of optimized microarchitecture, better transistor technologies, an increased number of execution cores, advanced memory technologies, and faster data access.

The new implementations take advantage of the increased compute density delivered through breakthrough 65-nm process technology. The additional compute density--twice as many transistors in the same physical footprint--has made it possible to take parallelism down to the level of individual execution cores. The result is a new generation of energy-efficient systems, such as servers with improved instruction throughput that can respond faster to network demands.

Performance Foundation for Microarchitecture

Contrary to popular perception, performance is not based solely on clock frequency, nor solely on the number of instructions executed per clock cycle (IPC). Performance is the product of clock frequency and IPC. To increase performance, one must increase frequency, IPC, or both. In today's designs, manufacturers are focusing not just on overall system architecture, but on microarchitecture improvements to deliver this performance in an energy-efficient form.

Performance = Frequency x (Instructions per clock cycle)

It is not always practical to improve both the frequency and the IPC. However, increasing one and holding the other close to constant can still achieve a significantly higher level of performance over previous-generation architectures. It is also possible to increase performance by reducing the number of instructions required to execute specific tasks.
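The relation above can be sketched in a few lines of Python. This is only an illustration: the function names and the frequency, IPC, and instruction-count values are hypothetical and are not measurements of any real processor.

```python
def performance(frequency_hz: float, ipc: float) -> float:
    """Instructions per second: Performance = Frequency x IPC."""
    return frequency_hz * ipc


def execution_time(instruction_count: float, frequency_hz: float, ipc: float) -> float:
    """Seconds needed to retire a given number of instructions."""
    return instruction_count / performance(frequency_hz, ipc)


if __name__ == "__main__":
    # Hypothetical baseline: 3 GHz core retiring 1.0 instruction per cycle.
    baseline = performance(3.0e9, 1.0)
    # Hold frequency constant and raise IPC (for example, a wider core).
    wider = performance(3.0e9, 1.5)
    print("speedup from higher IPC:", wider / baseline)

    # Reducing the instruction count for a task also shortens run time,
    # even when frequency and IPC are unchanged.
    print("time, 12e9 instructions:", execution_time(12.0e9, 3.0e9, 1.0))
    print("time,  9e9 instructions:", execution_time(9.0e9, 3.0e9, 1.0))
```

The last two calls model the final point in the paragraph: a task that needs fewer instructions finishes sooner at the same frequency and IPC.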

In today's markets, power consumption is a critical challenge, and can be expressed as:

Power consumption = Dynamic capacitance x Voltage² x Frequency

Dynamic capacitance is the ratio of the electrostatic charge on a conductor to the potential difference between the conductors required to maintain that charge; here, it is the capacitance required to maintain IPC efficiency. Voltage is the supply voltage delivered to the transistors and I/O buffers. Frequency is the rate, measured in GHz, at which the transistors and signals switch.
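Because voltage enters the power relation squared, a modest voltage reduction has an outsized effect. The following Python sketch makes that concrete. The function names and scaling factors are hypothetical, chosen only for illustration. It also assumes dynamic capacitance stays fixed and that voltage and frequency can be scaled independently, which real designs only approximate.

```python
def dynamic_power(capacitance_f: float, voltage_v: float, frequency_hz: float) -> float:
    """Power consumption = Dynamic capacitance x Voltage^2 x Frequency (watts)."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


def relative_power(voltage_scale: float, frequency_scale: float) -> float:
    """Power relative to a baseline when voltage and frequency are scaled.

    Assumes dynamic capacitance is unchanged (an illustrative simplification).
    """
    return voltage_scale ** 2 * frequency_scale


if __name__ == "__main__":
    # Hypothetical operating point: 1 nF switched capacitance, 1.2 V, 3 GHz.
    print("baseline watts:", dynamic_power(1.0e-9, 1.2, 3.0e9))

    # Lowering frequency alone by 10% reduces power roughly linearly.
    print("frequency -10%:", relative_power(1.0, 0.9))

    # Lowering voltage and frequency together by 10% compounds the saving.
    print("voltage and frequency -10%:", relative_power(0.9, 0.9))
```

Under these assumptions, a 10 percent cut in both voltage and frequency leaves about 73 percent of the original power, while performance (at constant IPC) falls by only 10 percent.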

The challenge is to balance IPC efficiency and dynamic capacitance against the required voltage and frequency, optimizing for both performance and power efficiency. The goal is to deliver microarchitectures that offer greater compute density for a given footprint and improved performance per watt while remaining energy-efficient.
