Parallel

Microarchitecture Performance

By Ram Ramanathan, June 20, 2006

Improved throughput in energy-efficient designs for multicore processors and other high-performance systems

In his 11 years with Intel, Ram Ramanathan has held positions ranging from engineering to management. He has received four patents and has 10 patents pending in the areas of networking and security. Ram holds a master's degree in mathematics from Madurai Kamaraj University in India.

The computer industry continually needs smaller, faster, more efficient, and more capable servers. However, the fundamental challenge is in delivering that improved performance without increasing the system's power requirements. Today, the challenge is compounded as CMOS manufacturing processes scale toward physical, atomic limits. Complicated physics and breakthrough manufacturing processes are now required to approach the performance-power problem from two directions:

Increase the amount of work being done per clock cycle.
Deliver this increase in performance without increasing power requirements.

The industry has tried a variety of techniques to resolve those challenges. The best of these implementations are a combination of optimized microarchitecture, better transistor technologies, an increased number of execution cores, advanced memory technologies, and faster data access.

The new implementations are taking advantage of the increased compute density being delivered though breakthrough 65-nm process technology. The additional compute density--twice as many transistors in the same physical footprint--has made it possible to take parallelism down to the level of individual execution cores. The result is a new generation of energy-efficient systems, such as servers with improved instruction throughput that can respond faster to network demands.

Performance Foundation for Microarchitecture

Contrary to popular perceptions, performance is not based solely on clock frequency or on the number of instructions executed per clock cycle (IPC). Performance is the product of both clock frequency and IPC. To increase performance, one must increase either frequency, or IPC, or both. In today's designs, manufacturers are focusing not just on overall system architecture, but on microarchitecture improvements to deliver this performance in an energy-efficient form.

Performance = Frequency x (Instructions per clock cycle)

It is not always practical to improve both the frequency and the IPC. However, increasing one and holding the other close to constant can still achieve a significantly higher level of performance over previous-generation architectures. It is also possible to increase performance by reducing the number of instructions required to execute specific tasks.

In today's markets, power consumption is a critical challenge, and can be expressed as:

Power consumption = Dynamic capacitance x Voltage2 x Frequency

Dynamic capacitance is the ratio of the electrostatic charge on a conductor to the potential difference between the conductors required to maintain that charge. This is the dynamic capacitance required to maintain IPC efficiency. Voltage is the voltage that the transistors and I/O buffers are supplied with. Frequency is the GHz frequency that the transistors and signals are switching at.

The challenge is in balancing IPC efficiency and dynamic capacitance with the required voltage and frequency, to optimize for performance and power efficiency. The goal is to deliver microarchitectures that have an increased compute density for a given footprint, an improved performance per watt, and yet are still energy-efficient.

1 2 3 4 5 6 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Parallel

Microarchitecture Performance

Performance Foundation for Microarchitecture

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Parallel Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

Parallel

Microarchitecture Performance

Performance Foundation for Microarchitecture

Related Reading

News

Commentary

Slideshow

Video

Most Popular

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Parallel Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content