The computer industry continually needs smaller, faster, more efficient, and more capable servers. However, the fundamental challenge is in delivering that improved performance without increasing the system's power requirements. Today, the challenge is compounded as CMOS manufacturing processes scale toward physical, atomic limits. Complicated physics and breakthrough manufacturing processes are now required to approach the performance-power problem from two directions:
- Increase the amount of work being done per clock cycle.
- Deliver this increase in performance without increasing power requirements.
The industry has tried a variety of techniques to resolve those challenges. The best implementations combine an optimized microarchitecture, better transistor technologies, an increased number of execution cores, advanced memory technologies, and faster data access.
The new implementations take advantage of the increased compute density delivered through breakthrough 65-nm process technology. The additional compute density (twice as many transistors in the same physical footprint) has made it possible to take parallelism down to the level of individual execution cores. The result is a new generation of energy-efficient systems, such as servers with improved instruction throughput that can respond faster to network demands.
Performance Foundation for Microarchitecture
Contrary to popular perceptions, performance is not based solely on clock frequency or on the number of instructions executed per clock cycle (IPC). Performance is the product of both clock frequency and IPC. To increase performance, one must increase either frequency, or IPC, or both. In today's designs, manufacturers are focusing not just on overall system architecture, but on microarchitecture improvements to deliver this performance in an energy-efficient form.
Performance = Frequency x (Instructions per clock cycle)
It is not always practical to improve both the frequency and the IPC. However, increasing one and holding the other close to constant can still achieve a significantly higher level of performance over previous-generation architectures. It is also possible to increase performance by reducing the number of instructions required to execute specific tasks.
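The relationship above can be sketched in a few lines of Python. This is only an illustration: the function names and all clock, IPC, and instruction-count figures are hypothetical and are not measurements of any real processor.

```python
def performance(frequency_hz: float, ipc: float) -> float:
    """Instructions executed per second: frequency x instructions per clock."""
    return frequency_hz * ipc


def execution_time(instruction_count: float, frequency_hz: float, ipc: float) -> float:
    """Seconds needed to run a task made up of instruction_count instructions."""
    return instruction_count / performance(frequency_hz, ipc)


# Hypothetical figures, chosen only for illustration.
baseline = performance(3.0e9, 1.0)
wider_core = performance(3.0e9, 1.5)    # same clock, 50% higher IPC
faster_clock = performance(4.5e9, 1.0)  # same IPC, 50% higher clock

print("Speedup from higher IPC:      ", wider_core / baseline)
print("Speedup from higher frequency:", faster_clock / baseline)

# Reducing the instructions a task needs shortens it at unchanged frequency and IPC.
print("Task time, 6e9 instructions:", execution_time(6.0e9, 3.0e9, 1.0))
print("Task time, 3e9 instructions:", execution_time(3.0e9, 3.0e9, 1.0))
```

In this simple model, raising IPC by 50 percent and raising frequency by 50 percent give the same speedup; the difference between the two approaches shows up in their power cost.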
In today's markets, power consumption is a critical challenge, and can be expressed as:
Power consumption = Dynamic capacitance x Voltage^2 x Frequency
Capacitance is the ratio of the electrostatic charge on a conductor to the potential difference between the conductors required to maintain that charge. Dynamic capacitance, in this context, is the capacitance that must be charged and discharged as the circuits switch in order to maintain IPC efficiency. Voltage is the supply voltage delivered to the transistors and I/O buffers. Frequency is the rate, typically expressed in GHz, at which the transistors and signals switch.
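Because voltage enters the power relationship as a squared term, it has a larger effect on power than frequency does. The following sketch makes that concrete; the function name and the capacitance, voltage, and frequency values are assumptions picked for illustration, not figures for any actual device.

```python
def dynamic_power(capacitance_farads: float, voltage_volts: float, frequency_hz: float) -> float:
    """Power consumption in watts: dynamic capacitance x voltage squared x frequency."""
    return capacitance_farads * voltage_volts ** 2 * frequency_hz


# Hypothetical operating point, chosen only for illustration.
nominal = dynamic_power(1.0e-9, 1.2, 3.0e9)

# Reduce frequency by 10% at the same voltage.
lower_frequency = dynamic_power(1.0e-9, 1.2, 2.7e9)

# Reduce voltage by 10% at the same frequency.
lower_voltage = dynamic_power(1.0e-9, 1.08, 3.0e9)

print(f"Nominal power: {nominal:.2f} W")
print(f"10% lower frequency: {lower_frequency / nominal:.2f}x nominal power")
print(f"10% lower voltage:   {lower_voltage / nominal:.2f}x nominal power")
```

A 10 percent reduction in frequency lowers power by 10 percent, while a 10 percent reduction in voltage lowers it by about 19 percent, since 0.9 squared is 0.81.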
The challenge is in balancing IPC efficiency and dynamic capacitance with the required voltage and frequency, to optimize for both performance and power efficiency. The goal is to deliver microarchitectures that offer increased compute density for a given footprint and improved performance per watt, while remaining energy-efficient overall.
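Combining the performance and power relationships gives a rough way to compare designs by performance per watt. The sketch below is a simplified model under stated assumptions: both designs are hypothetical, they share the same dynamic capacitance, and every number is invented for illustration.

```python
def performance_per_watt(ipc: float, capacitance_farads: float,
                         voltage_volts: float, frequency_hz: float) -> float:
    """Instructions per second delivered for each watt consumed."""
    performance = frequency_hz * ipc
    power = capacitance_farads * voltage_volts ** 2 * frequency_hz
    return performance / power


# Hypothetical design A: high clock, higher voltage, lower IPC.
design_a = performance_per_watt(ipc=1.0, capacitance_farads=1.0e-9,
                                voltage_volts=1.3, frequency_hz=3.6e9)

# Hypothetical design B: lower clock, lower voltage, higher IPC.
design_b = performance_per_watt(ipc=1.5, capacitance_farads=1.0e-9,
                                voltage_volts=1.1, frequency_hz=2.4e9)

print(f"Design B delivers {design_b / design_a:.2f}x the performance per watt of design A")
```

In this model frequency cancels out of the ratio, so performance per watt improves only by raising IPC or by lowering dynamic capacitance or voltage. In practice, a lower frequency often permits a lower supply voltage, which is one reason that improving IPC rather than clock speed can be attractive.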