June 20, 2006

Microarchitecture Performance

(Page 6 of 6)

Doubling Throughput of Streaming SIMD Extension Instructions

Streaming SIMD extension instructions, also known as SSE, SSE2, and SSE3 instructions, accelerate a range of applications, including video, speech, image and photo processing, encryption, and financial, engineering, and scientific applications. Today, almost all servers execute these 128-bit instructions at a sustained rate of one complete instruction every two clock cycles: the lower 64 bits are executed in one clock cycle, and the upper 64 bits are executed in the next.
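The two-cycle split can be pictured with a small software model. The sketch below is illustrative only and assumes a 128-bit packed add over four 32-bit lanes; the function name `packed_add_split` and the lane layout are hypothetical and do not describe any particular processor's internals.

```python
# Illustrative model only: a 128-bit packed add treated as four 32-bit lanes.
# Assumption: two 32-bit lanes fit in each 64-bit half of the register.
LANES_PER_HALF = 2


def packed_add_split(a, b):
    """Add two 4-lane vectors in two passes, as a 64-bit-wide unit would."""
    assert len(a) == len(b) == 4
    result = [0.0] * 4
    cycles = 0
    # First pass handles the lower 64 bits, second pass the upper 64 bits.
    for start in (0, LANES_PER_HALF):
        for lane in range(start, start + LANES_PER_HALF):
            result[lane] = a[lane] + b[lane]
        cycles += 1  # each 64-bit half costs one modeled clock cycle
    return result, cycles


if __name__ == "__main__":
    total, cycles = packed_add_split([1.0, 2.0, 3.0, 4.0],
                                     [10.0, 20.0, 30.0, 40.0])
    print(total, "in", cycles, "modeled cycles")
```

In this model, every packed add costs two cycles regardless of the data, which is the behavior the paragraph above describes.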

However, wide dynamic execution now allows four 32-bit instructions (instead of three instructions) to be executed in a single clock cycle. This opens an opportunity for greater parallelism inside the execution core.

By moving to floating-point mathematics and improving methodology, one manufacturer is already delivering a microarchitecture that executes two 64-bit operations in a single clock cycle. This means that 128-bit instructions can be executed at a throughput rate of one full instruction per clock cycle (see Figure 4). Since floating-point mathematics can be performed faster than in previous-generation processors, this approach effectively doubles the speed of execution for SIMD-extension instructions.
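The doubling claim is simple steady-state arithmetic, sketched below. This is a rough model under stated assumptions: it counts only sustained throughput, ignores latency, dependencies, and memory effects, and the function names and the instruction count are hypothetical.

```python
def cycles_needed(num_instructions, cycles_per_instruction):
    """Steady-state cycle count for a stream of 128-bit SIMD instructions."""
    return num_instructions * cycles_per_instruction


def throughput_speedup(old_cycles_per_instruction, new_cycles_per_instruction):
    """Ratio of old to new cycle cost; 2.0 means twice the throughput."""
    return old_cycles_per_instruction / new_cycles_per_instruction


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of 128-bit instructions in a loop
    split = cycles_needed(n, 2)  # one instruction every two clock cycles
    full = cycles_needed(n, 1)   # one instruction every clock cycle
    print(split, full, throughput_speedup(2, 1))
```

Real workloads rarely consist purely of SIMD instructions, so the measured gain for a whole application would typically be smaller than this per-instruction ratio.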

Figure 4(a): Doubling throughput of SIMD extension instructions. Typical industry execution of streaming SIMD extension instructions splits each 128-bit instruction into two 64-bit halves, taking two clock cycles.

Figure 4(b): Doubling throughput of SIMD extension instructions. The advanced microarchitecture fully executes 128-bit streaming SIMD extension instructions at a throughput rate of one per clock cycle, doubling execution speed.

New Standards for Energy-efficient Performance

In response to industry's growing concern with energy efficiency, not just performance, Intel has developed and implemented advanced and unique techniques in microarchitecture. With state-of-the-art microarchitecture, desktops can now deliver greater compute performance as well as ultra-quiet, sleek and low-power designs. Servers can deliver greater compute density, and laptops can take the increasing compute capability of multi-core to new mobile form factors. The result is a new generation of high-quality, scalable, energy-efficient platforms for the desktop, server, and mobile markets.

For More Information

To learn more about energy-efficient performance at Intel, go to http://www.intel.com/technology/eep/index.htm?ppc_cid=c98.
