Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.



Itanium 2 Developer Days Diary


Making Sense of Microarchitecture

The main difference between conventional processors and the Intel Itanium 2 microprocessor is Explicitly Parallel Instruction Computing (EPIC), which shifts the responsibility for maximizing parallelism from the processor to the compiler. Where processors built on the Reduced Instruction Set Computing (RISC) or Complex Instruction Set Computing (CISC) models must find parallelism themselves at run time, in the EPIC model the compiler, aware that there are multiple execution units, groups instructions that can run in parallel into bundles. The processor then executes the bundles in parallel without runtime analysis.
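As a rough illustration of what "parallel-ready" means, the C sketch below contrasts a loop whose additions form one dependent chain with a loop whose additions are independent of one another. The function names are hypothetical and nothing here is specific to an Itanium compiler; it only shows the kind of independence a compiler looks for when deciding which instructions can be grouped.

```c
#include <stddef.h>

/* One accumulator: every add depends on the result of the previous add. */
long sum_chain(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Four accumulators: the four adds in each pass do not depend on each other,
 * so they are candidates for being issued side by side. */
long sum_independent(const int *v, size_t n)
{
    long a = 0, b = 0, c = 0, d = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a += v[i];
        b += v[i + 1];
        c += v[i + 2];
        d += v[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        a += v[i];
    return a + b + c + d;
}
```

Both functions return the same sum; the second simply exposes more independent work per loop pass.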

The Leap to EPIC: Architecture Highlights

  • The compiler orchestrates predication, allowing instructions to be executed conditionally and reducing the performance hits caused by branch mispredicts in RISC-based systems.
  • The Intel Itanium 2 compiler recognizes that there are multiple execution units; the compiler groups instructions that can be performed in parallel, making them ready for execution without runtime analysis.
  • Instruction scheduling moves from the processor into the compiler, allowing the compiler to produce code that takes full advantage of on-chip resources.
  • Intel Itanium 2 microarchitecture has 128 general-purpose and 128 floating-point registers, versus the 32 general-purpose and 32 floating-point registers found in most RISC-based systems.
  • Intel Itanium 2 processors use only the registers they need rather than the 8 registers that RISC-based systems take whether they need them or not.
  • Intel Itanium 2 microarchitecture has more units that execute instructions.
  • Two-way pipelines pre-load data ahead of possible over-writes, resulting in fewer flushes, fewer problems, and increased reliability and performance.
  • Software pipelining. Combining speculation, explicit parallelism, predicated execution, and rotating registers with loop branch instructions allows:

    • Efficiently pipelined loops
    • Smaller code
    • Reduced latency
    • Elimination of copied code for prologue or epilogue
    • Increased parallelism
    • More Level 1 cache memory
    • Shorter wait times
    • Greater I/O bandwidth

—R.D.
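To show what the sidebar means by a pipelined loop with a prologue and an epilogue, here is a minimal hand-written sketch in C. The function names are hypothetical, and this is source-level imitation only: on the Itanium 2, the sidebar says the compiler achieves the overlap with rotating registers and loop branch instructions, without the copied prologue and epilogue code that appears here.

```c
#include <stddef.h>

/* Straightforward loop: each iteration loads, multiplies, then stores. */
void scale_simple(const int *in, int *out, size_t n, int k)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * k;
}

/* Hand-pipelined loop: the load for iteration i+1 overlaps the
 * multiply and store for iteration i. */
void scale_pipelined(const int *in, int *out, size_t n, int k)
{
    if (n == 0)
        return;

    int loaded = in[0];                 /* prologue: fill the pipeline */
    for (size_t i = 0; i + 1 < n; i++) {
        int next = in[i + 1];           /* stage 1 of iteration i+1 */
        out[i] = loaded * k;            /* stage 2 of iteration i */
        loaded = next;
    }
    out[n - 1] = loaded * k;            /* epilogue: drain the pipeline */
}
```

The two functions produce identical output; the second just makes the overlapping stages, and the extra code they cost when written by hand, visible.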

The Intel Itanium 2 microprocessor has other speed-enhancing features in addition to the EPIC paradigm; in most cases, the compiler exploits them automatically. For example, non-EPIC processors use branch prediction to speed up processing. Encountering a code branch, x86 chips don't wait around to find out which way to go. They "guess." Branch prediction algorithms are almost always right, but in the highly branched code typical of data- and calculation-intensive computing, even a tiny percentage of wrong guesses can add up to big performance hits, because each wrong guess forces the processor to discard the work it did speculatively and restart from the branch.

The Intel Itanium 2 processor still uses branch prediction, but it adds predication to avoid misprediction penalties: each possible path of a branch runs in parallel, and the results from the paths not taken are discarded. The processor provides predicate bits that can be set to "true" or "false," and a predicated instruction takes effect only if its predicate is true. The compiler chooses which branches are suitable for predication and generates the code that sets the bits. All developers have to do is recompile for the Intel Itanium 2 processor to make use of predication.
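The idea can be sketched in C, with the caveat that this is only an analogy: on the Itanium 2 the compiler applies predication at the instruction level, not in source code. The function names below are hypothetical. The first version branches; the second computes a predicate and uses it to select between the two candidate results, with no branch to mispredict.

```c
#include <stdint.h>

/* Branching form: the processor has to predict which arm will run. */
int32_t clamp_branch(int32_t x, int32_t limit)
{
    if (x > limit)
        return limit;
    return x;
}

/* Predicated form: both candidates are available, and a predicate
 * decides which one is kept. */
int32_t clamp_predicated(int32_t x, int32_t limit)
{
    int32_t p = (x > limit);    /* predicate: 1 if true, 0 if false */
    int32_t mask = -p;          /* all bits set when p == 1, zero otherwise */
    return (limit & mask) | (x & ~mask);
}
```

Both functions return the same value for every input; the difference is only in how the choice is made.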

Optimal use of the Intel Itanium 2 microarchitecture's extensive onboard memory caches is also critical to maximizing performance. "The idea is to arrange program execution so that needed instructions and data are in L1 cache as much as possible," said HP's Dick Nicholson at the Developer Days conference sponsored by the Itanium Solutions Alliance. "In a best case/worst case comparison, a program whose data is always in Level 1 cache when needed will run much faster than a program whose data always has to be fetched from main memory."
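One common way to arrange execution so that data tends to be in cache is to walk arrays in the order they are laid out in memory. The C sketch below is a generic illustration, not something taken from the conference talk; the function names and the flat row-major layout are assumptions. Both functions compute the same sum, but the first touches adjacent memory on every step while the second jumps a full row ahead each time.

```c
#include <stddef.h>

/* Cache-friendly: visits elements in the order they sit in memory. */
double sum_row_major(const double *m, size_t rows, size_t cols)
{
    double total = 0.0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Cache-hostile: consecutive accesses are cols * sizeof(double) bytes apart. */
double sum_column_major(const double *m, size_t rows, size_t cols)
{
    double total = 0.0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```

On a large matrix, timing the two functions is a simple way to see how much access order alone can matter; the actual difference depends on the cache sizes and the matrix dimensions.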


Figure 1: Intel Itanium 2 microarchitecture.

