Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.



Parallel

Microarchitecture Performance


Intelligent Cache

With an increase in transistor density, manufacturers such as Intel can build significantly more cache for each core. This increases the probability that each execution core can access data from the faster, more efficient cache subsystem. Advanced parallelism in the microarchitecture then optimizes the use of that cache to reduce latency to frequently used data.
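The effect of a higher hit probability can be made concrete with the standard average-memory-access-time estimate: hit time plus miss rate times miss penalty. The sketch below is illustrative only. The function name and the cycle counts are assumptions chosen for the example, not figures from Intel.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time: hit_time + miss_rate * miss_penalty.

    All latencies are in cycles; miss_rate is a fraction between 0 and 1.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers: a 3-cycle cache hit and a 200-cycle trip to main memory.
small_cache = average_access_time(hit_time=3, miss_rate=0.10, miss_penalty=200)
large_cache = average_access_time(hit_time=3, miss_rate=0.02, miss_penalty=200)
print(small_cache, large_cache)
```

Under these assumed numbers, cutting the miss rate from 10 percent to 2 percent lowers the average access from roughly 23 cycles to roughly 7, which is why a larger cache can pay off even when its hit time stays the same.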

Each execution core now has a dedicated L1 cache for data specific to that core. Since more data is available locally, fewer fetches are made outside the processor, and traffic on the system bus is reduced. This reduces memory latency and accelerates data flow. All cores then share a larger L2 cache for common data, which makes better use of cache resources.
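The arrangement described above, a private L1 per core in front of one shared L2, can be modeled in a few lines. This is a simplified sketch, not a description of any real processor: the class and function names, the cache sizes, and the LRU replacement policy are all assumptions, and coherence and writes are ignored.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny fully associative cache of line addresses with LRU eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, line):
        """Return True on a hit; on a miss, install the line and return False."""
        if line in self.lines:
            self.lines.move_to_end(line)
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[line] = True
        return False


def count_memory_fetches(trace, num_cores=2, l1_lines=4, l2_lines=16):
    """Replay (core, line) accesses; return how many went to main memory."""
    l1 = [LRUCache(l1_lines) for _ in range(num_cores)]  # one private L1 per core
    l2 = LRUCache(l2_lines)                              # one L2 shared by all cores
    fetches = 0
    for core, line in trace:
        if l1[core].access(line):
            continue      # served by the core's own L1
        if l2.access(line):
            continue      # served by the shared L2
        fetches += 1      # missed both levels: fetch over the system bus
    return fetches


# Hypothetical workload: core 0 reads eight lines, then core 1 reads the same eight.
trace = [(0, n) for n in range(8)] + [(1, n) for n in range(8)]
print(count_memory_fetches(trace))
```

In this workload only core 0's first pass reaches main memory. Core 1 misses in its own L1 but finds every line already in the shared L2, which is the "common data" case the shared cache is meant to serve.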

The advanced parallelism takes work traditionally done in the processor architecture and performs it at the micro level: within each core, between cores, and at the memory level. Since this method uses fewer hardware elements in the server platform, power requirements are also reduced. The result is higher performance together with better energy efficiency.

Dynamic Allocation of L2 Cache

Another optimization Intel uses is dynamic allocation of the shared L2 cache, based on each core's requirements. Each core can now use up to 100 percent of the available L2 cache. If one core has minimal cache requirements, the other core can increase its share of the L2 cache (Figure 3). This helps decrease cache misses and reduce latency.

Dynamic allocation of L2 cache also lets each core obtain data from the cache at higher throughput than previous-generation architectures allowed. This raises processor efficiency, improving both absolute performance and performance per watt, a critical benefit for servers.

Figure 3: Dynamic allocation of L2 cache, based on each core's requirements.
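The benefit of letting one core take more of the L2 when the other needs little can be seen in a small simulation. The sketch below compares a fixed half-and-half split with a single pooled cache. A pooled LRU cache is used here only as a stand-in for dynamic allocation. The function names, the 8-line capacity, and the workload are assumptions, and the actual hardware policy is not described in this article.

```python
from collections import OrderedDict


def lru_misses(trace, capacity):
    """Count misses for a sequence of line addresses in an LRU cache."""
    lines = OrderedDict()
    misses = 0
    for line in trace:
        if line in lines:
            lines.move_to_end(line)
            continue
        misses += 1
        if len(lines) >= capacity:
            lines.popitem(last=False)  # evict the least recently used line
        lines[line] = True
    return misses


def static_split_misses(trace, total_lines, num_cores=2):
    """Each core gets a fixed, equal slice of the L2."""
    per_core = total_lines // num_cores
    return sum(
        lru_misses([line for core, line in trace if core == c], per_core)
        for c in range(num_cores)
    )


def shared_misses(trace, total_lines):
    """All cores draw from one pool, so a busy core can take most of it."""
    # Tag each line with its core so the two cores' data never alias.
    return lru_misses([(core, line) for core, line in trace], total_lines)


# Hypothetical workload: core 0 loops over 6 lines, core 1 reuses a single line.
trace = []
for _ in range(10):
    for line in range(6):
        trace.append((0, line))
        trace.append((1, 0))

print(static_split_misses(trace, 8), shared_misses(trace, 8))
```

With the fixed split, core 0's six-line loop does not fit in its four-line slice and misses on every access, while core 1 leaves most of its slice idle. In the pooled cache all seven distinct lines fit, so only the first touch of each line misses.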

Challenges and Approaches To Memory Access

No matter how much cache is put in the system, data must still be fetched from main memory to go into the cache. The industry has explored many techniques to speed up that main memory access, from designing a hardware-based memory controller into the processor, to optimizing memory access through more flexible designs and methodologies.

Each set of techniques has its benefits. However, using a single hardware-based memory technology means that a design cannot easily take advantage of newer, more advanced techniques for improving memory access. The better designs use architectures flexible enough to support multiple memory technologies, to meet any requirements in the system.

These advanced designs use intelligent memory access to optimize the use of the available data bandwidth from the memory subsystem, and to hide the latency of memory accesses. As a result, data can be used as quickly as possible and is located as close as possible to where it's needed. Intelligent memory access minimizes latency and significantly improves the efficiency and speed of memory accesses.
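One common way to hide memory latency is to fetch data before it is requested. The sketch below models a simple next-line prefetcher. It is a generic illustration, not the mechanism used in any particular Intel design. The function name is hypothetical, and the model assumes an unbounded cache and that every prefetched line arrives before it is needed.

```python
def demand_misses(trace, prefetch_next=False):
    """Count accesses that must wait on main memory.

    Models an unbounded cache so only the effect of prefetching is visible.
    With prefetch_next=True, every access also pulls in the following line,
    and the model assumes that line arrives before it is needed.
    """
    cached = set()
    misses = 0
    for line in trace:
        if line not in cached:
            misses += 1           # the core stalls until memory responds
            cached.add(line)
        if prefetch_next:
            cached.add(line + 1)  # fetched in the background, off the critical path
    return misses


sequential = range(64)          # lines 0, 1, 2, ...
strided = range(0, 640, 10)     # every tenth line
print(demand_misses(sequential), demand_misses(sequential, prefetch_next=True))
print(demand_misses(strided), demand_misses(strided, prefetch_next=True))
```

For the sequential stream, only the first access waits on memory once prefetching is enabled. The strided stream gains nothing from a next-line policy, which suggests why a design that can adapt its access strategy is preferable to a single fixed technique.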

