Programming High-Performance DSPs: Part 3


Power Optimization for Embedded Systems Programmers
Despite the importance of power consumption and memory use, relatively little emphasis has been placed on optimizing power and memory for embedded applications. This paper will provide some guidelines on optimizing embedded applications for power.

Just as code size and speed affect cost, so does power consumption. The more power an embedded application consumes, the larger the battery required to drive it. For a portable application, this can make the product more expensive, unwieldy, and undesirable. To reduce power, you need to make the application run in as few cycles as possible, because each cycle consumes a measurable amount of energy. In this sense, performance and power optimization appear similar: consuming the fewest cycles serves both goals. The two strategies share similar goals but have subtle differences, as will be shown shortly.

The real power optimization gains, however, come from how data is accessed before the embedded CPU processes it. Most of the power consumed in an embedded application comes not from the CPU but from the processes used to get data from memory to the CPU. Each time the CPU accesses external memory, buses are turned on, and other functional units must be powered on and used to deliver the data. This is where the majority of power is consumed. If the programmer designs the application to minimize the use of external memory, move data into and out of the CPU efficiently, and use the cache efficiently to prevent thrashing, the overall power consumption of the application will be reduced significantly. Figure 16 shows the two main power contributors. The compute block includes the CPU; this is where the algorithmic functions are performed. The memory transfer block is where the application uses the memory subsystems, and it is where the majority of the power is consumed.


Figure 16. The main power contributors for an embedded application are in the memory transfer functions, not in the compute block. (From PowerEscape)
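
As a rough illustration of using the cache efficiently, the sketch below sums the same matrix in two traversal orders. The matrix dimensions and function names are hypothetical, chosen only for this example. Both functions return the same result, but the row-order version reads memory with unit stride, which on a typical cached memory hierarchy tends to cause fewer cache misses and therefore fewer trips to external memory. The actual savings depend on the device and its cache configuration.

```c
#include <stddef.h>

#define ROWS 64   /* hypothetical dimensions, chosen for illustration */
#define COLS 64

/* Strided traversal: consecutive reads are COLS elements apart, so
   successive accesses are likely to fall in different cache lines. */
long sum_column_order(const int *m)
{
    long total = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            total += m[r * COLS + c];
    return total;
}

/* Unit-stride traversal: consecutive reads are adjacent in memory, so
   each cache line that is fetched is fully used before moving on. */
long sum_row_order(const int *m)
{
    long total = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            total += m[r * COLS + c];
    return total;
}
```

The matrix is passed as a flat, row-major array so that the access pattern is explicit in the index expression `r * COLS + c`.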

LAST RESORT - ASSEMBLY LANGUAGE
Many times, the C code can be modified slightly to alleviate this situation, but it can take time and several iterations to get the optimal (or close to optimal) solution. The process of refining code in this manner is shown in Figure 17. The last resort is coding the algorithm in assembly language. Assembly language is harder to write, understand, and maintain. Tools have been developed that make it easier for assembly language programmers to write efficient code for superscalar and VLIW processors. Assembly language optimizers, for example, allow the programmer to write serial assembly language and then optimize it into software pipelined loops automatically.


Figure 17. Code optimization process
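
One example of a slight C-level change that can help the compiler is the C99 `restrict` qualifier. The sketch below is an assumption about the kind of modification meant here, not a technique taken from the article; the function names are hypothetical. In the first version the compiler must assume that `out` may overlap `a` or `b`, which can prevent it from overlapping loads and stores across iterations. The second version promises that the arrays do not overlap, which gives the compiler more freedom to schedule and software pipeline the loop. Whether it actually does so depends on the toolchain and optimization settings.

```c
#include <stddef.h>

/* Baseline: the compiler must assume out may alias a or b, so a store
   to out[i] could change a later a[i+1] or b[i+1]. */
void scale_add(float *out, const float *a, const float *b,
               float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * k + b[i];
}

/* Same computation, but restrict tells the compiler that the three
   arrays do not overlap. The caller must honor that promise. */
void scale_add_restrict(float *restrict out,
                        const float *restrict a,
                        const float *restrict b,
                        float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * k + b[i];
}
```

The two functions produce identical results for non-overlapping arrays; the only way to see the effect of the change is to compare the assembly output of each, as the refinement process in Figure 17 suggests.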

CONCLUSION
Real-time programmers have always had to develop a library of tricks to make software run as fast as possible. As processors become more complicated, this becomes more difficult. For superscalar and VLIW processors, managing two separate pipelines and ensuring the highest amount of parallelism requires tool support. Optimizing compilers help overcome many of the obstacles of these powerful new processors, but even compilers have limitations. Real-time programmers should not trust the compiler to perform all of the necessary optimizations; the compiler needs help. The main steps to follow are:

  1. Study the assembly language produced by the compiler. In many instances, subtle changes to the structure of the C code can make a big difference in the assembly language the compiler generates. This can make the difference in the real-time performance of the system.
  2. Use the DMA capabilities, especially for the data-intensive number-crunching applications common in DSP systems. The DMA can take a huge burden off the CPU and help manage data efficiently.
  3. Keep the pipelines full. The whole reason superscalar and VLIW processors were invented was to take advantage of parallelism. Look for areas of inefficiency in the assembly language and make modifications that allow both pipelines to be used at their full efficiency. This requires an understanding of what the compiler looks for in terms of pipelining opportunities. It also requires an understanding of the application. Many times, just rearranging the algorithm can make it run more efficiently on the processor.
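
As a minimal sketch of rearranging an algorithm to expose parallelism (step 3), the dot product below is written twice. The function names and the 16-bit data format are assumptions made for illustration. The serial version has a single accumulator, so every multiply-accumulate depends on the previous one. The second version splits the work across two independent accumulators, which gives a compiler for a two-pipeline machine two dependency chains it can schedule side by side. Whether the generated code actually uses both pipelines should be confirmed by studying the assembly output, as in step 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Serial form: one accumulator, so each iteration waits on the last. */
int32_t dot_serial(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)a[i] * b[i];
    return acc;
}

/* Rearranged form: even and odd elements feed separate accumulators,
   creating two independent chains of multiply-accumulates.
   This sketch assumes n is even. */
int32_t dot_two_acc(const int16_t *a, const int16_t *b, size_t n)
{
    assert(n % 2 == 0);
    int32_t acc0 = 0;
    int32_t acc1 = 0;
    for (size_t i = 0; i < n; i += 2) {
        acc0 += (int32_t)a[i] * b[i];
        acc1 += (int32_t)a[i + 1] * b[i + 1];
    }
    return acc0 + acc1;
}
```

With integer data the two forms give identical results as long as the sums do not overflow. With floating-point data the reordered additions can change rounding, so the rearrangement should be checked against the accuracy requirements of the application.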

References

  1. TMS320C62x Programmer's Guide, Texas Instruments, 1997
  2. Computer Architecture: A Quantitative Approach, by John L. Hennessy and David A. Patterson, copyright 1990 by Morgan Kaufmann Publishers, Inc., Palo Alto, CA
