Dr. Dobb's is part of the Informa Tech Division of Informa PLC




Multi-core MPEG-4 Video Encode Partitioning



Efficient Multi-Core Partitioning
Efficient partitioning of complex algorithms such as video encoders requires combining the two partitioning techniques described above with the ability to assign tasks to processors at run time rather than compile time wherever appropriate. To overcome the limitations of data partitioning, the blocks processed by individual processors need to be smaller than a slice, which introduces data dependencies that must be handled. The granularity may be a single macroblock (MB) or a small group of MBs. Making data partitioning finer-grained and combining it with data pipelining creates a large number of individual tasks. This large pool of tasks, which can be allocated to processors at run time, is the key to using multi-core architecture resources efficiently.
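As a rough illustration of macroblock-level tasks being handed to processors at run time, the following Python sketch simulates a ready list over a grid of MBs. It is not Cradle's scheduler: the dependency rule (each MB waits for its left and top-right neighbours), the one-time-step cost per task, and the name `schedule_macroblocks` are all assumptions made for the example.

```python
from collections import deque

def schedule_macroblocks(mb_cols, mb_rows, num_procs):
    """Simulate run-time assignment of macroblock tasks to processors.

    Assumed (hypothetical) dependency rule: MB (x, y) needs its left
    neighbour (x-1, y) and its top-right neighbour (x+1, y-1), clipped to
    the top neighbour in the last column. Every task takes one time step.
    Returns a list of time steps, each a list of (x, y) tasks run in it.
    """
    def deps(x, y):
        d = []
        if x > 0:
            d.append((x - 1, y))
        if y > 0:
            d.append((min(x + 1, mb_cols - 1), y - 1))
        return d

    # Number of unfinished dependencies per MB, and the reverse edges.
    pending = {(x, y): len(deps(x, y))
               for y in range(mb_rows) for x in range(mb_cols)}
    dependents = {mb: [] for mb in pending}
    for mb in pending:
        for d in deps(*mb):
            dependents[d].append(mb)

    ready = deque(mb for mb, n in pending.items() if n == 0)
    steps = []
    while ready:
        # Each free processor takes one ready task; extra tasks wait.
        batch = [ready.popleft() for _ in range(min(num_procs, len(ready)))]
        steps.append(batch)
        # Finishing a task may release tasks that depended on it.
        for mb in batch:
            for child in dependents[mb]:
                pending[child] -= 1
                if pending[child] == 0:
                    ready.append(child)
    return steps

if __name__ == "__main__":
    for t, batch in enumerate(schedule_macroblocks(4, 3, num_procs=8)):
        print(t, batch)
```

Under these assumptions, a 4x3 grid of MBs completes in 8 time steps when enough processors are available, against 12 steps on a single processor. The gain is bounded by the dependencies, not by the processor count, which is why task definition matters as much as the scheduler.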

Many challenges stand in the way of this approach. How do you define tasks to minimize data dependencies? How do you order tasks so that a new task is always ready when a processor becomes free, even though the processing requirements of some tasks vary drastically with the data being processed? How do you keep the task-switching overhead (the time between a processor completing one task and starting the next) small? How do you make the partitioning scalable, so that a variable number of processors can be assigned to one algorithm depending on the other algorithms running in parallel and their respective processing requirements? How do you ensure that each processor has enough fast memory to process its tasks efficiently, given that some tasks need much more memory than others and that fast memory is limited and shared across many processors?

The answers to many of these questions depend on the application being targeted, the multi-core architecture being used, and the software libraries and tools the processor vendor provides for developing and debugging code on that architecture. In this article we focus on how Cradle implemented the MPEG-4 encoder on the CT3600 chip, and we offer partial answers to many of these questions. We start with a brief overview of the Cradle CT3600 architecture and the structure of video encoders like MPEG-4. We then discuss in detail how the MPEG-4 encoder was partitioned on the CT3600 architecture.

The CT3600 MDSP family
The Cradle CT3600 family of Multi-core DSP (MDSP) processors is a family of heterogeneous multi-core chips, accompanied by an easy-to-use multi-core programming system that comprises development, debug and profiling capabilities. One platform can be reprogrammed for any or all of the multi-channel, multi-application products.

The Cradle CT3600 architecture has up to 8 RISC processors and 16 DSPs. It is a shared-data-memory architecture in which every processing element has its own instruction memory and 32-bit wide register file. Cradle defines a group of 4 RISC processors as a Quad. Associated with each Quad are 8 DSPs, 128 KB of shared data memory and nine 8-bit Programmable I/O Ports, each embedding a CPLD and state machine (Figure 1).
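The resource counts above can be tallied in a few lines of Python. This is only bookkeeping on the figures quoted in the text. The `Quad` class, the `chip_summary` helper, and the even per-processor memory split are assumptions for illustration; the article does not say that shared memory is divided evenly.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quad:
    """Resources associated with one Quad, as quoted in the text."""
    risc: int = 4
    dsp: int = 8
    shared_data_kb: int = 128
    pio_ports: int = 9

    @property
    def processors(self):
        return self.risc + self.dsp

def chip_summary(num_quads=2):
    """Totals for a chip built from num_quads identical Quads."""
    quad = Quad()
    return {
        "risc": num_quads * quad.risc,
        "dsp": num_quads * quad.dsp,
        "processors": num_quads * quad.processors,
        "shared_data_kb": num_quads * quad.shared_data_kb,
        # Hypothetical even split, only to show how tight fast memory is.
        "avg_kb_per_processor": quad.shared_data_kb / quad.processors,
    }

if __name__ == "__main__":
    print(chip_summary(2))
```

Two Quads give the 8 RISC processors and 16 DSPs quoted as the upper limit, 24 processors in total. If the 128 KB of a Quad were split evenly across its 12 processors, each would have under 11 KB, which illustrates why memory-hungry tasks have to be scheduled with care.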


Figure 1: CT3616 architecture block diagram

Global resources include a PCI Bus interface and DDR-SDRAM controller with multiple DMA channels, Global Semaphores and bus-performance monitors.

Co-designed with the processor architecture is the Cradle SDK, a Software Development Kit that includes a multi-core simulator and debugger. All 24 processors and all I/Os can either be simulated or accessed directly in hardware through a JTAG or PCI interface.
