tm1300 NXP Semiconductors, tm1300 Datasheet - Page 71

Manufacturer Part Number: tm1300
Description: TM-1300 Media Processor
Manufacturer: NXP Semiconductors
Datasheet: 1.TM1300.pdf (533 pages)


Philips Semiconductors

Figure 4-4. Straightforward code for MPEG frame reconstruction.

    void reconstruct (unsigned char *back,
                      unsigned char *forward,
                      char *idct,               /* signed 8-bit IDCT values */
                      unsigned char *destination)
    {
        int i, temp;

        for (i = 0; i < 64; i += 1)
        {
            /* rounded average of the two predictions, plus the IDCT term */
            temp = ((back[i] + forward[i] + 1) >> 1) + idct[i];

            /* saturate to the unsigned 8-bit pixel range */
            if (temp > 255)
                temp = 255;
            else if (temp < 0)
                temp = 0;

            destination[i] = temp;
        }
    }

A straightforward coding of the reconstruction algorithm might look as shown in Figure 4-4. This implementation shares many of the undesirable properties of the first example of byte-matrix transposition. The code accesses memory a byte at a time instead of a word at a time, which wastes 75% of the available bandwidth. Also, in light of the many quad-byte-parallel operations introduced in Section 4.1.2, “Introduction to Custom Operations,” it seems inefficient to spend three separate additions and one shift to process a single eight-bit pixel. Perhaps even more unfortunate for a VLIW processor like TM1300 is the branch-intensive code that performs the saturation testing; eliminating these branches could reap a significant performance gain.

Since MPEG decoding is the kind of task for which TM1300 was created, there are two custom operations, quadavg and dspuquadaddui, that exactly fit this important MPEG kernel (and other kernels). These custom operations process four pairs of 8-bit pixel values in parallel. In addition, dspuquadaddui performs saturation tests in hardware, which eliminates any need to execute explicit tests and branches.

For readers familiar with the details of MPEG algorithms, the use of eight-bit IDCT values later in this example may be confusing. The standard MPEG implementation calls for nine-bit IDCT values, but extensive analysis has shown that values outside the range [-128..127] occur so rarely that they can be considered unimportant. Pursuant to this observation, the IDCT values are clipped into the eight-bit range [-128..127] with saturating arithmetic before the frame reconstruction code runs. The assumption that this saturation occurs permits some of TM1300’s custom operations to have clean, simple definitions.

The first step in seeing how custom operations can be of value in this case is to unroll the loop by a factor of four. The unrolled code is shown in Figure 4-5. This creates code that is parallel with respect to the four pixel computations. As is easily seen in the code, the four groups of computations (one group per pixel) do not depend on each other.
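The unrolling step can be sketched in portable C as follows. This is an illustrative sketch only, not the datasheet's own unrolled listing: the names reconstruct_unrolled and clamp_u8 are hypothetical, and idct is declared as signed char on the assumption that the IDCT values are signed eight-bit quantities, as described above.

```c
#include <assert.h>

/* Hypothetical helper: the saturation test of Figure 4-4 as a function. */
static int clamp_u8(int v)
{
    if (v > 255) return 255;
    if (v < 0)   return 0;
    return v;
}

/* Illustrative unrolling by four; assumes 64 pixels per block,
   as in Figure 4-4. idct is assumed to hold signed 8-bit values. */
void reconstruct_unrolled(const unsigned char *back,
                          const unsigned char *forward,
                          const signed char *idct,
                          unsigned char *destination)
{
    int i;

    for (i = 0; i < 64; i += 4) {
        /* Four groups of computations, one per pixel. No group reads
           a value that another group writes. */
        int t0 = ((back[i + 0] + forward[i + 0] + 1) >> 1) + idct[i + 0];
        int t1 = ((back[i + 1] + forward[i + 1] + 1) >> 1) + idct[i + 1];
        int t2 = ((back[i + 2] + forward[i + 2] + 1) >> 1) + idct[i + 2];
        int t3 = ((back[i + 3] + forward[i + 3] + 1) >> 1) + idct[i + 3];

        destination[i + 0] = (unsigned char)clamp_u8(t0);
        destination[i + 1] = (unsigned char)clamp_u8(t1);
        destination[i + 2] = (unsigned char)clamp_u8(t2);
        destination[i + 3] = (unsigned char)clamp_u8(t3);
    }
}
```

Because each group touches only its own pixel, the four groups can be scheduled in any order or in parallel, which is the property the custom operations exploit.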

After some experience is gained with custom operations, it is not necessary to unroll loops to discover situations where custom operations are useful. Often, a good programmer with knowledge of the function of the custom operations can see by simple inspection opportunities to exploit custom operations.

To understand how quadavg and dspuquadaddui can be used in this code, we examine the function of these custom operations.

The quadavg custom operation performs pixel averaging on four pairs of pixels in parallel. Formally, the operation of quadavg is as follows:

    quadavg rsrc1 rsrc2 -> rdest

takes arguments in registers rsrc1 and rsrc2, and it computes a result into register rdest. Let rsrc1 = [abcd], rsrc2 = [wxyz], and rdest = [pqrs], where a, b, c, d, w, x, y, z, p, q, r, and s are all unsigned eight-bit values. Then, quadavg computes the output vector [pqrs] as follows:

    p = (a + w + 1) >> 1
    q = (b + x + 1) >> 1
    r = (c + y + 1) >> 1
    s = (d + z + 1) >> 1

The pixel averaging in Figure 4-5 is evident in the first statement of each of the four groups of statements. The rest of the code, adding the idct[i] value and performing the saturation test, can be performed by the dspuquadaddui operation. Formally, its function is as follows:

    dspuquadaddui rsrc1 rsrc2 -> rdest

takes arguments in registers rsrc1 and rsrc2, and it computes a result into register rdest. Let rsrc1 = [efgh], rsrc2 = [stuv], and rdest = [ijkl], where e, f, g, h, i, j, k, and l are unsigned 8-bit values; s, t, u, and v are signed 8-bit values. Then, dspuquadaddui computes the output vector [ijkl] as follows:

    i = uclipi(e + s, 255)
    j = uclipi(f + t, 255)
    k = uclipi(g + u, 255)
    l = uclipi(h + v, 255)

The uclipi operation is defined in this case as it is for the separate TM1300 operation of the same name described in Appendix A, “DSPCPU Operations for TM1300.” Its definition is as follows:
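As an illustration of the two operations defined above, the following C sketch emulates them one byte lane at a time on an ordinary host compiler. The function names emu_quadavg, emu_dspuquadaddui, and emu_uclipi are hypothetical. In particular, emu_uclipi assumes that uclipi(v, hi) clamps v into the range [0, hi]; that matches the saturation test in Figure 4-4, but it is an assumption and should be checked against Appendix A.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: uclipi(v, hi) clamps v into [0, hi]. This mirrors the
   saturation test of Figure 4-4; it is not taken from Appendix A. */
static int32_t emu_uclipi(int32_t v, int32_t hi)
{
    if (v < 0)  return 0;
    if (v > hi) return hi;
    return v;
}

/* Emulates quadavg: rounded average of each pair of unsigned bytes. */
uint32_t emu_quadavg(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t rdest = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8) {
        uint32_t a = (rsrc1 >> shift) & 0xFFu;
        uint32_t w = (rsrc2 >> shift) & 0xFFu;
        /* (a + w + 1) >> 1 is at most 255, so it fits in one lane */
        rdest |= ((a + w + 1u) >> 1) << shift;
    }
    return rdest;
}

/* Emulates dspuquadaddui: unsigned byte plus signed byte per lane,
   clipped to [0, 255]. */
uint32_t emu_dspuquadaddui(uint32_t rsrc1, uint32_t rsrc2)
{
    uint32_t rdest = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8) {
        int32_t e = (int32_t)((rsrc1 >> shift) & 0xFFu);
        int32_t s = (int32_t)((rsrc2 >> shift) & 0xFFu);
        if (s > 127)
            s -= 256;   /* reinterpret the lane as a signed 8-bit value */
        rdest |= (uint32_t)emu_uclipi(e + s, 255) << shift;
    }
    return rdest;
}
```

Under these assumptions, four reconstructed pixels could be produced from packed words as emu_dspuquadaddui(emu_quadavg(back4, forward4), idct4), where back4, forward4, and idct4 are hypothetical 32-bit words each holding four consecutive bytes of the corresponding array.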
