AN2203 Freescale Semiconductor / Motorola, AN2203 Datasheet - Page 19


Manufacturer Part Number: AN2203
Description: MPC7450 RISC Microprocessor Family Software Optimization Guide
Manufacturer: Freescale Semiconductor / Motorola

Fetch/Branch Considerations

Table 3-2. MPC7450 Loop Example: Three Iterations (continued)
Instruction rows on this page: add (3), bdnz (3)

Loop unrolling and vectorization can further increase performance. These are described in Section 4.4.3,
“Loop Unrolling for Long Pipelines,” and Section 4.4.4, “Vectorization.”

3.1.1.2 Branch-Taken Bubble Example

The following code shows how favoring taken branches affects fetch supply.

    xxxxxx00          lwz   r10,0x4(r9)
    xxxxxx04          cmpi  4,r10,0x0
    xxxxxx08          bne   4,targ
    xxxxxx0C          stw   r11,0x4(r9)
    xxxxxx10  targ    add   (next basic block)

This example assumes the bne is usually taken (that is, most of the data in the array is non-zero). Table 3-3
assumes correct prediction of the bne, and cache and BTIC hits.

Table 3-3. Branch-Taken Bubble Example
Instruction rows: lwz, cmpi, bne, add
(Per-cycle pipeline stage entries are not reproduced here.)

Rearranging the code as follows improves the fetch supply.

    xxxxxx00          lwz   r10,0x4(r9)
    xxxxxx04          cmpi  4,r10,0x0
    xxxxxx08          beq   4,targ
    xxxxxx0C  targ2   add   (next basic block)
    ...
    yyyyyy00  targ    stw   r11,0x4(r9)
    yyyyyy04          b     targ2

Using the same assumptions as before, Table 3-4 shows the performance improvement. Note that the first
instruction of the next basic block (add) completes in the same cycle as before. However, by avoiding the
branch-taken bubble (because the branch is usually not taken), it also dispatches one cycle earlier, so that
the next basic block begins executing one cycle sooner.
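The rearrangement in this example (making the common path the fall-through path and moving the rare store out of line) can also be requested from a compiler at the source level. The following is a minimal sketch in C, not taken from the application note. It assumes a GCC- or Clang-compatible compiler that provides `__builtin_expect`; the `UNLIKELY` macro and the function name `fill_zero_entries` are hypothetical names chosen for illustration. The hint is advisory only, so the compiler may or may not lay out the code as in the rearranged listing.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: __builtin_expect is available (GCC/Clang).
 * On other compilers the macro degrades to the plain condition. */
#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

/* Hypothetical helper: replace each zero entry with `fill`.
 * Most entries are assumed to be non-zero, so the store is marked
 * as the unlikely path. A compiler honoring the hint can keep the
 * common (non-zero) case on the fall-through path and place the
 * store out of line. Returns the number of entries replaced. */
size_t fill_zero_entries(uint32_t *data, size_t n, uint32_t fill)
{
    size_t replaced = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(data[i] == 0)) {
            data[i] = fill;   /* rare path: corresponds to the stw */
            replaced++;
        }
    }
    return replaced;
}
```

Whether the hint changes the generated code depends on the compiler and optimization level, so the resulting branch layout should be confirmed by inspecting the assembly output.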
