AN2203 Freescale Semiconductor / Motorola, AN2203 Datasheet - Page 53


Manufacturer Part Number: AN2203
Description: MPC7450 RISC Microprocessor Family Software Optimization Guide
Manufacturer: Freescale Semiconductor / Motorola
Part V
Optimized Code Sequences
Many of the code sequences that the PowerPC Compiler Writer’s Guide presents as optimal are no longer
optimal for current microprocessors. The primary problem is that the suggested sequences rely on carry
forwarding, and because the MPC7450 execution-serializes carry consumers, a suggested sequence is often
inferior to the alternatives. This chapter provides better optimized code sequences.
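To make the carry dependency concrete, the following C sketch models the srawi/addze pair that the Compiler Writer’s Guide uses for signed division by a power of two. It is an illustrative model, not code from either guide: the names ShiftResult, srawi_model, addze_model, to_signed, and cwg_signed_divide are hypothetical. The CA rule it assumes is the architectural one, where srawi sets CA when the source is negative and any 1 bits are shifted out. The addze step cannot start until CA is known, and that is the carry consumer the text refers to.

```c
#include <stdint.h>

/* Result of the modeled srawi: the shifted value plus the carry bit (CA). */
typedef struct {
    uint32_t value;
    unsigned ca;
} ShiftResult;

/* Hypothetical model of srawi rD,rS,n for 0 <= n <= 31.
   The sign fill is written out so the model does not depend on how
   the C implementation shifts negative integers. */
ShiftResult srawi_model(uint32_t rs, unsigned n) {
    int negative = (rs & 0x80000000u) != 0;
    uint32_t fill = (negative && n != 0) ? ~(0xFFFFFFFFu >> n) : 0u;
    uint32_t lost = rs & ((1u << n) - 1u);   /* bits shifted out */
    ShiftResult r;
    r.value = (rs >> n) | fill;
    r.ca = (negative && lost != 0) ? 1u : 0u;
    return r;
}

/* Hypothetical model of addze rD,rA: add the carry bit to rA. */
uint32_t addze_model(uint32_t ra, unsigned ca) {
    return ra + ca;
}

/* Reinterpret a 32-bit pattern as a two's-complement signed value. */
int32_t to_signed(uint32_t u) {
    return (u <= 0x7FFFFFFFu) ? (int32_t)u
                              : (int32_t)(-1 - (int32_t)(uint32_t)~u);
}

/* srawi followed by addze: signed divide by 2^n, rounding toward zero. */
int32_t cwg_signed_divide(int32_t x, unsigned n) {
    ShiftResult s = srawi_model((uint32_t)x, n);      /* produces CA */
    return to_signed(addze_model(s.value, s.ca));     /* consumes CA */
}
```

In this model, cwg_signed_divide(-7, 1) shifts -7 to -4, sets CA because a 1 bit was shifted out of a negative value, and the addze step corrects the result to -3.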
Compiler writers and programmers should carefully evaluate the given options for each sequence; often, a
longer set of instructions may execute faster than a sequence containing fewer instructions. However, the
additional instruction cache space requirements and register usage must be taken into account to determine
which sequence is better in a given case. For code sequences where a cycle count is given, that cycle count
is for the case where the instructions in question are the only instructions executing on the machine. This
assumes that all execution units of the processor are available, and that certain instructions may execute in
parallel. For cases where the cycle count is equal for the PowerPC Compiler Writer’s Guide sequence and
the MPC7450 sequence, the MPC7450 sequence is recommended because it is more likely to perform well
under dynamic scheduling.
The tables that follow give the standard recommended code sequence for each operation, along with an
MPC7450-specific recommended sequence, where applicable. The standard recommended code sequences
were taken from the Compiler Writer’s Guide and are located in the columns titled Compiler Writer’s Guide
code. For each code sequence, the input variables are allocated to registers r3, r4, and possibly r5, depending
on the number of arguments. The highest-numbered register used is allocated to the result. All registers
between those used for the arguments and the results hold temporary values.
The future designs mentioned in this document refer to future high performance designs that implement the
PowerPC architecture. The statements may not apply to all future designs.
5.1 Signed Division Sequences
The entries in Table 5-1 originally come from Section 3.2.3.5 of the PowerPC Compiler Writer’s Guide. The
argument is assumed to be in r3.
Table 5-1. Signed Division Sequences

Operation: Signed divide by 2
  Compiler Writer’s Guide code:
    srawi r4,r3,1
    addze r4,r4
    Cycles: 5
  MPC7450 code (if different):
    srwi r4,r3,31
    add r5,r4,r3
    srawi r6,r5,1
    Cycles: 3
  Comments: The MPC7450 sequence takes 4 cycles to complete, but the GPR result in r6 is available
  after 3 cycles. As it is the only part of the result that is used, the sequence is assumed to take
  3 cycles.

Operation: Signed divide by 4
  Compiler Writer’s Guide code:
    srawi r4,r3,2
    addze r4,r4
    Cycles: 5
  MPC7450 code (if different):
    srawi r4,r3,k
    srwi r5,r4,30
    add r6,r5,r3
    srawi r7,r6,2
    Cycles: 4
  Comments: k = any constant between 1 and 3. The purpose of the first srawi is to provide a
  duplicate copy of the sign bit, so any amount of shifting that results in at least 2 copies of the
  sign bit will suffice. The MPC7450 sequence avoids execution serialization and is more likely to
  run well on future designs.
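The MPC7450 sequences in Table 5-1 can be checked against ordinary truncating division with a small C model. This is a sketch under stated assumptions rather than code from the guide: the function names (srwi_model, srawi_model, to_signed, mpc7450_div2, mpc7450_div4, count_mismatches) are hypothetical, the shifts model 32-bit registers, and the reference result is C’s own / operator, which rounds toward zero. Each local variable is named after the register it stands for in the table.

```c
#include <stdint.h>

/* srwi: logical shift right by n (0..31), zero fill. */
uint32_t srwi_model(uint32_t rs, unsigned n) {
    return rs >> n;
}

/* srawi: arithmetic shift right by n (0..31). The sign fill is written
   out so the model does not rely on C's shift of negative integers.
   The carry bit is not modeled because these sequences never read it. */
uint32_t srawi_model(uint32_t rs, unsigned n) {
    int negative = (rs & 0x80000000u) != 0;
    uint32_t fill = (negative && n != 0) ? ~(0xFFFFFFFFu >> n) : 0u;
    return (rs >> n) | fill;
}

/* Reinterpret a 32-bit pattern as a two's-complement signed value. */
int32_t to_signed(uint32_t u) {
    return (u <= 0x7FFFFFFFu) ? (int32_t)u
                              : (int32_t)(-1 - (int32_t)(uint32_t)~u);
}

/* Signed divide by 2: add the sign bit (0 or 1), then shift. */
int32_t mpc7450_div2(int32_t x) {
    uint32_t r3 = (uint32_t)x;
    uint32_t r4 = srwi_model(r3, 31);   /* srwi  r4,r3,31 */
    uint32_t r5 = r4 + r3;              /* add   r5,r4,r3 */
    uint32_t r6 = srawi_model(r5, 1);   /* srawi r6,r5,1  */
    return to_signed(r6);
}

/* Signed divide by 4: add 3 for negative inputs, then shift.
   k is the first shift amount; the table allows 1 through 3. */
int32_t mpc7450_div4(int32_t x, unsigned k) {
    uint32_t r3 = (uint32_t)x;
    uint32_t r4 = srawi_model(r3, k);   /* srawi r4,r3,k  */
    uint32_t r5 = srwi_model(r4, 30);   /* srwi  r5,r4,30 */
    uint32_t r6 = r5 + r3;              /* add   r6,r5,r3 */
    uint32_t r7 = srawi_model(r6, 2);   /* srawi r7,r6,2  */
    return to_signed(r7);
}

/* Count disagreements with C division over [lo, hi]; requires lo <= hi. */
long count_mismatches(int32_t lo, int32_t hi) {
    long bad = 0;
    for (int32_t x = lo; ; ++x) {
        if (mpc7450_div2(x) != x / 2) ++bad;
        for (unsigned k = 1; k <= 3; ++k)
            if (mpc7450_div4(x, k) != x / 4) ++bad;
        if (x == hi) break;             /* avoids overflow at INT32_MAX */
    }
    return bad;
}
```

Neither modeled sequence reads a carry bit, which is the property the table’s comments rely on. For the divide by 4 case, the first srawi only needs to leave two copies of the sign bit in the top two bit positions, so in this model every permitted value of k gives the same result.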
