AN2203 Freescale Semiconductor / Motorola, AN2203 Datasheet - Page 45


Manufacturer Part Number: AN2203
Description: MPC7450 RISC Microprocessor Family Software Optimization Guide
Manufacturer: Freescale Semiconductor / Motorola

in the final code. A general set of rules is given below. Although these rules are generally reliable, there are
always a few cases where it can make sense to break them.
4.1.3 Optimal Code Sequences
Programming languages are implemented such that applications repeatedly use short sequences of code
for common operations. Some examples are absolute value, minimum and maximum of two numbers, and
bit manipulations. For these simple functions, it is worthwhile to find the set of MPC7450 instructions that
has the best performance and to use these instructions during code generation, writing peephole optimizations
where necessary. Part V, “Optimized Code Sequences,” lists a number of such known functions and
their respective optimal instruction sequences.
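As an illustration of the kind of short sequence meant here, the following C sketch shows branch-free forms of absolute value, minimum, and maximum. The function names are hypothetical, and these are not the sequences listed in Part V; they only show the general shape (shift, XOR, subtract, mask) that a compiler might map onto a few integer instructions.

```c
#include <stdint.h>

/* Branch-free absolute value. `sign` is 0 for x >= 0 and all ones for x < 0,
 * so (ux ^ sign) - sign negates only negative inputs. The result is unsigned
 * so that INT32_MIN maps to 2147483648 without overflow. */
static inline uint32_t abs_u32(int32_t x)
{
    uint32_t ux = (uint32_t)x;
    uint32_t sign = (uint32_t)0 - (ux >> 31);
    return (ux ^ sign) - sign;
}

/* Branch-free minimum: the mask is -1 (all ones) when a < b, else 0. */
static inline int32_t min_i32(int32_t a, int32_t b)
{
    int32_t mask = -(int32_t)(a < b);
    return b ^ ((a ^ b) & mask);   /* a when mask is all ones, else b */
}

/* Branch-free maximum, using the same mask with the operands swapped. */
static inline int32_t max_i32(int32_t a, int32_t b)
{
    int32_t mask = -(int32_t)(a < b);
    return a ^ ((a ^ b) & mask);   /* b when mask is all ones, else a */
}
```

Whether such a form beats a compare and branch on a given core depends on the surrounding code, so it is best treated as a candidate for measurement.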
4.1.4 Conversion of Control Path into Data Path
Some control path problems can be converted to data path problems (predication). This includes the use of
instructions like fsel or vsel, or groups of instructions on the integer side to emulate a conditional integer
select. This approach should be taken only after careful analysis. It is typically useful if the branch is
difficult to predict or the computation overhead of the predicated code is very small.
Note that as pipelines get longer and mispredicts get more expensive, converting control path problems to
data path problems is an increasingly favored solution.
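The conversion described above can be sketched in C as follows. The helper names are hypothetical. `select_u32` emulates a conditional integer select with a mask, and `fsel_model` is only a C model of the fsel behavior (select the first value when the test operand is greater than or equal to zero, otherwise the second); it is not a claim about what any particular compiler emits.

```c
#include <stdint.h>

/* Conditional integer select without a branch:
 * returns a when cond is nonzero, otherwise b. */
static inline uint32_t select_u32(int cond, uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0); /* 0 or all ones */
    return (a & mask) | (b & ~mask);
}

/* C model of an fsel-style select: if_ge when test >= 0.0, else if_lt.
 * A NaN test value fails the comparison and therefore selects if_lt. */
static inline double fsel_model(double test, double if_ge, double if_lt)
{
    return (test >= 0.0) ? if_ge : if_lt;
}

/* Example use: clamp negative values to zero with a select
 * instead of an explicit if statement. */
static inline double clamp_nonneg(double x)
{
    return fsel_model(x, x, 0.0);
}
```

Both operands of a select are always computed, which is why the text recommends this approach mainly when the predicated work is very small or the branch is hard to predict.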
4.2 Optimizations to Exploit the Branch Unit
Because the MPC7450 microprocessor has higher branch penalties and a hardware link stack, the compiler
toolchain should consider some measures to improve branch performance.
4.2.1 Bias Towards CTR for Loops
Using the count register (CTR) is generally preferable to pairing compare and branch instructions. This has
been a guideline for prior implementations, but on the MPC7450 the possible penalty of using
add/compare/branch instead of the CTR-based branch-and-decrement is greater than on previous processors.
See Section 3.1.2.2, “Branch Loop Example,” for an example of how CTR-based loops can be better.
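At the source level, the difference between the two loop shapes can be sketched as below. The function names are hypothetical. The second form fixes the trip count on entry and counts down to zero, which is the shape a compiler can typically map onto a CTR-based decrement-and-branch; whether a given compiler and option set actually does so is an assumption that should be checked in the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Index-driven loop: each iteration needs an add, a compare against n,
 * and a conditional branch unless the compiler rewrites it. */
static uint32_t sum_indexed(const uint32_t *v, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Counted loop: the trip count is known on entry and only decremented
 * and tested against zero, matching branch-and-decrement semantics. */
static uint32_t sum_counted(const uint32_t *v, size_t n)
{
    uint32_t s = 0;
    for (size_t remaining = n; remaining != 0; remaining--)
        s += *v++;
    return s;
}
```

Both functions return the same result; only the loop control differs.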
The general rules introduced in the opening paragraph above are:
• Use the load update and store update forms to merge a subsequent pointer update instruction with
  the access. Note that excessive use of the load-update form (three load-update instructions in a
  row) can cause dispatch and retirement stalls. See Section 3.2, “Dispatch Considerations,” and
  Section 3.4.2, “Completion Groupings,” for more details.
• Avoid carry consumers (instructions like adde that require XER[CA] as an input) unless doing
  arithmetic wider than 32 bits.
• Use carry-generating instructions, such as addc and subfc, only when they are needed to generate
  XER[CA].
• Use the record form of instructions only when needed.
• Avoid toggling XER[SO]; see Section 3.4.3, “Serialization Effects.”
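Two of the rules above can be illustrated with a short C sketch. The names are hypothetical. `sum_preinc` uses a pre-increment access, the source-level shape that a load-with-update form can cover with a single instruction, and `add64` shows the wider-than-32-bit case in which generating and consuming a carry is actually needed. How a particular compiler translates either function is an assumption, not something stated in the guide.

```c
#include <stddef.h>
#include <stdint.h>

/* Pointer update merged with the access: *++p advances the pointer and
 * loads through it in one expression. The first element is read
 * separately so that p never points before the start of the array. */
static uint32_t sum_preinc(const uint32_t *base, size_t n)
{
    if (n == 0)
        return 0;
    const uint32_t *p = base;
    uint32_t s = *p;
    for (size_t remaining = n - 1; remaining != 0; remaining--)
        s += *++p;
    return s;
}

/* A 64-bit value held as two 32-bit halves. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} u64_pair;

/* 64-bit addition from 32-bit halves: the low-word add produces a carry
 * and the high-word add consumes it. This is the situation in which a
 * carry-generating and a carry-consuming instruction are warranted. */
static u64_pair add64(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (uint32_t)(r.lo < a.lo);  /* 1 if the low add wrapped */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

For plain 32-bit arithmetic no carry is needed, so the ordinary add and subtract forms are the better choice under these rules.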