AN2797 Freescale Semiconductor / Motorola, AN2797 Datasheet - Page 4

AN2797

Manufacturer Part Number
AN2797
Description
Migrating from IBM 750GX to MPC7447A
Manufacturer
Freescale Semiconductor / Motorola
Datasheet
Feature Overview
2.1.1 Integer Units
Fixed unit 1 (FXU1) and fixed unit 2 (FXU2) are the complex and simple integer units, respectively. The multiply
and divide instructions of FXU1 are multi-cycle, while all other operations complete in a single cycle. Both
integer units operate on thirty-two 32-bit registers. Each unit consists of three parts: an adder/comparator, a
logical unit, and a shift/rotate unit. In addition to these standard units, FXU1 also has a multiply/divide unit.
Like the IBM 750GX, the MPC7447A has one complex integer unit with the same functionality as FXU1. However,
it has three simple integer units like FXU2 instead of one. A good compiler can take advantage of the additional
simple integer units when a sequence contains instructions with multi-cycle latencies: even if such instructions
tie up two of the integer units, the remaining units are free to start executing other instructions, which
prevents a stall. In addition, the MPC7447A has 16 general-purpose register (GPR) rename buffers to support its
16-entry completion queue, compared with the six-entry completion queue of the IBM 750GX. The integer units
can also use a rename buffer as a source operand without waiting for the value to be committed and retrieved from
a GPR.
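The benefit of the extra simple integer units can be illustrated with a toy model. The sketch below is not a
cycle-accurate model of either processor. It assumes independent instructions, in-order issue, non-pipelined
units, and no dispatch-width limit; the function name cycles_to_finish, the 4-cycle multiply latency, and the
instruction mix are hypothetical values chosen for illustration.

```python
def cycles_to_finish(instructions, n_simple_units):
    """Toy model: cycle count for a list of independent integer instructions.

    instructions   -- list of (kind, latency); kind is "simple" or "complex"
    n_simple_units -- number of simple integer units beside the one complex unit
    """
    # Simple units are listed first so that ties go to them, which keeps
    # the complex unit available for multiply/divide.
    units = [{"complex": False, "free_at": 0} for _ in range(n_simple_units)]
    units.append({"complex": True, "free_at": 0})

    issue_cycle = 0   # in-order issue: no instruction starts before its predecessor
    finish = 0
    for kind, latency in instructions:
        # Per Table 1, the complex unit also handles add/shift/logical operations.
        eligible = [u for u in units if kind == "simple" or u["complex"]]
        unit = min(eligible, key=lambda u: u["free_at"])
        start = max(issue_cycle, unit["free_at"])
        unit["free_at"] = start + latency
        issue_cycle = start
        finish = max(finish, unit["free_at"])
    return finish


# Hypothetical mix: one 4-cycle multiply followed by six single-cycle adds.
mix = [("complex", 4)] + [("simple", 1)] * 6
one_simple = cycles_to_finish(mix, n_simple_units=1)    # 750GX-like unit count
three_simple = cycles_to_finish(mix, n_simple_units=3)  # MPC7447A-like unit count
print(one_simple, three_simple)
```

Under these assumptions the configuration with one simple unit needs 5 cycles and the configuration with three
simple units needs 4, because the adds complete in parallel while the multiply occupies the complex unit.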
2.1.2 Floating-Point Units
The IBM 750GX floating-point unit has thirty-two 64-bit registers and supports the IEEE-754 single-precision and
double-precision formats. Operation latencies vary because of the three-stage pipeline, which has multiply,
add, and normalize stages. The latency/throughput ranges from 3/1 clock cycles for a single-precision multiply-add
to 4/1 clock cycles for a double-precision multiply or multiply-add, since these require two cycles in the
multiply unit.
The MPC7447A floating-point unit meets the same IEEE-754 standards for both precisions and, in addition, has a
deeper five-stage pipeline that allows even double-precision calculations to have a one-cycle throughput.
Although the latency is increased, the overall throughput is better for the majority of double-precision
calculations. The floating-point unit can also use a rename buffer as a source operand without waiting for the
value to be committed and retrieved from a floating-point register (FPR).
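The trade-off between latency and throughput can be made concrete with the usual pipeline estimate: a run of
independent operations takes roughly the latency of the first operation plus one issue interval for each
following operation. The sketch below assumes fully independent operations and no other stalls. The function
name pipelined_cycles is hypothetical, the 5-cycle latency assumes latency equals the five pipeline stages, and
the 4-cycle latency with a two-cycle interval is a purely illustrative case of an operation that occupies a
stage for two cycles, not a figure taken from this note.

```python
def pipelined_cycles(n_ops, latency, interval):
    """Estimate cycles to complete n_ops independent operations in a pipeline.

    latency  -- cycles from issue to result for one operation
    interval -- cycles between successive issues (1 means one result per cycle)
    """
    if n_ops == 0:
        return 0
    return latency + (n_ops - 1) * interval


n = 100
short_pipe = pipelined_cycles(n, latency=3, interval=1)   # 3/1 figure quoted in the text
deep_pipe = pipelined_cycles(n, latency=5, interval=1)    # assumed: latency = five stages
two_cycle = pipelined_cycles(n, latency=4, interval=2)    # hypothetical two-cycle interval
print(short_pipe, deep_pipe, two_cycle)
```

For long runs the issue interval dominates: two extra cycles of latency cost two cycles in total, whereas a
two-cycle interval roughly doubles the run time.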
2.1.3 Instruction Queues
The instruction queue in the IBM 750GX can hold up to six instructions. As long as the instruction queue has
room, the instruction fetcher retrieves up to four instructions per clock. Two instructions can be dispatched
simultaneously to the fixed-point or floating-point units, the branch processing unit, and the load/store unit,
where they execute in a four-stage pipeline consisting of fetch, dispatch, execute, and complete stages.
The MPC7447A offers a twelve-slot instruction queue with a maximum of four fetches per cycle and can dispatch
up to three instructions per cycle to any of the eleven instruction units: the branch processing unit, the four integer
units, the floating-point unit, the four 128-bit (AltiVec) vector units, or the load/store unit.
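The front-end figures quoted above can be collected in a small record to compare the two parts. The sketch
below uses only the queue depth, fetch width, and dispatch width stated in this section; the class name
FrontEnd and the method min_dispatch_cycles are hypothetical, and the computed value is only a lower bound that
ignores dependencies, cache misses, and execution-unit availability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontEnd:
    name: str
    queue_slots: int         # instruction queue depth
    fetch_per_cycle: int     # maximum instructions fetched per clock
    dispatch_per_cycle: int  # maximum instructions dispatched per clock

    def min_dispatch_cycles(self, n_instructions):
        """Lower bound on cycles needed to dispatch n_instructions."""
        width = min(self.fetch_per_cycle, self.dispatch_per_cycle)
        return -(-n_instructions // width)   # ceiling division


IBM_750GX = FrontEnd("IBM 750GX", queue_slots=6, fetch_per_cycle=4, dispatch_per_cycle=2)
MPC7447A = FrontEnd("MPC7447A", queue_slots=12, fetch_per_cycle=4, dispatch_per_cycle=3)

for cpu in (IBM_750GX, MPC7447A):
    print(cpu.name, cpu.min_dispatch_cycles(12))
```

For a 12-instruction sequence, the bound is 6 cycles at two dispatches per cycle and 4 cycles at three.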
Table 1 shows the operations that each fixed unit can perform.

Table 1. FXU Operations

Operation                        FXU1   FXU2
Add, shift, logical functions    Yes    Yes
Multiply/divide                  Yes    No

Migrating from IBM 750GX to MPC7447A, Rev. 1.0
Freescale Semiconductor