Annotation of OpenXM_contrib/gmp/mpn/powerpc64/README, Revision 1.1.1.1
PPC630 (aka Power3) pipeline information:

Decoding is 4-way and issue is 8-way with some out-of-order capability.
LS1  - ld/st unit 1
LS2  - ld/st unit 2
FXU1 - integer unit 1, handles any simple integer instructions
FXU2 - integer unit 2, handles any simple integer instructions
FXU3 - integer unit 3, handles integer multiply and divide
FPU1 - floating-point unit 1
FPU2 - floating-point unit 2

Memory:           Any two memory operations can issue per cycle, but the
                  memory subsystem can sustain just one store per cycle.
Simple integer:   2 operations per cycle (such as add, rl*)
Integer multiply: 1 operation every 9th cycle worst case; exact timing depends
                  on the most significant bit position of the 2nd operand
                  (10 bits per cycle).  The multiply unit is not pipelined;
                  only one multiply operation may be in progress at a time.
Integer divide:   ?
Floating-point:   Any 2 plain arithmetic instructions per cycle (such as
                  fmul, fadd, fmadd).  Latency = 4 cycles.
Floating-point divide:
                  ?
Floating-point square root:
                  ?
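
The integer multiply timing can be illustrated with a rough model in C.
This is only a sketch: the names significant_bits and mul_cycles_model are
hypothetical, and the fixed `overhead` term is an assumed parameter that
does not come from these notes.  Only the 10-bits-per-cycle rate and the
9-cycle worst case are taken from the text.

```c
#include <stdint.h>

/* Number of significant bits in v, i.e. the position of the most
   significant set bit counting from 1; returns 0 when v == 0. */
static unsigned significant_bits(uint64_t v)
{
    unsigned n = 0;
    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}

/* HYPOTHETICAL model of one integer multiply: the unit is assumed to
   consume 10 bits of the 2nd operand per cycle (rate from the notes),
   plus `overhead` fixed cycles (an assumption, not from the notes). */
static unsigned mul_cycles_model(uint64_t operand2, unsigned overhead)
{
    unsigned bits = significant_bits(operand2);
    unsigned steps = (bits + 9) / 10;   /* ceil(bits / 10) */
    return overhead + steps;
}
```

With overhead = 2, a full 64-bit second operand gives 2 + 7 = 9 cycles,
which agrees with the stated worst case.  That value of overhead was chosen
only because it reproduces the worst case; the real split between fixed and
per-bit cost is not given here.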

Best possible times for the main loops, in cycles per limb:
shift:         1.5 cycles, limited by integer unit contention.
               With 63 special loops, one for each shift count, we could
               reduce the needed integer instructions to 2 per limb, which
               would reduce the best possible time to 1 cycle.
add/sub:       1.5 cycles, limited by ld/st unit contention.
mul:           18 cycles (average) unless floating-point operations are used,
               but that would only help for multiplies of perhaps 10 or more
               limbs.
addmul/submul: Same situation as for mul.
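
The 1.5-cycle figures for shift and add/sub follow from counting operations
per limb against the available units.  The C sketch below is not the
assembly used in this directory; limb_t, lshift_sketch and add_n_sketch are
hypothetical names, and the comments mark the operations that compete for
the units.  A general shift needs 3 simple integer operations per limb on 2
integer units (3/2 = 1.5), and an addition needs 3 memory operations per
limb on 2 ld/st units (3/2 = 1.5).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for a 64-bit limb */

/* Shift {up, n} left by cnt bits into {rp, n}; return the bits shifted
   out of the top limb.  Requires n >= 1 and 1 <= cnt <= 63. */
static limb_t lshift_sketch(limb_t *rp, const limb_t *up, size_t n,
                            unsigned cnt)
{
    assert(n >= 1 && cnt >= 1 && cnt <= 63);
    unsigned tnc = 64 - cnt;
    limb_t high = up[n - 1];
    limb_t ret = high >> tnc;
    for (size_t i = n - 1; i > 0; i--) {
        limb_t low = up[i - 1];                  /* 1 load */
        rp[i] = (high << cnt) | (low >> tnc);    /* shift, shift, or: 3
                                                    simple integer ops;
                                                    1 store */
        high = low;
    }
    rp[0] = high << cnt;
    return ret;
}

/* Add {up, n} and {vp, n} into {rp, n}; return the carry out (0 or 1). */
static limb_t add_n_sketch(limb_t *rp, const limb_t *up, const limb_t *vp,
                           size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t u = up[i];          /* load 1 */
        limb_t v = vp[i];          /* load 2 */
        limb_t s = u + v;
        limb_t c1 = s < u;         /* carry out of u + v */
        limb_t r = s + carry;
        limb_t c2 = r < s;         /* carry out of adding the carry in */
        rp[i] = r;                 /* store: 3 memory ops per limb */
        carry = c1 | c2;
    }
    return carry;
}
```

Portable C has to compute the carry with comparisons, so add_n_sketch uses
more integer operations than an add-with-carry instruction would; the
memory operation count per limb is what matters for the 1.5-cycle bound.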