Annotation of OpenXM_contrib/gmp/mpn/powerpc64/README, Revision 1.1
PPC630 (aka Power3) pipeline information:

Decoding is 4-way and issue is 8-way with some out-of-order capability.
LS1  - ld/st unit 1
LS2  - ld/st unit 2
FXU1 - integer unit 1, handles any simple integer instructions
FXU2 - integer unit 2, handles any simple integer instructions
FXU3 - integer unit 3, handles integer multiply and divide
FPU1 - floating-point unit 1
FPU2 - floating-point unit 2

Memory:           Any two memory operations can issue per cycle, but the
                  memory subsystem can sustain just one store per cycle.
Simple integer:   2 operations per cycle (such as add, rl*)
Integer multiply: 1 operation every 9th cycle worst case; exact timing depends
                  on the position of the most significant bit of the 2nd
                  operand (10 bits per cycle).  The multiply unit is not
                  pipelined; only one multiply operation may be in progress
                  at a time.
Integer divide:   ?
Floating-point:   Any 2 plain arithmetic instructions per cycle (such as
                  fmul, fadd, fmadd).  Latency = 4 cycles.
Floating-point divide:
                  ?
Floating-point square root:
                  ?
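
As a rough illustration of how the issue limits above bound a loop, the
sketch below divides a loop's per-limb instruction counts by the number of
units that can execute them and takes the largest quotient.  The struct
loop_mix and the function min_cycles_per_limb are hypothetical names
introduced here, and the instruction mixes in the usage note are assumptions,
not counts taken from the actual assembly.  Loop overhead (pointer updates,
branches) is ignored, as if the loop were fully unrolled.

```c
#include <assert.h>

/* Issue resources per cycle, taken from the unit list and limits above. */
enum { LDST_UNITS = 2, STORES_PER_CYCLE = 1, SIMPLE_INT_UNITS = 2 };

/* Hypothetical per-limb instruction mix of a loop (illustration only). */
struct loop_mix {
    int loads;       /* load instructions per limb */
    int stores;      /* store instructions per limb */
    int simple_int;  /* simple integer instructions per limb */
};

static double max2(double a, double b)
{
    return a > b ? a : b;
}

/* Lower bound in cycles per limb: the most contended resource sets the
   pace.  Multiply and floating-point instructions are not modelled. */
double min_cycles_per_limb(struct loop_mix m)
{
    assert(m.loads >= 0 && m.stores >= 0 && m.simple_int >= 0);

    double mem   = (double)(m.loads + m.stores) / LDST_UNITS;
    double store = (double)m.stores / STORES_PER_CYCLE;
    double fxu   = (double)m.simple_int / SIMPLE_INT_UNITS;

    return max2(max2(mem, store), fxu);
}
```

Assuming an add loop needs 2 loads, 1 store and 1 add-with-carry per limb,
min_cycles_per_limb((struct loop_mix){2, 1, 1}) gives 1.5, set by the ld/st
units.  Assuming a shift loop needs 1 load, 1 store and 3 simple integer
instructions per limb, {1, 1, 3} also gives 1.5, this time set by the
integer units.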

Best possible times for the main loops, in cycles per limb:
shift:        1.5 cycles, limited by integer unit contention.
              With 63 special loops, one for each shift count, we could
              reduce the needed integer instructions to 2, which would
              reduce the best possible time to 1 cycle.
add/sub:      1.5 cycles, limited by ld/st unit contention.
mul:          18 cycles (average) unless floating-point operations are used,
              but that would only help for multiplies of perhaps 10 or more
              limbs.
addmul/submul: Same situation as for mul.
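
For readers unfamiliar with the loops being timed, here is a minimal C sketch
of what the shift and add loops compute per limb.  It is not the assembly
code these notes describe: limb_t, ref_lshift and ref_add_n are illustrative
names, limbs are assumed to be 64 bits and stored least significant first,
and the comments about instruction counts assume a generic shift count and a
hardware carry flag.  A C compiler will typically emit extra compares for the
carry that hand-written assembly avoids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* one 64-bit limb (illustrative name) */

/* Shift {up, n} left by cnt bits, 1 <= cnt <= 63, writing n limbs to rp.
   Returns the bits shifted out of the most significant limb.
   Per limb: 1 load, 1 store, and 3 simple integer operations
   (shift left, shift right, or). */
limb_t ref_lshift(limb_t *rp, const limb_t *up, size_t n, unsigned cnt)
{
    assert(n > 0 && cnt >= 1 && cnt <= 63);

    unsigned tnc = 64 - cnt;
    limb_t high = up[n - 1];
    limb_t out = high >> tnc;

    for (size_t i = n - 1; i > 0; i--) {
        limb_t low = up[i - 1];
        rp[i] = (high << cnt) | (low >> tnc);
        high = low;
    }
    rp[0] = high << cnt;
    return out;
}

/* rp = up + vp over n limbs; returns the carry out (0 or 1).
   Per limb: 2 loads and 1 store, so the ld/st units are the bottleneck. */
limb_t ref_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t cy = 0;

    for (size_t i = 0; i < n; i++) {
        limb_t u = up[i];
        limb_t s = u + vp[i];
        limb_t c1 = s < u;    /* carry out of u + v */
        limb_t r = s + cy;
        limb_t c2 = r < s;    /* carry out of adding the incoming carry */
        rp[i] = r;
        cy = c1 | c2;         /* at most one of the two can be set */
    }
    return cy;
}
```

The three integer operations in the shift loop over two integer units, and
the three memory operations in the add loop over two ld/st units, are one way
to read the 1.5-cycle figures above.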