
Annotation of OpenXM_contrib/gmp/mpn/powerpc64/README, Revision 1.1.1.1

PPC630 (aka Power3) pipeline information:

Decoding is 4-way and issue is 8-way with some out-of-order capability.
LS1  - ld/st unit 1
LS2  - ld/st unit 2
FXU1 - integer unit 1, handles any simple integer instruction
FXU2 - integer unit 2, handles any simple integer instruction
FXU3 - integer unit 3, handles integer multiply and divide
FPU1 - floating-point unit 1
FPU2 - floating-point unit 2

Memory:           Any two memory operations can issue per cycle, but the
                  memory subsystem can sustain just one store per cycle.
Simple integer:   2 operations per cycle (such as add, rl*)
Integer multiply: 1 operation every 9th cycle worst case; exact timing depends
                  on the 2nd operand's most significant bit position (10 bits
                  per cycle).  The multiply unit is not pipelined; only one
                  multiply operation may be in progress at a time.
Integer divide:   ?
Floating-point:   Any 2 plain arithmetic instructions per cycle (such as fmul,
                  fadd, fmadd).  Latency = 4 cycles.
Floating-point divide:
                  ?
Floating-point square root:
                  ?
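
The issue limits above can be turned into a rough lower bound on cycles per
loop iteration.  The following is a minimal sketch in Python, not part of GMP.
The function name min_cycles and the per-limb operation counts passed to it are
assumptions made for illustration.  The model looks only at unit contention and
ignores decode width, latencies, branches and loop overhead.

```python
from fractions import Fraction

# Per-cycle issue limits, taken from the table above.
LIMITS = {
    "mem": 2,    # LS1 + LS2: any two memory operations per cycle
    "store": 1,  # memory subsystem sustains one store per cycle
    "int": 2,    # FXU1 + FXU2: simple integer operations
    "fp": 2,     # FPU1 + FPU2: plain floating-point arithmetic
}
MUL_WORST_CYCLES = 9  # non-pipelined integer multiply, worst case


def min_cycles(loads=0, stores=0, int_ops=0, fp_ops=0, muls=0):
    """Lower bound on cycles per iteration from unit contention alone.

    Hypothetical helper: each resource gives its own bound, and the
    slowest resource decides the result.
    """
    bounds = [
        Fraction(loads + stores, LIMITS["mem"]),
        Fraction(stores, LIMITS["store"]),
        Fraction(int_ops, LIMITS["int"]),
        Fraction(fp_ops, LIMITS["fp"]),
        Fraction(muls * MUL_WORST_CYCLES),  # multiplies cannot overlap
    ]
    return max(bounds)


# Assumed per-limb counts: a shift loop with 1 load, 1 store and
# 3 simple integer operations is bound by the two integer units.
print(min_cycles(loads=1, stores=1, int_ops=3))  # prints 3/2
```

Taking the maximum over resources is the usual bottleneck argument: a loop
cannot run faster than its most contended unit allows, although it may well
run slower once latencies and dependencies are considered.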

Best possible times for the main loops:
shift:         1.5 cycles per limb, limited by integer unit contention.
               With 63 special loops, one for each shift count, we could
               reduce the needed integer instructions to 2, which would
               reduce the best possible time to 1 cycle per limb.
add/sub:       1.5 cycles per limb, limited by ld/st unit contention.
mul:           18 cycles per limb (average) unless floating-point operations
               are used, but that would only help for multiplies of perhaps
               10 or more limbs.
addmul/submul: Same situation as for mul.
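
The shift and mul figures above can be related to the work each loop does per
limb.  The sketch below is a Python model of the limb arithmetic only, not the
PowerPC assembly; the function names and the little-endian list representation
are assumptions for illustration.  It shows the three simple integer operations
a generic shift needs per limb (3 operations on 2 units gives 1.5 cycles) and,
as an assumption about how the 18-cycle figure arises, the two multiplies per
limb (low and high half) that a multiply loop would issue.

```python
LIMB_BITS = 64
MASK = (1 << LIMB_BITS) - 1


def lshift(limbs, cnt):
    """Shift little-endian 64-bit limbs left by cnt bits (1..63).

    Returns (result_limbs, bits_shifted_out_of_the_top_limb).
    """
    assert 1 <= cnt < LIMB_BITS
    tnc = LIMB_BITS - cnt
    out = []
    carry = 0  # bits carried in from the limb below
    for limb in limbs:
        high = (limb << cnt) & MASK  # integer op 1: shift left
        low_next = limb >> tnc       # integer op 2: shift right
        out.append(high | carry)     # integer op 3: or
        carry = low_next
    return out, carry


def mul_1(limbs, v):
    """Multiply little-endian 64-bit limbs by a single limb v.

    Returns (result_limbs, carry_limb).
    """
    out = []
    carry = 0
    for u in limbs:
        p = u * v
        lo = p & MASK        # would be one low-half multiply in hardware
        hi = p >> LIMB_BITS  # would be one high-half multiply in hardware
        s = lo + carry
        out.append(s & MASK)
        carry = hi + (s >> LIMB_BITS)  # cannot overflow one limb
    return out, carry


res, top = lshift([0x8000000000000001, 0x1], 4)
print([hex(x) for x in res], hex(top))  # prints ['0x10', '0x18'] 0x0
```

With the 9-cycle worst case and a non-pipelined multiplier, two multiplies per
limb would account for 18 cycles, which is why the simple integer and ld/st
work in the mul loop does not affect the bound.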
