
Annotation of OpenXM/src/kan96xx/gmp-2.0.2-ssh-2/mpn/sparc32/README, Revision 1.1.1.1

This directory contains mpn functions for various SPARC chips.  Code that
runs only on version 8 SPARC implementations is in the v8 subdirectory.

RELEVANT OPTIMIZATION ISSUES

  Load and Store timing

On most early SPARC implementations, an ST instruction takes multiple
cycles, while an STD takes just a single cycle more than an ST.  For the CPUs
in SPARCstation I and II, the times are 3 and 4 cycles, respectively.
Therefore, combining two ST instructions into an STD when possible is a
significant optimization.
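
As a rough worked example of the arithmetic above, the following C sketch
counts store cycles for a run of words using the 3-cycle ST and 4-cycle STD
figures quoted for the SPARCstation I and II CPUs.  The function names are
hypothetical and exist only for this illustration.  The sketch assumes that
every pair of words can legally be combined, which on SPARC means the
address is doubleword aligned and the data sits in an even/odd register
pair.

```c
/* Cycle counts quoted in the text for the SPARCstation I and II CPUs. */
enum { ST_CYCLES = 3, STD_CYCLES = 4 };

/* Cycles to store n_words words using only single-word ST instructions. */
unsigned long store_cycles_st(unsigned long n_words)
{
    return ST_CYCLES * n_words;
}

/* Cycles to store n_words words by pairing them into STD instructions,
   with one ST for an odd leftover word.  Assumes every pair is suitably
   aligned for STD (an assumption of this sketch). */
unsigned long store_cycles_std(unsigned long n_words)
{
    return STD_CYCLES * (n_words / 2) + ST_CYCLES * (n_words % 2);
}
```

For two words this gives 6 cycles with ST against 4 with STD, so under
these assumptions pairing saves one third of the store time.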

Later SPARC implementations have single-cycle ST.

For SuperSPARC, we can perform just one memory instruction per cycle, even
though up to two integer instructions can be executed in its pipeline.  For
programs that perform so many memory operations that there are not enough
non-memory operations to issue in parallel with all memory operations, using
LDD and STD when possible helps.
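
The memory-issue limit can be turned into a simple lower bound.  The C
sketch below is a simplified model, not a measurement: it assumes only that
one memory instruction issues per cycle, and it ignores alignment, integer
work, and loop overhead.  The function name and parameters are hypothetical.
An add-style loop reads two source limbs and writes one result limb per limb
processed, so it is used as the example.

```c
/* Simplified model, not a measurement: if only one memory instruction can
   issue per cycle, a loop needs at least as many cycles per limb as it
   issues memory instructions per limb.
   limbs_per_mem_insn is 1 for LD/ST and 2 for LDD/STD. */
double min_cycles_per_limb(unsigned loads_per_limb, unsigned stores_per_limb,
                           unsigned limbs_per_mem_insn)
{
    return (double)(loads_per_limb + stores_per_limb) / limbs_per_mem_insn;
}
```

With two loads and one store per limb, the bound is 3 cycles/limb using LD
and ST, and 1.5 cycles/limb using LDD and STD throughout.  Under this model,
a rate below 3 cycles/limb for such a loop is reachable only with the
doubleword forms.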

STATUS

1. On a SuperSPARC, mpn_lshift and mpn_rshift run at 3 cycles/limb, or 2.5
   cycles/limb asymptotically.  We could optimize speed for special counts
   by using ADDXCC.

2. On a SuperSPARC, mpn_add_n and mpn_sub_n run at 2.5 cycles/limb, or 2
   cycles/limb asymptotically.

3. mpn_mul_1 runs at what is believed to be optimal speed.

4. On SuperSPARC, mpn_addmul_1 and mpn_submul_1 could both be improved by a
   cycle by avoiding one of the add instructions.  See a29k/addmul_1.
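
Item 1 does not say which shift counts are meant.  One plausible reading,
offered here as an assumption, is a left shift by one bit: adding a limb to
itself with carry-in yields the shifted limb and leaves the shifted-out bit
in the carry, which is what an ADDXCC with both source operands equal would
compute.  The C sketch below shows the idea for 32-bit limbs stored least
significant first.  The function lshift1_by_add is a hypothetical helper,
not part of GMP.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GMP: shift the n-limb number at sp left
   by one bit, store the result at rp, and return the bit shifted out.
   Each step computes s + s + carry, mirroring what one add-with-carry
   instruction with both source operands equal would do. */
uint32_t lshift1_by_add(uint32_t *rp, const uint32_t *sp, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {   /* least significant limb first */
        uint32_t s = sp[i];
        rp[i] = s + s + carry;         /* wraps modulo 2^32 */
        carry = s >> 31;               /* top bit of s becomes the carry */
    }
    return carry;
}
```

Each limb then costs one load, one add, and one store, with no separate
shift and OR instructions to combine neighbouring limbs.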

The speed of the code for other SPARC implementations is uncertain.
