Annotation of OpenXM/src/kan96xx/gmp-2.0.2-ssh-2/mpn/sparc32/README, Revision 1.1.1.1

This directory contains mpn functions for various SPARC chips. Code that
runs only on version 8 SPARC implementations is in the v8 subdirectory.

RELEVANT OPTIMIZATION ISSUES

Load and Store timing

On most early SPARC implementations, the ST instruction takes multiple
cycles, while an STD takes just a single cycle more than an ST. For the CPUs
in SPARCstation I and II, the times are 3 and 4 cycles, respectively.
Therefore, combining two ST instructions into an STD when possible is a
significant optimization.
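
As a rough illustration of the arithmetic above, the following C sketch
compares the store cost of n words using ST only against pairing words into
STD. The cycle counts are the SPARCstation I and II figures quoted above; the
function names are made up for this example, and the model assumes every pair
meets the STD requirements (doubleword-aligned address, even/odd register
pair).

```c
/* Cycle counts quoted above for the CPUs in SPARCstation I and II. */
enum { ST_CYCLES = 3, STD_CYCLES = 4 };

/* Cycles spent storing n words using only single-word ST. */
static unsigned long store_cycles_st(unsigned long n)
{
    return n * ST_CYCLES;
}

/* Cycles spent storing n words when words are paired into STD.
   Assumption: every pair is suitably aligned for STD.  An odd
   trailing word still needs one ST. */
static unsigned long store_cycles_std(unsigned long n)
{
    return (n / 2) * STD_CYCLES + (n % 2) * ST_CYCLES;
}
```

For 100 words this gives 300 cycles with ST only and 200 cycles with STD
pairs, i.e. 3 versus 2 cycles per stored word.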

Later SPARC implementations have single cycle ST.

For SuperSPARC, we can perform just one memory instruction per cycle, even
if up to two integer instructions can be executed in its pipeline. For
programs that perform so many memory operations that there are not enough
non-memory operations to issue in parallel with all memory operations, using
LDD and STD when possible helps.
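
The effect of the one-memory-instruction-per-cycle limit can be sketched as a
simple count. The C function below is only a simplified model, not a
description of the actual pipeline: it counts the memory instructions a loop
needs, which under the limit above is also a floor on its cycle count. The
function name and parameters are invented for this example, and doubleword
alignment of all operands is assumed.

```c
/* Memory instructions needed by a loop over n limbs that reads `loads`
   source limbs and writes `stores` result limbs per limb position.
   words_per_op is 1 for LD/ST and 2 for LDD/STD.
   With at most one memory instruction issued per cycle, the result is
   also a lower bound on the number of cycles (simplified model). */
static unsigned long mem_instructions(unsigned long n, unsigned loads,
                                      unsigned stores, unsigned words_per_op)
{
    /* Instructions per operand stream, rounded up for an odd tail. */
    unsigned long per_stream = (n + words_per_op - 1) / words_per_op;
    return per_stream * (loads + stores);
}
```

For a loop with two source operands and one destination, such as an addition
of two n-limb numbers, 100 limbs need 300 memory instructions with LD/ST and
150 with LDD/STD, so the floor drops from 3 to 1.5 cycles/limb.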

STATUS

1. On a SuperSPARC, mpn_lshift and mpn_rshift run at 3 cycles/limb, or 2.5
   cycles/limb asymptotically. We could optimize speed for special counts
   by using ADDXCC.

2. On a SuperSPARC, mpn_add_n and mpn_sub_n run at 2.5 cycles/limb, or 2
   cycles/limb asymptotically.

3. mpn_mul_1 runs at what is believed to be optimal speed.

4. On SuperSPARC, mpn_addmul_1 and mpn_submul_1 could both be improved by a
   cycle by avoiding one of the add instructions. See a29k/addmul_1.
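
Regarding item 1: ADDXCC is an add with carry that also sets the condition
codes, so one plausible reading of "special counts" is a shift count of 1,
where each result limb is s + s + carry. The C sketch below shows that
identity next to a generic shift for comparison. It is an illustration under
that assumption, not the code in this directory; limb_t and both function
names are hypothetical, limbs are 32 bits and stored least significant first,
and the operands must not overlap except when rp == sp.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical stand-in for a 32-bit limb */

/* Generic left shift by cnt bits, 0 < cnt < 32.
   Returns the bits shifted out of the most significant limb. */
static limb_t lshift_generic(limb_t *rp, const limb_t *sp, size_t n,
                             unsigned cnt)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s = sp[i];
        rp[i] = (s << cnt) | carry;
        carry = s >> (32 - cnt);
    }
    return carry;
}

/* Left shift by exactly one bit, written as an add-with-carry chain:
   each limb is added to itself plus the carry out of the limb below.
   This is the pattern an add-with-carry instruction handles directly. */
static limb_t lshift1_addc(limb_t *rp, const limb_t *sp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s = sp[i];
        rp[i] = s + s + carry;   /* wraps modulo 2^32 */
        carry = s >> 31;         /* carry out of s + s + carry */
    }
    return carry;
}
```

Both functions produce the same limbs and the same return value when
lshift_generic is called with cnt = 1.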

The speed of the code for other SPARC implementations is uncertain.
