Annotation of OpenXM_contrib/gmp/mpn/power/mul_1.asm, Revision 1.1.1.1
1.1 ohara 1: dnl IBM POWER mpn_mul_1 -- Multiply a limb vector with a limb and store the
2: dnl result in a second limb vector.
3:
4: dnl Copyright 1992, 1994, 1999, 2000, 2001 Free Software Foundation, Inc.
5:
6: dnl This file is part of the GNU MP Library.
7:
8: dnl The GNU MP Library is free software; you can redistribute it and/or modify
9: dnl it under the terms of the GNU Lesser General Public License as published
10: dnl by the Free Software Foundation; either version 2.1 of the License, or (at
11: dnl your option) any later version.
12:
13: dnl The GNU MP Library is distributed in the hope that it will be useful, but
14: dnl WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
15: dnl or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
16: dnl License for more details.
17:
18: dnl You should have received a copy of the GNU Lesser General Public License
19: dnl along with the GNU MP Library; see the file COPYING.LIB. If not, write to
20: dnl the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
21: dnl MA 02111-1307, USA.
22:
23:
24: dnl INPUT PARAMETERS
25: dnl res_ptr r3
26: dnl s1_ptr r4
27: dnl size r5
28: dnl s2_limb r6
29:
30: dnl The POWER architecture has no unsigned 32x32->64 bit multiplication
31: dnl instruction. To obtain that operation, we have to use the 32x32->64
32: dnl signed multiplication instruction, and add the appropriate compensation to
33: dnl the high limb of the result. We add the multiplicand if the multiplier
34: dnl has its most significant bit set, and we add the multiplier if the
35: dnl multiplicand has its most significant bit set. We need to preserve the
36: dnl carry flag between each iteration, so we have to compute the compensation
37: dnl carefully (the natural srai+and doesn't work). Since all POWER CPUs can
38: dnl branch in zero cycles, we use conditional branches for the additions.
39:
40: include(`../config.m4')
41:
42: ASM_START()
43: PROLOGUE(mpn_mul_1)
44: cal 3,-4(3)
45: l 0,0(4)
46: cmpi 0,6,0
47: mtctr 5
48: mul 9,0,6
49: srai 7,0,31
50: and 7,7,6
51: mfmq 8
52: ai 0,0,0 C reset carry
53: cax 9,9,7
54: blt Lneg
55: Lpos: bdz Lend
56: Lploop: lu 0,4(4)
57: stu 8,4(3)
58: cmpi 0,0,0
59: mul 10,0,6
60: mfmq 0
61: ae 8,0,9
62: bge Lp0
63: cax 10,10,6 C adjust high limb for negative limb from s1
64: Lp0: bdz Lend0
65: lu 0,4(4)
66: stu 8,4(3)
67: cmpi 0,0,0
68: mul 9,0,6
69: mfmq 0
70: ae 8,0,10
71: bge Lp1
72: cax 9,9,6 C adjust high limb for negative limb from s1
73: Lp1: bdn Lploop
74: b Lend
75:
76: Lneg: cax 9,9,0
77: bdz Lend
78: Lnloop: lu 0,4(4)
79: stu 8,4(3)
80: cmpi 0,0,0
81: mul 10,0,6
82: cax 10,10,0 C adjust high limb for negative s2_limb
83: mfmq 0
84: ae 8,0,9
85: bge Ln0
86: cax 10,10,6 C adjust high limb for negative limb from s1
87: Ln0: bdz Lend0
88: lu 0,4(4)
89: stu 8,4(3)
90: cmpi 0,0,0
91: mul 9,0,6
92: cax 9,9,0 C adjust high limb for negative s2_limb
93: mfmq 0
94: ae 8,0,10
95: bge Ln1
96: cax 9,9,6 C adjust high limb for negative limb from s1
97: Ln1: bdn Lnloop
98: b Lend
99:
100: Lend0: cal 9,0(10)
101: Lend: st 8,4(3)
102: aze 3,9
103: br
104: EPILOGUE(mpn_mul_1)
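
The compensation described in the header comment of this file can be modelled in C. What follows is an illustrative sketch, not part of GMP: the names umul_hi_from_signed, ref_mul_1, and self_check are hypothetical, limbs are assumed to be 32 bits wide, and the loop makes no attempt to reproduce the unrolling, the split into positive and negative s2_limb paths, or the carry-flag handling of the assembly. It only shows why adding the multiplicand and/or the multiplier to the high limb of a signed product yields the unsigned high limb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* High 32 bits of the unsigned product a*b, obtained from a signed
   32x32->64 multiply plus the compensation from the header comment.
   (Hypothetical helper; not a GMP function.) */
uint32_t umul_hi_from_signed(uint32_t a, uint32_t b)
{
    /* Reinterpret each operand as a two's-complement signed value. */
    int64_t sa = (int64_t)a - ((a >> 31) ? INT64_C(4294967296) : 0);
    int64_t sb = (int64_t)b - ((b >> 31) ? INT64_C(4294967296) : 0);

    /* Signed 32x32->64 product, standing in for the POWER "mul". */
    int64_t sp = sa * sb;
    uint32_t hi = (uint32_t)((uint64_t)sp >> 32);

    /* Compensation, modulo 2^32: add b if a has its top bit set,
       and add a if b has its top bit set. */
    if (a >> 31)
        hi += b;
    if (b >> 31)
        hi += a;
    return hi;
}

/* Reference model of the mpn_mul_1 contract: res[i] receives the low
   limbs of s1[] * s2_limb, and the most significant limb is returned.
   (Hypothetical name; assumes 32-bit limbs.) */
uint32_t ref_mul_1(uint32_t *res, const uint32_t *s1, size_t n,
                   uint32_t s2_limb)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* The low limb is identical for signed and unsigned products. */
        uint32_t lo = (uint32_t)((uint64_t)s1[i] * s2_limb);
        uint32_t hi = umul_hi_from_signed(s1[i], s2_limb);

        lo += carry;            /* add the high limb of the previous step */
        hi += (lo < carry);     /* propagate the carry out of that add */
        res[i] = lo;
        carry = hi;
    }
    return carry;
}

/* Small sanity checks on the model above. */
void self_check(void)
{
    uint32_t s[2] = { 0xFFFFFFFFu, 0xFFFFFFFFu };
    uint32_t r[2];

    assert(umul_hi_from_signed(3u, 5u) == 0u);
    assert(umul_hi_from_signed(0x80000000u, 2u) == 1u);
    assert(umul_hi_from_signed(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFEu);

    /* (2^64 - 1) * (2^32 - 1) = 2^96 - 2^64 - 2^32 + 1 */
    assert(ref_mul_1(r, s, 2, 0xFFFFFFFFu) == 0xFFFFFFFEu);
    assert(r[0] == 1u && r[1] == 0xFFFFFFFFu);
}
```

Calling self_check() from a test driver exercises both helpers. The model uses plain if statements for the two conditional additions, which mirrors the design choice stated in the header comment of using conditional branches rather than a mask-based sequence.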