Annotation of OpenXM_contrib/gmp/mpn/x86/p6/diveby3.asm, Revision 1.1.1.1 (all lines: 1.1, maekawa)
dnl Intel P6 mpn_divexact_by3 -- mpn division by 3, expecting no remainder.
dnl
dnl P6: 8.5 cycles/limb


dnl Copyright (C) 2000 Free Software Foundation, Inc.
dnl
dnl This file is part of the GNU MP Library.
dnl
dnl The GNU MP Library is free software; you can redistribute it and/or
dnl modify it under the terms of the GNU Lesser General Public License as
dnl published by the Free Software Foundation; either version 2.1 of the
dnl License, or (at your option) any later version.
dnl
dnl The GNU MP Library is distributed in the hope that it will be useful,
dnl but WITHOUT ANY WARRANTY; without even the implied warranty of
dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
dnl Lesser General Public License for more details.
dnl
dnl You should have received a copy of the GNU Lesser General Public
dnl License along with the GNU MP Library; see the file COPYING.LIB. If
dnl not, write to the Free Software Foundation, Inc., 59 Temple Place -
dnl Suite 330, Boston, MA 02111-1307, USA.


dnl The P5 code runs well on P6, in fact better than anything else found so
dnl far. An imul is 4 cycles, meaning the two cmp/sbbl pairs on the
dnl dependent path are taking 4.5 cycles.
dnl
dnl The destination cache line prefetching is unnecessary on P6, but
dnl removing it is a 2 cycle slowdown (approx), so it must be inducing
dnl something good in the out of order execution.

include(`../config.m4')

MULFUNC_PROLOGUE(mpn_divexact_by3c)
include_mpn(`x86/pentium/diveby3.asm')
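
The file above only includes the Pentium implementation, so the loop itself is not visible here. As a rough illustration of what the comments refer to (one multiply plus two compare/borrow steps per limb), the following is a minimal C sketch of exact division by 3 using the multiplicative inverse of 3 modulo 2^32. It assumes 32-bit limbs stored least significant first; the name `divexact_by3c_sketch` and the macro names are hypothetical, and this is not the assembly that the include pulls in.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-bit limbs, least significant limb first. */
#define INVERSE_3   0xAAAAAAABu           /* 3 * 0xAAAAAAAB == 1 (mod 2^32) */
#define ONE_THIRD   (UINT32_MAX / 3)      /* 0x55555555 */
#define TWO_THIRDS  (2 * ONE_THIRD)       /* 0xAAAAAAAA */

/* Hypothetical sketch: writes the quotient limbs to dst and returns the
   final carry, which is 0 when the input is an exact multiple of 3. */
uint32_t divexact_by3c_sketch(uint32_t *dst, const uint32_t *src,
                              size_t size, uint32_t carry)
{
    for (size_t i = 0; i < size; i++) {
        uint32_t s = src[i];
        uint32_t l = s - carry;       /* apply the carry from the limb below */
        carry = (l > s);              /* 1 if that subtraction wrapped */

        l *= INVERSE_3;               /* the multiply (imul in the asm) */
        dst[i] = l;

        /* Two compares stand in for the two cmp/sbbl pairs: a quotient
           limb above 1/3 or 2/3 of the limb range means 3*l overflowed
           once or twice, which the next limb has to pay back. */
        carry += (l > ONE_THIRD);
        carry += (l > TWO_THIRDS);
    }
    return carry;
}
```

For example, the two-limb value {15, 3} (that is, 3*2^32 + 15) should give the quotient {5, 1} with a returned carry of 0. The multiply and the compares form one dependent chain from limb to limb, which is why the comments above count cycles along that path.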