
Annotation of OpenXM_contrib/gmp/mpn/x86/p6/diveby3.asm, Revision 1.1.1.2

1.1       maekawa     1: dnl  Intel P6 mpn_divexact_by3 -- mpn division by 3, expecting no remainder.
                      2:
1.1.1.2 ! ohara       3: dnl  Copyright 2000, 2002 Free Software Foundation, Inc.
1.1       maekawa     4: dnl
                      5: dnl  This file is part of the GNU MP Library.
                      6: dnl
                      7: dnl  The GNU MP Library is free software; you can redistribute it and/or
                      8: dnl  modify it under the terms of the GNU Lesser General Public License as
                      9: dnl  published by the Free Software Foundation; either version 2.1 of the
                     10: dnl  License, or (at your option) any later version.
                     11: dnl
                     12: dnl  The GNU MP Library is distributed in the hope that it will be useful,
                     13: dnl  but WITHOUT ANY WARRANTY; without even the implied warranty of
                     14: dnl  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
                     15: dnl  Lesser General Public License for more details.
                     16: dnl
                     17: dnl  You should have received a copy of the GNU Lesser General Public
                     18: dnl  License along with the GNU MP Library; see the file COPYING.LIB.  If
                     19: dnl  not, write to the Free Software Foundation, Inc., 59 Temple Place -
                     20: dnl  Suite 330, Boston, MA 02111-1307, USA.
                     21:
1.1.1.2 ! ohara      22: include(`../config.m4')
1.1       maekawa    23:
                     24:
1.1.1.2 ! ohara      25: C P6: 8.5 cycles/limb
        !            26:
        !            27:
        !            28: C The P5 code runs well on P6, in fact better than anything else found so
        !            29: C far.  An imul is 4 cycles, meaning the two cmp/sbbl pairs on the dependent
        !            30: C path are taking 4.5 cycles.
        !            31: C
        !            32: C The destination cache line prefetching is unnecessary on P6, but removing
        !            33: C it is a 2 cycle slowdown (approx), so it must be inducing something good
        !            34: C in the out of order execution.
1.1       maekawa    35:
                     36: MULFUNC_PROLOGUE(mpn_divexact_by3c)
                     37: include_mpn(`x86/pentium/diveby3.asm')
