Annotation of OpenXM_contrib/gmp/mpn/x86/p6/diveby3.asm, Revision 1.1 (maekawa)

dnl  Intel P6 mpn_divexact_by3 -- mpn division by 3, expecting no remainder.
dnl
dnl  P6: 8.5 cycles/limb


dnl  Copyright (C) 2000 Free Software Foundation, Inc.
dnl
dnl  This file is part of the GNU MP Library.
dnl
dnl  The GNU MP Library is free software; you can redistribute it and/or
dnl  modify it under the terms of the GNU Lesser General Public License as
dnl  published by the Free Software Foundation; either version 2.1 of the
dnl  License, or (at your option) any later version.
dnl
dnl  The GNU MP Library is distributed in the hope that it will be useful,
dnl  but WITHOUT ANY WARRANTY; without even the implied warranty of
dnl  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
dnl  Lesser General Public License for more details.
dnl
dnl  You should have received a copy of the GNU Lesser General Public
dnl  License along with the GNU MP Library; see the file COPYING.LIB.  If
dnl  not, write to the Free Software Foundation, Inc., 59 Temple Place -
dnl  Suite 330, Boston, MA 02111-1307, USA.


dnl  The P5 code runs well on P6, in fact better than anything else found so
dnl  far.  An imul is 4 cycles, meaning the two cmp/sbbl pairs on the
dnl  dependent path are taking 4.5 cycles.
dnl
dnl  The destination cache line prefetching is unnecessary on P6, but
dnl  removing it is a 2 cycle slowdown (approx), so it must be inducing
dnl  something good in the out of order execution.

include(`../config.m4')

MULFUNC_PROLOGUE(mpn_divexact_by3c)
include_mpn(`x86/pentium/diveby3.asm')
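
The comments in this file mention one imul and two cmp/sbbl pairs per limb on the dependent path. As a rough illustration of where those operations come from, here is a minimal C sketch of exact division by 3 using multiplication by the inverse of 3 modulo 2^32. It is modelled on the general technique, not copied from the included Pentium assembly. The function name divexact_by3c_sketch, the macro names, and the choice of 32-bit limbs stored least significant first are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inverse of 3 modulo 2^32: 3 * 0xAAAAAAAB == 1 (mod 2^32). */
#define INVERSE_3   UINT32_C(0xAAAAAAAB)
#define ONE_THIRD   UINT32_C(0x55555555)  /* floor((2^32 - 1) / 3) */
#define TWO_THIRDS  UINT32_C(0xAAAAAAAA)  /* 2 * ONE_THIRD */

/* Hypothetical helper: divide {src, size} by 3, least significant limb
   first, with an incoming carry of 0, 1 or 2.  Writes the quotient limbs
   to dst and returns the final carry; a zero return means the division
   was exact. */
uint32_t divexact_by3c_sketch(uint32_t *dst, const uint32_t *src,
                              size_t size, uint32_t carry)
{
    assert(carry <= 2);
    for (size_t i = 0; i < size; i++) {
        uint32_t s = src[i];
        uint32_t l = s - carry;      /* subtract the carry from the limb below */
        carry = (l > s);             /* borrow out of that subtraction */
        l *= INVERSE_3;              /* the multiply: quotient limb mod 2^32 */
        dst[i] = l;
        /* 3*l overflows 2^32 zero, one or two times; each overflow must be
           taken from the next limb.  These two comparisons play the role of
           the two cmp/sbbl pairs. */
        carry += (l > ONE_THIRD);
        carry += (l > TWO_THIRDS);
    }
    return carry;
}
```

For example, with src = {2, 1} (the value 2^32 + 2) and an incoming carry of 0, the sketch stores {0x55555556, 0} in dst and returns 0, matching 3 * 0x55555556 = 2^32 + 2. Each iteration depends on the carry produced by the previous one, which is why the multiply and the two comparisons sit on the dependent path.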