Annotation of OpenXM_contrib/gmp/mpn/x86/p6/diveby3.asm, Revision 1.1.1.1

dnl  Intel P6 mpn_divexact_by3 -- mpn division by 3, expecting no remainder.
dnl
dnl  P6: 8.5 cycles/limb


dnl  Copyright (C) 2000 Free Software Foundation, Inc.
dnl
dnl  This file is part of the GNU MP Library.
dnl
dnl  The GNU MP Library is free software; you can redistribute it and/or
dnl  modify it under the terms of the GNU Lesser General Public License as
dnl  published by the Free Software Foundation; either version 2.1 of the
dnl  License, or (at your option) any later version.
dnl
dnl  The GNU MP Library is distributed in the hope that it will be useful,
dnl  but WITHOUT ANY WARRANTY; without even the implied warranty of
dnl  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
dnl  Lesser General Public License for more details.
dnl
dnl  You should have received a copy of the GNU Lesser General Public
dnl  License along with the GNU MP Library; see the file COPYING.LIB.  If
dnl  not, write to the Free Software Foundation, Inc., 59 Temple Place -
dnl  Suite 330, Boston, MA 02111-1307, USA.


dnl  The P5 code runs well on P6, in fact better than anything else found so
dnl  far.  An imul is 4 cycles, meaning the two cmp/sbbl pairs on the
dnl  dependent path are taking 4.5 cycles.
dnl
dnl  The destination cache line prefetching is unnecessary on P6, but
dnl  removing it is a 2 cycle slowdown (approx), so it must be inducing
dnl  something good in the out of order execution.

include(`../config.m4')

MULFUNC_PROLOGUE(mpn_divexact_by3c)
include_mpn(`x86/pentium/diveby3.asm')
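
The comments above mention one imul plus two cmp/sbbl pairs per limb on the dependent path. As a rough illustration of what such a loop computes, here is a minimal C sketch of exact division by 3 with 32-bit limbs, modeled on the usual multiply-by-inverse technique. It is not the included Pentium assembly. The function name `divexact_by3c_sketch`, the `uint32_t` limb type, and the macro names are assumptions made for this example; the two comparisons are presumably what the cmp/sbbl pairs implement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, chosen for this sketch only. */
#define INVERSE_3  UINT32_C(0xAAAAAAAB)   /* 3 * 0xAAAAAAAB == 1 (mod 2^32) */
#define ONE_THIRD  (UINT32_MAX / 3)       /* 0x55555555 */
#define TWO_THIRDS (ONE_THIRD * 2)        /* 0xAAAAAAAA */

/* Divide the size-limb number at src (least significant limb first) by 3,
   writing the quotient limbs to dst.  carry is the incoming borrow (0, 1
   or 2).  Returns the outgoing borrow, which is 0 when the division was
   exact. */
uint32_t divexact_by3c_sketch(uint32_t *dst, const uint32_t *src,
                              size_t size, uint32_t carry)
{
    assert(size >= 1);

    for (size_t i = 0; i < size; i++) {
        uint32_t s = src[i];

        /* Subtract the borrow from the previous limb. */
        uint32_t l = s - carry;
        carry = (l > s);            /* 1 if the subtraction wrapped */

        /* Multiply by the inverse of 3 modulo 2^32 (the imul step). */
        l *= INVERSE_3;
        dst[i] = l;

        /* 3*l exceeds one limb once or twice depending on the size of l;
           each overflow is a borrow from the next limb. */
        carry += (l > ONE_THIRD);
        carry += (l > TWO_THIRDS);
    }
    return carry;
}
```

For example, the two-limb value 2^32 + 2 (limbs `{2, 1}`) divides to the single nonzero limb `0x55555556` with a returned borrow of 0, while a lone limb `1` is not a multiple of 3 and yields a nonzero return value.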
