Annotation of OpenXM_contrib/gmp/mpn/x86/p6/diveby3.asm, Revision 1.1
1.1 ! maekawa 1: dnl Intel P6 mpn_divexact_by3 -- mpn division by 3, expecting no remainder.
! 2: dnl
! 3: dnl P6: 8.5 cycles/limb
! 4:
! 5:
! 6: dnl Copyright (C) 2000 Free Software Foundation, Inc.
! 7: dnl
! 8: dnl This file is part of the GNU MP Library.
! 9: dnl
! 10: dnl The GNU MP Library is free software; you can redistribute it and/or
! 11: dnl modify it under the terms of the GNU Lesser General Public License as
! 12: dnl published by the Free Software Foundation; either version 2.1 of the
! 13: dnl License, or (at your option) any later version.
! 14: dnl
! 15: dnl The GNU MP Library is distributed in the hope that it will be useful,
! 16: dnl but WITHOUT ANY WARRANTY; without even the implied warranty of
! 17: dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
! 18: dnl Lesser General Public License for more details.
! 19: dnl
! 20: dnl You should have received a copy of the GNU Lesser General Public
! 21: dnl License along with the GNU MP Library; see the file COPYING.LIB. If
! 22: dnl not, write to the Free Software Foundation, Inc., 59 Temple Place -
! 23: dnl Suite 330, Boston, MA 02111-1307, USA.
! 24:
! 25:
! 26: dnl The P5 code runs well on P6, in fact better than anything else found so
! 27: dnl far. An imul is 4 cycles, so of the 8.5 cycles/limb the two cmp/sbbl
! 28: dnl pairs on the dependent path are taking the remaining 4.5 cycles.
! 29: dnl
! 30: dnl The destination cache line prefetching is unnecessary on P6, but
! 31: dnl removing it is a 2-cycle slowdown (approx), so it must be inducing
! 32: dnl something good in the out-of-order execution.
! 33:
! 34: include(`../config.m4')
! 35:
! 36: MULFUNC_PROLOGUE(mpn_divexact_by3c)
! 37: include_mpn(`x86/pentium/diveby3.asm')
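
The comments above mention one imul and two cmp/sbbl pairs on the dependent path of each limb. As a rough illustration of what such a loop computes, here is a minimal C sketch of exact division by 3 using multiplication by the inverse of 3 modulo 2^32. It is not the assembly from x86/pentium/diveby3.asm: the function name divexact_by3_sketch, the fixed 32-bit limb type and the constant names are assumptions made for this example. The multiply presumably corresponds to the imul, and the two threshold comparisons to the two cmp/sbbl pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, chosen for this sketch only. */
#define INVERSE_3       0xAAAAAAABu  /* 3 * 0xAAAAAAAB == 1 (mod 2^32) */
#define ONE_THIRD_CEIL  0x55555556u  /* ceil(2^32 / 3)     */
#define TWO_THIRDS_CEIL 0xAAAAAAABu  /* ceil(2 * 2^32 / 3) */

/* Divide the n-limb number at src (least significant limb first) by 3,
   writing n quotient limbs to dst. c is the incoming carry (0, 1 or 2).
   The return value is the outgoing carry; it is 0 when the division
   was exact. */
static uint32_t divexact_by3_sketch(uint32_t *dst, const uint32_t *src,
                                    size_t n, uint32_t c)
{
    assert(n >= 1);
    for (size_t i = 0; i < n; i++) {
        uint32_t s = src[i];
        uint32_t l = s - c;          /* subtract the carry from below */
        c = (l > s);                 /* 1 if that subtraction borrowed */
        uint32_t q = l * INVERSE_3;  /* quotient limb, modulo 2^32 */
        dst[i] = q;
        /* 3*q == l + k*2^32 with k in {0,1,2}; k must be taken from the
           next limb, and these two comparisons recover it. */
        c += (q >= ONE_THIRD_CEIL);
        c += (q >= TWO_THIRDS_CEIL);
    }
    return c;
}
```

Because each limb's carry depends on the quotient limb just computed, the multiply and both comparisons sit on one serial dependency chain, which is consistent with the per-limb cycle accounting given in the comments.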