version 1.1.1.2, 2000/12/01 05:44:50 |
version 1.1.1.3, 2003/08/25 16:06:18 |
|
|
dnl Alpha ev6 mpn_addmul_1 -- Multiply a limb vector with a limb and add |
dnl Alpha ev6 mpn_addmul_1 -- Multiply a limb vector with a limb and add |
dnl the result to a second limb vector. |
dnl the result to a second limb vector. |
|
|
dnl Copyright (C) 2000 Free Software Foundation, Inc. |
dnl Copyright 2000 Free Software Foundation, Inc. |
|
|
dnl This file is part of the GNU MP Library. |
dnl This file is part of the GNU MP Library. |
|
|
|
|
dnl s2_limb r19 |
dnl s2_limb r19 |
|
|
dnl This code runs at 42 cycles/limb on EV4, 18 cycles/limb on EV5, and |
dnl This code runs at 42 cycles/limb on EV4, 18 cycles/limb on EV5, and |
dnl exactly 3.625 cycles/limb on EV6... |
dnl exactly 3.5 cycles/limb on EV6... |
|
|
dnl This code was written in close cooperation with ev6 pipeline expert |
dnl This code was written in close cooperation with ev6 pipeline expert |
dnl Steve Root (root@toober.hlo.dec.com). Any errors are tege's fault, though. |
dnl Steve Root. Any errors are tege's fault, though. |
dnl |
dnl |
dnl Register usages for unrolled loop: |
dnl Register usages for unrolled loop: |
dnl 0-3 mul's |
dnl 0-3 mul's |
dnl 8-15 mul results |
dnl 8-15 mul results |
dnl 20,21 carry's |
dnl 20,21 carry's |
dnl 22,23 save for stores |
dnl 22,23 save for stores |
|
|
dnl Sustains 8 mul-adds in 29 cycles in the unrolled inner loop. |
dnl Sustains 8 mul-adds in 28 cycles in the unrolled inner loop. |
|
|
dnl The stores can issue a cycle late so we have paired no-op's to 'catch' |
dnl The stores can issue a cycle late so we have paired no-op's to 'catch' |
dnl them, so that further disturbance to the schedule is damped. |
dnl them, so that further disturbance to the schedule is damped. |
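
For orientation, the operation named in the header ("multiply a limb vector with a limb and add the result to a second limb vector") can be sketched in portable C. This is a reference sketch only, not the scheduled ev6 code from the file: it assumes 64-bit limbs, and the names `umulh64`, `ref_addmul_1`, `rp`, `s1p`, and `n` are hypothetical, chosen for illustration (only `s2_limb` appears in the header above).

```c
#include <stddef.h>
#include <stdint.h>

/* High 64 bits of a 64x64-bit product, built from 32-bit halves.
   Hypothetical helper; it stands in for a hardware high-multiply. */
static uint64_t umulh64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;
    uint64_t ll = a_lo * b_lo;
    uint64_t lh = a_lo * b_hi;
    uint64_t hl = a_hi * b_lo;
    uint64_t hh = a_hi * b_hi;
    /* Sum the three contributions to bits 32..63; cannot overflow. */
    uint64_t mid = (ll >> 32) + (lh & 0xffffffffu) + (hl & 0xffffffffu);
    return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}

/* rp[0..n-1] += s1p[0..n-1] * s2_limb; returns the carry-out limb.
   Assumption: limbs are stored least significant first. */
uint64_t ref_addmul_1(uint64_t *rp, const uint64_t *s1p, size_t n,
                      uint64_t s2_limb)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t lo = s1p[i] * s2_limb;          /* low product half  */
        uint64_t hi = umulh64(s1p[i], s2_limb);  /* high product half */
        lo += carry;
        hi += (lo < carry);   /* carry out of adding the incoming carry */
        uint64_t sum = rp[i] + lo;
        hi += (sum < lo);     /* carry out of adding the destination limb */
        rp[i] = sum;
        carry = hi;           /* high half feeds the next limb */
    }
    return carry;
}
```

Each iteration is one "mul-add" in the sense of the header: two multiply halves plus two carry-propagating additions. The unrolled loop described above interleaves eight of these to hide multiply latency, which this straight-line sketch does not attempt.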