version 1.1.1.1, 2000/01/10 15:35:24
version 1.1.1.2, 2000/09/09 14:12:28
Line 15 dependent instruction really far from each other.

STATUS

1. mpn_mul_1 could be improved to 6.5 cycles/limb on the PA7100, using the
   instructions below (but some sw pipelining is needed to avoid the
   xmpyu-fstds delay):

	fldds	s1_ptr

	stws	res_ptr

	addib

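For reference, the operation that the loop above has to perform per limb can be sketched in portable C. This is only an illustration of what mpn_mul_1 computes (multiply a limb vector by a single limb and return the carry-out limb), assuming 32-bit limbs stored least significant first. The names limb_t and ref_mul_1 are hypothetical and not taken from the GMP sources; the 64-bit product stands in for the 32x32->64 multiply that xmpyu provides.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type, for illustration only. */
typedef uint32_t limb_t;

/* Multiply the n-limb vector s1[] (least significant limb first) by the
   single limb s2, store the low n limbs of the product in res[], and
   return the most significant (carry-out) limb. */
static limb_t ref_mul_1(limb_t *res, const limb_t *s1, size_t n, limb_t s2)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 32x32->64 product plus incoming carry; this cannot overflow
           64 bits, since (2^32-1)^2 + (2^32-1) < 2^64. */
        uint64_t prod = (uint64_t)s1[i] * s2 + carry;
        res[i] = (limb_t)prod;          /* low half goes to the result */
        carry = (limb_t)(prod >> 32);   /* high half carries into the next limb */
    }
    return carry;
}
```

The cycle count in the note concerns how one iteration of this loop is scheduled on the PA7100, not the arithmetic itself.
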
3. For the PA8000 we have to stick to 32-bit limbs until compiler support
   emerges. But we want to use 64-bit operations whenever possible, in
   particular for loads and stores. It is possible to handle mpn_add_n
   efficiently by rotating (when s1/s2 are aligned) or by masking + bit-field
   inserting (when they are not). The speed should double compared to the
   code used today.
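
A minimal C sketch of the idea, under stated assumptions: limbs are 32 bits, stored least significant first, and two adjacent limbs are combined into one 64-bit value so that a single 64-bit add handles both. The names limb_t, load_pair, store_pair and add_n_pairs are hypothetical. The pair is built with explicit shifts to keep the sketch portable; on a big-endian machine a raw 64-bit load would put the lower limb in the upper half of the register, which is presumably where the 32-bit rotate mentioned above comes in.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type, for illustration only. */
typedef uint32_t limb_t;

/* Combine limbs p[0] (low) and p[1] (high) into one 64-bit value. */
static uint64_t load_pair(const limb_t *p)
{
    return (uint64_t)p[0] | ((uint64_t)p[1] << 32);
}

/* Split a 64-bit value back into two limbs, low limb first. */
static void store_pair(limb_t *p, uint64_t v)
{
    p[0] = (limb_t)v;
    p[1] = (limb_t)(v >> 32);
}

/* res[] = s1[] + s2[] over n limbs; returns the carry-out (0 or 1).
   Limbs are processed two at a time with 64-bit arithmetic. */
static limb_t add_n_pairs(limb_t *res, const limb_t *s1,
                          const limb_t *s2, size_t n)
{
    unsigned carry = 0;
    size_t i = 0;

    for (; i + 2 <= n; i += 2) {
        uint64_t a = load_pair(s1 + i);
        uint64_t b = load_pair(s2 + i);
        uint64_t sum = a + b;
        unsigned c1 = sum < a;           /* wrapped while adding b */
        uint64_t total = sum + carry;
        unsigned c2 = total < sum;       /* wrapped while adding the carry */
        store_pair(res + i, total);
        carry = c1 | c2;                 /* at most one of the two can be set */
    }
    if (i < n) {                         /* odd limb left over */
        uint64_t sum = (uint64_t)s1[i] + s2[i] + carry;
        res[i] = (limb_t)sum;
        carry = (unsigned)(sum >> 32);
    }
    return (limb_t)carry;
}
```

Each loop iteration covers two limbs with one add-with-carry step, which is the source of the expected factor of two; the unaligned case (masking and bit-field inserting) is not covered by this sketch.
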