OpenXM_contrib/gmp/gmp.info-5 - annotate

Return to gmp.info-5 CVS log
Up to [local] / OpenXM_contrib / gmp
Annotation of OpenXM_contrib/gmp/gmp.info-5, Revision 1.1.1.1

1.1       ohara       1: This is gmp.info, produced by makeinfo version 4.2 from gmp.texi.
                      2:
                      3: This manual describes how to install and use the GNU multiple precision
                      4: arithmetic library, version 4.1.2.
                      5:
                      6:    Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
                      7: 2001, 2002 Free Software Foundation, Inc.
                      8:
                      9:    Permission is granted to copy, distribute and/or modify this
                     10: document under the terms of the GNU Free Documentation License, Version
                     11: 1.1 or any later version published by the Free Software Foundation;
                     12: with no Invariant Sections, with the Front-Cover Texts being "A GNU
                     13: Manual", and with the Back-Cover Texts being "You have freedom to copy
                     14: and modify this GNU Manual, like GNU software".  A copy of the license
                     15: is included in *Note GNU Free Documentation License::.
                     16: INFO-DIR-SECTION GNU libraries
                     17: START-INFO-DIR-ENTRY
                     18: * gmp: (gmp).                   GNU Multiple Precision Arithmetic Library.
                     19: END-INFO-DIR-ENTRY
                     20:
                     21: 
                     22: File: gmp.info,  Node: C++ Interface General,  Next: C++ Interface Integers,  Prev: C++ Class Interface,  Up: C++ Class Interface
                     23:
                     24: C++ Interface General
                     25: =====================
                     26:
                     27: All the C++ classes and functions are available with
                     28:
                     29:      #include <gmpxx.h>
                     30:
                     31:    Programs should be linked with the `libgmpxx' and `libgmp'
                     32: libraries.  For example,
                     33:
                     34:      g++ mycxxprog.cc -lgmpxx -lgmp
                     35:
                     36: The classes defined are
                     37:
                     38:  - Class: mpz_class
                     39:  - Class: mpq_class
                     40:  - Class: mpf_class
                     41:
                     42:    The standard operators and various standard functions are overloaded
                     43: to allow arithmetic with these classes.  For example,
                     44:
                     45:      int
                     46:      main (void)
                     47:      {
                     48:        mpz_class a, b, c;
                     49:
                     50:        a = 1234;
                     51:        b = "-5678";
                     52:        c = a+b;
                     53:        cout << "sum is " << c << "\n";
                     54:        cout << "absolute value is " << abs(c) << "\n";
                     55:
                     56:        return 0;
                     57:      }
                     58:
                     59:    An important feature of the implementation is that an expression like
                     60: `a=b+c' results in a single call to the corresponding `mpz_add',
                     61: without using a temporary for the `b+c' part.  Expressions which by
                     62: their nature imply intermediate values, like `a=b*c+d*e', still use
                     63: temporaries though.
                     64:
                     65:    The classes can be freely intermixed in expressions, as can the
                     66: classes and the standard types `long', `unsigned long' and `double'.
                     67: Smaller types like `int' or `float' can also be intermixed, since C++
                     68: will promote them.
                     69:
                     70:    Note that `bool' is not accepted directly, but must be explicitly
                     71: cast to an `int' first.  This is because C++ will automatically convert
                     72: any pointer to a `bool', so if GMP accepted `bool' it would make all
                     73: sorts of invalid class and pointer combinations compile but almost
                     74: certainly not do anything sensible.
                     75:
                     76:    Conversions back from the classes to standard C++ types aren't done
                     77: automatically, instead member functions like `get_si' are provided (see
                     78: the following sections for details).
                     79:
                     80:    Also there are no automatic conversions from the classes to the
                     81: corresponding GMP C types, instead a reference to the underlying C
                     82: object can be obtained with the following functions,
                     83:
                     84:  - Function: mpz_t mpz_class::get_mpz_t ()
                     85:  - Function: mpq_t mpq_class::get_mpq_t ()
                     86:  - Function: mpf_t mpf_class::get_mpf_t ()
                     87:
                     88:    These can be used to call a C function which doesn't have a C++ class
                     89: interface.  For example to set `a' to the GCD of `b' and `c',
                     90:
                     91:      mpz_class a, b, c;
                     92:      ...
                     93:      mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
                     94:
                     95:    In the other direction, a class can be initialized from the
                     96: corresponding GMP C type, or assigned to if an explicit constructor is
                     97: used.  In both cases this makes a copy of the value, it doesn't create
                     98: any sort of association.  For example,
                     99:
                    100:      mpz_t z;
                    101:      // ... init and calculate z ...
                    102:      mpz_class x(z);
                    103:      mpz_class y;
                    104:      y = mpz_class (z);
                    105:
                    106:    There are no namespace setups in `gmpxx.h', all types and functions
                    107: are simply put into the global namespace.  This is what `gmp.h' has
                    108: done in the past, and continues to do for compatibility.  The extras
                    109: provided by `gmpxx.h' follow GMP naming conventions and are unlikely to
                    110: clash with anything.
                    111:
                    112: 
                    113: File: gmp.info,  Node: C++ Interface Integers,  Next: C++ Interface Rationals,  Prev: C++ Interface General,  Up: C++ Class Interface
                    114:
                    115: C++ Interface Integers
                    116: ======================
                    117:
                    118:  - Function: void mpz_class::mpz_class (type N)
                    119:      Construct an `mpz_class'.  All the standard C++ types may be used,
                    120:      except `long long' and `long double', and all the GMP C++ classes
                    121:      can be used.  Any necessary conversion follows the corresponding C
                    122:      function, for example `double' follows `mpz_set_d' (*note
                    123:      Assigning Integers::).
                    124:
                    125:  - Function: void mpz_class::mpz_class (mpz_t Z)
                    126:      Construct an `mpz_class' from an `mpz_t'.  The value in Z is
                    127:      copied into the new `mpz_class', there won't be any permanent
                    128:      association between it and Z.
                    129:
                    130:  - Function: void mpz_class::mpz_class (const char *S)
                    131:  - Function: void mpz_class::mpz_class (const char *S, int base)
                    132:  - Function: void mpz_class::mpz_class (const string& S)
                    133:  - Function: void mpz_class::mpz_class (const string& S, int base)
                    134:      Construct an `mpz_class' converted from a string using
                    135:      `mpz_set_str', (*note Assigning Integers::).  If the BASE is not
                    136:      given then 0 is used.
                    137:
                    138:  - Function: mpz_class operator/ (mpz_class A, mpz_class D)
                    139:  - Function: mpz_class operator% (mpz_class A, mpz_class D)
                    140:      Divisions involving `mpz_class' round towards zero, as per the
                    141:      `mpz_tdiv_q' and `mpz_tdiv_r' functions (*note Integer Division::).
                    142:      This corresponds to the rounding used for plain `int' calculations
                    143:      on most machines.
                    144:
                    145:      The `mpz_fdiv...' or `mpz_cdiv...' functions can always be called
                    146:      directly if desired.  For example,
                    147:
                    148:           mpz_class q, a, d;
                    149:           ...
                    150:           mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
                    151:
                    152:  - Function: mpz_class abs (mpz_class OP1)
                    153:  - Function: int cmp (mpz_class OP1, type OP2)
                    154:  - Function: int cmp (type OP1, mpz_class OP2)
                    155:  - Function: double mpz_class::get_d (void)
                    156:  - Function: long mpz_class::get_si (void)
                    157:  - Function: unsigned long mpz_class::get_ui (void)
                    158:  - Function: bool mpz_class::fits_sint_p (void)
                    159:  - Function: bool mpz_class::fits_slong_p (void)
                    160:  - Function: bool mpz_class::fits_sshort_p (void)
                    161:  - Function: bool mpz_class::fits_uint_p (void)
                    162:  - Function: bool mpz_class::fits_ulong_p (void)
                    163:  - Function: bool mpz_class::fits_ushort_p (void)
                    164:  - Function: int sgn (mpz_class OP)
                    165:  - Function: mpz_class sqrt (mpz_class OP)
                    166:      These functions provide a C++ class interface to the corresponding
                    167:      GMP C routines.
                    168:
                    169:      `cmp' can be used with any of the classes or the standard C++
                    170:      types, except `long long' and `long double'.
                    171:
                    172:
                    173:    Overloaded operators for combinations of `mpz_class' and `double'
                    174: are provided for completeness, but it should be noted that if the given
                    175: `double' is not an integer then the way any rounding is done is
                    176: currently unspecified.  The rounding might take place at the start, in
                    177: the middle, or at the end of the operation, and it might change in the
                    178: future.
                    179:
                    180:    Conversions between `mpz_class' and `double', however, are defined
                    181: to follow the corresponding C functions `mpz_get_d' and `mpz_set_d'.
                    182: And comparisons are always made exactly, as per `mpz_cmp_d'.
                    183:
                    184: 
                    185: File: gmp.info,  Node: C++ Interface Rationals,  Next: C++ Interface Floats,  Prev: C++ Interface Integers,  Up: C++ Class Interface
                    186:
                    187: C++ Interface Rationals
                    188: =======================
                    189:
                    190:    In all the following constructors, if a fraction is given then it
                    191: should be in canonical form, or if not then `mpq_class::canonicalize'
                    192: called.
                    193:
                    194:  - Function: void mpq_class::mpq_class (type OP)
                    195:  - Function: void mpq_class::mpq_class (integer NUM, integer DEN)
                    196:      Construct an `mpq_class'.  The initial value can be a single value
                    197:      of any type, or a pair of integers (`mpz_class' or standard C++
                    198:      integer types) representing a fraction, except that `long long'
                    199:      and `long double' are not supported.  For example,
                    200:
                    201:           mpq_class q (99);
                    202:           mpq_class q (1.75);
                    203:           mpq_class q (1, 3);
                    204:
                    205:  - Function: void mpq_class::mpq_class (mpq_t Q)
                    206:      Construct an `mpq_class' from an `mpq_t'.  The value in Q is
                    207:      copied into the new `mpq_class', there won't be any permanent
                    208:      association between it and Q.
                    209:
                    210:  - Function: void mpq_class::mpq_class (const char *S)
                    211:  - Function: void mpq_class::mpq_class (const char *S, int base)
                    212:  - Function: void mpq_class::mpq_class (const string& S)
                    213:  - Function: void mpq_class::mpq_class (const string& S, int base)
                    214:      Construct an `mpq_class' converted from a string using
                    215:      `mpq_set_str', (*note Initializing Rationals::).  If the BASE is
                    216:      not given then 0 is used.
                    217:
                    218:  - Function: void mpq_class::canonicalize ()
                    219:      Put an `mpq_class' into canonical form, as per *Note Rational
                    220:      Number Functions::.  All arithmetic operators require their
                    221:      operands in canonical form, and will return results in canonical
                    222:      form.
                    223:
                    224:  - Function: mpq_class abs (mpq_class OP)
                    225:  - Function: int cmp (mpq_class OP1, type OP2)
                    226:  - Function: int cmp (type OP1, mpq_class OP2)
                    227:  - Function: double mpq_class::get_d (void)
                    228:  - Function: int sgn (mpq_class OP)
                    229:      These functions provide a C++ class interface to the corresponding
                    230:      GMP C routines.
                    231:
                    232:      `cmp' can be used with any of the classes or the standard C++
                    233:      types, except `long long' and `long double'.
                    234:
                    235:  - Function: mpz_class& mpq_class::get_num ()
                    236:  - Function: mpz_class& mpq_class::get_den ()
                    237:      Get a reference to an `mpz_class' which is the numerator or
                    238:      denominator of an `mpq_class'.  This can be used both for read and
                    239:      write access.  If the object returned is modified, it modifies the
                    240:      original `mpq_class'.
                    241:
                    242:      If direct manipulation might produce a non-canonical value, then
                    243:      `mpq_class::canonicalize' must be called before further operations.
                    244:
                    245:  - Function: mpz_t mpq_class::get_num_mpz_t ()
                    246:  - Function: mpz_t mpq_class::get_den_mpz_t ()
                    247:      Get a reference to the underlying `mpz_t' numerator or denominator
                    248:      of an `mpq_class'.  This can be passed to C functions expecting an
                    249:      `mpz_t'.  Any modifications made to the `mpz_t' will modify the
                    250:      original `mpq_class'.
                    251:
                    252:      If direct manipulation might produce a non-canonical value, then
                    253:      `mpq_class::canonicalize' must be called before further operations.
                    254:
                    255:  - Function: istream& operator>> (istream& STREAM, mpq_class& ROP);
                    256:      Read ROP from STREAM, using its `ios' formatting settings, the
                    257:      same as `mpq_t operator>>' (*note C++ Formatted Input::).
                    258:
                    259:      If the ROP read might not be in canonical form then
                    260:      `mpq_class::canonicalize' must be called.
                    261:
                    262: 
                    263: File: gmp.info,  Node: C++ Interface Floats,  Next: C++ Interface MPFR,  Prev: C++ Interface Rationals,  Up: C++ Class Interface
                    264:
                    265: C++ Interface Floats
                    266: ====================
                    267:
                    268:    When an expression requires the use of temporary intermediate
                    269: `mpf_class' values, like `f=g*h+x*y', those temporaries will have the
                    270: same precision as the destination `f'.  Explicit constructors can be
                    271: used if this doesn't suit.
                    272:
                    273:  - Function:  mpf_class::mpf_class (type OP)
                    274:  - Function:  mpf_class::mpf_class (type OP, unsigned long PREC)
                    275:      Construct an `mpf_class'.  Any standard C++ type can be used,
                    276:      except `long long' and `long double', and any of the GMP C++
                    277:      classes can be used.
                    278:
                    279:      If PREC is given, the initial precision is that value, in bits.  If
                    280:      PREC is not given, then the initial precision is determined by the
                    281:      type of OP given.  An `mpz_class', `mpq_class', string, or C++
                    282:      builtin type will give the default `mpf' precision (*note
                    283:      Initializing Floats::).  An `mpf_class' or expression will give
                    284:      the precision of that value.  The precision of a binary expression
                    285:      is the higher of the two operands.
                    286:
                    287:           mpf_class f(1.5);        // default precision
                    288:           mpf_class f(1.5, 500);   // 500 bits (at least)
                    289:           mpf_class f(x);          // precision of x
                    290:           mpf_class f(abs(x));     // precision of x
                    291:           mpf_class f(-g, 1000);   // 1000 bits (at least)
                    292:           mpf_class f(x+y);        // greater of precisions of x and y
                    293:
                    294:  - Function: mpf_class abs (mpf_class OP)
                    295:  - Function: mpf_class ceil (mpf_class OP)
                    296:  - Function: int cmp (mpf_class OP1, type OP2)
                    297:  - Function: int cmp (type OP1, mpf_class OP2)
                    298:  - Function: mpf_class floor (mpf_class OP)
                    299:  - Function: mpf_class hypot (mpf_class OP1, mpf_class OP2)
                    300:  - Function: double mpf_class::get_d (void)
                    301:  - Function: long mpf_class::get_si (void)
                    302:  - Function: unsigned long mpf_class::get_ui (void)
                    303:  - Function: bool mpf_class::fits_sint_p (void)
                    304:  - Function: bool mpf_class::fits_slong_p (void)
                    305:  - Function: bool mpf_class::fits_sshort_p (void)
                    306:  - Function: bool mpf_class::fits_uint_p (void)
                    307:  - Function: bool mpf_class::fits_ulong_p (void)
                    308:  - Function: bool mpf_class::fits_ushort_p (void)
                    309:  - Function: int sgn (mpf_class OP)
                    310:  - Function: mpf_class sqrt (mpf_class OP)
                    311:  - Function: mpf_class trunc (mpf_class OP)
                    312:      These functions provide a C++ class interface to the corresponding
                    313:      GMP C routines.
                    314:
                    315:      `cmp' can be used with any of the classes or the standard C++
                    316:      types, except `long long' and `long double'.
                    317:
                    318:      The accuracy provided by `hypot' is not currently guaranteed.
                    319:
                    320:  - Function: unsigned long int mpf_class::get_prec ()
                    321:  - Function: void mpf_class::set_prec (unsigned long PREC)
                    322:  - Function: void mpf_class::set_prec_raw (unsigned long PREC)
                    323:      Get or set the current precision of an `mpf_class'.
                    324:
                    325:      The restrictions described for `mpf_set_prec_raw' (*note
                    326:      Initializing Floats::) apply to `mpf_class::set_prec_raw'.  Note
                    327:      in particular that the `mpf_class' must be restored to it's
                    328:      allocated precision before being destroyed.  This must be done by
                    329:      application code, there's no automatic mechanism for it.
                    330:
                    331: 
                    332: File: gmp.info,  Node: C++ Interface MPFR,  Next: C++ Interface Random Numbers,  Prev: C++ Interface Floats,  Up: C++ Class Interface
                    333:
                    334: C++ Interface MPFR
                    335: ==================
                    336:
                    337:    The C++ class interface to MPFR is provided if MPFR is enabled
                    338: (*note Build Options::).  This interface must be regarded as
                    339: preliminary and possibly subject to incompatible changes in the future,
                    340: since MPFR itself is preliminary.  All definitions can be obtained with
                    341:
                    342:      #include <mpfrxx.h>
                    343:
                    344: This defines
                    345:
                    346:  - Class: mpfr_class
                    347:
                    348: which behaves similarly to `mpf_class' (*note C++ Interface Floats::).
                    349:
                    350: 
                    351: File: gmp.info,  Node: C++ Interface Random Numbers,  Next: C++ Interface Limitations,  Prev: C++ Interface MPFR,  Up: C++ Class Interface
                    352:
                    353: C++ Interface Random Numbers
                    354: ============================
                    355:
                    356:  - Class: gmp_randclass
                    357:      The C++ class interface to the GMP random number functions uses
                    358:      `gmp_randclass' to hold an algorithm selection and current state,
                    359:      as per `gmp_randstate_t'.
                    360:
                    361:  - Function:  gmp_randclass::gmp_randclass (void (*RANDINIT)
                    362:           (gmp_randstate_t, ...), ...)
                    363:      Construct a `gmp_randclass', using a call to the given RANDINIT
                    364:      function (*note Random State Initialization::).  The arguments
                    365:      expected are the same as RANDINIT, but with `mpz_class' instead of
                    366:      `mpz_t'.  For example,
                    367:
                    368:           gmp_randclass r1 (gmp_randinit_default);
                    369:           gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
                    370:           gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
                    371:
                    372:      `gmp_randinit_lc_2exp_size' can fail if the size requested is too
                    373:      big, the behaviour of `gmp_randclass::gmp_randclass' is undefined
                    374:      in this case (perhaps this will change in the future).
                    375:
                    376:  - Function:  gmp_randclass::gmp_randclass (gmp_randalg_t ALG, ...)
                    377:      Construct a `gmp_randclass' using the same parameters as
                    378:      `gmp_randinit' (*note Random State Initialization::).  This
                    379:      function is obsolete and the above RANDINIT style should be
                    380:      preferred.
                    381:
                    382:  - Function: void gmp_randclass::seed (unsigned long int S)
                    383:  - Function: void gmp_randclass::seed (mpz_class S)
                    384:      Seed a random number generator.  See *note Random Number
                    385:      Functions::, for how to choose a good seed.
                    386:
                    387:  - Function: mpz_class gmp_randclass::get_z_bits (unsigned long BITS)
                    388:  - Function: mpz_class gmp_randclass::get_z_bits (mpz_class BITS)
                    389:      Generate a random integer with a specified number of bits.
                    390:
                    391:  - Function: mpz_class gmp_randclass::get_z_range (mpz_class N)
                    392:      Generate a random integer in the range 0 to N-1 inclusive.
                    393:
                    394:  - Function: mpf_class gmp_randclass::get_f ()
                    395:  - Function: mpf_class gmp_randclass::get_f (unsigned long PREC)
                    396:      Generate a random float F in the range 0 <= F < 1.  F will be to
                    397:      PREC bits precision, or if PREC is not given then to the precision
                    398:      of the destination.  For example,
                    399:
                    400:           gmp_randclass  r;
                    401:           ...
                    402:           mpf_class  f (0, 512);   // 512 bits precision
                    403:           f = r.get_f();           // random number, 512 bits
                    404:
                    405: 
                    406: File: gmp.info,  Node: C++ Interface Limitations,  Prev: C++ Interface Random Numbers,  Up: C++ Class Interface
                    407:
                    408: C++ Interface Limitations
                    409: =========================
                    410:
                    411: `mpq_class' and Templated Reading
                    412:      A generic piece of template code probably won't know that
                    413:      `mpq_class' requires a `canonicalize' call if inputs read with
                    414:      `operator>>' might be non-canonical.  This can lead to incorrect
                    415:      results.
                    416:
                    417:      `operator>>' behaves as it does for reasons of efficiency.  A
                    418:      canonicalize can be quite time consuming on large operands, and is
                    419:      best avoided if it's not necessary.
                    420:
                    421:      But this potential difficulty reduces the usefulness of
                    422:      `mpq_class'.  Perhaps a mechanism to tell `operator>>' what to do
                    423:      will be adopted in the future, maybe a preprocessor define, a
                    424:      global flag, or an `ios' flag pressed into service.  Or maybe, at
                    425:      the risk of inconsistency, the `mpq_class' `operator>>' could
                    426:      canonicalize and leave `mpq_t' `operator>>' not doing so, for use
                    427:      on those occasions when that's acceptable.  Send feedback or
                    428:      alternate ideas to <bug-gmp@gnu.org>.
                    429:
                    430: Subclassing
                    431:      Subclassing the GMP C++ classes works, but is not currently
                    432:      recommended.
                    433:
                    434:      Expressions involving subclasses resolve correctly (or seem to),
                    435:      but in normal C++ fashion the subclass doesn't inherit
                    436:      constructors and assignments.  There's many of those in the GMP
                    437:      classes, and a good way to reestablish them in a subclass is not
                    438:      yet provided.
                    439:
                    440: Templated Expressions
                    441:      A subtle difficulty exists when using expressions together with
                    442:      application-defined template functions.  Consider the following,
                    443:      with `T' intended to be some numeric type,
                    444:
                    445:           template <class T>
                    446:           T fun (const T &, const T &);
                    447:
                    448:      When used with, say, plain `mpz_class' variables, it works fine:
                    449:      `T' is resolved as `mpz_class'.
                    450:
                    451:           mpz_class f(1), g(2);
                    452:           fun (f, g);    // Good
                    453:
                    454:      But when one of the arguments is an expression, it doesn't work.
                    455:
                    456:           mpz_class f(1), g(2), h(3);
                    457:           fun (f, g+h);  // Bad
                    458:
                    459:      This is because `g+h' ends up being a certain expression template
                    460:      type internal to `gmpxx.h', which the C++ template resolution
                    461:      rules are unable to automatically convert to `mpz_class'.  The
                    462:      workaround is simply to add an explicit cast.
                    463:
                    464:           mpz_class f(1), g(2), h(3);
                    465:           fun (f, mpz_class(g+h));  // Good
                    466:
                    467:      Similarly, within `fun' it may be necessary to cast an expression
                    468:      to type `T' when calling a templated `fun2'.
                    469:
                    470:           template <class T>
                    471:           void fun (T f, T g)
                    472:           {
                    473:             fun2 (f, f+g);     // Bad
                    474:           }
                    475:
                    476:           template <class T>
                    477:           void fun (T f, T g)
                    478:           {
                    479:             fun2 (f, T(f+g));  // Good
                    480:           }
                    481:
                    482: 
                    483: File: gmp.info,  Node: BSD Compatible Functions,  Next: Custom Allocation,  Prev: C++ Class Interface,  Up: Top
                    484:
                    485: Berkeley MP Compatible Functions
                    486: ********************************
                    487:
                    488:    These functions are intended to be fully compatible with the
                    489: Berkeley MP library which is available on many BSD derived U*ix
                    490: systems.  The `--enable-mpbsd' option must be used when building GNU MP
                    491: to make these available (*note Installing GMP::).
                    492:
                    493:    The original Berkeley MP library has a usage restriction: you cannot
                    494: use the same variable as both source and destination in a single
                    495: function call.  The compatible functions in GNU MP do not share this
                    496: restriction--inputs and outputs may overlap.
                    497:
                    498:    It is not recommended that new programs are written using these
                    499: functions.  Apart from the incomplete set of functions, the interface
                    500: for initializing `MINT' objects is more error prone, and the `pow'
                    501: function collides with `pow' in `libm.a'.
                    502:
                    503:    Include the header `mp.h' to get the definition of the necessary
                    504: types and functions.  If you are on a BSD derived system, make sure to
                    505: include GNU `mp.h' if you are going to link the GNU `libmp.a' to your
                    506: program.  This means that you probably need to give the `-I<dir>'
                    507: option to the compiler, where `<dir>' is the directory where you have
                    508: GNU `mp.h'.
                    509:
                    510:  - Function: MINT * itom (signed short int INITIAL_VALUE)
                    511:      Allocate an integer consisting of a `MINT' object and dynamic limb
                    512:      space.  Initialize the integer to INITIAL_VALUE.  Return a pointer
                    513:      to the `MINT' object.
                    514:
                    515:  - Function: MINT * xtom (char *INITIAL_VALUE)
                    516:      Allocate an integer consisting of a `MINT' object and dynamic limb
                    517:      space.  Initialize the integer from INITIAL_VALUE, a hexadecimal,
                    518:      null-terminated C string.  Return a pointer to the `MINT' object.
                    519:
                    520:  - Function: void move (MINT *SRC, MINT *DEST)
                    521:      Set DEST to SRC by copying.  Both variables must be previously
                    522:      initialized.
                    523:
                    524:  - Function: void madd (MINT *SRC_1, MINT *SRC_2, MINT *DESTINATION)
                    525:      Add SRC_1 and SRC_2 and put the sum in DESTINATION.
                    526:
                    527:  - Function: void msub (MINT *SRC_1, MINT *SRC_2, MINT *DESTINATION)
                    528:      Subtract SRC_2 from SRC_1 and put the difference in DESTINATION.
                    529:
                    530:  - Function: void mult (MINT *SRC_1, MINT *SRC_2, MINT *DESTINATION)
                    531:      Multiply SRC_1 and SRC_2 and put the product in DESTINATION.
                    532:
                    533:  - Function: void mdiv (MINT *DIVIDEND, MINT *DIVISOR, MINT *QUOTIENT,
                    534:           MINT *REMAINDER)
                    535:  - Function: void sdiv (MINT *DIVIDEND, signed short int DIVISOR, MINT
                    536:           *QUOTIENT, signed short int *REMAINDER)
                    537:      Set QUOTIENT to DIVIDEND/DIVISOR, and REMAINDER to DIVIDEND mod
                    538:      DIVISOR.  The quotient is rounded towards zero; the remainder has
                    539:      the same sign as the dividend unless it is zero.
                    540:
                    541:      Some implementations of these functions work differently--or not
                    542:      at all--for negative arguments.
                    543:
                    544:  - Function: void msqrt (MINT *OP, MINT *ROOT, MINT *REMAINDER)
                    545:      Set ROOT to the truncated integer part of the square root of OP,
                    546:      like `mpz_sqrt'.  Set REMAINDER to OP-ROOT*ROOT, i.e.  zero if OP
                    547:      is a perfect square.
                    548:
                    549:      If ROOT and REMAINDER are the same variable, the results are
                    550:      undefined.
                    551:
                    552:  - Function: void pow (MINT *BASE, MINT *EXP, MINT *MOD, MINT *DEST)
                    553:      Set DEST to (BASE raised to EXP) modulo MOD.
                    554:
                    555:  - Function: void rpow (MINT *BASE, signed short int EXP, MINT *DEST)
                    556:      Set DEST to BASE raised to EXP.
                    557:
                    558:  - Function: void gcd (MINT *OP1, MINT *OP2, MINT *RES)
                    559:      Set RES to the greatest common divisor of OP1 and OP2.
                    560:
                    561:  - Function: int mcmp (MINT *OP1, MINT *OP2)
                    562:      Compare OP1 and OP2.  Return a positive value if OP1 > OP2, zero
                    563:      if OP1 = OP2, and a negative value if OP1 < OP2.
                    564:
                    565:  - Function: void min (MINT *DEST)
                    566:      Input a decimal string from `stdin', and put the read integer in
                    567:      DEST.  SPC and TAB are allowed in the number string, and are
                    568:      ignored.
                    569:
                    570:  - Function: void mout (MINT *SRC)
                    571:      Output SRC to `stdout', as a decimal string.  Also output a
                    572:      newline.
                    573:
                    574:  - Function: char * mtox (MINT *OP)
                    575:      Convert OP to a hexadecimal string, and return a pointer to the
                    576:      string.  The returned string is allocated using the default memory
                    577:      allocation function, `malloc' by default.  It will be
                    578:      `strlen(str)+1' bytes, that being exactly enough for the string
                    579:      and null-terminator.
                    580:
                    581:  - Function: void mfree (MINT *OP)
                    582:      De-allocate, the space used by OP.  *This function should only be
                    583:      passed a value returned by `itom' or `xtom'.*
                    584:
                    585: 
                    586: File: gmp.info,  Node: Custom Allocation,  Next: Language Bindings,  Prev: BSD Compatible Functions,  Up: Top
                    587:
                    588: Custom Allocation
                    589: *****************
                    590:
                    591:    By default GMP uses `malloc', `realloc' and `free' for memory
                    592: allocation, and if they fail GMP prints a message to the standard error
                    593: output and terminates the program.
                    594:
                    595:    Alternate functions can be specified to allocate memory in a
                    596: different way or to have a different error action on running out of
                    597: memory.
                    598:
                    599:    This feature is available in the Berkeley compatibility library
                    600: (*note BSD Compatible Functions::) as well as the main GMP library.
                    601:
                    602:  - Function: void mp_set_memory_functions (
                    603:           void *(*ALLOC_FUNC_PTR) (size_t),
                    604:           void *(*REALLOC_FUNC_PTR) (void *, size_t, size_t),
                    605:           void (*FREE_FUNC_PTR) (void *, size_t))
                    606:      Replace the current allocation functions from the arguments.  If
                    607:      an argument is `NULL', the corresponding default function is used.
                    608:
                    609:      These functions will be used for all memory allocation done by
                    610:      GMP, apart from temporary space from `alloca' if that function is
                    611:      available and GMP is configured to use it (*note Build Options::).
                    612:
                    613:      *Be sure to call `mp_set_memory_functions' only when there are no
                    614:      active GMP objects allocated using the previous memory functions!
                    615:      Usually that means calling it before any other GMP function.*
                    616:
                    617:    The functions supplied should fit the following declarations:
                    618:
                    619:  - Function: void * allocate_function (size_t ALLOC_SIZE)
                    620:      Return a pointer to newly allocated space with at least ALLOC_SIZE
                    621:      bytes.
                    622:
                    623:  - Function: void * reallocate_function (void *PTR, size_t OLD_SIZE,
                    624:           size_t NEW_SIZE)
                    625:      Resize a previously allocated block PTR of OLD_SIZE bytes to be
                    626:      NEW_SIZE bytes.
                    627:
                    628:      The block may be moved if necessary or if desired, and in that
                    629:      case the smaller of OLD_SIZE and NEW_SIZE bytes must be copied to
                    630:      the new location.  The return value is a pointer to the resized
                    631:      block, that being the new location if moved or just PTR if not.
                    632:
                    633:      PTR is never `NULL', it's always a previously allocated block.
                    634:      NEW_SIZE may be bigger or smaller than OLD_SIZE.
                    635:
                    636:  - Function: void deallocate_function (void *PTR, size_t SIZE)
                    637:      De-allocate the space pointed to by PTR.
                    638:
                    639:      PTR is never `NULL', it's always a previously allocated block of
                    640:      SIZE bytes.
                    641:
                    642:    A "byte" here means the unit used by the `sizeof' operator.
                    643:
                    644:    The OLD_SIZE parameters to REALLOCATE_FUNCTION and
                    645: DEALLOCATE_FUNCTION are passed for convenience, but of course can be
                    646: ignored if not needed.  The default functions using `malloc' and friends
                    647: for instance don't use them.
                    648:
                    649:    No error return is allowed from any of these functions, if they
                    650: return then they must have performed the specified operation.  In
                    651: particular note that ALLOCATE_FUNCTION or REALLOCATE_FUNCTION mustn't
                    652: return `NULL'.
                    653:
                    654:    Getting a different fatal error action is a good use for custom
                    655: allocation functions, for example giving a graphical dialog rather than
                    656: the default print to `stderr'.  How much is possible when genuinely out
                    657: of memory is another question though.
                    658:
                    659:    There's currently no defined way for the allocation functions to
                    660: recover from an error such as out of memory, they must terminate
                    661: program execution.  A `longjmp' or throwing a C++ exception will have
                    662: undefined results.  This may change in the future.
                    663:
                    664:    GMP may use allocated blocks to hold pointers to other allocated
                    665: blocks.  This will limit the assumptions a conservative garbage
                    666: collection scheme can make.
                    667:
                    668:    Since the default GMP allocation uses `malloc' and friends, those
                    669: functions will be linked in even if the first thing a program does is an
                    670: `mp_set_memory_functions'.  It's necessary to change the GMP sources if
                    671: this is a problem.
                    672:
                    673: 
                    674: File: gmp.info,  Node: Language Bindings,  Next: Algorithms,  Prev: Custom Allocation,  Up: Top
                    675:
                    676: Language Bindings
                    677: *****************
                    678:
                    679:    The following packages and projects offer access to GMP from
                    680: languages other than C, though perhaps with varying levels of
                    681: functionality and efficiency.
                    682:
                    683:
                    684: C++
                    685:         * GMP C++ class interface, *note C++ Class Interface::
                    686:           Straightforward interface, expression templates to eliminate
                    687:           temporaries.
                    688:
                    689:         * ALP `http://www.inria.fr/saga/logiciels/ALP'
                    690:           Linear algebra and polynomials using templates.
                    691:
                    692:         * Arithmos `http://win-www.uia.ac.be/u/cant/arithmos'
                    693:           Rationals with infinities and square roots.
                    694:
                    695:         * CLN `http://clisp.cons.org/~haible/packages-cln.html'
                    696:           High level classes for arithmetic.
                    697:
                    698:         * LiDIA `http://www.informatik.tu-darmstadt.de/TI/LiDIA'
                    699:           A C++ library for computational number theory.
                    700:
                    701:         * Linbox `http://www.linalg.org'
                    702:           Sparse vectors and matrices.
                    703:
                    704:         * NTL `http://www.shoup.net/ntl'
                    705:           A C++ number theory library.
                    706:
                    707: Fortran
                    708:         * Omni F77 `http://pdplab.trc.rwcp.or.jp/pdperf/Omni/home.html'
                    709:           Arbitrary precision floats.
                    710:
                    711: Haskell
                    712:         * Glasgow Haskell Compiler `http://www.haskell.org/ghc'
                    713:
                    714: Java
                    715:         * Kaffe `http://www.kaffe.org'
                    716:
                    717:         * Kissme `http://kissme.sourceforge.net'
                    718:
                    719: Lisp
                    720:         * GNU Common Lisp `http://www.gnu.org/software/gcl/gcl.html'
                    721:           In the process of switching to GMP for bignums.
                    722:
                    723:         * Librep `http://librep.sourceforge.net'
                    724:
                    725: M4
                    726:         * GNU m4 betas `http://www.seindal.dk/rene/gnu'
                    727:           Optionally provides an arbitrary precision `mpeval'.
                    728:
                    729: ML
                    730:         * MLton compiler `http://www.mlton.org'
                    731:
                    732: Oz
                    733:         * Mozart `http://www.mozart-oz.org'
                    734:
                    735: Pascal
                    736:         * GNU Pascal Compiler `http://www.gnu-pascal.de'
                    737:           GMP unit.
                    738:
                    739: Perl
                    740:         * GMP module, see `demos/perl' in the GMP sources.
                    741:
                    742:         * Math::GMP `http://www.cpan.org'
                    743:           Compatible with Math::BigInt, but not as many functions as
                    744:           the GMP module above.
                    745:
                    746:         * Math::BigInt::GMP `http://www.cpan.org'
                    747:           Plug Math::GMP into normal Math::BigInt operations.
                    748:
                    749: Pike
                    750:         * mpz module in the standard distribution,
                    751:           `http://pike.idonex.com'
                    752:
                    753: Prolog
                    754:         * SWI Prolog `http://www.swi.psy.uva.nl/projects/SWI-Prolog'
                    755:           Arbitrary precision floats.
                    756:
                    757: Python
                    758:         * mpz module in the standard distribution,
                    759:           `http://www.python.org'
                    760:
                    761:         * GMPY `http://gmpy.sourceforge.net'
                    762:
                    763: Scheme
                    764:         * RScheme `http://www.rscheme.org'
                    765:
                    766:         * STklos `http://kaolin.unice.fr/STklos'
                    767:
                    768: Smalltalk
                    769:         * GNU Smalltalk
                    770:           `http://www.smalltalk.org/versions/GNUSmalltalk.html'
                    771:
                    772: Other
                    773:         * DrGenius `http://drgenius.seul.org'
                    774:           Geometry system and mathematical programming language.
                    775:
                    776:         * GiNaC `http://www.ginac.de'
                    777:           C++ computer algebra using CLN.
                    778:
                    779:         * Maxima `http://www.ma.utexas.edu/users/wfs/maxima.html'
                    780:           Macsyma computer algebra using GCL.
                    781:
                    782:         * Q `http://www.musikwissenschaft.uni-mainz.de/~ag/q'
                    783:           Equational programming system.
                    784:
                    785:         * Regina `http://regina.sourceforge.net'
                    786:           Topological calculator.
                    787:
                    788:         * Yacas `http://www.xs4all.nl/~apinkus/yacas.html'
                    789:           Yet another computer algebra system.
                    790:
                    791: 
                    792: File: gmp.info,  Node: Algorithms,  Next: Internals,  Prev: Language Bindings,  Up: Top
                    793:
                    794: Algorithms
                    795: **********
                    796:
                    797:    This chapter is an introduction to some of the algorithms used for
                    798: various GMP operations.  The code is likely to be hard to understand
                    799: without knowing something about the algorithms.
                    800:
                    801:    Some GMP internals are mentioned, but applications that expect to be
                    802: compatible with future GMP releases should take care to use only the
                    803: documented functions.
                    804:
                    805: * Menu:
                    806:
                    807: * Multiplication Algorithms::
                    808: * Division Algorithms::
                    809: * Greatest Common Divisor Algorithms::
                    810: * Powering Algorithms::
                    811: * Root Extraction Algorithms::
                    812: * Radix Conversion Algorithms::
                    813: * Other Algorithms::
                    814: * Assembler Coding::
                    815:
                    816: 
                    817: File: gmp.info,  Node: Multiplication Algorithms,  Next: Division Algorithms,  Prev: Algorithms,  Up: Algorithms
                    818:
                    819: Multiplication
                    820: ==============
                    821:
                    822:    NxN limb multiplications and squares are done using one of four
                    823: algorithms, as the size N increases.
                    824:
                    825:      Algorithm      Threshold
                    826:      Basecase       (none)
                    827:      Karatsuba      `MUL_KARATSUBA_THRESHOLD'
                    828:      Toom-3         `MUL_TOOM3_THRESHOLD'
                    829:      FFT            `MUL_FFT_THRESHOLD'
                    830:
                    831:    Similarly for squaring, with the `SQR' thresholds.  Note though that
                    832: the FFT is only used if GMP is configured with `--enable-fft', *note
                    833: Build Options::.
                    834:
                    835:    NxM multiplications of operands with different sizes above
                    836: `MUL_KARATSUBA_THRESHOLD' are currently done by splitting into MxM
                    837: pieces.  The Karatsuba and Toom-3 routines then operate only on equal
                    838: size operands.  This is not very efficient, and is slated for
                    839: improvement in the future.
                    840:
                    841: * Menu:
                    842:
                    843: * Basecase Multiplication::
                    844: * Karatsuba Multiplication::
                    845: * Toom-Cook 3-Way Multiplication::
                    846: * FFT Multiplication::
                    847: * Other Multiplication::
                    848:
                    849: 
                    850: File: gmp.info,  Node: Basecase Multiplication,  Next: Karatsuba Multiplication,  Prev: Multiplication Algorithms,  Up: Multiplication Algorithms
                    851:
                    852: Basecase Multiplication
                    853: -----------------------
                    854:
                    855:    Basecase NxM multiplication is a straightforward rectangular set of
                    856: cross-products, the same as long multiplication done by hand and for
                    857: that reason sometimes known as the schoolbook or grammar school method.
                    858: This is an O(N*M) algorithm.  See Knuth section 4.3.1 algorithm M
                    859: (*note References::), and the `mpn/generic/mul_basecase.c' code.
                    860:
                    861:    Assembler implementations of `mpn_mul_basecase' are essentially the
                    862: same as the generic C code, but have all the usual assembler tricks and
                    863: obscurities introduced for speed.
                    864:
                    865:    A square can be done in roughly half the time of a multiply, by
                    866: using the fact that the cross products above and below the diagonal are
                    867: the same.  A triangle of products below the diagonal is formed, doubled
                    868: (left shift by one bit), and then the products on the diagonal added.
                    869: This can be seen in `mpn/generic/sqr_basecase.c'.  Again the assembler
                    870: implementations take essentially the same approach.
                    871:
                    872:           u0  u1  u2  u3  u4
                    873:         +---+---+---+---+---+
                    874:      u0 | d |   |   |   |   |
                    875:         +---+---+---+---+---+
                    876:      u1 |   | d |   |   |   |
                    877:         +---+---+---+---+---+
                    878:      u2 |   |   | d |   |   |
                    879:         +---+---+---+---+---+
                    880:      u3 |   |   |   | d |   |
                    881:         +---+---+---+---+---+
                    882:      u4 |   |   |   |   | d |
                    883:         +---+---+---+---+---+
                    884:
                    885:    In practice squaring isn't a full 2x faster than multiplying, it's
                    886: usually around 1.5x.  Less than 1.5x probably indicates
                    887: `mpn_sqr_basecase' wants improving on that CPU.
                    888:
                    889:    On some CPUs `mpn_mul_basecase' can be faster than the generic C
                    890: `mpn_sqr_basecase'.  `SQR_BASECASE_THRESHOLD' is the size at which to
                    891: use `mpn_sqr_basecase', this will be zero if that routine should be
                    892: used always.
                    893:
                    894: 
                    895: File: gmp.info,  Node: Karatsuba Multiplication,  Next: Toom-Cook 3-Way Multiplication,  Prev: Basecase Multiplication,  Up: Multiplication Algorithms
                    896:
                    897: Karatsuba Multiplication
                    898: ------------------------
                    899:
                    900:    The Karatsuba multiplication algorithm is described in Knuth section
                    901: 4.3.3 part A, and various other textbooks.  A brief description is
                    902: given here.
                    903:
                    904:    The inputs x and y are treated as each split into two parts of equal
                    905: length (or the most significant part one limb shorter if N is odd).
                    906:
                    907:       high              low
                    908:      +----------+----------+
                    909:      |    x1    |    x0    |
                    910:      +----------+----------+
                    911:
                    912:      +----------+----------+
                    913:      |    y1    |    y0    |
                    914:      +----------+----------+
                    915:
                    916:    Let b be the power of 2 where the split occurs, ie. if x0 is k limbs
                    917: (y0 the same) then b=2^(k*mp_bits_per_limb).  With that x=x1*b+x0 and
                    918: y=y1*b+y0, and the following holds,
                    919:
                    920:      x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0
                    921:
                    922:    This formula means doing only three multiplies of (N/2)x(N/2) limbs,
                    923: whereas a basecase multiply of NxN limbs is equivalent to four
                    924: multiplies of (N/2)x(N/2).  The factors (b^2+b) etc represent the
                    925: positions where the three products must be added.
                    926:
                    927:       high                              low
                    928:      +--------+--------+ +--------+--------+
                    929:      |      x1*y1      | |      x0*y0      |
                    930:      +--------+--------+ +--------+--------+
                    931:                +--------+--------+
                    932:            add |      x1*y1      |
                    933:                +--------+--------+
                    934:                +--------+--------+
                    935:            add |      x0*y0      |
                    936:                +--------+--------+
                    937:                +--------+--------+
                    938:            sub | (x1-x0)*(y1-y0) |
                    939:                +--------+--------+
                    940:
                    941:    The term (x1-x0)*(y1-y0) is best calculated as an absolute value,
                    942: and the sign used to choose to add or subtract.  Notice the sum
                    943: high(x0*y0)+low(x1*y1) occurs twice, so it's possible to do 5*k limb
                    944: additions, rather than 6*k, but in GMP extra function call overheads
                    945: outweigh the saving.
                    946:
                    947:    Squaring is similar to multiplying, but with x=y the formula reduces
                    948: to an equivalent with three squares,
                    949:
                    950:      x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2
                    951:
                    952:    The final result is accumulated from those three squares the same
                    953: way as for the three multiplies above.  The middle term (x1-x0)^2 is now
                    954: always positive.
                    955:
                    956:    A similar formula for both multiplying and squaring can be
                    957: constructed with a middle term (x1+x0)*(y1+y0).  But those sums can
                    958: exceed k limbs, leading to more carry handling and additions than the
                    959: form above.
                    960:
                    961:    Karatsuba multiplication is asymptotically an O(N^1.585) algorithm,
                    962: the exponent being log(3)/log(2), representing 3 multiplies each 1/2
                    963: the size of the inputs.  This is a big improvement over the basecase
                    964: multiply at O(N^2) and the advantage soon overcomes the extra additions
                    965: Karatsuba performs.
                    966:
                    967:    `MUL_KARATSUBA_THRESHOLD' can be as little as 10 limbs.  The `SQR'
                    968: threshold is usually about twice the `MUL'.  The basecase algorithm will
                    969: take a time of the form M(N) = a*N^2 + b*N + c and the Karatsuba
                    970: algorithm K(N) = 3*M(N/2) + d*N + e.  Clearly per-crossproduct speedups
                    971: in the basecase code reduce a and decrease the threshold, but linear
                    972: style speedups reducing b will actually increase the threshold.  The
                    973: latter can be seen for instance when adding an optimized
                    974: `mpn_sqr_diagonal' to `mpn_sqr_basecase'.  Of course all speedups
                    975: reduce total time, and in that sense the algorithm thresholds are
                    976: merely of academic interest.
                    977:
                    978: 
                    979: File: gmp.info,  Node: Toom-Cook 3-Way Multiplication,  Next: FFT Multiplication,  Prev: Karatsuba Multiplication,  Up: Multiplication Algorithms
                    980:
                    981: Toom-Cook 3-Way Multiplication
                    982: ------------------------------
                    983:
                    984:    The Karatsuba formula is the simplest case of a general approach to
                    985: splitting inputs that leads to both Toom-Cook and FFT algorithms.  A
                    986: description of Toom-Cook can be found in Knuth section 4.3.3, with an
                    987: example 3-way calculation after Theorem A.  The 3-way form used in GMP
                    988: is described here.
                    989:
                    990:    The operands are each considered split into 3 pieces of equal length
                    991: (or the most significant part 1 or 2 limbs shorter than the others).
                    992:
                    993:       high                         low
                    994:      +----------+----------+----------+
                    995:      |    x2    |    x1    |    x0    |
                    996:      +----------+----------+----------+
                    997:
                    998:      +----------+----------+----------+
                    999:      |    y2    |    y1    |    y0    |
                   1000:      +----------+----------+----------+
                   1001:
                   1002: These parts are treated as the coefficients of two polynomials
                   1003:
                   1004:      X(t) = x2*t^2 + x1*t + x0
                   1005:      Y(t) = y2*t^2 + y1*t + y0
                   1006:
                   1007:    Again let b equal the power of 2 which is the size of the x0, x1, y0
                   1008: and y1 pieces, ie. if they're k limbs each then
                   1009: b=2^(k*mp_bits_per_limb).  With this x=X(b) and y=Y(b).
                   1010:
                   1011:    Let a polynomial W(t)=X(t)*Y(t) and suppose its coefficients are
                   1012:
                   1013:      W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0
                   1014:
                   1015: The w[i] are going to be determined, and when they are they'll give the
                   1016: final result using w=W(b), since x*y=X(b)*Y(b)=W(b).  The coefficients
                   1017: will be roughly b^2 each, and the final W(b) will be an addition like,
                   1018:
                   1019:       high                                        low
                   1020:      +-------+-------+
                   1021:      |       w4      |
                   1022:      +-------+-------+
                   1023:             +--------+-------+
                   1024:             |        w3      |
                   1025:             +--------+-------+
                   1026:                     +--------+-------+
                   1027:                     |        w2      |
                   1028:                     +--------+-------+
                   1029:                             +--------+-------+
                   1030:                             |        w1      |
                   1031:                             +--------+-------+
                   1032:                                      +-------+-------+
                   1033:                                      |       w0      |
                   1034:                                      +-------+-------+
                   1035:
                   1036:    The w[i] coefficients could be formed by a simple set of cross
                   1037: products, like w4=x2*y2, w3=x2*y1+x1*y2, w2=x2*y0+x1*y1+x0*y2 etc, but
                   1038: this would need all nine x[i]*y[j] for i,j=0,1,2, and would be
                   1039: equivalent merely to a basecase multiply.  Instead the following
                   1040: approach is used.
                   1041:
                   1042:    X(t) and Y(t) are evaluated and multiplied at 5 points, giving
                   1043: values of W(t) at those points.  The points used can be chosen in
                   1044: various ways, but in GMP the following are used
                   1045:
                   1046:      Point    Value
                   1047:      t=0      x0*y0, which gives w0 immediately
                   1048:      t=2      (4*x2+2*x1+x0)*(4*y2+2*y1+y0)
                   1049:      t=1      (x2+x1+x0)*(y2+y1+y0)
                   1050:      t=1/2    (x2+2*x1+4*x0)*(y2+2*y1+4*y0)
                   1051:      t=inf    x2*y2, which gives w4 immediately
                   1052:
                   1053:    At t=1/2 the value calculated is actually 16*X(1/2)*Y(1/2), giving a
                   1054: value for 16*W(1/2), and this is always an integer.  At t=inf the value
                   1055: is actually X(t)*Y(t)/t^4 in the limit as t approaches infinity, but
                   1056: it's much easier to think of as simply x2*y2 giving w4 immediately
                   1057: (much like x0*y0 at t=0 gives w0 immediately).
                   1058:
                   1059:    Now each of the points substituted into W(t)=w4*t^4+...+w0 gives a
                   1060: linear combination of the w[i] coefficients, and the value of those
                   1061: combinations has just been calculated.
                   1062:
                   1063:         W(0)   =                                 w0
                   1064:      16*W(1/2) =    w4 + 2*w3 + 4*w2 + 8*w1 + 16*w0
                   1065:         W(1)   =    w4 +   w3 +   w2 +   w1 +    w0
                   1066:         W(2)   = 16*w4 + 8*w3 + 4*w2 + 2*w1 +    w0
                   1067:         W(inf) =    w4
                   1068:
                   1069:    This is a set of five equations in five unknowns, and some
                   1070: elementary linear algebra quickly isolates each w[i], by subtracting
                   1071: multiples of one equation from another.
                   1072:
                   1073:    In the code the set of five values W(0),...,W(inf) will represent
                   1074: those certain linear combinations.  By adding or subtracting one from
                   1075: another as necessary, values which are each w[i] alone are arrived at.
                   1076: This involves only a few subtractions of small multiples (some of which
                   1077: are powers of 2), and so is fast.  A couple of divisions remain by
                   1078: powers of 2 and one division by 3 (or by 6 rather), and that last uses
                   1079: the special `mpn_divexact_by3' (*note Exact Division::).
                   1080:
                   1081:    In the code the values w4, w2 and w0 are formed in the destination
                   1082: with pointers `E', `C' and `A', and w3 and w1 in temporary space `D'
                   1083: and `B' are added to them.  There are extra limbs `tD', `tC' and `tB'
                   1084: at the high end of w3, w2 and w1 which are handled separately.  The
                   1085: final addition then is as follows.
                   1086:
                   1087:       high                                        low
                   1088:      +-------+-------+-------+-------+-------+-------+
                   1089:      |       E       |       C       |       A       |
                   1090:      +-------+-------+-------+-------+-------+-------+
                   1091:               +------+-------++------+-------+
                   1092:               |      D       ||      B       |
                   1093:               +------+-------++------+-------+
                   1094:            --      --      --
                   1095:           |tD|    |tC|    |tB|
                   1096:            --      --      --
                   1097:
                   1098:    The conversion of W(t) values to the coefficients is interpolation.
                   1099: A polynomial of degree 4 like W(t) is uniquely determined by values
                   1100: known at 5 different points.  The points can be chosen to make the
                   1101: linear equations come out with a convenient set of steps for isolating
                   1102: the w[i].
                   1103:
                   1104:    In `mpn/generic/mul_n.c' the `interpolate3' routine performs the
                   1105: interpolation.  The open-coded one-pass version may be a bit hard to
                   1106: understand, the steps performed can be better seen in the `USE_MORE_MPN'
                   1107: version.
                   1108:
                   1109:    Squaring follows the same procedure as multiplication, but there's
                   1110: only one X(t) and it's evaluated at 5 points, and those values squared
                   1111: to give values of W(t).  The interpolation is then identical, and in
                   1112: fact the same `interpolate3' subroutine is used for both squaring and
                   1113: multiplying.
                   1114:
                   1115:    Toom-3 is asymptotically O(N^1.465), the exponent being
                   1116: log(5)/log(3), representing 5 recursive multiplies of 1/3 the original
                   1117: size.  This is an improvement over Karatsuba at O(N^1.585), though
                   1118: Toom-Cook does more work in the evaluation and interpolation and so it
                   1119: only realizes its advantage above a certain size.
                   1120:
                   1121:    Near the crossover between Toom-3 and Karatsuba there's generally a
                   1122: range of sizes where the difference between the two is small.
                   1123: `MUL_TOOM3_THRESHOLD' is a somewhat arbitrary point in that range and
                   1124: successive runs of the tune program can give different values due to
                   1125: small variations in measuring.  A graph of time versus size for the two
                   1126: shows the effect, see `tune/README'.
                   1127:
                   1128:    At the fairly small sizes where the Toom-3 thresholds occur it's
                   1129: worth remembering that the asymptotic behaviour for Karatsuba and
                   1130: Toom-3 can't be expected to make accurate predictions, due of course to
                   1131: the big influence of all sorts of overheads, and the fact that only a
                   1132: few recursions of each are being performed.  Even at large sizes
                   1133: there's a good chance machine dependent effects like cache architecture
                   1134: will mean actual performance deviates from what might be predicted.
                   1135:
                   1136:    The formula given above for the Karatsuba algorithm has an
                   1137: equivalent for Toom-3 involving only five multiplies, but this would be
                   1138: complicated and unenlightening.
                   1139:
                   1140:    An alternate view of Toom-3 can be found in Zuras (*note
                   1141: References::), using a vector to represent the x and y splits and a
                   1142: matrix multiplication for the evaluation and interpolation stages.  The
                   1143: matrix inverses are not meant to be actually used, and they have
                   1144: elements with values much greater than in fact arise in the
                   1145: interpolation steps.  The diagram shown for the 3-way is attractive,
                   1146: but again doesn't have to be implemented that way and for example with
                   1147: a bit of rearrangement just one division by 6 can be done.
                   1148:
FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>