Annotation of OpenXM_contrib/gmp/gmp.info-5, Revision 1.1
1.1 ! ohara 1: This is gmp.info, produced by makeinfo version 4.2 from gmp.texi.
! 2:
! 3: This manual describes how to install and use the GNU multiple precision
! 4: arithmetic library, version 4.1.2.
! 5:
! 6: Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
! 7: 2001, 2002 Free Software Foundation, Inc.
! 8:
! 9: Permission is granted to copy, distribute and/or modify this
! 10: document under the terms of the GNU Free Documentation License, Version
! 11: 1.1 or any later version published by the Free Software Foundation;
! 12: with no Invariant Sections, with the Front-Cover Texts being "A GNU
! 13: Manual", and with the Back-Cover Texts being "You have freedom to copy
! 14: and modify this GNU Manual, like GNU software". A copy of the license
! 15: is included in *Note GNU Free Documentation License::.
! 16: INFO-DIR-SECTION GNU libraries
! 17: START-INFO-DIR-ENTRY
! 18: * gmp: (gmp). GNU Multiple Precision Arithmetic Library.
! 19: END-INFO-DIR-ENTRY
! 20:
! 21:
! 22: File: gmp.info, Node: C++ Interface General, Next: C++ Interface Integers, Prev: C++ Class Interface, Up: C++ Class Interface
! 23:
! 24: C++ Interface General
! 25: =====================
! 26:
! 27: All the C++ classes and functions are available with
! 28:
! 29: #include <gmpxx.h>
! 30:
! 31: Programs should be linked with the `libgmpxx' and `libgmp'
! 32: libraries. For example,
! 33:
! 34: g++ mycxxprog.cc -lgmpxx -lgmp
! 35:
! 36: The classes defined are
! 37:
! 38: - Class: mpz_class
! 39: - Class: mpq_class
! 40: - Class: mpf_class
! 41:
! 42: The standard operators and various standard functions are overloaded
! 43: to allow arithmetic with these classes. For example,
! 44:
! 45: int
! 46: main (void)
! 47: {
! 48: mpz_class a, b, c;
! 49:
! 50: a = 1234;
! 51: b = "-5678";
! 52: c = a+b;
! 53: cout << "sum is " << c << "\n";
! 54: cout << "absolute value is " << abs(c) << "\n";
! 55:
! 56: return 0;
! 57: }
! 58:
! 59: An important feature of the implementation is that an expression like
! 60: `a=b+c' results in a single call to the corresponding `mpz_add',
! 61: without using a temporary for the `b+c' part. Expressions which by
! 62: their nature imply intermediate values, like `a=b*c+d*e', still use
! 63: temporaries though.
! 64:
! 65: The classes can be freely intermixed in expressions, as can the
! 66: classes and the standard types `long', `unsigned long' and `double'.
! 67: Smaller types like `int' or `float' can also be intermixed, since C++
! 68: will promote them.
! 69:
! 70: Note that `bool' is not accepted directly, but must be explicitly
! 71: cast to an `int' first. This is because C++ will automatically convert
! 72: any pointer to a `bool', so if GMP accepted `bool' it would make all
! 73: sorts of invalid class and pointer combinations compile but almost
! 74: certainly not do anything sensible.
! 75:
! 76: Conversions back from the classes to standard C++ types aren't done
! 77: automatically, instead member functions like `get_si' are provided (see
! 78: the following sections for details).
! 79:
! 80: Also there are no automatic conversions from the classes to the
! 81: corresponding GMP C types, instead a reference to the underlying C
! 82: object can be obtained with the following functions,
! 83:
! 84: - Function: mpz_t mpz_class::get_mpz_t ()
! 85: - Function: mpq_t mpq_class::get_mpq_t ()
! 86: - Function: mpf_t mpf_class::get_mpf_t ()
! 87:
! 88: These can be used to call a C function which doesn't have a C++ class
! 89: interface. For example to set `a' to the GCD of `b' and `c',
! 90:
! 91: mpz_class a, b, c;
! 92: ...
! 93: mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
! 94:
! 95: In the other direction, a class can be initialized from the
! 96: corresponding GMP C type, or assigned to if an explicit constructor is
! 97: used. In both cases this makes a copy of the value, it doesn't create
! 98: any sort of association. For example,
! 99:
! 100: mpz_t z;
! 101: // ... init and calculate z ...
! 102: mpz_class x(z);
! 103: mpz_class y;
! 104: y = mpz_class (z);
! 105:
! 106: There are no namespace setups in `gmpxx.h', all types and functions
! 107: are simply put into the global namespace. This is what `gmp.h' has
! 108: done in the past, and continues to do for compatibility. The extras
! 109: provided by `gmpxx.h' follow GMP naming conventions and are unlikely to
! 110: clash with anything.
! 111:
! 112:
! 113: File: gmp.info, Node: C++ Interface Integers, Next: C++ Interface Rationals, Prev: C++ Interface General, Up: C++ Class Interface
! 114:
! 115: C++ Interface Integers
! 116: ======================
! 117:
! 118: - Function: void mpz_class::mpz_class (type N)
! 119: Construct an `mpz_class'. All the standard C++ types may be used,
! 120: except `long long' and `long double', and all the GMP C++ classes
! 121: can be used. Any necessary conversion follows the corresponding C
! 122: function, for example `double' follows `mpz_set_d' (*note
! 123: Assigning Integers::).
! 124:
! 125: - Function: void mpz_class::mpz_class (mpz_t Z)
! 126: Construct an `mpz_class' from an `mpz_t'. The value in Z is
! 127: copied into the new `mpz_class', there won't be any permanent
! 128: association between it and Z.
! 129:
! 130: - Function: void mpz_class::mpz_class (const char *S)
! 131: - Function: void mpz_class::mpz_class (const char *S, int base)
! 132: - Function: void mpz_class::mpz_class (const string& S)
! 133: - Function: void mpz_class::mpz_class (const string& S, int base)
! 134: Construct an `mpz_class' converted from a string using
! 135: `mpz_set_str', (*note Assigning Integers::). If the BASE is not
! 136: given then 0 is used.
! 137:
! 138: - Function: mpz_class operator/ (mpz_class A, mpz_class D)
! 139: - Function: mpz_class operator% (mpz_class A, mpz_class D)
! 140: Divisions involving `mpz_class' round towards zero, as per the
! 141: `mpz_tdiv_q' and `mpz_tdiv_r' functions (*note Integer Division::).
! 142: This corresponds to the rounding used for plain `int' calculations
! 143: on most machines.
! 144:
! 145: The `mpz_fdiv...' or `mpz_cdiv...' functions can always be called
! 146: directly if desired. For example,
! 147:
! 148: mpz_class q, a, d;
! 149: ...
! 150: mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
! 151:
! 152: - Function: mpz_class abs (mpz_class OP1)
! 153: - Function: int cmp (mpz_class OP1, type OP2)
! 154: - Function: int cmp (type OP1, mpz_class OP2)
! 155: - Function: double mpz_class::get_d (void)
! 156: - Function: long mpz_class::get_si (void)
! 157: - Function: unsigned long mpz_class::get_ui (void)
! 158: - Function: bool mpz_class::fits_sint_p (void)
! 159: - Function: bool mpz_class::fits_slong_p (void)
! 160: - Function: bool mpz_class::fits_sshort_p (void)
! 161: - Function: bool mpz_class::fits_uint_p (void)
! 162: - Function: bool mpz_class::fits_ulong_p (void)
! 163: - Function: bool mpz_class::fits_ushort_p (void)
! 164: - Function: int sgn (mpz_class OP)
! 165: - Function: mpz_class sqrt (mpz_class OP)
! 166: These functions provide a C++ class interface to the corresponding
! 167: GMP C routines.
! 168:
! 169: `cmp' can be used with any of the classes or the standard C++
! 170: types, except `long long' and `long double'.
! 171:
! 172:
! 173: Overloaded operators for combinations of `mpz_class' and `double'
! 174: are provided for completeness, but it should be noted that if the given
! 175: `double' is not an integer then the way any rounding is done is
! 176: currently unspecified. The rounding might take place at the start, in
! 177: the middle, or at the end of the operation, and it might change in the
! 178: future.
! 179:
! 180: Conversions between `mpz_class' and `double', however, are defined
! 181: to follow the corresponding C functions `mpz_get_d' and `mpz_set_d'.
! 182: And comparisons are always made exactly, as per `mpz_cmp_d'.
! 183:
! 184:
! 185: File: gmp.info, Node: C++ Interface Rationals, Next: C++ Interface Floats, Prev: C++ Interface Integers, Up: C++ Class Interface
! 186:
! 187: C++ Interface Rationals
! 188: =======================
! 189:
! 190: In all the following constructors, if a fraction is given then it
! 191: should be in canonical form, or if not then `mpq_class::canonicalize'
! 192: called.
! 193:
! 194: - Function: void mpq_class::mpq_class (type OP)
! 195: - Function: void mpq_class::mpq_class (integer NUM, integer DEN)
! 196: Construct an `mpq_class'. The initial value can be a single value
! 197: of any type, or a pair of integers (`mpz_class' or standard C++
! 198: integer types) representing a fraction, except that `long long'
! 199: and `long double' are not supported. For example,
! 200:
! 201: mpq_class q (99);
! 202: mpq_class q (1.75);
! 203: mpq_class q (1, 3);
! 204:
! 205: - Function: void mpq_class::mpq_class (mpq_t Q)
! 206: Construct an `mpq_class' from an `mpq_t'. The value in Q is
! 207: copied into the new `mpq_class', there won't be any permanent
! 208: association between it and Q.
! 209:
! 210: - Function: void mpq_class::mpq_class (const char *S)
! 211: - Function: void mpq_class::mpq_class (const char *S, int base)
! 212: - Function: void mpq_class::mpq_class (const string& S)
! 213: - Function: void mpq_class::mpq_class (const string& S, int base)
! 214: Construct an `mpq_class' converted from a string using
! 215: `mpq_set_str', (*note Initializing Rationals::). If the BASE is
! 216: not given then 0 is used.
! 217:
! 218: - Function: void mpq_class::canonicalize ()
! 219: Put an `mpq_class' into canonical form, as per *Note Rational
! 220: Number Functions::. All arithmetic operators require their
! 221: operands in canonical form, and will return results in canonical
! 222: form.
! 223:
! 224: - Function: mpq_class abs (mpq_class OP)
! 225: - Function: int cmp (mpq_class OP1, type OP2)
! 226: - Function: int cmp (type OP1, mpq_class OP2)
! 227: - Function: double mpq_class::get_d (void)
! 228: - Function: int sgn (mpq_class OP)
! 229: These functions provide a C++ class interface to the corresponding
! 230: GMP C routines.
! 231:
! 232: `cmp' can be used with any of the classes or the standard C++
! 233: types, except `long long' and `long double'.
! 234:
! 235: - Function: mpz_class& mpq_class::get_num ()
! 236: - Function: mpz_class& mpq_class::get_den ()
! 237: Get a reference to an `mpz_class' which is the numerator or
! 238: denominator of an `mpq_class'. This can be used both for read and
! 239: write access. If the object returned is modified, it modifies the
! 240: original `mpq_class'.
! 241:
! 242: If direct manipulation might produce a non-canonical value, then
! 243: `mpq_class::canonicalize' must be called before further operations.
! 244:
! 245: - Function: mpz_t mpq_class::get_num_mpz_t ()
! 246: - Function: mpz_t mpq_class::get_den_mpz_t ()
! 247: Get a reference to the underlying `mpz_t' numerator or denominator
! 248: of an `mpq_class'. This can be passed to C functions expecting an
! 249: `mpz_t'. Any modifications made to the `mpz_t' will modify the
! 250: original `mpq_class'.
! 251:
! 252: If direct manipulation might produce a non-canonical value, then
! 253: `mpq_class::canonicalize' must be called before further operations.
! 254:
! 255: - Function: istream& operator>> (istream& STREAM, mpq_class& ROP);
! 256: Read ROP from STREAM, using its `ios' formatting settings, the
! 257: same as `mpq_t operator>>' (*note C++ Formatted Input::).
! 258:
! 259: If the ROP read might not be in canonical form then
! 260: `mpq_class::canonicalize' must be called.
! 261:
! 262:
! 263: File: gmp.info, Node: C++ Interface Floats, Next: C++ Interface MPFR, Prev: C++ Interface Rationals, Up: C++ Class Interface
! 264:
! 265: C++ Interface Floats
! 266: ====================
! 267:
! 268: When an expression requires the use of temporary intermediate
! 269: `mpf_class' values, like `f=g*h+x*y', those temporaries will have the
! 270: same precision as the destination `f'. Explicit constructors can be
! 271: used if this doesn't suit.
! 272:
! 273: - Function: mpf_class::mpf_class (type OP)
! 274: - Function: mpf_class::mpf_class (type OP, unsigned long PREC)
! 275: Construct an `mpf_class'. Any standard C++ type can be used,
! 276: except `long long' and `long double', and any of the GMP C++
! 277: classes can be used.
! 278:
! 279: If PREC is given, the initial precision is that value, in bits. If
! 280: PREC is not given, then the initial precision is determined by the
! 281: type of OP given. An `mpz_class', `mpq_class', string, or C++
! 282: builtin type will give the default `mpf' precision (*note
! 283: Initializing Floats::). An `mpf_class' or expression will give
! 284: the precision of that value. The precision of a binary expression
! 285: is the higher of the two operands.
! 286:
! 287: mpf_class f(1.5); // default precision
! 288: mpf_class f(1.5, 500); // 500 bits (at least)
! 289: mpf_class f(x); // precision of x
! 290: mpf_class f(abs(x)); // precision of x
! 291: mpf_class f(-g, 1000); // 1000 bits (at least)
! 292: mpf_class f(x+y); // greater of precisions of x and y
! 293:
! 294: - Function: mpf_class abs (mpf_class OP)
! 295: - Function: mpf_class ceil (mpf_class OP)
! 296: - Function: int cmp (mpf_class OP1, type OP2)
! 297: - Function: int cmp (type OP1, mpf_class OP2)
! 298: - Function: mpf_class floor (mpf_class OP)
! 299: - Function: mpf_class hypot (mpf_class OP1, mpf_class OP2)
! 300: - Function: double mpf_class::get_d (void)
! 301: - Function: long mpf_class::get_si (void)
! 302: - Function: unsigned long mpf_class::get_ui (void)
! 303: - Function: bool mpf_class::fits_sint_p (void)
! 304: - Function: bool mpf_class::fits_slong_p (void)
! 305: - Function: bool mpf_class::fits_sshort_p (void)
! 306: - Function: bool mpf_class::fits_uint_p (void)
! 307: - Function: bool mpf_class::fits_ulong_p (void)
! 308: - Function: bool mpf_class::fits_ushort_p (void)
! 309: - Function: int sgn (mpf_class OP)
! 310: - Function: mpf_class sqrt (mpf_class OP)
! 311: - Function: mpf_class trunc (mpf_class OP)
! 312: These functions provide a C++ class interface to the corresponding
! 313: GMP C routines.
! 314:
! 315: `cmp' can be used with any of the classes or the standard C++
! 316: types, except `long long' and `long double'.
! 317:
! 318: The accuracy provided by `hypot' is not currently guaranteed.
! 319:
! 320: - Function: unsigned long int mpf_class::get_prec ()
! 321: - Function: void mpf_class::set_prec (unsigned long PREC)
! 322: - Function: void mpf_class::set_prec_raw (unsigned long PREC)
! 323: Get or set the current precision of an `mpf_class'.
! 324:
! 325: The restrictions described for `mpf_set_prec_raw' (*note
! 326: Initializing Floats::) apply to `mpf_class::set_prec_raw'. Note
! 327: in particular that the `mpf_class' must be restored to it's
! 328: allocated precision before being destroyed. This must be done by
! 329: application code, there's no automatic mechanism for it.
! 330:
! 331:
! 332: File: gmp.info, Node: C++ Interface MPFR, Next: C++ Interface Random Numbers, Prev: C++ Interface Floats, Up: C++ Class Interface
! 333:
! 334: C++ Interface MPFR
! 335: ==================
! 336:
! 337: The C++ class interface to MPFR is provided if MPFR is enabled
! 338: (*note Build Options::). This interface must be regarded as
! 339: preliminary and possibly subject to incompatible changes in the future,
! 340: since MPFR itself is preliminary. All definitions can be obtained with
! 341:
! 342: #include <mpfrxx.h>
! 343:
! 344: This defines
! 345:
! 346: - Class: mpfr_class
! 347:
! 348: which behaves similarly to `mpf_class' (*note C++ Interface Floats::).
! 349:
! 350:
! 351: File: gmp.info, Node: C++ Interface Random Numbers, Next: C++ Interface Limitations, Prev: C++ Interface MPFR, Up: C++ Class Interface
! 352:
! 353: C++ Interface Random Numbers
! 354: ============================
! 355:
! 356: - Class: gmp_randclass
! 357: The C++ class interface to the GMP random number functions uses
! 358: `gmp_randclass' to hold an algorithm selection and current state,
! 359: as per `gmp_randstate_t'.
! 360:
! 361: - Function: gmp_randclass::gmp_randclass (void (*RANDINIT)
! 362: (gmp_randstate_t, ...), ...)
! 363: Construct a `gmp_randclass', using a call to the given RANDINIT
! 364: function (*note Random State Initialization::). The arguments
! 365: expected are the same as RANDINIT, but with `mpz_class' instead of
! 366: `mpz_t'. For example,
! 367:
! 368: gmp_randclass r1 (gmp_randinit_default);
! 369: gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
! 370: gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
! 371:
! 372: `gmp_randinit_lc_2exp_size' can fail if the size requested is too
! 373: big, the behaviour of `gmp_randclass::gmp_randclass' is undefined
! 374: in this case (perhaps this will change in the future).
! 375:
! 376: - Function: gmp_randclass::gmp_randclass (gmp_randalg_t ALG, ...)
! 377: Construct a `gmp_randclass' using the same parameters as
! 378: `gmp_randinit' (*note Random State Initialization::). This
! 379: function is obsolete and the above RANDINIT style should be
! 380: preferred.
! 381:
! 382: - Function: void gmp_randclass::seed (unsigned long int S)
! 383: - Function: void gmp_randclass::seed (mpz_class S)
! 384: Seed a random number generator. See *note Random Number
! 385: Functions::, for how to choose a good seed.
! 386:
! 387: - Function: mpz_class gmp_randclass::get_z_bits (unsigned long BITS)
! 388: - Function: mpz_class gmp_randclass::get_z_bits (mpz_class BITS)
! 389: Generate a random integer with a specified number of bits.
! 390:
! 391: - Function: mpz_class gmp_randclass::get_z_range (mpz_class N)
! 392: Generate a random integer in the range 0 to N-1 inclusive.
! 393:
! 394: - Function: mpf_class gmp_randclass::get_f ()
! 395: - Function: mpf_class gmp_randclass::get_f (unsigned long PREC)
! 396: Generate a random float F in the range 0 <= F < 1. F will be to
! 397: PREC bits precision, or if PREC is not given then to the precision
! 398: of the destination. For example,
! 399:
! 400: gmp_randclass r;
! 401: ...
! 402: mpf_class f (0, 512); // 512 bits precision
! 403: f = r.get_f(); // random number, 512 bits
! 404:
! 405:
! 406: File: gmp.info, Node: C++ Interface Limitations, Prev: C++ Interface Random Numbers, Up: C++ Class Interface
! 407:
! 408: C++ Interface Limitations
! 409: =========================
! 410:
! 411: `mpq_class' and Templated Reading
! 412: A generic piece of template code probably won't know that
! 413: `mpq_class' requires a `canonicalize' call if inputs read with
! 414: `operator>>' might be non-canonical. This can lead to incorrect
! 415: results.
! 416:
! 417: `operator>>' behaves as it does for reasons of efficiency. A
! 418: canonicalize can be quite time consuming on large operands, and is
! 419: best avoided if it's not necessary.
! 420:
! 421: But this potential difficulty reduces the usefulness of
! 422: `mpq_class'. Perhaps a mechanism to tell `operator>>' what to do
! 423: will be adopted in the future, maybe a preprocessor define, a
! 424: global flag, or an `ios' flag pressed into service. Or maybe, at
! 425: the risk of inconsistency, the `mpq_class' `operator>>' could
! 426: canonicalize and leave `mpq_t' `operator>>' not doing so, for use
! 427: on those occasions when that's acceptable. Send feedback or
! 428: alternate ideas to <bug-gmp@gnu.org>.
! 429:
! 430: Subclassing
! 431: Subclassing the GMP C++ classes works, but is not currently
! 432: recommended.
! 433:
! 434: Expressions involving subclasses resolve correctly (or seem to),
! 435: but in normal C++ fashion the subclass doesn't inherit
! 436: constructors and assignments. There's many of those in the GMP
! 437: classes, and a good way to reestablish them in a subclass is not
! 438: yet provided.
! 439:
! 440: Templated Expressions
! 441: A subtle difficulty exists when using expressions together with
! 442: application-defined template functions. Consider the following,
! 443: with `T' intended to be some numeric type,
! 444:
! 445: template <class T>
! 446: T fun (const T &, const T &);
! 447:
! 448: When used with, say, plain `mpz_class' variables, it works fine:
! 449: `T' is resolved as `mpz_class'.
! 450:
! 451: mpz_class f(1), g(2);
! 452: fun (f, g); // Good
! 453:
! 454: But when one of the arguments is an expression, it doesn't work.
! 455:
! 456: mpz_class f(1), g(2), h(3);
! 457: fun (f, g+h); // Bad
! 458:
! 459: This is because `g+h' ends up being a certain expression template
! 460: type internal to `gmpxx.h', which the C++ template resolution
! 461: rules are unable to automatically convert to `mpz_class'. The
! 462: workaround is simply to add an explicit cast.
! 463:
! 464: mpz_class f(1), g(2), h(3);
! 465: fun (f, mpz_class(g+h)); // Good
! 466:
! 467: Similarly, within `fun' it may be necessary to cast an expression
! 468: to type `T' when calling a templated `fun2'.
! 469:
! 470: template <class T>
! 471: void fun (T f, T g)
! 472: {
! 473: fun2 (f, f+g); // Bad
! 474: }
! 475:
! 476: template <class T>
! 477: void fun (T f, T g)
! 478: {
! 479: fun2 (f, T(f+g)); // Good
! 480: }
! 481:
! 482:
! 483: File: gmp.info, Node: BSD Compatible Functions, Next: Custom Allocation, Prev: C++ Class Interface, Up: Top
! 484:
! 485: Berkeley MP Compatible Functions
! 486: ********************************
! 487:
! 488: These functions are intended to be fully compatible with the
! 489: Berkeley MP library which is available on many BSD derived U*ix
! 490: systems. The `--enable-mpbsd' option must be used when building GNU MP
! 491: to make these available (*note Installing GMP::).
! 492:
! 493: The original Berkeley MP library has a usage restriction: you cannot
! 494: use the same variable as both source and destination in a single
! 495: function call. The compatible functions in GNU MP do not share this
! 496: restriction--inputs and outputs may overlap.
! 497:
! 498: It is not recommended that new programs are written using these
! 499: functions. Apart from the incomplete set of functions, the interface
! 500: for initializing `MINT' objects is more error prone, and the `pow'
! 501: function collides with `pow' in `libm.a'.
! 502:
! 503: Include the header `mp.h' to get the definition of the necessary
! 504: types and functions. If you are on a BSD derived system, make sure to
! 505: include GNU `mp.h' if you are going to link the GNU `libmp.a' to your
! 506: program. This means that you probably need to give the `-I<dir>'
! 507: option to the compiler, where `<dir>' is the directory where you have
! 508: GNU `mp.h'.
! 509:
! 510: - Function: MINT * itom (signed short int INITIAL_VALUE)
! 511: Allocate an integer consisting of a `MINT' object and dynamic limb
! 512: space. Initialize the integer to INITIAL_VALUE. Return a pointer
! 513: to the `MINT' object.
! 514:
! 515: - Function: MINT * xtom (char *INITIAL_VALUE)
! 516: Allocate an integer consisting of a `MINT' object and dynamic limb
! 517: space. Initialize the integer from INITIAL_VALUE, a hexadecimal,
! 518: null-terminated C string. Return a pointer to the `MINT' object.
! 519:
! 520: - Function: void move (MINT *SRC, MINT *DEST)
! 521: Set DEST to SRC by copying. Both variables must be previously
! 522: initialized.
! 523:
! 524: - Function: void madd (MINT *SRC_1, MINT *SRC_2, MINT *DESTINATION)
! 525: Add SRC_1 and SRC_2 and put the sum in DESTINATION.
! 526:
! 527: - Function: void msub (MINT *SRC_1, MINT *SRC_2, MINT *DESTINATION)
! 528: Subtract SRC_2 from SRC_1 and put the difference in DESTINATION.
! 529:
! 530: - Function: void mult (MINT *SRC_1, MINT *SRC_2, MINT *DESTINATION)
! 531: Multiply SRC_1 and SRC_2 and put the product in DESTINATION.
! 532:
! 533: - Function: void mdiv (MINT *DIVIDEND, MINT *DIVISOR, MINT *QUOTIENT,
! 534: MINT *REMAINDER)
! 535: - Function: void sdiv (MINT *DIVIDEND, signed short int DIVISOR, MINT
! 536: *QUOTIENT, signed short int *REMAINDER)
! 537: Set QUOTIENT to DIVIDEND/DIVISOR, and REMAINDER to DIVIDEND mod
! 538: DIVISOR. The quotient is rounded towards zero; the remainder has
! 539: the same sign as the dividend unless it is zero.
! 540:
! 541: Some implementations of these functions work differently--or not
! 542: at all--for negative arguments.
! 543:
! 544: - Function: void msqrt (MINT *OP, MINT *ROOT, MINT *REMAINDER)
! 545: Set ROOT to the truncated integer part of the square root of OP,
! 546: like `mpz_sqrt'. Set REMAINDER to OP-ROOT*ROOT, i.e. zero if OP
! 547: is a perfect square.
! 548:
! 549: If ROOT and REMAINDER are the same variable, the results are
! 550: undefined.
! 551:
! 552: - Function: void pow (MINT *BASE, MINT *EXP, MINT *MOD, MINT *DEST)
! 553: Set DEST to (BASE raised to EXP) modulo MOD.
! 554:
! 555: - Function: void rpow (MINT *BASE, signed short int EXP, MINT *DEST)
! 556: Set DEST to BASE raised to EXP.
! 557:
! 558: - Function: void gcd (MINT *OP1, MINT *OP2, MINT *RES)
! 559: Set RES to the greatest common divisor of OP1 and OP2.
! 560:
! 561: - Function: int mcmp (MINT *OP1, MINT *OP2)
! 562: Compare OP1 and OP2. Return a positive value if OP1 > OP2, zero
! 563: if OP1 = OP2, and a negative value if OP1 < OP2.
! 564:
! 565: - Function: void min (MINT *DEST)
! 566: Input a decimal string from `stdin', and put the read integer in
! 567: DEST. SPC and TAB are allowed in the number string, and are
! 568: ignored.
! 569:
! 570: - Function: void mout (MINT *SRC)
! 571: Output SRC to `stdout', as a decimal string. Also output a
! 572: newline.
! 573:
! 574: - Function: char * mtox (MINT *OP)
! 575: Convert OP to a hexadecimal string, and return a pointer to the
! 576: string. The returned string is allocated using the default memory
! 577: allocation function, `malloc' by default. It will be
! 578: `strlen(str)+1' bytes, that being exactly enough for the string
! 579: and null-terminator.
! 580:
! 581: - Function: void mfree (MINT *OP)
! 582: De-allocate, the space used by OP. *This function should only be
! 583: passed a value returned by `itom' or `xtom'.*
! 584:
! 585:
! 586: File: gmp.info, Node: Custom Allocation, Next: Language Bindings, Prev: BSD Compatible Functions, Up: Top
! 587:
! 588: Custom Allocation
! 589: *****************
! 590:
! 591: By default GMP uses `malloc', `realloc' and `free' for memory
! 592: allocation, and if they fail GMP prints a message to the standard error
! 593: output and terminates the program.
! 594:
! 595: Alternate functions can be specified to allocate memory in a
! 596: different way or to have a different error action on running out of
! 597: memory.
! 598:
! 599: This feature is available in the Berkeley compatibility library
! 600: (*note BSD Compatible Functions::) as well as the main GMP library.
! 601:
! 602: - Function: void mp_set_memory_functions (
! 603: void *(*ALLOC_FUNC_PTR) (size_t),
! 604: void *(*REALLOC_FUNC_PTR) (void *, size_t, size_t),
! 605: void (*FREE_FUNC_PTR) (void *, size_t))
! 606: Replace the current allocation functions from the arguments. If
! 607: an argument is `NULL', the corresponding default function is used.
! 608:
! 609: These functions will be used for all memory allocation done by
! 610: GMP, apart from temporary space from `alloca' if that function is
! 611: available and GMP is configured to use it (*note Build Options::).
! 612:
! 613: *Be sure to call `mp_set_memory_functions' only when there are no
! 614: active GMP objects allocated using the previous memory functions!
! 615: Usually that means calling it before any other GMP function.*
! 616:
! 617: The functions supplied should fit the following declarations:
! 618:
! 619: - Function: void * allocate_function (size_t ALLOC_SIZE)
! 620: Return a pointer to newly allocated space with at least ALLOC_SIZE
! 621: bytes.
! 622:
! 623: - Function: void * reallocate_function (void *PTR, size_t OLD_SIZE,
! 624: size_t NEW_SIZE)
! 625: Resize a previously allocated block PTR of OLD_SIZE bytes to be
! 626: NEW_SIZE bytes.
! 627:
! 628: The block may be moved if necessary or if desired, and in that
! 629: case the smaller of OLD_SIZE and NEW_SIZE bytes must be copied to
! 630: the new location. The return value is a pointer to the resized
! 631: block, that being the new location if moved or just PTR if not.
! 632:
! 633: PTR is never `NULL', it's always a previously allocated block.
! 634: NEW_SIZE may be bigger or smaller than OLD_SIZE.
! 635:
! 636: - Function: void deallocate_function (void *PTR, size_t SIZE)
! 637: De-allocate the space pointed to by PTR.
! 638:
! 639: PTR is never `NULL', it's always a previously allocated block of
! 640: SIZE bytes.
! 641:
! 642: A "byte" here means the unit used by the `sizeof' operator.
! 643:
! 644: The OLD_SIZE parameters to REALLOCATE_FUNCTION and
! 645: DEALLOCATE_FUNCTION are passed for convenience, but of course can be
! 646: ignored if not needed. The default functions using `malloc' and friends
! 647: for instance don't use them.
! 648:
! 649: No error return is allowed from any of these functions, if they
! 650: return then they must have performed the specified operation. In
! 651: particular note that ALLOCATE_FUNCTION or REALLOCATE_FUNCTION mustn't
! 652: return `NULL'.
! 653:
! 654: Getting a different fatal error action is a good use for custom
! 655: allocation functions, for example giving a graphical dialog rather than
! 656: the default print to `stderr'. How much is possible when genuinely out
! 657: of memory is another question though.
! 658:
! 659: There's currently no defined way for the allocation functions to
! 660: recover from an error such as out of memory, they must terminate
! 661: program execution. A `longjmp' or throwing a C++ exception will have
! 662: undefined results. This may change in the future.
! 663:
! 664: GMP may use allocated blocks to hold pointers to other allocated
! 665: blocks. This will limit the assumptions a conservative garbage
! 666: collection scheme can make.
! 667:
! 668: Since the default GMP allocation uses `malloc' and friends, those
! 669: functions will be linked in even if the first thing a program does is an
! 670: `mp_set_memory_functions'. It's necessary to change the GMP sources if
! 671: this is a problem.
! 672:
! 673:
! 674: File: gmp.info, Node: Language Bindings, Next: Algorithms, Prev: Custom Allocation, Up: Top
! 675:
! 676: Language Bindings
! 677: *****************
! 678:
! 679: The following packages and projects offer access to GMP from
! 680: languages other than C, though perhaps with varying levels of
! 681: functionality and efficiency.
! 682:
! 683:
! 684: C++
! 685: * GMP C++ class interface, *note C++ Class Interface::
! 686: Straightforward interface, expression templates to eliminate
! 687: temporaries.
! 688:
! 689: * ALP `http://www.inria.fr/saga/logiciels/ALP'
! 690: Linear algebra and polynomials using templates.
! 691:
! 692: * Arithmos `http://win-www.uia.ac.be/u/cant/arithmos'
! 693: Rationals with infinities and square roots.
! 694:
! 695: * CLN `http://clisp.cons.org/~haible/packages-cln.html'
! 696: High level classes for arithmetic.
! 697:
! 698: * LiDIA `http://www.informatik.tu-darmstadt.de/TI/LiDIA'
! 699: A C++ library for computational number theory.
! 700:
! 701: * Linbox `http://www.linalg.org'
! 702: Sparse vectors and matrices.
! 703:
! 704: * NTL `http://www.shoup.net/ntl'
! 705: A C++ number theory library.
! 706:
! 707: Fortran
! 708: * Omni F77 `http://pdplab.trc.rwcp.or.jp/pdperf/Omni/home.html'
! 709: Arbitrary precision floats.
! 710:
! 711: Haskell
! 712: * Glasgow Haskell Compiler `http://www.haskell.org/ghc'
! 713:
! 714: Java
! 715: * Kaffe `http://www.kaffe.org'
! 716:
! 717: * Kissme `http://kissme.sourceforge.net'
! 718:
! 719: Lisp
! 720: * GNU Common Lisp `http://www.gnu.org/software/gcl/gcl.html'
! 721: In the process of switching to GMP for bignums.
! 722:
! 723: * Librep `http://librep.sourceforge.net'
! 724:
! 725: M4
! 726: * GNU m4 betas `http://www.seindal.dk/rene/gnu'
! 727: Optionally provides an arbitrary precision `mpeval'.
! 728:
! 729: ML
! 730: * MLton compiler `http://www.mlton.org'
! 731:
! 732: Oz
! 733: * Mozart `http://www.mozart-oz.org'
! 734:
! 735: Pascal
! 736: * GNU Pascal Compiler `http://www.gnu-pascal.de'
! 737: GMP unit.
! 738:
! 739: Perl
! 740: * GMP module, see `demos/perl' in the GMP sources.
! 741:
! 742: * Math::GMP `http://www.cpan.org'
! 743: Compatible with Math::BigInt, but not as many functions as
! 744: the GMP module above.
! 745:
! 746: * Math::BigInt::GMP `http://www.cpan.org'
! 747: Plug Math::GMP into normal Math::BigInt operations.
! 748:
! 749: Pike
! 750: * mpz module in the standard distribution,
! 751: `http://pike.idonex.com'
! 752:
! 753: Prolog
! 754: * SWI Prolog `http://www.swi.psy.uva.nl/projects/SWI-Prolog'
! 755: Arbitrary precision floats.
! 756:
! 757: Python
! 758: * mpz module in the standard distribution,
! 759: `http://www.python.org'
! 760:
! 761: * GMPY `http://gmpy.sourceforge.net'
! 762:
! 763: Scheme
! 764: * RScheme `http://www.rscheme.org'
! 765:
! 766: * STklos `http://kaolin.unice.fr/STklos'
! 767:
! 768: Smalltalk
! 769: * GNU Smalltalk
! 770: `http://www.smalltalk.org/versions/GNUSmalltalk.html'
! 771:
! 772: Other
! 773: * DrGenius `http://drgenius.seul.org'
! 774: Geometry system and mathematical programming language.
! 775:
! 776: * GiNaC `http://www.ginac.de'
! 777: C++ computer algebra using CLN.
! 778:
! 779: * Maxima `http://www.ma.utexas.edu/users/wfs/maxima.html'
! 780: Macsyma computer algebra using GCL.
! 781:
! 782: * Q `http://www.musikwissenschaft.uni-mainz.de/~ag/q'
! 783: Equational programming system.
! 784:
! 785: * Regina `http://regina.sourceforge.net'
! 786: Topological calculator.
! 787:
! 788: * Yacas `http://www.xs4all.nl/~apinkus/yacas.html'
! 789: Yet another computer algebra system.
! 790:
! 791:
! 792: File: gmp.info, Node: Algorithms, Next: Internals, Prev: Language Bindings, Up: Top
! 793:
! 794: Algorithms
! 795: **********
! 796:
! 797: This chapter is an introduction to some of the algorithms used for
! 798: various GMP operations. The code is likely to be hard to understand
! 799: without knowing something about the algorithms.
! 800:
! 801: Some GMP internals are mentioned, but applications that expect to be
! 802: compatible with future GMP releases should take care to use only the
! 803: documented functions.
! 804:
! 805: * Menu:
! 806:
! 807: * Multiplication Algorithms::
! 808: * Division Algorithms::
! 809: * Greatest Common Divisor Algorithms::
! 810: * Powering Algorithms::
! 811: * Root Extraction Algorithms::
! 812: * Radix Conversion Algorithms::
! 813: * Other Algorithms::
! 814: * Assembler Coding::
! 815:
! 816:
! 817: File: gmp.info, Node: Multiplication Algorithms, Next: Division Algorithms, Prev: Algorithms, Up: Algorithms
! 818:
! 819: Multiplication
! 820: ==============
! 821:
! 822: NxN limb multiplications and squares are done using one of four
! 823: algorithms, as the size N increases.
! 824:
! 825: Algorithm Threshold
! 826: Basecase (none)
! 827: Karatsuba `MUL_KARATSUBA_THRESHOLD'
! 828: Toom-3 `MUL_TOOM3_THRESHOLD'
! 829: FFT `MUL_FFT_THRESHOLD'
! 830:
! 831: Similarly for squaring, with the `SQR' thresholds. Note though that
! 832: the FFT is only used if GMP is configured with `--enable-fft', *note
! 833: Build Options::.
! 834:
! 835: NxM multiplications of operands with different sizes above
! 836: `MUL_KARATSUBA_THRESHOLD' are currently done by splitting into MxM
! 837: pieces. The Karatsuba and Toom-3 routines then operate only on equal
! 838: size operands. This is not very efficient, and is slated for
! 839: improvement in the future.
! 840:
! 841: * Menu:
! 842:
! 843: * Basecase Multiplication::
! 844: * Karatsuba Multiplication::
! 845: * Toom-Cook 3-Way Multiplication::
! 846: * FFT Multiplication::
! 847: * Other Multiplication::
! 848:
! 849:
! 850: File: gmp.info, Node: Basecase Multiplication, Next: Karatsuba Multiplication, Prev: Multiplication Algorithms, Up: Multiplication Algorithms
! 851:
! 852: Basecase Multiplication
! 853: -----------------------
! 854:
! 855: Basecase NxM multiplication is a straightforward rectangular set of
! 856: cross-products, the same as long multiplication done by hand and for
! 857: that reason sometimes known as the schoolbook or grammar school method.
! 858: This is an O(N*M) algorithm. See Knuth section 4.3.1 algorithm M
! 859: (*note References::), and the `mpn/generic/mul_basecase.c' code.
! 860:
! 861: Assembler implementations of `mpn_mul_basecase' are essentially the
! 862: same as the generic C code, but have all the usual assembler tricks and
! 863: obscurities introduced for speed.
! 864:
! 865: A square can be done in roughly half the time of a multiply, by
! 866: using the fact that the cross products above and below the diagonal are
! 867: the same. A triangle of products below the diagonal is formed, doubled
! 868: (left shift by one bit), and then the products on the diagonal added.
! 869: This can be seen in `mpn/generic/sqr_basecase.c'. Again the assembler
! 870: implementations take essentially the same approach.
! 871:
! 872: u0 u1 u2 u3 u4
! 873: +---+---+---+---+---+
! 874: u0 | d | | | | |
! 875: +---+---+---+---+---+
! 876: u1 | | d | | | |
! 877: +---+---+---+---+---+
! 878: u2 | | | d | | |
! 879: +---+---+---+---+---+
! 880: u3 | | | | d | |
! 881: +---+---+---+---+---+
! 882: u4 | | | | | d |
! 883: +---+---+---+---+---+
! 884:
! 885: In practice squaring isn't a full 2x faster than multiplying, it's
! 886: usually around 1.5x. Less than 1.5x probably indicates
! 887: `mpn_sqr_basecase' wants improving on that CPU.
! 888:
! 889: On some CPUs `mpn_mul_basecase' can be faster than the generic C
! 890: `mpn_sqr_basecase'. `SQR_BASECASE_THRESHOLD' is the size at which to
! 891: use `mpn_sqr_basecase', this will be zero if that routine should be
! 892: used always.
! 893:
! 894:
! 895: File: gmp.info, Node: Karatsuba Multiplication, Next: Toom-Cook 3-Way Multiplication, Prev: Basecase Multiplication, Up: Multiplication Algorithms
! 896:
! 897: Karatsuba Multiplication
! 898: ------------------------
! 899:
! 900: The Karatsuba multiplication algorithm is described in Knuth section
! 901: 4.3.3 part A, and various other textbooks. A brief description is
! 902: given here.
! 903:
! 904: The inputs x and y are treated as each split into two parts of equal
! 905: length (or the most significant part one limb shorter if N is odd).
! 906:
! 907: high low
! 908: +----------+----------+
! 909: | x1 | x0 |
! 910: +----------+----------+
! 911:
! 912: +----------+----------+
! 913: | y1 | y0 |
! 914: +----------+----------+
! 915:
! 916: Let b be the power of 2 where the split occurs, ie. if x0 is k limbs
! 917: (y0 the same) then b=2^(k*mp_bits_per_limb). With that x=x1*b+x0 and
! 918: y=y1*b+y0, and the following holds,
! 919:
! 920: x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0
! 921:
! 922: This formula means doing only three multiplies of (N/2)x(N/2) limbs,
! 923: whereas a basecase multiply of NxN limbs is equivalent to four
! 924: multiplies of (N/2)x(N/2). The factors (b^2+b) etc represent the
! 925: positions where the three products must be added.
! 926:
! 927: high low
! 928: +--------+--------+ +--------+--------+
! 929: | x1*y1 | | x0*y0 |
! 930: +--------+--------+ +--------+--------+
! 931: +--------+--------+
! 932: add | x1*y1 |
! 933: +--------+--------+
! 934: +--------+--------+
! 935: add | x0*y0 |
! 936: +--------+--------+
! 937: +--------+--------+
! 938: sub | (x1-x0)*(y1-y0) |
! 939: +--------+--------+
! 940:
! 941: The term (x1-x0)*(y1-y0) is best calculated as an absolute value,
! 942: and the sign used to choose to add or subtract. Notice the sum
! 943: high(x0*y0)+low(x1*y1) occurs twice, so it's possible to do 5*k limb
! 944: additions, rather than 6*k, but in GMP extra function call overheads
! 945: outweigh the saving.
! 946:
! 947: Squaring is similar to multiplying, but with x=y the formula reduces
! 948: to an equivalent with three squares,
! 949:
! 950: x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2
! 951:
! 952: The final result is accumulated from those three squares the same
! 953: way as for the three multiplies above. The middle term (x1-x0)^2 is now
! 954: always positive.
! 955:
! 956: A similar formula for both multiplying and squaring can be
! 957: constructed with a middle term (x1+x0)*(y1+y0). But those sums can
! 958: exceed k limbs, leading to more carry handling and additions than the
! 959: form above.
! 960:
! 961: Karatsuba multiplication is asymptotically an O(N^1.585) algorithm,
! 962: the exponent being log(3)/log(2), representing 3 multiplies each 1/2
! 963: the size of the inputs. This is a big improvement over the basecase
! 964: multiply at O(N^2) and the advantage soon overcomes the extra additions
! 965: Karatsuba performs.
! 966:
! 967: `MUL_KARATSUBA_THRESHOLD' can be as little as 10 limbs. The `SQR'
! 968: threshold is usually about twice the `MUL'. The basecase algorithm will
! 969: take a time of the form M(N) = a*N^2 + b*N + c and the Karatsuba
! 970: algorithm K(N) = 3*M(N/2) + d*N + e. Clearly per-crossproduct speedups
! 971: in the basecase code reduce a and decrease the threshold, but linear
! 972: style speedups reducing b will actually increase the threshold. The
! 973: latter can be seen for instance when adding an optimized
! 974: `mpn_sqr_diagonal' to `mpn_sqr_basecase'. Of course all speedups
! 975: reduce total time, and in that sense the algorithm thresholds are
! 976: merely of academic interest.
! 977:
! 978:
! 979: File: gmp.info, Node: Toom-Cook 3-Way Multiplication, Next: FFT Multiplication, Prev: Karatsuba Multiplication, Up: Multiplication Algorithms
! 980:
! 981: Toom-Cook 3-Way Multiplication
! 982: ------------------------------
! 983:
! 984: The Karatsuba formula is the simplest case of a general approach to
! 985: splitting inputs that leads to both Toom-Cook and FFT algorithms. A
! 986: description of Toom-Cook can be found in Knuth section 4.3.3, with an
! 987: example 3-way calculation after Theorem A. The 3-way form used in GMP
! 988: is described here.
! 989:
! 990: The operands are each considered split into 3 pieces of equal length
! 991: (or the most significant part 1 or 2 limbs shorter than the others).
! 992:
! 993: high low
! 994: +----------+----------+----------+
! 995: | x2 | x1 | x0 |
! 996: +----------+----------+----------+
! 997:
! 998: +----------+----------+----------+
! 999: | y2 | y1 | y0 |
! 1000: +----------+----------+----------+
! 1001:
! 1002: These parts are treated as the coefficients of two polynomials
! 1003:
! 1004: X(t) = x2*t^2 + x1*t + x0
! 1005: Y(t) = y2*t^2 + y1*t + y0
! 1006:
! 1007: Again let b equal the power of 2 which is the size of the x0, x1, y0
! 1008: and y1 pieces, ie. if they're k limbs each then
! 1009: b=2^(k*mp_bits_per_limb). With this x=X(b) and y=Y(b).
! 1010:
! 1011: Let a polynomial W(t)=X(t)*Y(t) and suppose its coefficients are
! 1012:
! 1013: W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0
! 1014:
! 1015: The w[i] are going to be determined, and when they are they'll give the
! 1016: final result using w=W(b), since x*y=X(b)*Y(b)=W(b). The coefficients
! 1017: will be roughly b^2 each, and the final W(b) will be an addition like,
! 1018:
! 1019: high low
! 1020: +-------+-------+
! 1021: | w4 |
! 1022: +-------+-------+
! 1023: +--------+-------+
! 1024: | w3 |
! 1025: +--------+-------+
! 1026: +--------+-------+
! 1027: | w2 |
! 1028: +--------+-------+
! 1029: +--------+-------+
! 1030: | w1 |
! 1031: +--------+-------+
! 1032: +-------+-------+
! 1033: | w0 |
! 1034: +-------+-------+
! 1035:
! 1036: The w[i] coefficients could be formed by a simple set of cross
! 1037: products, like w4=x2*y2, w3=x2*y1+x1*y2, w2=x2*y0+x1*y1+x0*y2 etc, but
! 1038: this would need all nine x[i]*y[j] for i,j=0,1,2, and would be
! 1039: equivalent merely to a basecase multiply. Instead the following
! 1040: approach is used.
! 1041:
! 1042: X(t) and Y(t) are evaluated and multiplied at 5 points, giving
! 1043: values of W(t) at those points. The points used can be chosen in
! 1044: various ways, but in GMP the following are used
! 1045:
! 1046: Point Value
! 1047: t=0 x0*y0, which gives w0 immediately
! 1048: t=2 (4*x2+2*x1+x0)*(4*y2+2*y1+y0)
! 1049: t=1 (x2+x1+x0)*(y2+y1+y0)
! 1050: t=1/2 (x2+2*x1+4*x0)*(y2+2*y1+4*y0)
! 1051: t=inf x2*y2, which gives w4 immediately
! 1052:
! 1053: At t=1/2 the value calculated is actually 16*X(1/2)*Y(1/2), giving a
! 1054: value for 16*W(1/2), and this is always an integer. At t=inf the value
! 1055: is actually X(t)*Y(t)/t^4 in the limit as t approaches infinity, but
! 1056: it's much easier to think of as simply x2*y2 giving w4 immediately
! 1057: (much like x0*y0 at t=0 gives w0 immediately).
! 1058:
! 1059: Now each of the points substituted into W(t)=w4*t^4+...+w0 gives a
! 1060: linear combination of the w[i] coefficients, and the value of those
! 1061: combinations has just been calculated.
! 1062:
! 1063: W(0) = w0
! 1064: 16*W(1/2) = w4 + 2*w3 + 4*w2 + 8*w1 + 16*w0
! 1065: W(1) = w4 + w3 + w2 + w1 + w0
! 1066: W(2) = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0
! 1067: W(inf) = w4
! 1068:
! 1069: This is a set of five equations in five unknowns, and some
! 1070: elementary linear algebra quickly isolates each w[i], by subtracting
! 1071: multiples of one equation from another.
! 1072:
! 1073: In the code the set of five values W(0),...,W(inf) will represent
! 1074: those certain linear combinations. By adding or subtracting one from
! 1075: another as necessary, values which are each w[i] alone are arrived at.
! 1076: This involves only a few subtractions of small multiples (some of which
! 1077: are powers of 2), and so is fast. A couple of divisions remain by
! 1078: powers of 2 and one division by 3 (or by 6 rather), and that last uses
! 1079: the special `mpn_divexact_by3' (*note Exact Division::).
! 1080:
! 1081: In the code the values w4, w2 and w0 are formed in the destination
! 1082: with pointers `E', `C' and `A', and w3 and w1 in temporary space `D'
! 1083: and `B' are added to them. There are extra limbs `tD', `tC' and `tB'
! 1084: at the high end of w3, w2 and w1 which are handled separately. The
! 1085: final addition then is as follows.
! 1086:
! 1087: high low
! 1088: +-------+-------+-------+-------+-------+-------+
! 1089: | E | C | A |
! 1090: +-------+-------+-------+-------+-------+-------+
! 1091: +------+-------++------+-------+
! 1092: | D || B |
! 1093: +------+-------++------+-------+
! 1094: -- -- --
! 1095: |tD| |tC| |tB|
! 1096: -- -- --
! 1097:
! 1098: The conversion of W(t) values to the coefficients is interpolation.
! 1099: A polynomial of degree 4 like W(t) is uniquely determined by values
! 1100: known at 5 different points. The points can be chosen to make the
! 1101: linear equations come out with a convenient set of steps for isolating
! 1102: the w[i].
! 1103:
! 1104: In `mpn/generic/mul_n.c' the `interpolate3' routine performs the
! 1105: interpolation. The open-coded one-pass version may be a bit hard to
! 1106: understand, the steps performed can be better seen in the `USE_MORE_MPN'
! 1107: version.
! 1108:
! 1109: Squaring follows the same procedure as multiplication, but there's
! 1110: only one X(t) and it's evaluated at 5 points, and those values squared
! 1111: to give values of W(t). The interpolation is then identical, and in
! 1112: fact the same `interpolate3' subroutine is used for both squaring and
! 1113: multiplying.
! 1114:
! 1115: Toom-3 is asymptotically O(N^1.465), the exponent being
! 1116: log(5)/log(3), representing 5 recursive multiplies of 1/3 the original
! 1117: size. This is an improvement over Karatsuba at O(N^1.585), though
! 1118: Toom-Cook does more work in the evaluation and interpolation and so it
! 1119: only realizes its advantage above a certain size.
! 1120:
! 1121: Near the crossover between Toom-3 and Karatsuba there's generally a
! 1122: range of sizes where the difference between the two is small.
! 1123: `MUL_TOOM3_THRESHOLD' is a somewhat arbitrary point in that range and
! 1124: successive runs of the tune program can give different values due to
! 1125: small variations in measuring. A graph of time versus size for the two
! 1126: shows the effect, see `tune/README'.
! 1127:
! 1128: At the fairly small sizes where the Toom-3 thresholds occur it's
! 1129: worth remembering that the asymptotic behaviour for Karatsuba and
! 1130: Toom-3 can't be expected to make accurate predictions, due of course to
! 1131: the big influence of all sorts of overheads, and the fact that only a
! 1132: few recursions of each are being performed. Even at large sizes
! 1133: there's a good chance machine dependent effects like cache architecture
! 1134: will mean actual performance deviates from what might be predicted.
! 1135:
! 1136: The formula given above for the Karatsuba algorithm has an
! 1137: equivalent for Toom-3 involving only five multiplies, but this would be
! 1138: complicated and unenlightening.
! 1139:
! 1140: An alternate view of Toom-3 can be found in Zuras (*note
! 1141: References::), using a vector to represent the x and y splits and a
! 1142: matrix multiplication for the evaluation and interpolation stages. The
! 1143: matrix inverses are not meant to be actually used, and they have
! 1144: elements with values much greater than in fact arise in the
! 1145: interpolation steps. The diagram shown for the 3-way is attractive,
! 1146: but again doesn't have to be implemented that way and for example with
! 1147: a bit of rearrangement just one division by 6 can be done.
! 1148:
FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>