Annotation of OpenXM_contrib/gmp/gmp.texi, Revision 1.1.1.4
1.1 maekawa 1: \input texinfo @c -*-texinfo-*-
2: @c %**start of header
3: @setfilename gmp.info
1.1.1.2 maekawa 4: @include version.texi
5: @settitle GNU MP @value{VERSION}
1.1 maekawa 6: @synindex tp fn
7: @iftex
8: @afourpaper
9: @end iftex
10: @comment %**end of header
11:
1.1.1.4 ! ohara 12: @copying
! 13: This manual describes how to install and use the GNU multiple precision
! 14: arithmetic library, version @value{VERSION}.
1.1 maekawa 15:
1.1.1.4 ! ohara 16: Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002
! 17: Free Software Foundation, Inc.
1.1 maekawa 18:
1.1.1.4 ! ohara 19: Permission is granted to copy, distribute and/or modify this document under
! 20: the terms of the GNU Free Documentation License, Version 1.1 or any later
! 21: version published by the Free Software Foundation; with no Invariant Sections,
! 22: with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
! 23: Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
! 24: software''. A copy of the license is included in @ref{GNU Free Documentation
! 25: License}.
! 26: @end copying
1.1 maekawa 27:
28:
1.1.1.4 ! ohara 29: @c Texinfo version 4.2 or up will be needed to process this into .info files.
! 30: @c
! 31: @c The supplied texinfo.tex (or newer) should be used when processing into
! 32: @c .dvi etc.
! 33: @c
! 34: @c The version number and edition number are taken from version.texi provided
! 35: @c by automake (note it's regenerated only if you configure with
! 36: @c --enable-maintainer-mode).
! 37: @c
! 38: @c Discussions about this version in relation to previous ones (for instance
! 39: @c in the "Compatibility" section) obviously must be looked at manually
! 40: @c though.
! 41: @c
! 42: @c "cindex" entries have been made for function categories and programming
! 43: @c topics. Minutiae like particular systems and processors mentioned in
! 44: @c various places have been left out so as not to bury important topics under
! 45: @c a lot of junk. "mpn" functions aren't in the concept index because a
! 46: @c beginner looking for "GCD" or something is only going to be confused by
! 47: @c pointers to low level routines.
1.1 maekawa 48:
49:
1.1.1.4 ! ohara 50: @dircategory GNU libraries
! 51: @direntry
! 52: * gmp: (gmp). GNU Multiple Precision Arithmetic Library.
! 53: @end direntry
1.1 maekawa 54:
1.1.1.4 ! ohara 55: @c html <meta name=description content="...">
! 56: @documentdescription
! 57: How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}.
! 58: @end documentdescription
1.1 maekawa 59:
1.1.1.4 ! ohara 60: @c smallbook
! 61: @finalout
! 62: @setchapternewpage on
1.1.1.2 maekawa 63:
1.1.1.4 ! ohara 64: @ifnottex
! 65: @node Top, Copying, (dir), (dir)
! 66: @top GNU MP
1.1.1.2 maekawa 67: @end ifnottex
1.1 maekawa 68:
1.1.1.4 ! ohara 69: @iftex
1.1 maekawa 70: @titlepage
71: @title GNU MP
72: @subtitle The GNU Multiple Precision Arithmetic Library
1.1.1.2 maekawa 73: @subtitle Edition @value{EDITION}
74: @subtitle @value{UPDATED}
1.1 maekawa 75:
1.1.1.2 maekawa 76: @author by Torbj@"orn Granlund, Swox AB
77: @email{tege@@swox.com}
1.1 maekawa 78:
79: @c Include the Distribution inside the titlepage so
80: @c that headings are turned off.
81:
82: @tex
83: \global\parindent=0pt
84: \global\parskip=8pt
85: \global\baselineskip=13pt
86: @end tex
87:
88: @page
89: @vskip 0pt plus 1filll
1.1.1.4 ! ohara 90: @end iftex
1.1 maekawa 91:
1.1.1.4 ! ohara 92: @insertcopying
! 93: @ifnottex
! 94: @sp 1
! 95: @end ifnottex
1.1 maekawa 96:
1.1.1.4 ! ohara 97: @iftex
1.1 maekawa 98: @end titlepage
99: @headings double
1.1.1.4 ! ohara 100: @end iftex
1.1 maekawa 101:
1.1.1.4 ! ohara 102: @c Don't bother with contents for html, the menus seem adequate.
! 103: @ifnothtml
! 104: @contents
! 105: @end ifnothtml
1.1 maekawa 106:
107: @menu
1.1.1.2 maekawa 108: * Copying:: GMP Copying Conditions (LGPL).
109: * Introduction to GMP:: Brief introduction to GNU MP.
110: * Installing GMP:: How to configure and compile the GMP library.
1.1.1.4 ! ohara 111: * GMP Basics:: What every GMP user should know.
1.1.1.2 maekawa 112: * Reporting Bugs:: How to usefully report bugs.
113: * Integer Functions:: Functions for arithmetic on signed integers.
114: * Rational Number Functions:: Functions for arithmetic on rational numbers.
115: * Floating-point Functions:: Functions for arithmetic on floats.
116: * Low-level Functions:: Fast functions for natural numbers.
117: * Random Number Functions:: Functions for generating random numbers.
1.1.1.4 ! ohara 118: * Formatted Output:: @code{printf} style output.
! 119: * Formatted Input:: @code{scanf} style input.
! 120: * C++ Class Interface:: Class wrappers around GMP types.
1.1.1.2 maekawa 121: * BSD Compatible Functions:: All functions found in BSD MP.
122: * Custom Allocation:: How to customize the internal allocation.
1.1.1.4 ! ohara 123: * Language Bindings:: Using GMP from other languages.
! 124: * Algorithms:: What happens behind the scenes.
! 125: * Internals:: How values are represented behind the scenes.
1.1 maekawa 126:
1.1.1.2 maekawa 127: * Contributors:: Who brings you this library?
128: * References:: Some useful papers and books to read.
1.1.1.4 ! ohara 129: * GNU Free Documentation License::
1.1 maekawa 130: * Concept Index::
131: * Function Index::
132: @end menu
133:
1.1.1.4 ! ohara 134:
! 135: @c @m{T,N} is $T$ in tex or @math{N} otherwise. This is an easy way to give
! 136: @c different forms for math in tex and info. Commas in N or T don't work,
! 137: @c but @C{} can be used instead. \, works in info but not in tex.
! 138: @iftex
! 139: @macro m {T,N}
! 140: @tex$\T\$@end tex
! 141: @end macro
! 142: @end iftex
! 143: @ifnottex
! 144: @macro m {T,N}
! 145: @math{\N\}
! 146: @end macro
! 147: @end ifnottex
! 148:
! 149: @macro C {}
! 150: ,
! 151: @end macro
! 152:
! 153: @c @ms{V,N} is $V_N$ in tex or just vn otherwise. This suits simple
! 154: @c subscripts like @ms{x,0}.
! 155: @iftex
! 156: @macro ms {V,N}
! 157: @tex$\V\_{\N\}$@end tex
! 158: @end macro
! 159: @end iftex
! 160: @ifnottex
! 161: @macro ms {V,N}
! 162: \V\\N\
! 163: @end macro
! 164: @end ifnottex
! 165:
! 166: @c @nicode{S} is plain S in info, or @code{S} elsewhere. This can be used
! 167: @c when the quotes that @code{} gives in info aren't wanted, but the
! 168: @c fontification in tex or html is wanted. Doesn't work as @nicode{'\\0'}
! 169: @c though (gives two backslashes in tex).
! 170: @ifinfo
! 171: @macro nicode {S}
! 172: \S\
! 173: @end macro
! 174: @end ifinfo
! 175: @ifnotinfo
! 176: @macro nicode {S}
! 177: @code{\S\}
! 178: @end macro
! 179: @end ifnotinfo
! 180:
! 181: @c @nisamp{S} is plain S in info, or @samp{S} elsewhere. This can be used
! 182: @c when the quotes that @samp{} gives in info aren't wanted, but the
! 183: @c fontification in tex or html is wanted.
! 184: @ifinfo
! 185: @macro nisamp {S}
! 186: \S\
! 187: @end macro
! 188: @end ifinfo
! 189: @ifnotinfo
! 190: @macro nisamp {S}
! 191: @samp{\S\}
! 192: @end macro
! 193: @end ifnotinfo
! 194:
! 195: @c Usage: @GMPtimes{}
! 196: @c Give either \times or the word "times".
! 197: @tex
! 198: \gdef\GMPtimes{\times}
! 199: @end tex
! 200: @ifnottex
! 201: @macro GMPtimes
! 202: times
! 203: @end macro
! 204: @end ifnottex
! 205:
! 206: @c Usage: @GMPmultiply{}
! 207: @c Give * in info, or nothing in tex.
! 208: @tex
! 209: \gdef\GMPmultiply{}
! 210: @end tex
! 211: @ifnottex
! 212: @macro GMPmultiply
! 213: *
! 214: @end macro
! 215: @end ifnottex
! 216:
! 217: @c Usage: @GMPabs{x}
! 218: @c Give either |x| in tex, or abs(x) in info or html.
! 219: @tex
! 220: \gdef\GMPabs#1{|#1|}
! 221: @end tex
! 222: @ifnottex
! 223: @macro GMPabs {X}
! 224: @abs{}(\X\)
! 225: @end macro
! 226: @end ifnottex
! 227:
! 228: @c Usage: @GMPfloor{x}
! 229: @c Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
! 230: @tex
! 231: \gdef\GMPfloor#1{\lfloor #1\rfloor}
! 232: @end tex
! 233: @ifnottex
! 234: @macro GMPfloor {X}
! 235: floor(\X\)
! 236: @end macro
! 237: @end ifnottex
! 238:
! 239: @c Usage: @GMPceil{x}
! 240: @c Give either \lceil x\rceil in tex, or ceil(x) in info or html.
! 241: @tex
! 242: \gdef\GMPceil#1{\lceil #1 \rceil}
! 243: @end tex
! 244: @ifnottex
! 245: @macro GMPceil {X}
! 246: ceil(\X\)
! 247: @end macro
! 248: @end ifnottex
! 249:
! 250: @c Math operators already available in tex, made available in info too.
! 251: @c For example @bmod{} can be used in both tex and info.
! 252: @ifnottex
! 253: @macro bmod
! 254: mod
! 255: @end macro
! 256: @macro gcd
! 257: gcd
! 258: @end macro
! 259: @macro ge
! 260: >=
! 261: @end macro
! 262: @macro le
! 263: <=
! 264: @end macro
! 265: @macro log
! 266: log
! 267: @end macro
! 268: @macro min
! 269: min
! 270: @end macro
! 271: @macro rightarrow
! 272: ->
! 273: @end macro
! 274: @end ifnottex
! 275:
! 276: @c New math operators.
! 277: @c @abs{} can be used in both tex and info, or just \abs in tex.
! 278: @tex
! 279: \gdef\abs{\mathop{\rm abs}}
! 280: @end tex
! 281: @ifnottex
! 282: @macro abs
! 283: abs
! 284: @end macro
! 285: @end ifnottex
! 286:
! 287: @c @cross{} is a \times symbol in tex, or an "x" in info. In tex it works
! 288: @c inside or outside $ $.
! 289: @tex
! 290: \gdef\cross{\ifmmode\times\else$\times$\fi}
! 291: @end tex
! 292: @ifnottex
! 293: @macro cross
! 294: x
! 295: @end macro
! 296: @end ifnottex
! 297:
! 298: @c @times{} made available as a "*" in info and html (already works in tex).
! 299: @ifnottex
! 300: @macro times
! 301: *
! 302: @end macro
! 303: @end ifnottex
! 304:
! 305: @c Usage: @W{text}
! 306: @c Like @w{} but working in math mode too.
! 307: @tex
! 308: \gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}
! 309: @end tex
! 310: @ifnottex
! 311: @macro W {S}
! 312: @w{\S\}
! 313: @end macro
! 314: @end ifnottex
! 315:
! 316: @c Usage: \GMPdisplay{text}
! 317: @c Put the given text in an @display style indent, but without turning off
! 318: @c paragraph reflow etc.
! 319: @tex
! 320: \gdef\GMPdisplay#1{%
! 321: \noindent
! 322: \advance\leftskip by \lispnarrowing
! 323: #1\par}
! 324: @end tex
! 325:
! 326: @c Usage: \GMPhat
! 327: @c A new \hat that will work in math mode, unlike the texinfo redefined
! 328: @c version.
! 329: @tex
! 330: \gdef\GMPhat{\mathaccent"705E}
! 331: @end tex
! 332:
! 333: @c Usage: \GMPraise{text}
! 334: @c For use in a $ $ math expression as an alternative to "^". This is good
! 335: @c for @code{} in an exponent, since there seems to be no superscript font
! 336: @c for that.
! 337: @tex
! 338: \gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}
! 339: @end tex
! 340:
! 341: @c Usage: @texlinebreak{}
! 342: @c A line break as per @*, but only in tex.
! 343: @iftex
! 344: @macro texlinebreak
! 345: @*
! 346: @end macro
! 347: @end iftex
! 348: @ifnottex
! 349: @macro texlinebreak
! 350: @end macro
! 351: @end ifnottex
! 352:
! 353: @c Usage: @maybepagebreak
! 354: @c Allow tex to insert a page break, if it feels the urge.
! 355: @c Normally blocks of @deftypefun/funx are kept together, which can lead to
! 356: @c some poor page break positioning if it's a big block, like the sets of
! 357: @c division functions etc.
! 358: @tex
! 359: \gdef\maybepagebreak{\penalty0}
! 360: @end tex
! 361: @ifnottex
! 362: @macro maybepagebreak
! 363: @end macro
! 364: @end ifnottex
! 365:
! 366:
1.1.1.2 maekawa 367: @node Copying, Introduction to GMP, Top, Top
1.1 maekawa 368: @comment node-name, next, previous, up
369: @unnumbered GNU MP Copying Conditions
370: @cindex Copying conditions
371: @cindex Conditions for copying GNU MP
1.1.1.4 ! ohara 372: @cindex License conditions
1.1 maekawa 373:
374: This library is @dfn{free}; this means that everyone is free to use it and
375: free to redistribute it on a free basis. The library is not in the public
376: domain; it is copyrighted and there are restrictions on its distribution, but
377: these restrictions are designed to permit everything that a good cooperating
378: citizen would want to do. What is not allowed is to try to prevent others
379: from further sharing any version of this library that they might get from
380: you.@refill
381:
382: Specifically, we want to make sure that you have the right to give away copies
383: of the library, that you receive source code or else can get it if you want
384: it, that you can change this library or use pieces of it in new free programs,
385: and that you know you can do these things.@refill
386:
387: To make sure that everyone has such rights, we have to forbid you to deprive
388: anyone else of these rights. For example, if you distribute copies of the GNU
389: MP library, you must give the recipients all the rights that you have. You
390: must make sure that they, too, receive or can get the source code. And you
391: must tell them their rights.@refill
392:
393: Also, for our own protection, we must make certain that everyone finds out
394: that there is no warranty for the GNU MP library. If it is modified by
395: someone else and passed on, we want their recipients to know that what they
396: have is not what we distributed, so that any problems introduced by others
397: will not reflect on our reputation.@refill
398:
399: The precise conditions of the license for the GNU MP library are found in the
1.1.1.4 ! ohara 400: Lesser General Public License version 2.1 that accompanies the source code,
! 401: see @file{COPYING.LIB}. Certain demonstration programs are provided under the
! 402: terms of the plain General Public License version 2, see @file{COPYING}.
1.1 maekawa 403:
1.1.1.2 maekawa 404:
405: @node Introduction to GMP, Installing GMP, Copying, Top
1.1 maekawa 406: @comment node-name, next, previous, up
407: @chapter Introduction to GNU MP
1.1.1.2 maekawa 408: @cindex Introduction
1.1 maekawa 409:
410: GNU MP is a portable library written in C for arbitrary precision arithmetic
411: on integers, rational numbers, and floating-point numbers. It aims to provide
412: the fastest possible arithmetic for all applications that need higher
413: precision than is directly supported by the basic C types.
414:
415: Many applications use just a few hundred bits of precision; but some
1.1.1.2 maekawa 416: applications may need thousands or even millions of bits. GMP is designed to
1.1 maekawa 417: give good performance for both, by choosing algorithms based on the sizes of
418: the operands, and by carefully keeping the overhead at a minimum.
419:
1.1.1.2 maekawa 420: The speed of GMP is achieved by using fullwords as the basic arithmetic type,
1.1 maekawa 421: by using sophisticated algorithms, by including carefully optimized assembly
422: code for the most common inner loops for many different CPUs, and by a general
423: emphasis on speed (as opposed to simplicity or elegance).
424:
1.1.1.2 maekawa 425: There is carefully optimized assembly code for these CPUs:
426: @cindex CPUs supported
427: ARM,
428: DEC Alpha 21064, 21164, and 21264,
429: AMD 29000,
1.1.1.4 ! ohara 430: AMD K6, K6-2 and Athlon,
1.1.1.2 maekawa 431: Hitachi SuperH and SH-2,
432: HPPA 1.0, 1.1 and 2.0,
1.1.1.4 ! ohara 433: Intel Pentium, Pentium Pro/II/III, Pentium 4, generic x86,
! 434: Intel IA-64, i960,
1.1.1.2 maekawa 435: Motorola MC68000, MC68020, MC88100, and MC88110,
436: Motorola/IBM PowerPC 32 and 64,
437: National NS32000,
438: IBM POWER,
439: MIPS R3000, R4000,
440: SPARCv7, SuperSPARC, generic SPARCv8, UltraSPARC,
441: DEC VAX,
1.1.1.4 ! ohara 442: and
! 443: Zilog Z8000.
! 444: There are also some optimizations for
! 445: Cray vector systems,
! 446: Clipper,
! 447: IBM ROMP (RT),
! 448: and
! 449: Pyramid AP/XP.
! 450:
! 451: @cindex Mailing lists
! 452: There are two public mailing lists of interest: one for general questions and
! 453: discussions about usage of the GMP library, and one for discussions about
! 454: development of GMP. There's more information about the mailing lists at
! 455: @uref{http://swox.com/mailman/listinfo/}. These lists are @strong{not} for
! 456: bug reports.
1.1.1.2 maekawa 457:
1.1.1.4 ! ohara 458: The proper place for bug reports is @email{bug-gmp@@gnu.org}. See
! 459: @ref{Reporting Bugs} for info about reporting bugs.
1.1.1.2 maekawa 460:
461: @cindex Home page
462: @cindex Web page
1.1.1.4 ! ohara 463: For up-to-date information on GMP, please see the GMP web pages at
! 464:
! 465: @display
! 466: @uref{http://swox.com/gmp/}
! 467: @end display
! 468:
! 469: @cindex Latest version of GMP
! 470: @cindex Anonymous FTP of latest version
! 471: @cindex FTP of latest version
! 472: The latest version of the library is available at
! 473:
! 474: @display
! 475: @uref{ftp://ftp.gnu.org/gnu/gmp}
! 476: @end display
! 477:
! 478: Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror
! 479: near you. See @uref{http://www.gnu.org/order/ftp.html} for a full list.
1.1 maekawa 480:
481:
482: @section How to use this Manual
1.1.1.2 maekawa 483: @cindex About this manual
1.1 maekawa 484:
1.1.1.2 maekawa 485: Everyone should read @ref{GMP Basics}. If you need to install the library
1.1.1.4 ! ohara 486: yourself, then read @ref{Installing GMP}. If you have a system with multiple
! 487: ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
! 488: on applications.
1.1 maekawa 489:
490: The rest of the manual can be used for later reference, although it is
491: probably a good idea to glance through it.
492:
493:
1.1.1.2 maekawa 494: @node Installing GMP, GMP Basics, Introduction to GMP, Top
1.1 maekawa 495: @comment node-name, next, previous, up
1.1.1.2 maekawa 496: @chapter Installing GMP
497: @cindex Installing GMP
498: @cindex Configuring GMP
1.1.1.4 ! ohara 499: @cindex Building GMP
1.1 maekawa 500:
1.1.1.2 maekawa 501: GMP has an autoconf/automake/libtool based configuration system. On a
502: Unix-like system a basic build can be done with
1.1 maekawa 503:
1.1.1.2 maekawa 504: @example
505: ./configure
506: make
507: @end example
1.1 maekawa 508:
1.1.1.2 maekawa 509: @noindent
510: Some self-tests can be run with
1.1 maekawa 511:
1.1.1.2 maekawa 512: @example
513: make check
514: @end example
1.1 maekawa 515:
1.1.1.2 maekawa 516: @noindent
517: And you can install (under @file{/usr/local} by default) with
1.1 maekawa 518:
1.1.1.2 maekawa 519: @example
520: make install
521: @end example
1.1 maekawa 522:
1.1.1.2 maekawa 523: If you experience problems, please report them to @email{bug-gmp@@gnu.org}.
1.1.1.4 ! ohara 524: See @ref{Reporting Bugs}, for information on what to include in useful bug
! 525: reports.
1.1 maekawa 526:
1.1.1.2 maekawa 527: @menu
528: * Build Options::
529: * ABI and ISA::
530: * Notes for Package Builds::
531: * Notes for Particular Systems::
532: * Known Build Problems::
533: @end menu
1.1 maekawa 534:
535:
1.1.1.2 maekawa 536: @node Build Options, ABI and ISA, Installing GMP, Installing GMP
537: @section Build Options
538: @cindex Build options
1.1 maekawa 539:
1.1.1.2 maekawa 540: All the usual autoconf configure options are available; run @samp{./configure
1.1.1.4 ! ohara 541: --help} for a summary. The file @file{INSTALL.autoconf} has some generic
! 542: installation information too.
1.1 maekawa 543:
1.1.1.2 maekawa 544: @table @asis
545: @item Non-Unix Systems
1.1 maekawa 546:
1.1.1.4 ! ohara 547: @samp{configure} requires various Unix-like tools. On an MS-DOS system DJGPP
! 548: can be used, and on MS Windows Cygwin or MINGW can be used,
! 549:
! 550: @display
! 551: @uref{http://www.cygnus.com/cygwin}
! 552: @uref{http://www.delorie.com/djgpp}
! 553: @uref{http://www.mingw.org}
! 554: @end display
! 555:
! 556: Microsoft also publishes an Interix ``Services for Unix'' which can be used to
! 557: build GMP on Windows (with a normal @samp{./configure}), but it's not free
! 558: software.
! 559:
! 560: The @file{macos} directory contains an unsupported port to MacOS 9 on Power
! 561: Macintosh, see @file{macos/README}. Note that MacOS X ``Darwin'' should use
! 562: the normal Unix-style @samp{./configure}.
1.1.1.2 maekawa 563:
1.1.1.4 ! ohara 564: It might be possible to build without the help of @samp{configure} (certainly
! 565: all the code is there), but unfortunately you'll be on your own.
1.1.1.2 maekawa 566:
1.1.1.4 ! ohara 567: @item Build Directory
! 568:
! 569: To compile in a separate build directory, @command{cd} to that directory, and
1.1.1.2 maekawa 570: prefix the configure command with the path to the GMP source directory. For
1.1.1.4 ! ohara 571: example
! 572:
! 573: @example
! 574: cd /my/build/dir
! 575: /my/sources/gmp-@value{VERSION}/configure
! 576: @end example
! 577:
! 578: Not all @samp{make} programs have the necessary features (@code{VPATH}) to
! 579: support this. In particular, SunOS and Solaris @command{make} have bugs that
! 580: make them unable to build in a separate directory. Use GNU @command{make}
! 581: instead.
1.1.1.2 maekawa 582:
583: @item @option{--disable-shared}, @option{--disable-static}
584:
585: By default both shared and static libraries are built (where possible), but
1.1.1.4 ! ohara 586: one or other can be disabled. Shared libraries result in smaller executables
! 587: and permit code sharing between separate running processes, but on some CPUs
! 588: are slightly slower, having a small cost on each function call.
! 589:
! 590: @item Native Compilation, @option{--build=CPU-VENDOR-OS}
! 591:
! 592: For normal native compilation, the system can be specified with
! 593: @samp{--build}. By default @samp{./configure} uses the output from running
! 594: @samp{./config.guess}. On some systems @samp{./config.guess} can determine
! 595: the exact CPU type; on others it will be necessary to give it explicitly. For
! 596: example,
! 597:
! 598: @example
! 599: ./configure --build=ultrasparc-sun-solaris2.7
! 600: @end example
! 601:
! 602: In all cases the @samp{OS} part is important, since it controls how libtool
! 603: generates shared libraries. Running @samp{./config.guess} is the simplest way
! 604: to see what it should be, if you don't know already.
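
As a rough illustration of how a @samp{CPU-VENDOR-OS} system type breaks down, the sketch below splits one into its three parts. The helper name @samp{split_system_type} is made up for this example and is not part of GMP or autoconf; the real parsing is done by @samp{./config.sub}, which also handles aliases that this sketch does not.

```shell
# Split a CPU-VENDOR-OS system type into its parts (illustrative helper,
# not part of GMP or autoconf). The OS part keeps any further dashes,
# because read assigns the remainder of the line to the last variable.
split_system_type () {
  IFS=- read -r cpu vendor os <<EOF
$1
EOF
  echo "cpu=$cpu vendor=$vendor os=$os"
}

split_system_type ultrasparc-sun-solaris2.7
split_system_type m68k-mac-linux-gnu
```

The last field is what libtool cares about when deciding how to generate shared libraries, so it is worth checking that it matches what @samp{./config.guess} reports.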
! 605:
! 606: @item Cross Compilation, @option{--host=CPU-VENDOR-OS}
! 607:
! 608: When cross-compiling, the system used for compiling is given by @samp{--build}
! 609: and the system where the library will run is given by @samp{--host}. For
! 610: example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,
! 611:
! 612: @example
! 613: ./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
! 614: @end example
! 615:
! 616: Compiler tools are sought first with the host system type as a prefix. For
! 617: example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
! 618: @command{ranlib}. This makes it possible for a set of cross-compiling tools
! 619: to co-exist with native tools. The prefix is the argument to @samp{--host},
! 620: and this can be an alias, such as @samp{m68k-linux}. But note that tools
! 621: don't have to be set up this way; it's enough to just have a @env{PATH} with a
! 622: suitable cross-compiling @command{cc} etc.
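
The lookup order described above (host-prefixed tool first, then the plain name) can be sketched in shell as follows. This is only an approximation of the idea, not the code @samp{./configure} actually runs, and @samp{find_tool} is a hypothetical name chosen for the example.

```shell
# Look for HOST-TOOL first, then plain TOOL, printing whichever is found.
# find_tool is an illustrative name, not something configure provides.
find_tool () {
  host=$1
  tool=$2
  if command -v "$host-$tool" >/dev/null 2>&1; then
    echo "$host-$tool"
  elif command -v "$tool" >/dev/null 2>&1; then
    echo "$tool"
  else
    return 1
  fi
}

find_tool m68k-mac-linux-gnu ranlib || echo "no ranlib found"
```

On a machine with no cross tools installed this falls through to the native tool, which is why a dedicated cross-compiling @env{PATH} is an equally valid setup.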
! 623:
! 624: Compiling for a different CPU in the same family as the build system is a form
! 625: of cross-compilation, though very possibly this would merely be special
! 626: options on a native compiler. In any case @samp{./configure} avoids depending
! 627: on being able to run code on the build system, which is important when
! 628: creating binaries for a newer CPU since they very possibly won't run on the
! 629: build system.
! 630:
! 631: In all cases the compiler must be able to produce an executable (of whatever
! 632: format) from a standard C @code{main}. Although only object files will go to
! 633: make up @file{libgmp}, @samp{./configure} uses linking tests for various
! 634: purposes, such as determining what functions are available on the host system.
! 635:
! 636: Currently a warning is given unless an explicit @samp{--build} is used when
! 637: cross-compiling, because it may not be possible to correctly guess the build
! 638: system type if the @env{PATH} has only a cross-compiling @command{cc}.
! 639:
! 640: Note that the @samp{--target} option is not appropriate for GMP. It's for use
! 641: when building compiler tools, with @samp{--host} being where they will run,
! 642: and @samp{--target} what they'll produce code for. Ordinary programs or
! 643: libraries like GMP are only interested in the @samp{--host} part, being where
! 644: they'll run. (Some past versions of GMP used @samp{--target} incorrectly.)
! 645:
! 646: @item CPU types
1.1 maekawa 647:
1.1.1.2 maekawa 648: In general, if you want a library that runs as fast as possible, you should
649: configure GMP for the exact CPU type your system uses. However, this may mean
650: the binaries won't run on older members of the family, and might run slower on
651: other members, older or newer. The best idea is always to build GMP for the
652: exact machine type you intend to run it on.
653:
1.1.1.4 ! ohara 654: The following CPUs have specific support. See @file{configure.in} for details
! 655: of what code and compiler options they select.
1.1 maekawa 656:
657: @itemize @bullet
658:
1.1.1.2 maekawa 659: @c Keep this formatting, it's easy to read and it can be grepped to
1.1.1.4 ! ohara 660: @c automatically test that CPUs listed get through ./config.sub
1.1 maekawa 661:
662: @item
1.1.1.2 maekawa 663: Alpha:
1.1.1.4 ! ohara 664: @nisamp{alpha},
! 665: @nisamp{alphaev5},
! 666: @nisamp{alphaev56},
! 667: @nisamp{alphapca56},
! 668: @nisamp{alphapca57},
! 669: @nisamp{alphaev6},
! 670: @nisamp{alphaev67},
! 671: @nisamp{alphaev68}
1.1.1.2 maekawa 672:
673: @item
1.1.1.4 ! ohara 674: Cray:
! 675: @nisamp{c90},
! 676: @nisamp{j90},
! 677: @nisamp{t90},
! 678: @nisamp{sv1}
1.1.1.2 maekawa 679:
680: @item
681: HPPA:
1.1.1.4 ! ohara 682: @nisamp{hppa1.0},
! 683: @nisamp{hppa1.1},
! 684: @nisamp{hppa2.0},
! 685: @nisamp{hppa2.0n},
! 686: @nisamp{hppa2.0w}
1.1.1.2 maekawa 687:
688: @item
689: MIPS:
1.1.1.4 ! ohara 690: @nisamp{mips},
! 691: @nisamp{mips3},
! 692: @nisamp{mips64}
1.1.1.2 maekawa 693:
694: @item
695: Motorola:
1.1.1.4 ! ohara 696: @nisamp{m68k},
! 697: @nisamp{m68000},
! 698: @nisamp{m68010},
! 699: @nisamp{m68020},
! 700: @nisamp{m68030},
! 701: @nisamp{m68040},
! 702: @nisamp{m68060},
! 703: @nisamp{m68302},
! 704: @nisamp{m68360},
! 705: @nisamp{m88k},
! 706: @nisamp{m88110}
1.1.1.2 maekawa 707:
708: @item
709: POWER:
1.1.1.4 ! ohara 710: @nisamp{power},
! 711: @nisamp{power1},
! 712: @nisamp{power2},
! 713: @nisamp{power2sc}
! 714:
! 715: @item
! 716: PowerPC:
! 717: @nisamp{powerpc},
! 718: @nisamp{powerpc64},
! 719: @nisamp{powerpc401},
! 720: @nisamp{powerpc403},
! 721: @nisamp{powerpc405},
! 722: @nisamp{powerpc505},
! 723: @nisamp{powerpc601},
! 724: @nisamp{powerpc602},
! 725: @nisamp{powerpc603},
! 726: @nisamp{powerpc603e},
! 727: @nisamp{powerpc604},
! 728: @nisamp{powerpc604e},
! 729: @nisamp{powerpc620},
! 730: @nisamp{powerpc630},
! 731: @nisamp{powerpc740},
! 732: @nisamp{powerpc7400},
! 733: @nisamp{powerpc7450},
! 734: @nisamp{powerpc750},
! 735: @nisamp{powerpc801},
! 736: @nisamp{powerpc821},
! 737: @nisamp{powerpc823},
! 738: @nisamp{powerpc860}
1.1.1.2 maekawa 739:
740: @item
741: SPARC:
1.1.1.4 ! ohara 742: @nisamp{sparc},
! 743: @nisamp{sparcv8},
! 744: @nisamp{microsparc},
! 745: @nisamp{supersparc},
! 746: @nisamp{sparcv9},
! 747: @nisamp{ultrasparc},
! 748: @nisamp{ultrasparc2},
! 749: @nisamp{ultrasparc2i},
! 750: @nisamp{ultrasparc3},
! 751: @nisamp{sparc64}
1.1.1.2 maekawa 752:
753: @item
754: 80x86 family:
1.1.1.4 ! ohara 755: @nisamp{i386},
! 756: @nisamp{i486},
! 757: @nisamp{i586},
! 758: @nisamp{pentium},
! 759: @nisamp{pentiummmx},
! 760: @nisamp{pentiumpro},
! 761: @nisamp{pentium2},
! 762: @nisamp{pentium3},
! 763: @nisamp{pentium4},
! 764: @nisamp{k6},
! 765: @nisamp{k62},
! 766: @nisamp{k63},
! 767: @nisamp{athlon}
1.1.1.2 maekawa 768:
769: @item
770: Other:
1.1.1.4 ! ohara 771: @nisamp{a29k},
! 772: @nisamp{arm},
! 773: @nisamp{clipper},
! 774: @nisamp{i960},
! 775: @nisamp{ns32k},
! 776: @nisamp{pyramid},
! 777: @nisamp{sh},
! 778: @nisamp{sh2},
! 779: @nisamp{vax},
! 780: @nisamp{z8k}
1.1.1.2 maekawa 781: @end itemize
1.1 maekawa 782:
1.1.1.4 ! ohara 783: CPUs not listed will use generic C code.
! 784:
! 785: @item Generic C Build
! 786:
! 787: If some of the assembly code causes problems, or if otherwise desired, the
! 788: generic C code can be selected with CPU @samp{none}. For example,
! 789:
! 790: @example
! 791: ./configure --host=none-unknown-freebsd3.5
! 792: @end example
! 793:
! 794: Note that this will run quite slowly, but it should be portable and should at
! 795: least make it possible to get something running if all else fails.
! 796:
! 797: @item @option{ABI}
! 798:
! 799: On some systems GMP supports multiple ABIs (application binary interfaces),
! 800: meaning data type sizes and calling conventions. By default GMP chooses the
! 801: best ABI available, but a particular ABI can be selected. For example
! 802:
! 803: @example
! 804: ./configure --host=mips64-sgi-irix6 ABI=n32
! 805: @end example
! 806:
! 807: See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
! 808: applications need to do.
1.1 maekawa 809:
1.1.1.2 maekawa 810: @item @option{CC}, @option{CFLAGS}
1.1 maekawa 811:
1.1.1.4 ! ohara 812: By default the C compiler used is chosen from among some likely candidates,
! 813: with @command{gcc} normally preferred if it's present. The usual
! 814: @samp{CC=whatever} can be passed to @samp{./configure} to choose something
! 815: different.
! 816:
! 817: For some systems, default compiler flags are set based on the CPU and
! 818: compiler. The usual @samp{CFLAGS="-whatever"} can be passed to
! 819: @samp{./configure} to use something different or to set good flags for systems
! 820: GMP doesn't otherwise know.
! 821:
! 822: The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
! 823: and can be found in each generated @file{Makefile}. This is the easiest way
! 824: to check the defaults when considering changing or adding something.
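
One way to pull those settings out of a generated @file{Makefile} is a simple search for the assignment lines. The sketch below assumes the usual automake layout, where they appear as @samp{CC = ...} and @samp{CFLAGS = ...} at the start of a line; @samp{show_compiler_settings} is a hypothetical helper name, and the stand-in file exists only so the snippet can be tried without a configured tree.

```shell
# Print the CC and CFLAGS assignments from a generated Makefile.
# show_compiler_settings is an illustrative helper name.
show_compiler_settings () {
  grep -E '^(CC|CFLAGS)[[:space:]]*=' "$1"
}

# Try it on a small stand-in file; after a real ./configure run, pass the
# generated Makefile instead.
sample=$(mktemp)
printf 'CC = gcc\nCFLAGS = -g -O2\nCPPFLAGS = \n' > "$sample"
show_compiler_settings "$sample"
rm -f "$sample"
```

The pattern is anchored at the start of the line so that related variables such as @samp{CPPFLAGS} are not reported by mistake.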
! 825:
! 826: Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
! 827: supporting multiple ABIs it's important to give an explicit
! 828: @samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
! 829: won't be able to select the correct assembler code.
! 830:
! 831: If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
! 832: compiler will be used (if GMP recognises it). For example @samp{CC=gcc} can
! 833: be used to force the use of GCC, with default flags (and default ABI).
! 834:
! 835: @item @option{CPPFLAGS}
! 836:
! 837: Any flags like @samp{-D} defines or @samp{-I} includes required by the
! 838: preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
! 839: Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
! 840: preprocessing uses just @samp{CPPFLAGS}. This distinction is because most
! 841: preprocessors won't accept all the flags the compiler does. Preprocessing is
! 842: done separately in some configure tests, and in the @samp{ansi2knr} support
! 843: for K&R compilers.
! 844:
! 845: @item C++ Support, @option{--enable-cxx}
! 846: C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
! 847: C++ compiler will be required. As a convenience @samp{--enable-cxx=detect}
! 848: can be used to enable C++ support only if a compiler can be found. The C++
! 849: support consists of a library @file{libgmpxx.la} and header file
! 850: @file{gmpxx.h}.
! 851:
! 852: A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
! 853: within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
! 854: bloated by a dependency on the C++ standard library, and to avoid any chance
! 855: that the C++ compiler could be required when linking plain C programs.
! 856:
! 857: @file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
! 858: only be expected to work with @file{libgmp.la} from the same GMP version.
! 859: Future changes to the relevant internals will be accompanied by renaming, so a
! 860: mismatch will cause unresolved symbols rather than perhaps mysterious
! 861: misbehaviour.
! 862:
! 863: In general @file{libgmpxx.la} will be usable only with the C++ compiler that
! 864: built it, since name mangling and runtime support are usually incompatible
! 865: between different compilers.
! 866:
! 867: @item @option{CXX}, @option{CXXFLAGS}
! 868: When C++ support is enabled, the C++ compiler and its flags can be set with
! 869: variables @samp{CXX} and @samp{CXXFLAGS} in the usual way. The default for
! 870: @samp{CXX} is the first compiler that works from a list of likely candidates,
! 871: with @command{g++} normally preferred when available. The default for
! 872: @samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then
! 873: for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers
! 874: @samp{-g} or nothing. Trying @samp{CFLAGS} this way is convenient when using
! 875: @samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will
! 876: usually suit @samp{g++}.
! 877:
! 878: It's important that the C and C++ compilers match, meaning their startup and
! 879: runtime support routines are compatible and that they generate code in the
! 880: same ABI (if there's a choice of ABIs on the system). @samp{./configure}
! 881: isn't currently able to check these things very well itself, so for that
! 882: reason @samp{--disable-cxx} is the default, to avoid a build failure due to a
! 883: compiler mismatch. Perhaps this will change in the future.
! 884:
! 885: Incidentally, it's normally not good enough to set @samp{CXX} to the same as
! 886: @samp{CC}. Although @command{gcc} for instance recognises @file{foo.cc} as
! 887: C++ code, only @command{g++} will invoke the linker the right way when
! 888: building an executable or shared library from object files.
1.1.1.2 maekawa 889:
1.1.1.4 ! ohara 890: @item Temporary Memory, @option{--enable-alloca=<choice>}
1.1.1.2 maekawa 891: @cindex Stack overflow segfaults
892: @cindex @code{alloca}
893:
1.1.1.4 ! ohara 894: GMP allocates temporary workspace using one of the following three methods,
! 895: which can be selected with, for instance,
! 896: @samp{--enable-alloca=malloc-reentrant}.
! 897:
! 898: @itemize @bullet
! 899: @item
! 900: @samp{alloca} - C library or compiler builtin.
! 901: @item
! 902: @samp{malloc-reentrant} - the heap, in a re-entrant fashion.
! 903: @item
! 904: @samp{malloc-notreentrant} - the heap, with global variables.
! 905: @end itemize
! 906:
! 907: For convenience, the following choices are also available.
! 908: @samp{--disable-alloca} is the same as @samp{--enable-alloca=no}.
! 909:
! 910: @itemize @bullet
! 911: @item
! 912: @samp{yes} - a synonym for @samp{alloca}.
! 913: @item
! 914: @samp{no} - a synonym for @samp{malloc-reentrant}.
! 915: @item
! 916: @samp{reentrant} - @code{alloca} if available, otherwise
! 917: @samp{malloc-reentrant}. This is the default.
! 918: @item
! 919: @samp{notreentrant} - @code{alloca} if available, otherwise
! 920: @samp{malloc-notreentrant}.
! 921: @end itemize
! 922:
! 923: @code{alloca} is reentrant and fast, and is recommended, but when working with
! 924: large numbers it can overflow the available stack space, in which case one of
! 925: the two malloc methods will need to be used. Alternatively it might be possible
! 926: to increase available stack with @command{limit}, @command{ulimit} or
! 927: @code{setrlimit}, or under DJGPP with @command{stubedit} or
! 928: @code{@w{_stklen}}. Note that depending on the system the only indication of
! 929: stack overflow might be a segmentation violation.
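
As a rough illustration of the @command{ulimit} route mentioned above, the
following shell sketch reports the soft stack limit and tries to raise it to
the hard limit before running a program. The function names
@code{stack_limit} and @code{with_max_stack} are hypothetical and not part of
GMP, and the sketch assumes a shell whose @command{ulimit} accepts @samp{-s}
(as @command{bash}, @command{ksh} and @command{dash} do).

```shell
# Print the current soft stack limit, in kilobytes or "unlimited".
stack_limit() {
    ulimit -S -s
}

# Run a command with the soft stack limit raised to the hard limit.
# The subshell keeps the change from affecting the calling shell.
with_max_stack() {
    (
        ulimit -S -s "$(ulimit -H -s)" 2>/dev/null ||
            echo "warning: could not raise stack limit" >&2
        "$@"
    )
}
```

A program linked with GMP could then be started as @samp{with_max_stack
./myprog}. If the hard limit is itself too small, this will not help and one
of the malloc methods is still the answer.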
! 930:
! 931: @samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
! 932: but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
! 933: not required.
! 934:
! 935: The two malloc methods in fact use the memory allocation functions selected by
! 936: @code{mp_set_memory_functions}, these being @code{malloc} and friends by
! 937: default. @xref{Custom Allocation}.
! 938:
! 939: An additional choice @samp{--enable-alloca=debug} is available, to help when
! 940: debugging memory related problems (@pxref{Debugging}).
! 941:
! 942: @item FFT Multiplication, @option{--disable-fft}
! 943:
! 944: By default multiplications are done using Karatsuba, 3-way Toom-Cook, and
! 945: Fermat FFT. The FFT is only used on large to very large operands and can be
! 946: disabled to save code size if desired.
1.1.1.2 maekawa 947:
1.1.1.4 ! ohara 948: @item Berkeley MP, @option{--enable-mpbsd}
1.1.1.2 maekawa 949:
1.1.1.4 ! ohara 950: The Berkeley MP compatibility library (@file{libmp}) and header file
1.1.1.2 maekawa 951: (@file{mp.h}) are built and installed only if @option{--enable-mpbsd} is used.
952: @xref{BSD Compatible Functions}.
953:
1.1.1.4 ! ohara 954: @item MPFR, @option{--enable-mpfr}
! 955: @cindex MPFR
! 956:
! 957: The optional MPFR functions are built and installed only if
! 958: @option{--enable-mpfr} is used. These are in a separate library
! 959: @file{libmpfr.a} and are documented separately too (@pxref{Introduction to
! 960: MPFR,, Introduction to MPFR, mpfr, MPFR}).
! 961:
! 962: @item Assertion Checking, @option{--enable-assert}
! 963:
! 964: This option enables some consistency checking within the library. This can be
! 965: of use while debugging, see @ref{Debugging}.
! 966:
! 967: @item Execution Profiling, @option{--enable-profiling=prof/gprof}
! 968:
! 969: Profiling support can be enabled either for @command{prof} or @command{gprof}.
! 970: This adds @samp{-p} or @samp{-pg} respectively to @samp{CFLAGS}, and for some
! 971: systems adds corresponding @code{mcount} calls to the assembler code.
! 972: @xref{Profiling}.
! 973:
1.1.1.2 maekawa 974: @item @option{MPN_PATH}
975:
1.1.1.4 ! ohara 976: Various assembler versions of each mpn subroutine are provided. For a given
! 977: CPU, a search is made through a path to choose a version of each. For example
! 978: @samp{sparcv8} has
1.1.1.2 maekawa 979:
1.1.1.4 ! ohara 980: @example
! 981: MPN_PATH="sparc32/v8 sparc32 generic"
! 982: @end example
1.1.1.2 maekawa 983:
1.1.1.4 ! ohara 984: which means look first for v8 code, then plain sparc32 (which is v7), and
! 985: finally fall back on generic C. Knowledgeable users with special requirements
! 986: can specify a different path. Normally this is completely unnecessary.
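
The search described above can be pictured with a small shell function. This
is only a sketch of the idea, not the logic @samp{./configure} actually uses:
the name @code{find_mpn_source}, its arguments, and the choice of trying
@file{.asm} before @file{.c} are all assumptions made for illustration.

```shell
# find_mpn_source SRCDIR NAME DIR...
# Print the first DIR/NAME.EXT that exists under SRCDIR, trying the
# directories in the order given, and return non-zero if none has it.
find_mpn_source() {
    srcdir=$1
    name=$2
    shift 2
    for dir in "$@"; do
        for ext in asm c; do    # assumed extensions: assembler, then C
            if [ -f "$srcdir/$dir/$name.$ext" ]; then
                echo "$dir/$name.$ext"
                return 0
            fi
        done
    done
    return 1
}
```

With the @samp{sparcv8} path, @samp{find_mpn_source mpn mul_1 sparc32/v8
sparc32 generic} would print the v8 file if one exists, otherwise the plain
sparc32 one, otherwise the generic C version. Putting an earlier directory
later in the list, or dropping it, is what specifying a different
@samp{MPN_PATH} amounts to.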
1.1.1.2 maekawa 987:
988: @item Documentation
989:
990: The document you're now reading is @file{gmp.texi}. The usual automake
1.1.1.4 ! ohara 991: targets are available to make PostScript @file{gmp.ps} and/or DVI
! 992: @file{gmp.dvi}.
! 993:
! 994: HTML can be produced with @samp{makeinfo --html}, see @ref{makeinfo
! 995: html,Generating HTML,Generating HTML,texinfo,Texinfo}. Or alternately
! 996: @samp{texi2html}, see @ref{Top,Texinfo to HTML,About,texi2html,Texinfo To
! 997: HTML}.
! 998:
! 999: PDF can be produced with @samp{texi2dvi --pdf} (@pxref{PDF
! 1000: Output,PDF,,texinfo,Texinfo}) or with @samp{pdftex}.
! 1001:
! 1002: Some supplementary notes can be found in the @file{doc} subdirectory.
! 1003:
1.1.1.2 maekawa 1004: @end table
1.1 maekawa 1005:
1006:
1.1.1.2 maekawa 1007: @need 2000
1008: @node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
1009: @section ABI and ISA
1010: @cindex ABI
1.1.1.4 ! ohara 1011: @cindex Application Binary Interface
1.1.1.2 maekawa 1012: @cindex ISA
1.1.1.4 ! ohara 1013: @cindex Instruction Set Architecture
1.1.1.2 maekawa 1014:
1015: ABI (Application Binary Interface) refers to the calling conventions between
1016: functions, meaning what registers are used and what sizes the various C data
1017: types are. ISA (Instruction Set Architecture) refers to the instructions and
1018: registers a CPU has available.
1019:
1020: Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
1.1.1.4 ! ohara 1021: latter for compatibility with older CPUs in the family. GMP supports some
! 1022: CPUs like this in both ABIs. In fact within GMP @samp{ABI} means a
! 1023: combination of chip ABI, plus how GMP chooses to use it. For example in some
! 1024: 32-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
! 1025: @code{long long}.
! 1026:
! 1027: By default GMP chooses the best ABI available for a given system, and this
! 1028: generally gives significantly greater speed. But an ABI can be chosen
! 1029: explicitly to make GMP compatible with other libraries, or particular
! 1030: application requirements. For example,
! 1031:
! 1032: @example
! 1033: ./configure ABI=32
! 1034: @end example
1.1.1.2 maekawa 1035:
1.1.1.4 ! ohara 1036: In all cases it's vital that all object code used in a given program is
! 1037: compiled for the same ABI.
! 1038:
! 1039: Usually a limb is implemented as a @code{long}. When a @code{long long} limb
! 1040: is used this is encoded in the generated @file{gmp.h}. This is convenient for
! 1041: applications, but it does mean that @file{gmp.h} will vary, and can't be just
! 1042: copied around. @file{gmp.h} remains compiler independent though, since all
! 1043: compilers for a particular ABI will be expected to use the same limb type.
! 1044:
! 1045: Currently no attempt is made to follow whatever conventions a system has for
! 1046: installing library or header files built for a particular ABI. This will
! 1047: probably only matter when installing multiple builds of GMP, and it might be
! 1048: as simple as configuring with a special @samp{libdir}, or it might require
! 1049: more than that. Note that builds for different ABIs need to be done separately,
! 1050: with a fresh @command{./configure} and @command{make} each.
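
One possible way to organise such separate builds is sketched below. The
function @code{abi_build_commands} is hypothetical; it only prints the
commands rather than running them. The directory layout
(@file{build-@var{abi}}, @file{lib/@var{abi}}, @file{include/@var{abi}}) is
an assumption, not a convention GMP or any particular system defines, and
the sketch also assumes that building in a directory separate from the
sources works on the system in question.

```shell
# abi_build_commands SRCDIR PREFIX ABI...
# Print (without running) one configure/make sequence per ABI.  Each
# ABI gets its own build directory, libdir and includedir, the latter
# because the generated gmp.h differs between ABIs.
abi_build_commands() {
    srcdir=$1
    prefix=$2
    shift 2
    for abi in "$@"; do
        echo "mkdir build-$abi && cd build-$abi &&" \
             "$srcdir/configure ABI=$abi --prefix=$prefix" \
             "--libdir=$prefix/lib/$abi" \
             "--includedir=$prefix/include/$abi &&" \
             "make && make check && cd .."
    done
}
```

For instance @samp{abi_build_commands .. /usr/local 32 64} prints two
command lines, one per ABI, which can be reviewed and then run from the top
of the source tree. Applications then need matching @samp{-I} and @samp{-L}
options for the ABI they are compiled for.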
1.1.1.2 maekawa 1051:
1052: @table @asis
1.1.1.4 ! ohara 1053: @sp 1
1.1.1.2 maekawa 1054: @need 1000
1.1.1.4 ! ohara 1055: @item HPPA 2.0 (@samp{hppa2.0*})
1.1.1.2 maekawa 1056:
1.1.1.4 ! ohara 1057: @table @asis
! 1058: @item @samp{ABI=2.0w}
1.1.1.2 maekawa 1059:
1.1.1.4 ! ohara 1060: The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or up
! 1061: when using @command{cc}. @command{gcc} support for this is in progress.
! 1062: Applications must be compiled with
1.1 maekawa 1063:
1.1.1.2 maekawa 1064: @example
1.1.1.4 ! ohara 1065: cc +DD64
1.1.1.2 maekawa 1066: @end example
1.1 maekawa 1067:
1.1.1.4 ! ohara 1068: @item @samp{ABI=2.0n}
1.1 maekawa 1069:
1.1.1.4 ! ohara 1070: The 2.0n ABI means the 32-bit HPPA 1.0 ABI but with a 64-bit limb using
! 1071: @code{long long}. This is available on HP-UX 10 or up when using
! 1072: @command{cc}. No @command{gcc} support is planned for this. Applications
! 1073: must be compiled with
1.1 maekawa 1074:
1.1.1.2 maekawa 1075: @example
1.1.1.4 ! ohara 1076: cc +DA2.0 +e
1.1.1.2 maekawa 1077: @end example
1.1 maekawa 1078:
1.1.1.4 ! ohara 1079: @item @samp{ABI=1.0}
! 1080:
! 1081: HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI.
! 1082: No special compiler options are needed for applications.
! 1083: @end table
! 1084:
! 1085: All three ABIs are available for CPUs @samp{hppa2.0w} and @samp{hppa2.0}, but
! 1086: for CPU @samp{hppa2.0n} only 2.0n or 1.0 are allowed.
! 1087:
! 1088: @sp 1
1.1.1.2 maekawa 1089: @need 1000
1.1.1.4 ! ohara 1090: @item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]})
! 1091:
! 1092: IRIX 6 supports the n32 and 64 ABIs and always has a 64-bit MIPS 3 or better
! 1093: CPU. In both these ABIs GMP uses a 64-bit limb. A new enough @command{gcc}
! 1094: is required (2.95 for instance).
1.1 maekawa 1095:
1.1.1.4 ! ohara 1096: @table @asis
! 1097: @item @samp{ABI=n32}
! 1098:
! 1099: The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
! 1100: @code{long long}. Applications must be compiled with
1.1 maekawa 1101:
1.1.1.2 maekawa 1102: @example
1103: gcc -mabi=n32
1104: cc -n32
1105: @end example
1.1 maekawa 1106:
1.1.1.4 ! ohara 1107: @item @samp{ABI=64}
! 1108:
! 1109: The 64-bit ABI is 64-bit pointers and integers. Applications must be compiled
! 1110: with
! 1111:
! 1112: @example
! 1113: gcc -mabi=64
! 1114: cc -64
! 1115: @end example
! 1116: @end table
! 1117:
! 1118: Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
! 1119: support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.
! 1120:
! 1121: @sp 1
1.1.1.2 maekawa 1122: @need 1000
1.1.1.4 ! ohara 1123: @item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630})
! 1124:
! 1125: @table @asis
! 1126: @item @samp{ABI=aix64}
1.1 maekawa 1127:
1.1.1.4 ! ohara 1128: The AIX 64 ABI uses 64-bit limbs and pointers and is available on systems
! 1129: @samp{*-*-aix*}. Applications must be compiled (and linked) with
1.1 maekawa 1130:
1.1.1.2 maekawa 1131: @example
1132: gcc -maix64
1133: xlc -q64
1134: @end example
1.1 maekawa 1135:
1.1.1.4 ! ohara 1136: @item @samp{ABI=32}
1.1 maekawa 1137:
1.1.1.4 ! ohara 1138: This is the basic 32-bit PowerPC ABI. No special compiler options are needed
! 1139: for applications.
! 1140: @end table
1.1.1.2 maekawa 1141:
1.1.1.4 ! ohara 1142: @sp 1
1.1.1.2 maekawa 1143: @need 1000
1.1.1.4 ! ohara 1144: @item Sparc V9 (@samp{sparcv9} and @samp{ultrasparc*})
! 1145:
! 1146: @table @asis
! 1147: @item @samp{ABI=64}
1.1 maekawa 1148:
1.1.1.4 ! ohara 1149: The 64-bit V9 ABI is available on Solaris 2.7 and up and GNU/Linux. GCC 2.95
! 1150: or up, or Sun @command{cc} is required. Applications must be compiled with
1.1.1.2 maekawa 1151:
1.1.1.4 ! ohara 1152: @example
! 1153: gcc -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
! 1154: cc -xarch=v9
! 1155: @end example
! 1156:
! 1157: @item @samp{ABI=32}
! 1158:
! 1159: On Solaris 2.6 and earlier, and on Solaris 2.7 with the kernel in 32-bit mode,
! 1160: only the plain V8 32-bit ABI can be used, since the kernel doesn't save all
! 1161: registers. GMP still uses as much of the V9 ISA as it can in these
! 1162: circumstances. No special compiler options are required for applications,
! 1163: though using something like the following requesting V9 code within the V8 ABI
! 1164: is recommended.
1.1 maekawa 1165:
1166: @example
1.1.1.2 maekawa 1167: gcc -mv8plus
1168: cc -xarch=v8plus
1.1 maekawa 1169: @end example
1170:
1.1.1.4 ! ohara 1171: @command{gcc} 2.8 and earlier only supports @samp{-mv8} though.
! 1172: @end table
! 1173:
! 1174: Don't be confused by the names of these sparc @samp{-m} and @samp{-x} options;
! 1175: they're called @samp{arch} but they effectively control the ABI.
! 1176:
! 1177: On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
! 1178: reject @samp{ABI=64} because the resulting executables won't run.
! 1179: @samp{ABI=64} can still be built if desired by making it look like a
! 1180: cross-compile, for example
1.1 maekawa 1181:
1.1.1.2 maekawa 1182: @example
1.1.1.4 ! ohara 1183: ./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
1.1.1.2 maekawa 1184: @end example
1185: @end table
1186:
1187:
1188: @need 2000
1189: @node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
1190: @section Notes for Package Builds
1191: @cindex Build notes for binary packaging
1192: @cindex Packaged builds
1193:
1194: GMP should present no great difficulties for packaging in a binary
1195: distribution.
1196:
1197: @cindex Libtool versioning
1.1.1.4 ! ohara 1198: @cindex Shared library versioning
1.1.1.2 maekawa 1199: Libtool is used to build the library and @samp{-version-info} is set
1.1.1.4 ! ohara 1200: appropriately, having started from @samp{3:0:0} in GMP 3.0. Each release in
! 1201: the GMP 4 series will be upwardly binary compatible with earlier GMP 4
! 1202: releases and with all of the GMP 3 series. Additional function interfaces may
! 1203: be added in each release, so on systems where libtool versioning is not fully
! 1204: checked by the loader an auxiliary mechanism may be needed to express that a
! 1205: dynamically linked application depends on a new enough GMP.
! 1206:
! 1207: An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
! 1208: (from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
! 1209: from the same GMP version, since this is not done by the libtool versioning,
! 1210: nor otherwise. A mismatch will result in unresolved symbols from the linker,
! 1211: or perhaps the loader.
! 1212:
! 1213: Using @samp{DESTDIR} or a @samp{prefix} override with @samp{make install} and
! 1214: a shared @file{libgmpxx} may run into a libtool relinking problem, see
! 1215: @ref{Known Build Problems}.
1.1.1.2 maekawa 1216:
1217: When building a package for a CPU family, care should be taken to use
1.1.1.4 ! ohara 1218: @samp{--host} (or @samp{--build}) to choose the least common denominator among
! 1219: the CPUs which might use the package. For example this might necessitate
! 1220: @samp{i386} for x86s, or plain @samp{sparc} (meaning V7) for SPARCs.
1.1.1.2 maekawa 1221:
1222: Users who care about speed will want GMP built for their exact CPU type, to
1223: make use of the available optimizations. Providing a way to suitably rebuild
1224: a package may be useful. This could be as simple as making it possible for a
1.1.1.4 ! ohara 1225: user to omit @samp{--build} (and @samp{--host}) so @samp{./config.guess} will
! 1226: detect the CPU. But a way to manually specify a @samp{--build} will be wanted
! 1227: for systems where @samp{./config.guess} is inexact.
! 1228:
! 1229: Note that @file{gmp.h} is a generated file, and will be architecture and ABI
! 1230: dependent.
1.1.1.2 maekawa 1231:
1232:
1233: @need 2000
1234: @node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
1235: @section Notes for Particular Systems
1236: @cindex Build notes for particular systems
1.1.1.4 ! ohara 1237: @cindex Particular systems
! 1238: @cindex Systems
1.1.1.2 maekawa 1239: @table @asis
1240:
1241: @c This section is more or less meant for notes about performance or about
1242: @c build problems that have been worked around but might leave a user
1243: @c scratching their head. Fun with different ABIs on a system belongs in the
1244: @c above section.
1245:
1.1.1.4 ! ohara 1246: @item AIX 3 and 4
! 1247:
! 1248: On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since
! 1249: some versions of the native @command{ar} fail on the convenience libraries
! 1250: used. A shared build can be attempted with
! 1251:
! 1252: @example
! 1253: ./configure --enable-shared --disable-static
! 1254: @end example
! 1255:
! 1256: Note that the @samp{--disable-static} is necessary because in a shared build
! 1257: libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
! 1258: the benefit of old versions of @command{ld} which only recognise @file{.a},
! 1259: but unfortunately this is done even if a fully functional @command{ld} is
! 1260: available.
! 1261:
! 1262: @item ARM
! 1263:
! 1264: On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a
! 1265: bug in unsigned division, giving wrong results for some operands. GMP
! 1266: @samp{./configure} will demand GCC 2.95.4 or later.
! 1267:
! 1268: @item Compaq C++
! 1269: Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and
! 1270: an old pre-standard one (see @samp{man iostream_intro}). GMP can only use the
! 1271: standard one, which unfortunately is not the default but must be selected by
! 1272: defining @code{__USE_STD_IOSTREAM}. Configure with for instance
! 1273:
! 1274: @example
! 1275: ./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM
! 1276: @end example
! 1277:
! 1278: @item Microsoft Windows
! 1279: On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by
! 1280: default GMP builds only a static library, but a DLL can be built instead using
! 1281:
! 1282: @example
! 1283: ./configure --disable-static --enable-shared
! 1284: @end example
! 1285:
! 1286: Static and DLL libraries can't both be built, since certain export directives
! 1287: in @file{gmp.h} must be different. @samp{--enable-cxx} cannot be used when
! 1288: building a DLL, since libtool doesn't currently support C++ DLLs. This might
! 1289: change in the future.
! 1290:
! 1291: @item Microsoft C
! 1292: A MINGW DLL build of GMP can be used with Microsoft C. Libtool doesn't
! 1293: install @file{.lib} and @file{.exp} files, but they can be created with the
! 1294: following commands, where @file{/my/inst/dir} is the install directory (with a
! 1295: @file{lib} subdirectory).
! 1296:
! 1297: @example
! 1298: lib /machine:IX86 /def:_libs/libgmp-3.dll-def
! 1299: cp libgmp-3.lib /my/inst/dir/lib
! 1300: cp _libs/libgmp-3.dll-exp /my/inst/dir/lib/libgmp-3.exp
! 1301: @end example
1.1.1.2 maekawa 1302:
1.1.1.4 ! ohara 1303: MINGW uses @samp{msvcrt.dll} for I/O, so applications wanting to use the GMP
! 1304: I/O routines must be compiled with @samp{cl /MD} to do the same. If one of
! 1305: the other I/O choices provided by MS C is desired then the suggestion is to
! 1306: use the GMP string functions and confine I/O to the application.
! 1307:
! 1308: @item Motorola 68k CPU Types
! 1309:
! 1310: @samp{m68k} is taken to mean 68000. @samp{m68020} or higher will give a
! 1311: performance boost on applicable CPUs. @samp{m68360} can be used for CPU32
! 1312: series chips. @samp{m68302} can be used for ``Dragonball'' series chips,
! 1313: though this is merely a synonym for @samp{m68000}.
1.1.1.2 maekawa 1314:
1315: @item OpenBSD 2.6
1316:
1317: @command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
1318: unsuitable for @file{.asm} file processing. @samp{./configure} will detect
1319: the problem and either abort or choose another m4 in the @env{PATH}. The bug
1320: is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.
1321:
1.1.1.4 ! ohara 1322: @item Power CPU Types
1.1.1.2 maekawa 1323:
1.1.1.4 ! ohara 1324: In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions
! 1325: not available on the other, so it's important to choose the right one for the
! 1326: CPU that will be used. Currently GMP has no assembler code support for using
! 1327: just the common instruction subset. To get executables that run on both, the
! 1328: current suggestion is to use the generic C code (CPU @samp{none}), possibly
! 1329: with appropriate compiler options (like @samp{-mcpu=common} for
! 1330: @command{gcc}). CPU @samp{rs6000} (which is not a CPU but a family of
! 1331: workstations) is accepted by @file{config.sub}, but is currently equivalent to
! 1332: @samp{none}.
! 1333:
! 1334: @item Sparc CPU Types
! 1335:
! 1336: @samp{sparcv8} or @samp{supersparc} on relevant systems will give a
! 1337: significant performance increase over the V7 code.
! 1338:
! 1339: @item Sparc App Regs
! 1340: @cindex Sparc
! 1341: The GMP assembler code for both 32-bit and 64-bit Sparc clobbers the
! 1342: ``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way
! 1343: that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,,, gcc,
! 1344: Using the GNU Compiler Collection (GCC)}).
! 1345:
! 1346: This makes that code unsuitable for use with the special V9
! 1347: @samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer),
! 1348: and for applications wanting to use those registers for special purposes. In
! 1349: these cases the only suggestion currently is to build GMP with CPU @samp{none}
! 1350: to avoid the assembler code.
1.1.1.2 maekawa 1351:
1352: @item SunOS 4
1353:
1354: @command{/usr/bin/m4} lacks various features needed to process @file{.asm}
1355: files, and instead @samp{./configure} will automatically use
1356: @command{/usr/5bin/m4}, which we believe is always available (if not then use
1357: GNU m4).
1358:
1.1.1.4 ! ohara 1359: @item x86 CPU Types
1.1.1.2 maekawa 1360:
1.1.1.4 ! ohara 1361: @samp{i386} selects generic code which will run reasonably well on all x86
! 1362: chips.
1.1.1.2 maekawa 1363:
1.1.1.4 ! ohara 1364: @samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for the intended
! 1365: P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
! 1366: P-III)@. @samp{i386} is a better choice when making binaries that must run on
! 1367: both.
! 1368:
! 1369: @samp{pentium4} and an SSE2 capable assembler are important for best results
! 1370: on Pentium 4. The specific code is for instance roughly a 2@cross{} to
! 1371: 3@cross{} speedup over the generic @samp{i386} code.
! 1372:
! 1373: @item x86 MMX and SSE2 Code
! 1374:
! 1375: If the CPU selected has MMX code but the assembler doesn't support it, a
! 1376: warning is given and non-MMX code is used instead. This will be an inferior
! 1377: build, since the MMX code that's present is there because it's faster than the
! 1378: corresponding plain integer code. The same applies to SSE2.
! 1379:
! 1380: Old versions of @samp{gas} don't support MMX instructions, in particular
! 1381: version 1.92.3 that comes with FreeBSD 2.2.8 doesn't (and unfortunately
! 1382: there's no newer assembler for that system).
! 1383:
! 1384: Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
! 1385: to register @code{movq} instructions, and so can't be used for MMX code.
! 1386: Install a recent @command{gas} if MMX code is wanted on these systems.
! 1387: @end table
1.1.1.2 maekawa 1388:
1389:
1390: @need 2000
1391: @node Known Build Problems, , Notes for Particular Systems, Installing GMP
1392: @section Known Build Problems
1393: @cindex Build problems known
1394:
1395: @c This section is more or less meant for known build problems that are not
1396: @c otherwise worked around and require some sort of manual intervention.
1397:
1.1.1.4 ! ohara 1398: You might find more up-to-date information at @uref{http://swox.com/gmp/}.
1.1.1.2 maekawa 1399:
1400: @table @asis
1.1.1.4 ! ohara 1401: @item Compiler link options
! 1402: The version of libtool currently in use rather aggressively strips compiler
! 1403: options when linking a shared library. This will hopefully be relaxed in the
! 1404: future, but for now if this is a problem the suggestion is to create a little
! 1405: script to hide them, and for instance configure with
! 1406:
! 1407: @example
! 1408: ./configure CC=gcc-with-my-options
! 1409: @end example
! 1410:
! 1411: @item DJGPP
! 1412: The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
! 1413: script; it exits silently, having died writing a preamble to
! 1414: @file{config.log}. Use @command{bash} 2.04 or higher.
! 1415:
! 1416: @samp{make all} was found to run out of memory during the final
! 1417: @file{libgmp.la} link on one system tested, despite having 64 MB available. A
! 1418: separate @samp{make libgmp.la} helped; perhaps recursing into the various
! 1419: subdirectories uses up memory.
! 1420:
! 1421: @item @samp{DESTDIR} and shared @file{libgmpxx}
! 1422: @cindex @samp{DESTDIR}
! 1423: @samp{make install DESTDIR=/my/staging/area}, or the same with a @samp{prefix}
! 1424: override, to install to a temporary directory is not fully supported by
! 1425: current versions of libtool when building a shared version of a library which
! 1426: depends on another being built at the same time, like @file{libgmpxx} and
! 1427: @file{libgmp}.
! 1428:
! 1429: The problem is that @file{libgmpxx} is relinked at the install stage to ensure
! 1430: that if the system puts a hard-coded path to @file{libgmp} within
! 1431: @file{libgmpxx} then that path will be correct. Naturally the linker is
! 1432: directed to look only at the final location, not the staging area, so if
! 1433: @file{libgmp} is not already in that final location then the link will fail.
! 1434:
! 1435: A workaround for this on SVR4 style systems, such as GNU/Linux, where paths
! 1436: are not hard-coded, is to include the staging area in the linker's search
! 1437: using @code{LD_LIBRARY_PATH}. For example with @samp{--prefix=/usr} but
! 1438: installing under @samp{/my/staging/area},
1.1.1.2 maekawa 1439:
1.1.1.4 ! ohara 1440: @example
! 1441: LD_LIBRARY_PATH=/my/staging/area/usr/lib \
! 1442: make install DESTDIR=/my/staging/area
! 1443: @end example
! 1444:
! 1445: @item GNU binutils @command{strip} prior to 2.12
! 1446: @cindex Stripped libraries
! 1447:
! 1448: @command{strip} from GNU binutils 2.11 and earlier should not be used on the
! 1449: static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all
! 1450: but the last of multiple archive members with the same name, like the three
! 1451: versions of @file{init.o} in @file{libgmp.a}. Binutils 2.12 or higher can be
! 1452: used successfully.
! 1453:
! 1454: The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by
! 1455: this and any version of @command{strip} can be used on them.
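
To see whether a given archive has same-named members before stripping it
with an old @command{strip}, the member list from @samp{ar t} can be checked
for repeats. The helper below is a minimal sketch; the name
@code{duplicate_members} is hypothetical, and it simply reads member names on
standard input.

```shell
# Read archive member names on standard input, one per line, and print
# each name that occurs more than once.
duplicate_members() {
    sort | uniq -d
}
```

Running @samp{ar t libgmp.a | duplicate_members} and getting any output
means the archive contains repeated member names, and so should not be
stripped with binutils 2.11 or earlier.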
! 1456:
! 1457: @item @command{make} syntax error
! 1458:
! 1459: On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make}
! 1460: is unable to handle the long dependencies list for @file{libgmp.la}. The
! 1461: symptom is a ``syntax error'' on the following line of the top-level
! 1462: @file{Makefile}.
! 1463:
! 1464: @example
! 1465: libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES)
! 1466: @end example
1.1.1.2 maekawa 1467:
1.1.1.4 ! ohara 1468: Either use GNU Make, or as a workaround remove
! 1469: @code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial
! 1470: build work, but if any recompiling is done @file{libgmp.la} might not be
! 1471: rebuilt).
! 1472:
! 1473: @item MacOS X and GCC
! 1474: Libtool currently only knows how to create shared libraries on MacOS X using
! 1475: the native @command{cc} (which is a modified GCC), not a plain GCC. A
! 1476: static-only build should work though (@samp{--disable-shared}).
! 1477:
! 1478: Also, libtool currently cannot build C++ shared libraries on MacOS X, so if
! 1479: @samp{--enable-cxx} is desired then @samp{--disable-shared} must be used.
! 1480: Hopefully this will be fixed in the future.
1.1.1.2 maekawa 1481:
1482: @item NeXT prior to 3.3
1483:
1484: The system compiler on old versions of NeXT was a massacred and old GCC, even
1485: if it called itself @file{cc}. This compiler cannot be used to build GMP, you
1.1.1.4 ! ohara 1486: need to get a real GCC, and install that. (NeXT may have fixed this in
! 1487: release 3.3 of their system.)
1.1.1.2 maekawa 1488:
1489: @item POWER and PowerPC
1490:
1491: Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
1492: PowerPC. If you want to use GCC for these machines, get GCC 2.7.2.1 (or
1493: later).
1494:
1495: @item Sequent Symmetry
1496:
1497: Use the GNU assembler instead of the system assembler, since the latter has
1498: serious bugs.
1499:
1.1.1.4 ! ohara 1500: @item Solaris 2.6
1.1.1.2 maekawa 1501:
1.1.1.4 ! ohara 1502: The system @command{sed} prints an error ``Output line too long'' when libtool
! 1503: builds @file{libgmp.la}. This doesn't seem to cause any obvious ill effects,
! 1504: but GNU @command{sed} is recommended, to avoid any doubt.
! 1505:
! 1506: @item Sparc Solaris 2.7 with gcc 2.95.2 in ABI=32
! 1507:
! 1508: A shared library build of GMP seems to fail in this combination; it builds but
! 1509: then fails the tests, apparently due to some incorrect data relocations within
! 1510: @code{gmp_randinit_lc_2exp_size}. The exact cause is unknown, so
! 1511: @samp{--disable-shared} is recommended.
! 1512:
! 1513: @item Windows DLL test programs
! 1514:
! 1515: When creating a DLL version of @file{libgmp}, libtool creates wrapper scripts
! 1516: like @file{t-mul} for programs that would normally be @file{t-mul.exe}, in
! 1517: order to set up the right library paths etc. This works fine, but the absence
! 1518: of @file{t-mul.exe} etc causes @command{make} to think they need recompiling
! 1519: every time, which is an annoyance when re-running a @samp{make check}.
1.1.1.2 maekawa 1520: @end table
1521:
1522:
1523: @node GMP Basics, Reporting Bugs, Installing GMP, Top
1524: @comment node-name, next, previous, up
1525: @chapter GMP Basics
1526: @cindex Basics
1.1 maekawa 1527:
1.1.1.2 maekawa 1528: @strong{Using functions, macros, data types, etc.@: not documented in this
1529: manual is strongly discouraged. If you do so your application is guaranteed
1530: to be incompatible with future versions of GMP.}
1531:
1532: @menu
1.1.1.4 ! ohara 1533: * Headers and Libraries::
! 1534: * Nomenclature and Types::
! 1535: * Function Classes::
! 1536: * Variable Conventions::
! 1537: * Parameter Conventions::
! 1538: * Memory Management::
! 1539: * Reentrancy::
! 1540: * Useful Macros and Constants::
! 1541: * Compatibility with older versions::
! 1542: * Demonstration Programs::
! 1543: * Efficiency::
! 1544: * Debugging::
! 1545: * Profiling::
! 1546: * Autoconf::
! 1547: * Emacs::
1.1.1.2 maekawa 1548: @end menu
1.1 maekawa 1549:
1.1.1.4 ! ohara 1550: @node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics
! 1551: @section Headers and Libraries
! 1552: @cindex Headers
! 1553:
! 1554: @cindex @file{gmp.h}
! 1555: All declarations needed to use GMP are collected in the include file
! 1556: @file{gmp.h}. It is designed to work with both C and C++ compilers.
! 1557:
! 1558: @example
! 1559: #include <gmp.h>
! 1560: @end example
! 1561:
! 1562: Note however that prototypes for GMP functions with @code{FILE *} parameters
! 1563: are only provided if @code{<stdio.h>} is included too.
! 1564:
! 1565: @example
! 1566: #include <stdio.h>
! 1567: #include <gmp.h>
! 1568: @end example
! 1569:
! 1570: Likewise @code{<stdarg.h>} (or @code{<varargs.h>}) is required for prototypes
! 1571: with @code{va_list} parameters, such as @code{gmp_vprintf}. And
! 1572: @code{<obstack.h>} for prototypes with @code{struct obstack} parameters, such
! 1573: as @code{gmp_obstack_printf}, when available.
! 1574:
! 1575: @cindex Libraries
! 1576: @cindex Linking
! 1577: All programs using GMP must link against the @file{libgmp} library. On a
! 1578: typical Unix-like system this can be done with @samp{-lgmp}, for example
! 1579:
! 1580: @example
! 1581: gcc myprogram.c -lgmp
! 1582: @end example
! 1583:
! 1584: GMP C++ functions are in a separate @file{libgmpxx} library. This is built
! 1585: and installed if C++ support has been enabled (@pxref{Build Options}). For
! 1586: example,
! 1587:
! 1588: @example
! 1589: g++ mycxxprog.cc -lgmpxx -lgmp
! 1590: @end example
! 1591:
! 1592: GMP is built using Libtool and an application can use that to link if desired,
! 1593: @pxref{Top,Shared library support for GNU,Introduction,libtool,GNU Libtool}.
! 1594:
! 1595: If GMP has been installed to a non-standard location then it may be necessary
! 1596: to use @samp{-I} and @samp{-L} compiler options to point to the right
! 1597: directories, and some sort of run-time path for a shared library. Consult
! 1598: your compiler documentation, for instance @ref{Top,,Introduction,gcc,Using and
! 1599: Porting the GNU Compiler Collection}.
! 1600:
! 1601:
! 1602: @node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics
1.1 maekawa 1603: @section Nomenclature and Types
1.1.1.2 maekawa 1604: @cindex Nomenclature
1605: @cindex Types
1.1 maekawa 1606:
1607: @cindex Integer
1608: @tindex @code{mpz_t}
1609: @noindent
1610: In this manual, @dfn{integer} usually means a multiple precision integer, as
1.1.1.2 maekawa 1611: defined by the GMP library. The C data type for such integers is @code{mpz_t}.
1.1 maekawa 1612: Here are some examples of how to declare such integers:
1613:
1614: @example
1615: mpz_t sum;
1616:
1617: struct foo @{ mpz_t x, y; @};
1618:
1619: mpz_t vec[20];
1620: @end example
1621:
1622: @cindex Rational number
1623: @tindex @code{mpq_t}
1624: @noindent
1625: @dfn{Rational number} means a multiple precision fraction. The C data type
1626: for these fractions is @code{mpq_t}. For example:
1627:
1628: @example
1629: mpq_t quotient;
1630: @end example
1631:
1632: @cindex Floating-point number
1633: @tindex @code{mpf_t}
1634: @noindent
1635: @dfn{Floating point number}, or @dfn{Float} for short, is an arbitrary precision
1.1.1.2 maekawa 1636: mantissa with a limited precision exponent. The C data type for such objects
1.1 maekawa 1637: is @code{mpf_t}.
1638:
1639: @cindex Limb
1640: @tindex @code{mp_limb_t}
1641: @noindent
1642: A @dfn{limb} means the part of a multi-precision number that fits in a single
1.1.1.4 ! ohara 1643: machine word. (We chose this word because a limb of the human body is
! 1644: analogous to a digit, only larger, and containing several digits.) Normally a
! 1645: limb is 32 or 64 bits. The C data type for a limb is @code{mp_limb_t}.
1.1 maekawa 1646:
1647:
1.1.1.4 ! ohara 1648: @node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
1.1 maekawa 1649: @section Function Classes
1.1.1.2 maekawa 1650: @cindex Function classes
1.1 maekawa 1651:
1.1.1.2 maekawa 1652: There are six classes of functions in the GMP library:
1.1 maekawa 1653:
1654: @enumerate
1655: @item
1656: Functions for signed integer arithmetic, with names beginning with
1.1.1.4 ! ohara 1657: @code{mpz_}. The associated type is @code{mpz_t}. There are about 150
1.1 maekawa 1658: functions in this class.
1659:
1660: @item
1661: Functions for rational number arithmetic, with names beginning with
1.1.1.4 ! ohara 1662: @code{mpq_}. The associated type is @code{mpq_t}. There are about 40
! 1663: functions in this class, but the integer functions can be used for arithmetic
! 1664: on the numerator and denominator separately.
1.1 maekawa 1665:
1666: @item
1667: Functions for floating-point arithmetic, with names beginning with
1.1.1.4 ! ohara 1668: @code{mpf_}. The associated type is @code{mpf_t}. There are about 60
1.1 maekawa 1669: functions in this class.
1670:
1671: @item
1.1.1.4 ! ohara 1672: Functions compatible with Berkeley MP, such as @code{itom}, @code{madd}, and
1.1 maekawa 1673: @code{mult}. The associated type is @code{MINT}.
1674:
1675: @item
1676: Fast low-level functions that operate on natural numbers. These are used by
1677: the functions in the preceding groups, and you can also call them directly
1678: from very time-critical user programs. These functions' names begin with
1.1.1.4 ! ohara 1679: @code{mpn_}. The associated type is an array of @code{mp_limb_t}. There are
! 1680: about 30 (hard-to-use) functions in this class.
1.1 maekawa 1681:
1682: @item
1.1.1.2 maekawa 1683: Miscellaneous functions. Functions for setting up custom allocation and
1684: functions for generating random numbers.
1.1 maekawa 1685: @end enumerate
1686:
1687:
1.1.1.4 ! ohara 1688: @node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
! 1689: @section Variable Conventions
1.1.1.2 maekawa 1690: @cindex Variable conventions
1691: @cindex Conventions for variables
1.1 maekawa 1692:
1.1.1.4 ! ohara 1693: GMP functions generally have output arguments before input arguments. This
! 1694: notation is by analogy with the assignment operator. The BSD MP compatibility
! 1695: functions are exceptions, having the output arguments last.
1.1 maekawa 1696:
1.1.1.2 maekawa 1697: GMP lets you use the same variable for both input and output in one call. For
1698: example, the main function for integer multiplication, @code{mpz_mul}, can be
1699: used to square @code{x} and put the result back in @code{x} with
1700:
1701: @example
1702: mpz_mul (x, x, x);
1703: @end example
1.1 maekawa 1704:
1.1.1.2 maekawa 1705: Before you can assign to a GMP variable, you need to initialize it by calling
1.1 maekawa 1706: one of the special initialization functions. When you're done with a
1707: variable, you need to clear it out, using one of the functions for that
1708: purpose. Which function to use depends on the type of variable. See the
1709: chapters on integer functions, rational number functions, and floating-point
1710: functions for details.
1711:
1.1.1.4 ! ohara 1712: A variable should only be initialized once, or at least cleared between each
! 1713: initialization. After a variable has been initialized, it may be assigned to
! 1714: any number of times.
! 1715:
! 1716: For efficiency reasons, avoid excessive initializing and clearing. In
! 1717: general, initialize near the start of a function and clear near the end. For
! 1718: example,
! 1719:
! 1720: @example
! 1721: void
! 1722: foo (void)
! 1723: @{
! 1724: mpz_t n;
! 1725: int i;
! 1726: mpz_init (n);
! 1727: for (i = 1; i < 100; i++)
! 1728: @{
! 1729: mpz_mul (n, @dots{});
! 1730: mpz_fdiv_q (n, @dots{});
! 1731: @dots{}
! 1732: @}
! 1733: mpz_clear (n);
! 1734: @}
! 1735: @end example
! 1736:
! 1737:
! 1738: @node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
! 1739: @section Parameter Conventions
! 1740: @cindex Parameter conventions
! 1741: @cindex Conventions for parameters
1.1.1.2 maekawa 1742:
1.1.1.4 ! ohara 1743: When a GMP variable is used as a function parameter, it's effectively a
! 1744: call-by-reference, meaning if the function stores a value there it will change
! 1745: the original in the caller. Parameters which are input-only can be designated
! 1746: @code{const} to provoke a compiler error or warning on attempting to modify
! 1747: them.
! 1748:
! 1749: When a function is going to return a GMP result, it should designate a
! 1750: parameter that it sets, like the library functions do. More than one value
! 1751: can be returned by having more than one output parameter, again like the
! 1752: library functions. A @code{return} of an @code{mpz_t} etc doesn't return the
! 1753: object, only a pointer, and this is almost certainly not what's wanted.
! 1754:
! 1755: Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
! 1756: and storing the result to the indicated parameter.
1.1.1.2 maekawa 1757:
1758: @example
1759: void
1.1.1.4 ! ohara 1760: foo (mpz_t result, const mpz_t param, unsigned long n)
1.1.1.2 maekawa 1761: @{
1762: unsigned long i;
1763: mpz_mul_ui (result, param, n);
1764: for (i = 1; i < n; i++)
1765: mpz_add_ui (result, result, i*7);
1766: @}
1767:
1768: int
1769: main (void)
1770: @{
1771: mpz_t r, n;
1772: mpz_init (r);
1773: mpz_init_set_str (n, "123456", 0);
1.1.1.4 ! ohara 1774: foo (r, n, 20L);
! 1775: gmp_printf ("%Zd\n", r);
1.1.1.2 maekawa 1776: return 0;
1777: @}
1778: @end example
1779:
1.1.1.4 ! ohara 1780: @code{foo} works even if the mainline passes the same variable for
! 1781: @code{param} and @code{result}, just like the library functions. But
! 1782: sometimes it's tricky to make that work, and an application might not want to
! 1783: bother supporting that sort of thing.
! 1784:
! 1785: For interest, the GMP types @code{mpz_t} etc are implemented as one-element
! 1786: arrays of certain structures. This is why declaring a variable creates an
! 1787: object with the fields GMP needs, but then using it as a parameter passes a
! 1788: pointer to the object. Note that the actual fields in each @code{mpz_t} etc
! 1789: are for internal use only and should not be accessed directly by code that
! 1790: expects to be compatible with future GMP releases.
! 1791:
! 1792:
! 1793: @need 1000
! 1794: @node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
! 1795: @section Memory Management
! 1796: @cindex Memory Management
! 1797:
! 1798: The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
! 1799: and pointers to allocated data. Once a variable is initialized, GMP takes
! 1800: care of all space allocation. Additional space is allocated whenever a
! 1801: variable doesn't have enough.
! 1802:
! 1803: @code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
! 1804: Normally this is the best policy, since it avoids frequent reallocation.
! 1805: Applications that need to return memory to the heap at some particular point
! 1806: can use @code{mpz_realloc2}, or clear variables no longer needed.
! 1807:
! 1808: @code{mpf_t} variables, in the current implementation, use a fixed amount of
! 1809: space, determined by the chosen precision and allocated at initialization, so
! 1810: their size doesn't change.
! 1811:
! 1812: All memory is allocated using @code{malloc} and friends by default, but this
! 1813: can be changed, see @ref{Custom Allocation}. Temporary memory on the stack is
! 1814: also used (via @code{alloca}), but this can be changed at build-time if
! 1815: desired, see @ref{Build Options}.
1.1.1.2 maekawa 1816:
1817:
1.1.1.4 ! ohara 1818: @node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
! 1819: @section Reentrancy
1.1.1.2 maekawa 1820: @cindex Reentrancy
1821: @cindex Thread safety
1822: @cindex Multi-threading
1823:
1.1.1.4 ! ohara 1824: GMP is reentrant and thread-safe, with some exceptions:
1.1.1.2 maekawa 1825:
1826: @itemize @bullet
1827: @item
1.1.1.4 ! ohara 1828: If configured with @option{--enable-alloca=malloc-notreentrant} (or with
! 1829: @option{--enable-alloca=notreentrant} when @code{alloca} is not available),
! 1830: then naturally GMP is not reentrant.
1.1.1.2 maekawa 1831:
1832: @item
1.1.1.4 ! ohara 1833: @code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
! 1834: selected precision. @code{mpf_init2} can be used instead.
! 1835:
! 1836: @item
! 1837: @code{mpz_random} and the other old random number functions use a global
! 1838: random state and are hence not reentrant. The newer random number functions
! 1839: that accept a @code{gmp_randstate_t} parameter can be used instead.
! 1840:
! 1841: @item
! 1842: @code{mp_set_memory_functions} uses global variables to store the selected
! 1843: memory allocation functions.
1.1.1.2 maekawa 1844:
1845: @item
1846: If the memory allocation functions set by a call to
1847: @code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
1.1.1.4 ! ohara 1848: not reentrant, then GMP will not be reentrant either.
1.1.1.2 maekawa 1849:
1850: @item
1.1.1.4 ! ohara 1851: If the standard I/O functions such as @code{fwrite} are not reentrant then the
! 1852: GMP I/O functions using them will not be reentrant either.
1.1.1.3 maekawa 1853:
1854: @item
1.1.1.4 ! ohara 1855: It's safe for two threads to read from the same GMP variable simultaneously,
! 1856: but it's not safe for one to read while another might be writing, nor for
! 1857: two threads to write simultaneously. It's not safe for two threads to
! 1858: generate a random number from the same @code{gmp_randstate_t} simultaneously,
! 1859: since this involves an update of that variable.
! 1860:
! 1861: @item
! 1862: On SCO systems the default @code{<ctype.h>} macros use per-file static
! 1863: variables and may not be reentrant, depending on whether the compiler optimizes
! 1864: away fetches from them. The GMP text-based input functions are affected.
1.1.1.2 maekawa 1865: @end itemize
1.1 maekawa 1866:
1867:
1.1.1.2 maekawa 1868: @need 2000
1.1.1.4 ! ohara 1869: @node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
1.1 maekawa 1870: @section Useful Macros and Constants
1.1.1.2 maekawa 1871: @cindex Useful macros and constants
1872: @cindex Constants
1.1 maekawa 1873:
1874: @deftypevr {Global Constant} {const int} mp_bits_per_limb
1.1.1.4 ! ohara 1875: @findex mp_bits_per_limb
1.1.1.2 maekawa 1876: @cindex Bits per limb
1877: @cindex Limb size
1.1 maekawa 1878: The number of bits per limb.
1879: @end deftypevr
1880:
1881: @defmac __GNU_MP_VERSION
1882: @defmacx __GNU_MP_VERSION_MINOR
1.1.1.2 maekawa 1883: @defmacx __GNU_MP_VERSION_PATCHLEVEL
1884: @cindex Version number
1885: @cindex GMP version number
1886: The major and minor GMP version, and patch level, respectively, as integers.
1887: For GMP i.j, these numbers will be i, j, and 0, respectively.
1888: For GMP i.j.k, these numbers will be i, j, and k, respectively.
1.1 maekawa 1889: @end defmac
1890:
1.1.1.4 ! ohara 1891: @deftypevr {Global Constant} {const char * const} gmp_version
! 1892: @findex gmp_version
! 1893: The GMP version number, as a null-terminated string, in the form ``i.j'' or
! 1894: ``i.j.k''. This release is @nicode{"@value{VERSION}"}.
! 1895: @end deftypevr
! 1896:
1.1 maekawa 1897:
1.1.1.4 ! ohara 1898: @node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics
1.1.1.2 maekawa 1899: @section Compatibility with older versions
1900: @cindex Compatibility with older versions
1901: @cindex Upward compatibility
1902:
1.1.1.4 ! ohara 1903: This version of GMP is upwardly binary compatible with all 4.x and 3.x
! 1904: versions, and upwardly compatible at the source level with all 2.x versions,
! 1905: with the following exceptions.
1.1 maekawa 1906:
1.1.1.2 maekawa 1907: @itemize @bullet
1908: @item
1.1.1.4 ! ohara 1909: @code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
1.1.1.2 maekawa 1910: with other @code{mpn} functions.
1.1 maekawa 1911:
1.1.1.2 maekawa 1912: @item
1913: @code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
1.1.1.4 ! ohara 1914: 3.0.1, but in 3.1 reverted to the 2.x style.
1.1.1.2 maekawa 1915: @end itemize
1.1 maekawa 1916:
1.1.1.2 maekawa 1917: There are a number of compatibility issues between GMP 1 and GMP 2 that of
1.1.1.4 ! ohara 1918: course also apply when porting applications from GMP 1 to GMP 4. Please
1.1.1.2 maekawa 1919: see the GMP 2 manual for details.
1920:
1.1.1.4 ! ohara 1921: The Berkeley MP compatibility library (@pxref{BSD Compatible Functions}) is
! 1922: source and binary compatible with the standard @file{libmp}.
! 1923:
1.1.1.2 maekawa 1924: @c @enumerate
1925: @c @item Integer division functions round the result differently. The obsolete
1926: @c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
1927: @c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
1928: @c quotient towards
1929: @c @ifinfo
1930: @c @minus{}infinity).
1931: @c @end ifinfo
1932: @c @iftex
1933: @c @tex
1934: @c $-\infty$).
1935: @c @end tex
1936: @c @end iftex
1937: @c There are a lot of functions for integer division, giving the user better
1938: @c control over the rounding.
1939:
1940: @c @item The function @code{mpz_mod} now compute the true @strong{mod} function.
1941:
1942: @c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
1943: @c @strong{mod} for reduction.
1944:
1945: @c @item The assignment functions for rational numbers do no longer canonicalize
1946: @c their results. In the case a non-canonical result could arise from an
1947: @c assignment, the user need to insert an explicit call to
1948: @c @code{mpq_canonicalize}. This change was made for efficiency.
1949:
1950: @c @item Output generated by @code{mpz_out_raw} in this release cannot be read
1951: @c by @code{mpz_inp_raw} in previous releases. This change was made for making
1952: @c the file format truly portable between machines with different word sizes.
1953:
1954: @c @item Several @code{mpn} functions have changed. But they were intentionally
1955: @c undocumented in previous releases.
1956:
1957: @c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
1958: @c are now implemented as macros, and thereby sometimes evaluate their
1959: @c arguments multiple times.
1960:
1961: @c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
1962: @c for 0^0. (In version 1, they yielded 0.)
1963:
1.1.1.4 ! ohara 1964: @c In version 1 of the library, @code{mpq_set_den} handled negative
! 1965: @c denominators by copying the sign to the numerator. That is no longer done.
! 1966:
! 1967: @c Pure assignment functions do not canonicalize the assigned variable. It is
! 1968: @c the responsibility of the user to canonicalize the assigned variable before
! 1969: @c any arithmetic operations are performed on that variable.
! 1970: @c Note that this is an incompatible change from version 1 of the library.
! 1971:
1.1.1.2 maekawa 1972: @c @end enumerate
1973:
1974:
1.1.1.4 ! ohara 1975: @need 1000
! 1976: @node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics
! 1977: @section Demonstration Programs
! 1978: @cindex Demonstration programs
! 1979: @cindex Example programs
! 1980: @cindex Sample programs
! 1981: The @file{demos} subdirectory has some sample programs using GMP. These
! 1982: aren't built or installed, but there's a @file{Makefile} with rules for them.
! 1983: For instance,
! 1984:
! 1985: @example
! 1986: make pexpr
! 1987: ./pexpr 68^975+10
! 1988: @end example
! 1989:
! 1990: @noindent
! 1991: The following programs are provided:
! 1992:
! 1993: @itemize @bullet
! 1994: @item
! 1995: @samp{pexpr} is an expression evaluator, the program used on the GMP web page.
! 1996: @item
! 1997: The @samp{calc} subdirectory has a similar but simpler evaluator using
! 1998: @command{lex} and @command{yacc}.
! 1999: @item
! 2000: The @samp{expr} subdirectory is yet another expression evaluator, a library
! 2001: designed for ease of use within a C program. See @file{demos/expr/README} for
! 2002: more information.
! 2003: @item
! 2004: @samp{factorize} is a Pollard-Rho factorization program.
! 2005: @item
! 2006: @samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p}
! 2007: function.
! 2008: @item
! 2009: @samp{primes} counts or lists primes in an interval, using a sieve.
! 2010: @item
! 2011: @samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic
! 2012: class numbers.
! 2013: @item
! 2014: @cindex @code{perl}
! 2015: The @samp{perl} subdirectory is a comprehensive perl interface to GMP. See
! 2016: @file{demos/perl/INSTALL} for more information. Documentation is in POD
! 2017: format in @file{demos/perl/GMP.pm}.
! 2018: @end itemize
! 2019:
! 2020:
! 2021: @need 1000
! 2022: @node Efficiency, Debugging, Demonstration Programs, GMP Basics
! 2023: @section Efficiency
! 2024: @cindex Efficiency
! 2025:
! 2026: @table @asis
! 2027: @item Small operands
! 2028: On small operands, the time for function call overheads and memory allocation
! 2029: can be significant in comparison to actual calculation. This is unavoidable
! 2030: in a general purpose variable precision library, although GMP attempts to be
! 2031: as efficient as it can on both large and small operands.
! 2032:
! 2033: @item Static Linking
! 2034: On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
! 2035: used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
! 2036: have a small overhead on each function call and global data address. For many
! 2037: programs this will be insignificant, but for long calculations there's a gain
! 2038: to be had.
! 2039:
! 2040: @item Initializing and clearing
! 2041: Avoid excessive initializing and clearing of variables, since this can be
! 2042: quite time consuming, especially in comparison to otherwise fast operations
! 2043: like addition.
! 2044:
! 2045: A language interpreter might want to keep a free list or stack of
! 2046: initialized variables ready for use. It should be possible to integrate
! 2047: something like that with a garbage collector too.
! 2048:
! 2049: @item Reallocations
! 2050: An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
! 2051: values will have its memory repeatedly @code{realloc}ed, which could be quite
! 2052: slow or could fragment memory, depending on the C library. If an application
! 2053: can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
! 2054: be called to allocate the necessary space from the beginning
! 2055: (@pxref{Initializing Integers}).
! 2056:
! 2057: It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
! 2058: is too small, since all functions will do a further reallocation if necessary.
! 2059: Badly overestimating memory required will waste space though.
! 2060:
! 2061: @item @code{2exp} functions
! 2062: It's up to an application to call functions like @code{mpz_mul_2exp} when
! 2063: appropriate. General purpose functions like @code{mpz_mul} make no attempt to
! 2064: identify powers of two or other special forms, because such inputs will
! 2065: usually be very rare and testing every time would be wasteful.
! 2066:
! 2067: @item @code{ui} and @code{si} functions
! 2068: The @code{ui} functions and the small number of @code{si} functions exist for
! 2069: convenience and should be used where applicable. But if for example an
! 2070: @code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
! 2071: need to extract it and call a @code{ui} function, just use the regular @code{mpz}
! 2072: function.
! 2073:
! 2074: @item In-Place Operations
! 2075: @code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
! 2076: and @code{mpf_neg} are fast when used for in-place operations like
! 2077: @code{mpz_abs(x,x)}, since in the current implementation only a single field
! 2078: of @code{x} needs changing. On suitable compilers (GCC for instance) this is
! 2079: inlined too.
! 2080:
! 2081: @code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
! 2082: benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
! 2083: usually only one or two limbs of @code{x} will need to be changed. The same
! 2084: applies to the full precision @code{mpz_add} etc if @code{y} is small. If
! 2085: @code{y} is big then cache locality may be helped, but that's all.
! 2086:
! 2087: @code{mpz_mul} is currently the opposite, a separate destination is slightly
! 2088: better. A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
! 2089: limb, make a temporary copy of @code{x} before forming the result. Normally
! 2090: that copying will only be a tiny fraction of the time for the multiply, so
! 2091: this is not a particularly important consideration.
! 2092:
! 2093: @code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
! 2094: no attempt to recognise a copy of something to itself, so a call like
! 2095: @code{mpz_set(x,x)} will be wasteful. Naturally that would never be written
! 2096: deliberately, but if it might arise from two pointers to the same object then
! 2097: a test to avoid it might be desirable.
! 2098:
! 2099: @example
! 2100: if (x != y)
! 2101: mpz_set (x, y);
! 2102: @end example
! 2103:
! 2104: Note that it's never worth introducing extra @code{mpz_set} calls just to get
! 2105: in-place operations. If a result should go to a particular variable then just
! 2106: direct it there and let GMP take care of data movement.
! 2107:
! 2108: @item Divisibility Testing (Small Integers)
! 2109:
! 2110: @code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
! 2111: for testing whether an @code{mpz_t} is divisible by an individual small
! 2112: integer. They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
! 2113: which gives no useful information about the actual remainder, only whether
! 2114: it's zero (or a particular value).
! 2115:
! 2116: However when testing divisibility by several small integers, it's best to take
! 2117: a remainder modulo their product, to save multi-precision operations. For
! 2118: instance to test whether a number is divisible by any of 23, 29 or 31 take a
! 2119: remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that.
! 2120:
! 2121: The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
! 2122: as a remainder are generally a little slower than the remainder-only functions
! 2123: like @code{mpz_tdiv_ui}. If the quotient is only rarely wanted then it's
! 2124: probably best to just take a remainder and then go back and calculate the
! 2125: quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
! 2126: remainder is zero).
! 2127:
! 2128: @item Rational Arithmetic
! 2129: The @code{mpq} functions operate on @code{mpq_t} values with no common factors
! 2130: in the numerator and denominator. Common factors are checked-for and cast out
! 2131: as necessary. In general, cancelling factors every time is the best approach
! 2132: since it minimizes the sizes for subsequent operations.
! 2133:
! 2134: However, applications that know something about the factorization of the
! 2135: values they're working with might be able to avoid some of the GCDs used for
! 2136: canonicalization, or swap them for divisions. For example when multiplying by
! 2137: a prime it's enough to check for factors of it in the denominator instead of
! 2138: doing a full GCD. Or when forming a big product it might be known that very
! 2139: little cancellation will be possible, and so canonicalization can be left to
! 2140: the end.
! 2141:
! 2142: The @code{mpq_numref} and @code{mpq_denref} macros give access to the
! 2143: numerator and denominator to do things outside the scope of the supplied
! 2144: @code{mpq} functions. @xref{Applying Integer Functions}.
! 2145:
! 2146: The canonical form for rationals allows mixed-type @code{mpq_t} and integer
! 2147: additions or subtractions to be done directly with multiples of the
! 2148: denominator. This will be somewhat faster than @code{mpq_add}. For example,
! 2149:
! 2150: @example
! 2151: /* mpq increment */
! 2152: mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));
! 2153:
! 2154: /* mpq += unsigned long */
! 2155: mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);
! 2156:
! 2157: /* mpq -= mpz */
! 2158: mpz_submul (mpq_numref(q), mpq_denref(q), z);
! 2159: @end example
! 2160:
! 2161: @item Number Sequences
! 2162: Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
! 2163: are designed for calculating isolated values. If a range of values is wanted
! 2164: it's probably best to call one of them to get a starting point and iterate from there.
! 2165:
! 2166: @item Text Input/Output
! 2167: Hexadecimal or octal are suggested for input or output in text form.
! 2168: Power-of-2 bases like these can be converted much more efficiently than other
! 2169: bases, like decimal. For big numbers there's usually nothing of particular
! 2170: interest to be seen in the digits, so the base doesn't matter much.
! 2171:
! 2172: Maybe we can hope octal will one day become the normal base for everyday use,
! 2173: as proposed by King Charles XII of Sweden and later reformers.
! 2174: @c Reference: Knuth volume 2 section 4.1, page 184 of second edition. :-)
! 2175: @end table
! 2176:
! 2177:
! 2178: @node Debugging, Profiling, Efficiency, GMP Basics
! 2179: @section Debugging
! 2180: @cindex Debugging
! 2181:
! 2182: @table @asis
! 2183: @item Stack Overflow
! 2184: Depending on the system, a segmentation violation or bus error might be the
! 2185: only indication of stack overflow. See @samp{--enable-alloca} choices in
! 2186: @ref{Build Options}, for how to address this.
! 2187:
! 2188: In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
! 2189: overflow is recognised by the system before too much damage is done, or
! 2190: @samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
! 2191: add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
! 2192: Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
! 2193: These options must be added to the @samp{CFLAGS} used in the GMP build
! 2194: (@pxref{Build Options}), adding them just to an application will have no
! 2195: effect. Note also they're a slowdown, adding overhead to each function call
! 2196: and each stack allocation.
! 2197:
! 2198: @item Heap Problems
! 2199: The most likely cause of application problems with GMP is heap corruption.
! 2200: Failing to @code{init} GMP variables will have unpredictable effects, and
! 2201: corruption arising elsewhere in a program may well affect GMP. Initializing
! 2202: GMP variables more than once or failing to clear them will cause memory leaks.
! 2203:
! 2204: In all such cases a malloc debugger is recommended. On a GNU or BSD system
! 2205: the standard C library @code{malloc} has some diagnostic facilities, see
! 2206: @ref{Allocation Debugging,,,libc,The GNU C Library Reference Manual}, or
! 2207: @samp{man 3 malloc}. Other possibilities, in no particular order, include
! 2208:
! 2209: @display
! 2210: @uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc}
! 2211: @uref{http://quorum.tamu.edu/jon/gnu} @ (debauch)
! 2212: @uref{http://dmalloc.com}
! 2213: @uref{http://www.perens.com/FreeSoftware} @ (electric fence)
! 2214: @uref{http://packages.debian.org/fda}
! 2215: @uref{http://www.gnupdate.org/components/leakbug}
! 2216: @uref{http://people.redhat.com/~otaylor/memprof}
! 2217: @uref{http://www.cbmamiga.demon.co.uk/mpatrol}
! 2218: @end display
! 2219:
! 2220: The GMP default allocation routines in @file{memory.c} also have a simple
! 2221: sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
! 2222: This is mainly designed for detecting buffer overruns during GMP development,
! 2223: but might find other uses.
! 2224:
! 2225: @item Stack Backtraces
! 2226: On some systems the compiler options GMP uses by default can interfere with
! 2227: debugging. In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
! 2228: is used and this generally inhibits stack backtracing. Recompiling without
! 2229: such options may help while debugging, though the usual caveats about it
! 2230: potentially moving a memory problem or hiding a compiler bug will apply.
! 2231:
! 2232: @item GNU Debugger
! 2233: A sample @file{.gdbinit} is included in the distribution, showing how to call
! 2234: some undocumented dump functions to print GMP variables from within GDB. Note
! 2235: that these functions shouldn't be used in final application code since they're
! 2236: undocumented and may be subject to incompatible changes in future versions of
! 2237: GMP.
! 2238:
! 2239: @item Source File Paths
! 2240: GMP has multiple source files with the same name, in different directories.
! 2241: For example @file{mpz}, @file{mpq}, @file{mpf} and @file{mpfr} each have an
! 2242: @file{init.c}. If the debugger can't already determine the right one it may
! 2243: help to build with absolute paths on each C file. One way to do that is to
! 2244: use a separate object directory with an absolute path to the source directory.
! 2245:
! 2246: @example
! 2247: cd /my/build/dir
! 2248: /my/source/dir/gmp-@value{VERSION}/configure
! 2249: @end example
! 2250:
! 2251: This works via @code{VPATH}, and might require GNU @command{make}.
! 2252: Alternatively it might be possible to change the @code{.c.lo} rules
! 2253: appropriately.
! 2254:
! 2255: @item Assertion Checking
! 2256: The build option @option{--enable-assert} is available to add some consistency
! 2257: checks to the library (see @ref{Build Options}). These are likely to be of
! 2258: limited value to most applications. Assertion failures are just as likely to
! 2259: indicate memory corruption as a library or compiler bug.
! 2260:
! 2261: Applications using the low-level @code{mpn} functions, however, will benefit
! 2262: from @option{--enable-assert} since it adds checks on the parameters of most
! 2263: such functions, many of which have subtle restrictions on their usage. Note
! 2264: however that only the generic C code has checks, not the assembler code, so
! 2265: CPU @samp{none} should be used for maximum checking.
! 2266:
! 2267: @item Temporary Memory Checking
! 2268: The build option @option{--enable-alloca=debug} arranges that each block of
! 2269: temporary memory in GMP is allocated with a separate call to @code{malloc} (or
! 2270: the allocation function set with @code{mp_set_memory_functions}).
! 2271:
! 2272: This can help a malloc debugger detect accesses outside the intended bounds,
! 2273: or detect memory not released. In a normal build, on the other hand,
! 2274: temporary memory is allocated in blocks which GMP divides up for its own use,
! 2275: or may be allocated with a compiler builtin @code{alloca} which will go
! 2276: nowhere near any malloc debugger hooks.
1.1.1.2 maekawa 2277:
1.1.1.4 ! ohara 2278: @item Maximum Debuggability
! 2279: To summarize the above, a GMP build for maximum debuggability would be
! 2280:
! 2281: @example
! 2282: ./configure --disable-shared --enable-assert \
! 2283: --enable-alloca=debug --host=none CFLAGS=-g
! 2284: @end example
! 2285:
! 2286: For C++, add @samp{--enable-cxx CXXFLAGS=-g}.
! 2287:
! 2288: @item Checker
! 2289: The checker program (@uref{http://savannah.gnu.org/projects/checker}) can be
! 2290: used with GMP. It contains a stub library which means GMP applications
! 2291: compiled with checker can use a normal GMP build.
! 2292:
! 2293: A build of GMP with checking within GMP itself can be made. This will run
! 2294: very slowly. Configure with
! 2295:
! 2296: @example
! 2297: ./configure --host=none-pc-linux-gnu CC=checkergcc
! 2298: @end example
! 2299:
! 2300: @samp{--host=none} must be used, since the GMP assembler code doesn't support
! 2301: the checking scheme. The GMP C++ features cannot be used, since current
! 2302: versions of checker (0.9.9.1) don't yet support the standard C++ library.
! 2303:
! 2304: @item Valgrind
! 2305: The valgrind program (@uref{http://devel-home.kde.org/~sewardj}) is a memory
! 2306: checker for x86s. It translates and emulates machine instructions to do
! 2307: strong checks for uninitialized data (at the level of individual bits), memory
! 2308: accesses through bad pointers, and memory leaks.
! 2309:
! 2310: Current versions (20020226 snapshot) don't support MMX or SSE, so GMP must be
! 2311: configured for an x86 without those (eg. plain @samp{i386}), or with a special
! 2312: @code{MPN_PATH} that excludes those subdirectories (@pxref{Build Options}).
! 2313:
! 2314: @item Other Problems
! 2315: Any suspected bug in GMP itself should be isolated to make sure it's not an
! 2316: application problem, see @ref{Reporting Bugs}.
! 2317: @end table
! 2318:
! 2319:
! 2320: @node Profiling, Autoconf, Debugging, GMP Basics
! 2321: @section Profiling
! 2322: @cindex Profiling
! 2323:
! 2324: Running a program under a profiler is a good way to find where it's spending
! 2325: most time and where improvements can be best sought.
! 2326:
! 2327: Depending on the system, it may be possible to get a flat profile, meaning
! 2328: simple timer sampling of the program counter, with no special GMP build
! 2329: options, just a @samp{-p} when compiling the mainline. This is a good way to
! 2330: ensure minimum interference with normal operation. The necessary symbol type
! 2331: and size information exists in most of the GMP assembler code.
! 2332:
! 2333: The @samp{--enable-profiling} build option can be used to add suitable
! 2334: compiler flags, either for @command{prof} (@samp{-p}) or @command{gprof}
! 2335: (@samp{-pg}), see @ref{Build Options}. Which of the two is available and what
! 2336: they do will depend on the system, and possibly on support available in
! 2337: @file{libc}. For some systems appropriate corresponding @code{mcount} calls
! 2338: are added to the assembler code too.
! 2339:
! 2340: On x86 systems @command{prof} gives call counting, so that average time spent
! 2341: in a function can be determined. @command{gprof}, where supported, adds call
! 2342: graph construction, so for instance calls to @code{mpn_add_n} from
! 2343: @code{mpz_add} and from @code{mpz_mul} can be differentiated.
! 2344:
! 2345: On x86 and 68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
! 2346: incompatible, so the latter is not used when @command{gprof} profiling is
! 2347: selected, which may result in poorer code generation. If @command{prof}
! 2348: profiling is selected instead it should still be possible to use
! 2349: @command{gprof}, but only the @samp{gprof -p} flat profile and call counts can
! 2350: be expected to be valid, not the @samp{gprof -q} call graph.
! 2351:
! 2352:
! 2353: @node Autoconf, Emacs, Profiling, GMP Basics
! 2354: @section Autoconf
! 2355: @cindex Autoconf detections
! 2356:
! 2357: Autoconf based applications can easily check whether GMP is installed. The
! 2358: only thing to be noted is that GMP library symbols from version 3 onwards have
! 2359: prefixes like @code{__gmpz}. The following therefore would be a simple test,
! 2360:
! 2361: @example
! 2362: AC_CHECK_LIB(gmp, __gmpz_init)
! 2363: @end example
! 2364:
! 2365: This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
! 2366: but an application that must have GMP would want to generate an error if not
! 2367: found. For example,
! 2368:
! 2369: @example
! 2370: AC_CHECK_LIB(gmp, __gmpz_init, , [AC_MSG_ERROR(
! 2371: [GNU MP not found, see http://swox.com/gmp])])
! 2372: @end example
! 2373:
! 2374: If functions added in some particular version of GMP are required, then one of
! 2375: those can be used when checking. For example @code{mpz_mul_si} was added in
! 2376: GMP 3.1,
! 2377:
! 2378: @example
! 2379: AC_CHECK_LIB(gmp, __gmpz_mul_si, , [AC_MSG_ERROR(
! 2380: [GNU MP not found, or not 3.1 or up, see http://swox.com/gmp])])
! 2381: @end example
! 2382:
! 2383: An alternative would be to test the version number in @file{gmp.h} using say
! 2384: @code{AC_EGREP_CPP}. That would make it possible to test the exact version,
! 2385: if some particular sub-minor release is known to be necessary.
! 2386:
! 2387: An application that can use either GMP 2 or 3 will need to test for
! 2388: @code{__gmpz_init} (GMP 3 and up) or @code{mpz_init} (GMP 2), and it's also
! 2389: worth checking for @file{libgmp2} since Debian GNU/Linux systems used that
! 2390: name in the past. For example,
! 2391:
! 2392: @example
! 2393: AC_CHECK_LIB(gmp, __gmpz_init, ,
! 2394: [AC_CHECK_LIB(gmp, mpz_init, ,
! 2395: [AC_CHECK_LIB(gmp2, mpz_init)])])
! 2396: @end example
! 2397:
! 2398: In general it's suggested that applications should simply demand a new enough
! 2399: GMP rather than trying to provide supplements for features not available in
! 2400: past versions.
! 2401:
! 2402: Occasionally an application will need or want to know the size of a type at
! 2403: configuration or preprocessing time, not just with @code{sizeof} in the code.
! 2404: This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
! 2405: up is best for this, since prior versions needed certain @samp{-D} defines on
! 2406: systems using a @code{long long} limb. The following would suit Autoconf 2.50
! 2407: or up,
! 2408:
! 2409: @example
! 2410: AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
! 2411: @end example
! 2412:
! 2413: The optional @code{mpfr} functions are provided in a separate
! 2414: @file{libmpfr.a}, and this might be from GMP with @option{--enable-mpfr} or
! 2415: from MPFR installed separately. Either way @file{libmpfr} depends on
! 2416: @file{libgmp}, it doesn't stand alone. Currently only a static
! 2417: @file{libmpfr.a} will be available, not a shared library, since upward binary
! 2418: compatibility is not guaranteed.
! 2419:
! 2420: @example
! 2421: AC_CHECK_LIB(mpfr, mpfr_add, , [AC_MSG_ERROR(
! 2422: [Need MPFR either from GNU MP 4 or separate MPFR package.
! 2423: See http://www.mpfr.org or http://swox.com/gmp])
! 2424: @end example
! 2425:
! 2426:
! 2427: @node Emacs, , Autoconf, GMP Basics
! 2428: @section Emacs
! 2429: @cindex Emacs
! 2430:
! 2431: @key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
! 2432: on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
! 2433: emacs, The Emacs Editor}).
! 2434:
! 2435: The GMP manual can be included in such lookups by putting the following in
! 2436: your @file{.emacs},
! 2437:
! 2438: @c This isn't pretty, but there doesn't seem to be a better way (in emacs
! 2439: @c 21.2 at least). info-lookup->mode-value could be used for the "assoc"s,
! 2440: @c but that function isn't documented, whereas info-lookup-alist is.
! 2441: @c
! 2442: @example
! 2443: (eval-after-load "info-look"
! 2444: '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
! 2445: (setcar (nthcdr 3 mode-value)
! 2446: (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
! 2447: (nth 3 mode-value)))))
! 2448: @end example
! 2449:
! 2450: The same can be done for MPFR, with @code{(mpfr)} in place of @code{(gmp)}.
1.1 maekawa 2451:
2452:
1.1.1.2 maekawa 2453: @node Reporting Bugs, Integer Functions, GMP Basics, Top
1.1 maekawa 2454: @comment node-name, next, previous, up
2455: @chapter Reporting Bugs
2456: @cindex Reporting bugs
1.1.1.2 maekawa 2457: @cindex Bug reporting
2458:
2459: If you think you have found a bug in the GMP library, please investigate it
2460: and report it. We have made this library available to you, and it is not too
1.1.1.4 ! ohara 2461: much to ask you to report the bugs you find.
! 2462:
! 2463: Before you report a bug, check it's not already addressed in @ref{Known Build
! 2464: Problems}, or perhaps @ref{Notes for Particular Systems}. You may also want
! 2465: to check @uref{http://swox.com/gmp/} for patches for this release.
1.1.1.2 maekawa 2466:
2467: Please include the following in any report,
2468:
2469: @itemize @bullet
2470: @item
2471: The GMP version number, and if pre-packaged or patched then say so.
2472:
2473: @item
2474: A test program that makes it possible for us to reproduce the bug. Include
2475: instructions on how to run the program.
1.1 maekawa 2476:
1.1.1.2 maekawa 2477: @item
2478: A description of what is wrong. If the results are incorrect, say in what way.
2479: If you get a crash, say so.
2480:
2481: @item
2482: If you get a crash, include a stack backtrace from the debugger if it's
2483: informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).
2484:
2485: @item
1.1.1.4 ! ohara 2486: Please do not send core dumps, executables or @command{strace}s.
1.1.1.2 maekawa 2487:
2488: @item
2489: The configuration options you used when building GMP, if any.
2490:
2491: @item
2492: The name of the compiler and its version. For @command{gcc}, get the version
2493: with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.
2494:
2495: @item
2496: The output from running @samp{uname -a}.
1.1 maekawa 2497:
1.1.1.2 maekawa 2498: @item
1.1.1.4 ! ohara 2499: The output from running @samp{./config.guess}, and from running
! 2500: @samp{./configfsf.guess} (might be the same).
1.1 maekawa 2501:
1.1.1.2 maekawa 2502: @item
2503: If the bug is related to @samp{configure}, then the contents of
2504: @file{config.log}.
1.1 maekawa 2505:
1.1.1.2 maekawa 2506: @item
2507: If the bug is related to an @file{asm} file not assembling, then the contents
1.1.1.4 ! ohara 2508: of @file{config.m4} and the offending line or lines from the temporary
! 2509: @file{mpn/tmp-<file>.s}.
1.1.1.2 maekawa 2510: @end itemize
1.1 maekawa 2511:
1.1.1.4 ! ohara 2512: Please make an effort to produce a self-contained report, with something
! 2513: definite that can be tested or debugged. Vague queries or piecemeal messages
! 2514: are difficult to act on and don't help the development effort.
! 2515:
1.1 maekawa 2516: It is not uncommon that an observed problem is actually due to a bug in the
1.1.1.2 maekawa 2517: compiler; the GMP code tends to explore interesting corners in compilers.
1.1 maekawa 2518:
1.1.1.2 maekawa 2519: If your bug report is good, we will do our best to help you get a corrected
1.1 maekawa 2520: version of the library; if the bug report is poor, we won't do anything about
1.1.1.2 maekawa 2521: it (except maybe ask you to send a better report).
1.1 maekawa 2522:
1.1.1.2 maekawa 2523: Send your report to: @email{bug-gmp@@gnu.org}.
1.1 maekawa 2524:
2525: If you think something in this manual is unclear, or downright incorrect, or if
2526: the language needs to be improved, please send a note to the same address.
2527:
2528:
2529: @node Integer Functions, Rational Number Functions, Reporting Bugs, Top
2530: @comment node-name, next, previous, up
2531: @chapter Integer Functions
2532: @cindex Integer functions
2533:
1.1.1.2 maekawa 2534: This chapter describes the GMP functions for performing integer arithmetic.
1.1 maekawa 2535: These functions start with the prefix @code{mpz_}.
2536:
1.1.1.2 maekawa 2537: GMP integers are stored in objects of type @code{mpz_t}.
1.1 maekawa 2538:
2539: @menu
1.1.1.2 maekawa 2540: * Initializing Integers::
2541: * Assigning Integers::
2542: * Simultaneous Integer Init & Assign::
2543: * Converting Integers::
2544: * Integer Arithmetic::
2545: * Integer Division::
2546: * Integer Exponentiation::
2547: * Integer Roots::
2548: * Number Theoretic Functions::
2549: * Integer Comparisons::
2550: * Integer Logic and Bit Fiddling::
2551: * I/O of Integers::
2552: * Integer Random Numbers::
1.1.1.4 ! ohara 2553: * Integer Import and Export::
1.1.1.2 maekawa 2554: * Miscellaneous Integer Functions::
1.1 maekawa 2555: @end menu
2556:
1.1.1.2 maekawa 2557: @node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
1.1 maekawa 2558: @comment node-name, next, previous, up
1.1.1.2 maekawa 2559: @section Initialization Functions
2560: @cindex Integer initialization functions
2561: @cindex Initialization functions
1.1 maekawa 2562:
2563: The functions for integer arithmetic assume that all integer objects are
1.1.1.4 ! ohara 2564: initialized. You do that by calling the function @code{mpz_init}. For
! 2565: example,
1.1 maekawa 2566:
2567: @example
2568: @{
2569: mpz_t integ;
2570: mpz_init (integ);
2571: @dots{}
2572: mpz_add (integ, @dots{});
2573: @dots{}
2574: mpz_sub (integ, @dots{});
2575:
2576: /* Unless the program is about to exit, do ... */
2577: mpz_clear (integ);
2578: @}
2579: @end example
2580:
2581: As you can see, you can store new values any number of times, once an
2582: object is initialized.
2583:
1.1.1.4 ! ohara 2584: @deftypefun void mpz_init (mpz_t @var{integer})
! 2585: Initialize @var{integer}, and set its value to 0.
! 2586: @end deftypefun
! 2587:
! 2588: @deftypefun void mpz_init2 (mpz_t @var{integer}, unsigned long @var{n})
! 2589: Initialize @var{integer}, with space for @var{n} bits, and set its value to 0.
! 2590:
! 2591: @var{n} is only the initial space, @var{integer} will grow automatically in
! 2592: the normal way, if necessary, for subsequent values stored. @code{mpz_init2}
! 2593: makes it possible to avoid such reallocations if a maximum size is known in
! 2594: advance.
! 2595: @end deftypefun
! 2596:
1.1 maekawa 2597: @deftypefun void mpz_clear (mpz_t @var{integer})
1.1.1.4 ! ohara 2598: Free the space occupied by @var{integer}. Call this function for all
! 2599: @code{mpz_t} variables when you are done with them.
1.1 maekawa 2600: @end deftypefun
2601:
1.1.1.4 ! ohara 2602: @deftypefun void mpz_realloc2 (mpz_t @var{integer}, unsigned long @var{n})
! 2603: Change the space allocated for @var{integer} to @var{n} bits. The value in
! 2604: @var{integer} is preserved if it fits, or is set to 0 if not.
! 2605:
! 2606: This function can be used to increase the space for a variable in order to
! 2607: avoid repeated automatic reallocations, or to decrease it to give memory back
! 2608: to the heap.
! 2609: @end deftypefun
! 2610:
! 2611: @deftypefun void mpz_array_init (mpz_t @var{integer_array}[], size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
! 2612: This is a special type of initialization. @strong{Fixed} space of
! 2613: @var{fixed_num_bits} bits is allocated to each of the @var{array_size}
! 2614: integers in @var{integer_array}.
! 2615:
! 2616: The space will not be automatically increased, unlike the normal
! 2617: @code{mpz_init}, but instead an application must ensure it's sufficient for
! 2618: any value stored. The following space requirements apply to various
! 2619: functions,
! 2620:
! 2621: @itemize @bullet
! 2622: @item
! 2623: @code{mpz_abs}, @code{mpz_neg}, @code{mpz_set}, @code{mpz_set_si} and
! 2624: @code{mpz_set_ui} need room for the value they store.
! 2625:
! 2626: @item
! 2627: @code{mpz_add}, @code{mpz_add_ui}, @code{mpz_sub} and @code{mpz_sub_ui} need
! 2628: room for the larger of the two operands, plus an extra
! 2629: @code{mp_bits_per_limb}.
1.1 maekawa 2630:
1.1.1.4 ! ohara 2631: @item
! 2632: @code{mpz_mul}, @code{mpz_mul_ui} and @code{mpz_mul_si} need room for the sum
! 2633: of the number of bits in their operands, but each rounded up to a multiple of
! 2634: @code{mp_bits_per_limb}.
! 2635:
! 2636: @item
! 2637: @code{mpz_swap} can be used between two array variables, but not between an
! 2638: array and a normal variable.
! 2639: @end itemize
1.1 maekawa 2640:
1.1.1.4 ! ohara 2641: For other functions, or if in doubt, the suggestion is to calculate in a
! 2642: regular @code{mpz_init} variable and copy the result to an array variable with
! 2643: @code{mpz_set}.
! 2644:
! 2645: @code{mpz_array_init} can reduce memory usage in algorithms that need large
! 2646: arrays of integers, since it avoids allocating and reallocating lots of small
! 2647: memory blocks. There is no way to free the storage allocated by this
! 2648: function. Don't call @code{mpz_clear}!
! 2649: @end deftypefun
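
The space rules listed for @code{mpz_array_init} can be written out as plain arithmetic. The sketch below is an illustration only: the function names are hypothetical, and the limb size is passed in as @code{limb_bits} (standing in for @code{mp_bits_per_limb}) so that the code does not depend on @file{gmp.h}.

```c
/* Round a bit count up to a whole number of limbs.  limb_bits stands
   in for mp_bits_per_limb (an assumption of this sketch).  */
unsigned long
round_up_to_limb (unsigned long bits, unsigned long limb_bits)
{
  return ((bits + limb_bits - 1) / limb_bits) * limb_bits;
}

/* Bits needed for the result of mpz_add or mpz_sub: the larger of the
   two operands, plus one extra limb.  */
unsigned long
array_bits_for_add (unsigned long a_bits, unsigned long b_bits,
                    unsigned long limb_bits)
{
  unsigned long larger = a_bits > b_bits ? a_bits : b_bits;
  return larger + limb_bits;
}

/* Bits needed for the result of mpz_mul: the sum of the operand
   sizes, each rounded up to a multiple of the limb size.  */
unsigned long
array_bits_for_mul (unsigned long a_bits, unsigned long b_bits,
                    unsigned long limb_bits)
{
  return round_up_to_limb (a_bits, limb_bits)
    + round_up_to_limb (b_bits, limb_bits);
}
```

For example, with 32-bit limbs, adding a 100-bit and a 70-bit value calls for 132 bits, and multiplying them calls for 128 + 96 = 224 bits.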
! 2650:
! 2651: @deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
! 2652: Change the space for @var{integer} to @var{new_alloc} limbs. The value in
! 2653: @var{integer} is preserved if it fits, or is set to 0 if not. The return
! 2654: value is not useful to applications and should be ignored.
! 2655:
! 2656: @code{mpz_realloc2} is the preferred way to accomplish allocation changes like
! 2657: this. @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
! 2658: @code{_mpz_realloc} takes the new size in limbs.
1.1 maekawa 2659: @end deftypefun
2660:
2661:
2662: @node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
2663: @comment node-name, next, previous, up
1.1.1.2 maekawa 2664: @section Assignment Functions
1.1 maekawa 2665: @cindex Integer assignment functions
1.1.1.2 maekawa 2666: @cindex Assignment functions
1.1 maekawa 2667:
2668: These functions assign new values to already initialized integers
2669: (@pxref{Initializing Integers}).
2670:
2671: @deftypefun void mpz_set (mpz_t @var{rop}, mpz_t @var{op})
2672: @deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
2673: @deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
2674: @deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
2675: @deftypefunx void mpz_set_q (mpz_t @var{rop}, mpq_t @var{op})
2676: @deftypefunx void mpz_set_f (mpz_t @var{rop}, mpf_t @var{op})
2677: Set the value of @var{rop} from @var{op}.
1.1.1.4 ! ohara 2678:
! 2679: @code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
! 2680: make it an integer.
1.1 maekawa 2681: @end deftypefun
2682:
2683: @deftypefun int mpz_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
1.1.1.4 ! ohara 2684: Set the value of @var{rop} from @var{str}, a null-terminated C string in base
1.1 maekawa 2685: @var{base}. White space is allowed in the string, and is simply ignored. The
2686: base may vary from 2 to 36. If @var{base} is 0, the actual base is determined
1.1.1.4 ! ohara 2687: from the leading characters: if the first two characters are ``0x'' or ``0X'',
! 2688: hexadecimal is assumed, otherwise if the first character is ``0'', octal is
1.1 maekawa 2689: assumed, otherwise decimal is assumed.
2690:
1.1.1.4 ! ohara 2691: This function returns 0 if the entire string is a valid number in base
! 2692: @var{base}. Otherwise it returns @minus{}1.
1.1.1.2 maekawa 2693:
2694: [It turns out that it is not entirely true that this function ignores
1.1.1.4 ! ohara 2695: white-space. It does ignore it between digits, but not after a minus sign or
! 2696: within or after ``0x''. We are considering changing the definition of this
! 2697: function, making it fail when there is any white-space in the input, since
! 2698: that makes a lot of sense. Send your opinion of this change to
! 2699: @email{bug-gmp@@gnu.org}. Do you really want it to accept @nicode{"3 14"} as
! 2700: meaning 314 as it does now?]
1.1.1.2 maekawa 2701: @end deftypefun
2702:
2703: @deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
2704: Swap the values @var{rop1} and @var{rop2} efficiently.
1.1 maekawa 2705: @end deftypefun
2706:
2707:
2708: @node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
2709: @comment node-name, next, previous, up
1.1.1.2 maekawa 2710: @section Combined Initialization and Assignment Functions
1.1 maekawa 2711: @cindex Initialization and assignment functions
1.1.1.2 maekawa 2712: @cindex Integer init and assign
1.1 maekawa 2713:
1.1.1.2 maekawa 2714: For convenience, GMP provides a parallel series of initialize-and-set functions
1.1 maekawa 2715: which initialize the output and then store the value there. These functions'
2716: names have the form @code{mpz_init_set@dots{}}
2717:
2718: Here is an example of using one:
2719:
2720: @example
2721: @{
2722: mpz_t pie;
2723: mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
2724: @dots{}
2725: mpz_sub (pie, @dots{});
2726: @dots{}
2727: mpz_clear (pie);
2728: @}
2729: @end example
2730:
2731: @noindent
2732: Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
2733: functions, it can be used as the source or destination operand for the ordinary
2734: integer functions. Don't use an initialize-and-set function on a variable
2735: already initialized!
2736:
2737: @deftypefun void mpz_init_set (mpz_t @var{rop}, mpz_t @var{op})
2738: @deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
2739: @deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
2740: @deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
2741: Initialize @var{rop} with limb space and set the initial numeric value from
2742: @var{op}.
2743: @end deftypefun
2744:
2745: @deftypefun int mpz_init_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
2746: Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
2747: documentation above for details).
2748:
2749: If the string is a correct base @var{base} number, the function returns 0;
2750: if an error occurs it returns @minus{}1. @var{rop} is initialized even if
2751: an error occurs. (I.e., you have to call @code{mpz_clear} for it.)
2752: @end deftypefun
2753:
2754:
1.1.1.2 maekawa 2755: @node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
1.1 maekawa 2756: @comment node-name, next, previous, up
2757: @section Conversion Functions
2758: @cindex Integer conversion functions
2759: @cindex Conversion functions
2760:
1.1.1.2 maekawa 2761: This section describes functions for converting GMP integers to standard C
2762: types. Functions for converting @emph{to} GMP integers are described in
2763: @ref{Assigning Integers} and @ref{I/O of Integers}.
2764:
1.1 maekawa 2765: @deftypefun {unsigned long int} mpz_get_ui (mpz_t @var{op})
1.1.1.4 ! ohara 2766: Return the value of @var{op} as an @code{unsigned long}.
! 2767:
! 2768: If @var{op} is too big to fit an @code{unsigned long} then just the least
! 2769: significant bits that do fit are returned. The sign of @var{op} is ignored,
! 2770: only the absolute value is used.
1.1 maekawa 2771: @end deftypefun
2772:
2773: @deftypefun {signed long int} mpz_get_si (mpz_t @var{op})
2774: If @var{op} fits into a @code{signed long int} return the value of @var{op}.
2775: Otherwise return the least significant part of @var{op}, with the same sign
2776: as @var{op}.
2777:
1.1.1.4 ! ohara 2778: If @var{op} is too big to fit in a @code{signed long int}, the returned
1.1.1.2 maekawa 2779: result is probably not very useful. To find out if the value will fit, use
2780: the function @code{mpz_fits_slong_p}.
1.1 maekawa 2781: @end deftypefun
2782:
2783: @deftypefun double mpz_get_d (mpz_t @var{op})
1.1.1.4 ! ohara 2784: Convert @var{op} to a @code{double}.
! 2785: @end deftypefun
! 2786:
! 2787: @deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, mpz_t @var{op})
! 2788: Find @var{d} and @var{exp} such that @m{@var{d}\times 2^{exp}, @var{d} times 2
! 2789: raised to @var{exp}}, with @math{0.5@le{}@GMPabs{@var{d}}<1}, is a good
! 2790: approximation to @var{op}.
1.1 maekawa 2791: @end deftypefun
2792:
2793: @deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, mpz_t @var{op})
2794: Convert @var{op} to a string of digits in base @var{base}. The base may vary
2795: from 2 to 36.
2796:
1.1.1.4 ! ohara 2797: If @var{str} is @code{NULL}, the result string is allocated using the current
! 2798: allocation function (@pxref{Custom Allocation}). The block will be
! 2799: @code{strlen(str)+1} bytes, that being exactly enough for the string and
! 2800: null-terminator.
! 2801:
! 2802: If @var{str} is not @code{NULL}, it should point to a block of storage large
! 2803: enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
! 2804: + 2}. The two extra bytes are for a possible minus sign, and the
! 2805: null-terminator.
! 2806:
! 2807: A pointer to the result string is returned, being either the allocated block,
! 2808: or the given @var{str}.
! 2809: @end deftypefun
1.1 maekawa 2810:
1.1.1.4 ! ohara 2811: @deftypefun mp_limb_t mpz_getlimbn (mpz_t @var{op}, mp_size_t @var{n})
! 2812: Return limb number @var{n} from @var{op}. The sign of @var{op} is ignored,
! 2813: just the absolute value is used. The least significant limb is number 0.
1.1.1.2 maekawa 2814:
1.1.1.4 ! ohara 2815: @code{mpz_size} can be used to find how many limbs make up @var{op}.
! 2816: @code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
! 2817: @code{mpz_size(@var{op})-1}.
1.1 maekawa 2818: @end deftypefun
2819:
2820:
1.1.1.2 maekawa 2821: @need 2000
2822: @node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
1.1 maekawa 2823: @comment node-name, next, previous, up
2824: @section Arithmetic Functions
2825: @cindex Integer arithmetic functions
2826: @cindex Arithmetic functions
2827:
2828: @deftypefun void mpz_add (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2829: @deftypefunx void mpz_add_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
1.1.1.4 ! ohara 2830: Set @var{rop} to @math{@var{op1} + @var{op2}}.
1.1 maekawa 2831: @end deftypefun
2832:
2833: @deftypefun void mpz_sub (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
2834: @deftypefunx void mpz_sub_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
1.1.1.4 ! ohara 2835: @deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, mpz_t @var{op2})
1.1 maekawa 2836: Set @var{rop} to @var{op1} @minus{} @var{op2}.
2837: @end deftypefun
2838:
2839: @deftypefun void mpz_mul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
1.1.1.2 maekawa 2840: @deftypefunx void mpz_mul_si (mpz_t @var{rop}, mpz_t @var{op1}, long int @var{op2})
1.1 maekawa 2841: @deftypefunx void mpz_mul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
1.1.1.4 ! ohara 2842: Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
1.1.1.2 maekawa 2843: @end deftypefun
2844:
1.1.1.4 ! ohara 2845: @deftypefun void mpz_addmul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
! 2846: @deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
! 2847: Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
! 2848: @end deftypefun
! 2849:
! 2850: @deftypefun void mpz_submul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
! 2851: @deftypefunx void mpz_submul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
! 2852: Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
1.1 maekawa 2853: @end deftypefun
2854:
2855: @deftypefun void mpz_mul_2exp (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
1.1.1.2 maekawa 2856: @cindex Bit shift left
1.1.1.4 ! ohara 2857: Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
! 2858: @var{op2}}. This operation can also be defined as a left shift by @var{op2}
! 2859: bits.
1.1 maekawa 2860: @end deftypefun
2861:
2862: @deftypefun void mpz_neg (mpz_t @var{rop}, mpz_t @var{op})
2863: Set @var{rop} to @minus{}@var{op}.
2864: @end deftypefun
2865:
2866: @deftypefun void mpz_abs (mpz_t @var{rop}, mpz_t @var{op})
2867: Set @var{rop} to the absolute value of @var{op}.
2868: @end deftypefun
2869:
2870:
1.1.1.2 maekawa 2871: @need 2000
2872: @node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
2873: @section Division Functions
2874: @cindex Integer division functions
2875: @cindex Division functions
1.1 maekawa 2876:
1.1.1.4 ! ohara 2877: Division is undefined if the divisor is zero. Passing a zero divisor to the
! 2878: division or modulo functions (including the modular powering functions
! 2879: @code{mpz_powm} and @code{mpz_powm_ui}) will cause an intentional division by
! 2880: zero. This lets a program handle arithmetic exceptions in these functions the
! 2881: same way as for normal C @code{int} arithmetic.
! 2882:
! 2883: @c Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
! 2884: @c between each, and seem to let tex do a better job of page breaks than an
! 2885: @c @sp 1 in the middle of one big set.
! 2886:
! 2887: @deftypefun void mpz_cdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
! 2888: @deftypefunx void mpz_cdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
! 2889: @deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
! 2890: @maybepagebreak
! 2891: @deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2892: @deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2893: @deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
! 2894: @deftypefunx {unsigned long int} mpz_cdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2895: @maybepagebreak
! 2896: @deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}})
! 2897: @deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}})
! 2898: @end deftypefun
! 2899:
! 2900: @deftypefun void mpz_fdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
! 2901: @deftypefunx void mpz_fdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
! 2902: @deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
! 2903: @maybepagebreak
! 2904: @deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2905: @deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2906: @deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
! 2907: @deftypefunx {unsigned long int} mpz_fdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2908: @maybepagebreak
! 2909: @deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}})
! 2910: @deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}})
! 2911: @end deftypefun
! 2912:
! 2913: @deftypefun void mpz_tdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
! 2914: @deftypefunx void mpz_tdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
! 2915: @deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
! 2916: @maybepagebreak
! 2917: @deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2918: @deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2919: @deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
! 2920: @deftypefunx {unsigned long int} mpz_tdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2921: @maybepagebreak
! 2922: @deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}})
! 2923: @deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}})
! 2924: @cindex Bit shift right
! 2925:
! 2926: @sp 1
! 2927: Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
! 2928: @var{r}. For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
! 2929: The rounding is in three styles, each suiting different applications.
1.1 maekawa 2930:
2931: @itemize @bullet
2932: @item
1.1.1.4 ! ohara 2933: @code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
! 2934: have the opposite sign to @var{d}. The @code{c} stands for ``ceil''.
! 2935:
1.1 maekawa 2936: @item
1.1.1.4 ! ohara 2937: @code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
! 2938: @var{r} will have the same sign as @var{d}. The @code{f} stands for
! 2939: ``floor''.
! 2940:
1.1.1.2 maekawa 2941: @item
1.1.1.4 ! ohara 2942: @code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
! 2943: as @var{n}. The @code{t} stands for ``truncate''.
1.1 maekawa 2944: @end itemize
2945:
1.1.1.4 ! ohara 2946: In all cases @var{q} and @var{r} will satisfy
! 2947: @m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
! 2948: @var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.
! 2949:
! 2950: The @code{q} functions calculate only the quotient, the @code{r} functions
! 2951: only the remainder, and the @code{qr} functions calculate both. Note that for
! 2952: @code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
! 2953: results will be unpredictable.
! 2954:
! 2955: For the @code{ui} variants the return value is the remainder, and in fact
! 2956: returning the remainder is all the @code{div_ui} functions do. For
! 2957: @code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
! 2958: return value is the absolute value of the remainder.
! 2959:
! 2960: The @code{2exp} functions are right shifts and bit masks, but of course
! 2961: rounding the same as the other functions. For positive @var{n} both
! 2962: @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp} are simple bitwise right
! 2963: shifts. For negative @var{n}, @code{mpz_fdiv_q_2exp} is effectively an
! 2964: arithmetic right shift treating @var{n} as twos complement the same as the
! 2965: bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp} effectively
! 2966: treats @var{n} as sign and magnitude.
! 2967: @end deftypefun
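
The three rounding styles can be illustrated with ordinary C @code{long}
arithmetic. The sketch below is not GMP code: the names @code{divres},
@code{trunc_div}, @code{floor_div} and @code{ceil_div} are hypothetical, and
it assumes C99 semantics, where @code{/} and @code{%} truncate towards zero.

```c
/* Quotient and remainder of n / d under one rounding style.
   Plain C long arithmetic, for illustration only; d must be non-zero. */
typedef struct { long q; long r; } divres;

/* Truncating division: q rounds towards zero, r has the sign of n.
   This is what C99's / and % already do. */
divres trunc_div(long n, long d) {
    divres res;
    res.q = n / d;
    res.r = n % d;
    return res;
}

/* Floor division: q rounds towards -infinity, r has the sign of d. */
divres floor_div(long n, long d) {
    divres res = trunc_div(n, d);
    if (res.r != 0 && ((res.r < 0) != (d < 0))) {
        res.q -= 1;
        res.r += d;
    }
    return res;
}

/* Ceiling division: q rounds towards +infinity, r has the opposite sign to d. */
divres ceil_div(long n, long d) {
    divres res = trunc_div(n, d);
    if (res.r != 0 && ((res.r < 0) == (d < 0))) {
        res.q += 1;
        res.r -= d;
    }
    return res;
}
```

In every case the pair returned satisfies @math{n = q*d + r}; only the
direction in which @math{q} is rounded differs.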
1.1.1.2 maekawa 2968:
1.1.1.4 ! ohara 2969: @deftypefun void mpz_mod (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
! 2970: @deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
! 2971: Set @var{r} to @var{n} @code{mod} @var{d}. The sign of the divisor is
! 2972: ignored; the result is always non-negative.
! 2973:
! 2974: @code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
! 2975: remainder as well as setting @var{r}. See @code{mpz_fdiv_ui} above if only
! 2976: the return value is wanted.
1.1 maekawa 2977: @end deftypefun
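
The rule that the sign of the divisor is ignored and the result is never
negative can be sketched on plain C @code{long} values. The helper name
@code{mod_nonneg} is hypothetical; this is an analogue of the behaviour
described above, not GMP's implementation.

```c
/* Non-negative remainder of n modulo |d|; d must be non-zero. */
long mod_nonneg(long n, long d) {
    long m = d < 0 ? -d : d;   /* the sign of the divisor is ignored */
    long r = n % m;            /* C's % may give a negative r when n < 0 */
    return r < 0 ? r + m : r;
}
```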
2978:
1.1.1.2 maekawa 2979: @deftypefun void mpz_divexact (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
1.1.1.4 ! ohara 2980: @deftypefunx void mpz_divexact_ui (mpz_t @var{q}, mpz_t @var{n}, unsigned long @var{d})
1.1.1.2 maekawa 2981: @cindex Exact division functions
1.1.1.4 ! ohara 2982: Set @var{q} to @var{n}/@var{d}. These functions produce correct results only
1.1.1.2 maekawa 2983: when it is known in advance that @var{d} divides @var{n}.
1.1 maekawa 2984:
1.1.1.4 ! ohara 2985: These routines are much faster than the other division functions, and are the
! 2986: best choice when exact division is known to occur, for example reducing a
! 2987: rational to lowest terms.
! 2988: @end deftypefun
! 2989:
! 2990: @deftypefun int mpz_divisible_p (mpz_t @var{n}, mpz_t @var{d})
! 2991: @deftypefunx int mpz_divisible_ui_p (mpz_t @var{n}, unsigned long int @var{d})
! 2992: @deftypefunx int mpz_divisible_2exp_p (mpz_t @var{n}, unsigned long int @var{b})
! 2993: Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of
! 2994: @code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}.
! 2995: @end deftypefun
! 2996:
! 2997: @deftypefun int mpz_congruent_p (mpz_t @var{n}, mpz_t @var{c}, mpz_t @var{d})
! 2998: @deftypefunx int mpz_congruent_ui_p (mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d})
! 2999: @deftypefunx int mpz_congruent_2exp_p (mpz_t @var{n}, mpz_t @var{c}, unsigned long int @var{b})
! 3000: Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the
! 3001: case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}.
1.1 maekawa 3002: @end deftypefun
3003:
1.1.1.2 maekawa 3004:
3005: @need 2000
3006: @node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions
3007: @section Exponentiation Functions
3008: @cindex Integer exponentiation functions
3009: @cindex Exponentiation functions
1.1.1.4 ! ohara 3010: @cindex Powering functions
1.1 maekawa 3011:
3012: @deftypefun void mpz_powm (mpz_t @var{rop}, mpz_t @var{base}, mpz_t @var{exp}, mpz_t @var{mod})
3013: @deftypefunx void mpz_powm_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp}, mpz_t @var{mod})
1.1.1.4 ! ohara 3014: Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
! 3015: modulo @var{mod}}.
1.1.1.2 maekawa 3016:
1.1.1.4 ! ohara 3017: Negative @var{exp} is supported if an inverse @math{@var{base}^@W{-1} @bmod
! 3018: @var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}).
! 3019: If an inverse doesn't exist then a divide by zero is raised.
1.1 maekawa 3020: @end deftypefun
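
As a rough picture of what modular powering computes, the following sketch
uses the textbook binary square-and-multiply method on 64-bit integers. It is
not claimed to be the algorithm GMP uses; the name @code{powmod_u64} is
hypothetical, and the modulus is assumed to be below @math{2^32} so that no
product overflows. Negative exponents are not handled.

```c
#include <stdint.h>

/* (base^exp) mod m by binary square-and-multiply.
   Assumes 0 < m < 2^32 so every product fits in 64 bits. */
uint64_t powmod_u64(uint64_t base, uint64_t exp, uint64_t m) {
    uint64_t result = 1 % m;   /* gives 0 when m == 1 */
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}
```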
3021:
3022: @deftypefun void mpz_pow_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp})
3023: @deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp})
1.1.1.4 ! ohara 3024: Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}. The case
! 3025: @math{0^0} yields 1.
1.1 maekawa 3026: @end deftypefun
3027:
1.1.1.2 maekawa 3028:
3029: @need 2000
3030: @node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions
3031: @section Root Extraction Functions
3032: @cindex Integer root functions
3033: @cindex Root extraction functions
3034:
3035: @deftypefun int mpz_root (mpz_t @var{rop}, mpz_t @var{op}, unsigned long int @var{n})
1.1.1.4 ! ohara 3036: Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer
! 3037: part of the @var{n}th root of @var{op}. Return non-zero if the computation
! 3038: was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power.
1.1.1.2 maekawa 3039: @end deftypefun
1.1 maekawa 3040:
3041: @deftypefun void mpz_sqrt (mpz_t @var{rop}, mpz_t @var{op})
1.1.1.4 ! ohara 3042: Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated
! 3043: integer part of the square root of @var{op}.
1.1 maekawa 3044: @end deftypefun
3045:
3046: @deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, mpz_t @var{op})
1.1.1.4 ! ohara 3047: Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
! 3048: of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{rop2} to the
! 3049: remainder @m{(@var{op} - @var{rop1}^2),
! 3050: @var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a
! 3051: perfect square.
1.1 maekawa 3052:
3053: If @var{rop1} and @var{rop2} are the same variable, the results are
3054: undefined.
3055: @end deftypefun
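
The relationship between the truncated root and the remainder can be sketched
with machine integers. The helper @code{isqrt_rem} below is hypothetical and
deliberately naive (a linear search); it only illustrates the definition and
says nothing about how GMP computes roots.

```c
/* Floor of the square root of op, with remainder op - root*root. */
void isqrt_rem(unsigned long op, unsigned long *root, unsigned long *rem) {
    unsigned long r = 0;
    /* Advance while (r+1)^2 <= op, written as a division to avoid overflow.
       A linear search is fine for a sketch but far too slow for big inputs. */
    while ((r + 1) <= op / (r + 1))
        r++;
    *root = r;
    *rem = op - r * r;   /* zero exactly when op is a perfect square */
}
```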
3056:
1.1.1.2 maekawa 3057: @deftypefun int mpz_perfect_power_p (mpz_t @var{op})
3058: Return non-zero if @var{op} is a perfect power, i.e., if there exist integers
1.1.1.4 ! ohara 3059: @m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that
! 3060: @m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}.
! 3061:
! 3062: Under this definition both 0 and 1 are considered to be perfect powers.
! 3063: Negative values of @var{op} are accepted, but of course can only be odd
! 3064: perfect powers.
1.1.1.2 maekawa 3065: @end deftypefun
3066:
1.1 maekawa 3067: @deftypefun int mpz_perfect_square_p (mpz_t @var{op})
3068: Return non-zero if @var{op} is a perfect square, i.e., if the square root of
1.1.1.4 ! ohara 3069: @var{op} is an integer. Under this definition both 0 and 1 are considered to
! 3070: be perfect squares.
1.1 maekawa 3071: @end deftypefun
3072:
3073:
1.1.1.2 maekawa 3074: @need 2000
3075: @node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions
3076: @section Number Theoretic Functions
3077: @cindex Number theoretic functions
3078:
3079: @deftypefun int mpz_probab_prime_p (mpz_t @var{n}, int @var{reps})
3080: @cindex Prime testing functions
1.1.1.4 ! ohara 3081: Determine whether @var{n} is prime. Return 2 if @var{n} is definitely prime,
! 3082: return 1 if @var{n} is probably prime (without being certain), or return 0 if
! 3083: @var{n} is definitely composite.
! 3084:
! 3085: This function does some trial divisions, then some Miller-Rabin probabilistic
! 3086: primality tests. @var{reps} controls how many such tests are done; 5 to 10 is
! 3087: a reasonable number, and more will reduce the chances of a composite being
! 3088: returned as ``probably prime''.
! 3089:
! 3090: Miller-Rabin and similar tests can be more properly called compositeness
! 3091: tests. Numbers which fail are known to be composite but those which pass
! 3092: might be prime or might be composite. Only a few composites pass, hence those
! 3093: which pass are considered probably prime.
1.1 maekawa 3094: @end deftypefun
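
The idea of trial division followed by a number of Miller-Rabin rounds can be
sketched for 32-bit inputs as below. This is an illustration under stated
assumptions, not GMP's implementation: the names @code{mr_powmod},
@code{mr_round} and @code{probably_prime} are hypothetical, the witnesses are
a fixed list of small primes rather than whatever GMP selects, and the
``definitely prime'' return value 2 is not modelled.

```c
#include <stdint.h>

/* (b^e) mod m for m < 2^32, so products fit in 64 bits. */
uint64_t mr_powmod(uint64_t b, uint64_t e, uint64_t m) {
    uint64_t result = 1 % m;
    b %= m;
    while (e > 0) {
        if (e & 1)
            result = (result * b) % m;
        b = (b * b) % m;
        e >>= 1;
    }
    return result;
}

/* One Miller-Rabin round with witness a.  Returns 0 if a proves n
   composite, 1 if n passes.  Requires n odd, n > 3, 1 < a < n - 1. */
int mr_round(uint32_t n, uint32_t a) {
    uint64_t d = (uint64_t) n - 1, x;
    int s = 0, i;
    while ((d & 1) == 0) { d >>= 1; s++; }   /* n - 1 = d * 2^s, d odd */
    x = mr_powmod(a, d, n);
    if (x == 1 || x == (uint64_t) n - 1)
        return 1;
    for (i = 1; i < s; i++) {
        x = (x * x) % n;
        if (x == (uint64_t) n - 1)
            return 1;
    }
    return 0;
}

/* 0 = definitely composite, 1 = probably prime (at most 10 rounds). */
int probably_prime(uint32_t n, int reps) {
    static const uint32_t small[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29};
    int i;
    if (n < 2)
        return 0;
    for (i = 0; i < 10; i++) {               /* trial division */
        if (n == small[i]) return 1;
        if (n % small[i] == 0) return 0;
    }
    for (i = 0; i < reps && i < 10; i++)     /* compositeness tests */
        if (!mr_round(n, small[i]))
            return 0;
    return 1;
}
```

The composite @math{4033 = 37*109} passes a single round with witness 2 but
is rejected once a second round with witness 3 is added, which shows why a
larger @var{reps} lowers the chance of a composite slipping through.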
3095:
1.1.1.4 ! ohara 3096: @deftypefun void mpz_nextprime (mpz_t @var{rop}, mpz_t @var{op})
1.1.1.2 maekawa 3097: Set @var{rop} to the next prime greater than @var{op}.
3098:
1.1.1.4 ! ohara 3099: This function uses a probabilistic algorithm to identify primes. For
! 3100: practical purposes it's adequate; the chance of a composite passing will be
! 3101: extremely small.
1.1.1.2 maekawa 3102: @end deftypefun
3103:
3104: @c mpz_prime_p not implemented as of gmp 3.0.
3105:
3106: @c @deftypefun int mpz_prime_p (mpz_t @var{n})
3107: @c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
3108: @c This function is far slower than @code{mpz_probab_prime_p}, but then it
3109: @c never returns non-zero for composite numbers.
3110:
3111: @c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
3112: @c The likelihood of a programming error or hardware malfunction is orders
3113: @c of magnitudes greater than the likelihood for a composite to pass as a
3114: @c prime, if the @var{reps} argument is in the suggested range.)
3115: @c @end deftypefun
3116:
1.1 maekawa 3117: @deftypefun void mpz_gcd (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
1.1.1.2 maekawa 3118: @cindex Greatest common divisor functions
1.1 maekawa 3119: Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}.
1.1.1.4 ! ohara 3120: The result is always positive even if one or both input operands
1.1.1.2 maekawa 3121: are negative.
1.1 maekawa 3122: @end deftypefun
3123:
3124: @deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
3125: Compute the greatest common divisor of @var{op1} and @var{op2}. If
1.1.1.2 maekawa 3126: @var{rop} is not @code{NULL}, store the result there.
1.1 maekawa 3127:
3128: If the result is small enough to fit in an @code{unsigned long int}, it is
3129: returned. If the result does not fit, 0 is returned, and the result is equal
3130: to the argument @var{op1}. Note that the result will always fit if @var{op2}
3131: is non-zero.
3132: @end deftypefun
3133:
3134: @deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, mpz_t @var{a}, mpz_t @var{b})
1.1.1.2 maekawa 3135: @cindex Extended GCD
1.1.1.4 ! ohara 3136: Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in
! 3137: addition set @var{s} and @var{t} to coefficients satisfying
! 3138: @math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}.
! 3139: @var{g} is always positive, even if one or both of @var{a} and @var{b} are
! 3140: negative.
! 3141:
! 3142: If @var{t} is @code{NULL} then that value is not computed.
1.1.1.2 maekawa 3143: @end deftypefun
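
The identity satisfied by the cofactors can be checked with a small extended
Euclidean algorithm on C @code{long} values. The function name
@code{gcdext_long} is hypothetical and the code is a textbook sketch; the
particular @var{s} and @var{t} it produces need not match the ones GMP
returns, since the coefficients are not unique.

```c
/* Extended Euclid: returns g = gcd(a, b) >= 0 and sets *s, *t
   so that a*(*s) + b*(*t) == g. */
long gcdext_long(long a, long b, long *s, long *t) {
    long old_r = a, r = b;
    long old_s = 1, cur_s = 0;
    long old_t = 0, cur_t = 1;
    while (r != 0) {
        long q = old_r / r;
        long tmp;
        /* Each row keeps the invariant a*s + b*t == r. */
        tmp = old_r - q * r;     old_r = r;     r = tmp;
        tmp = old_s - q * cur_s; old_s = cur_s; cur_s = tmp;
        tmp = old_t - q * cur_t; old_t = cur_t; cur_t = tmp;
    }
    if (old_r < 0) {             /* normalise so the gcd is non-negative */
        old_r = -old_r;
        old_s = -old_s;
        old_t = -old_t;
    }
    *s = old_s;
    *t = old_t;
    return old_r;
}
```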
3144:
3145: @deftypefun void mpz_lcm (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
1.1.1.4 ! ohara 3146: @deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long @var{op2})
1.1.1.2 maekawa 3147: @cindex Least common multiple functions
3148: Set @var{rop} to the least common multiple of @var{op1} and @var{op2}.
1.1.1.4 ! ohara 3149: @var{rop} is always positive, irrespective of the signs of @var{op1} and
! 3150: @var{op2}. @var{rop} will be zero if either @var{op1} or @var{op2} is zero.
1.1 maekawa 3151: @end deftypefun
3152:
3153: @deftypefun int mpz_invert (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
1.1.1.2 maekawa 3154: @cindex Modular inverse functions
1.1 maekawa 3155: Compute the inverse of @var{op1} modulo @var{op2} and put the result in
1.1.1.4 ! ohara 3156: @var{rop}. If the inverse exists, the return value is non-zero and @var{rop}
! 3157: will satisfy @math{0 @le{} @var{rop} < @var{op2}}. If an inverse doesn't exist
! 3158: the return value is zero and @var{rop} is undefined.
1.1 maekawa 3159: @end deftypefun
3160:
1.1.1.4 ! ohara 3161: @deftypefun int mpz_jacobi (mpz_t @var{a}, mpz_t @var{b})
! 3162: @cindex Jacobi symbol functions
! 3163: Calculate the Jacobi symbol @m{\left(a \over b\right),
! 3164: (@var{a}/@var{b})}. This is defined only for @var{b} odd.
1.1.1.2 maekawa 3165: @end deftypefun
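
For readers unfamiliar with the Jacobi symbol, the following sketch computes
it for machine integers with the usual binary algorithm based on quadratic
reciprocity. The name @code{jacobi_long} is hypothetical, @var{n} is assumed
odd and positive, and nothing here is claimed about GMP's own method.

```c
/* Jacobi symbol (a/n) for odd n > 0: returns -1, 0 or +1. */
int jacobi_long(long a, long n) {
    int t = 1;
    a %= n;
    if (a < 0)
        a += n;
    while (a != 0) {
        while (a % 2 == 0) {          /* pull out factors of 2 */
            long r = n % 8;
            a /= 2;
            if (r == 3 || r == 5)     /* (2/n) = -1 when n = 3 or 5 mod 8 */
                t = -t;
        }
        { long tmp = a; a = n; n = tmp; }   /* reciprocity: swap a and n */
        if (a % 4 == 3 && n % 4 == 3)
            t = -t;
        a %= n;
    }
    return n == 1 ? t : 0;            /* 0 when a and n share a factor */
}
```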
3166:
1.1.1.4 ! ohara 3167: @deftypefun int mpz_legendre (mpz_t @var{a}, mpz_t @var{p})
! 3168: Calculate the Legendre symbol @m{\left(a \over p\right),
! 3169: (@var{a}/@var{p})}. This is defined only for @var{p} an odd positive
! 3170: prime, and for such @var{p} it's identical to the Jacobi symbol.
! 3171: @end deftypefun
! 3172:
! 3173: @deftypefun int mpz_kronecker (mpz_t @var{a}, mpz_t @var{b})
! 3174: @deftypefunx int mpz_kronecker_si (mpz_t @var{a}, long @var{b})
! 3175: @deftypefunx int mpz_kronecker_ui (mpz_t @var{a}, unsigned long @var{b})
! 3176: @deftypefunx int mpz_si_kronecker (long @var{a}, mpz_t @var{b})
! 3177: @deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, mpz_t @var{b})
1.1.1.2 maekawa 3178: @cindex Kronecker symbol functions
1.1.1.4 ! ohara 3179: Calculate the Jacobi symbol @m{\left(a \over b\right),
! 3180: (@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
! 3181: 2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} is odd, or
! 3182: @m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} is even.
! 3183:
! 3184: When @var{b} is odd the Jacobi symbol and Kronecker symbol are
! 3185: identical, so @code{mpz_kronecker_ui} etc.@: can be used for mixed
! 3186: precision Jacobi symbols too.
! 3187:
! 3188: For more information see Henri Cohen section 1.4.2 (@pxref{References}),
! 3189: or any number theory textbook. See also the example program
! 3190: @file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
1.1.1.2 maekawa 3191: @end deftypefun
3192:
3193: @deftypefun {unsigned long int} mpz_remove (mpz_t @var{rop}, mpz_t @var{op}, mpz_t @var{f})
3194: Remove all occurrences of the factor @var{f} from @var{op} and store the
1.1.1.4 ! ohara 3195: result in @var{rop}. The return value is how many such occurrences were
! 3196: removed.
1.1 maekawa 3197: @end deftypefun
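
The meaning of the return value can be seen in a machine-integer analogue.
The helper @code{remove_factor} is hypothetical; it assumes @var{op} is
non-zero and that the absolute value of @var{f} is greater than 1, so that
the loop terminates.

```c
/* Store op with every factor f divided out in *rop;
   return how many times f was removed. */
unsigned long remove_factor(long *rop, long op, long f) {
    unsigned long count = 0;
    while (op % f == 0) {
        op /= f;
        count++;
    }
    *rop = op;
    return count;
}
```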
3198:
1.1.1.2 maekawa 3199: @deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{op})
3200: @cindex Factorial functions
3201: Set @var{rop} to @var{op}!, the factorial of @var{op}.
3202: @end deftypefun
3203:
3204: @deftypefun void mpz_bin_ui (mpz_t @var{rop}, mpz_t @var{n}, unsigned long int @var{k})
3205: @deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
3206: @cindex Binomial coefficient functions
1.1.1.4 ! ohara 3207: Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
! 3208: @var{k}} and store the result in @var{rop}. Negative values of @var{n} are
! 3209: supported by @code{mpz_bin_ui}, using the identity
! 3210: @m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
! 3211: bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6
! 3212: part G.
1.1.1.2 maekawa 3213: @end deftypefun
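
The identity for negative @var{n} can be tried out on small values. In this
sketch the names @code{bin_nonneg} and @code{bin_signed} are hypothetical,
and the arguments are assumed small enough that @code{long} does not
overflow.

```c
/* Binomial coefficient for n >= 0, k >= 0, by the multiplicative formula. */
long bin_nonneg(long n, long k) {
    long result = 1;
    long i;
    if (k > n)
        return 0;
    for (i = 1; i <= k; i++)
        result = result * (n - k + i) / i;   /* division is exact each step */
    return result;
}

/* Extension to negative n via bin(-n,k) = (-1)^k * bin(n+k-1,k). */
long bin_signed(long n, long k) {
    long b;
    if (n >= 0)
        return bin_nonneg(n, k);
    b = bin_nonneg(-n + k - 1, k);
    return (k % 2 == 0) ? b : -b;
}
```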
3214:
1.1.1.4 ! ohara 3215: @deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
! 3216: @deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
1.1.1.2 maekawa 3217: @cindex Fibonacci sequence functions
1.1.1.4 ! ohara 3218: @code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci
! 3219: number. @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
! 3220: @m{F_{n-1},F[n-1]}.
! 3221:
! 3222: These functions are designed for calculating isolated Fibonacci numbers. When
! 3223: a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
! 3224: iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
! 3225: similar.
! 3226: @end deftypefun
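
Iterating the defining recurrence from a starting pair looks like this in
outline. The sketch uses @code{unsigned long} rather than @code{mpz_t}, and
the name @code{fib_sequence} is hypothetical; with GMP the starting pair
would come from @code{mpz_fib2_ui} and the addition from @code{mpz_add}.

```c
/* Fill out[0..count-1] with F[n], F[n+1], ... given the starting pair
   fn = F[n] and fnsub1 = F[n-1]. */
void fib_sequence(unsigned long fn, unsigned long fnsub1,
                  unsigned long *out, int count) {
    int i;
    for (i = 0; i < count; i++) {
        unsigned long next = fn + fnsub1;   /* F[n+1] = F[n] + F[n-1] */
        out[i] = fn;
        fnsub1 = fn;
        fn = next;
    }
}
```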
! 3227:
! 3228: @deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
! 3229: @deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
! 3230: @cindex Lucas number functions
! 3231: @code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas
! 3232: number. @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
! 3233: to @m{L_{n-1},L[n-1]}.
! 3234:
! 3235: These functions are designed for calculating isolated Lucas numbers. When a
! 3236: sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and
! 3237: iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or
! 3238: similar.
! 3239:
! 3240: The Fibonacci numbers and Lucas numbers are related sequences, so it's never
! 3241: necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}. The
! 3242: formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers
! 3243: Algorithm}; the reverse is straightforward too.
1.1.1.2 maekawa 3244: @end deftypefun
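
One way to obtain a Lucas pair from a Fibonacci pair uses the standard
identities @math{L[n] = F[n] + 2F[n-1]} and @math{L[n-1] = 2F[n] - F[n-1]}.
The sketch below applies them to @code{unsigned long} values; the name
@code{lucas_from_fib} is hypothetical, @math{n >= 1} is assumed, and no claim
is made that these are the exact formulas used inside GMP.

```c
/* Given fn = F[n] and fnsub1 = F[n-1] with n >= 1,
   compute L[n] and L[n-1]. */
void lucas_from_fib(unsigned long fn, unsigned long fnsub1,
                    unsigned long *ln, unsigned long *lnsub1) {
    *ln = fn + 2 * fnsub1;        /* L[n]   = F[n] + 2*F[n-1] */
    *lnsub1 = 2 * fn - fnsub1;    /* L[n-1] = 2*F[n] - F[n-1] */
}
```

Going the other way, @math{5F[n] = L[n] + 2L[n-1]} follows from the same two
identities.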
3245:
3246:
3247: @node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions
1.1 maekawa 3248: @comment node-name, next, previous, up
3249: @section Comparison Functions
1.1.1.2 maekawa 3250: @cindex Integer comparison functions
3251: @cindex Comparison functions
1.1 maekawa 3252:
1.1.1.4 ! ohara 3253: @deftypefn Function int mpz_cmp (mpz_t @var{op1}, mpz_t @var{op2})
! 3254: @deftypefnx Function int mpz_cmp_d (mpz_t @var{op1}, double @var{op2})
1.1 maekawa 3255: @deftypefnx Macro int mpz_cmp_si (mpz_t @var{op1}, signed long int @var{op2})
1.1.1.4 ! ohara 3256: @deftypefnx Macro int mpz_cmp_ui (mpz_t @var{op1}, unsigned long int @var{op2})
! 3257: Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
! 3258: @var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if
! 3259: @math{@var{op1} < @var{op2}}.
1.1 maekawa 3260:
1.1.1.4 ! ohara 3261: Note that @code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate
! 3262: their arguments more than once.
1.1 maekawa 3263: @end deftypefn
3264:
1.1.1.4 ! ohara 3265: @deftypefn Function int mpz_cmpabs (mpz_t @var{op1}, mpz_t @var{op2})
! 3266: @deftypefnx Function int mpz_cmpabs_d (mpz_t @var{op1}, double @var{op2})
! 3267: @deftypefnx Function int mpz_cmpabs_ui (mpz_t @var{op1}, unsigned long int @var{op2})
1.1.1.2 maekawa 3268: Compare the absolute values of @var{op1} and @var{op2}. Return a positive
1.1.1.4 ! ohara 3269: value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if
! 3270: @math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if
! 3271: @math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}.
! 3272:
! 3273: Note that @code{mpz_cmpabs_si} is a macro and will evaluate its arguments more
! 3274: than once.
! 3275: @end deftypefn
1.1.1.2 maekawa 3276:
1.1 maekawa 3277: @deftypefn Macro int mpz_sgn (mpz_t @var{op})
1.1.1.4 ! ohara 3278: @cindex Sign tests
! 3279: @cindex Integer sign tests
! 3280: Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
! 3281: @math{-1} if @math{@var{op} < 0}.
1.1 maekawa 3282:
1.1.1.4 ! ohara 3283: This function is actually implemented as a macro. It evaluates its argument
! 3284: multiple times.
1.1 maekawa 3285: @end deftypefn
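
The practical consequence of a macro evaluating its argument more than once
is that an argument with side effects may run several times. The sketch below
uses a hypothetical macro @code{SIGN_OF} and a hypothetical helper
@code{next_value}; it is not taken from @file{gmp.h} and only demonstrates
the general hazard.

```c
/* Hypothetical sign-test macro: its argument appears, and may be
   evaluated, more than once. */
#define SIGN_OF(x) ((x) < 0 ? -1 : (x) > 0)

/* Helper with a visible side effect: counts how often it is called. */
int calls = 0;

long next_value(void) {
    calls++;
    return 5;
}
```

Writing @code{SIGN_OF(next_value())} calls @code{next_value} twice for a
positive result. With macros of this kind it is safer to store the operand in
a variable first and pass the variable.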
3286:
1.1.1.4 ! ohara 3287:
1.1.1.2 maekawa 3288: @node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions
1.1 maekawa 3289: @comment node-name, next, previous, up
3290: @section Logical and Bit Manipulation Functions
3291: @cindex Logical functions
3292: @cindex Bit manipulation functions
1.1.1.2 maekawa 3293: @cindex Integer bit manipulation functions
1.1 maekawa 3294:
1.1.1.4 ! ohara 3295: These functions behave as if twos complement arithmetic were used (although
! 3296: sign-magnitude is the actual implementation). The least significant bit is
! 3297: number 0.
1.1 maekawa 3298:
3299: @deftypefun void mpz_and (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3300: Set @var{rop} to @var{op1} logical-and @var{op2}.
3301: @end deftypefun
3302:
3303: @deftypefun void mpz_ior (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3304: Set @var{rop} to @var{op1} inclusive-or @var{op2}.
3305: @end deftypefun
3306:
1.1.1.2 maekawa 3307: @deftypefun void mpz_xor (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3308: Set @var{rop} to @var{op1} exclusive-or @var{op2}.
3309: @end deftypefun
1.1 maekawa 3310:
3311: @deftypefun void mpz_com (mpz_t @var{rop}, mpz_t @var{op})
3312: Set @var{rop} to the one's complement of @var{op}.
3313: @end deftypefun
3314:
3315: @deftypefun {unsigned long int} mpz_popcount (mpz_t @var{op})
1.1.1.4 ! ohara 3316: If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is
! 3317: the number of 1 bits in the binary representation. If @math{@var{op}<0}, the
! 3318: number of 1s is infinite, and the return value is @code{ULONG_MAX}, the largest
! 3319: possible @code{unsigned long}.
1.1 maekawa 3320: @end deftypefun
3321:
3322: @deftypefun {unsigned long int} mpz_hamdist (mpz_t @var{op1}, mpz_t @var{op2})
1.1.1.4 ! ohara 3323: If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return
! 3324: the hamming distance between the two operands, which is the number of bit
! 3325: positions where @var{op1} and @var{op2} have different bit values. If one
! 3326: operand is @math{@ge{}0} and the other @math{<0} then the number of bits
! 3327: different is infinite, and the return value is @code{ULONG_MAX}, the largest
! 3328: possible @code{unsigned long}.
1.1 maekawa 3329: @end deftypefun
3330:
3331: @deftypefun {unsigned long int} mpz_scan0 (mpz_t @var{op}, unsigned long int @var{starting_bit})
1.1.1.4 ! ohara 3332: @deftypefunx {unsigned long int} mpz_scan1 (mpz_t @var{op}, unsigned long int @var{starting_bit})
! 3333: Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
! 3334: bits, until the first 0 or 1 bit (respectively) is found. Return the index of
! 3335: the found bit.
! 3336:
! 3337: If the bit at @var{starting_bit} is already what's sought, then
! 3338: @var{starting_bit} is returned.
! 3339:
! 3340: If there's no bit found, then @code{ULONG_MAX} is returned. This will happen
! 3341: in @code{mpz_scan0} past the end of a negative number, or @code{mpz_scan1}
! 3342: past the end of a non-negative number.
1.1 maekawa 3343: @end deftypefun
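
Under the twos complement view stated at the start of this section, a
non-negative number has endless 0 bits above its top bit and a negative number
has endless 1 bits. The sketch below models that on a C @code{long}; the names
@code{LONG_BITS}, @code{test_bit} and @code{scan_bit} are hypothetical, and
@code{ULONG_MAX} from @file{limits.h} stands for ``no bit found''.

```c
#include <limits.h>

#define LONG_BITS (sizeof(long) * CHAR_BIT)

/* Bit `index` of op, viewed as an infinitely sign-extended
   twos complement number. */
int test_bit(long op, unsigned long index) {
    if (index >= LONG_BITS)
        return op < 0;                      /* the sign bit repeats forever */
    return (int) (((unsigned long) op >> index) & 1UL);
}

/* Index of the first bit equal to `want` (0 or 1) at or above `start`,
   or ULONG_MAX if no such bit exists. */
unsigned long scan_bit(long op, unsigned long start, int want) {
    unsigned long i;
    for (i = start; i < LONG_BITS; i++)
        if (test_bit(op, i) == want)
            return i;
    /* Only copies of the sign bit remain above the stored bits. */
    return ((op < 0) == (want != 0)) ? i : ULONG_MAX;
}
```

Here @code{scan_bit(op, start, 0)} plays the role of a scan for a 0 bit and
@code{scan_bit(op, start, 1)} the role of a scan for a 1 bit.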
3344:
3345: @deftypefun void mpz_setbit (mpz_t @var{rop}, unsigned long int @var{bit_index})
1.1.1.2 maekawa 3346: Set bit @var{bit_index} in @var{rop}.
1.1 maekawa 3347: @end deftypefun
3348:
3349: @deftypefun void mpz_clrbit (mpz_t @var{rop}, unsigned long int @var{bit_index})
1.1.1.2 maekawa 3350: Clear bit @var{bit_index} in @var{rop}.
1.1 maekawa 3351: @end deftypefun
3352:
1.1.1.2 maekawa 3353: @deftypefun int mpz_tstbit (mpz_t @var{op}, unsigned long int @var{bit_index})
1.1.1.4 ! ohara 3354: Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly.
1.1.1.2 maekawa 3355: @end deftypefun
3356:
3357: @node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions
1.1 maekawa 3358: @comment node-name, next, previous, up
3359: @section Input and Output Functions
3360: @cindex Integer input and output functions
3361: @cindex Input functions
3362: @cindex Output functions
3363: @cindex I/O functions
3364:
3365: Functions that perform input from a stdio stream, and functions that output to
1.1.1.2 maekawa 3366: a stdio stream. Passing a @code{NULL} pointer for a @var{stream} argument to any of
1.1 maekawa 3367: these functions will make them read from @code{stdin} and write to
3368: @code{stdout}, respectively.
3369:
3370: When using any of these functions, it is a good idea to include @file{stdio.h}
3371: before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
3372: for these functions.
3373:
3374: @deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, mpz_t @var{op})
3375: Output @var{op} on stdio stream @var{stream}, as a string of digits in base
3376: @var{base}. The base may vary from 2 to 36.
3377:
3378: Return the number of bytes written, or if an error occurred, return 0.
3379: @end deftypefun
3380:
3381: @deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
3382: Input a possibly white-space preceded string in base @var{base} from stdio
3383: stream @var{stream}, and put the read integer in @var{rop}. The base may vary
3384: from 2 to 36. If @var{base} is 0, the actual base is determined from the
3385: leading characters: if the first two characters are `0x' or `0X', hexadecimal
3386: is assumed, otherwise if the first character is `0', octal is assumed,
3387: otherwise decimal is assumed.
3388:
3389: Return the number of bytes read, or if an error occurred, return 0.
3390: @end deftypefun
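
The rule for choosing the base when @var{base} is 0 amounts to a short prefix
test. The helper @code{detect_base} below is hypothetical and only looks at
the leading characters of a string; skipping white space and any other
parsing detail is left out.

```c
/* Base implied by the leading characters of s:
   "0x" or "0X" -> 16, a leading "0" -> 8, anything else -> 10. */
int detect_base(const char *s) {
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        return 16;
    if (s[0] == '0')
        return 8;
    return 10;
}
```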
3391:
3392: @deftypefun size_t mpz_out_raw (FILE *@var{stream}, mpz_t @var{op})
3393: Output @var{op} on stdio stream @var{stream}, in raw binary format. The
3394: integer is written in a portable format, with 4 bytes of size information, and
3395: that many bytes of limbs. Both the size and the limbs are written in
3396: decreasing significance order (i.e., in big-endian).
3397:
3398: The output can be read with @code{mpz_inp_raw}.
3399:
3400: Return the number of bytes written, or if an error occurred, return 0.
3401:
3402: The output of this cannot be read by @code{mpz_inp_raw} from GMP 1, because
3403: of changes necessary for compatibility between 32-bit and 64-bit machines.
3404: @end deftypefun
3405:
3406: @deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
3407: Input from stdio stream @var{stream} in the format written by
3408: @code{mpz_out_raw}, and put the result in @var{rop}. Return the number of
3409: bytes read, or if an error occurred, return 0.
3410:
3411: This routine can also read the output of @code{mpz_out_raw} from GMP 1, in
3412: spite of the changes necessary for compatibility between 32-bit and 64-bit
3413: machines.
3414: @end deftypefun
3415:
3416:
3417: @need 2000
1.1.1.4 ! ohara 3418: @node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
1.1 maekawa 3419: @comment node-name, next, previous, up
1.1.1.2 maekawa 3420: @section Random Number Functions
3421: @cindex Integer random number functions
3422: @cindex Random number functions
3423:
3424: The random number functions of GMP come in two groups: older functions
3425: that rely on a global state, and newer functions that accept a state
3426: parameter that is read and modified. See @ref{Random Number
3427: Functions}, for more information on how to use and not to use random
3428: number functions.
3429:
1.1.1.4 ! ohara 3430: @deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{n})
! 3431: Generate a uniformly distributed random integer in the range 0 to @m{2^n-1,
! 3432: 2^@var{n}@minus{}1}, inclusive.
1.1.1.2 maekawa 3433:
3434: The variable @var{state} must be initialized by calling one of the
3435: @code{gmp_randinit} functions (@ref{Random State Initialization}) before
3436: invoking this function.
3437: @end deftypefun
3438:
1.1.1.4 ! ohara 3439: @deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, mpz_t @var{n})
! 3440: Generate a uniform random integer in the range 0 to @math{@var{n}-1},
! 3441: inclusive.
1.1.1.2 maekawa 3442:
3443: The variable @var{state} must be initialized by calling one of the
3444: @code{gmp_randinit} functions (@ref{Random State Initialization})
3445: before invoking this function.
3446: @end deftypefun
3447:
3448: @deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{n})
3449: Generate a random integer with long strings of zeros and ones in the
3450: binary representation. Useful for testing functions and algorithms,
3451: since random numbers of this kind have proven to be more likely to
3452: trigger corner-case bugs. The random number will be in the range
1.1.1.4 ! ohara 3453: 0 to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive.
1.1.1.2 maekawa 3454:
3455: The variable @var{state} must be initialized by calling one of the
3456: @code{gmp_randinit} functions (@ref{Random State Initialization})
3457: before invoking this function.
3458: @end deftypefun
1.1 maekawa 3459:
3460: @deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
3461: Generate a random integer of at most @var{max_size} limbs. The generated
3462: random number doesn't satisfy any particular requirements of randomness.
3463: Negative random numbers are generated when @var{max_size} is negative.
1.1.1.2 maekawa 3464:
3465: This function is obsolete. Use @code{mpz_urandomb} or
3466: @code{mpz_urandomm} instead.
1.1 maekawa 3467: @end deftypefun
3468:
3469: @deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
3470: Generate a random integer of at most @var{max_size} limbs, with long strings
3471: of zeros and ones in the binary representation. Useful for testing functions
3472: and algorithms, since random numbers of this kind have proven to be more
3473: likely to trigger corner-case bugs. Negative random numbers are generated
3474: when @var{max_size} is negative.
1.1.1.2 maekawa 3475:
3476: This function is obsolete. Use @code{mpz_rrandomb} instead.
1.1 maekawa 3477: @end deftypefun
3478:
1.1.1.2 maekawa 3479:
1.1.1.4 ! ohara 3480: @node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions
! 3481: @section Integer Import and Export
! 3482:
! 3483: @code{mpz_t} variables can be converted to and from arbitrary words of binary
! 3484: data with the following functions.
! 3485:
! 3486: @deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, int @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op})
! 3487: @cindex Integer import
! 3488: @cindex Import
! 3489: Set @var{rop} from an array of word data at @var{op}.
! 3490:
! 3491: The parameters specify the format of the data. @var{count} many words are
! 3492: read, each @var{size} bytes. @var{order} can be 1 for most significant word
! 3493: first or -1 for least significant first. Within each word @var{endian} can be
! 3494: 1 for most significant byte first, -1 for least significant first, or 0 for
! 3495: the native endianness of the host CPU. The most significant @var{nails} bits
! 3496: of each word are skipped; this can be 0 to use the full words.
! 3497:
! 3498: There are no data alignment restrictions on @var{op}; any address is allowed.
! 3499:
! 3500: Here's an example converting an array of @code{unsigned long} data, most
! 3501: significant element first and host byte order within each value.
! 3502:
! 3503: @example
! 3504: unsigned long a[20];
! 3505: mpz_t z; /* z must be initialized (e.g. with mpz_init) and a filled in first */
! 3506: mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a);
! 3507: @end example
! 3508:
! 3509: This example assumes the full @code{sizeof} bytes are used for data in the
! 3510: given type, which is usually true, and certainly true for @code{unsigned long}
! 3511: everywhere we know of. However on Cray vector systems it may be noted that
! 3512: @code{short} and @code{int} are always stored in 8 bytes (and with
! 3513: @code{sizeof} indicating that) but use only 32 or 46 bits. The @var{nails}
! 3514: feature can account for this, by passing for instance
! 3515: @code{8*sizeof(int)-INT_BIT}.
! 3516: @end deftypefun
! 3517:
! 3518: @deftypefun void *mpz_export (void *@var{rop}, size_t *@var{count}, int @var{order}, int @var{size}, int @var{endian}, size_t @var{nails}, mpz_t @var{op})
! 3519: @cindex Integer export
! 3520: @cindex Export
! 3521: Fill @var{rop} with word data from @var{op}.
! 3522:
! 3523: The parameters specify the format of the data produced. Each word will be
! 3524: @var{size} bytes and @var{order} can be 1 for most significant word first or
! 3525: -1 for least significant first. Within each word @var{endian} can be 1 for
! 3526: most significant byte first, -1 for least significant first, or 0 for the
! 3527: native endianness of the host CPU. The most significant @var{nails} bits of
! 3528: each word are unused and set to zero; this can be 0 to produce full words.
! 3529:
! 3530: The number of words produced is written to @code{*@var{count}}. @var{rop}
! 3531: must have enough space for the data, or if @var{rop} is @code{NULL} then a
! 3532: result array of the necessary size is allocated using the current GMP
! 3533: allocation function (@pxref{Custom Allocation}). In either case the return
! 3534: value is the destination used, @var{rop} or the allocated block.
! 3535:
! 3536: If @var{op} is non-zero then the most significant word produced will be
! 3537: non-zero. If @var{op} is zero then the count returned will be zero and
! 3538: nothing is written to @var{rop}. If @var{rop} is @code{NULL} in this case, no
! 3539: block is allocated and @code{NULL} is returned.
! 3540:
! 3541: There are no data alignment restrictions on @var{rop}; any address is allowed.
! 3542: The sign of @var{op} is ignored; just the absolute value is used.
! 3543:
! 3544: When an application is allocating space itself the required size can be
! 3545: determined with a calculation like the following. Since @code{mpz_sizeinbase}
! 3546: always returns at least 1, @code{count} here will be at least one, which
! 3547: avoids any portability problems with @code{malloc(0)}, though if @code{z} is
! 3548: zero no space at all is actually needed.
! 3549:
! 3550: @example
! 3551: numb = 8*size - nails;
! 3552: count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
! 3553: p = malloc (count * size);
! 3554: @end example
! 3555: @end deftypefun
! 3556:
! 3557:
1.1.1.2 maekawa 3558: @need 2000
1.1.1.4 ! ohara 3559: @node Miscellaneous Integer Functions, , Integer Import and Export, Integer Functions
1.1.1.2 maekawa 3560: @comment node-name, next, previous, up
3561: @section Miscellaneous Functions
3562: @cindex Miscellaneous integer functions
3563: @cindex Integer miscellaneous functions
3564:
3565: @deftypefun int mpz_fits_ulong_p (mpz_t @var{op})
3566: @deftypefunx int mpz_fits_slong_p (mpz_t @var{op})
3567: @deftypefunx int mpz_fits_uint_p (mpz_t @var{op})
3568: @deftypefunx int mpz_fits_sint_p (mpz_t @var{op})
3569: @deftypefunx int mpz_fits_ushort_p (mpz_t @var{op})
3570: @deftypefunx int mpz_fits_sshort_p (mpz_t @var{op})
3571: Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
3572: @code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
3573: short int}, or @code{signed short int}, respectively. Otherwise, return zero.
3574: @end deftypefun
3575:
3576: @deftypefn Macro int mpz_odd_p (mpz_t @var{op})
3577: @deftypefnx Macro int mpz_even_p (mpz_t @var{op})
3578: Determine whether @var{op} is odd or even, respectively. Return non-zero if
1.1.1.4 ! ohara 3579: yes, zero if no. These macros evaluate their argument more than once.
1.1.1.2 maekawa 3580: @end deftypefn
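
The practical consequence of a macro evaluating its argument more than once
is that arguments with side effects should be avoided. The following plain C
sketch shows the hazard with a stand-in macro; @code{TWICE_ODD_P} is
hypothetical and is not how @code{mpz_odd_p} is actually defined.

```c
/* Hypothetical stand-in for a macro that mentions its argument twice.  */
#define TWICE_ODD_P(p)  ((p) != 0 && (*(p) & 1) != 0)

/* Return how many times the macro argument was evaluated.  */
int
count_evaluations (void)
{
  int value = 3;
  int *cursor = &value;
  int evaluations = 0;

  /* The argument bumps a counter each time it is evaluated.  */
  (void) TWICE_ODD_P ((evaluations++, cursor));
  return evaluations;
}
```

Passing a plain variable, as in @code{mpz_odd_p (x)}, is always safe.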
3581:
1.1 maekawa 3582: @deftypefun size_t mpz_size (mpz_t @var{op})
3583: Return the size of @var{op} measured in number of limbs. If @var{op} is zero,
3584: the returned value will be zero.
3585: @c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
3586: @end deftypefun
3587:
3588: @deftypefun size_t mpz_sizeinbase (mpz_t @var{op}, int @var{base})
3589: Return the size of @var{op} measured in number of digits in base @var{base}.
1.1.1.4 ! ohara 3590: The base may vary from 2 to 36. The sign of @var{op} is ignored, just the
! 3591: absolute value is used. The result will be exact or 1 too big. If @var{base}
! 3592: is a power of 2, the result will always be exact. If @var{op} is zero the
! 3593: return value is always 1.
1.1 maekawa 3594:
3595: This function is useful in order to allocate the right amount of space before
3596: converting @var{op} to a string. The right amount of allocation is normally
3597: two more than the value returned by @code{mpz_sizeinbase} (one extra for a
1.1.1.4 ! ohara 3598: minus sign and one for the null-terminator).
1.1 maekawa 3599: @end deftypefun
3600:
3601:
3602: @node Rational Number Functions, Floating-point Functions, Integer Functions, Top
3603: @comment node-name, next, previous, up
3604: @chapter Rational Number Functions
3605: @cindex Rational number functions
3606:
1.1.1.2 maekawa 3607: This chapter describes the GMP functions for performing arithmetic on rational
1.1 maekawa 3608: numbers. These functions start with the prefix @code{mpq_}.
3609:
3610: Rational numbers are stored in objects of type @code{mpq_t}.
3611:
3612: All rational arithmetic functions assume operands have a canonical form, and
3613: canonicalize their result. The canonical form means that the denominator and
3614: the numerator have no common factors, and that the denominator is positive.
3615: Zero has the unique representation 0/1.
3616:
3617: Pure assignment functions do not canonicalize the assigned variable. It is
3618: the responsibility of the user to canonicalize the assigned variable before
1.1.1.4 ! ohara 3619: any arithmetic operations are performed on that variable.
1.1 maekawa 3620:
3621: @deftypefun void mpq_canonicalize (mpq_t @var{op})
3622: Remove any factors that are common to the numerator and denominator of
3623: @var{op}, and make the denominator positive.
3624: @end deftypefun
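
The canonical form can be illustrated with ordinary C integers. The sketch
below mirrors the two rules (no common factors, positive denominator) and is
not GMP's implementation; @code{struct frac} and @code{canonical} are
hypothetical names, the denominator must be non-zero, and overflow on the
most negative @code{long} is ignored.

```c
/* Hypothetical fraction type for illustration only.  */
struct frac
{
  long num;
  long den;
};

/* Greatest common divisor of two non-negative values.  */
static long
gcd_long (long a, long b)
{
  while (b != 0)
    {
      long t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Return num/den in canonical form; den must be non-zero.  */
struct frac
canonical (long num, long den)
{
  struct frac f;
  long g;

  if (den < 0)                  /* make the denominator positive */
    {
      num = -num;
      den = -den;
    }
  g = gcd_long (num < 0 ? -num : num, den);
  f.num = num / g;              /* remove common factors */
  f.den = den / g;
  return f;
}
```

With this rule zero always ends up as 0/1, whatever denominator it started
with.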
3625:
3626: @menu
1.1.1.2 maekawa 3627: * Initializing Rationals::
1.1.1.4 ! ohara 3628: * Rational Conversions::
1.1.1.2 maekawa 3629: * Rational Arithmetic::
3630: * Comparing Rationals::
3631: * Applying Integer Functions::
3632: * I/O of Rationals::
1.1 maekawa 3633: @end menu
3634:
1.1.1.4 ! ohara 3635: @node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
1.1 maekawa 3636: @comment node-name, next, previous, up
3637: @section Initialization and Assignment Functions
1.1.1.2 maekawa 3638: @cindex Initialization and assignment functions
3639: @cindex Rational init and assign
1.1 maekawa 3640:
3641: @deftypefun void mpq_init (mpq_t @var{dest_rational})
3642: Initialize @var{dest_rational} and set it to 0/1. Each variable should
3643: normally only be initialized once, or at least cleared out (using the function
3644: @code{mpq_clear}) between each initialization.
3645: @end deftypefun
3646:
3647: @deftypefun void mpq_clear (mpq_t @var{rational_number})
3648: Free the space occupied by @var{rational_number}. Make sure to call this
3649: function for all @code{mpq_t} variables when you are done with them.
3650: @end deftypefun
3651:
3652: @deftypefun void mpq_set (mpq_t @var{rop}, mpq_t @var{op})
3653: @deftypefunx void mpq_set_z (mpq_t @var{rop}, mpz_t @var{op})
3654: Assign @var{rop} from @var{op}.
3655: @end deftypefun
3656:
3657: @deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
3658: @deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
3659: Set the value of @var{rop} to @var{op1}/@var{op2}. Note that if @var{op1} and
3660: @var{op2} have common factors, @var{rop} has to be passed to
3661: @code{mpq_canonicalize} before any operations are performed on @var{rop}.
3662: @end deftypefun
3663:
1.1.1.4 ! ohara 3664: @deftypefun int mpq_set_str (mpq_t @var{rop}, char *@var{str}, int @var{base})
! 3665: Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.
! 3666:
! 3667: The string can be an integer like ``41'' or a fraction like ``41/152''. The
! 3668: fraction must be in canonical form (@pxref{Rational Number Functions}), or if
! 3669: not then @code{mpq_canonicalize} must be called.
! 3670:
! 3671: The numerator and optional denominator are parsed the same as in
! 3672: @code{mpz_set_str} (@pxref{Assigning Integers}). White space is allowed in
! 3673: the string, and is simply ignored. The @var{base} can vary from 2 to 36, or
! 3674: if @var{base} is 0 then the leading characters are used: @code{0x} for hex,
! 3675: @code{0} for octal, or decimal otherwise. Note that this is done separately
! 3676: for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
! 3677: whereas @code{0xEF/0x100} is 239/256.
! 3678:
! 3679: The return value is 0 if the entire string is a valid number, or @minus{}1 if
! 3680: not.
! 3681: @end deftypefun
! 3682:
1.1.1.2 maekawa 3683: @deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
3684: Swap the values @var{rop1} and @var{rop2} efficiently.
3685: @end deftypefun
3686:
3687:
1.1.1.4 ! ohara 3688: @need 2000
! 3689: @node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
! 3690: @comment node-name, next, previous, up
! 3691: @section Conversion Functions
! 3692: @cindex Rational conversion functions
! 3693: @cindex Conversion functions
! 3694:
! 3695: @deftypefun double mpq_get_d (mpq_t @var{op})
! 3696: Convert @var{op} to a @code{double}.
! 3697: @end deftypefun
! 3698:
! 3699: @deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
! 3700: @deftypefunx void mpq_set_f (mpq_t @var{rop}, mpf_t @var{op})
! 3701: Set @var{rop} to the value of @var{op}, without rounding.
! 3702: @end deftypefun
! 3703:
! 3704: @deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, mpq_t @var{op})
! 3705: Convert @var{op} to a string of digits in base @var{base}. The base may vary
! 3706: from 2 to 36. The string will be of the form @samp{num/den}, or if the
! 3707: denominator is 1 then just @samp{num}.
! 3708:
! 3709: If @var{str} is @code{NULL}, the result string is allocated using the current
! 3710: allocation function (@pxref{Custom Allocation}). The block will be
! 3711: @code{strlen(str)+1} bytes, that being exactly enough for the string and
! 3712: null-terminator.
! 3713:
! 3714: If @var{str} is not @code{NULL}, it should point to a block of storage large
! 3715: enough for the result, that being
! 3716:
! 3717: @example
! 3718: mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
! 3719: + mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
! 3720: @end example
! 3721:
! 3722: The three extra bytes are for a possible minus sign, possible slash, and the
! 3723: null-terminator.
! 3724:
! 3725: A pointer to the result string is returned, being either the allocated block,
! 3726: or the given @var{str}.
! 3727: @end deftypefun
! 3728:
! 3729:
! 3730: @node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
1.1 maekawa 3731: @comment node-name, next, previous, up
3732: @section Arithmetic Functions
1.1.1.2 maekawa 3733: @cindex Rational arithmetic functions
3734: @cindex Arithmetic functions
1.1 maekawa 3735:
3736: @deftypefun void mpq_add (mpq_t @var{sum}, mpq_t @var{addend1}, mpq_t @var{addend2})
3737: Set @var{sum} to @var{addend1} + @var{addend2}.
3738: @end deftypefun
3739:
3740: @deftypefun void mpq_sub (mpq_t @var{difference}, mpq_t @var{minuend}, mpq_t @var{subtrahend})
3741: Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
3742: @end deftypefun
3743:
3744: @deftypefun void mpq_mul (mpq_t @var{product}, mpq_t @var{multiplier}, mpq_t @var{multiplicand})
1.1.1.4 ! ohara 3745: Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
! 3746: @end deftypefun
! 3747:
! 3748: @deftypefun void mpq_mul_2exp (mpq_t @var{rop}, mpq_t @var{op1}, unsigned long int @var{op2})
! 3749: Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
! 3750: @var{op2}}.
1.1 maekawa 3751: @end deftypefun
3752:
3753: @deftypefun void mpq_div (mpq_t @var{quotient}, mpq_t @var{dividend}, mpq_t @var{divisor})
1.1.1.2 maekawa 3754: @cindex Division functions
1.1 maekawa 3755: Set @var{quotient} to @var{dividend}/@var{divisor}.
3756: @end deftypefun
3757:
1.1.1.4 ! ohara 3758: @deftypefun void mpq_div_2exp (mpq_t @var{rop}, mpq_t @var{op1}, unsigned long int @var{op2})
! 3759: Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
! 3760: @var{op2}}.
! 3761: @end deftypefun
! 3762:
1.1 maekawa 3763: @deftypefun void mpq_neg (mpq_t @var{negated_operand}, mpq_t @var{operand})
3764: Set @var{negated_operand} to @minus{}@var{operand}.
3765: @end deftypefun
3766:
1.1.1.4 ! ohara 3767: @deftypefun void mpq_abs (mpq_t @var{rop}, mpq_t @var{op})
! 3768: Set @var{rop} to the absolute value of @var{op}.
! 3769: @end deftypefun
! 3770:
1.1 maekawa 3771: @deftypefun void mpq_inv (mpq_t @var{inverted_number}, mpq_t @var{number})
3772: Set @var{inverted_number} to 1/@var{number}. If the new denominator is
3773: zero, this routine will divide by zero.
3774: @end deftypefun
3775:
1.1.1.2 maekawa 3776: @node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
1.1 maekawa 3777: @comment node-name, next, previous, up
3778: @section Comparison Functions
1.1.1.2 maekawa 3779: @cindex Rational comparison functions
3780: @cindex Comparison functions
1.1 maekawa 3781:
3782: @deftypefun int mpq_cmp (mpq_t @var{op1}, mpq_t @var{op2})
1.1.1.4 ! ohara 3783: Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
! 3784: @var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
! 3785: @math{@var{op1} < @var{op2}}.
1.1 maekawa 3786:
3787: To determine if two rationals are equal, @code{mpq_equal} is faster than
3788: @code{mpq_cmp}.
3789: @end deftypefun
3790:
3791: @deftypefn Macro int mpq_cmp_ui (mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
1.1.1.4 ! ohara 3792: @deftypefnx Macro int mpq_cmp_si (mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
1.1 maekawa 3793: Compare @var{op1} and @var{num2}/@var{den2}. Return a positive value if
1.1.1.4 ! ohara 3794: @math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
! 3795: @var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
! 3796: @var{num2}/@var{den2}}.
1.1 maekawa 3797:
1.1.1.4 ! ohara 3798: @var{num2} and @var{den2} are allowed to have common factors.
1.1 maekawa 3799:
1.1.1.4 ! ohara 3800: These functions are implemented as macros and evaluate their arguments
! 3801: multiple times.
1.1 maekawa 3802: @end deftypefn
3803:
3804: @deftypefn Macro int mpq_sgn (mpq_t @var{op})
1.1.1.4 ! ohara 3805: @cindex Sign tests
! 3806: @cindex Rational sign tests
! 3807: Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
! 3808: @math{-1} if @math{@var{op} < 0}.
1.1 maekawa 3809:
3810: This function is actually implemented as a macro. It evaluates its
3811: argument multiple times.
3812: @end deftypefn
3813:
3814: @deftypefun int mpq_equal (mpq_t @var{op1}, mpq_t @var{op2})
3815: Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
3816: not equal. Although @code{mpq_cmp} can be used for the same purpose, this
3817: function is much faster.
3818: @end deftypefun
3819:
1.1.1.2 maekawa 3820: @node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
1.1 maekawa 3821: @comment node-name, next, previous, up
3822: @section Applying Integer Functions to Rationals
1.1.1.2 maekawa 3823: @cindex Rational numerator and denominator
3824: @cindex Numerator and denominator
1.1 maekawa 3825:
1.1.1.2 maekawa 3826: The set of @code{mpq} functions is quite small. In particular, there are few
1.1.1.4 ! ohara 3827: functions for either input or output. The following functions give direct
! 3828: access to the numerator and denominator of an @code{mpq_t}.
! 3829:
! 3830: Note that if an assignment to the numerator and/or denominator could take an
! 3831: @code{mpq_t} out of the canonical form described at the start of this chapter
! 3832: (@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
! 3833: called before any other @code{mpq} functions are applied to that @code{mpq_t}.
1.1 maekawa 3834:
3835: @deftypefn Macro mpz_t mpq_numref (mpq_t @var{op})
3836: @deftypefnx Macro mpz_t mpq_denref (mpq_t @var{op})
3837: Return a reference to the numerator and denominator of @var{op}, respectively.
3838: The @code{mpz} functions can be used on the result of these macros.
3839: @end deftypefn
3840:
1.1.1.4 ! ohara 3841: @deftypefun void mpq_get_num (mpz_t @var{numerator}, mpq_t @var{rational})
! 3842: @deftypefunx void mpq_get_den (mpz_t @var{denominator}, mpq_t @var{rational})
! 3843: @deftypefunx void mpq_set_num (mpq_t @var{rational}, mpz_t @var{numerator})
! 3844: @deftypefunx void mpq_set_den (mpq_t @var{rational}, mpz_t @var{denominator})
! 3845: Get or set the numerator or denominator of a rational. These functions are
! 3846: equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
! 3847: @code{mpq_denref}. Direct use of @code{mpq_numref} or @code{mpq_denref} is
! 3848: recommended instead of these functions.
! 3849: @end deftypefun
! 3850:
1.1.1.2 maekawa 3851:
3852: @need 2000
1.1.1.4 ! ohara 3853: @node I/O of Rationals, , Applying Integer Functions, Rational Number Functions
1.1.1.2 maekawa 3854: @comment node-name, next, previous, up
3855: @section Input and Output Functions
3856: @cindex Rational input and output functions
3857: @cindex Input functions
3858: @cindex Output functions
3859: @cindex I/O functions
3860:
1.1.1.4 ! ohara 3861: When using any of these functions, it's a good idea to include @file{stdio.h}
1.1.1.2 maekawa 3862: before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
3863: for these functions.
3864:
1.1.1.4 ! ohara 3865: Passing a @code{NULL} pointer for a @var{stream} argument to any of these
! 3866: functions will make them read from @code{stdin} and write to @code{stdout},
! 3867: respectively.
! 3868:
1.1.1.2 maekawa 3869: @deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, mpq_t @var{op})
3870: Output @var{op} on stdio stream @var{stream}, as a string of digits in base
3871: @var{base}. The base may vary from 2 to 36. Output is in the form
3872: @samp{num/den} or if the denominator is 1 then just @samp{num}.
3873:
3874: Return the number of bytes written, or if an error occurred, return 0.
3875: @end deftypefun
3876:
1.1.1.4 ! ohara 3877: @deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
! 3878: Read a string of digits from @var{stream} and convert them to a rational in
! 3879: @var{rop}. Any initial white-space characters are read and discarded. Return
! 3880: the number of characters read (including white space), or 0 if a rational
! 3881: could not be read.
! 3882:
! 3883: The input can be a fraction like @samp{17/63} or just an integer like
! 3884: @samp{123}. Reading stops at the first character not in this form, and white
! 3885: space is not permitted within the string. If the input might not be in
! 3886: canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
! 3887: Number Functions}).
! 3888:
! 3889: The @var{base} can be between 2 and 36, or can be 0 in which case the leading
! 3890: characters of the string determine the base, @samp{0x} or @samp{0X} for
! 3891: hexadecimal, @samp{0} for octal, or decimal otherwise. The leading characters
! 3892: are examined separately for the numerator and denominator of a fraction, so
! 3893: for instance @samp{0x10/11} is 16/11, whereas @samp{0x10/0x11} is 16/17.
1.1 maekawa 3894: @end deftypefun
3895:
3896:
3897: @node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
3898: @comment node-name, next, previous, up
3899: @chapter Floating-point Functions
3900: @cindex Floating-point functions
3901: @cindex Float functions
3902: @cindex User-defined precision
1.1.1.2 maekawa 3903: @cindex Precision of floats
1.1.1.4 ! ohara 3904:
! 3905: GMP floating point numbers are stored in objects of type @code{mpf_t} and
! 3906: functions operating on them have an @code{mpf_} prefix.
! 3907:
! 3908: The mantissa of each float has a user-selectable precision, limited only by
! 3909: available memory. Each variable has its own precision, and that can be
! 3910: increased or decreased at any time.
! 3911:
! 3912: The exponent of each float has a fixed precision, one machine word on most
! 3913: systems. In the current implementation the exponent is a count of limbs, so
! 3914: for example on a 32-bit system this means a range of roughly
! 3915: @math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
! 3916: this will be greater. Note however @code{mpf_get_str} can only return an
! 3917: exponent which fits an @code{mp_exp_t} and currently @code{mpf_set_str}
! 3918: doesn't accept exponents bigger than a @code{long}.
! 3919:
! 3920: Each variable keeps a size for the mantissa data actually in use. This means
! 3921: that if a float is exactly represented in only a few bits then only those bits
! 3922: will be used in a calculation, even if the selected precision is high.
! 3923:
! 3924: All calculations are performed to the precision of the destination variable.
! 3925: Each function is defined to calculate with ``infinite precision'' followed by
! 3926: a truncation to the destination precision, but of course the work done is only
! 3927: what's needed to determine a result under that definition.
! 3928:
! 3929: The precision selected for a variable is a minimum value; GMP may increase it
! 3930: a little to facilitate efficient calculation. Currently this means rounding
! 3931: up to a whole limb, and then sometimes having a further partial limb,
! 3932: depending on the high limb of the mantissa. But applications shouldn't be
! 3933: concerned by such details.
! 3934:
! 3935: The mantissa is stored in binary, as might be imagined from the fact that
! 3936: precisions are expressed in bits. One consequence of this is that decimal
! 3937: fractions like @math{0.1} cannot be represented exactly. The same is true of
! 3938: plain IEEE @code{double} floats. This makes both highly unsuitable for
! 3939: calculations involving money or other values that should be exact decimal
! 3940: fractions. (Suitably scaled integers, or perhaps rationals, are better
! 3941: choices.)
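
The point about decimal fractions can be seen with plain @code{double}
arithmetic, with no GMP involved. In the sketch below, both function names
are hypothetical; the first accumulates 0.1 in binary floating point and the
second keeps the same quantity as a scaled integer count of tenths.

```c
/* Add 0.1 to itself `times' times in binary floating point.  The
   volatile qualifier forces each partial sum to be rounded to double.  */
double
sum_tenths (int times)
{
  volatile double total = 0.0;
  int i;

  for (i = 0; i < times; i++)
    total += 0.1;
  return total;
}

/* The same sum kept exactly, as an integer number of tenths.  */
long
sum_tenths_scaled (int times)
{
  long tenths = 0;
  int i;

  for (i = 0; i < times; i++)
    tenths += 1;
  return tenths;
}
```

With IEEE @code{double} arithmetic, ten additions of 0.1 do not compare equal
to 1.0, whereas the scaled integer count is exactly ten tenths.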
! 3942:
! 3943: @code{mpf} functions and variables have no special notion of infinity or
! 3944: not-a-number, and applications must take care not to overflow the exponent or
! 3945: results will be unpredictable. This might change in a future release.
! 3946:
! 3947: Note that the @code{mpf} functions are @emph{not} intended as a smooth
! 3948: extension to IEEE P754 arithmetic. In particular results obtained on one
! 3949: computer often differ from the results on a computer with a different word
! 3950: size.
1.1 maekawa 3951:
3952: @menu
1.1.1.2 maekawa 3953: * Initializing Floats::
3954: * Assigning Floats::
3955: * Simultaneous Float Init & Assign::
3956: * Converting Floats::
3957: * Float Arithmetic::
3958: * Float Comparison::
3959: * I/O of Floats::
3960: * Miscellaneous Float Functions::
1.1 maekawa 3961: @end menu
3962:
1.1.1.2 maekawa 3963: @node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
1.1 maekawa 3964: @comment node-name, next, previous, up
1.1.1.2 maekawa 3965: @section Initialization Functions
3966: @cindex Float initialization functions
3967: @cindex Initialization functions
1.1 maekawa 3968:
3969: @deftypefun void mpf_set_default_prec (unsigned long int @var{prec})
3970: Set the default precision to be @strong{at least} @var{prec} bits. All
3971: subsequent calls to @code{mpf_init} will use this precision, but previously
3972: initialized variables are unaffected.
3973: @end deftypefun
3974:
1.1.1.4 ! ohara 3975: @deftypefun {unsigned long int} mpf_get_default_prec (void)
! 3976: Return the default precision actually used.
! 3977: @end deftypefun
! 3978:
1.1 maekawa 3979: An @code{mpf_t} object must be initialized before storing the first value in
3980: it. The functions @code{mpf_init} and @code{mpf_init2} are used for that
3981: purpose.
3982:
3983: @deftypefun void mpf_init (mpf_t @var{x})
3984: Initialize @var{x} to 0. Normally, a variable should be initialized once only
3985: or at least be cleared, using @code{mpf_clear}, between initializations. The
3986: precision of @var{x} is undefined unless a default precision has already been
3987: established by a call to @code{mpf_set_default_prec}.
3988: @end deftypefun
3989:
3990: @deftypefun void mpf_init2 (mpf_t @var{x}, unsigned long int @var{prec})
3991: Initialize @var{x} to 0 and set its precision to be @strong{at least}
3992: @var{prec} bits. Normally, a variable should be initialized once only or at
3993: least be cleared, using @code{mpf_clear}, between initializations.
3994: @end deftypefun
3995:
3996: @deftypefun void mpf_clear (mpf_t @var{x})
3997: Free the space occupied by @var{x}. Make sure to call this function for all
3998: @code{mpf_t} variables when you are done with them.
3999: @end deftypefun
4000:
4001: @need 2000
4002: Here is an example of how to initialize floating-point variables:
4003: @example
4004: @{
4005: mpf_t x, y;
1.1.1.4 ! ohara 4006: mpf_init (x); /* use default precision */
! 4007: mpf_init2 (y, 256); /* precision @emph{at least} 256 bits */
1.1 maekawa 4008: @dots{}
4009: /* Unless the program is about to exit, do ... */
4010: mpf_clear (x);
4011: mpf_clear (y);
4012: @}
4013: @end example
4014:
4015: The following three functions are useful for changing the precision during a
4016: calculation. A typical use would be for adjusting the precision gradually in
4017: iterative algorithms like Newton-Raphson, making the computation precision
4018: closely match the actual accurate part of the numbers.
4019:
1.1.1.4 ! ohara 4020: @deftypefun {unsigned long int} mpf_get_prec (mpf_t @var{op})
! 4021: Return the current precision of @var{op}, in bits.
1.1 maekawa 4022: @end deftypefun
4023:
1.1.1.4 ! ohara 4024: @deftypefun void mpf_set_prec (mpf_t @var{rop}, unsigned long int @var{prec})
! 4025: Set the precision of @var{rop} to be @strong{at least} @var{prec} bits. The
! 4026: value in @var{rop} will be truncated to the new precision.
! 4027:
! 4028: This function requires a call to @code{realloc}, and so should not be used in
! 4029: a tight loop.
1.1 maekawa 4030: @end deftypefun
4031:
4032: @deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, unsigned long int @var{prec})
1.1.1.4 ! ohara 4033: Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
! 4034: without changing the memory allocated.
! 4035:
! 4036: @var{prec} must be no more than the allocated precision for @var{rop}, that
! 4037: being the precision when @var{rop} was initialized, or in the most recent
! 4038: @code{mpf_set_prec}.
! 4039:
! 4040: The value in @var{rop} is unchanged, and in particular if it had a higher
! 4041: precision than @var{prec} it will retain that higher precision. New values
! 4042: written to @var{rop} will use the new @var{prec}.
! 4043:
! 4044: Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
! 4045: @code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
! 4046: allocated precision. Failing to do so will have unpredictable results.
! 4047:
! 4048: @code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
! 4049: original allocated precision. After @code{mpf_set_prec_raw} it reflects the
! 4050: @var{prec} value set.
! 4051:
! 4052: @code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
! 4053: different precisions during a calculation, perhaps to gradually increase
! 4054: precision in an iteration, or just to use various different precisions for
! 4055: different purposes during a calculation.
1.1 maekawa 4056: @end deftypefun
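
As a rough illustration of adjusting precision gradually, the sketch below
computes a list of working precisions for an iteration whose number of
accurate bits is assumed to double at each step, as is typical for
Newton-Raphson. The helper @code{precision_schedule} and its parameters are
hypothetical and not part of GMP; it is plain C and only produces the numbers
that might be passed to @code{mpf_set_prec_raw}.

```c
#include <assert.h>
#include <stddef.h>

/* Fill out[] with working precisions, in bits, for an iteration whose
   accurate part is assumed to double at every step.  The list starts at
   `start`, doubles each time, and is capped at `target`; it ends exactly at
   `target` when `max` entries are enough.  Returns the number of entries
   written.  Hypothetical helper, not part of GMP.  */
size_t
precision_schedule (unsigned long start, unsigned long target,
                    unsigned long *out, size_t max)
{
  size_t n = 0;
  unsigned long prec = start;

  assert (start >= 1 && start <= target);

  while (n < max)
    {
      out[n++] = prec;
      if (prec == target)
        break;
      /* Double, but never step past the target (this also avoids
         overflow in 2 * prec).  */
      prec = (prec > target / 2) ? target : 2 * prec;
    }
  return n;
}
```

Each entry could be given to @code{mpf_set_prec_raw} before the corresponding
iteration step, provided the variable was initialized with at least
@code{target} bits. As described above, the original allocated precision must
be restored before calling @code{mpf_clear}.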
4057:
4058:
1.1.1.2 maekawa 4059: @need 2000
1.1 maekawa 4060: @node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
4061: @comment node-name, next, previous, up
1.1.1.2 maekawa 4062: @section Assignment Functions
1.1 maekawa 4063: @cindex Float assignment functions
1.1.1.2 maekawa 4064: @cindex Assignment functions
1.1 maekawa 4065:
4066: These functions assign new values to already initialized floats
4067: (@pxref{Initializing Floats}).
4068:
4069: @deftypefun void mpf_set (mpf_t @var{rop}, mpf_t @var{op})
4070: @deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4071: @deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
4072: @deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
4073: @deftypefunx void mpf_set_z (mpf_t @var{rop}, mpz_t @var{op})
4074: @deftypefunx void mpf_set_q (mpf_t @var{rop}, mpq_t @var{op})
4075: Set the value of @var{rop} from @var{op}.
4076: @end deftypefun
4077:
4078: @deftypefun int mpf_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
4079: Set the value of @var{rop} from the string in @var{str}. The string is of the
4080: form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
4081: @samp{M} is the mantissa and @samp{N} is the exponent. The mantissa is always
4082: in the specified base. The exponent is either in the specified base or, if
1.1.1.4 ! ohara 4083: @var{base} is negative, in decimal. The decimal point expected is taken from
! 4084: the current locale, on systems providing @code{localeconv}.
1.1 maekawa 4085:
4086: The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
4087: @minus{}2. Negative values are used to specify that the exponent is in
4088: decimal.
4089:
4090: Unlike the corresponding @code{mpz} function, the base will not be determined
4091: from the leading characters of the string if @var{base} is 0. This is so that
4092: numbers like @samp{0.23} are not interpreted as octal.
4093:
1.1.1.2 maekawa 4094: White space is allowed in the string, and is simply ignored. [This is not
4095: really true; white-space is ignored in the beginning of the string and within
4096: the mantissa, but not in other places, such as after a minus sign or in the
4097: exponent. We are considering changing the definition of this function, making
4098: it fail when there is any white-space in the input, since that makes a lot of
4099: sense. Please tell us your opinion about this change. Do you really want it
1.1.1.4 ! ohara 4100: to accept @nicode{"3 14"} as meaning 314 as it does now?]
1.1 maekawa 4101:
1.1.1.4 ! ohara 4102: This function returns 0 if the entire string is a valid number in base
! 4103: @var{base}. Otherwise it returns @minus{}1.
1.1 maekawa 4104: @end deftypefun
4105:
1.1.1.2 maekawa 4106: @deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
1.1.1.4 ! ohara 4107: Swap @var{rop1} and @var{rop2} efficiently. Both the values and the
! 4108: precisions of the two variables are swapped.
1.1.1.2 maekawa 4109: @end deftypefun
4110:
1.1 maekawa 4111:
4112: @node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
4113: @comment node-name, next, previous, up
1.1.1.2 maekawa 4114: @section Combined Initialization and Assignment Functions
1.1 maekawa 4115: @cindex Initialization and assignment functions
1.1.1.2 maekawa 4116: @cindex Float init and assign functions
1.1 maekawa 4117:
1.1.1.2 maekawa 4118: For convenience, GMP provides a parallel series of initialize-and-set functions
1.1 maekawa 4119: which initialize the output and then store the value there. These functions'
4120: names have the form @code{mpf_init_set@dots{}}
4121:
4122: Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
4123: functions, it can be used as the source or destination operand for the ordinary
4124: float functions. Don't use an initialize-and-set function on a variable
4125: already initialized!
4126:
4127: @deftypefun void mpf_init_set (mpf_t @var{rop}, mpf_t @var{op})
4128: @deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4129: @deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
4130: @deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
4131: Initialize @var{rop} and set its value from @var{op}.
4132:
4133: The precision of @var{rop} will be taken from the active default precision, as
4134: set by @code{mpf_set_default_prec}.
4135: @end deftypefun
4136:
4137: @deftypefun int mpf_init_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
4138: Initialize @var{rop} and set its value from the string in @var{str}. See
4139: @code{mpf_set_str} above for details on the assignment operation.
4140:
4141: Note that @var{rop} is initialized even if an error occurs. (I.e., you have to
4142: call @code{mpf_clear} for it.)
4143:
4144: The precision of @var{rop} will be taken from the active default precision, as
4145: set by @code{mpf_set_default_prec}.
4146: @end deftypefun
4147:
4148:
4149: @node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
4150: @comment node-name, next, previous, up
4151: @section Conversion Functions
1.1.1.2 maekawa 4152: @cindex Float conversion functions
1.1 maekawa 4153: @cindex Conversion functions
4154:
4155: @deftypefun double mpf_get_d (mpf_t @var{op})
1.1.1.4 ! ohara 4156: Convert @var{op} to a @code{double}.
1.1 maekawa 4157: @end deftypefun
4158:
1.1.1.4 ! ohara 4159: @deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, mpf_t @var{op})
! 4160: Find @var{d} and @var{exp} such that @m{@var{d}\times 2^{exp}, @var{d} times 2
! 4161: raised to @var{exp}}, with @math{0.5@le{}@GMPabs{@var{d}}<1}, is a good
! 4162: approximation to @var{op}. This is similar to the standard C function
! 4163: @code{frexp}.
! 4164: @end deftypefun
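
For comparison, the following sketch shows the kind of pair the standard C
function @code{frexp} produces for a plain @code{double}. It does not call GMP;
the wrapper name @code{split_double} and the parameter @code{exp_out} are
assumptions made for the illustration, chosen to mirror the @code{long}
exponent used by @code{mpf_get_d_2exp}.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Split x into d * 2^(*exp_out) with 0.5 <= |d| < 1 (for non-zero x),
   using the standard C frexp.  Hypothetical wrapper, not part of GMP.  */
double
split_double (double x, long *exp_out)
{
  int e;
  double d;

  assert (exp_out != NULL);
  d = frexp (x, &e);
  *exp_out = e;
  return d;
}
```

For instance, 8.0 splits into 0.5 and exponent 4, and @minus{}3.0 into
@minus{}0.75 and exponent 2.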
! 4165:
! 4166: @deftypefun long mpf_get_si (mpf_t @var{op})
! 4167: @deftypefunx {unsigned long} mpf_get_ui (mpf_t @var{op})
! 4168: Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
! 4169: fraction part. If @var{op} is too big for the return type, the result is
! 4170: undefined.
1.1 maekawa 4171:
1.1.1.4 ! ohara 4172: See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
! 4173: (@pxref{Miscellaneous Float Functions}).
! 4174: @end deftypefun
1.1 maekawa 4175:
1.1.1.4 ! ohara 4176: @deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
! 4177: Convert @var{op} to a string of digits in base @var{base}. @var{base} can be
! 4178: 2 to 36. Up to @var{n_digits} digits will be generated. Trailing zeros are
! 4179: not returned. No more digits than can be accurately represented by @var{op}
! 4180: are ever generated. If @var{n_digits} is 0 then that accurate maximum number
! 4181: of digits are generated.
! 4182:
! 4183: If @var{str} is @code{NULL}, the result string is allocated using the current
! 4184: allocation function (@pxref{Custom Allocation}). The block will be
! 4185: @code{strlen(str)+1} bytes, that being exactly enough for the string and
! 4186: null-terminator.
! 4187:
! 4188: If @var{str} is not @code{NULL}, it should point to a block of
! 4189: @math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
! 4190: possible minus sign, and a null-terminator. When @var{n_digits} is 0 to get
! 4191: all significant digits, an application won't be able to know the space
! 4192: required, and @var{str} should be @code{NULL} in that case.
1.1 maekawa 4193:
4194: The generated string is a fraction, with an implicit radix point immediately
1.1.1.4 ! ohara 4195: to the left of the first digit. The applicable exponent is written through
! 4196: the @var{expptr} pointer. For example, the number 3.1416 would be returned as
! 4197: string @nicode{"31416"} and exponent 1.
! 4198:
! 4199: When @var{op} is zero, an empty string is produced and the exponent returned
! 4200: is 0.
1.1.1.2 maekawa 4201:
1.1.1.4 ! ohara 4202: A pointer to the result string is returned, being either the allocated block
! 4203: or the given @var{str}.
1.1 maekawa 4204: @end deftypefun
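
To show how the digit string and the exponent fit together, here is a minimal
sketch of a helper that places the radix point and produces ordinary
positional notation. The function @code{format_fraction}, its parameters and
its return convention are hypothetical and not part of GMP; it only
rearranges characters, so it works on strings of the form described above
without calling the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Turn a digit string with an implicit leading radix point, plus an
   exponent, into positional notation: ("31416", 1) gives "3.1416".
   An optional leading '-' is preserved and an empty digit string gives "0".
   Returns 0 on success, -1 if `out` is too small.
   Hypothetical helper, not part of GMP.  */
int
format_fraction (const char *digits, long exp, char *out, size_t outsize)
{
  size_t len, need, zeros, pos = 0, i;
  int negative;

  assert (digits != NULL && out != NULL);

  negative = (digits[0] == '-');
  if (negative)
    digits++;
  len = strlen (digits);

  if (len == 0)                 /* zero is returned as an empty string */
    {
      if (outsize < 2)
        return -1;
      strcpy (out, "0");
      return 0;
    }

  /* Number of zeros between "0." and the digits when exp <= 0.  */
  zeros = (exp <= 0) ? (size_t) 0 - (size_t) exp : 0;

  if (exp <= 0)
    need = 2 + zeros + len;     /* "0." zeros digits */
  else if ((size_t) exp < len)
    need = len + 1;             /* radix point inside the digits */
  else
    need = (size_t) exp;        /* digits padded with trailing zeros */
  if (negative)
    need++;
  if (need + 1 > outsize)       /* +1 for the null-terminator */
    return -1;

  if (negative)
    out[pos++] = '-';
  if (exp <= 0)
    {
      out[pos++] = '0';
      out[pos++] = '.';
      for (i = 0; i < zeros; i++)
        out[pos++] = '0';
      memcpy (out + pos, digits, len);
      pos += len;
    }
  else if ((size_t) exp < len)
    {
      memcpy (out + pos, digits, (size_t) exp);
      pos += (size_t) exp;
      out[pos++] = '.';
      memcpy (out + pos, digits + exp, len - (size_t) exp);
      pos += len - (size_t) exp;
    }
  else
    {
      memcpy (out + pos, digits, len);
      pos += len;
      for (i = len; i < (size_t) exp; i++)
        out[pos++] = '0';
    }
  out[pos] = '\0';
  return 0;
}
```

With this helper, the string @nicode{"31416"} and exponent 1 from the example
above become @nicode{"3.1416"}, and @nicode{"25"} with exponent @minus{}1
becomes @nicode{"0.025"}.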
4205:
4206:
4207: @node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
4208: @comment node-name, next, previous, up
4209: @section Arithmetic Functions
4210: @cindex Float arithmetic functions
4211: @cindex Arithmetic functions
4212:
4213: @deftypefun void mpf_add (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4214: @deftypefunx void mpf_add_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
1.1.1.4 ! ohara 4215: Set @var{rop} to @math{@var{op1} + @var{op2}}.
1.1 maekawa 4216: @end deftypefun
4217:
4218: @deftypefun void mpf_sub (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4219: @deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
4220: @deftypefunx void mpf_sub_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4221: Set @var{rop} to @var{op1} @minus{} @var{op2}.
4222: @end deftypefun
4223:
4224: @deftypefun void mpf_mul (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4225: @deftypefunx void mpf_mul_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
1.1.1.4 ! ohara 4226: Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
1.1 maekawa 4227: @end deftypefun
4228:
1.1.1.4 ! ohara 4229: Division is undefined if the divisor is zero, and passing a zero divisor to the
! 4230: divide functions will make these functions intentionally divide by zero. This
! 4231: lets the user handle arithmetic exceptions in these functions in the same
1.1.1.2 maekawa 4232: manner as other arithmetic exceptions.
1.1 maekawa 4233:
4234: @deftypefun void mpf_div (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4235: @deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
4236: @deftypefunx void mpf_div_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
1.1.1.2 maekawa 4237: @cindex Division functions
1.1 maekawa 4238: Set @var{rop} to @var{op1}/@var{op2}.
4239: @end deftypefun
4240:
4241: @deftypefun void mpf_sqrt (mpf_t @var{rop}, mpf_t @var{op})
4242: @deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
1.1.1.2 maekawa 4243: @cindex Root extraction functions
1.1.1.4 ! ohara 4244: Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
1.1 maekawa 4245: @end deftypefun
4246:
1.1.1.2 maekawa 4247: @deftypefun void mpf_pow_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4248: @cindex Exponentiation functions
1.1.1.4 ! ohara 4249: @cindex Powering functions
! 4250: Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
1.1.1.2 maekawa 4251: @end deftypefun
1.1 maekawa 4252:
4253: @deftypefun void mpf_neg (mpf_t @var{rop}, mpf_t @var{op})
4254: Set @var{rop} to @minus{}@var{op}.
4255: @end deftypefun
4256:
4257: @deftypefun void mpf_abs (mpf_t @var{rop}, mpf_t @var{op})
4258: Set @var{rop} to the absolute value of @var{op}.
4259: @end deftypefun
4260:
4261: @deftypefun void mpf_mul_2exp (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
1.1.1.4 ! ohara 4262: Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
! 4263: @var{op2}}.
1.1 maekawa 4264: @end deftypefun
4265:
4266: @deftypefun void mpf_div_2exp (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
1.1.1.4 ! ohara 4267: Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
! 4268: @var{op2}}.
1.1 maekawa 4269: @end deftypefun
4270:
4271: @node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
4272: @comment node-name, next, previous, up
4273: @section Comparison Functions
1.1.1.2 maekawa 4274: @cindex Float comparison functions
1.1 maekawa 4275: @cindex Comparison functions
4276:
4277: @deftypefun int mpf_cmp (mpf_t @var{op1}, mpf_t @var{op2})
1.1.1.4 ! ohara 4278: @deftypefunx int mpf_cmp_d (mpf_t @var{op1}, double @var{op2})
1.1 maekawa 4279: @deftypefunx int mpf_cmp_ui (mpf_t @var{op1}, unsigned long int @var{op2})
4280: @deftypefunx int mpf_cmp_si (mpf_t @var{op1}, signed long int @var{op2})
1.1.1.4 ! ohara 4281: Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
! 4282: @var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
! 4283: @math{@var{op1} < @var{op2}}.
1.1 maekawa 4284: @end deftypefun
4285:
4286: @deftypefun int mpf_eq (mpf_t @var{op1}, mpf_t @var{op2}, unsigned long int @var{op3})
4287: Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
1.1.1.4 ! ohara 4288: equal, zero otherwise. I.e., test if @var{op1} and @var{op2} are approximately
! 4289: equal.
! 4290:
! 4291: Caution: Currently only whole limbs are compared, and only in an exact
! 4292: fashion. In the future values like 1000 and 0111 may be considered the same
! 4293: to 3 bits (on the basis that their difference is that small).
1.1 maekawa 4294: @end deftypefun
4295:
4296: @deftypefun void mpf_reldiff (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4297: Compute the relative difference between @var{op1} and @var{op2} and store the
1.1.1.4 ! ohara 4298: result in @var{rop}. This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
1.1 maekawa 4299: @end deftypefun
4300:
4301: @deftypefn Macro int mpf_sgn (mpf_t @var{op})
1.1.1.4 ! ohara 4302: @cindex Sign tests
! 4303: @cindex Float sign tests
! 4304: Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
! 4305: @math{-1} if @math{@var{op} < 0}.
1.1 maekawa 4306:
1.1.1.4 ! ohara 4307: This function is actually implemented as a macro. It evaluates its argument
! 4308: multiple times.
1.1 maekawa 4309: @end deftypefn
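
The hazard of a macro that evaluates its argument more than once can be
sketched in plain C. The macro @code{SGN_MODEL} below is an illustrative
stand-in operating on an @code{int}; its name and body are assumptions and not
GMP's definition of @code{mpf_sgn}. The counter @code{calls} and the function
@code{next_value} are likewise hypothetical.

```c
#include <assert.h>

/* Illustrative sign macro: its argument appears twice in the expansion,
   so an argument with side effects may be evaluated twice.  */
#define SGN_MODEL(x)  ((x) < 0 ? -1 : (x) > 0)

int calls = 0;   /* number of times next_value has been called */

/* Return successive values from a small fixed table, counting calls.  */
int
next_value (void)
{
  static const int table[] = { 5, -7, 3, 0 };

  assert (calls < 4);
  return table[calls++];
}
```

Writing @code{SGN_MODEL (next_value ())} as the first use calls
@code{next_value} twice: the first test sees 5 and the second sees
@minus{}7, so the result is 0 rather than 1. Storing the value in a variable
first and passing that variable avoids the problem.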
4310:
4311: @node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
4312: @comment node-name, next, previous, up
4313: @section Input and Output Functions
4314: @cindex Float input and output functions
4315: @cindex Input functions
4316: @cindex Output functions
4317: @cindex I/O functions
4318:
4319: This section describes functions that read from a stdio stream and functions that write to
1.1.1.4 ! ohara 4320: a stdio stream. Passing a @code{NULL} pointer for a @var{stream} argument to
! 4321: any of these functions will make them read from @code{stdin} and write to
1.1 maekawa 4322: @code{stdout}, respectively.
4323:
4324: When using any of these functions, it is a good idea to include @file{stdio.h}
4325: before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
4326: for these functions.
4327:
4328: @deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
1.1.1.4 ! ohara 4329: Print @var{op} to @var{stream}, as a string of digits. Return the number of
! 4330: bytes written, or if an error occurred, return 0.
1.1 maekawa 4331:
1.1.1.4 ! ohara 4332: The mantissa is prefixed with a @samp{0.} and is in the given @var{base},
! 4333: which may vary from 2 to 36. An exponent is then printed, separated by an
! 4334: @samp{e}, or if @var{base} is greater than 10 then by an @samp{@@}. The
! 4335: exponent is always in decimal. The decimal point follows the current locale,
! 4336: on systems providing @code{localeconv}.
! 4337:
! 4338: Up to @var{n_digits} will be printed from the mantissa, except that no more
! 4339: digits than are accurately representable by @var{op} will be printed.
! 4340: @var{n_digits} can be 0 to select that accurate maximum.
1.1 maekawa 4341: @end deftypefun
4342:
4343: @deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
1.1.1.4 ! ohara 4344: Read a string in base @var{base} from @var{stream}, and put the read float in
! 4345: @var{rop}. The string is of the form @samp{M@@N} or, if the base is 10 or
! 4346: less, alternatively @samp{MeN}. @samp{M} is the mantissa and @samp{N} is the
! 4347: exponent. The mantissa is always in the specified base. The exponent is
! 4348: either in the specified base or, if @var{base} is negative, in decimal. The
! 4349: decimal point expected is taken from the current locale, on systems providing
! 4350: @code{localeconv}.
1.1 maekawa 4351:
4352: The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
4353: @minus{}2. Negative values are used to specify that the exponent is in
4354: decimal.
4355:
4356: Unlike the corresponding @code{mpz} function, the base will not be determined
4357: from the leading characters of the string if @var{base} is 0. This is so that
4358: numbers like @samp{0.23} are not interpreted as octal.
4359:
4360: Return the number of bytes read, or if an error occurred, return 0.
4361: @end deftypefun
4362:
4363: @c @deftypefun void mpf_out_raw (FILE *@var{stream}, mpf_t @var{float})
4364: @c Output @var{float} on stdio stream @var{stream}, in raw binary
4365: @c format. The float is written in a portable format, with 4 bytes of
4366: @c size information, and that many bytes of limbs. Both the size and the
4367: @c limbs are written in decreasing significance order.
4368: @c @end deftypefun
4369:
4370: @c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
4371: @c Input from stdio stream @var{stream} in the format written by
4372: @c @code{mpf_out_raw}, and put the result in @var{float}.
4373: @c @end deftypefun
4374:
4375:
1.1.1.2 maekawa 4376: @node Miscellaneous Float Functions, , I/O of Floats, Floating-point Functions
1.1 maekawa 4377: @comment node-name, next, previous, up
4378: @section Miscellaneous Functions
4379: @cindex Miscellaneous float functions
1.1.1.2 maekawa 4380: @cindex Float miscellaneous functions
4381:
4382: @deftypefun void mpf_ceil (mpf_t @var{rop}, mpf_t @var{op})
4383: @deftypefunx void mpf_floor (mpf_t @var{rop}, mpf_t @var{op})
4384: @deftypefunx void mpf_trunc (mpf_t @var{rop}, mpf_t @var{op})
1.1.1.4 ! ohara 4385: Set @var{rop} to @var{op} rounded to an integer. @code{mpf_ceil} rounds to the
! 4386: next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
! 4387: to the integer towards zero.
! 4388: @end deftypefun
! 4389:
! 4390: @deftypefun int mpf_integer_p (mpf_t @var{op})
! 4391: Return non-zero if @var{op} is an integer.
! 4392: @end deftypefun
! 4393:
! 4394: @deftypefun int mpf_fits_ulong_p (mpf_t @var{op})
! 4395: @deftypefunx int mpf_fits_slong_p (mpf_t @var{op})
! 4396: @deftypefunx int mpf_fits_uint_p (mpf_t @var{op})
! 4397: @deftypefunx int mpf_fits_sint_p (mpf_t @var{op})
! 4398: @deftypefunx int mpf_fits_ushort_p (mpf_t @var{op})
! 4399: @deftypefunx int mpf_fits_sshort_p (mpf_t @var{op})
! 4400: Return non-zero if @var{op} would fit in the respective C data type, when
! 4401: truncated to an integer.
1.1.1.2 maekawa 4402: @end deftypefun
4403:
1.1.1.4 ! ohara 4404: @deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{nbits})
! 4405: Generate a uniformly distributed random float in @var{rop}, such that @math{0
! 4406: @le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa.
1.1.1.2 maekawa 4407:
4408: The variable @var{state} must be initialized by calling one of the
1.1.1.4 ! ohara 4409: @code{gmp_randinit} functions (@ref{Random State Initialization}) before
! 4410: invoking this function.
1.1.1.2 maekawa 4411: @end deftypefun
1.1 maekawa 4412:
1.1.1.4 ! ohara 4413: @deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
1.1 maekawa 4414: Generate a random float of at most @var{max_size} limbs, with long strings of
4415: zeros and ones in the binary representation. The exponent of the number is in
4416: the interval @minus{}@var{exp} to @var{exp}. This function is useful for
1.1.1.4 ! ohara 4417: testing functions and algorithms, since random numbers of this kind have proven
! 4418: to be more likely to trigger corner-case bugs. Negative random numbers are
! 4419: generated when @var{max_size} is negative.
1.1 maekawa 4420: @end deftypefun
4421:
4422: @c @deftypefun size_t mpf_size (mpf_t @var{op})
4423: @c Return the size of @var{op} measured in number of limbs. If @var{op} is
4424: @c zero, the returned value will be zero. (@xref{Nomenclature}, for an
4425: @c explanation of the concept @dfn{limb}.)
4426: @c
1.1.1.2 maekawa 4427: @c @strong{This function is obsolete. It will disappear from future GMP
1.1 maekawa 4428: @c releases.}
4429: @c @end deftypefun
4430:
1.1.1.4 ! ohara 4431:
1.1.1.2 maekawa 4432: @node Low-level Functions, Random Number Functions, Floating-point Functions, Top
1.1 maekawa 4433: @comment node-name, next, previous, up
4434: @chapter Low-level Functions
4435: @cindex Low-level functions
4436:
1.1.1.4 ! ohara 4437: This chapter describes low-level GMP functions, used to implement the
! 4438: high-level GMP functions, but also intended for time-critical user code.
1.1 maekawa 4439:
4440: These functions start with the prefix @code{mpn_}.
4441:
4442: @c 1. Some of these functions clobber input operands.
4443: @c
4444:
4445: The @code{mpn} functions are designed to be as fast as possible, @strong{not}
4446: to provide a coherent calling interface. The different functions have somewhat
4447: similar interfaces, but there are variations that make them hard to use. These
4448: functions do as little as possible apart from the real multiple precision
4449: computation, so that no time is spent on things that not all callers need.
4450:
4451: A source operand is specified by a pointer to the least significant limb and a
4452: limb count. A destination operand is specified by just a pointer. It is the
4453: responsibility of the caller to ensure that the destination has enough space
4454: for storing the result.
4455:
1.1.1.4 ! ohara 4456: With this way of specifying operands, it is possible to perform computations on
! 4457: subranges of an argument, and store the result into a subrange of a
1.1 maekawa 4458: destination.
4459:
1.1.1.4 ! ohara 4460: A common requirement for all functions is that each source area needs at least
! 4461: one limb. No size argument may be zero. Unless otherwise stated, in-place
! 4462: operations are allowed where source and destination are the same, but not where
! 4463: they only partly overlap.
1.1 maekawa 4464:
1.1.1.2 maekawa 4465: The @code{mpn} functions are the base for the implementation of the
4466: @code{mpz_}, @code{mpf_}, and @code{mpq_} functions.
1.1 maekawa 4467:
1.1.1.4 ! ohara 4468: This example adds the number beginning at @var{s1p} and the number beginning at
! 4469: @var{s2p} and writes the sum at @var{destp}. All areas have @var{n} limbs.
1.1 maekawa 4470:
4471: @example
1.1.1.4 ! ohara 4472: cy = mpn_add_n (destp, s1p, s2p, n)
1.1 maekawa 4473: @end example
4474:
4475: @noindent
4476: In the notation used here, a source operand is identified by the pointer to
4477: the least significant limb, and the limb count in braces. For example,
1.1.1.4 ! ohara 4478: @{@var{s1p}, @var{s1n}@}.
1.1 maekawa 4479:
1.1.1.4 ! ohara 4480: @deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
! 4481: Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
! 4482: least significant limbs of the result to @var{rp}. Return carry, either 0 or
! 4483: 1.
1.1 maekawa 4484:
4485: This is the lowest-level function for addition. It is the preferred function
1.1.1.4 ! ohara 4486: for addition, since it is written in assembly for most CPUs. For addition of
! 4487: a variable to itself (i.e., @var{s1p} equals @var{s2p}), use @code{mpn_lshift}
! 4488: with a count of 1 for optimal speed.
1.1 maekawa 4489: @end deftypefun
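
The documented behaviour of @code{mpn_add_n} can be modelled in portable C.
The sketch below is a reference model only, not GMP's implementation: the
type @code{limb_t} (fixed here at 32 bits) and the name @code{add_n_model} are
assumptions made for the illustration, and the real limb size depends on the
platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* stand-in for mp_limb_t; the width is assumed */

/* Add {s1p, n} and {s2p, n}, least significant limb first, store the n low
   limbs of the sum at rp, and return the carry out (0 or 1).  rp may be the
   same array as s1p or s2p, since each limb is read before it is written.
   Reference model only, not part of GMP.  */
limb_t
add_n_model (limb_t *rp, const limb_t *s1p, const limb_t *s2p, size_t n)
{
  limb_t carry = 0;
  size_t i;

  assert (n >= 1);                     /* no size argument may be zero */
  for (i = 0; i < n; i++)
    {
      uint64_t sum = (uint64_t) s1p[i] + s2p[i] + carry;
      rp[i] = (limb_t) sum;            /* low 32 bits of the limb sum */
      carry = (limb_t) (sum >> 32);    /* carry into the next limb */
    }
  return carry;
}
```

A returned carry of 1 means the full sum needs one more limb than the
operands. The caller decides whether to store it.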
4490:
1.1.1.4 ! ohara 4491: @deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
! 4492: Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
! 4493: significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
1.1 maekawa 4494: @end deftypefun
4495:
1.1.1.4 ! ohara 4496: @deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
! 4497: Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
! 4498: @var{s1n} least significant limbs of the result to @var{rp}. Return carry,
! 4499: either 0 or 1.
1.1 maekawa 4500:
1.1.1.4 ! ohara 4501: This function requires that @var{s1n} is greater than or equal to @var{s2n}.
1.1 maekawa 4502: @end deftypefun
4503:
1.1.1.4 ! ohara 4504: @deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
! 4505: Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
! 4506: @var{n} least significant limbs of the result to @var{rp}. Return borrow,
! 4507: either 0 or 1.
1.1 maekawa 4508:
4509: This is the lowest-level function for subtraction. It is the preferred
1.1.1.4 ! ohara 4510: function for subtraction, since it is written in assembly for most CPUs.
1.1 maekawa 4511: @end deftypefun
4512:
1.1.1.4 ! ohara 4513: @deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
! 4514: Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
! 4515: significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
1.1 maekawa 4516: @end deftypefun
4517:
1.1.1.4 ! ohara 4518: @deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
! 4519: Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
! 4520: @var{s1n} least significant limbs of the result to @var{rp}. Return borrow,
! 4521: either 0 or 1.
1.1 maekawa 4522:
1.1.1.4 ! ohara 4523: This function requires that @var{s1n} is greater than or equal to
! 4524: @var{s2n}.
1.1 maekawa 4525: @end deftypefun
4526:
1.1.1.4 ! ohara 4527: @deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
! 4528: Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
! 4529: 2*@var{n}-limb result to @var{rp}.
1.1 maekawa 4530:
1.1.1.4 ! ohara 4531: The destination has to have space for 2*@var{n} limbs, even if the product's
! 4532: most significant limb is zero.
1.1 maekawa 4533: @end deftypefun
4534:
1.1.1.4 ! ohara 4535: @deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
! 4536: Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
! 4537: significant limbs of the product to @var{rp}. Return the most significant
! 4538: limb of the product. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
! 4539: allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
1.1 maekawa 4540:
4541: This is a low-level function that is a building block for general
1.1.1.2 maekawa 4542: multiplication as well as other operations in GMP. It is written in assembly
1.1.1.4 ! ohara 4543: for most CPUs.
1.1 maekawa 4544:
1.1.1.4 ! ohara 4545: Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
! 4546: with a count equal to the logarithm of @var{s2limb} instead, for optimal speed.
1.1 maekawa 4547: @end deftypefun
4548:
1.1.1.4 ! ohara 4549: @deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
! 4550: Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
! 4551: significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
! 4552: to @var{rp}. Return the most significant limb of the product, plus carry-out
! 4553: from the addition.
1.1 maekawa 4554:
4555: This is a low-level function that is a building block for general
1.1.1.2 maekawa 4556: multiplication as well as other operations in GMP. It is written in assembly
1.1.1.4 ! ohara 4557: for most CPUs.
1.1 maekawa 4558: @end deftypefun
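
To illustrate how @code{mpn_mul_1} and @code{mpn_addmul_1} serve as building
blocks for general multiplication, the following sketch models both in
portable C and combines them into a schoolbook product. It is an assumption
about one way they can be combined, not a description of how GMP multiplies
internally. The type @code{limb_t} (fixed at 32 bits) and the @code{_model}
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* stand-in for mp_limb_t; the width is assumed */

/* {rp, n} = low n limbs of {s1p, n} * s2limb; return the high limb.  */
limb_t
mul_1_model (limb_t *rp, const limb_t *s1p, size_t n, limb_t s2limb)
{
  limb_t carry = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      uint64_t t = (uint64_t) s1p[i] * s2limb + carry;
      rp[i] = (limb_t) t;
      carry = (limb_t) (t >> 32);
    }
  return carry;
}

/* {rp, n} += {s1p, n} * s2limb; return the high limb plus carry-out.
   The 64-bit intermediate cannot overflow: the largest possible value is
   (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1.  */
limb_t
addmul_1_model (limb_t *rp, const limb_t *s1p, size_t n, limb_t s2limb)
{
  limb_t carry = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      uint64_t t = (uint64_t) s1p[i] * s2limb + rp[i] + carry;
      rp[i] = (limb_t) t;
      carry = (limb_t) (t >> 32);
    }
  return carry;
}

/* Schoolbook product: {rp, s1n + s2n} = {s1p, s1n} * {s2p, s2n}.
   rp must not overlap either input.  */
void
mul_model (limb_t *rp, const limb_t *s1p, size_t s1n,
           const limb_t *s2p, size_t s2n)
{
  size_t j;

  assert (s1n >= 1 && s2n >= 1);
  /* The first row initializes the low s1n + 1 limbs of the product.  */
  rp[s1n] = mul_1_model (rp, s1p, s1n, s2p[0]);
  /* Each later row is shifted up by one limb and accumulated.  */
  for (j = 1; j < s2n; j++)
    rp[s1n + j] = addmul_1_model (rp + j, s1p, s1n, s2p[j]);
}
```

The destination receives @math{@var{s1n} + @var{s2n}} limbs, even when the
most significant limb of the product turns out to be zero.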
4559:
1.1.1.4 ! ohara 4560: @deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
! 4561: Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
! 4562: least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
! 4563: result to @var{rp}. Return the most significant limb of the product, minus
! 4564: borrow-out from the subtraction.
1.1 maekawa 4565:
4566: This is a low-level function that is a building block for general
1.1.1.2 maekawa 4567: multiplication and division as well as other operations in GMP. It is written
1.1.1.4 ! ohara 4568: in assembly for most CPUs.
1.1 maekawa 4569: @end deftypefun
4570:
1.1.1.4 ! ohara 4571: @deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
! 4572: Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
! 4573: result to @var{rp}. Return the most significant limb of the result.
1.1 maekawa 4574:
1.1.1.4 ! ohara 4575: The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
! 4576: result might be one limb smaller.
1.1 maekawa 4577:
1.1.1.4 ! ohara 4578: This function requires that @var{s1n} is greater than or equal to
! 4579: @var{s2n}. The destination must be distinct from both input operands.
1.1.1.2 maekawa 4580: @end deftypefun
4581:
1.1.1.3 maekawa 4582: @deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
1.1.1.4 ! ohara 4583: Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
! 4584: at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
! 4585: @var{dn}@}. The quotient is rounded towards 0.
1.1.1.2 maekawa 4586:
1.1.1.4 ! ohara 4587: No overlap is permitted between arguments. @var{nn} must be greater than or
! 4588: equal to @var{dn}. The most significant limb of @var{dp} must be non-zero.
! 4589: The @var{qxn} operand must be zero.
! 4590: @comment FIXME: Relax overlap requirements!
1.1 maekawa 4591: @end deftypefun
4592:
1.1.1.4 ! ohara 4593: @deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
! 4594: [This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
! 4595: performance.]
1.1.1.2 maekawa 4596:
1.1.1.4 ! ohara 4597: Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
! 4598: quotient at @var{r1p}, with the exception of the most significant limb, which
! 4599: is returned. The remainder replaces the dividend at @var{rs2p}; it will be
! 4600: @var{s3n} limbs long (i.e., as many limbs as the divisor).
1.1 maekawa 4601:
1.1.1.4 ! ohara 4602: In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
! 4603: stored after the integral limbs. For most usages, @var{qxn} will be zero.
1.1 maekawa 4604:
1.1.1.4 ! ohara 4605: It is required that @var{rs2n} is greater than or equal to @var{s3n}. It is
! 4606: required that the most significant bit of the divisor is set.
1.1 maekawa 4607:
1.1.1.4 ! ohara 4608: If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}. Aside
! 4609: from that special case, no overlap between arguments is permitted.
1.1 maekawa 4610:
4611: Return the most significant limb of the quotient, either 0 or 1.
4612:
1.1.1.4 ! ohara 4613: The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
! 4614: limbs large.
1.1 maekawa 4615: @end deftypefun
4616:
1.1.1.4 ! ohara 4617: @deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
! 4618: @deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
! 4619: Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
! 4620: @var{r1p}. Return the remainder.
! 4621:
! 4622: The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
! 4623: addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
! 4624: @var{qxn}@}. Either or both @var{s2n} and @var{qxn} can be zero. For most
! 4625: usages, @var{qxn} will be zero.
1.1.1.2 maekawa 4626:
4627: @code{mpn_divmod_1} exists for upward source compatibility and is simply a
1.1.1.4 ! ohara 4628: macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.
1.1 maekawa 4629:
4630: The areas at @var{r1p} and @var{s2p} have to be identical or completely
4631: separate, not partially overlapping.
1.1.1.2 maekawa 4632: @end deftypefn
1.1 maekawa 4633:
1.1.1.4 ! ohara 4634: @deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
! 4635: [This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
! 4636: performance.]
1.1 maekawa 4637: @end deftypefun
4638:
1.1.1.4 ! ohara 4639: @deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
! 4640: @deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
! 4641: Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
! 4642: the result to @{@var{rp}, @var{n}@}. If 3 divides exactly, the return value is
! 4643: zero and the result is the quotient. If not, the return value is non-zero and
! 4644: the result won't be anything useful.
1.1.1.2 maekawa 4645:
4646: @code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
4647: return value from a previous call, so a large calculation can be done piece by
1.1.1.4 ! ohara 4648: piece from low to high. @code{mpn_divexact_by3} is simply a macro calling
1.1.1.2 maekawa 4649: @code{mpn_divexact_by3c} with a 0 carry parameter.
4650:
4651: These routines use a multiply-by-inverse and will be faster than
4652: @code{mpn_divrem_1} on CPUs with fast multiplication but slow division.
4653:
4654: The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
1.1.1.4 ! ohara 4655: and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
! 4656: @m{b=2\GMPraise{@code{mp\_bits\_per\_limb}}, b=2^mp_bits_per_limb}. The
! 4657: return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
! 4658: be 0, 1 or 2 (these are both borrows really). When @math{c=0} clearly
! 4659: @math{q=(a-i)/3}. When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
! 4660: 3} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
! 4661: @code{mp_bits_per_limb} is even, which is always so currently).
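
The relation above can be checked with a short model of the multiply-by-inverse technique. This is a sketch under stated assumptions, not the library's implementation: limbs are taken to be 32 bits, and @code{ref_divexact_by3c} and @code{INV3} are hypothetical names.

```c
#include <stdint.h>
#include <stddef.h>

/* 3 * INV3 == 1 (mod 2^32), so multiplying by INV3 divides by 3 mod 2^32. */
#define INV3 0xAAAAAAABu

/* Hypothetical model with 32-bit limbs (not GMP code).  Processes limbs
   from low to high; `carry` must be 0, 1 or 2 and the return value is
   again 0, 1 or 2, satisfying  c*b^n + a - i = 3*q  with b = 2^32. */
uint32_t ref_divexact_by3c(uint32_t *rp, const uint32_t *sp, size_t n,
                           uint32_t carry)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t s = sp[i];
        uint32_t l = s - carry;            /* s - carry, reduced mod 2^32 */
        uint32_t borrow = (s < carry);     /* 1 if that subtraction wrapped */
        uint32_t q = l * INV3;             /* quotient limb, mod 2^32 */
        rp[i] = q;
        /* 3*q = l + k*2^32 with k in {0,1,2}; k is what must be borrowed
           from the next limb up. */
        carry = borrow + (uint32_t)(((uint64_t)3 * q) >> 32);
    }
    return carry;
}
```

Calling the model once per limb, feeding each return value in as the next carry, gives the same result as a single call over the whole operand, which is the piece-by-piece usage described above.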
1.1.1.2 maekawa 4662: @end deftypefn
1.1 maekawa 4663:
1.1.1.4 ! ohara 4664: @deftypefun mp_limb_t mpn_mod_1 (mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
! 4665: Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
! 4666: @var{s1n} can be zero.
1.1 maekawa 4667: @end deftypefun
4668:
1.1.1.4 ! ohara 4669: @deftypefun mp_limb_t mpn_bdivmod (mp_limb_t *@var{rp}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n}, unsigned long int @var{d})
! 4670: This function puts the low
! 4671: @math{@GMPfloor{@var{d}/@nicode{mp\_bits\_per\_limb}}} limbs of @var{q} =
! 4672: @{@var{s1p}, @var{s1n}@}/@{@var{s2p}, @var{s2n}@} mod @m{2^d,2^@var{d}} at
! 4673: @var{rp}, and returns the high @var{d} mod @code{mp_bits_per_limb} bits of
! 4674: @var{q}.
1.1 maekawa 4675:
1.1.1.4 ! ohara 4676: @{@var{s1p}, @var{s1n}@} - @var{q} * @{@var{s2p}, @var{s2n}@} mod @m{2
! 4677: \GMPraise{@var{s1n}*@code{mp\_bits\_per\_limb}},
! 4678: 2^(@var{s1n}*@nicode{mp\_bits\_per\_limb})} is placed at @var{s1p}. Since the
! 4679: low @math{@GMPfloor{@var{d}/@nicode{mp\_bits\_per\_limb}}} limbs of this
! 4680: difference are zero, it is possible to overwrite the low limbs at @var{s1p}
! 4681: with this difference, provided @math{@var{rp} @le{} @var{s1p}}.
1.1 maekawa 4682:
1.1.1.4 ! ohara 4683: This function requires that @math{@var{s1n} * @nicode{mp\_bits\_per\_limb}
! 4684: @ge{} @var{d}}, and that @{@var{s2p}, @var{s2n}@} is odd.
1.1 maekawa 4685:
1.1.1.4 ! ohara 4686: @strong{This interface is preliminary. It might change incompatibly in future
! 4687: revisions.}
1.1 maekawa 4688: @end deftypefun
4689:
1.1.1.4 ! ohara 4690: @deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
! 4691: Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
! 4692: @{@var{rp}, @var{n}@}. The bits shifted out at the left are returned in the
! 4693: least significant @var{count} bits of the return value (the rest of the return
! 4694: value is zero).
1.1 maekawa 4695:
1.1.1.4 ! ohara 4696: @var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
! 4697: regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
! 4698: @math{@var{rp} @ge{} @var{sp}}.
1.1 maekawa 4699:
1.1.1.4 ! ohara 4700: This function is written in assembly for most CPUs.
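
The return value and the overlap rule can be seen in a plain C model. This is a sketch only, assuming 32-bit limbs, @math{n @ge{} 1} and a hypothetical name @code{ref_lshift}; the real routine is usually assembly.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical model with 32-bit limbs (not GMP code).
   Requires n >= 1 and 1 <= count <= 31. */
uint32_t ref_lshift(uint32_t *rp, const uint32_t *sp, size_t n,
                    unsigned count)
{
    /* Bits pushed out of the top limb, returned in the low `count` bits. */
    uint32_t out = sp[n - 1] >> (32 - count);

    /* Work from the high limb down, so that each source limb is read
       before it can be overwritten when rp >= sp. */
    for (size_t i = n - 1; i > 0; i--)
        rp[i] = (sp[i] << count) | (sp[i - 1] >> (32 - count));
    rp[0] = sp[0] << count;
    return out;
}
```

Walking downwards is what makes the @math{@var{rp} @ge{} @var{sp}} overlap safe in this model; a right shift would walk upwards instead.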
1.1 maekawa 4701: @end deftypefun
4702:
1.1.1.4 ! ohara 4703: @deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
! 4704: Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
! 4705: @{@var{rp}, @var{n}@}. The bits shifted out at the right are returned in the
! 4706: most significant @var{count} bits of the return value (the rest of the return
! 4707: value is zero).
1.1 maekawa 4708:
1.1.1.4 ! ohara 4709: @var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
! 4710: regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
! 4711: @math{@var{rp} @le{} @var{sp}}.
1.1 maekawa 4712:
1.1.1.4 ! ohara 4713: This function is written in assembly for most CPUs.
1.1 maekawa 4714: @end deftypefun
4715:
1.1.1.4 ! ohara 4716: @deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
! 4717: Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
! 4718: positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a
! 4719: negative value if @math{@var{s1} < @var{s2}}.
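
Because limbs are stored least significant first, a comparison has to start from the highest index. A minimal sketch, assuming 32-bit limbs and a hypothetical name @code{ref_cmp} (not GMP code):

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical model: compare two n-limb numbers stored least
   significant limb first.  Returns 1, 0 or -1. */
int ref_cmp(const uint32_t *s1p, const uint32_t *s2p, size_t n)
{
    while (n-- > 0) {
        /* The first differing limb from the top decides the result. */
        if (s1p[n] != s2p[n])
            return s1p[n] > s2p[n] ? 1 : -1;
    }
    return 0;
}
```

The model returns exactly 1 or @minus{}1, but callers of the real function should only rely on the sign of the result, as stated above.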
1.1 maekawa 4720: @end deftypefun
4721:
1.1.1.4 ! ohara 4722: @deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
! 4723: Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{s1p},
! 4724: @var{s1n}@} and @{@var{s2p}, @var{s2n}@}. The result can be up to @var{s2n}
! 4725: limbs; the return value is the actual number produced. Both source operands
! 4726: are destroyed.
1.1 maekawa 4727:
1.1.1.4 ! ohara 4728: @{@var{s1p}, @var{s1n}@} must have at least as many bits as @{@var{s2p},
! 4729: @var{s2n}@}. @{@var{s2p}, @var{s2n}@} must be odd. Both operands must have
! 4730: non-zero most significant limbs. No overlap is permitted between @{@var{s1p},
! 4731: @var{s1n}@} and @{@var{s2p}, @var{s2n}@}.
1.1 maekawa 4732: @end deftypefun
4733:
1.1.1.4 ! ohara 4734: @deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
! 4735: Return the greatest common divisor of @{@var{s1p}, @var{s1n}@} and
! 4736: @var{s2limb}. Both operands must be non-zero.
1.1 maekawa 4737: @end deftypefun
4738:
1.1.1.4 ! ohara 4739: @deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, mp_size_t *@var{r2n}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
! 4740: Calculate the greatest common divisor of @{@var{s1p}, @var{s1n}@} and
! 4741: @{@var{s2p}, @var{s2n}@}. Store the gcd at @{@var{r1p}, @var{retval}@} and
! 4742: the first cofactor at @{@var{r2p}, *@var{r2n}@}, with *@var{r2n} negative if
! 4743: the cofactor is negative. @var{r1p} and @var{r2p} should each have room for
! 4744: @math{@var{s1n}+1} limbs, but the return value and value stored through
! 4745: @var{r2n} indicate the actual number produced.
1.1.1.2 maekawa 4746:
1.1.1.4 ! ohara 4747: @math{@{@var{s1p}, @var{s1n}@} @ge{} @{@var{s2p}, @var{s2n}@}} is required,
! 4748: and both must be non-zero. The regions @{@var{s1p}, @math{@var{s1n}+1}@} and
! 4749: @{@var{s2p}, @math{@var{s2n}+1}@} are destroyed (i.e. the operands plus an
! 4750: extra limb past the end of each).
! 4751:
! 4752: The cofactor @var{r2} will satisfy @m{r_2 s_1 + k s_2 = r_1, @var{r2}*@var{s1}
! 4753: + @var{k}*@var{s2} = @var{r1}}. The second cofactor @var{k} is not calculated
! 4754: but can easily be obtained from @m{(r_1 - r_2 s_1) / s_2, (@var{r1} -
! 4755: @var{r2}*@var{s1}) / @var{s2}}.
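
The cofactor relation can be tried out on single-word values. The sketch below is not the @code{mpn_gcdext} algorithm; it is an ordinary extended Euclid on @code{int64_t}, with hypothetical names @code{ref_gcdext} and @code{second_cofactor}, and it assumes inputs small enough that the products do not overflow.

```c
#include <stdint.h>

/* Hypothetical single-word model (not GMP code).  Returns gcd(s1, s2)
   for s1 >= s2 > 0 and stores in *r2 a first cofactor such that
   r2*s1 + k*s2 == gcd for some integer k. */
int64_t ref_gcdext(int64_t s1, int64_t s2, int64_t *r2)
{
    int64_t a = s1, b = s2;
    int64_t u0 = 1, u1 = 0;   /* invariant: a == u0*s1 and b == u1*s1 (mod s2) */
    while (b != 0) {
        int64_t q = a / b;
        int64_t t = a - q * b;  a = b;   b = t;
        t = u0 - q * u1;        u0 = u1; u1 = t;
    }
    *r2 = u0;
    return a;
}

/* Recover the second cofactor k from the gcd and the first cofactor. */
int64_t second_cofactor(int64_t g, int64_t r2, int64_t s1, int64_t s2)
{
    return (g - r2 * s1) / s2;   /* the division is exact */
}
```

For instance, with @math{s1 = 240} and @math{s2 = 46} the model gives a gcd of 2 and a negative first cofactor, the case that @code{mpn_gcdext} signals through a negative *@var{r2n}.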
1.1 maekawa 4756: @end deftypefun
4757:
1.1.1.4 ! ohara 4758: @deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
! 4759: Compute the square root of @{@var{sp}, @var{n}@} and put the result at
! 4760: @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
! 4761: @var{retval}@}. @var{r2p} needs space for @var{n} limbs, but the return value
! 4762: indicates how many are produced.
1.1 maekawa 4763:
1.1.1.4 ! ohara 4764: The most significant limb of @{@var{sp}, @var{n}@} must be non-zero. The
! 4765: areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
! 4766: be completely separate. The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
! 4767: @var{n}@} must be either identical or completely separate.
1.1 maekawa 4768:
1.1.1.4 ! ohara 4769: If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
! 4770: case the return value is zero or non-zero according to whether the remainder
! 4771: would have been zero or non-zero.
1.1 maekawa 4772:
1.1.1.4 ! ohara 4773: A return value of zero indicates a perfect square. See also
! 4774: @code{mpz_perfect_square_p}.
1.1 maekawa 4775: @end deftypefun
4776:
1.1.1.4 ! ohara 4777: @deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
! 4778: Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
! 4779: base @var{base}, and return the number of characters produced. There may be
! 4780: leading zeros in the string. The string is not in ASCII; to convert it to
! 4781: printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
! 4782: the base and range. @var{base} can vary from 2 to 256.
1.1 maekawa 4783:
1.1.1.4 ! ohara 4784: The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
! 4785: non-zero. The input @{@var{s1p}, @var{s1n}@} is clobbered, except when
! 4786: @var{base} is a power of 2, in which case it's unchanged.
1.1 maekawa 4787:
4788: The area at @var{str} has to have space for the largest possible number
1.1.1.4 ! ohara 4789: represented by a @var{s1n} long limb array, plus one extra character.
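
Turning the raw digit values into printable text is a separate step. A minimal sketch, assuming bases up to 36 and upper-case letters; the helper name @code{digits_to_ascii} is hypothetical and not part of GMP:

```c
#include <stddef.h>

/* Hypothetical helper: convert raw digit values (0 .. base-1) in place
   to ASCII '0'-'9' and 'A'-'Z'.  Returns 0 on success, or -1 if the base
   is outside 2..36 or a digit is out of range for the base. */
int digits_to_ascii(unsigned char *str, size_t len, int base)
{
    static const char table[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    if (base < 2 || base > 36)
        return -1;
    for (size_t i = 0; i < len; i++) {
        if (str[i] >= base)
            return -1;
        str[i] = (unsigned char)table[str[i]];
    }
    return 0;
}
```

The result is not NUL-terminated by this helper; a caller wanting a C string would add the terminator itself. Bases above 36 have no conventional printable form, so the sketch rejects them.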
1.1 maekawa 4790: @end deftypefun
4791:
1.1.1.4 ! ohara 4792: @deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base})
! 4793: Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at
! 4794: @var{rp}.
! 4795:
! 4796: @math{@var{str}[0]} is the most significant byte and
! 4797: @math{@var{str}[@var{strsize}-1]} is the least significant. Each byte should
! 4798: be a value in the range 0 to @math{@var{base}-1}, not an ASCII character.
! 4799: @var{base} can vary from 2 to 256.
! 4800:
! 4801: The return value is the number of limbs written to @var{rp}. If the most
! 4802: significant input byte is non-zero then the high limb at @var{rp} will be
! 4803: non-zero, and only that exact number of limbs will be required there.
1.1 maekawa 4804:
1.1.1.4 ! ohara 4805: If the most significant input byte is zero then there may be high zero limbs
! 4806: written to @var{rp} and included in the return value.
! 4807:
! 4808: @var{strsize} must be at least 1, and no overlap is permitted between
! 4809: @{@var{str},@var{strsize}@} and the result at @var{rp}.
1.1 maekawa 4810: @end deftypefun
4811:
1.1.1.2 maekawa 4812: @deftypefun {unsigned long int} mpn_scan0 (const mp_limb_t *@var{s1p}, unsigned long int @var{bit})
1.1 maekawa 4813: Scan @var{s1p} from bit position @var{bit} for the next clear bit.
4814:
4815: It is required that there be a clear bit within the area at @var{s1p} at or
4816: beyond bit position @var{bit}, so that the function has something to return.
4817: @end deftypefun
4818:
1.1.1.2 maekawa 4819: @deftypefun {unsigned long int} mpn_scan1 (const mp_limb_t *@var{s1p}, unsigned long int @var{bit})
1.1 maekawa 4820: Scan @var{s1p} from bit position @var{bit} for the next set bit.
4821:
4822: It is required that there be a set bit within the area at @var{s1p} at or
4823: beyond bit position @var{bit}, so that the function has something to return.
4824: @end deftypefun
4825:
1.1.1.4 ! ohara 4826: @deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
! 4827: @deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
! 4828: Generate a random number of length @var{r1n} and store it at @var{r1p}. The
! 4829: most significant limb is always non-zero. @code{mpn_random} generates
1.1.1.2 maekawa 4830: uniformly distributed limb data; @code{mpn_random2} generates long strings of
4831: zeros and ones in the binary representation.
1.1 maekawa 4832:
1.1.1.2 maekawa 4833: @code{mpn_random2} is intended for testing the correctness of the @code{mpn}
4834: routines.
1.1 maekawa 4835: @end deftypefun
4836:
1.1.1.4 ! ohara 4837: @deftypefun {unsigned long int} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
! 4838: Count the number of set bits in @{@var{s1p}, @var{n}@}.
1.1 maekawa 4839: @end deftypefun
4840:
1.1.1.4 ! ohara 4841: @deftypefun {unsigned long int} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
! 4842: Compute the hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p},
! 4843: @var{n}@}.
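
The Hamming distance is the number of bit positions in which the two operands differ, that is, the population count of their exclusive-or. A sketch with 32-bit limbs and hypothetical names @code{ref_popcount} and @code{ref_hamdist} (not GMP code):

```c
#include <stdint.h>
#include <stddef.h>

/* Count the set bits in one limb by clearing the lowest set bit
   until nothing is left. */
static unsigned limb_popcount(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Hypothetical model: number of set bits in {s1p, n}. */
unsigned long ref_popcount(const uint32_t *s1p, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += limb_popcount(s1p[i]);
    return total;
}

/* Hypothetical model: number of differing bits between two n-limb operands. */
unsigned long ref_hamdist(const uint32_t *s1p, const uint32_t *s2p, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += limb_popcount(s1p[i] ^ s2p[i]);
    return total;
}
```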
1.1 maekawa 4844: @end deftypefun
4845:
1.1.1.4 ! ohara 4846: @deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
! 4847: Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
1.1 maekawa 4848: @end deftypefun
4849:
4850:
1.1.1.4 ! ohara 4851: @sp 1
! 4852: @section Nails
! 4853: @cindex Nails
! 4854:
! 4855: @strong{Everything in this section is highly experimental and may disappear or
! 4856: be subject to incompatible changes in a future version of GMP.}
! 4857:
! 4858: Nails are an experimental feature whereby a few bits are left unused at the
! 4859: top of each @code{mp_limb_t}. This can significantly improve carry handling
! 4860: on some processors.
! 4861:
! 4862: All the @code{mpn} functions accepting limb data will expect the nail bits to
! 4863: be zero on entry, and will return data with the nails similarly all zero.
! 4864: This applies both to limb vectors and to single limb arguments.
! 4865:
! 4866: Nails can be enabled by configuring with @samp{--enable-nails}. By default
! 4867: the number of bits will be chosen according to what suits the host processor,
! 4868: but a particular number can be selected with @samp{--enable-nails=N}.
! 4869:
! 4870: At the mpn level, a nail build is neither source nor binary compatible with a
! 4871: non-nail build, strictly speaking. But programs acting on limbs only through
! 4872: the mpn functions are likely to work equally well with either build, and
! 4873: judicious use of the definitions below should make any program compatible with
! 4874: either build, at the source level.
! 4875:
! 4876: For the higher level routines, meaning @code{mpz} etc, a nail build should be
! 4877: fully source and binary compatible with a non-nail build.
! 4878:
! 4879: @defmac GMP_NAIL_BITS
! 4880: @defmacx GMP_NUMB_BITS
! 4881: @defmacx GMP_LIMB_BITS
! 4882: @code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in
! 4883: use. @code{GMP_NUMB_BITS} is the number of data bits in a limb.
! 4884: @code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}. In
! 4885: all cases
! 4886:
! 4887: @example
! 4888: GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS
! 4889: @end example
! 4890: @end defmac
! 4891:
! 4892: @defmac GMP_NAIL_MASK
! 4893: @defmacx GMP_NUMB_MASK
! 4894: Bit masks for the nail and number parts of a limb. @code{GMP_NAIL_MASK} is 0
! 4895: when nails are not in use.
! 4896:
! 4897: @code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained
! 4898: with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which
! 4899: can help various RISC chips.
! 4900: @end defmac
! 4901:
! 4902: @defmac GMP_NUMB_MAX
! 4903: The maximum value that can be stored in the number part of a limb. This is
! 4904: the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing
! 4905: comparisons rather than bit-wise operations.
! 4906: @end defmac
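
How the nail and number parts relate can be sketched with stand-in constants. Everything named below is hypothetical: the values assume a 64-bit limb with 2 nail bits, and a real program would use @code{GMP_NUMB_BITS} and @code{GMP_NUMB_MASK} from the installed header rather than defining its own.

```c
#include <stdint.h>

/* Stand-ins for GMP_LIMB_BITS, GMP_NAIL_BITS and GMP_NUMB_BITS in an
   assumed build with 64-bit limbs and 2 nail bits. */
enum { LIMB_BITS = 64, NAIL_BITS = 2, NUMB_BITS = LIMB_BITS - NAIL_BITS };

/* Stand-in for GMP_NUMB_MASK: the low NUMB_BITS bits set. */
#define NUMB_MASK (UINT64_MAX >> NAIL_BITS)

/* Add two limbs whose nail bits are zero.  The sum cannot wrap the
   64-bit word, so the carry simply appears in the nail part and is
   extracted with a shift, as suggested above for the nail part. */
uint64_t add_numb(uint64_t a, uint64_t b, uint64_t *carry_out)
{
    uint64_t sum = a + b;
    *carry_out = sum >> NUMB_BITS;   /* 0 or 1 for nail-free inputs */
    return sum & NUMB_MASK;          /* result with the nail cleared again */
}
```

The point of the sketch is that no overflow test is needed to obtain the carry, which is the kind of saving in carry handling that nails are meant to give.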
! 4907:
! 4908: The term ``nails'' comes from finger or toe nails, which are at the ends of a
! 4909: limb (arm or leg). ``numb'' is short for number, but is also how the
! 4910: developers felt after trying for a long time to come up with sensible names
! 4911: for these things.
! 4912:
! 4913: In the future (the distant future most likely) a non-zero nail might be
! 4914: permitted, giving non-unique representations for numbers in a limb vector.
! 4915: This would help vector processors since carries would only ever need to
! 4916: propagate one or two limbs.
! 4917:
! 4918:
! 4919: @node Random Number Functions, Formatted Output, Low-level Functions, Top
1.1.1.2 maekawa 4920: @chapter Random Number Functions
4921: @cindex Random number functions
4922:
1.1.1.4 ! ohara 4923: Sequences of pseudo-random numbers in GMP are generated using a variable of
! 4924: type @code{gmp_randstate_t}, which holds an algorithm selection and a current
! 4925: state. Such a variable must be initialized by a call to one of the
! 4926: @code{gmp_randinit} functions, and can be seeded with one of the
! 4927: @code{gmp_randseed} functions.
! 4928:
! 4929: The functions actually generating random numbers are described in @ref{Integer
! 4930: Random Numbers}, and @ref{Miscellaneous Float Functions}.
! 4931:
! 4932: The older style random number functions don't accept a @code{gmp_randstate_t}
! 4933: parameter but instead share a global variable of that type. They use a
! 4934: default algorithm and are currently not seeded (though perhaps that will
! 4935: change in the future). The new functions accepting a @code{gmp_randstate_t}
! 4936: are recommended for applications that care about randomness.
1.1.1.2 maekawa 4937:
4938: @menu
1.1.1.4 ! ohara 4939: * Random State Initialization::
! 4940: * Random State Seeding::
1.1.1.2 maekawa 4941: @end menu
4942:
1.1.1.4 ! ohara 4943: @node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
1.1.1.2 maekawa 4944: @section Random State Initialization
4945: @cindex Random number state
4946:
1.1.1.4 ! ohara 4947: @deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
! 4948: Initialize @var{state} with a default algorithm. This will be a compromise
! 4949: between speed and randomness, and is recommended for applications with no
! 4950: special requirements.
! 4951: @end deftypefun
1.1.1.2 maekawa 4952:
1.1.1.4 ! ohara 4953: @deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, mpz_t @var{a}, @w{unsigned long @var{c}}, @w{unsigned long @var{m2exp}})
! 4954: Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
! 4955: @var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.
1.1.1.2 maekawa 4956:
1.1.1.4 ! ohara 4957: The low bits of @math{X} in this algorithm are not very random. The least
! 4958: significant bit will have a period no more than 2, and the second bit no more
! 4959: than 4, etc. For this reason only the high half of each @math{X} is actually
! 4960: used.
1.1.1.2 maekawa 4961:
1.1.1.4 ! ohara 4962: When a random number of more than @math{@var{m2exp}/2} bits is to be
! 4963: generated, multiple iterations of the recurrence are used and the results
! 4964: concatenated.
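
The short period of the low bits is easy to observe on a small generator. The sketch below is not GMP's generator: it uses a 32-bit state with illustrative multiplier and increment constants, and the names @code{lcg_next}, @code{lcg_output} and @code{low_bits_period} are hypothetical.

```c
#include <stdint.h>

/* One step of X = (a*X + c) mod 2^32.  The constants are illustrative
   only and are not the ones used by GMP. */
uint32_t lcg_next(uint32_t x)
{
    return 1664525u * x + 1013904223u;
}

/* Use only the high half of the state as output, as described above. */
uint16_t lcg_output(uint32_t x)
{
    return (uint16_t)(x >> 16);
}

/* Count the steps until the low k bits of the state first return to
   their starting value.  Intended for small k (1 <= k <= 31). */
unsigned low_bits_period(uint32_t x0, unsigned k)
{
    uint32_t mask = (1u << k) - 1u;
    uint32_t x = x0;
    unsigned steps = 0;
    do {
        x = lcg_next(x);
        steps++;
    } while ((x & mask) != (x0 & mask));
    return steps;
}
```

The low @math{k} bits of the state depend only on the low @math{k} bits of the previous state, so they can never have a period longer than @math{2^k}, whatever the size of the full state.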
1.1.1.2 maekawa 4965: @end deftypefun
4966:
1.1.1.4 ! ohara 4967: @deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, unsigned long @var{size})
! 4968: Initialize @var{state} for a linear congruential algorithm as per
! 4969: @code{gmp_randinit_lc_2exp}. @var{a}, @var{c} and @var{m2exp} are selected
! 4970: from a table, chosen so that @var{size} bits (or more) of each @math{X} will
! 4971: be used, i.e. @math{@var{m2exp}/2 @ge{} @var{size}}.
1.1.1.2 maekawa 4972:
1.1.1.4 ! ohara 4973: If successful the return value is non-zero. If @var{size} is bigger than the
! 4974: table data provides then the return value is zero. The maximum @var{size}
! 4975: currently supported is 128.
! 4976: @end deftypefun
! 4977:
! 4978: @deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, ...)
! 4979: @strong{This function is obsolete.}
1.1.1.2 maekawa 4980:
1.1.1.4 ! ohara 4981: Initialize @var{state} with an algorithm selected by @var{alg}. The only
! 4982: choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}.
! 4983: A third parameter of type @code{unsigned long} is required, this is the
! 4984: @var{size} for that function. @code{GMP_RAND_ALG_DEFAULT} or 0 are the same
! 4985: as @code{GMP_RAND_ALG_LC}.
1.1.1.2 maekawa 4986:
1.1.1.4 ! ohara 4987: @code{gmp_randinit} sets bits in @code{gmp_errno} to indicate an error.
! 4988: @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is unsupported, or
! 4989: @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter is too big.
! 4990: @end deftypefun
1.1.1.2 maekawa 4991:
1.1.1.4 ! ohara 4992: @c Not yet in the library.
! 4993: @ignore
! 4994: @deftypefun void gmp_randinit_lc (gmp_randstate_t @var{state}, mpz_t @var{a}, unsigned long int @var{c}, mpz_t @var{m})
! 4995: Initialize @var{state} for a linear congruential scheme @m{X = (@var{a}X +
! 4996: @var{c}) @bmod @var{m}, X = (@var{a}*X + @var{c}) mod @var{m}}.
1.1.1.2 maekawa 4997: @end deftypefun
4998: @end ignore
4999:
1.1.1.4 ! ohara 5000: @deftypefun void gmp_randclear (gmp_randstate_t @var{state})
! 5001: Free all memory occupied by @var{state}.
! 5002: @end deftypefun
1.1.1.2 maekawa 5003:
5004:
1.1.1.4 ! ohara 5005: @node Random State Seeding, , Random State Initialization, Random Number Functions
! 5006: @section Random State Seeding
! 5007: @cindex Random number seeding
1.1.1.2 maekawa 5008:
1.1.1.4 ! ohara 5009: @deftypefun void gmp_randseed (gmp_randstate_t @var{state}, mpz_t @var{seed})
! 5010: @deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
! 5011: Set an initial seed value into @var{state}.
1.1.1.2 maekawa 5012:
1.1.1.4 ! ohara 5013: The size of a seed determines how many different sequences of random numbers
! 5014: it's possible to generate. The ``quality'' of the seed is the randomness
! 5015: of a given seed compared to the previous seed used, and this affects the
! 5016: randomness of separate number sequences. The method for choosing a seed is
! 5017: critical if the generated numbers are to be used for important applications,
! 5018: such as generating cryptographic keys.
! 5019:
! 5020: Traditionally the system time has been used to seed, but care needs to be
! 5021: taken with this. If an application seeds often and the resolution of the
! 5022: system clock is low, then the same sequence of numbers might be repeated.
! 5023: Also, the system time is quite easy to guess, so if unpredictability is
! 5024: required then it should definitely not be the only source for the seed value.
! 5025: On some systems there's a special device @file{/dev/random} which provides
! 5026: random data better suited for use as a seed.
1.1.1.2 maekawa 5027: @end deftypefun
5028:
5029:
1.1.1.4 ! ohara 5030: @node Formatted Output, Formatted Input, Random Number Functions, Top
! 5031: @chapter Formatted Output
! 5032: @cindex Formatted output
! 5033: @cindex @code{printf} formatted output
1.1.1.2 maekawa 5034:
1.1.1.4 ! ohara 5035: @menu
! 5036: * Formatted Output Strings::
! 5037: * Formatted Output Functions::
! 5038: * C++ Formatted Output::
! 5039: @end menu
1.1.1.2 maekawa 5040:
1.1.1.4 ! ohara 5041: @node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
! 5042: @section Format Strings
1.1.1.2 maekawa 5043:
1.1.1.4 ! ohara 5044: @code{gmp_printf} and friends accept format strings similar to the standard C
! 5045: @code{printf} (@pxref{Formatted Output,,,libc,The GNU C Library Reference
! 5046: Manual}). A format specification is of the form
1.1 maekawa 5047:
1.1.1.4 ! ohara 5048: @example
! 5049: % [flags] [width] [.[precision]] [type] conv
! 5050: @end example
1.1 maekawa 5051:
1.1.1.4 ! ohara 5052: GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
! 5053: and @code{mpf_t} respectively, and @samp{N} for an @code{mp_limb_t} array.
! 5054: @samp{Z}, @samp{Q} and @samp{N} behave like integers. @samp{Q} will print a
! 5055: @samp{/} and a denominator, if needed. @samp{F} behaves like a float. For
! 5056: example,
1.1 maekawa 5057:
1.1.1.4 ! ohara 5058: @example
! 5059: mpz_t z;
! 5060: gmp_printf ("%s is an mpz %Zd\n", "here", z);
1.1 maekawa 5061:
1.1.1.4 ! ohara 5062: mpq_t q;
! 5063: gmp_printf ("a hex rational: %#40Qx\n", q);
1.1 maekawa 5064:
1.1.1.4 ! ohara 5065: mpf_t f;
! 5066: int n;
! 5067: gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);
! 5068:
! 5069: const mp_limb_t *ptr;
! 5070: mp_size_t size;
! 5071: gmp_printf ("limb array %Nx\n", ptr, size);
! 5072: @end example
1.1 maekawa 5073:
1.1.1.4 ! ohara 5074: For @samp{N} the limbs are expected least significant first, as per the
! 5075: @code{mpn} functions (@pxref{Low-level Functions}). A negative size can be
! 5076: given to print the value as a negative.
! 5077:
! 5078: All the standard C @code{printf} types behave the same as the C library
! 5079: @code{printf}, and can be freely intermixed with the GMP extensions. In the
! 5080: current implementation the standard parts of the format string are simply
! 5081: handed to @code{printf} and only the GMP extensions handled directly.
! 5082:
! 5083: The flags accepted are as follows. GLIBC style @nisamp{'} is only for the
! 5084: standard C types (not the GMP types), and only if the C library supports it.
! 5085:
! 5086: @quotation
! 5087: @multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 5088: @item @nicode{0} @tab pad with zeros (rather than spaces)
! 5089: @item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
! 5090: @item @nicode{+} @tab always show a sign
! 5091: @item (space) @tab show a space or a @samp{-} sign
! 5092: @item @nicode{'} @tab group digits, GLIBC style (not GMP types)
! 5093: @end multitable
! 5094: @end quotation
! 5095:
! 5096: The optional width and precision can be given as a number within the format
! 5097: string, or as a @samp{*} to take an extra parameter of type @code{int}, the
! 5098: same as the standard @code{printf}.
! 5099:
! 5100: The standard types accepted are as follows. @samp{h} and @samp{l} are
! 5101: portable, the rest will depend on the compiler (or include files) for the type
! 5102: and the C library for the output.
! 5103:
! 5104: @quotation
! 5105: @multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 5106: @item @nicode{h} @tab @nicode{short}
! 5107: @item @nicode{hh} @tab @nicode{char}
! 5108: @item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
! 5109: @item @nicode{l} @tab @nicode{long} or @nicode{wchar_t}
! 5110: @item @nicode{ll} @tab @nicode{long long}
! 5111: @item @nicode{L} @tab @nicode{long double}
! 5112: @item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
! 5113: @item @nicode{t} @tab @nicode{ptrdiff_t}
! 5114: @item @nicode{z} @tab @nicode{size_t}
! 5115: @end multitable
! 5116: @end quotation
1.1 maekawa 5117:
1.1.1.4 ! ohara 5118: @noindent
! 5119: The GMP types are
1.1 maekawa 5120:
1.1.1.4 ! ohara 5121: @quotation
! 5122: @multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 5123: @item @nicode{F} @tab @nicode{mpf_t}, float conversions
! 5124: @item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
! 5125: @item @nicode{N} @tab @nicode{mp_limb_t} array, integer conversions
! 5126: @item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
! 5127: @end multitable
! 5128: @end quotation
! 5129:
! 5130: The conversions accepted are as follows. @samp{a} and @samp{A} are always
! 5131: supported for @code{mpf_t} but depend on the C library for standard C float
! 5132: types. @samp{m} and @samp{p} depend on the C library.
! 5133:
! 5134: @quotation
! 5135: @multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 5136: @item @nicode{a} @nicode{A} @tab hex floats, C99 style
! 5137: @item @nicode{c} @tab character
! 5138: @item @nicode{d} @tab decimal integer
! 5139: @item @nicode{e} @nicode{E} @tab scientific format float
! 5140: @item @nicode{f} @tab fixed point float
! 5141: @item @nicode{i} @tab same as @nicode{d}
! 5142: @item @nicode{g} @nicode{G} @tab fixed or scientific float
! 5143: @item @nicode{m} @tab @code{strerror} string, GLIBC style
! 5144: @item @nicode{n} @tab store characters written so far
! 5145: @item @nicode{o} @tab octal integer
! 5146: @item @nicode{p} @tab pointer
! 5147: @item @nicode{s} @tab string
! 5148: @item @nicode{u} @tab unsigned integer
! 5149: @item @nicode{x} @nicode{X} @tab hex integer
! 5150: @end multitable
! 5151: @end quotation
! 5152:
! 5153: @samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
! 5154: types @samp{Z}, @samp{Q} and @samp{N} they are signed. @samp{u} is not
! 5155: meaningful for @samp{Z}, @samp{Q} and @samp{N}.
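
To make the signed versus unsigned distinction concrete, the following sketch
contrasts the standard C @samp{%lx}, which prints the unsigned reinterpretation
of a negative value, with a sign-and-magnitude rendering of the kind described
above for @samp{Z}, @samp{Q} and @samp{N}. Both @code{unsigned_hex} and
@code{signed_hex} are hypothetical helpers written with the standard library
only; they are not GMP functions.

```cpp
#include <cstdio>
#include <string>

// Standard C behaviour: %lx prints the value converted to unsigned long,
// so a negative input appears as a large two's complement bit pattern.
std::string unsigned_hex(long v) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%lx", static_cast<unsigned long>(v));
    return buf;
}

// Hypothetical stand-in for the signed style used with the GMP types:
// an optional minus sign followed by the magnitude in hex.
std::string signed_hex(long v) {
    unsigned long mag = v < 0 ? 0UL - static_cast<unsigned long>(v)
                              : static_cast<unsigned long>(v);
    char buf[32];
    std::snprintf(buf, sizeof buf, "%s%lx", v < 0 ? "-" : "", mag);
    return buf;
}
```

With these helpers @code{signed_hex(-255)} gives @samp{-ff}, whereas
@code{unsigned_hex(-255)} gives a long run of @samp{f} digits ending in
@samp{01}.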
! 5156:
! 5157: @samp{n} can be used with any type, even the GMP types.
! 5158:
! 5159: Other types or conversions that might be accepted by the C library
! 5160: @code{printf} cannot be used through @code{gmp_printf}. This includes, for
! 5161: instance, extensions registered with GLIBC @code{register_printf_function}.
! 5162: Also currently there's no support for POSIX @samp{$} style numbered arguments
! 5163: (perhaps this will be added in the future).
! 5164:
! 5165: The precision field has its usual meaning for integer @samp{Z} and float
! 5166: @samp{F} types, but is currently undefined for @samp{Q} and should not be used
! 5167: with that.
! 5168:
! 5169: @code{mpf_t} conversions only ever generate as many digits as can be
! 5170: accurately represented by the operand, the same as @code{mpf_get_str} does.
! 5171: Zeros will be used if necessary to pad to the requested precision. This
! 5172: happens even for an @samp{f} conversion of an @code{mpf_t} which is an
! 5173: integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits
! 5174: precision will only produce about 40 digits, then pad with zeros to the
! 5175: decimal point. An empty precision field like @samp{%.Fe} or @samp{%.Ff} can
! 5176: be used to specifically request just the significant digits.
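
The figure of about 40 digits can be checked with simple arithmetic. The
sketch below assumes that each bit of precision carries roughly 0.301 decimal
digits (the base-10 logarithm of 2); this is an estimate for illustration and
not the formula GMP uses internally, so the exact digit count GMP produces may
differ slightly. Both function names are hypothetical.

```cpp
#include <cmath>

// Approximate number of decimal digits carried by a mantissa of `bits` bits.
// An estimate only, not GMP's internal calculation.
int approx_significant_digits(int bits) {
    return static_cast<int>(std::floor(bits * std::log10(2.0)));
}

// Number of decimal digits in the integer 2^exponent.
int decimal_digits_of_power_of_two(int exponent) {
    return static_cast<int>(std::floor(exponent * std::log10(2.0))) + 1;
}
```

For 128 bits the estimate is 38 digits, while @math{2^@W{1024}} has 309
decimal digits, so on this estimate roughly 270 of the positions before the
decimal point would be zero padding.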
! 5177:
! 5178: The decimal point character (or string) is taken from the current locale
! 5179: settings on systems which provide @code{localeconv} (@pxref{Locales,,Locales
! 5180: and Internationalization,libc,The GNU C Library Reference Manual}). The C
! 5181: library will normally do the same for standard float output.
! 5182:
! 5183: The format string is only interpreted as plain @code{char}s; multibyte
! 5184: characters are not recognised. Perhaps this will change in the future.
! 5185:
! 5186:
! 5187: @node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
! 5188: @section Functions
! 5189:
! 5190: Each of the following functions is similar to the corresponding C library
! 5191: function. The basic @code{printf} forms take a variable argument list. The
! 5192: @code{vprintf} forms take an argument pointer, see @ref{Variadic
! 5193: Functions,,,libc,The GNU C Library Reference Manual}, or @samp{man 3
! 5194: va_start}.
! 5195:
! 5196: It should be emphasised that if a format string is invalid, or the arguments
! 5197: don't match what the format specifies, then the behaviour of any of these
! 5198: functions will be unpredictable. GCC format string checking is not available,
! 5199: since it doesn't recognise the GMP extensions.
! 5200:
! 5201: The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
! 5202: @math{-1} to indicate a write error. All the functions can return @math{-1}
! 5203: if the C library @code{printf} variant in use returns @math{-1}, but this
! 5204: shouldn't normally occur.
! 5205:
! 5206: @deftypefun int gmp_printf (const char *@var{fmt}, ...)
! 5207: @deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
! 5208: Print to the standard output @code{stdout}. Return the number of characters
! 5209: written, or @math{-1} if an error occurred.
! 5210: @end deftypefun
! 5211:
! 5212: @deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, ...)
! 5213: @deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
! 5214: Print to the stream @var{fp}. Return the number of characters written, or
! 5215: @math{-1} if an error occurred.
! 5216: @end deftypefun
! 5217:
! 5218: @deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, ...)
! 5219: @deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
! 5220: Form a null-terminated string in @var{buf}. Return the number of characters
! 5221: written, excluding the terminating null.
! 5222:
! 5223: No overlap is permitted between the space at @var{buf} and the string
! 5224: @var{fmt}.
! 5225:
! 5226: These functions are not recommended, since there's no protection against
! 5227: exceeding the space available at @var{buf}.
! 5228: @end deftypefun
! 5229:
! 5230: @deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, ...)
! 5231: @deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
! 5232: Form a null-terminated string in @var{buf}. No more than @var{size} bytes
! 5233: will be written. To get the full output, @var{size} must be enough for the
! 5234: string and null-terminator.
! 5235:
! 5236: The return value is the total number of characters which ought to have been
! 5237: produced, excluding the terminating null. If @math{@var{retval} @ge{}
! 5238: @var{size}} then the actual output has been truncated to the first
! 5239: @math{@var{size}-1} characters, and a null appended.
! 5240:
! 5241: No overlap is permitted between the region @{@var{buf},@var{size}@} and the
! 5242: @var{fmt} string.
! 5243:
! 5244: Notice the return value is in ISO C99 @code{snprintf} style. This is so even
! 5245: if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
! 5246: @end deftypefun
! 5247:
! 5248: @deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, ...)
! 5249: @deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
! 5250: Form a null-terminated string in a block of memory obtained from the current
! 5251: memory allocation function (@pxref{Custom Allocation}). The block will be the
! 5252: size of the string and null-terminator. Put the address of the block in
! 5253: *@var{pp}. Return the number of characters produced, excluding the
! 5254: null-terminator.
! 5255:
! 5256: Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
! 5257: @math{-1} if there's no more memory available; it lets the current allocation
! 5258: function handle that.
! 5259: @end deftypefun
! 5260:
! 5261: @deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, ...)
! 5262: @deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
! 5263: Append to the current obstack object, in the same style as
! 5264: @code{obstack_printf}. Return the number of characters written. A
! 5265: null-terminator is not written.
! 5266:
! 5267: @var{fmt} cannot be within the current obstack object, since the object might
! 5268: move as it grows.
! 5269:
! 5270: These functions are available only when the C library provides the obstack
! 5271: feature, which probably means only on GNU systems, see
! 5272: @ref{Obstacks,,,libc,The GNU C Library Reference Manual}.
! 5273: @end deftypefun
! 5274:
! 5275:
! 5276: @node C++ Formatted Output, , Formatted Output Functions, Formatted Output
! 5277: @section C++ Formatted Output
! 5278: @cindex C++ @code{ostream} output
! 5279: @cindex @code{ostream} output
! 5280:
! 5281: The following functions are provided in @file{libgmpxx}, which is built if C++
! 5282: support is enabled (@pxref{Build Options}). Prototypes are available from
! 5283: @code{<gmp.h>}.
! 5284:
! 5285: @deftypefun ostream& operator<< (ostream& @var{stream}, mpz_t @var{op})
! 5286: Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
! 5287: @code{ios::width} is reset to 0 after output, the same as the standard
! 5288: @code{ostream operator<<} routines do.
! 5289:
! 5290: In hex or octal, @var{op} is printed as a signed number, the same as for
! 5291: decimal. This is unlike the standard @code{operator<<} routines on @code{int}
! 5292: etc, which instead give two's complement.
! 5293: @end deftypefun
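
The two standard stream behaviours referred to above can be seen with plain
@code{int} values. This sketch uses only the standard library and does not
involve GMP; the function names are hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// setw applies to the next insertion only: the stream resets its width
// to 0 after each formatted output.
std::string width_applies_once() {
    std::ostringstream out;
    out << std::setw(6) << 42 << 7;
    return out.str();   // 42 is padded to six columns, 7 is not padded
}

// A negative int in hex is shown as its unsigned bit pattern, with no sign.
std::string negative_int_in_hex() {
    std::ostringstream out;
    out << std::hex << -1;
    return out.str();
}
```

An @code{mpz_t} is described as sharing the first behaviour (the width reset)
but differing in the second, since a negative @code{mpz_t} in hex is printed
with a minus sign.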
! 5294:
! 5295: @deftypefun ostream& operator<< (ostream& @var{stream}, mpq_t @var{op})
! 5296: Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
! 5297: @code{ios::width} is reset to 0 after output, the same as the standard
! 5298: @code{ostream operator<<} routines do.
! 5299:
! 5300: Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
! 5301: just a plain integer like @samp{123}.
! 5302:
! 5303: In hex or octal, @var{op} is printed as a signed value, the same as for
! 5304: decimal. If @code{ios::showbase} is set then a base indicator is shown on
! 5305: both the numerator and denominator (if the denominator is required).
! 5306: @end deftypefun
! 5307:
! 5308: @deftypefun ostream& operator<< (ostream& @var{stream}, mpf_t @var{op})
! 5309: Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
! 5310: @code{ios::width} is reset to 0 after output, the same as the standard
! 5311: @code{ostream operator<<} routines do. The decimal point follows the current
! 5312: locale, on systems providing @code{localeconv}.
! 5313:
! 5314: Hex and octal are supported, unlike the standard @code{operator<<} on
! 5315: @code{double}. The mantissa will be in hex or octal, the exponent will be in
! 5316: decimal. For hex the exponent delimiter is an @samp{@@}. This is as per
! 5317: @code{mpf_out_str}.
! 5318:
! 5319: @code{ios::showbase} is supported, and will put a base on the mantissa, for
! 5320: example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
! 5321: This last form is slightly strange, but at least differentiates itself from
! 5322: decimal.
1.1 maekawa 5323: @end deftypefun
5324:
1.1.1.4 ! ohara 5325: These operators mean that GMP types can be printed in the usual C++ way, for
! 5326: example,
1.1 maekawa 5327:
1.1.1.4 ! ohara 5328: @example
! 5329: mpz_t z;
! 5330: int n;
! 5331: ...
! 5332: cout << "iteration " << n << " value " << z << "\n";
! 5333: @end example
1.1 maekawa 5334:
1.1.1.4 ! ohara 5335: But note that @code{ostream} output (and @code{istream} input, @pxref{C++
! 5336: Formatted Input}) is the only overloading available, and using for instance
! 5337: @code{+} with an @code{mpz_t} will have unpredictable results.
1.1 maekawa 5338:
5339:
1.1.1.4 ! ohara 5340: @node Formatted Input, C++ Class Interface, Formatted Output, Top
! 5341: @chapter Formatted Input
! 5342: @cindex Formatted input
! 5343: @cindex @code{scanf} formatted input
1.1 maekawa 5344:
1.1.1.4 ! ohara 5345: @menu
! 5346: * Formatted Input Strings::
! 5347: * Formatted Input Functions::
! 5348: * C++ Formatted Input::
! 5349: @end menu
1.1 maekawa 5350:
5351:
1.1.1.4 ! ohara 5352: @node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
! 5353: @section Formatted Input Strings
1.1 maekawa 5354:
1.1.1.4 ! ohara 5355: @code{gmp_scanf} and friends accept format strings similar to the standard C
! 5356: @code{scanf} (@pxref{Formatted Input,,,libc,The GNU C Library Reference
! 5357: Manual}). A format specification is of the form
1.1 maekawa 5358:
1.1.1.4 ! ohara 5359: @example
! 5360: % [flags] [width] [type] conv
! 5361: @end example
! 5362:
! 5363: GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
! 5364: and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers.
! 5365: @samp{Q} will read a @samp{/} and a denominator, if present. @samp{F} behaves
! 5366: like a float.
! 5367:
! 5368: GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
! 5369: they're already ``call-by-reference''. For example,
! 5370:
! 5371: @example
! 5372: /* to read say "a(5) = 1234" */
! 5373: int n;
! 5374: mpz_t z;
! 5375: gmp_scanf ("a(%d) = %Zd\n", &n, z);
! 5376:
! 5377: mpq_t q1, q2;
! 5378: gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);
! 5379:
! 5380: /* to read say "topleft (1.55,-2.66)" */
! 5381: mpf_t x, y;
! 5382: char buf[32];
! 5383: gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
! 5384: @end example
! 5385:
! 5386: All the standard C @code{scanf} types behave the same as in the C library
! 5387: @code{scanf}, and can be freely intermixed with the GMP extensions. In the
! 5388: current implementation the standard parts of the format string are simply
! 5389: handed to @code{scanf} and only the GMP extensions handled directly.
! 5390:
! 5391: The flags accepted are as follows. @samp{a} and @samp{'} will depend on
! 5392: support from the C library, and @samp{'} cannot be used with GMP types.
! 5393:
! 5394: @quotation
! 5395: @multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 5396: @item @nicode{*} @tab read but don't store
! 5397: @item @nicode{a} @tab allocate a buffer (string conversions)
! 5398: @item @nicode{'} @tab group digits, GLIBC style (not GMP types)
! 5399: @end multitable
! 5400: @end quotation
! 5401:
! 5402: The standard types accepted are as follows. @samp{h} and @samp{l} are
! 5403: portable, the rest will depend on the compiler (or include files) for the type
! 5404: and the C library for the input.
! 5405:
! 5406: @quotation
! 5407: @multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 5408: @item @nicode{h} @tab @nicode{short}
! 5409: @item @nicode{hh} @tab @nicode{char}
! 5410: @item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
! 5411: @item @nicode{l} @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
! 5412: @item @nicode{ll} @tab @nicode{long long}
! 5413: @item @nicode{L} @tab @nicode{long double}
! 5414: @item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
! 5415: @item @nicode{t} @tab @nicode{ptrdiff_t}
! 5416: @item @nicode{z} @tab @nicode{size_t}
! 5417: @end multitable
! 5418: @end quotation
! 5419:
! 5420: @noindent
! 5421: The GMP types are
! 5422:
! 5423: @quotation
! 5424: @multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 5425: @item @nicode{F} @tab @nicode{mpf_t}, float conversions
! 5426: @item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
! 5427: @item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
! 5428: @end multitable
! 5429: @end quotation
! 5430:
! 5431: The conversions accepted are as follows. @samp{p} and @samp{[} will depend on
! 5432: support from the C library, the rest are standard.
! 5433:
! 5434: @quotation
! 5435: @multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 5436: @item @nicode{c} @tab character or characters
! 5437: @item @nicode{d} @tab decimal integer
! 5438: @item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G}
! 5439: @tab float
! 5440: @item @nicode{i} @tab integer with base indicator
! 5441: @item @nicode{n} @tab characters read so far
! 5442: @item @nicode{o} @tab octal integer
! 5443: @item @nicode{p} @tab pointer
! 5444: @item @nicode{s} @tab string of non-whitespace characters
! 5445: @item @nicode{u} @tab decimal integer
! 5446: @item @nicode{x} @nicode{X} @tab hex integer
! 5447: @item @nicode{[} @tab string of characters in a set
! 5448: @end multitable
! 5449: @end quotation
! 5450:
! 5451: @samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
! 5452: read either fixed point or scientific format, and either @samp{e} or @samp{E}
! 5453: for the exponent in scientific format.
! 5454:
! 5455: @samp{x} and @samp{X} are identical, both accept both upper and lower case
! 5456: hexadecimal.
! 5457:
! 5458: @samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
! 5459: values. For the standard C types these are described as ``unsigned''
! 5460: conversions, but that merely affects certain overflow handling, negatives are
! 5461: still allowed (see @code{strtoul}, @ref{Parsing of Integers,,,libc,The GNU C
! 5462: Library Reference Manual}). For GMP types there are no overflows, and
! 5463: @samp{d} and @samp{u} are identical.
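
The point about ``unsigned'' conversions accepting negatives can be checked
with the standard @code{strtoul}, which is the function the text refers to.
The wrapper name @code{parse_unsigned} is hypothetical.

```cpp
#include <climits>
#include <cstdlib>

// Parse text with an "unsigned" conversion. A leading minus sign is accepted
// and the result is negated in unsigned arithmetic, so it wraps around.
unsigned long parse_unsigned(const char *text) {
    return std::strtoul(text, nullptr, 10);
}
```

Here @code{parse_unsigned("-1")} yields @code{ULONG_MAX} rather than an error.
With a GMP type there is no fixed width to wrap around, which is why the
signed and unsigned conversions coincide.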
! 5464:
! 5465: @samp{Q} type reads the numerator and (optional) denominator as given. If the
! 5466: value might not be in canonical form then @code{mpq_canonicalize} must be
! 5467: called before using it in any calculations (@pxref{Rational Number
! 5468: Functions}).
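
For readers unfamiliar with the term, canonical form means the numerator and
denominator have no common factor and the denominator is positive, with zero
represented as 0/1. The following sketch shows that normalisation on plain
@code{long} values. The @code{Fraction} type and both functions are
hypothetical illustrations and say nothing about how @code{mpq_canonicalize}
is implemented.

```cpp
#include <cstdlib>

struct Fraction {
    long num;
    long den;   // assumed non-zero
};

// Greatest common divisor of two non-negative values (Euclid's algorithm).
long gcd_nonneg(long a, long b) {
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Reduce to lowest terms and make the denominator positive.
Fraction canonicalize(Fraction f) {
    long g = gcd_nonneg(std::labs(f.num), std::labs(f.den));
    if (g != 0) {
        f.num /= g;
        f.den /= g;
    }
    if (f.den < 0) {
        f.num = -f.num;
        f.den = -f.den;
    }
    return f;
}
```

An input such as @samp{2/4} read with @samp{%Qd} is stored as given, so it is
exactly the kind of value that needs this step before arithmetic.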
! 5469:
! 5470: @samp{Qi} will read a base specification separately for the numerator and
! 5471: denominator. For example @samp{0x10/11} would be 16/11, whereas
! 5472: @samp{0x10/0x11} would be 16/17.
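
The per-part base detection can be imitated with the standard @code{strtol}
using base 0, as sketched below. The @code{Parsed} type and
@code{parse_fraction} are hypothetical and handle only values that fit in a
@code{long}; they are meant to mirror the examples just given, not to
reproduce @code{gmp_scanf}.

```cpp
#include <cstdlib>

struct Parsed {
    long num;
    long den;
};

// Parse "NUM" or "NUM/DEN", detecting the base of each part separately.
// Base 0 makes strtol honour a leading "0x" (hex) or "0" (octal).
Parsed parse_fraction(const char *text) {
    char *end = nullptr;
    Parsed p;
    p.num = std::strtol(text, &end, 0);
    p.den = 1;
    if (*end == '/')
        p.den = std::strtol(end + 1, nullptr, 0);
    return p;
}
```

Parsing @samp{0x10/11} this way gives 16 and 11, while @samp{0x10/0x11} gives
16 and 17, matching the values above.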
! 5473:
! 5474: @samp{n} can be used with any of the types above, even the GMP types.
! 5475: @samp{*} to suppress assignment is allowed, though the field would then do
! 5476: nothing at all.
! 5477:
! 5478: Other conversions or types that might be accepted by the C library
! 5479: @code{scanf} cannot be used through @code{gmp_scanf}.
! 5480:
! 5481: Whitespace is read and discarded before a field, except for @samp{c} and
! 5482: @samp{[} conversions.
! 5483:
! 5484: For float conversions, the decimal point character (or string) expected is
! 5485: taken from the current locale settings on systems which provide
! 5486: @code{localeconv} (@pxref{Locales,,Locales and Internationalization,libc,The
! 5487: GNU C Library Reference Manual}). The C library will normally do the same for
! 5488: standard float input.
! 5489:
! 5490: The format string is only interpreted as plain @code{char}s; multibyte
! 5491: characters are not recognised. Perhaps this will change in the future.
! 5492:
! 5493:
! 5494: @node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
! 5495: @section Formatted Input Functions
! 5496:
! 5497: Each of the following functions is similar to the corresponding C library
! 5498: function. The plain @code{scanf} forms take a variable argument list. The
! 5499: @code{vscanf} forms take an argument pointer, see @ref{Variadic
! 5500: Functions,,,libc,The GNU C Library Reference Manual}, or @samp{man 3
! 5501: va_start}.
! 5502:
! 5503: It should be emphasised that if a format string is invalid, or the arguments
! 5504: don't match what the format specifies, then the behaviour of any of these
! 5505: functions will be unpredictable. GCC format string checking is not available,
! 5506: since it doesn't recognise the GMP extensions.
! 5507:
! 5508: No overlap is permitted between the @var{fmt} string and any of the results
! 5509: produced.
! 5510:
! 5511: @deftypefun int gmp_scanf (const char *@var{fmt}, ...)
! 5512: @deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
! 5513: Read from the standard input @code{stdin}.
! 5514: @end deftypefun
! 5515:
! 5516: @deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, ...)
! 5517: @deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
! 5518: Read from the stream @var{fp}.
! 5519: @end deftypefun
! 5520:
! 5521: @deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, ...)
! 5522: @deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
! 5523: Read from a null-terminated string @var{s}.
! 5524: @end deftypefun
! 5525:
! 5526: The return value from each of these functions is the same as the standard C99
! 5527: @code{scanf}, namely the number of fields successfully parsed and stored.
! 5528: @samp{%n} fields and fields read but suppressed by @samp{*} don't count
! 5529: towards the return value.
! 5530:
! 5531: If end of file, a file error, or end of string is reached when a match is
! 5532: required, and when no previous non-suppressed fields have matched, then the
! 5533: return value is EOF instead of 0. A match is required for a literal character
! 5534: in the format string or a field other than @samp{%n}. Whitespace in the
! 5535: format string is only an optional match and won't induce an EOF in this
! 5536: fashion. Leading whitespace read and discarded for a field doesn't count as a
! 5537: match.
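
Since the return value is stated to be the same as the standard C99
@code{scanf}, the rules can be tried out with the standard @code{sscanf}. The
sketch below does that with standard C types only; the function names and
format strings are made up for the illustration.

```cpp
#include <cstdio>

// Two stored fields: returns 2 on input such as "a(5) = 1234".
int count_two_fields(const char *input) {
    int n = 0;
    long z = 0;
    return std::sscanf(input, "a(%d) = %ld", &n, &z);
}

// One suppressed field, one stored field and one %n: only the stored
// field counts, so a full match returns 1.
int count_with_suppressed(const char *input) {
    int consumed = 0;
    int value = 0;
    return std::sscanf(input, "%*d %d%n", &value, &consumed);
}
```

An empty input string makes @code{count_two_fields} return @code{EOF}, because
the literal @samp{a} requires a match and nothing has matched yet, whereas an
input of @samp{b} gives 0, a matching failure rather than end of input.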
! 5538:
! 5539:
! 5540: @node C++ Formatted Input, , Formatted Input Functions, Formatted Input
! 5541: @section C++ Formatted Input
! 5542: @cindex C++ @code{istream} input
! 5543: @cindex @code{istream} input
! 5544:
! 5545: The following functions are provided in @file{libgmpxx}, which is built only
! 5546: if C++ support is enabled (@pxref{Build Options}). Prototypes are available
! 5547: from @code{<gmp.h>}.
! 5548:
! 5549: @deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
! 5550: Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
! 5551: @end deftypefun
! 5552:
! 5553: @deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
! 5554: Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
! 5555:
! 5556: An integer like @samp{123} will be read, or a fraction like @samp{5/9}. If
! 5557: the fraction is not in canonical form then @code{mpq_canonicalize} must be
! 5558: called (@pxref{Rational Number Functions}).
! 5559: @end deftypefun
! 5560:
! 5561: @deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
! 5562: Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
! 5563:
! 5564: Hex or octal floats are not supported, but might be in the future.
! 5565: @end deftypefun
! 5566:
! 5567: These operators mean that GMP types can be read in the usual C++ way, for
! 5568: example,
! 5569:
! 5570: @example
! 5571: mpz_t z;
! 5572: ...
! 5573: cin >> z;
! 5574: @end example
! 5575:
! 5576: But note that @code{istream} input (and @code{ostream} output, @pxref{C++
! 5577: Formatted Output}) is the only overloading available and using for instance
! 5578: @code{+} with an @code{mpz_t} will have unpredictable results.
! 5579:
! 5580:
! 5581: @node C++ Class Interface, BSD Compatible Functions, Formatted Input, Top
! 5582: @chapter C++ Class Interface
! 5583: @cindex C++ Interface
! 5584:
! 5585: This chapter describes the C++ class based interface to GMP.
! 5586:
! 5587: All GMP C language types and functions can be used in C++ programs, since
! 5588: @file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
! 5589: overloaded functions and operators which may be more convenient.
! 5590:
! 5591: Due to the implementation of this interface, a reasonably recent C++ compiler
! 5592: is required, one supporting namespaces, partial specialization of templates
! 5593: and member templates. For GCC this means version 2.91 or later.
! 5594:
! 5595: @strong{Everything described in this chapter is to be considered preliminary
! 5596: and might be subject to incompatible changes if some unforeseen difficulty
! 5597: reveals itself.}
! 5598:
! 5599: @menu
! 5600: * C++ Interface General::
! 5601: * C++ Interface Integers::
! 5602: * C++ Interface Rationals::
! 5603: * C++ Interface Floats::
! 5604: * C++ Interface MPFR::
! 5605: * C++ Interface Random Numbers::
! 5606: * C++ Interface Limitations::
! 5607: @end menu
! 5608:
! 5609:
! 5610: @node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
! 5611: @section C++ Interface General
! 5612:
! 5613: @noindent
! 5614: All the C++ classes and functions are available with
! 5615:
! 5616: @cindex gmpxx.h
! 5617: @example
! 5618: #include <gmpxx.h>
! 5619: @end example
! 5620:
! 5621: Programs should be linked with the @file{libgmpxx} and @file{libgmp}
! 5622: libraries. For example,
! 5623:
! 5624: @example
! 5625: g++ mycxxprog.cc -lgmpxx -lgmp
! 5626: @end example
! 5627:
! 5628: @noindent
! 5629: The classes defined are
! 5630:
! 5631: @deftp Class mpz_class
! 5632: @deftpx Class mpq_class
! 5633: @deftpx Class mpf_class
! 5634: @end deftp
! 5635:
! 5636: The standard operators and various standard functions are overloaded to allow
! 5637: arithmetic with these classes. For example,
! 5638:
! 5639: @example
! 5640: int
! 5641: main (void)
! 5642: @{
! 5643: mpz_class a, b, c;
! 5644:
! 5645: a = 1234;
! 5646: b = "-5678";
! 5647: c = a+b;
! 5648: cout << "sum is " << c << "\n";
! 5649: cout << "absolute value is " << abs(c) << "\n";
! 5650:
! 5651: return 0;
! 5652: @}
! 5653: @end example
! 5654:
! 5655: An important feature of the implementation is that an expression like
! 5656: @code{a=b+c} results in a single call to the corresponding @code{mpz_add},
! 5657: without using a temporary for the @code{b+c} part. Expressions which by their
! 5658: nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
! 5659: though.
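
One common way to obtain this effect in C++ is to have @code{operator+} return
a lightweight proxy that merely records its operands, and to do the real work
in the assignment operator. The sketch below shows the idea on a toy type; the
names @code{Big} and @code{Sum} are hypothetical, a @code{long} stands in for
an arbitrary precision value, and the code is not taken from @file{gmpxx.h},
whose implementation is considerably more elaborate.

```cpp
struct Big;

// Proxy describing "lhs + rhs" without computing it yet.
struct Sum {
    const Big &lhs;
    const Big &rhs;
};

struct Big {
    long value;   // stands in for an arbitrary precision number

    Big(long v = 0) : value(v) {}

    // Assigning from a Sum performs the addition straight into *this,
    // analogous to one three-operand add call with no temporary Big.
    Big &operator=(const Sum &s) {
        value = s.lhs.value + s.rhs.value;
        return *this;
    }
};

// operator+ only records its operands; nothing is added here.
inline Sum operator+(const Big &a, const Big &b) {
    return Sum{a, b};
}
```

With these definitions @code{a = b + c} builds a @code{Sum} holding two
references and then performs a single addition into @code{a}. The toy does not
handle @code{a = b*c + d*e}; that is the sort of expression for which
intermediate values are unavoidable.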
! 5660:
! 5661: The classes can be freely intermixed in expressions, as can the classes and
! 5662: the standard types @code{long}, @code{unsigned long} and @code{double}.
! 5663: Smaller types like @code{int} or @code{float} can also be intermixed, since
! 5664: C++ will promote them.
! 5665:
! 5666: Note that @code{bool} is not accepted directly, but must be explicitly cast to
! 5667: an @code{int} first. This is because C++ will automatically convert any
! 5668: pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
! 5669: sorts of invalid class and pointer combinations compile but almost certainly
! 5670: not do anything sensible.
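
The hazard can be reproduced with a small class that does accept @code{bool}.
In this sketch @code{AcceptsBool} is a hypothetical class, deliberately written
the way the GMP classes are not, to show a pointer argument compiling and
silently selecting the @code{bool} constructor.

```cpp
#include <string>

// Hypothetical class that, unlike the GMP classes, accepts bool.
struct AcceptsBool {
    std::string how;
    AcceptsBool(long) : how("long") {}
    AcceptsBool(bool) : how("bool") {}
};

// A pointer argument compiles, because pointer-to-bool is a standard
// conversion, and it silently selects the bool constructor.
std::string constructed_from_pointer(const char *p) {
    AcceptsBool x(p);
    return x.how;
}
```

Calling @code{constructed_from_pointer("1234")} reports @samp{bool}: the
string contents are ignored and only the fact that the pointer is non-null
survives.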
! 5671:
! 5672: Conversions back from the classes to standard C++ types aren't done
! 5673: automatically, instead member functions like @code{get_si} are provided (see
! 5674: the following sections for details).
! 5675:
! 5676: Also there are no automatic conversions from the classes to the corresponding
! 5677: GMP C types, instead a reference to the underlying C object can be obtained
! 5678: with the following functions,
! 5679:
! 5680: @deftypefun mpz_t mpz_class::get_mpz_t ()
! 5681: @deftypefunx mpq_t mpq_class::get_mpq_t ()
! 5682: @deftypefunx mpf_t mpf_class::get_mpf_t ()
! 5683: @end deftypefun
! 5684:
! 5685: These can be used to call a C function which doesn't have a C++ class
! 5686: interface. For example to set @code{a} to the GCD of @code{b} and @code{c},
! 5687:
! 5688: @example
! 5689: mpz_class a, b, c;
! 5690: ...
! 5691: mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
! 5692: @end example
! 5693:
! 5694: In the other direction, a class can be initialized from the corresponding GMP
! 5695: C type, or assigned to if an explicit constructor is used. In both cases this
! 5696: makes a copy of the value, it doesn't create any sort of association. For
! 5697: example,
! 5698:
! 5699: @example
! 5700: mpz_t z;
! 5701: // ... init and calculate z ...
! 5702: mpz_class x(z);
! 5703: mpz_class y;
! 5704: y = mpz_class (z);
! 5705: @end example
! 5706:
! 5707: There are no namespace setups in @file{gmpxx.h}, all types and functions are
! 5708: simply put into the global namespace. This is what @file{gmp.h} has done in
! 5709: the past, and continues to do for compatibility. The extras provided by
! 5710: @file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
! 5711: anything.
! 5712:
! 5713:
! 5714: @node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
! 5715: @section C++ Interface Integers
! 5716:
! 5717: @deftypefun void mpz_class::mpz_class (type @var{n})
! 5718: Construct an @code{mpz_class}. All the standard C++ types may be used, except
! 5719: @code{long long} and @code{long double}, and all the GMP C++ classes can be
! 5720: used. Any necessary conversion follows the corresponding C function, for
! 5721: example @code{double} follows @code{mpz_set_d} (@pxref{Assigning Integers}).
! 5722: @end deftypefun
! 5723:
! 5724: @deftypefun void mpz_class::mpz_class (mpz_t @var{z})
! 5725: Construct an @code{mpz_class} from an @code{mpz_t}. The value in @var{z} is
! 5726: copied into the new @code{mpz_class}, there won't be any permanent association
! 5727: between it and @var{z}.
! 5728: @end deftypefun
! 5729:
! 5730: @deftypefun void mpz_class::mpz_class (const char *@var{s})
! 5731: @deftypefunx void mpz_class::mpz_class (const char *@var{s}, int @var{base})
! 5732: @deftypefunx void mpz_class::mpz_class (const string& @var{s})
! 5733: @deftypefunx void mpz_class::mpz_class (const string& @var{s}, int @var{base})
! 5734: Construct an @code{mpz_class} converted from a string using
! 5735: @code{mpz_set_str} (@pxref{Assigning Integers}). If the @var{base} is not
! 5736: given then 0 is used.
! 5737: @end deftypefun
! 5738:
! 5739: @deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
! 5740: @deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
! 5741: Divisions involving @code{mpz_class} round towards zero, as per the
! 5742: @code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
! 5743: This corresponds to the rounding used for plain @code{int} calculations on
! 5744: most machines.
! 5745:
! 5746: The @code{mpz_fdiv...} or @code{mpz_cdiv...} functions can always be called
! 5747: directly if desired. For example,
! 5748:
! 5749: @example
! 5750: mpz_class q, a, d;
! 5751: ...
! 5752: mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
! 5753: @end example
! 5754: @end deftypefun
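
The difference between the two rounding styles shows up only when the operands
have opposite signs and the division is inexact. The sketch below demonstrates
this on plain @code{long} values, assuming the usual C++ behaviour that integer
division truncates towards zero; @code{floor_div} and @code{trunc_div} are
hypothetical names, not GMP functions.

```cpp
// Quotient rounded towards minus infinity, in the style of the fdiv
// functions. d is assumed non-zero.
long floor_div(long a, long d) {
    long q = a / d;                          // truncates towards zero
    if ((a % d != 0) && ((a < 0) != (d < 0)))
        --q;                                 // step down when inexact and signs differ
    return q;
}

// Quotient rounded towards zero, the rounding described for operator/.
long trunc_div(long a, long d) {
    return a / d;
}
```

For @math{-7} divided by 2, truncation gives @math{-3} and flooring gives
@math{-4}; for 7 divided by 2 both give 3.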
! 5755:
! 5756: @deftypefun mpz_class abs (mpz_class @var{op1})
! 5757: @deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
! 5758: @deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
! 5759: @deftypefunx double mpz_class::get_d (void)
! 5760: @deftypefunx long mpz_class::get_si (void)
! 5761: @deftypefunx {unsigned long} mpz_class::get_ui (void)
! 5762: @maybepagebreak
! 5763: @deftypefunx bool mpz_class::fits_sint_p (void)
! 5764: @deftypefunx bool mpz_class::fits_slong_p (void)
! 5765: @deftypefunx bool mpz_class::fits_sshort_p (void)
! 5766: @maybepagebreak
! 5767: @deftypefunx bool mpz_class::fits_uint_p (void)
! 5768: @deftypefunx bool mpz_class::fits_ulong_p (void)
! 5769: @deftypefunx bool mpz_class::fits_ushort_p (void)
! 5770: @maybepagebreak
! 5771: @deftypefunx int sgn (mpz_class @var{op})
! 5772: @deftypefunx mpz_class sqrt (mpz_class @var{op})
! 5773: These functions provide a C++ class interface to the corresponding GMP C
! 5774: routines.
! 5775:
! 5776: @code{cmp} can be used with any of the classes or the standard C++ types,
! 5777: except @code{long long} and @code{long double}.
! 5778: @end deftypefun
! 5779:
! 5780: @sp 1
! 5781: Overloaded operators for combinations of @code{mpz_class} and @code{double}
! 5782: are provided for completeness, but it should be noted that if the given
! 5783: @code{double} is not an integer then the way any rounding is done is currently
! 5784: unspecified. The rounding might take place at the start, in the middle, or at
! 5785: the end of the operation, and it might change in the future.
! 5786:
! 5787: Conversions between @code{mpz_class} and @code{double}, however, are defined
! 5788: to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
! 5789: And comparisons are always made exactly, as per @code{mpz_cmp_d}.
! 5790:
! 5791:
! 5792: @node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
! 5793: @section C++ Interface Rationals
! 5794:
! 5795: In all the following constructors, if a fraction is given then it should be in
! 5796: canonical form, or if not then @code{mpq_class::canonicalize} must be called.
! 5797:
! 5798: @deftypefun void mpq_class::mpq_class (type @var{op})
! 5799: @deftypefunx void mpq_class::mpq_class (integer @var{num}, integer @var{den})
! 5800: Construct an @code{mpq_class}. The initial value can be a single value of any
! 5801: type, or a pair of integers (@code{mpz_class} or standard C++ integer types)
! 5802: representing a fraction, except that @code{long long} and @code{long double}
! 5803: are not supported. For example,
! 5804:
! 5805: @example
! 5806: mpq_class q1 (99);
! 5807: mpq_class q2 (1.75);
! 5808: mpq_class q3 (1, 3);
! 5809: @end example
! 5810: @end deftypefun
! 5811:
! 5812: @deftypefun void mpq_class::mpq_class (mpq_t @var{q})
! 5813: Construct an @code{mpq_class} from an @code{mpq_t}. The value in @var{q} is
! 5814: copied into the new @code{mpq_class}, there won't be any permanent association
! 5815: between it and @var{q}.
! 5816: @end deftypefun
! 5817:
! 5818: @deftypefun void mpq_class::mpq_class (const char *@var{s})
! 5819: @deftypefunx void mpq_class::mpq_class (const char *@var{s}, int @var{base})
! 5820: @deftypefunx void mpq_class::mpq_class (const string& @var{s})
! 5821: @deftypefunx void mpq_class::mpq_class (const string& @var{s}, int @var{base})
! 5822: Construct an @code{mpq_class} converted from a string using
! 5823: @code{mpq_set_str}, (@pxref{Initializing Rationals}). If the @var{base} is
! 5824: not given then 0 is used.
! 5825: @end deftypefun
! 5826:
! 5827: @deftypefun void mpq_class::canonicalize ()
! 5828: Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
! 5829: Functions}. All arithmetic operators require their operands in canonical
! 5830: form, and will return results in canonical form.
! 5831: @end deftypefun
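
As a rough illustration of what canonical form means, the following
self-contained sketch reduces a fraction held in built-in integers: common
factors are removed and the denominator is made positive. The
@code{Fraction} type and the @code{canonicalize} helper here are hypothetical
stand-ins and not part of GMP; the real @code{mpq_class::canonicalize} does
the equivalent job on arbitrary precision values.

```cpp
#include <cassert>

// Hypothetical stand-in for a rational number, using built-in integers.
struct Fraction {
  long num;
  long den;
};

// Greatest common divisor of two non-negative values (Euclid's algorithm).
static long gcd_nonneg (long a, long b)
{
  while (b != 0) {
    long t = a % b;
    a = b;
    b = t;
  }
  return a;
}

// Remove common factors and make the denominator positive.
Fraction canonicalize (Fraction q)
{
  assert (q.den != 0);
  if (q.den < 0) {
    q.num = -q.num;
    q.den = -q.den;
  }
  long g = gcd_nonneg (q.num < 0 ? -q.num : q.num, q.den);
  q.num /= g;
  q.den /= g;
  return q;
}
```

With this sketch 6/-4 becomes -3/2, and a zero numerator always ends up
as 0/1.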
! 5832:
! 5833: @deftypefun mpq_class abs (mpq_class @var{op})
! 5834: @deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
! 5835: @deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
! 5836: @maybepagebreak
! 5837: @deftypefunx double mpq_class::get_d (void)
! 5838: @deftypefunx int sgn (mpq_class @var{op})
! 5839: These functions provide a C++ class interface to the corresponding GMP C
! 5840: routines.
! 5841:
! 5842: @code{cmp} can be used with any of the classes or the standard C++ types,
! 5843: except @code{long long} and @code{long double}.
! 5844: @end deftypefun
! 5845:
! 5846: @deftypefun {mpz_class&} mpq_class::get_num ()
! 5847: @deftypefunx {mpz_class&} mpq_class::get_den ()
! 5848: Get a reference to an @code{mpz_class} which is the numerator or denominator
! 5849: of an @code{mpq_class}. This can be used both for read and write access. If
! 5850: the object returned is modified, it modifies the original @code{mpq_class}.
! 5851:
! 5852: If direct manipulation might produce a non-canonical value, then
! 5853: @code{mpq_class::canonicalize} must be called before further operations.
! 5854: @end deftypefun
! 5855:
! 5856: @deftypefun mpz_t mpq_class::get_num_mpz_t ()
! 5857: @deftypefunx mpz_t mpq_class::get_den_mpz_t ()
! 5858: Get a reference to the underlying @code{mpz_t} numerator or denominator of an
! 5859: @code{mpq_class}. This can be passed to C functions expecting an
! 5860: @code{mpz_t}. Any modifications made to the @code{mpz_t} will modify the
! 5861: original @code{mpq_class}.
! 5862:
! 5863: If direct manipulation might produce a non-canonical value, then
! 5864: @code{mpq_class::canonicalize} must be called before further operations.
! 5865: @end deftypefun
! 5866:
! 5867: @deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop})
! 5868: Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
! 5869: the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).
! 5870:
! 5871: If the @var{rop} read might not be in canonical form then
! 5872: @code{mpq_class::canonicalize} must be called.
! 5873: @end deftypefun
! 5874:
! 5875:
! 5876: @node C++ Interface Floats, C++ Interface MPFR, C++ Interface Rationals, C++ Class Interface
! 5877: @section C++ Interface Floats
! 5878:
! 5879: When an expression requires the use of temporary intermediate @code{mpf_class}
! 5880: values, like @code{f=g*h+x*y}, those temporaries will have the same precision
! 5881: as the destination @code{f}. Explicit constructors can be used if this
! 5882: doesn't suit.
! 5883:
! 5884: @deftypefun {} mpf_class::mpf_class (type @var{op})
! 5885: @deftypefunx {} mpf_class::mpf_class (type @var{op}, unsigned long @var{prec})
! 5886: Construct an @code{mpf_class}. Any standard C++ type can be used, except
! 5887: @code{long long} and @code{long double}, and any of the GMP C++ classes can be
! 5888: used.
! 5889:
! 5890: If @var{prec} is given, the initial precision is that value, in bits. If
! 5891: @var{prec} is not given, then the initial precision is determined by the type
! 5892: of @var{op} given. An @code{mpz_class}, @code{mpq_class}, string, or C++
! 5893: builtin type will give the default @code{mpf} precision (@pxref{Initializing
! 5894: Floats}). An @code{mpf_class} or expression will give the precision of that
! 5895: value. The precision of a binary expression is the higher of the two
! 5896: operands.
! 5897:
! 5898: @example
! 5899: mpf_class f(1.5); // default precision
! 5900: mpf_class f(1.5, 500); // 500 bits (at least)
! 5901: mpf_class f(x); // precision of x
! 5902: mpf_class f(abs(x)); // precision of x
! 5903: mpf_class f(-g, 1000); // 1000 bits (at least)
! 5904: mpf_class f(x+y); // greater of precisions of x and y
! 5905: @end example
! 5906: @end deftypefun
! 5907:
! 5908: @deftypefun mpf_class abs (mpf_class @var{op})
! 5909: @deftypefunx mpf_class ceil (mpf_class @var{op})
! 5910: @deftypefunx int cmp (mpf_class @var{op1}, type @var{op2})
! 5911: @deftypefunx int cmp (type @var{op1}, mpf_class @var{op2})
! 5912: @maybepagebreak
! 5913: @deftypefunx mpf_class floor (mpf_class @var{op})
! 5914: @deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2})
! 5915: @deftypefunx double mpf_class::get_d (void)
! 5916: @deftypefunx long mpf_class::get_si (void)
! 5917: @deftypefunx {unsigned long} mpf_class::get_ui (void)
! 5918: @maybepagebreak
! 5919: @deftypefunx bool mpf_class::fits_sint_p (void)
! 5920: @deftypefunx bool mpf_class::fits_slong_p (void)
! 5921: @deftypefunx bool mpf_class::fits_sshort_p (void)
! 5922: @maybepagebreak
! 5923: @deftypefunx bool mpf_class::fits_uint_p (void)
! 5924: @deftypefunx bool mpf_class::fits_ulong_p (void)
! 5925: @deftypefunx bool mpf_class::fits_ushort_p (void)
! 5926: @maybepagebreak
! 5927: @deftypefunx int sgn (mpf_class @var{op})
! 5928: @deftypefunx mpf_class sqrt (mpf_class @var{op})
! 5929: @deftypefunx mpf_class trunc (mpf_class @var{op})
! 5930: These functions provide a C++ class interface to the corresponding GMP C
! 5931: routines.
! 5932:
! 5933: @code{cmp} can be used with any of the classes or the standard C++ types,
! 5934: except @code{long long} and @code{long double}.
! 5935:
! 5936: The accuracy provided by @code{hypot} is not currently guaranteed.
! 5937: @end deftypefun
! 5938:
! 5939: @deftypefun {unsigned long int} mpf_class::get_prec ()
! 5940: @deftypefunx void mpf_class::set_prec (unsigned long @var{prec})
! 5941: @deftypefunx void mpf_class::set_prec_raw (unsigned long @var{prec})
! 5942: Get or set the current precision of an @code{mpf_class}.
! 5943:
! 5944: The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
! 5945: Floats}) apply to @code{mpf_class::set_prec_raw}. Note in particular that the
! 5946: @code{mpf_class} must be restored to its allocated precision before being
! 5947: destroyed. This must be done by application code; there's no automatic
! 5948: mechanism for it.
! 5949: @end deftypefun
! 5950:
! 5951:
! 5952: @node C++ Interface MPFR, C++ Interface Random Numbers, C++ Interface Floats, C++ Class Interface
! 5953: @section C++ Interface MPFR
! 5954:
! 5955: The C++ class interface to MPFR is provided if MPFR is enabled (@pxref{Build
! 5956: Options}). This interface must be regarded as preliminary and possibly
! 5957: subject to incompatible changes in the future, since MPFR itself is
! 5958: preliminary. All definitions can be obtained with
! 5959:
! 5960: @cindex mpfrxx.h
! 5961: @example
! 5962: #include <mpfrxx.h>
! 5963: @end example
! 5964:
! 5965: @noindent
! 5966: This defines
! 5967:
! 5968: @deftp Class mpfr_class
! 5969: @end deftp
! 5970:
! 5971: @noindent
! 5972: which behaves similarly to @code{mpf_class} (@pxref{C++ Interface Floats}).
! 5973:
! 5974:
! 5975: @node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface MPFR, C++ Class Interface
! 5976: @section C++ Interface Random Numbers
! 5977:
! 5978: @deftp Class gmp_randclass
! 5979: The C++ class interface to the GMP random number functions uses
! 5980: @code{gmp_randclass} to hold an algorithm selection and current state, as per
! 5981: @code{gmp_randstate_t}.
! 5982: @end deftp
! 5983:
! 5984: @deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, ...), ...)
! 5985: Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
! 5986: function (@pxref{Random State Initialization}). The arguments expected are
! 5987: the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
! 5988: For example,
! 5989:
! 5990: @example
! 5991: gmp_randclass r1 (gmp_randinit_default);
! 5992: gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
! 5993: gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
! 5994: @end example
! 5995:
! 5996: @code{gmp_randinit_lc_2exp_size} can fail if the size requested is too big;
! 5997: the behaviour of @code{gmp_randclass::gmp_randclass} is undefined in this case
! 5998: (perhaps this will change in the future).
! 5999: @end deftypefun
! 6000:
! 6001: @deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, ...)
! 6002: Construct a @code{gmp_randclass} using the same parameters as
! 6003: @code{gmp_randinit} (@pxref{Random State Initialization}). This function is
! 6004: obsolete and the above @var{randinit} style should be preferred.
! 6005: @end deftypefun
! 6006:
! 6007: @deftypefun void gmp_randclass::seed (unsigned long int @var{s})
! 6008: @deftypefunx void gmp_randclass::seed (mpz_class @var{s})
! 6009: Seed a random number generator. @xref{Random Number Functions}, for how
! 6010: to choose a good seed.
! 6011: @end deftypefun
! 6012:
! 6013: @deftypefun mpz_class gmp_randclass::get_z_bits (unsigned long @var{bits})
! 6014: @deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
! 6015: Generate a random integer with a specified number of bits.
! 6016: @end deftypefun
! 6017:
! 6018: @deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
! 6019: Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
! 6020: @end deftypefun
! 6021:
! 6022: @deftypefun mpf_class gmp_randclass::get_f ()
! 6023: @deftypefunx mpf_class gmp_randclass::get_f (unsigned long @var{prec})
! 6024: Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}. @var{f}
! 6025: will be to @var{prec} bits precision, or if @var{prec} is not given then to
! 6026: the precision of the destination. For example,
! 6027:
! 6028: @example
! 6029: gmp_randclass r (gmp_randinit_default);
! 6030: ...
! 6031: mpf_class f (0, 512); // 512 bits precision
! 6032: f = r.get_f(); // random number, 512 bits
! 6033: @end example
! 6034: @end deftypefun
! 6035:
! 6036:
! 6037:
! 6038: @node C++ Interface Limitations, , C++ Interface Random Numbers, C++ Class Interface
! 6039: @section C++ Interface Limitations
! 6040:
! 6041: @table @asis
! 6042: @item @code{mpq_class} and Templated Reading
! 6043: A generic piece of template code probably won't know that @code{mpq_class}
! 6044: requires a @code{canonicalize} call if inputs read with @code{operator>>}
! 6045: might be non-canonical. This can lead to incorrect results.
! 6046:
! 6047: @code{operator>>} behaves as it does for reasons of efficiency. A
! 6048: canonicalize can be quite time consuming on large operands, and is best
! 6049: avoided if it's not necessary.
! 6050:
! 6051: But this potential difficulty reduces the usefulness of @code{mpq_class}.
! 6052: Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
! 6053: the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
! 6054: pressed into service. Or maybe, at the risk of inconsistency, the
! 6055: @code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
! 6056: @code{operator>>} not doing so, for use on those occasions when that's
! 6057: acceptable. Send feedback or alternate ideas to @email{bug-gmp@@gnu.org}.
! 6058:
! 6059: @item Subclassing
! 6060: Subclassing the GMP C++ classes works, but is not currently recommended.
! 6061:
! 6062: Expressions involving subclasses resolve correctly (or seem to), but in normal
! 6063: C++ fashion the subclass doesn't inherit constructors and assignments.
! 6064: There are many of those in the GMP classes, and a good way to reestablish them
! 6065: in a subclass is not yet provided.
! 6066:
! 6067: @item Templated Expressions
! 6068:
! 6069: A subtle difficulty exists when using expressions together with
! 6070: application-defined template functions. Consider the following, with @code{T}
! 6071: intended to be some numeric type,
! 6072:
! 6073: @example
! 6074: template <class T>
! 6075: T fun (const T &, const T &);
! 6076: @end example
! 6077:
! 6078: @noindent
! 6079: When used with, say, plain @code{mpz_class} variables, it works fine: @code{T}
! 6080: is resolved as @code{mpz_class}.
! 6081:
! 6082: @example
! 6083: mpz_class f(1), g(2);
! 6084: fun (f, g); // Good
! 6085: @end example
! 6086:
! 6087: @noindent
! 6088: But when one of the arguments is an expression, it doesn't work.
! 6089:
! 6090: @example
! 6091: mpz_class f(1), g(2), h(3);
! 6092: fun (f, g+h); // Bad
! 6093: @end example
! 6094:
! 6095: This is because @code{g+h} ends up being a certain expression template type
! 6096: internal to @code{gmpxx.h}, which the C++ template resolution rules are unable
! 6097: to automatically convert to @code{mpz_class}. The workaround is simply to add
! 6098: an explicit cast.
! 6099:
! 6100: @example
! 6101: mpz_class f(1), g(2), h(3);
! 6102: fun (f, mpz_class(g+h)); // Good
! 6103: @end example
! 6104:
! 6105: Similarly, within @code{fun} it may be necessary to cast an expression to type
! 6106: @code{T} when calling a templated @code{fun2}.
! 6107:
! 6108: @example
! 6109: template <class T>
! 6110: void fun (T f, T g)
! 6111: @{
! 6112: fun2 (f, f+g); // Bad
! 6113: @}
! 6114:
! 6115: template <class T>
! 6116: void fun (T f, T g)
! 6117: @{
! 6118: fun2 (f, T(f+g)); // Good
! 6119: @}
! 6120: @end example
! 6121: @end table
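
The templated expressions difficulty can be reproduced without GMP. The
following sketch uses a hypothetical @code{Num} class whose @code{operator+}
returns a separate @code{SumExpr} object, standing in for the expression
template types internal to @code{gmpxx.h} (whose real definitions are
considerably more elaborate). The names @code{Num}, @code{SumExpr},
@code{larger} and @code{demo} are all made up for the illustration.

```cpp
#include <cassert>

// Hypothetical lazy expression: holds the operands of a pending addition.
struct SumExpr {
  long lhs;
  long rhs;
};

// Hypothetical stand-in for mpz_class.
struct Num {
  long value;
  explicit Num (long v) : value (v) {}
  // Converting from an expression evaluates it.
  Num (const SumExpr &e) : value (e.lhs + e.rhs) {}
};

// Adding two Num objects yields an expression, not a Num.
inline SumExpr operator+ (const Num &a, const Num &b)
{
  SumExpr e = { a.value, b.value };
  return e;
}

// Both arguments must deduce to the same T.
template <class T>
T larger (const T &a, const T &b)
{
  return (a.value < b.value) ? b : a;
}

inline long demo ()
{
  Num f (1), g (2), h (3);
  // larger (f, g + h);   // rejected: T would be both Num and SumExpr
  return larger (f, Num (g + h)).value;   // explicit cast, T is Num
}
```

Template argument deduction does not consider the user-defined conversion
from @code{SumExpr} to @code{Num}, which is why the commented-out call is
rejected while the explicitly cast one is accepted.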
! 6122:
! 6123:
! 6124: @node BSD Compatible Functions, Custom Allocation, C++ Class Interface, Top
! 6125: @comment node-name, next, previous, up
! 6126: @chapter Berkeley MP Compatible Functions
! 6127: @cindex Berkeley MP compatible functions
! 6128: @cindex BSD MP compatible functions
! 6129:
! 6130: These functions are intended to be fully compatible with the Berkeley MP
! 6131: library which is available on many BSD derived U*ix systems. The
! 6132: @samp{--enable-mpbsd} option must be used when building GNU MP to make these
! 6133: available (@pxref{Installing GMP}).
! 6134:
! 6135: The original Berkeley MP library has a usage restriction: you cannot use the
! 6136: same variable as both source and destination in a single function call. The
! 6137: compatible functions in GNU MP do not share this restriction---inputs and
! 6138: outputs may overlap.
! 6139:
! 6140: It is not recommended that new programs be written using these functions.
! 6141: Apart from the incomplete set of functions, the interface for initializing
! 6142: @code{MINT} objects is more error prone, and the @code{pow} function collides
! 6143: with @code{pow} in @file{libm.a}.
! 6144:
! 6145: @cindex @file{mp.h}
! 6146: Include the header @file{mp.h} to get the definition of the necessary types and
! 6147: functions. If you are on a BSD derived system, make sure to include GNU
! 6148: @file{mp.h} if you are going to link the GNU @file{libmp.a} to your program.
! 6149: This means that you probably need to give the @samp{-I<dir>} option to the
! 6150: compiler, where @samp{<dir>} is the directory where you have GNU @file{mp.h}.
! 6151:
! 6152: @deftypefun {MINT *} itom (signed short int @var{initial_value})
! 6153: Allocate an integer consisting of a @code{MINT} object and dynamic limb space.
! 6154: Initialize the integer to @var{initial_value}. Return a pointer to the
! 6155: @code{MINT} object.
! 6156: @end deftypefun
! 6157:
! 6158: @deftypefun {MINT *} xtom (char *@var{initial_value})
! 6159: Allocate an integer consisting of a @code{MINT} object and dynamic limb space.
! 6160: Initialize the integer from @var{initial_value}, a hexadecimal,
! 6161: null-terminated C string. Return a pointer to the @code{MINT} object.
! 6162: @end deftypefun
! 6163:
! 6164: @deftypefun void move (MINT *@var{src}, MINT *@var{dest})
! 6165: Set @var{dest} to @var{src} by copying. Both variables must be previously
! 6166: initialized.
! 6167: @end deftypefun
! 6168:
! 6169: @deftypefun void madd (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
! 6170: Add @var{src_1} and @var{src_2} and put the sum in @var{destination}.
! 6171: @end deftypefun
! 6172:
! 6173: @deftypefun void msub (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
! 6174: Subtract @var{src_2} from @var{src_1} and put the difference in
! 6175: @var{destination}.
! 6176: @end deftypefun
! 6177:
! 6178: @deftypefun void mult (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
! 6179: Multiply @var{src_1} and @var{src_2} and put the product in @var{destination}.
! 6180: @end deftypefun
! 6181:
! 6182: @deftypefun void mdiv (MINT *@var{dividend}, MINT *@var{divisor}, MINT *@var{quotient}, MINT *@var{remainder})
! 6183: @deftypefunx void sdiv (MINT *@var{dividend}, signed short int @var{divisor}, MINT *@var{quotient}, signed short int *@var{remainder})
! 6184: Set @var{quotient} to @var{dividend}/@var{divisor}, and @var{remainder} to
! 6185: @var{dividend} mod @var{divisor}. The quotient is rounded towards zero; the
! 6186: remainder has the same sign as the dividend unless it is zero.
! 6187:
! 6188: Some implementations of these functions work differently---or not at all---for
! 6189: negative arguments.
! 6190: @end deftypefun
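
The rounding convention described for @code{mdiv} and @code{sdiv} is the same
one that built-in integer division follows in C++11 and later, so it can be
illustrated on ordinary @code{long} values. The @code{DivResult} type and
@code{div_trunc} function below are hypothetical names used only for this
sketch.

```cpp
#include <cassert>

// Hypothetical result pair for the illustration.
struct DivResult {
  long quotient;
  long remainder;
};

// Quotient rounded towards zero, remainder with the sign of the dividend.
DivResult div_trunc (long dividend, long divisor)
{
  assert (divisor != 0);
  DivResult r;
  r.quotient = dividend / divisor;    // C++11: rounds towards zero
  r.remainder = dividend % divisor;   // C++11: sign follows the dividend
  return r;
}
```

For instance @minus{}7 divided by 2 gives quotient @minus{}3 and remainder
@minus{}1, and in every case the dividend equals
quotient*divisor+remainder.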
! 6191:
! 6192: @deftypefun void msqrt (MINT *@var{op}, MINT *@var{root}, MINT *@var{remainder})
! 6193: Set @var{root} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
! 6194: of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{remainder} to
! 6195: @m{(@var{op} - @var{root}^2), @var{op}@minus{}@var{root}*@var{root}}, i.e.
! 6196: zero if @var{op} is a perfect square.
! 6197:
! 6198: If @var{root} and @var{remainder} are the same variable, the results are
! 6199: undefined.
! 6200: @end deftypefun
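
The relationship between the root and remainder can be seen in a small
sketch on @code{unsigned long} values. It uses a textbook bit-by-bit integer
square root, which is not necessarily how @code{msqrt} is implemented;
@code{SqrtResult} and @code{sqrt_rem} are hypothetical names.

```cpp
#include <cassert>
#include <limits>

// Hypothetical result pair for the illustration.
struct SqrtResult {
  unsigned long root;
  unsigned long remainder;
};

// Truncated square root and remainder, so that
// root*root + remainder == op and root*root <= op < (root+1)*(root+1).
SqrtResult sqrt_rem (unsigned long op)
{
  const unsigned long original = op;
  unsigned long root = 0;
  // Highest power of four representable in an unsigned long.
  unsigned long bit = 1UL << (std::numeric_limits<unsigned long>::digits - 2);
  while (bit > op)
    bit >>= 2;
  while (bit != 0) {
    if (op >= root + bit) {
      op -= root + bit;
      root = (root >> 1) + bit;
    } else {
      root >>= 1;
    }
    bit >>= 2;
  }
  SqrtResult r = { root, op };
  assert (r.root * r.root + r.remainder == original);
  return r;
}
```

An input of 17 gives root 4 and remainder 1, while a perfect square such as
16 gives remainder zero.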
! 6201:
! 6202: @deftypefun void pow (MINT *@var{base}, MINT *@var{exp}, MINT *@var{mod}, MINT *@var{dest})
! 6203: Set @var{dest} to (@var{base} raised to @var{exp}) modulo @var{mod}.
! 6204: @end deftypefun
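
The result computed by @code{pow} can be illustrated on small built-in
integers with the usual square-and-multiply method. This is only a sketch of
the arithmetic under the assumption that the modulus fits in 32 bits (so
products fit in 64 bits); @code{pow_mod} is a hypothetical name and nothing
here describes the algorithm the library itself uses.

```cpp
#include <cassert>
#include <cstdint>

// (base raised to exp) modulo mod, for 0 < mod < 2^32.
std::uint64_t pow_mod (std::uint64_t base, std::uint64_t exp, std::uint64_t mod)
{
  assert (mod != 0 && mod <= 0xFFFFFFFFu);
  std::uint64_t result = 1 % mod;     // 0 when mod is 1
  base %= mod;
  while (exp != 0) {
    if (exp & 1)
      result = result * base % mod;   // multiply step for a set bit
    base = base * base % mod;         // square step
    exp >>= 1;
  }
  return result;
}
```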
! 6205:
! 6206: @deftypefun void rpow (MINT *@var{base}, signed short int @var{exp}, MINT *@var{dest})
! 6207: Set @var{dest} to @var{base} raised to @var{exp}.
! 6208: @end deftypefun
! 6209:
! 6210: @deftypefun void gcd (MINT *@var{op1}, MINT *@var{op2}, MINT *@var{res})
! 6211: Set @var{res} to the greatest common divisor of @var{op1} and @var{op2}.
! 6212: @end deftypefun
! 6213:
! 6214: @deftypefun int mcmp (MINT *@var{op1}, MINT *@var{op2})
! 6215: Compare @var{op1} and @var{op2}. Return a positive value if @var{op1} >
! 6216: @var{op2}, zero if @var{op1} = @var{op2}, and a negative value if @var{op1} <
! 6217: @var{op2}.
! 6218: @end deftypefun
! 6219:
! 6220: @deftypefun void min (MINT *@var{dest})
! 6221: Input a decimal string from @code{stdin}, and put the read integer in
! 6222: @var{dest}. SPC and TAB are allowed in the number string, and are ignored.
! 6223: @end deftypefun
! 6224:
! 6225: @deftypefun void mout (MINT *@var{src})
! 6226: Output @var{src} to @code{stdout}, as a decimal string. Also output a newline.
! 6227: @end deftypefun
! 6228:
! 6229: @deftypefun {char *} mtox (MINT *@var{op})
! 6230: Convert @var{op} to a hexadecimal string, and return a pointer to the string.
! 6231: The returned string is allocated using the default memory allocation function,
! 6232: @code{malloc} by default. It will be @code{strlen(str)+1} bytes, that being
! 6233: exactly enough for the string and null-terminator.
! 6234: @end deftypefun
! 6235:
! 6236: @deftypefun void mfree (MINT *@var{op})
! 6237: De-allocate the space used by @var{op}. @strong{This function should only be
! 6238: passed a value returned by @code{itom} or @code{xtom}.}
! 6239: @end deftypefun
! 6240:
! 6241:
! 6242: @node Custom Allocation, Language Bindings, BSD Compatible Functions, Top
! 6243: @comment node-name, next, previous, up
! 6244: @chapter Custom Allocation
! 6245: @cindex Custom allocation
! 6246: @cindex Memory allocation
! 6247: @cindex Allocation of memory
! 6248:
! 6249: By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory
! 6250: allocation, and if they fail GMP prints a message to the standard error output
! 6251: and terminates the program.
! 6252:
! 6253: Alternate functions can be specified to allocate memory in a different way or
! 6254: to have a different error action on running out of memory.
! 6255:
! 6256: This feature is available in the Berkeley compatibility library (@pxref{BSD
! 6257: Compatible Functions}) as well as the main GMP library.
! 6258:
! 6259: @deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t))
! 6260: Replace the current allocation functions with those given as arguments. If an argument
! 6261: is @code{NULL}, the corresponding default function is used.
! 6262:
! 6263: These functions will be used for all memory allocation done by GMP, apart from
! 6264: temporary space from @code{alloca} if that function is available and GMP is
! 6265: configured to use it (@pxref{Build Options}).
! 6266:
! 6267: @strong{Be sure to call @code{mp_set_memory_functions} only when there are no
! 6268: active GMP objects allocated using the previous memory functions! Usually
! 6269: that means calling it before any other GMP function.}
! 6270: @end deftypefun
! 6271:
! 6272: The functions supplied should fit the following declarations:
! 6273:
! 6274: @deftypefun {void *} allocate_function (size_t @var{alloc_size})
! 6275: Return a pointer to newly allocated space with at least @var{alloc_size}
! 6276: bytes.
! 6277: @end deftypefun
! 6278:
! 6279: @deftypefun {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size})
! 6280: Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be
! 6281: @var{new_size} bytes.
! 6282:
! 6283: The block may be moved if necessary or if desired, and in that case the
! 6284: smaller of @var{old_size} and @var{new_size} bytes must be copied to the new
! 6285: location. The return value is a pointer to the resized block, that being the
! 6286: new location if moved or just @var{ptr} if not.
! 6287:
! 6288: @var{ptr} is never @code{NULL}, it's always a previously allocated block.
! 6289: @var{new_size} may be bigger or smaller than @var{old_size}.
! 6290: @end deftypefun
! 6291:
! 6292: @deftypefun void deallocate_function (void *@var{ptr}, size_t @var{size})
! 6293: De-allocate the space pointed to by @var{ptr}.
! 6294:
! 6295: @var{ptr} is never @code{NULL}, it's always a previously allocated block of
! 6296: @var{size} bytes.
! 6297: @end deftypefun
! 6298:
! 6299: A @dfn{byte} here means the unit used by the @code{sizeof} operator.
! 6300:
! 6301: The @var{old_size} parameter to @var{reallocate_function} and the @var{size}
! 6302: parameter to @var{deallocate_function} are passed for convenience, but of
! 6303: course can be ignored if not needed. The default functions using
! 6304: @code{malloc} and friends for instance don't use them.
! 6305:
! 6306: No error return is allowed from any of these functions; if they return then
! 6307: they must have performed the specified operation. In particular note that
! 6308: @var{allocate_function} or @var{reallocate_function} mustn't return
! 6309: @code{NULL}.
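
A minimal sketch of three functions fitting the declarations above might
look as follows. The names @code{my_allocate}, @code{my_reallocate},
@code{my_deallocate} and @code{out_of_memory} are hypothetical, and the sketch
assumes requested sizes are non-zero (a zero-size @code{malloc} or
@code{realloc} may legitimately return @code{NULL}). On failure the functions
print a message and terminate rather than return.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: report the failure and terminate; never returns.
static void out_of_memory (std::size_t size)
{
  std::fprintf (stderr, "my_app: cannot allocate %lu bytes\n",
                (unsigned long) size);
  std::abort ();
}

// Fits the allocate_function declaration: never returns NULL.
void *my_allocate (std::size_t alloc_size)
{
  void *p = std::malloc (alloc_size);
  if (p == NULL)
    out_of_memory (alloc_size);
  return p;
}

// Fits the reallocate_function declaration; old_size is not needed here.
void *my_reallocate (void *ptr, std::size_t old_size, std::size_t new_size)
{
  assert (ptr != NULL);
  (void) old_size;
  void *p = std::realloc (ptr, new_size);   // copies the old contents if moved
  if (p == NULL)
    out_of_memory (new_size);
  return p;
}

// Fits the deallocate_function declaration; size is not needed here.
void my_deallocate (void *ptr, std::size_t size)
{
  assert (ptr != NULL);
  (void) size;
  std::free (ptr);
}
```

In a program linked with GMP these would be installed with
@code{mp_set_memory_functions (my_allocate, my_reallocate, my_deallocate)},
normally before any other GMP function is called.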
! 6310:
! 6311: Getting a different fatal error action is a good use for custom allocation
! 6312: functions, for example giving a graphical dialog rather than the default print
! 6313: to @code{stderr}. How much is possible when genuinely out of memory is
! 6314: another question though.
! 6315:
! 6316: There's currently no defined way for the allocation functions to recover from
! 6317: an error such as out of memory; they must terminate program execution. A
! 6318: @code{longjmp} or throwing a C++ exception will have undefined results. This
! 6319: may change in the future.
! 6320:
! 6321: GMP may use allocated blocks to hold pointers to other allocated blocks. This
! 6322: will limit the assumptions a conservative garbage collection scheme can make.
! 6323:
! 6324: Since the default GMP allocation uses @code{malloc} and friends, those
! 6325: functions will be linked in even if the first thing a program does is an
! 6326: @code{mp_set_memory_functions}. It's necessary to change the GMP sources if
! 6327: this is a problem.
! 6328:
! 6329:
! 6330: @node Language Bindings, Algorithms, Custom Allocation, Top
! 6331: @chapter Language Bindings
! 6332:
! 6333: The following packages and projects offer access to GMP from languages other
! 6334: than C, though perhaps with varying levels of functionality and efficiency.
! 6335:
! 6336: @c GNUstep Base Library @uref{http://www.gnustep.org} (version 0.9.1) is
! 6337: @c intending to use GMP for its NSDecimal class, which would be an Objective
! 6338: @c C binding for GMP. Has some configure stuff ready, but no code.
! 6339:
! 6340: @c @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces
! 6341: @c in tex, just to separate the URL from the preceding text a bit.
! 6342: @iftex
! 6343: @macro spaceuref {U}
! 6344: @ @ @uref{\U\}
! 6345: @end macro
! 6346: @end iftex
! 6347: @ifnottex
! 6348: @macro spaceuref {U}
! 6349: @uref{\U\}
! 6350: @end macro
! 6351: @end ifnottex
! 6352:
! 6353: @sp 1
! 6354: @table @asis
! 6355: @item C++
! 6356: @itemize @bullet
! 6357: @item
! 6358: GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward
! 6359: interface, expression templates to eliminate temporaries.
! 6360: @item
! 6361: ALP @spaceuref{http://www.inria.fr/saga/logiciels/ALP} @* Linear algebra and
! 6362: polynomials using templates.
! 6363: @item
! 6364: Arithmos @spaceuref{http://win-www.uia.ac.be/u/cant/arithmos} @* Rationals
! 6365: with infinities and square roots.
! 6366: @item
! 6367: CLN @spaceuref{http://clisp.cons.org/~haible/packages-cln.html} @* High level
! 6368: classes for arithmetic.
! 6369: @item
! 6370: LiDIA @spaceuref{http://www.informatik.tu-darmstadt.de/TI/LiDIA} @* A C++
! 6371: library for computational number theory.
! 6372: @item
! 6373: Linbox @spaceuref{http://www.linalg.org} @* Sparse vectors and matrices.
! 6374: @item
! 6375: NTL @spaceuref{http://www.shoup.net/ntl} @* A C++ number theory library.
! 6376: @end itemize
! 6377:
! 6378: @item Fortran
! 6379: @itemize @bullet
! 6380: @item
! 6381: Omni F77 @spaceuref{http://pdplab.trc.rwcp.or.jp/pdperf/Omni/home.html} @*
! 6382: Arbitrary precision floats.
! 6383: @end itemize
! 6384:
! 6385: @item Haskell
! 6386: @itemize @bullet
! 6387: @item
! 6388: Glasgow Haskell Compiler @spaceuref{http://www.haskell.org/ghc}
! 6389: @end itemize
! 6390:
! 6391: @item Java
! 6392: @itemize @bullet
! 6393: @item
! 6394: Kaffe @spaceuref{http://www.kaffe.org}
! 6395: @item
! 6396: Kissme @spaceuref{http://kissme.sourceforge.net}
! 6397: @end itemize
! 6398:
! 6399: @item Lisp
! 6400: @itemize @bullet
! 6401: @item
! 6402: GNU Common Lisp @spaceuref{http://www.gnu.org/software/gcl/gcl.html} @* In the
! 6403: process of switching to GMP for bignums.
! 6404: @item
! 6405: Librep @spaceuref{http://librep.sourceforge.net}
! 6406: @end itemize
! 6407:
! 6408: @item M4
! 6409: @itemize @bullet
! 6410: @item
! 6411: GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu} @* Optionally provides
! 6412: an arbitrary precision @code{mpeval}.
! 6413: @end itemize
! 6414:
! 6415: @item ML
! 6416: @itemize @bullet
! 6417: @item
! 6418: MLton compiler @spaceuref{http://www.mlton.org}
! 6419: @end itemize
! 6420:
! 6421: @item Oz
! 6422: @itemize @bullet
! 6423: @item
! 6424: Mozart @spaceuref{http://www.mozart-oz.org}
! 6425: @end itemize
! 6426:
! 6427: @item Pascal
! 6428: @itemize @bullet
! 6429: @item
! 6430: GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de} @* GMP unit.
! 6431: @end itemize
! 6432:
! 6433: @item Perl
! 6434: @itemize @bullet
! 6435: @item
! 6436: GMP module, see @file{demos/perl} in the GMP sources.
! 6437: @item
! 6438: Math::GMP @spaceuref{http://www.cpan.org} @* Compatible with Math::BigInt, but
! 6439: not as many functions as the GMP module above.
! 6440: @item
! 6441: Math::BigInt::GMP @spaceuref{http://www.cpan.org} @* Plug Math::GMP into
! 6442: normal Math::BigInt operations.
! 6443: @end itemize
! 6444:
! 6445: @need 1000
! 6446: @item Pike
! 6447: @itemize @bullet
! 6448: @item
! 6449: mpz module in the standard distribution, @uref{http://pike.idonex.com}
! 6450: @end itemize
! 6451:
! 6452: @need 500
! 6453: @item Prolog
! 6454: @itemize @bullet
! 6455: @item
! 6456: SWI Prolog @spaceuref{http://www.swi.psy.uva.nl/projects/SWI-Prolog} @*
! 6457: Arbitrary precision floats.
! 6458: @end itemize
! 6459:
! 6460: @item Python
! 6461: @itemize @bullet
! 6462: @item
! 6463: mpz module in the standard distribution, @uref{http://www.python.org}
! 6464: @item
! 6465: GMPY @uref{http://gmpy.sourceforge.net}
! 6466: @end itemize
! 6467:
! 6468: @item Scheme
! 6469: @itemize @bullet
! 6470: @item
! 6471: RScheme @spaceuref{http://www.rscheme.org}
! 6472: @item
! 6473: STklos @spaceuref{http://kaolin.unice.fr/STklos}
! 6474: @end itemize
! 6475:
! 6476: @item Smalltalk
! 6477: @itemize @bullet
! 6478: @item
! 6479: GNU Smalltalk @spaceuref{http://www.smalltalk.org/versions/GNUSmalltalk.html}
! 6480: @end itemize
! 6481:
! 6482: @item Other
! 6483: @itemize @bullet
! 6484: @item
! 6485: DrGenius @spaceuref{http://drgenius.seul.org} @* Geometry system and
! 6486: mathematical programming language.
! 6487: @item
! 6488: GiNaC @spaceuref{http://www.ginac.de} @* C++ computer algebra using CLN.
! 6489: @item
! 6490: Maxima @uref{http://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
! 6491: computer algebra using GCL.
! 6492: @item
! 6493: Q @spaceuref{http://www.musikwissenschaft.uni-mainz.de/~ag/q} @* Equational
! 6494: programming system.
! 6495: @item
! 6496: Regina @spaceuref{http://regina.sourceforge.net} @* Topological calculator.
! 6497: @item
! 6498: Yacas @spaceuref{http://www.xs4all.nl/~apinkus/yacas.html} @* Yet another
! 6499: computer algebra system.
! 6500: @end itemize
! 6501:
! 6502: @end table
! 6503:
! 6504:
! 6505: @node Algorithms, Internals, Language Bindings, Top
! 6506: @chapter Algorithms
! 6507: @cindex Algorithms
! 6508:
! 6509: This chapter is an introduction to some of the algorithms used for various GMP
! 6510: operations. The code is likely to be hard to understand without knowing
! 6511: something about the algorithms.
! 6512:
! 6513: Some GMP internals are mentioned, but applications that expect to be
! 6514: compatible with future GMP releases should take care to use only the
! 6515: documented functions.
! 6516:
! 6517: @menu
! 6518: * Multiplication Algorithms::
! 6519: * Division Algorithms::
! 6520: * Greatest Common Divisor Algorithms::
! 6521: * Powering Algorithms::
! 6522: * Root Extraction Algorithms::
! 6523: * Radix Conversion Algorithms::
! 6524: * Other Algorithms::
! 6525: * Assembler Coding::
! 6526: @end menu
! 6527:
! 6528:
! 6529: @node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms
! 6530: @section Multiplication
! 6531: @cindex Multiplication algorithms
! 6532:
! 6533: N@cross{}N limb multiplications and squares are done using one of four
! 6534: algorithms, as the size N increases.
! 6535:
! 6536: @quotation
! 6537: @multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 6538: @item Algorithm @tab Threshold
! 6539: @item Basecase @tab (none)
! 6540: @item Karatsuba @tab @code{MUL_KARATSUBA_THRESHOLD}
! 6541: @item Toom-3 @tab @code{MUL_TOOM3_THRESHOLD}
! 6542: @item FFT @tab @code{MUL_FFT_THRESHOLD}
! 6543: @end multitable
! 6544: @end quotation
! 6545:
! 6546: Similarly for squaring, with the @code{SQR} thresholds. Note though that the
! 6547: FFT is only used if GMP is configured with @samp{--enable-fft} (@pxref{Build
! 6548: Options}).
! 6549:
! 6550: N@cross{}M multiplications of operands with different sizes above
! 6551: @code{MUL_KARATSUBA_THRESHOLD} are currently done by splitting into M@cross{}M
! 6552: pieces. The Karatsuba and Toom-3 routines then operate only on equal size
! 6553: operands. This is not very efficient, and is slated for improvement in the
! 6554: future.
! 6555:
! 6556: @menu
! 6557: * Basecase Multiplication::
! 6558: * Karatsuba Multiplication::
! 6559: * Toom-Cook 3-Way Multiplication::
! 6560: * FFT Multiplication::
! 6561: * Other Multiplication::
! 6562: @end menu
! 6563:
! 6564:
! 6565: @node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms
! 6566: @subsection Basecase Multiplication
! 6567:
! 6568: Basecase N@cross{}M multiplication is a straightforward rectangular set of
! 6569: cross-products, the same as long multiplication done by hand and for that
! 6570: reason sometimes known as the schoolbook or grammar school method. This is an
! 6571: @m{O(NM),O(N*M)} algorithm. See Knuth section 4.3.1 algorithm M
! 6572: (@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code.
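
As a rough illustration of the rectangular set of cross-products, the
following Python sketch multiplies two little-endian limb lists row by row.
It is not the GMP code: the 32-bit limb size and the names @code{to_limbs},
@code{from_limbs} and @code{mul_basecase_sketch} are assumptions made for the
example.

@example
```python
LIMB_BITS = 32                      # assumed limb size, for illustration only
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(value, count):
    """Split a non-negative integer into `count` little-endian limbs."""
    return [(value >> (LIMB_BITS * i)) & LIMB_MASK for i in range(count)]


def from_limbs(limbs):
    """Recombine little-endian limbs into a Python integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def mul_basecase_sketch(u, v):
    """Return the len(u)+len(v) limb product of limb lists u and v."""
    w = [0] * (len(u) + len(v))
    for j, vj in enumerate(v):
        carry = 0
        for i, ui in enumerate(u):
            # One cross-product plus the limb already there plus the carry;
            # the total always fits in two limbs.
            t = ui * vj + w[i + j] + carry
            w[i + j] = t & LIMB_MASK
            carry = t >> LIMB_BITS
        w[j + len(u)] = carry
    return w
```
@end example

Each of the N*M limb products is formed exactly once, which is where the
O(N*M) cost comes from.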
! 6573:
! 6574: Assembler implementations of @code{mpn_mul_basecase} are essentially the same
! 6575: as the generic C code, but have all the usual assembler tricks and
! 6576: obscurities introduced for speed.
! 6577:
! 6578: A square can be done in roughly half the time of a multiply, by using the fact
! 6579: that the cross products above and below the diagonal are the same. A triangle
! 6580: of products below the diagonal is formed, doubled (left shift by one bit), and
! 6581: then the products on the diagonal added. This can be seen in
! 6582: @file{mpn/generic/sqr_basecase.c}. Again the assembler implementations take
! 6583: essentially the same approach.
! 6584:
! 6585: @tex
! 6586: \def\GMPline#1#2#3#4#5#6{%
! 6587: \hbox {%
! 6588: \vrule height 2.5ex depth 1ex
! 6589: \hbox to 2em {\hfil{#2}\hfil}%
! 6590: \vrule \hbox to 2em {\hfil{#3}\hfil}%
! 6591: \vrule \hbox to 2em {\hfil{#4}\hfil}%
! 6592: \vrule \hbox to 2em {\hfil{#5}\hfil}%
! 6593: \vrule \hbox to 2em {\hfil{#6}\hfil}%
! 6594: \vrule}}
! 6595: \GMPdisplay{
! 6596: \hbox{%
! 6597: \vbox{%
! 6598: \hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}%
! 6599: \hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}%
! 6600: \hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}%
! 6601: \hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}%
! 6602: \hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}%
! 6603: \hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}%
! 6604: \vfill}%
! 6605: \vbox{%
! 6606: \hbox{%
! 6607: \hbox to 2em {\hfil u0\hfil}%
! 6608: \hbox to 2em {\hfil u1\hfil}%
! 6609: \hbox to 2em {\hfil u2\hfil}%
! 6610: \hbox to 2em {\hfil u3\hfil}%
! 6611: \hbox to 2em {\hfil u4\hfil}}%
! 6612: \vskip 0.7ex
! 6613: \hrule
! 6614: \GMPline{u0}{d}{}{}{}{}%
! 6615: \hrule
! 6616: \GMPline{u1}{}{d}{}{}{}%
! 6617: \hrule
! 6618: \GMPline{u2}{}{}{d}{}{}%
! 6619: \hrule
! 6620: \GMPline{u3}{}{}{}{d}{}%
! 6621: \hrule
! 6622: \GMPline{u4}{}{}{}{}{d}%
! 6623: \hrule}}}
! 6624: @end tex
! 6625: @ifnottex
! 6626: @example
! 6627: @group
! 6628: u0 u1 u2 u3 u4
! 6629: +---+---+---+---+---+
! 6630: u0 | d | | | | |
! 6631: +---+---+---+---+---+
! 6632: u1 | | d | | | |
! 6633: +---+---+---+---+---+
! 6634: u2 | | | d | | |
! 6635: +---+---+---+---+---+
! 6636: u3 | | | | d | |
! 6637: +---+---+---+---+---+
! 6638: u4 | | | | | d |
! 6639: +---+---+---+---+---+
! 6640: @end group
! 6641: @end example
! 6642: @end ifnottex
! 6643:
! 6644: In practice squaring isn't a full 2@cross{} faster than multiplying; it's
! 6645: usually around 1.5@cross{}. Less than 1.5@cross{} probably indicates that
! 6646: @code{mpn_sqr_basecase} could be improved on that CPU.
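
The triangle-and-diagonal idea can be sketched in Python as follows. The limb
size and the name @code{sqr_basecase_sketch} are assumptions for the example,
and Python integers stand in for the limb-by-limb accumulation done by the
real code. The count returned alongside the square shows the saving in limb
products.

@example
```python
LIMB_BITS = 32  # assumed limb size, for illustration only


def sqr_basecase_sketch(u):
    """Square little-endian limb list u; return (square, limb products used)."""
    n = len(u)
    products = 0
    # Cross products u[i]*u[j] for i < j only (one triangle), each placed
    # at the position given by its limb indices.
    triangle = 0
    for i in range(n):
        for j in range(i + 1, n):
            triangle += (u[i] * u[j]) << (LIMB_BITS * (i + j))
            products += 1
    # Double the triangle with a one-bit left shift, then add the diagonal.
    result = triangle << 1
    for i in range(n):
        result += (u[i] * u[i]) << (2 * LIMB_BITS * i)
        products += 1
    return result, products
```
@end example

For 5 limbs this uses 15 limb products where a general multiply would use 25.
The ratio approaches one half as the size grows, but the shift and the
diagonal additions are extra work, which is consistent with the smaller
speedup seen in practice.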
! 6647:
! 6648: On some CPUs @code{mpn_mul_basecase} can be faster than the generic C
! 6649: @code{mpn_sqr_basecase}. @code{SQR_BASECASE_THRESHOLD} is the size at which
! 6650: to start using @code{mpn_sqr_basecase}; it will be zero if that routine should
! 6651: always be used.
! 6652:
! 6653:
! 6654: @node Karatsuba Multiplication, Toom-Cook 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms
! 6655: @subsection Karatsuba Multiplication
! 6656:
! 6657: The Karatsuba multiplication algorithm is described in Knuth section 4.3.3
! 6658: part A, and various other textbooks. A brief description is given here.
! 6659:
! 6660: The inputs @math{x} and @math{y} are treated as each split into two parts of
! 6661: equal length (or the most significant part one limb shorter if N is odd).
! 6662:
! 6663: @tex
! 6664: % GMPboxwidth used for all the multiplication pictures
! 6665: \global\newdimen\GMPboxwidth \global\GMPboxwidth=5em
! 6666: % GMPboxdepth and GMPboxheight are also used for the float pictures
! 6667: \global\newdimen\GMPboxdepth \global\GMPboxdepth=1ex
! 6668: \global\newdimen\GMPboxheight \global\GMPboxheight=2ex
! 6669: \gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth}
! 6670: \def\GMPbox#1#2{%
! 6671: \vbox {%
! 6672: \hrule
! 6673: \hbox to 2\GMPboxwidth{%
! 6674: \GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}%
! 6675: \hrule}}
! 6676: \GMPdisplay{%
! 6677: \vbox{%
! 6678: \hbox to 2\GMPboxwidth {high \hfil low}
! 6679: \vskip 0.7ex
! 6680: \GMPbox{x_1}{x_0}
! 6681: \vskip 0.5ex
! 6682: \GMPbox{y_1}{y_0}
! 6683: }}
! 6684: @end tex
! 6685: @ifnottex
! 6686: @example
! 6687: @group
! 6688: high low
! 6689: +----------+----------+
! 6690: | x1 | x0 |
! 6691: +----------+----------+
! 6692:
! 6693: +----------+----------+
! 6694: | y1 | y0 |
! 6695: +----------+----------+
! 6696: @end group
! 6697: @end example
! 6698: @end ifnottex
! 6699:
! 6700: Let @math{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is
! 6701: @math{k} limbs (@ms{y,0} the same) then
! 6702: @m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
! 6703: With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the
! 6704: following holds,
! 6705:
! 6706: @display
! 6707: @m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0,
! 6708: x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0}
! 6709: @end display
! 6710:
! 6711: This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs,
! 6712: whereas a basecase multiply of N@cross{}N limbs is equivalent to four
! 6713: multiplies of (N/2)@cross{}(N/2). The factors @math{(b^2+b)} etc represent
! 6714: the positions where the three products must be added.
! 6715:
! 6716: @tex
! 6717: \def\GMPboxA#1#2{%
! 6718: \vbox{%
! 6719: \hrule
! 6720: \hbox{%
! 6721: \GMPvrule
! 6722: \hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}%
! 6723: \vrule
! 6724: \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
! 6725: \vrule}
! 6726: \hrule}}
! 6727: \def\GMPboxB#1#2{%
! 6728: \hbox{%
! 6729: \raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}%
! 6730: \vbox{%
! 6731: \hrule
! 6732: \hbox{%
! 6733: \GMPvrule
! 6734: \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
! 6735: \vrule}%
! 6736: \hrule}}}
! 6737: \GMPdisplay{%
! 6738: \vbox{%
! 6739: \hbox to 4\GMPboxwidth {high \hfil low}
! 6740: \vskip 0.7ex
! 6741: \GMPboxA{x_1y_1}{x_0y_0}
! 6742: \vskip 0.5ex
! 6743: \GMPboxB{$+$}{x_1y_1}
! 6744: \vskip 0.5ex
! 6745: \GMPboxB{$+$}{x_0y_0}
! 6746: \vskip 0.5ex
! 6747: \GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)}
! 6748: }}
! 6749: @end tex
! 6750: @ifnottex
! 6751: @example
! 6752: @group
! 6753: high low
! 6754: +--------+--------+ +--------+--------+
! 6755: | x1*y1 | | x0*y0 |
! 6756: +--------+--------+ +--------+--------+
! 6757: +--------+--------+
! 6758: add | x1*y1 |
! 6759: +--------+--------+
! 6760: +--------+--------+
! 6761: add | x0*y0 |
! 6762: +--------+--------+
! 6763: +--------+--------+
! 6764: sub | (x1-x0)*(y1-y0) |
! 6765: +--------+--------+
! 6766: @end group
! 6767: @end example
! 6768: @end ifnottex
! 6769:
! 6770: The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an
! 6771: absolute value, and the sign used to choose to add or subtract. Notice the
! 6772: sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1),
! 6773: high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb
! 6774: additions, rather than @m{6k,6*k}, but in GMP extra function call overheads
! 6775: outweigh the saving.
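
The formula can be tried out with a short recursive Python sketch. This is an
illustration under stated assumptions rather than the GMP routine: it splits
on bit counts instead of limbs, @code{CUTOFF_BITS} is a hypothetical stand-in
for the basecase threshold, and Python's own multiply plays the part of the
basecase.

@example
```python
CUTOFF_BITS = 32  # hypothetical stand-in for the basecase threshold


def karatsuba_sketch(x, y, bits):
    """Multiply non-negative x and y, each of at most `bits` bits."""
    if bits <= CUTOFF_BITS:
        return x * y
    half = (bits + 1) // 2          # low part; high part may be shorter
    b = 1 << half
    x1, x0 = x >> half, x & (b - 1)
    y1, y0 = y >> half, y & (b - 1)
    high = karatsuba_sketch(x1, y1, bits - half)
    low = karatsuba_sketch(x0, y0, half)
    # Middle term computed as an absolute value, with the sign kept apart.
    dx, dy = x1 - x0, y1 - y0
    negative = (dx < 0) != (dy < 0)
    middle = karatsuba_sketch(abs(dx), abs(dy), half)
    if negative:
        middle = -middle
    # x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0
    return (b * b + b) * high - b * middle + (b + 1) * low
```
@end example

Only three recursive calls are made per level, as against the four half-size
products a basecase multiply amounts to.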
! 6776:
! 6777: Squaring is similar to multiplying, but with @math{x=y} the formula reduces to
! 6778: an equivalent with three squares,
! 6779:
! 6780: @display
! 6781: @m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2,
! 6782: x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2}
! 6783: @end display
! 6784:
! 6785: The final result is accumulated from those three squares the same way as for
! 6786: the three multiplies above. The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now
! 6787: never negative, so no sign handling is needed.
! 6788:
! 6789: A similar formula for both multiplying and squaring can be constructed with a
! 6790: middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}. But those sums can exceed
! 6791: @math{k} limbs, leading to more carry handling and additions than the form
! 6792: above.
! 6793:
! 6794: Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm,
! 6795: the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies
! 6796: each 1/2 the size of the inputs. This is a big improvement over the basecase
! 6797: multiply at @math{O(N^2)} and the advantage soon overcomes the extra additions
! 6798: Karatsuba performs.
! 6799:
! 6800: @code{MUL_KARATSUBA_THRESHOLD} can be as little as 10 limbs. The @code{SQR}
! 6801: threshold is usually about twice the @code{MUL}. The basecase algorithm will
! 6802: take a time of the form @m{M(N) = aN^2 + bN + c, M(N) = a*N^2 + b*N + c} and
! 6803: the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN + e, K(N) = 3*M(N/2) + d*N +
! 6804: e}. @math{K(N)} carries only 3/4 of the @math{a} term, so per-crossproduct speedups in the
! 6805: basecase code help @math{M(N)} more and increase the threshold, but @math{K(N)} carries 3/2 of the
! 6806: @math{b} term, so linear style speedups will actually decrease it. The latter can be seen for instance when
! 6807: adding an optimized @code{mpn_sqr_diagonal} to @code{mpn_sqr_basecase}. Of
! 6808: course all speedups reduce total time, and in that sense the algorithm
! 6809: thresholds are merely of academic interest.
! 6810:
! 6811:
! 6812: @node Toom-Cook 3-Way Multiplication, FFT Multiplication, Karatsuba Multiplication, Multiplication Algorithms
! 6813: @subsection Toom-Cook 3-Way Multiplication
! 6814:
! 6815: The Karatsuba formula is the simplest case of a general approach to splitting
! 6816: inputs that leads to both Toom-Cook and FFT algorithms. A description of
! 6817: Toom-Cook can be found in Knuth section 4.3.3, with an example 3-way
! 6818: calculation after Theorem A. The 3-way form used in GMP is described here.
! 6819:
! 6820: The operands are each considered split into 3 pieces of equal length (or the
! 6821: most significant part 1 or 2 limbs shorter than the others).
! 6822:
! 6823: @tex
! 6824: \def\GMPbox#1#2#3{%
! 6825: \vbox{%
! 6826: \hrule \vfil
! 6827: \hbox to 3\GMPboxwidth {%
! 6828: \GMPvrule
! 6829: \hfil$#1$\hfil
! 6830: \vrule
! 6831: \hfil$#2$\hfil
! 6832: \vrule
! 6833: \hfil$#3$\hfil
! 6834: \vrule}%
! 6835: \vfil \hrule
! 6836: }}
! 6837: \GMPdisplay{%
! 6838: \vbox{%
! 6839: \hbox to 3\GMPboxwidth {high \hfil low}
! 6840: \vskip 0.7ex
! 6841: \GMPbox{x_2}{x_1}{x_0}
! 6842: \vskip 0.5ex
! 6843: \GMPbox{y_2}{y_1}{y_0}
! 6844: \vskip 0.5ex
! 6845: }}
! 6846: @end tex
! 6847: @ifnottex
! 6848: @example
! 6849: @group
! 6850: high low
! 6851: +----------+----------+----------+
! 6852: | x2 | x1 | x0 |
! 6853: +----------+----------+----------+
! 6854:
! 6855: +----------+----------+----------+
! 6856: | y2 | y1 | y0 |
! 6857: +----------+----------+----------+
! 6858: @end group
! 6859: @end example
! 6860: @end ifnottex
! 6861:
! 6862: @noindent
! 6863: These parts are treated as the coefficients of two polynomials
! 6864:
! 6865: @display
! 6866: @group
! 6867: @m{X(t) = x_2t^2 + x_1t + x_0,
! 6868: X(t) = x2*t^2 + x1*t + x0}
! 6869: @m{Y(t) = y_2t^2 + y_1t + y_0,
! 6870: Y(t) = y2*t^2 + y1*t + y0}
! 6871: @end group
! 6872: @end display
! 6873:
! 6874: Again let @math{b} equal the power of 2 which is the size of the @ms{x,0},
! 6875: @ms{x,1}, @ms{y,0} and @ms{y,1} pieces, i.e.@: if they're @math{k} limbs each
! 6876: then @m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}},
! 6877: b=2^(k*mp_bits_per_limb)}. With this @math{x=X(b)} and @math{y=Y(b)}.
! 6878:
! 6879: Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients
! 6880: are
! 6881:
! 6882: @display
! 6883: @m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0,
! 6884: W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0}
! 6885: @end display
! 6886:
! 6887: @noindent
! 6888: The @m{w_i,w[i]} are going to be determined, and when they are they'll give
! 6889: the final result using @math{w=W(b)}, since
! 6890: @m{xy=X(b)Y(b)=W(b),x*y=X(b)*Y(b)=W(b)}. The coefficients will be roughly
! 6891: @math{b^2} each, and the final @math{W(b)} will be an addition like,
! 6892:
! 6893: @tex
! 6894: \def\GMPbox#1#2{%
! 6895: \moveright #1\GMPboxwidth
! 6896: \vbox{%
! 6897: \hrule
! 6898: \hbox{%
! 6899: \GMPvrule
! 6900: \hbox to 2\GMPboxwidth {\hfil$#2$\hfil}%
! 6901: \vrule}%
! 6902: \hrule
! 6903: }}
! 6904: \GMPdisplay{%
! 6905: \vbox{%
! 6906: \hbox to 6\GMPboxwidth {high \hfil low}%
! 6907: \vskip 0.7ex
! 6908: \GMPbox{0}{w_4}
! 6909: \vskip 0.5ex
! 6910: \GMPbox{1}{w_3}
! 6911: \vskip 0.5ex
! 6912: \GMPbox{2}{w_2}
! 6913: \vskip 0.5ex
! 6914: \GMPbox{3}{w_1}
! 6915: \vskip 0.5ex
! 6916: \GMPbox{4}{w_0}
! 6917: }}
! 6918: @end tex
! 6919: @ifnottex
! 6920: @example
! 6921: @group
! 6922: high low
! 6923: +-------+-------+
! 6924: | w4 |
! 6925: +-------+-------+
! 6926: +--------+-------+
! 6927: | w3 |
! 6928: +--------+-------+
! 6929: +--------+-------+
! 6930: | w2 |
! 6931: +--------+-------+
! 6932: +--------+-------+
! 6933: | w1 |
! 6934: +--------+-------+
! 6935: +-------+-------+
! 6936: | w0 |
! 6937: +-------+-------+
! 6938: @end group
! 6939: @end example
! 6940: @end ifnottex
! 6941:
! 6942: The @m{w_i,w[i]} coefficients could be formed by a simple set of cross
! 6943: products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2},
! 6944: @m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all
! 6945: nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely
! 6946: to a basecase multiply. Instead the following approach is used.
! 6947:
! 6948: @math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving
! 6949: values of @math{W(t)} at those points. The points used can be chosen in
! 6950: various ways, but in GMP the following are used
! 6951:
! 6952: @quotation
! 6953: @multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
! 6954: @item Point @tab Value
! 6955: @item @math{t=0} @tab @m{x_0y_0,x0*y0}, which gives @ms{w,0} immediately
! 6956: @item @math{t=2} @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0)*(4*y2+2*y1+y0)}
! 6957: @item @math{t=1} @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0)*(y2+y1+y0)}
! 6958: @item @m{t={1\over2},t=1/2} @tab @m{(x_2+2x_1+4x_0)(y_2+2y_1+4y_0),(x2+2*x1+4*x0)*(y2+2*y1+4*y0)}
! 6959: @item @m{t=\infty,t=inf} @tab @m{x_2y_2,x2*y2}, which gives @ms{w,4} immediately
! 6960: @end multitable
! 6961: @end quotation
! 6962:
! 6963: At @m{t={1\over2},t=1/2} the value calculated is actually
! 6964: @m{16X({1\over2})Y({1\over2}), 16*X(1/2)*Y(1/2)}, giving a value for
! 6965: @m{16W({1\over2}),16*W(1/2)}, and this is always an integer. At
! 6966: @m{t=\infty,t=inf} the value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over
! 6967: t^4}, X(t)*Y(t)/t^4 in the limit as t approaches infinity}, but it's much
! 6968: easier to think of as simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately
! 6969: (much like @m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately).
! 6970:
! 6971: Now each of the points substituted into
! 6972: @m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination
! 6973: of the @m{w_i,w[i]} coefficients, and the value of those combinations has just
! 6974: been calculated.
! 6975:
! 6976: @tex
! 6977: \GMPdisplay{%
! 6978: $\matrix{%
! 6979: W(0) & = & & & & & & & & & w_0 \cr
! 6980: 16W({1\over2}) & = & w_4 & + & 2w_3 & + & 4w_2 & + & 8w_1 & + & 16w_0 \cr
! 6981: W(1) & = & w_4 & + & w_3 & + & w_2 & + & w_1 & + & w_0 \cr
! 6982: W(2) & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr
! 6983: W(\infty) & = & w_4 \cr
! 6984: }$}
! 6985: @end tex
! 6986: @ifnottex
! 6987: @example
! 6988: @group
! 6989: W(0) = w0
! 6990: 16*W(1/2) = w4 + 2*w3 + 4*w2 + 8*w1 + 16*w0
! 6991: W(1) = w4 + w3 + w2 + w1 + w0
! 6992: W(2) = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0
! 6993: W(inf) = w4
! 6994: @end group
! 6995: @end example
! 6996: @end ifnottex
! 6997:
! 6998: This is a set of five equations in five unknowns, and some elementary linear
! 6999: algebra quickly isolates each @m{w_i,w[i]}, by subtracting multiples of one
! 7000: equation from another.
! 7001:
! 7002: In the code the set of five values @math{W(0)},@dots{},@m{W(\infty),W(inf)}
! 7003: will represent those certain linear combinations. By adding or subtracting
! 7004: one from another as necessary, values which are each @m{w_i,w[i]} alone are
! 7005: arrived at. This involves only a few subtractions of small multiples (some of
! 7006: which are powers of 2), and so is fast. A couple of divisions remain by
! 7007: powers of 2 and one division by 3 (or by 6 rather), and that last uses the
! 7008: special @code{mpn_divexact_by3} (@pxref{Exact Division}).
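
The evaluate, multiply and interpolate steps can be sketched in Python as
below. The elimination order shown is just one that solves the five equations
with two divisions by 2 and one by 6; it is an assumption for the example and
is not claimed to be the sequence used in the GMP code. The name
@code{toom3_sketch} and the bit-based split are likewise only for
illustration.

@example
```python
def toom3_sketch(x, y, k_bits):
    """Multiply non-negative integers x and y via five pointwise products."""
    b = 1 << k_bits
    mask = b - 1
    # Split into three pieces; the high piece takes whatever is left.
    x0, x1, x2 = x & mask, (x >> k_bits) & mask, x >> (2 * k_bits)
    y0, y1, y2 = y & mask, (y >> k_bits) & mask, y >> (2 * k_bits)

    # Evaluate and multiply at t = 0, 1/2 (scaled by 16), 1, 2 and infinity.
    v_zero = x0 * y0
    v_half = (x2 + 2 * x1 + 4 * x0) * (y2 + 2 * y1 + 4 * y0)
    v_one = (x2 + x1 + x0) * (y2 + y1 + y0)
    v_two = (4 * x2 + 2 * x1 + x0) * (4 * y2 + 2 * y1 + y0)
    v_inf = x2 * y2

    # Interpolate: w0 and w4 are immediate, the rest by elimination.
    w0, w4 = v_zero, v_inf
    s = v_one - w0 - w4             # w3 + w2 + w1
    p = v_two - w0 - 16 * w4        # 8*w3 + 4*w2 + 2*w1
    q = v_half - 16 * w0 - w4       # 2*w3 + 4*w2 + 8*w1
    w2 = (10 * s - p - q) // 2      # exact division by 2
    d = (p - q) // 6                # w3 - w1, exact division by 6
    w3 = (s - w2 + d) // 2          # exact division by 2
    w1 = (s - w2 - d) // 2

    # Combine as W(b) using Horner's rule.
    return (((w4 * b + w3) * b + w2) * b + w1) * b + w0
```
@end example

All three divisions are exact, so integer division loses nothing.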
! 7009:
! 7010: In the code the values @ms{w,4}, @ms{w,2} and @ms{w,0} are formed in the
! 7011: destination with pointers @code{E}, @code{C} and @code{A}, and @ms{w,3} and
! 7012: @ms{w,1} in temporary space @code{D} and @code{B} are added to them. There
! 7013: are extra limbs @code{tD}, @code{tC} and @code{tB} at the high end of
! 7014: @ms{w,3}, @ms{w,2} and @ms{w,1} which are handled separately. The final
! 7015: addition then is as follows.
! 7016:
! 7017: @tex
! 7018: \def\GMPboxT#1{%
! 7019: \vbox{%
! 7020: \hrule
! 7021: \hbox {\GMPvrule\hskip 0.4em #1\hskip 0.4em \vrule}%
! 7022: \hrule
! 7023: }}
! 7024: \GMPdisplay{%
! 7025: \vbox{%
! 7026: \hbox to 6\GMPboxwidth {high \hfil low}%
! 7027: \vskip 0.7ex
! 7028: \vbox{%
! 7029: \hrule
! 7030: \hbox{%
! 7031: \GMPvrule
! 7032: \hbox to 2\GMPboxwidth {\hfil@code{E}\hfil}
! 7033: \vrule
! 7034: \hbox to 2\GMPboxwidth {\hfil@code{C}\hfil}
! 7035: \vrule
! 7036: \hbox to 2\GMPboxwidth {\hfil@code{A}\hfil}
! 7037: \vrule}%
! 7038: \hrule}%
! 7039: \vskip 0.5ex
! 7040: \moveright \GMPboxwidth \vbox{%
! 7041: \hrule
! 7042: \hbox to 4\GMPboxwidth {%
! 7043: \GMPvrule \hfil @code{D}\hfil
! 7044: \vrule \hfil @code{B}\hfil
! 7045: \vrule}
! 7046: \hrule}%
! 7047: \vskip 0.5ex
! 7048: \hbox{%
! 7049: \hbox to \GMPboxwidth{\hfil \GMPboxT{\code{tD}}}%
! 7050: \hbox to \GMPboxwidth{\hfil \GMPboxT{\code{tC}}}%
! 7051: \hbox to \GMPboxwidth{\hfil \GMPboxT{\code{tB}}}}
! 7052: }}
! 7053: @end tex
! 7054: @ifnottex
! 7055: @example
! 7056: @group
! 7057: high low
! 7058: +-------+-------+-------+-------+-------+-------+
! 7059: | E | C | A |
! 7060: +-------+-------+-------+-------+-------+-------+
! 7061: +------+-------++------+-------+
! 7062: | D || B |
! 7063: +------+-------++------+-------+
! 7064: -- -- --
! 7065: |tD| |tC| |tB|
! 7066: -- -- --
! 7067: @end group
! 7068: @end example
! 7069: @end ifnottex
! 7070:
! 7071: The conversion of @math{W(t)} values to the coefficients is interpolation. A
! 7072: polynomial of degree 4 like @math{W(t)} is uniquely determined by values known
! 7073: at 5 different points. The points can be chosen to make the linear equations
! 7074: come out with a convenient set of steps for isolating the @m{w_i,w[i]}.
! 7075:
! 7076: In @file{mpn/generic/mul_n.c} the @code{interpolate3} routine performs the
! 7077: interpolation. The open-coded one-pass version may be a bit hard to
! 7078: understand; the steps performed can be better seen in the @code{USE_MORE_MPN}
! 7079: version.
! 7080:
! 7081: Squaring follows the same procedure as multiplication, but there's only one
! 7082: @math{X(t)} and it's evaluated at 5 points, and those values squared to give
! 7083: values of @math{W(t)}. The interpolation is then identical, and in fact the
! 7084: same @code{interpolate3} subroutine is used for both squaring and multiplying.
! 7085:
! 7086: Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being
! 7087: @m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the
! 7088: original size. This is an improvement over Karatsuba at @math{O(N^@W{1.585})},
! 7089: though Toom-Cook does more work in the evaluation and interpolation and so it
! 7090: only realizes its advantage above a certain size.
! 7091:
! 7092: Near the crossover between Toom-3 and Karatsuba there's generally a range of
! 7093: sizes where the difference between the two is small.
! 7094: @code{MUL_TOOM3_THRESHOLD} is a somewhat arbitrary point in that range and
! 7095: successive runs of the tune program can give different values due to small
! 7096: variations in measuring. A graph of time versus size for the two shows the
! 7097: effect, see @file{tune/README}.
! 7098:
! 7099: At the fairly small sizes where the Toom-3 thresholds occur it's worth
! 7100: remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be
! 7101: expected to make accurate predictions, due of course to the big influence of
! 7102: all sorts of overheads, and the fact that only a few recursions of each are
! 7103: being performed. Even at large sizes there's a good chance machine dependent
! 7104: effects like cache architecture will mean actual performance deviates from
! 7105: what might be predicted.
! 7106:
! 7107: The formula given above for the Karatsuba algorithm has an equivalent for
! 7108: Toom-3 involving only five multiplies, but this would be complicated and
! 7109: unenlightening.
! 7110:
! 7111: An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using
! 7112: a vector to represent the @math{x} and @math{y} splits and a matrix
! 7113: multiplication for the evaluation and interpolation stages. The matrix
! 7114: inverses are not meant to be actually used, and they have elements with values
! 7115: much greater than in fact arise in the interpolation steps. The diagram shown
! 7116: for the 3-way is attractive, but again doesn't have to be implemented that way
! 7117: and for example with a bit of rearrangement just one division by 6 can be
! 7118: done.
! 7119:
! 7120:
! 7121: @node FFT Multiplication, Other Multiplication, Toom-Cook 3-Way Multiplication, Multiplication Algorithms
! 7122: @subsection FFT Multiplication
! 7123:
! 7124: At large to very large sizes a Fermat style FFT multiplication is used,
! 7125: following Sch@"onhage and Strassen (@pxref{References}). Descriptions of FFTs
! 7126: in various forms can be found in many textbooks, for instance Knuth section
! 7127: 4.3.3 part C or Lipson chapter IX. A brief description of the form used in
! 7128: GMP is given here.
! 7129:
! 7130: The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given
! 7131: @math{N}. A full product @m{xy,x*y} is obtained by choosing @m{N \ge
! 7132: \mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding
! 7133: @math{x} and @math{y} with high zero limbs. The modular product is the native
! 7134: form for the algorithm, so padding to get a full product is unavoidable.
! 7135:
! 7136: The algorithm follows a split, evaluate, pointwise multiply, interpolate and
! 7137: combine similar to that described above for Karatsuba and Toom-3. A @math{k}
! 7138: parameter controls the split, with an FFT-@math{k} splitting into @math{2^k}
! 7139: pieces of @math{M=N/2^k} bits each. @math{N} must be a multiple of
! 7140: @m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so
! 7141: the split falls on limb boundaries, avoiding bit shifts in the split and
! 7142: combine stages.
! 7143:
! 7144: The evaluations, pointwise multiplications, and interpolation, are all done
! 7145: modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a
! 7146: multiple of @math{2^k} and of @code{mp_bits_per_limb}. The results of
! 7147: interpolation will be the following negacyclic convolution of the input
! 7148: pieces, and the choice of @math{N'} ensures these sums aren't truncated.
! 7149: @tex
! 7150: $$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$
! 7151: @end tex
! 7152: @ifnottex
! 7153:
! 7154: @example
! 7155: ---
! 7156: \ b
! 7157: w[n] = / (-1) * x[i] * y[j]
! 7158: ---
! 7159: i+j==b*2^k+n
! 7160: b=0,1
! 7161: @end example
! 7162:
! 7163: @end ifnottex
! 7164: The points used for the evaluation are @math{g^i} for @math{i=0} to
! 7165: @math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}. @math{g} is a
! 7166: @m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary
! 7167: cancellations at the interpolation stage, and it's also a power of 2 so the
! 7168: fast Fourier transforms used for the evaluation and interpolation do only
! 7169: shifts, adds and negations.
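
The root of unity property is easy to check numerically. The sketch below
uses small, arbitrarily chosen values of @math{N'} and @math{k}; the names are
assumptions for the example.

@example
```python
def fft_root(n_prime, k):
    """Return g = 2^(2*N'/2^k), assuming 2^k divides 2*N'."""
    points = 1 << k
    assert (2 * n_prime) % points == 0
    return 1 << (2 * n_prime // points)


n_prime, k = 64, 4                  # illustrative values only
modulus = (1 << n_prime) + 1
g = fft_root(n_prime, k)

# g raised to 2^k is 1, and raised to 2^(k-1) is -1, so the order of g
# is exactly 2^k.
order_divides = pow(g, 1 << k, modulus) == 1
half_is_minus_one = pow(g, 1 << (k - 1), modulus) == modulus - 1
```
@end example

Since @math{g} is a power of 2, multiplying by any power of @math{g} is a left
shift followed by a reduction, and the reduction needs only a subtraction
because @math{2^N'} is congruent to @math{-1}.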
! 7170:
! 7171: The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either
! 7172: recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or
! 7173: basecase), whichever is optimal at the size @math{N'}. The interpolation is
! 7174: an inverse fast Fourier transform. The resulting set of sums of @m{x_iy_j,
! 7175: x[i]*y[j]} are added at appropriate offsets to give the final result.
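
To see why the negacyclic convolution gives the modular product, the
following Python sketch forms the @math{w[n]} sums directly and adds them at
their offsets. It deliberately does not use a transform, so it does
@math{4^k} piece products and only demonstrates the arithmetic identity. The
name @code{negacyclic_mul_sketch} is an assumption for the example, and both
inputs are assumed to be below @math{2^N}.

@example
```python
def negacyclic_mul_sketch(x, y, n_bits, k):
    """Return x*y mod 2^N+1 from the negacyclic convolution of 2^k pieces."""
    pieces = 1 << k
    assert n_bits % pieces == 0
    m = n_bits // pieces            # M bits per piece
    mask = (1 << m) - 1
    xs = [(x >> (m * i)) & mask for i in range(pieces)]
    ys = [(y >> (m * i)) & mask for i in range(pieces)]

    w = [0] * pieces
    for i in range(pieces):
        for j in range(pieces):
            if i + j < pieces:      # b = 0, product added
                w[i + j] += xs[i] * ys[j]
            else:                   # b = 1, wrapped around and negated
                w[i + j - pieces] -= xs[i] * ys[j]

    # Add each w[n] at offset M*n bits and reduce.
    total = sum(w[n] << (m * n) for n in range(pieces))
    return total % ((1 << n_bits) + 1)
```
@end example

The negation of wrapped terms reflects the fact that a product landing
@math{N} bits up is multiplied by @math{2^N}, which is @math{-1} modulo
@math{2^N+1}.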
! 7176:
! 7177: Squaring is the same, but @math{x} is the only input so it's one transform at
! 7178: the evaluate stage and the pointwise multiplies are squares. The
! 7179: interpolation is the same.
! 7180:
! 7181: For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}),
! 7182: O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed
! 7183: modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original.
! 7184: Each successive @math{k} is an asymptotic improvement, but overheads mean each
! 7185: is only faster at bigger and bigger sizes. In the code, @code{MUL_FFT_TABLE}
! 7186: and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used. Each
! 7187: new @math{k} effectively swaps some multiplying for some shifts, adds and
! 7188: overheads.
! 7189:
! 7190: A mod @math{2^N+1} product can be formed with a normal
! 7191: @math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT
! 7192: and Toom-3 etc can be compared directly. A @math{k=4} FFT at
! 7193: @math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at
! 7194: @math{O(N^@W{1.465})}. In practice this is what's found, with
! 7195: @code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between
! 7196: 300 and 1000 limbs, depending on the CPU. So far it's been found that only
! 7197: very large FFTs recurse into pointwise multiplies above these sizes.
! 7198:
! 7199: When an FFT is to give a full product, the change of @math{N} to @math{2N}
! 7200: doesn't alter the theoretical complexity for a given @math{k}, but for the
! 7201: purposes of considering where an FFT might be first used it can be assumed
! 7202: that the FFT is recursing into a normal multiply and that on that basis it's
! 7203: doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of
! 7204: the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}. This would mean
! 7205: @math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3.
! 7206: In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been
! 7207: found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs.
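
The two exponent formulas above can be compared against Toom-3 with a few
lines of Python. This is only the asymptotic arithmetic, with hypothetical
function names, and says nothing about where the measured thresholds fall.

@example
```python
import math

TOOM3_EXPONENT = math.log(5) / math.log(3)      # about 1.465


def modular_exponent(k):
    """Exponent for a mod 2^N+1 product with an FFT-k."""
    return k / (k - 1)


def full_product_exponent(k):
    """Exponent when the FFT-k is giving a full product."""
    return k / (k - 2)


def first_k_beating_toom3(exponent):
    """Smallest k >= 3 whose exponent is below the Toom-3 exponent."""
    k = 3
    while exponent(k) >= TOOM3_EXPONENT:
        k += 1
    return k


first_modular = first_k_beating_toom3(modular_exponent)
first_full = first_k_beating_toom3(full_product_exponent)
```
@end example

This gives @math{k=4} for the modular product and @math{k=7} for the full
product, matching the figures quoted above.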
! 7208:
! 7209: The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is
! 7210: rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that
! 7211: when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a
! 7212: multiple of @m{2^{2k-1},2^(2k-1)} bits. The @math{+k+3} means some values of
! 7213: @math{N} just under such a multiple will be rounded to the next. The
! 7214: complexity calculations above assume that a favourable size is used, meaning
! 7215: one which isn't padded through rounding, and it's also assumed that the extra
! 7216: @math{+k+3} bits are negligible at typical FFT sizes.
! 7217:
! 7218: The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a
! 7219: step-effect into measured speeds. For example @math{k=8} will round @math{N}
! 7220: up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb
! 7221: groups of sizes for which @code{mpn_mul_n} runs at the same speed. Or for
! 7222: @math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc. In
! 7223: practice it's been found each @math{k} is used at quite small multiples of its
! 7224: size constraint and so the step effect is quite noticeable in a time versus
! 7225: size graph.
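
The group sizes quoted above follow from simple arithmetic, sketched here in
Python. The function name is hypothetical, and the sketch assumes a full
product of two equal size operands, so that @math{N} is twice the operand
size in bits.

@example
```python
def fft_step_limbs(k, limb_bits=32):
    """Operand size step, in limbs, for a full-product FFT-k."""
    step_bits = 1 << (2 * k - 1)    # N is a multiple of 2^(2k-1) bits
    # N covers bits(x)+bits(y), so each operand gets half of every step.
    return step_bits // (2 * limb_bits)


steps = [fft_step_limbs(k) for k in (8, 9, 10)]
```
@end example

With the default 32-bit limb this gives steps of 512, 2048 and 8192 limbs for
@math{k=8}, 9 and 10.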
! 7226:
! 7227: The threshold determinations currently measure at the mid-points of size
! 7228: steps, but this is sub-optimal since at the start of a new step it can happen
! 7229: that it's better to go back to the previous @math{k} for a while. Something
! 7230: more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be
! 7231: needed.
! 7232:
! 7233:
! 7234: @node Other Multiplication, , FFT Multiplication, Multiplication Algorithms
! 7235: @subsection Other Multiplication
! 7236:
! 7237: The 3-way Toom-Cook algorithm described above (@pxref{Toom-Cook 3-Way
! 7238: Multiplication}) generalizes to split into an arbitrary number of pieces, as
! 7239: per Knuth section 4.3.3 algorithm C. This is not currently used, though it's
! 7240: possible a Toom-4 might fit in between Toom-3 and the FFTs. The notes here
! 7241: are merely for interest.
! 7242:
! 7243: In general a split into @math{r+1} pieces is made, and evaluations and
! 7244: pointwise multiplications done at @m{2r+1,2*r+1} points. A 4-way split does 7
! 7245: pointwise multiplies, 5-way does 9, etc. Asymptotically an @math{(r+1)}-way
! 7246: algorithm is @m{O(N^{\log(2r+1)/\log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}. Only
! 7247: the pointwise multiplications count towards big-@math{O} complexity, but the
! 7248: time spent in the evaluate and interpolate stages grows with @math{r} and has
! 7249: a significant practical impact, with the asymptotic advantage of each @math{r}
! 7250: realized only at bigger and bigger sizes. The overheads grow as
! 7251: @m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log
! 7252: r), O(N*log(r))}.
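
For reference, the exponents for the first few splits can be tabulated with a
short Python sketch. The function name is an assumption for the example.

@example
```python
import math


def toom_exponent(r):
    """Asymptotic exponent for an (r+1)-way split with 2*r+1 multiplies."""
    return math.log(2 * r + 1) / math.log(r + 1)


# r = 1 is Karatsuba, r = 2 is Toom-3, r = 3 and r = 4 are 4-way and 5-way.
exponents = [round(toom_exponent(r), 3) for r in (1, 2, 3, 4)]
```
@end example

The values are about 1.585, 1.465, 1.404 and 1.365, so each extra piece buys
a smaller improvement in the exponent than the one before.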
! 7253:
! 7254: Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
! 7255: uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small
! 7256: multiplies in the evaluate stage (or rather trades them for additions), and
! 7257: has a further saving of nearly half the interpolate steps. The idea is to
! 7258: separate odd and even final coefficients and then perform algorithm C steps C7
! 7259: and C8 on them separately. The divisors at step C7 become @math{j^2} and the
! 7260: multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.
! 7261:
! 7262: Splitting odd and even parts through positive and negative points can be
! 7263: thought of as using @math{-1} as a square root of unity. If a 4th root of
! 7264: unity were available then a further split and speedup would be possible, but no
! 7265: such root exists for plain integers. Going to complex integers with
! 7266: @m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian
! 7267: form it takes three real multiplies to do a complex multiply. The existence
! 7268: of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
! 7269: Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.
! 7270:
! 7271: Floating point FFTs use complex numbers approximating Nth roots of unity.
! 7272: Some processors have special support for such FFTs. But these are not used in
! 7273: GMP since it's very difficult to guarantee an exact result (to some number of
! 7274: bits). An occasional difference of 1 in the last bit might not matter to a
! 7275: typical signal processing algorithm, but is of course of vital importance to
! 7276: GMP.
! 7277:
! 7278:
! 7279: @node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
! 7280: @section Division Algorithms
! 7281: @cindex Division algorithms
! 7282:
! 7283: @menu
! 7284: * Single Limb Division::
! 7285: * Basecase Division::
! 7286: * Divide and Conquer Division::
! 7287: * Exact Division::
! 7288: * Exact Remainder::
! 7289: * Small Quotient Division::
! 7290: @end menu
! 7291:
! 7292:
! 7293: @node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
! 7294: @subsection Single Limb Division
! 7295:
! 7296: N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
! 7297: high to low, either with a hardware divide instruction or a multiplication by
! 7298: inverse, whichever is best on a given CPU.
! 7299:
! 7300: The multiply by inverse follows section 8 of ``Division by Invariant Integers
! 7301: using Multiplication'' by Granlund and Montgomery (@pxref{References}) and is
! 7302: implemented as @code{udiv_qrnnd_preinv} in @file{gmp-impl.h}. The idea is to
! 7303: have a fixed-point approximation to @math{1/d} (see @code{invert_limb}) and
! 7304: then multiply by the high limb (plus one bit) of the dividend to get a
! 7305: quotient @math{q}. With @math{d} normalized (high bit set), @math{q} is no
! 7306: more than 1 too small. Subtracting @m{qd,q*d} from the dividend gives a
! 7307: remainder, and reveals whether @math{q} or @math{q-1} is correct.
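
The multiply by inverse can be sketched as follows. This is a simplified illustration in Python, not the code of @code{udiv_qrnnd_preinv}: it assumes a 32-bit limb, forms its estimate from the high limb only (so it may need a few more fix-up steps than the refined version described above), and all function names are hypothetical apart from the borrowed name @code{invert_limb}.

```python
BITS = 32                  # illustrative limb size, not tied to any real build
B = 1 << BITS

def invert_limb(d):
    # Fixed-point approximation to 1/d for a normalized d (high bit set):
    # floor((B*B - 1) / d) - B, which fits in one limb.
    assert B // 2 <= d < B
    return (B * B - 1) // d - B

def div_2by1_preinv(n1, n0, d, dinv):
    # Divide the two-limb value n1*B + n0 by d, given n1 < d.
    assert n1 < d
    q = n1 + ((n1 * dinv) >> BITS)     # estimate from the high limb only
    r = n1 * B + n0 - q * d            # subtracting q*d gives a remainder
    while r >= d:                      # the estimate is never too big
        q += 1
        r -= d
    return q, r

def divrem_1(limbs, d):
    # N x 1 division as repeated 2 x 1 divisions from high to low.
    # limbs are least significant first; d must be normalized.
    dinv = invert_limb(d)
    quotient = [0] * len(limbs)
    r = 0
    for i in reversed(range(len(limbs))):
        quotient[i], r = div_2by1_preinv(r, limbs[i], d, dinv)
    return quotient, r
```

The inverse is computed once per call of @code{divrem_1} and reused for every limb, which is where the saving over a hardware divide would come from.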
! 7308:
! 7309: The result is a division done with two multiplications and four or five
! 7310: arithmetic operations. On CPUs with low latency multipliers this can be much
! 7311: faster than a hardware divide, though the cost of calculating the inverse at
! 7312: the start may mean it's only better on inputs bigger than say 4 or 5 limbs.
! 7313:
! 7314: When a divisor must be normalized, either for the generic C
! 7315: @code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is
! 7316: actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and
! 7317: @math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set.
! 7318: The bit shifts for the dividend are usually accomplished ``on the fly''
! 7319: meaning by extracting the appropriate bits at each step. Done this way the
! 7320: quotient limbs come out aligned ready to store. When only the remainder is
! 7321: wanted, an alternative is to take the dividend limbs unshifted and calculate
! 7322: @m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k
! 7323: \bmod d2^k, r*2^k mod d*2^k}. This can help on CPUs with poor bit shifts or
! 7324: few registers.
! 7325:
! 7326: The multiply by inverse can be done two limbs at a time. The calculation is
! 7327: basically the same, but the inverse is two limbs and the divisor treated as if
! 7328: padded with a low zero limb. This means more work, since the inverse will
! 7329: need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are
! 7330: independent and can therefore be done partly or wholly in parallel. Likewise
! 7331: for a 2@cross{}1 calculating @m{qd,q*d}. The net effect is to process two
! 7332: limbs with roughly the same two multiplies worth of latency that one limb at a
! 7333: time gives. This extends to 3 or 4 limbs at a time, though the extra work to
! 7334: apply the inverse will almost certainly soon reach the limits of multiplier
! 7335: throughput.
! 7336:
! 7337: A similar approach in reverse can be taken to process just half a limb at a
! 7338: time if the divisor is only a half limb. In this case the 1@cross{}1 multiply
! 7339: for the inverse effectively becomes two @m{1\over2@cross{}1, (1/2)x1} for each
! 7340: limb, which can be a saving on CPUs with a fast half limb multiply, or in fact
! 7341: if the only multiply is a half limb, and especially if it's not pipelined.
! 7342:
! 7343:
! 7344: @node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms
! 7345: @subsection Basecase Division
! 7346:
! 7347: Basecase N@cross{}M division is like long division done by hand, but in base
! 7348: @m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}. See Knuth
! 7349: section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}.
! 7350:
! 7351: Briefly stated, while the dividend remains larger than the divisor, a high
! 7352: quotient limb is formed and the M@cross{}1 product @m{qd,q*d} subtracted at
! 7353: the top end of the dividend. With a normalized divisor (most significant bit
! 7354: set), each quotient limb can be formed with a 2@cross{}1 division and a
! 7355: 1@cross{}1 multiplication plus some subtractions. The 2@cross{}1 division is
! 7356: by the high limb of the divisor and is done either with a hardware divide or a
! 7357: multiply by inverse (the same as in @ref{Single Limb Division}) whichever is
! 7358: faster. Such a quotient is sometimes one too big, requiring an addback of the
! 7359: divisor, but that happens rarely.
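
The following Python sketch shows the shape of such a basecase division. It is an illustration only, assuming a 32-bit limb and using Python integers to hold the operands; it is not the code in @file{mpn/generic/sb_divrem_mn.c}, and the function name is hypothetical.

```python
BITS = 32
B = 1 << BITS

def basecase_divrem(a, d):
    # Schoolbook division of non-negative a by positive d, one quotient
    # limb per step, in the style of Knuth's algorithm D.
    assert a >= 0 and d > 0
    m = (d.bit_length() + BITS - 1) // BITS      # limbs in the divisor
    shift = m * BITS - d.bit_length()            # normalize: set the high bit
    a <<= shift
    d <<= shift
    d_high = d >> (BITS * (m - 1))               # most significant divisor limb
    n = (a.bit_length() + BITS - 1) // BITS      # limbs in the dividend
    q = 0
    rem = a
    for j in reversed(range(max(n - m + 1, 0))):
        # 2 x 1 division of the top of the remainder by the high limb.
        top = rem >> (BITS * (j + m - 1))
        qhat = min(top // d_high, B - 1)
        rem -= (qhat * d) << (BITS * j)          # subtract the M x 1 product
        while rem < 0:                           # qhat was too big: add back
            qhat -= 1
            rem += d << (BITS * j)
        q |= qhat << (BITS * j)
    return q, rem >> shift
```

With a normalized divisor the addback loop runs rarely, which is why the trial quotient from the high limb alone is good enough in practice.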
! 7360:
! 7361: With Q=N@minus{}M being the number of quotient limbs, this is an
! 7362: @m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase
! 7363: Q@cross{}M multiplication, differing in fact only in the extra multiply and
! 7364: divide for each of the Q quotient limbs.
! 7365:
! 7366:
! 7367: @node Divide and Conquer Division, Exact Division, Basecase Division, Division Algorithms
! 7368: @subsection Divide and Conquer Division
! 7369:
! 7370: For divisors larger than @code{DIV_DC_THRESHOLD}, division is done by dividing.
! 7371: Or to be precise by a recursive divide and conquer algorithm based on work by
! 7372: Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}).
! 7373:
! 7374: The algorithm consists essentially of recognising that a 2N@cross{}N division
! 7375: can be done with the basecase division algorithm (@pxref{Basecase Division}),
! 7376: but using N/2 limbs as a base, not just a single limb. This way the
! 7377: multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of
! 7378: Karatsuba and higher multiplication algorithms (@pxref{Multiplication
! 7379: Algorithms}). The ``digits'' of the quotient are formed by recursive
! 7380: N@cross{}(N/2) divisions.
! 7381:
! 7382: If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication
! 7383: then the work is about the same as a basecase division, but with more function
! 7384: call overheads and with some subtractions separated from the multiplies.
! 7385: These overheads mean that it's only when N/2 is above
! 7386: @code{MUL_KARATSUBA_THRESHOLD} that divide and conquer is of use.
! 7387:
! 7388: @code{DIV_DC_THRESHOLD} is based on the divisor size N, so it will be somewhere
! 7389: above twice @code{MUL_KARATSUBA_THRESHOLD}, but how much above depends on the
! 7390: CPU. An optimized @code{mpn_mul_basecase} can lower @code{DIV_DC_THRESHOLD} a
! 7391: little by offering a ready-made advantage over repeated @code{mpn_submul_1}
! 7392: calls.
! 7393:
! 7394: Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where
! 7395: @math{M(N)} is the time for an N@cross{}N multiplication done with FFTs. The
! 7396: actual time is a sum over multiplications of the recursed sizes, as can be
! 7397: seen near the end of section 2.2 of Burnikel and Ziegler. For example, within
! 7398: the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}. With higher
! 7399: algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log
! 7400: N, log(N)}. In practice, at moderate to large sizes, a 2N@cross{}N division
! 7401: is about 2 to 4 times slower than an N@cross{}N multiplication.
! 7402:
! 7403: Newton's method used for division is asymptotically @math{O(M(N))} and should
! 7404: therefore be superior to divide and conquer, but it's believed this would only
! 7405: be for large to very large N.
! 7406:
! 7407:
! 7408: @node Exact Division, Exact Remainder, Divide and Conquer Division, Division Algorithms
! 7409: @subsection Exact Division
! 7410:
! 7411: A so-called exact division is when the dividend is known to be an exact
! 7412: multiple of the divisor. Jebelean's exact division algorithm uses this
! 7413: knowledge to make some significant optimizations (@pxref{References}).
! 7414:
! 7415: The idea can be illustrated in decimal for example with 368154 divided by
! 7416: 543. Because the low digit of the dividend is 4, the low digit of the
! 7417: quotient must be 8. This is arrived at from @m{4 \mathord{\times} 7 \bmod 10,
! 7418: 4*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of
! 7419: the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7
! 7420: @equiv{} 1 mod 10}. So @m{8\mathord{\times}543 = 4344,8*543=4344} can be
! 7421: subtracted from the dividend leaving 363810. Notice the low digit has become
! 7422: zero.
! 7423:
! 7424: The procedure is repeated at the second digit, with the next quotient digit 7
! 7425: (@m{1 \mathord{\times} 7 \bmod 10, 1*7 mod 10}), subtracting
! 7426: @m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800. And finally at
! 7427: the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7
! 7428: mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0.
! 7429: So the quotient is 678.
! 7430:
! 7431: Notice however that the multiplies and subtractions don't need to extend past
! 7432: the low three digits of the dividend, since that's enough to determine the
! 7433: three quotient digits. For the last quotient digit no subtraction is needed
! 7434: at all. On a 2N@cross{}N division like this one, only about half the work of
! 7435: a normal basecase division is necessary.
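
The low to high procedure can be sketched in Python as below. This is a minimal illustration, assuming the divisor is coprime to the base and divides the dividend exactly; it does full-width subtractions rather than the truncated ones just described, the function name is hypothetical, and @code{pow} with a negative exponent needs Python 3.8 or later.

```python
def divexact(a, d, base=10):
    # Low-to-high exact division.  Requires gcd(d, base) == 1 and that d
    # divides a exactly; otherwise the loop need not terminate.
    inv = pow(d % base, -1, base)      # modular inverse of the low digit of d
    q = 0
    weight = 1
    while a != 0:
        digit = (a % base) * inv % base
        a = (a - digit * d) // base    # the low digit is now zero: drop it
        q += digit * weight
        weight *= base
    return q
```

Called as @code{divexact(368154, 543)} this produces the quotient digits 8, 7 and 6 in that order, matching the worked example.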
! 7436:
! 7437: For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
! 7438: saving over a normal basecase division is in two parts. Firstly, each of the
! 7439: Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
! 7440: multiply. Secondly, the crossproducts are reduced when @math{Q>M} to
! 7441: @m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2,
! 7442: Q*(Q-1)/2}. Notice the savings are complementary. If Q is big then many
! 7443: divisions are saved, or if Q is small then the crossproducts reduce to a small
! 7444: number.
! 7445:
! 7446: The modular inverse used is calculated efficiently by @code{modlimb_invert} in
! 7447: @file{gmp-impl.h}. This does four multiplies for a 32-bit limb, or six for a
! 7448: 64-bit limb. @file{tune/modlinv.c} has some alternate implementations that
! 7449: might suit processors better at bit twiddling than multiplying.
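
One way to arrive at the multiply counts above is a Newton iteration that doubles the number of correct low bits at each step. The Python sketch below is an illustration under that assumption, not a copy of @code{modlimb_invert}; in particular the 8-bit starting value is computed with @code{pow} (Python 3.8 or later) as a stand-in for a lookup table.

```python
def modlimb_invert(d, bits=64):
    # Inverse of odd d modulo 2**bits by Newton's iteration.
    assert d % 2 == 1
    mask = (1 << bits) - 1
    inv = pow(d & 0xFF, -1, 256)       # stand-in for an 8-bit lookup table
    good = 8                           # number of correct low bits so far
    while good < bits:
        # Two multiplies per step; the correct bits double each time.
        inv = (inv * (2 - d * inv)) & mask
        good *= 2
    return inv
```

From 8 bits, two steps reach 32 bits (four multiplies) and three steps reach 64 bits (six multiplies).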
! 7450:
! 7451: The sub-quadratic exact division described by Jebelean in ``Exact Division
! 7452: with Karatsuba Complexity'' is not currently implemented. It uses a
! 7453: rearrangement similar to the divide and conquer for normal division
! 7454: (@pxref{Divide and Conquer Division}), but operating from low to high. A
! 7455: further possibility not currently implemented is ``Bidirectional Exact Integer
! 7456: Division'' by Krandick and Jebelean which forms quotient limbs from both the
! 7457: high and low ends of the dividend, and can halve once more the number of
! 7458: crossproducts needed in a 2N@cross{}N division.
! 7459:
! 7460: A special case exact division by 3 exists in @code{mpn_divexact_by3},
! 7461: supporting Toom-3 multiplication and @code{mpq} canonicalizations. It forms
! 7462: quotient digits with a multiply by the modular inverse of 3 (which is
! 7463: @code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next
! 7464: limb. The multiplications don't need to be on the dependent chain, as long as
! 7465: the effect of the borrows is applied. Only a few optimized assembler
! 7466: implementations currently exist.
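
A Python sketch of the idea follows, assuming a 32-bit limb and limbs stored least significant first. It is meant to show the multiply by inverse and the two comparisons, and is not the actual @code{mpn_divexact_by3} source; the helper name and return convention are hypothetical.

```python
BITS = 32
MASK = (1 << BITS) - 1
INVERSE_3 = 0xAAAAAAAB                 # 3 * INVERSE_3 == 1 mod 2**32

def divexact_by3(limbs):
    # limbs: least significant first.  Returns (quotient limbs, final carry);
    # the final carry is zero when the input is a multiple of 3.
    out = []
    c = 0
    for s in limbs:
        l = (s - c) & MASK
        c = 1 if c > s else 0          # borrow from the subtraction itself
        l = (l * INVERSE_3) & MASK     # quotient limb
        out.append(l)
        c += l > MASK // 3             # 3*l wrapped past the limb once
        c += l > (MASK // 3) * 2       # ... or twice
    return out, c
```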
! 7467:
! 7468:
! 7469: @node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms
! 7470: @subsection Exact Remainder
! 7471:
! 7472: If the exact division algorithm is done with a full subtraction at each stage
! 7473: and the dividend isn't a multiple of the divisor, then low zero limbs are
! 7474: produced but with a remainder in the high limbs. For dividend @math{a},
! 7475: divisor @math{d}, quotient @math{q}, and @m{b = 2
! 7476: \GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, then this
! 7477: remainder @math{r} is of the form
! 7478: @tex
! 7479: $$ a = qd + r b^n $$
! 7480: @end tex
! 7481: @ifnottex
! 7482:
! 7483: @example
! 7484: a = q*d + r*b^n
! 7485: @end example
! 7486:
! 7487: @end ifnottex
! 7488: @math{n} represents the number of zero limbs produced by the subtractions,
! 7489: that being the number of limbs produced for @math{q}. @math{r} will be in the
! 7490: range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by
! 7491: a factor of @math{b^n}.
! 7492:
! 7493: Carrying out full subtractions at each stage means the same number of cross
! 7494: products must be done as a normal division, but there's still some single limb
! 7495: divisions saved. When @math{d} is a single limb some simplifications arise,
! 7496: providing good speedups on a number of processors.
! 7497:
! 7498: @code{mpn_bdivmod}, @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the
! 7499: @code{redc} function in @code{mpz_powm} differ subtly in how they return
! 7500: @math{r}, leading to some negations in the above formula, but all are
! 7501: essentially the same.
! 7502:
! 7503: Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this
! 7504: leads to divisibility or congruence tests which are potentially more efficient
! 7505: than a normal division.
! 7506:
! 7507: The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is
! 7508: odd, hence the use of @code{mpn_bdivmod} in @code{mpn_gcd}, and the use of
! 7509: @code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and @code{mpz_kronecker_ui} etc
! 7510: (@pxref{Greatest Common Divisor Algorithms}).
! 7511:
! 7512: Montgomery's REDC method for modular multiplications uses operands of the form
! 7513: @m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n})
! 7514: (yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in the exact
! 7515: remainder to reach a product in the same form @m{(xy)b^{-n}, (x*y)*b^-n}
! 7516: (@pxref{Modular Powering Algorithm}).
! 7517:
! 7518: Notice that @math{r} generally gives no useful information about the ordinary
! 7519: remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything. If
! 7520: however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the
! 7521: ordinary remainder. This occurs whenever @math{d} is a factor of
! 7522: @math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}. Other such
! 7523: factors include 5, 17 and 257, but no particular use has been found for this.
! 7524:
! 7525:
! 7526: @node Small Quotient Division, , Exact Remainder, Division Algorithms
! 7527: @subsection Small Quotient Division
! 7528:
! 7529: An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is
! 7530: small can be optimized somewhat.
! 7531:
! 7532: An ordinary basecase division normalizes the divisor by shifting it to make
! 7533: the high bit set, shifting the dividend accordingly, and shifting the
! 7534: remainder back down at the end of the calculation. This is wasteful if only a
! 7535: few quotient limbs are to be formed. Instead a division of just the top
! 7536: @m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be
! 7537: used to form a trial quotient. This requires only those limbs normalized, not
! 7538: the whole of the divisor and dividend.
! 7539:
! 7540: A multiply and subtract then applies the trial quotient to the M@minus{}Q
! 7541: unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q
! 7542: limbs remaining from the trial quotient division). The starting trial
! 7543: quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1
! 7544: too big are detected by first comparing the most significant limbs that will
! 7545: arise from the subtraction. An addback is done if the quotient still turns
! 7546: out to be 1 too big.
! 7547:
! 7548: This whole procedure is essentially the same as one step of the basecase
! 7549: algorithm done in a Q limb base, though with the trial quotient test done only
! 7550: with the high limbs, not an entire Q limb ``digit'' product. The correctness
! 7551: of this weaker test can be established by following the argument of Knuth
! 7552: section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r
! 7553: + u_2, v2*q>b*r+u2} condition appropriately relaxed.
! 7554:
! 7555:
! 7556: @need 1000
! 7557: @node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
! 7558: @section Greatest Common Divisor
! 7559: @cindex Greatest common divisor algorithms
! 7560:
! 7561: @menu
! 7562: * Binary GCD::
! 7563: * Accelerated GCD::
! 7564: * Extended GCD::
! 7565: * Jacobi Symbol::
! 7566: @end menu
! 7567:
! 7568:
! 7569: @node Binary GCD, Accelerated GCD, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
! 7570: @subsection Binary GCD
! 7571:
! 7572: At small sizes GMP uses an @math{O(N^2)} binary style GCD. This is described
! 7573: in many textbooks, for example Knuth section 4.5.2 algorithm B. It simply
! 7574: consists of successively reducing operands @math{a} and @math{b} using
! 7575: @math{@gcd{}(a,b) = @gcd{}(@min{}(a,b),@abs{}(a-b))}, and also that if
! 7576: @math{a} and @math{b} are first made odd then @math{@abs{}(a-b)} is even and
! 7577: factors of two can be discarded.
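
As a minimal sketch, the reduction can be written in Python as follows. The function name is hypothetical and no attempt is made to mirror GMP's limb-level code.

```python
def binary_gcd(a, b):
    assert a >= 0 and b >= 0
    if a == 0 or b == 0:
        return a | b
    # Factors of two common to both operands belong to the GCD.
    shift = 0
    while (a | b) & 1 == 0:
        a >>= 1
        b >>= 1
        shift += 1
    # Remaining factors of two in either operand can be discarded.
    while a & 1 == 0:
        a >>= 1
    while b & 1 == 0:
        b >>= 1
    while a != b:
        a, b = min(a, b), abs(a - b)   # gcd(a,b) = gcd(min(a,b), abs(a-b))
        while b & 1 == 0:              # difference of two odd values is even
            b >>= 1
    return a << shift
```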
! 7578:
! 7579: Variants like letting @math{a-b} become negative and doing a different next
! 7580: step are of interest only as far as they suit particular CPUs, since on small
! 7581: operands it's machine dependent factors that determine performance.
! 7582:
! 7583: The Euclidean GCD algorithm, as per Knuth algorithms E and A, reduces using
! 7584: @math{a @bmod b} but this has so far been found to be slower everywhere. One
! 7585: reason the binary method does well is that the implied quotient at each step
! 7586: is usually small, so often only one or two subtractions are needed to get the
! 7587: same effect as a division. Quotients 1, 2 and 3 for example occur 67.7% of
! 7588: the time, see Knuth section 4.5.3 Theorem E.
! 7589:
! 7590: When the implied quotient is large, meaning @math{b} is much smaller than
! 7591: @math{a}, then a division is worthwhile. This is the basis for the initial
! 7592: @math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
! 7593: for both N@cross{}1 and 1@cross{}1 cases). But after that initial reduction,
! 7594: big quotients occur too rarely to make it worth checking for them.
! 7595:
! 7596:
! 7597: @node Accelerated GCD, Extended GCD, Binary GCD, Greatest Common Divisor Algorithms
! 7598: @subsection Accelerated GCD
! 7599:
! 7600: For sizes above @code{GCD_ACCEL_THRESHOLD}, GMP uses the Accelerated GCD
! 7601: algorithm described independently by Weber and Jebelean (the latter as the
! 7602: ``Generalized Binary'' algorithm), @pxref{References}. This algorithm is
! 7603: still @math{O(N^2)}, but is much faster than the binary algorithm since it
! 7604: does fewer multi-precision operations. It consists of alternating the
! 7605: @math{k}-ary reduction by Sorenson, and a ``dmod'' exact remainder reduction.
! 7606:
! 7607: For operands @math{u} and @math{v} the @math{k}-ary reduction replaces
! 7608: @math{u} with @m{nv-du,n*v-d*u} where @math{n} and @math{d} are single limb
! 7609: values chosen to give two trailing zero limbs on that value, which can be
! 7610: stripped. @math{n} and @math{d} are calculated using an algorithm similar to
! 7611: half of a two limb GCD (see @code{find_a} in @file{mpn/generic/gcd.c}).
! 7612:
! 7613: When @math{u} and @math{v} differ in size by more than a certain number of
! 7614: bits, a dmod is performed to zero out bits at the low end of the larger. It
! 7615: consists of an exact remainder style division applied to an appropriate number
! 7616: of bits (@pxref{Exact Division}, and @pxref{Exact Remainder}). This is faster
! 7617: than a @math{k}-ary reduction but useful only when the operands differ in
! 7618: size. There's a dmod after each @math{k}-ary reduction, and if the dmod
! 7619: leaves the operands still differing in size then it's repeated.
! 7620:
! 7621: The @math{k}-ary reduction step can introduce spurious factors into the GCD
! 7622: calculated, and these are eliminated at the end by taking GCDs with the
! 7623: original inputs @math{@gcd{}(u,@gcd{}(v,g))} using the binary algorithm.
! 7624: Since @math{g} is almost always small this takes very little time.
! 7625:
! 7626: At small sizes the algorithm needs a good implementation of @code{find_a}. At
! 7627: larger sizes it's dominated by @code{mpn_addmul_1} applying @math{n} and
! 7628: @math{d}.
! 7629:
! 7630:
! 7631: @node Extended GCD, Jacobi Symbol, Accelerated GCD, Greatest Common Divisor Algorithms
! 7632: @subsection Extended GCD
! 7633:
! 7634: The extended GCD calculates @math{@gcd{}(a,b)} and also cofactors @math{x} and
! 7635: @math{y} satisfying @m{ax+by=\gcd(a@C{}b), a*x+b*y=gcd(a@C{}b)}. Lehmer's
! 7636: multi-step improvement of the extended Euclidean algorithm is used. See Knuth
! 7637: section 4.5.2 algorithm L, and @file{mpn/generic/gcdext.c}. This is an
! 7638: @math{O(N^2)} algorithm.
! 7639:
! 7640: The multipliers at each step are found using single limb calculations for
! 7641: sizes up to @code{GCDEXT_THRESHOLD}, or double limb calculations above that.
! 7642: The single limb code is faster but doesn't produce full-limb multipliers,
! 7643: hence not making full use of the @code{mpn_addmul_1} calls.
! 7644:
! 7645: When a CPU has a data-dependent multiplier, meaning one which is faster on
! 7646: operands with fewer bits, the extra work in the double-limb calculation might
! 7647: only save some looping overheads, leading to a large @code{GCDEXT_THRESHOLD}.
! 7648:
! 7649: Currently the single limb calculation doesn't optimize for the small quotients
! 7650: that often occur, and this can lead to unusually low values of
! 7651: @code{GCDEXT_THRESHOLD}, depending on the CPU.
! 7652:
! 7653: An analysis of double-limb calculations can be found in ``A Double-Digit
! 7654: Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The code in GMP
! 7655: was developed independently.
! 7656:
! 7657: It should be noted that when a double limb calculation is used, it's used for
! 7658: the whole of that GCD; it doesn't fall back to single limb part way through.
! 7659: This is because as the algorithm proceeds, the inputs @math{a} and @math{b}
! 7660: are reduced, but the cofactors @math{x} and @math{y} grow, so the multipliers
! 7661: at each step are applied to a roughly constant total number of limbs.
! 7662:
! 7663:
! 7664: @node Jacobi Symbol, , Extended GCD, Greatest Common Divisor Algorithms
! 7665: @subsection Jacobi Symbol
! 7666:
! 7667: @code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a
! 7668: simple binary algorithm similar to that described for the GCDs (@pxref{Binary
! 7669: GCD}). They're not very fast when both inputs are large. Lehmer's multi-step
! 7670: improvement or a binary based multi-step algorithm is likely to be better.
! 7671:
! 7672: When one operand fits a single limb, and that includes @code{mpz_kronecker_ui}
! 7673: and friends, an initial reduction is done with either @code{mpn_mod_1} or
! 7674: @code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb.
! 7675: The binary algorithm is well suited to a single limb, and the whole
! 7676: calculation in this case is quite efficient.
! 7677:
! 7678: In all the routines sign changes for the result are accumulated using some bit
! 7679: twiddling, avoiding table lookups or conditional jumps.
! 7680:
! 7681:
! 7682: @need 1000
! 7683: @node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
! 7684: @section Powering Algorithms
! 7685: @cindex Powering algorithms
! 7686:
! 7687: @menu
! 7688: * Normal Powering Algorithm::
! 7689: * Modular Powering Algorithm::
! 7690: @end menu
1.1 maekawa 7691:
7692:
1.1.1.4 ! ohara 7693: @node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
! 7694: @subsection Normal Powering
1.1 maekawa 7695:
1.1.1.4 ! ohara 7696: Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
! 7697: successively squaring and then multiplying by the base when a 1 bit is seen in
! 7698: the exponent, as per Knuth section 4.6.3. The ``left to right''
! 7699: variant described there is used rather than algorithm A, since it's just as
! 7700: easy and can be done with somewhat less temporary memory.
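
The left to right scan amounts to the following, shown here as a small Python sketch with a hypothetical function name.

```python
def pow_left_to_right(base, exp):
    # Scan the exponent from its most significant bit downwards.
    assert exp >= 0
    result = 1
    for bit in bin(exp)[2:]:
        result *= result               # square for every bit
        if bit == "1":
            result *= base             # multiply by the base on a 1 bit
    return result
```

Only the running result needs to be kept alongside the base, which is the sense in which this variant is light on temporary memory.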
! 7701:
! 7702:
! 7703: @node Modular Powering Algorithm, , Normal Powering Algorithm, Powering Algorithms
! 7704: @subsection Modular Powering
! 7705:
! 7706: Modular powering is implemented using a @math{2^k}-ary sliding window
! 7707: algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85
! 7708: (@pxref{References}). @math{k} is chosen according to the size of the
! 7709: exponent. Larger exponents use larger values of @math{k}, the choice being
! 7710: made to minimize the average number of multiplications that must supplement
! 7711: the squaring.
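
A sliding window exponentiation along those lines might look like the Python sketch below. It is an illustration only: the window size @code{k} is a plain parameter rather than being chosen from the exponent size, the reductions use the ordinary remainder operator rather than REDC, and the function name is hypothetical.

```python
def powm_sliding_window(base, exp, mod, k=4):
    assert exp >= 0 and mod > 0 and k >= 1
    base %= mod
    # Precompute the odd powers base^1, base^3, ..., base^(2^k - 1).
    base_sq = base * base % mod
    odd_powers = [base]
    for _ in range((1 << (k - 1)) - 1):
        odd_powers.append(odd_powers[-1] * base_sq % mod)
    result = 1 % mod
    i = exp.bit_length() - 1
    while i >= 0:
        if (exp >> i) & 1 == 0:
            result = result * result % mod
            i -= 1
            continue
        # Longest window of at most k bits, starting at bit i and ending
        # on a 1 bit at position j.
        j = max(i - k + 1, 0)
        while (exp >> j) & 1 == 0:
            j += 1
        window = (exp >> j) & ((1 << (i - j + 1)) - 1)
        for _ in range(i - j + 1):
            result = result * result % mod
        result = result * odd_powers[window >> 1] % mod
        i = j - 1
    return result
```

A larger @code{k} means more precomputed odd powers but fewer multiplications between the squarings, which is the trade-off behind choosing @code{k} from the exponent size.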
! 7712:
! 7713: The modular multiplies and squares use either a simple division or the REDC
! 7714: method by Montgomery (@pxref{References}). REDC is a little faster,
! 7715: essentially saving N single limb divisions in a fashion similar to an exact
! 7716: remainder (@pxref{Exact Remainder}). The current REDC has some limitations.
! 7717: It's only @math{O(N^2)} so above @code{POWM_THRESHOLD} division becomes faster
! 7718: and is used. It doesn't attempt to detect small bases, but rather always uses
! 7719: a REDC form, which is usually a full size operand. And lastly it's only
! 7720: applied to odd moduli.
! 7721:
! 7722:
! 7723: @node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
! 7724: @section Root Extraction Algorithms
! 7725: @cindex Root extraction algorithms
1.1 maekawa 7726:
1.1.1.4 ! ohara 7727: @menu
! 7728: * Square Root Algorithm::
! 7729: * Nth Root Algorithm::
! 7730: * Perfect Square Algorithm::
! 7731: * Perfect Power Algorithm::
! 7732: @end menu
1.1 maekawa 7733:
1.1.1.2 maekawa 7734:
1.1.1.4 ! ohara 7735: @node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
! 7736: @subsection Square Root
1.1.1.2 maekawa 7737:
1.1.1.4 ! ohara 7738: Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
! 7739: Zimmermann (@pxref{References}). This is expressed in a divide and conquer
! 7740: form, but as noted in the paper it can also be viewed as a discrete variant of
! 7741: Newton's method.
! 7742:
! 7743: In the Karatsuba multiplication range this is an @m{O({3\over2}
! 7744: M(N/2)),O(1.5*M(N/2))} algorithm, where @math{M(n)} is the time to multiply
! 7745: two numbers of @math{n} limbs. In the FFT multiplication range this grows to
! 7746: a bound of @m{O(6 M(N/2)),O(6*M(N/2))}. In practice a factor of about 1.5 to
! 7747: 1.8 is found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT
! 7748: range.
! 7749:
! 7750: The algorithm does all its calculations in integers and the resulting
! 7751: @code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
! 7752: The extended precision given by @code{mpf_sqrt_ui} is obtained by
! 7753: padding with zero limbs.
1.1 maekawa 7754:
7755:
1.1.1.4 ! ohara 7756: @node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
! 7757: @subsection Nth Root
1.1 maekawa 7758:
1.1.1.4 ! ohara 7759: Integer Nth roots are taken using Newton's method with the following
! 7760: iteration, where @math{A} is the input and @math{n} is the root to be taken.
! 7761: @tex
! 7762: $$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
! 7763: @end tex
! 7764: @ifnottex
1.1 maekawa 7765:
1.1.1.4 ! ohara 7766: @example
! 7767: 1 A
! 7768: a[i+1] = - * ( --------- + (n-1)*a[i] )
! 7769: n a[i]^(n-1)
! 7770: @end example
1.1 maekawa 7771:
1.1.1.4 ! ohara 7772: @end ifnottex
! 7773: The initial approximation @m{a_1,a[1]} is generated bitwise by successively
! 7774: powering a trial root with or without new 1 bits, aiming to be just above the
! 7775: true root. The iteration converges quadratically when started from a good
! 7776: approximation. When @math{n} is large more initial bits are needed to get
! 7777: good convergence. The current implementation is not particularly well
! 7778: optimized.
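
The iteration can be tried out with the Python sketch below. It is a simplified stand-in rather than GMP's implementation: the starting value is just a power of two known to be above the true root, not the bitwise approximation described above, and the function name is hypothetical.

```python
def iroot(A, n):
    # floor(A ** (1/n)) for A >= 0 and n >= 1, by Newton's iteration
    # carried out entirely in integers.
    assert A >= 0 and n >= 1
    if A < 2 or n == 1:
        return A
    shift = (A.bit_length() + n - 1) // n
    x = 1 << shift                     # a power of two above the true root
    while True:
        # a[i+1] = (A / a[i]^(n-1) + (n-1)*a[i]) / n, with floor divisions
        y = ((n - 1) * x + A // x ** (n - 1)) // n
        if y >= x:                     # no further decrease: x is the root
            return x
        x = y
```

Starting above the root makes the sequence strictly decreasing until the floor of the root is reached, so the stopping test is a single comparison.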
1.1 maekawa 7779:
7780:
1.1.1.4 ! ohara 7781: @node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
! 7782: @subsection Perfect Square
1.1 maekawa 7783:
1.1.1.4 ! ohara 7784: @code{mpz_perfect_square_p} is able to quickly exclude most non-squares by
! 7785: checking whether the input is a quadratic residue modulo some small integers.
! 7786:
! 7787: The first test is modulo 256 which means simply examining the least
! 7788: significant byte. Only 44 different values occur as the low byte of a square,
! 7789: so 82.8% of non-squares can be immediately excluded. Similar tests modulo
! 7790: primes from 3 to 29 exclude 99.5% of those remaining, or if a limb is 64 bits
! 7791: then primes up to 53 are used, excluding 99.99%. A single N@cross{}1
! 7792: remainder using @code{PP} from @file{gmp-impl.h} quickly gives all these
! 7793: remainders.
! 7794:
! 7795: A square root must still be taken for any value that passes the residue tests,
! 7796: to verify it's really a square and not one of the 0.086% (or 0.00156% for 64
! 7797: bits) non-squares that get through. @xref{Square Root Algorithm}.
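
The residue counts behind these percentages can be checked with a short Python sketch. The names are hypothetical, the residue sets are recomputed on each call rather than taken from tables, and a separate remainder is taken for each prime instead of the single @code{PP} remainder; @code{isqrt} needs Python 3.8 or later.

```python
from fractions import Fraction
from math import isqrt

PRIMES_32 = (3, 5, 7, 11, 13, 17, 19, 23, 29)
PRIMES_64 = PRIMES_32 + (31, 37, 41, 43, 47, 53)

def square_residues(m):
    # All values x*x mod m, including zero.
    return frozenset(x * x % m for x in range(m))

RESIDUES_256 = square_residues(256)

def pass_fraction(primes):
    # Fraction of residue classes surviving the mod 256 test and the test
    # modulo each prime (the moduli are pairwise coprime).
    frac = Fraction(len(RESIDUES_256), 256)
    for p in primes:
        frac *= Fraction(len(square_residues(p)), p)
    return frac

def is_perfect_square(a, primes=PRIMES_32):
    if a < 0 or a % 256 not in RESIDUES_256:
        return False
    for p in primes:
        if a % p not in square_residues(p):
            return False
    root = isqrt(a)                    # the residue tests are only a filter
    return root * root == a
```

Here @code{len(RESIDUES_256)} is 44, and @code{float(pass_fraction(PRIMES_32))} gives the share of values that still need a square root taken.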
! 7798:
! 7799:
! 7800: @node Perfect Power Algorithm, , Perfect Square Algorithm, Root Extraction Algorithms
! 7801: @subsection Perfect Power
! 7802:
! 7803: Detecting perfect powers is required by some factorization algorithms.
! 7804: Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
! 7805: extractions, though naturally only prime roots need to be considered.
! 7806: (@xref{Nth Root Algorithm}.)
! 7807:
! 7808: If a prime divisor @math{p} with multiplicity @math{e} can be found, then only
! 7809: roots which are divisors of @math{e} need to be considered, much reducing the
! 7810: work necessary. To this end divisibility by a set of small primes is checked.
! 7811:
! 7812:
! 7813: @node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
! 7814: @section Radix Conversion
! 7815: @cindex Radix conversion algorithms
! 7816:
! 7817: Radix conversions are less important than other algorithms. A program
! 7818: dominated by conversions should probably use a different data representation.
! 7819:
! 7820: @menu
! 7821: * Binary to Radix::
! 7822: * Radix to Binary::
! 7823: @end menu
! 7824:
! 7825:
! 7826: @node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
! 7827: @subsection Binary to Radix
! 7828:
! 7829: Conversions from binary to a power-of-2 radix use a simple and fast
! 7830: @math{O(N)} bit extraction algorithm.
! 7831:
! 7832: Conversions from binary to other radices use one of two algorithms. Sizes
! 7833: below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.
! 7834: Repeated divisions by @math{b^n} are made, where @math{b} is the radix and
! 7835: @math{n} is the biggest power that fits in a limb. But instead of simply
! 7836: using the remainder @math{r} from such divisions, an extra divide step is done
! 7837: to give a fractional limb representing @math{r/b^n}. The digits of @math{r}
! 7838: can then be extracted using multiplications by @math{b} rather than divisions.
! 7839: Special case code is provided for decimal, allowing multiplications by 10 to
! 7840: optimize to shifts and adds.
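
The fractional digit extraction can be sketched in Python as below.  This is
only an illustration under stated assumptions: a 32-bit limb, decimal output,
and a fraction held to twice the limb width so that truncation cannot disturb
any digit (GMP itself works with a fractional limb and its own rounding
details).  The names @code{chunk_digits} and @code{basecase_to_decimal} are
hypothetical.

```python
LIMB_BITS = 32               # assumed limb size for this sketch
FRAC_BITS = 2 * LIMB_BITS    # assumed fraction width, wide enough to be exact
BASE = 10
N = 9                        # biggest n with BASE**n fitting in a limb
BIG = BASE ** N
DIGITS = "0123456789"

def chunk_digits(r):
    """Return the N digits of r (0 <= r < BIG), zero padded.

    One division forms a fixed-point fraction approximating r/BIG; after
    that each digit comes from a multiplication by BASE.
    """
    frac = ((r << FRAC_BITS) + BIG - 1) // BIG    # ceil(r * 2^FRAC_BITS / BIG)
    out = []
    for _ in range(N):
        frac *= BASE
        out.append(DIGITS[frac >> FRAC_BITS])     # integer part is the digit
        frac &= (1 << FRAC_BITS) - 1              # keep the fractional part
    return "".join(out)

def basecase_to_decimal(t):
    """Quadratic conversion: repeated division by BIG, low chunk first."""
    chunks = []
    while True:
        t, r = divmod(t, BIG)
        chunks.append(chunk_digits(r))
        if t == 0:
            break
    return "".join(reversed(chunks)).lstrip("0") or "0"
```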
! 7841:
! 7842: Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
! 7843: For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are
! 7844: calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is
! 7845: reached. @math{t} is then divided by that largest power, giving a quotient
! 7846: which is the digits above that power, and a remainder which is those below.
! 7847: These two parts are in turn divided by the second highest power, and so on
! 7848: recursively. When a piece has been divided down to less than
! 7849: @code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is
! 7850: used.
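
The recursive splitting might be sketched as follows.  This is a simplified
model rather than GMP's implementation: it takes @math{n=1}, so the powers are
@math{b^(2^i)}, it recurses all the way down to single digits instead of
switching to the basecase at a threshold, and the name @code{to_radix} is
invented here.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_radix(t, base=10):
    """Convert t >= 0 to a digit string by divide and conquer."""
    # powers[i] = base**(2**i), stopping once the square exceeds t,
    # so the last power lies between sqrt(t) and t (for t >= base).
    powers = [base]
    while powers[-1] * powers[-1] <= t:
        powers.append(powers[-1] * powers[-1])

    def split(x, i):
        # Requires x < base**(2**(i+1)); returns exactly 2**(i+1) digits.
        if i < 0:
            return DIGITS[x]
        high, low = divmod(x, powers[i])   # one big division cuts x in half
        return split(high, i - 1) + split(low, i - 1)

    return split(t, len(powers) - 1).lstrip("0") or "0"
```

Each level performs a few large divisions rather than many single limb ones,
which is where a sub-quadratic division routine would pay off.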
! 7851:
! 7852: The advantage of this algorithm is that big divisions can make use of the
! 7853: sub-quadratic divide and conquer division (@pxref{Divide and Conquer
! 7854: Division}), and big divisions tend to have lower overheads than lots of
! 7855: separate single limb divisions anyway. But in any case the cost of
! 7856: calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.
! 7857:
! 7858: @code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
! 7859: the same basic thing, the point where it becomes worth doing a big division to
! 7860: cut the input in half. @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
! 7861: of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
! 7862: assumes that's already available, which is the case when recursing.
! 7863:
! 7864: Since the base case produces digits from least to most significant but they
! 7865: want to be stored from most to least, it's necessary to calculate in advance
! 7866: how many digits there will be, or at least be sure not to underestimate that.
! 7867: For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
! 7868: from @code{mp_bases}, rounding up. The result is either correct or one too
! 7869: big.
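
The digit count estimate can be illustrated with a few lines of Python.  Here
the ratio is recomputed from logarithms, which is an assumption of this sketch
(GMP reads @code{chars_per_bit_exactly} from its @code{mp_bases} table), and
@code{digits_estimate} is a made-up name.

```python
import math

def digits_estimate(t, base=10):
    """Upper estimate of the number of digits of t >= 1 in the given base."""
    chars_per_bit = math.log(2) / math.log(base)
    return math.ceil(t.bit_length() * chars_per_bit)
```

For instance 999 and 1000 both occupy 10 bits and both get an estimate of 4
digits, which is exact for 1000 and one too big for 999.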
! 7870:
! 7871: Examining some of the high bits of the input could increase the chance of
! 7872: getting the exact number of digits, but an exact result every time would not
! 7873: be practical, since in general the difference between numbers 100@dots{} and
! 7874: 99@dots{} is only in the last few bits and the work to identify 99@dots{}
! 7875: might well be almost as much as a full conversion.
! 7876:
! 7877: @code{mpf_get_str} doesn't currently use the algorithm described here.  It
! 7878: multiplies or divides by a power of @math{b} to move the radix point to
! 7879: just above the highest non-zero digit (or at worst one above that location),
! 7880: then multiplies by @math{b^n} to bring out digits. This is @math{O(N^2)} and
! 7881: is certainly not optimal.
! 7882:
! 7883: The @math{r/b^n} scheme described above for using multiplications to bring out
! 7884: digits might be useful for more than a single limb. Some brief experiments
! 7885: with it on the base case when recursing didn't give a noticeable improvement,
! 7886: but perhaps that was only due to the implementation. Something similar would
! 7887: work for the sub-quadratic divisions too, though there would be the cost of
! 7888: calculating a bigger radix power.
! 7889:
! 7890: Another possible improvement for the sub-quadratic part would be to arrange
! 7891: for radix powers that balanced the sizes of quotient and remainder produced,
! 7892: ie. the highest power would be a @m{b^{nk},b^(n*k)} approximately equal to
! 7893: @m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor. That ought to
! 7894: smooth out a graph of times against sizes, but may or may not be a net
! 7895: speedup.
! 7896:
! 7897:
! 7898: @node Radix to Binary, , Binary to Radix, Radix Conversion Algorithms
! 7899: @subsection Radix to Binary
! 7900:
! 7901: Conversions from a power-of-2 radix into binary use a simple and fast
! 7902: @math{O(N)} bitwise concatenation algorithm.
! 7903:
! 7904: Conversions from other radices use one of two algorithms. Sizes below
! 7905: @code{SET_STR_THRESHOLD} use a basic @math{O(N^2)} method. Groups of @math{n}
! 7906: digits are converted to limbs, where @math{n} is the biggest power of the base
! 7907: @math{b} which will fit in a limb, then those groups are accumulated into the
! 7908: result by multiplying by @math{b^n} and adding. This saves multi-precision
! 7909: operations, as per Knuth section 4.4 part E (@pxref{References}). Some
! 7910: special case code is provided for decimal, giving the compiler a chance to
! 7911: optimize multiplications by 10.
! 7912:
! 7913: Above @code{SET_STR_THRESHOLD} a sub-quadratic algorithm is used. First
! 7914: groups of @math{n} digits are converted into limbs. Then adjacent limbs are
! 7915: combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x} and @math{y}
! 7916: are the limbs. Adjacent limb pairs are combined into quads similarly with
! 7917: @m{xb^{2n}+y,x*b^(2n)+y}. This continues until a single block remains, that
! 7918: being the result.
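
A small Python model of the pairwise combining is given below.  It assumes a
non-empty string of valid digits and @math{n=9} decimal digits per group, uses
Python's built-in @code{int} for the single group conversions, and pads with a
zero block when a level has an odd count.  The name @code{from_digits} is
hypothetical and the padding strategy is a choice of this sketch.

```python
def from_digits(s, base=10, n=9):
    """Convert a non-empty digit string to an integer by pairwise combining."""
    # Pad on the left so the string splits into whole groups of n digits.
    s = "0" * ((-len(s)) % n) + s
    # One block per group, most significant first.
    blocks = [int(s[i:i + n], base) for i in range(0, len(s), n)]
    power = base ** n                  # weight of one block at this level
    while len(blocks) > 1:
        if len(blocks) % 2:
            blocks.insert(0, 0)        # zero block keeps the pairs aligned
        blocks = [blocks[i] * power + blocks[i + 1]
                  for i in range(0, len(blocks), 2)]
        power *= power                 # b^n, b^(2n), b^(4n), ...
    return blocks[0]
```

At each level the operands double in size, so the later multiplications are
large and roughly balanced.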
! 7919:
! 7920: The advantage of this method is that the multiplications for each @math{x} are
! 7921: big blocks, allowing Karatsuba and higher algorithms to be used. But the cost
! 7922: of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome.
! 7923: @code{SET_STR_THRESHOLD} usually ends up quite big, around 5000 digits, and on
! 7924: some processors much bigger still.
! 7925:
! 7926: @code{SET_STR_THRESHOLD} is based on the input digits (and tuned for decimal),
! 7927: though it might be better based on a limb count, so as to be independent of
! 7928: the base. But that sort of count isn't used by the base case and so would
! 7929: need some sort of initial calculation or estimate.
! 7930:
! 7931: The main reason @code{SET_STR_THRESHOLD} is so much bigger than the
! 7932: corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is
! 7933: much faster than @code{mpn_divrem_1} (often by a factor of 10, or more).
! 7934:
! 7935:
! 7936: @need 1000
! 7937: @node Other Algorithms, Assembler Coding, Radix Conversion Algorithms, Algorithms
! 7938: @section Other Algorithms
! 7939:
! 7940: @menu
! 7941: * Factorial Algorithm::
! 7942: * Binomial Coefficients Algorithm::
! 7943: * Fibonacci Numbers Algorithm::
! 7944: * Lucas Numbers Algorithm::
! 7945: @end menu
! 7946:
! 7947:
! 7948: @node Factorial Algorithm, Binomial Coefficients Algorithm, Other Algorithms, Other Algorithms
! 7949: @subsection Factorial
! 7950:
! 7951: Factorials @math{n!} are calculated by a simple product from @math{1} to
! 7952: @math{n}, but arranged into certain sub-products.
! 7953:
! 7954: First as many factors as fit in a limb are accumulated, then two of those
! 7955: multiplied to give a 2-limb product. When two 2-limb products are ready
! 7956: they're multiplied to a 4-limb product, and when two 4-limbs are ready they're
! 7957: multiplied to an 8-limb product, etc. A stack of outstanding products is
! 7958: built up, with two of the same size multiplied together when ready.
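
The stack of sub-products can be modelled in Python as follows.  This is a
sketch only: a 64-bit limb is assumed, sizes are tracked by a level counter
rather than actual limb counts, and @code{factorial_stack} is an invented
name.

```python
import math

def factorial_stack(n, limb_bits=64):
    """Compute n! by merging equal-sized sub-products held on a stack."""
    stack = []                         # entries are (level, product)

    def push(p):
        level = 0
        # Two products of the same level are multiplied as soon as both exist.
        while stack and stack[-1][0] == level:
            p *= stack.pop()[1]
            level += 1
        stack.append((level, p))

    acc = 1
    for f in range(2, n + 1):
        if (acc * f).bit_length() > limb_bits:
            push(acc)                  # accumulator is full, one limb's worth
            acc = f
        else:
            acc *= f
    result = acc
    while stack:                       # fold in whatever is left outstanding
        result *= stack.pop()[1]
    return result

# Quick self-check against the standard library.
assert factorial_stack(100) == math.factorial(100)
```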
! 7959:
! 7960: Arranging for multiplications to have operands the same (or nearly the same)
! 7961: size means the Karatsuba and higher multiplication algorithms can be used.
! 7962: And even on sizes below the Karatsuba threshold an N@cross{}N multiply will
! 7963: give a basecase multiply more to work on.
! 7964:
! 7965: An obvious improvement not currently implemented would be to strip factors of
! 7966: 2 from the products and apply them at the end with a bit shift. Another
! 7967: possibility would be to determine the prime factorization of the result (which
! 7968: can be done easily), and use a powering method, at each stage squaring then
! 7969: multiplying in those primes with a 1 in their exponent at that point. The
! 7970: advantage would be some multiplies turned into squares.
! 7971:
! 7972:
! 7973: @node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms
! 7974: @subsection Binomial Coefficients
! 7975:
! 7976: Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated
! 7977: by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) =
! 7978: \left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then
! 7979: evaluating the following product simply from @math{i=2} to @math{i=k}.
! 7980: @tex
! 7981: $$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
! 7982: @end tex
! 7983: @ifnottex
! 7984:
! 7985: @example
! 7986:                     k  (n-k+i)
! 7987: C(n,k) = (n-k+1) * prod -------
! 7988:                    i=2    i
! 7989: @end example
! 7990:
! 7991: @end ifnottex
! 7992: It's easy to show that each denominator @math{i} will divide the product so
! 7993: far, so the exact division algorithm is used (@pxref{Exact Division}).
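
In Python the product might be written as below.  The sketch works one factor
at a time, without the limb-sized accumulation, and uses ordinary floor
division where GMP would use its exact division routine.  The function name
@code{binomial} is chosen for this example.

```python
def binomial(n, k):
    """Return C(n,k) using the running product with exact divisions."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)                  # arrange k <= n/2
    if k == 0:
        return 1
    result = n - k + 1
    for i in range(2, k + 1):
        # i always divides the product so far, so this division is exact.
        result = result * (n - k + i) // i
    return result
```

For example @code{binomial(10, 3)} goes through 8, then 8*9/2 = 36, then
36*10/3 = 120.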
! 7994:
! 7995: The numerators @math{n-k+i} and denominators @math{i} are first accumulated,
! 7996: as many as fit in a limb, to save multi-precision operations, though for
! 7997: @code{mpz_bin_ui} this applies only to the divisors, since @math{n} is an
! 7998: @code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all.
! 7999:
! 8000: An obvious improvement would be to strip factors of 2 from each multiplier and
! 8001: divisor and count them separately, to be applied with a bit shift at the end.
! 8002: Factors of 3 and perhaps 5 could even be handled similarly. Another
! 8003: possibility, if @math{n} is not too big, would be to determine the prime
! 8004: factorization of the result based on the factorials involved, and power up
! 8005: those primes appropriately. This would help most when @math{k} is near
! 8006: @math{n/2}.
! 8007:
! 8008:
! 8009: @node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
! 8010: @subsection Fibonacci Numbers
! 8011:
! 8012: The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
! 8013: for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
! 8014: values efficiently.
! 8015:
! 8016: For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is
! 8017: used. On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
! 8018: up to @m{F_{93},F[93]}. For convenience the table starts at @m{F_{-1},F[-1]}.
! 8019:
! 8020: Beyond the table, values are generated with a binary powering algorithm,
! 8021: calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
! 8022: low across the bits of @math{n}. The formulas used are
! 8023: @tex
! 8024: $$\eqalign{
! 8025: F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
! 8026: F_{2k-1} &= F_k^2 + F_{k-1}^2 \cr
! 8027: F_{2k} &= F_{2k+1} - F_{2k-1}
! 8028: }$$
! 8029: @end tex
! 8030: @ifnottex
! 8031:
! 8032: @example
! 8033: F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
! 8034: F[2k-1] = F[k]^2 + F[k-1]^2
! 8035:
! 8036: F[2k] = F[2k+1] - F[2k-1]
! 8037: @end example
! 8038:
! 8039: @end ifnottex
! 8040: At each step, @math{k} is the high @math{b} bits of @math{n}. If the next bit
! 8041: of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if
! 8042: it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process
! 8043: repeated until all bits of @math{n} are incorporated. Notice these formulas
! 8044: require just two squares per bit of @math{n}.
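
The powering loop can be sketched in Python as follows.  For simplicity this
version starts from @math{F[1]}, @math{F[0]} at the top bit of @math{n} rather
than from a table entry, and @code{fib_pair} is a name made up for the
example.

```python
def fib_pair(n):
    """Return (F[n], F[n-1]) for n >= 1, working down the bits of n."""
    fk, fk1 = 1, 0                     # F[1], F[0]: k is the top bit of n
    k = 1
    for bit in bin(n)[3:]:             # remaining bits, high to low
        sq_k = fk * fk                 # the two squares for this bit
        sq_k1 = fk1 * fk1
        sign = -1 if k % 2 else 1      # (-1)^k
        f2k_plus = 4 * sq_k - sq_k1 + 2 * sign    # F[2k+1]
        f2k_minus = sq_k + sq_k1                  # F[2k-1]
        f2k = f2k_plus - f2k_minus                # F[2k]
        if bit == "1":
            fk, fk1 = f2k_plus, f2k
            k = 2 * k + 1
        else:
            fk, fk1 = f2k, f2k_minus
            k = 2 * k
    return fk, fk1
```

With @math{n=10} (binary 1010) the pair moves through @math{k} = 1, 2, 5, 10,
ending at @math{F[10]=55}, @math{F[9]=34}.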
! 8045:
! 8046: It'd be possible to handle the first few @math{n} above the single limb table
! 8047: with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} =
! 8048: F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually
! 8049: turns out to be faster for only about 10 or 20 values of @math{n}, and
! 8050: including a block of code for just those doesn't seem worthwhile. If they
! 8051: really mattered it'd be better to extend the data table.
! 8052:
! 8053: Using a table avoids lots of calculations on small numbers, and makes small
! 8054: @math{n} go fast. A bigger table would make more small @math{n} go fast, it's
! 8055: just a question of balancing size against desired speed. For GMP the code is
! 8056: kept compact, with the emphasis primarily on a good powering algorithm.
! 8057:
! 8058: @code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but
! 8059: @code{mpz_fib_ui} is only interested in @m{F_n,F[n]}. In this case the last
! 8060: step of the algorithm can become one multiply instead of two squares. One of
! 8061: the following two formulas is used, according to whether @math{n} is even or odd.
! 8062: @tex
! 8063: $$\eqalign{
! 8064: F_{2k} &= F_k (F_k + 2F_{k-1}) \cr
! 8065: F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k
! 8066: }$$
! 8067: @end tex
! 8068: @ifnottex
! 8069:
! 8070: @example
! 8071: F[2k] = F[k]*(F[k]+2F[k-1])
! 8072:
! 8073: F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k
! 8074: @end example
! 8075:
! 8076: @end ifnottex
! 8077: @m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a
! 8078: multiply. For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above
! 8079: can be applied just to the low limb of the calculation, without a carry or
! 8080: borrow into further limbs, which saves some code size. See comments with
! 8081: @code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done.
! 8082:
! 8083:
! 8084: @node Lucas Numbers Algorithm, , Fibonacci Numbers Algorithm, Other Algorithms
! 8085: @subsection Lucas Numbers
! 8086:
! 8087: @code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci
! 8088: numbers with the following simple formulas.
! 8089: @tex
! 8090: $$\eqalign{
! 8091: L_k &= F_k + 2F_{k-1} \cr
! 8092: L_{k-1} &= 2F_k - F_{k-1}
! 8093: }$$
! 8094: @end tex
! 8095: @ifnottex
! 8096:
! 8097: @example
! 8098: L[k] = F[k] + 2*F[k-1]
! 8099: L[k-1] = 2*F[k] - F[k-1]
! 8100: @end example
! 8101:
! 8102: @end ifnottex
! 8103: @code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be
! 8104: saved. Trailing zero bits on @math{n} can be handled with a single square
! 8105: each.
! 8106: @tex
! 8107: $$ L_{2k} = L_k^2 - 2(-1)^k $$
! 8108: @end tex
! 8109: @ifnottex
! 8110:
! 8111: @example
! 8112: L[2k] = L[k]^2 - 2*(-1)^k
! 8113: @end example
! 8114:
! 8115: @end ifnottex
! 8116: And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci
! 8117: numbers, similar to what @code{mpz_fib_ui} does.
! 8118: @tex
! 8119: $$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$
! 8120: @end tex
! 8121: @ifnottex
! 8122:
! 8123: @example
! 8124: L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k
! 8125: @end example
! 8126:
! 8127: @end ifnottex
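
The Lucas formulas above can be tried out with the following Python sketch.
The Fibonacci pair is produced by plain additions here, standing in for a fast
routine, and the names @code{fib_pair}, @code{lucnum2} and @code{lucnum} are
illustrative only.

```python
def fib_pair(k):
    """Return (F[k], F[k-1]) by simple additions, for k >= 0."""
    f, f_prev = 0, 1                   # F[0], F[-1]
    for _ in range(k):
        f, f_prev = f + f_prev, f
    return f, f_prev

def lucnum2(k):
    """Return (L[k], L[k-1]) from a pair of Fibonacci numbers."""
    fk, fk1 = fib_pair(k)
    return fk + 2 * fk1, 2 * fk - fk1

def lucnum(n):
    """Return L[n] for n >= 0, squaring once per trailing zero bit of n."""
    if n == 0:
        return 2
    zeros = 0
    while n % 2 == 0:                  # strip trailing zero bits
        n //= 2
        zeros += 1
    k = n // 2                         # now n = 2k+1
    fk, fk1 = fib_pair(k)
    sign = -1 if k % 2 else 1          # (-1)^k
    value = 5 * fk1 * (2 * fk + fk1) - 4 * sign   # L[2k+1], one multiply
    index = n
    for _ in range(zeros):
        s = -1 if index % 2 else 1
        value = value * value - 2 * s  # L[2j] = L[j]^2 - 2*(-1)^j
        index *= 2
    return value
```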
! 8128:
! 8129:
! 8130: @node Assembler Coding, , Other Algorithms, Algorithms
! 8131: @section Assembler Coding
! 8132:
! 8133: The assembler subroutines in GMP are the most significant source of speed at
! 8134: small to moderate sizes. At larger sizes algorithm selection becomes more
! 8135: important, but of course speedups in low level routines will still speed up
! 8136: everything proportionally.
! 8137:
! 8138: Carry handling and widening multiplies that are important for GMP can't be
! 8139: easily expressed in C. GCC @code{asm} blocks help a lot and are provided in
! 8140: @file{longlong.h}, but hand coding low level routines invariably offers a
! 8141: speedup over generic C by a factor of anything from 2 to 10.
! 8142:
! 8143: @menu
! 8144: * Assembler Code Organisation::
! 8145: * Assembler Basics::
! 8146: * Assembler Carry Propagation::
! 8147: * Assembler Cache Handling::
! 8148: * Assembler Floating Point::
! 8149: * Assembler SIMD Instructions::
! 8150: * Assembler Software Pipelining::
! 8151: * Assembler Loop Unrolling::
! 8152: @end menu
! 8153:
! 8154:
! 8155: @node Assembler Code Organisation, Assembler Basics, Assembler Coding, Assembler Coding
! 8156: @subsection Code Organisation
! 8157:
! 8158: The various @file{mpn} subdirectories contain machine-dependent code, written
! 8159: in C or assembler. The @file{mpn/generic} subdirectory contains default code,
! 8160: used when there's no machine-specific version of a particular file.
! 8161:
! 8162: Each @file{mpn} subdirectory is for an ISA family. Generally 32-bit and
! 8163: 64-bit variants in a family cannot share code and will have separate
! 8164: directories. Within a family further subdirectories may exist for CPU
! 8165: variants.
! 8166:
! 8167:
! 8168: @node Assembler Basics, Assembler Carry Propagation, Assembler Code Organisation, Assembler Coding
! 8169: @subsection Assembler Basics
! 8170:
! 8171: @code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines
! 8172: for overall GMP performance. All multiplications and divisions come down to
! 8173: repeated calls to these. @code{mpn_add_n}, @code{mpn_sub_n},
! 8174: @code{mpn_lshift} and @code{mpn_rshift} are next most important.
! 8175:
! 8176: On some CPUs assembler versions of the internal functions
! 8177: @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups,
! 8178: mainly through avoiding function call overheads. They can also potentially
! 8179: make better use of a wide superscalar processor.
! 8180:
! 8181: The restrictions on overlaps between sources and destinations
! 8182: (@pxref{Low-level Functions}) are designed to facilitate a variety of
! 8183: implementations. For example, knowing @code{mpn_add_n} won't have partly
! 8184: overlapping sources and destination means reading can be done far ahead of
! 8185: writing on superscalar processors, and loops can be vectorized on a vector
! 8186: processor, depending on the carry handling.
! 8187:
! 8188:
! 8189: @node Assembler Carry Propagation, Assembler Cache Handling, Assembler Basics, Assembler Coding
! 8190: @subsection Carry Propagation
! 8191:
! 8192: The problem that presents most challenges in GMP is propagating carries from
! 8193: one limb to the next. In functions like @code{mpn_addmul_1} and
! 8194: @code{mpn_add_n}, carries are the only dependencies between limb operations.
! 8195:
! 8196: On processors with carry flags, a straightforward CISC style @code{adc} is
! 8197: generally best. AMD K6 @code{mpn_addmul_1} however is an example of an
! 8198: unusual set of circumstances where a branch works out better.
! 8199:
! 8200: On RISC processors generally an add and compare for overflow is used. This
! 8201: sort of thing can be seen in @file{mpn/generic/aors_n.c}. Some carry
! 8202: propagation schemes require 4 instructions, meaning at least 4 cycles per
! 8203: limb, but other schemes may use just 1 or 2. On wide superscalar processors
! 8204: performance may be completely determined by the number of dependent
! 8205: instructions between carry-in and carry-out for each limb.
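
The add and compare idiom can be modelled in Python by masking to the limb
size.  This is only a model of the arithmetic under the assumption of a 64-bit
limb, not of any particular instruction sequence, and @code{add_n} here is a
stand-in written for the example rather than the GMP routine.

```python
LIMB_BITS = 64                         # assumed limb size
MASK = (1 << LIMB_BITS) - 1

def add_n(a, b):
    """Add two equal-length limb lists, least significant limb first.

    Returns (sum_limbs, carry_out).
    """
    out = []
    carry = 0
    for x, y in zip(a, b):
        s = (x + y) & MASK
        c1 = 1 if s < x else 0         # wrapped around: carry from x + y
        r = (s + carry) & MASK
        c2 = 1 if r < s else 0         # wrapped around: carry from adding carry-in
        out.append(r)
        carry = c1 | c2                # the two can never both be set
    return out, carry
```

The chain from carry-in to carry-out is the second add, the second compare and
the OR, which is the dependent sequence that limits speed on a wide machine.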
! 8206:
! 8207: On vector processors good use can be made of the fact that a carry bit only
! 8208: very rarely propagates more than one limb. When adding a single bit to a
! 8209: limb, there's only a carry out if that limb was @code{0xFF...FF} which on
! 8210: random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}},
! 8211: 2^mp_bits_per_limb}. @file{mpn/cray/add_n.c} is an example of this, it adds
! 8212: all limbs in parallel, adds one set of carry bits in parallel and then only
! 8213: rarely needs to fall through to a loop propagating further carries.
! 8214:
! 8215: On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code
! 8216: for the RISC style idioms that are necessary to handle carry bits in
! 8217: C. Often conditional jumps are generated where @code{adc} or @code{sbb} forms
! 8218: would be better. And so unfortunately almost any loop involving carry bits
! 8219: needs to be coded in assembler for best results.
! 8220:
! 8221:
! 8222: @node Assembler Cache Handling, Assembler Floating Point, Assembler Carry Propagation, Assembler Coding
! 8223: @subsection Cache Handling
! 8224:
! 8225: GMP aims to perform well both on operands that fit entirely in L1 cache and
! 8226: those which don't.
! 8227:
! 8228: Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on
! 8229: large operands, so L2 and main memory performance is important for them.
! 8230: @code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and
! 8231: square basecases, so L1 performance matters most for them, unless assembler
! 8232: versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in
! 8233: which case the remaining uses are mostly for larger operands.
! 8234:
! 8235: For L2 or main memory operands, memory access times will almost certainly be
! 8236: more than the calculation time. The aim therefore is to maximize memory
! 8237: throughput, by starting a load of the next cache line while processing the
! 8238: contents of the previous one. Clearly this is only possible if the chip has a
! 8239: lock-up free cache or some sort of prefetch instruction. Most current chips
! 8240: have both these features.
! 8241:
! 8242: Prefetching sources combines well with loop unrolling, since a prefetch can be
! 8243: initiated once per unrolled loop (or more than once if the loop covers more
! 8244: than one cache line).
! 8245:
! 8246: On CPUs without write-allocate caches, prefetching destinations will ensure
! 8247: individual stores don't go further down the cache hierarchy, limiting
! 8248: bandwidth. Of course for calculations which are slow anyway, like
! 8249: @code{mpn_divrem_1}, write-throughs might be fine.
! 8250:
! 8251: The distance ahead to prefetch will be determined by memory latency versus
! 8252: throughput. The aim of course is to have data arriving continuously, at peak
! 8253: throughput. Some CPUs have limits on the number of fetches or prefetches in
! 8254: progress.
! 8255:
! 8256: If a special prefetch instruction doesn't exist then a plain load can be used,
! 8257: but in that case care must be taken not to attempt to read past the end of an
! 8258: operand, since that might produce a segmentation violation.
! 8259:
! 8260: Some CPUs or systems have hardware that detects sequential memory accesses and
! 8261: initiates suitable cache movements automatically, making life easy.
! 8262:
! 8263:
! 8264: @node Assembler Floating Point, Assembler SIMD Instructions, Assembler Cache Handling, Assembler Coding
! 8265: @subsection Floating Point
! 8266:
! 8267: Floating point arithmetic is used in GMP for multiplications on CPUs with poor
! 8268: integer multipliers. It's mostly useful for @code{mpn_mul_1},
! 8269: @code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and
! 8270: @code{mpn_mul_basecase} on both 32-bit and 64-bit machines.
! 8271:
! 8272: With IEEE 53-bit double precision floats, integer multiplications producing up
! 8273: to 53 bits will give exact results. Breaking a 64@cross{}64 multiplication
! 8274: into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient. With
! 8275: some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be
! 8276: used, if one of the lower two 21-bit pieces also uses the sign bit.
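
The exactness of the 16@cross{}32 bit pieces can be checked with a short
Python sketch, assuming Python's @code{float} is an IEEE double as it is on
the usual platforms.  The function name @code{mul_64x64_via_doubles} is
invented, and the final summation is done with Python integers rather than
with the carry handling a real routine would need.

```python
def mul_64x64_via_doubles(u, v):
    """Multiply two 64-bit values using eight exact floating point products."""
    v_parts = [(v >> s) & 0xFFFF for s in (0, 16, 32, 48)]    # four 16-bit pieces
    u_parts = [(u >> s) & 0xFFFFFFFF for s in (0, 32)]        # two 32-bit pieces
    total = 0
    for i, up in enumerate(u_parts):
        for j, vp in enumerate(v_parts):
            p = float(up) * float(vp)  # below 2^48, so exact in a double
            total += int(p) << (32 * i + 16 * j)
    return total
```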
! 8277:
! 8278: For the @code{mpn_mul_1} family of functions on a 64-bit machine, the
! 8279: invariant single limb is split at the start, into 3 or 4 pieces. Inside the
! 8280: loop, the bignum operand is split into 32-bit pieces. Fast conversion of
! 8281: these unsigned 32-bit pieces to floating point is highly machine-dependent.
! 8282: In some cases, reading the data into the integer unit, zero-extending to
! 8283: 64 bits, then transferring back via memory to the floating point unit is the
! 8284: only option.
! 8285:
! 8286: Converting partial products back to 64-bit limbs is usually best done as a
! 8287: signed conversion. Since all values are smaller than @m{2^{53},2^53}, signed
! 8288: and unsigned are the same, but most processors lack unsigned conversions.
! 8289:
! 8290: @sp 2
! 8291:
! 8292: Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or
! 8293: @code{mpn_addmul_1} with a 64-bit limb. The single limb operand V is split
! 8294: into four 16-bit parts. The multi-limb operand U is split in the loop into
! 8295: two 32-bit parts.
! 8296:
! 8297: @tex
! 8298: \global\newdimen\GMPbits \global\GMPbits=0.18em
! 8299: \def\GMPbox#1#2#3{%
! 8300: \hbox{%
! 8301: \hbox to 128\GMPbits{\hfil
! 8302: \vbox{%
! 8303: \hrule
! 8304: \hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
! 8305: \hrule}%
! 8306: \hskip #1\GMPbits}%
! 8307: \raise \GMPboxdepth \hbox{\hskip 2em #3}}}
! 8308: %
! 8309: \GMPdisplay{%
! 8310: \vbox{%
! 8311: \hbox{%
! 8312: \hbox to 128\GMPbits {\hfil
! 8313: \vbox{%
! 8314: \hrule
! 8315: \hbox to 64\GMPbits{%
! 8316: \GMPvrule \hfil$v48$\hfil
! 8317: \vrule \hfil$v32$\hfil
! 8318: \vrule \hfil$v16$\hfil
! 8319: \vrule \hfil$v00$\hfil
! 8320: \vrule}
! 8321: \hrule}}%
! 8322: \raise \GMPboxdepth \hbox{\hskip 2em V Operand}}
! 8323: \vskip 0.5ex
! 8324: \hbox{%
! 8325: \hbox to 128\GMPbits {\hfil
! 8326: \raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}%
! 8327: \vbox{%
! 8328: \hrule
! 8329: \hbox to 64\GMPbits {%
! 8330: \GMPvrule \hfil$u32$\hfil
! 8331: \vrule \hfil$u00$\hfil
! 8332: \vrule}%
! 8333: \hrule}}%
! 8334: \raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}%
! 8335: \vskip 0.5ex
! 8336: \hbox{\vbox to 2ex{\hrule width 128\GMPbits}}%
! 8337: \GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}%
! 8338: \vskip 0.5ex
! 8339: \GMPbox{16}{u00 \times v16}{$p16$}
! 8340: \vskip 0.5ex
! 8341: \GMPbox{32}{u00 \times v32}{$p32$}
! 8342: \vskip 0.5ex
! 8343: \GMPbox{48}{u00 \times v48}{$p48$}
! 8344: \vskip 0.5ex
! 8345: \GMPbox{32}{u32 \times v00}{$r32$}
! 8346: \vskip 0.5ex
! 8347: \GMPbox{48}{u32 \times v16}{$r48$}
! 8348: \vskip 0.5ex
! 8349: \GMPbox{64}{u32 \times v32}{$r64$}
! 8350: \vskip 0.5ex
! 8351: \GMPbox{80}{u32 \times v48}{$r80$}
! 8352: }}
! 8353: @end tex
! 8354: @ifnottex
! 8355: @example
! 8356: @group
! 8357:                 +---+---+---+---+
! 8358:                 |v48|v32|v16|v00|    V operand
! 8359:                 +---+---+---+---+
! 8360:
! 8361:                 +-------+-------+
! 8362:              x  |  u32  |  u00  |    U operand (one limb)
! 8363:                 +-------+-------+
! 8364:
! 8365: ---------------------------------
! 8366:
! 8367:                     +-----------+
! 8368:                     | u00 x v00 |    p00    48-bit products
! 8369:                     +-----------+
! 8370:                 +-----------+
! 8371:                 | u00 x v16 |        p16
! 8372:                 +-----------+
! 8373:             +-----------+
! 8374:             | u00 x v32 |            p32
! 8375:             +-----------+
! 8376:         +-----------+
! 8377:         | u00 x v48 |                p48
! 8378:         +-----------+
! 8379:             +-----------+
! 8380:             | u32 x v00 |            r32
! 8381:             +-----------+
! 8382:         +-----------+
! 8383:         | u32 x v16 |                r48
! 8384:         +-----------+
! 8385:     +-----------+
! 8386:     | u32 x v32 |                    r64
! 8387:     +-----------+
! 8388: +-----------+
! 8389: | u32 x v48 |                        r80
! 8390: +-----------+
! 8391: @end group
! 8392: @end example
! 8393: @end ifnottex
! 8394:
! 8395: @math{p32} and @math{r32} can be summed using floating-point addition, and
! 8396: likewise @math{p48} and @math{r48}. @math{p00} and @math{p16} can be summed
! 8397: with @math{r64} and @math{r80} from the previous iteration.
! 8398:
! 8399: For each loop then, four 49-bit quantities are transferred to the integer unit,
! 8400: aligned as follows,
! 8401:
! 8402: @tex
! 8403: % GMPbox here should be 49 bits wide, but use 51 to better show p16+r80'
! 8404: % crossing into the upper 64 bits.
! 8405: \def\GMPbox#1#2#3{%
! 8406: \hbox{%
! 8407: \hbox to 128\GMPbits {%
! 8408: \hfil
! 8409: \vbox{%
! 8410: \hrule
! 8411: \hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
! 8412: \hrule}%
! 8413: \hskip #1\GMPbits}%
! 8414: \raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}%
! 8415: }}
! 8416: \newbox\b \setbox\b\hbox{64 bits}%
! 8417: \newdimen\bw \bw=\wd\b \advance\bw by 2em
! 8418: \newdimen\x \x=128\GMPbits
! 8419: \advance\x by -2\bw
! 8420: \divide\x by4
! 8421: \GMPdisplay{%
! 8422: \vbox{%
! 8423: \hbox to 128\GMPbits {%
! 8424: \GMPvrule
! 8425: \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
! 8426: \hfil 64 bits\hfil
! 8427: \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
! 8428: \vrule
! 8429: \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
! 8430: \hfil 64 bits\hfil
! 8431: \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
! 8432: \vrule}%
! 8433: \vskip 0.7ex
! 8434: \GMPbox{0}{p00+r64'}{i00}
! 8435: \vskip 0.5ex
! 8436: \GMPbox{16}{p16+r80'}{i16}
! 8437: \vskip 0.5ex
! 8438: \GMPbox{32}{p32+r32}{i32}
! 8439: \vskip 0.5ex
! 8440: \GMPbox{48}{p48+r48}{i48}
! 8441: }}
! 8442: @end tex
! 8443: @ifnottex
! 8444: @example
! 8445: @group
! 8446: |-----64bits----|-----64bits----|
! 8447:                    +------------+
! 8448:                    | p00 + r64' |    i00
! 8449:                    +------------+
! 8450:                +------------+
! 8451:                | p16 + r80' |        i16
! 8452:                +------------+
! 8453:            +------------+
! 8454:            | p32 + r32  |            i32
! 8455:            +------------+
! 8456:        +------------+
! 8457:        | p48 + r48  |                i48
! 8458:        +------------+
! 8459: @end group
! 8460: @end example
! 8461: @end ifnottex
! 8462:
! 8463: The challenge then is to sum these efficiently and add in a carry limb,
! 8464: generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48}
! 8465: extends 33 bits into the high half).
! 8466:
! 8467:
! 8468: @node Assembler SIMD Instructions, Assembler Software Pipelining, Assembler Floating Point, Assembler Coding
! 8469: @subsection SIMD Instructions
! 8470:
! 8471: The single-instruction multiple-data support in current microprocessors is
! 8472: aimed at signal processing algorithms where each data point can be treated
! 8473: more or less independently. There's generally not much support for
! 8474: propagating the sort of carries that arise in GMP.
! 8475:
! 8476: SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much
! 8477: work as one 32@cross{}32 from GMP's point of view, and need some shifts and
! 8478: adds besides. But of course if say the SIMD form is fully pipelined and uses
! 8479: less instruction decoding then it may still be worthwhile.
! 8480:
! 8481: On the 80x86 chips, MMX has so far found a use in @code{mpn_rshift} and
! 8482: @code{mpn_lshift} since it allows 64-bit operations, and is used in a special
! 8483: case for 16-bit multipliers in the P55 @code{mpn_mul_1}. 3DNow and SSE
! 8484: haven't found a use so far.
! 8485:
! 8486:
! 8487: @node Assembler Software Pipelining, Assembler Loop Unrolling, Assembler SIMD Instructions, Assembler Coding
! 8488: @subsection Software Pipelining
! 8489:
! 8490: Software pipelining consists of scheduling instructions around the branch
! 8491: point in a loop. For example a loop taking a checksum of an array of limbs
! 8492: might have a load and an add, but the load wouldn't be for that add, rather
! 8493: for the one next time around the loop. Each load then is effectively
! 8494: scheduled back in the previous iteration, allowing latency to be hidden.
! 8495:
! 8496: Naturally this is wanted only when doing things like loads or multiplies that
! 8497: take a few cycles to complete, and only where a CPU has multiple functional
! 8498: units so that other work can be done while waiting.
! 8499:
! 8500: A pipeline with several stages will have a data value in progress at each
! 8501: stage and each loop iteration moves them along one stage. This is like
! 8502: juggling.
! 8503:
! 8504: Within the loop some moves between registers may be necessary to have the
! 8505: right values in the right places for each iteration. Loop unrolling can help
! 8506: this, with each unrolled block able to use different registers for different
! 8507: values, even if some shuffling is still needed just before going back to the
! 8508: top of the loop.
! 8509:
! 8510:
! 8511: @node Assembler Loop Unrolling, , Assembler Software Pipelining, Assembler Coding
! 8512: @subsection Loop Unrolling
! 8513:
! 8514: Loop unrolling consists of replicating code so that several limbs are
! 8515: processed in each loop. At a minimum this reduces loop overheads by a
! 8516: corresponding factor, but it can also allow better register usage, for example
! 8517: alternately using one register combination and then another. Judicious use of
! 8518: @command{m4} macros can help avoid lots of duplication in the source code.
! 8519:
! 8520: Unrolling is commonly done to a power of 2 multiple so the number of unrolled
! 8521: loops and the number of remaining limbs can be calculated with a shift and
! 8522: mask. But other multiples can be used too, just by subtracting each @var{n}
! 8523: limbs processed from a counter and waiting for less than @var{n} remaining (or
! 8524: offsetting the counter by @var{n} so it goes negative when there's less than
! 8525: @var{n} remaining).
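
Both counting schemes can be sketched in C, here on a simple sum of limbs with
a plain loop for the excess. This is an illustration under assumed names
(@code{limb_t}, @code{sum_unrolled4}, @code{sum_unrolled3}), not code from
GMP.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;          /* stand-in for mp_limb_t */

/* 4-way unrolling: a power of 2, so a shift and a mask give the counts.  */
limb_t
sum_unrolled4 (const limb_t *p, size_t n)
{
  size_t blocks = n >> 2;         /* passes through the unrolled loop */
  size_t excess = n & 3;          /* limbs remaining afterwards */
  limb_t sum = 0;

  for (; blocks != 0; blocks--)
    {
      sum += p[0];
      sum += p[1];
      sum += p[2];
      sum += p[3];
      p += 4;
    }
  for (; excess != 0; excess--)
    sum += *p++;
  return sum;
}

/* 3-way unrolling: the counter is offset by 3 so that it goes negative
   as soon as fewer than 3 limbs remain.  */
limb_t
sum_unrolled3 (const limb_t *p, size_t n)
{
  ptrdiff_t count = (ptrdiff_t) n - 3;
  limb_t sum = 0;

  for (; count >= 0; count -= 3)
    {
      sum += p[0];
      sum += p[1];
      sum += p[2];
      p += 3;
    }
  for (count += 3; count != 0; count--)   /* 0, 1 or 2 limbs left */
    sum += *p++;
  return sum;
}
```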
! 8526:
! 8527: Limbs left over when the size is not a multiple of the unrolling can be
! 8528: handled in various ways, for example
! 8529:
! 8530: @itemize @bullet
! 8531: @item
! 8532: A simple loop at the end (or the start) to process the excess. Care will be
! 8533: wanted that it isn't too much slower than the unrolled part.
! 8534:
! 8535: @item
! 8536: A set of binary tests, for example after an 8-limb unrolling, test for 4 more
! 8537: limbs to process, then a further 2 more or not, and finally 1 more or not.
! 8538: This will probably take more code space than a simple loop.
! 8539:
! 8540: @item
! 8541: A @code{switch} statement, providing separate code for each possible excess,
! 8542: for example an 8-limb unrolling would have separate code for 0 remaining, 1
! 8543: remaining, etc, up to 7 remaining. This might take a lot of code, but may be
! 8544: the best way to optimize all cases in combination with a deep pipelined loop.
! 8545:
! 8546: @item
! 8547: A computed jump into the middle of the loop, thus making the first iteration
! 8548: handle the excess. This should make times smoothly increase with size, which
! 8549: is attractive, but setups for the jump and adjustments for pointers can be
! 8550: tricky and could become quite difficult in combination with deep pipelining.
! 8551: @end itemize
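
In C the nearest thing to a computed jump into the middle of a loop is the
idiom known as Duff's device, a @code{switch} whose cases sit inside the loop
body. The sketch below uses it on a 4-way unrolled sum. It's only an
illustration of the last approach above (@code{sum_jump_in} is a made-up
name), and assembler code would normally calculate the entry address rather
than go through a @code{switch}.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;          /* stand-in for mp_limb_t */

/* Sum n limbs with a 4-way unrolled loop, entering the loop part way
   through so the first pass handles the excess.  */
limb_t
sum_jump_in (const limb_t *p, size_t n)
{
  limb_t sum = 0;
  size_t passes;

  if (n == 0)
    return 0;
  passes = (n + 3) / 4;           /* the first pass may be partial */

  switch (n % 4)                  /* jump into the middle of the loop */
    {
    case 0: do { sum += *p++;
    case 3:      sum += *p++;
    case 2:      sum += *p++;
    case 1:      sum += *p++;
               } while (--passes != 0);
    }
  return sum;
}
```

Here the pointer needs no adjustment because it's simply post-incremented, but
with indexed addressing at fixed offsets the pointers would have to be biased
before the jump, which is where the trickiness mentioned above comes in.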
! 8552:
! 8553: One way to write the setups and finishups for a pipelined unrolled loop is
! 8554: simply to duplicate the loop at the start and the end, then delete
! 8555: instructions at the start which have no valid antecedents, and delete
! 8556: instructions at the end whose results are unwanted. Sizes not a multiple of
! 8557: the unrolling can then be handled as desired.
! 8558:
! 8559:
! 8560: @node Internals, Contributors, Algorithms, Top
! 8561: @chapter Internals
! 8562:
! 8563: @strong{This chapter is provided only for informational purposes and the
! 8564: various internals described here may change in future GMP releases.
! 8565: Applications expecting to be compatible with future releases should use only
! 8566: the documented interfaces described in previous chapters.}
! 8567:
! 8568: @menu
! 8569: * Integer Internals::
! 8570: * Rational Internals::
! 8571: * Float Internals::
! 8572: * Raw Output Internals::
! 8573: * C++ Interface Internals::
! 8574: @end menu
! 8575:
! 8576: @node Integer Internals, Rational Internals, Internals, Internals
! 8577: @section Integer Internals
! 8578:
! 8579: @code{mpz_t} variables represent integers using sign and magnitude, in space
! 8580: dynamically allocated and reallocated. The fields are as follows.
! 8581:
! 8582: @table @asis
! 8583: @item @code{_mp_size}
! 8584: The number of limbs, or the negative of that when representing a negative
! 8585: integer. Zero is represented by @code{_mp_size} set to zero, in which case
! 8586: the @code{_mp_d} data is unused.
! 8587:
! 8588: @item @code{_mp_d}
! 8589: A pointer to an array of limbs which is the magnitude. These are stored
! 8590: ``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the
! 8591: least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most
! 8592: significant. Whenever @code{_mp_size} is non-zero, the most significant limb
! 8593: is non-zero.
! 8594:
! 8595: Currently there's always at least one limb allocated, so for instance
! 8596: @code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
! 8597: @code{_mp_d[0]} unconditionally (though its value is then only wanted if
! 8598: @code{_mp_size} is non-zero).
! 8599:
! 8600: @item @code{_mp_alloc}
! 8601: @code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
! 8602: and naturally @code{_mp_alloc >= ABS(_mp_size)}. When an @code{mpz} routine
! 8603: is about to (or might be about to) increase @code{_mp_size}, it checks
! 8604: @code{_mp_alloc} to see whether there's enough space, and reallocates if not.
! 8605: @code{MPZ_REALLOC} is generally used for this.
! 8606: @end table
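
The way the three fields work together can be sketched with a toy structure.
This is not GMP's actual definition: @code{toy_mpz} and its functions are
invented for the illustration, and memory comes straight from @code{malloc}
rather than from GMP's allocation functions.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t limb_t;        /* stand-in for mp_limb_t */

typedef struct
{
  int     alloc;                /* plays the role of _mp_alloc */
  int     size;                 /* plays the role of _mp_size */
  limb_t *d;                    /* plays the role of _mp_d */
} toy_mpz;

/* Start with one limb allocated and the value zero.  Returns 0 on
   success, -1 if memory couldn't be obtained.  */
int
toy_init (toy_mpz *x)
{
  x->d = malloc (sizeof (limb_t));
  if (x->d == NULL)
    return -1;
  x->alloc = 1;
  x->size = 0;
  return 0;
}

/* Make room for at least n limbs, as a routine would do before
   increasing size.  Existing limbs are preserved.  */
int
toy_realloc (toy_mpz *x, int n)
{
  if (n > x->alloc)
    {
      limb_t *p = realloc (x->d, (size_t) n * sizeof (limb_t));
      if (p == NULL)
        return -1;
      x->d = p;
      x->alloc = n;
    }
  return 0;
}

/* Store a value fitting one limb.  One limb is always allocated, so
   no reallocation is needed.  */
void
toy_set_si (toy_mpz *x, int64_t v)
{
  x->d[0] = v < 0 ? -(limb_t) v : (limb_t) v;   /* magnitude */
  x->size = (v > 0) - (v < 0);                  /* -1, 0 or 1 limb */
}

/* The sign is read from size alone, the limb data isn't examined.  */
int
toy_sgn (const toy_mpz *x)
{
  return (x->size > 0) - (x->size < 0);
}
```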
! 8607:
! 8608: The various bitwise logical functions like @code{mpz_and} behave as if
! 8609: negative values were twos complement. But sign and magnitude is always used
! 8610: internally, and necessary adjustments are made during the calculations.
! 8611: Sometimes this isn't pretty, but sign and magnitude are best for other
! 8612: routines.
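
The kind of adjustment involved can be seen in a single-word sketch of a
logical AND. It relies on the identity that the twos complement form of
@math{-m} is the bitwise complement of @math{m-1}. The type @code{sm_t} and
the function names are invented for the example, magnitudes are assumed to be
below @math{2^63}, and the real @code{mpz_and} of course works on whole limb
arrays.

```c
#include <stdint.h>

typedef struct
{
  int      neg;                 /* 1 if negative, 0 otherwise */
  uint64_t mag;                 /* absolute value, assumed < 2^63 */
} sm_t;

sm_t
sm_from_int (int64_t v)
{
  sm_t r;
  r.neg = v < 0;
  r.mag = v < 0 ? -(uint64_t) v : (uint64_t) v;
  return r;
}

int64_t
sm_to_int (sm_t x)
{
  return x.neg ? -(int64_t) x.mag : (int64_t) x.mag;
}

/* AND of two sign and magnitude values, giving the result twos
   complement operands would give.  */
sm_t
sm_and (sm_t a, sm_t b)
{
  sm_t r;

  if (!a.neg && !b.neg)
    {
      r.neg = 0;
      r.mag = a.mag & b.mag;
    }
  else if (a.neg && b.neg)
    {
      /* ~(ma-1) & ~(mb-1) = ~((ma-1) | (mb-1)), a negative value
         whose magnitude is ((ma-1) | (mb-1)) + 1 */
      r.neg = 1;
      r.mag = ((a.mag - 1) | (b.mag - 1)) + 1;
    }
  else
    {
      /* positive & ~(m-1) is non-negative */
      uint64_t pos = a.neg ? b.mag : a.mag;
      uint64_t m   = a.neg ? a.mag : b.mag;
      r.neg = 0;
      r.mag = pos & ~(m - 1);
    }
  return r;
}
```

For instance @math{-6} and @math{-4} have magnitudes 6 and 4, so the result is
negative with magnitude @math{(5 | 3) + 1 = 8}, ie.@: @math{-8}.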
! 8613:
! 8614: Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
! 8615: have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
! 8616: allocation functions. Care is taken to ensure that these are big enough that
! 8617: no reallocation is necessary (since it would have unpredictable consequences).
! 8618:
! 8619:
! 8620: @node Rational Internals, Float Internals, Integer Internals, Internals
! 8621: @section Rational Internals
! 8622:
! 8623: @code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
! 8624: denominator (@pxref{Integer Internals}).
! 8625:
! 8626: The canonical form adopted is denominator positive (and non-zero), no common
! 8627: factors between numerator and denominator, and zero uniquely represented as
! 8628: 0/1.
! 8629:
! 8630: It's believed that casting out common factors at each stage of a calculation
! 8631: is best in general. A GCD is an @math{O(N^2)} operation so it's better to do
! 8632: a few small ones immediately than to delay and have to do a big one later.
! 8633: Knowing the numerator and denominator have no common factors can be used for
! 8634: example in @code{mpq_mul} to make only two cross GCDs necessary, not four.
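
The two cross GCDs can be sketched with machine integers as follows. This is
an illustration of the idea only, with invented names (@code{rat_t},
@code{rat_mul}) and no overflow handling. Both inputs are assumed to be in
canonical form already.

```c
#include <stdint.h>

typedef struct
{
  int64_t num;                  /* numerator */
  int64_t den;                  /* denominator, > 0, coprime to num */
} rat_t;

static int64_t
gcd64 (int64_t a, int64_t b)
{
  if (a < 0) a = -a;
  if (b < 0) b = -b;
  while (b != 0)
    {
      int64_t t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Multiply two canonical rationals.  Since num and den of each operand
   have no common factor, the only possible cancellations are between
   one operand's numerator and the other's denominator.  */
rat_t
rat_mul (rat_t x, rat_t y)
{
  int64_t g1 = gcd64 (x.num, y.den);
  int64_t g2 = gcd64 (y.num, x.den);
  rat_t r;

  r.num = (x.num / g1) * (y.num / g2);
  r.den = (x.den / g2) * (y.den / g1);
  return r;                     /* already canonical, no further GCD */
}
```

The GCDs are taken on the small inputs rather than on the bigger products,
and with zero held as 0/1 the zero case comes out as 0/1 without special
handling.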
! 8635:
! 8636: This general approach to common factors is badly sub-optimal in the presence
! 8637: of simple factorizations or little prospect for cancellation, but GMP has no
! 8638: way to know when this will occur. As per @ref{Efficiency}, that's left to
! 8639: applications. The @code{mpq_t} framework might still suit, with
! 8640: @code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and
! 8641: denominator, or of course @code{mpz_t} variables can be used directly.
! 8642:
! 8643:
! 8644: @node Float Internals, Raw Output Internals, Rational Internals, Internals
! 8645: @section Float Internals
! 8646:
! 8647: Efficient calculation is the primary aim of GMP floats and the use of whole
! 8648: limbs and simple rounding facilitates this.
! 8649:
! 8650: @code{mpf_t} floats have a variable precision mantissa and a single machine
! 8651: word signed exponent. The mantissa is represented using sign and magnitude.
! 8652:
! 8653: @c FIXME: The arrow heads don't join to the lines exactly.
! 8654: @tex
! 8655: \global\newdimen\GMPboxwidth \GMPboxwidth=5em
! 8656: \global\newdimen\GMPboxheight \GMPboxheight=3ex
! 8657: \def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
! 8658: \GMPdisplay{%
! 8659: \vbox{%
! 8660: \hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb}
! 8661: \vskip 0.7ex
! 8662: \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
! 8663: \hbox {
! 8664: \hbox to 3\GMPboxwidth {%
! 8665: \setbox 0 = \hbox{@code{\_mp\_exp}}%
! 8666: \dimen0=3\GMPboxwidth
! 8667: \advance\dimen0 by -\wd0
! 8668: \divide\dimen0 by 2
! 8669: \advance\dimen0 by -1em
! 8670: \setbox1 = \hbox{$\rightarrow$}%
! 8671: \dimen1=\dimen0
! 8672: \advance\dimen1 by -\wd1
! 8673: \GMPcentreline{\dimen0}%
! 8674: \hfil
! 8675: \box0%
! 8676: \hfil
! 8677: \GMPcentreline{\dimen1{}}%
! 8678: \box1}
! 8679: \hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}}
! 8680: \vskip 0.5ex
! 8681: \vbox {%
! 8682: \hrule
! 8683: \hbox{%
! 8684: \vrule height 2ex depth 1ex
! 8685: \hbox to \GMPboxwidth {}%
! 8686: \vrule
! 8687: \hbox to \GMPboxwidth {}%
! 8688: \vrule
! 8689: \hbox to \GMPboxwidth {}%
! 8690: \vrule
! 8691: \hbox to \GMPboxwidth {}%
! 8692: \vrule
! 8693: \hbox to \GMPboxwidth {}%
! 8694: \vrule}
! 8695: \hrule
! 8696: }
! 8697: \hbox {%
! 8698: \hbox to 0.8 pt {}
! 8699: \hbox to 3\GMPboxwidth {%
! 8700: \hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}}
! 8701: \hbox to 5\GMPboxwidth{%
! 8702: \setbox 0 = \hbox{@code{\_mp\_size}}%
! 8703: \dimen0 = 5\GMPboxwidth
! 8704: \advance\dimen0 by -\wd0
! 8705: \divide\dimen0 by 2
! 8706: \advance\dimen0 by -1em
! 8707: \dimen1 = \dimen0
! 8708: \setbox1 = \hbox{$\leftarrow$}%
! 8709: \setbox2 = \hbox{$\rightarrow$}%
! 8710: \advance\dimen0 by -\wd1
! 8711: \advance\dimen1 by -\wd2
! 8712: \hbox to 0.3 em {}%
! 8713: \box1
! 8714: \GMPcentreline{\dimen0}%
! 8715: \hfil
! 8716: \box0
! 8717: \hfil
! 8718: \GMPcentreline{\dimen1}%
! 8719: \box2}
! 8720: }}
! 8721: @end tex
! 8722: @ifnottex
! 8723: @example
! 8724:    most                   least
! 8725: significant            significant
! 8726:    limb                   limb
! 8727:
! 8728:                           _mp_d
! 8729:  |---- _mp_exp --->         |
! 8730:   _____ _____ _____ _____ _____
! 8731:  |_____|_____|_____|_____|_____|
! 8732:                    . <------------ radix point
! 8733:
! 8734:   <-------- _mp_size --------->
! 8735: @sp 1
! 8736: @end example
! 8737: @end ifnottex
! 8738:
! 8739: @noindent
! 8740: The fields are as follows.
! 8741:
! 8742: @table @asis
! 8743: @item @code{_mp_size}
! 8744: The number of limbs currently in use, or the negative of that when
! 8745: representing a negative value. Zero is represented by @code{_mp_size} and
! 8746: @code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is
! 8747: unused. (In the future @code{_mp_exp} might be undefined when representing
! 8748: zero.)
! 8749:
! 8750: @item @code{_mp_prec}
! 8751: The precision of the mantissa, in limbs. In any calculation the aim is to
! 8752: produce @code{_mp_prec} limbs of result (the most significant being non-zero).
! 8753:
! 8754: @item @code{_mp_d}
! 8755: A pointer to the array of limbs which is the absolute value of the mantissa.
! 8756: These are stored ``little endian'' as per the @code{mpn} functions, so
! 8757: @code{_mp_d[0]} is the least significant limb and
! 8758: @code{_mp_d[ABS(_mp_size)-1]} the most significant.
! 8759:
! 8760: The most significant limb is always non-zero, but there are no other
! 8761: restrictions on its value, in particular the highest 1 bit can be anywhere
! 8762: within the limb.
! 8763:
! 8764: @code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being
! 8765: for convenience (see below). There are no reallocations during a calculation,
! 8766: only in a change of precision with @code{mpf_set_prec}.
! 8767:
! 8768: @item @code{_mp_exp}
! 8769: The exponent, in limbs, determining the location of the implied radix point.
! 8770: Zero means the radix point is just above the most significant limb. Positive
! 8771: values mean a radix point offset towards the lower limbs and hence a value
! 8772: @math{@ge{} 1}, as for example in the diagram above. Negative exponents mean
! 8773: a radix point further above the highest limb.
! 8774:
! 8775: Naturally the exponent can be any value, it doesn't have to fall within the
! 8776: limbs as the diagram shows, it can be a long way above or a long way below.
! 8777: Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
! 8778: are treated as zero.
! 8779: @end table
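
How the fields determine a value can be sketched by converting to a
@code{double}. The function below is a toy written for this illustration
(32-bit limbs assumed, no overflow or rounding care), not GMP's
@code{mpf_get_d}.

```c
#include <stdint.h>

typedef uint32_t limb_t;        /* assumed 32-bit limb */

/* Value represented by limbs d[0..abs(size)-1], least significant
   first, with the radix point exp limbs below the top of the data.  */
double
toy_mpf_get_d (const limb_t *d, int size, long exp)
{
  const double base = 4294967296.0;     /* 2^32 */
  int n = size < 0 ? -size : size;
  double v = 0.0;
  long e;
  int i;

  /* the limbs taken as an integer, most significant first */
  for (i = n - 1; i >= 0; i--)
    v = v * base + d[i];

  /* exp of the n limbs are above the radix point, so the integer is
     too big by a factor base^(n-exp) */
  for (e = exp - n; e > 0; e--)
    v *= base;
  for (e = exp - n; e < 0; e++)
    v /= base;

  return size < 0 ? -v : v;
}
```

With limbs @code{0x80000000} (low) and @code{1} (high), a size of 2 and an
exponent of 1, the high limb is the integer part and the low limb the
fraction, giving 1.5.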
! 8780:
! 8781: @sp 1
! 8782: @noindent
! 8783: The following various points should be noted.
! 8784:
! 8785: @table @asis
! 8786: @item Low Zeros
! 8787: The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
! 8788: zeros can always be ignored. Routines likely to produce low zeros check and
! 8789: avoid them to save time in subsequent calculations, but for most routines
! 8790: they're quite unlikely and aren't checked.
! 8791:
! 8792: @item Mantissa Size Range
! 8793: The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
! 8794: the value can be represented in less. This means low precision values or
! 8795: small integers stored in a high precision @code{mpf_t} can still be operated
! 8796: on efficiently.
! 8797:
! 8798: @code{_mp_size} can also be greater than @code{_mp_prec}. Firstly a value is
! 8799: allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
! 8800: and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
! 8801: @code{_mp_size} unchanged and so the size can be arbitrarily bigger than
! 8802: @code{_mp_prec}.
! 8803:
! 8804: @item Rounding
! 8805: All rounding is done on limb boundaries. Calculating @code{_mp_prec} limbs
! 8806: with the high non-zero will ensure the application requested minimum precision
! 8807: is obtained.
! 8808:
! 8809: The use of simple ``trunc'' rounding towards zero is efficient, since there's
! 8810: no need to examine extra limbs and increment or decrement.
! 8811:
! 8812: @item Bit Shifts
! 8813: Since the exponent is in limbs, there are no bit shifts in basic operations
! 8814: like @code{mpf_add} and @code{mpf_mul}. When differing exponents are
! 8815: encountered all that's needed is to adjust pointers to line up the relevant
! 8816: limbs.
! 8817:
! 8818: Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts,
! 8819: but the choice is between an exponent in limbs which requires shifts there, or
! 8820: one in bits which requires them almost everywhere else.
! 8821:
! 8822: @item Use of @code{_mp_prec+1} Limbs
! 8823: The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just
! 8824: @code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its
! 8825: operation. @code{mpf_add} for instance will do an @code{mpn_add} of
! 8826: @code{_mp_prec} limbs. If there's no carry then that's the result, but if
! 8827: there is a carry then it's stored in the extra limb of space and
! 8828: @code{_mp_size} becomes @code{_mp_prec+1}.
! 8829:
! 8830: Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not
! 8831: needed for the intended precision, only the @code{_mp_prec} high limbs. But
! 8832: zeroing it out or moving the rest down is unnecessary. Subsequent routines
! 8833: reading the value will simply take the high limbs they need, and this will be
! 8834: @code{_mp_prec} if their target has that same precision. This is no more than
! 8835: a pointer adjustment, and must be checked anyway since the destination
! 8836: precision can be different from the sources.
! 8837:
! 8838: Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs
! 8839: if available. This ensures that a variable which has @code{_mp_size} equal to
! 8840: @code{_mp_prec+1} will get its full exact value copied. Strictly speaking
! 8841: this is unnecessary since only @code{_mp_prec} limbs are needed for the
! 8842: application's requested precision, but it's considered that an @code{mpf_set}
! 8843: from one variable into another of the same precision ought to produce an exact
! 8844: copy.
! 8845:
! 8846: @item Application Precisions
! 8847: @code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
! 8848: @code{_mp_prec}. The value in bits is rounded up to a whole number of limbs,
! 8849: then an extra limb is added, since the most significant limb of @code{_mp_d}
! 8850: is only guaranteed to be non-zero and therefore might contain only one bit.
! 8851:
! 8852: @code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
! 8853: limb from @code{_mp_prec} before converting to bits. The net effect of
! 8854: reading back with @code{mpf_get_prec} is simply the precision rounded up to a
! 8855: multiple of @code{mp_bits_per_limb}.
! 8856:
! 8857: Note that the extra limb added here, to allow for the high limb being only
! 8858: non-zero, is in addition to the extra limb allocated to @code{_mp_d}. For
! 8859: example with a 32-bit limb, an application request for 250 bits will be rounded
! 8860: up to 8 limbs, then an extra limb added for the high limb, giving an
! 8861: @code{_mp_prec} of 9. @code{_mp_d} then gets 10 limbs allocated. Reading
! 8862: back with @code{mpf_get_prec} will take @code{_mp_prec}, subtract 1 limb and
! 8863: multiply by 32, giving 256 bits.
! 8864:
! 8865: Strictly speaking, the fact that the high limb has at least one bit means that a
! 8866: float with, say, 3 limbs of 32 bits each will be holding at least 65 bits, but
! 8867: for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice
! 8868: multiple of the limb size.
! 8869: @end table
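
The precision arithmetic described under ``Application Precisions'' can be
written out as below. These are hypothetical helper functions following the
description in the text, not the @code{__GMPF_BITS_TO_PREC} and
@code{__GMPF_PREC_TO_BITS} macros themselves, which may differ in details
such as imposing a minimum precision.

```c
/* Application precision in bits to a limb count in the style of
   _mp_prec: round up to whole limbs, then one more since the high limb
   might hold only a single bit.  */
long
bits_to_prec (unsigned long bits, unsigned limb_bits)
{
  return (long) ((bits + limb_bits - 1) / limb_bits) + 1;
}

/* The reverse: drop the extra limb and convert back to bits.  */
unsigned long
prec_to_bits (long prec, unsigned limb_bits)
{
  return (unsigned long) (prec - 1) * limb_bits;
}

/* Limbs of space at _mp_d: one more again, for carries.  */
long
limbs_allocated (long prec)
{
  return prec + 1;
}
```

With a 32-bit limb, @code{bits_to_prec (250, 32)} gives 9,
@code{limbs_allocated (9)} gives 10, and @code{prec_to_bits (9, 32)} gives 256,
matching the example in the text.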
! 8870:
! 8871:
! 8872: @node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
! 8873: @section Raw Output Internals
! 8874:
! 8875: @noindent
! 8876: @code{mpz_out_raw} uses the following format.
! 8877:
! 8878: @tex
! 8879: \global\newdimen\GMPboxwidth \GMPboxwidth=5em
! 8880: \global\newdimen\GMPboxheight \GMPboxheight=3ex
! 8881: \def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
! 8882: \GMPdisplay{%
! 8883: \vbox{%
! 8884: \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
! 8885: \vbox {%
! 8886: \hrule
! 8887: \hbox{%
! 8888: \vrule height 2.5ex depth 1.5ex
! 8889: \hbox to \GMPboxwidth {\hfil size\hfil}%
! 8890: \vrule
! 8891: \hbox to 3\GMPboxwidth {\hfil data bytes\hfil}%
! 8892: \vrule}
! 8893: \hrule}
! 8894: }}
! 8895: @end tex
! 8896: @ifnottex
! 8897: @example
! 8898: +------+------------------------+
! 8899: | size |       data bytes       |
! 8900: +------+------------------------+
! 8901: @end example
! 8902: @end ifnottex
! 8903:
! 8904: The size is 4 bytes written most significant byte first, being the number of
! 8905: subsequent data bytes, or the twos complement negative of that when a negative
! 8906: integer is represented. The data bytes are the absolute value of the integer,
! 8907: written most significant byte first.
! 8908:
! 8909: The most significant data byte is always non-zero, so the output is the same
! 8910: on all systems, irrespective of limb size.
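
As a sketch of the format, the following encodes a value small enough for a
machine word. It's written from the description above rather than taken from
@code{mpz_out_raw}, and @code{raw_encode} is a made-up name. Zero is taken
to be a zero size with no data bytes, which is what the description implies.

```c
#include <stddef.h>
#include <stdint.h>

/* Write v at out in the raw format, returning the number of bytes
   produced.  out must have room for 12 bytes.  */
size_t
raw_encode (int64_t v, unsigned char *out)
{
  uint64_t mag = v < 0 ? -(uint64_t) v : (uint64_t) v;
  uint32_t nbytes = 0;
  uint32_t size;
  uint64_t t;
  size_t pos = 0;
  int i;

  /* count the data bytes, so the most significant is non-zero */
  for (t = mag; t != 0; t >>= 8)
    nbytes++;

  /* byte count, or its twos complement negative for a negative value */
  size = v < 0 ? (uint32_t) 0 - nbytes : nbytes;

  for (i = 3; i >= 0; i--)                      /* size, big endian */
    out[pos++] = (unsigned char) (size >> (8 * i));
  for (i = (int) nbytes - 1; i >= 0; i--)       /* data, big endian */
    out[pos++] = (unsigned char) (mag >> (8 * i));
  return pos;
}
```

For example @code{0x1234} becomes the six bytes @code{00 00 00 02 12 34}, and
@math{-1} becomes the five bytes @code{FF FF FF FF 01}.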
! 8911:
! 8912: In GMP 1, leading zero bytes were written to pad the data bytes to a multiple
! 8913: of the limb size. @code{mpz_inp_raw} will still accept this, for
! 8914: compatibility.
! 8915:
! 8916: The use of ``big endian'' for both the size and data fields is deliberate; it
! 8917: makes the data easy to read in a hex dump of a file. Unfortunately it also
! 8918: means that the limb data must be reversed when reading or writing, so neither
! 8919: a big endian nor little endian system can just read and write @code{_mp_d}.
! 8920:
! 8921:
! 8922: @node C++ Interface Internals, , Raw Output Internals, Internals
! 8923: @section C++ Interface Internals
! 8924:
! 8925: A system of expression templates is used to ensure something like @code{a=b+c}
! 8926: turns into a simple call to @code{mpz_add} etc. For @code{mpf_class} and
! 8927: @code{mpfr_class} the scheme also ensures the precision of the final
! 8928: destination is used for any temporaries within a statement like
! 8929: @code{f=w*x+y*z}. These are important features which a naive implementation
! 8930: cannot provide.
! 8931:
! 8932: A simplified description of the scheme follows. The true scheme is
! 8933: complicated by the fact that expressions have different return types. For
! 8934: detailed information, refer to the source code.
! 8935:
! 8936: To perform an operation, say, addition, we first define a ``function object''
! 8937: evaluating it,
! 8938:
! 8939: @example
! 8940: struct __gmp_binary_plus
! 8941: @{
! 8942:   static void eval(mpf_t f, mpf_t g, mpf_t h) @{ mpf_add(f, g, h); @}
! 8943: @};
! 8944: @end example
! 8945:
! 8946: @noindent
! 8947: And an ``additive expression'' object,
! 8948:
! 8949: @example
! 8950: __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
! 8951: operator+(const mpf_class &f, const mpf_class &g)
! 8952: @{
! 8953:   return __gmp_expr
! 8954:     <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
! 8955: @}
! 8956: @end example
! 8957:
! 8958: The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<...>>} is used to
! 8959: encapsulate any possible kind of expression into a single template type. In
! 8960: fact even @code{mpf_class} etc are @code{typedef} specializations of
! 8961: @code{__gmp_expr}.
! 8962:
! 8963: Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.
! 8964:
! 8965: @example
! 8966: template <class T>
! 8967: mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
! 8968: @{
! 8969:   expr.eval(this->get_mpf_t(), this->precision());
! 8970:   return *this;
! 8971: @}
! 8972:
! 8973: template <class Op>
! 8974: void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
! 8975: (mpf_t f, unsigned long int precision)
! 8976: @{
! 8977:   Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
! 8978: @}
! 8979: @end example
1.1 maekawa 8980:
1.1.1.4 ! ohara 8981: where @code{expr.val1} and @code{expr.val2} are references to the expression's
! 8982: operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
! 8983: @code{__gmp_expr}).
! 8984:
! 8985: This way, the expression is actually evaluated only at the time of assignment,
! 8986: when the required precision (that of @code{f}) is known. Furthermore the
! 8987: target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
! 8988: with @code{f} as the output argument.
1.1 maekawa 8989:
1.1.1.4 ! ohara 8990: Compound expressions are handled by defining operators taking subexpressions
! 8991: as their arguments, like this:
1.1 maekawa 8992:
1.1.1.4 ! ohara 8993: @example
! 8994: template <class T, class U>
! 8995: __gmp_expr
! 8996: <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
! 8997: operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
! 8998: @{
! 8999:   return __gmp_expr
! 9000:     <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
! 9001:     (expr1, expr2);
! 9002: @}
! 9003: @end example
! 9004:
! 9005: And the corresponding specializations of @code{__gmp_expr::eval}:
! 9006:
! 9007: @example
! 9008: template <class T, class U, class Op>
! 9009: void __gmp_expr
! 9010: <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
! 9011: (mpf_t f, unsigned long int precision)
! 9012: @{
! 9013:   // declare two temporaries
! 9014:   mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
! 9015:   Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
! 9016: @}
! 9017: @end example
! 9018:
! 9019: The expression is thus recursively evaluated to any level of complexity and
! 9020: all subexpressions are evaluated to the precision of @code{f}.
! 9021:
! 9022:
! 9023: @node Contributors, References, Internals, Top
1.1 maekawa 9024: @comment node-name, next, previous, up
1.1.1.4 ! ohara 9025: @appendix Contributors
1.1.1.2 maekawa 9026: @cindex Contributors
9027:
9028: Torbjorn Granlund wrote the original GMP library and is still developing and
9029: maintaining it. Several other individuals and organizations have contributed
9030: to GMP in various ways. Here is a list in chronological order:
1.1 maekawa 9031:
1.1.1.2 maekawa 9032: Gunnar Sjoedin and Hans Riesel helped with mathematical problems in early
9033: versions of the library.
9034:
9035: Richard Stallman contributed to the interface design and revised the first
9036: version of this manual.
9037:
9038: Brian Beuning and Doug Lea helped with testing of early versions of the
9039: library and made creative suggestions.
1.1 maekawa 9040:
9041: John Amanatides of York University in Canada contributed the function
9042: @code{mpz_probab_prime_p}.
9043:
9044: Paul Zimmermann of Inria sparked the development of GMP 2, with his
9045: comparisons between bignum packages.
9046:
9047: Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
9048: contributed @code{mpz_gcd}, @code{mpz_divexact}, @code{mpn_gcd}, and
9049: @code{mpn_bdivmod}, partially supported by CNPq (Brazil) grant 301314194-2.
9050:
1.1.1.2 maekawa 9051: Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
1.1 maekawa 9052: He has also made valuable suggestions and tested numerous intermediary
9053: releases.
9054:
9055: Joachim Hollman was involved in the design of the @code{mpf} interface, and in
9056: the @code{mpz} design revisions for version 2.
9057:
1.1.1.4 ! ohara 9058: Bennet Yee contributed the initial versions of @code{mpz_jacobi} and
! 9059: @code{mpz_legendre}.
1.1 maekawa 9060:
9061: Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and
1.1.1.4 ! ohara 9062: @file{mpn/m68k/rshift.S} (now in @file{.asm} form).
1.1 maekawa 9063:
1.1.1.2 maekawa 9064: The development of the floating point functions of GNU MP 2 was supported in part
9065: by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO (POlynomial
9066: System SOlving).
9067:
1.1.1.4 ! ohara 9068: GNU MP 2 was finished and released by SWOX AB, SWEDEN, in cooperation with the
! 9069: IDA Center for Computing Sciences, USA.
1.1 maekawa 9070:
1.1.1.2 maekawa 9071: Robert Harley of Inria, France and David Seal of ARM, England, suggested clever
9072: improvements for population count.
1.1 maekawa 9073:
1.1.1.2 maekawa 9074: Robert Harley also wrote highly optimized Karatsuba and 3-way Toom
9075: multiplication functions for GMP 3. He also contributed the ARM assembly
9076: code.
1.1 maekawa 9077:
1.1.1.2 maekawa 9078: Torsten Ekedahl of the Mathematical department of Stockholm University provided
9079: significant inspiration during several phases of the GMP development. His
9080: mathematical expertise helped improve several algorithms.
9081:
1.1.1.4 ! ohara 9082: Paul Zimmermann wrote the Divide and Conquer division code, the REDC code, the
! 9083: REDC-based @code{mpz_powm} code, the FFT multiply code, and the Karatsuba square
! 9084: root. The ECMNET project Paul is organizing was a driving force behind many
! 9085: of the optimizations in GMP 3.
1.1.1.2 maekawa 9086:
9087: Linus Nordberg wrote the new configure system based on autoconf and
9088: implemented the new random functions.
9089:
9090: Kent Boortz made the Macintosh port.
9091:
1.1.1.4 ! ohara 9092: Kevin Ryde worked on a number of things: optimized x86 code, m4 asm macros,
! 9093: parameter tuning, speed measuring, the configure system, function inlining,
! 9094: divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas number
! 9095: functions, printf and scanf functions, perl interface, demo expression parser,
! 9096: the algorithms chapter in the manual, @file{gmpasm-mode.el}, and various
! 9097: miscellaneous improvements elsewhere.
1.1.1.2 maekawa 9098:
9099: Steve Root helped write the optimized alpha 21264 assembly code.
9100:
1.1.1.4 ! ohara 9101: Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++
! 9102: @code{istream} input routines.
! 9103:
! 9104: GNU MP 4.0 was finished and released by Torbjorn Granlund and Kevin Ryde.
1.1.1.2 maekawa 9105: Torbjorn's work was partially funded by the IDA Center for Computing Sciences,
9106: USA.
9107:
9108: (This list is chronological, not ordered by significance. If you have
9109: contributed to GMP but are not listed above, please tell @email{tege@@swox.com}
9110: about the omission!)
9111:
1.1.1.4 ! ohara 9112: Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
! 9113: environment.
! 9114:
! 9115: @node References, GNU Free Documentation License, Contributors, Top
1.1 maekawa 9116: @comment node-name, next, previous, up
1.1.1.4 ! ohara 9117: @appendix References
1.1.1.2 maekawa 9118: @cindex References
1.1 maekawa 9119:
1.1.1.4 ! ohara 9120: @c FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
! 9121: @c but being long words they upset paragraph formatting (the preceding line
! 9122: @c can get badly stretched). Would like a conditional @* style line break
! 9123: @c if the uref is too long to fit on the last line of the paragraph, but it's
! 9124: @c not clear how to do that. For now explicit @texlinebreak{}s are used on
! 9125: @c paragraphs that come out bad.
! 9126:
! 9127: @section Books
! 9128:
1.1 maekawa 9129: @itemize @bullet
1.1.1.4 ! ohara 9130: @item
! 9131: Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in
! 9132: Analytic Number Theory and Computational Complexity'', John Wiley & Sons,
! 9133: 1998.
! 9134:
! 9135: @item
! 9136: Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate
! 9137: Texts in Mathematics number 138, Springer-Verlag, 1993.
! 9138: @texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen}
1.1 maekawa 9139:
9140: @item
1.1.1.4 ! ohara 9141: Donald E. Knuth, ``The Art of Computer Programming'', volume 2,
! 9142: ``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998.
! 9143: @texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html}
1.1 maekawa 9144:
9145: @item
1.1.1.4 ! ohara 9146: John D. Lipson, ``Elements of Algebra and Algebraic Computing'',
1.1 maekawa 9147: The Benjamin Cummings Publishing Company Inc, 1981.
9148:
9149: @item
1.1.1.4 ! ohara 9150: Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of
! 9151: Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/}
! 9152:
! 9153: @item
! 9154: Richard M. Stallman, ``Using and Porting GCC'', Free Software Foundation, 1999,
1.1.1.2 maekawa 9155: available online @uref{http://www.gnu.org/software/gcc/onlinedocs/}, and in
1.1.1.4 ! ohara 9156: the GCC package @uref{ftp://ftp.gnu.org/gnu/gcc/}
! 9157: @end itemize
! 9158:
! 9159: @section Papers
1.1 maekawa 9160:
1.1.1.4 ! ohara 9161: @itemize @bullet
1.1 maekawa 9162: @item
1.1.1.4 ! ohara 9163: Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'',
! 9164: Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022, @texlinebreak{}
! 9165: @uref{http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022}
! 9166:
! 9167: @item
! 9168: Torbjorn Granlund and Peter L. Montgomery, ``Division by Invariant Integers
! 9169: using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June
! 9170: 1994. Also available @uref{ftp://ftp.cwi.nl/pub/pmontgom/divcnst.psa4.gz}
! 9171: (and .psl.gz).
1.1 maekawa 9172:
9173: @item
1.1.1.4 ! ohara 9174: Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
! 9175: Mathematics of Computation, volume 44, number 170, April 1985.
1.1 maekawa 9176:
9177: @item
9178: Tudor Jebelean,
1.1.1.4 ! ohara 9179: ``An algorithm for exact division'',
1.1 maekawa 9180: Journal of Symbolic Computation,
1.1.1.4 ! ohara 9181: volume 15, 1993, pp. 169-180.
! 9182: Research report version available @texlinebreak{}
1.1.1.2 maekawa 9183: @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}
1.1 maekawa 9184:
9185: @item
1.1.1.4 ! ohara 9186: Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
! 9187: Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
! 9188: @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}
! 9189:
! 9190: @item
! 9191: Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
! 9192: ISSAC 97, pp. 339-341. Technical report available @texlinebreak{}
! 9193: @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}
! 9194:
! 9195: @item
! 9196: Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
! 9197: pp. 111-116. Technical report version available @texlinebreak{}
! 9198: @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}
! 9199:
! 9200: @item
! 9201: Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
! 9202: of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
! 9203: pp. 145-157. Technical report version also available @texlinebreak{}
! 9204: @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}
! 9205:
! 9206: @item
! 9207: Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
! 9208: Journal of Symbolic Computation, volume 21, 1996, pp. 441-455. Early
! 9209: technical report version also available
! 9210: @uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}
! 9211:
! 9212: @item
! 9213: R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'',
! 9214: Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
! 9215: Theory, October 1972, pp. 90-96. Reprinted as ``Fast Modular Transforms'',
! 9216: Journal of Computer and System Sciences, volume 8, number 3, June 1974,
! 9217: pp. 366-386.
! 9218:
! 9219: @item
! 9220: Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
! 9221: Zahlen'', Computing 7, 1971, pp. 281-292.
! 9222:
! 9223: @item
! 9224: Kenneth Weber, ``The accelerated integer GCD algorithm'',
1.1 maekawa 9225: ACM Transactions on Mathematical Software,
1.1.1.4 ! ohara 9226: volume 21, number 1, March 1995, pp. 111-122.
1.1.1.2 maekawa 9227:
9228: @item
1.1.1.4 ! ohara 9229: Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
! 9230: November 1999, @uref{http://www.inria.fr/RRRT/RR-3805.html}
1.1.1.2 maekawa 9231:
9232: @item
1.1.1.4 ! ohara 9233: Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
! 9234: Implementations'', @texlinebreak{}
! 9235: @uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz}
1.1.1.2 maekawa 9236:
9237: @item
1.1.1.4 ! ohara 9238: Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
! 9239: Symposium on Computer Arithmetic, 1993, pp. 260-271. Reprinted as ``More
! 9240: on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
! 9241: volume 43, number 8, August 1994, pp. 899-908.
1.1 maekawa 9242: @end itemize
9243:
1.1.1.4 ! ohara 9244:
! 9245: @node GNU Free Documentation License, Concept Index, References, Top
! 9246: @appendix GNU Free Documentation License
! 9247: @cindex GNU Free Documentation License
! 9248: @include fdl.texi
! 9249:
! 9250:
! 9251: @node Concept Index, Function Index, GNU Free Documentation License, Top
1.1 maekawa 9252: @comment node-name, next, previous, up
9253: @unnumbered Concept Index
9254: @printindex cp
9255:
1.1.1.2 maekawa 9256: @node Function Index, , Concept Index, Top
1.1 maekawa 9257: @comment node-name, next, previous, up
9258: @unnumbered Function and Type Index
9259: @printindex fn
9260:
9261: @bye
1.1.1.2 maekawa 9262:
9263: @c Local variables:
9264: @c fill-column: 78
1.1.1.4 ! ohara 9265: @c compile-command: "make gmp.info"
1.1.1.2 maekawa 9266: @c End: