% $Id: usersch4.tex,v 1.36 2002/09/05 14:36:59 karim Exp $ % Copyright (c) 2000 The PARI Group % % This file is part of the PARI/GP documentation % % Permission is granted to copy, distribute and/or modify this document % under the terms of the GNU Free Documentation License \chapter{Programming PARI in Library Mode} \section{Introduction: initializations, universal objects} \label{se:intro4} \noindent To be able to use PARI in \idx{library mode}, you must write a C program and link it to the PARI library. See the installation guide (in Appendix~A) on how to create and install the library and include files. A sample Makefile is presented in Appendix~B. Probably the best way to understand how programming is done is to work through a complete example. We will write such a program in \secref{se:prog}. Before doing this, a few explanations are in order. First, one must explain to the outside world what kind of objects and routines we are going to use. This is done simply with the statement \bprog #include @eprog \noindent This file \tet{pari.h} imports all the necessary constants, variables and functions, defines some important macros, and also defines the fundamental type for all PARI objects: the type \teb{GEN}, which is simply a pointer to \kbd{long}. \misctitle{Technical note}: we would have liked to define a type \kbd{GEN} to be a pointer to itself. This unfortunately is not possible in C, except by using structures, but then the names become unwieldy. The result of this is that when we use a component of a PARI object, it will be a \kbd{long}, hence will need to be typecast to a \kbd{GEN} again if we want to avoid warnings from the compiler. This will sometimes be quite tedious, but of course is trivially done. See the discussion on typecasts in the next section. After declaring the use of the file \kbd{pari.h}, the first executable statement of a main program should be to initialize the PARI system, and in particular the PARI stack which will be both a scratchboard and a repository for computed objects. This is done with a call to the function \fun{void}{pari_init}{long size, ulong maxprime)} \noindent The first argument is the number of bytes given to PARI to work with (it should not reasonably be taken below 500000), and the second is the upper limit on a precomputed prime number table. If you don't want prime numbers, just put $\tet{maxprime} = 2$. Be careful because lots a PARI functions need this table (certainly all the ones of interest to number theorists). If you wind up with the error message ``not enough precomputed primes'', try to increase this value. \noindent We have now at our disposal: $\bullet$ a large PARI \tev{stack} containing nothing. It's a big connected chunk of memory whose size you chose when invoking \tet{pari_init}. All your computations are going to take place here. When doing large computations, unwanted intermediate results clutter up memory very fast so some kind of garbage collecting is needed. Most large systems do garbage collecting when the memory is getting scarce, and this slows down the performance. In PARI we have taken a different approach: you must do your own cleaning up when the intermediate results are not needed anymore. Special purpose routines have been written to do this; we will see later how (and when, if at all) you should use them. $\bullet$ the following {\it universal objects\/} (by definition, objects which do not belong on the stack): the integers 0, 1 and 2 (respectively called \teb{gzero}, \teb{gun}, and \teb{gdeux}), the fraction $\dfrac{1}{2}$ (\teb{ghalf}), the complex number $i$ (\teb{gi}). All of these are of type \kbd{GEN}. In addition, space is reserved for the polynomials $x_v$ \sidx{variable} (\teb{polx}\kbd{[$v$]}), and the polynomials $1_v$ (\teb{polun}\kbd{[$v$]}). Here, $x_v$ is the name of variable number $v$, where $0\le v\le \tet{MAXVARN}$ (the exact value of which depends on your machine, at least 16383 in any case). Both \kbd{polun} and \kbd{polx} are arrays of \kbd{GEN}s, the index being the polynomial variable number. However, except for the ones corresponding to variables $0$ and \kbd{MAXVARN}, these polynomials are {\it not\/} created upon initialization. It is the programmer's responsibility to fill them before use. We'll see how this is done in \secref{se:vars} ({\it never\/} through direct assignment). $\bullet$ a {\it \idx{heap}\/} which is just a linked list of permanent universal objects. For now, it contains exactly the ones listed above. You will probably very rarely use the heap yourself; and if so, only as a collection of individual copies of objects taken from the stack (called \idx{clone}s in the sequel). Thus you need not bother with its internal structure, which may change as PARI evolves. Some complex PARI functions may create clones for special garbage collecting purposes, usually destroying them when returning. $\bullet$ a table of primes (in fact of {\it differences\/} between consecutive primes), called \teb{diffptr}, of type \kbd{byteptr} (pointer to \kbd{unsigned char}). Its use is described in appendix~C. $\bullet$ access to all the built-in functions of the PARI library. These are declared to the outside world when you include \kbd{pari.h}, but need the above things to function properly. So if you forget the call to \tet{pari_init}, you will immediately get a fatal error when running your program. \section{Important technical notes} \subsec{Typecasts}.\label{se:typecast}\sidx{typecast} \noindent We have seen that, due to the non-recursiveness of the PARI types from the compiler's point of view, many typecasts will be necessary when programming in PARI. To take an example, a vector $V$ of dimension 2 (two components) will be represented by a chunk of memory pointed to by the \kbd{GEN}~\kbd{V}. \kbd{V[0]} contains coded information, in particular about the type of the object, its length, etc. \kbd{V[1]} and \kbd{V[2]} contain pointers to the two components of \kbd{V}. Those coefficients \kbd{V[i]} themselves are in chunks of memory whose complexity depends on their own types, and so on. This is where typecasting will be necessary: a priori, \kbd{V[i]} (for $\kbd{i}=1,2$) is a \kbd{long}, but we will want to use it as a \kbd{GEN}. The following two constructions will be exceedingly frequent (\kbd{x} and \kbd{V} are \kbd{GEN}s): % \bprog V[i] = (long) x; x = (GEN) V[i]; @eprog \noindent Note that a typecast is not a valid lvalue (cannot be put on the left side of an assignment), so \kbd{(GEN)V[i] = x} would be incorrect, though some compilers may accept it. Due to this annoyance, the PARI functions and variables that occur most frequently have analogues which are macros including the typecast. The complete list can be found in the file \kbd{paricast.h} (which is included by \kbd{pari.h} and can be found at the same place). For instance you can abbreviate: % \bprog (long) gzero -----> zero (long) gun -----> un (long) polx[v] -----> lpolx[v] (long) gadd(x,y) -----> ladd(x,y) @eprog\noindent% \kbdsidx{zero}% \kbdsidx{un}% \kbdsidx{lpolx} In general, replacing a leading \kbd{g} by an \kbd{l} in the name of a PARI function will typecast the result to \kbd{long}. Note that \kbd{ldiv} is an ANSI C function which is hidden in PARI by a macro of the same name representing \kbd{(long)gdiv}. The macro $\teb{coeff}(x,m,n)$ exists with exactly the meaning of \kbd{x[m,n]} under GP when \kbd{x} is a matrix. This is a purely syntactical trick to reduce the number of typecasts and thus does not create a copy of the coefficient (contrary to all the library {\it functions\/}). It can be put on the left side of an assignment statement, and its value, of type \kbd{long} integer, is a pointer to the desired coefficient object. The macro \kbd{gcoeff} is a synonym for \kbd{(GEN) coeff}, hence cannot be put on the left side of an assignment. To retrieve the values of elements of lists of \dots\ of lists of vectors, without getting infuriated by gigantic lists of typecasts, we have the \teb{mael} macros (for {\bf m}ultidimensional {\bf a}rray {\bf el}ement). The syntax is $\key{mael}n(x,a_1,\dots,a_n)$, where $x$ is a \kbd{GEN}, the $a_i$ are indexes, and $n$ is an integer between $2$ and $5$ (with a standalone \kbd{mael} as a synonym for \kbd{mael2}). This stands for $x[a_1][a_2]\dots[a_n]$ (with all the necessary typecasts), and returns a \kbd{long} (i.e.~they are valid lvalues). The $\kbd{gmael}n$ macros are synonyms for $\kbd{(GEN) mael}n$. Note that due to the implementation of matrix types in PARI (i.e.~as horizontal lists of vertical vectors), \kbd{coeff(x,y)} is actually completely equivalent to \kbd{mael(y,x)}. It is suggested that you use \kbd{coeff} in matrix context, and \kbd{mael} otherwise. \subsec{Variations on basic functions}.\label{se:low_level} In the library syntax descriptions in Chapter~3, we have only given the basic names of the functions. For example \kbd{gadd}$(x,y)$ assumes that $x$ and $y$ are PARI objects (of type \kbd{GEN}), and {\it creates\/} the result $x+y$ on the PARI stack. For most of the basic operators and functions, many other variants are available. We give some examples for \kbd{gadd}, but the same is true for all the basic operators, as well as for some simple common functions (a more complete list is given in Chapter~5): \fun{GEN}{gaddgs}{GEN x, long y} \fun{GEN}{gaddsg}{long x, GEN y} \noindent In the following three, \kbd{z} is a preexisting \kbd{GEN} and the result of the corresponding operation is put into~\kbd{z}. The size of the PARI stack does not change: \fun{void}{gaddz}{GEN x, GEN y, GEN z} \fun{void}{gaddgsz}{GEN x, long y, GEN z} \fun{void}{gaddsgz}{GEN x, GEN y, GEN z} \noindent There are also low level functions which are special cases of the above: \fun{GEN}{addii}{GEN x, GEN y}: here $x$ and $y$ are \kbd{GEN}s of type \typ{INT} (this is not checked). \fun{GEN}{addrr}{GEN x, GEN y}: here $x$ and $y$ are \kbd{GEN} reals (type \typ{REAL}). \noindent There also exist functions \teb{addir}, \teb{addri}, \teb{mpadd} (whose two arguments can be of type integer or real), \teb{addis} (to add a \typ{INT} and a \kbd{long}) and so on. All these functions can of course be called by the user but we feel that the few microseconds lost in calling more general functions (in this case \kbd{gadd}) are compensated by the fact that one needs to remember a much smaller number of functions, and also because there is a hidden danger here: the types of the objects that you use, if they are themselves results of a previous computation, are not completely predetermined. For instance the multiplication of a type real \typ{REAL} by a type integer \typ{INT} {\it usually\/} gives a result of type real, except when the integer is~0, in which case according to the PARI philosophy the result is the exact integer~0. Hence if afterwards you call a function which specifically needs a real type argument, you are going to be in trouble. If you really want to use these functions, their names are self-explanatory once you know that {\bf i} stands for a PARI integer, {\bf r} for a PARI real, {\bf mp} for i or r, {\bf s} for an ordinary signed long, whereas {\bf z} (as a suffix) means that the result is not created on the PARI stack but assigned to a preexisting GEN object passed as an extra argument. For completeness, Chapter 5 gives a description of all these low-level functions. Please note that in the present version \vers{} the names of the functions are not always consistent. This will be changed. Hence anyone programming in PARI must be aware that the names of almost all functions that he uses might be subject to change. If the need arises (i.e.~if there really are people out there who delve into the innards of PARI), updated versions with no name changes will be released. \subsec{Portability: 32-bit / 64-bit architectures}. \noindent PARI supports both 32-bit and 64-bit based machines, but not simultaneously! The library will have been compiled assuming a given architecture (a~priori following a guess by the \kbd{Configure} program, see Appendix~A), and some of the header files you include (through \kbd{pari.h}) will have been modified to match the library. Portable macros are defined to bypass most machine dependencies. If you want your programs to run identically on 32-bit and 64-bit machines, you will have to use these, and not the corresponding numeric values, whenever the precise size of your \kbd{long} integers might matter. Here are the most important ones: \settabs\+ -----------------------------&---------------&------------&\cr \+ & 64-bit & 32-bit \cr\+ \tet{BITS_IN_LONG} & 64 & 32 \cr\+ \tet{LONG_IS_64BIT} & defined & undefined \cr\+ \tet{DEFAULTPREC} & 3 & 4 & ($\approx$ 19 decimal digits, % see formula below) \cr\+ \tet{MEDDEFAULTPREC}& 4 & 6 & ($\approx$ 38 decimal digits) \cr\+ \tet{BIGDEFAULTPREC}& 5 & 8 & ($\approx$ 57 decimal digits) \cr \noindent For instance, suppose you call a transcendental function, such as \kbd{GEN \key{gexp}(GEN x, long prec)}. \noindent The last argument \kbd{prec} is only used if \kbd{x} is an exact object (otherwise the relative precision is determined by the precision of~\kbd{x}). But since \kbd{prec} sets the size of the inexact result counted in (\kbd{long}) {\it words\/} (including codewords), the same value of \kbd{prec} will yield different results on 32-bit and 64-bit machines. Real numbers have two codewords (see~\secref{se:impl}), so the formula for computing the bit accuracy is $$ \tet{bit_accuracy}(\kbd{prec}) = (\kbd{prec} - 2) * \tet{BITS_IN_LONG}$$ (this is actually the definition of a macro). The corresponding accuracy expressed in decimal digits would be % $$ \kbd{bit\_accuracy(prec)} * \log(2) / \log(10).$$ % For example if the value of \kbd{prec} is 5, the corresponding accuracy for 32-bit machines is $(5-2)*\log(2^{32})/\log(10)\approx 28$ decimal digits, while for 64-bit machines it is $(5-2)*\log(2^{64})/\log(10)\approx 57$ decimal digits. Thus, you must take care to change the \kbd{prec} parameter you are supplying according to the bit size, either using the default precisions given by the various \kbd{DEFAULTPREC}s, or by using conditional constructs of the form: % \bprog #ifndef LONG_IS_64BIT prec = 4; #else prec = 6; #endif @eprog \noindent which is in this case equivalent to the statement \kbd{prec = MEDDEFAULTPREC;}. Note that for parity reasons, half the accuracies available on 32-bit architectures (the odd ones) have no precise equivalents on 64-bit machines. \section{Creation of PARI objects, assignments, conversions} \subsec{Creation of PARI objects}.\sidx{creation} The basic function which creates a PARI object is the function \teb{cgetg} whose prototype is: \kbd{GEN \key{cgetg}(long length, long type)}. \noindent Here \kbd{length} specifies the number of longwords to be allocated to the object, and type is the type number of the object, preferably in symbolic form (see \secref{se:impl} for the list of these). The precise effect of this function is as follows: it first creates on the PARI {\it stack\/} a chunk of memory of size \kbd{length} longwords, and saves the address of the chunk which it will in the end return. If the stack has been used up, a message to the effect that ``the PARI stack overflows'' will be printed, and an error raised. Otherwise, it sets the type and length of the PARI object. In effect, it fills correctly and completely its first codeword (\kbd{z[0]} or \kbd{*z}). Many PARI objects also have a second codeword (types \typ{INT}, \typ{REAL}, \typ{PADIC}, \typ{POL}, and \typ{SER}). In case you want to produce one of those from scratch (this should be exceedingly rare), {\it it is your responsibility to fill this second codeword}, either explicitly (using the macros described in \secref{se:impl}), or implicitly using an assignment statement (using \kbd{gaffect}). Note that the argument \kbd{length} is predetermined for a number of types: 3 for types \typ{INTMOD}, \typ{FRAC}, \typ{FRACN}, \typ{COMPLEX}, \typ{POLMOD}, \typ{RFRAC} and \typ{RFRACN}, 4 for type \typ{QUAD} and \typ{QFI}, and 5 for type \typ{PADIC} and \typ{QFR}. However for the sake of efficiency, no checking is done in the function \kbd{cgetg}, so disasters will occur if you give an incorrect length. \misctitle{Notes}: 1) The main use of this function is to prepare for later assignments (see \secref{se:assign}). Most of the time you will use \kbd{GEN} objects as they are created and returned by PARI functions. In this case you don't need to use \kbd{cgetg} to create space to hold them. \noindent 2) For the creation of leaves, i.e.~integers or reals, which is very common, \fun{GEN}{cgeti}{long length} \fun{GEN}{cgetr}{long length} \noindent should be used instead of \kbd{cgetg(length, t\_INT)} and \kbd{cgetg(length, t\_REAL)} respectively. \noindent 3) The macros \teb{lgetg}, \teb{lgeti}, \teb{lgetr} are predefined as \kbd{(long)cgetg}, \kbd{(long)cgeti}, \kbd{(long)cgetr}, respectively. \misctitle{Examples}: 1) \kbd{z = cgeti(DEFAULTPREC)} and \kbd{cgetg(DEFAULTPREC, t\_INT)} create an integer object whose ``precision'' is \kbd{bit\_accuracy(DEFAULTPREC)} = 64. This means \kbd{z} can hold rational integers of absolute value less than $2^{64}$. Note that in both cases, the second codeword will {\it not\/} be filled. Of course we could use numerical values, e.g.~\kbd{cgeti(4)}, but this would have different meanings on different machines as \kbd{bit\_accuracy(4)} equals 64 on 32-bit machines, but 128 on 64-bit machines. \noindent 2) The following creates a type ``complex'' object whose real and imaginary parts can hold real numbers of precision $\kbd{bit\_accuracy(MEDDEFAULTPREC)} = 96\hbox{ bits:}$ % \bprog z = cgetg(3, t_COMPLEX); z[1] = lgetr(MEDDEFAULTPREC); z[2] = lgetr(MEDDEFAULTPREC); @eprog \noindent 3) To create a matrix object for $4\times 3$ matrices: % \bprog z = cgetg(4, t_MAT); for(i=1; i<4; i++) z[i] = lgetg(5, t_COL); @eprog % \noindent If one wishes to create space for the matrix elements themselves, one has to follow this with a double loop to fill each column vector. These last two examples illustrate the fact that since PARI types are recursive, all the branches of the tree must be created. The function \teb{cgetg} creates only the ``root'', and other calls to \kbd{cgetg} must be made to produce the whole tree. For matrices, a common mistake is to think that \kbd{z = cgetg(4, t\_MAT)} (for example) will create the root of the matrix: one needs also to create the column vectors of the matrix (obviously, since we specified only one dimension in the first \kbd{cgetg}!). This is because a matrix is really just a row vector of column vectors (hence a priori not a basic type), but it has been given a special type number so that operations with matrices become possible. \subsec{Assignments}. Firstly, if \kbd{x} and \kbd{y} are both declared as \kbd{GEN} (i.e.~pointers to something), the ordinary C assignment \kbd{y = x} makes perfect sense: we are just moving a pointer around. However, physically modifying either \kbd{x} or \kbd{y} (for instance, \kbd{x[1] = 0}) will also change the other one, which is usually not desirable. \label{se:assign} \misctitle{Very important note}: Using the functions described in this paragraph is very inefficient and often awkward: one of the \tet{gerepile} functions (see~\secref{se:garbage}) should be preferred. See the paragraph end for some exceptions to this rule. \noindent The general PARI \idx{assignment} function is the function \teb{gaffect} with the following syntax: \fun{void}{gaffect}{GEN x, GEN y} \noindent Its effect is to assign the PARI object \kbd{x} into the {\it preexisting\/} object \kbd{y}. This copies the whole structure of \kbd{x} into \kbd{y} so many conditions must be met for the assignment to be possible. For instance it is allowed to assign an integer into a real number, but the converse is forbidden. For that, you must use the truncation or rounding function of your choice (see section 3.2). It can also happen that \kbd{y} is not large enough or does not have the proper tree structure to receive the object \kbd{x}. As an extreme example, assume \kbd{y} is the zero integer with length equal to 2. Then all assignments of a non-zero integer into \kbd{y} will result in an error message since \kbd{y} is not large enough to accommodate a non-zero integer. In general common sense will tell you what is possible, keeping in mind the PARI philosophy which says that if it makes sense it is legal. For instance, the assignment of an imprecise object into a precise one does {\it not\/} make sense. However, a change in precision of imprecise objects is allowed. All functions ending in ``\kbd{z}'' such as \teb{gaddz} (see~\secref{se:low_level}) implicitly use this function. In fact what they exactly do is record {\teb{avma}} (see~\secref{se:garbage}), perform the required operation, \teb{gaffect} the result to the last operand, then restore the initial \kbd{avma}. You can assign ordinary C long integers into a PARI object (not necessarily of type \typ{INT}). Use the function \teb{gaffsg} with the following syntax: \fun{void}{gaffsg}{long s, GEN y} \misctitle{Note}: due to the requirements mentioned above, it's usually a bad idea to use \tet{gaffect} statements. Two exceptions: $\bullet$ for simple objects (e.g.~leaves) whose size is controlled, they can be easier to use than gerepile, and about as efficient. $\bullet$ to coerce an inexact object to a given precision. For instance \bprog gaffect(x, (tmp=cgetr(3))); x = tmp; @eprog \noindent at the beginning of a routine where precision can be kept to a minimum (otherwise the precision of \kbd{x} will be used in all subsequent computations, which will be a disaster if \kbd{x} is known to thousands of digits). \subsec{Copy}. It is also very useful to \idx{copy} a PARI object, not just by moving around a pointer as in the \kbd{y = x} example, but by creating a copy of the whole tree structure, without pre-allocating a possibly complicated \kbd{y} to use with \kbd{gaffect}. The function which does this is called \teb{gcopy}, with the predefined macro \teb{lcopy} as a synonym for \kbd{(long)gcopy}. Its syntax is: \fun{GEN}{gcopy}{GEN x} \noindent and the effect is to create a new copy of x on the PARI stack. Beware that universal objects which occur in specific components of certain types (mainly moduli for types \typ{INTMOD} and \typ{PADIC}) are not copied, as they are assumed to be permanent. In this case, \kbd{gcopy} only copies the pointer. Use \fun{GEN}{forcecopy}{GEN x} if you want a complete copy. Please be sure at this point that you really understand the difference between \kbd{y = x}, \kbd{y = gcopy(x)}, and \kbd{gaffect(x,y)}: this will save you from many ``obvious'' mistakes later on. \subsec{Clones}.\sidx{clone} Sometimes, it may be more efficient to create a {\it permanent\/} copy of a PARI object. This will not be created on the stack but on the heap. The function which does this is called \teb{gclone}, with the predefined macro \teb{lclone} as a synonym for \kbd{(long)gclone}. Its syntax is: \fun{GEN}{gclone}{GEN x} A clone can be removed from the heap (thus destroyed) using \fun{void}{gunclone}{GEN x} \noindent No PARI object should keep references to a clone which has been destroyed. If you want to copy a clone back to the stack then delete it, use \tet{forcecopy} and not \tet{gcopy}, otherwise some components might not be copied (moduli of \typ{INTMOD}s and \typ{POLMOD}s for instance). \subsec{Conversions}.\sidx{conversions} The following functions convert C objects to PARI objects (creating them on the stack as usual): \fun{GEN}{stoi}{long s}: C \kbd{long} integer (``small'') to PARI integer (\typ{INT}) \fun{GEN}{dbltor}{double s}: C \kbd{double} to PARI real (\typ{REAL}). The accuracy of the result is 19 decimal digits, i.e.~a type \typ{REAL} of length \kbd{DEFAULTPREC}, although on 32-bit machines only 16 of them will be significant. \noindent We also have the converse functions: \fun{long}{itos}{GEN x}: \kbd{x} must be of type \typ{INT}, \fun{double}{rtodbl}{GEN x}: \kbd{x} must be of type \typ{REAL}, \noindent as well as the more general ones: \fun{long}{gtolong}{GEN x}, \fun{double}{gtodouble}{GEN x}. \section{Garbage collection}\label{se:garbage}\sidx{garbage collecting} \subsec{Why and how}. \noindent As we have seen, the \kbd{pari\_init} routine allocates a big range of addresses (the {\it stack\/}) that are going to be used throughout. Recall that all PARI objects are pointers. But for universal objects, they will all point at some part of the stack. The stack starts at the address \kbd{bot} and ends just before \kbd{top}. This means that the quantity % $$ (\kbd{top} - \kbd{bot})\,/\,\kbd{sizeof(long)} $$ % is equal to the \kbd{size} argument of \kbd{pari\_init}. The PARI stack also has a ``current stack pointer'' called \teb{avma}, which stands for {\bf av}ailable {\bf m}emory {\bf a}ddress. These three variables are global (declared for you by \kbd{pari.h}). For historical reasons they are of type \tet{ulong} (unsigned long) and not of type \kbd{GEN} as would seem more natural. The stack is oriented upside-down: the more recent an object, the closer to \kbd{bot}. Accordingly, initially \kbd{avma} = \kbd{top}, and \kbd{avma} gets {\it decremented\/} as new objects are created. As its name indicates, \kbd{avma} always points just {\it after\/} the first free address on the stack, and \kbd{(GEN)avma} is always (a pointer to) the latest created object. When \kbd{avma} reaches \kbd{bot}, the stack overflows, aborting all computations, and an error message is issued. To avoid this {\it you\/} will need to clean up the stack from time to time, when some bunch of intermediate objects will not be needed anymore. This is called ``{\it garbage collecting}.'' We are now going to describe briefly how this is done. We will see many concrete examples in the next subsection. \noindent$\bullet$ First, PARI routines will do their own garbage collecting, which means that whenever a documented function from the library returns, only its result(s) will have been added to the stack (non-documented ones may not do this, for greater speed). In particular, a PARI function that does not return a \kbd{GEN} does not clutter the stack. Thus, if your computation is small enough (i.e.~you call few PARI routines, or most of them return \kbd{long} integers), then you don't need to do any garbage collecting. This will probably be the case in many of your subroutines. Of course the objects that were on the stack {\it before\/} the function call are left alone. Except for the ones listed below, PARI functions only collect their own garbage. \noindent$\bullet$ It may happen that all objects that were created after a certain point can be deleted~--- for instance, if the final result you need is not a \kbd{GEN}, or if some search proved futile. Then, it is enough to record the value of \kbd{avma} just {\it before\/} the first garbage is created, and restore it upon exit: \bprog ulong ltop = avma; /*@Ccom record initial avma */ garbage ... avma = ltop; /*@Ccom restore it */ @eprog \noindent All objects created in the \kbd{garbage} zone will eventually be overwritten: they should not be accessed anymore once \kbd{avma} has been restored. \noindent$\bullet$ If you want to destroy (i.e.~give back the memory occupied by) the latest PARI object on the stack (e.g.~the latest one obtained from a function call), and the above method is not available (because the initial value of \kbd{avma} has been lost), just use the function\sidx{destruction}% \vadjust{\penalty500}%discourage page break \fun{void}{cgiv}{GEN z} \noindent where \kbd{z} is the object you want to give back. \noindent$\bullet$ Unfortunately life is not so simple, and sometimes you will want to give back accumulated garbage {\it during\/} a computation without losing recent data. For this you need the \kbd{gerepile} function (or one of its variants described hereafter): \fun{GEN}{gerepile}{ulong ltop, ulong lbot, GEN q} \noindent This function cleans up the stack between \kbd{ltop} and \kbd{lbot}, where $\kbd{lbot} < \kbd{ltop}$, and returns the updated object \kbd{q}. This means: 1) we translate (copy) all the objects in the interval $[\kbd{avma}, \kbd{lbot}[$, so that its right extremity abuts the address \kbd{ltop}. Graphically \vbox{\bprog bot avma lbot ltop top End of stack |-------------[++++++[-/-/-/-/-/-/-|++++++++| Start free memory garbage @eprog \noindent becomes: \bprog bot avma ltop top End of stack |---------------------------[++++++[++++++++| Start free memory @eprog } \noindent where \kbd{++} denote significant objects, \kbd{--} the unused part of the stack, and \kbd{-/-} the garbage we remove. 2) The function then inspects all the PARI objects between \kbd{avma} and \kbd{lbot} (i.e.~the ones that we want to keep and that have been translated) and looks at every component of such an object which is not a codeword. Each such component is a pointer to an object whose address is either --- between \kbd{avma} and \kbd{lbot}, in which case it will be suitably updated, --- larger than or equal to \kbd{ltop}, in which case it will not change, or --- between \kbd{lbot} and \kbd{ltop} in which case \kbd{gerepile} will scream an error message at you (``significant pointers lost in gerepile''). 3) \key{avma} is updated (we add $\kbd{ltop} - \kbd{lbot}$ to the old value). 4) We return the (possibly updated) object \kbd{q}: if \kbd{q} initially pointed between \kbd{avma} and \kbd{lbot}, we return the translated address, as in~2). If not, the original address is still valid (and we return it!). As stated above, no component of the remaining objects (in particular \kbd{q}) should belong to the erased segment [\kbd{lbot}, \kbd{ltop}[, and this is checked within \kbd{gerepile}. But beware as well that the addresses of all the objects in the translated zone will have changed after a call to \kbd{gerepile}: every pointer you may have kept around elsewhere, outside the stack objects, which previously pointed into the zone below \kbd{ltop} must be discarded. If you need to recover more than one object, use one of the \kbd{gerepilemany} functions below. As a consequence of the preceding explanation, we must now state the most important law about programming in PARI: {\bf If a given PARI object is to be relocated by \hbox{gerepile} then, apart from universal objects, the chunks of memory used by its components should be in consecutive memory locations}. All \kbd{GEN}s created by documented PARI function are guaranteed to satisfy this. This is because the \kbd{gerepile} function knows only about {\it two connected zones\/}: the garbage that will be erased (between \kbd{lbot} and \kbd{ltop}) and the significant pointers that will be copied and updated. If there is garbage interspersed with your objects, disasters will occur when we try to update them and consider the corresponding ``pointers''. So be {\it very\/} wary when you allow objects to become disconnected. Have a look at the examples, it's not as complicated as it seems. \noindent In practice this is achieved by the following programming idiom: \bprog ltop=avma; garbage(); lbot=avma; q=anything(); return gerepile(ltop, lbot, q); /*@Ccom returns the updated q */ @eprog \noindent Beware that \bprog ltop=avma; garbage(); return gerepile(ltop, avma, anything()) @eprog \noindent might work, but should be frowned upon. We can't predict whether \kbd{avma} is going to be evaluated after or before the call to \kbd{anything()}: it depends on the compiler. If we are out of luck, it will be {\it after\/} the call, so the result will belong to the garbage zone and the \kbd{gerepile} statement becomes equivalent to \kbd{avma = ltop}. Thus we would return a pointer to random garbage. \noindent$\bullet$ A simple variant is \fun{GEN}{gerepileupto}{ulong ltop, GEN q} \noindent which cleans the stack between \kbd{ltop} and the {\it connected\/} object \kbd{q} and returns \kbd{q} updated. For this to work, \kbd{q} must have been created {\it before\/} all its components, otherwise they would belong to the garbage zone! Documented PARI functions guarantee this. If you stumble upon one that does not, consider it a bug worth reporting. \noindent$\bullet$ Another variant (a special case of \tet{gerepilemany} below, where $n=1$) is \fun{GEN}{gerepilecopy}{ulong ltop, GEN x)} \noindent which is functionnally equivalent to \kbd{gerepileupto(ltop, gcopy(x))} but more efficient. In this case, the \kbd{GEN} parameter \kbd{x} need not satisfy any property before the garbage collection (it may be disconnected, components created before the root and so on). Of course, this is less efficient than either \tet{gerepileupto} or \tet{gerepile}, because \kbd{x} has to be copied to a clean stack zone first. \noindent$\bullet$ To cope with complicated cases where many objects have to be preserved, you can use \fun{void}{gerepileall}{ulong ltop, int n, ...} \noindent where the routine expects $n$ further arguments, which are the {\it addresses} of the \kbd{GEN}s you want to preserve. It cleans up the most recent part of the stack (between \kbd{ltop} and \kbd{avma}), updating all the \kbd{GEN}s added to the argument list. A copy is done just before the cleaning to preserve them, so they don't need to be connected before the call. With \kbd{gerepilecopy}, this is the most robust of the \kbd{gerepile} functions (the less prone to user error), but also the slowest. An alternative, unnecessarily complicated, syntax is given by \fun{void}{gerepilemany}{ulong ltop, GEN *gptr[], int n} \noindent which works exactly as above, except that the preserved \kbd{GEN}s are the elements of the array \kbd{gptr} (of length $n$): \kbd{gptr[0]}, \kbd{gptr[1]}, \dots, \kbd{gptr[$n$-1]}. \noindent$\bullet$ More efficient, but tricky to use is \fun{void}{gerepilemanysp}{ulong ltop, ulong lbot, GEN *gptr[], int n} \noindent which cleans the stack between \kbd{lbot} and \kbd{ltop} and updates the \kbd{GEN}s pointed at by the elements of \kbd{gptr} without doing any copying. This is subject to the same restrictions as \kbd{gerepile}, the only difference being that more than one address gets updated. \subsec{Examples}. \noindent Let \kbd{x} and \kbd{y} be two preexisting PARI objects and suppose that we want to compute $\kbd{x}^2+ \kbd{y}^2$. This can trivially be done using the following program (we skip the necessary declarations; everything in sight is a \kbd{GEN}): \bprog p1 = gsqr(x); p2 = gsqr(y); z = gadd(p1,p2); @eprog\noindent The \kbd{GEN} \kbd{z} indeed points at the desired quantity. However, consider the stack: it contains as unnecessary garbage \kbd{p1} and \kbd{p2}. More precisely it contains (in this order) \kbd{z}, \kbd{p2}, \kbd{p1} (recall that, since the stack grows downward from the top, the most recent object comes first). We need a way to get rid of this garbage (in this case it causes no harm except that it occupies memory space, but in other cases it could disconnect other PARI objects and this is dangerous). It would not have been possible to get rid of \kbd{p1}, \kbd{p2} before \kbd{z} is computed, since they are used in the final operation. We cannot record \kbd{avma} before \kbd{p1} is computed and restore it later, since this would destroy \kbd{z} as well. It is not possible either to use the function \kbd{cgiv} since \kbd{p1} and \kbd{p2} are not at the bottom of the stack and we don't want to give back~\kbd{z}. But using \kbd{gerepile}, we can give back the memory locations corresponding to \kbd{p1}, \kbd{p2}, and move the object \kbd{z} upwards so that no space is lost. Specifically: \bprog ltop = avma; /*@Ccom remember the current address of the top of the stack */ p1 = gsqr(x); p2 = gsqr(y); lbot = avma; /*@Ccom keep the address of the bottom of the garbage pile */ z = gadd(p1, p2); /*@Ccom z is now the last object on the stack */ z = gerepile(ltop, lbot, z); /*@Ccom garbage collecting */ @eprog \noindent Of course, the last two instructions could also have been written more simply: \kbd{z = gerepile(ltop, lbot, gadd(p1,p2));} \noindent In fact \kbd{gerepileupto} is even simpler to use, because the result of \kbd{gadd} will be the last object on the stack and \kbd{gadd} is guaranteed to return an object suitable for \kbd{gerepileupto}: \bprog ltop = avma; z = gerepileupto(ltop, gadd(gsqr(x), gsqr(y))); @eprog \noindent As you can see, in simple conditions the use of \kbd{gerepile} is not really difficult. However make sure you understand exactly what has happened before you go on (use the figure from the preceding section). \misctitle{Important remark}: as we will see presently it is often necessary to do several \kbd{gerepile}s during a computation. However, the fewer the better. The only condition for \kbd{gerepile} to work is that the garbage be connected. If the computation can be arranged so that there is a minimal number of connected pieces of garbage, then it should be done that way. For example suppose we want to write a function of two \kbd{GEN} variables \kbd{x} and \kbd{y} which creates the vector $\kbd{[x}^2+\kbd{y}, \kbd{y}^2+\kbd{x]}$. Without garbage collecting, one would write: % \bprog p1 = gsqr(x); p2 = gadd(p1, y); p3 = gsqr(y); p4 = gadd(p3, x); z = cgetg(3,t_VEC); z[1] = (long)p2; z[2] = (long)p4; @eprog \noindent This leaves a dirty stack containing (in this order) \kbd{z}, \kbd{p4}, \kbd{p3}, \kbd{p2}, \kbd{p1}. The garbage here consists of \kbd{p1} and \kbd{p3}, which are separated by \kbd{p2}. But if we compute \kbd{p3} {\it before\/} \kbd{p2} then the garbage becomes connected, and we get the following program with garbage collecting: % \bprog ltop = avma; p1 = gsqr(x); p3 = gsqr(y); lbot = avma; z = cgetg(3,t_VEC); z[1] = ladd(p1,y); z[2] = ladd(p3,x); z = gerepile(ltop,lbot,z); @eprog \noindent Finishing by \kbd{z = gerepileupto(ltop, z)} would be ok as well. But when you have the choice, it's usually clearer to brace the garbage between \kbd{ltop}~/ \kbd{lbot} pairs. \noindent Beware that \bprog ltop = avma; p1 = gadd(gsqr(x), y); p3 = gadd(gsqr(y), x); z = cgetg(3,t_VEC); z[1] = (long)p1; z[2] = (long)p3 z = gerepileupto(ltop,z); /*@Ccom WRONG !!! */ @eprog \noindent would be a disaster since \kbd{p1} and \kbd{p3} would be created before \kbd{z}, so the call to \kbd{gerepileupto} would overwrite them, leaving \kbd{z[1]} and \kbd{z[2]} pointing at random data! We next want to write a program to compute the product of two complex numbers $x$ and $y$, using a method which takes only 3 multiplications instead of 4. Let $z = x*y$, and set $x = x_r + i*x_i$ and similarly for $y$ and $z$. The well-known trick is to compute $p_1 = x_r*y_r$, $p_2=x_i*y_i$, $p_3=(x_r+x_i)*(y_r+y_i)$, and then we have $z_r=p_1-p_2$, $z_i=p_3-(p_1+p_2)$. The program is essentially as follows: % \bprog ltop = avma; p1 = gmul(x[1],y[1]); p2 = gmul(x[2],y[2]); p3 = gmul(gadd(x[1],x[2]), gadd(y[1],y[2])); p4 = gadd(p1,p2); lbot = avma; z = cgetg(3,t_COMPLEX); z[1] = lsub(p1,p2); z[2] = lsub(p3,p4); z = gerepile(ltop,lbot,z); @eprog \noindent ``Essentially'', because for instance \kbd{x[1]} is a \kbd{long} and not a \kbd{GEN}, so we need to insert many annoying typecasts: \kbd{p1 = gmul((GEN)x[1], (GEN)y[1])} and so on. Let us now look at a less trivial example where more than one \kbd{gerepile} is needed in practice (at the expense of efficiency, one can always use only one using \kbd{gerepilecopy}). Suppose that we want to write a function which multiplies a line vector by a matrix (such a function is of course already part of \kbd{gmul}, but let's ignore this for a moment). Then the most natural way is to do a \kbd{cgetg} of the result immediately, and then a \kbd{gerepile} for each coefficient of the result vector to get rid of the garbage which has accumulated while this particular coefficient was computed. We leave the details to the reader, who can look at the answer in the file \kbd{basemath/gen1.c}, in the function \kbd{gmul}, case \typ{VEC} times case \typ{MAT}. It would theoretically be possible to have a single connected piece of garbage, but it would be a much less natural and unnecessarily complicated program, which would in fact be slower. Let us now take an example which is probably the least trivial way of using \kbd{gerepile}, but is unfortunately sometimes necessary. Although it is not an infrequent occurrence, we will not give a specific example but a general one: suppose that we want to do a computation (usually inside a larger function) producing more than one PARI object as a result, say two for instance. Then even if we set up the work properly, before cleaning up we will have a stack which has the desired results \kbd{z1}, \kbd{z2} (say), and then connected garbage from lbot to ltop. If we write \kbd{z1 = gerepile(ltop,lbot,z1);} \noindent then the stack will be cleaned, the pointers fixed up, but we will have lost the address of \kbd{z2}. This is where we need one of the \idx{gerepilemany} functions: we declare \bprog GEN *gptr[2]; /*@Ccom Array of pointers to GENs */ gptr[0] = &z1; gptr[1] = &z2; @eprog \noindent and now the call \kbd{gerepilemany(ltop, gptr, 2)} copies \kbd{z1} and \kbd{z2} to new locations, cleans the stack from \kbd{ltop} to the old \kbd{avma}, and updates the pointers \kbd{z1} and \kbd{z2}. Here we don't assume anything about the stack: the garbage can be disconnected and \kbd{z1}, \kbd{z2} need not be at the bottom of the stack. If all of these assumptions are in fact satisfied, then we can call \kbd{gerepilemanysp} instead, which will usually be faster since we don't need the initial copy (on the other hand, it is less cache friendly). The whole thing could (and should) be abbreviated as \bprog gerepileall(ltop, 2, &z1, &z2) @eprog so that the array \kbd{gptr} is not needed anymore. Another important usage is ``random'' garbage collection during loops whose size requirements we cannot (or don't bother to) control in advance: \bprog ulong ltop = avma, limit = (avma+bot)/2; GEN x, y; while (...) { garbage(); x = anything(); garbage(); y = anything() garbage(); if (avma < limit) /*@Ccom memory is running low (half spent since entry) */ gerepileall(ltop, 2, &x, &y); } @eprog \noindent Here we assume that only \kbd{x} and \kbd{y} are needed from one iteration to the next. As it would be too costly to call gerepile once for each iteration, we only do it when it seems to have become necessary. Of course, when the need arises, you can use bigger \kbd{gptr} arrays: in the PARI library source code, we once needed to preserve up to 10 objects at a time (in a variant of the LLL algorithm)! \misctitle{Technical note:} the statement \kbd{limit = (avma+bot)/2} is dangerous since the addition can overflow, which would result in \kbd{limit} being always smaller than \kbd{avma}. This will prevent garbage collection in the loop. To avoid this problem, we provide a robust macro \tet{stack_lim}\kbd{(avma,$n$)}, which denotes an address where $2^{n-1} / (2^{n-1}+1)$ of the total stack space is exhausted ($1/2$ for $n=1$, $2/3$ for $n=2$). Hence, the above snippet should be written as \bprog ulong ltop = avma, limit = stack_lim(avma,1); @dots @eprog \subsec{Some hints and tricks}. In this section, we give some indications on how to avoid most problems connected with garbage collecting: First, although it looks complicated, \kbd{gerepile} has turned out to be a very flexible and fast garbage collector, which compares very favorably with much more sophisticated methods used in other systems. A few tests that we have done indicate that the price paid for using \kbd{gerepile}, when properly used, is usually around 1 or 2 percents of the total running time, which is quite acceptable. Secondly, in many cases, in particular when the tree structure and the size of the PARI objects which will appear in a computation are under control, one can avoid \kbd{gerepile} altogether by creating sufficiently large objects at the beginning (using \teb{cgetg}), and then using assignment statements and operations ending with z (such as \teb{gaddz}). Coming back to our first example, note that if we know that x and y are of type real and of length less than or equal to 5, we can program without using \kbd{gerepile} at all: \bprog z = cgetr(5); ltop = avma; p1 = gsqr(x); p2 = gsqr(y); gaddz(p1,p2,z); avma = ltop; @eprog \noindent This practice will usually be {\it slower\/} than a craftily used \kbd{gerepile} though, and is certainly more cumbersome to use. As a rule, assignment statements should generally be avoided. \smallskip Thirdly, the philosophy of \kbd{gerepile} is the following: keep the value of the stack pointer \kbd{avma} at the beginning, and just {\it before\/} the last operation. Afterwards, it would be too late since the lower end address of the garbage zone would have been lost. Of course you can always use \kbd{gerepileupto}, but you will have to assume that the object was created {\it before\/} its components. Lastly, if all seems lost, just use \tet{gerepilecopy} (or \tet{gerepilemany}) to fix up the stack for you. \smallskip If you followed us this far, congratulations, and rejoice: the rest is much easier. \section{Implementation of the PARI types} \label{se:impl} \noindent Although it is a little tedious, we now go through each type and explain its implementation. Let \kbd{z} be a \kbd{GEN}, pointing at a PARI object. In the following paragraphs, we will constantly mix two points of view: on the one hand, \kbd{z} will be treated as the C pointer it is (in the context of program fragments like \kbd{z[1]}), on the other, as PARI's handle on (the internal representation of) some mathematical entity, so we will shamelessly write $\kbd{z} \ne 0$ to indicate that the {\it value\/} thus represented is nonzero (in which case the {\it pointer\/}~\kbd{z} certainly will be non-\kbd{NULL}). We offer no apologies for this style. In fact, you had better feel comfortable juggling both views simultaneously in your mind if you want to write correct PARI programs. Common to all the types is the first codeword \kbd{z[0]}, which we don't have to worry about since this is taken care of by \kbd{cgetg}. Its precise structure will depend on the machine you are using, but it always contain the following data: the {\it internal \idx{type number}\/} associated to the symbolic type name, the {\it\idx{length}\/} of the root in longwords, and a technical bit which indicates whether the object is a clone (see below) or not. This last one is used by GP for internal garbage collecting, you won't have to worry about it. \noindent These data can be handled through the following {\it macros\/}: \fun{long}{typ}{GEN z} returns the type number of \kbd{z}. \fun{void}{settyp}{GEN z, long n} sets the type number of \kbd{z} to \kbd{n} (you should not have to use this function if you use \kbd{cgetg}). \fun{long}{lg}{GEN z} returns the length (in longwords) of the root of \kbd{z}. \fun{long}{setlg}{GEN z, long l} sets the length of \kbd{z} to \kbd{l} (you should not have to use this function if you use \kbd{cgetg}; however, see an advanced example in \secref{se:prog}). \noindent (If you know enough about PARI to need to access the ``clone'' bit, then you'll be able to find usage hints in the code (esp. \kbd{killbloc()} and \kbd{matrix\_block()}). It {\it is\/} technical after all.) These macros are written in such a way that you don't need to worry about type casts when using them: i.e.~if \kbd{z} is a \kbd{GEN}, \kbd{typ(z[2])} will be accepted by your compiler, without your having to explicitly type \kbd{typ((GEN)z[2])}. Note that for the sake of efficiency, none of the codeword-handling macros check the types of their arguments even when there are stringent restrictions on their use. The possible second codeword is used differently by the different types, and we will describe it as we now consider each of them in turn: \subsec{Type \typ{INT} (integer):}\sidx{integer}\kbdsidx{t_INT} this type has a second codeword \kbd{z[1]} which contains the following information: the sign of \kbd{z}: coded as $1$, $0$ or $-1$ if $\kbd{z} > 0$, $\kbd{z} = 0$, $\kbd{z} < 0$ respectively. the {\it effective length\/} of \kbd{z}, i.e.~the total number of significant longwords. This means the following: apart from the integer 0, every integer is ``normalized'', meaning that the first mantissa longword (i.e.~\kbd{z[2]}) is non-zero. However, the integer may have been created with a longer length. Hence the ``length'' which is in \kbd{z[0]} can be larger than the ``effective length'' which is in \kbd{z[1]}. Accessing \kbd{z[i]} for \kbd{i} larger than or equal to the effective length will yield random results. \noindent This information is handled using the following macros: \fun{long}{signe}{GEN z} returns the sign of \kbd{z}. \fun{void}{setsigne}{GEN z, long s} sets the sign of \kbd{z} to \kbd{s}. \fun{long}{lgefint}{GEN z} returns the \idx{effective length} of \kbd{z}. \fun{void}{setlgefint}{GEN z, long l} sets the effective length of \kbd{z} to \kbd{l}. The integer 0 can be recognized either by its sign being~0, or by its effective length being equal to~2. When $\kbd{z} \ne 0$, the word \kbd{z[2]} exists and is non-zero, and the absolute value of \kbd{z} is (\kbd{z[2]},\kbd{z[3]},\dots,\kbd{z[lgefint(z)-1]}) in base \kbd{2\pow BITS\_IN\_LONG}, where as usual in this notation \kbd{z[2]} is the highest order longword. \noindent The following further macros are available: \fun{long}{mpodd}{GEN x} which is 1 if \kbd{x} is odd, and 0 otherwise. \fun{long}{mod2}{GEN x}, \fun{}{mod4}{x}, and so on up to \fun{}{mod64}{x}, which give the residue class of \kbd{x} modulo the corresponding power of 2, for {\it positive\/}~\kbd{x}. By definition, $\kbd{mod}n(x) := \kbd{mod}n(|x|)$ for $x < 0$ (the macros disregard the sign), and the result is undefined if $x = 0$. These macros directly access the binary data and are thus much faster than the generic modulo functions. Besides, they return long integers instead of \kbd{GEN}s, so they don't clutter up the stack. \subsec{Type \typ{REAL} (real number):}\kbdsidx{t_REAL}\sidx{real number} this type has a second codeword z[1] which also encodes its sign (obtained or set using the same functions as for the integers), and a binary exponent. This exponent can be handled using the following macros: \fun{long}{expo}{GEN z} returns the exponent of \kbd{z}. This is defined even when \kbd{z} is equal to zero, see \secref{se:whatzero}. \fun{void}{setexpo}{GEN z, long e} sets the exponent of \kbd{z} to \kbd{e}. \noindent Note the functions: \fun{long}{gexpo}{GEN z} which tries to return an exponent for \kbd{z}, even if \kbd{z} is not a real number. \fun{long}{gsigne}{GEN z} which returns a sign for \kbd{z}, even when \kbd{z} is neither real nor integer (a rational number for instance). The real zero is characterized by having its sign equal to 0. If \kbd{z} is not equal to~0, then is is represented as $2^e M$, where $e$ is the exponent, and $M\in [1, 2[$ is the mantissa of $z$, whose digits are stored in $\kbd{z[2]},\dots, \kbd{z[lg(z)-1]}$. More precisely, let $m$ be the integer (\kbd{z[2]},\dots, \kbd{z[lg(z)-1]}) in base \kbd{2\pow BITS\_IN\_LONG}; here, \kbd{z[2]} is the most significant longword and is normalized, i.e.~its most significant bit is~1. Then we have $M := m \cdot 2^{1 - \kbd{bit\_accuracy(lg(z))}}$. Thus, the real number $3.5$ to accuracy \kbd{bit\_accuracy(lg(z))} is represented as \kbd{z[0]} (encoding $\kbd{type} = \typ{REAL}$, \kbd{lg(z)}), \kbd{z[1]} (encoding $\kbd{sign} = 1$, $\kbd{expo} = 1$), $\kbd{z[2]} = \kbd{0xe0000000}$, $\kbd{z[3]} =\dots = \kbd{z[lg(z)-1]} = \kbd{0x0}$. \subsec{Type \typ{INTMOD} (integermod):}\kbdsidx{t_INTMOD}\sidx{integermod} \kbd{z[1]} points to the modulus, and \kbd{z[2]} at the number representing the class \kbd{z}. Both are separate \kbd{GEN} objects, and both must be of type integer, satisfying the inequality $0 \le \kbd{z[2]} < \kbd{z[1]}$. It is good practice to keep the modulus object on the heap, so that new integermods resulting from operations can point at this common object, instead of carrying along their own copies of it on the stack. The library functions implement this practice almost by default. \subsec{Type \typ{FRAC} and \typ{FRACN} (rational number):}% \kbdsidx{t_FRAC}\kbdsidx{t_FRACN}\sidx{rational number} \kbd{z[1]} points to the numerator, and \kbd{z[2]} to the denominator. Both must be of type integer. In principle $\kbd{z[2]} > 0$, but this rule does not have to be strictly obeyed. Note that a type \typ{FRACN} rational number can be converted to irreducible form using the function \fun{GEN}{gred}{GEN x}. \subsec{Type \typ{COMPLEX} (complex number):}% \kbdsidx{t_COMPLEX}\sidx{complex number} \kbd{z[1]} points to the real part, and \kbd{z[2]} to the imaginary part. A priori \kbd{z[1]} and \kbd{z[2]} can be of any type, but only certain types are useful and make sense. \subsec{Type \typ{PADIC} ($p$-adic numbers):}% \sidx{p-adic number}\kbdsidx{t_PADIC} this type has a second codeword \kbd{[1]} which contains the following information: the $p$-adic precision (the exponent of $p$ modulo which the $p$-adic unit corresponding to \kbd{z} is defined if \kbd{z} is not~0), i.e.~one less than the number of significant $p$-adic digits, and the exponent of \kbd{z}. This information can be handled using the following functions: \fun{long}{precp}{GEN z} returns the $p$-adic precision of \kbd{z}. \fun{void}{setprecp}{GEN z, long l} sets the $p$-adic precision of \kbd{z} to \kbd{l}. \fun{long}{valp}{GEN z} returns the $p$-adic valuation of \kbd{z} (i.e. the exponent). This is defined even if \kbd{z} is equal to~0, see \secref{se:whatzero}. \fun{void}{setvalp}{GEN z, long e} sets the $p$-adic valuation of \kbd{z} to \kbd{e}. In addition to this codeword, \kbd{z[2]} points to the prime $p$, \kbd{z[3]} points to $p^{\text{precp(z)}}$, and \kbd{z[4]} points to an integer representing the $p$-adic unit associated to \kbd{z} modulo \kbd{z[3]} (and points to zero if \kbd{z} is zero). To summarize, if $z\neq 0$, we have the equality: $$ \kbd{z} = p^{\text{valp(z)}} * (\kbd{z[4]} + O(\kbd{z[3]})) = p^{\text{valp(z)}} * (\kbd{z[4]} + O(p^{\text{precp(z)}})) $$ \subsec{Type \typ{QUAD} (quadratic number):}\sidx{quadratic number}\kbdsidx{t_QUAD} \kbd{z[1]} points to the polynomial defining the quadratic field, \kbd{z[2]} to the ``real part'' and \kbd{z[3]} to the ``imaginary part'', which are to be taken as the coefficients of \kbd{z} with respect to the ``canonical'' basis $(1,w)$, see~\secref{se:compquad}. Complex numbers are a particular case of quadratics but deserve a separate type. \subsec{Type \typ{POLMOD} (polmod):}\kbdsidx{t_POLMOD}\sidx{polmod} exactly as for integermods, \kbd{z[1]} points to the modulus, and \kbd{z[2]} to a polynomial representing the class of~\kbd{z}. Both must be of type \typ{POL} in the same variable. However, \kbd{z[2]} is allowed to be a simplification of such a polynomial, e.g a scalar. This is quite tricky considering the hierarchical structure of the variables; in particular, a polynomial in variable of \var{lesser} priority (see \secref{se:priority}) than the modulus variable is valid, since it can be considered as the constant term of a polynomial of degree 0 in the correct variable. On the other hand a variable of \var{greater} priority would not be acceptable. \subsec{Type \typ{POL} (polynomial):}\kbdsidx{t_POL}\sidx{polynomial} this type has a second codeword which is analogous to the one for integers. It contains a ``{\it sign\/}'': 0 if the polynomial is equal to~0, and 1 if not (see however the important remark below), a {\it variable number\/} (e.g.~0 for $x$, 1 for $y$, etc\dots), and an {\it effective length}. \noindent These data can be handled with the following macros: \teb{signe} and \teb{setsigne} as for reals and integers. \fun{long}{lgef}{GEN z} returns the \idx{effective length} of \kbd{z}. \fun{void}{setlgef}{GEN z, long l} sets the effective length of \kbd{z} to \kbd{l}. \fun{long}{varn}{GEN z} returns the variable number of the object \kbd{z}. \fun{void}{setvarn}{GEN z, long v} sets the variable number of \kbd{z} to \kbd{v}. The variable numbers encode the relative priorities of variables as discussed in \secref{se:priority}. We will give more details in \secref{se:vars}. Note also the function \fun{long}{gvar}{GEN z} which tries to return a \idx{variable number} for \kbd{z}, even if \kbd{z} is not a polynomial or power series. The variable number of a scalar type is set by definition equal to \tet{BIGINT}. The components \kbd{z[2]}, \kbd{z[3]},\dots \kbd{z[lgef(z)-1]} point to the coefficients of the polynomial {\it in ascending order}, with \kbd{z[2]} being the constant term and so on. Note that the {\it \idx{degree}\/} of the polynomial is equal to its effective length minus three. The function \fun{long}{degree}{GEN x} returns the degree of \kbd{x} with respect to its main variable even when \kbd{x} is not a polynomial (a rational function for instance). By convention, the degree of $0$ is~$-1$. \misctitle{Important remark}. A zero polynomial can be characterized by the fact that its sign is~0. However, its effective length may be equal to 2, or greater than 2. If it is greater than 2, this means that all the coefficients of the polynomial are equal to zero (as they should for a zero polynomial), but not all of these zeros are exact zeros, and more precisely the leading term \kbd{z[lgef(z)-1]} is not an exact zero. \subsec{Type \typ{SER} (power series):}\kbdsidx{t_SER}\sidx{power series} This type also has a second codeword, which encodes a ``{\it sign\/}'', i.e.~0 if the power series is 0, and 1 if not, a {\it variable number\/} as for polynomials, and an {\it exponent\/}. This information can be handled with the following functions: \teb{signe}, \teb{setsigne}, \teb{varn}, \teb{setvarn} as for polynomials, and \teb{valp}, \teb{setvalp} for the exponent as for $p$-adic numbers. Beware: do {\it not\/} use \teb{expo} and \teb{setexpo} on power series. If the power series is non-zero, \kbd{z[2]}, \kbd{z[3]},\dots \kbd{z[lg(z)-1]} point to the coefficients of \kbd{z} in ascending order, \kbd{z[2]} being the first non-zero coefficient. Note that the exponent of a power series can be negative, i.e.~we are then dealing with a Laurent series (with a finite number of negative terms). \subsec{Type \typ{RFRAC} and \typ{RFRACN} (rational function):}% \kbdsidx{t_RFRAC}\kbdsidx{t_RFRACN}\sidx{rational function} \kbd{z[1]} points to the numerator, and \kbd{z[2]} on the denominator. The denominator must be of type polynomial. Note that a type \typ{RFRACN} rational function can be converted to irreducible form using the function \teb{gred}. \subsec{Type \typ{QFR} (indefinite binary quadratic form):}% \kbdsidx{t_QFR}\sidx{indefinite binary quadratic form} \kbd{z[1]}, \kbd{z[2]}, \kbd{z[3]} point to the three coefficients of the form and should be of type integer. \kbd{z[4]} is Shanks's distance function, and should be of type real. \subsec{Type \typ{QFI} (definite binary quadratic form):}% \kbdsidx{t_QFI}\sidx{definite binary quadratic form} \kbd{z[1]}, \kbd{z[2]}, \kbd{z[3]} point to the three coefficients of the form. All three should be of type integer. \subsec{Type \typ{VEC} and \typ{COL} (vector):}% \kbdsidx{t_VEC}\kbdsidx{t_COL}\sidx{row vector}\sidx{column vector} \kbd{z[1]}, \kbd{z[2]},\dots \kbd{z[lg(z)-1]} point to the components of the vector. \subsec{Type \typ{MAT} (matrix):}\kbdsidx{t_MAT}\sidx{matrix} \kbd{z[1]}, \kbd{z[2]},\dots \kbd{z[lg(z)-1]} point to the column vectors of \kbd{z}, i.e.~they must be of type \typ{COL} and of the same length. \noindent The last two were introduced for specific GP use, and you'll be much better off using the standard malloc'ed C constructs when programming in library mode. We quote them just for completeness (advising you not to use them): \subsec{Type \typ{LIST} (list):}\kbdsidx{t_LIST}\sidx{list} This one has a second codeword which contains an effective length (handled through \teb{lgef}~/ \teb{setlgef}). \kbd{z[2]},\dots, \kbd{z[lgef(z)-1]} contain the components of the list. \subsec{Type \typ{STR} (character string):}% \kbdsidx{t_STR}\sidx{character string} \fun{char *}{GSTR}{z} (= \kbd{(z+1)}) points to the first character of the (\kbd{NULL}-terminated) string. \misctitle{Implementation note:} for the types including an exponent (or a valuation), we actually store a biased non-negative exponent (bit-ORing the biased exponent to the codeword), obtained by adding a constant to the true exponent: either \kbd{HIGHEXPOBIT} (for \typ{REAL}) or \kbd{HIGHVALPBIT} (for \typ{PADIC} and \typ{SER}). Of course, this is encapsulated by the exponent/valuation-handling macros and need not concern the library user. \section{PARI variables}\label{se:vars} \subsec{Multivariate objects}\sidx{variable (priority)} \noindent We now consider variables and formal computations, and give the technical details corresponding to the general discussion in \secref{se:priority}. As we have seen in \secref{se:impl}, the codewords for types \typ{POL} and \typ{SER} encode a ``variable number''. This is an integer, ranging from $0$ to \kbd{MAXVARN}. The lower it is, the higher the variable priority. In fact, the way an object will be considered in formal computations depends entirely on its ``principal variable number'' which is given by the function \fun{long}{gvar}{GEN z} \noindent which returns a \idx{variable number} for \kbd{z}, even if \kbd{z} is not a polynomial or power series. The variable number of a scalar type is set by definition equal to \tet{BIGINT} which is bigger than any legal variable number. The variable number of a recursive type which is not a polynomial or power series is the minimal variable number of its components. But for polynomials and power series only the ``outermost'' number counts (we directly access $\tet{varn}(x)$ in the codewords): the representation is not symmetrical at all. Under GP, one need not worry too much since the interpreter will define the variables as it sees them\footnote{*}{ More precisely, the first time a given identifier is read by the GP parser (and is not immediately interpreted as a function) a new variable is created, and it is assigned a strictly lower priority than any variable in use at this point. On startup, before any user input has taken place, 'x' is defined in this way and will thus always have maximal priority (and variable number $0$).} % and do the right thing with the polynomials produced (however, have a look at the remark in \secref{se:rempolmod}). But in library mode, they are tricky objects if you intend to build polynomials yourself (and not just let PARI functions produce them, which is usually less efficient). For instance, it does not make sense to have a variable number occur in the components of a polynomial whose main variable has a higher number (lower priority), even though there's nothing PARI can do to prevent you from doing it. \subsec{Creating variables} A basic difficulty is to ``create'' a variable. As we have seen in \secref{se:intro4}, a plethora of objects is associated to variable number~$v$. Here is the complete list: \teb{polun}$[v]$ and \teb{polx}$[v]$, which you can use in library mode and which represent, respectively, the monic monomials of degrees 0 and 1 in~$v$; \teb{varentries}$[v]$, and \teb{polvar}$[v]$. The latter two are only meaningful to GP, but they have to be set nevertheless. All of them must be properly defined before you can use a given integer as a variable number. Initially, this is done for $0$ (the variable \kbd{x} under GP), and \tet{MAXVARN}, which is there to address the need for a ``temporary'' new variable in library mode and cannot be input under GP. No documented library function can create from scratch an object involving \tet{MAXVARN} (of course, if the operands originally involve \kbd{MAXVARN}, the function will abide). We call the latter type a ``temporary variable''. The regular variables meant to be used in regular objects, are called ``user variables\sidx{variable (user)}''. \subsubsec{User variables}\sidx{variable (user)}: When the program starts, \kbd{x} is the only user variable (number~$0$). To define new ones, use \fun{long}{fetch_user_var}{char *$s$} \noindent which inspects the user variable named $s$ (creating it if needed), and returns its variable number. \bprog long v = fetch_user_var("y"); GEN gy = polx[v]; @eprog This function raises an error if $s$ is already known as a function name to the interpreter. \misctitle{Caveat:} it is possible to use \teb{flissexpr} (see~\secref{se:flisexpr}) to execute a GP command and create GP variables on the fly as needed: \bprog GEN gy = flissexpr("y"); /*@Ccom supposedly returns polx[$v$], for some $v$ */ long v = gvar(gy); @eprog \noindent This is dangerous, especially when programming functions that will be used under GP. The code above reads the value of \kbd{y}, as it is currently known by the GP interpreter (possibly creating it in the process). All is well and good if \kbd{y} hasn't been tampered with in previous GP commands. But if \kbd{y} has been modified (e.g \kbd {y = 1}), then the value of \kbd{gy} is not what you expected it to be and corresponds instead to the current value of the GP variable (e.g \kbd{gun}). \subsubsec{Temporary variables}\sidx{variable (temporary)}: \kbd{MAXVARN} is available, but is better left to pari internal functions (some of which don't check that \kbd{MAXVARN} is free for them to use, which can be considered a bug). You can create more temporary variables using \fun{long}{fetch_var}{}\label{se:fetch_var} \noindent This returns a variable number which is guaranteed to be unused by the library at the time you get it and as long as you do not delete it (we'll see how to do that shortly). This has {\it lower\/} number (i.e.~{\it higher\/} priority) than any temporary variable produced so far (\kbd{MAXVARN} is assumed to be the first such). This call updates all the aforementioned internal arrays. In particular, after the statement \kbd{v = fetch\_var()}, you can use \kbd{polun[v]} and \kbd{polx[v]}. The variables created in this way have no identifier assigned to them though, and they will be printed as \kbd{\#<\text{number}>}, except for \kbd{MAXVARN} which will be printed as~\kbd{\#}. You can assign a name to a temporary variable, after creating it, by calling the function \fun{void}{name_var}{long n, char *s} \noindent after which the output machinery will use the name \kbd{s} to represent the variable number~\kbd{n}. The GP parser will {\it not\/} recognize it by that name, however, and calling this on a variable known to~GP will raise an error. Temporary variables are meant to be used as free variables, and you should never assign values or functions to them as you would do with variables under~GP. For that, you need a user variable. All objects created by \kbd{fetch\_var} are on the heap and not on the stack, thus they are not subject to standard garbage collecting (they won't be destroyed by a \kbd{gerepile} or \kbd{avma = ltop} statement). When you don't need a variable number anymore, you can delete it using \fun{long}{delete_var}{} \noindent which deletes the {\it latest\/} temporary variable created and returns the variable number of the previous one (or simply returns 0 if you try, in vain, to delete \kbd{MAXVARN}). Of course you should make sure that the deleted variable does not appear anywhere in the objects you use later on. Here is an example: \bprog { long first = fetch_var(); long n1 = fetch_var(); long n2 = fetch_var(); /*@Ccom prepare three variables for internal use */ ... /*@Ccom delete all variables before leaving */ do { num = delete_var(); } while (num && num <= first); }@eprog \noindent The (dangerous) statement \bprog while (delete_var()) /*@Ccom empty */; @eprog \noindent removes all temporary variables that were in use, except \kbd{MAXVARN} which cannot be deleted. \section{Input and output} \noindent Two important aspects have not yet been explained which are specific to library mode: input and output of PARI objects. \subsec{Input}. \noindent For \idx{input}, PARI provides you with two powerful high level functions which enables you to input your objects as if you were under GP. In fact, the second one {\it is\/} essentially the GP syntactical parser, hence you can use it not only for input but for (most) computations that you can do under GP. These functions are called \teb{flisexpr} and \teb{flisseq}. The first one has the following syntax:\label{se:flisexpr} \fun{GEN}{flisexpr}{char *s} \noindent Its effect is to analyze the input string s and to compute the result as in GP. However it is limited to one expression. If you want to read and evaluate a sequence of expressions, use \fun{GEN}{flisseq}{char *s} \noindent\sidx{filter} In fact these two functions start by {\it filtering\/} out all spaces and comments in the input string (that's what the initial \kbd{f} stands for). They then call the underlying basic functions, the GP parser proper: \fun{GEN}{lisexpr}{char *s} and \fun{GEN}{lisseq}{char *s}, which are slightly faster but which you probably don't need. To read a \kbd{GEN} from a file, you can use the simpler interface \fun{GEN}{lisGEN}{FILE *file} which reads a character string of arbitrary length from the stream \kbd{file} (up to the first newline character), applies \kbd{flisexpr} to it, and returns the resulting \kbd{GEN}. This way, you won't have to worry about allocating buffers to hold the string. To interactively input an expression, use \kbd{lisGEN(stdin)}. This function returns \kbd{NULL} if \kbd{EOF} is encountered before a complete expression could be read. Once in a while, it may be necessary to evaluate a GP expression sequence involving a call to a function you have defined in~C. This is easy using \teb{install} which allows you to manipulate quite an arbitrary function (GP knows about pointers!). The syntax is \fun{void}{install}{void *f, char *name, char *code} \noindent where \kbd{f} is the (address of) the function (cast to the C type \kbd{void*}), \kbd{name} is the name by which you want to access your function from within your GP expressions, and \kbd{code} is a character string describing the function call prototype (see~\secref{se:gp.interface} for the precise description of prototype strings). In case the function returns a \kbd{GEN}, it should satisfy \kbd{gerepileupto} assumptions (see \secref{se:garbage}). \subsec{Output}. \noindent For \idx{output}, there exist essentially three different functions (with variants), corresponding to the three main GP output formats (as described in \secref{se:output}), plus three extra ones, respectively devoted to \TeX\ output, string output, and (advanced) debugging. \noindent $\bullet$ ``raw'' format, obtained by using the function \teb{brute} with the following syntax: \fun{void}{brute}{GEN obj, char x, long n} \noindent This prints the PARI object \kbd{obj} in \idx{format} \kbd{x0.n}, using the notations from \secref{se:format}. Recall that here \kbd{x} is either \kbd{'e'}, \kbd{'f'} or \kbd{'g'} corresponding to the three numerical output formats, and \kbd{n} is the number of printed significant digits, and should be set to $-1$ if all of them are wanted (these arguments only affect the printing of real numbers). Usually you won't need that much flexibility, so most of the time you will get by with the function \fun{void}{outbrute}{GEN obj}, which is equivalent to \kbd{brute(x,'g',-1)}, \noindent or even better, with \fun{void}{output}{GEN obj} which is equivalent to \kbd{outbrute(obj)} followed by a newline and a buffer flush. This is especially nice during debugging. For instance using \kbd{dbx} or \kbd{gdb}, if \kbd{obj} is a \kbd{GEN}, typing \kbd{print output(obj)} will enable you to see the content of \kbd{obj} (provided the optimizer has not put it into a register, but it's rarely a good idea to debug optimized code). \noindent $\bullet$ ``prettymatrix'' format: this format is identical to the preceding one except for matrices. The relevant functions are: \fun{void}{matbrute}{GEN obj, char x, long n} \fun{void}{outmat}{GEN obj}, which is followed by a newline and a buffer flush. \noindent $\bullet$ ``prettyprint'' format: the basic function has an additional parameter \kbd{m}, corresponding to the (minimum) field width used for printing integers: \fun{void}{sor}{GEN obj, char x, long n, long m} \noindent The simplified version is \fun{void}{outbeaut}{GEN obj} which is equivalent to \kbd{sor(obj,'g',-1,0)} followed by a newline and a buffer flush. \noindent $\bullet$ The first extra format corresponds to the \teb{texprint} function of GP, and gives a \TeX\ output of the result. It is obtained by using: \fun{void}{exe}{GEN obj, char x, long n} \noindent $\bullet$ The second one is the function \teb{GENtostr} which converts a PARI \kbd{GEN} to an ASCII string. The syntax is \fun{char*}{GENtostr}{GEN obj}, wich returns a \kbd{malloc}'ed character string (which you should \kbd{free} after use). \noindent $\bullet$ The third and final one outputs the \idx{hexadecimal tree} corresponding to the GP command \kbd{\b x} using the function \fun{void}{voir}{GEN obj, long nb}, which will only output the first \kbd{nb} words corresponding to leaves (very handy when you have a look at big recursive structures). If you set this parameter to $-1$ all significant words will be printed. Usually this last type of output would only be used for debugging purposes. \misctitle{Remark}. Apart from \teb{GENtostr}, all PARI output is done on the stream \teb{outfile}, which by default is initialized to \teb{stdout}. If you want that your output be directed to another file, you should use the function \fun{void}{switchout}{char *name} where \kbd{name} is a character string giving the name of the file you are going to use. The output will be {\it appended\/} at the end of the file. In order to close the file, simply call \kbd{switchout(NULL)}. Similarly, errors are sent to the stream \teb{errfile} (\teb{stderr} by default), and input is done on the stream \teb{infile}, which you can change using the function \teb{switchin} which is analogous to \teb{switchout}. \misctitle{(Advanced) Remark}. All output is done according to the values of the \teb{pariOut}~/ \teb{pariErr} global variables which are pointers to structs of pointer to functions. If you really intend to use these, this probably means you are rewriting GP. In that case, have a look at the code in \kbd{language/es.c} (\kbd{init80()} or \kbd{GENtostr()} for instance). \subsec{Errors}.\sidx{error}\kbdsidx{talker} \noindent If you want your functions to issue error messages, you can use the general error handling routine \teb{err}. The basic syntax is % \bprog err(talker, "error message"); @eprog \noindent This will print the corresponding error message and exit the program (in library mode; go back to the GP prompt otherwise).\label{se:err} You can also use it in the more versatile guise \bprog err(talker, format, ...); @eprog\noindent where \kbd{format} describes the format to use to write the remaining operands, as in the \teb{printf} function (however, see the next section). The simple syntax above is just a special case with a constant format and no remaining arguments. \noindent The general syntax is \fun{void}{err}{numerr,...} \noindent where \kbd{numerr} is a codeword which indicates what to do with the remaining arguments and what message to print. The list of valid keywords is in \kbd{language/errmessages.c} together with the basic corresponding message. For instance, \kbd{err(typeer,"matexp")} will print the message: \kbd{ *** incorrect type in matexp.} \noindent Among the codewords are {\it warning\/} keywords (all those which start with the prefix \kbd{warn}). In that case, \teb{err} does {\it not\/} abort the computation, just print the requested message and go on. The basic example is \kbd{err(warner, "Strategy 1 failed. Trying strategy 2")} \noindent which is the exact equivalent of \kbd{err(talker,...)} except that you certainly don't want to stop the program at this point, just inform the user that something important has occurred (in particular, this output would be suitably highlighted under GP, whereas a simple \kbd{printf} would not). \subsec{Debugging output}.\sidx{debugging}\sidx{format}\label{se:dbg_output} \noindent The global variables \teb{DEBUGLEVEL} and \teb{DEBUGMEM} (corresponding to the default \teb{debug} and \teb{debugmem}, see \secref{se:defaults}) are used throughout the PARI code to govern the amount of diagnostic and debugging output, depending on their values. You can use them to debug your own functions, especially after having made them accessible under GP through the command \teb{install} (see \secref{se:install}). For debugging output, you can use \kbd{printf} and the standard output functions (\teb{brute} or \teb{output} mainly), but also some special purpose functions which embody both concepts, the main one being \fun{void}{fprintferr}{char *pariformat, ...} \noindent Now let's define what a PARI format is. It is a character string, similar to the one \kbd{printf} uses, where \kbd{\%} characters have a special meaning. It describes the format to use when printing the remaining operands. But, in addition to the standard format types, you can use \kbd{\%Z} to denote a \kbd{GEN} object (we would have liked to pick \kbd{\%G} but it was already in use!). For instance you could write: \bprog err(talker, "x[%d] = %Z is not invertible!", i, x[i]) @eprog \noindent since the \teb{err} function accepts PARI formats. Here \kbd{i} is an \kbd{int}, \kbd{x} a \kbd{GEN} which is not a leaf and this would insert in raw format the value of the \kbd{GEN} \kbd{x[i]}. \subsec{Timers and timing output}. \noindent To profile your functions, you can use the PARI timer. The functions \fun{long}{timer}{} and \fun{long}{timer2}{} return the elapsed time since the last call of the same function (in milliseconds). Two different functions (identical except for their independent time-of-last-call memories!) are provided so you can have both global timing and fine tuned profiling. You can also use \fun{void}{msgtimer}{char *format,...}, which prints prints \kbd{Time}, then the remaining arguments as specified by \kbd{format} (which is a PARI format), then the output of \kbd{timer2}. This mechanism is simple to use but not foolproof. If some other function uses these timers, and many PARI functions do use \kbd{timer2} when \tet{DEBUGLEVEL} is high enough, the timings will be meaningless. To handle timing in a reentrant way, PARI defines a dedicated datatype, \tet{pari_timer}. The functions \fun{long}{TIMER}{pari\_timer *T} \fun{long}{msgTIMER}{pari\_timer *T, char *format,...} \noindent are equivalent to \kbd{timer} and \kbd{msgtimer} respectively, except they use a unique timer \kbd{T} containing all the information needed, so that no other function can mess with your timings. They are used as follows: \bprog pari_timer T; (void)TIMER(&T); /* initialize timer */ ... printf("Total time: %ld\n", TIMER(&T)); @eprog or \bprog pari_timer T; long i; GEN L; (void)TIMER(&T); /* initialize timer */ for (i = 1; i < 10; i++) { ... msgTIMER(&T, "for i = %ld (L[i] = %Z)", i, L[i]); } @eprog \section{A complete program} \label{se:prog} \noindent Now that the preliminaries are out of the way, the best way to learn how to use the library mode is to work through a detailed non-trivial example of a main program. We will write a program which computes the exponential of a square matrix~$x$. The complete listing is given in Appendix~B, but each part of the program will be produced and explained here. We will use an algorithm which is not optimal but is not far from the one used for the PARI function \teb{gexp} (in fact embodied in the function \kbd{mpexp1}). This consists in calculating the sum of the series: $$e^{x/(2^n)}=\sum_{k=0}^\infty \dfrac{(x/(2^n))^k}{k!}$$ for a suitable positive integer $n$, and then computing $e^x$ by repeated squarings. First, we will need to compute the $L^2$-norm of the matrix~$x$, i.e.~the quantity: $$z=\|x\|_2=\sqrt{\sum x_{i,j}^2}.$$ We will then choose the integer $n$ such that the $L^2$-norm of $x/(2^n)$ is less than or equal to~1, i.e. $$ n = \left\lceil{\ln(z)}\big/{\ln(2)}\right\rceil $$ if $z\ge1$, and~$n=0$ otherwise. Then the series will converge at least as fast as the usual one for $e^1$, and the cutoff error will be easy to estimate. In fact a larger value of $n$ would be preferable, but this is slightly machine dependent and more complicated, and will be left to the reader. Let us start writing our program. So as to be able to use it in other contexts, we will structure it in the following way: a main program which will do the input and output, and a function which we shall call \kbd{matexp} which does the real work. The main program is easy to write. It can be something like this: \bprog #include GEN matexp(GEN x, long prec); int main() { long d, prec = 3; GEN x; /*@Ccom take a stack of $10^6$ bytes, no prime table */ pari_init(1000000, 2); printf("precision of the computation in decimal digits:\n"); d = itos(lisGEN(stdin)); if (d > 0) prec = (long) (d*pariK1+3); printf("input your matrix in GP format:\n"); x = matexp(lisGEN(stdin), prec); sor(x, 'g', d, 0); exit(0); }@eprog \noindent The variable \kbd{prec} represents the length in longwords of the real numbers used. \teb{pariK1} is a constant (defined in \kbd{paricom.h}) equal to $\ln(10) / (\ln(2) * \kbd{BITS\_IN\_LONG})$, which allows us to convert from a number of decimal digits to a number of longwords, independently of the actual bit size of your long integers. The function \teb{lisGEN} reads an expression (here from standard input) and converts it to a \kbd{GEN}, like the GP parser itself would. This means it takes care of whitespace etc.\ in the input, and can do computations (e.g.~\kbd{matid(2)} or \kbd{[1,0; 0,1]} are equally valid inputs). Finally, \kbd{sor} is the general output routine. We have chosen to give \kbd{d} significant digits since this is what was asked for. Note that there is a trick hidden here: if a negative \kbd{d} was input, then the computation will be done in precision 3 (i.e.~about 9.7 decimal digits for 32-bit machines and 19.4 for 64-bit machines) and in the function \kbd{sor}, giving a negative third argument outputs all the significant digits, which is entirely appropriate. Now let us attack the main course, the function \kbd{matexp}: % \bprog GEN matexp(GEN x, long prec) { long lx=lg(x),i,k,n,lbot, ltop = avma; GEN y,r,s,p1,p2; /*@Ccom check that x is a square matrix */ if (typ(x) != t_MAT) err(talker,"this expression is not a matrix"); if (lx == 1) return cgetg(1, t_MAT); if (lx != lg(x[1])) err(talker,"not a square matrix"); /*@Ccom compute the $L_2$ norm of x */ s = gzero; for (i=1; i>}. We now come to the heart of the function. We have a \kbd{GEN} \kbd{p1} which points to a certain matrix of which we want to take the exponential. We will want to transform this matrix into a matrix with real (or complex of real) entries before starting the computation. To do this, we simply multiply by the real number 1 in precision $\kbd{prec}+1$ (to be on the side of safety). To sum the series, we will use three variables: a variable \kbd{p2} which at stage $k$ will contain $\kbd{p1}^k/k!$, a variable \kbd{y} which will contain $\sum_{i=0}^k \kbd{p1}^i/i!$, and a variable \kbd{r} which will contain the size estimate $\kbd{s}^k/k!$. Note that we do not use Horner's rule. This is simply because we are lazy and do not want to compute in advance the number of terms that we need. We leave this modification (and many other improvements!) to the reader. The program continues as follows: \bprog /*@Ccom initializations before the loop */ r = cgetr(prec+1); gaffsg(1,r); p1 = gmul(r,p1); y = gscalmat(r,lx-1); /*@Ccom creates scalar matrix with r on diagonal */ p2 = p1; r = s; k = 1; y = gadd(y,p2); /*@Ccom now the main loop */ while (expo(r) >= -BITS_IN_LONG*(prec-1)) { k++; p2 = gdivgs(gmul(p2,p1),k); r = gdivgs(gmul(s,r),k); y = gadd(y,p2); } /*@Ccom now square back n times if necessary */ if (!n) { lbot = avma; y = gcopy(y); } else { for (i=0; i "GGp" void name(GEN x, GEN y, long prec) ----> "vGGp" void name(GEN x, long y, long prec) ----> "vGLp" long name(GEN x) ----> "lG" @eprog If you want more examples, GP gives you easy access to the parser codes associated to all GP functions: just type \kbd{\b{h} \var{function}}. You can then compare with the C prototypes as they stand in the code. \misctitle{Remark}: If you need to implement complicated control statements (probably for some improved summation functions), you'll need to know about the \teb{entree} type, which is not documented. Check the comment before the function list at the end of \kbd{language/init.c} and the source code in \kbd{language/sumiter.c}. You should be able to make something of it. \smallskip \subsec{Coding guidelines}. \noindent Code your function in a file of its own, using as a guide other functions in the PARI sources. One important thing to remember is to clean the stack before exiting your main function (usually using \kbd{gerepile}), since otherwise successive calls to the function will clutter the stack with unnecessary garbage, and stack overflow will occur sooner. Also, if it returns a \kbd{GEN} and you want it to be accessible to GP, you have to make sure this \kbd{GEN} is suitable for \kbd{gerepileupto} (see \secref{se:garbage}). If error messages are to be generated in your function, use the general error handling routine \kbd{err} (see \secref{se:err}). Recall that, apart from the \kbd{warn} variants, this function does not return but ends with a \kbd{longjmp} statement. As well, instead of explicit \kbd{printf}~/ \kbd{fprintf} statements, use the following encapsulated variants: \fun{void}{pariputs}{char *s}: write \kbd{s} to the GP output stream. \fun{void}{fprintferr}{char *s}: write \kbd{s} to the GP error stream (this function is in fact much more versatile, see \secref{se:dbg_output}). Declare all public functions in an appropriate header file, if you expect somebody will access them from C. For example, if dynamic loading is not available, you may need to modify PARI to access these functions, so put them in \kbd{paridecl.h}. The other functions should be declared \kbd{static} in your file. Your function is now ready to be used in library mode after compilation and creation of the library. If possible, compile it as a shared library (see the \kbd{Makefile} coming with the \kbd{matexp} example in the distribution). It is however still inaccessible from GP.\smallskip \subsec{Integration with GP as a shared module} To tell GP about your function, you must do the following. First, find a name for it. It does not have to match the one used in library mode, but consistency is nice. It has to be a valid GP identifier, i.e.~use only alphabetic characters, digits and the underscore character (\kbd{\_}), the first character being alphabetic. Then you have to figure out the correct \idx{parser code} corresponding to the function prototype. This has been explained above (\secref{se:gp.interface}). Now, assuming your Operating System is supported by \tet{install}, simply write a GP script like the following: \bprog install(libname, code, gpname, library) addhelp(gpname, "some help text") @eprog \noindent(see \secref{se:addhelp} and~\ref{se:install}). The \idx{addhelp} part is not mandatory, but very useful if you want others to use your module. \kbd{libname} is how the function is named in the library, usually the same name as one visible from C. Read that file from your GP session (from your \idx{preferences file} for instance, see \secref{se:gprc}), and that's it, you can use the new function \var{gpname} under GP (and we would very much like to hear about it!). \subsec{Integration the hard way} If \tet{install} is not available for your Operating System, it's more complicated: you have to hardcode your function in the GP binary (or install \idx{Linux}). Here's what needs to be done: In the definition of \kbd{functions\_basic} (file \kbd{language/init.c}), add your entry in exact alphabetical order by its GP name (note that digits come before letters), in a line of the form: \bprog { "gpname", V, (void*)libname, secno, "code" } @eprog \noindent where \kbd{libname} is the name of your function in library mode, \kbd{gpname} the name that you have chosen to call it under GP, \kbd{secno} is the section number of Chapter~3 in which this function would belong (type \kbd{?} in GP to see the list), \kbd{V} is a number between 0 and 99. Right now, for PARI there are only two significant values: zero means that it's possible to call the function without argument, and non-zero means it needs at least one argument. A binding of PARI to an external language (such as \kbd{Math::Pari} Perl module) may actually distinguish between different non-zero values. Better use 99 if you want a non-zero value which will not confuse anybody. \kbd{code} is the parser code. Once this has been done, in the file \kbd{language/helpmessages.c} add in exact alphabetical order a short message describing the effect of your function: \kbd{"name(x,y,...)=short descriptive message",} The message must be a single line, of arbitrary length. Do not use \kbd{\bs{n}}; the necessary newlines will be inserted by GP's online help functions. Optional arguments should be shown between braces (see the other messages for comparison).\smallskip Now, you can recompile GP. \subsec{Example}. % A complete description could look like this: \bprog { install(bnfinit0, GD0,L,DGp, ClassGroupInit, "libpari.so"); addhelp(ClassGroupInit, "ClassGroupInit(P,{flag=0},{data=[]}): compute the necessary data for ..."); } @eprog which means we have a function \kbd{ClassGroupInit} under GP, which calls the library function \kbd{bnfinit0} . The function has one mandatory argument, and possibly two more (two \kbd{'D'} in the code), plus the current real precision. More precisely, the first argument is a \kbd{GEN}, the second one is converted to a \kbd{long} using \kbd{itos} (\kbd{0} is passed if it is omitted), and the third one is also a \kbd{GEN}, but we pass \kbd{NULL} if no argument was supplied by the user. This matches the C prototype (from \kbd{paridecl.h}): % \bprog GEN bnfinit0(GEN P, long flag, GEN data, long prec) @eprog This function is in fact coded in \kbd{basemath/buch2.c}, and will in this case be completely identical to the GP function \kbd{bnfinit} but GP does not need to know about this, only that it can be found somewhere in the shared library \kbd{libpari.so}. \misctitle{Important note:} You see in this example that it is the function's responsibility to correctly interpret its operands: \kbd{data = NULL} is interpreted {\it by the function\/} as an empty vector. Note that since \kbd{NULL} is never a valid \kbd{GEN} pointer, this trick always enables you to distinguish between a default value and actual input: the user could explicitly supply an empty vector! \misctitle{Note:} If \kbd{install} were not available we would have to modify \kbd{language/helpmessages.c}, and \kbd{language/init.c} and recompile GP. The entry in \kbd{functions\_basic} corresponding to the function above is actually \bprog { "bnfinit", 91, (void*)bnfinit0, 6, "GD0,L,DGp" } @eprog \vfill\eject