Annotation of OpenXM/doc/calc2000/homogeneous-network.tex, Revision 1.1
1.1 ! noro 1: % $OpenXM: OpenXM/doc/issac2000/homogeneous-network.tex,v 1.13 2000/01/17 08:50:56 noro Exp $
! 2:
! 3: \subsection{Distributed computation with homogeneous servers}
! 4: \label{section:homog}
! 5:
! 6: One of the aims of OpenXM is a parallel speedup by a distributed computation
! 7: with homogeneous servers. As the current specification of OpenXM does
! 8: not include communication between servers, one cannot expect
! 9: the maximal parallel speedup. However it is possible to execute
! 10: several types of distributed computation as follows.
! 11:
! 12: \subsubsection{Product of univariate polynomials}
! 13:
! 14: Shoup \cite{Shoup} showed that the product of univariate polynomials
! 15: with large degrees and large coefficients can be computed efficiently
! 16: by FFT over small finite fields and Chinese remainder theorem.
! 17: It can be easily parallelized:
! 18:
! 19: \begin{tabbing}
! 20: Input :\= $f_1, f_2 \in {\bf Z}[x]$ such that $deg(f_1), deg(f_2) < 2^M$\\
! 21: Output : $f = f_1f_2$ \\
! 22: $P \leftarrow$ \= $\{m_1,\cdots,m_N\}$ where $m_i$ is an odd prime, \\
! 23: \> $2^{M+1}|m_i-1$ and $m=\prod m_i $ is sufficiently large. \\
! 24: Separate $P$ into disjoint subsets $P_1, \cdots, P_L$.\\
! 25: for \= $j=1$ to $L$ $M_j \leftarrow \prod_{m_i\in P_j} m_i$\\
! 26: Compute $F_j$ such that $F_j \equiv f_1f_2 \bmod M_j$\\
! 27: \> and $F_j \equiv 0 \bmod m/M_j$ in parallel.\\
! 28: \> (The product is computed by FFT.)\\
! 29: return $\phi_m(\sum F_j)$\\
! 30: (For $a \in {\bf Z}$, $\phi_m(a) \in (-m/2,m/2)$ and $\phi_m(a)\equiv a \bmod m$)
! 31: \end{tabbing}
! 32:
! 33: Figure \ref{speedup}
! 34: shows the speedup factor under the above distributed computation
! 35: on Risa/Asir. For each $n$, two polynomials of degree $n$
! 36: with 3000bit coefficients are generated and the product is computed.
! 37: The machine is FUJITSU AP3000,
! 38: a cluster of Sun workstations connected with a high speed network
! 39: and MPI over the network is used to implement OpenXM.
! 40: \begin{figure}[htbp]
! 41: \epsfxsize=8.5cm
! 42: \begin{center}
! 43: \epsffile{speedup.ps}
! 44: \end{center}
! 45: \caption{Speedup factor}
! 46: \label{speedup}
! 47: \end{figure}
! 48:
! 49: If the number of servers is $L$ and the inputs are fixed, then the cost to
! 50: compute $F_j$ in parallel is $O(1/L)$, whereas the cost
! 51: to send and receive polynomials is $O(L)$ if {\tt ox\_push\_cmo()} and
! 52: {\tt ox\_pop\_cmo()} are repeatedly applied on the client.
! 53: Therefore the speedup is limited and the upper bound of
! 54: the speedup factor depends on the ratio of
! 55: the computational cost and the communication cost for each unit operation.
! 56: Figure \ref{speedup} shows that
! 57: the speedup is satisfactory if the degree is large and $L$
! 58: is not large, say, up to 10 under the above environment.
! 59: If OpenXM provides operations for the broadcast and the reduction
! 60: such as {\tt MPI\_Bcast} and {\tt MPI\_Reduce} respectively, the cost of
! 61: sending $f_1$, $f_2$ and gathering $F_j$ may be reduced to $O(\log_2L)$
! 62: and we can expect better results in such a case.
! 63:
! 64: \subsubsection{Competitive distributed computation by various strategies}
! 65:
! 66: SINGULAR \cite{Singular} implements {\it MP} interface for distributed
! 67: computation and a competitive Gr\"obner basis computation is
! 68: illustrated as an example of distributed computation.
! 69: Such a distributed computation is also possible on OpenXM.
! 70: The following Risa/Asir function computes a Gr\"obner basis by
! 71: starting the computations simultaneously from the homogenized input and
! 72: the input itself. The client watches the streams by {\tt ox\_select()}
! 73: and the result which is returned first is taken. Then the remaining
! 74: server is reset.
! 75:
! 76: \begin{verbatim}
! 77: /* G:set of polys; V:list of variables */
! 78: /* O:type of order; P0,P1: id's of servers */
! 79: def dgr(G,V,O,P0,P1)
! 80: {
! 81: P = [P0,P1]; /* server list */
! 82: map(ox_reset,P); /* reset servers */
! 83: /* P0 executes non-homogenized computation */
! 84: ox_cmo_rpc(P0,"dp_gr_main",G,V,0,1,O);
! 85: /* P1 executes homogenized computation */
! 86: ox_cmo_rpc(P1,"dp_gr_main",G,V,1,1,O);
! 87: map(ox_push_cmd,P,262); /* 262 = OX_popCMO */
! 88: F = ox_select(P); /* wait for data */
! 89: /* F[0] is a server's id which is ready */
! 90: R = ox_get(F[0]);
! 91: if ( F[0] == P0 ) {
! 92: Win = "nonhomo"; Lose = P1;
! 93: } else {
! 94: Win = "homo"; Lose = P0;
! 95: }
! 96: ox_reset(Lose); /* reset the loser */
! 97: return [Win,R];
! 98: }
! 99: \end{verbatim}
FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>