% $OpenXM: OpenXM/doc/issac2000/homogeneous-network.tex,v 1.2 2000/01/02 07:32:12 takayama Exp $

\section{Applications}
\subsection{Distributed computation with homogeneous servers}

OpenXM also aims at speedup through distributed computation
with homogeneous servers. As the current specification of OpenXM does
not include communication between servers, one cannot expect
the maximal parallel speedup. However, it is possible to execute
several types of distributed computation, as follows.

\subsubsection{Product of univariate polynomials}

Shoup \cite{Shoup} showed that the product of univariate polynomials
with large degrees and large coefficients can be computed efficiently
by FFT over small finite fields and the Chinese remainder theorem.
This method is easily parallelized:

\begin{tabbing}
Input :\= $f_1, f_2 \in Z[x]$\\
\> such that $\deg(f_1), \deg(f_2) < 2^M$\\
Output : $f = f_1f_2$\\
$P \leftarrow$ \= $\{m_1,\ldots,m_N\}$ where $m_i$ is a prime,\\
\> $2^{M+1} \mid m_i-1$ and $m=\prod m_i$ is sufficiently large.\\
Separate $P$ into disjoint subsets $P_1, \ldots, P_L$.\\
for \= $j=1$ to $L$: $M_j \leftarrow \prod_{m_i\in P_j} m_i$\\
Compute $F_j$ such that $F_j \equiv f_1f_2 \bmod M_j$\\
\> and $F_j \equiv 0 \bmod m/M_j$ in parallel.\\
\> ($f_1, f_2$ are regarded as integral.\\
\> The product is computed by FFT.)\\
return $\phi_m(\sum F_j)$\\
(For $a \in Z$, $\phi_m(a) \in (-m/2,m/2)$ and $\phi_m(a)\equiv a \bmod m$.)
\end{tabbing}
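
As an illustration of the arithmetic above, the following is a minimal
Python sketch of the CRT recombination: schoolbook modular
multiplication stands in for the FFT, a sequential loop stands in for
the parallel servers, and the function names and the small prime set
are ours, chosen for readability.

\begin{verbatim}
def polymul_mod(f, g, p):
    # product of coefficient lists f, g (low degree first) mod p;
    # a stand-in for the FFT multiplication done by each server
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % p
    return h

def crt(r1, m1, r2, m2):
    # combine x = r1 mod m1 and x = r2 mod m2 into x mod m1*m2;
    # pow(m1, -1, m2) is the modular inverse (Python 3.8+)
    t = ((r2 - r1) * pow(m1, -1, m2)) % m2
    return r1 + m1 * t, m1 * m2

def phi(a, m):
    # symmetric remainder phi_m(a) in (-m/2, m/2)
    a %= m
    return a - m if 2 * a > m else a

def product_by_crt(f, g, primes):
    # images[j] plays the role of the data computed by server j
    images = [polymul_mod(f, g, p) for p in primes]
    result = []
    for k in range(len(f) + len(g) - 1):
        r, m = images[0][k], primes[0]
        for img, p in zip(images[1:], primes[1:]):
            r, m = crt(r, m, img[k], p)
        result.append(phi(r, m))
    return result

# (x + 2)(3x + 4) = 3x^2 + 10x + 8
print(product_by_crt([2, 1], [4, 3], [13, 17, 19]))
\end{verbatim}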

Figure \ref{speedup}
shows the speedup factor achieved by the above distributed computation
on {\tt Risa/Asir}. For each $n$, two polynomials of degree $n$
with 3000-bit coefficients are generated and their product is computed.
The machine is a Fujitsu AP3000, a cluster of Sun workstations
connected by a high-speed network; OpenXM is implemented over MPI
on this network.
\begin{figure}[htbp]
\epsfxsize=8.5cm
\epsffile{speedup.ps}
\caption{Speedup factor}
\label{speedup}
\end{figure}

The task of the client is the generation and partition of $P$, the
sending and receiving of polynomials, and the synthesis of the result.
If the number of servers is $L$ and the inputs are fixed, then the time
to compute the $F_j$ in parallel is proportional to $1/L$, whereas the
time for sending and receiving the polynomials is proportional to $L$,
because broadcast and reduce operations are not available. Therefore
the speedup is limited, and its upper bound depends on the
communication cost and the degree of the inputs. Figure \ref{speedup}
shows that the speedup is satisfactory if the degree is large and the
number of servers is not large, say up to 10.
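
This trade-off can be made explicit by a simple model (the linear
communication cost is a modeling assumption, not a measured quantity):
if the sequential computation takes time $T_c$ and each additional
server adds a communication cost $c$, the total time is
\begin{displaymath}
T(L) = \frac{T_c}{L} + cL ,
\end{displaymath}
which is minimized at $L = \sqrt{T_c/c}$. The attainable speedup thus
grows only like $\sqrt{T_c/c}$; a larger degree increases $T_c$
relative to $c$ and pushes the saturation point to a larger number of
servers, consistent with Figure \ref{speedup}.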

\subsubsection{Order counting of an elliptic curve}

\subsubsection{Gr\"obner basis computation by various methods}

Singular \cite{Singular} implements the {\tt MP} interface for
distributed computation, and a competitive Gr\"obner basis computation
is illustrated as an example of distributed computation. However,
interruption has not been implemented yet, so the losing process has to
be killed explicitly. As stated in Section \ref{secsession}, OpenXM
provides such a function, and one can safely reset the server and
continue to use it. Furthermore, if a client provides synchronous I/O
multiplexing by {\tt select()}, then polling is not necessary. The
following {\tt Risa/Asir} function computes a Gr\"obner basis by
starting the computations simultaneously from the homogenized input and
the input itself. The client watches the streams by {\tt ox\_select()},
and the result which is returned first is taken. Then the remaining
server is reset.

\begin{verbatim}
/* G:set of polys; V:list of variables */
/* O:type of order; P0,P1: id's of servers */
def dgr(G,V,O,P0,P1)
{
  P = [P0,P1]; /* server list */
  map(ox_reset,P); /* reset servers */
  /* P0 executes non-homogenized computation */
  ox_cmo_rpc(P0,"dp_gr_main",G,V,0,1,O);
  /* P1 executes homogenized computation */
  ox_cmo_rpc(P1,"dp_gr_main",G,V,1,1,O);
  map(ox_push_cmd,P,262); /* 262 = OX_popCMO */
  F = ox_select(P); /* wait for data */
  /* F[0] is a server's id which is ready */
  R = ox_get(F[0]);
  if ( F[0] == P0 ) {
    Win = "nonhomo"; Lose = P1;
  } else {
    Win = "homo"; Lose = P0;
  }
  ox_reset(Lose); /* reset the loser */
  return [Win,R];
}
\end{verbatim}
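
Note that the client pushes {\tt OX\_popCMO} to both servers before
blocking in {\tt ox\_select()}, so each server sends its result as soon
as it is ready; and because the loser is reset rather than killed, both
server processes remain available for subsequent computations.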
