URL https://opencores.org/ocsvn/mod_sim_exp/mod_sim_exp/trunk

Subversion Repositories mod_sim_exp

[/] [mod_sim_exp/] [trunk/] [doc/] [src/] [operation.tex] - Blame information for rev 65

Go to most recent revision | Details | Compare with Previous | View Log


\chapter{Operation}
 
\section{Pipeline operation}
The operation of the pipeline is shown in Figure~\ref{fig:pipeline_op}. One can see that the stages are started
every 2 clock cycles ($\tau_{c}$ is the core clock period). This is needed because the least significant bit of the next
stage result is needed. Every stage has to run $n$ (the width of the operands) times for the multiplication to be complete.
 
\begin{figure}[H]
\centering
\includegraphics[trim=1.2cm 1.2cm 1.2cm 1.2cm, width=7cm]{pictures/pipeline_operation.pdf}
\caption{Pipeline operation: Each circle represents an active stage. The number indicates how much times that stage
                has run. Dotted line contours indicate the stage is inactive.}
\label{fig:pipeline_op}
\end{figure}
 
For performing one Montgomery multiplication using this core, the total computation time $T_m$ for an $n$-bit operand
with a $k$-stage pipeline is given by~(\ref{eq:Tmult}).
\begin{align}\label{eq:Tmult}
T_{m} = \left[k + 2(n - 1)\right] \tau_c
\end{align}
\newpage
\section{Modular Simultaneous exponentiation operations}
Exponentiations are calculated with Algorithm~\ref{alg:mme} which uses the Montgomery multiplier as the main computation
step. It uses the principle of a square-and-multiply algorithm to calculate an exponentiation with 2 bases.
\begin{algorithm}[H] % enter the algorithm environment
\caption{Montgomery simultaneous exponentiation} % give the algorithm a caption
\label{alg:mme} % and a label for \ref{} commands later in the document
\algnewcommand\algorithmicdownto{\textbf{downto}}
\algrenewtext{For}[3]%
{\algorithmicfor\ #1 $\gets$ #2 \algorithmicdownto\ #3 \algorithmicdo}
\algnewcommand\algorithmicswitch{\textbf{switch}}
\algrenewtext{While}[2]%
{\algorithmicswitch\ #1, #2}
\algnewcommand\algorithmicinput{\textbf{Input:}}
\algnewcommand\Input{\item[\algorithmicinput]}
\algnewcommand\algorithmicoutput{\textbf{Output:}}
\algnewcommand\Output{\item[\algorithmicoutput]}
\footnotesize
\begin{algorithmic}[1] % enter the algorithmic environment
\Input $g_{0},\:g_{1},\:e_{0}=(e_{0_{t-1}} \cdots e_{0_{0}})_{2},\:e_{1}=(e_{0_{t-1}} \cdots e_{0_{0}})_{2},\:R^{2}\bmod m,\:m$
\Output $g_{0}^{e_{0}} \cdot g_{1}^{e_{1}} \bmod m$
\State $\tilde{g}_{0} := \text{Mont}(g_{0}, R^{2}),\:\tilde{g}_{1} := \text{Mont}(g_{1}, R^{2}),\:\tilde{g}_{01} := \text{Mont}(\tilde{g}_{0}, \tilde{g}_{1})$
\State $a := \text{Mont}(R^{2}, 1)$
\Comment This is the same as $a := R \bmod m$.
\For{$i$}{$(t-1)$}{0}
\State $a := \text{Mont}(a, a)$
\While{$e_{1_{i}}$}{$e_{0_{i}}$} % use as switch statement
\State $0,\:1:\;a := \text{Mont}(a, \tilde{g}_{0})$
\State $1,\:0:\;a := \text{Mont}(a, \tilde{g}_{1})$
\State $1,\:1:\;a := \text{Mont}(a, \tilde{g}_{01})$
\EndWhile
\EndFor
\State $a := \text{Mont}(a, 1)$
\State \Return{$a$}
\end{algorithmic}
\end{algorithm}
It can be seen that the algorithm requires $R^{2}\bmod m$ which is $2^{2n}\bmod m$. We assume $R^2 \bmod m$ can be
provided or pre-computed. The for loop in the algorithm is executed by the control logic of the core. Apart from this,
a few pre- and one post-calculations have to be performed.
 
The computation time for an exponentiation depends on the number of zero's in the exponents, from
Algorithm~\ref{alg:mme} one can see that if both exponent bits are zero at a time, no multiplication has to be
performed. Thus reducing the total time. The average computation time for a modular simultaneous exponentiation, with
$n$-bit base operands and $t$-bit exponents is given by~(\ref{eq:Tsime}).
\begin{align}\label{eq:Tsime}
T_{se} = \frac{7}{4} t \cdot T_{m} = \frac{7}{4}t \cdot [k + 2(n - 1)] \tau_c
\end{align}
 
For single base exponentiations, i.e. 1 exponent is equal to zero, the average exponentiation time is given by~(\ref{eq:Texp}).
\begin{align}\label{eq:Texp}
T_{e} = \frac{3}{2} t \cdot T_{m} = \frac{3}{2}t \cdot [k + 2(n - 1)] \tau_c
\end{align}
 
The formulas~(\ref{eq:Tsime}) and~(\ref{eq:Texp}) given here are only the theoretical average time for an exponentiation,
excluding the pre- and post-computations.
 
 
\section{Core operation steps}
The core can operate in 2 modes, multiplication or exponentiation mode. The steps required to do one of these actions
are described here.
\subsection{Single Montgomery multiplication}
The following steps are needed for a single Montgomery multiplication:
\begin{enumerate}
        \item load the modulus to the RAM using the 32 bit bus
        \item load the desired $x$ and $y$ operands into any 2 locations of the operand RAM using the 32 bit bus.
        \item select the correct input operands for the multiplier using \verb|x_sel_single| and \verb|y_sel_single|
        \item select the result destination operand using \verb|result_dest_op|
        \item set \verb|exp/m| = `0' to select multiplication mode
        \item set \verb|p_sel| to choose which pipeline part you will use
        \item generate a start pulse for the core
        \item wait until interrupt is received and read out result in selected operand
\end{enumerate}
 
\textbf{Note:} this computation gives a result \( r = x \cdot y \cdot R^{-1} \bmod m\). If the actual product of $x$ and $y$ is
desired, a final Montgomery multiplication of the result with $R^{2}$ is needed.
 
\subsection{Modular simultaneous exponentiation}
The core requires $\tilde{g}_{0}$, $\tilde{g}_{0}$, $\tilde{g}_{01}$ and $a$ to be loaded into the correct operand
spaces before starting the exponentiation. These parameters are calculated using single Montgomery multiplications as follows:
\begin{align*}
        \tilde{g}_{0} &= Mont(g_{0}, R^{2}) &\,&= g_{0} \cdot R \bmod m & \hspace{3cm}\text{in operand 0}\hspace{4cm}\\
        \tilde{g}_{1} &= Mont(g_{1}, R^{2}) &\,&= g_{1} \cdot R \bmod m & \hspace{3cm}\text{in operand 1}\hspace{4cm}\\
        \tilde{g}_{01} &= Mont(\tilde{g}_{0}, \tilde{g}_{1}) &\,&= g_{0} \cdot g_{1} \cdot R \bmod m & \hspace{3cm}\text{in operand 2}\hspace{4cm}\\
        a &= Mont(R^{2}, 1) &\,&= R \bmod m &\hspace{3cm}\text{in operand 3}\hspace{4cm}
\end{align*}
When the exponentiation is done, a final multiplication has to be started by the software to multiply $a$ with 1.
The steps needed for a full simultaneous exponentiation are:
 
 
\begin{enumerate}
        \item load the modulus to the RAM using the 32 bit bus
        \item load the desired $g_0$, $g_1$ operands and \(R^{2} \bmod m\) into the operand RAM using the 32 bit bus.
        \item set \verb|p_sel| to choose which pipeline part you will use
        \item compute $\tilde{g}_{0}$ by using a single Montgomery multiplication of $g_{0}$ with $R^{2}$ and place the result $\tilde{g}_{0}$ in operand 0.
        \item compute $\tilde{g}_{1}$ by using a single Montgomery multiplication of $g_{1}$ with $R^{2}$ and place the result $\tilde{g}_{1}$ in operand 1.
        \item compute $\tilde{g}_{01}$ by using a single Montgomery multiplication of $\tilde{g}_{0}$ with $\tilde{g}_{1}$ and place the result $\tilde{g}_{01}$ in operand 2.
        \item compute $a$ by using a single Montgomery multiplication of $R^{2}$ with $1$ and place the result $a$ in operand 3.
        \item set the core in exponentiation mode ($exp/m$='1')
        \item generate a start pulse for the core
        \item wait until interrupt is received
        \item perform the post-computation using a single Montgomery multiplication of $a$(in operand 3) with 1 and read out result
\end{enumerate}

Line No.	Rev	Author	Line
1	47	JonasDC	`\chapter{Operation}`
2
3			`\section{Pipeline operation}`
4			`The operation of the pipeline is shown in Figure~\ref{fig:pipeline_op}. One can see that the stages are started`
5			`every 2 clock cycles ($\tau_{c}$ is the core clock period). This is needed because the least significant bit of the next`
6			`stage result is needed. Every stage has to run $n$ (the width of the operands) times for the multiplication to be complete.`
7
8			`\begin{figure}[H]`
9			`\centering`
10			`\includegraphics[trim=1.2cm 1.2cm 1.2cm 1.2cm, width=7cm]{pictures/pipeline_operation.pdf}`
11			`\caption{Pipeline operation: Each circle represents an active stage. The number indicates how much times that stage`
12			`has run. Dotted line contours indicate the stage is inactive.}`
13			`\label{fig:pipeline_op}`
14			`\end{figure}`
15
16			`For performing one Montgomery multiplication using this core, the total computation time $T_m$ for an $n$-bit operand`
17			`with a $k$-stage pipeline is given by~(\ref{eq:Tmult}).`
18			`\begin{align}\label{eq:Tmult}`
19			`T_{m} = \left[k + 2(n - 1)\right] \tau_c`
20			`\end{align}`
21			`\newpage`
22			`\section{Modular Simultaneous exponentiation operations}`
23			`Exponentiations are calculated with Algorithm~\ref{alg:mme} which uses the Montgomery multiplier as the main computation`
24			`step. It uses the principle of a square-and-multiply algorithm to calculate an exponentiation with 2 bases.`
25			`\begin{algorithm}[H] % enter the algorithm environment`
26			`\caption{Montgomery simultaneous exponentiation} % give the algorithm a caption`
27			`\label{alg:mme} % and a label for \ref{} commands later in the document`
28			`\algnewcommand\algorithmicdownto{\textbf{downto}}`
29			`\algrenewtext{For}[3]%`
30			`{\algorithmicfor\ #1 $\gets$ #2 \algorithmicdownto\ #3 \algorithmicdo}`
31			`\algnewcommand\algorithmicswitch{\textbf{switch}}`
32			`\algrenewtext{While}[2]%`
33			`{\algorithmicswitch\ #1, #2}`
34			`\algnewcommand\algorithmicinput{\textbf{Input:}}`
35			`\algnewcommand\Input{\item[\algorithmicinput]}`
36			`\algnewcommand\algorithmicoutput{\textbf{Output:}}`
37			`\algnewcommand\Output{\item[\algorithmicoutput]}`
38			`\footnotesize`
39			`\begin{algorithmic}[1] % enter the algorithmic environment`
40			`\Input $g_{0},\:g_{1},\:e_{0}=(e_{0_{t-1}} \cdots e_{0_{0}})_{2},\:e_{1}=(e_{0_{t-1}} \cdots e_{0_{0}})_{2},\:R^{2}\bmod m,\:m$`
41			`\Output $g_{0}^{e_{0}} \cdot g_{1}^{e_{1}} \bmod m$`
42			`\State $\tilde{g}_{0} := \text{Mont}(g_{0}, R^{2}),\:\tilde{g}_{1} := \text{Mont}(g_{1}, R^{2}),\:\tilde{g}_{01} := \text{Mont}(\tilde{g}_{0}, \tilde{g}_{1})$`
43			`\State $a := \text{Mont}(R^{2}, 1)$`
44			`\Comment This is the same as $a := R \bmod m$.`
45			`\For{$i$}{$(t-1)$}{0}`
46			`\State $a := \text{Mont}(a, a)$`
47			`\While{$e_{1_{i}}$}{$e_{0_{i}}$} % use as switch statement`
48			`\State $0,\:1:\;a := \text{Mont}(a, \tilde{g}_{0})$`
49			`\State $1,\:0:\;a := \text{Mont}(a, \tilde{g}_{1})$`
50			`\State $1,\:1:\;a := \text{Mont}(a, \tilde{g}_{01})$`
51			`\EndWhile`
52			`\EndFor`
53			`\State $a := \text{Mont}(a, 1)$`
54			`\State \Return{$a$}`
55			`\end{algorithmic}`
56			`\end{algorithm}`
57			`It can be seen that the algorithm requires $R^{2}\bmod m$ which is $2^{2n}\bmod m$. We assume $R^2 \bmod m$ can be`
58			`provided or pre-computed. The for loop in the algorithm is executed by the control logic of the core. Apart from this,`
59			`a few pre- and one post-calculations have to be performed.`
60
61			`The computation time for an exponentiation depends on the number of zero's in the exponents, from`
62			`Algorithm~\ref{alg:mme} one can see that if both exponent bits are zero at a time, no multiplication has to be`
63			`performed. Thus reducing the total time. The average computation time for a modular simultaneous exponentiation, with`
64			`$n$-bit base operands and $t$-bit exponents is given by~(\ref{eq:Tsime}).`
65			`\begin{align}\label{eq:Tsime}`
66			`T_{se} = \frac{7}{4} t \cdot T_{m} = \frac{7}{4}t \cdot [k + 2(n - 1)] \tau_c`
67			`\end{align}`
68
69			`For single base exponentiations, i.e. 1 exponent is equal to zero, the average exponentiation time is given by~(\ref{eq:Texp}).`
70			`\begin{align}\label{eq:Texp}`
71			`T_{e} = \frac{3}{2} t \cdot T_{m} = \frac{3}{2}t \cdot [k + 2(n - 1)] \tau_c`
72			`\end{align}`
73
74			`The formulas~(\ref{eq:Tsime}) and~(\ref{eq:Texp}) given here are only the theoretical average time for an exponentiation,`
75			`excluding the pre- and post-computations.`
76
77
78			`\section{Core operation steps}`
79			`The core can operate in 2 modes, multiplication or exponentiation mode. The steps required to do one of these actions`
80			`are described here.`
81			`\subsection{Single Montgomery multiplication}`
82			`The following steps are needed for a single Montgomery multiplication:`
83			`\begin{enumerate}`
84			`\item load the modulus to the RAM using the 32 bit bus`
85			`\item load the desired $x$ and $y$ operands into any 2 locations of the operand RAM using the 32 bit bus.`
86			`\item select the correct input operands for the multiplier using \verb\|x_sel_single\| and \verb\|y_sel_single\|`
87			`\item select the result destination operand using \verb\|result_dest_op\|`
88			\item set \verb\|exp/m\| = `0' to select multiplication mode
89			`\item set \verb\|p_sel\| to choose which pipeline part you will use`
90			`\item generate a start pulse for the core`
91			`\item wait until interrupt is received and read out result in selected operand`
92			`\end{enumerate}`
93
94			`\textbf{Note:} this computation gives a result \( r = x \cdot y \cdot R^{-1} \bmod m\). If the actual product of $x$ and $y$ is`
95			`desired, a final Montgomery multiplication of the result with $R^{2}$ is needed.`
96
97			`\subsection{Modular simultaneous exponentiation}`
98			`The core requires $\tilde{g}_{0}$, $\tilde{g}_{0}$, $\tilde{g}_{01}$ and $a$ to be loaded into the correct operand`
99			`spaces before starting the exponentiation. These parameters are calculated using single Montgomery multiplications as follows:`
100			`\begin{align*}`
101			`\tilde{g}_{0} &= Mont(g_{0}, R^{2}) &\,&= g_{0} \cdot R \bmod m & \hspace{3cm}\text{in operand 0}\hspace{4cm}\\`
102			`\tilde{g}_{1} &= Mont(g_{1}, R^{2}) &\,&= g_{1} \cdot R \bmod m & \hspace{3cm}\text{in operand 1}\hspace{4cm}\\`
103			`\tilde{g}_{01} &= Mont(\tilde{g}_{0}, \tilde{g}_{1}) &\,&= g_{0} \cdot g_{1} \cdot R \bmod m & \hspace{3cm}\text{in operand 2}\hspace{4cm}\\`
104			`a &= Mont(R^{2}, 1) &\,&= R \bmod m &\hspace{3cm}\text{in operand 3}\hspace{4cm}`
105			`\end{align*}`
106			`When the exponentiation is done, a final multiplication has to be started by the software to multiply $a$ with 1.`
107			`The steps needed for a full simultaneous exponentiation are:`
108
109
110			`\begin{enumerate}`
111			`\item load the modulus to the RAM using the 32 bit bus`
112			`\item load the desired $g_0$, $g_1$ operands and \(R^{2} \bmod m\) into the operand RAM using the 32 bit bus.`
113			`\item set \verb\|p_sel\| to choose which pipeline part you will use`
114			`\item compute $\tilde{g}_{0}$ by using a single Montgomery multiplication of $g_{0}$ with $R^{2}$ and place the result $\tilde{g}_{0}$ in operand 0.`
115			`\item compute $\tilde{g}_{1}$ by using a single Montgomery multiplication of $g_{1}$ with $R^{2}$ and place the result $\tilde{g}_{1}$ in operand 1.`
116			`\item compute $\tilde{g}_{01}$ by using a single Montgomery multiplication of $\tilde{g}_{0}$ with $\tilde{g}_{1}$ and place the result $\tilde{g}_{01}$ in operand 2.`
117			`\item compute $a$ by using a single Montgomery multiplication of $R^{2}$ with $1$ and place the result $a$ in operand 3.`
118			`\item set the core in exponentiation mode ($exp/m$='1')`
119			`\item generate a start pulse for the core`
120			`\item wait until interrupt is received`
121			`\item perform the post-computation using a single Montgomery multiplication of $a$(in operand 3) with 1 and read out result`
122			`\end{enumerate}`

Browse

Tools

Subversion Repositories mod_sim_exp

[/] [mod_sim_exp/] [trunk/] [doc/] [src/] [operation.tex] - Blame information for rev 65