URL https://opencores.org/ocsvn/mod_sim_exp/mod_sim_exp/trunk

# Subversion Repositoriesmod_sim_exp

## [/] [mod_sim_exp/] [trunk/] [doc/] [src/] [architecture.tex] - Diff between revs 87 and 103

Rev 87 Rev 103
Line 1... Line 1...
\chapter{Architecture}
\chapter{Architecture}

\section{Block diagram}
\section{Block diagram}
The architecture for the full IP core is shown in the Figure~\ref{blockdiagram}. It consists of 2 major parts, the actual
The architecture for the full IP core is shown in the Figure~\ref{blockdiagram}. It consists of 2 major parts, the actual
exponentiation core (\verb|mod_sim_exp_core| entity) with a bus interface wrapped around it. In the following sections these
exponentiation core (\verb|mod_sim_exp_core| entity) with a bus interface wrapped around it. In the following sections these
different blocks are described in detail.\\
different blocks are described in detail. The bus interface and the exponentiation core can run on different clock

frequencies, so they are independent of each other.\\
\begin{figure}[H]
\begin{figure}[H]
\centering
\centering
\includegraphics[trim=1.2cm 1.2cm 1.2cm 1.2cm, width=10cm]{pictures/block_diagram.pdf}
\includegraphics[trim=1.2cm 1.2cm 1.2cm 1.2cm, width=10cm]{pictures/block_diagram.pdf}
\caption{Block diagram of the Modular Simultaneous Exponentiation IP core}
\caption{Block diagram of the Modular Simultaneous Exponentiation IP core}
\label{blockdiagram}
\label{blockdiagram}
Line 28... Line 29...
\includegraphics[trim=1.2cm 1.2cm 1.2cm 1.2cm, width=10cm]{pictures/mod_sim_exp_core.pdf}
\includegraphics[trim=1.2cm 1.2cm 1.2cm 1.2cm, width=10cm]{pictures/mod_sim_exp_core.pdf}
\cprotect\caption{\verb|mod_sim_exp_core| structure}
\cprotect\caption{\verb|mod_sim_exp_core| structure}
\label{msec_structure}
\label{msec_structure}
\end{figure}
\end{figure}

The multiplier and control unit operate on the \verb|core_clk| clock frequency and the interface to the operand RAM and

exponent FIFO operates on the \verb|bus_clk| clock frequency. The transition between the 2 clock domains is mainly

implemented by the RAM and FIFO. For the remainder, the necessary control signals are synchronised to the

\verb|bus_clk|. Thus when using the \verb|mod_sim_exp_core|, one can thus assume that al ports are operating on the

\verb|bus_clk| clock signal.

\subsection{Multiplier}
\subsection{Multiplier}
The kernel of this design is a pipelined Montgomery multiplier. A Montgomery multiplication\cite{MontModMul} allows efficient implementation of a
The kernel of this design is a pipelined Montgomery multiplier. A Montgomery multiplication\cite{MontModMul} allows efficient implementation of a
modular multiplication without explicitly carrying out the classical modular reduction step. Right-shift operations ensure that the length of the (intermediate) results does not exceed $n+1$ bits. The result of a Montgomery multiplication is given by~(\ref{eq:mont}):
modular multiplication without explicitly carrying out the classical modular reduction step. Right-shift operations ensure that the length of the (intermediate) results does not exceed $n+1$ bits. The result of a Montgomery multiplication is given by~(\ref{eq:mont}):
\begin{align}\label{eq:mont}
\begin{align}\label{eq:mont}
r = x \cdot y \cdot R^{-1} \bmod m \hspace{1.5cm}\text{with } R = 2^{n}
r = x \cdot y \cdot R^{-1} \bmod m \hspace{1.5cm}\text{with } R = 2^{n}
Line 123... Line 130...
To store the exponents, there is a FIFO of 32 bit wide. Every 32 bit entry has to be formatted as 16 bit of $e_0$ for the
To store the exponents, there is a FIFO of 32 bit wide. Every 32 bit entry has to be formatted as 16 bit of $e_0$ for the
lower part [15:0] and 16 bit of $e_1$ for the higher part [31:16]. Entries have to be pushed in the FIFO starting with the least significant word and ending with the most significant word of the exponents.
lower part [15:0] and 16 bit of $e_1$ for the higher part [31:16]. Entries have to be pushed in the FIFO starting with the least significant word and ending with the most significant word of the exponents.

For the FIFO there are 2 styles available. The implementation style depends on the style of the operand memory and it can not be set directly. When  the RAM option \verb|"xil_prim"| is chosen, the resulting FIFO will use the FIFO18E1 primitive. It is able to store 512 entries, meaning 2 exponents of each 8192 bit long.
For the FIFO there are 2 styles available. The implementation style depends on the style of the operand memory and it can not be set directly. When  the RAM option \verb|"xil_prim"| is chosen, the resulting FIFO will use the FIFO18E1 primitive. It is able to store 512 entries, meaning 2 exponents of each 8192 bit long.

When the RAM options \verb|"generic"| or \verb|"asym"| are chosen, a generic FIFO will be implemented. This consist of a symmetric RAM with the control logic for a FIFO. The depth of this generic FIFO is adjustable with the parameter \verb|C_FIFO_DEPTH|.
When the RAM options \verb|"generic"| or \verb|"asym"| are chosen, a generic FIFO \footnote{This FIFO is a slightly
The number of RAM blocks for the FIFO is given by (\ref{eq:fifoblocks}), where \verb|RAMBLOCK_SIZE| is the size [bits] of the FPGA's RAM primitive.
modified version of the generic FIFOs project at OpenCores.org (http://opencores.org/project,generic\_fifos).} will be

implemented.

This consist of a dual port symmetric RAM with the control logic for a FIFO. The depth of this generic FIFO is adjustable with the parameter \verb|C_FIFO_AW|. The number of RAM blocks for the FIFO is given by (\ref{eq:fifoblocks}), where

\verb|RAMBLOCK_SIZE| is the size [bits] of the FPGA's RAM primitive.
\begin{align}
\begin{align}
\left[\left(\mathtt{C\_FIFO\_DEPTH}+1\right) \cdot 32 \right]/ \mathtt{RAMBLOCK\_SIZE} \label{eq:fifoblocks}
\left[\left(\mathtt{2^{C\_FIFO\_AW}}+1\right) \cdot 32 \right]/ \mathtt{RAMBLOCK\_SIZE} \label{eq:fifoblocks}
\end{align}
\end{align}

\subsection{Control unit}
\subsection{Control unit}
The control unit loads in the operands and has full control over the multiplier. For single multiplications, it latches in
The control unit loads in the operands and has full control over the multiplier. For single multiplications, it latches in
the $x$ operand, then places the $y$ operand on the bus and starts the multiplier. In case of an exponentiation, the FIFO is
the $x$ operand, then places the $y$ operand on the bus and starts the multiplier. In case of an exponentiation, the FIFO is
Line 145... Line 155...
\begin{tabular}{|l|c|c|p{8cm}|}
\begin{tabular}{|l|c|c|p{8cm}|}
\hline
\hline
\rowcolor{Gray}
\rowcolor{Gray}
\textbf{Port} & \textbf{Width} & \textbf{Direction} & \textbf{Description} \bigstrut\\
\textbf{Port} & \textbf{Width} & \textbf{Direction} & \textbf{Description} \bigstrut\\
\hline
\hline
\verb|clk|   & 1     & in    & core clock input \bigstrut\\
\verb|core_clk|   & 1     & in    & core clock input, clock signal for the multiplier and control unit \bigstrut\\

\hline

\verb|bus_clk|   & 1     & in    & bus clock input, clock signal for all core IO \bigstrut\\
\hline
\hline
\verb|reset| & 1     & in    & reset signal (active high) resets the pipeline, fifo and control logic \bigstrut\\
\verb|reset| & 1     & in    & reset signal (active high) resets the pipeline, fifo and control logic \bigstrut\\
\hline
\hline
\multicolumn{4}{|l|}{\textbf{\textit{operand memory interface}}} \bigstrut\\
\multicolumn{4}{|l|}{\textbf{\textit{operand memory interface}}} \bigstrut\\
\hline
\hline
Line 211... Line 223...
\hline
\hline
\verb|C_NR_STAGES_LOW| & number of lower stages in the pipeline, defines the bit-width of the lower pipeline part & integer & 32 \bigstrut\\
\verb|C_NR_STAGES_LOW| & number of lower stages in the pipeline, defines the bit-width of the lower pipeline part & integer & 32 \bigstrut\\
\hline
\hline
\verb|C_SPLIT_PIPELINE| & option to split the pipeline in 2 parts & boolean & true \bigstrut\\
\verb|C_SPLIT_PIPELINE| & option to split the pipeline in 2 parts & boolean & true \bigstrut\\
\hline
\hline
\verb|C_FIFO_DEPTH| & depth of the generic FIFO, only applicable if \verb|C_MEM_STYLE| = \verb|"generic"| or \verb|"asym"|  & integer & 32 \bigstrut\\
\verb|C_FIFO_AW| & address width of the generic FIFO pointers, FIFO size is equal to $2^{C\_FIFO\_AW}$. & integer & 7 \bigstrut\\

& only applicable if \verb|C_MEM_STYLE| = \verb|"generic"| or \verb|"asym"|  & & \\
\hline
\hline
\verb|C_MEM_STYLE| & select the RAM memory style (3 options): & string & \verb|"generic"| \bigstrut\\
\verb|C_MEM_STYLE| & select the RAM memory style (3 options): & string & \verb|"generic"| \bigstrut\\
& \verb|"generic"| : use general 32-bit RAMs & & \\
& \verb|"generic"| : use general 32-bit RAMs & & \\
& \verb|"asym"| : use asymmetric RAMs & & \\
& \verb|"asym"| : use asymmetric RAMs & & \\