URL
https://opencores.org/ocsvn/virtex7_pcie_dma/virtex7_pcie_dma/trunk
Rev 30 → Rev 31
/documentation/example_application/introduction.tex
0,0 → 1,56
% !TeX spellcheck = en_US |
\section{Introduction} |
|
\subsection {Wupper package} |
|
Approaching a development package bottom up, the Wupper core\footnote{A wupper is a person performing the act of bongelwuppen, the version from the Dutch province of Groningen of the Frisian sport Fierljeppen (canal pole vaulting). \href{https://www.youtube.com/watch?v=Bre8DsQZqSs}{https://www.youtube.com/watch?v=Bre8DsQZqSs}} is a module of the FELIX firmware and provides an interface for Direct Memory Access (DMA) in the Xilinx Virtex-7 FPGA hosted on the VC-709. This FPGA has a PCIe Gen3 hard block integrated in the silicon~\cite{pg023}. The PCIe Gen3 standard specifies a theoretical line rate of 8 GT/s per lane; by using 8 lanes, it is therefore possible to reach a theoretical throughput of 64 Gb/s.
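
As a rough check of these numbers, and assuming the 128b/130b line encoding used by PCIe Gen3 (an assumption, it is not stated in the text above), the raw and effective rates for 8 lanes work out as follows:
\begin{align*}
R_\mathrm{raw} &= 8~\mathrm{lanes} \times 8~\mathrm{GT/s} = 64~\mathrm{GT/s},\\
R_\mathrm{eff} &= 64 \times \tfrac{128}{130}~\mathrm{Gb/s} \approx 63.0~\mathrm{Gb/s} \approx 7.9~\mathrm{GB/s}.
\end{align*}
Packet headers and flow control reduce the usable payload rate further, so the 64 Gb/s figure is best read as an upper bound.
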
The main purpose of Wupper is to handle data transfers from a simple user interface, i.e.\ a FIFO, to and from the host PC memory. The other functionality supported by Wupper is access to control and monitor registers inside the FPGA and the surrounding electronics via a simple register map. Figure~\ref{fig:simplewupperpackage1} below shows a block diagram of the Wupper package.
|
\begin{figure}[h] |
\centering |
\includegraphics[width = .8 \textwidth]{figures/wupper_package_simple.pdf} |
\caption{Wupper package overview} |
\label{fig:simplewupperpackage1} |
\end{figure} |
The Wupper core communicates with the host PC via the Wupper driver and is controlled by a set of so-called Wupper tools. Through an Application Programming Interface (API), the Wupper driver can also communicate with a Wupper Graphical User Interface (GUI). Wupper has been published under the LGPL license on Opencores.org~\cite{opencores}, as the developers firmly believe in the dissemination of knowledge through Open Source. Hence users can freely download, use and learn from the core and possibly provide feedback to further improve Wupper. The outcome of the development is the so-called Wupper package: a suite of firmware and software components, whose details will be given later in this report. One missing feature of the Wupper core published on OpenCores was a simple yet complete example application to study, test, and benchmark Wupper. To avoid confusion concerning names, the following list gives a name and description for each part of the Wupper project:
|
\begin{itemize} |
\item Wupper core: firmware PCIe engine |
\item Wupper driver: software device driver |
\item Wupper tools: software tools to operate the core |
\item Wupper GUI: a simple control and monitor panel |
\item Wupper package: the sum of the above, packed for distribution on OpenCores.
\end{itemize} |
|
|
\newpage |
\section{Internship} |
|
\subsection {Goal} |
Given the background provided in the previous chapter, my contribution to this project is to develop an example application that checks the health of the core in both directions. The application also checks whether the data that is written into the PC memory is valid. The development comprises software (Wupper tools) and an HDL example application. In addition, a GUI will be developed for the application. Besides those main activities, the device driver and tools developed for Wupper as used in the FELIX application have to be ported and tested for the Wupper version published on OpenCores. Appendix B shows the global schedule of the activities I carried out during the development of the application.
|
\subsection {Topics} |
As introduced in the previous paragraph, the aim of this internship is to develop a test application for Wupper. Its purpose is to benchmark the robustness and performance of the Wupper core. |
To reach this goal, the structure of the Wupper package first needs to be understood. This requires grasping how to transfer data using the Wupper core and what is needed for controlling the FPGA using software. Each specific sub-task of the work carried out for this development is detailed in the following sub-paragraphs.
|
|
\subsection {Drivers and tools} |
|
%\begin{wrapfigure}[15]{r}{0.4\textwidth} |
% \centering |
% \vspace{-4mm} |
% \includegraphics[width = 0.5 \textwidth]{figures/FELIX_PC.png} |
% \vspace{-3mm} |
% \caption{Overview of FELIX PC} |
% \label{fig:felixpc} |
%\end{wrapfigure} |
|
The drivers and tools are the low-level software parts which control the logic of the Wupper core. A set of device drivers is used to: (i) initialize the FPGA PCIe card and control DMA transfers, (ii) perform I/O operations on registers inside the FPGA, (iii) allocate memory buffers in the host PC to be used as landing areas for data transfers.
The Wupper-tools, a collection of tools written in C and C++, are used to control the logic through the drivers. The Wupper-tools are intended to be a subset of the tools developed for Wupper in the framework of the FELIX project, chosen to be meaningful for OpenCores users. The key to implementing the Wupper tools is to understand how the original tools work and which parts can be reused.
|
|
\subsection {VHDL example application code} |
The purpose of the VHDL example application is to show the essentials of the DMA transfer function of Wupper. Prior to the development described in this report, there was only a simple 32-bit counter used to test the data flow in one direction, i.e.\ from the FPGA to the PC. Understanding the Wupper core will lead to a renewed version which should transfer 256-bit data at high speed in both the up and down directions.
|
\subsection {Developing a GUI} |
Prior to this development, the FPGA card was operated via a terminal. The commands have to be issued in a certain order to get it working, which can be very complicated for users. The solution is to design a Graphical User Interface (GUI) which can be run on Linux systems.
/documentation/example_application/wupper.tex
0,0 → 1,349
% !TeX spellcheck = en_US |
\newpage |
\section{Wupper package} |
|
In this section, the firmware, drivers, and tools of Wupper (see Figure~\ref{fig:simplewupperpackage}) together with its working principle are explained. For more detailed information about the internals and the core please refer to the official Wupper documentation~\cite{wupperoffical}. |
|
\begin{figure}[h] |
\centering |
\includegraphics[width = .8 \textwidth]{figures/wupper_package_simple.pdf} |
\caption{Wupper package overview} |
\label{fig:simplewupperpackage} |
\end{figure} |
|
\subsection {Wupper core} |
|
%\begin{wrapfigure}[15]{r}{0.7\textwidth} |
% \centering |
% \vspace{-4mm} |
% \includegraphics[width = 0.5 \textwidth]{figures/full_application_structure.pdf} |
% \vspace{-3mm} |
% \caption{Overview of the HDL modules in the Wupper package} |
% \label{fig:wupperpackage} |
%\end{wrapfigure} |
|
A DMA engine, like Wupper, moves data to and from memory without CPU intervention. This efficient method is used for handling large amounts of data, which is crucial for throughput-intensive applications. During a DMA transfer, the DMA control core takes control according to the information provided by a DMA descriptor, and flags completion of operations in a per-descriptor status register. When user data is written into the FIFOs, the core starts the DMA transfer over the PCIe lanes. Figure~\ref{fig:wupperpackage} shows a complete diagram of the HDL modules of the Wupper package, including the HDL modules for the Wupper core and the example application, together with the host PC memory.
|
\begin{figure}[h] |
\centering |
\includegraphics[width = 0.8 \textwidth]{figures/full_application_structure.pdf} |
\caption{Overview of the HDL modules in the Wupper package} |
\label{fig:wupperpackage} |
\end{figure} |
|
% http://www.techonline.com/electrical-engineers/education-training/tech-papers/4370633/Introduction-to-DMA/viewpdf |
% http://www.techonline.com/electrical-engineers/education-training/tech-papers/4136585/Universal-DMA-Controller-One-stop-solution-for-increasing-throughput-and-decreasing-latency-by-bypassing-CPU/viewpdf |
|
\subsubsection {Xilinx PCIe End Point} |
|
|
The Virtex-7 XC7VX690T-2FFG1761C on the VC-709 board has an integrated endpoint for PCI Express Gen3~\cite{pg023}. This black box handles the traffic over the PCI Express bus. Inside the Wupper core, a DMA read/write process sends and receives AXI4 commands over the AXI4-Stream bus. The black box translates these into differential electrical signals. Figure~\ref{fig:pciexpressendpoint} shows a simplified model of the firmware stack. Configuration of the core is explained in section 3.2 of the official documentation of Wupper~\cite{configpciecore}.
|
\begin{figure}[h] |
\centering |
\includegraphics[width = 1 \textwidth]{figures/PG023.pdf} |
\caption{Block diagram of the logic in the VC-709 FPGA} |
\label{fig:pciexpressendpoint} |
\end{figure} |
|
\subsubsection {Core control} |
The DMA control process (DMA\_control in Figure~\ref{fig:wupperpackage}) consists of a register map which can be configured from a PC using the Wupper tools. The register map is divided into three regions: BAR0, BAR1 and BAR2. BAR stands for Base Address Register, the PCIe mechanism that maps each region into the address space of the host. Every BAR has 1 MB of address space.
|
BAR0 contains registers associated with DMA like the DMA descriptors. The descriptors specify the addresses, transfer direction, size of the data and an enable line. Figure~\ref{fig:wupperpackage} shows that the information is fed to the DMA\_read\_write core. |
|
BAR1 is reserved for the interrupt mechanism and consists of 8 vectors. |
|
BAR2 is used for the benchmark application and is dedicated to user applications. The work done for this report defines and acts on registers in BAR2, as summarized in Appendix~\ref{sec:bar2}.
|
|
|
As previously shown in Figure~\ref{fig:wupperpackage}, the example application core consists of multiple function blocks which are attached to the register map. This makes it possible to control the benchmark application from the PC. |
A complete overview of the register map can be found in the official documentation of Wupper~\cite{wupperoffical}.
|
|
|
|
%The content related to the benchmark application will be discussed in the section |
|
|
%The DMA control process consist of two Read / Write subprocesses: descriptors and status. The descriptors are parsed in the Read / Write descriptors process |
\newpage |
\subsubsection {DMA read/write} |
The DMA read and write module (DMA\_read\_write in Figure~\ref{fig:wupperpackage}) handles the transfers to and from the FIFOs according to the direction specified by the descriptors. When data shifts into the down FIFO, a non-empty flag is asserted to start the DMA write process; this direction of the flow is defined as the ``down link''. The process reads the descriptors and creates a header with the information. The header is added when the data shifts out of the down FIFO. In the reverse situation, data with a header is read from the PC memory; this direction of the flow is defined as the ``up link''. The information in the header is parsed by the DMA control and the data is fed to the up FIFO.
|
|
|
|
\subsection{Example application HDL modules} |
The example application, the user application inside the FPGA, replaces the counter with a pseudo-random data generator. In addition, the application can now process data from the PC memory. The example application can be operated in two modes:
|
\begin{enumerate} |
\item The random data generator sends data directly to the host via Wupper; this is referred to as the ``write only'' or ``half loop'' test.
\item The content of the random data generator is written back to the FPGA, multiplied and sent to the host again; this is referred to as the ``read and write'' or ``full loop'' test.
|
\end{enumerate} |
|
The example application is developed in VHDL, and the code is synthesized and implemented in Xilinx Vivado 2014.4~\cite{vivadoman}. The example application is now part of the Wupper package on OpenCores. |
|
\subsubsection {Functional blocks} |
|
Figure~\ref{fig:benchmarkapp} shows a detailed block diagram of the example application for Wupper. The Wupper core contains a list of addresses; this list is the register map. The values of the register map are implemented in the firmware as signals, which the PC sees as addresses. The Wupper tools write values to these addresses to control the FPGA logic (see dashed lines in Figure~\ref{fig:benchmarkapp}).
%http://www.eetimes.com/document.asp?doc_id=1274550 |
|
\begin{figure}[h] |
\centering |
\includegraphics[width = 0.8 \textwidth]{figures/benchmark_application.pdf} |
\caption{Overview of the example application} |
\label{fig:benchmarkapp} |
\end{figure} |
|
\newpage |
As introduced in the previous paragraph, one type of test possible with the example application is the ``half loop'': in this mode of operation, Wupper is fed by a random data generator based on a 256-bit Linear Feedback Shift Register (LFSR). An LFSR, as shown in Figure~\ref{fig:lfsr}, consists of a number of shift registers whose output is fed back to the input. The feedback is combined by an XOR operation, which creates a pseudo-random pattern. The ideal goal is to produce a sequence of infinite length to prevent repetition. How soon the sequence repeats depends on two factors: the feedback points (taps) and the start value (an all-zero start value, for example, never changes). The maximum sequence length is $2^n-1$~\cite{lfsr}, where $n$ is the number of shift registers. The 256-bit LFSR is a four-tap Galois LFSR with taps at registers 256, 254, 251 and 246. The approach is explained in the paper~\cite{lfsrtable} by R. W. Ward and T.C.A. Molteno of the electronics group at the University of Otago.
The software tools developed for the example application initialize the seed value by writing it to the register map; thereafter the 1-bit $LFSR\_LOAD\_SEED$ signal is set to 1. This resets the LFSR process with the seed value.
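
The working principle can be illustrated with a short software model. The listing below is a minimal sketch of a right-shifting Galois LFSR in Python, not the VHDL implementation of the example application: the shift direction, the bit ordering and all function names are assumptions made for illustration. It confirms the $2^n-1$ period on a 4-bit register and then builds the feedback mask for the 256-bit tap positions quoted above.

\begin{lstlisting}[language=Python, frame=single, caption=Software sketch of a Galois LFSR (illustrative only)]
def galois_mask(taps):
    """Build the XOR mask of a right-shifting Galois LFSR.

    Tap t (1-based, as in the tap tables) maps to bit t - 1.
    """
    mask = 0
    for t in taps:
        mask |= 1 << (t - 1)
    return mask


def lfsr_step(state, mask):
    """Advance the register by one shift and return the new state."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= mask
    return state


def period(seed, taps):
    """Count shifts until the state returns to the seed (small n only)."""
    mask = galois_mask(taps)
    state = lfsr_step(seed, mask)
    count = 1
    while state != seed:
        state = lfsr_step(state, mask)
        count += 1
    return count


# 4-bit register with taps (4, 3): visits all 2^4 - 1 non-zero states.
assert period(0b0001, (4, 3)) == 15

# 256-bit register with the tap positions quoted in the text.
MASK_256 = galois_mask((256, 254, 251, 246))
state = 0xDEADBEEFABCD0123  # arbitrary non-zero seed, for illustration
for _ in range(1000):
    state = lfsr_step(state, MASK_256)
assert 0 < state < 1 << 256
\end{lstlisting}

A non-zero seed never reaches the all-zero state in this model, because each step can be reversed; this is why the seed loaded through the register map should be non-zero.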
|
\begin{figure}[h] |
\centering |
\includegraphics[width = 0.8 \textwidth]{figures/lfsr.pdf} |
\caption{A 4-bit Linear Feedback Shift Register (LFSR)}
\label{fig:lfsr} |
\end{figure} |
%http://www.newwaveinstruments.com/resources/articles/m_sequence_linear_feedback_shift_register_lfsr.htm |
\newpage |
For multiplication, the Xilinx multiplier IP block is used. The operations are based on the DSP48E1~\cite{DSP84E1} for the Virtex-7 series. Two parallel multipliers are used, each with two unsigned 64-bit inputs. To make the multipliers perform optimally at high clock rates, an 18-stage pipeline is used.

For monitoring the core temperature, an XADC IP block~\cite{xadc} is used. This block is generated by Vivado's XADC wizard. The output signal of the block is connected to one register of the register map.
|
The 1-bit signal $APP\_MUX$ is attached to the select port of the application multiplexer. This enables the data flow to the down FIFO. |
|
The signal $APP\_ENABLE$ enables the output of the LFSR and the multiplier. This 2-bit signal has three states:
\begin{itemize} |
\item ``00'': No data flow, application is on standby.
\item ``01'': Sets the example application enable `high', causing data to flow only from the LFSR.
\item ``10'': Sets the example application enable `high', causing data to flow only from the multiplier.
\end{itemize} |
The FIFOs are generated by Vivado's FIFO generator and use integrated common-clock block RAMs. The clock is set to 250 MHz to reach the maximum theoretical throughput. The up FIFO is deeper so that it can function as a buffer. This is an extra precaution: if the data is looped back in the application, both FIFOs can be full at the same time, and if this occurs the application stalls because of the loop back.
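
The choice of 250 MHz can be checked with a one-line calculation, assuming the FIFO data path is 256 bits wide (an assumption based on the 256-bit data words of the example application):
\begin{equation*}
256~\mathrm{bit} \times 250~\mathrm{MHz} = 64~\mathrm{Gb/s},
\end{equation*}
which matches the theoretical PCIe Gen3 rate of $8~\mathrm{lanes} \times 8~\mathrm{GT/s}$ quoted in the introduction.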
|
\newpage |
|
|
\subsection{Device driver and Wupper tools} |
|
|
The Wupper tools communicate with the Wupper core through the Wupper device driver. Buffers in the host PC memory are used for bidirectional data transfers; they are allocated by a part of the driver called CMEM, which reserves a chunk of contiguous memory in the host. For the specific case of the example application, the allocated memory is logically subdivided into two buffers (buffer 1 and buffer 2 in Figure~\ref{fig:wupperpackage}). One buffer stores data coming from the FPGA (write buffer, buffer 1), the other stores data going to the FPGA (read buffer, buffer 2). The idea behind the logical split is that data can be copied from the write buffer to the read buffer and checked. The driver is developed for Scientific Linux CERN 6 but has also been tested and used under Ubuntu (kernel version 3.13.0-44a). Building and loading/unloading the driver is explained in sections 6.1.2 and 6.1.3 of the official documentation of Wupper~\cite{operatingpcieengine}.
|
The Wupper tools are a collection of tools which can be used to debug and control the Wupper core. These tools are command line programs and can only run if the device driver is loaded. A detailed list and explanation of each tool is given in the next paragraphs. Along with the collection of tools derived from the FELIX tool suite, Wupper-dma-transfer and Wupper-chaintest have been added as new features for the OpenCores benchmark. As mentioned before, the purpose of those applications is to check the health of the Wupper core.

The Wupper tools collection comes with a readme~\cite{wupperreadme}, which explains how to compile and run the tools. Most of the tools have an -h option to provide helpful information. The table below lists the tools derived from the original flxtools suite with their descriptions.
|
\begin{center} |
\begin{tabular}{ | l || p{10cm} |} |
\hline |
Tool & Description \\ \hline |
|
Wupper-info
& Prints information about the device, for instance the device ID, the PLL lock status of the internal clock and the firmware version.
\\ \hline
Wupper-reset
& Resets parts of the example application core. These functions are also implemented in the Wupper-dma-transfer tool.
\\ \hline
Wupper-config
& Shows the PCIe configuration registers and allows the configuration to be set, stored and loaded. An example is configuring the LEDs on the VC-709 board by writing a hexadecimal value to the register.
\\ \hline
Wupper-irq-test
& Tests the interrupt routines.
\\ \hline
Wupper-dma-test
& Transfers 1024 bytes of data every second and dumps it to the screen.
\\ \hline
Wupper-throughput
& Measures the throughput of the Wupper core. The method used to compute the throughput is incorrect; this is discussed in Section 3.4.2.
\\ \hline
Wupper-dump-blocks
& Dumps a block of 1 KB. By default this is repeated 100 times; the number of iterations can be changed by adding a number after ``-n''.
\\ \hline
|
\end{tabular} |
\end{center} |
|
For the Wupper package on OpenCores, two extra tools have been developed to target the specific benchmark requirements of the generic example application: Wupper-dma-transfer and Wupper-chaintest. The next paragraphs give a detailed description of these tools and their operation.
|
\newpage |
|
\subsubsection{Operating Wupper-dma-transfer} |
|
Wupper-dma-transfer sends data to the target PC via Wupper; this is also known as the half loop test. The tool operates the benchmark application and has multiple options, which are summarized in Listing~\ref{lst:dmatoollist}.
|
\begin{lstlisting}[language=BASH, frame=single, label={lst:dmatoollist}, caption=Output of Wupper-dma-transfer -h] |
daqmustud@gimone:$ ./wupper-dma-transfer -h |
|
Usage: wupper-dma-transfer [OPTIONS] |
|
|
This application has a sequence: |
1 -Start with dma reset(-d) |
2 -Flush the FIFO's(-f) |
3 -Then reset the application (-r) |
|
|
Options: |
-l Load pre-programmed seed. |
-q Load and generate an unique seed. |
-g Generate data from PCIe to PC. |
-b Generate data from PC to PCIe. |
-s Show application register. |
-r Reset the application. |
-f Flush the FIFO's. |
-d Disable and reset the DMA controller. |
-h Display help. |
|
|
\end{lstlisting} |
|
|
Before using the write function, make sure that the application is ready by resetting all the values, as shown in Listing~\ref{lst:dmatoolreset}. |
|
\begin{lstlisting}[language=BASH, frame=single, label={lst:dmatoolreset}, caption=Reset Wupper before a DMA Write action] |
daqmustud@gimone:$ ./wupper-dma-transfer -d |
Resetting the DMA controller...DONE! |
daqmustud@gimone:$ ./wupper-dma-transfer -f |
Flushing the FIFO's...DONE! |
daqmustud@gimone:$ ./wupper-dma-transfer -r |
resetting application...DONE! |
\end{lstlisting} |
|
\newpage |
|
\noindent |
Before writing data into the PC, the data generator needs a seed for initialization. There are two options available: load a unique seed or load a pre-programmed seed. The pre-programmed seed is always 256 bits, whereas the unique seed value can vary. The -s option displays the status of the registers, including the seed value. For a unique seed, replace the -l with -q in the commands shown in Listing~\ref{lst:dmatoolseed}.
|
\begin{lstlisting}[language=BASH, frame=single, label={lst:dmatoolseed}, caption=Loading a pre-programmed seed into the data generator.]
daqmustud@gimone:$ ./wupper-dma-transfer -l |
Writing seed to application register...DONE! |
daqmustud@gimone:$ ./wupper-dma-transfer -s |
|
Status application registers |
---------------------------- |
LFSR_SEED_0A: DEADBEEFABCD0123 |
LFSR_SEED_0B: 87613472FEDCABCD |
LFSR_SEED_1A: DEADFACEABCD0123 |
LFSR_SEED_1B: 12313472FEDCFFFF |
APP_MUX: 0 |
LFSR_LOAD_SEED: 0 |
\end{lstlisting} |
|
|
The -g option performs a DMA write to the PC memory: the data generator starts to fill the down FIFO and, from the PC side, a DMA read action is performed. The size of the transfer is set to 1 MB by default, but it is configurable. When the PC has received 1 MB of data, the transfer stops. Some data may still be left in the down FIFO; the FIFOs can be reset with the -f option. Listing~\ref{lst:dmatoolwrite} shows the output of the -g option.
|
\begin{lstlisting}[language=BASH, frame=single, label={lst:dmatoolwrite}, caption=Start generating data to the target.] |
daqmustud@gimone:$ ./wupper-dma-transfer -g |
Starting DMA write |
done DMA write |
Buffer 1 addresses: |
0: EED9733362A50D71 |
... |
... |
... |
\end{lstlisting} |
|
\newpage |
|
In a similar way a DMA read action from the FPGA can be performed by using the -b option. The output of the up FIFO is fed to a multiplier. The output of the multiplier is fed to the down FIFO with a destination to the PC memory as shown in Listing~\ref{lst:dmatoolback}. |
|
\begin{lstlisting}[language=BASH, frame=single, label={lst:dmatoolback}, caption= Performing a DMA read and DMA write] |
daqmustud@gimone:$ ./wupper-dma-transfer -b |
Reading data from buffer 1... |
DONE! |
Buffer 2 addresses: |
0: 24BBEC63B53F3BCC |
... |
... |
... |
\end{lstlisting} |
|
\subsubsection{Operating Wupper-chaintest} |
The Wupper-chaintest tool performs a complete DMA read and write transfer in one shot and checks whether the multiplication was done correctly. To do so, it multiplies the data in buffer 2 and compares the result with the output of the multiplier in buffer 1 (shown earlier in Figure~\ref{fig:wupperpackage}). The tool returns the number of errors out of 65536 loops, as shown in Listing~\ref{lst:chaintest}.
\begin{lstlisting}[language=BASH, frame=single, label={lst:chaintest}, caption=Output of Wupper-chaintest] |
daqmustud@gimone:$ ./wupper-chaintest |
Reading data from buffer 1... |
DONE! |
Buffer 2 addresses: |
0: 49A5A89745420D34 |
... |
... |
... |
9: 5D37679AE79FA7C2 |
0 errors out of 65536 |
\end{lstlisting} |
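
The comparison step can be sketched in a few lines of Python. This is not the source of Wupper-chaintest, which is written in C/C++; it only illustrates the idea of recomputing each product in software and counting mismatches. The buffer layout (two little-endian unsigned 64-bit operands per 16 bytes, one 128-bit product per 16 bytes) and all function names are assumptions made for this example, and the FPGA multiplier is replaced by a software stand-in.

\begin{lstlisting}[language=Python, frame=single, caption=Sketch of the chain test comparison (illustrative only)]
WORD_BYTES = 16  # assumed: two 64-bit operands in, one 128-bit product out


def expected_product(chunk):
    """Multiply the two unsigned 64-bit operands packed in 16 bytes."""
    a = int.from_bytes(chunk[:8], "little")
    b = int.from_bytes(chunk[8:], "little")
    return a * b


def count_errors(operands, products):
    """Compare every multiplier output with a software multiplication.

    Returns (number of mismatches, number of comparisons).
    """
    if len(operands) != len(products) or len(operands) % WORD_BYTES:
        raise ValueError("buffers must be equal in size, multiple of 16")
    errors = 0
    for i in range(0, len(operands), WORD_BYTES):
        got = int.from_bytes(products[i:i + WORD_BYTES], "little")
        if got != expected_product(operands[i:i + WORD_BYTES]):
            errors += 1
    return errors, len(operands) // WORD_BYTES


def software_multiplier(operands):
    """Stand-in for the FPGA multiplier, only used to exercise the check."""
    out = bytearray()
    for i in range(0, len(operands), WORD_BYTES):
        product = expected_product(operands[i:i + WORD_BYTES])
        out += product.to_bytes(WORD_BYTES, "little")
    return bytes(out)


# 1 MB of operands gives 65536 comparisons of 16 bytes each.
operands = bytes(range(256)) * 4096
products = software_multiplier(operands)
errors, loops = count_errors(operands, products)
print(f"{errors} errors out of {loops}")
\end{lstlisting}

With 1 MB buffers and 16 bytes per comparison the loop count is 65536, which is consistent with the count reported by the tool, although the actual word layout used by Wupper-chaintest may differ.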
|
\newpage |
|
|
\subsection {Wupper GUI} |
|
The concept of the Wupper GUI is based on the Wupper tools and has the same structure (see Figure~\ref{fig:softwaretree}). The GUI is developed with Qt version 5.5 (C++ based)~\cite{qt} and gives the user visual feedback on the status and health of Wupper. The GUI can only run if the device driver is loaded.
|
\begin{figure}[h] |
\centering |
\includegraphics[width = 0.7 \textwidth]{figures/tree.pdf} |
\caption{High and low level software overview block diagram.} |
\label{fig:softwaretree} |
\end{figure} |
|
\subsubsection {Functional blocks and threaded programming} |
|
|
Multi-threading is used so that the functional blocks can run at the same time as the GUI; without it, the GUI freezes while a block is running. A thread runs alongside the main thread of the process, and if another processor core is available it can run on a separate core. Communicating with the main thread via slots keeps the data safe.
Although there are two threads, only one of them can be used at a time: both threads use the same DMA ID, so running them simultaneously would cause an error.
The threads communicate with the Application Programming Interface (API) to control the logic and fetch its output. The output data is passed safely to the slots via a signal. Figure~\ref{fig:guithreads} shows an overview of the threaded programs in the Wupper GUI.
|
\begin{figure}[h] |
\centering |
\includegraphics[width = 0.9 \textwidth]{figures/wupper_gui_threads_overview.pdf} |
\caption{Threaded programs in the Wupper GUI} |
\label{fig:guithreads} |
\end{figure} |
|
\newpage |
\subsubsection {GUI operation} |
|
The GUI is divided into four regions (see Figure~\ref{fig:wuppergui}): status, control, measurement and an info region.
The status region fetches the information about various parts of the FPGA on the VC-709 via the Wupper core, and about the core itself. When the user clicks on the "get Wupper status" button, it shows the internal PLL lock status, Board ID, Card ID and the firmware version. |
|
The control region controls the logic inside Wupper through the API. The ``Reset Wupper'' button resets the application logic by resetting the DMA, flushing the FIFOs and resetting the application values.
|
In the DMA Write section, the user can perform a DMA Write measurement. The user can configure the block size, which affects the speed; this is discussed in Appendix~\ref{sec:blocksize}. The measurement output is shown in the measurement region. The GUI computes the throughput differently from the Wupper-throughput tool, whose result is wrong because of misplaced brackets: the tool evaluates $A/B*C = D$ instead of $A/(B*C) = D$.
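
To see why the bracket placement matters, note that C and C++ evaluate $A/B*C$ from left to right. The symbols $A$, $B$ and $C$ are those of the expressions above; the numbers used below are purely illustrative and are not taken from a measurement:
\begin{align*}
A/B*C &= \frac{A}{B}\cdot C, &
A/(B*C) &= \frac{A}{B\,C}, &
\frac{A/B*C}{A/(B*C)} &= C^{2}.
\end{align*}
For example, with $A = 1000$, $B = 2$ and $C = 5$ the first form gives $2500$ and the second gives $100$, a factor of $C^{2} = 25$ apart.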
|
In a similar way, the user can perform a DMA Read test and the output is shown in the plot in the measurement region. |
The info/console output region gives the user feedback of the application and the GUI. |
|
\begin{figure}[h] |
\centering |
\includegraphics[width = 1 \textwidth]{figures/gui_printscreen_tb.PNG} |
\caption{Screenshot of the example application GUI} |
\label{fig:wuppergui} |
\end{figure} |
|
|
|
|
|
|
\newpage |
|
|
|
|
|
|
|
|
|
|
\newpage |
/documentation/example_application/verification.tex
0,0 → 1,38
% !TeX spellcheck = en_US |
\section{Verification} |
In this section the verification of the example application HDL modules is discussed. During the development of the example application HDL modules, the functional blocks were simulated separately. The HDL blocks were simulated with Mentor Graphics Questasim using a scripting language called Tool Command Language (TCL)~\cite{tcl}.
|
\subsection{Randomness of the data generator} |
The heart of the example application HDL modules is the data generator, which provides data to the Wupper core. The data generator is based on a Linear Feedback Shift Register (LFSR). There are other techniques for generating (pseudo) random data, such as the Linear Congruential Generator and Multiple Recursive Generators. The problem with these methods is that they need a lot of multiplication computing power~\cite{randomgen}, which requires a lot of digital logic / DSP slices.
|
It is important that the output data is random. To check this, Questasim is used for simulation. The first signal in Figure~\ref{fig:simapp} shows the output signal of the LFSR module using the approach by R.W. Ward and T.C.A. Molteno~\cite{lfsrtable}. Questasim can plot the data in the waveform viewer, which gives a visual impression of the randomness.
|
\begin{figure}[h] |
\centering |
\includegraphics[width = 1 \textwidth]{figures/sim_lfsr_output.pdf} |
\caption{Randomness of the data generator based on a 256-bit LFSR.}
\label{fig:simapp} |
\end{figure} |
|
|
\subsection{Verification flow} |
Once the application HDL modules show the expected behaviour in simulation, the complete Wupper package needs to be verified. The expected behaviour of the full Wupper package is that the output of the LFSR is first sent to a buffer in the PC memory. This buffer is later transferred back into the FPGA by means of a DMA read cycle and fed into the input of the multipliers. At the same time, a second transfer is started to move the multiplied data into a second buffer in PC memory.
Simulating the behaviour of transactions to PC memory is possible but very complex. In this case it is more efficient to test the behaviour in real time with Vivado's Integrated Logic Analyzer (ILA)~\cite{ila}, which allows signals to be monitored in real time. The ILA core uses RAM blocks inside the FPGA as storage elements for the data between acquisitions and the subsequent transfers to the host via the JTAG interface. The combination of monitored signals and acquisition depth therefore determines the resource consumption in the FPGA when it is equipped with debug probes. It is crucial to carefully select the signals to monitor and the depth, i.e.\ the number of samples, per acquisition.
|
\newpage |
When the probes are set properly and the triggers are armed, the next step in this process is operating the logic. This is done on the host PC via the Wupper-tools (described in Paragraph 3.3). The tools activate the triggers and create an event, upon which the signal states are captured; these can then be used to verify the behaviour. This approach tests the HDL part and the low-level software part at the same time. The approach and the resources used during the verification are displayed below in Figure~\ref{fig:veriflow}.
|
\begin{figure}[h] |
\includegraphics[width = 1 \textwidth]{figures/veriflow.pdf} |
\caption{View of the verification flow} |
\label{fig:veriflow} |
\end{figure} |
|
The output of the LFSR is fed back to the input of the multipliers. To verify the multiplication, the tool Wupper-chaintest was developed. The tool activates the flow from the data generator to PC memory and back to the PC memory through the multipliers. As described in Section 3.3.2, the tool reserves two buffers inside the target PC. The data stored in buffer 2 are the inputs of the multipliers. The tool multiplies this data, compares the result with the multiplier output stored in buffer 1, and returns an overview of the errors that occurred.

The high-level software has the same purpose as the Wupper-tools, which makes it easy to implement in the Wupper GUI. As mentioned before, the tools are written in C and C++, while Qt is based on C++, which makes it possible to port the tools to Qt.
|
|
|
|
\newpage |
/documentation/example_application/internship-wupper.tex
0,0 → 1,19
\author{Oussama el Kharraz Alami} |
\title{\large Development of an application for\\ \small Wupper, a PCIe Gen3 DMA for Virtex-7 }
\input{et_template/template.tex} |
\date{\SetDocumentDate} |
\def\DocVer{1.0} |
\def\SetDocumentDate{29-1-2016} |
\begin{document} |
\input{titlepage.tex} |
\input{tableofcontents.tex} |
\input{introduction.tex} |
\input{wupper.tex} |
\input{verification.tex} |
\input{conclusion.tex} |
|
\begin{appendices} |
\input{appendix.tex} |
\end{appendices} |
\input{reference.tex} |
\end{document} |
/documentation/example_application/et_template/template.tex
0,0 → 1,118
\pdfobjcompresslevel=0 |
\documentclass[12pt,a4paper,twoside]{article} |
%to use some special color names like "OliveGreen" |
\usepackage[usenames,dvipsnames]{color} |
%Math functions |
\usepackage{amsmath} |
\usepackage{amsfonts} |
\usepackage{amssymb} |
%Including images |
\usepackage{graphicx} |
%Source code listings |
\usepackage{listings} |
%Clickable links |
\usepackage{hyperref} |
\hypersetup{colorlinks=true,urlcolor=blue,linkcolor=black, citecolor=black} |
\usepackage{geometry} |
\usepackage[utf8]{inputenc} %codification of the document |
%to position your images correctly |
\usepackage{float} |
%used to fill with Lorem Ipsum text |
\usepackage{lipsum} |
%uses courier font in code listing |
\usepackage{courier} |
%A nicer way to create tables |
\usepackage{tabu} |
\usepackage{longtable} |
\let\oldlongtabu\longtabu \renewcommand{\longtabu}{\footnotesize\oldlongtabu} |
\usepackage{tabularx} |
\usepackage[table]{xcolor} |
%Headers on top and below page |
\usepackage{fancyhdr} |
%ability to use .svg images |
\usepackage{svg} |
%ability to make a history index |
\usepackage{vhistory} |
%ability to make a nomenclature abbreviation list |
\usepackage[intoc]{nomencl} % for abbreviation list |
% just for generating blind text for testing. |
\usepackage{blindtext} |
% to change the page dimensions |
\usepackage{geometry} |
% to change a single page to landscape. |
\usepackage{lscape} |
\usepackage{wrapfig} |
|
|
\usepackage[titletoc, title]{appendix} |
% for merging pdf pages |
\usepackage{pdfpages} % merging pdf files..... |
|
\usepackage{sectsty} |
% eps to pdf automatic |
\usepackage{epstopdf} |
%little trick so we can use \doctitle and \docauthor throughout the document |
\makeatletter |
\let\doctitle\@title |
\let\docauthor\@author |
\makeatother |
%use another font |
\renewcommand{\familydefault}{\sfdefault} |
%Put page numbers, document title and author in header / footer |
\fancypagestyle{plain}{ |
\fancyhead[L]{} |
\fancyhead[R]{} |
\fancyhead[CH]{\doctitle} |
\fancyfoot[OR]{\thepage} |
\fancyfoot[OL]{\DocVer} |
\fancyfoot[EL]{\thepage} |
\fancyfoot[ER]{\DocVer} |
\fancyfoot[C]{\docauthor} |
\renewcommand{\headrulewidth}{0.1 mm} % add line under header |
\renewcommand{\footrulewidth}{0.1 mm} % add line under footer |
} |
\setlength{\headheight}{51.4pt} |
%use plain page style with fancyheaders |
\pagestyle{plain} |
%put the nikhef logo and add some other things on the title page |
\fancypagestyle{titlepage}{ |
\fancyhead[C]{} |
\fancyhead[L]{\includegraphics[width=0.2\textwidth]{et_template/pictures/NIKHEF.pdf} } |
\fancyhead[R]{Electronics \\Technology} |
\fancyfoot[L]{} |
\fancyfoot[R]{} |
\fancyfoot[C]{\includegraphics[width=1\textwidth]{et_template/pictures/footer.png}\\Science Park 105 - 1098XG Amsterdam} |
\renewcommand{\headrulewidth}{0.1 mm} % add line under header |
\renewcommand{\footrulewidth}{0 pt} % no line under footer |
|
} |
%lines below header and above footer |
%some better margins than default |
\geometry{a4paper} |
\geometry{left=27.5mm,right=27.5mm,bottom=27.5mm,top=27.5mm} %margins |
%make the header over the full page width for odd and even pages |
\fancyheadoffset[LE]{0 cm} |
%Format and colorize source code listings |
\lstset{ |
basicstyle=\footnotesize\ttfamily, |
breaklines=true, |
keywordstyle=\color{blue}, |
stringstyle=\color{red}, |
commentstyle=\color{OliveGreen}, |
% numbers=left, |
morecomment=[l][\color{OliveGreen}]{\#} |
} |
|
|
|
%Add some properties to the PDF file |
\hypersetup{pdfauthor={\docauthor},% |
pdftitle={\Large \doctitle},% |
pdfsubject={\doctitle},% |
pdfkeywords={Nikhef, ET, Electronics},% |
pdfproducer={LaTeX},% |
pdfcreator={LuaLaTeX} |
} |
% Loading circuitikz with siunitx option to create electronic circuits |
\usepackage[siunitx]{circuitikz} |
|
/documentation/example_application/et_template/titlepage.tex
0,0 → 1,9
\begin{titlepage} |
\maketitle |
\thispagestyle{titlepage} |
\begin{center} |
\vspace{5cm} |
\includegraphics[width=0.25\textwidth]{pictures/NIKHEF.pdf} |
\end{center} |
\newpage |
\end{titlepage} |
/documentation/example_application/et_template/pictures/footer.png
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
documentation/example_application/et_template/pictures/footer.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: documentation/example_application/et_template/pictures/NIKHEF.pdf
===================================================================
--- documentation/example_application/et_template/pictures/NIKHEF.pdf (nonexistent)
+++ documentation/example_application/et_template/pictures/NIKHEF.pdf (revision 31)
[Binary PDF content omitted. Recoverable metadata: PDF-1.4, NIKHEFlogo.eps, Kees Huyser, Adobe Illustrator(TM) 3.2, 2011-05-30T11:55:17+02:00.]
Index: documentation/example_application/figures/gui_printscreen.jpg
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: documentation/example_application/figures/gui_printscreen.jpg
===================================================================
--- documentation/example_application/figures/gui_printscreen.jpg (nonexistent)
+++ documentation/example_application/figures/gui_printscreen.jpg (revision 31)
documentation/example_application/figures/gui_printscreen.jpg
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: documentation/example_application/figures/generate_output_products.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: documentation/example_application/figures/generate_output_products.png
===================================================================
--- documentation/example_application/figures/generate_output_products.png (nonexistent)
+++ documentation/example_application/figures/generate_output_products.png (revision 31)
documentation/example_application/figures/generate_output_products.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: documentation/example_application/figures/pipeline_mul_2early_inverted.png
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: documentation/example_application/figures/pipeline_mul_2early_inverted.png
===================================================================
--- documentation/example_application/figures/pipeline_mul_2early_inverted.png (nonexistent)
+++ documentation/example_application/figures/pipeline_mul_2early_inverted.png (revision 31)
documentation/example_application/figures/pipeline_mul_2early_inverted.png
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: documentation/example_application/figures/benchmark_application.pdf
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: documentation/example_application/figures/benchmark_application.pdf
===================================================================
--- documentation/example_application/figures/benchmark_application.pdf (nonexistent)
+++ documentation/example_application/figures/benchmark_application.pdf (revision 31)
documentation/example_application/figures/benchmark_application.pdf
Property changes :
Added: svn:mime-type
## -0,0 +1 ##
+application/octet-stream
\ No newline at end of property
Index: documentation/example_application/figures/random_wupper_orange.pdf
===================================================================
--- documentation/example_application/figures/random_wupper_orange.pdf (nonexistent)
+++ documentation/example_application/figures/random_wupper_orange.pdf (revision 31)