This project implements the AXI4 transaction-level model (TLM) and bus functional model (BFM) in VHDL. Currently, only the AXI4-Stream Master protocol is supported, but I also have plans to support AXI4-Lite and the full AXI4 protocols.
This enables sub-components of an SoC to communicate easily with one another over the AXI4 bus. Communication is achieved simply by placing a procedure call in your sub-component. The high-level transactions hide the AXI4 protocol details in a lower-level layer known as the bus functional model. This separation between the high-level and low-level implementations results in a more modular and manageable design.
I have included OS-VVM verbatim from its website, so you will need to uncompress the archive first (you may also do this with a GUI tool):
$ cd rtl/packages
$ tar xvf OSVVM_2013_05.tar.gz
I have not adapted or changed the OS-VVM packages in any way. To find out more about the cool features of OS-VVM, or to contribute to the project, visit the OS-VVM website.
After unpacking OS-VVM, we can now simulate the design with Mentor Graphics Questa/ModelSim. Simply cd into the testbench/questa folder, and execute simulate.sh from the Unix prompt:
$ ./simulate.sh
If you have ModelSim/QuestaSim installed, the GUI will appear immediately after you run the script.
Currently, I provide only the simulation script for Linux/Unix. Email me at daniel.kho@opencores.org if you need help with simulating this project on Windows, and I will send you separate instructions.
I tried simulating this on Synopsys VCS-MX, but the tool had trouble with the VHDL-2008 constructs I was using. If you are using this simulator, or any other simulator, kindly let us know how the core works with it.
Altera and Xilinx tools failed to synthesise this core as is, because they do not yet support many of the VHDL-2008 and VHDL-2002 language constructs. However, I believe Synopsys Synplify should be able to synthesise this. If you are using Synplify, or any other synthesis tool, let us know how well this core works with your toolchain.
[Note: if this core synthesises well with Synplify, it could very well work for Lattice FPGAs without much hassle. Let us know if you would like to try this on Lattice, so I can post up your results here.]
Update [11 Sept 2013]:
Design debugged on Altera Quartus. I had to work around Quartus synthesis by changing some VHDL-2008 constructs to VHDL-93. Design verified on an Altera FPGA, and hardware measurements match ModelSim simulations well. To use the synthesis sources, look under the rtl/quartus-synthesis folder. You can run the Quartus synthesis flow by entering the following at the Unix prompt (assuming you are in "trunk"):
$ cd workspace/quartus
$ ./synthesise.sh
Here's an explanation of what the synthesis script (synthesise.sh) does:
$ quartus_sh --flow compile axi4-tlm
- Runs the whole Quartus synthesis, place-and-route, and design assembly flow.
$ quartus_pgm -c 'USB-Blaster [1-1.6]' -m jtag -o 'p;./output_files/axi4-tlm.sof'
- Programs your board. You may need to change the cable name to the one that's connected to your machine. Enter "quartus_pgm -l" to find out your cable name.
$ quartus_stpw ./waves.stp &
- Brings up the Quartus SignalTap II Embedded Logic Analyser's GUI for signal acquisition and viewing.
I have tested this to be working on an Altera DE2-115 kit, the Nios II Embedded Evaluation Kit (NEEK), and also the Altera-Arrow BeMicro Kit. Essentially, this design should work on any other Altera board as well. You just need to assign a clock and reset, and perhaps tweak the SignalTap II core for other boards (if needed), and you're set.
Note that although I used the NEEK, I did not use Nios (or any processor) in this design. You could, however, use this core to interconnect processors and other peripherals that are AXI4-Stream compliant. The place-and-route results reported here were taken from the compilation on the BeMicro Kit (which uses the Cyclone IV E FPGA).
I am trying to make this core as vendor-independent as possible. To do this, I plan to write a script that works around the limitations of several vendor tools, including conversion of some VHDL-2008 language constructs to synthesisable VHDL-93 forms. If you'd like to volunteer to write this script (or would like to help in any other way), feel free to let me know, and we'll see how we could collaborate.
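The planned conversion script is not described in detail here, so the following is only a minimal Python sketch of what a single conversion pass might look like. It assumes, purely for illustration, that the pass rewrites VHDL-2008 delimited comments (/* ... */) as VHDL-93 line comments. The function name block_to_line_comments is hypothetical, and the constructs this project actually needs converted may be different and considerably harder. The pass does not parse VHDL, so a "/*" inside a string literal would be mishandled.

```python
import re

# Matches a VHDL-2008 delimited comment, possibly spanning several lines.
BLOCK_COMMENT = re.compile(r"/\*(.*?)\*/", re.DOTALL)


def block_to_line_comments(source: str) -> str:
    """Rewrite /* ... */ comments as VHDL-93 '--' line comments (illustrative only)."""

    def convert(match: "re.Match[str]") -> str:
        body = match.group(1).splitlines() or [""]
        converted = "\n".join(("-- " + line.strip()).rstrip() for line in body)
        # If code follows the comment on the same line, break the line so the
        # new '--' comment does not swallow that code.
        rest = match.string[match.end():]
        if rest and not rest.startswith("\n"):
            converted += "\n"
        return converted

    return BLOCK_COMMENT.sub(convert, source)


vhdl_2008 = "a <= b; /* note */\n/* one\n   two */ c <= d;\n"
vhdl_93 = block_to_line_comments(vhdl_2008)
print(vhdl_93)
```

A real converter would more likely be built on a proper VHDL tokeniser, with one such pass per unsupported construct and per target tool.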
Stay tuned for our Xilinx Vivado version of this core.
Comments and feedback are welcome and appreciated. Feel free to write to me (daniel.kho@opencores.org / daniel.kho@tauhop.com).
Usability and readability:
- Designed in simple and elegant VHDL-2008, with conversions to VHDL-93 for synthesis.
- Transactor and BFM designed using synthesisable VHDL procedures and VHDL records.
- I/O ports are grouped into VHDL records.
- Very simple to use. For a design unit to communicate with another design unit having the same interface, communication is done via a very simple procedure call. For a Master to send data to the Slave, one would just do the following:
write(streamData);
where streamData is the data which the master peripheral wishes to transfer to the slave peripheral.
- Functional verification using OS-VVM's coverage-driven constrained random verification techniques.
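To illustrate the split between the transaction-level write(streamData) call and the bus functional layer beneath it, here is a small behavioural model in Python. It is not the project's VHDL: the class AxiStreamMasterModel and its clock() method are hypothetical names, and the model assumes only the basic AXI4-Stream rule that a word is transferred on a clock edge where tValid and tReady are both high, with tValid and tData held until that happens.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AxiStreamMasterModel:
    """Hypothetical behavioural model of an AXI4-Stream master (not the VHDL BFM)."""
    t_valid: bool = False
    t_data: int = 0
    pending: List[int] = field(default_factory=list)

    def write(self, stream_data: int) -> None:
        """Transaction level: the caller only hands over the data word."""
        self.pending.append(stream_data)

    def clock(self, t_ready: bool) -> Optional[int]:
        """Bus functional level: advance one clock edge.

        Returns the word transferred on this edge, or None if none was.
        """
        transferred = None
        if self.t_valid and t_ready:
            # Handshake completes: the slave samples tData on this edge.
            transferred = self.t_data
            self.t_valid = False
        if not self.t_valid and self.pending:
            # Present the next queued word; it is held until tReady is seen.
            self.t_data = self.pending.pop(0)
            self.t_valid = True
        return transferred


master = AxiStreamMasterModel()
master.write(0xA)
master.write(0xB)
ready_pattern = [False, True, False, True, True]  # slave back-pressure per cycle
trace = [master.clock(t_ready) for t_ready in ready_pattern]
print(trace)
```

The caller never touches tValid or tReady; only clock() does, which is the separation the procedure-based design aims for.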
Design characteristics:
[Note that some of these characteristics reflect the current state of development of this project, and may change as this project evolves.]
- Synchronous and pipelined logic, with asynchronous resets.
- Huge chunks of combinatorial logic will also be synchronously reset.
- Design is very generic, flexible, and scalable. Data widths can be easily adjusted, and the design was created with readability and scalability carefully thought out from the beginning.
- Efficient and very small (77 LEs for Altera) AXI4-Stream Master if using a 32-bit data interface.
- Quartus reported an Fmax of 277.47 MHz for a 32-bit data bus, using the slow 1200 mV, 85 °C timing model. The restricted Fmax is 250.0 MHz, limited by the maximum I/O toggle rate.
This is the current post-place-and-route summary. To reproduce similar results, compile this project with the SignalTap tester removed, and use 32-bit bus widths for both the data bus (axiMaster_out.tData:t_msg) and the symbolsPerTransfer:t_cnt testbench stimulus.
Note that these results may be different if you use different bus widths, or Quartus settings, etc.
+--------------------------------------------------------------------------------+
; Fitter Summary                                                                 ;
+------------------------------------+-------------------------------------------+
; Fitter Status                      ; Successful - Mon Mar 10 16:27:39 2014     ;
; Quartus II 32-bit Version          ; 12.1 Build 177 11/07/2012 SJ Full Version ;
; Revision Name                      ; axi4-tlm                                  ;
; Top-level Entity Name              ; user                                      ;
; Family                             ; Cyclone IV E                              ;
; Device                             ; EP4CE115F29C7                             ;
; Timing Models                      ; Final                                     ;
; Total logic elements               ; 77 / 114,480 ( < 1 % )                    ;
; Total combinational functions      ; 44 / 114,480 ( < 1 % )                    ;
; Dedicated logic registers          ; 75 / 114,480 ( < 1 % )                    ;
; Total registers                    ; 75                                        ;
; Total pins                         ; 125 / 529 ( 24 % )                        ;
; Total virtual pins                 ; 0                                         ;
; Total memory bits                  ; 0 / 3,981,312 ( 0 % )                     ;
; Embedded Multiplier 9-bit elements ; 0 / 532 ( 0 % )                           ;
; Total PLLs                         ; 0 / 4 ( 0 % )                             ;
+------------------------------------+-------------------------------------------+
Here are the corresponding timing summaries for the same compilation:
+-----------------------------------------------------------------------------------------------------------+
; Slow 1200mV 85C Model Fmax Summary                                                                        ;
+------------+-----------------+------------+---------------------------------------------------------------+
; Fmax       ; Restricted Fmax ; Clock Name ; Note                                                          ;
+------------+-----------------+------------+---------------------------------------------------------------+
; 277.47 MHz ; 250.0 MHz       ; clk        ; limit due to minimum period restriction (max I/O toggle rate) ;
+------------+-----------------+------------+---------------------------------------------------------------+
+-----------------------------------------------------------------------------------------------------------+
; Slow 1200mV 0C Model Fmax Summary                                                                         ;
+------------+-----------------+------------+---------------------------------------------------------------+
; Fmax       ; Restricted Fmax ; Clock Name ; Note                                                          ;
+------------+-----------------+------------+---------------------------------------------------------------+
; 302.66 MHz ; 250.0 MHz       ; clk        ; limit due to minimum period restriction (max I/O toggle rate) ;
+------------+-----------------+------------+---------------------------------------------------------------+
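For readers who prefer clock periods to frequencies, the figures in the two timing summaries can be converted with a few lines of Python. This is plain arithmetic on the reported numbers. The peak-throughput figure additionally assumes one 32-bit transfer on every clock cycle with no back-pressure, which is an idealisation and not a measured result.

```python
def period_ns(fmax_mhz: float) -> float:
    """Minimum clock period in nanoseconds for a given Fmax in MHz."""
    return 1000.0 / fmax_mhz


def peak_throughput_gbps(clock_mhz: float, data_width_bits: int) -> float:
    """Ideal throughput in Gbit/s, assuming one transfer on every clock cycle."""
    return clock_mhz * 1e6 * data_width_bits / 1e9


FMAX_85C_MHZ = 277.47    # slow 1200 mV 85C model
FMAX_0C_MHZ = 302.66     # slow 1200 mV 0C model
RESTRICTED_MHZ = 250.0   # restricted Fmax (max I/O toggle rate)

for label, mhz in [("85C", FMAX_85C_MHZ), ("0C", FMAX_0C_MHZ), ("restricted", RESTRICTED_MHZ)]:
    print(f"{label}: {period_ns(mhz):.3f} ns")

print(f"ideal peak at restricted Fmax: {peak_throughput_gbps(RESTRICTED_MHZ, 32):.1f} Gbit/s")
```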
This core has been verified with ModelSim and Quartus SignalTap II, using basic directed testcases as well as OSVVM's coverage-driven constrained random verification techniques. I would like to increase the test coverage in future. I also plan to add hardware results from Xilinx ChipScope, and simulation results from other simulators. If you have simulated or verified this core, please let me know how it works with your toolchain. I believe Aldec ActiveHDL/Riviera Pro and Synopsys Synplify should have no problems, but I have yet to try them out.
[Figure: ModelSim simulation of AXI4-Stream Master write operations.]
[Figure: Measurements acquired from the Quartus SignalTap II embedded logic analyser, showing AXI4-Stream Master write operations.]
I plan to implement AXI4-Stream Slave read operations as well. Currently, the testbench emulates a simple AXI4-Stream slave which responds to write requests from our AXI4-Stream Master; however, it does not latch and save the data. In future, I will also design the Slave as a TLM/BFM model, which will then replace the existing testbench code that emulates the Slave. The Master will connect directly to the Slave, and both Master and Slave models will validate each other. To ensure reliable data transfer, I plan to implement transmit and receive FIFOs, and verify the design with separate clock domains for the Master and Slave. Stay tuned!
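As a rough picture of the planned behaviour, the Python sketch below models a slave that latches each accepted word into a bounded receive FIFO and deasserts tReady when the FIFO is full. All names (AxiStreamSlaveModel, transfer, depth, drain_every) are hypothetical, everything runs in a single clock domain, and the consumer that drains the FIFO is invented for illustration. It says nothing about how the VHDL Slave will actually be written.

```python
from collections import deque
from typing import Deque, List, Optional


class AxiStreamSlaveModel:
    """Hypothetical slave model with a bounded receive FIFO (single clock domain)."""

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self.fifo: Deque[int] = deque()

    @property
    def t_ready(self) -> bool:
        # Back-pressure: ready only while there is room in the FIFO.
        return len(self.fifo) < self.depth

    def clock(self, t_valid: bool, t_data: int) -> bool:
        """Advance one clock edge; latch t_data if the handshake completes."""
        if t_valid and self.t_ready:
            self.fifo.append(t_data)
            return True
        return False

    def read(self) -> Optional[int]:
        """Consumer side: pop the oldest latched word, if any."""
        return self.fifo.popleft() if self.fifo else None


def transfer(words: List[int], depth: int = 2, drain_every: int = 3) -> List[int]:
    """Send words to the slave while a slow consumer drains its FIFO."""
    slave = AxiStreamSlaveModel(depth)
    to_send = deque(words)
    received: List[int] = []
    cycle = 0
    while to_send or slave.fifo:
        # Master side: hold tValid and tData until the slave accepts the word.
        if to_send and slave.clock(True, to_send[0]):
            to_send.popleft()
        # Consumer side: read one word every `drain_every` cycles.
        if cycle % drain_every == drain_every - 1:
            word = slave.read()
            if word is not None:
                received.append(word)
        cycle += 1
    return received


sent = [1, 2, 3, 4, 5]
received = transfer(sent)
print(received == sent)
```

Comparing the sent and received sequences is the simplest form of the Master and Slave checking each other; separate clock domains would need asynchronous FIFOs, which this sketch does not attempt.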
1. More comprehensive directed, constrained random, and functional coverage testcases.
2. Documentation: design specification write-up, verification plan write-up, verification results.
3. AXI4-Stream Slave, and optional features defined by AMBA AXI4 specification.
4. Verify + debug design on more tools: Cadence (Encounter RTL, ncvhdl, ncsim), Mentor Graphics (Precision RTL), Synopsys (VCS-MX, Design Compiler, Synplify), Aldec (Riviera, ActiveHDL), Xilinx (Vivado, ISE, ISim).
5. Bash/Python script to perform automatic VHDL-2008 to VHDL-93 conversion to work around the limitations of different tools.
LogikHaus Sdn. Bhd. - Penang, Malaysia
site: https://www.logik.haus
email: info@logik.haus
tel.: +60 16 333 0498 (daniel)