This project implements a sorter able to sort a continuous stream of data, consisting of records labeled with "sort keys".
Sorter sorts one record every two clock cycles.
Sorter is based on the heap sort algorithm. Efficient implementation is assured thanks to the use of internal dual port
RAM in FPGA.
The required size of heap is equal to the expected maximum distance between unsorted records in the data stream.
The sorter implemented in this project is designed for sorting of stream of constant length records.
The main supposed application area are the multichannel data acquisition systems, where data records labelled with time stamps are transmitted through multiple channels with different latencies to the data concentrator, which should sort data according to their time stamps and source identifiers before sending them to further processing.
Each record contains the "sort key" and the payload. In the simplest case, the sort key may be a N-bit bit vector, treated as unsigned integer number.
Of course in case of lengthy data stream, the sort key will reach its maximum value of 2^N-1 and then wrap to 0. For the sorter it is fully acceptable, as long, as the capacity of the sorter (maximum number of data records stored in the sorter) summed with the maximum distance between unsorted records in the input data stream is less than half of the maximum value of the sort-key.
In this case we may compare sort keys of the newly received data record and data records stored in the sorter using simple operation:
signal unsigned sort_key1(N-1 downto 0);
signal unsigned sort_key2(N-1 downto 0);
signal signed sort_key_diff(N-1 downto 0);
[...]
sort_key_diff
Correct operation of the sorter requires, that the heap is initially filled with the records labelled with smallest possible values of the sort key. Similarly, when all data are sent to the sorter, it is necessary to send to the input of the sorter records with the biggest possible value of sort key. The "smallest possible" and the "biggest possible" values are implemented using additional bit fields "init" and "valid". This implementation allows as also to mark temporary interruptions in the stream of data, as shown below:
The main entity of the sorter is defined as follows:
entity sorter_sys is
generic (
NLEVELS : integer := SYS_NLEVELS -- number of levels in the sorter heap
);
port (
din : in T_DATA_REC;
we : in std_logic;
dout : out T_DATA_REC;
dav : out std_logic;
clk : in std_logic;
rst_n : in std_logic;
ready : out std_logic);
end sorter_sys;
To customize it to your needs, you should adjust definition of the T_DATA_REC type in the sorter_pkg.vhd file. Additionally you may also need to modify functions for converting data records into the std_logic_vector and from the std_logic_vector (tdrec2stlv and stlv2tdrec).
You may also change definition of the T_SORT_KEY type, but then probably you should also change the function sort_cmp_lt used to compare sort keys in two data records.
The sources provide also complete testbench for my sorter. To check it, you should simply adjust parameters given in the "sort_test_gen.py" file and run "make".
The Python script will generate the sorter configuration and input records. Then the makefile will compile the testbench and run the simulation.
Finally the records on the output will be analysed. If everything works fine, the "Test passed!" message will be displayed.
The simulation is prepared for the Linux system, and uses the ghdl simulator.
If you want to synthesize the sorter for use in real FPGA, you should replace the implementation found in dpram4.vhd with implementation based on: http://danstrother.com/2010/09/11/inferring-rams-in-fpgas/ The source is available in dpram_inf.vhd
All my sources in this archive are published under the BSD license. You can use them and modify them, however you should keep the information about the original author.
First implementation of the sorter has been described in my paper:
Wojciech M. Zabołotny, "Dual port memory based Heapsort implementation for FPGA", Proc. SPIE 8008, 80080E (2011); doi:10.1117/12.905281. I'll be glad if you cite my paper, when you publish something based on my sources.
The archive contains also some sources (dpram4.vhd, dpram_inf.vhd) which were obtained from sources with unclear licensing conditions - so I simply provide them for completeness, but I'm not able to set any specific licensing for them.
I don't know whether my sorter infringes any patents. If you want to use it for commercial purposes, you should check it yourself. I also don't know if my sorter works correctly in all possible conditions. I provide it "AS IS" without any warranty. You use it on your own risk!
First version of the sorter is available on my own website: http://koral.ise.pw.edu.pl/~wzab/fpga_heapsort/.
HLS implementation for Xilinx FPGAs
I have prepared different implementations of my heap sorter in C using the Xilinx HLS compiler. The detailed description of those implementations and their properties may be found in my paper:
Wojciech M. Zabołotny, "Implementation of heapsort in programmable logic with high-level synthesis", Proc. SPIE 10808 (2018); doi:10.1117/12.2502093