Reed-Solomon decoder design for C-based synthesis

Top-level decoder module: rs_decode.cpp
Simple testbench: test_rs_decode.cpp

At present, dynamic values of k and t are not used in the decoding process.
Instead, the static constants in global_rs.h are used for simplicity.

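As a rough illustration of what such static constants might look like, the
sketch below assumes an RS(255,223) code over 8-bit symbols. The k = 223 figure
is inferred from the expected testbench output; the codeword length of 255 and
all identifier names are assumptions, and the real contents of global_rs.h may
differ.

```cpp
// Hypothetical constants; the actual names in global_rs.h are not known here.
constexpr int RS_N = 255;                // codeword length (assumes 8-bit symbols)
constexpr int RS_K = 223;                // data symbols per block
constexpr int RS_T = (RS_N - RS_K) / 2;  // correctable symbol errors per block

// A Reed-Solomon code carries 2t parity symbols, so n - k must equal 2t.
static_assert(RS_N - RS_K == 2 * RS_T, "parity symbol count must be 2t");
```

Keeping these values as compile-time constants is what lets loop bounds in the
decoder be fixed for synthesis.
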
Executing make compiles the decoder and the testbench and creates an executable.
Correct execution is indicated by the decoded output being a descending count
from 223 to 1, repeated 3 times.

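The pass criterion can also be checked mechanically. The sketch below is not
part of the decoder sources: both helper names are hypothetical, and it assumes
the decoded symbols have already been collected into a vector of integers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds the reference sequence k, k-1, ..., 1,
// repeated `blocks` times.
std::vector<int> make_expected_output(int k = 223, int blocks = 3) {
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(k) * static_cast<std::size_t>(blocks));
    for (int b = 0; b < blocks; ++b) {
        for (int v = k; v >= 1; --v) {
            out.push_back(v);
        }
    }
    return out;
}

// Hypothetical helper: true when `decoded` matches the reference sequence.
bool is_expected_output(const std::vector<int>& decoded,
                        int k = 223, int blocks = 3) {
    return decoded == make_expected_output(k, blocks);
}
```

With the defaults, the reference sequence holds 3 * 223 = 669 symbols.
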
For synthesis, import the sources into a C-based design tool, set appropriate
hardware parameters and synthesize for a particular platform.

For our case study of generating hardware from this code, the following
transformations and directives were used for improving performance:

- Break up complex functions into smaller, simpler computational
  segments/loops to allow independent optimizations. For an example, see
  berlekamp.cpp, where each loop is labelled for further directives.

- Modify loops that depend on dynamic bounds to use static bounds, and
  mask the output to give the correct result.

- Fully unroll 'for' loops with fixed static bounds. Each loop to be
  unrolled is marked with a comment in the code. Loops are unrolled in
  berlekamp.cpp, chien_search.cpp, gf_arith.cpp and syndrome.cpp.

- Synthesize hardware units for individual functions rather than the
  entire logic as a single unit. Functions intended for independent
  synthesis are marked with comments in the code.

- Use appropriate streaming buffers for communication between
  functional units. Buffers should allow fine-grained pipeline
  parallelism across a single data block where possible.
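
The static-bound transformation listed above can be sketched as follows. This is
a minimal illustration and is not taken from the decoder sources: MAX_LEN and
both function names are hypothetical, and XOR accumulation stands in for a real
decoder loop (addition in GF(2^m) is a bitwise XOR).

```cpp
#include <cstdint>

constexpr int MAX_LEN = 16;  // hypothetical static upper bound on the loop

// Original style: the trip count depends on the run-time value `len`,
// which prevents the loop from being fully unrolled.
std::uint8_t xor_sum_dynamic(const std::uint8_t data[MAX_LEN], int len) {
    std::uint8_t acc = 0;
    for (int i = 0; i < len; ++i) {
        acc ^= data[i];
    }
    return acc;
}

// Transformed style: the trip count is a compile-time constant, and
// elements at or beyond `len` are masked to zero so the result is unchanged.
std::uint8_t xor_sum_static(const std::uint8_t data[MAX_LEN], int len) {
    std::uint8_t acc = 0;
    // Candidate for full unrolling: MAX_LEN is known at compile time.
    for (int i = 0; i < MAX_LEN; ++i) {
        const std::uint8_t mask = (i < len) ? 0xFF : 0x00;
        acc ^= static_cast<std::uint8_t>(data[i] & mask);
    }
    return acc;
}
```

The static version always does MAX_LEN iterations of work, trading some
redundant computation for a loop that a synthesis tool can unroll into
parallel hardware.
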