OpenCores
URL https://opencores.org/ocsvn/dblclockfft/dblclockfft/trunk

Subversion Repositories dblclockfft

[/] [dblclockfft/] [trunk/] [sw/] [fftgen.cpp] - Rev 37

Rev Log message Author Age Path
37 Software updates dgisselq 2081d 09h /dblclockfft/trunk/sw/fftgen.cpp
36 Added several new modes to the FFT

This makes the FFT core generator a generator for a generic
pipelined FFT--whether it be two samples per clock, one
sample per clock, one sample per two clocks, or even one
sample every three clocks.

This version works in simulation, with some formal checks
as well.
dgisselq 2340d 13h /dblclockfft/trunk/sw/fftgen.cpp
35 TB now handles newer Verilator versions

I also placed verilator -Wall into the verilator Makefile,
turned on the -trace capability (though nothing uses it), and
placed `default_nettype none into all of the created
Verilog files.
dgisselq 2554d 22h /dblclockfft/trunk/sw/fftgen.cpp
34 OCTkt #2661: Fixed a problem in longbimpy causing an iverilog error dgisselq 2800d 21h /dblclockfft/trunk/sw/fftgen.cpp
33 Updated internal docs, ignore files, no substantial changes. dgisselq 2801d 17h /dblclockfft/trunk/sw/fftgen.cpp
32 Fixed the makefile dependency on the static VERILATOR_ROOT. dgisselq 2963d 13h /dblclockfft/trunk/sw/fftgen.cpp
31 Thanks to Lesha Birukov, these changes will help to make fftgen.cpp work in a
Microsoft Visual Studio (2012) environment. Further testing is necessary for
proving that it works in (2013+) environments, but I don't have those to test
with. At issue are the functions llround() for rounding a double to a long
long (64-bit integer), mkdir for creating a directory to hold all the Verilog
files in, lstat and S_ISDIR for determining if a directory exists that can be
written to without overwriting a file of the same name, and the access()
macros R_OK, W_OK, X_OK and F_OK--again for determining whether access to a
directory is possible.
dgisselq 3226d 23h /dblclockfft/trunk/sw/fftgen.cpp
30 Minor documentation edits. dgisselq 3278d 11h /dblclockfft/trunk/sw/fftgen.cpp
29 Checking in a lot of changes here. These changes were focused on two
things primarily: 1st the ability to match, in bench testing, the bench
test to the configuration of the generated FFT. For this purpose, the
fftgen program now creates fftsize.h and ifftsize.h header files. These
header files contain the parameters that were used in the creation of the
various verilog files, and therefore the C++ test benches may now be compiled
to match the test files. The 2nd change is the multiply. Based upon a
set of slides from Xilinx, I rebuilt my shiftaddmpy into a longbimpy.
(Think of 'bimpy' as a 'bi', or two-bit, 'mpy', or multiply.) Longbimpy
depends upon bimpy, an optimized 2xN bit multiply--optimized for 6-bit
LUTs with carry chains. Longbimpy simply expands that capability to a
NxN bit multiply. Sadly, the longbimpy approach increased my area on the
chip when it was supposed to be a cheaper multiply, so I may well take it
back out in the future.
dgisselq 3434d 20h /dblclockfft/trunk/sw/fftgen.cpp
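The relationship between bimpy and longbimpy described in revision 29 can be illustrated with a small software model. This is only a sketch under stated assumptions: the function names bimpy_model and longbimpy_model are hypothetical, the operands are unsigned 32-bit values, and nothing here reflects the pipelining, signed handling, or LUT mapping of the generated Verilog. It shows only the arithmetic idea of building an NxN multiply from 2xN partial products.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 2xN multiply: two_bits must be in the range 0..3.
// Bit 0 selects b, bit 1 selects 2*b, and the two selections are summed.
uint64_t bimpy_model(uint32_t two_bits, uint32_t b) {
    assert(two_bits < 4u);
    uint64_t lo = (two_bits & 1u) ? static_cast<uint64_t>(b) : 0u;
    uint64_t hi = (two_bits & 2u) ? (static_cast<uint64_t>(b) << 1) : 0u;
    return lo + hi;
}

// Hypothetical model of an NxN multiply (N = 32 here): split a into
// two-bit groups, multiply each group by b, and add the shifted results.
uint64_t longbimpy_model(uint32_t a, uint32_t b) {
    uint64_t acc = 0;
    for (int i = 0; i < 32; i += 2) {
        uint32_t group = (a >> i) & 3u;       // next two bits of a
        acc += bimpy_model(group, b) << i;    // weight by the group position
    }
    return acc;
}
```

Calling longbimpy_model(a, b) should give the same result as the built-in 64-bit product of a and b; in hardware the sixteen partial products would be summed in an adder tree rather than a loop.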
28 This revision represents a lot of work to get the Verilator simulation to now
match the FPGA performance. The big problem turned out to be in the
bit reversal stage, where a '=' was used on a register instead of a '<='.
Neither Verilator nor Vivado complained, but they each treated the result
differently. In addition, a bug was traced to the soft butterfly, butterfly.v,
whereby the delay through the butterfly did not properly change when the
delay through the multiply changed. All of this has been fixed, and now
appears to work and work well in both hardware and simulation.
dgisselq 3444d 14h /dblclockfft/trunk/sw/fftgen.cpp
26 A lot of updates and upgrades in this release. Specifically, work took place
over the last several days to demonstrate this FFT on an FPGA. It was
demonstrated on the Xilinx Artix-7 found on a Basys-3 development board.
Part of the effort centered on making certain that the DSPs were used
optimally; part of it centered on making certain that various parts of the
FFT could use block RAM-type memories. The other massive change involved
removing as much unnecessary logic as possible, so that two 16-bit 1k FFTs
could fit onto this part--together with other glue logic. The bottom line,
though, is that it all now works. Specifically, I've tested it successfully
with

fftgen -f <FFTSIZE> -n 16 -m 16 -p 7 -c 1 -x 1

and with FFTSIZEs of 32, 64, 128, 256, 512, and 1024.

Oh, I should mention that there's also an undocumented DEBUG interface to the
part, and I fixed where the Verilog files went when given an argument, so
that they actually went to the directory specified. Minor updates have taken
place to the documentation format, making it match the documentation format
for other opencores projects that I've produced.

On a sadder note, the Verilator simulation fft_tb no longer works. (Yeah, get
that---the FFT implementation works but Verilator does not. Sigh).
dgisselq 3466d 21h /dblclockfft/trunk/sw/fftgen.cpp
25 The documentation has been changed to bring it closer to the example
specification document. The big change is that the first line, the
headline if you will, of each table has been adjusted to have a gray
background.

The second and bigger change is in the fftgen software itself, or more
specifically to the code that it produces. I had the opportunity to build
the code using Xilinx's ISE and spent some time getting rid of warnings
during the build. As a result, registers that need initial conditions
now have initial conditions. In a similar manner I have trimmed out
registers that were unused. The resulting code continues to pass all
test benches, although I will admit that the automated testing of the
whole fft or ifft via their test benches, fft_tb and ifft_tb, continues
to be a less automated process than it should be. Therefore, these
test benches always "pass" and only a manual examination of the results
can be used to confirm their success.
dgisselq 3516d 21h /dblclockfft/trunk/sw/fftgen.cpp
24 Found some bugs in the synchronization code, such that if a reset was
issued residual internal synchronization signals might not get cleared.
These have been fixed.

Other updates have been made to the internal source documentation.
dgisselq 3536d 19h /dblclockfft/trunk/sw/fftgen.cpp
23 Lots of work to implement a variable means of rounding. The variable
rounding is now implemented within the code; all that's left is to
place a command line option to the generator to choose how values
are to be rounded: either by truncation (drop the lower bits), by
always rounding half up (if the first extra bit is one, go up),
by rounding away from zero (if exactly .5, move away from zero), or
by rounding towards even (if exactly .5, move towards the nearest
even value).

This added an extra clock cycle to each stage, so all of the
test benches needed to be reworked. There is currently no testbench
to test the rounding method itself. This necessitated some
wholesale changes to the testbench code, and the addition
of the twoc.[h|cpp] files. (These routines were within every piece of code, just
copied from one to the next; this now encapsulates them within their
own file so fixes will propagate to all.) Other changes include creating
testbench classes, adjusting the classes so that one can test what will
happen if the sync isn't added initially, and more. In the end, my
problem was tied to an assumption within fftmain.v that dblstage would
always be a one tick delay, whereas with the one tick of the rounding
function it now becomes a two tick delay .... but the task is done, and
the FFT appears to work again. The maximum sum of square errors (XISQ)
is about half what it was before now, when I use convergent rounding.
dgisselq 3546d 11h /dblclockfft/trunk/sw/fftgen.cpp
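The four rounding choices listed in revision 23 can be sketched in C++ as ways of dropping the low bits of a two's-complement integer. This is an illustration only, not the generated Verilog: the function names are hypothetical, and the code assumes that right-shifting a negative value is an arithmetic shift (guaranteed from C++20 on, and the behavior of common compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Each function drops the low `shift` bits of v (1 <= shift < 62).

// Truncation: simply drop the lower bits (rounds toward minus infinity).
int64_t truncate_bits(int64_t v, int shift) {
    assert(shift >= 1 && shift < 62);
    return v >> shift;
}

// Round half up: if the first dropped bit is one, go up.
int64_t round_half_up(int64_t v, int shift) {
    assert(shift >= 1 && shift < 62);
    int64_t half = static_cast<int64_t>(1) << (shift - 1);
    return (v + half) >> shift;
}

// Round away from zero: exact halves move away from zero.
int64_t round_from_zero(int64_t v, int shift) {
    assert(shift >= 1 && shift < 62);
    int64_t half = static_cast<int64_t>(1) << (shift - 1);
    // Negative values get one less than half, so their ties round down.
    return (v + half - (v < 0 ? 1 : 0)) >> shift;
}

// Round to even (convergent): exact halves move to the nearest even value.
int64_t round_to_even(int64_t v, int shift) {
    assert(shift >= 1 && shift < 62);
    int64_t half = static_cast<int64_t>(1) << (shift - 1);
    int64_t lsb = (v >> shift) & 1;   // lowest bit that will be kept
    // Ties round up only when the kept part is odd.
    return (v + half - 1 + lsb) >> shift;
}
```

With shift = 1 the input -3 represents -1.5: truncation gives -2, half up gives -1, away from zero gives -2, and round to even gives -2. The input 5 (2.5) gives 2, 3, 3, and 2 respectively. Only the last two modes treat positive and negative ties symmetrically, and only round to even avoids pushing ties consistently in one direction.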
22 Lots of changes, mostly around getting this multiply to fit within a
particular FPGA. Specifically, we just added the capability of using
hardware multiplies to the command line options. Use them if you have
them, and it will simplify the operation of the FFT.
dgisselq 3547d 14h /dblclockfft/trunk/sw/fftgen.cpp
21 Modified the core generator so that the result compiles with Xilinx's Vivado
toolsuite without generating any syntax errors.
dgisselq 3548d 10h /dblclockfft/trunk/sw/fftgen.cpp
20 Adjusted rounding to use the floating point modes inherent in the double type.
Hence, (int)(x+0.5) has been replaced by (int)round(x).
dgisselq 3550d 00h /dblclockfft/trunk/sw/fftgen.cpp
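The practical difference between the two expressions in revision 20 shows up for negative inputs. The sketch below uses hypothetical helper names and is not taken from fftgen.cpp. It relies on two facts of the language: a cast from double to int truncates toward zero, and std::round rounds to the nearest integer with halfway cases going away from zero, independent of the current floating-point rounding mode.

```cpp
#include <cmath>

// The older form: add one half, then let the cast truncate toward zero.
int cast_plus_half(double x) {
    return (int)(x + 0.5);
}

// The newer form: round to nearest first, then cast.
int cast_round(double x) {
    return (int)std::round(x);
}
```

For x = 2.3 both helpers return 2, and for x = 2.5 both return 3. For x = -2.7, cast_plus_half returns -2 because -2.2 is truncated toward zero, while cast_round returns -3. Values such as negative twiddle-factor coefficients would therefore be off by one with the older form.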
19 Added the capability to accumulate bits internal to the FFT, only to drop
those extra bits just before the end. This helps to reduce truncation
error, and may even drop it by a factor of four (my own measurements).
dgisselq 3550d 19h /dblclockfft/trunk/sw/fftgen.cpp
16 Cleaned up the test bench build scripts, made sure license statements were
placed on all files, etc.
dgisselq 3552d 12h /dblclockfft/trunk/sw/fftgen.cpp
15 Added rounding into the routine to remove bias. All of the test benches have
been modified so that the FFT, with rounding, now passes. While the rounding
implementation applied does remove bias, it does not yet remove all bias.
Some work still remains.
dgisselq 3553d 11h /dblclockfft/trunk/sw/fftgen.cpp

© copyright 1999-2024 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.