Name: fpu_double
Created: Jan 16, 2009
Updated: Oct 11, 2014
SVN Updated: Feb 13, 2010
Bugs: 2 reported / 0 solved

Other project properties

Category: Arithmetic core
Development status: Alpha
Additional info: Design done
WishBone compliant: No
WishBone version: n/a


IEEE-754 compliant double-precision floating point unit. Four operations (addition, subtraction, multiplication, division) are supported, as are the four rounding modes (to nearest, toward zero, toward +inf, toward -inf). The unit also supports denormalized numbers, which many floating point units do not; they treat denormalized numbers as zero instead. The unit can run at clock frequencies of up to 185 MHz on a Virtex5 target device.
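To make the denormalized-number case concrete, here is a minimal Python sketch (not part of the fpu_double sources) that splits a double into its IEEE-754 fields and flags denormals. The helper names `decode_double` and `is_denormal` are made up for this illustration. A denormal has an all-zero exponent field and a nonzero fraction; a unit that flushes to zero would return 0 for such a value.

```python
import struct

def decode_double(x: float):
    """Split an IEEE-754 binary64 value into (sign, exponent field, fraction field)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF       # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)     # 52-bit stored fraction
    return sign, exponent, fraction

def is_denormal(x: float) -> bool:
    """True when the exponent field is zero but the fraction is not."""
    _, exponent, fraction = decode_double(x)
    return exponent == 0 and fraction != 0

smallest_normal = 2.0 ** -1022
print(decode_double(smallest_normal))     # (0, 1, 0)
print(is_denormal(smallest_normal / 2))   # True
print(smallest_normal / 2 == 0.0)         # False: the value is kept, not flushed to zero
```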


- The unit is designed to be synchronous to one global clock. All registers are updated on the rising edge of the clock.
- All registers can be reset with one global reset.
- The multiply operation is split up to take advantage of the 25 x 18 multiplier blocks in the Virtex5 DSP48E slices. Because each 25 x 18 block is a two's complement (signed) multiplier, it can perform a 24 x 17 unsigned multiply, so it takes 9 DSP48E slices to perform the 53 x 53 bit multiply required to multiply the mantissas of two double-precision floating point numbers.
- fpu_double.v is the top-level module. The input signals are:
- 1) clk
- 2) rst
- 3) enable
- 4) rmode (rounding mode)
- 5) fpu_op (operation code)
- 6) opa (64-bit floating point number)
- 7) opb (64-bit floating point number)
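The 53 x 53 bit multiply mentioned above can be pictured as a sum of shifted 24 x 17 partial products. The Python sketch below is only an illustration of that idea, not a description of the actual RTL. It assumes one operand is cut into three 24-bit pieces and the other into three 17-bit pieces (nine DSP-sized products), with the two leftover top bits of the second operand handled by a separate small term. That partitioning is an assumption chosen to match the count of 9 slices; the real design may divide the work differently.

```python
A_CHUNK = 24   # unsigned width usable on the 25-bit signed input
B_CHUNK = 17   # unsigned width usable on the 18-bit signed input
WIDTH = 53     # mantissa width of a double, including the hidden bit

def split(value, chunk, count):
    """Cut value into `count` little-endian pieces of `chunk` bits each."""
    mask = (1 << chunk) - 1
    return [(value >> (i * chunk)) & mask for i in range(count)]

def multiply_53x53(a, b):
    """Return (a * b, number of 24 x 17 partial products used)."""
    assert 0 <= a < (1 << WIDTH) and 0 <= b < (1 << WIDTH)
    a_parts = split(a, A_CHUNK, 3)    # 3 x 24 = 72 bits, covers all 53 bits of a
    b_parts = split(b, B_CHUNK, 3)    # 3 x 17 = 51 bits, covers bits 0..50 of b
    b_top = b >> (3 * B_CHUNK)        # leftover bits 51..52 of b
    product = 0
    dsp_products = 0
    for i, ap in enumerate(a_parts):
        for j, bp in enumerate(b_parts):
            # Each term fits a 24 x 17 unsigned multiply.
            product += (ap * bp) << (i * A_CHUNK + j * B_CHUNK)
            dsp_products += 1
    # Assumption: the 53 x 2 bit remainder is added outside the DSP slices.
    product += (a * b_top) << (3 * B_CHUNK)
    return product, dsp_products

a = (1 << 52) | 0x123456789ABCD   # example mantissas with the hidden bit set
b = (1 << 52) | 0xFEDCBA9876543
result, used = multiply_53x53(a, b)
print(result == a * b, used)      # True 9
```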

- The output signals are:
- 1) out (64-bit floating point output)
- 2) ready (goes high when the output is ready)
- 3) underflow
- 4) overflow
- 5) inexact
- 6) exception
- 7) invalid
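When writing a testbench around these ports, it helps to generate the 64-bit words for opa and opb and the word expected on out from a software reference. The following Python sketch is one way to do that; the function names and the string operation names are hypothetical and are not the fpu_op encodings, which this page does not list. Python floats are IEEE-754 doubles rounded to nearest, so the expected values only apply to the round-to-nearest mode.

```python
import operator
import struct

# Operation names used by this sketch only; they are not fpu_op codes.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.truediv,   # note: Python raises ZeroDivisionError for x / 0.0
}

def float_to_bits(x: float) -> int:
    """Return the 64-bit pattern to drive on opa or opb."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def bits_to_float(bits: int) -> float:
    """Interpret a 64-bit pattern read from out as a double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

def expected_out(op: str, opa_bits: int, opb_bits: int) -> int:
    """Reference result for round-to-nearest, as a 64-bit pattern."""
    a = bits_to_float(opa_bits)
    b = bits_to_float(opb_bits)
    return float_to_bits(OPS[op](a, b))

opa = float_to_bits(1.5)
opb = float_to_bits(0.25)
print(f"opa = {opa:016x}")                              # opa = 3ff8000000000000
print(f"out = {expected_out('add', opa, opb):016x}")    # out = 3ffc000000000000
```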

- Each operation takes the following number of clock cycles to complete:
- 1. addition: 20 clock cycles
- 2. subtraction: 21 clock cycles
- 3. multiplication: 24 clock cycles
- 4. division: 71 clock cycles

- These latencies are longer than those of some floating point units, because supporting denormalized numbers requires several more logic levels and therefore more clock cycles.
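For a sense of scale, the cycle counts above can be converted to wall-clock latency. The short Python sketch below does the arithmetic, assuming the unit is clocked at the quoted maximum of 185 MHz; a slower clock scales the figures proportionally.

```python
CYCLES = {
    "addition": 20,
    "subtraction": 21,
    "multiplication": 24,
    "division": 71,
}
F_MAX_HZ = 185e6   # maximum clock frequency quoted for a Virtex5 target

def latency_ns(cycles: int, f_hz: float = F_MAX_HZ) -> float:
    """Latency in nanoseconds for a given cycle count and clock frequency."""
    return cycles / f_hz * 1e9

for name, cycles in CYCLES.items():
    print(f"{name}: {cycles} cycles = {latency_ns(cycles):.1f} ns")
# addition: 20 cycles = 108.1 ns
# subtraction: 21 cycles = 113.5 ns
# multiplication: 24 cycles = 129.7 ns
# division: 71 cycles = 383.8 ns
```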


- version 1