# FPU Double VHDL :: Overview

## Details

Name: fpu_double

Created: Jan 16, 2009

Updated: Oct 11, 2014

SVN Updated: Feb 13, 2010

## Other project properties

Category: Arithmetic core

Language: VHDL

Development status: Alpha

Additional info: Design done

WishBone Compliant: No

License:

## Description

IEEE-754 compliant double-precision floating point unit. Four operations (addition, subtraction, multiplication, division) are supported, as are the four rounding modes (to nearest, toward zero, toward +infinity, toward -infinity). The unit also supports denormalized numbers, which is uncommon: many floating point units treat denormalized numbers as zero. The unit can run at clock frequencies up to 185 MHz on a Virtex5 target device.
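To make "denormalized" concrete, the following minimal Python sketch splits a double into its sign, exponent, and fraction fields and classifies it. It is a software illustration only, not part of the core; the helper names `decode_double` and `classify` are hypothetical.

```python
import struct

def decode_double(x: float):
    """Split a Python float into its IEEE-754 binary64 fields."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF      # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)    # 52-bit fraction field
    return sign, exponent, fraction

def classify(x: float) -> str:
    """Name the IEEE-754 class of x from its exponent and fraction fields."""
    _, exponent, fraction = decode_double(x)
    if exponent == 0:
        # All-zero exponent: either zero or a denormalized number.
        return "zero" if fraction == 0 else "denormal"
    if exponent == 0x7FF:
        return "infinity" if fraction == 0 else "nan"
    return "normal"

smallest_normal = 2.0 ** -1022
print(classify(smallest_normal))      # normal
print(classify(smallest_normal / 2))  # denormal
```

A unit that flushes denormals would return zero for the second value; a unit with denormal support, such as the one described here, has to keep it.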

## Features

- The unit is designed to be synchronous to one global clock. All registers are updated on the rising edge of the clock.

- All registers can be reset with one global reset.

- The multiply operation is broken up to take advantage of the 25 x 18 multiply blocks in the Virtex5 DSP48E slices. Each 25 x 18 two's complement multiplier can perform a 24 x 17 unsigned multiply, because one bit of each input is reserved for the sign, so it takes 9 DSP48E slices to perform the 53 x 53 bit multiply required to multiply two double-precision floating point numbers.
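The decomposition can be sketched in Python as a sum of shifted partial products. Three 24-bit limbs times three 17-bit limbs give nine DSP-sized products, which cover 72 x 51 bits. How the core handles the two remaining high bits of the second operand is not stated in this overview, so the sketch adds them with a plain shift-and-add; that part, and the names `split` and `mul53`, are assumptions.

```python
import random

A_LIMB = 24  # unsigned width usable on the 25-bit multiplier input
B_LIMB = 17  # unsigned width usable on the 18-bit multiplier input

def split(value, width, count):
    """Cut value into `count` limbs of `width` bits, least significant first."""
    mask = (1 << width) - 1
    return [(value >> (i * width)) & mask for i in range(count)]

def mul53(a, b):
    """Return (a * b, number of 24 x 17 products used) for 53-bit a and b."""
    assert 0 <= a < (1 << 53) and 0 <= b < (1 << 53)
    a_limbs = split(a, A_LIMB, 3)   # 3 x 24 = 72 bits, covers all of a
    b_limbs = split(b, B_LIMB, 3)   # 3 x 17 = 51 bits, the low part of b
    b_top = b >> (3 * B_LIMB)       # the 2 high bits of b left over

    product = 0
    dsp_products = 0
    for i, ai in enumerate(a_limbs):
        for j, bj in enumerate(b_limbs):
            # Each term fits one 24 x 17 unsigned multiply.
            product += (ai * bj) << (i * A_LIMB + j * B_LIMB)
            dsp_products += 1
    # Assumption: the leftover bits are handled outside the nine multiplies.
    product += (a * b_top) << (3 * B_LIMB)
    return product, dsp_products

# 53-bit mantissas with the implicit leading one set.
a = (1 << 52) | random.getrandbits(52)
b = (1 << 52) | random.getrandbits(52)
product, count = mul53(a, b)
assert product == a * b
print(count)  # 9
```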

- fpu_double.v is the top-level module. The input signals are:
  1. clk
  2. rst
  3. enable
  4. rmode (rounding mode)
  5. fpu_op (operation code)
  6. opa (64-bit floating point number)
  7. opb (64-bit floating point number)

- The output signals are:
  1. out (64-bit floating point output)
  2. ready (goes high when the output is ready)
  3. underflow
  4. overflow
  5. inexact
  6. exception
  7. invalid
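When simulating the core, a small software reference model is handy for computing the value expected on out once ready goes high. The Python sketch below is one possible model, not part of the project. The numeric encodings of fpu_op and rmode are not given in this overview; the values used here are assumptions that simply follow the order in which the description lists the operations and rounding modes. Only round to nearest is modelled, and the status flags are not.

```python
import struct

# Assumed encodings (hypothetical, not taken from the core's source).
FPU_OP = {"add": 0, "sub": 1, "mul": 2, "div": 3}
RMODE_NEAREST = 0

def to_bits(x: float) -> int:
    """64-bit word as it would be driven onto opa or opb."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def from_bits(bits: int) -> float:
    """Interpret a 64-bit word as an IEEE-754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

def fpu_double_model(rmode: int, fpu_op: int, opa: int, opb: int) -> int:
    """Return the expected 64-bit `out` word for the given inputs."""
    if rmode != RMODE_NEAREST:
        raise NotImplementedError("only round to nearest is modelled")
    a, b = from_bits(opa), from_bits(opb)
    if fpu_op == FPU_OP["add"]:
        result = a + b
    elif fpu_op == FPU_OP["sub"]:
        result = a - b
    elif fpu_op == FPU_OP["mul"]:
        result = a * b
    elif fpu_op == FPU_OP["div"]:
        if b == 0.0:
            # Python raises on float division by zero, so it is left out.
            raise NotImplementedError("division by zero is not modelled")
        result = a / b
    else:
        raise ValueError("unknown fpu_op")
    return to_bits(result)

expected = fpu_double_model(RMODE_NEAREST, FPU_OP["add"],
                            to_bits(1.5), to_bits(2.25))
print(f"{expected:016x}")  # 400e000000000000
```

The model relies on Python floats being IEEE-754 doubles rounded to nearest, which holds on common platforms. A testbench would compare this word with out in the cycle where ready is high.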

- Each operation takes the following number of clock cycles to complete:
  1. addition: 20 clock cycles
  2. subtraction: 21 clock cycles
  3. multiplication: 24 clock cycles
  4. division: 71 clock cycles

- These latencies are longer than in some floating point units, because support for denormalized numbers requires several more logic levels and therefore more clock cycles.
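The cycle counts translate into wall-clock latency once a clock frequency is fixed. The short Python sketch below assumes the core runs at the 185 MHz maximum quoted in the description; a real design may close timing at a lower frequency, in which case the times scale accordingly. The names `CLOCK_HZ`, `LATENCY_CYCLES`, and `latency_ns` are illustrative.

```python
CLOCK_HZ = 185e6  # assumed: the maximum clock quoted for a Virtex5 target

LATENCY_CYCLES = {
    "addition": 20,
    "subtraction": 21,
    "multiplication": 24,
    "division": 71,
}

def latency_ns(cycles: int, clock_hz: float = CLOCK_HZ) -> float:
    """Latency in nanoseconds for an operation taking `cycles` clock cycles."""
    return cycles / clock_hz * 1e9

for op, cycles in LATENCY_CYCLES.items():
    print(f"{op:15s}{cycles:3d} cycles  {latency_ns(cycles):6.1f} ns")
```

At 185 MHz this gives roughly 108.1 ns for addition, 113.5 ns for subtraction, 129.7 ns for multiplication, and 383.8 ns for division.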