OpenCores
URL https://opencores.org/ocsvn/fp_log/fp_log/trunk

Subversion Repositories fp_log

[/] [fp_log/] [trunk/] [README] - Diff between revs 2 and 3


Rev 2 Rev 3
======================================================================================================
DP-ICSI log  (C implementation of a fast logarithmic approximation unit based on ICSI log V2 0.6 Beta)
DP/SP LAU    (FPGA unit that implements the ICSI log algorithm in VHDL)
======================================================================================================
Version 0.2 beta
Build date: August 2nd, 2009

Introduction
------------
Software:
This package contains a C implementation of the ICSI logarithm approximation algorithm originally introduced in
O. Vinyals, G. Friedland: "A Hardware-Independent Fast Logarithm Approximation with Adjustable Accuracy".
Tenth IEEE International Symposium on Multimedia (ISM 2008), pp. 61-65, December 2008.
The new C function has been adjusted to support double precision inputs, in contrast to the official implementation of the algorithm,
which supports only single precision. Furthermore, it detects invalid inputs, which makes its handling of such inputs compatible with
the IEEE 754 standard and the GNU library log() function.
Hardware:
This package also contains a VHDL implementation of the ICSI logarithm approximation algorithm described in
N. Alachiotis, A. Stamatakis: "Efficient Floating-Point Logarithm Unit for FPGAs". Accepted for publication at the RAW workshop,
held in conjunction with IPDPS 2010, Atlanta, Georgia, April 2010.
The SP-LAU (Single Precision Logarithm Approximation Unit) implements the algorithm for single precision inputs.
The DP-LAU (Double Precision Logarithm Approximation Unit) implements the algorithm for double precision inputs.
Both units support invalid input detection.
All implementations in this package calculate an approximation of the natural logarithm.
For more details about the software implementation, see the respective readme file and paper for the ICSI log.
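
As a rough illustration of what invalid input detection in the style of log() involves, the sketch below classifies a double by its IEEE 754 bit fields and returns the value that the C library log() produces for special inputs. The function name dp_log_special and its return convention are hypothetical and are not taken from DP-ICSILog.c. The sketch assumes a 64-bit IEEE 754 double and passes subnormal inputs through as ordinary values.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from DP-ICSILog.c.
 * Returns 1 and stores the result in *out when x is a special input
 * (NaN, infinity, zero or negative); returns 0 for other positive x. */
static int dp_log_special(double x, double *out)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);                    /* read the bit pattern safely */

    const uint64_t sign = bits >> 63;                  /* 1-bit sign */
    const uint64_t exp  = (bits >> 52) & 0x7FF;        /* 11-bit biased exponent */
    const uint64_t man  = bits & ((1ULL << 52) - 1);   /* 52-bit mantissa field */

    if (exp == 0x7FF && man != 0) { *out = x;         return 1; } /* NaN in, NaN out */
    if (exp == 0 && man == 0)     { *out = -INFINITY; return 1; } /* log(+0) = log(-0) = -inf */
    if (sign)                     { *out = NAN;       return 1; } /* negative, including -inf */
    if (exp == 0x7FF)             { *out = INFINITY;  return 1; } /* log(+inf) = +inf */
    return 0;                                          /* let the approximation handle it */
}
```

A caller would test the return value first and fall back to the table-based approximation only when it is 0.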

Package Structure
-----------------
This package contains the following files and folders:
-README                    : This file.
-DP-ICSILog/DP-ICSILog.c   : C file that contains the implementation adjusted for double precision and an example of how to use the function.
-Virtex 5/SP-LAU           : This folder contains the VHDL source files as well as the .xco and .ngc files of the IPs that have been used to implement the single precision unit on Virtex 5.
-Virtex 5/DP-LAU           : This folder contains the VHDL source files as well as the .xco and .ngc files of the IPs that have been used to implement the double precision unit on Virtex 5.
-Virtex 4/SP-LAU           : This folder contains the VHDL source files as well as the .xco and .ngc files of the IPs that have been used to implement the single precision unit on Virtex 4.
-Virtex 4/DP-LAU           : This folder contains the VHDL source files as well as the .xco and .ngc files of the IPs that have been used to implement the double precision unit on Virtex 4.
-COE Files                 : This folder contains COE files to be used if one needs to adjust the accuracy of the unit.
-PAO Files                 : This folder contains PAO files that contain the Peripheral Analysis Order for the SP and DP LAUs.

Usage of the DP-ICSILog
-----------------------
The DP-ICSILog.c file contains the necessary global variables and the functions that need to be
called in order to use the DP-ICSILog function, as well as an example.

Interface of the LAU
--------------------
The toplevel module of the LAU is sp_fp_log_v2 for the single precision logarithm approximation unit
and dp_fp_log_v2 for the double precision one.
sp/dp : Single Precision / Double Precision
fp    : Floating Point
log   : Logarithm
V2    : Because the mantissa lookup table has been initialized using the respective function of the ICSILog V2 0.6 Beta software.
        (Version 2 of this function doubled the precision of the unit compared to Version 1.)
The interface of the unit is defined as follows:
entity sp_fp_log_v2/dp_fp_log_v2 is
        Port ( rst : in STD_LOGIC;          -- The reset signal
               clk : in STD_LOGIC;          -- The clock signal
               valid_in : in STD_LOGIC;     -- Signal that indicates a valid number at the input port of the unit.
               input_val : in STD_LOGIC_VECTOR(31/63 downto 0);     -- The input number.
               valid_out : out STD_LOGIC;   -- Signal that indicates a valid number at the output port of the unit.
               output_val : out STD_LOGIC_VECTOR(31/63 downto 0)    -- The output number, the approximation of the logarithm of the input number.
              );
end sp_fp_log_v2/dp_fp_log_v2;
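
To make the role of valid_in and valid_out concrete, the following is a small cycle-level software model of a fixed-latency unit, written as a sketch under assumptions rather than derived from the VHDL. It assumes that a new input can be accepted on every clock cycle and that valid_out follows valid_in after a fixed number of clock edges. The type lau_model, the functions lau_reset and lau_clock, and the result_bits argument (a stand-in for the value the datapath would compute) are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define LAU_MAX_LATENCY 64

/* Hypothetical software model of the port behaviour; not generated from the VHDL. */
typedef struct {
    unsigned latency;                   /* pipeline depth in clock cycles */
    unsigned head;                      /* slot written on the next clock edge */
    uint8_t  valid[LAU_MAX_LATENCY];    /* delayed copies of valid_in */
    uint64_t data[LAU_MAX_LATENCY];     /* delayed copies of the result bits */
} lau_model;

/* Models rst: clears all pipeline slots and sets the latency. */
static void lau_reset(lau_model *m, unsigned latency)
{
    memset(m, 0, sizeof *m);
    m->latency = (latency >= 1 && latency <= LAU_MAX_LATENCY) ? latency : 1;
}

/* Advance the model by one rising clock edge.
 * result_bits stands in for whatever the datapath computes from input_val. */
static void lau_clock(lau_model *m, int valid_in, uint64_t result_bits,
                      int *valid_out, uint64_t *output_val)
{
    /* The slot about to be overwritten was filled `latency` edges ago. */
    *valid_out  = m->valid[m->head];
    *output_val = m->data[m->head];
    m->valid[m->head] = (uint8_t)(valid_in != 0);
    m->data[m->head]  = result_bits;
    m->head = (m->head + 1) % m->latency;
}
```

In this model, a value presented with valid_in = 1 on one call to lau_clock appears with valid_out = 1 exactly `latency` calls later, and output_val should be ignored whenever valid_out is 0.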

Implementation Details
----------------------
The VHDL units have been designed using the Xilinx 10.1 Design Suite.
ISE 10.1 was used to create the units.
Coregen was used to create all the IPs used in the units.
The released LAUs use a mantissa lookup table with 4,096 entries.
Target devices are Virtex 4 and Virtex 5 FPGAs.
To use the units on another FPGA, one needs to replace the IPs used. The device must provide enough block RAMs (the number
depends on the desired accuracy and thus on the size of the mantissa lookup table) and enough DSP slices (3 DSP slices are occupied).
One can use the COE files in the COE Files folder to regenerate the mantissa lookup table for a different accuracy and resource occupation.
Both units have a latency of 22 cycles on Virtex 5 and 28 cycles on Virtex 4. The latency does not depend on the size of the mantissa lookup table and thus not on the accuracy.
The released units occupy 2% of the hardware resources on the Virtex 5 SX95T FPGA and can operate at the following clock frequencies,
as reported by the static timing report:
353.4 MHz for the SP-LAU on the V5SX95T-2 and
320.6 MHz for the DP-LAU on the V5SX95T-2.
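
The relation between table size, memory, and accuracy can be made concrete with a little arithmetic. The sketch below assumes that the table holds midpoint samples of log2 over [1, 2), and it ignores the rounding of entries to the stored ROM word width as well as rounding in the floating-point add and multiply. The helper names rom_bits and lut_abs_error_bound are hypothetical.

```c
#include <math.h>
#include <stdint.h>

/* Storage needed by a ROM with the given depth and word width, in bits. */
static uint64_t rom_bits(uint64_t depth, unsigned width)
{
    return depth * width;
}

/* Upper bound on the absolute error of ln(x) caused by indexing the table
 * with only index_bits mantissa bits, assuming midpoint samples.
 * Within one interval of width w = 2^-index_bits, log2 changes by at most
 * (w / 2) / ln(2) from the midpoint; scaling by ln(2) leaves w / 2. */
static double lut_abs_error_bound(unsigned index_bits)
{
    return ldexp(1.0, -(int)(index_bits + 1));   /* 2^-(index_bits + 1) */
}
```

For the released configuration, rom_bits(4096, 27) gives 110,592 bits for the mantissa table, and lut_abs_error_bound(12) gives 2^-13, about 1.22e-4. Under these assumptions, each additional index bit doubles the table and halves the bound.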

Verification Details
--------------------
Modelsim 6.3f was used for extensive post-place-and-route simulations.
The development board HTG-V5-PCIE by HiTech Global, populated with a V5SX95T-1 FPGA, was used to verify the LAUs.
ChipScope Pro Analyzer was used for advanced on-chip debugging and verification of the units.

IP Configuration Details for the Virtex 5 LAUs
----------------------------------------------
The IPs used for the implementations are the following:
(The configuration options that are not mentioned were not selected.)

comp_eq_000000000000:
Comparator,
Operation: A=B,
Data Type: Unsigned,
Input Width: 12,
Port B Constant: 000000000000,
Pipeline Stages: 0,
Output Options: Registered Output,
Synchronous Settings: Clear

comp_eq_000000000000000:
Comparator,
Operation: A=B,
Data Type: Unsigned,
Input Width: 15,
Port B Constant: 000000000000000,
Pipeline Stages: 0,
Output Options: Registered Output,
Synchronous Settings: Clear

comp_eq_8ones:
Comparator,
Operation: A=B,
Data Type: Unsigned,
Input Width: 8,
Port B Constant: 11111111,
Pipeline Stages: 0,
Output Options: Registered Output,
Synchronous Settings: Clear

comp_eq_11ones:
Comparator,
Operation: A=B,
Data Type: Unsigned,
Input Width: 11,
Port B Constant: 11111111111,
Pipeline Stages: 0,
Output Options: Registered Output,
Synchronous Settings: Clear

comp_eq_22zeros:
Comparator,
Operation: A=B,
Data Type: Unsigned,
Input Width: 22,
Port B Constant: 00000...0000,
Pipeline Stages: 0,
Output Options: Registered Output,
Synchronous Settings: Clear

comp_eq_51zeros:
Comparator,
Operation: A=B,
Data Type: Unsigned,
Input Width: 51,
Port B Constant: 00000...0000,
Pipeline Stages: 0,
Output Options: Registered Output,
Synchronous Settings: Clear

comp_eq_111111:
Comparator,
Operation: A=B,
Data Type: Unsigned,
Input Width: 6,
Port B Constant: 111111,
Pipeline Stages: 0,
Output Options: Registered Output,
Synchronous Settings: Clear

comp_eq_111111111:
Comparator,
Operation: A=B,
Data Type: Unsigned,
Input Width: 9,
Port B Constant: 111111111,
Pipeline Stages: 0,
Output Options: Registered Output,
Synchronous Settings: Clear

exp_lut_MEM:
Block Memory Generator,
Memory Type: Single Port ROM,
Read Width: 9,
Read Depth: 128

mant_lut_MEM:
Block Memory Generator,
Memory Type: Single Port ROM,
Read Width: 27,
Read Depth: 4096 (depends on the desired accuracy)

All the registers used are RAM-based Shift Registers. The width and depth of each register are indicated by its name.
For example: reg_1b_1c is a register of 1 bit and 1 clock cycle of latency.

sp_fp_add:
Floating Point,
Operation Selection: Add,
Precision: Single,
Architecture Optimization: High Speed,
Family Optimizations: Full Usage,
Latency and Rate Configuration: Use Maximum Latency

sp_fp_mult:
Floating Point,
Operation Selection: Multiply,
Precision: Single,
Architecture Optimization: High Speed,
Family Optimizations: Medium Usage,
Latency and Rate Configuration: Use Maximum Latency

Note:
For the Virtex 4 LAUs, the Coregen Project Settings were changed from Virtex 5 to Virtex 4 and all the above IPs were regenerated under the new project settings.
The only exception is the RAM-based Shift Registers that operate in parallel with the sp_fp_add and sp_fp_mult IPs: their depth (clock delay)
was changed to match the latency of the regenerated sp_fp_add and sp_fp_mult IPs.

Authors and Contact Details
---------------------------
Nikos Alachiotis                        alachiot@in.tum.de
Alexandros Stamatakis           stamatak@in.tum.de

Copyright
---------
These programs are free software; you can redistribute it and/or modify
Rev 2: it under the terms of the GNU General Public License as published by
Rev 3: it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
The programs are distributed in the hope that they will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

Further Information
-------------------
The FPGA units SP-LAU and DP-LAU are exact implementations of the SP-ICSILog and the DP-ICSILog algorithms, respectively.
Furthermore, they detect invalid inputs such as NaN, inf, -inf, or zero.
For more information on the LAU see the paper:
N. Alachiotis, A. Stamatakis: "Efficient Floating-Point Logarithm Unit for FPGAs". Accepted for publication at the RAW workshop,
held in conjunction with IPDPS 2010, Atlanta, Georgia, April 2010.
For more information on the ICSI log algorithm see the paper:
O. Vinyals, G. Friedland: "A Hardware-Independent Fast Logarithm Approximation with Adjustable Accuracy".
Tenth IEEE International Symposium on Multimedia (ISM 2008), pp. 61-65, December 2008.
and/or download the official single precision C implementation from:
http://linux.softpedia.com/get/Programming/Libraries/ICSILog-41333.shtml

Citation
--------
By using this component you agree to cite it as: "Efficient Floating-Point Logarithm Unit for FPGAs", by Nikos Alachiotis and Alexandros Stamatakis, accepted for publication at the RAW workshop, held in conjunction with IPDPS 2010.

Release Notes
-------------
Version : 0.2 beta
Build date : September 20th, 2009
 * support for Virtex 4 FPGAs as well
 * FPGA verification
Version : 0.1 beta
Build date : August 2nd, 2009
 * support for Virtex 5 FPGAs only
 * Tested by using extensive post-place-and-route simulations.
 
 
