Before you read
This is a brief overview of an article on a series of multiplication algorithms. For a comparison and evaluation of the proposed algorithms, please refer to the full article... (see the PDF file in the downloads).
Overview
Multiplication is a very important operation in microelectronics. Each modern microprocessor includes it in its instruction set, and advanced microprocessors have dedicated multiplication units that perform a multiplication in one clock cycle. Multiplication is especially valuable in DSP processors, where it is practically the main operation. The performance of a DSP processor is determined by the delays in its MAC (multiply and accumulate) unit, so the efficiency of multiplication is very important.
Methodology Overview.
The idea behind the algorithms is as follows. The product of unsigned multiplicands A and D may be represented in the following form: A*D = (B * 2^n + C) * (E * 2^n + F), where n is any number that satisfies the following conditions:
«Hierarchical» algorithm.
As follows from the theory of algorithms, the best timing efficiency should be expected when the widths of operands B, C, E and F (see the basic formula) are equal at every algorithm call, i.e. n = m/2, where m is the operand width. In this case the number of recursions is minimal, and so is the number of sums that take part in the final result.
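Expanding the basic formula gives A*D = B*E * 2^(2n) + (B*F + C*E) * 2^n + C*F, so one m-bit product reduces to four half-width products, two shifts and three additions. The following Python sketch models that recursion in software with n = m/2 at every call. It is only an illustration, not the author's VHDL: the function name `hierarchical_mul`, the restriction of m to powers of two, and the 1-bit AND base case are assumptions made here.

```python
def hierarchical_mul(a: int, d: int, m: int) -> int:
    """Multiply two unsigned m-bit integers by recursive halving.

    Software model only. Assumes m is a power of two so that
    B, C, E and F all have the same width n = m/2 at every call.
    """
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a positive power of two")
    if a < 0 or d < 0 or a >> m or d >> m:
        raise ValueError("operands must be unsigned m-bit values")

    if m == 1:
        return a & d  # a 1-bit product is a logical AND

    n = m // 2
    mask = (1 << n) - 1
    b, c = a >> n, a & mask  # A = B * 2^n + C
    e, f = d >> n, d & mask  # D = E * 2^n + F

    # Four half-width products, each computed by the same algorithm.
    be = hierarchical_mul(b, e, n)
    bf = hierarchical_mul(b, f, n)
    ce = hierarchical_mul(c, e, n)
    cf = hierarchical_mul(c, f, n)

    # A*D = B*E * 2^(2n) + (B*F + C*E) * 2^n + C*F
    return (be << (2 * n)) + ((bf + ce) << n) + cf
```

For example, `hierarchical_mul(0xAB, 0xCD, 8)` returns the same value as `0xAB * 0xCD`.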
Modified «hierarchical» algorithm.
This algorithm is an attempt to improve the “hierarchical” algorithm for wide operands by replacing one multiplication with several addition operations. For the commonly used widths (8 to 64 bits), however, the result was not as expected. The advantages of the algorithm are expected to appear for m in [128..∞), where it may be preferable to the prototype.
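The overview does not say which multiplication is replaced. One well-known identity that trades one of the four half-width products for extra additions and subtractions is B*F + C*E = (B + C) * (E + F) - B*E - C*F (the Karatsuba identity). The Python sketch below assumes that substitution; the name `three_product_mul` and the 4-bit leaf width are hypothetical choices, and the actual core may differ.

```python
LEAF_BITS = 4  # hypothetical width at which the recursion stops


def three_product_mul(a: int, d: int, m: int) -> int:
    """Multiply two unsigned m-bit integers using three sub-products.

    Assumed substitution (not confirmed by the source):
    B*F + C*E = (B + C) * (E + F) - B*E - C*F
    """
    if m < 1 or a < 0 or d < 0 or a >> m or d >> m:
        raise ValueError("operands must be unsigned m-bit values")

    if m <= LEAF_BITS:
        return a * d  # stands in for a small leaf multiplier

    n = m // 2
    mask = (1 << n) - 1
    b, c = a >> n, a & mask  # A = B * 2^n + C
    e, f = d >> n, d & mask  # D = E * 2^n + F

    be = three_product_mul(b, e, m - n)
    cf = three_product_mul(c, f, n)
    # The sums B + C and E + F need one extra bit.
    mid = three_product_mul(b + c, e + f, m - n + 1)

    # A*D = B*E * 2^(2n) + (mid - B*E - C*F) * 2^n + C*F
    return (be << (2 * n)) + ((mid - be - cf) << n) + cf
```

In this sketch the sums B + C and E + F are one bit wider than the halves, and two subtractions are added per level, which may be part of the reason the saved multiplier does not pay off at small widths.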
"Hierarchical" integer multiplication unit characteristics
The algorithm was written in VHDL and synthesized with Synopsys Design Compiler on a 0.35 µm CMOS library. The area figures are given only for the combinational part of the algorithm.
Operands Width (bits) | Delay (ns) | Gates allocated |
8 | 9.56 | 760 |
16 | 15.15 | 2505 |
32 | 23.12 | 9355 |
64 | 35.43 | 33805 |
"Optimized Hierarchical" multiplication IP core characteristics
The algorithm was written in VHDL and synthesized with Synopsys Design Compiler on a 0.35 µm CMOS library. The area figures are given only for the combinational part of the algorithm.
Operands Width (bits) | Delay (ns) | Gates allocated |
8 | 14.28 | 1015 |
16 | 21.76 | 3585 |
32 | 33.85 | 11240 |
64 | 56.48 | 30368 |
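To make the two tables easier to compare, the following Python snippet divides each figure for the "Optimized Hierarchical" core by the corresponding figure for the "Hierarchical" core. The numbers are copied from the tables above; the helper name `compare_cores` is hypothetical.

```python
# width in bits -> (delay in ns, gates allocated), copied from the tables
HIERARCHICAL = {8: (9.56, 760), 16: (15.15, 2505),
                32: (23.12, 9355), 64: (35.43, 33805)}
OPTIMIZED = {8: (14.28, 1015), 16: (21.76, 3585),
             32: (33.85, 11240), 64: (56.48, 30368)}


def compare_cores():
    """Return width -> (delay ratio, gate ratio), optimized / hierarchical."""
    ratios = {}
    for width, (delay, gates) in HIERARCHICAL.items():
        opt_delay, opt_gates = OPTIMIZED[width]
        ratios[width] = (opt_delay / delay, opt_gates / gates)
    return ratios


for width, (delay_ratio, gate_ratio) in compare_cores().items():
    print(f"{width:>2} bit: delay x{delay_ratio:.2f}, gates x{gate_ratio:.2f}")
```

With these figures the delay ratio is above 1 at every listed width, and the gate ratio falls below 1 only at 64 bits.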
These cores are developed and provided by Vladimir V. Erokhin, a member of the ASIC research department of DeverSYS Corp. More useful FREE IP cores, fundamental and otherwise, can be found on the DeverSYS website, www.deversys.com.