OpenCores

Question on Modular Multipliers

Question on Modular Multipliers by Unknown on May 14, 2004
I am the author of the Basic RSA project on Open Cores. It does what I originally intended it to do, but frankly, its performance is embarrassing. I have been trying to wade through papers on high radix modular multipliers, including Montgomery, and something does not seem right to me.

If I understand correctly, the speed advantages in these systems come from pre-calculating the possible products (i.e. 1x1, 1x2,...4x3, 4x4 for radix 4) and adding those, instead of the traditional shift and add multiplication. Unfortunately, it seems to me that this can only be effective for a non-modular multiplier, since massive adders and subtractors are still required to be chained together to get the correct modular result. What am I missing here?

Also, I have seen some papers on partial adders that claim to accelerate large multipliers (>512 bits) considerably. I can see how clock rates can be accelerated dramatically, but I cannot see how the throughput is improved. It seems to me that carries must still be propagated which would cancel most of the benefit of the faster clock. I have developed a partitioned adder that doubles the speed of 1024-bit add operations at the cost of 4 times the circuitry, but this does not seem cost-effective to me, and it does not scale well.

If anyone can point me to some nice, clear descriptions of high performance large number adders or multipliers, preferably with lots of examples, or can shed some light on the subject personally, I would greatly appreciate it.

Thank you!

Steve

--
Steven R. McQueen <srmcqueen at mcqueentech.com>
McQueen Technologies, Inc.
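For readers following the question about high radix Montgomery multipliers, here is a minimal Python sketch of a radix-4 Montgomery multiplication. It is only an illustration under stated assumptions, not the Basic RSA core's design: the function names (`mont_mul_radix4`, `to_mont`, `from_mont`) and the `num_digits` parameter are hypothetical, the modulus is assumed odd, and Python 3.8+ is assumed for `pow(n, -1, 4)`. The pre-calculated values are small multiples of one operand and of the modulus, selected two bits at a time.

```python
def mont_mul_radix4(a, b, n, num_digits):
    """Return a * b * R^-1 mod n, where R = 4**num_digits.

    Assumes n is odd, 0 <= a, b < n, and n < 4**num_digits.
    """
    assert n % 2 == 1 and 0 <= a < n and 0 <= b < n
    assert n < 4 ** num_digits

    # Pre-calculated multiples, selected by a 2-bit digit (radix 4).
    b_multiples = [0, b, 2 * b, 3 * b]
    n_multiples = [0, n, 2 * n, 3 * n]

    # n_prime = -n^-1 mod 4, a 2-bit constant for a given modulus.
    n_prime = (-pow(n, -1, 4)) % 4

    s = 0
    for i in range(num_digits):
        a_digit = (a >> (2 * i)) & 3      # next two bits of a
        s += b_multiples[a_digit]         # add selected multiple of b
        q = ((s & 3) * n_prime) & 3       # depends only on the low 2 bits of s
        s += n_multiples[q]               # makes the low 2 bits of s zero
        s >>= 2                           # exact division by 4

    # With a, b < n the loop leaves s < 2n, so one conditional
    # subtraction at the very end is enough.
    if s >= n:
        s -= n
    return s


def to_mont(x, n, num_digits):
    """Convert x into Montgomery form: x * R mod n."""
    return (x << (2 * num_digits)) % n


def from_mont(x, n, num_digits):
    """Convert x out of Montgomery form: x * R^-1 mod n."""
    return mont_mul_radix4(x, 1, n, num_digits)


if __name__ == "__main__":
    n, digits = 239, 4                    # 4**4 = 256 > 239
    am = to_mont(100, n, digits)
    bm = to_mont(200, n, digits)
    product = from_mont(mont_mul_radix4(am, bm, n, digits), n, digits)
    print(product == (100 * 200) % n)
```

In this sketch the choice of `q` looks only at the two least significant bits of the running sum, so no full-width comparison against the modulus happens inside the loop; the only comparison and subtraction is the one after the last iteration. Whether that maps well onto a particular FPGA or ASIC adder structure is a separate question that the sketch does not address.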

Question on Modular Multipliers by Unknown on May 14, 2004
All these different multiplier organizations have one thing in common: they use carry-save adders. In fact, a traditional (pipelined) shift-add multiplier can be written to use carry-save adders. Just browse the internet, or get a copy of Parhami's "Computer Arithmetic" book.

Cheers,
Richard

-----Original Message-----
From: cores-bounces at opencores.org [mailto:cores-bounces at opencores.org] On Behalf Of Steven R. McQueen
Sent: Friday, May 14, 2004 7:52 AM
To: Discussion list about free open source IP cores
Subject: [oc] Question on Modular Multipliers
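The carry-save idea from the reply above can be sketched in a few lines of Python. This is a behavioural illustration only, not hardware code and not taken from any OpenCores project; the function names `carry_save_add` and `shift_add_multiply_csa` are hypothetical. It shows a shift-add multiplier that keeps its running total as two words (a sum word and a carry word), so each step is purely bitwise, and carries are propagated once at the end.

```python
def carry_save_add(x, y, z):
    """3:2 compressor: return (s, c) such that s + c == x + y + z.

    Each output bit depends only on the three input bits in the same
    column, so no carry ripples across the word.
    """
    s = x ^ y ^ z                               # per-bit sum
    c = ((x & y) | (x & z) | (y & z)) << 1      # per-bit majority, shifted left
    return s, c


def shift_add_multiply_csa(a, b, width):
    """Multiply a by the low `width` bits of b using carry-save accumulation."""
    s, c = 0, 0                                 # running total kept as s + c
    for i in range(width):
        if (b >> i) & 1:
            # Fold the shifted multiplicand into the redundant total.
            s, c = carry_save_add(s, c, a << i)
    # One ordinary carry-propagate addition resolves the result.
    return s + c


if __name__ == "__main__":
    print(shift_add_multiply_csa(12345, 6789, 16) == 12345 * 6789)
```

The point of the sketch is the accounting: the loop performs `width` carry-free steps, and only the final `s + c` needs a full carry-propagate adder. That is one way to reconcile a faster clock with better throughput, since the long carry chain is paid once per multiplication instead of once per partial product.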

© copyright 1999-2025 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.