OpenCores
Need Strassen Matrix Multiplication Algorithm Verilog Code
by srikanth.vlsi.2011 on Jul 15, 2016
srikanth.vlsi.2011
Posts: 1
Joined: Sep 22, 2015
Last seen: Nov 12, 2024
Dear Members,

Can anyone please share Verilog code for Strassen matrix multiplication? I am working on a project related to floating-point ALU operations. I have done a literature survey on this module, but I am confused about how to code it in Verilog.

Thanks and Regards,
Srikanth

RE: Need Strassen Matrix Multiplication Algorithm Verilog Code
by jmmontanana on Jul 15, 2016
jmmontanana
Posts: 1
Joined: Mar 10, 2011
Last seen: Jul 15, 2016
If you can pay, then we can develop it for you.

You can write me at jm.montananaaliaga@york.ac.uk

Best regards
José Miguel
RE: Need Strassen Matrix Multiplication Algorithm Verilog Code
by dgisselq on Jul 15, 2016
dgisselq
Posts: 247
Joined: Feb 20, 2015
Last seen: Oct 24, 2024
Hmm ... that's not a piece of code I personally have. Have you checked the projects directory at all?

On the other hand, if you are building a floating point unit (FPU), chances are what you need is a CPU and not a Verilog FPGA version of the matrix algorithm, no? In that case, a Verilog version doesn't make as much sense.

Either way, if you end up building it, will you be posting your results on OpenCores?

Dan

RE: Need Strassen Matrix Multiplication Algorithm Verilog Code
by robfinch on Jul 16, 2016
robfinch
Posts: 28
Joined: Sep 29, 2005
Last seen: Nov 18, 2024
Strassen? I can't really help here, but I'd like to learn more about it myself. This has got my curiosity going as to how one would accelerate the matrix multiply. I can only suggest how I might approach it, given that I don't know of any HDL code that performs the operation.
Strassen's method decomposes the matrices recursively into smaller matrices until only 2x2 matrices remain to be processed. Assuming there will be only a limited number of floating-point functional units, it would need a state machine to manage the movement of sums and products between temporary storage and the functional units. Some means of tracking intermediate sums and products would have to be present. Ideally there would be parallel paths to the temporary storage.
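To make the recursion concrete, here is a rough software reference model in Python rather than HDL. It is only a sketch of the textbook Strassen formulas, assuming square matrices whose size is a power of two; the helper names (add, sub, split, strassen) are made up for illustration. Something like this could serve as a golden model to check a hardware version against.

```python
def add(X, Y):
    """Element-wise sum of two equally sized matrices (lists of rows)."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Split a square matrix into its four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B):
    """Multiply A by B. Assumes n x n inputs with n a power of two."""
    n = len(A)
    if n == 1:
        # Scalar base case. A 2x2 input therefore costs exactly seven
        # scalar multiplies, one per recursive call below.
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # The seven Strassen products.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    # Recombine the products into the four result quadrants.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

In hardware the recursion would presumably be unrolled or driven by the state machine, with the M and C values living in the temporary storage.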
I'd be tempted to make the minimum functional unit a product-of-sums unit, P = (A + B)(C + D), and provide multiple product-of-sums units so that the multiply can be parallelized. (A, B, C, or D would be zero in some cases.) Depending on how much FP hardware is available, it may be possible to perform the 2x2 matrix multiply directly as a single operation.
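As a quick sanity check of the product-of-sums idea, the seven Strassen products for a 2x2 multiply can each be written as one P = (A + B)(C + D) operation. The Python sketch below is only a behavioral model; pos_unit and strassen_2x2 are hypothetical names. One assumption beyond zero operands: some of the products need a difference, so the model lets the unit take a negated operand (for IEEE floating point that is just a sign-bit flip).

```python
def pos_unit(a, b, c, d):
    """Hypothetical product-of-sums unit: P = (a + b) * (c + d)."""
    return (a + b) * (c + d)


def strassen_2x2(A, B):
    """2x2 multiply using seven product-of-sums operations."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # Unused operands are tied to zero; differences use a negated operand.
    m1 = pos_unit(a11, a22, b11, b22)
    m2 = pos_unit(a21, a22, b11, 0.0)
    m3 = pos_unit(a11, 0.0, b12, -b22)
    m4 = pos_unit(a22, 0.0, b21, -b11)
    m5 = pos_unit(a11, a12, b22, 0.0)
    m6 = pos_unit(a21, -a11, b11, b12)
    m7 = pos_unit(a12, -a22, b21, b22)
    # Final sums that combine the products into the result matrix.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]
```

The seven products have no dependencies on each other, so with seven such units they could all be issued at once, leaving only the final additions.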
This reminds me of an out-of-order machine where operations are stalled until all the operands for the op are valid.
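A toy model of that stall-until-valid behavior, again in Python and only as a sketch: each operation lists the results it waits on, and on every cycle up to a fixed number of functional units issue whatever is ready. The single-cycle latency, the single shared pool of units, and treating each output sum as one operation are all simplifying assumptions, and the names (schedule, ops) are made up for illustration.

```python
def schedule(ops, units, latency=1):
    """Return {op name: issue cycle} for a dataflow-style issue policy.

    ops maps each operation name to the names of the results it waits on.
    At most `units` operations issue per cycle; a result becomes valid
    `latency` cycles after its operation issues.
    """
    pending = dict(ops)
    issue = {}
    valid_at = {}
    cycle = 0
    while pending:
        # An op is ready once every operand it waits on is valid.
        ready = [name for name, deps in pending.items()
                 if all(valid_at.get(d, float("inf")) <= cycle for d in deps)]
        if not ready and all(v <= cycle for v in valid_at.values()):
            raise ValueError("deadlock: some operand is never produced")
        for name in ready[:units]:
            issue[name] = cycle
            valid_at[name] = cycle + latency
            del pending[name]
        cycle += 1  # ops that were not issued stall for another cycle
    return issue


# Dependency graph of a 2x2 Strassen multiply: seven independent
# products, then four output sums that wait on them.
ops = {
    "m1": [], "m2": [], "m3": [], "m4": [], "m5": [], "m6": [], "m7": [],
    "c11": ["m1", "m4", "m5", "m7"],
    "c12": ["m3", "m5"],
    "c21": ["m2", "m4"],
    "c22": ["m1", "m2", "m3", "m6"],
}
issue_cycles = schedule(ops, units=2)
```

Changing the units argument gives a rough feel for how much the limited number of functional units stretches the schedule.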