Need Strassen Matrix Multiplication Algorithm Verilog Code
by srikanth.vlsi.2011 on Jul 15, 2016 |
srikanth.vlsi.2011
Posts: 1 Joined: Sep 22, 2015 Last seen: Nov 12, 2024 |
Dear Members,
Could anyone please share Verilog code for Strassen matrix multiplication? I am working on a project related to floating-point ALU operations. I have done a literature survey on this module, but I am unsure how to write the code in Verilog. Thanks and regards, Srikanth |
RE: Need Strassen Matrix Multiplication Algorithm Verilog Code
by jmmontanana on Jul 15, 2016 |
jmmontanana
Posts: 1 Joined: Mar 10, 2011 Last seen: Jul 15, 2016 |
If you can pay, we can develop it for you.
You can write to me at jm.montananaaliaga@york.ac.uk. Best regards, José Miguel |
RE: Need Strassen Matrix Multiplication Algorithm Verilog Code
by dgisselq on Jul 15, 2016 |
dgisselq
Posts: 247 Joined: Feb 20, 2015 Last seen: Oct 24, 2024 |
Hmm ... that's not a piece of code I personally have. Have you checked the projects directory at all?
On the other hand, if you are building a floating point unit (FPU), chances are what you need is a CPU and not a Verilog FPGA version of the matrix algorithm, no? In that case, a Verilog version doesn't make as much sense.
Either way, if you end up building it, will you be posting your results on OpenCores?
Dan |
RE: Need Strassen Matrix Multiplication Algorithm Verilog Code
by robfinch on Jul 16, 2016 |
robfinch
Posts: 28 Joined: Sep 29, 2005 Last seen: Nov 18, 2024 |
Strassen? I can't really help here, but I'd like to learn more about it myself. This has got my curiosity going as to how one would accelerate the matrix multiply. I can only suggest how I might approach it, given that I don't know of any HDL code that performs the operation.
Strassen's method decomposes the matrix recursively into smaller matrices until only a 2x2 matrix is left to be processed.
Assuming there are only a limited number of floating point functional units, it would need a state machine to manage the movement of sums and products between temporary storage and the functional units. Some means of tracking intermediate sums and products would have to be present. Ideally there would be parallel paths to the temporary storage.
I'd be tempted to make the minimum functional unit a product-of-sums unit, P = (A + B)(C + D), and provide multiple product-of-sums units so that the multiply can be parallelized. (A, B, C, or D would be zero in some cases.)
Depending on how much FP hardware is available, it may be possible to perform the 2x2 matrix multiply directly as a single operation. This reminds me of an out-of-order machine, where operations are stalled until all the operands for the op are valid. |
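As a rough software model of the product-of-sums idea (not HDL, and not code from anyone in this thread), the sketch below expresses the standard seven Strassen products for a 2x2 multiply as calls to a single P = (A + B)(C + D) function. The names `product_of_sums`, `strassen_2x2`, and `naive_2x2` are made up for illustration. Unused inputs are passed as zero, and the differences in Strassen's formulas are handled by negating an operand, which is an assumption about how such a unit might be driven.

```python
def product_of_sums(a, b, c, d):
    """Model of one hypothetical functional unit: P = (A + B)(C + D)."""
    return (a + b) * (c + d)


def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with seven product-of-sums operations."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # The seven Strassen products. A zero operand means that input of the
    # unit is unused; a negated operand turns a sum into a difference.
    m1 = product_of_sums(a11, a22, b11, b22)
    m2 = product_of_sums(a21, a22, b11, 0)
    m3 = product_of_sums(a11, 0, b12, -b22)
    m4 = product_of_sums(a22, 0, b21, -b11)
    m5 = product_of_sums(a11, a12, b22, 0)
    m6 = product_of_sums(a21, -a11, b11, b12)
    m7 = product_of_sums(a12, -a22, b21, b22)

    # Recombine the products into the four result entries (adds/subtracts only).
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


def naive_2x2(A, B):
    """Reference 2x2 multiply using eight multiplications, for comparison."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(strassen_2x2(A, B))
    print(naive_2x2(A, B))
```

The seven products have no dependencies on each other, so with seven such units they could in principle all be issued at once, and only the final additions need to wait for their operands. With floating point values the result can differ slightly from the naive multiply because the additions happen in a different order.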