OpenCores
URL https://opencores.org/ocsvn/amber/amber/trunk

Subversion Repositories amber

[/] [amber/] [trunk/] [doc/] [WishList.txt] - Diff between revs 26 and 48

Go to most recent revision | Only display areas with differences | Details | Blame | View Log

Rev 26 Rev 48
Wish List for Improving Performance
Wish List for Improving Performance
-----------------------------------
-----------------------------------
Add CTI singal to Wishbone to support proper burst accesses.This would reduce the time for a burst access to main memory significantly. Currently a burst access is just 4 single accesses back to back.
Add CTI singal to Wishbone to support proper burst accesses.This would reduce the time for a burst access to main memory significantly. Currently a burst access is just 4 single accesses back to back.
Add a branch prediction buffer to the A25 core. This would improve performance by between 5% and 10%. But its quite a complex change and would increase the size of the core a little as well.
Add a branch prediction buffer to the A25 core. This would improve performance by between 5% and 10%. But its quite a complex change and would increase the size of the core a little as well.
 
 
Speed up the A25 execute stage by forcing instructions that use complex shifts to take two cycles. These instructions are quite rare. The vast majority of instructions use no shift or a simple shift of 1 bit left or right, or 8 bits left or right. A simplified barrel shifter on the critical path would allow the core to run 10% to 20% faster in a given technology. Any instruction using a complex shift would require two passes through the execute stage, stalling everything behind it. The first pass it would use the a complex barrel shifter, and in the second pass it would use the ALU.
Done. Speed up the A25 execute stage by forcing instructions that use complex shifts to take two cycles. These instructions are quite rare. The vast majority of instructions use no shift or a simple shift of 1 bit left or right, or 8 bits left or right. A simplified barrel shifter on the critical path would allow the core to run 10% to 20% faster in a given technology. Any instruction using a complex shift would require two passes through the execute stage, stalling everything behind it. The first pass it would use the a complex barrel shifter, and in the second pass it would use the ALU.
 
 
Add a fast multiplier using the FPGA DSP blocks. The multiplier module is designed to have a clean interface so it is easy to replace it with a higher performance one. This would make a big difference for applications that contain a large number of multiplications.
Add a fast multiplier using the FPGA DSP blocks. The multiplier module is designed to have a clean interface so it is easy to replace it with a higher performance one. This would make a big difference for applications that contain a large number of multiplications.
 
 

powered by: WebSVN 2.1.0

© copyright 1999-2024 OpenCores.org, equivalent to Oliscience, all rights reserved. OpenCores®, registered trademark.