Re-opening this issue since I've identified a simple configuration to make it appear.
Taking the latest Amber revision (86), modify line 192 of file "main_mem.v" from:
assign o_wb_ack = i_wb_stb && ( start_write || start_read_d2 );
to:
assign o_wb_ack = i_wb_stb && ( wr_en || start_read_d2 );
This delays the Wishbone acknowledge by one clock cycle, generating a wait state when writing to the main memory.
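To make the one-cycle delay concrete, here is a minimal Python sketch of the write path only. It is not the Amber RTL: it assumes wr_en is simply start_write registered by one clock, treats start_write as i_wb_stb && i_wb_we, and ignores the read term (start_read_d2) entirely. The function name cycles_until_ack is made up for illustration.

```python
def cycles_until_ack(use_wr_en, max_cycles=8):
    """Count wait states before a write is acknowledged.

    Simplified model (assumption): wr_en is start_write delayed by one
    clock, and the master holds strobe and write-enable until acked.
    """
    wr_en = 0  # registered copy of start_write, reset to 0
    for cycle in range(max_cycles):
        stb, we = 1, 1                    # master keeps the request asserted
        start_write = stb & we            # combinational in this model
        write_term = wr_en if use_wr_en else start_write
        ack = stb & write_term            # o_wb_ack, read term omitted
        if ack:
            return cycle                  # number of wait states seen
        wr_en = start_write               # clock edge: register start_write
    return None                           # never acknowledged


print("original ack term :", cycles_until_ack(use_wr_en=False), "wait states")
print("modified ack term :", cycles_until_ack(use_wr_en=True), "wait states")
```

Under these assumptions the original expression acknowledges in the same cycle as the strobe, while the modified one inserts exactly one wait state.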
In this configuration, when the "boot-loader-ethmac" software is run in simulation for a23, Amber enters an infinite loop when executing the instruction "push {r3, lr}" at address 0x1003788.
This is the first instruction of the "main" function.
This issue does not appear when running this test with a25.
Does this bug only occur if you modify the Verilog as you described?
I have a RAM setup where it takes 4 cycles to return the ack pulse and I haven't noticed this issue. That said, it may only happen for odd numbers of cycles.
Conor, leave this one open for now and I'll investigate it after I finish what I'm currently working on. I've been meaning to try and bring up the reference demo in Verilator anyway.
Yes, this bug only occurs if I modify the code as described. It seems to happen only when the cache is enabled. I had no trouble running the tests located at /hw/tests.
I spotted an issue with the caches a while back. Let me see if I can recall it correctly. It seemed that the cache timing assumed the RAM would respond at the same speed. In particular I noticed big issues when I implemented data aborts for writes: the cache was always written, even if the wb_err signal was asserted. It could be due to that.
Currently I run with caches off because I haven't had time to debug the caches with data aborts.
I may try running my fuzz program while choosing random bus waits from, say, 1 to 50. A CPU does need to cope with random waits, because bus contention could mean that the CPU has to wait while a higher-priority device accesses the bus.
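As a rough illustration of the random-wait idea, the Python sketch below models a bus slave that picks a fresh wait count between 1 and 50 for each access. It is not the fuzz program mentioned above. RandomWaitSlave and measure_waits are hypothetical names, and the handshake is reduced to a single strobe in, ack out.

```python
import random


class RandomWaitSlave:
    """Toy bus slave that acks each access after a random number of waits."""

    def __init__(self, min_wait=1, max_wait=50, seed=0):
        self.rng = random.Random(seed)    # seeded so failures are reproducible
        self.min_wait = min_wait
        self.max_wait = max_wait
        self.remaining = None             # None means no access in flight

    def tick(self, stb):
        """Advance one clock; return 1 on the cycle the access is acked."""
        if not stb:
            self.remaining = None
            return 0
        if self.remaining is None:        # new access: choose its wait count
            self.remaining = self.rng.randint(self.min_wait, self.max_wait)
        if self.remaining == 0:
            self.remaining = None
            return 1
        self.remaining -= 1
        return 0


def measure_waits(slave, n_accesses):
    """Issue back-to-back accesses and record the wait states of each."""
    waits = []
    for _ in range(n_accesses):
        cycles = 0
        while not slave.tick(1):          # hold strobe until acked
            cycles += 1
        waits.append(cycles)
        slave.tick(0)                     # drop strobe for one idle cycle
    return waits


waits = measure_waits(RandomWaitSlave(seed=1), 100)
print("min/max wait states seen:", min(waits), max(waits))
```

Keeping the seed fixed means a run that hangs the core can be replayed with the same wait sequence.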
OK, I have confirmed there is an issue with the cache write. If the RAM and cache timings do not match, then a write will result in u_fetch's o_stall never going low.
I think the fix here is to have the cache perform the write when the Wishbone bus's ack pulse comes back. Otherwise the Wishbone bus is out of step with the CPU and we're in a write-back, rather than write-through, situation, which is not how the ARMv2 behaves (I need to double-check that, though).
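The proposed fix can be sketched as a small Python model in which the cache holds a write as pending and only commits it when the bus answers. This is an illustration of the idea, not the Amber cache: the class and method names are hypothetical, and it also folds in the earlier observation that a write should not land in the cache when wb_err is asserted.

```python
class AckGatedCache:
    """Toy write-through cache that commits a write only on a clean ack."""

    def __init__(self):
        self.lines = {}       # address -> data currently held in the cache
        self.pending = None   # (address, data) waiting for the bus response

    def request_write(self, addr, data):
        """CPU issues a write; nothing is stored in the cache yet."""
        self.pending = (addr, data)

    def bus_response(self, ack, err):
        """Called each cycle with the bus ack/err; returns True on commit."""
        if self.pending is None:
            return False
        if err:                           # data abort: drop the write
            self.pending = None
            return False
        if ack:                           # bus accepted it: update the cache
            addr, data = self.pending
            self.lines[addr] = data
            self.pending = None
            return True
        return False                      # still waiting, however many cycles


cache = AckGatedCache()
cache.request_write(0x100, 0xAB)
for ack in (0, 0, 0, 1):                  # three wait states, then ack
    cache.bus_response(ack=ack, err=0)
print(hex(cache.lines[0x100]))
```

Because the commit is driven by the ack itself, the number of wait states in between does not matter in this model.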
I'll take a look at this.
Ok the problem here is kinda simple.
o_fetch_stall is set by the OR of the wb_stall and the cache_stall. The trouble is that in the bus setup described in this bug they end up being completely out of phase and hence never both go low at the same time.
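A tiny Python sketch shows why out-of-phase stalls are fatal. The waveforms are made up for illustration (each stall simply toggles every cycle) and are not taken from a simulation trace. Only the OR relationship comes from the description above.

```python
def fetch_stall(wb_stall, cache_stall):
    """o_fetch_stall per cycle: the OR of the two stall sources."""
    return [w | c for w, c in zip(wb_stall, cache_stall)]


# Hypothetical waveforms: both stalls toggle every cycle.
wb_stall = [1, 0] * 4
cache_in_phase = [1, 0] * 4       # low on the same cycles as wb_stall
cache_out_of_phase = [0, 1] * 4   # low only when wb_stall is high

print("in phase     :", fetch_stall(wb_stall, cache_in_phase))
print("out of phase :", fetch_stall(wb_stall, cache_out_of_phase))
```

In the out-of-phase case every cycle has at least one stall source high, so the combined stall never drops and the core cannot advance.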