OpenCores

Ion - MIPS(tm) compatible CPU

Reset problem with 'Hello World' demo #2
Closed ja_rd opened this issue over 11 years ago
ja_rd commented over 11 years ago

In the 'Hello World' demo, one or two of the first few resets 'fail': the console shows an avalanche of nonsense characters. This is usually the second reset, but it depends (I'm not sure on what). After the first 3 or 4 resets, everything works fine indefinitely, or at least for the more than 10 resets in a row that I have tried.

This happens with the cache enabled or disabled. Doesn't happen on the Adventure demo, at least so far.

I had thought this was a mechanical problem in my dev board but it isn't. I have replicated the problem in the simulation test bench.

ja_rd was assigned over 11 years ago
ja_rd commented over 11 years ago

It's a problem with the PC reset logic. PC is not being incremented after fetching the first instruction (a relative jump). Then the relative jump goes to an address 4 bytes before the intended target. At that address there's a 'jr' (jump to register) which is part of the opcode emulation decoding logic, and the code derails.
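To make the 4-byte error concrete: a MIPS relative branch computes its target from the address of the instruction that follows the branch (PC + 4), plus the sign-extended word offset times 4. If the PC was never incremented, the base is 4 bytes short and so is the target. A minimal sketch in Python follows; the reset vector value is the standard MIPS one and is only assumed to apply here, and the offset is made up for illustration.

```python
RESET_VECTOR = 0xBFC00000   # standard MIPS reset vector; assumed for this core
OFFSET_WORDS = 0x60         # hypothetical branch offset, in words

def branch_target(branch_pc, offset_words, pc_incremented=True):
    """Target of a MIPS relative branch located at branch_pc."""
    # Normally the base is the address of the instruction after the branch.
    base = branch_pc + 4 if pc_incremented else branch_pc
    return (base + (offset_words << 2)) & 0xFFFFFFFF

intended = branch_target(RESET_VECTOR, OFFSET_WORDS)
actual = branch_target(RESET_VECTOR, OFFSET_WORDS, pc_incremented=False)
print(hex(intended), hex(actual), intended - actual)
# prints: 0xbfc00184 0xbfc00180 4
```

Whatever instruction happens to sit in the word just before the intended target is what gets executed, which matches the stray 'jr' described above.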

Looks like it might depend on whether or not the reset 'hits' a jump instruction, and whether or not the first instruction after reset is a jump (which it is).

ja_rd commented over 11 years ago

In the interval between reset and the end of the first cache refill, the cache is presenting a code word to the CPU; in general, this word will be a leftover from before the reset and will not be valid.

And the CPU loads into the IR whatever the cache has in its code word output, unless it is stalled. But in the very first cycle after reset, the CPU is not stalled yet.

In this case, the IR was being loaded with a jump instruction (located at the reset vector address) before the CPU was stalled for the code cache refill. The PC increment was then stalled, and the decoded IR was forcing the CPU to jump to strange places and derail.

The real problem here is that the CPU structure is too messy. And the CPU-cache in particular is the messiest of all...

I have to find a solution that does not ruin the cache timing. At the very least, the cache must not present code to the CPU until after the first code refill has completed.

ja_rd commented over 11 years ago

Ok, fixed at revision 242.

The fix involves a new signal between the cache and the CPU, 'cache_ready'. While it is deasserted, the CPU will not load the cache code word into IR; instead it will load a zero word. This will only happen in the first fetch after a reset. The signal cache_ready asserts after the first code miss and remains asserted until the next reset.
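A rough behavioral sketch of that gating, in Python rather than the core's HDL, just to illustrate the idea. It is not the actual code from revision 242: the function and class names are hypothetical, the instruction words are made-up examples, and cache_ready is modeled as asserting once the first code refill has completed, which is my reading of "after the first code miss".

```python
NOP = 0x00000000         # the all-zero word decodes as a MIPS no-op (sll $0, $0, 0)
STALE_JUMP = 0x08000010  # hypothetical leftover word in the cache output (a 'j' opcode)
VALID_WORD = 0x10000060  # hypothetical real instruction at the reset vector (a branch)

def next_ir(ir, code_word, stalled, cache_ready):
    """Value of IR after one clock edge."""
    if stalled:
        return ir        # a stalled CPU holds IR
    if not cache_ready:
        return NOP       # cache output cannot be trusted yet: load a zero word
    return code_word

class CacheReadyFlag:
    """Sticky flag: cleared by reset, set once the first code refill is done."""
    def __init__(self):
        self.ready = False
    def reset(self):
        self.ready = False
    def refill_done(self):
        self.ready = True

flag = CacheReadyFlag()
flag.reset()

# First cycle after reset: CPU not stalled yet, cache still shows the stale word.
ir = next_ir(NOP, STALE_JUMP, stalled=False, cache_ready=flag.ready)
first_ir = ir

# Refill in progress: the CPU is stalled and IR holds its value.
ir = next_ir(ir, STALE_JUMP, stalled=True, cache_ready=flag.ready)

# Refill complete: cache_ready asserts and the real word is loaded.
flag.refill_done()
ir = next_ir(ir, VALID_WORD, stalled=False, cache_ready=flag.ready)
```

With the gate in place, first_ir is the zero word instead of the stale jump, so nothing is decoded as a jump while the PC increment is stalled. Calling next_ir with cache_ready=True in that first cycle reproduces the original failure: the stale word goes straight into IR.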

This is a nasty hack, if only because the logic is obscure. The code needs some refactoring to make things clearer. At the very least a diagram!

ja_rd closed this over 11 years ago
ja_rd commented over 11 years ago

Closed by revision 242; the fix does not hurt area or clock rate.


Assignee
ja_rd
Labels
Bug