The code in c_startup.s that copies the data section from flash to RAM moves whole words only. When the data section contains 1 to 3 bytes, the code fails (line 51).
I think this is related to a quirky reset behavior that happens sometimes: for example, the 'hello' demo crashes 2 times out of the first 4 resets and then resets correctly indefinitely.
I have yet to confirm that this reset problem is caused by the faulty startup code, but the startup code is definitely wrong.
I've fixed the C startup code problem. It still moves whole words, but the loop now works for data sizes that are not a multiple of 4.
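For anyone reading along, here is a rough C sketch of the idea, not the actual assembly in c_startup.s. The function name and parameters are made up for illustration, and it assumes both regions are word aligned and padded to a word boundary. The point is only that the byte count gets rounded up to whole words, so a section of 1 to 3 bytes still gets one word copied.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the real startup code): copy `nbytes` bytes of
 * initialized data from `src` (flash image) to `dst` (RAM), one 32-bit word
 * at a time. Assumes both buffers are word aligned and padded to a multiple
 * of 4 bytes, so copying the last partial word is safe. */
void copy_data_words(uint32_t *dst, const uint32_t *src, size_t nbytes)
{
    /* Round up to whole words: 1..4 bytes -> 1 word, 5..8 bytes -> 2 words. */
    size_t nwords = (nbytes + 3u) / 4u;

    for (size_t i = 0; i < nwords; i++) {
        dst[i] = src[i];
    }
}
```

With a count of 0 the loop body never runs, so an empty data section is handled too.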
In the process I have fixed a major snafu in the test bench, which apparently had not been adapted to the latest changes in the SoC. I don't know how long it's been broken...
The reset problem still happens, so I'm opening a separate bug.
Startup code fixed.