Hardware division does not seem to work correctly in current top-of-tree. I observed this in the test example dhry.c.
The behavior shows up with the code:
<pre> Dhrystones_Per_Second = Number_Of_Runs * 1000000 / User_Time; </pre>With a value of 175 for User_Time and 1 for Number_Of_Runs, Dhrystones_Per_Second is computed as 0, rather than the correct value, 5714.
I discovered the bug when correcting the erroneous formula:
<pre> Dhrystones_Per_Second = Number_Of_Runs * 1000 / User_Time; </pre>This yielded the value 168 (rather than 5).
This does not appear to be an OR32 compiler optimization bug. The previous example was built with -O2, but repeating at -O0 with Number_Of_Runs = 10 and User_Time = 7450 yields 5120, rather than the correct value, 1342.
This could therefore be a bug in the underlying code generation, or in the hardware implementation. I'll do some more tests to investigate.
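For anyone wanting to reproduce this outside dhry.c, here is a minimal sketch of a standalone check. It compares the compiled `/` operator against a shift-and-subtract reference that never uses a divide instruction. The names `soft_divu` and `check_division` are hypothetical helpers, not part of dhry.c, and the scale factor in the third case is an assumption (1000000 is the value consistent with the expected result 1342).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Reference unsigned division by shift-and-subtract.
 * Uses no '/' operator, so it does not rely on l.div or l.divu. */
static uint32_t soft_divu(uint32_t n, uint32_t d)
{
    uint64_t r = 0;   /* 64-bit remainder so the left shift cannot overflow */
    uint32_t q = 0;
    int i;

    assert(d != 0);
    for (i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1u);
        if (r >= d) {
            r -= d;
            q |= (uint32_t)1 << i;
        }
    }
    return q;
}

/* Returns the number of cases where the compiled division
 * disagrees with the reference. */
static int check_division(void)
{
    static const struct { uint32_t runs, scale, time; } cases[] = {
        { 1,  1000000u, 175u  },  /* reported: 0 instead of 5714 */
        { 1,  1000u,    175u  },  /* reported: 168 instead of 5 */
        { 10, 1000000u, 7450u },  /* reported: 5120 instead of 1342 (scale assumed) */
    };
    int failures = 0;
    size_t i;

    for (i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        /* volatile keeps the compiler from folding the division away */
        volatile uint32_t num = cases[i].runs * cases[i].scale;
        volatile uint32_t den = cases[i].time;
        uint32_t want = soft_divu(num, den);
        uint32_t got_u = num / den;                                  /* unsigned divide */
        int32_t  got_s = (int32_t)num / (int32_t)den;                /* signed divide */

        if (got_u != want || (uint32_t)got_s != want) {
            printf("%lu / %lu: unsigned=%lu signed=%ld expected=%lu\n",
                   (unsigned long)num, (unsigned long)den,
                   (unsigned long)got_u, (long)got_s, (unsigned long)want);
            failures++;
        }
    }
    return failures;
}
```

Calling `check_division()` from `main` and printing the return value should give 0 on a working target. A non-zero count would mean the divide path, not the Dhrystone code, is producing the wrong results.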
More investigation indicates this is a hardware bug. This is with top-of-tree running under Icarus Verilog. The code sequence generated seems to be correct.<br> <br> The problem affects both l.div and l.divu. Multiplication does not seem to be affected.
Not a bug. The target HW had been built without l.div/l.divu enabled!