



Problem with function prologue generated by or32-elf-gcc
by ashwinm on May 22, 2013
Hello,
I am using a newlib build of gcc checked out from svn and built using the instructions on http://opencores.org/or1k/OpenRISC_GNU_tool_chain. The GCC version is 4.5.1-or32-1.0rc4. I find that for the following example function:

    void f() { }

or32-elf-gcc generates this assembly:

    l.sw 0xfffffffc(r1),r2
    l.addi r2,r1,0x0
    l.addi r1,r1,0xfffffffc
    l.ori r1,r2,0x0
    l.lwz r2,0xfffffffc(r1)
    l.jr r9
    l.nop 0x0

I see here that on entering the function, the stack pointer (r1) is decremented only after the actual push of r2. This will result in stack corruption if an exception occurs right after the execution of the first instruction, since the exception handler will not see the updated stack pointer and will clobber the top of the stack. I have verified that this problem occurs while simulating the OR1200 RTL. The software is running bare metal, compiled against the newlib library.

I happen to have an older version of the toolchain lying around (GCC version 4.2.2), and I see that it generates the correct function prologue, like so:

    l.addi r1,r1,0xfffffffc
    l.sw 0x0(r1),r2
    l.addi r2,r1,0x4
    l.lwz r2,0x0(r1)
    l.jr r9
    l.addi r1,r1,0x4

In this case, the stack pointer is decremented first, and then the stack is populated. I'd like to know whether my understanding is correct and this is indeed a bug, or whether I am missing something.

Best regards, Ashwin
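The ordering concern above can be replayed in a toy model. This is a sketch in Python rather than OR32 assembly, and it assumes a naive exception handler that pushes directly below the interrupted stack pointer. The addresses, the register values, and the names `simulate` and `handler_push` are hypothetical and are not taken from any OpenRISC tool.

```python
WORD = 4                 # bytes per 32-bit word
INITIAL_SP = 0x1000      # hypothetical value of r1 on function entry
SAVED_R2 = 0xCAFE        # hypothetical value of r2 that the prologue saves
HANDLER_WORD = 0xDEAD    # hypothetical word the handler pushes


def handler_push(mem, sp):
    """Naive handler (assumption): push one word directly below sp."""
    mem[sp - WORD] = HANDLER_WORD


def simulate(store_first):
    """Return the contents of the r2 save slot once the prologue has finished."""
    mem = {}             # byte address -> word stored at that address
    sp = INITIAL_SP
    if store_first:
        # GCC 4.5.1 order: l.sw -4(r1),r2 runs before r1 is decremented
        slot = sp - WORD
        mem[slot] = SAVED_R2
        handler_push(mem, sp)     # exception right after the first instruction
        sp -= WORD                # l.addi r1,r1,-4
    else:
        # GCC 4.2.2 order: l.addi r1,r1,-4 runs before the store
        sp -= WORD
        handler_push(mem, sp)     # exception right after the first instruction
        slot = sp
        mem[slot] = SAVED_R2      # l.sw 0(r1),r2
    return mem[slot]


print(hex(simulate(store_first=True)))   # 0xdead: the saved r2 was overwritten
print(hex(simulate(store_first=False)))  # 0xcafe: the saved r2 is intact
```

The model only shows that the store-first order is unsafe when the handler starts pushing at the interrupted stack pointer. It says nothing about handlers that start lower down.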
RE: Problem with function prologue generated by or32-elf-gcc
by jeremybennett on May 23, 2013
Hi Ashwin,
I've asked Joern Rennecke, who worked on the 4.5.1 port, to give an opinion on this.

Best wishes, Jeremy
RE: Problem with function prologue generated by or32-elf-gcc
by ashwinm on May 23, 2013
Thanks Jeremy.
Best Regards, Ashwin
RE: Problem with function prologue generated by or32-elf-gcc
by jeremybennett on May 23, 2013
I've spoken briefly with Joern. He reminded me that the OpenRISC ABI allows for a "red zone" beyond the end of the stack. The interrupt handler places its stack beyond this.
Having an open end to the stack like this opens up a large number of optimization opportunities. For example, leaf functions do not need to set up a stack frame at all. The OpenRISC is a little unusual in that its red zone is huge (2,536 bytes), but red zones are reasonably common in other architectures.

HTH, Jeremy
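To illustrate why a handler that places its stack beyond the red zone leaves the interrupted function's data alone, here is a minimal Python sketch. It is a toy model under stated assumptions, not the newlib or OR1200 handler: the stack pointer value, the 32-byte handler frame, and the names `handler_stack_top` and `slot_survives` are all hypothetical, and the red zone size is passed in as a parameter instead of being fixed.

```python
WORD = 4  # bytes per 32-bit word


def handler_stack_top(interrupted_sp, red_zone):
    """Address just above the first byte the handler is allowed to write."""
    return interrupted_sp - red_zone


def slot_survives(offset_below_sp, red_zone, handler_frame=32, sp=0x1000):
    """True if a word stored at sp - offset_below_sp is untouched by the handler.

    The handler is assumed to use handler_frame bytes directly below
    handler_stack_top(sp, red_zone).
    """
    slot = sp - offset_below_sp                 # word occupies slot .. slot+3
    top = handler_stack_top(sp, red_zone)       # handler uses bottom .. top-1
    bottom = top - handler_frame
    overlaps = slot < top and slot + WORD > bottom
    return not overlaps


# A word stored at sp-4, as in the 4.5.1 prologue:
print(slot_survives(4, red_zone=0))      # False: handler starts at sp and clobbers it
print(slot_survives(4, red_zone=128))    # True: handler starts 128 bytes lower
# A word stored just past a 128-byte red zone is not protected:
print(slot_survives(132, red_zone=128))  # False
```

Whatever size is chosen, the handler and the compiler have to agree on it, since the compiler only keeps data within the red zone it was built for.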
RE: Problem with function prologue generated by or32-elf-gcc
by ashwinm on May 23, 2013
Jeremy,
Thanks very much for the clarification. I was not aware of the red zone. I will modify my exception handlers to take it into account.

Best Regards, Ashwin
RE: Problem with function prologue generated by or32-elf-gcc
by stekern on May 29, 2013
I don't think that the red zone of 2092 bytes has ever been used in practice. The default for gcc is 128 bytes, and all exception handlers I've seen (including newlib) use that. The latest version of the arch manual uses that value too.
RE: Problem with function prologue generated by or32-elf-gcc
by jeremybennett on Jun 3, 2013
Hi Stefan,

I wasn't defending the value of 2,920, just noting the value that is historically used by OpenRISC. It is the value that is used in the stable (4.5.1) tool chain. Do I take it from your comment that it is now 128 bytes in the development GCC (4.8) and LLVM tool chains?

The important thing is that all the parts of the tool chain match up. Notably GDB has knowledge of the red zone, so it should match the value in GCC. If the value is changed in the architecture manual, then it might be sensible to note the former value there as well, so people are not surprised if they are using older tools.

Best wishes, Jeremy
RE: Problem with function prologue generated by or32-elf-gcc
by stekern on Jun 3, 2013
I know you weren't defending it ;)

You can take from my comment that the default is 128 bytes in 4.8 gcc *and* the stable 4.5.1 gcc (and the newlib exception code only steps over 128 bytes too). And looking even further back in the mirror, the 4.2.2 version doesn't seem to use the red zone, hence my comment about being unsure whether the 2092 value has ever been used in practice at all.

LLVM doesn't take advantage of the red zone at all as it is now.
RE: Problem with function prologue generated by or32-elf-gcc
by jeremybennett on Jun 4, 2013
128 bytes is much more sensible. I forgot it had been changed in GCC 4.5.1 - IIRC we even made it configurable.

2092 was certainly used somewhere in the old tool chains, because I remember being horrified at the value. I think it was defined in the tool chain and not actually used, which was the worst of all worlds.

I'd encourage you to consider the merits of a red zone for LLVM. It opens up all sorts of leaf function optimizations that can make a big improvement to performance.

Best wishes, Jeremy
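The leaf function optimization mentioned in this thread comes down to a simple decision. Below is a rough Python sketch of it. It assumes a simplified rule (a leaf function whose frame fits in the red zone can leave the stack pointer alone) and is not the actual GCC or LLVM logic; the function name `needs_sp_adjust` and the example frame sizes are hypothetical.

```python
def needs_sp_adjust(frame_bytes, is_leaf, red_zone=128):
    """Simplified rule (assumption): should the prologue move the stack pointer?

    A leaf function calls nothing, so nothing it runs will push below its
    stack pointer except an exception handler, and that handler skips the
    red zone. A non-leaf function must move the stack pointer so that its
    callees do not overwrite its frame.
    """
    if frame_bytes == 0:
        return False                      # nothing to store at all
    return not (is_leaf and frame_bytes <= red_zone)


print(needs_sp_adjust(4, is_leaf=True))                    # False
print(needs_sp_adjust(4, is_leaf=False))                   # True
print(needs_sp_adjust(200, is_leaf=True))                  # True with 128 bytes
print(needs_sp_adjust(200, is_leaf=True, red_zone=2092))   # False with 2092 bytes
```

A larger red zone lets more leaf functions skip the adjustment, but every exception handler then has to step over that many bytes on entry.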



