about or32-uclinux-gcc
by Unknown on Jan 13, 2004 |
Hi all,
I used or32-uclinux-gcc to generate an assembly file from a FIR filter
written in C.
////// c code /////
#include
#include
int main() {
int i;
int N;
int F;
int x[10], c[10];
F =0;
N =10;
for(i=0; i_main,@function
_main:
# 00111100000100000000000000000000
# gpr_save_area 0 vars 92 current_function_outgoing_args_size 0
l.addi r1,r1,-96 # reserve 96 bytes for the data:
# 20*4 + 4 (i) + 4 (N) + 4 (F) + 4 (saved r2)
l.sw 0(r1),r2
l.addi r2,r1,96
l.addi r3,r0,0 # move immediate
l.sw -12(r2),r3 # F
l.addi r4,r0,10 # move immediate N
l.sw -8(r2),r4
l.addi r3,r0,0 # move immediate i
l.sw -4(r2),r3
.L2:
l.lwz r4,-4(r2) # SI load
l.lwz r3,-8(r2) # SI load
l.sflts r4,r3 # if r4
about or32-uclinux-gcc
by Unknown on Jan 13, 2004 |
I found that the generated ASM code is inefficient, i.e.
in a loop operation it spends a lot of time calculating the address at which to store the data, for example:
l.lwz r4,-4(r2) # SI load
l.addi r3,r0,4 # move immediate
l.mul r4,r4,r3
l.addi r3,r2,-52
l.add r5,r3,r4 # calculating the address of x
Seems like gcc has problems recognizing the l.muli instruction.
Marko
about or32-uclinux-gcc
by Unknown on Jan 13, 2004 |
l.lwz r4,-4(r2) # SI load
l.addi r3,r0,4 # move immediate
l.mul r4,r4,r3
l.addi r3,r2,-52
l.add r5,r3,r4 # calculating the address of x
Seems like gcc has problems recognizing the l.muli instruction.
l.muli is not used by our gcc; that might be worth doing too. I'm no asm guru, but it looks like a shift would suffice here as well. One of the l.add instructions could also be saved if the array address were calculated outside the loop.
Stephen, do you have optimization (-O or -O2) enabled?
Heiko
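The two suggestions in this reply (a shift instead of a multiply, and computing the array address outside the loop) can be illustrated in C. This is only a sketch under stated assumptions: the function names `offset_mul`, `offset_shift` and `sum_hoisted` are hypothetical, and the element size is assumed to be 4 bytes, as the `l.addi r3,r0,4` in the listing suggests. It does not show what or32-uclinux-gcc actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element i, computed the way the unoptimized listing
 * does it: multiply the index by the element size (assumed 4 bytes). */
uint32_t offset_mul(uint32_t i) {
    return i * 4u;
}

/* Same offset using a left shift by 2, the C counterpart of l.slli. */
uint32_t offset_shift(uint32_t i) {
    return i << 2;
}

/* Sums n elements while taking the array base address once, before the
 * loop, and stepping a pointer instead of recomputing base + i*4. */
int sum_hoisted(const int *x, int n) {
    assert(n == 0 || x != NULL);
    const int *p = x;               /* base address, computed once */
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += *p++;              /* advance by one element per pass */
    }
    return total;
}
```

For every index the two offset functions agree, which is why a shift can stand in for the multiply when the element size is a power of two. An optimizing compiler will often perform both transformations itself.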
about or32-uclinux-gcc
by Unknown on Jan 14, 2004 |
Hi Heiko,
Thanks for the comment. I did not enable the optimization option (-O or -O2). I have now tried enabling it, but I found that there are some problems.
My C code is a FIR filter.
////// C source code/////
#include
#include
int main() {
int i;
int N;
int F;
int x[10], c[10];
F =0;
N =10;
for(i=0; i_main,@function
_main:
# 00011111110100000000000000000000
# gpr_save_area 0 vars 80 current_function_outgoing_args_size 0
l.addi r1,r1,-84
l.sw 0(r1),r9
l.addi r7,r0,10 # move immediate
l.addi r6,r0,0 # move immediate
l.addi r9,r1,44
l.addi r8,r1,4
# first loop for initialization, no problem
.L5:
l.slli r3,r6,2
l.slli r5,r6,1
l.add r4,r9,r3
l.add r3,r8,r3
l.sw 0(r4),r6
l.addi r6,r6,1
l.sflts r6,r7
l.bf .L5 # delay slot filled
l.sw 0(r3),r5
l.addi r6,r0,0 # move immediate
l.addi r6,r6,1
# the second loop: it only loops, with no operation inside. Wrong (bug?)
.L16:
l.sflts r6,r7
l.bf .L16 # delay slot filled
l.addi r6,r6,1
l.lwz r9,0(r1)
l.jr r9
l.addi r1,r1,84
.endproc _main
.Lfe1:
.size _main,.Lfe1-_main
.ident "GCC: (GNU) 3.1 20020121 (experimental)"
Heiko Panther <heiko.panther@web.de> wrote:
the l.muli is not used by our gcc, a thing that might be worth doing too.
I'm no asm guru, but this looks like a shift would suffice as well. And
then, there could be one of the l.add saved if the array address was
calculated outside the loop.
Stephen, do you have optimization (-O or -O2) enabled?
Heiko
about or32-uclinux-gcc
by Unknown on Jan 14, 2004 |
Stephen,
Any idea about it? Is it a bug in gcc when it runs with the optimization option?
I would guess that your code is optimized away because you're not using the results. Try using the results and see what happens then.
Heiko
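A minimal sketch of what "using the results" could look like. The original C source is truncated in this archive, so everything here is an assumption: the initial values (one array gets `i`, the other `2*i`) are read off the stores in the optimized listing, which array is which is a guess, and the multiply-accumulate body is simply the usual FIR form. The names `fir_accumulate`, `fir_demo` and `TAPS` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TAPS 10  /* assumed to match N = 10 in the post */

/* Multiply-accumulate over n taps; the sum is returned so the caller
 * can observe it, which gives the compiler a reason to keep the loop. */
int fir_accumulate(const int *x, const int *c, int n) {
    assert(x != NULL && c != NULL && n >= 0);
    int F = 0;
    for (int i = 0; i < n; i++) {
        F += x[i] * c[i];
    }
    return F;
}

/* Fills the arrays as the optimized listing appears to (i and 2*i),
 * then returns the filter output instead of discarding it. */
int fir_demo(void) {
    int x[TAPS], c[TAPS];
    for (int i = 0; i < TAPS; i++) {
        x[i] = i;        /* guess: which array holds i is not known */
        c[i] = 2 * i;    /* guess: likewise for 2*i */
    }
    return fir_accumulate(x, c, TAPS);
}
```

Printing the value returned by `fir_demo()` from `main`, or returning it from `main`, makes the result observable. A compiler is generally allowed to delete a computation whose result is never observed, which would be consistent with the empty second loop in the -O2 listing.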