Here's a simplified version of the instructions up to step 182r.peephole2 for the "long" type.
inputs: r1,r2=val, r3=a_base (unused)
insn 24: r3=0 \_ Set temp 32-bit value to zero
insn 25: r4=r3 /
insn 28: r4=r2 <- Copy low word into temp value
insn 12: r4&=0x0F <- Apply mask
insn 13: r3(32-bit)+=48 <- '0'=48 <- Adding constant to low word
insn 14: r1=(char)r4
insn 15: number_buf[0]=r1
insn 33: return
GCC determines that R3 is unused (which is true), so any manipulation of R3 can be safely removed. It also seems to extend this conclusion to R4, since R4 is part of the temporary 32-bit value defined in insns 24 and 25.
That results in these instructions being deleted in step 182:
insn 13
insn 12
insn 28
insn 25
insn 24
We are left with R4, annotated as being in SImode (a 32-bit value). Somehow the notion that R4 is itself the low word is lost, and R5 is selected as the low word of the value instead. This results in the final code:
print_hex
mov r5, r1
swpb r1
movb r1, @number_buf
b *r11
Similar behavior was seen with this code as well:
char test(long a) {return(a+'0');}
test
mov r5, r1
swpb r1
b *r11
Testing will continue with this example, since it's much shorter.
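As a reference point for the expected behavior, here is a minimal host-side sketch in C of what test() should return. The names test_reference, test_from_low_word and check_agreement are made up for illustration, and int32_t stands in for the target's 32-bit long. The return type is unsigned char only to keep the narrowing well defined on the host.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side reference for: char test(long a) { return (a + '0'); }
   int32_t is an assumption standing in for the target's 32-bit long. */
unsigned char test_reference(int32_t a)
{
    /* Add in unsigned arithmetic, then keep the least significant byte. */
    return (unsigned char)((uint32_t)a + (uint32_t)'0');
}

/* The same result computed from the low 16-bit word alone. */
unsigned char test_from_low_word(uint16_t low_word)
{
    return (unsigned char)(((uint32_t)low_word + (uint32_t)'0') & 0xFFu);
}

/* Self-check: the high word never influences the returned byte. */
void check_agreement(int32_t a)
{
    uint16_t low_word = (uint16_t)((uint32_t)a & 0xFFFFu);
    assert(test_reference(a) == test_from_low_word(low_word));
}
```

Only the low word of the argument can affect the result, so reading the low word from the wrong register is enough to break the function.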
I found that the problem happens during register allocation. In all prior steps, the RTX looks good, but at 172r.ira, the wrong register is chosen, which makes a mess of everything from that point on.
168r.asmcoms
insn 3: (HI)((SI)r24)[0] = r1
insn 4: (HI)((SI)r24)[1] = r2
insn 9: (SI)r26=(SI)r24+48
insn 10: (QI)r27=(QI)((SI)r26)[3]
insn 16: (QI)r1=(QI)r27
172r.ira
insn 3: r3=r1
insn 4: r4=r2
insn 9: (SI)r3+=48
insn 10: (QI)r1=(SI)r4 <- this is wrong, should be (SI)r3
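To make the mix-up concrete, here is a small C model of the register pair handling. It is only a sketch: the regfile type and the helper names are hypothetical, the high-word-first order is inferred from the listings above (r1,r2=val with r2 as the low word), and 0xBEEF is an arbitrary stand-in for whatever r5 happens to hold.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: sixteen 16-bit registers. */
typedef struct { uint16_t r[16]; } regfile;

/* Store a 32-bit value in the pair base:base+1, high word first
   (word order assumed from the listings in the post). */
void store_si(regfile *rf, int base, uint32_t value)
{
    assert(base >= 0 && base < 15);
    rf->r[base] = (uint16_t)(value >> 16);
    rf->r[base + 1] = (uint16_t)(value & 0xFFFFu);
}

/* Read byte `offset` of the 32-bit value whose pair starts at `base`.
   Offset 0 is the most significant byte, offset 3 the least. */
uint8_t read_si_byte(const regfile *rf, int base, int offset)
{
    assert(base >= 0 && base < 15);
    assert(offset >= 0 && offset < 4);
    uint16_t word = rf->r[base + offset / 2];
    return (offset % 2 == 0) ? (uint8_t)(word >> 8) : (uint8_t)(word & 0xFFu);
}

/* Replay insns 3, 4, 9 and 10, taking the result byte from the pair
   that starts at read_base (3 is correct, 4 is what IRA picked). */
uint8_t replay_test(uint16_t hi, uint16_t lo, int read_base)
{
    regfile rf = { {0} };
    rf.r[5] = 0xBEEF;            /* arbitrary leftover contents of r5 */
    rf.r[1] = hi;                /* incoming argument, high word */
    rf.r[2] = lo;                /* incoming argument, low word */
    rf.r[3] = rf.r[1];           /* insn 3 */
    rf.r[4] = rf.r[2];           /* insn 4 */
    uint32_t sum = (((uint32_t)rf.r[3] << 16) | rf.r[4]) + 48u;
    store_si(&rf, 3, sum);       /* insn 9: (SI)r3 += 48 */
    return read_si_byte(&rf, read_base, 3);  /* insn 10 */
}
```

In this model, replay_test(0, 5, 3) returns '5', while replay_test(0, 5, 4) takes its byte from r5 and returns whatever was left there, which matches the "mov r5, r1" seen in the generated code.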