Apparently, the constraints for the shift count register were too loose, so unexpected registers were being chosen in the IRA (register allocation) step and later flagged as errors. I was seeing things like "srl r1, r2", which is super wrong, since a variable shift count has to live in R0.
This was fixed by rewriting all the shift instructions to use "define_expand" to copy the shift count into R0, then do the shift. Two instruction forms were then written: one which only accepts R0 as the shift count register, and another which only accepts constants. The optimizer then eliminates the unneeded move into R0 when constant shifts are used. I'm awfully happy about how that works now.
Even though I don't have a 32-bit shift yet, GCC is happy to compose a sequence using 16-bit instructions. Unfortunately, that sequence is pretty big. "long shift_ar(long r, int n) {return(r>>n);}" gets converted to:
shift_ar
mov r3, r4
andi r4, >10
abs r4
jeq L2
mov r3, r0
mov r1, r2
sra r2, 0
sra r1, >F
b *r11
jmp L6
L2
mov r1, r4
sla r4, >1
mov r3, r0
inv r0
sla r4, 0
mov r3, r0
srl r2, 0
soc r4, r2
sra r1, 0
b *r11
L6
I think I can do better, but there are other things to do right now.
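For reference, here is a rough C sketch of the strategy that sequence appears to follow: test bit 4 of the count to decide whether the shift is 16 or more, then build the result from 16-bit shifts on the two halves. The names sra16 and shift_ar32 are made up for this illustration, and the code works on unsigned bit patterns so it stays portable. It models only the arithmetic, not GCC's actual expansion or the exact instruction behavior.

```c
#include <stdint.h>

/* Hypothetical helper: 16-bit arithmetic shift right, n in 0..15. */
static uint16_t sra16(uint16_t v, unsigned n)
{
    uint16_t sign = (v & 0x8000u) ? 0xFFFFu : 0u;
    if (n == 0)
        return v;
    /* Logical shift, then fill the vacated top bits with the sign. */
    return (uint16_t)((v >> n) | (uint16_t)(sign << (16 - n)));
}

/* Hypothetical model of a 32-bit arithmetic shift right built from
   16-bit operations. r is the 32-bit value as a bit pattern, n in 0..31. */
uint32_t shift_ar32(uint32_t r, unsigned n)
{
    uint16_t hi = (uint16_t)(r >> 16);
    uint16_t lo = (uint16_t)(r & 0xFFFFu);
    unsigned k = n & 15u;

    if (n & 16u) {
        /* Shift of 16 or more: the low word comes entirely from the
           high word, and the high word becomes pure sign fill. */
        lo = sra16(hi, k);
        hi = sra16(hi, 15);
    } else {
        /* Shift below 16: bits leaving the high word land in the low
           word. Shifting by 1 and then by 15-k gives hi << (16-k)
           without ever needing a shift by 16 when k is 0. */
        uint16_t carry = (uint16_t)((uint16_t)(hi << 1) << (15u - k));
        lo = (uint16_t)((lo >> k) | carry);
        hi = sra16(hi, k);
    }
    return ((uint32_t)hi << 16) | lo;
}
```

For example, shift_ar32(0x80000000u, 4) gives 0xF8000000 and shift_ar32(0x12345678u, 8) gives 0x00123456. The two-step left shift in the else branch mirrors the "sla r4, >1" followed by the inverted count in the listing.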