More good news: inline assembly works just fine. I didn't really expect any problems, but it's good to know for sure.
I used 32-bit left shifts as my test for inlining, and I've improved quite a bit on the GCC-generated code, which was 18 instructions and awfully clunky. Here's the new code:
    ci   r0, 16
    jeq  shiftl_16
    mov  r2, r4
    sla  r1, r0
    sla  r2, r0
    neg  r0
    srl  r4, r0
    soc  r4, r1
    ci   r0, -16
    jgt  shiftl_end
shiftl_16:
    mov  r2, r1
    clr  r2
shiftl_end:
This is twelve instructions, and about 200 cycles. I can't see any good way to significantly reduce this sequence.
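To make the control flow easier to follow, here is a rough C model of the sequence above. It is only a sketch, not the code GCC emits: the helper names (`hw_count`, `sla16`, `srl16`, `shiftl32_model`) are made up for illustration, and it assumes r1:r2 hold the high and low words of the value and r0 holds the shift count. It relies on the TMS9900 rule that a shift taking its count from r0 uses only the low four bits, with zero meaning a shift by 16.

```c
#include <stdint.h>

/* TMS9900 rule for shifts that take their count from r0:
   only the low 4 bits are used, and 0 means "shift by 16". */
static unsigned hw_count(int16_t r0) {
    unsigned n = (uint16_t)r0 & 0x0Fu;
    return n ? n : 16u;
}

/* 16-bit left shift, as "sla" would do with the count in r0. */
static uint16_t sla16(uint16_t v, int16_t r0) {
    return (uint16_t)((uint32_t)v << hw_count(r0));
}

/* 16-bit logical right shift, as "srl" would do with the count in r0. */
static uint16_t srl16(uint16_t v, int16_t r0) {
    return (uint16_t)((uint32_t)v >> hw_count(r0));
}

/* Hypothetical model of the 12-instruction sequence, for counts 1..31. */
uint32_t shiftl32_model(uint32_t value, unsigned count) {
    uint16_t r1 = (uint16_t)(value >> 16);  /* high word */
    uint16_t r2 = (uint16_t)value;          /* low word */
    int16_t  r0 = (int16_t)count;
    uint16_t r4;

    if (r0 != 16) {                 /* ci r0, 16 / jeq shiftl_16 */
        r4 = r2;                    /* mov r2, r4 */
        r1 = sla16(r1, r0);         /* sla r1, r0 */
        r2 = sla16(r2, r0);         /* sla r2, r0 */
        r0 = (int16_t)-r0;          /* neg r0 */
        r4 = srl16(r4, r0);         /* srl r4, r0: bits carried out of the low word */
        r1 |= r4;                   /* soc r4, r1 */
        if (r0 > -16) {             /* ci r0, -16 / jgt shiftl_end */
            return ((uint32_t)r1 << 16) | r2;
        }
    }
    /* shiftl_16: counts of 16 or more move the low word into the high word. */
    r1 = r2;                        /* mov r2, r1 */
    r2 = 0;                         /* clr r2 */
    return ((uint32_t)r1 << 16) | r2;
}
```

For counts 1 through 31 the model agrees with a plain `value << count`. A count of 0 is not covered: the first two shifts would run as 16-bit shifts by 16, so that case presumably has to be handled before this sequence is reached.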