I think I found a problem in the shift instructions.
I wrote a simple function to emit a variable shift instruction (e.g. sla r12, r0). While looking at the edge cases, I realized that if the variable shift count is zero, the instruction actually shifts by 16 bits. The C standard requires that a shift by zero leave the value unchanged, so this is a problem.
I took a look at the shift instructions for Arm and x86, and they shift from 0 to N-1 bits for an N-bit value. The TI instructions instead shift from 1 to N bits. Hmm...
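To make the difference concrete, here is a rough C model of how the register-count left shift behaves as described above. The function names are made up for illustration, and this is only a sketch of the behavior, not code from the compiler:

```c
#include <stdint.h>

/* Hypothetical helper: effective shift count when the count comes from R0.
   Only the low four bits of R0 are used, and zero there means "shift by 16",
   so the effective range is 1..16 rather than 0..15. */
static unsigned tms9900_shift_count(uint16_t r0)
{
    unsigned count = r0 & 0x000F;
    return count == 0 ? 16 : count;
}

/* Hypothetical model of "sla rN, r0" on a 16-bit value. */
static uint16_t tms9900_sla_r0(uint16_t value, uint16_t r0)
{
    unsigned count = tms9900_shift_count(r0);
    /* Widen to 32 bits so that a shift by 16 is well defined in C. */
    return (uint16_t)((uint32_t)value << count);
}
```

With this model, a count of zero clears the whole value instead of leaving it alone, which is exactly the case C code is allowed to rely on.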
So what I need to do is insert a check every time that a shift of this type is called for. Something like this:
# Left shift. Shift count in R0, value in R1
andi r0, >000F # Mask shift value, check for zero count
jeq $+4 # If zero, jump over shift
sla r1, r0 # Shift by non-zero bits
I'm not really happy about this: it adds 6 bytes and extra clock cycles to every variable shift. Unfortunately, I don't have a choice if I want to emit correct code.
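For comparison, the guarded sequence can be modeled in C like this. Again, guarded_sla is a made-up name and this is just a sketch of what the three instructions are meant to do, one line per instruction:

```c
#include <stdint.h>

/* Hypothetical model of the guarded left shift: andi / jeq / sla. */
static uint16_t guarded_sla(uint16_t value, uint16_t count)
{
    uint16_t r0 = count & 0x000F;              /* andi r0, >000F */
    if (r0 == 0)                               /* jeq $+4: skip the shift */
        return value;
    return (uint16_t)((uint32_t)value << r0);  /* sla r1, r0 (count is 1..15) */
}
```

A count of zero now returns the value untouched, and counts of 1 to 15 shift as before. Counts of 16 or more are masked down, which is acceptable because shifting a 16-bit int by 16 or more is undefined in C anyway.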
After testing, this is now working properly.