Monday, May 21, 2012

I've noticed a few places where a register is loaded with a constant, used for a single operation, then discarded. Like this:
  li r1, 27
  mpy r1, r4

I wonder if it would be better to put these constants in the data section, and do something like this:
  mpy @const_27, r4

Let's check it out!

li r1, 27  - 4 bytes, 12+4+4=20 clocks
mpy r1, r4 - 2 bytes, 52+4=56 clocks
total: 6 bytes, 76 clocks

.data const_27: data 27 - 2 bytes, 0 clocks
mpy @const_27, r4 - 2 bytes, 52+4+8=64 clocks
total: 4 bytes, 64 clocks

Hmm, pretty good... 33% smaller, 15% faster. I wasn't expecting that. It gets better if the constant is used in other places too, since the 2-byte data word is then shared and the effective size of each operation shrinks. (Average size is between 4 and 2 bytes per operation, asymptotic to 2 as the number of uses grows.) The speedup is also more dramatic for quicker instructions (like movb: 20+14+4+8=46 clocks vs. 26 clocks, 43% faster).
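To double-check the arithmetic, here is a quick sketch in Python that redoes the comparison. The byte and clock figures are copied straight from the tallies above; the constant and function names are made up for illustration, and the per-use average assumes every extra use of the constant is another 2-byte memory-operand instruction.

```python
# Figures copied from the tallies in this post; names are illustrative only.
LI_BYTES, LI_CLOCKS = 4, 12 + 4 + 4             # li r1, 27
MPY_REG_BYTES, MPY_REG_CLOCKS = 2, 52 + 4       # mpy r1, r4
CONST_BYTES = 2                                 # const_27: data 27
MPY_MEM_BYTES, MPY_MEM_CLOCKS = 2, 52 + 4 + 8   # mpy @const_27, r4


def percent_saved(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (before - after) / before


def bytes_per_use(uses):
    """Average bytes per operation when one constant is shared by `uses` instructions."""
    return MPY_MEM_BYTES + CONST_BYTES / uses


# (bytes, clocks) for each variant
immediate = (LI_BYTES + MPY_REG_BYTES, LI_CLOCKS + MPY_REG_CLOCKS)
in_memory = (CONST_BYTES + MPY_MEM_BYTES, MPY_MEM_CLOCKS)

size_saving = percent_saved(immediate[0], in_memory[0])
clock_saving = percent_saved(immediate[1], in_memory[1])
movb_saving = percent_saved(46, 26)  # the movb example from the text

if __name__ == "__main__":
    print("li + mpy:      %d bytes, %d clocks" % immediate)
    print("mpy @const_27: %d bytes, %d clocks" % in_memory)
    print("saved: %.1f%% size, %.1f%% clocks" % (size_saving, clock_saving))
    print("movb saved: %.1f%% clocks" % movb_saving)
    for uses in (1, 2, 4, 16):
        print("%2d uses -> %.2f bytes per operation" % (uses, bytes_per_use(uses)))
```

With these inputs the savings come out at 33.3% in size and 15.8% in clocks, and the per-operation size starts at 4 bytes for a single use and creeps toward 2 as the constant gets shared.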

I should look at ways to optimize memset and memcpy sequences; there's got to be a way to take advantage of that.
