Monday, December 4, 2017

I spent some more time on the decompiler. Nothing special, just fixing bugs and making little improvements here and there.

The biggest fix I made was to the instruction simplification code. This fixed some clutter when working with byte values.

When simplifying, subtrees of the instruction tree are passed through filters that look for ways to simplify them. Currently, I've got about 20 filters in place. These cover cases like adding zero to a value or multiplying by 1. Basically, we replace part of an instruction with a simpler equivalent operation.
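As a rough illustration of the filter idea (not the decompiler's actual code), here is a minimal Python sketch. The node classes, the two filters, and the `simplify` driver are all hypothetical names made up for this example; the real filters are more numerous and more involved.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Var, BinOp]


def add_zero(e):
    """Filter: x + 0 -> x. Returns None when the pattern does not match."""
    if isinstance(e, BinOp) and e.op == "+" and e.right == Const(0):
        return e.left
    return None


def mul_one(e):
    """Filter: x * 1 -> x. Returns None when the pattern does not match."""
    if isinstance(e, BinOp) and e.op == "*" and e.right == Const(1):
        return e.left
    return None


# Hypothetical filter list; the post mentions about 20 of these.
FILTERS = [add_zero, mul_one]


def simplify(e):
    """Simplify children first, then apply filters until none matches."""
    if isinstance(e, BinOp):
        e = BinOp(e.op, simplify(e.left), simplify(e.right))
    changed = True
    while changed:
        changed = False
        for f in FILTERS:
            result = f(e)
            if result is not None:
                e = result
                changed = True
                break
    return e
```

For instance, `simplify(BinOp("*", BinOp("+", Var("var1"), Const(0)), Const(1)))` collapses to `Var("var1")`: the inner add-zero is removed first, then the multiply-by-one.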

Here's an example:

We start with something like "var2 = ((var1 >> 8) | (var1 << 8)) >> 8". This shows up when we have an assembly sequence like "SWPB R1; MOVB R1, R2".

First, we distribute the last right shift: "var2 = (var1 >> 16) | ((var1 << 8) >> 8)"

Then we recognize that the first right shift will always result in zero, since var1 is only 16 bits wide: "var2 = 0 | ((var1 << 8) >> 8)"

Next, or'ing with zero has no impact on the result: "var2 = (var1 << 8) >> 8"

Finally, we replace the nested shifts with a logical and: "var2 = var1 & 255"

As you can see, this results in an instruction which is much easier to understand and process.
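The chain of rewrites can be checked by brute force. The sketch below assumes 16-bit values (so left shifts are truncated with a mask) and simply compares each intermediate form against the low-byte mask `v & 0xFF` for every possible input. The function names are made up for this check.

```python
MASK = 0xFFFF  # assumption: values are 16 bits wide, as in a TMS9900 register


def original(v):
    # ((v >> 8) | (v << 8)) >> 8, with the byte swap truncated to 16 bits
    swapped = ((v >> 8) | (v << 8)) & MASK
    return swapped >> 8


def distributed(v):
    # The outer right shift pushed into both operands of the OR
    return (v >> 16) | (((v << 8) & MASK) >> 8)


def zero_dropped(v):
    # v >> 16 is always 0 for a 16-bit value, and OR with 0 is a no-op
    return ((v << 8) & MASK) >> 8


def masked(v):
    # Shifting left then right by 8 keeps only the low byte
    return v & 0xFF


def all_steps_agree():
    """True if every form gives the same result for all 16-bit inputs."""
    return all(
        original(v) == distributed(v) == zero_dropped(v) == masked(v)
        for v in range(0x10000)
    )
```

Calling `all_steps_agree()` walks all 65536 inputs, which is a cheap way to gain confidence in a new filter before trusting it on real code.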

Here's that same instruction in the context of a complete function. The function just writes to a VDP register.

ORI     R1, >8000   # set high bit to flag a register write
SWPB    R1          # swap bytes in R1
MOVB    R1, @>8C02  # write low byte (value) to VDP address port
SWPB    R1          # swap bytes back
MOVB    R1, @>8C02  # write high byte (reg ID) to VDP address port
B       *R11        # return


Here's the decompiler's output after processing that function:

(var223 = (((R1 | 0x8000) >> 8) | ((R1 | 0x8000) << 8)))
((U8*)(0x8C02) = (var223 >> 8))
((U8*)(0x8C02) = (var223 & 255))
(goto (U16*)(R11))


At first, this doesn't seem especially impressive, but remember that the decompiler found this function without human guidance, using only the compiled binary as input.
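To sanity-check the decompiled output above, here is a small Python model of its three assignments. It is only a sketch: the function name and the sample input are hypothetical, and the 16-bit masking is an assumption about how the shifts behave on the target.

```python
def vdp_register_write(r1):
    """Return the two bytes stored to 0x8C02, in the order they are written."""
    word = (r1 | 0x8000) & 0xFFFF                   # ORI R1, >8000
    var223 = ((word >> 8) | (word << 8)) & 0xFFFF   # SWPB R1 (16-bit byte swap)
    first = var223 >> 8     # first store: original low byte (the value)
    second = var223 & 255   # second store: original high byte (reg ID | 0x80)
    return first, second
```

With a made-up input of `0x01E0` (register 1, value 0xE0), this returns `(0xE0, 0x81)`: the value byte goes out first, followed by the register ID with the high bit set, matching the comments in the assembly listing.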
