Monday, December 4, 2017

I spent some more time on the decompiler. Nothing special, just fixing bugs and making little improvements here and there.

The biggest fix I made was to the instruction simplification code. This fixed some clutter when working with byte values.

When simplifying, subsets of the instruction tree are passed through filters to find ways to simplify it. Currently, I've got about 20 filters in place. These cover cases like adding zero to a value, or multiplying by 1. Basically, we replace a subset of an instruction with a simpler equvalent operation.

Here's an example:

We start with something like "var2 = ((var1 >> 8) | (var1 << 8)) >> 8". This shows up when we have an assembly sequence like "SWPB R1; MOVB R1, R2".

First, we distribute the last right shift: "var2 = ((var1 >> 16) | ((var1 << 8) >> 8)"

Then we recognize that the first right shift will always result in zero: "var2 = 0 | ((var1 << 8) >> 8)"

Next, or'ing with zero has no impact on the result: "var2 = (var1 << 8) >> 8"

Finally, we replace the nested shifts with a logical and: "var2 = var1 & 256"

As you can see, this results in an instruction which is much easier to understand and process.

Here's that same instruction in the context of a complete function. The function just writes to a VDP register.

ORI     R1, >8000   # write to register
SWPB    R1          # swap bytes in R1
MOVB    R1, @>8C02  # write low byte (value) to VDP data port
SWPB    R1          #
MOVB    R1, @>8C02  # write high byte (reg ID) to VDP data port
B       *R11        # return


Here's the decompiler's output after processing that function:

(var223 = (((R1 | 0x8000) >> 8) | ((R1 | 0x8000) << 8)))
((U8*)(0x8C02) = (var223 >> 8))
((U8*)(0x8C02) = (var223 & 255))
(goto (U16*)(R11))


At first, this doesn't seem especially impressive, but remember that the decompiler found this function without human guidance, using only the compiled binary as input.

Saturday, December 2, 2017

So I spent some more time working on the decompiler. The part I was working on was subgraph recognition. Each flow control structure has a unique sructure, and by recognizing these structures, we can identify code blocks and the control operations.

Here are several example graphs to help explain this better. Each letter represents a block of non-branching code. The arrows show the possible execution paths between these blocks.

Example 1: Do while loop

A->B->C
^--'

Execution starts at block A, proceeds to block B and then either loops back to A or proceeds to C. This graph is equivalent to the following pseudocode:

do {
  A
} while(B);
C

Example 2: While loop

.-----v
A->B->C
^--'

This is similar to the do while loop, but the loop determination is in block A. Here's the equivalent pseudocode:

while(A) {
  B
}
C

Example 3: If

A->B->C
'-----^

This is a subgraph of the while loop, so the detection logic needs to distinguish between them. Again the decision whether to execute block B is made in block A. In either case, execution proceeds to block C. Here's the equivalent pseudocode:

if(A) {
  B
}
C

Example 4: If else

   .-----v
A->B  C->D
'-----^

I think you get the idea by now. Here's the equivalent pseudocode:

if(A) {
  B
} else {
  C
}
D

There's still tons to do for the decompiler, but getting this right is one of the central problems that needed to be solved. It's pretty neat to see this working.

Sunday, November 19, 2017

It's been a long time since I've updated this, but I've still been busy doing TI stuff.

There's really no way to reconstruct the development details, but here's a summary of where things currently stand:

GCC and Binutils:
These are pretty mature, and haven't really needed much work. I did find a minor flaw handling RTWP in the assembler, but that probably doesn't warrant a patch release on it's own.

TI Disk OS:
This is an attempt to write a replacement Linux-like OS for the TI-99/4a. I've currently got process management, time functions, wait queues and a primitive device driver layer. There's about 5000 lines for everything.

I got stalled trying to finalize a design for the file system and started working on the decompiler. I haven't done any OS work for a few months now.

Decompiler:
I started working on a general-purpose decompiler, but it's still in early devlopment phase. The idea was to make a platform-dependant disassembler which converts binary to a platform-independant representation. The tool then takes that instruction graph and does register lifetime calculation and flow control recognition to identify variables and flow control structures (if, while, for, etc.) to output high-level pseudocode.

I currently have the disassembly, variable reconstruction and most of the flow control recognition completed. There's about 9000 lines of code written so far, so it's pretty far along, but there's probably about 50 to 60 percent of the work left to do.

Other stuff:
I finished a libc implementation in support of the OS project, which has been released. I also finished a floating-point emulation library, which has also been released. The emulator was written mainly for the challenge, but is still a useful bit of code.

Today's update:
I was working on the decompiler today and was focused on flow control recognition.

The idea is that code is broken down into blocks. Within each block, execution proceeds without branches. The relationships between the blocks is preserved, resulting in a graph of possible execution paths. By analyzing the structure, the flow control operations can be determined. Once this is known, the pseudocode can be properly formatted.

The first phase is to convert binary to a general representation.. The second phase identifies chunks of executable code based on static analysis of the instruction stream. The third phase was the focus of todays work, to convert these unstructured code chunks into the execution graph.

This was done by converting branch instructions to graph nodes and non-branching chunks of code to graph edges between the nodes. Now that this has been done, I can do flow control recognition by doing subgraph isomorphism tests. Once a subgraph has been identified, that flow control type is recorded for later and that subgraph is collapsed to a single edge. By repeating these isomorphism tests, we can identify nested flow control of arbitrary complexity.

I know this has been a little short of details, so I'll try to provide more in later updates.

Tuesday, August 18, 2015

GCC 4.4.0 patch 1.12

Changes this version:

Fixed bug when dividing by constant value
Improved type testing for instruction arguments
Added text to "--version" flag output to show patch version.

Download: gcc-4.4.0-tms9900-1.12-patch.tar.gz

Monday, July 20, 2015

I finally found a good place to put patch version information. Check it out:

eric@lenovo:~/dev/tios/src/gcc_installer/temp/out/bin$ ./tms9900-gcc --version
tms9900-gcc (GCC) 4.4.0 20090421 (TMS9900 patch 1.12)
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I also took a detour to see if I can get GCC 5.2 to work with my changes. So far, the merge is working out better than I thought, but there are still lots of errors. For some reason, the build of gcov fails. This is a profiling tool, and is required as part of the GCC build. This is all unmodified code, so I don't know why it's broken.

The problem seems to be related to the fact that we don't have libc headers for the TMS9900. I'm not sure why this is important since lots of other targets must be configured like this too. Unless I can figure out some way to make progress here, it may be time to give up for a while and get back to making a release.

Saturday, July 18, 2015

I was reading the changelog for more recent versions of GCC, and there's some nice stuff in there. This made me realize how old the gcc version 4.4.0 tree is. On the other hand, I've made a lot of changes to the internals of the compiler that would have to be ported to a newer version.

One thing I noticed is that I might be able to get proper handling of byte quantities by making custom operand types. The Arm back end does this to force values into the desired registers. But as I write this, I remember that the problem is that the compiler reasonably assumes that any place that can hold an int can be used for byte operations too. I  might have to stick with 4.4.0 for a while longer.

After fixing the overly broad parameter constrants for division I thought I should go through the other instructions as well. Unfortunately I found that same problem all over the place. Fixed now.

Sunday, July 12, 2015

I finally found the problem. One of the four descriptions for the division instruction in tms9900.md was wrong. All of the other descriptions forced the numerator to be in a register, this one allowed anything. So this let the constant 3200 be used directly, without the need to be stored in register. When that form was expanded, gen_highpart was called, resulting in the crash.

The error only showed up in this code because it contains lots of calculations, increasing the pressure on the compiler to make more efficient use of the registers. While doing that calculation it saw that the division form did not require a register and acted accordingly. The normal behavior is to put all values in registers before use. For less demanding code, that register usage would be left in place, and no error would be seen.

Once that was fixed, wolfie3 compiles without any problems, and the resulting assembly is correct. Yay!

Before I do any other releases, I want to try to make a cygwin archive for the windows people. I've been neglecting that for a long time.