Sunday, November 19, 2017

It's been a long time since I've updated this, but I've still been busy doing TI stuff.

There's really no way to reconstruct the development details, but here's a summary of where things currently stand:

GCC and Binutils:
These are pretty mature, and haven't really needed much work. I did find a minor flaw handling RTWP in the assembler, but that probably doesn't warrant a patch release on it's own.

TI Disk OS:
This is an attempt to write a replacement Linux-like OS for the TI-99/4a. I've currently got process management, time functions, wait queues and a primitive device driver layer. There's about 5000 lines for everything.

I got stalled trying to finalize a design for the file system and started working on the decompiler. I haven't done any OS work for a few months now.

Decompiler:
I started working on a general-purpose decompiler, but it's still in early devlopment phase. The idea was to make a platform-dependant disassembler which converts binary to a platform-independant representation. The tool then takes that instruction graph and does register lifetime calculation and flow control recognition to identify variables and flow control structures (if, while, for, etc.) to output high-level pseudocode.

I currently have the disassembly, variable reconstruction and most of the flow control recognition completed. There's about 9000 lines of code written so far, so it's pretty far along, but there's probably about 50 to 60 percent of the work left to do.

Other stuff:
I finished a libc implementation in support of the OS project, which has been released. I also finished a floating-point emulation library, which has also been released. The emulator was written mainly for the challenge, but is still a useful bit of code.

Today's update:
I was working on the decompiler today and was focused on flow control recognition.

The idea is that code is broken down into blocks. Within each block, execution proceeds without branches. The relationships between the blocks is preserved, resulting in a graph of possible execution paths. By analyzing the structure, the flow control operations can be determined. Once this is known, the pseudocode can be properly formatted.

The first phase is to convert binary to a general representation.. The second phase identifies chunks of executable code based on static analysis of the instruction stream. The third phase was the focus of todays work, to convert these unstructured code chunks into the execution graph.

This was done by converting branch instructions to graph nodes and non-branching chunks of code to graph edges between the nodes. Now that this has been done, I can do flow control recognition by doing subgraph isomorphism tests. Once a subgraph has been identified, that flow control type is recorded for later and that subgraph is collapsed to a single edge. By repeating these isomorphism tests, we can identify nested flow control of arbitrary complexity.

I know this has been a little short of details, so I'll try to provide more in later updates.

1 comment:

  1. Excellent! I do think I've discovered another bug or two in the compiler, I've documented them over at atariage. Looking forward to reading more about your OS.

    ReplyDelete