Insomnia Labs: 2015

Tuesday, August 18, 2015

GCC 4.4.0 patch 1.12

Changes this version:

Fixed bug when dividing by constant value
Improved type testing for instruction arguments
Added text to "--version" flag output to show patch version.

Download: gcc-4.4.0-tms9900-1.12-patch.tar.gz

Monday, July 20, 2015

I finally found a good place to put patch version information. Check it out:

eric@lenovo:~/dev/tios/src/gcc_installer/temp/out/bin$ ./tms9900-gcc --version
tms9900-gcc (GCC) 4.4.0 20090421 (TMS9900 patch 1.12)
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I also took a detour to see if I can get GCC 5.2 to work with my changes. So far, the merge is working out better than I thought, but there are still lots of errors. For some reason, the build of gcov fails. This is a profiling tool, and is required as part of the GCC build. This is all unmodified code, so I don't know why it's broken.

The problem seems to be related to the fact that we don't have libc headers for the TMS9900. I'm not sure why this is important since lots of other targets must be configured like this too. Unless I can figure out some way to make progress here, it may be time to give up for a while and get back to making a release.

Saturday, July 18, 2015

I was reading the changelog for more recent versions of GCC, and there's some nice stuff in there. This made me realize how old the gcc version 4.4.0 tree is. On the other hand, I've made a lot of changes to the internals of the compiler that would have to be ported to a newer version.

One thing I noticed is that I might be able to get proper handling of byte quantities by making custom operand types. The Arm back end does this to force values into the desired registers. But as I write this, I remember that the problem is that the compiler reasonably assumes that any place that can hold an int can be used for byte operations too. I might have to stick with 4.4.0 for a while longer.

After fixing the overly broad parameter constraints for division, I thought I should go through the other instructions as well. Unfortunately, I found that same problem all over the place. Fixed now.

Sunday, July 12, 2015

I finally found the problem. One of the four descriptions for the division instruction in tms9900.md was wrong. All of the other descriptions forced the numerator to be in a register, but this one allowed anything. This let the constant 3200 be used directly, without first being stored in a register. When that form was expanded, gen_highpart was called, resulting in the crash.

The error only showed up in this code because it contains lots of calculations, increasing the pressure on the compiler to make more efficient use of the registers. While doing that calculation it saw that the division form did not require a register and acted accordingly. The normal behavior is to put all values in registers before use. For less demanding code, that register usage would be left in place, and no error would be seen.

With that fixed, wolfie3 compiles without any problems, and the resulting assembly is correct. Yay!

Before I do any other releases, I want to try to make a Cygwin archive for the Windows people. I've been neglecting that for a long time.

Saturday, July 11, 2015

I finally got the new laptop working, and I started looking at the compiler crash for wolfie3.c. The problem looks like a bad subreg expression was added very early in the compilation process. Unfortunately, there is no output from the "-da" debug flag describing what that expression might be. I have no choice but to trace the error back from the location indicated by the failed assertion to a point where I can see what went wrong.

wolfie3.c: In function ‘cast_rays’:
wolfie3.c:278: internal compiler error: in subreg_highpart_offset, at emit-rtl.c:1308

The C code at that line looks like this:

#define SCREEN_DISTANCE 3200
unsigned int distance;
...
int sliceheight = SCREEN_DISTANCE / distance;

I changed all variables to unsigned int, thinking the difference in signedness was the problem. Nope, no change in error.

After adding some extra debug output, I think I found the problem.

EMW>> gen_highpart : (const_int 3200 [0xc80])
EMW>> subreg_highpart_offset : in=0 out=2 diff=-2

Someone is trying to take a subreg of a constant, which has no mode size. As a result, subreg_highpart_offset gets confused and aborts the compilation. Seeing that constant value is encouraging, since that shows I'm on the right track. Since this is a condition which should never happen, it's likely I wrote the code causing this problem. That narrows the scope quite a bit.

Thursday, June 25, 2015

I was checking AtariAge today and one of the comments there had a really good idea. There should be some way to identify the patch revision in the compiler binary. I'll need to look into how to do that, but it's a really good idea.

Another idea: modify the gcc_installer script to find and use the most recent patches. It would be nice to not have to repackage the installer for every future patch. Also, there's no easy way to tell without unzipping it what patch versions are included in the distribution archive.

Also, there have been a lot of requests for a Windows installer. Can we get away with just installing cygwin and building from there? Can I get the installer to automatically adapt to the build environment? That would be really nice as well.

Monday, June 15, 2015

GCC 4.4.0 patch 1.11

Changes this version:

Fixed compilation error due to missing include in tms9900.h.
Fixed problem declaring global variables, they were not always exported
Some instruction sizes were defined incorrectly, causing assembly errors.
Fixed conditional jump displacement limits, they were too small.
Added compilation pass to add needed SWPB instructions.

Download: gcc-4.4.0-tms9900-1.11-patch.tar.gz

Saturday, June 13, 2015

More progress. I've added some debug output for this new pass. The idea is to follow the pattern of the other passes. This will provide some transparency in the compilation process as well as some breadcrumbs to follow in case things all go horribly wrong.

Here's an example instruction from a test program I used to verify the code which originally handled these subreg problems:

(insn 11 10 12 2 tursi5.c:6 (set (reg:QI 21 [ D.1197 ])
        (plus:QI (subreg:QI (reg:HI 25 [ Round+-1 ]) 1)
            (const_int 48 [0x30]))) 59 {addqi3} (expr_list:REG_DEAD (reg:HI 25 [ Round+-1 ])
        (nil)))

This is equivalent to the C expression "C=A+(char)B;"

In this instruction, we're trying to get the sum of a one-byte value and the least significant byte of a different two-byte value. On any other processor this wouldn't be a problem, and we could just ignore the subreg part since the addition instruction would just ignore the other byte. Sadly for the TMS9900, we need to preserve the subreg because if the two-byte value is stored in a register, we need to move the least significant byte into the most significant position before invoking the AB (Add Bytes) instruction.

So this pass now extracts that subreg expression into a separate instruction before the add to handle the relocation if needed, making a sequence like "D=(char)B; C=A+D". Here is the debug output describing this action.

From tursi5.c.171r.tms9900_subreg:

Modifying insn 11, extracting subreg to new instruction
scanning new insn with uid = 21.
New sequence:
(set:QI (reg:QI 29)
    (subreg:QI (reg:HI 25 [ Round+-1 ]) 1))
(insn 11 21 12 2 tursi5.c:6 (set (reg:QI 21 [ D.1197 ])
        (plus:QI (reg:QI 29)
            (const_int 48 [0x30]))) 59 {addqi3} (expr_list:REG_DEAD (reg:HI 25 [ Round+-1 ])
        (nil)))
Now, the RTL expressions can map more directly into opcodes, and if needed a SWPB instruction will be inserted to handle the type conversion before the addition.
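
To illustrate why the byte position matters, here is a minimal C model of the two instructions involved. It is only a sketch, not code from the patch: the function names swpb, ab, and example are made up for this illustration, and the model assumes that a byte instruction operating on a register uses that register's most significant byte and leaves the other byte alone.

```c
#include <assert.h>
#include <stdint.h>

/* Model of SWPB: swap the two bytes of a 16-bit register. */
static uint16_t swpb(uint16_t r)
{
    return (uint16_t)((r << 8) | (r >> 8));
}

/* Model of AB on registers: add the high byte of src to the high byte
 * of dst. The low byte of dst is left unchanged. */
static uint16_t ab(uint16_t src, uint16_t dst)
{
    uint8_t sum = (uint8_t)((src >> 8) + (dst >> 8));
    return (uint16_t)((sum << 8) | (dst & 0xFF));
}

/* Hypothetical values for C = A + (char)B, with A = 0x30 held as a byte
 * (high half of its register) and B = 0x1205 held as a 16-bit value. */
static void example(void)
{
    uint16_t a = 0x3000;
    uint16_t b = 0x1205;

    /* Without SWPB, AB picks up 0x12, the wrong byte of B. */
    assert((ab(b, a) >> 8) == 0x42);

    /* With SWPB first, AB sees 0x05, the least significant byte of B. */
    assert((ab(swpb(b), a) >> 8) == 0x35);
}
```

With these made-up values, skipping the swap adds 0x12 instead of 0x05, which is the kind of silent wrong answer the extracted subreg instruction is meant to prevent.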

OK, enough chatting, get to patching.

Friday, June 5, 2015

I've been spending some time thinking about the SWPB problem, and I think the only thing left to try is to add a new compilation pass before register assignment to fix these problem instructions. I've tried to avoid this, since it is very intrusive. Unfortunately, I can't think of a better method, and to be honest everything else I've tried has failed. OK, let's do it.

So I've got the framework of a new compilation pass in place, and it seems to be called where I wanted it in the compile process. This was placed between the pass checking for assembly constraints and the pass handling register assignment, which should be the right place. Unfortunately, this has increased the debug ID numbers of some nearby passes. This is kind of annoying, but not really a problem. For example, *.172.ira is now *.173.ira.

OK, I got the new compilation pass done. So far it seems to work. Libgcc can be built without errors, and a test program known to require an additional SWPB instruction works as well. At this point I think I just need to do some cleanup and a little more testing. After that I can do a release. Finally.

Sunday, March 1, 2015

I've been putting off working on GCC because I know the next thing to work on is the bad subreg instruction. This has always been a long, frustrating process, so I'm not excited to dive back into this. Oh well, this isn't going to get any easier...

I won't copy my debug notes here, since there were a lot of false starts and dead ends. Much frustration was had, and several impolite words were directed at the fine people at Texas Instruments for their chosen method of working with byte values.

Finally, I found the source of the invalid instruction. The problem is that the code I added to insert SWPB instructions when needed to convert between data formats can emit multiple SWPB instructions.

During development, I noticed that the compiler evaluates the same chunks of code several times before moving on. To prevent multiple SWPB instructions from being inserted, I added a check to make sure this wouldn't happen. If the location of the last inserted SWPB is the same as the requested location, do nothing.

This worked out great for the test programs I wrote, since they only had one place where SWPBs needed to be inserted. When compiling libgcc, however, there were two places where we needed to do this. In this case, it was possible to insert multiple instructions, since the last inserted location ping-pongs between these two places.
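
The ping-pong effect is easy to reproduce in a toy model. The sketch below is not the code from the patch: the function names, the integer site IDs, and the "remember every site" variant are assumptions made purely for illustration, and the second variant is shown only as a contrast, not as the fix that was eventually used. Each function counts how many SWPBs a guard would insert for a given sequence of visited locations.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SITES 16  /* hypothetical upper bound on distinct sites */

/* Guard as described above: skip only if the requested site equals
 * the most recently inserted one. */
static int insertions_last_only(const int *visits, int n)
{
    int last = -1;
    int count = 0;

    for (int i = 0; i < n; i++) {
        if (visits[i] == last)
            continue;
        last = visits[i];
        count++;
    }
    return count;
}

/* Contrast case: remember every site that already received a SWPB. */
static int insertions_seen_set(const int *visits, int n)
{
    bool seen[MAX_SITES] = { false };
    int count = 0;

    for (int i = 0; i < n; i++) {
        int site = visits[i];
        assert(site >= 0 && site < MAX_SITES);
        if (seen[site])
            continue;
        seen[site] = true;
        count++;
    }
    return count;
}
```

For the visit order 3, 3, 7, 7 both guards insert two instructions. For 3, 7, 3, 7 the last-only guard inserts four, because the remembered location alternates between the two sites, while the seen-set guard still inserts two.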

All this means that the handling of subreg expressions will be much harder and require much more complicated code than I had originally expected. More research required, I think.

Friday, January 23, 2015

I did a test to confirm that the corrected instruction sizes ensure that there are no more issues with invalid displacements being used with conditional jumps. The good news is that I can see the expected cutoff where the compiler switches from using a conditional jump (jeq ADDR) to a compound jump (jne +4; b @addr) to execute the right branch. The examples below should make this clearer.

While testing, I realized that the maximum displacement of conditional jumps is wrong. I forgot that the instruction is encoded with half the true signed displacement. This gives us a range between -256 and +254 bytes counted from the start of the next instruction, not the +-128 I was using.

GCC calculates the displacement from the start of the instruction, while the processor uses the end. This means that the size of whatever jump code is used must be taken into account when determining the displacement limits.

NEG_DISP = 256 - CODE_SIZE
POS_DISP = 254 + CODE_SIZE

Here are the possible patterns for jump instructions and their maximum displacements from the initial PC. When multiple jumps are used, the largest displacement which satisfies all instructions is used.

jlt               : [-254 .. 256]
jlt; jeq          : [-252 .. 256]
jlt; b @ADDR      : [-254 .. 256]
jlt; jeq; b @ADDR : [-252 .. 256]
jmp               : [-254 .. 256]
b @ADDR           : No limits

So to play it safe and prevent a situation where changing from one form to another causes an illegal instruction, let's just use the smallest common range of [-252 .. 256].
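
The jlt and jlt; jeq rows of the table can be reproduced with a few lines of C. This is a sketch under the assumptions stated in this entry (each short jump is 2 bytes long and reaches -256 to +254 bytes from the start of the next instruction). The names short_jump_range and fits_short_jump are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define JUMP_SIZE    2     /* bytes per short jump instruction */
#define HW_MIN_DISP  (-256) /* reach relative to the next instruction */
#define HW_MAX_DISP  254

struct jump_range {
    int min;
    int max;
};

/* Range of displacements, measured from the start of the sequence,
 * that every one of n_jumps consecutive short jumps can reach. */
static struct jump_range short_jump_range(int n_jumps)
{
    assert(n_jumps >= 1);

    /* First jump ends JUMP_SIZE bytes after the start of the sequence. */
    struct jump_range r = { HW_MIN_DISP + JUMP_SIZE, HW_MAX_DISP + JUMP_SIZE };

    for (int i = 1; i < n_jumps; i++) {
        int end = (i + 1) * JUMP_SIZE;  /* where jump i ends */
        int lo = HW_MIN_DISP + end;
        int hi = HW_MAX_DISP + end;

        /* Keep only what all jumps so far can reach. */
        if (lo > r.min)
            r.min = lo;
        if (hi < r.max)
            r.max = hi;
    }
    return r;
}

/* True if a target at disp can be reached by all jumps in the sequence. */
static bool fits_short_jump(int disp, int n_jumps)
{
    struct jump_range r = short_jump_range(n_jumps);
    return disp >= r.min && disp <= r.max;
}
```

Here short_jump_range(1) gives [-254 .. 256] and short_jump_range(2) gives [-252 .. 256], matching the single-jump and two-jump rows above. A target that fails fits_short_jump would need the compound form ending in b @ADDR.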

OK, new limits are in and tested. Looks good.

Well, back to wrangling subregs. Ugh.

Sunday, January 11, 2015

I've been neglecting TI stuff for a while, so it's time to get busy.

I got an update from Tursi on AtariAge. He mentioned that he saw a problem with jump offset limits being violated. I looked at the machine description and noticed that all the length attributes were measured in words, not bytes. I'm not sure if this will fix the problem he saw, but this needed to be done anyway.
