Since there were no more problems, I've gone back to printf. To make any more progress, I need to get 32-bit division working. I mentioned this approach earlier, but after I went over those notes, I found some problems.

So just to be sure, I derived everything from scratch. The notes here are the highlights, since I don't want to reproduce the three pages I needed for the proof here.

The idea is to transform the fraction to ensure a 16-bit denominator. This allows the use of the DIV instruction, and skips the more commonly-seen but much slower shift-and-subtract method usually used for this kind of math.

Let's start at the beginning. Assume positive numerator and denominator.

num

----- = Q

den

The denominator is 32-bit, so for non-zero results, the numerator must be 32-bits as well. The "n" constant is 0x10000.

a*n+b

------- = Q

c*n+d

Multiply by (1/c)/(1/c), rename to P to acknowledge the truncation error

(a*n+b)/c

----------- = P

n+(d/c)

Multiply by (1/2)/(1/2) to keep 16-bit denominator.

((a*n+b)/2)/c

----------------- = P

(n/2)+((d/2)/c)

Find the remainder to correct for truncation error. Given the requirements on the inputs, P will be no more than 16 bits wide.

(a*n+b) - P(c*n+d) = R

The bulk of the proof was in determining that this was the only correction needed over the entire domain of valid inputs:

If R<0 then decrement P by one

Here's the resulting C code. It's been verified on a PC using Monte Carlo methods over the valid domain. I just need to convert to TMS9900 assembly and count some clocks.

long div(unsigned long num, unsigned long den)

{

unsigned short c=den>>16;

unsigned short d=den&0xFFFF;

unsigned short v=0x8000+(d/2)/c;

unsigned long u=(num/2)/c;

unsigned long p = u/v;

long r = num-p*den;

if(r<0)

{r+=den; p--;}

return(p);

}

This requires one 32x16 division, one 32x32 multiply, a 32-bit add and a 32-bit subtract, All other operations can be made using 16-bit values.

Not too shabby.

Subscribe to:
Post Comments (Atom)

## No comments:

## Post a Comment