Since there were no more problems, I've gone back to printf. To make any more progress, I need to get 32-bit division working. I mentioned this approach earlier, but after I went over those notes, I found some problems.
So just to be sure, I derived everything from scratch. The notes here are the highlights, since I don't want to reproduce the three pages I needed for the proof here.
The idea is to transform the fraction to ensure a 16-bit denominator. This allows the use of the DIV instruction, and skips the more commonly-seen but much slower shift-and-subtract method usually used for this kind of math.
Let's start at the beginning. Assume positive numerator and denominator.
num
----- = Q
den
The denominator is 32-bit, so for non-zero results, the numerator must be 32-bits as well. The "n" constant is 0x10000.
a*n+b
------- = Q
c*n+d
Multiply by (1/c)/(1/c), rename to P to acknowledge the truncation error
(a*n+b)/c
----------- = P
n+(d/c)
Multiply by (1/2)/(1/2) to keep 16-bit denominator.
((a*n+b)/2)/c
----------------- = P
(n/2)+((d/2)/c)
Find the remainder to correct for truncation error. Given the requirements on the inputs, P will be no more than 16 bits wide.
(a*n+b) - P(c*n+d) = R
The bulk of the proof was in determining that this was the only correction needed over the entire domain of valid inputs:
If R<0 then decrement P by one
Here's the resulting C code. It's been verified on a PC using Monte Carlo methods over the valid domain. I just need to convert to TMS9900 assembly and count some clocks.
long div(unsigned long num, unsigned long den)
{
unsigned short c=den>>16;
unsigned short d=den&0xFFFF;
unsigned short v=0x8000+(d/2)/c;
unsigned long u=(num/2)/c;
unsigned long p = u/v;
long r = num-p*den;
if(r<0)
{r+=den; p--;}
return(p);
}
This requires one 32x16 division, one 32x32 multiply, a 32-bit add and a 32-bit subtract, All other operations can be made using 16-bit values.
Not too shabby.
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment