I may have to admit defeat on the transfer register scheme. While it works great for simple programs, it causes a lot of problems for more complex ones. Also, I've seen cases where the set and the use of the transfer register are far apart, which would prevent optimizing it away. I'd need a stateful process to clean up the output assembly, and if I could do that, I wouldn't need these elaborate schemes in the first place.
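As a rough illustration of the problem (not the actual compiler code), here is a minimal sketch of a stateless cleanup pass that only looks at two adjacent lines at a time. Everything in it is an assumption made up for the example: the `MOV dst, src` syntax, the name `T` for the transfer register, and the `peephole` function itself. It also assumes `T` is dead after the use and that a direct `MOV dst, src` is a legal instruction.

```python
import re

# Hypothetical syntax: "MOV dst, src", with "T" standing in for the transfer register.
MOV = re.compile(r"^\s*MOV\s+(\w+)\s*,\s*(\w+)\s*$")

def peephole(lines):
    """Fuse 'MOV T, x' immediately followed by 'MOV y, T' into 'MOV y, x'.

    The pass keeps no state beyond the current pair of lines, so it only
    catches a set and a use that sit right next to each other.
    """
    out = []
    i = 0
    while i < len(lines):
        first = MOV.match(lines[i])
        second = MOV.match(lines[i + 1]) if i + 1 < len(lines) else None
        if (first and second
                and first.group(1) == "T"      # set of the transfer register
                and second.group(2) == "T"):   # use on the very next line
            out.append(f"MOV {second.group(1)}, {first.group(2)}")
            i += 2
        else:
            out.append(lines[i])
            i += 1
    return out

# Adjacent set/use: the transfer register disappears.
print(peephole(["MOV T, A", "MOV B, T"]))
# Set and use separated by another instruction: nothing is removed.
print(peephole(["MOV T, A", "ADD C, D", "MOV B, T"]))
```

The second call returns its input unchanged. Catching that case would mean tracking what happens to `T` between the set and the use, which is the stateful process mentioned above.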
So I guess I'm back to using 8-bit registers again. Oh well.