Over the last few weeks, I worked a bit on improving my mupen64 port. Here are the things I did.
As SM64 started to work well, I switched to the Zelda ROM for my testing. It is a much more complex game to emulate graphics-wise, so obviously it was completely buggy and painfully slow on the first run. The biggest problem was that, unlike SM64, most of the rendering was done with my slowest pixel shader, and fixing the bugs would have made it even slower, so I decided to do a complete rewrite of the shader. This time I designed it around something I had just discovered: constant boolean registers, which allow flow control without a big performance hit. It took me a few tries to get it fast and accurate, but now that pixel shader is almost as fast as the old one in simple cases, while being more accurate and much faster in complex cases. I also made my old shader as accurate as I could; it is now used for the rare cases the new one can't emulate. With a few more fixes to libxenon and the emulator (implementing 2D rendering, for example), this makes Zelda reasonably fast and playable.
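To give a rough idea of why constant booleans help, here is a small CPU-side sketch in C. It is not the actual shader: the names (`DrawFlags`, `shade_pixel`, ...) are made up for illustration, and feeding the first cycle's result into the second cycle's A input is a simplifying assumption. The point is only that the flags are set once per draw call, so every pixel takes the same path and the cheap case skips the expensive work. The combiner formula itself, (A - B) * C + D, is the standard N64 one.

```c
#include <stdbool.h>

/* One channel of N64 color combiner inputs: result = (a - b) * c + d. */
typedef struct { float a, b, c, d; } CombinerInputs;

/* Hypothetical per-draw-call flags, playing the role of constant boolean
 * registers: they are the same for every pixel of a draw call. */
typedef struct {
    bool two_cycle;   /* run a second combiner pass */
    bool alpha_test;  /* discard pixels whose alpha is below a threshold */
} DrawFlags;

static float combine(CombinerInputs in)
{
    return (in.a - in.b) * in.c + in.d;
}

/* Shades one channel of one pixel. Returns false if the pixel is discarded.
 * Assumption for this sketch: in two-cycle mode, the first cycle's result
 * replaces the A input of the second cycle. */
bool shade_pixel(const DrawFlags *flags,
                 CombinerInputs cycle1, CombinerInputs cycle2,
                 float alpha, float alpha_ref, float *out)
{
    /* Uniform branch: all pixels of the draw call agree on it. */
    if (flags->alpha_test && alpha < alpha_ref)
        return false;

    float color = combine(cycle1);

    /* Simple draws skip the second pass entirely. */
    if (flags->two_cycle) {
        cycle2.a = color;
        color = combine(cycle2);
    }

    *out = color;
    return true;
}
```

With both flags off this is a single multiply-add per channel, which matches the "almost as fast on simple cases" behaviour described above, at least in spirit.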
The next game was Mario Kart 64. This time it was fast and looked good on the first try, but it crashed after a few races with an 'out of memory' message. It turns out something very important was missing from the libxenon 3D driver: a way to free what you allocate (textures, vertex buffers, ...)! So I replaced the very basic GFX memory allocator with a malloc-like one I found in the libxenon source code, and modified the emulator's texture cache to actually free old textures when needed.
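For the curious, the "free old textures when needed" idea can be sketched like this in C. This is only an illustration under assumptions, not the emulator's real cache: all names (`TextureCache`, `cache_get`, ...) are hypothetical, plain `malloc`/`free` stand in for the GFX memory allocator, and least-recently-used eviction is just one reasonable policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SLOTS 64

/* Hypothetical cache entry; malloc'd memory stands in for a GPU texture. */
typedef struct {
    uint32_t key;        /* e.g. a hash of the N64 texture address/format */
    void    *pixels;
    size_t   size;
    uint64_t last_used;  /* value of the cache tick at the last access */
    int      in_use;
} CacheEntry;

typedef struct {
    CacheEntry slots[CACHE_SLOTS];
    size_t     bytes_used;
    size_t     budget;   /* maximum bytes the cache may hold */
    uint64_t   tick;
} TextureCache;

void cache_init(TextureCache *c, size_t budget)
{
    memset(c, 0, sizeof *c);
    c->budget = budget;
}

static CacheEntry *find_entry(TextureCache *c, uint32_t key)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (c->slots[i].in_use && c->slots[i].key == key)
            return &c->slots[i];
    return NULL;
}

static CacheEntry *find_free_slot(TextureCache *c)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (!c->slots[i].in_use)
            return &c->slots[i];
    return NULL;
}

/* Frees the least recently used texture. Returns 0 if the cache is empty. */
static int evict_oldest(TextureCache *c)
{
    CacheEntry *oldest = NULL;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        CacheEntry *e = &c->slots[i];
        if (e->in_use && (!oldest || e->last_used < oldest->last_used))
            oldest = e;
    }
    if (!oldest)
        return 0;
    assert(c->bytes_used >= oldest->size);
    free(oldest->pixels);
    c->bytes_used -= oldest->size;
    memset(oldest, 0, sizeof *oldest);
    return 1;
}

int cache_contains(TextureCache *c, uint32_t key)
{
    return find_entry(c, key) != NULL;
}

/* Returns the texture memory for key, allocating it (and evicting old
 * textures if necessary) on a miss. Returns NULL if it cannot fit. */
void *cache_get(TextureCache *c, uint32_t key, size_t size)
{
    c->tick++;
    CacheEntry *e = find_entry(c, key);
    if (e) {
        e->last_used = c->tick;
        return e->pixels;
    }
    if (size == 0 || size > c->budget)
        return NULL;

    /* Evict until there is both a free slot and room in the budget. */
    CacheEntry *slot;
    while ((slot = find_free_slot(c)) == NULL ||
           c->bytes_used + size > c->budget) {
        if (!evict_oldest(c))
            return NULL;
    }

    /* If the allocator itself is out of memory, evict and retry. */
    void *p;
    while ((p = malloc(size)) == NULL) {
        if (!evict_oldest(c))
            return NULL;
    }

    slot->key = key;
    slot->pixels = p;
    slot->size = size;
    slot->last_used = c->tick;
    slot->in_use = 1;
    c->bytes_used += size;
    return p;
}

void cache_clear(TextureCache *c)
{
    while (evict_oldest(c))
        ;
}
```

The important part is that eviction is only possible once the underlying allocator can actually give memory back, which is exactly what was missing before.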
I think it's time for a new video so here it is :)