No problem. Interface design is always messy and prone to sparking heated debates.
Yes, I have an idea for later: to rip the innards out for a plain C port. I am quite familiar with the language and with how to solve things nice and proper in it (that's also my profession). What I have in mind for later is creating a very light Uzebox emulator, not to compete with this one, but to serve a very specific purpose: just gaming, particularly under Emscripten, to get a nice fast emulator so you could publish your games for everyone.
A major thing on my mind is a bit of "butchering" in this regard: implementing the common graphics modes in native C code instead of running AVR emulation, so a huge chunk of instructions can be skipped in games using those modes. This would be very useful for Emscripten usage, to demonstrate specific games. Another use case would be compiling this emulator with a game embedded, so you could also distribute native binaries, hopefully broadening the audience to include people who don't necessarily have an interest in tinkering with hardware, just in retro-gaming.
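To make the "native graphics mode" idea a bit more concrete, here is a minimal sketch of rendering one scanline of a tile mode directly in C, rather than emulating the AVR instructions that would normally generate it. Everything in it is an assumption for illustration only: the name render_scanline, the 8x8 tile size, the one-byte-per-pixel tile storage and the VRAM layout of one tile index per cell are hypothetical and do not describe any actual Uzebox video mode or anything in Uzem.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tile geometry, chosen only for this sketch. */
enum { TILE_W = 8, TILE_H = 8 };

/*
 * Render scanline y of a tile-mapped screen straight into a pixel buffer.
 *   out   : receives cols * TILE_W pixels, one byte each
 *   vram  : one tile index per cell, stored row by row (assumed layout)
 *   tiles : tile pixel data, TILE_W * TILE_H bytes per tile (assumed layout)
 *   cols  : number of tile columns per row
 */
static void render_scanline(uint8_t *out, const uint8_t *vram,
                            const uint8_t *tiles, size_t cols, size_t y)
{
    const uint8_t *row = vram + (y / TILE_H) * cols; /* tile row for this line */
    size_t line = y % TILE_H;                        /* pixel line inside tile */

    for (size_t c = 0; c < cols; c++) {
        /* Start of this tile's pixel line in the tile data. */
        const uint8_t *src = tiles + ((size_t)row[c] * TILE_H + line) * TILE_W;

        for (size_t x = 0; x < TILE_W; x++)
            out[c * TILE_W + x] = src[x];
    }
}
```

The emulator would call something like this once per scanline when it detects that the game uses a known mode, and skip the corresponding AVR cycles instead of interpreting them one by one.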
Of course this is just daydreaming for now. I am pretty sure I could go down this road, but right now I am more interested in tinkering with the Uzebox itself. I have buried myself deep in assembler, hopefully in the process of creating a shiny new graphics mode.
For performance, I doubt much more is possible unless you JIT compile the thing. Uzem's code may be a mess in parts, but the AVR emulation dominates its cycle budget, and I already understand well all the parts of that which matter. My box spends some 35 cycles on an AVR instruction. The AVR instruction table is simply built so that you need to go through 2 branches to emulate one instruction, and that's two pipeline flushes. 28MHz with most instructions taking 1 cycle is steep. Then it is also necessary to advance the other parts of the hardware with proper timing.
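The "2 branches per instruction" point can be sketched with a toy decoder. This is not Uzem's or simavr's code; toy_avr and toy_step are hypothetical names, only a handful of opcodes are handled, and status flag updates are left out. The opcode bit patterns used (AND, EOR, OR, MOV sharing the top nibble 0010, LDI using 1110) follow the AVR instruction set.

```c
#include <stdint.h>

/* Hypothetical minimal CPU state for this sketch. */
typedef struct {
    uint8_t  r[32]; /* general purpose registers r0..r31 */
    uint16_t pc;    /* program counter, in 16-bit words */
} toy_avr;

/* Execute one instruction; returns 1 if handled, 0 if not in this toy. */
static int toy_step(toy_avr *cpu, const uint16_t *flash)
{
    uint16_t op = flash[cpu->pc++];
    unsigned d = (op >> 4) & 0x1F;                 /* destination register */
    unsigned r = ((op >> 5) & 0x10) | (op & 0x0F); /* source register */

    switch (op >> 12) {                /* first hard-to-predict branch */
    case 0x2:
        switch ((op >> 10) & 3) {      /* second hard-to-predict branch */
        case 0: cpu->r[d] &= cpu->r[r]; return 1; /* AND (flags omitted) */
        case 1: cpu->r[d] ^= cpu->r[r]; return 1; /* EOR (flags omitted) */
        case 2: cpu->r[d] |= cpu->r[r]; return 1; /* OR  (flags omitted) */
        case 3: cpu->r[d]  = cpu->r[r]; return 1; /* MOV */
        }
        break;
    case 0xE:                          /* LDI Rd, K with Rd in r16..r31 */
        cpu->r[16 + ((op >> 4) & 0x0F)] =
            (uint8_t)(((op >> 4) & 0xF0) | (op & 0x0F));
        return 1;
    }
    return 0; /* everything else is not handled in this sketch */
}
```

Both switches would normally compile to indirect jumps, and the target of each depends on the opcode just fetched, which is where the two potential pipeline flushes come from.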
I checked simavr, the core of it is this file:
https://github.com/buserror/simavr/blob ... sim_core.c
If you stroll through it, it will look almost suspiciously familiar. And it has deeper switches than our emulator, including ones which cannot be unfolded into jump tables. So this part should definitely perform worse. The core's performance also depends on the game of course, notably on how the video mode of the game is implemented. If it runs through that maze always on different routes, then the poor CPU will be flushing its pipeline like there's no tomorrow while emulating, while if it uses some quite repetitive instruction pattern, the branch predictor might remember it and allow running it much faster.
The 10x faster figure is something I simply won't believe unless they have a monster of a CPU to run it, and/or the AVR code itself is laid out so the branch predictor can memorize the paths through it. It would mean 160MHz (!), since the Gamebuino is clocked at 16MHz. If I applied it to my box, that would mean it can emulate one AVR instruction in 12-14 CPU cycles, hardware and all! Now cram all the flag logic that an ADD, for example, produces into that, and two pipeline flushes... Or even just one.
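To show what "all the flag logic an ADD produces" amounts to, here is a minimal sketch of the AVR ADD instruction with its status register update, following the flag definitions in the AVR instruction set (H, S, V, N, Z, C). The function name avr_add and the way SREG is passed in are assumptions made for this sketch, not how Uzem or simavr structure it.

```c
#include <stdint.h>

/* SREG bit masks as laid out on the AVR. */
enum {
    SREG_C = 0x01, SREG_Z = 0x02, SREG_N = 0x04,
    SREG_V = 0x08, SREG_S = 0x10, SREG_H = 0x20
};

/* Hypothetical helper: returns Rd + Rr and updates the six ADD flags. */
static uint8_t avr_add(uint8_t rd, uint8_t rr, uint8_t *sreg)
{
    uint8_t res = (uint8_t)(rd + rr);

    /* Bit n is set if there was a carry out of bit n. */
    uint8_t carries = (uint8_t)((rd & rr) | (rr & ~res) | (~res & rd));
    /* Bit 7 is set on two's complement overflow. */
    uint8_t ovf = (uint8_t)((rd & rr & ~res) | (~rd & ~rr & res));

    /* Keep T and I, clear the flags ADD is about to define. */
    uint8_t f = (uint8_t)(*sreg & ~(SREG_C | SREG_Z | SREG_N |
                                    SREG_V | SREG_S | SREG_H));

    if (carries & 0x80) f |= SREG_C; /* carry out of bit 7 */
    if (carries & 0x08) f |= SREG_H; /* half carry out of bit 3 */
    if (ovf & 0x80)     f |= SREG_V;
    if (res & 0x80)     f |= SREG_N;
    if (res == 0)       f |= SREG_Z;
    if (((f & SREG_N) != 0) != ((f & SREG_V) != 0))
        f |= SREG_S;                 /* S = N xor V */

    *sreg = f;
    return res;
}
```

Even written compactly, that is a dozen or so dependent operations on top of fetch, decode and dispatch, which is why a budget of 12-14 host cycles per emulated instruction looks so tight.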
The modularization of the simavr project does of course seem nice, at least from what I got at first glance. But it may have more overhead than a gaming-oriented Uzebox emulator would reasonably demand, where it is not necessary to emulate every little component of the AVR.
Anyway, I drifted a bit off. I don't think Uzem should be ditched in favor of something else; the Uzebox-oriented core is quite worth keeping once the cruft is peeled off of it: it is a good base for producing a speedy gaming-oriented emulator (even if for development and debugging you would rather move on to something more beefy).