I merged Jubatian's code into my uzem140 branch and added my unrelated fix for the linker errors I was getting with the Emscripten build. I also made GDB support a compile-time option, since it contained two conditionals that were executed about 28 million times per second even when GDB wasn't activated with the command line option. This gives a noticeable speedup in the web build, and it also speeds up the native build if you choose to compile it using:
The default behavior is unchanged.
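To illustrate the idea, here is a minimal sketch of how a per-cycle debugger check can be compiled out. It is not the actual uzem code: the `Core` struct, its members, and the `NOGDB` macro name are assumptions for illustration (I'm only guessing that the NOGDB=1 make variable maps to a define of the same name).

```cpp
#include <cstdint>

// Hypothetical stand-in for the emulator core; not uzem's real class.
struct Core {
    bool gdb_enabled = false;      // would be set by a command line option
    std::uint64_t cycles = 0;      // emulated cycles executed so far
    std::uint64_t gdb_checks = 0;  // how often the debugger path was taken

    void step() {
#ifndef NOGDB
        // Default build: this branch is compiled in and evaluated on every
        // step, even when the debugger was never enabled at run time.
        if (gdb_enabled) {
            ++gdb_checks;  // placeholder for breakpoint / single-step handling
        }
#endif
        // With NOGDB defined (assumed macro name) the check above does not
        // exist in the binary at all, so the hot loop has no such branch.
        ++cycles;
    }
};
```

Without the define, behavior is the same as before; with it, the debugger path is simply not compiled in.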
I also tweaked the link-time optimizations for the web build to make that even faster. The web build got fast enough that I actually had to decrease the number of cycles that it executes each 1/60th of a second, because it was running too fast between frames, resulting in jitter and distorted sound. Ideally it would execute 28636360 / 60 emulated cycles per iteration, but that doesn't account for the time that the web browser spends doing other tasks, so it needs to execute more cycles per iteration to compensate.
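For reference, the arithmetic looks roughly like this. This is only a sketch: the function names and the percentage-based compensation are made up for illustration, and the 110 used in the usage note below is not the value used in the web build.

```cpp
#include <cstdint>

constexpr std::uint32_t kClockHz = 28636360;   // emulated clock, from the post
constexpr std::uint32_t kFramesPerSecond = 60;

// Ideal cycle budget per 1/60 s iteration. Integer division drops the
// fractional 2/3 of a cycle.
constexpr std::uint32_t ideal_cycles_per_frame() {
    return kClockHz / kFramesPerSecond;
}

// Hypothetical compensation: run `percent` percent of the ideal budget to
// make up for time the browser spends on other tasks between iterations.
constexpr std::uint32_t compensated_cycles_per_frame(std::uint32_t percent) {
    return static_cast<std::uint32_t>(
        static_cast<std::uint64_t>(ideal_cycles_per_frame()) * percent / 100);
}
```

ideal_cycles_per_frame() comes out to 477272, and compensated_cycles_per_frame(110) would be 524999. The real factor has to be tuned by ear and eye: too high and you get the jitter and distorted sound again.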
On my desktop, I did notice that it runs slower with update_hardware_fast called only in the few places that Jubatian chose than with it called everywhere (225 MHz when called everywhere, versus 215 MHz the way it is now), but I left it the way it is so we have a baseline to build on top of.
Jubatian, can you benchmark the version that's in my PR the way it currently is, but compile it using:
make clean
rm -rf Release
GEN=1 NOGDB=1 ARCH=core2 make release
./uzem bugz.uze -w
(play it for a minute or so)
make clean
USE=1 NOGDB=1 ARCH=core2 make release
./uzem bugz.uze -vnw
(It's very important that for the second compile you switch the GEN=1 to USE=1)
and then add update_hardware_fast() everywhere it can possibly go, and re-benchmark it using the same procedure above? I'm curious if you'll end up seeing the same gains that I saw on my core-avx2 on your core2.
Edit: Uze, I sent you a PR.