Get that emu faster

The Uzebox now has a fully functional emulator! Download and discuss it here.

Re: Get that emu faster

Post by Artcfox »

Jubatian wrote: Then you can repackage them again, since Arkanoid now flies at 50MHz on my box!

I implemented what CunningFellow also mentioned for Timer 1, and it bumped up performance a lot for that game; I guess it will do so for most of the rest as well. Hopefully I didn't introduce any nasties with it, I was careful :)
Wow, this is fast! :shock:

Even without that change, I figured out the correct settings to get the rendering speed up to max in the browser, and it's running Bugz at what feels like over 40MHz emulation speed for me. It's too fast for me to play my own game. :lol:

The problem now is nothing is limiting the emulation speed, because I had to remove this line:

Code: Select all

while (audioRing.isFull())SDL_Delay(1);
in order for the SDL2 version to work under Emscripten, because that wait relies on threads and I don't think threads are well supported yet. If I tell it that I want the main loop to run 60 times a second, it doesn't use requestAnimationFrame to render at max speed, and it's choppy and crappy and doesn't run in Chrome. But if I tell it 0, like the warning message is telling me to use, then it runs smoothly even in Chrome.

I tested out your latest change, combined with my new changes, and I can't believe I'm saying this, but now the trick is going to be figuring out how to slow down the web version, because it's way too fast now. Even in Chrome. :lol:

If you call SDL_Delay(1), or usleep(1), or just spin waiting for the ring buffer to not be full, then everything will just hang. I added back the SDL_RENDERER_PRESENTVSYNC flag, which I thought would limit the speed, but it seems to be ignoring that.
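
One non-blocking way to cap the speed can be sketched like this. It is only an illustration under assumptions, not necessarily what the web port does: instead of sleeping until the audio ring drains, each main-loop callback runs only as many AVR cycles as the elapsed wall-clock time allows and then returns to the browser. The `CycleBudget` struct, its member names, and the 100 ms catch-up cap are all made up for the example; 28.63636 MHz is the Uzebox clock.

```cpp
#include <cstdint>

// Uzebox AVR clock in Hz (28.63636 MHz).
constexpr std::uint64_t kAvrClockHz = 28636360;
// Assumed cap so a long pause (e.g. a hidden browser tab) does not
// trigger a huge burst of catch-up emulation.
constexpr std::uint64_t kMaxCatchUpMs = 100;

// Hypothetical helper: turns elapsed wall-clock milliseconds into the
// number of AVR cycles to emulate before returning to the caller.
struct CycleBudget {
    std::uint64_t remainder = 0; // fractional cycles carried to the next call

    std::uint64_t cyclesFor(std::uint64_t elapsedMs) {
        if (elapsedMs > kMaxCatchUpMs) {
            elapsedMs = kMaxCatchUpMs;
        }
        // Work in units of (ms * Hz) so no fraction of a cycle is lost.
        const std::uint64_t scaled = elapsedMs * kAvrClockHz + remainder;
        remainder = scaled % 1000;
        return scaled / 1000;
    }
};
```

The callback would measure the milliseconds since its previous invocation, ask `cyclesFor()` for the budget, emulate that many cycles, and return without ever calling SDL_Delay or spinning.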
Last edited by Artcfox on Tue Oct 06, 2015 12:48 pm, edited 1 time in total.

Re: Get that emu faster

Post by Artcfox »

I figured out how to slow it down, and fixed all the games. The sound seems to be working great, and it even runs well and sounds good under Chrome. And if it runs this fast on my Nexus 5 phone, then it'll probably run great even on old computers. This is amazing! :D

Re: Get that emu faster

Post by Jubatian »

I think I have now covered everything that was algorithmically possible around the instruction decoding and the AVR's innards. There doesn't seem to be any other component in this region that contributes much to the emulator's CPU load, and what is (still) there is necessary and irreplaceable. The only area left I can think of is the flag handling, but that is already branch-free, so it would be a large amount of work for little benefit.

As far as I can see, the rest of update_hardware is already written for good performance. The most used OUT instruction (for pixel output) is at the first level of the switch, and the pixel output is inlined, so I don't think that can be made any faster, either.

What I do see is that only parts of the AVR are implemented (for example, of Timer1 only what the kernel uses), and in general this direction should be kept: don't add anything that is not necessary and would likely never be used on the Uzebox, so that the parts in use stay simple and have fewer conditionals.

In this regard the watchdog implementation looked a bit odd to me; such a safety measure would seem like overkill for simply playing games. However, I see that the kernel uses it as a nifty trick for getting a random number and then disables it, its other function being a soft reset. Nice!

Poking around a bit and carefully tracing how the program proceeds, I get the feeling that some variables are not of the right type for good performance. For example, cycles is a byte variable, and there are others like it. These compile into more or slower instructions, since the non-native type has to be loaded and stored, and maybe trimmed during arithmetic. I will try to examine these next time and see what comes of it.
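
To illustrate the point about types, here is a minimal sketch. The function names are made up for the example and are not taken from the uzem sources. With a byte-sized counter the compiler has to promote to int, add, and then trim the result back to 8 bits; a native-width counter is a plain add. Whether that is measurable depends on the compiler and target.

```cpp
#include <cstdint>

// Byte-typed counter: the operands are promoted to int for the add,
// then the result is truncated back to 8 bits.
inline std::uint8_t addCyclesByte(std::uint8_t cycles, std::uint8_t extra) {
    return static_cast<std::uint8_t>(cycles + extra);
}

// Native-width counter: a single add with no trimming step.
inline unsigned addCyclesNative(unsigned cycles, unsigned extra) {
    return cycles + extra;
}
```

Note that the two are not interchangeable if any code relies on the 8-bit wraparound: `addCyclesByte(250, 10)` wraps to 4, while `addCyclesNative(250, 10)` gives 260. So each variable has to be checked before its type is widened.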

Re: Get that emu faster

Post by D3thAdd3r »

Artcfox wrote: I tested out your latest change, combined with my new changes, and I can't believe I'm saying this, but now the trick is going to be figuring out how to slow down the web version, because it's way too fast now. Even in Chrome.
:shock: Uzem is running pretty well on my cellphone!!! I am blown away by what you guys have achieved here, truly. Excellent job everyone with this.

Re: Get that emu faster

Post by uze6666 »

You guys are on fire, I'm totally blown away by your productivity level and I can't keep up!! :mrgreen: I'm really not in a position anymore to review the code per se, so if it runs most games correctly I'll merge your latest changes on the spot. When all optimizations are done and stable, I'll check back for the cycle-perfect thing. Then we'll address edge cases in a clean manner, as you suggested. For now you can cancel the previous pull request and just send me a pull for your latest and greatest code.

@Artcfox: Great work on the browser port. It runs really fast, even a tad over 100%, on my old machine. Amazing!

Re: Get that emu faster

Post by Jubatian »

Uze6666 wrote: For now you can cancel the previous pull request and just send me a pull for your latest and greatest code.
OK, but then let's wait a bit for it to settle. This evening after work I will look at the remaining little bits, the types, and whether I can still move anything off the main path so it runs better, to complete this performance-boosting adventure. What I currently measure on my PC means that, on average, 40 CPU cycles are used for emulating one AVR cycle (hardware and all), so it really has reached the point where anything done on the main path could be noticeable.
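
To put that figure in perspective, here is a small back-of-the-envelope sketch. The helper name is made up, and the 40 cycles per AVR cycle is just my own average from above: it multiplies the emulated clock by the host cost per AVR cycle to get the host cycles needed per second.

```cpp
#include <cstdint>

// Hypothetical helper: host CPU cycles per second needed to emulate
// `emulatedHz` AVR cycles per second at a given average cost per cycle.
constexpr std::uint64_t hostCyclesNeeded(std::uint64_t emulatedHz,
                                         std::uint64_t hostCyclesPerAvrCycle) {
    return emulatedHz * hostCyclesPerAvrCycle;
}

// Real-time Uzebox speed (28.63636 MHz) at 40 host cycles per AVR cycle
// needs roughly 1.15 billion host cycles per second.
static_assert(hostCyclesNeeded(28636360, 40) == 1145454400, "real-time cost");
```

By the same arithmetic, the roughly 50MHz emulation speed mentioned earlier in this thread corresponds to about 2 billion host cycles per second on one core, so each extra host instruction per AVR cycle costs on the order of a percent or two.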

Re: Get that emu faster

Post by Artcfox »

I think the one part of the code that still needs careful attention is the SPI and SD card stuff. With the native (non-web) version of uzem140*, Tornado 2000 does not run correctly (it draws blue over the entire background, instead of drawing the web) and it gives SD errors.

It seemed that the line buffer changes made it slightly slower for me (if I remember correctly, Bugz ran at 49MHz with sound disabled on my laptop, rather than 55MHz, but if I can get the 8-bit texture thing working, then we should be able to get that performance back).

My vote is to wait for things to settle a bit, and maybe merge everything up to, but not including, that line buffer change, at least until we can try it in combination with an 8-bit texture (or merge that line buffer change into a new branch).

Re: Get that emu faster

Post by Artcfox »

Unfortunately, when I call:

Code: Select all

texture = SDL_CreateTexture(renderer, SDL_PIXELFORMAT_INDEX8, SDL_TEXTUREACCESS_STREAMING, surface->w, surface->h);
I get:

Code: Select all

CreateTexture failed: Palettized textures are not supported
on both computers that I tried it on, so I guess the documentation was correct when it said that SDL2 does not support palettized textures, even though the API looks like it could support them. (SDL1.2 will internally convert a palettized surface into a true color texture, which wouldn't really help us, since the goal is to avoid that conversion on the CPU.) If we really wanted to do indexed textures, then a hand-coded GLSL program could work, but since the backend for SDL isn't always OpenGL, that could kill the portability.

Unless someone comes up with a better idea, I think indexed textures are a no-go.
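
For reference, the CPU-side fallback that indexed textures were meant to avoid can at least be kept cheap with a 256-entry lookup table. This is a minimal sketch, not the uzem code: the function names are made up, and it assumes the Uzebox pixel byte is laid out as BBGGGRRR (red in bits 0-2, green in bits 3-5, blue in bits 6-7) and that the target is an ARGB8888 buffer.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Builds a 256-entry ARGB8888 palette, assuming the pixel byte layout
// BBGGGRRR (red in bits 0-2, green in bits 3-5, blue in bits 6-7).
inline std::array<std::uint32_t, 256> buildPalette() {
    std::array<std::uint32_t, 256> pal{};
    for (std::uint32_t i = 0; i < 256; ++i) {
        const std::uint32_t r = ((i >> 0) & 7) * 255 / 7; // 3 bits to 8 bits
        const std::uint32_t g = ((i >> 3) & 7) * 255 / 7; // 3 bits to 8 bits
        const std::uint32_t b = ((i >> 6) & 3) * 255 / 3; // 2 bits to 8 bits
        pal[i] = 0xFF000000u | (r << 16) | (g << 8) | b;
    }
    return pal;
}

// Expands one scanline of indexed pixels into ARGB8888 via the table.
inline void expandLine(const std::uint8_t* src, std::uint32_t* dst,
                       std::size_t count,
                       const std::array<std::uint32_t, 256>& pal) {
    for (std::size_t x = 0; x < count; ++x) {
        dst[x] = pal[src[x]];
    }
}
```

The expanded buffer could then be uploaded with SDL_UpdateTexture to a streaming texture created with SDL_PIXELFORMAT_ARGB8888. It is still a per-pixel pass on the CPU, just one table load per pixel instead of any bit twiddling.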

Re: Get that emu faster

Post by Jubatian »

Huh, I am tired. Anyway, the pull request is up!

I couldn't get any notable performance improvement this time, just maybe a little bit, although it might vary. Changing the type of a few important variables to native int might help Emscripten, at least as far as I understand JavaScript and asm.js.

The linebuffer is not in the request; that's just an experiment. I also thought about having SDL do it. It's a pity that it lost the indexed capability, since there are still a few software-rendered retro-style projects around which could benefit from it (the ability to send a smaller buffer through the pipe). Anyway, I will rather take this to the linebuffer topic.

I think this quest is completed for now, so let's concentrate on bugfixes from here on. And maybe I can also start messing around with AVR code once I have this nice emulator :) (A consistent, almost 50MHz performance is a lot better than how it started on my box: 26MHz with barely recognizable sound. Now, even if something else is going wild in the background, the emulator can run, and it doesn't drive my CPU crazy.)

Re: Get that emu faster

Post by uze6666 »

Damn....it's fast!! :D Tested and merged. I will start checking for the remaining timing issues.

PS: Let me know if it's OK to add your names to the sources as contributors.