How far can you push it?

lonlaz
Posts: 1
Joined: Tue Sep 09, 2008 12:32 am

How far can you push it?

Post by lonlaz »

I love this project, and I plan to put together an Uzebox at some point this winter. I'm at about amateur level in everything involved in homebrew consoles and gaming, so I'd like to get someone's assessment of the capabilities of this system.

I know that the project is just at the beginning of its life, but how far do you think this can be pushed with the current microcontroller? Could this thing run a SNES Zelda clone (without hardware scaling, etc), or is Tetris getting near about as good as it gets?
havok1919
Posts: 474
Joined: Thu Aug 28, 2008 9:44 pm
Location: Vancouver, WA

Re: How far can you push it?

Post by havok1919 »

lonlaz wrote:I know that the project is just at the beginning of its life, but how far do you think this can be pushed with the current microcontroller? Could this thing run a SNES Zelda clone (without hardware scaling, etc), or is Tetris getting near about as good as it gets?
Good question... I'm eager to find some of that out myself! I would suspect that Zelda style stuff would be very doable. Scrolling (smooth) might be a bear, but there are LOTS of games that don't need that. For people that grew up with the Atari 400/800, the C64 and the like, the techniques are pretty similar. (Limited sprites, so use character graphics to augment sprite action.)

Certainly most of your tile-mapped 80's arcade game genres would be doable. (Pacman, Space Invaders, Breakout/Arkanoid, Berzerk, Burgertime, QBert, Centipede, Donkey Kong, Karate Champ, etc.)

There's nothing stopping people from rolling their own video kernels if Uze's "stock" stuff doesn't do exactly what you need. (ie, tune it to your particular game's needs.)

Maybe Uze can answer this for us-- about how much 'free' CPU is there? (Time for 'main()' execution that isn't eaten up by video/audio gen?) If we have all the vblank time that's ~2.4ms per 60Hz frame... ~68K AVR instructions. (Ignoring DMA overhead, a ~1.79MHz 6502 could maybe pull off ~10K instructions per frame in 1/60th of a second.) So no matter what happens we should still have several times the CPU horsepower of an old 8-bit even if ~85% of the AVR is burned up doing Audio/Video/etc.

-Clay
uze6666
Site Admin
Posts: 4801
Joined: Tue Aug 12, 2008 9:13 pm
Location: Montreal, Canada

Re: How far can you push it?

Post by uze6666 »

Here's a breakdown based on the Tiles-only engine:

Total lines per field: 262 (including vblank lines)
Rendered lines: 224 (100% of cycles taken during that time)
Mixer+Music: 25 lines (including vsync pulses)
So that's 262-224-25=13 free lines * 1820 cycles/line =~24K CPU cycles free per field or 48K per frame.
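
The arithmetic above can be written as a small C helper. This is only a sketch: the three constants come from the breakdown, while the function name `free_cycles_per_field` is hypothetical and not part of the Uzebox API.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in the breakdown above (tiles-only engine). */
#define TOTAL_LINES_PER_FIELD 262u
#define MIXER_MUSIC_LINES      25u
#define CYCLES_PER_LINE      1820u

/* Hypothetical helper: CPU cycles left for game code in one field,
   given how many scanlines the video engine renders. */
static uint32_t free_cycles_per_field(unsigned rendered_lines)
{
    /* Rendering plus audio cannot exceed the whole field. */
    assert(rendered_lines + MIXER_MUSIC_LINES <= TOTAL_LINES_PER_FIELD);

    unsigned free_lines =
        TOTAL_LINES_PER_FIELD - rendered_lines - MIXER_MUSIC_LINES;
    return (uint32_t)free_lines * CYCLES_PER_LINE;
}
```

With 224 rendered lines this returns 23,660, the "~24K" figure. Passing 200 instead returns 67,340, which shows how strongly the budget depends on the number of rendered lines.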

That seems like little, and I'm surprised myself, since this is the first time I've computed it. My Tetris game is coded so inefficiently and yet I had enough power for two game fields plus animations :lol: .

By doing some obvious optimizations, I could get back 2-3 scanlines worth of CPU. The sound code is pretty optimized right now, but someone could find a faster way to do it. Note that if more CPU is required, you can always lower the number of rendered lines. It's easy to configure.

For the games, it's pretty much what Havok said. The current engine was designed to be as generic as possible, so there are limitations. That said, I think there's still power left to exploit. We just gotta be clever!

Cheers,

Uze

Re: How far can you push it?

Post by havok1919 »

uze6666 wrote:Here's a breakdown based on the Tiles-only engine:
So that's 262-224-25=13 free lines * 1820 cycles/line =~24K CPU cycles free per field or 48K per frame.
Not too bad! :D

That's pretty good, IMHO. Works out to probably around 2x the computational power of the NES. If you say the 6502 core in the NES runs at 1.79MHz and probably takes an average of ~3 clocks or so for an instruction... You can run about ~10K instructions per field on the NES, or ~20K per frame. (Although you'd likely never achieve that because of needing to DMA and talk to the PPU/etc.)

The big win on the AVR of course is the multiplier-- and then the fact that you've got a register-rich, C-friendly architecture to play with. (That's part of the appeal to me-- I just don't have the time to go elbows deep writing another game in assembler, but I'd happily mess with a nifty little C-based toy!)
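
A rough way to check that comparison in C. The 1.79 MHz clock and the ~3 clocks per instruction are the figures from the post; the function name is hypothetical, and DMA and PPU overhead are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rough number of instructions a CPU can execute
   in one 1/60 s field, given its clock and an assumed average
   instruction length in cycles. */
static uint32_t instructions_per_field(uint32_t clock_hz,
                                       uint32_t cycles_per_instruction)
{
    assert(cycles_per_instruction > 0);
    return clock_hz / 60u / cycles_per_instruction;
}
```

`instructions_per_field(1790000, 3)` gives 9,944, so roughly 10K instructions per 1/60 s field, or about 20K per 1/30 s frame. Against the 13 * 1820 = 23,660 free AVR cycles per field quoted above, that is a ratio of about 2.4, assuming most AVR instructions take one cycle (loads, stores and branches often take two or more).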

-Clay
Lerc
Posts: 64
Joined: Sat Aug 30, 2008 11:13 pm

Re: How far can you push it?

Post by Lerc »

It's quite a lot if you compare it to a C64, which had about 17K cycles per field, and instructions took a few clocks each.

Maybe you need to sponsor an Uzebox category at a demoscene event. They'll squeeze every last drop out of those cycles.

I'm a bit surprised that the audio takes so much. Wouldn't you just output a value at the start/end of each scanline? That'd be only 262 values to pre-generate per field.

Re: How far can you push it?

Post by havok1919 »

Lerc wrote:I'm a bit surprised that the audio takes so much. Wouldn't you just output a value at the start/end of each scanline? That'd be only 262 values to pre-generate per field.
That's a good question too... Any way to make the number of voices (easily) configurable as a tradeoff for CPU time? (and/or maybe do procedural waveform generation vs. fractional stepping in wavetables?)

I bet you could have "simple sound" almost for free by using cycles that are otherwise 'do nothing' during video pulse generation...

-Clay

Re: How far can you push it?

Post by uze6666 »

Lerc wrote:I'm a bit surprised that the audio takes so much. Wouldn't you just output a value at the start/end of each scanline? That'd be only 262 values to pre-generate per field.
Yeah, it seems that simple. However, have a look at the sound engine in asm and you'll understand. I've used a couple of tricks to save cycles, like having 256-byte waves to benefit from "free" rollover, and aligning them on 256-byte boundaries in ROM to have smaller pointers and save registers. Also, to save on memory and CPU cycles, this one mixes a full frame in one shot, without a temporary 16-bit buffer, and uses all the registers. If someone has a faster method that doesn't take more RAM (yet still allows arbitrary waveforms), I want to know! :lol:

Here's the code to compute pitch+vol for one sample for chan 0,1 and 2:

Code:

	; 12 cycles/sample
	add   r6,r2     ; add step to fractional part of sample pos
	adc   r4,r3     ; add step (plus carry) to low byte of sample pos
	movw  ZL,r4     ; copy sample pointer (r5:r4) into Z
	lpm   r20,Z     ; load sample from ROM
	mulsu r20,r17   ; signed sample * unsigned mixing vol -> r1:r0
	clr   r0
	sbc   r0,r0     ; sign extend (mulsu leaves the sign in carry)
	mov   r28,r1    ; (sample*vol)>>8 into mix buffer lsb
	mov   r29,r0    ; adjust mix buffer msb
If you add the clipping and storing code, multiply by 262 samples and 4 channels, then add the cost of setup and cleanup, you get pretty close to 20 lines. Add the MIDI player routines, which are in C, and the vsync pulses that interrupt the mixing and are pure wasted time (since I just wait inside those pulses), and you get that 25-line figure.
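
For anyone who doesn't read AVR assembly, here is a rough C model of what the listing above appears to do for one channel and one sample. It is a sketch, not the actual Uzebox mixer: the `Channel` struct and both function names are hypothetical, and the register mapping in the comments is inferred from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical channel state, loosely mirroring the registers:
   pos ~ r4 (table index) and r6 (fraction), step ~ r3:r2, volume ~ r17. */
typedef struct {
    uint16_t pos;     /* 8.8 fixed point: high byte indexes the wave */
    uint16_t step;    /* 8.8 fixed-point pitch increment per sample  */
    uint8_t  volume;  /* unsigned mixing volume, 0..255              */
} Channel;

/* Signed sample times unsigned volume, keeping the high byte of the
   16-bit product with sign extension (what mulsu + sbc produce). */
static int16_t scale_sample(int8_t sample, uint8_t volume)
{
    int32_t product = (int32_t)sample * (int32_t)volume;
    /* floor(product / 256), written without signed right shifts */
    if (product >= 0) {
        return (int16_t)(product / 256);
    }
    return (int16_t)-((-product + 255) / 256);
}

/* Advance the channel by one output sample and return its contribution.
   The wave is exactly 256 bytes, so the 8-bit index wraps for free. */
static int16_t channel_next(Channel *ch, const int8_t wave[256])
{
    assert(ch != NULL && wave != NULL);
    ch->pos = (uint16_t)(ch->pos + ch->step);  /* add / adc */
    int8_t sample = wave[ch->pos >> 8];        /* movw / lpm */
    return scale_sample(sample, ch->volume);   /* mulsu / sbc */
}
```

A mixer built this way would call `channel_next` once per channel for each of the 262 samples in a field, sum the results, then clip and store them. That multiplication is why a 12-cycle inner step still adds up to many scanlines.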

With that said, it could still be possible to mix during each hsync pulse, but that's very thin (not to say not enough): you have perhaps 150 cycles, no more. With the tiles engine, that could be possible. With the sprite engine, almost every single cycle is taken during rendering, so that would not be possible (with my implementation at least). ;)
havok1919 wrote:and/or maybe do procedural waveform generation vs. fractional stepping in wavetables?
I've tried many methods like procedural, but there are only trivial waveforms that can be generated without taking more cycles than the current method (like square and sawtooth). Removing the LFSR would save a lot, but no more "percussions" and sound effects? Games would lose a lot of interest.

And even with procedural, you'll still need a fractional pointer or step. Perhaps you had a different approach in mind?
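
To make the trade-off concrete, here is a C sketch of the "trivial" procedural waveforms and of an LFSR noise source. All names are hypothetical. The LFSR is a textbook 16-bit Galois register with tap mask 0xB400, chosen for illustration, and is not necessarily the one the Uzebox mixer uses.

```c
#include <assert.h>
#include <stdint.h>

/* Sawtooth: the high byte of an 8.8 phase accumulator is already a
   ramp, so no table lookup is needed. */
static int8_t saw_from_phase(uint16_t phase)
{
    return (int8_t)((int)(phase >> 8) - 128);
}

/* Square: the top bit of the phase selects the high or low half. */
static int8_t square_from_phase(uint16_t phase)
{
    return (phase & 0x8000u) ? 127 : -128;
}

/* One step of a 16-bit Galois LFSR (assumed taps 16,14,13,11).
   The state must never be zero, or the register stays stuck at zero. */
static uint16_t lfsr_step(uint16_t state)
{
    assert(state != 0);
    uint16_t lsb = state & 1u;
    state >>= 1;
    if (lsb) {
        state ^= 0xB400u;
    }
    return state;
}
```

Both procedural waveforms still read a phase accumulator that is advanced by a fractional step every sample, which is the point made above: going procedural removes the ROM lookup but not the stepping.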


Uze

Re: How far can you push it?

Post by havok1919 »

uze6666 wrote:I've tried many methods like procedural, but there are only trivial waveforms that can be generated without taking more cycles than the current method (like square and sawtooth). Removing the LFSR would save a lot, but no more "percussions" and sound effects? Games would lose a lot of interest.

And even with procedural, you'll still need a fractional pointer or step. Perhaps you had a different approach in mind?
No, you're right-- I was just thinking more along the lines of "sound effects" and not music. If you just wanted interesting noises that were gated by game events you could try stuff like "playing" the contents of variables or pointers, messing with square/pulse width stuff, etc. It would be a total PITA to try to get those to be musically scaled in all likelihood. ;-)

-Clay

Re: How far can you push it?

Post by uze6666 »

Hmmm, I just thought of a way to dramatically lower the cost of music, by giving up a bit of quality and sound diversity. By using only ramps (sawtooth), it could be possible to lower that to just ~5 scanlines for four channels. Sawtooths are rich sounding, but at 15 kHz on PWM a bit harsh too, so the LPF would be welcome here. Thinking of that, could it be possible to make that filter switchable on/off in software (by using one free pin)... just like the C64? Or was it the Amiga?

Anyhow, it could make a nice alternate & low cpu mix engine. I'll try that tonight.
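
As a rough illustration of the idea, a sawtooth-only mixer could look like this in C. This is a sketch under assumptions, not the engine described above: the struct, the function name and the truncating volume scale are all hypothetical. One sample per scanline (262 per field) follows the earlier discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CHANNELS        4
#define SAMPLES_PER_FIELD 262   /* one sample per scanline */

typedef struct {
    uint16_t phase;   /* 8.8 fixed-point phase accumulator */
    uint16_t step;    /* 8.8 pitch increment per sample    */
    uint8_t  volume;  /* 0..255                            */
} SawChannel;

/* Mix one field of audio from sawtooth-only channels into out[]. */
static void mix_saw_field(SawChannel ch[NUM_CHANNELS],
                          int8_t out[SAMPLES_PER_FIELD])
{
    assert(ch != NULL && out != NULL);
    for (int n = 0; n < SAMPLES_PER_FIELD; n++) {
        int32_t sum = 0;
        for (int c = 0; c < NUM_CHANNELS; c++) {
            ch[c].phase = (uint16_t)(ch[c].phase + ch[c].step);
            /* The ramp is the phase high byte itself: no ROM lookup. */
            int32_t sample = (int32_t)(ch[c].phase >> 8) - 128;
            /* Scale by volume; C division truncates toward zero. */
            sum += sample * ch[c].volume / 256;
        }
        /* Clip the mix to the signed 8-bit output range. */
        if (sum > 127)  sum = 127;
        if (sum < -128) sum = -128;
        out[n] = (int8_t)sum;
    }
}
```

The saving relative to a wavetable mixer comes from dropping the table read for every channel and sample. How close that gets to the ~5 scanline estimate would depend on the actual assembly.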

Uze

Re: How far can you push it?

Post by havok1919 »

uze6666 wrote:Hmmm, I just thought of a way to dramatically lower the cost of music, by giving up a bit of quality and sound diversity. By using only ramps (sawtooth), it could be possible to lower that to just ~5 scanlines for four channels. Sawtooths are rich sounding, but at 15 kHz on PWM a bit harsh too, so the LPF would be welcome here. Thinking of that, could it be possible to make that filter switchable on/off in software (by using one free pin)... just like the C64? Or was it the Amiga?
Heheh... Yeah, the SID in the '64 had really nice filters. The Amiga had some too, but for some reason I remember the older stuff better than the later. ;-)

You actually hit on something I was thinking about-- namely a 'micro music UZEBOX'. In particular one that's specially suited to chiptunes/retro sound.

I was thinking that the current active filter I have ("good" filter) could have a few transistors tied to GPIO that you could use to add in different caps to change the behaviour of the filter. One thing I use on a couple of our other video game boards is a little 32-step digital pot (I use it for volume control) but that could be used in the filter too. Another thought was to drop one of TI's TAS audio chips down-- that has programmable EQ as well as being able to interface to TOSLINK for optical out.

What I kinda want to try is something I've had on my "someday" list for probably a decade... The "8 bit box". I have supplies of lots of old sound chips-- TI 76489, GI 8910, TMS 5220, Atari TIA's, POKEY, YM2151, etc. I thought it would be cool to make sort-of a SID-station like device, but just have an MCU that provides a GUI/interface and then a bunch of card slots (like the Simmstick) that can take sound chips on little cards. Run all those out through a programmable filter/amplitude control and have a MIDI interface on it. So if you had, say, eight slots you could plug in 4 76489's, 2 8910's, 1 TMS 5220, and 1 TIA or something and treat each channel as a 'patch/voice'. The UZEBOX would make for a great UI (use PAx for knobs/sliders) and it's plenty fast to bang on the address/databus for the sound IC's with GPIO.

-Clay