How far can you push it?
I love this project, and I plan to put together an Uzebox at some point this winter. I'm at about amateur level for everything involved in homebrew consoles and gaming, so I'd like to get someone's assessment of the capabilities of this system.
I know that the project is just at the beginning of its life, but how far do you think this can be pushed with the current microcontroller? Could this thing run a SNES Zelda clone (without hardware scaling, etc.), or is Tetris about as good as it gets?
Re: How far can you push it?
lonlaz wrote: I know that the project is just at the beginning of its life, but how far do you think this can be pushed with the current microcontroller? Could this thing run a SNES Zelda clone (without hardware scaling, etc.), or is Tetris about as good as it gets?
Good question... I'm eager to find some of that out myself! I would suspect that Zelda-style stuff would be very doable. Smooth scrolling might be a bear, but there are LOTS of games that don't need that. For people who grew up with the Atari 400/800, the C64 and the like, the techniques are pretty similar. (Limited sprites, so use character graphics to augment sprite action.)
Certainly most of your tile-mapped 80's arcade game genres would be doable. (Pacman, Space Invaders, Breakout/Arkanoid, Berzerk, Burgertime, QBert, Centipede, Donkey Kong, Karate Champ, etc.)
There's nothing stopping people from rolling their own video kernels if Uze's "stock" stuff doesn't do exactly what you need. (ie, tune it to your particular game's needs.)
Maybe Uze can answer this for us-- about how much 'free' CPU is there? (Time for 'main()' execution that isn't eaten up by video/audio gen?) If we have all the vblank time that's ~2.4ms per 60Hz frame... ~68K AVR instructions. (Ignoring DMA overhead, a ~1.79MHz 6502 could maybe pull off ~10K instructions per frame in 1/60th of a second.) So no matter what happens we should still have several times the CPU horsepower of an old 8-bit even if ~85% of the AVR is burned up doing Audio/Video/etc.
-Clay
Re: How far can you push it?
Here's a breakdown based on the Tiles-only engine:
Total lines per field: 262 (including vblank lines)
Rendered lines: 224 (100% of cycles taken during that time)
Mixer+Music: 25 lines (including vsync pulses)
So that's 262-224-25 = 13 free lines * 1820 cycles/line =~ 24K CPU cycles free per field, or 48K per frame.
That seems like very little, and I'm amazed myself because it's the first time I've computed this. My Tetris game is coded so inefficiently, and yet I had enough power for two game fields plus animations.
By doing some obvious optimizations, I could get back 2-3 scanlines' worth of CPU. The sound code is pretty optimized right now, but someone could find a faster way to do it. Note that if more CPU is required, you can always lower the number of rendered lines. It's easy to configure.
For the games, it's pretty much what Havok said. The current engine was designed to be as generic as possible, so there are limitations. That said, I think there's still power left to exploit. We just gotta be clever!
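For anyone who wants to play with the numbers, here is a minimal C sketch of the budget arithmetic above. It only restates the figures from this post (262 lines per field, 1820 cycles per line, 25 lines for mixer and music); the function and constant names are made up for illustration and are not part of the Uzebox kernel.

```c
#include <stdint.h>

/* Figures quoted in the post; the names are illustrative only. */
enum {
    LINES_PER_FIELD = 262,
    CYCLES_PER_LINE = 1820,
    MIXER_LINES     = 25
};

/* Cycles left for game logic in one field for a given number of rendered lines. */
int32_t free_cycles_per_field(int rendered_lines)
{
    int free_lines = LINES_PER_FIELD - rendered_lines - MIXER_LINES;
    if (free_lines < 0) {
        free_lines = 0; /* nothing left over if rendering plus audio fill the field */
    }
    return (int32_t)free_lines * CYCLES_PER_LINE;
}
```

With 224 rendered lines this gives 23,660 cycles, which is the "~24K" above. Dropping to 200 rendered lines would, by the same arithmetic, free up 67,340 cycles per field.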
Cheers,
Uze
Re: How far can you push it?
uze6666 wrote: Here's a breakdown based on the Tiles-only engine: [...] So that's 262-224-25=13 free lines * 1820 cycles/line =~24K CPU cycles free per field or 48K per frame.
Not too bad!
That's pretty good, IMHO. It works out to probably around 2x the computational power of the NES. If you say the 6502 core in the NES runs at 1.79MHz and takes an average of ~3 clocks or so per instruction, you can run about ~10K instructions per field on the NES (1.79MHz / 60 fields / 3 clocks). (Although you'd likely never achieve that because of needing to DMA and talk to the PPU, etc.)
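The NES estimate is easy to check with a few lines of C. This is only a sketch under the assumptions stated in the post (a 1.79MHz clock, roughly 3 clocks per instruction, 60 fields per second); the function name is hypothetical.

```c
/* Rough instruction throughput per field, using integer arithmetic. */
long instructions_per_field(long clock_hz, long fields_per_sec, long avg_clocks_per_instr)
{
    return clock_hz / fields_per_sec / avg_clocks_per_instr;
}
```

Under those assumptions, `instructions_per_field(1790000, 60, 3)` comes out a little under 10,000, so the ~24K free AVR cycles per field (many AVR instructions take a single cycle) are in the neighbourhood of twice that.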
The big win on the AVR of course is the multiplier-- and then the fact that you've got a register-rich, C-friendly architecture to play with. (That's part of the appeal to me-- I just don't have the time to go elbows deep writing another game in assembler, but I'd happily mess with a nifty little C-based toy!)
-Clay
Re: How far can you push it?
It's quite a lot if you compare it to a C64, which had about 17K cycles per field, and its instructions took a few clocks each.
Maybe you need to sponsor a Uzebox category at a demoscene event. They'll squeeze every last drop out of those cycles.
I'm a bit surprised that the audio takes so much. Wouldn't you just output a value at the start/end of each scanline? That'd be only 262 values to pre-generate per field.
Re: How far can you push it?
Lerc wrote: I'm a bit surprised that the audio takes so much. Wouldn't you just output a value at the start/end of each scanline? That'd be only 262 values to pre-generate per field.
That's a good question too... Any way to make the number of voices (easily) configurable as a tradeoff for CPU time? (and/or maybe do procedural waveform generation vs. fractional stepping in wavetables?)
I bet you could have "simple sound" almost for free by using cycles that are otherwise 'do nothing' during video pulse generation...
-Clay
Re: How far can you push it?
Quote: I'm a bit surprised that the audio takes so much. Wouldn't you just output a value at the start/end of each scanline? That'd be only 262 values to pre-generate per field.
Yeah, it seems that simple. However, have a look at the sound engine in asm and you'll understand. I've used a couple of tricks to save cycles, like having 256-byte waves to benefit from "free" rollover, and aligning them on 256-byte boundaries in ROM to have smaller pointers and save registers. Also, to save on memory and CPU cycles, it mixes a full frame in one shot, without using a temporary 16-bit buffer and by using all registers. If someone has a faster method that doesn't take more RAM (yet still allows arbitrary waveforms), I want to know!
Here's the code to compute pitch+vol for one sample for channels 0, 1 and 2:
Code:
; 12 cycles/sample
add r6,r2 ;add step to fractional part of sample pos
adc r4,r3 ;add step (plus carry) to low byte of sample pos
movw ZL,r4 ;copy sample pointer (r5:r4) into Z
lpm r20,Z ;load sample from ROM
mulsu r20,r17 ;(signed sample * unsigned mixing vol) -> r1:r0
clr r0 ;clear r0; the carry set by mulsu is preserved
sbc r0,r0 ;sign extend (r0 = 0xFF if the product is negative)
mov r28,r1 ;put (sample*vol>>8) in mix buffer lsb
mov r29,r0 ;adjust mix buffer msb
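For readers more comfortable in C, the same per-channel step can be modelled roughly as below. This is a sketch for understanding, not the kernel's code: the struct, field and function names are invented, and it assumes an 8.8 fixed-point position into a 256-entry signed wavetable, as the comments in the assembly suggest.

```c
#include <stdint.h>

/* One wavetable channel. All names are illustrative, not Uzebox identifiers. */
typedef struct {
    uint16_t pos;       /* 8.8 fixed point: high byte = wave index, low byte = fraction */
    uint16_t step;      /* 8.8 increment per output sample (sets the pitch) */
    uint8_t  volume;    /* 0..255 mixing volume */
    const int8_t *wave; /* 256-entry signed wavetable */
} channel_t;

/* Advance one sample and return floor((sample * volume) / 256). */
int16_t channel_next(channel_t *ch)
{
    /* 16-bit add: the index wraps at 256 entries with no extra work. */
    ch->pos = (uint16_t)(ch->pos + ch->step);

    int8_t sample = ch->wave[ch->pos >> 8];
    int16_t product = (int16_t)(sample * ch->volume); /* range -32640..32385 */

    /* Keep the high byte with its sign, like taking r1 plus sign extension.
       Written as a floor division so it does not rely on signed right shift. */
    return (int16_t)((product - (product < 0 ? 255 : 0)) / 256);
}
```

The "free" rollover mentioned earlier shows up in the first line of the function: because the table has exactly 256 entries, letting the 16-bit position overflow is all the wrap-around handling needed.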
With that said, it could still be possible to mix during each hsync pulse, but that's very thin (not to say not enough): you have perhaps 150 cycles, no more. With the tile engines, that could be possible. With the sprite engine, almost every single cycle is taken during rendering, so that would not be possible (with my implementation at least).
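To put rough numbers on this, here is a small C sketch of the cost of the wavetable channels alone. It assumes the 12-cycle figure applies to each of the three wave channels and that one sample is produced per scanline (262 per field, as suggested earlier in the thread). The noise channel, the music player and loop overhead are not counted, so treat the result as a lower bound. The function names are made up.

```c
#include <stdint.h>

/* Cycles spent per output sample on the wavetable channels only. */
uint32_t wave_cycles_per_sample(uint32_t channels, uint32_t cycles_per_channel)
{
    return channels * cycles_per_channel;
}

/* The same cost accumulated over one field. */
uint32_t wave_cycles_per_field(uint32_t channels, uint32_t cycles_per_channel,
                               uint32_t samples_per_field)
{
    return wave_cycles_per_sample(channels, cycles_per_channel) * samples_per_field;
}
```

Three channels at 12 cycles each is 36 cycles per sample, or 9,432 cycles per field, a bit over five scanlines at 1820 cycles per line. On paper 36 cycles fits in a ~150-cycle hsync window, but only if the rest of the per-sample work fits there too.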
Quote: and/or maybe do procedural waveform generation vs. fractional stepping in wavetables?
I've tried many methods like procedural generation, but there are only trivial waveforms that can be generated without taking more cycles than the current method (like square and sawtooth). Removing the LFSR would save a lot, but then no more "percussion" and sound effects? Games would lose a lot of interest.
And even with procedural generation, you'll still need a fractional pointer or step. Perhaps you had a different approach in mind?
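For anyone unfamiliar with the LFSR mentioned above, here is a generic sketch in C of how one produces noise. This is not the Uzebox kernel's noise channel: it is a textbook 16-bit Galois LFSR with the widely used tap mask 0xB400, chosen here purely for illustration.

```c
#include <stdint.h>

/* Advance a 16-bit Galois LFSR by one step.
   The state must be non-zero, otherwise it stays stuck at zero. */
uint16_t lfsr_step(uint16_t state)
{
    uint16_t lsb = state & 1u;
    state >>= 1;
    if (lsb) {
        state ^= 0xB400u; /* taps at bits 16, 14, 13 and 11 */
    }
    return state;
}

/* One noise sample: use the low bit as a two-level (square-like) output. */
int8_t lfsr_noise_sample(uint16_t state, int8_t amplitude)
{
    return (state & 1u) ? amplitude : (int8_t)-amplitude;
}
```

Stepping the register once per output sample gives bright white-ish noise; stepping it less often lowers the apparent pitch, which is the usual route to percussion-like sounds.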
Uze
Re: How far can you push it?
uze6666 wrote: I've tried many methods like procedural generation, but there are only trivial waveforms that can be generated without taking more cycles than the current method (like square and sawtooth). Removing the LFSR would save a lot, but then no more "percussion" and sound effects? Games would lose a lot of interest. And even with procedural generation, you'll still need a fractional pointer or step. Perhaps you had a different approach in mind?
No, you're right-- I was just thinking more along the lines of "sound effects" and not music. If you just wanted interesting noises that were gated by game events, you could try stuff like "playing" the contents of variables or pointers, messing with square/pulse width stuff, etc. It would be a total PITA to try to get those to be musically scaled, in all likelihood.
-Clay
Re: How far can you push it?
Hmmm, I just thought of a way to dramatically lower the cost of music, by giving up a bit of quality and sound diversity. By using only ramps (sawtooth), it could be possible to lower that to just ~5 scanlines for four channels. Sawtooths are rich sounding, but at 15 kHz on PWM a bit harsh too, so the LPF would be welcome here. Thinking of that, would it be possible to make that filter switchable on/off in software (by using one free pin), just like the C64? Or was it the Amiga?
Anyhow, it could make a nice alternate, low-CPU mix engine. I'll try that tonight.
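The idea can be sketched in C as follows. This is only one guess at what a ramp-only mixer might look like, not the engine described in the post: the high byte of each phase accumulator is used directly as the waveform, so there is no table lookup at all. The names, the clipping and the unsigned 8-bit output format are assumptions made for the example.

```c
#include <stdint.h>

#define SAW_CHANNELS 4

typedef struct {
    uint16_t phase;  /* 8.8 phase accumulator; its high byte is the ramp value */
    uint16_t step;   /* phase increment per sample (sets the pitch) */
    uint8_t  volume; /* 0..255 */
} saw_channel_t;

/* Produce one unsigned 8-bit output sample from four sawtooth channels. */
uint8_t saw_mix_sample(saw_channel_t ch[SAW_CHANNELS])
{
    int16_t mix = 0;
    for (int i = 0; i < SAW_CHANNELS; i++) {
        ch[i].phase = (uint16_t)(ch[i].phase + ch[i].step);
        int16_t ramp = (int16_t)(ch[i].phase >> 8) - 128; /* -128..127 */
        mix += (int16_t)((ramp * ch[i].volume) / 256);    /* scale by volume */
    }
    /* Clip to the signed 8-bit range, then shift to unsigned for output. */
    if (mix > 127) mix = 127;
    if (mix < -128) mix = -128;
    return (uint8_t)(mix + 128);
}
```

Whether this really lands near 5 scanlines depends on how it is written in assembly. The C version is only meant to show why the per-sample work shrinks once the ROM lookup is gone.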
Uze
Re: How far can you push it?
uze6666 wrote: Hmmm, I just thought of a way to dramatically lower the cost of music, by giving up a bit of quality and sound diversity. By using only ramps (sawtooth), it could be possible to lower that to just ~5 scanlines for four channels. Sawtooths are rich sounding, but at 15 kHz on PWM a bit harsh too, so the LPF would be welcome here. Thinking of that, would it be possible to make that filter switchable on/off in software (by using one free pin), just like the C64? Or was it the Amiga?
Heheh... Yeah, the SID in the '64 had really nice filters. The Amiga had some too, but for some reason I remember the older stuff better than the later.
You actually hit on something I was thinking about-- namely a 'micro music UZEBOX'. In particular one that's specially suited to chiptunes/retro sound.
I was thinking that the current active filter I have ("good" filter) could have a few transistors tied to GPIO that you could use to add in different caps to change the behaviour of the filter. One thing I use on a couple of our other video game boards is a little 32-step digital pot (I use it for volume control) but that could be used in the filter too. Another thought was to drop one of TI's TAS audio chips down-- that has programmable EQ as well as being able to interface to TOSLINK for optical out.
What I kinda want to try is something I've had on my "someday" list for probably a decade... The "8 bit box". I have supplies of lots of old sound chips-- TI 76489, GI 8910, TMS 5220, Atari TIAs, POKEY, YM2151, etc. I thought it would be cool to make sort of a SID-station-like device, but just have an MCU that provides a GUI/interface and then a bunch of card slots (like the Simmstick) that can take sound chips on little cards. Run all those out through a programmable filter/amplitude control and have a MIDI interface on it. So if you had, say, eight slots, you could plug in 4 76489s, 2 8910s, 1 TMS 5220, and 1 TIA or something and treat each channel as a 'patch/voice'. The UZEBOX would make for a great UI (use PAx for knobs/sliders) and it's plenty fast to bang on the address/data bus for the sound ICs with GPIO.
-Clay