Thinking about STMUzebox

Discuss anything not related to the current Uzebox design like successors and other open source gaming hardware

Re: Thinking about STMUzebox

Postby hub.martin » Thu Nov 10, 2011 2:47 pm

For SD card connectivity there are dedicated SDIO[0-3] & SDIO_CLK pins. They are on different physical pins than the FSMC, so you don't lose the SD card interface.
hub.martin
 
Posts: 5
Joined: Mon Nov 07, 2011 1:35 am

Re: Thinking about STMUzebox

Postby vaclavpe » Fri Nov 11, 2011 10:14 pm

Hello,
uze6666 wrote:
But is there any application that has to use tiles/fonts from RAM? For example, dynamically generated tiles? This would be a problem, because then all tiles and fonts would have to be in RAM to be consistent.
I'm afraid so. All mode 3 games use a mix of ROM tiles *and* dynamically generated tiles (ramtiles) to pre-blit the sprites during VSYNC. Some games, like Bomberman and LodeRunner, also use them for other things such as a fake parallax effect.

This really bad news is in fact a show-stopper for the STM32F1 architecture (the access time differs between RAM and FLASH). In this case, the line renderer must always take tiles from RAM. I have three scenarios in mind:
- all tiles in RAM - but what about RAM consumption? Does anybody have a clue what the maximum number of tiles in use is?
- tiles in RAM just for a single microline - this is better, but the preparation for the renderer must be done during the HSync pulse, and a transfer of 240 bytes is not possible in that time. Alternatively, prepare the whole line (8 microlines) and interleave two buffers. This takes 3840 bytes (2 x 8 x 240), which may still be OK, but I am not sure whether it is doable this way.
- the microline is still in RAM, but the transfer is done with DMA. And again, maybe with interleaving.
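
The buffer arithmetic in the second scenario can be checked with a small sketch. It assumes 240 visible pixels per microline at one byte per pixel, 8x8 tiles, and a VRAM row of tile indices. The names `row_buffer`, `render_tile_row` and `tile_table` are made up for illustration and are not Uzebox kernel APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed geometry: 240 pixels per microline, 1 byte per pixel, 8x8 tiles. */
#define LINE_BYTES     240
#define TILE_W         8
#define TILE_H         8
#define TILES_PER_ROW  (LINE_BYTES / TILE_W)   /* 30 tiles */
#define NUM_BUFFERS    2

/* Two interleaved buffers of 8 microlines each: 2 * 8 * 240 = 3840 bytes. */
static uint8_t row_buffer[NUM_BUFFERS][TILE_H][LINE_BYTES];

/* Copy one row of tiles into a RAM buffer so the line renderer only reads RAM.
 * tile_table[i] points to a 64-byte tile in RAM or flash (hypothetical layout). */
static void render_tile_row(int buf, const uint8_t *const tile_table[],
                            const uint8_t vram_row[TILES_PER_ROW])
{
    assert(buf >= 0 && buf < NUM_BUFFERS);
    for (int t = 0; t < TILES_PER_ROW; t++) {
        const uint8_t *tile = tile_table[vram_row[t]];
        for (int y = 0; y < TILE_H; y++)
            memcpy(&row_buffer[buf][y][t * TILE_W], tile + y * TILE_W, TILE_W);
    }
}
```

While the renderer reads `row_buffer[0]`, the CPU (or DMA, as in the third scenario) would fill `row_buffer[1]`, and the two swap every 8 microlines.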

A fourth, really theoretical scenario is to choose the CPU frequency so that RAM and FLASH accesses have the same timing. But who wants to play? :?

Does anybody have any other ideas? I am quite disappointed with the STM32F1's suitability for such exact-timing applications...
vaclavpe
 
Posts: 20
Joined: Wed Jan 27, 2010 9:49 am

Re: Thinking about STMUzebox

Postby hpglow » Sat Nov 12, 2011 1:36 am

Here are some ideas that could help you. First, have a buffer with enough space for two rows of tiles. Make the frame buffer an array of pointers to the tiles in RAM; that way you don't have to duplicate data in the tile buffer. Set up a double buffer so that while the renderer is rendering one set of tiles (8 scan lines), the CPU or DMA is fetching tiles for the next 8 scan lines. When the end of the 8 scan lines being rendered is reached, you flip to the next pointer array. The worst case is two rows of tiles that are all different (to be honest, this will rarely happen).

So, for instance, if you need 4 unique tiles for the first row of tiles, you load those four into the RAM tile buffer and point to them in the pointer array as needed. While this is rendering, you look at which tiles need to be loaded for the next 8 scan lines and determine which are unique compared to the previous row. Say, for example, the next row has 8 unique tiles and 2 repeats: since the buffer still isn't full, you load the 8 new tiles into the buffer, point to them as needed, and move on to the next 8 scan lines. Eventually you may reach a situation where the tile buffer is full; then you cull the least-used tiles that are not used in the scan lines currently being rendered or in the next set of scan lines to be rendered.

So basically you are only rendering from the pointer array and not directly from the RAM buffer. It would eat up a lot of memory, but I think it would get the job done for you.

Here is a generic code example of how I imagine it working:
Code:
/* RamTile is assumed to be defined elsewhere (pixel data of one tile). */
typedef struct {
    RamTile tile;
    float average;   /* decaying usage count */
} BufferedTile;

BufferedTile VramArray[64];
BufferedTile *PtVramArray[64];

/* Somewhere in the code body, whenever a tile is used: */
VramArray[location].average++;

/* After every 8 scan lines, decay the count of each tile: */
for (int i = 0; i < 64; i++)
    VramArray[i].average /= 2;

/* When you go to load a unique RAM tile, just replace the one with the
   lowest average, but not any that are 0.0 */
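
The replacement step could be fleshed out roughly as follows. This is only a sketch of one reading of the idea: it adds a hypothetical `pinned` flag to protect tiles needed by the current or next tile row, and uses that flag in place of the "not any that are 0.0" rule. All names here are illustrative.

```c
#include <stdint.h>

#define CACHE_SLOTS 64

/* One cached 8x8 tile plus bookkeeping (field names are illustrative). */
typedef struct {
    uint8_t pixels[64];
    float   average;  /* decaying use count */
    uint8_t pinned;   /* nonzero while the current or next tile row needs it */
} CachedTile;

static CachedTile cache[CACHE_SLOTS];

/* Called whenever a tile row references the tile in this slot. */
static void cache_touch(int slot)
{
    cache[slot].average += 1.0f;
}

/* Called once per 8 scan lines: halve every count so old use fades out. */
static void cache_decay(void)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        cache[i].average /= 2.0f;
}

/* Pick the unpinned slot with the lowest average; -1 if every slot is pinned. */
static int cache_pick_victim(void)
{
    int victim = -1;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].pinned)
            continue;
        if (victim < 0 || cache[i].average < cache[victim].average)
            victim = i;
    }
    return victim;
}
```

A return value of -1 corresponds to the worst case mentioned above, where both tile rows consist entirely of different tiles and nothing can be evicted.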


If you ever get your hands on an F4, let us know; it has a more intelligent prefetch unit. I commend your work so far, but I'd understand if you have had enough and want to give up.
hpglow
 
Posts: 168
Joined: Wed Apr 14, 2010 6:06 am

Re: Thinking about STMUzebox

Postby jnosky » Sat Nov 26, 2011 5:10 am

Hi,
Just came across this thread, and it's very interesting :)
I recently got an F4 discovery board, and have made a lot of progress setting up a free toolchain.
Check out my repo:
https://github.com/jnosky/discoveryF4

I also recently added a bunch of code to:
https://github.com/texane/stlink
To allow that project to fully support the F4 discovery board.

So, in summary, it's now possible to develop and do source-level (hardware) debugging on the F4 for free :)
I use Eclipse with the CDT plugin, and my repo has the .project files to import for those who already have that setup.

There are a lot of these boards around now, since they were giving them away for free; even buying one is still dirt cheap ($15).

The F4 can run at 168 MHz; ST says it's the "world's fastest" uC.
According to the datasheet, the F4 can execute from flash with zero wait states using its ART memory accelerator.
The FSMC peripheral can generate write timing for many types of memory, including SRAM. Wouldn't it be possible to simply use its address bus to drive the VGA resistors? I.e., just do dummy reads from the "SRAM" and use the address lines as the ladder inputs.
The discovery board already has a CS43L22 (http://cirrus.com/en/products/cs43l22.html) for audio.
It also has a MEMS accelerometer that could possibly be used as a controller.

It seems the F4 is leaps and bounds ahead of the F1. I hope to generate some interest in using it instead of the F1.
jnosky
 
Posts: 1
Joined: Sat Nov 26, 2011 4:58 am

Re: Thinking about STMUzebox

Postby uze6666 » Sat Nov 26, 2011 10:24 pm

Whoa, awesome work! :mrgreen: I will definitely give it a look, since a 100% free toolchain with debugging is what I was looking for.

Thanks for posting!

-Uze
uze6666
Site Admin
 
Posts: 2672
Joined: Tue Aug 12, 2008 9:13 pm
Location: Montreal, Canada

Re: Thinking about STMUzebox

Postby WAHa.06x36 » Sat Feb 04, 2012 6:17 pm

I also just got an STM32F4DISCOVERY, and tried setting it up to do VGA.

Turns out you can basically put a simple 3:3:2 bit resistor DAC on a GPIO port, and then set up a DMA transfer to move pixels from a RAM framebuffer to the DAC without using any processor cycles at all. You just have to be careful about fighting for memory cycles.
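
To make the 3:3:2 format concrete, here is a minimal sketch of packing a 24-bit colour into the single byte that the DMA would copy to the GPIO port. The bit order RRRGGGBB is an assumption; the actual order depends on how the resistors are wired to the port pins, and `rgb332` is a made-up helper name.

```c
#include <stdint.h>

/* Pack 8-bit R, G, B into one 3:3:2 byte, keeping the top bits of each channel.
 * Assumed bit order: RRRGGGBB (bit 7 = red MSB). Change the shifts for other wiring. */
static uint8_t rgb332(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((r & 0xE0) | ((g & 0xE0) >> 3) | (b >> 6));
}
```

With one byte per pixel, a framebuffer is simply a `uint8_t` array of width x height entries, which is what lets the DMA stream it out without any per-pixel work by the CPU.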

Also, you can write code just fine for it with the devkitARM toolchain (usually used for GBA and DS development).

Here are some example videos: (Warning: May hurt eyes.)

http://www.youtube.com/watch?v=U9LUJ68cr14
http://www.youtube.com/watch?v=tasgbVFBtqw
WAHa.06x36
 
Posts: 31
Joined: Sat Feb 04, 2012 6:11 pm

Re: Thinking about STMUzebox

Postby hpglow » Sat Feb 04, 2012 7:38 pm

Very nice. What resolution is that running at? Are you using both DMAs or just one of the two?
hpglow
 
Posts: 168
Joined: Wed Apr 14, 2010 6:06 am

Re: Thinking about STMUzebox

Postby WAHa.06x36 » Sat Feb 04, 2012 7:42 pm

320x240. I should probably reduce that to 320x200 so I can fit two buffers in the main memory bank and do double-buffering.

It's using only DMA2, because DMA1 is not connected to the main memory bus at all. Took me a while to figure out why that one didn't work...
WAHa.06x36
 
Posts: 31
Joined: Sat Feb 04, 2012 6:11 pm

Re: Thinking about STMUzebox

Postby DaveyPocket » Sat Feb 04, 2012 9:14 pm

Amazing!
DaveyPocket
 
Posts: 377
Joined: Sun Sep 14, 2008 8:33 pm

Re: Thinking about STMUzebox

Postby WAHa.06x36 » Sun Feb 05, 2012 11:38 pm

A slightly more useful video: http://www.youtube.com/watch?v=IRT4ShYi8Bw

I got 240, 200 and 175 row modes working. This one is 320x200, just like old DOS mode X. And just like that, there is now enough space for double buffering, if you use the entire 128k main SRAM bank for framebuffers. I moved the data and bss sections to the 64k CCM memory instead to make room for the framebuffers.
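
The memory budget behind the move to 200 rows can be checked with a few lines of C. This is just the arithmetic, assuming one byte per pixel (3:3:2 colour) and the 128k main SRAM bank mentioned above; the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAIN_SRAM_BYTES (128u * 1024u)  /* main SRAM bank size quoted in the post */

/* Bytes for one framebuffer at 1 byte per pixel. */
static size_t framebuffer_bytes(size_t width, size_t height)
{
    assert(width > 0 && height > 0);
    return width * height;
}

/* Nonzero if two framebuffers of this size fit in the given SRAM bank. */
static int fits_double_buffered(size_t width, size_t height, size_t sram_bytes)
{
    return 2 * framebuffer_bytes(width, height) <= sram_bytes;
}
```

Two 320x240 buffers need 153,600 bytes, which is more than the 131,072 available. Two 320x200 buffers need 128,000 bytes, leaving only 3,072 bytes in the bank, which is consistent with moving data and bss to the CCM.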

It renders about 450 of these star sprites at 70 Hz, full frame rate.

You could do some pretty impressive games on this, I am pretty sure.
WAHa.06x36
 
Posts: 31
Joined: Sat Feb 04, 2012 6:11 pm
