An RPG on Uzebox

D3thAdd3r
Posts: 3221
Joined: Wed Apr 29, 2009 10:00 am
Location: Minneapolis, United States

Re: An RPG on Uzebox

Post by D3thAdd3r »

Thank you sir, that is exactly what the wiki needed for a long time!
nicksen782
Posts: 714
Joined: Wed Feb 01, 2012 8:23 pm
Location: Detroit, United States

Re: An RPG on Uzebox

Post by nicksen782 »

Update since it has been a week.

Presently I am working on some updates to my tools.

One of my goals is to be able to swap a chunk of SPI RAM in/out from the SDCARD on command. I've actually begun to reach some of the SPI RAM space limits, and I still need to set up working memory for the game that isn't just stored in the CPU RAM.

Also, my pseudo sprites could use a change. I am hardcoding the maps they use which means I cannot de-duplicate tiles in their tileset. I've learned that I can read the maps from SPIRAM fast enough so I intend to store them there. I'll be reading SPI RAM quite a bit. I had no problem with the ramtilemap animations or the dialog tests so I figure this can work. In fact, I could just read the 6 bytes into ram and then draw from there in order to minimize the time the SPI bus is open. I really don't think this will be much of a problem.

I have all the game's screens stored in SPI RAM. I should store them on the SDCARD and then bring in a screen as needed to SPIRAM. This will save loads of SPIRAM space.

With all the extra SPI RAM space I have this idea about dialog menus and a sort of VRAM caching. To draw a menu you would first copy the VRAM region it will cover. Then draw the menu right over that region (maybe even gradually for a nice effect!) When done you could then read the cached VRAM region back in as you are "undrawing" the menu.
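
A minimal sketch of that save/draw/restore idea, assuming a 28x28 tile VRAM with one byte per tile. The names `menu_save` and `menu_restore_row` are hypothetical, and plain arrays stand in for the kernel's VRAM and for the SPI RAM cache, so the SPI transfers themselves are not shown:

```c
#include <assert.h>
#include <stdint.h>

#define VRAM_W 28  /* assumption: screen width in tiles */
#define VRAM_H 28  /* assumption: screen height in tiles */

/* Stand-ins for illustration only: on real hardware 'vram' is the kernel's
   tile map and 'cache' would live in SPI RAM, reached via SPI transfers. */
static uint8_t vram[VRAM_W * VRAM_H];
static uint8_t cache[VRAM_W * VRAM_H];

/* Copy the rectangle the menu will cover out of VRAM, row by row. */
void menu_save(uint8_t x, uint8_t y, uint8_t w, uint8_t h)
{
    assert(x + w <= VRAM_W && y + h <= VRAM_H);
    uint16_t n = 0;
    for (uint8_t row = 0; row < h; row++)
        for (uint8_t col = 0; col < w; col++)
            cache[n++] = vram[(y + row) * VRAM_W + (x + col)];
}

/* Put one saved row back. Calling this once per frame, row after row,
   "undraws" the menu gradually. */
void menu_restore_row(uint8_t x, uint8_t y, uint8_t w, uint8_t row)
{
    for (uint8_t col = 0; col < w; col++)
        vram[(y + row) * VRAM_W + (x + col)] = cache[row * w + col];
}
```

Restoring per row keeps each burst short, which fits the goal of holding the SPI bus open as briefly as possible.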

If I could save a few bytes of CPU RAM that would be great too. Even a couple more ram tiles would be awesome. That is why I was interested in a method to make tiles take less than 64 bytes.

Thanks again for taking a look. I have appreciated all the feedback so far, thank you!
Jubatian
Posts: 1561
Joined: Thu Oct 01, 2015 9:44 pm
Location: Hungary

Re: An RPG on Uzebox

Post by Jubatian »

Just something that might help, unless you have already considered it.

How do you plan to access the SD card? In particular, how do you plan to resolve the need for a 512 byte buffer? Both Lee and Artcfox probably ran into this issue before; it needs a bit of thought.

So unless you are already past it, here are some thoughts. You can use some RAM tiles for the buffer: RAM tiles which you obviously don't use during the transfer. Since you are on plain Mode 3, you can still display whatever you like, including some sprites, as long as you manage the RAM tiles carefully so that RAM tile usage doesn't overlap the buffer.

Other ideas to get more RAM for tiles (of course unless you did these already or anything equivalent):
  • Remove the kernel's sprite engine (SPRITES_AUTO_PROCESS=0). The sprites structure takes 128 bytes (4 bytes * 32 sprites), and I think you have your in-game characters' coordinates and attributes elsewhere, too. So you could write your own sprite engine (which draws using BlitSprite) using those directly, bypassing the need for that 128 byte array.
  • Consider flipping some work memory in and out. I mean part of the game state, something like this: at the beginning of a frame, you copy it from SPI RAM, you work with it, then copy it back. After it is copied back, you can use the area for RAM tiles (that is, you actually copy into a RAM tile). Of course this needs good overall planning and timing (SPRITES_VSYNC_PROCESS=0 is very much recommended). You will need about 3 scanlines of CPU time to process 2 RAM tiles worth of data (128 bytes) this way.
  • You have a very deep stack when the game is in conversations (observing the emulator's RAM usage display). Try to slim it down somehow (examine your call graph), it may not be possible, but what I see there seems unusual to me. Managing the stack better might also allow you to save a RAM tile worth of memory.
  • A vertically narrower screen gives you both more CPU time and RAM. The latter by having smaller VRAM (one tile row is equivalent to about a half RAM tile).
  • When you start using the SD card more extensively, all the structures relating it may be stored in the SPI RAM, and copied into RAM only when you need to access the card. Since you will be using it only in short bursts, transferring data into the SPI RAM, you only need those then, and not any other time (like on a sprite-intensive in-game screen). This may be especially handy if you have a large data file and prefer to keep pointers into it around for faster access.
Hope these can help a little in getting the RAM you need!
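
As a rough illustration of the sector buffer suggestion, here is a sketch that places a 512 byte buffer in the last RAM tiles and reports how many tiles remain free for sprites. `RAM_TILES_COUNT` of 36 is an assumed example value, `tiles_for` and `sector_buffer` are hypothetical names, and the `ram_tiles` array is a stand-in for the kernel's own array:

```c
#include <assert.h>
#include <stdint.h>

#define RAM_TILE_BYTES   64u   /* 8x8 pixels, one byte per pixel */
#define SECTOR_BYTES     512u  /* one SD card sector */
#define RAM_TILES_COUNT  36u   /* assumption: example kernel setting */

/* Stand-in for the kernel's RAM tile array so the sketch builds anywhere. */
static uint8_t ram_tiles[RAM_TILES_COUNT * RAM_TILE_BYTES];

/* Number of whole RAM tiles needed to hold 'bytes' of data. */
uint8_t tiles_for(uint16_t bytes)
{
    return (uint8_t)((bytes + RAM_TILE_BYTES - 1u) / RAM_TILE_BYTES);
}

/* Place the sector buffer in the last RAM tiles. '*free_tiles' receives the
   number of tiles still available to sprites while the buffer is in use. */
uint8_t *sector_buffer(uint8_t *free_tiles)
{
    uint8_t reserved = tiles_for(SECTOR_BYTES);
    assert(reserved <= RAM_TILES_COUNT);
    *free_tiles = (uint8_t)(RAM_TILES_COUNT - reserved);
    return &ram_tiles[(uint16_t)(*free_tiles) * RAM_TILE_BYTES];
}
```

With these numbers a sector costs 8 RAM tiles for the duration of the transfer, and the 128 byte sprite array mentioned in the list corresponds to 2 RAM tiles.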
nicksen782
Posts: 714
Joined: Wed Feb 01, 2012 8:23 pm
Location: Detroit, United States

Re: An RPG on Uzebox

Post by nicksen782 »

I've been using ram_tiles[] as a buffer. I have also tried with vram. It works pretty well as long as you reserve ram tiles or render fewer lines on screen. For instance, render only the top lines (one row/tile worth) and then use a pointer to the second row of vram. It is faster when fewer lines are rendered.

Presently I am only using the sprite engine for the player's sprite and that sprite is always aligned anyway. So, I literally have 4 sprites (4*4==16). 16 bytes is still 16 bytes plus whatever I save from losing the sprite engine. How much savings are we talking about?

The conversations do show a deep stack in terms of levels of nesting. I didn't actually know the term "Call Graph" so I looked it up. I've made them before by hand (tedious) but I did not know the name for the procedure. Now I'm using cflow and cflow2dot. One gives text and the other gives a pdf. Kinda hard to read the pdf but it is good to look at. I can see complication right away. I have the text formatted to show the nesting and to indicate the number of nested layers. The highest level (6) occurs with music streaming and the conversations. Actually, I was able to quickly identify a recursive call in my music streaming which I removed. What do you use for creating a call graph?

My screen dimensions are already 28x28. I think I'm good there for the moment. The original Gameboy was 160*144 pixels (20*18 tiles assuming 8x8 pixel tiles.) I keep it in mind, many great games were created, I'm just not ready to shrink vram yet... although I know that I would get 6 more ram tiles if I did.

The sd_struct is 23 bytes. The idea of storing its data in SPIRAM and just repopulating it when needed is brilliant! I will certainly consider this. Maybe this would even be good for storing the structure data for multiple files.

Swapping (like a page file) in and out between SPIRAM and SDCARD is a strategy I've been considering. I'm just not sure how to go about doing it. I've gone through some of my screen struct and I can save a ram tile right away if I am able to swap specific data out to SPIRAM during conversations. However, I am not sure how to do this. The screen structure is set at the program start and the memory would already be allocated. If I could get a pointer to say, &ram_tiles[64*26] (26th ram tile) and interpret the memory at that point as the screen structure then I think it would work. However, it seems rather hacky. How can I do that? Can I do it? I can only think of malloc() and free() and just repopulating the screen structure once the conversation is done. Any ideas?

What if there were no ram tiles and ram tiles was just a large chunk of general purpose memory? Perhaps give the kernel a pointer to where "ram tiles starts" within a large buffer. The pointer can change as needed. All of the memory could be used for whatever data type or structure desired. In a way, the data file from SPIRAM (or the SDCARD) is just general memory that you read as u8 or u32, etc. I realize that this sounds big, but could this happen?
Jubatian
Posts: 1561
Joined: Thu Oct 01, 2015 9:44 pm
Location: Hungary

Re: An RPG on Uzebox

Post by Jubatian »

nicksen782 wrote: Wed Aug 08, 2018 11:04 pm Presently I am only using the sprite engine for the player's sprite and that sprite is always aligned anyway. So, I literally have 4 sprites (4*4==16). 16 bytes is still 16 bytes plus whatever I save from losing the sprite engine. How much savings are we talking about?
Do you use this along with setting MAX_SPRITES = 4? I don't know how widely known it is, but that setting determines the size of the sprites array (unless you set it, it is 32, thus using 128 bytes). Otherwise, if that player sprite is always aligned, you could simply stop using a kernel sprite for it. Not much RAM is saved beyond those 16 bytes; more importantly, you would get rid of the Vsync hazard in video (which could be very beneficial for you, considering that most of your rendering is done in user code). If you don't need free-moving sprites at all, and you do everything in user space, you could also set RTLIST_ENABLE = 0. This throws away the capability to restore RAM tiles allocated to sprites (used by the kernel's sprite engine or BlitSprite), which takes 3 * RAM_TILES_COUNT bytes (that's another ~100 bytes).
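
The arithmetic behind those figures can be checked with a short sketch. It only uses the per-entry sizes quoted in the post (4 bytes per sprite, 3 bytes per RAM tile for the restore list); the function names are hypothetical and the 36 RAM tile count in the usage note is an assumed example:

```c
#include <assert.h>
#include <stdint.h>

#define SPRITE_ENTRY_BYTES   4u  /* per sprite in the kernel's sprites array */
#define RESTORE_ENTRY_BYTES  3u  /* per RAM tile in the restore list */

/* RAM used by the sprites array for a given MAX_SPRITES value. */
uint16_t sprite_array_bytes(uint8_t max_sprites)
{
    return (uint16_t)(SPRITE_ENTRY_BYTES * max_sprites);
}

/* RAM used by the RAM tile restore list when RTLIST_ENABLE is on. */
uint16_t restore_list_bytes(uint8_t ram_tiles_count)
{
    return (uint16_t)(RESTORE_ENTRY_BYTES * ram_tiles_count);
}

/* Total bytes recovered by shrinking the sprites array and, optionally,
   dropping the restore list. */
uint16_t bytes_saved(uint8_t old_sprites, uint8_t new_sprites,
                     uint8_t ram_tiles_count, uint8_t drop_restore_list)
{
    assert(new_sprites <= old_sprites);
    uint16_t saved = (uint16_t)(sprite_array_bytes(old_sprites)
                              - sprite_array_bytes(new_sprites));
    if (drop_restore_list)
        saved = (uint16_t)(saved + restore_list_bytes(ram_tiles_count));
    return saved;
}
```

For example, going from 32 sprites to 4 recovers 112 bytes, and with 36 RAM tiles the restore list accounts for a further 108 bytes, which matches the "~100 bytes" estimate.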
nicksen782 wrote: Wed Aug 08, 2018 11:04 pm What do you use for creating a call graph?
My brain :lol: I can just maintain a good idea of how my most significant calls nest, probably a consequence of maintaining some quite hairy legacy 8051 code where I had to discover and track it all manually (on that MCU the stack can be very limited).
nicksen782 wrote: Wed Aug 08, 2018 11:04 pm The sd_struct is 23 bytes. The idea of storing its data in SPIRAM and just repopulating it when needed is brilliant! I will certainly consider this. Maybe this would even be good for storing the structure data for multiple files.
Yup, multiple files, yes, and also keep in mind that seeking is slow on FAT, that's why I provided functions to load and store file positions (FS_Get_Pos() and FS_Set_Pos()). If you plan to use them and so create such a table of positions for important stuff in a file, that table can also be stored in the SPI RAM.
nicksen782 wrote: Wed Aug 08, 2018 11:04 pm If I could get a pointer to say, &ram_tiles[64*26] (26th ram tile) and interpret the memory at that point as the screen structure then I think it would work. However, it seems rather hacky.
Just design your interface right to minimize the area the hack affects. I mean if you design your chat screen so it takes a work area pointer as parameter (or have a separate function where you can set this pointer), then the module realizing the chat screen is not affected by the hack. You only need to have it in your main game logic somewhere, ideally in one central place where you can see on one screen worth of code which regions of the RAM tile area you are passing to various game components. This way memory reuse remains rather clean (at the RAM cost of a few pointers) and manageable.
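
One way this could look in code, as a sketch only: the chat module receives its work area through a setter, and a single function in the main game logic decides which RAM tiles that area occupies. `struct chat_state`, `chat_set_work_area`, `game_enter_chat` and the tile numbers are all hypothetical, and `ram_tiles` is a stand-in for the kernel's array:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RAM_TILE_BYTES   64u
#define RAM_TILES_COUNT  40u  /* assumption: example kernel setting */

/* Stand-in for the kernel's RAM tile array. */
static uint8_t ram_tiles[RAM_TILES_COUNT * RAM_TILE_BYTES];

/* --- chat module: knows nothing about where its memory comes from --- */
struct chat_state {
    uint8_t line;      /* current dialog line */
    uint8_t cursor;    /* position within the line */
    char    text[32];  /* text being displayed */
};

static struct chat_state *chat;  /* set by chat_set_work_area() */

void chat_set_work_area(void *area, size_t size)
{
    assert(size >= sizeof(struct chat_state));
    chat = (struct chat_state *)area;
    chat->line = 0;
    chat->cursor = 0;
    chat->text[0] = '\0';
}

/* --- main game logic: the one place that hands out RAM tile regions --- */
enum { CHAT_FIRST_TILE = 38, CHAT_TILES = 2 };

void game_enter_chat(void)
{
    chat_set_work_area(&ram_tiles[CHAT_FIRST_TILE * RAM_TILE_BYTES],
                       CHAT_TILES * RAM_TILE_BYTES);
}
```

The cast lives in one function, and the size check catches the case where the structure grows past the tiles reserved for it.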
nicksen782 wrote: Wed Aug 08, 2018 11:04 pm What if there were no ram tiles and ram tiles was just a large chunk of general purpose memory? Perhaps give the kernel a pointer to where "ram tiles starts" within a large buffer.
Memory reuse, I think, always remains somewhat hacky no matter how you approach it. It is a difficult scenario because you, as the programmer, are responsible for separating the tasks that use the same memory locations, ensuring they never use them simultaneously. You cannot build static analysis on this, as it depends entirely on the architecture of your game. The RAM tiles region, I think, is good as it is: it is a large contiguous chunk sufficient for most imaginable sane reuse scenarios.

The &ram_tiles[n * 64] way of getting pointers for reusing RAM is quite portable too: if it works on the Uzebox fine, it will work in a supposed scenario where you emulate the Uzebox kernel over SDL for example. The RAM tile's memory will be there and free even there. So I say it is OK.
nicksen782
Posts: 714
Joined: Wed Feb 01, 2012 8:23 pm
Location: Detroit, United States

Re: An RPG on Uzebox

Post by nicksen782 »

SPRITES_AUTO_PROCESS=0
RTLIST_ENABLE=0

Saved the 100 or so bytes and the 16 bytes. Quite nice! I will have to recreate a simple sprite engine in C though. It will probably be a mash-up with flash-tile sourced ram-tile sprites. I already expected to pay the cost of the ram tiles but now I can do blitting on some sprites more than once (the special sprites can only be blit once since their source data is ram tiles.)

Then, I tried this:
struct screen_ *screenPtr;
screenPtr = (struct screen_ *)&ram_tiles[34*64];

In the past I was doing screen.membername. Now it is screenPtr->membername. This works just fine so long as I don't over-write the data since it is shared with ram tiles. Also, my screen struct took 154 bytes which is less than the 192 bytes (3 ram tiles) I would need. So, I took the player and game struct and put it in the same memory space. I'll need to create new pointers and make sure that the screen data doesn't over-write those but I have some serious ram re-coup here! Actually, I raised my ram tiles from 36 to 40. I could probably do 42 but I'll not push it for now.
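
One possible way to keep several structures from overwriting each other in the shared region is to wrap them in a single outer struct and let the compiler assign the offsets. This is only a sketch: the member layouts of `screen_`, `player_` and `game_` below are placeholders (only the 154 byte size comes from the post), `struct shared_area` and `SHARED` are hypothetical names, and `ram_tiles` is a stand-in for the kernel's array:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RAM_TILE_BYTES   64u
#define RAM_TILES_COUNT  40u

/* Stand-in for the kernel's RAM tile array. */
static uint8_t ram_tiles[RAM_TILES_COUNT * RAM_TILE_BYTES];

/* Placeholder layouts for illustration; only the sizes matter here. */
struct screen_ { uint8_t data[154]; };
struct player_ { uint8_t x, y, dir, hp; };
struct game_   { uint8_t flags[8]; };

/* Everything that shares the three RAM tiles, laid out by the compiler. */
struct shared_area {
    struct screen_ screen;
    struct player_ player;
    struct game_   game;
};

/* Fail the build if the structures outgrow the three reserved tiles. */
static_assert(sizeof(struct shared_area) <= 3u * RAM_TILE_BYTES,
              "shared area does not fit in 3 RAM tiles");

/* The shared area starts at RAM tile 34, as in the post. */
#define SHARED ((struct shared_area *)&ram_tiles[34u * RAM_TILE_BYTES])
```

Access then reads as `SHARED->screen.data[0]` or `SHARED->player.x`, and adding a member that pushes the total past 192 bytes becomes a compile error rather than silent corruption.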

Being able to regain 154 bytes and then reload it is awesome.

So yeah... I'm going to create a sort of memory manager/data swapping system, and a simple sprite engine.

My sprite engine would do blitting, mirroring (x-flip, y-flip). I have a system for this in C but I've long thought that it was a bit crazy. What do you think?

// Used For NPCs.
void flipRamTileMap(const char * map){
	// http://rosettacode.org/wiki/Generic_swap#C
	void swapArray(void *va, void *vb, size_t s)
	{
	  char t, *a = (char*)va, *b = (char*)vb;
	  while(s--)
		t = a[s], a[s] = b[s], b[s] = t;
	}
	// Mirror one RAM tile horizontally by reversing each of its 8 pixel rows.
	void ramtile_xFlip(u8 ramtile_id){
		// Reverse arr[start..end] in place.
		void reverseArray(unsigned char *arr, int start, int end){
			unsigned char temp;
			while (start < end)
			{
				temp = arr[start];
				arr[start] = arr[end];
				arr[end] = temp;
				start++;
				end--;
			}
		}

		u16 addr;

		// Each RAM tile is 64 bytes: 8 rows of 8 pixels.
		for(u8 i=0; i<8; i++){
			addr = (ramtile_id*64) + (i*8) ;
			reverseArray(&ram_tiles[ addr ], 0, 8-1);
		}

	}

	// ASSUMES 2X2 MAP!

	// Get the dimensions so we know how many tiles are in the tilemap.
	u8 width  = pgm_read_byte(&(map[0])) ;
	u8 height = pgm_read_byte(&(map[1])) ;
	u16 ramtile1_addr;
	u16 ramtile2_addr;

	// Flip the tiles in the tilemap.
	for(u8 i=0; i<width*height; i++){
		ramtile_xFlip( pgm_read_byte(&(map[i+2])) );
	}

	// Swap the tiles horizontally.
	for(u8 i=0; i<width*height; ){
		ramtile1_addr = 64*(pgm_read_byte(&(map[i+2]))); i++;
		ramtile2_addr = 64*(pgm_read_byte(&(map[i+2]))); i++;
		swapArray( &ram_tiles[ramtile1_addr], &ram_tiles[ramtile2_addr], 64);
	}
}
Should this be ASM instead? Would it matter much?
Jubatian
Posts: 1561
Joined: Thu Oct 01, 2015 9:44 pm
Location: Hungary

Re: An RPG on Uzebox

Post by Jubatian »

nicksen782 wrote: Thu Aug 09, 2018 6:45 pm My sprite engine would do blitting, mirroring (x-flip, y-flip). I have a system for this in C but I've long thought that it was a bit crazy. What do you think?
It is fine to have it; your game simply has different needs than what the regular sprite engine serves, so you make your own.

Since you are staying within tile boundaries, you produce code which is rather easy to optimize for the C compiler (fixed 8 or 64 iteration loops). You don't need to go assembly, the added benefit is that you can port your game to different targets easier if you ever wanted to do that later.

The code looks a little odd to me in that you have routines for in-place flips (or at least flips within RAM). It would be more efficient to generate your sprites the right way round directly from the source (ROM or SPI RAM, I guess). Staying within tile boundaries, this isn't very difficult to do, and it can easily be got right with some trial and error even if you mess it up at first.
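
A sketch of what flipping during the copy might look like for a single 8x8 tile, assuming 64 bytes per tile in row-major order. `copy_tile_flipped` and `mirrored_column` are hypothetical names, and both pointers are plain RAM here; on the AVR a source in flash would be read with pgm_read_byte() instead of a direct dereference:

```c
#include <assert.h>
#include <stdint.h>

#define TILE_W 8
#define TILE_H 8

/* Copy one 8x8 tile (64 bytes, row-major) from src to dst, mirroring it on
   the way. No separate in-place flip pass is needed afterwards. */
void copy_tile_flipped(uint8_t *dst, const uint8_t *src,
                       uint8_t flip_x, uint8_t flip_y)
{
    assert(dst != src);  /* this is a copy, not an in-place flip */
    for (uint8_t y = 0; y < TILE_H; y++) {
        uint8_t sy = flip_y ? (uint8_t)(TILE_H - 1 - y) : y;
        for (uint8_t x = 0; x < TILE_W; x++) {
            uint8_t sx = flip_x ? (uint8_t)(TILE_W - 1 - x) : x;
            dst[y * TILE_W + x] = src[sy * TILE_W + sx];
        }
    }
}

/* In a multi-tile map, an x-flip also reverses the order of the tiles in
   each row: column 'col' of the result comes from this source column. */
uint8_t mirrored_column(uint8_t col, uint8_t width)
{
    return (uint8_t)(width - 1 - col);
}
```

For a 2x2 map, `mirrored_column` swaps columns 0 and 1, which corresponds to the tile swap step in the posted code.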
nicksen782
Posts: 714
Joined: Wed Feb 01, 2012 8:23 pm
Location: Detroit, United States

Re: An RPG on Uzebox

Post by nicksen782 »

Jubatian wrote: Thu Aug 09, 2018 7:13 pm The code looks a little odd to me in that you have routines for in-place flips (or at least flips within RAM). It would be more efficient to generate your sprites the right way round directly from the source (ROM or SPI RAM, I guess). Staying within tile boundaries, this isn't very difficult to do, and it can easily be got right with some trial and error even if you mess it up at first.
Yes, in-place flips within RAM. The "sprite" flips routinely. Generating it flipped at the start would not be much help since it needs to flip repeatedly over time.

Is there a better way to do the flipping? The code seems simple enough. I used inline functions because they weren't going to be used anywhere else.
Jubatian
Posts: 1561
Joined: Thu Oct 01, 2015 9:44 pm
Location: Hungary

Re: An RPG on Uzebox

Post by Jubatian »

nicksen782 wrote: Thu Aug 09, 2018 7:40 pm Yes, in-place flips within RAM. The "sprite" flips routinely. Generating it flipped at the start would not be much help since it needs to flip repeatedly over time.
I just assumed what is probably the most common workflow for this stuff: using a source (ROM or SPI RAM) and a colorkey to get a sprite onto a destination (with colorkeyed transparency), done in every frame. You likely have a different workflow, which could be just fine if it fits your game.
nicksen782
Posts: 714
Joined: Wed Feb 01, 2012 8:23 pm
Location: Detroit, United States

Re: An RPG on Uzebox

Post by nicksen782 »

Jubatian wrote: Thu Aug 09, 2018 5:01 pm Yup, multiple files, yes, and also keep in mind that seeking is slow on FAT, that's why I provided functions to load and store file positions (FS_Get_Pos() and FS_Set_Pos()). If you plan to use them and so create such a table of positions for important stuff in a file, that table can also be stored in the SPI RAM.
I've got the memory swaps working. I have a question about FS_Get_Pos() and FS_Set_Pos(). Do they store the start position of the file or do they store a position within a file?

u32 t32;
t32 = FS_Find(sdFile1Ptr ..... )
FS_Select_Cluster(sdFile1Ptr, t32);

What is t32 here? Is this the first cluster of the file? I just want to make sure since it appears that way and it is not part of sd_struct.

Let's say you seek to a sector (multiple of 512) and you want byte 500. You would need to get that whole sector and skip bytes until byte 500, right? It looks like the sector gets written to ->bufp. Could I just do sd_struct.bufp[500]? Again, I just want to make sure.
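
Independent of how the library exposes its buffer (which this sketch does not confirm), the arithmetic for finding a byte works out as follows: divide the byte position by 512 to get the sector, and take the remainder as the offset within that sector. `struct sector_pos`, `locate` and `byte_from_sector` are hypothetical names, and `sector_buf` is assumed to point at a fully loaded 512 byte sector:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* A byte position within a file, split into sector and in-sector offset. */
struct sector_pos {
    uint32_t sector;  /* which 512 byte sector, counted from the file start */
    uint16_t offset;  /* byte index inside that sector, 0..511 */
};

/* Split an absolute byte position into (sector, offset). */
struct sector_pos locate(uint32_t byte_pos)
{
    struct sector_pos p;
    p.sector = byte_pos / SECTOR_SIZE;
    p.offset = (uint16_t)(byte_pos % SECTOR_SIZE);
    return p;
}

/* Read one byte out of a sector that has already been loaded. */
uint8_t byte_from_sector(const uint8_t *sector_buf, uint16_t offset)
{
    assert(offset < SECTOR_SIZE);
    return sector_buf[offset];
}
```

So byte 500 of the sector that was just read is simply index 500 of the loaded buffer, and a file position such as 2036 lands in sector 3 at offset 500.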

Presently I read from the SD card once in the game to fill SPIRAM and then that's it. But, if I want to swap stuff back and forth then I'll need to do more SD access and I would like to make sure that I understand what is going on.