Unlimited RAM/ROM

rv6502
Posts: 80
Joined: Mon Feb 11, 2019 4:27 am

Unlimited RAM/ROM

Post by rv6502 »

A borderline-useful idea I've been working on for a while.



For reference, a C64 does Dhrystone in 28 milliseconds per run :lol:

I got it functional enough yesterday (or rather this morning at 3am); there's still lots of room for improvement.

Looks like my SPI RAM chip arrived just as I was capturing this. It'll probably run a lot faster once I implement support for it.
But it's optional! :lol: :lol: :lol:

*Crowd gasps in horror as what just happened dawns on them. As if millions of flash bits suddenly cried out in terror and were suddenly silenced*

Re: Unlimited RAM/ROM

Post by rv6502 »

I added a cache flush / SD write counter.

Dhrystone causes 456 SD card writes to the swap file in its 2.75-minute run.
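
To make it clearer what such a counter measures, here is a minimal toy model of a write-back cache with LRU eviction. It is only a sketch: the line size, line count, and all names (`WriteBackCache`, `flush_count`) are made up for illustration and are not taken from the actual VM. The point is that only evicting a dirty line costs a write to the backing store, so the counter grows much more slowly than the number of VM stores.

```python
from collections import OrderedDict

LINE_SIZE = 64   # hypothetical cache line size in bytes
NUM_LINES = 16   # hypothetical line count (16 x 64 bytes = 1 KB of data)


class WriteBackCache:
    """Toy LRU write-back cache in front of a slow backing store."""

    def __init__(self, backing: bytearray):
        self.backing = backing
        self.lines = OrderedDict()   # tag -> [data, dirty]; order tracks LRU
        self.flush_count = 0         # number of write-backs to the store

    def _load(self, tag):
        if tag in self.lines:
            self.lines.move_to_end(tag)          # mark as most recently used
            return self.lines[tag]
        if len(self.lines) >= NUM_LINES:
            # Evict the least recently used line.
            old_tag, (old_data, dirty) = self.lines.popitem(last=False)
            if dirty:                            # only dirty lines cost a write
                start = old_tag * LINE_SIZE
                self.backing[start:start + LINE_SIZE] = old_data
                self.flush_count += 1
        start = tag * LINE_SIZE
        entry = [bytearray(self.backing[start:start + LINE_SIZE]), False]
        self.lines[tag] = entry
        return entry

    def read(self, addr):
        entry = self._load(addr // LINE_SIZE)
        return entry[0][addr % LINE_SIZE]

    def write(self, addr, value):
        entry = self._load(addr // LINE_SIZE)
        entry[0][addr % LINE_SIZE] = value
        entry[1] = True                          # line must be flushed on eviction
```

Filling every line with writes and then touching one more address forces exactly one flush, which is the event the counter would record.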

Assuming a 1MB erase block size, 3000 write/erase cycles for MLC flash, that every write costs one erase cycle (it's probably way better than this), and uniform wear leveling on an almost empty 32GB card, I get about 98,304,000 writes.

So if I got my math right, that $5 SD card has only about 412 days to live (98,304,000 writes at 456 writes per 2.75-minute run) if I left dry.c running in a loop.
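
The estimate can be worked through with a quick script. This is just a sketch using the figures quoted in this post (erase block size, endurance, and perfect wear leveling are all assumptions stated above, not measurements). Note that the write budget has to be divided by the 456 writes per run before multiplying by the run time; doing so gives roughly 412 days of continuous looping.

```python
CARD_BYTES = 32 * 1024**3        # 32 GB card, assumed almost empty
ERASE_BLOCK_BYTES = 1024**2      # assumed 1 MB erase block
ERASE_CYCLES = 3000              # assumed MLC endurance per block
WRITES_PER_RUN = 456             # swap-file writes per Dhrystone run (from the post)
RUN_MINUTES = 2.75               # duration of one run (from the post)

# Pessimistic model: every write costs one full erase cycle somewhere on the card.
blocks = CARD_BYTES // ERASE_BLOCK_BYTES
total_writes = blocks * ERASE_CYCLES

# How many complete runs fit in that budget, and how long they take.
runs = total_writes / WRITES_PER_RUN
lifetime_days = runs * RUN_MINUTES / (60 * 24)

print(f"write budget: {total_writes}")
print(f"lifetime: {lifetime_days:.1f} days")
```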

Re: Unlimited RAM/ROM

Post by rv6502 »

The SPI RAM chip is so easy to get working <3

That's a wee bit faster:
[Screenshot: Screenshot_2019-10-25_23-46-03.jpg]
Almost 1 Dhrystone per second!

Now barely 38 times slower than a C64. :lol:

From the previous:
[Screenshot: Screenshot_2019-10-25_22-45-20.jpg]

Re: Unlimited RAM/ROM

Post by rv6502 »

Boom.
[Screenshot: Screenshot_2019-10-26_01-22-07.jpg]
1.31 Dhrystones per second.

(Mostly from cache-aligning the string copy functions. Cache misses are brutal on that "CPU".)

Now to rewrite the CPU interpreter code in AVR assembly.

Maybe if I can reach 2 Dhrystones per second it'd be almost useful. :lol:
uze6666
Site Admin
Posts: 4778
Joined: Tue Aug 12, 2008 9:13 pm
Location: Montreal, Canada

Re: Unlimited RAM/ROM

Post by uze6666 »

Though it looks pretty cool, I'm unsure of what we are looking at!? :oops: Do I understand correctly that you are running some sort of other CPU emulation (6502?) and using the SPI RAM as the main RAM for that CPU? If so, that's pretty badass. And even if it is slow, that could be a very cool program on the Uzebox. Heck, I've wanted to emulate an Apple 2 for sooo long. :P

Re: Unlimited RAM/ROM

Post by rv6502 »

It's Dhrystone's dry.c running on a custom virtual 32-bit CPU on a Uzebox.

It uses a binary on the SD card as ROM, the SPI RAM as main RAM, and can use a swap file on the SD card as extra RAM.

The AVR SRAM/IO space is direct-mapped as well.
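
As an illustration of how a single 32-bit address space can be split across those backing stores, here is a small decoding sketch. The region boundaries and names below are entirely hypothetical (the real VM's memory map is not given here); the first region assumes the ATmega644 data space (registers, I/O, and 4 KB SRAM, 0x0000 to 0x10FF) and the second assumes a 128 KB SPI RAM chip.

```python
# Hypothetical memory map: (name, first address, size in bytes).
# These boundaries are illustrative assumptions, not the actual VM layout.
REGIONS = [
    ("avr_sram_io", 0x0000_0000, 0x0000_1100),  # direct-mapped AVR data space
    ("spi_ram",     0x0001_0000, 0x0002_0000),  # main RAM, assumed 128 KB chip
    ("sd_rom",      0x1000_0000, 0x1000_0000),  # program binary on the SD card
    ("sd_swap",     0x2000_0000, 0x1000_0000),  # swap file on the SD card
]


def decode(addr):
    """Map a VM address to (region name, offset inside that region)."""
    for name, base, size in REGIONS:
        if base <= addr < base + size:
            return name, addr - base
    raise ValueError(f"unmapped VM address 0x{addr:08X}")


print(decode(0x0000_0100))   # an address in the AVR data space
print(decode(0x0001_0005))   # an address in SPI RAM
print(decode(0x1000_0200))   # an address in the SD-backed ROM
```

An interpreter would run a check like this (or a cheaper bit test on the top address bits) on every load and store to pick which device to talk to.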

I just finished rewriting the CPU interpreter in AVR assembly tonight. It's 18% faster. Broke the 2 Dhrystones/second barrier! :lol:


I made a Clang++/LLVM backend for it, which still needs quite a bit of work. There's room for optimisations there too.
And I need to somehow cross-link the symbols from the AVR code with the VM-target code so I don't have to use hand-typed addresses.

It's slow but it's got a 32-bit address space and a C++ compiler, so plenty of sluggish possibilities.

Text adventures... Rogue, maybe.
I need to find where I've put my last PS/2 keyboard and build that adaptor.
Jubatian
Posts: 1560
Joined: Thu Oct 01, 2015 9:44 pm
Location: Hungary

Re: Unlimited RAM/ROM

Post by Jubatian »

Wow, this is super-cool!

By the way, what about the instruction set of the emulated CPU? Could you give me some pointers on where I should start if I wanted to make a Clang/LLVM backend? How complicated is that? I am really interested, as I have had an instruction set lying around for a while which I pushed to the max for emulation performance, while it is still designed in a manner that it should do well as an actual HW implementation as well.

Did you consider RISC-V, by the way? That's a pretty decent instruction set for emulation, although due to the layout of the opcodes, a translation mechanism is likely necessary (at least in my plans I tackle it that way).

Maybe you could combine it with my SPI RAM video kernel (this one), using the bitmap mode for creating complex graphics; that could be interesting for text + static graphics adventures.

Re: Unlimited RAM/ROM

Post by rv6502 »

I don't think it's possible to use SPI in the video kernel if that VM is running as it needs to read its "ROM" from the SD card.

Multiplexing the bus isn't reliable since there's no telling how long the SD card will delay and as far as I know the SD card's /CS line needs to stay asserted until the transfer is completed.

It's over 13,000 cycles to transfer a block of data, not counting command overhead and waiting for the SD card to find the block before the data is actually sent over SPI.

It'd be pretty ugly if the display has to blank every time the VM runs. Seizure rave.
And it'd be very slow if the VM had to wait for the vblank to initiate a transfer, and that's hoping that the SD card is ready quickly enough to release before vblank ends.
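
To put the 13,000-cycle figure in perspective, here is a rough back-of-the-envelope sketch. The clock rate (28.63636 MHz) and the 1820 cycles per scanline are assumptions about the standard NTSC Uzebox setup, and 13,000 cycles is the lower bound quoted above, before command overhead and card latency.

```python
CPU_HZ = 28_636_360          # assumed Uzebox AVR clock (8x NTSC colour burst)
CYCLES_PER_SCANLINE = 1820   # assumed NTSC scanline length in CPU cycles
BLOCK_CYCLES = 13_000        # lower bound for one block transfer (from the post)

# Duration of the data phase alone, in microseconds.
block_us = BLOCK_CYCLES / CPU_HZ * 1e6

# The same span expressed in video scanlines that would go unserviced.
scanlines = BLOCK_CYCLES / CYCLES_PER_SCANLINE

print(f"one block: {block_us:.0f} us, about {scanlines:.1f} scanlines")
```

Under those assumptions a single block occupies the bus for a bit over seven scanlines even in the best case, which is why it cannot simply be slotted between video accesses.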

If we could get the SD card to hold on while we use the SPI RAM, or bit-bang SPI to the SD card on completely separate pins, it'd be doable.

The VM *can* run off SRAM but there isn't enough left to do anything very interesting.

I've designed the instruction set to require very little code in the interpreter.
It's only around 1100 bytes of AVR code.

The trade-off being fairly huge code on the SD card (whatever), which also means lots of SD reads (less of a "whatever").
And also that no useful VM code can fit in what's left of SRAM.

Because SD card transfers are slow, you want as much VM cache as possible. Right now I've got:
1KB + bookkeeping variables (cache tag, dirty flag, LRU tracking) allocated to the cache,
64 bytes for registers (16x 32 bits),
2KB for VRAM,
128 bytes for ARAM.
I don't remember how much my MMC/SD driver & FAT12/16/32 driver code takes. It's not a lot but it adds up.

I got very little SRAM left.
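
Adding up the figures listed above gives a feel for how tight it is. This is only a sketch: it assumes the ATmega644's 4096 bytes of SRAM and leaves out everything whose size is not stated (cache bookkeeping, SD/FAT driver state, kernel variables, the stack).

```python
SRAM_TOTAL = 4096  # assumed ATmega644 SRAM size in bytes

# Sizes quoted in the post; items without a stated size are omitted.
budget = {
    "VM cache data": 1024,
    "VM registers (16 x 32 bits)": 16 * 4,
    "VRAM": 2048,
    "ARAM": 128,
}

used = sum(budget.values())
remaining = SRAM_TOTAL - used

for name, size in budget.items():
    print(f"{name:30s} {size:5d} bytes")
print(f"{'left for everything else':30s} {remaining:5d} bytes")
```

Whatever remains has to cover the cache tags, the drivers, the kernel's own variables, and the stack.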

So the only practical memory layout I see with the VM is to have the video state data in SRAM, graphics and audio assets in AVR flash or SRAM.
Some VM helper routines in AVR code (memcpy, division/modulo, strlen, bitblit, etc)
Then all the game code in the VM, using SPI RAM and SD ROM/RAM.

Re: Unlimited RAM/ROM

Post by rv6502 »

Jubatian wrote: Sat Nov 02, 2019 12:33 pm
Could you give me some pointers on where I should start if I wanted to make a Clang/LLVM backend? How complicated is that?

I forgot to take notes :P

I started by cloning the ARC target and renaming it, then built it to make sure the new target existed in the executable (that I could do --target MyTargetName and that it still built for ARC), then gutted it of everything ARC.

Essentially starting from scratch other than all the empty function stubs and feature descriptors.

The LLVM Backend documentation/tutorial recommends using Sparc as the base but I found Sparc too complicated.

After that it's just looking at targets with similar features to see how it was described in the opcode tables.
AVR, ARM (ignoring thumb), ARC, and Mips seem the easiest to follow.

I looked at X86 a few times for how they described some opcodes but that target has so much code to enable/disable features and reorder instructions/registers, etc.

I did lots of cross-referencing between multiple targets to figure out what's the common/simplest way between them.

https://llvm.org/docs/WritingAnLLVMBackend.html

Oh, one thing I remember: DO NOT DESCRIBE THE MOV OPCODE TO MATCH WITH THE INTERNAL LANGUAGE. It has to be done programmatically in a specific function.

Otherwise LLVM's magical automatic opcode selector selects MOV for no reason, completely incorrectly, for just about any operation. So what should be ADD R1, R2 becomes MOV R1, R2.

Took me a while to figure that one out.

Also read this book: https://en.wikipedia.org/wiki/Compilers ... _and_Tools
It's still very relevant for understanding LLVM.
I highly recommend it.

Re: Unlimited RAM/ROM

Post by Jubatian »

Thank you for the LLVM info!

For the video, eh, sure, the SPI RAM kernel was made with the assumption that you would use the SD card lightly (given that there is SPI RAM), typically processing at most one sector per frame (which should fit, so in that case there is no flicker; and yes, running over VBlank would automatically cause a blank frame with that kernel).

So it seems like you have 2K for video then, and that's all. Mode 42? Using that for a 40x25 display would take 2000 bytes of VRAM: 1000 bytes for the characters and 1000 for FG/BG attributes (from a 16 colour palette). That could be nice even for porting something (40x25 was quite common, at least).
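
The 2000-byte figure can be sketched as one character byte plus one attribute byte per cell, with the two 4-bit palette indices packed into the attribute. This is a generic illustration only: the nibble order and the two-array layout are assumptions and not necessarily how Mode 42 actually stores its data.

```python
COLS, ROWS = 40, 25

chars = bytearray(COLS * ROWS)   # one character code per cell
attrs = bytearray(COLS * ROWS)   # one packed FG/BG attribute per cell


def pack_attr(fg, bg):
    """Pack two 4-bit palette indices into one byte (FG high nibble, assumed)."""
    return ((fg & 0x0F) << 4) | (bg & 0x0F)


def unpack_attr(attr):
    """Return (fg, bg) palette indices from a packed attribute byte."""
    return attr >> 4, attr & 0x0F


def put_char(col, row, ch, fg, bg):
    """Write one character cell with its colours."""
    i = row * COLS + col
    chars[i] = ch
    attrs[i] = pack_attr(fg, bg)


put_char(0, 0, ord("@"), fg=15, bg=1)
print(len(chars) + len(attrs), "bytes of VRAM")
```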

I also created a concept for a new RLE mode recently, probably more compact than CunningFellow's: 14 colours; uncompressible sequences take half a byte / pixel, 3-5 pixel runs take 1 byte, longer runs take 2 bytes; 5 cycles / pixel (roughly square pixels). It could do for some super-slow simple 3D rendering, or even images, although 2000 bytes is pretty limited. I dunno whether CunningFellow's own RLE mode could be useful in some way, since it is apparently designed to allow interleaving mainline code during runs. Anyway, that is all quite unusual stuff.