Timing issues in uzem140

The Uzebox now has a fully functional emulator! Download and discuss it here.
uze6666
Site Admin
Posts: 4801
Joined: Tue Aug 12, 2008 9:13 pm
Location: Montreal, Canada

Timing issues in uzem140

Post by uze6666 »

This thread follows this one. After all the optimization, I'm going through the tracing of the Mode13Demo hex file to correct the timing. Here is the timing issue I was referring to, the one that leads to the "fat" pixel in the last column.

I traced the attached HEX in AvrStudio Simulator V2. This bit of code is part of the scanline renderer: it outputs the last pixel of the last tile on the line. A Timer1 overflow interrupt happens right after that final pixel OUT. This is the expected behavior according to the simulator, and it matches what I see on the hardware.

videomode13core.s, line 579:
	out VIDEO,r16
	add ZL,r24
	adc ZH,r18
	lpm XL,Z+       ; <-- Timer1 rolls over during this instruction. Before execution, TCNT1=0xFFFD.
mainloop:
	out VIDEO,r17   ; <-- Before execution, TCNT1=0x0000 and the TOV1 flag is set. The interrupt vector is taken after this OUT executes.
	brts romloop

In the uzem140 branch:

videomode13core.s, line 579:
	out VIDEO,r16
	add ZL,r24
	adc ZH,r18
	lpm XL,Z+       ; <-- Timer1 rolls over during this instruction. Before execution, TCNT1=0xFFFD. Afterwards, update_hardware rolls the timers over and sets the TOV flag; the interrupt flag check that follows then takes the vector right away.
mainloop:
	out VIDEO,r17   ; <-- This is not executed before the vector is taken, so we are one cycle short on that scanline.
	brts romloop

So it seems that on the real thing, the interrupt is actually acknowledged after the next cycle/instruction following the cycle where the timer went from 0xffff to 0x0000.
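
The one-cycle shortfall can be checked with a little arithmetic. The sketch below is only an illustration, not uzem code: `TRACE` and `cycles_before_vector` are hypothetical names, the cycle counts (1 for OUT/ADD/ADC, 3 for LPM) are the usual ATmega644 values, Timer1 is assumed to tick once per CPU cycle, and the starting value 0xFFFA is inferred from TCNT1=0xFFFD before the LPM. The "one more instruction runs" rule is a simplification that fits this trace; it is not meant as a general model of AVR interrupt latency.

```python
# Traced instructions with their cycle counts (assumed ATmega644 timings).
TRACE = [
    ("out VIDEO,r16", 1),
    ("add ZL,r24", 1),
    ("adc ZH,r18", 1),
    ("lpm XL,Z+", 3),
    ("out VIDEO,r17", 1),
]


def cycles_before_vector(trace, tcnt1, extra_instruction):
    """Return (cycles, instructions) executed before the overflow vector is taken.

    extra_instruction=True  -> one more instruction runs after the one in
                               which TCNT1 wrapped (simulator/hardware trace).
    extra_instruction=False -> the vector is taken right after the instruction
                               in which TCNT1 wrapped (uzem140 before the fix).
    """
    executed = []
    total = 0
    overflowed = False
    for name, cycles in trace:
        if overflowed:
            if extra_instruction:
                executed.append(name)
                total += cycles
            return total, executed
        executed.append(name)
        total += cycles
        tcnt1 += cycles
        if tcnt1 > 0xFFFF:       # 16-bit counter wrapped during this instruction
            tcnt1 &= 0xFFFF
            overflowed = True
    return total, executed


hw_cycles, hw_instrs = cycles_before_vector(TRACE, 0xFFFA, extra_instruction=True)
emu_cycles, emu_instrs = cycles_before_vector(TRACE, 0xFFFA, extra_instruction=False)
print(hw_cycles, emu_cycles)  # 7 6
```

Under these assumptions the hardware-like rule executes the final `out VIDEO,r17` before the vector, while the immediate rule stops one cycle earlier.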

From the Atmega644 Datasheet on Timer1:
14.9.1 Normal Mode
The simplest mode of operation is the Normal mode (WGMn3:0 = 0). In this mode the counting
direction is always up (incrementing), and no counter clear is performed. The counter simply
overruns when it passes its maximum 16-bit value (MAX = 0xFFFF) and then restarts from the
BOTTOM (0x0000). In normal operation the Timer/Counter Overflow Flag (TOVn) will be set in
the same timer clock cycle as the TCNTn becomes zero.
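
As a rough illustration of the quoted Normal mode behavior, the sketch below steps a 16-bit up-counter one clock at a time and reports the cycle on which the overflow flag would be set. `run_timer1` is a hypothetical helper, and it assumes a prescaler of 1, so one timer clock per CPU cycle.

```python
def run_timer1(tcnt1, cycles):
    """Advance a 16-bit up-counter by `cycles` timer clocks.

    Returns (new_tcnt1, tov_cycle). tov_cycle is the 1-based clock within this
    call on which the overflow flag was set, or None if no overflow occurred.
    """
    tov_cycle = None
    for cycle in range(1, cycles + 1):
        tcnt1 = (tcnt1 + 1) & 0xFFFF   # wrap from MAX (0xFFFF) to BOTTOM (0x0000)
        if tcnt1 == 0x0000:
            tov_cycle = cycle          # TOV is set in the clock where TCNT becomes zero
    return tcnt1, tov_cycle


# The traced LPM: 3 cycles starting from TCNT1=0xFFFD.
print(run_timer1(0xFFFD, 3))  # (0, 3)
```

For the traced LPM, the flag is set on the third and last cycle of the instruction, which is the case discussed in this thread.
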
I could not find any detailed diagrams explaining this kind of interrupt timing. It may have to do with how the AVR pipelines the instructions.
Jubatian
Posts: 1564
Joined: Thu Oct 01, 2015 9:44 pm
Location: Hungary

Re: Timing issues in uzem140

Post by Jubatian »

Right now this behavior is implemented in the code (in the original as well as my branch) by delaying the timer interrupts to the next update_hardware call if they happen to fire at the last cycle of the current one. It should be functional (well, "should" doesn't mean it is :) ), anyway the effort to perform what you describe seems to be there. The emulation of LPM doesn't even break up the decoding into multiple update_hardware calls, so it should be a clean case, 0xFFFD incrementing to 0x0000. Oh, wait... Someone forgot to add the interrupt delaying magic to the TOV1 flag!! :D (So there's the bug!)
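
A minimal sketch of the deferral described above, under stated assumptions: `Timer1Model` and `delay_last_cycle_irq` are hypothetical names, the real uzem code is structured differently, and only the overflow interrupt of a timer ticking once per CPU cycle is modeled. Clearing of the flag and the global interrupt enable are left out.

```python
class Timer1Model:
    """Toy model of Timer1 overflow handling (not the actual uzem code)."""

    def __init__(self, tcnt1, delay_last_cycle_irq=True):
        self.tcnt1 = tcnt1
        self.tov1 = False
        self.irq_deferred = False
        self.delay_last_cycle_irq = delay_last_cycle_irq

    def update_hardware(self, cycles):
        """Advance the timer by `cycles`; return True if the vector is taken now."""
        take_irq = self.irq_deferred       # an interrupt deferred by the previous call fires here
        self.irq_deferred = False
        for cycle in range(1, cycles + 1):
            self.tcnt1 = (self.tcnt1 + 1) & 0xFFFF
            if self.tcnt1 == 0:
                self.tov1 = True
                if self.delay_last_cycle_irq and cycle == cycles:
                    self.irq_deferred = True   # overflow on the last cycle: wait for the next call
                else:
                    take_irq = True
        return take_irq


# One update_hardware call per traced instruction: out, add, adc, lpm, out.
CYCLES = (1, 1, 1, 3, 1)
fixed = Timer1Model(0xFFFA, delay_last_cycle_irq=True)
buggy = Timer1Model(0xFFFA, delay_last_cycle_irq=False)
print([fixed.update_hardware(c) for c in CYCLES])  # [False, False, False, False, True]
print([buggy.update_hardware(c) for c in CYCLES])  # [False, False, False, True, False]
```

With the deferral, the vector is taken only after the final OUT; without it, the vector is taken right after the LPM, as in the uzem140 trace.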

What also concerns me is the delay after setting the timer, before its first increment, which in the current code is described as happening after the instruction's completion. I feel this may be bogus. Isn't that simply 1 cycle as well (after the write, not necessarily after the start of the instruction, see ST)? The current implementation is buggy (or at least isn't consistent with its documentation) since update_hardware is no longer only called after the instruction decoder (this was introduced to solve cycle perfection problems related to the LD / ST instructions' effects).

Re: Timing issues in uzem140

Post by uze6666 »

Jubatian wrote:Oh, wait... Someone forgot to add the interrupt delaying magic to the TOV1 flag!! :D (So there's the bug!)
You're the man, that was indeed the issue. I fixed it and the timing is perfect again!! :mrgreen: I ran 2 full frames and the cycle count matches perfectly between the simulator and uzem.

While at it, I will also remove the instructions that are not on the 644 (like ELPM etc). At some early point, there was talk of using the upcoming Atmega1284, so uzem included them. But alas, the 1284 can't be overclocked, and we never removed them. Also, one of your comments was "LPM (r0 implied, why is this special?)"; on the first version of the 8-bit AVR architecture (before the Megas), this was the only instruction to access the flash.

I will commit tonight, in case you want to pull the changes into your branch.

Re: Timing issues in uzem140

Post by uze6666 »

Fix was pushed to uzem140 branch.

Re: Timing issues in uzem140

Post by Jubatian »

Huh, good! :) That particular comment wasn't from me, I carried it through from the old code (it has that on that instruction, just copy-pasted, but Git couldn't follow that huge rearrangement).

What if I assume that the delay after setting the timer is simply 1 cycle? The code documents it as "after the instruction", but I really think it is simply 1 cycle, which coincides with "after the instruction" since probably all port writes have their effect at the last cycle. This assumption could help clean things up, and as far as I can see it should be valid. This is for a bit later; for now I'd rather concentrate on getting the video task done.
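
To make the two readings concrete, here is a small sketch that compares them. Everything in it is hypothetical: the function names, the `delay` parameter, and above all the assumption that the write lands on the instruction's last cycle, which is exactly what still needs to be verified against the simulator. The cycle counts (OUT = 1, ST = 2) are the usual ATmega644 values.

```python
def resume_after_instruction(instr_cycles, write_cycle, delay=1):
    """Counting resumes `delay` cycles after the instruction's last cycle."""
    return instr_cycles + delay


def resume_after_write(instr_cycles, write_cycle, delay=1):
    """Counting resumes `delay` cycles after the cycle that performs the write."""
    return write_cycle + delay


def models_agree(cases):
    """True if both readings give the same resume cycle for every case."""
    return all(
        resume_after_instruction(n, w) == resume_after_write(n, w)
        for _, n, w in cases
    )


# (instruction, total cycles, cycle performing the write), cycles counted from 1.
# ASSUMPTION: the write happens on the last cycle of the instruction.
WRITES_ON_LAST_CYCLE = [("OUT", 1, 1), ("ST", 2, 2)]

# Made-up counterexample: a 2-cycle instruction writing on its first cycle.
EARLY_WRITE = [("hypothetical", 2, 1)]

print(models_agree(WRITES_ON_LAST_CYCLE))  # True
print(models_agree(EARLY_WRITE))           # False
```

The two readings only differ for an instruction whose write lands before its last cycle, so a test would have to find such an instruction to tell them apart.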

Re: Timing issues in uzem140

Post by uze6666 »

Jubatian wrote:Huh, good! :) That particular comment wasn't from me, I carried it through from the old code (it has that on that instruction, just copy-pasted, but Git couldn't follow that huge rearrangement).

What if I assume that the delay after setting the timer is simply 1 cycle? The code documents it as "after the instruction", but I really think it is simply 1 cycle, which coincides with "after the instruction" since probably all port writes have their effect at the last cycle. This assumption could help clean things up, and as far as I can see it should be valid. This is for a bit later; for now I'd rather concentrate on getting the video task done.
Let me first check the exact behavior with the simulator. I suspect I didn't document that for nothing. :)
Artcfox
Posts: 1382
Joined: Thu Jun 04, 2015 5:35 pm

Re: Timing issues in uzem140

Post by Artcfox »

Great work guys! Earlier this morning I ran my limited regression test to make sure this did not re-introduce the missing scanline bug, and everything looks good.

Just an FYI, T2K still does not work with this new version.

Re: Timing issues in uzem140

Post by uze6666 »

Artcfox wrote: Just an FYI, T2K still does not work with this new version.
Interesting. OK, I'll check why.

Re: Timing issues in uzem140

Post by Artcfox »

uze6666 wrote:
Artcfox wrote: Just an FYI, T2K still does not work with this new version.
Interesting, Ok, I'll check out why.
That won't be necessary, git bisect found it, and I fixed it.

Re: Timing issues in uzem140

Post by Jubatian »

uze6666 wrote:Let me first check the exact behavior with the simulator. I suspect I didn't document that for nothing. :)
Not for nothing, but I suspect those comments were added at a time when update_hardware was only called at the end of the instruction decoder (that would make the most sense). Some timer hacks, and the delaying of the LD / ST instructions' effects, could have been added later to correct further timing issues. All of that combined suggests to me that the true behavior is something like what I suspect. (Without knowing about the "delay" of ST's write, it may well look as if the effect is carried out at the end of the instruction.)

Anyway, of course test it and see what it truly does. I suggest testing every instruction which can perform a port write (except those which could theoretically have such an effect by pointing the stack pointer at the port), and seeing how the effect behaves. I suspect that the result would look like "end of instruction", which in truth comes from the write actually being performed in the last cycle.

Skimming through the instruction set I see the following instructions of interest: ST (all variants), STS, STD, OUT. The instruction set manual also lists XCH, LAS, LAC and LAT instructions (which do both a load and a store), but the ATmega644 doesn't have these (by its datasheet), and the emulator doesn't implement them either. By this it seems like the "end of instruction" thing came first in an attempt to solve the problem coming from the behavior of ST variants.