I read the other topics (SDL2 porting), and I see there is an "incentive" to bring back the option of syncing to video, lured in by nice smooth scrolling.
This may sound crazy, but what about syncing to both video and audio at once?
I was toying with this thought in my own emulator project, and I feel it might be possible, although definitely not as a permanently enabled mode, just as an optional feature. Video is probably the tougher part, since you can only sync to vsync and have no timing information between two vsyncs. With sound, you can measure the lag more precisely by how full your audio buffer is. If the program syncs to video, and the display really runs at about 60Hz, then in the long run it could monitor the audio buffer's fill level and adjust the audio output rate to stay aligned with video.
The simplest way to picture it: say the emulator starts at the nominal 15734Hz and outputs samples at this rate. If it sees that the buffer is draining, it increases the rate slightly, say to 15770Hz, hoping to catch up. If it sees that the buffer keeps growing, it does the opposite and reduces the rate to, say, 15700Hz. Of course not in such a crude manner; the point is that the program should slowly discover the true skew between video and audio, and play the latter with very little fluctuation in output rate.
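To make the buffer-watching idea a bit more concrete, here is a rough sketch in Python (chosen only for brevity, not what the emulator is written in). It is a plain proportional correction, which is cruder than what I would actually want. All names, the gain, the +/-0.5% cap, and the toy model where the sound card drains at exactly the nominal rate are my assumptions for illustration.

```python
NOMINAL_RATE = 15734.0   # Hz, the nominal sample rate discussed above
MAX_DEVIATION = 0.005    # assumed cap: never move more than 0.5% off nominal


def next_rate(fill, target_fill, gain=0.5):
    """Pick the output rate for the next video frame.

    fill and target_fill are buffer levels in samples. gain is an assumed
    tuning constant: Hz of correction per sample of buffer error.
    """
    error = target_fill - fill              # positive when the buffer is draining
    rate = NOMINAL_RATE + gain * error      # draining -> speed up, growing -> slow down
    low = NOMINAL_RATE * (1.0 - MAX_DEVIATION)
    high = NOMINAL_RATE * (1.0 + MAX_DEVIATION)
    return min(max(rate, low), high)


def simulate(display_hz, frames=2000, target_fill=1024.0):
    """Toy model: one emulated frame (1/60 s of audio) per real vsync.

    The sound card is assumed to consume exactly NOMINAL_RATE samples per
    real second, while a real frame lasts 1/display_hz seconds.
    """
    fill = target_fill
    rate = NOMINAL_RATE
    for _ in range(frames):
        fill += rate / 60.0                 # samples produced by one emulated frame
        fill -= NOMINAL_RATE / display_hz   # samples consumed during one real frame
        fill = max(fill, 0.0)               # an empty buffer cannot go negative
        rate = next_rate(fill, target_fill)
    return rate, fill
```

With a display that actually runs at 59.9Hz, `simulate(59.9)` settles at a rate slightly above nominal, and at 60.1Hz slightly below. A pure proportional term like this leaves the buffer sitting a little off its target level; something that averages over many frames would probably be needed to keep the rate fluctuation really small.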
Just an idea; I think this shouldn't be attempted right now, for now let's just get the emulator working well. The change to 48KHz and removing the kernel dependency are things that should be done, I think, as soon as possible (it also makes kernel development without real hardware a bit easier). If playing back 15734Hz at 48KHz, I don't even think interpolation is really necessary, just spit out the samples as they come. Most will span three 48KHz samples, with an occasional four (48000 / 15734 is about 3.05), and I don't think that would be audible. Basically write_io just sets the sample value, then whenever the 48KHz buffer needs a sample according to the elapsed cycles, update_hardware pushes whatever is on the port (PWM output) into the audio output buffer. So it becomes completely independent of how the kernel is coded (or whether it outputs anything at all). This type of solution also makes it easier to change the elapsed cycles per sample later, so even the above idea could be built on top of it.
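A minimal sketch of that sample-and-hold scheme, again in Python just for illustration. The names `write_io` and `update_hardware` follow the description above, but the class, the list standing in for the audio buffer, and the CPU clock constant are my assumptions, not the emulator's actual code.

```python
CPU_HZ = 28_636_360   # assumed emulated CPU clock in Hz (hypothetical constant)
OUT_HZ = 48_000       # host audio output rate


class PwmAudio:
    def __init__(self):
        self.port = 0x80   # value currently on the PWM port (mid level to start)
        self.acc = 0       # progress toward the next output sample, in cycles * OUT_HZ
        self.out = []      # stands in for the real audio output buffer

    def write_io(self, value):
        # The emulated program wrote the PWM register: only remember the value.
        self.port = value & 0xFF

    def update_hardware(self, cycles):
        # Advance by the elapsed CPU cycles and emit one output sample for
        # every 1/48000 s of emulated time that has passed.
        self.acc += cycles * OUT_HZ
        while self.acc >= CPU_HZ:
            self.acc -= CPU_HZ
            self.out.append(self.port)


def run_one_second():
    """Write a new sample once per 1820-cycle scanline for 15734 lines."""
    audio = PwmAudio()
    for line in range(15734):
        audio.write_io(line & 0xFF)
        audio.update_hardware(1820)
    return audio.out
```

Keeping the accumulator in integer units of cycles * 48000 avoids any drift from floating point. Changing the effective playback rate later would only mean adjusting the threshold that the accumulator is compared against.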