Time is very tight this week, but things worth saying immediately:
- in a case of accidental fuzz testing, I discovered that the WOZs produced by dsk2woz were truncating track bit count to produce the byte count (WOZ stores both) rather than rounding up. Which caused my emulator to perform an out-of-bounds read. Fixed at both ends.
- I made some progress with NIBs, but still hate them.
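
For the dsk2woz fix in the first point above, the difference between truncating and rounding up can be sketched as follows. This is only an illustration in Python, not the actual dsk2woz or emulator code; the function names and the example bit count are made up.

```python
def byte_count_truncated(bit_count: int) -> int:
    # Drops any partial final byte, so it under-reports
    # whenever bit_count is not a multiple of eight.
    return bit_count // 8


def byte_count_rounded_up(bit_count: int) -> int:
    # Counts a partial final byte as a whole byte.
    return (bit_count + 7) // 8


bits = 50_001  # hypothetical track length, not a multiple of eight
print(byte_count_truncated(bits), byte_count_rounded_up(bits))
```

With the truncated count, a reader that trusts the bit count will walk one byte past the end of a buffer sized from the byte count.
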
The specific issue with NIBs: you are given 6656 bytes of data to represent each track. Those bytes represent all bytes read from the disk while in perfect stream synchronisation. They omit the synchronisation bits.
So you're starting with 53,248 bits of information. You definitely have to add more, because the synchronisation bits have to go somewhere.
If you were not to assume anything about the format of the track then any and all words might be followed by one or two sync bits. But if you put only a single sync bit after every single byte then you're now trying to cram 59,904 bits onto a single track. You've exceeded both the actual media density and what can be handled by the Disk II's PLL at 300RPM.
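
The arithmetic so far, as a rough sketch. The 6656-byte track size is from the NIB format as described above; the 4 µs nominal bit cell, and hence the 50,000-bit nominal track at 300RPM, is an assumption of this sketch rather than something stated in the post.

```python
NIB_TRACK_BYTES = 6656

# Assumption: a nominal 4 microsecond bit cell at 300 RPM.
ROTATION_US = 60_000_000 // 300   # 200,000 microseconds per rotation
NOMINAL_CELL_US = 4
NOMINAL_BITS = ROTATION_US // NOMINAL_CELL_US


def track_bits(sync_bits_per_byte: int) -> int:
    # Every byte contributes its 8 data bits plus the given number of sync bits.
    return NIB_TRACK_BYTES * (8 + sync_bits_per_byte)


for sync in (0, 1, 2):
    total = track_bits(sync)
    print(f"{sync} sync bits per byte: {total} bits, "
          f"{total / NOMINAL_BITS:.2f}x nominal")
```
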
Suppose you make a limited data-format assumption and say that only patches of five or more FFs are sync words. Then you still frequently end up with more bits than can be parsed. Especially if any sector contents are ever padded with FFs or contain a lookup table with many FFs.
Okay, so you've got to make at least one more assumption. You can assume the Apple sector and header prologues, and that only FFs immediately before those are sync words. Even then you can't afford to treat all the FFs as sync words on most disk images before you blow your track bit budget.
So if you're me you then arrive at assuming you know the prologues, and marking at most five FFs before a prologue for extra sync bits. For Apple 6-and-2 encoding that's usually going to be only 2*5*16 = 160 extra bits per track, for 53,408 bits total. Which is pushing it but within bounds.
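
A minimal sketch of that heuristic, not the emulator's actual code. It assumes the standard 16-sector prologues (D5 AA 96 for address fields, D5 AA AD for data fields), ignores wraparound at the end of the track, and all names are hypothetical.

```python
# Assumption: the standard 16-sector address and data prologues.
ADDRESS_PROLOGUE = bytes([0xD5, 0xAA, 0x96])
DATA_PROLOGUE = bytes([0xD5, 0xAA, 0xAD])
PROLOGUES = (ADDRESS_PROLOGUE, DATA_PROLOGUE)

MAX_SYNC_RUN = 5        # mark at most this many FFs before each prologue
SYNC_BITS_PER_BYTE = 2  # extra zero bits appended to each marked FF


def find_sync_bytes(track: bytes) -> set[int]:
    """Return indices of FF bytes to be treated as sync words."""
    sync = set()
    for prologue in PROLOGUES:
        start = track.find(prologue)
        while start != -1:
            # Walk backwards over FFs directly before the prologue,
            # stopping at the first non-FF or after MAX_SYNC_RUN bytes.
            index = start - 1
            while (index >= 0 and start - index <= MAX_SYNC_RUN
                   and track[index] == 0xFF):
                sync.add(index)
                index -= 1
            start = track.find(prologue, start + 1)
    return sync


def total_bits(track: bytes) -> int:
    return len(track) * 8 + len(find_sync_bytes(track)) * SYNC_BITS_PER_BYTE


# Tiny synthetic fragment: seven FFs, an address prologue, some filler,
# three FFs, then a data prologue.
track = (b"\xff" * 7 + ADDRESS_PROLOGUE + b"\x96" * 4
         + b"\xff" * 3 + DATA_PROLOGUE)
print(sorted(find_sync_bytes(track)), total_bits(track))
```

Only five of the seven FFs before the address prologue are marked; the other two are emitted as ordinary 8-bit bytes, which is what keeps the per-track total down.
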
But then you're at NIB support only for disks with the conventional 6-and-2 encoding, including an assumption of the ordinary sector and header patterns. So what did you gain over DSK? For all that, nothing that wouldn't more intelligently have been stored by just including headers with sectors in a DSK-esque format. Like, say, the Amstrad CPC .DSK.
But classic emulators, which aren't actually all that interested in how the hardware works, just spool a new byte from the file every time the CPU reads the data register. Magic alignment! But the only way I can think of to reproduce that is a hardware hack: use the 59,904 bits that give every single byte a single sync bit, slow down disk rotation (to approximately 280RPM, I think, but don't quote that; it's based on the Disk II state machine offering plus or minus an eighth on flux window lengths, which I feel I learnt but don't have a source to hand), and just write protect the whole thing because obviously writing real data on top would make the extended track length very odd.
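
One way to sanity-check that rotation speed, as a sketch only. It assumes a nominal 4 µs bit cell (not stated in the post) and takes the unsourced plus-or-minus-an-eighth tolerance above at face value.

```python
def required_rpm(track_bits: int, cell_us: float) -> float:
    # Time for one rotation if every bit occupies cell_us microseconds.
    rotation_seconds = track_bits * cell_us * 1e-6
    return 60 / rotation_seconds


NOMINAL_CELL_US = 4.0                              # assumption
SHORTEST_CELL_US = NOMINAL_CELL_US * (1 - 1 / 8)   # 3.5, per the quoted tolerance

rpm_nominal = required_rpm(59_904, NOMINAL_CELL_US)
rpm_shortest = required_rpm(59_904, SHORTEST_CELL_US)
print(f"{rpm_nominal:.0f} RPM at nominal cells, "
      f"{rpm_shortest:.0f} RPM at the shortest tolerated cells")
```

Under those assumptions the fastest workable speed comes out in the mid-280s, which is at least in the same region as the figure above.
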
I saw no issues opening NIBs as was; the contents aren't even loaded into memory, they're read from disk on demand, and I ran with both the Clang address sanitiser and the undefined behaviour sanitiser attached. But I'll keep at it.