Wednesday, November 11, 2015

IBM PC Floppy Disks - A Deeper Look at Disk Formats and Copy Protection

I.  Overview of Low-Level Floppy Disk Structure

I am going to use a standard 360KB floppy disk as an example here.  A standard MS-DOS FORMAT command formats a 360KB floppy disk with 40 cylinders, 9 sectors per track and 512 bytes per sector.  I distinguish cylinders, which use both sides of the disk, from tracks, which only use one side of the disk.  With two tracks per cylinder, you get a total disk size of 40 x 2 x 9 x 512 = 368,640 bytes.  However, that is the space available for data bytes, and a standard disk dump just dumps the data sectors.  In between the data there is extra information that allows the disk drive to find the right data.

Each sector has a Sector ID field and a Sector Data field.  Two kinds of bytes precede each of these fields.  First come the Gap bytes.  These Gap bytes, typically hex 4E, compensate for slight differences between the drive that originally wrote the disk and the drive that is currently writing to it.  Without some tolerance, the Sector fields may overwrite each other.  After the Gap bytes come the Sync bytes, which tell the drive that a Sector ID field or a Sector Data field is coming up next.  Sync bytes follow a pattern of 00s followed by three A1s.  The A1s, also called the Sync mark, are missing one of their normal clock bits and do not follow proper MFM encoding rules, so the drive controller can figure out that this is not real data and that the real data is coming up.

The Sector ID begins with an FE byte as an Address mark.  Then it continues with the track number (00-27), the side number (00-01), the sector number (01-09) and finally the size ID byte (00-06).  Unlike everything else, sector numbers use a convention starting with 01, not 00.  The size ID byte tells the system how large the following Sector Data field is and is interpreted as follows :

Size ID Byte   Actual Size in Bytes
00             128
01             256
02             512
03             1024
04             2048
05             4096
06             8192
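
The table follows a simple rule: each size ID doubles the previous size, starting from 128 bytes.  A minimal Python sketch of that rule follows; the function name sector_size is my own and not taken from any disk utility.

```python
def sector_size(size_id: int) -> int:
    """Return the Sector Data field length in bytes for a size ID byte."""
    if not 0 <= size_id <= 6:
        raise ValueError(f"size ID {size_id:#04x} is outside the 00-06 range in the table")
    # 128 bytes doubled once per step of the size ID
    return 128 << size_id


for size_id in range(7):
    print(f"{size_id:02X}  {sector_size(size_id)}")
```

Running it reprints the table above, one row per size ID.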

The only sector size the standard DOS FORMAT command will use is a 512 byte sector size.  This applies to all standard 5.25" and 3.5" disks and drives, whether double density (160KB, 180KB, 320KB, 360KB & 720KB), high density (1.2MB and 1.44MB) or extra high density (2.88MB).  You can use third party utilities to format disks with other sector sizes, but DOS will not read them.  Note that a track on a 360KB disk cannot really fit 8192 data bytes.  I have never come across a disk that has had more than 10 x 512 byte sectors.

The next two bytes form a 16-bit Cyclic Redundancy Check (CRC) value.  The CRC value verifies that the information contained in the sector ID field is correct.  Sometimes there may be some junk bytes between the CRC value and the next sequence of Gap bytes.  

As stated above, after more Gap bytes and Sync bytes comes the Sector Data field.  The sector data field begins with an FB byte as an Address mark.  Then comes the actual data, the bytes you see with a standard disk dump.  Finally comes the 16-bit CRC value.  
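
To make the CRC less abstract, here is a hedged Python sketch of how such a value can be computed.  It assumes the usual arrangement for this disk format, none of which is stated above: a 16-bit CRC with polynomial 1021 hex, preset to FFFF, calculated over the three A1 Sync mark bytes, the Address mark and the field itself.  The function name is my own.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16, polynomial 0x1021, most significant bit first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


# Sector ID field from this article's sample dump:
# Sync mark, Address mark FE, track 04, side 01, sector 06, size ID 02.
id_field = bytes([0xA1, 0xA1, 0xA1, 0xFE, 0x04, 0x01, 0x06, 0x02])
crc = crc16_ccitt(id_field)
print(f"CRC bytes: {crc >> 8:02X} {crc & 0xFF:02X}")

# Running the same calculation over the field plus its own CRC bytes
# leaves zero, which is how a reader can tell the field is intact.
print(crc16_ccitt(id_field + crc.to_bytes(2, "big")) == 0)
```

The second print shows the checking side: a field whose stored CRC bytes do not bring the running value back to zero is reported as a CRC error.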

A Sector ID field, without the Address Mark and CRC Value, is 4 bytes large.  A Sector Data field without the Address Mark and CRC Value, is typically 512 bytes large, otherwise the size is whatever the Size ID Byte indicates.  Gap Bytes and Sync Bytes do not have a set size.

For standard DOS compatible 160KB, 180KB, 320KB and 720KB disks, the only things that change are the ranges of the track number, side number and sector number.  160KB and 180KB disks only use 1 side of the disk, 160KB and 320KB disks only use 8 sectors per track, and 720KB and larger disks use 80 tracks per side.  High density disks use 15 (1.2MB) or 18 (1.44MB) sectors per track, and extra high density disks use 36 (2.88MB).
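
The capacities of all the standard formats follow from the same multiplication of tracks, sides, sectors and 512 bytes.  The Python sketch below works them out; the table layout and names are my own.

```python
SECTOR_BYTES = 512

# (label, tracks per side, sides, sectors per track), as described above
FORMATS = [
    ("160KB", 40, 1, 8),
    ("180KB", 40, 1, 9),
    ("320KB", 40, 2, 8),
    ("360KB", 40, 2, 9),
    ("720KB", 80, 2, 9),
    ("1.2MB", 80, 2, 15),
    ("1.44MB", 80, 2, 18),
    ("2.88MB", 80, 2, 36),
]


def capacity(tracks: int, sides: int, sectors: int) -> int:
    """Total data bytes for a disk made only of 512-byte sectors."""
    return tracks * sides * sectors * SECTOR_BYTES


for label, tracks, sides, sectors in FORMATS:
    print(f"{label:>6}: {capacity(tracks, sides, sectors):>9,} bytes")
```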

In a hex viewer, a full sector dump might look like this :

4E 4E 4E 4E 4E 4E 4E 4E 4E 4E 4E 4E 4E 4E 4E 4E
00 00 00 00 00 00 00 00 00 00 00 00 00 A1 A1 A1
FE 04 01 06 02 xx xx 4E 4E 4E 4E 4E 4E 4E 4E 4E
4E 4E 4E 4E 4E 4E 4E 00 00 00 00 00 00 00 00 00
00 00 00 00 A1 A1 A1 FB 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 55 53 45 52 44 41 54 41
55 53 45 52 44 41 54 41 xx xx

4E = Gap Bytes
00 and A1 = Sync Bytes
FE and FB = Address Mark Bytes
xx = CRC Bytes
Everything else = ID and Data fields

The hex values in the data field repeat the ASCII for USERDATA exactly 64 times, which fills the 512 byte sector (8 x 64 = 512).  
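
As an illustration of how the fields in such a dump are located, the Python sketch below rebuilds the sample bytes and picks out the Sector ID and Sector Data fields by looking for the Sync mark followed by an Address mark.  It is only a sketch working on already decoded bytes: the helper names are my own, the CRC bytes (xx xx) are replaced by 00 placeholders, and real hardware recognizes the Sync mark by its missing clock bit, not by the byte value alone.

```python
GAP, SYNC_ZERO, SYNC_MARK = 0x4E, 0x00, 0xA1


def build_sample() -> bytes:
    """Rebuild the sample dump above, with 00 00 standing in for each CRC."""
    lead_in = bytes([GAP] * 16 + [SYNC_ZERO] * 13 + [SYNC_MARK] * 3)
    id_field = bytes([0xFE, 0x04, 0x01, 0x06, 0x02, 0x00, 0x00])
    data_field = bytes([0xFB]) + b"USERDATA" * 64 + bytes([0x00, 0x00])
    return lead_in + id_field + lead_in + data_field


def parse_sector(raw: bytes) -> dict:
    """Find the ID and Data fields by their Sync mark and Address mark."""
    id_at = raw.index(bytes([0xA1, 0xA1, 0xA1, 0xFE])) + 4
    track, side, sector, size_id = raw[id_at:id_at + 4]
    length = 128 << size_id
    data_at = raw.index(bytes([0xA1, 0xA1, 0xA1, 0xFB]), id_at) + 4
    return {
        "track": track, "side": side, "sector": sector,
        "length": length, "data": raw[data_at:data_at + length],
    }


info = parse_sector(build_sample())
print(info["track"], info["side"], info["sector"], info["length"])
print(info["data"][:8].decode("ascii"))
```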

II.  Floppy Disk Encoding

A floppy disk is a magnetic storage mechanism.  Data is encoded through a flux reversal, the change in the polarity of a magnetic field from north to south and south to north.  A flux reversal could signify a 0 or a 1 bit. We start with the proposition that a 0 is represented by no flux reversal and a 1 is represented by a flux reversal.  However, this scheme is not workable because it becomes too difficult to keep track of the bits when there is a prolonged period with no flux reversals.  

Thus the Frequency Modulation (FM) method was first used.  In this method, a 0 bit is encoded by a flux reversal followed by no flux reversal, and a 1 bit is encoded by two flux reversals.  Thus there will be at least one flux reversal for every two pulses of the clock.  This method is used in the earliest floppy disk controllers, but for home computers only the Atari 8-bit drives used it.  These disk drives were called single density.  The IBM PC floppy disk controllers that use an NEC uPD765 or a 100% compatible chip like the Intel 8272 support FM encoding.  Later floppy disk controllers lose that functionality.
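
The FM rule can be written out in a few lines of Python.  In this sketch, which uses my own naming, a '1' in the output string stands for a flux reversal and a '0' for no flux reversal, and bits are taken most significant first.

```python
def fm_encode(data: bytes) -> str:
    """Encode bytes as FM cells: '1' = flux reversal, '0' = no flux reversal."""
    cells = []
    for byte in data:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            cells.append("1")                  # clock cell: always a reversal
            cells.append("1" if bit else "0")  # data cell: reversal only for a 1 bit
    return "".join(cells)


print(fm_encode(b"\x4E"))  # the usual Gap byte
```

Every data bit costs at least one flux reversal, which is where the inefficiency of this scheme comes from.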

This system was easy to implement, but not very efficient.  The next improvement was the Modified Frequency Modulation (MFM) method.  In this method, a 0 bit preceded by a 0 bit would be encoded by a flux reversal then no flux reversal.  A 0 bit preceded by a 1 bit would be two clocks of no flux reversal.  A 1 bit would be no flux reversal followed by a flux reversal.  On average this requires only about half as many flux reversals per bit as the FM scheme.  The fewer flux reversals needed, the more data can be stored on a disk.  The disks that supported this system were called double density.  All IBM PC compatible floppy disk controllers use the MFM encoding scheme, as did the earliest hard disk drives.  
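
The three MFM rules above translate directly into code.  The Python sketch below uses my own naming, with '1' standing for a flux reversal and '0' for no flux reversal; the previous_bit parameter is an assumption about whatever bit came before the first byte.

```python
def mfm_encode(data: bytes, previous_bit: int = 0) -> str:
    """Encode bytes as MFM cells: '1' = flux reversal, '0' = no flux reversal."""
    cells = []
    for byte in data:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            # A clock reversal is written only between two 0 data bits.
            clock = 1 if (bit == 0 and previous_bit == 0) else 0
            cells.append(str(clock))
            cells.append(str(bit))
            previous_bit = bit
    return "".join(cells)


encoded = mfm_encode(b"\x4E")  # the usual Gap byte
print(encoded, "reversals:", encoded.count("1"))
```

For the Gap byte 4E this gives 6 flux reversals, against 12 if the same byte were written in FM.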

The data line of a floppy disk does not operate in a vacuum.  There is a relationship between the MFM clock bits and the MFM data bits.  Both are present on the Write Data and Read Data lines.  Each data bit occupies two cells on the disk, a clock bit followed by the data bit itself.  Normally the two are in sync, meaning that the clock bits follow exactly from the data bits.  But there is one time when they intentionally do not, the Sync bytes.  How this happens is as follows :

In MFM encoding, there will never be more than three 0 bits between two 1 bits on the disk.  However, this limit is more liberal than the rules MFM actually uses to place clock bits, so a pattern can stay within the limit and still be one that normal encoding never produces.  The A1 in the Sync bytes is such a pattern.  A1 (binary 10100001) should be encoded with clock bits 0 0 0 0 1 1 1 0, which appears on the disk as 0100010010101001 (hex 44A9), and it is encoded that way in a Sector Data field (there is no reason for it to appear in a Sector ID field).  However, to put the A1 out of sync, the clock bits 0 0 0 0 1 0 1 0 are used instead.  The clock bit between the fifth and sixth data bits is missing, so the byte appears on the disk as 0100010010001001 (hex 4489).  This deliberate error signals to the disk controller that this is a Sync byte.  
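
The difference between a normally encoded A1 and the Sync mark A1 can be checked with a short Python sketch.  The helper names are my own, and the patterns are written as full 16-cell words, clock bit first, including the leading 0 cell.

```python
def mfm_cells(byte: int, clock_bits: int) -> int:
    """Interleave 8 clock bits and 8 data bits (clock first) into a 16-cell word."""
    word = 0
    for shift in range(7, -1, -1):
        word = (word << 1) | ((clock_bits >> shift) & 1)
        word = (word << 1) | ((byte >> shift) & 1)
    return word


def normal_clock_bits(byte: int, previous_bit: int = 0) -> int:
    """Clock bits under the regular MFM rule: a clock only between two 0 data bits."""
    clocks = 0
    for shift in range(7, -1, -1):
        bit = (byte >> shift) & 1
        clocks = (clocks << 1) | (1 if bit == 0 and previous_bit == 0 else 0)
        previous_bit = bit
    return clocks


normal = mfm_cells(0xA1, normal_clock_bits(0xA1))  # regular clock bits, 0E hex
sync = mfm_cells(0xA1, 0x0A)                       # Sync mark: one clock bit dropped
print(f"{normal:016b} {normal:04X}")
print(f"{sync:016b} {sync:04X}")
print(f"{normal ^ sync:016b}")  # the single cell that differs
```

The last line shows that exactly one cell differs between the two patterns, and that cell is a clock bit, so the data bits still read as A1.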

III.  Copy Protection

Copy protection methods on the IBM PC platform generally work within this scheme.  Other systems like the Apple II, Commodore 64, Atari ST and Commodore Amiga have controller hardware that allows far more sophisticated protection methods to be used.  The PC disk controller will not tolerate the more bizarre methods like half tracks, fat tracks, spiral stream, offset tracks or varying bitrate speeds.  

The simplest methods are to use non-standard sector sizes, sector numbers or tracks.  This is good enough to defeat DISKCOPY and FORMAT, but any utility that uses the PC BIOS can format a disk with non-standard values.  Each track, and each sector within a track, can have a different sector size.  

Another simple method is to use the write protection tab/notch on a disk.  If a program expects the disk to be write protected, it may try to write a destructive sequence of bytes to the disk.  Of course, the original disk should have no tab or notch.  Be wary if you see a disk notched roughly via scissors or with a hole punch.

A common method is to intentionally write a disk with erroneous CRC bytes.  A standard IBM PC Diskette Adapter can read, but not write erroneous CRCs.  Of course, sectors with data that cannot be reliably read will also return CRC errors.  It is then up to the person de-protecting the diskette to figure out whether the error is intentional or unintentional.  

Another method is to check that the exact number of Gap and Sync bytes is on the disk.  If the original disk was written with 16 of each and the copy only has 15, that is a way to detect the copy.  The program could instead read data from the Gap bytes, like an encryption key.  The PC disk controller's read track command would read the Gap bytes.

One method used the Sync bytes to let sectors overlap.  The disk can use size IDs that make no sense physically, like giving sector 1 a size ID that indicates 8192 bytes and then fitting the other sectors "within" sector 1.  This protection can often be fooled by writing the image to a high density disk, which allows for 15 or 18 sectors per track.  Of course, high density disk controllers were far from ubiquitous in the 1980s when most of these games were released.
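
A quick way to see that such a track cannot be laid out honestly is to add up what the size IDs claim.  The Python sketch below does that; the 6250 byte figure is my assumption for the nominal raw length of a double density track (250 kbit/s at 300 RPM), gaps and headers included, and the function names are my own.

```python
RAW_TRACK_BYTES = 6250  # assumption: nominal double density track, 250 kbit/s at 300 RPM


def claimed_bytes(size_ids: list) -> int:
    """Total data bytes the size ID bytes on one track claim to hold."""
    return sum(128 << size_id for size_id in size_ids)


def must_overlap(size_ids: list) -> bool:
    """True if the claimed data alone exceeds the raw track, so sectors must overlap."""
    return claimed_bytes(size_ids) > RAW_TRACK_BYTES


standard = [2] * 9          # nine 512 byte sectors
protected = [6] + [2] * 8   # sector 1 claims 8192 bytes, the rest claim 512

for name, track in (("standard", standard), ("protected", protected)):
    print(name, claimed_bytes(track), must_overlap(track))
```

This only catches the obvious case.  A track can also have overlapping sectors while claiming less than the raw track length, which this check would not notice.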

A devious method used weak bits.  In this case, the game would read an area on the disk where the flux transitions have not been written with sufficient magnetic force to be read reliably.  The game would not expect the same result to be returned from multiple reads of a sector.  If the same results were always read back, the game would fail to work, thinking it was being run on a copy.  Unfortunately, if the disk was damaged unintentionally or the flux transitions have lost their force with age, a flux copier like an Option Board, a Deluxe Option Board, a KryoFlux or a SuperCard Pro may misinterpret this and make a bad copy.
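
The logic of such a check can be sketched in Python.  Everything here is hypothetical: read_sector stands for whatever routine the game used to read the protected sector, the two reader factories only simulate an original and a copy, and the position of the weak area (bytes 16-31) is made up for the example.

```python
import random


def looks_like_original(read_sector, reads: int = 8) -> bool:
    """Return True if repeated reads of the same sector do not all match."""
    results = {bytes(read_sector()) for _ in range(reads)}
    return len(results) > 1


def make_weak_reader(seed: int = 1):
    """Simulated original disk: bytes 16-31 read back unpredictably."""
    rng = random.Random(seed)

    def read_sector() -> bytes:
        stable = b"USERDATA" * 64
        weak = bytes(rng.getrandbits(8) for _ in range(16))
        return stable[:16] + weak + stable[32:]

    return read_sector


def make_copy_reader():
    """Simulated copy: every read returns identical bytes."""
    def read_sector() -> bytes:
        return b"USERDATA" * 64

    return read_sector


print(looks_like_original(make_weak_reader()))
print(looks_like_original(make_copy_reader()))
```

A real check would also have to tolerate the occasional run where an original happens to read back the same way several times, which is one reason these protections could misfire.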

IV.  Known Protection Schemes used in Games :

Cops Copylock II

The Rising Dynasty (HD media used)

Electronic Arts IBM Interlock

Used by Electronic Arts booters, involves 96 interleaved sector IDs, typically at track 15, side 0.  They also use 200KB and 400KB disk sizes with 10 sectors per track.

Amnesia
Archon: The Light and the Dark
Hard Hat Mack
Marble Madness
Murder on the Zinderneuf
One-on-One
Pinball Construction Set
Seven Cities of Gold, The
Super Boulder Dash
Timothy Leary's Mind Mirror
Will Harvey's Music Construction Set

Formaster Copylock

Contains a sector with errors, usually at track 20, side 0, sector 5.  The sector ID is marked as having 256 bytes, but the sector actually has 512 bytes, fooling copiers like TeleDisk.  Used mostly by Sierra.  This protection will probably fail at speeds greater than an AT (8MHz 80286).

B.C.'s Quest for Tires (regular version)
Blockbuster (Second release)
Computer Baseball
Crossfire (IBM PCjr floppy disk version)
Gato (Versions 1.0-1.3)
Hardball (Disk 2/EGA Disk, early version only)
King's Quest (IBM PC-CGA version)
Mr. Cool
Paperboy (Second release)
Pinball Wizard
Oil's Well
Quink
Sierra Championship Boxing (1984 version)
Top Gun
Troll's Tale
Ultima II (1983, 1984 versions)
Ultima III (1984, 1985 versions)
Wizard and the Princess, The (IBM PCjr version)

H.L.S. Duplication

Accolade and Epyx used this type of protection.  There is an additional sector ID field in one of the sectors.

Ajax
Bad Dudes
Boot Camp
California Games
Gauntlet (later version)
Grand Prix Circuit (early version only)
Hardball (Disk 1 only)
Mini-Putt
Rush 'n' Attack
Test Drive
Test Drive II

Microprose Protection 1
Protection is found on track 4, side 0.  Uses overlapping sectors.

Gunship (certain versions)
Sid Meier's Pirates

Microprose Protection 2
Protection is found on tracks 38-39, side 0.  Uses overlapping sectors.

Airborne Ranger
Dr. Doom's Revenge
F-15 Strike Eagle II
F-19 Stealth Fighter
M1 Tank Platoon
Sid Meier's Pirates (432.02 and above)
Red Storm Rising
Rick Dangerous
Savage
Stunt Track Racer
Sword of the Samurai
X-Men

Mindscape DEM
This protection will probably fail at 386 speeds (16MHz and above)

Infiltrator 2
Italy '90
Paperboy (early version)

Minder03
Used by Prism Software

Pipe Dream (European version; the U.S. version by Lucasfilm Games used a code wheel)

Non-standard Sector Sizes
Probably the earliest method of copy protection.  Self-explanatory, can easily be copied with TeleDisk.

Adventure in Serenia
Agent USA
Arkanoid (Imagine version) & Super Tennis
Below the Root
Gremlins
Jungle Hunt
King's Quest (IBM PCjr. version)
Microsoft Flight Simulator v1.00
Sid Meier's Pirates! v432.03
Pitstop II
Trains (DOS version)
World Games

On-Line Systems Protection #1

Frogger
Ulysses and the Golden Fleece

Origin Systems OSI-1

Used by all Origin Systems games with an EXE executable.

Ultima I: The First Age of Darkness
Ultima II: Revenge of the Enchantress (Ultima Trilogy version only)
Ultima IV: Quest of the Avatar
Ultima V: Warriors of Destiny

Rob Northern Copylock

Uses weak bits and calculates the entropy expected.  Encrypts the actual executable and stores a 32-bit decryption number on a bad sector.

Frank Bruno's Boxing (Compilation version)
Lemmings
Oh No! More Lemmings (comes in disk-based and document-based versions)
Paperboy (later version)
Xenon II

Sector ID Duplication
Sierra used this for the versions of their games distributed under Tandy's label prior to 1985.  In this scheme, there are multiple sectors with identical Sector IDs, but different data.  Subsequent read sector commands will read the later sectors.

BC's Quest for Tires (Tandy version)
King's Quest (Tandy version)

Softguard 2.x/3.x with original loader

Found mostly in Taito games.  Works like Sierra's version, but with a different loader that is harder to decrypt.  The file CMLxxxx.FCL will be present.

Arkanoid
Arkanoid II: Revenge of DOH
Bubble Bobble
Operation Wolf
Puzznic
QIX
Rambo III
Rastan
Renegade
Sky Shark
Ultima II (1985 version)

Softguard 2.0.3 with Sierra's Loader

This protection was used in Sierra's games.  It relied on a track with non-standard sector sizes, overlapping sectors and CRC errors in the sector data field.  There is a hidden file called CPC.COM on the first disk that checks for the protection.  All these games could be installed to a hard drive, but required the first disk, the key disk, to be inserted in a floppy drive.  Police Quest: In Pursuit of the Death Angel was the only AGI version 2 game that never seemed to be protected.  This protection was applied to both booter and DOS games.

3-D Helicopter Simulator
Black Cauldron, The (AGI version 2)
Donald Duck's Playground
King's Quest I: Quest for the Crown (AGI version 2)
King's Quest II:  Romancing the Throne (AGI version 2)
King's Quest III: To Heir is Human
Leisure Suit Larry in the Land of the Lounge Lizards (AGI version)
Mixed-Up Mother Goose (AGI version)
Sierra Championship Boxing
Space Quest: The Sarien Encounter (AGI version)
Space Quest II: Vohaul's Revenge
Thexder

Weak Bits Generic Implementation
Typically done by using more than three 0 bits following a 1 bit in violation of MFM encoding rules.

After Burner
Block Buster
Blood Money
Bop'N Wrestle
Boulder Dash II
Gauntlet (Early version)
Ghosts 'n Goblins
Harrier Combat
Horror Zombies From The Crypt
Lemmings 2: The Tribes (used HD media)
Outrun
Rick Dangerous II
Shadow Gate
Shinobi
Space Harrier

Waydisk Minder
This protection will probably fail at greater than IBM AT (8MHz 286) speeds

4 Soccer Simulators
AM, FM Trivia Vol. 2
Arac
Boulder Dash 2
Football Manager
Frank Bruno's Boxing
World Championship Soccer

XELOK/XEMAG

Used mainly by Broderbund.  Has 16 sectors instead of 8 on track 9, side 0.

Oo-Topos
Ultima III (Ultima Trilogy version)
Sargon 3

V.  Hardware Based Backup Methods

There are several programs that rely purely on the floppy disk controller to try to make a working backup, including TeleDisk, ImageDisk, CopyIIPC & Snatchit.  None of them is a perfect solution.  TeleDisk did not take every PC protection method into account, but it's great for writing images with non-standard sector sizes.  ImageDisk is a newer program and doesn't have that many disk images available for it, and it too is bound by the limitations of the PC disk controller.  CopyIIPC and Snatchit only work on 100% compatible IBM PCs (no Tandy 1000s or PCjrs), do not work at high 386 speeds or faster and will potentially modify the data on the disk image to crack the program.

The closest thing that was available back in the 1980s that could copy anything was Central Point Software's Deluxe Option Board.  This was a hardware ISA card that intercepted the signals between the disk drive and disk controller.  Because it operated on the flux transition level, it could theoretically copy anything from a double density disk.   It could write bad CRCs, overlapping sectors, interleaved sectors and data in the gap bytes.  This board used a program called TransCopy.  TransCopy could save an image of a disk for future rewriting.  360KB disks would have an image that would be 1,088KB in size.

TransCopy was not without its flaws.  The non-Deluxe Option Board could only work with TransCopy versions 4.x or earlier.  TransCopy versions below 5.x would not copy 80 track disks, including 720KB disks.  TransCopy images made with version 5.x would not work with earlier TransCopy versions.  TransCopy would also be crippled so that it would not work with certain protections, depending on which company was threatening CPS with litigation at the time.  You would need to hope that an earlier version of TransCopy would work.  Cops Copylock II cannot be written back with an Option Board or Deluxe Option Board.

The modern equivalents of the Option Board are the KryoFlux and the SuperCard Pro.  In the PC context, KryoFlux will create raw stream files of each track.  There are utilities that can convert these raw stream files to TransCopy or TeleDisk formats.  KryoFlux can now write back raw stream files, so you no longer need to send your disk to the SPS and get an IPF file.  However, writing back raw stream files is not fully reliable because it does not account for fluctuations in drive rotation speed and imperfections in the signal, possibly leading to signal loss similar to copying analog tapes.  There are unsupported utilities to make your own IPF files, but it appears that unless you know the stream format well, you may not be able to write back a working disk.  SuperCard Pro does the same thing but contains everything in one file.  The two formats are not interchangeable and require conversion.

4 comments:

anormal said...

Very good article, maybe you want to mention SuperCard Pro
it's a very good alternative for modern raw dumps, some people think it is better than KryoFlux,
url: http://www.cbmstuff.com/proddetail.php?prod=SCP

Great Hierophant said...

I forgot to mention the SuperCard Pro. I am not sure how friendly it is to IBM PC disks, but KryoFlux is not exactly known for being super friendly to PC formats either.

Anonymous said...

Excellent article it was 25 years ago when I was young :-(
I used a program called CopyWrite from Quaid Software. I think it worked better than CopyIIPC
I still have my old Amstrad 1512 and 1640 with deluxe option board also
nostalgie...

Anonymous said...

"A devious method used weak bits." This is incorrect.

The correct term is "Marching Bits", where the recording is actually made using a much shorter time window than the one the read device uses. When mass producing floppy discs, the bits needed to be exactly centered so that real-world playback was possible on drives with various amounts of speed variation and flutter. "Window Margin" is a measure of how far the time span can be narrowed and still have the disc read correctly. A 25% window was the standard passing quality during production when discs were verified - yes, every bit recorded was verified or the disc rejected, even "4E" bytes.

Protection was achieved by placing "1" bits progressively closer to the beginning or end of the recording time window. The bits were NOT recorded weakly. Very slight variation in drive speed (flutter) would result in inconsistent reading of the data. Thus when "00100" is recorded near the beginning of the window, some read backs result in "01000". The verification software confirmed the "Marching Bits" as required and not as a read error. When a copy-protected disc was copied, the bit is no longer off-center and therefore reads without variation each time. Quality standards required 3 reads to produce 2 variations. The "Marching Bit" method records a series of data bits with a single bit progressively off center, thereby assuring a stable misread. Embedded software read this hidden sector to confirm variation before executing the program. The disc must remain in the drive for full features.

Unfortunately, Marching Bits resulted in distribution copies that had true COPY PROTECTION. Game software often had "marching bits" without the ability to make copies. Furthermore, bootleg discs had variations in the game that annoyed the users. Such variations included tasks that could never be completed. Business software consumers wanted the ability to make an archive copy. When asked how to make a backup copy, the answer was "you cannot copy because the copy would be unprotected."