UEF [Unified Emulator Format] FILE FORMAT

Originally by Thomas Harte. Additions to tape functionality by Fraser Ross.
DRAFT file format version 0.10 / document draft 25 (revised 6/7/2005)

Introduction

The UEF file format is designed to store images of all the common media types that are associated with the BBC Micro, Acorn Electron and Atom, so that no information that would be presented to those machines is discarded.

UEF files are chunk based, and the standard definition allows for file compression.

If you spot any errors or omissions in this spec, or have any comments, then please e-mail me.

Header and Chunk Template

The file format is chunk based, with the basic chunk template:

2 bytes: chunk id
4 bytes: chunk length, not counting the 6 bytes that make up this header
<chunk data>

All UEF and Tape-UEF files have the same 12 byte file header:

10 bytes: Null terminated string "UEF File!"
1 byte: minor version number
1 byte: major version number

If a UEF file has the same major version number and a lesser or equal minor version number as your software, then your software should be able to function with it. This does not always mean that the previous definitions of chunks have remained unchanged, but merely that the previous definitions are not incompatible with the newer ones.
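
The header and chunk template above can be walked with very little code. The following is a minimal sketch rather than a reference implementation: it assumes the whole (already decompressed) file is held in memory, and the names uef_walk, uef_read_le and uef_chunk_fn are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type: receives each chunk's id, data and length. */
typedef void (*uef_chunk_fn)(uint16_t id, const uint8_t *data,
                             uint32_t length, void *user);

/* Read a little endian value made of 'count' bytes. */
static uint32_t uef_read_le(const uint8_t *p, int count)
{
	uint32_t value = 0;
	for (int i = count - 1; i >= 0; i--)
		value = (value << 8) | p[i];
	return value;
}

/* Walks an uncompressed UEF image held in memory. Returns the number of
   chunks visited, or -1 if the header is wrong or a chunk runs past the
   end of the buffer. */
int uef_walk(const uint8_t *buf, size_t size, int *minor, int *major,
             uef_chunk_fn fn, void *user)
{
	/* the 10 byte magic includes the NUL that terminates "UEF File!" */
	if (size < 12 || memcmp(buf, "UEF File!", 10) != 0)
		return -1;
	*minor = buf[10];
	*major = buf[11];

	size_t pos = 12;
	int count = 0;
	while (pos < size)
	{
		if (size - pos < 6)
			return -1;               /* truncated chunk header */
		uint16_t id = (uint16_t)uef_read_le(buf + pos, 2);
		uint32_t length = uef_read_le(buf + pos + 2, 4);
		pos += 6;
		if (length > size - pos)
			return -1;               /* chunk data runs off the end */
		if (fn)
			fn(id, buf + pos, length, user);
		pos += length;
		count++;
	}
	return count;
}
```

A caller would compare the returned major and minor numbers against the version it supports before acting on any chunk.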

Covered Types of Media

Support is included for the following varieties of media:

  • Tape images
  • Disc images
  • ROM images
  • State snapshots

Variety of image is determined purely by the presence or absence of the relevant chunks. There are also numerous control chunks to give emulators hints about what they should be doing.

In most cases the sequence of chunks is unimportant, but this is not so for any UEF containing tape chunks. In those files, everything is ordered. UEF processing should begin at the start of the file, then proceed according to the tape hardware to the end.

Any non-tape chunks encountered whilst dealing with a tape UEF should only be processed when they are crossed by tape processing mechanisms. For example, suppose you have a UEF with a combination of tape and state snapshot chunks - four tape data chunks are followed by a memory state chunk, which is followed by another two tape data chunks. In this scenario, the state snapshot chunks should not be applied until after the data that is included in the first four tape chunks has been processed.

Conventions

UEF files may, and commonly will, be gzip compressed, but the suffix will remain unchanged. The gzip compressed version has the normal gzip 'magic number' as its first two bytes, so the well known ZLib can be used to work with either type seamlessly.
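
For tools that do not go through ZLib, the compressed case can be recognised by hand. This is a small sketch that only tests for the gzip magic number (the bytes 0x1f, 0x8b); the function name uef_is_gzipped is invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the buffer starts with the gzip magic number, 0 otherwise. */
int uef_is_gzipped(const uint8_t *buf, size_t size)
{
	return size >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}
```

An uncompressed UEF starts with 'U' (0x55), so the two cases cannot be confused. ZLib's gzopen/gzread interface makes the same distinction internally and passes uncompressed files through unchanged.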

All data is little endian (as on the original machines) unless stated otherwise.

Floating point numbers are stored in IEEE 754 single precision (four byte) format, with Intel 8087 byte ordering. This means they should be treated as having 7 figure accuracy. [Sub-optimal] example C code to read such a float and convert it into a floating point number in a platform neutral manner:

	/* requires <math.h> for ldexp */
	/* assume a four byte array named Float exists, where Float[0]
	was the first byte read from the UEF, Float[1] the second, etc */

	/* decode mantissa */
	int Mantissa;
	Mantissa = Float[0] | (Float[1] << 8) | ((Float[2]&0x7f)|0x80) << 16;

	float Result = (float)Mantissa;
	Result = (float)ldexp(Result, -23);

	/* decode exponent */
	int Exponent;
	Exponent = ((Float[2]&0x80) >> 7) | (Float[3]&0x7f) << 1;
	Exponent -= 127;
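	Result = (float)ldexp(Result, Exponent);

	/* all-zero exponent and mantissa bits encode zero, not 2^-127 */
	if(!(Float[0] | Float[1] | Float[2] | (Float[3]&0x7f)))
		Result = 0.0f;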

	/* flip sign if necessary */
	if(Float[3]&0x80)
		Result = -Result;

	/* floating point number is now in 'Result' */

And to write:

	/* assume that the floating point number 'Value' is to be stored;
	requires <math.h> for frexp */
	unsigned char Float[4];

	/* sign bit */
	if(Value < 0)
	{
		Value = -Value;
		Float[3] = 0x80;
	}
	else
		Float[3] = 0;

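	/* split value into mantissa and exponent */
	float mantissa;
	int exponent;
	mantissa = (float)frexp(Value, &exponent);
	exponent += 126;

	/* zero has no implicit leading one, so store it as all zero bits */
	if(Value == 0)
		exponent = 0;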

	/* store mantissa */
	Uint32 IMantissa = (Uint32)(mantissa * (1 << 24));
	Float[0] = IMantissa&0xff;
	Float[1] = (IMantissa >> 8)&0xff;
	Float[2] = (IMantissa >> 16)&0x7f;

	/* store exponent */
	Float[3] |= exponent >> 1;
	Float[2] |= (exponent&1) << 7;

	/* now output Float[0], then Float[1], etc */

All strings are ASCII and NULL terminated, and the linefeed character, ASCII code 10 (decimal), is used to delimit new lines. The only other ASCII control character (i.e. one with a decimal value less than 32) recognised is 'tab', value 9. Others should be completely ignored.

Notes on pseudo-code and understanding the tape data chunks

This document now contains pseudo-code for every tape chunk that outputs anything other than a silent wave. For the purposes of reading that pseudo-code, you need to know the following things:

A 'pulse' is a single 'loop' of wave output - a signal starting at zero, then arching in one direction away from zero before returning there. Think of a high pulse as a sine wave in the region 0 to 180 degrees, a low pulse as a sine wave in the region 180 to 360 degrees.

A wave is two opposing loops joined together - a low pulse followed by a high pulse. This is also known as a cycle.

A bit is encoded depending on its value. All bits last for 1/(baud rate) seconds. A zero bit fits one wave into that period. A one bit fits two equal length waves into that period.

When reading pseudo-code always keep in your mind the idea that UEF files are assumed to start with a 1200 baud rate, but that this may change to any value, and that phase (see below) is initially 180.

As a general rule, emulator authors can forego processing of chunks that define the wave form at pulse or wave level (i.e. security waves, anything to do with phase) and rationalise &0104 to a whole number of stop bits, while retaining 99.9% compatibility with real world UEFs.

Notes on phase shift

UEF supports the concept of 'phase shift'. This is a factor in how the data represented in the UEF appeared on the original cassette, but doesn't have an effect on the stored data and may be ignored by emulators or any other tools that look only to that.

Phase is measured as an integer number between 0 and 359. If pulses and waves are thought of as sections of a sine wave, phase adjusts the sections in use. The effect of phase n is that a high pulse becomes the region 180+n to 360+n degrees, and a low n to 180+n. Phase is 180 when a UEF is opened - so a low pulse is genuinely a pit below zero, and a high pulse is genuinely an arch above.

To convert a quantity in degrees to a quantity in radians apply this conversion: (x*PI)/180. PI is the well known constant that lies somewhere around 3.141592654.
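
The phase rule above can be written down directly. This is an illustrative sketch only: the function names are invented here, and it simply restates the text, namely that under phase n a high pulse covers 180+n to 360+n degrees and a low pulse covers n to 180+n degrees.

```c
#define UEF_PI 3.141592654

/* Angle in degrees within the sine wave for a point t (0 <= t <= 1)
   of the way through a pulse, given the current phase in degrees.
   is_high is non-zero for a high pulse, zero for a low pulse. */
double uef_pulse_angle_degrees(int is_high, int phase, double t)
{
	double start = (is_high ? 180.0 : 0.0) + (double)phase;
	return start + 180.0 * t;
}

/* The degrees to radians conversion given in the text: (x*PI)/180 */
double uef_degrees_to_radians(double x)
{
	return (x * UEF_PI) / 180.0;
}
```

Passing the converted angle to sin() from <math.h> then gives the sample value. With the initial phase of 180 a high pulse runs from 360 to 540 degrees, which is the positive half of the sine wave, as the text describes.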

[Figure: some waves with no shift in phase since the UEF was opened (i.e. phase = 180)]

[Figure: the same waves with a phase shift of 90]

Colour Spaces

With all greyscale images, the colour values simply represent colour intensity, but for colour images, one of the following conventions applies:

  • 8 bpp: colours are paletted
  • 16 bpp: the 16 bit word is split into four nibbles, and is intended to be decoded so as to produce an r:g:b value with each component described in 8 bits. The most significant nibble is the high nibble of the red byte. The second most significant nibble (the other nibble of the high byte) is the high nibble of the green byte. The third most significant nibble is the high nibble of the blue byte. The least significant nibble is the 'offset' part, as it should be used as the low nibble of the red, green and blue bytes. So, for example, the value &abcd describes the RGB colour (0xad, 0xbd, 0xcd).
  • 24 bpp: again an r:g:b triplet is formed, with each colour being represented by a byte value. The most significant byte is the red byte, the middle-most significant byte is the green byte, and the least significant byte is the blue byte. Hence the value &abcdef describes the RGB colour (0xab, 0xcd, 0xef).
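
The 16 bpp and 24 bpp conventions can be sketched as follows. The type uef_rgb and the function names are invented for this example; the bit manipulation just follows the descriptions above.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } uef_rgb;

/* 16 bpp: three high nibbles (r, g, b) plus one shared low nibble. */
uef_rgb uef_decode_16bpp(uint16_t value)
{
	uint8_t offset = value & 0x0f;
	uef_rgb c;
	c.r = (uint8_t)((((value >> 12) & 0x0f) << 4) | offset);
	c.g = (uint8_t)((((value >> 8) & 0x0f) << 4) | offset);
	c.b = (uint8_t)((((value >> 4) & 0x0f) << 4) | offset);
	return c;
}

/* 24 bpp: most significant byte is red, then green, then blue. */
uef_rgb uef_decode_24bpp(uint32_t value)
{
	uef_rgb c;
	c.r = (uint8_t)((value >> 16) & 0xff);
	c.g = (uint8_t)((value >> 8) & 0xff);
	c.b = (uint8_t)(value & 0xff);
	return c;
}
```

With these, uef_decode_16bpp(0xabcd) gives (0xad, 0xbd, 0xcd) and uef_decode_24bpp(0xabcdef) gives (0xab, 0xcd, 0xef), matching the two examples in the list.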

Bit Multiplexing

This file format has support for the special emulator feature I have badly named 'bit multiplexing' until a better name can be found. Bit multiplexing supplies the emulator with additional information so that old programs may be run to produce a greater quality of output.

This feature is really only for emulation use of UEF files and ignoring bit multiplexing will have no effect on the accuracy of your tool to original hardware. This feature is expected to be ignored by most authors.

A separate document on multiplexing is in preparation.

All Defined Chunks

&00xx set - Content information
&01xx set - Tape chunks
&02xx set - Disc chunks
&03xx set - ROM chunks
&04xx set - State snapshots
&FFxx set - reserved / non-emulator portable

&00xx set - Content information

Chunk &0000 - origin information chunk

Holds a few lines of text describing where the file came from, or naming the utility that created it. Text should be automatically word wrapped by any tool utilising this information; new lines should only be used for breaking paragraphs.

Chunk &0001 - game instructions / manual or URL

Text as above, intended to hold a copy of the game manual, or some notes on the game generally. Alternatively, this may contain a URL to a more meaningful resource, such as a local or internet based web site, in which case the first five characters should be "URL: " (URL: followed by a single space) followed by the URL.

Chunk &0003 - inlay scan

A scan of the inlay image.

Byte Offset  Length                         Description
0            2                              Width of image
2            2                              Height of image
4            1                              BPP of image in low 7 bits, high bit set if image is grey scale.
5            768                            Only present if the image is 8bpp paletted (i.e. 8bpp and not grey scale): a 256 colour palette, arranged in b, g, r triplets (bytewise you see the b value, then the g, then the r) each representing a linear scale from 0 (no quantity of this colour present) to 255, with the first palette entry coming first in the list.
773 or 5     Width*Height*(bits per pixel)  The image data itself, stored in English reading order and following the usual colour space conventions. The offset is 773 if 8bpp paletted, 5 otherwise.

This chunk is intended for small, low quality scans, suitable for display within an emulator but no more.

Chunk &0005 - target machine chunk

Describes a type of hardware at which this file is targeted. Multiple chunks may be present - e.g. some tapes contain a BBC and Electron version of their game, the only difference being which file is loaded last. In general though, this sets a 'minimum' required hardware level, so that emulators can alert a user when a UEF requires hardware that isn't emulated.

This chunk is exactly 1 byte long. In that byte, the most significant nibble holds one of the following values:

0 - this file is aimed at a BBC Model A
1 - this file is aimed at an Electron
2 - this file is aimed at a BBC Model B
3 - this file is aimed at a BBC Master
4 - this file is aimed at an Atom

The least significant nibble holds one of the following values:

0 - this file will work well with any keyboard layout, or a layout preference is not specified
1 - this file will work best if all keys are left in the same places relative to each other as on the emulated machine (e.g. the IBM PC key physically above '/' produces ':' as per the original hardware, even though it has a ' on it on UK keyboards)
2 - this file will work best with a keyboard mapped as per the emulating computer's (e.g. on a UK keyboard pressing shift+0 will produce ')', rather than '@' as on a BBC or Electron)
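
Splitting the byte into its two nibbles is straightforward. The enum and function names below are invented for this sketch; the values are those listed above.

```c
#include <stdint.h>

enum uef_machine {
	UEF_BBC_MODEL_A = 0,
	UEF_ELECTRON    = 1,
	UEF_BBC_MODEL_B = 2,
	UEF_BBC_MASTER  = 3,
	UEF_ATOM        = 4
};

enum uef_keyboard {
	UEF_KEYS_ANY      = 0,  /* no layout preference */
	UEF_KEYS_PHYSICAL = 1,  /* keep keys in their original positions */
	UEF_KEYS_LOGICAL  = 2   /* map by the character printed on the key */
};

/* Splits the single byte of chunk &0005 into machine and keyboard fields. */
void uef_decode_target(uint8_t byte, int *machine, int *keyboard)
{
	*machine = byte >> 4;      /* most significant nibble */
	*keyboard = byte & 0x0f;   /* least significant nibble */
}
```

For example, the byte &12 describes an Electron target with a preference for the emulating computer's own keyboard mapping.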

Chunk &0006 - bit multiplexing information

Contains one byte, for determining what, if any, bit multiplexing information is provided by this UEF.

This byte is known as the 'bit multiplier'. Take the value of this byte and multiply it by 4 to get the number of bits that are stored for every bit that the original machine had. For 32bit platforms such as wintel, this byte will normally have the value '1' - indicating that 1*4 bits are available for every single bit in the original - i.e. every 8bit value is shadowed by a 32bit value.

Older UEF definitions had a second byte defined for this chunk, but it is no longer used and need not be present.

Chunk &0007 - extra palette

This chunk holds the palette for multiplexed modes with colour depths of less than 16bit. Contains (chunk length / 3) entries, where each entry is a red byte followed by a green byte, followed by a blue byte, each specifying a level in the full byte range of 0->255. If the chunk length has a remainder when divided by three, the last 'remainder' bytes should be ignored.

The first value you read is the r,g,b value of colour 0 in the palette. The second value is colour 1, and so on. The values for entries referred to in software but not described here are undefined.

Chunk &0008 - ROM hint

This chunk can be used to specifically say whether a particular ROM or class of ROM is required or not.

If the lsb of the first byte is 0, the chunk is requesting a ROM or set of ROMs be absent. If it is 1, the chunk is instead requesting presence.

If the second byte has value 0, a specific ROM is being named. In this case a NULL terminated string follows, which should be matched against the name encoded in the ROM. For information on how to decipher the ROM name from a ROM file, see the BBC AUG. Another byte then follows the NULL terminator. If its lsb is 0, the string given is exactly equivalent to the ROM name. If the lsb is instead 1, then the string, of length n, gives only the first n characters of the name of the ROM it refers to. In this case it is sufficient for the emulator to find any ROM whose name begins with those n characters.

Otherwise, if the lsb of the second byte was 1, a ROM type follows in the third byte. It has one of the following values:

  • 0 - all ROMs [*]
  • 1 - all ROMs except BASIC [*]
  • 2 - any DFS ROM
  • 3 - any ADFS ROM
  • 4 - any filing system ROM [*]
  • 5 - any language ROM [*]

Of course if this chunk requests the presence of 'any ADFS ROM', it means just one, whereas if it wants them absent it means the absence of all of them. Requests for ensuring presence of types marked with a * above are ignored - those types are included purely for the purpose of requesting absence.

There may be multiple ROM hints, in which case the order of these chunks is important. They are followed like a list of instructions. So if the first chunk says to remove any ADFS ROMs, and then a second one says to install any ADFS ROM, then the UEF is in total requesting the presence of exactly one ADFS ROM. However an emulator need not take action if it finds a constraint is already satisfied, so it is meaningless to request more than one ADFS ROM.

Chunk &0009 - short title

A short title, in ASCII, suitable for use as the title bar to an emulator, or display in a file selector.

Chunk &000a - visible area

For UEFs which use only a portion of the output screen, this chunk allows the total visible area to be restricted to a particular rectangle. Contents are:

Byte Offset  Length  Description
0            2       Lowest visible x value - i.e. 'left' of visible rectangle
2            2       Lowest visible y value - i.e. 'top' of visible rectangle
4            2       Highest visible x value - i.e. 'right' of visible rectangle
6            2       Highest visible y value - i.e. 'bottom' of visible rectangle

For the BBC/Electron, all coordinates are assumed to be measured on a mode 0 style 640x256 output, regardless of the display mode in use at any particular time. Atom displays are assumed always to be measured per the native display mode.

&01xx set - Tape chunks

Chunk &0100 - implicit start/stop bit tape data block

This chunk represents data stored on a cassette with the default start/stop bits, and as such those start/stop bits are omitted to save space.

If one considers the most significant bit of a byte to be 'before' the least significant, bits are stored reversed from the order on the tape. The least significant bit of the first byte is the first bit that would have appeared on the tape, the most significant the 8th, and so on. In this way, bytewise values are the same as the bytes stored on cassette.

PSEUDO-CODE

  • while bytes remain in UEF chunk
    • output a zero bit (the start bit)
    • read a byte from the UEF chunk, store it to NewByte
    • let InternalBitCount = 8
    • while InternalBitCount > 0
      • output least significant bit of NewByte
      • shift NewByte right one position
      • decrement InternalBitCount
    • output a one bit (the stop bit)
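
The pseudo-code above translates to C quite directly. In this sketch the output is an array holding one tape bit (0 or 1) per entry, which is an assumption made for illustration; a real emulator would more likely feed the bits straight to its tape hardware model. The name uef_expand_0100 is invented here.

```c
#include <stddef.h>
#include <stdint.h>

/* Expands the data of a &0100 chunk into individual tape bits.
   'out' must have room for length * 10 entries.
   Returns the number of bits written. */
size_t uef_expand_0100(const uint8_t *data, size_t length, uint8_t *out)
{
	size_t n = 0;
	for (size_t i = 0; i < length; i++)
	{
		uint8_t byte = data[i];
		out[n++] = 0;               /* start bit */
		for (int bit = 0; bit < 8; bit++)
		{
			out[n++] = byte & 1;    /* least significant bit first */
			byte >>= 1;
		}
		out[n++] = 1;               /* stop bit */
	}
	return n;
}
```

A single byte of &AA expands to the bits 0, 0, 1, 0, 1, 0, 1, 0, 1, 1.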

Chunk &0101 - multiplexed data block

The chunks that store meaningful tape data that may be multiplexed are &0100 and &0102. If any of these is immediately followed by a &0101 chunk then that chunk contains exactly the same information as its predecessor, except that the data fields are expanded to contain multiplexed data.

If this chunk does not immediately follow a &0100 or &0102 chunk then its meaning is undefined and it should be ignored.

Older UEFs may use &0103 as a synonym for &0101 - it may be treated in exactly the same manner, but its use is deprecated.

Chunk &0102 - explicit tape data block

This is as chunk &0100, except that no assumption is made about start/stop bits - this is a raw representation of the bits stored on a tape, so start/stop bits are present here exactly when they are present on the tape. This makes sense for games that alter the default start/stop bit settings.

The very first byte is used to fully determine how many bits are actually present in this chunk. Only the first (chunk length * 8) - (value of first byte) bits are considered to be present.

The least significant bit of the second byte was the first bit on the source tape. The most significant bit of the second byte was the eighth bit, the least significant bit of the third byte was the ninth bit, and so on.

PSEUDO-CODE

  • compute bit count for chunk - get chunk length, multiply it by 8 and subtract the value of the first byte. Store it to BitCount
  • store zero to CurrentBit
  • while CurrentBit < BitCount
    • if CurrentBit mod 8 = 0, read a new data byte from the chunk to NewByte
    • output the least significant bit of NewByte
    • shift NewByte right one position
    • increment CurrentBit

Chunk &0104 - defined tape format data block

A block of data with a non-standard data format which can be defined. The first byte holds the number of data bits. The second byte holds the ASCII code for 'N', 'E' or 'O', which specifies that parity is not present, even or odd.

The third byte holds information concerning stop bits. If it is a positive number then it is a count of stop bits. If it is a negative number then it is a negated count of waves of the stop frequency. So, -2n describes the same sequence as +n. Positive numbers should be used wherever possible. Original BBC or Electron material should only produce positive numbers if correctly encoded.

The length of the data is the chunk's length minus three. Bits in bytes are stored reversed from the order on the tape, like block &0100, i.e. the least significant bit should appear on cassette first.

Data is always stored in the chunk as whole byte quantities. If the number of data bits is seven then the most significant bits of all bytes in the chunk are unused and should be zero.

Normal start bits should always be inserted into data, as per the implicit data chunk, &0100.

For the BBC/Electron, the following formats are allowed: 7E1, 7E2, 7O1, 7O2, 8E1, 8N1, 8N2, 8O1, along with their negative stop bits equivalent (i.e. 7E-2, 7E-4, etc)

For the Atom, data format will usually be 8N-3.

PSEUDO-CODE (NB: see phase notes at head of document)

  • let NumBitsPerPacket = number of data bits, per first byte in chunk
  • make a note of parity
  • let StopWaveCount = number of stop bits, per third byte in chunk
  • if StopWaveCount > 0 then double StopWaveCount
  • if StopWaveCount < 0 then negate StopWaveCount
  • while bytes remain in UEF chunk
    • output start bit - always a zero
    • read a byte from the UEF chunk, store it to NewByte
    • let InternalBitCount = NumBitsPerPacket
    • while InternalBitCount > 0
      • output least significant bit of NewByte
      • shift NewByte right one position
      • decrement InternalBitCount
    • if parity is required, output parity bit as required for NewByte when originally read
    • let InternalStopCount = StopWaveCount
    • while InternalStopCount > 0
      • output a single wave (i.e. low pulse, then high pulse) at the '1' frequency - i.e. 2400Hz if baud rate is 1200
      • decrement InternalStopCount
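
Two steps of the pseudo-code above are easy to get wrong: turning the third byte into a wave count, and producing the parity bit. The sketch below shows one way to do both. It assumes that the third byte is a signed two's complement value and that parity follows the usual serial convention (even parity makes the total number of one bits, parity bit included, even); neither detail is spelt out above. The function names are invented for this example.

```c
#include <stdint.h>

/* Number of waves at the '1' frequency that make up the stop sequence.
   A positive field counts stop bits (two waves each); a negative field
   is a negated wave count. */
int uef_0104_stop_waves(int8_t stop_field)
{
	return stop_field > 0 ? stop_field * 2 : -stop_field;
}

/* Parity bit for the low num_bits bits of value.
   mode is 'N', 'E' or 'O'; returns -1 when no parity bit is output. */
int uef_0104_parity_bit(uint8_t value, int num_bits, char mode)
{
	if (mode != 'E' && mode != 'O')
		return -1;

	int ones = 0;
	for (int i = 0; i < num_bits; i++)
		ones += (value >> i) & 1;

	int even_bit = ones & 1;   /* makes the total count of ones even */
	return mode == 'E' ? even_bit : even_bit ^ 1;
}
```

So the Atom's usual 8N-3 gives three stop waves and no parity bit, while 8N1 gives two stop waves.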

Chunk &0110 - high tone

A run of high (2400Hz on a 1200 baud cassette) tone, with a running length described in (1/(baud rate*2))ths of a second by the first two bytes.

PSEUDO-CODE (NB: see phase notes at head of document)

  • read wave count for chunk - first two bytes, store to WaveCount
  • while WaveCount > 0
    • output a single wave (i.e. low pulse followed by high pulse) at the current '1' frequency - i.e. 2400Hz at 1200 baud
    • decrement WaveCount

Chunk &0111 - high tone with dummy byte

According to the AUG section 20.10 (around p393), a dummy byte is usually placed in the middle of pure tone to get around a bug affecting BBC OS versions < 1.0. This chunk replicates that situation without the need for a high tone chunk followed by a short data block followed by another high tone.

This four byte chunk is composed of two sets of two bytes - the first two describing the length in (1/(baud rate*2))ths of a second of the tone before the dummy byte, and the second two similarly describing the length of the tone after the dummy byte. The dummy byte itself represents &AA, which it always is with ROM saved files.

PSEUDO-CODE (NB: see phase notes at head of document)

  • read 'before' wave count for chunk - first two bytes, store to WaveCount
  • while WaveCount > 0
    • output a single wave (i.e. low pulse followed by high pulse) at the current '1' frequency - i.e. 2400Hz at 1200 baud
    • decrement WaveCount
  • output the following bit sequence (in English reading order): 0, 0, 1, 0, 1, 0, 1, 0, 1, 1
  • read 'after' wave count for chunk - final two bytes, store to WaveCount
  • while WaveCount > 0
    • output a single wave (i.e. low pulse followed by high pulse) at the current '1' frequency - i.e. 2400Hz at 1200 baud
    • decrement WaveCount

Chunk &0112 - baudwise gap

A gap in the tape - that is a length of time for which no (recognised) sound is on the source audio cassette. Simply holds a two byte rest length in (1/(baud rate*2))ths of a second.
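
The two byte lengths used by chunks &0110, &0111 and &0112 all share the same unit, so one helper can convert any of them to seconds. This is a sketch only; uef_baudwise_seconds is a name invented for this example.

```c
#include <stdint.h>

/* Converts a length in (1/(baud rate*2))ths of a second to seconds. */
double uef_baudwise_seconds(uint16_t units, double baud_rate)
{
	return (double)units / (baud_rate * 2.0);
}
```

At the default 1200 baud a stored length of 2400 is therefore exactly one second.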

Chunk &0116 - floating point gap

As per 0112, but the gap length is a floating point number measured in seconds.

Chunk &0113 - change of baud rate

With UEF files, the current baud rate is a modal value, which is assumed to be 1200 when a UEF is opened. If this chunk is encountered, the baud rate changes.

This chunk contains a single floating point number, stating baud rate in stored bits per second.

Chunk &0114 - security waves

Security waves are mainly found at the start of a pilot tone as an identification feature. Sometimes they are at the end of a pilot tone but often only because of the high speed recording process. They consist of the same wavelengths as pilot waves and zero bit waves and sometimes have a leading and/or trailing pulse.

The first three bytes of this chunk (a 24 bit value) denote the number of 'waves' (although it is possible that the first and last may be only a pulse).

The fourth byte holds the ASCII code for 'P' or 'W'. If it is 'P', the first wave is only a single pulse equivalent to the second half of a wave (i.e. a high pulse if phase is unchanged), rather than an entire wave.

The fifth byte again holds the ASCII code 'P' or 'W' which, if it is 'P' signifies that the last wave is only a pulse equivalent to the first half of a wave (i.e. a low pulse if phase is unchanged), rather than an entire wave.

Remember to be aware of how phase may affect the current meaning of 'high' and 'low' pulse.

If the 'waves' follow a gap then the fourth byte can logically be 'P' or 'W'. If the 'waves' follow waves then the fourth byte will logically be 'W'.

This chunk never offends the general rule that the stored waveform consists only of gaps and pulses joined at zero crossings, and never creates an external or internal phase change.

  • Waves are recorded with eight 'waves' per byte.
  • Short 'waves' are denoted by 0 bits.
  • Long 'waves' are denoted by 1 bits.
  • Bits are stored akin to data bytes on the tape.
  • Spare bits in the last byte should preferably be 0 bits.

When the number of waves is '1':

  • Only one of the fourth and fifth bytes may be 'P'
  • If the fourth byte is 'P' the fifth byte must be 'W' but has no relevance

Examples:

The sequence of waves LSLLLSSLSSLLSL will be stored as &0E, &00, &00, 'W', 'W', &9D, &2C. A sequence following waves having only 1 short pulse will be stored as &01, &00, &00, 'W', 'P', &00. A sequence following a gap having only 1 short pulse followed by 3 long waves will be stored as &04, &00, &00, 'P', 'W', &0E.
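
The wave data bytes in these examples can be reproduced with a few lines of C. This sketch takes the waves as a string of 'L' and 'S' characters, which is purely a convenience for illustration, and produces only the data bytes that follow the five byte prefix. The function name is invented here.

```c
#include <stddef.h>
#include <stdint.h>

/* Packs a NUL terminated string of 'L' (long) and 'S' (short) waves into
   bytes, eight waves per byte, first wave in the least significant bit.
   'out' must have room for (number of waves + 7) / 8 bytes.
   Returns the number of bytes written. */
size_t uef_pack_security_waves(const char *waves, uint8_t *out)
{
	size_t count = 0;
	for (; waves[count] != '\0'; count++)
	{
		if ((count & 7) == 0)
			out[count >> 3] = 0;   /* spare bits are left as 0 */
		if (waves[count] == 'L')
			out[count >> 3] |= (uint8_t)(1u << (count & 7));
	}
	return (count + 7) / 8;
}
```

Packing "LSLLLSSLSSLLSL" yields &9D, &2C and packing "SLLL" yields &0E, in agreement with the first and third examples.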

PSEUDO-CODE (NB: see phase notes at head of document)

  • let NumWaves = number of stored waves, the first three bytes in chunk
  • read the first bit from the chunk
  • if fourth byte of chunk is an ASCII 'P' then output a single high pulse of the frequency implied by the bit just read, read new bit from chunk, decrement NumWaves
  • while NumWaves > 1
    • read next bit from chunk
    • if it is a zero, output a single wave (i.e. low pulse then high pulse) of the 'zero' frequency - i.e. 1200Hz if baud rate is 1200
    • if it is a one, output a single wave (i.e. low pulse then high pulse) of the 'one' frequency - i.e. 2400Hz if baud rate is 1200
    • decrement NumWaves
  • if NumWaves is equal to 1
    • if fifth byte of chunk is an ASCII 'P' then output a single low pulse of the frequency implied by the final bit
    • else fifth byte of chunk must be an ASCII 'W' so output a single wave (i.e. low pulse then high pulse) of the frequency implied by the final bit

Chunk &0115 - phase change

This chunk contains a 16 bit unsigned value between 0 and 359, which determines the new phase.

The two phase shifts most commonly recorded by this chunk are 0 and 180 degrees, since the majority of professional cassettes have waves shifted by one of those amounts. Before one of these chunks is met (i.e. immediately after opening a UEF) the phase shift should be taken to be 180 degrees.

This chunk only makes sense, as a description of an original cassette, if found before or after a gap.

See the section entitled 'Notes on phase shift' towards the top of this document for a proper discussion of the effect of phase on the output waveform.

Chunk &0120 - position marker

This chunk contains only a string, offering a textual description of the significance of the location it sits at within the file.

For example, some games contain the game followed by a level editor, so you could use this chunk to mark the start of the editor files so that a user can easily skip straight to those if desired.

&02xx set - Disc chunks

Chunk &0200 - disc info

Gives an overview of the included disc thusly:

Byte Offset  Length  Description
0            1       number of heads minus one. Values greater than 127 are invalid. Notice that since each head is considered to potentially read two sides of a disc platter, this value will be 0 for single sided and double sided disc images alike.
1            2       sector length in bytes of implicitly defined disc sides
3            1       number of sectors per track within implicitly defined disc sides
4            1       number of tracks within implicitly defined disc sides
5            1       a one byte filing system identifier, one of:
  • 0: Undefined or not specified
  • 1: Acorn 8271 DFS
  • 2: Watford DFS 62
  • 3: Acorn ADFS
  • 4: Acorn 1770 DFS
  • 5: Solidisk
  • 6: OPUS DDOS

The value of this is intended to tell you the catalogue type of the disc, and therefore if applicable which filing system ROM to 'recommend' to the user. This should be interesting to emulators that like to be vocal towards their users, but may also be used by those which like to implement their own filing systems compatible with the originals but without the hassle of a working hardware emulation.

If a UEF wants to 'force' an emulator to adopt a particular ROM, it should use the ROM hint chunk &0008.

Chunk &0201 - single implicit disc side

First comes a one byte side/head id, in which the top bit represents the disc side - not set implies side 1, set implies side 2. The low 7 bits form the disc head id.

Then following are (length of chunk - 1) bytes, stored such that the first byte is the first byte on the first sector of the first track, the (sector length)th byte is the first byte of the second sector, and so on.

The stuff that is not normally seen by any component above the drive controller - the sector headers, (M)FM syncs, etc are left implicit. This chunk correlates directly to the SSD/DSD/ADF style of disc image.

In many cases, there will not be as many bytes stored here as calculating (bytes per sector)*(sectors per track)*(tracks per side) seems to imply. This situation indicates that the value of the remaining bytes on the disc is not important, although the remaining sectors and tracks were formatted.
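
Locating a sector within this chunk is a matter of simple arithmetic. The sketch below assumes that tracks and sectors are both numbered from zero and that sectors are stored in ascending order within each track, as in SSD style images; the function name and parameters are invented for this example. It also reports when a sector lies wholly or partly beyond the stored data, which per the paragraph above means its contents are unimportant.

```c
#include <stddef.h>

/* Finds the offset, within the whole chunk, of the first byte of a sector.
   chunk_length includes the leading side/head id byte.
   Returns 1 and sets *offset if the sector is fully stored, else 0. */
int uef_0201_sector_offset(size_t chunk_length, unsigned sector_length,
                           unsigned sectors_per_track,
                           unsigned track, unsigned sector, size_t *offset)
{
	if (sector >= sectors_per_track)
		return 0;

	/* skip the one byte side/head id, then count whole sectors */
	size_t start = 1 +
		((size_t)track * sectors_per_track + sector) * sector_length;
	if (start + sector_length > chunk_length)
		return 0;   /* formatted, but contents not stored */

	*offset = start;
	return 1;
}
```

The sector length and sectors per track values would come from the disc info chunk &0200.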

Chunk &0202 - multiplexed disc side

As above but the disc data (after the side id) is multiplexed.

Chunk &0210 - explicit disc track

This chunk stores raw magnetic polarity changes at the platter level. Before that data are three data fields. A one byte field identifies side and head number as per chunk &0201. This is followed by another byte, signifying track number. Finally a two byte length count follows. This is the number of linearly spaced samples that record the entire track.

There then follows a list of results from reading the disc surface - a 0 anywhere the polarity did not change in between samples and a 1 wherever it did. This is equivalent to the raw underlying FM or MFM data that the WD177x and 8271 chips see.

The entire track is sampled, so an emulator can calculate data density by considering the number of samples and the rotation rate of the device. Bits are stored in bytes so that the lsb is considered to have come first on the disc.

No multiplexed way of duplicating data stored in this configuration is available.

&03xx set - ROM chunks

Chunk &0300 - standard machine rom

Contains a ROM image, which is usually 16kb in size. The first byte is a type byte. It can contain one of the following values:

  • 0 - type unspecified
  • 1 - this is the OS ROM
  • 2 - this is the BASIC ROM
  • 3 - this is a language ROM
  • 4 - this is a utility ROM
  • 5 - a filing system ROM
  • 6 - a hardware driver
  • 7 - a game ROM

The second byte is a slot recommendation, in which the high four bits should be zero. It is useful as on some hardware inserting a language or utility ROM in a slot above BASIC will cause it to boot instead, which may be the desired effect. For the OS ROM, this value is undefined, as it does not appear in a slot.

It should be noted that a slot recommendation is only a hint - it may be ignored by software if required. E.g. Electron emulators will have difficulty honouring a slot recommendation for the BASIC ROM because on that hardware the BASIC ROM is a special case, occupying more than one slot, and similarly they may not allow any ROM to occupy slots 8 or 9 since they are reserved for the keyboard.

The final part of this chunk is the ROM itself. As it will occupy a 16kb hole in the memory space, it shouldn't be larger than 16kb, however some ROMs are smaller (e.g. the Electron Plus 1 ROM is only 4kb), and should be considered to 'repeat' over the 16kb memory address range in that case.
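
The 'repeat' behaviour for short ROMs can be modelled with a modulo on the read address. This is a minimal sketch: the function name is invented, 'rom' points at the ROM data after the two header bytes, and rom_size is assumed to be non-zero and no larger than 16kb.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a byte from a ROM that may be smaller than its 16kb window.
   offset is the position within the window; smaller ROMs repeat. */
uint8_t uef_rom_read(const uint8_t *rom, size_t rom_size, uint16_t offset)
{
	return rom[(offset & 0x3fff) % rom_size];
}
```

For a 4kb ROM this makes offsets &0000, &1000, &2000 and &3000 all read the same byte.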

Chunk &0301 - multiplexed machine rom

As above, but the ROM data is multiplexed.

&04xx set - State snapshots

Chunk &0400 - 6502 standard state

Contains 8 bytes. The first is the 'update byte'. Common to all the snapshot chunks, the 'update byte' contains a non-zero value if the emulator is supposed to update this chunk when closed.

The next five bytes are the a, p (status), x, y and s registers in that order. Then the two byte program counter follows.
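A minimal Python sketch of reading this chunk follows. The names Cpu6502State and parse_6502_state are hypothetical, and the program counter is assumed to be stored low byte first (the natural 6502 order); the text above does not state the byte order explicitly.

```python
import struct
from typing import NamedTuple


class Cpu6502State(NamedTuple):
    update: bool  # True if the emulator should update this chunk when closed
    a: int
    p: int        # status register
    x: int
    y: int
    s: int        # stack pointer
    pc: int


def parse_6502_state(payload: bytes) -> Cpu6502State:
    """Decode the 8 byte payload of chunk &0400."""
    if len(payload) != 8:
        raise ValueError("chunk &0400 should contain exactly 8 bytes")
    # Six single bytes, then a 16 bit program counter (little-endian assumed).
    update, a, p, x, y, s, pc = struct.unpack("<6BH", payload)
    return Cpu6502State(bool(update), a, p, x, y, s, pc)
```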

Chunk &0401 - Electron ULA state

For Electron emulators, this chunk contains the entire state of the ULA (the video circuits, cassette interface, sound generator and ROM pager). Format is:

Byte Offset  Length  Description
0            1       'Update byte'. Contains a non-zero value if the emulator should update this chunk when closed.
1            2       Interrupt control, followed by interrupt status (SHEILA &FE00). Should be used to determine which interrupts are currently active as well as which are enabled.
3            2       SHEILA &FE02, followed by SHEILA &FE03 - together, the screen start address.
5            1       SHEILA &FE04 - cassette shift register.
6            2       First byte: value of &FE05, principally for determining the value of the page enable bit. Second byte: ROM currently paged in (low 4 bits). High 4 bits are undefined.
8            10      Remainder of SHEILA bytes, in ascending order (i.e. &FE06, then &FE07...).
18           4       Number of 16MHz cycles since last 'end of display' interrupt signal (regardless of whether this interrupt was actually enabled at the time).
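
The table above might be read along the following lines. This is a sketch only: parse_electron_ula_state and the dictionary keys are hypothetical names, and the 4 byte cycle count is assumed to be stored low byte first, which the table does not state.

```python
import struct


def parse_electron_ula_state(payload: bytes) -> dict:
    """Decode the 22 byte payload of chunk &0401 into named fields."""
    if len(payload) < 22:
        raise ValueError("chunk &0401 should contain 22 bytes")
    (update, irq_control, irq_status,
     fe02, fe03, fe04, fe05, paged_rom) = struct.unpack_from("<8B", payload, 0)
    # Offsets 8..17 hold SHEILA &FE06 to &FE0F in ascending order.
    remainder = {0xFE06 + i: value for i, value in enumerate(payload[8:18])}
    # Little-endian is an assumption for the cycle counter.
    (cycles,) = struct.unpack_from("<I", payload, 18)
    return {
        "update": bool(update),
        "irq_control": irq_control,
        "irq_status": irq_status,
        "screen_start": (fe02, fe03),   # raw &FE02 and &FE03 values
        "cassette_shift": fe04,
        "fe05": fe05,
        "paged_rom": paged_rom & 0x0F,  # high 4 bits are undefined
        "sheila": remainder,
        "cycles_since_display_end": cycles,
    }
```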

Chunk &0402 - WD1770 state

The assumption is made that a snapshot cannot be saved while the WD1770 is in the middle of an operation. The first byte is, as with all other state snapshots, the 'update byte' - containing a non-zero value if the emulator should update this chunk when closed.

The next four bytes are, in order : the status byte, the track byte, the sector byte, and the data byte.

The final byte stores the disc drive status. As this varies from machine to machine, it is in a standard form. Bits 0->2 are the current drive number, in the range 0..7, bit 3 is the side select bit (high = side 2), and bit 4 is the double density select bit (high = double density).
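
The six bytes of this chunk, including the bit fields of the drive status byte, can be decoded roughly as follows. The function name and dictionary keys are hypothetical and chosen only for this sketch.

```python
def parse_wd1770_state(payload: bytes) -> dict:
    """Decode the 6 byte payload of chunk &0402."""
    if len(payload) != 6:
        raise ValueError("chunk &0402 should contain exactly 6 bytes")
    update, status, track, sector, data, drive_status = payload
    return {
        "update": bool(update),
        "status": status,
        "track": track,
        "sector": sector,
        "data": data,
        "drive": drive_status & 0x07,                  # bits 0 to 2
        "side_two": bool(drive_status & 0x08),         # bit 3, high = side 2
        "double_density": bool(drive_status & 0x10),   # bit 4
    }
```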

Chunk &0403 - JIM paging register state

A two byte chunk - the usual 'update byte' followed by the last value written to the JIM paging register.

Chunk &0410 - standard memory data

Following the usual 'update byte', a second byte describes which memory is stored. It has one of the following values:

  • 0 - this memory data comes from the standard RAM, located from location &0000 in the 6502 memory map upwards
  • 1 - this memory data comes from shadow RAM
  • 2 - this memory data is from the JIM page
  • 255 - 'patch memory'

Upon encountering 'patch memory', the next three bytes should be read. The first is a base address, with a value equivalent to one in the above table other than 255. The next two are an offset into that area to which the following data should be loaded.

For example, suppose that, after the 'update byte', the chunk continued &ff, &00, &12, &34 - then the following data should be loaded at position &3412 in normal RAM.
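
The handling of 'patch memory' can be sketched in Python as below. The names are hypothetical. The sketch assumes that the bytes in the example are those that follow the 'update byte', that the two byte offset is stored low byte first (as the &3412 result implies), and that non-patch data loads from offset 0 of its area.

```python
MEMORY_AREAS = {0: "main RAM", 1: "shadow RAM", 2: "JIM page"}
PATCH_MEMORY = 255


def parse_memory_chunk(payload: bytes):
    """Return (update flag, area name, load offset, data) for chunk &0410."""
    if len(payload) < 2:
        raise ValueError("chunk &0410 needs an update byte and a type byte")
    update = bool(payload[0])
    kind = payload[1]
    if kind == PATCH_MEMORY:
        if len(payload) < 5:
            raise ValueError("patch memory needs a base byte and a 2 byte offset")
        area = payload[2]
        offset = payload[3] | (payload[4] << 8)  # low byte first
        data = payload[5:]
    else:
        area, offset, data = kind, 0, payload[2:]
    if area not in MEMORY_AREAS:
        raise ValueError("unknown memory area")
    return update, MEMORY_AREAS[area], offset, data
```

Fed the example bytes after an update byte of zero, this returns an offset of &3412 into main RAM.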

Chunk &0411 - multiplexed memory data

As above, but with multiplexed data.

Chunk &0412 - multiplexed (partial) 6502 state

Intended to coexist with chunk &0400. After the 'update byte', contains multiplexed entries for A, P, X and Y in that order.

Chunk &0420 - Slogger Master RAM Board State

After the 'update byte', this chunk contains one other byte indicating the mode of a Slogger Master RAM board. That byte may have value zero to indicate that the board is disabled, one to indicate that it is in turbo mode or two to indicate that it is in shadow mode.

&FFxx set - reserved / non-emulator portable


Chunks &FF01 -> &FFFF - reserved / non-emulator portable

These chunks are reserved for you to do anything you like with. For example, my emulator stores a nice picture for its 'about' box, a GUI font and other small things in these chunks. An emulator should not assume it can understand these chunks unless it recognises the value of chunk &FF00.

Chunk &FF00 - emulator identification string

This chunk is a NULL terminated string. It identifies which emulator wrote these &FF?? chunks, and therefore allows your emulator to decide whether it knows their meaning.

To ensure this doesn't clash with anyone else's chosen identification string, I suppose something like the name of your emulator is a good choice.
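
As a small illustration, an emulator might gate its handling of the reserved chunks on this string as sketched below. The function names and the identification string "MyEmulator" are hypothetical, and ASCII is assumed as the string encoding, which is not specified here.

```python
def read_emulator_id(payload: bytes) -> str:
    """Return the text before the first NULL byte of a &FF00 payload."""
    name, _, _ = payload.partition(b"\x00")
    # ASCII is an assumption; undecodable bytes are replaced, not rejected.
    return name.decode("ascii", errors="replace")


def understands_reserved_chunks(payload: bytes, own_id: str = "MyEmulator") -> bool:
    """True only if the &FF00 string matches this emulator's own identifier."""
    return read_emulator_id(payload) == own_id
```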
