ZIP File Format Summary
=======================
This summarises the ZIP File Format described in various places such as
http://mdfs.net/Docs/Comp/Archiving/Zip/Format and elsewhere. Values in
brackets are values to use to create a generally-compatible archive.

ZIP Data Section
----------------
Files are stored in a ZIP file one after another, with a ZIP local header
before each file:

         0  file header id "PK",&03,&04            4 bytes
         4  version needed to extract (eg &0000)   2 bytes
         6  general purpose bit flag (eg &0000)    2 bytes
         8  compression method                     2 bytes
        10  last modified time in DOS format       2 bytes
        12  last modified date in DOS format       2 bytes
        14  crc-32                                 4 bytes
        18  compressed size (l)                    4 bytes
        22  uncompressed size                      4 bytes
        26  filename length (n)                    2 bytes
        28  extra field length (e)                 2 bytes
        30  filename                               n bytes
      30+n  extra field                            e bytes
    30+n+e  data                                   l bytes
  30+n+e+l  ...

The filename is stored in Unix format: directories are separated by '/'s.
Directory entries have '/' at the end of their filename and compressed and
uncompressed sizes of zero. ZIP filenames and BBC filenames are converted
bidirectionally by swapping the following characters:

    /  <->  .
    ?  <->  #
    $  <->  <
    ^  <->  >
    @  <->  =
    &  <->  +
    %  <->  ;

On extracting, BBCUnZip converts spaces in filenames to '_'s.

The 'Extra' field holds additional information specific to the data. The
Acorn extra field has the following contents:

     0  extra header id            2 bytes  "AC"
     2  extra header sublength     2 bytes  ------------v
     4  Acorn header id            4 bytes  "ARC0"       4
     8  load address               4 bytes               8
    12  execution address          4 bytes              12
    16  attributes                 4 bytes              16
    20  &00000000                  4 bytes              20
    24  creation time              2 bytes              22
    26  creation date              2 bytes              24
    28  main account number        2 bytes              26
    30  auxiliary account number   2 bytes              28

On extraction, only the data present in the extra field should be written
to the extracted files or directories.
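
The filename conversion and the Acorn extra field layout above can be
sketched in Python as follows. This is only an illustrative sketch, not
BBCZip's or BBCUnZip's actual code: the names swap_name and
parse_acorn_extra are hypothetical, values are assumed to be little-endian
as elsewhere in ZIP files, and the extra field is assumed to hold a
sequence of (id, sublength, data) records of which one may be the Acorn
record.

```python
import struct

# Character pairs swapped between ZIP and BBC filenames (from the list above).
_PAIRS = ["/.", "?#", "$<", "^>", "@=", "&+", "%;"]
_SWAP = {}
for a, b in _PAIRS:
    _SWAP[a] = b
    _SWAP[b] = a


def swap_name(name):
    """Swap each listed character for its partner (the mapping is its own inverse)."""
    return "".join(_SWAP.get(ch, ch) for ch in name)


# (field name, offset within the record data, struct format); hypothetical names.
# Offset 0 is the "ARC0" id and offsets 16..19 are the &00000000 word.
_ACORN_LAYOUT = [
    ("load", 4, "<I"),
    ("exec", 8, "<I"),
    ("attr", 12, "<I"),
    ("ctime", 20, "<H"),
    ("cdate", 22, "<H"),
    ("acc", 24, "<H"),
    ("aux", 26, "<H"),
]


def parse_acorn_extra(extra):
    """Return only the Acorn fields actually present in a raw extra field.

    Assumption: records are scanned as id (2 bytes), sublength (2 bytes),
    then sublength bytes of data. Returns {} if no "AC"/"ARC0" record exists.
    """
    pos = 0
    while pos + 4 <= len(extra):
        header_id, sublen = struct.unpack_from("<2sH", extra, pos)
        data = extra[pos + 4 : pos + 4 + sublen]
        pos += 4 + sublen
        if header_id != b"AC" or data[:4] != b"ARC0":
            continue
        fields = {}
        for name, offset, fmt in _ACORN_LAYOUT:
            # A field counts as present only if it fits wholly inside the data.
            if offset + struct.calcsize(fmt) <= len(data):
                fields[name] = struct.unpack_from(fmt, data, offset)[0]
        return fields
    return {}
```

For example, swap_name("dir/file.txt") gives "dir.file/txt", and applying
swap_name twice returns the original name. A record with sublength 12
yields only the load and execution addresses.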
Values absent from the extra field should be ignored, or suitable defaults
from the ZIP header used, according to the extra header sublength:

  extra header sublength:     >27     >25     >23     >15     >11     >7      <8
  load address                load    load    load    load    load    load    0
  execution address           exec    exec    exec    exec    exec    load    0
  attributes                  attr    attr    attr    attr    &33     &33     &33
  creation time               ctime   ctime   ctime   mtime   mtime   mtime   mtime
  creation date               cdate   cdate   cdate   mdate   mdate   mdate   mdate
  main account number         acc     acc     ignore  ignore  ignore  ignore  ignore
  auxiliary account number    aux     acc     ignore  ignore  ignore  ignore  ignore

BBCUnZip only sets file ownership if the -X option is used.

BBCZip can create 'envelopes' where the archive contains all the file
metadata but no contents. This is indicated by compression method 255 and
an uncompressed file length of zero.

ZIP Directory (Catalog) Section
-------------------------------
Following all the stored data there is usually a Central Directory, which
repeats most of the ZIP header information:

         0  dir header id "PK",&01,&02             4 bytes
         4  version made by                        2 bytes
         6  version needed to extract              2 bytes  -------+
         8  general purpose bit flag               2 bytes         |
        10  compression method                     2 bytes         |
        12  last modification time, DOS format     2 bytes  As in  |
        14  last modification date, DOS format     2 bytes  local  |
        16  crc-32                                 4 bytes  header |
        20  compressed size (l)                    4 bytes         |
        24  uncompressed size                      4 bytes         |
        28  filename length (n)                    2 bytes         |
        30  extra field length (e)                 2 bytes  -------+
        32  file comment length (c)                2 bytes
        34  disk number start (eg &0000)           2 bytes
        36  internal file attributes (eg &0000)    2 bytes
        38  external file attributes (eg &0000)    4 bytes
        42  relative offset of local header        4 bytes
        46  filename                               n bytes
      46+n  extra field (variable size)            e bytes
    46+n+e  file comment                           c bytes
  46+n+e+c  ...

'Version made by' and 'Version needed to extract' can both be zero.
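
As an illustration of the central directory layout above, the following
Python sketch unpacks one entry from a byte buffer. It is a minimal
sketch under stated assumptions rather than a reference implementation:
the names CDIR_FORMAT and parse_central_entry and the dictionary keys are
hypothetical, and all values are assumed to be little-endian.

```python
import struct

# Fixed 46-byte part of a central directory entry, in the order tabulated above.
CDIR_FORMAT = "<4sHHHHHHIIIHHHHHII"
CDIR_SIZE = struct.calcsize(CDIR_FORMAT)  # 46


def parse_central_entry(buf, pos=0):
    """Unpack one central directory entry starting at buf[pos].

    Returns (entry, next_pos), where next_pos is the offset just past the
    file comment, i.e. 46+n+e+c bytes after pos.
    """
    (sig, made_by, needed, flags, method, mtime, mdate, crc,
     csize, usize, n, e, c, disk, int_attr, ext_attr,
     local_offset) = struct.unpack_from(CDIR_FORMAT, buf, pos)
    if sig != b"PK\x01\x02":
        raise ValueError("not a central directory entry")

    start = pos + CDIR_SIZE
    name = buf[start : start + n]
    extra = buf[start + n : start + n + e]
    comment = buf[start + n + e : start + n + e + c]

    entry = {
        "version_made_by": made_by,
        "version_needed": needed,
        "flags": flags,
        "method": method,
        "mtime": mtime,            # raw DOS-format time, not decoded here
        "mdate": mdate,            # raw DOS-format date, not decoded here
        "crc32": crc,
        "compressed_size": csize,
        "uncompressed_size": usize,
        "disk_start": disk,
        "internal_attr": int_attr,
        "external_attr": ext_attr,
        "local_offset": local_offset,
        "name": name,
        "extra": extra,
        "comment": comment,
        # Directory entries end in '/'.
        "is_directory": name.endswith(b"/"),
        # BBCZip 'envelope': method 255 with zero uncompressed length.
        "is_envelope": method == 255 and usize == 0,
    }
    return entry, start + n + e + c
```

Calling parse_central_entry repeatedly, passing each returned next_pos as
the new pos, walks the directory one entry at a time until a different
header id is reached.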

ZIP End Of File
---------------
The final entry is usually an End Of File header:

     0  end of file id, "PK",&05,&06                       4 bytes
     4  number of this disk (eg &0000)                     2 bytes
     6  number of the disk with the start of the
        central directory (eg &0000)                       2 bytes
     8  total number of entries in the central dir
        on this disk                                       2 bytes
    10  total number of entries in the central dir         2 bytes
    12  size of the central directory                      4 bytes
    16  offset of start of central directory with
        respect to the start of the starting disk          4 bytes
    20  zipfile comment length (c)                         2 bytes
    22  zipfile comment                                    c bytes
  22+c

ZIP header IDs
--------------
  "PK",&01,&02 - Central directory
  "PK",&03,&04 - Local header
  "PK",&05,&06 - End of File (end of central directory)
  "PK",&07,&08 - Data descriptor
  "PK",&05,&05 - Digital signature
  "PK",&06,&06 - Zip64 end of central directory
  "PK",&06,&07 - Zip64 end of central directory locator
  "PK",&06,&08 - Zip64 extra data record
  "PK",&30,&30 - Temporary spanning marker
  "PK",&0F,&10 - Temporary build header, "OP" - file still open

References
----------
* http://mdfs.net/Docs/Comp/Archiving/Zip/Format
* http://mdfs.net/Docs/Comp/Archiving/Zip/ExtraField
* http://mdfs.net/Apps/Archivers/BBCZip