Packfile format information

Contents

Introduction and scope

This document describes the format of Allegro's packfiles from a decompression point of view. It does not describe how to do the compression (read the source, or ask Google about LZSS).

file.c says "This compression algorithm is based on the ideas of Lempel and Ziv, with the modifications suggested by Storer and Szymanski.", if that means anything to you.

Packfile signature

All compressed packfiles begin with a four byte signature "slh!" (ASCII), which in hexadecimal is 0x73, 0x6C, 0x68, 0x21 (in that order).

Another form of packfiles are uncompressed packfiles, which begin with the four byte signature "slh." (ASCII), in hexadecimal 0x73, 0x6C, 0x68, 0x2B. The rest of the file is then completely raw. Uncompressed packfiles will not be discussed further.

Format and decompression algorithm

Decompression requires a ring buffer of 4096 bytes and a ring index (which indexes into the ring buffer). The ring index starts at 4078, assuming 0-based indices. New bytes may be stored into in the position in the ring buffer that the index points to. When a byte is stored, the index is incremented by one, wrapping around to zero as necessary.

Packfiles must be decompressed sequentially. The first byte following the signature in the packfile is the "flags" byte, which determines how the following "tokens" in the file are to be interpreted.

Bit 1 (the least significant bit) corresponds to the first token following the flags byte. Bit 2 corresponds to the second token, etc. After every eight tokens, there is another flag byte and another eight tokens, and so on, until the end of file.

If the bit I of the flags byte is set, then token I is a single byte to be sent to the output. In addition, it must put onto the ring buffer and the ring index incremented.

Otherwise, if bit I of the flags byte is not set, then token I is a two byte sequence containing an index and a length. The index is the first byte OR'd together with the higher 4 bits of the second byte as the higher order bits, thus making a 12-bit unsigned number (0-4095). The length is the lower 4 bits of the second byte, plus 3. In less hand-wavy C syntax:
      index  = byte1 | ((byte2 & 0xF0) << 4);
      length = (byte2 & 0x0F) + 3;
Then, "length" number of bytes from the ring buffer, starting at "index", are to be sent to the output. In addition, all these bytes must be put onto the ring buffer, and the ring buffer incremented accordingly.

Decompression ends at the end of the file. There is no end-of-file marker. If the number of tokens is not a multiple of eight, the unused bits in the latest flags byte are always zero.

Summary

In brief, compressed packfiles take the form:
      signature -- 4 bytes

flags -- 1 byte token 1 -- each token 1 or 2 bytes token 2 ... token 8 flags ... (repeat flags and tokens 1-8 until EOF)