Game MPAK file format
This week I updated the storage file format used in Rival Fortress. The previous file format was a quick implementation I came up with while still figuring out what was needed for the engine. Now that most of the engine functionality is fleshed out, I decided to revise the file format to better fit the engine’s needs.
About MPAK
MPAK (Metric PAK) is a generalized binary file format optimized for disk streaming and network transmission. It is currently used to store:
- static game assets (e.g. meshes, textures, shaders)
- runtime data (e.g. save games, profile data)
- user generated content (e.g. mod logic and assets)
The format also supports versioning and basic overwriting of previous assets: the engine at runtime loads all MPAKs and resolves any conflicting entries by keeping only the most recent one. This way mod authors can overwrite engine content by distributing custom MPAKs.
File structure
MPAKs have a metadata preamble that contain:
- A header, with a file identifier, version flags and section count;
- A list of sections, describing the categories, storage type and count of entries to expect in the file;
- A list of entries, describing type, size, and offset of each entry.
All the metadata about the file is contained in the first bytes of the file, right after the header. This means that the engine can quickly parse the metadata to obtain the information needed in order to build the correct representation of the file entries, and resolve conflicts.
After the metadata has been parsed the engine can proceed to make the necessary allocations and load high priority data. Files containing entries that are not needed right away are memory-mapped in order to lazy-load large data blobs on demand.
Sections, that group entries together, are sort of like categories and represent an homogeneous list of items, like meshes, textures or sounds. This grouping by sections is a convenience feature to facilitate parsing of similar data, like shaders or LUA scripts, that can easily fit into memory without having to random access around the file. Data blobs for each section are stored sequentially and come after the last Entry
in the metadata part of the file, at the 64 bit offsets specified by EntryOffset
.
Nested formats
The MPAK file format specifies only the type, size and offset of each entry, not how it is stored. This allows the flexibility of treating different data types independently. For example: shaders are stored as LZ4 compressed text, while meshes are stored directly as vertex and index buffers. The code in charge of writing/reading each section decided how to store the data.