A New, Future-Proofable Archive Format
Written 2012-11-30
Tags: Arc Compression Zip Future-Proofing
I've been dealing with different archive formats recently, and I'm a little bummed that none of them do everything I want. So I've been kicking
around the idea of an archive format that can adopt future algorithms, can be tuned at pack time for whatever is being stored, and can pick up
more advanced decompression algorithms without needing to update the client. What I've come up with so far is embedding a small subset of a
programming language into the archive itself, so that each archive carries its own decompressor.
Examples:
- Text files can be compressed with the same LZxx algorithms used today
- WAV files can be compressed as FLAC files, and decompressed on the fly as needed
- Bitmaps can be compressed any number of ways (PNG/JP2/...)
One hurdle is making the decompression algorithm small enough for this to be effective, since its code is stored in the archive alongside the
data. That said, for small files, a smaller but less efficient decompressor can be included to keep the total file size down.
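The size trade-off above can be sketched as a pack-time choice: for each file, add the size of the decompressor that would have to be embedded to the size of the compressed payload, and keep whichever total is smallest. This is only a rough sketch, not a spec. The `Codec` record, the `pick_codec` helper, the toy RLE compressor and the decoder sizes (0, 64 and 2048 bytes) are all made up for illustration; only `zlib.compress` is a real library call.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, List, Tuple


def rle_compress(data: bytes) -> bytes:
    """Toy run-length encoder: emits (run length, byte value) pairs."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte matches, capped at 255 per pair.
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


@dataclass
class Codec:
    name: str                           # hypothetical label
    decoder_size: int                   # assumed size of the embedded decompressor, in bytes
    compress: Callable[[bytes], bytes]


# Decoder sizes are invented numbers, not measurements.
CODECS = [
    Codec("store", 0, lambda d: d),
    Codec("rle", 64, rle_compress),
    Codec("deflate", 2048, lambda d: zlib.compress(d, 9)),
]


def pick_codec(data: bytes, codecs: List[Codec]) -> Tuple[Codec, int]:
    """Return the codec with the smallest decoder + payload total, and that total."""
    totals = [(c.decoder_size + len(c.compress(data)), c) for c in codecs]
    total, best = min(totals, key=lambda t: t[0])
    return best, total
```

With these invented sizes, a ten-byte file is simply stored, a long run of one byte goes to the tiny RLE decoder, and a large text file justifies shipping the bigger deflate decoder. A real packer would also share one copy of a decompressor between all the files that use it, which this sketch ignores.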
Future proofing is intrinsic - when an archive is packed, it contains the algorithms needed to decompress it, so archives made with new
compression algorithms will still work fine on old implementations.
Security concerns - each archive contains one or more programs that run on the user's machine. This is where exposing only a subset of
normal OS bindings comes in - the program gets the ability to read the compressed data block, the ability to read/free memory, the ability to
write to the output buffer, and of course entry/exit points for open/close/read/seek.
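To make the restricted-bindings idea concrete, here is a minimal sketch of what a host could look like. Everything in it is an assumption for illustration: the `Host` class, the five-opcode instruction set and the `run` interpreter are made up, not part of any existing format. The embedded program can only read bytes from the compressed block and append to the output buffer. The memory bindings and the open/close/seek entry points are left out for brevity, and step and output limits are added so a hostile archive can't loop forever or fill memory.

```python
class Host:
    """The only bindings the embedded program can reach (hypothetical names)."""

    def __init__(self, compressed: bytes, max_output: int = 1 << 20):
        self._input = compressed
        self._pos = 0
        self._max_output = max_output
        self.output = bytearray()

    def read_byte(self) -> int:
        """Next byte of the compressed block, or -1 at end of data."""
        if self._pos >= len(self._input):
            return -1
        value = self._input[self._pos]
        self._pos += 1
        return value

    def write(self, value: int, count: int) -> None:
        """Append `value` to the output buffer `count` times, within the cap."""
        if len(self.output) + count > self._max_output:
            raise RuntimeError("output limit exceeded")
        self.output += bytes([value]) * count


def run(program, host, max_steps=1_000_000):
    """Interpret a tiny register program that can only touch `host`."""
    regs = [0, 0, 0, 0]
    pc = 0
    for _ in range(max_steps):
        op, *args = program[pc]
        pc += 1
        if op == "READ":        # READ r: r = next input byte, or -1 at EOF
            regs[args[0]] = host.read_byte()
        elif op == "JNEG":      # JNEG r, target: jump if r is negative
            if regs[args[0]] < 0:
                pc = args[1]
        elif op == "JMP":       # JMP target: unconditional jump
            pc = args[0]
        elif op == "EMIT":      # EMIT count_reg, value_reg: write a run
            host.write(regs[args[1]], regs[args[0]])
        elif op == "HALT":
            return bytes(host.output)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step limit exceeded")


# A run-length decoder as it might be stored inside the archive.
RLE_DECODER = [
    ("READ", 0),       # 0: r0 = run length
    ("JNEG", 0, 6),    # 1: end of data -> halt
    ("READ", 1),       # 2: r1 = byte value
    ("JNEG", 1, 6),    # 3: truncated pair -> halt
    ("EMIT", 0, 1),    # 4: write r1, r0 times
    ("JMP", 0),        # 5: next pair
    ("HALT",),         # 6
]
```

Running `run(RLE_DECODER, Host(bytes([3, 97, 2, 98])))` returns `b"aaabb"`. The point of the design is that the program has no opcode for anything outside `Host`, so the client only has to trust its own interpreter, not the archive.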