HACKER Q&A
📣 appleflaxen

How would you create a utility to open a custom file format?


What's the easiest language to use if I need to convert an arbitrary file format into variables in memory? (assuming there is no existing library)

Are there any languages that really lend themselves to this problem?

The data is very heterogeneous (think: an underspecified file format spec implemented by multiple independent vendors), with a header made up of many text labels and a body of numeric data with varying bit depths and numeric types.


  👤 simonblack Accepted Answer ✓
I used C, and also FUSE.

I wrote utilities to mount and access the filesystems of old disk operating systems from 8-bit computers of the 1980s (both floppy and hard disk filesystems).

You also need a good working knowledge of the file format, whether you obtain that from documentation, or by disassembling working utilities that know that file format, or just plain old digging and trial and error.

That's valid whether you're looking at just a single file's format, or whether you're talking about the workings of a complete filesystem.


👤 tgflynn
Probably whatever language you're using for the rest of the project. Of course it's most straightforward to do in C, but most higher-level and/or scripting languages provide some sort of API for dealing with binary data, because they often need to interface with low-level C code.

For example, Python has the struct module, which is fairly easy to use.
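
A minimal sketch of what that can look like. The layout here is made up for illustration (4-byte magic "DEMO", a uint16 bit depth, a uint32 sample count, then the samples), and the names parse_record and build_record are hypothetical helpers, not part of any library:

```python
import struct

# Hypothetical layout (an assumption for illustration, not a real format):
#   4-byte magic, uint16 bit depth, uint32 sample count, then the samples.
HEADER = struct.Struct("<4sHI")  # "<" = little-endian, no padding

# Map the bit depth stored in the header to a struct code for signed integers.
SAMPLE_CODES = {8: "b", 16: "h", 32: "i"}


def parse_record(blob: bytes):
    """Return (bit_depth, samples) decoded from a bytes object."""
    magic, bit_depth, count = HEADER.unpack_from(blob, 0)
    if magic != b"DEMO":
        raise ValueError("unexpected magic bytes")
    code = SAMPLE_CODES[bit_depth]
    # The body's format string is built at run time from the header values.
    samples = struct.unpack_from(f"<{count}{code}", blob, HEADER.size)
    return bit_depth, list(samples)


def build_record(bit_depth: int, samples) -> bytes:
    """Encode samples in the same hypothetical layout (handy for testing)."""
    code = SAMPLE_CODES[bit_depth]
    body = struct.pack(f"<{len(samples)}{code}", *samples)
    return HEADER.pack(b"DEMO", bit_depth, len(samples)) + body
```

Building the body's format string from what the header says is one way to cope with a bit depth that varies from file to file.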


👤 swiley
Really, what would be nice is something like format strings for binary data, or something that lets you describe binary structures in a formal way. Almost any reasonable programming language will let you push bytes around.
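
One rough approximation of a declarative description, sketched with Python's ctypes. The Header class and its fields are hypothetical and only stand in for whatever the real format defines; the fields are ordered so that no padding is needed:

```python
import ctypes


class Header(ctypes.LittleEndianStructure):
    # Hypothetical header (an assumption, not a real format). Field order is
    # chosen so every field is naturally aligned and the size is 12 bytes.
    _fields_ = [
        ("magic", ctypes.c_char * 4),
        ("count", ctypes.c_uint32),
        ("bit_depth", ctypes.c_uint16),
        ("channels", ctypes.c_uint16),
    ]


# Example bytes as they might appear at the start of a file.
raw = (
    b"DEMO"
    + (3).to_bytes(4, "little")
    + (16).to_bytes(2, "little")
    + (2).to_bytes(2, "little")
)

# Copy the bytes into a Header instance; fields then read as attributes.
header = Header.from_buffer_copy(raw)
```

Here the field list acts as the description of the structure, and header.count, header.bit_depth and so on are read by name rather than by offset.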