[personal profile] jbanana
It's unusual nowadays to have to deal with data in a non-text format. The only case I've had much experience with is Exif data in JPEGs, which I hacked at so I could display some photo metadata in Jix. The code I wrote then is pretty awful, partly because I couldn't be bothered to refactor it as it grew, but partly because parsing binary data is a pain. You have to tease out byte values into numbers and strings, paying attention to endianness, integer versus floating point, signed versus unsigned, explicit array lengths and pointers, all of which I'd left behind when I last wrote C in 1999.
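To make the byte-teasing concrete, here is a minimal sketch using the standard `java.nio.ByteBuffer`. The class name `BinaryFields`, its helper methods, and the six sample bytes are made up for illustration; they are not taken from the Exif code mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helpers, for illustration only.
class BinaryFields {
    /** Reads a 16-bit field and widens it to an int without sign extension. */
    public static int readUnsignedShort(ByteBuffer buf) {
        return Short.toUnsignedInt(buf.getShort());
    }

    /** Reads a string stored as a 16-bit length followed by that many ASCII bytes. */
    public static String readString(ByteBuffer buf) {
        int length = readUnsignedShort(buf);
        byte[] raw = new byte[length];
        buf.get(raw);
        return new String(raw, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Made-up sample: a 16-bit number, then a length-prefixed string.
        byte[] data = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x02, 'h', 'i'};

        // The same two bytes give four different numbers depending on
        // byte order and signedness.
        ByteBuffer big = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN);
        ByteBuffer little = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        System.out.println(big.getShort(0));                           // -2
        System.out.println(Short.toUnsignedInt(big.getShort(0)));      // 65534
        System.out.println(little.getShort(0));                        // -257
        System.out.println(Short.toUnsignedInt(little.getShort(0)));   // 65279

        // Skip the number, then read the explicit-length string.
        big.position(2);
        System.out.println(readString(big));                           // hi
    }
}
```

Even in this tiny case the reader has to know the byte order, the signedness and the length convention in advance; none of it is recoverable from the bytes themselves.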

Anyway, I was looking at the format of Java class files (just out of curiosity) and I tried writing a parser. It wasn't so hard, but the code soon began to look like the Exif mess. It occurred to me that some sort of generic parser might help. Then the work of reading binary data would come down to documenting the format in some kind of binary description language for the parser to use...

But of course I'm not the first to think of this: DFDL seems to be something like what I was thinking of, and Daffodil seems to be an implementation of it that runs on the JVM.

My only issue is that DFDL is horribly verbose. I had in mind something like assembler for binary data, but DFDL feels like Cobol translated into XML.
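For a sense of what a terser description could look like, here is a rough sketch in which the format is just a list of field names and widths, and one generic loop does the reading. Everything in it (`DescribedParser`, `Field`, `CLASS_HEADER`) is hypothetical and has nothing to do with DFDL or Daffodil. It assumes big-endian unsigned fields of up to four bytes, which is enough for the first four items of a class file: a 4-byte magic number followed by three 2-byte counts.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical description-driven reader, for illustration only.
class DescribedParser {
    /** One line of the description: a field name and its width in bytes. */
    static final class Field {
        final String name;
        final int width;

        Field(String name, int width) {
            this.name = name;
            this.width = width;
        }
    }

    /** The first four items of a class file, all big-endian and unsigned. */
    public static final List<Field> CLASS_HEADER = Arrays.asList(
            new Field("magic", 4),
            new Field("minor_version", 2),
            new Field("major_version", 2),
            new Field("constant_pool_count", 2));

    /** Reads each described field in order as a big-endian unsigned number. */
    public static Map<String, Long> parse(List<Field> description, ByteBuffer buf) {
        Map<String, Long> values = new LinkedHashMap<>();
        for (Field field : description) {
            long value = 0;
            for (int i = 0; i < field.width; i++) {
                // Mask with 0xFF so bytes above 0x7F are not sign-extended.
                value = (value << 8) | (buf.get() & 0xFF);
            }
            values.put(field.name, value);
        }
        return values;
    }

    public static void main(String[] args) {
        // Hand-written header bytes; a real parser would read a .class file.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                         0x00, 0x00, 0x00, 0x34, 0x00, 0x10};
        Map<String, Long> parsed = parse(CLASS_HEADER, ByteBuffer.wrap(header));
        System.out.println(Long.toHexString(parsed.get("magic")));   // cafebabe
        System.out.println(parsed.get("major_version"));             // 52
    }
}
```

The sketch stops exactly where class files get awkward: the constant pool that follows is a sequence of tagged, variable-length entries, so a usable description language would also need counts, tags and conditionals. That is presumably part of why real ones grow verbose.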


Edit: http://dilbert.com/strips/comic/2013-10-12/
