Researching file formats 24: Unified Speech and Audio Coding

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Unified Speech and Audio Coding. Comments welcome directly to the Library of Congress. Okay, this format was hard! First, the format is standardized via ISO/IEC, which means it’s expensive. Next, the specification is extremely long and technical, with lots of math. And this is my area of...

Researching file formats 23: Audio Definition Model

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Audio Definition Model. Comments welcome directly to the Library of Congress. This format is about audio, but it’s a text-based document that describes audio. You can say it defines an audio model. It is typically stored as XML, but JSON is a valid option, too. While the...

Researching file formats 22: Sibelius

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Sibelius Music Notation Format. Comments welcome directly to the Library of Congress. We are entering the Audio-Video set of formats! The next four posts will be a/v formats. I have had some very brief, non-hands-on experience with this format, because one of my college roommates was a...

Researching file formats 21: bzip

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: bzip2. Comments welcome directly to the Library of Congress. Last week was gzip, this week is bzip. Or, I think, bzip2. I struggled with what to say about gzip, but bzip/bzip2 is more interesting because of PATENT PROBLEMS! bzip2 is based off of its predecessor bzip. bzip2...

Researching file formats 20: gzip

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: GZIP. Comments welcome directly to the Library of Congress. Am I running out of steam or is there not that much to say about gzip? I think the biggest struggle in writing about the sustainability of this format is ensuring there isn’t conflation between the format itself...

Researching file formats 19: Java class file

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Java Virtual Machine Class File Format. Comments welcome directly to the Library of Congress. Java configuration class file format. Might be candidate for least appealing documentation/specification (legacy, here.) The hardest part of this format was having to explain the JVM in a way that makes sense for...

Researching file formats 18: DS_Store

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Desktop Services Store. Comments welcome directly to the Library of Congress. The subject of DS_Store seems to bring the drama. There’s something about DS_Stores that really gets people riled up. It feels like unlocking a particular trauma and people can’t help but express a lot of feelings...

Twenty twenty three annual report and twenty twenty four goals

It’s that time again: Annual report time. This is the 10th!!! A decade of reports! Professional accomplishments first: Through Myriad, I pitched and won a bid researching file formats for the Library of Congress. These 39 formats are the ones I am researching and XML’ing. Not a requirement of the project at all, but you can follow along with my thoughts on these formats with weekly blog posts. They’re fairly brief. I’m trying to capture...

Researching file formats 17: Shell link binary file format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Microsoft Windows Shortcut File. Comments welcome directly to the Library of Congress. Shell Link Binary File Format. Formal name: Shell Link Binary File Format. Informal name / also known as / previously known as: Microsoft Windows Shortcut. Link files are a little bit sneaky. They appear...

Researching file formats 16: Transport Neutral Encapsulation Format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Transport Neutral Encapsulation Format. Comments welcome directly to the Library of Congress. Following up from EMLX from a few weeks ago, we have Microsoft’s special way of handling emails: TNEF! TNEF: “Transport Neutral Encapsulation Format.” TNEF is responsible for the millions of people who have been annoyed...

Researching file formats 15: Groupwise MLM Format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: GroupWise Email Format. Comments welcome directly to the Library of Congress. Akin to me consistently messing up the definition of MUA when working with vCard, obviously I’m gonna think this format stands for Multilevel Marketing instead. Oh, I also (re-)learned that MLM may also stand for “men...

Library of Congress Format Descriptions Visualization

Spoiler alert: If you want to browse what I came up with, you can check it out here: https://lc-sdf-data-exploration.vercel.app/ Readers of this blog will know that I’ve been working through researching 39 formats for the Library of Congress Sustainability of Digital Formats site because I’ve been blogging about it weekly since August (and that series will continue until end of next May). I had a bit of holiday downtime, so I was thinking about the...

Researching file formats 14: Apple EMLX Format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Apple Mail Email Format. Comments welcome directly to the Library of Congress. In typical Apple fashion, this is a variation of an open and well-adopted standard (the EML format), modified slightly just to be Apple-specific, and totally undocumented. Got a kick out of this update to a...

Researching file formats 13: vCard (virtual business cards)

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Virtual Card Format (vCard). Comments welcome directly to the Library of Congress. vCard or VCF: Virtual Card Format? Virtual Contact File? vCard File? Sources are not consistent with this. This format has a lot of official specifications and extensions, lots of updated versions during the standardization process,...

Researching file formats 12: Kryoflux raw disk image format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: KryoFlux Stream File. Comments welcome directly to the Library of Congress. One of the things I know about Kryoflux is it has a bad reputation in multiple ways. ArchiveTeam has a strongly-worded blurb about concerns over the licensing agreement. Working on this format had me thinking a...