Researching file formats 0: Introduction (to the series)

In June, I started working on a project that will run for the next year – it’s helping the Library of Congress update their Sustainability of Digital Formats page with around 40 new file formats! You can see the list of 2023-2024 formats here – those are the formats I’ll be researching.

There’s an announcement in this general updates post from the Library of Congress and another announcement blog post from Myriad.

Doing this research is a huge thrill for me!! Most of my time is going straight into the formats research, with a little time spent answering clarifying questions from my teammates and also doing the XML conversion required for the project. Since I’m spending a lot of time with each format, I thought it’d be nice to document some of the things I’m learning along the way that are … well, things that just don’t fit into a standard open government document, like my opinions and weird discoveries. So I’m going to do that here on my blog!

This blog series represents my views and my views only. They do not reflect the views of any of my colleagues, and definitely not the views of any companies/organizations, because those don’t have feelings anyway.

(I started drafting the first several posts and some of them have at least started off a little fiery so it seems appropriate to say that in advance.)

I realize there’s a chance you’re coming across this Sustainability of Digital Formats page for the first time via this blog post. Well, it’s a great and continually updated resource – don’t let the vintage web design fool you! I like to review the entire list. I like this resource not just as a way to understand the assessed preservation risks for individual formats, but also as a general way to learn how a format works and what its history is. It also documents relationships with other formats, which is essential for anyone just learning that a “format” like a video file can actually be at least three formats rolled up into one (the container, the audio codec, the video codec).

This blog post will (eventually) serve as a canonical list of all of the entries in order (and you can always visit the tags section of my blog).

We are not working in a set order, so the posts won’t follow the order of the published list. This project is running from June 2023 until May 2024. I am aiming to post roughly once a week.

Here’s a tip for following along: click the little satellite on this blog’s sidebar (or click that link right there) if you would like an RSS feed and want to stay up to date that way! Otherwise, I will be posting weekly notifications on Mastodon.

P.S. If you like this blog series, you may also be interested in Tyler Thorsted’s blog, where he does a File Formats Friday series!