Researching file formats 26: 3DM

This blog post is part of a series on file formats research. See this introduction post for more information. The next six formats are part of Set 3: 3D, VR and Animation! All of these formats are quite different from each other (except two closely related, which will be obvious). First up: 3DM. Rhino 3D Model file format family. Or the openNURBS 3D model? Something important to note off the top is that openNURBS is...
Read more

Researching file formats 25: Nullsoft Streaming Video

This blog post is part of a series on file formats research. See this introduction post for more information. “Support for more codecs will be added soon.” – Nullsoft, 2004 Last week was an audio codec, this week is an audio/video container. (See my training site for nuances there, if needed!) And it’s a container for streaming media! If you’ve heard of Nullsoft before, it’s probably because they made Winamp. If you’ve used Winamp… you...
Read more

Researching file formats 24: Unified Speech and Audio Coding

This blog post is part of a series on file formats research. See this introduction post for more information. Okay, this format was hard! First, the format is standardized via ISO/IEC, which means it’s expensive. Next, the specification is extremely long and technical, with lots of math. And this is my area of expertise! It was still challenging to work through. Fortunately, there was a hype train as part of getting this codec standardized and...
Read more

Researching file formats 23: Audio Definition Model

This blog post is part of a series on file formats research. See this introduction post for more information. This format is about audio, but it’s a text-based document that describes audio. You can say it defines an audio model. It is typically stored as XML, but JSON is a valid option, too. While the rest of the software engineering field goes “ick” when it hears about XML, both the cultural heritage sector and audio/video...
Read more

Researching file formats 22: Sibelius

This blog post is part of a series on file formats research. See this introduction post for more information. We are entering the Audio-Video set of formats! The next four posts will be a/v formats. I have had some very brief, non-hands-on experience with this format, because one of my college roommates was a music major and now a professional trombone player, and on a call one time I got the full low-down on how...
Read more

Researching file formats 21: bzip

Last week was gzip, this week is bzip. Or, I think, bzip2. I struggled with what to say about gzip, but bzip/bzip2 is more interesting because of PATENT PROBLEMS! bzip2 is based on its predecessor, bzip. bzip2 is far more popular; bzip isn’t really used much anymore because of some patent problems. There was this dude running around in the 90s being a jerk and suing everyone he could for a basic algorithm,...
Read more

Researching file formats 20: gzip

Am I running out of steam or is there not that much to say about gzip? I think the biggest struggle in writing about the sustainability of this format is ensuring there isn’t conflation between the format itself (compression algorithm, in this case) and the software tool of the same name that creates this format. They are tightly linked together, but different, so I had to make my research notes explicit in every case so...
Read more

Researching file formats 19: Java class file

Java configuration class file format. Might be a candidate for least appealing documentation/specification (legacy, here.) The hardest part of this format was having to explain the JVM in a way that makes sense for a website people primarily access to understand actual files they’ve come across in their preservation collections. Maybe I know too much? Maybe I know too little? It’s a challenge to comprehensibly cover a file that has the purpose of being strictly from...
Read more

Researching file formats 18: DS_Store

The subject of DS_Store seems to bring the drama. There’s something about DS_Stores that really gets people riled up. It feels like unlocking a particular trauma and people can’t help but express a lot of feelings (mostly anger) around the format. Like, it feels like every person who gets up-in-arms over the DS_Store was once a young computer user who was made fun of for checking a .DS_Store into a git repo and it...
Read more

Twenty twenty three annual report and twenty twenty four goals

It’s that time again: Annual report time. This is the 10th!!! A decade of reports! Professional accomplishments first. Through Myriad, I pitched and won a bid to research file formats for the Library of Congress. These 39 formats are the ones I am researching and XML’ing. Not a requirement of the project at all, but you can follow along with my thoughts on these formats with weekly blog posts. They’re fairly brief. I’m trying to capture...
Read more

Researching file formats 17: Shell link binary file format

Formal name: Shell Link Binary File Format. Informal name / also known as / previously known as: Microsoft Windows Shortcut. Link files are a little bit sneaky. They appear on the Windows graphical user interface (the Desktop, folders, et al) with an illustration of an arrow in the bottom left corner, but no visible file extension – Windows hides that. Also any folder that is...
Read more

Researching file formats 16: Transport Neutral Encapsulation Format

This blog post is part of a series on file formats research. See this introduction post for more information. Following up from EMLX from a few weeks ago, we have Microsoft’s special way of handling emails: TNEF! TNEF: “Transport Neutral Encapsulation Format.” TNEF is responsible for the millions of people that have been annoyed at receiving an email with an attachment that doesn’t mean anything to them: the winmail.dat attachment. Turns out, this just is...
Read more

Researching file formats 15: Groupwise MLM Format

This blog post is part of a series on file formats research. See this introduction post for more information. Akin to me consistently messing up the definition of MUA when working with vCard, obviously I’m gonna think this format stands for Multilevel Marketing instead. Oh, I also (re-)learned that MLM may also stand for “men loving men” for people who are looking for a specific genre of romance books. I discovered that when looking for...
Read more

Library of Congress Format Descriptions Visualization

Spoiler alert: If you want to browse what I came up with, you can check it out here: https://lc-sdf-data-exploration.vercel.app/ Readers of this blog will know that I’ve been working through researching 39 formats for the Library of Congress Sustainability of Digital Formats site because I’ve been blogging about it weekly since August (and that series will continue until the end of next May). I had a bit of holiday downtime, so I was thinking about the...
Read more

Researching file formats 14: Apple EMLX Format

This blog post is part of a series on file formats research. See this introduction post for more information. In typical Apple fashion, this is a variation of an open and well-adopted standard (the EML format), modified slightly just to be Apple-specific, and totally undocumented. Got a kick out of this update to a blog post from jwz that reads “Update: Please, people, I asked a very straightfoward question. I’m not interested in your guesses....
Read more