05 Mar 2018

Saving your data from oblivion is getting easier

Files are easier to preserve than tape. As long as you make enough copies! (Shutterstock)

Preserving our footage and data in the long term is a big concern. But is it a genuine concern, or is data preservation actually becoming easier? 

This article was prompted by a conversation about the obsolescence of older software, which can make it difficult to access old material. Modern file-based formats can require esoteric mathematics before there is a viewable image, unlike 35mm film, which you can simply hold up and look at. But files last longer and can be duplicated.

Well, the recent re-release of Friends on Netflix attests that there are reasons to chase old material. To some people, Friends still seems fairly current and, like other successful remastering efforts, its materials were probably in pretty good order to begin with. Restoring old material should be a more or less push-button operation, given properly annotated original camera negatives and edit decision lists. The titles will need redoing, but on Friends that probably wasn't a massive task. On the remaster of Star Trek: The Next Generation, the VFX workload would have been bigger, but at least it had all been shot in 35mm – a format that is still easily readable.

CineSave uses black-and-white film to store a 2D barcode and, optionally, analogue images alongside to cover all the bases

Even if every piece of film-handling equipment were broken, reading film would not be an insurmountable problem. Even if we'd completely forgotten what film is or what it is for, it wouldn't be particularly difficult to figure out what it represents. The CineSave system, shown at IBC a few years ago (but barely heard of since), imaged digital information onto film as a two-dimensional barcode. It was designed to leverage both the longevity of black-and-white 35mm film and the fact that it's relatively easy to build a device to photograph the frames. The company showed a film scanner based on a DSLR, which demonstrated that a basic arrangement can now be built reasonably easily. Film is a highly recoverable format.
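To make the barcode idea concrete, here is a minimal Python sketch of the general principle: lay the bits of a file out as a grid of black and white cells, then read them back. It is purely illustrative and is not CineSave's actual format; the function names and the 16-cell row width are assumptions, and a real system would add alignment marks and error correction so that scratches and dust don't destroy data.

```python
def bytes_to_grid(data: bytes, width: int = 16) -> list[list[int]]:
    """Lay out the bits of `data` row by row in a grid `width` cells wide."""
    # Most significant bit first, so each byte reads left to right.
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    # Pad the final row with zeros so every row has the same width.
    bits += [0] * (-len(bits) % width)
    return [bits[i:i + width] for i in range(0, len(bits), width)]


def grid_to_bytes(grid: list[list[int]], length: int) -> bytes:
    """Read the grid back row by row and rebuild the first `length` bytes."""
    bits = [cell for row in grid for cell in row]
    out = bytearray()
    for start in range(0, length * 8, 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def render(grid: list[list[int]]) -> str:
    """Draw the grid as text: '#' for an exposed cell, '.' for a clear one."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in grid)


if __name__ == "__main__":
    frame = bytes_to_grid(b"film")
    print(render(frame))
    assert grid_to_bytes(frame, 4) == b"film"
```

The point of the exercise is that the decoder needs nothing more than a photograph of the frame and a rule for reading cells in order, which is why this kind of record is so recoverable.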

Tapes themselves are likely to last reasonably well - but will there be anything to play them back?

Conversely, over much of the last few decades, we've generated a lot of material that might not be so easy to get at. Film requires pretty careful storage to achieve the sort of lifespans that are often discussed, but physical survival of the media is only part of the problem. There have been dozens of videotape formats over the years, and our best data storage systems are also tape-based. So, regardless of whether we're talking about data or video, and assuming the media survives, the real issue is finding something to play it back on. One could argue that we've gone from film, which is fairly easy to understand and read, to tape, which requires complicated precision mechatronics and somewhat complex decoding.

With very early video formats, the pattern of magnetism on the tape more or less represents the brightness graph of a series of lines through a picture, modulated onto a carrier frequency. That's the easy stuff. Add composite colour, digital compression or file-based encoding and it's far from straightforward to figure out what on earth this stuff is supposed to represent, let alone build a device to recover it.

File-based media needs to be stored on something, but at least that can be any digital storage system

There are two layers of technology required for recovery: first, we have to read the media; second, we have to decode what we've read. The first is likely to be easier for files than for video tapes on the basis that there's now only one preeminent files-on-tape system, LTO, and that system was developed specifically to ensure that future archives would be mutually compatible. There was a period up to the late nineties when data tape formats were hugely incompatible, and that will create trouble, but from then on things should be reasonably regular, and there is some hope that archival LTO decks will remain available for a while.

Even if they don't, it's far easier to clone files than anything else. Analogue formats can't be cloned without generational loss, while digital videotapes sometimes can be, but usually only onto another identical medium. Long-term storage of digital video often involves turning it into files – and once something is a file, it can be kept alive for as long as there are binary information storage formats, which is likely to be a very, very long time. Data is, on paper, far easier to keep around than any previous format.
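What "cloning" means in practice is a copy that can be proven identical to the original. As a rough sketch, assuming nothing more than Python's standard library, the following copies a file and compares SHA-256 checksums of source and destination. The function names are hypothetical, and a real archive workflow would also store the checksum alongside the copy so it can be rechecked years later.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clone_and_verify(source: Path, destination: Path) -> str:
    """Copy `source` to `destination` and confirm the copy is bit-identical.

    Returns the checksum so it can be logged with the archive copy.
    """
    expected = sha256_of(source)
    shutil.copyfile(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise IOError(f"copy of {source} does not match the original")
    return expected
```

Unlike a tape dub, this check either passes or fails; there is no such thing as a slightly worse copy.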

LTO decks are likely to endure because people will need them

And once it's a file, the decoding process is much better documented. The decoding mathematics for Digital Betacam was entirely locked into the tape decks. The mathematics for ProRes is publicly documented and will continue to exist for as long as someone – anyone – has a copy of it. Codecs such as Cinepak, used for video on CD-ROMs in the early 1990s, remain recoverable with modern software. Even CDXL video, one of the earliest CD-ROM video formats, developed by Commodore in the late eighties, is understood by free code and can, therefore, be transcoded directly into a modern post production format.

Media files may be fairly recoverable for a long time. Project files that use them may be a bit more difficult

Code such as the free software tool FFmpeg, with its enormous multi-format file reading capability, is unlikely to disappear. Millions of copies of it exist worldwide. Yes, there are still concerns about the usability of old project files for various types of software, even when software such as Premiere increasingly uses XML to describe its timelines. The file may be somewhat understandable, but recreating the application that interprets it is a huge task. Footage, though, is reasonably safe as long as we keep duplicating it.
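For a sense of how little effort rescuing an old file can take, here is a hedged Python sketch that builds and runs an FFmpeg command line to transcode a legacy clip into ProRes 422 HQ. It assumes an `ffmpeg` binary is installed and on the path, and that the build includes a decoder for the source format; the function names and file names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_transcode_command(source: Path, destination: Path) -> list[str]:
    """Build an FFmpeg command that rewraps legacy footage as ProRes 422 HQ."""
    return [
        "ffmpeg",
        "-n",                  # never overwrite an existing output file
        "-i", str(source),     # FFmpeg probes the legacy container and codec
        "-c:v", "prores_ks",   # one of FFmpeg's ProRes encoders
        "-profile:v", "3",     # profile 3 is ProRes 422 HQ
        "-c:a", "pcm_s16le",   # uncompressed 16-bit audio, if any is present
        str(destination),
    ]


def transcode(source: Path, destination: Path) -> None:
    """Run the transcode; raises CalledProcessError if FFmpeg reports failure."""
    subprocess.run(build_transcode_command(source, destination), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_transcode_command(Path("old_clip.cdxl"), Path("old_clip.mov"))))
```

Keeping the command construction separate from the call that runs it means the same recipe can be logged next to the archive copy, which is worth doing when the whole point is that someone else may need to repeat it decades later.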

But will people do that? Content owners are notorious for their failure to properly fund archiving and preservation efforts, which is madness in a business where the product is completely intangible. There's no reason to assume that simply because files can be cloned, they will be cloned. Still, it's easier to clone modern media than ever before.

Title image courtesy of Shutterstock.


Phil Rhodes

Phil Rhodes is a Cinematographer, Technologist, Writer and above all Communicator. Never afraid to speak his mind, and always worth listening to, he's a frequent contributor to RedShark.