Besides being a front-end developer at Voog, I'm also an avid photographer. Over the last 15 years I've taken hundreds of thousands of shots.
Even after deleting 90% of them the same day through tough self-censorship, the remaining archive takes up thousands of gigabytes of storage. And the amount is growing fast: improving camera technology has also brought a steep increase in file sizes.
Losing even part of these photos would be a first-rate disaster. There are millions of people like me, keeping vast amounts of photos almost entirely in digital format and trying to figure out the best way to make them survive for eternity.
Here's what I've learnt along the way.
Issues with optical media
In the early '90s, when blank CDs arrived, manufacturers declared that optical media would solve all your digital data storage and archival problems. The lifetime of CDs was predicted to reach far beyond 40 years. Now, more than 20 years later, people are discovering that their precious data on those very same CDs has vanished. It seems all those proclamations were nothing more than a marketing hoax after all.
Now that quite some years have passed since blank optical media reached the masses, several real aging studies of its lifetime have been carried out. On average, about 90% of data is said to be intact on CDs after 10 years of archiving in good conditions. It must also be said that some good-quality CDs have kept their data in ideal conditions for more than 20 years. The key question is: what defines a good-quality CD?
The quality of a disc as it leaves the manufacturer's production line is not the only factor to consider. Temperature fluctuations, light and physical forces all shorten the lifetime of optical media. The transportation and storage conditions at your local merchant (including position: vertical is better than horizontal) also play a vital role, not to mention your own storage conditions. It also seems that data on a disc burned in one CD burner can outlive data on a disc from the same batch and manufacturer burned in a different burner. Change the disc manufacturer and the situation can be reversed. Burners thus prefer some media over others with no apparent reason or logic. Data on optical media is also becoming denser with the introduction of DVDs and Blu-ray discs, and manufacturing processes change on a daily basis. Any prediction about the lifetime of current optical media therefore has no real basis.
The good qualities of optical media (no possibility of accidental overwriting, immunity of the data to static electricity and magnetic fields, and the smallest probability of being stolen in a burglary) are overshadowed by the fact that nothing guarantees its lifetime. As a second or third backup solution, optical discs might still be quite feasible.
Hard discs have a hard time starting up
Magnetic storage seems like another quite feasible way of storing data. The data on the magnetic platters of hard discs seems to outlive the interfaces needed to read it: technological advances keep forcing you to update the media under your data. Floppies are practically extinct, and it is already difficult to find a computer that can read your data off IDE hard discs (not to mention SCSI). The commonly cited problems of static electricity, magnetic fields and ease of physical damage can largely be minimised with good storage conditions.
All this might seem quite OK if there weren't an enormous "but". The electric motors driving hard discs have a very high failure rate when starting up after sitting on the shelf in one position for a prolonged time of, let's say, years. They are built for constant movement, not for standing still. You could keep them powered in storage, but that would in turn open up the possibility of all kinds of electrical damage and physical wear.
Hard discs make quite a good backup solution when used in an external RAID rack protected from electrical fluctuations by a good UPS. This in turn drives the price up and still gives no full guarantee against electrical failures or burglary.
What about memory sticks, memory cards and other solid state media?
A solid state memory cell, which stores one bit of data, can be described as a battery that is either charged or not. We all know that a charged battery can go dead after some time, even if not used. The same applies to a bit stored on solid state media. The bit is said to fade after about 10 years on average, although users have reported as much as 15 years of storage without problems. Manufacturers give even their best flash products a warranty of at most 5 years. It's a safe bet that you can extend their lifetime by an order of magnitude by simply rewriting the data, that is refreshing the electrical charge, every 5 years.
The only problem is that we do not have much real-life data to back this up, and some scientific studies say oxygen in the air can affect the discharge rate of these little batteries. So more than ten years on the shelf, even if recharged on a regular basis, might still reveal some unpleasant surprises in the future. While memory sticks seem to survive washing machines and trucks driving over them, keep them away from static electricity. A balloon or synthetic cloth can easily do the damage trucks cannot.
Data stored online in the cloud actually has the highest chance of survival. Online backup providers keep copies and regularly upgrade the media under your data for you. Cost is a concern, though. There are quite good ready-made solutions for storing massive amounts of data online. If it is your second backup, where the data is rarely accessed (only when the first solution has failed), costs can be rather low too.
Amazon Glacier is one good example. You pay only $0.01 per GB per month, and adding data is free. If you want to keep costs down, there are some things to know before setting up and testing: you pay to retrieve your data, and the pricing scheme is complicated. You might be presented with an enormous bill if you download too fast or your files are very big. You might want to read the Wired article and some forum posts first. Apps are available that simplify backing up to this service.
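To get a feel for what the storage part of the bill looks like, here is a minimal Python sketch. It assumes only the $0.01 per GB per month rate quoted above; the function names and the sample archive sizes are made up for illustration, and retrieval fees are deliberately left out because their pricing scheme is more complicated.

```python
def monthly_storage_cost(archive_gb: float, price_per_gb_month: float = 0.01) -> float:
    """Storage-only cost in dollars per month.

    The default rate is the figure quoted in the text; retrieval
    fees are NOT modelled here.
    """
    if archive_gb < 0:
        raise ValueError("archive size cannot be negative")
    return archive_gb * price_per_gb_month


def yearly_storage_cost(archive_gb: float, price_per_gb_month: float = 0.01) -> float:
    """Twelve months of storage at a constant archive size."""
    return 12 * monthly_storage_cost(archive_gb, price_per_gb_month)


if __name__ == "__main__":
    # Hypothetical archive sizes, in gigabytes.
    for size_gb in (500, 2000, 5000):
        print(
            f"{size_gb} GB: "
            f"${monthly_storage_cost(size_gb):.2f} per month, "
            f"${yearly_storage_cost(size_gb):.2f} per year"
        )
```

At that rate a 2,000 GB archive comes to roughly $20 a month for storage alone. Check the provider's current price list before relying on any of these numbers.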
Online storage is not without risks either. Extreme forces of nature like tsunamis and earthquakes can destroy the provider's servers that hold your data. Having the provider store your data in multiple locations usually costs extra. In addition, economic and political decisions can render your provider non-existent faster than you can react.
The Megaupload saga is one example of this. So choosing a provider that is globally and politically well connected is a good start. We think Amazon might be one of the options to consider, as US government data is stored on its servers as well, making it less likely to be shut down without warning.
How to keep your data?
Whatever solutions you choose to keep your data, there are some basic rules:
- Never rely on one solution: have at least one additional backup of your data in a different location on different media.
- Upgrade the media under your data on a regular basis (online solutions do that for you. You should monitor their financial and
- The file formats of your data can become obsolete too. Programs available for specific formats become extinct. Updating your data from old formats to newer ones from time to time is a good idea.
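Whichever media you pick, it helps to know whether a copy is still intact before you need it. One common way to check, not something described above, is to record a checksum for every file and compare again later. The Python sketch below assumes the archive is an ordinary folder on disk; the function names are hypothetical and the code uses only the standard library.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its checksum."""
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_to_manifest(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing, changed or new since the manifest was built."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        "new": sorted(set(current) - set(manifest)),
    }
```

Build the manifest once when you write a backup, store it somewhere other than the backup itself (for example as a JSON file), and run the comparison every time you refresh or migrate the media. A file reported as changed that you never edited is a sign that the copy has started to decay.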
Digital data preservation needs constant involvement. Otherwise your grandma's old negatives have a fair chance of outliving your last holiday photos.