Technology

Fault Tolerant Archive Solutions?

Bob Washburne asks: "Does anyone know of a file system or storage protocol which allows you to recover a file even when sections of the media have become corrupt? This would be for archival storage - tapes, CD-Rs, etc. - rather than on-line. I have been doing a lot of digital restoration/preservation work: digitising home recordings from the '50s, photos from the 1890s, etc., and cleaning them up. I now have several gigs of files - and the collection will continue to grow - representing hundreds of hours of work, and I'm starting to get nervous about losing it to wear or bad media."

"There are several solutions for on-line storage: RAID, UPS, and frequent backups. As I fill a CD-R, I make several copies of it and send them to relatives who live out of state. So I am fairly well protected against local disaster.

But what happens when the CD-R itself becomes degraded - possibly from scratches or bad lamination - and cannot be read by the normal file system? Murphy's Law would guarantee that all the backup CDs were from the same bad batch, or were lost, etc. So I am left with a CD that is 90% good, but that ugly 10% prevents me from getting my file.

I remember studying N-dimensional parity and Hamming codes in Comp Sci class, so I know that it is possible to store a file with significant error-correction capabilities. But has any such scheme actually been implemented?

I would expect any such scheme to include the ability to adjust the degree of recoverability (size vs. robustness) and to be able to span volumes. Since most physical damage is contiguous, you would hope that the storage would be non-contiguous. And you would think that this would either represent a unique file system or a custom raw storage methodology usable only by the storage application.

Thanks for your insights."

This discussion has been archived. No new comments can be posted.

Fault Tolerant Archive Solutions?

  • by crow ( 16139 ) on Wednesday March 07, 2001 @10:35AM (#378426) Homepage Journal
    You could do RAID on a single disk (or, presumably, disc, if you're using a CD). Since you're assuming that most of the media will be good, you simply treat the disc as a collection of, say, ten 70MB regions, using nine (or eight) for data and the remaining one (or two) for parity, which would allow for reconstruction of any one (or two) damaged regions.

    Of course, RAID assumes that your errors are self-detecting, but I suspect that this is also true of CD media failure.

    Now the trick is to implement it. You could encode it such that it looks like a normal ISO-9660 CD, except for special "garbage" written to the last 70MB or so. You would need a special version of mkisofs, as well as special recovery tools.

    In theory, you could have mkisofs figure out exactly how much extra space you have left on the CD and use it for parity of each previous block of that size. If you have more than some threshold of free space, it could use more than one parity block for multi-block failure recovery. Then if you cram the disc to within a meg or two of being full, you are still protected from failure, but only if the failure covers a small area.

    This sounds like a good project for a data storage class.
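    The region-parity idea above can be sketched in a few lines of Python. This is a toy model, not a real mkisofs extension: the region count and contents are illustrative, and it handles a single lost region via plain XOR parity (RAID 4/5 style).

```python
# Sketch of single-disc parity: split the disc into fixed-size regions,
# keep one extra region holding the XOR of all the others.

def make_parity(regions):
    """XOR all regions together to produce one parity region."""
    parity = bytearray(len(regions[0]))
    for region in regions:
        for i, b in enumerate(region):
            parity[i] ^= b
    return bytes(parity)

def recover(regions, parity, lost_index):
    """Rebuild the region at lost_index from the survivors plus parity."""
    survivors = [r for i, r in enumerate(regions) if i != lost_index]
    return make_parity(survivors + [parity])

data = [b"aaaa", b"bbbb", b"cccc"]   # tiny stand-ins for 70MB regions
p = make_parity(data)
assert recover(data, p, 1) == b"bbbb"
```

    Two parity regions for two-region failures need a second, independent code (e.g. Reed-Solomon); plain XOR only covers one loss.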
  • by Anonymous Coward on Wednesday March 07, 2001 @11:00AM (#378427)
    Because of the relatively high failure rates of floppies (in my experience), I would always make duplicate images of an original on several disks. First off, if you're using a CD-R, buy quality recordable media - if you're using the 100-pack that you bought for $30, I'm just going to laugh at you. Make an ISO of the filesystem you want to burn, then make, say, five copies of it. You might want to check that they are all really identical (using dd to re-extract the raw ISO from each disc), otherwise this isn't going to work. Then, say you are using an archived CD and it's failing while reading a particular file: you can dd the image from the CD (turn off "terminate on read error", and make sure it puts empty blocks in where it wasn't able to read the data). Just record the data sectors that it couldn't read, and splice them in from one of your copies (all blocks are identical; use dd to extract the ones you need). Rewrite the image to a new disc, and you're done. "But that seems like an awful lot of work" - and you're right. Chances are that at least one of your five or so discs will work "out of the box" with no splicing needed, but what happens when all of them have issues? A redundant filesystem is a great idea - adding parity so that it "just works" when blocks go bad. But if you're using CD-Rs and want an ISO 9660 filesystem, you get the ISO 9660 filesystem that doesn't have the parity. The method I outlined above works: I've used it on floppies, I know others who have used it on DAT tapes, and it will even fit in with the "make lots of copies and send them to relatives" approach you already have.
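    The splicing step above can be sketched in Python. This assumes, as in the dd workflow described, that unreadable sectors came back zero-filled and that you logged which sector numbers failed on each disc; the 2048-byte sector size is the ISO 9660 logical sector.

```python
SECTOR = 2048  # ISO 9660 logical sector size

def splice(images, bad_sectors):
    """Rebuild an image from several degraded copies of the same disc.
    images: list of equal-length byte strings (one raw dump per disc).
    bad_sectors: bad_sectors[i] is the set of unreadable sector numbers
    on disc i. Raises if some sector is unreadable on every copy."""
    out = bytearray(len(images[0]))
    n_sectors = len(images[0]) // SECTOR
    for s in range(n_sectors):
        for img, bad in zip(images, bad_sectors):
            if s not in bad:
                # take this sector from the first copy that read it cleanly
                out[s * SECTOR:(s + 1) * SECTOR] = img[s * SECTOR:(s + 1) * SECTOR]
                break
        else:
            raise IOError(f"sector {s} unreadable on every copy")
    return bytes(out)
```

    Each position only needs one good copy among the whole set, which is why this works even when every individual disc has errors.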
  • The only way to be sure you won't lose your data some day under a bizarre set of circumstances is to have infinite copies of it - unfortunately this costs infinite money and takes infinite time to accomplish; just a small implementation problem.

    I currently work for a DoD contractor on the east coast of Florida, and we tend to worry about hurricanes, brush fires, and aircraft that might miss the runway and land on the building. We use DLT and other mature media with a 30+ year life. The general idea is that each quarter we make an offsite copy that goes to another facility and sits in a vault. Whenever there is a threat (fire, hurricane, etc.) we pull the most recent full backups and fly them out of town on a plane to some other location, to be stored just in case. I am dealing with backups on the order of 1TB per week of fulls, plus 40GB of incrementals on top of that, or I would make a second copy more often; however, management has made a cost/risk decision that quarterly is often enough to make the offsite copy.

    At my last job (across the street so same concerns) we had a rotation that sent a copy of last weekend's tapes to another building 5 miles away, the tapes at that location went to Orlando, the tapes in Orlando went to Harrisburg, PA, and the tapes in PA came back home and were recycled the next week. This gave us 5 weeks of tapes in multiple locations around the country at any given time.

    The bottom line is this ain't cheap, and you need to determine what the data is worth vs. how much you are willing to spend ($ and time) to protect it, and arrive at an acceptable level of protection.
  • by Christopher Thomas ( 11717 ) on Wednesday March 07, 2001 @11:12AM (#378429)
    My own system is the following. It isn't perfect, but it's pretty robust:

    • Use a RAID 1 (drive mirroring) for day-to-day data storage.
    • Back things up on to CDROM.
    • Burn two (or more) copies of everything per CD (just copy the directory tree two or more times, so that the copies are widely separated physically).
    • Burn multiple copies of each backup CD, and store them in different locations (ideally different buildings).


    I'm blithely pretending that CDs will last forever. If they don't, then I should check the integrity of all of my backup CDs every couple of years, and copy the data from failing sets to new sets. This involves doing something like calculating a CRC code for all files in the archive, and storing a copy of this with each copy of the backup tree.

    I also keep a paper copy of anything really important (that will fit on paper, at least).

    There's also an active system that I'm interested in trying, but it would have to be continually maintained (CDs and tapes can be left unattended for years, if they're stored well). The active system would be a bunch of servers with RAID drives that stored the files to be preserved, along with CRC information for the files. These servers actively mirror content from each other, trying to each keep a complete set of the data (updates propagate through the mirroring network). They'd also perform integrity checks on their own data and data from other servers (let the other server know that its data differs from the local copy). As long as the servers are maintained and swapped out when they fail, the data should be preserved intact forever.

    The catch is that, while you could in principle preserve storage media for a century, I wouldn't want to bet on a server network being maintained (in whatever form) for a century.
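    The per-file integrity check described above (a CRC for every file, stored with each copy of the backup tree) might look like the following sketch; the manifest layout and file names are illustrative, not a standard tool.

```python
# Build and verify a CRC-32 manifest for an archive tree.
import os
import zlib

def crc_file(path):
    """CRC-32 of a file, read in chunks so large files are fine."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

def make_manifest(root):
    """Map relative path -> CRC-32 for every file under root."""
    manifest = {}
    for dirpath, _, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = crc_file(full)
    return manifest

def verify(root, manifest):
    """Return the list of files whose CRC no longer matches."""
    return [p for p, crc in manifest.items()
            if crc_file(os.path.join(root, p)) != crc]
```

    Run `verify` against each backup copy every couple of years; any file it reports is the one to re-copy from a sibling disc.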
  • The question is about fault tolerant archival. Tapes are a backup mechanism, yes, but not a good archival mechanism. The reason CD-R is mentioned is because the shelf-life of a CD-R is many times that of tape. Tape degrades, and fast.
  • Thank you, excellent comments.

    1) While I have had unrecoverable errors with fingerprints, I will take your word on the scratches. Besides, both can be cleaned/polished to some degree.

    2) Exactly! It is the media durability which scares me. Good today, but crap next decade - long after I've removed the files from on-line storage.

    3) Which leads me to another favorite lecture of mine - the media needs to last as long as the drive, but not much longer. Digital storage must move from standard to standard or be lost. How much longer before my eight inch floppies are useless?

    I suspect that 30 years is sufficient for CD-Rs. They are compatible with the next generation, DVD, but probably not with the generation after that. Jumping generations seems about right and should average about 20 years - assuming that we stay with "consumer" media. And that I avoid such wildly popular formats as the 8-track tape ;-).

    Thanks again,

    Bob Washburne

  • See any major corporations using CD-Rs for backups lately? The big guys use tapes for backup; they have been proven reliable for years. I'd suggest using tapes instead of CD-Rs for backups. If data integrity is paramount, it's worth the extra thousand or so for the tape drive. Plus, you can reuse your tapes in a backup cycle! (lowering media cost somewhat)

    Still, the other ideas suggested like multiple copies of backups are a good idea too, and when used with tapes make your solution even more effective.
  • This is a fruitless exercise, because CD-ROMs already contain parity blocks, so that when you smudge a fingerprint on one, you can still read the disc. Adding additional parity won't solve his problem anyway if the disc rots; it rots uniformly.
  • by micromoog ( 206608 ) on Wednesday March 07, 2001 @12:33PM (#378434)
    The CD-R format has significant error correction built in. Many of the CDs you have may already have suffered considerable damage, but still work because of the error correction.

    More info: geeky [brighton.ac.uk], geekier [cdpage.com], geekiest [washington.edu]. An interesting tidbit is that the data is interleaved serially, meaning the data and the parity codes are spread across wide arcs of the disc. That's why it's recommended to clean discs from the center outward, not around the disc (so if you do scratch it, you damage unrelated segments).

    So, I think the idea of duplicating your CD-Rs and sending them to your relatives is a good one. For more fault tolerance, just send some more copies to some more relatives.
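    A toy demonstration of why that interleaving helps: a contiguous scratch in the written stream turns into widely scattered single-byte errors after de-interleaving, so each small codeword sees at most one error. This uses a simple stride interleaver for illustration; the real CD scheme (CIRC, cross-interleaved Reed-Solomon) is considerably more elaborate.

```python
def interleave(data, depth):
    """Lay data out as `depth` strided rows, written one row after another."""
    return b"".join(data[r::depth] for r in range(depth))

def deinterleave(data, depth):
    """Inverse of interleave (len(data) must be divisible by depth)."""
    cols = len(data) // depth
    out = bytearray(len(data))
    for r in range(depth):
        out[r::depth] = data[r * cols:(r + 1) * cols]
    return bytes(out)

data = bytes(range(48))
depth = 8
burst = bytearray(interleave(data, depth))
for i in range(10, 14):          # a 4-byte contiguous "scratch"
    burst[i] ^= 0xFF
restored = deinterleave(bytes(burst), depth)
errors = [i for i in range(48) if restored[i] != data[i]]
# The burst lands in four different depth-sized blocks, one byte each.
```

    With one error per codeword, a code that corrects a single byte (as each small block's ECC can) repairs the whole scratch.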

  • Actually, you have it backward. I don't know where you get your information from (hopefully not experience), but tape and pressed CDs have approximately the same life (10-20 years).
  • Couple of things...

    First of all, an experiment:
    Take a CD / CDR with known good data. Take out your keys. Carve a good scratch into the data surface. Put it in the drive and check the integrity of your data. Yup, it's still there (probably). CDs already incorporate ECC (error correcting codes) similar to RAID.

    The real question in my mind is the time durability of your media. People make different claims about CD-Rs, but you can probably count on a minimum of 30 years if you store these things in a box in the dark (the dyes are photo sensitive).

    Magnetic media, on the other hand, tends to have a much shorter life span. Think about where you're going to find a tape drive in 20 years. You're much more likely to find something that reads CDs, because they're a consumer technology (and because CD-ROMs can have virtually indefinite shelf lives).

    Stick with CD-R / DVD-R, but think about your migration strategies. And you can't beat RAID for on-line storage.
  • The lifetime of CD-ROMs is unknown, as the discs have not been in existence long enough for us to study and understand the mechanisms by which they wear out.

    It is often possible to predict the lifetime of a product (or a key component of a product) by subjecting it to accelerated life testing--that is, by increasing the stress on the component until it fails. It is not clear, however, which stresses will lead to failure, or if increasing the stress accurately predicts what happens at the end of a component's life.

    The disc of a CD-ROM is made of polycarbonate plastic and an encapsulated thin, reflective layer of aluminum. The digital information on the disc is imprinted in that aluminum layer. There are a number of possible wear-out mechanisms that could damage or destroy the information on a compact disc. Ultraviolet light can alter the optical properties of the polycarbonate plastic; cold flow of the plastic could lead to mechanical distortion of the disc; and oxidation could impair the readability of the aluminum reflective layer.

    Practically speaking, the most likely wear-out mechanism for CD-ROMs will be the changing technology of data storage. Long before the disc itself becomes unreadable, it is likely that the CD-ROM will be replaced by a new medium and that it will not be possible to find a CD-ROM reader, except perhaps in a museum.

  • by Tower ( 37395 ) on Wednesday March 07, 2001 @12:11PM (#378438)
    Another reason is that DLT drives and DAT autoloaders have *vastly* larger capacities than a CD-R. A small DAT autoloader magazine can hold six 12/24GB DDS-3 tapes. At only a couple of bucks per tape, that's dirt cheap, and you can reasonably store 72-120GB per magazine load. Less changing, more automation == nice.

    DLT drives are great, but definitely toward the high end of the scale. A basic DDS-3 DAT will only set you back a few hundred, and gives you a lot of room to work with.

  • Thank you very much. This is the best suggestion I have seen so far.

    Basically, what you have described is a manually implemented RAID 1, but one that uses well-supported standards and software. Just read the good bits from each copy and piece them back together.

    Simple, elegant, I love it.

    Thanks again,

    Bob Washburne

  • OK, kids - did some more research. Here are the goodies...

    I did some looking around, and one of the better RAID levels for you would be level 2. O'Reilly defines it as "Data are spanned across multiple disks, and additional disks are used to store Hamming codes (to detect and correct errors or recover from failed drives). Four data disks would require three additional error detection and correction disks." They go on to say that it offers the greatest redundancy but is not commercially available because of the high cost.
    But we don't care about level 2, since level six [acnc.com] is even better. By using two different dimensions of parity, you can lose more than one drive/disc and still be able to access the data. So it's pretty much RAID 5 with an extra set of parity. Obviously, the more disks you have in a set the better, the minimum being four. You can rest a lot easier knowing that, with the minimum four-disc set, even if 50% of your discs go bad you'll be OK.

    More RAID level 6 info [storagereview.com]
    The difference between RAID 3 and RAID 5 is where the parity information is stored. With RAID 3, the parity is stored on a dedicated disk; with RAID 5, that information is spread over all the disks. Which is better depends on what you're storing: large data files such as graphics/image files get better performance on RAID 3, while smaller files do better on RAID 5.

    Oh, and I messed up in my previous comment - I meant RAID 1, not RAID 0. 0 is striping over two disks; 1 is mirroring.

    So you're wondering: all this theoretical info, and no practical tools to actually use it?

    Well, it appears that software RAID under Linux currently doesn't support RAID 6, but some enterprising hacker could certainly put it in (I'd assume) - IANAPY (I am not a programmer yet). But RAID 5 gives us a 33% acceptable failure rate; not quite 50%, but nothing to sneer at.

    I did find some references to sites with a /kernel/2.2.16-1-RAID/modules/CDROM - so I'm assuming somebody uses RAID+CDROM currently.

    I'll let y'all know if I find something more concrete.
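    The Hamming codes behind RAID 2 mentioned above can be shown concretely with the standard (7,4) code: four data bits become seven stored bits, and any single flipped bit can be located and corrected. This is a minimal sketch of the code itself, not how a RAID 2 controller lays bits across drives.

```python
def hamming74_encode(d):
    """d: four data bits. Returns seven bits with parity at positions 1, 2, 4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct up to one flipped bit and return the four data bits."""
    c = c[:]
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]
```

    In RAID 2 each bit position lives on its own drive, so "one flipped bit" corresponds to one failed drive; that is why four data disks need three check disks.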

  • Oh yeah - put your valuable data on tapes. And save money!

    Nah. I don't think so. While interning at a major telecom firm (think PacBell), a user (internal) accidentally deleted a document from the server and asked me to recover it from backup. I went through an entire case of backup tapes and couldn't recover even part of the data. They let me go, used the savings to buy a better tape drive, then re-hired me a month ago. Mind you, that "better" tape drive still didn't always make tapes that worked.

    Moral of the story? Don't work for people who take their employees so lightly. And stay away from tape drives. Eventually something will go wrong with the drive or the tapes and you'll lose data. And since the drives are much less common than CD-ROM drives, you'll probably be SOL if the drive goes to that big /dev/null in the sky.

    My advice? There has to be some way to do RAID 5+1 with CD-R drives. That's right - parity and mirroring. Sure, you'll need to do it with SCSI to get the six drives on-line at once. And for those six CD-Rs you'll only get about 1.3GB. But you'll end up with two sets of three discs, of which you only need two different discs to recover.

    PS: If you get this to work - let me know. It's just geeky enough that I'd kill to be able to brag about giving someone the idea.
  • You might use this [www.sci.fi] to get extra error correction. Store the ECC on another CD.

    Apply it to the ISO filesystem you are creating; then, when the CD gets damaged, read everything you can into a file and use the ECC to reconstruct the original ISO image.

    The only problem is that this kind of error correction is for bit changes, and normally, when you have a broken CD, you can't read part of it at all. So you have to insert the missing bits into the ISO image (plain zeros) at the correct places before applying the error correction. Don't ask me how - dd conv=noerror will give you all the data it can get, but with no idea where something is missing.

    One solution might be writing structured data, where you can find the block numbers from the data itself. A CD doesn't have to contain an ISO filesystem, or any filesystem at all. Just create your own data format and write it to the CD. And tell us when you have a solution available!
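    A sketch of that "own data format" idea: frame each payload with a magic number, a sequence number, and a CRC, so every sector that does read back can be identified and placed even if the filesystem structures are gone. The magic value, field sizes, and 2048-byte framing here are made up for illustration.

```python
import struct
import zlib

MAGIC = b"ARC1"
PAYLOAD = 2036   # 2048-byte sector minus 8 bytes of header and 4 of CRC

def frame(seq, payload):
    """magic(4) + seq(4) + zero-padded payload + crc32(4) = one 2048-byte sector."""
    body = MAGIC + struct.pack(">I", seq) + payload.ljust(PAYLOAD, b"\0")
    return body + struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF)

def scan(raw):
    """Recover {seq: payload} from a raw dump, skipping corrupted sectors."""
    good = {}
    for off in range(0, len(raw) - 2047, 2048):
        sector = raw[off:off + 2048]
        body, (crc,) = sector[:-4], struct.unpack(">I", sector[-4:])
        if body[:4] == MAGIC and zlib.crc32(body) & 0xFFFFFFFF == crc:
            seq = struct.unpack(">I", body[4:8])[0]
            good[seq] = body[8:]
    return good
```

    Because each sector says where it belongs, the holes left by unreadable sectors are known exactly, which is precisely what the outer error correction needs.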

  • Not that it would be cheap, but have you thought of using several DLT drives in a RAID configuration of some sort?

    I know I have heard of mirroring and parity systems using multiple tape drives running in parallel - I am not sure how reliable they are/were - but it should be possible (if expensive).

    Tape is actually very robust, and lasts a long time. There is a reason it is used so much in the industry.

    I guess another possibility would be to find a card punch and a large cache of cards and... oh nevermind...
