What is Your Backup Policy?

Posted by Cliff on Wed May 31, 2006 08:40 PM
from the never-go-too-long-without-one dept.
higuita asks: "A few days ago, I was asked to check our backup policy, how it is being applied, and to try to make it safer and more useful. Being new to the company, I started to check what is being done right now and found several problems. Since I don't have much experience with enterprise backups, what are the most used backup policies, software and global ideas about this issue? We have fewer than 1000 workstations (Windows and Macs), about 20 Oracle and Exchange servers (split between Windows, Solaris, and Linux), and it all needs to be backed up. Right now, we use HP Data Protector with several tapes, where most things have a weekly full backup and daily incremental backups, and most full backups are archived permanently in a safe we have for this purpose. We also have off-site storage for backups. What practices and policies do Slashdot users implement for backups they perform at their office (I am not interested in home backup practices)?"
"I've investigated Veritas NetBackup and other solutions, and I'm also curious if Amanda could be better than or at least approximate the features offered by HP Data Protector. What backup software have you used that you found enjoyable with the least bit of hassle?

I've thought about using Dirvish to back up the users' homes to a cheap server with several HDs, and only back up to tapes once every 15 days or even once a month. They will lose their Windows permissions, but I don't think that matters much, since this is just for safekeeping the users' work. I thought about making full backups of the servers every 15 days with daily incremental backups. This way I will free up the tape drives' time and gain more flexibility with the backup schedule.

I would love it if users worked off of file servers, but right now this just isn't possible. It's a planned addition that we still don't have the time to make."
Related Stories

[+] Linux: Amanda 2.5 Released 155 comments
Anonymous Coward writes to tell us that a new release of the popular open source backup tool Amanda is now available fixing many of the limitations of previous versions. From the release: "Overall the focus of the release is on security of the backup process & backed up data, scalability of the backup process and ease of installation & configuration of Amanda."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by georgewilliamherbert (211790) on Wednesday May 31 2006, @08:45PM (#15441085)
    For that many systems, use a professional, enterprise grade, commercial solution. The open source stuff doesn't supply the same manageability.

    AND FOR GOD'S SAKE, REGULARLY VERIFY THAT YOU CAN READ THE TAPES BACK... More sites have been screwed by backup tapes that weren't readable than by any other failure mode. Verifying every tape is best. Second best is every weekly. Random samples, but covering every single drive's tape output at least once a month, are a poor third place.

    The two obvious software suggestions are Veritas/Symantec NetBackup and Legato Networker.

    Weekly fulls and daily incrementals are good. Your offsite schedule should be checked to ensure that you have a relatively recent restore point both onsite (in case of data loss) and offsite (in case of building loss).

    In terms of offsites, having a prepared plan for where and how to restore (Disaster Recovery and Business Continuity) is also important. But those all start with "Go get the tapes...".
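
The read-back check can be sketched roughly as follows. This is only a minimal illustration, assuming the backup can be test-restored to an ordinary directory tree; verifying real tapes would go through the backup product's own verify or restore commands. The function names `file_digest`, `tree_digests` and `verify_restore` are hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its digest."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = file_digest(full)
    return result


def verify_restore(source_root, restored_root):
    """Return sorted relative paths that are missing or differ after a test restore."""
    source = tree_digests(source_root)
    restored = tree_digests(restored_root)
    return sorted(
        path for path, digest in source.items() if restored.get(path) != digest
    )
```

An empty list from `verify_restore` means every source file came back byte-identical; anything else names the files to investigate before the backup is trusted.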
    • I've used Amanda, Bacula, NetBackup, Networker and by far the best of the bunch for enterprise size networks is TSM. Easily. NetBackup is something I still have cold sweats and nightmares about, ok, not quite nightmares, just the occasional cold sweat. It's really a small network system which has been kludged to "enterprise" class. TSM was designed for managing large network backups from the start.

       
      • BMR has been standard for years.

        I've seen attempts to build large enterprise backup environments with "simple open" software. They melt down somewhat short of the size that the original questioner is asking about, typically.

        I've built environments with NBU and used Legato, at large sites (much larger than the original questioner). They just work. Configuring them initially can be non-trivial if you have no prior experience with them, but once set up right they just work.

        Throwing a bunch of open source te
          • by georgewilliamherbert (211790) on Thursday June 01 2006, @02:45AM (#15442786)
            I use plenty of stuff for which I have the source code. Going back to the 4.2mumble BSDs, through SunOS, Linux, Solaris, the various x86 BSDs, and plenty of applications (this is Mozilla I'm /.ing with, and before that a long line of other open source browsers). I have no problem with installing large Linux farms, using Apache for an enterprise web deployment, using MySQL for moderate sized databases (or PostgreSQL, though I haven't deployed it personally).

            Tape backup... NBU wins. Legato's a close second. Sorry, charlie. Open source as a category does not suck. The open source backup stuff doesn't suck, for small to medium sized sites. It's not enterprise class, though, and most of the trick to succeeding in IT is knowing when the tools you use aren't applicable anymore and how to figure out what are.

            NBU can't RAIT, but it can stream across multiple tapes, and can write duplicate tapes if you want redundancy. And you can extract the files off tape with tar if you have to.

            Amanda certainly doesn't suck, but it's not NBU.
      • Try TSM. DR is one of its strongest suits!

        It's really pretty darned incredible. One command, and your TSM environment is rebuilt. We use the DR capabilities multiple times per year. Works great.
  • don't make the mistake that one guy did
    the office was in the North Tower --- The "offsite backup" was in the South Tower

    oops
    i would suggest minimum different zip codes different time zones would be best
    other than that, Grandfather > Father > Son; the grandfather set gets sent offsite
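
For readers unfamiliar with the Grandfather > Father > Son rotation, here is a rough sketch. The calendar convention is an assumption for illustration (the first Sunday of a month is the monthly grandfather, other Sundays are weekly fathers, all other days are daily sons), and the names `gfs_tier` and `goes_offsite` are hypothetical.

```python
import datetime


def gfs_tier(day):
    """Classify a date as 'grandfather', 'father', or 'son'.

    Assumed convention (not from the thread): the first Sunday of a month
    is the monthly grandfather, other Sundays are weekly fathers, and every
    other day is a daily son.
    """
    is_sunday = day.weekday() == 6  # Monday is 0, Sunday is 6
    if is_sunday and day.day <= 7:
        return "grandfather"
    if is_sunday:
        return "father"
    return "son"


def goes_offsite(day):
    """Only the grandfather set is shipped offsite, as the comment suggests."""
    return gfs_tier(day) == "grandfather"
```

Sites often send the weekly set offsite too; that is a one-line change in `goes_offsite`.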
    • If you live in Southern California, there are four seasons:

      Fire, Flood, Mud, and Earthquake

      In which case, the best case for off site backup is out of state, like Las Vegas or something. This also gives you an excellent excuse for monthly road trips to "check out the quality of the backups".

      That said, for simple off site backups, solutions like MOZY.com do just fine for a small business. Otherwise, something like LiveVault.com is recommended. There are plenty of vendors out there.

      Another thing is the

      • Fire, Flood, Mud, and Earthquake

        Close, but no cigar. The four seasons in Southern California are Fire, Flood, Earthquake and Riot. I should know; I'm the one who posted that to rec.humor.funny about fourteen years ago. Besides, Mud is just a subsidiary of Flood.

    • by a9db0 (31053) on Thursday June 01 2006, @10:56AM (#15445654)
      i would suggest minimum different zip codes different time zones would be best

      Sounds funny but very true. Backups across town aren't terribly useful if across town is flat too. Sound farfetched? Ask a sysadmin in Miami how far off he ships his backups. If he was there when Andrew visited, I'll bet they're in New Mexico.

      This may seem a tad offtopic, but it is relevant:

      You have to think through both distance from and access to your backups as a part of disaster recovery planning. Backup isn't just recovering the CEO's email, though that is a (hopefully) far more frequent occurrence than recovering from a hurricane/fire/mudslide/blizzard. Easy access to the backup media is important for daily operations. Recovery from disaster is quite a bit more complex. Your backup solution needs to be able to cover the full spectrum - from yesterday's lost spreadsheet to the area flattened by mother nature.

      Personally, I keep two backups - one here locally, one 1000 miles away in another state. Backup to CD here, online rsync in NC.

      "Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway." - Variously attributed, frequently to Andrew Tanenbaum
  • by khasim (1285) <brandioch.conner@gmail.com> on Wednesday May 31 2006, @08:51PM (#15441113)
    This will take a LOT of research on your part.

    You'll need to identify each application that is being used, where its data is being stored and what type of "backup" is needed for it.

    Don't forget to include "backups" of the system software. There's nothing more annoying than having to rebuild a system, and you have a backup of the data, but you cannot find the install CD.

    Older *nix systems were far easier than the "modern" PC-based servers. I could back up my old Sequent box to a bootable tape. If anything went wrong, I could boot the tape and re-write the system. This is somewhat supported now on some of the PC-based servers.

    Anyway, back to the "backups". Once you have the systems identified, then you'll need to look at what scenarios you'll need to plan for.

    #1. Server crash.
    The data on the disk is destroyed. The OS is destroyed. But the hardware is okay.

    #2. The building burns down.
    All of your servers are now smoking heaps of plastic. So's your desk. And all the CDs you had.

    #3. 5 years from now someone wants a critical policy that was deleted 3 years ago.

    I spend most of my time kicking co-workers to get them to NOT just dump data anywhere that has free space and to NOT just throw up a new web server without telling me.
    • "You'll need to identify each application that is being used, where its data is being stored and what type of "backup" is needed for it."

      I second this. Nothing's worse than someone telling you "back up this system, full once a week, incrementals every other day, all local drives, blah blah" and then not telling you they've got some database on it (you can't back up a live database by just copying the files.) Of course, when failure hits, guess what needs to be restored and isn't usable?
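
To illustrate why a live database needs its own backup path rather than a file copy, here is a small sketch. It uses SQLite purely as a stand-in (the thread is about Oracle and Exchange, which have their own tools); the point is only that the engine's backup interface produces a consistent snapshot while the database stays open. The function name `online_backup` is hypothetical.

```python
import sqlite3


def online_backup(live_conn, backup_path):
    """Copy a live SQLite database through the engine's backup API.

    The engine produces a consistent snapshot even while the source
    connection stays open, which a plain file copy does not guarantee.
    """
    target = sqlite3.connect(backup_path)
    try:
        live_conn.backup(target)
    finally:
        target.close()
```

Whatever the engine, the same rule applies: the backup job should call the database's dump or hot-backup facility first and then back up the resulting file.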
      • Or how about the database the backup software uses? I have seen other people's solutions go down, and after rebuilding the backup machine there wasn't a record of what was on what tape to be restored. I had to re-index each tape and restore from there, then try to check the files for the newest ones (incrementals spanning different tapes as well as recycled tapes, so at best you have the changes to a few files but not the original to restore to). It took weeks instead of hours or even days to get it close enoug
    • You missed a few:

      #4: User deletes a file deemed by somebody important to be critical and you have to get it back.

      It's amazing how much money is spent planning for the once-in-a-lifetime Twin-Towers disaster event, and how little is spent on the daily occurrence of user error. Unfortunately "User is an idiot" doesn't wash when it's the company's financial records or the birthday party shots of the CEO's kid.

      - Don't permit users to save things to their local disks. Ensure all files go onto a share that can be ce
  • This may just be a wording issue, but it looks like you want to back up the desktops. Is that true?

    I can't think of any good reason to do that. All the important data should be on the server. If the user wants to save a picture on the local disk to use as a background or something that's one thing (although I wouldn't allow that myself) but nothing important should be on those disks.

    Past that, I don't have the experience to help you. All I can do is reiterate what another poster has already put up. Check the backups. I can't tell you how many stories I've heard about backups that "went fine" until someone needed data. Stories where the tapes were so old they almost shredded themselves in the drives. Stories of "backing up" for at least 6 months onto a cleaning tape (I bet the drive was in good condition though!). Stories of the backup data being garbage because of a faulty cable or something. The backup is worthless if you can't get the data back off it successfully.

    • I can't think of any good reason to do that. All the important data should be on the server. If the user wants to save a picture on the local disk to use as a background or something that's one thing (although I wouldn't allow that myself) but nothing important should be on those disks.

      Parent is correct - to an extent. There is still probably a requirement to bring a failed desktop up and running quickly if there is a problem that requires a desktop restoration.

      If centrally storing data is the way to take c

      • I agree. I assumed that the image of the computer(s) would be included in the backup. Having those images will save you a ton of time, even if each image is only for 50 computers.

        That said, there is a big difference between backing up the images and backing up each individual desktop in the company.

  • I dump stuff on undergrads. They've got to be good for something.

    /heh, just Kidding. I just mirror my scsi disks with a big ultra-ATA device weekly and daily.
    • > I just mirror my scsi disks with a big ultra-ATA device weekly and daily.

      You might like my backup software, Chroniton [cpan.org]. It will happily run from cron and make incremental backups (and allow you to easily restore from one). It also stores everything to the filesystem, so even if my software crashes and burns (which it won't; it's heavily tested in practice and with unit tests :), your data will still be just fine. All of your files' metadata is safely versioned and archived, as well. Take a look, it's
  • by Millenniumman (924859) on Wednesday May 31 2006, @08:58PM (#15441148)
    My backup strategy consists of hoping that my hard drive doesn't fail before I get a new computer/hard drive. It's worked so far, even with a laptop.
  • 1 click down, yell "Clear" and hit the gas.
  • Paper (Score:5, Informative)

    by NetDanzr (619387) on Wednesday May 31 2006, @09:28PM (#15441319)
    My backup copy is paper. Granted, it gets a little awkward when I move, as I currently have six large file boxes of that stuff, but I know that as long as I keep it reasonably safe from humidity/mice it'll outlive all my computer media and file format changes.

    At work we do the same, only to a larger extent. We've got on-site and off-site storage, and each piece of information is printed in two copies, one to be stored at each. All that in addition to your usual Veritas tape and CD-RW backups, which we do for convenience of restoring lost data, but which we don't trust enough to eliminate paper copies.

    • If you are making backups every week, you only need them to last one week. Paper makes sense for an archive if you plan on needing the data long after you have stopped creating new data, but while you are working a short-term, cheap, space efficient and environmentally friendly solution is better.
  • I think you're jumping the gun a little here.

    The first question you need to ask is:

    What is the time frame in which your servers must be restored should they fail completely?

    If you don't know the answer to that question then how does your company know how much money to budget? Are you bound by HIPAA or Sarbanes-Oxley? You should know how much your company's data is worth prior to assigning a budget.

    Are some of your database servers supposed to be up 24x7? Maybe you should look at distributed transactions across databases located at different sites so if one server fails you still have everything live? Have you timed how long it takes to rebuild your servers to confirm your allotted time in your disaster recovery plan? Has your company considered imaging servers? Is it possible to?

    Have you consulted your disaster recovery plan? Have you checked with suppliers to see how long replacement parts will take to order? I can't tell you how many administrators get caught out by buying an expensive tape drive only to have it fail along with the server and nothing can be restored until a new one can be sourced.

    Without requirements and a disaster recovery time frame, you will never be in control in the event of a disaster.

    Your company's board of directors/owners will need this information. It's called operating under conditions of "due care and diligence".

    If something goes wrong and you can't tell your boss exactly what is required and how long it will take to recover then you're working in the wrong job - a big part of being a network administrator is planning for ANY event.

    Oh, most of the time my customers are happy with Robocopy. I hate paying for expensive hardware and backup software solutions when I can write something much simpler and document it properly rather than depending on someone else's buggy software. Of course this depends on the industry and their requirements.

    Make sure that your boss completely understands these questions and issues. Ask him to see the current Business Continuity plan and Disaster Recovery documentation before you touch anything on those servers - can't stress that enough.

    Hope that helps, sorry it's brief but if you're in charge of backups it's your job to be ANAL and PEDANTIC.
    • Before you start spending money you need to know what the company requirements are. There are excellent tools and options, including real time RAID-1 over multiple sites, but the business case will drive your requirements.

      Servers - how long can they be down? Do you have replacement plans in case your data center gets hit by the next earthquake/hurricane/fill_in_the_disaster? Having tapes off site means nothing if you don't have hardware for restore. Can you get Hardware X if everyone else is looking f

  • We moved all of our servers to VMware virtual machines. Now we back them all up every night, some of them we even back up multiple times a day. We tried esxRanger first, but it took too long (back up of all of the VMs took 4 days) and used too much space. Then we moved to esXpress, which does differential backups of VMs, so it is MUCH faster and uses MUCH less space. We keep 30 days worth of backups online, but once a week we cut tapes of the monthly full and that week's differentials and ship it off-si
  • Who bothers with backups? I've personally never wasted any time backing

    A fatal exception 0E has occurred at 0137:BFFA21C9. The current application will be terminated.

      * Press any key to terminate the current application
      * Press CTRL+ALT+DEL again to restart your computer. You will lose any unsaved information in all applications.

                      Press any key to continue _
  • by SlappyBastard (961143) on Wednesday May 31 2006, @10:16PM (#15441569)
    Please God... please say someone took the project home on CD, or we're fucked!
  • Get a real file server, a small tape robot and Veritas.
  • I don't give two hoots for a backup policy. What you need is a data recovery policy. When will I need to recover data, and how will that be attained?

    I've been working with Symantec (formerly Veritas) NetBackup in my workplace for the past 6 years. About 6 months ago I became one of the backup admins, and the biggest barrier I have to break with our clients is the backup mentality - I must back up everything all the time...

    Generally your data recovery will happen from two triggers:

    1. A user broke his ow
  • ...actually turn your upper body around, so you can look in the direction you're driving.

    Think of the children!

  • Rsync is very good at keeping two servers in sync with minimal bandwidth and disk activity, and can be configured so that you never lose a past revision. I have it set up so we have the latest copy, two weeks of revisions, and one previous revision for each file on every file share.

    Some special consideration is needed for Windows servers. Some files get locked so they can't be read by rsync. We're not backing up anything that we'd run into that problem with, and we back up during a period of inactivity, but
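
One way to get "never lose a past revision" out of rsync is its `--backup` and `--backup-dir` options, which move overwritten or deleted files aside instead of discarding them. The sketch below only builds and runs such a command; it is not the poster's actual configuration, and the helper names, the dated directory layout and the paths in the usage note are assumptions.

```python
import datetime
import subprocess


def rsync_command(source, dest, revisions_root, today=None):
    """Build an rsync argv that mirrors source to dest and keeps old revisions.

    Files that would be overwritten or deleted on the destination are moved
    into a dated directory under revisions_root (assumed to be an absolute
    path on the receiving side) instead of being lost.
    """
    today = today or datetime.date.today()
    backup_dir = f"{revisions_root}/{today.isoformat()}"
    return [
        "rsync",
        "-a",                          # archive mode: recurse, keep times and permissions
        "--delete",                    # mirror deletions from the source...
        "--backup",                    # ...but keep what gets replaced or deleted
        f"--backup-dir={backup_dir}",  # in a per-day revisions directory
        source,
        dest,
    ]


def run_sync(source, dest, revisions_root):
    """Run the mirror; raises CalledProcessError if rsync reports failure."""
    subprocess.run(rsync_command(source, dest, revisions_root), check=True)
```

For example, `run_sync("/srv/share/", "backuphost:/backups/share/", "/backups/revisions")` (all three names made up) would mirror the share and leave each day's replaced files under a dated directory. Pruning revision directories older than two weeks would be a separate job.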
  • You write that you're archiving your old backups. This is good, of course, for several reasons. You need multiple copies in case the newest one isn't usable, and you may need to access old data. However, how far back do you plan to go in saving old data? If you just keep all backups from now on, you'll have an endlessly rising storage fee because they'll just take up more and more room, and the chances you'll need the older data will get smaller and smaller. Part of creating a good backup policy is deci
  • Remember to change the tapes!

    the cron scripts don't work otherwise!

    #!/bin/sh

    # Daily backup script

    rm -rf /var/db/mysql_tmp
    mkdir /var/db/mysql_tmp
    /usr/local/etc/rc.d/000.mysql-server.sh stop
    cp -R /var/db/mysql/./ /var/db/mysql_tmp/
    /usr/local/etc/rc.d/000.mysql-server.sh start
    find /home/*/public_html /home/*/Mail /var/mail /usr/local/www/ /etc -newer /root/backup/last_backup -and \( -type f -or -type l \) > /root/backup/daily_increment
    find /var/db/mysql_tmp \( -type f -or -type l \) >> /root/backup
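
The `find ... -newer` lines above select what goes into the daily increment: everything modified after a marker file that is touched at the end of each backup. A rough Python equivalent of that selection step, as a sketch only (the function name `changed_since` is hypothetical, and writing the result to tape is left out):

```python
import os
import stat


def changed_since(roots, marker_path):
    """List regular files and symlinks modified more recently than the marker file.

    Mirrors `find ROOTS -newer MARKER \\( -type f -or -type l \\)`.
    """
    cutoff = os.stat(marker_path).st_mtime
    changed = []
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Symlinks to directories show up in dirnames, so check both lists.
            for name in dirnames + filenames:
                full = os.path.join(dirpath, name)
                info = os.lstat(full)  # lstat: look at the link, not its target
                is_wanted = stat.S_ISREG(info.st_mode) or stat.S_ISLNK(info.st_mode)
                if is_wanted and info.st_mtime > cutoff:
                    changed.append(full)
    return sorted(changed)
```

The marker file should only be updated after the backup has finished successfully; otherwise a failed run silently drops that day's changes from every later increment.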
  • We have all Macs here and we're slightly smaller than this company; we just started doing serious backups recently. Being most of the IT department here (owner still feels we're too small to need more, I do well, the Macs are easy to administer, gives me time to work on our website and some internal apps) I decided to take advantage of the rigid structure of the Mac OS X home folders. We allow our employees to have MP3 collections and other personal media on their computers, but that doesn't need to be b
  • I never NEVER backup. It is futile, a huge waste of time, and a monumental risk. The only time I have ever lost data was while performing backups. Let me give you an example.

    Way back around 1979, it was my first serious development job, and as the junior programmer in the shop I had the onerous duty of performing the weekly backups of our production drive, containing all the code for our accounting software development. We had a big 10Gb Corvus hard drive (the original Winchester) networked to our Apple IIs
    • That's just ducky when the building burns down, the office is vandalized, the hardware is stolen, someone deletes the files, the fire system malfunctions and triggers the automatic sprinkler system, you hit 'delete' when you meant to hit 'enter', it turns out that your source control didn't quite control your source as much as you thought it had, you fire the wrong person, you hire the wrong person, someone does something they shouldn't have been doing and the equipment gets impounded, the bills weren't pai
      • You weren't paying attention. I specifically said that due to my diligent maintenance, I have a 0% hard drive failure rate over the last 20 years. I just retired a server with an Atlas 10K SCSI drive that ran 24/7/365 for over 5 years without a single problem, not even a soft error. That's what happens when you buy quality products, like high-end SCSI drives instead of cheapshit IDE drives.

        Yes, I am invulnerable. My OS and apps are backed up on their original distribution discs. My handmade data is archived
    • We had a big 10Gb Corvus hard drive (the original Winchester)...

      That didn't sound right, so I did a little checking. FOLDOC [foldoc.org] tells me that the drives got their name because they had two 30meg volumes, rather like the Winchester 30-30. If you really were working with a 10Gig drive, it wasn't a Winchester, and it wasn't in 1979, either, because they didn't have drives that big back then.

  • Two thoughts related to storage:

    - Consider carefully whether you trust your tape safe. I've seen tapes damaged at temperatures lower than some tape safes are rated for.

    - If you have offsite backups, you should also have offsite tape drives. If your main site is destroyed in some catastrophic disaster, it's not too hard to get emergency replacements for server hardware, especially x86. But urgently sourcing the right model of tape drive (in many cases a model that is a few years old) can be a nightmare. Whil
  • is http://www.avamar.com/ [avamar.com]

    The Backup server or cluster of servers store 20KB blocks keyed to the block's SHA-1 hash.

    Smart agents on each backup client chunk each new file to be backed up into 20KB blocks and calculate SHA-1 hashes, which they compare against the backup server.

    If the block is new (not on the backup server) the block itself is transferred.
    If the block is old, the backup server stores an extra reference to the block for the client/file.

    The end result is..
    a) a 1000 windows backup clients will res
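
The block scheme described above can be sketched in a few lines. This is an in-memory toy under stated assumptions, not Avamar's implementation: the class name `BlockStore` and its methods are hypothetical, and a real system would keep blocks on disk and exchange hashes over the network.

```python
import hashlib

BLOCK_SIZE = 20 * 1024  # 20KB blocks, as described in the comment


class BlockStore:
    """Toy backup server that stores each distinct block once, keyed by SHA-1."""

    def __init__(self):
        self.blocks = {}            # SHA-1 hex digest -> block bytes
        self.manifests = {}         # (client, filename) -> ordered list of digests
        self.bytes_transferred = 0  # bytes that actually had to be sent

    def backup(self, client, filename, data):
        """Chunk data, send only unseen blocks, and record references to all of them."""
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha1(block).hexdigest()
            if digest not in self.blocks:      # new block: transfer and store it
                self.blocks[digest] = block
                self.bytes_transferred += len(block)
            digests.append(digest)             # old or new: keep a reference
        self.manifests[(client, filename)] = digests

    def restore(self, client, filename):
        """Reassemble a file from its recorded block references."""
        return b"".join(self.blocks[d] for d in self.manifests[(client, filename)])
```

Backing up the same file from a second client adds a manifest entry but transfers nothing, which is where the savings across many similar workstations come from.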
  • I don't bother with backups. I've got an airtight policy in case of a HD crash or any other form of data loss:
    1)Look shocked and terrified.
    2)Yell.
    3)Scream.
    4)Pull hair.
    5)Bang head to wall.
    6)Sit quietly sobbing in a corner.
    7)Kick the cat.
    8)Replace HD. (if necessary).
    9)Reinstall software.
    10)Kick cat again.
    11)redownload mp3s, movies, games and pron.
    12)Feed cat.
    13)Mail goatse.cx pictures to random innocent people as an act of pointless revenge.
    14)Make futile threats to a deity that if it happens again
  • AMANDA is really great software. In my past job, we used Retrospect (then from Dantz). That was a nightmare--it used some proprietary archiving format & we weren't able to retrieve some things. AMANDA uses standard dump or tar files (well, as standard as 'dump' is, I guess), so I'm confident that that'll never happen. It also has a first-class scheduling system. Every night, we fill almost exactly one full tape. There are very few disks which don't get a nightly incremental & we have it config
  • So far the best "backup" software I've used is rsync.

    I used to work at one of the world's most well-known web hosting companies where, among other things, I ran their backup system. It started out with Arkeia and a 120-tape library with 6 AIT3 drives. Arkeia was crap though (this was 3 years ago); it was such a pain to set up, and trying to restore ANY amount of data would literally take days just to scan its local database. Trying to restore just one file would take 6 hours just for it to scan its local database..
  • I run the same version of my OS on QEMU and have it rsync the data.

    • It's not something I've ever looked at, but I kind of doubt that encrypted backups are likely to be popular enough at the "serious backup" level for the simple reason that tape manufacturers advertise the (average) compressed capacity of the tape, with compression being done in hardware by the drive. This has generally been about twice the actual raw, uncompressed capacity of the tape. (It may be higher now; it's been a couple years since I went to virtual tapes.) Well-encrypted data is uncompressible, o
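
The compression point is easy to check for yourself. In the sketch below, random bytes from `os.urandom` stand in for ciphertext; that substitution is an assumption (well-encrypted data should look statistically random), and the helper name `compression_ratio` is hypothetical.

```python
import os
import zlib


def compression_ratio(data):
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


def compare(sample_size=64 * 1024):
    """Return (ratio for repetitive plaintext, ratio for random 'ciphertext')."""
    line = b"weekly full backup, daily incremental\n"
    plaintext = (line * (sample_size // len(line) + 1))[:sample_size]
    ciphertext_like = os.urandom(sample_size)  # stand-in for encrypted data
    return compression_ratio(plaintext), compression_ratio(ciphertext_like)
```

The repetitive text shrinks to a small fraction of its size while the random sample does not shrink at all, so a drive fed encrypted data only ever delivers its raw capacity. If both are wanted, compress before encrypting.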