Costs Associated with the Storage of Terabytes?
NetworkAttached asks: "I know of a company that has large online storage requirements - on the order of 50TB - for a new data-warehousing oriented application they are developing. I was astonished to hear that the pricing for this storage (disk, frames, management software, etc.) was nearly $20 million. I've tried to research the actual costs myself, but that information seems all but impossible to find online. For those of you out there with real world experience in this area, is $20 million really accurate? What is the set of viable alternatives out there for storage requirements of this size?"
Pricing sounds a little high (Score:5, Informative)
If you're going with EMC, you'll need to put those disks in something, like a frame (cabinet), and for your size, more like 5 cabinets. With that many cabinets, you'll need some sort of SAN switch and associated fibre cables (not cheap). That gets your disks into cabinets and all hooked together.
You wanted to access the data? Then you'll need EMC fibre channel cards ($15k a pop for the Sun 64-bit PCI high end jobs). But you'll more than likely be serving data from a cluster of machines, so count on buying three ($45k) per machine, so that each card sits on a different I/O board hitting the SAN switch, for redundancy.
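To put a number on the card costs above, here is a rough sketch of the arithmetic. The $15k price and three cards per machine come from this post; the cluster size of four hosts in the usage note is purely a made-up example, and the function name is mine.

```python
# Rough HBA budget, using the figures quoted in the post.
HBA_PRICE_USD = 15_000   # per fibre channel card, as quoted above
HBAS_PER_HOST = 3        # one per I/O board, for redundancy


def hba_cost(hosts, hbas_per_host=HBAS_PER_HOST, price=HBA_PRICE_USD):
    """Total HBA spend for a cluster of `hosts` machines."""
    return hosts * hbas_per_host * price


if __name__ == "__main__":
    print(hba_cost(1))  # one machine: three cards
    print(hba_cost(4))  # hypothetical four-node cluster
```

One machine works out to the $45k mentioned above; a hypothetical four-node cluster would be $180k in cards alone, before the switch and cabling.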
Who's going to set this up? For that kind of coin, EMC (or whoever you go with) will more than likely set the thing up and burn it in for you on site. The price probably also includes some kind of maintenance contract with a turnaround time fitting the criticality of the system.
Yes, my 'big ass storage' experience may be limited, but I think that $20 million for 50TB installed/supported/tested by a big storage vendor is in the ballpark.
Good luck.
Re:Sounds reasonable (Score:3, Informative)
server only has FS daemons doing I/O, and the drives
are always hot, there is no SCSI advantage as there
is in a multitasking workstation environment.
Re:Pricing sounds a little high (Score:4, Informative)
For enterprise-class storage (i.e. this is NOT just a pile of Maxtor IDE drives duct-taped together) paying 20M for 50TB is on the high side, but not by much. (I would have given a range of 10M-20M for the whole thing depending on the exact trade-offs made.)
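For a sense of scale, the per-gigabyte price implied by these totals can be worked out as below. This is only back-of-the-envelope arithmetic on the figures in this thread (10M to 20M for 50TB); it assumes decimal units (1 TB = 1000 GB), and the function name is my own.

```python
def cost_per_gb(total_usd, capacity_tb):
    """Price per gigabyte, assuming decimal units (1 TB = 1000 GB)."""
    return total_usd / (capacity_tb * 1000)


if __name__ == "__main__":
    for total in (10_000_000, 20_000_000):
        print(f"${total:,} for 50TB -> ${cost_per_gb(total, 50):,.0f} per GB")
```

That puts the quoted range at roughly $200 to $400 per GB for the whole installed and supported system, not just the raw disks.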
3 HBAs per host is overkill for most applications (but certainly not all). I've found that two is generally sufficient. Never rely on just one, even for a non-critical system. I'm often amazed at just how critical non-critical servers become when down for several hours in the middle of a busy day.
Don't discount the significant setup and debugging costs at the beginning. This will cost not only in hardware/software/consulting but in time lost for your own admins to spend working with the vendor, going to classes, learning new methods of adding storage, accidentally messing up the systems, cleaning up those messes, etc.
Get the best monitoring/management software you can. EMC is famous for gouging people on software costs so you'll need to use your best judgement. (HINT: PowerPath == Veritas DMP at up to 20x the cost. SRDF == Veritas Volume Replicator at up to 20x the price. TimeFinder == Mirroring at up to an infinite multiple of the price. You get the idea-- just use your best judgement and be cautious.) Under extreme single-host disk loads the otherwise minor performance hit for host volume management can become a problem, making that 20x price worth it. Maybe.
If possible, press them for management software that makes adding/removing/changing filesystems a one-step operation, complete with error checking. It really sucks to put that new database on the same disks as another host's old database and software can be really good at checking for stupid human mistakes.
Re:Look at the quantities (Score:4, Informative)
Then you want to back this up? Break out your checkbook again for a Compaq minilibrary if you're lucky; that is only 10 tapes x 80 GB a tape, or 800 GB, and that is if you're really doing well. So put that on top of it all: 10 x 10 x 80 GB gives you 8 TB of backup at around 30k each for the minilibs. The price just keeps on jumping!
No way, no how, not today or tomorrow. 100k will get you a floor full of 120 GB Maxtor drives and that is about it.
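The tape math above can be sketched as follows. The 10 tapes per library, 80 GB per tape and roughly 30k per library come from this post; reading "30k" as US dollars, using decimal units, and needing one full copy of the 50TB from the original question are my assumptions.

```python
import math

TAPES_PER_LIBRARY = 10
GB_PER_TAPE = 80
LIBRARY_PRICE_USD = 30_000  # assumption: the quoted "30k" is US dollars


def library_capacity_gb():
    """Capacity of one minilibrary with every slot filled."""
    return TAPES_PER_LIBRARY * GB_PER_TAPE


def libraries_needed(target_tb):
    """Whole libraries required to hold target_tb (1 TB = 1000 GB)."""
    return math.ceil(target_tb * 1000 / library_capacity_gb())


def backup_hardware_cost(target_tb):
    """Library hardware cost only; tapes, drives and software not included."""
    return libraries_needed(target_tb) * LIBRARY_PRICE_USD


if __name__ == "__main__":
    print(libraries_needed(8), backup_hardware_cost(8))
    print(libraries_needed(50), backup_hardware_cost(50))
```

Ten libraries cover the 8 TB mentioned above for about 300k; a single full copy of 50TB would take 63 of them, or close to 1.9 million in library hardware alone under these assumptions.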
What Kind of Application? (Score:2, Informative)
Are you mostly reading, or also frequently writing this data? Are you searching or doing indexed lookups? Is this a nasty bandwidth hog or a trickle? Is this a zillion parallel transactions or only a few users? What kind of latencies are expected? What reliability is required? What access is needed to historical data?
Consider some concrete examples that are *very* different from each other yet could each total 50TB and would have very different solutions:
- Video-on-demand system for a Hollywood studio deciding that peer-to-peer pirate systems can only be beaten by a legitimate system that is better.
- Online credit card transaction system for, say, Visa.
- SETI data that needs to be collected and searched for messages from extraterrestrials.
- Particle accelerator data that needs to be collected at truly horrendous rates.
- Lexis/Nexis database.
- Google database.
- Echelon data.
- IRS data.
- "Dictionary attack" database for a lone cryptanalyst.
The possibilities go on and on. At the minimum a 50 TB database might be a small number of equipment racks with a single computer attached to them, all totaling maybe $100,000.
And on the other end, I can easily imagine a system where $200,000 of a much larger total might be spent for, say, a terabyte of DRAM.
I can easily imagine a system with less than $5,000 of battery-backed power supplies, and I can imagine a system with hundreds of thousands in generators.
This question has enormous dynamic range.
-kb, the Kent who would enjoy working out solutions for specific instances of this question.
Re:You're joking, right? Moderators: you too, righ (Score:3, Informative)
You know that I am talking about commonly available multi-CPU systems, and not exotic (and insanely expensive) systems with redundant CPUs and memory.
What are you smoking, and where can I get some?
Do you seriously believe that an E6500 or similar system will not crash if there is a faulty CPU? Despite your impressively low slashdot UID, if you believe this, you have virtually no experience with such systems.