Hardware

Costs Associated with the Storage of Terabytes?

NetworkAttached asks: "I know of a company that has large online storage requirements - on the order of 50TB - for a new data-warehousing-oriented application they are developing. I was astonished to hear that the pricing for this storage (disk, frames, management software, etc.) was nearly $20 million. I've tried to research the actual costs myself, but that information seems all but impossible to find online. For those of you with real-world experience in this area, is $20 million really accurate? What viable alternatives are out there for storage requirements of this size?"
  • by Twylite ( 234238 ) <twylite AT crypt DOT co DOT za> on Tuesday September 10, 2002 @12:02PM (#4228747) Homepage

    I am not an expert in this field, but Google was willing to tell me lots.

    RaidWeb [raidweb.com] sells rack-mountable RAID units that take IDE drives and offer SCSI or Fibre Channel connectivity. A 12-bay 4U SCSI system (with 12x 120GB IDE drives) comes in at just under $8000, giving over 1TB of fault-tolerant storage. Several other companies sell similar units.

    Rackmount Solutions [rackmountsolutions.net] sells rackmount cabinets. A 44U cabinet with fans, doors, etc. will come in at around $3000.

    In theory, a single cabinet could house 11TB of data and cost around $91,000. That still doesn't account for cabling, cooling, power distribution, networking, a proper server room (air conditioning, raised floor for cables, access control), and in all likelihood one or more controlling servers.

    More practically, depending on how the data is to be made accessible, you could be looking at 9 RAID units per cabinet plus three 2U servers and a switch in the remaining space. Each server can support multiple SCSI cards and gigabit networking. Such rackmount servers will set you back in the region of $6000 each (incl. network and SCSI adapters, excl. software).

    So call it $100,000 for 9TB of storage ... $600,000 for 54TB. That doesn't answer the management-software question, and it may not be a suitable solution. But it sure is a lot cheaper than $20 mil ;)
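
    For illustration, here is that arithmetic spelled out in a few lines of Python. All of the figures are the ballpark prices quoted above (not vendor quotes), and per-unit capacity is rounded down to 1TB usable:

        # Back-of-the-envelope DIY cost, using the poster's ballpark prices.
        RAID_UNIT_COST = 8000    # 12-bay 4U unit with 12x 120GB IDE drives
        SERVER_COST = 6000       # 2U server with SCSI and gigabit adapters
        CABINET_COST = 3000      # 44U cabinet with fans, doors, etc.

        RAID_UNITS_PER_CABINET = 9   # 9x4U RAID + 3x2U servers + switch ~ 44U
        SERVERS_PER_CABINET = 3
        TB_PER_RAID_UNIT = 1         # usable fault-tolerant capacity, rounded down

        def cabinet_cost():
            return (RAID_UNITS_PER_CABINET * RAID_UNIT_COST
                    + SERVERS_PER_CABINET * SERVER_COST
                    + CABINET_COST)

        def cost_for(tb_needed):
            tb_per_cabinet = RAID_UNITS_PER_CABINET * TB_PER_RAID_UNIT
            cabinets = -(-tb_needed // tb_per_cabinet)   # ceiling division
            return cabinets, cabinets * cabinet_cost()

        print(cost_for(54))   # (6, 558000) -- six cabinets, roughly $560,000

    That lands in the same range as the rough $100,000-per-cabinet, $600,000-for-54TB figure above, before management software, backups, and facilities.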

  • by highcaffeine ( 83298 ) on Tuesday September 10, 2002 @12:10PM (#4228818)
    In raw disk storage, maybe. But you're forgetting what it takes to actually put those drives into a usable state, with a disaster-recovery plan on top.

    In other words, someone dealing with 50TB who wants backups of that data will be spending many, many times what it would cost to just purchase enough hard drives for the bragging rights of 50TB. And a backup located in the same room/floor/rackspace/whatever as the source data is pointless in the event of fire, flood, nuclear fallout, etc. So they would also need a way to transfer all that data to offsite backups in a timely manner (waiting six-plus weeks for a full backup to cross a 100Mb/s pipe would probably not be acceptable).
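
    To put a number on that (a quick sketch, assuming the link runs flat out with no protocol overhead or compression):

        # How long does a full 50TB copy take over a 100Mb/s link?
        DATA_TB = 50
        LINK_MBPS = 100   # megabits per second

        bits = DATA_TB * 10**12 * 8
        seconds = bits / (LINK_MBPS * 10**6)
        print(round(seconds / 86400, 1))   # ~46.3 days, i.e. six-plus weeks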

    Aside from backups, how would the drives be accessible? Even as JBOD, you're talking about 40 IDE/ATA controllers (assuming 320GB drives and 4 ports per controller), or roughly 21 SCSI channels (assuming 160GB per drive and 15 non-host devices per channel), to support that many disks. You could also use Fibre Channel and get away with only a couple of arbitrated loops. Physically, you're talking about hundreds of disks that need to be mounted somewhere, so you would also need dozens of chassis to hold the drives.
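
    A quick sanity check on those counts, using the same assumptions (320GB ATA drives at 4 per controller, 160GB SCSI drives at 15 non-host devices per channel):

        # Drive and bus counts needed to reach ~50TB of raw JBOD capacity.
        import math

        TARGET_TB = 50

        def drives_and_buses(drive_gb, devices_per_bus):
            drives = math.ceil(TARGET_TB * 1000 / drive_gb)
            buses = math.ceil(drives / devices_per_bus)
            return drives, buses

        print(drives_and_buses(320, 4))    # (157, 40)  -- ATA drives, controllers
        print(drives_and_buses(160, 15))   # (313, 21)  -- SCSI drives, channels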

    But, hundreds of disks in a JBOD configuration means you'll have hundreds of partitions, each separate from the others. Hell, if the clients are Windows machines, they won't even be able to access more than a couple dozen at a time. And even for operating systems with better partition/mount-point addressing, it would be unmanageable.

    So now you get into needing a RAID solution that can tie hundreds of disks together. If you're talking about hooking these up to standard servers through PCI RAID cards, you'll need several such machines to host all the necessary controllers (especially if the individual disks are smaller than 160GB).

    The only realistic solution for this much storage, at least until we have 5TB hard drives, is a SAN-like setup: specialized hardware designed to house hundreds of disks in stand-alone cabinets and provide advanced RAID and partitioning features. SANs don't come cheap.

    Add to the SAN the various service plans, installation, freight, configuration, management, and the occasional drive swap as individual disks fail, and you've already multiplied that ~$50K of raw drives several times over, as a bare minimum (and you still haven't priced out the backup solution).

    There's a lot more to it than just having a pile of hard drives on the floor. I wouldn't even be surprised if the drives are the cheapest component.
  • You pay for support.

    by molo ( 94384 ) on Tuesday September 10, 2002 @12:23PM (#4228943) Journal
    When you get a Symmetrix frame from EMC, you also get a support contract. EMC will send multiple people to your installation for maintenance. EMC will remotely monitor your Symm via modem. They will help you plan your storage needs (including what kind of backup and reliability you need). EMC will provide 24x7 support for everything you need. Then there's management software, etc.

    Don't forget that the hardware isn't cheap: the frame, multiple redundant hot-swappable power supplies (requiring a specialty power connection), dozens of SCSI drives, dozens of SCSI controllers, 10-20 Fibre Channel connections, an interconnection network between the FC and SCSI controllers that includes fiber and copper Ethernet, hubs, etc., and a management x86 laptop integrated into the frame.

    $20 mil for this is a fair price in my opinion. Anyone who rolls their own is just insane. There are hundreds of engineers behind each of these boxes, and it shows.

    No, I don't work for EMC.
  • by chris_mahan ( 256577 ) <chris.mahan@gmail.com> on Tuesday September 10, 2002 @02:56PM (#4230566) Homepage
    I work at a bank. I understand reliability, failover, and so on.

    What we need is some university (or some poor souls with money to invest) to build this as a "test case" for Linux distributed systems.

    =============
    Requirements:
    -- 50TB of data storage
    -- 100% availability (I don't mean 99.99-something)
    -- Data must be accessible worldwide
    -- Data must be safe in these events:
    ---- War or terrorist act (building blows up)
    ---- Earthquake (building falls down flat)
    ---- Fire (building burns to the foundation)
    ---- Flood (building full of muddy, fishy water)
    -- Data must be back online within 48 hours of a disaster (see the back-of-the-envelope check after this list)
    -- Data must survive:
    ---- Server failure
    ---- Storage medium failure
    ---- Telecommunication failure (junk through the pipes)
    ---- Unauthorized access (r0x4H 31g00G)
    ---- Vandalism (maintenance guy with a baseball bat or axe)
    ---- Theft of equipment
    Furthermore:
    -- Data must always be in a non-corrupt state
    -- Data must be fully auditable
    -- Data transactions must always be fully reversible
    Also:
    -- All procedures (ALL) must be written down, electronically and on paper, and must be available ONLY to the proper personnel.
    -- All personnel must be properly trained (development of training material, testing, evaluations, etc.)
    -- The system architecture must allow connectivity to any known server system, database system, and client system.

    ===
    Oh, and under 20 million dollars.
    ===

    How that solution should be implemented is left as an exercise for the reader.
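
    As one illustrative slice of that exercise, the 48-hour recovery requirement by itself sets a floor on inter-site bandwidth. A minimal sketch, assuming a full 50TB restore from a remote copy with no compression or pre-staged data:

        # What sustained link speed is needed to move 50TB within 48 hours?
        DATA_TB = 50
        WINDOW_HOURS = 48

        bits = DATA_TB * 10**12 * 8
        required_bps = bits / (WINDOW_HOURS * 3600)
        print(round(required_bps / 10**9, 1))   # ~2.3 Gb/s sustained

    In other words, the recovery window alone implies multi-gigabit connectivity (or replicas that are already largely in place), which is part of why solutions in this class get expensive.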
