Data Storage

Making Use of Terabytes of Unused Storage

kernspaltung writes "I manage a network of roughly a hundred Windows boxes, all of them with hard drives of at least 40GB — many have 80GB drives and larger. Other than what's used by the OS, a few applications, and a smattering of small documents, this space is idle. What would be a productive use for these terabytes of wasted space? Does any software exist that would enable pooling this extra space into one or more large virtual networked drives? Something that could offer the fault-tolerance and ease-of-use of ZFS across a network of PCs would be great for small-to-medium organizations."

  • Porn (Score:5, Funny)

    by Anonymous Coward on Saturday February 09, 2008 @10:29AM (#22359676)
    It's the obvious choice.
    • by fbjon ( 692006 )
      You jest, but I wonder how much of this "unused" space is really unused? It's not just admins who have a few files of "random OTP data" or "misc dll's", ya know!
  • vista? (Score:5, Funny)

    by stillb4llin ( 1232934 ) on Saturday February 09, 2008 @10:30AM (#22359680) Homepage
    Install Vista on them; that would fill up the space and give you something better to do with your time than wondering about what you could manage.
  • easy! (Score:5, Funny)

    by Anonymous Coward on Saturday February 09, 2008 @10:34AM (#22359698)
    Does any software exist that would enable pooling this extra space into one or more large virtual networked drives?

    Absolutely! Just hook them up directly to the internet before you update the machines, wait a few minutes, and voila! They'll be filled up with extra files in no time! Hey, you didn't say anything about wanting to be in control of what gets put on the machines...
  • by Mostly a lurker ( 634878 ) on Saturday February 09, 2008 @10:34AM (#22359700)
    If you have a very robust local network with plenty of spare capacity, and can accept a performance hit on the client computers, I am sure some kind of linked filesystem would be possible. In most practical situations, I think this idea would be a non-starter.
    • by Anonymous Coward

      Please stop typing words like "utilization" when you mean "use". You sound like a PHB trying to sound smarter than he really is and you make it a pain for people to read what you write, especially non-Anglophones. Read George Orwell's essay on this topic [mtholyoke.edu].

    • Please don't (Score:5, Interesting)

      by mnmn ( 145599 ) on Saturday February 09, 2008 @08:00PM (#22364484) Homepage
      Please do not use the space for anything else. Do not try to actively use the space.

      The reason is the obscenely large amount of power it takes: using even a few gigabytes of that space means the whole machine has to be running, including its CPU, which can't draw less than 21 watts by itself.

      It's actually cheaper to buy a 1TB drive and use it elsewhere than to burn the power on so many desktops (or worse, servers), even with the desktops already in use by active users.
  • by Marc Rochkind ( 775756 ) on Saturday February 09, 2008 @10:34AM (#22359702) Homepage
    If they're in a computer room, then such a scheme might work. But if they're on users' desks, you don't really have control. They're subject to filling up, being shut off, being knocked about, crashing, etc. I don't think in this case you would really get the reliability that the diversity and independence would suggest.

    --Marc
    • by McGiraf ( 196030 )
      You just use some kind of distributed RAID. I'm sure software for this already exists.
    • by teslar ( 706653 )

      They're subject to filling up, being shut off, being knocked about, crashing, etc

      Well, filling up is kinda the point of the entire exercise, but you're right - being shut off, crashing, or being otherwise disconnected is enough of a problem to make this a non-starter. We're basically talking about a distributed filesystem in which subparts may fail without notice. I'm sure there are ways to minimise the problems this will create - you can for instance make sure that any one file is always completely locate

      • Re: (Score:2, Insightful)

        by cbart387 ( 1192883 )

        Or is this just trying to salvage something you can't really use in order to create a solution for a problem that doesn't really exist?
        Bingo Bango Bongo! If you read the submitter's question, it simplifies to:
            a) Is there something productive I can be doing?
            b) How do I do it?

        Everything else is fluff that tends to lead Slashdot readers off on tangents and flamewars: Emacs vs. Vi (Emacs), KDE vs. GNOME (GNOME).
      • Besides, with hard disks these days being as cheap as they are, why not just buy another one if you do need more space? Do you even need more space? Or is this just trying to salvage something you can't really use in order to create a solution for a problem that doesn't really exist?

        Let's say 50GB, on average, is free on each computer. We want some fairly hefty redundancy, so let's knock it down to 10GB of storage per machine.

        Spanned across 1000 work machines - that's 10 Terabytes of storage, with 4X or so
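        (A quick back-of-the-envelope check of this arithmetic appears just after this subthread.)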
        • by fbjon ( 692006 )
          Why not a sort of gradual compression redundancy, like a hologram? The more machines are on the network, the more details about the data in storage can be extracted.


          Now, this would only work well for lossy compression like, say... images, but I'm sure that is the intention of the submitter. Incidentally, it may provide an incentive for the workers to keep their computers turned on, and even stay late after work.

          The more, the merrier. Literally.
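
        To put rough numbers on the 1000-machine estimate a couple of comments up, here is a quick back-of-the-envelope sketch. The free-space figure, the replication factor, and the machine counts are assumptions lifted from that comment, not measurements.

          # Rough capacity estimate for pooling idle desktop space.
          # Free space per box and the replication factor are assumed figures.
          def usable_capacity_gb(machines, free_gb_per_machine, copies):
              """Raw free space divided by the number of copies kept of each block."""
              return machines * free_gb_per_machine / copies

          for machines in (100, 1000):
              raw_tb = machines * 50 / 1000                            # ~50GB of idle space per box
              usable_tb = usable_capacity_gb(machines, 50, 5) / 1000   # keep ~5 copies (4x overhead)
              print(f"{machines} boxes: {raw_tb:.1f}TB raw, ~{usable_tb:.1f}TB usable")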

    • by arivanov ( 12034 )
      There are a number of clustered storage apps that operate on a P2P basis with an N:M redundancy model. Just do an internet search and choose your poison. None of them offers amazing performance, but the actual availability often exceeds what you get from an average SMB Winhoze server.
  • by SiegeTank ( 582725 ) on Saturday February 09, 2008 @10:38AM (#22359716)
    ...just in case your connection fails.
    • Because he will run out of room very fast I recommend downloading all the pron sites first. That way if the internet is down at least he still has porn.
  • not enough info (Score:2, Interesting)

    by YrWrstNtmr ( 564987 )
    Is this a company, college, or just a random collection of boxes in your mom's basement? What function does your organization want to do that it can't because of a lack of a few terabytes? What does the actual owner of these boxes have to say about your little enterprise?
    • by BeanThere ( 28381 ) on Saturday February 09, 2008 @10:47AM (#22359772)
      100 computers in his mom's basement? That's a big basement.
      • by Chas ( 5144 )
        Not really. A hundred mid-towers can fit on a few large three-tier roll-away carts.

        This'll fill a small room. But an entire basement? Not really.

        The REAL problem becomes supplying power for all of them without constantly blowing a breaker, setting the house on fire (through electrical or thermal means), or cooking the systems by producing too much heat. ....

        Not that I've ever tried such a thing...no...not me...

        *Whistle*
    • It's a botnet. Serving trojans, viruses & spam doesn't make full use of the hard drives.
  • by line-bundle ( 235965 ) on Saturday February 09, 2008 @10:42AM (#22359728) Homepage Journal
    You could try to use something like "Localhost Azureus" for distributed data storage. The only problem will be that it will cost you in terms of processor and network hogging.

    Is it cost-effective to reclaim that (small) space? Probably not. My suggestion is to accept that no one tries to save clock cycles any more, and disk storage is probably heading the same way.
  • by eebra82 ( 907996 ) on Saturday February 09, 2008 @10:42AM (#22359730) Homepage
    It's a very interesting question, but from my point of view, hard drive space is so ridiculously cheap nowadays that it is utterly pointless to look for a useful application that will fill it up.

    Let's assume that the average computer has 80 GB of storage. Multiply that by 100 and you get 8 TB of space. That's what you can fit into one or two computers nowadays without shelling out too much cash.

    What's more interesting is how much processing power you have as well as how fast the internet connection is.
    • by jaxom ( 90814 ) on Saturday February 09, 2008 @11:05AM (#22359888) Journal

      I disagree with this and face this question all the time at work. Disks are cheap; storage systems aren't. If this is for a business that requires reasonable uptime, then the only real solution would be to implement a SAN using Fibre Channel or iSCSI and then take out the drives. With the right array, all of a sudden those drives become superfluous (you decide if boot-from-SAN is right for you), management is easier, and you'll be able to get a lot of reuse out of the drives.

      Now, a lot of people will question the cost of doing all of this, and it isn't cheap; however, you have to analyze the numbers correctly. We migrated 200 servers from DAS to a SAN and had our money back within 12 months. Add the move to VMs on top of that, and all of a sudden those 200 servers went down to 20. That's a big difference in cost of ownership.


      • Disks are cheap, storage systems aren't.

        No, SOME storage systems are expensive.

        You can pretty easily put together an inexpensive SATA array with multiple terabytes of storage. To anyone who thinks Fibre Channel or iSCSI is just a million times better or more reliable than SATA, I'd say you're being sold a bill of goods by your vendor. Unless you have very high performance needs, like, say, a database being hit by thousands of people, SATA will serve you just fine.
    • Re: (Score:3, Insightful)

      by STrinity ( 723872 )
      The solution is obvious -- the company should have just one or two 80 gig hard drives that employees connect to via Unix terminals.
    • by LWATCDR ( 28044 ) on Saturday February 09, 2008 @01:01PM (#22360896) Homepage Journal
      Yep, a better question is: why do all these PCs have hard drives?
      If they are really only used for the OS, a few applications, and a few docs, why not use diskless workstations?
      Less power, less heat, and fewer things to break.
      In other words, don't use all those drives; get rid of them.
  • GlusterFS (Score:3, Informative)

    by Anonymous Coward on Saturday February 09, 2008 @10:43AM (#22359740)
    Check out GlusterFS. (http://www.gluster.org)

    You definitely can't run Windows to use this, but it should take minimal effort to set up a quick netboot lab to test it with.

    Cheers.
    • Check out GlusterFS. (http://www.gluster.org)

      You definitely can't run Windows to use this, but it should take minimal effort to set up a quick netboot lab to test it with.

      One could envision setting up small VMWare Player instances running under a different account on Windows launch using "Scheduled Tasks" for that account (set to launch on reboot). Or - run VMWare Player as a service. A little beefier would be VMWare Server (free), but a bit more of a hassle (need to also install IIS on eac

  • by kipin ( 981566 ) on Saturday February 09, 2008 @10:44AM (#22359750) Homepage
    I had a drive fail on me last year and I wanted to take my frustration out on it, so naturally I did what any good American would do: I shot the shit out of it. Surprisingly, it seemed to make for a pretty good piece of bulletproof armor. It stopped multiple full-metal-jacket 9mm rounds and managed to get a couple lodged inside the casing. (None appeared to penetrate fully.)
    • by eagl ( 86459 ) on Saturday February 09, 2008 @11:38AM (#22360124) Journal
      The drive survived because the 9mm is weak. Get a better gun using a better round, like .40 cal or even a good old .45.

      I've had a chance to read after-action reports from Iraq and Afghanistan, and the 9mm is pretty much a joke. Most of the forces that really rely on handgun stopping power have obtained emergency authorization to bypass normal procurement processes in order to get better handguns using better ammunition. To my knowledge, a modern .45 is considered one of the best alternatives.
      • Re: (Score:3, Informative)

        by Firethorn ( 177587 )
        Nahhh...

        Remember, pistol rounds are pistol rounds, and rifle rounds are rifle rounds.

        Next time he should test it with pretty much any centerfire rifle.
      • Re: (Score:3, Funny)

        by dlapine ( 131282 )
        Well, I know that .45 ball ammo won't penetrate a Maxtor 40 GB drive casing; it just makes a nice big dent with a nicely mushroomed round. Fired the round myself; try it out. We had a guy with a .44 Magnum and his shot punched clean through the spindle. We had a chronograph at the range that day, and the .45 was doing about 900fps. No holes in the drive though.

        Now, that doesn't mean that a .45 doesn't have more stopping power than 9mm, just that it wouldn't penetrate the aluminum casing of a hard drive. Fortu

  • Sanmelody (Score:4, Informative)

    by theoverlay ( 1208084 ) on Saturday February 09, 2008 @10:45AM (#22359758)
    DataCore offers software called SANmelody that turns servers into a cheap storage network, and there are other vendor solutions as well. http://infiniteadmin.com/ [infiniteadmin.com]
  • AFS (Score:5, Informative)

    by arabagast ( 462679 ) on Saturday February 09, 2008 @10:48AM (#22359780) Homepage
    OpenAFS [openafs.org] is a distributed file system. It seems to fit your bill. No personal experience, so don't know how well it actually works.
    • by Monx ( 742514 )
      IBM uses AFS internally. It works. Use it.
    • Re: (Score:2, Insightful)

      AFS would be applicable if you were interested in turning each end-user workstation into a centrally managed AFS server and dedicating storage to holding replicated read-only volumes. I wouldn't store single-instance read-write volumes on a machine that is at the mercy of an end user to turn on or off. I would also be resistant to deploying centrally managed storage on end-user-controlled machines in any case, due to the access control issues. Anything that is stored on a machine that the end user has physical
  • Solution for Linux (Score:2, Informative)

    by Anonymous Coward
    There's a project dedicated to this on Linux: http://nbd.sourceforge.net/ [sourceforge.net].

    If there's nothing similar for windows, you might be able to run it through cygwin.

    Actually, this claims to run on Windows: http://www.vanheusden.com/Loose/nbdsrvr/ [vanheusden.com]
    • nbd is nice for some stuff but lacks fault-tolerance. Of course, you can run RAID, possibly several levels (say, a raid-6 on top of raid-1 or something) on top of nbd devices to trade space for fault-tolerance as much as you want, but you still lack flexibility. The advantage to RAID-over-nbd, on the other hand, is of course that you can do that right now if you want :] (And yes, the nbd server shouldn't be overly hard to run on Windows, one would think; it's rather simple...)

      A better solution would work o
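
      To make the RAID-over-nbd idea above concrete, here is a minimal sketch that attaches a handful of nbd exports and assembles them into a single RAID 6 array. The hostnames, port, and mount point are invented for illustration, and it assumes each box is already running nbd-server with a spare partition exported.

        #!/usr/bin/env python3
        """Sketch: pool nbd exports from several desktops into one RAID 6 array.
        Hostnames, port, and paths are illustrative assumptions only."""
        import subprocess

        HOSTS = ["desk01", "desk02", "desk03", "desk04", "desk05", "desk06"]
        PORT = "2000"   # port the (assumed) nbd-server exports on

        def run(cmd):
            print("+", " ".join(cmd))
            subprocess.run(cmd, check=True)

        devices = []
        for i, host in enumerate(HOSTS):
            dev = "/dev/nbd%d" % i
            # Attach the remote export as a local block device (classic host/port syntax).
            run(["nbd-client", host, PORT, dev])
            devices.append(dev)

        # RAID 6 tolerates any two desktops being switched off or crashing.
        run(["mdadm", "--create", "/dev/md0", "--level=6",
             "--raid-devices=%d" % len(devices)] + devices)
        run(["mkfs.ext3", "/dev/md0"])
        run(["mount", "/dev/md0", "/mnt/pooled"])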

  • I've been thinking of the same thing of late. Our IT department uses this huge SAN that costs $$$. Why couldn't a distributed, fault-tolerant filesystem (something like striping with parity) be implemented across a LAN with 100Mb/GigE? The standard drive size being shipped on new PCs is at a minimum about 200GB. For biz users that is WAY overkill.

    Our whole organization is about 1000 Windoze desktops, but I'd like to try it in our local workgroup first (maybe 20 systems). I looked around but couldn't find anything
  • Storage (Score:2, Informative)

    by Genocaust ( 1031046 )
    I tried to tout the merits something like this could have for non-critical regular user backups, but as previous posters mention, it was shot down.

    I was suggesting to run DrFTPD [drftpd.org] as a backend with NetDrive [american.edu] as an access medium. It looks good on paper, but I've never had the chance to apply it so widescale :)

    With DrFTPD it's easy to setup whatever kind of redundancy you would want, ie: "at least 3 nodes will mirror all files in /doc" or whatever. NetDrive (and I'm sure there are others) help take away th
  • What would be a productive use for these terabytes of wasted space?

    The first question to ask is whether what you want to do makes any sense for your employer. Who has to maintain this beast once you build it?

  • dCache (Score:3, Interesting)

    by Rev Saxon ( 666154 ) on Saturday February 09, 2008 @10:56AM (#22359830) Homepage
    http://www.dcache.org/ [dcache.org] You will need a system to act as a master, but otherwise your normal nodes should work great.
  • We have a few compute nodes around here. Each of them has an HD, and as those are so cheap we gave them 500GB ones.

    They don't really need lots of space (maybe 30GB for the OS and temp files); OTOH, without redundancy the other 450GB are worthless.

    As the task is embarrassingly parallel, network traffic wouldn't be a problem.
    If there were a solution to combine all this storage (it doesn't even have to be transparent) into a distributed, redundant storage network, I could surely make use of those terabytes.
    • To add to this:

      What I imagine doesn't need to be low-level.

      Just a userland application with container files would be fine:

      The nodes listen to each other, and each file gets replicated on every node. As the disks fill up, copies are purged down to a minimal redundancy level.

      Even the factor-of-2 loss from non-parity redundancy would still be a lot better than not using the space at all.
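
      A minimal sketch of the replicate-everywhere-then-purge policy described above; the node names, capacities, watermark, and minimum copy count are made-up illustration values, not a real protocol.

        # Sketch of "replicate on every node, purge down to a minimum" for container files.
        MIN_COPIES = 3          # never purge below this many replicas (assumed threshold)

        class Node:
            def __init__(self, name, capacity_gb):
                self.name = name
                self.capacity_gb = capacity_gb
                self.files = {}                     # container file name -> size in GB

            def used_gb(self):
                return sum(self.files.values())

            def is_full(self, watermark=0.9):
                return self.used_gb() > watermark * self.capacity_gb

        def replicate(nodes, filename, size_gb):
            """Initially put a copy of the container file on every node with room."""
            for node in nodes:
                if not node.is_full():
                    node.files[filename] = size_gb

        def purge_to_minimum(nodes):
            """As a node fills up, drop surplus copies, keeping at least MIN_COPIES."""
            for node in (n for n in nodes if n.is_full()):
                for filename in list(node.files):
                    copies = sum(1 for n in nodes if filename in n.files)
                    if copies > MIN_COPIES:
                        del node.files[filename]
                    if not node.is_full():
                        break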
  • Backup (Score:2, Informative)

    by m0pher ( 1236210 )
    If you don't already have a backup mechanism for the data that may be on these systems, one way to use all the available storage is for backup. Vembu StoreGrid is a solution designed specifically for this problem. Get more info at http://www.vembu.com/ [www.vembu.com].
  • by pedantic bore ( 740196 ) on Saturday February 09, 2008 @11:03AM (#22359882)

    You might want to ask yourself why, after more than a decade of research and countless papers and prototypes that address this problem, your PCs' storage is still underutilized...

    It's harder than it looks to get something reliable. Your PCs have extra capacity because it's cheap, but mining that capacity is not cheap. As other posters have pointed out, putting together (or just purchasing) a server with a few TB of storage is simpler, cheaper, less prone to getting wiped out by a virus, and easier to manage and back up.

  • While I was in college, I worked in the IT department. In my experience, your end-users will have a proverbial shit-fit if their computer's HD starts spooling up when they aren't doing anything. While it would be nice to use the spare space for data storage, I'm not sure it would be worth the headache. The volume of user complaints would skyrocket, you'd have to train them to leave the things on all the time, and you'd have a distributed data pool to manage. Changing user behavior is like teaching a two-yea
  • Storage at Desk (Score:2, Informative)

    by phooji ( 1236218 )
    Storage at Desk is a project at the University of Virginia that tries to do exactly what you describe: take unused storage on a bunch of machines and turn it into a file system. http://vcgr.cs.virginia.edu/storage_at_desk/index.html [virginia.edu]
  • My idea was to use a software driver to export the spare part of the disk as an iSCSI (or iATA, if you prefer) target. For performance and integrity, you'd probably be better off having a dedicated partition the OS couldn't easily fiddle with, but it shouldn't be too hard to create an array of ~50GB iSCSI targets that you could then collate into larger volumes. Performance wouldn't be stellar, unless you could use a dedicated NIC/VLAN on the hosts, but should be reasonable enough for use as nearline storage of non-cri
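
    For the iSCSI variant, a rough sketch of the collation step (Linux-side) might log into each desktop's target and glue the targets together with LVM. The portal addresses, volume names, and target layout are invented examples, and it assumes each desktop already exposes its spare ~50GB partition as an iSCSI target.

      #!/usr/bin/env python3
      """Sketch: log into small iSCSI targets on many desktops and collate them
      into one LVM volume group. Portal IPs and names are invented examples."""
      import subprocess

      PORTALS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]   # desktop IPs (examples)

      def run(cmd):
          print("+", " ".join(cmd))
          return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

      devices = []
      for portal in PORTALS:
          # Discover whatever target the desktop advertises and log into it.
          out = run(["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal])
          iqn = out.split()[-1]                            # last field of the output is the IQN
          run(["iscsiadm", "-m", "node", "-T", iqn, "-p", portal, "--login"])
          devices.append("/dev/disk/by-path/ip-%s:3260-iscsi-%s-lun-0" % (portal, iqn))

      # Collate all the small targets into one big logical volume.
      run(["pvcreate"] + devices)
      run(["vgcreate", "pooled"] + devices)
      run(["lvcreate", "-l", "100%FREE", "-n", "scratch", "pooled"])
      run(["mkfs.ext3", "/dev/pooled/scratch"])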

  • Isn't this something Google either has already done, or *should* do? Google Distributed File System... GDFS. It has the added benefit of also being a curse if it goes wrong. Seriously, isn't this an ideal project for Google? And if they've already done it, is it available for implementation by everyone else?

    I'd like to see some sort of distributed filesystem as a standard installation option in a linux distribution... The question would be something to the effect of "would you like your computer to fi
    • Whoops, should have "googled" this first. Here it is: the Google File System.

      http://labs.google.com/papers/gfs.html [google.com]

      The big questions of course are is it usable by regular people, and is anyone actually working on implementing and including this in any of the major operating systems?
      • by allenw ( 33234 )
        Google hasn't released anything other than papers on GFS and their implementation of MapReduce. At this point, though, I'm not sure it matters, since we have Hadoop [apache.org], which (being mainly Java, C, and a little bash) runs perfectly fine on all of the major operating systems, including Windows.
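
        As a taste of what using Hadoop for this would look like, a minimal sketch is below. It assumes a running HDFS cluster whose datanodes are the desktops, and the file name, destination path, and replication factor are just examples.

          #!/usr/bin/env python3
          """Sketch: push a file into an HDFS pool built from desktop datanodes.
          Assumes the hadoop command is on PATH; paths and replication are examples."""
          import subprocess

          def hdfs(*args):
              cmd = ["hadoop", "fs"] + list(args)
              print("+", " ".join(cmd))
              subprocess.run(cmd, check=True)

          # Keep three copies so a couple of desktops can be off at any given time.
          hdfs("-D", "dfs.replication=3", "-put", "backup-2008-02-09.tar.gz", "/pool/")
          hdfs("-ls", "/pool")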

  • What you're talking about is not a new concept; it just turns out to be really hard to build in a useful way. The most comprehensive discussion of the problems involved can be found at the Microsoft Research project Farsite [microsoft.com].

    The short version of the problem is that the level of service you can expect from each system is incredibly variable, so it's hard to offer a meaningful QoS for the system as a whole. It's not quite as bad as the distributed-hash-table problem (a.k.a. P2P file storage), but it's sti
  • I know little about hardware, so forgive a stupid question: would it make any sense to pull out these computers' drives, replace them with smaller ones, and either sell the lot or assemble them in one place (a RAID?) for easier maintenance? Having your storage spread out through a company becomes a problem if one computer goes down (or is turned off by its user).

    I know the cheapness of drives may make this silly.
  • Birth of the Matrix? (Score:5, Interesting)

    by TropicalCoder ( 898500 ) on Saturday February 09, 2008 @11:41AM (#22360146) Homepage Journal

    What would be a productive use for these terabytes of wasted space?

    Well, I had this idea when I read about some Open Source software that allowed distributed storage (sorry, forgot what that was, but by now I am sure it has already been mentioned in this discussion). The idea was this - suppose we have such software for unlimited distributed storage, so that people can download it and volunteer some unused space on their HD for a storage pool. Then suppose we have some software for distributed computing like we have for the SETI program. Now we have ziggabytes of storage and googleplexflops of processing power; what can we do with that? How about, for one thing, storing the entire internet (using compression, of course) on that endless distributed storage, and then running a decentralized, independent internet via P2P software? The distributed database could be constantly updated from the original sources, and the distributed storage then becomes in effect a giant cache that contains the entire internet. Now we could employ the distributed computing software to datamine that cache, and we could have searching independent of Google or Yahoo or M$FT. Beyond that we could develop some AI that uses all that computing power and all that data to do... what? - I'm not sure yet. Just thought I would throw this out there to perhaps get stepped on, or who knows, inspire further thought.

  • There are two companies out there that may be able to do what you need:

    http://www.seanodes.com/ [seanodes.com]

    http://www.revstor.com/ [revstor.com]

    Both claim to be able to pool unused storage on desktops and application servers and make it available to hosts on the network.

  • Why do desktops in a work environment need local hard drives anyway? My Windows folder (created Sunday, Nov 10, 2002) is about 4GB. A 4GB SD card is about $30, and a lot of RAM would eliminate the need for a swap file. Basically the only bottleneck is the \temp folder, and there may be a way to handle that with a ramdrive as well. My company requires all user storage to be on a network server, although that's not really enforced.

    The answer, of course, is that there are a lot of business applications that
  • "Waste" the space. It's not worth it. Once you start doing this, the increased load on cheap desktop drives is going to lead to a several percent per year failure rate increase. It's probably not worth your time. If you want to store a few terabytes of data at much higher performance than this, spend a few hundred bucks on two or three modern drives and a SATA multiplexer.

    Unless you like re-building machines with dead disks it's just not worth it.
  • There's "full" as in "If you put any more crap in this box, I can't get to the crap that's already in there."
    And then there's "full" as in "Hey, I can cram more crap into that box!"

    I need some of that space to defrag the HDDs on my windows box.
    Now, if only there was some filesystem whose performance didn't degrade over time due to fragmentation...

  • Users trying to be productive at their workstation really don't need additional slowdowns happening because someone elsewhere is accessing their hard-drive.

    Not so long ago MS did an update that brought my system to a usability halt, as it was using all the additional drive space to cache the update on my own system.

    When it stopped, even the cache didn't show this usage, but I was able to determine it was all in the cache and had to learn about clearing the cache, not by delete but by ctrl-del to r
  • by IanDanforth ( 753892 ) on Saturday February 09, 2008 @05:04PM (#22362854)
    Having tried this in college, I can tell you a couple things.

    1. You will noticeably reduce the lifespan of the disks. (This can anger cost-conscious supervisors.)

    2. Doing ongoing hardware maintenance, made necessary by that reduced lifespan, on closed boxes that other people are using is a *serious* pain.

    Dedicated storage setups make hot-swapping disks easy; trying to do this with full-blown systems just gets tiresome. The solution I eventually came up with was the following.

    Implement a two tiered hardware replacement cycle where you reduce the time a user is allowed to keep any hard drive in their box before replacement. Then using the still reasonably good drives, create a centralized storage solution in which the drives can live out the rest of their useful spans. Data security, user happiness, and redundancy are all good selling points of this system. You still have to deal with monkeying around in user boxes but if it's on a schedule and it nets you more drives, it's not so bad.

    -Ian
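
    As a rough sizing of how a cycle like this feeds the central pool, assuming made-up figures for fleet size, tenures, and drive capacity:

      # Rough sizing of the two-tier replacement cycle described above.
      # Fleet size, tenures, and drive capacity are made-up example figures.
      fleet         = 100    # desktops in service
      desktop_years = 2      # how long a drive stays in a user's box
      pool_years    = 3      # useful life left once it moves to the central pool
      drive_gb      = 80

      retired_per_year = fleet / desktop_years
      pool_drives      = retired_per_year * pool_years
      print("~%d drives/year enter the pool" % retired_per_year)
      print("steady-state pool: ~%d drives, ~%.1fTB raw"
            % (pool_drives, pool_drives * drive_gb / 1000.0))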
  • Allmydata "Tahoe" (Score:3, Informative)

    by n6mod ( 17734 ) on Sunday February 10, 2008 @06:50PM (#22373820) Homepage
    I do some work for Allmydata, which is an online storage provider. Their next-gen storage technology is open source and nearly perfect for this application. It's a bit green at this point, but coming along nicely. http://www.allmydata.org/ [allmydata.org]
