Distributed Data Storage on a LAN?
AgentSmith2 asks: "I have 8 computers at my house on a LAN. I make backups of important files, but not very often. If I could create a virtual RAID by storing data on multiple disks on my network, I could protect myself from the most common form of data failure - a disk crash. I am looking for a solution that will let me mount the distributed storage as a shared drive on my Windows and Linux computers. Then when data is written, it is redundantly stored on all the machines that I have designated as my virtual RAID. And if I lose one of the disks that comprise the RAID, the image would automatically reconstruct itself when I add a replacement system to the virtual RAID. Basically, I'm looking to emulate the features of high-end RAIDs, but with multiple PCs instead of multiple disks within a single RAID subsystem. Are there any existing technologies that will let me do this?"
Expensive but reliable solution (Score:3, Interesting)
According to Pricewatch, the four 160 GB drives could be had for around $400 total, with about another $400 for the backup. Add a 3ware RAID controller for another $245 and you're looking at about $1,045 to convert a system into supporting 450 GB of usable network storage and backup.
From all indications, IDE hard drives are now the cheapest form of backup there is. I've looked at CD, DVD, and tape, but it keeps coming back to IDE hard drives. This is far cheaper than similar storage and backup would be on tape.
I can't believe... (Score:2, Interesting)
If all you're worried about is disk failures, mirror each disk locally. Disks are cheap, and real operating systems don't have any trouble with software mirroring.
Why would you want to make all of your machines suddenly non-functional, just because one of them lost a network card? Or because the switch failed?
Re:AFS (Score:5, Interesting)
It requires its own partition for each mount; you can't just share disks you've already got.
Setup also takes hours, and it probably won't work the first time. Online documentation is incredibly outdated, which doesn't help matters at all. It also takes a hefty chunk of computer to run it, because it requires a lot of watchdog type programs to fix the frequent corruption that happens to it as you use it.
The servers' clocks have to be matched exactly, so it's also best if you've got an NTP server running and clients on all the machines.
It's also about ten times slower than Samba (which you might use instead to share with Windows machines), and it chokes when you try to move/copy/delete large files.
I tried it for a month before it completely corrupted its own partition, and I switched back to NFS and Samba.
I can't wait for the day when these problems are but a memory and such a system works flawlessly.
Re:Most common form of data loss? (Score:3, Interesting)
AFAIK, there's at least one project out there to turn CVS into a filesystem, and a few others to add MVCC functionality to a filesystem (somewhat like the ClearCase filesystem does).
It's a good feature, something I'd want on my docs and code, and other specs, not necessarily on my pr0n and MP3s.
-Chris
New kind of network file system needed (Score:2, Interesting)
A new networked file system is needed. I am working on such a solution on my spare time (but it is still in the design phase).
The main idea is to unify cache and storage. This means that the least-used files are deleted when an account is running out of storage, but under the constraint that a minimum number of copies of each file is kept online. (Hence, data will propagate to the nodes that actually use it.) Upon a data request, the filesystem goes out and fetches the data, preferably in some P2P-like way where it is fetched simultaneously from all locations that have copies of that data.
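Since the design is still on paper, here is one hypothetical way the core eviction rule could look: plain LRU eviction, except that a file is skipped if deleting this copy would drop it below the global minimum replica count. All names here (`ReplicaAwareCache`, `replica_count`, etc.) are invented for illustration; a real node would ask its peers for replica counts over the network.

```python
from collections import OrderedDict

class ReplicaAwareCache:
    """Sketch of the 'unified cache and storage' idea: evict the
    least-recently-used files when a node runs out of space, but never
    delete a copy if that would violate the minimum replica count."""

    def __init__(self, capacity, min_replicas=2):
        self.capacity = capacity          # bytes this node may use
        self.min_replicas = min_replicas  # copies that must stay online
        self.files = OrderedDict()        # name -> size, in LRU order
        self.used = 0

    def replica_count(self, name):
        # Placeholder: a real system would query peer nodes here.
        return self.min_replicas + 1

    def access(self, name, size):
        """Touch a file; fetch (store) it locally if not cached."""
        if name in self.files:
            self.files.move_to_end(name)  # mark as recently used
            return
        while self.used + size > self.capacity:
            if not self._evict_one():
                raise IOError("nothing evictable; node is full")
        self.files[name] = size
        self.used += size

    def _evict_one(self):
        # Walk from least- to most-recently-used, skipping files whose
        # deletion would drop them below the replica minimum.
        for name in list(self.files):
            if self.replica_count(name) > self.min_replicas:
                self.used -= self.files.pop(name)
                return True
        return False
```

With this rule, popular data naturally accumulates on the nodes that read it, while the replica constraint keeps cold data from vanishing entirely.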
If someone knows a solution that already works something like this, please tell me.
File versioning useful, VMS variant not so sure (Score:3, Interesting)
Try #1:
DELETE FOO.TXT
This is really the wrong answer. If you have FOO.TXT;1 and FOO.TXT;2, then this command deletes FOO.TXT;2 and any attempt to access FOO.TXT will get you FOO.TXT;1.
Try #2:
DELETE FOO.TXT;*
This is the common recommendation, but you've now lost the ability to see any of the old versions.
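The two deletion behaviors described above can be modeled in a few lines. This is a toy sketch of VMS-style version semantics (not real VMS code): `delete` mimics `DELETE FOO.TXT`, which removes only the highest version and exposes the previous one, while `delete_all` mimics `DELETE FOO.TXT;*`.

```python
class VersionedDir:
    """Toy model of VMS-style file versions (FOO.TXT;1, FOO.TXT;2, ...),
    just to illustrate the deletion semantics discussed above."""

    def __init__(self):
        self.versions = {}  # name -> list of version numbers, ascending

    def write(self, name):
        # Each write creates a new highest-numbered version.
        vs = self.versions.setdefault(name, [])
        vs.append(vs[-1] + 1 if vs else 1)

    def delete(self, name):
        # DELETE FOO.TXT: removes only the highest version, so a plain
        # open of FOO.TXT now gets the previous version -- usually not
        # what the user intended.
        self.versions[name].pop()

    def delete_all(self, name):
        # DELETE FOO.TXT;*: removes every version, losing all history.
        self.versions[name] = []

    def latest(self, name):
        vs = self.versions.get(name, [])
        return vs[-1] if vs else None
```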
The GNU file utilities (and emacs and some other GNU programs) have a file versioning scheme which is somewhat similar to VMS but somewhat better. Look at commands like "VERSION_CONTROL=numbered cp foo bar".
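For the curious, the numbered-backup naming that GNU cp uses with `VERSION_CONTROL=numbered` (backups named `file.~1~`, `file.~2~`, ...) is easy to approximate. This is a sketch of the naming scheme only, not the actual coreutils logic:

```python
import os
import shutil

def numbered_backup(path):
    """Save a GNU-style numbered backup (path.~N~) of an existing file,
    roughly mimicking what VERSION_CONTROL=numbered cp does before it
    overwrites a destination. A sketch, not the real cp algorithm."""
    n = 1
    while os.path.exists(f"{path}.~{n}~"):
        n += 1  # find the first unused backup number
    backup = f"{path}.~{n}~"
    shutil.copy2(path, backup)  # copy contents and metadata
    return backup
```

Calling it before each overwrite leaves a trail of `foo.txt.~1~`, `foo.txt.~2~`, and so on, much like the VMS `;N` versions but opt-in per tool rather than enforced by the filesystem.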
Personally, I usually put things which matter in CVS, with the CVS server in a distant city (at an ISP which provides ssh shell accounts). That gives me off-site backups.
Re:You aren't gonna get a real RAID. (Score:2, Interesting)
Also, for those people concerned about leaving another "backup server" running 24x7, you can make use of the "wake on LAN" capability (available on many network cards and motherboards) to do backups. Just wake up (boot) the "backup server", do your backup, and then shut it down. It's way cool to remote-boot home servers.
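The wake-up step is just a UDP broadcast of a "magic packet": six 0xFF bytes followed by the target's MAC address repeated 16 times. Here is a minimal sketch; the MAC in the test is made up, and port 9 (discard) is the conventional but arbitrary choice:

```python
import socket

def magic_packet(mac):
    """Build a Wake-on-LAN magic packet: 6 x 0xFF, then the 6-byte
    MAC address repeated 16 times (102 bytes total)."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def wake_on_lan(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet so the sleeping NIC can hear it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic_packet(mac), (broadcast, port))
```

A cron job could call `wake_on_lan(...)`, wait for the server to boot, run the backup over the network, and then ask the server to shut itself down.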
Here, the only real issue is the power/thermal cycling of the hard disk once a day (or whatever), which might be a problem since many disks now tend to come with only a one-year warranty. However, this isn't all that different from a regularly-used PC.
Re:Coda (Score:3, Interesting)
Seriously, I looked into Coda a couple months ago and the design looks really cool, but it just doesn't seem to work very well unless you're only storing tiny text files. It also doesn't scale very well on large servers (i.e. it has a maximum limit on the number of files on each volume). Don't get me wrong, I REALLY wanted to use Coda because I liked the idea of it -- I just wish that it worked better. Ended up going back to NFS (yuck!).
Re:Most common form of data loss? (Score:3, Interesting)
That feature doesn't need to be in the kernel, since it can easily and transparently be provided in user space.
If you like, you can enable this right now using a simple hack on top of PlasticFS [sourceforge.net] or your own, custom LD_PRELOAD hack.
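A real LD_PRELOAD hack would be written in C against open(2), but the interception idea is easy to show in a few lines of Python. This is purely illustrative (the `versioned_open` name and the VMS-flavored `;N` suffix are invented here): before a file is opened for writing, the current contents are snapshotted aside.

```python
import os
import shutil

def versioned_open(path, mode="w"):
    """Sketch of user-space file versioning: before truncating or
    appending to an existing file, copy it aside as path;N. An
    LD_PRELOAD hack does the same thing by interposing on open(2)."""
    if ("w" in mode or "a" in mode) and os.path.exists(path):
        n = 1
        while os.path.exists(f"{path};{n}"):
            n += 1  # next free version number
        shutil.copy2(path, f"{path};{n}")
    return open(path, mode)
```

Because it lives entirely above the kernel, a scheme like this can be enabled per-user or per-application -- which is exactly the point: the MP3-retagging problem below never arises unless you opt in for that directory.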
Providing file versioning in the kernel or enabling it globally in some other form has not caught on because it is a huge hassle and causes lots of problems, even in systems that know about it.
For example, when you retag one MP3, do you want to keep an old version? What about if you retag your entire 50G collection of MP3s?
The default of not versioning files in UNIX works better. Versioning is highly application-dependent: Emacs, OpenOffice, cvs, and other tools do the right thing, and they do it much better than anything the kernel could ever hope to do.
Re:Win2k (Score:2, Interesting)
The concept of giving all users read/write access was thought up later on, and it happens to work; but as you say, if two users update the same file, you may (or will) lose data.