Slashdot
Distributed Filesystems for Linux?
from the one-filesystem-many-hard-drives dept.
Zoneball looked at three distributed filesystems; here are his thoughts:
"OpenAFS was the solution I chose because I have experience with it from college. For performance, AFS was built with an intelligent client-side cache, but it did not handle network disconnects nicely. But there are other alternatives out there.
Coda appears to be a research fork from an earlier version of AFS. Coda supports disconnected operations. But, the consensus on the Usenet (when I looked into filesystems a while ago) was that Coda was still too 'experimental.'
InterMezzo looks like it was started with the lessons learned from Coda, but (again from Usenet) people have said that it is still too unstable and it crashes their servers. The last 'news' on their site is dated almost a year ago, so I don't even know if it's being developed or not."
So if you were to recommend a distributed filesystem for Linux machines, would you choose one of the three filesystems listed here, or something else entirely?
NFS (Score:4, Informative)
(Last Journal: Sunday April 11 2004, @07:41PM)
NFS Linux FAQ [sourceforge.net]
Howto #1 [sourceforge.net]
Howto #2 [linux.org]
If you find yourself needing help, try asking people at Just Linux forums [justlinux.com], or trying the NFS mailing list [sourceforge.net].
Re:NFS (Score:4, Informative)
(http://entropy.homelinux.org/)
It takes about 5 minutes to get an understanding of what you need. After setting it up it just works.
NFS is a great
Re:permissions? (Score:5, Informative)
(http://phorm.phormix.com/ | Last Journal: Monday May 19 2003, @12:08PM)
NIS == "Hack me please" (Score:5, Interesting)
(http://openconnector.org/ | Last Journal: Thursday December 11 2003, @08:15PM)
Other options like LDAPS and Kerberos offer at least some form of security.
ypcat, then a brute force attack on the resulting passwd file, is as old as dirt, and sadly still works. I was a bit disappointed when I saw NIS as a required service on the Red Hat cert syllabus.
This may sound harsh, but I don't think there is much excuse for running NIS in this day and age. Anyone who does this in an environment where security is a concern deserves what they get.
Re:permissions? (Score:5, Informative)
(http://www.golum.org/)
You could also look into LDAP (VERY complex, no good starting point that I've been able to find) and Kerberos auth methods as well.
Should give you a central point for uids/usernames. But NFS does not have transparent mounting that I'm aware of so that you could mount, say the
<ECODE>
CPU1 contains:
CPU2 contains:
CPU3 contains:
on CPU4, you'd do the following:
mount CPU1:/home
mount CPU2:/home
mount CPU3:/home
And you'd end up with on CPU4:
/home/tic
</ECODE>
If there is a way to do this, please lemme know. I've heard people talk about it in the past, but haven't seen anything come of it yet.
Re:permissions? Automounter (Score:4, Informative)
(http://www.rage.net/~greg/)
Re:NFS (Score:5, Insightful)
(http://mnm.uib.es/gallir/)
NFS is not distributed, it's only "networked" or "remote". It doesn't support any of: replication, disconnection, sharing, distribution. It is centralised, and requires the same user name/number space and security everywhere.
In short, it's far from meeting the requirements, at least if you compare it with the filesystems listed in the question.
Re:NFS (Score:4, Informative)
(http://mnm.uib.es/gallir/)
Disconnection in a DFS means a certain degree of replication: you are still able to work on your files even if you have no access to your repository, or you are off-line. Autofs doesn't do that. Although you can use some rsync scripts to partially solve the problem, that's not a scalable or viable workaround for several users.
NIS, on the other hand, is not a good solution for WAN connections or different networks. Should you use this kind of solution, I'd take a look at OpenLDAP instead.
Re:NFS (Score:4, Informative)
That's what NIS is for. Furthermore, the flexibility of being able to set up machines with different views of the network is crucial in many applications. None of my workstations or servers actually have the same mount tables: they all get some stuff via NIS, and some stuff is modified locally. The restrictions AFS imposes are just unacceptable.
AFS makes administration tremendously easier after one's scaled the initial learning curve.
AFS is an administrative nightmare. Apart from the mess that ACLs cause and the problems of trying to fit real-world file sharing semantics into AFS's straitjacket, the number of machines wedged by overfull caches and its complete disregard for UNIX file system semantics cause no end of support hassles. Then, there is the poor support for Windows clients. We started out using AFS because it sounded good on paper, but it was a disaster in terms of support, and we got rid of it again after several years of suffering.
It performs far, far better than NFS on large networks (and merely somewhat better on smaller ones).
AFS's caching scheme works better than what NFS is doing for small files, but that case is fast and easy anyway. AFS's approach falls apart for just the kind of usage where it would matter most: huge files accessed from many machines.
Both NFS and AFS have very serious problems. But between the two, NFS is far simpler than AFS, is easier to administer in complex real-world environments, respects UNIX file system semantics better, and works better with large files. I can guardedly recommend NFS or SMB ("there is nothing better around, so you might as well use it"), but I can't imagine any environment for which AFS is a reasonable choice anymore. The only thing AFS ever had going for it as far as I'm concerned is that it was fairly secure at a time when NFS had no security whatsoever, but that is not an issue anymore.
Re:NFS (Score:4, Funny)
(http://slashdot.org/)
Wouldn't it be simpler and easier to manage if users had to sign up for computer time on a mainframe? Just think: you would only have to support one system! The benefits to security and maintenance would be enormous. Letting users have their own computers seems nice, but since it requires less planning and thinking (as a mainframe timeshare system requires) it will always become unmanageable. After all, there's no way to plan for the use of advanced tools. Why do you think many larger 1970s corporations running large computer implementations have a policy of not allowing any employee to access the mainframe without signing up first?!?
(Note for the humour impaired: I'm parodying the above author's style.)
Symlinks are your friend! (Score:4, Insightful)
(Last Journal: Wednesday March 02 2005, @11:08PM)
But for a lot of applications, you simply don't need that much, and you've got some way to contain the security risks, and NFS can be enough. It's easy enough to set up, and if all you're *really* trying to do is make sure that everybody sees their home directory as /home/~user, and sees the operating system in the usual places and the couple of important project directories as /projecta and /projectb, NFS with an automounter and a bunch of symlinks for your home directories is really just fine. They hide the fact that users ~aaron through ~azimuth are on boxa and ~beowulf through ~czucky are on boxbc etc. And yes, there are times you really want more than that, and letting your users go log onto the boxes where their disk drives really are to run their big Makes can be critical help. But for a lot of day-to-day applications, it really doesn't matter so much.
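A minimal sketch of the automounter half of that setup, in Python: it generates the lines of an autofs-style indirect map (one `key location` entry per user) from a table of which box holds which home directory. The hostnames boxa and boxbc come from the post above; the export root `/export/home` and the function name `build_home_map` are assumptions made up for illustration.

```python
def build_home_map(user_to_server, export_root="/export/home"):
    """Return the text of an autofs-style indirect map for home directories.

    Each line has the form "key<TAB>server:path", so that a lookup of
    /home/<key> gets mounted from whichever box really holds the data.
    export_root is a hypothetical path, not something the post specifies.
    """
    lines = []
    for user in sorted(user_to_server):
        server = user_to_server[user]
        lines.append(f"{user}\t{server}:{export_root}/{user}")
    return "\n".join(lines) + "\n"


# Hypothetical user placement, following the example in the post.
placement = {"aaron": "boxa", "azimuth": "boxa", "beowulf": "boxbc"}
print(build_home_map(placement), end="")
```

The generated text would typically be saved as a map file and referenced from the automounter's master map, so users only ever see /home/<user> regardless of which server is behind it.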
Re:NFS (Score:5, Interesting)
The only file system that is truly distributed, has a global namespace, replication, and fault tolerance is AFS.
NFS is pretty much the same as CIFS for Windows. And, version 4 still doesn't have global namespace and volume location.
So, NFS can't be a common answer because it isn't even allowed to be in the game.
+4 cents.
NFS is not even close to secure (Score:5, Interesting)
That's like saying "jumping off a cliff is not the most intelligent thing to do." NFS is easily the LEAST secure option of ANY filesharing system.
NFS is only appropriate on a 100% secured (physical and network-level) network. If anyone/someone can plug in, forget it. If anyone has root on ANY system or there are ANY non-unix systems, forget it. If ANY system is physically accessible and can be booted off, say, a CDROM, forget it. The only major security tool at your disposal is access by IP, which is pathetic. Oh, and you can block root access.
Even though you can block root access for some/all clients, it's still massively insecure, and this remains NFS's greatest problem. You have zero way of authenticating a system. NFS is like a store where you could walk in, pick up any item you wanted, and say "I'm Joe Shmoe, bill me for this!" and they'd say "Right-o!" without even looking at you. All systems with the right IPs are explicitly trusted, and their user/permissions setups are also explicitly trusted.
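The trust model described above can be sketched as a toy in Python. This is not NFS code, just a model of the decision being criticised: the server checks the source address against an allowed network, optionally remaps root, and otherwise believes whatever uid the client claims. The names `Request`, `effective_uid`, and `NOBODY_UID` and the example addresses are all assumptions for illustration.

```python
import ipaddress
from dataclasses import dataclass

NOBODY_UID = 65534  # conventional "nobody" uid; an assumption for this sketch


@dataclass
class Request:
    client_ip: str    # source address the server sees
    claimed_uid: int  # uid the client says the user has; never verified


def effective_uid(request, allowed_network, root_squash=True):
    """Return the uid the server would act as, or None if the host is refused."""
    network = ipaddress.ip_network(allowed_network)
    if ipaddress.ip_address(request.client_ip) not in network:
        return None  # the only real gate: access by IP
    if root_squash and request.claimed_uid == 0:
        return NOBODY_UID  # "blocking root access"
    return request.claimed_uid  # everything else is taken on trust


# Anyone with root on an allowed client can claim to be uid 1000.
print(effective_uid(Request("10.0.0.7", 1000), "10.0.0.0/24"))
```

In this model, squashing root does nothing to stop a client from claiming any other uid, which is the "I'm Joe Shmoe, bill me for this" problem.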
NFS is a pretty good performer, especially when tuned right and on a non-broken client (which Linux is VERY far from). However, its entire security model is in dire need of a complete overhaul. There needs to be a way to authenticate hosts, for one, more similar to WinNT's domain setup, which is actually incredibly intelligent (aside from the weak LANMAN encryption). The administrative functionality in NFS can't compare to the features that have been available to MacOS and Windows administrators for over a decade, and it's purely embarrassing.
Either that, or AFS/Coda need to get a lot more documentation and (for Coda) implementation fixes. The unix world desperately needs a good filesharing system...
Re:NFS is not even close to secure (Score:5, Informative)
(http://www.mixdown.ca/)
I use a very simple script to help keep NFS secure:
Basically it only allows incoming NFS-related connections over ipsec, dropping anything that is not. NFS port allocation is dynamic by default and I know you can force ports, but this seemed far easier to scale.
One thing I have noticed (and perhaps it's common knowledge to NFS experts) is that in order to get locking to work at all, my NFS clients had to be running statd and lockd. Without 'em everything worked but locking would fail every time.
Re:NFS is not even close to secure (Score:4, Interesting)
(http://www.vort.org/)
Of course, that doesn't mean it's a good idea. I think your solution with IPSec is much more elegant. Unfortunately, I happen to need to get through a heavily packet-shaped network that massively favors port 80, and drops random packets everywhere else. Not IPSec friendly at all. I avoid this by running multiple ppp/ssh tunnels through the retarded parts of the network and letting my gateway balance between them. Unfortunately, this requires privileged accounts on many, many boxes in odd places.
By the way, 10 points to any Northeastern University students who send polite, well considered complaints to Network Services. Not RESNet - they exist only to prevent you from talking to Network Services. Don't bother yelling at them - they exist specifically for that purpose. RESNet has no authority whatsoever to, for instance, allow CVS to work when Network Services decides to drop 90 percent of packets on port 2401. This is for your benefit - I'm perfectly happy with my tunnels.
Re:NFS is not even close to secure (Score:4, Informative)
By that you mean that it's easy to read stuff off people's directories if you can spoof their UID. Sure. I think you'll find the same is true on an SMB network.
> The administrative functionality in NFS can't
> compare to the features that have been available
> to MacOS and Windows administrators for over a
> decade,
Given that 10 years ago Windows for Workgroups had hardly been released and didn't even have TCP/IP by default, I think you are exaggerating a little bit. At the same time MacOS version 7 was the norm, and we all know how secure that one was, right?
Maybe NFS4 [samba.org] is your answer?
Re:NFS is not even close to secure (Score:5, Interesting)
(http://www.umich.edu/~bfields)
- NFSv4 home page [nfsv4.org]
- NFS Version 4 Open Source Reference Implementation [umich.edu], for Linux and OpenBSD
As part of University of Michigan/CITI's work on NFSv4, we're implementing rpcsec_gss on Linux [umich.edu], which uses Kerberos to authenticate every NFS request and reply. This applies equally well to earlier versions of NFS, and interoperates with other vendors' NFS implementations. While it's still not sufficiently tested for production use, the code is going into the 2.5 kernel series (thank-you, Mr. Torvalds, for accepting crypto into 2.5...) and is being actively developed. --Bruce Fields
Re:NFS (Score:5, Insightful)
(http://slashdot.org/)
'jfb
Nope, not NFS...yes AFS... (Score:4, Informative)
NFS is not secure. At most sites, NFS is exported read-only and limited to the domain, or to a given set of machine(s). If you export NFS as read/write then the client had better be secured, or you better use Kerberos, and for damn sure better be behind a firewall. NFS has no client side cache, no volume location service, no ACLs, no authentication (unless kerberized), no replication, yada, yada, yada. We've used NFS sparingly for over 15 years because we -know how it works, and know its limitations.
On the other hand, we set up an AFS cell for enterprise scale application and data sharing. It currently uses Kerberos V authentication, has volume replication, global namespace, client cache, fault tolerance. Users can set up their own groups, set their own ACL permissions. Did I say quota? AFS has per-user/per-volume quota. Hey, guess what, symbolic links work from any volume to any volume on AFS. And, AFS is just a simple daemon. You crank it up, mount the top of your cell and poof, you are done.
Another positive is the fact that once you set up an AFS cell you automatically become part of a larger community. Any AFS cell can mount the entire file system of another AFS cell within the same tree. I can for example mount many large university and government cells and share files. AFS allows Internet-wide file sharing with full security. On most versions of the client you can even enable encryption on the connection so your files won't be snooped easily.
All of our Solaris, Windows, Linux, and Mac boxes can use the same AFS tree without blinking an eye. We use AFS for many things. Before LDAP was really worth anything, we used AFS for simply exchanging read-only data. It -is- a replicated and global file system! Just put your config files in the tree and you are done.
If you are one of those people who are blinded by "always doing things one way", then I'd suggest you wake up and smell another technology, I did, and I liked what I got in return. Look into OpenAFS, you'll be glad you did.
+10,000 karma points!
Self Certifying File System (Score:5, Informative)
Re:Self Certifying File System (Score:4, Interesting)
(http://www.angio.net/)
Highly recommend checking it out. Mega convenient.
Well it depends... (Score:5, Informative)
Since OpenAFS [openafs.org] forked from the old Transarc/IBM codebase, it looks as if it has a real future. It's used by a load of educational and research institutions (notably CERN), as well as Wall Street firms.
NFS/BOOTP (Score:3, Informative)
(http://www.a2b2.com/)
Just my $0.02
Rus
Background on DFS (Score:5, Informative)
(http://lucidamerica.com/)
PVFS (Score:5, Informative)
(http://wang-fu.org/)
http://parlweb.parl.clemson.edu/pvfs/ [clemson.edu]
openmosix (Score:5, Informative)
(http://blog.peoplesdns.com/)
If you want to take a look..
http://lucifer.intercosmos.net/index.php [intercosmos.net]
linkage and I am going to be placing some tutorials up. -joeldg
Ye olde Samba (Score:4, Informative)
No need to unnecessarily complicate things here: Samba is simple to set up and functions great.
Re:Mirroring file system (Score:5, Informative)
(http://www.endpointcomputing.com/)
Unison will synchronize any two file trees in The Right Way (TM).
Get the gtk version for interactive conflict resolution.
Re:Mirroring file system (Score:5, Interesting)
(Last Journal: Thursday August 23 2001, @09:23PM)
Rsync is nice because you can update lots of files very quickly, as it only moves binary diffs between files. Also, if it is a costly network link, you have the option to specify max transfer rates, so you don't kill your pipe when it runs from your cron job.
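To make "only moves binary diffs" concrete, here is a deliberately simplified block-delta sketch in Python. It is not rsync's actual algorithm: real rsync uses a rolling checksum so it can find matching blocks at any byte offset, while this toy only compares fixed-size blocks. The function names and the tiny block size are assumptions chosen for illustration.

```python
import hashlib

BLOCK = 4  # unrealistically small block size, to keep the example readable


def signature(old, block=BLOCK):
    """Map the hash of each block of the old data to that block's offset."""
    return {hashlib.sha256(old[i:i + block]).hexdigest(): i
            for i in range(0, len(old), block)}


def make_delta(new, sig, block=BLOCK):
    """Describe new data as copies of known old blocks plus literal bytes."""
    delta = []
    for i in range(0, len(new), block):
        chunk = new[i:i + block]
        offset = sig.get(hashlib.sha256(chunk).hexdigest())
        if offset is not None:
            delta.append(("copy", offset, len(chunk)))  # receiver already has it
        else:
            delta.append(("data", chunk))               # must cross the wire
    return delta


def apply_delta(old, delta):
    """Rebuild the new data on the side that only holds the old data."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)


old = b"aaaabbbbccccdddd"
new = b"aaaaXXXXccccdddd"
delta = make_delta(new, signature(old))
print(delta)
```

Only the blocks tagged "data" would need to be transferred; the "copy" entries are just small references to data the other end already holds.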
Unison is nice because it is pretty smart about determining which files should be moved, and can correctly handle new and deleted files on either end of the link. Plus it supports doing all of its communication via ssh, so it's secure.
rsync [samba.org]
unison [upenn.edu]
The downside to both of these is that neither of them is instantaneous. However, I've had much success running both of these as often as every 5 minutes. Just make sure that you launch them from a script that is smart enough to check for already running instances before it starts trying to move data.
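One way such a wrapper could look, as a minimal Python sketch: it takes a non-blocking exclusive lock on a lock file and simply skips the run if a previous instance still holds it. The function name `run_exclusively` and the lock path and command in the usage comment are hypothetical; `fcntl.flock` is only available on Unix-like systems.

```python
import fcntl
import subprocess


def run_exclusively(lock_path, command):
    """Run command unless another instance already holds lock_path.

    Returns the command's exit status, or None if the run was skipped
    because a previous instance is still moving data.
    """
    lock_file = open(lock_path, "w")
    try:
        try:
            # Non-blocking: fail immediately instead of queueing up behind
            # a sync that is still in progress.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        return subprocess.call(command)
    finally:
        lock_file.close()  # closing the file releases the lock


# Hypothetical cron usage (paths and host are placeholders):
# run_exclusively("/tmp/sync.lock", ["rsync", "-a", "/data/", "backup:/data/"])
```

Because the lock is released when the file is closed, or when the process dies, a crashed run does not leave a stale lock behind the way a plain PID file can.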
Intermezzo does appear to be a current project (Score:5, Informative)
The sourceforge page for the project (http://sourceforge.net/projects/intermezzo) shows status as production/stable but the info there looks stale too.
Future obsolescence ? (Score:5, Insightful)
This guy must have installed too many versions of the same Microsoft products. ... You can still configure your networking using scripts for 2.0- or 2.2-based distros. You can often use 20 year old programs under Unix, albeit sometimes with some effort.
In the GNU/Linux world, BSD world, and to some extent in the entire Unix world, good designs do not become obsolete. Even not-so-good designs often stick around, for the sake of backward compatibility. In the newest greatest Linux kernel, you can still have a.out support, NFS, Minix, FAT16 filesystem support.
Only in the M$ world is obsolescence such a big issue, because that obsolescence is planned. In short, don't worry that much about obsolescence: if Coda is as good as it looks, it'll be there for a long time. If SomeCrappyDistributedFS FileSystem is used by enough users, it'll stay around for compatibility's sake anyway, even if it sucks.
NFS & autofs (Score:4, Informative)
(http://www.rage.net/~greg/)
-- Greg
None of the above (Score:4, Interesting)
(http://www.slightlymad.net/)
NFS is not a DFS (Score:5, Informative)
Obsolete ? (Score:5, Funny)
(Last Journal: Saturday July 03 2004, @08:34PM)
The best protection from future obsolescence is to use something that is already obsolete.
AFS vs NFS (Score:5, Insightful)
It's become such a part of my day to day life that I can't really describe the things I was missing before. The best things about it are probably the strong, flexible security and ease of administration. It also gives you everything you need from a small shop all the way up to a globally available decentralized data store.
There seems to be a good comparison here [tu-chemnitz.de]. I would strongly recommend AFS for all of your distributed filesystem needs. (The OpenAFS developers are cool too!)
Re:AFS vs NFS (Score:5, Informative)
You only have to wait for the first day you want to reboot a fileserver without breaking every system on your network or waiting for startup dependencies, etc... One day, I moved all of the volumes off of an active fileserver (i.e. volumes being written) and shut the thing down and moved it to another machine room, brought it back up, and moved the volumes back. The reads and writes continued uninterrupted, no clients had to be restarted, no hung filesystems anywhere, etc...
Tutorial (Score:5, Informative)
(http://www.thelinuxpimp.com)
The only trouble you might run into with the setup I used is some file-locking issues with programs wanting to share the same preference files.
Remote Synchronised filesystems (Score:3, Informative)
(http://danpat.net/)
I looked into a whole pile of options for having a "live" filesystem, a la NFS, but the bandwidth killed interactivity (this is for users who've never used 100 Mbit network filesystems before).
I found the following:
1. Windows 2000 Server includes a thing called "File Replication Service". Basically, it's a synchronisation service. You replicate the content to many servers, and the service watches transactions on the filesystem, and replicates them to the rest of the mirrors as soon as it can. You can write to all mirrors, but I never quite worked out how it handled conflict resolution.
A chapter from the Windows 2000 Resource kit that describes how it works: http://www.microsoft.com/windows2000/techinfo/res
2. Some people have done similar work for Unix systems, but it mostly involves kernel tweaks to capture filesystem events. Can't remember any URLs, but some Googling should find it.
3. Some people are using Unison to support multi-write file replication. So long as you sync regularly, you shouldn't have too many problems.
4. The multi-write problem is a hard one, so most people tend to say "don't do it, just make the bandwidth enough". This is the way to go if bandwidth isn't an issue.
A guy by the name of Yasushi Saito has done quite a bit of research into data replication (search for the paper titles on Google in quotes). He also put together a project called "Pangaea" which tries to do as described above. It wasn't great last time I looked. Some paper titles:
- Optimistic Replication for Internet Data Services
- Consistency Management in Optimistic Replication Algorithms
- Pangaea: a symbiotic wide-area file system
- Taming aggressive replication in the Pangaea wide-area file system
There is also a bunch of other research work:
- Studying Dynamic Grid Optimisation Algorithms for File Replication
- Challenges Involved in Multimaster Replication (note: this talks about Oracle database replication)
- Chapter 18 of the Windows 2000 Server manual describes the File Replication Service in detail
- How to avoid directory service headaches (talks about not having multi-master-write replication and why)
OpenAFS all the way (Score:5, Informative)