
Distributed Filesystem for Disconnected Operation? 58

juraj asks: "I'm trying to achieve the following setup: I have two offices connected via a relatively slow ADSL line, and I want a shared fileserver between the offices. I have a VPN using IPSec ready, so security is less of a concern, but simply mounting a filesystem (via Samba or NFS) from one office to the other is not a solution because of the speed. Also, the ADSL line is sometimes not only slow but disconnected. I've tried the CODA distributed filesystem to achieve replication, so that both offices have local copies of their files. The problem is that the CODA filesystem is just a research project: it is unstable, with the venus daemon constantly crashing, and sometimes, when recovering from the disconnected state, one side does not recognize the changes and they are simply not propagated. Have you had any good experiences with CODA? Which versions do you use? What kind of setup did you have? How is it configured? I've also heard about OpenAFS, but as with CODA, I've learned it is unusable in a real environment. Is there any real solution to my problem? Are there any decent, solid, free distributed file systems for Linux or the BSDs?"
  • This question (Score:4, Interesting)

    by Molina the Bofh ( 99621 ) on Tuesday April 13, 2004 @02:28AM (#8845186) Homepage
    is very good, and I was thinking of something like that for my mailservers. That way, I'd have 2 different machines in 2 locations, and [maildir] boxes in both. When a message arrives in either one, it is copied to the other. When one is erased, same thing.

    Both servers would run at the same MX, so users could choose server 1 or server 2 according to their location, and switch between them in case one network goes down.

    Although it sounds simple, I don't know of any simple solution to that. Rsync won't work, as there wouldn't be a master server. Both would have the same preference, so no server depends on the other. That's the goal.
    • Re:This question (Score:5, Informative)

      by DDumitru ( 692803 ) <doug@easycoOOO.com minus threevowels> on Tuesday April 13, 2004 @03:18AM (#8845401) Homepage
      Please excuse the ad here (mod down if you like).

      I developed a replicated filesystem that we use with our commercial email service. The filesystem is layered under UML (User Mode Linux) and cross-replicates files between two servers, one in California and one in Pennsylvania.

      I too looked at Coda and Inter-mezzo, but was not very satisfied with their stability and/or their ability to recover from outages.

      The replication that we use relies on the update nature of MailDir with Courier Imap.

      Our solution uses UML to post a transaction journal to the underlying host OS layer. Application-level code then cross-posts filesystem updates using HTTP transactions with curl and Apache/cgi. Transactions are delayed about 2 seconds to coalesce multiple updates into a single network event. In general, we get about 5 Mbit/s of update throughput coast to coast, and it is very rare that either system is more than a couple of seconds out of sync.

      I am sorry that I cannot give you the code. While the code is Linux-based, we don't actually sell (distribute) it, so we keep it in-house for our own use. Perhaps my description will give you some ideas.

      The email offering is described at:

      http://easyco.com/mail/index.htm
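The two-second coalescing delay described in the comment above can be illustrated with a small sketch. This is not the poster's code, which was never published. The class name `Coalescer`, its methods, and the rule that a later event for a path replaces an earlier one are all assumptions made for illustration. Timestamps are passed in explicitly so the logic can be followed without real sleeping.

```python
from typing import Dict, List, Optional, Tuple


class Coalescer:
    """Hypothetical helper: groups file-update events into one batch per window."""

    def __init__(self, window: float = 2.0) -> None:
        # The comment mentions a delay of about 2 seconds; the exact
        # mechanism is an assumption here.
        self.window = window
        self._pending: Dict[str, str] = {}
        self._first_seen: Optional[float] = None

    def add(self, path: str, action: str, now: float) -> None:
        # Remember when the first event of the current batch arrived.
        if not self._pending:
            self._first_seen = now
        # A later event for the same path replaces the earlier one, so
        # repeated updates to one file produce a single entry in the batch.
        self._pending[path] = action

    def flush_if_due(self, now: float) -> List[Tuple[str, str]]:
        # Hand back the whole batch once the window has elapsed since the
        # first pending event; otherwise return an empty list.
        if self._pending and now - self._first_seen >= self.window:
            batch = sorted(self._pending.items())
            self._pending = {}
            self._first_seen = None
            return batch
        return []
```

A caller would feed events in as they arrive and poll `flush_if_due` with the current time; each non-empty batch could then be sent as a single network request, for example one HTTP POST.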

      • Your transport almost sounds like you implemented a lightweight imitation of Subversion [tigris.org] but without the versioning capabilities.

        Also, what am I missing that makes it significantly better than a triggered rsync session, i.e. one that runs either periodically or whenever a threshold of changes is exceeded?
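The "triggered rsync" idea in the question above, a sync that fires either periodically or once enough changes have piled up, comes down to a small decision rule. The sketch below shows only that rule. The class name `SyncTrigger`, its default thresholds, and its methods are hypothetical, and the actual rsync invocation is left to the caller.

```python
class SyncTrigger:
    """Hypothetical rule: sync when enough changes pile up or enough time passes."""

    def __init__(self, max_changes=100, max_age=300.0):
        self.max_changes = max_changes  # change-count threshold (assumed value)
        self.max_age = max_age          # seconds between periodic syncs (assumed value)
        self.changes = 0
        self.last_sync = 0.0

    def record_change(self, count=1):
        # Called whenever the watched tree is known to have changed.
        self.changes += count

    def should_sync(self, now):
        # Nothing to send if nothing changed since the last sync.
        if self.changes == 0:
            return False
        # Fire on whichever condition is met first.
        return (self.changes >= self.max_changes
                or now - self.last_sync >= self.max_age)

    def mark_synced(self, now):
        # Reset the counters after a successful sync run.
        self.changes = 0
        self.last_sync = now
```

Note that a rule like this only decides when to run; it does not address the two-master problem raised at the top of the thread, since rsync itself copies in one direction per run.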

        • Perhaps a bit more detail of the engine will help.

          The actual file-system driver in UML is patched to produce an "event log" for all file updates. Because this is UML, this log can run in user-space to the host OS.

          The log itself includes events such as "close file after writing", rename file, create directory, set privileges, etc.

          The log itself is sent to a pipe. The pipe is dequeued by a daemon that builds transaction "blocks". A transaction block can contain one or more transactions. When a transac
    • it looks like a simple shell script with rsync will do it: http://lists.samba.org/archive/rsync/2001-October/000430.html
    • Re:This question (Score:3, Informative)

      by Tomun ( 144651 )
      Try offlineimap, it'll sync IMAP<->IMAP or IMAP<->Maildir.
  • intermezzo! (Score:4, Informative)

    by Anonymous Coward on Tuesday April 13, 2004 @02:30AM (#8845194)
    http://www.inter-mezzo.org/

    you are looking for intermezzo

    http://www.inter-mezzo.org/

    the same guy from coda is the leader. remember

    afs -> coda -> intermezzo
    • Re:intermezzo! (Score:4, Informative)

      by Elwood P Dowd ( 16933 ) <judgmentalist@gmail.com> on Tuesday April 13, 2004 @03:14AM (#8845391) Journal
      And from their web page, there are still caveats. One of their components is advertised as "needing more work before it can be used in production."

      My company uses redundant leased lines to home (different breeds and providers) to ensure that every building can access network resources at all times. Manual fail over. We're not a huge company, but we manage most of this in-house. We'd *love* to know if there's a better answer, even if it cost a lot of money.

      Well. There's always a better answer on the other side of a long and expensive implementation process.
      • Re:intermezzo! (Score:2, Interesting)

        by mattmcl ( 469930 )
        Normally, I stay away from Microsoft as much as possible, but you did say that you'd be willing to spend some money... At a previous job, we were all Win2k/XP and Active Directory... we set up Microsoft DFS, which kept a replicated copy of the network file share on a secondary domain controller at a remote location. The primary reason was to have an off-site copy of everything in the event of a disaster of some sort, but, if the primary domain controller/file server went down, users would not even notice - cha
        • Would this allow someone to make changes on the master and allow someone else to make changes on the currently disconnected secondary, and automatically sync up when everything was connected up once again? From your description, it sounds like this feature was not supported and so this solution would not really meet the requirements outlined.
          • Yeah - that's a definite problem. I'm not sure there's a silver bullet here. The only type of system I know of that would handle this is a CVS-type system where changes can be merged.
            I could be wrong, but I think if you've got two versions of a non-text document being concurrently modified in separate locations with the net down, you're going to have issues no matter what file system you're using.
        • Supposedly there are a variety of reasons that DFS won't work for us. I mean, I think we already use it for hardware fault tolerance, but it won't work for us for network faults. Dunno why. I'm not the admin.
    • Re:intermezzo! (Score:3, Informative)

      by Bronster ( 13157 )
      http://www.inter-mezzo.org/

      you are looking for intermezzo


      Hmm.. let's just look at the mailing list again... maybe just a snip from a recent(ish) post (Mar 22 - there are 8 posts since this one, half of them spam):

      | Don't post to a list without reading it also.

      I read this list, what little of it I get.

      | And don't complain about the state of open source
      | software, if you are not ready to test it's betas.

      I am certainly ready to test the betas, just that the last time I tested
      Intermezzo or Lustre -- Lust
    • Unfortunately, it crashes just as much as coda.
  • Unison (Score:5, Informative)

    by JabberWokky ( 19442 ) <slashdot.com@timewarp.org> on Tuesday April 13, 2004 @02:36AM (#8845218) Homepage Journal
    What you're looking for is something like unison [upenn.edu]. Since I don't know what you're serving off of those servers or how often you update files, I can't tell you if it will work for you. But it is robust, and with the -batch flag, it can be automated. It is quite CPU and disk intensive, which is why I say "something like". It's made more for daily or hourly syncs.

    --
    Evan

    • Re:Unison (Score:2, Informative)

      by Anonymous Coward
      Remember folks, unison has a 2GB file size limit.
  • Novell ifolder (Score:5, Interesting)

    by Why Should I ( 247317 ) on Tuesday April 13, 2004 @02:42AM (#8845240) Homepage
    Haven't actually looked into this to any great degree, but is Novell's iFolder an option?

    It's even open-sourced and available on Novell Forge.
  • by shadowxtc ( 561058 ) <shadow@beyourown.net> on Tuesday April 13, 2004 @02:45AM (#8845269) Homepage
    I find this to be the ideal solution for keeping filesystems synchronized across slow links.

    From my experience, Perforce [perforce.com] has the best use of bandwidth and also the most intelligence when it comes to rearranging directory structures and resolving conflicts.

    Unfortunately it's only free [perforce.com] for up to two users - so it may be useless for your needs.
    • Mod Parent Up!

      I have used P4 (perforce) to keep a lot of files in sync between two locations. Fortunately, I had only two locations, so the 2-user 2-client limit never was exceeded.

      In case you want more clients/users, you can try for any of the following:

      1. CVS (http://www.cvshome.org/)
      2. GNU Arch (http://www.gnu.org/software/gnu-arch/)
      3. SubVersion (http://subversion.tigris.org/)

      All these are excellent source control tools, and operate over ordinary TCP/IP (don't need a special setup).

      Avoid tools like Visual SourceSafe because they require a network-mapped drive to work.

      http://better-scm.berlios.de/comparison/comparison.html gives a comparative list of version control systems out there.
    • Actually if you're doing Open Source development you can have unlimited users and clients. FreeBSD uses it internally for example.
  • by LoneRanger ( 81227 ) <jboyens&fooninja,org> on Tuesday April 13, 2004 @02:51AM (#8845296) Journal
    Bullshit. You haven't looked at it hard enough then. I used to work at a university that had 26,000+ users using an AFS filestore for their homedirs and for distributed apps across several miles of campus.

    I'm sure this thing has more than surpassed terabyte size by now. It was always fast and always reliable, except when one of the server's SCSI cards would melt and start spewing errors.

    AFS is better than most people give it credit for. I'll admit, it isn't easy to set up, but all the features that you get for that initial work are well worth it.
    • Yeah - OpenAFS is *still* really the only way to go for multi-platform, disconnected, distributed filesystems. It positively *rocks* - the only downside from my perspective is the unwieldy kerberos management environment, but I am pretty sure that has more to do with my own laziness and ignorance (wrt learning proper kerberos instead of simply rattling off the HOWTO) than with a fundamental flaw in the system.
      • AFS dies horribly if your clients lose sight of the volume location or file servers. As long as the machines are well-connected, it works great.

        As far as Kerberos goes, I'd suggest the new ORA nutshell book "Kerberos: The Definitive Guide". While it doesn't go into AFS much, it explains how the thing really works and how to configure MIT and Heimdal Krb5.

        - Happy AFS/krb5 site administrator
    • OpenAFS is a great solution to a problem, just not this one. It doesn't work in a detached state. On the other hand, the caching is quite aggressive, and if it's an option, you could set up two cells that trust each other and access files that way.

      I'll be happier once the stable versions have two things though... >2gb file support, and support for 2.6 series kernels. Disconnected operation would be nice as well.

      All of those are proposed projects, but not currently in the development version (at le
  • by Anonymous Coward

    Monash University [monash.edu.au] is using AFS on its Linux desktops [monash.edu.au]. Whenever the connection to the file server goes down, everyone's sessions hang, which is clearly unacceptable.

    It's quite possible that it has been incorrectly set up, but in this situation AFS hasn't delivered what it promised.

    • umm, I go to monash uni too and before they were using NFS, I haven't really tried out the AFS drives yet except over ra-clay and stuff, but from the short time I used them on the network, they are far better. I remember all the NFS probs they had (I must have lost at least 4 or 5 assignments on them and lost at least 20% in marks). AFS has disconnected operation so should be much better.. Have you tried it out this year much.. It might also be hanging because they are trying to make different fetches off
  • AFS works fine (Score:3, Insightful)

    by frenchkiss ( 228758 ) on Tuesday April 13, 2004 @02:59AM (#8845330)
    It is used in a number of university campuses across the US (as a bunch of disjoint namespaces under /afs) and works fairly reliably. I wouldn't say it's perfect but works well for day-to-day usage.

    Among the most notable of the universities using afs are CMU, UNC, RPI, MIT. Furthermore, there are a number of government namespaces as well.

    Try it out!
    • Lots in other countries too:

      % ls /afs
      afs.hursley.ibm.com eos.ncsu.edu mcc.ac.gb sfc.keio.ac.jp afs1.scri.fsu.edu es.net md.chalmers.se si.umich.edu alw.nih.gov ethz.ch me.cmu.edu sipb.mit.edu andrew.cmu.edu federation.atd.net meteo.uni-koeln.de slac.stanford.edu anl.gov fh-heilbronn.de mpa-garching.mpg.de sleeper.nsa.hp.com asu.edu fl.mcs.anl.gov msc.cornell.edu spc.uchicago.edu athena.mit.edu fnal.gov msrc.pnl.gov sph.umich.edu bnl.gov geo.uni-koeln.de msu.edu spv.uniroma1.it bp.ncsu.edu glue.umd.edu na

  • My problem is similar to the original poster's, with one small limitation: I have two (mostly) identical Linux machines at both ends of an ADSL link with VPN, etc. All I need to do is edit/compile/run a CLI application (no fancy graphics required). The app must compile/run on a machine at the 'office' end, but I'd like to edit on the machine at the 'home' end. I tried emacs on the 'home' machine with remote editing, as well as remotely running vi/emacs on the 'office' machine, but neither method has the res
    • Use version control ?
      Edit at your local site, have a (subversion/cvs)server at the office.

    • I mount via nfs over a VPN over 1Mb ADSL (rsize=8192,wsize=8192,intr,rw,async,noatime,noauto,user) and after the Vim session is restored, don't have a problem.
      An rsync based script (FWIW in Python) to xfer disparate directories and files works around the cumbersomeness problem.
      As for the 'use version control' responses: I don't want to store intermediate versions of persistent files and don't want to store intermediate/temporary files at all (but don't want to recreate them from scratch every couple
    • Re:tramp (Score:3, Interesting)

      by Chaostrophy ( 925 )
      Use the tramp package; it automates grabbing files via ssh (through multiple hosts), so you edit locally. Very handy, I really liked it.
  • Unison (Score:5, Informative)

    by hak1du ( 761835 ) on Tuesday April 13, 2004 @03:28AM (#8845428) Journal
    Don't bother with any of the kernel-mode disconnected file systems. For those kinds of situations, the Unison file synchronizer [upenn.edu] is a good choice: it performs bidirectional synchronization and uses an efficient protocol that only needs to send differences and some checksums across the wire. It also detects conflicts and (optionally) lets you resolve them automatically. It works on UNIX/Linux, Windows, and MacOS.
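To make "bidirectional synchronization" with conflict detection concrete, here is a toy three-way comparison. It is not Unison's actual algorithm or code. It assumes the synchronizer remembers a fingerprint of every file as of the last successful sync (`base`) and compares both replicas against that record. The function name `reconcile` and the action strings are made up for illustration.

```python
def reconcile(base, left, right):
    """Compare two replicas against the last synchronized state.

    Each argument maps a path to a content fingerprint (for example a hash);
    a missing key means the file does not exist in that state.
    Returns (actions, conflicts).
    """
    actions = {}
    conflicts = []
    for path in sorted(set(base) | set(left) | set(right)):
        b, l, r = base.get(path), left.get(path), right.get(path)
        if l == r:
            continue  # replicas already agree, nothing to do
        if l == b:
            # Only the right replica changed since the last sync.
            actions[path] = "copy right -> left" if r is not None else "delete on left"
        elif r == b:
            # Only the left replica changed since the last sync.
            actions[path] = "copy left -> right" if l is not None else "delete on right"
        else:
            # Both sides changed in different ways: needs a decision.
            conflicts.append(path)
    return actions, conflicts
```

The remembered `base` state is what lets a deletion on one side be told apart from a new file on the other, which a plain one-way copy cannot do. Paths that land in `conflicts` are the cases where both offices edited the same file while the link was down.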
  • by psycho ( 84421 )
    Why not just use CVS or, even better, subversion?
    Just a first thought...
    • Re:cvs (Score:2, Informative)

      by doctormetal ( 62102 )
      Why not just use CVS or, even better, subversion?

      You should use CVSup [cvsup.org] for this.
      It has already proven its usability for syncing and updating FreeBSD systems.
  • it looks like a simple shell script with rsync will do it...
    http://lists.samba.org/archive/rsync/2001-October/000430.html
  • use AFS (Score:4, Interesting)

    by stonebeat.org ( 562495 ) on Tuesday April 13, 2004 @09:21AM (#8847015) Homepage
    Use the real AFS from IBM. It works very nicely.
    • How can I buy their product? Who sells it? What does it cost?

      I'd love to get started with a supported AFS... is this something that only corps can buy?

      Jonathan
      • Re:use AFS=OpenAFS (Score:2, Interesting)

        by oli_freyr ( 105995 )
        I'm no expert, but I became curious about the difference between IBM AFS and OpenAFS and it seems that they are the same [ibm.com].

        This means I will probably check it out for my next fileserver project... ;)
      • Re:use AFS (Score:4, Informative)

        by Earlybird ( 56426 ) <slashdot @ p u r e f i c t ion.net> on Tuesday April 13, 2004 @12:36PM (#8849454) Homepage
        • How can I buy their product? Who sells it?
        IBM AFS [ibm.com]. Note that OpenAFS is a true fork of IBM's own code, and currently maintained by IBM and the community. Afaik, IBM AFS is no longer in active development. You don't need to buy anything except support.
        • What does it cost?
        IBM AFS client licenses have historically been "very expensive" -- that's about all I know. If you need to ask, you probably can't afford it. :)
  • Have you considered Oracle iFS? Since it is based on an Oracle DB, it should be possible to do two-way replication, or possibly to implement the synchronization policy yourself with database triggers.
  • Foldershare (Score:2, Informative)

    by niai ( 310235 )
    Foldershare [foldershare.com] is a Win32 "Document Management & Real-time File Mirroring Solution".

    I read [deviantart.com] that "the development team hopes to start work on Mac OS X and Linux clients within the next six months" (Jan 27th 2004).
