Sharing a Subset of Data Between 2 Sites?
"Some people spend 95% of their time in lab 2, so that is their 'home' server, but when they come to lab 1 for a week's stay or so, they scp/rsync their files to the lab 1 server, and at the end of the week push the changes back to lab 2. When people log in to a workstation, they usually remain logged in for days at a time and xlock the screen. [If we can get this caching system working], it would mean that people moving between the labs would not need to copy files around, since there would always be a 'local' copy.
The network between the labs is not fast enough for direct automounting of lab 1's server on the lab 2 workstations, especially since some files can be over 300 MB in size. We have a VPN (via FreeS/WAN) between the labs, so all data transmitted is encrypted. Also, because lab 2's RAID has only 1/6 the capacity of lab 1's, it can hold cached copies of in-use (or likely-to-be-used) data only.
Crontab entries for nightly copies are not useful, because people often work from both sites on any given day.
The three servers currently run Linux 2.4.18 with XFS, so any solution should be compatible with XFS, but at a real push we could consider changing to another filesystem."
CODA and AFS (Score:3, Informative)
If the connection is not reliable (or not fast enough), you may want to try CODA [cmu.edu], a distributed filesystem that supports disconnected operation. Beware: AFS is a mature project, while CODA may still be a work in progress.
Keep it simple: use CVS or rsync (Score:3, Informative)
Re:Keep it simple: use CVS or rsync (Score:1)
or use rdiff-backup or cvsup (Score:2)
rdiff-backup is:
rdiff-backup backs up one directory to another, possibly over a network. The target directory ends up a copy of the source directory, but extra reverse diffs are stored in a special subdirectory of that target directory, so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup. rdiff-backup also preserves subdirectories, hard links, dev files, permissions, uid/gid ownership, and modification times.
ssh (Score:3, Interesting)
unison (Score:4, Informative)
works very well and is designed for this kind of thing.
BTW - weekly backups!!!! daily surely?
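For reference, unison is driven by a small profile file; a sketch of what one might look like for the two-lab setup (paths and the host name are hypothetical examples, not from the original post):

```
# ~/.unison/default.prf -- hypothetical example profile
root = /home/alice
root = ssh://lab1srv//home/alice

# only sync the data directories, not everything in $HOME
path = projects
path = datasets
```

Running `unison` then propagates changes in both directions and flags genuine conflicts (a file changed on both sides) instead of silently overwriting one copy, which is the main advantage over a one-way rsync.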
Re:unison (Score:1)
Me too! (Score:1, Redundant)
I'm envisioning some type of write-through cache.
FolderShare (Score:2)
For your situation, I would imagine the server machines running the FolderShare app, simply mirroring the lab 2 data at lab 1 in more-or-less real time.
RC
Intermezzo might be a solution (Score:3, Informative)
Similar to the AFS and Coda suggestions above, but with local caching to allow much higher performance. It also works in disconnected mode.
Re:Intermezzo might be a solution (Score:1)
Intermezzo _might_ be a waste of time. (Score:2)
A shame, too; it looked pretty good, as if it had quite a bit of promise.
Building a reliable, easy-to-install distributed filesystem that allows disconnected operation, updates, and similar things would be very, very useful. (Notice the recent post on using CVS.)
SUN's CacheFS (Score:2)
AFS has been doing this for years (Score:2)
There are servers and clients for tons of operating systems, including every one you mentioned.
WebDAV + HTTP proxy server (Score:2)