Laptop/Server Data Synchronization?
gbr writes "I've been trying to automatically synchronize data between a laptop and a server. When the laptop is connected to the network, I want all writes to automatically propagate across to the server. When the laptop is disconnected I want the laptop user to continue working with the local data. When the laptop is reconnected, I want the data to automatically re-sync.
The issue is, the data on the server may have changed as well, which needs to propagate back to the laptop. The data doesn't contain anything too special, no database tables etc. It does contain binary data such as executables and word processing documents. I've looked at ChironFS, Unison file sync, and drbd. ChironFS needs a manual rebuild if a connection fails, and the user needs to know which machine contains the correct data. Unison requires the user to initiate the synchronization process manually every time, and drbd is just not meant for the job at hand. How do you automatically, and invisibly to the user (except in the case of conflicts), synchronize between a laptop and a server?"
rsync (Score:5, Informative)
Subversion (Score:2, Informative)
iFolder? (Score:4, Informative)
http://www.ifolder.com/index.php/Home [ifolder.com]
Unison (Score:2, Informative)
Re:rsync (Score:5, Informative)
I'd also take a look at Microsoft's SyncToy [microsoft.com] if you're on win***s.
Re:iFolder? (Score:2, Informative)
Windows & Make Available Offline (Score:5, Informative)
In Windows you can mark a folder on a network share as "Available Offline". Windows will copy all of the files to the local HD and, if the server isn't available, just work with the local copies. When the server is detected, Windows will automatically sync the files and pop up a dialog asking the user about conflicts (keep local / keep remote). When connected, writes automatically go to both the local copy and the server.
This is one of the few things that Windows gets right, and I haven't found a Linux or OS X solution that is nearly as nice.
Foldershare (Score:2, Informative)
Re:rsync (Score:2, Informative)
Re:Am I missing something? (Score:3, Informative)
Not seeing the difference between arbitrary files on a disk and files that have been explicitly version controlled is I guess what makes you the hardware guy - does that mean you nail the floorboards down?
Re:rsync (Score:2, Informative)
Re:rsync (Score:5, Informative)
By using pooling and compression, one client of mine is using BackupPC to back up over 1TB of data distributed among over 100 laptops to a 200GB filesystem on a central server. The network is polled every hour, and any system that hasn't been backed up in the last 24 hours is queued. Beautiful system.
In OSX, portable home directories (Score:5, Informative)
Re:iFolder? (Score:2, Informative)
One possible problem: you have to store the information in a folder (which you specify). Only the data in that folder is synced.
HTH
-FreckledP
Re:Windows & Make Available Offline (Score:5, Informative)
Unison (Score:2, Informative)
You might have to do A-B, A-C, A-B type syncs for more than 2 paths, unless you stick to a hub/spoke or cascading distribution model.
Not all conflicts are automatically resolved, by default.
http://www.cis.upenn.edu/~bcpierce/unison/ [upenn.edu]
Good luck.
Re:rsync (Score:5, Informative)
Coda, AFS, InterMezzo (Score:4, Informative)
you already solved your problem (Score:3, Informative)
Re:rsync (Score:4, Informative)
Unison [upenn.edu] is 2-way rsync. But as the poster noted, unison/rsync doesn't easily support automatic synching (that I know of)- you have to kick it off and then deal with any conflicts, etc., manually. I think the poster is looking for ideas of at least automating Unison/rsync (BTW does rsync support 2-way updating, as the poster explicitly mentions?).
As someone who relies on running unison manually (too lazy to figure out how to automate on my Windows box), I'd be interested in relevant solutions.
Re:Windows might be good for something (Score:1, Informative)
Re:rsync (Score:2, Informative)
Um... what?
You mean besides this diagram [microsoft.com] of the steps you should follow when making a backup (and a similar one for restore), and the MSDN documentation [microsoft.com] for the VSS.
Re:Windows might be good for something (Score:5, Informative)
Further to this, offline files has a number of fairly fundamental bugs in the actual implementation. It records both the IP and the name of the server somewhere when doing the offlining. As a result, if the name (but not the drive) or the IP changes, your entire offline tree goes south and stays offline. You can neither delete it nor reconnect it, and the only way of dealing with this is surgery to the network (aliasing IP addresses) until you reconnect. The only alternative is to rebuild the affected laptops from scratch.
SyncBackSE (Score:3, Informative)
OpenAFS (Score:3, Informative)
As read from the main page:
AFS is a distributed filesystem product, pioneered at Carnegie Mellon University and supported and developed as a product by Transarc Corporation (now IBM Pittsburgh Labs). It offers a client-server architecture for federated file sharing and replicated read-only content distribution, providing location independence, scalability, security, and transparent migration capabilities. AFS is available for a broad range of heterogeneous systems including UNIX, Linux, MacOS X, and Microsoft Windows
Hope this helps, ciao
Re:iFolder? (Score:3, Informative)
(And if you're who I believe you are (CC), hey! Drop me a line...)
Re:Windows might be good for something (Score:1, Informative)
It can sometimes be alleviated by ensuring that a Windows box is the master browser, by setting "local master = No" and "preferred master = No" in smb.conf.
Alternatively, if you only have W2k/XP clients on the network, "disable netbios" seems to do the trick too.
Re:common refrain (Score:3, Informative)
Photoshop -> GIMP [gimp.org]
Avid -> LIVES [sourceforge.net] - Note: I am not a video editor and have no idea if this program is any good.
Quicken -> GNUCash [gnucash.org], among others.
I guess what I'm saying is that, based on your definition of "silly", there's quite a bit of silliness going on in the world today. *grin*
Re:rsync (Score:3, Informative)
AFAIK, rsync is only one-way, meaning that it overwrites and possibly deletes files. Have a try:
mkdir d1 d2 # Create two directories (e.g. one on server, one on laptop)
touch d1/foo.txt # Create an empty file
rsync -r d1/ d2/ # Sync the directories
echo "123" > d2/foo.txt # Now modify the file on d2 (e.g. laptop)
rsync -r d1/ d2/ # Sync again
cat d2/foo.txt # Ooops - foo.txt is empty!
One possible way I experimented with is the following:
- Integrate a rsync server -> laptop in the startup procedure of the laptop
- Never modify a file on the server while working with the laptop
- Integrate a rsync laptop -> server in the shutdown procedure of the laptop
In theory this works, but in practice there are cases where you miss the shutdown/startup sync, e.g. when you have no network at startup (you took your laptop away from home and forgot to sync it), when your laptop crashes, when the network fails during shutdown, and numerous other problems. These lead to dangerous situations, e.g. if the rsync laptop->server fails during shutdown, a startup rsync may overwrite modified files.
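One way to guard against that last case can be sketched as follows. This is only a sketch under assumptions, not something from the setup described above: the marker file, the variable names and both function names are hypothetical, and it relies on the same rule of never editing the server copy while the laptop is in use. The marker is created at startup and removed only when the shutdown push succeeds, so a startup pull is refused whenever an earlier push was missed.

```shell
#!/bin/sh
# Hypothetical paths; adjust to the real setup.
LAPTOP_DIR=${LAPTOP_DIR:-$HOME/data}
SERVER_DIR=${SERVER_DIR:-server:/srv/data}
MARKER=${MARKER:-$HOME/.unpushed}

# Startup hook: pull server -> laptop, unless a previous push was missed.
pull_on_startup() {
    if [ -e "$MARKER" ]; then
        echo "unpushed laptop changes; skipping pull" >&2
        return 1
    fi
    : > "$MARKER"    # from here on the laptop may hold unpushed edits
    rsync -a "$SERVER_DIR"/ "$LAPTOP_DIR"/
}

# Shutdown hook: push laptop -> server, clear the marker only on success.
push_on_shutdown() {
    rsync -a "$LAPTOP_DIR"/ "$SERVER_DIR"/ && rm -f "$MARKER"
}
```

If the pull is refused, the laptop data is left alone and the next successful push clears the marker again. It does nothing about changes made on the server in the meantime, which is where a real two-way tool is needed.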
After losing some of my work, I decided to switch to unison, which is a 2-way sync and lets me decide how to resolve syncing problems.
Nevertheless I'm not entirely happy with the situation - if I forget to sync, I have to resolve things manually, moreover the sync takes quite some time.
In my special case, I have a WLAN connection to my server most of the time, so changes could be written immediately. So I'd favour some kind of network file system that has offline capabilities and can handle two-side modifications in some way. I thought about Coda but it seems to be far too complicated and unreliable and I don't know better alternatives.
So I'm still stuck with my Unison solution, which is somewhat cumbersome, but works...
Unison, Rsync & NTP (Score:4, Informative)
Rsync can sync in both directions, but you decide that one of the sides is the master and sync that one first; in the case of conflicts, the master rules. It isn't possible to choose on a file-by-file basis at sync time as you can with Unison.
Oh, and NTP is absolutely vital when doing any synchronisation.
Basically, either you do it manually and manage conflicts at sync time, or you do it automatically and define one of the sides as the master in the case of conflict. There's really no way round this; software just isn't sophisticated enough to decide what you're thinking.
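To make the two options concrete, here is a rough sketch of how a sync tool can tell a harmless one-sided change from a real conflict. It assumes the tool kept a copy of each file as it was at the last successful sync; that saved copy and the function name classify are hypothetical, and this is an illustration rather than how any particular tool is implemented.

```shell
#!/bin/sh
# classify BASE LAPTOP SERVER
# BASE is the copy saved at the last successful sync (an assumption).
classify() {
    base=$(cksum < "$1")
    laptop=$(cksum < "$2")
    server=$(cksum < "$3")
    if [ "$laptop" = "$server" ]; then
        echo same              # nothing to copy
    elif [ "$laptop" = "$base" ]; then
        echo server-changed    # safe to copy server -> laptop
    elif [ "$server" = "$base" ]; then
        echo laptop-changed    # safe to copy laptop -> server
    else
        echo conflict          # both changed: ask the user, or let the master win
    fi
}
```

Only the last branch needs a human or a master rule; the other three can be handled without bothering the user.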
The truth is that filesystem syncing isn't ideal for a very dynamically updated file system. It is best used on fairly static filesystems or one way syncing. Documentation, backups and the like.
Re:Windows might be good for something (Score:4, Informative)
csccmd
Re:rsync (Score:3, Informative)
rsync has a whole bunch of options that let you decide its behaviour. --update makes it skip files that have newer modification times on the receiving side, or you could use --backup to make it keep a copy of files instead of overwriting them, etc. Mix and match, run two one-way syncs one after the other, and you could get close in behaviour to a real two-way sync.
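A minimal sketch of that two-pass idea, assuming rsync is installed, the clocks on both machines agree, and deletions don't need to propagate. The helper name two_way_sync is made up for illustration; -a and --update are ordinary rsync options.

```shell
#!/bin/sh
# two_way_sync A B: one rsync pass in each direction, newer file wins.
two_way_sync() {
    a=$1
    b=$2
    # -a preserves modification times, which --update relies on.
    # --update skips any file that is newer on the receiving side.
    rsync -a --update "$a"/ "$b"/ &&
    rsync -a --update "$b"/ "$a"/
}
```

Called as `two_way_sync ~/data /mnt/server/data`, this copies whichever version of each file is newer to the other side. A file deleted on one side is copied back from the other, and when both sides changed the same file the older edit is overwritten without warning; adding --backup to both commands keeps the overwritten version under a ~ suffix.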
"I thought about Coda but it seems to be far too complicated and unreliable and I don't know better alternatives."
I've played around with Coda, and from what I recall there are two things that make it impractical for 'ordinary' use: the lack of file locking (which causes problems with annoying apps that use it) and the handling of large files (it had to copy the entire file to the local cache before unblocking the I/O calls, i.e. don't look at any video files on Coda). And so my original idea of having home directories supporting disconnected operation was shot. It would have worked very well for specific subsets of data storage, but in the end it was simpler to just sort the data into various structures and deal with syncing on a case-by-case basis (rsync for some things, plain nfs/autofs for other things, cvs for code or text, etc).
In the end, I think this is one of those problems where it's better to just sit on your arse and wait because the problem of permanent connectivity will be solved before someone figures out how to make a wholly transparent redundant filesystem that seamlessly supports disconnected operation. The whole problem is simply to a certain extent incompatible with the way filesystems usually work.
Re:Coda (Score:2, Informative)
Some of the Wiki info suggests that things have improved, but I'm discovering that a decent-sized client cache (10GB, so that I can offline most of what I'd use) has horrendous occasional slow-downs and pauses.
I'm planning on testing a local/client server and a client with the RVM turned off this week, but I'm not keen on the size that the RVM file(s) will have to get to.
Previous comments in here suggesting it's not really designed for modern data-sets (gigs rather than megs) are starting to look as if that's true... Other than that, it actually looks like a reasonably good design!