Guaranteed Transmission Protocols For Windows?
Michael writes "Part of our business at my work involves transferring mission-critical files across a 2 Mbit microwave connection, into a government-run telecommunications center with a very dodgy internal network and then finally to our own server inside the center. The computers at both ends run Windows. What sort of protocols or tools are available to me that will guarantee to get the data transferred across better than a straight Windows file system copy? Since before I started working here, they've been using FTP to upload the files, but many times the copied files are a few kilobytes smaller than the originals."
TCP? (Score:5, Interesting)
Correct me if I'm wrong... (Score:3, Interesting)
AS2 FTW (Score:3, Interesting)
You should look at the EDIINT AS2 protocol [wikipedia.org], AKA RFC 4130 [ietf.org]. This is a widely-used e-commerce protocol built over HTTP/S.
AS2 provides cryptographic signatures for authentication of the file at reception, non-repudiation and message delivery confirmation (if no confirmation is returned, the transfer is considered a failure), and is geared towards files. There is even an open-source implementation available.
More complex than FTP/SFTP but entirely worth it if your data is mission-critical and/or confidential. Plus, it passes through most networks because it is based on HTTP.
Use .complete files. (Score:4, Interesting)
Even on reliable connections, using .complete files is a great idea.
It works this way: if you're pushing, upload the file over FTP. After the transfer completes, check the remote file size; if it matches the local file size, also upload a zero-size .complete file (or a $filename.complete file containing an md5 checksum, if you want to be extra paranoid).
Any app that reads that file will first check whether the .complete file is there.
If the remote file size is less, you resume the upload. If the remote file size is more than the local one, you wipe out the remote file and restart.
Same idea for the reverse side (if you're pulling the file, instead of pushing).
You can also set up scripts to run every 5 minutes, and only stop retrying once the .complete file is written (or read).
Note that the above would work even if the connection was interrupted and restarted a dozen times during the transmission. [we use this in $bigcorp to transfer hundreds of gigs of financial data per day... seems to work great; never had to care for maintenance windows, 'cause in the end, the file will get there anyway (scripts won't stop trying until data is there)].
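For what it's worth, the decision logic above can be sketched in a few lines of Python. This is only a rough illustration, not what any particular shop runs: the function names (next_action, write_marker, is_ready) are made up for the example, and a local directory stands in for the FTP server, so the actual upload and resume steps are left out.

```python
import hashlib
from pathlib import Path
from typing import Optional


def next_action(local_size: int, remote_size: Optional[int]) -> str:
    """Decide what the push script does on this run.

    remote_size is None when the remote file does not exist yet.
    """
    if remote_size is None:
        return "upload"
    if remote_size == local_size:
        return "mark_complete"  # sizes match: write the .complete marker
    if remote_size < local_size:
        return "resume"         # partial upload: continue where it stopped
    return "restart"            # remote is larger: wipe it and start over


def md5_of(path: Path) -> str:
    """Hex MD5 of a file, read in chunks so large files are fine."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_marker(data_path: Path) -> Path:
    """Write the $filename.complete marker holding the MD5 of the data file."""
    marker = data_path.with_name(data_path.name + ".complete")
    marker.write_text(md5_of(data_path) + "\n")
    return marker


def is_ready(data_path: Path) -> bool:
    """Reader side: only trust the data once the marker exists and matches."""
    marker = data_path.with_name(data_path.name + ".complete")
    if not marker.exists():
        return False
    expected = marker.read_text().strip()
    # A zero-size marker carries no checksum, so its presence alone counts.
    return expected == "" or expected == md5_of(data_path)
```

A script run every few minutes would call next_action with the two sizes, do whatever it says, and exit once the marker has been written.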
Re:TCP? (Score:2, Interesting)
Anyone who uses it that way is wrong [reference.com]. (For the lazy, every single definition of the verb "to borrow" involves receiving, not giving.)
So your parent post should have said "Borrow should only be used to refer to the act of receiving something", but his (her?) statement is still essentially correct, if you go by the actual definition of the word rather than colloquial usage from one particular area.
If I start using a word's opposite as if it were the word, and six hundred other people near me start doing it too, that makes it colloquial (in our area), but that doesn't make it right.
Re:Any encrypted transmission protocol actually (Score:2, Interesting)
Starting a comment off by explaining that you're not familiar enough with the subject matter to intelligently comment is a very handy flag, and I appreciate your warning the rest of us that what you were saying was going to be wrong ;)
BTW, checksum hasn't been considered a trustworthy means of ensuring data integrity for more than a decade. I invite you to have a discussion with Google regarding checksum collisions.
Re:TCP? (Score:3, Interesting)
If the drives are different sizes, different filesystems, or even just set up with different cluster sizes.
(Yes, you can do that in Windows, just don't get stupid with the settings.)
He may have corrupted files, he should really check, but if a different size on different drives is the only thing he's checked, they may be perfectly fine.
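To make the cluster-size point concrete, here is a small Python sketch. It is a simplification under stated assumptions: allocated_size just rounds up to whole clusters and ignores filesystem details such as compression or very small files stored inline, and the function names are invented for the example. same_content is the check that actually matters, since it compares the bytes rather than the space they occupy.

```python
import hashlib
from pathlib import Path


def allocated_size(length: int, cluster_size: int) -> int:
    """Bytes a file of `length` bytes occupies when space comes in whole clusters."""
    clusters = -(-length // cluster_size)  # ceiling division
    return clusters * cluster_size


def sha256_of(path: Path) -> str:
    """Hex SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(a: Path, b: Path) -> bool:
    """True when both files have the same byte length and the same hash."""
    return a.stat().st_size == b.stat().st_size and sha256_of(a) == sha256_of(b)
```

With these definitions, a 10,000-byte file takes allocated_size(10_000, 4096) = 12288 bytes on one drive and allocated_size(10_000, 16384) = 16384 bytes on another, while same_content would still report the two copies as identical.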
Ancient History Perspective
Back in the DOS days, people were always panicking about their memory not having the exact byte value they expected. Most people didn't understand that different BIOS versions/brands and different BIOS options, like shadowing, all affected that value.
Re:Sneakernet (Score:2, Interesting)
Re:TCP? (Score:2, Interesting)
You appear to be correct.
Yet I appear to be "insightful", interesting.
Re:TCP? (Score:4, Interesting)
As far as I can tell, the problem is entirely unique to Notepad. Every other text editor I've ever tried handles files with Unix-style newlines correctly. Since it would be trivial to fix Notepad, I can only assume that Microsoft either doesn't care at all about Notepad, or is deliberately leaving the incompatibility in place to discourage use of Unix.
Re:Any encrypted transmission protocol actually (Score:3, Interesting)
Using FTP ASCII mode for binary files would be incredibly stupid, but yeah, it sounds like that could be it.
Calling ftp from a .BAT script (or whatever it's called in DOS) and *not* checking its exit code is another likely candidate.
Otherwise, I don't believe FTP has any checksums, so I'd expect bit errors here and there: the errors that the TCP and link-layer checksums fail to catch in 1/65536 of the cases.
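Both suspects are easy to illustrate in Python. The sketch below makes two assumptions: that the receiving side of an ASCII-mode transfer rewrites each CRLF pair as a bare LF (what a Unix-style server would do; other setups may behave differently), and that the transfer tool reports failure through a nonzero exit code. The function names are made up, and no real FTP session is involved.

```python
import subprocess
from typing import Sequence


def ascii_mode_receive(data: bytes) -> bytes:
    """Simulate an ASCII-mode receiver that rewrites CRLF as LF (assumed behaviour)."""
    return data.replace(b"\r\n", b"\n")


def bytes_lost(data: bytes) -> int:
    """How many bytes shorter the file ends up after the simulated transfer."""
    return len(data) - len(ascii_mode_receive(data))


def transfer_succeeded(command: Sequence[str]) -> bool:
    """Run a transfer command and report whether it exited with status 0."""
    result = subprocess.run(list(command))
    return result.returncode == 0
```

A binary file that happens to contain the byte pair 0D 0A a thousand times would come out exactly 1000 bytes short under this simulation, which is at least consistent with files arriving slightly smaller than the originals.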
Depends entirely on the CPU speed of the endpoints relative to the link speed. If you enable compression and the files aren't already compressed, it can be a lot faster.
Re:Any encrypted transmission protocol actually (Score:2, Interesting)
Actually, encryption doesn't guarantee that *things add up* after transfer. And SSH does not guarantee things add up any more than TCP does. It does have other advantages, like compression.
And TCP is just not a good file transfer protocol over microwave links. Sure, you can fix the glaring issues by using huge windows; you can even change registry settings to improve the situation: http://support.microsoft.com/kb/224829 [microsoft.com].
Making it work really well, though, you'll need
If you're worried about correctness of transfer you might want to use rsync for windows [unimelb.edu.au], which *does* check correctness. You might want to use an interface like http://www.aboutmyip.com/AboutMyXApp/DeltaCopy.jsp [aboutmyip.com].
Now, rsync is no wonder tool. It is not something that is constantly trying to reconnect. You start it once ... it tries once. If you want an opportunistic, reliable file transfer utility ... you might want to try BitTorrent; it's quite good at that.
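If a try-once tool is all you have, a retry wrapper gets you part of the way. This is a minimal Python sketch, not a feature of rsync itself: retry_until_done and its parameters are hypothetical, and the attempt callable is whatever runs one transfer and returns True on success.

```python
import time
from typing import Callable


def retry_until_done(
    attempt: Callable[[], bool],
    max_tries: int,
    delay_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call attempt() until it returns True; return the number of tries used.

    Raises RuntimeError if all max_tries attempts fail.
    """
    for tries in range(1, max_tries + 1):
        if attempt():
            return tries
        if tries < max_tries:
            sleep(delay_seconds)  # wait before the next try
    raise RuntimeError(f"transfer still failing after {max_tries} tries")
```

The sleep parameter is only there so the wait can be swapped out when trying the function without real delays.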