Networking Operating Systems Windows Software

Guaranteed Transmission Protocols For Windows? 536

Posted by timothy
from the no-charge-for-autocompression dept.
Michael writes "Part of our business at my work involves transferring mission-critical files across a 2 Mbit microwave connection, into a government-run telecommunications center with a very dodgy internal network, and then finally to our own server inside the center. The computers at both ends run Windows. What sort of protocols or tools are available to me that will guarantee to get the data transferred across better than a straight Windows file system copy? Since before I started working here, they've been using FTP to upload the files, but many times the copied files are a few kilobytes smaller than the originals."
This discussion has been archived. No new comments can be posted.

  • by guruevi (827432) <evi.smokingcube@be> on Tuesday June 30, 2009 @12:33PM (#28530151) Homepage

    SFTP should do: since the communications are encrypted, anything that changes along the way should be rejected by the other end. HTTPS or any other protocol over SSL should do as well.

    FTP is a plain-text protocol with no integrity check of its own, so if something changes along the way it won't complain.

  • Use BITS (Score:5, Informative)

    by Lothar (9453) on Tuesday June 30, 2009 @12:34PM (#28530183)

    Background Intelligent Transfer Service (BITS) can be used to transfer files between Windows servers. It is the technology behind Windows Update. We use it in our company to transfer files across a low-bandwidth satellite connection. The great thing is that it can automatically resume a transfer even after both machines have been rebooted. SharpBits offers a nice .NET API. You can find it here: http://www.codeplex.com/sharpbits [codeplex.com]

  • Re:Robocopy? (Score:3, Informative)

    by Krneki (1192201) on Tuesday June 30, 2009 @12:39PM (#28530321)
    Robocopy works on top of the Windows network layer; it's the same as using copy/paste, with some extra functionality.
  • Re:TCP? (Score:5, Informative)

    by Zocalo (252965) on Tuesday June 30, 2009 @12:39PM (#28530331) Homepage
    The only times I've seen FTP report a successful file transfer and have a file discrepancy is when a binary file has been transferred in ASCII mode and the CR/LF sequences are being swapped for just LFs, or vice versa. Nothing wrong with the protocol, PEBKAC...
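
    To see why an ASCII-mode transfer leaves a binary file a little smaller, here is a minimal Python sketch. It only imitates the line-ending translation described above; it does not talk to an FTP server, and the sample bytes are made up.

```python
def ascii_mode_to_unix(data: bytes) -> bytes:
    """Imitate an ASCII-mode transfer to a Unix-style host: CR/LF becomes LF."""
    return data.replace(b"\r\n", b"\n")


# Made-up "binary" payload that happens to contain three CR/LF byte pairs.
original = b"\x00\x01\r\n\xff\xfe\r\nPK\x03\x04\r\n\x7f"
received = ascii_mode_to_unix(original)

# One byte goes missing for every CR/LF pair in the original.
lost = len(original) - len(received)
print(f"original: {len(original)} bytes, received: {len(received)} bytes, lost: {lost}")
```

    In a file of several megabytes, CR/LF pairs occurring by chance would plausibly add up to the "few kilobytes" the submitter is missing.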
  • rsync (Score:5, Informative)

    by itsme1234 (199680) on Tuesday June 30, 2009 @12:42PM (#28530399)

    ... is what you want. Yes, you can use it with Windows (with or without cygwin bloat). Use -c and a short --timeout and you're good to go. If you're using it over ssh you're looking at three layers of integrity (rsync checksums, ssh and TCP), two of them quite strong even against malicious attacks, not just ordinary corruption. Put it in a script with a short --timeout: if anything is wrong with the link your ssh session will freeze completely, and as soon as your --timeout is reached rsync will die and your script can respawn a new one. The new one will resume the transfer using whatever chunks with good checksums you have already transferred, and will again checksum the whole file when it finishes.
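
    The respawn script could look roughly like the Python sketch below. The retry helper is generic; the rsync command line is an assumption built from the -c and --timeout options mentioned above plus --partial (my addition, so that a partly transferred file is kept for the next attempt). The paths and host name are placeholders.

```python
import subprocess
import time


def run_until_success(cmd, max_attempts=20, delay_seconds=5.0, runner=subprocess.call):
    """Run cmd repeatedly until it exits with status 0 or attempts run out.

    Returns the number of attempts used, or raises RuntimeError on giving up.
    """
    for attempt in range(1, max_attempts + 1):
        if runner(cmd) == 0:
            return attempt
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # give the link a moment to recover
    raise RuntimeError(f"gave up after {max_attempts} attempts: {cmd!r}")


# Hypothetical paths and host. -c forces whole-file checksums, --partial keeps
# partly transferred files so a respawned rsync can reuse them, and --timeout
# makes rsync exit when the link stalls for that many seconds.
RSYNC_CMD = [
    "rsync", "-c", "--partial", "--timeout=30",
    "-e", "ssh",
    "/cygdrive/c/outbound/", "user@remote.example:/inbound/",
]
```

    Calling run_until_success(RSYNC_CMD) from a scheduled task would keep respawning rsync until one run exits cleanly. The attempt limit is there so a dead link eventually raises an error instead of looping forever.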

  • by Dogun (7502) on Tuesday June 30, 2009 @12:43PM (#28530415) Homepage

    Implementations of TCP in most operating systems fall a bit short of that, killing off stalled connections, etc. Also, some firewall suites, and some routers make a habit of killing off connections after a certain amount of time, sometimes without regard to whether or not they are 'active'.

    You might have some luck boosting reliability with the TcpMaxDataRetransmissions registry setting in Windows. But ultimately, the poster is going to need to find a file copy suite which retries when connections die.

  • by n4djs (1097963) on Tuesday June 30, 2009 @12:45PM (#28530455)
    'set mode binary' prior to moving the file. I bet the file you are moving isn't a text file with CR-LF line terminations as normally found in DOS, or one side is set and the other isn't.

    Ritchie's Law - assume you have screwed something up *first*, before blaming the tool...

  • Re:Robocopy? (Score:5, Informative)

    by Anonymous Coward on Tuesday June 30, 2009 @12:49PM (#28530561)

    Yeah but that extra functionality contains things like the ability to resume a transfer, retry if things fail, and verify the files after copying.

  • Re:Well...duh (Score:3, Informative)

    by metamatic (202216) on Tuesday June 30, 2009 @12:52PM (#28530613) Homepage Journal

    You don't need to MD5 if you're using rsync. The rsync algorithm already uses checksums to ensure the files are bit-for-bit identical. In fact, rsync 3.x uses MD5.

  • Re:rsync (Score:4, Informative)

    by doug (926) on Tuesday June 30, 2009 @12:52PM (#28530629)
    Yep, that's what I'd do. The rsync --server means sending signatures instead of files to prevent pointless copies, and it does an excellent job of ensuring good copy or failure. It is certainly better than any ftp variant.
  • Re:Robocopy? (Score:5, Informative)

    by Saint Stephen (19450) on Tuesday June 30, 2009 @12:53PM (#28530651) Homepage Journal

    MOD PARENT UP. Not to mention it's multithreaded, so it's not really the same as copy/paste - it's the same as a whole bunch of copy/pastes at the same time.

    Why people keep fighting Robocopy, I'll never know.

  • by Anonymous Coward on Tuesday June 30, 2009 @12:53PM (#28530653)

    Why don't you try rsync. That should do the trick nicely.

  • by jeffmeden (135043) on Tuesday June 30, 2009 @12:58PM (#28530763) Homepage Journal
    Using modern encryption like SSH does guarantee that things *have to add up* since keeping what you start with a secret is just as important (sometimes more so) as making sure you finish with exactly what you start with (meaning no one in the middle meddled with your data).

    So, in short, something like SSH or any other properly encrypted communication mechanism is a great way to both secure the data from snooping (in the case of a microwave link, a VERY real problem) as well as to safeguard the data from corruption (intentional or unintentional). I sincerely hope, for the asker's sake and possibly for the country's sake, that these files he works with are trivial.
  • by ericnils (1424615) on Tuesday June 30, 2009 @01:07PM (#28531001)
    We use Cygwin's rsync to back up Windows servers over a slow Internet connection at work. It works very well for us, and using the -z compression option will probably result in much faster transmission over a 2 Mbit pipe than FTP will provide. We run rsync as a service on the source and pull to the destination using the rsync command line tool, but you could easily reverse that. You should also consider Microsoft's built-in DFS replication, which automates replication of data between two file servers over TCP.
  • Re:Line endings! (Score:2, Informative)

    by wick3t (787074) on Tuesday June 30, 2009 @01:13PM (#28531131)

    Twenty bucks says you're converting from DOS line endings (\r\n) to Unix line endings (\n).

    There, fixed that for you.

  • Re:TCP? (Score:3, Informative)

    by rhsanborn (773855) on Tuesday June 30, 2009 @01:15PM (#28531169)
    And those people are wrong.
  • Re:TCP? (Score:5, Informative)

    by amoeba1911 (978485) on Tuesday June 30, 2009 @01:16PM (#28531189) Homepage
    I'm gonna learn you some English: First, I will download my photos to my Facebook page. Then I will borrow you my car but in collateral I demand you borrow me you're lawnmower for a week so I can mow my lawn. Your smart, so you will do good on your next test.
  • Re:TCP? (Score:5, Informative)

    by bwcbwc (601780) on Tuesday June 30, 2009 @01:18PM (#28531239)

    I used to get dropped characters and groups of characters in text files using FTP back in the 1990s and early 21st century. It seemed to be a bug in the FTP client, because it only happened when we used the Windows Explorer interface for the product. When we did command line or used the native GUI there was no problem. If you're seeing this type of a pattern where you can see that characters are missing, switch to a different FTP client or try the Windows command line FTP.

    Another possibility is that the target Windows system is mimicking a Unix system, so that an ASCII transfer is stripping the CR characters from CR/LF sequences.

    On the other hand, if you really want a "guaranteed delivery" with formal acknowledgment and validation, try using a secured protocol like SSH or SFTP or a messaging system like JMS with a handshaking architecture around it. There are plenty of Open Source architectures you can build around (xBus for example), but I don't know of any ready-built executables. Commercially, vendors like IBM (MQ) and Tibco have products that deal with the messaging at a similar level.

  • Re:TCP? (Score:4, Informative)

    by ShieldW0lf (601553) on Tuesday June 30, 2009 @01:19PM (#28531267) Journal

    You could deal with a situation like this by zipping or rarring it into multiple small files and including parity files.

    http://en.wikipedia.org/wiki/Parchive [wikipedia.org]
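
    To illustrate the idea behind parity files, here is a deliberately simplified Python sketch using a single XOR parity block, which can rebuild any one lost chunk. Real PAR tools use stronger Reed-Solomon codes that survive several losses; the function names and the sample payload are made up for the example.

```python
def xor_parity(chunks):
    """Return one parity block: the byte-wise XOR of equal-sized chunks."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)


def split_padded(data, chunk_size):
    """Split non-empty data into chunk_size pieces, zero-padding the last one."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    chunks[-1] = chunks[-1].ljust(chunk_size, b"\x00")
    return chunks


data = b"mission critical payload"
chunks = split_padded(data, 8)
parity = xor_parity(chunks)

# Pretend chunk 1 was lost in transit. XOR-ing the surviving chunks with the
# parity block cancels them out and leaves the missing chunk.
survivors = [c for i, c in enumerate(chunks) if i != 1]
rebuilt = xor_parity(survivors + [parity])
print(rebuilt == chunks[1])
```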

  • Re:Robocopy? (Score:5, Informative)

    by Ritchie70 (860516) on Tuesday June 30, 2009 @01:20PM (#28531275) Journal

    Actually, you can specify a single file, it just has a silly syntax.

    robocopy source destination file

    So "robocopy c:\a c:\b myfile.txt" will copy c:\a\myfile.txt to c:\b\myfile.txt.
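
    If the copy is driven from a script, a small Python helper can build that command line and interpret the result. This is a sketch under assumptions: the retry count and wait time are arbitrary, and /Z (restartable mode), /R (retries) and /W (wait between retries) are chosen to match the resume and retry features mentioned elsewhere in this thread.

```python
def build_robocopy_cmd(source_dir, dest_dir, filename, retries=10, wait_seconds=30):
    """Build a robocopy command line for copying one file with restart and retry."""
    return [
        "robocopy", source_dir, dest_dir, filename,
        "/Z",                   # restartable mode: resume a partly copied file
        f"/R:{retries}",        # number of retries on a failed copy
        f"/W:{wait_seconds}",   # seconds to wait between retries
    ]


def robocopy_succeeded(exit_code):
    """Robocopy exit codes are bit flags; values of 8 and above signal failure."""
    return 0 <= exit_code < 8


cmd = build_robocopy_cmd(r"c:\a", r"c:\b", "myfile.txt")
print(" ".join(cmd))
```

    Note that robocopy returns 1, not 0, when it copied files successfully, so a script that treats every nonzero exit code as an error will report false failures.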

  • by JoeRandomHacker (983775) on Tuesday June 30, 2009 @01:21PM (#28531307)
    rsync is great, though on Cygwin there are some caveats. Last time I tried using it to sync a large amount of data I ran into a Cygwin pipe bug (for the pipe between rsync and the ssh process) which caused the transfer to hang. Using the "rsync:" protocol (with an rsync daemon), optionally over an ssh tunnel (port forwarding), worked fine, though it was a bit clunky.
  • Re:UDP. (Score:3, Informative)

    by Underfoot (1344699) on Tuesday June 30, 2009 @01:21PM (#28531311)

    UDP is actually a great basis for accelerated file transfer. Several file transfer utilities / protocols have been built around it. I deal with really large files, but I have been using Aspera on several projects with great success. Worth a look.

    http://www.asperasoft.com/ [asperasoft.com]

  • by Anonymous Coward on Tuesday June 30, 2009 @01:36PM (#28531607)

    Agree... rsync is the way to go. Built-in hashing, diffing, session reliability, retries... what more could you ask for?

  • Re:Well...duh (Score:3, Informative)

    by Alrescha (50745) on Tuesday June 30, 2009 @01:37PM (#28531627)

    "You don't need to MD5 if you're using rsync. The rsync algorithm already uses checksums to ensure the files are bit-for-bit identical. In fact, rsync 3.x uses MD5."

    Rsync, by default, does not necessarily do this. I've seen situations where rsync would happily copy files from a remote host over ssh to a destination host and the resulting files failed an independent MD5 test. Rsync was not causing this trouble - but it did fail to detect it. Forcing a checksum of every file (using "-c") would let rsync detect the failure to copy properly (after the entire file was done) and it would retry.

    In the end, a router and one of the hosts were rebooted and the problem went away. The point is that just using rsync and ssh does not guarantee anything.

    A.
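
    An independent check like the one described above can be done with a few lines of Python run at each end. This is a minimal sketch: it defaults to SHA-256 (my choice), and "md5" can be passed instead to match the posts above. The function names are made up.

```python
import hashlib


def file_digest(path, algorithm="sha256", block_size=1 << 20):
    """Hash a file in blocks so large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def copies_match(source_path, dest_path, algorithm="sha256"):
    """True when both files hash to the same value."""
    return file_digest(source_path, algorithm) == file_digest(dest_path, algorithm)
```

    Across a slow link you would run file_digest on each machine and compare the two hex strings, rather than pulling the file back to hash it locally.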

  • Re:TCP? (Score:5, Informative)

    by samkass (174571) on Tuesday June 30, 2009 @01:45PM (#28531775) Homepage Journal

    While others point out, probably correctly, that the problem is probably a binary/ascii conversion, in actuality the error checking on TCP is simply not that good.

    TCP uses a 16-bit checksum, so a corrupted packet has roughly a 1 in 65536 chance of being accepted as valid. To make matters worse, it uses 1's complement instead of 2's complement arithmetic, so 0x0000 and 0xFFFF are indistinguishable.

    Ethernet adds a 32-bit CRC to every frame, so if you're transmitting over that link layer you're probably in good shape. But depending on that from a systems point of view seems risky.

    Much better to only transfer ZIPs and check them at the other end if you only have control over the endpoints. If you can control the transmission, use a better error-correcting high-level protocol or even a forward-error correction protocol on top of TCP.

    Or just use rsync.
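
    The weakness of the 16-bit checksum is easy to demonstrate. The Python sketch below implements the ones' complement sum from RFC 1071 over a payload only (real TCP also covers a pseudo-header, which is left out here) and compares it with zlib.crc32, used as a stand-in for Ethernet's CRC-32. The sample bytes are made up.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement sum of 16-bit words, as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


good = b"\x12\x34\x56\x78\x9a\xbc"
# Corruption that swaps two 16-bit words: a plain sum does not depend on order.
swapped = b"\x56\x78\x12\x34\x9a\xbc"

same_sum = internet_checksum(good) == internet_checksum(swapped)
same_crc = zlib.crc32(good) == zlib.crc32(swapped)
print(f"checksum misses the swap: {same_sum}, CRC-32 misses the swap: {same_crc}")
```

    Reordered words are only one class of error the sum cannot see, but it shows why an end-to-end hash of the whole file is worth having on top of TCP.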

  • by raddan (519638) * on Tuesday June 30, 2009 @02:05PM (#28532111)
    On some level, there isn't much difference between an application and a protocol. In fact, if you ever take a networking theory course, you'll see that each protocol layer in the network stack is, in fact a "protocol machine" (i.e., an application), which does the little protocol dance that makes functions at that layer happen.

    But I digress. What the user is running into here is a fundamental problem with TCP over lossy networks. It really was not designed with really lossy networks in mind. E.g., the congestion control mechanism in TCP ("exponential backoff") makes the assumption that there is a wire sitting there and that certain parameters (like bandwidth) are not going to change. If you need certain QoS guarantees on a wireless link, TCP may be hard-pressed to deliver, because TCP's [limited] QoS mechanisms may make the problem worse. There is a HUGE amount of overhead on 802.11 networks to make sure that TCP doesn't suck.

    I don't know how this person's microwave link is configured, but they might be better served by thinking about the QoS guarantees in the various layers in their network stack. I know a previous poster was joking when they said UDP might be a good option, but look, part of the problem on wireless is TCP's retransmission mechanism. With UDP it is up to the user/application to ask for a retransmit. Bittorrent works exactly like this, so something like Bittorrent, where each small file chunk gets its own hash, and those hashes are checked upon receipt, might not be a bad idea. I like rsync as well (because it has a rolling checksum feature), but again, you have TCP in the mix, and if I recall correctly, rsync will not retry automatically on failures, which is what you want.
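
    The per-chunk hashing idea can be sketched in Python as follows. This is only an illustration of the principle, not BitTorrent's actual wire format: the chunk size, the use of SHA-256, and the function names are all assumptions.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int):
    """Return a SHA-256 hex digest for each chunk_size piece of data."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def damaged_chunks(expected_hashes, received: bytes, chunk_size: int):
    """Indexes of chunks that are missing or whose hash differs from the manifest."""
    actual = chunk_hashes(received, chunk_size)
    return [
        index for index, expected in enumerate(expected_hashes)
        if index >= len(actual) or actual[index] != expected
    ]


sent = bytes(range(256)) * 4            # 1024 bytes, four 256-byte chunks
manifest = chunk_hashes(sent, 256)      # sender ships this alongside the data

received = bytearray(sent)
received[300] ^= 0xFF                   # flip one byte inside chunk 1
print(damaged_chunks(manifest, bytes(received), 256))
```

    Only the chunks listed as damaged need to be requested again, which matters a lot on a 2 Mbit link.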
  • Re:TCP? (Score:5, Informative)

    by Obfuscant (592200) on Tuesday June 30, 2009 @02:06PM (#28532121)
    Since ASCII files are also ultimately represented as particular sequences of binary data, why does FTP even have an ASCII transfer mode?

    Because of differences between systems like Unix and Windows, where line ends are a simple newline on Unix but a CR/LF pair on Windows. Also systems like VMS which have (had) about thirteen different file formats all inherent in the file structure itself.

    In other words, because all ASCII files are not represented the same way by all different operating systems.

    I know that Windows uses CR/LF for line termination and *nix uses just LF. That's a very minor inconvenience at worst,

    Not if you have an "ASCII" file you are trying to read on Windows that has Unix newline conventions. Try opening a newlined file with notepad, for example.

    ...and little standalone utilities to convert the formats are readily available and have been for some time now.

    "Little standalone utilities" are really handy for small files and small numbers of files. It's really handy when you know the format the file you have is in and what it needs to be. Please tell me how you will identify a VMS fixed record file that you have just ftp'd from a VMS FTP server when it gets to your Windows system. It has NO newlines or CR/LF pairs. You might dump the file somehow and notice that the lines are all 93 characters long and then write yourself a perl script to split it up -- or you could simply tell your FTP client that you are in ASCII mode and let the FTP server/client negotiate some resulting format that your system likes. Now try that with a VMS variable length record file, where the lines are variable length, still without line endings.

    FTP wasn't designed just for hobbyists who want a file or two and have the time to deal with file formats by hand. It was designed to move data, and anything that can be automated should be. "Little standalone utilities" are a pain in the ass when trying to automate something, especially when the critical information necessary to know what specific utility to use has been lost, or is completely unknown to the recipient's system. Like VMS fixed length records on Unix or Windows.
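
    For the fixed-record case described above, the hand-written conversion script would look something like this Python sketch (the post mentions perl; Python is used here only for illustration). It assumes you already know the record length and that records are padded with trailing spaces, which is exactly the kind of knowledge the recipient often lacks.

```python
def split_fixed_records(raw: bytes, record_length: int, newline: bytes = b"\r\n") -> bytes:
    """Turn a stream of fixed-length records into newline-terminated lines."""
    if len(raw) % record_length:
        raise ValueError("data length is not a multiple of the record length")
    records = (raw[i:i + record_length] for i in range(0, len(raw), record_length))
    # Assumes records are padded with trailing spaces; strip that padding.
    return b"".join(record.rstrip(b" ") + newline for record in records)


# Two made-up 93-byte records, padded with spaces as assumed above.
raw = b"first".ljust(93) + b"second".ljust(93)
print(split_fixed_records(raw, 93))
```

    A variable-length record file cannot be handled this way at all, because the record boundaries live in the file structure rather than in the byte stream.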

    It just seems like it's not the job of a file transfer protocol to concern itself with what an independent, unrelated application can or cannot do with the file after it's transferred.

    ASCII mode in FTP has nothing to do with anyone trying to tell anyone what they can or cannot do with a file after it's transferred. It's all about knowing how to deal with a hundred different ways of representing ASCII data on dozens of different operating systems and making life EASIER for people who have to do that on a daily basis.

    If YOU would rather operate in BIN mode and worry about which file formats you've just downloaded and how to convert them to an ASCII representation that your software knows how to deal with, more power to you. I got tired of dealing with this the first time I had to convert a VMS "ASCII" file to Unix and I'll let FTP do it silently for me. Yes, I've dealt with users who didn't know what ASCII mode was and downloaded a zipped file in ASCII mode and it didn't work, but the time I've saved just myself not having to deal with converting crap has more than made up for the time I've spent telling them to use BIN mode.

  • by Anonymous Coward on Tuesday June 30, 2009 @02:07PM (#28532153)

    FTP and TCP cannot "Drop" packets or bytes. You need to learn-up on TCP and FTP.

    FTP _does_ translate DOS end-of-line sequences (carriage-return followed by new-line -- 2 bytes) into Unix end-of-line sequences (just new-line -- 1 byte). So your files may become shorter by as many bytes as they contain lines.

    The solution is to tell FTP to not treat the file as text, but as binary image information in which new-line characters are treated with no special processing. Traditionally, FTP called this "file type I" and the command to set it is "bin" as in "binary":


    C:\Documents and Settings\fred>ftp abc.net
    Connected to abc.net.
    220 ProFTPD 1.3.1 Server (ABC Global Enterprise Group) [10.13.131.34]
    User (abc.net:(none)): freddy
    331 Password required for freddy
    Password:
    230 User freddy logged in
    ftp> bin
    200 Type set to I
    ftp>
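
    For scripted uploads, the same thing can be done with Python's standard ftplib. The sketch below is a minimal example, assuming the server supports the optional SIZE command; the helper name, host name and credentials are placeholders.

```python
import os
from ftplib import FTP


def upload_binary(ftp, local_path, remote_name):
    """Upload in image (binary) mode and compare the server-reported size.

    Returns True when the remote size equals the local size.
    """
    with open(local_path, "rb") as handle:
        # storbinary switches to TYPE I before sending, so no line-ending
        # translation is applied to the data.
        ftp.storbinary(f"STOR {remote_name}", handle)
    # SIZE is an optional server command; size() may return None or raise.
    return ftp.size(remote_name) == os.path.getsize(local_path)


# Hypothetical usage; host and credentials are placeholders:
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("freddy", "secret")
#       ok = upload_binary(ftp, "report.dat", "report.dat")
```

    A matching size only shows that nothing was dropped or translated; it says nothing about corrupted bytes, so a hash comparison is still worth doing for mission-critical files.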

  • Re:UDP. (Score:2, Informative)

    by Anonymous Coward on Tuesday June 30, 2009 @02:27PM (#28532479)

    TCP is so horrible. I wish HTTP used UDP by default so I wouldn't have the pro

    Aspera is little better than Tsunami. [sourceforge.net]

    As an exercise for the reader, guess which one is cheaper.

  • Re:TCP? (Score:2, Informative)

    by sexconker (1179573) on Tuesday June 30, 2009 @02:32PM (#28532581)

    Windows reports file sizes exactly, to the byte.

    It reports both the true file size and the file size on the disk, which is based on the block size and the number of blocks required to store the file.
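
    The relationship between the two numbers is simple arithmetic, sketched here in Python. The 4096-byte cluster size is an assumption, and the sketch ignores special cases such as compressed or sparse files.

```python
def size_on_disk(file_size: int, cluster_size: int = 4096) -> int:
    """Bytes allocated for a file that is stored in whole clusters."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


for size in (1, 4096, 4097):
    print(size, size_on_disk(size))
```

    When checking a transfer, compare the true file size, not the size on disk: two files that differ by a few kilobytes can still occupy the same number of clusters.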

  • by fm6 (162816) on Tuesday June 30, 2009 @02:35PM (#28532627) Homepage Journal

    Poster isn't concerned about whether the data has errors. That's a problem for the data creators. He's worried about it getting screwed up in transmission, either accidentally or maliciously

    Sigh. You're welcome to nitpick my prose, but would you mind doing so in a way that makes sense. Data that got screwed up in transmission can be said to have errors. And that's what I meant.

    and encryption absolutely solves that issue.

    How? Not all encryption algorithms break if you mung the data after it was encrypted. Do all the algorithms break if this happens? Show me where it says this, and I'll admit that encryption is sufficient.

    BTW, checksum hasn't been considered a trustworthy means of ensuring data integrity for more than a decade.

    Dude, you really need to start listening to how people actually talk. For more than a decade, the word "checksum" has been used to apply to algorithms that don't simply add up bits, such as MD5 [google.com]. Not strictly logical, but language rarely is.

  • Re:TCP? (Score:3, Informative)

    by jmauro (32523) on Tuesday June 30, 2009 @03:59PM (#28533811)

    Or they'd rather just have you use the already included Wordpad that does handle new lines correctly.

  • Re:TCP? (Score:3, Informative)

    by Obfuscant (592200) on Tuesday June 30, 2009 @04:05PM (#28533887)
    As far as I can tell, the problem is entirely unique to notepad.

    Who rated this "insightful"?

    I'm sorry, but I've worked in this area for years. I was responsible for moving data and source files to and from Unix to DOS to VMS to OSs that are even deader than VMS, and the problem is hardly unique to "notepad". YOU may see it only in notepad because YOU only use Windows, but there are a lot of other OSs out there. If you've never worked on an OS that has structured files inherent in the filesystem, well, lucky you. I have. The newlines in those kinds of files are completely lost when you copy the byte stream contents, because the newlines are implicit and defined in the file structure itself. A fixed-record file doesn't need newlines because every line is the same length.

    Every other text editor I've ever tried handles files with Unix-style newlines correctly.

    There is much more to the world than Windows and Unix-style newlines. If all you have seen is Windows and Unix newlines, I suppose you could think the problem was limited to that, but it really isn't. In fact, if you use FTP much at all, I suspect even you have been protected by ASCII mode, to the point that you never even knew that an FTP site you visited was VMS-based. I know I've been to VMS sites, and ASCII mode is critical if you are dealing with ASCII files.

  • Re:TCP? (Score:4, Informative)

    by link-error (143838) on Tuesday June 30, 2009 @04:09PM (#28533943)
    I posted a similar reply above, but if your microwave connection is carrying any binary data and you're transmitting in ASCII mode, you'll get file size differences at the destination.
  • Re:Robocopy? (Score:3, Informative)

    by oatworm (969674) on Tuesday June 30, 2009 @05:15PM (#28534897) Homepage
    Active Directory tends to complicate things, though you can use NTBackup or Windows Backup (depending on your Server version) to kind of keep things somewhat under control there. Even then, though, restoring AD from backup using NTBackup is not a particularly fun or, in my experience, reliable proposition. Plus, this doesn't even dig into the rest of a server's system state (in theory, if it's backed up right via NTBackup, you might be able to restore the whole thing without reinstalling every piece of software - good luck!) or attempting a brick-level backup of Exchange.

    It really is phenomenal how much effort Microsoft forces you to go through just to back up their servers. These days, I just go with image-based software for server backups - they seem to do a far more reliable job of getting Windows servers back up in a hurry than file-level products (which Robocopy + NTBackup would qualify as). But, that's just me, and I primarily deal with smallish networks, so I'm not entirely sure how well that scales.
  • by Kazoo the Clown (644526) on Tuesday June 30, 2009 @06:21PM (#28535619)
    FTP is, however, more than an order of magnitude faster than SFTP or SCP. If the files are relatively small, SFTP is certainly the more secure solution, but if the files are huge and time is an issue, FTP has the clear performance advantage.
  • by Kazoo the Clown (644526) on Tuesday June 30, 2009 @07:19PM (#28536293)
    We've done a lot of testing for our data warehouse products over a gigabit link between two quadcore server PCs comparing the transfer of several gigabytes worth of data, between ftp and sftp, and typical times for sftp have been taking about 3 and a half hours, when the same transfer via ftp is taking about 20 minutes. For the clients, we've been using psftp and windows command-line ftp, and for the servers, War-FTP and copssh. HP has a performance patch for OpenSSH (see here [psc.edu]), but we have been unable to locate or develop a build for Windows that has this patch. While there may be better tuned SFTP software out there, the readily available open source tools do not compare well with FTP.

"If you own a machine, you are in turn owned by it, and spend your time serving it..." -- Marion Zimmer Bradley, _The Forbidden Tower_
