FTP: Better Than HTTP, Or Obsolete? 1093

An anonymous reader asks "Looking to serve files for downloading (typically 1MB-6MB), I'm confused about whether I should provide an FTP server instead of / as well as HTTP. According to a rapid Google search, the experts say 1) HTTP is slower and less reliable than FTP and 2) HTTP is amateur and will make you look like a wimp. But a) FTP is full of security holes, and b) FTP is a crumbling legacy protocol and will make you look like a dinosaur. Surely some contradiction... Should I make the effort to implement FTP or take desperate steps to avoid it?"
This discussion has been archived. No new comments can be posted.

  • do both... (Score:4, Informative)

    by jeffy124 ( 453342 ) on Thursday February 13, 2003 @07:11PM (#5297971) Homepage Journal
    But in my experience, HTTP for whatever reason goes faster (not entirely sure why), and FTP doesn't work for some people because of firewalls.

    Try both - see which gets used more.
  • how about rsync? (Score:5, Informative)

    by SurfTheWorld ( 162247 ) on Thursday February 13, 2003 @07:11PM (#5297973) Homepage Journal
    rsync is a great protocol, fairly robust, can be wrapped in ssh (or not), supports resuming transmission, and operates over one socket.

    seems like the best of both worlds to me.

    the real question is - do you control the clients that are going to access you? or is it something like a browser (which doesn't support rsync).
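
    A minimal sketch of the rsync-over-ssh approach mentioned above (the host and paths are placeholders):

        rsync -avP -e ssh user@example.org:/pub/files/ ./files/

    The -P flag keeps partial files and shows progress, so re-running the same command resumes an interrupted transfer.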

  • by emf ( 68407 ) on Thursday February 13, 2003 @07:11PM (#5297975)
    "HTTP is slower and less reliable than FTP"

    I would think FTP is slower, since with FTP you have to log in and build the data connection before the transfer begins. With HTTP it's a simple GET request.

    As far as the actual data being sent, I believe that the file is sent the same way with both protocols. (just send the data via a TCP connection). I could be wrong though.
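
    For illustration, the whole HTTP exchange is one request over one TCP connection (example.com and the path are placeholders; the response headers come back on the same connection, followed by the file data):

        printf 'GET /pub/file.zip HTTP/1.0\r\nHost: example.com\r\n\r\n' | nc example.com 80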

  • for what it's worth (Score:3, Informative)

    by dunedan ( 529179 ) <antilles.byu@edu> on Thursday February 13, 2003 @07:14PM (#5298003) Homepage
    Those of your customers who don't have fast access to the internet may appreciate even a slightly faster standard.
  • by simetra ( 155655 ) on Thursday February 13, 2003 @07:14PM (#5298006) Homepage Journal
    You can FTP files without having a GUI, browser, etc. Though you can use HTTP with Lynx, and I'm sure wget, snarf, etc. FTP is a good, reliable tool. It comes standard with most every operating system out there. It's fairly standard; the commands you use in one OS are pretty much the same in another OS.

    FTP is the cat's pajamas. HTTP depends on too much other stuff.
  • by cybermint ( 255744 ) on Thursday February 13, 2003 @07:14PM (#5298008)
    If you're only providing files, with little or no uploading, then I would use HTTP because it is generally easier for someone in a browser to just grab them off an HTTP server. IE in particular has some problems when navigating nested directories.

    If you are providing a large number of files where people frequently download several files from the same directory, then ftp access would help as most ftp clients can queue multiple files for downloading.

    If users are uploading and downloading multiple files, then ftp is still your best bet by far. No one wants to upload one file at a time via some html form.
  • HTTP is fine (Score:5, Informative)

    by ahknight ( 128958 ) on Thursday February 13, 2003 @07:14PM (#5298012)
    HTTP does not have firewall issues, does not need authentication, does not (by default) allow directory listings, and is the same speed as FTP. It's a good deal for general file distribution.

    FTP is quickly becoming a special-needs protocol. If you need authentication, uploads, directory listings, accessibility with interactive tools, etc., then this is for you. Mainly useful for web designers these days, IMO, since the site software packages can use the extra file metadata for synchronization. Other than that, it's a lot of connection overhead for a simple file.

    FTP does have one nice advantage that HTTP lacks: it can limit concurrent connections based on access privileges (500 anonymous and 100 real, etc.). Doesn't sound like you need that.

    Go with HTTP. Simple, quick, anonymous, generally foolproof.
  • by brokenin2 ( 103006 ) on Thursday February 13, 2003 @07:15PM (#5298017) Homepage
    If you're just looking to transfer files back and forth, then FTP is the way to go. If you only want to send out files, you might want to stick with the warm fuzzy feeling of knowing you've only got Apache exposed to the outside world.


    We run FTP because we have to have people send us files, and we also distribute files on a regular basis. The client software available for that kind of regular sending and receiving is a lot better for FTP; it's pretty clunky over HTTP, though still very doable.


    We just choose to stay on top of our ftp updates.

  • by fwankypoo ( 58987 ) <jason@terk.gmail@com> on Thursday February 13, 2003 @07:15PM (#5298028) Homepage
    The question is, "what do you want to do?" I run an FTP server (incidentally affiliated with etree.org, lossless live music!) and I need what it can give me. Namely I need multiple classes of login, each with a different

    1) number of available slots
    2) speed limit
    3) permission set

    Some people can only read files at 60KB/s, some can read and write (to the upload dir) at the same speed, some can only browse, etc. etc. For this kind of a setup, FTP is great _IF_ you keep your software up to date; subscribe to bugtraq or your distro's security bulletin or both.

    On the other hand, HTTP is great when you want to give lots of people unlimited ANONYMOUS access to something. I'm sure there is a way to throttle bandwidth, but can you do it on a class by class basis? In proftpd it's a simple "RateReadBPS xxx" and I'm set.

    As always, choose the tool that fits _your_ purpose, not the one that everyone says is "best"; they both have good and bad qualities. And http can be just as secure/insecure as any other protocol.
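
    A minimal sketch of the per-class throttling mentioned above, assuming a proftpd build that supports the RateReadBPS directive; the value, and which <Anonymous> or <VirtualHost> section the line belongs in, are assumptions:

        # roughly a 60 KB/s read limit
        echo 'RateReadBPS 61440' >> /etc/proftpd.conf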
  • Re:FTP (Score:2, Informative)

    by molarmass192 ( 608071 ) on Thursday February 13, 2003 @07:16PM (#5298039) Homepage Journal
    wget does both and does it well.
    http://www.gnu.org/software/wget/wget.html [gnu.org]
  • by Karamchand ( 607798 ) on Thursday February 13, 2003 @07:18PM (#5298059)
    I guess that's not what s/he wants. It sounds like anonymous downloading of publicly available files - why would we need any encryption then? There are no passwords to secure, no sensitive data to secure. You'd only get hassles from MSIE users who have never heard of sftp.
  • by AnEmbodiedMind ( 612071 ) on Thursday February 13, 2003 @07:18PM (#5298066)
    FTP provides you with user authentication, and binary transfers (which should be faster as there is no encoding??) It can also be linked to via the web, so there's not too much hassle for the user...

    On the other hand, if you don't need user authentication - and don't want to off load big file transfers from your web-server, you may as well just leave it as http.
  • Re:HTTP is fine (Score:5, Informative)

    by Voytek ( 15888 ) on Thursday February 13, 2003 @07:18PM (#5298068) Journal
    [SNIP]
    does not (by default) allow directory listings
    [SNIP]

    That is a dangerous and very incorrect assumption which has nothing to do with http and everything to do with your http server.
  • by Fastolfe ( 1470 ) on Thursday February 13, 2003 @07:19PM (#5298071)
    Furthermore, FTP allows for features such as resume, etc...

    So does HTTP. With the 'Range' header, you can retrieve only a portion of a resource.

    I agree that it really depends on the application, but for most practical "view directory, download file" purposes, there's no significant difference.

    If you wanted to interact with a directory structure, change ownerships, create directories, remove files, etc., it's generally easier to do this with FTP.
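
    A minimal sketch of HTTP resume, assuming a server that honors the Range header; the URL and byte offset are placeholders:

        # request only the part after the first megabyte
        curl -H 'Range: bytes=1048576-' -o file.part http://example.com/file.bin
        # or let the client work out the offset from the partial file on disk
        wget -c http://example.com/file.bin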
  • Re:In my opinion, (Score:2, Informative)

    by Dragonmaster Lou ( 34532 ) on Thursday February 13, 2003 @07:19PM (#5298076) Homepage
    ncftp is a command-line client that shows you your download progress.
  • Re:hmm (Score:3, Informative)

    by cbv ( 221379 ) on Thursday February 13, 2003 @07:19PM (#5298077) Homepage
    If it starts loading it usually finishes, and I haven't run into any corruption problems.

    You may (just may) run into a routing or timeout problem, in which case the download will stop and you are forced to do the entire download again. Using the right client, e.g. ncftp, you can continue downloading partially downloaded files, an option HTTP doesn't offer.

    With respect to the original question, I would set up a box offering both HTTP and FTP access.

  • Says who? (Score:2, Informative)

    by jafo ( 11982 ) on Thursday February 13, 2003 @07:19PM (#5298078) Homepage
    Anyone who says that HTTP is slower and less reliable than FTP probably hasn't done any benchmarking. Based on my experience, HTTP is definitely more reliable, if only because it tends to go through firewalls more easily than the two-connection FTP protocol.

    Both FTP and HTTP stream data across a TCP socket -- I can't see that streaming it over port 20 versus 80 is going to make any difference.

    FTP was designed to be able to do all these neat things back when the internet didn't have so many security issues. Most of these features are either not used or explicitly disabled these days... The fact that the FTP server uses a different port means that firewalls have to understand it and be properly configured for it. HTTP sends the data back in response to the initial connection, so it tends to be easier to get through firewalls.

    If you're concerned about looking like a "wimp", then you should offer both and let people pick what they prefer. Or... stop worrying about what these people think and figure out what YOU think is best.

    The people who would call you a "wimp" probably aren't worth worrying about.

    Sean
  • security issues? (Score:1, Informative)

    by Anonymous Coward on Thursday February 13, 2003 @07:21PM (#5298094)
    You mention FTP is full of security holes. This is wrong: just use a recent ftp server and you won't have any security issues. The same applies to HTTP; lots of bugs were discovered in HTTP servers as well, and worms even made good use of them ;-) So ftp is not "less secure" than HTTP!
  • Re:My opinion (Score:1, Informative)

    by Anonymous Coward on Thursday February 13, 2003 @07:21PM (#5298108)
    Note that HTTP 1.1 can resume. As long as the user is downloading w/ a download manager that has a clue, that's not a problem.
  • Re:hmm (Score:5, Informative)

    by toast0 ( 63707 ) <slashdotinducedspam@enslaves.us> on Thursday February 13, 2003 @07:21PM (#5298110)
    Using the right client, i.e. wget, you can resume HTTP streams provided the server supports it (and I think most modern ones do).
  • My experience.... (Score:2, Informative)

    by ruckc ( 111190 ) <ruckc@NoSPam.yahoo.com> on Thursday February 13, 2003 @07:22PM (#5298112) Homepage
    In my experience, FTP responds more slowly because of its authentication and is harder to configure, whereas HTTP authenticates near-instantly and is more easily multithreaded for downloads.
  • by spRed ( 28066 ) on Thursday February 13, 2003 @07:22PM (#5298115)
    I could be wrong though.

    You were right on that one point...

    an FTP session has two connections, the control which is TCP/IP and data which is UDP. The latency (time to auth etc) is longer on FTP but not really 'slower'

    For the actual benefits and tradeoffs of each just read some of the other posts in the thread.
  • by linuxhack ( 413769 ) on Thursday February 13, 2003 @07:22PM (#5298122)
    Furthermore, FTP allows for features such as resume, etc...

    HTTP 1.1 supports resuming. I have set up Apache to serve movie files that I can play over the network. I can seek back and forth throughout the movie.
  • by Anonymous Coward on Thursday February 13, 2003 @07:24PM (#5298140)
    HTTP is a much more reasonable protocol. Use it. It can be made to work blazingly fast and it's much more reliable than FTP. Apache is a fine HTTP server, but there are others which are faster. None of that matters, though; Apache on a reasonably modern machine can saturate an Ethernet link. So who cares about efficiency?

    However, real men write a protocol that works over DNS TXT records using the CHAOS protocol. I'm actually working on this!!!

  • Re:My opinion (Score:2, Informative)

    by caino59 ( 313096 ) on Thursday February 13, 2003 @07:25PM (#5298142) Homepage
    IE will resume a d/l too, provided the temp file is still in the cache.
  • Re:hmm (Score:3, Informative)

    by cbv ( 221379 ) on Thursday February 13, 2003 @07:26PM (#5298153) Homepage
    Using the right client, i.e. wget [...]

    Yes, I thought about wget while I wrote my answer - but left it out, simply because _for John Doe_ wget is too complicated. John Doe wants a clickety-click-drag-n-drop client, like a web browser or something like WS-FTP. Granted, ncftp doesn't fall into that category either, but even John Doe can use a simple ftp client.

  • by Kitanin ( 7884 ) on Thursday February 13, 2003 @07:26PM (#5298156) Homepage

    (Cleaning up the text a bit)

    According to a rapid Google search, the experts say:
    1. HTTP is slower and less reliable than FTP; and
    2. HTTP is amateur and will make you look like a wimp; but
    3. FTP is full of security holes; and
    4. FTP is a crumbling legacy protocol and will make you look like a dinosaur.

    Well, 2 and 4 are nothing more than acephalic punditry, unworthy of our attention, which leaves 1 and 3.

    The fact that HTTP doesn't use a binary connection to transfer binary files means that, yes, it is frequently slower than FTP. Especially since your listed file sizes imply that you're not offering text files for download.

    While FTP itself doesn't have any security holes (yay for false generalisations!), many of the readily available ftp daemons have had shaky track records in the security area.

    I don't really have an answer for you, I just wanted to say acephalic. :-) Acephalic acephalic acephalic...

  • Re:hmm (Score:5, Informative)

    by tom.allender ( 217176 ) on Thursday February 13, 2003 @07:28PM (#5298169) Homepage
    you can continue downloading partially downloaded files, an option HTTP doesn't offer.

    Plain wrong. RFC2068 [w3.org] section 10.2.7.

  • by trandles ( 135223 ) on Thursday February 13, 2003 @07:28PM (#5298170) Journal
    It is possible to get approximately 80% of the theoretical maximum throughput of your pipe using a single FTP connection, whereas HTTP can hope for around 60% max for a single connection. The only thing faster than an FTP-based protocol (tftp, pftp) is a raw socket, and they rarely get better than 90%. Most schemes like pftp (parallel ftp, see this paper [llnl.gov]) are implemented to get as close as possible to the theoretical maximum throughput by having multiple data connections transfer the file. Of course you'll see the difference in performance more for large file transfers. The previous comment about HTTP being OK for small files is right on the mark... you will hardly notice a 20% gain when the transfers only take a few seconds.
  • Re:HTTP is fine (Score:2, Informative)

    by ahknight ( 128958 ) on Thursday February 13, 2003 @07:29PM (#5298176)
    Which is why I said "by default."
  • Ahem. (Score:3, Informative)

    by kyz ( 225372 ) on Thursday February 13, 2003 @07:31PM (#5298194) Homepage
    I use two programs to retrieve files, wget and Mozilla. Both show the download rate whether I'm fetching from HTTP or FTP.

    What bugs me is when servers won't tell me the final downloaded file size -- no ETA available. I've seen both FTP and HTTP servers do it. The same goes for servers that don't support resuming or last-modified dates. They suck.
  • by treat ( 84622 ) on Thursday February 13, 2003 @07:31PM (#5298197)
    an FTP session has two connections, the control which is TCP/IP and data which is UDP.

    This is not true. FTP does not use UDP for any purpose.

  • Re:Both... (Score:4, Informative)

    by Centinel ( 594459 ) on Thursday February 13, 2003 @07:31PM (#5298199) Homepage
    When I'm at work and want to download something from a site that offers FTP only, I'm screwed.

    No you're not! Assuming you have access to a Windows workstation, check out HTTP-Tunnel [http-tunnel.com]

  • by DaveBarr ( 35447 ) on Thursday February 13, 2003 @07:32PM (#5298203) Journal
    The data connection is most assuredly NOT UDP. It is a TCP connection just like the control connection. But yes, the latency required to initiate a transfer (due to more handshakes) generally makes FTP slower.
  • OR, How about... (Score:5, Informative)

    by Anenga ( 529854 ) on Thursday February 13, 2003 @07:32PM (#5298205)
    P2P?

    I've written a tutorial [anenga.com] on how you can use P2P on your website to save bandwidth, space, etc. An obvious way to do this would be to run a P2P client and share the file on a simple PC and cable modem. This works, but it is a bit generic and unprofessional. A better way may be to run a P2P client such as Shareaza [shareaza.com] on a webserver. You could then control the client using some type of remote service (Terminal Services, for example).

    P2P has its advantages, such as:
    - Users who download the file also share it. This is especially useful if the client/network supports Partial File Sharing.
    - When you release the file using the P2P client, you only need to upload it to a few users. Those users can then share the file using Partial File Sharing etc.
    - Unlike FTP and HTTP, they aren't connecting to your webserver. Thus, it saves bandwidth for you and allows people to browse your website for actual content, not media. (Though media is content too.) In addition, there is usually a "Max # of Connections" limit on a web or FTP server. Not so with P2P.
    - P2P Clients have good queuing tools. At least, Shareaza does. It has a "Small Queue" and a "Large Queue". This basically allows you to have, say, 4 Upload slots for Large Files (Files that are above 10MB, for example) and one for Small Files (Under 10MB). Users who are waiting to download from you can wait in "Queue", instead of "Max users connected" on FTP.

    Though, at its core, all of the P2P software I know of uses HTTP to send files. But the network layer helps file distribution tremendously.
  • Re:HTTP is fine (Score:3, Informative)

    by kwerle ( 39371 ) <kurt@CircleW.org> on Thursday February 13, 2003 @07:33PM (#5298214) Homepage Journal
    FTP is quickly becoming a special-needs protocol. If you need authentication,

    Yeah, if you need CLEAR TEXT auth, FTP is for you. If you want SSL auth, maybe enable auth for your http server.

    uploads, directory listings,

    Which http can do fine, thanks.

    accessability with interactive tools, etc. then this is for you.

    Dunno about this.

    Mainly useful for web designers these days, IMO, since the site software packages can use the extra file metadata for synchronization.

    I'd push for SSL webdav in an instant...

    Sorry, but I live behind various firewalls and am sick to death of FTP. The sooner it dies, the better.

    (best not to take this post too seriously - FTP just really pisses me off)
  • Re:HTTP is fine (Score:3, Informative)

    by qnonsense ( 12235 ) on Thursday February 13, 2003 @07:35PM (#5298223)
    But "by default" for what server? The HTTP protocol may or may not recommend DIR listings by default, but that's beside the point. Some servers allow it "by default," some don't. Check your server.
  • Re:Both... (Score:5, Informative)

    by mlyle ( 148697 ) on Thursday February 13, 2003 @07:39PM (#5298260)
    IIRC, it abandons the UDP data stream, and instead pushes the data back to you over the TCP stream.

    FTP uses TCP entirely. What passive mode does is cause the FTP client to connect to the FTP server to download the file; in ordinary, traditional "active" FTP, the FTP client sends an address to the FTP server, which the server then connects to in order to transfer the file. Obviously, this does not play nicely with FTP clients behind firewalls that don't allow incoming connections.
  • by Anonymous Coward on Thursday February 13, 2003 @07:46PM (#5298313)
    All PASV mode does is allow the client to initiate the data connection, rather than the server. There is still a control connection and a separate data connection.

    This can be helpful when the client is NAT'ed; otherwise the client will send a PORT command with an unroutable address. Of course, if the server is NAT'ed, the reverse will happen. There are stateful NAT devices that will actually examine FTP control connections and rewrite PORT commands, but NAT and FTP are basically a pain in the arse to deal with. Throw some encryption into the mix (sftp) with NAT, and you'll understand why FTP is not long for this world.

    Only on Slashdot could you learn that FTP uses UDP for the data connection or that PASV mode only uses a single socket!
  • Re:gopher (Score:1, Informative)

    by Anonymous Coward on Thursday February 13, 2003 @07:46PM (#5298318)
    Let the modem handle the error correction, and live on the edge with Ymodem-G! You'll squeeze out a few extra cps.
  • HTTP vs FTP (Score:2, Informative)

    by piranha(jpl) ( 229201 ) on Thursday February 13, 2003 @07:47PM (#5298325) Homepage
    Each has their place.

    FTP was designed for interfacing with the filesystem of a remote Unix system, with the filesystem permissions that are granted to the user you log in as. FTP lets you browse the hierarchy, including examining ownership, permission, and symlink targets; pretty much the same as what you get with 'ls -l'. Apache does file listings, but only shows file names, last modification dates, and size. This makes FTP more suitable than HTTP for remote mirroring of directory trees. This also makes it easier to "browse" what an FTP server has to offer, on a directory-by-directory basis.

    With FTP, the server prints a response when a client connects. Usually, the client sends a user name, password, the 'SYST' command, and asks for the current working directory, tells the server what mode (ASCII or binary) it wants, changes to the directory with the file it needs, sends a PORT command, then finally requests the file. With HTTP, the client connects, sends a request, and the server responds. That's 8 client commands and 9 server responses with FTP, as opposed to 1/1 with HTTP. Each time a command is sent, the client has to wait for the server to respond. The latency adds up, and that means, especially on high-latency connections, FTP is slower to initiate and begin downloading than HTTP. Who said HTTP is slower?

    Regarding reliability, both protocols and modern implementations of their clients and servers have features to resume a broken download from where it left off. Who says one is more reliable than the other?

    HTTP is simpler than FTP. As far as I can tell, in FTP "active mode", the client sends a PORT command with an IP address and port number that it is listening on. In "passive mode", the server sends an IP address and port number that it is listening on, in response to the PASV command. These address/port combinations are used for the actual file transfer.

    Active mode doesn't work if there is NAT between the client and the server, unless the NAT system rewrites the packets so that the IP address the server sees in the PORT command is the outside, external address of the NAT system. When an FTP server is behind NAT, passive mode cannot be used without a similar kluge; it must get an outside-world IP address to connect to from the client, which active-mode PORT does. If both client and server are behind NAT, then one of these NAT kluges must be in effect for file transfer to be possible.

    This address/port nonsense could be part of the security concern with FTP. I also believe older FTP implementations allowed the client (in active mode) or server (in passive mode) to specify arbitrary address/port combinations, so that the FTP server or client could be used as a proxy in an attack. Is this still the case?

    With HTTP, transfers are conducted on the same TCP connection as the control exchange, so the protocol doesn't need to concern itself with extra IP addresses and ports, and the people using it have fewer NAT headaches.
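
    For illustration, curl can exercise both modes described above (the URL is a placeholder):

        # passive: the client opens the data connection (friendlier to client-side NAT)
        curl --ftp-pasv -O ftp://ftp.example.org/pub/file.tar.gz
        # active: the client sends a PORT command and the server connects back to it
        curl -P - -O ftp://ftp.example.org/pub/file.tar.gz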
  • by SWPadnos ( 191329 ) on Thursday February 13, 2003 @07:49PM (#5298336)
    As many people have said, it depends.

    FTP has a great advantage in that you can request multiple files at the same time: mget instead of get. Additionally, you can use wildcards in the names, so you can select categories / directories of files with very short commands. (mget *.mp3 *.m3u ...)

    Modern browsers allow you to transfer multiple files simultaneously, but they don't queue files for you - FTP will. This may be important if connections might get dropped - the FTP transfer will complete the first file, then move on to the next. In the event of an interruption, you will have some complete files, and one partial (which you can likely resume). For multiple simultaneous transfers - from an http browser - you may have some smaller files finished, but it's likely that all larger files will be partials, and will need to be retransmitted in their entirety, since http doesn't quite support resuming a previous download.

    So, if you're going to have a web page with many individual links, and you think that most people will download one or two files, http will probably suffice. If you expect people to want multiple files, or that they will want to be able to select groups of files with wildcards (tortuous with pointy-clicky things), then you should have FTP.

    It's not that hard to set up both, and that's probably the best solution.
  • by Edgewize ( 262271 ) on Thursday February 13, 2003 @07:49PM (#5298339)
    FTP supports a single connection (Passive, or PASV in the actual protocol), which is what most web browsers use by default.

    No, no, no. Jesus. Everyone always gets this wrong. FTP in any mode uses two TCP connections. Passive or not, there is a channel for data and a separate channel for commands.

    The difference is that passive-mode means that the client initiates the data connection. The default FTP behavior is for the client to connect to port 21 on the server, and then the server initiates a data connection to the client.

    Non-passive FTP clients are very hard for firewalls to keep track of, especially when NAT is involved. Passive is a little better because both connections are outgoing.

    But at the same time, passive mode makes the server firewall's job tougher, because it requires a large range of incoming ports for the data connections.

    No matter what the mode, FTP is not very firewall-friendly.
  • by ZoneGray ( 168419 ) on Thursday February 13, 2003 @07:54PM (#5298379) Homepage
    This is slightly off-topic and sftp isn't what he should be using, but you can change the user's shell to /usr/bin/sftp and add it to /etc/shells. I've only tried it with OpenSSH under Linux, so YMMV. I got the idea from an OpenBSD list, though, so it should work most anywhere.

    To answer the original question, when given a choice, I always download via HTTP. It usually takes less time to set up the connection, probably because of those ident lookups that most ftpds still run by default.

  • Re:how about rsync? (Score:5, Informative)

    by Dr. Awktagon ( 233360 ) on Thursday February 13, 2003 @07:58PM (#5298398) Homepage
    Agreed.. I've had enough headaches with FTP and firewalls/NAT, let's just let it die. For robust downloading of large files rsync is the protocol to use.

    For those not familiar: rsync can copy or synchronize files or directories of files. It divides the files into blocks and only transfers the parts of the file that are different or missing. It's awesome for mirrored backups, among other things. There is even a Mac OS X version that transfers the Mac-specific metadata of each file.

    Just today I had to transfer a ~400MB file to a machine over a fairly slow connection. The only way in was SSH and the only way out was HTTP.

    First I tried HTTP and the connection dropped. No problem, I thought, I'll just use "wget -c" and it will continue fine. Well, it continued, but the archive was corrupt.

    I remembered that rsync can run over SSH and I rsync'd the file over the damaged one. It took a few moments for it to find the blocks with the errors, and it downloaded just those blocks.

    Rsync should be built into every program that downloads large files, including web browsers. Apple or someone should pick up this technology, give it some good marketing ("auto-repair download" or something) and life will be good.

    Rsync also has a daemon mode that allows you to run a dedicated rsync server. This is good for public distribution of files.

    Rsync is the way to go! I guess this really doesn't 100% answer the poster's question, but people really should be thinking about rsync more.
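
    A rough sketch of the daemon mode mentioned above; the module name and path are hypothetical:

        # define a read-only public module and start the daemon
        printf '%s\n' '[pub]' 'path = /srv/pub' 'comment = public downloads' 'read only = yes' >> /etc/rsyncd.conf
        rsync --daemon
        # clients then pull with: rsync -avP rsync://example.org/pub/ ./pub/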
  • by whovian ( 107062 ) on Thursday February 13, 2003 @08:04PM (#5298441)
    My pick is ncftp. It's got filename completion and all the usual file shell commands. I resort to it when Mozilla chokes (which is most of the time) -- especially when getting large files.
  • Re:hmm (Score:2, Informative)

    by The Snowman ( 116231 ) on Thursday February 13, 2003 @08:04PM (#5298448)

    With respect to the original question, I would set up a box offering both HTTP and FTP access.

    This is what I do. I have a link from my HTTP directory into the anonymous FTP directory, so users can download files either way. It takes up the same bandwidth and the same server hard drive space, and offers both options. Some firewalls block FTP, so I have to offer HTTP. Some people like the resume capability of FTP, so I have to offer FTP. Until everyone else decides that one or the other is dead, I will offer both.
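
    A minimal sketch of that shared-directory setup; both paths are hypothetical:

        # expose the anonymous FTP area through the web server as well
        ln -s /var/ftp/pub /var/www/html/pub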

  • Re:Both... (Score:3, Informative)

    by mabinogi ( 74033 ) on Thursday February 13, 2003 @08:05PM (#5298456) Homepage
    It's actually easier to implement two connections, one dedicated to commands, and one dedicated to data, than it is to try and determine the difference between commands and data on the same connection.
  • Re:Both... (Score:3, Informative)

    by mlyle ( 148697 ) on Thursday February 13, 2003 @08:07PM (#5298468)
    Primarily to allow aborting file transfers.

    Adding abort mechanisms to a text-based protocol with data multiplexed in (like HTTP) is difficult. That's why HTTP/1.1 closes the TCP connection on a user-initiated abort.

    Since FTP has session-based authentication (which might use mechanisms like S/KEY that can't just be replayed) and context (things like the current directory, transfer mode, and other settings), it is difficult or impossible for a client to reconnect and preserve the exact same state. So instead the data connections are separate and can be closed at will.

    The actual implementation, though, involving the client sending its own address and having the server connect back -- that's braindead. The FTP bounce attack was "fun" (upload a file containing stuff you want sent to a victim to an FTP server, then give the victim's IP and chosen port to the FTP server as "your" address and ask the FTP server to send the file back to you. Poof, the FTP server has attacked your victim for you).
  • by Daytona955i ( 448665 ) <{moc.oohay} {ta} {42yugnnylf}> on Thursday February 13, 2003 @08:08PM (#5298478)
    sftp is not the way to go if you want public access to files. sftp would be the way to go if an account were required to download/upload files.

    If the files you are serving are large then use ftp. If the files are smaller (less than 10MB) use http.

    http is great, I sometimes throw up a file on there if I need to give it to someone and it is too big to e-mail. (Happened recently with a batch of photos from the car show)

    Since I already have a web page it was easy to just throw the file in the http directory and provide the link in an e-mail.

    I like http for the most part. I doubt anyone will call you lame for using it, unless the files are huge.
    -Chris
  • by Speare ( 84249 ) on Thursday February 13, 2003 @08:10PM (#5298492) Homepage Journal

    Today I set up an Apache + mod_ssl + mod_dav server for "drag and drop" shared file folders that can be used by any Windows or Linux client over a single well-known socket port (https=443/tcp). It took me two hours, without knowing a thing about WebDAV or SSL, to get both working together.

    Windows calls it a "Web Folder", while the protocol is usually called DAV or WebDAV. It extends the HTTP GET/POST protocol itself with file management verbs like COPY, MOVE, DELETE and others.

    The key benefits are almost zero user training and flexibility, while using proven protocols.

    WebDAV doesn't do the authentication or encryption itself, but these can be layered in with .htaccess and/or LDAP and/or SSL server certificates and encryption.

    There are a few howto's out there. Google.
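
    A rough sketch of the kind of configuration involved, assuming Apache with mod_dav and mod_ssl already loaded; the location name and paths are hypothetical:

        printf '%s\n' \
          '<Location /webfolder>' \
          '  DAV On' \
          '  SSLRequireSSL' \
          '</Location>' \
          'DAVLockDB /var/lock/apache/DAVLock' >> /etc/httpd/conf/httpd.conf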

  • HTTP, hands down (Score:5, Informative)

    by Percy_Blakeney ( 542178 ) on Thursday February 13, 2003 @08:18PM (#5298554) Homepage
    As I understand it, your requirements are:

    1. Download only
    2. 1-6 MB files

      I also assume the following:

    3. You don't need intricate access controls
    4. Non-technical to Somewhat-technical users

    I would say that you should go with HTTP for sure. Of course, you can provide both, but there are some key reasons for using HTTP.

    Easier Configuration Perhaps I'm just not that swift, but I've found that web servers (including Apache) are easier to configure. This is especially true if you have any previous web server experience. Of course, the FTP server is more complex due to its additional features that HTTP doesn't have, but assuming that (3) is true, you won't need to mess with group access control rights and file uploads.

    Speed This whole "FTP is faster" stuff is not true. HTTP does not have a lot more overhead than FTP; it may even have less overhead in certain cases. Even when it does have more overhead, it is on the order of 100-200 bytes, which is too small to care about. HTTP always uses binary transfers and just spits out the whole file on the same connection as the request. FTP needs to build a data connection for every single data transfer, which can slow things down and even occasionally introduce problems.

    Easier for Users Given assumption (4), your users will be much more familiar with HTTP URLs than FTP addresses. You could just use FTP URLs and let their web browsers download the files, but then you lose the benefit of resuming partial downloads.

    Simple Access Controls Though some people need to have complex user access rules, you may very well just need simple access controls. HTTP provides this (look at Apache's .htaccess file), and you can even integrate Apache's authentication routines into PAM, if you are really hard core.

    There are a few main areas where FTP currently holds sway:

    Partial Downloads Web browsers typically don't support partial downloads, but the fact of the matter is that the HTTP protocol does support them (see the Range header). The next generation of web browsers may very well include this feature.

    User Controls Addressed above.

    File Uploads Again, HTTP does support this feature but most browsers don't support it well. Look to WebDAV in the future to provide better support.

    In summary, just use HTTP unless you need complex access rules, resumption of partial download, or file uploading. It will be easier both on you and your users.
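
    A minimal sketch of the simple access controls mentioned above, assuming AllowOverride permits AuthConfig for the directory; the user name and paths are hypothetical:

        htpasswd -c /var/www/.htpasswd alice
        printf '%s\n' \
          'AuthType Basic' \
          'AuthName "Downloads"' \
          'AuthUserFile /var/www/.htpasswd' \
          'Require valid-user' > /var/www/downloads/.htaccess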

  • Re:Both... (Score:3, Informative)

    by dissy ( 172727 ) on Thursday February 13, 2003 @08:24PM (#5298593)
    > Interesting -- do you or anyone else know why it
    > was designed like that? It seems like it's
    > unnecessary overhead when there's already a
    > connection open.

    There are two connections involved:
    the command connection, from you to server port 21, over which commands are sent,
    and a data connection, which in PORT mode goes to you and in passive mode to the server. This is what files (including directory listings) are sent over.

    It was designed that way so you can connect your client to servers A and B at the same time (command connections), then send the IP address of B to server A (A thinks this is you) and tell B to receive a file in passive mode, and the data connection goes directly from A to B without ever passing through your connection or pipe at all.

    Handy when the two servers are on fast connections and you are on a slow one, but you want to transfer from one point to the other.

    Some people call this action FXP, and likely you will need an FXP client to do it.

  • Re:hmm (Score:1, Informative)

    by Anonymous Coward on Thursday February 13, 2003 @08:24PM (#5298594)
    Sometimes it does and sometimes it doesn't for me. I have a feeling that the file size matters as to whether or not IE keeps the file in the cache (Temporary Internet Files as it's called on Windows). I had to restart my huge Animatrix download 2 whole times from over 50% before I said screw it and just got a wget port for Windows.
  • Re:hmm (Score:5, Informative)

    by Jucius Maximus ( 229128 ) on Thursday February 13, 2003 @08:34PM (#5298653) Journal
    "You may (just may) run into a routing or timeout problem, in which case the download will stop and you are forced to do the entire download again. Using the right client, eg. ncftp, you can continue downloading partially downloaded files. An option, HTTP doesn't offer."

    This is incorrect. Practically every download manager out there allows resuming HTTP downloads. There are only a few (very rare) servers that don't allow this, I guess because they run HTTP 1.0.

    Almost all windows download managers allow it, and for linux, check out 'WebDownloader for X' which has some good speed limiting features as well.

  • by @madeus ( 24818 ) <slashdot_24818@mac.com> on Thursday February 13, 2003 @08:36PM (#5298664)

    There are many reasons to support HTTP over FTP for small files.

    HTTP is a much faster mechanism for serving small files of a few MB (as HTTP doesn't check the integrity of what you've just downloaded and relies purely on TCP's ability to check that all your packets arrived and were arranged correctly).

    Not only is HTTP faster both in initiating a download and while the download is in progress, it typically has less overhead on your server than is caused by serving the same file using an FTP package.

    If you are serving large files (multiple tens of MB) it would be advisable to also have an FTP server, though many users still prefer HTTP for files over 100 MB, and use FTP only if the site they are connecting to is unreliable.

    The speed of today's connections (56k, or DSL, or faster) means that the FTP protocol is not redundant, but it's less of a requirement than it used to be - as the consensus of what we consider to be a large file size has changed greatly.

    There was a time when anything over 500K was considered 'large' and the troublesome and unreliable nature of connections meant that software that was over that size would almost certainly need to be downloaded via FTP to ensure against corruption.

    Additionally, many web servers (Apache included) and web browsers (Netscape/Mozilla included) support HTTP resume, which works just like FTP resume.

    Unless you are serving large files (e.g. over 20-30 MB) or you have a dreadful network connection (or your users will - for example if they will be roaming users connecting via GPRS), HTTP is sufficient and FTP will only add to your overhead, support and administration time.

    One last note: I'd also add that many users in corporate environments are not able to download via FTP due to poorly administered corporate firewalls. This occurs frequently even in large corporations due to incompetent IT and/or 'security' staff. This should not put you off using FTP, but it is one reason to support HTTP.
  • Re:HTTP is fine (Score:5, Informative)

    by kasperd ( 592156 ) on Thursday February 13, 2003 @08:36PM (#5298667) Homepage Journal
    The HTTP protocol may or may not recommend DIR listings by default

    No, the HTTP protocol does not even specify the concept of a directory listing. Some servers can generate an HTML page from the directory listing, but that is all up to the server; it can generate that page as it likes or even just serve an error.
  • Re:gopher (Score:3, Informative)

    by BrookHarty ( 9119 ) on Thursday February 13, 2003 @08:43PM (#5298700) Journal
    Actually sz/lz works great over ssh; on Windows, SecureCRT supports it (it's on the PuTTY wishlist). When you're working on a box, ssh'ed through 10 gateways, 'sz filename' and you have it. sz/lz should be on every Solaris box.
  • by pixel.jonah ( 182967 ) on Thursday February 13, 2003 @08:46PM (#5298711)
    Or just do a search for the Windows build of wget (wget.exe). I use it all the time; there still isn't a GUI for it, though. However, if you want to pass in a text file with all the URLs you want to download, it's killer. And fast, too.
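
    For example (urls.txt is a placeholder file with one URL per line):

        wget -c -i urls.txt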
  • by TomatoMan ( 93630 ) on Thursday February 13, 2003 @08:53PM (#5298759) Homepage Journal
    OSX natively supports WebDAV; choose "connect to server..." from the Finder's Connect menu, enter the URL, your username and password, and it mounts the web folder as a local disk. You can save directly into it from any application, as well as create folders and drag-and-drop copy from the Finder. It's completely shielded from apps; you can even copy files to a WebDAV volume from a shell and it's entirely transparent (look in /Volumes). Very, very cool.
  • by argonaut ( 37085 ) on Thursday February 13, 2003 @08:54PM (#5298764) Homepage Journal
    Being in IT for a large Fortune 500 company that sells an operating system among other things (no, not Microsoft), I can share some of my experiences with you. So take it for what it is worth.

    Our FTP servers run both HTTP and FTP, providing the same content in the same directory structure. There are five servers that transfer an average of 1-2 TB (terabytes) per month each, so they are fairly busy. In a busy month each server can go as high as 7 TB of data transferred. File sizes range from 1 KB to whole CD-ROM and DVD-ROM images. I think the single largest file is 3 GB.

    The logs show a trend of HTTP becoming more popular for the last several years, and it isn't stopping. Currently 70% of all downloads from the "FTP" servers are via HTTP, while the remaining 30% are via FTP. Six years ago (I lost the logs from before then; they are on a backup tape, but I am way too lazy to get that data), it was roughly reversed: 75% of downloads were via FTP and 25% via HTTP. 90% of all transfers are done with a web browser as opposed to an FTP client or wget or something.

    One thing we learned was that many system administrators will download via FTP from the command line directly from the FTP server, especially during a crisis they are trying to resolve. They do this from the system itself and not a workstation. The reasons for this are a bit of a mystery. Feedback has shown that we should never get rid of this or we might be assassinated by our customers. We thought about it once and put out feelers.

    I would say if you don't need to deal with incoming files and your file sizes are not too large, then stick with HTTP. Anything over about 10 MB should go to the FTP server. An FTP server can be more complicated. It seems like the vulnerabilities in FTP daemons have died down in the past year or so. Also, fronting an FTP server with a Layer 4 switch was a lot more tricky because of all the ports involved. If you want people to mirror you, then go with FTP, or rsync for private mirroring. In reading the feedback, most power users seem to prefer FTP, perhaps because that is what they are used to. Also, depending on the amount of traffic, you might need to consider gigabit ethernet.

    The core dumps being uploaded are getting to be huge. Some of those systems have a lot of memory!
  • Re:My opinion (Score:3, Informative)

    by snol ( 175626 ) on Thursday February 13, 2003 @08:55PM (#5298776)
    It'd be nice if Phoenix and Mozilla would acquire that ability. For some reason the developers' stated position is that it won't happen anytime soon, but one can always vote for the bug [mozilla.org] anyway.
  • by dimator ( 71399 ) on Thursday February 13, 2003 @08:59PM (#5298793) Homepage Journal
    http://www.interlog.com/~tcharron/wgetwin.html [interlog.com]

    This is probably the first thing I get when I'm doing a new Windows installation. For larger files, it's a must. You also don't have to deal with browsers using their cache directory to download, and then *copying* the file to the directory you really wanted. (Who the hell thought of doing it that way?)

  • by androse ( 59759 ) on Thursday February 13, 2003 @09:04PM (#5298823) Homepage
    The problem with using HTTP for large file downloads is that, in most cases, it's cheaper resource-wise to handle multiple simultaneous FTP connections than HTTP connections. Of course, this only becomes a real problem if you have more than a few hundred virtual hosts on a single box. So save your httpd processes, and use FTP for large files.
  • by ZorinLynx ( 31751 ) on Thursday February 13, 2003 @09:12PM (#5298867) Homepage
    Starting multiple TCP connections for a single file download can be advantageous, because of congested network paths.

    If there are 500 TCP downloads occurring, each download will theoretically get 1/500th of the bandwidth.

    Therefore, by opening multiple TCP connections, you will increase the amount of bandwidth for your transfer, at a cost to everyone else using the connection. This is because you've effectively doubled the size of your receive window (one for each connection), causing the host you are downloading from to stuff that many more packets down the pipe.

    The problem is, when everyone does it, it completely negates any advantage to using this method. It also leads to packet loss, since you have that many more TCP connections (each with its own receive window) fighting for pieces of the pie.

  • by RobbieW ( 4330 ) on Thursday February 13, 2003 @09:19PM (#5298907) Homepage
    Dan J. Bernstein [cr.yp.to] has written a fantastic, lightweight server [cr.yp.to] that will serve files via either or both FTP and HTTP depending on how the client connects.

    If you want to serve files to the public, this is the most secure way to do so. If you need to provide the files to only certain logins, use something else. If not, you can run this on very lightweight hardware and if it's the only server running, you won't get hacked. Period.
  • by caulfield ( 39545 ) on Thursday February 13, 2003 @09:32PM (#5298986) Homepage
    OSX natively supports WebDAV; choose "connect to server..." from the Finder's Connect menu

    Yeah, if only it worked. It has very shaky DAV-over-SSL support and no support for any sort of HTTP auth. Also, don't bother trying on non-80 ports.

    Stoopid Apple.

  • by molo ( 94384 ) on Thursday February 13, 2003 @09:39PM (#5299022) Journal
    scp is your friend. Learn how to use it, and it will handle all of your (non-anonymous) file transfers. It is a beautiful thing.
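
    For example (the host and paths are placeholders):

        scp user@example.org:/home/user/report.tar.gz .
        scp -r ./photos user@example.org:/home/user/incoming/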
  • Re:hmm (Score:3, Informative)

    by mvdw ( 613057 ) on Thursday February 13, 2003 @09:44PM (#5299053) Homepage

    Especially since http is faster to connect to than ftp.

    I disagree. Sure, it's easy to browse via http and get one or two files, but when you're trying to suck down the entire directory, http blows (excuse the pun).

    What's faster for getting a whole directory than:

    wget -t 0 -c ftp://ftp.server.name/path/to/dir/*

    That doesn't work with HTTP, because wildcard directory listings don't work with wget over HTTP, at least in the version I have.

  • by mr. methane ( 593577 ) on Thursday February 13, 2003 @09:54PM (#5299106) Journal
    I provide a mirror for a couple of largish open-source sites, and several of them specifically request that sites provide FTP service as preferred over HTTP. A couple of reasons:

    1. Scripts which need to get a list of files before choosing which ones to download - automated installers and the like - are easier to implement with FTP.

    2. FTP generally seems to chew up less CPU on the host. I can serve 12 Mb/s of traffic all day long on a P-II 450 box with only 256 MB of memory.

    3. "download recovery" (after losing connection, etc.) seems to work better in FTP than HTTP.

  • FTP The Easy Way (Score:3, Informative)

    by l0gic_f0x ( 624207 ) on Thursday February 13, 2003 @10:33PM (#5299161)
    I run an FTP server for similar file sizes (1-6 MB) using a Windows 2000 Pro box (yeah, I know I should stick to my preachings about the wonders of Linux, but I'm not 100% confident in my ability to lock down Linux yet), and I'm using BulletProof FTP Server, which is hella cheap but has every feature you could need and is very secure. I highly recommend it. It handles beautifully.
  • Re: sftp (Score:2, Informative)

    by araemo ( 603185 ) on Thursday February 13, 2003 @10:44PM (#5299220)
    sftp incurs a terrible CPU overhead, especially if many people are going to be downloading at once. I doubt most web servers could concurrently handle a few dozen 3DES-encrypted sftp connections without slowing throughput, and if you're hosting files, that's the last thing you want. FTP is supposedly more bandwidth-efficient (though I've never seen proof), but I can still get 400 KB/sec downstream over HTTP, so I don't think it's a huge problem. I'd just use HTTP for the ease of setup. Securing a public-access ftpd is a true pain.
  • by HMC CS Major ( 540987 ) on Thursday February 13, 2003 @10:49PM (#5299245) Homepage
    lynx, wget, and fetch all work over HTTP.
  • Re:hmm (Score:2, Informative)

    by tachyonflow ( 539926 ) on Thursday February 13, 2003 @10:52PM (#5299259) Homepage
    It would appear that IE6.0 (at least) supports this resume feature of HTTP, when conditions permit. I just tested it by interrupting a large download from my web site.
  • by MoFoQ ( 584566 ) on Thursday February 13, 2003 @11:08PM (#5299325)
    Well there's always the option of FTP over SSH2. I'm sure you can find a Java applet that will do the SSH2 and make the tunnel needed for secure FTP.

    Now when we talk about Java, there's another possibility. Some sites (cr@ck) use a Java downloader. It doesn't mean that the Java applet that downloads the file uses HTTP or FTP; it can be some sort of proprietary protocol (or you can combine the best of both worlds).
    One way is to have the applet on a SSL'ed (https) page and it does some decrypting as it downloads a pre-encrypted file from your FTP. Or the person can just download the encrypted file directly and use the applet on the secured page to decrypt it. There's ALWAYS a way to have your cake and eat it all by yourself, too.
  • by sir99 ( 517110 ) on Thursday February 13, 2003 @11:15PM (#5299356) Journal
    lynx, wget, and fetch all work over HTTP.
    Wget (don't know fetch, but assuming it's like Wget) doesn't let you browse to a file; you have to know the full path in advance, or use recursive downloading, or guess with pattern matching.

    Lynx lets you browse, but you can't do globbing, so you see lots of irrelevant crap, and you have to select files to download one at a time.

    For getting (possibly multiple) files whose location you don't know in advance, FTP is more flexible and efficient.

  • Re:No, (Score:3, Informative)

    by sweetooth ( 21075 ) on Thursday February 13, 2003 @11:17PM (#5299364) Homepage
    Most web servers allow a maximum number of connections. If one user is eating up six connections, that is potentially five fewer people who can download the files. With ISOs, the distributor probably has more bandwidth than the person downloading does, so it is more effective to serve as many people with as much bandwidth as possible. It's really a courtesy to the server operator not to open six connections when one will do.
  • Re:hmm (Score:3, Informative)

    by eht ( 8912 ) on Thursday February 13, 2003 @11:25PM (#5299394)
    Most of the 30-second pause should go away if you turn off folder view for FTP: Internet Options, Advanced tab, Browsing section; it's a couple of entries down the list for me.
  • by almaw ( 444279 ) on Thursday February 13, 2003 @11:27PM (#5299397) Homepage
    You should use FTP if you answer yes to any of the following questions:
    1. Do you have bandwidth issues? If you are serving files to many people, FTP servers let you cap the number of concurrent users, which can be useful. I know you can do this with HTTP, but it's difficult to segment the large-file (>1MB) download traffic from the normal site traffic. A separate service also allows you to use all the Quality of Service stuff in the 2.4 kernel nicely.
    2. Do you have a large array of files that the user might want to download, such that using an FTP client to ctrl+select multiple files is the right answer compared to having your users click on twenty links and have to cope with twenty dialog boxes?
    3. Do your users need to be able to upload files to you? This can be done with HTTP, but you'll need some PHP processing or similar on the server, it doesn't support resuming, and it won't work through many company firewalls, so it isn't a good option. HTTP uploading is particularly hopeless for large files, as it provides no user feedback.
    However, you should NOT use FTP if you answer no to either of these:
    1. Are you running some flavour of unix? There just aren't any robust Windows FTP servers. Yes, I'm prepared for the flame war about this. :)
    2. Can you be bothered to keep your FTPd patched? ProFTPd and WU-FTPd both appear frequently on bugtraq. You need to stay on top of the patches, or you will be 0wn3d.
    Simple, see? :)
  • by slagdogg ( 549983 ) on Thursday February 13, 2003 @11:48PM (#5299499)
    'wget' plus a little shell scripting is a very handy tool indeed ... for f in {0..2}{0..9}; do wget http://somesite.com/images/teen-$f.jpg; done :|
  • Consider WeebleFM (Score:2, Informative)

    by Anonymous Coward on Thursday February 13, 2003 @11:53PM (#5299520)
    I just set up WeebleFM http://sourceforge.net/projects/weeblefm/
    It's a PHP front end to FTP. My FTP ports are only open to the loopback interface. Users get the usability of a clean web interface, and I get to have encrypted, password-controlled FTP on a box that only has port 80 open to the internet.

    WeebleFM uses mcrypt to encrypt traffic (and I'm pretty sure I could get it to work over https).

    Using standard Unix permissions, a careful directory scheme, and vsftpd's chroot capabilities, I can have an internet filesharing arrangement with blind drop boxes, a group-accessible directory and any number of world-readable directories.
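
    A hypothetical vsftpd configuration fragment for the loopback-only, chrooted setup described above (the config path varies by distribution):

        printf '%s\n' \
          'listen=YES' \
          'listen_address=127.0.0.1' \
          'local_enable=YES' \
          'chroot_local_user=YES' \
          'anonymous_enable=NO' >> /etc/vsftpd.conf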
  • Re:hmm (Score:1, Informative)

    by Anonymous Coward on Thursday February 13, 2003 @11:57PM (#5299539)
    Isn't there anyone else who notices the 30 second freezes while IE tries to contact the ftp site?

    To be fair, it also freezes with WebDAV (HTTP). Not that we need to be fair to FTP.

    As someone mentioned, it seems to be yet another technical problem arising from shell integration.
  • by Anonymous Coward on Friday February 14, 2003 @12:04AM (#5299578)
    FTP implementations frequently use a fixed, small window size. HTTP on the other hand will honor the system limit, almost always larger even without tuning.

    Dramatically simplified, it means that the connection can send a lot more packets without hearing back from the far end, enabling the connection to reach higher speeds (imagine a phone call where you had to say 'okay' after every word the other person said. Now imagine only having to say it after every sentence. Much faster.)

    The tiny window size of (most crappy legacy implementations of) FTP starts to affect download speed at just 25ms latency, and has a huge effect over 50ms.

    A properly tuned system with HTTP can make a single high-latency transfer hundreds or even thousands of times faster than FTP.

    Relevant links:
    http://www.psc.edu/networking/perf_tune.html
    http://www.nlanr.net/NLANRPackets/v1.3/windows_tcptune.html
    http://dast.nlanr.net/Projects/Autobuf/faq.html
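
    As an illustration of the buffer/window relationship (my own sketch, not from the links above; the 1 MB request is an arbitrary example and the kernel may clamp it to its configured maximum):

        import socket

        def make_client_socket(rcvbuf_bytes=None):
            """Create a TCP socket, optionally asking the kernel for a bigger receive buffer.

            A larger receive buffer lets TCP advertise a larger window, so more data
            can be in flight on a high-latency link before an ACK is required."""
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            if rcvbuf_bytes is not None:
                # Must be set before connect() so it can influence window scaling.
                s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
            granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
            print("receive buffer granted by the kernel:", granted, "bytes")
            return s

        make_client_socket().close()                      # whatever the system default is
        make_client_socket(rcvbuf_bytes=1 << 20).close()  # ask for ~1 MB (kernel may clamp it)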

  • by CmdrWass ( 570427 ) on Friday February 14, 2003 @12:28AM (#5299693) Homepage Journal
    I tend to agree with this, but for different reasons.

    If you are downloading a file off of a remote server, then there are two possibilities:

    1) you know the exact address of the file you are looking for... in this case ftp provides no advantage over using lynx or wget, since in either case you could have been given the direct URL, either as an http url or an ftp url. Basically my point here is that an ftp url is no more or less useful or easy to remember than an http url.

    2) you don't know the address of the file you are looking for... therefore you are pretty much required to browse via http to find the site (or page) you want to download from... so since you are already forced to browse for the site, you might as well use the browser to download. For most people that use graphical browsers, this is great... for those of us (myself included) that use shell browsers (ie lynx and links), this poses little problem as well (unless javascript is required to download a file... I friggen hate javascript... people who use javascript in their websites and have a choice should be fired [note, I use javascript in my work's website... but they make me... I don't have a choice]).
  • bittorrent! (Score:1, Informative)

    by Anonymous Coward on Friday February 14, 2003 @01:00AM (#5299808)
    You want bittorrent

    http://www.bitconjurer.org/BitTorrent/index.html

    It makes it so a few people start downloading, and they in turn upload what they have to others, and it just kind of "spiderwebs" out, reducing the strain on the original host.

    I wish huge projects (distros, mozilla, etc) started using this. It would make everything SO fast.

    I'm not the guy that coded it, just a happy user.
  • Re:hmm (Score:2, Informative)

    by Prof.Phreak ( 584152 ) on Friday February 14, 2003 @01:19AM (#5299864) Homepage
    You may (just may) run into a routing or timeout problem, in which case the download will stop and you are forced to do the entire download again. Using the right client, e.g. ncftp, you can continue downloading partially downloaded files. An option HTTP doesn't offer.

    Heh? HTTP does offer restarts. There is a "Range" header, introduced in HTTP/1.1, which allows you to download any byte range of a file (even if you only need a limited portion of it).

    Just because IE or Netscape usually don't care to support it doesn't mean it's not trivially simple to set up yourself. Most self-respecting developers can write a getter script in a few minutes that allows download restarts via HTTP.
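
    Something along these lines, for instance (a rough Python sketch of my own, assuming a server that honors Range requests; the URL and filename are placeholders):

        import os
        import urllib.request

        def resume_download(url, dest):
            """Fetch url into dest, resuming from wherever a previous attempt stopped."""
            offset = os.path.getsize(dest) if os.path.exists(dest) else 0
            req = urllib.request.Request(url)
            if offset:
                # Ask the server for everything from byte `offset` onward.
                req.add_header("Range", "bytes=%d-" % offset)
            with urllib.request.urlopen(req) as resp:
                # 206 Partial Content means the server honored the Range header;
                # a plain 200 means it ignored it and is resending the whole file.
                mode = "ab" if resp.status == 206 else "wb"
                with open(dest, mode) as out:
                    while True:
                        chunk = resp.read(64 * 1024)
                        if not chunk:
                            break
                        out.write(chunk)

        # resume_download("http://example.com/pub/big.iso", "big.iso")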

  • by Archfeld ( 6757 ) <treboreel@live.com> on Friday February 14, 2003 @01:49AM (#5299932) Journal
    FTP is much easier to deal with when proxy issues come up; it's possible with HTTP, but HTTP makes it difficult. HTTP is nice for quick, small files, but SecureFTP or FTP tunnelled over SSH with hashing is the best (read: fastest, most reliable) way that I know of.
  • by cryptor3 ( 572787 ) on Friday February 14, 2003 @02:03AM (#5299983) Journal

    If you're talking about the human engineering aspect of this discussion only, then I have no disagreement with you. However, FTP is just as technically feasible over SSL, since SSL works at a lower level on the network stack than FTP.

    Furthermore, there are good FTP clients that have SSL support. For example, CuteFTP supports FTP over SSL (and has a very user-friendly interface, for the clueless end user).

    There are a good number of servers supporting FTP over SSL. Serv-U and Sambar are two of the Windows servers. Just do a Google search to see what else there is.

  • Re:hmm (Score:4, Informative)

    by grolim13 ( 110441 ) on Friday February 14, 2003 @02:14AM (#5300018) Homepage
    wget -r -l1 http://http.server.name/path/to/dir/ will suck down all the files in that directory (recursion limited to one level); wget -r -np http://http.server.name/path/to/dir/ will pull it down recursively without ascending to the parent directory.
  • by yomamasbooty ( 598640 ) on Friday February 14, 2003 @02:16AM (#5300025)

    Seems to be a lot of comments about firewalls and FTP from people who obviously don't work with them. Remember there are three basic types of firewall technology: packet filters, proxies, and stateful inspection.

    Packet filtering alone is always a problem because you have to open up all of the high ports.

    Proxy firewalls and FTP (active or passive) are a no-brainer as long as the relevant feature has been enabled. Remember that proxies "watch" the conversation, so the firewall will manage the data connection coming back to the client from port 20, and will recognize the 'PASV' command in the command channel.

    Stateful inspection firewalls include proxying code for the major protocols (FTP, HTTP, Telnet, etc.), so you are covered here as well.

    If you are having problems using FTP through a firewall, then you probably:

    -Are being blocked intentionally

    -Have a lazy security admin who hasn't updated the firewall in five years

    -Have a stupid router jockey "securing" the network with router ACLs (packet filters).

    As long as you are using an up-to-date major firewall like Check Point, PIX, NetScreen, or iptables, there will not be an issue getting FTP to work.

  • by kangasloth ( 114799 ) on Friday February 14, 2003 @02:51AM (#5300111) Homepage

    Microsoft's Web Folders work fine as long as you don't rename or create files with other WebDAV clients. Then you can end up with files named "seal" that show up as "Seal" in Web Folders. No big deal, it's only presentation, right? The problem is that Microsoft sends the mangled version back to the server for future requests, so you can't even get to "/seal", because Microsoft always asks for "/Seal". No problem, you think, I'll type it in manually - but wait - where'd my folder go, and what's with this web page? Doh! Web Folders use the same protocol (http), so Explorer sees a web URL and morphs into Internet Explorer.

    So how do we work around the broken MS implementation? Unless you decide to run your website from a FAT32 partition, the filesystem remains case-sensitive and there's no easy way to make it look otherwise. mod_speling to the rescue! Sure, it's a little overhead, but we can correct Microsoft's blunders on the fly! MS lists the directory, mangles the names, sends bogus requests, and gets magically redirected to the correct file ... and then mangles the name again. Curses, foiled again.

    Last time I ran into this, I gave up and renamed the files to match Microsoft's expectations. If anyone knows of a real solution, I would love to hear it.

  • Security Holes? (Score:3, Informative)

    by NerveGas ( 168686 ) on Friday February 14, 2003 @03:42AM (#5300248)
    Serve out anonymous FTP through publicfile (http://cr.yp.to/publicfile.html). Then there aren't any security holes.

    Really. The security holes in sendmail can be fixed by installing qmail. The security holes in BIND can be fixed by installing djbdns. The security holes in WuFTP (and most others) can be fixed by installing publicfile. There are also other good programs out there as well.

    steve
  • Here's how they work (Score:5, Informative)

    by tyler_larson ( 558763 ) on Friday February 14, 2003 @06:26AM (#5300631) Homepage
    I've worked pretty extensively with these two protocols, writing clients and servers for both. I've read all the relevant RFCs start-to-finish (whole lotta boring) and have a pretty good idea about what they both can do. Now, there's a lot of talk about the two, but few people really understand how they work.

    Forget people's opinions and observations about which is better; here's what they both do, you decide what you like. If you still want opinions, I give mine at the bottom.

    HTTP
    The average HTTP connection works like this:

    • The client initiates a connection. The server accepts but does not send any data.
    • The client sends his request string in the form
      [Method] [File Location]?[Query String] [HTTP version]
    • The client then sends a whole bunch of headers, each consisting of a name-value pair. A whole lot can be done with these headers; here are some highlights:
      • Authentication (many methods supported)
      • Download resume instructions
      • Virtual Host identification (so you can use multiple domains on one IP)
    • The client then can follow the headers up with some raw data (such as for file uploads or POST variables)
    • The server then sends a response string in the form
      [HTTP Version] [Response code] [Response string]
      where the response string is just a human-readable equivalent of the 3-digit response code.
    • Next, the server sends its own set of headers (usually describing the data it's about to send: file type, language, size, timestamp, etc.)
    • Finally, the server sends the raw data for the response itself (usually file contents).
    • If this is a keep-alive connection, we start over. Otherwise the connection is closed
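
    For instance, here is a rough Python sketch of that round trip over a raw socket (mine, not part of the original write-up; example.com, the path, and the Range header are illustrative, and the point is just the request-line / headers / blank-line shape described above):

        import socket

        # Request line, then headers (Host enables name-based virtual hosting,
        # Range asks for a partial download), then a blank line ends the request.
        request = (
            "GET /index.html HTTP/1.1\r\n"
            "Host: example.com\r\n"
            "Range: bytes=0-99\r\n"
            "Connection: close\r\n"
            "\r\n"
        )

        with socket.create_connection(("example.com", 80)) as s:
            s.sendall(request.encode("ascii"))
            reply = b""
            while True:
                data = s.recv(4096)
                if not data:
                    break
                reply += data

        # The reply has the same shape: status line, headers, blank line, raw data.
        headers, _, body = reply.partition(b"\r\n\r\n")
        print(headers.decode("ascii", "replace"))
        print(len(body), "bytes of body")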

    FTP
    FTP connections are a little less structured. The client connects, the server sends a banner identifying itself. The client sends a username and password, the server decides whether to allow access or not. Also, many FTP servers will try and get an IDENT from the client. Of course, firewalls will often silently drop packets for that port and the FTP server will wait for the operation to timeout (a minute or two) before continuing. Very, very annoying, because by then, the client has given up too.

    Next, the client sends a command. There are a lot of commands to choose from, and not all servers support all commands. Here are some highlights:

    • Change directory
    • Change transfer mode (ascii/binary) -- ascii mode does automatic CR/LF translation, nothing more.
    • Get a file
    • Send a file
    • Move a file
    • Change file permissions
    • Get a directory listing

    And here's my favorite part. Only requests/responses go over this connection. Any data at all (even dir listings) has to go over a separate TCP connection on a different port. No exceptions. Most people don't understand this point, but even PASV mode connections must use a separate TCP connection for the data stream. Either the client specifies a port for the server to connect to with the PORT command, or the client issues a PASV command, to which the server replies with a port number for the client to use in connecting to the server.

    The client does have the option to resume downloads or retrieve multiple files with one command. Yay.
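
    For the curious, here is roughly what a passive-mode session looks like driven from Python's ftplib (my sketch; the host and paths are made up, and the point is that the listing and the file each ride on their own data connection):

        from ftplib import FTP

        # The control connection carries only commands and replies; the directory
        # listing and the file itself each travel over their own data connection,
        # which the server tells us about in its reply to PASV.
        with FTP("ftp.example.com") as ftp:
            ftp.login()                    # anonymous login
            ftp.set_pasv(True)             # ftplib's default, shown for clarity
            names = ftp.nlst("/pub")       # data connection number one
            print(names)
            with open("file.iso", "wb") as out:
                ftp.retrbinary("RETR /pub/file.iso", out.write)   # data connection number two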

    Some Observations

    • FTP authentication is usually plain-text. Furthermore, authentication is mandatory. Kinda stupid for public fileservers, if you ask me.
    • FTP is interactive--much better for interactive applications (most FTP clients) but unnecessary overhead for URL-type applications.
    • Both protocols depend on TCP to provide reliability. Reliability is NOT a distinguishing characteristic.
    • For transferring files, both send raw data over a single TCP stream. Neither is inherently faster because they send data in exactly the same way.

    My Opinion
    I honestly think FTP was a bad idea from the beginning. The protocol's distinguishing characteristic is the fact that data has to go over a separate TCP stream. That would be a great idea if you could keep sending commands while the file transfers in the background... but instead, the server waits for the file to finish before accepting any commands. Pointless.

    FTP is not better for large files, nor is it better for multiple files. It doesn't go through firewalls, and quality clients are few. HTTP is equally reliable, but also universally supported. There are also a number of high quality clients available.

    In fact, the only thing FTP is better for is managing server directories and file uploads. But for that, you really should be using something more secure, like sftp (ssh-based).

    Bottom line, ditch FTP. Use HTTP for public downloads and sftp for file management.

  • Resource usage (Score:3, Informative)

    by MattBurke ( 58682 ) on Friday February 14, 2003 @06:50AM (#5300705)
    I used to run a server which distributed ~3TB/month. Initially I served these files via proftpd, but it soon became apparent that ftp daemons are far too bulky for high-volume serving.

    Enter apache. On the same hardware which keeled over under around 30-50 FTP sessions, I could handle over 400 concurrent HTTP sessions, with plenty of ram left over for the vital caching :)
  • by sh!va ( 312105 ) on Friday February 14, 2003 @07:11AM (#5300749)
    FTP traffic is given lower priority than HTTP traffic in a large number of packet shaping / DiffServ type routing algorithms.

    These algorithms are based on the assumption that HTTP traffic consists of fairly short bursts and not long sustained transfers which is typically what FTP traffic looks like. Based on these assumptions, these routers give lower priority to FTP traffic than they do to HTTP.

    This does not mean that you should serve large files off HTTP since it'll be "faster". Au contraire, it means that you should be fair to others and serve them over FTP, so that the routers can do the correct packet shaping even if it means a slight speed hit to you.

    Think of people downloading huge files off your web server and screwing up your warcraft (/quake/whatever) game.
  • Re:OR, How about... (Score:1, Informative)

    by Anonymous Coward on Friday February 14, 2003 @07:19AM (#5300760)
    > P2P?

    BitTorrent is perfect for this. http://bitconjurer.org/BitTorrent/

    >and if you want me to use my bandwith to upload your file to other people, sorry, forget about it.
    >I agree. My upstream is only 40KBytes, I don't want to share it.

    Because you are uploading pr0n all the time, you can't share bandwidth for, well, say, distributing a Linux distribution. If they used BitTorrent, there would be a much better chance of getting that file than downloading it at only 1kbps.

    >also, those clients are a security hazard.
    >I definitely agree. Downloading from a "trusted" website gives me at least some peace of mind that I'm not downloading a virus. Granted it's not guaranteed - but it's far less likely to get infected from a website than it is from Joe Script Kiddie.

    Well, the files get MD5-summed and checked on the fly, so there is very little possibility of the files being altered.

    Vote for BitTorrent! =)

    -V
  • Re:HTTP is fine (Score:3, Informative)

    by slim ( 1652 ) <john.hartnup@net> on Friday February 14, 2003 @07:30AM (#5300784) Homepage
    The HTTP protocol may or may not recommend DIR listings by default

    No, the HTTP protocol does not even specify the concept of a directory listing. Some servers can generate an HTML page from the directory listing, but that is all up to the server; it can generate that page however it likes, or even just serve an error.

    Exactly right, and the point is that there is no explicit standard (there may be a few de facto standards) to say what an HTML directory listing looks like, so coding the equivalent of an FTP client's "mget" command becomes a new job for every site.

    My advice is, if you think your users would like mget or its equivalent, then either give them FTP or think hard about how you could provide the same functionality using HTTP/HTML.

    If they don't need mget, HTTP might be fine.
  • by SEAL ( 88488 ) on Friday February 14, 2003 @09:05AM (#5301034)
    I honestly think FTP was a bad idea from the beginning. The protocol's distinguishing characteristic is the fact that data has to go over a separate TCP stream.

    First, I think FTP was a *good* idea, when you consider that its initial design was in 1971, predating even SMTP. Also since FTP was created when mainframes were king, it has features that seem like overkill today.

    Both protocols depend on TCP to provide reliability. Reliability is NOT a distinguishing characteristic.

    Oh but read the RFC young jedi :) There's a lot more to FTP than you might notice at first glance. The problem is that many clients and servers only partially implement the protocol as specified in the RFC. In particular, nowadays the stream transfer mode is used almost exclusively, which is the least reliable mode, and forces opening a new data connection for each transfer.

    If you dig into RFC 959 more, you'll see some weird things you can do with FTP. For example, from your client, you can initiate control connections to two separate servers, and open a data connection between the two servers.

    There's a lot of power and flexibility built into FTP, and that's why it has stuck around for 30 years. That's really phenomenal when you think about anything software related. Even though most current firewall vendors support active-mode pretty well, passive mode was there all along, showing that someone thought of this potential issue in advance. The main weakness of FTP is that it sends passwords over the wire in plaintext, but for an anonymous FTP server this isn't an issue.

    This is a good resource if you want to read up on the history and development of FTP:

    http://www.wu-ftpd.org/rfc/

    Best regards,

    SEAL

  • by bicho ( 144895 ) on Friday February 14, 2003 @09:16AM (#5301063)
    I come here late, so this might be redundant, but it won't get modded down either >:)

    You are missing some other points about FTP:

    It gives you a shell, so there are customizable accounts and even commands. That makes it a lot easier to work with the files you want to manage (i.e. transfer).
    FTP is jailed.
    The point that it's not well implemented on either the client or server side, or that the implementation has security holes, is another matter.

    FTP is pretty much alive, and I don't know where I'd download my ISOs from otherwise, or even huge amounts of RPMs.
  • Re:HTTP is fine (Score:3, Informative)

    by slim ( 1652 ) <john.hartnup@net> on Friday February 14, 2003 @12:23PM (#5302418) Homepage
    Not if you do it right the first time. Surely directory listings generated by different servers look different, but all of those I have seen had one thing in common: they contain links to the files in the directory. So producing a directory listing from the HTML file is not really a problem if you only need filenames. Just parse the HTML document and find all the links. Remove the links not pointing to files in the directory in question, and remove duplicates if any. And once you actually get the files, be prepared to handle nonexistent files correctly.

    It can be done, but it can't be done /trivially/ and the scope for automation is limited. There's nothing /explicit/ in the HTML that states categorically that it's a directory listing, for example, so you need some kind of human input to say "yes, this is a directory listing, use it as a list of stuff to fetch", or "no, this is data I want, fetch it and save it".

    And, more to the point, although there are tools to let you "get everything linked off this chunk of HTML", they're not ubiquitous the way mget is.
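
    To give an idea of what that per-site job looks like, here is a rough Python sketch of my own (the mirror URL is made up, and real-world listings will need more care than this handles):

        from html.parser import HTMLParser
        from urllib.parse import urljoin
        import urllib.request

        class LinkCollector(HTMLParser):
            """Collect href targets from an HTML page such as an auto-generated index."""
            def __init__(self):
                super().__init__()
                self.links = []
            def handle_starttag(self, tag, attrs):
                if tag == "a":
                    for name, value in attrs:
                        if name == "href" and value:
                            self.links.append(value)

        def list_directory(index_url):
            """Return absolute URLs that look like files directly under index_url,
            skipping parent links, sort-order query links, subdirectories and duplicates."""
            html = urllib.request.urlopen(index_url).read().decode("utf-8", "replace")
            collector = LinkCollector()
            collector.feed(html)
            base = urljoin(index_url, ".")
            files, seen = [], set()
            for href in collector.links:
                absolute = urljoin(index_url, href)
                tail = absolute[len(base):]
                if (absolute.startswith(base) and tail and "/" not in tail
                        and "?" not in href and absolute not in seen):
                    seen.add(absolute)
                    files.append(absolute)
            return files

        # list_directory("http://mirror.example.org/pub/patches/")
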
  • Implement them both (Score:2, Informative)

    by Phred T. Magnificent ( 213734 ) on Friday February 14, 2003 @01:53PM (#5303379)
    It sounds like your objective is to make files available for download by the public. That being the case, your best solution is to provide both methods and let the person downloading the file determine which method is better for his/her/its needs.

    Some will prefer ftp because it's faster. Others, especially those behind overly-restrictive firewalls, will find that http is a more usable alternative.
  • Re:how about rsync? (Score:3, Informative)

    by virtual_mps ( 62997 ) on Friday February 14, 2003 @03:25PM (#5304296)

    What's your error model? It must be pretty wacky, and therefore unbelievable.

    A 128-bit CRC will be exactly as reliable as an MD5 checksum under all common error models.


    Thank you for pointing out that CRCs are designed to look for errors -- since in this application the checksum is used to uniquely identify a block, not to check for errors. You've quite succinctly explained the reason CRCs won't work.
  • Re:FTP or... (Score:3, Informative)

    by TeddyR ( 4176 ) on Sunday February 16, 2003 @03:10AM (#5312835) Homepage Journal
    There is a fairly usable client that does both SFTP and FTP over SSL.

    FileZilla
    http://sourceforge.net/projects/filezilla/
