
FTP: Better Than HTTP, Or Obsolete?
An anonymous reader asks "Looking to serve files for downloading (typically 1MB-6MB), I'm confused about whether I should provide an FTP server instead of / as well as HTTP. According to a rapid Google search, the experts say 1) HTTP is slower and less reliable than FTP and 2) HTTP is amateurish and will make you look like a wimp. But a) FTP is full of security holes, and b) FTP is a crumbling legacy protocol that will make you look like a dinosaur. Surely some contradiction... Should I make the effort to implement FTP, or take desperate steps to avoid it?"
Forget them both.... (Score:3, Flamebait)
Re:Forget them both.... (Score:5, Informative)
Re:Forget them both.... (Score:5, Insightful)
It sounds like anonymous downloading of publicly available files - what do we need encryption for, then?
If not for the encryption, then consider what else you get: a well-defined TCP connection. It's a cinch to configure a firewall to allow sftp connections, while FTP firewalling will give you prematurely grey hair (and if it doesn't, then you're not doing it right).
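As a rough illustration in iptables terms (a minimal sketch, assuming sshd on its default port):

iptables -A INPUT -p tcp --dport 22 -j ACCEPT   # ssh/scp/sftp: one rule, one port
# FTP needs port 21 *plus* either the ftp connection-tracking helper
# or a whole range of high ports opened for passive-mode data connections:
iptables -A INPUT -p tcp --dport 21 -j ACCEPT
modprobe ip_conntrack_ftp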
Re:Forget them both.... Anonymity (Score:4, Insightful)
Re:Forget them both.... (Score:5, Informative)
1. Scripts which need to get a list of files before choosing which ones to download - automated installers and the like - are easier to implement with FTP.
2. FTP generally seems to chew up less CPU on the host. I can serve 12mb/s of traffic all day long on a P-II 450 box with only 256mb of memory.
3. "download recovery" (after losing connection, etc.) seems to work better in FTP than HTTP.
Re:Forget them both.... (Score:5, Insightful)
I personally would say go with http for the files, as it'll be much easier for people behind http proxies to download, it'll get cached more often by transparent proxies, and most browsers support browsing http directories FAR better than FTP directories.
Re:Forget them both.... (Score:5, Insightful)
-Sara
OR, How about... (Score:5, Informative)
I've written a tutorial [anenga.com] on how you can use P2P on your website to save bandwidth, space, etc. An obvious way to do this would be to run a P2P client and share the file on a simple PC and cable modem. This works, but it is a bit generic and unprofessional. A better way may be to run a P2P client such as Shareaza [shareaza.com] on a webserver. You could then control the client using some type of remote service (Terminal Services, for example).
P2P has its advantages, such as:
- Users who download the file also share it. This is especially useful if the client/network supports Partial File Sharing.
- When you release the file using the P2P client, you only need to upload to only a few users. Those users can then share the file using Partial File Sharing etc.
- Unlike FTP and HTTP, they aren't connecting to your webserver. Thus, it saves bandwidth for you and allows people to browse your website for actual content, not media (though media is content too). In addition, there is usually a "Max # of Connections" limit on a web or FTP server. Not so with P2P.
- P2P Clients have good queuing tools. At least, Shareaza does. It has a "Small Queue" and a "Large Queue". This basically allows you to have, say, 4 Upload slots for Large Files (Files that are above 10MB, for example) and one for Small Files (Under 10MB). Users who are waiting to download from you can wait in "Queue", instead of "Max users connected" on FTP.
Though, at its core, all of the P2P software I know of uses HTTP to send files. But the network layer helps file distribution tremendously.
Re:OR, How about... (Score:4, Interesting)
Also, you forgot the first and biggest site with MAGNET links [bitzi.com]. Still, an excellent tutorial, thanks for writing it!
Re:Forget them both.... (Score:4, Insightful)
I don't believe there is anonymous sftp... (Score:5, Interesting)
What I don't care for with FTP is the continuous setup/teardown of data connections. What is even worse with active FTP is that the client side of the data connection establishes server ports, and the server becomes the client (I'd like to be able to use plug-gw from the TIS FWTK for FTP, but this is not possible for the data connections). However, even when enabling passive FTP, the data connections are too prone to lockup. The difficulty of implementing all of this in C probably contributes to the FTP server vulnerabilities.
Still, if you want both (optionally anonymous) upload ability and access from a web browser, FTP is the only game in town.
From the network perspective, the rsh/rcp mechanism is cleaner (in that there is only one connection), but it still has the problem of either passing cleartext authentication or establishing unreasonable levels of trust with trivial authentication. In addition, with rcp syntax you must know much more about the path to a file, and there is no real "browsing."
Many say that SSH is the panacea for these problems, but sometimes I am not concerned about encryption and I just want to quickly transfer a file. The SSH man pages indicate that encryption can be disabled, but I have never been able to make this work. SCP also has never been implemented in a browser for file transfers. I should also say that I've never used sftp, because it has so little support.
Someday, we will have a good, encrypted file transfer protocol (and reliable implementations of that protocol). Sorry to say, but ftp, rcp, and scp are not it. What will this new protocol support?
Boy, I never thought that I could rant about file transfer software for so long!
Re:Forget them both.... (Score:4, Informative)
If the files you are serving are large then use ftp. If the files are smaller (less than 10MB) use http.
http is great, I sometimes throw up a file on there if I need to give it to someone and it is too big to e-mail. (Happened recently with a batch of photos from the car show)
Since I already have a web page it was easy to just throw the file in the http directory and provide the link in an e-mail.
I like http for the most part. I doubt anyone will call you lame for using it, unless the files are huge.
-Chris
Re:Forget them both.... (Score:5, Informative)
To answer the original question: when given a choice, I always download by http. It usually takes less time to set up the connection, probably because of those ident lookups that most ftpds still run by default.
hmm (Score:5, Interesting)
Re:hmm (Score:3, Informative)
You may (just may) run into a routing or timeout problem, in which case the download stops and you are forced to do the entire download again. Using the right client, e.g. ncftp, you can continue downloading partially downloaded files, an option HTTP doesn't offer.
With respect to the original question, I would set up a box offering both HTTP and FTP access.
Re:hmm (Score:4, Interesting)
Re:hmm (Score:4, Informative)
Re:hmm (Score:5, Informative)
didn't you hear? (Score:4, Funny)
Re:hmm (Score:5, Informative)
Plain wrong. RFC2068 [w3.org] section 10.2.7 (the 206 Partial Content response).
Re:hmm (Score:5, Informative)
This is incorrect. Practically every download manager out there allows resuming HTTP downloads. There are only a few (very rare) servers that don't allow this; I guess that's due to them running HTTP/1.0.
Almost all Windows download managers allow it, and for Linux, check out 'WebDownloader for X', which has some good speed-limiting features as well.
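For example, from the command line (URL made up):

$ wget -c http://www.example.com/big.iso        # continue a partial download
$ curl -C - -O http://www.example.com/big.iso   # curl's equivalent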
You will not be forced to redo the entire download (Score:4, Informative)
There are many reasons to support HTTP over FTP for small files.
HTTP is a much faster mechanism for serving small files of a few MB's (as HTTP doesn't check the integrity of what you've just downloaded and relies purely on TCP's ability to check that all your packets arrived and were arranged correctly).
Not only is HTTP faster both in initiating a download and while the download is in progress, it typically has less overhead on your server than is caused by serving the same file using an FTP package.
If you are serving large files (multiple tens of MB's) it would be advisable to also have an FTP server, though many users still prefer HTTP for files over 100 MB, and use FTP only if the site they are connecting to is unreliable.
The speed of today's connections (56k, or DSL, or faster) means that the FTP protocol is not redundant, but it's less of a requirement than it used to be - as the consensus of what we consider to be a large file size has changed greatly.
There was a time when anything over 500K was considered 'large' and the troublesome and unreliable nature of connections meant that software that was over that size would almost certainly need to be downloaded via FTP to ensure against corruption.
Additionally, many web servers (Apache included) and web browsers (Netscape/Mozilla included) support HTTP resume, which works just like FTP resume.
Unless you are serving large files (e.g. over 20 to 30 MB's) or you have a dreadful network connection (or your users will - for example if they will be roaming users connecting via GPRS) then HTTP is sufficient and FTP will only add to your overhead, support and administration time.
One last note: I'd also add that many users in corporate environments are not able to download via FTP due to poorly administered corporate firewalls. This occurs frequently even in large corporations due to incompetent IT and/or 'security' staff. This should not put you off using FTP, but it is one reason to support HTTP.
Re:No, (Score:4, Insightful)
Re:No, (Score:4, Insightful)
Limit number of connexions, NOT which FTP client (Score:4, Interesting)
I can make a dozen or more connexions to your FTP server with nothing more exotic than Netscape. Why pick on download managers when they use the same number of connexions? (BTW, Getright says right in its configuration that "some servers regard segmenting as rude" and recommends against it.) Better to limit connexions to x-many per IP address, and let the user spend them any way they wish.
BTW, if you do limit connexions, please remember that it usually takes one for browsing the site (using Netscape or whatever) PLUS one for the download manager to fetch the file. Otherwise the user who was looking with a browser has to leave the server, then wait for the browser connexion to close (which can take a while) then finally paste the link into the DLManager. So a limit of two connexions from a given IP is a nice practical minimum, and surely not a hard load for anything outside of home servers operating over dialup.
PS. I love FTP's convenience, and I always try to be extra-polite to small servers (and not rude to big ones). I do use Getright, and have segmenting disabled (which BTW is the default).
Re:No, (Score:4, Insightful)
Re:No, (Score:4, Interesting)
I have 3.5mbit DSL, but my ISP's performance is flaky. However, I have no problem pulling 300-350KB/s with a download accelerator.
The reason is simple: congestion! (Score:5, Informative)
If there are 500 TCP downloads occurring, each download will theoretically get 1/500th of the bandwidth.
Therefore, by opening multiple TCP connections, you will increase the amount of bandwidth for your transfer, at a cost to everyone else using the connection. This is because you've effectively doubled the size of your receive window (one for each connection), causing the host you are downloading from to stuff that many more packets down the pipe.
The problem is, when everyone does it, it completely negates any advantage to using this method. It also leads to packet loss, since you have that many more TCP connections (each with its own receive window) fighting for pieces of the pie.
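A back-of-the-envelope example (numbers made up):

10 Mb/s link / 500 downloads     ~ 20 kb/s per connection
one client opens 4 connections   ~ 80 kb/s for that client, taken out of everyone else's share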
Here's how they work (Score:5, Informative)
Forget people's opinions and observations about which is better; here's what they both do, you decide what you like. If you still want opinions, I give mine at the bottom.
HTTP
The average HTTP connection works like this:
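(A rough sketch; host and file names made up, headers trimmed to the essentials.)

C: GET /files/myfile.zip HTTP/1.1
   Host: www.example.com
S: HTTP/1.1 200 OK
   Content-Length: 1048576
   ...the file bytes follow on the same TCP connection, which then closes or is reused for the next request (keep-alive)...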
FTP
FTP connections are a little less structured. The client connects, and the server sends a banner identifying itself. The client sends a username and password, and the server decides whether to allow access. Also, many FTP servers will try to get an IDENT from the client. Of course, firewalls will often silently drop packets for that port, and the FTP server will wait for the operation to time out (a minute or two) before continuing. Very, very annoying, because by then the client has given up too.
Next, the client sends a command. There are a lot of commands to choose from, and not all servers support all of them. Here are some highlights:
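(A representative subset; exact support varies by server.)

USER name / PASS word    - log in (or "anonymous" plus an email address)
PORT h1,h2,h3,h4,p1,p2   - tell the server where to connect for data (active mode)
PASV                     - ask the server for a port to connect to instead (passive mode)
LIST / NLST              - directory listings (sent over the data connection!)
RETR file / STOR file    - download / upload
REST offset              - set a restart point for the next RETR
TYPE I / TYPE A          - binary or ASCII transfer mode
QUIT                     - hang up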
And here's my favorite part. Only requests/responses go over this connection. Any data at all (even directory listings) has to go over a separate TCP connection on a different port. No exceptions. Most people don't understand this point, but even PASV mode connections must use a separate TCP connection for the data stream. Either the client specifies a port for the server to connect to with the PORT command, or the client issues a PASV command, to which the server replies with a port number for the client to use in connecting to the server.
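To make that concrete, a hypothetical passive-mode exchange over the control connection (addresses made up):

C: PASV
S: 227 Entering Passive Mode (192,0,2,10,195,149)   # data port = 195*256+149 = 50069
C: RETR myfile.zip
S: 150 Opening BINARY mode data connection
   ...the file itself flows over a second TCP connection to 192.0.2.10:50069...
S: 226 Transfer complete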
The client does have the option to resume downloads or retrieve multiple files with one command. Yay.
Some Observations
My Opinion
I honestly think FTP was a bad idea from the beginning. The protocol's distinguishing characteristic is the fact that data has to go over a separate TCP stream. That would be a great idea if you could keep sending commands while the file transfers in the background... but instead, the server waits for the file to finish before accepting any commands. Pointless.
FTP is not better for large files, nor is it better for multiple files. It doesn't go through firewalls, and quality clients are few. HTTP is equally reliable, but also universally supported. There are also a number of high quality clients available.
In fact, the only thing FTP is better for is managing server directories and file uploads. But for that, you really should be using something more secure, like sftp (ssh-based).
Bottom line, ditch FTP. Use HTTP for public downloads and sftp for file management.
I'll stick up for poor old FTP (Score:5, Informative)
First, I think FTP was a *good* idea, when you consider that its initial design was in 1971, predating even SMTP. Also since FTP was created when mainframes were king, it has features that seem like overkill today.
Both protocols depend on TCP to provide reliability. Reliability is NOT a distinguishing characteristic.
Oh, but read the RFC, young Jedi :) There's a lot more to FTP than you might notice at first glance. The problem is that many clients and servers only partially implement the protocol as specified in the RFC. In particular, the stream transfer mode is used almost exclusively nowadays, which is the least reliable mode and forces opening a new data connection for each transfer.
If you dig into RFC 959 more, you'll see some weird things you can do with FTP. For example, from your client, you can initiate control connections to two separate servers, and open a data connection between the two servers.
There's a lot of power and flexibility built into FTP, and that's why it has stuck around for 30 years. That's really phenomenal when you think about anything software related. Even though most current firewall vendors support active-mode pretty well, passive mode was there all along, showing that someone thought of this potential issue in advance. The main weakness of FTP is that it sends passwords over the wire in plaintext, but for an anonymous FTP server this isn't an issue.
This is a good resource if you want to read up on the history and development of FTP:
http://www.wu-ftpd.org/rfc/
Best regards,
SEAL
Re:hmm (Score:4, Interesting)
While you have some valid points, IE handling ftp poorly is not really a problem with FTP. A wrench makes a bad hammer... that doesn't make nails any less useful.
I haven't had to worry about continuing a download since I stopped using my 2400 baud modem. The biggest problem with downloads is apps that auto-download (updates) and don't handle resuming. On broadband I haven't concerned myself with an interrupted DL in several years.
As for the firewall issue, if your admin is allowing downloads via HTTP but not FTP for alleged 'security' reasons, he/she/it is clueless.
To address security holes: if there have been any problems with the ftpds lately, they don't get a lot of press. If you are referring to IIS, well... that's your own lookout.
Re:hmm (Score:5, Insightful)
It does when 90% of your users only have wrenches, and don't want to make the switch to hammers. You don't want to hand out nails in that kind of situation.
That said, my problems with IE's ftp seem to be unique. Isn't there anyone else who notices the 30 second freezes while IE tries to contact the ftp site? Or does everyone just take it for granted that their web browser should freeze while it tries to do ftp?
gopher (Score:5, Funny)
Re:gopher (Score:3, Funny)
Re:gopher (Score:4, Funny)
now where did I put that acoustic coupler for my spectrum...
Re:gopher (Score:4, Funny)
$ sz myfile.txt
*SZ
Now downloading "myfile.txt".....
Saved "myfile.txt".
$
Re:gopher (Score:4, Funny)
((12 * 60 * 60 * 24 * 7) * 2 ) / 1024 = 7,257KBs more per week!
*2, as 2 bytes per char, and
Back then, an EXTRA 7.2M a week was a lot of bad graphics porn.
WebDAV + HTTP (+ SSL) (Score:5, Informative)
Today I set up an Apache + mod_ssl + mod_dav server for "drag and drop" shared file folders that can be used by any Windows or Linux client over a single well-known socket port (https = 443/tcp). It took me two hours, without knowing a thing about WebDAV or SSL, to get both working together.
Windows calls it a "Web Folder", while the protocol is usually called DAV or WebDAV. It extends the HTTP GET/POST protocol itself with file-management verbs like COPY, MOVE, DELETE and others.
The key benefits are almost zero training for users and flexibility, while using proven protocols.
WebDAV doesn't do the authentication or encryption itself, but these can be layered in with .htaccess and/or LDAP and/or SSL server certificates and encryption.
There are a few howto's out there. Google.
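For reference, the core of such a setup is only a handful of directives (paths and realm made up; check the docs for your Apache version):

DAVLockDB /var/lock/apache/DAVLock
<Location /shared>
    DAV On
    AuthType Basic
    AuthName "Shared Files"
    AuthUserFile /etc/apache/dav.passwd
    Require valid-user
</Location>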
It's built into OSX too (Score:5, Informative)
Screw all of that! (Score:4, Funny)
do both... (Score:4, Informative)
Try both - see which gets used more.
Re:do both... (Score:5, Funny)
Then report back to us in the first ever Answer Slashdot.
how about rsync? (Score:5, Informative)
seems like the best of both worlds to me.
The real question is: do you control the clients that are going to access you? Or is it something like a browser (which doesn't support rsync)?
Re:how about rsync? (Score:5, Funny)
Re:how about rsync? (Score:5, Informative)
For those not familiar: rsync can copy or synchronize files or directories of files. It divides the files into blocks and only transfers the parts of the file that are different or missing. It's awesome for mirrored backups, among other things. There is even a Mac OS X version that transfers the Mac-specific metadata of each file.
Just today I had to transfer a ~400MB file to a machine over a fairly slow connection. The only way in was SSH and the only way out was HTTP.
First I tried HTTP and the connection dropped. No problem, I thought, I'll just use "wget -c" and it will continue fine. Well, it continued, but the archive was corrupt.
I remembered that rsync can run over SSH, and I rsync'd the file over the damaged one. It took a few moments for it to find the blocks with the errors, and it downloaded just those blocks.
Rsync should be built into every program that downloads large files, including web browsers. Apple or someone should pick up this technology, give it some good marketing ("auto-repair download" or something) and life will be good.
Rsync also has a daemon mode that allows you to run a dedicated rsync server. This is good for public distribution of files.
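A few illustrative invocations (host and path names made up):

$ rsync -avP user@host.example.com:/pub/big.iso .    # pull over ssh; -P keeps partial files and shows progress
$ rsync -avP big.iso user@host.example.com:/pub/     # push the other way
$ rsync rsync://mirror.example.org/pub/              # list a module on a public rsync daemon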
Rsync is the way to go! I guess this really doesn't 100% answer the poster's question, but people really should be thinking about rsync more.
Re:how about rsync? (Score:5, Interesting)
Rsync is the way to go!
Rsync is great in theory, but the implementation has one major problem that makes it less than ideal for many cases: It puts a huge burden on the server, because the server has to calculate the MD5 sums on each block of each file it serves up, which is a CPU-intensive task. A machine which could easily handle a few dozen HTTP downloads at a time would choke with only a few rsync downloads.
This is a problem with the implementation, not with the theory, because it wouldn't be that difficult for the rsync server to cache the MD5 sums so that it only had to calculate them once for each file (assuming it's downloading static content -- for dynamic content rsync will probably never make sense, particularly since we can probably expect bandwidth to increase faster than processing power). The server could even take advantage of 'idle' times to precalculate sums. Once it had all of the sums cached, serving files via rsync wouldn't be that much more costly in terms of CPU power than HTTP or FTP, and it would often be *much* more efficient in terms of bandwidth.
Re:how about rsync? (Score:5, Insightful)
Http/Ftp which is slower? (Score:3, Informative)
I would think FTP is slower since with FTP you have to login and build the data connection before the transfer begins. With HTTP it's a simple GET request.
As far as the actual data being sent, I believe that the file is sent the same way with both protocols. (just send the data via a TCP connection). I could be wrong though.
Re:Http/Ftp which is slower? (Score:5, Informative)
This is not true. FTP does not use UDP for any purpose.
Re:Http/Ftp which is slower? (Score:5, Informative)
Re:Http/Ftp which is slower? (Score:5, Informative)
No, no, no. Jesus. Everyone always gets this wrong. FTP in any mode uses two TCP connections. Passive or not, there is a channel for data and a separate channel for commands.
The difference is that passive-mode means that the client initiates the data connection. The default FTP behavior is for the client to connect to port 21 on the server, and then the server initiates a data connection to the client.
Non-passive FTP clients are very hard for firewalls to keep track of, especially when NAT is involved. Passive is a little better because both connections are outgoing.
But at the same time, passive mode makes the server firewall's job tougher, because it requires a large range of incoming ports for the data connections.
No matter what the mode, FTP is not very firewall-friendly.
well, what're you trying to do? (Score:4, Insightful)
HTTP is restricted by browsers, many of which will not support files larger than a certain size. Furthermore, FTP allows for features such as resume, etc...
The real question, however, is what are you trying to use this for? What's your intended application?
If it's a file repository for moderately computer literate people - FTP is definitely the way to go.
If it's a place for average Joes to store pictures, maybe HTTP is your best option. Sacrificing a bit of speed and capabilities such as resume might be made up for with ease of use.
Re:well, what're you trying to do? (Score:5, Informative)
So does HTTP. With the 'Range' header, you can retrieve only a portion of a resource.
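For instance, a resumed download is just (offsets made up):

C: GET /file.iso HTTP/1.1
   Range: bytes=1048576-
S: HTTP/1.1 206 Partial Content
   Content-Range: bytes 1048576-10485759/10485760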
I agree that it really depends on the application, but for most practical "view directory, download file" purposes, there's no significant difference.
If you wanted to interact with a directory structure, change ownerships, create directories, remove files, etc., it's generally easier to do this with FTP.
Anecdotally, HTTP is more reliable (Score:5, Insightful)
It's generally simpler to get to from a browser, which is where 95% of people's online life is anyway. Yeah, you can rig up a FTP URL, but it seems a bit kludgey and more prone to firewall issues.
Re:Anecdotally, HTTP is more reliable (Score:4, Insightful)
I suspect that's because 99% of people are downloading from one of the FTP servers.
It's generally simpler to get to from a browser, which is where 95% of people's online life is anyway.
I honestly don't see how.
Yeah, you can rig up a FTP URL, but it seems a bit kludgey
ftp://www.mysite.com/file.zip
How is that kludgey?
'Reliability' of HTTP vs. FTP (Score:4, Insightful)
I suspect that's because 99% of people are downloading from one of the FTP servers.
I put it to you that it would be more logical to suspect it's because HTTP is faster than FTP as a transfer protocol. It generates less traffic (and uses less CPU overhead), which means downloads end quicker.
Additionally the CPU overhead generated by FTP connections also causes many sites to limit the number of users who can connect, which often results in 'busy sessions', something much rarer with HTTP (as HTTP servers typically have very high thresholds for the number of concurrent connections they will support). The overhead on a server of a user downloading a file over FTP is much greater than that of a user downloading the same file over HTTP.
Although FTP is of course theoretically more reliable than HTTP, in practice the 'Server busy: too many users' messages, combined with the speed and reliability of modern connections (which in turn make HTTP more reliable), mean that the reverse is often the case from a user's perspective - which is what I think the poster is getting at.
This may be partly due to poor FTP server configuration defaults and/or poor administration, but they cannot shoulder all the blame.
The potential lack of reliability with HTTP is a very minor issue these days, and the extra overhead of integrity checking files in addition to relying on TCP is just not warranted for all but the largest of files.
This doesn't make FTP completely redundant, but it does make it redundant when your files are small and your users are on fast, reliable connections (though the value of 'fast' varies in relation to the size of the file; even 33 kbps is 'fast' compared to the speed of the connections that proliferated when the File Transfer Protocol was developed).
what are you serving again? (Score:3, Funny)
heh, most 1-6mb files I see are on irc fserves
for what it's worth (Score:3, Informative)
HTTP is fine (Score:5, Informative)
FTP is quickly becoming a special-needs protocol. If you need authentication, uploads, directory listings, accessibility with interactive tools, etc., then this is for you. Mainly useful for web designers these days, IMO, since the site software packages can use the extra file metadata for synchronization. Other than that, it's a lot of connection overhead for a simple file.
FTP does have one nice advantage that HTTP lacks: it can limit concurrent connections based on access privileges (500 anonymous and 100 real, etc.). Doesn't sound like you need that.
Go with HTTP. Simple, quick, anonymous, generally foolproof.
Re:HTTP is fine (Score:5, Informative)
does not (by default) allow directory listings
[SNIP]
That is a dangerous and very incorrect assumption which has nothing to do with http and everything to do with your http server.
Re:HTTP is fine (Score:5, Informative)
No, the HTTP protocol does not even specify the concept of a directory listing. Some servers can generate an HTML page from the directory listing, but that is all up to the server; it can generate that page however it likes, or even just serve an error.
"Files," eh? (Score:5, Funny)
Boy do I feel the pain... (Score:5, Funny)
You really gotta watch out for things like this. I know one guy that got a 'click me' sign on his back because he used HTTP instead of FTP.
Transparent (Score:5, Insightful)
And I wouldn't care about the opinion of someone who would actually judge you over what friggin' protocol you use to provide downloads. Such an utter nerd is something that I cannot relate to. Maybe after I use Linux for a few more years, who knows.
What do you want to do? (Score:5, Informative)
1) number of available slots
2) speed limit
3) permission set
Some people can only read files at 60 KB/s, some can read and write (to the upload dir) at the same speed, some can only browse, etc. For this kind of setup, FTP is great _IF_ you keep your software up to date; subscribe to bugtraq or your distro's security bulletin, or both.
On the other hand, HTTP is great when you want to give lots of people unlimited ANONYMOUS access to something. I'm sure there is a way to throttle bandwidth, but can you do it on a class-by-class basis? In proftpd it's a simple "RateReadBPS xxx" and I'm set.
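For example, a rough proftpd fragment along those lines (directive names from the 1.2.x docs; numbers made up):

MaxClients    30 "Sorry, max %m users -- try again later"
RateReadBPS   61440              # ~60 KB/s downloads
<Limit WRITE>
    DenyAll                      # this class can read but not upload
</Limit>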
As always, choose the tool that fits _your_ purpose, not the one that everyone says is "best"; they both have good and bad qualities. And http can be just as secure/insecure as any other protocol.
SCP (Score:5, Interesting)
It all depends on the application. I only use SCP to move files around if I have the choice, just because I like better security if I can have it.
But if you want to offer files to the public, I'd recommend offering both FTP and HTTP so people can use the most convenient.
Make your life simpler: use HTTP (Score:5, Insightful)
Security-wise, HTTP is a big win over FTP if only because it makes your port-filtering easier - "allow to 80" is simpler and less likely to cause unintended holes than all the things you need to do to support FTP active and passive connections. Certain FTP server software has a reputation as having more security holes than IIS, but there are FTP servers out there that are as secure as Apache.
I'd say it depends on what you're serving... (Score:4, Insightful)
1) I've found HTTP transfers are a little faster than FTP transfers (just personally, and I can in no way prove it - it may be user error, or just the programs I'm using)
2) I've found that FTP clients are everywhere - Windows, Linux, BSD, everything I've ever installed has included a command-line FTP client, but not a web browser unless I specifically remember to install one. Furthermore, most of the "live CDs/boot disks" that I use don't have a web browser, but do have FTP... Thus, if you're serving files that a person without a web browser/server might need, I'd set up both.
3) FTP security is what you/your daemon make of it. wu-ftpd has a long history of being rooted... ProFTPD doesn't. vsftpd doesn't. HTTP security is the same way... IIS has a long history of being rooted... Apache doesn't... (Not to say that there haven't been occasional exploits for these platforms.)
There is no clear "Use this" or "Use that" procedure here, it depends entirely on your situation, what you're serving, what your network setup is, etc...
Security (Score:3, Insightful)
WebDAV? (Score:3, Insightful)
You still have some of the unreliability and slowness of HTTP transfers, but it works a lot better through firewalls (and more securely, since connection tracking works better with WebDAV).
I've found Passive Mode FTP to also be more unstable than standard ftp transfers.
HTTP Vs FTP (Score:5, Insightful)
To me, this is a problem of authentication. If you want EVERYONE to have these files, why not just use the HTTP server? If you're targeting a select few people, then why not use the built-in authentication mechanisms of FTP?
Yes I know there are authentication mechanisms for HTTP, but they're arguably harder to implement than setting up an FTP server.
Are your clients only using web browsers to retrieve these files? I'll get flamed for this, but web browsers were not designed for FTP, and thus are klunky at it. HTTP wins there again.
Don't worry about it. Just use HTTP and let the FTP bigots flame away.
Use ZMODEM !! (Score:4, Funny)
I remember my days of BBSes with XMODEM and YMODEM; when ZMODEM showed up we all couldn't be happier. When some idiot in the house picked up the phone and disconnected you from hours and hours of downloading the latest Leisure Suit Larry, I just reconnected and resumed my downloads (but only if I had enough credit; otherwise I might have to upload some crap first).
Change it to, Ask slashdot to do my job. (Score:5, Funny)
Why do we have all these new Ask Slashdot questions that sound like a tech with a year's experience asking how to do his job?
I vote for a new section, "How do I do my job" with a dollar bill as the logo.
weak question, (Score:5, Insightful)
Don't start with finding the solution; figure out what it is you want and what you want it to do, and then find the right tool. We cannot tell you which is right with almost no information about what it will be used for, by whom, what the average user profile is, etc.
HTTP and FTP can be equally insecure, but it shouldn't be much of a job to properly secure an FTP server.
Why /.? (Score:5, Funny)
Depends on the situation. (Score:4, Informative)
FTP has a great advantage in that you can request multiple files at the same time: mget instead of get. Additionally, you can use wildcards in the names, so you can select categories / directories of files with very short commands (mget *.mp3 *.m3u, for example).
Modern browsers allow you to transfer multiple files simultaneously, but they don't queue files for you - FTP will. This may be important if connections might get dropped - the FTP transfer will complete the first file, then move on to the next. In the event of an interruption, you will have some complete files, and one partial (which you can likely resume). For multiple simultaneous transfers - from an http browser - you may have some smaller files finished, but it's likely that all larger files will be partials, and will need to be retransmitted in their entirety, since http doesn't quite support resuming a previous download.
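For instance, queued wildcard transfers with the stock command-line client look like:

ftp> prompt                 # turn off per-file confirmation
ftp> mget *.iso *.md5       # queue everything that matches; files transfer one after another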
So, if you're going to have a web page with many individual links, and you think that most people will download one or two files, http will probably suffice. If you expect people to want multiple files, or that they will want to be able to select groups of files with wildcards (tortuous with pointy-clicky things), then you should have FTP.
It's not that hard to set up both, and that's probably the best solution.
HTTP, hands down (Score:5, Informative)
I also assume the following:
I would say that you should go with HTTP for sure. Of course, you can provide both, but there are some key reasons for using HTTP.
Easier Configuration: Perhaps I'm just not that swift, but I've found that web servers (including Apache) are easier to configure. This is especially true if you have any previous web server experience. Of course, the FTP server is more complex due to its additional features that HTTP doesn't have, but assuming that (c) is true, then you won't need to mess with group access control rights and file uploads.
Speed: This whole "FTP is faster" stuff is not true. HTTP does not have a lot more overhead than FTP; it may even have less overhead in certain cases. Even when it does have more overhead, it is on the order of 100-200 bytes, which is too small to care about. HTTP always uses binary transfers and just spits out the whole file on the same connection as the request. FTP needs to build a data connection for every single data transfer, which can slow things down and even occasionally introduce problems.
Easier for Users: Given assumption (d), your users will be much more familiar with HTTP URLs than FTP addresses. You could just use FTP URLs and let their web browsers download the files, but then you lose the benefit of resuming partial downloads.
Simple Access Controls: Though some people need complex user access rules, you may very well just need simple access controls. HTTP provides this (look at Apache's .htaccess files), and you can even integrate Apache's authentication routines into PAM, if you are really hard core.
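A minimal .htaccess along those lines (paths and realm made up):

AuthType Basic
AuthName "Members Only"
AuthUserFile /home/www/.htpasswd
Require valid-user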
There are a few main areas where FTP currently holds sway:
Partial Downloads: Web browsers typically don't support partial downloads, but the fact of the matter is that the HTTP protocol does support it (see the Range header). The next generation of web browsers may very well include this feature.
User Controls: Addressed above.
File Uploads: Again, HTTP does support this feature, but most browsers don't support it well. Look to WebDAV in the future to provide better support.
In summary, just use HTTP unless you need complex access rules, resumption of partial download, or file uploading. It will be easier both on you and your users.
College blocking ftp? (Score:4, Interesting)
My experiences with FTP and HTTP downloads (Score:5, Informative)
Our FTP servers run both HTTP and FTP, providing the same content in the same directory structure. There are five servers that transfer an average of 1-2 TB (terabytes) per month each, so they are fairly busy. On a busy month each server can go as high as 7 TB of data transferred. File sizes range from 1 KB to whole CD-ROM and DVD-ROM images. I think the single largest file is 3 GB.
The logs show a trend of HTTP becoming more popular over the last several years, and it's not stopping. Currently 70% of all downloads from the "FTP" servers are via HTTP, while the remaining 30% are via FTP. Six years ago (I lost the logs from before this time; they are on a backup tape, but I am way too lazy to get that data) it was completely reversed: 75% of downloads were via FTP and 25% were via HTTP. 90% of all transfers are done with a web browser as opposed to an FTP client or wget or something.
One thing we learned was that many system administrators will download via FTP from the command line directly from the FTP server, especially during a crisis they are trying to resolve. They do this from the system itself and not a workstation. The reasons for this are a bit of a mystery. Feedback has shown that we should never get rid of this or we might be assassinated by our customers. We thought about it once and put out feelers.
I would say if you don't need to deal with incoming files and your file sizes are not too large, then stick with HTTP. Anything over about 10 MB should go to the FTP server. An FTP server can be more complicated. It seems like the vulnerabilities in FTP daemons have died down in the past year or so. Also, fronting an FTP server with a Layer 4 switch was a lot trickier because of all the ports involved. If you want people to mirror you, then go with FTP, or rsync for private mirroring. In reading the feedback, most power users seem to prefer FTP, perhaps because that is what they are used to. Also, depending on the amount of traffic, you might need to consider gigabit ethernet.
The core dumps being uploaded are getting to be huge. Some of those systems have a lot of memory!
HTTP and FTP FUD (Score:5, Insightful)
1) HTTP doesn't support resumed downloading.
- That's ridiculous. It has supported this since HTTP/1.1, years ago. In fact, it can even do things like request bytes 70,000 - 80,000, then 90,763 - 96,450, etc.
2) HTTP doesn't support security/authentication
- Ridiculous. HTTP has an open-ended model for authentication and security, many of which are secure and standardized. If you REALLY need security, use HTTPS.
3) HTTP doesn't support uploading
- HTTP/1.1 has had this for a while. Netscape 4.7, Mozilla 1.1, and IE 4+ support this. I must admit though, it sucks.
Several people have pointed out the real differences:
1) FTP doesn't like firewalls
- Passive FTP fixes this, but it has quirks and limitations.
2) FTP supports directory listing, renaming, uploading, changing of permissions, etc.
- This is what FTP is for
- This can be done in HTTP, but requires serious work
- If the scope creeps, shell access would be better.
Apply these three questions... (Score:5, Informative)
I'm dealing with this decision now too (Score:4, Interesting)
So far, it's been a failure for two reasons:
1. IE blows as an FTP client, and users aren't comfortable dropping into the (somewhat crappy) DOS FTP client.
2. Firewall setups at the fortune 500 companies that we deal with normally seem to keep FTP access off-site restricted.
FTP is slower due to TCP Window Size (Score:5, Informative)
Dramatically simplified, it means that the connection can send a lot more packets without hearing back from the far end, enabling the connection to reach higher speeds (imagine a phone call where you had to say 'okay' after every word the other person said. Now imagine only having to say it after every sentence. Much faster.)
The tiny window size of (most crappy legacy implementations of) FTP starts to affect download speed at just 25ms latency, and has a huge effect over 50ms.
A properly tuned system with HTTP can make a single high-latency transfer hundreds or even thousands of times faster than FTP.
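The arithmetic behind this is simple: a TCP connection can never move more than one window of data per round trip, so (illustrative numbers):

throughput <= window size / round-trip time
64 KB window at 50 ms RTT:   65536 B / 0.05 s  ~ 1.3 MB/s, no matter how fat the pipe
256 KB window at 50 ms RTT:  262144 B / 0.05 s ~ 5.2 MB/s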
Relevant links:
http://www.psc.edu/networking/perf_tune.h
http://www.nlanr.net/NLANRPackets/v1.3/window
http://dast.nlanr.net/Projects/Autobuf
HTTP is better for most cases (Score:5, Insightful)
The main strengths of HTTP over FTP for file transfers are:
The other differences one sees are due to server design issues. I.e. most FTP servers are large and spawn a process per connection, which makes FTP sessions much slower than HTTP sessions. But if you want to use FTP, there are very fast FTP servers out there.
Overall, in today's world, it does not make sense to use FTP unless you have a requirement from your users. For public access to files, use HTTP or something more modern, such as rsync or a P2P network.
As usual, you should answer such questions by thinking about your target users and asking yourself what they are likely to be most comfortable using. Chances are it's their main tool, the web browser.
Re:In my opinion, (Score:3, Offtopic)
Re:Both... (Score:5, Insightful)
Many companies lock down their firewalls with a huge, gigantic virtual padlock -- ie, port 80 outgoing only. When I'm at work and want to download something from a site that offers FTP only, I'm screwed. Some companies will keep ports < 1024 or some other low number open, but many will not. This would be a driving reason to provide HTTP access to files.
Personally, anyone that calls you an amateur for using HTTP or a dinosaur for using FTP needs to get a life. Both protocols have their place in the modern internet; there's absolutely nothing wrong with serving files via either method other than security and the above mentioned concern. Do what you feel is easiest to maintain and simplest for your users.
Re:Both... (Score:3, Funny)
Would you download that on your work connection? Heh.
Daniel
Re:Both... (Score:4, Informative)
No you're not! Assuming you have access to a Windows workstation, check out HTTP-Tunnel [http-tunnel.com]
Re:Both... (Score:5, Informative)
FTP entirely uses TCP. What passive mode does is cause the FTP client to connect to the FTP server to download the file; in ordinary, traditional "active" FTP, the FTP client sends an address to the FTP server, which the server then connects to in order to transfer the file. Obviously, this does not play nicely with FTP clients behind firewalls that don't allow incoming connections.
Re:Different, not better or wose (Score:5, Interesting)
Re:Different, not better or wose (Score:4, Insightful)
A pretty distinct advantage for those of us who use shells often or have to repair machines at a command prompt.
Re:Different, not better or wose (Score:5, Informative)
Lynx lets you browse, but you can't do globbing, so you see lots of irrelevant crap, and you have to select files to download one at a time.
For getting (possibly multiple) files whose location you don't know in advance, FTP is more flexible and efficient.
Re:Different, not better or wose (Score:4, Interesting)
Ever heard of directory indexes, which almost every common web server can generate automatically when no index file is present?
I find lynx plus web directory indexes a lot faster for getting files across my network than ftp.
The only real advantage I see with ftp is uploading files quickly and easily... but that's not for me
Re:Different, not better or wose (Score:5, Insightful)
mget blah[1,2,3].iso....
Get the drift? HTTP indexes are rather stupid if you ask me, it's FTP without the features. And before you whine "But I don't need the features", neither do I most of the time, but it's nice when they're there.
HOW ABOUT UPLOADING??? (Score:5, Insightful)
Re:Different, not better or wose (Score:5, Insightful)
FTP is notoriously difficult to secure while retaining ease of use for the clueless end-user.
-Sara
Re:Different, not better or wose (Score:5, Interesting)
True, but ftp has far more negatives. The connection model ftp uses, in which the server connects back to the client for data transfers, is a horrible idea. Surely there must be a benefit to this method (or else why would ftp have become such a standard?), but I've yet to hear it. Passive ftp (PASV) is supposed to compensate for the fact that clients behind NAT can't use ftp, and passive is pretty much supported everywhere now. But passive doesn't solve the problem of many connections, it just shifts it: with passive ftp, there is still a separate "data" connection for every file transfer or transfer of directory contents. It's just that with passive ftp the data connection is initiated by the client to some high-numbered port on the server. This allows clients behind NAT to do ftp, but it makes firewalling the server a pain in the ass, especially if the server is (god forbid) behind NAT itself. It's relatively easy to set up a mail server, or webserver, or ssh server, on a machine behind NAT (assuming you can forward ports on the NAT gateway). But try doing the same with ftp. Passive connections won't work unless the ftpd is configured to only use a certain range of high ports for its data connections, and the firewall forwards that entire range. You'd think with all these connections the protocol might support multiple connections from one login, i.e. list another directory while downloading. Well, that's not the case at all; you have to log in again if you want simultaneous transactions.
And then if you want tunneled connections... oh boy, what a pain in the ass:
I set this all up for someone recently, and it DOES finally work, with tunneled passive ftp connections to a proftpd server behind a NAT gateway, but it was a big pain. Due to the way proftpd works (not sure about other daemons...), passive connections are directed to the interface the client is connected to. So, with an ssh-tunneled connection, proftpd sees the client connecting TO 127.0.0.1 (because that's where sshd is). So it tells the client: connect back to me for the data connection, my IP is 127.0.0.1. And the client tries to connect to itself, and fails, saying "can't open data connection". The solution was to point the tunnel not at 127.0.0.1 but at the external IP, and have the NAT gateway forward internal traffic back to itself. So traffic comes in over ssh, over to the firewall (where the NAT rewriting happens), and then back to the same machine for ftp.
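For the record, the tunnel itself is the easy part (hostnames made up); it's the passive data connections that force the dance above:

$ ssh -L 2121:localhost:21 user@gateway.example.com   # forward local port 2121 to the ftp control port
$ ftp localhost 2121                                  # point the client at the tunnel...
# ...and then watch PASV hand back 127.0.0.1, which points at the client's own machine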
If people could just use sftp clients, this wouldn't be an issue _at_all_ (one client connection, one TCP connection! Hey, what a concept!). But due to the integrated ftp clients in programs like BBEdit and Dreamweaver, I suspect we'll be dealing with that bastard protocol known as ftp for quite a long time....
Damn, reading back over this, I'm kind of ranting. BUT ITS ALL TRUE!
Sorry bout that.
Re:Different, not better or wose (Score:5, Interesting)
Re:Different, not better or wose (Score:5, Insightful)
But really, even at that I think it has more to do with the features typically found in the clients than with the differences in the protocols.
HTTP transfers are usually done using a web browser. Web browsers can also perform FTP transfers, but FTP clients offer some capabilities that web browsers lack (and of course also lack some capabilities that web browsers have). With a web browser, grabbing ONE file to download is easy. Starting a second or third download is also easy. But what if you want to initiate a transfer (either up or down) of all files within a directory tree with a single operation, and maybe even have the directory structure kept intact? What if you want to set up a complex "batch" of transfers with full control over which files will be transferred into which directories, some transferring up, others down, and have the whole thing run unattended, with the ability not just to resume a partial download when the user restarts it, but to automatically restore dropped connections, resuming any partial download and continuing to process the batch without user intervention? There are FTP clients that can do all of this. I suppose it could be done over HTTP, but I sure haven't seen it. Not even close. And until something like that does exist, this difference in client capabilities is a valid justification for continuing to use FTP, at least for applications where those capabilities are beneficial.
And here's another reason to offer FTP along with HTTP: some users like it. If you have any concern about the extent to which your site pleases its users, then that is a perfectly valid item to put in the "pros" column.
Having that kind of capability within a browser would really rock though. Especially if it worked for both FTP and HTTP. I suppose it would need some standardization on the web server end as well, so that the client can reliably parse the http directory listings. Though I suppose it might still be possible to make it work at least with the most common servers. Any browser developers listening?
Re:I wouldn't worry about it... (Score:4, Funny)
If you're using Linux, you can use wget, and if you're on Windows, you can get Cygwin and then also use wget. There have gotta be other utilities with the same features, but wget is definitely the classic and does pretty much everything you'd need.
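E.g., a hypothetical one-liner to grab a whole directory over HTTP, resuming as needed (URL made up):

$ wget -r -np -c http://www.example.com/pub/photos/   # recursive, don't ascend to parent dirs, continue partial files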