FTP: Better Than HTTP, Or Obsolete?
An anonymous reader asks "Looking to serve files for downloading (typically 1MB-6MB), I'm confused about whether I should provide an FTP server instead of / as well as HTTP. According to a rapid Google search, the experts say 1) HTTP is slower and less reliable than FTP and 2) HTTP is amateur and will make you look a wimp. But a) FTP is full of security holes. and b) FTP is a crumbling legacy protocol and will make you look a dinosaur. Surely some contradiction... Should I make the effort to implement FTP or take desperate steps to avoid it?"
do both... (Score:4, Informative)
Try both - see which gets used more.
how about rsync? (Score:5, Informative)
seems like the best of both worlds to me.
the real question is - do you control the clients that are going to access you? or is it something like a browser (which doesn't support rsync).
Http/Ftp which is slower? (Score:3, Informative)
I would think FTP is slower, since with FTP you have to log in and build the data connection before the transfer begins. With HTTP it's a simple GET request.
As far as the actual data being sent, I believe that the file is sent the same way with both protocols. (just send the data via a TCP connection). I could be wrong though.
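To make "it's a simple GET request" concrete, here is a minimal sketch (Python; the function name is mine) of an HTTP/1.0 fetch over a single TCP connection: no login step, no second data connection, just one request and one response.

```python
import socket

def http_get(host, port, path):
    """Fetch a resource with one plain HTTP/1.0 GET over a single TCP
    connection -- no login, no separate data connection."""
    with socket.create_connection((host, port)) as s:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        s.sendall(request.encode("ascii"))
        # HTTP/1.0 servers close the connection when the response is done,
        # so read until EOF.
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    response = b"".join(chunks)
    header, _, body = response.partition(b"\r\n\r\n")
    return header.decode("ascii", "replace"), body
```

Compare that with the FTP dialogue described elsewhere in this thread (USER, PASS, TYPE, CWD, PORT/PASV, RETR), each step costing a round trip before any data moves.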
for what it's worth (Score:3, Informative)
FTP is the cat's pj's (Score:2, Informative)
FTP is the cat's pajamas. HTTP depends on too much other stuff.
ftp: going, going, but not gone (Score:1, Informative)
If you are providing a large number of files where people frequently download several files from the same directory, then ftp access would help as most ftp clients can queue multiple files for downloading.
If users are uploading and downloading multiple files, then ftp is still your best bet by far. No one wants to upload one file at a time via some html form.
HTTP is fine (Score:5, Informative)
FTP is quickly becoming a special-needs protocol. If you need authentication, uploads, directory listings, accessibility with interactive tools, etc. then this is for you. Mainly useful for web designers these days, IMO, since the site software packages can use the extra file metadata for synchronization. Other than that, it's a lot of connection overhead for a simple file.
FTP does have one nice advantage that HTTP lacks: it can limit concurrent connections based on access privileges (500 anonymous and 100 real, etc.). Doesn't sound like you need that.
Go with HTTP. Simple, quick, anonymous, generally foolproof.
Security is the only worry (Score:2, Informative)
We run ftp because we have to have people send us files, and also distribute files to them on a regular basis. The client software available for doing that kind of regular sending and receiving is a lot better for FTP. It's pretty klunky over http, but it is very doable.
We just choose to stay on top of our ftp updates.
What do you want to do? (Score:5, Informative)
1) number of available slots
2) speed limit
3) permission set
Some people can only read files at 60KB/s, some can read and write (to the upload dir) at the same speed, some can only browse, etc. etc. For this kind of a setup, FTP is great _IF_ you keep your software up to date; subscribe to bugtraq or your distro's security bulletin, or both.
On the other hand, HTTP is great when you want to give lots of people unlimited ANONYMOUS access to something. I'm sure there is a way to throttle bandwidth, but can you do it on a class by class basis? In proftpd it's a simple "RateReadBPS xxx" and I'm set.
As always, choose the tool that fits _your_ purpose, not the one that everyone says is "best"; they both have good and bad qualities. And http can be just as secure/insecure as any other protocol.
Re:FTP (Score:2, Informative)
http://www.gnu.org/software/wget/wget.html [gnu.org]
Re:Forget them both.... (Score:5, Informative)
ftp has more features (Score:2, Informative)
On the other hand, if you don't need user authentication and don't want to offload big file transfers from your web server, you may as well just leave it as http.
Re:HTTP is fine (Score:5, Informative)
does not (by default) allow directory listings
[SNIP]
That is a dangerous and very incorrect assumption which has nothing to do with http and everything to do with your http server.
Re:well, what're you trying to do? (Score:5, Informative)
So does HTTP. With the 'Range' header, you can retrieve only a portion of a resource.
I agree that it really depends on the application, but for most practical "view directory, download file" purposes, there's no significant difference.
If you wanted to interact with a directory structure, change ownerships, create directories, remove files, etc., it's generally easier to do this with FTP.
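As a toy sketch of how the 'Range' header works in practice, here is roughly what a server does to honor a single `Range: bytes=start-end` request (the function name and simplifications are mine; real servers also handle suffix ranges like `bytes=-500` and multi-part ranges, which this does not):

```python
def serve_range(payload: bytes, range_header: str):
    """Return (status, body) for a single 'bytes=start-end' Range request,
    mimicking how an HTTP server resumes a partial download.
    Toy sketch: one range only, no suffix ranges like 'bytes=-500'."""
    unit, _, spec = range_header.partition("=")
    if unit != "bytes":
        return 200, payload                 # unknown unit: ignore the header
    start_s, _, end_s = spec.partition("-")
    start = int(start_s)
    end = int(end_s) if end_s else len(payload) - 1
    if start >= len(payload):
        return 416, b""                     # Range Not Satisfiable
    return 206, payload[start:end + 1]      # Partial Content
```

A client resuming at byte 7 of a 10-byte file would send `Range: bytes=7-` and get back status 206 with just the last three bytes.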
Re:In my opinion, (Score:2, Informative)
Re:hmm (Score:3, Informative)
You may (just may) run into a routing or timeout problem, in which case the download will stop and you are forced to do the entire download again. Using the right client, eg. ncftp, you can continue downloading partially downloaded files, an option HTTP doesn't offer.
With respect to the original question, I would set up a box offering both HTTP and FTP access.
Says who? (Score:2, Informative)
Both FTP and HTTP stream data across a TCP socket -- I can't see that streaming it over port 20 versus 80 is going to make any difference.
FTP was designed to be able to do all these neat things back when the internet didn't have so many security issues. Most of these features are either not used or explicitly disabled these days... The fact that the FTP server uses a different port means that firewalls have to understand the protocol and be properly configured for it. HTTP sends the data back in response to the initial connection, so it tends to be easier to get through firewalls.
If you're concerned about looking like a "wimp", then you should offer both and let people pick what they prefer. Or... Stop worrying about what these people think and figure out what YOU think is best.
The people who would call you a "wimp" probably aren't worth worrying about.
Sean
security issues? (Score:1, Informative)
Re:My opinion (Score:1, Informative)
Re:hmm (Score:5, Informative)
My experience.... (Score:2, Informative)
Re:Http/Ftp which is slower? (Score:0, Informative)
You were right on that one point...
an FTP session has two connections: the control connection, which is TCP/IP, and the data connection, which is UDP. The latency (time to auth, etc.) is longer on FTP, but it's not really 'slower'.
For the actual benefits and tradeoffs of each just read some of the other posts in the thread.
Re:well, what're you trying to do? (Score:2, Informative)
HTTP 1.1 supports resuming. I have set up Apache to serve movie files that I can play over the network. I can seek back and forth throughout the movie.
ftp is a turd - real men use dns (Score:1, Informative)
However, real men write a protocol that works over DNS TXT records using the CHAOS protocol. I'm actually working on this!!!
Re:My opinion (Score:2, Informative)
Re:hmm (Score:3, Informative)
Yes, I thought about wget while I wrote my answer - but left it out, simply because _for John Doe_ wget is too complicated. John Doe wants a clickety-click-drag-n-drop client, like a web browser or something like WS-FTP. Granted, ncftp doesn't fall into that category either, but even John Doe can use a simple ftp client.
Hint: If they talk like that, they're not experts. (Score:2, Informative)
(Cleaning up the text a bit)
Well, 2 and 4 are nothing more than acephalic punditry, unworthy of our attention, which leaves 1 and 3.
The fact that HTTP doesn't use a binary connection to transfer binary files means that, yes, it is frequently slower than FTP. Especially since your listed file sizes imply that you're not offering text files for download.
While FTP doesn't have any security holes (Yay for false generalisations!), many of the readily available ftp daemons have had shaky track records in the security area.
I don't really have an answer for you, I just wanted to say acephalic. :-) Acephalic acephalic acephalic...
Re:hmm (Score:5, Informative)
Plain wrong. RFC2068 [w3.org] section 10.2.7.
FTP _MUCH_ faster than HTTP (Score:3, Informative)
Re:HTTP is fine (Score:2, Informative)
Ahem. (Score:3, Informative)
What bugs me is when servers won't tell me the final downloaded file size -- no ETA available. I've seen both FTP and HTTP servers do it. The same goes for servers that don't support resuming or last-modified dates. They suck.
Re:Http/Ftp which is slower? (Score:5, Informative)
This is not true. FTP does not use UDP fpr any purpose.
Re:Both... (Score:4, Informative)
No you're not! Assuming you have access to a Windows workstation, check out HTTP-Tunnel [http-tunnel.com]
Re:Http/Ftp which is slower? (Score:5, Informative)
OR, How about... (Score:5, Informative)
I've written a tutorial [anenga.com] on how you can use P2P on your website to save bandwidth, space, etc. An obvious way to do this would be to run a P2P client and share the file on a simple PC & cable modem. This works, but it is a bit generic and unprofessional. A better way may be to run a P2P client such as Shareaza [shareaza.com] on a webserver. You could then control the client using some type of remote service (Terminal Services, for example).
P2P has its advantages, such as:
- Users who download the file also share it. This is especially useful if the client/network supports Partial File Sharing.
- When you release the file using the P2P client, you only need to upload to only a few users. Those users can then share the file using Partial File Sharing etc.
- Unlike FTP and HTTP, users aren't connecting to your webserver. Thus, it saves bandwidth for you and lets people browse your website for actual content, not media (though media is content too). In addition, there is usually a "Max # of Connections" limit on an HTTP or FTP server. Not so on P2P.
- P2P Clients have good queuing tools. At least, Shareaza does. It has a "Small Queue" and a "Large Queue". This basically allows you to have, say, 4 Upload slots for Large Files (Files that are above 10MB, for example) and one for Small Files (Under 10MB). Users who are waiting to download from you can wait in "Queue", instead of "Max users connected" on FTP.
Though, at its core, all of the P2P software I know of uses HTTP to send files. But the network layer helps file distribution tremendously.
Re:HTTP is fine (Score:3, Informative)
Yeah, if you need CLEAR TEXT auth, FTP is for you. If you want SSL auth, maybe enable auth for your http server.
uploads, directory listings,
Which http can do fine, thanks.
accessibility with interactive tools, etc. then this is for you.
Dunno about this.
Mainly useful for web designers these days, IMO, since the site software packages can use the extra file metadata for synchronization.
I'd push for SSL webdav in an instant...
Sorry, but I live behind various firewalls and am sick to death of FTP. The sooner it dies, the better.
(best not to take this post too seriously - FTP just really pisses me off)
Re:HTTP is fine (Score:3, Informative)
Re:Both... (Score:5, Informative)
FTP entirely uses TCP. What passive mode does is cause the FTP client to open the data connection to the FTP server to download the file; in ordinary, traditional "active" FTP, the FTP client sends an address to the FTP server, and the server connects back to that address to transfer the file. Obviously, this does not play nicely with FTP clients behind firewalls that don't allow incoming connections.
PASV mode still opens a separate data connection (Score:1, Informative)
This can be helpful when the client is NAT'ed, otherwise the client will send a PORT command with an unroutable address. Of course, if the server is NAT'ed, the reverse will happen. There are stateful NAT devices that will actually examine FTP control connections and rewrite PORT commands, but NAT and FTP are basically a pain in the arse to deal with. Throw some encryption in the mix (sftp) with NAT, and you'll understand why FTP is not long for this world.
Only on Slashdot could you learn that FTP uses UDP for the data connection or that PASV mode only uses a single socket!
Re:gopher (Score:1, Informative)
HTTP vs FTP (Score:2, Informative)
FTP was designed for interfacing with the filesystem of a remote Unix system, with the filesystem permissions that are granted to the user you log in as. FTP lets you browse the hierarchy, including examining ownership, permission, and symlink targets; pretty much the same as what you get with 'ls -l'. Apache does file listings, but only shows file names, last modification dates, and size. This makes FTP more suitable than HTTP for remote mirroring of directory trees. This also makes it easier to "browse" what an FTP server has to offer, on a directory-by-directory basis.
With FTP, the server prints a response when a client connects. Usually, the client sends a user name, password, the 'SYST' command, and asks for the current working directory, tells the server what mode (ASCII or binary) it wants, changes to the directory with the file it needs, sends a PORT command, then finally requests the file. With HTTP, the client connects, sends a request, and the server responds. That's 8 client commands and 9 server responses with FTP, as opposed to 1/1 with HTTP. Each time a command is sent, the client has to wait for the server to respond. The latency adds up, and that means, especially on high-latency connections, FTP is slower to initiate and begin downloading than HTTP. Who said HTTP is slower?
Regarding reliability, both protocols and modern implementations of their clients and servers have features to resume a broken download from where it left off. Who says one is more reliable than the other?
HTTP is simpler than FTP. As far as I can tell, in FTP "active mode", the client sends a PORT command with an IP address and port number on which it is listening. In "passive mode", the server sends back an IP address and port number on which it is listening, after the PASV command. These address/port combinations are used for the actual file transfer.
Active mode doesn't work if there is NAT between the client and the server, unless the NAT system rewrites the packets so that the IP address the server sees in the PORT command is the outside, external address of the NAT system. When an FTP server is behind NAT, passive mode cannot be used without a similar kluge; the client must get an outside-world IP address to connect to, which active-mode PORT provides. If both client and server are behind NAT, then one of these NAT kluges must be in effect for file transfer to be possible.
This address/port nonsense could be part of the security concern with FTP. I also believe older FTP implementations allowed the client (in active mode) or server (in passive mode) to specify arbitrary address/port combinations, so that the FTP server or client could be used as a proxy in an attack. Is this still the case?
With HTTP, transfers are conducted on the same TCP connection as control is on, and therefore doesn't need to concern itself with IP addresses and ports, and the people using it have fewer NAT headaches.
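The address/port juggling described above is quite literal: a passive-mode server answers PASV with six decimal numbers that the client must reassemble into an address and a port. A small illustrative parser (the function name is mine):

```python
import re

def parse_pasv(reply: str):
    """Extract (host, port) from an FTP '227 Entering Passive Mode' reply.
    The six numbers are four address octets plus the port split into
    high and low bytes: port = p1 * 256 + p2."""
    m = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if not m:
        raise ValueError("not a PASV reply: " + reply)
    n = [int(x) for x in m.groups()]
    return ".".join(map(str, n[:4])), n[4] * 256 + n[5]
```

It is exactly this embedding of addresses inside the protocol that NAT devices have to find and rewrite, which is why FTP and NAT get along so poorly.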
Depends on the situation. (Score:4, Informative)
FTP has a great advantage in that you can request multiple files at the same time: mget instead of get. Additionally, you can use wildcards in the names, so you can select categories / directories of files with very short commands. (mget *.mp3 *.m3u)
Modern browsers allow you to transfer multiple files simultaneously, but they don't queue files for you - FTP will. This may be important if connections might get dropped - the FTP transfer will complete the first file, then move on to the next. In the event of an interruption, you will have some complete files, and one partial (which you can likely resume). For multiple simultaneous transfers - from an http browser - you may have some smaller files finished, but it's likely that all larger files will be partials, and will need to be retransmitted in their entirety, since http doesn't quite support resuming a previous download.
So, if you're going to have a web page with many individual links, and you think that most people will download one or two files, http will probably suffice. If you expect people to want multiple files, or that they will want to be able to select groups of files with wildcards (tortuous with pointy-clicky things), then you should have FTP.
It's not that hard to set up both, and that's probably the best solution.
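The wildcard selection described above boils down to a client-side filtering step: an mget-style command fetches the directory listing, then matches each name against the patterns. A sketch of that step (the helper name is mine; a real client would obtain the listing over the wire first):

```python
import fnmatch

def mget_matches(listing, patterns):
    """Given a directory listing (names only) and shell-style wildcard
    patterns like those passed to 'mget', return the files to fetch,
    preserving listing order and avoiding duplicates."""
    selected = []
    for name in listing:
        if any(fnmatch.fnmatch(name, p) for p in patterns):
            if name not in selected:
                selected.append(name)
    return selected
```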
Re:Http/Ftp which is slower? (Score:5, Informative)
No, no, no. Jesus. Everyone always gets this wrong. FTP in any mode uses two TCP connections. Passive or not, there is a channel for data and a separate channel for commands.
The difference is that passive-mode means that the client initiates the data connection. The default FTP behavior is for the client to connect to port 21 on the server, and then the server initiates a data connection to the client.
Non-passive FTP clients are very hard for firewalls to keep track of, especially when NAT is involved. Passive is a little better because both connections are outgoing.
But at the same time, passive mode makes the server firewall's job tougher, because it requires a large range of incoming ports for the data connections.
No matter what the mode, FTP is not very firewall-friendly.
Re:Forget them both.... (Score:5, Informative)
To answer the original question, when given a choice, I always download by http. It usually takes less time to set up the connection, probably because of those ident lookups that most ftpd's still run by default.
Re:how about rsync? (Score:5, Informative)
For those not familiar: rsync can copy or synchronize files or directories of files. It divides the files into blocks and only transfers the parts of a file that are different or missing. It's awesome for mirrored backups, among other things. There is even a Mac OS X version that transfers the Mac-specific metadata of each file.
Just today I had to transfer a ~400MB file to a machine over a fairly slow connection. The only way in was SSH and the only way out was HTTP.
First I tried HTTP and the connection dropped. No problem, I thought, I'll just use "wget -c" and it will continue fine. Well, it continued, but the archive was corrupt.
I remembered that rsync can run over SSH, so I rsync'd the file over the damaged one. It took a few moments for it to find the blocks with the errors, and it downloaded just those blocks.
Rsync should be built into every program that downloads large files, including web browsers. Apple or someone should pick up this technology, give it some good marketing ("auto-repair download" or something) and life will be good.
Rsync also has a daemon mode that allows you to run a dedicated rsync server. This is good for public distribution of files.
Rsync is the way to go! I guess this really doesn't 100% answer the poster's question, but people really should be thinking about rsync more.
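As a rough illustration of the block-comparison idea behind the anecdote above, here is a deliberately naive sketch. Real rsync uses rolling checksums so it also matches blocks at shifted offsets; this fixed-offset version (names mine) only detects in-place differences:

```python
import hashlib

def changed_blocks(old: bytes, new: bytes, block_size: int = 4):
    """Compare two byte strings block by block (a naive, fixed-offset
    version of rsync's idea) and return the indices of blocks that
    would need to be transferred to turn 'old' into 'new'."""
    def digest(buf, i):
        return hashlib.md5(buf[i * block_size:(i + 1) * block_size]).digest()
    n_old = (len(old) + block_size - 1) // block_size
    n_new = (len(new) + block_size - 1) // block_size
    return [i for i in range(n_new)
            if i >= n_old or digest(old, i) != digest(new, i)]
```

Repairing a corrupt download then means fetching only the blocks whose digests differ, instead of the whole file.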
My favorite ftp client (Score:3, Informative)
Re:hmm (Score:2, Informative)
With respect to the original question, I would set up a box offering both HTTP and FTP access.
This is what I do. I have a link from my HTTP directory into the anonymous FTP directory, so users can download files either way. It takes up the same bandwidth and the same server hard drive space, and offers both options. Some firewalls block FTP, so I have to offer HTTP. Some people like the resume capability of FTP, so I have to offer FTP. Until everyone else decides that one or the other is dead, I will offer both.
Re:Both... (Score:3, Informative)
Re:Both... (Score:3, Informative)
Adding abort mechanisms to a text-based protocol with data multiplexed in (like HTTP) is difficult. That's why HTTP/1.1 closes the TCP connection on a user-initiated abort.
Since FTP has session-based authentication (which often might use mechanisms like SKEY that can't just be replayed) and context (things like the current directory, transfer mode, and other settings), it is difficult/impossible for a client to reconnect and preserve the exact same state. So instead the data connections are separate and can be closed at will.
The actual implementation, though, involving the client sending its own address and having the server connect back-- that's braindead. The FTP bounce attack was "fun" (upload a file containing stuff you want sent to a victim to a FTP server. Then say the victim's IP and chosen port to the FTP server as "your" address, and ask the FTP server to send the file back to you. Poof, the FTP server has attacked your victim for you).
Re:Forget them both.... (Score:4, Informative)
If the files you are serving are large then use ftp. If the files are smaller (less than 10MB) use http.
http is great, I sometimes throw up a file on there if I need to give it to someone and it is too big to e-mail. (Happened recently with a batch of photos from the car show)
Since I already have a web page it was easy to just throw the file in the http directory and provide the link in an e-mail.
I like http for the most part. I doubt anyone will call you lame for using it, unless the files are huge.
-Chris
WebDAV + HTTP (+ SSL) (Score:5, Informative)
Today I set up an Apache + mod_ssl + mod_dav server for "drag and drop" shared file folders that can be used by any Windows or Linux client over a single well-known socket port (https = 443/tcp). It took me two hours, without knowing a thing about WebDAV or SSL, to get both working together.
Windows calls it a "Web Folder" while the protocol is usually called DAV or WebDAV. It extends the HTTP GET/POST protocol itself with file management verbs like COPY, LINK, DELETE and others.
The key benefits are: almost zero time spent training users, and flexibility built on proven protocols.
WebDAV doesn't do the authentication or encryption, but these can be layered in with .htaccess and/or LDAP and/or SSL certified server-encryption.
There are a few howto's out there. Google.
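For reference, a minimal sketch of the kind of Apache config the parent describes (the directive names come from mod_dav and mod_ssl; the paths, realm name, and location are illustrative, not a complete working setup):

```apache
# Minimal mod_dav "Web Folder" share over SSL -- a sketch only.
# The lock database path and user file are example locations.
DavLockDB /var/lock/apache/DavLock

<Location /shared>
    Dav On
    SSLRequireSSL
    AuthType Basic
    AuthName "Shared Folder"
    AuthUserFile /etc/apache/dav.passwd
    Require valid-user
</Location>
```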
HTTP, hands down (Score:5, Informative)
I also assume the following:
I would say that you should go with HTTP for sure. Of course, you can provide both, but there are some key reasons for using HTTP.
Easier Configuration Perhaps I'm just not that swift, but I've found that web servers (including Apache) are easier to configure. This is especially true if you have any previous web server experience. Of course, the FTP server is more complex due to its additional features that HTTP doesn't have, but assuming that (c) is true, then you won't need to mess with group access control rights and file uploads.
Speed This whole "FTP is faster" stuff is not true. HTTP does not have a lot more overhead than FTP; it may even have less overhead than FTP in certain cases. Even when it does have more overhead, it is on the order of 100-200 bytes, which is too small to care about. HTTP always uses binary transfers and just spits out the whole file on the same connection as the request. FTP needs to build a data connection for every single data transfer, which can slow things down and even occasionally introduce problems.
Easier for Users Given assumption (d), your users will be much more familiar with HTTP URLs than FTP addresses. You could just use FTP URLs and let their web browsers download the files, but then you lose the benefit of resuming partial downloads.
Simple Access Controls Though some people need to have complex user access rules, you may very well just need simple access controls. HTTP provides this (look at Apache's .htaccess file), and you can even integrate Apache's authentication routines into PAM, if you are really hard core.
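A sketch of the simple access control mentioned above, as an .htaccess fragment (the file paths and realm name are illustrative):

```apache
# Example .htaccess protecting a download directory with
# Basic authentication.
AuthType Basic
AuthName "Downloads"
AuthUserFile /etc/apache/htpasswd
Require valid-user
```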
There are a few main areas where FTP currently holds sway:
Partial Downloads Web browsers typically don't support partial downloads, but the fact of the matter is that the HTTP protocol does support it (see the Range header.) The next generation of web browsers may very well include this feature.
User Controls Addressed above.
File Uploads Again, HTTP does support this feature but most browsers don't support it well. Look to WebDAV in the future to provide better support.
In summary, just use HTTP unless you need complex access rules, resumption of partial download, or file uploading. It will be easier both on you and your users.
Re:Both... (Score:3, Informative)
> was designed like that? It seems like it's
> unnecessary overhead when there's already a
> connection open.
There are two connections involved.
The command connection, from you to server port 21, to send commands via.
Then a data connection: in PORT mode the server connects to you; in passive mode you connect to the server. This is what the files (including directory listings) are sent over.
It was designed that way so you can connect your client to servers A and B at the same time (command connections), then send the IP address of B to server A (A thinks this is you) and tell B to receive the file in passive mode, so the data connection goes right from A to B without ever passing through your connection or pipe at all.
Handy when the two servers are on fast connections and you are on a slow connection, but want to transfer from one point to another.
Some people call this action FXP, and likely you will need an FXP client to do it.
Re:hmm (Score:1, Informative)
Re:hmm (Score:5, Informative)
This is incorrect. Practically every download manager out there allows resuming HTTP downloads. There are only a few (very rare) servers that don't allow this, I guess due to them running HTTP 1.0
Almost all windows download managers allow it, and for linux, check out 'WebDownloader for X' which has some good speed limiting features as well.
You will not be forced to redo the entire download (Score:4, Informative)
There are many reasons to support HTTP over FTP for small files.
HTTP is a much faster mechanism for serving small files of a few MB's (as HTTP doesn't check the integrity of what you've just downloaded and relies purely on TCP's ability to check that all your packets arrived and were arranged correctly).
Not only is HTTP faster both in initiating a download and while the download is in progress, it typically has less overhead on your server than is caused by serving the same file using an FTP package.
If you are serving large files (multiple tens of MB's) it would be advisable to also have an FTP server, though many users still prefer HTTP for files over 100 MB, and use FTP only if the site they are connecting to is unreliable.
The speed of today's connections (56k, or DSL, or faster) means that the FTP protocol is not redundant, but it's less of a requirement than it used to be - as the consensus of what we consider to be a large file size has changed greatly.
There was a time when anything over 500K was considered 'large' and the troublesome and unreliable nature of connections meant that software that was over that size would almost certainly need to be downloaded via FTP to ensure against corruption.
Additionally, many web servers (Apache included) and web browsers (Netscape/Mozilla included) support HTTP resume, which works just like FTP resume.
Unless you are serving large files (e.g. over 20 to 30 MB's) or you have a dreadful network connection (or your users will - for example if they will be roaming users connecting via GPRS) then HTTP is sufficient and FTP will only add to your overhead, support and administration time.
One last note: I'd also add that many users in corporate environments are not able to download via FTP due to poorly administered corporate firewalls. This occurs frequently even in large corporations due to incompetent IT and/or 'security' staff. This should not put you off using FTP, but it is one reason to support HTTP.
Re:HTTP is fine (Score:5, Informative)
No, the HTTP protocol does not even specify the concept of a directory listing. Some servers can generate an HTML file from the directory listing, but that is all up to the server; it can generate that file as it likes, or even just serve an error.
Re:gopher (Score:3, Informative)
Re:I wouldn't worry about it... (Score:2, Informative)
It's built into OSX too (Score:5, Informative)
My experiences with FTP and HTTP downloads (Score:5, Informative)
Our FTP servers run both HTTP and FTP providing the same content in the same directory structure. There are five servers that transfer an average of 1-2 TB (terabyte) per month each, so they are fairly busy. On a busy month each server can go as high as 7 TB of data transferred. File sizes range from 1 KB to whole CD-ROM and DVD-ROM images. I think the single largest file is 3 GB.
The logs show a trend of HTTP becoming more popular for the last several years, with no sign of stopping. Currently 70% of all downloads from the "FTP" servers are via HTTP, while the remaining 30% are via FTP. Six years ago (I lost the logs from before this time; they are on a backup tape, but I am way too lazy to get that data), it was completely reversed: 75% of downloads were via FTP and 25% were via HTTP. 90% of all transfers are done with a web browser as opposed to an FTP client or wget or something.
One thing we learned was that many system administrators will download via FTP from the command line directly from the FTP server, especially during a crisis they are trying to resolve. They do this from the system itself and not a workstation. The reasons for this are a bit of a mystery. Feedback has shown that we should never get rid of this or we might be assassinated by our customers. We thought about it once and put out feelers.
I would say if you don't need to deal with incoming files and your file size is not too large then stick with HTTP. Anything over about 10 MB should go to the FTP server. An FTP server can be more complicated. It seems like the vulnerabilities in FTP daemons have died down in the past year or so. Also, fronting an FTP server with a Layer 4 switch was a lot more tricky because of all the ports involved. If you want people to mirror you then go with FTP, or rsync for private mirroring. In reading the feedback, most power users seem to prefer FTP, perhaps because that is what they are used to. Also, depending on the amount of traffic you might need to consider gigabit ethernet.
The core dumps being uploaded are getting to be huge. Some of those systems have a lot of memory!
Re:My opinion (Score:3, Informative)
Re:I wouldn't worry about it... (Score:3, Informative)
This is probably the first thing I get when I'm doing a new Windows installation. For larger files, it's a must. You also don't have to deal with browsers downloading into their cache directory and then *copying* the file to the directory you really wanted. (Who the hell thought of doing it that way?)
HTTP simultaneous connections are expensive. (Score:3, Informative)
The reason is simple: congestion! (Score:5, Informative)
If there are 500 TCP downloads occurring, each download will theoretically get 1/500th of the bandwidth.
Therefore, by opening multiple TCP connections, you will increase the amount of bandwidth for your transfer, at a cost to everyone else using the connection. This is because you've effectively multiplied the size of your receive window (one per connection), causing the host you are downloading from to stuff that many more packets down the pipe.
The problem is, when everyone does it, it completely negates any advantage to using this method. It also leads to packet loss, since you have that many more TCP connections (each with its own receive window) fighting for pieces of the pie.
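The arithmetic behind this is straightforward. A tiny sketch (assuming the idealized case where TCP splits the bottleneck link evenly per connection; the function name is mine):

```python
def share_of_link(my_connections: int, other_connections: int) -> float:
    """Rough fair-share estimate: TCP congestion control tends to divide
    a bottleneck link roughly evenly per *connection*, not per user, so
    opening k parallel connections claims about k/(k + others) of it."""
    total = my_connections + other_connections
    return my_connections / total

# With 499 other downloads in progress, one connection gets about
# 1/500th of the link; opening several roughly multiplies that share,
# but only until everyone else starts doing the same thing.
```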
Dan Bernstein's publicfile is the answer... (Score:2, Informative)
If you want to serve files to the public, this is the most secure way to do so. If you need to provide the files to only certain logins, use something else. If not, you can run this on very lightweight hardware and if it's the only server running, you won't get hacked. Period.
Re:It's built into OSX too (Score:2, Informative)
Yeah, if only it worked. It has very shaky DAV over SSL support and no support for any sort of HTTP_AUTH. Also, don't bother trying on non-80 ports.
Stoopid Apple.
Try scp. It's part of SSH. (Score:3, Informative)
Re:hmm (Score:3, Informative)
Especially since http is faster to connect to than ftp.
I disagree. Sure, it's easy to browse via http and get one or two files, but when you're trying to suck down the entire directory, http blows (excuse the pun).
What's faster for getting a whole directory than pointing wget at an FTP URL and letting it grab everything?
That doesn't work with HTTP, because the directory listing doesn't work with wget, at least in the version I have.
Re:Forget them both.... (Score:5, Informative)
1. Scripts which need to get a list of files before choosing which ones to download - automated installers and the like - are easier to implement with FTP.
2. FTP generally seems to chew up less CPU on the host. I can serve 12 Mb/s of traffic all day long on a P-II 450 box with only 256 MB of memory.
3. "download recovery" (after losing connection, etc.) seems to work better in FTP than HTTP.
FTP The Easy Way (Score:3, Informative)
Re: sftp (Score:2, Informative)
Re:Different, not better or wose (Score:3, Informative)
Re:hmm (Score:2, Informative)
FTP rulez...but needs help; HTTP too. (Score:2, Informative)
Now when we talk about Java, there's another possibility. Some sites (cr@ck) use a Java downloader. It doesn't mean that the Java applet that downloads the file uses HTTP or FTP; it can be some sort of proprietary protocol (or you can combine the best of both worlds).
One way is to have the applet on a SSL'ed (https) page and it does some decrypting as it downloads a pre-encrypted file from your FTP. Or the person can just download the encrypted file directly and use the applet on the secured page to decrypt it. There's ALWAYS a way to have your cake and eat it all by yourself, too.
Re:Different, not better or wose (Score:5, Informative)
Lynx lets you browse, but you can't do globbing, so you see lots of irrelevant crap, and you have to select files to download one at a time.
For getting (possibly multiple) files whose location you don't know in advance, FTP is more flexible and efficient.
Re:No, (Score:3, Informative)
Re:hmm (Score:3, Informative)
Apply these three questions... (Score:5, Informative)
Re:I wouldn't worry about it... (Score:3, Informative)
Consider WeebleFM (Score:2, Informative)
It's a PHP front end to FTP. My FTP ports are only open to the loopback interface. Users get the usability of a clean web interface, and I get to have encrypted, password-controlled FTP on a box that only has port 80 open to the internet.
WeebleFM uses mcrypt to encrypt traffic (and I'm pretty sure I could get it to work over https).
Using standard unix permissions, a careful directory scheme, and vsftpd's chroot capabilities, I can have an internet filesharing arrangement with blind drop boxes, a group-accessible directory, and any number of world-readable directories.
Re:hmm (Score:1, Informative)
To be fair, it also freezes with WebDAV (HTTP). Not that we need to be fair to FTP.
As someone mentioned, it seems to be yet another technical problem arising from shell integration.
FTP is slower due to TCP Window Size (Score:5, Informative)
Dramatically simplified, it means that the connection can send a lot more packets without hearing back from the far end, enabling the connection to reach higher speeds (imagine a phone call where you had to say 'okay' after every word the other person said. Now imagine only having to say it after every sentence. Much faster.)
The tiny window size of (most crappy legacy implementations of) FTP starts to affect download speed at just 25ms latency, and has a huge effect over 50ms.
A properly tuned system with HTTP can make a single high-latency transfer hundreds or even thousands of times faster than FTP.
Relevant links:
http://www.psc.edu/networking/perf_tune.h
http://www.nlanr.net/NLANRPackets/v1.3/window
http://dast.nlanr.net/Projects/Autobuf
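The ceiling the parent describes is easy to compute: a TCP sender can have at most one receive window of data in flight per round trip, so sustained throughput tops out at window / RTT no matter how fast the link is. A quick sketch with illustrative figures (an 8 KB window standing in for a crusty legacy stack, 1 MB for a tuned host):

```python
# Bandwidth-delay arithmetic: throughput is capped at window_size / RTT,
# because the sender must stop and wait for ACKs once a full window is
# in flight. Window and latency figures below are illustrative.

def max_throughput_bytes_per_sec(window_bytes, rtt_seconds):
    return window_bytes / rtt_seconds

legacy = 8 * 1024       # 8 KB window, typical of old stacks
tuned = 1024 * 1024     # 1 MB window on a well-tuned host

# At 50 ms RTT the small window caps out around 160 KB/s...
cap_legacy = max_throughput_bytes_per_sec(legacy, 0.050)

# ...while the tuned window allows roughly 20 MB/s on the same path.
cap_tuned = max_throughput_bytes_per_sec(tuned, 0.050)
```

Same path, same latency, a factor of 128 difference purely from window size, which is the parent's point about high-latency transfers.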
Re:Different, not better or wose (Score:3, Informative)
If you are downloading a file off of a remote server, then there are two possibilities:
1) You know the exact address of the file you are looking for. In this case, ftp provides no advantage over using lynx or wget, since either way you could have been given the direct URL, whether an http URL or an ftp URL. Basically, my point here is that an ftp URL is no more or less useful or easy to remember than an http URL.
2) You don't know the address of the file you are looking for. Therefore you are pretty much required to browse via http to find the site (or page) you want to download from. Since you are already forced to browse for the site, you might as well use the browser to download. For most people who use graphical browsers, this is great. For those of us (myself included) who use shell browsers (i.e. lynx and links), this poses little problem as well (unless javascript is required to download a file... I friggen hate javascript... people who use javascript in their websites and have a choice should be fired [note: I use javascript in my work's website... but they make me... I don't have a choice]).
bittorrent! (Score:1, Informative)
http://www.bitconjurer.org/BitTorrent/index.htm
It makes it so a few people start downloading, and they in turn upload what they have to others, and it just kind of "spiderwebs" out, reducing the strain on the original host.
I wish huge projects (distros, mozilla, etc) started using this. It would make everything SO fast.
I'm not the guy that coded it, just a happy user.
Re:hmm (Score:2, Informative)
Heh? HTTP does offer restarts. There is a "Range" header, introduced in HTTP 1.1, which allows you to download any byte range of a file (even if you only need some limited range of it).
Just because IE or Netscape usually don't care to support it doesn't mean it's not foolishly simple to set up yourself. Most self-respecting developers can write a getter script in a few minutes that would allow for download restarts via HTTP.
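The byte-range mechanics are small enough to show. Here's a minimal sketch of the HTTP/1.1 "bytes=start-end" semantics the parent describes; the `serve_range` helper is hypothetical, not any real framework's API:

```python
# Minimal sketch of HTTP/1.1 byte-range handling, the mechanism that
# makes resumable downloads possible. A real server would also emit a
# Content-Range header; this just shows the slicing logic.

def serve_range(body, range_header):
    """Return (status, payload) for a 'bytes=start-end' Range header."""
    spec = range_header.split("=", 1)[1]          # drop the "bytes=" prefix
    start_s, _, end_s = spec.partition("-")
    start = int(start_s)
    end = int(end_s) if end_s else len(body) - 1  # open-ended: rest of file
    if start >= len(body):
        return 416, b""                           # Requested Range Not Satisfiable
    return 206, body[start:end + 1]               # Partial Content

file_data = b"0123456789"

# A client that already has the first 4 bytes resumes from byte 4:
status, rest = serve_range(file_data, "bytes=4-")
```

A resuming client just sends `Range: bytes=<bytes-already-downloaded>-` and appends the 206 payload to its partial file.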
Re:Different, not better or wose (Score:2, Informative)
FTP is just as doable over SSL (Score:3, Informative)
If you're talking about the human engineering aspect of this discussion only, then I have no disagreement with you. However, FTP is just as technically feasible over SSL, since SSL works at a lower level on the network stack than FTP.
Furthermore, there are good FTP clients that have SSL support. For example, CuteFTP supports FTP over SSL (and has a very user-friendly interface, for the clueless end user).
There are a good number of servers supporting FTP over SSL. Serv-U and Sambar are some of the Windows servers. Just do a Google search to see what else there is.
Re:hmm (Score:4, Informative)
Stop the firewall madness... (Score:2, Informative)
Seems to be a lot of comments about firewalls and FTP from people who obviously don't work with them. Remember there are three basic types of firewall technology: packet filters, proxies, and stateful inspection.
Packet filtering alone is always a problem because you have to open up all of the high ports.
Proxy firewalls handle FTP (active or passive) with no trouble, as long as the FTP proxy feature has been enabled. Remember that proxies "watch" the conversation, so the firewall will manage the connection when data comes back to the client on port 20, and will recognize the 'pasv' command in the command channel.
Stateful inspection firewalls include proxying code for the major protocols, i.e. FTP, HTTP, Telnet, etc. So you are covered here as well.
If you are having problems using FTP through a firewall, then you probably:
-Are being blocked intentionally
-Have a lazy security admin who hasn't updated the firewall in five years
-Have a stupid router jockey "securing" the network with router ACLs (packet filters)
As long as you are using an up-to-date major firewall release like Check Point, PIX, NetScreen, or IPTables, there will not be an issue getting FTP to work.
Beware file name mangling (Score:3, Informative)
Microsoft's Web Folders work fine as long as you don't rename or create files with other WebDAV clients. Then you can end up with files named "seal" that show up as "Seal" in Web Folders. No big deal, it's only presentation, right? The problem is that Microsoft sends the mangled version back to the server for future requests, so you can't even get to "/seal", because Microsoft always asks for "/Seal". No problem, you think, I'll type it in manually... but wait... where'd my folder go, and what's with this web page? Doh! Web Folders use the same protocol (http), so Explorer sees a web URL and morphs into Internet Explorer.
So how do we work around the broken MS implementation? Unless you decide to run your website from a FAT32 partition, the filesystem remains case-sensitive, and there's no easy way to make it look otherwise. mod_speling to the rescue! Sure, it's a little overhead, but we can correct Microsoft's blunders on the fly! MS lists the directory, mangles the names, sends bogus requests, and gets magically redirected to the correct file... and then mangles the name again. Curses, foiled again.
Last time I ran into this, I gave up and renamed the files to match Microsoft's expectations. If anyone knows of a real solution, I would love to hear it.
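The case correction mod_speling performs is roughly this: when the exact name misses, look for a unique case-insensitive match and redirect to it. A sketch of the idea (this is an illustration, not Apache's actual algorithm):

```python
# Sketch of mod_speling-style case correction: if the requested name
# isn't on disk but exactly one directory entry matches it
# case-insensitively, serve (redirect to) that entry instead.

def resolve(requested, names):
    """Return the on-disk name for a request, correcting case if unambiguous."""
    if requested in names:
        return requested
    matches = [n for n in names if n.lower() == requested.lower()]
    return matches[0] if len(matches) == 1 else None

listing = ["seal", "Walrus.html", "index.html"]

# Web Folders asks for the mangled "/Seal"; we hand back "seal".
fixed = resolve("Seal", listing)
```

Note the failure mode: if both "seal" and "Seal" exist, the match is ambiguous and no correction is possible, which is one reason renaming to match Microsoft's expectations ends up being the pragmatic fix.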
Security Holes? (Score:3, Informative)
Really. The security holes in sendmail can be fixed by installing qmail. The security holes in BIND can be fixed by installing djbdns. The security holes in WuFTP (and most others) can be fixed by installing publicfile. There are also other good programs out there as well.
steve
Here's how they work (Score:5, Informative)
Forget people's opinions and observations about which is better; here's what they both do, you decide what you like. If you still want opinions, I give mine at the bottom.
HTTP
The average HTTP connection works like this:
FTP
FTP connections are a little less structured. The client connects, and the server sends a banner identifying itself. The client sends a username and password, and the server decides whether to allow access or not. Also, many FTP servers will try to get an IDENT from the client. Of course, firewalls will often silently drop packets for that port, and the FTP server will wait for the operation to time out (a minute or two) before continuing. Very, very annoying, because by then the client has given up too.
Next, the client sends a command. There are a lot of commands to choose from, and not all servers support all commands. Here are some highlights:
And here's my favorite part. Only requests/responses go over this connection. Any data at all (even dir listings) has to go over a separate TCP connection on a different port. No exceptions. Most people don't understand this point, but even PASV-mode connections must use a separate TCP connection for the data stream. Either the client specifies a port for the server to connect to with the PORT command, or the client issues a PASV command, to which the server replies with a port number for the client to use in connecting to the server.
The client does have the option to resume downloads or retrieve multiple files with one command. Yay.
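To make the PASV exchange concrete: the server's 227 reply encodes the data-connection address as six decimal bytes, and the client derives the port as p1 * 256 + p2 before opening the second TCP connection. The reply text below is a typical example, not from any particular server:

```python
# Parsing an FTP '227 Entering Passive Mode' reply (RFC 959). The six
# numbers are the four octets of the server's IP address followed by
# the high and low bytes of the data port.
import re

def parse_pasv(reply):
    """Extract (host, port) from a 227 Entering Passive Mode reply."""
    nums = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    h1, h2, h3, h4, p1, p2 = (int(n) for n in nums.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2

host, port = parse_pasv("227 Entering Passive Mode (192,168,1,10,19,137)")
# The client now opens a second TCP connection to host:port for the
# data stream -- the control connection carries only commands/replies.
```

This second connection per transfer is exactly the overhead (and firewall headache) the parent post is describing.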
Some Observations
My Opinion
I honestly think FTP was a bad idea from the beginning. The protocol's distinguishing characteristic is the fact that data has to go over a separate TCP stream. That would be a great idea if you could keep sending commands while the file transfers in the background... but instead, the server waits for the file to finish before accepting any commands. Pointless.
FTP is not better for large files, nor is it better for multiple files. It doesn't go through firewalls, and quality clients are few. HTTP is equally reliable, but also universally supported. There are also a number of high quality clients available.
In fact, the only thing FTP is better for is managing server directories and file uploads. But for that, you really should be using something more secure, like sftp (ssh-based).
Bottom line, ditch FTP. Use HTTP for public downloads and sftp for file management.
Resource usage (Score:3, Informative)
Enter apache. On the same hardware which keeled over at around 30-50 ftp sessions, I could handle over 400 concurrent http sessions, with plenty of RAM left over for the vital caching.
Don't forget DiffServ and QoS (Score:2, Informative)
These algorithms are based on the assumption that HTTP traffic consists of fairly short bursts, not the long sustained transfers that FTP traffic typically consists of. Based on these assumptions, these routers give lower priority to FTP traffic than they do to HTTP.
This does not mean that you should serve large files off HTTP since it'll be "faster". Au contraire, it means that you should be fair to others and serve them over FTP, so that the routers can do the correct packet shaping even if it means a slight speed hit to you.
Think of people downloading huge files off your web server and screwing up your warcraft (/quake/whatever) game.
Re:OR, How about... (Score:1, Informative)
BitTorrent is perfect for this. http://bitconjurer.org/BitTorrent/
>and if you want me to use my bandwith to upload your file to other people, sorry, forget about it.
>I agree. My upstream is only 40KBytes, I don't want to share it.
Because you are uploading pr0n all the time, you can't spare any upstream for, well, say, Linux distribution distribution. If they used BitTorrent, there would be a much better chance of getting that file than downloading it at only 1 KB/s.
>also, those clients are a security hazard.
>I definitely agree. Downloading from a "trusted" website gives me at least some peace of mind that I'm not downloading a virus. Granted it's not guaranteed, but it's far less likely to get infected from a website than it is from Joe Script Kiddie.
Well, the files are MD5-summed and verified on the fly, so there is very little possibility of the files being tampered with.
Vote for BitTorrent! =)
-V
Re:HTTP is fine (Score:3, Informative)
No, the HTTP protocol does not even specify the concept of a directory listing. Some servers can generate an HTML file from the directory listing, but that is all up to the server; it can generate that file as it likes, or even just serve an error.
Exactly right, and the point is that there is no explicit standard (there may be a few de facto standards) to say what an HTML directory listing looks like, so coding the equivalent of an FTP client's "mget" command becomes a new job for every site.
My advice is, if you think your users would like mget or its equivalent, then either give them FTP or think hard about how you could provide the same functionality using HTTP/HTML.
If they don't need mget, HTTP might be fine.
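What the "new job for every site" looks like in practice: an HTTP mget has to scrape whatever HTML the server happens to emit. A minimal link extractor over a made-up auto-index page (the layout varies per server, which is the whole problem):

```python
# Scraping download links out of an HTML index page -- the closest HTTP
# gets to FTP's LIST + mget. The sample page below is invented; every
# server formats its auto-index differently.
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

page = '<html><body><a href="a.iso">a.iso</a> <a href="b.iso">b.iso</a></body></html>'
collector = LinkCollector()
collector.feed(page)
# collector.links now holds the candidate download targets, which still
# need filtering (parent-directory links, sort links, etc.).
```

With FTP, by contrast, NLST gives you the file names directly and identically on every server.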
I'll stick up for poor old FTP (Score:5, Informative)
First, I think FTP was a *good* idea, when you consider that its initial design was in 1971, predating even SMTP. Also since FTP was created when mainframes were king, it has features that seem like overkill today.
"Both protocols depend on TCP to provide reliability. Reliability is NOT a distinguishing characteristic." Oh, but read the RFC, young jedi :) There's a lot more to FTP than you might notice at first glance. The problem is that many clients and servers only partially implement the protocol as specified in the RFC. In particular, nowadays the stream transfer mode is used almost exclusively, which is the least reliable mode, and forces opening a new data connection for each transfer.
If you dig into RFC 959 more, you'll see some weird things you can do with FTP. For example, from your client, you can initiate control connections to two separate servers, and open a data connection between the two servers.
There's a lot of power and flexibility built into FTP, and that's why it has stuck around for 30 years. That's really phenomenal when you think about anything software related. Even though most current firewall vendors support active-mode pretty well, passive mode was there all along, showing that someone thought of this potential issue in advance. The main weakness of FTP is that it sends passwords over the wire in plaintext, but for an anonymous FTP server this isn't an issue.
This is a good resource if you want to read up on the history and development of FTP:
http://www.wu-ftpd.org/rfc/
Best regards,
SEAL
Re:Here's how they work (Score:3, Informative)
You are missing some other points from ftp
It gives you a shell of sorts, so there are customizable accounts and even commands. That makes it a lot easier to work with the files you want to manage (i.e., transfer).
FTP is jailed.
The point that its not well implemented on either the client or server side, or that the implementation has security holes is another matter.
FTP is pretty much alive, and I don't know where else I'd download my ISOs from, or even huge piles of RPMs.
Re:HTTP is fine (Score:3, Informative)
It can be done, but it can't be done
And, more to the point, although there are tools to let you "get everything linked off this chunk of HTML", they're not ubiquitous the way mget is.
Implement them both (Score:2, Informative)
Some will prefer ftp because it's faster. Others, especially those behind overly-restrictive firewalls, will find that http is a more usable alternative.
Re:how about rsync? (Score:3, Informative)
Thank you for pointing out that CRCs are designed to look for errors. Since in this application the checksum is used to uniquely identify a block, not to check for errors, you've quite succinctly explained the reason CRCs won't work.
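For reference, what rsync actually uses for block identification is a weak rolling checksum (an Adler-32 variant) to find candidate matches cheaply at every byte offset, confirmed by a strong hash. The key property is that the sum can be "rolled" one byte at a time. A simplified sketch, not rsync's exact formula:

```python
# Simplified rsync-style weak rolling checksum. The pair (a, b) over a
# window can be updated in O(1) when the window slides one byte, which
# is what makes scanning every offset affordable.
M = 1 << 16

def weak_sum(block):
    """Compute (a, b) over a block of bytes from scratch."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b

def roll(a, b, out_byte, in_byte, blocklen):
    """Slide the window one byte: drop out_byte, append in_byte."""
    a = (a - out_byte + in_byte) % M
    b = (b - blocklen * out_byte + a) % M
    return a, b

data = b"the quick brown fox"
n = 4

a, b = weak_sum(data[0:n])             # checksum of window data[0:4]
a, b = roll(a, b, data[0], data[n], n) # slide to window data[1:5]
# (a, b) now equals weak_sum(data[1:5]) without rescanning the window.
```

A CRC has no such cheap sliding update, which is the practical reason it doesn't fit here, on top of the uniqueness point above.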
Re:FTP or... (Score:3, Informative)
FileZilla
http://sourceforge.net/projects/file