Good POP3 Server for Huge Mailboxes?
brainchill asks: "I've got about 10,000 users split between a couple of quad 550 Xeon machines. The machines have 2GB of RAM. The problem is that the UW POP3 server takes a huge hit in both CPU and memory utilization when a 40+MB mail spool is requested via POP3. Sometimes it's bad enough to drag the monster boxes to their knees. What other POP3 daemons do you guys have experience with, and how do they perform with large mailboxes?"
qmail (Score:2, Informative)
Yes, qmail (Score:5, Informative)
Check out cr.yp.to/qmail.html [cr.yp.to] and www.qmail.org [qmail.org]
MS (Score:5, Funny)
You know you want to.
Re:MS (Score:1)
Re:Ever think of FTP? (Score:2, Insightful)
Sending large emails via SMTP may not be the best usage of the protocol, but in many cases (read: when one party is running Windows) it is very difficult to use FTP or scp to accomplish the same task, as the tools are simply not available.
Re:Ever think of FTP? (Score:1)
Has anyone ever used a worse FTP client than the one Microsoft bundles? It's even worse than the telnet client.
Re:Ever think of FTP? (Score:1)
The Win9x telnet client, on the other hand, was abysmal.
Re:Ever think of FTP? (Score:2)
All you have to do is put the FTP server's address in Explorer's address bar, and it will work.
The interface is the same as if you are browsing your own files.
Re:Ever think of FTP? (Score:1)
Re:Ever think of FTP? (Score:2)
Re:Ever think of FTP? (Score:1)
As for public FTP servers, there is the issue of finding a public FTP server that allows uploads of files larger than would be considered appropriate for email (from 3MB to over 75MB). Also, once you've uploaded the file and emailed the recipient the URL, _anyone_ can get the file. Not good for anything even slightly private. Yes, email is not private. People can eavesdrop on the mail transfer and read your mail, but the number of people who have access to this situation is much lower. If you don't mind this, and can find a suitable server, a public FTP server is definitely a good option to use. Your mailservers will thank you for it.
Re:Ever think of FTP? (Score:1)
Re:Ever think of FTP? (Score:5, Insightful)
Oh, and the fact that some people still use POP3, and their life is made miserable when they're working with large files over a modem. People should use IMAP.
There are quite legitimate uses for file transfer via email. Most people (i.e. not UNIX geeks) do not want to maintain a file server and keep their system up 24/7. The other person may not be at the computer...this puts it in their "queue of things to deal with".
If you mean "why don't people use ftp to transfer files to a third, intermediary system that acts as a drop box"...well, that's doing exactly what you're doing with SMTP. Why *not* do it with SMTP?
Finally, from a user perspective, mail is much more convenient to use than dedicated file transfer protocols. Most people constantly use a mail program and know how to use it reasonably well. Everyone has an email address (a more useful mapping to users than an IP address that FTP would require), and there are no worries about different companies having different places to drop files. Email lets users sort and date emails, and tag files as being from some user. It makes them accessible from anywhere they can get at their email.
Another thing that mail admins should live with is large mailboxes -- not just a single mail, but people leaving mail on the server, or keeping old mail around on the server. This is one of the *best* things to happen to IT. It's been the holy grail of NC designers for years. Centralize data storage to reduce costs, allow reuse of hardware, and facilitate backup.
Frankly, if anything, mail should be extended to have *better* support for this (like resumable transfers, etc). The FTP model -- where you have machines that are always up 24/7, users that associate well with "computers" rather than "other people", users that are familiar with a larger number of programs, and a network that has no firewall or other restrictions -- simply doesn't fit the reality of what's going on at businesses today. It's fantastic for techies who want to work with their own systems, but less good for your average end users.
Re:Ever think of FTP? (Score:1)
Maybe people should just email links to files/attachments on file servers, and the link will contain the information [login/password] that will allow the browser/email program to properly access the file.
Re:Ever think of FTP? (Score:2)
No 8-bit MIME?
and it generally seems slower.
[shrug] Unless the server is poorly designed and is the bottleneck, that shouldn't be the case. A TCP connection is a TCP connection is a TCP connection.
Re:Ever think of FTP? (Score:1)
Spoken like a true user (Score:1)
Re:Spoken like a true user (Score:3, Interesting)
Re:Ever think of FTP? (Score:1)
If you mean "why don't people use ftp to transfer files to a third, intermediary system that acts as a drop box"...well, that's doing exactly what you're doing with SMTP. Why *not* do it with SMTP?
Because it's wasteful. Encoding binary files (base64) for transfer via SMTP makes them about a third larger.
Re:Ever think of FTP? (Score:2)
Use 8-bit MIME.
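For reference, the encoding overhead argued about in this sub-thread can be measured directly. This is a minimal sketch, assuming base64 is the transfer encoding used for the attachment; the function name `base64_overhead` is made up for illustration, and the payload is random bytes standing in for a binary file.

```python
import base64
import os


def base64_overhead(n_bytes: int) -> float:
    """Return the fractional size increase from base64-encoding n_bytes of data."""
    raw = os.urandom(n_bytes)  # stand-in for a binary attachment
    # encodebytes() emits MIME-style output: a newline after every 76 characters.
    encoded = base64.encodebytes(raw)
    return len(encoded) / len(raw) - 1.0


if __name__ == "__main__":
    print(f"overhead for a 3 MB file: {base64_overhead(3_000_000):.1%}")
```

The 4-characters-per-3-bytes mapping alone costs one third; the line breaks add a little on top of that.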
Stop using mbox and switch to Maildir (Score:5, Insightful)
Re:Stop using mbox and switch to Maildir (Score:2)
Courier's POP and IMAP servers (the IMAP server is used in this benchmark) have performed excellently for me in both qmail and Postfix installations.
Re:Stop using mbox and switch to Maildir (Score:2, Interesting)
Every now and then I would have a customer call who was having problems getting their mail due to a corrupt message in their mailbox. Getting rid of the offending e-mail was a simple 'rm' command in the shell.
Re:Stop using mbox and switch to Maildir (Score:5, Informative)
I work for an ISP where we have ~ 50 000 email users. Maildir's great when you have a few messages, and if one of these messages happens to be big then it doesn't matter. However, if a user has tens of thousands of emails of whatever size in their mailbox (happens far, far more often than you might think) then just getting a list of files in the directory can take an age. In the scenario where a user has masses of small messages (sub 2k) then mbox would probably be faster.
Whilst I'd certainly recommend using Maildir over mbox, it's certainly not going to solve all the problems.
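To make the cost concrete: a POP3 server answering STAT or LIST against a Maildir has to enumerate every file in the directory. A rough sketch of that scan is below; `maildir_stat` is a hypothetical helper, not code from any of the servers mentioned here. It avoids opening the messages, but the directory walk itself is still linear in the number of messages, which is where mailboxes with tens of thousands of entries hurt.

```python
import os


def maildir_stat(maildir: str) -> tuple[int, int]:
    """Return (message_count, total_bytes) for a Maildir, without opening any message."""
    count = 0
    total = 0
    for sub in ("new", "cur"):  # tmp/ holds in-progress deliveries only
        with os.scandir(os.path.join(maildir, sub)) as entries:
            for entry in entries:
                if entry.is_file():
                    count += 1
                    total += entry.stat().st_size
    return count, total
```

Timing this function against a test mailbox on the filesystem in question is a cheap way to see whether directory listing is really the bottleneck.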
Re:Stop using mbox and switch to Maildir (Score:3, Informative)
Probably the best option is to have your local mail delivery program write out both the message and keep a header cache. The pop server simply reads the cache to get the info it needs to present to the user, while still manipulating the message files to give the user his/her messages.
Modifying the local delivery app is trivial if you use, say, qmail-local or procmail to do the work. I could probably whip something up in a couple of hours. However, it does break one of the main principles of maildirs -- no locking. You'll have to lock the header cache file in order to append or delete from it.
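A minimal sketch of the header-cache idea, under stated assumptions: the cache is a tab-separated text file, the function name `append_to_header_cache` and the choice of cached fields (filename, size, From, Subject) are invented for illustration, and locking uses `fcntl.flock`, so this is Unix-only. It is not how qmail-local, procmail, or Cyrus actually do it.

```python
import fcntl
import os
from email.parser import BytesHeaderParser


def append_to_header_cache(cache_path: str, msg_path: str) -> None:
    """Append one summary line for a freshly delivered message to the cache file."""
    with open(msg_path, "rb") as msg_file:
        # Only the header block is parsed; the body is not turned into MIME parts.
        headers = BytesHeaderParser().parse(msg_file)
    fields = [
        os.path.basename(msg_path),
        str(os.path.getsize(msg_path)),
        str(headers.get("From", "")),
        str(headers.get("Subject", "")),
    ]
    # Keep each record on one line so the POP server can read the cache line by line.
    line = "\t".join(
        field.replace("\t", " ").replace("\r", " ").replace("\n", " ")
        for field in fields
    ) + "\n"
    with open(cache_path, "a", encoding="utf-8", errors="replace") as cache:
        fcntl.flock(cache, fcntl.LOCK_EX)  # the lock that plain Maildir avoids
        try:
            cache.write(line)
            cache.flush()
        finally:
            fcntl.flock(cache, fcntl.LOCK_UN)
```

The POP server would take the same lock when rewriting the cache after deletions, which is exactly the trade-off described above.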
Someone else recommended cyrus; that might also turn out to be a useful option. It already does something like this.
Re:Stop using mbox and switch to Maildir (Score:2)
I agree, I've speculated on the potential performance increase something like ReiserFS might bring. At present we use a NetApp filer for all our storage, which is extremely robust (it's never ever failed in the three or so years we've had it), great for features like snapshots, but not blazingly fast. Realistically, the performance at present is 'good enough', and I'd rather have mediocre performance and excellent uptimes than mediocre uptimes and excellent performance.
Re:Stop using mbox and switch to Maildir (Score:5, Informative)
This is a filesystem problem. Use a better one. On FreeBSD, enable dirhash. On Linux, use ReiserFS or ext3 with htree.
Re:Stop using mbox and switch to Maildir (Score:3, Insightful)
Now, wrap it into a (pre-existing?) HOWTO. Get it published on defcon1.org. Whatever it takes.
But don't leave it as a random, 1-sentence Score:3 posting on Slashdot, where it will do little good for future masses encountering the same, doubtless growing, problem.
Thanks.
Re:Stop using mbox and switch to Maildir (Score:2)
But don't leave it as a random, 1-sentence Score:3 posting on Slashdot, where it will do little good for future masses encountering the same, doubtless growing, problem.
You've never heard of Google [google.com]? The information is out there, but you need to be willing to spend the thirty seconds necessary to find it.
Re:Stop using mbox and switch to Maildir (Score:2)
If you can't solve this by getting a better filesystem, then look to the Maildir specs about how you can implement transparent hashing in the Maildirs themselves (based on timestamp/whatever).
Maildir is easily the way to go compared to mbox on a large user system.
-davidu
mbx (Score:5, Informative)
Cyrus? (Score:5, Informative)
Cyrus stores messages in a variation of the maildir format - it maintains a database of the flags, headers, etc. for the messages in a folder to speed up access.
Notable features include shared mail folders (with independent views), quotas, multiple mail partitions (with the ability to move users across partitions on the fly), duplicate email checking, and a server side filtering language (sieve).
Most of this would probably be most useful if you were using IMAP, but it should scale quite well as a POP server.
Re:Cyrus? (Score:5, Informative)
Mostly right, in a very broad non-technical way.
Cyrus's mailstore system is actually quite different from Maildir, in particular because it doesn't need to play games with user processes (the way read/unread messages are handled in Maildir is handled that way so multiple processes can manipulate messages at the same time, for instance).
Also, most of the abilities you list are simply unavailable via POP; Cyrus is massive overkill for a POP server, and would require even more resources (particularly disk: the users that have 40MB spool files now could probably find themselves with 2GB of mail if you let them... and even the non-abusive users would require more storage for IMAP than for POP).
Incidentally, we use qpopper to handle POP - and quite a few users go over 40MB without killing our (not particularly beefy, and not dedicated mail) servers. I suspect the real problem is that the guy is using uw-imap's POP server - the author of which is notoriously unconcerned with the performance (or lack thereof) of spoolfiles being served over POP. Which is perfectly reasonable - he writes an IMAP server, he should be concerned with IMAP performance, and if he writes a better mailbox format (he has) then he should also concern himself with that and not a 20+ year old format.
Actually, if one were so inclined, IMAP makes a better POP than POP3 - just disable the ability to create new folders, and use a better mailbox format (mbx, Maildir, ...).
Yes, but... (Score:2, Informative)
BTW, there is a fine POP3 server that we've used without problems for a year (and we've customers that *never* empty their mboxes, so we've huge 300 MB horrors lying on the primary MX hard disk). It has no frills but works like a charm. It's called Solid-POP3, it's Polish, made by the same people who brought you the PLD Linux distro, and you can download it here [pld.org.pl] (alternatively, just do an `apt-get install solid-pop3d' if you run that good ole' Debian).
A left-field solution (Score:2, Interesting)
Anyone who has ever been to Dartmouth or any other school using a BlitzMail installation will vouch for the strength, ease of use, and plain usefulness of the system.
fast pop3d (Score:3, Insightful)
We just went from cucipop to popa3d on a sendmail box supporting ~10000+ users. The load dropped from ~8 to ~1.5 during peak hours.
Before cucipop we were using qpopper, and the switch from qpopper to cucipop made a similar drop in load. Remember that this was with the "old" versions of qpopper, before the remote-root vulnerabilities were found. Don't know about the performance of today's qpopper.
DBMail (Score:3, Interesting)
Not free but ... (Score:1)
It's easy to configure, very feature rich, performant and features cluster config.
The only problem is it costs you an arm and a leg for the amount of users you have.
I'm running it with only a couple of hundred users using mostly IMAP but I never had any problems with it.
Re:Not free but ... (Score:1)
I use it as well and must say it has excellent features for monitoring/restricting email and web pages. GUI based [some people here may not like it], but it is well laid out. It also has a CLI interface that will allow you to do ALL your maintenance from the command line or scripts you write.
Re:Not free but ... (Score:1)
Your problem is caused partly by hardware? (Score:3, Interesting)
Could your problem be caused partly by hardware? Hevanet.com [hevanet.com], the best ISP in Portland, Oregon, USA, uses a special SCSI system run with a special version of NetBSD supplied by a company in Arizona.
Retrieval of mail stresses the filesystem; Hevanet's system is a combination of OS and hardware meant to take the load.
Mailbox format (Score:5, Informative)
Your mailbox format is all wrong. Storing all messages in a single file is pretty much the worst way to do anything useful. You want to explore some alternative storage format such as mbx or maildir. I personally use maildir on ReiserFS on Linux and have good luck. (The filesystem is VERY important for maildirs. ReiserFS's block tail support and directory indexing give it major disk space and speed advantages for a maildir mailserver application, while running something like maildirs on XFS would instantly kill your server.) I hear mbx is pretty good too, if you're stuck on some sort of standard filesystem, since it uses indexing and fewer files than maildir. The downside is that it's not as immediately parseable as maildir or mbox, i.e. you couldn't write a script to, say, delete extremely high-scoring spam messages from any user who hasn't checked their mail in over 3 months, or do other things ISPs might routinely do to maintain their servers.
Finally, if you plan to scale way up there (60,000+), you need to start looking at better cluster systems than just a couple machines. Specialize the tasks of several machines to do mail storage or talk POP3. Look at something like POPular [remote.org] for specialized POP3 server clustering software.
~GoRK
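The kind of maintenance script mentioned above (purging high-scoring spam from idle accounts) might look roughly like this against a Maildir. Everything specific here is an assumption for illustration: the function name `purge_spam`, the `X-Spam-Score` header (your spam filter may record its score differently), the threshold of 15, and the idea that the caller already knows when the user last checked mail and passes it in as a Unix timestamp.

```python
import mailbox
import time


def purge_spam(maildir_path, last_checked, min_score=15.0, idle_days=90, now=None):
    """Delete high-scoring spam from one Maildir if its owner has been idle long enough.

    Returns the number of messages removed.
    """
    now = time.time() if now is None else now
    if now - last_checked < idle_days * 86400:
        return 0  # user is active; leave the mailbox alone
    box = mailbox.Maildir(maildir_path, create=False)
    removed = 0
    try:
        for key in list(box.keys()):
            message = box[key]
            try:
                # Hypothetical header; adjust to whatever your filter writes.
                score = float(str(message.get("X-Spam-Score", "0")))
            except ValueError:
                continue  # unparseable score: keep the message
            if score >= min_score:
                box.remove(key)
                removed += 1
    finally:
        box.close()
    return removed
```

Because each message is its own file, the deletion is a plain unlink and needs no rewrite of the rest of the mailbox, which is the parseability advantage being described.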
Re:Mailbox format (Score:1, Interesting)
Re:Mailbox format (Score:2)
popa3d (Score:1)
from their DESIGN doc: http://www.openwall.com/popa3d/DESIGN
Here's some real performance data that I've collected (popa3d running
via inetd; larger sites would use the standalone mode instead):
24864 295.50re 16.92cp popa3d*
12749 4578.88re 15.50cp popa3d
That is, 12749 POP3 sessions took 32.42 minutes of CPU time (on a 350
MHz Pentium II); of those, more than a half was spent in the temporary
child processes. It's not that bad though, as this system was running
an (intentionally) expensive crypt(3) that got accounted to the child
Before upgrading to popa3d, the same machine was running qpopper (out
of inetd, too):
12025 3169.38re 35.56cp popper
It used to take a bit more CPU for less POP3 sessions.
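The comparison in the quoted figures is easier to read per session. A quick calculation, taking the `cp` column as CPU minutes (as the quoted text itself does) and the first column as the number of invocations:

```python
def cpu_seconds_per_session(cpu_minutes: float, sessions: int) -> float:
    """Convert total CPU minutes over a number of POP3 sessions to seconds per session."""
    return cpu_minutes * 60.0 / sessions


# popa3d: parent processes plus the temporary child processes (popa3d*).
popa3d = cpu_seconds_per_session(16.92 + 15.50, 12749)
# qpopper on the same machine, before the switch.
qpopper = cpu_seconds_per_session(35.56, 12025)

print(f"popa3d:  {popa3d:.3f} s/session")   # popa3d:  0.153 s/session
print(f"qpopper: {qpopper:.3f} s/session")  # qpopper: 0.177 s/session
```

So on these numbers popa3d used roughly 14% less CPU per session, even with the expensive crypt(3) included.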
Maildir (Score:1)
InterMail! (Score:2)
http://www.openwave.com/products/messaging_suit