Linux Software

Good POP3 Server for Huge Mailboxes?

brainchill asks: "I've got about 10,000 users split between a couple of quad 550 xeon machines. The machines have 2GB of ram. The problem is that the UW POP3 server takes a huge hit in both cpu and memory utilization when a 40+MB mail spool is requested via POP3. Sometimes it's bad enough to drag the monster boxes to their knees. What other POP3 daemons do you guys have experience with and how do they perform with large mailboxes?"

  • qmail (Score:2, Informative)

    Read the various qmail + whatever guides. Also, remember that system tuning can make a difference as well.
  • MS (Score:5, Funny)

    by Anonymous Coward on Saturday November 09, 2002 @03:00AM (#4631183)
    MS Exchange.

    You know you want to.
  • by Electrum ( 94638 ) <david@acz.org> on Saturday November 09, 2002 @03:07AM (#4631208) Homepage
    You won't get good performance with mbox, period. You need to switch to Maildir [qmail.org]. qmail-pop3d [qmail.org] works great with Maildir. Maildir scales far better than mbox since it doesn't have to parse out the individual messages. It also doesn't have to use locking. This also makes Maildir inherently more reliable than mbox. There are many tools available [qmail.org] to convert between mbox and Maildir.
    • Check out this detailed mbox vs. Maildir comparison [courier-mta.org]. Maildir really knocks the socks off of mbox in almost all of the tests.

      Courier's POP and IMAP servers (the IMAP server is used in this benchmark) have performed excellently for me in both qmail and Postfix installations.
      • I've used the Postfix + Courier-IMAP combination with Maildir on a production mail server and it worked very well.

        Every now and then I would have a customer call who was having problems getting their mail due to a corrupt message in their mailbox. Getting rid of the offending e-mail was a simple 'rm' command in the shell.
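
The mbox-to-Maildir move recommended in this sub-thread can be sketched with Python's standard `mailbox` module. This is only an illustration, not one of the conversion tools linked above: the sample addresses and helper names are made up for the demo. It shows the two points the posters make: each message becomes its own file, and removing a bad message is a plain file delete.

```python
import mailbox
import os
import tempfile


def build_sample_mbox(path, count):
    """Create a small mbox file with `count` messages (demo data only)."""
    box = mailbox.mbox(path, create=True)
    try:
        for i in range(count):
            box.add(f"From: sender@example.com\nSubject: message {i}\n\nbody {i}\n")
    finally:
        box.close()  # close() flushes pending writes to disk


def convert_mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir; return the count."""
    source = mailbox.mbox(mbox_path, create=False)
    dest = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for message in source:  # the mbox must be parsed to find message boundaries
            dest.add(message)   # each message lands in its own file under new/
            copied += 1
    finally:
        source.close()
        dest.close()
    return copied


def count_maildir_messages(maildir_path):
    """Count messages by opening a fresh Maildir view of the directory."""
    box = mailbox.Maildir(maildir_path, create=False)
    try:
        return len(box)
    finally:
        box.close()


def remove_one_message(maildir_path):
    """Delete a single message the way an admin would with 'rm': unlink one file."""
    new_dir = os.path.join(maildir_path, "new")
    victim = sorted(os.listdir(new_dir))[0]
    os.remove(os.path.join(new_dir, victim))
    return victim


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    mbox_path = os.path.join(workdir, "spool.mbox")
    maildir_path = os.path.join(workdir, "Maildir")
    build_sample_mbox(mbox_path, 3)
    print("converted:", convert_mbox_to_maildir(mbox_path, maildir_path))
    remove_one_message(maildir_path)
    print("left after rm:", count_maildir_messages(maildir_path))
```

On a live server you would stop delivery (or lock the spool) before converting; the sketch ignores that.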
    • by phaze3000 ( 204500 ) on Saturday November 09, 2002 @04:15AM (#4631340) Homepage
      Actually, maildir *can* be just as bad, if not worse.

      I work for an ISP where we have ~ 50 000 email users. Maildir's great when you have a few messages, and if one of these messages happens to be big then it doesn't matter. However, if a user has tens of thousands of emails of whatever size in their mailbox (happens far, far more often than you might think) then just getting a list of files in the directory can take an age. In the scenario where a user has masses of small messages (sub 2k) then mbox would probably be faster.
      Whilst I'd certainly recommend Maildir over mbox, it's not going to solve all the problems.

      • Another option is using a filesystem that handles large numbers of files in a directory.

        Probably the best option is to have your local mail delivery program write out both the message and keep a header cache. The pop server simply reads the cache to get the info it needs to present to the user, while still manipulating the message files to give the user his/her messages.

        Modifying the local delivery app is trivial if you use, say, qmail-local or procmail to do the work. I could probably whip something up in a couple of hours. However, it does break one of the main principles of maildirs -- no locking. You'll have to lock the header cache file in order to append or delete from it.

        Someone else recommended cyrus; that might also turn out to be a useful option. It already does something like this.
        • Another option is using a filesystem that handles large numbers of files in a directory.

          I agree. I've speculated on the potential performance increase something like ReiserFS might bring. At present we use a NetApp filer for all our storage, which is extremely robust (it's never failed in the three or so years we've had it) and great for features like snapshots, but not blazingly fast. Realistically, the performance at present is 'good enough', and I'd rather have mediocre performance and excellent uptime than mediocre uptime and excellent performance.
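
The header-cache idea from this sub-thread can be sketched as follows. It is a rough illustration under stated assumptions, not how qmail-local, procmail, or Cyrus actually do it: the cache file name, its JSON-lines layout, and the function names are all hypothetical, and it uses `fcntl.flock`, so it is Unix-only. Delivery writes the message and appends one record under an exclusive lock; the POP side answers a LIST-style query from the cache alone, without listing or stat'ing the message files.

```python
import fcntl
import json
import os

CACHE_NAME = "header-cache.jsonl"  # hypothetical file, not part of the Maildir format


def make_maildir(maildir):
    """Create the three standard Maildir subdirectories."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)


def deliver(maildir, unique_name, raw_message):
    """Write the message via tmp/ then new/, and append one record to the cache."""
    tmp_path = os.path.join(maildir, "tmp", unique_name)
    new_path = os.path.join(maildir, "new", unique_name)
    with open(tmp_path, "wb") as f:
        f.write(raw_message)
    os.rename(tmp_path, new_path)  # the message file itself needs no lock

    record = {"file": unique_name, "size": len(raw_message)}
    with open(os.path.join(maildir, CACHE_NAME), "a") as cache:
        fcntl.flock(cache, fcntl.LOCK_EX)  # this is the locking the post warns about
        try:
            cache.write(json.dumps(record) + "\n")
            cache.flush()
        finally:
            fcntl.flock(cache, fcntl.LOCK_UN)


def pop3_list(maildir):
    """Return (message_number, size) pairs using only the cache file."""
    path = os.path.join(maildir, CACHE_NAME)
    if not os.path.exists(path):
        return []
    with open(path) as cache:
        fcntl.flock(cache, fcntl.LOCK_SH)
        try:
            records = [json.loads(line) for line in cache if line.strip()]
        finally:
            fcntl.flock(cache, fcntl.LOCK_UN)
    return [(number, rec["size"]) for number, rec in enumerate(records, start=1)]
```

Deleting a message would also have to rewrite the cache under the same lock; that part is left out to keep the sketch short.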

      • by Electrum ( 94638 ) <david@acz.org> on Saturday November 09, 2002 @05:47AM (#4631512) Homepage
        However, if a user has tens of thousands of emails of whatever size in their mailbox (happens far, far more often than you might think) then just getting a list of files in the directory can take an age.

        This is a filesystem problem. Use a better one. On FreeBSD, enable dirhash. On Linux, use ReiserFS or ext3 with htree.
        • Good information.

          Now, wrap it into a (pre-existing?) HOWTO. Get it published on defcon1.org. Whatever it takes.

          But don't leave it as a random, 1-sentence Score:3 posting on Slashdot, where it will do little good for future masses encountering the same, doubtless growing, problem.

          Thanks.

          • Now, wrap it into a (pre-existing?) HOWTO. Get it published on defcon1.org. Whatever it takes.

            But don't leave it as a random, 1-sentence Score:3 posting on Slashdot, where it will do little good for future masses encountering the same, doubtless growing, problem.


            You've never heard of Google [google.com]? The information is out there, but you need to be willing to spend the thirty seconds necessary to find it.
      • This is a flaw in your filesystem.

        If you can't solve this by getting a better filesystem, then look to the Maildir specs for how you can implement transparent hashing in the Maildirs themselves (based on timestamp/whatever).

        Maildir is easily the way to go compared to mbox on a large user system.

        -davidu
  • mbx (Score:5, Informative)

    by zsmooth ( 12005 ) on Saturday November 09, 2002 @03:17AM (#4631240)
    I use mbx personally (NOT mbox) and it scales wonderfully. The mailbox is fully indexed to speed up searching and can be accessed simultaneously by various processes. See here [washington.edu] for more information.
  • Cyrus? (Score:5, Informative)

    by Pathwalker ( 103 ) <hotgrits@yourpants.net> on Saturday November 09, 2002 @04:23AM (#4631350) Homepage Journal
    Have you looked at Cyrus [cmu.edu]? It is probably best known as an IMAP server, but it has very nice pop3 support as well.

    Cyrus stores messages in a variation of the maildir format: it maintains a database of the flags, headers, etc. for the messages in a folder to speed up access.

    Notable features include shared mail folders (with independent views), quotas, multiple mail partitions (with the ability to move users across partitions on the fly), duplicate email checking, and a server side filtering language (sieve).

    Most of this would probably be most useful if you were using IMAP, but it should scale quite well as a POP server.
    • Re:Cyrus? (Score:5, Informative)

      by Matthew Weigel ( 888 ) on Saturday November 09, 2002 @02:02PM (#4632733) Homepage Journal

      Mostly right, in a very broad non-technical way.

      Cyrus's mailstore system is actually quite different from Maildir, in particular because it doesn't need to play games with user processes (Maildir handles read/unread state the way it does so that multiple processes can manipulate messages at the same time, for instance).

      Also, most of the abilities you list are simply unavailable via POP; Cyrus is massive overkill for a POP server, and would require even more resources (particularly disk: the users that have 40MB spool files now could probably find themselves with 2GB of mail if you let them... and even the non-abusive users would require more storage for IMAP than for POP).

      Incidentally, we use qpopper to handle POP - and quite a few users go over 40MB without killing our (not particularly beefy, and not dedicated mail) servers. I suspect the real problem is that the guy is using uw-imap's POP server - the author of which is notoriously unconcerned with the performance (or lack thereof) of spoolfiles being served over POP. Which is perfectly reasonable - he writes an IMAP server, he should be concerned with IMAP performance, and if he writes a better mailbox format (he has) then he should also concern himself with that and not a 20+ year old format.

      Actually, if one were so inclined, IMAP makes a better POP than POP3 - just disable the ability to create new folders, and use a better mailbox format (mbx, Maildir, ...).

    • Yes, but... (Score:2, Informative)

      by wsapplegate ( 210233 )
      I chose Cyrus for a customer that needed a MySQL backend for his server. But I quickly ran into a problem: the minimum timeout for unlocking the mailbox in the Cyrus POP3 server is 10 minutes (yes, that's right, 10 _minutes_!). As people with buggy mailers (*cough* Outlook Express *cough*) are very common nowadays, I was forced to patch the sources to weed out that stupid limitation. What's sad is that I found lots of messages on their mailing list discussing this problem, going back a long time, and that one-liner patch never made it into the tree, which leads me to think the authors are unconcerned about the needs of their users (I hope I'm mistaken here)...

      BTW, there is a fine POP3 server that we've used without problems for a year (and we've customers that *never* empty their mboxes, so we've huge 300 MB horrors lying on the primary MX hard disk). It has no frills but works like a charm. It's called Solid-POP3, it's Polish, made by the same people who brought you the PLD Linux distro, and you can download it here [pld.org.pl] (alternatively, just do an `apt-get install solid-pop3d' if you run that good ole' Debian :-) Or else, you could try using LARTs on your unruly, mailbox-filling users... Good luck to you, anyway !
  • by grossdog ( 15657 )
    If you want your users to love you, check out Dartmouth's BlitzMail system (http://www.dartmouth.edu/pages/softdev/blitz.html). The server is open source, POP3 support is supposed to be very fast, and many clients using the BlitzMail protocol (which is like IMAP, but a bit easier from the user's POV) are available (2 web-based, Mac, Mac OSX, Windows, Java, curses, tty).

    Anyone who has ever been to Dartmouth or any other school using a BlitzMail installation will vouch for the strength, ease of use, and plain usefulness of the system.
  • fast pop3d (Score:3, Insightful)

    by robaman ( 6440 ) on Saturday November 09, 2002 @05:08AM (#4631456) Homepage
    I would say: try popa3d.

    We just went from cucipop to popa3d on a sendmail box supporting ~10000+ users. The load dropped from ~8 to ~1.5 during peak hours.
    Before cucipop we were using qpopper, and the switch from qpopper to cucipop made a similar drop in load. Remember that this was with the "old" versions of qpopper, before the remote-root vulnerabilities were found. I don't know about the performance of today's qpopper.
  • DBMail (Score:3, Interesting)

    by m0rph3us0 ( 549631 ) on Saturday November 09, 2002 @05:58AM (#4631525)
    http://www.dbmail.org/ I was looking through their site. I haven't actually used it, but it sounds like it might reduce the load on your servers massively, though I'd want to use Postgres and not MySQL for the back end.
  • CommuniGate Pro [stalker.com] is the best email server I've ever used.
    It's easy to configure, very feature-rich, performant, and supports clustering.

    The only problem is that it costs an arm and a leg for the number of users you have.

    I'm running it with only a couple of hundred users using mostly IMAP but I never had any problems with it.
    • It's free if you don't mind the "delivered by CommuniGate Pro" message on each message.

      I use it as well and must say it has excellent features for monitoring/restricting email and web pages. GUI based [some people here may not like it], but it is well laid out. It also has a CLI interface that will allow you to do ALL your maintenance from the command line or scripts you write.
    • Oh yeah, and the server I was running was on a single-processor Pentium II 450 MHz with >8000 users... no apparent problems with scaling on that one. Plus it has clustering capabilities.
  • by Futurepower(R) ( 558542 ) on Saturday November 09, 2002 @12:04PM (#4632189) Homepage

    Could your problem be caused partly by hardware? Hevanet.com [hevanet.com], the best ISP in Portland, Oregon, USA, uses a special SCSI system run with a special version of NetBSD supplied by a company in Arizona.

    Retrieval of mail stresses the filesystem; Hevanet's system is a combination of OS and hardware meant to take the load.
  • Mailbox format (Score:5, Informative)

    by GoRK ( 10018 ) on Saturday November 09, 2002 @12:41PM (#4632375) Homepage Journal
    Good christ, you'd think that by the time you outgrew a QUAD XEON mailserver with only 5000 users, you'd have been reevaluating performance before plunking down what must have been close to 10-15 grand or more at the time on a second one!

    Your mailbox format is all wrong. Storing all messages in a single file is pretty much the worst way to do anything useful. You want to explore an alternative storage format such as mbx or maildir. I personally use maildir on ReiserFS on Linux and have had good luck. (The filesystem is VERY important for maildirs. ReiserFS's block tail support and directory indexing give it major disk space and speed advantages for a maildir mailserver application, while running something like maildirs on XFS would instantly kill your server.) I hear mbx is pretty good too if you're stuck on some sort of standard filesystem, since it uses indexing and fewer files than maildir. The downside is that it's not as immediately parseable as maildir or mbox, i.e. you couldn't write a script to, say, delete extremely high-scoring spam messages from any user who hasn't checked their mail in over 3 months, or do the other things ISPs routinely do to maintain their servers.

    Finally, if you plan to scale way up there (60,000+), you need to start looking at better cluster systems than just a couple machines. Specialize the tasks of several machines to do mail storage or talk POP3. Look at something like POPular [remote.org] for specialized POP3 server clustering software.

    ~GoRK
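
The kind of maintenance script mentioned above (purging high-scoring spam from mailboxes nobody has checked in months) might look roughly like this against a Maildir. It is a sketch under loud assumptions: the `X-Spam-Score` header name is hypothetical (use whatever your filter actually writes), and "hasn't checked mail" is approximated by the modification time of `cur/`, which is only a heuristic.

```python
import mailbox
import os
import time

SPAM_HEADER = "X-Spam-Score"  # hypothetical header name; depends on your spam filter


def days_since_last_check(maildir_path, now=None):
    """Heuristic: age in days of cur/'s mtime, which changes when mail is moved or deleted there."""
    now = time.time() if now is None else now
    cur_dir = os.path.join(maildir_path, "cur")
    return (now - os.path.getmtime(cur_dir)) / 86400.0


def purge_spam(maildir_path, min_score, min_idle_days):
    """Delete messages scoring >= min_score if the mailbox looks idle; return how many."""
    if days_since_last_check(maildir_path) < min_idle_days:
        return 0  # mailbox looks active, leave it alone
    box = mailbox.Maildir(maildir_path, create=False)
    removed = 0
    try:
        for key in list(box.keys()):  # snapshot the keys before deleting anything
            raw_score = box[key].get(SPAM_HEADER)
            try:
                score = float(raw_score)
            except (TypeError, ValueError):
                continue  # header missing or unparseable: keep the message
            if score >= min_score:
                box.remove(key)  # one message is one file, so this is an unlink
                removed += 1
    finally:
        box.close()
    return removed
```

For example, `purge_spam("/path/to/Maildir", 15.0, 90)` would remove messages scored 15 or higher from a mailbox whose `cur/` has not changed in 90 days; the path and thresholds are placeholders.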
    • Re:Mailbox format (Score:1, Interesting)

      by Anonymous Coward
      Because I don't know the answer, please at least briefly explain why XFS is bad with maildir and reiser is better. Thanks.
      • The main reason for me is that unlinks (deletes) on XFS are dog slow. A file deletion happens whenever a message is retrieved from a maildir (most POP3 clients issue a DELE after each message). It adds up when you have a few hundred (or thousand) users connecting concurrently, since it means that people have to stay connected longer. The biggest advantage of Reiser in a maildir setup, though, is tail packing. Messages are routinely a lot smaller than the filesystem's block size, so you can easily save tens of gigs on a large mailserver with ReiserFS.
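
To put a rough number on the space-saving claim above, here is a small estimate of how much disk a one-file-per-message store loses when every file is rounded up to whole blocks. The 4096-byte block size and the function names are assumptions for illustration; a tail-packing filesystem will not recover all of this, so treat the result as an upper bound on the saving.

```python
import os


def slack_bytes(message_sizes, block_size=4096):
    """Bytes lost to partially filled last blocks when each file gets whole blocks."""
    wasted = 0
    for size in message_sizes:
        blocks = -(-size // block_size)  # ceiling division
        wasted += blocks * block_size - size
    return wasted


def maildir_message_sizes(maildir_path):
    """Sizes in bytes of all message files under new/ and cur/ of one Maildir."""
    sizes = []
    for sub in ("new", "cur"):
        folder = os.path.join(maildir_path, sub)
        if os.path.isdir(folder):
            with os.scandir(folder) as entries:
                sizes.extend(e.stat().st_size for e in entries if e.is_file())
    return sizes
```

With three messages of 1500, 2048, and 5000 bytes and 4096-byte blocks, `slack_bytes` reports 7836 wasted bytes, nearly as much as the 8548 bytes of actual mail.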
  • extracted source is 84k. highly secure, and tiny. default pop3 daemon in openbsd.

    from their DESIGN doc: http://www.openwall.com/popa3d/DESIGN

    Here's some real performance data that I've collected (popa3d running
    via inetd; larger sites would use the standalone mode instead):

    24864 295.50re 16.92cp popa3d*
    12749 4578.88re 15.50cp popa3d

    That is, 12749 POP3 sessions took 32.42 minutes of CPU time (on a 350
    MHz Pentium II); of those, more than a half was spent in the temporary
    child processes. It's not that bad though, as this system was running
    an (intentionally) expensive crypt(3) that got accounted to the child /etc/shadow authentication processes.

    Before upgrading to popa3d, the same machine was running qpopper (out
    of inetd, too):

    12025 3169.38re 35.56cp popper

    It used to take a bit more CPU for less POP3 sessions.
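
The figures quoted above are easier to compare per session. This small calculation uses only the numbers in the quoted accounting output, reading the `cp` column as CPU minutes, as the quoted text does; the function and constant names are just for illustration.

```python
# Figures copied from the accounting output quoted above.
POPA3D_CPU_MINUTES = 16.92 + 15.50  # child processes (popa3d*) plus main processes
POPA3D_SESSIONS = 12749
QPOPPER_CPU_MINUTES = 35.56
QPOPPER_SESSIONS = 12025


def cpu_seconds_per_session(cpu_minutes, sessions):
    """Average CPU seconds consumed by one POP3 session."""
    return cpu_minutes * 60.0 / sessions


if __name__ == "__main__":
    popa3d = cpu_seconds_per_session(POPA3D_CPU_MINUTES, POPA3D_SESSIONS)
    qpopper = cpu_seconds_per_session(QPOPPER_CPU_MINUTES, QPOPPER_SESSIONS)
    print(f"popa3d:  {popa3d:.3f} CPU s/session")
    print(f"qpopper: {qpopper:.3f} CPU s/session")
    print(f"qpopper used {100 * (qpopper / popa3d - 1):.0f}% more CPU per session")
```

That works out to roughly 0.15 CPU seconds per session for popa3d against roughly 0.18 for qpopper, and the popa3d figure includes the deliberately expensive crypt(3) mentioned in the quote.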
  • The key to getting better performance is Maildir. Instead of using a single machine, use a "cluster" of machines and share the Maildir over NFS (or some other network sharing means). Of course, a maildir format mailbox would probably lower your overall utilization as well.
  • Ok, we call it Openwave Email Mx now, but it still scales like a mother. I run it. But then I work for Openwave on Intermail :-)

    http://www.openwave.com/products/messaging_suite/email_mx/index.html
