Infrastructure for One Million Email Accounts?
cfsmp3 asks: "I have been asked to define the infrastructure for the email system for a huge company which, fed up with Exchange, wants to replace their entire system with something non-Microsoft. I have done this before, but not for anything of this scale. Suppose you are given a chance to build from scratch an email system that has to support around one million accounts. Some corporate, some personal, some free. POP, IMAP, webmail, etc. are requirements. The system must scale perfectly, 99.9% uptime is expected... where would you start?"
cyrus (Score:1, Interesting)
For starters... (Score:3, Interesting)
earthlink's setup (Score:2, Interesting)
this guy [jetcafe.org] used to work at both sendmail and earthlink and he has links to some good resources
Vendors (Score:5, Interesting)
Just do one thing, please: make sure that the client is honest-to-goodness serious about this. I absolutely hate getting pie-in-the-sky RFPs from people who are just kicking the tires. It's a good way to burn bridges by not looking professional.
New Google Appliance (Score:3, Interesting)
It really is the best email.
Still Have to Engineer it (Score:3, Interesting)
For pop3 & imap4rev1, look at:
http://www.dbmail.org/index.php?page=overview [dbmail.org]
You still need an MTA. I think qmail is the fastest and best, but I'd use exim, as it's easier.
Database - not sure if MySQL and PostgreSQL will scale with dbmail.
I'd say use FreeBSD, because of the ports collection (don't Linux-flame me). However, something like Solaris 10 x86 (or Solaris + Sun hardware) might provide a bit better scaling, plus HA hardware, SAN support, support in general, etc., though in my experience OSS software installs are a bit tougher there.
qmail-ldap is best suited to this task (Score:2, Interesting)
1. You can sleep at night knowing that you're running the only MTA in widespread deployment that has never once had its security compromised; in fact, qmail's author Dan Bernstein still offers cash to the first one to be successful...
2. You can sleep at night knowing that the core MTA, qmail, has reliably handled some of the largest e-mail operations in the history of the internet. Its design is such that on a properly configured system, you'll never lose a single e-mail. Hotmail actually used qmail for a long time, even after Microsoft bought them - Microsoft repeatedly tried to replace it with Exchange, which kept buckling under the load.
3. Qmail is very modular, allowing you to pick and choose your components wisely.
4. Qmail uses the Maildir format its author pioneered. Maildir is NFS safe, not proprietary/complicated (often binary formats like PST are subject to corruption), etc.
5. LDAP makes it easy to manage massive amounts of accounts.
In any case... qmail-ldap is already running large sites with millions of users. Info:
http://www.qmail-ldap.org/wiki/Documentation [qmail-ldap.org]
I've set one of these systems up on an IT cluster at my current office, and I must say that it is not only very robust but also really easy to manage.
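To make the "NFS safe" point about Maildir a bit more concrete, here is a minimal Python sketch of the usual delivery pattern: write the message under tmp/, flush it, then rename it into new/ so a reader never sees a half-written file. The function names and the unique-filename scheme are simplified assumptions for illustration, not qmail's actual code.

```python
import os
import socket
import tempfile
import time

_delivery_counter = 0  # distinguishes deliveries made within the same second


def make_maildir(root):
    """Create the three standard Maildir subdirectories: tmp, new, cur."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    return root


def deliver(maildir, message_bytes):
    """Write the message under tmp/, sync it, then rename it into new/."""
    global _delivery_counter
    _delivery_counter += 1
    # Simplified unique name: time, pid, counter, host (an assumption, not the full spec).
    host = socket.gethostname().replace("/", "_").replace(":", "_")
    unique = "%d.%d_%d.%s" % (time.time(), os.getpid(), _delivery_counter, host)
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)
    with open(tmp_path, "wb") as f:
        f.write(message_bytes)
        f.flush()
        os.fsync(f.fileno())  # make sure the data is on disk before it becomes visible
    os.rename(tmp_path, new_path)  # the message appears in new/ only when complete
    return new_path


if __name__ == "__main__":
    box = make_maildir(os.path.join(tempfile.mkdtemp(), "Maildir"))
    deliver(box, b"Subject: hello\n\nbody\n")
    print(os.listdir(os.path.join(box, "new")))
```

No locking is involved: each message is its own file, which is why several delivery hosts can write to the same mailbox over NFS.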
Re:POP? (Score:5, Interesting)
Re:Obviously (Score:5, Interesting)
Just so you know. Most of us out in South East Asia refer to NMCI (Navy-Marine Corps Intranet) as the Not Mission Capable Intranet.
Re:Split up the tasks (Score:2, Interesting)
I agree that you want to split things up-- make farms of large numbers of servers to make horizontal scaling easy. Store your user info in LDAP (OpenLDAP works very well, with very good data replication in 2.3.x). Most common server software will support LDAP and it scales very well.
You need "layer-4 switching" to load balance across machines, and automatically disable systems/services that are down. You need something that will cluster. I recommend Foundry ServerIron switches. F5 BigIP is another common alternative.
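As a rough illustration of what the load balancer does for you (spread connections across a farm and skip machines that are down), here is a small Python sketch. The Pool class and the backend names are hypothetical; a ServerIron or BigIP does this in hardware with real health probes.

```python
import itertools


class Pool:
    """Round-robin over backends, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        # In a real balancer a failed health check would trigger this.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


if __name__ == "__main__":
    pool = Pool(["imap1", "imap2", "imap3"])  # hypothetical host names
    pool.mark_down("imap2")
    print([pool.pick() for _ in range(4)])
```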
Re:Obviously (Score:5, Interesting)
The Navy may want to take a page out of Walmart's book, if they're having that much trouble.
Re:NO GMAIL (Score:5, Interesting)
My God no! Friends don't let friends use qmail. Want reasons why?
1) It's a bitch to install. Won't even compile on modern Linux distributions. You have to patch it to compile it and the patch isn't even hosted on qmail's site.
2) It's a bitch to configure. Rather than parsing a single configuration file, qmail relies heavily on the presence of individual files in a directory.
3) Not not not not scalable! That's a myth. Doesn't properly batch jobs together. Hell! qmail was originally designed to be run from inetd!
4) Heavy reliance on other daemontools.
5) Breaks well-known and understood UNIX standards.
6) Security through lack-of-functionality.
7) Not really secure despite the claims.
8) No longer maintained.
9) No features. Adding them requires patching, and patching, and more patching.
Serious sysadmins don't use qmail and for damn good reason. I don't give a damn if Yahoo did manage to string it together and make it work well. In short, qmail isn't particularly suited for deployment in any capacity.
Qmail!! (Score:1, Interesting)
Get a server with RAIDed SCSI disks, preferably hot-pluggable. Install FreeBSD, Qmail and other packages you might need as you go.
Ideally keep the emails in Maildir format.
I don't know where the Novell idea came from.
More specific? (Score:3, Interesting)
5) Breaks well-known and understood UNIX standards.
Which standards are these? Are you talking about the errno [tesco.net] fiasco?
6) Security through lack-of-functionality.
What sort of functionality is provided by, say, postfix, that qmail simply won't do?
7) Not really secure despite the claims.
How's that? Do you have $500 [cr.yp.to]? If not, what's the security vulnerability that the author refuses to acknowledge?
Which of these problems that you enumerate are not addressed by netqmail [qmail.org]?
--grendel drago
Re:Simplicity is key. (Score:1, Interesting)
Re:Simplicity is key. (Score:3, Interesting)
Re:Obviously (Score:5, Interesting)
I am so tired of people shoving everything into relational databases. What queries are you going to run against your database, anyway? SELECT * FROM messages WHERE read=0? Try "ls new" in your maildir. The reason things never scale right is because people design things to be "new" and "cool" like putting their e-mail into a relational database. No. Just use the filesystem. It, and its supporting tools, have been around for 30 years! It Just Works! It doesn't use any userspace memory! There are no permissions issues, because the kernel controls the permissions. It's the optimal solution.
The filesystem is really really efficient (for e-mail) and really really reliable.
Please, don't use a database!
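For what it's worth, the "ls new" point can be sketched in a few lines of Python. This assumes the standard Maildir layout, where undelivered-to-client messages sit in new/; the function name is made up for illustration.

```python
import os
import tempfile


def unread_count(maildir):
    """Count messages still sitting in new/, the equivalent of 'ls new | wc -l'."""
    new_dir = os.path.join(maildir, "new")
    return sum(
        1
        for entry in os.scandir(new_dir)
        if entry.is_file() and not entry.name.startswith(".")
    )


if __name__ == "__main__":
    # Build a throwaway maildir with three unseen messages and one seen message.
    box = os.path.join(tempfile.mkdtemp(), "Maildir")
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(box, sub))
    for i in range(3):
        with open(os.path.join(box, "new", "msg%d" % i), "w") as f:
            f.write("Subject: test %d\n\nbody\n" % i)
    with open(os.path.join(box, "cur", "old:2,S"), "w") as f:
        f.write("Subject: seen\n\nbody\n")
    print(unread_count(box))
```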
Re:Obviously (Score:5, Interesting)
Walmart invited countless consulting firms and data backup experts. They deployed Exchange strictly because M$ was willing to "support" them. To say they were vulnerable to a major IT disaster was an understatement. The Navy wants nothing to do with Walmart's IT.
Re:NO GMAIL (Score:5, Interesting)
chkuser-2.0.8b-release.tar.gz
doublebounce-trim.patch
netqmail-1.05-tls-20050329.patch
outgoingip.patch
qmail-smtpd-auth-0.31.tar.gz
qmail-smtpd-auth-close3.patch
qmail-smtpd_gmfcheck.patch
qmail-spf-rc5.patch
Most of these patches require hand editing the sources and Makefiles to successfully merge them all into the stock qmail or netqmail base. Lots of manually reading through *.rej files to make it all work.
In order to simplify new installations I've created my own personal CVS repository for my Qmail sources. I commit changes to the tree whenever a new patch comes out with functionality I need. Hence on a new install I simply check out my custom tree and compile.
The initial work was a royal pain in the ass; however, once it is all up and running, the stability and performance have been excellent.
Re:Obviously (Score:2, Interesting)
Re:Simplicity is key. (Score:2, Interesting)
> OpenLDAP
IIRC, the replication feature was pretty buggy in some versions of OpenLDAP (2.2.x). Has it been really fixed in the latest versions ?
> Exim
What about qmail ? Have you ever tried it ?
> MD4 [is] more balanced than MD5.
Do you have evidence to back up this claim ?
> NFS mount the maildirs from a fast NFS device like a Netapp.
How do you provide data redundancy with such devices ? Do you replicate data on different NFS servers ? Why not use FreeBSD or Linux boxes as NFS servers ?
> Hardware load balancers are pretty much a necessity.
Why not use standard software load-balancing facilities provided by Linux and BSD systems ?
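On the "more balanced" question, one way to get evidence is simply to measure it. The Python sketch below assumes the hash is being used to spread mailboxes across a fixed number of directories or servers (the thread does not say exactly how), and the bucket_for and spread helpers are hypothetical. It uses MD5 and SHA-1; passing "md4" to hashlib.new may or may not work depending on the OpenSSL build.

```python
import hashlib
from collections import Counter


def bucket_for(user, buckets, algo="md5"):
    """Map a user name to a bucket number using the first 4 bytes of its digest."""
    digest = hashlib.new(algo, user.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % buckets


def spread(users, buckets, algo="md5"):
    """Return how many users land in each bucket."""
    counts = Counter(bucket_for(u, buckets, algo) for u in users)
    return [counts.get(i, 0) for i in range(buckets)]


if __name__ == "__main__":
    # Synthetic account names; real address distributions may behave differently.
    users = ["user%d@example.com" % i for i in range(10000)]
    for algo in ("md5", "sha1"):
        counts = spread(users, 16, algo)
        print(algo, "min", min(counts), "max", max(counts))
```

If two hash functions show similar min/max bucket sizes on your real account list, the cheaper one is as good as the other for this purpose.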
Re:Obviously (Score:3, Interesting)
Also many people have their mail clients set with ridiculously frequent mail check times (like every minute), and on a file based system each check requires a trip to the drive and back. Even with the data on a RAID array with a decent read/write cache, you're still going through the disk subsystem, whereas with a database it would all be in memory.
What's wrong with SELECT * FROM messages WHERE userid=xyz and read=0? That is a cakewalk for a properly indexed dbms. On a medium sized server (say, quad processor w/ 8-16GB RAM) there is more userspace memory than os memory space.
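Here is a small sketch of that query against a "properly indexed" table, using Python's built-in sqlite3 as a stand-in for whatever DBMS you would really deploy. The table layout and index name are assumptions made up for the example.

```python
import sqlite3


def build_db():
    """Create an in-memory messages table with a composite (userid, read) index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        " id INTEGER PRIMARY KEY,"
        " userid TEXT NOT NULL,"
        " read INTEGER NOT NULL DEFAULT 0,"
        " subject TEXT)"
    )
    conn.execute("CREATE INDEX idx_messages_user_read ON messages (userid, read)")
    return conn


def unread(conn, userid):
    """The query from the comment, restricted to one user."""
    return conn.execute(
        "SELECT id, subject FROM messages WHERE userid = ? AND read = 0 ORDER BY id",
        (userid,),
    ).fetchall()


def query_plan(conn, userid):
    """Ask SQLite how it would execute the lookup; the text should name the index."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM messages WHERE userid = ? AND read = 0",
        (userid,),
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)


if __name__ == "__main__":
    conn = build_db()
    conn.executemany(
        "INSERT INTO messages (userid, read, subject) VALUES (?, ?, ?)",
        [("xyz", 0, "hello"), ("xyz", 1, "old news"), ("abc", 0, "other user")],
    )
    print(unread(conn, "xyz"))
    print(query_plan(conn, "xyz"))
```

Whether this beats a directory listing at a million mailboxes is exactly what the thread is arguing about; the sketch only shows that the lookup is an index search rather than a table scan.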
Re:Simplicity is key. (Score:3, Interesting)
You need a central configuration repository to store the email accounts, their passwords, etc. OpenLDAP is perfect for this, and you can replicate it out for scalability. Be prepared to learn about LDAP schemas.
I know this won't be a popular opinion, but given that he's migrating from Exchange, it's fairly likely that they're already an Active Directory shop... it doesn't make sense to abandon it for OpenLDAP, especially given that they're almost certainly windows only on the desktop and will still need AD even if they ditch Exchange.
My vote is for Notes (Score:2, Interesting)
But the real advantage of Notes is as a distributed applications platform. If you want to expand past e-mail and start writing applications such as leave management or room booking or technical documentation databases, then this is where Notes really shines. And they're all databases and they can all be replicated so they take advantage of the same redundancy that your e-mail will use. And if you need to travel then you just replicate the databases you want onto your notebook and take them with you. It's fantastic.
Ah, the mail client
Why oh why does the client suck SO MUCH!! At my previous company the management were looking at moving to Exchange simply because Outlook is so much better a client than Notes (even R6). Notes is a big fat piece of bloatware (as has been discussed many times here). My main peeve is that if you edit an attachment inside an e-mail you can't save it back into the e-mail! E.g., here's a typical scenario:
Not using Notes (Outlook, Thunderbird, Mail.app all let you do this)
With Notes:
WHY!?!?!?!?
But despite all that crap I still think it's an excellent platform and one you should consider. It has support for encryption and also supports IMAP (although not very well, I hear). A lot of large corporations run it. I've worked for 2 large investment banks, both of which run it. You can also integrate IM into it (with Sametime) and remote meetings also (with Sametime meeting). Also, IBM PS are good at setting it up. For something this scale you'll be up for $$$ anyway so I'd be looking at having someone come in to help you, and they're pretty good (I don't work for IBM!).
Re:Split up the tasks (Score:3, Interesting)
That would be an absolute nightmare. Postfix is just as functional and orders of magnitude easier to administer.
If it's a million seats, it's not going to be easy to admin at all. It will require several people who know MTAs inside and out, and sendmail has a track record in very large systems.
Remember that in this case, the job will be 100% running an email system so the best tool for the job should be used, not the best tool for the admin.
Re:Obviously (Score:4, Interesting)
Re:More specific? (Score:3, Interesting)
Qmail has almost no features out of the box. It can't talk to LDAP, it can't handle multiple domains, and it does not reject mail for unknown users (instead it queues up a bounce message, which means each spam message generates one outgoing message).
In order to get qmail to do what exim and postfix do, you have to apply half a dozen patches and recompile.
Of course unless the guy who did the compile took very careful notes you have no idea what your particular installation of qmail is capable of either.
I inherited a qmail install one time and it was a nightmare to maintain. When somebody decided to start sending me 100 thousand emails a day to unkownuser@mydomain.com and my message queue got to be hours long, I only had two options.
1) Gather all the patches used to build the original qmail (again no real way of knowing) and then add yet another patch and recompile.
2) Install postfix.
Guess what I did?
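To spell out why rejecting unknown users during the SMTP conversation matters, here is a toy Python comparison of the two behaviours described above. The function names and the recipient set are hypothetical; the reply codes are the usual SMTP ones (250 for accepted, 550 5.1.1 for an unknown mailbox).

```python
def rcpt_reply(address, valid_recipients):
    """Return the SMTP reply a server could give to one RCPT TO address.

    valid_recipients is assumed to be a set of lower-case addresses.
    """
    if address.strip().lower() in valid_recipients:
        return "250 2.1.5 Recipient OK"
    return "550 5.1.1 User unknown"


def count_bounces(rcpt_addresses, valid_recipients, reject_at_smtp):
    """Count bounce messages our own queue has to generate and send.

    If unknown users are rejected at RCPT time, the sending side deals with
    the failure. If everything is accepted first, each bad recipient costs
    us one outgoing bounce later.
    """
    bounces = 0
    for address in rcpt_addresses:
        known = rcpt_reply(address, valid_recipients).startswith("250")
        if not known and not reject_at_smtp:
            bounces += 1
    return bounces


if __name__ == "__main__":
    valid = {"alice@example.com"}
    flood = ["unknownuser@example.com"] * 5 + ["alice@example.com"]
    print(count_bounces(flood, valid, reject_at_smtp=False))
    print(count_bounces(flood, valid, reject_at_smtp=True))
```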
Re:openwave's email server does this but it's $$$ (Score:2, Interesting)
no, it will not be sendmail (Score:4, Interesting)
You're high. Building a massive production email system on Sendmail 9 is slow-motion suicide. If the security holes don't get you, the terrible configuration methods and complete lack of scalability will, never mind the fact that Sendmail Inc is trying desperately to replace the product.
"Most managable with [...] heavy customization?" I'd laugh if I wasn't crying. And I'm crying because I used to work for a company that deployed a massively customized sendmail infrastructure -- and I was one of the poor bastards who had to maintain it. Trust me, you don't want to do this. Ever.
Yes, milter is cool. No, it's not cool enough to justify burning CPU cycles on sendmail in 2005.
Even Sendmail Inc tacitly admits that Sendmail's design is garbage: take a look at the design document [sendmail.org] for Sendmail X, and note carefully how much it resembles Postfix and Qmail. There are very good reasons for this.
Re:Where to start (seriously) (Score:3, Interesting)
-russ
Re:You are wrong in every way. (Score:1, Interesting)
http://www.google.com/url?sa=t&ct=res&cd=2&url=ht
Re:You are wrong in every way. (Score:2, Interesting)
In order to say that an RDBMS is an order of magnitude slower, one must show that as load increases the overhead of the DB grows faster than that of a FS doing the same task. (And, generally, to say that this difference is "an order of magnitude", the spread between them should increase at least linearly.)
Doing a trace on a DB for a simple query tells you absolutely nothing about its scalability.
Re:Obviously (Score:3, Interesting)
Exchange/Outlook will let you modify the attachment in place and keep it in your mailbox.
Are you saying that I can send a file to 100 people, then edit it after I send it and leave the 100 people with no audit trail? That's horrible!
Re:Qmail!! (Score:1, Interesting)
- 4-5 core machines all running heartbeat, and DRBD or NFS
- Then several Machines for POP, IMAP, and Webmail (NFS the maildirs)
- Then several SMTP servers.
Something similar, but greatly scaled, like this: http://shupp.org/maps/ispcluster.html [shupp.org]
Re:Obviously (Score:5, Interesting)
Re:Obviously (Score:3, Interesting)
What relational DBMSs? All I've heard discussed are SQL products.
The filesystem is really really efficient (for e-mail) and really really reliable.
I'm tired of everyone shoveling everything into a filesystem.
How are you going to run queries against your contacts? Or your appointments?
How does a filesystem guarantee referential integrity? Can a filesystem guarantee an appointment doesn't exist for a bogus contact?
*Any* kind of integrity? Can a filesystem guarantee that a message is well formed?
My 2p on where to start.. (Score:2, Interesting)
Some things to consider: MS Exchange is a lot more than just mail. If calendaring and other forms of group-working are involved then the task at hand is substantially more complex than for a mail-only system. Also, these days with viruses and spam being endemic, the platform needs to incorporate a framework that handles them, as well as policy-driven content management controls, at its core rather than have them as bolt-ins or bolt-ons. Are you bound by any regulatory requirements? Geography is a major influence, and if this is a business platform how does this affect your strategies for resilience, disaster recovery and backup of the platform? In a perverse way most of the decisions you have to make when building systems of this size are business decisions (what's the cost of retraining users to use new mail clients is a favourite of CTOs) and are not specifically about the products/technologies involved.
So, exactly what type of hardware/software and surrounding infrastructure you need to assemble to create 'the whole' is a somewhat open-ended question without going into a decent level of detail on your requirements and the drivers behind them. However, once you go north of about 500k users the number of commercial vendors tails off dramatically. If you include group-working as a factor it reduces further. I'll not start suggesting names (I currently work for a vendor in this space and self-plugging is not in the spirit that /. operates on), but I'd recommend starting out by talking to some of the analyst groups that have staff researching this end of the messaging market (Radicati, Gartner, Butler Group) and then opening dialogue with vendors appropriately.
Notes/Domino (Score:4, Interesting)
This is exactly what Notes was designed to do: scale. People have been building systems on this scale with Notes for nearly twenty years. You can not only scale it by moving parts of your email system onto mainframe class iron, but you can distribute it and provide all kinds of flexibility and redundancy into your system to meet virtually any messaging requirement (e.g. choose an alternate MTA for high priority traffic when there are Internet disruptions). Naturally there's some complexity involved, but if you can get by with sendmail you probably shouldn't be using Notes.
What's more important is the management of accounts and identity, which is distributed, delegatable, and backed up by robust cryptographic certificate management. You can let a subsidiary manage its own accounts, they can subdelegate that to a division, and the division can subdelegate that to the IT staff on site; at each level policies can be set, enforced, and changed for lower levels.
and then you need a copy (Score:3, Interesting)
Sharing something with people (which for some reason database people call "single instance store", as I've learned today) can be done in both a filesystem and a database. Databases are "one-size-fits-all" tools: not always the "best" solution, but one that has a good chance of working even when it isn't the best. Linus said something similar when it was suggested that he develop Git on top of MySQL... if you really know what you're going to do with the data, and you KNOW that a filesystem is enough, why use a database? It's like buying a 900 HP car for your mother - STUPID. "Let's do it just because we can" is a good approach if what you want is to write overengineered, bloated software.
Because a filesystem IS a database. Except that instead of having a SQL-ish interface, you've a "read(), write(), readdir()" kind of interface. Which happens to be really fast (filesystems are implemented inside the kernel, they're reliable, they're much simpler, easy to manage, etc).
When you use a database like MySQL, you're just using a database on top of, uh, another database (the filesystem). Which makes no sense. It WILL work, but that doesn't mean it is the "best possible solution".
Despite all this, BTW, hardlinks are NOT the solution to the "share a file between 1000 users" problem. They can be, but remember that you can't make hardlinks between different filesystems. I have no idea whether you can use LVM to solve this, or whether ACLs + symbolic links can be used to implement this in a delivery agent. And if you can't (I don't really know), someone really should think about adding something to filesystems to allow it, like Plan 9 did, because it makes sense.
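A quick Python sketch of the hardlink idea and its limit, for anyone who wants to try it. The share_message helper is hypothetical; it links one stored message into several mailbox directories and falls back to a plain copy when the link would cross filesystems (the EXDEV error), which is the case mentioned above.

```python
import errno
import os
import shutil
import tempfile


def share_message(source, mailbox_dirs, name):
    """Hard-link one stored message into several mailboxes.

    Falls back to a full copy when source and destination are on
    different filesystems, since hard links cannot cross them.
    """
    paths = []
    for directory in mailbox_dirs:
        os.makedirs(directory, exist_ok=True)
        dest = os.path.join(directory, name)
        try:
            os.link(source, dest)
        except OSError as exc:
            if exc.errno != errno.EXDEV:
                raise
            shutil.copy2(source, dest)  # single-instance storage is lost here
        paths.append(dest)
    return paths


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    source = os.path.join(root, "message")
    with open(source, "w") as f:
        f.write("Subject: all hands\n\nbody\n")
    boxes = [os.path.join(root, "user%d" % i, "new") for i in range(3)]
    share_message(source, boxes, "msg1")
    # One file on disk, four directory entries pointing at it.
    print(os.stat(source).st_nlink)
```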
Re:qmail-ldap can do it (Score:2, Interesting)
We were able to apply OS patches box-by-box, taking them out of service individually, but without any downtime to the service. Very nice.
Others are using qmail-ldap for large ISPs, of the size you are asking about. Check out their mailing list.