What Mailbox Format Do You Use And Why?
"I currently store all of my e-mail in a local mbox-style IMAP store in ~/mail/, so that I am not tied to any particular mail client. However, I am planning on syncing my mail across multiple machines (home, work, and soon a laptop), so I need to have mail in a form which can be synced easily. Mbox is bad for this because if I grab mail on one machine, and later delete some mails from the same folder on another machine, then sync, the new mails will be lost. This is where maildir is good - each message is a separate file. But why do so many people hate it? If I do change over to maildir, what IMAP/SMTP servers should I use? A hacked sendmail/UoW IMAP? Courier-IMAP + QMail? Something else? How do other people keep their mailstores synced across many machines, and what software do they use?"
Re:Exchange Mailbox format (Score:2)
Exchange didn't like that the last time I used it; in fact it acts more or less like mbox in that respect, although it depends on how your mail is stored locally. Plus Exchange is pretty expensive for people who don't have large expense accounts and have to support a large base of people.
Finally, how does supporting POP and IMAP make you a "lot more cross platform than UNIX mail"? Especially since UNIX based mail systems can do the same thing (and can share mailboxes between other non-Exchange servers if need be). My biggest beef with Exchange is the binary message format. Just try to resurrect a slightly damaged file, or search/modify something without having to fire up your mail client or web browser.
Re:I'm an 'mbox' user... (Score:2)
Re the wildcard expansion limit, xargs can handle that.
Re:Maildir is WAY better (Score:2)
Given that some organizations spend that much on a single server, this is pretty damn reasonable.
And he's entirely right -- my experience confirms that using ReiserFS makes maildir handling much faster than under ext2.
Sounds like... (Score:2)
While much M$ software is poorly designed, MAPI is an exception. MAPI is a pretty flexible, intelligent architecture for all things messaging.
MAPI allows you to do things like substitute message stores, address books stores, etc., by treating them as abstract components. Exactly what you're claiming to have done with your "data store API"
I don't want to be too critical, but I hope you folks looked at MAPI before you went out inventing another API...
but use xfs (Score:2)
as far as mta's go, does anyone know if qmail supports secure sendmail (using sasl)? I'm running an old version of postfix on my relays, time to update.
cheers,
-o
Re:Enterprise-grade messaging for Linux/Unix (Score:2)
Actually, what it means is that you can use a high-performance, industry standard query language like SQL to extract data, instead of having to kluge together a patchwork of file and stream manipulation tools.
yes, and a couple more (Score:2)
1) It needs to be *trivial* to use standard unix tools on the mailbox to find things.
e.g.,
rmm `scan
should remove all messages with badthing[12] in the heading, where f4 is an alias for
sed -e 's/\(....\).*/\1/' |tr '\n' ' '
[I'll admit that I was briefly worried the first time that this was
my reaction to a bunch of messages from a mailer gone nuts . . . ]
2) it would be nice for the system to be hostile to abusive mailings--not by content, but from the idiots that send plain text messages in html and mime. That's not a user preference; it's *wrong*.
3) Must be command line friendly. MUA's are for sissies. Real men read from the command line
hawk
Re:The Disputes are probably not technical (Score:2)
It used to be that qmail was only allowed to be distributed in source code form, and the CDB database system (it's a cool thing well worth looking at) used a license that was somewhat incompatible with the GPL. There seemed to be some rancor to the effect that the GPL wasn't a "free" license; seemingly an independent recreation of the "BSD bigot" approach to software licensing.
Efficiency, Flames... (Score:2)
o_dev_t st_dev;
o_ino_t st_ino;
o_mode_t st_mode;
o_nlink_t st_nlink;
o_uid_t st_uid;
o_gid_t st_gid;
o_dev_t st_rdev;
off_t st_size;
time_t st_atime;
time_t st_mtime;
time_t st_ctime;
};
which adds up to around 48 bytes, and add to that the size of the directory entry that attaches to the inode.
It's not necessarily "ludicrously big," but it's space overhead nonetheless.
As for "flaming," it's somewhat unfortunate that Dan
If he tried to find some places for agreement, his software would probably get used more. Some of it's really very neat, cdb and the microscopic DNS server being particular examples...
The fact that he comes from a pretty strongly "pure math" background means that he comes up with substantially different ideas than most people. The PM factor adds in two particularly useful things:
The Disputes are probably not technical (Score:2)
This may be important on a big mail server where inodes or disk space may wind up being scarce commodities.
There are then nontechnical issues.
The creator of Maildir [cr.yp.to], Dan Bernstein [cr.yp.to], is a, um, "somewhat prickly character." Take a look at his criticisms of Postfix [cr.yp.to] for some mild material. Comparative discussions of Postfix and qmail have resulted in extremely inflammatory discussions. And Bernstein's attitudes towards the GPL seem similarly "inflammatory." This appears to have put some people off his software, whether rightly or wrongly.
Personally, I use Postfix as my MTA, and push messages through Maildir as an interim step to pushing them into MH, which is only a fairly small step removed from Maildir...
Re:Exchange Mailbox format (Score:2)
Exchange has its good points, this is true, but the biggest problem I have with it is that it holds my data hostage. I can't get at the mail spools if something dies and, if it does, you're fucked unless you also bought support contracts.
We've been running qmail + vpopmail for over 1500 people with Maildir formatted message stores without a problem for over two years now. When something breaks, I can fix it. Data is stored either in the database or in regular old files. It seems to work very well on a mediocre P2 and has all the good stuff: (A)POP, IMAP (courier-IMAP), selective relaying (relaying is allowed after a successful POP or IMAP authentication), user-run mailing lists (ezmlm) and web configuration (vpopmail has a web client). Oh yes and Squirrelmail for the web based mail reading folk.
There's one thing I learned early on and that's that I don't like having my data held hostage. For the companies I advise, I recommend pretty much any software so long as a) it's open-API, b) it's open source, or c) I get copies (and updates) of the data formats. Surprisingly few companies balk at this.
Re:JWZ and me (Score:2)
Your commandline does not solve the problem that the original invocation of xargs was intended to solve - passing a *huge* number of files to grep on the commandline (grep * in a directory with a ton of files) causes it to break.
xargs works two different ways depending on how you invoke it.
is the equivalent of
or
Whereas invoking xargs like this :
is congruent to :
So, umm, there, and such.
</pedant>
--
Re:Maildir is WAY better (Score:2)
Yeah, being able to grep to find a particular message properly is really handy - as is being able to kill all the messages containing 'University Diploma' with just find, grep and rm...
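A minimal sketch of that find/grep/rm idea in Python, assuming a standard maildir layout with new/ and cur/ subdirectories. The helper name purge_matching is made up for illustration, and it reads each message whole, which is fine for a sketch but not tuned for a huge store.

```python
import os


def purge_matching(maildir, phrase):
    """Delete every message in maildir/new and maildir/cur that contains phrase.

    Hypothetical helper for illustration; returns the list of removed paths.
    """
    needle = phrase.encode("utf-8")
    matches = []
    for sub in ("new", "cur"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        # scandir walks the directory entry by entry, so no shell
        # wildcard expansion (and no argument-length limit) is involved.
        with os.scandir(folder) as entries:
            for entry in entries:
                if not entry.is_file():
                    continue
                with open(entry.path, "rb") as fh:
                    if needle in fh.read():
                        matches.append(entry.path)
    # Remove only after the scan, so the directory is not modified mid-walk.
    for path in matches:
        os.remove(path)
    return matches
```

Because each message is its own file, removing one is a single unlink and the rest of the folder is never rewritten.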
The other thing I've found in the past with mbox is that if you're really unlucky, the POP3 server will make a temporary copy of your (whole!) mailbox before doing a UIDL/LIST. qpopper used to do this at least, and you really knew about it when someone had a 30Mb mailbox. Maildir has a minimum of file shuffling and reading/rewriting.
"Enterprise grade", or a toy? (Score:2)
On the face of it, this statement makes no sense at all. The big mail communications servers these days are the Internet MTAs, which in all the major ISPs handle typically many millions of messages per day on behalf of millions of customers per ISP. As others on this thread have mentioned, Exchange runs out of steam if you push it beyond some 2000 users per server -- it just doesn't scale, so it's not "Enterprise Grade" by any stretch of the imagination, it's out by 2-3 orders of magnitude. You've got to stop believing manufacturer's propaganda.
You should compare Citadel/UX to qmail or Exim installations in large ISPs, not against toy systems. Server farms with dozens of hierarchically-organized, multi-CPU MTAs which provide the massive underpinning to the world's Internet mail traffic, those are the "Enterprise grade" systems of today, not the relatively puny corporate systems of yesteryear being portrayed as "Enterprise grade" by manufacturers of personal computer software with more money than experience.
I feel I must also comment on your novel use of the word "robust". If one compares the reliability, availability and robustness of a flat file to that of even the simplest database system, the mind boggles that anyone could consider the database system as anything but the less reliable of the two by a colossal amount.
We run massive database systems here from the best regarded RDBMS manufacturer in the industry and configured with their help, yet even our DBAs will admit that the reliability of their databases is not brilliant. In contrast, the reliability of Exim is, er, well, it has never failed, so I guess the reliability is infinite. And I hear that qmail is likewise excellent in that respect. How the hell is a database going to improve on that kind of reliability and robustness?
Even the best databases crash and corrupt data every once in a while, and a new database could easily be less stable rather than more. But I've never had a flat file crash on me.
If it makes you feel any better, Unix is a sort of combined I/O multiplexer and storage mechanism, which inevitably makes it a particular kind of database too. To get the most out of it you should leverage its capabilities instead of trying to impose a totally different semantic on top of it. You'll never gain robustness by adding complexity.
Re:Enterprise-grade messaging for Linux/Unix (Score:2)
However, you'll be happy to know that we've wrapped all of the database calls into a data store API. Recently we made the transition from GDBM to Berkeley DB without having to rewrite everything -- just drop in a new data store module and re-import the data (yes, there's an import/export utility). It would be quite straightforward for someone to write a data store module that uses MySQL, Oracle, or whatever.
--
Re:Qmail, Courier-IMAP, Squirrel mail, Apache mods (Score:2)
I'm very concerned about security, so I configured Courier-IMAP to ONLY provide SSL/TLS secure POP and IMAP. I set it up to provide insecure (non-SSL) service only on localhost (127.0.0.1), but not visible over the network. That way SquirrelMail or MUAs running on my server can get to it without SSL, which is OK because there's no way for someone else on the wire to eavesdrop. Of course, I also have the .htaccess file for SquirrelMail set up to only serve over SSL/TLS (see below), and I don't allow telnet, rlogin, or non-SSL'd FTP into my server.
I'm somewhat interested in developing a database back end for the IMAP server, so that old archived email can be stored more efficiently than in either a maildir or mbox, but still be readily accessible.
# .htaccess for SSL-only services
# Options -Indexes
<IfDefine HAVE_SSL>
SSLRequireSSL
# insert the https: URL of the service in the next line
# for automatic redirect if the user attempts a non-SSL connection
ErrorDocument 403 https://host/webmail/
</IfDefine>
<IfDefine !HAVE_SSL>
# this is to make sure that if the web server is accidentally started without
# mod_ssl, the web pages won't be served up insecurely
Deny from all
</IfDefine>
Re:(ex)mh (Score:2)
refile +foobar `pick -from foobar`
will move all messages from "foobar" into my foobar folder in about 15 keystrokes (with autocompletion).
refile -link +foobar `pick -search project6` +project6
will refile messages in my foobar folder containing the text "project6" to my project6 folder using hard links. Now the messages exist in both folders.
I can type inc, show, next, comp, etc. in any terminal window at home or at work, and the right thing happens (with a few ssh tricks and gnuclient). No fumbling for some icon to click on, or waiting for the gui to come up, or finding the window running my mail agent...
The only drawback is that after a few hundred thousand messages scattered in hundreds of folders indexing the files for backup can take a bit of time, "what do you think I'm running here, a news server?"
Re:it is nice (Score:2)
Yes, you do have to store the password (or a derivative thereof) on the server. Otherwise, the server would never know if you typed in the correct password or not. But, I think you're poorly trying to make a point that not all data should be stored on the server.
It's true; not all data should be stored on the server. Like certain subscriptions. Of course, the client doesn't have to use the server's capabilities to manage subscriptions.
I would like to have a client that allows me to choose server-based or client-based management of subscriptions and recent messages. That way, I could say "I always want this subscription, but this other subscription should only show up when I'm using balsa from home" or something. That would not be possible if the server could not store subscriptions, but the ability to store subscriptions does not prevent the client from doing its own management.
And race conditions in the spec should be fixed. They're not excuses to throw away the idea entirely.
-Dave
Re:Qmail also supposts mbox (Score:2)
maildir format does not scale well to large mailboxes on large servers because it has no sort of overview cache information. Mark Crispin (author of UW imapd) correctly deduced that MH format sucked for the same reason that qmail format sucks, and refused to implement it. Without a way to do overview information, getting headers to do the message list is excessively slow.
Re:I'm an 'mbox' user... (Score:2)
Yes, I'm aware of that. The problem is that it's dog-slow. Opening and scanning 2000 files for one mailbox alone is just darned painful. Even if the mailbox is hundreds of megabytes in size, 'grep' will operate on it faster if it's a single file than if it's zillions of separate files.
Also, when your mailbox grows to thousands of messages, the wildcard expansion in the shell ('*' in your example) may overflow or truncate, and you may not actually scan all the messages. Yes, you can resort to foreach, but then not only are you opening zillions of files, you're discretely launching 'grep' a zillion times as well.
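One way around both problems is to hand grep the file names in batches, so each grep process gets many files without the argument list overflowing. A rough sketch of just the batching step in Python; batch_args and the 4096-byte default are assumptions for illustration, and the real limit depends on the system.

```python
def batch_args(names, max_bytes=4096):
    """Yield lists of names whose combined size stays within max_bytes.

    Hypothetical helper: each yielded list is meant to become the file
    arguments of one command invocation.
    """
    batch, size = [], 0
    for name in names:
        cost = len(name.encode("utf-8")) + 1  # +1 for the separating space
        if batch and size + cost > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(name)
        size += cost
    if batch:
        yield batch
```

With a few thousand message files this gives a handful of grep launches instead of one per file, and no single command line exceeds the chosen size.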
Like I said, I admire 'maildir's reliability, and it's certainly more flexible in certain ways, and if I could get the same or similar search speed out of 'maildir', I'd switch. But for the moment, 'mbox' serves my purposes.
Schwab
Re:Outlook corporate mailbox (Score:2)
IMAP? courier-imap.
security? you have to make a tradeoff since you refuse to use proper products, there's no tradeoff using qmail-courier-imap-ssl-mutt-whatever.
workgroup facilities? there are a lot (Evolution, many webbased) so that's a moot point. you get everything.
Resources? qmail is a lot more resource friendly than Exchange...
still there's a tradeoff using OS-tools, you need somebody put all this together. a smart guy...
Re:A few thoughts on message storage (Score:2)
This combines the best of both worlds. This also means that while it's easy to corrupt your database with a single bug in your code, you can always re-build it from the on-disk messages.
Yes, it is great until the two get out of sync. If you can limit access to the raw filesystem, then that'll eliminate most of the problems, and most of the advantages.
Besides, databases are a lot better (these days) at storing large hunks of arbitrary data, so I'd just stick everything in the database.
That or use a future version of reiserfs, which could give you a database-like view of your filesystem.
Re:Maildir is WAY better (Score:2)
What I do is configure maildirs for everyone on the mail server, using either qmail or postfix (both can deliver to maildir; qmail is more minimalistic but a bit confusing, postfix is about as good and a lot more understandable), and then setup qmail's pop3 daemon (even if using postfix to deliver). This combination has worked so well for me that I use it both on server and on my desktop computer (getting mail from pop3 with fetchmail, delivering into maildirs, reading with mutt).
The only thing to make sure with maildir is that you have enough inodes. But that's easy to handle when formatting the partition, and (even better) you could use reiserfs, which has dynamic inode allocation and handles large directories of small files very well.
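A quick way to keep an eye on inode headroom, sketched in Python under the assumption of a Unix-like system (os.statvfs is not available everywhere). The function name inode_usage is made up for illustration. Filesystems that allocate inodes dynamically may report zero or a nominal figure here, so treat the numbers as a hint.

```python
import os


def inode_usage(path):
    """Return (total_inodes, free_inodes) for the filesystem holding path."""
    st = os.statvfs(path)
    # f_files is the total number of inodes, f_ffree the number still free.
    return st.f_files, st.f_ffree


if __name__ == "__main__":
    total, free = inode_usage(".")
    print("inodes: %d total, %d free" % (total, free))
```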
Mailbox formats (Score:2)
Flat mbox file:
pros: easy to set up, accessible.
cons: subject to locking issues,
not scalable, limited to local fs
Maildir format:
pros: fast, highly scalable, good
performance, very few locking
issues, reliable
cons: limited user access to directory
Proprietary db format:
pros: transactions, scalable
cons: expensive, corrupts easily,
word of warning:
backup frequently if you are
using MSexchange.
Re:Exchange Mailbox format (Score:2)
Their points:
1) No version of Exchange had a stable message store until 5.5SP1. According to them, that's at least 3 years on the market, corrupting mail all along! But it does work fine now, and Ex2000 solves the '1 big database' problem.
2) They had weekly maintenance downtime to handle the database issues. That meant they took turns coming in on Sunday mornings. Whoop for them.
3) Even so they still occasionally had niggling database consistency problems which they never could quite work out. When these things were happening, people would get nervous because basically the server could crash anytime. Many times they had to go offline and restore the entire messagestore from tape to solve these things.
Meanwhile, I used to do some Notes stuff. Notes has its own problems, but at least you could backup and restore mailboxes with the COPY command, as well as solve DB corruption and whitespace issues (which cropped up rarely) with the server online. I never had to come in on the weekends at least. But to prove this isn't FUD, I'd take the Outlook interface over Notes or Netscape any day of the week.
--
Re:it is nice (Score:2)
I'm still scratching my head trying to come up with a scenario where a user would want all of his mail to suddenly be marked UNSEEN behind his back. On the other hand, every user I've ever met likes the scenario where switching to a different client maintains the state of his email world.
But you don't have that feature now.
There is a vast difference between a race condition that might erroneously flag some mail and a design that always erroneously flags all mail. In the four years I've been using IMAP I've never had this race condition hit me. Despite your claim, I do have this feature now.
(ex)mh (Score:2)
Advantages:
* Easy to access any message with standard Unix text utilities (grep, more, and such).
* No worry about corrupting the entire mailbox if one message gets clobbered by a broken client (or broken file system or whatnot).
* Incremental backups and synchronization are easier
Disadvantages:
* Uses lots of storage. [Oh wait, I work for a storage company, so this is an advantage.]
* With one file per message, you can get more files in a directory than your shell will allow you to use as command line arguments. (e.g., `grep important *` may fail)
I guess the big safety issue is how well it behaves if you have more than one mail client accessing your email at a time. I don't see this as a very likely situation, but still something that should work.
Re:A few thoughts on message storage (Score:2)
I used to work in a place that stored about 20 terabytes of certain documents it worked with, which varied in size from 1K to 5G each. Median size was about 40M. All the meta data, like what customer it applied to, dates of processing, and so forth, were stored in a database. But the actual document file never was. The network path to the document was in the database, but the documents were stored on hundreds of Novell (ick) file servers. The database was still the major bottleneck of the whole operation. All these wonderful database facilities like SQL don't mean squat when the main functionality was to get the document, process it, and store it back, which is what happened most of the time. Of course it was nice to have the SQL when you needed to manually check on things or do some odd searches. But I would never store bulk data in a database; only the pointer to it would go in there. Databases are faster at complex searching, but not at bulk delivery of data.
Re:Enterprise-grade messaging for Linux/Unix (Score:2)
Why are they concentrating so much in a single box like that anyway? Why not a few separate smaller boxes?
Re:Exchange Mailbox format (Score:2)
I haven't used Exchange 2000 but Exchange 5.5's mailbox format is a piece of shit! It's one huge flat file. And I mean HUGE. Plus the "Jet" database format it uses is slow as balls. And to top it all off, you have to take the service offline to defrag it! Unless you love getting up several Saturday mornings a month because your users can't check their email, Exchange isn't for anyone.
-Lee
Re:A few thoughts on message storage (Score:2)
I find future versions of trendy software to be pretty impossible to use in building a workable solution...
Re:A few thoughts on message storage (Score:2)
Doesn't seem too hard to me. As far as consistency checking goes, you can ignore the on-disk text except for displaying the message. If you want to use the headers from the file to refresh the database in the event of corruption, fine, but it's not a big requirement.
Any backups of my data that I keep are also stored in the same physical universe, but I don't use this as an excuse not to keep backups. Having the headers lying in a plain text file to sanity-check against can only help. Generally when one assesses risk, one works with cost/benefit tradeoffs. What you propose is very costly in terms of database resources, whereas duplicating headers on disk is very cheap. This cost comes in terms of disk space, time used to duplicate the data (which in a very large system could be staggering for every message body), etc. I think you will find that the benefits of storing headers twice will far outweigh the cost of having done so. I can't say the same for storing open-ended (in terms of size) message bodies in a relational database.
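To make the "headers in the database, bodies on disk" idea concrete, here is a small sketch in Python that rebuilds a header index from the message files. Everything in it is an assumption for illustration: the table name headers, the two indexed fields, the use of SQLite, and a maildir-style layout with new/ and cur/ subdirectories.

```python
import os
import sqlite3
from email.parser import BytesHeaderParser


def rebuild_index(maildir, db):
    """Recreate the header index from the on-disk messages.

    The files are the source of truth; the table is a disposable cache.
    Returns the number of messages indexed.
    """
    db.execute("DROP TABLE IF EXISTS headers")
    db.execute(
        "CREATE TABLE headers (path TEXT PRIMARY KEY, sender TEXT, subject TEXT)"
    )
    parser = BytesHeaderParser()
    count = 0
    for sub in ("new", "cur"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            with open(path, "rb") as fh:
                msg = parser.parse(fh)  # stops after the header block
            db.execute(
                "INSERT INTO headers VALUES (?, ?, ?)",
                (path, str(msg.get("From", "")), str(msg.get("Subject", ""))),
            )
            count += 1
    db.commit()
    return count
```

The message bodies never enter the database, so a damaged index costs one rescan rather than a restore from backup.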
Nice idea, but we're talking about software design here, not system administration procedures. Clearly a sysadmin should be backing the data up, but to tell the user, "something looks odd here, go chase down a sysadmin and make him restore a backup," is a lot less friendly than, "I found some corrupt headers in message 501719, correcting..."
Re:mbx format (Score:2)
Unless the "binary" is encrypted data then it's hardly going to make a difference. Also the encryption key had better not be stored anywhere. Otherwise "su -l \" will do the trick anyway.
Let alone that in many environments encrypting mail in such a way that only the user could read it would be a very bad idea.
Re:Need better filesystem for maildir (Score:2)
What filesystem do you have
Re:Currently using Exchange 2000...And loving it! (Score:2)
Try having a look at www.courier-mta.org
Re:My practical argument for maildir (Score:2)
Quite trivial, since it's simply a matter of cutting up files into smaller bits. Can't have anything else accessing the mbox file at the time, but once the MTAs and MUAs have been switched to maildir then nothing else should be looking at it.
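For what it's worth, the "cutting up" step can be sketched with Python's standard mailbox module. This is only an illustration, not how any particular MTA's converter works; the function name mbox_to_maildir is made up, and as noted above it assumes nothing else is touching the mbox file while it runs.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a maildir.

    Hypothetical helper; assumes no other process is using the mbox file.
    Returns the number of messages copied.
    """
    src = mailbox.mbox(mbox_path, create=False)
    dst = mailbox.Maildir(maildir_path, create=True)  # makes tmp/, new/, cur/
    count = 0
    try:
        for message in src:
            # Each message becomes its own file in the maildir.
            dst.add(message)
            count += 1
    finally:
        src.close()
    return count
```

The original mbox file is left untouched, so it can be kept around until the new store has been checked.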
If maildir is indeed the great thing that some people make it out to be, you'd think that there would be more people switching
The problem is MUA writers tending to ignore maildir. Even though they will happily put the effort into more complex or redundant ways of accessing email. e.g. kmail has inbuilt POP3 support, but every machine it can run on can also run fetchmail.
Re:The Third Way (Fourth, Fifth) (Score:2)
There is always MMDF which does the same thing, except for using ^A as a message separator. Other than this it has all the same "features" as mbox.
Re:Reasons for one over the other (Score:2)
Except that mail "readers" don't just read. They also do things such as add metadata, delete, move mail around, etc. With mbox, metadata is commonly added by inserting extra headers into the existing file; inserting stuff into the middle of a file is expensive, and it means that anything other than exclusive access probably isn't possible. With maildir it's simply a matter of renaming the file. To delete a message with mbox you either have to leave holes in the middle of the file (and "compress" it later) or rewrite as you go. With maildir, simply delete the file. To move with mbox it's a matter of a file append followed by a delete. With maildir it's simply a rename.
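The maildir side of that comparison, sketched in Python. The helper names are made up for illustration; the ":2,S" suffix is the maildir convention for a message that has been seen, and the renames assume both folders live on the same filesystem.

```python
import os


def mark_seen(maildir, filename):
    """Move a message from new/ to cur/ and flag it as seen, via one rename."""
    base = filename.split(":", 1)[0]  # drop any existing info suffix
    src = os.path.join(maildir, "new", filename)
    dst = os.path.join(maildir, "cur", base + ":2,S")
    os.rename(src, dst)
    return dst


def move_message(path, other_maildir):
    """Move a message to another folder: a rename, with no data copied."""
    dst = os.path.join(other_maildir, "cur", os.path.basename(path))
    os.rename(path, dst)
    return dst


def delete_message(path):
    """Delete a message: unlink one file, nothing else is rewritten."""
    os.remove(path)
```

None of these operations reads or rewrites message data, which is the point being made about mbox needing to touch the whole file.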
Re:Maildir is WAY better (Score:2)
Actually the latter is probably even more expensive since it isn't a simple matter of copying the data a chunk at a time from one file to another. The code doing the copying needs to look at the data being copied, either generating an index or verifying an index... As well as adding metadata by inserting extra data into the file (or the copy).
Re: Maildir's mail fault (Score:2)
Guess
Unless you're running a decent btree structured filesystem like XFS, ReiserFS or JFS, expect a performance hit if you get thousands of messages in a single mailbox.
Expect an even bigger performance hit if you have lots of messages in the same file. You must use lots of locking (and it must be reliable otherwise the whole thing will get corrupted). Things such as index files must be understood by every piece of software which does anything with the file, etc. Effectively you will end up trying to emulate a file system in user space software.
Re:Maildir is WAY better (Score:2)
There is another consequence of this: maildir supports an arbitrary number of processes reading and writing at the same time. The mailbox format requires complex locking, and even then adding new messages has to be strictly serial.
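The reason concurrent writers don't collide is the delivery sequence: write the message under a unique name in tmp/, then rename it into new/. A simplified sketch in Python; the function name deliver and the exact unique-name recipe (time, pid, counter, hostname) are assumptions for illustration rather than a copy of any real delivery agent.

```python
import itertools
import os
import socket
import time

_sequence = itertools.count(1)  # distinguishes deliveries within one process


def deliver(maildir, data):
    """Write data as a new message in maildir and return its final path."""
    host = socket.gethostname().replace("/", "\\057").replace(":", "\\072")
    unique = "%d.P%dQ%d.%s" % (int(time.time()), os.getpid(), next(_sequence), host)
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)
    # O_EXCL makes the open fail rather than overwrite if the name exists.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())
    # Readers only look in new/ and cur/, so they never see a partial file.
    os.rename(tmp_path, new_path)
    return new_path
```

Two processes delivering at the same moment get different names, so neither needs to take a lock or wait for the other.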
Maildir is also a better analogy with paper mail. Mailbox would be something like you have a scroll of all the messages pasted together which you periodically have to hand to the postman for more bits to be stuck on the end...
Re:Maildir is WAY better (Score:2)
It's not just pop3 servers which do this; indeed it's almost the standard way of processing a mailbox file.
Re:Maildir is WAY better (Score:2)
You could even use SMB to access the mailbox from a Windows workstation. (Or at least you could if the software existed.)
A point which hasn't been mentioned is that accessing email from a workstation using file sharing (the same file sharing which is in use anyway) means no need for additional password entry (or storing passwords in plain text/reversible encryption formats.) The user simply needs to log in and their mail is there. If they log in on more than one machine everything still works fine too...
Re:Enterprise-grade messaging for Linux/Unix (Score:2)
SQL being so "standard" that software vendors demand a specific implementation... Pull the other one, it's got bells on!
Re:Maildir is WAY better (Score:2)
The licence is likely to upset both GPL and BSD diehards. Also who the author is may be an issue too...
Re:Enterprise-grade messaging for Linux/Unix (Score:2)
Sounds like big iron propaganda... Both mailbox and maildir have the advantage of being conceptually simple. The database solution is complex, probably more complex than is needed for storing email in the first place.
Arguing for a database looks to me quite similar to the arguments as to why the Windows registry is better than
Using mailbox means that a problem with John's mailbox probably won't affect Jane's. Using maildir means that a problem with one of John's messages probably won't affect the rest of them. Using some kind of DB could easily mean, John has a problem with mail, everyone has the same problem.
Re:Enterprise-grade messaging for Linux/Unix (Score:2)
As well as these boxes being expensive, since they need to be reliable and hot swappable redundant. Not that even that will help when something external such as a router, cable, etc fails.
Whereas if you distribute the mail load to 10x the number of boxes (albeit cheap off-the-shelf boxes), you just need maybe one or two decent (backup/redundancy
Unfortunately RAIC or RAIB doesn't quite have the ring of RAID.
With mail the load can be distributed. In fact I believe people don't really mind having their email addresses being user@tag.domain.com. It's the marketing/PR guys who'd complain. Heck market it as user@neighbourhood.domain.com
Assuming the distribution needs to be that obvious in the first place...
Re:Enterprise-grade messaging for Linux/Unix (Score:2)
As opposed to one user out of 150,000 losing their mail with mailbox or one user out of 150,000 losing some of their mail with maildir.
While granted I'm sure that bugs such as these can be worked around
What's the point of applying a workaround to get a complex system to work when you could simply apply a KISS approach?
Re:The Disputes are probably not technical (Score:2)
This might be an issue when storing things as
Re:Some problems with maildir (Score:2)
So how's your newsserver holding up...
WAFL, used by Network Appliance, can fail under this sort of load. Secondly, maildir file names can be quite long. There was a bug in a version of Solaris where the operating system would not cache file contents of an NFS-mounted file whose name was longer than 31 characters. This can result in very poor performance.
Sounds more like these are more problems with your specific platform (Solaris) though. Indeed you identify the NFS issue specifically as a "bug".
Re:Mailbox formats (Score:2)
cons: limited user access to directory
No more limited than any other directory, it's just a directory with files in. If people really want it's trivial to read their email with a text editor.
The "limited access" comes with lack of software (especially GUI software) which can handle maildir, even though it's probably simpler in terms of programmer effort than formats more commonly supported.
Re:mbx format (Score:2)
It may make it faster but it also means that you can easily be tied to specific hardware/software combinations in order to be able to read your email.
Re:Enterprise-grade messaging for Linux/Unix (Score:2)
(storing email in a database that is)
I worked as lead programmer at a mail provider and was in charge of the system's design from the start. The ingenious idea to store email in a database, while it sounds good...is rather horrific. We had issues with databases becoming corrupt (and hey, 150000 users like it when they lose all their mail), the database being overly bogged down (guess what, fopen is faster than going through a database) amongst other things.
While granted I'm sure that bugs such as these can be worked around, databases were meant for holding fields of data, not whole files - especially binary ones (and before you say that email is ascii, think of other languages where they use multibyte encoding etc.)
That doesn't count... (Score:2)
Both are reasonable choices, but it's unfair to compare them in an apples-to-oranges fashion.
License info (Score:2)
Having used qmail for a few years, I can indeed say that it is a safe and reliable product. But I wouldn't recommend it for a novice sysadmin; DJB is a really smart guy, and he seems to have little patience for those who aren't.
As to his views on licensing, here is the distribution policy for his software. He strictly forbids distribution of qmail except in forms approved by him:
http://cr.yp.to/qmail/dist.html [cr.yp.to]
Re:Exchange Mailbox format (Score:2)
Stop spreading FUD.
ostiguy
Re:MS ($M) Exchange Mailbox format (Score:2)
MS is going to enterprise style per CPU licenses with its xxxxx 2000 products. Exchange 2k will be pushed heavily at ISPs and especially ASPs. MS will expect companies to scale highly, with x,000's of users per box.
ostiguy
Maildir has significant advantages (Score:2)
1) If your MAILBOX file gets trashed, you're out your entire e-mail directory. If one MAILDIR file gets messed up, you've only lost one e-mail.
2) If you get a messed up e-mail that you can't read in a mail program (and this DOES happen) you only have to delete the corresponding message in the
I know that UofW claims Maildir takes a performance hit, but I've not noticed one. There are all sorts of web resources on tweaking UofW to pump out e-mails faster. I'm currently using Qmail + IMAP-2000 (with the Maildir patch on Qmail's site) on a P100 w/ 32MB of RAM and I've got it pumping out IMAP as fast as my work's commercial server does.
Re:Mbox (Score:2)
A few points:
Tune your file system for what it's used for. Your /home directories (where the mail will be stored by default) should be set to have a relatively large number of inodes because of a tendency toward small files in there.
Read the docs on updatedb -- set the exclusions to include "/home/*/Maildir" if you wish.
Maildir also allows for multiple processes accessing a 'mailbox' because it uses per-file locking on per-message files, not a lock on an entire mbox itself. This allows for situations where 6 people all have the same IMAP shared folders for shared incoming mail (like an accounting office, or tech support) without locking problems for the MUA or IMAP server.
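On the inode point above: a rough way to see how close a filesystem is to running out of inodes is to ask the OS directly. This is only a sketch; the function name and the returned keys are made up for illustration, and os.statvfs is only available on Unix-like systems.

```python
import os


def inode_usage(path):
    """Report inode totals for the filesystem that holds `path`."""
    st = os.statvfs(path)
    total, free = st.f_files, st.f_ffree
    used = total - free
    # Some filesystems report 0 total inodes; avoid dividing by zero.
    percent = (100.0 * used / total) if total else None
    return {"total": total, "free": free, "used": used, "percent_used": percent}
```

Calling inode_usage on the directory that holds your Maildirs before and after a big import gives a feel for how many inodes a one-file-per-message store eats.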
Re:I'm an 'mbox' user... (Score:2)
Use rgrep or GNU grep's -r option to do a recursive search:
grep -ri "slashdot" ~/Maildir/*
Re:Suggest actually reading UW imapd documentation (Score:2)
Does someone want to explain how mbox is better for concurrent access than Maildir? If you do some good coding, they're equal. For Maildir though, you just do read locks on individual files in your Maildir when opening them to present them to the user, and you create new files to write new messages, which doesn't have any effect on (e.g. 25) other processes accessing that Maildir.
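For anyone wondering what "create new files to write new messages" looks like in practice, here is a minimal sketch of a lock-free delivery: write the message under tmp/, then rename it into new/. The function name and the exact unique-name recipe are my own assumptions; real delivery agents differ in detail (some use link/unlink instead of rename).

```python
import itertools
import os
import socket
import time
from pathlib import Path

_counter = itertools.count()  # distinguishes deliveries made by one process


def deliver(maildir, raw_message):
    """Write raw_message (bytes) into maildir/new without taking a lock."""
    maildir = Path(maildir)
    for sub in ("tmp", "new", "cur"):
        (maildir / sub).mkdir(parents=True, exist_ok=True)

    # Unique name built from the time, the pid, a counter and the host name.
    host = socket.gethostname().replace("/", "_").replace(":", "_")
    name = f"{int(time.time())}.{os.getpid()}_{next(_counter)}.{host}"

    tmp_path = maildir / "tmp" / name
    new_path = maildir / "new" / name

    # Mode "xb" fails if the file already exists, so a name clash is
    # detected instead of silently overwriting another message.
    with open(tmp_path, "xb") as fh:
        fh.write(raw_message)
        fh.flush()
        os.fsync(fh.fileno())

    # rename() within one filesystem is atomic: readers see either no
    # file or the complete message, never a half-written one.
    os.rename(tmp_path, new_path)
    return new_path
```

Because every writer works on its own file, any number of deliveries and readers can run at once without stepping on each other.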
Re:mbox _should_ go away forever (Score:2)
Re:The Disputes are probably not technical (Score:2)
On the DJB note, Dan and I have gotten into our flames on his lists, but some of his software ideas are still very good. The fact that he basically doesn't care what anyone else thinks most of the time seems to me to be why he's succeeded in just writing software that goes against the status quo from the ground up. Anyone else would've crumbled at the criticism.
Re:Efficiency, Flames... (Score:2)
This is very true, although I was more concerned with time than space at the current price of a few hundred GB of disk space.
I suggested this, along with providing "--help" options in his programs, as being helpful ... right before the deluge of hate-mail ...
He never did reply to my philosophical point that his famous statement, "profile, don't speculate", is incorrect, since speculation is scientifically required for eventual proofs to happen.
Qmail and Dnscache are still personal favorite pieces of software for servers, although there are many things they could do much better than they do. Luckily, Dan seems to attract a large number of patch-writers and individuals who kindly host useful websites like qmail.org [qmail.org] and djbdns.org [djbdns.org].
Re:Efficiency, Flames... (Score:2)
See you on the lists
Re:The Disputes are probably not technical (Score:2)
Re:There could be some valid reasons (Score:2)
Re:Need better filesystem for maildir (Score:3)
Actually, I believe this is one of the things that ReiserFS excels at.
I have very limited experience with Reiser myself, so perhaps someone else can provide more details, but as I understand it ReiserFS is capable of dealing with thousands of small files extremely efficiently (through the use of tree structures to hold the filesystem). From what I've read, it would be a fairly ideal file system for things like maildir storage.
In fact, now that the 2.4.1 kernel is out, with included stable ReiserFS support, I might just give this a shot. ;-)
-- Toph
Take a look at Cyrus (Score:3)
UW Imap mailbox formats (Score:3)
Is UW Imap free software? If so, someone should feel free to give it maildir, db, sql, or other mailbox support. For some reason I seem to remember that UW Imap was not free software, even though the source is available (some weird academic license hostile to commercial use?). The author is a good programmer and active in the standards process, but can be abrasive to work with.
Re:JWZ and me (Score:3)
Thus
might end up invoking, if you have thousands of files, something like and so on. Using the -i flag to xargs just means it has to create a separate process for each grep, taking a lot of extra time.
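To illustrate the batching idea: xargs gets around the argument limit by running the command several times, each time with as many file names as fit. Below is a rough sketch of that splitting, not xargs' actual implementation; the function name and the 4096-character default are invented for the example.

```python
def batch_args(command, paths, max_chars=4096):
    """Split `paths` into several command lines of at most max_chars each."""
    # Length of the fixed part of the command, counting separating spaces.
    base_len = sum(len(part) + 1 for part in command)
    batches, current, current_len = [], [], base_len
    for path in paths:
        cost = len(path) + 1  # +1 for the separating space
        # Start a new batch when this path would push us over the limit.
        # A single over-long path still gets a batch of its own.
        if current and current_len + cost > max_chars:
            batches.append(command + current)
            current, current_len = [], base_len
        current.append(path)
        current_len += cost
    if current:
        batches.append(command + current)
    return batches
```

Each returned list is one complete command line, so thousands of message files turn into a handful of grep runs rather than one run per file.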
Re:Outlook corporate mailbox (Score:3)
Second, you cite three 'benefits' to Exchange:
Fast: Define fast. The Exchange/Outlook RPC is great over a 100MB network, but try it over a dial-up line, or some line with a high latency. The performance goes right down the crapper, because the protocol is very 'chatty'. The client and server communicate back and forth repeatedly to get a task done. IMAP/POP3 are infinitely better in adverse environments, because their protocols are 'batch' oriented. A couple of commands, and you have data streaming to the client. Another example: over that same high-latency connection, try forwarding a message with an attachment. The attachment has to be uploaded to the server before you can COMPOSE YOUR MESSAGE. On the server side alone, every internet message has to be 'decoded' into MAPI body parts for storage in the database. If it pukes on a body part, it'll crash your information store. The IMAP servers do/can parse the messages based on MIME body parts, but only when necessary. Exchange parses EVERY internet message, and at a lower level than the MIME body parts.
Second, you cite 'scalability'. I ran a 7000 mailbox UofW POP3 server on a dual 166MHz Solaris box with 256MB of RAM. The concurrency was about 25%, and the server ran with a load average of about 1.2. My previous employer is having trouble running 2500 users on a quad PII-450 with 1GB of RAM at 50% concurrency. How is that scalability?
Third, you mention 'workgroup features'. True, Exchange includes a fairly decent calendar service, but this discussion is about e-mail. If you want to talk about workgroup functions, we can do that... (btw, voting is a client function, as is the task management. There is no true 'workflow' in that because there is no central process tracking the work. It's all source-routing/message updates.)
You also said that Qmail is technically correct, but it's not going to do my company's productivity any good. This may be true. But talk to me when your company starts to interact with OTHER companies, and tell me how well Exchange does. Internet software is designed for interoperability, and when you're dealing with other companies, THAT'S what will make your company productive.
As for security, I'll leave that to the rest of these guys. I already like the comment about the 5 days w/out mail due to the I Love You virus.
Try MH+(S)IMAP (Score:3)
1) I want to be able to read mail both from a GUI-based mail prog (Outlook, Eudora, Netscape, whatever) **AND** from a shell
2) I want to be able to access live and "older" mail anytime from (at least) home and work, preferably both my home and work email accounts.
3) I do not want to send any cleartext passwords
What I came up with is the following:
At home I run the UW-IMAP server, and store my incoming mail in MH folders. Stunnel does a fine job of adding SSL support to IMAP.
At work we run Netscape's Mail server which actively supports SIMAP.
Either at home or at work, both servers (and all the mail in all the folders) are available.
Just about the only thing missing is the ability to read my work mail from a shell, but that's where most of the big ugly attachments are, anyway...
Re:it is nice (Score:3)
You have to type your password into the new client--maybe we should store that on the server too?
"What if there was no last session for the client?"
Then everything is RECENT. I realize this loses you a feature, namely that you can't see only those messages in client B that you didn't see in client A. But you don't have that feature now. Why not? Because there is a race condition in the spec: if a message comes in AFTER the last time you check your mail (in client A) but BEFORE you logout (with client A) that message won't be RECENT in client B.
--
MailOne [openone.com]
I'm an 'mbox' user... (Score:4)
I've been using 'mbox' for -- gawd, can I say this? -- fifteen years, and it's served me well. 'mbox's advantages for me are that it is efficient with disk space (you don't eat an inode per message), and that it is quick to search.
9 times out of 10, when I'm searching my mail, typically with 'grep', I'm looking for something in the body, not the headers. With 'maildir', you have to open each message and search it. This is preposterously slow. There is also the danger that the shell's wildcard expansion limits may be exceeded if you have a lot of messages. With 'mbox', 'grep' opens the one file and slurps through it quickly.
Remote synchronization is not an issue for me. All my email resides on my laptop, which follows me everywhere.
However, I'm hip to 'maildir's increased reliability. I have over 2000 messages in my outgoing box alone, and I'd hate to have a system hiccup destroy any of it. If I could search the bodies of a 'maildir' spool as quickly as an 'mbox' spool, I could be convinced to switch.
Schwab
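For what it's worth, a body search over a maildir does not have to go through the shell's wildcard expansion at all. A minimal sketch, assuming the usual new/ and cur/ subdirectories (the function name is made up); it makes no claim about being as fast as grep on a single mbox file.

```python
import os


def search_maildir(maildir, needle):
    """Return paths of messages whose raw bytes contain needle (case-insensitive)."""
    needle = needle.lower().encode("utf-8")
    hits = []
    for sub in ("new", "cur"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        # scandir() streams directory entries one at a time, so there is no
        # argument list to overflow however many messages the folder holds.
        with os.scandir(folder) as entries:
            for entry in entries:
                if not entry.is_file():
                    continue
                with open(entry.path, "rb") as fh:
                    if needle in fh.read().lower():
                        hits.append(entry.path)
    return sorted(hits)
```

Like grep, this matches the raw bytes on disk, so text inside base64 or quoted-printable encoded parts will not be found.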
Why I don't use mbox (Score:4)
Our solution was moving to qmail and using Maildir mailboxes for our users. We never saw the problem again.
Recently, I've switched to courier mail server (http://www.courier-mta.org/) on all my non-production machines to evaluate it. I'm really, really happy with it. Courier is a complete mail system, not just an IMAP server, so you might take a look at the whole package. The whole thing is RFC compliant, which causes trouble for software that isn't, but that's a fault in the other software.
As a final rant against UW-IMAP: I hate it. It loads the whole damn mailbox being checked into memory (regardless of the type), which creates a huge load every time someone with a large mailbox checks their mail. This problem affects the POP3 server as well, since that also uses the c-client code.
Qmail also supports mbox (Score:4)
That's just plain wrong. Qmail supports both maildir and mbox. I've been using qmail with only mbox files for years...
Re:Exchange Mailbox format (Score:4)
(oh, plus Win2000)...
(oh, plus a machine with at LEAST 256-512MB RAM)...
(oh, plus a backup solution to backup the DB live)...
(oh, plus some sort of a firewall/gateway... you wouldn't want this DIRECTLY on the 'NET..!)
Re:Enterprise-grade messaging for Linux/Unix (Score:4)
Re:My mailbox (Score:4)
My mailbox (Score:4)
-No encryption techniques necessary
-rarely have to waste time with forwarded jokes
-Best of all, the spam it collects is occasionally useful (I know all the pizza deals available in town).
A word of advice (Score:4)
--
MailOne [openone.com]
Cyrus Rocks (Score:5)
JWZ and me (Score:5)
has a number of essays about mail on Unix systems, including problems with mail box formats.
I use Xemacs/Gnus/nnml so all my mail is stored as individual files, which is handy (as other posters have said) and has its downsides, as they have said too (grep now bitches if passed all files in my main mail box). Still, I like it, best system I've used. Not so great for the multiple hosts thing though.
Or you could run your mail and xemacs on one machine, and either read your mail in a terminal, or open X windows on your local display. Look up gnuserve to do that, I think.
Enterprise-grade messaging for Linux/Unix (Score:5)
Fortunately, a solution to this problem is being developed right now. The Citadel/UX project [citadel.org] is developing a robust communications server that will compete with products like OpenMail, Groupwise, and Exchange. SMTP and POP3 are already in place; IMAP will be available by the end of the year. Web-based access works as well. After that's done we'll be writing plug-ins for both Evolution and Outlook, in order to facilitate all of the 'shiny things' working as well: calendars, address books, etc.
So, you might ask, what mailbox format does it use? None of the above. Messages are stored in a database, like they should be. The Berkeley DB [sleepycat.com] package from Sleepycat Software (yes, it's open source) is used for robust back-end storage, including transaction and logging support.
I'd encourage any developers who are looking for the open source world's "Exchange Killer" to get involved in this project.
--
A few thoughts on message storage (Score:5)
This makes most mail messages poor choices for database storage (for example, you want to be able to use "grep" on mail or compress in-place). Headers, on the other hand, are a major win in a database ("select messageid from headers where user = 'me' and date > yesterday and fromaddr = 'taco@slashdot.org'" should be fast even if I have tens of thousands of messages).
The easy solution is to keep the headers in the database, and then just keep maildirs with the original messages in the normal filesystem, with the filenames in the database alongside the headers (something like message.headerid => headers.id, where message.text is a path to the maildir entry for this message).
This combines the best of both worlds. This also means that while it's easy to corrupt your database with a single bug in your code, you can always re-build it from the on-disk messages.
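A rough sketch of that split, using SQLite purely as a stand-in database: headers go into a table, each row keeps the path of the message file, and the whole index can be thrown away and rebuilt from disk. The table layout, column names and function names here are assumptions for illustration, not anything from a real mail server.

```python
import os
import sqlite3
from email import policy
from email.parser import BytesParser
from email.utils import parseaddr, parsedate_to_datetime

SCHEMA = """
CREATE TABLE IF NOT EXISTS headers (
    id       INTEGER PRIMARY KEY,
    path     TEXT UNIQUE NOT NULL,   -- where the full message lives on disk
    fromaddr TEXT,
    subject  TEXT,
    date     REAL                    -- seconds since the epoch, or NULL
)
"""


def index_maildir(conn, maildir):
    """(Re)build the header index from the message files on disk."""
    conn.execute(SCHEMA)
    conn.execute("DELETE FROM headers")
    parser = BytesParser(policy=policy.default)
    for sub in ("new", "cur"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                # Only the headers are parsed; the body stays on disk.
                msg = parser.parse(fh, headersonly=True)
            try:
                when = parsedate_to_datetime(str(msg.get("Date", ""))).timestamp()
            except (TypeError, ValueError):
                when = None  # missing or unparseable Date header
            conn.execute(
                "INSERT INTO headers (path, fromaddr, subject, date) "
                "VALUES (?, ?, ?, ?)",
                (path, parseaddr(str(msg.get("From", "")))[1],
                 str(msg.get("Subject", "")), when),
            )
    conn.commit()


def paths_from(conn, fromaddr, since=0.0):
    """Paths of messages from `fromaddr` dated after `since` (epoch seconds)."""
    rows = conn.execute(
        "SELECT path FROM headers WHERE fromaddr = ? AND date > ? ORDER BY date",
        (fromaddr, since),
    )
    return [row[0] for row in rows]
```

If the database is ever corrupted, calling index_maildir again recreates it from the files, which is the recovery property described above.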
Maildir is WAY better (Score:5)
1) It is more reliable over NFS. Maildir is designed to not need file-level locking, which sucks over NFS.
2) Maildir is more resistant to catastrophic corruption, since each email is a separate file.
3) Maildir keeps metadata about the email in the email's filename, rather than in a separate index file. This helps prevent metadata such as "replied-to" and "forwarded this" from getting out of sync.
4) Filesystem-level tools work well with maildir. You don't need special "formail"-type tools to work with them; bash scripting is capable of doing it all by itself.
5) Maildir is better positioned to take advantage of advanced new filesystems like ReiserFS. When ReiserFS has a plugin for file-level transparent compression, maildir will be able to selectively and invisibly compress emails on disk without requiring other programs/scripts to decompress them before use.
Study maildir, it's just plain better.
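To make the filename-metadata point concrete: in the maildir format the flags sit after ":2," at the end of the file name, one letter each (R for replied, P for passed/forwarded, S for seen, and so on). A small sketch of reading and adding them; the function names and the sample file names are made up.

```python
# Flag letters defined by the maildir format.
FLAG_NAMES = {
    "D": "draft",
    "F": "flagged",
    "P": "passed",    # forwarded, resent or bounced
    "R": "replied",
    "S": "seen",
    "T": "trashed",
}


def parse_flags(filename):
    """Return the set of flag names encoded in a maildir file name."""
    _base, sep, info = filename.partition(":2,")
    if not sep:
        return set()  # no info section, e.g. a message still in new/
    return {FLAG_NAMES[c] for c in info if c in FLAG_NAMES}


def with_flag(filename, flag):
    """Return the file name with one more flag letter, kept in ASCII order."""
    base, _sep, info = filename.partition(":2,")
    letters = sorted(set(info) | {flag})
    return f"{base}:2,{''.join(letters)}"
```

Marking a message as replied is then just a rename of the file to with_flag(name, "R"), so the flag can never drift away from the message it describes.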