How Do You Store and Reconcile Email Archives?
heyitsjustme wants to know how you deal with old email. "I delete most of what I get but keep the stuff from friends and relations as an archive. Unfortunately I have these email archives from the late 80's through today in the form of Macintosh, Linux and Windows mailboxes, including AOL 1.0 mailboxes. What does everyone use to archive email across multiple platforms and non-standard mailbox formats? Is there an easy solution out there? Does anyone archive IM?"
Cyrus Imap... (Score:2, Interesting)
Unix mail format (Score:3, Interesting)
Re:Italian school of driving (Score:4, Interesting)
Kinda Sorta OT (Score:5, Interesting)
Along these lines, is there an OSS package that can read the varied formats the Submitter is referring to, tag and drop them in a DB with a nice, friendly, web-enabled (secure) front-end for searching?
My former employer kept *all* of his email from the last 20 years in tar.gz files. Let's just say it wasn't easy to find an email from, er, 15 years ago.
Is there a package that can read the mbox, the other box-formats, plain text, pull from pop, old tar.gz bundles, categorize (sorta), tag and make such things searchable?
Totally a shot in the dark here; I'm not a mail guy at all.
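For the mbox part of that question, a minimal sketch using Python's standard `mailbox` module might look like the following. The function name `search_mbox` and the choice to match only on sender and subject are assumptions made for illustration; it does not handle the AOL, POP or tar.gz cases mentioned above.

```python
import mailbox


def search_mbox(path, term):
    """Return (date, sender, subject) for messages whose sender or
    subject contains `term`, case-insensitively."""
    term = term.lower()
    hits = []
    # create=False: fail loudly instead of creating an empty archive.
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            sender = str(msg.get("From", ""))
            subject = str(msg.get("Subject", ""))
            if term in sender.lower() or term in subject.lower():
                hits.append((str(msg.get("Date", "")), sender, subject))
    finally:
        box.close()
    return hits
```

Running it over every file in a directory of archives and printing the hits is roughly a header-aware grep; anything smarter (tagging, a web front end) would have to be built on top.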
It is the "drink" that makes me wonder, sorry
Re:One Word (Score:2, Interesting)
I started running my own IMAP server on an old machine a year or so ago - and synced all my old mail archives to various folders.
My mailserver also solves another problem - multiple POP accounts. I have my IMAP server set up so that each one of my POP accounts gets automatically tagged and sent to its own folder.
A third common problem this solves is having multiple machines. Now my desktop's email client is always synced with my laptop's email client. Before, I ran into problems whenever I traveled and fetched my email from the road.
Archive what? (Score:3, Interesting)
Sure, HDD space is cheap; but I tend to equate people who archive every single form of written communication with those who have Obsessive Compulsive Disorder, in that they hoard everything in sight: newspapers, snail mail, magazines, boxes, etc.
Commit to memory and destroy the evidence. That's my way of handling archives.
Re:One Word (Score:5, Interesting)
Absolutely. I use no fewer than two mail clients on two different machines on any given business day. Every email I've sent since 1995 or something like that, and received since 1998 is available and searchable. Over this time, I've accessed this archive with the following clients:
* pine (lots of pine)
* mac mail
* thunderbird
* various netscapes/mozillas
* ML (some random IMAP reader)
* My phone (my old Sony Ericsson speaks IMAP)
* My palm (two different apps)
* python
* a java webmail system I wrote
* three or four other webmail systems
* mutt
Archiving tool: ForKeeps (Score:3, Interesting)
http://www.fkeeps.com/whofor.htm
It's a bit of an old program and the interface is clunky, but it works reasonably well once you work through it.
Re:Unix mail format (Score:3, Interesting)
Since mbox is a pretty standard format, many tools have a built-in import routine, or at least there will be an existing third-party tool to handle any conversions. Failing that, it's fairly trivial to cobble together a one-off conversion tool using a scripting language, or even to batch remail each message one at a time if your new email client uses some undocumented storage format or is an online service like Gmail.
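As one example of such a one-off tool, here is a rough sketch that copies an mbox file into a Maildir using Python's standard `mailbox` module. Maildir is just an assumed target picked for illustration, and the function name `mbox_to_maildir` is made up; a different destination format would need a different writer.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir.

    Returns the number of messages copied. The source file is only read.
    """
    src = mailbox.mbox(mbox_path, create=False)
    # create=True makes the Maildir (with tmp/new/cur) if it is missing.
    dst = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for msg in src:
            dst.add(msg)  # mbox status flags are mapped to Maildir flags
            copied += 1
    finally:
        dst.close()
        src.close()
    return copied
```

Comparing the returned count against the number of messages in the original archive is a cheap sanity check before deleting anything.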
Re:email archive (Score:1, Interesting)
Maybe other companies do it, but until there is proof you can't slander them. Microsoft does it, so they're fair game.
Re:One word (Score:0, Interesting)
Thanks, but I'll pass.
Practical research applications (Score:2, Interesting)
If anyone has an idea of an open-source application similar to what the submitter is looking for, it would help my research quite a bit. There are practical research applications in this stuff, if someone's interested in making it.
CSV (Score:2, Interesting)
Admittedly, sometimes the column names didn't match up ("Sender" v "From"), etc., but for the most part that's how I did it. I also made an effort to keep the number of email accounts that I had to a minimum. At this point in time, most everything is stored in the form of
I also made an effort to keep my email accounts to a minimum, which probably made this entire process significantly easier, and when I did close an account (like when I finished work at a company), I exported the emails from there and kept them in
As far as indexing works - I have them stored in 6 month segments (Jan97-Jun97, Jul97-Dec97,
I do archive IMs - Trillian [trillian.cc] worries about it for me.
Hope this helps.
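For anyone trying the CSV route described above, the two fiddly bits (mismatched column names and the six-month file naming) can be sketched in a few lines of Python. The alias table is a guess at what different clients export, not a known list, and both function names are made up for illustration; only the "Jan97-Jun97" label style comes from the comment.

```python
from datetime import date

# Hypothetical aliases: map whatever a client calls a column to one name.
COLUMN_ALIASES = {
    "sender": "From",
    "from": "From",
    "recipient": "To",
    "to": "To",
    "subject": "Subject",
    "date": "Date",
}


def normalize_row(row):
    """Rename known column aliases in one CSV row (a dict); keep the rest."""
    return {COLUMN_ALIASES.get(k.strip().lower(), k): v for k, v in row.items()}


def half_year_segment(d):
    """Return a label such as 'Jan97-Jun97' or 'Jul97-Dec97' for a date."""
    yy = f"{d.year % 100:02d}"
    return f"Jan{yy}-Jun{yy}" if d.month <= 6 else f"Jul{yy}-Dec{yy}"
```

Rows read with `csv.DictReader` can be passed through `normalize_row` and then appended to whichever file `half_year_segment` names.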
Email archiving and tools (Score:2, Interesting)
I keep the emails in mailbox format (that is, in plain text as it is stored in most UNIX systems), in several files. The reason I do that is that most email readers (MUAs) can read mailbox format. I keep them in several files to make it more manageable.
The tools that I use to manipulate emails are mostly "from", "procmail", "grep", and "less". There used to be tools from the "elm" era (still remember them?), such as "frm" (which is better than "from"), "reademail" (to read an individual email, given its number in the archive), and "deletemail" (which can delete an individual email in the archive). Too bad these tools are gone. At one point I slapped a simple Tk interface on as a front end to those tools, but it didn't scale well.
At one point I did experiment with storing emails in individual files, but the tools to manipulate them are limited. I used MH.
The next experiment I did was to take all those email headers and put them in a database. (I used mSQL, which was popular at that time.) Then I had a Java applet and a Perl script to make queries to the database (and actually did an analysis of my reading habits). The actual emails were stored as plain text files, each email in an individual file. Basically, the original email was untouched. I got bored and never continued the project.
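The headers-in-a-database idea is easy to try again today. Below is a minimal sketch that uses SQLite in place of mSQL and reads from an mbox file rather than one file per message; both substitutions, the table layout and the function name `index_headers` are assumptions for illustration, not a reconstruction of the original project.

```python
import mailbox
import sqlite3


def index_headers(mbox_path, db_path=":memory:"):
    """Store From/Subject/Date of each message in a SQLite table.

    The mbox itself is left untouched; `position` is the message's key
    in the archive, so the full text can be fetched again later.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS headers ("
        "archive TEXT, position INTEGER, sender TEXT, subject TEXT, date TEXT)"
    )
    box = mailbox.mbox(mbox_path, create=False)
    try:
        rows = [
            (str(mbox_path), key,
             str(msg.get("From", "")),
             str(msg.get("Subject", "")),
             str(msg.get("Date", "")))
            for key, msg in box.iteritems()
        ]
    finally:
        box.close()
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO headers VALUES (?, ?, ?, ?, ?)", rows)
    return conn
```

A query such as `SELECT sender, COUNT(*) FROM headers GROUP BY sender` then gives the kind of habit analysis mentioned above.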
Now ... I am still searching for the perfect email tools.
OE to mbox to html (Score:2, Interesting)
I'm surprised no one has tried this before. It's a good low-tech solution for people who require information in a hurry and is more immediate than a flat file.
Re:Disk space is cheap. Why bother deleting? (Score:3, Interesting)
Outlook + IMAP is the way I do it. You can drag messages between local storage and your mail server.
Ink (Score:2, Interesting)
Me? I obsessively reinstall my operating system and reimport old mailboxes into my mail client, so I have a dozen copies of 5-year-old email, ten copies of 4-year-old email, 8 copies of 3-year-old email, etc. No need for backups... plus when I search my computer for old email, I get a dozen copies of what I'm looking for!
ex post facto Law (Score:3, Interesting)
Re:Dave's top ten (Score:1, Interesting)
Backup mail archives along with a Linux Live CD... (Score:2, Interesting)
* Create a multi-session CD/DVD with your favorite Linux LiveCD distro
(or roll your own [linux-live.org] and create an ISO for future use)
and
* Back up email files to said CD/DVD
(I suggest a set of re-writable media of good quality to play it safe.)
Further suggestions:
1. It would be advisable to split your archives (e.g. Mail2004, etc.), especially if you plan to retain a sizeable amount of mail.
2. Convert archives from older mail clients before creating the backup, or use a newer mail client that can read the old files with ease.
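If the archives are already in mbox format, the per-year split in suggestion 1 could be sketched roughly as below with Python's standard library. The ".mbox" extension, the "MailUndated" bucket for messages without a parseable Date header, and the function name `split_by_year` are assumptions for illustration; only the "Mail2004" naming comes from the suggestion.

```python
import mailbox
import os
from email.utils import parsedate_to_datetime


def split_by_year(mbox_path, out_dir):
    """Copy messages into out_dir/Mail<year>.mbox based on the Date header.

    Returns the sorted list of archive names written. The source is not
    modified.
    """
    os.makedirs(out_dir, exist_ok=True)
    src = mailbox.mbox(mbox_path, create=False)
    outputs = {}
    try:
        for msg in src:
            try:
                year = parsedate_to_datetime(str(msg.get("Date", ""))).year
                name = f"Mail{year}"
            except (TypeError, ValueError):
                # Missing or malformed Date header.
                name = "MailUndated"
            if name not in outputs:
                target = os.path.join(out_dir, name + ".mbox")
                outputs[name] = mailbox.mbox(target, create=True)
            outputs[name].add(msg)
    finally:
        for box in outputs.values():
            box.close()  # close() flushes pending writes
        src.close()
    return sorted(outputs)
```

Each resulting file can then be burned to its own session on the disc.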
Good luck!