How Do You Store and Reconcile Email Archives?
Posted by timothy on Sat Mar 12, 2005 06:00 PM
from the translate-them-to-quipu dept.
heyitsjustme wants to know how you deal with old email. "I delete most of what I get but keep the stuff from friends and relations as an archive. Unfortunately I have these email archives from the late 80's through today in the form of Macintosh, Linux and Windows mailboxes, including AOL 1.0 mailboxes. What does everyone use to archive email across multiple platforms and non-standard mailbox formats? Is there an easy solution out there? Does anyone archive IM?"
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Italian school of driving (Score:4, Funny)
Re:Italian school of driving (Score:4, Interesting)
Here's what I do... (Score:5, Funny)
Re:Here's what I do... (Score:5, Funny)
One day... someone... somewhere is going to invent some sort of mechanism for removing text you've already typed. It shall be called "back-one-space" and will remove the letter before it.
If this is impossible, surely they can keep a way of having all our text auto-submitted!
Re:Here's what I do... (Score:3, Funny)
rm -fR /var/spool/mail/* (Score:3, Funny)
and you are done!
Disk space is cheap. Why bother deleting? (Score:5, Insightful)
Thunderbird is able to import all my old mail archives (from years and years of Eudora) and search it effectively. If I were inclined to export all my archives from my Mac to my Windows machine, I could use Google Desktop Search to really search through it all.
Re:Disk space is cheap. Why bother deleting? (Score:5, Insightful)
Because if you delete early and often, you've committed no crime. If you wait to delete it until someone (feds, cops, *IAA, UN-black-helicopter troopers, whoever) demands you turn it over to them, you're screwed.
After all, you break laws too (everybody does, they are written that way). You just haven't been caught yet. (I know this because if you had, you wouldn't have all your email archived!)
Re:Disk space is cheap. Why bother deleting? (Score:3, Informative)
After all, you break laws too (everybody does, they are written that way). You just haven't been caught yet.
Instead of deleting all your e-mails "early and often" why not just delete the ones that have illegal activity in them? Or better yet, don't conduct illegal activity via e-mails. Those ar
Re:Disk space is cheap. Why bother deleting? (Score:3, Insightful)
For the majority of slashdotters this wouldn't be a problem, as I'm pretty sure American laws can't be retroactive. If I ate a chicken today and e-mailed someone saying I ate a chicken, and tomorrow it became illegal to eat chicken, I can proclaim to the world "I ate a chicken on the 12th of March" and I won't be able to be charged with any crimes.
However given your choice of words (regime change) I fig
Dave's top ten (Score:5, Funny)
9. Disks fill up, no matter how cheap they are. Low cost doesn't excuse gluttony.
8. Backups take forever.
7. Restores take an eternity, especially if you're not confident.
6. Mail client gets slower and slower.
5. Searches take too long.
4. Mail clients make mistakes, especially on big stores. See #7
3. Your CYA evidence may be used against you.
2. A mail store is not a file system and SMTP is not a file transfer protocol.
And the number one reason to delete your old email...
1. IT'S ALL A BUNCH OF USELESS CRAP JUST AS IT WAS WHEN YOU FIRST RECEIVED IT!!
Re:Disk space is cheap. Why bother deleting? (Score:3, Informative)
If you do this religiously, you will only ever have to worry about your current mail format, and how you're going to upgrade it all
Re:Disk space is cheap. Why bother deleting? (Score:3, Interesting)
Outlook + IMAP is the way I do it. You can drag messages between local storage and your mail server.
I work for Microsoft (Score:5, Funny)
I'm afraid... (Score:3, Insightful)
email archive (Score:4, Funny)
You must work for Microsoft
Unix mail format (Score:3, Interesting)
Re:Unix mail format (Score:3, Interesting)
Since mbox is
PDF (Score:4, Insightful)
Spotlight and Tiger (Score:5, Insightful)
One Word (Score:5, Insightful)
Re:One Word (Score:5, Interesting)
Absolutely. I use no fewer than two mail clients on two different machines on any given business day. Every email I've sent since 1995 or something like that, and received since 1998 is available and searchable. Over this time, I've accessed this archive with the following clients:
* pine (lots of pine)
* mac mail
* thunderbird
* various netscapes/mozillas
* ML (some random IMAP reader)
* My phone (my old Sony Ericsson speaks IMAP)
* My palm (two different apps)
* python
* a java webmail system I wrote
* three or four other webmail systems
* mutt
Outport & recursive IMAP folder creation (Score:4, Informative)
Also, does anyone know of a client program that will recursively create folders on an IMAP server? (Maybe it's a server issue, in which case, what server?)
I had gotten over translating my years of Outlook email into something more universally readable, but I have so many nested folders that the inability to have the client recursively create IMAP folders is an issue. Suggestions?
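For anyone who would rather script the recursive part than wait for a client to do it, here is a rough Python sketch using the standard imaplib module. The helper names (folder_chain, create_nested) are made up for illustration, and the "/" hierarchy delimiter is an assumption: servers differ, and the real delimiter can be read from the server's LIST response.

```python
import imaplib


def folder_chain(path, delimiter="/"):
    """Return every ancestor of `path` plus `path` itself, shallowest first."""
    parts = [p for p in path.split(delimiter) if p]
    return [delimiter.join(parts[: i + 1]) for i in range(len(parts))]


def quote(name):
    """Wrap a mailbox name in an IMAP quoted string so spaces survive."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def create_nested(conn, paths, delimiter="/"):
    """Create each folder and all of its parents.

    `conn` is assumed to be an already logged-in imaplib.IMAP4 or
    imaplib.IMAP4_SSL object (anything with a create() method works).
    """
    seen = set()
    for path in paths:
        for name in folder_chain(path, delimiter):
            if name not in seen:
                seen.add(name)
                # The server answers NO if the folder already exists;
                # that response is simply ignored in this sketch.
                conn.create(quote(name))
    return sorted(seen)
```

Usage would look something like create_nested(imaplib.IMAP4_SSL("mail.example.org"), ["Work/Projects/2004"]) after a login() call; the host name here is a placeholder.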
Re:One Word (Score:3, Informative)
I used the UW IMAP server, which is a little easier to set up than the Cyrus one.
The UW IMAPd keeps its folders in mbox format, so it's a great tool for converting oddly formatted mail.
Moving email is pretty easy -- it's harder to move calendar entries, address books, notes, and the other sorts of data that end up in a program like Outlook. I think the easiest way to do it wo
It's simple: plain text (Score:5, Insightful)
I burn it to CD-Rs that I know won't get moved around or scratched. They stand a good chance of lasting the rest of my life.
Re:It's simple: plain text (Score:3, Informative)
No! Check those backups! I have lost data stored on CD-Rs (luckily I had copies), and many of my discs have started to turn yellow after about 2 years! Also, you can sometimes see these little spots of discolouration on the CDs, which makes me think there's a fungus of some sort that's eating them.
The lifespan of CD-Rs is unknown at this point. Don't trust them for more than
Upon Searching.. (Score:5, Informative)
More utilities than I want to bother with, but hopefully they'll have the converter(s) you need.
Good Luck!
Your favorite online storage (Score:5, Informative)
This might be useful, if they don't collapse under /.
Don't change e-mail clients (Score:4, Informative)
What I do at year end is move all of that year's messages to a new folder and reset my filters so that the new year's messages go into a new set of folders.
Periodically I just copy off previous years' messages to CD.
At least a few times I have been able to go back a couple of years and find information that I lacked.
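If the mail lives in a single mbox file instead of per-year client folders, the same year-end split can be roughed out with Python's standard mailbox module. This is only a sketch: split_by_year and the "<prefix>-<year>" naming are invented for illustration, and it trusts each message's Date header, which old or mangled mail may not have (those land in an "undated" file).

```python
import mailbox
from email.utils import parsedate_to_datetime


def message_year(msg):
    """Return the year from the Date header, or None if missing or unparsable."""
    try:
        return parsedate_to_datetime(msg["Date"]).year
    except (TypeError, ValueError):
        return None


def split_by_year(src_path, dest_prefix):
    """Copy each message in src_path into '<dest_prefix>-<year>' mbox files.

    The source mailbox is only read, never modified.
    Returns a dict mapping year (or 'undated') to message count.
    """
    counts = {}
    outputs = {}
    src = mailbox.mbox(src_path, create=False)
    try:
        for msg in src:
            key = message_year(msg) or "undated"
            if key not in outputs:
                outputs[key] = mailbox.mbox(f"{dest_prefix}-{key}")
            outputs[key].add(msg)
            counts[key] = counts.get(key, 0) + 1
    finally:
        for box in outputs.values():
            box.flush()
            box.close()
        src.close()
    return counts
```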
Kinda Sorta OT (Score:5, Interesting)
Along these lines, is there an OSS package that can read the varied formats the Submitter is referring to, tag and drop them in a DB with a nice, friendly, web-enabled (secure) front-end for searching?
My former employer kept *all* of his email from the last 20 years in tar.gz files. Let's just say it wasn't easy to find an email from, er, 15 years ago.
Is there a package that can read the mbox, the other box-formats, plain text, pull from pop, old tar.gz bundles, categorize (sorta), tag and make such things searchable?
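Not a package, but the core of that idea (read mailboxes, drop them in a DB, search) fits in a short Python sketch using only the standard library. Everything here is an assumption for illustration: the function names, the one-table schema, and the plain LIKE search. It only reads mbox files; the other formats, the tagging, and the web front-end are left out.

```python
import mailbox
import sqlite3


def body_text(msg):
    """Join the text/plain parts of a message, decoding leniently."""
    chunks = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        raw = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(raw.decode(charset, errors="replace"))
        except LookupError:  # charset name Python does not recognise
            chunks.append(raw.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def build_index(mbox_paths, db_path=":memory:"):
    """Load every message from the given mbox files into one SQLite table."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS mail "
        "(source TEXT, sender TEXT, subject TEXT, date TEXT, body TEXT)"
    )
    for path in mbox_paths:
        box = mailbox.mbox(path, create=False)
        try:
            for msg in box:
                db.execute(
                    "INSERT INTO mail VALUES (?, ?, ?, ?, ?)",
                    (
                        path,
                        str(msg["From"] or ""),
                        str(msg["Subject"] or ""),
                        str(msg["Date"] or ""),
                        body_text(msg),
                    ),
                )
        finally:
            box.close()
    db.commit()
    return db


def search(db, term):
    """Substring search over subject and body (case-insensitive for ASCII)."""
    pattern = f"%{term}%"
    return db.execute(
        "SELECT date, sender, subject FROM mail "
        "WHERE subject LIKE ? OR body LIKE ? ORDER BY rowid",
        (pattern, pattern),
    ).fetchall()
```

A LIKE scan is slow on a 20-year archive; a full-text index would be the obvious next step, but that depends on how the local SQLite was built.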
Totally a shot in the dark here, I'm not a mail guy at all
It is the "drink" that makes me wonder, sorry
Convert to MBOX format (Score:5, Insightful)
Almost every email client around can import and export mbox formats. Getting your email in a format that is going to be readable in 20 years is the first step, otherwise why bother?
If worse comes to worst, mbox is readable as plain text.
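As one concrete case, Python's standard mailbox module can do the conversion when the source is a Maildir. This is a minimal sketch, not a general converter: maildir_to_mbox is a made-up name, the Maildir source is an assumption, and proprietary formats (AOL, Outlook) would need some other tool to get them into a format the module reads first.

```python
import mailbox


def maildir_to_mbox(maildir_path, mbox_path):
    """Append every message in a Maildir to an mbox file; return the count.

    Neither mailbox is locked here, so run this on a copy or on an
    archive that no mail client or delivery agent is writing to.
    """
    src = mailbox.Maildir(maildir_path, create=False)
    dest = mailbox.mbox(mbox_path)
    copied = 0
    try:
        for msg in src:
            # mboxMessage() carries over read/replied flags where it can.
            dest.add(mailbox.mboxMessage(msg))
            copied += 1
        dest.flush()
    finally:
        dest.close()
        src.close()
    return copied
```

The result is one text file in which each message starts with a "From " line, so it can be checked with a pager or any text editor.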
Or use maildir (Score:5, Informative)
Personally, since 1999, I've been using a combination of maildir and procmail to archive and save my mail. Every message that comes in, goes to a folder called
One thing's sure (Score:5, Funny)
mbox or maildir (Score:3)
Archive what? (Score:3, Interesting)
Sure, HDD space is cheap; but I tend to equate people who archive every single form of written communication with those who have Obsessive Compulsive Disorder, in that they hoard everything in sight: newspapers, snail mail, magazines, boxes, etc.
Commit to memory and destroy the evidence. That's my way of handling archives.
Re:Archive what? (Score:4, Funny)
formail, mairix, and mutt (Score:4, Informative)
Use mairix [force9.co.uk] to search through email.
mutt [mutt.org] is the best mail client ever.
-rsw
Archiving tool: ForKeeps (Score:3, Interesting)
http://www.fkeeps.com/whofor.htm
It's a bit of an old program and the interface is clunky, but it works reasonably well once you work through it.
Delete it (Score:5, Insightful)
Re:Delete it (Score:3, Insightful)
Good point, though.
How I do it (Score:5, Informative)
I use grepmail [sourceforge.net] to find old emails that I might need. Grepmail lets you use perl regular expressions to find messages and then outputs the entire message where a match was found. You can use grepm [barsnick.net] to open grepmail matches as a mailbox in mutt. grepine [www.dfki.de] does the same for Pine, which I use.
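For anyone without grepmail installed, the basic idea (regex in, whole matching messages out) can be approximated in Python. This is a sketch of the idea only, not a reimplementation of grepmail: grep_mbox is a made-up name, the pattern is a Python regex rather than a Perl one, and it matches against the raw message text, so base64 or quoted-printable bodies will not match.

```python
import mailbox
import re


def grep_mbox(mbox_path, pattern, flags=re.IGNORECASE):
    """Return the messages in an mbox whose raw text matches a Python regex."""
    regex = re.compile(pattern, flags)
    box = mailbox.mbox(mbox_path, create=False)
    try:
        # as_string() gives headers plus body, so header matches count too.
        return [msg for msg in box if regex.search(msg.as_string())]
    finally:
        box.close()
```

Each result is a full message object, so the hits can be printed with as_string() or added to a fresh mailbox.mbox to open in a mail client.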
At the end of each year I clean the spam out of my archives using a procmail recipe and spamassassin. This recipe marks messages as deleted in the mailbox. I open these in pine, sort by deleted, and double check them. Once I'm sure they're all spam, I delete them:
The special spamassassin config turns off bayesian filtering and sets the threshold high:
The rest of the spam I clean out by hand.
Re:How I do it (Score:4, Informative)
Put this in ~/bin/rotate-sent-mail.sh:
Now add the following to your crontab:
0 0 1 * * $HOME/bin/rotate-sent-mail.sh
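The script body itself is missing from this copy of the thread, so what follows is NOT the poster's rotate-sent-mail.sh. It is a hypothetical Python stand-in for what a monthly rotation run from that crontab line (midnight on the 1st) might do: rename the sent-mail file after the month that just ended. The file name, the naming scheme, and the function names are all assumptions.

```python
import datetime
import os


def rotated_name(path, today):
    """Name the archive after the month that just ended, e.g. sent-mail-2005-02."""
    last_month = today.replace(day=1) - datetime.timedelta(days=1)
    return f"{path}-{last_month:%Y-%m}"


def rotate(path, today=None):
    """Rename `path` to its dated archive name.

    Returns the new name, or None if there was nothing to rotate.
    Refuses to overwrite an archive that already exists.
    """
    today = today or datetime.date.today()
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        return None
    target = rotated_name(path, today)
    if os.path.exists(target):
        raise FileExistsError(target)
    os.rename(path, target)
    return target
```

The mail client is assumed to recreate the sent-mail file the next time a message is sent; whether that holds depends on the client.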
Mandatory ZOË plug (Score:3, Informative)
ZOË is a sort of an archiving proxy that sits between your mail client and your mail server. It stores and indexes everything, so you can pop open a browser window and do a search on anything you've ever sent or received. Naturally, this was created before gmail [google.com].
With ZOË you don't need to worry about those pesky email folders and waiting for long searches.
Naturally, spam filtering before ZOË is a good idea.
My solution (Score:5, Funny)
save all ingoing and outgoing in YYYY-MM files... (Score:3, Informative)
I had multiple folders, sorted by people/project. I got in a complete mess and finally snapped when I spent half an hour looking for a simple message.
Use procmail to write all incoming messages to 'all-mail-YYYY-MM' and use Mutt hooks to write out to the same directory.
At the end of the year, cat them together and make 'all-mail-YYYY'. Accessing and reading this mailbox can be done with 'mutt -R -f all-mail-YYYY' as this opens read-only. Use 'l' to do 'limit' searches and use ~t, ~f, and ~b in AND combinations to limit on To: From: and body of messages. It's lovely only having to look in one place!
Procmail:
INCOMING=all-mail-`date +%Y-%m`
# now I want to keep a copy of EVERYTHING in a dated directory
:0 c:
$INCOMING
Muttrc:
set record="+all-mail-`date +%Y-%m`"
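The year-end "cat them together" step can also be scripted. Here is a rough Python equivalent, assuming the monthly files are plain mbox files named exactly as above and all sitting in one directory; monthly_name and merge_year are made-up helper names.

```python
import glob
import os
import shutil
import time


def monthly_name(when=None):
    """Build the same name the procmail and mutt settings use, e.g. all-mail-2005-03."""
    return time.strftime("all-mail-%Y-%m", when or time.localtime())


def merge_year(directory, year):
    """Concatenate all-mail-YYYY-MM files into all-mail-YYYY, in month order.

    Returns the list of monthly files that were merged. The yearly file is
    opened with mode 'xb', so an existing archive is never overwritten.
    """
    parts = sorted(glob.glob(os.path.join(directory, f"all-mail-{year}-[0-1][0-9]")))
    if not parts:
        return []
    target = os.path.join(directory, f"all-mail-{year}")
    with open(target, "xb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return parts
```

The monthly files are left in place, so they can be deleted by hand once the yearly mailbox opens cleanly.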
Works for me!
Dr Fish
Post it on /. (Score:3, Funny)
As a bonus, you can tell which emails are worth reading by how they get moderated. All your work related emails will probably be modded Troll, except for your performance review, which will be modded +5 Funny. Email from your illicit lovers will be modded Insightful, since that type of thing is new to most of us. Email from your family will be conveniently modded down so you will not have to deal with it. Your friends won't need to send you any email at all, since they are probably already on Slashdot, and therefore, know enough to post in your threads.
Problem solved. Ah, Slashdot... Is there anything it can't do?
Re:One word (Score:5, Insightful)
I don't know about you, but I generate about 6GB of email archives per year. Besides that, having my email potentially available for searching doesn't sit well with me. I'm not sure where it stands now, but there were a lot of potential privacy issues with Gmail.
No, I don't receive hordes of email, just a lot of engineering-related mail with source code, research, and white papers attached. If you do anything business related, it's important to keep all of the original emails received so there is an electronic paper trail.
Re:One word (Score:4, Funny)
Do you actually sign up to those free porn places?
Re:Log everything... (Score:5, Funny)
Hey B5_geek, here's a trick to free up a lot of disk space *and* raise the S/N ratio in your logs:
mv irclog.txt irclog.txt.fat && grep -vi lol irclog.txt.fat > irclog.txt && rm -f irclog.txt.fat
Re:Since a month back (Score:3, Funny)