Communications Data Storage The Internet

How Do You Store and Reconcile Email Archives? 380

heyitsjustme wants to know how you deal with old email. "I delete most of what I get but keep the stuff from friends and relations as an archive. Unfortunately, I have these email archives from the late '80s through today in the form of Macintosh, Linux, and Windows mailboxes, including AOL 1.0 mailboxes. What does everyone use to archive email across multiple platforms and non-standard mailbox formats? Is there an easy solution out there? Does anyone archive IM?"
  • Since a month back (Score:1, Informative)

    by Cambrant ( 735036 ) on Saturday March 12, 2005 @07:03PM (#11922528) Homepage
    I've been going with the Gmail philosophy of storing everything. Until someone gets hold of my password of course. People should generally be more careful with the storage of their online communication. Print what's important and stick it all in a drawer. That's the safe way to do it.
  • Upon Searching.. (Score:5, Informative)

    by yuriismaster ( 776296 ) <tubaswimmer@gmai[ ]om ['l.c' in gap]> on Saturday March 12, 2005 @07:06PM (#11922565) Homepage
    EmailMan [emailman.com] has the answers to your problem.

    More utilities than I want to bother with, but hopefully they'll have the converter(s) you need.

    Good Luck!
  • by Alien54 ( 180860 ) on Saturday March 12, 2005 @07:08PM (#11922583) Journal
    Hriders.com gives unlimited free 1 Terabyte email accounts that include 500 Megabyte attachments. We have been asked why we would do such a thing. The answer is simple to help people store large amounts of information in a safe and secure environment. - - - We decided that yes a Terabyte of space may sound rather extreme to some, others will not think so. If you have a free membership with Hriders.com then you will receive a free 1 Terabyte 500 Megabyte attachment email account. You will be able to store over 40 million emails, videos, games, mp3s, or pictures.

    This might be useful, if they don't collapse under /.

  • by rueger ( 210566 ) * on Saturday March 12, 2005 @07:09PM (#11922590) Homepage
    That's one of the many reasons why I have stayed with Pegasus Mail [pmail.com] for many years. Because all my old mail files were created in the same program, I know that I can still access them without problems.

    What I do at year end is move all of that year's messages to a new folder and reset my filters so that the new year's messages go into a new set of folders.

    Periodically I just copy off previous years' messages to CD.

    At least a few times I have been able to go back a couple of years and find information that I lacked.
  • by rsw ( 70577 ) on Saturday March 12, 2005 @07:17PM (#11922641) Homepage
    Convert everything into mbox format. formail [procmail.org] will help you with that.

    Use mairix [force9.co.uk] to search through email.

    mutt [mutt.org] is the best mail client ever.

    -rsw
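    For anyone without formail handy, here is a minimal sketch of the same "convert everything into mbox" idea using Python's standard mailbox module. It only covers the Maildir-to-mbox case; the function name and paths are made up for illustration, and it assumes nothing else is writing to either mailbox while it runs (no locking is done).

```python
import mailbox


def maildir_to_mbox(maildir_path, mbox_path):
    """Append every message in a Maildir to an mbox file; return the count.

    Hypothetical helper for illustration. The mbox file is created if it
    does not exist yet. Messages are copied in whatever order the Maildir
    yields them, which is not guaranteed to be chronological.
    """
    source = mailbox.Maildir(str(maildir_path), create=False)
    target = mailbox.mbox(str(mbox_path))
    copied = 0
    try:
        for message in source:
            # Converting to mboxMessage carries Maildir flags (seen,
            # replied, ...) over to mbox Status/X-Status headers.
            target.add(mailbox.mboxMessage(message))
            copied += 1
        target.flush()
    finally:
        target.close()
        source.close()
    return copied
```

    Something like maildir_to_mbox("/path/to/Maildir", "/path/to/archive.mbox") would then give you a single file that mutt or mairix can open.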
  • Easy... (Score:2, Informative)

    by praetis ( 806293 ) on Saturday March 12, 2005 @07:23PM (#11922669)
    I archive my mail on /dev/null. Send it there daily.
  • Or use maildir (Score:5, Informative)

    by suso ( 153703 ) on Saturday March 12, 2005 @07:24PM (#11922676) Journal
    Whatever you do, I think it's best to keep it in an open and obvious format like mbox or maildir. The nice thing about maildir, though, is that since all the messages are separate files, it might be a little easier to write a program to put them into a new format.

    Personally, since 1999, I've been using a combination of maildir and procmail to archive and save my mail. Every message that comes in goes to a folder called .saved-messages-YYYY-MM and also to my inbox. I simply don't touch the saved-messages folders, and when I am done with the message in my inbox, I just delete it. This has worked well for me and makes it much easier to deal with archiving old mail. In the end, having categorized folders and such is just a waste of time. It's kinda like the wm2 (window manager) way of thinking, but for mailboxes.
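    The "copy to a monthly folder plus the inbox" scheme can be sketched outside procmail too. This is only an illustration in Python's standard mailbox module, not the actual setup described above: the function name, the "inbox" folder name and the mail_root layout are assumptions.

```python
import mailbox
from datetime import date
from pathlib import Path


def archive_and_deliver(raw_message, mail_root, today=None):
    """Store one message in both the monthly archive and the inbox.

    Hypothetical helper: each folder is its own Maildir under mail_root.
    Returns the archive folder name and the Maildir keys of both copies.
    """
    today = today or date.today()
    root = Path(mail_root)
    root.mkdir(parents=True, exist_ok=True)
    archive_name = f".saved-messages-{today:%Y-%m}"
    keys = {}
    for folder in (archive_name, "inbox"):
        # create=True makes the Maildir (cur/new/tmp) on first use.
        box = mailbox.Maildir(str(root / folder), create=True)
        keys[folder] = box.add(raw_message)
        box.close()
    return archive_name, keys
```

    The point of the design is that the archive copy is written once and never touched again, so deleting from the inbox costs nothing.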
  • How I do it (Score:5, Informative)

    by Matt Perry ( 793115 ) <perry DOT matt54 AT yahoo DOT com> on Saturday March 12, 2005 @07:44PM (#11922786)
    I use a procmail recipe to archive my mail. I put it after filtering mailing lists and before I filter spam:

    OLDMAILDIR = $MAILDIR
    MAILDIR = $ARCHIVE_DIR
    :0 cW: archive.lock
    | /bin/gzip >>mailarchive-`date +%Y%m`.gz
    MAILDIR = $OLDMAILDIR

    I use grepmail [sourceforge.net] to find old emails that I might need. Grepmail lets you use perl regular expressions to find messages and then outputs the entire message where a match was found. You can use grepm [barsnick.net] to open grepmail matches as a mailbox in mutt. grepine [www.dfki.de] does the same for Pine, which I use.

    At the end of each year I clean the spam out of my archives using a procmail recipe and spamassassin. This recipe marks messages as deleted in the mailbox. I open these in pine, sort by deleted, and double check them. Once I'm sure they're all spam, I delete them:

    # vim:ft=procmail:

    LINEBUF = 8192
    SHELL = /bin/sh
    MAILDIR = $HOME/mail

    :0 fW: spamclean.lock
    | spamassassin -e --prefs-file=/home/matt/.spamassassin/user_prefs-spam_clean 2>/dev/null

    # If the message was deemed to be spam, set the status to "deleted" so that
    # we can delete it easily and optionally review it.
    :0 e
    {
    :0 fhw
    * ^^rom[ ]
    | sed -e '1s/^/F/'

    :0 f: formail.lock
    | formail -I 'X-Status: D'
    }

    # Fix the mangled "From" line
    :0 fhwE
    * ^^rom[ ]
    | sed -e '1s/^/F/'

    # Remove the last of the SpamAssassin headers
    :0 f: formail2.lock
    | formail -I 'X-Spam-Checker-Version'

    # File message in temporary mailbox
    :0: sandbox.lock
    z-cleaned_mbox

    The special spamassassin config turns off bayesian filtering and sets the threshold high:

    required_hits 15
    clear_headers
    fold_headers 0
    use_bayes 0

    The rest of the spam I clean out by hand.
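    If grepmail isn't available, the basic idea (match a regular expression, get the whole message back) can be roughly approximated with Python's standard library. This is a sketch, not a grepmail replacement: it uses Python regex syntax rather than Perl's, it searches the raw message text so base64 or quoted-printable bodies are not decoded, and the function name is made up.

```python
import mailbox
import re


def grep_mbox(mbox_path, pattern):
    """Yield every message in an mbox whose headers or body match pattern.

    Hypothetical helper. Matching is case-insensitive and runs against
    the undecoded message text, so encoded attachments will not match.
    """
    regex = re.compile(pattern, re.IGNORECASE | re.MULTILINE)
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for message in box:
            if regex.search(message.as_string()):
                yield message
    finally:
        box.close()
```

    Writing the hits into a fresh mbox with mailbox.mbox(...).add(message) would give you something a mail client can open, much like grepm does for mutt.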
  • Mandatory ZOË plug (Score:3, Informative)

    by MCRocker ( 461060 ) on Saturday March 12, 2005 @07:45PM (#11922789) Homepage
    Well, if you're going to be on this topic, a mention of ZOË [zoe.nu] is pretty much required.

    ZOË is a sort of an archiving proxy that sits between your mail client and your mail server. It stores and indexes everything, so you can pop open a browser window and do a search on anything you've ever sent or received. Naturally, this was created before gmail [google.com].

    With ZOË you don't need to worry about those pesky email folders and waiting for long searches.

    Naturally, spam filtering before ZOË is a good idea.
  • Re:Archive what? (Score:2, Informative)

    by Mishura ( 744815 ) on Saturday March 12, 2005 @07:58PM (#11922848)
    If you throw out your mail as soon as you read it, how are you keeping letters written by deceased relatives? Are they sending you mail after they die?

    Actually, yes. I did receive a letter from my grandmom a week after she died. Snail mail works very slowly indeed.

    Reading the letter was strange. The content wasn't strange, just the feeling you get from receiving information from a dead person. That's all I'll say about it.

    Cue the "I read dead people's email" jokes..
  • Re:How I do it (Score:4, Informative)

    by Matt Perry ( 793115 ) <perry DOT matt54 AT yahoo DOT com> on Saturday March 12, 2005 @08:00PM (#11922863)
    Almost forgot. I archive my sent mail as well. This might be harder for you if you don't use a single email client on a single machine. IMAP can help with that.


    Put this in ~/bin/rotate-sent-mail.sh:

    #!/bin/bash

    # This script moves sent mail from $HOME/mail into
    # $HOME/.mailarchive/sent. It also renames the file to include the
    # year and month of the mail it holds (the month before the current one).

    MAILDIR="$HOME/mail"
    ARCDIR="$HOME/.mailarchive/sent"
    year=$(/bin/date +%Y)
    month=$(/bin/date +%-m)

    # we are archiving last month's mail
    month=$((month-1))

    # if this is last year's mail, set the date correctly
    if [ "$month" -eq 0 ] ; then
        month=12
        year=$((year-1))
    fi

    # if the month is less than 10, add the leading zero back
    if [ "$month" -lt 10 ] ; then
        month="0$month"
    fi

    mv "$MAILDIR/sent-mail" "$ARCDIR/sent-mail-$year$month"
    touch "$MAILDIR/sent-mail" && chmod 600 "$MAILDIR/sent-mail"
    bzip2 -9 "$ARCDIR/sent-mail-$year$month"

    Now add the following to your crontab:

    0 0 1 * * $HOME/bin/rotate-sent-mail.sh
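
    The month arithmetic is the fiddly part of that script (January has to roll back to December of the previous year, and single-digit months need their zero back). As a cross-check, here is the same calculation as a small Python sketch; the function name is made up and it only produces the YYYYMM suffix, it does not move any files.

```python
from datetime import date


def previous_month_stamp(today):
    """Return the YYYYMM stamp for the month before `today`.

    Hypothetical helper mirroring the shell logic: step the month back
    by one, wrap January to December of the previous year, and zero-pad.
    """
    year, month = today.year, today.month - 1
    if month == 0:
        year, month = year - 1, 12
    return f"{year}{month:02d}"
```

    Run on the first of the month, as the crontab entry does, previous_month_stamp(date.today()) gives the suffix for the file being rotated.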

  • by ahbi ( 796025 ) on Saturday March 12, 2005 @08:00PM (#11922864) Journal
    I strongly recommend Outport [sourceforge.net]. It does an extremely good job of converting MSFT Outlook attachments into something more readable (mbox I think, it has been a while). MS Outlook usually mangles attachments into some wrapper called TNEF.

    Also, does anyone know of a client program that will recursively create folders on an IMAP server? (Maybe it's a server issue, in which case, what server?)
    I got through translating my years of Outlook email into something more universally readable, but I have so many nested folders that the inability to have the client recursively create IMAP folders is an issue. Suggestions?
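
    One possible approach to the nested-folder question, sketched with Python's standard imaplib: walk a local directory tree and issue one CREATE per folder, parents first. Treat it as a rough sketch under assumptions: the "/" hierarchy delimiter is a guess (servers differ, and the real one is reported in the LIST response), folder names containing spaces or non-ASCII characters would need quoting or encoding that is not handled here, and both function names are made up.

```python
import imaplib
from pathlib import Path


def folder_names(local_root, delimiter="/"):
    """Return IMAP-style folder names for every directory under local_root.

    Parents always come before their children, because tuples of path
    parts sort with a prefix ahead of anything that extends it.
    """
    root = Path(local_root)
    parts = sorted(
        path.relative_to(root).parts
        for path in root.rglob("*")
        if path.is_dir()
    )
    return [delimiter.join(p) for p in parts]


def create_folders(connection: imaplib.IMAP4, names):
    """Issue CREATE for each name; return the ones the server accepted."""
    created = []
    for name in names:
        status, _ = connection.create(name)
        if status == "OK":
            created.append(name)
    return created
```

    With a logged-in imaplib.IMAP4_SSL connection, create_folders(conn, folder_names("/path/to/exported/folders")) would build the tree; folders that already exist simply come back with a non-OK status and are skipped.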
  • Re:One Word (Score:2, Informative)

    by Vario ( 120611 ) on Saturday March 12, 2005 @08:12PM (#11922922)
    Have a look at bincimap. It works well, installs easily and seems to be quite secure.
    See http://www.bincimap.org/ [bincimap.org] for more details.

    It runs on my small Linux server without problems, and I can access my emails securely over SSL from anywhere. The only limit is the hard disk size, so even a couple of GB should be no problem.
  • by aussie_a ( 778472 ) on Saturday March 12, 2005 @08:13PM (#11922925) Journal
    Because if you delete early and often, you've committed no crime. If you wait to delete it until someone (feds, cops, *IAA, UN-black-helicopter troopers, whoever) demands you turn it over to them, you're screwed.

    After all, you break laws too (everybody does, they are written that way). You just haven't been caught yet.


    Instead of deleting all your e-mails "early and often," why not just delete the ones that have illegal activity in them? Or better yet, don't conduct illegal activity via e-mail. Those are a couple of crazy ideas, I know, but they just might save you from deleting all your e-mails.
  • by cgenman ( 325138 ) on Saturday March 12, 2005 @08:17PM (#11922947) Homepage
    The first rule of thumb is "always bring your mail with you." If you change clients, or you change OSes, there is always a way, however roundabout or painful, to get mail into a usable form. This may involve installing Outlook, exporting all of your mail to Outlook, and importing it all from Outlook, but it is worth it. If worst comes to worst, redirect it all back to yourself.

    If you do this religiously, you will only ever have to worry about your current mail format, and how you're going to upgrade it all to your new mail client. For archiving, you can either put it all in a folder that you never open or search, or under a different account that you never open or search, but at least it is all together.

    It's a lot easier to figure out how to take e-mails across current and last generation systems and current and last generation mail clients than it is to try and bridge a 15 year old machine that ran from 5 1/4" floppies using some nasty proprietary mail format and a modern floppyless OS using some other nasty proprietary mail format.

  • by Doctor Fishboy ( 120462 ) on Saturday March 12, 2005 @08:28PM (#11922986)
    ...and let mutt sort out.

    I had multiple folders, sorted by people/project. I got in a complete mess and finally snapped when I spent half an hour looking for a simple message.

    Use procmail to write all incoming messages to 'all-mail-YYYY-MM' and use Mutt hooks to write out to the same directory.

    At the end of the year, cat them together and make 'all-mail-YYYY'. Accessing and reading this mailbox can be done with 'mutt -R -f all-mail-YYYY' as this opens read-only. Use 'l' to do 'limit' searches and use ~t, ~f, and ~b in AND combinations to limit on To: From: and body of messages. It's lovely only having to look in one place!

    Procmail:
    INCOMING=all-mail-`date +%Y-%m`
    # now I want to keep a copy of EVERYTHING in a dated directory
    :0 c:
    $INCOMING

    Muttrc:
    set record="+all-mail-`date +%Y-%m`"

    Works for me!

    Dr Fish
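
    The year-end "cat them together" step can also be done message by message, which avoids surprises if one monthly file is missing its trailing blank line. A minimal sketch with Python's standard mailbox module follows; it assumes the all-mail-YYYY-MM files are mbox files sitting in one directory, that nothing is delivering to them while it runs, and the function name is made up.

```python
import mailbox
from pathlib import Path


def merge_year(mail_dir, year):
    """Append all-mail-YYYY-MM mboxes, in month order, to all-mail-YYYY.

    Hypothetical helper. Returns the number of messages copied. The
    monthly files are left in place; delete them yourself once the
    yearly file has been checked.
    """
    mail_dir = Path(mail_dir)
    # The pattern requires exactly two digits at the end, so lock files
    # and the yearly file itself are never picked up.
    monthly_paths = sorted(mail_dir.glob(f"all-mail-{year}-[0-9][0-9]"))
    yearly = mailbox.mbox(str(mail_dir / f"all-mail-{year}"))
    total = 0
    try:
        for path in monthly_paths:
            monthly = mailbox.mbox(str(path), create=False)
            for message in monthly:
                yearly.add(message)
                total += 1
            monthly.close()
        yearly.flush()
    finally:
        yearly.close()
    return total
```

    After merge_year("/path/to/mail", 2004), the result opens the same way as before: mutt -R -f all-mail-2004.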
  • Re:One Word (Score:3, Informative)

    by astrashe ( 7452 ) on Saturday March 12, 2005 @08:43PM (#11923060) Journal
    IMAP is the answer. I don't use IMAP on a regular basis, but it did let me export mail from Outlook over to Evolution on Linux.

    I used the UW IMAP server, which is a little easier to set up than the Cyrus one.

    The UW IMAPd keeps its folders in mbox format, so it's a great tool for converting oddly formatted mail.

    Moving email is pretty easy -- it's harder to move calendar entries, address books, notes, and the other sorts of data that end up in a program like Outlook. I think the easiest way to do it would be to sync to a Palm device on Windows, and then do it again under Linux, although I haven't actually tried that.

  • by Dwonis ( 52652 ) * on Saturday March 12, 2005 @09:50PM (#11923403)
    I burn it to CD-Rs that I know won't get moved around or scratched. They stand a good chance of lasting the rest of my life.

    No! Check those backups! I have lost data stored on CD-Rs (luckily I had copies), and many of my discs have started to turn yellow after about 2 years! Also, you can sometimes see these little spots of discolouration on the CDs, which makes me think there's a fungus of some sort that's eating them.

    The lifespan of CD-Rs is unknown at this point. Don't trust them for more than a year without inspection, and make fresh copies after 5 or so years.

    I'd also recommend using some kind of forward error-correction scheme, like par2 [par2.net].
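
    Short of par2, even a plain checksum manifest makes the "inspect your backups" step mechanical. The sketch below is detection only: unlike par2 it cannot repair anything, it just tells you which files on the disc no longer match what you burned. It uses Python's standard hashlib, and all three function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir):
    """Map each file's path (relative to archive_dir) to its digest."""
    root = Path(archive_dir)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def damaged_files(archive_dir, manifest):
    """Return the manifest entries that are missing or no longer match."""
    root = Path(archive_dir)
    bad = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(name)
    return bad
```

    Build the manifest before burning, keep a copy somewhere other than the disc, and run damaged_files against the mounted CD-R every so often; a non-empty result means it is time to make a fresh copy from your other backup.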

  • by Anonymous Coward on Saturday March 12, 2005 @11:36PM (#11923922)
    If you have a Mac, save it all as plain text files and just throw it into Boswell. It doesn't just archive it, it cross references it for you.

    Can take care of all of the text in your life.

    Check out www.boswell.com.
  • by LibrePensador ( 668335 ) on Sunday March 13, 2005 @12:56AM (#11924187) Journal

    You open all your email with an email client and move all the disparate inboxes into a big IMAP store on your own computer or one provided by a joint like Fastmail.fm [fastmail.fm] or Runbox.com.

    Then, you keep a local backup on any computer that you move to with offlineimap [quux.org], a wonderful utility that doubles as a multi-inbox synchronizer and backup utility. I have been using it for the past two years and can attest to its reliability.

    Enjoy
  • Archiving IM ... (Score:3, Informative)

    by rocketfairy ( 16253 ) <(nmt2002) (at) (columbia.edu)> on Sunday March 13, 2005 @01:22AM (#11924314) Homepage
    ... is easy. gaim can do it automatically, plaintext or html, by recipient, and searching is easy. Opening up a nice big html file in a browser, complete with every conversation you've ever had with someone on AIM, can be quite handy.

    I actually have two backups of my mail:
    • raw mbox. procmail copies everything to a folder (on my mailserver) which fetchmail (on my box) grabs once in a while, usable with most civilized mail programs (want to copy everything to an IMAP server? Use t-bird or some crap) and searchable with mutt, or for that matter a text editor; and
    • gmail. Yes, you have an extra invite or 50 ... just procmail or otherwise autosend a copy of everything to a gmail account as backup. Gmail might not last forever (hence the mbox, and some cdr/tape), but while it does, it makes for handy searching. Not as nice as mutt though :)
  • IMAP all the way (Score:2, Informative)

    by paulsomm ( 92946 ) <paulsomm@panix.com> on Sunday March 13, 2005 @02:46AM (#11924607)
    I've about 15G of emails now dating back to the early 90s, all stored in a locally-installed Cyrus IMAP server (maildir format, technically). Never used AOL's mail or free webmails so that was never a concern of mine.
  • Re:Kinda Sorta OT (Score:3, Informative)

    by gunfleet ( 547177 ) on Sunday March 13, 2005 @06:05AM (#11925056)
    Is there a package that can read the mbox, the other box-formats, plain text, pull from pop, old tar.gz bundles, categorize (sorta), tag and make such things searchable?

    Yes there is, check out

    http://www.greenstone.org/cgi-bin/library [greenstone.org]
  • by Anonymous Coward on Sunday March 13, 2005 @09:02AM (#11925416)
    If you're serious about archiving or migrating your email, take a look at Mailbag Assistant and Aid4Mail for Windows. Mailbag Assistant makes it easy to read email from many different formats. It can search and display your email archives from CD-ROMs and any other location accessible to Windows Explorer. Aid4Mail supports even more mail clients and can archive your messages into highly compressed ZIP files. It can also help you migrate your email to another mail client or a database. Aid4Mail is very accurate; it can correctly migrate status information and is capable of rebuilding Eudora mailbox files and MS Outlook message folders into standard mbox files.

    Mailbag Assistant 3.8:
    http://www.fookes.com/mailbag/

    Aid4Mail 1.0:
    http://www.aid4mail.com/

    --
    Eric Fookes
    http://www.fookes.com/
  • Re:Archive what? (Score:2, Informative)

    by michelcultivo ( 524114 ) on Sunday March 13, 2005 @10:16AM (#11925695) Journal
    There are a lot of people who don't follow the Netiquette Guidelines [faqs.org], specifically when responding to threads: they don't cut the parts that don't need to be included in the response. That is the big cause of wasted space when storing email.
