Ask Slashdot: Best (or Better) Ways To Archive Email?
An anonymous reader writes: I've been using email since the early '90s and have probably half a million emails in various places and accounts. Some of them are currently in .tar files, others in the original folders from obsolete or I-don't-use-them-anymore mail clients. Some IMAP, some POP3. You get the picture. I don't often need to access emails older than a year or two, but when I do, I have found that my only hope for the truly archived ones is to guess what grep combo might find the right text in the file ... and then pick through the often unformatted, unwrapped, super ugly text until I find the email address or info that I'm searching for. Because of this, I tend to leave emails on servers, or at least in the clients, at all costs so that I can more easily search and find them.
My question is whether there's any way to safely store them in a way that I can actually use them later, offline, in a way that allows for easy date searches, email address searches, and so on. Thunderbird for example has 'Archive' as an option, but if I migrate to a different client I assume that won't work anymore. So in what ways do people archive emails effectively? Or is this totally a lost cause and I should keep limping along with grep?
MailStore Home is the Answer (Score:2, Informative)
MailStore Home is the de facto best free method I've found: http://www.mailstore.com/en/mailstore-home-email-archiving.aspx
Re: (Score:3)
Sounds like it might be good, if you run Windows. Another option is just to set up a home IMAP server that you can dump into - Dovecot handles large volumes of mail quite effectively, for instance. The mails would get stored in Maildir folders, so you can migrate or hand search if you need to as well.
The only downside is finding an IMAP client that will let you work with it without trying to make a local copy the moment you connect. (Mulberry is good, but hasn't been updated in ages. Or you can set up a
Re: (Score:2)
Yes - almost any email client (from the last, what, 20 years?) can handle IMAP. Fire up the old mail clients that produced each archive format and drag everything over to the IMAP server. You could even drag it all over to Gmail.
I asked a very similar question last month... (Score:4, Informative)
I asked a similar question to Slashdot about a month ago, where I wanted to stash E-mail and have it accessible if I'm on the road.
I looked at a few options. Using a virtual machine, an offsite storage provider, and so on.
What I have wound up doing is buying a NAS. Synology or QNAP are good companies for this. The NAS I bought was a basic one, but it supports RAID 1, which is critical. It also gets backed up automatically via a script that goes in via SSH, creates a tar file, pipes it to zbackup which has a repository on another NAS. zbackup is ideal for backups of E-mail, and having another machine pull the backups helps deal with ransomware, once the bad guys start hitting devices.
I then enabled the mail server functionality, which gave me an implementation of dovecot and roundcube. This gave me not just IMAP access, but also access via the web (SSL). Using the onboard firewalling, I limited the IP range that the NAS talks with to just the IP range of the commercial VPN service I use (which is a small provider, run by some competent admins). This way, for an attacker to even get to an open port forwarded past the router to the machine, they have to have an account with that small VPN provider.
For me, this has worked well. I have access to my E-mail over IMAP or the web. Since the NAS doesn't send or receive mail directly (mail just gets copied to it when archived), it doesn't need SMTP access in or out.
Caveat: Focus on security when setting this up. Ideally, you could use the NAS's built in eCryptFS capability to protect the IMAP maildir directories so physical theft of the NAS doesn't mean your critical E-mails belong to someone else. From there, put the NAS in its own DMZ, blocking all outgoing traffic except for it checking for OS updates, and only allowing incoming traffic to the TLS-based ports, preferably with heavy IP restrictions. For backups, do a pull based system, so if the NAS gets infected, the bad guys can only put garbage in the backups, and not attack previously stored data.
Re: (Score:2)
Staggering levels of complexity and cost...
My 20 years of emails are in the text file format native to Eudora. If I use any other email systems, I just bcc myself (i.e. Eudora). All in, ZIPped, I'm under 90MB.
One post-processing thing helps -- I strip unneeded headers, and this chops out about half of the size.
Text files forever baby.
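For anyone curious what the header stripping mentioned above might look like, here is a minimal sketch using Python's standard `email` package. The allow-list in `KEEP_HEADERS` and the function name `strip_headers` are assumptions made for illustration; the poster did not say which headers they drop or which tool they use.

```python
from email import message_from_string

# Hypothetical allow-list: which headers count as "needed" is an assumption,
# not something stated in the thread. Adjust to taste.
KEEP_HEADERS = {
    "from", "to", "cc", "date", "subject", "message-id",
    "mime-version", "content-type", "content-transfer-encoding",
}


def strip_headers(raw: str, keep=KEEP_HEADERS) -> str:
    """Return the message text with every top-level header not in `keep` removed."""
    msg = message_from_string(raw)
    for name in set(msg.keys()):
        if name.lower() not in keep:
            # Deleting by name removes every occurrence of that header,
            # which matters for repeated ones such as Received.
            del msg[name]
    return msg.as_string()
```

Keeping the MIME headers in the allow-list matters: without `Content-Type` and `Content-Transfer-Encoding`, attachments and non-ASCII bodies can no longer be decoded later.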
Re: (Score:2)
It is more complex than just tossing the E-mail from Eudora (guessing mbox format) into a zip file. However, I do have access to the mail from anywhere, and clicking on a VPN, firing up a dedicated IMAP app isn't that bad.
The costs are sunk anyway. The NAS gets used for other things (zbackup repository), so having its dual-core CPU handle some basic IMAP processing when I choose to click the "archive" button on Thunderbird doesn't hurt.
Locally, the mail is stored in the maildir format. While not as conve
Re: (Score:2)
And then let's say the motherboard in your NAS dies. Let's say it happens in 10 years (I'm being generous here), and there is no Synology/QNAP around any more, or even if they still exist, they don't make compatible products any more. Can you pull HDDs out of your NAS and read data from them somehow, in a convenient non-spend-a-week-copying-individual-files-by-hand way?
That's why a generic Linux install on commodity PC hardware will beat any NAS for longevity.
Re: (Score:2)
Re: (Score:2)
The NAS uses Linux's LVM2 and ext4 for the drives in the machine, using a "secret sauce" to adjust the LVMs as disks are inserted/resized.
I don't know how LVM software will be in 10 years, but I think Linux's LVM software (and ext4) isn't too hard to decode if I need to pull the drives out due to a failed component.
Re: (Score:3)
PSTs have a history of getting corrupted and having you lose everything in them - and also have some issues with going to large numbers of files per PST. But it's a solution.
However, it's more complicated than dumping into an IMAP folder for the original requester (as everything would have to be imported into Outlook), and it costs more.
But this isn't particularly clunky or hard to understand - set up an IMAP mail server (like any other, using common and well-documented tools) and transfer the mail to it.
Re: (Score:2)
PSTs have a history of getting corrupted and having you lose everything in them
Citation? Sure I've seen plenty of corruptions, from the file not being dismounted correctly, but Outlook has a built-in PST repair tool which fixes this effortlessly.
Also if your data is precious, then keep a back up. PSTs are an archive so shouldn't change. Keeping two copies of each is trivial.
Re: (Score:2)
As for the remarks about the PST corrupting... rarely happens in Outlook 2010+, never if the PST is static / not connected to Exchange (OSTs have issues on Outlook 2013 with repeated network adapter handoffs between wired and wifi). He needs one golden backup of the PST and it will be solid.
Frankly, the most robust, mobile, inexpensive, and secure solution is an Outlook.com account used as an archive +
Re: (Score:2)
Functionally there's not a lot of need, though the database search features of Exchange are kind of nice.
Myself, I actually use Zimbra which is open source and free for personal use. I have that in a VM on my home server and connect using IMAP and when on the road I can still access it via the web. It uses Postfix for email on the back end with a MySQL database that contains all the mail metadata. Yeah, Zimbra uses Java heavily which kind of sucks but it's really not too bad. As of today I have email going
Re: (Score:2)
Microsoft solved this problem 15 years ago (Outlook and PSTs).
Huh?
I just had to migrate a bunch of Outlook mail for people who were moving from XP to Windows 10. "Solved" isn't a word I'd use to describe the convoluted process needed to do it.
Re: (Score:2)
It's just a guess, but I suspect your problems are less related to accessing a PST and more to do with all the other stuff that comes with OS/App version migrations.
Re: (Score:2)
+1 for Mailstore, though we use the enterprise version and not the personal version. We had A LOT of resistance when we first deployed, but we managed to get all email into a single repository and get rid of all the damned PST files people had accumulated over the years. Resistance faded after a couple of weeks, and people are generally happy with it now.
Re: (Score:2)
FOIA requests to the NSA to access them. You don't need to do anything to archive, and storage is free. Only have to pay for access.
hoarding mentality (Score:1, Insightful)
Sure, they might be useful at some point, but do you really need your emails from 20 years ago? Life is temporary. All things decay. Attachment causes suffering.
Re: (Score:3)
Re: (Score:2, Funny)
In that case, everyone at the company should print out each email they receive.
Re: (Score:2, Funny)
Re: (Score:1)
Which Bob?
Re: (Score:2)
Re: (Score:2)
If you are stapling the reports, please do remember that's my stapler. I would like it back, please.
Re:hoarding mentality (Score:5, Informative)
Holding your business emails too long is a liability risk... they are subject to discovery in the case of a lawsuit. Most businesses have a limited email retention policy for that very reason.
Re:hoarding mentality (Score:5, Insightful)
Holding your business emails too long is a liability risk..
I was just asked to recover email from the late 90s as part of an effort to prove we had prior art on a patent that was being asserted against us. The email history included draft drawings, work orders to a manufacturer requesting customizations to our manufacturing equipment, invoices, and negotiations with customers to work with it, etc., all with a clearly documented timeline that could be verified with multiple 3rd parties if it came to a court situation.
This sword clearly cuts both ways.
Re: (Score:2)
I don't know that this is really a case for storing email forever. Yes that is true, but it also means that decades of email are available for searching and can be required to be searched or given up.
The reality is, design docs should be saved. These sorts of notes and work should be saved. Retaining emails may provide a solution, but that doesn't mean it is a good solution or the right one. I would submit the real issue here is that nobody saved the documents; but instead relied on email to save it for the
Re: (Score:2)
I don't know that this is really a case for storing email forever. Yes that is true, but it also means that decades of email are available for searching and can be required to be searched or given up.
Yes, I agreed, it cuts both ways.
The reality is, design docs should be saved. These sorts of notes and work should be saved. Retaining emails may provide a solution,
Email provides a timeline that is much harder to forge, and which can be verified and testified to by external 3rd parties who were referenced and/or copied on various messages.
Files in a folder somewhere... 5 minutes and anyone here could make them say they were written whatever date we wanted.
I would submit the real issue here is that nobody saved the documents; but instead relied on email to save it for them.
We had the documents. But email is what ties them all together, and provides strong evidence of a timeline. Documents are shown to be referenced on a given date, in a give
Re: (Score:2)
Re: (Score:2)
Surely your company would have other evidence than emails to support your prior art?
Sure it does. But email has the advantage of being time stamped, with copies sent to 3rd parties.
The dates on purely internal digital documents are much harder to establish if their integrity is challenged.
Didn't your company apply for a patent?
No. We felt (and still feel) that the 'innovation' was obvious, and that the patent has no merit.
But arguing that is expensive and time consuming and risky. If we can demonstrate that we were making, selling, and using the 'invention' well before they 'invented' it, then what exactly did they invent? And
Re: (Score:2)
Re: (Score:2)
Are these tape backups from the late '90s?
Nope. They live on spinning rust. Seriously... the entire thing is well under a terabyte. It's not exactly hard to keep it around.
Re: (Score:2)
Re:hoarding mentality (Score:5, Interesting)
I had a client who insisted he needed to keep every email forever. I thought he was full of shit until he explained to me why.
He works as a vendor rep, helping them sell shit to a well-known Fortune 50 retailer.
As it turns out, this Fortune 50 company periodically audits years old (like sometimes 5+ years) invoices and receiving information and arbitrarily decides "we just realized that shipment you sent us in 2009 was short, but we paid the invoice in full. So we're going to subtract the overpayment -- plus interest -- from the current amount we owe you."
Part of this guy's job was the ability to get the shipping/receiving info as it happens, and the old email lets him present info that basically says "you said it was a complete shipment in 2009, so no deductions".
What I found kind of amazing was that somehow this retroactive auditing is considered acceptable. My guess is vendors are just expected to eat it or not get their product on the shelves.
Re: (Score:2)
My guess is vendors are just expected to eat it or not get their product on the shelves.
I guess they don't have enough clout to write a dispute length (e.g., up to 1 year) into their contract with the retailer. Although I suppose if they have to fall back to the contract in the event of a dispute, that retailer may not use them much longer.
Re: (Score:2)
I think that was the risk.
From what I could tell, the products they repped were not like major name brands owned by other Fortune 50 (or even 100, or maybe even 500) companies, so it was the epitome of unequal bargaining power.
It really was a case of either being able to dispute it effectively with documentation, eat the costs, or complain and lose a major chunk of your retail distribution.
If it had been a vendor of equal weight to the retailer, then it gets a lot harder for the retailer.
Re: (Score:2)
"we just realized that shipment you sent us in 2009 was short, but we paid the invoice in full. So we're going to subtract the overpayment -- plus interest -- from the current amount we owe you."
Can't see this holding up in court. Having been to court a few times, the longer the period between the event and you raising the dispute, the harder it is to convince anyone.
And as with science, the law also recognises that the burden of proof is with the claimant, so good luck proving you had one box missing from your delivery 6 years ago, but are only raising it just now.
Re: (Score:2)
You're a company with $100 million in annual revenue for whom a Fortune 50 retailer represents some significant percentage of your total product distribution and sales.
They pull some dubious move and you sue them.
They easily determine the Chinese manufacturer of your product, obtain said product with their private label on them and drop your product.
Now you're a $75 million company.
That $50k or whatever in deductions from a past year audit you just saved suddenly isn't a very good stance, outside of its m
Re: (Score:2)
You're a company with $100 million in annual revenue for whom a Fortune 50 retailer represents some significant percentage of your total product distribution and sales.
They pull some dubious move and you sue them.
They easily determine the Chinese manufacturer of your product, obtain said product with their private label on them and drop your product.
Now you're a $75 million company.
That $50k or whatever in deductions from a past year audit you just saved suddenly isn't a very good stance, outside of its moral value.
You've been watching too many movies.
Re: (Score:2)
you sent us in 2009 was short, but we paid the invoice in full
Sounds like a company which is a SAP partner.
Re: (Score:2)
This is common practice, but hiding evidence of past crimes is a scary reason to delete old emails.
Re: (Score:2)
I can only speak for the United States, but here, the constitution explicitly forbids "ex post facto" laws that make something a crime retroactively. Can you cite an example where this has happened? Perhaps this precaution is to protect banks from such activity in other countries?
Re: (Score:3)
That's true, but from the sounds of it this is for business reasons. For business it's probably more important than if it was personal.
For business it can be even more important to clean things out. Having old things on hand is more likely to work against you than work in your favor. Yes, some documents need to be carefully retained and kept on file for the life of the business and the best place to do that is not in email. Most of these communications should be disposed of on a regular basis.
Most business lawyers I've worked with have strongly recommended a data retention policy to dump email regularly and always before the 3-month gover
Re:hoarding mentality (Score:5, Insightful)
There is no good reason to keep 25 years of email.
There is no good reason to assume that your needs are the same as those of others.
Re: (Score:2)
Of course there is. You should know, some people actually communicate about important matters, and keeping records can oftentimes be very beneficial.
My moving window for keeping all e-mails from all e-mail accounts is 10 years, however, I also have some mails dating back as far as '95. Thing is, you can never know when and what you could need, and given the almost 0 long term cost of storing those e-mails, it's better to have them than not to have them
Re: (Score:2)
My moving window for keeping all e-mails from all e-mail accounts is 10 years, however, I also have some mails dating back as far as '95. Thing is, you can never know when and what you could need, and given the almost 0 long term cost of storing those e-mails, it's better to have them than not to have them.
While the long term cost is low, I think it's important to make a clear mark as to what's active, and what's archived. I use calendar years, and early in the year, archive digital photos, "my documents", etc from the previous year. That way I know the "2014" data set is fixed, and is the same on all backups. Then I only need to worry about stuff active in the current year. If I need to dig back to find old information I can, but it's not cluttering up my current workspace / hard drive / etc.
Re: (Score:2)
"There is no good reason to keep 25 years of email. "
Of course there is. You should know, some people actually communicate about important matters, and keeping records can oftentimes be very beneficial.
Seems like you ignored the rest of the post, fixating on that one line.
I am not saying to dump important documents. I am saying a hodgepodge of email systems is a terrible archival method.
If there are communications that need to be preserved, preserve them properly.
As I pointed out, contracts should be preserved properly, generally meaning a hard copy printed out and kept in a physical file folder, or electronic copies should be properly archived as electronic documents. Mementos should be pres
Re: (Score:2)
Most work places I've been at have had 3 months before automatic forced deletion of email.
We have one of those. A big PITA. 1 year would be much more manageable. I back up the local sync'd cache every three months. If I need to dig back in old emails, I go offline, restore the old cache, and retrieve what I need. On more than one occasion I've had to ask customers or suppliers "Do you remember x many months or years ago we talked about y? Do you still have that email?"
Luckily they are as much of a pack rat and can produce the email.
Re: (Score:2)
Unread email is treated differently under the law, and currently any email that is six months old or older and marked as unread can be opened and read by federal agencies without a warrant.
And how do they get to it without a warrant? My server is behind locked doors, and I have the keys...
Re: (Score:2)
And how do they get to it without a warrant?
Under the Stored Communications Act, with an administrative subpoena that does not require any probable cause statement or review by a judge. No warrant required, but full legal force.
If you maintain it on your own you can fight it if you want.
If they give it to a third party like your ISP or some other service provider, they might fight it, or might not. Choose your partners carefully.
Re: (Score:2)
If you maintain it on your own you can fight it if you want.
Actually, in these cases, they don't try. They never want people to know when they are data trolling them.
Re:hoarding mentality (Score:4, Insightful)
I friggin' hate people who, on an Ask Slashdot, completely fail to answer the question and say something that has nothing to do with the topic at hand.
And yes, I am aware of the irony of posting a comment like this to criticize one, so you needn't bother pointing that out.
Re: (Score:1)
You don't need them until you need them. I've had more than one occasion where my finding an email in a haystack saved my bacon.
A better associated question is why vastly more powerful search capabilities aren't built in to pretty near all email programs and services. It seems it is immeasurably harder to do good searching now than it was 30 years ago on unix boxes. Yes, the volume of storage has grown, but so has the memory and the cpu power. Everything is now "in the cloud" and almost nothing can b
Re: (Score:2, Insightful)
Email is the new "box of letters". It can be fun and sentimental to go through old correspondence. When you die, your kids will have fun reading your old emails if they can figure out your devious passwords.
Re: (Score:2)
Store as local maildir. (Score:2)
`OfflineImap` (for fetching into a local maildir), then `mu` for indexing and searching.
As for converting your already-archived mail into maildir format, that's a little more tricky. Once they're in maildir format, you can just use `tar` to compress the ones you don't currently need to access.
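If the already-archived mail is in mbox files, the "tricky" conversion step can be sketched with Python's standard `mailbox` module. This is only an illustration under that assumption: the function name `mbox_to_maildir` and the paths are hypothetical, and mail stuck in proprietary client formats would need exporting to mbox first.

```python
import mailbox


def mbox_to_maildir(mbox_path: str, maildir_path: str) -> int:
    """Copy every message from an mbox file into a Maildir; return the count."""
    src = mailbox.mbox(mbox_path, create=False)       # fail if the mbox is missing
    dst = mailbox.Maildir(maildir_path, create=True)  # creates cur/, new/, tmp/
    copied = 0
    try:
        for message in src:
            # Maildir.add() writes one file per message.
            dst.add(message)
            copied += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return copied
```

Usage would be something like `mbox_to_maildir("old-archive.mbox", "Maildir/archive-2009")`, after which `mu` or mairix can index the resulting directory. The source mbox is only read, never modified, so it can be kept until the copy has been checked.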
Mairix for local search (Score:2)
mairix is another good solution for searching them, once you've got them in local mbox/mh/maildir spools. I think back when I was converting to maildir I scripted mutt to copy them in, but it's obviously harder if you've got them in proprietary formats.
+1 for Mairix (Score:3)
After trying several solutions I settled on Mairix. Searches are screaming fast (less than a second to search several hundred thousand emails), indexing is fast, it's reliable (no problems in the 5+ years I've been using it), and the search language is easy and flexible.
* I use procmail to send a copy of everything to an archive, rotated monthly
* The archive is therefore just a handful of mbox files
* I have a cron job to run "mairix -Q" every 5 minutes, and "mairix -p" nightly
* I have this in my .bashrc:
Just keep them like any other email (Score:2)
Mailarchiva (Score:2)
https://www.mailarchiva.com/ [mailarchiva.com]
Works pretty well.
Re: (Score:2)
I use the enterprise version of Mailarchiva at my work. It is a solid product.
importexport tools - Outlook - eM Client (Score:1)
Mail Consolidation IMAP (Score:4, Informative)
I remember having a similar problem years ago with E-mail in several systems and getting annoyed that everything was in different formats in different E-mail clients. I fixed the problem by setting up my own IMAP server. An IMAP server is a mail server that's compatible with virtually ALL E-mail clients, but what's important is that, unlike POP3, it acts as a mail store, so you can upload mail to an IMAP server without screwing up formatting or anything. Then once you get all your E-mail up to your IMAP server, you can choose to just store it there (just remember to back it up now and then), or you can redownload it all into a mail folder in Thunderbird (back up Thunderbird's mail store folder for protection). Thunderbird probably isn't going away in the foreseeable future, but if it does, sometime down the road you can reuse your IMAP server to transfer the mail to another mail client.
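For mail that already sits on disk as mbox files, the upload step does not have to be drag-and-drop in a client. Below is a rough sketch using Python's standard `imaplib` and `mailbox` modules. The function names, the `"Archive"` folder name, and the host in the usage note are all assumptions for illustration, not anything the poster described.

```python
import mailbox
from email.utils import parsedate_to_datetime


def internal_date(message):
    """Return the message's Date header as an aware datetime, or None."""
    try:
        when = parsedate_to_datetime(message["Date"])
    except (TypeError, ValueError):
        return None
    # imaplib only accepts timezone-aware datetimes; otherwise let the
    # server stamp the message with the upload time.
    return when if when.tzinfo is not None else None


def upload_mbox(conn, mbox_path: str, folder: str = "Archive") -> int:
    """Append every message in an mbox file to `folder` on an IMAP connection.

    `conn` is expected to be a logged-in imaplib.IMAP4 or IMAP4_SSL object.
    """
    conn.create(folder)  # returns a "NO" status (no exception) if it exists
    uploaded = 0
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            status, _ = conn.append(
                folder, None, internal_date(message), message.as_bytes()
            )
            if status != "OK":
                raise RuntimeError(f"server refused message {uploaded + 1}")
            uploaded += 1
    finally:
        box.close()
    return uploaded
```

A session might look like `conn = imaplib.IMAP4_SSL("imap.example.org")` followed by `conn.login(user, password)` and `upload_mbox(conn, "old-archive.mbox")`, where the host name is a placeholder. Passing the original `Date` as the IMAP internal date keeps date-based sorting and searching on the server meaningful.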
Re: (Score:2)
Thanx for your enlightenment.
What exactly has setting up an IMAP server to do with eMail archives? Making them searchable etc. ??
Are you from a different planet where eMail and IMAP work differently?
I'm only reading this thread because I have the same problem, about a million mails. How the fuck do you expect me to get them into an IMAP server? I have them on DISK!!
And most mails I get, I get via POP. Why should I leave my mails on my provider's IMAP server?
If you want to contribute, then say something construct
Re: (Score:2)
Become your own provider; set up your own IMAP server either in-house or on a cheap hosted solution like Linode then import your data. If you want to get really complex then use scripting with S3CMD or some other tool so you can now back it all up to S3, then configure your S3 to archive to Glacier after 24 hours or so. Yeah, that means some costs but there are ways of mitigating that too.
One possibility is have a server at home with all your mail... make it a VM or a PM... whatever. Import the data through
Re: (Score:2)
Mail archives (Score:2)
One option might be to set up a local IMAP server on your machine and archive your mail there. Then any mail client that talks IMAP could access it.
Thunderbird's nice in that it uses the standard maildir format (one file per message, mail folders are just directories under the root of the tree) for its local copy. Most IMAP servers understand and can use that format so you can just dump a copy of the local mail store into the IMAP server's user mail directory (or if that doesn't work, use the Unix movemail
Re: (Score:2)
Thunderbird's nice in that it uses the standard maildir format (one file per message, mail folders are just directories under the root of the tree)
Unfortunately, NO it doesn't! Maybe you just mistyped this, or else you are confusing Thunderbird the mail client, with an IMAP server like Dovecot, Courier, or others.
IMAP servers usually do use the "Maildir" system to store emails: 1 file per mail, which is very nice, and helps a lot with backups.
Thunderbird, the mail client, stores in mbox format: 1 file per folder. So if you add 1 email to your 2GB folder, that 2GB file will need to be backed up again. But at least, it's a text format, so it's still muc
Re: (Score:2)
It might be that I'm on Linux instead of Windows, but for me Thunderbird clearly says that the message storage type is "File per message (maildir)" and the directories exactly match the format of the maildir folders Dovecot uses on the server. You can even see the setting in the advanced preferences General tab although it's greyed out by default (the mail.server.default.canChangeStoreType setting probably controls that). I know Thunderbird used to use mbox files, but I've only ever seen it use maildir on L
My very unideal solution (Score:2)
Thunderbird - where's the objection? (Score:2)
With modern hard drive sizes I don't see the need for compression. Without compression you can use any good free text search tool. I have kept a good proportion of my email since about 1990, and it's all in Thunderbird. (Messages from earlier clients I just emailed to myself en masse).
Thunderbird has pretty good search capability, but as I am still running on Windows 7 I use Copernic Desktop Search, which has some useful features. (It indexes and searches files, and handles Firefox as well as Thunderbird).
Re: (Score:1)
Re: (Score:2)
Here is the objection to Thunderbird:
"On December 1, 2015, Mozilla Executive Chairwoman Mitchell Baker announced in a company-wide memo that Thunderbird needs to be uncoupled from Firefox. She referred to Thunderbird as paying a tax on Firefox and said that she does not believe Thunderbird has the potential for "industry-wide impact" that Firefox does."
One Big File? (Score:1)
I don't understand why emails are not more often stored as one-file-per-message, with a time-stamp as the start of the file name (YYYY-MM-DD etc.).
Some file systems are wasteful for lots of small files by padding actual space into large discrete chunks, but they should remedy that rather than stuff all messages into one big file.
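The one-file-per-message layout with a timestamp at the start of the file name is easy to produce from an existing mbox file. Here is a minimal sketch with Python's standard library; the function name `explode_mbox`, the `.eml` extension, and the exact file name pattern are assumptions chosen for illustration.

```python
import mailbox
from email.utils import parsedate_to_datetime
from pathlib import Path


def explode_mbox(mbox_path: str, out_dir: str) -> list:
    """Write each message of an mbox to its own file named by its Date header."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, message in enumerate(box):
            try:
                stamp = parsedate_to_datetime(message["Date"]).strftime(
                    "%Y-%m-%d_%H%M%S"
                )
            except (TypeError, ValueError):
                stamp = "0000-00-00_000000"  # missing or unparseable Date
            # The running index keeps names unique when two messages
            # share the same second.
            target = out / f"{stamp}_{index:06d}.eml"
            target.write_bytes(message.as_bytes())
            written.append(target)
    finally:
        box.close()
    return written
```

With names like these, a plain directory listing is already sorted by date (as written in each message's own header, without timezone conversion), and incremental backup tools only need to copy the new files.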
use standard (open) formats w/ proven records (Score:2, Interesting)
I've been using email since the early 1980's, 1982 specifically. I was using "mail" then, later mailx, later whizbang graphical clients.
I still have tar archives of emails from a PDP-11. I can still read them today. Why? Because open formats. Tar archives from the dawn of time can still be read on a modern Linux system today. Once you start locking things up in proprietary formats such as used by Outlook, it gets harder to read them once that format dies. Not impossible, but certainly a bigger PITA.
T
Solr (Score:2)
Yes: Thunderbird archive (Score:5, Informative)
Use the Thunderbird archive.
Thunderbird for example has 'Archive' as an option, but if I migrate to a different client I assume that won't work anymore.
Nope! :-)
I have about 10 years of email in Thunderbird. It keeps data in the mbox format which is a well supported open standard. The files are human readable and can be grepped. There are lots of 3rd-party tools that support mbox. Thunderbird builds indexes (maybe those are proprietary) which are good enough that I can search that decade of email in a few seconds. (Maybe that is only searching by subject, to, and from. Message body searches might take longer). I remove attachments from old mail though, because that eats up space and is not valuable. If I needed the attachment, I saved it somewhere more appropriate.
The Thunderbird archive feature merely moves the mail into separate mbox folders to keep the main file from getting too big. It doesn't make them proprietary.
The hard part might be moving existing mail into that format from whatever it is in now.
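Because mbox is a plain, widely supported format, the date and address searches the submitter asked about do not have to depend on any one client. Below is a minimal sketch using Python's standard `mailbox` module; the function name `search_mbox` and its parameters are hypothetical, and it scans the file linearly rather than building an index the way Thunderbird does.

```python
import mailbox
from datetime import datetime, timezone
from email.utils import parseaddr, parsedate_to_datetime


def search_mbox(mbox_path, sender=None, since=None, until=None):
    """Return (date, sender address, subject) for messages matching the filters.

    `since` and `until` must be timezone-aware datetimes; `until` is exclusive.
    """
    hits = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            address = parseaddr(str(message.get("From", "")))[1].lower()
            if sender and sender.lower() != address:
                continue
            when = None
            try:
                when = parsedate_to_datetime(message["Date"])
                if when.tzinfo is None:
                    # Assumption: treat dates without a zone as UTC.
                    when = when.replace(tzinfo=timezone.utc)
            except (TypeError, ValueError):
                pass
            if (since or until) and when is None:
                continue  # cannot date-filter a message with no usable Date
            if since and when < since:
                continue
            if until and when >= until:
                continue
            hits.append((when, address, str(message.get("Subject", ""))))
    finally:
        box.close()
    return hits
```

For example, `search_mbox("Archives.mbox", sender="alice@example.com", since=datetime(2009, 1, 1, tzinfo=timezone.utc))` would list everything from that address since the start of 2009. For hundreds of thousands of messages, an indexing tool such as mairix or `mu` will be far faster than a linear scan like this.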
Re: (Score:2)
I'd second "use the thunderbird archive" and add "use IMAP."
Thunderbird can archive mail into a single folder, or per-year folders, or per-month folders. When you are using IMAP, those folders are on the server, and accessible from any client. All of the clients I'm aware of allow you to "subscribe" or not to folders of your choosing, and most offer more fine grained control to choose what to download and keep locally in order to control client storage and bandwidth use.
Thunderbird has an excellent search
Re: (Score:2)
1. The MBOX format [digitalpreservation.gov] gloms all your mail into one continuous text file. It does not have a special string to denote the beginning of a new mail message. It uses "From " (F r o m + a space) to figure out where the beginning of a mail message is. Consequently, if an email has a line in the body where someone has actually typed "From " as the beginning of a sentence, Thunderbird can mistake that as the beginning of a
Re: (Score:2)
Consequently, if an email has a line in the body where someone has actually typed "From " as the beginning of a sentence, Thunderbird can mistake that as the beginning of a new email (there are a couple other checks it does - read the link if you want the details).
Actually, no. You're wrong. If there is "From " at the beginning of a line, then what the mbox format specifies is that it be reencoded as ">From ", so that it can be decoded.
:-/
Unfortunately (and this is the real problem), it does not require that ">From " be reencoded as ">>From ", so in other words encoding and decoding is not an invertible situation, because most MDAs are stupid about encoding.
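To make the "not invertible" point concrete, here is a small sketch of the two quoting rules. The names mboxo (quote only "From ") and mboxrd (also quote ">From ", ">>From ", and so on) are the usual labels for these mbox variants; they are not from the comment above, and the function names are made up for illustration.

```python
import re


def quote_mboxo(body):
    """mboxo rule: add '>' only to lines that start with 'From '."""
    return re.sub(r"^From ", ">From ", body, flags=re.MULTILINE)


def unquote_mboxo(body):
    """Reverse of quote_mboxo: strip one '>' from lines starting '>From '."""
    return re.sub(r"^>From ", "From ", body, flags=re.MULTILINE)


def quote_mboxrd(body):
    """mboxrd rule: add '>' to lines that start with zero or more '>' then 'From '."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.MULTILINE)


def unquote_mboxrd(body):
    """Reverse of quote_mboxrd: strip exactly one leading '>'."""
    return re.sub(r"^>(>*From )", r"\1", body, flags=re.MULTILINE)


original = (
    "From here on it gets tricky.\n"
    ">From was already typed this way by the sender.\n"
)

# mboxo loses the distinction between the two lines on the way back.
print(unquote_mboxo(quote_mboxo(original)) == original)    # False
# mboxrd round-trips because every level of quoting gets one more '>'.
print(unquote_mboxrd(quote_mboxrd(original)) == original)  # True
```

With the first rule, a body line that the sender really did write as ">From " comes back as "From " after decoding, which is exactly the ambiguity described above.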
Bluewave! (Score:2)
Don't archive, migrate (Score:2)
Every time I switched mail clients or computers, I made sure to import all mail from the old to the new program. Messages that were made in my first mail account (in Eudora, on Macintosh System 7) are still accessible in my current Mac (Apple Mail, OSX 10.10). I don't need it often, but when I do, it's one search away.
Sounds Like You're Making a Classic Type III Error (Score:2)
...Solving the wrong problem.
eMail is not a storage medium; it is for short communiques, and sometimes those lead to threads while an issue is threshed through. But using your eMail system for historical storage is like buying a small automobile for long-haul freight. Or, using Twitter to negotiate a contract.
Decide what of all your data you intend to keep, and find a useful, generic tool for storage and retrieval, irrespective of content.
Re: (Score:2)
It sounds like you've made a Category 6C blunder by providing a solution to a different problem.
Nobody has the time to sift through two decades of emails and pick out the important things. Even if they did, the custom database thing to put them in will definitely not be cross platform, necessitating keeping a copy of the original mess of mbox/tar/etc files around to dig through.
Re: (Score:2)
eMail is not a storage medium
Of course it is.
eMail is no different than paper mail.
Solving the wrong problem
Depending on your "problem" you are obliged by law to store them and have them accessible for 10 years, minimum. Depending on the situation, up to 30 years.
Or, using Twitter to negotiate a contract. ... Oki,
And what would be wrong with that? With 90% of my business partners: I have no contract at all. All we do is negotiation: can you do that? Yes I can! What is your price/timeframe? Something like X/Y
No need to save emails.... (Score:2)
Dont be a pack rat. (Score:3)
Re: (Score:2)
Except that one time when someone important (or not so important to you) dies and walking through old email correspondence lets you relive moments that are gone and may have been forever forgotten without the help of archived emails. While I don't want anyone rooting through my emails while I'm alive, I can imagine what a treasure trove of my history is trapped in email format and can be visited, explored and enjoyed by someone else, be it a distant grandchild doing research or a historian trying to underst
Make searchable PDFs (Score:2)
In addition to rolling your own imap, as has already been suggested, you can/should also do this.
If you are a Windows and Outlook user (and if not, Google and torrents are your friend), burn a wet weekend learning the mysteries of those two plus Acrobat Pro. Get a clean install on a fast PC with plenty of memory and an SSD.
Import all your old crap into Outlook (look it up).
Install Acrobat Pro including the Outlook plugin... Trivially use this to create searchable PDFs including attachments.
IMAP server (Score:3)
Put all your mail on an imap server. You'll be able to access it with any mail client. Set up the imap server as the archive destination for TBird. Now all your mail is archived in the imap server and is accessible.
You don't trust your email host? That's fair. Run your own imap server on your NAS or even your desktop machine. Everything stays right there on your own media and is still future-proof with regard to changing clients. If you need to change servers you just use your favorite email client to transfer mail from one to another.
I have everything online at my email provider. In my case, "everything" goes back to the mid-90s. I recently switched hosting providers and did just as I described: Set up separate accounts in TBird with the old and new providers. Select all in a folder on the old provider, drag to a folder on the new provider. (Well, actually I had to do it in chunks of under 5000 messages or TBird would get all crashy on me. But you get the idea.) It was kind of tedious to move hundreds of thousands of messages, but it was merely tedious. It wasn't problematic.
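If dragging folders around in a client gets tedious, the same old-provider-to-new-provider copy can be scripted. Below is a rough sketch using Python's standard imaplib, working through the folder in batches in the same spirit as the "chunks of under 5000" above. The helper names chunked and copy_folder are made up; it assumes both connections are already logged in, that the folder already exists on the destination, and that the folder name needs no special quoting. It copies message bodies only, so flags and server-side dates are not preserved.

```python
import imaplib


def chunked(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def copy_folder(src, dst, folder, chunk_size=1000):
    """Copy every message in `folder` from src to dst, one batch at a time.

    src and dst are logged-in imaplib.IMAP4 or IMAP4_SSL connections
    (hypothetical setup; hosts and credentials are up to you).
    Returns the number of messages copied.
    """
    typ, _ = src.select(folder, readonly=True)  # read-only: never touch the source
    if typ != "OK":
        raise RuntimeError(f"cannot open {folder!r} on the source server")
    typ, data = src.search(None, "ALL")
    if typ != "OK":
        raise RuntimeError("search failed on the source server")
    numbers = data[0].split()

    copied = 0
    for batch in chunked(numbers, chunk_size):
        for num in batch:
            typ, parts = src.fetch(num, "(RFC822)")
            if typ != "OK" or not parts or not isinstance(parts[0], tuple):
                continue  # skip anything the server would not hand over
            raw_message = parts[0][1]
            dst.append(folder, None, None, raw_message)
            copied += 1
        print(f"{folder}: {copied}/{len(numbers)} copied")
    return copied
```

Connections would be created with something like imaplib.IMAP4_SSL("imap.old-provider.example") followed by login(), where the host name is a placeholder. Try it on a small test folder before trusting it with hundreds of thousands of messages.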
Inbox (Score:2)
I just leave them in the inbox or whatever folder they end up in according to my sorting scripts. I'm using claws-mail as a client.
Works for me.
mutt + offlineimap + notmuch (Score:3)
I use a combination of mutt + offlineimap + notmuch [hobo.house] for mail, local archiving and a very powerful search.
I've been on this setup for the past 6 years or so. If mutt isn't your thing, this approach is modular, so you could simply sync with offlineimap and index/search with notmuch.
Dsync from Dovecot (Score:2)
A tip: Dovecot has a nice sync tool http://wiki2.dovecot.org/Tools... [dovecot.org] Perfect to get your email from different IMAP sources to your own system. It can also change mailbox format etc. Combine that with Dovecot itself to give you IMAP access and you have access. You can also use it to keep it in sync with an off site archive.
Dovecot does have full body search, but it is quite CPU intensive. No problem if you just run it for a few users and accept that it may take a while on a large amount of emails. Not too g
Re: (Score:1)
Her server actually lasted longer than the one she was "supposed to" use. Contrary to popular myth, the office server was not designed for high-security or anything else special. It probably had lowest-bidder quality, and backups either failed or were lost. (A separate procedure was used for classified stuff.)
Re: (Score:2)
I go through >10 year old emails all the time. "Hey, I remember talking to a professor about this algorithm." "Where did I go camping that year?" "What was my order number for that game I bought ages and ages ago, since they accept them for free copies of the remake?" "I'm trying to gather information on something, but the person I talked to has long since died and their site isn't on archive.org." It's only going to happen more and more often for older and older stuff.
Email is also really convenient f
Re: (Score:2)
I go through >10 year old emails all the time. "Hey, I remember talking to a professor about this algorithm." "Where did I go camping that year?" "What was my order number for that game I bought ages and ages ago, since they accept them for free copies of the remake?" "I'm trying to gather information on something, but the person I talked to has long since died and their site isn't on archive.org." It's only going to happen more and more often for older and older stuff.
Agree. Or I'm like "What was the flight I took last year from JFK to LAX? It worked well with my connections". Even recently for work I noticed that when I ordered software from one supplier, I got an email, copied to the local vendor, with the serial number. I had another package we bought from them (by someone else who has since left) where I could track down the PO, but not the serial number. I emailed the guy copied on my email, and he could dig up the copy he was CC'd on.
Email is also really convenient for backing up work that's under the ten megabyte range...manuscripts, source code, etc. If someone doesn't have a proper backup system or it's not easy to use from the system they're on at the moment, emailing something to themselves is quick and easy.
Critical University term end repor
with sentbox_2013/ and archive/. Good guy or bad g (Score:2)
This is what I do: run IMAP locally (Dovecot). Every year or so, I create a folder called sentbox_2013/ and move all the sent emails from 2013 there. My regular sentbox contains the last 14-20 months or so.
I also have a folder called archive/ which holds the few messages I think I'll actually need again.
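The yearly move can be scripted too. Here is a minimal sketch with Python's standard mailbox module, under the loud assumption that the mail is stored as a Maildir on disk (one common Dovecot setup, but not necessarily this one). The function name archive_year and the prefix default are made up for illustration, and it is safer to run something like this while the server is not touching the same folder.

```python
import mailbox
from email.utils import parsedate_to_datetime


def archive_year(maildir_path, year, prefix="sentbox_"):
    """Move messages dated `year` into a subfolder named prefix + year.

    Assumes `maildir_path` is an existing Maildir (hypothetical layout).
    Messages with a missing or unparseable Date header are left alone.
    Returns the number of messages moved.
    """
    box = mailbox.Maildir(maildir_path, create=False)
    target = box.add_folder(f"{prefix}{year}")  # reuses the folder if it already exists
    moved = 0
    for key in list(box.keys()):  # copy the key list, since we remove while looping
        msg = box[key]
        try:
            when = parsedate_to_datetime(msg.get("Date", ""))
        except (TypeError, ValueError):
            continue
        if when.year == year:
            target.add(msg)   # write the copy first...
            box.remove(key)   # ...then delete the original
            moved += 1
    return moved
```

Calling archive_year("/path/to/Maildir/.Sent", 2013) would then produce a sentbox_2013 folder next to the remaining messages; the path is a placeholder.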
Regarding whether it's a good idea or a bad idea to keep them in terms of legal disputes and such:
Having the documents will allow someone to prove what was actually said. If you're a shady characte
Re: (Score:2)
The early version of the PST format has a file size limit of something like 2GB, at which point the whole database is at risk of becoming corrupt. So it is worth breaking it down into bite-size chunks (yearly?) that are easier to manage and archive.
Re: (Score:2)
Yes, it's so tragic whenever we see these stories about a lonely old hacker found dead in his apartment, trapped under a toppled pile of bits.
Get a grip. Our digital closets are growing much faster than our digital hoards. Space and indexing technologies are growing faster than our compulsion to accumulate plaintext. Keeping email is not a problem.
Re: (Score:2)
I recommend the one-folder-per-file mbox idea. Beats MailDir handily.