Extracting from mbox files
I need to extract some emails from a 600MB .mbox file, which is >99% spam. It's a dspam quarantine box that has gone unpruned and now contains some emails I need to remove from the quarantine; the whole mbox file can be deleted afterwards. Any suggestions? This is on a server to which I have SSH root access, so pretty much any non-GUI tool is open to me. I have tools that will handle it in principle, just not now that it's that size (e.g. the dspam web UI tools would be OK if I could just remove everything older than the last couple of weeks first). So an mbox pruning tool would be fine too. -- Mark Rogers // More Solutions Ltd (Peterborough Office) // 0845 45 89 555 Registered in England (0456 0902) at 13 Clarke Rd, Milton Keynes, MK1 1LG
Mark Rogers wrote:
So an mbox pruning tool would be fine too.
To answer my own question: I have discovered archivemail ("apt-get install archivemail", always a good sign :-)), which has happily deleted all old mail from the quarantine, allowing dspam's web UI to finish the job. Other suggested tools are still welcome though, as it would be useful to be able to weed out based on other criteria too (e.g. the addressee, subject, etc). -- Mark Rogers
On Thu, Jan 24, 2008 at 04:42:41PM +0000, Mark Rogers wrote:
Other suggested tools still welcome though, as it would be useful to be able to weed out based on other criteria too (eg the addressee, subject, etc).
Run mutt -f mboxfile, then type the sequence T~d >2w <return> ;d$ and you will have removed all the mail older than 2 weeks. T = tag by pattern, ~d = match by date, >2w = older than 2 weeks, then ; = apply the next command to the tagged mails, d = delete and $ = sync the mailbox. http://www.mutt.org Adam -- jabberid = quinophex@jabber.earth.li
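For a non-interactive run (from cron, say), the same "delete everything older than two weeks" step could be scripted. The following is only a rough sketch using Python's standard mailbox module, which nobody in this thread suggested; the function names and the 14-day default are made up for illustration. It keeps any message whose Date header is missing or unparseable, and it compares the sender-supplied Date header, which spam may forge.

```python
import mailbox
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_older_than(msg, cutoff):
    """Return True if the Date header parses and is earlier than cutoff."""
    raw = msg.get("Date")
    if raw is None:
        return False  # no Date header: keep the message to be safe
    try:
        sent = parsedate_to_datetime(str(raw))
    except (TypeError, ValueError):
        return False  # unparseable date: keep the message
    if sent.tzinfo is None:
        # Assumption: treat dates without a zone as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return sent < cutoff


def prune_older_than(path, days=14):
    """Delete messages older than `days` from the mbox at `path`.

    Hypothetical helper; returns the number of messages removed.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    box = mailbox.mbox(path)
    box.lock()  # avoid racing with a delivery agent writing to the file
    try:
        doomed = [key for key, msg in box.iteritems()
                  if is_older_than(msg, cutoff)]
        for key in doomed:
            box.remove(key)
        box.flush()  # rewrite the file without the removed messages
    finally:
        box.unlock()
        box.close()
    return len(doomed)
```

Calling prune_older_than("mboxfile") would then play the role of the T~d >2w ;d$ sequence. Work on a copy first: flush() rewrites the whole file, which takes a while at 600MB.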
Mark Rogers <mark@quarella.co.uk> wrote:
I need to extract some emails from a 600MB .mbox file, which is >99% spam. [...] Any suggestions?
Sounds like a job for mboxgrep. Hope that helps, -- MJ Ray http://mjr.towers.org.uk/email.html tel:+44-844-4437-237 - Webmaster-developer, statistician, sysadmin, online shop builder, consumer and workers co-operative member http://www.ttllp.co.uk/ - Writing on koha, debian, sat TV, Kewstoke http://mjr.towers.org.uk/
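If a dedicated tool is not to hand, weeding by addressee or subject can also be approximated in a few lines. This is a minimal sketch with Python's standard mailbox and re modules, not a description of how mboxgrep works; extract_matching and its parameters are hypothetical names. It copies matching messages into a second mbox and leaves the original untouched, so the big file can simply be deleted afterwards.

```python
import mailbox
import re


def extract_matching(src_path, dst_path, header, pattern):
    """Copy messages whose `header` matches `pattern` into a new mbox.

    Hypothetical helper; returns the number of messages copied.
    Matching is done on the raw header text, so RFC 2047 encoded
    headers (=?utf-8?...?=) are not decoded before comparison.
    """
    rx = re.compile(pattern, re.IGNORECASE)
    src = mailbox.mbox(src_path, create=False)  # fail if source is missing
    dst = mailbox.mbox(dst_path)                # created if it does not exist
    copied = 0
    try:
        for msg in src:
            # A header such as To may occur more than once; check them all.
            values = msg.get_all(header, [])
            if any(rx.search(str(v)) for v in values):
                dst.add(msg)
                copied += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return copied
```

For example, extract_matching("quarantine.mbox", "rescued.mbox", "To", r"someone@example\.org") would collect everything addressed to that (made-up) address; the file names are placeholders too.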
participants (3)
- Adam Bower
- Mark Rogers
- MJ Ray