Mar 30, 2011

Amailyser - The E-mail Analyser

Recently Ed Daniel (a friend of mine) needed to produce a time analysis report from his e-mails and asked if I could help him.  Well, obviously I said yes.


I thought to myself: he needs charts, graphs and ways of querying the data that I can't anticipate.  So to me the best option was to load the important fields of his e-mails into an RDBMS (such as SQLite or PostgreSQL).  Later he can do whatever he wants with the data using reporting tools such as JasperReports and produce nice charts and graphs as he requires.


That was the motivation to write Amailyser (on GitHub).

  • The language is Python 2.x; I could have done it in Perl, but I'm learning Python and it seemed like wonderful practice to me.
  • For DB interaction I used SQLAlchemy, a nice ORM for Python.
  • To read the e-mail messages I used the mailbox module from the Python standard library.
After downloading and extracting, you can run Amailyser in a terminal (after you have modified config.py to suit your needs):

$ cd amailyser/src

$ python main.py


The big picture is to read mailboxes, load each message, extract important fields and persist them in the database.
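To make that pipeline concrete, here is a minimal sketch of the same four steps. It is not the Amailyser code: it uses the standard library sqlite3 module in place of SQLAlchemy so that it runs without extra packages, it is written for Python 3 rather than 2.x, and the function names, the table layout and the choice of fields (sender, recipient, subject, date) are all my own assumptions for illustration.

```python
import mailbox
import sqlite3
from email.utils import parsedate_to_datetime


def _text(value):
    """Return a header value as plain text, or None if the header is missing."""
    return None if value is None else str(value)


def extract_fields(message):
    """Pull a few header fields out of one message (hypothetical field selection)."""
    sent_at = None
    if message["Date"] is not None:
        try:
            sent_at = parsedate_to_datetime(str(message["Date"])).isoformat()
        except (TypeError, ValueError):
            sent_at = None  # malformed Date header: store NULL instead of failing
    return (
        _text(message["From"]),
        _text(message["To"]),
        _text(message["Subject"]),
        sent_at,
    )


def load_mbox(path, connection):
    """Read every message in an mbox file and store its fields in SQLite."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS messages "
        "(sender TEXT, recipient TEXT, subject TEXT, sent_at TEXT)"
    )
    box = mailbox.mbox(path, create=False)
    try:
        rows = [extract_fields(message) for message in box]
    finally:
        box.close()
    connection.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", rows)
    connection.commit()
    return len(rows)
```

Calling load_mbox("/path/to/some.mbox", sqlite3.connect("mail.db")) would fill a messages table that a reporting tool can then query, for instance grouping by sender or by the date part of sent_at.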


Below is the structure of the source:
Amailyser source tree
  • config.py: This is where you can change the behaviour of Amailyser, such as which mailboxes to process and the type of each.
  • model.py: This is the database model, described in SQLAlchemy.
  • workhorse.py: Contains two classes, MailBox and MailMessage.  MailBox opens a mailbox in mbox or maildir format and processes each message inside using MailMessage, which (optionally) persists it to the database.
  • main.py: As a main module should, it doesn't do anything special.  It just calls the model setup routines and iterates over the list of mailboxes, passing each one to MailBox.
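As a rough illustration of how a list of mailboxes with a type each can drive the main loop, here is a small sketch.  The MAILBOXES list, its dictionary keys and the helper names are hypothetical; the real config.py and MailBox class may look quite different.  Only mailbox.mbox and mailbox.Maildir are actual standard library classes, and the sketch targets Python 3.

```python
import mailbox

# Hypothetical stand-in for the list kept in config.py:
# each entry names a mailbox and says which format it is in.
MAILBOXES = [
    {"path": "/path/to/inbox.mbox", "type": "mbox"},
    {"path": "/path/to/Maildir", "type": "maildir"},
]

# Map each supported type to the standard library class that opens it.
OPENERS = {
    "mbox": lambda path: mailbox.mbox(path, create=False),
    "maildir": lambda path: mailbox.Maildir(path, factory=None, create=False),
}


def open_mailbox(entry):
    """Open one configured mailbox, rejecting types we don't know about."""
    try:
        opener = OPENERS[entry["type"]]
    except KeyError:
        raise ValueError("unsupported mailbox type: %r" % entry["type"])
    return opener(entry["path"])


def count_messages(entries):
    """Iterate over the configured mailboxes and count the messages in each."""
    totals = {}
    for entry in entries:
        box = open_mailbox(entry)
        try:
            totals[entry["path"]] = len(box)
        finally:
            box.close()
    return totals
```

Here the loop only counts messages; in Amailyser this is the point where each message would be handed on for field extraction and persistence.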
To install SQLAlchemy, please visit the tutorial on its website.
