Thread Compressor for Outlook – do you want it?

Here’s an appeal – nearly 8 years ago, I wrote* a little COM add-in for the then-new Outlook 2000, which “compressed threads”. The idea was that it could take an email thread (e.g. a discussion spread over a period of time and a number of responses, from any number of people, and typically sent to a distribution list) and compress that thread down to the salient points. It has evolved over a few iterations since but has been largely dormant for the best part of 5 years – it does everything I need it to do, so I’ve never developed it any further (and if truth be told, a hard disk crash blew away the source for the last version and I could never face going back to a previous beta and redoing the changes I’d made).

I’d like to understand if anyone else would like it.

The basic assumption with Thread Compressor is that when people reply using Outlook, they tend to add all their comments at the top – some do inline replies, but most eschew that – and don’t edit the original contents. If this assumption holds true, then any discussion thread can be compressed down to just the final email, or the final post (to a public folder), since that message will contain the entire history of its branch of the thread. Of course, there may be multiple branches of the thread, and Thread Compressor handles that.
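To make that assumption concrete, here is a rough Python sketch of the idea. It is emphatically not the real Thread Compressor algorithm (see the footnote), just an illustration: the `Message` type, `normalize` and `redundant_messages` are names made up for this example, and it assumes plain-text bodies quoted with ">" markers rather than Outlook's actual reply formatting with its "From:/Sent:" header blocks. A message counts as redundant if its whole text shows up, unedited, inside some longer message in the same thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for an email; not an Outlook object."""
    msg_id: str
    body: str


def normalize(text: str) -> str:
    """Drop leading quote markers and collapse whitespace, line by line,
    so that a quoted copy of a message compares equal to the original."""
    lines = []
    for line in text.splitlines():
        stripped = line.lstrip("> \t").strip()
        if stripped:
            lines.append(" ".join(stripped.split()))
    return "\n".join(lines)


def redundant_messages(thread: list[Message]) -> set[str]:
    """Return the ids of messages whose full text is contained in a
    longer message from the same thread.

    Anything not returned is the tip of a branch and should be kept.
    Exact duplicates are deliberately left alone (neither is 'longer').
    """
    normalized = {m.msg_id: normalize(m.body) for m in thread}
    redundant = set()
    for msg in thread:
        body = normalized[msg.msg_id]
        if not body:
            continue  # nothing to compare, so play safe and keep it
        for other in thread:
            if other.msg_id == msg.msg_id:
                continue
            other_body = normalized[other.msg_id]
            if len(other_body) > len(body) and body in other_body:
                redundant.add(msg.msg_id)
                break
    return redundant
```

Given an original mail, a reply, a reply to that reply, and a second reply straight to the original, the sketch flags the first two as redundant and keeps the tips of both branches. A simple substring test like this would be fooled by very short messages (an "ok" can appear anywhere), so treat it as a picture of the principle and nothing more.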

The first time many people run TC on a large folder, it will routinely get rid of 50% or more of the content, so it proves useful for slimming down folders where you archive stuff, or folders where distribution list traffic is filed by Outlook rules, never to be read but to be indexed by Windows Desktop Search or similar.

In my last run, I scanned almost 1GB of email and Thread Compressor discovered about 21MB of mail which could be removed… not quite as dramatic as 50%, but it saved me reading over 1,000 emails and it reduced my mailbox size a little…

There are a few obvious benefits to thread compression…

  • It reduces the size of your mailbox, so it keeps you under quota
  • It removes spurious email so you have less stuff to plough through
  • When searching, it reduces the number of hits since it won’t return every mail in a thread which contains the same word(s)

… but some obvious potential downsides…

  • The assumption described earlier in this post. If I reply to someone’s email, but change the contents of their original message in the reply, then TC will retain the modified version and it will look like the originator really said that. There may be ways to work around this limitation now, but I never bothered to figure them out.
  • Legal compliance – maybe you need to keep a copy of every mail for compliance purposes: if so, users programmatically deleting messages could be a *bad thing*.
  • erm, can’t think of any/many more…

If you think this kind of functionality should be either built into Outlook or available as an opt-in addon, then please let me know. We have many thousands of regular users of Thread Compressor inside Microsoft. It would be cool to think of millions more outside as well…

//Ewan

* The really smart bit of TC was actually put together by a guy called Peter Lamsdale. All I did was take his algorithm – which I still have difficulty understanding, much less explaining – and strap a UI around it. An earlier version of TC was published (unofficially) on a website and an article was written about it by Evan Morris. There is even an unconnected bit of MSDN sample code which is nowhere near as effective (IMHO).
