This page contains a list of new features and capabilities we have planned for TMDA 1.2 or other future releases. (Note that some features have already been implemented in TMDA 1.1.x.) This is a lot of work, so what actually gets implemented depends on our free time, participation of new developers, etc. It can therefore also be considered a "wish list". If you are a developer, feel free to add your own items to this list. Anyone with sufficient interest and skills is encouraged to work on any of these. Discussion of these items should take place on the tmda-workers mailing list. Also, please check with the list before starting work to make sure an item hasn't already been finished.
Items that are struck out are assumed to be completed.
Architecture / Code Organization
The application code in tmda-rfilter should be moved into TMDA/ module(s) to realize the speed benefits of pre byte-compiled .pyc files. What should this module(s) be named? IncomingMessage, IncomingMail?
- The command-line argument processing code from tmda-rfilter should be moved into tmda-filter. Then tmda-rfilter can be removed. This should result in a relatively small amount of code in tmda-filter -- just the getopt/optparse stuff followed by some module imports and some object creation/manipulation.
- Similarly, the getopt code should be moved from tmda-inject to tmda-sendmail. Since tmda-sendmail also supports its own set of command-line arguments, conflicts with tmda-inject will have to be resolved. Then tmda-inject can be removed.
From a question on tmda-users, a separation between the program that applies address tagging to the message (implemented as a pure Unix filter) and the program that runs the filter (once for each recipient) and does the actual sending (as specified by MAIL_TRANSPORT) might be useful.
* Use of getopt for the command-line option handling code in the tmda-* programs needs to be replaced with Python's new optparse module. The getopt code is messy and inflexible, and optparse offers a nicer, more powerful Pythonic replacement.
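As a rough illustration of what an optparse-based front end might look like, here is a minimal sketch. The option names and the build_parser/parse_args helpers are made up for the example; they are not the actual tmda-filter options.

```python
from optparse import OptionParser


def build_parser():
    """Return a parser for a hypothetical tmda-* style program."""
    parser = OptionParser(usage="%prog [options]")
    # Hypothetical options, for illustration only.
    parser.add_option("-c", "--config-file", dest="config_file",
                      metavar="FILE",
                      help="use FILE instead of the default config")
    parser.add_option("-d", "--discard", dest="discard",
                      action="store_true", default=False,
                      help="discard the message instead of bouncing it")
    return parser


def parse_args(argv):
    """Parse argv (without the program name); return (options, leftovers)."""
    return build_parser().parse_args(argv)
```

Help text, defaults and type handling are declared next to each option, which is most of what the hand-written getopt loops currently have to do by hand.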
Python versions and libraries
- General Python 2.3ization of any and/or all code, e.g., replace all uses of open() with file().
* Since we now require Python 2.3 or above, upgrade to email module v3.0 or v4.0, and switch to FeedParser for reasons outlined in http://thread.gmane.org/gmane.mail.spam.tmda.devel/6401
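For readers unfamiliar with FeedParser, the sketch below shows its incremental interface: data is fed in arbitrary chunks and the message object is built on close(). The parse_incrementally helper is a name invented for this example.

```python
from email.feedparser import FeedParser


def parse_incrementally(chunks):
    """Feed text chunks to a FeedParser and return the parsed message."""
    parser = FeedParser()
    for chunk in chunks:
        # Chunks need not end on line boundaries.
        parser.feed(chunk)
    return parser.close()
```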
* Create a pending queue abstraction to support more than just one type of queue structure for "pending" messages, and allow sites with specific needs to write their own pending queue description.
* Once the above is done, a MaildirQueue, which is a Maildir++ compatible pending queue. The performance of such a queue will be slightly worse than the current one, but it will be more reliable and compatible with many mail readers, IMAP servers, etc., and will allow system administrators to enforce disk quotas.
- Another idea is a MySQLQueue, where the pending queue is actually a MySQL db. I'd need to bone up on my knowledge of MySQL first though.
- Similarly, a "sender-based pending queue" where the messages are organized in subfolders based on sender, so that multiple messages can be released with one confirmation. The question is whether we should separate the structure from the functionality or not. I think I'd rather have the option to release all messages from a sender regardless of what format the pending queue is in.
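One way to keep structure and functionality separate, sketched under the assumption of a small base class: storage backends implement a few primitives, and features such as "release everything from this sender" are written once on top of them. All class and method names here are hypothetical, and MemoryQueue is only a toy stand-in for a Maildir or MySQL backend.

```python
class PendingQueue:
    """Hypothetical base class for pending queue backends."""

    def insert(self, msgid, sender, message):
        raise NotImplementedError

    def remove(self, msgid):
        raise NotImplementedError

    def items(self):
        """Return a list of (msgid, sender, message) tuples."""
        raise NotImplementedError

    def release_all_from(self, sender):
        """Remove and return every pending message from `sender`.

        Written only in terms of the primitives above, so it works
        for any storage format.
        """
        released = []
        for msgid, msg_sender, message in self.items():
            if msg_sender == sender:
                self.remove(msgid)
                released.append(message)
        return released


class MemoryQueue(PendingQueue):
    """Toy in-memory backend used only to exercise the interface."""

    def __init__(self):
        self._store = {}

    def insert(self, msgid, sender, message):
        self._store[msgid] = (sender, message)

    def remove(self, msgid):
        del self._store[msgid]

    def items(self):
        # Build a fresh list so callers may remove entries while iterating.
        return [(m, s, msg) for m, (s, msg) in self._store.items()]
```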
- Also write conversion code that automatically converts a user's pending queue for them. If the available formats are say 'original', 'maildir', and 'mysql', the user can specify which storage format he wants in the .tmda/config. If tmda-filter tries to deliver to the queue and notices it's not in the right format, the pending queue will be converted to the target format, and then delivery resumed.
The question is how to do this safely. How about borrowing an idea from dot-qmail -- if the sticky bit is set on $HOME, TMDA will defer message delivery until the directory isn't sticky any longer. So the code can turn this on, perform the conversion, and then turn it off again. It will also provide a nice feature for users who wish to defer delivery by hand. They can chmod +t $HOME and chmod -t $HOME to stop and restart delivery.
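A minimal sketch of the sticky-bit check and toggle, assuming a POSIX system. The function names are invented for the example and are not existing TMDA code.

```python
import os
import stat


def delivery_deferred(home):
    """Return True if the sticky bit is set on the directory `home`."""
    return bool(os.stat(home).st_mode & stat.S_ISVTX)


def set_deferred(home, deferred):
    """Set (chmod +t) or clear (chmod -t) the sticky bit on `home`."""
    mode = stat.S_IMODE(os.stat(home).st_mode)
    if deferred:
        mode |= stat.S_ISVTX
    else:
        mode &= ~stat.S_ISVTX
    os.chmod(home, mode)
```

Conversion code would call set_deferred(home, True), convert the queue, and clear the bit again in a finally clause so that a failed conversion does not leave delivery deferred forever.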
The other question is how to handle the transfer of messages from one pending queue format to another, which may be wildly different (e.g., a SQL db to a Maildir). Since a lot of the code seems to pass around messages as email.Message objects, perhaps we could just serialize each message object using pickle, write them into MAILID.pck files in a temp directory created with tempfile.mkdtemp, then unserialize them and insert them into the new queue one by one.
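The pickle round trip could look roughly like this. It is a sketch only: the helper names are hypothetical, and the caller is assumed to create the directory with tempfile.mkdtemp(), which makes it private to the user. That matters because unpickling data from an untrusted location is unsafe.

```python
import os
import pickle


def dump_messages(messages, directory):
    """Pickle each (mailid, message) pair into directory/MAILID.pck."""
    for mailid, message in messages:
        path = os.path.join(directory, mailid + ".pck")
        with open(path, "wb") as f:
            pickle.dump(message, f)


def load_messages(directory):
    """Yield (mailid, message) pairs unpickled from the .pck files."""
    for name in sorted(os.listdir(directory)):
        if name.endswith(".pck"):
            with open(os.path.join(directory, name), "rb") as f:
                yield name[:-len(".pck")], pickle.load(f)
```

The old queue would be drained through dump_messages(), and the new queue filled by iterating over load_messages() and inserting each message in turn.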
* This might be open for debate, but I'm considering removing the :3,R and :3,C status of released/confirmed pending messages. Messages will simply be unlinked after release or confirm. My reasoning is that this makes the code extremely complex with little to no benefit. It is also non-intuitive behavior, as users expect the message to be deleted after release/confirm (see FAQ 4.4).
- I want to switch to $string for all substitutions. %(string)s is too error prone. This is also on Mailman's TODO list for 2.2, so we might be able to just steal their code.
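For reference, Python 2.4 and later ship string.Template, which implements this $string style. A minimal sketch follows; the render helper and the placeholder names in the usage note are made up for the example.

```python
from string import Template


def render(template_text, values):
    """Substitute $name and ${name} placeholders.

    safe_substitute() leaves unknown placeholders untouched instead of
    raising, which is forgiving for hand-edited templates.
    """
    return Template(template_text).safe_substitute(values)
```

For example, render("Confirm via ${confirm_address}", {"confirm_address": "x@example.com"}) fills in the address, and a literal dollar sign is written as $$.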
* Remove BOUNCE_TEXT_* and CONFIRM_ACCEPT_* in favour of multiple templates, one for each condition. This might mean some content duplication in the template collection and more editing, but it's much simpler. It also makes things much easier for users of non-English templates.
- Offer "Tagged Message-IDs" as an alternative to Tagged Addresses. Instead of tagging the envelope and header of an outgoing message, only the Message-ID of the outgoing message is tagged (sender, dated, keyword, etc). The Message-IDs in the References and/or In-Reply-To fields of incoming messages are checked and tagging is validated.
The logging system needs to be completely reworked using Python's new logging module. The current logging format is hardcoded, and not suited for everyone's needs. The new system should allow complete customization of the format and delivery of logging information. A sensible default (such as the current format) should be chosen.
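A small sketch of how the logging module separates format from delivery. The logger name, the make_logger helper and the default format string are assumptions for the example; they do not reproduce TMDA's current log format.

```python
import logging


def make_logger(stream, fmt="%(asctime)s %(levelname)s %(message)s"):
    """Return a logger writing records to `stream` using format `fmt`."""
    logger = logging.getLogger("tmda.example")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    # Drop handlers from earlier calls so records are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(fmt))
    logger.addHandler(handler)
    return logger
```

The format string would come from the user's configuration, and the StreamHandler could be swapped for a file or syslog handler without touching the code that emits log records.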
Local mail delivery
- Integrate Tim's work on multiple delivery instructions. Discussed extensively on tmda-workers in November 2003.
Add "numerical comparison" capability. Suggested by Bernard Johnson. See the thread http://thread.gmane.org/gmane.mail.spam.tmda.user/11330
- Provide a way to bare=append an entire domain of the recipient rather than only the individual e-mail address.
Investigate a '-decode' option to body* and headers* to allow matching of encoded content. See http://permalink.gmane.org/gmane.mail.spam.tmda.devel/5724
- Investigate a way to trigger processing of tagged addresses using the incoming filter language instead of only after the entire filter has been processed.
- Michael Bishop has suggested adding ok=append for the incoming filter.
Erik Max Francis has suggested adding "Negating filter rules".
* A 'pipe-headers' incoming filter file source, which instead of piping the entire contents of a message, simply pipes the headers. This can be helpful when you want to do header processing with an external program, but don't care about the (sometimes enormous) body.
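A sketch of the idea, assuming the raw message is available as bytes: cut at the first blank line and hand only that part to the external program. The function names are invented for the example, and subprocess.run is used for brevity even though it is newer than the Python versions discussed on this page.

```python
import re
import subprocess

# The header block ends at the first empty line (LF or CRLF line endings).
_HEADER_END = re.compile(rb"\r?\n\r?\n")


def header_block(raw_message):
    """Return the headers of raw_message (bytes), including the blank line."""
    match = _HEADER_END.search(raw_message)
    if match is None:
        return raw_message  # no body at all
    return raw_message[:match.end()]


def pipe_headers(raw_message, command):
    """Run `command` (an argv list) with only the headers on stdin.

    Returns (exit status, captured stdout).
    """
    proc = subprocess.run(command, input=header_block(raw_message),
                          stdout=subprocess.PIPE)
    return proc.returncode, proc.stdout
```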
Support true regular expressions in the <match> field of a filter rule in addition to (or instead of?) our homebrew wildcard syntax.
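To illustrate the "in addition to" variant, the sketch below lets a rule choose between the two styles. Note the assumptions: fnmatch-style globbing stands in for TMDA's own wildcard syntax, which it does not reproduce exactly, and the use_regex switch is hypothetical.

```python
import fnmatch
import re


def matches(pattern, address, use_regex=False):
    """Case-insensitively match `address` against `pattern`."""
    if use_regex:
        return re.search(pattern, address, re.IGNORECASE) is not None
    # Stand-in for the homebrew wildcard syntax.
    return fnmatch.fnmatchcase(address.lower(), pattern.lower())
```

A regular expression such as ^bob(\+[^@]*)?@example\.com$ can express "bob with an optional +extension", which a plain wildcard can only approximate.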
Integrate Jeremy Rossi's (to|from)-ldap patch.
- Remove duplicate authentication code from tmda-ofmipd. Use Auth.py instead.
* The imaplib.IMAP4_SSL overload in Auth.py should also be removed since this comes standard with Python 2.3. Remove Python version checks from tmda-ofmipd.
* Add "passthrough" functionality to tmda-ofmipd so that non-TMDA users can send mail through tmda-ofmipd after authenticating. In this case, tmda-ofmipd would simply reinject the message without any header/envelope alteration. Without this the admin must establish different policies for TMDA users (e.g., all TMDA users use port 8025, all non-TMDA users use port 25).
* Investigate/integrate Stephen Warren's tcpserver/stunnel patch
StephenWarren: Doesn't Python 2.3 (or some more recent version than 1.0.x required) have built-in SSL socket support, so since we require 2.3, we could piggy back off that, and avoid stunnel altogether? Update: Python only seems to have client support for SSL sockets, not server support:-( Perhaps this can be optionally enabled if certain external packages are installed. I'll keep looking into it...
StephenWarren: On the "wild ideas" front, I always wanted an ofmipd daemon that would automatically sign all my outbound email with my GPG key. Perhaps tmda-ofmipd could morph into a generic SMTP processor, that would accept email, apply a pluggable set of transforms to it, then send it out. I'd also like to see things such as virtual users support pluggable via a well-defined interface. Basically, a plugin interface for each of:
- Authentication plugins (pam, database, other IMAP server, inherit from current uid/gid, ...)
- User info plugins (map user "name" to uid/gid/home/...)
- List of processing to apply to messages
- Email sink (/bin/sendmail, SMTP params, ...)
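The four plugin points listed above could be wired together roughly as follows. This is only a sketch of the shape of such an interface: the Pipeline class and the callable signatures are hypothetical, and plain callables stand in for real plugins.

```python
class Pipeline:
    """Hypothetical core of a generic SMTP processor."""

    def __init__(self, authenticate, lookup_user, transforms, sink):
        self.authenticate = authenticate  # (username, password) -> bool
        self.lookup_user = lookup_user    # username -> user info
        self.transforms = transforms      # list of (user, message) -> message
        self.sink = sink                  # (user, message) -> result

    def process(self, username, password, message):
        """Authenticate, apply every transform in order, then deliver."""
        if not self.authenticate(username, password):
            raise PermissionError("authentication failed")
        user = self.lookup_user(username)
        for transform in self.transforms:
            message = transform(user, message)
        return self.sink(user, message)
```

Address tagging, GPG signing, or the "passthrough" case (an empty transform list) would then just be different configurations of the same daemon.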
Mailing list integration
- Possibly read a mailing list file (~/.lists) containing addresses of subscribed lists and treat them differently in order to be more list friendly, e.g., accept messages from those lists automatically, and inhibit auto-responses directed at those lists. Based on discussion with Tim about how to solve problems with forged spam causing a TMDA confirmation request to be sent to mailing lists (freebsd-stable in this case).
* Perhaps independent of the above, add a TMDAMFTFILE feature similar to qmail's QMAILMFTFILE since non-qmail MTAs don't have such a feature.
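The ~/.lists idea might be sketched like this. The file format (one address per line, with # comments) and both function names are assumptions made for the example, since no format has been decided.

```python
from email.utils import getaddresses


def load_lists(path):
    """Read list addresses from `path`, one per line; '#' starts a comment."""
    addresses = set()
    with open(path) as f:
        for line in f:
            line = line.split("#", 1)[0].strip()
            if line:
                addresses.add(line.lower())
    return addresses


def is_list_traffic(message, lists):
    """Return True if any To or Cc address is a subscribed list."""
    fields = message.get_all("to", []) + message.get_all("cc", [])
    for _name, address in getaddresses(fields):
        if address.lower() in lists:
            return True
    return False
```

A check like is_list_traffic() could be used to suppress confirmation requests, so that forged spam sent through a list never triggers an auto-response to that list.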
Unit testing framework
Write a testsuite for TMDA based on unittest to automate testing and validation of core TMDA functionality. Should result in more robust software, and less breakage after code is added/changed. Contributors of new code will also be encouraged to add corresponding test cases.
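To give contributors an idea of the shape of such a test case, here is a minimal unittest sketch. The bare_address function is a toy stand-in defined inside the example, not actual TMDA code, and run_suite is just a convenience wrapper.

```python
import unittest


def bare_address(address, sep="-"):
    """Toy function under test: drop the extension from user-ext@domain."""
    local, _, domain = address.rpartition("@")
    return local.split(sep, 1)[0] + "@" + domain


class BareAddressTest(unittest.TestCase):
    def test_tagged_address(self):
        self.assertEqual(bare_address("jason-dated-123.abc@example.com"),
                         "jason@example.com")

    def test_untagged_address(self):
        self.assertEqual(bare_address("jason@example.com"),
                         "jason@example.com")


def run_suite():
    """Run the test case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(BareAddressTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```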
Performance / Scalability
Currently both tmda-filter and tmda-sendmail read the entire message into memory, which can lead to system performance degradation on underpowered and/or poorly administered systems. Investigate a robust alternative to this behavior. Perhaps the strategy that maildrop uses is sufficient. From maildrop(1):
maildrop is heavily optimized and tries to use as little resources as possible. maildrop reads small messages into memory, then filters and/or delivers the message directly from memory. For larger messages, maildrop accesses the message directly from the file. If the standard input is not a file, maildrop writes the message to a temporary file, then accesses the message from the temporary file. The temporary file is automatically removed when the message is delivered.
NOTE: There has been extensive discussion on the email-sig mailing list about adding this functionality to the parser, and code and patches have been exchanged, so I think it's better to wait until they implement it.
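For comparison, the maildrop strategy can be approximated with the standard library's tempfile.SpooledTemporaryFile, which keeps data in memory up to a threshold and transparently switches to a temporary file beyond it. Note that this class only appeared in Python 2.6, so it is newer than the Python versions discussed here; the threshold value and the spool_message name are assumptions.

```python
import shutil
import tempfile

MAX_IN_MEMORY = 64 * 1024  # assumed threshold, in bytes


def spool_message(stream, max_size=MAX_IN_MEMORY):
    """Copy a binary stream into a spool file and rewind it.

    Messages up to max_size bytes stay in memory; larger ones are
    written to a temporary file that is removed when the spool is closed.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=max_size)
    shutil.copyfileobj(stream, spool)
    spool.seek(0)
    return spool
```

This only addresses how the raw message is held; the parsed message object would still be built in memory unless the parser itself learns to work from a file.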
Investigate the possibility of a split-resources, daemonized version of tmda-filter similar to SA's spamc/spamd. Nils Vogels explains at http://permalink.gmane.org/gmane.mail.spam.tmda.user/11629
Investigate whether the 3rd party popen5 module offers us significant advantages over use of the standard popen2 module. From its description, it seems like it might. For one thing, it might simplify Util.pipecmd() which is used heavily in TMDA.
- Lazy body processing. If the body of a message is never needed by any incoming filter rule, the body is simply not read.
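A sketch of lazy body processing, assuming the message arrives on a text stream: the headers are read up to the first blank line, and the rest of the stream is touched only if some rule asks for the body. The LazyMessage class and its attribute names are hypothetical.

```python
from email.parser import HeaderParser


class LazyMessage:
    """Parse headers eagerly; read the body only on first access."""

    def __init__(self, stream):
        self._stream = stream
        lines = []
        for line in stream:
            lines.append(line)
            if line in ("\n", "\r\n"):  # blank line ends the headers
                break
        self.headers = HeaderParser().parsestr("".join(lines))
        self._body = None
        self.body_read = False

    @property
    def body(self):
        if self._body is None:
            # First access: consume the remainder of the stream.
            self._body = self._stream.read()
            self.body_read = True
        return self._body
```

Rules that only look at headers would use msg.headers and never trigger the read; body* rules would access msg.body.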
Document the shell= and python= outgoing filter file tag actions.
- Lots of people aren't using the best-practices configuration for TMDA. For example, when I post a response to a mailing list and CC the original author, I have to confirm the message. We should better document how to set up TMDA for a best-practices configuration (e.g., dated sender addresses, and "munged" message IDs for solving that mailing list issue).
JRM: someone should write a TmdaBestPractices page.
* Bernard Johnson has requested integration of two of his patches, tmda-rfilter-environ.patch and Header.py.encoding.patch. For the latter, I'd like to wait until the Python email lib is upgraded in TMDA, and see if that solves the problem with a custom patch.