How not to design an MTA - part 1 - the sendmail command

16th Feb 2006 | 23:58

(I say part 1, but don't expect the sequels to arrive quickly.)

The sendmail command is the de facto standard API for submitting email on unix, whether or not it is implemented by Sendmail. All other MTAs have a sendmail command that is compatible with Sendmail for important functions. (Notice how I pedantically use the uc/lc distinction.)

The traditional implementation

Sendmail and its early successors (including Exim) have been setuid root programs that implement all of the MTA functions. They are also decentralized, in that each instance of Sendmail or Exim does (mostly) the whole job of delivering a message without bothering much about what else is going on on the system. The combination of these facts is bad:

(1) A large setuid root program is a serious vulnerability waiting to happen. Sendmail has a long history of problems; Exim is lucky owing to conscientiousness rather than to good architecture.

(2) Particularly subtle problems arise from the effects of what sendmail inherits from its parent process, such as the environment and file descriptors. For example, consider sendmail invoked by a CGI. If the web server is careless and doesn't mark its listening socket close-on-exec, the socket is inherited by the CGI and thence sendmail, which may then take ages to deliver the message. You can't restart the web server while this is going on, because a sendmail process is still listening on port 80, which means you can't restart the web server at all if the CGI is popular.
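The close-on-exec flag mentioned in (2) can be inspected and set as in the following rough sketch. It assumes a POSIX system and Python 3.4 or later (where new sockets are already non-inheritable by default, so the careless case has to be switched on by hand). The helper name is hypothetical and the socket merely stands in for a web server's listening socket.

```python
import fcntl
import socket

def is_close_on_exec(sock: socket.socket) -> bool:
    """Return True if the descriptor will be closed when the process calls exec()."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
    return bool(flags & fcntl.FD_CLOEXEC)

# Stand-in for the web server's listening socket (never bound here).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Careless server: the descriptor survives exec() into the CGI and then sendmail.
listener.set_inheritable(True)
careless = is_close_on_exec(listener)

# Careful server: the descriptor is closed as soon as a child calls exec().
listener.set_inheritable(False)
careful = is_close_on_exec(listener)

listener.close()
```

In the careless state any long-running descendant keeps the socket open, which is exactly the situation that blocks the web server restart.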

(3) Independent handling of messages makes load management very difficult. This is not only the load on the local machine, but also the load it imposes on the recipients. Sendmail and Exim lack the IPC necessary to be able to find out if the load is a problem before it is too late.

The qmail approach

Message submission in qmail is done with the qmail-inject program. This performs some header fix-ups, and it can extract the message envelope from the header rather than taking the envelope as separate arguments (like sendmail -t as opposed to plain sendmail). It then calls the setuid qmail-queue to add the message to the queue.
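To illustrate what "extracting the envelope from the header" means, here is a minimal sketch of the recipient half of that job using Python's standard email package. The function name and the example addresses are made up for illustration, and this is not how qmail-inject or sendmail -t are actually implemented; real programs have more to do, such as dealing with Resent- fields and removing Bcc: before delivery.

```python
from email import message_from_string
from email.utils import getaddresses
from typing import List

def envelope_recipients(raw_message: str) -> List[str]:
    """Collect bare recipient addresses from the To, Cc and Bcc header fields."""
    msg = message_from_string(raw_message)
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(msg.get_all(name, []))
    # getaddresses() yields (display name, address) pairs; keep the address.
    return [addr for _name, addr in getaddresses(fields) if addr]

raw = (
    "From: alice@example.org\n"
    "To: Bob <bob@example.org>, carol@example.org\n"
    "Cc: dave@example.org\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
recipients = envelope_recipients(raw)
```

With plain sendmail the caller would pass those three addresses as arguments instead.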

(4) The simple, braindead qmail-queue program does not impose any policy checks on messages it accepts, because that would be too complicated and therefore liable to errors. The fix-ups performed by qmail-inject are within the user's security boundary, not the MTA's, so they are a courtesy rather than a requirement.

The Postfix approach

Postfix is very similar to qmail as far as message submission is concerned, except that rather than fixing up a message, its sendmail command transforms the message into Postfix's internal form before handing it to postdrop, which drops it in a queue. The fix-ups are performed later by the cleanup program, which also operates on messages received over the network. Which brings us to:

(5) Sendmail, qmail, and Postfix do not have an idea of message submission versus message relay. For example they tend to fix up all messages, wherever they come from - or in qmail's case, not fix them at all.

Step back a bit

So what are the requirements?

(a) A clear security boundary between local users and the MTA. Note that all the MTAs rely on setuid or setgid programs that insert messages directly into the queue. Postfix and qmail ensure they are relatively small and short-lived, but they are still bypassing the most security-conscious part of the MTA, i.e. the SMTP server. This opens up an extra avenue for attack - albeit only for local users. But why do they need special privileges?

(b) Policy checks on email from local users along the same lines as those from remote MUAs. Is this message submission or message relay? Does it need to be scanned for viruses? What are the size limits? Does address verification imply this user (e.g. nobody) cannot send email at all?

If you have a sophisticated system for SMTP server policy checks, why bypass it for local messages? Exim can sort-of do what I want, but it retro-fits the policy checks onto the wrong architecture.

The fanf approach

The sendmail program is a very simple RFC 2476 message submission client: it talks SMTP to a server and expects the server to do the necessary fix-ups. It doesn't need any special privilege: from the server's point of view it is just another client.
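A sketch of how small such a client could be, assuming it is handed an already-connected stream socket to the server. The function names and the choice of 0/1 as exit statuses are illustrative only; the sketch ignores SMTP extensions and treats any reply other than the expected class as a failure.

```python
import socket

def read_reply(rfile) -> int:
    """Read one (possibly multi-line) SMTP reply and return its three-digit code."""
    while True:
        line = rfile.readline()
        if not line:
            raise ConnectionError("server closed the connection")
        # "250-..." continues a multi-line reply; "250 ..." ends it.
        if line[3:4] != b"-":
            return int(line[:3])

def submit(sock: socket.socket, sender: str, recipients, message: bytes) -> int:
    """Submit one message; return 0 if the server accepted it, 1 otherwise."""
    rfile = sock.makefile("rb")

    def expect(first_digit: int, line=None) -> bool:
        # Optionally send a command, then check the class of the reply.
        if line is not None:
            sock.sendall(line.encode("ascii") + b"\r\n")
        return read_reply(rfile) // 100 == first_digit

    ok = (expect(2)                                   # server greeting
          and expect(2, "EHLO localhost")
          and expect(2, f"MAIL FROM:<{sender}>")
          and all(expect(2, f"RCPT TO:<{r}>") for r in recipients)
          and expect(3, "DATA"))
    if not ok:
        return 1
    for line in message.splitlines():
        if line.startswith(b"."):
            line = b"." + line                        # dot-stuffing
        sock.sendall(line + b"\r\n")
    sock.sendall(b".\r\n")                            # end of message
    accepted = read_reply(rfile) // 100 == 2
    sock.sendall(b"QUIT\r\n")
    return 0 if accepted else 1
```

Everything interesting (fix-ups, policy, queueing) happens on the server side of the socket; the client only relays the verdict to its caller.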

It's not quite as simple as that. You need to authenticate the local user to the server, because users should not be able to fake Sender: fix-ups, and there are situations when you will want to treat users differently, e.g. email from a mailing list manager. So instead of SMTP over TCP, talk it over a unix domain socket, which allows unforgeable transmission of the client user ID.
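How the server might learn the client's user ID is sketched below. This assumes Linux, where the SO_PEERCRED socket option returns credentials recorded by the kernel rather than anything the client sent; other systems offer similar facilities under different names. The function name is hypothetical, and a socketpair stands in for a real listening socket and its accepted connection.

```python
import os
import socket
import struct

def peer_uid(conn: socket.socket) -> int:
    """Return the user ID of the process at the other end of a unix domain socket.

    Linux-specific: SO_PEERCRED yields a struct ucred (pid, uid, gid)
    filled in by the kernel, so the client cannot forge it.
    """
    fmt = "iII"  # pid_t, uid_t, gid_t
    creds = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                            struct.calcsize(fmt))
    _pid, uid, _gid = struct.unpack(fmt, creds)
    return uid

# Stand-in for an accepted connection: both ends belong to this process.
server_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
uid_seen_by_server = peer_uid(server_end)
server_end.close()
client_end.close()
```

The policy engine can then key its decisions on that user ID, for example to allow a mailing list manager's account to set its own sender addresses.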

Problem (1) solved: no setuid or setgid programs.
Problem (2) solved: client process is short-lived and synchronous.
Problem (3) solved: messages all go through the same channel.
Problem (4) solved: messages all go through the same policy engine.
Problem (5) solved: the policy engine is powerful enough to know when submission-mode fix-ups are necessary.

One thing you do lose by this approach is that the sendmail command only works when the SMTP listener is running, which is not a problem with the other designs. But I'm not convinced this is a serious difficulty, and in fact it can be viewed as an advantage - it doesn't let email silently disappear into an unserviced queue.

A question arises with this architecture - which also arises for remote MUAs - namely, where is the best place to generate the message envelope, i.e. the transport-level sender and recipient addresses? RFC 2476 says that the client does this job; however, no-one has written a decent sendmail -t replacement, and even "serious" MUAs get this job wrong. Furthermore, the server still has to parse the header and perform various checks and fix-ups, so why shouldn't it generate the envelope too? Hence draft-fanf-smtp-rcpthdr, which also has the best description of how to handle submission-time fix-ups of re-sent messages.


Comments {5}

Simon Tatham

from: simont
date: 17th Feb 2006 09:15 (UTC)

What happens if the SMTP listener gives a 4xx? Clearly 2xx means sendmail returns success and 5xx means it returns failure. Does 4xx cause the sendmail program to fork and sit in the background retrying? :-)

The thing that's always annoyed me about the /usr/lib/sendmail interface is that if you let it make up extra headers for you (such as Message-ID) there's no way to get them back usefully. As an MUA author, I'd rather not have to make up my own Message-IDs, because the uniqueness is a lot of hassle and if someone else has already solved the problem it would seem foolish to duplicate their effort. So ideally I'd submit my outgoing messages to /usr/lib/sendmail without a Message-ID, and let it add one. Trouble is, if I do that, I don't find out what the Message-ID is so I can store it when I Fcc, and if I Fcc without a Message-ID then threading won't subsequently work.

My current MUA solves this by routinely Bccing mail to myself, so that I save it into my mail folders when it comes back with its Message-ID. This is a thoroughly horrid solution because it requires constant work on the part of the user. I could filter incoming mail and automatically Fcc it if I thought it looked like something I'd sent, but that would be unreliable and a lot of pain.

What I'd like would be the ability to ask /usr/lib/sendmail to print its version of the message on standard output after it finishes receiving my version on standard input, complete with any headers it felt a need to add. Then my mail-sending code could retrieve this and Fcc it immediately, with no need for user intervention or for strange asynchronous mail-receipt-time weirdness. Even better if I got back a Received header which mentioned the message's ID in the local mail system, so I could quote it when discussing its later non-delivery with my sysadmin.

The third option is that I give up and start inventing Message-IDs, and realistically that probably is less effort than trying to have /usr/lib/sendmail fundamentally redesigned :-) But it irritates me, because I shouldn't have to.


Tony Finch

from: fanf
date: 17th Feb 2006 13:48 (UTC)

A 4xx would be treated the same as a 5xx. Obviously you need to tune your policies so that this is unlikely to occur for local email.

Your point about Fcc is good. I also copy my own email to myself, but in my case it is because I want it to be filtered in the same way as email from other people - I don't have a sent-mail folder. You can reduce the work of this by just Bcc:ing everything to yourself automatically, and doing post-hoc de-duplication of messages that also come back to you via a mailing list using the Message-ID (as Cyrus does).



from: cjwatson
date: 17th Feb 2006 19:04 (UTC)

Of course, the unfortunate side-effect of doing that sort of de-duplication is that messages sent to a mailing list and Cc:ed to me no longer have what seem to me to be the obvious semantics, namely to appear in the mailing list folder and also in my inbox for more careful attention, since the personal copy generally arrives first and the mailing list copy then gets binned by de-duplication.

I suppose I could remember which messages I'd Bcc:ed to myself and only de-duplicate those, although that also seems messy.



from: cjwatson
date: 17th Feb 2006 19:05 (UTC)

(er, apologies for misplacing this comment at the top level rather than as a reply to fanf's)


Tony Finch

from: fanf
date: 17th Feb 2006 21:27 (UTC)

Actually the Cyrus de-duplication is cleverer than that: it happens when the message is added to a folder, so if you filter on the return path (say) then a message transferred via the mailing list will go into the list's folder, whereas other messages will be delivered to your inbox as usual.
