As well as obvious goals like simplicity and modularity, I would like to make sure it distributes well over multiple servers if there is a need for that.
One part of this would be to have totally separate queues for mail items at different stages of processing.
There would be queues for:
- Newly submitted mail, that arrived via SMSP, or SMTP/AUTH, or that otherwise was submitted locally
- Newly arrived mail, that came in via SMTP (there is no other transport that I am interested in)
- Mail awaiting delivery via SMTP to some remote site
- Mail awaiting local delivery
There could be several local queues if different sorts of local mail are delivered differently, e.g. mail for different local domains.
Each of these queues could reasonably be duplicated over several machines if the load demanded it.
Local delivery will use LMTP (RFC 2033) to actually pass the message to a mailbox or an active agent (such as a mailing-list server). Naturally these agents could be on different machines.
A key task to be performed is mapping addresses. This includes looking up forwards and lists, and determining which server different local addresses might go to. This should be done using a replicated, network-accessible database. Changes can be made remotely and are propagated to all replicas. A queue manager which does lookups would cache the results, but would hold a connection open and receive any update notifications as they are propagated.
Queues are as simple as possible. There is only one control file per message and usually one data file. The data file is stored in a separate directory tree. The control files contain simple textual headers describing the message and what needs to be done with it. Changes are only ever appended, so the whole file needs to be read before we know what to do next, but this shouldn't be burdensome as the file will usually be well under 1K.
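To make the append-only idea concrete, here is a minimal sketch of replaying a control file to find the current state. The header names (From, To, Done, Attempt) and the "Key: value" layout are assumptions made up for illustration; nothing above fixes the actual format.

```python
def read_control(text):
    """Replay an append-only control file and return the current state.

    The header names used here are hypothetical, not a defined format.
    """
    state = {"sender": None, "pending": [], "done": [], "last_attempt": None}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "from":
            state["sender"] = value
        elif key == "to":
            state["pending"].append(value)
        elif key == "done":
            # A later record supersedes the earlier "To" for this recipient.
            if value in state["pending"]:
                state["pending"].remove(value)
            state["done"].append(value)
        elif key == "attempt":
            # Timestamp (seconds) of the most recent delivery attempt.
            state["last_attempt"] = int(value)
    return state
```

Because nothing is ever rewritten, a helper only needs to append a "Done" or "Attempt" line; the reader works out what is left by replaying the file from the top.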
The queue manager will read all files and record relevant information in internal data structures. It will periodically rescan looking for new messages, but will also listen on a UNIX-domain socket for notification of new messages. Messages are passed on to helper threads by sending the file name over a pipe. Threads indicate completion by sending a message back over a different pipe (or the same socket), or by exiting...
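The hand-off to a helper could look roughly like the sketch below: file names go down one pipe, completion messages come back over another. The function names and the "done <name>" reply format are assumptions for illustration, and the helper does no real delivery.

```python
import os
import threading

def helper(work_r, done_w):
    """Read newline-terminated file names and report each one as done."""
    with os.fdopen(work_r, "r") as work, os.fdopen(done_w, "w") as done:
        for line in work:
            name = line.rstrip("\n")
            # A real helper would attempt delivery of this message here.
            done.write("done " + name + "\n")
            done.flush()

def dispatch(names):
    """Send file names to one helper thread and collect its replies."""
    work_r, work_w = os.pipe()
    done_r, done_w = os.pipe()
    thread = threading.Thread(target=helper, args=(work_r, done_w))
    thread.start()
    # Closing the write end tells the helper there is no more work.
    with os.fdopen(work_w, "w") as work:
        for name in names:
            work.write(name + "\n")
    # The helper closes its end when it finishes, which ends this loop.
    with os.fdopen(done_r, "r") as done:
        results = [line.rstrip("\n") for line in done]
    thread.join()
    return results
```

This sketch writes every name before reading any reply, which is fine for a handful of names but could block once a pipe buffer fills. A real manager would presumably multiplex the pipes and the notification socket in one event loop.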
The queue manager only looks at timestamp information in the message and trigger information. Timestamps indicate when the message was last attempted and are used to determine the next time to try. Triggers indicate events that could suggest it is worth trying this message again. Triggers that are being waited on are recorded as files in a directory and are reported over the same socket that new messages are reported over. A trigger could be a remote SMTP server becoming available. When the server is determined to be unavailable the file is created. When it is available again, the file is removed and the name is written to the socket. This releases all messages with that trigger.
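A minimal sketch of the trigger bookkeeping, assuming one empty file per trigger and an in-memory map of waiting messages. The class and method names are hypothetical. The retry policy in next_attempt (doubling delay with a cap, driven by an attempt count) is also an assumption; the text above only says that the timestamp determines the next time to try.

```python
import os

class Triggers:
    """Track which messages wait on which trigger (hypothetical design)."""

    def __init__(self, directory):
        self.directory = directory
        self.waiting = {}  # trigger name -> list of message ids

    def block(self, trigger, message_id):
        """Create the trigger file and record that message_id waits on it."""
        open(os.path.join(self.directory, trigger), "a").close()
        self.waiting.setdefault(trigger, []).append(message_id)

    def fire(self, trigger):
        """Remove the trigger file and return the messages it releases."""
        path = os.path.join(self.directory, trigger)
        if os.path.exists(path):
            os.remove(path)
        return self.waiting.pop(trigger, [])

def next_attempt(last_attempt, attempts, base=300, cap=14400):
    """Assumed policy: delay doubles per attempt, capped at `cap` seconds."""
    return last_attempt + min(cap, base * 2 ** attempts)
```

In the design described above, fire would be called when the trigger name arrives on the notification socket, and the released messages would be handed straight to a helper rather than waiting for their next scheduled attempt.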
Different queue managers have different helpers, whether they deliver by SMTP, or LMTP, or do local-name lookup and re-queue.
Possibly the most interesting part of this is the local address lookup service. It needs to replace any address@localdomain with a list of either remote addresses or address@localservice. This may involve looking up lists, local forwards, and user-to-server mapping tables, and verifying local addresses. It might reasonably consult various local databases such as LDAP. Having all the data in servers which send notification of changes would be very helpful. Otherwise each database would need a poll time and would have to be polled.
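The expansion step might be sketched as follows. The in-memory tables (aliases for lists and forwards, user_server for the user-to-server mapping) stand in for whatever databases are really consulted; their shape, and the function name, are assumptions.

```python
def expand(address, local_domains, aliases, user_server, seen=None):
    """Expand one address into remote addresses or user@localservice.

    aliases maps a full local address to a list of targets (lists and
    forwards); user_server maps a local user name to its mailbox server.
    Both tables are hypothetical stand-ins for real lookups.
    """
    seen = set() if seen is None else seen
    if address in seen:
        return []  # already expanded; also guards against alias loops
    seen.add(address)
    local, _, domain = address.rpartition("@")
    if domain not in local_domains:
        return [address]  # remote address: pass through unchanged
    if address in aliases:
        out = []
        for target in aliases[address]:
            for result in expand(target, local_domains, aliases,
                                 user_server, seen):
                if result not in out:
                    out.append(result)
        return out
    if local in user_server:
        return [local + "@" + user_server[local]]
    # Verification failed: the address is in a local domain but unknown.
    raise LookupError("unknown local address: " + address)
```

A helper doing local-name lookup would run this over each recipient and then re-queue the message to the remote or local delivery queue according to the results.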
The query protocol should return a max TTL, a table version number, and should allow updates to be propagated from server to clients.
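On the client side, the TTL and table version could be used roughly like this. It is a sketch of a cache only, not of the protocol itself; the class name, the method names, and the rule "drop an entry when the server announces a newer version" are assumptions.

```python
import time

class LookupCache:
    """Cache lookup results with a TTL and a table version (a sketch)."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.entries = {}  # key -> (value, expires_at, version)

    def store(self, key, value, ttl, version):
        """Remember a result together with its expiry time and version."""
        self.entries[key] = (value, self.clock() + ttl, version)

    def get(self, key):
        """Return the cached value, or None if missing or expired."""
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at, _version = entry
        if self.clock() >= expires_at:
            del self.entries[key]
            return None
        return value

    def notify(self, key, version):
        """Handle a pushed update: drop the entry if it is older."""
        entry = self.entries.get(key)
        if entry is not None and entry[2] < version:
            del self.entries[key]
```

The TTL bounds how stale an entry can get if a notification is lost, while the version number lets the client ignore notifications for data it has already refreshed.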
Probably my first steps here would be to replicate the mlalias database, and put an LDAP front-end on the UDB. Then I might have the infrastructure in place to do the address lookups.