tempalias.com – sysadmin work

This is yet another episode in the development diary behind the creation of a new web service. Read the previous installment here.

Now that I made the SMTP proxy do its thing and that I’m able to serve out static files, I thought it was time to actually set up the future production environment so that I can give it some more real-world testing and check the general stability of the solution when exposed to the internet.

So I went ahead and set up a new VM using Ubuntu Lucid beta, running the latest (HEAD) redis and node and finally made it run the tempalias daemons (which I consolidated into one opening SMTP and HTTP ports at the same time for easier handling).

I always knew that deployment would be something of a problem to tackle. SMTP needs to run on port 25 (if you intend to be running on the machine listed as MX) and HTTP should run on port 80.

Both ports are below 1024 and therefore require root privileges to listen on, and I definitely didn’t want the first node.js code I’ve ever written to run with root privileges (even though it’s a VM – I don’t like to hand out free root on a machine that’s connected to the internet).

So additional infrastructure was needed and here’s what I came up with:

The tempalias web server listens only on localhost on port 8080. A reverse nginx proxy listens on public port 80 and forwards the requests (all of them – node is easily fast enough to serve the static content). This solves another issue I had which is HTTP content compression: Providing compression (Content-Encoding: gzip) is imperative these days and yet not something I want to implement myself in my web application server.

Having the reverse proxy is a tremendous help as it can handle the more advanced webserver tasks – like compression.
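To illustrate the "localhost only" half of this setup, here is a minimal sketch of a node HTTP server that binds to the loopback interface. The helper names (`localListenOptions`, `startServer`) and the placeholder handler are made up for this example; this is not the actual tempalias code, just the general shape under the assumption that the server uses node's built-in `http` module.

```javascript
const http = require("http");

// Hypothetical helper: listen options that bind to loopback only,
// so the server is unreachable from the public interface.
function localListenOptions(port) {
  return { host: "127.0.0.1", port: port };
}

// Placeholder handler; the real application serves aliases and static files.
function handler(req, res) {
  res.writeHead(200, { "Content-Type": "text/plain; charset=UTF-8" });
  res.end("hello from behind the proxy\n");
}

// Creates the server and starts listening on the loopback interface.
function startServer(port, onReady) {
  const server = http.createServer(handler);
  server.listen(localListenOptions(port), onReady);
  return server;
}
```

Calling `startServer(8080, callback)` would match the port mentioned above; the reverse proxy on port 80 then forwards to `127.0.0.1:8080`.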

I quickly noticed though that the stable nginx release provided with Ubuntu Lucid didn’t seem to be willing to actually do the compression despite it being turned on. A bit of experimentation revealed that stable nginx, when comparing content types for gzip_types, checks the full response Content-Type value including the charset parameter.

As node-paperboy adds the “;charset: UTF-8” to all responses it serves, the default setting didn’t compress. Thankfully though, nginx could live with

gzip_types "text/javascript; charset: UTF-8" "text/html; charset: UTF-8";

so that settled the compression issue.

Update: of course it should be “charset=UTF-8” instead of “charset: UTF-8” – with the equal sign, nginx actually compresses correctly. My patch to paperboy has since been accepted by upstream, so you won’t have to deal with this hassle.

Next was SMTP. As we are already an SMTP proxy and there are no further advantages of having incoming connections proxied further (no compression or anything), I wanted clients to somehow directly connect to the node daemon.

I quickly learned that even the most awesome iptables setup won’t make the Linux kernel accept on the lo interface anything that didn’t originate from lo, so no amount of NATing allows you to redirect a packet from a public interface to the local interface.

Hence I reconfigured the SMTP server component of tempalias to listen on all interfaces on port 2525 and then redirected packets arriving on public port 25 to port 2525.

This of course left the port 2525 open on the public interface which I don’t like.

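A quickly created iptables rule rejecting (as opposed to dropping – I don’t want a casual port scanner to know that iptables magic is going on) any traffic going to 2525 also rejected the redirected traffic, which of course wasn’t much help: the redirect rewrites the destination port before the packet reaches the INPUT chain, so by then both kinds of traffic look the same.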

In comes the MARK extension. Here’s what I’ve done:

# mark packets going to port 25
iptables -t mangle -A PREROUTING -i eth0 -p tcp --dport 25 -j MARK --set-mark 99

# redirect packets going to port 25 to 2525
iptables -t nat -A PREROUTING -p tcp -i eth0 --dport 25 -j REDIRECT --to-ports 2525

# reject all incoming packets to 2525 which are not marked
iptables -A INPUT -i eth0 -p tcp --dport 2525 -m mark ! --mark 99 -j REJECT

So. Now the host responds on public port 25 (but not on public port 2525).

Next step was to configure DNS and tell Richard to create himself an alias using

curl --no-keepalive -H "Content-Type: application/json" \
     --data-binary '{"target":"t@example.com","days": 3,"max-usage": 5}' \
     -qsS http://tempalias.com/aliases

(yes. you too can do that right now – it’s live baby!)

Of course it blew up the moment the redis connection timed out, taking the whole node server with it.

Which was the topic of yesterday’s coding session: The redis-node-client library is very brittle where connection tracking and keeping are concerned. I needed something quick, so I hacked the library to provide an additional very explicit connection management method.
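As a rough illustration of the failure mode: in node, an `error` event emitted on an EventEmitter without a listener is thrown, which is one common way a lost connection takes the whole process down. The sketch below is not the patch I made and not redis-node-client's API. `keepConnected` and `connectFn` are hypothetical names, and the connection is assumed to be an EventEmitter that fires `error` and `close`.

```javascript
const { EventEmitter } = require("events");

// connectFn is a hypothetical factory returning an EventEmitter-like
// connection that emits "error" and "close".
function keepConnected(connectFn, retryDelayMs) {
  const state = { conn: null, reconnects: 0, lastError: null, stopped: false };

  function open() {
    const conn = connectFn();
    state.conn = conn;

    // Always attach an error listener: without one, emitting "error"
    // throws and can crash the process.
    conn.on("error", function (err) {
      state.lastError = err;
    });

    // Re-open the connection after it closes, unless we were asked to stop.
    conn.on("close", function () {
      if (state.stopped) return;
      state.reconnects += 1;
      setTimeout(open, retryDelayMs);
    });
  }

  state.stop = function () {
    state.stopped = true;
  };

  open();
  return state;
}
```

A real client would also have to decide what to do with commands issued while the connection is down (queue them or fail them), which this sketch leaves out.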

Then I began discussing the issues I was having with redis-node-client’s author. He’s such a nice guy and we had one hell of a nice discussion which is still ongoing, so I will probably have to rewrite the backend code once more once we have found out how to do this the right way.

Between all that sysadmin and library-fixing time, unfortunately, I didn’t yet have time to do all too much on the public facing website: http://tempalias.com at this point contains nothing but a gradient. But it’s a really nice gradient. One of the best.

Today: More redis-node-client hacking (provided I get another answer from fictorial) or finally some real HTML/CSS work (which I’m not looking forward to).

This is taking shape.

tempalias.com – the cake is a lie

This is another installment of my development diary for tempalias.com, a web service that will allow you to create self-destructing email aliases. You can read the previous installment here.

This was a triumph.
I’m making a note here: HUGE SUCCESS.
It’s hard to overstate my satisfaction.

I didn’t post an update on Wednesday evening because it got very late and I just wanted to sleep. Today, it’s late yet again, but I can gladly report that the backend service is now feature complete.

We are still missing the UI, but with a bit of curl on the command line, you can use the RESTful web service interface to create aliases and you can use the generated aliases to send email via the now completed SMTP proxy – including time- and usage-based expiration.

As a reminder: All the code (i.e. the completed backend) is available on my github repository, though keep in mind that there is no documentation whatsoever. That I will save for later when this is really going public. If you are brave, feel free to clone it.

You will need the trunk versions for both redis and node.

Screenshot of a terminal showing three consumptions of an alias and a fourth failing.

The screenshot is showing me consuming an alias four times in a row. Three times, I get the data back, the fourth time, it’s gone.

The website itself is still in the process of being designed and I can promise you, it will be awesome. Richard’s last design was simply mind-blowing. Unfortunately I can’t show it here yet, because he used a non-free picture. Besides, we obviously can’t use non-free artwork for a Free Software project.

So this update concerns itself with two days of work. What was going on?

On Wednesday, I wanted to complete the SMTP server, but before I went ahead doing so, I revised the server’s design. At the end of the last posting here, we had a design where the SMTP proxy would connect to the smarthost the moment a client connects. It would then proceed to proxy through command by command, returning error messages as they are returned by the smarthost.

The issue with this design lies in the fact that tempalias.com is, by definition, not about sending mail, but about rejecting mail. This means that once it’s up and running, the majority of mail deliveries will simply fail at the RCPT state.

From this perspective, it doesn’t make sense to connect to the smarthost when a client connects. Instead, we should do the handshake up to and including the RCPT TO command, at which time we do the alias expansion. If that fails (which is the more likely case), we don’t need to bother to connect to upstream but we can simply deny the recipient.

The consequence of course is that our RCPT TO can now return errors that happened during MAIL FROM on the upstream server. But as MAIL FROM usually only fails with a 5xx error, this isn’t terribly wrong anyways – the saved resources far outweigh the not-so-perfect error messages.
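The revised flow can be sketched roughly like this. It is a simplified, synchronous model rather than the real asynchronous server code: `createSession`, `lookupAlias`, `connectUpstream` and the upstream object's `mailFrom()`/`rcptTo()` methods are hypothetical names invented for the example, and the reply strings are only illustrative.

```javascript
// lookupAlias(alias) returns the target address or null (hypothetical).
// connectUpstream() returns an object with mailFrom()/rcptTo() that
// return SMTP reply lines (hypothetical).
function createSession(lookupAlias, connectUpstream) {
  const session = { sender: null, upstream: null };

  session.onMailFrom = function (address) {
    // Only remember the sender; nothing is sent upstream yet.
    session.sender = address;
    return "250 OK";
  };

  session.onRcptTo = function (alias) {
    const target = lookupAlias(alias);
    if (!target) {
      // The common case: unknown or expired alias, no upstream connection.
      return "550 No such alias";
    }
    if (!session.upstream) {
      session.upstream = connectUpstream();
    }
    // Replay MAIL FROM now; an upstream failure surfaces as the RCPT TO reply.
    const mailReply = session.upstream.mailFrom(session.sender);
    if (mailReply.charAt(0) !== "2") {
      return mailReply;
    }
    return session.upstream.rcptTo(target);
  };

  return session;
}
```

The trade-off described above is visible in `onRcptTo`: the client may see a MAIL FROM error one command late, in exchange for never opening an upstream connection for rejected recipients.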

Once I completed that design change, the next roadblock I ran into was the fact that both the smtp server and the smtp client libraries weren’t quite as asynchronous as I would have wanted: The server was reading the complete mail from the client into memory and the client wanted the complete mail as a parameter to its data method.

That felt impractical to me as in the majority of cases, we won’t get the whole mail at once, but we can certainly already begin to push it through to the smarthost, keeping memory usage of our smtp server as low as possible.

So my clone of the node SMTP library now contains support for asynchronous handling for DATA. The server fires data, data_available and data_end and the client provides startData(), sendData() and endData(). Of course the old functionality is still available, but the tempalias.com SMTP server is using the new interface.
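Wiring the two sides together could look roughly like the following. The event names (`data`, `data_available`, `data_end`) and method names (`startData()`, `sendData()`, `endData()`) are the ones mentioned above; their exact arguments are an assumption, and `relayData` and `makeRecordingClient` are hypothetical helpers that exist only for this sketch.

```javascript
const { EventEmitter } = require("events");

// Forward message data chunk by chunk instead of buffering the whole mail.
// Argument shapes are assumed; only the names come from the library.
function relayData(serverSession, upstreamClient) {
  serverSession.on("data", function () {
    upstreamClient.startData();
  });
  serverSession.on("data_available", function (chunk) {
    upstreamClient.sendData(chunk);
  });
  serverSession.on("data_end", function () {
    upstreamClient.endData();
  });
}

// Stand-in for the upstream client that just records what it was asked to do.
function makeRecordingClient() {
  const calls = [];
  return {
    calls: calls,
    startData: function () { calls.push("start"); },
    sendData: function (chunk) { calls.push("send:" + chunk); },
    endData: function () { calls.push("end"); }
  };
}

// Demo with a plain EventEmitter playing the role of the server session.
const session = new EventEmitter();
const client = makeRecordingClient();
relayData(session, client);
session.emit("data");
session.emit("data_available", "Subject: hi\r\n");
session.emit("data_available", "\r\nbody\r\n");
session.emit("data_end");
```

This sketch ignores backpressure: if the smarthost is slower than the sending client, a real implementation would need to pause reading from the client.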

So, that was Wednesday’s work:

  • only connect to the smarthost when it’s no longer avoidable
  • completed the smtp server node library
  • made the smtp server and client libraries fully asynchronous
  • completed the SMTP proxy (but without alias expansion yet)

Before I went to bed, the SMTP server was accepting mail and sending it using the smarthost. It didn’t do alias expansion yet but just rewrote the recipient to my private email address.

This is where I picked up Thursday night: The plan was to hook the alias model classes into the SMTP server to complete the functionality.

While doing that, I had one more architectural thing to clear up: How to make sure that I can decrement the usage-counter race-free? Once that was settled, the rest was pure grunt work by just writing the needed code.
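One way to get a race-free counter with redis is to lean on the fact that DECR is atomic and returns the new value: decrement first, then decide based on the result, instead of reading the counter and writing it back. The sketch below shows only that idea, not necessarily what tempalias ended up doing. It uses an in-memory stand-in rather than a real redis client, and the `uses:` key prefix and the `decr(key, callback)` signature are assumptions made for the example.

```javascript
// In-memory stand-in that mimics the semantics of Redis DECR:
// a missing key counts as 0, and the new value is reported back.
function makeFakeStore(initial) {
  const values = Object.assign({}, initial);
  return {
    decr: function (key, callback) {
      values[key] = (values[key] || 0) - 1;
      const newValue = values[key]; // capture now, report asynchronously
      setImmediate(function () {
        callback(null, newValue);
      });
    }
  };
}

// Decrement first, then decide: each caller gets a distinct value,
// so two concurrent deliveries can never both claim the last use.
function consumeAlias(store, aliasId, callback) {
  store.decr("uses:" + aliasId, function (err, remaining) {
    if (err) return callback(err);
    callback(null, remaining >= 0);
  });
}
```

With a counter initialized to 3, four calls to `consumeAlias` succeed three times and fail the fourth time, regardless of how the calls interleave.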

As we are getting long and as it’s quite late again, I’m saving the post-mortem of this last task for tomorrow. You’ll get a chance to learn about bugs in node, about redis’ DECR command and finally you will get a chance to laugh at me for totally screwing up the usage of Object.create().

Stay tuned.

tempalias.com – persistence

(This is the third installment of a development diary about the creation of a self destructing email alias service. Read the previous episode here.)

After the earlier clear idea on how to handle the aliases’ identity, the next question I needed to tackle was the question of persistence: How do I want to store these aliases? Do I want them to survive a server restart? How would I access them?

On the positive side remains the fact that the data structure for this service is practically non-existent: Each alias has its identity and some data associated with it, mainly a target address and the validity information. And lookup will always happen using that identity (with the exception of garbage collection – something I will tackle later).

So this is a clear candidate to use a very simple key/value store. As I hope to gain at least some traction though (wait until I coded the bookmarklet), I would want this to be at least of some robustness, hence writing flat-files seemed like a bad idea.

Ironically, if you want a really simple, built-in solution for data persistence in node.js, you have two options: Either write your own (which is where I don’t want to go to) or use SQLite which is total overkill for the current solution.

So I had the option of just keeping stuff in memory (as plain JS objects or using memcache) or to use any of the supported key/value storage services.

Aliases going away on server restart felt like a bad thing, so I looked into the various key/value stores.

While looking at the available libraries, I went for the one that was most recently updated, which is redis-node-client. Of course, this meant that I had to use both redis trunk and node trunk as the library is really tracking the bleeding edge. I don’t mind that much though because both redis and node are very self-contained and compile easily on both Linux (deployment) and Mac OS (development) while requiring next to no configuration.

So with a decision made for both persistence and identity, I went ahead and wrote more code.

On the project page, you will see a few commits completing the full functionality I wanted a POST to /aliases to have – including persistence using redis and identity using the previously described method of brute-forcing the issue.

I still have two issues at the moment that will need tackling:

  1. The initial length of the pseudo-uuid isn’t persisted. This means that once enough aliases have been created for the length to increase, a server restart resets it and I will get needless collisions or even a too heavily-used keyspace.
  2. The current method of checking for ID availability and later usage is totally non-race-proof and needs some serious looking-into.
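To make the two issues more concrete, here is a rough sketch of brute-force ID allocation with a growing length. It is not the actual tempalias code: the alphabet, the retry limit, and the names `randomId`, `allocateId`, `isTaken` and `state` are all assumptions for illustration, and the availability check is synchronous here to keep the example short.

```javascript
const crypto = require("crypto");

// Assumed alphabet; the real service may use a different one.
const ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

// Builds a random ID of the given length. The modulo introduces a slight
// bias towards the first few characters, which is acceptable for a sketch.
function randomId(length) {
  const bytes = crypto.randomBytes(length);
  let id = "";
  for (let i = 0; i < length; i++) {
    id += ALPHABET[bytes[i] % ALPHABET.length];
  }
  return id;
}

// Tries random IDs of the current length; after maxTries collisions the
// keyspace is considered crowded and the length grows by one.
// state.length is the value that would have to be persisted (issue 1).
function allocateId(isTaken, state, maxTries) {
  for (;;) {
    for (let attempt = 0; attempt < maxTries; attempt++) {
      const id = randomId(state.length);
      if (!isTaken(id)) {
        return id;
      }
    }
    state.length += 1;
  }
}
```

Note that checking `isTaken` and storing the alias afterwards are still two separate steps, which is exactly the race from issue 2. With redis, an atomic set-if-absent such as SETNX would be one way to close that gap.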

Stuff I learned:

  • node is extremely work-in-progress. While it runs flawlessly and never surprises me with irreproducible or even just seemingly illogical behavior, features appear and disappear at will.
  • This state of flux in node makes it really hard to work with external dependencies. In this case, multipart.js vanished from node trunk (without change log entry either), but express still depends upon that. On the other hand, I’m forced to use node trunk otherwise redis client won’t work.
  • Date(“<timestamp>”) in node is dependent on the local timezone and changing process.env.TZ post-startup doesn’t have any effect. This means that I’m going to have to set TZ=UTC in my start script.
  • Working with an asynchronous API seems strange sometimes, but the power of closures usually comes to the rescue. I certainly wouldn’t want to have to write software like this if I didn’t have closures at my disposal (and, NO, global variables are NOT a viable alternative…)
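On the timezone point, a small sketch of how to sidestep the issue inside the code: a timestamp that carries an explicit zone, or one built via `Date.UTC()`, denotes the same instant no matter what TZ the process runs with. This reflects how current node versions behave (the node trunk of the time may have parsed date strings differently), and `utcDate` is a hypothetical helper name.

```javascript
// Builds a Date from UTC components; months are 1-based here for readability.
function utcDate(year, month, day, hour, minute) {
  return new Date(Date.UTC(year, month - 1, day, hour, minute));
}

// An ISO 8601 string with a trailing "Z" is always interpreted as UTC.
const explicit = new Date("2010-03-18T12:00:00Z");
const built = utcDate(2010, 3, 18, 12, 0);

// By contrast, new Date(2010, 2, 18, 12, 0) uses the process's local
// timezone, so its absolute value depends on the TZ setting.
const sameInstant = explicit.getTime() === built.getTime();
```

Setting TZ=UTC in the start script remains the simpler fix when existing code already relies on local-time parsing.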