gnegg: April 2010

tempalias.com – persistence

(This is the third installment of a development diary about the creation of a self destructing email alias service. Read the previous episode here.)

After the earlier clear idea on how to handle the aliases identity, the next question I needed to tackle was the question of persistence: How do I want to store these aliases? Do I want them to persist a server restart? How would I access them?

On the positive side remains the fact that the data structure for this service is practically non-existant: Each alias has its identity and some data associated with it, mainly a target address and the validity information. And lookup will always happen using that identity (with the exception of garbage collection – something I will tackle later).

So this is a clear candiate to use a very simple key/value store. As I hope to gain at least some traction though (wait until I coded the bookmarklet), I would want this to be at least of some robustness, hence writing flat-files seemed like a bad idea.

Ironically, if you want a really simple, built-in solution for data persistance in node.js, you have two options: Either write your own (which is where I don’t want to go to) or use SQLite which is total overkill for the current solution.

So I had the option of just keeping stuff in memory (as plain JS objects or using memcache) or to use any of the supported key/value storage services.

Aliases going away on server restart felt like a bad thing, so I looked into the various key/value stores.

While looking at the available libraries, I went for the one that was most recently updated, which is redis-node-client. Of course, this meant that I had to use both redis trunk and node trunk as the library is really tracking the bleeding edge. I don’t mind that much though because both redis and node are very self-contained and compile easily on both linux (deployment) and mac os (development) while requiring next to no configuration.

So with a decision made for both persistence and identity, I went ahead and wrote more code.

On the project page, you will see few commits completing the full functionality I wanted a POST to /aliases to have – including persistence using redis and identity using the previously described method of brute-forcing the issue.

I still have two issues at the moment that will need tackling

The initial length of the pseudo-uuid isn’t persisted. This means that once enough aliases are created that we are increasing the length and I’m restarting the server, I will get needless collisions or even a too heavily-used keyspace.
The current method of checking for ID availability and later usage is totally non-race-proof and needs some serious looking-into.

Stuff I learned:

node is extremely work-in-progress. While it runs flawlessly and never surprises me with irreproducible or even just seemingly illogical behavior, features appear and disappear at will.
This state of flux in node makes it really hard to work with external dependencies. In this case, multipart.js vanished from node trunk (without change log entry either), but express still depends upon that. On the other hand, I’m forced to use node trunk otherwise redis client won’t work.
Date(“<timestamp>”) in node is dependent on the local timezone and changing process.env.TZ post-startup doesn’t have any effect. This means that I’m going to have to set TZ=UTC in my start script.
Working with an asynchronous API seems strange sometimes, but the power of closures usually comes to the rescue. I certainly wouldn’t want to have to write software like this if I didn’t have closures at my disposal (and, NO, global variables are NOT a viable alternative…)

tempalias.com – another day

This is the second installment of an article series about creating a web service for self-destructing email aliases. Read part 1 here.

Today, I spent a lot of thought and experimentation with two issues:

How would I name and identify the temporary aliases?
How would I store the temporary aliases

Naming and identifying sounds easy. One is inclined to just use an incrementing integer or something alike. But that won’t work for security reasons. If the address you got is 12@tempalias.net, with any likelyhood, there will be an 11@ and a 13@.

Using that information, you could easily bring the whole service down (and endlessly annoy its users) by requesting an address to get the current ID and then sending a lot of mail to the neighboring IDs. If those were created without a mail count limitation, then you could spam the recipient for the whole validity period and if they were created with a count limitation, you could use up all allowed mails.

So the aliases need to be random.

Which leads to the question of how to ensure uniqueness.

Unique random numbers you ask? Isn’t this what UUIDs were invented for?

True. But considering the length of an UUID, would you really want to have an alias in the form e8ea98ce-dabc-42f8-8fcd-c50d20b1f2c5@tempalias.net? That address is so long that it might even hit some length limitation of the target site, which of course is true even if you apply cheap tricks like removing the dashes.

Of course, using base16 to encode an UUID (basically an 128 bit integer) is hopelessly inefficient. By increasing the amount of characters we use, we might be able to decrease the amount of characters.

Keep in mind though, that the string in question is to be a local part of an email address and those tend to be case insensitive with not much guarantees that case is preserved over the process of delivering the message.

That, of course, limits the amount of characters we can use to basically 0-9 and A-Z (plus a few special characters like + . – and _).

This is what Base32 was invented for, but unfortunately, a base32 encoded UUID would still be around 26 characters in length. While that’s a bit better, I still wouldn’t want the email address scheme to be eda3u3rzcfer3fztdvvd6xnd3i@tempalias.com

So in the end, we need something way smaller (adding + . – and _ to the character space wouldn’t help much – what comes out is about 20 characters in length).

In the end, I would probably have to create a elaborate scheme doing something like this:

pick a UUID. Use the first n bytes.
base32 encode.
Check whether that ID is free. If not, add 1 to n and try again.
Keep n around so that in the future, we can already start with taking bigger chunks.

So the moment we reach the first collision, we increase the keyspace eight-fold. That feels sufficiently safe from collisions to me, but of course it increases the maintenance burden somewhat.

The next question was how to get UUIDs and how to base32 encode them from JavaScript.

I tried different aspects, one of which even included using uuidjs and doing the b32 encoding/decoding in C. The good part about that: I now have a general idea of how to extend nodejs with C++ code (yeah. it has to be C++ and my b32 code was C, so I had to do a bit of trickery there too).

In the end though, considering that I can’t use UUIDs anyways, we can go forward using Math.uuid.js and use their call using both len and radix (with the additional change of only using lowercase to encode the data), increasing the length as we hit collisions.

So the next issue is storage: How to store the alias data? How to access it?

This will be part of the next posting here.

tempalias.com – development diary

After listening to this week’s Security Now! podcast where they were discussing disposeamail.com. That reminded me of this little idea I had back in 2002: Selfdestructing Email Addresses.

Instead of providing a web interface for a catchall alias, my solution was based around the idea of providing a way to encode time based validity information and even an usage counter into an email address and then check that information on reception of the email to decide whether to alias the source address to a target address or whether to decline delivery with an “User unknown” error.

This would allow you to create temporary email aliases which redirect to your real inbox for a short amount of time or amount of emails, but instead of forcing you to visit some third-party web interface, you would get the email right there where the other messages end up in: In your personal inbox.

Of course this old solution had one big problem: It required a mail server on the receiving end and it required you as a possible user to hook the script into that mailserver (also, I never managed to do just that with exim before losing interest, but by now, I would probably know how to do it).

Now. Here comes the web 2.0 variant of the same thing.

tempalias.com (yeah. it was still available. so was .net) will provide you with a web service that will allow you to create a temporary mail address that will redirect to your real address. This temporary alias will be valid only for a certain date range and/or a certain amount of email sent to it. You will be able to freely chose the date range and/or invocation count.

In contrast to the other services out there, the alias will direct to your standard inbox. No ad-filled web interface. No security problems caused by typos and no account registration.

Also, the service will be completely open source, so you will be able to run your own.

My motivation is to learn something new, which is why I am

writing this thing in Node.js (also, because a simple REST based webapp and a simple SMTP proxy is just what node.js was invented for)
documenting my progress of implementation here (which also hopefully keeps me motivated).

My progress in implementing the service will always be visible to the public on the projects GitHub page:

http://github.com/pilif/tempalias

As you can see, there’s already stuff there. Here’s what I’ve learned about today and what I’ve done today:

I learned how to use git submodules
I learned a bunch about node.js – how to install it, how it works, how module separation works and how to export stuff from modules.
I learned about the Express micro framework (which does exactly what I need here)
- I learned how request routing works
- I learned how to configure the framework for my needs (and how that’s done internally)
- I learned how to play with HTTP status codes and how to access information about the request

What I’ve accomplished code-wise is, considering the huge amount of stuff I had plain no clue about, quite little:

I added the web server code that will run the webapp
I created a handler that handles a POST-request to /aliases
Said handler checks the content type of the request
I added a very rudimentary model class for the aliases (and learned how to include and use that)

I still don’t know how I will store the alias information. In a sense, it’s a really simple data model mapping an alias ID to its information, so it’s predestined for the cool key/value stores out there. On the other hand, I want the application to be simple and I don’t feel like adding a key/value store as a huge dependency just for keeping track of 3 values per alias.

Before writing more code, I’ll have to find out how to proceed.

So the next update will probably be about that decision.