Do spammers find pleasure in destroying fun stuff?

Recently, while reading through the log file of the mail relay used by tempalias, I noticed a disturbing trend: Apparently, SPAM was being sent through tempalias.

I’ve seen various behaviours. One was to strangely create an alias per second to the same target and then delivering email there.

While I completely fail to understand this scheme, the other one was even more disturbing: Bots were registering {max-usage: 1, days: null} aliases and then sending one mail to them – probably to get around RBL checks they’d hit when sending SPAM directly.

Aside of the fact that I do not want to be helping spammers, this also posed a technical issue: node.js head which I was running back when I developed the service tended to leak memory at times forcing me to restart the service here and then.

Now the additional huge load created by the bots forced me to do that way more often than I wanted to. Of course, the old code didn’t run on current node any more.

Hence I had to take tempalias down for maintenance.

A quick look at my commits on GitHub will show you what I have done:

  • the tempalias SMTP daemon now does RBL checks and immediately disconnects if the connected host is listed.
  • the tempalias HTTP daemon also does RBL checks on alias creation, but it doesn’t check the various DUL lists as the most likely alias creators are most certainly listed in a DUL
  • Per IP, aliases can only be generated every 30 seconds.

This should be some help. In addition, right now, the mail relay is configured to skip sender-checks and sa-exim scans (Spam Assassin on SMTP time as to reject spam before even accepting it into the system) for hosts where relaying is allowed. I intend to change that so that sa-exim and sender verify is done regardless if the connecting host is the tempalias proxy.

Looking at the mail log, I’ve seen the spam count drop to near-zero, so I’m happy, but I know that this is just a temporary victory. Spammers will find ways around the current protection and I’ll have to think of something else (I do have some options, but I don’t want to pre-announce them here for obvious reasons).

On a more happy note: During maintenance I also fixed a few issues with the Bookmarklet which should now do a better job at not coloring all text fields green eventually and at using the target site’s jQuery if available.

Windows 2008 / NAT / Direct connections

Yesterday I ran into an interesting problem with Windows 2008’s implementation of NAT (don’t ask – this was the best solution – I certainly don’t recommend using Windows for this purpose).

Whenever I enabled the NAT service, I was unable to reliably connect to the machine via remote desktop or even any other service that machine was offering. Packets sent to the machine were dropped as if a firewall was in between, but it wasn’t and the Windows firewall was configured to allow remote desktop connections.

Strangely, sometimes and from some hosts I was able to make a connection, but not consistently.

After some digging, this turned out to be a problem with the interface metrics and the server tried to respond over the interface with the private address that wasn’t routed.

So if you are in the same boat, configure the interface metrics of both interfaces manually. Set the metric of the private interface to a high value and the metrics of the public (routed) one to a low value.

At least for me, this instantly fixed the problem.

tempalias – validity limits

I’ve just pushed a small update to tempalias.com that imposes some (generous) limits to the values you can provide for the validity:

  • the maximum amount of days an alias can be valid is now 60 days.
  • the maximum amount of messages that can be sent to an aliases is now set to 100 messages.

I realized that there might be some potential for abusing tempalias.com if the aliases have a practically unlimited duration. Besides, then they wouldn’t be tempaliases any more. Right?

Already generated aliases with longer durations stay valid – true to the spirit of not looking into the data my users provided me with, I’m not going to check the existing aliases.

tempalias.com – bookmarklet work

While the user experience on tempalias.com is already really streamlined, compared to other services that encode the expiration settings and sometimes even the target) into the email address (and are thus exploitable and in some cases requiring you to have an account with them), it loses in that, when you have to register on some site, you will have to open the tempalias.com website in its own window and then manually create the alias.

Wouldn’t it be nice if this worked without having to visit the site?

This video is showing how I want this to work and how the bookmarklet branch on the github project page is already working:

http://vimeo.com/moogaloop.swf?clip_id=11193192&server=vimeo.com&show_title=1&show_byline=0&show_portrait=0&color=00ADEF&fullscreen=1

The workflow will be that you create your first (and probably only) alias manually. In the confirmation screen, you will be presented with a bookmarklet that you can drag to your bookmark bar and that will generate more aliases like the one just generated. This works independently of cookies or user accounts, so it would even work across browsers if you are synchronizing bookmarks between machines.

The actual bookmarklet is just a very small stub that will contain all the configuration for alias creation (so the actual bookmarklet will be the minified version of this file here). The bookmarklet, when executed will add a script tag to the page that actually does the heavy lifting.

The script that’s running in the video above tries really hard to be a good citizen as it’s run in the context of a third party webpage beyond my control:

  • it doesn’t pollute the global namespace. It has to add one function, window.$__tempalias_com,  so it doesn’t reload all the script if you click the bookmark button multiple times.
  • while it depends on jQuery (I’m not doing this in pure DOM), it tries really hard to be a good citizen:
    • if jQuery 1.4.2 is already used on the site, it uses that.
    • if any other jQuery version is installed, it loads 1.4.2 but restores window.jQuery to what it was before.
    • if no jQuery is installed, it loads 1.4.2
    • In all cases, it calls jQuery.noConflict if $ is bound to anything.
  • All DOM manipulation uses really unique class names and event namespaces

While implementing, I noticed that you can’t unbind live events with just their name, so $().die(‘.ta’) didn’t work an I had to provide all events I’m live-binding to. I’m using live here because the bubbling up delegation model works better in a case where there might be many matching elements on any particular page.

Now the next step will be to add some design to the whole thing and then it can go live.

tempalias.com – debriefing

This is the last part of the development diary I was keeping about the creation of a new web service in node.js. You can read the previous installment here.

It’s done.

The layout is finished, the last edges too rough for pushing the thing live are smoothed. tempalias.com is live. After coming really close to finishing the thing yesterday (hence the lack of a posting here – I was too tired when I had to quit at 2:30am) last night, now I could complete the results page and add the needed finishing touches (like a really cool way of catching enter to proceed from the first to the last form field – my favorite hidden feature).

I guess it’s time for a little debriefing:

All in all, the project took a time span of 17 days to implement from start to finish. I did this after work and mostly during weekdays and sundays, so it’s actually 11 days in which work was going on (I also was sick two days). Each day I worked around 4 hours, so all in all, this took around 44 hours to implement.

A significant part of this time went into modifications of third party libraries, while I tried to contact the initial authors to get my changes merged upstream:

  • The author of node-smtp isn’t interested in the SMTP daemon functionality (that wasn’t there when I started and is now completed)
  • The author of redis-node-client didn’t like my patch, but we had a really fruitful discussion and node-redis-client got a lot better at handling dropped connection in the process.
  • The author of node-paperboy has merged my patch for a nasty issue and even tweeted about it (THANKS!)

Before I continue, I want to say a huge thanks to fictorial on github for the awesome discussion I was allowed to have with him about node-redis-client’s handling of dropped connections. I’ve enjoyed every word I was typing and reading.

But back to the project.

Non-third-party code consists of just 1624 lines of code (using wc -l, so not an accurate measurement). This doesn’t factor in the huge amount of changes I made to my fork of node-smtp the daemon part of which was basically non-existant.

Overall, the learnings I made:

  • git and github are awesome. I knew that beforehand, but this just cemented my opinion
  • node.js and friends are still in their infancy. While node removes previously published API on a nearly daily basis (it’s mostly bug-free though), none of the third-party libraries I am using were sufficiently bug-free to use them without change.
  • Asynchronous programming can be fun if you have closures at your disposal
  • Asynchronous programming can be difficult once the nesting gets deep enough
  • Making any variable not declared with var global is the worst design decision I have ever seen in my life especially in node where we are adding concurrency to the mix)
  • While it’s possible (and IMHO preferrable) to have a website done in just RESTful webservices and static/javascript frontend, sometimes just a tiny little bit of HTML generation could be useful. Still. Everything works without emitting even a single line of dynamically generated HTML code.
  • Node is crazy fast.

Also, I want to take the opportunity and say huge thanks to:

  • the guys behind node.js. I would have had to do this in PHP or even rails (which is even less fitting than PHP as it provides so much functionality around generating dynamic HTML and so little around pure JSON based web services) without you guys!
  • Richard for his awesome layout
  • fictorial for redis-node-client and for the awesome discussion I was having with him.
  • kennethkalmer for his work on node-smtp even though it was still incomplete – you lead me on the right tracks how to write an SMTP daemon. Thank you!
  • @felixge for node-paperboy – static file serving done right
  • The guys behind sammy – writing fully JS based AJAX apps has never been easier and more fun.

Thank you all!

The next step will be marketing: Seing this is built on node.js and an actually usable project – way beyond the usual little experiments, I hope to gather some interest in the Hacker community. Seing it also provides a real-world use, I’ll even go and try to submit news about the project on more general outlets. And of course on the Security Now! feedback page as this is inspired by their episode 242.

tempalias.com – learning CSS

This is one more episode in the development diary outlining the creation of a node.js based web service. You can read the previous installment here.

Today I could finally start with creating the HTML and CSS that will become the web frontend of the tempalias.com site. On Sunday, when I initially wanted to start, I was hindered by strangeness and overengineering of the express framework and yesterday it was general breakage in the redis client library for node.

But today I had no excuse and I started doing the HTML and CSS work with the intention of converting Richard’s awesome Photoshop designs into real-world HTML.

My main issue with this task: I plain don’t know CSS. Of course I know the syntax and how it should work in general, but there’s a huge difference between being able to read the syntax and writing basic code and actually being able to understand all the minor details and tricks that make it possible to achieve what you want in a reasonable time frame.

In contrast to real programming languages where you are usually developing for one target (sure – there might be plattform differences, but even nowaways, while learning, you can get away with restricting yourself to one plattform), HTML and CSS provide the additional difficulty that you have to develop for multiple moving targets, all of which containing different subtle bugs.

Combine that with the fact that more than basic CSS definitely isn’t part of my daily work and you’ll understand why I was struggling.

In the end I seem to have gotten into the thinking that’s needed to make elements appear in the general vicinity of where you suppose they should end up. I even got used to the IMHO very non-intuitive way of having margin and border be part of the elements dimensions in addition to their padding so all the pixel calculations fell into place and the whole thing looks more or less acceptable.

Until you begin changing the text size of course. But there’s so much manual pixel painting involved in the various backgrounds (gradient support isn’t quite there yet  – even in browsers) that it’s probably impossible to create a really well-scaling layout anyways, so what I currently have is what I’m content with.

You want to have a peek?

I didn’t upload anything to the public site yet because there’s no functionality and I wouldn’t want to confuse users reaching the site by accident, so a screenshot will have to do. Or you clone my repository on github and run it yourself.

Here it is:

Screenshot of tempalias HTML running in Chrome

The really tricky thing and conversely the thing I’m really the most proud of is the alignment of both the spy and the reflection of the main page content. You witness some really creative margin- and background positioning at work there. Oh. And I just don’t want to know in what glorious ways the non-browser IE butchers this layout.

I. just. plain. don’t. care. This is supposed to be a FUNproject.

Tomorrow: Hooking in Sammy to add links to all the static pages.

It looks now as if we are going live this week :-)

tempalias.com – rewrites

This is yet another installment in my series of posts about building a web service in node.js. The previous post is here.

Between the last post and current trunk of tempalias, there lie two substantial rewrites of core components of the service. One thing is that I completely misused Object.create() which takes an object to be the prototype of the object you are creating. I was of the wrong opinion that it works like Crockford’s object.create() which is creating a clone of the object you are passing.

Also, I learned that only Function objects actually have a prototype.

Not knowing these two things made it impossible to actually deserialize the JSON representation of an alias that was previously stored in redis. This lead to the first rewrite – this time of lib/tempalias.js. Now aliases work more like standard JS objects and require to be instantiated using the new operator, on the plus side though, they work as expected now.

Speaking of serialization. I learned that in V8 (and Safari)

isNan(Date.parse( (new Date()).toJSON() )) === true

which, according to the ES5 spec is a bug. The spec states that Date.parse() should be able to parse a string created by Date.doISOStirng() which is what is used by toJSON.

This ended up with me doing an ugly hack (string replacement) and reporting a bug in Chrome (where the bug happens too).

Anyhow. Friday and Saturday I took off the project, but today I was on it again. This time, I was looking into serving static content. This is how we are going to serve the web site after all.

Express does provide a Static plugin, but it’s fairly limited in that it doesn’t do any client side caching which, even though Node.js is crazy fast, seems imperative to me. Also while allowing you to configure the file system path it should serve static content from, it insists on the static content’s URL being /public/whatever, where I would much rather have kept the URL-Space together.

I tried to add If-Modified-Since-support to express’ static plugin, but I hit some strange interraction in how express handles the HTTP request that caused some connections to never close – not what I want.

After two hours of investigating, I was looking at a different solution, which leads us to rewrite two:

tempalias trunk now doesn’t depend on express any more. Instead, it serves the web service part of the URL space manually and for all the static requests, it uses node-paperboy. paperboy doesn’t try to convert node into Rails and it provides nothing but a simple static file handler for your web server which also works completely inside node’s standard method for handling web requests.

I prefer this solution by much because express was doing too much in some cases and too little in others: Express tries to somewhat imitate rails or any other web framework in that it not only provides request routing but also template rendering (in HAML and friends). It also abstracts away node’s HTTP server module and it does so badly as eveidenced by this strange connection not-quite-ending problem.

On the other hand, it doesn’t provide any help if you want to write something that doesn’t return text/html.

Personally, if I’m doing a RESTful service anyways, I see no point in doing any server-side HTML generation. I’d much rather write a service that exposes an API at some URL endpoints and then also a static page that uses JavaScript / AJAX to consume said API. This is where express provides next to no help at all.

So if the question is whether to have a huge dependency which fails at some key points and doesn’t provide any help with other key points or to have a smaller dependency that handles the stuff I’m not interested in, but otherwise doesn’t interfer, I’d much prefer that solution to the first one.

This is why I went with this second rewrite.

Because I was already using a clean MVC separation (the “view” being the JSON I emit in the API – there’s no view in the traditional sense yet), the rewrite was quite hassle-free and basically nothing but syntax work.

After completing that, I felt like removing the known issues from my blog post where I was writing about persistence: Alias generation is now race-free and alias length is stored in redis too. The architecture can still be improved in that I’m currently doing two requests to Redis per ALIAS I’m creating (SETNX and SET). By moving stuff around a little bit, I can get away with just the SETNX.

On the other hand, let me show you this picture here:

Screenshot of ab running in a terminalConsidering that the current solution is already creating 1546 aliases per second at a concurrency of 100 requests, I can probably get away without changing the alias creation code any more.

And in case you ask: The static content is served with 3000 requests per second – again with a concurrency of 100.

Node is fast.

Really.

Tomorrow: Philip learns CSS – I’m already dreading this final step to enlightenment: Creating the HTML/CSS front-end UI according to the awesome design provided by Richard.