AV Programs as a Security Risk

Imagine you were logged into your machine as an administrator. Imagine you’re going to double-click every single attachment in every single email you get. Imagine you’re going to launch every single file your browser downloads. Imagine you answer affirmative on every single prompt to install the latest whatever. Imagine you unpack every single archive sent to you and you launch ever single file in those archives.

This is the position that AV programs put themselves on your machine if they want to have any chance at being able to actually detect malware. Just checking whether a file contains a known byte signature has stopped being a reliable method for detecting viruses long ago.

It makes sense. If I’m going to re-distribute some well-known piece of malware, all I have to do is to obfuscate it a little bit or encrypt it with a static key and my piece of malware will naturally not match any signature of any existing malware.

The loader-stub might, but if I’m using any of the existing installer packagers, then I don’t look any different than any other setup utility for any other piece of software. No AV vendor can yet affort to black-list all installers.

So the only reliable way to know whether a piece of software is malware or not, is to start running it in order to at least get it to extract/decrypt itself.

So here we are in a position where a anti malware program either is useless placebo or it has to put itself into the position I have started this article with.

Personally, I think it is impossible to safely run a piece of software in a way that it cannot do any harm to the host machine.

AV vendors could certainly try to make it as hard as possible for malware to take over a host machine, but here we are in 2016 where most of the existing AV programs are based on projects started in the 90ies where software quality and correctness was even less of a focus than it is today.

We see AV programs disabling OS security features, installing and starting VNC servers and providing any malicious web site with full shell access to the local machine. Or allow malware to completely take over a machine if a few bytes are read no matter where from.

This doesn’t cover the privacy issues yet which are caused by more and more price-pressure the various AV vendors are subject to. If you have to sell the software too cheap to pay for its development (or even give it away for free), then you need to open other revenue streams.

Being placed in such a privileged position like AV tools are, it’s no wonder what kinds of revenue streams are now in process of being tapped…

AV programs by definition put themselves into an extremely dangerous spot on your machine: In order to read every file your OS wants to access, it has to run with adminitrative rights and in order to actually protect you it has to understand many, many more file formats than what you have applications for on your machine.

AV software has to support every existing archive format, even long obsolete ones because who knows – you might have some application somewhere that can unpack it; it has to support every possibly existing container format and it has to support all kinds of malformed files.

If you try to open a malformed file with some application, then the application has the freedom to crash. An AV program must keep going and try even harder to see into the file to make sure it’s just corrupt and not somehow malicious.

And as stated above: Once it finally got to some executable payload, it often has no chance but to actually execute it, at least partially.

This must be some of the most difficult thing to get right in all of engineering: Being placed at a highly privileged spot and being tasked to then deal with content that’s malicious per definitionem is an incredibly difficult task and when combined with obviously bad security practices (see above), I come to the conclusion that installing AV programs is actually lowering the overall security of your machines.

Given a fully patched OS, installing an AV tool will greatly widen the attack surface as now you’re putting a piece of software on your machine that will try and make sense of every single byte going in and out of your machine, something your normal OS will not do.

AV tools have the choice of doing nothing against any but the most common threats if they decide to do pure signature matching, or of potentially putting your machine at risk.

AV these days might provide a very small bit of additional security against well-known threats (though against those you’re also protected if you apply the latest OS patches and don’t work as an admin) but they open your installation wide for all kinds of targeted attacks or really nasty 0-day exploits that can bring down your network without any user-interaction what so ever.

If asked what to do these days, I would give the strong recommendation to not install AV tools. Keep all the software you’re running up to date and white-list the applications you want your users to run. Make use of white-listing by code-signatures to, say, allow everything by a specific vendor. Or all OS components.

If your users are more tech-savy (like developers or sys admins), don’t whitelist but also don’t install AV on their machines. They are normally good enough to not accidentally run malware and the risk of them screwing up is much lower than the risk of somebody exploiting the latest flaw in your runs-as-admin-and-launches-every-binary piece of security software.

AJAX, Architecture, Frameworks and Hacks

Today I was talking with @brainlock about JavaScript, AJAX and Frameworks and about two paradigms that are in use today:

The first is the “traditional” paradigm where your JS code is just glorified view code. This is how AJAX worked in the early days and how people are still using it. Your JS-code intercepts a click somewhere, sends an AJAX request to the server and gets back either more JS code which just gets evaulated (thus giving the server kind of indirect access to the client DOM) or a HTML fragment which gets inserted at the appropriate spot.

This means that your JS code will be ugly (especially the code coming from the server), but it has the advantage that all your view code is right there where all your controllers and your models are: on the server. You see this pattern in use on the 37signals pages or in the github file browser for example.

Keep the file browser in mind as I’m going to use that for an example later on.

The other paradigm is to go the other way around an promote JS to a first-class language. Now you build a framework on the client end and transmit only data (XML or JSON, but mostly JSON these days) from the server to the client. The server just provides a REST API for the data plus serves static HTML files. All the view logic lives only on the client side.

The advantages are that you can organize your client side code much better, for example using backbone, that there’s no expensive view rendering on the server side and that you basically get your third party API for free because the API is the only thing the server provides.

This paradigm is used for the new twitter webpage or in my very own tempalias.com.

Now @brainlock is a heavy proponent of the second paradigm. After being enlightened by the great Crockford, we both love JS and we both worked on huge messes of client-side JS code which has grown over the years and lacks structure and feels like copy pasta sometimes. In our defense: Tons of that code was written in the pre-enlightened age (2004).

I on the other hand see some justification for the first pattern aswell and I wouldn’t throw it away so quickly.

The main reason: It’s more pragmatic, it’s more DRY once you need graceful degradation and arguably, you can reach your goal a bit faster.

Let me explain by looking at the github file browser:

If you have a browser that supoports the HTML5 history API, then a click on a directory will reload the file list via AJAX and at the same time the URL will be updated using push state (so that the current view keeps its absolute URL which is valid even after you open it in a new browser).

If a browser doesn’t support pushState, it will gracefully degrade by just using the traditional link (and reloading the full page).

Let’s map this functionality to the two paradigms.

First the hacky one:

  1. You render the full page with the file list using a server-side template
  2. You intercept clicks to the file list. If it’s a folder:
  3. you request the new file list
  4. the server now renders the file list partial (in rails terms – basically just the file list part) without the rest of the site
  5. the client gets that HTML code and inserts it in place of the current file list
  6. You patch up the url using push state

done. The view code is only on the server. Whether the file list is requested using the AJAX call or the traditional full page load doesn’t matter. The code path is exactly the same. The only difference is that the rest of the page isn’t rendered in case of an AJAX call. You get graceful degradation and no additional work.

Now assuming you want to keep graceful degradation possible and you want to go the JS framework route:

  1. You render the full page with the file list using a server-side template
  2. You intercept the click to the folder in the file list
  3. You request the JSON representation of the target folder
  4. You use that JSON representation to fill a client-side template which is a copy of the server side partial
  5. You insert that HTML at the place where the file list is
  6. You patch up the URL using push state

The amount of steps is the same, but the amount of work isn’t: If you want graceful degradation, then you write the file list template twice: Once as a server-side template, once as a client-side template. Both are quite similar but usually you’ll be forced to use slightly different syntax. If you update one, you have to update the other or the experience will be different whether you click on a link or you open the URL directly.

Also you are duplicating the code which fills that template: On the server side, you use ActiveRecord or whatever other ORM. On the client side, you’d probably use Backbone to do the same thing but now your backend isn’t the database but the JSON response. Now, Backbone is really cool and a huge timesaver, but it’s still more work than not doing it at all.

OK. Then let’s skip graceful degradation and make this a JS only client app (good luck trying to get away with that). Now the view code on the server goes away and you are just left with the model on the server to retrieve the data, with the model on the client (Backbone helps a lot here, but there’s still a substatial amount of code that needs to be written that otherwise wouldn’t) and with the view code on the client.

Now don’t ge me wrong.

I love the idea of promoting JS to a first class language. I love JS frameworks for big JS only applications. I love having a “free”, dogfooded-by-design REST API. I love building cool architectures.

I’m just thinking that at this point it’s so much work doing it right, that the old ways do have their advantages and that we should not condemn them for being hacky. True. They are. But they are also pragmatic.

Find relation sizes in PostgreSQL

Like so many times before, today I was yet again in the situation where I wanted to know which tables/indexes take the most disk space in a particular PostgreSQL database.

My usual procedure in this case was to dt+ in psql and scan the sizes by eye (this being on my development machine, trying to find out the biggest tables I could clean out to make room).

But once you’ve done that a few times and considering that dt+ does nothing but query some PostgreSQL internal tables, I thought that I want this solved in an easier way that also is less error prone. In the end I just wanted the output of dt+ sorted by size.

The lead to some digging in the source code of psql itself (src/bin/psql) where I quickly found the function that builds the query (listTables in describe.c), so from now on, this is what I’m using when I need to get an overview over all relation sizes ordered by size in descending order:

select
  n.nspname as "Schema",
  c.relname as "Name",
  case c.relkind
     when 'r' then 'table'
     when 'v' then 'view'
     when 'i' then 'index'
     when 'S' then 'sequence'
     when 's' then 'special'
  end as "Type",
  pg_catalog.pg_get_userbyid(c.relowner) as "Owner",
  pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid)) as "Size"
from pg_catalog.pg_class c
 left join pg_catalog.pg_namespace n on n.oid = c.relnamespace
where c.relkind IN ('r', 'v', 'i')
order by pg_catalog.pg_relation_size(c.oid) desc;

Of course I could have come up with this without source code digging, but honestly, I didn’t know about relkind s, about pg_size_pretty and pg_relation_size (I would have thought that one to be stored in some system view), so figuring all of this out would have taken much more time than just reading the source code.

Now it’s here so I remember it next time I need it.

How to kill IE performance

While working on my day job, we are often dealing with huge data tables in HTML augmented with some JavaScript to do calculations with that data.

Think huge shopping cart: You change the quantity of a line item and the line total as well as the order total will change.

This leads to the same data (line items) having three representations:

  1. The model on the server
  2. The HTML UI that is shown to the user
  3. The model that’s seen by JavaScript to do the calculations on the client side (and then updating the UI)

You might think that the JavaScript running in the browser would somehow be able to work with the data from 2) so that the third model wouldn’t be needed, but due to various localization issues (think number formatting) and data that’s not displayed but affects the calculations, that’s not possible.

So the question is: Considering we have some HTML templating language to build 2), how do we get to 3).

Back in 2004 when I initially designed that system (using AJAX before it was widely called AJAX even), I hadn’t seen Crockford’s lectures yet, so I still lived in the “JS sucks” world, where I’ve done something like this

<!-- lots of TRs -->
<tr>
    <td>Column 1 addSet(1234 /*prodid*/, 1 /*quantity*/, 10 /*price*/, /* and, later, more, stuff, so, really, ugly */)</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

(Yeah – as I said: 2004. No object literals, global functions. We had a lot to learn back then, but so did you, so don’t be too angry at me – we improved)

Obviously, this doesn’t scale: As the line items got more complicated, that parameter list grew and grew. The HTML code got uglier and uglier and of course, cluttering the window object is a big no-no too. So we went ahead and built a beautiful design:

<!-- lots of TRs -->
<tr class="lineitem" data-ps-lineitem='{"prodid": 1234, "quantity": 1, "price": 10, "foo": "bar", "blah": "blah"}'>
    <td>Column 1</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

The first iteration was then parsing that JSON every time we needed to access any of the associated data (and serializing again whenever it changed). Of course this didn’t go that well performance-wise, so we began caching and did something like this (using jQuery):

$(function(){
    $('.lineitem').each(function(){
        this.ps_data = $.parseJSON($(this).attr('data-ps-lineitem'));
    });
});

Now each DOM element representing one of these <tr>’s had a ps_data member which allowed for quick access. The JSON had to be parsed only once and then the data was available. If it changed, writing it back didn’t require a re-serialization either – you just changed that property directly.

This design is reasonably clean (still not as DRY as the initial attempt which had the data only in that JSON string) while still providing enough performance.

Until you begin to amass datasets. That is.

Well. Until you do so and expect this to work in IE.

800 rows like this made IE lock up its UI thread for 40 seconds.

So more optimization was in order.

First,

$('.lineitem')

will kill IE. Remember: IE (still) doesn’t have getElementsByClassName, so in IE, jQuery has to iterate the whole DOM and check whether each elements class attribute contains “lineitem”. Considering that IE’s DOM isn’t really fast to start with, this is a HUGE no-no.

So.

$('tr.lineitem')

Nope. Nearly as bad considering there are still at least 800 tr’s to iterate over.

$('#whatever tr.lineitem')

Would help if it weren’t 800 tr’s that match. Using dynaTrace AJAX (highly recommended tool, by the way) we found out that just selecting the elements alone (without the iteration) took more than 10 seconds.

So the general take-away is: Selecting lots of elements in IE is painfully slow. Don’t do that.

But back to our little problem here. Unserializing that JSON at DOM ready time is not feasible in IE, because no matter what we do to that selector, once there are enough elements to handle, it’s just going to be slow.

Now by chunking up the amount of work to do and using setTimeout() to launch various deserialization jobs we could fix the locking up, but the total run time before all data is deserialized will still be the same (or slightly worse).

So what we have done in 2004, even though it was ugly, was way more feasible in IE.

Which is why we went back to the initial design with some improvements:

<!-- lots of TRs -->
<tr class="lineitem">
    <td>Column 1 PopScan.LineItems.add({"prodid": 1234, "quantity": 1, "price": 10, "foo": "bar", "blah": "blah"});</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

phew crisis averted.

Loading time went back to where it was in the 2004 design. It was still bad though. With those 800 rows, IE was still taking more than 10 seconds for the rendering task. dynaTrace revealed that this time, the time was apparently spent rendering.

The initial feeling was that there’s not much to do at that point.

Until we began suspecting the script tags.

Doing this:

<!-- lots of TRs -->
<tr class="lineitem">
    <td>Column 1</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

The page loaded instantly.

Doing this

<!-- lots of TRs -->
<tr class="lineitem">
    <td>Column 1 1===1;</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

it took 10 seconds again.

Considering that IE’s JavaScript engine runs as a COM component, this isn’t actually that surprising: Whenever IE hits a script tag, it stops whatever it’s doing, sends that script over to the COM component (first doing all the marshaling of the data), waits for that to execute, marshals the result back (depending on where the DOM lives and whether the script accesses it, possibly crossing that COM boundary many, many times in between) and then finally resumes page loading.

It has to wait for each script because, potentially, that JavaScript could call document.open() / document.write() at which point the document could completely change.

So the final solution was to loop through the server-side model twice and do something like this:

<!-- lots of TRs -->
<tr class="lineitem">
    <td>Column 1 </td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->
</table>

PopScan.LineItems.add({prodid: 1234, quantity: 1, price: 10, foo: "bar", blah: "blah"});
// 800 more of these

Problem solved. Not too ugly design. Certainly no 2004 design any more.

And in closing, let me give you a couple of things you can do if you want to bring the performance of IE down to its knees:

  • Use broad jQuery selectors. $('.someclass') will cause jQuery to loop through all elements on the page.
  • Even if you try not to be broad, you can still kill performance: $('div.someclass'). The most help jQuery can expect from IE is getElementsByTagName, so while it’s better than iterating all elements, it’s still going over all div’s on your page. Once it’s more than 200, the performance extremely quickly falls down (probably doing some O(n^2) thing somehwere).
  • Use a lot of <script>-tags. Every one of these will force IE to marshal data to the scripting engine COM component and to wait for the result.

Next time, we’ll have a look at how to use jQuery’s delegate() to handle common cases with huge selectors.

stabilizing tempalias

While the maintenance last weekend brought quite a bit of stabilization to the tempalias service, I quickly noticed that it was still dying sooner or later and while before updating node, it died due to not being able to allocate more memory, this time, it died by just not answering any requests any more.

A look at the error log quickly revealed quite many exceptions complaining about a certain request type not being allowed to have a body and finally one complaining about not being able to open a file due to having run out of file handles.

So I quickly improved error logging and restarted the daemon in order to get a stacktrace leading to these tons of exceptions.

This quickly pointed to paperboy which was sending the file even if the request was a HEAD request. http.js in node checks for this and throws whenever you send a body when you should not. That exception lead then to paperboy never closing the file (have I already complained how incredibly difficult it is to do proper exception handling the moment continuations get involved? I think not and I also think it’s a good topic for another diary entry). With the help of lsof I’ve quickly seen that my suspicions were true: the node process serving tempalias had tons of open handles to public/index.html.

I sent a patch for this behavior to @felixge which was quickly applied, so that’s fixed now. I hope it’s of some use for other people too.

Now knowing that having a look at lsof here and then might be a good idea, quickly revealed another problem: While the file handles were gone, I’ve noticed tons and tons of SMTP sockets staying open in CLOSE_WAIT state. Not good as that too will lead to handle starvation sooner or later.

On a hunch, I found out that connecting to the SMTP daemon and then disconnecting, not sending QUIT to let the server disconnect was what was causing the lingering sockets. Clients disconnecting like that is very common in case the sender sends a 5xx response which is what the tempalias daemon was designed for.

So I had to fix that in my fork of the node smtp daemon (the original upstream isn’t interested in daemon functionality and the owner I forked the daemon for doesn’t respond to my pull requests. Hence I’m maintaining my own fork for now).

Futher looks at lsof prove that now we are quite stable in resource consumption: No lingering connections, no unclosed file handles.

But the error log was still filling up. This time something about removeListener needing a function. Thanks to the callstack I now had in my error log, I quickly hunted that one down and fixed it – that was a very stupid mistake. Thankfully, because the mails I usually deliver are small enough so that socket draining usually wasn’t required.

Onwards to the next issue filling the error log: «This deferred has already been resolved».

This comes from the Promise.js library if you emit*() multiple times on the same promise. This time, of course, the callstack was useless (… at <anonymous> – why, thank you), but I was very lucky again in that I tested from home and my mail relay didn’t trust my home IP address and thus denied relaying with a 500 which immediately led to the exception.

Now, this one is crazy: When you call .addErrback() on a Promise before calling addCallback(), your callback will be executed no matter if the errback was executed first.

Promise.js does some really interesting things to simulate polymorphism in JavaScript and I really didn’t want to fix up that library as lately, node.js itself seems go to a simpler continuation style using a callback parameter, so sooner or later, I’ll have to patch up the smtp server library anyways to remove Promise.js if I want to adhere to current node style.

So I took the workaround route by just using addCallback() before addErrback() even though the other order feels more natural to me. In addition, I reported an issue with the author as this is clearly unexpected behavior.

Now the error log is pretty much silent (minus some ECONNRESET exceptions due to clients sending RST packets in mid-transfer, but I think they are uncritical to resource consumption), so I hope the overall stability of the site has improved a bunch – I’d love not having to restart the daemon for more than a day :-)

Do spammers find pleasure in destroying fun stuff?

Recently, while reading through the log file of the mail relay used by tempalias, I noticed a disturbing trend: Apparently, SPAM was being sent through tempalias.

I’ve seen various behaviours. One was to strangely create an alias per second to the same target and then delivering email there.

While I completely fail to understand this scheme, the other one was even more disturbing: Bots were registering {max-usage: 1, days: null} aliases and then sending one mail to them – probably to get around RBL checks they’d hit when sending SPAM directly.

Aside of the fact that I do not want to be helping spammers, this also posed a technical issue: node.js head which I was running back when I developed the service tended to leak memory at times forcing me to restart the service here and then.

Now the additional huge load created by the bots forced me to do that way more often than I wanted to. Of course, the old code didn’t run on current node any more.

Hence I had to take tempalias down for maintenance.

A quick look at my commits on GitHub will show you what I have done:

  • the tempalias SMTP daemon now does RBL checks and immediately disconnects if the connected host is listed.
  • the tempalias HTTP daemon also does RBL checks on alias creation, but it doesn’t check the various DUL lists as the most likely alias creators are most certainly listed in a DUL
  • Per IP, aliases can only be generated every 30 seconds.

This should be some help. In addition, right now, the mail relay is configured to skip sender-checks and sa-exim scans (Spam Assassin on SMTP time as to reject spam before even accepting it into the system) for hosts where relaying is allowed. I intend to change that so that sa-exim and sender verify is done regardless if the connecting host is the tempalias proxy.

Looking at the mail log, I’ve seen the spam count drop to near-zero, so I’m happy, but I know that this is just a temporary victory. Spammers will find ways around the current protection and I’ll have to think of something else (I do have some options, but I don’t want to pre-announce them here for obvious reasons).

On a more happy note: During maintenance I also fixed a few issues with the Bookmarklet which should now do a better job at not coloring all text fields green eventually and at using the target site’s jQuery if available.

Windows 2008 / NAT / Direct connections

Yesterday I ran into an interesting problem with Windows 2008’s implementation of NAT (don’t ask – this was the best solution – I certainly don’t recommend using Windows for this purpose).

Whenever I enabled the NAT service, I was unable to reliably connect to the machine via remote desktop or even any other service that machine was offering. Packets sent to the machine were dropped as if a firewall was in between, but it wasn’t and the Windows firewall was configured to allow remote desktop connections.

Strangely, sometimes and from some hosts I was able to make a connection, but not consistently.

After some digging, this turned out to be a problem with the interface metrics and the server tried to respond over the interface with the private address that wasn’t routed.

So if you are in the same boat, configure the interface metrics of both interfaces manually. Set the metric of the private interface to a high value and the metrics of the public (routed) one to a low value.

At least for me, this instantly fixed the problem.

tempalias – validity limits

I’ve just pushed a small update to tempalias.com that imposes some (generous) limits to the values you can provide for the validity:

  • the maximum amount of days an alias can be valid is now 60 days.
  • the maximum amount of messages that can be sent to an aliases is now set to 100 messages.

I realized that there might be some potential for abusing tempalias.com if the aliases have a practically unlimited duration. Besides, then they wouldn’t be tempaliases any more. Right?

Already generated aliases with longer durations stay valid – true to the spirit of not looking into the data my users provided me with, I’m not going to check the existing aliases.

tempalias.com – bookmarklet work

While the user experience on tempalias.com is already really streamlined, compared to other services that encode the expiration settings and sometimes even the target) into the email address (and are thus exploitable and in some cases requiring you to have an account with them), it loses in that, when you have to register on some site, you will have to open the tempalias.com website in its own window and then manually create the alias.

Wouldn’t it be nice if this worked without having to visit the site?

This video is showing how I want this to work and how the bookmarklet branch on the github project page is already working:

http://vimeo.com/moogaloop.swf?clip_id=11193192&server=vimeo.com&show_title=1&show_byline=0&show_portrait=0&color=00ADEF&fullscreen=1

The workflow will be that you create your first (and probably only) alias manually. In the confirmation screen, you will be presented with a bookmarklet that you can drag to your bookmark bar and that will generate more aliases like the one just generated. This works independently of cookies or user accounts, so it would even work across browsers if you are synchronizing bookmarks between machines.

The actual bookmarklet is just a very small stub that will contain all the configuration for alias creation (so the actual bookmarklet will be the minified version of this file here). The bookmarklet, when executed will add a script tag to the page that actually does the heavy lifting.

The script that’s running in the video above tries really hard to be a good citizen as it’s run in the context of a third party webpage beyond my control:

  • it doesn’t pollute the global namespace. It has to add one function, window.$__tempalias_com,  so it doesn’t reload all the script if you click the bookmark button multiple times.
  • while it depends on jQuery (I’m not doing this in pure DOM), it tries really hard to be a good citizen:
    • if jQuery 1.4.2 is already used on the site, it uses that.
    • if any other jQuery version is installed, it loads 1.4.2 but restores window.jQuery to what it was before.
    • if no jQuery is installed, it loads 1.4.2
    • In all cases, it calls jQuery.noConflict if $ is bound to anything.
  • All DOM manipulation uses really unique class names and event namespaces

While implementing, I noticed that you can’t unbind live events with just their name, so $().die(‘.ta’) didn’t work an I had to provide all events I’m live-binding to. I’m using live here because the bubbling up delegation model works better in a case where there might be many matching elements on any particular page.

Now the next step will be to add some design to the whole thing and then it can go live.

tempalias.com – debriefing

This is the last part of the development diary I was keeping about the creation of a new web service in node.js. You can read the previous installment here.

It’s done.

The layout is finished, the last edges too rough for pushing the thing live are smoothed. tempalias.com is live. After coming really close to finishing the thing yesterday (hence the lack of a posting here – I was too tired when I had to quit at 2:30am) last night, now I could complete the results page and add the needed finishing touches (like a really cool way of catching enter to proceed from the first to the last form field – my favorite hidden feature).

I guess it’s time for a little debriefing:

All in all, the project took a time span of 17 days to implement from start to finish. I did this after work and mostly during weekdays and sundays, so it’s actually 11 days in which work was going on (I also was sick two days). Each day I worked around 4 hours, so all in all, this took around 44 hours to implement.

A significant part of this time went into modifications of third party libraries, while I tried to contact the initial authors to get my changes merged upstream:

  • The author of node-smtp isn’t interested in the SMTP daemon functionality (that wasn’t there when I started and is now completed)
  • The author of redis-node-client didn’t like my patch, but we had a really fruitful discussion and node-redis-client got a lot better at handling dropped connection in the process.
  • The author of node-paperboy has merged my patch for a nasty issue and even tweeted about it (THANKS!)

Before I continue, I want to say a huge thanks to fictorial on github for the awesome discussion I was allowed to have with him about node-redis-client’s handling of dropped connections. I’ve enjoyed every word I was typing and reading.

But back to the project.

Non-third-party code consists of just 1624 lines of code (using wc -l, so not an accurate measurement). This doesn’t factor in the huge amount of changes I made to my fork of node-smtp the daemon part of which was basically non-existant.

Overall, the learnings I made:

  • git and github are awesome. I knew that beforehand, but this just cemented my opinion
  • node.js and friends are still in their infancy. While node removes previously published API on a nearly daily basis (it’s mostly bug-free though), none of the third-party libraries I am using were sufficiently bug-free to use them without change.
  • Asynchronous programming can be fun if you have closures at your disposal
  • Asynchronous programming can be difficult once the nesting gets deep enough
  • Making any variable not declared with var global is the worst design decision I have ever seen in my life especially in node where we are adding concurrency to the mix)
  • While it’s possible (and IMHO preferrable) to have a website done in just RESTful webservices and static/javascript frontend, sometimes just a tiny little bit of HTML generation could be useful. Still. Everything works without emitting even a single line of dynamically generated HTML code.
  • Node is crazy fast.

Also, I want to take the opportunity and say huge thanks to:

  • the guys behind node.js. I would have had to do this in PHP or even rails (which is even less fitting than PHP as it provides so much functionality around generating dynamic HTML and so little around pure JSON based web services) without you guys!
  • Richard for his awesome layout
  • fictorial for redis-node-client and for the awesome discussion I was having with him.
  • kennethkalmer for his work on node-smtp even though it was still incomplete – you lead me on the right tracks how to write an SMTP daemon. Thank you!
  • @felixge for node-paperboy – static file serving done right
  • The guys behind sammy – writing fully JS based AJAX apps has never been easier and more fun.

Thank you all!

The next step will be marketing: Seing this is built on node.js and an actually usable project – way beyond the usual little experiments, I hope to gather some interest in the Hacker community. Seing it also provides a real-world use, I’ll even go and try to submit news about the project on more general outlets. And of course on the Security Now! feedback page as this is inspired by their episode 242.