tempalias.com – rewrites

This is yet another installment in my series of posts about building a web service in node.js. The previous post is here.

Between the last post and the current trunk of tempalias lie two substantial rewrites of core components of the service. One thing is that I completely misused Object.create(), which takes an object that becomes the prototype of the object you are creating. I was under the wrong impression that it works like Crockford’s object.create(), i.e. that it creates a clone of the object you pass in.

Also, I learned that only Function objects actually have a prototype property.

Not knowing these two things made it impossible to actually deserialize the JSON representation of an alias that was previously stored in redis. This led to the first rewrite – this time of lib/tempalias.js. Aliases now work more like standard JS objects and need to be instantiated using the new operator; on the plus side, they now behave as expected.
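To illustrate the misunderstanding with a generic sketch (this is not the actual code in lib/tempalias.js): ES5’s Object.create() makes its argument the prototype of a new, otherwise empty object; it does not copy any properties. A plain constructor, on the other hand, puts the deserialized fields on the instance itself:

var template = { id: null, expires: null };
var alias = Object.create(template);
alias.hasOwnProperty('id');                  // false: 'id' lives on the prototype only

// A constructor invoked with `new` makes the data own properties again,
// which is what deserializing from redis needs:
function Alias(data) {
  this.id = data.id;
  this.expires = data.expires;
}
var restored = new Alias(JSON.parse('{"id":"abc","expires":1271678400000}'));
restored.hasOwnProperty('id');               // true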

Speaking of serialization. I learned that in V8 (and Safari)

isNaN(Date.parse( (new Date()).toJSON() )) === true

which, according to the ES5 spec, is a bug. The spec states that Date.parse() should be able to parse a string produced by Date.prototype.toISOString(), which is what toJSON uses.

This ended up with me doing an ugly hack (string replacement) and reporting a bug against Chrome (which exhibits the problem too).
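For the record, a workaround along these lines avoids Date.parse() entirely (a sketch, not necessarily the exact string manipulation I ended up with):

// Pull the fields out of the ISO-8601 string that toJSON() produces and
// hand them to Date.UTC(), sidestepping the broken Date.parse().
function parseISODate(str) {
  var m = /^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?Z$/.exec(str);
  if (!m) return new Date(NaN);
  return new Date(Date.UTC(+m[1], +m[2] - 1, +m[3],
                           +m[4], +m[5], +m[6], +(m[7] || 0)));
}

parseISODate((new Date()).toJSON());   // a valid Date again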

Anyhow. I took Friday and Saturday off from the project, but today I was back at it. This time, I was looking into serving static content – this is how we are going to serve the web site, after all.

Express does provide a static plugin, but it’s fairly limited: it doesn’t do any client-side caching which, even though Node.js is crazy fast, seems imperative to me. Also, while it lets you configure the file system path to serve static content from, it insists on the static content’s URLs living under /public/whatever, where I would much rather have kept the URL space together.

I tried to add If-Modified-Since support to express’ static plugin, but I hit some strange interaction in how express handles the HTTP request that caused some connections to never close – not what I want.
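Conceptually, all I wanted from the static handler is something like this (a rough sketch in plain node, not the patch I attempted against express):

var fs = require('fs');

function serveFile(req, res, filename) {
  fs.stat(filename, function (err, stat) {
    if (err) { res.writeHead(404); return res.end(); }
    var since = Date.parse(req.headers['if-modified-since'] || '');
    // HTTP dates only carry second precision, so compare at that granularity
    if (!isNaN(since) && Math.floor(stat.mtime.getTime() / 1000) * 1000 <= since) {
      res.writeHead(304);                          // the client's copy is still fresh
      return res.end();
    }
    res.writeHead(200, { 'Last-Modified': stat.mtime.toUTCString() });
    fs.createReadStream(filename).pipe(res);
  });
}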

After two hours of investigating, I was looking at a different solution, which leads us to rewrite two:

tempalias trunk no longer depends on express. Instead, it serves the web service part of the URL space manually and uses node-paperboy for all static requests. paperboy doesn’t try to turn node into Rails: it provides nothing but a simple static file handler for your web server, and it works entirely within node’s standard way of handling web requests.
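The resulting structure is roughly this (endpoint names and paths are made up for illustration; this is not the actual tempalias code):

var http = require('http'),
    path = require('path'),
    paperboy = require('paperboy');

var WEB_ROOT = path.join(__dirname, 'public');

http.createServer(function (req, res) {
  if (req.url.indexOf('/aliases') === 0) {
    handleAliasRequest(req, res);          // hypothetical JSON API handler
  } else {
    paperboy.deliver(WEB_ROOT, req, res);  // everything else is a static file
  }
}).listen(8080);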

I much prefer this solution because express was doing too much in some cases and too little in others: express tries to somewhat imitate Rails or any other web framework in that it provides not only request routing but also template rendering (in HAML and friends). It also abstracts away node’s HTTP server module, and it does so badly, as evidenced by this strange connection not-quite-closing problem.

On the other hand, it doesn’t provide any help if you want to write something that doesn’t return text/html.

Personally, if I’m doing a RESTful service anyways, I see no point in doing any server-side HTML generation. I’d much rather write a service that exposes an API at some URL endpoints and then also a static page that uses JavaScript / AJAX to consume said API. This is where express provides next to no help at all.

So if the question is whether to have a huge dependency that fails at some key points and provides no help with others, or a smaller dependency that handles the stuff I’m not interested in but otherwise doesn’t interfere, I much prefer the latter.

This is why I went with this second rewrite.

Because I was already using a clean MVC separation (the “view” being the JSON I emit in the API – there’s no view in the traditional sense yet), the rewrite was quite hassle-free and basically nothing but syntax work.

After completing that, I felt like removing the known issues from my blog post about persistence: alias generation is now race-free and the alias length is stored in redis too. The architecture can still be improved in that I’m currently doing two requests to Redis per alias I create (SETNX and SET). By moving things around a little, I could get away with just the SETNX.
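The core of the race-free part boils down to this (client API and names are illustrative, not the actual tempalias code): SETNX only succeeds if the key does not exist yet, so two concurrent requests can never claim the same alias id.

function storeAlias(client, alias, callback) {
  client.setnx('alias:' + alias.id, JSON.stringify(alias), function (err, created) {
    if (err) return callback(err);
    if (!created) {
      // someone else grabbed this id in the meantime: pick another and retry
      alias.id = generateAliasId();        // hypothetical id generator
      return storeAlias(client, alias, callback);
    }
    callback(null, alias);
  });
}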

On the other hand, let me show you this picture here:

[Screenshot of ab running in a terminal]

Considering that the current solution is already creating 1546 aliases per second at a concurrency of 100 requests, I can probably get away without changing the alias creation code any more.

And in case you ask: The static content is served with 3000 requests per second – again with a concurrency of 100.
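For reference, those numbers come from ApacheBench runs along these lines (port, endpoint and payload file are illustrative, not the exact commands I ran):

ab -n 10000 -c 100 -p alias.json -T application/json http://127.0.0.1:8080/aliases
ab -n 10000 -c 100 http://127.0.0.1:8080/index.html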

Node is fast.

Really.

Tomorrow: Philip learns CSS – I’m already dreading this final step to enlightenment: Creating the HTML/CSS front-end UI according to the awesome design provided by Richard.

Tunnel munin nodes over HTTP

Last time I talked about Munin, the one system monitoring tool that I feel works well enough for me to actually bother working with. Harsh words, I know, but the key to every solution is simplicity. And simple Munin is. Simple, but still powerful enough to do everything I would want it to do.

The one problem I had with it is that querying remote nodes happens over a custom TCP port (4949), which doesn’t work behind firewalls.

There are some SSH tunneling solutions around, but what do you do if even SSH is not an option, because the remote access method provided to you relies on some kind of VPN technology or access token?

Even if you could keep a long-running VPN connection open, it’s a very resource-intensive solution as it ties up resources on the VPN gateway. But this point is moot anyways, because nearly all VPNs terminate long-running connections – and if re-establishing the connection requires physical interaction, you are basically done here.

This is why I have created a neat little solution that tunnels the munin traffic over HTTP. It consists of a local proxy server that your munin monitoring process connects to and a little CGI script on the remote end.

This causes multiple HTTP connections per query interval (the proxy uses Keep-Alive, though, so it’s not new TCP connections we are talking about – just hits in the access.log you’ll have to filter out somehow), because it’s impossible for a CGI script to keep the connection open and send data both ways – at least not if your server side runs plain PHP, which is the case in the setup I was designing this for.
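To give an idea of the shape of the thing, here is a deliberately simplified sketch of the proxy side (host, script name and protocol details are made up; this is not the code from the repository): munin connects to the local TCP port, and every protocol command is forwarded as an HTTP request to the CGI script, which talks to the real node and returns the reply.

var net = require('net'),
    http = require('http');

net.createServer(function (socket) {
  socket.setEncoding('utf8');
  socket.write('# munin node tunneled over HTTP\n');
  socket.on('data', function (line) {
    // "list", "config cpu", "fetch cpu", ... one command per line
    http.get({
      host: 'monitored.example.com',
      path: '/munin-tunnel.php?cmd=' + encodeURIComponent(line.trim())
    }, function (res) {
      res.setEncoding('utf8');
      res.on('data', function (chunk) { socket.write(chunk); });
    });
  });
}).listen(4949, '127.0.0.1');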

Anyways – the solution works flawlessly and helps me monitor a server behind one hell of a firewall and behind a reverse proxy.

You’ll find the code here (on GitHub as usual) and some explanation on how to use it is here.

Licensed under the MIT license as usual.

The return of Expect: 100-continue

Yesterday I had to work with a PHP application that uses the CURL library to send an HTTP POST request to a lighttpd server.

Strangely enough, I seemed unable to get anything back from the server when using PHP, while I got the correct answer when using wget as a reference.

This made me check the lighttpd log, and I once more came across the friendly error 417 (I recommend you read that earlier entry, as this post very much builds on it).

A quick check with Wireshark confirmed: curl was sending the Expect: 100-continue header.

Personally, I think the 100-continue mechanism is a good thing, and it even seems to me that the curl library is intelligent about it and only sends the header when the size of the data to send exceeds a certain threshold.

Also, even though people are complaining about it, I think lighttpd does the right thing: the Expect header is mandatory, and if lighttpd doesn’t support this particular expectation, error 417 is the only viable option.

What I think, though, is that the libraries should detect this situation automatically.

This is because they are creating behavior that’s inconsistent with the other request types: GET, DELETE and HEAD requests all follow a fire-and-forget paradigm, and the libraries employ a 1:1 mapping: set up the request, send it, return the received data.

With POST (and maybe PUT), the library changes that paradigm and in fact sends two requests over the wire while pretending in the interface that it’s only sending one.
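On the wire, the exchange looks roughly like this (illustrative request; the point is the extra round trip before the body goes out):

POST /submit HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 131072
Expect: 100-continue

HTTP/1.1 100 Continue

…and only now does the client send the actual request body.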

If it does that, then it should at least be capable enough to handle the cases where this scheme of transparently changing semantics breaks.

Anyways: The fix for the curl library in PHP is:

curl_setopt($ch, CURLOPT_HTTPHEADER, array('Expect:'));

Though I’m not sure how pure this solution is.