programming- gnegg

AJAX, Architecture, Frameworks and Hacks

Today I was talking with @brainlock about JavaScript, AJAX and Frameworks and about two paradigms that are in use today:

The first is the “traditional” paradigm where your JS code is just glorified view code. This is how AJAX worked in the early days and how people are still using it. Your JS-code intercepts a click somewhere, sends an AJAX request to the server and gets back either more JS code which just gets evaulated (thus giving the server kind of indirect access to the client DOM) or a HTML fragment which gets inserted at the appropriate spot.

This means that your JS code will be ugly (especially the code coming from the server), but it has the advantage that all your view code is right there where all your controllers and your models are: on the server. You see this pattern in use on the 37signals pages or in the github file browser for example.

Keep the file browser in mind as I’m going to use that for an example later on.

The other paradigm is to go the other way around an promote JS to a first-class language. Now you build a framework on the client end and transmit only data (XML or JSON, but mostly JSON these days) from the server to the client. The server just provides a REST API for the data plus serves static HTML files. All the view logic lives only on the client side.

The advantages are that you can organize your client side code much better, for example using backbone, that there’s no expensive view rendering on the server side and that you basically get your third party API for free because the API is the only thing the server provides.

This paradigm is used for the new twitter webpage or in my very own tempalias.com.

Now @brainlock is a heavy proponent of the second paradigm. After being enlightened by the great Crockford, we both love JS and we both worked on huge messes of client-side JS code which has grown over the years and lacks structure and feels like copy pasta sometimes. In our defense: Tons of that code was written in the pre-enlightened age (2004).

I on the other hand see some justification for the first pattern aswell and I wouldn’t throw it away so quickly.

The main reason: It’s more pragmatic, it’s more DRY once you need graceful degradation and arguably, you can reach your goal a bit faster.

Let me explain by looking at the github file browser:

If you have a browser that supoports the HTML5 history API, then a click on a directory will reload the file list via AJAX and at the same time the URL will be updated using push state (so that the current view keeps its absolute URL which is valid even after you open it in a new browser).

If a browser doesn’t support pushState, it will gracefully degrade by just using the traditional link (and reloading the full page).

Let’s map this functionality to the two paradigms.

First the hacky one:

You render the full page with the file list using a server-side template
You intercept clicks to the file list. If it’s a folder:
you request the new file list
the server now renders the file list partial (in rails terms – basically just the file list part) without the rest of the site
the client gets that HTML code and inserts it in place of the current file list
You patch up the url using push state

done. The view code is only on the server. Whether the file list is requested using the AJAX call or the traditional full page load doesn’t matter. The code path is exactly the same. The only difference is that the rest of the page isn’t rendered in case of an AJAX call. You get graceful degradation and no additional work.

Now assuming you want to keep graceful degradation possible and you want to go the JS framework route:

You render the full page with the file list using a server-side template
You intercept the click to the folder in the file list
You request the JSON representation of the target folder
You use that JSON representation to fill a client-side template which is a copy of the server side partial
You insert that HTML at the place where the file list is
You patch up the URL using push state

The amount of steps is the same, but the amount of work isn’t: If you want graceful degradation, then you write the file list template twice: Once as a server-side template, once as a client-side template. Both are quite similar but usually you’ll be forced to use slightly different syntax. If you update one, you have to update the other or the experience will be different whether you click on a link or you open the URL directly.

Also you are duplicating the code which fills that template: On the server side, you use ActiveRecord or whatever other ORM. On the client side, you’d probably use Backbone to do the same thing but now your backend isn’t the database but the JSON response. Now, Backbone is really cool and a huge timesaver, but it’s still more work than not doing it at all.

OK. Then let’s skip graceful degradation and make this a JS only client app (good luck trying to get away with that). Now the view code on the server goes away and you are just left with the model on the server to retrieve the data, with the model on the client (Backbone helps a lot here, but there’s still a substatial amount of code that needs to be written that otherwise wouldn’t) and with the view code on the client.

Now don’t ge me wrong.

I love the idea of promoting JS to a first class language. I love JS frameworks for big JS only applications. I love having a “free”, dogfooded-by-design REST API. I love building cool architectures.

I’m just thinking that at this point it’s so much work doing it right, that the old ways do have their advantages and that we should not condemn them for being hacky. True. They are. But they are also pragmatic.

rails, PostgreSQL and the native uuid type

UUID have the very handy property that they are uniqe and there are quite many of them for you to use. Also they are difficult to guess and knowing the UUID of one object, it’s very hard to guess a valid UUID of another object.

This makes UUIDs perfect for identifying things in web applications:

Even if you shard across multiple machines, each machine can independently generate primary keys without (realistic) fear of overlapping.
You can generate them without using any kind of locks.
Sometimes, you have to expose such keys to the user. If possible, you will of course do authorization checks, but it still makes sense not allowing users know about neighboring keysThis gets even more important when you are not able to do authorization keys because the resource you are referring to is public (like a mail alias) but it should still not possible to know other items if you know one.

Knowing that UUIDs are a good thing, you might want to use them in your application (or you just have to in the last case above).

There are multiple recipes out there that show how to do it in a rails application (this one for example).

All of these recipes store UUIDs as varchar’s in your database. In general, that’s fine and also the only thing you can do as most databases don’t have a native data type for UUIDs.

PostgreSQL the other hand indeed has a native 128 bit integer type to store UUID.

This is more space efficient than storing the UUID in string form (288 bit) and it might be a tad bit faster when doing comparison operations on the database as integer operations (even if they are this big) require a constant amount of operations whereas comparing two string UUIDs is a string comparison which is dependent on the string size and size of the matching parts.

So maybe for the (minuscule) speed increase or for the purpose of correct semantics or just for interoperability with other applications, you might want to use native PostgreSQL UUIDs from your Rails (or other, but without the abstraction of a “Migration”, just using UUID is trivial) applications.

This already works quite nicely if you generate the columns as strings in your migrations and then manually send an alter table (whenever you restore the schema from scratch).

But if you want to create the column with the correct type directly from the migration and you want the column to be created correctly when using rake db:schema:load, then you need a bit of additional magic, especially if you want to still support other databases.

In my case, I was using PostgreSQL in production (what else?), but on my local machine, for the purpose of getting started quickly, I wanted to still be able to use SQLite for development.

In the end, everything boils down to monkey patching ActiveRecord::ConnectionAdapters::Adapters and PostgreSQLColumn of the same module. So here’s what I’ve addded to config/initializers/uuuid_support.rb (Rails 3.0.):

module ActiveRecord
  module ConnectionAdapters
    SQLiteAdapter.class_eval do
      def native_database_types_with_uuid_support
        a = native_database_types_without_uuid_support
        a[:uuid] = {:name => 'varchar', :limit => 36}
        return a
      end
      alias_method_chain :native_database_types, :uuid_support
    end if ActiveRecord::Base.connection.adapter_name == 'SQLite'

    if ActiveRecord::Base.connection.adapter_name == 'PostgreSQL'
      PostgreSQLAdapter.class_eval do
        def native_database_types_with_uuid_support
          a = native_database_types_without_uuid_support
          a[:uuid] = {:name => 'uuid'}
          return a
        end
        alias_method_chain :native_database_types, :uuid_support
      end

      PostgreSQLColumn.class_eval do
        def simplified_type_with_uuid_support(field_type)
          if field_type == 'uuid'
            :uuid
          else
            simplified_type_without_uuid_support(field_type)
          end
        end
        alias_method_chain :simplified_type, :uuid_support
      end
    end
  end
end

In your migrations you can then use the :uuid type. In my sample case, this was it:

class AddGuuidToSites < ActiveRecord::Migration
  def self.up
    add_column :sites, :guuid, :uuid
    add_index :sites, :guuid
  end

  def self.down
    remove_column :sites, :guuid
  end
end

Maybe with a bit better Ruby knowledge than I have, it should be possible to just monkey-patch the parent AbstractAdaper while still calling the method of the current subclass. This would not require a separate patch for all adapters in use.

For my case which was just support for SQLite and PostgreSQL, the above initializer was fine though.

sacy 0.2 – now with less, sass and scss

To fresh up your memory (it has been a while): sacy is a Smarty (both 2 and 3) plugin that turns

{asset_compile}
<link type="text/css" rel="stylesheet" href="/styles/file1.css" />
<link type="text/css" rel="stylesheet" href="/styles/file2.css" />
<link type="text/css" rel="stylesheet" href="/styles/file3.css" />
<link type="text/css" rel="stylesheet" href="/styles/file4.css" />
 type="text/javascript" src="/jslib/file1.js">
 type="text/javascript" src="/jslib/file2.js">
 type="text/javascript" src="/jslib/file3.js">
{/asset_compile}

into

<link type="text/css" rel="stylesheet" href="/assets/files-1234abc.css" />
 type="text/javascript" src="/assets/files-abc123.js">

It does this without you ever having to manually run a compiler, without serving all your assets through some script (thus saving RAM) and without worries about stale copies being served. In fact, you can serve all static files generated with sacy with cache headers telling browsers to never revisit them!

All of this, using two lines of code (wrap as much content as you want in {asset_compile}…{/asset_compile})

Sacy has been around for a bit more than a year now and has since been in production use in PopScan. During this time, no single bug in Sacy has been found, so I would say that it’s pretty usable.

Coworkers have bugged me enough about how much better less or sass would be compared to pure CSS so that I finally decided to update sacy to allow us to use less in PopScan:

Aside of consolidating and minimizing CSS and JavaScript, sacy can now also transform less and sass (or scss) files using the exact same method as before but just changing the mime-type:

<link type="text/x-less" rel="stylesheet" href="/styles/file1.less" />
<link type="text/x-sass" rel="stylesheet" href="/styles/file2.sass" />
<link type="text/x-scss" rel="stylesheet" href="/styles/file3.scss" />

Like before, you don’t concern yourself with manual compilation or anything. Just use the links as is and sacy will do the magic for you.

Interested? Read the (by now huge) documentation on my github page!

Find relation sizes in PostgreSQL

Like so many times before, today I was yet again in the situation where I wanted to know which tables/indexes take the most disk space in a particular PostgreSQL database.

My usual procedure in this case was to dt+ in psql and scan the sizes by eye (this being on my development machine, trying to find out the biggest tables I could clean out to make room).

But once you’ve done that a few times and considering that dt+ does nothing but query some PostgreSQL internal tables, I thought that I want this solved in an easier way that also is less error prone. In the end I just wanted the output of dt+ sorted by size.

The lead to some digging in the source code of psql itself (src/bin/psql) where I quickly found the function that builds the query (listTables in describe.c), so from now on, this is what I’m using when I need to get an overview over all relation sizes ordered by size in descending order:

select
  n.nspname as "Schema",
  c.relname as "Name",
  case c.relkind
     when 'r' then 'table'
     when 'v' then 'view'
     when 'i' then 'index'
     when 'S' then 'sequence'
     when 's' then 'special'
  end as "Type",
  pg_catalog.pg_get_userbyid(c.relowner) as "Owner",
  pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid)) as "Size"
from pg_catalog.pg_class c
 left join pg_catalog.pg_namespace n on n.oid = c.relnamespace
where c.relkind IN ('r', 'v', 'i')
order by pg_catalog.pg_relation_size(c.oid) desc;

Of course I could have come up with this without source code digging, but honestly, I didn’t know about relkind s, about pg_size_pretty and pg_relation_size (I would have thought that one to be stored in some system view), so figuring all of this out would have taken much more time than just reading the source code.

Now it’s here so I remember it next time I need it.

How to kill IE performance

While working on my day job, we are often dealing with huge data tables in HTML augmented with some JavaScript to do calculations with that data.

Think huge shopping cart: You change the quantity of a line item and the line total as well as the order total will change.

This leads to the same data (line items) having three representations:

The model on the server
The HTML UI that is shown to the user
The model that’s seen by JavaScript to do the calculations on the client side (and then updating the UI)

You might think that the JavaScript running in the browser would somehow be able to work with the data from 2) so that the third model wouldn’t be needed, but due to various localization issues (think number formatting) and data that’s not displayed but affects the calculations, that’s not possible.

So the question is: Considering we have some HTML templating language to build 2), how do we get to 3).

Back in 2004 when I initially designed that system (using AJAX before it was widely called AJAX even), I hadn’t seen Crockford’s lectures yet, so I still lived in the “JS sucks” world, where I’ve done something like this

<!-- lots of TRs -->
<tr>
    <td>Column 1 addSet(1234 /*prodid*/, 1 /*quantity*/, 10 /*price*/, /* and, later, more, stuff, so, really, ugly */)</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

(Yeah – as I said: 2004. No object literals, global functions. We had a lot to learn back then, but so did you, so don’t be too angry at me – we improved)

Obviously, this doesn’t scale: As the line items got more complicated, that parameter list grew and grew. The HTML code got uglier and uglier and of course, cluttering the window object is a big no-no too. So we went ahead and built a beautiful design:

<!-- lots of TRs -->
<tr class="lineitem" data-ps-lineitem='{"prodid": 1234, "quantity": 1, "price": 10, "foo": "bar", "blah": "blah"}'>
    <td>Column 1</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

The first iteration was then parsing that JSON every time we needed to access any of the associated data (and serializing again whenever it changed). Of course this didn’t go that well performance-wise, so we began caching and did something like this (using jQuery):

$(function(){
    $('.lineitem').each(function(){
        this.ps_data = $.parseJSON($(this).attr('data-ps-lineitem'));
    });
});

Now each DOM element representing one of these <tr>’s had a ps_data member which allowed for quick access. The JSON had to be parsed only once and then the data was available. If it changed, writing it back didn’t require a re-serialization either – you just changed that property directly.

This design is reasonably clean (still not as DRY as the initial attempt which had the data only in that JSON string) while still providing enough performance.

Until you begin to amass datasets. That is.

Well. Until you do so and expect this to work in IE.

800 rows like this made IE lock up its UI thread for 40 seconds.

So more optimization was in order.

First,

$('.lineitem')

will kill IE. Remember: IE (still) doesn’t have getElementsByClassName, so in IE, jQuery has to iterate the whole DOM and check whether each elements class attribute contains “lineitem”. Considering that IE’s DOM isn’t really fast to start with, this is a HUGE no-no.

So.

$('tr.lineitem')

Nope. Nearly as bad considering there are still at least 800 tr’s to iterate over.

$('#whatever tr.lineitem')

Would help if it weren’t 800 tr’s that match. Using dynaTrace AJAX (highly recommended tool, by the way) we found out that just selecting the elements alone (without the iteration) took more than 10 seconds.

So the general take-away is: Selecting lots of elements in IE is painfully slow. Don’t do that.

But back to our little problem here. Unserializing that JSON at DOM ready time is not feasible in IE, because no matter what we do to that selector, once there are enough elements to handle, it’s just going to be slow.

Now by chunking up the amount of work to do and using setTimeout() to launch various deserialization jobs we could fix the locking up, but the total run time before all data is deserialized will still be the same (or slightly worse).

So what we have done in 2004, even though it was ugly, was way more feasible in IE.

Which is why we went back to the initial design with some improvements:

<!-- lots of TRs -->
<tr class="lineitem">
    <td>Column 1 PopScan.LineItems.add({"prodid": 1234, "quantity": 1, "price": 10, "foo": "bar", "blah": "blah"});</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

phew crisis averted.

Loading time went back to where it was in the 2004 design. It was still bad though. With those 800 rows, IE was still taking more than 10 seconds for the rendering task. dynaTrace revealed that this time, the time was apparently spent rendering.

The initial feeling was that there’s not much to do at that point.

Until we began suspecting the script tags.

Doing this:

<!-- lots of TRs -->
<tr class="lineitem">
    <td>Column 1</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

The page loaded instantly.

Doing this

<!-- lots of TRs -->
<tr class="lineitem">
    <td>Column 1 1===1;</td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->

it took 10 seconds again.

Considering that IE’s JavaScript engine runs as a COM component, this isn’t actually that surprising: Whenever IE hits a script tag, it stops whatever it’s doing, sends that script over to the COM component (first doing all the marshaling of the data), waits for that to execute, marshals the result back (depending on where the DOM lives and whether the script accesses it, possibly crossing that COM boundary many, many times in between) and then finally resumes page loading.

It has to wait for each script because, potentially, that JavaScript could call document.open() / document.write() at which point the document could completely change.

So the final solution was to loop through the server-side model twice and do something like this:

<!-- lots of TRs -->
<tr class="lineitem">
    <td>Column 1 </td>
    <td>Column 2</td>
    <td>Column 3</td>
</tr>
<!-- lots of TRs -->
</table>

PopScan.LineItems.add({prodid: 1234, quantity: 1, price: 10, foo: "bar", blah: "blah"});
// 800 more of these

Problem solved. Not too ugly design. Certainly no 2004 design any more.

And in closing, let me give you a couple of things you can do if you want to bring the performance of IE down to its knees:

Use broad jQuery selectors. $('.someclass') will cause jQuery to loop through all elements on the page.
Even if you try not to be broad, you can still kill performance: $('div.someclass'). The most help jQuery can expect from IE is getElementsByTagName, so while it’s better than iterating all elements, it’s still going over all div’s on your page. Once it’s more than 200, the performance extremely quickly falls down (probably doing some O(n^2) thing somehwere).
Use a lot of <script>-tags. Every one of these will force IE to marshal data to the scripting engine COM component and to wait for the result.

Next time, we’ll have a look at how to use jQuery’s delegate() to handle common cases with huge selectors.

stabilizing tempalias

While the maintenance last weekend brought quite a bit of stabilization to the tempalias service, I quickly noticed that it was still dying sooner or later and while before updating node, it died due to not being able to allocate more memory, this time, it died by just not answering any requests any more.

A look at the error log quickly revealed quite many exceptions complaining about a certain request type not being allowed to have a body and finally one complaining about not being able to open a file due to having run out of file handles.

So I quickly improved error logging and restarted the daemon in order to get a stacktrace leading to these tons of exceptions.

This quickly pointed to paperboy which was sending the file even if the request was a HEAD request. http.js in node checks for this and throws whenever you send a body when you should not. That exception lead then to paperboy never closing the file (have I already complained how incredibly difficult it is to do proper exception handling the moment continuations get involved? I think not and I also think it’s a good topic for another diary entry). With the help of lsof I’ve quickly seen that my suspicions were true: the node process serving tempalias had tons of open handles to public/index.html.

I sent a patch for this behavior to @felixge which was quickly applied, so that’s fixed now. I hope it’s of some use for other people too.

Now knowing that having a look at lsof here and then might be a good idea, quickly revealed another problem: While the file handles were gone, I’ve noticed tons and tons of SMTP sockets staying open in CLOSE_WAIT state. Not good as that too will lead to handle starvation sooner or later.

On a hunch, I found out that connecting to the SMTP daemon and then disconnecting, not sending QUIT to let the server disconnect was what was causing the lingering sockets. Clients disconnecting like that is very common in case the sender sends a 5xx response which is what the tempalias daemon was designed for.

So I had to fix that in my fork of the node smtp daemon (the original upstream isn’t interested in daemon functionality and the owner I forked the daemon for doesn’t respond to my pull requests. Hence I’m maintaining my own fork for now).

Futher looks at lsof prove that now we are quite stable in resource consumption: No lingering connections, no unclosed file handles.

But the error log was still filling up. This time something about removeListener needing a function. Thanks to the callstack I now had in my error log, I quickly hunted that one down and fixed it – that was a very stupid mistake. Thankfully, because the mails I usually deliver are small enough so that socket draining usually wasn’t required.

Onwards to the next issue filling the error log: «This deferred has already been resolved».

This comes from the Promise.js library if you emit*() multiple times on the same promise. This time, of course, the callstack was useless (… at <anonymous> – why, thank you), but I was very lucky again in that I tested from home and my mail relay didn’t trust my home IP address and thus denied relaying with a 500 which immediately led to the exception.

Now, this one is crazy: When you call .addErrback() on a Promise before calling addCallback(), your callback will be executed no matter if the errback was executed first.

Promise.js does some really interesting things to simulate polymorphism in JavaScript and I really didn’t want to fix up that library as lately, node.js itself seems go to a simpler continuation style using a callback parameter, so sooner or later, I’ll have to patch up the smtp server library anyways to remove Promise.js if I want to adhere to current node style.

So I took the workaround route by just using addCallback() before addErrback() even though the other order feels more natural to me. In addition, I reported an issue with the author as this is clearly unexpected behavior.

Now the error log is pretty much silent (minus some ECONNRESET exceptions due to clients sending RST packets in mid-transfer, but I think they are uncritical to resource consumption), so I hope the overall stability of the site has improved a bunch – I’d love not having to restart the daemon for more than a day :-)

Do spammers find pleasure in destroying fun stuff?

Recently, while reading through the log file of the mail relay used by tempalias, I noticed a disturbing trend: Apparently, SPAM was being sent through tempalias.

I’ve seen various behaviours. One was to strangely create an alias per second to the same target and then delivering email there.

While I completely fail to understand this scheme, the other one was even more disturbing: Bots were registering {max-usage: 1, days: null} aliases and then sending one mail to them – probably to get around RBL checks they’d hit when sending SPAM directly.

Aside of the fact that I do not want to be helping spammers, this also posed a technical issue: node.js head which I was running back when I developed the service tended to leak memory at times forcing me to restart the service here and then.

Now the additional huge load created by the bots forced me to do that way more often than I wanted to. Of course, the old code didn’t run on current node any more.

Hence I had to take tempalias down for maintenance.

A quick look at my commits on GitHub will show you what I have done:

the tempalias SMTP daemon now does RBL checks and immediately disconnects if the connected host is listed.
the tempalias HTTP daemon also does RBL checks on alias creation, but it doesn’t check the various DUL lists as the most likely alias creators are most certainly listed in a DUL
Per IP, aliases can only be generated every 30 seconds.

This should be some help. In addition, right now, the mail relay is configured to skip sender-checks and sa-exim scans (Spam Assassin on SMTP time as to reject spam before even accepting it into the system) for hosts where relaying is allowed. I intend to change that so that sa-exim and sender verify is done regardless if the connecting host is the tempalias proxy.

Looking at the mail log, I’ve seen the spam count drop to near-zero, so I’m happy, but I know that this is just a temporary victory. Spammers will find ways around the current protection and I’ll have to think of something else (I do have some options, but I don’t want to pre-announce them here for obvious reasons).

On a more happy note: During maintenance I also fixed a few issues with the Bookmarklet which should now do a better job at not coloring all text fields green eventually and at using the target site’s jQuery if available.

tempalias.com – bookmarklet work

While the user experience on tempalias.com is already really streamlined, compared to other services that encode the expiration settings and sometimes even the target) into the email address (and are thus exploitable and in some cases requiring you to have an account with them), it loses in that, when you have to register on some site, you will have to open the tempalias.com website in its own window and then manually create the alias.

Wouldn’t it be nice if this worked without having to visit the site?

This video is showing how I want this to work and how the bookmarklet branch on the github project page is already working:

http://vimeo.com/moogaloop.swf?clip_id=11193192&server=vimeo.com&show_title=1&show_byline=0&show_portrait=0&color=00ADEF&fullscreen=1

The workflow will be that you create your first (and probably only) alias manually. In the confirmation screen, you will be presented with a bookmarklet that you can drag to your bookmark bar and that will generate more aliases like the one just generated. This works independently of cookies or user accounts, so it would even work across browsers if you are synchronizing bookmarks between machines.

The actual bookmarklet is just a very small stub that will contain all the configuration for alias creation (so the actual bookmarklet will be the minified version of this file here). The bookmarklet, when executed will add a script tag to the page that actually does the heavy lifting.

The script that’s running in the video above tries really hard to be a good citizen as it’s run in the context of a third party webpage beyond my control:

it doesn’t pollute the global namespace. It has to add one function, window.$__tempalias_com, so it doesn’t reload all the script if you click the bookmark button multiple times.
while it depends on jQuery (I’m not doing this in pure DOM), it tries really hard to be a good citizen:
- if jQuery 1.4.2 is already used on the site, it uses that.
- if any other jQuery version is installed, it loads 1.4.2 but restores window.jQuery to what it was before.
- if no jQuery is installed, it loads 1.4.2
- In all cases, it calls jQuery.noConflict if $ is bound to anything.
All DOM manipulation uses really unique class names and event namespaces

While implementing, I noticed that you can’t unbind live events with just their name, so $().die(‘.ta’) didn’t work an I had to provide all events I’m live-binding to. I’m using live here because the bubbling up delegation model works better in a case where there might be many matching elements on any particular page.

Now the next step will be to add some design to the whole thing and then it can go live.

tempalias.com – debriefing

This is the last part of the development diary I was keeping about the creation of a new web service in node.js. You can read the previous installment here.

It’s done.

The layout is finished, the last edges too rough for pushing the thing live are smoothed. tempalias.com is live. After coming really close to finishing the thing yesterday (hence the lack of a posting here – I was too tired when I had to quit at 2:30am) last night, now I could complete the results page and add the needed finishing touches (like a really cool way of catching enter to proceed from the first to the last form field – my favorite hidden feature).

I guess it’s time for a little debriefing:

All in all, the project took a time span of 17 days to implement from start to finish. I did this after work and mostly during weekdays and sundays, so it’s actually 11 days in which work was going on (I also was sick two days). Each day I worked around 4 hours, so all in all, this took around 44 hours to implement.

A significant part of this time went into modifications of third party libraries, while I tried to contact the initial authors to get my changes merged upstream:

The author of node-smtp isn’t interested in the SMTP daemon functionality (that wasn’t there when I started and is now completed)
The author of redis-node-client didn’t like my patch, but we had a really fruitful discussion and node-redis-client got a lot better at handling dropped connection in the process.
The author of node-paperboy has merged my patch for a nasty issue and even tweeted about it (THANKS!)

Before I continue, I want to say a huge thanks to fictorial on github for the awesome discussion I was allowed to have with him about node-redis-client’s handling of dropped connections. I’ve enjoyed every word I was typing and reading.

But back to the project.

Non-third-party code consists of just 1624 lines of code (using wc -l, so not an accurate measurement). This doesn’t factor in the huge amount of changes I made to my fork of node-smtp the daemon part of which was basically non-existant.

Overall, the learnings I made:

git and github are awesome. I knew that beforehand, but this just cemented my opinion
node.js and friends are still in their infancy. While node removes previously published API on a nearly daily basis (it’s mostly bug-free though), none of the third-party libraries I am using were sufficiently bug-free to use them without change.
Asynchronous programming can be fun if you have closures at your disposal
Asynchronous programming can be difficult once the nesting gets deep enough
Making any variable not declared with var global is the worst design decision I have ever seen in my life especially in node where we are adding concurrency to the mix)
While it’s possible (and IMHO preferrable) to have a website done in just RESTful webservices and static/javascript frontend, sometimes just a tiny little bit of HTML generation could be useful. Still. Everything works without emitting even a single line of dynamically generated HTML code.
Node is crazy fast.

Also, I want to take the opportunity and say huge thanks to:

the guys behind node.js. I would have had to do this in PHP or even rails (which is even less fitting than PHP as it provides so much functionality around generating dynamic HTML and so little around pure JSON based web services) without you guys!
Richard for his awesome layout
fictorial for redis-node-client and for the awesome discussion I was having with him.
kennethkalmer for his work on node-smtp even though it was still incomplete – you lead me on the right tracks how to write an SMTP daemon. Thank you!
@felixge for node-paperboy – static file serving done right
The guys behind sammy – writing fully JS based AJAX apps has never been easier and more fun.

Thank you all!

The next step will be marketing: Seing this is built on node.js and an actually usable project – way beyond the usual little experiments, I hope to gather some interest in the Hacker community. Seing it also provides a real-world use, I’ll even go and try to submit news about the project on more general outlets. And of course on the Security Now! feedback page as this is inspired by their episode 242.

tempalias.com – learning CSS

This is one more episode in the development diary outlining the creation of a node.js based web service. You can read the previous installment here.

Today I could finally start with creating the HTML and CSS that will become the web frontend of the tempalias.com site. On Sunday, when I initially wanted to start, I was hindered by strangeness and overengineering of the express framework and yesterday it was general breakage in the redis client library for node.

But today I had no excuse and I started doing the HTML and CSS work with the intention of converting Richard’s awesome Photoshop designs into real-world HTML.

My main issue with this task: I plain don’t know CSS. Of course I know the syntax and how it should work in general, but there’s a huge difference between being able to read the syntax and writing basic code and actually being able to understand all the minor details and tricks that make it possible to achieve what you want in a reasonable time frame.

In contrast to real programming languages where you are usually developing for one target (sure – there might be plattform differences, but even nowaways, while learning, you can get away with restricting yourself to one plattform), HTML and CSS provide the additional difficulty that you have to develop for multiple moving targets, all of which containing different subtle bugs.

Combine that with the fact that more than basic CSS definitely isn’t part of my daily work and you’ll understand why I was struggling.

In the end I seem to have gotten into the thinking that’s needed to make elements appear in the general vicinity of where you suppose they should end up. I even got used to the IMHO very non-intuitive way of having margin and border be part of the elements dimensions in addition to their padding so all the pixel calculations fell into place and the whole thing looks more or less acceptable.

Until you begin changing the text size of course. But there’s so much manual pixel painting involved in the various backgrounds (gradient support isn’t quite there yet – even in browsers) that it’s probably impossible to create a really well-scaling layout anyways, so what I currently have is what I’m content with.

You want to have a peek?

I didn’t upload anything to the public site yet because there’s no functionality and I wouldn’t want to confuse users reaching the site by accident, so a screenshot will have to do. Or you clone my repository on github and run it yourself.

Here it is:

The really tricky thing and conversely the thing I’m really the most proud of is the alignment of both the spy and the reflection of the main page content. You witness some really creative margin- and background positioning at work there. Oh. And I just don’t want to know in what glorious ways the non-browser IE butchers this layout.

I. just. plain. don’t. care. This is supposed to be a FUNproject.

Tomorrow: Hooking in Sammy to add links to all the static pages.

It looks now as if we are going live this week :-)