tempalias.com – bookmarklet work

While the user experience on tempalias.com is already really streamlined compared to other services, which encode the expiration settings (and sometimes even the target) into the email address and are thus exploitable (some even require you to have an account with them), it still loses out in one respect: when you have to register on some site, you have to open the tempalias.com website in its own window and manually create the alias.

Wouldn’t it be nice if this worked without having to visit the site?

This video shows how I want this to work and how the bookmarklet branch on the github project page already works:

(Video: http://vimeo.com/11193192)

The workflow will be that you create your first (and probably only) alias manually. In the confirmation screen, you will be presented with a bookmarklet that you can drag to your bookmark bar and that will generate more aliases like the one just generated. This works independently of cookies or user accounts, so it would even work across browsers if you are synchronizing bookmarks between machines.

The actual bookmarklet is just a very small stub containing all the configuration for alias creation (the bookmarklet itself will be the minified version of this file here). When executed, it adds a script tag to the page, and that injected script does the heavy lifting.
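
A minimal sketch of what such a stub can look like (this is not the actual minified bookmarklet; the script URL is made up for illustration):

    javascript:(function () {
        // if the helper script is already loaded, just re-run it
        if (window.$__tempalias_com) {
            window.$__tempalias_com();
            return;
        }
        // otherwise, inject a script tag that does the heavy lifting
        var s = document.createElement('script');
        s.src = 'http://tempalias.com/bookmarklet.js';  // illustrative URL
        document.getElementsByTagName('head')[0].appendChild(s);
    }());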

The script that’s running in the video above tries really hard to be a good citizen as it’s run in the context of a third party webpage beyond my control:

  • it doesn’t pollute the global namespace. It has to add one function, window.$__tempalias_com, so the script isn’t reloaded if you click the bookmark button multiple times.
  • while it depends on jQuery (I’m not doing this in pure DOM), it handles that dependency very carefully (see the sketch after this list):
    • if jQuery 1.4.2 is already used on the site, it uses that.
    • if any other jQuery version is installed, it loads 1.4.2 but restores window.jQuery to what it was before.
    • if no jQuery is installed, it loads 1.4.2.
    • In all cases, it calls jQuery.noConflict if $ is bound to anything.
  • All DOM manipulation uses really unique class names and event namespaces.
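
Roughly, that loading policy amounts to the following sketch (not the actual tempalias source; the start() callback and the CDN URL are illustrative):

    (function () {
        function start($) {
            // all bookmarklet code uses this private reference only
        }
        var pre = window.jQuery;
        if (pre && pre.fn.jquery === '1.4.2') {
            start(pre);  // the site already uses the version we want
            return;
        }
        var s = document.createElement('script');
        s.src = 'http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js';
        s.onload = function () {
            var ours = window.jQuery.noConflict();  // hand $ back to the page
            if (pre) {
                window.jQuery = pre;  // restore the version that was there
            }
            start(ours);
        };
        document.getElementsByTagName('head')[0].appendChild(s);
    }());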

While implementing this, I noticed that you can’t unbind live events by just their namespace, so $().die(‘.ta’) didn’t work and I had to list all the events I’m live-binding to. I’m using live() here because its bubbling-based delegation model works better in cases where there might be many matching elements on any particular page.
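
To illustrate (the selector and event list are made up; ‘.ta’ stands for the namespace):

    // live-binding several events under the '.ta' namespace:
    $('a.ta-create').live('click.ta mouseover.ta', function () { /* ... */ });

    // this does NOT unbind them in jQuery 1.4.2:
    $('a.ta-create').die('.ta');

    // instead, every bound event type has to be listed explicitly:
    $('a.ta-create').die('click.ta mouseover.ta');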

Now the next step will be to add some design to the whole thing and then it can go live.

tempalias.com – debriefing

This is the last part of the development diary I was keeping about the creation of a new web service in node.js. You can read the previous installment here.

It’s done.

The layout is finished and the last edges too rough for pushing the thing live have been smoothed: tempalias.com is live. After coming really close to finishing last night (hence the lack of a posting here – I was too tired when I had to quit at 2:30am), today I could complete the results page and add the needed finishing touches (like a really cool way of catching enter to proceed from the first to the last form field – my favorite hidden feature).

I guess it’s time for a little debriefing:

All in all, the project took a span of 17 days to implement from start to finish. I did this after work, mostly on weekdays and Sundays, so work was actually going on during 11 of those days (I was also sick for two days). Each day I worked around 4 hours, so in total this took around 44 hours to implement.

A significant part of this time went into modifications of third party libraries, while I tried to contact the initial authors to get my changes merged upstream:

  • The author of node-smtp isn’t interested in the SMTP daemon functionality (which wasn’t there when I started and is now complete).
  • The author of redis-node-client didn’t like my patch, but we had a really fruitful discussion and redis-node-client got a lot better at handling dropped connections in the process.
  • The author of node-paperboy has merged my patch for a nasty issue and even tweeted about it (THANKS!)

Before I continue, I want to say a huge thanks to fictorial on github for the awesome discussion I was allowed to have with him about redis-node-client’s handling of dropped connections. I’ve enjoyed every word I was typing and reading.

But back to the project.

Non-third-party code consists of just 1624 lines of code (counted using wc -l, so not an accurate measurement). This doesn’t factor in the huge amount of changes I made to my fork of node-smtp, the daemon part of which was basically non-existent.

Overall, the lessons I learned:

  • git and github are awesome. I knew that beforehand, but this just cemented my opinion.
  • node.js and friends are still in their infancy. While node itself removes previously published APIs on a nearly daily basis (it’s mostly bug-free though), none of the third-party libraries I am using were sufficiently bug-free to use without changes.
  • Asynchronous programming can be fun if you have closures at your disposal.
  • Asynchronous programming can be difficult once the nesting gets deep enough.
  • Making any variable not declared with var global is the worst design decision I have ever seen in my life (especially in node, where we are adding concurrency to the mix; see the snippet after this list).
  • While it’s possible (and IMHO preferable) to have a website done in just RESTful webservices and a static/javascript frontend, sometimes just a tiny little bit of HTML generation could be useful. Still: everything works without emitting even a single line of dynamically generated HTML code.
  • Node is crazy fast.
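
To illustrate the point about implicit globals, here’s a made-up snippet (not code from tempalias):

    function handleRequest(req, res, items) {
        for (i = 0; i < items.length; i++) {  // oops: no var, so i is global
            // ... kick off asynchronous work per item ...
        }
    }
    // two requests being handled concurrently now share (and clobber)
    // the same global i, causing bugs that are very hard to track down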

Also, I want to take the opportunity and say huge thanks to:

  • the guys behind node.js. Without you, I would have had to do this in PHP or even Rails (which is even less fitting than PHP, as it provides so much functionality around generating dynamic HTML and so little around pure JSON-based web services)!
  • Richard for his awesome layout.
  • fictorial for redis-node-client and for the awesome discussion I had with him.
  • kennethkalmer for his work on node-smtp. Even though it was still incomplete, it led me onto the right track for writing an SMTP daemon. Thank you!
  • @felixge for node-paperboy – static file serving done right.
  • The guys behind sammy – writing fully JS-based AJAX apps has never been easier or more fun.

Thank you all!

The next step will be marketing: seeing that this is built on node.js and is an actually usable project (way beyond the usual little experiments), I hope to gather some interest in the Hacker community. Seeing that it also provides a real-world use, I’ll even go and try to submit news about the project to more general outlets. And of course to the Security Now! feedback page, as this was inspired by their episode 242.

tempalias.com – learning CSS

This is one more episode in the development diary outlining the creation of a node.js based web service. You can read the previous installment here.

Today I could finally start creating the HTML and CSS that will become the web frontend of the tempalias.com site. On Sunday, when I initially wanted to start, I was hindered by the strangeness and overengineering of the express framework, and yesterday it was general breakage in the redis client library for node.

But today I had no excuse and I started doing the HTML and CSS work with the intention of converting Richard’s awesome Photoshop designs into real-world HTML.

My main issue with this task: I plain don’t know CSS. Of course I know the syntax and how it should work in general, but there’s a huge difference between being able to read the syntax and writing basic code and actually being able to understand all the minor details and tricks that make it possible to achieve what you want in a reasonable time frame.

In contrast to real programming languages where you are usually developing for one target (sure, there might be platform differences, but even nowadays, while learning, you can get away with restricting yourself to one platform), HTML and CSS provide the additional difficulty that you have to develop for multiple moving targets, all of which contain different subtle bugs.

Combine that with the fact that more than basic CSS definitely isn’t part of my daily work and you’ll understand why I was struggling.

In the end, I seem to have gotten into the kind of thinking that’s needed to make elements appear in the general vicinity of where you suppose they should end up. I even got used to the IMHO very non-intuitive way margin and border count toward an element’s dimensions in addition to its padding, so all the pixel calculations fell into place and the whole thing looks more or less acceptable.

Until you begin changing the text size, of course. But there’s so much manual pixel painting involved in the various backgrounds (gradient support isn’t quite there yet, even in browsers) that it’s probably impossible to create a really well-scaling layout anyway, so what I currently have is what I’m content with.

You want to have a peek?

I haven’t uploaded anything to the public site yet because there’s no functionality and I wouldn’t want to confuse users reaching the site by accident, so a screenshot will have to do. Or you can clone my repository on github and run it yourself.

Here it is:

[Screenshot: the tempalias HTML running in Chrome]

The really tricky thing, and conversely the thing I’m really the most proud of, is the alignment of both the spy and the reflection of the main page content. You witness some really creative margin and background positioning at work there. Oh. And I just don’t want to know in what glorious ways the non-browser IE butchers this layout.

I. just. plain. don’t. care. This is supposed to be a FUN project.

Tomorrow: Hooking in Sammy to add links to all the static pages.

It looks now as if we are going live this week :-)

tempalias.com – rewrites

This is yet another installment in my series of posts about building a web service in node.js. The previous post is here.

Between the last post and the current trunk of tempalias, there lie two substantial rewrites of core components of the service. One thing is that I completely misused Object.create(), which takes an object to be the prototype of the object you are creating. I was of the wrong opinion that it works like Crockford’s object.create(), which creates a clone of the object you are passing.

Also, I learned that only Function objects actually have a prototype property.

Not knowing these two things made it impossible to actually deserialize the JSON representation of an alias that was previously stored in redis. This led to the first rewrite, this time of lib/tempalias.js. Aliases now work more like standard JS objects and need to be instantiated using the new operator; on the plus side, they work as expected now.
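
In code, the misunderstanding boils down to this (Alias is a stand-in name, not necessarily the real constructor):

    var proto = { describe: function () { return this.id; } };

    // ES5's Object.create() makes proto the new object's prototype;
    // it does not clone proto:
    var a = Object.create(proto);
    a.id = 'abc123';
    console.log(Object.getPrototypeOf(a) === proto);  // true

    // after the rewrite, aliases behave like standard JS objects:
    function Alias(id) { this.id = id; }
    Alias.prototype.describe = function () { return this.id; };

    // deserializing the JSON stored in redis now just means
    // instantiating with new and copying over the stored properties:
    var alias = new Alias(JSON.parse('{"id":"abc123"}').id);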

Speaking of serialization: I learned that in V8 (and Safari)

isNaN(Date.parse( (new Date()).toJSON() )) === true

which, according to the ES5 spec, is a bug. The spec states that Date.parse() should be able to parse a string created by Date.prototype.toISOString(), which is what toJSON uses.

This ended up with me doing an ugly hack (string replacement) and reporting a bug in Chrome (where the bug happens too).
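
The hack looks something like this (a sketch of the idea, not necessarily the exact tempalias code):

    // V8's Date.parse() at the time couldn't handle the ISO 8601 string
    // produced by toJSON (e.g. "2010-04-25T14:03:07.123Z"), so rewrite
    // it into a form the legacy parser understands:
    function parseJSONDate(str) {
        return new Date(Date.parse(
            str.replace(/^(\d{4})-(\d{2})-(\d{2})T(\d{2}:\d{2}:\d{2})(\.\d+)?Z$/,
                        '$1/$2/$3 $4 GMT')));
    }

    parseJSONDate((new Date()).toJSON());  // a usable Date object again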

Anyhow. I took Friday and Saturday off from the project, but today I was on it again. This time, I was looking into serving static content, as this is how we are going to serve the web site after all.

Express does provide a Static plugin, but it’s fairly limited in that it doesn’t do any client-side caching, which, even though Node.js is crazy fast, seems imperative to me. Also, while it allows you to configure the file system path to serve static content from, it insists on the static content’s URLs being /public/whatever, whereas I would much rather have kept the URL space together.

I tried to add If-Modified-Since support to express’ static plugin, but I hit some strange interaction in how express handles the HTTP request that caused some connections to never close – not what I want.

After two hours of investigating, I was looking at a different solution, which leads us to rewrite two:

tempalias trunk no longer depends on express. Instead, it serves the web service part of the URL space manually, and for all the static requests it uses node-paperboy. paperboy doesn’t try to turn node into Rails; it provides nothing but a simple static file handler for your web server, and it works completely inside node’s standard method of handling web requests.
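
The resulting setup looks roughly like this (a sketch: the route check, port and directory layout are made up, and handleApiRequest stands in for the web service code):

    var http = require('http'),
        path = require('path'),
        paperboy = require('paperboy');

    var WEB_ROOT = path.join(path.dirname(__filename), 'public');

    function handleApiRequest(req, res) {
        // ... dispatch to the JSON web service endpoints here ...
        res.writeHead(200, {'Content-Type': 'application/json'});
        res.end('{}');
    }

    http.createServer(function (req, res) {
        if (req.url.indexOf('/aliases') === 0) {
            handleApiRequest(req, res);            // the web service part
        } else {
            paperboy.deliver(WEB_ROOT, req, res);  // everything static
        }
    }).listen(8000);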

I much prefer this solution, because express was doing too much in some cases and too little in others: express tries to somewhat imitate Rails (or any other web framework) in that it provides not only request routing but also template rendering (in HAML and friends). It also abstracts away node’s HTTP server module, and it does so badly, as evidenced by this strange connection not-quite-closing problem.

On the other hand, it doesn’t provide any help if you want to write something that doesn’t return text/html.

Personally, if I’m doing a RESTful service anyway, I see no point in doing any server-side HTML generation. I’d much rather write a service that exposes an API at some URL endpoints and then a static page that uses JavaScript / AJAX to consume said API. This is where express provides next to no help at all.

So if the question is whether to have a huge dependency that fails at some key points and provides no help at others, or a smaller dependency that handles the stuff I’m not interested in but otherwise doesn’t interfere, I much prefer the latter.

This is why I went with this second rewrite.

Because I was already using a clean MVC separation (the “view” being the JSON I emit in the API – there’s no view in the traditional sense yet), the rewrite was quite hassle-free and basically nothing but syntax work.

After completing that, I felt like removing the known issues from my blog post about persistence: alias generation is now race-free, and the alias length is stored in redis too. The architecture can still be improved in that I’m currently doing two requests to redis per alias I’m creating (SETNX and SET). By moving stuff around a little bit, I can get away with just the SETNX.
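
The race-free part hinges on SETNX, roughly like this (a sketch following redis-node-client’s callback style; generateId() and the key naming are made up):

    function createAlias(client, data, cb) {
        var id = generateId();  // a random alias name of the configured length
        // SETNX succeeds only if the key doesn't exist yet, so two
        // concurrent requests can never claim the same alias:
        client.setnx('alias:' + id, JSON.stringify(data), function (err, created) {
            if (err) return cb(err);
            if (!created) return createAlias(client, data, cb);  // taken: retry
            cb(null, id);
        });
    }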

On the other hand, let me show you this picture here:

[Screenshot: ab running in a terminal]

Considering that the current solution is already creating 1546 aliases per second at a concurrency of 100 requests, I can probably get away without changing the alias creation code any more.

And in case you ask: The static content is served with 3000 requests per second – again with a concurrency of 100.

Node is fast.

Really.

Tomorrow: Philip learns CSS – I’m already dreading this final step to enlightenment: Creating the HTML/CSS front-end UI according to the awesome design provided by Richard.

No. It’s not «just» strings

On Hacker News, I came across this rant about strings in Ruby 1.9, where a developer was complaining about the new string handling in Ruby. Now, I’m not a Ruby developer by a long shot, but I am really interested in strings and string encoding, which is why I posted the following comment, which I reprint here as it’s too big to just be a comment:

Rants about strings and character sets that contain words of the following spirit are usually neither correct nor worthy of any further thought:

It’s a +String+ for crying out loud! What other language requires you to understand this
level of complexity just to work with strings?!

Clearly, the author lives in an ivory tower of English-language environments where he is able to use the word “just” right next to “strings”. He can probably also say that he “switched to UTF-8” without actually having done so, because the parts of UTF-8 he uses work exactly the same as the ASCII he used before.

But the rest of the world works differently.

Data can appear in all kinds of encodings and can be required to be in different other kinds of encodings. Some of those can be converted into each other, others can’t.

Some Japanese encodings, for example, can’t be converted to a Unicode representation (Ruby’s creator is Japanese).

Nowadays, as a programming language, you have three options for handling strings:

1) pretend they are bytes.

This is what older languages have done and what Ruby 1.8 does. It of course means that your application has to keep track of encodings: basically, for every string you keep in your application, you also need to keep track of what it is encoded in. When concatenating a string in encoding A to another string you already have in encoding B, you must do the conversion manually.

Additionally, because strings are bytes and the programming language doesn’t care about encoding, you basically can’t use any of the built-in string handling routines, because they assume that each byte represents one character.

Of course, if you are one of those lucky English UTF-8 users, getting data in ASCII and English text in UTF-8, you can easily “switch” your application to UTF-8 while still pretending strings are bytes because, well, they are. For all intents and purposes, your UTF-8 is just ASCII called UTF-8.

This is what the author of the linked post wanted.

2) use an internal unicode representation

This is what Python 3 does, and it is what I feel to be a very elegant solution if it works for you: a string is just a collection of Unicode code points. Strings don’t worry about encoding. String operations don’t worry about it. Only I/O worries about encoding. Whenever you get data from the outside, you need to know what encoding it is in and then decode it to turn it into a string. Conversely, whenever you want to actually output one of these strings, you need to know what encoding the data needs to be in and encode that sequence of Unicode code points accordingly.

You will never be able to convert a bunch of bytes into a string or vice versa without going through some explicit encoding/decoding.
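
To illustrate in JavaScript (this blog’s recent language of choice), Node.js expresses the same discipline with Buffers; the file names here are made up:

    var fs = require('fs');

    // I/O gives you bytes, not text:
    var bytes = fs.readFileSync('input.txt');  // a Buffer

    // decoding: bytes plus a known encoding yield a string
    var text = bytes.toString('latin1');

    // all string operations happen on the decoded text
    var upper = text.toUpperCase();

    // encoding: string plus target encoding yield bytes, only at the boundary
    fs.writeFileSync('output.txt', Buffer.from(upper, 'utf8'));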

This of course has some overhead associated with it, as you always have to do the encoding and decoding, and because operations on that internal collection of Unicode code points might be slower than the simple array-of-bytes approach, especially if you are using some kind of variable-length internal encoding (which you probably are, to save memory).

Interestingly, whenever you receive data in an encoding that cannot be represented with Unicode code points and you need to send out data in that same encoding, you are screwed.

This is a deficiency in the Unicode standard. Unicode was specifically made to be able to represent every encoding, but it turns out that it can’t correctly represent some Japanese encodings.

3) The third option is to store an encoding with each string and expose both the strings contents and the encoding to your users

This is what Ruby 1.9 does. It combines methods 1 and 2: it allows you to choose whatever internal encoding you need, it allows you to convert from one encoding to another, and it removes the need to externally keep track of every string’s encoding because it does that for you. It also makes sure that you don’t intermix encodings, but I’m getting ahead of myself.

You can still use the language’s string library functions, because they are aware of the encoding and usually do the right thing (minus, of course, bugs).

As this method is independent of the (broken?) Unicode standard, you never get into the situation where just reading data in some encoding makes you unable to write the same data back in the same encoding: in this case, you would just create a string tagged with the problematic encoding and do your stuff on that.

Nothing prevents the author of the linked post from using Ruby 1.9’s facilities to do exactly what Python 3 does (again ignoring the Unicode issue) by internally keeping all strings in, say, UTF-16 (you can’t keep strings in “Unicode” – Unicode is not an encoding – but that’s for another post). You would transcode all incoming and outgoing data to and from that encoding, and you would do all string operations on that application-internal representation.

A language throwing an exception when you concatenate a Latin-1 string and a UTF-8 string is a good thing! You see: once such a concatenation has happened by accident, it’s really hard to detect and fix.

At least it’s fixable, because not every Latin-1 string is also a valid UTF-8 string. But if you happen to concatenate, say, Latin-1 and Latin-8 by accident, you are really screwed: there’s no way to find out where Latin-1 ends and Latin-8 begins, as every valid Latin-1 string is also a valid Latin-8 string. Both are arrays of bytes with values between 0 and 255 (minus some holes).

In today’s small world, you want that exception to be thrown.

In conclusion, what I find really amazing about this complicated problem of character encoding is that nobody feels it’s complicated, because it usually just works – especially method 1 described above, which has been used constantly in years past and is also very convenient to work with.

Also, it still works.

Until your application leaves your country and gets used in countries where people don’t speak ASCII (or Latin-1). Then all these interesting problems arise.

Until then, you are annoyed by every method I described but method 1.

Then, you will understand what great service Python 3 has done for you and you’ll switch to Python 3 which has very clear rules and seems to work for you.

And then you’ll have to deal with the Japanese encoding problem, you’ll have to use binary bytes all over the place, and you’ll have to stop using strings altogether because just reading input data destroys it.

And then you might finally see the light and begin to care for the seemingly complicated method 3.


Google Buzz, Android and Google Apps Accounts

I was looking at the Google Maps application for Android, which now provides integrated Google Buzz support, showing buzzes directly on the map and allowing you to buzz (around where I live and work, there has been a tremendous uptake of Google Buzz, which makes this really compelling).

However, there’s a little peculiarity about the Android Maps application: if the main Google account configured on the phone (that’s the first one you configure) is a Google Apps account, Maps will use that one for Buzz support (apparently, there’s already some kind of infrastructure for inter-company buzzing in place). This means that you would only see buzzes from other people in your domain and, because there’s no official support for this out there, only if they are also using an Android phone.

“Mittelpraktisch” as I would say in German.

The obvious workaround is to make your private Gmail account the primary account (this is only possible by factory-resetting your device, by the way), but this has some disadvantages, mainly that the calendar on Android phones only supports syncing with the primary account, and as it happens, it’s usually the work calendar (the Apps one) you want synchronized, not the private one (which lingers unused in my case).

To work around this issue, share your work calendar with your private Google account.

Unfortunately, I couldn’t do that as I was posting this, because the default in the domain configuration is to not allow this. Thankfully, I’m that domain’s administrator, so I could change it (small company, remember), but it seems to take a while to propagate to the calendar account.

I’ll post more as my investigation turns out more, though it is my gut feeling that this mess will solve itself as Google fixes their Maps application to not use that phantom corporate buzz account.

How we use git

The following article was a comment I made on Hacker News, but as it’s quite big and as I want to keep my stuff in a central place, I’m hereby reposting it, adding a bit of formatting and shameless self-promotion (i.e. links):

My company is working on a – by now – quite large web application. Initially (2004), I began with CVS, then moved to SVN and, in the second half of last year, to git (after a one-year period of personal use of git-svn).

We deploy the application for our customers – sometimes to our own servers (both self-hosted and in the cloud) and sometimes to their machines.

Until the middle of last year, as a consequence of SVN’s really crappy handling of branches (it can branch, but it fails at merging), we did very incremental development, adding features on customer request and bug fixes as needed, often uploading specific fixes to different sites and committing them to trunk, but rarely ever updating existing installations to trunk, in order to keep them stable.

Huge mess.

With the switch to git, we also initiated real release management, doing one feature release every six months and keeping the released versions on strict maintenance (for all intents and purposes: the web application is highly customizable and we do make exceptions in the customized parts to react to immediate feature wishes of clients).

What we are doing git-wise is the reverse of what the article shows: bug fixes are (usually) done on the release branches, while all feature development (except for these customizations) is done on the main branch (we just use the git default name “master”).

We branch off of master when another release date nears and then tag a specific revision of that branch as the “official” release.

There is a central gitosis repository which contains what is the “official” repository, but every one of us (4 people working on this – so we’re small compared to other projects I guess) has their own gitorious clone which we heavily use for code-sharing and code review (“hey – look at this feature I’ve done here: Pull branch foobar from my gitorious repo to see…”).

With this strict policy of (for all intents and purposes) “fixes only” and especially “no schema changes”, we can even auto-update customer installations to the head of their respective release-branches which keeps their installations bug-free. This is a huge advantage over the mess we had before.

Now. As master develops and bug-fixes usually happen on the branch(es), how do we integrate them back into the mainline?

This is where the concept of the “Friday merge” comes in.

On Friday, my coworker or I usually merge all changes on the release branches upwards until they reach master. Because it’s only a week’s worth of code, conflicts rarely happen, and if they do, we still remember what the issue was.

If we do a commit on a branch that doesn’t make sense on master because master has sufficiently changed or a better fix for the problem is in master, then we mark these with [DONTMERGE] in the commit message and revert them as part of the merge commit.

On the other hand, in case we come across a bug during development on master and see how it would affect production systems badly (like a security flaw – not that they happen often), and if we have already devised a simple fix that is safe to apply to the branch(es), we fix it on master and then cherry-pick it onto the branches.

This concept of course heavily depends upon clean patches, which is another feature git excels at: Using features like interactive rebase and interactive add, we can actually create commits that

  • Either do whitespace or functional changes. Never both.
  • Only touch the lines absolutely necessary for the specific feature or bug.
  • Do one thing and one thing only.
  • Contain a very detailed commit message explaining exactly what the change encompasses.

This, in turn, allows me to create extremely clean (and exhaustive) changelogs and NEWS file entries.

Now some of these policies about commits were a bit painful to actually make everyone adhere to, but over time, I was able to convince everybody of the huge advantage clean commits provide even though it may take some time to get them into shape (also, you gain that time back once you have to do some blame-ing or other history digging).

Using branches with only bug fixes and auto-deploying them, we can increase the quality of customer installations, and using the concept of a “Friday merge”, we make sure all bug fixes end up in the development tree without each developer having to spend an awfully long time merging manually, and without ending up in merge hell where branches and master have diverged too much.

The addition of gitorious for easy exchange of half-baked features to make it easier to talk about code before it gets “official” helped to increase the code quality further.

git was a tremendous help with this and I would never in my life want to go back to the dark days.

I hope this additional insight might be helpful for somebody still thinking that SVN is probably enough.

Twisted Tornado

Lately, the net has been busy talking about the new web server released by FriendFeed last week and how it basically does the same thing as the Twisted framework, which has been around so much longer. One blog entry ends with:

Why Facebook/Friendfeed decided to create a new web server is completely beyond us.

Well. Let me add my two cents. Not from a Python perspective (I’m quite the Python newbie, having completed only one bigger project so far), but from a software development perspective. I feel qualified to add these cents because I’ve been there and done that.

When you start any project, you will be on the lookout for a framework or solution to base your work on. Often times, you already have some kind of idea of how you want to proceed and what the different requirements of your solution will be.

Of course, you’ll be comparing your requirements against the existing solutions, but chances are that none of them will match your requirements exactly, so you will be faced with changing them to match.

This involves not only the changes themselves but also other considerations:

  • is it even possible to change an existing solution to match your needs?
  • if the existing solution is an open source project, is there a chance of your changes being accepted upstream (this is not a given, by the way).
  • if not, are you willing to back- and forward-port your changes as new upstream versions get released? Or are you willing to stick with that version for eternity, manually back-porting security fixes?

and most importantly

  • what takes more time: writing a tailor-made solution from scratch, or learning how the closest-matching solution ticks to make it do what you want?

There is a very strong perception around that too many features mean bloat and that a simpler solution always trumps the complex one.

Have a look at articles like «Clojure 1, PHP 0», which compares a home-grown, tailor-made solution in one language to a complete framework in another, and which seems to favor the tailor-made solution because it was more performant and felt much easier to maintain.

The truth is, you can’t have it both ways:

Either you are willing to live with «bloat» and customize an existing solution, adding some features and not using others, or you are unwilling to accept any bloat and you will do a tailor-made solution that may be lacking in features, may reimplement other features of existing solutions, but will contain exactly the features you want. Thus it will not be «bloated».

FriendFeed decided to go the tailor-made route, but unlike the many other projects that go the tailor-made route each day (take Django’s reimplementation of many existing Python technologies like templating and ORM as another example) and keep the result internal, they actually went public.

Not with the intention of bad-mouthing Twisted (though it kind of sounded that way due to a bad choice of words), but with the intention of telling us: «Hey – here’s the tailor-made implementation we used to solve our problem – maybe it, or parts of it, are useful to you, so go ahead and have a look».

Instead of complaining that reimplementation and a bit of NIH was going on, the community could embrace the offering and try to pick the interesting parts they see fitting for their implementation(s).

This kind of reinventing the wheel is a standard process that goes on all the time, both in the Free Software world and in the commercial software world. There’s no reason to be concerned or alarmed. Instead, we should be thankful for the groups that actually manage to put their code out for us to see – in so many cases, we never get a chance to see it and thus lose a chance at making our own solutions better.

Snow Leopard and PHP

Earlier versions of Mac OS X always had pretty outdated versions of PHP in their default installation, so what you usually did was to go to entropy.ch and fetch the packages provided there.

Now, after updating to Snow Leopard you’ll notice that the entropy configuration has been removed and once you add it back in, you’ll see Apache segfaulting and some missing symbol errors.

Entropy has not updated the packages for Snow Leopard yet, so you could have a look at the PHP that comes with stock Snow Leopard. This time it’s even bleeding edge: Snow Leopard ships with PHP 5.3.0.

Unfortunately though, some vital extensions are missing, most notably for me the PostgreSQL extension.

This time around though, Snow Leopard comes with a functioning PHP development toolset, so there’s nothing stopping you from building it yourself. Here’s how to get the official PostgreSQL extension working with Snow Leopard’s stock PHP:

  1. Make sure that you have installed the current Xcode Tools. You’ll need a working compiler for this.
  2. Make sure that you have installed PostgreSQL and know where it is on your machine. In my case, I’ve used the one-click installer from EnterpriseDB (which survived the update to 10.6).
  3. Now that Snow Leopard uses a full 64-bit userspace, we’ll have to make sure that the PostgreSQL client library is available as a 64-bit binary – or even better, as a universal binary. Unfortunately, that’s not the case with the one-click installer, so we’ll have to fix that first:
    1. Download the sources of the PostgreSQL version you have installed from postgresql.org
    2. Open a terminal and use the following commands:
      % tar xjf postgresql-[version].tar.bz2
      % cd postgresql-[version]
      % CFLAGS="-arch i386 -arch x86_64" ./configure --prefix=/usr/local/mypostgres
      % make

      make will fail sooner or later because the postgres build scripts can’t handle building a universal binary server, but the compile will progress far enough for us to build libpq. Let’s do that:

      % make -C src/interfaces
      % sudo make -C src/interfaces install
      % make -C src/include
      % sudo make -C src/include install
      % make -C src/bin
      % sudo make -C src/bin install
  4. Download the php 5.3.0 source code from their website. I used the bzipped version.
  5. Open your Terminal and cd to the location of the download. Then use the following commands:
    % tar -xjf php-5.3.0.tar.bz2
    % cd php-5.3.0/ext/pgsql
    % phpize
    % ./configure --with-pgsql=/usr/local/mypostgres
    % make -j8 # in case of one of these nice 8 core macs :p
    % sudo make install
    % cd /etc
    % cp php.ini-default php.ini
  6. Now edit your new php.ini and add the line extension=pgsql.so

And that’s it. Restart Apache (using apachectl or the System Preferences) and you’ll have PostgreSQL support.

All in all, this is a tedious process, and it’s the price we early adopters constantly have to pay.

If you want an honest recommendation on how to run PHP with PostgreSQL support on Snow Leopard, I’d say: Don’t. Wait for the various 3rd party packages to get updated.

OpenStreetMap

The last episode of FLOSS Weekly consisted of an interview with Steve Coast of OpenStreetMap. I knew about the project, but I was under the impression that it was in its infancy, both content-wise and from a technical perspective.

During the interview I learned that it’s surprisingly complete (unless, of course, you need a map of Canada, it seems) and highly advanced from a technical point of view.

But what’s really interesting is how terribly easy it is to contribute. For smaller edits, you just click the edit link and use the Flash editor to paint a road or give it a name. If you need or want to do more, there’s a really easy-to-use Java-based editor:

First you drag a rectangle onto a pre-rendered version of the map, which causes the server to send you the vector data for that area, and then you can edit whatever you want.

If you have them, you can import traces from a GPS logger to help you add roads and paths, and when you are finished, you press a button and the changes get uploaded and become visible to the public a few minutes later (though one modification I made took about an hour to arrive on the web).

When the same nodes were updated in the meantime, a really nice conflict resolution assistant helps you resolve the conflicts.

For me personally, this has the potential to become my new after-work time sink, as it combines quite a few passions of mine:

  • The GPS tracking, importing and painting of maps is pure technology fun.
  • Actually being outside to generate the traces is healthy and also a lot of fun.
  • Maps are a passion of mine too. I love to look at maps and I love to compare them to my mental image of the places they show.

And besides all that, OpenStreetMap is complete enough to be of real use. For biking or hiking, it even trumps Google Maps by a wide margin.

Still, at least near where I live, there are many small issues that can easily be fixed.

As the different editors are really easy to use, fixing these issues is a lot of fun and I’m totally seeing myself cleaning out all small mistakes I come across or even adding stuff that’s missing. After all, this also provides me with a very good reason to visit the places where I grew up to complete some parts.

The whole concept behind being able to update a map by just a couple of mouse clicks is very compelling too as it finally gives us the potential to have really accurate maps in a very timely fashion. For example: Last October, one of the roads near my house closed and just recently the tracks of the Forchbahn were moved a bit.

Just today I added these changes to OpenStreetMap, and now OSM is the only publicly available map that correctly shows the traffic situation. And all that with 15 minutes of easy but interesting work.

For those interested, my Open Street Map user profile is, of course, pilif.