gnegg: January 2010

the following article was a comment I made on Hacker News, but as it’s quite big and as I want to keep my stuff at a central place, I’m hereby reposting it and adding a bit of formating and shameless self-promotion (i.e. links):

My company is working on a – by now – quite large web application. Initially (2004), I began with CVS and then moved to SVN and in the second half of last year, to git (after a one-year period of personal use of git-svn).

We deploy the application for our customers – sometimes to our own servers (both self-hosted and in the cloud) and sometimes to their machines.

Until middle year, as a consequence of SVN’s really crappy handling of branches (it can branch, but it fails at merging), we did very incremental development, adding features on customer requests and bugfixes as needed, often times uploading specific fixes to different sites, committing them to trunk, but rarely ever updating existing applications to trunk to keep them stable.

Huge mess.

With the switch to git, we also initiated a real release management, doing one feature release every six months and keeping the released versions on strict maintenance (for all intents and purposes – the web application is highly customizable and we do make exceptions in the customized parts as to react to immediate feature-wishes of clients).

What we are doing git-wise is the reverse of what the article shows: Bug-fixes are (usually) done on the release-branches, while all feature development (except of these customizations) is done on the main branch (we just use the git default name “master”).

We branch off of master when another release date nears and then tag a specific revision of that branch as the “official” release.

There is a central gitosis repository which contains what is the “official” repository, but every one of us (4 people working on this – so we’re small compared to other projects I guess) has their own gitorious clone which we heavily use for code-sharing and code review (“hey – look at this feature I’ve done here: Pull branch foobar from my gitorious repo to see…”).

With this strict policy of (for all intents and purposes) “fixes only” and especially “no schema changes”, we can even auto-update customer installations to the head of their respective release-branches which keeps their installations bug-free. This is a huge advantage over the mess we had before.

Now. As master develops and bug-fixes usually happen on the branch(es), how do we integrate them back into the mainline?

This is where the concept of the “Friday merge” comes in.

On Friday, my coworker or I usually merge all changes in the release-branches upwards until they reach master. Because it’s only a week worth of code, conflicts rarely happen and if they do, we remember what the issue was.

If we do a commit on a branch that doesn’t make sense on master because master has sufficiently changed or a better fix for the problem is in master, then we mark these with [DONTMERGE] in the commit message and revert them as part of the merge commit.

On the other hand, in case we come across a bug during development on master and we see how it would affect production systems badly (like a security flaw – not that they happen often) and if we have already devised a simple fix that is save to apply to the branch(es), we fix those on master and then cherry-pick them on the branches.

This concept of course heavily depends upon clean patches, which is another feature git excels at: Using features like interactive rebase and interactive add, we can actually create commits that

Either do whitespace or functional changes. Never both.
Only touch the lines absolutely necessary for any specific feature or bug
Do one thing and only one.
Contain a very detailed commit message explaining exactly what the change encompasses.

This on the other hand, allows me to create extremely clean (and exhaustive) change logs and NEWS file entries.

Now some of these policies about commits were a bit painful to actually make everyone adhere to, but over time, I was able to convince everybody of the huge advantage clean commits provide even though it may take some time to get them into shape (also, you gain that time back once you have to do some blame-ing or other history digging).

Using branches with only bug-fixes and auto-deploying them, we can increase the quality of customer installations and using the concept of a “Friday merge”, we make sure all bug-fixes end up in the development tree without each developer having to spend an awful long time to manually merge or without ending up in merge-hell where branches and master have diverged too much.

The addition of gitorious for easy exchange of half-baked features to make it easier to talk about code before it gets “official” helped to increase the code quality further.

git was a tremendous help with this and I would never in my life want to go back to the dark days.

I hope this additional insight might be helpful for somebody still thinking that SVN is probably enough.

I guess it’s inevitable. Good ideas may fail. And good ideas may be years ahead of their time. And of course, sometimes, people just don’t listen.

But one never stops learning.

In the year 2000, I took part in a plan of a couple of guys to become the next Yahoo (Google wasn’t quite there yet back then), or, to use the words we used on the site,

For these reasons, we have designed an online environment that offers a truly new way for people to store, manage and share their favourite online resources and enables them to engage in long-lasting relationships of collaboration and trust with other users.

The idea behind the project, called linktrail, was basically what would much later on be picked up by the likes of twitter, facebook (to some extent) and the various community based news sites.

The whole thing went down the drain, but the good thing is that I was able to legally salvage the source code, the install it on a personal server of mine and to publish the source code. And now that so many years have passed, it’s probably time to tell the world about this, which is why I have decided to start this little series about the project. What is it? How was it made? And most importantly: Why did it fail? And concequently: What could we have done better?

But let’s first start with the basics.

As I said, I was able to legally acquire the database and code (which is mostly written by me anyways) and to install the site on a server of mine, so let’s get that out to start with. The site is available at linktrail.pilif.ch. What you see running there is the result of 6 months of programming by myself after a concept done by the guys I’ve worked with to create this.

What is linktrail?

If the tour we made back then is any good, then just taking it would probably be enough, but let me phrase in my words: The site is a collection of so called trails which in turn are small units, comparable to blogs, consisting of links, titles and descriptions. These micro-blogs are shown in a popup window (that’s what we had back then) beside the browser window to allow quick navigation between the different links in the trail.

Trails are made by users, either by each user on their own or as a collaborative work between multiple users. The owner of a trail can hand out permissions to everybody or their friends (using a system quite similar to what we currently see on facebook for example)

A trail is placed in a directory of trails which was built around the directory structures we used back then, though by now, we would probably do this much more different. Users can subscribe to trails they are interested in. In that case, they will be notified if a trail they are subscribed to is updated either by the owner or anybody else with the rights to update the trail.

Every user (called expert in the site’s terms) has their profile page (here’s mine) that lists the trails they created and the ones they are subscribed to.

The idea was for you as an user to find others with similar interests and form a community around those interests to collaborate on trails. An in-site messaging-system helped users to communicate with each other: Aside of just sending plain text messages, it’s possible to recommend trails (for easy one-click subscription) .

linktrail was my first real programming project, basically 6 months after graduating in what the US would call high school. Combine that fact with the fact that it was created during the high times of the browser wars (year 2000, remember) with web standards basically non-existing, then you can imagine what a mess is running behind the scenes.

Still, the site works fine within those constraints.

In future posts, I will talk about the history of the project, about the technology behind the site, about special features and, of course, about why this all failed and what I would do differently – both in matters of code and organization.

If I woke your interest, feel free to have a look at the code of the site which I just now converted from CVS (I started using CVS about 4 months into development, so the first commit is HUGE) to SVN to git and put it up on github for public consumption. It’s licensed under a BSD license, but I doubt that you’d find anything in this mess of PHP3(!) code (though it runs unchanged(!) on PHP5 – topic of another post I guess), HTML 3.2(!) tag soup and java-script hacks.

Oh and if you can read german, I have also converted the CVS repository that contained the concept papers that were written over the time.

In preparation of this series of blog-posts, I have already made some changes to the code base (available at github):

login after register now works
warning about unencrypted(!) passwords in the registration form
registering requires you to solve a reCAPTCHA.

Month: January 2010

How we use git

linktrail – a failed startup – introduction