How we use git

the following article was a comment I made on Hacker News, but as it’s quite big and as I want to keep my stuff at a central place, I’m hereby reposting it and adding a bit of formating and shameless self-promotion (i.e. links):

My company is working on a – by now – quite large web application. Initially (2004), I began with CVS and then moved to SVN and in the second half of last year, to git (after a one-year period of personal use of git-svn).

We deploy the application for our customers – sometimes to our own servers (both self-hosted and in the cloud) and sometimes to their machines.

Until middle year, as a consequence of SVN’s really crappy handling of branches (it can branch, but it fails at merging), we did very incremental development, adding features on customer requests and bugfixes as needed, often times uploading specific fixes to different sites, committing them to trunk, but rarely ever updating existing applications to trunk to keep them stable.

Huge mess.

With the switch to git, we also initiated a real release management, doing one feature release every six months and keeping the released versions on strict maintenance (for all intents and purposes – the web application is highly customizable and we do make exceptions in the customized parts as to react to immediate feature-wishes of clients).

What we are doing git-wise is the reverse of what the article shows: Bug-fixes are (usually) done on the release-branches, while all feature development (except of these customizations) is done on the main branch (we just use the git default name “master”).

We branch off of master when another release date nears and then tag a specific revision of that branch as the “official” release.

There is a central gitosis repository which contains what is the “official” repository, but every one of us (4 people working on this – so we’re small compared to other projects I guess) has their own gitorious clone which we heavily use for code-sharing and code review (“hey – look at this feature I’ve done here: Pull branch foobar from my gitorious repo to see…”).

With this strict policy of (for all intents and purposes) “fixes only” and especially “no schema changes”, we can even auto-update customer installations to the head of their respective release-branches which keeps their installations bug-free. This is a huge advantage over the mess we had before.

Now. As master develops and bug-fixes usually happen on the branch(es), how do we integrate them back into the mainline?

This is where the concept of the “Friday merge” comes in.

On Friday, my coworker or I usually merge all changes in the release-branches upwards until they reach master. Because it’s only a week worth of code, conflicts rarely happen and if they do, we remember what the issue was.

If we do a commit on a branch that doesn’t make sense on master because master has sufficiently changed or a better fix for the problem is in master, then we mark these with [DONTMERGE] in the commit message and revert them as part of the merge commit.

On the other hand, in case we come across a bug during development on master and we see how it would affect production systems badly (like a security flaw – not that they happen often) and if we have already devised a simple fix that is save to apply to the branch(es), we fix those on master and then cherry-pick them on the branches.

This concept of course heavily depends upon clean patches, which is another feature git excels at: Using features like interactive rebase and interactive add, we can actually create commits that

  • Either do whitespace or functional changes. Never both.
  • Only touch the lines absolutely necessary for any specific feature or bug
  • Do one thing and only one.
  • Contain a very detailed commit message explaining exactly what the change encompasses.

This on the other hand, allows me to create extremely clean (and exhaustive) change logs and NEWS file entries.

Now some of these policies about commits were a bit painful to actually make everyone adhere to, but over time, I was able to convince everybody of the huge advantage clean commits provide even though it may take some time to get them into shape (also, you gain that time back once you have to do some blame-ing or other history digging).

Using branches with only bug-fixes and auto-deploying them, we can increase the quality of customer installations and using the concept of a “Friday merge”, we make sure all bug-fixes end up in the development tree without each developer having to spend an awful long time to manually merge or without ending up in merge-hell where branches and master have diverged too much.

The addition of gitorious for easy exchange of half-baked features to make it easier to talk about code before it gets “official” helped to increase the code quality further.

git was a tremendous help with this and I would never in my life want to go back to the dark days.

I hope this additional insight might be helpful for somebody still thinking that SVN is probably enough.

All-time favourite tools – update

It has been more than four years since I’ve last talked about my all-time favourite tools. I guess it’s time for an update.

Surprisingly, I still stand behind the tools listed there: My love for Exim is still un-changed (it just got bigger lately – but that’s for another post). PostgreSQL is cooler than ever and powers PopScan day-in, day-out without flaws.

Finally, I’m still using InnoSetup for my Windows Setup programs, though that has lost a bit of importance in my daily work as we’re shifting more and more to the web.

Still. There are two more tools I must add to the list:

  • jQuery is a JavaScript helper libary that allows you to interact with the DOM of any webpage, hiding away browser incompatibilities. There are a couple of libraries out there which do the same thing, but only jQuery is such a pleasure to work with: It works flawlessly, provides one of the most beautiful APIs I’ve ever seen in any library and there are tons and tons of self-contained plug-ins out there that help you do whatever you would want to on a web page.
    jQuery is an integral part of making web applications equivalent to their desktop counterparts in matters of user interface fluidity and interactivity.
    All while being such a nice API that I’m actually looking forward to do the UI work – as opposed to the earlier days which can most accurately be described as UI sucks.
  • git is my version control system of choice. There are many of them out there in the world and I’ve tried the majority of them for one thing or another. But only git combines the awesome backwards-compatibility to what I’ve used before and what’s still in use by my coworkers (SVN) with abilities to beautify commits, have feature branches, very high speed of execution and very easy sharing of patches.
    No single day passes without me using git and running into a situation where I’m reminded of the incredible beauty that is git.

In four years, I’ve not seen one more other tool I’ve as consistenly used with as much joy as git and jQuery, so those two certainly have earned their spot in my heart.

My new friend: git rebase -i

Last summer, I was into making git commits look nice with the intent of pushing a really nice and consistent set of patches to the remote repository.

The idea is that a clean remote history is a convenience for my fellow developers and for myself. A clean history means very well-defined patches – should a merge of a branch be neccesary in the future. It also means much easier hunting for regressions and generally more fun doing some archeology in the code.

My last post was about using git add -i to refine the commits going into the repository. But what if you screw up the commit anyways? What if you forget to add a new file and notice it only some commits later?

This is where git rebase -i comes into play as this allows you to reorder your local commits and to selectively squash multiple commits into one.

Let’s see how we would add a forgotten file to a commit a couple of commits ago.

  1. You add the forgotten file and commit it. The commit message doesn’t really matter here.
  2. You use git log or gitk to find the commit id before the one you want to amend this new file to. Let’s say it’s 6bd80e12707c9b51c5f552cdba042b7d78ea2824
  3. Pick the first few characters (or the whole ID) and pass them to git rebase -i.
 % git rebase -i 6bd80e12

git will now open your favorite editor displaying your list of commits since the revision you have given. This could look like this.

pick 6bd80e1 some commit message. This is where I have forgotten the file
pick 4c1d210 one more commit message
pick 5d2f4ed this is my forgotten file

# Rebase fc9a0c6..5d2f4ed onto fc9a0c6
#
# Commands:
#  p, pick = use commit
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#

The comment in the file says it all – just reorder the first three (or how many there are in your case) to look like this:

pick 6bd80e1 some commit message. This is where I have forgotten the file
squash 5d2f4ed this is my forgotten file
pick 4c1d210 one more commit message

Save the file. Git will now do some magic and open the text editor again where you can amend the commit message for the commit you squashed your file into. If it’s really just a forgotten file, you’ll probably keep the message the same.

One word of caution though: Do not do this on branches you have already pushed to a remote machine or otherwise shared with somebody else. git gets badly confused if it has to pull altered history.

Isn’t it nice that after moths you still find new awesomeness in your tool of choice?

I guess I’ll have to update my all-time favorite tools list. It’s from 2004, so it’s probably ripe for that update.

Git rules.

Beautifying commits with git

When you look at our Subversion log, you’d often see revisions containing multiple topics, which is something I don’t particularly like. The main problem is merging patches. The moment you cramp up multiple things into a single commit, you are doomed should you ever decide to merge one of the things into another branch.

Because it’s such an easy thing to do, I began committing really, really often in git, but whenever I was writing the changes back to subversion, I’ve used merge –squash as to not clutter the main revision history with abandoned attempts at fixing a problem or implementing a feature.

So in essence, this meant that by using git, I was working against my usual goals: The actual commits to SVN were larger than before which is the exact opposite thing of how I’d want the repository to look.

I’ve lived with that, until I learned about the interactive mode of git add.

Beginners with git (at least those coming from subversion and friends) always struggle to really get the concept of the index and usually just git commit –a when committing changes.

This does exactly the same thing as a svn commit would do: It takes all changes you made to your working copy and commits them to the repository. This also means that the smallest unit of change you can track is the state of the working copy itself.

To do finer grained commits, you can git add a file and commit that, which is the same as svn status followed by some grep and awk magic.

But even a file is too large a unit for a commit if you ask me. When you implement feature X, it’s possible if not very probable, that you fix bugs a and b and extend the interface I to make the feature Y work – a feature on which X depends.

Bugfixes, interface changes, subfeatures. A git commit –a will mash them all together. A git add per file will mash some of them together. Unless you are really really careful and cleanly only do one thing at a time, but in the end that’s now how reality works.

It may very well be that you discover bug b after having written a good amount of code for feature Y and that both Y and b are in the same file. Now you have to either back out b again, commit Y and reapply b or you just commit Y and b in one go, making it very hard to later just merge b into a maintenance branch because you’d also get Y which you would not want to.

But backing out already written code to make a commit? This is not a productive workflow. I could never make myself do something like that, let alone my coworkers. Aside of that, it’s yet another cause to create errors.

This is where the git index shines. Git tracks content. The index is a stage area where you store content you whish to later commit to the repository. Content isn’t bound to a file. It’s just content. By help of the index, you can incrementally collect single changes in different files, assemble them to a complete package and commit that to the repository.

As the index is tracking content and not files, you can add parts of files to it. This solves the problems outlined above.

So once I have completed Feature X, and assuming I could do it in one quick go, then I run git add with the –i argument. Now I see a list of changed files in my working copy. Using the patch-command, I can decide, hunk per hunk, whether it should be included in the index or not. Once I’m done, I exit the tool using 7. Then I run git commit1) to commit all the changes I’ve put into the index.

Remember: This is not done per file, but per line in the file. This way I can separate all the changes in my working copy, bug a and b, feature Y and X into single commits and commit them separately.

With a clean history like that, I can consequently merge the feature branch without —squash, thus keeping the history when dcommiting to subversion, finally producing something that can easily be merged around and tracked.

This is yet another feature of git that, after you get used to it, makes this VCS shine even more than everything else I’ve seen so far.

Git is fun indeed.

1) and not git commit -a which would destroy all the fine-grained plucking of lines you just did – trust me: I know. Now.

Converting ogg streams into mp3 ones

This is just an announcement for my newest quick-hack which can be used to on-the-fly convert streams from webradios which use the ogg/vorbis format into the mp3 format which is more widely supported by the various devices out there.

I have created an own dedicated page for the project for those who are interested.

Also, I really got to like github.com, not as the commercial service they intend to be (I’ve already written about the stupidity of hosting your company trade secrets at a company in a foreign country with foreign legislation), but as a place to quickly and easily dump some code you want to be publically available without going through all the hassle otherwise associated with public project hosting.

This is why this little script is hosted there and not here. As I’m using git, even if github goes away, I still have the full repository around to either self-host or let someone else host for me, which is a crucial requirement for me to outsource anything.

git branch in ZSH prompt

Screenshot of the terminal showing the current git branch

Today, I came across a little trick on how to output the current git branch on your bash prompt. This is very useful, but not as much for me as I’m using ZSH. Of course, I wanted to adapt the method (and to use fewer backslashes :-) ).

Also, in my setup, I’m making use of ZSH’s prompt themes feature of which I’ve chosen the theme “adam1”. So let’s use that as a starting point.

  1. First, create a copy of the prompt theme into a directory of your control where you intend to store private ZSH functions (~/zshfuncs in my case).
    cp /usr/share/zsh/4.3.4/functions/prompt_adam1_setup ~/zshfuncs/prompt_pilif_setup
  2. Tweak the file. I’ve adapted the prompt from the original article, but I’ve managed to get rid of all the backslashes (to actually make the regex readable) and to place it nicely in the adam1 prompt framework.
  3. Advise ZSH about the new ZSH function directory (if you haven’t already done so).
    fpath=(~/zshfunc $fpath)
  4. Load your new prompt theme.
    prompt pilif

And here’s the adapted adam1 prompt theme:

# pilif prompt theme

prompt_pilif_help () {
  cat <<'EOF'
This prompt is color-scheme-able.  You can invoke it thus:

  prompt pilif [<color1> [<color2> [<color3>]]]

This is heavily based on adam1 which is distributed with ZSH. In fact,
the only change from adam1 is support for displaying the current branch
of your git repository (if you are in one)
EOF
}

prompt_pilif_setup () {
  prompt_adam1_color1=${1:-'blue'}
  prompt_adam1_color2=${2:-'cyan'}
  prompt_adam1_color3=${3:-'green'}

  base_prompt="%{$bg_no_bold[$prompt_adam1_color1]%}%n@%m%{$reset_color%} "
  post_prompt="%{$reset_color%}"

  base_prompt_no_color=$(echo "$base_prompt" | perl -pe "s/%{.*?%}//g")
  post_prompt_no_color=$(echo "$post_prompt" | perl -pe "s/%{.*?%}//g")

  precmd  () { prompt_pilif_precmd }
  preexec () { }
}

prompt_pilif_precmd () {
  setopt noxtrace localoptions
  local base_prompt_expanded_no_color base_prompt_etc
  local prompt_length space_left
  local git_branch

  git_branch=`git branch 2>/dev/null | grep -e '^*' | sed -E 's/^* (.+)$/(1) /'`
  base_prompt_expanded_no_color=$(print -P "$base_prompt_no_color")
  base_prompt_etc=$(print -P "$base_prompt%(4~|...|)%3~")
  prompt_length=${#base_prompt_etc}
  if [[ $prompt_length -lt 40 ]]; then
    path_prompt="%{$fg_bold[$prompt_adam1_color2]%}%(4~|...|)%3~%{$fg_bold[white]%}$git_branch"
  else
    space_left=$(( $COLUMNS - $#base_prompt_expanded_no_color - 2 ))
    path_prompt="%{$fg_bold[$prompt_adam1_color3]%}%${space_left}<...<%~ %{$reset_color%}$git_branch%{$fg_bold[$prompt_adam1_color3]%} $prompt_newline%{$fg_bold_white%}"
  fi

  PS1="$base_prompt$path_prompt %# $post_prompt"
  PS2="$base_prompt$path_prompt %_&gt; $post_prompt"
  PS3="$base_prompt$path_prompt ?# $post_prompt"
}

prompt_pilif_setup "$@"

The theme file can be downloaded here

Impressed by git

The company I’m working with is a Subversion shop. It has been for a long time – since fall of 2004 actually where I finally decided that the time for CVS is over and that I was going to move to subversion. As I was the only developer back then and as the whole infrastructure mainly consisted of CVS and ViewVC (cvsweb back then), this move was an easy one.

Now, we are a team of three developers, heavy trac users and truly dependant on Subversion which is – mainly due to the amount of infrastructure that we built around it – not going away anytime soon.

But none the less: We (mainly I) were feeling the shortcomings of subversion:

  • Branching is not something you do easily. I tried working with branches before, but merging them really hurt, thus making it somewhat prohibitive to branch often.
  • Sometimes, half-finished stuff ends up in the repository. This is unavoidable considering the option of having a bucket load of uncommitted changes in the working copy.
  • Code review is difficult as actually trying out patches is a real pain to do due to the process of sending, applying and reverting patches being a manual kind of work.
  • A pet-peeve of mine though is untested, experimental features developed out of sheer interest. Stuff like that lies in the working copy, waiting to be reviewed or even just having its real-life use discussed. Sooner or later, a needed change must go in and you have the two options of either sneaking in the change (bad), manually diffing out the change (hard to do sometimes) or just forget it and svn revert it (a real shame).

Ever since the Linux kernel first began using Bitkeeper to track development, I knew that there is no technical reason for these problems. I knew that a solution for all this existed and that I just wasn’t ready to try it.

Last weekend, I finally had a look at the different distributed revision control systems out there. Due to the insane amount of infrastructure built around Subversion and not to scare off my team members, I wanted something that integrated into subversion, using that repository as the official place where official code ends up while still giving us the freedom to fix all the problems listed above.

I had a closer look at both Mercurial and git, though in the end, the nicely working SVN integration of git was what made me have a closer look at that.

Contrary to what everyone is saying, I have no problem with the interface of the tool – once you learn the terminology of stuff, it’s quite easy to get used to the system. So far, I did a lot of testing with both live repositories and test repositories – everything working out very nicely. I’ve already seen the impressive branch merging abilities of git (to think that in subversion you actually have to a) find out at which revision a branch was created and to b) remember every patch you cherry-picked…. crazy) and I’m getting into the details more and more.

On our trac installation, I’ve written a tutorial on how we could use git in conjunction with the central Subversion server which allowed me to learn quite a lot about how git works and what it can do for us.

So for me it’s git-all-the-way now and I’m already looking forward to being able to create many little branches containing many little experimental features.

If you have the time and you are interested in gaining many unexpected freedoms in matters of source code management, you too should have a look at git. Also consider that on the side of the subversion backend, no change is needed at all, meaning that even if you are forced to use subversion, you can privately use git to help you manage your work. Nobody would ever have to know.

Very, very nice.