Hosted Code Repository?

Recently (yesterday), the Ruby on Rails project announced their switch to git for their revision controlling needs. Also, they announced that they will use the hosted service github as the place to host the main repository on (even though git is decentralized, there is some sense in having a “main tree” which contains what’s going to be the official releases).

I didn’t know github, so I had a look at their project.

What I don’t understand is that they seem to also target commercial entities with their offering. Think of it: Supposing that you are a commercial entity doing commercial software development. Would you send over all your sourcecode and all the development history to another company?

Sure. They call themselves “Secure”. But what does that mean? Sure: They have SSL and SSH support, but frankly, I’m less concerned with patches travelling over the network unencrypted than I’m concerned with trusting anybody to host my code.

Even if they don’t screw up storage security (think: “accessing the code of your competition”), even if they are completely 100% trustworthy (think: “displeased employee selling out to your competition before leaving his employer”), there is still the issue of government/legal access.

When using an external hosting provider, you are storing your code (and history) in a foreign country with its own legislation. Are you prepared for that?

And finally, do you want the government of the country you’ve just sent your code (and history) to, to really have access to all that data? Who guarantees that the hosting provider of your choice won’t cooperate as soon as the government comes knoking (it happened before, even without legal base at all)?

All that is never worth the risk for a larger company (or for smaller ones – like ours).

So what exactly are these hosting companies (github is one. Code Spaces is another) targeted at?

  • Free Software developers? Their code is open to begin with, so they have to face the problems I described anyways. But they are much harder to sue. Also, I’m not sure how compelling it is for a free software project to use a non-free tool (rails being the exception, but we’ll talk about that later on)
  • Large companies? No way (see above)
  • Smaller companies? Probably not. Smaller companies are less of a target due to lower visibility, but sueing them for anything is more likely to get you something in return quickly as they usually don’t dare prolonged legal fights.

A rant on brace placement

Many people consider it to be good coding style to have braces (in language that use them for block boundaries) on their own line. Like so:

function doSomething($param1, $param2)
{
    echo "param1: $param1 / param2: $param2";
}

Their argument usually is that it clearly shows the block boundaries, thus increasing the readability. I, as a proponent of placing bracers at the end of the statement opening the block, strongly disagree. I would format above code like so:

function doSomething($param1, $param2){
    echo "param1: $param1 / param2: $param2";
}

Here is why I prefer this:

  • In many languages code blocks don’t have their own identity – functions have, but not blocks (they don’t provide scope). Placing the opening brace on its own line, you emphasize the block but you actually make it harder to see what caused the block in the first place.
  • Using correct indentation, the presence of the block should be obvious anyways. There is no need to emphasize it more (at the cost of readability of the block opening statement).
  • I doubt that using one line per token really makes the code more readable. Heck… why don’t we write that sample code like so?
function
doSomething
(
$param1,
$param2
)
{
    echo "param1: $param1 / param2: $param2";
}

PostgreSQL on Ubuntu

Today, it was time to provision another virtual machine. While I’m a large fan of Gentoo, there were some reasons that made me decide to gradually start switching over to Ubuntu Linux for our servers:

  • One of the large advantages of Gentoo is that it’s possible to get bleeding edge packages. Or at least you are supposed to. Lately, it’s taking longer and longer for an ebuild of an updated version to finally become available. Take PostgreSQL for example: It took about 8 months for 8.2 to become available and it looks like history is repeating itself for 8.3
  • It seems like there are more flamewars than real development going on in Gentoo-Land lately (which in the end leads to above problems)
  • Sometimes, init-scripts and stuff changes over time and there is not always a clear upgrade-path. emerge -u world once, then forget to etc-update and on next reboot, hell will break loose.
  • Installing a new system takes ages due to the manual installation process. I’m not saying it’s hard. It’s just time-intensive

Earlier, the advantage of having current packages greatly outweighted the issues coming with Gentoo, but lately, due to the current state of the project, it’s taking longer and longer for packages to become available. So that advantage fades away, leaving me with only the disadvantages.

So at least for now, I’m sorry to say, Gentoo has outlived it’s usefulness on my productive servers and has been replaced by Ubuntu, which albeit not being bleeding-edge with packages, at least provides a very clean update-path and is installed quickly.

But back to the topic which is the installation of PostgreSQL on Ubuntu.

(it’s ironic, btw, that Postgres 8.3 actually is in the current hardy beta, together with a framework to concurrently use multiple versions whereas it’s still nowhere to be seen for Gentoo. Granted: An experimental overlay exists, but that’s mainly untested and I had some headaches installing it on a dev machine)

After installing the packages, you may wonder how to get it running. At least I wondered.

/etc/init.d/postgresql-8.3 start

did nothing (not very nice a thing to do, btw). initdb wasn’t in the path. This was a real WTF moment for me and I assumed some problem in the package installation.

But in the end, it turned out to be an (underdocumented) feature: Ubuntu comes with a really nice framework to keep multiple versions of PostgreSQL running at the same time. And it comes with scripts helping to set up that configuration.

So what I had to do was to create a cluster with

pg_createcluster --lc-collate=de_CH --lc-ctype=de_CH -e utf-8 8.3 main

(your settings my vary – especially the locale settings)

Then it worked flawlessly.

I do have some issues with this process though:

  • it’s underdocumented. Good thing I speak perl and bash, so I could use the source to figure this out.
  • in contrast to about every other package in Ubuntu, the default installation does not come with a working installation. You have to manually create the cluster after installing the packages
  • pg_createcluster –help bails out with an error
  • I had /var/lib/postgresql on its own partition and forgot to remount it after a reboot which caused the init-script to fail with a couple of uninitialized value errors in perl itself. This should be handeled cleaner.

Still. It’s a nice configuration scheme and a real progress from gentoo. The only thing left for me now is to report these issues to the bugtracker and hope to see this fixed eventually. And it it isn’t, there is this post here to remind me and my visitors.