armchair scientists

The place: London. The time: Around 1890.

Imagine a medium-sized room, lined with huge shelves filled with dusty books. The lights are dim, the air is heavy with cigar smoke. Outside, the last shred of daylight is fading away.

In one corner of the room, you spot two large leather armchairs and a small table. On top of the table, two half-full glasses of whisky. In each of the armchairs, an elderly person.

One of them opens his mouth to speak:

«If I were in charge down there in South Africa, we’d be so much better off – running a colony just can’t be so hard as they make it out to be»

Could this conceivably have happened? Yeah. Very likely, actually. Crazy and misguided? Of course – we learned about that in school, imperialism doesn’t work.

Of course that elderly guy in the little story is wrong. The problems are way too complex for a bystander to even understand, let alone solve. More than likely he doesn’t even have a fraction of the background needed to understand the complexities.

And yet he sits there, in his comfortable chair, in the warmth of his club in cozy London, and explains that he knows so much better than, you know, the people actually doing the work.

Now think of today.

Think about that article you just read that was explaining a problem the author was solving. Or that other article that was illustrating a problem the author is having, still in search of a solution.

Didn’t you feel the urge to go to Hacker News and reply how much better you know and how crazy the original poster must be not to see the obvious simple solution?

Having trouble scaling 4chan? How can that be hard?

Having trouble with your programming environment feeling unable to assign a string to another? Well. It’s just strings, why is that so hard?

Or those idiots at Amazon who can’t even keep their cloud service running? Clearly it can’t be that hard!

See a connection? By stating opinions like that, you are not even a little bit better than the elderly guy at the beginning of this essay.

Until you know all the facts, until you were there, on the ladder holding a hose trying to extinguish the flames, until then, you don’t have the right to assume that you’d do better.

The world we live in is incredibly complicated. Even though computer science might boil down to math, our job is dominated by side-effects and uncontrollable external factors.

Even if you think that you know the big picture, you probably won’t know all the details and without knowing the details, it’s increasingly likely that you don’t understand the big picture either.

Don’t be an armchair scientist.

Be a scientist. Work with people. Encourage them, discuss solutions, propose ideas, ask what obvious fact you missed or what was missing from the problem description.

This is 2012, not 1890.

E_NOTICE stays off.

I’m sure you’ve used this idiom a lot when writing JavaScript code:

options['a'] = options['a'] || 'foobar';

It’s short, it’s concise and it’s clear what it does. In Ruby, you can be even more concise:

params[:a] ||= 'foobar'

So you can imagine that I was happy with PHP 5.3’s new ?: operator:

<?php $options['a'] = $options['a'] ?: 'foobar';

In all three cases, the syntax is concise and readable. Arguably, the PHP one could read a bit better, but ?: still beats writing the full ternary expression and spelling out $options['a'] three times.

PopScan has, since forever (forever being 2004), run with E_NOTICE turned off. Back then, I felt it provided just baggage and I just wanted to (had to) get things done quickly.

This, of course, led to people not taking enough care of the code and recently, I had one too many cases of a bug caused by accessing a variable that was undefined in a specific code path.

I decided that I’m willing to spend the effort on cleaning all of this up and making sure that there are no undeclared fields and variables in all of PopScan’s codebase.

Which turned out to be quite a bit of work as a lot of code is apparently happily relying on the default null that you can read out of undefined variables. Those instances might be ugly, but they are by no means bugs.

Cases where the null wouldn’t be expected are the ones I care about, but I don’t even want to go and discern the two – I’ll just fix all of the instances (embarrassingly many, most of them, thankfully, not mine).

Of course, if I put hours into a cleanup project like this, I want to be sure that nobody destroys my work again over time.

Which is why I was looking into running PHP with E_NOTICE in development mode at least.

Which brings us back to the introduction.

<?php $options['a'] = $options['a'] ?: 'foobar';

is wrong code. Accessing an undefined index of an array always raises a notice. It’s not like Python, where you can choose (accessing a dictionary using [] will throw a KeyError, but there’s get() which just returns None). No. You don’t get to choose. You only get to add boilerplate:

<?php $options['a'] = isset($options['a']) ? $options['a'] : 'foobar';

See how I’m now spelling $options['a'] three times again? ?: just got a whole lot less useful.
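For comparison, here is a minimal sketch of the Python behaviour mentioned above. The options dictionary and the 'foobar' default simply mirror the PHP example; nothing here is taken from PopScan.

```python
options = {}

# Indexing with [] on a missing key raises KeyError ...
try:
    options['a']
    raised = False
except KeyError:
    raised = True

# ... while get() quietly returns None, or a default you pass in.
missing = options.get('a')
value = options.get('a', 'foobar')

# Note: unlike PHP's ?:, get() only falls back when the key is absent,
# not when the stored value is merely falsy (e.g. an empty string).
options['b'] = ''
empty = options.get('b', 'foobar')
```

So the caller decides per access whether a missing key is an error or just a reason to use the default.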

But not only that. Let’s say you have code like this:

<?php
list($host, $port) = explode(':', trim($def));
$port = $port ?: 11211;

IMHO very readable and clear what it does: It extracts a host and a port and sets the port to 11211 if there’s none in the initial string.

This of course won’t work with E_NOTICE enabled. You either lose the very concise list() syntax, or you do – ugh – this:

<?php
list($host, $port) = explode(':', trim($def)) + array(null, null);
$port = $port ?: 11211;

Which looks ugly as hell. And no, you can’t write a wrapper around explode() which always returns an array big enough, because you don’t know what’s big enough. You would have to pass the number of nulls you want into the call too. That would look nicer than the above hack, but it still doesn’t even come close in conciseness to the solution which throws a notice.
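To illustrate what such a wrapper would have to look like, here is a rough sketch, in Python rather than PHP. The helper name split_padded is made up for this example, and so is the host name. The caller still has to say how many fields are expected, which is exactly the extra noise described above.

```python
def split_padded(text, sep, count, fill=None):
    """Split text on sep and always return exactly `count` fields.

    Missing fields are filled with `fill`; surplus separators stay
    inside the last field. Assumes count >= 1.
    """
    parts = text.split(sep, count - 1)
    return parts + [fill] * (count - len(parts))


# No port given: the second field is padded with None.
host, port = split_padded("cache.example.com", ":", 2)
port = port or 11211  # fall back to the default port

# Port given: it comes back as a string, untouched.
host2, port2 = split_padded("cache.example.com:11212", ":", 2)
```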

So. In the end, I’m just complaining about syntax, you might think? I thought so too and I wanted to add the syntax I liked, so I did a bit of experimenting.

Here’s a little something I’ve come up with:

<?php
define('IT', 100000);
error_reporting(E_ALL);
$g = array('a' => 'b');
$a = '';
function _(&$in, $k=null){
if (!isset($in)) return null;
if (is_array($in)){
return isset($k) ? ( isset($in[$k]) ? $in[$k] : null) : new WrappedArray($in);
}
return $in ?: null;
}
class WrappedArray{
private $v;
function __construct(&$v){
$this->v = &$v; // keep the reference so __set() writes through to the caller's array
}
function __get($key){
return isset($this->v[$key]) ? $this->v[$key] : null;
}
function __set($key, $value){
$this->v[$key] = $value;
}
}
error_reporting(E_ALL ^ E_NOTICE);
$start = microtime(true);
for($i = 0; $i < IT; $i++){
$a = $g['gnegg'];
}
$end = microtime(true);
printf("Notices off. Array %d iterations took %.6fs\n", IT, $end-$start);
$start = microtime(true);
for($i = 0; $i < IT; $i++){
$a = isset($g['gnegg']) ? $g['gnegg'] : null;
}
$end = microtime(true);
printf("Notices off. Inline. Array %d iterations took %.6fs\n", IT, $end-$start);
$start = microtime(true);
for($i = 0; $i < IT; $i++){
$a .= $blupp;
}
$end = microtime(true);
$a = "";
printf("Notices off. Var. Array %d iterations took %.6fs\n", IT, $end-$start);
error_reporting(E_ALL);
$start = microtime(true);
for($i = 0; $i < IT; $i++){
@$a = $g['gnegg'];
}
$end = microtime(true);
printf("Notices on. @-operator. %d iterations took %.6fs\n", IT, $end-$start);
$start = microtime(true);
for($i = 0; $i < IT; $i++){
@$a .= $blupp;
}
$end = microtime(true);
printf("Notices on. Var. @-operator. %d iterations took %.6fs\n", IT, $end-$start);
$start = microtime(true);
for($i = 0; $i < IT; $i++){
$a = _($g)->gnegg;
}
$end = microtime(true);
printf("Wrapped array. %d iterations took %.6fs\n", IT, $end-$start);
$start = microtime(true);
for($i = 0; $i < IT; $i++){
$a = _($g, 'gnegg');
}
$end = microtime(true);
printf("Parameter call. %d iterations took %.6fs\n", IT, $end-$start);
$start = microtime(true);
for($i = 0; $i < IT; $i++){
$a .= _($blupp);
}
$end = microtime(true);
$a="";
printf("Undefined var. %d iterations took %.6fs\n", IT, $end-$start);

The wrapped array solution looks really compelling syntax-wise and I could totally see myself using this and even forcing everybody else to go there. But of course, I didn’t trust PHP’s interpreter and thus benchmarked the thing.

pilif@tali ~ % php e_notice_stays_off.php
Notices off. Array 100000 iterations took 0.118751s
Notices off. Inline. Array 100000 iterations took 0.044247s
Notices off. Var. Array 100000 iterations took 0.118603s
Wrapped array. 100000 iterations took 0.962119s
Parameter call. 100000 iterations took 0.406003s
Undefined var. 100000 iterations took 0.194525s

So. Using nice syntactic sugar costs roughly 8 times the performance of the plain assignment. The second-best solution? Still more than 3 times. Out of the question. Yes, it could be seen as a micro-optimization, but 100’000 iterations, while a lot, is not that many. Waiting nearly a second instead of 0.1 seconds is crazy, especially for a common operation like this.

Interestingly, the most bloated code (that checks with isset()) is twice as fast as the most readable (just assign). Likely, the notice gets fired regardless of error_reporting() and then just ignored later on.
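As a sanity check on those factors, the ratios can be computed directly from the timings printed above. This is plain arithmetic on the numbers from that one run, written as a small Python snippet; the variable names are invented for the example.

```python
# Timings in seconds, copied from the benchmark output above.
plain_assign = 0.118751   # "Notices off. Array"
inline_isset = 0.044247   # "Notices off. Inline. Array"
wrapped      = 0.962119   # "Wrapped array"
param_call   = 0.406003   # "Parameter call"

# How much slower the two helper variants are than a plain assignment.
wrapped_factor = wrapped / plain_assign
param_factor = param_call / plain_assign

# How much faster the isset() boilerplate is than a plain assignment.
isset_speedup = plain_assign / inline_isset

print("wrapped array vs. plain assign:  %.1fx" % wrapped_factor)
print("parameter call vs. plain assign: %.1fx" % param_factor)
print("plain assign vs. inline isset(): %.1fx" % isset_speedup)
```

Measured against the inline isset() variant instead of the plain assignment, the gap for the wrappers is larger still.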

What really pisses me off about this is the fact that everywhere else PHP doesn’t give a damn. ‘0’ is equal to 0. Heck, even ‘abc’ is equal to 0. It even fails silently many times.

But in a case like this, where there is even newly added nice and concise syntax, it has to be overly conservative. And there’s no way to get to the needed solution but to either write too expensive wrappers or ugly boilerplate.

Dynamic languages give us a very useful tool to be dynamic in the APIs we write. We can create functions that take a dictionary (an array in PHP) of options. We can extend our objects at runtime by just adding a property. And with PHP’s (way too) lenient data conversion rules, we can even do math with user supplied string data.

But can we read data from $_GET without boilerplate? No. Not in PHP. Can we use a dictionary of optional parameters? Not in PHP. PHP would require boilerplate.

If a language basically mandates retyping the same expression three times, then, IMHO, something is broken. And if all the workarounds are either crappy to read or have very bad runtime properties, then something is terribly broken.

So, I decided to just fix the problem (undefined variable access) but leave E_NOTICE where it is (off). There’s always git blame and I’ll make sure I will get a beer every time somebody lets another undefined variable slip in.

Alt-Space

Today, I was looking into the new jnlp_href way of launching a Java Applet. Just like applet-launcher, this allows one to create applets that depend on native libraries without the usual hassle of manually downloading the files and installing them.

Contrary to applet-launcher, it’s built into the later versions of Java 1.6 and it’s officially supported, so I have higher hopes concerning its robustness.

It’s even possible to keep the applet-launcher calls in there if the user has an older Java Plugin that doesn’t support jnlp_href yet.

So in the end, you just write a .jnlp file describing your applet and add

<param name="jnlp_href" value="http://www.example.com/path/to/your/file.jnlp">

and be done with it.

Unless of course, your JNLP file has a syntax error. Then you’ll get this in your error console (at least in case of this specific syntax error):

java.lang.NullPointerException
    at sun.plugin2.applet.Plugin2Manager.findAppletJDKLevel(Unknown Source)
    at sun.plugin2.applet.Plugin2Manager.createApplet(Unknown Source)
    at sun.plugin2.applet.Plugin2Manager$AppletExecutionRunnable.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Ausnahme: java.lang.NullPointerException

How helpful is that?

Thanks, by the way, for insisting on displaying a half-assed German translation on my otherwise English OS: Never use locale info for determining the UI language, please.

Of course, this error does not give any indication of what the problem could be.

And even worse: The error in question is the topic of this blog post: It’s the dreaded Alt-Space character, 0xa0, or NBSP in ISO 8859-1.

0xa0 looks like a space, feels like a space, is incredibly easy to type instead of a space, but it’s not a space – not in the least. Depending on your compiler/parser, this will blow up in various ways:

pilif@celes ~ % ls | grep gnegg
zsh: command not found:  grep
pilif@celes ~ %
pilif@celes ~ % cat test.php
<?
echo "gnegg";
?>
pilif@celes ~ % php test.php
PHP Parse error:  syntax error, unexpected T_CONSTANT_ENCAPSED_STRING in /Users/pilif/test.php on line 2

Parse error: syntax error, unexpected T_CONSTANT_ENCAPSED_STRING in /Users/pilif/test.php on line 2
pilif@celes ~ %

and so on.

Now you people in the US with US keyboard layouts might think that I’m just one of those whiners – after all, how stupid must one be to press Alt-Space all the time? Probably stupid enough to deserve stuff like this.

Before you think these nasty thoughts, I ask you to consider the Swiss German keyboard layout though: Nearly all the characters we programmers use are accessed by pressing Alt-[some letter]. At least on the Mac. Windows uses AltGr, or right Alt, but on the Mac, any Alt will do.

So when you look at the shell line above:

ls | grep gnegg

you’ll see how easy it is to hit alt-space: First I type ls, then space. Then I press and hold alt-7 for the pipe and then, I am supposed to let go of alt and hit space. But because my left hand is on alt and the right one is pressing space, it’s very easy to hit space before letting go of alt.

Now instead of getting immediate feedback, nothing happens. It looks as if the space had been added, when in fact, something else has been added and that something is not recognized as a white space character and thus is something completely different from a space – despite looking exactly the same.

As much fun as reading hexdump -C output is – I need this to stop.
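One way to avoid squinting at hex dumps is a small script that reports where the character hides. The following is a minimal sketch in Python, assuming the text has already been decoded (for a UTF-8 file, NBSP shows up as U+00A0); the function name find_nbsp is just for illustration.

```python
NBSP = "\u00a0"  # non-breaking space, 0xA0 in ISO 8859-1


def find_nbsp(text):
    """Return (line, column) pairs, both 1-based, for every NBSP in text."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, char in enumerate(line, start=1):
            if char == NBSP:
                hits.append((line_no, col_no))
    return hits


# The shell line from above, with the Alt-Space typo after the pipe.
sample = "ls |\u00a0grep gnegg\necho ok\n"
positions = find_nbsp(sample)
```

To check a file, pass its contents in, for example find_nbsp(open(path, encoding="utf-8").read()), where path is whatever file just blew up.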

Dear internet! How can I make my Mac (or Linux when using the Mac keyboard layout) stop recognizing Alt-Space?

To take the wind out of the inevitably arriving trolls’ sails:

  • I won’t use Windows again. Thank you. Neither do I want to use Linux on my desktop.
  • I cannot use the US keybindings because my brain just can’t handle the keyboard layout changing all the time, and as I’m a native German speaker, I do have to type umlauts now and then – actually often enough that the ¨+vowel combo isn’t acceptable.
  • While running Mac OS X, I’m stuck with the mac keyboard layout – I can’t use the Windows one.

The above JNLP error (printed here just in case somebody else has the same issue) caused me to lose nearly 5 hours of my life and will force me to work this weekend – who’d expect an XML parser error due to a space that isn’t one when seeing the above call stack?

Update: A commenter on reddit.com recommended Ukelele, which I used to create a custom keyboard layout that makes Alt-Space work just like a plain space. That’s the best solution for my specific taste, so thanks a lot!

Of all the hardware that can break…

… it has to be the one that’s most difficult to replace.

Today, my Gefen HDMI over Cat5 adapter died. Well. It didn’t die completely, it just lost its ability to produce a stable image. What is transmitted is very intermittent and in the few seconds the image is available, it’s heavily distorted.

Also, it’s not the obvious issue (faulty cabling), as the problems did not go away when I tested with two very short (1m) Cat5 cables.

Now this is really bad for a variety of reasons:

  • Only just last Saturday I bought Star Ocean and Tales of Vesperia for my 360, giving me a total play time of 1.5 hours so far.
  • Yesterday I noticed that Worms: Armageddon was released for Xbox arcade and I have already invited Ebi after the huge success that was our earlier Worms evening on the 360.
  • My setup is totally dependent on the two extenders as I am covering more than 20 meters of distance between receiver and projector. No extender, no Xbox, no Wii, no projector.
  • Last time I waited around six weeks for the extender to arrive.

Of all the hardware I have at home, the HDMI extender is the worst to break. Not only is it very hard to replace (see above), it’s so deeply integrated into my home cinema setup that just debugging what was going on took a ladder, a screwdriver, a hex-wrench and unwinding an ungodly heap of cables.

All of that in an apartment whose temperature is currently at 30°C (86 °F) and with a hell of a headache.

I’d take anything else going down. Anything but that Gefen extender. My Xbox? Sure. Shion? It’d suck, but sure, if it has to be, go ahead. My receiver? That would hurt as it was very expensive, but at least it’s easily replaced.

Why did it have to be that Gefen extender? Why??

digg bar controversy

Update: I actually wrote this post yesterday and scheduled it for posting today. In the meantime, digg has found an even better solution and only shows their bar for logged-in users. Still – a solution like the one provided here would allow for the link to go to the right location regardless of the state of the digg bar settings.

Recently, digg.com added a controversial feature, the digg bar, which basically frames every posted link in a little IFRAME.

Rightfully so, webmasters were concerned about this and quite quickly, we had the usual religious war going on between the people finding the bar quite useful and the webmasters hating it for lost page rank, even worse recognition of their site and presumed affiliation with digg.

Ideas crept up over the weekend, but turned out not to be so terribly good.

Basically it all boils down to digg.com screwing up on this, IMHO.

I know that they let you turn off that dreaded digg bar, but all the links on their page still point to their own short url. Only then is the decision made whether to show the bar or not.

This means that all links on digg currently just point to digg itself, not awarding any linked page with anything but the traffic, which they don’t necessarily want. Digg traffic isn’t worth much in terms of returning users. You get dugg, you melt your servers, you go back to being unknown.

So you would probably appreciate the higher page rank you get from being linked to by digg, as that leads to increased search engine traffic, which generally is worth much more.

The solution on digg’s part could be simple: Keep the original site URL in the href of their links, but use some JS magic to still open the digg bar. That way they still get to keep their foot in the user’s path away from the site, but search engines will now do the right thing and follow the links to their actual target, thus giving the webmasters their page rank back.

How to do this?

Here are a few lines of jQuery to automatically make links formatted in the form

be opened via the digg bar while still working correctly for search engines (assuming that the link’s ID is the digg shorturl):

$(function(){
  $('div#link_container a').click(function(){
    // swap in the digg short URL just before the browser follows the link
    $(this).attr('href', 'http://digg.com/' + this.id);
  });
});

Piece of cake.

No further changes needed and all the web masters will be so much happier while digg gets to keep all the advantages (and it may actually help digg to increase their pagerank as I could imagine that a site with a lot of links pointing to different places could rank higher than one without any external links).

Webmasters then still could do their usual parent.location.href trickery to get out of the digg bar if they want to, but they could also retain their page rank.

No need to add further complexity to the web’s standards because one site decides not to play well.

Bugs, Bugs and more Bugs

I love my job. Always have, always will.

But if you ask me what the most annoying aspect of it is, then I would answer you that it’s stuff always breaking all around me.

Whatever I do, there is no guarantee that any given thing will work like it’s expected to: it will break from one moment to the next or it will never work at all. There are hardware failures, OS failures, software failures – each and every day I lose at least one or two hours due to stuff not working or suddenly ceasing to work.

Let me give you an account of what happened since the beginning of 2009:

  • When installing two previously configured servers at a colocation center, one didn’t start up at all (opening and reclosing the case fixed that), and the ESX server on the other machine refused to connect to the VMware license server despite a working TCP/IP connection between them. That turned out to be caused by a missing hosts file entry, even though I was connecting via IP address.
  • One day later, Outlook on the computer of someone whose PC I look after a bit decided to trash the .PST file, and I had to guide the person over the phone through restoring it from the backup.
  • Yesterday, my Firebug suddenly stopped working. At least the console-object wasn’t any longer available in my scripts and the console itself didn’t work. Reinstalling the Addon helped (WTF?)
  • One of my two Vista Media Center PCs suddenly stopped playing any video file, despite me not doing updates on these machines to prevent stuff like this from happening. To this date I have no idea how to fix this.
  • My Delphi 2007 installation just now decided to stop displaying the online help. Trying to fix that by reinstalling it ended with an error message containing a title and content of “Error”, but not before first completely uninstalling Delphi with no way of getting it back (you know… “Error” again). This was fixed by removing D2009 and then reinstalling 2007 and 2009 – a process that took 2 hours of installation time and another three to figure out what was going on.
  • When I was frustrated enough and wanted to vent (i.e. write this post), my WordPress just now decided to do something really strange to the layout of the “Add New Post” page which made it impossible to post anything. Disabling Google Gears and restarting the browser helped.

Our everyday technology is becoming more and more complex, thus causing more and more strange problems, requiring more and more knowledge and time to work around them. If we continue on that path, sooner or later it will be impossible to keep up with fixing problems popping up.

That will be the day when I’ll hopefully live on some island way off the net and all this stuff.

Automatic language detection

If you write a website, do not use Geolocation to determine the language to display to your user.

If you write a desktop application, do not use the region setting to determine the language to display to your user.

This is incredibly annoying for some of us, especially for me which is why I’m ranting here.

The moment Google released their (awful) German translation for their RSS reader, I was served the German version just because I have a Swiss IP address.

Here in Switzerland, we actually speak one of three (or four, depending on who you ask) languages, so defaulting to German is probably not of much help for the people in the French-speaking part.

Additionally, there are many users fluent in (at least reading) English. We always prefer the original language if at all possible because generally, translations never quite work. Even if you have the best translators at work, translated texts never feel fluid. Especially not when you are used to the original version.

So, Google, what were you thinking to switch me over to the German version of the reader? I have been using the English version for more than a year, so clearly, I understood enough of that language to be able to use it. More than 90% of the RSS feeds I’m subscribed to are, in fact, in English. Can you imagine how pissed I was to see the interface changed?

This is even worse on the iPhone/iPod frontend because, there, you don’t even provide an option to change the language aside from manually hacking the URL.

Or take desktop applications. I live in the German-speaking part of Switzerland. True. So naturally I have set my locale settings to Swiss German. You know: I want to have the correct number formatting, I want my weeks to start on Mondays. I want the correct currency. I want the 24-hour clock I’m used to.

Actually, I also want the German week and month names, because I will be using these in most of my letters and documents, which are, in fact, German too.

But my OS installation is English. I am used to English. I prefer English. Why do so many programs insist on using the locale setting to determine the display language? Do you developers think it’s funny to have a mish-mash of languages on the screen? Don’t you think that me using an English OS version may be an indication that I do not want to read your crappy German translation alongside the English user interface of my OS?

Don’t you think that it feels really stupid to have a button in a German dialog box open another, English, dialog (the first one is from Chrome, the one that opens once you click “Zertifikate verwalten” (Manage certificates) is from Windows itself)?

In Chrome, I can at least fix the language – once I found the knob to turn. At first, it was easier for me to just delete the German localization file from the Chrome installation because, being completely unused to German UIs, I was unable to find the right setting.

This is really annoying and I see this particular problem being neglected on an incredibly large scale. I know that I am a minority, but the problem is so terribly easy to fix:

  • All current browsers send an Accept-Language header. In contrast to the earlier times, nowadays, it is actually correctly preset in all the common browsers. Use that. Don’t use my IP-address.
  • Instead of reading the locale setting in my OS, ask the OS for its UI language and use that to determine which localization to load (actually, this is the recommended way of doing things according to Microsoft’s guidelines at least since Windows XP which was 2001).

Using these two simple tricks, you help a minority without hindering the majority in any way and without additional development overhead!

Actually, you’ll be getting away a lot cheaper than before. GeoIP is expensive if you want it to be accurate (and you do want that. Don’t you?), whereas there are ready-to-use libraries to determine the correct language even from the most complex Accept-Language-Header.

Asking the OS for the UI language isn’t harder than asking it for the locale, so no overhead there either.
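To give an idea of how little is needed on the web side, here is a minimal sketch in Python that picks a language from an Accept-Language header. It is a simplified illustration, not a complete implementation of the HTTP rules (it ignores wildcards, for instance), and the names parse_accept_language and pick_language are invented for this example. A real site would more likely use one of the ready-made libraries.

```python
def parse_accept_language(header):
    """Return language tags from the header, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # default weight when no q= is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:
            # position keeps the header order for equal weights
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(header, available, default="en"):
    """Pick the first preferred language that the site actually offers."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]  # "de-ch" falls back to "de"
        if tag in available:
            return tag
        if primary in available:
            return primary
    return default


# A browser set to English first, German as a fallback.
choice = pick_language("en-US,en;q=0.9,de-CH;q=0.5,de;q=0.4", ["de", "en", "fr"])
```

No IP address is involved anywhere: the decision rests entirely on what the browser says the user wants.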

Please, developers, please have mercy! Stop the annoyance! Stop it now!

Internet at home

I’m a usually very happy customer of Cablecom. They provide internet over TV cable, and here in Switzerland basically everyone has TV cable. They hand out nice, pure IP addresses (no PPPoE stuff) and, as long as you are not caught in the administrative trap, it just works. Cablecom internet is never down, it’s very speedy, and I’m usually envied for my pings in online matches of whatever game.

All these are very good reasons to become a customer of Cablecom and despite what you are going to read here shortly, I would probably still recommend them to other users – at least those with some technical background because, quite frankly, of all the ways to get broadband here in Switzerland, this is the one that works most easily and most consistently.

But once you fall into the administrative trap, all hell breaks loose.

Here’s what happened to me (also, read my other post about Cablecom’s service):

Somewhere around the end of May I got a letter telling me that I would get sent a new cable modem. Once I’ve got that, I should give them a call so they can deactivate my old one. Also, if I don’t call, they’d automatically disable the old modem after a couple of weeks.

Unfortunately, I never got that modem. I don’t know who’s to blame and I don’t care. Also, I could not have anticipated the story as it’s now unfolding because the letter clearly said that I’d get the modem at an unknown later date, so I wasn’t worried at the time.

At the beginning of June, I noticed the network going down. Not used to that, especially not as it was down for a whole day, I called the hotline and told them that I suspected them of shutting off my service despite me not having received the modem.

They confirmed that and promised to resend the modem. Re-enabling the old one was not possible, they told me furthermore.

One week later – not having received the modem – I called again and they told me that the order was delayed due to some CRM software change at their end, but they promised to send it that week.

Another week passes. No modem. I call again and they tell me that the reprocessing of orders was delayed, but that I will get the modem that week for sure. Knowing that this probably won’t be the case, I tell them that I will be on vacation and that they should send it to my office address.

Another week passes and I go on vacation.

Another week passes and I call the office to ask if the modem (that was supposed to arrive two weeks ago at the latest) has arrived. Of course it hasn’t. What actually made me call was the fact that I had received a press release from Cablecom announcing more customers than ever – the irony of that bringing my memory back to the non-existent internet at my home.

So I called support again. They did notice that my order was late, but they had no idea why it was taking so long, there was no way of speeding it up and they had no idea when I would get the modem (keep in mind that I’m paying CHF 79 per month for internet access that isn’t working).

At this point I had had enough and called someone higher up I know who works at Cablecom.

In the end, I was able to get internet access using that route, but it’s not entirely official and I still have not the slightest idea of when/if the problem with my actual account will ever be fixed.

Pathetic.

Still: If everything goes well, then you have nothing to fear. From a technical standpoint, Cablecom owns all other currently widely available methods for broadband internet access, so this is what I will be sticking with. Just be prepared for longer service interruptions once you fall into the administrative trap.

Hosted Code Repository?

Recently (yesterday), the Ruby on Rails project announced their switch to git for their revision control needs. Also, they announced that they will use the hosted service github as the place to host the main repository (even though git is decentralized, there is some sense in having a “main tree” which contains what’s going to be the official releases).

I didn’t know github, so I had a look at their project.

What I don’t understand is that they seem to also target commercial entities with their offering. Think of it: Suppose that you are a commercial entity doing commercial software development. Would you send over all your source code and all the development history to another company?

Sure. They call themselves “Secure”. But what does that mean? Sure: They have SSL and SSH support, but frankly, I’m less concerned with patches travelling over the network unencrypted than I’m concerned with trusting anybody to host my code.

Even if they don’t screw up storage security (think: “accessing the code of your competition”), even if they are completely 100% trustworthy (think: “displeased employee selling out to your competition before leaving his employer”), there is still the issue of government/legal access.

When using an external hosting provider, you are storing your code (and history) in a foreign country with its own legislation. Are you prepared for that?

And finally, do you really want the government of the country you’ve just sent your code (and history) to, to have access to all that data? Who guarantees that the hosting provider of your choice won’t cooperate as soon as the government comes knocking (it has happened before, even without any legal basis at all)?

All that is never worth the risk for a larger company (or for smaller ones – like ours).

So what exactly are these hosting companies (github is one. Code Spaces is another) targeted at?

  • Free Software developers? Their code is open to begin with, so they have to face the problems I described anyways. But they are much harder to sue. Also, I’m not sure how compelling it is for a free software project to use a non-free tool (rails being the exception, but we’ll talk about that later on)
  • Large companies? No way (see above)
  • Smaller companies? Probably not. Smaller companies are less of a target due to lower visibility, but suing them for anything is more likely to get you something in return quickly as they usually don’t dare to enter prolonged legal fights.

A rant on brace placement

Many people consider it to be good coding style to have braces (in languages that use them for block boundaries) on their own line. Like so:

function doSomething($param1, $param2)
{
    echo "param1: $param1 / param2: $param2";
}

Their argument usually is that it clearly shows the block boundaries, thus increasing the readability. I, as a proponent of placing braces at the end of the statement opening the block, strongly disagree. I would format the above code like so:

function doSomething($param1, $param2){
    echo "param1: $param1 / param2: $param2";
}

Here is why I prefer this:

  • In many languages code blocks don’t have their own identity – functions have, but not blocks (they don’t provide scope). Placing the opening brace on its own line, you emphasize the block but you actually make it harder to see what caused the block in the first place.
  • Using correct indentation, the presence of the block should be obvious anyways. There is no need to emphasize it more (at the cost of readability of the block opening statement).
  • I doubt that using one line per token really makes the code more readable. Heck… why don’t we write that sample code like so?
function
doSomething
(
$param1,
$param2
)
{
    echo "param1: $param1 / param2: $param2";
}