PHP 5.2.4

Today, the bugfix release PHP 5.2.4 was released.

This is an interesting release, because it includes my fix for bug 42117 which I discovered and fixed a couple of weeks ago.

This means that with PHP 5.2.4 I will finally be able to bzip2-encode data as it is generated on the server and stream it out to the client, greatly speeding up our Windows client.
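
To illustrate what "encode as it is generated" means, here is a minimal sketch in Python rather than PHP. It is not the code from our server and does not use PHP's bzip2 stream filter; the function name bzip2_stream is made up for this example. It only shows the general idea of compressing chunk by chunk instead of buffering the whole response first.

```python
import bz2
from typing import Iterable, Iterator


def bzip2_stream(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Compress chunks incrementally (hypothetical helper for illustration).

    Compressed bytes are yielded as soon as the compressor hands them out,
    so the caller can write them to the client while more data is generated.
    """
    compressor = bz2.BZ2Compressor()
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer internally and return b""
            yield out
    tail = compressor.flush()  # emit whatever is still buffered
    if tail:
        yield tail
```

A caller would loop over bzip2_stream(generate_rows()) and write each piece to the output as it arrives, where generate_rows stands for whatever produces the data.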

Now I only need to wait for the updated Gentoo package before I can update our servers.

More iPod fun

Last time I explained how to get .OGG-feeds to your iPod.

Today I’ll show you one way to greatly increase the usability of non-official (read: not bought at audible.com) audiobooks you may have lying around in .MP3 format.

You see, your iPod treats every MP3 file in your library as music, regardless of length and content. This can be annoying as the iPod (rightly so) forgets the position in the file when you stop playback. So if you return to the file, you’ll have to start from the beginning and seek through the file.

This is a real pain with longer audiobooks and/or radio plays, of which I have a ton.

One way around this is to convert your audiobooks to AAC and rename the files to .m4b, which will convince iTunes to internally tag the files as audiobooks and enable the additional features (storing the position and providing UI to change play speed).

Of course this would have meant converting a considerable part of my MP3 library to the AAC-format which is not yet as widely deployed (not to speak of the quality-loss I’d have to endure when converting a lossy format into another lossy format).

It dawned on me that there’s another way to make the iPod store the position – even with MP3-files: Podcasts.

So the idea was to create a script that reads my MP3-Library and outputs RSS to make iTunes think it’s working with a Podcast.

And thus, audiobook2cast.php was born.

The script is very much tailored to my directory structure and probably won’t work at your end, but I hope it’ll provide you with something to work with.

Only two things in the script are worth pointing out:

  • When checking a podcast, iTunes ignores the type-attribute of the enclosure when determining whether a file can be played or not. So I had to add the fake .mp3-extension.
  • I’m outputting a totally fake pubDate-Element in the <item>-Tag to force iTunes to sort the audiobooks in ascending order.
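
The two points above can be sketched roughly like this. This is a Python illustration, not audiobook2cast.php itself; build_feed, its parameters and the start date are assumptions made for the example. Each enclosure URL ends in .mp3, and every item gets an invented pubDate that advances by one day per file so the player has something to sort by. Flip the sign of the step if your player ends up sorting the wrong way round.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from urllib.parse import quote
import xml.etree.ElementTree as ET


def build_feed(title, base_url, tracks,
               start=datetime(2000, 1, 1, tzinfo=timezone.utc)):
    """Build a minimal RSS 2.0 feed (hypothetical helper).

    tracks is a list of (filename, size_in_bytes) in listening order.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = title
    for index, (filename, size) in enumerate(tracks):
        url = base_url + quote(filename)  # URL keeps the .mp3 extension
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = filename
        ET.SubElement(item, "guid").text = url
        # Completely fake date: one day per track, only used for sorting.
        fake_date = start + timedelta(days=index)
        ET.SubElement(item, "pubDate").text = format_datetime(fake_date)
        ET.SubElement(item, "enclosure", url=url,
                      length=str(size), type="audio/mpeg")
    return ET.tostring(rss, encoding="unicode")
```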

As I said: This is probably not useful to you out-of-the-box, but it’s certainly an interesting solution to an interesting problem.

Cheating with OGG-podcasts

I’ve been listening to podcasts all the time for about a year now. Until now, I was using my iPod nano with iTunes for my podcasting needs and I was pretty happy with it.

Lately though, I came across some podcasts that provide either only OGG versions or at least enhanced OGG versions (like stereo or additional content). Not wanting to start writing code to listen to Podcasts, I thought that maybe I should try out another player…

I settled on an iRiver Clix 2 which looks great, has a nice OLED display and plays OGG files.

Unfortunately though, it doesn’t play AAC-files which is what one of the podcasts I listen to is distributed in.

So I went down to code and wrote some conversion scripts that download the AAC-files, convert them to ogg and alter the RSS-feed to point to the converted files.

This worked perfectly, so today I rsynced two podcasts to the iRiver and went to the office, only to notice two big problems with the thing:

  1. It doesn’t keep track of which podcasts I’ve already listened to. As I’m subscribed to quite a lot of podcasts, it’s very hard to keep track manually.
  2. And the killer: It doesn’t store the playback position. This is totally bad as podcasts usually are long (up to two hours) and while I like the iRiver’s nice ‘press-the-edge-of-the-device’ usage concept, it’s a real pain to seek in the file: Either it’s way too slow or totally inaccurate, so while seeking on the iPod would be tolerable, it’s completely impossible to do on the iRiver.

Just when I thought that the advantages of being able to play OGGs still outweigh the two disadvantages, I began thinking that maybe, maybe I could do the AAC to OGG-Hack again, but in the other direction…

So now I’m “cheating” myself into better quality and bonus content without actually really using the free format.

And this is how it works (it’s basically the same thing as the scripts I linked in the forum post above, but it has some advanced features):

  • At half past midnight (though I may increase the interval), ogg_cast_download.php runs. It goes over a list of RSS-feeds (though I may actually automate this list in a later revision – as soon as I’m getting more and more ogg-casts), checks them for new entries (which is easy: If the file isn’t there, it must be new), downloads the enclosures (using wget for resume functionality, proper handling of redirects and meaningful output), acquires tagging information and finally converts the files to AAC format using faac.
  • Whenever iTunes checks for new podcasts, it doesn’t actually download the original feed, but uses oggcasts.php running on shion, passing the original URL.
  • oggcasts.php checks the (symlinked) output directory of the ogg downloader and alters the feeds to match the converted files.
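
The feed-rewriting step can be sketched like this. It is a Python illustration and not oggcasts.php; the function name, the .m4a extension, the audio/mp4 type and the shape of the converted mapping are all assumptions for the example. For every enclosure it checks whether a converted file with the same base name exists and, if so, points the enclosure at it. Entries without a converted file are left untouched.

```python
import posixpath
from urllib.parse import quote, urlparse
import xml.etree.ElementTree as ET


def rewrite_enclosures(feed_xml, converted, base_url,
                       new_ext=".m4a", new_type="audio/mp4"):
    """Point enclosures at converted files (hypothetical helper).

    converted maps converted filenames to their size in bytes, e.g. the
    result of listing the downloader's output directory.
    """
    root = ET.fromstring(feed_xml)
    for enclosure in root.iter("enclosure"):
        original = posixpath.basename(urlparse(enclosure.get("url", "")).path)
        stem, _ = posixpath.splitext(original)
        candidate = stem + new_ext
        if candidate in converted:  # only rewrite what was really converted
            enclosure.set("url", base_url + quote(candidate))
            enclosure.set("length", str(converted[candidate]))
            enclosure.set("type", new_type)
    return ET.tostring(root, encoding="unicode")
```

One thing to keep in mind with this approach: ElementTree keeps namespaced elements intact but may rename their prefixes on output, so a real implementation would want to register the namespaces the feed uses.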

And if you think you can just install the official QuickTime OGG component to import the feeds: That unfortunately won’t work. iTunes refuses to directly download ogg-feeds.

Updating or replacing datasets

This is maybe the most obvious trick in the world but I see people not doing it all over the place, so I guess it’s time to write about it.

Let’s say you have a certain set of data you need to enter into your RDBMS. Let’s further assume that you don’t know whether the data is already there or not, so you don’t know whether to use INSERT or UPDATE.

Some databases provide us with something like REPLACE or “INSERT OR REPLACE”, but others do not. Now the question is, how to do this efficiently?

What I always see is something like this (pseudo-code):

  1. select count(*) from xxx where primary_key = xxx
  2. if (count > 0) update; else insert;

This means that for every dataset you will have to do two queries. This can be reduced to only one query in some cases by using this little trick:

  1. update xxx set yyy where primary_key = xxx
  2. if (affected_rows(query) == 0) insert;

This method just goes ahead with the update, assuming that data is already there (which usually is the right assumption anyways). Then it checks if an update has been made. If not, it goes ahead and inserts the data set.

This means that in cases where the data is already there in the database, you can reduce the work on the database to one single query.

Additionally, doing a SELECT and then an UPDATE essentially performs the lookup twice, as the UPDATE has to find the rows to update anyway. Depending on your optimizer and/or query cache, this can of course be optimized away, but there are no guarantees.
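
Here is the update-first trick as a small runnable sketch, using Python and SQLite purely for illustration. The table items with its columns id and value is a made-up example, and update_or_insert is a hypothetical helper name.

```python
import sqlite3


def update_or_insert(conn, key, value):
    """Try the UPDATE first and INSERT only if no row was touched."""
    cur = conn.execute(
        "UPDATE items SET value = ? WHERE id = ?", (value, key)
    )
    if cur.rowcount == 0:  # nothing matched, so the row is not there yet
        conn.execute(
            "INSERT INTO items (id, value) VALUES (?, ?)", (key, value)
        )
        return "inserted"
    return "updated"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
    print(update_or_insert(conn, 1, "first"))
    print(update_or_insert(conn, 1, "second"))
```

Two caveats when carrying this over to other databases: some drivers report only rows that actually changed (MySQL does this by default), so an UPDATE that writes identical values can look like "no row found". And with concurrent writers, another connection can insert the row between your UPDATE and your INSERT, so you may still have to handle a duplicate-key error.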

Careful when clean-installing TabletPCs

At work, I got my hands on an LS-800 TabletPC by Motion Computing. After spending a lot of time with it, and as I’m very interested in TabletPCs anyway, I finally got myself its bigger brother, the LE-1700.

The device is a joy to work with: Relatively small and light, one big display and generally nice to handle.

The tablet came with Windows XP preinstalled and naturally, I wanted to have a look at the new Tablet-centric features in Vista, so I went ahead and upgraded.

Or better: Clean-installed.

The initial XP installation was German and I was installing an English copy of Vista, which makes a clean installation mandatory.

The LE-1700 is one of the few devices without official Vista support, but I guess that’s because of the missing software for the integrated UMTS modem – for all other devices, drivers either come prebundled with Vista, are available on Windows Update or you can use the XP drivers provided at the Motion Computing support site.

After the clean installation, I noticed that the calibration of the pen was a bit off – depending on the position on the screen, the tablet noticed the pen up to 5mm left or above the actual position of the pen. Unfortunately, using the calibration utility in the control panel didn’t seem to help much.

After some googling, I found out what’s going on:

The end-user accessible calibration tool only calibrates the screen for the tilt of the pen relative to the current position. The calibration of the pen’s position is done by the device manufacturer and there is no tool available for end-users to do that.

Which, by the way, is understandable considering how the miscalibration showed itself: Towards the middle of the screen it was perfect and near the sides it got worse and worse. This means that a tool would have to present quite a lot of points for you to hit to actually get an accurately working calibration.

Of course, this was a problem for me – especially when I tried out Journal and noticed that the error was bad enough to take all the fun out of handwriting (imagine writing on paper and the text appearing .5cm left of where you put the pen).

I needed to get the calibration data and I needed to put it back after the clean installation.

It turns out that the linear calibration data is stored in the registry under HKLM\SYSTEM\CurrentControlSet\Control\TabletPC\LinearityData in the form of a (large) binary blob.

Unfortunately, Motion does not provide a tool or even reg-file to quickly re-add the data should you clean-install your device, so I had to do the unthinkable (I probably could have called support, but my method had the side effect of not making me wait forever for a fix):

I restored the device to the factory state (by using the preinstalled Acronis True Image residing on a hidden partition), exported the registry settings, reinstalled Vista (at which time the calibration error resurfaced), imported the .reg-File and rebooted.

This solved the problem – the calibration was as smooth as ever.
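
For the export and import steps, I used the registry editor, but the same thing can be scripted with the reg command line tool that ships with Windows. The following Python sketch only builds and runs those commands; the file name linearity.reg is an example, and the helper names are made up. If LinearityData turns out to be a value rather than a key on your device, export the parent TabletPC key instead.

```python
import subprocess

# Registry location named in this post; adjust if your device differs.
LINEARITY_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\TabletPC\LinearityData"


def export_command(key, out_file):
    """Command line for saving a registry key to a .reg file."""
    return ["reg", "export", key, out_file]


def import_command(reg_file):
    """Command line for loading a .reg file back into the registry."""
    return ["reg", "import", reg_file]


def run(command):
    """Run the command on Windows; raises if reg reports an error."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Before the clean install (needs an administrator prompt):
    run(export_command(LINEARITY_KEY, "linearity.reg"))
    # After the clean install, copy the file back and call:
    # run(import_command("linearity.reg"))
```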

Now, I’m not sure if the calibration data is valid for the whole series or even defined per device, but here is my calibration data in case you have the same problem as I had.

If the settings are per device or you have a non-LE-1700, I strongly advise you to export that registry key before clean-installing.

Obviously I would have loved to know this beforehand, but… oh well.

Gmail – The review

It has been quite a while since I began routing my mail to Gmail with the intention of checking that often-praised mail service out thoroughly.

The idea was to find out if it’s true what everyone keeps saying: That Gmail has a great user interface, that it provides all the features one needs and that it’s a plain pleasure to work with.

Personally, I’m blown away.

Despite the obviously longer load time before you can access the mailbox (Mac Mail launches quicker than Gmail loads here – even with a 10 MBit/s connection), the Gmail interface is much faster to use – especially with the nice keyboard shortcuts – but I’m getting ahead of myself.

When I began to use the interface for some real email work, I immediately noticed the shift of paradigm: There are no folders and – the real new thing for me – you are encouraged to move your mail out of the inbox as you take notice of them and/or complete the task associated with the message.

When you archive a message, it moves out of the inbox and is – unless you tag it with a label for quick retrieval – only accessible via the (quick) full text search engine built into the application.

The searching part of this usage philosophy is familiar to me. When I was using desktop clients, I usually kept arriving email in my inbox until it contained somewhere around 1500 messages or so. Then I grabbed all the messages and moved them to my “Old Mail” folder where I accessed them strictly via the search functionality built into the mail client (or the server in case of a good IMAP client).

What’s new for me is the notion of moving mail out of your inbox as you stop being interested in the message – either because you plain read it or because the associated task is completed.

This gives you a quick overview of the tasks still pending and it keeps your inbox nice and clean.

If you want quick access to certain messages, you can tag them with any label you want (multiple labels per message are possible of course) in which case you can access the messages with one click, saving you the searching.

Also, it’s possible to define filters that automatically apply labels to messages and – if you want – move them out of the inbox automatically. This is a perfect setup for the SVN commit messages I’m getting, allowing me to quickly access them at the end of the day and look over the commits.

But the real killer feature of Gmail is the keyboard interface.

Gmail is nearly completely accessible without requiring you to move your hands off the keyboard. Additionally, you don’t even need to press modifier keys as the interface is very much aware of state and mode, so it’s completely usable with some very intuitive shortcuts which all work by pressing a single letter key.

So usually, my workflow is like this: Open gmail, press o to open the new message, read it, press y to archive it, close the browser (or press j to move to the next message and press o again to open it).

This is as fast as using, say, mutt on the console, but with the benefit of staying usable even when you don’t know which key to press (in that case, you just take the mouse).

Gmail is perfectly integrated into Google Calendar, and it’s – contrary to Mac Mail – even able to detect Outlook meeting invitations (and send back correct responses).

Additionally, there’s a MIDP applet available for your mobile phone that’s incredibly fast and does a perfect job of giving you access to all your email messages when you are on the road. As it’s a Java application, it runs on pretty much every conceivable mobile phone and because it’s a local application, it’s fast as hell and can continue to provide the nice, keyboard shortcut driven interface which we are used to from the AJAXy web application.

Overall, the experiment of switching to Gmail proved to be a real success and I will not switch back anytime soon (all my mail is still archived in our Exchange IMAP box). The only downside I’ve seen so far is that if you use different email-aliases with your Gmail-account, Gmail will set the Sender:-Header to your Gmail-address (which is a perfectly valid – and even mandated – thing to do), and the stupid Outlook on the receiving end will display the email as being sent from your Gmail address “on behalf of” your real address, exposing your Gmail-address at the receiving end. Meh. So for sending non-private email, I’m still forced to use Mac Mail – unfortunately.