MySQL in Acrobat 8

I have Acrobat 8 running on my Mac – and look what I found by accident:

I had console.log open to check something when I found these lines:

061115 9:57:48 [Warning] Can’t open and lock time zone table: Table ‘mysql.time_zone_leap_second’ doesn’t exist trying to live without them

/Applications/Adobe Acrobat 8 Professional/Adobe Acrobat Professional.app/Contents/MacOS/mysqld: ready for connections.
Version: ‘4.1.18-standard’ socket: ‘/Users/pilif/Library/Caches/Acrobat/8.0_x86/Organizer70’ port: 0 MySQL Community Edition – Standard (GPL)

MySQL shipped with Acrobat? Interesting.

The GPL version shipped with Acrobat? IMHO a clear license breach.

Of course, I peeked into the Acrobat bundle:

% pwd
/Applications/Adobe Acrobat 8 Professional/Adobe Acrobat Professional.app/Contents/MacOS
% dir mysql*
-rwxrwxr-x    1 pilif    admin     2260448 Feb 20  2006 mysqladmin
-rwxrwxr-x    1 pilif    admin     8879076 Feb 20  2006 mysqld

Interesting. Shouldn’t the commercial edition refrain from printing “Community Edition (GPL)”? Even if Adobe doesn’t violate the license (because they are just shipping the GPLed server and have either bought a commercial license for the client library – which is GPL too – or written their own client), the GPL clearly states that I can get the source code and a copy of the license. I couldn’t find either anywhere though…
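For the curious: the binary itself will tell you which edition it claims to be. This is just mysqld’s standard --version switch plus a grep over the binary’s strings – nothing Adobe-specific:

% cd "/Applications/Adobe Acrobat 8 Professional/Adobe Acrobat Professional.app/Contents/MacOS"
% ./mysqld --version
% strings mysqld | grep -i gpl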

I guess I should ask MySQL what’s going on here.

Bootcamp, Vista, EFI-Update

Near the end of October, I wanted to install Vista on my Mac Pro, using Bootcamp of course. The reason is that I need a Windows machine at home to watch speedruns on, so it seemed like a nice thing to try.

Back then, I was unable to even get setup going: Whenever I selected a partition that wasn’t the first partition on the drive (where OS X must be), the installer complained that the BIOS reported the selected partition to be non-bootable, and that was it.

Yesterday, Apple released another EFI update which is said to improve compatibility with Bootcamp and to fix some suspend/resume problems (I never had those).

Naturally, I went ahead and tried again.

The good news: Setup doesn’t complain any more. Vista can be installed to the second (or rather third) partition without complaining.

The bad news: The Bootcamp driver installer doesn’t work. It always bails out with some MSI error and claims to roll back all changes (which it doesn’t – sound keeps working even after that «rollback» has occurred). This means: no driver support for the NVIDIA card in my MacPro.

Even after fetching a Vista-compatible driver from NVIDIA, I had no luck: The installer claimed the installation was successful, but the resolution stayed at 640x480x16 after a reboot. Device manager complained about the driver not finding certain resources to claim the device and told me to turn off other devices… whatever.

So in the MacPro case, I guess it’s a matter of waiting for updated Bootcamp drivers from Apple. I hear, though, that the other machines – those with an ATI card – are quite well supported.

All you have to do is launch the Bootcamp driver installer with the /a /v parameters to just extract the drivers; then you point the device manager to that directory to install the drivers manually.
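To illustrate – the installer’s file name and the target directory here are assumptions on my part, so adjust them to whatever your driver CD actually contains:

rem /a performs an administrative install (extract only); /v passes options through to the MSI
rem installer name and TARGETDIR are assumed
"Install Macintosh Drivers for Windows.exe" /a /v"TARGETDIR=C:\BootcampDrivers"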

The pain of email SPAM

Lately, the SPAM problem in my email INBOX has gotten a lot worse. Spammers increasingly seem to check whether their mail gets flagged by SpamAssassin and tweak the messages until they get through.

Due to some tricky aliasing going on on the mail server, I’m unable to properly use the Bayes filter of SpamAssassin on our main mail server. You see, I have a virtually infinite number of addresses which are in the end delivered to the same account, and all that aliasing can only happen after the message has passed SpamAssassin.

This means that even though mail may go to one and the same user in the end, it’s seen as mail for many different users by SpamAssassin.

This inability to use Bayes with SpamAssassin means that lately, SPAM has been getting through the filter.

So much SPAM that I began getting really, really annoyed.

I know that mail clients themselves also have Bayes-based SPAM filters, but I often check my email account with my mobile phone or on different computers, so I depend on a solution that filters out the SPAM before it reaches my INBOX on the server.

The day before yesterday I had enough.

While all mail for all the domains I’m managing is handled by a customized MySQL-Exim-Courier setup, mail to the @sensational.ch domain is relayed to another server and then delivered to our Exchange server.

Even better: That final delivery step happens after all the aliasing steps (the catch-all aliases being the difficult part here) have completed. This means that I can in fact have all mail to @sensational.ch pass through a Bayes filter, and the messages will all be filtered for the correct account.

This made me install dspam on the relay that transmits mail from our central server to the exchange server.

Even after only one day of training, I’m getting impressive results – and keep in mind that DSPAM only sees mail that SpamAssassin hasn’t already flagged as spam, which means mail that is carefully crafted to look “real”.

After one day of training, DSPAM usually detects junk messages: I’m down to one false negative per 10 junk messages (and no false positives).
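And when a junk message does slip through, retraining DSPAM is a one-liner – a minimal sketch, assuming the offending mail was saved to a file first:

# feed a missed junk mail back to DSPAM as a classification error
# (user name and file name are placeholders)
dspam --user pilif --class=spam --source=error < missed-spam.eml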

Even after running SpamAssassin and thus filtering out the obvious suspects, a whopping 40% of the email reaching me is still SPAM – nearly half of the messages not already filtered out by SA.

If I take a look at the big picture, even when counting the various mails sent by various cron daemons as genuine email, I’m getting much more junk email than genuine email per day!

Yesterday, Tuesday, for example, I got – including mails from cron jobs and backup copies of order confirmations for PopScan installations currently in public tests – 62 genuine emails and 252 junk mails, of which 187 were caught by SpamAssassin and the rest were detected by DSPAM (with the exception of two mails that got through).

This is insane. I’m getting four times more spam than genuine messages! What the hell are these people thinking? With that volume of junk filling up our inboxes, how could one of these “advertisers” ever think that somebody is both stupid enough to fall for such a message and intelligent enough to pick theirs out of all the others?

Anyway, this isn’t supposed to be a rant. It’s supposed to be praise for DSPAM. Thanks, guys! You rule!

podcast recommendation

I haven’t been much into podcasts until now: The ones I heard were boring, unprofessional or way too professional. Additionally, I didn’t have a nice framework set up for fetching and listening to them.

That’s because I don’t sync my iPod often. Most of the time, it’s not connected to a computer: About once every two months, I connect it to upload a new batch of audiobooks (I can’t fit my whole collection on the nano). So podcasting was – even if I had found one that interested me – an experience to have behind the computer monitor.

Now two things have changed:

  1. I found the Linux Action Show. The guys doing that podcast are incredibly talented people. The episodes sound very professionally made while still not being on the obviously commercial side of things. They cover very, very interesting topics and they are anything but boring. Funny, entertaining and competent. Very good stuff.
  2. At least since the release of SlimServer 6.5, my Squeezebox has been able to tune into RSS feeds with enclosures (or podcasts, for the less technically savvy people – not that those would read this blog). Even better: The current server release brought a firmware which finally gives the Squeezebox the capability of natively playing Ogg streams.

    Up until now, it could only play FLAC, PCM and MP3, requiring tools like sox to convert Ogg streams on the fly. Unfortunately, that didn’t work as stably as I would have liked, but native Ogg support helped a lot.

So now, whenever a new episode of the podcast is released (once per week – and each episode is nearly two hours long), I can use my Squeezebox to listen to it on my home stereo.

Wow… I’m so looking forward to doing that in front of a cozy fire in my fireplace once I can finally move into my new flat.

DVD ripping, second edition

HandBrake is a tool with the worst website possible: The screenshot presented on the index page leaves a completely wrong impression of the application.

If you just look at the screenshot, you get the impression that the tool is fairly limited and totally optimized for creating movies for handheld devices.

That’s not true though. The screenshot shows a light edition of the tool. The real thing is actually quite capable and only lacks the ability to store subtitles in the container format.

And it doesn’t know about Matroska.

And it refuses to store x264 encoded video in the OGM container.

Another tool I found after my very bad first experience with ripping DVDs last time is OGMRip. The tool is a frontend for mencoder (of MPlayer fame) and has all the features you’d ever want from a ripping tool while still being easy to use.

It even provides a command line interface, allowing you to process your movies from the console.

It has one inherent flaw though: It’s single threaded.

HandBrake, on the other hand, can split the encoding work (yes, the actual encoding) across multiple threads and thus profits a lot from SMP machines.

Here’s what I found in terms of encoding speed. I encoded the same video (from a DVD ISO image) with the same settings (x264, 1079kbit/s video, 112kbit/s MP3 audio, 640×480 resolution at 30fps) on different machines:

  • 1.4GHz G4 Mac mini, running Gentoo Linux, OGMRip: 3fps
  • Thinkpad T43, 1.6GHz Centrino, running Ubuntu Edgy Eft, OGMRip: 8fps
  • MacBook Pro, 2GHz Core Duo, HandBrake: 22fps (both cores at 100%)
  • Mac Pro, dual dual-core 2.66GHz, HandBrake: 110fps(!!), 80% total CPU usage (HDD I/O seems to limit the process)

This means that encoding a whole 47-minute A-Team episode takes:

  • OGMRip on Mac mini G4: 7.8 hours
  • OGMRip on Thinkpad: 2.35 hours per episode
  • HandBrake on MacBook Pro: 1.6 hours per episode
  • HandBrake on MacPro: 0.2 hours (12 minutes) per episode
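For reference, the arithmetic behind the first and last rows: at the 30fps target, a 47-minute episode comes to 47 × 60 × 30 = 84,600 frames, so the Mac mini at 3fps needs 84,600 / 3 = 28,200 seconds – the 7.8 hours above – while the Mac Pro at 110fps finishes in about 770 seconds, roughly the 12 minutes quoted.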

Needless to say which method I’m using. Screw subtitles and Matroska – I want to finish ripping my collection this century!

On an additional closing note, I’d like to add that even after 3 hours of encoding video, the MacPro stayed very, very quiet. The only thing I could hear was the hard drive – the fans either didn’t run or were quieter than the hard drive (which is quiet too).

ripping DVDs

I have plenty of DVDs in my possession: Some movies of dubious quality which I bought when I was still going to school (like “Deep Rising” – eeew) and many, many episodes of various series (Columbo, the complete Babylon 5 series, A-Team and other pearls).

As you may know, I’m soon to move into a new flat which I thought would be a nice opportunity to reorganize my library.

shion has around 1.5TB of storage space, and I can easily upgrade her capacity by plugging in yet another USB hub and more USB hard drives (shion is the only computer I own that I use a female pronoun for – the machine is something really special to me, like the warships of old times).

It makes total sense to use that practically unlimited storage capacity to store all my movies – not only the ones I’ve downloaded (like video game speedruns). Spoiled by the ease of ripping CDs, I thought that this would be just another little thing to do before moving.

You know: Insert the DVD, run the ripper, run the encoder, done.

Unfortunately, this is proving to be harder than it looked at first:

  • Under Mac OS X, you can try to use the Unix tools via fink or some home-grown native tools. Whatever you do, you either get outdated software (fink) or not-really-working freeware tools documented in outdated tutorials. Nah.
  • Under Windows, there are two kinds of utilities: On one hand, you have the single-click ones (like AutoGK) which really do what I initially wanted. Unfortunately, they are limited in their use: They provide only a limited set of output formats (no x264, for example) and they hard-code the subtitles into the movie stream. But they are easy to use. On the other hand, you have the hardcore tools like Gordian Knot or MeGUI or even StaxRip. These tools are frontends for other tools that follow the Unix philosophy: Each does one thing, but tries to excel at that one thing.

    This could be a good thing, but unfortunately it fails due to awful documentation, hard-coded paths to files everywhere and outdated tools.

    I could not get any of the tools listed above to actually create an x264 AVI or MKV file without either throwing a completely unusable error message (“Unknown exception occurred”), just not working at all, or missing things like subtitles.

  • Linux has dvd::rip, which is a really nice solution – but unfortunately not for me, as I don’t have the right platform to run it on: My MCE machine is – well – running Windows MCE, my laptop is running Ubuntu (no luck with the Debian packages, and there are no Ubuntu packages), and shion is running Gentoo, but she’s headless, so I’d have to use a remote X connection, which is awfully slow and non-scriptable.

The solution I want runs on the Linux (or Mac OS X) console, is scriptable and – well – works.

I guess I’m going the hardcore way and using transcode, which is what dvd::rip uses under the hood – provided I find good documentation (I’m more than willing to read and learn, if the documentation is current enough and actually documents the software I’m running, not the software as it was two years ago).
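For what it’s worth, the kind of invocation I have in mind looks roughly like this – a sketch only: title number, bitrate, size and output file are made up, and whether you get x264 output at all depends on which export modules your transcode build ships with:

# rip title 1 from the DVD to an XviD AVI; all values are assumptions, see transcode(1)
transcode -i /dev/dvd -x dvd -T 1,-1 -y xvid4 -w 1200 -Z 640x480 -o episode.avi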

I’ll keep you posted on how I’m progressing.

Prewritten content

You may have noticed that last week saw a posting on nearly every day – and all the postings seem to have happened around 8:30am.

The reason for that is that I had a lot of inspiration last Monday, allowing me to write two or three entries at once. I made Serendipity queue them up and post one each day.

And as time progressed, I added more entries which I could likewise schedule for the future, thus keeping up the illusion that I was actually posting at 8:30 in the morning – a thing I’m certainly not thinking of doing.

While I’m awake at 8:30, I am most certainly not in the mood to post anything then, not to speak of the lack of inspiration due to not having surfed the web yet.

Writing content ahead of time has some advantages: It allows for better editing (much more time to read the entry before it is posted) and it helps keep the blog alive (a post for every day). But it also has some disadvantages: For one, the entries may not be as deep as one written in the moment.

After writing down an entry or two, I feel a bit of a burnout, which certainly has negative effects on the entries’ length and depth.

And even worse: s9y insists on sending pings when the entry is submitted – not when it’s published.

This means that I’m either sending out pings for non-existing entries (a bad thing) or not sending out pings at all (slightly better).

So in retrospect, I’m going to do both: post ahead and post in real time.

An insider trick to find out whether a posting is prewritten: Look at the posting time. If it’s 8:30 in the morning, it’s prewritten.

Intel Mac Mini, Linux, Ethernet

If you have one of these new Intel Macs, you will sooner or later find yourself in the situation of having to run Linux on one of them. (OK, granted: The situation may come sooner for some than for others.)

Last weekend, I was in that situation: I had to install Linux on an Intel Mac Mini.

The whole thing is quite easy to do, and if you don’t need Mac OS X, you can just go ahead and install Linux like you would on any other x86 machine (provided the hardware is sufficiently new to have the BIOS emulation layer already installed – otherwise you have to install the Firmware Update first; you’ll notice by the Mac not booting from the CD despite holding c during the initial boot sequence).

You can partition the disk to your liking – the Mac bootloader will notice that there’s something fishy with the partition layout (the question-mark-on-a-folder icon will blink once or twice) before passing control to the BIOS emulation, which will be able to boot Linux from the partitions you created during installation.

Don’t use grub as the bootloader though.

I don’t know if it’s something grub does to the BIOS or something about the partition table, but grub can’t load stage 1.5 and thus is unable to boot your installation.

lilo works fine though (use plain lilo when using the BIOS emulation for the boot process, not elilo).
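For reference, the lilo.conf doesn’t need anything special – a minimal sketch, with device names and kernel path assumed:

# /etc/lilo.conf – plain BIOS-style lilo, not elilo
boot=/dev/sda          # write the boot loader to the MBR (device assumed)
prompt
timeout=50
image=/boot/vmlinuz    # kernel image path assumed
    label=linux
    root=/dev/sda3     # the Linux root partition created during installation
    read-only

Don’t forget to re-run /sbin/lilo after every change so the boot map gets rewritten.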

When you are done with the installation process, something bad will happen sooner or later though: Ethernet will stop working.

This is what syslog has to say about it:

NETDEV WATCHDOG: eth0: transmit timed out
sky2 eth0: tx timeout
sky2 eth0: transmit ring 60 .. 37 report=60 done=60
sky2 hardware hung? flushing

When I pulled the cable and plugged it in again, the kernel even oops’ed.

The Macs have a Marvell Yukon Ethernet chipset. This is what lspci has to tell us: 01:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 22). The driver to use in the kernel config is “SysKonnect Yukon2 support (EXPERIMENTAL)” (CONFIG_SKY2).

I guess the EXPERIMENTAL tag is warranted for once.

The good news is that this problem is fixable. The bad news is: It’s tricky to do.

Basically, you have to update the driver to the version in the repository of what’s going to become kernel 2.6.19.

Getting a current version of sky2.c and sky2.h is not that difficult. Unfortunately though, the new driver won’t compile with the current 2.6.18 kernel (and upgrading to a pre-rc is out of the question – even more so considering the ton of stuff going into 2.6.19).

So first, we have to patch in this changeset to make the current revision of sky2 compile.

Put the patch into /usr/src/linux and apply it with patch -p1.

Then fetch the current revisions of sky2.c and sky2.h and overwrite the existing files. I used the web interface to git for that, as I have no idea how the command line tools work.

Recompile the kernel and reboot.
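In concrete terms, the whole dance looks something like this (the patch file name and the location of the fetched sources are mine – adjust as needed):

# make the new driver compile against 2.6.18 (the changeset mentioned above; file name assumed)
cd /usr/src/linux
patch -p1 < sky2-build-fix.patch

# replace the in-tree driver with the revisions fetched via the git web interface
cp /tmp/sky2.c /tmp/sky2.h drivers/net/

# rebuild, install and reboot
make && make modules_install && make install
reboot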

For me, this fixed the problem with the sky2 driver: The machine in question has now been running for a whole week without any networking lockups – despite heavy network load at times.

While I’m happy to see this fixed, my advice about not buying too-new hardware if you intend to use Linux on it (posting number 6 here on gnegg.ch – ages ago) seems to continue to apply.

XmlTextReader, UTF-8, Memory Corruption

XmlTextReader on the .NET CF doesn’t support anything but UTF-8, which can be a good thing as well as a bad thing.

Good thing because UTF-8 is a very flexible character encoding giving access to the whole Unicode character range while still being compact and easy to handle.

Bad thing because PopScan doesn’t do UTF-8. It was just never needed as its primary market is countries well within the range of ISO-8859-1. This means that the protocol between server and client so far was XML encoded in ISO-8859-1.

To be able to speak with the Windows Mobile application, the server had to convert the data to UTF-8.

And this is where a small bug crept in: Part of the data wasn’t properly encoded and was transmitted as ISO-8859-1.
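Incidentally, this kind of encoding bug is easy to catch on the server side before the data ever reaches the device: iconv bails out on the first invalid byte sequence (payload.xml being a stand-in for the actual server response):

# exit status is non-zero if the file contains bytes that aren’t valid UTF-8
iconv -f UTF-8 -t UTF-8 payload.xml > /dev/null && echo "valid UTF-8" || echo "invalid bytes found"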

The correct thing an XML parser should do with obviously incorrect data is to bail out, which is also what the .NET CF DOM parser did.

XmlTextReader did something else though: It threw an uncatchable IndexOutOfRange exception in either Read() or ReadString(). And sometimes it miraculously changed its internal state – jumping from element to element even when just calling ReadString().

To make things even worse, the exception happened at a location not even close to where the invalid character was in the stream.

In short, from what I have seen (undocumented and uncatchable exceptions being thrown at random places), it feels like the specific invalid character that was parsed in my particular situation caused memory corruption somewhere inside the parser.

Try to imagine how frustrating it was to find and fix this bug – it felt like the old days of manual memory allocation combined with stack corruption. And all because of one single bad byte in a stream of thousands of bytes.