bug- gnegg

Windows Media Encoder: File not found

Today I have come across an installation of a Windows Media Encoder that refused to actually encode media. Whenever I started the encoding process, the encoder quit with the error 0x80070002 and gave the very helpful unformation that “the system cannot find the file specified”.

The problem appeared quite suddenly after working perfectly fine for the last three months. As the system is behind a very air-tight firewall and is the only machine in the network segment (aside of some IP cameras), the system hasn’t even been updated via Windows Update. So I have to conclude, that the problem appeared out of the blue. One day it worked, the next it stopped working.

I’ve tried everything to fix this (the encoder in question was encoding a live stream for a client of ours): From reinstalling the Axis capture driver to reinstalling Windows Media Encoder – nothing worked – the error message stayed the same.

Even googling proved all but helpful: There are quite many pages apparently mirroring all and the same MSDN forum on which someone actually posted the same problem but never got an answer. How annoying is that? You find 10 or more hits, everyone having your problem right in the title and everyone on a different page, but in the end, it’s all the same posting mirrored by different sites and plastered with advertisements.

On a hunch though, I have deleted “%Localappdata%MicrosoftWindows Media” and “%Localappdata%MicrosoftWindows Media Player” seeing that these folders stayed intact after a reinstallation while also being somewhat Windows media related.

Of course that helped!

So if you ever are in the same problem and Media Encoder suddenly stops encoding, it’s maybe caused by a corrupted cache of sorts. In that case, remove the cache and be encoding again, but note though, that if you are on a client machine with all your media on, removing these folders may be unwise as they could contain some meta information about your media.

In my case that didn’t matter though.

Dependent on working infrastructure

If you create and later deploy and run a web application, then you are dependent on a working infrastructure: You need a working web server, you need a working application server and in most cases, you’ll need a working database server.

Also, you’d want a solution that always and consistently works.

We’ve been using lighttpd/FastCGI/PHP for our deployment needs lately. I’ve preferred this to apache due to the easier configuration possible with lighty (out of the box automated virtual hosting for example), the potentially higher performance (due to long-running FastCGI processes) and the smaller amount of memory consumed by lighttpd.

But last week, I had to learn the price of walking off the beaten path (Apache, mod_php).

In one particular constellation, the lighty, fastcgi, php combination, running on a Gentoo box sometimes (read: 50% of the time) a certain script didn’t output all the data it should have. Instead, lighty randomly sent out RST packets. This without any indication of what could be wrong in any of the involved log files.

Naturally, I looked everywhere.

I read the source code of PHP. I’ve created reduced test cases. I’ve tried workarounds.

The problem didn’t go away until I tested the same script with Apache.

This is where I’m getting pragmatic: I depend on a working infrastructure. I need it to work. Our customers need it to work. I don’t care who is to blame. Is it PHP? Is it lighty? Is it Gentoo? Is it the ISP (though it would have to be on the senders end as I’ve seen the described failure with different ISPs)?

I don’t care.

My interest is in developing a web application. Not in deploying one. Not really, anyways.

I’m willing (and able) to fix bugs in my development environment. I may even be able to fix bugs in my deployment platform. But I’m certainly not willing to. Not if there is a competing platform that works.

So after quite some time with lighty and fastcgi, it’s back to Apache. The prospect of having a consistently working backed largely outweighs the theoretical benefits of memory savings, I’m afraid.

The IE rendering dilemma

There’s a new release of Internet Explorer, aptly named IE8, pending and a whole lot of web developers are in fear of new bugs and no fixes to existing ones. Like the problems we had with IE7.

A couple of really nasty bugs where fixed, but there wasn’t any significant progress in matters of extended support for web standards or even a really significant amount of bugfixes.

And now, so fear the web developers, history is going to repeat itself. Why, are people asking, aren’t they just throwing away the currently existing code-base, replacing it with something more reasonable? Or if licensing or political issues prevent using something not developed in-house, why not rewrite IE’s rendering engine from scratch?

Backwards compatibility. While the web itself has more or less stopped using IE-only-isms and began embracing the way of the web standards (and thus began cursing on IE’s bugs), corporate intranets, the websites accessed by Microsoft’s main customer base, certainly have not.

ActiveX, <FONT>-Tags, VBScript – the list is endless and companies don’t have the time or resources to remedy that. Remember. Rewriting for no real purpose than “being modern” is a real waste of time and certainly not worth the effort. Sure. New Applications can be developped in a standards compliant way. But think about the legacy! Why throw all that away when it works so well in the currently installed base of IE6?

This is why Microsoft can’t just throw away what they have.

The only option I see, aside of trying to patch up what’s badly broken, is to integrated another rendering engine into IE. One that’s standards compliant and one that can be selected by some means – maybe a HTML comment (the DOCTYPE specification is already taken).

But then, think of the amount of work this creates in the backend. Now you have to maintain two completely different engines with completely different bugs at different places. Think of security problems. And think of what happens if one of these buggers is detected in a third party engine a hypothetical IE may be using. Is MS willing to take responsibility of third-party bugs? Is it reasonable to ask them to do this?

To me it looks like we are now paying the price for mistakes MS did a long time ago and for quick technological innovation happening at the wrong time on the wrong platform (imagine the intranet revolution happening now). And personally, I don’t see an easy way out.

I’m very interested in seeing how Microsoft solves this problem. Ignore the standards-crowd? Ignore the corporate customers? Add the immense burden of another rendering engine? FIx the current engine (impossible, IMHO)? We’ll know once IE8 is out I guess.

PHP 5.2.4

Today, the bugfix-release 5.2.4 of PHP has been released.

This is an interesting release, because it includes my fix for bug 42117 which I discovered and fixed a couple of weeks ago.

This means that with PHP 5.2.4 I will finally be able to bzip2-encode data as it is generated on the server and stream it out to the client, greatly speeding up our windows client.

Now I only need to wait for the updated gentoo package to update our servers.

PHP, stream filters, bzip2.compress

Maybe you remember that, more than a year ago, I had an interesting problem with stream filters.

The general idea is that I want to output bz2-compressed data to the client as the output is being assembled – or, more to the point: The PopScan Windows-Client supports the transmission of bzip2 encoded data which gets really interesting as the amount of data to be transferred increases.

Even more so: The transmitted data is in XML format which is very easily compressed – especially with bzip2.

Once you begin to transmit multiple megabytes of uncompressed XML-data, you begin to see the sense in jumping through a hoop or two to decrease the time needed to transmit the data.

On the receiving end, I have an elaborate construct capable of downloading, decompressing, parsing and storing data as it arrives over the network.

On the sending end though, I have been less lucky: Because of that problem I had, I was unable to stream out bzip2 compressed data as it was generated – the end of the file was sometimes missing. This is why I’m using ob_start() to gather all the output and then compress it with bzcompress() to send it out.

Of course this means that all the data must be assembled before it can be compressed and the sent to the client.

As we have more and more data to transmit, the client must wait longer and longer before the data begins to reach it.

And then comes the moment when the client times out.

So I finally really had to fix the problem. I could not believe that I was unable to compress and stream out data on the fly.

It turns out that I finally found the smallest possible amount of code to illustrate the problem in a non-hacky way:

So: This fails under PHP up until 5.2.3:

<?
$str = "BEGIN (%d)n
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
ex ea commodo consequat. Duis aute irure dolor in reprehenderit in
voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur
sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum.
nEND (%d)n";

$h = fopen($_SERVER['argv'][1], 'w');
$f = stream_filter_append($h, "bzip2.compress", STREAM_FILTER_WRITE);
for($x=0; $x < 10000; $x++){
   fprintf($h, $str, $x, $x);

}
fclose($h);
echo "Writtenn";
?>

Even worse though: It doesn’t fail with a message, but it writes out a corrupt bzip-File.

And it gets worse: With a little amount of data it works, but as the amount of data increases, it begins to fail – at different places depending on how you shuffle the data around.

Above script will write a bzip file which – when uncompressed – will end around iteration 9600.

So now that I had a small reproducible testcase, I could report a bug in PHP: Bug 47117.

After spending so many hours on a problem which in the end boiled down to a bug in PHP (I’ve looked anywhere, believe me. I also tried workarounds, but all to no avail), I just could not let the story end there.

Some investigation quickly turned up a wrong check for a return value in bz2_filter.c which I was able to patch up very, very quickly, so if you visit that bug above, you will find a patch correcting the problem.

Then, when I finished patching PHP itself, hacking up the needed PHP-code to let the thing stream out the compressed data as it arrived was easy. If you want, you can have a look at bzcomp.phps which demonstrates how to plug the compression into either the output buffer handling or something quick, dirty and easier else.

Oh, and if you are tempted to do this:

function ob($buf){
        return bzcompress($buf);
}

ob_start('ob');

… it won’t do any good because you will still gobble up all the data before compressing. And this:

function ob($buf){
        return bzcompress($buf);
}

ob_start('ob', 32768);

will encode in chunks (good), but it will write a bzip2-end-of-stream marker after every chunk (bad), so neither will work.

Nothing more satisfying than to fix a bug in someone else’s code. Now let’s hope this gets applied to PHP itself so I don’t have to manually patch my installations.