Dependent on working infrastructure

If you create and later deploy and run a web application, you depend on working infrastructure: you need a working web server, a working application server and, in most cases, a working database server.

Also, you’d want that infrastructure to work consistently, every single time.

We’ve been using lighttpd/FastCGI/PHP for our deployment needs lately. I’ve preferred this to Apache because of lighty’s easier configuration (out-of-the-box automated virtual hosting, for example), the potentially higher performance (thanks to long-running FastCGI processes) and lighttpd’s smaller memory footprint.
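To give an idea of what I mean by easier configuration: lighty’s mass virtual hosting amounts to a few lines along these lines (a sketch using lighttpd’s mod_simple_vhost; the paths and host name are made up):

# map the Host: header to a per-host document root automatically
server.modules += ( "mod_simple_vhost" )
simple-vhost.server-root   = "/var/www/"
simple-vhost.default-host  = "default.example.com"
simple-vhost.document-root = "htdocs/"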

But last week, I had to learn the price of walking off the beaten path (Apache, mod_php).

In one particular constellation (lighty, FastCGI and PHP running on a Gentoo box), a certain script sometimes (read: 50% of the time) didn’t output all the data it should have. Instead, lighty randomly sent out RST packets, without any indication in any of the involved log files of what could be wrong.

Naturally, I looked everywhere.

I read the PHP source code. I created reduced test cases. I tried workarounds.

The problem didn’t go away until I tested the same script with Apache.

This is where I’m getting pragmatic: I depend on a working infrastructure. I need it to work. Our customers need it to work. I don’t care who is to blame. Is it PHP? Is it lighty? Is it Gentoo? Is it the ISP (though it would have to be on the sender’s end, as I’ve seen the described failure with different ISPs)?

I don’t care.

My interest is in developing a web application. Not in deploying one. Not really, anyways.

I’m willing (and able) to fix bugs in my development environment. I may even be able to fix bugs in my deployment platform. But I’m certainly not willing to. Not if there is a competing platform that works.

So after quite some time with lighty and FastCGI, it’s back to Apache. The prospect of a consistently working backend largely outweighs the theoretical benefit of memory savings, I’m afraid.

The return of Expect: 100-continue

Yesterday I had to work with a PHP application that uses the curl library to send an HTTP POST request to a lighttpd server.

Strangely enough, I couldn’t get anything back from the server when using PHP, while wget, which I used as a reference, got the correct answer.

This made me check the lighttpd log, and I once more came across the friendly error 417 (I recommend you read that earlier entry, as this post very much depends on it).

A quick check with Wireshark confirmed: curl was sending the Expect: 100-continue header.
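On the wire, the failing exchange looked roughly like this (host, path and sizes are made up for illustration):

POST /script.php HTTP/1.1
Host: backend.example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 4096
Expect: 100-continue

HTTP/1.1 417 Expectation Failed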

Personally, I think 100-continue is a good thing, and the curl library even seems to be intelligent about it: it only sends the header when the data to be posted exceeds a certain size threshold.
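To illustrate, here is a minimal PHP sketch of a POST big enough to make curl add the header on its own (the URL and the payload size are made up; the exact threshold is libcurl’s business):

// a POST body well above curl's threshold, so the library
// adds "Expect: 100-continue" by itself
$ch = curl_init('http://backend.example.com/script.php');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, str_repeat('x', 4096));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
// against a server answering 417, $response won't contain
// the data you were waiting for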

Also, even though people are complaining about it, I think lighttpd does the right thing. Honoring the Expect header is mandatory per the HTTP specification (RFC 2616), and if lighttpd doesn’t support this particular expectation, answering with error 417 is the only viable option.

What I do think, though, is that the client libraries should detect this failure automatically.

This is because they are creating behavior that isn’t consistent with the other types of request: GET, DELETE and HEAD requests all follow a fire-and-forget paradigm, and the libraries employ a 1:1 mapping: set up the request, send it, return the received data.

With POST (and maybe PUT), the library changes that paradigm: it actually puts two round trips on the wire (the headers with Expect: 100-continue first, the body only after the server’s interim response) while pretending in the interface that it’s sending a single request.

If it does that, then it should at least be capable of handling the cases where its scheme of transparently changing semantics breaks, a 417 answer being the obvious case.
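Something along these lines is what I’d want the library to do internally; a minimal sketch in plain PHP curl (the function name and structure are my own invention):

// sketch: POST once, and if the server fails the expectation
// with 417, retry with the Expect header suppressed
function post_with_417_fallback($url, $data) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    if (curl_getinfo($ch, CURLINFO_HTTP_CODE) == 417) {
        // second attempt without "Expect: 100-continue"
        curl_setopt($ch, CURLOPT_HTTPHEADER, array('Expect:'));
        $response = curl_exec($ch);
    }
    curl_close($ch);
    return $response;
}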

Anyways: the fix for the curl library in PHP is:

// an empty Expect header makes curl drop it from the request entirely
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Expect:'));
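Passing a header with nothing after the colon is curl’s documented way of suppressing a header it would otherwise add on its own, so the POST goes out without any Expect header and lighttpd never gets the chance to fail the expectation.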

Though I’m not sure how pure this solution is.