I’ve just updated the sacy repository again and tagged a v0.3-beta1 release.
The main feature since yesterday is support for the official compilers and
tools if you can provide them on the target machine.
The drawback is that these things come with hefty dependencies at times (I
don’t think you’d find a shared hoster willing to install node.js or Ruby for
you), but if you can provide the tools, you can get some really nice
advantages over the PHP ports of the various compilers:
the PHP port of sass has an issue that prevents
@import from working. sacy’s build script does patch that, but the way they
were parsing the file names doesn’t inspire confidence in the library. You
might get a more robust solution by using the official tool.
uglifier-js is a bit faster than JSMin, produces significantly smaller
output and comes with a better license (JSMin isn’t strictly free software
as it has this “do no evil” clause)
coffee script is under very heavy development, so I’d much rather use the
upstream source than some experimental fun project. So far I haven’t seen
issues with coffeescript-php, but then I haven’t been using it much yet.
Absent from the list you’ll find less and css minification:
the PHP native CSSMin is really good and
there’s no single official external tool out there that’s demonstrably better (maybe
the YUI compressor, but I’m not going to support something that requires me
to deal with Java)
lessphp is very lightweight and yet very full
featured and very actively developed. It also has a nice advantage over the
native solution in that the currently released native compiler does not
support reading its input from STDIN, so if you want to use the official
less, you have to go with the git HEAD.
Feel free to try this out (and/or send me a patch)!
Oh and by the way: If you want to use uglifier or the original coffee script
and you need node but can’t install it, have a look at the static binary I created
I’ve just updated the sacy repository
to now also provide support for compiling Coffee Script.
As always, the support is seamless – this is all you have to do.
Again, in order to keep deployment simple, I decided to go with a pure PHP solution (coffeescript-php).
I do see some advantages in the native solutions though (performance, better output), so I’m actively looking into a solution to detect the availability of native converters that I could shell out to without having to hit the file system on every request.
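One way such a detection could work is to look each tool up once and remember the answer. The following is a minimal sketch in Python rather than PHP and only illustrates the caching idea: find_tool and pick_compiler are hypothetical helpers, not part of sacy, and the memoization here only lives as long as the process, so a PHP version would additionally need a cache that survives across requests.

```python
import shutil
from functools import lru_cache


@lru_cache(maxsize=None)
def find_tool(name):
    """Return the absolute path of an executable on PATH, or None.

    shutil.which() touches the file system; lru_cache makes sure that
    happens only once per tool name for the lifetime of the process.
    """
    return shutil.which(name)


def pick_compiler(native_name, fallback_label):
    """Prefer the native tool if it is installed, else name the fallback."""
    path = find_tool(native_name)
    return path if path is not None else fallback_label


# Hypothetical usage: the native coffee binary if present, else the PHP port.
print(pick_compiler("coffee", "coffeescript-php"))
```

Negative results are cached too, so a missing tool is not searched for again on every call.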
Once both the support for external tools and the refactoring of the transformation is completed, I’m going to release v0.3, but if you want/need coffee support right now, go ahead and clone the repository.
After reading some incredibly beautiful coffee code by @brainlock
(work related, so I can’t link the code), I decided that I wanted to use
coffee in PopScan and as such I need coffee support in sacy which handles
asset compilation for us.
This means that I need node.js on the server (sacy is allowing us a very cool
checkout-and-forget deployment without any build-scripts, so I’d like to keep
this going on).
On servers we manage, this isn’t an issue, but some customers insist on
hosting PopScan within their DMZ and provide a pre-configured Linux machine
running OS versions that weren’t quite current a decade ago.
Have fun compiling node.js for these: There are so many dependencies to meet
(a recent python for example) to build it – if you even manage to get it to
compile on these ancient C compilers available for these ancient systems.
But I really wanted coffee.
So here you go: Here’s a statically linked (this required a bit of trickery)
binary of node.js v0.4.7 compiled for 32bit Linux. This runs even on an
ancient RedHat Enterprise 3 installation, so I’m quite confident that it runs
everywhere running at least Linux 2.2:
Over the last weekend, 9to5mac.com posted about a hack which shows that it’s possible to run Siri on an iPhone 4 and
an iPod Touch 4g and possibly even older devices – considering how much of Siri
is running on Apple’s servers.
We’ve always suspected that the decision to restrict Siri to the 4S is
basically a marketing decision and I don’t really care about this either.
Nobody is forcing you to use Siri and thus nobody is forcing you to update to
Siri is Apple’s product and so are the various iPhones. It’s their decision
whom they want to sell what to.
What I find more interesting is that it was even possible to have a hacked
Siri on a non 4S-phone talk to Apple’s servers. If I were in Apple’s shoes, I
would have made that (practically) impossible.
And here’s how:
Having a device that you put into users’ hands and trusting it is always a very
hard, if not impossible, thing to do as the device can (more or less) easily be
So to solve this problem, we need some component that we know reasonably well
to be safe from the user’s tampering and we need to find a way for that
component to prove to the server that indeed the component is available and
I would do that using public key crypto and specialized hardware that works
like a TPM. So that would be a chip that contains a private key embedded in
hardware, likely not updatable. Also, that private key will never leave that
device. There is no API to read it.
The only API the chip provides is either a relatively high-level API to sign
an arbitrary binary blob or, more likely, a lower level one to encrypt some
small input (a SHA1 hash for example) with the private key.
OK. Now we have that device (also, it’s likely that the iPhone already has
something like that for its secured boot process). What’s next?
Next you make sure that the initial handshake with your servers requires that
device. Have the server post a challenge to the phone. Have the phone solve it
and have the response signed by that crypto device.
On your server, you will have the matching public key. If the signature checks
out, you talk to the device. If not, you don’t.
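The handshake described above can be sketched in a few lines of Python. This is a toy under loud assumptions: the key is built from two well-known Mersenne primes, the signature is unpadded textbook RSA, SHA-256 stands in for the hash, and chip_sign and server_verify are made-up names. It shows the shape of the protocol, not anything Apple does and not anything secure. It needs Python 3.8 or later for the modular inverse.

```python
import hashlib
import secrets

# TOY key material, for illustration only: two well-known Mersenne primes.
# Real hardware would hold a properly generated key and use a padded
# signature scheme; textbook RSA like this is NOT secure.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus (known to the server)
E = 65537                           # public exponent (known to the server)
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (never leaves the chip)


def digest_as_int(data: bytes) -> int:
    """Hash the data and reduce it so that it fits below the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def chip_sign(challenge: bytes) -> int:
    """What the tamper-resistant chip does: sign with the private key."""
    return pow(digest_as_int(challenge), D, N)


def server_verify(challenge: bytes, signature: int) -> bool:
    """What the server does: check the signature with the public key."""
    return pow(signature, E, N) == digest_as_int(challenge)


challenge = secrets.token_bytes(16)              # server -> phone
signature = chip_sign(challenge)                 # phone -> server
print(server_verify(challenge, signature))       # response from the real chip
print(server_verify(challenge, signature + 1))   # tampered response
```

Because the challenge is random for every handshake, a recorded response is of no use for a later connection.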
Now, it is possible using very expensive hardware to extract that key from the
hardware (by opening the chip’s casing and using a microscope and a lot of
skills). If you are really concerned about this, give each device a unique
private key. If a key gets compromised, blacklist it.
This greatly complicates the manufacturing process of course, so you might go
ahead with just one private key per hardware type and hope that cracking the
key will take longer than the lifetime of the hardware (which is very likely).
This isn’t at all specific to Siri of course. Whenever you have to trust a
device that you put into consumers’ hands, this is the way to go and I’m sure
we’ll be seeing more of this in the future (imagine the uses for copy
protection – let’s hope we don’t end up there).
I’m not particularly happy that this is possible, but I’d rather talk about it
than hope that it’s never going to happen – it will and I’ll be pissed.
For now I’m just wondering why Apple wasn’t doing it to protect Siri.
One of the many impressive facts about JSConf is the quality of their Wifi
connection. It’s not just free and stable, it’s also fast. Not only that, this
time around, they had a very cool feature: You authenticated via twitter.
As most of the JS community seems to be having twitter accounts anyways, this
was probably the most convenient solution for everyone: You didn’t have to
deal with creating an account or asking someone for a password and on the
other hand, the organizers could make sure that, if abuse should happen,
they’d know whom to notify.
On a related note: This was in stark contrast to the WiFi I had in the hotel
which was unstable, slow and cost a ton of money to use and it didn’t use
Twitter either :-)
In fact, the twitter thing was so cool to see in practice, that I want to use
it for myself too.
Since the days of WEP-only Nintendo DS, I’m running two WiFi networks at home:
One is WPA protected and for my own use, the other is open, but it runs over
a different interface on shion
which has no access to any other machine in my network. This is even more
important as I have a permanent OpenVPN connection
to my office and I definitely don’t want to give the world access to that.
So now the plan would be to change that open network so that it redirects to a
captive portal until the user has authenticated with twitter (I might add
other providers later on – LinkedIn would be awesome for the office for
In order for me to actually get the thing going, I’m doing a tempalias on this
one too and keep a diary of my work.
So here we go. I really think that every year I should do some fun-project
that’s programming related, can be done on my own and is at least of some use. Last time it was tempalias, this time, it’ll be Jocotoco (more about the name in the next installment).
But before we take off, let me give, again, huge thanks to the JSConf crew for
the amazing conference they manage to organize year after year. If I could,
I’d already preorder the tickets for next year :p
Attending a JSConf feels like a two-day drug-trip that lasts for at least two
options['a'] = options['a'] || 'foobar';
It’s short, it’s concise and it’s clear what it does. In ruby, you can even be more concise:
params[:a] ||= 'foobar'
So you can imagine that I was happy with PHP 5.3’s new ?: operator:
<?php $options['a'] = $options['a'] ?: 'foobar';
In all three cases, the syntax is concise and readable, though arguably, the PHP one could read a bit better, but, ?: still is better than writing the full ternary expression, spelling out $options['a'] three times.
PopScan has, since forever (forever being 2004), been running with E_NOTICE turned off. Back then, I felt it provided just baggage and I just wanted (had to) get things done quickly.
This, of course, led to people not taking enough care with the code and recently, I had one case too many of a bug caused by accessing a variable that was undefined in a specific code path.
I decided that I’m willing to spend the effort on cleaning all of this up and making sure that there are no undeclared fields and variables in all of PopScan’s codebase.
Which turned out to be quite a bit of work as a lot of code is apparently happily relying on the default null that you can read out of undefined variables. Those instances might be ugly, but they are by no means bugs.
Cases where the null wouldn’t be expected are the ones I care about, but I don’t even want to go and discern the two – I’ll just fix all of the instances (embarrassingly many, most of them, thankfully, not mine).
Of course, if I put hours into a cleanup project like this, I want to be sure that nobody destroys my work again over time.
Which is why I was looking into running PHP with E_NOTICE in development mode at least.
Which brings us back to the introduction.
<?php $options['a'] = $options['a'] ?: 'foobar';
is wrong code. Any access to an undefined index of an array always raises a notice. It’s not like Python where you can choose (accessing a dictionary using [] will throw a KeyError, but there’s get() which just returns None). No. You don’t get to choose. You only get to add boilerplate:
Which looks ugly as hell. And no, you can’t write a wrapper to explode() which always returns an array big enough, because you don’t know what’s big enough. You would have to pass the amount of nulls you want into the call too. That would look nicer than the above hack, but it still doesn’t even come close in conciseness to the solution which throws a notice.
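For comparison, this is the Python behaviour referred to earlier, as a small sketch: indexing a missing key raises, while get() lets the caller pick what happens. The options dictionary and its keys are made up for illustration.

```python
options = {"b": "set by caller"}

# Indexing a missing key raises KeyError ...
try:
    options["a"]
except KeyError:
    print("no such key")

# ... while get() returns None, or a default of your choosing.
print(options.get("a"))
print(options.get("a", "foobar"))

# Rough equivalent of the || and ?: idioms: this also replaces falsy values.
options["a"] = options.get("a") or "foobar"
print(options["a"])
```

The point is that the language offers both the strict and the lenient access, and the expression only has to be spelled out once either way.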
So. In the end, I’m just complaining about syntax, you might think? I thought so too and I wanted to add the syntax I liked, so I did a bit of experimenting.
The wrapped array solution looks really compelling syntax-wise and I could totally see myself using this and even forcing everybody else to go there. But of course, I didn’t trust PHP’s interpreter and thus benchmarked the thing.
pilif@tali ~ % php e_notice_stays_off.php
Notices off. Array 100000 iterations took 0.118751s
Notices off. Inline. Array 100000 iterations took 0.044247s
Notices off. Var. Array 100000 iterations took 0.118603s
Wrapped array. 100000 iterations took 0.962119s
Parameter call. 100000 iterations took 0.406003s
Undefined var. 100000 iterations took 0.194525s
So. Using nice syntactic sugar costs 7 times the performance. The second best solution? Still 4 times. Out of the question. Yes. It could be seen as a micro-optimization, but 100’000 iterations, while a lot, is not that many. Waiting nearly a second instead of 0.1 second is crazy, especially for a common operation like this.
Interestingly, the most bloated code (that checks with isset()) is twice as fast as the most readable (just assign). Likely, the notice gets fired regardless of error_reporting() and then just ignored later on.
What really pisses me off about this is the fact that everywhere else PHP doesn’t give a damn. ‘0’ is equal to 0. Heck, even ‘abc’ is equal to 0. It even fails silently many times.
But in a case like this, where there is even newly added nice and concise syntax, it has to be overly conservative. And there’s no way to get to the needed solution but to either write too expensive wrappers or ugly boilerplate.
Dynamic languages give us a very useful tool to be dynamic in the APIs we write. We can create functions that take a dictionary (an array in PHP) of options. We can extend our objects at runtime by just adding a property. And with PHP’s (way too) lenient data conversion rules, we can even do math with user supplied string data.
But can we read data from $_GET without boilerplate? No. Not in PHP. Can we use a dictionary of optional parameters? Not in PHP. PHP would require boilerplate.
If a language basically mandates retyping the same expression three times, then, IMHO, something is broken. And if all the workarounds are either crappy to read or have very bad runtime properties, then something is terribly broken.
So, I decided to just fix the problem (undefined variable access) but leave E_NOTICE where it is (off). There’s always git blame and I’ll make sure I will get a beer every time somebody lets another undefined variable slip in.
Only just last year, I told @brainlock (in real life, so I can’t link) that the coolest thing about our industry was that you don’t have to ask for permission to do anything.
Want to start the next big web project? Just start it. Want to write about your opinions? Just write about them. Want to get famous? It’s still a lot of work and marketing, but nothing (aside from lack of talent) is stopping you.
Whenever you have a good idea for a project, you start working on it, you see how it turns out and you decide whether to continue working on it or whether to scrap it. Aside from a bit of cash for hosting, you don’t need anything else.
This is very cool because it empowers “normal people”. Heck, I probably wouldn’t be where I currently am if it wasn’t for this. Back in 1996 I had no money, I wasn’t known, I had no past experience. What I had though was enthusiasm.
Which is all that’s needed.
Only a year later though, I’m sad to see that we are on the verge of losing all of this. Piece by piece.
First was Apple with their iPhone. Even with all the enthusiasm of the world, you are not going to write an app that other people can run on the phone. No. First you will have to ask Apple for permission.
Want to access some third-party hardware from that iPhone app? Sure. But now you have to not only ask Apple, but also the third party vendor for permission.
The explanation we were given is that a malicious app could easily bring down the mobile network. Thus they needed to be careful what we could run on our phones.
But then, we got the iPad with the exact same restrictions even though not all of them even have mobile network access.
The explanation this time? Security.
As nobody wants their machine to be insecure, everybody just accepts it.
Next came Microsoft: In the Windows Mobile days before the release of 7, you didn’t have to ask anybody for permission. You bought (or pirated if you didn’t have money) Visual Studio, you wrote your app, you published it.
All of this is lost now. Now you ask for permission. Now you hope for the powers that be to allow you to write your software.
So there’s still the web you think? I wish I could be positive about that, but as we are running out of IP-addresses and the adoption of IPv6 is slow as ever, I believe that public IP addresses are becoming a scarce good at which point, again, you will be asking for permission.
In some countries, even today, it’s not possible to just write a blog post because the government is afraid of “unrest” (read: losing even more credibility). That’s not just countries we always perceived as “not free” – heck, even in Italy you must register with the government if you want to have a blog (it turns out that law didn’t come to pass – let’s hope no other country has the same bright idea). In Germany, if you read the law by the letter, you can’t blog at all without getting every post approved – you could write
something that a minor might see.
«But permission will be granted anyways», you might say. Are you sure though? What if you are a minor wanting to create an application for your first client? Back in my days, I could just do it. Are you sure that whatever entity is going to have to give permission wants to do business with minors? You do know that you can’t have a Gmail account if you are younger than 13 years, don’t you? So age barriers exist.
What if your project competes with whatever entity has to give permission? Remember the story about the Google Voice app? Once we are out of IP addresses, the big provider and media companies who still have addresses might see your little startup web project as competition in some way. Are you sure you will still get permission?
Back in 1996 when I started my company in High-School, all you needed to earn your living was enthusiasm and a PC (yes – I started doing web programming without having access to the internet)
Now you need signed contracts, signed NDAs, lobbying, developer program memberships, cash – the barriers to entry are infinitely higher at this point.
I’m afraid though, that this is just the beginning. If we don’t stand up now, if we continue to let big companies and governments take away our freedom of expression piece by piece, if we give up more and more of our freedom because of the false promise of security, then, at one point, all of what we had will be lost.
We won’t be able to just start our projects. We won’t be able to create – only to work on other people’s projects. We will lose all that makes our profession interesting.
You will notice that the format encodes the string’s length together with the string. And because PHP is inherently not Unicode capable, it’s not encoding the string’s character length, but its byte length.
unserialize() checks whether the encoded length matches the actual delimited string’s length. This means that if you treat the serialized output as text and your database’s encoding changes along the way, the retrieved string can’t be unserialized any more.
I just learned that the hard way (even though it’s obvious in hindsight) while migrating PopScan from ISO-8859-1 to UTF-8:
The databases of existing systems now contain a lot of output from serialize() which was run over ISO strings. Now that the client encoding in the database client is set to UTF-8, the data is retrieved as UTF-8, and because the serialize() output was stored in a TEXT column, it happily gets UTF-8 encoded.
If we remove the database from the picture and express the problem in code, this is what’s going on:
unserialize(utf8_encode(serialize('data with 8bit chàracters')));
i.e. the data gets altered after serializing, and it gets altered in a way that unserialize() can’t deal with any more.
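The failure can be reproduced outside of PHP with a small Python sketch. serialize_str and unserialize_str are hypothetical stand-ins that mimic only the s:<byte length>:"<bytes>"; string format, not the real PHP functions, and the transcoding step plays the role of the database.

```python
def serialize_str(raw: bytes) -> bytes:
    """Mimic PHP's string format: s:<byte length>:"<bytes>";"""
    return b's:%d:"%s";' % (len(raw), raw)


def unserialize_str(blob: bytes) -> bytes:
    """Parse the format above, rejecting blobs whose length prefix is off."""
    if not blob.startswith(b"s:"):
        raise ValueError("not a serialized string")
    length_field, _, rest = blob[2:].partition(b":")
    length = int(length_field)
    payload = rest[1:1 + length]
    if rest[:1] != b'"' or rest[1 + length:] != b'";':
        raise ValueError("length prefix does not match payload")
    return payload


iso_bytes = "chàracters".encode("latin-1")    # 10 characters, 10 bytes
blob = serialize_str(iso_bytes)
assert unserialize_str(blob) == iso_bytes     # untouched blob round-trips

# What the database did: treat the blob as ISO-8859-1 text and hand it
# out as UTF-8. The "à" now takes two bytes, the prefix still says 10.
transcoded = blob.decode("latin-1").encode("utf-8")
try:
    unserialize_str(transcoded)
except ValueError as exc:
    print("unserialize failed:", exc)
```

The length prefix was computed over the ISO bytes, so any transcoding that changes the byte count of the payload invalidates it.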
So, for everybody else not yet in this dead end:
The output of serialize() is binary data. It looks like textual data, but it isn’t. Treat it as binary. If you store it somewhere, make sure that the medium you store it to treats the data as binary. No transformation whatsoever must ever be made on it.
Of course, that leaves you with a problem later on if you switch character sets and you have to unserialize, but at least you get to unserialize then. I have to go to great lengths now to salvage the old data.
The first is the “traditional” paradigm where your JS code is just glorified view code. This is how AJAX worked in the early days and how people are still using it. Your JS-code intercepts a click somewhere, sends an AJAX request to the server and gets back either more JS code which just gets evaluated (thus giving the server kind of indirect access to the client DOM) or an HTML fragment which gets inserted at the appropriate spot.
This means that your JS code will be ugly (especially the code coming from the server), but it has the advantage that all your view code is right there where all your controllers and your models are: on the server. You see this pattern in use on the 37signals pages or in the github file browser for example.
Keep the file browser in mind as I’m going to use that for an example later on.
The other paradigm is to go the other way around and promote JS to a first-class language. Now you build a framework on the client end and transmit only data (XML or JSON, but mostly JSON these days) from the server to the client. The server just provides a REST API for the data plus serves static HTML files. All the view logic lives only on the client side.
The advantages are that you can organize your client side code much better, for example using backbone, that there’s no expensive view rendering on the server side and that you basically get your third party API for free because the API is the only thing the server provides.
This paradigm is used for the new twitter webpage or in my very own tempalias.com.
Now @brainlock is a heavy proponent of the second paradigm. After being enlightened by the great Crockford, we both love JS and we both worked on huge messes of client-side JS code which has grown over the years and lacks structure and feels like copy pasta sometimes. In our defense: Tons of that code was written in the pre-enlightened age (2004).
I on the other hand see some justification for the first pattern as well and I wouldn’t throw it away so quickly.
The main reason: It’s more pragmatic, it’s more DRY once you need graceful degradation and arguably, you can reach your goal a bit faster.
Let me explain by looking at the github file browser:
If you have a browser that supports the HTML5 history API, then a click on a directory will reload the file list via AJAX and at the same time the URL will be updated using push state (so that the current view keeps its absolute URL which is valid even after you open it in a new browser).
If a browser doesn’t support pushState, it will gracefully degrade by just using the traditional link (and reloading the full page).
Let’s map this functionality to the two paradigms.
First the hacky one:
You render the full page with the file list using a server-side template
You intercept clicks to the file list. If it’s a folder:
you request the new file list
the server now renders the file list partial (in rails terms – basically just the file list part) without the rest of the site
the client gets that HTML code and inserts it in place of the current file list
You patch up the url using push state
done. The view code is only on the server. Whether the file list is requested using the AJAX call or the traditional full page load doesn’t matter. The code path is exactly the same. The only difference is that the rest of the page isn’t rendered in case of an AJAX call. You get graceful degradation and no additional work.
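The steps above boil down to one piece of view code with two callers. Here is a minimal sketch in Python, assuming a made-up handle_request entry point and plain string templates in place of a real framework; none of these names come from github or rails.

```python
from html import escape


def render_file_list(entries):
    """The partial: just the file list, nothing else."""
    items = "".join(f"<li>{escape(name)}</li>" for name in entries)
    return f'<ul id="files">{items}</ul>'


def render_page(entries):
    """The full page wraps the very same partial in the site layout."""
    return f"<html><body><h1>Repo</h1>{render_file_list(entries)}</body></html>"


def handle_request(entries, is_ajax):
    """AJAX clicks get the partial, plain link clicks get the whole page."""
    return render_file_list(entries) if is_ajax else render_page(entries)


entries = ["README", "src", "<odd name>"]
print(handle_request(entries, is_ajax=True))
print(handle_request(entries, is_ajax=False))
```

Since the full page embeds the output of the same function the AJAX path returns, the two can’t drift apart.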
Now assuming you want to keep graceful degradation possible and you want to go the JS framework route:
You render the full page with the file list using a server-side template
You intercept the click to the folder in the file list
You request the JSON representation of the target folder
You use that JSON representation to fill a client-side template which is a copy of the server side partial
You insert that HTML at the place where the file list is
You patch up the URL using push state
The number of steps is the same, but the amount of work isn’t: If you want graceful degradation, then you write the file list template twice: Once as a server-side template, once as a client-side template. Both are quite similar but usually you’ll be forced to use slightly different syntax. If you update one, you have to update the other or the experience will be different whether you click on a link or you open the URL directly.
Also you are duplicating the code which fills that template: On the server side, you use ActiveRecord or whatever other ORM. On the client side, you’d probably use Backbone to do the same thing but now your backend isn’t the database but the JSON response. Now, Backbone is really cool and a huge timesaver, but it’s still more work than not doing it at all.
OK. Then let’s skip graceful degradation and make this a JS only client app (good luck trying to get away with that). Now the view code on the server goes away and you are just left with the model on the server to retrieve the data, with the model on the client (Backbone helps a lot here, but there’s still a substantial amount of code that needs to be written that otherwise wouldn’t) and with the view code on the client.
Now don’t get me wrong.
I love the idea of promoting JS to a first class language. I love JS frameworks for big JS only applications. I love having a “free”, dogfooded-by-design REST API. I love building cool architectures.
I’m just thinking that at this point it’s so much work doing it right, that the old ways do have their advantages and that we should not condemn them for being hacky. True. They are. But they are also pragmatic.