programming- gnegg

Abusing LiveConnect for fun and profit

On december 20th I gave a talk at the JSZurich user group meeting in Zürich.
The talk is about a decade old technology which can be abused to get full,
unrestricted access to a client machine from JavaScript and HTML.

I was showing how you would script a Java Applet (which is completely hidden
from the user) to do the dirty work for you while you are creating a very nice
user interface using JavaScript and HTML.

The slides are available in PDF format too.

While it’s a very cool tech demo, it’s IMHO also a very bad security issue
which browser vendors and Oracle need to have a look at. The user sees nothing
but a dialog like this:

security prompt

and once they click OK, they are completely owned.

Even worse, while this dialog is showing the case of a valid certificate, the
dialog in case of an invalid (self-signed or expired) certificate isn’t much
different, so users can easily tricked into clicking allow.

The source code of the demo application is on github
and I’ve already written about this on this blog here,
but back then I was mainly interested in getting it work.

By now though, I’m really concerned about putting an end to this, or at least
increasing the hurdle the end-user has to jump through before this goes off –
maybe force them to click a visible Applet. Or just remove the LiveConnect feature all
together from browsers, thus forcing applets to be visible.

But aside of the security issues, I still think that this is a very
interesting case of long forgotten technology. If you are interested, do have
a look at the talk and travel back in time to when stuff like this was only
half as scary as it is now.

updated sacy – now with external tools

I’ve just updated the sacy repository again and tagged a v0.3-beta1 release.

The main feature since yesterday is support for the official compilers and
tools if you can provide them on the target machine.

The drawback is that these things come with hefty dependencies at times (I
don’t think you’d find a shared hoster willing to install node.js or Ruby for
you), but if you can provide the tools, you can get some really nice
advantages over the PHP ports of the various compilers:

the PHP port of sass has an issue that prevents
@import from working. sacy’s build script does patch that, but the way they
were parsing the file names doesn’t inspire confidence in the library. You
might get a more robust solution by using the official tool.
uglifier-js is a bit faster than JSMin, produces significantly smaller
output and comes with a better license (JSMin isn’t strictly free software
as it has this “do no evil” clause)
coffee script is under very heavy development, so I’d much rather use the
upstream source than some experimental fun project. So far I haven’t seen
issues with coffeescript-php, but then I haven’t been using it much yet.

Absent from the list you’ll find less and css minification:

the PHP native CSSMin is really good and
there’s no single official external tool out that demonstrably better (maybe
the YUI compressor, but I’m not going to support something that requires me
to deal with Java)
lessphp is very lightweight and yet very full
featured and very actively developed. It also has a nice advantage over the
native solution in that the currently released native compiler does not
support reading its input from STDIN, so if you want to use the official
less, you have to go with the git HEAD.

Feel free to try this out (and/or send me a patch)!

Oh and by the way: If you want to use uglifier or the original coffee script
and you need node but can’t install it, have a look at the
static binary I created

updated sacy – now with more coffee

I’ve just updated the sacy repository
to now also provide support for compiling Coffee Script.

{asset_compile}
 type="text/coffeescript" src="/file1.coffee">
 type="text/javascript" src="/file2.js">
{/asset_compile}

will now not compile file1.coffee into JS before creating and linking one big chunk of minified JavaScript.

 type="text/javascript" src="/assetcache/file2-deadbeef1234.js">

As always, the support is seamless – this is all you have to do.

Again, in order to keep deployment simple, I decided to go with a pure PHP solution (coffeescript-php).

I do see some advantages in the native solutions though (performance, better output), so I’m actively looking into a solution to detect the availability of native converters that I could shell out to without having to hit the file system on every request.

Also, when adding the coffee support, I noticed that the architecture of sacy isn’t perfect for doing this transformation stuff. Too much code had to be duplicated between CSS and JavaScript, so I will do a bit of refactoring there.

Once both the support for external tools and the refactoring of the transformation is completed, I’m going to release v0.3, but if you want/need coffee support right now, go ahead and clone
the repository.

node to go

Having node.js around on your machine can be very useful – not just if you are
building your new fun project, but also for
quite real world applications.

For me it was coffee script.

After reading some incredibly beautiful coffee code by @brainlock
(work related, so I can’t link the code), I decided that I wanted to use
coffee in PopScan and as such I need coffee support in sacy which handles
asset compilation for us.

This means that I need node.js on the server (sacy is allowing us a very cool
checkout-and-forget deployment without any build-scripts, so I’d like to keep
this going on).

On servers we manage, this isn’t an issue, but some customers insist on
hosting PopScan within their DMZ and provide a pre-configured Linux machine
running OS versions that weren’t quite current a decade ago.

Have fun compiling node.js for these: There are so many dependencies to meet
(a recent python for example) to build it – if you even manage to get it to
compile on these ancient C compilers available for these ancient systems.

But I really wanted coffee.

So here you go: Here’s a statically linked (this required a bit of trickery)
binary of node.js v0.4.7 compiled for 32bit Linux. This runs even on an
ancient RedHat Enterprise 3 installation, so I’m quite confident that it runs
everywhere running at least Linux 2.2:

node-x86-v0.4.7.bz2
(SHA256: 142085682187a57f312d095499e7d8b2b7677815c783b3a6751a846f102ac7b9)

pilif@miscweb ~ % file node-x86-v0.4.7
node-x86: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, statically linked, for GNU/Linux 2.2.5, not stripped

The binary can be placed wherever you want and executed from there – node
doesn’t require any external files (which is very cool).

I’ll update the file from time to time and provide an updated post. 0.4.7 is good enough to run coffee script though.

protecting siri

Over the last weekend, 9to5mac.com posted about a hack which shows that it’s possible to run Siri on a iPhone 4 and
an iPod Touch 4g and possibly even oder devices – considering how much of Siri
is running on Apple’s servers.

We’ve always suspected that the decision to restrict Siri to the 4S is
basically a marketing decision and I don’t really care about this either.
Nobody is forcing you to use Siri and thus nobody is forcing you to update to
anything.

Siri is Apple’s product and so are the various iPhones. It’s their decision
whom they want to sell what to.

What I find more interesting is that it was even possible to have a hacked
Siri on a non 4S-phone talk to Apple’s servers. If I were in Apple’s shoes, I
would have made that (practically) impossible.

And here’s how:

Having a device that you put into users hands and trusting it is always a very
hard, if impossible thing to do as the device can (more or less) easily be
tampered with.

So to solve this problem, we need some component that we know reasonably well
to be safe from the user’s tampering and we need to find a way for that
component to prove to the server that indeed the component is available and
healthy.

I would do that using public key crypto and specialized hardware that works
like a TPM. So that would be a chip that contains a private key embedded in
hardware, likely not updatable. Also, that private key will never leave that
device. There is no API to read it.

The only API the chip provides is either a relatively high-level API to sign
an arbitrary binary blob or, more likely, a lower level one to encrypt some
small input (a SHA1 hash for example) with the private key.

OK. Now we have that device (also, it’s likely that the iPhone already has
something like that for its secured boot process). What’s next?

Next you make sure that the initial handshake with your servers requires that
device. Have the server post a challenge to the phone. Have the phone solve it
and have the response signed by that crypto device.

On your server, you will have the matching public key. If the signature checks
out, you talk to the device. If not, you don’t.

Now, it is possible using very expensive hardware to extract that key from the
hardware (by opening the chip’s casing and using a microscope and a lot of
skills). If you are really concerned about this, give each device a unique
private key. If a key gets compromised, blacklist it.

This greatly complicates the manufacturing process of course, so you might go
ahead with just one private key per hardware type and hope that cracking the
key will take longer than the lifetime of the hardware (which is very likely).

This isn’t at all specific to Siri of course. Whenever you have to trust a
device that you put into consumers hands, this is the way to go and I’m sure
we’ll be seeing more of this in the future (imagine the uses for copy
protection – let’s hope we don’t end up there).

I’m not particularly happy that this is possible, but I’d rather talk about it
than to hope that it’s never going to happen – it will and I’ll be pissed.

For now I’m just wondering why Apple wasn’t doing it to protect Siri.

A new fun project

Like back in 2010 I went to JSConf.eu this year around.

One of the many impressive facts about JSConf is the quality of their Wifi
connection. It’s not just free and stable, it’s also fast. Not only that, this
time around, they had a very cool feature: You authenticated via twitter.

As most of the JS community seems to be having twitter accounts anyways, this
was probably the most convenient solution for everyone: You didn’t have to
deal with creating an account or asking someone for a password and on the
other hand, the organizers could make sure that, if abuse should happen,
they’d know whom to notify.

On a related note: This was in stark contrast to the WiFi I had in the hotel
which was unstable, slow and cost a ton of money to use and it didn’t use
Twitter either :-)

In fact, the twitter thing was so cool to see in practice, that I want to use
it for myself too.

Since the days of WEP-only Nintendo DS, I’m running two WiFi networks at home:
One is WPA protected and for my own use, the other is open, but it runs over
a different interface on shion
which has no access to any other machine in my network. This is even more
important as I have a permanent OpenVPN connection
to my office and I definitely don’t want to give the world access to that.

So now the plan would be to change that open network so that it redirects to a
captive portal until the user has authenticated with twitter (I might add
other providers later on – LinkedIn would be awesome for the office for
example).

In order for me to actually get the thing going, I’m doing a tempalias on this
one too and keep a diary of my work.

So here we go. I really think that every year I should do some fun-project
that’s programming related, can be done on my own and is at least of some use.
Last time it was tempalias, this time, it’ll be
Jocotoco (more about the name in the next installment).

But before we take off, let me give, again, huge thanks to the JSConf crew for
the amazing conference they manage to organize year after year. If I could,
I’d already preorder the tickets for next year :p

Attending a JSConf feels like a two-day drug-trip that lasts for at least two
weeks.

E_NOTICE stays off.

I’m sure you’ve used this idiom a lot when writing JavaScript code

options['a'] = options['a'] || 'foobar';

It’s short, it’s concise and it’s clear what it does. In ruby, you can even be more concise:

params[:a] ||= 'foobar'

So you can imagine that I was happy with PHP 5.3’s new ?: operator:

<?php $options['a'] = $options['a'] ?: 'foobar';

In all three cases, the syntax is concise and readable, though arguably, the PHP one could read a bit better, but, ?: still is better than writing the full ternary expression, spelling out $options['a'] three times.

PopScan, since forever (forever being 2004) runs with E_NOTICE turned off. Back in the times, I felt it provided just baggage and I just wanted (had to) get things done quickly.

This, of course, lead to people not taking enough care for the code and recently, I had one too many case of a bug caused by accessing a variable that was undefined in a specific code path.

I decided that I’m willing to spend the effort in cleaning all of this up and making sure that there are no undeclared fields and variables in all of PopScans codebase.

Which turned out to be quite a bit of work as a lot of code is apparently happily relying on the default null that you can read out of undefined variables. Those instances might be ugly, but they are by no means bugs.

Cases where the null wouldn’t be expected are the ones I care about, but I don’t even what to go and discern the two – I’ll just fix all of the instances (embarrassingly many, most of them, thankfully, not mine).

Of course, if I put hours into a cleanup project like this, I want to be sure that nobody destroys my work again over time.

Which is why I was looking into running PHP with E_NOTICE in development mode at least.

Which brings us back to the introduction.

<?php $options['a'] = $options['a'] ?: 'foobar';

is wrong code. Any accessing of an undefined index of an array always raises a notice. It’s not like Python where you can chose (accessing a dictionary using [] will throw a KeyError, but there’s get() which just returns None). No. You don’t get to chose. You only get to add boilerplate:

<?php $options['a'] = isset($options['a']) ? $options['a'] : 'foobar';

See how I’m now spelling $options['a'] three times again? ?: just got a whole lot less useful.

But not only that. Let’s say you have code like this:

<?php
list($host, $port) = explode(':', trim($def))
$port = $port ?: 11211;

IMHO very readable and clear what it does: It extracts a host and a port and sets the port to 11211 if there’s none in the initial string.

This of course won’t work with E_NOTICE enabled. You either lose the very concise list() syntax, or you do – ugh – this:

<?php
list($host, $port) = explode(':', trim($def)) + array(null, null);
$port = $port ?: 11211;

Which looks ugly as hell. And no, you can’t write a wrapper to explode() which always returns an array big enough, because you don’t know what’s big enough. You would have to pass the amount of nulls you want into the call too. That would look nicer then above hack, but it still doesn’t even come close in conciseness to the solution which throws a notice.

So. In the end, I’m just complaining about syntax you might think? I though so too and I wanted to add the syntax I liked, so I did a bit of experimenting.

Here’s a little something I’ve come up with:

	<?php
	define('IT', 100000);

	error_reporting(E_ALL);

	$g = array('a' => 'b');
	$a = '';

	function _(&$in, $k=null){
	if (!isset($in)) return null;
	if (is_array($in)){
	return isset($k) ? ( isset($in[$k]) ? $in[$k] : null) : new WrappedArray($in);
	}
	return $in ?: null;
	}

	class WrappedArray{
	private $v;

	function __construct(&$v){
	$this->v = $v;
	}

	function __get($key){
	return isset($this->v[$key]) ? $this->v[$key] : null;
	}

	function __set($key, $value){
	$this->v[$key] = $value;
	}
	}

	error_reporting(E_ALL ^ E_NOTICE);
	$start = microtime(true);
	for($i = 0; $i < IT; $i++){
	$a = $g['gnegg'];
	}
	$end = microtime(true);
	printf("Notices off. Array %d iterations took %.6fs\n", IT, $end-$start);


	$start = microtime(true);
	for($i = 0; $i < IT; $i++){
	$a = isset($g['gnegg']) ? $g['gnegg'] : null;
	}
	$end = microtime(true);
	printf("Notices off. Inline. Array %d iterations took %.6fs\n", IT, $end-$start);


	$start = microtime(true);
	for($i = 0; $i < IT; $i++){
	$a .= $blupp;
	}
	$end = microtime(true);
	$a = "";
	printf("Notices off. Var. Array %d iterations took %.6fs\n", IT, $end-$start);


	error_reporting(E_ALL);

	$start = microtime(true);
	for($i = 0; $i < IT; $i++){
	@$a = $g['gnegg'];
	}
	$end = microtime(true);
	printf("Notices on. @-operator. %d iterations took %.6fs\n", IT, $end-$start);

	$start = microtime(true);
	for($i = 0; $i < IT; $i++){
	@$a .= $blupp;
	}
	$end = microtime(true);
	printf("Notices on. Var. @-operator. %d iterations took %.6fs\n", IT, $end-$start);


	$start = microtime(true);
	for($i = 0; $i < IT; $i++){
	$a = _($g)->gnegg;
	}
	$end = microtime(true);
	printf("Wrapped array. %d iterations took %.6fs\n", IT, $end-$start);

	$start = microtime(true);
	for($i = 0; $i < IT; $i++){
	$a = _($g, 'gnegg');
	}
	$end = microtime(true);
	printf("Parameter call. %d iterations took %.6fs\n", IT, $end-$start);


	$start = microtime(true);
	for($i = 0; $i < IT; $i++){
	$a .= _($blupp);
	}
	$end = microtime(true);
	$a="";
	printf("Undefined var. %d iterations took %.6fs\n", IT, $end-$start);

view raw e_notice_stays_off.php hosted with ❤ by GitHub

The wrapped array solution looks really compelling syntax-wise and I could totally see myself using this and even forcing everybody else to go there. But of course, I didn’t trust PHP’s interpreter and thus benchmarked the thing.

pilif@tali ~ % php e_notice_stays_off.php
Notices off. Array 100000 iterations took 0.118751s
Notices off. Inline. Array 100000 iterations took 0.044247s
Notices off. Var. Array 100000 iterations took 0.118603s
Wrapped array. 100000 iterations took 0.962119s
Parameter call. 100000 iterations took 0.406003s
Undefined var. 100000 iterations took 0.194525s

So. Using nice syntactic sugar costs 7 times the performance. The second best solution? Still 4 times. Out of the question. Yes. It could be seen as a micro-optimization, but 100’000 iterations, while a lot is not that many. Waiting nearly a second instead of 0.1 second is crazy, especially for a common operation like this.

Interestingly, the most bloated code (that checks with isset()) is twice as fast as the most readable (just assign). Likely, the notice gets fired regardless of error_reporting() and then just ignored later on.

What really pisses me off about this is the fact that everywhere else PHP doesn’t give a damn. ‘0’ is equal to 0. Heck, even ‘abc’ is equal to 0. It even fails silently many times.

But in a case like this, where there is even newly added nice and concise syntax, it has to be overly conservative. And there’s no way to get to the needed solution but to either write too expensive wrappers or ugly boilerplate.

Dynamic languages give us a very useful tool to be dynamic in the APIs we write. We can create functions that take a dictionary (an array in PHP) of options. We can extend our objects at runtime by just adding a property. And with PHP’s (way too) lenient data conversion rules, we can even do math with user supplied string data.

But can we read data from $_GET without boilerplate? No. Not in PHP. Can we use a dictionary of optional parameters? Not in PHP. PHP would require boilerplate.

If a language basically mandates retyping the same expression three times, then, IMHO, something is broken. And if all the workarounds are either crappy to read or have very bad runtime properties, then something is terribly broken.

So, I decided to just fix the problem (undefined variable access) but leave E_NOTICE where it is (off). There’s always git blame and I’ll make sure I will get a beer every time somebody lets another undefined variable slip in.

Asking for permission

Only just last year, I told @brainlock (in real life, so I can’t link) that the coolest thing about our industry was that you don’t have to ask for permission to do anything.

Want to start the next big web project? Just start it. Want to write about your opinions? Just write about them. Want to get famous? It’s still a lot of work and marketing, but nothing (aside of lack of talent) is stopping you.

Whenever you have a good idea for a project, you start working on it, you see how it turns out and you decide whether to continue working on it or whether to scrap it. Aside of a bit of cash for hosting, you don’t need anything else.

This is very cool because is empowers “normal people”. Heck, I probably wouldn’t be where I currently am if it wasn’t for this. Back in 1996 I had no money, I wasn’t known, I had no past experience. What I had though was enthusiasm.

Which is all that’s needed.

Only a year later though, I’m sad to see that we are at the verge of losing all of this. Piece by piece.

First was apple with their iPhone. Even with all the enthusiasm of the world, you are not going to write an app that other people can run on the phone. No. First you will have to ask Apple for permission.

Want to access some third-party hardware from that iPhone app? Sure. But now you have to not only ask Apple, but also the third party vendor for permission.

The explanation we were given is that a malicious app could easily bring down the mobile network. Thus they needed to be careful what we could run on our phones.

But then, we got the iPad with the exact same restrictions even though not all of them even have mobile network access.

The explanation this time? Security.

As nobody wants their machine to be insecure, everybody just accepts it.

Next came Microsoft: In the Windows Mobile days before the release of 7, you didn’t have to ask anybody for permission. You bought (or pirated if you didn’t have money) Visual Studio, you wrote your app, you published it.

All of this is lost now. Now you ask for permission. Now you hope for the powers that be to allow you to write your software.

Finally, you can’t even do what you want with your PC – all because of security.

So there’s still the web you think? I wish I could be positive about that, but as we are running out of IP-addresses and the adoption of IPv6 is slow as ever, I believe that public IP addresses are becoming a scarce good at which point, again, you will be asking for permission.

In some countries, even today, it’s not possible to just write a blog post because the government is afraid of “unrest” (read: losing even more credibility). That’s not just countries we always perceived as “not free” – heck, ~~even in Italy you must register with the government if you want to have a blog~~ (it turns out that law didn’t come to pass – let’s hope no other country has the same bright idea). In Germany, if you read the law by the letter, you can’t blog at all without getting every post approved – you could write
something that a minor might see.

«But permission will be granted anyways», you might say. Are you sure though? What if you are a minor wanting to create an application for your first client? Back in my days, I could just do it. Are you sure that whatever entity is going to have to give permission wan’t to do business with minors? You do know that you can’t have a Gmail account if you are younger than 13 years, do you? So age barriers exist.

What if your project competes with whatever entity has to give permission? Remember the story about the Google Voice app? Once we are out of IP addresses, the big provider and media companies who still have addresses might see you little startup web project as competition in some way. Are you sure you will still get permission?

Back in 1996 when I started my company in High-School, all you needed to earn your living was enthusiasm and a PC (yes – I started doing web programming without having access to the internet)

Now you need signed contracts, signed NDAs, lobbying, developer program memberships, cash – the barriers to entry are infinitely higher at this point.

I’m afraid though, that this is just the beginning. If we don’t stand up now, if we continue to let big companies and governments take away our freedom of expression piece by piece, if we give up more and more of our freedom because of the false promise of security, then, at one point, all of what we had will be lost.

We won’t be able to just start our projects. We won’t be able to create – only to work on other peoples projects. We will lose all that makes our profession interesting.

Let’s not go there.

Please.

Discussion on HackerNews

Lion Server authentication issues

Lately I was having an issue with a Lion Server that refused logins of users stored in OpenDirectory. A quick check of /var/log/opendirectoryd.log revealed an issue with the «Password Server»:

Module: AppleODClient - unable to send command to Password Server - sendmsg() on socket fd 16 failed: Broken pipe (5205)

As this message apparently doesn’t appear on Google yet, there’s my contribution to solving this.

The fix was to kill -9 the kerberos authentication daemon:

sudo killall kpasswdd

which in fact didn’t help (sometimes even sudo isn’t enough), so I had to be more persuasive to get rid of the apparently badly hanging process:

sudo killall -9 kpasswdd

This time the process was really killed and subsequently instantly restarted by launchd.

After that, the problem went away.

serialize() output is binary data!

When you call serialize() in PHP, to serialize a value into something that you store for later use with unserialize(), then be very careful what you are doing with that data.

When you look at the output, you’d be tempted to assume that it’s text data:

php > $a = array('foo' => 'bar');
php > echo serialize($a);
a:1:{s:3:"foo";s:3:"bar";}
php >

and as such, you’d be tempted to treat this as text data (i.e. store it in a TEXT column in your database).

But what looks like text on first glance isn’t text data at all. Assume that my terminal is in ISO-8859-1 encoding:

php > echo serialize(array('foo' => 'bär'));
a:1:{s:3:"foo";s:3:"bär";}

and now assume it’s in UTF-8 encoding:

php > echo serialize(array('foo' => 'bär'));
a:1:{s:3:"foo";s:4:"bär";}

You will notice that the format encodes the strings length together with the string. And because PHP is inherently not unicode capable, it’s not encoding the strings character length, but its byte-length.

unserialize() checks whether the encoded length matches the actual delimited strings length. This means that if you treat the serialized output as text and your databases’s encoding changes along the way, that the retrieved string can’t be unserialized any more.

I just learned that the hard way (even though it’s obvious in hindsight) while migrating PopScan from ISO-8859-1 to UTF-8:

The databases of existing systems now contain a lot of output from serialize() which was run over ISO strings but now that the client-encoding in the database client is set to utf-8, the data will be retrieved as UTF-8 and because the serialize() output was stored in a TEXT column, it happily gets UTF-8 encoded.

If we remove the database from the picture and express the problem in code, this is what’s going on:

unserialize(utf8encode(serialize('data with 8bit chàracters')));

i.e the data gets altered after serializing and the way it gets altered is a way that unserialize can’t deal with the data any more.

So, for everybody else not yet in this dead end:

The output of serialize() is binary data. It looks like textual data, bit it isn’t. Treat it as binary. If you store it somewhere, make sure that the medium you store it to treats the data as binary. No transformation what so ever must ever be made on it.

Of course, that leaves you with a problem later on if you switch character sets and you have to unserialize, but at least you get to unserialize then. I have to go great lengths now to salvage the old data.