Why I recommend against JWT

JSON Web Tokens are all the rage lately. They are lauded as a stateless alternative to cookie-based server-side sessions and as the perfect way to handle authentication in your single-page app, and some people even sell them as a workaround for the EU cookie policy because, you know, they work without cookies too.

If you ask me though, I would always recommend against the use of JWT to solve your problem.

Let me go through a few of the common arguments in favour of JWT and debunk them, from weakest to strongest:

Debunking arguments

It requires no cookies

Common “best” practice stores the JWT in the browser's local storage and then sends it off to the server with every authenticated API call.

This is no different from a traditional cookie, with the exception that transmission to the server isn't done automatically by the browser (which it would be for a cookie) and that it is significantly less secure than a cookie: as there is no way to set a value in local storage outside of JavaScript, there consequently is no equivalent to cookies' HttpOnly flag. This means that XSS vulnerabilities in your frontend now give an attacker access to the token.

Worse, as people often use JWT for both a short-lived access token and a refresh token, any XSS vulnerability now gives the attacker access to a valid refresh token that can be used to create new access tokens at will, even when your session has expired, completely negating the benefit of having separate refresh and access tokens in the first place.

“But at least I don’t need to display one of those EU cookie warnings” I hear you say. But did you know that the warning is only required for tracking cookies? Cookies that are required for the operation of your site (so a traditional session cookie) don’t require you to put up that warning in the first place.

It’s stateless

This is another often-used argument in favour of JWT: because the server can put all the required state into the token, there's no need to store anything on the server end, so you can load-balance incoming requests to whatever app server you want and you don't need any central store for session state.

In general, that’s true, but it becomes an issue once you need to revoke or refresh tokens.

JWT is often used in conjunction with OAuth where the server issues a relatively short-lived access token and a longer-lived refresh token.

If a client wants to refresh its access token, it uses its refresh token to do so. The server will validate it and then hand out a new access token.

But for security reasons, you don't want that refresh token to be reusable (otherwise a leaked refresh token could be used to gain access to the site for its whole validity period), and you probably also want to invalidate the previously used access token; otherwise, if that has leaked, it could be used until its expiry even though the legitimate client has already refreshed it.

So you need a means to blacklist tokens.

Which means you're back to keeping track of state, because that's the only way to do this. Either you blacklist the whole binary representation of the token, or you put some unique ID in the token and blacklist that (comparing it after decoding the token), but whatever you do, you still need to keep track of that shared state.

And once you’re doing that, you lose all the perceived advantages of statelessness.
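
To make that concrete, here is a minimal sketch of what such a blacklist could look like in PHP; the table layout, the presence of a jti claim and the shared PDO connection are assumptions for illustration:

<?php
// Sketch: blacklisting JWTs by their unique jti claim in a shared database.
// $db is assumed to be a PDO connection available to every app server.

function blacklist_token(PDO $db, array $claims) {
    // remember the token's ID until the token would have expired anyway
    $stmt = $db->prepare('INSERT INTO token_blacklist (jti, expires_at) VALUES (?, ?)');
    $stmt->execute(array($claims['jti'], $claims['exp']));
}

function is_blacklisted(PDO $db, array $claims) {
    // every request now has to consult this shared state - so much for statelessness
    $stmt = $db->prepare('SELECT COUNT(*) FROM token_blacklist WHERE jti = ?');
    $stmt->execute(array($claims['jti']));
    return $stmt->fetchColumn() > 0;
}
?>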

Worse: because the server has to invalidate and blacklist both the access and the refresh token when a refresh happens, a connection failure during a refresh can leave a client without a valid token, forcing users to log in again.

In today's world of mostly mobile clients on mobile phone networks, this happens more often than you'd think. Especially as your access tokens should be relatively short-lived.

It’s better than rolling your own crypto

In general, yes, I agree with that argument. Anything is better than rolling your own crypto. But are you sure your library of choice has implemented the signature check and decryption correctly? Are you keeping up to date with security flaws in your library of choice (or its dependencies)?

You know what is still better than using existing crypto? Using no crypto whatsoever. If all you hand out to the client is a completely random token and all you do is look up the data assigned to that token, then there's no crypto anybody could get wrong.
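
For comparison, here is a minimal sketch of the opaque-token approach in PHP; the table and column names are made up, and random_bytes() requires PHP 7 or a polyfill:

<?php
// Sketch: opaque random session tokens instead of JWT - no crypto to get wrong.
// $db is assumed to be a PDO connection; table and column names are made up.

function create_session(PDO $db, $user_id) {
    // 256 bits from a CSPRNG; the token itself carries no meaning whatsoever
    $token = bin2hex(random_bytes(32));
    $stmt = $db->prepare('INSERT INTO sessions (token, user_id, created_at) VALUES (?, ?, NOW())');
    $stmt->execute(array($token, $user_id));
    return $token;
}

function lookup_session(PDO $db, $token) {
    // authentication is a single indexed lookup; revocation is a DELETE
    $stmt = $db->prepare('SELECT user_id FROM sessions WHERE token = ?');
    $stmt->execute(array($token));
    return $stmt->fetchColumn(); // false if the token is unknown or revoked
}
?>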

A solution in search of a problem

So once all good arguments in favour of JWT have dissolved, you’re left with all their disadvantages:

  • By default, the JWT spec allows for insecure algorithms and key sizes. It's up to you to choose safe parameters for your application.
  • Doing JWT means you're doing crypto and you're decoding potentially hostile data. Are you up to this additional complexity compared to a single primary key lookup?
  • JWTs contain quite a bit of metadata and other bookkeeping information. Transmitting this for every request is more expensive than just transmitting a single ID.
  • It’s brittle: Your application has to make sure to never make a request to the server without the token present. Every AJAX request your frontend makes needs to manually append the token and as the server has to blacklist both access and refresh tokens whenever they are used, you might accidentally end up without a valid token when the connection fails during refresh.

So are they really useless?

Despite all these negative arguments, I think that JWTs are great for one specific purpose, and that's authentication between different services in the backend when those services can't trust each other.

In such a case, you can use very short-lived tokens (with a lifetime measured in seconds at most) and you never have them leave your internal network. All the clients ever see is a traditional session-cookie (in case of a browser-based frontend) or a traditional OAuth access token.

This session cookie or access token is checked by the frontend servers (which, yes, have to have access to some shared state, but this isn't an unsolvable issue) which then issue the required short-lived JWTs to talk to the various backend services.
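
As an illustration, here is a minimal sketch of what issuing such a short-lived backend token could look like in PHP; the firebase/php-jwt library, the claim values and the shared key are just examples:

<?php
// Sketch: a frontend server issuing a very short-lived JWT for an internal backend call.
// Uses the firebase/php-jwt library as an example; key handling is simplified.
use Firebase\JWT\JWT;

$now = time();
$claims = array(
    'iss' => 'frontend-server',       // which service issued the token
    'sub' => $authenticated_user_id,  // on whose behalf we are acting
    'aud' => 'billing-service',       // which backend service may accept it
    'iat' => $now,
    'exp' => $now + 30,               // lifetime measured in seconds
);

// $backend_key is a secret shared between the frontend and backend servers only;
// the resulting token never leaves the internal network.
$token = JWT::encode($claims, $backend_key, 'HS256');
?>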

Or you use them when you have two loosely coupled backend services that trust each other and need to talk to each other. There too, you can issue short-lived tokens (provided you are aware of the security issues described above).

In the case of short-lived tokens that never go to the user, you circumvent most of the issues outlined above: they can be truly stateless because, thanks to their short lifetime, you never need to blacklist them, and they can be stored in a location that's not exposed to possible XSS attacks against your frontend.

This just leaves the issue of the difficult-to-get-right crypto, but as you never accept tokens from untrusted sources, a whole class of possible attacks becomes impossible, so you might even get away with not updating your library on an all-too-regular basis.

So, please, when you are writing your next web API that uses any kind of authentication and you ask yourself “should I use JWT for this”, resist the temptation. Using plain opaque tokens is always better when you talk to an untrusted frontend.

Only when you are working on scaling out your application and splitting it into multiple disconnected microservices and you need a way to pass credentials between them, then by all means go ahead and investigate JWT – it'll surely be better than cobbling something together yourself.

linktrail – a failed startup – introduction

I guess it’s inevitable. Good ideas may fail. And good ideas may be years ahead of their time. And of course, sometimes, people just don’t listen.

But one never stops learning.

In the year 2000, I took part in a couple of guys' plan to become the next Yahoo (Google wasn't quite there yet back then), or, to use the words we used on the site,

For these reasons, we have designed an online environment that offers a truly new way for people to store, manage and share their favourite online resources and enables them to engage in long-lasting relationships of collaboration and trust with other users.

The idea behind the project, called linktrail, was basically what would much later be picked up by the likes of Twitter, Facebook (to some extent) and the various community-based news sites.

The whole thing went down the drain, but the good thing is that I was able to legally salvage the source code, install it on a personal server of mine and publish it. And now that so many years have passed, it's probably time to tell the world about this, which is why I have decided to start this little series about the project. What is it? How was it made? And most importantly: why did it fail? And consequently: what could we have done better?

But let’s first start with the basics.

As I said, I was able to legally acquire the database and code (which was mostly written by me anyways) and to install the site on a server of mine, so let's get that out of the way first. The site is available at linktrail.pilif.ch. What you see running there is the result of six months of programming by myself, based on a concept by the guys I worked with to create this.

What is linktrail?

If the tour we made back then is any good, then just taking it would probably be enough, but let me phrase it in my own words: the site is a collection of so-called trails, which in turn are small units, comparable to blogs, consisting of links, titles and descriptions. These micro-blogs are shown in a popup window (that's what we had back then) beside the browser window to allow quick navigation between the different links in the trail.

Trails are made by users, either by each user on their own or as a collaborative work between multiple users. The owner of a trail can hand out permissions to everybody or to their friends (using a system quite similar to what we currently see on Facebook, for example).

A trail is placed in a directory of trails, which was built around the directory structures we used back then, though by now we would probably do this very differently. Users can subscribe to trails they are interested in. In that case, they will be notified when a trail they are subscribed to is updated, either by the owner or by anybody else with the rights to update the trail.

Every user (called expert in the site’s terms) has their profile page (here’s mine) that lists the trails they created and the ones they are subscribed to.

The idea was for you as a user to find others with similar interests and form a community around those interests to collaborate on trails. An in-site messaging system helped users communicate with each other: aside from sending plain text messages, it's possible to recommend trails (for easy one-click subscription).

linktrail was my first real programming project, started basically six months after graduating from what the US would call high school. Combine that with the fact that it was created during the high times of the browser wars (year 2000, remember), with web standards basically non-existent, and you can imagine what a mess is running behind the scenes.

Still, the site works fine within those constraints.

In future posts, I will talk about the history of the project, about the technology behind the site, about special features and, of course, about why this all failed and what I would do differently – both in matters of code and organization.

If I've piqued your interest, feel free to have a look at the code of the site, which I just now converted from CVS (I started using CVS about 4 months into development, so the first commit is HUGE) to SVN to git and put up on github for public consumption. It's licensed under a BSD license, but I doubt that you'd find anything useful in this mess of PHP3(!) code (though it runs unchanged(!) on PHP5 – topic of another post, I guess), HTML 3.2(!) tag soup and JavaScript hacks.

Oh, and if you can read German, I have also converted the CVS repository that contained the concept papers that were written over time.

In preparation of this series of blog-posts, I have already made some changes to the code base (available at github):

  • login after register now works
  • warning about unencrypted(!) passwords in the registration form
  • registering requires you to solve a reCAPTCHA.

Twisted Tornado

Lately, the net is all busy talking about the new web server released by FriendFeed last week and how their server basically does the same thing as the Twisted framework that has been around for so much longer. One blog entry ends with

Why Facebook/Friendfeed decided to create a new web server is completely beyond us.

Well. Let me add my two cents. Not from a Python perspective (I’m quite the Python newbie, only having completed one bigger project so far), but from a software development perspective. I feel qualified to add the cents because I’ve been there and done that.

When you start any project, you will be on the lookout for a framework or solution to base your work on. Oftentimes, you already have some kind of idea of how you want to proceed and what the different requirements of your solution will be.

Of course, you'll be comparing your requirements against the existing solutions, but chances are that none of them will match your requirements exactly, so you will be faced with changing them to match.

This involves not only the changes themselves but also other considerations:

  • is it even possible to change an existing solution to match your needs?
  • if the existing solution is an open source project, is there a chance of your changes being accepted upstream (this is not a given, by the way)?
  • if not, are you willing to back- and forward-port your changes as new upstream versions get released? Or are you willing to stick with that version for eternity, manually back-porting security fixes?

and most importantly

  • what takes more time: writing a tailor-made solution from scratch or learning how the most closely matching solution ticks to make it do what you want?

There is a very strong perception around that too many features mean bloat and that a simpler solution always trumps a complex one.

Have a look at articles like «Clojure 1, PHP 0», which compares a home-grown, tailor-made solution in one language to a complete framework in another and seems to favour the tailor-made solution because it was more performant and felt much easier to maintain.

The truth is, you can’t have it both ways:

Either you are willing to live with «bloat» and customize an existing solution, adding some features and not using others, or you are unwilling to accept any bloat and you will do a tailor-made solution that may be lacking in features, may reimplement other features of existing solutions, but will contain exactly the features you want. Thus it will not be «bloated».

FriendFeed decided to go the tailor-made route, but unlike the many other projects that go that route every day (take Django's reimplementations of many existing Python technologies like templating and ORM as another example) and keep the result internal, they actually went public.

Not with the intention to bad-mouth Twisted (though it kinda sounded that way due to bad choice of words), but with the intention of telling us: «Hey – here’s the tailor-made implementation which we used to solve our problem – maybe it is or parts of it are useful to you, so go ahead and have a look».

Instead of complaining that reimplementation and a bit of NIH was going on, the community could embrace the offering and try to pick out the parts they find interesting for their own implementation(s).

This kind of reinventing the wheel is a standard process that is going on all the time, both in the Free Software world and in the commercial software world. There's no reason to be concerned or alarmed. Instead we should be thankful for the groups that actually manage to put their code out for us to see – in so many cases, we never get a chance to see it and thus lose a chance at making our solutions better.

JavaScript and Applet interaction

As I said earlier this month: While Java applets are dead for games and animations and whatever else they were used back in the nineties, they still have their use when you have to access the local machine from your web application in some way.

There are other possibilities of course, but they all are limited:

  • Flash loads quickly and is available in most browsers, but you can only access the hardware Adobe has created an API for: files the user has manually selected for upload, webcams and microphones.
  • ActiveX doesn't work across browsers, only in IE.
  • .NET ditto.
  • Silverlight is neither commonly installed on your users' machines, nor does it provide native hardware access.

So what if you need to, say, access a bar code scanner? Or access a specific file on the user's computer – maybe stored in a place that is inconvenient for the user to get to (%LOCALAPPDATA%, for example, is hidden in Explorer)? In these cases, a signed Java applet is the only way to go.

You might tell me that a website has no business accessing that kind of data and generally, I would agree, but what if your requirements are to read data from a bar code scanner without altering the target machine at all and without requiring the user to perform any steps other than plugging in the scanner and clicking a button?

But Java applets have that certain 1996 look to them, so even if you access the data somehow, the applet still feels foreign to your cool Web 2.0 application: It doesn’t quite fit the tight coupling between browser and server that AJAX gets us and even if you use Swing, the GUI will never look as good (and customized) as something you could do in HTML and CSS.

But did you know that Java Applets are fully scriptable?

By default, any JavaScript function on a page can call any public method of any applet on that page. So let's say your applet implements

public String sayHello(String name){
    return "Hello "+name;
}

Then you can use JavaScript to call that method (using jQuery here):

$('#some-div').html(
    $('#id_of_the_applet').get(0).sayHello(
        $('#some-form-field').val())
);

If you do that, you have to remember though that any applet method called this way will run inside the sandbox, regardless of whether the applet is signed or not.

So how do you access the hardware then?

Simple: tell the JRE that you are sure (you are, aren't you?) that it's OK for a script to call a certain method. To do that, you use AccessController.doPrivileged(). So say, for example, you want to check whether some specific file exists on the user's machine. Let's further assume that you have a singleton RuntimeSettings that provides a method to check for the existence of the file and return its name. You could do something like this:

public String getInterfaceDirectory(){
    // wrap the actual call in a PrivilegedAction so that it may be triggered from JavaScript
    return (String) AccessController.doPrivileged(
        new PrivilegedAction() {
            public Object run() {
                return RuntimeSettings.getInstance().getInterfaceDirectory();
            }
        }
    );
}

Now it’s safe to call this method from JavaScript despite the fact that RuntimeSettings.getInterfaceDirectory() directly accesses the underlying system. Whatever is in PrivilegedAction.run() will have full hardware access (provided the applet in question is signed and the user has given permission).

Just keep one thing in mind: your applet is fully scriptable, and if you are not very careful about where that script comes from, your applet may be abused and thus the security of the client browser might be at risk.

Keeping this in mind, try to:

  • Make these elevated methods do one and only one thing.
  • Keep the interface between the page and the applet as simple as possible.
  • In elevated methods, do not call into JavaScript (see below) and certainly do not eval() any code coming from the outside.
  • Make sure that your pages are sufficiently secured against XSS: Don’t allow any user generated content to reach the page unescaped.

The explicit and cumbersome declaration of elevated actions was put in place to make sure that the developer keeps the associated security risk in mind. So be a good developer and do so.

Using this technology, you can even pass around Java objects from the Applet to the page.

Also, if you need your applet to call into the page, you can do that too, of course, but you’ll need a bit of additional work.

  1. You need to import JSObject from netscape.javascript (yes – that’s how it’s called. It works in all browsers though), so to compile the applet, you’ll have to add plugin.jar (or netscape.jar – depending on the version of the JRE) from somewhere below your JRE/JDK installation to the build classpath. On a Mac, you’ll find it below /System/Library/Frameworks/JavaVM.framework/Versions/<your version>/Home/lib.
  2. You need to tell the Java plugin that you want the applet to be able to call into the page. Use the mayscript attribute of the applet tag for that (interestingly, it's just mayscript – without a value, thus making your nice XHTML page invalid the moment you add it – mayscript=”true” or the correct mayscript=”mayscript” don't work consistently on all browsers).
  3. In your applet, call the static JSObject.getWindow() and pass it a reference to your applet to acquire a reference to the current pages window-object.
  4. On that reference you can call eval() or getMember() or just call() to call into the JavaScript on the page.

This tool set allows you to add the applet to the page at a size of one pixel, placed way outside the viewport and with visibility: hidden, while writing the actual GUI code in HTML and CSS, using normal JS/AJAX calls to communicate with the server.

If you need access to specific system components, this (together with JNA and applet-launcher) is the way to go, IMHO as it solves the anachronism that is Java GUIs in applets.

There is still the long launch time of the JRE, but that’s getting better and better with every JRE release.

I was having so much fun last week discovering all that stuff.

Do not change base library behavior

Modern languages like JavaScript or Ruby provide the programmer with an option to “reopen” any class to add additional behavior to it. In the case of Ruby and JavaScript, this is not constrained in any way: you are able to reopen any class – even the ones that come with the language itself – and there are no restrictions on the functionality of your extension methods.

Ruby at least knows of the concept of private methods and fields which you can’t call from your additional methods, but that’s just Ruby. JS knows of no such thing.

This provides awesome freedom to the users of these languages. Agreed. Miss a method on a class? Easy. Just implement that and call it from wherever you want.

This also helps to free you from things like

BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(of)));

which is lots of small (but terribly inconveniently named) classes wrapped into each other to provide the needed functionality. In this example, what the author wanted is to read a file line by line. Why exactly do I need three objects for this? Separation of concerns is nice, but stuff like this makes learning a language needlessly complicated.

In the world of Ruby or JS, you would just extend FileInputStream with whatever functionality you need and then call that, creating code that is much easier to read.

FileInputStream.prototype.readLine = function(){...}
//...
of.readLine();
//...

And yet, if you are a library (as opposed to consumer code), this is a terrible, terrible thing to do!

We have seen previous instances of the kind of problems you will cause: Libraries adding functionality to existing classes create real problems when multiple libraries are doing the same thing and the consuming application is using both libraries.

Let's say, for example, that your library A added a method sum() to the generic Array class. Let's also say that your consumer also uses library B which does the same thing.

What's the big deal, you might ask? It's pretty clear what sum() does, after all.

Is it? It probably is when that array contains something that is summable. But what if there is, say, a string in the array you want to sum up? In your library, the functionality of sum() could be defined as “summing up all the numeric values in the array, assuming 0 for non-numeric values”. In the other library, sum() could be defined as “summing up all the numeric values in the array, throwing an exception if sum() encounters an invalid value”.

If your consumer loads your library A first and later on that other library B, you will be calling B’s Array#sum().

Now due to your definition of sum(), you assume that it’s pretty safe to call sum() with an array that contains mixed values. But because you are now calling B’s sum(), you’ll get an exception you certainly did not expect in the first place!

Loading B after A in the consumer caused A to break because both created the same method conforming to different specs.

Loading A after B would fix the problem in this case, but what, say, if both you and B implement Array#avg, but with reversed semantics this time around?

You see, there is no escape.

Altering classes in the global name space breaks any name spacing facility that may have been available in your language. Even if all your “usual” code lives in your own, unique name space, the moment you alter the global space, you break out of your small island and begin to compete with the rest of the world.

If you are a library, you cannot be sure that you are alone in that competition.

And even if you are a top level application you have to be careful not to break implementations of functions provided by libraries you either use directly or, even worse, indirectly.

If you need a real-life example, the following code in an (outdated) version of scriptaculous’ effects.js broke jQuery, despite the latter being very, very careful to check if it can rely on the base functionality provided:

Array.prototype.call = function() {
 var args = arguments;
 this.each(function(f){ f.apply(this, args) });
}

Interestingly enough, Array#call wasn’t used in the affected version of the library. This was a code artifact that actually did nothing but break a completely independent library (I did not have time to determine the exact nature of the breakage).

Not convinced? After all I was using an outdated version of scriptaculous and I should have updated (which is not an option if you have even more libraries dependent on bugs in exactly that version – unless you update all other components as well and then fix all the then broken unit tests).

Firefox 3.0 was the first browser to add document.getElementsByClassName, a method also implemented by Prototype. Of course, the functionality in Firefox was slightly different from the implementation in Prototype, which now called the built-in version instead of its own, which caused a lot of breakage all over the place.

So, dear library developers, stay in your own namespace, please. You'll make our lives (and your own) so much easier.

This month’s find: jna and applet-launcher

Way way back, I was talking about Java applets and native libraries and the things you need to consider when writing applets that need access to native libraries (mostly for hardware access). And let's be honest – considering how far HTML and JavaScript have come, native hardware access is probably the only thing you still need applets for.

Java is slow and bloated and users generally don’t seem to like it very much, but the moment you need access to specific hardware – or even just to specific files on the users filesystem, Java becomes an interesting option as it is the only technology readily available on multiple platforms and browsers.

Flash only works for hardware Adobe has put an API in (cameras and microphones) and doesn't allow access to arbitrary files. .NET doesn't work across browsers (it works in IE, but the solution at hand should work in other browsers too) and ActiveX is generally horrible, doesn't work across browsers and additionally only works under Windows (.NET works in theory on Unixes and Macs as well).

Which leaves us with Java.

Because applets are scriptable, you get away with hiding the awful user interface that is Swing (or, god forbid, AWT) and writing a nice integrated GUI using web technologies.

But there’s still the issue with native libraries.

First, your applet needs to be signed – no way around that. Then, you need to manually transfer all the native libraries and extension libraries. Also, you’ll need to put them in certain predefined places – some of which require administration privileges to be written into.

And don't get me started on JNI. Contrary to .NET, you can't just call into native libraries. You'll have to write your own glue layer between the native OS and the JRE. That glue layer is platform specific of course, so you'd better have your C compiler ready – and the platforms you intend to run on, of course.

So even if Java is the only way, it still sucks.

Complex deployment, administrative privileges and antiquated glue layers. Is this what you would want to work with?

Fortunately, I've just discovered two real pearls that completely solve these two problems, leaving me with just the hassle that is Java itself – but it's always nice to keep in practice with multiple programming languages, as long as it doesn't involve C (shudder).

The first component I’m going to talk about is JNA (Java Native Access) which is for Java what P/Invoke is for .NET: A way for directly calling into the native API from your Java code. No JNI and thus no custom glue code and C compiler needed. Translating the native calls and structures into what JNA wants still isn’t as convenient as P/Invoke, but it sure as hell beats JNI.

In my case, I needed to find the directory corresponding to CSIDL_LOCAL_APPDATA when running under Windows. While I could have hacked something together, the only really reliable way of getting the correct path is to query the Windows API, for which JNA proved to be the perfect fit.

JNA of course comes with its own glue layer (available in precompiled form for more platforms than I would ever want to support in the first place), so this leads us directly to the second issue: native libraries and applets don't go very well together.

This is where applet-launcher comes into play. Actually, applet-launcher’s functionality is even built into the JRE itself – provided you target JRE 1.6 Update 10 and later, which isn’t realistic in most cases (just today I was handling a case where an applet had to work with JRE 1.3 which was superseded in 2002), so for now, applet-launcher which works with JRE 1.4.2 and later is probably the way to go.

The idea is that you embed the applet-launcher applet instead of the applet you want to embed in the first place. The launcher will download a JNLP file from the server, download and extract external JNI glue libraries and finally load your applet.

When compared with the native 1.6 method, this has the problem that the library which uses the JNI glue has to have some special hooks in place, but it works like a charm and fixes all the issues I’ve previously had with native libraries in applets.

These two components renewed my interest in Java as a glue layer between the web browser, where your application logic resides, and the hardware the user is depending upon. While earlier methods kind of worked but were either hacky or a real pain to implement, this is as clean as it gets and works like a charm.

And next time we’ll learn about scripting Java applets.

OAuth signature methods

I'm currently looking into web services and different methods of request authentication. What I'm aiming to end up with is something inherently RESTful, as that will provide me with the best flexibility when designing a frontend to the service, and generally the arguments of the REST crowd convince me (it works like the human-readable web, is inherently scalable, enforces a clean structure of resources and, finally, is easy to program against due to its “obvious” API).

As the different services are going to communicate with each other, sometimes acting as users of their respective platforms, and because I'm not really inclined to pass credentials around (or make the user do one half of the tasks on one site and the other half on another site), I was looking into methods of authentication and authorization that work in a RESTful environment without passing around user credentials.

The first thing I did was to note the requirements and subsequently, I quickly designed something using public key cryptography which would have worked quite nicely (possibly – I’m no expert in this field – yet).

Then I learned about OAuth which was designed precisely to solve my issues.

Eager, I read through the specification, but I was put off by one single fact: The default method for signing requests, the method that is most widely used, the method that is most widely supported, relies on a shared secret.

Even worse: The shared secret must be known in clear on both the client and the server (using the common terminology here; OAuth speaks of consumers and providers, but I’m (still) more used to the traditional naming).

This is bad on multiple levels:

  • As the secret is stored in two places (client and server), it's twice as likely to leak out as if it were only stored in one place (the client).
  • If the token is compromised, the attacker can act in the name of the client with no way of detection.
  • Frankly, it’s responsibility I, as a server designer, would not want to take on. If the secret is on the client and the client screws up and lets it leak, it’s their problem, if the secret is stored on the server and the server screws up, it’s my problem and I have to take responsibility.
    Personally, I’m quite confident that I would not leak secret tokens, but can I be sure? Maybe. Do I even want to think about this? Certainly not if there is another option.
  • If, god forbid, the whole table containing all the shared secrets is compromised, I’m really, utterly screwed as the attacker can use all services, impersonating any user at will.
  • As the server needs to know all shared secrets, the risk of losing all of them at once exists in the first place. If only the client knows the secret, an attacker has to compromise each client individually; if the server knows the secrets, compromising the server alone yields all of them.
  • As per the point above, the server gets to be a really interesting target for attacks and thus needs to be extra secured and even needs to take measures against all kinds of more-or-less intelligent attacks (usually ending up DoSing the server or worse).

In the end, HMAC-SHA1 is just repeating history. At first, we stored passwords in the clear, then we learned to hash them, then we even salted them and now we're exchanging them for tokens stored in the clear.

No.

What I need is something that keeps the secret on the client.

The secret should never ever need to be transmitted to the server. The server should have no knowledge at all of the secret.

Thankfully, OAuth contains a solution for this problem: RSA-SHA1 as defined in section 9.3 of the specification. Unfortunately, it leaves a lot to be desired. Whereas the rest of the specification is a pleasure to read and very, well, specific, 9.3 contains the following phrase:

It is assumed that the Consumer has provided its RSA public key in a verified way to the Service Provider, in a manner which is beyond the scope of this specification.

Sure. Just specify the (IMHO) useless way using shared secrets and leave out the interesting and IMHO only functional method.

Sure. Transmitting a Public Key is a piece of cake (it’s public after all), but this puts another burden on the writer of the provider documentation and as it’s unspecified, implementors will be forced to amend the existing libraries with custom code to transmit the key.

Also, I'm unclear on header size limitations. As the server needs to know what public key was used for the signature (oauth_consumer_key), it must be sent on each request. While a manually generated public token can be small, a public key certainly isn't. Is there a size limit for HTTP headers? I'll have to check that.

I could just transmit the key ID (the key is known on the server) or the key fingerprint as the consumer key, but is that following the standard? I didn’t see this documented anywhere and examples in the wild are very scarcely implemented.
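
For what it's worth, the signing itself isn't the hard part. Here's a minimal sketch in PHP, assuming the signature base string has already been constructed according to the spec and that the fingerprint of the public key doubles as the consumer key (which, as said, may or may not follow the standard):

<?php
// Sketch: RSA-SHA1 signing of an OAuth request.
// Assumes $base_string was built according to section 9.1 of the spec;
// the key file paths are made up.
$private_key = openssl_pkey_get_private('file:///path/to/consumer-private.pem');

// sign the base string with RSASSA-PKCS1-v1_5 / SHA-1 and base64-encode the result
openssl_sign($base_string, $signature, $private_key, OPENSSL_ALGO_SHA1);
$oauth_signature = base64_encode($signature);

// the consumer key could simply be the fingerprint of the public key
$oauth_consumer_key = sha1(file_get_contents('/path/to/consumer-public.pem'));
?>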

Well… as usual, the better solution just requires more work and I can live with that, especially considering that, for now, I'll be the person writing both server and client, but I feel the upcoming pain should third-party consumers decide to hook up with that provider.

If you ask me what I would have done in the footsteps of the OAuth guys, I would only have specified RSA-SHA1 (and maybe PLAINTEXT) and not even bothered with HMAC-SHA1. And I would have specified a standard way for public key exchange between consumer and provider.

Now the train has left the station and everyone interested in creating a really secure (and convenient – at least for the provider) solution will be left with more work and non-standardized methods.

Simplest possible RPCs in PHP

After spending hours trying to find out why a particular combination of SoapClient in PHP itself and SOAP::Server from PEAR didn't consistently work together (sometimes, arrays passed around lost an arbitrary number of elements), I thought about what would be needed to make RPCs work from a PHP client to a PHP server.

I wanted nothing fancy and I certainly wanted as little overhead as humanly possible.

This is what I came up with for the server:

<?php
header('Content-Type: text/plain');

require_once('a/file/containing/a/class/you/want/to/expose.php');

$method = str_replace('/', '', $_SERVER['PATH_INFO']);

if ($_SERVER['REQUEST_METHOD'] != 'POST'){
   sendResponse(array('state' => 'error', 'cause' => 'unsupported HTTP method'));
}

$s = new MyServerObject();
$params = unserialize(file_get_contents('php://input'));
if ( ($res = call_user_func_array(array($s, $method), $params)) === false)
   sendResponse(array('state' => 'error', 'cause' => 'RPC failed'));
if (is_object($res))
   $res = get_object_vars($res);
sendResponse($res);

function sendResponse($resobj){
    echo serialize($resobj);
    exit;

}

?>

The client, shown below, is a bit more complex, mainly because it contains some HTTP protocol logic. That logic could probably be reduced to 2-3 lines of code if I used the cURL library, but the client in this case does not have the luxury of access to such functionality.

Also, I already had the function lying around (/me winks at domi), so that's what I used (as opposed to file_get_contents with a pre-prepared stream context). This way, we DO have the advantage of learning a bit about how HTTP works and we are totally self-contained.

<?php
class Client{
    function __call($name, $args){
        $req = $this->openHTTPRequest('http://localhost:5436/restapi.php/'.$name, 'POST', array('Content-Type' => 'text/plain'), serialize($args));
        $data = unserialize(stream_get_contents($req['handle']));
        fclose($req['handle']);
        return $data;
    }
    private function openHTTPRequest($url, $method = 'GET', $additional_headers = null, $data = null){
        $parts = parse_url($url);

        $fp = fsockopen($parts['host'], $parts['port'] ? $parts['port'] : 80);
        fprintf($fp, "%s %s HTTP/1.1rn", $method, implode('?', array($parts['path'], $parts['query'])));
        fputs($fp, "Host: ".$parts['host']."rn");
        if ($data){
            fputs($fp, 'Content-Length: '.strlen($data)."rn");
        }
        if (is_array($additional_headers)){
            foreach($additional_headers as $name => $value){
                fprintf($fp, "%s: %srn", $name, $value);
            }
        }
        fputs($fp, "Connection: closernrn");
        if ($data)
            fputs($fp, "$datarn");

        // read away header
        $header = array();
        $response = "";
        while(!feof($fp)) {
            $line = trim(fgets($fp, 1024));
            if (empty($response)){
                $response = $line;
                continue;
            }
            if (empty($line)){
                break;
            }
            list($name, $value) = explode(':', $line, 2);
            $header[strtolower(trim($name))] = trim($value);
        }
        return array('response' => $response, 'header' => $header, 'handle' => $fp);
   }

}

$client = new Client();
$result = $client->someMethod(array('data' => 'even arrays work'));

?>

What you can't pass around this way is objects (at least objects which are not of type stdClass) as both client and server would need to have access to the prototype. Also, this seriously lacks error handling. But it generally works much better than what SOAP ever could accomplish.

Naturally, I give up stuff when compared to SOAP or any «real» RPC solution:

  • This one works only with PHP
  • It has limitations on what data structures can be passed around, though that's alleviated by PHP's incredibly strong array support.
  • It relies heavily on PHP’s loosely typed nature and thus probably isn’t as robust.

Still, protocols like SOAP (or even any protocol with either «simple» or «lightweight» in its name) tend to be so complicated that it's incredibly hard, if not impossible, to create different implementations that still work together correctly in all cases.

In my case, I have to separate two pieces of the same application due to unstable third-party libraries which I would not want to have linked into every PHP instance running on that server. For that, the solution outlined above (plus some error handling code) works better than SOAP on so many levels:

  • it’s easily debuggable. No need for wireshark or comparable tools
  • client and server are written by me, so they are under my full control
  • it works all the time
  • it relies on as little functionality of PHP as possible and the functionality it depends on is widely used and tested, so I can assume that it's reasonably bug-free (aside from my own bugs).
  • it’s a whole lot faster than SOAP, though this does not matter at all in this case.

First mail, then office, now IRC. What’s next?

I know that I may be really late with this, but I recently came across Mibbit, a web based IRC client. This is another instance of the recent rush of applications being transported over to the web platform.

In the early days, there were web-based email services, like Hotmail (or the third CGI script I've ever written – the firewall/proxy in my school only supported traffic on port 80 and I didn't know about tunnels, nor did I have the infrastructure to create a fitting one).

Then came office applications like Google’s offering. And of course games. Many games.

Of course there were web-based chats in the earlier days. But they either required a plugin like Java or Flash, or they worked by constantly reloading the page the chat appeared on. Neither of these solutions provided what I'd call a full IRC client. And many of the better solutions required a plugin to work.

Mibbit is a full IRC client, though. It provides many of the features a not-too-advanced IRC user would want. Sure, scripting is (currently) absent, but everything else is here. In a pleasant interface.

What's interesting is the fact that so many applications can nowadays be perfectly represented on the web. In fact, XHTML/CSS is perfectly suited to presenting a whole lot of data to the user. For IRC, for example, there is a trend among desktop clients to use HTML for their chat rendering as well.

So in the case of IRC clients, both types of applications sooner or later reach the same state: representing chat messages in good-looking HTML while providing a myriad of features to put off everyone but the most interested and tech-savvy user :-)

Still. The trend is an interesting thing to note. As more and more applications hop over to the web, we get more and more independent of infrastructure and OSes. Sometime in the future, maybe we'll have the paradise of just having a browser to access all our data and applications from wherever we are.

No more software installations. No more viruses and spyware. No more software inexplicably ceasing to work. And for the developer: easy deployment of fixes, shorter turnaround times.

Interesting times ahead indeed.

Why is nobody using SSL client certificates?

Did you know that ever since the days of Netscape Navigator 3.0, there is a technology that allows you to

  • securely sign on without using passwords
  • allow for non-annoying two-factor authentication
  • uniquely identify yourself to third-party websites without giving the second party any account information

All of this can be done using SSL client certificates.

You know: Whenever you visit an SSL protected page, what usually happens is that your browser checks the identity of the remote site by checking their certificate. But what also could happen is that the remote site could check your identity using a previously issued certificate.

This is called an SSL client-side certificate.

Sites can make the browser generate a keypair for you. Then they’ll sign your public key using their private key and they’ll be able to securely identify you from then on.

The certificate is stored in the browser itself and your browser will send it to any (SSL protected) site requesting it. The site in turn can then identify you as the owner of the private key associated with the presented certificate (provided the key wasn't generated on a pre-patch Debian installation *sigh*).
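
To give an idea what this looks like on the server side, here is a minimal sketch in PHP, assuming Apache with mod_ssl, SSLVerifyClient enabled and SSLOptions +StdEnvVars set so that the verified certificate data ends up in the request environment; the user lookup function is made up:

<?php
// Sketch: identifying a user by their SSL client certificate behind Apache/mod_ssl.
// Assumes SSLVerifyClient and SSLOptions +StdEnvVars are configured;
// findUserByCertificate() is a made-up application function.
if (!isset($_SERVER['SSL_CLIENT_VERIFY']) || $_SERVER['SSL_CLIENT_VERIFY'] != 'SUCCESS') {
    header('HTTP/1.1 403 Forbidden');
    exit('no valid client certificate presented');
}

// the subject DN and the certificate serial identify the presented certificate
$subject = $_SERVER['SSL_CLIENT_S_DN'];
$serial  = $_SERVER['SSL_CLIENT_M_SERIAL'];

// look the user up by their certificate - no username, no password
$user = findUserByCertificate($subject, $serial);
?>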

The keypair is bound to the machine it was generated on, though it can be exported and re-imported on a different machine.

It solves our introductory three problems like this:

  • by presenting the certificate, the origin server can identify you. No need to enter a user name or a password.
  • By asking for a password (something you know) and checking the SSL certificate (something you have), you get cheap and easy two-factor authentication that's a lot more secure than asking for your mother's maiden name.
  • If the requesting party in a three-site scenario knows your public key and uses that to request information from the requested party, you can revoke access for this key at any time without any of the parties knowing your username and password.

Looks very nice, doesn’t it?

So why isn’t it used more often (read: at all)?

This is why:

[Screenshot: the chain of preferences dialogs you have to navigate to reach the browser's certificate manager]

The screenshot shows what’s needed to actually have a look at the client side certificates installed in your browser, which currently is the only way of accessing them. Let’s say you want to copy a keypair from one machine to another. You’ll have to:

  1. Open the preferences (many people are afraid of even that)
  2. Select Advanced (scary)
  3. Click Encryption (encry… what?)
  4. Click “View Certificates” (what do the other buttons do? oops! Another dialog?)
  5. Select your certificate (which one?) and click “Export” (huh?)

Even generation of the key is done in-browser without feedback by the site requesting the key.

This is like basic authentication (nobody uses this one) vs. forms based authentication (which is what everybody uses): It’s non-themeable, scary, modal and complicated.

What we need for client side certificates to become useful is a way for sites to get more access to the functionality than they currently do: they need information on the key generation process. They should be able to let the user export the key and re-import it (just spawning two file dialogs should suffice – of course the key must not be transmitted to the site in the process). They need a way to list the keys installed in a browser. And they need to be able to add and remove keys (on the user's request).

In the current state, this excellent idea is rendered completely useless by the awful usability and the completely detached nature: This is a browser feature. It’s browser dependent without a way for the sites to control it – to guide users through steps.

For this to work, sites need more control.

Without giving them access to your keys.

Interesting problem. Isn’t it?