Google Apps – Provisioning – Two-Legged OAuth

Our company uses Google Apps Premium for email and shared documents. Even so, all our mail, even though it’s created in Google Apps and finally ends up in Google Apps, goes via a central mail relay we are running ourselves (well, I’m running it). We do this for three reasons: more freedom in email aliases, more control over email routing, and a couple of local parts we use to direct mail to some applications.

Google Apps premium allows you to do that and it’s a really cool feature.

One additional thing I’m doing on that central relay is keeping a backup of all mail that comes from Google or goes to Google. The reason: while I trust them not to lose my data, there are stories around of people losing their accounts to Google’s automated anti-spam systems. This is especially bad as there is usually nobody to appeal to.

So I deemed it imperative that we store a backup of every message so we can move away from Google if the need to do so arises.

Of course, that means our relay needs to know which local parts are valid for the Google Apps domain. After all, I don’t want to store mail that would later be bounced by Google. I’d also love to bounce such mail directly instead of relaying it unconditionally, which is another reason why I’d want to know the list of users.

Google provides its Provisioning API to do that, and using the GData Python packages, you can easily access that data. In theory.

Up until very recently, the big problem was that the Provisioning API didn’t support OAuth. That meant that my little script that retrieves the local parts needed the password of an administrator, which really bugged me: either I stored my password in the script or I couldn’t run the script from cron.

With the Google Apps Marketplace, they fixed that somewhat, but it still requires a strange dance:

When you visit the OAuth client configuration (https://www.google.com/a/cpanel/YOURDOMAIN/ManageOauthClients), it lists your domain with the note “This client has access to all APIs.”

This is totally not true though as Google’s definition of “all” apparently doesn’t include “Provisioning” :-)

To make two-legged OAuth work for the provisioning API, you have to explicitly list the feeds. In my case, this was Users and Groups:

Under “Client Name”, add your domain again (“example.com”) and under “One or More API Scopes”, add the two feeds like this: “https://apps-apis.google.com/a/feeds/group/#readonly,https://apps-apis.google.com/a/feeds/user/#readonly”

This will enable two-legged OAuth access to the user and group lists which is what I need in my little script:

import gdata.auth
import gdata.apps.service
import gdata.apps.groups.service

consumer_key = 'YOUR.DOMAIN'
consumer_secret = 'secret'  # check advanced / OAuth in your control panel
sig_method = gdata.auth.OAuthSignatureMethod.HMAC_SHA1

# List the user names (local parts) of all accounts in the domain.
service = gdata.apps.service.AppsService(domain=consumer_key)
service.SetOAuthInputParameters(sig_method, consumer_key, consumer_secret=consumer_secret, two_legged_oauth=True)

res = service.RetrieveAllUsers()
for entry in res.entry:
    print(entry.login.user_name)

# List all groups; each entry is a dictionary of group properties.
service = gdata.apps.groups.service.GroupsService(domain=consumer_key)
service.SetOAuthInputParameters(sig_method, consumer_key, consumer_secret=consumer_secret, two_legged_oauth=True)
res = service.RetrieveAllGroups()
for entry in res:
    print(entry['groupName'])
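
Once the relay has that list, deciding whether to accept a message comes down to a set lookup. The following is only a minimal sketch of that check, not what my relay actually runs: the function names and the sample data are made up for illustration, and it assumes local parts can be compared case-insensitively.

```python
def split_address(address):
    """Split 'local@domain' into (local, domain), both lower-cased."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not a valid address: %r" % address)
    return local.lower(), domain.lower()


def should_accept(address, valid_local_parts, domain):
    """Return True if the relay should accept (and back up) mail for address."""
    try:
        local, addr_domain = split_address(address)
    except ValueError:
        return False
    return addr_domain == domain.lower() and local in valid_local_parts


# Hypothetical sample data; in practice this would be the script's output.
valid_local_parts = {"alice", "bob", "sales"}

print(should_accept("Alice@example.com", valid_local_parts, "example.com"))
print(should_accept("nobody@example.com", valid_local_parts, "example.com"))
```

Anything that fails the check can be bounced right at the relay, so it is neither stored in the backup nor forwarded to Google.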

OAuth signature methods

I’m currently looking into web services and different methods of request authentication. What I’m aiming to end up with is something inherently RESTful, as this will provide me with the best flexibility when designing a frontend to the service. Generally, the arguments of the REST crowd seem to convince me: it works like the human-readable web, it is inherently scalable, it enforces a clean structure of resources, and finally, it is easy to program against due to the “obvious” API.

As different services are going to communicate with each other, sometimes acting as users of their respective platforms, and because I’m not really inclined to pass credentials around (or make the user do one half of the tasks on one site and the other half on another site), I was looking into different methods of authentication and authorization that work in a RESTful environment without passing around user credentials.

The first thing I did was to note the requirements and subsequently, I quickly designed something using public key cryptography which would have worked quite nicely (possibly – I’m no expert in this field – yet).

Then I learned about OAuth which was designed precisely to solve my issues.

Eager, I read through the specification, but I was put off by one single fact: The default method for signing requests, the method that is most widely used, the method that is most widely supported, relies on a shared secret.

Even worse: The shared secret must be known in clear on both the client and the server (using the common terminology here; OAuth speaks of consumers and providers, but I’m (still) more used to the traditional naming).
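
To see why both sides need the secret in the clear, here is a rough sketch of HMAC-SHA1 signing as I read section 9.2 of the specification, using only the Python 3 standard library. The URL, nonce, timestamp and secrets are made-up values, and the helper names are mine, not from any OAuth library.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode a string, leaving only unreserved characters as-is."""
    return quote(value, safe="-._~")


def signature_base_string(method, url, params):
    """Join method, URL and sorted, encoded parameters into the base string."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join("%s=%s" % pair for pair in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])


def hmac_sha1_signature(base_string, consumer_secret, token_secret=""):
    """Sign the base string; the token secret is empty in the two-legged case."""
    key = pct(consumer_secret) + "&" + pct(token_secret)
    mac = hmac.new(key.encode("ascii"), base_string.encode("ascii"), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode("ascii")


def server_accepts(base_string, signature, stored_secret):
    """The server can only check a signature by recomputing it itself."""
    expected = hmac_sha1_signature(base_string, stored_secret)
    return hmac.compare_digest(expected, signature)


# Hypothetical request values, for illustration only.
params = {
    "oauth_consumer_key": "example.com",
    "oauth_nonce": "4572616e48616d6d",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1280000000",
    "oauth_version": "1.0",
}
base = signature_base_string("GET", "https://provider.example/feeds/user", params)

client_signature = hmac_sha1_signature(base, "secret")
print(server_accepts(base, client_signature, "secret"))        # server holds the same secret
print(server_accepts(base, client_signature, "other-secret"))  # server holds a different one
```

The point is in server_accepts: verification is the same computation as signing, so a hash of the secret would not be enough on the server side.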

This is bad on multiple levels:

  • As the secret is stored in two places (client and server), it’s twice as likely to leak as when it’s stored in only one place (the client).
  • If the token is compromised, the attacker can act in the name of the client with no way of detection.
  • Frankly, it’s a responsibility I, as a server designer, would not want to take on. If the secret is on the client and the client screws up and lets it leak, it’s their problem. If the secret is stored on the server and the server screws up, it’s my problem and I have to take responsibility.
    Personally, I’m quite confident that I would not leak secret tokens, but can I be sure? Maybe. Do I even want to think about this? Certainly not if there is another option.
  • If, god forbid, the whole table containing all the shared secrets is compromised, I’m really, utterly screwed as the attacker can use all services, impersonating any user at will.
  • The risk of losing all secrets at once only exists because the server needs to know them. If only the client knows the secret, an attacker has to compromise each client individually. If the server knows the secrets, it suffices to compromise the server to get all clients.
  • As per the point above, the server gets to be a really interesting target for attacks and thus needs to be extra secured and even needs to take measures against all kinds of more-or-less intelligent attacks (usually ending up DoSing the server or worse).

In the end, HMAC-SHA1 is just repeating history. At first, we stored passwords in the clear, then we learned to hash them, then we even salted them, and now we’re exchanging them for tokens stored in the clear.

No.

What I need is something that keeps the secret on the client.

The secret should never ever need to be transmitted to the server. The server should have no knowledge at all of the secret.

Thankfully, OAuth contains a solution for this problem: RSA-SHA1 as defined in section 9.3 of the specification. Unfortunately, it leaves a lot to be desired though. Whereas the rest of the specification is a pleasure to read and very, well, specific, 9.3 contains the following phrase:

It is assumed that the Consumer has provided its RSA public key in a verified way to the Service Provider, in a manner which is beyond the scope of this specification.

Sure. Just specify the (IMHO) useless way using shared secrets and leave out the interesting and IMHO only functional method.

Sure. Transmitting a Public Key is a piece of cake (it’s public after all), but this puts another burden on the writer of the provider documentation and as it’s unspecified, implementors will be forced to amend the existing libraries with custom code to transmit the key.
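
To make the asymmetric idea concrete, here is a toy sketch in pure Python (3.8 or later). It is deliberately not the real thing: as far as I understand it, OAuth’s RSA-SHA1 uses PKCS #1 v1.5 padding, while this is bare textbook RSA over a SHA-1 digest, and the key is built from two well-known Mersenne primes, so it offers no security at all. All names are made up for illustration.

```python
import hashlib

# Toy key pair from two well-known Mersenne primes. Trivially factorable,
# so this is for illustration only and must never be used for real.
P = 2**127 - 1
Q = 2**521 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

PUBLIC_KEY = (N, E)   # the only thing the server ever stores
PRIVATE_KEY = (N, D)  # never leaves the client


def digest_as_int(message):
    """SHA-1 digest of the message bytes, interpreted as an integer."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big")


def sign(message, private_key):
    """Client side: textbook RSA signature over the SHA-1 digest (no padding)."""
    n, d = private_key
    return pow(digest_as_int(message), d, n)


def verify(message, signature, public_key):
    """Server side: checks the signature using the public key alone."""
    n, e = public_key
    return pow(signature, e, n) == digest_as_int(message)


base_string = b"GET&https%3A%2F%2Fprovider.example%2Fresource&oauth_nonce%3Dabc"
signature = sign(base_string, PRIVATE_KEY)

print(verify(base_string, signature, PUBLIC_KEY))
print(verify(b"tampered request", signature, PUBLIC_KEY))
```

Unlike the shared-secret case, verify never touches anything that would let an attacker create new signatures, so a leaked table of public keys is harmless.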

Also, I’m unclear on header size limitations. As the server needs to know which public key was used for the signature (oauth_consumer_key), it must be sent with each request. While a manually generated public token can be small, a public key certainly isn’t. Is there a size limit for HTTP headers? I’ll have to check that.

I could just transmit the key ID (the key is known on the server) or the key fingerprint as the consumer key, but is that following the standard? I didn’t see this documented anywhere, and examples in the wild are very scarce.
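
The fingerprint idea could look roughly like this. It is my own convention, not something the specification defines: the KeyRegistry class, the choice of a SHA-1 hex digest, and the placeholder key bytes are all assumptions made for the sake of the sketch.

```python
import hashlib


def fingerprint(public_key_bytes):
    """Short, fixed-size identifier derived from the public key itself."""
    return hashlib.sha1(public_key_bytes).hexdigest()


class KeyRegistry:
    """Hypothetical server-side store mapping consumer keys to public keys."""

    def __init__(self):
        self._keys = {}

    def register(self, public_key_bytes):
        """Store a key received out of band; return its consumer key."""
        consumer_key = fingerprint(public_key_bytes)
        self._keys[consumer_key] = public_key_bytes
        return consumer_key

    def lookup(self, consumer_key):
        """Return the public key for oauth_consumer_key, or None if unknown."""
        return self._keys.get(consumer_key)


# Placeholder bytes standing in for a real PEM-encoded public key.
pem = b"-----BEGIN PUBLIC KEY-----\n(placeholder)\n-----END PUBLIC KEY-----\n"

registry = KeyRegistry()
consumer_key = registry.register(pem)

print(len(consumer_key))                     # 40 hex characters, whatever the key size
print(registry.lookup(consumer_key) == pem)  # server finds the key to verify with
print(registry.lookup("unknown"))            # unknown consumers yield None
```

That keeps the header small regardless of key length, at the price of a registration step that every provider would have to document on its own.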

Well… as usual, the better solution just requires more work and I can live with that, especially considering that, for now, I’ll be the person to write both server and client. But I can feel the upcoming pain should third-party consumers decide to hook up with that provider.

If you ask me what I would have done in the shoes of the OAuth guys, I would only have specified RSA-SHA1 (and maybe PLAINTEXT) and not even bothered with HMAC-SHA1. And I would have specified a standard way for public key exchange between consumer and provider.

Now that train has left the station, and everyone interested in creating a really secure (and convenient, at least for the provider) solution will be left with more work and non-standardized methods.