php- gnegg

Sensational AG is hiring (once more)

Sensational AG is the company I founded together with a colleague back in 2000. Ever since, we have had a very nice combination of fun, interesting work and a very successful business.

We’re a very small team – just nine programmers, one business guy, a product designer, a frontend designer and two bloody excellent project managers. Personally, I would love to keep the team as small and tightly knit as possible, as that brings huge advantages: little internal politics, a lot of freedom for everybody and mind-blowing productivity.

I’m still amazed to see what we manage to do with our small team time and time again and yet still manage to keep the job fun. It’s not just the stuff we do outside of immediate work, like Cola Double Blind Tests, Drone Flights directly from the roof of our office and much more – it’s also the work itself that we try to make as fun as possible for everybody.

We are looking for a new member to help us with the development of our main product, an eCommerce platform that’s optimized for wholesale customers.

We’re not about presenting a small number of products in the most enticing manner. We’re about helping our end users deal with their big orders (up to 400 line items per week) as efficiently and quickly as possible.

Processes most of Switzerland’s whole-sale food orders

Over the years, the main industry we serve has become wholesale gastronomy. If you have ever bought something in one of the bigger employee restaurants of any larger company, or in a restaurant of any school or university in Switzerland, whatever you consumed has likely been procured and managed over one of our platforms.

For our customers and ourselves, we handle a sizable amount of data (the largest data set is 3.2 TB in size).

I’m always calling our field «medium data»: while it might still fit into memory, it’s definitely too big to deal with in the naïve way. It’s not quite big data yet, but it’s certainly in interesting spheres.

We’re in the comfortable position that the data entrusted to us is growing at the speed at which we’re able to learn how to deal with it, and so is our architecture. What started as a simple PHP-in-front-of-PostgreSQL deal back in 2004 has by now grown into a cluster of about 40 machines: job queue servers, importer servers, application servers, media servers, event forwarding servers. Because we are hosting our infrastructure for our customers, we can afford to go the extra mile to do things that are technically interesting and exciting.

Speaking of infrastructure: we own the full stack of our product, from our web application, its connected micro services, our phone apps and our barcode reading apps to our backend infrastructure (which is kept up to date by Puppet).

While our main application is a beast of 250k lines of PHP code, we still strive to use the best tool for each job. In the last years we have grown our infrastructure with tools we have written in Rust, JavaScript and TypeScript (via Node.js), and of course our mobile apps are written in their native languages, Swift and Java with more and more Kotlin.

We try to stay as current as possible even with our core PHP code. We have upgraded to new PHP versions more or less the day they came out.

As strong believers in Open Source, whenever we come across a bug in our dependencies, we fix it and publish it upstream. Many of our team members have had their patches merged into PHP, Psalm, Rust, Tantivy and others. Giving back is only fair (and of course also helps us with future maintenance).

Over the years we have learned about the value of modern software development: strong typing (even in PHP through Psalm), functional programming, automated tests, automated deployments. We do whatever we can to allow us to push updates to our thousands of users continuously and with confidence, multiple times a day.

Ready to press the button and deploy to 100s of users

If this sounds interesting to you and you want to help us make it possible for our end users to leave their workplace earlier because ordering is so much easier, then ping me at jobs@sensational.ch.

You should be familiar with working on bigger software projects and have an understanding of software maintainability over the years. We hardly ever start fresh, but we constantly strive to keep what we have modern and up to speed with wherever technology goes.

Initially, you will mostly be working on our PHP and JS/TypeScript code base. But if you’re into another language and it will help you solve a problem you’re having, or your skill in a language we’re already working with can help us solve a problem, then you’re more than welcome to help.

If you have UNIX shell experience, that’s a big plus. It’s not required, but without it you will have to learn the ropes a bit.

All our work is tracked in git and we’re extremely into beautiful commit histories and thus heavy users of the full feature-set that git offers. But don’t worry – so far, we’ve helped everybody get up to speed.

And finally: as a mostly male team (we only have two women working on our team of developers), we’d especially love it if more women found their way into our team.

All of us are very aware how difficult it is for minorities to find a comfortable working environment they can add their experiences to and where they can be themselves. All of us strive to provide such an environment.

pdo_pgsql improvements

Last autumn, I was talking about how I would like to see pdo_pgsql for PHP improved.

Over the last few days I had time to seriously start looking into making sure I get my wish. Even though my C is very rusty and I have next to no experience with the PHP/Zend API, I made quite a bit of progress.

First, JSON support

[Screenshot: json]

If you have the json extension enabled in your PHP install (it’s enabled by default), then any column of data type json will be automatically parsed and returned to you as an array.

No need to constantly repeat yourself with json_decode(). This works, of course, with directly selected json columns or with any expression that returns json (like array_to_json or the direct typecast shown in the screenshot).

This is off by default and can be enabled on a per-connection or a per-statement level so as not to break backwards compatibility (I’ll need it off until I get a chance to clean up PopScan, for example).
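For comparison, here is a minimal sketch of the userland boilerplate this would replace. The helper name decodeJsonColumns and the sample row are made up for illustration; the row stands in for what a fetch hands back today, with the json column arriving as a plain string.

```php
<?php
// Hypothetical helper: decode the named JSON columns of an already fetched row.
function decodeJsonColumns(array $row, array $jsonColumns): array
{
    foreach ($jsonColumns as $name) {
        if (isset($row[$name]) && is_string($row[$name])) {
            // Second argument true: JSON objects become associative arrays.
            $row[$name] = json_decode($row[$name], true);
        }
    }
    return $row;
}

// Stand-in for a row fetched as an associative array (no database needed here).
$row = ['id' => 1, 'payload' => '{"tags": ["a", "b"], "active": true}'];
$decoded = decodeJsonColumns($row, ['payload']);
```

The obvious downside is that the caller has to know which columns are json, which is exactly the knowledge the driver already has.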

Next, array support:

[Screenshot: array]

Just like with JSON, this will automatically turn any array expression (of the built-in array types) into an array to use from PHP.

As I’m writing this blog entry, this only works for text[] and it’s always enabled.

Once I have an elegant way to deal with the various types of arrays and convert them into the correct PHP types, I’ll work on making this turnoffable (technical term) too.

I’ll probably combine this and the automatic JSON parsing into just one setting which will include various extended data types both Postgres and PHP know about.

Once I’ve done that, I’ll look into more points on my wishlist (better error reporting with 9.3 and later and a way to quote identifiers come to mind) and then I’ll probably try to write a proper RFC and propose this for inclusion into PHP itself (though don’t get your hopes up – they are a conservative bunch).

If you want to follow along with my work, have a look at my pdo_pgsql-improvements branch on GitHub (it tracks PHP-5.5).

pdo_pgsql needs some love

Today, PostgreSQL 9.3 was released.
September is always the month of PostgreSQL, as every September a new
major release with awesome new features comes out, and every September
I have to fight the urge to run and immediately update the production
systems to the new version of my favorite toy.

As every year, I want to thank the awesome guys (and girls, I hope) who
make PostgreSQL one of my favorite pieces of software overall and
certainly my favorite database system.

That said, there’s another aspect of PostgreSQL that needs some serious
love: While back in the day PHP was known for its robust database
client libraries, over time other language environments have caught up
and long since surpassed what’s possible in PHP.

To be honest, the PostgreSQL client libraries as they are currently
available in PHP are in serious need of some love.

If you want to connect to a PostgreSQL database, you have two options:
Either you use the thin wrapper over libpq, the pgsql extension,
or you go with PDO, at which point you’d use pdo_pgsql.

Both are, unfortunately, quite inadequate and fail to expose most of
the awesomeness that is PostgreSQL to the user:

pgsql

On the positive side, being a small wrapper around libpq, the pgsql
extension knows quite a bit about Postgres’ internals: It has excellent
support for COPY, it knows about a result set’s data types (but doesn’t
use that knowledge, as you’ll see below), it has pg_escape_identifier
to correctly quote identifiers, it supports asynchronous queries and it
supports NOTIFY.

But, while pgsql knows a lot about Postgres’ specifics, to this day,
the pg_fetch_* functions convert all columns into strings. Numeric
types? String. Dates? String. Booleans? Yes. String too (‘t’ or ‘f’,
both trueish values to PHP).
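The boolean case is the easiest one to trip over. A small sketch of the problem and of the kind of conversion you end up writing by hand; pgBool is a hypothetical helper name, and the values stand in for what the pg_fetch_* functions return:

```php
<?php
// Both 't' and 'f' are non-empty strings, so a plain cast yields true for both.
$castOfTrue  = (bool) 't';
$castOfFalse = (bool) 'f';

// Hypothetical helper: map Postgres' textual booleans to real PHP booleans.
function pgBool(?string $value): ?bool
{
    if ($value === null) {
        return null; // SQL NULL stays null
    }
    return $value === 't';
}

$isActive = pgBool('f');
```

Forgetting the helper in a single if statement is all it takes to treat every row as true.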

To this day, while the extension supports prepared statements, their
use is terribly inconvenient, forcing you to name your statements and
to manually free them.

To this day, the pg_fetch_* functions load the whole result set into
an internal buffer, making it impossible to stream results out to the
client using an iterator pattern. Well, of course it’s still possible,
but you waste the memory for that internal buffer unless you manually
play with DECLARE CURSOR and friends.

There is zero support for advanced data types in Postgres and the
library doesn’t help at all with today’s best practices for accessing a
database (prepared statements).

There are other things that make the extension impractical for me (like
the lack of support by New Relic), but they are not the extension’s
fault, so I won’t spend any time explaining them here.

pdo_pgsql

pdo_pgsql gets a lot of stuff right that the pgsql extension doesn’t:
It doesn’t read the whole result set into memory, it knows a bit about
data types, preserving numbers and booleans and, being a PDO driver, it
follows the generic PDO paradigms, giving a unified API with other PDO
modules.

It also has good support for prepared statements (not perfect, but
that’s PDO’s fault).

But it also has some warts:

There’s no way to safely quote an identifier. Yes. That’s a PDO shortcoming, but still. It should be there.
While it knows about numbers and booleans, it doesn’t know about any of the other more advanced data types.
Getting metadata about a query result actually makes it query the database – once per column, even though the information is right there in libpq, available to use (look at the source of PDOStatement::getColumnMeta). This makes it impossible to fix the above issue in userland.
It has zero support for COPY.
If only

Imagine the joy of having a pdo_pgsql that actually cares about
Postgres. Imagine how selecting a JSON column would give you its data
already decoded, ready to use in userland (or at least an option to).

Imagine how selecting dates would at least give you the option of
getting them as a DateTime (there’s loss of precision though –
Postgres’ TIMESTAMP has more precision than DateTime).

Imagine how selecting an array type in Postgres would actually give you
back an array in PHP. The string that you have to deal with now is
notoriously hard to parse. Yes. There now is array_to_json in
Postgres, but that shouldn’t be needed.
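To give an idea of what parsing that string involves, here is a minimal userland sketch. It assumes a one-dimensional array literal as Postgres prints it for text[] (for example {a,"b c",NULL}) and rejects nested arrays and anything else it doesn’t understand. The function name parsePgTextArray is hypothetical.

```php
<?php
// Hypothetical helper: parse a one-dimensional Postgres array literal
// into a PHP array of strings and nulls.
function parsePgTextArray(string $literal): array
{
    $len = strlen($literal);
    if ($len < 2 || $literal[0] !== '{' || $literal[$len - 1] !== '}') {
        throw new InvalidArgumentException('Not an array literal: ' . $literal);
    }
    $result = [];
    $i = 1;
    $end = $len - 1; // index of the closing brace
    while ($i < $end) {
        if ($literal[$i] === '"') {
            // Quoted element: read up to the closing quote, honoring backslash escapes.
            $i++;
            $value = '';
            while ($i < $end && $literal[$i] !== '"') {
                if ($literal[$i] === '\\') {
                    $i++; // drop the backslash, keep the escaped character
                }
                $value .= $literal[$i];
                $i++;
            }
            $i++; // skip the closing quote
            $result[] = $value;
        } else {
            // Unquoted element: runs up to the next comma or closing brace.
            $stop = $i + strcspn($literal, ',}', $i);
            $raw = substr($literal, $i, $stop - $i);
            // Only an unquoted NULL means SQL NULL; "NULL" in quotes is a string.
            $result[] = strcasecmp($raw, 'NULL') === 0 ? null : $raw;
            $i = $stop;
        }
        if ($i < $end) {
            if ($literal[$i] !== ',') {
                throw new InvalidArgumentException('Unsupported or malformed array literal');
            }
            $i++; // move past the separator
        }
    }
    return $result;
}

$tags = parsePgTextArray('{a,"b c",NULL}');
```

Even this limited version needs to care about quoting, escapes and the difference between NULL and "NULL", and it still knows nothing about element types or multiple dimensions.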

Imagine how selecting a HSTORE would give you an associative array.

Imagine using COPY with pdo_pgsql for very quickly moving bulk data.

Imagine the new features of PGResult being exposed to userland,
giving applications the ability to detect what constraint was just
violated (very handy to detect whether it’s safe to retry).

Wouldn’t that be fun? Wouldn’t that save us from having to type so much
boilerplate all day?

Honestly, what I think should happen is somebody should create a
pdo_pgsql2 that breaks backwards compatibility and adds all these
features.

Have getColumnMeta just return the OID instead of querying the
database. Have a quoteIdentifier method (yes. That should be in PDO
itself, but let’s fix it where we can).
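
Until something like that exists, a userland stand-in could look roughly like this. It is only a sketch: quoteIdentifier here is a hypothetical free function, it applies just the SQL rule of wrapping the name in double quotes and doubling any embedded double quote, and it ignores the connection encoding, which a real implementation inside the driver would take into account.

```php
<?php
// Hypothetical userland stand-in for the missing identifier quoting method.
function quoteIdentifier(string $identifier): string
{
    // Empty names and names containing NUL bytes are rejected outright.
    if ($identifier === '' || strpos($identifier, "\0") !== false) {
        throw new InvalidArgumentException('Invalid identifier');
    }
    // Wrap in double quotes and double any embedded double quote.
    return '"' . str_replace('"', '""', $identifier) . '"';
}

$sql = 'SELECT * FROM ' . quoteIdentifier('order') . ' ORDER BY ' . quoteIdentifier('we"ird');
```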

Have fetch() return Arrays or Objects for JSON columns. Have it
return Arrays for arrays and HSTOREs. Have it optionally return
DateTimes instead of strings.

Wouldn’t that be great?

Unfortunately, while I can write some C, I’m not nearly good enough
to produce something that I could live with other people using, so any
progress I can achieve will be slow.

I’m also unsure of whether this would ever have a chance to land in PHP
itself. Internals are very averse to adding new features to stuff that
already “works” and no matter how good the proposal, you need a very
thick skin if you want to ever get something merged, no matter whether
you can actually offer patches or not.

Would people be using an external pdo_pgsql2? Would it have a chance as
a PECL extension? Do other people see a need for this? Is somebody
willing to help me? I really think something needs to be done and I’m
willing to get my hands dirty – I just have my doubts about the quality
of the result I’m capable of producing. But I can certainly try.

And I will.