Indexed and Associative Arrays in PHP, Gotcha

To follow on from my previous posts Perl Weak Typing Woes and More Perl Weak Typing Woes I thought I’d continue on the topic with a similar gotcha, but this time from some PHP code that I came across recently.

This time, we were selecting multiple entries from a database using PHP PDO and a query with an IN condition filled with positional placeholders. A somewhat contrived but minimal example is below:

<?php
function documentsFromDatabase ($db, $ids) {
    $ids = array_unique($ids);
    print 'Looking for ' . count($ids) . ' documents';
    $sth_ids_in = implode(',', array_fill(0, count($ids), '?')) ?: 'NULL';
    return $db
        ->query("
            SELECT id, name, description
            FROM relation
            WHERE id IN ($sth_ids_in)
        ", $ids)
        ->fetchAll(PDO::FETCH_ASSOC);
}

Superficially this all works okay, passing an array of identifiers to the $ids parameter works as expected and the function returns a list of matching documents. However, in some cases the function was causing an exception to be thrown.

SQLSTATE[HY093]: Invalid parameter number: parameter was not defined

This indicates that something is wrong with how the parameters are being passed to the database. Specifically, the values that we’ve passed don’t match the placeholders in the query.

To understand this, we need to know a few quirks of PHP, array_unique and how PDO treats arrays of bind parameters.

Firstly, arrays in PHP can be of two types:

  • indexed arrays, which have contiguous numerical indexes starting from 0
  • associative arrays, which are indexed by arbitrary values. These might look like hashes or dictionaries in your favourite language.

Now, passing an indexed array into array_unique doesn’t necessarily give you an indexed array back. This is because array_unique preserves the indexes of elements, meaning that if duplicates from the middle of an indexed array are removed the array indices are no longer contiguious.

For example, the output here has indices 0, 1 and 3:

<?php
var_dump(array_unique([1,2,2,3]));

array(3) {
  [0]=>
  int(1)
  [1]=>
  int(2)
  [3]=>
  int(3)
}

Secondly, for supported database drivers, PDO supports either named or positional placeholders in queries. Named placeholders appear in a query as :name, and positional placeholders appear as question marks as in our example above.

To bind values to placeholders when executing a query when using named placeholders in a query PDO expects to be given an associative array with keys matching the names of the placeholders. And conversely, as in our example, when binding values to positional placeholders, an indexed array containing only the values to be bound is expected.

The subtlety here is that when array_unique was removing duplicates from our array of identifiers, the array was being treated by PDO as an associative array. PDO was therefore expecting to find named placeholders in the query, and throwing an exception when they did not exist.

The fact that this bug existed says as much about language design as it does the need for testing and a deep understanding of your chosen programming language.

Fancy reading more? See more blog posts.