Composer autoloading and Drupal 8

Ever wondered what exists inside the vendor/ directory of your Drupal or PHP codebase? Let's dive down the rabbit hole and see.

A little bit of history

Let's digress into a little history lesson to see why things are the way they are in the PHP autoloading world.

When we wanted to use code written in another PHP file, we wrote a bunch of require statements, and soon the code became a ball of spaghetti. Then came autoloading, a concept which simplifies the developer's life by automatically requiring the right file the first time a class is used.

First there was __autoload

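__autoload is a function that PHP calls automatically when code refers to a class that has not been defined yet, for example when a new object is instantiated. It gets a chance to include the file which contains the class, and this works for most simple cases. It does have its drawbacks: an application can only have one __autoload function, so it grows hairy quickly. (It was also deprecated in PHP 7.2 and removed in PHP 8.0.) For example, consider this piece of code; maybe this is how WordPress looked in its first commit :)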

function __autoload($class_name) {
  if(in_array($class_name, array('Comment', 'Post'))) {
    require('./Blog/src/' . $class_name . '.php');
  }
}

$post = new Post();
$comment = new Comment();
$post->addComment($comment);

Every "module" will have its own directory with src containing the different entities. Now, if we want to add menus to our blog,

function __autoload($class_name) {
  if(in_array($class_name, array('Comment', 'Post'))) {
    require('./Blog/src/' . $class_name . '.php');
  }
  if(in_array($class_name, array('Menu'))) {
    require('./Menu/src/' . $class_name . '.php');
  }
}
// ...
$menu = new Menu();

You get the point? Every new module means another branch in the same function. To take at least one step towards solving this problem, spl_autoload_register was introduced.

spl autoloaders

spl_autoload_register is a function provided by PHP to register custom autoloaders. It lets every module keep its own preferred structure and load its code in its own way, using a different handler (a.k.a. function) for each module. Here's an example of spl_autoload_register with an anonymous function.

// we could achieve the same using __autoload
spl_autoload_register(function ($class_name) {
    include $class_name . '.php';
});

$obj  = new MyClass1();
$obj2 = new MyClass2();

Here is the earlier blog example, rewritten using spl_autoload_register.

function blog_autoload($class_name) {
  $blog_file = './Blog/src/' . $class_name . '.php';
  if (file_exists($blog_file)) {
    require($blog_file);
  }
}

function menu_autoload($class_name) {
  $menu_file = './Menu/src/' . $class_name . '.php';
  if (file_exists($menu_file)) {
    require($menu_file);
  }
}
spl_autoload_register('blog_autoload');
spl_autoload_register('menu_autoload');

Why the file_exists check in every autoloader? PHP queues all the autoloaders we register via the spl_autoload_register function. When it encounters an undefined class, it calls each autoloader in registration order until one of them defines the class. Now, if we drop the file_exists check and use the Menu class, blog_autoload runs first, tries to require ./Blog/src/Menu.php, which does not exist, and bails out with a fatal error before menu_autoload ever gets a chance to load ./Menu/src/Menu.php.
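
To watch the queue in action, here is a minimal sketch that is not part of the original example. The two closures stand in for blog_autoload and menu_autoload, the $calls log is purely illustrative, and Menu is defined with eval only so that the snippet runs without any files on disk.

```php
<?php
// Illustrative only: record which autoloader was asked for which class.
$calls = [];

// Stand-in for blog_autoload: it knows nothing about Menu, so it returns quietly.
spl_autoload_register(function ($class_name) use (&$calls) {
    $calls[] = 'blog:' . $class_name;
});

// Stand-in for menu_autoload: defines Menu on demand (eval replaces require here).
spl_autoload_register(function ($class_name) use (&$calls) {
    $calls[] = 'menu:' . $class_name;
    if ($class_name === 'Menu') {
        eval('class Menu {}');
    }
});

$menu = new Menu();

// Both autoloaders were consulted, in registration order.
echo implode(', ', $calls), "\n"; // blog:Menu, menu:Menu
```

The first autoloader is still called for Menu; it just has to decline gracefully, which is exactly what the file_exists check achieves.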

Standards

Birth of PSR-0

spl autoloaders created a new set of problems. Now that every module could define its own directory structure, there was no fixed standard for how modules should be structured. Every time I added a new dependency to my codebase, I had to check whether its directory structure was compatible with my existing autoloaders, or else write and register a new autoloader.

Armed with the concept of namespaces introduced in PHP 5.3, the PHP Standards Group (later renamed to the Framework Interoperability Group, or PHP-FIG) created PSR-0. A PSR (PHP Standard Recommendation) is a set of agreed-upon, recommended best practices for building better PHP applications. PSR-0 deals specifically with standards for autoloading. It states, among other things:

  • Every library/package will have a top level namespace, which is the vendor name, followed by optional sub-namespaces and the class name. Ex: \Acme\Foo\Bar is the class Bar in the Foo sub-namespace by vendor Acme, defined in a file called Bar.php.
  • Each namespace separator maps to a directory separator while autoloading. So, the class \Acme\Foo\Bar lives in src/Acme/Foo/Bar.php (assuming we are putting all packages in src/):

[Figure: PSR-0 directory structure]

  • Every _ in the class name maps to a directory separator. So, \Acme\Foo\Mail_Handler will map to Acme/Foo/Mail/Handler.php. Underscores in the namespace part have no special meaning.

It is not mandatory to follow these standards for your library, as these are just "recommended", but they allow for better interoperability with other libraries and frameworks.
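
The rules above can be sketched as a small path-mapping function. This is an illustration rather than a full PSR-0 loader: psr0_path is a hypothetical helper, the src base directory is the assumption used in the text, and '/' is used instead of DIRECTORY_SEPARATOR so the output is the same on every platform.

```php
<?php
// Hypothetical helper: translate a class name to its PSR-0 file path.
function psr0_path($class, $base_dir = 'src')
{
    $class = ltrim($class, '\\');
    $path = '';

    // Namespace separators become directory separators.
    if (($pos = strrpos($class, '\\')) !== false) {
        $namespace = substr($class, 0, $pos);
        $class = substr($class, $pos + 1);
        $path = str_replace('\\', '/', $namespace) . '/';
    }

    // Underscores are only special in the class name itself.
    return $base_dir . '/' . $path . str_replace('_', '/', $class) . '.php';
}

echo psr0_path('Acme\\Foo\\Bar'), "\n";          // src/Acme/Foo/Bar.php
echo psr0_path('Acme\\Foo\\Mail_Handler'), "\n"; // src/Acme/Foo/Mail/Handler.php
```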

Make way for PSR-4

The trouble with PSR-0 is that we end up with deep, repetitive directory structures while striving to adhere to the recommendation. For instance, \Acme\Foo\Bar has to live in src/Acme/Foo/Bar.php, while \Acme\Foo\BarTest, which contains test cases for Bar, has to live in tests/Acme/Foo/BarTest.php.

[Figure: PSR-0 tests directory structure]

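PSR-4 addresses this issue. It is termed "improved autoloading". Under PSR-4, a namespace prefix is mapped to a base directory, and the prefix itself no longer has to be repeated as folders. So, if the prefix \Acme\Foo is mapped to tests/, \Acme\Foo\BarTest is simply BarTest.php directly inside tests/ instead of tests/Acme/Foo/.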

Also, the 1-to-1 mapping between _ and directory separators is removed in PSR-4; underscores have no special meaning anywhere in the class name. The big picture is that the leading part of a namespace no longer needs a matching directory, which leads to a flatter directory hierarchy. Only the sub-namespaces after the mapped prefix still correspond to sub-directories. In case you are wondering where the PSRs in between are, PSR-1 and PSR-2 deal with coding guidelines, and PSR-3 is about a common interface for logging libraries.
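
The prefix-to-directory idea can be sketched in a few lines. This is a simplified illustration, not Composer's implementation: psr4_path is a hypothetical helper, and the prefix mappings passed to it are assumptions made for the example.

```php
<?php
// Hypothetical helper: resolve a class name against PSR-4 prefix => base directory pairs.
function psr4_path($class, array $prefixes)
{
    $class = ltrim($class, '\\');
    foreach ($prefixes as $prefix => $base_dir) {
        // The prefix must match the start of the fully qualified class name.
        if (strpos($class, $prefix) === 0) {
            // Only the part after the prefix turns into sub-directories.
            $relative = substr($class, strlen($prefix));
            return rtrim($base_dir, '/') . '/' . str_replace('\\', '/', $relative) . '.php';
        }
    }
    return null; // No registered prefix matches.
}

echo psr4_path('Acme\\Foo\\Bar', ['Acme\\Foo\\' => 'src']), "\n";          // src/Bar.php
echo psr4_path('Acme\\Foo\\BarTest', ['Acme\\Foo\\' => 'tests']), "\n";    // tests/BarTest.php
echo psr4_path('Acme\\Foo\\Mail_Handler', ['Acme\\Foo\\' => 'src']), "\n"; // src/Mail_Handler.php
echo psr4_path('Acme\\Foo\\Sub\\Baz', ['Acme\\Foo\\' => 'src']), "\n";     // src/Sub/Baz.php
```

Compare the third line with the PSR-0 behaviour: the underscore in Mail_Handler is left alone.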

Composer's autoload

How does all this tie in with Composer? You just write a simple require __DIR__ . '/vendor/autoload.php'; in your index.php and Composer takes care of all the magic. With many frameworks, this step is already done for you.

If you crack open Drupal 8's autoload.php, it looks like this,

// autoload.php @generated by Composer

require_once __DIR__ . '/composer' . '/autoload_real.php';

return ComposerAutoloaderInitDrupal8::getLoader();

For many PHP apps, this might slightly vary, to look something like,

<?php

// autoload.php @generated by Composer

require_once __DIR__ . '/composer/autoload_real.php';

return ComposerAutoloaderInit6efd4e07b69a68da0c53cdd247c98ed1::getLoader();

The suffix hash 6efd4e07b69a68da0c53cdd247c98ed1 is regenerated every time Composer updates the packages. This is also the reason why we have one more level of redirection from autoload.php to autoload_real.php. Drupal pins the suffix of its autoloader class to Drupal8 by configuring autoloader-suffix in composer.json,

"config": {
    "preferred-install": "dist",
    "autoloader-suffix": "Drupal8"
},

The getLoader() function sets up the different kinds of autoloaders, and their respective loading mechanisms, required by Drupal. There are 4 kinds of autoloaders present.

The file based autoloader

This is the simplest autoloader (not an autoloader really): it unconditionally loads any legacy PHP dependencies which don't fit any other autoloading mechanism, typically files that define functions rather than classes. It looks like this,

$vendorDir = dirname(dirname(__FILE__));
$baseDir = dirname($vendorDir);

return array(
    '0e6d7bf4a5811bfa5cf40c5ccd6fae6a' => $vendorDir . '/symfony/polyfill-mbstring/bootstrap.php',
    'e40631d46120a9c38ea139981f8dab26' => $vendorDir . '/ircmaxell/password-compat/lib/password.php',
    'edc6464955a37aa4d5fbf39d40fb6ee7' => $vendorDir . '/symfony/polyfill-php55/bootstrap.php',
    '3e2471375464aac821502deb0ac64275' => $vendorDir . '/symfony/polyfill-php54/bootstrap.php',
    '32dcc8afd4335739640db7d200c1971d' => $vendorDir . '/symfony/polyfill-apcu/bootstrap.php',
    'a0edc8309cc5e1d60e3047b5df6b7052' => $vendorDir . '/guzzlehttp/psr7/src/functions_include.php',
    'c964ee0ededf28c96ebd9db5099ef910' => $vendorDir . '/guzzlehttp/promises/src/functions_include.php',
    '37a3dc5111fe8f707ab4c132ef1dbc62' => $vendorDir . '/guzzlehttp/guzzle/src/functions_include.php',
    '5255c38a0faeba867671b61dfda6d864' => $vendorDir . '/paragonie/random_compat/lib/random.php',
    'def43f6c87e4f8dfd0c9e1b1bab14fe8' => $vendorDir . '/symfony/polyfill-iconv/bootstrap.php',
);

The hash key is used to prevent repeated includes, as will be evident in a moment. This set of files is loaded by autoload_real.php. Here's the simplified code for the same.

$includeFiles = require __DIR__ . '/autoload_files.php';
foreach ($includeFiles as $fileIdentifier => $file) {
    composerRequireDrupal8($fileIdentifier, $file);
}
// ...

function composerRequireDrupal8($fileIdentifier, $file)
{
    if (empty($GLOBALS['__composer_autoload_files'][$fileIdentifier])) {
        require $file;

        $GLOBALS['__composer_autoload_files'][$fileIdentifier] = true;
    }
}
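
To convince yourself that the identifier guard really prevents a second include, here is a small self-contained sketch. The function name, the global key and the identifier string are all made up for this demo; only the guard logic mirrors the code above.

```php
<?php
// Hypothetical stand-in for composerRequireDrupal8(), using its own global key.
function demo_require_once_by_id($fileIdentifier, $file)
{
    if (empty($GLOBALS['__demo_autoload_files'][$fileIdentifier])) {
        require $file;

        $GLOBALS['__demo_autoload_files'][$fileIdentifier] = true;
    }
}

// A throwaway file that counts how many times it has been executed.
$file = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($file, '<?php $GLOBALS["demo_load_count"] = ($GLOBALS["demo_load_count"] ?? 0) + 1;');

$id = 'demo-file-identifier'; // Any stable, unique string works for this sketch.
demo_require_once_by_id($id, $file);
demo_require_once_by_id($id, $file); // Skipped: the identifier is already recorded.

$loadCount = $GLOBALS['demo_load_count'];
unlink($file);

echo $loadCount, "\n"; // 1
```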

The classmap based autoloader

This is more straightforward, where there is a mapping between the fully qualified class name and file name.

// autoload_classmap.php @generated by Composer

$vendorDir = dirname(dirname(__FILE__));
$baseDir = dirname($vendorDir);

return array(
    'CallbackFilterIterator' => $vendorDir . '/symfony/polyfill-php54/Resources/stubs/CallbackFilterIterator.php',
    'Drupal' => $baseDir . '/core/lib/Drupal.php',
    'Drupal\\Component\\Utility\\Timer' => $baseDir . '/core/lib/Drupal/Component/Utility/Timer.php',
    'Drupal\\Component\\Utility\\Unicode' => $baseDir . '/core/lib/Drupal/Component/Utility/Unicode.php',
    'Drupal\\Core\\Database\\Database' => $baseDir . '/core/lib/Drupal/Core/Database/Database.php',
    'Drupal\\Core\\DrupalKernel' => $baseDir . '/core/lib/Drupal/Core/DrupalKernel.php',
    'Drupal\\Core\\DrupalKernelInterface' => $baseDir . '/core/lib/Drupal/Core/DrupalKernelInterface.php',
    'Drupal\\Core\\Site\\Settings' => $baseDir . '/core/lib/Drupal/Core/Site/Settings.php',
// Truncated

For instance, the Timer class mentioned above, whose fully qualified name is Drupal\Component\Utility\Timer, will get loaded from the path /core/lib/Drupal/Component/Utility/Timer.php.
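
Stripped of everything else, a classmap autoloader is just an array lookup followed by an include. The sketch below is an assumption-laden miniature, not Composer's code: the Acme\Utility\Timer class and its temporary file are invented so that the snippet is runnable on its own.

```php
<?php
// Create a throwaway class file so the sketch is self-contained.
$dir = sys_get_temp_dir() . '/classmap_demo_' . uniqid();
mkdir($dir);
$file = $dir . '/Timer.php';
file_put_contents($file, '<?php namespace Acme\Utility; class Timer {}');

// Hypothetical class map, in the same shape as autoload_classmap.php.
$classMap = [
    'Acme\\Utility\\Timer' => $file,
];

// Look the class up in the map and include the file if there is an entry.
spl_autoload_register(function ($class) use ($classMap) {
    if (isset($classMap[$class])) {
        include $classMap[$class];
    }
});

$timerLoaded   = class_exists('Acme\\Utility\\Timer');   // Triggers the autoloader.
$missingLoaded = class_exists('Acme\\Utility\\Missing'); // Not in the map.

unlink($file);
rmdir($dir);

var_dump($timerLoaded);   // bool(true)
var_dump($missingLoaded); // bool(false)
```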

Namespace based autoloader

This is for loading one or more files under a common namespace prefix, following the PSR-0 layout. For example, here's a line from autoload_namespaces.php.

'Doctrine\\Common\\Annotations\\' => array($vendorDir . '/doctrine/annotations/lib'),

So, all class files under the directory docroot/vendor/doctrine/annotations/lib/Doctrine/Common/Annotations fall under this namespace and will be autoloaded by the namespace autoloader. Note how the whole namespace is repeated as folders below lib/, as PSR-0 requires.

PSR-4 autoloader

Those packages which follow the PSR-4 naming scheme are loaded by the PSR-4 autoloader. Note that all the above autoloaders, apart from the file based one, have their loading mechanism implemented in the ClassLoader.php file. autoload_real.php feeds each map into the loader:

$map = require __DIR__ . '/autoload_namespaces.php';
foreach ($map as $namespace => $path) {
    $loader->set($namespace, $path);
}

$map = require __DIR__ . '/autoload_psr4.php';
foreach ($map as $namespace => $path) {
    $loader->setPsr4($namespace, $path);
}

$classMap = require __DIR__ . '/autoload_classmap.php';
if ($classMap) {
    $loader->addClassMap($classMap);
}

The ClassLoader contains the file lookup logic for these different loaders. The registered mappings are stored in maps inside the loader, findFile resolves a class name to a file using them, and the loadClass function includes that file.

    public function loadClass($class)
    {
        if ($file = $this->findFile($class)) {
            includeFile($file);

            return true;
        }
    }
// ...
function includeFile($file)
{
    include $file;
}

loadClass is registered as an autoloader by a public function called register, which is invoked from autoload_real.php,

$loader->register(true);

which in turn calls our humble spl_autoload_register.

public function register($prepend = false)
{
    spl_autoload_register(array($this, 'loadClass'), true, $prepend);
}
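
The true passed to register() ends up as the third argument of spl_autoload_register, which puts the autoloader at the front of the queue instead of the back. Here is a minimal sketch of that effect; the closures and the $order log are illustrative only.

```php
<?php
$order = [];

// Registered normally: appended to the end of the autoloader queue.
spl_autoload_register(function ($class) use (&$order) {
    $order[] = 'registered first';
});

// Third argument true: put this autoloader at the front of the queue.
spl_autoload_register(function ($class) use (&$order) {
    $order[] = 'prepended';
}, true, true);

// Ask for a class nobody defines so that every autoloader gets called.
class_exists('Acme\\Demo\\DoesNotExist');

echo implode(', ', $order), "\n"; // prepended, registered first
```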

Next time you instantiate an object from a class in Drupal 8, you will know how many cogs turn behind the scenes to load the file which holds that class, and all the magic that Composer does to include that file!