Battle log: a deep dive into the Symfony stack in search of optimizations 2/n

This article is the second in a series that explains what we learned and how we discovered new performance improvements. It’s focused on the dev environment.

The first article guides us through multiple new optimizations for the prod environment. You should definitely read it first.

Optimizing the dev environment

On each request, the dev environment has to:

  • always check if caches are stale and if something has changed, rebuild what is necessary to reflect the change in cache (without calling the cache:clear command manually);
  • on some paths, bypass the cache because we are not capable of handling it;
  • collect / log / profile what we can, so the developer can easily see what happened during the request.

What does it mean for us developers?

Debug mode: container compilation

In Symfony, config is mainly written in YAML, XML or annotations (it can also be PHP). For a few years now, we’ve also had autoconfiguration and autowiring, which need to know everything about your application and its dependencies to be able to link the right stuff to the right place.

The doc itself says:

Thanks to Symfony’s compiled container, there is no performance penalty for using autowiring. However, there is a small performance penalty in the dev environment, as the container may be rebuilt more often as you modify classes. If rebuilding your container is slow (possible on very large projects), you may not be able to use autowiring.

But it is quite uncommon to modify everything all the time, so maybe we can do something to reduce this slowness. How does debug mode work?

No kernel
    => compile kernel
Kernel?
    prod: use it
    dev: hmm, is it fresh? if not => compile it and register the resources that need to be watched

So we need to check:

  • what is registered to be watched;
  • how it is watched.

We will do it in reverse: let’s go see how it is watched, it will help us to understand what is really used, how and what we should avoid.

How is it watched?

Doctrine and Symfony have their own ways to ensure proper freshness. They can be redundant.

1/ First, Symfony

The last chapter of the book Symfony 5: The Fast Track is Discovering Symfony Internals. It explains how to use Blackfire to understand Symfony internals. That’s exactly what I did, so I will extend this chapter here. More precisely, let’s talk about what happens in Kernel::handle().

Kernel::handle() calls Kernel::boot() which itself calls Kernel::initializeContainer(), where all the interesting stuff will happen.

// Symfony\Component\HttpKernel\Kernel

public function handle()
{
    $this->boot();
    // stuff that handles the request itself (and so, the future response)
}

public function boot()
{
    $this->initializeBundles();
    $this->initializeContainer();

    foreach ($this->getBundles() as $bundle) {
        $bundle->setContainer($this->container);
        $bundle->boot();
    }
}

protected function initializeContainer()
{
    $cache = new ConfigCache();

    if (prod and built)
        // => use it and return

    if (debug mode and $cache->isFresh())
        // => use it and return

    else
        // lots of stuff

    $container = $this->buildContainer();
    $container->compile();
}

What interests us here is the $cache->isFresh(). Under the hood, it’s ResourceCheckerConfigCache in action.

// Symfony\Component\Config\ResourceCheckerConfigCache

function isFresh()
{
    $meta = stuff(); // get all metadata (see next section)

    foreach ($meta as $resource) {
        /* @var ResourceInterface $resource */
        foreach ($this->resourceCheckers as $checker) {
            if (!$checker->supports($resource)) {
                continue; // next checker
            }
            if ($checker->isFresh($resource, $time)) {
                break; // no need to further check this resource
            }

            return false; // cache is stale
        }
        // no suitable checker found, ignore this resource
    }

    return true;
}

The $this->resourceCheckers list is not interesting (it seems there is only one checker, and not an interesting one: SelfCheckingResourceChecker). The $resource, however, is very interesting: we are finally in the right place, where everything happens.

What are all the ResourceInterface implementations provided by Symfony?

  • ClassExistenceResource: freshness is only evaluated against resource existence => a class that ceases to exist generates container compilation
  • ComposerResource: tracks the PHP version and Composer dependencies => at least one composer dependency installation or upgrade generates container compilation
  • DirectoryResource: freshness is evaluated against the filemtime (date time of the last modification) of all the files under this directory (recursively) => at least one file updated in the directory generates container compilation
  • FileExistenceResource: freshness is only evaluated against resource creation or deletion => a file that ceases to exist generates container compilation
  • FileResource: freshness is evaluated against the filemtime => as soon as the file is updated (no matter what changed), it forces a new compilation
  • GlobResource: only existence/removal is tracked (not modification times).
  • ReflectionClassResource: freshness is evaluated against the exposed signature of the class: constants, interfaces, traits, public and protected properties and methods.
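To make the difference between these strategies concrete, here is a minimal sketch in plain PHP 8 of the two simplest ones: a modification-time check and an existence check. MtimeResource, ExistenceResource and isCacheFresh() are hypothetical names made up for this illustration; they are not the Symfony classes, and the real resources track more state than this.

```php
<?php

// Hypothetical stand-ins for illustration; these are NOT the Symfony classes.
interface SelfCheckingResource
{
    public function isFresh(int $cacheBuiltAt): bool;
}

// FileResource idea: stale as soon as the file is newer than the cache.
final class MtimeResource implements SelfCheckingResource
{
    public function __construct(private string $path) {}

    public function isFresh(int $cacheBuiltAt): bool
    {
        clearstatcache(true, $this->path);
        $mtime = @filemtime($this->path);

        return false !== $mtime && $mtime <= $cacheBuiltAt;
    }
}

// FileExistenceResource idea: only creation or deletion matters.
final class ExistenceResource implements SelfCheckingResource
{
    private bool $existed;

    public function __construct(private string $path)
    {
        $this->existed = file_exists($path);
    }

    public function isFresh(int $cacheBuiltAt): bool
    {
        clearstatcache(true, $this->path);

        return file_exists($this->path) === $this->existed;
    }
}

// The cache is fresh only if every registered resource says so.
function isCacheFresh(int $cacheBuiltAt, array $resources): bool
{
    foreach ($resources as $resource) {
        if (!$resource->isFresh($cacheBuiltAt)) {
            return false;
        }
    }

    return true;
}
```

The point to notice is the cost model: each registered resource adds at least one filesystem call to every single request in dev, which is why the number and the kind of resources matter so much.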

Just the basic analysis of these classes leads me to these discoveries:

  • GlobResource has two paths: one very optimized and one not at all, which uses symfony/finder (so a tank on a critical path). The slow path is meant to be used only for paths containing /**. We will see later what it means for us developers.
  • This is what I saw in my Blackfire profile: just checking whether a file had been created or removed (the job of GlobResource) was responsible for 47% of the whole page. WHAT?

symfony/finder takes 40%...

For that kind of thing, there is no time to wait: I went straight to the Symfony Slack to check with Nicolas Grekas what was happening. Together we understood that there was a bug on Alpine-based PHP setups, because the condition relied on the existence of the GLOB_BRACE constant, which is not defined on Alpine. This forced the slow path for every GlobResource, which was disastrous performance-wise.

Nicolas fixed it in 4.4 and everyone is happy.
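To illustrate the kind of condition involved (this is a sketch, not the actual patch): GLOB_BRACE is a PHP constant that may be undefined on some platforms, so code relying on it has to check with defined() first. The globByExtensions() helper below is a hypothetical example that stays on the cheap glob() path either way, by issuing one simple pattern per extension when brace expansion is unavailable.

```php
<?php

// Hypothetical helper: list files with the given extensions in one directory
// (non-recursive), with or without GLOB_BRACE support.
function globByExtensions(string $dir, array $extensions): array
{
    if (defined('GLOB_BRACE')) {
        // One call, e.g. "/path/*.{yaml,xml}".
        $files = glob($dir.'/*.{'.implode(',', $extensions).'}', GLOB_BRACE) ?: [];
    } else {
        // No brace support: one cheap glob() per extension instead of
        // falling back to a full directory traversal.
        $files = [];
        foreach ($extensions as $extension) {
            $files = array_merge($files, glob($dir.'/*.'.$extension) ?: []);
        }
    }

    sort($files);

    return $files;
}
```

Both branches return the same list, so the platform only changes how many glob() calls are made, not whether a heavy fallback is used.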

symfony/finder gone 2!...

This fix alone greatly improves the Docker experience for our developers on macOS.


2/ Doctrine Annotations

The doctrine/annotations project is a hard dependency for everyone using annotations in classes. It’s widely used by Symfony and API Platform, not just for Doctrine ORM / ODM. Doctrine has a different way of doing checks. There is no global state, as it’s only used as a library. The CacheReader has a cache, but it calls filemtime for each file and all its related files, again and again, for each class, property and method.

foreach $entityClasses as $class

    getClassAnnotation => x90
        fetchFromCache
            isCacheFresh
                filemtime for the $class, its parents and traits (easily 5 or 6 files)
    getMethodAnnotations
        foreach methods => x803
            fetchFromCache
                isCacheFresh
                    filemtime for the $class, its parents and traits (easily 5 or 6 files)
    getPropertyAnnotations
        foreach properties => x1076
            fetchFromCache
                isCacheFresh
                    filemtime for the $class, its parents and traits (easily 5 or 6 files)

This generates way too many calls to filemtime (10951 filemtimes for only 96 files) so I opened a pull-request.
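The general idea behind this kind of fix can be sketched in a few lines: remember the modification time of each file for the lifetime of the process instead of asking the filesystem again. The MtimeMemo class and its statCalls counter are hypothetical, written for this illustration; this is not the code of the pull request.

```php
<?php

// Hypothetical memoizer: at most one filemtime() call per file and per process.
final class MtimeMemo
{
    /** @var array<string, int> */
    private array $mtimes = [];

    // Counts real filesystem hits, only to make the saving visible.
    public int $statCalls = 0;

    public function get(string $file): int
    {
        if (!array_key_exists($file, $this->mtimes)) {
            ++$this->statCalls;
            $this->mtimes[$file] = (int) @filemtime($file);
        }

        return $this->mtimes[$file];
    }
}
```

The trade-off is visible in the code: once a file has been checked, a later modification is not noticed until the process ends. That is harmless for a classic request, but it matters for long-running processes.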

And so:

It was not easy to get it merged, because there is a slight possibility that someone, somehow, relies on this “feature” in a long-running process where the annotation cache itself has no way of being cached but new annotations in a file are expected to be picked up. I mainly see it as a bug, because it could cause side effects in so many cases. Eventually, the maintainers were OK with the change and merged it; it will be released in the next minor, 1.9.

What is registered for a freshness check?

1/ Kernel stuff

Now that we know all the different ways to do freshness checks, let’s see how Symfony, bundles and our code use them.

Let’s continue where we left off: after $cache->isFresh(), $this->buildContainer() is called:

// Symfony\Component\HttpKernel\Kernel

protected function buildContainer()
{
    $container = $this->getContainerBuilder();
    $container->addObjectResource($this);
    $this->prepareContainer($container);
    // more internal stuff

    return $container;
}

The resource tracking begins with ContainerBuilder::addObjectResource($this). In short:

// Symfony\Component\DependencyInjection\ContainerBuilder

public function addObjectResource($object)
{
    // ok, let's get all my related: traits, parents, interfaces, all recursively
    // for all of them, call $this->fileExists($file);
}

Which brings us to this new method:

// Symfony\Component\DependencyInjection\ContainerBuilder

/**
 * Checks whether the requested file or directory exists and registers the result for resource tracking.
 *
 * @param string      $path          The file or directory path for which to check the existence
 * @param bool|string $trackContents Whether to track contents of the given resource. If a string is passed,
 *                                   it will be used as pattern for tracking contents of the requested directory
 */
public function fileExists(string $path, $trackContents = true): bool
{
    $exists = file_exists($path);

    if (!$this->trackResources || $this->inVendors($path)) {
        return $exists;
    }

    if (!$exists) {
        $this->addResource(new FileExistenceResource($path));

        return $exists;
    }

    if (is_dir($path)) {
        if ($trackContents) {
            $this->addResource(new DirectoryResource($path, \is_string($trackContents) ? $trackContents : null));
        } else {
            $this->addResource(new GlobResource($path, '/*', false));
        }
    } elseif ($trackContents) {
        $this->addResource(new FileResource($path));
    }

    return $exists;
}

This is the place to be. We find here almost all the resource trackers and how they are called. The $this->inVendors will add the ComposerResource we talked about before.

We can see that when fileExists() is called from buildContainer(), the second parameter is not set to false, so the content is tracked. Under the hood, it uses DirectoryResource and FileResource, which means that if ANYTHING changes in the given file or directory (even a CS fix or a PHPDoc), the container will be rebuilt. The PHP classes that are tracked like that are the ones that are used to build the container itself. Touching them means modifying how the container will be built, and as we can’t know for sure what the changes mean, we rebuild it as a precaution.

This means that working at the Kernel or CompilerPass level is annoying, because you will rebuild everything on each call. But it is for a good reason: every computation made in a compiler pass is made only once for all subsequent requests. For example, if you have some configuration for a service that is written in YAML, it’s much better to parse the configuration in a compiler pass and inject the resulting array into the service, rather than inject the file and parse it at runtime.
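As a rough illustration of “parse once at build time, read an array at runtime”, here is a sketch in plain PHP, without the DependencyInjection component. compileConfig() and loadCompiledConfig() are hypothetical helpers, and JSON stands in for YAML only so that the example needs no extra library.

```php
<?php

// Build time (think: compiler pass). Parse the source file once and dump
// the result as a plain PHP array.
function compileConfig(string $sourceFile, string $compiledFile): void
{
    $parsed = json_decode(file_get_contents($sourceFile), true, 512, JSON_THROW_ON_ERROR);

    file_put_contents($compiledFile, '<?php return '.var_export($parsed, true).';');
}

// Runtime: no parsing at all, just load the array.
function loadCompiledConfig(string $compiledFile): array
{
    return require $compiledFile;
}
```

The parsing cost is paid once when the compiled file is written. Every later request only pays for loading a PHP file, which is far cheaper than parsing a config format again.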

The next function call is Kernel::prepareContainer(), which calls the Extension class of every bundle and registers each bundle class as content-tracked. We build all the bundles, then the Kernel itself. The last step is $kernel->build(), which is an abstract method that lets you hook into this from your app kernel. For us, it’s handled by MicroKernelTrait.

// build() is in MicroKernelTrait and calls our Kernel::configureContainer()

protected function configureContainer(ContainerBuilder $container, LoaderInterface $loader)
{
    $container->addResource(new FileResource($this->getProjectDir().'/config/bundles.php'));
    $confDir = $this->getProjectDir().'/config';
    $loader->load($confDir.'/{packages}/*'.self::CONFIG_EXTS, 'glob');
    $loader->load($confDir.'/{packages}/'.$this->environment.'/**/*'.self::CONFIG_EXTS, 'glob');
    $loader->load($confDir.'/{services}'.self::CONFIG_EXTS, 'glob');
    $loader->load($confDir.'/{services}_'.$this->environment.self::CONFIG_EXTS, 'glob');
}

You can see here the use of /**. This is the slow path that costs so much. Previously it was triggered by a bug, but here we use it deliberately, to allow developers to reorganize their package config files into subdirectories. I personally never rearrange anything in these directories, as they are generated by Flex and I’m totally fine with that. It was only after finding that optimization on my side that I discovered that, since 2019-11-19, the default for new Symfony projects has been to replace /**/* with /* (see the changes).

You should ABSOLUTELY check your Kernel and remove all the /** if you are not using nested paths in your config/packages directory, or consider flattening it if you are. This is a ridiculous performance penalty to endure for very little gain (and I’m not talking about the env directories config/packages/test, config/packages/dev and config/packages/prod, they are fine).

The extensions supported for your config are defined by const CONFIG_EXTS = '.{php,xml,yaml,yml}'; in your Kernel. You should remove the ones that are not useful to you. I, for example, do not use XML or PHP files, so I changed it to const CONFIG_EXTS = '.yaml';.

The fixed version looks like this, taken from the symfony/recipes repo. This is the Kernel you will get if you create an app now:

// https://github.com/symfony/recipes/blob/990f591216ef8ab5d44d7c09acd7ea393159ef76/symfony/framework-bundle/4.2/src/Kernel.php#L33
// build() is in MicroKernelTrait and calls our Kernel::configureContainer()
const CONFIG_EXTS = '.yaml';

protected function configureContainer(ContainerBuilder $container, LoaderInterface $loader)
{
    $container->addResource(new FileResource($this->getProjectDir().'/config/bundles.php'));
    $confDir = $this->getProjectDir().'/config';
    $loader->load($confDir.'/{packages}/*'.self::CONFIG_EXTS, 'glob');
    $loader->load($confDir.'/{packages}/'.$this->environment.'/*'.self::CONFIG_EXTS, 'glob');
    $loader->load($confDir.'/{services}'.self::CONFIG_EXTS, 'glob');
    $loader->load($confDir.'/{services}_'.$this->environment.self::CONFIG_EXTS, 'glob');
}

Please note that for the 5.1 release, this code has changed.


2/ Configuration

For anyone used to Symfony, the name FrameworkBundle should ring a bell. It is what defines Symfony not as a set of relatively independent components, but as a framework. This is what provides all the opinionated stuff. The FrameworkExtension is a beast of a class: no less than 120 use statements, 2063 lines and 41 methods. This is the glue that makes everything work as designed. You will not be surprised that it makes 17 calls to fileExists. They are in charge of loading translations, validation and serialization mapping directories and files, following the paths given in the build phase of the container.

Other extensions add their own resources (think config/api_resources for API Platform).


3/ Code

Here comes almost the last useful stuff for this chapter: our code! All the previous resources were meant for config or Kernel / compiler stuff, but nothing for our code. It’s handled by the Kernel’s call to $container->compile().

This is becoming a very, very long article, so I will spare you the code, but it goes through all the classes registered as services, manually or via any of the methods available in Symfony (hello autoconfigure), and registers each of them as a ReflectionClassResource. And so, if you change a public attribute of one of your classes, the container will be rebuilt.
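To get a feel for what “exposed signature” means, here is a simplified sketch based on PHP’s Reflection API (PHP 8 assumed). signatureHash() and the three Report classes are hypothetical, and the real ReflectionClassResource is more thorough than this. The sketch deliberately leaves the class name out of the hash so that two classes can be compared.

```php
<?php

// Hypothetical helper: hash only what a class exposes to the outside world.
function signatureHash(string $class): string
{
    $reflection = new ReflectionClass($class);
    $parts = [
        'constants:'.json_encode($reflection->getConstants()),
        'interfaces:'.implode(',', $reflection->getInterfaceNames()),
        'traits:'.implode(',', $reflection->getTraitNames()),
    ];

    $visibleProperties = ReflectionProperty::IS_PUBLIC | ReflectionProperty::IS_PROTECTED;
    foreach ($reflection->getProperties($visibleProperties) as $property) {
        $parts[] = 'property:'.$property->getName();
    }

    $visibleMethods = ReflectionMethod::IS_PUBLIC | ReflectionMethod::IS_PROTECTED;
    foreach ($reflection->getMethods($visibleMethods) as $method) {
        $parameters = array_map(
            fn (ReflectionParameter $p) => (string) $p->getType().' $'.$p->getName(),
            $method->getParameters()
        );
        $parts[] = 'method:'.$method->getName().'('.implode(', ', $parameters).'): '.(string) $method->getReturnType();
    }

    return md5(implode("\n", $parts));
}

// Same exposed signature, different private state and method body.
class ReportV1
{
    private int $calls = 0;

    public function render(string $title): string
    {
        return $title;
    }
}

class ReportV2
{
    private array $cache = [];

    public function render(string $title): string
    {
        return strtoupper($title);
    }
}

// One more public method: the exposed signature changes.
class ReportV3
{
    public function render(string $title): string
    {
        return $title;
    }

    public function export(): string
    {
        return '';
    }
}
```

With this sketch, ReportV1 and ReportV2 hash to the same value while ReportV3 does not. That is the behaviour we want from the watcher: editing a method body or a private property should not cost a container rebuild, while adding a public method should.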

You can tell Symfony to add / ignore some resources at once.


4/ Routing

The routing config is injected by our Kernel:

// https://github.com/symfony/recipes/blob/990f591216ef8ab5d44d7c09acd7ea393159ef76/symfony/framework-bundle/4.2/src/Kernel.php#L46
// loadRoutes() is in MicroKernelTrait and calls configureRoutes()

protected function configureRoutes(RouteCollectionBuilder $routes)
{
    $confDir = $this->getProjectDir().'/config';
    $routes->import($confDir.'/{routes}/'.$this->environment.'/*'.self::CONFIG_EXTS, '/', 'glob');
    $routes->import($confDir.'/{routes}/*'.self::CONFIG_EXTS, '/', 'glob');
    $routes->import($confDir.'/{routes}'.self::CONFIG_EXTS, '/', 'glob');
}

Same as config, you should absolutely avoid the /**.

Then UrlGenerator and UrlMatcher each make fileExists calls to register these resources.


5/ Recap of all resources checked

Configuration:

Resource type            Resource(s)
ComposerResource         ComposerResource
FileResource             src/Kernel.php
FileResource             vendor/symfony/http-kernel/Kernel.php
FileResource             ALL OUR compiler passes
FileResource             ALL OUR YAML files in config/packages/* (23)
FileResource             ALL OUR YAML files in config/packages/dev/* (5)
FileResource             config/services.yaml
FileResource             ALL OUR YAML and XML files linked in services.yaml
ReflectionClassResource  ALL OUR PHP CLASSES (~900)
GlobResource             config/packages/*.yaml
GlobResource             config/packages/dev/*.yaml
GlobResource             src/*
GlobResource             src/Controller/*
DirectoryResource        config/validator/
DirectoryResource        config/serialization/
DirectoryResource        translations
DirectoryResource        config/api_resources
ClassExistenceResource   ALL bundle configuration classes
ClassExistenceResource   ALL compiler passes
FileExistenceResource    config/validator
FileExistenceResource    config/serializer
FileExistenceResource    ALL COMBINATION OF translations folders (26)
FileExistenceResource    ALL COMBINATION OF views and templates folders (76)

Routing (one for UrlGenerator and one for UrlMatcher):

Resource type                            Resource(s)
FileResource                             config/routes.yaml
FileResource                             ALL OUR CONTROLLERS
FileResource                             config/routes/api_platform.yaml
FileResource                             ALL routing bundle’s conf files (10)
FileResource                             src/Kernel.php
FileResource                             vendor/symfony/http-kernel/Kernel.php
DirectoryResource                        src/Controller
DirectoryResource                        config/api_platform/
GlobResource                             config/routes/*.yaml
GlobResource                             config/routes/dev/*.yaml
Symfony\[…]\ContainerParametersResource  container_parameters_XXXX

6/ New optimizations

So… we now have seen how this stack works. What can we see?

Avoid parsing YAML / XML at runtime

You can see that there is a DirectoryResource on the serialization and validator config, so if something changes there, the container is rebuilt. Despite that, the YAML/XML files inside these folders were parsed AT RUNTIME EVERY TIME. I made a PR so that they are only parsed once during the life of the container. The trick is to reuse a magnificent proposal from Teoh Han Hui in API Platform: use a warmer (called at container boot) to actually clear this cache. It works well in API Platform, so I just ported it to Symfony to cache this YAML parsing. It will also benefit API Platform (and a lot of other projects).

It was merged into master, and will be released as 5.1 (too bad, not for 4.4.*).

No more YAML parsing!...

Avoid parsing annotation if you don’t use them

There are even some bugs / misconfigurations: I do not use annotations for serialization and validator configuration. The enable_annotations option for the serializer and the validator is documented as false by default (https://symfony.com/doc/current/reference/configuration/framework.html#reference-serializer-enable-annotations). In reality, that’s not what happens: it is only the case if you use symfony/symfony (see here and here).

I do not use symfony/symfony, and I saw that I had an annotationLoader for stuff I didn’t want, and its cost is not negligible for performance. You can really disable them with:

# config/packages/framework.yaml
framework:
    # we do not put serialization config in entity annotations (we use YAML), so avoid parsing them
    serializer:
        enable_annotations: false
    # we do not put validation config in entity annotations (we use YAML), so avoid parsing them
    validation:
        enable_annotations: false

I also opened a discussion about that on the symfony-docs repository.

Avoid checking for API Platform config in entities

In API Platform, you can write all the configuration as annotations, YAML or XML. To provide the best DX, there are watchers that trigger a container rebuild if you change any annotation.

Check that you do not have the config below, because it will check this directory again and again (along with a lot of other watchers) for nothing.

# config/packages/api_platform.yaml
api_platform:
#    resource_class_directories:
#        - '%kernel.project_dir%/src/Entity'

To check

  • The DirectoryResource on src/Controller is redundant with all the FileResource entries on controllers.
  • Avoid checking for templates: while writing this article, I noticed that there are 76 FileExistenceResource entries (one for each combination of possible template folders) for templating, even though the project is an API and does not need any templating. This seems odd, and I will check that. It may have been a config mistake on my side.

Before ending this chapter, I want to give a special mention to all the safeguards in place to handle locks on container compilation: they prevent concurrent requests from building the container at the same time. Without them, and before all these new optimizations, it would have been impossible to work.
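The principle of such a safeguard can be sketched with a plain file lock. This is a minimal illustration, not Symfony’s implementation; buildOnce() and the “.lock” file naming are assumptions made for the example.

```php
<?php

// Hypothetical helper: only one process builds the cache, the others wait
// on the lock and then reuse the result.
function buildOnce(string $cacheFile, callable $build): string
{
    $lock = fopen($cacheFile.'.lock', 'c'); // 'c': create if missing, never truncate
    flock($lock, LOCK_EX);                  // blocks while another process is building

    try {
        clearstatcache(true, $cacheFile);
        if (!is_file($cacheFile)) {
            // We are the first one here: do the expensive work.
            file_put_contents($cacheFile, $build());
        }

        return file_get_contents($cacheFile);
    } finally {
        flock($lock, LOCK_UN);
        fclose($lock);
    }
}
```

A page that fires ten AJAX calls right after a code change would otherwise trigger ten container compilations in parallel. With a lock, nine of them simply wait for the first one to finish.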

Debug mode: collector

The profiler works in two phases: first it collects everything during the request and stores it, then it displays it alongside the result.

The collection itself can cost a lot.

One of these costs was tricky to explain, even to the maintainer. With Doctrine, to create a database schema, you:

  • create some entity,
  • add annotations on the class and properties,
  • launch a command.

Voilà, the schema is generated. This schema can contain errors (invalid relations, mismatched properties, etc.). These errors can be checked by a console command, bin/console doctrine:schema:validate, and by the profiler. The console command checks ALL registered entities. The profiler is supposed to check only the loaded entity types. But the validator itself loads all the relations of all loaded entities. So we can have relations of relations which, in my schema, can easily lead to “load everything”. Remember that in dev we do not have a primed cache, so if the metadata were not used for the current request, they are not yet computed, and end up being computed just for the collector.
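The “relations of relations” effect is a plain graph traversal. Here is a small sketch with a made-up association map (the entity names and reachableEntities() are hypothetical, and no Doctrine code is involved) showing how a single loaded entity can pull in most of a schema.

```php
<?php

// Hypothetical helper: every entity reachable from the loaded ones by
// following associations, in breadth-first order.
function reachableEntities(array $associations, array $loaded): array
{
    $seen = [];
    $queue = $loaded;

    while ($queue) {
        $entity = array_shift($queue);
        if (isset($seen[$entity])) {
            continue;
        }
        $seen[$entity] = true;

        foreach ($associations[$entity] ?? [] as $target) {
            $queue[] = $target;
        }
    }

    return array_keys($seen);
}

// Made-up schema: entity => entities it has a relation to.
$associations = [
    'Order' => ['Customer', 'Product'],
    'Customer' => ['Address'],
    'Product' => ['Category'],
    'Invoice' => [],
];

// Only Order was loaded by the request, yet five of the six entities are visited:
// ['Order', 'Customer', 'Product', 'Address', 'Category']
$visited = reachableEntities($associations, ['Order']);
```

In a real schema with a few central entities, this set quickly converges to “nearly everything”, which is why the cost looked so disproportionate in the profile.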

I proposed putting it behind a feature switch, as the check can always be run via the console command (in CI, for example). It got merged and will be released in 2.1.

# config/packages/doctrine.yaml
doctrine:
    dbal:
        profiling_collect_schema_class_errors: false

SchemaValidator no more costly!...

FYI: how often do you really look at the profiler? Did you know you can disable it and still benefit from part of the debug mode (SQL queries in the log, exceptions properly caught)?

That’s all for today

That concludes our second run of performance improvements. We learned a lot about how the dev env works and what brings the dev experience together. I hope that you learned something today and that the performance of your dev environment will benefit from it as much as mine did.

The next article will recap what we learned and will focus on how much better the Symfony developer environment could be, and what I dream of for us PHP developers.
