What you need to know about environment variables with PHP

Using environment variables for configuration is today’s best practice for application setup: database credentials, API keys, secrets and everything that varies between deploys are now exposed to the code via the environment, instead of configuration files or, worse, directly hard-coded.

You can't leak what you don't store

Let’s dive into:

  • how does it work?
  • is it really a good idea?
  • how to deal with them in PHP?
  • and finally some recommendations and common errors to avoid – with some real world traps we fell into!

We are not going to cover how to set up environment variables in your webserver / Docker / crontabs… as it depends on the system and the software, and we want to focus on the env vars themselves.

Things are a little different if your hosting uses Docker Swarm or AWS, for example: they expose your secrets as files pushed onto the container filesystem rather than as env vars. That is specific to those platforms and not a standard at all.

Env vars 101

When you run a program, it inherits all environment variables from its parent. So if you set a variable named YOLO with the value covfefe in your bash and then run a command, you will be able to read YOLO in any child process.

$ YOLO=covfefe php -r 'echo getenv("YOLO");'

As this variable is only locally defined, we can’t read it from another terminal (another parent). So the idea is to make sure your application always inherits the needed variables.
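The same inheritance can be observed from inside PHP. The following is a minimal sketch, assuming a Unix-like system where shell_exec() is enabled; the variable name YOLO is reused from the example above.

```php
<?php
// Set YOLO in the current PHP process; children started afterwards inherit it.
putenv('YOLO=covfefe');

// shell_exec() starts a child shell, which reads YOLO from its inherited environment.
// The single quotes keep PHP from interpolating $YOLO itself.
$seenByChild = trim((string) shell_exec('printf "%s" "$YOLO"'));

echo $seenByChild, PHP_EOL;
```

The child shell never received YOLO as an argument: it only found it in the environment copied from its parent.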

You can see all environment variables in your shell by running the following command. You will not see the YOLO variable yet, because it was only passed to the php command on the fly, not set in the current process:

$ env

You can set an environment variable with the syntax export <NAME>=<VALUE>:

$ export YOLO=covfefe

Variable names are case sensitive and the convention is to only use English, uppercase names with _ as separator (upper snake case). You already know some like PATH, DISPLAY, HTTP_PROXY, …

Today’s best practice

You may already know the twelve-factor methodology to build robust and scalable applications (if not, I suggest you take a break and check it out). The Configuration chapter explains why storing configuration in the environment is the way to go:

  • Config varies substantially across deploys (production, staging, testing…), code does not;
  • Env vars are easy to change between deploys without changing any code;
  • They are a language – and OS – agnostic standard. The same configuration can be shared between your PHP and Python processes.

The manifesto also describes quite well what should be in the code and what should be in the environment. Do not put your whole application configuration in it, only what differs from one deploy to another.

I read on the Internet that env vars are dangerous

Some articles will tell you that env vars are harmful for your secrets. The main reason is that every process inherits all of its parent’s variables. So if you have a very secret setting in the environment, child processes will have access to it:

$ export YOLO=covfefe
$ php -r "echo exec('echo \$YOLO');"

Child processes may treat environment variables as public data: something to write into logs, include in bug reports, or dump to the user in case of error… They can leak your secrets.

The alternative is plain old text files, with strong Unix permissions. But what should really be done is clearing the environment when running a child process you do not trust, like nginx does. By default, nginx removes all environment variables inherited from its parent process except the TZ variable. Problem solved!

This can be done with env -i, which starts the command that follows with an empty environment.

$ php -r "echo exec('env -i php -r \'echo getenv(\"YOLO\");\'');"

$ php -r "echo exec('php -r \'echo getenv(\"YOLO\");\'');"

Always run processes you do not trust in a restricted environment.
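From PHP, a similar restriction can be sketched with proc_open(), whose fifth argument replaces the child environment instead of extending it. The helper name runWithEnv is hypothetical, and the sketch assumes a Unix-like system where the env command is available on the given PATH.

```php
<?php
/**
 * Run a command with ONLY the given environment variables and return its stdout.
 * Hypothetical helper: passing an array as the 5th argument of proc_open()
 * means the child does not inherit the parent environment.
 */
function runWithEnv(string $command, array $env): string
{
    $descriptors = [1 => ['pipe', 'w']]; // capture stdout only
    $process = proc_open($command, $descriptors, $pipes, null, $env);
    if (!is_resource($process)) {
        throw new RuntimeException("Unable to start: $command");
    }
    $output = (string) stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($process);

    return $output;
}

putenv('YOLO=covfefe'); // a "secret" living in the parent process

// The child only receives PATH, so it can still locate binaries but cannot see YOLO.
echo runWithEnv('env', ['PATH' => '/usr/bin:/bin']);
```

An explicit allow-list like this is easier to audit than trying to remove secrets one by one before starting a child.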

Even if you trust your code, you should still be very careful and expose your variables to as few processes as possible. You never know (NPM Drama inside).

Getting your PHP application ready

When dealing with env vars in a PHP project, you want to make sure your code always gets the variable from a reliable source, be it $_ENV, $_SERVER or getenv… But those three methods do not return the same results!

$ php -r "echo getenv('HOME');"

$ php -r 'echo $_ENV["HOME"];'
PHP Notice:  Undefined index: HOME

$ php -r 'echo $_SERVER["HOME"];'

This is because the variables_order PHP setting on my machine is GPCS. As there is no E in it, $_ENV is not populated and I can’t rely on that superglobal. This can lead to code working on one PHP installation and not on another.
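One way to hide that difference is a small lookup helper that checks each source in turn. This is only a sketch: the function name envValue is hypothetical, and it keeps getenv() as a last resort for setups where neither superglobal holds the variable.

```php
<?php
/**
 * Look a variable up in $_SERVER, then $_ENV, then getenv().
 * Returns null when the variable is defined nowhere.
 * Hypothetical helper, not part of PHP or of any library.
 */
function envValue(string $name): ?string
{
    foreach ([$_SERVER, $_ENV] as $source) {
        // $_SERVER also holds non-string entries (argv, argc...), so check the type.
        if (isset($source[$name]) && is_string($source[$name])) {
            return $source[$name];
        }
    }
    $value = getenv($name);

    return $value === false ? null : $value;
}

// Shows which superglobals this installation populates (for example "GPCS" or "EGPCS").
echo ini_get('variables_order'), PHP_EOL;
var_dump(envValue('HOME'));
```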

Another point is that developers don’t want to manage env vars locally. We do not want to edit a VirtualHost all the time, reload php-fpm, reboot some services, clear caches… Developers want a simple and painless way of setting environment variables… like a .env file!

A .env file is just a list of env vars with their values, one NAME=value pair per line.

Dot Env libraries to the rescue

vlucas/phpdotenv, the most popular library at the moment

This library will read a .env file and populate all the superglobals:

$dotenv = new Dotenv\Dotenv(__DIR__);
// Read __DIR__/.env and populate getenv(), $_ENV and $_SERVER
$dotenv->load();

$s3Bucket = getenv('S3_BUCKET');
$s3Bucket = $_ENV['S3_BUCKET'];
$s3Bucket = $_SERVER['S3_BUCKET'];

There are some nice additions like the ability to mark some variables as required (and this is the one used by Laravel).

josegonzalez/dotenv, security oriented

This library doesn’t populate the superglobals by default:

$Loader = new josegonzalez\Dotenv\Loader('path/to/.env');
// Parse the .env file
$Loader->parse();
// Send the parsed .env file to the $_ENV variable
$Loader->toEnv();

It supports required keys, filtering, and can throw exceptions when a variable is overwritten.

symfony/dotenv, the new kid on the block

Available since Symfony 3.3, this component takes care of the .env file like the others, and populates the superglobals too:

$dotenv = new Symfony\Component\Dotenv\Dotenv();
// Read the .env file and populate getenv(), $_ENV and $_SERVER
$dotenv->load(__DIR__.'/.env');

$dbUser = getenv('DB_USER');
$dbUser = $_ENV['DB_USER'];
$dbUser = $_SERVER['DB_USER'];

There are more on Packagist, and at this point I’m too afraid to ask why everyone is writing the same parser all over again.

But they are all using the same logic:

  • find a .env file;
  • parse it, check for nested values, extract all the variables;
  • populate all the superglobals only for variables not already set.
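The steps above can be sketched in a few lines of PHP. This is a deliberately naive illustration, not the implementation of any of the libraries mentioned: the function names parseDotEnv and populateEnv are hypothetical, nested values are not resolved, and quoting is handled in the simplest possible way.

```php
<?php
// Hypothetical helpers, not the API of any library named above.

/** Parse "NAME=value" lines, skipping blank lines, # comments and malformed lines. */
function parseDotEnv(string $contents): array
{
    $vars = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#' || strpos($line, '=') === false) {
            continue;
        }
        [$name, $value] = explode('=', $line, 2);
        // Strip surrounding whitespace, then surrounding quotes (a simplification).
        $vars[trim($name)] = trim(trim($value), "\"'");
    }

    return $vars;
}

/** Populate getenv(), $_ENV and $_SERVER, but never overwrite an existing variable. */
function populateEnv(array $vars): void
{
    foreach ($vars as $name => $value) {
        $name = (string) $name;
        if (isset($_SERVER[$name]) || isset($_ENV[$name]) || getenv($name) !== false) {
            continue; // the real environment wins over the file
        }
        putenv("$name=$value");
        $_ENV[$name] = $value;
        $_SERVER[$name] = $value;
    }
}

// Inline contents stand in for a real file read, to keep the sketch self-contained.
$_SERVER['DB_USER'] = 'already-set';
populateEnv(parseDotEnv("# sample values\nDB_USER=dev\nS3_BUCKET=\"dev-bucket\"\n"));
echo $_SERVER['DB_USER'], PHP_EOL; // still "already-set"
```

The "not already set" check is what lets a real environment variable take precedence over the value committed in the file.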

I recommend committing a .env file with values made for the developers: everyone should be able to check out your project and run it the way they like (command line server, Apache, nginx…) without dealing with configuration.

(new Dotenv())->load(__DIR__.'/.env');

This recommendation works well when everyone has the same infrastructure locally: same database password, same server port… As we use Docker Compose on all our projects, we never have any difference from one developer to another. If you don’t have this luxury, just allow developers to overwrite the defaults by importing two files:

(new Dotenv())->load(__DIR__.'/.env', __DIR__.'/.env.dev');

That way you just have to create and populate a .env.dev file with what’s different for you (and add it to .gitignore).

Then, in production, you should not load those default values, so the idea is to protect the loader with an env var only set in production:

// Only load the development defaults when APP_ENV is not provided by the real environment
if (!isset($_SERVER['APP_ENV'])) {
    (new Dotenv())->load(__DIR__.'/.env', __DIR__.'/.env.dev');
}

If you don’t do that and your hosting provider forgot a variable, you are going to run development settings in production and have a bad time.

The pitfalls you have to look for ⚠

Name conflicts

Naming is hard, and env vars don’t escape this rule.

So when naming your env vars, you have to be specific and avoid name collisions as much as possible. As there is no official list of reserved names, it’s up to you. Prefixing custom variables can’t hurt.

The Unix world does it already, with LC_, GTK_, NODE_…
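A prefix also makes it easy to collect everything that belongs to your application in one go. A minimal sketch, where both the helper name envWithPrefix and the MYAPP_ prefix are hypothetical:

```php
<?php
/**
 * Return only the string entries of $source whose name starts with $prefix.
 * Hypothetical helper; MYAPP_ below is an illustrative prefix.
 */
function envWithPrefix(array $source, string $prefix): array
{
    $matches = [];
    foreach ($source as $name => $value) {
        if (is_string($name) && is_string($value) && strpos($name, $prefix) === 0) {
            $matches[$name] = $value;
        }
    }

    return $matches;
}

// Everything the application owns, without PATH, HOME and friends.
print_r(envWithPrefix($_SERVER, 'MYAPP_'));
```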

Missing variables at runtime

You have two choices when a variable is missing: either throw an Exception, or use a default value. That’s up to you but the second one is silent… Which can cause harm in a lot of contexts.

As soon as you want to use env vars, you have to set them everywhere:

  • in the webserver;
  • in the long running scripts and services;
  • in the crontabs…
  • and in the deployment scripts!

The last one is easy to miss, but some deployments warm the application cache (like Symfony’s)… Yep, a missing variable can lead to a corrupted application delivery. Be strict about them and add a requirement check on your application startup.
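Such a startup check can be very small. The sketch below is one possible shape, with a hypothetical function name and illustrative variable names; it treats an empty string as missing, which is an assumption you may not share.

```php
<?php
/**
 * Throw if any required variable is absent (or empty) in $source.
 * Hypothetical helper; pass $_SERVER as $source in a real application.
 */
function assertEnvVarsDefined(array $required, array $source): void
{
    $missing = [];
    foreach ($required as $name) {
        if (!isset($source[$name]) || $source[$name] === '') {
            $missing[] = $name;
        }
    }
    if ($missing !== []) {
        throw new RuntimeException('Missing environment variables: ' . implode(', ', $missing));
    }
}

// Illustrative names and values: DATABASE_PORT is deliberately left out.
try {
    assertEnvVarsDefined(['DATABASE_HOST', 'DATABASE_PORT'], ['DATABASE_HOST' => 'db']);
} catch (RuntimeException $e) {
    echo $e->getMessage(), PHP_EOL; // Missing environment variables: DATABASE_PORT
}
```

Failing loudly at boot, including in deployment scripts, is what prevents a warmed cache from silently embedding a missing value.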

The HTTP_ prefix

There is just one prefix you should never use: HTTP_. It is the one used by PHP itself (and other CGI-like contexts) to store HTTP request headers.

Do you remember the httpoxy security vulnerability? It was caused by HTTP clients looking for the HTTP_PROXY variable in the environment, a variable that could be set in CGI-like contexts via a simple Proxy HTTP header.

Some DotEnv libraries also prevent override of those variables, like the Symfony one.
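If your loader does not do it for you, the same protection is easy to approximate. This is only a sketch with a hypothetical helper name; it drops every HTTP_-prefixed name from a set of parsed variables before they are exposed to the application.

```php
<?php
/**
 * Remove every variable whose name starts with HTTP_, since those names are
 * reserved for request headers in CGI-like contexts.
 * Hypothetical helper, not the behaviour of any specific library.
 */
function withoutHttpPrefixed(array $vars): array
{
    return array_filter(
        $vars,
        static function ($name): bool {
            return strpos((string) $name, 'HTTP_') !== 0;
        },
        ARRAY_FILTER_USE_KEY
    );
}

print_r(withoutHttpPrefixed(['HTTP_PROXY' => 'http://evil.example', 'APP_ENV' => 'prod']));
```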

Thread safety of getenv()

I have bad news: in some configurations, the getenv function can return unexpected results. This function is not thread safe!

You should not use it to retrieve your values, so I suggest reading from $_SERVER instead. There is also a small performance difference between an array access and a function call, for what it’s worth.

Env vars are always strings

One of the main issues, now that we have scalar type declarations in PHP, is that our settings coming from env vars are not always properly typed.

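public function connect(string $hostname, int $port)

// This will not work properly: the port arrives as a string such as '3306', not as an integer
// (DATABASE_HOST and DATABASE_PORT are illustrative names)
$db->connect($_SERVER['DATABASE_HOST'], $_SERVER['DATABASE_PORT']);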

Symfony now allows casting variables, and more, like reading a file or decoding JSON…
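Outside of Symfony, the conversion can be done by hand before the value reaches typed code. The helpers below are a hedged sketch built on PHP’s filter_var(); the function names, and the DATABASE_PORT and APP_DEBUG variables, are hypothetical.

```php
<?php
// Hypothetical helpers; DATABASE_PORT and APP_DEBUG are illustrative names.

/** Return the variable as an int, or throw if it is missing or not an integer. */
function envInt(array $source, string $name): int
{
    if (!isset($source[$name])) {
        throw new RuntimeException("Missing environment variable: $name");
    }
    $value = filter_var($source[$name], FILTER_VALIDATE_INT);
    if ($value === false) {
        throw new UnexpectedValueException("$name is not an integer");
    }

    return $value;
}

/** Return the variable as a bool ("1", "true", "on", "yes" and their opposites). */
function envBool(array $source, string $name): bool
{
    if (!isset($source[$name])) {
        throw new RuntimeException("Missing environment variable: $name");
    }
    $value = filter_var($source[$name], FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
    if ($value === null) {
        throw new UnexpectedValueException("$name is not a boolean");
    }

    return $value;
}

$env = ['DATABASE_PORT' => '3306', 'APP_DEBUG' => 'false'];
var_dump(envInt($env, 'DATABASE_PORT')); // int(3306)
var_dump(envBool($env, 'APP_DEBUG'));    // bool(false)
```

Validating instead of blindly casting matters here: (int) 'abc' quietly yields 0, whereas these helpers reject the value.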

Env vars everywhere, or not

There is a lot of debate at the moment between env vars, files, or a mix of both: env vars referencing a configuration file. The fact is that despite being considered a best practice, env vars do not bring a lot of advantages…

But if properly used, in a Symfony application for example, env vars can be changed on the fly, without clearing any cache, without doing any filesystem access, without deploying code: just by restarting a process, for example.

The trend to have just one variable, like APP_CONFIG_PATH, and to read it via '%env(json:file:APP_CONFIG_PATH)%' looks like re-inventing the good old parameters.yml to me, unless the file is managed automatically by a trusted tool (like AWS Secret Store). There is also envkey.com, which allows you to control your env vars from one location without dealing with files yourself. I like this approach as it’s closer to the simplicity of Heroku-like hosting!

What are you using to expose your credentials to your application? Do you have any pro-tips ©️ to share about env vars? Please comment!
