DIY Drupal hosting: DrupalVM

Time for a little confession. I didn't intend to showcase DrupalVM as a DIY Drupal hosting solution when I conceived this series. Jeff Geerling, DrupalVM's creator, hinted at using DrupalVM as a viable solution for small to medium sites in the first post of the series. It was an idea worth exploring, and the result is this post.

DrupalVM is designed to spin up full-stack Drupal setups for testing and development purposes on local VMs. It solves a major pain point encountered by developers: production parity, a.k.a. the "It Works on My Machine"™ problem. In my early development days, I would fix a bug only to find that it broke in production because, well, the PHP version differed from the one on my local machine, or I would spend hours figuring out why a piece of code was not working in production because the caching settings were different from my local setup. The answer is to have your dev setup match your production setup as closely as possible, down to a T. Yeah, I hear you: what if another project I'm working on uses a different set of settings or, even worse, an older version of PHP? With a single shared local stack you're out of luck, and it's time to get rid of your LAMP/MAMP setup and use DrupalVM.

The biggest strength of DrupalVM is its configurability. It has enough settings to choke a whale, and more. DrupalVM is written using a configuration management tool called Ansible, which allows users to specify the finer aspects of their Drupal site using YAML, a human-friendly declarative language. The Ansible convention is to declare a set of variables in YAML; these are parsed and drive a set of tasks (e.g. install MySQL, install Drush, update Composer). A collection of such tasks is called a playbook in Ansible lingo. DrupalVM, in technical terms, is a highly configurable Ansible playbook to set up and deploy Drupal onto any infrastructure. "Any infrastructure" could mean a Vagrant machine (the default use case for DrupalVM), a DigitalOcean/Linode VPS, or your own server. The only prerequisite is for the target machine to have SSH access (and a Python interpreter, more on that below). This is where Ansible scores over other configuration management systems like Chef and Puppet. Ansible has an "agentless architecture". In other words, the target machines need not have any agent installed for Ansible scripts to work.
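To make the vocabulary concrete, here is a minimal sketch of a playbook. It is not taken from DrupalVM; the group name `drupalvm`, the variable `extra_packages` and the package list are assumptions chosen purely for illustration.

```yaml
# playbook-sketch.yml: a minimal, hypothetical playbook (not DrupalVM's own).
- hosts: drupalvm            # assumed inventory group name
  become: yes                # run tasks with elevated privileges
  vars:
    # Hypothetical variable: packages to install on the target machine.
    extra_packages:
      - mysql-server
      - git
  tasks:
    - name: Install the packages listed in extra_packages
      apt:
        name: "{{ extra_packages }}"
        state: present
        update_cache: yes
```

Changing the behaviour means editing the variables, not the tasks, which is the same idea DrupalVM applies on a much larger scale.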

Installing DrupalVM involves cloning the official repository. The only requirement is to have Ansible installed on your system. The repository contains a default.config.yml, which is pretty much the only file you need to go through and play around with to set up your site. There are actually two more files you need to edit; they are trivial, and I'll cover them in a bit. The documentation has instructions for making DrupalVM work on a Vagrant machine. I'll walk through only the hosting/production deployment part.

The DrupalVM repo contains a main playbook.yml inside the provisioning directory. This reads configuration from a config.yml written by you under a config_dir folder. The playbook assembles various roles (modular collections of Ansible tasks) and other task files dynamically by reading the values in the config file. For example, it will add the Apache role and install Apache if Apache is configured as the webserver of choice, and so on. That is the gist of how DrupalVM works.
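The idea of picking roles based on config values can be sketched as follows. This is a simplified illustration, not DrupalVM's actual playbook.yml: the role names, the `drupalvm_webserver` variable and the relative path to default.config.yml are assumptions on my part.

```yaml
# Simplified sketch of conditional role selection; not DrupalVM's real playbook.
- hosts: all
  become: yes
  vars_files:
    - ../default.config.yml            # shipped defaults (path is an assumption)
    - "{{ config_dir }}/config.yml"    # per-site overrides; later files win
  roles:
    - role: geerlingguy.apache         # assumed role name
      when: drupalvm_webserver == 'apache'
    - role: geerlingguy.nginx          # assumed role name
      when: drupalvm_webserver == 'nginx'
```

Because the override file is loaded after the defaults, a site's config.yml only has to list the values that differ.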

To demonstrate the DIY hosting capabilities of DrupalVM, I've written a set of customized sample config.yml files for a couple of sites. This is not the only way to do things, but it is a recommended practice for maintaining your production infrastructure configuration. Clone the repo used to illustrate this example.

$ git clone deployments

The repo is organized on a per-site basis. Each folder (named after the site's domain) contains a customized config.yml, which holds only the overridden values. The default ones are picked up from default.config.yml in the DrupalVM repo. This is pretty much the blueprint of your site. There is also a per-site inventory file, an Ansible convention for specifying where to run all our tasks and playbooks. It is a list of IPs or hostnames, organized into groups, in an INI-ish format.


[drupalvm] ansible_ssh_user=root

I specify a global variable called ansible_python_interpreter, which tells Ansible which Python executable to use on the remote machine (Ansible is written in Python, BTW). This is done explicitly because all our remote machines run Ubuntu 16.04, where the default Python version is 3.5.2, whereas Ansible is configured to work with Python 2.x by default. Hopefully, this will go away sometime soon and there won't be any need for this piece of configuration. Read up more about Ansible inventories if you're curious.
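Ansible also accepts inventories written in YAML. As a sketch of what an equivalent inventory could look like in that format, with `example.com` as a placeholder host and `/usr/bin/python3` as an assumed interpreter path:

```yaml
# inventory.yml: hypothetical YAML-format inventory for one site.
all:
  vars:
    # Assumed path to Python 3 on a stock Ubuntu 16.04 machine.
    ansible_python_interpreter: /usr/bin/python3
  children:
    drupalvm:
      hosts:
        example.com:                  # placeholder; use your server's hostname or IP
          ansible_ssh_user: root
```

Variables under `all.vars` apply to every host, which is what makes the interpreter setting global.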

The other file common across all sites is ansible.cfg. As the name suggests, it contains configuration needed while running Ansible playbooks.


[ssh_connection]
ssh_args = -o ForwardAgent=yes -o StrictHostKeyChecking=no

Arguably, the most important piece here is the ForwardAgent=yes part. When running one of the playbooks, I specify a Git repo to clone my site from. This will most likely be a private repository, which means the target machine would need its own pair of SSH keys to authorize the git clone command. This can get messy quickly, as you have to create new key pairs every time you deploy to a new machine and move them around. ForwardAgent tells SSH to forward your local SSH agent to the target, so the clone authenticates with the keys from the machine where you are running the playbooks, which means no more clumsy key management.

Onto the scripts themselves. Let's take the first site's config.yml and dissect it. First I specify the domain where we will deploy the site. NOTE that before you run Ansible, you should have a bare Ubuntu 16.04 server running, with that domain pointing to it. If you are using DigitalOcean, it should look something like the screenshots below.

drupal_domain: ""

[Image: Rebuild DigitalOcean image]

[Image: DigitalOcean DNS configuration]

Then comes the apache_vhosts configuration. Though this is specified in the same manner as in default.config.yml, I'm pruning the other vhost entries and keeping only the Drupal one, hence the override. This is followed by the Drupal and PHP versions.

apache_vhosts:
  - servername: "{{ drupal_domain }}"
    documentroot: "{{ drupal_core_path }}"
    extra_parameters: |
          ProxyPassMatch ^/(.*\.php(/.*)?)$ "fcgi://{{ drupal_core_path }}"

drupal_major_version: 7
php_version: "5.6"

DrupalVM can install a lot of add-on packages, like Varnish, Solr, Redis, Drupal Console etc. For this site, we just stick with Drush; can't do Drupal without it :)

installed_extras:
  - drush

We then specify the various credentials used in the stack, like the Drupal admin password, DB username/password etc. NOTE that this is NOT how you should specify credentials in Ansible. I'll cover the proper way to do it in another blog post. We are just exposing them here for pedagogical purposes. Please don't put your actual credentials in this file and check it in, you WILL be fired from your job!

drupal_account_pass: admin
drupal_db_password: drupal
mysql_root_password: root

There are a couple of not-so-obvious gems in DrupalVM called the pre and post provision tasks.

pre_provision_tasks_dir: '{{ config_dir }}/pre.yml'
post_provision_tasks_dir: '{{ config_dir }}/post.yml'

These can be specified as either shell scripts or Ansible tasks. I prefer the latter. The 'pre' tasks are run before any other task in DrupalVM. The 'post' tasks are run after all the required packages are configured and installed. For this site, I want to set it up from a Git repo and install from a DB snapshot. So, I set drupal_install_site: false and add a bunch of custom variables.

git_repo: ''
git_branch: 'master'
db_snapshot: ''
prod_url: ""

This is pretty much what I do in the post provision file.

- name: "{{ drupal_domain }} | Install Drupal with drush."
  command: >
    {{ drush_path }} site-install {{ drupal_install_profile | default('standard') }} -y
    --site-name="{{ drupal_site_name }}"
    --account-name={{ drupal_account_name }}
    --account-pass={{ drupal_account_pass }}
    --db-url={{ drupal_db_backend }}://{{ drupal_db_user }}:{{ drupal_db_password }}@localhost/{{ drupal_db_name }}
    {{ drupal_site_install_extra_args | default([]) | join(" ") }}
    -r {{ drupal_core_path }}
  when: db_snapshot is undefined and not drupal_install_site

- include: "{{config_dir }}/install-from-db.yml"
  when: db_snapshot is defined
  static: no

The drush site-install task won't execute for this site, as I've defined a DB snapshot. I include an install-from-db.yml file instead. The convenience of Ansible is that you can modularize a set of related tasks and include them as the context requires, as in the example above.
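The contents of install-from-db.yml aren't shown here, so the following is only a guess at what such a file might contain. It assumes db_snapshot is a URL pointing to a plain, uncompressed .sql dump; the scratch path /tmp/snapshot.sql is hypothetical.

```yaml
# install-from-db.yml: hypothetical sketch; the real file may differ.
- name: "{{ drupal_domain }} | Download the DB snapshot"
  get_url:
    url: "{{ db_snapshot }}"      # assumes a URL to an uncompressed .sql dump
    dest: /tmp/snapshot.sql       # assumed scratch location on the target

- name: "{{ drupal_domain }} | Import the DB snapshot"
  mysql_db:
    name: "{{ drupal_db_name }}"
    state: import                 # load the dump into the named database
    target: /tmp/snapshot.sql
    login_user: "{{ drupal_db_user }}"
    login_password: "{{ drupal_db_password }}"
```

The mysql_db module needs a Python MySQL library on the target machine, which is worth keeping in mind when choosing what the pre-provision tasks install.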

The pre-provision file just installs Git and clones the repo.

- name: "{{ drupal_domain }} | Install dependencies"
  apt: pkg={{ item }} state=present
  with_items:
    - python3-mysqldb
    - git

- name: "{{ drupal_domain }} | Clone repo"
  git: repo={{ git_repo }}
       version={{ git_branch }}
       dest={{ drupal_core_path }}

The python3-mysqldb package is another Ansible-Python 3 nuance. The git clone will work because of the ForwardAgent trick we talked about earlier. Let's give our new setup a spin by running the playbook. Make sure you are running this inside the DrupalVM repo you cloned.

$ env ANSIBLE_CONFIG=~/deployments/ansible.cfg ansible-playbook -i ~/deployments/  provisioning/playbook.yml  --extra-vars="config_dir=~/deployments/"

Here, ~/deployments is the directory into which you cloned the DIY-hosting-drupalvm repository. We specify the non-standard location of ansible.cfg first, then indicate the location of the inventory file using the -i flag, and lastly tell Ansible where the config_dir resides. Go grab a coffee while Ansible grinds through your site. It might take a good 15 minutes for the first run.

The second site installs plain vanilla Drupal 8 on a PHP 7 stack served by Nginx.

$ env ANSIBLE_CONFIG=~/deployments/ansible.cfg ansible-playbook -i ~/deployments/  provisioning/playbook.yml  --extra-vars="config_dir=~/deployments/"
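A config.yml for a site like this could be very short, since a fresh install needs no custom pre/post tasks. The sketch below is an assumption about what it might hold, not the actual file: `example.com` is a placeholder, and `drupalvm_webserver` is the variable name as I recall it from DrupalVM's defaults.

```yaml
# config.yml for the second site: hypothetical sketch, overridden values only.
drupal_domain: "example.com"      # placeholder domain
drupalvm_webserver: nginx         # assumed variable name for the webserver choice
drupal_major_version: 8
php_version: "7.0"
drupal_install_site: true         # let DrupalVM run a fresh site install
installed_extras:
  - drush
```

Everything not listed here would fall back to the values in default.config.yml.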

Any steps specific to this site could be added in its own post config script. The post config script is the way to extend DrupalVM's functionality without hacking its core. A simple example of this would be applying Drupal module updates. We could write a set of tasks for:

  • Put the site in maintenance mode using Drush.
  • Take a backup of the site.
  • Switch the code to the latest/appropriate commit/tag/branch.
  • Apply updates using Drush.
  • Ensure that the update is successful. You could define your own set of tests for this, or capture the output of the drush updb command and validate accordingly.
  • Run the eternal drush cache clear command.
  • Switch off maintenance mode using Drush.

Ansible has the concept of tagging tasks, as in,

- name: "{{ drupal_domain }} | Set the site in maintenance mode."
  command: >
    {{ drush_path }} vset maintenance_mode 1
    -r {{ drupal_core_path }}
  tags:
    - updates

We could tag all the above tasks under the "updates" tag, and run only those tasks in Ansible. So, to update, effectively,

$ env ANSIBLE_CONFIG=~/deployments/ansible.cfg ansible-playbook -i ~/deployments/  provisioning/playbook.yml  --extra-vars="config_dir=~/deployments/" --tags="updates"
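The remaining steps from the list could be sketched along these lines for a Drupal 7 site. This is an illustration rather than the tasks I actually run: `backup_file` is a hypothetical variable you would define in config.yml, and the other variables are the ones used earlier in this post.

```yaml
# Hypothetical update tasks for a Drupal 7 site; each carries the "updates" tag.
- name: "{{ drupal_domain }} | Back up the database"
  command: >
    {{ drush_path }} sql-dump
    --result-file={{ backup_file }}
    -r {{ drupal_core_path }}
  tags:
    - updates

- name: "{{ drupal_domain }} | Switch code to the requested branch or tag"
  git:
    repo: "{{ git_repo }}"
    version: "{{ git_branch }}"
    dest: "{{ drupal_core_path }}"
  tags:
    - updates

- name: "{{ drupal_domain }} | Apply database updates"
  command: >
    {{ drush_path }} updb -y
    -r {{ drupal_core_path }}
  register: updb_result            # keep the output for later validation
  tags:
    - updates

- name: "{{ drupal_domain }} | Clear all caches"
  command: >
    {{ drush_path }} cc all
    -r {{ drupal_core_path }}
  tags:
    - updates

- name: "{{ drupal_domain }} | Switch off maintenance mode"
  command: >
    {{ drush_path }} vset maintenance_mode 0
    -r {{ drupal_core_path }}
  tags:
    - updates
```

A command task fails when Drush exits with a non-zero status, which stops the run at that point. That doubles as a crude success check, but it also means a failed update leaves the site in maintenance mode until you step in.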

DrupalVM, though lacking a neat dashboard, helps you manage all your Drupal infrastructure using a single codebase without leaving the comfort of your command line or editor. You can also extend the ops you do on your sites by writing new tasks and "ansibilizing" them. All this while achieving development-production parity. You should give DrupalVM a spin if every one of your sites has config as unique as a snowflake and you don't want to invest in complicated tools or manpower. It's just Ansible and YAML files. Can't get simpler than that!