Highly-Available Infrastructure Provisioning and Configuration with Ansible

The following is an excerpt from Chapter 8 of Ansible for DevOps, a book on Ansible by Jeff Geerling. The example highlights Ansible’s simplicity and flexibility by provisioning and configuring a highly available web application infrastructure on a local Vagrant-managed cloud, DigitalOcean droplets, and Amazon Web Services EC2 instances, with one set of Ansible playbooks.

tl;dr Check out the code on GitHub, and buy the book to learn more about Ansible!

Highly-Available Infrastructure with Ansible

Real-world web applications require redundancy and horizontal scalability with multi-server infrastructure. In the following example, we’ll use Ansible to configure a complex infrastructure (illustrated below) on servers provisioned either locally via Vagrant and VirtualBox, or on a set of automatically-provisioned instances running on either DigitalOcean or Amazon Web Services:

Highly-Available Infrastructure.

Varnish acts as a load balancer and reverse proxy, fronting web requests and routing them to the application servers. We could just as easily use something like Nginx or HAProxy, or even a proprietary cloud-based solution like Amazon’s Elastic Load Balancer or Linode’s NodeBalancer, but for simplicity’s sake, and for flexibility in deployment, we’ll use Varnish.

Apache and mod_php run a PHP-based application that displays the entire stack’s current status and outputs the current server’s IP address for load balancing verification.

A Memcached server provides a caching layer that can be used to store and retrieve frequently-accessed objects in lieu of slower database storage.

Two MySQL servers, configured as a master and slave, offer redundant and performant database access; all data will be replicated from the master to the slave, and the slave can also be used as a secondary server for read-only queries to take some load off the master.
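The read-offloading idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pick_server function and the host names db1 and db2 are hypothetical, and the example application in this chapter does not route queries this way.

```python
def pick_server(sql, master="db1", replicas=("db2",)):
    """Return the host a query should be sent to.

    Writes always go to the master; read-only SELECT statements go to a
    replica when one is configured. Host names are placeholders.
    """
    is_read = sql.lstrip().lower().startswith("select")
    if is_read and replicas:
        return replicas[0]
    return master


print(pick_server("SELECT * FROM users"))     # a read goes to the replica
print(pick_server("UPDATE users SET x = 1"))  # a write goes to the master
```

A real application would also have to account for replication lag before trusting reads from the slave; the sketch ignores that.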

Directory Structure

In order to keep our configuration organized, we’ll use the following structure for our playbooks and configuration:


Organizing things this way allows us to focus on each server configuration individually, then build playbooks for provisioning and configuring instances on different hosting providers later. This organization also keeps server playbooks completely independent, so we can modularize and reuse individual server configurations.

Individual Server Playbooks

Let’s start building our individual server playbooks (in the playbooks directory). To make our playbooks more efficient, we’ll use some contributed Ansible roles on Ansible Galaxy rather than install and configure everything step-by-step. We’re going to target CentOS 6.x servers in these playbooks, but only minimal changes would be required to use the playbooks with Ubuntu, Debian, or later versions of CentOS.

Varnish

Create a main.yml file within the playbooks/varnish directory, with the following contents:

---
- hosts: lamp-varnish
  sudo: yes

  vars_files:
    - vars.yml

  roles:
    - geerlingguy.firewall
    - geerlingguy.repo-epel
    - geerlingguy.varnish

  tasks:
    - name: Copy Varnish default.vcl.
      template:
        src: "templates/default.vcl.j2"
        dest: "/etc/varnish/default.vcl"
      notify: restart varnish

We’re going to run this playbook on all hosts in the lamp-varnish inventory group (we’ll create this later), and we’ll run a few simple roles to configure the server:

  • geerlingguy.firewall configures a simple iptables-based firewall using a couple variables defined in vars.yml.
  • geerlingguy.repo-epel adds the EPEL repository (a prerequisite for varnish).
  • geerlingguy.varnish installs and configures Varnish.

Finally, a task copies over a custom default.vcl that configures Varnish, telling it where to find our web servers and how to load balance requests between the servers.

Let’s create the two files referenced in the above playbook. First, vars.yml, in the same directory as main.yml:

firewall_allowed_tcp_ports:
  - "22"
  - "80"

varnish_use_default_vcl: false

The first variable tells the geerlingguy.firewall role to open TCP ports 22 and 80 for incoming traffic. The second variable tells the geerlingguy.varnish role we will supply a custom default.vcl for Varnish configuration.

Create a templates directory inside the playbooks/varnish directory, and inside, create a default.vcl.j2 file. This file will use Jinja2 syntax to build Varnish’s custom default.vcl file:

vcl 4.0;

import directors;

{% for host in groups['lamp-www'] %}
backend www{{ loop.index }} {
  .host = "{{ host }}";
  .port = "80";
}
{% endfor %}

sub vcl_init {
  new vdir = directors.random();
{% for host in groups['lamp-www'] %}
  vdir.add_backend(www{{ loop.index }}, 1);
{% endfor %}
}

sub vcl_recv {
  set req.backend_hint = vdir.backend();

  # For testing ONLY; makes sure load balancing is working correctly.
  return (pass);
}

We won’t study Varnish’s VCL syntax in depth but we’ll run through default.vcl and highlight what is being configured:

  1. (lines 1-3) Indicate that we’re using the 4.0 version of the VCL syntax and import the directors Varnish module (which is used to configure load balancing).
  2. (lines 5-10) Define each web server as a new backend; give a host and a port through which Varnish can contact each host.
  3. (lines 12-17) vcl_init is called when Varnish boots and initializes any required Varnish modules. In this case, we’re configuring a load balancer vdir, and adding each of the www[#] backends we defined earlier as backends to which the load balancer will distribute requests. We use a random director so we can easily demonstrate Varnish’s ability to distribute requests to both app backends, but other load balancing strategies are also available.
  4. (lines 19-24) vcl_recv is called for each request, and routes the request through Varnish. In this case, we route the request to the vdir backend defined in vcl_init, and indicate that Varnish should not cache the result.
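To make the two Jinja2 loops concrete, here is a rough Python sketch that produces approximately the text the template would render to (whitespace aside). The function name render_default_vcl and the example addresses are assumptions made for illustration; Ansible itself does this through its template module, not through code like this.

```python
def render_default_vcl(www_hosts):
    """Build roughly the text default.vcl.j2 would render for www_hosts."""
    lines = ["vcl 4.0;", "", "import directors;", ""]
    # One backend per web server; Jinja2's loop.index starts at 1.
    for index, host in enumerate(www_hosts, start=1):
        lines += [
            "backend www%d {" % index,
            '  .host = "%s";' % host,
            '  .port = "80";',
            "}",
        ]
    lines += ["", "sub vcl_init {", "  new vdir = directors.random();"]
    # Register every backend with the random director, each with weight 1.
    for index, _host in enumerate(www_hosts, start=1):
        lines.append("  vdir.add_backend(www%d, 1);" % index)
    lines += [
        "}",
        "",
        "sub vcl_recv {",
        "  set req.backend_hint = vdir.backend();",
        "  return (pass);",
        "}",
    ]
    return "\n".join(lines)


# Placeholder addresses from the 192.0.2.0/24 documentation range.
print(render_default_vcl(["192.0.2.3", "192.0.2.4"]))
```

With two hosts in the lamp-www group, the output contains backends www1 and www2 and two add_backend calls, which is what Varnish needs to balance between both app servers.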

According to #4, we’re actually bypassing Varnish’s caching layer, which is not helpful in a typical production environment. If you only need a load balancer without any reverse proxy or caching capabilities, there are better options. However, we need to verify our infrastructure is working as it should. If we used Varnish’s caching, Varnish would only ever hit one of our two web servers during normal testing.
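The effect of the random director with caching bypassed can be modeled with a small simulation. This is a sketch under simplifying assumptions: it only models equal backend weights, and simulate_requests and the backend labels are made-up names, not anything Varnish provides.

```python
import random
from collections import Counter


def simulate_requests(backends, count, seed=None):
    """Pick a backend uniformly at random per request and tally the hits."""
    rng = random.Random(seed)
    return Counter(rng.choice(backends) for _ in range(count))


hits = simulate_requests(["www1", "www2"], 1000, seed=42)
print(hits)
```

Because every request is passed straight to a randomly chosen backend, both servers show up in the tally, which is the behavior we want to observe when refreshing the test page.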

In terms of our caching/load balancing layer, this should suffice. For a true production environment, you should remove the final return (pass) and customize default.vcl according to your application’s needs.

Apache / PHP

Create a main.yml file within the playbooks/www directory, with the following contents:

- hosts: lamp-www
  sudo: yes

  vars_files:
    - vars.yml

  roles:
    - geerlingguy.firewall
    - geerlingguy.repo-epel
    - geerlingguy.apache
    - geerlingguy.php
    - geerlingguy.php-mysql
    - geerlingguy.php-memcached

  tasks:
    - name: Remove the Apache test page.
      file:
        path: /var/www/html/index.html
        state: absent

    - name: Copy our fancy server-specific home page.
      template:
        src: templates/index.php.j2
        dest: /var/www/html/index.php

As with Varnish’s configuration, we’ll configure a firewall and add the EPEL repository (required for PHP’s memcached integration), and we’ll also add the following roles:

  • geerlingguy.apache installs and configures the latest available version of the Apache web server.
  • geerlingguy.php installs and configures PHP to run through Apache.
  • geerlingguy.php-mysql adds MySQL support to PHP.
  • geerlingguy.php-memcached adds Memcached support to PHP.

Two final tasks remove the default index.html home page included with Apache, and replace it with our PHP app.

As in the Varnish example, create the two files referenced in the above playbook. First, vars.yml, alongside main.yml:

firewall_allowed_tcp_ports:
  - "22"
  - "80"

Create a templates directory inside the playbooks/www directory, and inside, create an index.php.j2 file. This file will use Jinja2 syntax to build a (relatively) simple PHP script to display the health and status of all the servers in our infrastructure:

<?php
/**
 * @file
 * Infrastructure test page.
 *
 * DO NOT use this in production. It is simply a PoC.
 */

$mysql_servers = array(
{% for host in groups['lamp-db'] %}
  '{{ host }}',
{% endfor %}
);
$mysql_results = array();
foreach ($mysql_servers as $host) {
  if ($result = mysql_test_connection($host)) {
    $mysql_results[$host] = '<span style="color: green;">PASS</span>';
    $mysql_results[$host] .= ' (' . $result['status'] . ')';
  }
  else {
    $mysql_results[$host] = '<span style="color: red;">FAIL</span>';
  }
}

// Connect to Memcached.
$memcached_result = '<span style="color: red;">FAIL</span>';
if (class_exists('Memcached')) {
  $memcached = new Memcached;
  $memcached->addServer('{{ groups['lamp-memcached'][0] }}', 11211);

  // Test adding a value to memcached.
  if ($memcached->add('test', 'success', 1)) {
    $result = $memcached->get('test');
    if ($result == 'success') {
      $memcached_result = '<span style="color: green;">PASS</span>';
      $memcached->delete('test');
    }
  }
}

/**
 * Connect to a MySQL server and test the connection.
 *
 * @param string $host
 *   IP Address or hostname of the server.
 *
 * @return array
 *   Array with keys 'success' (bool) and 'status' ('slave' or 'master').
 *   Empty if connection failure.
 */
function mysql_test_connection($host) {
  $username = 'mycompany_user';
  $password = 'secret';
  try {
    $db = new PDO(
      'mysql:host=' . $host . ';dbname=mycompany_database',
      $username,
      $password,
      array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION));

    // Query to see if the server is configured as a master or slave.
    $statement = $db->prepare("SELECT variable_value
      FROM information_schema.global_variables
      WHERE variable_name = 'LOG_BIN';");
    $statement->execute();
    $result = $statement->fetch();

    return array(
      'success' => TRUE,
      'status' => ($result[0] == 'ON') ? 'master' : 'slave',
    );
  }
  catch (PDOException $e) {
    return array();
  }
}
?>
<!DOCTYPE html>
<html>
<head>
  <title>Host {{ inventory_hostname }}</title>
  <style>* { font-family: Helvetica, Arial, sans-serif }</style>
</head>
<body>
  <h1>Host {{ inventory_hostname }}</h1>
  <?php foreach ($mysql_results as $host => $result): ?>
    <p>MySQL Connection (<?php print $host; ?>): <?php print $result; ?></p>
  <?php endforeach; ?>
  <p>Memcached Connection: <?php print $memcached_result; ?></p>
</body>
</html>

Ansible 1.8 and greater provide more flexibility in requirements files. If you use a YAML file (e.g. requirements.yml) to define a structured list of all the roles you need, you can source them from Ansible Galaxy, a git repository, a web-accessible URL (as a .tar.gz), or even a mercurial repository! See the documentation for Advanced Control over Role Requirements Files.


Vagrantfile for Local Infrastructure via VirtualBox

As with many other examples in this book, we can use Vagrant and VirtualBox to build and configure the infrastructure locally. This lets us test things as much as we want with zero cost, and usually results in faster testing cycles, since everything is orchestrated over a local private network on a (hopefully) beefy workstation. Our basic Vagrantfile layout will be something like the following:

  1. Define a base box (in this case, CentOS 6.x) and VM hardware defaults.
  2. Define all the VMs to be built, with VM-specific IP addresses and hostname configurations.
  3. Define the Ansible provisioner along with the last VM, so Ansible can run once at the end of Vagrant’s build cycle.


Here’s the Vagrantfile in all its glory:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  # Base VM OS configuration.
  config.vm.box = "geerlingguy/centos6"

  # General VirtualBox VM configuration.
  config.vm.provider :virtualbox do |v|
    v.customize ["modifyvm", :id, "--memory", 512]
    v.customize ["modifyvm", :id, "--cpus", 1]
    v.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
    v.customize ["modifyvm", :id, "--ioapic", "on"]
  end

  # Varnish.
  config.vm.define "varnish" do |varnish|
    varnish.vm.hostname = "varnish.dev"
    varnish.vm.network :private_network, ip: ""
  end

  # Apache.
  config.vm.define "www1" do |www1|
    www1.vm.hostname = "www1.dev"
    www1.vm.network :private_network, ip: ""

    www1.vm.provision "shell",
      inline: "sudo yum update -y"

    www1.vm.provider :virtualbox do |v|
      v.customize ["modifyvm", :id, "--memory", 256]
    end
  end

  # Apache.
  config.vm.define "www2" do |www2|
    www2.vm.hostname = "www2.dev"
    www2.vm.network :private_network, ip: ""

    www2.vm.provision "shell",
      inline: "sudo yum update -y"

    www2.vm.provider :virtualbox do |v|
      v.customize ["modifyvm", :id, "--memory", 256]
    end
  end

  # MySQL.
  config.vm.define "db1" do |db1|
    db1.vm.hostname = "db1.dev"
    db1.vm.network :private_network, ip: ""
  end

  # MySQL.
  config.vm.define "db2" do |db2|
    db2.vm.hostname = "db2.dev"
    db2.vm.network :private_network, ip: ""
  end

  # Memcached.
  config.vm.define "memcached" do |memcached|
    memcached.vm.hostname = "memcached.dev"
    memcached.vm.network :private_network, ip: ""

    # Run Ansible provisioner once for all VMs at the end.
    memcached.vm.provision "ansible" do |ansible|
      ansible.playbook = "configure.yml"
      ansible.inventory_path = "inventories/vagrant/inventory"
      ansible.limit = "all"
      ansible.extra_vars = {
        ansible_ssh_user: 'vagrant',
        ansible_ssh_private_key_file: "~/.vagrant.d/insecure_private_key"
      }
    end
  end
end

Most of the Vagrantfile is straightforward, and similar to other examples used in this book. The last block of code, which defines the ansible provisioner configuration, contains three extra values that are important for our purposes:

ansible.inventory_path = "inventories/vagrant/inventory"
ansible.limit = "all"
ansible.extra_vars = {
  ansible_ssh_user: 'vagrant',
  ansible_ssh_private_key_file: "~/.vagrant.d/insecure_private_key"
}

  1. ansible.inventory_path defines an inventory file to be used with the ansible.playbook. You could certainly create a dynamic inventory script for use with Vagrant, but because we know the IP addresses ahead of time, and are expecting a few specially-crafted inventory group names, it’s simpler to build the inventory file for Vagrant provisioning by hand (we’ll do this next).
  2. ansible.limit is set to all so Vagrant knows it should run the Ansible playbook connected to all VMs, and not just the current VM. You could technically use ansible.limit with a provisioner configuration for each of the individual VMs, and just run the VM-specific playbook through Vagrant, but our live production infrastructure will be using one playbook to configure all the servers, so we’ll do the same locally.
  3. ansible.extra_vars contains the vagrant SSH user configuration for Ansible. It’s more standard to include these settings in a static inventory file or use Vagrant’s automatically-generated inventory file, but it’s easiest to set them once for all servers here.
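As a rough illustration of the hand-built static inventory mentioned in the first item, the following Python sketch renders group names and hosts into INI-style inventory text. It is not the project’s actual inventory file: the build_inventory helper is hypothetical, the addresses are documentation placeholders, and only the group names come from the playbooks in this chapter.

```python
def build_inventory(groups):
    """Render a mapping of group name -> list of hosts as INI inventory text."""
    sections = []
    for group, hosts in groups.items():
        sections.append("[%s]\n%s" % (group, "\n".join(hosts)))
    return "\n\n".join(sections) + "\n"


# Hypothetical addresses from the 192.0.2.0/24 documentation range.
example = {
    "lamp-varnish": ["192.0.2.2"],
    "lamp-www": ["192.0.2.3", "192.0.2.4"],
    "lamp-db": ["192.0.2.5", "192.0.2.6"],
    "lamp-memcached": ["192.0.2.7"],
}
print(build_inventory(example))
```

In practice the file is short enough to write by hand, which is the approach taken here; the sketch only shows the shape such a file takes.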

Before running vagrant up to see the fruits of our labor, we need to create an inventory file for Vagrant at inventories/vagrant/inventory:




Now cd into the project’s root directory, run vagrant up, and after ten or fifteen minutes, load the Varnish server’s IP address in your browser. Voila!

Highly Available Infrastructure - Success!

You should see something like the above screenshot; the PHP app simply displays the current app server’s IP address, the individual MySQL servers’ status, and the Memcached server status. Refresh the page a few times to verify Varnish is distributing requests randomly between the two app servers.

We have local infrastructure development covered, and Ansible makes it easy to use the exact same configuration to build our infrastructure in the cloud.

Provisioner Configuration: DigitalOcean

In Chapter 7, we learned that provisioning and configuring DigitalOcean droplets in an Ansible playbook is fairly simple. But we need to take provisioning a step further by provisioning multiple droplets (one for each server in our infrastructure) and dynamically grouping them so we can configure them after they are booted and online.

For the sake of flexibility, let’s create a playbook for our DigitalOcean droplets in provisioners/digitalocean.yml. This will allow us to add other provisioner configurations later, alongside the digitalocean.yml playbook. As with our example in Chapter 7, we will use a local connection to provision cloud instances. Begin the playbook with:

- hosts: localhost
  connection: local
  gather_facts: false

Next we need to define some metadata to describe each of our droplets. For simplicity’s sake, we’ll inline the droplets variable in this playbook:

  vars:
    droplets:
      - { name: a4d.lamp.varnish, group: "lamp-varnish" }
      - { name: a4d.lamp.www.1, group: "lamp-www" }
      - { name: a4d.lamp.www.2, group: "lamp-www" }
      - { name: a4d.lamp.db.1, group: "lamp-db" }
      - { name: a4d.lamp.db.2, group: "lamp-db" }
      - { name: a4d.lamp.memcached, group: "lamp-memcached" }

Each droplet is an object with two keys:

  • name: The name of the Droplet for DigitalOcean’s listings and Ansible’s host inventory.
  • group: The Ansible inventory group for the droplet.

Next we need to add a task to create the droplets, using the droplets list as a guide, and as part of the same task, register each droplet’s information in a separate dictionary, created_droplets:

  tasks:
    - name: Provision DigitalOcean droplets.
      digital_ocean:
        state: "{{ item.state | default('present') }}"
        command: droplet
        name: "{{ item.name }}"
        private_networking: yes
        size_id: "{{ item.size | default(66) }}" # 512mb
        image_id: "{{ item.image | default(6372108) }}" # CentOS 6 x64.
        region_id: "{{ item.region | default(4) }}" # NYC2
        ssh_key_ids: "{{ item.ssh_key | default('138954') }}" # geerlingguy
        unique_name: yes
      register: created_droplets
      with_items: droplets

Many of the options (e.g. size_id) are defined as {{ item.property | default('default_value') }}, which allows us to use optional variables per droplet. For any of the defined droplets, we could add size: 72 (or whatever valid value you’d like), and it would override the default value of size_id set in the task.
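The override-with-fallback behavior of the default filter is close to merging a per-droplet dictionary over a dictionary of defaults. The Python sketch below shows that idea; droplet_settings and DEFAULTS are illustrative names, and the default values are simply copied from the task above.

```python
# Defaults copied from the provisioning task above.
DEFAULTS = {"size": 66, "image": 6372108, "region": 4, "ssh_key": "138954"}


def droplet_settings(item, defaults=DEFAULTS):
    """Merge one droplet definition over the task-level defaults."""
    settings = dict(defaults)  # copy, so the defaults are never mutated
    settings.update(item)      # keys set on the droplet win
    return settings


plain = droplet_settings({"name": "a4d.lamp.www.1", "group": "lamp-www"})
bigger = droplet_settings(
    {"name": "a4d.lamp.db.1", "group": "lamp-db", "size": 72}
)
print(plain["size"], bigger["size"])
```

Only the droplet that sets its own size departs from the default; every other key still falls back to the shared values.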

You could specify an SSH public key per droplet, or (as in this instance) use the same key for all hosts by providing a default. In this case, I added an SSH key to my DigitalOcean account, then used the DigitalOcean API to retrieve the key’s numeric ID (as described in the previous chapter).

It’s best to use key-based authentication and add at least one SSH key to your DigitalOcean account so Ansible can connect using keys instead of insecure passwords, especially since these instances will be created with only a root account.

We loop through all the defined droplets using with_items: droplets, and after each droplet is created add the droplet’s metadata (name, IP address, etc.) to the created_droplets variable. Next, we’ll loop through that variable to build our inventory on-the-fly so our configuration applies to the correct servers:

    - name: Add DigitalOcean hosts to their respective inventory groups.
      add_host:
        name: "{{ item.1.droplet.ip_address }}"
        groups: "do,{{ droplets[item.0].group }},{{ item.1.droplet.name }}"
        # You can dynamically add inventory variables per-host.
        ansible_ssh_user: root
        mysql_replication_role: >
          {{ 'master' if (item.1.droplet.name == 'a4d.lamp.db.1')
          else 'slave' }}
        mysql_server_id: "{{ item.0 }}"
      when: item.1.droplet is defined
      with_indexed_items: created_droplets.results
You’ll notice a few interesting things happening in this task:

  • This is the first time we’ve used with_indexed_items. The reason for using this less-common loop feature is to add a sequential and unique mysql_server_id. Though only the MySQL servers need a server ID set, it’s simplest to dynamically create the variable for every server, so it’s available when needed. with_indexed_items simply sets item.0 to the index of the item in the list, and item.1 to the item itself.
  • with_indexed_items also helps us reliably set each droplet’s group. Because the v1 DigitalOcean API doesn’t support features like tags for Droplets, we need to set up the groups on our own. Using the droplets variable we manually created earlier allows us to set the proper group for a particular droplet.
  • Finally we add inventory variables per-host in add_host by adding the variable name as a key, and the variable value as the key’s value. Simple, but powerful!
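For readers who think in Python, with_indexed_items behaves much like enumerate(). The sketch below mimics the add_host loop under that analogy. The shape of the result dictionaries and the addresses are assumptions made for the example, not output captured from the digital_ocean module.

```python
DROPLETS = [
    {"name": "a4d.lamp.varnish", "group": "lamp-varnish"},
    {"name": "a4d.lamp.www.1", "group": "lamp-www"},
    {"name": "a4d.lamp.www.2", "group": "lamp-www"},
    {"name": "a4d.lamp.db.1", "group": "lamp-db"},
    {"name": "a4d.lamp.db.2", "group": "lamp-db"},
    {"name": "a4d.lamp.memcached", "group": "lamp-memcached"},
]


def build_hosts(droplets, results):
    """Pair each result with its index, like with_indexed_items does."""
    hosts = []
    for index, result in enumerate(results):
        droplet = result.get("droplet")
        if droplet is None:  # same guard as `when: item.1.droplet is defined`
            continue
        is_master = droplet["name"] == "a4d.lamp.db.1"
        hosts.append({
            "name": droplet["ip_address"],
            # The group comes from our own list, looked up by index.
            "groups": ["do", droplets[index]["group"], droplet["name"]],
            "mysql_replication_role": "master" if is_master else "slave",
            "mysql_server_id": index,
        })
    return hosts


# Hypothetical results; addresses are 192.0.2.0/24 documentation placeholders.
RESULTS = [
    {"droplet": {"name": d["name"], "ip_address": "192.0.2.%d" % (i + 2)}}
    for i, d in enumerate(DROPLETS)
]
for host in build_hosts(DROPLETS, RESULTS):
    print(host["name"], host["groups"], host["mysql_server_id"])
```

The index does double duty: it selects the matching entry in the droplets list and it becomes the per-host mysql_server_id.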

There are a few different ways you can approach dynamic provisioning and inventory management for your infrastructure, and, especially if you are only targeting one cloud hosting provider, there are ways to avoid using more exotic features of Ansible (e.g. with_indexed_items) and complex if/else conditions. This example is slightly more complex due to the fact that the playbook is being created to be interchangeable with other similar provisioning playbooks.

The final step in our provisioning is to make sure all the droplets are booted and can be reached via SSH, so at the end of the digitalocean.yml playbook, add another play to be run on hosts in the do group we just defined:

- hosts: do
  remote_user: root

  tasks:
    - name: Wait for port 22 to become available.
      local_action: "wait_for port=22 host={{ inventory_hostname }}"

Once we know port 22 is reachable, we know the droplet is up and ready for configuration.
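Conceptually, waiting for a port means retrying a TCP connection until it succeeds or a deadline passes. The Python sketch below shows that loop; it is not how the wait_for module is implemented, and the function name, timeout, and retry interval are arbitrary choices for the example.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A call such as wait_for_port(address, 22) would block until SSH accepts connections on that droplet, or give up after the timeout.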

We’re almost ready to provision and configure our entire infrastructure on DigitalOcean, but we need to create one last playbook to tie everything together. Create provision.yml in the project root with the following contents:

- include: provisioners/digitalocean.yml
- include: configure.yml

That’s it! Now, assuming you set the environment variables DO_CLIENT_ID and DO_API_KEY, you can run $ ansible-playbook provision.yml to provision and configure the infrastructure on DigitalOcean.

The entire process should take about 15 minutes, and once it’s complete, you should see something like:

PLAY RECAP *****************************************************************
                           : ok=19   changed=13   unreachable=0    failed=0
                           : ok=13   changed=8    unreachable=0    failed=0
                           : ok=40   changed=18   unreachable=0    failed=0
                           : ok=27   changed=16   unreachable=0    failed=0
                           : ok=24   changed=15   unreachable=0    failed=0
                           : ok=40   changed=18   unreachable=0    failed=0
localhost                  : ok=2    changed=1    unreachable=0    failed=0

Visit the IP address of the varnish server and you should be greeted with a status page similar to the one generated by the Vagrant-based infrastructure:

Highly Available Infrastructure on DigitalOcean.

Because everything in this playbook is idempotent, running $ ansible-playbook provision.yml again should report no changes, and helps you verify that everything is running correctly.

Ansible will also rebuild and reconfigure any droplets that may be missing from your infrastructure. If you’re daring, and want to test this feature, just log into your DigitalOcean account, delete one of the droplets just created by this playbook (maybe one of the two app servers), then run the playbook again.
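The reason a re-run only recreates what is missing can be pictured as a comparison between the desired droplet names and the ones that already exist. The sketch below is a simplified model of that check, not the digital_ocean module’s code; plan_droplets is a hypothetical helper.

```python
def plan_droplets(desired_names, existing_names):
    """Return the names that must be created so every desired droplet exists."""
    existing = set(existing_names)
    return [name for name in desired_names if name not in existing]


desired = ["a4d.lamp.www.1", "a4d.lamp.www.2"]
print(plan_droplets(desired, []))                  # first run: create both
print(plan_droplets(desired, desired))             # second run: nothing to do
print(plan_droplets(desired, ["a4d.lamp.www.1"]))  # one deleted: recreate it
```

This is the role played by unique_name: yes in the task: the droplet name, rather than a generated ID, decides whether a droplet already exists.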

Now that we’ve tested our infrastructure on DigitalOcean, we can destroy the droplets just as easily (change the state parameter in provisioners/digitalocean.yml to default to 'absent' and run $ ansible-playbook provision.yml again).

Next up, we’ll build the infrastructure a third time—on Amazon’s infrastructure.

Provisioner Configuration: Amazon Web Services (EC2)

For Amazon Web Services, provisioning works slightly differently. Amazon has a broader ecosystem of services surrounding EC2 instances, and for our particular example, we will need to configure security groups prior to provisioning instances.

To begin, create aws.yml inside the provisioners directory and begin the playbook the same way as with DigitalOcean:

- hosts: localhost
  connection: local
  gather_facts: false

EC2 instances use security groups as an AWS-level firewall (which operates outside the individual instance’s OS). We will need to define a list of security_groups alongside our EC2 instances. First, the instances:

  vars:
    instances:
      - {
        name: a4d.lamp.varnish,
        group: "lamp-varnish",
        security_group: ["default", "a4d_lamp_http"]
      }
      - {
        name: a4d.lamp.www.1,
        group: "lamp-www",
        security_group: ["default", "a4d_lamp_http"]
      }
      - {
        name: a4d.lamp.www.2,
        group: "lamp-www",
        security_group: ["default", "a4d_lamp_http"]
      }
      - {
        name: a4d.lamp.db.1,
        group: "lamp-db",
        security_group: ["default", "a4d_lamp_db"]
      }
      - {
        name: a4d.lamp.db.2,
        group: "lamp-db",
        security_group: ["default", "a4d_lamp_db"]
      }
      - {
        name: a4d.lamp.memcached,
        group: "lamp-memcached",
        security_group: ["default", "a4d_lamp_memcached"]
      }

Inside the instances variable, each instance is an object with three keys:

  • name: The name of the instance, which we’ll use to tag the instance and ensure only one instance is created per name.
  • group: The Ansible inventory group in which the instance should belong.
  • security_group: A list of security groups into which the instance will be placed. The default security group is added to your AWS account upon creation, and has one rule to allow outgoing traffic on any port to any IP address.

If you use AWS exclusively, it would be best to use autoscaling groups and change the design of this infrastructure a bit. For this example, we just need to ensure that the six instances we explicitly define are created, so we’re using particular names and an exact_count to enforce the 1:1 relationship.

With our instances defined, we’ll next define a security_groups variable containing all the required security group configuration for each server:

    security_groups:
      - name: a4d_lamp_http
        rules:
          - { proto: tcp, from_port: 80, to_port: 80, cidr_ip: }
          - { proto: tcp, from_port: 22, to_port: 22, cidr_ip: }
        rules_egress: []
      - name: a4d_lamp_db
        rules:
          - { proto: tcp, from_port: 3306, to_port: 3306, cidr_ip: }
          - { proto: tcp, from_port: 22, to_port: 22, cidr_ip: }
        rules_egress: []
      - name: a4d_lamp_memcached
        rules:
          - { proto: tcp, from_port: 11211, to_port: 11211, cidr_ip: }
          - { proto: tcp, from_port: 22, to_port: 22, cidr_ip: }
        rules_egress: []

Each security group has a name (which was used to identify the security group in the instances list), rules (a list of firewall rules like protocol, ports, and IP ranges to limit incoming traffic), and rules_egress (a list of firewall rules to limit outgoing traffic).

We need three security groups: a4d_lamp_http to open port 80, a4d_lamp_db to open port 3306, and a4d_lamp_memcached to open port 11211.
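To see how an instance’s list of security groups translates into open ports, here is a small Python model of the rules above. It is deliberately simplified: it ignores source IP ranges and the rules of the default group, and port_allowed is a made-up helper rather than anything AWS or Ansible provides.

```python
# Ingress rules mirroring the security_groups variable (CIDR ranges omitted).
SECURITY_GROUPS = {
    "a4d_lamp_http": [
        {"proto": "tcp", "from_port": 80, "to_port": 80},
        {"proto": "tcp", "from_port": 22, "to_port": 22},
    ],
    "a4d_lamp_db": [
        {"proto": "tcp", "from_port": 3306, "to_port": 3306},
        {"proto": "tcp", "from_port": 22, "to_port": 22},
    ],
    "a4d_lamp_memcached": [
        {"proto": "tcp", "from_port": 11211, "to_port": 11211},
        {"proto": "tcp", "from_port": 22, "to_port": 22},
    ],
}


def port_allowed(group_names, port, proto="tcp"):
    """Return True if any rule in any of the named groups admits the port."""
    for name in group_names:
        # Groups not modeled here (such as "default") contribute no rules.
        for rule in SECURITY_GROUPS.get(name, []):
            in_range = rule["from_port"] <= port <= rule["to_port"]
            if rule["proto"] == proto and in_range:
                return True
    return False


print(port_allowed(["default", "a4d_lamp_http"], 80))    # web traffic
print(port_allowed(["default", "a4d_lamp_http"], 3306))  # MySQL on a web host
```

Under this model a web server accepts HTTP and SSH but not MySQL traffic, which matches the intent of giving each server type its own group.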

Now that we have all the data we need to set up security groups and instances, the first task needs to create or verify the existence of the security groups:

  tasks:
    - name: Configure EC2 Security Groups.
      ec2_group:
        name: "{{ item.name }}"
        description: Example EC2 security group for A4D.
        region: "{{ item.region | default('us-west-2') }}" # Oregon
        state: present
        rules: "{{ item.rules }}"
        rules_egress: "{{ item.rules_egress }}"
      with_items: security_groups

The ec2_group module requires a name, region, and rules for each security group. Security groups will be created if they don’t exist, modified to match the supplied values if they do exist, or simply verified if they exist and match the given values.

With the security groups configured, we can provision the defined EC2 instances by looping through instances with the ec2 module:

    - name: Provision EC2 instances.
      ec2:
        key_name: "{{ item.ssh_key | default('jeff_mba_home') }}"
        instance_tags:
          inventory_group: "{{ item.group | default('') }}"
          inventory_host: "{{ item.name | default('') }}"
        group: "{{ item.security_group | default('') }}"
        instance_type: "{{ item.type | default('t2.micro')}}" # Free Tier
        image: "{{ item.image | default('ami-11125e21') }}" # RHEL6 x64 hvm
        region: "{{ item.region | default('us-west-2') }}" # Oregon
        wait: yes
        wait_timeout: 500
        exact_count: 1
        count_tag:
          inventory_group: "{{ item.group | default('') }}"
          inventory_host: "{{ item.name | default('') }}"
      register: created_instances
      with_items: instances

This example is slightly more complex than the DigitalOcean example, and a few parts warrant a deeper look:

  • EC2 allows SSH keys to be defined by name—in my case, I have a key jeff_mba_home in my AWS account. You should set the key_name default to a key that you have in your account.
  • Instance tags are tags that AWS will attach to your instance, for categorization purposes. By giving a list of keys and values, I can then use that list later in the count_tag parameter.
  • t2.micro was used as the default instance type, since it falls within EC2’s free tier usage. If you just set up an account and keep all AWS resource usage within free tier limits, you won’t be billed anything.
  • exact_count and count_tag work together to ensure AWS provisions only one of each of the instances we defined. The count_tag tells the ec2 module to match the given group + host and then exact_count tells the module to only provision 1 instance. If you wanted to remove all your instances, you could set exact_count to 0 and run the playbook again.
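The interplay between exact_count and count_tag can be modeled as: count the running instances whose tags match, then launch or terminate the difference. The following Python sketch captures that arithmetic only; instances_to_change and the instance dictionaries are assumptions for illustration, not the ec2 module’s internals.

```python
def instances_to_change(running, count_tag, exact_count):
    """Return how many instances to launch (positive) or terminate (negative)."""
    matching = [
        inst for inst in running
        if all(inst.get("tags", {}).get(k) == v for k, v in count_tag.items())
    ]
    return exact_count - len(matching)


running = [
    {"tags": {"inventory_group": "lamp-www",
              "inventory_host": "a4d.lamp.www.1"}},
]
www1 = {"inventory_group": "lamp-www", "inventory_host": "a4d.lamp.www.1"}
www2 = {"inventory_group": "lamp-www", "inventory_host": "a4d.lamp.www.2"}

print(instances_to_change(running, www1, 1))  # already present
print(instances_to_change(running, www2, 1))  # missing, so launch one
print(instances_to_change(running, www1, 0))  # exact_count 0 removes it
```

Because both the group and the host tag must match, each named instance is counted on its own, which is what keeps the 1:1 relationship between the instances list and what runs in AWS.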

Each provisioned instance will have its metadata added to the registered created_instances variable, which we’ll use to build Ansible inventory groups for the server configuration playbooks.

    - name: Add EC2 instances to their respective inventory groups.
      add_host:
        name: "{{ item.1.tagged_instances.0.public_ip }}"
        groups: "aws,{{ item.1.item.group }},{{ item.1.item.name }}"
        # You can dynamically add inventory variables per-host.
        ansible_ssh_user: ec2-user
        mysql_replication_role: >
          {{ 'master' if (item.1.item.name == 'a4d.lamp.db.1')
          else 'slave' }}
        mysql_server_id: "{{ item.0 }}"
      when: item.1.instances is defined
      with_indexed_items: created_instances.results

This add_host example is slightly simpler than the one for DigitalOcean, because AWS attaches metadata to EC2 instances which we can re-use when building groups or hostnames (e.g. item.1.item.group). We don’t have to use list indexes to fetch group names from the original instances variable.

We still use with_indexed_items so we can use the index to generate a unique ID per server for use in building the MySQL master-slave replication.

The final step in provisioning the EC2 instances is to ensure we can connect to them before continuing, and to set selinux into permissive mode so the configuration we supply will work correctly.

# Run some general configuration on all AWS hosts.
- hosts: aws
  gather_facts: false

  tasks:
    - name: Wait for port 22 to become available.
      local_action: "wait_for port=22 host={{ inventory_hostname }}"

    - name: Set selinux into 'permissive' mode.
      selinux: policy=targeted state=permissive
      sudo: yes

Since we defined ansible_ssh_user as ec2-user in the dynamically-generated inventory above, we need to ensure the selinux task runs with sudo explicitly.

Now, modify the provision.yml file in the root of the project folder, and change the provisioners include to look like the following:

- include: provisioners/aws.yml
- include: configure.yml

Assuming the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set in your current terminal session, you can run $ ansible-playbook provision.yml to provision and configure the infrastructure on AWS.

The entire process should take about 15 minutes, and once it’s complete, you should see something like:

PLAY RECAP *****************************************************************
                           : ok=24   changed=16   unreachable=0    failed=0
                           : ok=40   changed=19   unreachable=0    failed=0
                           : ok=40   changed=19   unreachable=0    failed=0
                           : ok=13   changed=9    unreachable=0    failed=0
                           : ok=27   changed=17   unreachable=0    failed=0
                           : ok=19   changed=14   unreachable=0    failed=0
localhost                  : ok=3    changed=1    unreachable=0    failed=0

Visit the IP address of the varnish server (the first server configured) and you should be greeted with a status page similar to the one generated by the Vagrant and DigitalOcean-based infrastructure:

Highly Available Infrastructure on AWS EC2.

As with the earlier examples, running ansible-playbook provision.yml again should produce no changes, because everything in this playbook is idempotent. And if one of your instances were terminated, running the playbook again would recreate and reconfigure the instance in a few minutes.

To terminate all the provisioned instances, you can change the exact_count in the ec2 task to 0, and run $ ansible-playbook provision.yml again.


In the above example, an entire highly-available PHP application infrastructure was defined in a series of short Ansible playbooks, and then provisioning configuration was created to build the infrastructure on either local VMs, DigitalOcean droplets, or AWS EC2 instances.

Once you start working on building infrastructure this way — abstracting individual servers, then abstracting cloud provisioning — you’ll start to see some of Ansible’s true power in being more than just a configuration management tool. Imagine being able to create your own multi-datacenter, multi-provider infrastructure with Ansible and some basic configuration.

While Amazon, DigitalOcean, Rackspace and other hosting providers have their own tooling and unique infrastructure merits, the agility and flexibility afforded by building infrastructure in a provider-agnostic fashion lets you treat hosting providers as commodities, and gives you freedom to build more reliable, performant, and simple application infrastructure.

Even if you plan on running everything within one hosting provider’s network (or in a private cloud, or even on a few bare metal servers), Ansible provides deep stack-specific integration so you can do whatever you need to do and manage the provider’s services within your playbooks.

You can find the entire contents of this example in the Ansible for DevOps GitHub repository, in the lamp-infrastructure directory.

Purchase Ansible for DevOps on LeanPub, Amazon, or iTunes.