Provision with Ansible from inside Docker


There are many deployment tools, such as Puppet, Chef and SaltStack, and most of them are pull-based. That means when you deploy to a machine, the provisioning code is downloaded to the target machine and run locally. Ansible, unlike many others, is a push-based deployment tool: instead of pulling code, it pushes commands to the target machine over SSH. A push-based approach is great in many situations. For example, you don’t need to install the Ansible runtime on the target machine; you can simply provision it. However, this approach also has shortcomings. Say you want to provision EC2 instances in an AWS auto-scaling group. You don’t know when a new instance will be launched, and when it happens, it needs to be provisioned immediately. In this case, Ansible’s push approach is not that useful, since the target machine needs to be provisioned on demand.

There are several ways to solve that problem, that is, to run Ansible provisioning code in a pull manner.

Ansible-pull

One obvious approach is to use ansible-pull, an Ansible command line tool that clones your Ansible git repo and runs the playbooks locally. It works, but there are some drawbacks. The first is the dependency issue: to run ansible-pull on the target machine, you need to install the Ansible runtime on that machine first, and if your playbook depends on a newer version of Ansible, you need to find a way to upgrade the runtime. Another problem is that the provisioning code is installed via git or another version control system, so it’s hard to verify the integrity of the playbooks, and the code cannot be shipped as a single file.
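To make that concrete, here is a minimal sketch of what a boot-time pull could look like. The repository URL and playbook name are made up for illustration, and the run helper is only there so the command can be printed instead of executed.

```shell
#!/bin/sh
# Sketch only: REPO_URL and PLAYBOOK are hypothetical values.
REPO_URL="https://example.com/ops/provisioning.git"
PLAYBOOK="site.yml"

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Clone the repo given by -U and run the playbook against the local machine.
# Ansible must already be installed on the target for this to work.
provision_with_pull() {
    run ansible-pull -U "$REPO_URL" "$PLAYBOOK"
}
```

Calling provision_with_pull from a boot script would do the actual provisioning; with DRY_RUN=1 it only prints the command.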

Ansible Tower

Ansible Tower is the official commercial tool for managing and running Ansible. One interesting feature it provides is the so-called “phone home”. It works like this: when a new machine is launched, it makes an HTTP request to the Ansible Tower server, just like calling home and saying

hey! I’m ready, please provision me

Then the server runs ansible-playbook against the machine. It works, but one problem we see is that for Ansible Tower to SSH into different machines and run sudo commands, you usually need to install your SSH private key on the Tower server and preinstall the corresponding public key on all the other machines. Allowing one machine to SSH into all other machines makes me feel uncomfortable; it’s like putting all your eggs in one basket. You could set a passphrase on the private key on the Tower server, but since machines in an AWS auto-scaling group need to be provisioned at any time, with nobody around to type it in, you cannot really encrypt the private key with a passphrase.
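For illustration, the phone home request from a freshly booted instance might look roughly like the sketch below. The server name, job template ID and host config key are hypothetical, and the callback URL layout is my understanding of Tower’s provisioning callback feature, so check it against the documentation for your Tower version.

```shell
#!/bin/sh
# Sketch only: all three values below are hypothetical placeholders.
TOWER_HOST="tower.example.com"
JOB_TEMPLATE_ID="1"
HOST_CONFIG_KEY="replace-with-your-key"

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Ask the Tower server to run the job template against this machine.
# The URL path is an assumption about the callback endpoint layout.
phone_home() {
    run curl --fail --silent \
        --data "host_config_key=$HOST_CONFIG_KEY" \
        "https://$TOWER_HOST/api/v2/job_templates/$JOB_TEMPLATE_ID/callback/"
}
```

Note that the request itself carries no SSH credentials; the server still needs its own key to log in to the machine afterwards, which is exactly the part that worries me.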

An interesting approach - Docker

With the requirements in mind

  • No runtime dependency issues
  • Provisioning code can be shipped as a file
  • Provisioning code integrity can be verified (signed)

an interesting idea came to my mind. Why don’t I simply put the Ansible playbooks into a Docker image, ship the image to the target machine, and then run the playbooks from inside a Docker container, over SSH against the host? With a Docker image, I don’t need to worry about Ansible dependency issues: the Ansible runtime itself and many other necessary libraries, such as boto, can all be installed into the image. And the Docker image can be shipped as a single file, which we can sign and then verify on the target machine to ensure its integrity.
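Here is a rough sketch of the ship-as-a-file idea. To keep it short, a SHA-256 checksum stands in for a real signature; that is my simplification, and a checksum alone only detects corruption or tampering of the file, not who produced it, so in practice you would probably want something like a detached GPG signature. The file name and helper names are made up.

```shell
#!/bin/sh
# Sketch only: a checksum is used as a stand-in for a real signature.

# On the build machine, export the image to a single file (hypothetical name):
#   sudo docker save -o ansible-examples.tar ansible-examples

# Record the SHA-256 checksum of a file next to it, as <file>.sha256.
write_checksum() {
    sha256sum "$1" > "$1.sha256"
}

# Return 0 only if the file still matches its recorded checksum.
verify_checksum() {
    sha256sum -c "$1.sha256" >/dev/null 2>&1
}

# On the target machine, load the image only after it has been verified:
#   verify_checksum ansible-examples.tar && sudo docker load -i ansible-examples.tar
```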

A simple example

I wrote a simple project to demonstrate this idea; the project can be found on GitHub. It’s actually pretty easy. In the Dockerfile, we install Ansible and its dependencies, plus the necessary roles. We also copy our own site.yml into the Docker image.

FROM phusion/baseimage:0.9.15

# Python and pip are needed to install and run Ansible
RUN apt-get update && \
    apt-get install -y python python-dev python-pip && \
    apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Ansible itself, plus the Galaxy roles the playbook needs
RUN pip install ansible
RUN ansible-galaxy install \
    Ansibles.hostname \
    Ansibles.apt \
    Ansibles.build-essential \
    Ansibles.perl \
    Ansibles.monit \
    ANXS.nginx

# Our own playbook
COPY site.yml /srv/ansible/site.yml

CMD ["/sbin/my_init"]

You can build the ansible image with

sudo docker build -t ansible-examples .

Before you run it, you need to create a hosts file containing the private IP address of your host machine, like this

hosts:

10.0.2.15
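If you would rather generate that file than write it by hand, something like this sketch could work. It assumes a Linux host where hostname -I is available and where the first address it lists is one the container can reach; both are assumptions on my part, and the helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Pick the first address from a space-separated list of addresses.
first_address() {
    echo "$1" | awk '{print $1}'
}

# Write a one-line Ansible inventory containing the given address.
write_inventory() {
    printf '%s\n' "$1" > "$2"
}

# Possible usage on the host (assumes hostname -I exists there):
#   write_inventory "$(first_address "$(hostname -I)")" /vagrant/hosts
```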

Note that since Ansible is executed inside the Docker container, localhost simply doesn’t work. You need to specify an address that is accessible from the Docker container network. To allow SSH connections from the container, you also need to install a temporary SSH public key on the host machine and provide the matching private key to the container. Here is pretty much the command you run

sudo docker run -it \
    -v /vagrant/hosts:/tmp/hosts \
    -v /vagrant/insecure_private_key:/tmp/insecure_private_key \
    ansible-examples \
    /sbin/my_init --skip-startup-files --skip-runit -- \
    ansible-playbook /srv/ansible/site.yml \
    -i /tmp/hosts -u vagrant --private-key=/tmp/insecure_private_key

We map our hosts file into the Docker container at /tmp/hosts, and the SSH private key at /tmp/insecure_private_key, then we can use them in the ansible-playbook command arguments. That’s it!
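The temporary key pair can be created just before the run and removed right after it. Below is a minimal sketch of that lifecycle; the key path and helper names are hypothetical, and it assumes the public key file holds a single line.

```shell
#!/bin/sh
# Sketch only: helper names and paths are hypothetical.

# Append a public key to an authorized_keys file.
# usage: authorize_key <pubkey-file> <authorized_keys-file>
authorize_key() {
    cat "$1" >> "$2"
}

# Remove that exact key line again once provisioning is done.
# usage: revoke_key <pubkey-file> <authorized_keys-file>
revoke_key() {
    # -F: fixed string, -x: whole line, -f: read the pattern from the key file.
    # grep exits non-zero when nothing is left, which is fine here.
    grep -v -x -F -f "$1" "$2" > "$2.tmp"
    mv "$2.tmp" "$2"
}

# Possible flow around the docker run command:
#   ssh-keygen -q -t rsa -N "" -f /tmp/provision_key
#   authorize_key /tmp/provision_key.pub ~/.ssh/authorized_keys
#   ... docker run with -v /tmp/provision_key:/tmp/insecure_private_key ...
#   revoke_key /tmp/provision_key.pub ~/.ssh/authorized_keys
#   rm -f /tmp/provision_key /tmp/provision_key.pub
```

One thing to watch: revoke_key rewrites authorized_keys through a temporary file, so the file mode follows your umask afterwards.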

It’s so powerful to combine Ansible and Docker

Combining Ansible and Docker is really powerful. As you can see, the software for provisioning machines is now packed as a Docker image, so you can run it anywhere. It’s a solid unit: you can sign it, verify it, tag it, ship it, share it and test it. Everything is installed in the container, so you don’t need to worry about missing plugins or roles on the target machine.

The only drawback I can think of is that you need to install Docker on the target machine before you can use this approach, but that’s not much of a problem since Docker is getting more and more popular, and you can preinstall it in your AMI. The only thing I am not happy with in Docker is the image registry system: it’s very slow to push or pull an image if it has many layers and is big. Actually, I have an idea for building a much better Docker registry; hopefully I will have time to do it.

I am already using this approach to provision machines in our production environment, and it works like a charm so far. I am looking forward to seeing people use this technique to pack deployment code into Docker images. Imagine this:

sudo docker pull ansible-open-stack-swift
sudo docker run -it ansible-open-stack-swift

and boom! You now have a fully functional Swift cluster in AWS EC2. Isn’t that awesome?
