Less automation with Ansible


  • Sun 08 May 2016
  • misc

I've been automating a lot of my world with Ansible lately. But sometimes I want a little less automation than Ansible is willing to provide.

Case in point is my "hmail" (that's "hosted mail", not "hate mail") environment. There are several SmartOS VMs here, a dozen at last count, to provide mail services for family and friends, often with subtly different policies such as different RBLs, greylisting timing, local whitelisting, etc. to accommodate the individual whims of the user community on that VM.

My preferred workflow is to update a single VM and then check it for correctness before proceeding to the next one. Allowing Ansible to fully automate the process would be awesome in theory, but since the VMs all share a single zpool, lack of serialization will result in more downtime for each user if things go right, and a big mess and protracted downtime if I get things wrong in some way and need to clean up. After all, one of the great things about automation is that you can automate shooting yourself in the foot.

In practice, I like to do one upgrade per night, perform a cursory functionality check, and then wait 24 hours for the phone to ring before doing the next one. Kind of the lazy man's way of keeping an eye on your KPIs.

This is easy to do: just add "--limit somehost.hmail.example.org" to your ansible-playbook command line. But what if you forget? It would be mighty inconvenient to realize that you'd forgotten this one key element just as "- hosts: all_hmail" got executed, eh?

We want to throw an error if we forgot to limit to a single host. After some searching around, I found this nifty task:

{% raw %}
---

tasks:
# Abort unless --limit narrowed this run to exactly one host
- name: Check to enforce single host
  fail: msg="Didn't limit to single host, try --limit hostname.example.com"
  when: play_hosts|length != 1

- debug: msg='Made it this far...'

{% endraw %}

The debug: of course is by way of illustration, not part of the conditional.

OK, so far so good, but the "roles:" get executed before the "tasks:". How do we fix this? Do we need to create a separate role just to execute this one task? Actually (and unsurprisingly), no. But it's buried in the docs, and you won't necessarily know it if you're a comparative Ansible n00b like me.

{% raw %}
---

- hosts: webservers

  pre_tasks:
  - shell: echo 'hello'

  roles:
  - { role: some_role }

  tasks:
  - shell: echo 'still busy'

  post_tasks:
  - shell: echo 'goodbye'

{% endraw %}

So we can make this the first pre_task and it will have the desired effect - bring the playbook to a sudden halt if we got it wrong and forgot to limit the playbook run.
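Putting the two pieces together, the whole thing might look roughly like the sketch below. Treat it as an illustration, not my actual playbook: "all_hmail" is the group mentioned above, "some_role" is just a placeholder for whatever roles the play really runs, and the trailing debug task is only there to show that the play got past the check. I'm also assuming the check is written as a bare expression in "when:", which is how Ansible evaluates conditionals anyway.

```yaml
---
# Sketch only: "some_role" is a placeholder role name.
- hosts: all_hmail

  pre_tasks:
    # pre_tasks run before roles, so a bad invocation stops here
    - name: Check to enforce single host
      fail:
        msg: "Didn't limit to single host, try --limit hostname.example.com"
      when: play_hosts | length != 1

  roles:
    - role: some_role

  tasks:
    - debug:
        msg: "Made it this far..."
```

Run without "--limit", every host in the group fails on the first pre_task and nothing else is touched. Newer Ansible releases list "play_hosts" as deprecated in favour of "ansible_play_batch", so it may be worth swapping the variable name if you see a deprecation warning.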

(Sorry for the mangled YAML, by the way. How I eventually address this will be the subject of a future blog post once I figure it out.)