Sandboxing Ansible – Part 4: Ansible

It’s finally time to start playing with Ansible. Part 1 of this series gives a brief introduction to Ansible as a tool, so I’ll skip over that and go straight into it.

At this point I’ll assume that the Ansible control server and the other nodes are already provisioned. If that’s not the case, then refer to other parts in the series for guidance.

It’s impossible to cover more than a tiny fraction of Ansible in a single post, so this will just be a basic primer. But still, I’m warning you now: this is a very long post.

How Ansible Works

Ansible does not require any complex infrastructure to get started. All you need is the Ansible control server and devices which support SSH or WinRM (yup, Ansible can configure Windows hosts). However, the Ansible control node (it doesn’t have to be a server necessarily) does need to be in the *nix family.

The way that Ansible works is fairly straightforward. This is going to be a gross simplification, but it goes something like this:

  1. The Ansible control node parses playbooks which link specific plays to groups of hosts. These plays consist of modules and generally some associated parameters.
  2. The Ansible server then references the configuration file to see how it should handle the transport and execution. This includes the number of concurrent hosts it should connect to and many other things.
  3. In remote execution mode, the default, Ansible copies a Python package (built from the modules and their parameters) to a temporary directory on the remote host over SSH, executes the code there, and deletes the package afterwards. In local execution mode, which is required when the remote hosts cannot run an appropriate version of Python, the control node executes the module code locally and applies the required changes to the remote host over SSH. This should be avoided when possible.
  4. The results are sent back to the Ansible control node as JSON.

I’m sure Ansible experts would rip it apart, but it’s close enough for our purposes.

It should also be noted that nearly everything in Ansible can and should be stored in Git. Most files related to Ansible are written in YAML.

The key to understanding Ansible is understanding its three primary components: inventory, modules, and playbooks.

Inventory

There are two primary types of inventory files: static and dynamic. Static inventories are basic INI-like files in which each host is explicitly identified. Dynamic inventory is an automatic method of gathering inventory data. You can either create your own dynamic inventory script or use a pre-existing one. Dynamic inventory scripts exist for many of the popular services such as:

  • Spacewalk
  • Cobbler
  • Linode
  • Google Compute Engine
  • Amazon EC2
  • OpenStack
  • Digital Ocean

These can connect to the services and automatically add the nodes to your inventory. Here’s additional documentation on dynamic inventory.

For simplicity, we are going to create a static inventory. I’ll start by opening up my text editor and saving the file to the ansible_data folder which is synced to the Ansible control server. You can name the file whatever you want; I’ll call mine dev-hosts. Once saved, check that it is synced to the control server by typing vagrant ssh control and navigating to ‘/home/vagrant/ansible_data’.

Adding Nodes and Variables

Let’s add some nodes to this inventory file. We need to tell Ansible how to find and communicate with our other Vagrant hosts. This is where variables come in. Since we didn’t do anything with DNS, we’ll just use the hard-coded IPs from the Vagrantfile to identify the hosts. Since we are using folder syncing, you can use the editor on your host machine to add the following entries

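The exact IP addresses come from your Vagrantfile (and the database host’s name here is my own label), but assuming hosts named web, web2, and db on a 192.168.10.x private network, the entries look roughly like this:

web   ansible_host=192.168.10.11
web2  ansible_host=192.168.10.12
db    ansible_host=192.168.10.13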

The format here is the hostname followed by a special Ansible variable, ansible_host. This variable would not be required if we had DNS set up appropriately. We can test Ansible against the hosts by utilizing the Ansible ping module. This is NOT an ICMP ping, but rather an Ansible module which tests Ansible’s ability to connect to the hosts. From within ‘/home/vagrant/ansible_data’ run the command

ansible all -i dev-hosts -m ping

I’ll explain the syntax as we go, but notice the failure

[screenshot: the ping module failing against all hosts]

Remember, in the background, Ansible needs to SSH into nodes to execute packages. It can’t do this with an IP address alone. We are missing the authentication piece. Again, to keep things simple, we’ll do password authentication and add the variables ansible_user and ansible_ssh_pass. The username and password for the Vagrant boxes that we run are both vagrant.

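Using the same placeholder addresses, the inventory entries now look something like this:

web   ansible_host=192.168.10.11  ansible_user=vagrant  ansible_ssh_pass=vagrant
web2  ansible_host=192.168.10.12  ansible_user=vagrant  ansible_ssh_pass=vagrant
db    ansible_host=192.168.10.13  ansible_user=vagrant  ansible_ssh_pass=vagrant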

And then we run the same command

[screenshot: the ping module now succeeding against all hosts]

Alright, so now that we’ve defined our nodes, we can put them into groups. The groups here are pretty obvious: we have two web servers and a database server. The format for this is [group_name] with the nodes listed underneath (like a .ini file)

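With the host lines (and their variables) kept at the top of the file, the group sections look roughly like this:

[web_servers]
web
web2

[database_servers]
db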

Now that we have them grouped, we can target specific groups. Let’s try the ping module against just the web_servers group

ansible web_servers -i dev-hosts -m ping

[screenshot: the ping module succeeding against the web_servers group]

Ok, time to explain that syntax:

  • ansible: The Ansible binary which was added to your path at time of install.
  • web_servers: The group which you targeted. You can get pretty specific with your targeting using operators. You can explicitly exclude groups, use operators to target specific combinations of groups, and much more. See here for details.
  • -i dev-hosts: This is the inventory file which you are running the command against. You can set a default inventory in an ansible.cfg file, which negates the requirement for this parameter. You can also reference the inventory file by its full path.
  • -m ping: The -m is the module parameter and ping is the module which you would like to run against the chosen hosts.

This is simply saying ‘run the ping module against any host within the web_servers group in the specified inventory file’.

Now let’s clean up the variables a bit.

By adding a [groupname:vars] section, we can specify any variables that are consistent across a group

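For the web_servers group, for example, the per-host credentials can move into a single vars section; a sketch, with the addresses still placeholders:

[web_servers]
web   ansible_host=192.168.10.11
web2  ansible_host=192.168.10.12

[web_servers:vars]
ansible_user=vagrant
ansible_ssh_pass=vagrant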

Now for any node put into the web_servers group, Ansible will automatically use those credentials specified for the group.

Not only can a node be a member of multiple groups, but groups can also be nested. We’ll create a group called datacenter which includes both web_servers and database_servers. And since the credentials are the same for both groups, we’ll create those variables for the new group as well

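That nesting looks like this:

[datacenter:children]
web_servers
database_servers

[datacenter:vars]
ansible_user=vagrant
ansible_ssh_pass=vagrant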

We use the [groupname:children] syntax to specify that this is a group containing other groups, and [groupname:vars] to specify the variables for that group. Now we can target the datacenter group

ansible datacenter -i dev-hosts -m ping

[screenshot: the ping module succeeding against the datacenter group]

Scaling

Obviously this could get ugly as you scale out. It’s possible to make it more modular by breaking group variables and host variables out into separate files, removing them from the inventory file itself. It would be too tedious to go through here, but it would look something like this

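Based on that description, the dev folder ends up looking roughly like this (the file names are just one convention):

dev/
    ansible.cfg
    dev-hosts
    group_vars/
        all
        web_servers
        database_servers
    host_vars/
        web
        web2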

In addition to adding modularity, it’s also a way to further separate out environments. This dev folder has its own inventory file, configuration file, and variables. We could duplicate this structure into another folder called prod and it would be like having two infrastructures right next to each other (but separated enough to be safe).

With this added separation, we need to consider that a variable may be defined in multiple places. It’s important to understand which variable will ultimately be used.

The more specific the variable, the higher its priority. For example, when I broke out those files as shown above, I could define ansible_ssh_pass for the web_servers group in multiple locations: under group_vars->all, under group_vars->web_servers, and also under the host_vars files web and web2.

In this scenario, host_vars will win over specific group_vars, which wins out over the all group_vars.

However, when you get serious about using Ansible, Roles are the proper way to go.

Go here for more detailed documentation on static inventory files.

Modules

As mentioned earlier, modules are what actually do all of the work in Ansible. In this guide we have been using the ping module.

Modules can do pretty much anything that you can imagine. They can be used to manage network devices (firewalls, load balancers, etc), deploy configurations and applications to servers, manage your on-prem virtual environments, provision VMs in cloud services, and a whole lot more.

There are three different types of modules

  • Core: Modules which are maintained and supported by the Ansible team itself. These are all included by default.
  • Extras: Modules created by external communities. These are included with Ansible by default, but are not supported by Ansible. These can (and sometimes do) get promoted to Core modules.
  • Deprecated: Modules are labeled as deprecated when another module has been identified as their replacement. Avoid these when you can.

In order to list all of the modules available, run the command

ansible-doc -l

To get information about a specific module, such as the yum module, type

ansible-doc yum

Most modules have parameters available, some of which are required. Required parameters are preceded by a ‘=’ rather than a ‘-’ in the Ansible docs.

[screenshot: ansible-doc output for the yum module, showing its parameters]

This shows us that name is a required parameter. The yum module is used for installing packages, so requiring the name of the package you want to install is pretty reasonable.

I did add an ansible.cfg file to the current directory so that I wouldn’t have to specify the inventory file each time

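A minimal ansible.cfg for this only needs the inventory setting (assuming an Ansible 2.x release, where the option is named inventory):

[defaults]
inventory = dev-hosts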

Let’s go ahead and execute our first real module by installing MariaDB on our database server. Run the command

ansible database_servers -m yum -a "name=mariadb-server" --sudo

The -a parameter passes arguments to the module when executing it; in our case, the package name. The --sudo flag means that the command requires sudo privileges and Ansible will use sudo to run the module.

[screenshot: JSON output showing the mariadb-server install succeeded with the node changed]

Alright, so the output is ugly. But what matters is at the very top, which says that the result was successful and that the node was changed.

Now just for fun, let’s try running it again

[screenshot: running the same command again returns success with no change]

It was successful, but there was no change to the system. When you hear Ansible folks talking about “idempotence”, this is what they mean. More specifically: “The concept that change commands should only be applied when they need to be applied, and that it is better to describe the desired state of a system than the process of how to get to that state”

Let’s SSH into that server and see if everything looks good.

[screenshot: mariadb-server is installed on the database server, but the service is not running]

It’s installed but the service is not running.

This gives us an opportunity to test another module called service. We will use the service module and use three parameters: name (name of the service), state (whether it should be running or stopped), and enabled (whether it should be an enabled service on the system).

ansible database_servers -m service -a "name=mariadb enabled=yes state=started" --sudo

[screenshot: the service module output showing changed=true]

We can see that changed=true, and the service is both enabled and started. Let’s double-check by looking at the service on the database server

[screenshot: the mariadb service running on the database server]

All looks good.

Tasks, Plays, and Playbooks

We’ve been executing modules in an ad-hoc fashion up until this point. Now we’ll look at adding a bit of structure and logic into the mix.

First of all, let’s just identify these three terms.

A task is simply the execution of some action, such as a module and its parameters. Taking the ad-hoc command we ran to install MariaDB and turning it into a task would look something like this

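A sketch, with a task name of my choosing:

- name: install mariadb-server
  yum:
    name: mariadb-server
    state: present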

We give the task a name, specify the module and then its parameters. Tasks are organized into something called Plays. A play is the glue between tasks and inventories. Here is a sample play which deploys MariaDB to the database servers

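Something along these lines (this uses the older sudo: yes play keyword, which matches the --sudo flag above; newer releases use become: yes):

---
- hosts: database_servers
  sudo: yes
  tasks:
    - name: install mariadb-server
      yum:
        name: mariadb-server
        state: present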

At this point, we have our first complete play. There are a few things to point out before building this out further.

First of all, the language is YAML. Nearly everything in Ansible is written in YAML, which means whitespace is very important. I always add the three dashes at the top to explicitly identify the file as YAML. This is not required, but it is best practice.

The first stanza defines items that will be used across all tasks within the play. All of these tasks will be applied to those hosts and will be run as sudo. Additionally, we can define other items at this level such as variables.

Notice the file is named first.playbook. It doesn’t matter how it is named or what the file extension is; I used .playbook so that I can identify it as a playbook.

So is this a play or a playbook? A playbook is just a collection of plays. This is a playbook with a single play, and the play has a single task.

It is important to note that playbooks are completely parsed before any of the plays are run, meaning any includes need to exist by the time it is being parsed. This is mostly for error checking. Ansible also runs through the plays from top-to-bottom.

Let’s finally try this out

ansible-playbook first.playbook

[screenshot: output of the first playbook run, with no failures and no changes]

The results are returned as JSON and are pretty straightforward. Nothing failed, but nothing was changed since MariaDB is already installed.

However, notice the [setup] task. This is an implicit task that runs by default. It’s gathering facts about the systems which the plays are targeting. The facts gathered are stored as variables which can be used for things like conditionals. As an example, perform a task only if the host has x properties or is in y state.

You can list all of the possible facts of a host by running the command

ansible -m setup <host>

[screenshot: a portion of the facts gathered by the setup module]

We can disable this fact gathering by adding one line to the play.

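That one line is the gather_facts directive, so the top of the play becomes:

- hosts: database_servers
  sudo: yes
  gather_facts: no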

And notice the setup task is gone

[screenshot: the playbook run without the setup task]

Now I’m going to add a bit of complexity to this playbook, including making use of gathered facts, so I’ll explicitly make sure we are gathering facts

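Roughly, the expanded playbook looks like the following; the task names and the RedHat OS-family check used for the conditional are my own choices:

---
- hosts: database_servers
  sudo: yes
  gather_facts: yes
  tasks:
    - name: install mariadb-server
      yum:
        name: mariadb-server
        state: present
      when: ansible_os_family == "RedHat"

    - name: ensure mariadb is enabled and started
      service:
        name: mariadb
        enabled: yes
        state: started

- hosts: web_servers
  sudo: yes
  gather_facts: yes
  tasks:
    - name: install httpd
      yum:
        name: httpd
        state: present
      when: ansible_os_family == "RedHat"

    - name: ensure httpd is enabled and started
      service:
        name: httpd
        enabled: yes
        state: started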

By gathering the facts of the system, we can do conditional targeting. In this case I am grabbing the OS of the nodes and performing the package installs when the condition evaluates as true. This way we avoid trying to use the yum module on a system that might expect the apt module or something else.

The rest of the additions are pretty self-explanatory. First is the configuration of the MariaDB service to make sure that it is enabled and started.

Then I added another play, which only applies to the web_servers group and configures the Apache service in addition to installing it.

And when we run it

[screenshot: the expanded playbook running successfully against both groups]

So now we have a playbook that is capable of configuring nodes as Apache web servers and MariaDB servers.

I could do five more posts just on different cool modules, but we’ll move on for now.

Handlers and Jinja2

I’m going to finish up this section by touching on handlers and the Jinja2 engine. Let’s say we don’t just want to spin up a server and install Apache; we want to configure Apache with a custom httpd.conf and then restart Apache.

We’ll do this by using the template module, which renders templates and copies the resulting files to remote hosts. We’ll add a new task to deploy the configuration file, but before we do, let’s add our modifications to it by using a neat variable templating system called Jinja2.

I grabbed a sample Apache configuration file, added it to our ansible_data folder under a new templates folder, and saved it as httpd.conf. These are the values in the httpd.conf files on web and web2.

[screenshot: the relevant default values in the sample httpd.conf]

We’re going to directly replace the values we want to change with variables

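For example, if the directives being changed are the listen port, server name, and document root (the variable names here are illustrative), the template ends up with lines like these:

Listen {{ http_port }}
ServerName {{ server_name }}
DocumentRoot "{{ document_root }}"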

The {{ variable_name }} syntax is what the Jinja2 engine will recognize as a variable. I can choose to prompt for input at run time or just define the values myself. I’m going to add the values at the top of the play.

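In the web_servers play that becomes a vars block; the port matches the 8080 we verify later, while the other values are placeholders:

- hosts: web_servers
  sudo: yes
  vars:
    http_port: 8080
    server_name: web.local
    document_root: /var/www/html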

Now when this play runs it will substitute in those values. This is a powerful way to make plays more dynamic.

But once we change and deploy the configuration file we still need to restart Apache.

For something like this, we’ll use a handler. Handlers are like tasks, but they only run when triggered: specifically, a handler runs when a task results in a change and then performs a notify that calls the associated handler. This is what the notify and the handler look like

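A sketch of the task and handler pair, with the paths and names as assumptions:

  tasks:
    - name: deploy httpd.conf
      template:
        src: templates/httpd.conf
        dest: /etc/httpd/conf/httpd.conf
      notify: restart apache

  handlers:
    - name: restart apache
      service:
        name: httpd
        state: restarted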

And sure enough, it deploys the config and does restart Apache.

[screenshot: playbook run showing the template deployed and Apache restarted]

I also added a quick index.html file to the document root to test out the change, and sure enough it is now listening on port 8080

[screenshot: the test page being served on port 8080]

It might not seem completely obvious how powerful this is, but the Jinja2 templating engine is one of the most important pieces of the Ansible workflow. It is what makes the entire platform so dynamic.

Final Step

Do make sure to add, commit, and push these files to the repository created in Part 2. I definitely haven’t been adding everything as I should, so I’ll catch up at the end

[screenshot: git status output]

I’ll do a quick couple of commands to get everything into Bitbucket

git add *
git commit * -m "adding all new ansible related files"
git push

If you have another computer, you can easily pull down this whole repository, vagrant up, and continue your Ansible learning experience!

Conclusion and Oversights

Ok — be nice. Ansible is way too big to cover in a single post. I could break just the Ansible part of this series into 5 parts. My goal with this entire series was to simply expose folks to some of these tools and methodologies.

However, I just wanted to point out a couple of other critical components of Ansible that I had to leave out. These are things that you absolutely must learn if you are going to dive into Ansible and I may end up covering them at some point in a different post.

  • Troubleshooting: this is basically the most important part of any tool. Ansible has great documentation on it here.
  • ansible.cfg: I didn’t cover important default settings here or precedence. More info on that. 
  • Vault: Ansible Vault is a way to secure the Ansible environment by means of encryption. Documentation on it here.
  • Roles: the ultimate way of scaling out and making your infrastructure more modular and maintainable. Documentation on it here.
  • Tower: the web interface for Ansible. It is free for up to 10 nodes, but since the Red Hat acquisition, they have announced it will be open-sourced in the near future. Documentation.
  • Module Development: While there are hundreds of modules available, there might be something specific you want to accomplish with a module that does not exist. Fortunately Ansible modules are not particularly hard to develop. Documentation here. 
  • Windows: Ansible can be used to provide configuration management for Windows. While this is awesome, it would have taken too long to add it. Docs here.
  • General Details: I cringed at how much I left out, but I also cringed at how long this post was. For way more in-depth details on all of this, just pore over the documentation on their site.

Anyways, this finally concludes the series! Feel free to reach out to me with questions.

 
