What: System provisioning
Why: Faster development startup
How: Using Ansible modules for provisioning via ssh
During the What the Data hackathon, a friend and I used Vagrant to set up the same system for both of us. It was a perfect starting point. During hacking, however, we installed one package, tried out another, and so on. After a short while, the two systems were no longer identical. This is a first attempt at using Ansible to keep two machines in sync during development.
Ansible works by using a central machine (the control machine) to provision other machines (the remote machines) via ssh. In this tutorial, the control machine is assumed to be a Linux machine with Python < 3 installed, and the remote machine is a Vagrant VM running Debian (see here for the Vagrantfile).
Additional resources
See also: this blog
Ansible Installation
In principle, it should be possible to install Ansible from source (GitHub). However, I already had Anaconda with Python 3 installed on the control machine, which complicated things a little because Ansible works only with Python < 3. After some trial and error, I installed Ansible using pip. You have to install the dependencies first:
    sudo easy_install pip
    sudo pip install paramiko PyYAML Jinja2 httplib2 six
If you get an error message about missing libffi, install python-cffi first.
Install Ansible via pip:
    sudo pip install ansible
Setup Ansible
You have to set up Ansible by providing a so-called inventory file, which specifies the machines to be provisioned.
Create a file called hosts and add a group name in brackets, followed by the IP, port, and so on for each machine you want to provision. If you use the Vagrantfile from here, the IP should be 10.0.2.2 (the default gateway in Vagrant for public networks). In other cases, or if no connection can be established, change the IP accordingly. You can find the IP by running the following command on the remote machine. Look for the address listed as the default gateway IP.
    /sbin/ifconfig
With the correct IP, create the inventory file:
    cat <<EOF >> /home/vagrant/ansible/hosts
    [ansibletest]
    10.0.2.2 ansible_port=2200 ansible_user=vagrant
    EOF
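Since the goal is to keep more than one machine in sync, the inventory will eventually list several hosts. As a minimal sketch, the same INI text could be generated from a small Python dictionary. The function name render_inventory and the dictionary layout are made up for this illustration and are not part of Ansible; only the group name, address and variables come from the inventory above.

```python
def render_inventory(group, hosts):
    """Return Ansible INI inventory text for a single group.

    `hosts` maps an address to a dict of connection variables.
    (Hypothetical helper for illustration, not an Ansible API.)
    """
    lines = ["[{}]".format(group)]
    for address, variables in hosts.items():
        # Render each variable as key=value, keeping insertion order.
        pairs = " ".join("{}={}".format(k, v) for k, v in variables.items())
        lines.append("{} {}".format(address, pairs).strip())
    return "\n".join(lines) + "\n"


# The single host used in this tutorial.
inventory_text = render_inventory(
    "ansibletest",
    {"10.0.2.2": {"ansible_port": 2200, "ansible_user": "vagrant"}},
)
print(inventory_text, end="")
```

Adding a second machine then only means adding another entry to the dictionary and writing the returned text to the hosts file.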
Ping
Ansible uses key-based authentication by default. For simplicity, I switched it off in my use case and used password-based authentication instead, which is enabled by --ask-pass. Otherwise, the keys have to be set up. Password-based authentication requires sshpass to be installed on the control machine.
Ansible comes with a predefined module for checking the availability of remote machines: ping. Select it with the -m option and pass the path to your inventory file with -i:
    ansible ansibletest -i <path-to-inventory-file> -m ping --ask-pass
You will be asked for the remote machine's password (it is vagrant); after that, the output should be equivalent to:
    10.0.2.2 | SUCCESS => {
        "changed": false,
        "ping": "pong"
    }
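With several machines, reading this output by eye gets tedious. The following is a rough sketch of how the result lines could be checked from a script. It assumes the output format shown above, where each host result starts with a line of the form "host | STATUS => {", and the helper name ping_statuses is invented for this example. Other Ansible versions or output callbacks may print something different.

```python
import re

# Matches the first line of each host result, e.g. '10.0.2.2 | SUCCESS => {'.
# An optional '!' after the status word is tolerated (assumption about
# how non-success states are printed).
RESULT_LINE = re.compile(r"^(?P<host>\S+) \| (?P<status>[A-Z]+)!? ")


def ping_statuses(output):
    """Map each host found in the ping output to its status word."""
    statuses = {}
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            statuses[match.group("host")] = match.group("status")
    return statuses


# Sample text copied from the output above.
sample = '''10.0.2.2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}'''

statuses = ping_statuses(sample)
all_reachable = all(status == "SUCCESS" for status in statuses.values())
print(statuses, all_reachable)
```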
Installing Python3, pip and Pandas on the remote machine
Ansible works by executing so-called playbooks: YAML files that list the tasks to run on a group of hosts.
Let's create a playbook that uses apt to install Python 3 and pip on the remote machine and then uses pip to install Python packages (Pandas in this case). Create a file called python-datascience.yml with the following content (or download it here):
    ---
    - name: install python for data science
      hosts: ansibletest
      become: true

      tasks:
        - name: Get aptitude for upgrade
          apt: pkg=aptitude state=present

        - name: install base packages
          apt: pkg={{item}} state=present
          with_items:
            - python3
            - python3-pip

        - name: Install global python requirements
          pip: name={{item}} state=present executable=pip3
          with_items:
            - pandas
Run this playbook in a way similar to the ping above:
    ansible-playbook python-datascience.yml --ask-pass
The output should be:
    PLAY [install python for data science] *****************************************

    TASK [setup] *******************************************************************
    ok: [10.0.2.2]

    TASK [Get aptitude for upgrade] ************************************************
    ok: [10.0.2.2]

    TASK [install base packages] ***************************************************
    ok: [10.0.2.2] => (item=[u'python3', u'python3-pip'])

    TASK [Install global python requirements] **************************************
    ok: [10.0.2.2] => (item=pandas)

    PLAY RECAP *********************************************************************
    10.0.2.2 : ok=4 changed=0 unreachable=0 failed=0
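The PLAY RECAP line is the interesting part for keeping machines in sync: changed=0 means the machine already matched the playbook, while a non-zero count means the run had to modify something. As a small sketch, the counters could be read from that line like this. The helper names parse_recap_line and in_sync are invented for this illustration, and the parsing assumes a recap line shaped like the one above, with a plain hostname or IPv4 address before the colon.

```python
import re


def parse_recap_line(line):
    """Split a PLAY RECAP line into (host, {counter: value})."""
    host, _, counters = line.partition(":")
    counts = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", counters)}
    return host.strip(), counts


def in_sync(counts):
    """True if the run changed nothing and no task failed or was unreachable."""
    return all(counts.get(key, 0) == 0 for key in ("changed", "failed", "unreachable"))


# Recap line copied from the output above.
host, counts = parse_recap_line("10.0.2.2 : ok=4 changed=0 unreachable=0 failed=0")
print(host, counts, in_sync(counts))
```

Running the playbook against both machines and comparing the counters gives a quick indication of which one had drifted.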
Log in to the remote machine to check the result. You should now have Python 3 and Pandas installed.
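One way to do this check is a short script run with python3 on the remote machine. The sketch below only asks the interpreter whether the module can be found, without importing it. The helper name is_installed and the list of modules to check are choices made for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if this interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Modules the playbook is expected to have installed.
for name in ("pandas",):
    print(name, "installed" if is_installed(name) else "missing")
```

If the playbook ran successfully, the script should report pandas as installed when started with python3.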