Provisioning in Vagrant
There are a lot of choices when you want to provision your virtual machine with Vagrant:
1. Run scripts manually - completely insane idea
2. Run shell scripts, inline or from a file
3. Use Chef or Puppet
The first one is good only for testing and experimenting; it is insane if you want to provision all your machines like this.
The second one looks quite promising, especially if you don't know the tools from the third point.
Today I will try to show how to set up the Storm project on a Vagrant machine using only shell provisioning. To simplify things I will follow the tutorial from Hortonworks: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1-latest/bkinstallingmanuallybook/content/chrpm_storm.html
Here is my work, described in about 1000 words:
Step .0. Choosing and preparing machine
First we need to create a directory and initialize Vagrant, so let's do this:
mkdir storm-vagrant
cd storm-vagrant
vagrant init
I will use a CentOS box which I downloaded earlier (you can use a URL instead of a file). Why CentOS, you ask? Because in many companies the only Linux allowed is Red Hat, and CentOS is close enough. So let's add the box:
vagrant box add "centos-6.5" file:///c:/vagrantfiles/CentOS-6.4-i386-v20131103.box
In the Vagrantfile we need to change the box name, so
config.vm.box = "base"
becomes
config.vm.box = "centos-6.5"
Step .1. Configure network
We will need network access to the guest machine. It can be done in two ways:
1. Using port forwarding
2. Setting an IP address for the guest machine.
I will add code to enable a private network on the machine, so the Vagrantfile will look like this:
# -*- mode: ruby -*-
# vi: set ft=ruby :

# Vagrantfile API/syntax version. Don't touch unless you know what you're doing!
VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  # All Vagrant configuration is done here. The most common configuration
  # options are documented and commented below. For a complete reference,
  # please see the online documentation at vagrantup.com.

  # Every Vagrant virtual environment requires a box to build off of.
  config.vm.box = "centos-6.5"
  config.vm.network "private_network", ip: "192.168.33.10"
Caution: In the following listings I will show only the lines appended to the ones above.
Now we are ready to run:
vagrant up
After the above command completes (don't worry about the red color on the console), let's explore our box. Run
vagrant ssh and you are inside. Check what you want and log out of the machine.
Step .2. Installing Storm RPMs
According to "Chapter 1. Getting Ready to Install" we need to configure the remote repositories. For CentOS 6 the line is:
wget -nv http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/18.104.22.168/hdp.repo -O /etc/yum.repos.d/hdp.repo
So let's change our Vagrantfile to (the first two lines already exist in our file):
config.vm.box = "centos-6.5"
config.vm.network "private_network", ip: "192.168.33.10"
config.vm.provision :shell, :inline => "wget -nv http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/22.214.171.124/hdp.repo -O /etc/yum.repos.d/hdp.repo"
According to "Chapter 17.1 Install the Storm RPMs" we can install the Storm RPM with
yum install storm
but the above command will ask whether we are sure we want to download Storm and ZooKeeper (yes, Storm needs ZooKeeper to run). To avoid the prompt just add
-y to the command. So the Vagrantfile will evolve to:
config.vm.box = "centos-6.5"
config.vm.provision :shell, :inline => "wget -nv http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/126.96.36.199/hdp.repo -O /etc/yum.repos.d/hdp.repo"
config.vm.provision :shell, :inline => "yum -y install storm"
To run the provisioning above we have two choices:
1. vagrant destroy and then
vagrant up - if we made too many manual changes in the previous steps
2. Or just
vagrant provision to run only the provisioning on the running box
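Because vagrant provision runs every shell provisioner again, steps that are slow or not safe to repeat can be guarded with a marker file. The following is only a sketch: the helper name run_once and the marker directory /var/lib/provision-markers are my own assumptions, not part of the Hortonworks tutorial.

```shell
#!/bin/sh
# Hypothetical helper: run a command only if it has not succeeded before.
# MARKER_DIR is an assumption; any directory that survives reboots will do.
MARKER_DIR="${MARKER_DIR:-/var/lib/provision-markers}"

run_once() {
    name="$1"
    shift
    mkdir -p "$MARKER_DIR" || return 1
    if [ -e "$MARKER_DIR/$name" ]; then
        echo "skipping $name (already done)"
        return 0
    fi
    # Run the command; record the marker only when it succeeds.
    "$@" && touch "$MARKER_DIR/$name"
}
```

Inside a provisioning script it could be used as run_once install-storm yum -y install storm. A step that fails leaves no marker, so the next vagrant provision tries it again.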
Step .3. Running zookeeper
ZooKeeper? Why? - you may be thinking now. Storm needs ZooKeeper to communicate between its nodes. The second line of the last chapter, "5. Validate the Installation", is:
You must start ZooKeeper before starting Storm.
First we need Java (for Storm and ZooKeeper). I prefer Java 7 from Sun, so I just downloaded the RPM from the official page to the Vagrant folder. In the recipe please add:
config.vm.provision :shell, :inline => '
  javarpm="$(ls /vagrant/ | grep "^jdk.*.rpm" | tail -1)"
  echo $javarpm
  rpm -Uvh /vagrant/$javarpm
  java -version'
Please notice that I had to change the outer
" to
'. The second option would be to escape the inner double quotes.
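The RPM lookup above can be made a little stricter. This is a sketch only, with a hypothetical function name latest_jdk_rpm: it escapes the dot in the pattern, anchors the end of the name, and fails loudly when no JDK package is in the folder.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the last jdk*.rpm in a directory
# (sorted by name), or fail when none is present.
latest_jdk_rpm() {
    dir="$1"
    # The dot is escaped and the pattern anchored so that only names
    # starting with "jdk" and ending in ".rpm" match.
    rpm_name="$(ls "$dir" | grep '^jdk.*\.rpm$' | sort | tail -n 1)"
    if [ -z "$rpm_name" ]; then
        echo "no JDK rpm found in $dir" >&2
        return 1
    fi
    echo "$rpm_name"
}
```

In the provisioner it would be called as latest_jdk_rpm /vagrant, and the printed name passed to rpm -Uvh. If the script lives in a separate file instead of an inline string, the quoting problem with the Vagrantfile goes away as well.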
Then we need to create the directories as explained in "Chapter 5.2: Set Directories and Permissions". I created a file
directories.sh with the content below. It should be placed next to the Vagrantfile:
#!/bin/sh
# Directory where ZooKeeper will store data. For example, /grid1/hadoop/zookeeper/data
export ZOOKEEPER_DATA_DIR="/grid1/hadoop/zookeeper/data";
# Directory to store the ZooKeeper configuration files.
export ZOOKEEPER_CONF_DIR="/etc/zookeeper/conf";
# Directory to store the ZooKeeper logs.
export ZOOKEEPER_LOG_DIR="/var/log/zookeeper";
# Directory to store the ZooKeeper process ID.
export ZOOKEEPER_PID_DIR="/var/run/zookeeper";
Caution: If you are on a Windows host, remember to change the line endings to Unix style to avoid
: command not found being printed on the console when logging in to the guest machine
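The line endings can also be fixed on the guest during provisioning, so the Windows host needs no extra tools. A minimal sketch, assuming a hypothetical helper named to_unix_line_endings:

```shell
#!/bin/sh
# Hypothetical helper: rewrite a file in place with Unix (LF) line endings.
to_unix_line_endings() {
    file="$1"
    tmp="$(mktemp)" || return 1
    # tr deletes every carriage return; writing the result back with cat
    # keeps the permissions of the original file.
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

It would be called on the copied file, for example to_unix_line_endings /etc/profile.d/directories.sh, right after the cp step.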
Then do the provisioning below with all the steps mentioned in the ZooKeeper chapter (creating the directories, setting the ZooKeeper node id and starting it).
config.vm.provision :shell, :inline => "
  yes | cp /vagrant/directories.sh /etc/profile.d/directories.sh
  chmod 755 /etc/profile.d/directories.sh"
config.vm.provision :shell, :inline => "
  mkdir -p $ZOOKEEPER_LOG_DIR;
  chown -R $ZOOKEEPER_USER:$HADOOP_GROUP $ZOOKEEPER_LOG_DIR;
  chmod -R 755 $ZOOKEEPER_LOG_DIR;
  mkdir -p $ZOOKEEPER_PID_DIR;
  chown -R $ZOOKEEPER_USER:$HADOOP_GROUP $ZOOKEEPER_PID_DIR;
  chmod -R 755 $ZOOKEEPER_PID_DIR;
  mkdir -p $ZOOKEEPER_DATA_DIR;
  chmod -R 755 $ZOOKEEPER_DATA_DIR;
  chown -R $ZOOKEEPER_USER:$HADOOP_GROUP $ZOOKEEPER_DATA_DIR
  echo '1' >> $ZOOKEEPER_DATA_DIR/myid
  su - zookeeper -c 'source /etc/zookeeper/conf/zookeeper-env.sh ; export ZOOCFGDIR=/etc/zookeeper/conf;/usr/lib/zookeeper/bin/zkServer.sh start >> /var/log/zookeeper/zoo.out 2>&1'"
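One detail worth noting: >> appends, so every additional vagrant provision adds another line to myid, while ZooKeeper expects the file to hold just the node id. A sketch of a repeatable variant follows; the function name write_zk_myid is hypothetical, and the chown steps are left out because they depend on users that exist only on the guest.

```shell
#!/bin/sh
# Hypothetical helper: create the ZooKeeper data directory and write the
# node id so that repeated runs leave exactly one line in myid.
write_zk_myid() {
    data_dir="$1"
    node_id="$2"
    mkdir -p "$data_dir" || return 1
    chmod 755 "$data_dir" || return 1
    # ">" truncates the file first; ">>" would append on every run.
    echo "$node_id" > "$data_dir/myid"
}
```

In the provisioner this would be called as write_zk_myid "$ZOOKEEPER_DATA_DIR" 1.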
To check if ZooKeeper is running we need to log in to our box and run:
ps aux | grep zookeeper
If the process does not exist, just check:
Step .4. Configure Storm
After reading "17.2. Configure Storm" we see that a new file is needed:
storm.yaml. Because everything will run on one machine, I will set every host to
localhost. Create
storm.yaml, again next to the Vagrantfile, with the following content:
storm.zookeeper.servers:
  - localhost
nimbus.host: localhost
drpc.servers:
  - "localhost"
storm.local.dir: /tmp/storm/local
logviewer.port: 8081
and copy it into
/etc/storm/conf/storm.yaml. The above file is located in shared
Both modifications above, in the Vagrantfile:
[...]
config.vm.provision :shell, :inline => "yum -y install storm"
config.vm.provision :shell, :inline => "
  mkdir -p $STORM_LOCAL_DIR
  chown -R storm:storm $STORM_LOCAL_DIR
  chmod -R 755 $STORM_LOCAL_DIR
  cp -f /vagrant/storm.yaml /etc/storm/conf/storm.yaml"
Attention: I added
mkdir -p $STORM_LOCAL_DIR because this folder wasn't created on the destination machine.
The last step is to add proper exports in
#storm local dir
export STORM_LOCAL_DIR="/tmp/storm/local";
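The local directory is now written down twice: once in storm.yaml and once in the export. If that duplication bothers you, the value can be read from storm.yaml instead. This is a sketch with a hypothetical function name, storm_local_dir, and it assumes the value is written unquoted on a single line, as in the file above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of storm.local.dir from a
# storm.yaml file. Assumes a plain "key: value" line without quotes.
storm_local_dir() {
    sed -n 's/^storm\.local\.dir:[[:space:]]*//p' "$1" | head -n 1
}
```

A provisioning script could then set STORM_LOCAL_DIR from storm_local_dir /vagrant/storm.yaml before creating the directory.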
To validate the installation we should log in to the guest and run:
sudo su - storm
storm nimbus
The expected output is something similar to:
/usr/bin/storm: line 2: /etc/default/hadoop: No such file or directory
Running: java -server -Dstorm.options= -Dstorm.home=/usr/lib/storm [...snip...] backtype.storm.daemon.nimbus
Don't worry about the first line. We won't need Hadoop at all, but the Hortonworks installation files assume that it is installed everywhere.
Step .5. Configure Process Controller
I omit the optional chapter 3 because we don't need it now - we didn't secure ZooKeeper. Interestingly, we didn't do anything about ZooKeeper here - let's come back to this later. Just believe me.
In chapter "4. Configure Process Controller" a tool called supervisord is mentioned. If we check for it using
yum search supervisord we won't find it in the enabled repositories. To install it we need the EPEL repo, which we add with
cd /tmp
wget http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
sudo rpm -Uhv epel-release-6-8.noarch.rpm
and then run:
yum install -y supervisor. Because
supervisord.conf can contain a password, it is suggested to set its permissions to 600.
So in the Vagrantfile we have to add the above (remember the
-y in the yum command):
config.vm.provision :shell, :inline => "
  cd /tmp
  wget http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
  sudo rpm -Uhv epel-release-6-8.noarch.rpm
  yum install -y supervisor
  sudo chmod 600 /etc/supervisord.conf"
Let's run the machine (with
vagrant provision), log in to it and copy the default conf file:
sudo cp /etc/supervisord.conf /vagrant. In the copied file we can append the lines from "4. Configure Process Controller".
After this step we need to copy this file back during provisioning:
config.vm.provision :shell, :inline => "
  yes | cp /vagrant/supervisord.conf /etc/supervisord.conf
  /etc/init.d/supervisord restart"
vagrantfile are integral parts of our recipe
Step .6. Opening ports
The last part is to open the port for storm-ui, which is 8080 by default. We should add a line to the
/etc/sysconfig/iptables file and restart iptables. The file will look like this (remember about Unix line endings):
# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 8080 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
And the provisioning is:
config.vm.provision :shell, :inline => "
  yes | cp /vagrant/iptables /etc/sysconfig/iptables
  /etc/init.d/iptables restart"
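An alternative to replacing the whole file is to insert only the missing rule. The sketch below is my own variation, not something taken from the tutorial; the function name open_tcp_port is hypothetical. It adds the ACCEPT rule just before the first INPUT REJECT rule and does nothing when the rule is already there.

```shell
#!/bin/sh
# Hypothetical helper: add an ACCEPT rule for a TCP port to an
# iptables-save style file, just before the first INPUT REJECT rule.
open_tcp_port() {
    file="$1"
    port="$2"
    rule="-A INPUT -m state --state NEW -m tcp -p tcp --dport $port -j ACCEPT"
    # Nothing to do when the rule is already present.
    if grep -qxF -e "$rule" "$file"; then
        return 0
    fi
    tmp="$(mktemp)" || return 1
    # awk exits non-zero when no REJECT rule was found, so the original
    # file is left untouched in that case.
    awk -v rule="$rule" '
        !inserted && /^-A INPUT -j REJECT/ { print rule; inserted = 1 }
        { print }
        END { exit !inserted }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

In the provisioner it would be called as open_tcp_port /etc/sysconfig/iptables 8080, followed by the same iptables restart as above. This avoids keeping a full copy of the firewall file next to the Vagrantfile.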
Run vagrant provision for the last time.
Step .7. Check installation
At last we check if everything is working. Just open your favorite browser at http://192.168.33.10:8080/ and you should see the Storm UI web page. If there is a connection error you can:
1. Wait a while - it needs some time to start :)
2. Check logs in
3. Recreate your machine by running
vagrant destroy and then vagrant up
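The "wait a while" option can be automated with a small polling loop. A minimal sketch, assuming a hypothetical helper named retry that reruns any command until it succeeds:

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds, at most $1 times,
# sleeping $2 seconds between attempts.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # No need to sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

From the host, assuming curl is installed there, the check could be retry 30 2 curl -sf -o /dev/null http://192.168.33.10:8080/, which polls the Storm UI for about a minute before giving up.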
That's all. Next time I will show how to use better provisioning than shell, because, as you have probably noticed, shell provisioning works but is quite ugly and has a lot of hardcoded stuff.