Friday 20 October 2017

Getting started with Docker Swarm on PhotonOS

I have been playing with Docker as a bit of a hobby for the past few months, in between work, life, travel, etc., so I haven't had a chance to play with Swarm...  I have now found a few hours each day thanks to a train commute, so that's changed :)

So what is Swarm?  In short, Swarm gives you the ability to join multiple Docker hosts into a cluster, providing native Docker-level clustering, all implemented with a few easy steps...  Something very powerful, which I will demonstrate in a future post (too much info for one post here!)

I started with PhotonOS (an old version - one I had on my laptop, and I didn't have the bandwidth to redownload!) so already I was behind...  Swarm mode is a feature of Docker 1.12 and above; with PhotonOS v1, Docker is on v1.11:

I had to configure the initial VM, run updates and drag Docker up to a reasonable version...  All of which took more in downloads than simply pulling a later image would have...!

I had to spend a bit of time familiarising myself with systemctl, networkctl, hostnamectl, etc. in order to configure the box.  To start: the hostname.  Although the traditional 'hostname <new hostname>' appears to work, it is not persistent across reboots.  You need hostnamectl:

hostnamectl set-hostname <new hostname> 
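A quick way to confirm the change took is to run hostnamectl with no arguments, which prints the current static hostname (docker01 is my assumed name for this first node):

```shell
# Set the static hostname (persists across reboots), then verify -
# the 'Static hostname' line should show the new name
hostnamectl set-hostname docker01
hostnamectl
```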

Reboot and the hostname remains.  I then needed to set a static IP address; to do this, touch /etc/systemd/network/10-static-en.network and add the following content:

[Match]
Name=eth0
[Network]
Address=192.168.73.11/24
Gateway=192.168.73.2


You will also need to chmod 644 the file.  This assumes your ethernet adapter is eth0, which you can confirm by running networkctl.
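Put together, the network setup looks something like this (a sketch, assuming the eth0 adapter and the addresses from my lab; restarting systemd-networkd is how the config gets applied without a full reboot):

```shell
# Create the static network config for eth0
cat > /etc/systemd/network/10-static-en.network <<'EOF'
[Match]
Name=eth0

[Network]
Address=192.168.73.11/24
Gateway=192.168.73.2
EOF

# Make the file readable by the systemd-network user
chmod 644 /etc/systemd/network/10-static-en.network

# Apply the config, then confirm the adapter picked it up
systemctl restart systemd-networkd
networkctl status eth0
```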


Good!  The last thing I needed to do was get the machine up-to-date...  On PhotonOS, the package manager is a cut-down yum-like tool called tdnf.  So: tdnf update (I had to reboot here), then systemctl start docker, and we can see the version of Docker is where it needs to be.
N.B., I also enabled ssh on this server by using systemctl start sshd.  
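Those steps as commands (a sketch; enabling the services so they come back after reboots is my addition here):

```shell
# Bring the box up to date, then reboot to pick up the new packages
tdnf update -y
reboot

# After the reboot: start Docker and check we're on 1.12 or above
systemctl start docker
systemctl enable docker      # start automatically on boot
docker version

# Optional: ssh access to the VM
systemctl start sshd
systemctl enable sshd
```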

Now, I cloned out a few VMs so I also had docker02, docker03, docker04 and docker05.  I noticed that the VMs couldn't ping each other, due to the default firewall being quite strict...  Quick and dirty fix time: systemctl stop iptables.  This is a lab running on my laptop, so not the end of the world.  In a prod environment you would speak with your security team, of course ;-)
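A slightly less brutal alternative (a sketch; these are the ports Swarm mode actually uses, so you could allow just those rather than stopping the firewall entirely):

```shell
# Allow only the ports Swarm needs, instead of disabling iptables:
iptables -A INPUT -p tcp --dport 2377 -j ACCEPT   # cluster management
iptables -A INPUT -p tcp --dport 7946 -j ACCEPT   # node-to-node communication
iptables -A INPUT -p udp --dport 7946 -j ACCEPT
iptables -A INPUT -p udp --dport 4789 -j ACCEPT   # overlay network (VXLAN)
iptables -A INPUT -p icmp -j ACCEPT               # let the nodes ping each other

# Or, the quick and dirty lab fix:
systemctl stop iptables
```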


So to create the Swarm cluster...  A couple of things to note.  Docker recommends using 3 or 5 'manager' nodes.  You start creating a cluster with the docker swarm init command, which makes the first node a manager.  Although more managers can be added, each extra manager adds overhead to cluster elections, so large numbers are not recommended.  Secondly, by specifying the listen and advertise IP and port, you ensure that your Docker host uses the IP:port that you want.  With that said:


Above, I have run three commands...  The first is the docker swarm init command to initialise the cluster.  The second gives the join string for a manager node and the third (also output when I ran the init) gives the command to join a worker node:

docker swarm init --advertise-addr 192.168.73.11:2377 --listen-addr 192.168.73.11:2377
docker swarm join-token manager
docker swarm join-token worker 

I will add an additional two manager nodes and the final two nodes will be workers.  It's worth noting that all manager nodes are also worker nodes - so having 5 nodes in your cluster with 3 managers does not mean that you only have 2 worker nodes.

From Docker02:
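The command I ran there was along these lines (a sketch - the token is a placeholder for the string printed by 'docker swarm join-token manager', and 192.168.73.12 is my assumed address for docker02):

```shell
# Join docker02 to the swarm as a manager, pinning its own
# advertise/listen address and port
docker swarm join --token <manager-token> \
  --advertise-addr 192.168.73.12:2377 \
  --listen-addr 192.168.73.12:2377 \
  192.168.73.11:2377
```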


I copied the command output from the previous 'join-token manager' command and suffixed it with the same 'advertise-addr' and 'listen-addr' switches to ensure that docker02 listens on the adapter / port that I want it to.  Now, when I run docker node ls, I get:

Both nodes added to the swarm cluster.  I will continue through the remainder of the cluster, adding 03 as a manager and 04 and 05 as workers...

From the above, you can see in the 'MANAGER STATUS' column that we have 3 managers: one 'Leader' and two 'Reachable'.  All managers proxy commands through the leader node...  In the event of a failure of the leader, a re-election takes place among the remaining managers.  We can see that nodes 04 and 05 have nothing in the manager column, as we would expect.

One final thing to note before I close this post off (way too much to demo without a separate post here!) is the port that I used earlier, 2377...  This port is now an IANA-registered port for Swarm (it wasn't back in 1.12), although you can specify your own port here if it's more convenient.

Coming up soon... How Swarm works & why it should be interesting to those looking at containers!