Sunday 5 November 2017

Docker Swarm clusters part 2 - running containers on Swarm


Following on from my last blog post about getting started with Docker, this post is about running containers at scale with Swarm, and why that should interest anyone deploying containers on Docker.

My Swarm cluster currently has five nodes: three managers and two workers.
 

To start with, on a single node, I am going to pull Mediawiki from Docker Hub to see what the default behaviour is: docker pull mediawiki, then docker run --name medwik -p 8080:80 -d mediawiki. If you're still getting familiar with Docker, here is what that command does:

docker run (run a container) --name medwik (name that container 'medwik') -p 8080:80 (map port 8080 on the Docker host through to port 80 in the container) -d (detached, i.e. run in the background) mediawiki (the image to run; with no tag given, Docker uses mediawiki:latest, which is also what docker pull mediawiki fetched).
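
As a minimal sketch, the same two commands can be wrapped in a small shell function with each option on its own line. The function name run_wiki is made up for illustration; the image, container name and ports are the ones used above.

```shell
#!/bin/sh
# Pull the image, then start a single container from it.
#   --name medwik   name the container 'medwik'
#   -p 8080:80      publish host port 8080 to container port 80
#   -d              detached, i.e. run in the background
# run_wiki is a hypothetical helper name, not part of Docker.
run_wiki() {
    docker pull mediawiki:latest &&
    docker run \
        --name medwik \
        -p 8080:80 \
        -d \
        mediawiki:latest
}
```

Calling run_wiki on one Docker host should leave a single container listening on port 8080 of that host only.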



After that, browsing to the IP address of the Docker host on port 8080 shows that my container has started.

As you'd expect, the Mediawiki site is not reachable on the IP address of any other Docker host in the Swarm cluster.


So that's where Docker Swarm comes into its own; using the 'docker service' command, I can start the same image, but this time as a service:

(Please note, I also ran a 'docker pull mediawiki' on all other hosts before performing this step).

docker service create -p 8080:80 -d --name wikicluster --replicas 3 mediawiki:latest

'docker run' has been replaced with 'docker service create', so this is now running as a Swarm service rather than as a standalone container. The other parameters are the same, with the exception of '--replicas 3', which defines how many instances of the container will run (like every other option, it has to come before the image name, because anything after the image is passed to the container as its command). This can be verified using 'docker service ls' and 'docker service ps wikicluster'.
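
The service creation and the two verification commands can be sketched as a pair of shell functions. The function names are hypothetical; the service name, port mapping and replica count are the ones from this post.

```shell
#!/bin/sh
# Create the replicated service. Options go before the image name;
# anything after the image is treated as the container's command.
# create_wiki_service and show_wiki_service are made-up helper names.
create_wiki_service() {
    docker service create \
        --name wikicluster \
        --replicas 3 \
        -p 8080:80 \
        -d \
        mediawiki:latest
}

# Show the service summary, then the individual tasks and their nodes.
show_wiki_service() {
    docker service ls
    docker service ps wikicluster
}
```

Both functions need to be run on a manager node, since workers cannot issue service commands.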

We can see that under the 'Replicas' column of 'docker service ls' we have 3/3.  This means that the desired state is 3 running containers and the actual state is 3;  good news!  This is verified by running 'docker service ps wikicluster'.
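
The actual/desired comparison can also be scripted. The sketch below assumes the Replicas value is a plain 'actual/desired' pair such as 3/3, and that the installed Docker version supports --format on 'docker service ls'. The function names are made up for illustration.

```shell
#!/bin/sh
# Succeed when a REPLICAS value such as "3/3" shows actual == desired.
# Assumes the plain "actual/desired" form; anything else counts as failure.
replicas_converged() {
    case $1 in
        */*) ;;
        *) return 1 ;;
    esac
    actual=${1%%/*}
    desired=${1##*/}
    [ "$actual" = "$desired" ]
}

# Read the Replicas column for one service and report on it.
# Note: the name filter matches by prefix, so pass a unique service name.
check_service() {
    state=$(docker service ls --filter "name=$1" --format '{{.Replicas}}')
    if replicas_converged "$state"; then
        echo "$1: converged ($state)"
    else
        echo "$1: not converged ($state)"
    fi
}
```

For example, check_service wikicluster on a manager node reports whether the service has reached its desired replica count.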

What has happened is this: although Mediawiki is only running on nodes 01, 03 and 05, it is running as a Swarm service, so if I hit any node in the Swarm cluster on the published port, Swarm's ingress routing mesh forwards the request to a Docker host that is running the container. All of this is set up and managed by Swarm.
To prove this, if I hit node 02 or 04 (i.e., the nodes that are not running the container), port 8080 still responds!
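
A quick way to repeat this check from a terminal is to probe port 8080 on every node in turn. This is only a sketch: probe_nodes is a made-up helper, and the hostnames docker01 to docker05 are assumed from the node numbering used in this post.

```shell
#!/bin/sh
# Probe port 8080 on each host given as an argument and report the result.
# curl exits non-zero when it cannot connect within the time limit.
probe_nodes() {
    for host in "$@"; do
        if curl -s -o /dev/null --max-time 5 "http://${host}:8080/"; then
            echo "${host}: responding"
        else
            echo "${host}: no response"
        fi
    done
}

# Usage (hostnames are assumptions):
#   probe_nodes docker01 docker02 docker03 docker04 docker05
```

With the service published through the routing mesh, every healthy node should answer, including those not running a replica.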

 

The actual/desired state is important. What it means is that Swarm keeps a constant eye on the state of the cluster and, in the event of a problem with a host, will ensure that the desired state of 3 running containers is met. So to prove this, host 'docker01' had a pretty fundamental problem (I restarted it with 'init 6'!)

The outage is confirmed by running 'docker node ls' on another host. However, if I run 'docker service ls', I can see that the replicas state is 3/3 again... What has happened? Running 'docker service ps wikicluster' shows that I now have wikicluster running on nodes 02, 03 and 05, so Swarm picked up the issue and immediately fixed it.
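
To see just where the replicas have ended up after a failure, the task list can be filtered down to the tasks Swarm wants running. A minimal sketch, assuming a Docker version that supports --format on 'docker service ps'; running_nodes is a hypothetical helper name.

```shell
#!/bin/sh
# List the nodes that currently host running tasks of a service,
# one node per line, sorted and without duplicates.
running_nodes() {
    docker service ps \
        --filter desired-state=running \
        --format '{{.Node}}' \
        "$1" | sort -u
}

# Usage: running_nodes wikicluster
```

Tasks that were shut down on the failed node have a different desired state, so they drop out of this list.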


This is very powerful; within minutes, I have been able to run up a cluster that is automatically load balanced and fault tolerant.  

How does Swarm compare with Kubernetes (K8s)? At a very high level, K8s needs a fair amount of supporting infrastructure to run, but does provide additional features such as image management (not having to docker pull many times), autoscaling, a better API (which makes for a more cloud native experience), larger scale, more tooling, etc., whereas Swarm is something that you might deploy to get a small container deployment running in a highly-available, fault tolerant way, quickly!

This article has a great 'vs' section between Swarm and K8s.

Thanks for reading :-)