+++
title = "Raspberry Pi, Container Orchestration and Swarm right at home"
author = ["Elia el Lazkani"]
date = 2022-08-24
lastmod = 2022-08-24
tags = ["docker", "linux", "arm", "swarm", "traefik", "raspberry-pi"]
categories = ["container"]
draft = false
+++
When I started looking into solutions for my home container orchestration, I
wanted a solution that runs on my 2 Raspberry Pis. These beasts have 4 virtual
CPUs and a whopping 1GB of memory each. In other words, not a lot of resources to
go around. What can I run on these? I wonder!
<!--more-->
## Consideration {#consideration}
If we look at the state of _container orchestration_ today, we see that
_Kubernetes_ dominates the space. _Kubernetes_ is awesome, but will it run on my
Pis? I doubt it.

Fret not! There are other, _more lightweight_, solutions out there. Let's
discuss them briefly.
### K3s {#k3s}
I have experience with _K3s_. I even wrote a blog [post]({{< relref "building-k3s-on-a-pi" >}}) on it. Unfortunately, I
found that _K3s_ uses almost half of the memory resources of the Pis to run.
That's too much overhead lost.
### MicroK8s {#microk8s}
_MicroK8s_ is a Canonical project. It is similar to _K3s_ in its focus on easy
deployment and a lightweight footprint. The end result is also extremely similar
to _K3s_ in resource usage.
### Nomad {#nomad}
_Nomad_ is a _HashiCorp_ product and, just like all their other products, it is
very well designed, very robust and extremely versatile. Running it on the Pis
was a breeze, it barely used any resources.

It sounds great so far, doesn't it? Well, sort of. The deployment and
configuration of _Nomad_ is a bit tricky and involves quite a few moving
parts. Those can be automated with _Ansible_ eventually. Aside from that,
_Nomad_ requires extra configuration to install and enable CNI and service
discovery.

Finally, it has a steep learning curve to deploy containers in the cluster and
you have HCL to deal with.
### Swarm {#swarm}
I was surprised to find that not only is _Docker Swarm_ still alive, it has also
been a mode that ships with _docker_ for a few years now.

I also found out that _Swarm_ has great _Ansible_ integration, for both
initializing and creating the cluster and deploying _stacks_ and _services_ into
it. On top of that, if you are already familiar with _docker-compose_, you'll
feel right at home.
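
To give a rough idea of how close this stays to _docker-compose_, here is a minimal sketch of an _Ansible_ task that deploys a _stack_ with the `community.docker.docker_stack` module. It assumes the cluster already exists and that the task runs on a `manager` node (selected here through an inventory group named `leader`, like the one used later in this post). The stack name, image and replica count are illustrative placeholders only.

```yaml
---
# Sketch only: the stack name, image and replica count are placeholders.
- name: Deploy a small stack from an inline compose definition
  community.docker.docker_stack:
    state: present
    name: whoami
    compose:
      # The dictionary below is a regular compose file, nested inline.
      - version: "3.8"
        services:
          whoami:
            image: traefik/whoami
            deploy:
              replicas: 2
  # Stacks can only be deployed from a manager node.
  when: inventory_hostname in groups['leader']
```

The `compose` list also accepts paths to compose files on the target host, which may be more convenient if you already keep your compose files around.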
## Setting up a Swarm cluster {#setting-up-a-swarm-cluster}
I set out to deploy my _Swarm Cluster_ and manage it using _Ansible_. I didn't
want to do the work again in the future and I wanted to go the IaC
(_Infrastructure as Code_) route, as should you.

At this stage, I have to make a few assumptions. I assume that you already have
at least 2 machines with a Linux distribution installed on them. I also assume
that _docker_ is already installed and running on both machines. Finally, I
assume that all the dependencies required by the _Ansible_ modules are installed
on both hosts (`python3-docker` and `python3-jsondiff` on _Ubuntu_).
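
The _glue code_ further down includes a `common.yml` file to bootstrap these dependencies, but its contents are not shown in this post. As a minimal sketch, assuming _Ubuntu_ (or another `apt`-based distribution) and the package names listed above, it could look something like this.

```yaml
---
# Sketch of a possible common.yml, assuming an apt-based distribution.
- name: Install the Python libraries required by the docker modules
  ansible.builtin.apt:
    name:
      - python3-docker
      - python3-jsondiff
    state: present
    update_cache: true
```

On other distributions the module and package names will differ, so treat this as a starting point rather than a drop-in file.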
There are **two** types of _nodes_ in a _Swarm_ cluster: `manager` and `worker`.
The **first** node used to initialize the cluster is the _leader_ node, which is
also a `manager` node.
### Leader {#leader}
For the `leader` node, our only task is to initialize the cluster.
Before we do so, let's create our _quick and dirty_ **Ansible** `inventory` file.
```yaml
---
all:
  hosts:
  children:
    leader:
      hosts:
        node001:
          ansible_host: 192.168.0.100
          ansible_user: user
          ansible_port: 22
          ansible_become: yes
          ansible_become_method: sudo
    manager:
    worker:
      hosts:
        node002:
          ansible_host: 192.168.0.101
          ansible_user: user
          ansible_port: 22
          ansible_become: yes
          ansible_become_method: sudo
```
<div class="admonition warning">
<p class="admonition-title">warning</p>
This isn't meant to be deployed in **production** in a _professional_ setting. It
goes without saying, the `leader` is static, not highly available and prone to
failure. The `manager` and `worker` node tasks are, also, dependent on the
successful run of the initialization task on the `leader`.
</div>
Now that we've taken care of categorizing the nodes and writing the _Ansible_
`inventory`, let's initialize a _Swarm_ cluster.
```yaml
---
- name: Init a new swarm cluster
  community.docker.docker_swarm:
    state: present
    advertise_addr: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}"
  register: clustering_swarm_cluster
```
<div class="admonition note">
<p class="admonition-title">Note</p>
We use `hostvars[inventory_hostname]['ansible_default_ipv4']['address']` which
returns the IP address of the node itself. This is the IP address used to advertise.
</div>
<div class="admonition note">
<p class="admonition-title">Note</p>
We use `register` to save the returned response from the cluster initialization
into a new variable we called `clustering_swarm_cluster`. This will come in handy later.
</div>
This should take care of initializing a new _Swarm_ cluster.
You can verify if _Swarm_ is running.
```shell
$ docker system info 2>&1 | grep Swarm
Swarm: active
```
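If you would rather have _Ansible_ do this check for you, the sketch below uses the `community.docker.docker_swarm_info` module together with `ansible.builtin.assert`. The variable name `swarm_info` is an arbitrary choice for this example, and the check is limited to the `leader` because the module is meant to run on a `manager` node.

```yaml
---
# Sketch only: `swarm_info` is an arbitrary variable name.
- name: Collect Swarm facts from the leader
  community.docker.docker_swarm_info:
  register: swarm_info
  when: inventory_hostname in groups['leader']

- name: Fail early if Swarm mode is not active on the leader
  ansible.builtin.assert:
    that:
      - swarm_info.docker_swarm_active
      - swarm_info.docker_swarm_manager
    fail_msg: "Swarm is not active on {{ inventory_hostname }}"
  when: inventory_hostname in groups['leader']
```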
### Manager {#manager}
If you have a larger number of nodes, you might require more than one `manager`
node. To join more _managers_ to the cluster, we can use the power of _Ansible_ again.
```yaml
---
- name: Add manager node to Swarm cluster
  community.docker.docker_swarm:
    state: join
    advertise_addr: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}"
    join_token: "{{ hostvars[groups['leader'][0]]['clustering_swarm_cluster']['swarm_facts']['JoinTokens']['Manager'] }}"
    remote_addrs: [ "{{ hostvars[groups['leader'][0]]['ansible_default_ipv4']['address'] }}:2377" ]
```
<div class="admonition note">
<p class="admonition-title">Note</p>
We access the token we saved earlier on the `leader` to join a `manager` to the cluster using `hostvars[groups['leader'][0]]['clustering_swarm_cluster']['swarm_facts']['JoinTokens']['Manager']`.
</div>
<div class="admonition note">
<p class="admonition-title">Note</p>
Since we can get a hostvar from a different node, we can also get the IP of that
node with `hostvars[groups['leader'][0]]['ansible_default_ipv4']['address']`.
</div>
Now that we've taken care of the `manager` nodes, let's work on the `worker` nodes.
### Worker {#worker}
Just as easily as we created the _task_ to **join** a `manager` node to the cluster,
we do the same for the `worker`.
```yaml
---
- name: Add worker node to Swarm cluster
  community.docker.docker_swarm:
    state: join
    advertise_addr: "{{ hostvars[inventory_hostname]['ansible_default_ipv4']['address'] }}"
    join_token: "{{ hostvars[groups['leader'][0]]['clustering_swarm_cluster']['swarm_facts']['JoinTokens']['Worker'] }}"
    remote_addrs: [ "{{ hostvars[groups['leader'][0]]['ansible_default_ipv4']['address'] }}:2377" ]
```
<div class="admonition note">
<p class="admonition-title">Note</p>
Déjà vu when it comes to the `join_token`, except that we use the `worker` token instead.
</div>
The _glue code_ you're looking for that does the magic is this.
```yaml
---
- name: Bootstrap Swarm dependencies
  include_tasks: common.yml
- name: Bootstrap leader node
  include_tasks: leader.yml
  when: inventory_hostname in groups['leader']
- name: Bootstrap manager node
  include_tasks: manager.yml
  when: inventory_hostname in groups['manager']
- name: Bootstrap worker node
  include_tasks: worker.yml
  when: inventory_hostname in groups['worker']
```
Each of the tasks described above should be in its own file, as shown in the
_glue code_, and they will **only** run on the group they are meant to run on.
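
To actually run the _glue code_, it needs to be called from a playbook. The sketch below assumes the task files live in a role, which I call `swarm` here purely for illustration, and that the inventory is saved as `inventory.yaml`. Both names are placeholders. Fact gathering is left enabled because the tasks rely on `ansible_default_ipv4`.

```yaml
---
# Sketch only: the role name `swarm` and the file names are placeholders.
# Run with: ansible-playbook -i inventory.yaml playbook.yaml
- name: Bootstrap the Swarm cluster
  hosts: all
  # Facts are needed for ansible_default_ipv4 in the tasks.
  gather_facts: true
  roles:
    - role: swarm
```

With the default `linear` strategy, _Ansible_ finishes each task on all hosts before moving on to the next one. This means the `leader` has registered `clustering_swarm_cluster` by the time the `manager` and `worker` tasks read the join tokens from its hostvars.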
Following these tasks, I ended up with the cluster below.
```shell
# docker node ls
ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
h4scu4nry2r9p129rsdt88ae2 *   node001    Ready     Active         Leader           20.10.17
uyn43a9tsdn2n435untva9pae     node002    Ready     Active                          20.10.17
```
There, we see both nodes and they both seem to be in a `Ready` state.
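
The same listing can be pulled through _Ansible_ if you prefer to stay in the playbook. This is a minimal sketch using `community.docker.docker_swarm_info` with `nodes: true`. The variable name `swarm_nodes` is arbitrary, and the tasks are limited to the `leader` because the module has to run on a `manager` node.

```yaml
---
# Sketch only: `swarm_nodes` is an arbitrary variable name.
- name: List the nodes registered in the cluster
  community.docker.docker_swarm_info:
    nodes: true
  register: swarm_nodes
  when: inventory_hostname in groups['leader']

- name: Show the node list
  ansible.builtin.debug:
    var: swarm_nodes.nodes
  when: inventory_hostname in groups['leader']
```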
## Conclusion {#conclusion}
If you're _outside_ a professional setting and you find yourself needing to run a
container orchestration platform, some platforms might be overkill. _Docker
Swarm_ has great community support in _Ansible_, making the management of small
clusters on low-resource devices extremely easy. It comes with the added bonus of
built-in _service discovery_ and _networking_. Give it a try, you might
be pleasantly surprised like I was.