Docker Swarm HA Cluster Setup with Ansible, Keepalived & Automated Container Updates

How I built a resilient Docker Swarm cluster using Ansible and Keepalived, with automated container updates

I've been using Docker for a few years now, but last year I wanted to improve my existing setup. I was running two Raspberry Pi 4s, each booting from a USB SSD and each running its own specific set of containers.

My aims were:

  • Allow containers to run on any node to provide High Availability and make maintenance easier
  • Provide enough capacity for the future whilst still using low powered hosts
  • Upgrade to 3x Raspberry Pi 5s with NVMe storage
  • Keep key services such as DNS and reverse proxy available all the time
  • Take steps to improve Docker security by using internal networks, AppArmor, sudo for docker commands and running containers as a non-root account where possible
  • Automate as much as possible with Ansible to allow quick rebuilding of a Pi for future OS upgrades

To achieve this I decided to go with Docker Swarm instead of going the Kubernetes route (maybe one day!).

This post goes over how I set all this up and everything that cropped up during the process.


Raspberry Pi OS

Since I'm running my Swarm cluster on Raspberry Pis, I'm also using Raspberry Pi OS, which required some changes to the defaults.

I'll highlight a few of the changes I made here and go into depth in the Ansible section.

  1. Remove the default Pi user
  2. Create new user account without access to the docker group so sudo must be used. Set a sudo timeout of 1 hour
  3. Enable AppArmor
  4. Set additional options required for cAdvisor and Graylog
  5. Configure unattended upgrades and email notifications
  6. Configure syslog forwarding to Graylog
  7. Configure Prometheus node exporter


Docker Swarm

Creating the swarm is the easy part... it's all the unknowns that come after that take the time!

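For my setup I have 3 nodes, each configured as a manager. Three managers is the minimum that keeps quorum when one node is lost, because a majority (2 of 3) must stay reachable. Running every node as a manager also matters because some containers need to talk to the Swarm API through the Docker socket, which only works on a manager node.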
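
The quorum arithmetic behind that choice can be sketched in a few lines of shell. This only illustrates the majority rule that Swarm managers follow; the function names are made up for this example.

```shell
#!/bin/sh
# Majority needed for N managers: floor(N/2) + 1
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Managers that can fail while a majority still remains: floor((N-1)/2)
failures_tolerated() {
    echo $(( ($1 - 1) / 2 ))
}

for managers in 1 2 3 5; do
    echo "managers=$managers quorum=$(quorum_size "$managers") tolerated=$(failures_tolerated "$managers")"
done
```

With two managers the loss of either one breaks quorum, which is why three is the smallest count that survives a single failure.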

docker swarm init --advertise-addr 10.0.0.1

The output provides a join command to run on the other nodes. If you ever lose it, you can print the tokens again with

docker swarm join-token manager or docker swarm join-token worker

Then use docker node ls to confirm each node is showing correctly.

Networks

I decided to create my networks up front so they are all referenced as external in the compose configurations. As part of this I also wanted to ensure that each network is created as internal unless it requires external access.

docker network create --driver overlay --subnet 10.0.1.0/24 proxy
docker network create --driver overlay --internal --subnet 10.0.2.0/24 socket-proxy-traefik
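
Because the networks are created outside of the compose files, it can help to make the creation step safe to re-run. The following is a minimal sketch, assuming a hypothetical helper name (ensure_overlay_network); it only uses docker network inspect and docker network create, and the script would need to be run with sudo on a setup like mine.

```shell
#!/bin/sh
# Hypothetical helper: create an overlay network only when it is missing.
# Usage: ensure_overlay_network NAME SUBNET [extra "docker network create" flags]
ensure_overlay_network() {
    name="$1"
    subnet="$2"
    shift 2
    # "docker network inspect" exits non-zero when the network does not exist
    if docker network inspect "$name" >/dev/null 2>&1; then
        echo "network $name already exists"
    else
        docker network create --driver overlay --subnet "$subnet" "$@" "$name"
    fi
}

# Example calls matching the two networks above (run on a manager node):
#   ensure_overlay_network proxy 10.0.1.0/24
#   ensure_overlay_network socket-proxy-traefik 10.0.2.0/24 --internal
```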

Users

Some containers run as the root user by default, so where possible I changed this to a non-root user and created a matching account on the host.

sudo useradd -r -u 201 -s /usr/sbin/nologin -M traefik
sudo useradd -r -u 202 -s /usr/sbin/nologin -M authelia

Then set the permissions on the container directory

sudo chown -R 201:201 ../traefik

Secrets

Since Docker Swarm fully supports secrets, I could move away from my previous approach of creating a file for each secret and create them directly in Swarm instead.

echo -n '[email protected]' | docker secret create traefik_cf_api_email -
echo -n 'apikey' | docker secret create traefik_cf_api_key -

Useful Commands

Commands I find myself using frequently

List Services: docker service ls
Check logs: docker service logs -f crowdsec
Remove a stack: docker stack rm traefik
Deploy a stack: docker stack deploy traefik -c /mnt/containers/_swarm/prod/traefik/docker-compose.yml
Re-deploy a stack: docker stack rm traefik && docker stack deploy traefik -c /mnt/containers/_swarm/prod/traefik/docker-compose.yml
Find active node of a running service: docker service ps traefik or docker stack ps traefik

Other commands...
List Stacks: docker stack ls
Inspect Service: docker service inspect --pretty traefik
Scale to 5 instances: docker service scale helloworld=5 (Scale to 0 would effectively stop the container)
Remove Service: docker service rm traefik
Remove all services: docker service rm $(docker service ls -q)
Update a service: docker service update --force bookstack

Most of the time I've found that after making a change, a docker service update didn't have the required effect, and the stack needed to be removed and recreated for the change to be picked up.
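
One thing to watch when removing and recreating a stack in a single command is that docker stack rm returns before the cleanup has finished, so an immediate deploy can collide with resources that are still being removed. A minimal sketch of a wrapper that waits first, assuming a hypothetical function name (redeploy_stack) and relying on the com.docker.stack.namespace label that Swarm puts on stack resources:

```shell
#!/bin/sh
# Hypothetical helper: remove a stack, wait until its services and
# stack-owned networks are gone, then deploy it again.
# Usage: redeploy_stack STACK COMPOSE_FILE
redeploy_stack() {
    stack="$1"
    compose_file="$2"

    docker stack rm "$stack"

    # Poll until nothing carries this stack's namespace label any more.
    while [ -n "$(docker service ls -q --filter "label=com.docker.stack.namespace=$stack")" ] ||
          [ -n "$(docker network ls -q --filter "label=com.docker.stack.namespace=$stack")" ]; do
        sleep 2
    done

    docker stack deploy -c "$compose_file" "$stack"
}

# Example (run on a manager node):
#   redeploy_stack traefik /mnt/containers/_swarm/prod/traefik/docker-compose.yml
```

External networks are not owned by the stack, so they are left alone and do not hold up the loop.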

Draining

Draining a node stops the Swarm tasks running on it, reschedules them on other nodes and prevents new tasks from being placed there. If you also run containers outside of Swarm, as I do, draining won't touch those.

docker node ls
docker node update --availability drain node1

And to make the node available again... docker node update --availability active node1

Making a node active again does not re-balance existing containers onto it. The node only receives tasks again:

  • during a service update that scales up
  • during a rolling update
  • when another node is set to drain
  • when a task fails on another active node
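
If you would rather not wait for one of those events, a forced update of each service gives the scheduler a chance to spread tasks again. This is a minimal sketch with a hypothetical function name (rebalance_services), built on the docker service update --force command listed earlier.

```shell
#!/bin/sh
# Hypothetical helper: force a rolling restart of every service so the
# scheduler can place tasks on a node that has just been made active again.
rebalance_services() {
    for service in $(docker service ls --format '{{.Name}}'); do
        echo "rebalancing $service"
        docker service update --force "$service" >/dev/null
    done
}

# Example (run on a manager node after setting the node back to active):
#   rebalance_services
```

A forced update restarts the tasks, so single-replica services will see a short interruption.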

Deployment Issues

If a container fails to start and no logs are generated then it's useful to check the error with the following command

docker service ps --no-trunc portainer_portainer

If your stack has multiple services, I found it useful to set replicas to 0 in the compose file for the services I wasn't investigating. This prevents those containers from starting and makes troubleshooting a bit easier.

Constraints

Another useful feature is using constraints to prevent a container running on a specific node

Prevent running on node3

deploy:
  placement:
    constraints:
      - node.hostname != node3

Force to only run on node1:

deploy:
  placement:
    constraints:
      - node.hostname == node1

Socket Proxy

I've only got a couple of containers that require access to the Docker socket, but I have been using a socket proxy for a while so that each one only gets the access it needs.

As part of this project I started creating separate socket proxy configurations for the containers that need one.

services:
  socket-proxy-traefik:
    image: lscr.io/linuxserver/socket-proxy:latest
    environment:
      - LOG_LEVEL=info # debug,info,notice,warning,err,crit,alert,emerg
      ## Variables match the URL prefix (i.e. AUTH blocks access to /auth/* parts of the API, etc.).
      # 0 to revoke access.
      # 1 to grant access.
      ## Granted by Default
      - EVENTS=1
      - PING=1
      ## Revoked by Default
      # Security critical
      - AUTH=0
      - SECRETS=0
      - POST=0
      # Other
      - NETWORKS=1
      - SERVICES=1
      - TASKS=1
      - VERSION=1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - socket-proxy-traefik
    tmpfs:
      - /run
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 60s
      placement:
        constraints:
          - "node.role==manager" # Ensures it only runs on a manager node
      labels:
        - "gantry.services.excluded=true"

networks:
  socket-proxy-traefik:
    external: true

socket-proxy/docker-compose-traefik.yml

services:
  socket-proxy-gantry:
    image: lscr.io/linuxserver/socket-proxy:latest
    environment:
      - LOG_LEVEL=info # debug,info,notice,warning,err,crit,alert,emerg
      ## Variables match the URL prefix (i.e. AUTH blocks access to /auth/* parts of the API, etc.).
      # 0 to revoke access.
      # 1 to grant access.
      ## Granted by Default
      - EVENTS=1
      - PING=1
      ## Revoked by Default
      # Security critical
      - AUTH=0
      - SECRETS=0
      - POST=1
      # Other
      - ALLOW_START=1
      - ALLOW_STOP=1
      - ALLOW_RESTARTS=1
      - CONTAINERS=1
      - DISTRIBUTION=1
      - IMAGES=1
      - INFO=1
      - NETWORKS=1
      - NODES=1
      - SERVICES=1
      - SWARM=1
      - TASKS=1
      - VERSION=1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - socket-proxy-gantry
    tmpfs:
      - /run
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 60s
      placement:
        constraints:
          - "node.role==manager"

networks:
  socket-proxy-gantry:
    external: true

socket-proxy/docker-compose-gantry.yml


Gantry + Apprise

I previously used Watchtower for automatic container updates but had to find an alternative for Swarm, which is where I came across Gantry.

It also has the nice ability to roll back a service if an update fails.

Gantry has no native alerting built in, so this is where Apprise comes in...

I signed up for Pushover years ago and have never looked at an alternative, so here is the config to make that work. I did try to keep the Pushover URL in a Docker secret but couldn't get it to work; that's a job for another day.

Create notifications network
docker network create --driver overlay --subnet x.x.x.x/24 notifications

Swarm Configuration

services:
  gantry:
    image: shizunge/gantry:latest
    networks:
      - notifications
      - socket-proxy-gantry
    environment:
      - "DOCKER_HOST=tcp://socket-proxy-gantry:2375"
      - "GANTRY_NODE_NAME={{.Node.Hostname}}"
      - "GANTRY_SLEEP_SECONDS=86400"
      - "GANTRY_NOTIFICATION_CONDITION=on-change"
      - "GANTRY_NOTIFICATION_APPRISE_URL=http://apprise:8000/notify"
      - "TZ=Europe/London"
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role==manager
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 60s

  # Refer to https://github.com/caronc/apprise-api for all configurations of the API service.
  apprise:
    image: caronc/apprise:latest
    networks:
      - notifications
    environment:
      # Apprise supports almost all of the most popular notification services.
      # Refer to https://github.com/caronc/apprise for all supported notification services.
      - "APPRISE_STATELESS_URLS=pover://userkey@token/?sound=spacealarm"
    volumes:
      - "/mnt/containers/_swarm/prod/apprise/config:/config"
      - "/mnt/containers/_swarm/prod/apprise/plugin:/plugin"
      - "/mnt/containers/_swarm/prod/apprise/attach:/attach"
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 60s

networks:
  notifications:
    external: true
  socket-proxy-gantry:
    external: true

gantry/docker-compose.yml

Send Test Notification
apprise -vv -t "Test Message Title" -b "Test Message Body" "pover://userkey@token/?sound=spacealarm"

Excluding Containers
For some containers I allow auto updates by leaving the image tag set to latest. For those that I don't want to auto update, I set the following label

deploy:
  mode: replicated
  replicas: 1
  restart_policy:
    condition: on-failure
    delay: 5s
    max_attempts: 3
    window: 60s
  labels:
    - "gantry.services.excluded=true"

This label excludes the service from Gantry updates, but I also pin the image tag to a specific version to prevent any accidental updates and to ensure each node runs the same version.
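
To spot images that are still floating on latest, a small check over the image references can help. This is a minimal sketch with a hypothetical function name (is_pinned); the traefik:v3.1 tag in the example loop is only an illustrative version, not the one I run.

```shell
#!/bin/sh
# Hypothetical helper: succeed when an image reference is pinned to a digest
# or to a tag other than "latest".
is_pinned() {
    ref="$1"
    case "$ref" in
        *@sha256:*) return 0 ;;   # digest references are always pinned
    esac
    last="${ref##*/}"             # drop registry/namespace (may contain a port)
    case "$last" in
        *:*) tag="${last##*:}" ;;
        *)   tag="latest" ;;      # no tag means Docker pulls "latest"
    esac
    [ "$tag" != "latest" ]
}

for image in traefik:v3.1 lscr.io/linuxserver/socket-proxy:latest shizunge/gantry; do
    if is_pinned "$image"; then
        echo "$image: pinned"
    else
        echo "$image: floating"
    fi
done
```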


keepalived

I have a few services running outside of Docker Swarm, such as Pi-hole. I wanted to move away from assigning two DNS servers via DHCP and instead use a single IP address. This ensures traffic always goes to the expected server while still providing high availability if the primary node is rebooted.

For other services that required referencing a specific Swarm node IP, I wanted to guarantee continued availability even if that node went down. Although the routing mesh allows any Swarm node to handle requests, I didn’t want traffic directed to a node that was offline.

To address this, I configured Virtual IPs (VIPs) for the containers. This ensures the service remains accessible even if a node becomes unavailable, as the VIP automatically moves to another healthy node during maintenance or failure scenarios.

On each node I have Ansible install keepalived, configure keepalived.conf and copy the relevant scripts across. A Jinja2 template is used to set the state and priority.

My keepalived.conf looks like the following. Every other node has the state set to BACKUP and a lower priority, both of which are defined in the host variables.

#global_defs { 
#    enable_script_security
#}

vrrp_script chk_pihole {
    script "/etc/keepalived/scripts/check_pihole.sh"
    interval 5
    timeout 3
    fall 2
    rise 2
    user root
}

vrrp_script chk_snmpexporter {
    script "/etc/keepalived/scripts/check_snmpexporter.sh"
    interval 5
    timeout 3
    fall 2
    rise 2
    user root
}


vrrp_script chk_mqtt {
    script "/etc/keepalived/scripts/check_mqtt.sh"
    interval 5
    timeout 3
    fall 2
    rise 2
    user root
}

vrrp_script chk_traefik {
    script "/etc/keepalived/scripts/check_traefik.sh"
    interval 5
    timeout 3
    fall 2
    rise 2
    user root
}


vrrp_instance pihole {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass yourpassword
    }
    virtual_ipaddress {
        10.0.0.101
    }

    track_script {
        chk_pihole
    }
}

vrrp_instance snmpexporter {
    state MASTER
    interface eth0
    virtual_router_id 53
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass yourpassword
    }
    virtual_ipaddress {
        10.0.0.103
    }
    track_script {
        chk_snmpexporter
    }
}

 
vrrp_instance mqtt {
    state MASTER  
    interface eth0  
    virtual_router_id 54
    priority 100
    advert_int 1    
    authentication {
        auth_type PASS
        auth_pass yourpassword
    }
    virtual_ipaddress {
        10.0.0.104
    }
    track_script {  
        chk_mqtt
    }
}

vrrp_instance traefik {
    state MASTER
    interface eth0
    virtual_router_id 55
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass yourpassword
    }
    virtual_ipaddress {
        10.0.0.105
    }
    track_script {
        chk_traefik
    }
}

/etc/keepalived/keepalived.conf

💡
The enable_script_security global option enforces stricter checks, but it causes issues with my setup: the scripts would run as the keepalived_script user (UID 999, GID 991), which can't run the docker commands the checks rely on. Instead I have locked the scripts directory down to root only and set each script to run as root.

The check scripts then need to exist under /etc/keepalived/scripts and be owned by root

sudo mkdir /etc/keepalived/scripts
sudo chmod 700 /etc/keepalived/scripts

#!/bin/bash

# Check if traefik container is running
if docker ps --filter "name=traefik" --filter "status=running" | grep -q traefik_traefik; then
    exit 0
else
    exit 1
fi

/etc/keepalived/scripts/check_traefik.sh

Restart the keepalived service for changes to take effect

sudo systemctl restart keepalived.service
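
After a restart it is useful to confirm which node actually holds each VIP. The following is a minimal sketch, assuming a hypothetical function name (holds_vip) and the one-line-per-address output of ip -o -4 addr show; the addresses are the four VIPs from the keepalived.conf above.

```shell
#!/bin/sh
# Hypothetical helper: succeed when this node currently holds the given VIP.
# Matching " ADDRESS/" as a fixed string avoids 10.0.0.10 matching 10.0.0.101.
holds_vip() {
    ip -o -4 addr show | grep -qF " $1/"
}

for vip in 10.0.0.101 10.0.0.103 10.0.0.104 10.0.0.105; do
    if holds_vip "$vip"; then
        echo "$vip: held by this node"
    else
        echo "$vip: held elsewhere (or not assigned)"
    fi
done
```

Running it on each node before and after stopping a tracked container shows whether the VIP moves as expected.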


Ansible

I was already using Ansible prior to migrating to Swarm to allow quick rebuilds of a host when a new version of Raspberry Pi OS is released. Migrating to Swarm has brought some additional complexities but Ansible has helped make rebuilds easy.

This is still a bit of a work in progress, but all my existing playbooks are below. My original ones were all under a playbooks directory, but when I revisited this I ended up using some Jinja2 templates, so I set up the relevant folder structure and used roles.

Running a playbook would be done via ansible-playbook /mnt/nas/ansible/playbook_docker.yml -i /mnt/nas/ansible/inventory_new.yml

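To avoid specifying the inventory or typing the vault password on every run, I set some defaults in ansible.cfg under my home directory. Ansible picks up an ansible.cfg from the current working directory, so this file applies when playbooks are run from the home directory; the per-user location that applies everywhere is ~/.ansible.cfg.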

[defaults]
inventory = /mnt/nas/ansible/inventory.yml
vault_password_file = ~/.ansible/vault_pass.txt
roles_path = /mnt/nas/ansible/roles

~/ansible.cfg

pibuild:
   hosts:
      sn3:
        ansible_host: 10.0.0.3
   vars:
     ansible_user: nick

/mnt/nas/ansible/inventory.yml

Some useful commands...

Install pipx: sudo apt install pipx
Install ansible: pipx install --include-deps ansible
Add ~/.local/bin to PATH: pipx ensurepath
Update ansible: pipx upgrade ansible
Verify inventory: ansible-inventory -i inventory.yml --list
Run in check mode: ansible-playbook --check playbook.yaml
Start at Task: ansible-playbook playbook.yml --start-at-task="install packages"
Run interactively / step mode: ansible-playbook playbook.yml --step
Ask for become password: --ask-become-pass
Info: ansible --version and ansible-config dump


Ansible Vault

I started using Resend as an SMTP relay, so I thought this was a good opportunity to try out the Vault.

ansible-vault create /mnt/nas/ansible/group_vars/all/vault.yml

ansible-vault edit /mnt/nas/ansible/group_vars/all/vault.yml

Add the line smtp_pass: "apikey" so the variable can be referenced in the template

View the entry with ansible-vault view /mnt/nas/ansible/group_vars/all/vault.yml

I seem to be missing notes on this, but I had some issues with the vault and needed to use the following commands

pipx inject ansible-playbook passlib
pipx inject ansible passlib
pipx inject ansible-core passlib

To avoid being prompted for the Vault password, store it in a file that only your user can read

nano ~/.ansible/vault_pass.txt
chmod 600 ~/.ansible/vault_pass.txt
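
Creating the file in an editor and tightening the permissions afterwards leaves a short window where it may be readable by others. A minimal sketch of an alternative, assuming a hypothetical function name (write_vault_pass) that reads the password from standard input:

```shell
#!/bin/sh
# Hypothetical helper: write the vault password read from stdin to a file
# that is only ever readable by the current user.
# The body runs in a subshell so the umask change does not leak out.
write_vault_pass() (
    target="$1"
    umask 077                        # new files are created as 0600
    mkdir -p "$(dirname "$target")"
    cat > "$target"
    chmod 600 "$target"              # also fixes a pre-existing file
)

# Usage: paste the password, press Enter if you like, then Ctrl-D
#   write_vault_pass "$HOME/.ansible/vault_pass.txt"
```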


Playbooks

Install Packages

Install required packages and only install apcupsd on sn3

- name: Raspberry Pi Updates Playbook
  hosts: pibuild
  become: true
  tasks:
   - name: Update apt repo and cache
     apt:
       update_cache: yes
   - name: Upgrade all packages to the latest version
     apt:
       name: "*"
       state: latest
   - name: Install a list of packages
     apt:
       pkg:
       - tldr
       - tmux
       - tcpdump
       - stress
       - sshfs
       - snmp
       - smartmontools
       - nmap
       - iperf3
       - iotop
       - htop
       - hdparm
       - dnsutils
       - apticron
       - msmtp-mta
       - nethogs
       - nload
       - screen
       - ca-certificates
       - gnupg
       - unattended-upgrades
       - bat
       - sg3-utils
       - keepalived
       - fish
       - tree
       - rsyslog
       - duf

   - name: Install apcupsd only on sn3
     apt:
       name: apcupsd
       state: present
     when: inventory_hostname == "sn3"

ansible/playbooks/playbook_apt.yml

Boot Config

Ensure apparmor is enabled and set required options for cAdvisor monitoring

- name: Playbook to update cmdline boot options
  hosts: pibuild
  become: true
  tasks:

    - name: Read current cmdline.txt
      ansible.builtin.command: cat /boot/firmware/cmdline.txt
      register: current_cmdline
      changed_when: false
      tags: bootflags

    # cgroup_enable=memory, swapaccount=1 and cgroup_memory=1 are required by cAdvisor.
    # Existing copies of each flag are stripped first so each one is only present once.
    - name: Ensure AppArmor and memory cgroup flags are present in /boot/firmware/cmdline.txt
      ansible.builtin.lineinfile:
        path: /boot/firmware/cmdline.txt
        regexp: '^.*$'
        line: >-
          {{ current_cmdline.stdout
           | regex_replace('apparmor=1', '')
           | regex_replace('security=apparmor', '')
           | regex_replace('cgroup_enable=memory', '')
           | regex_replace('swapaccount=1', '')
           | regex_replace('cgroup_memory=1', '')
           | regex_replace('  +', ' ')
           | trim
          }} apparmor=1 security=apparmor cgroup_enable=memory swapaccount=1 cgroup_memory=1
        backrefs: true
      register: cmdline_update
      tags: bootflags

ansible/playbooks/playbook_boot.yml
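
The strip-then-append idea in that task is what keeps it safe to run repeatedly. Here is the same logic as a minimal shell sketch, assuming a hypothetical function name (merge_cmdline) and a made-up sample command line; unlike the playbook's regex_replace, it compares whole words.

```shell
#!/bin/sh
# Hypothetical helper mirroring the playbook logic: drop any existing copy of
# each flag from the current command line, then append every flag once.
# Usage: merge_cmdline "CURRENT CMDLINE" FLAG...
merge_cmdline() (
    current="$1"
    shift
    set -f                           # no globbing while word-splitting
    result=""
    for word in $current; do
        keep=yes
        for flag in "$@"; do
            if [ "$word" = "$flag" ]; then
                keep=no
            fi
        done
        if [ "$keep" = yes ]; then
            result="$result $word"
        fi
    done
    for flag in "$@"; do
        result="$result $flag"
    done
    printf '%s\n' "${result# }"
)

# Sample input only; a real cmdline.txt will differ
merge_cmdline "console=tty1 root=/dev/sda2 apparmor=1 rootwait" \
    apparmor=1 security=apparmor cgroup_enable=memory
```

Feeding the output back in with the same flags returns the same line, which is the property the playbook relies on.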

.bashrc

Create bash aliases and change timestamp for history command

- name: Update .bashrc
  hosts: pibuild
  tasks:
   - name: Append Aliases
     lineinfile:
       dest: "/home/nick/.bashrc"
       line: "{{ item }}"
       state: present
       insertafter: EOF
     loop:
       - "alias updates='sudo apt update; apt list --upgradeable'"
       - "alias running_services='systemctl list-units --type=service --state=running'"
       - "alias dockerstop-all='docker stop $(docker ps -q)'"
       - "alias dockerstart='/home/nick/scripts/dockerstart.sh'"
       - "export HISTTIMEFORMAT='%d-%m-%Y %T '"

ansible/playbooks/playbook_bashrc.yml

Update Config Files

Configure various changes for updates, msmtp and configure syslog forwarding to graylog.

Ensure only the Raspberry Pi devices get the 51unattended-upgrades file, which allows Raspberry Pi updates to be installed

Unattended-Upgrade::Origins-Pattern {
        "o=Raspberry Pi Foundation,a=stable";
 };

51unattended-upgrades

- name: Raspberry Pi Configs Playbook
  hosts: pibuild
  become: true
  tasks:
   - name: Copy over 51unattended config to add Raspberry Pi updates
     ansible.builtin.copy:
       src: ../configs/51unattended-upgrades
       dest: /etc/apt/apt.conf.d/51unattended-upgrades
     when: inventory_hostname != "minipc"

   - name: Deploy msmtp config to /etc
     ansible.builtin.template:
       src: ../roles/msmtprc/templates/msmtprc.j2
       dest: /etc/msmtprc
       owner: root
       group: root
       mode: '0600'
   - name: Deploy msmtp config to home dir
     ansible.builtin.template:
       src: ../roles/msmtprc/templates/msmtprc.j2
       dest: /home/{{ ansible_user }}/.msmtprc
       owner: "{{ ansible_user }}"
       group: "{{ ansible_user }}"
       mode: '0600'

   - name: Update 50unattended-upgrades
     template:
       src: ../roles/unattended_upgrades/templates/50unattended-upgrades.j2
       dest: /etc/apt/apt.conf.d/50unattended-upgrades

   - name: Copy apcupsd config to SN3
     ansible.builtin.copy:
       src: ../roles/apcupsd/files/sn3/apcupsd.conf
       dest: /etc/apcupsd/
     when: inventory_hostname == "sn3"

   - name: Restart apcupsd
     ansible.builtin.service:
       name: apcupsd
       state: restarted
     # apcupsd is only installed on sn3, so only restart it there
     when: inventory_hostname == "sn3"

   - name: Set fish as the default shell for nick
     ansible.builtin.user:
       name: nick
       shell: /usr/bin/fish

   - name: Copy fish aliases
     ansible.builtin.copy:
       src: "{{ item }}"
       dest: /home/nick/.config/fish/functions/
       owner: nick
       group: nick
       mode: '0644'
     loop: "{{ lookup('fileglob', '../configs/aliases/*.fish', wantlist=True) }}"

   - name: Configure syslog forwarding to Graylog
     ansible.builtin.lineinfile:
       path: /etc/rsyslog.conf
       line: '*.*@10.0.0.1:5140;RSYSLOG_SyslogProtocol23Format'
       state: present
       #create: yes
       backup: yes

   - name: Restart rsyslog
     ansible.builtin.service:
       name: rsyslog
       state: restarted

   - name: Ensure vm.max_map_count is 262144 for Graylog
     sysctl:
       name: vm.max_map_count
       value: 262144
       state: present
       reload: yes

ansible/playbooks/playbook_configs.yml

Templates

The msmtprc template configures the file in /etc and inserts the SMTP password from Ansible Vault. The sender address is built from the hostname of the respective node

defaults
auth           on
tls            on
logfile        ~/.msmtp.log

account        default
host           smtp.resend.com
port           587
from           {{ ansible_hostname }}@example.com
user           resend
password       {{ smtp_pass }}

ansible/roles/msmtprc/templates/msmtprc.j2

For apticron email alerts, use the template to set the email address using the hostname

   - name: Update apticron.conf
     template:
       src: ../roles/apticron/templates/apticron.conf.j2
       dest: /etc/apticron/apticron.conf
       owner: root
       group: root
       mode: '0644'

   - name: Restart apticron service
     ansible.builtin.service:
       name: apticron
       state: restarted

ansible/roles/apticron/tasks/main.yml

CUSTOM_FROM="{{ inventory_hostname | lower }}@example.com"

ansible/roles/apticron/templates/apticron.conf.j2

- hosts: pibuild
  become: true
  roles:
    - apticron

ansible/playbooks/apticron.yml

Run the playbook
ansible-playbook ../ansible/playbooks/apticron.yml

For unattended-upgrades, configure what should be updated and configure the sender address

"origin=Debian,codename={distro_codename}-updates";
"origin=Debian,codename={distro_codename}-proposed-updates";
"origin=Debian,codename={distro_codename},label=Debian";
"origin=Debian,codename={distro_codename},label=Debian-Security";
"origin=Debian,codename={distro_codename}-security,label=Debian-Security";

Unattended-Upgrade::Mail "[email protected]";
Unattended-Upgrade::Sender "{{ inventory_hostname | lower }}@example.co.uk";        

ansible/roles/unattended_upgrades/templates/50unattended-upgrades.j2

Scripts

Create required directories for scripts to be copied in to and set permissions while only copying the relevant scripts depending on hostname.

- name: Ensure target script directories exist
  file:
    path: "{{ item.path }}"
    state: directory
    owner: "{{ item.owner }}"
    group: "{{ item.group }}"
    mode: "{{ item.mode }}"
  loop:
    - { path: "/root/scripts", owner: "root", group: "root", mode: "0700" }
    - { path: "/home/nick/scripts", owner: "nick", group: "nick", mode: "0755" }

- name: Copy backup script to /root/scripts
  copy:
    src: "{{ item }}"
    dest: "/root/scripts/{{ item | basename }}"
    mode: '0755'
  with_fileglob:
    - "../roles/scripts/files/common/root/*"

- name: Copy scripts to /home/nick/scripts
  copy:
    src: "{{ item }}"
    dest: "/home/nick/scripts/{{ item | basename }}"
    owner: nick
    group: nick
    mode: '0755'
  with_fileglob:
    - "../roles/scripts/files/common/nick/*"
  when: inventory_hostname | lower != "sn1"

- name: Copy container backup scripts to SN1 /root/scripts
  copy:
    src: "{{ item }}"
    dest: "/root/scripts/{{ item | basename }}"
    mode: '0755'
  with_fileglob:
    - "../roles/scripts/files/sn1/root/*"
  when: inventory_hostname | lower == "sn1"

- name: Copy SN1 specific scripts to /home/nick/scripts
  copy:
    src: "{{ item }}"
    dest: "/home/nick/scripts/{{ item | basename }}"
    owner: nick
    group: nick
    mode: '0755'
  with_fileglob:
    - "../roles/scripts/files/sn1/nick/*"
  when: inventory_hostname | lower == "sn1"

ansible/roles/scripts/tasks/main.yml

Draining

Find the local hostname and drain the node

sudo docker node update --availability drain $(hostname)
sudo docker stop cadvisor

ansible/roles/scripts/files/common/nick/dockerstop.sh

Make the node available for tasks

sudo docker node update --availability active $(hostname)
sudo docker compose -f /mnt/containers/_swarm/prod/cadvisor/docker-compose.yml up -d

ansible/roles/scripts/files/common/nick/dockerstart.sh

I run some containers outside of Swarm on one node (sn1), so this version of the script only needs to be copied to that node

sudo docker node update --availability drain $(hostname)
sudo docker stop pi-hole stubby homeassistant govee2mqtt cadvisor qbittorrent

ansible/roles/scripts/files/sn1/nick/dockerstop.sh

sudo docker node update --availability active $(hostname)
sudo docker compose -f /home/nick/containers/pihole/docker-compose.yml up -d
sudo docker compose -f /mnt/containers/_swarm/prod/homeassistant/docker-compose.yml up -d --force-recreate
sudo docker compose -f /mnt/containers/_swarm/prod/govee2mqtt/docker-compose.yml up -d
sudo docker compose -f /mnt/containers/_swarm/prod/qbittorrent/docker-compose.yml up -d
sudo docker compose -f /mnt/containers/_swarm/prod/cadvisor/docker-compose.yml up -d

ansible/roles/scripts/files/sn1/nick/dockerstart.sh

Backups

I only care about backing up /etc and the home directories, just in case. This script is only copied to the root user on all nodes

#!/bin/bash
# Back up /etc and the home directories.
# Note: the archives are gzip-compressed tarballs despite the .zip extension.
HOST=$(hostname -s)
tar -pzcvf "/mnt/backups/etc/$(date +%Y%m%d)_${HOST}-etc.zip" /etc
tar -pzcvf "/mnt/backups/homes/$(date +%Y%m%d)_${HOST}-home-nick.zip" /home/nick
tar -pzcvf "/mnt/backups/homes/$(date +%Y%m%d)_${HOST}-home-root.zip" /root

ansible/roles/scripts/files/common/root/backup_host.sh
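
A backup is only useful if it can be read back, so it can be worth listing each archive straight after creating it. A minimal sketch, assuming a hypothetical function name (backup_and_verify) and that tar with gzip support is available, as it is for the script above:

```shell
#!/bin/sh
# Hypothetical helper: archive a directory and confirm the archive is readable.
# Usage: backup_and_verify SOURCE_DIR ARCHIVE_PATH
backup_and_verify() {
    src="$1"
    archive="$2"
    tar -pzcf "$archive" "$src" || return 1
    # Listing the archive makes tar read and decompress every member
    tar -tzf "$archive" >/dev/null || return 1
    echo "ok: $archive"
}

# Example:
#   backup_and_verify /etc "/mnt/backups/etc/$(date +%Y%m%d)_$(hostname -s)-etc.zip"
```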

Cron

Set the MAILTO address and create the cron jobs, some of which are defined for all hosts and some of which are host specific. In my case, all database backup scripts get copied to sn1 only, and the host-specific jobs are created from the definitions in ansible/roles/cron/vars/main.yml

- name: Set MAILTO for root
  ansible.builtin.cron:
    user: root
    name: "MAILTO"
    env: yes
    job: "[email protected]"


- name: Set MAILTO for nick
  ansible.builtin.cron:
    user: nick
    name: "MAILTO"
    env: yes
    job: "[email protected]"


- name: Add common cron jobs
  cron:
    name: "{{ item.name }}"
    user: "{{ item.user }}"
    job: "{{ item.job }}"
    minute: "{{ item.minute }}"
    hour: "{{ item.hour }}"
    weekday: "{{ item.weekday }}"
  loop: "{{ common_cron_jobs }}"

- name: Add host-specific cron jobs
  cron:
    name: "{{ item.name }}"
    user: "{{ item.user }}"
    job: "{{ item.job }}"
    minute: "{{ item.minute }}"
    hour: "{{ item.hour }}"
    weekday: "{{ item.weekday }}"
  loop: "{{ host_cron_jobs[inventory_hostname] | default([]) }}"
  when: host_cron_jobs[inventory_hostname] is defined

ansible/roles/cron/tasks/main.yml

The only common cron job I have for all my hosts is the backup script above, so this is created on each node

common_cron_jobs:
  - { name: "Backup /etc and /home", user: root, job: "/root/scripts/backup_host.sh", minute: "0", hour: "3", weekday: "5" }

ansible/roles/cron/defaults/main.yml
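
To make the mapping from those variables to an actual crontab concrete, here is a minimal sketch with a hypothetical function name (cron_entry). Day-of-month and month are not set in the job definition, so they are shown as *, the cron module's default, and the comment line is the marker the module uses to find its own entries.

```shell
#!/bin/sh
# Hypothetical helper: print the crontab entry that corresponds to one job
# definition. Usage: cron_entry NAME MINUTE HOUR WEEKDAY JOB
cron_entry() {
    printf '#Ansible: %s\n' "$1"
    printf '%s %s * * %s %s\n' "$2" "$3" "$4" "$5"
}

cron_entry "Backup /etc and /home" 0 3 5 /root/scripts/backup_host.sh
```

For the job above this prints a marker line followed by 0 3 * * 5 /root/scripts/backup_host.sh, which runs at 03:00 every Friday.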

I run the rest of my scripts from one node

host_cron_jobs:
  sn1:
    - { name: "Bookstack DB", user: root, job: "/root/scripts/backup_bookstackdb.sh", minute: "0", hour: "4", weekday: "5" }
    - { name: "Joplin DB", user: root, job: "/root/scripts/backup_joplindb.sh", minute: "5", hour: "4", weekday: "5" }
    - { name: "Firefly3 DB", user: root, job: "/root/scripts/backup_fireflydb.sh", minute: "10", hour: "4", weekday: "5" }
    - { name: "Ghost DB", user: root, job: "/root/scripts/backup_ghostdb.sh", minute: "15", hour: "4", weekday: "5" }
    - { name: "Cleanup", user: root, job: "/root/scripts/cleanup.sh", minute: "0", hour: "3", weekday: "7" }

ansible/roles/cron/vars/main.yml

- hosts: pibuild
  become: true
  roles:
    - cron

ansible/playbooks/cron.yml

Docker Setup

Download and install docker packages and enable metrics

- name: Raspberry Pi Docker Playbook
  hosts: pibuild
  become: true
  tasks:
   - name: Add Docker GPG apt Key
     apt_key:
        url: https://download.docker.com/linux/debian/gpg
        state: present

   - name: Add Docker Repository
     apt_repository:
        repo: deb https://download.docker.com/linux/debian bookworm stable
        state: present

   - name: Update apt repo and cache
     apt:
       update_cache: yes

   - name: Install docker packages
     apt:
       pkg:
       - docker-ce
       - docker-ce-cli
       - containerd.io
       - docker-buildx-plugin
       - docker-compose-plugin

   - name: Ensure group "docker" exists
     ansible.builtin.group:
       name: docker
       state: present

   - name: Ensure Docker daemon.json for metrics
     copy:
       dest: /etc/docker/daemon.json
       content: |
         {
           "metrics-addr": "0.0.0.0:9323"
         }
       owner: root
       group: root
       mode: '0644'

   - name: Restart docker service
     ansible.builtin.service:
       name: docker
       state: restarted

ansible/playbooks/playbook_docker.yml

Container Users

Create host users for the containers that run as non-root users

   - name: Ensure container groups exist
     group:
       name: "{{ item.name }}"
       gid: "{{ item.gid }}"
       system: yes
     loop: "{{ container_users }}"

   - name: Ensure container service accounts exist
     user:
       name: "{{ item.name }}"
       uid: "{{ item.uid }}"
       group: "{{ item.name }}"
       shell: /usr/sbin/nologin
       system: yes
       create_home: no
     loop: "{{ container_users }}"

ansible/roles/container_users/tasks/main.yml

container_users:
  - { name: "traefik",      uid: 201, gid: 201 }
  - { name: "authelia",     uid: 202, gid: 202 }

ansible/group_vars/all/main.yml

- hosts: pibuild
  become: true
  roles:
    - container_users

ansible/playbooks/container_users.yml
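
Since the UIDs and GIDs are pinned in group_vars, a quick check on each node can confirm the accounts came out with the expected IDs. This is a minimal sketch and the function name account_matches is hypothetical. The example values are the ones from container_users above.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when account $1 exists with UID $2
# and primary GID $3.
account_matches() {
    local name="$1" uid="$2" gid="$3"
    [ "$(id -u "${name}" 2>/dev/null)" = "${uid}" ] &&
        [ "$(id -g "${name}" 2>/dev/null)" = "${gid}" ]
}

# Example:
# account_matches traefik 201 201 || echo "traefik account is wrong on $(hostname)"
```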

Prometheus

I do need to revisit this one. Initially I had issues downloading the file directly, so instead I just place the tarball in the installs directory.

Installs prometheus node exporter and configures it as a service

- name: Prometheus Node Exporter
  hosts: pibuild
  become: true
  vars:
    # Single place to bump the version; must match the tarball in ../installs
    node_exporter_version: "1.9.1"
  tasks:
   - name: Extract prometheus node exporter from the local installs directory
     ansible.builtin.unarchive:
       src: "../installs/node_exporter-{{ node_exporter_version }}.linux-arm64.tar.gz"
       dest: "/tmp/"
   - name: Copy Extracted File to /usr/local/bin
     copy:
       src: "/tmp/node_exporter-{{ node_exporter_version }}.linux-arm64/node_exporter"
       dest: "/usr/local/bin/"
       mode: preserve
       remote_src: true
   - name: Create Node Exporter Service Unit File
     copy:
       content: |
         [Unit]
         Description=Prometheus Node Exporter
         After=network.target

         [Service]
         ExecStart=/usr/local/bin/node_exporter
         Restart=always

         [Install]
         WantedBy=multi-user.target
       dest: /etc/systemd/system/node_exporter.service
   - name: Enable and Start Node Exporter Service
     systemd:
       name: node_exporter
       daemon_reload: yes
       enabled: yes
       state: started

ansible/playbooks/playbook_nodeexporter.yml
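
For placing the tarball in the installs directory by hand, the release URL can be built from the version number. The sketch below follows the naming pattern of the node_exporter releases on GitHub. The function name node_exporter_url is made up, and the installs path in the example assumes you run it from the ansible directory.

```shell
#!/bin/bash
# Hypothetical helper: prints the GitHub release URL for a node_exporter
# tarball. $1 is the version, $2 the architecture (default arm64).
node_exporter_url() {
    local version="$1" arch="${2:-arm64}"
    printf 'https://github.com/prometheus/node_exporter/releases/download/v%s/node_exporter-%s.linux-%s.tar.gz\n' \
        "${version}" "${version}" "${arch}"
}

# Example (path is an assumption, adjust to your layout):
# curl -fsSL -o "installs/node_exporter-1.9.1.linux-arm64.tar.gz" "$(node_exporter_url 1.9.1)"
```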

Swap

Allocate swap and add to fstab

- name: Playbook to configure swap
  hosts: pibuild
  become: true
  tasks:
   - name: set swap_file variable
     set_fact:
       swap_file: /swapfile

   - name: check if swap file exists
     stat:
       path: "{{ swap_file }}"
     register: swap_file_check

   - name: create swap file
     command: fallocate -l 4G {{ swap_file }}
     when: not swap_file_check.stat.exists

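   - name: set permissions on swap file
     file:
       path: "{{ swap_file }}"
       mode: '0600'

   - name: Mark as swap
     command: mkswap {{ swap_file }}
     when: not swap_file_check.stat.exists

   # swapon fails if the file is already active, so only run it for a newly created file
   - name: Enable swap
     command: swapon {{ swap_file }}
     when: not swap_file_check.stat.exists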

   - name: Add to fstab
     lineinfile:
       dest: /etc/fstab
       regexp: "{{ swap_file }}"
       line: "{{ swap_file }} none swap sw 0 0"

ansible/playbooks/playbook_swap.yml
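
To confirm the swap file is really in use after the playbook has run, you can look it up in /proc/swaps. A minimal sketch follows. The function name swap_is_active is hypothetical, and the second argument only exists so the table can be swapped out for testing.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when swap file $1 is listed as active.
# $2 is the swaps table to read (defaults to /proc/swaps).
swap_is_active() {
    local swap_file="$1" swaps_table="${2:-/proc/swaps}"
    # Skip the header line, then compare the first column with the path.
    awk -v f="${swap_file}" 'NR > 1 && $1 == f { found = 1 } END { exit !found }' "${swaps_table}"
}

# Example:
# swap_is_active /swapfile && echo "swap is on"
```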

Networking

- name: Raspberry Pi Networking Playbook
  hosts: pibuild
  become: true
  tasks:
   - name: Update Ethernet connection profile
     become: true
     nmcli:
       conn_name: eth0
       ifname: eth0
       type: ethernet
       ip4: 10.0.0.3/24
       gw4: 10.0.0.254
       dns4:
         - 10.0.0.101
       dns4_search: local.lan
       method6: disabled
       state: present
   - name: Reboot
     ansible.builtin.reboot:

ansible/playbooks/playbook_networking.yml

NFS

- name: Setup NFS mounts
  hosts: pibuild
  become: true
  tasks:
   - name: Create and mount NFS Containers share
     ansible.posix.mount:
       src: 10.0.0.10:/volume1/containers
       path: /mnt/containers
       opts: defaults
       state: mounted
       fstype: nfs

ansible/playbooks/playbook_nfs.yml

SSHFS

I also have an SSHFS mountpoint. It will appear not to work until you've connected manually once and accepted the host key.

sudo sshfs -o allow_other,default_permissions,IdentityFile=/home/nick/.ssh/id_ed25519 nick@synology:/ /mnt/nas
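
The manual first connection could also be scripted by adding the host key ahead of time. The sketch below rests on a few assumptions: the mount runs as root, so the key belongs in root's known_hosts, and trust_host_key is a made-up function name. Blindly accepting whatever ssh-keyscan returns is trust on first use, so compare the fingerprint with the one shown on the NAS if that matters to you.

```shell
#!/bin/bash
# Hypothetical helper: adds the SSH host key(s) of $1 to known_hosts file $2
# (default: root's known_hosts) unless an entry for the host already exists.
trust_host_key() {
    local host="$1" known_hosts="${2:-/root/.ssh/known_hosts}"
    mkdir -p "$(dirname "${known_hosts}")"
    touch "${known_hosts}"
    # ssh-keygen -F succeeds when the host is already present in the file.
    if ssh-keygen -F "${host}" -f "${known_hosts}" >/dev/null 2>&1; then
        return 0
    fi
    ssh-keyscan "${host}" 2>/dev/null >> "${known_hosts}"
}

# Example (run as root):
# trust_host_key synology
```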

- name: Setup sshfs mountpoint
  hosts: pibuild
  become: true
  tasks:
   - name: Create nas dir for sshfs mount
     ansible.builtin.file:
       path: /mnt/nas
       state: directory
       mode: '770'
       owner: nick
       group: nick
   - name: Add sshfs mountpoint to fstab
     lineinfile:
       dest: /etc/fstab
       line: "sshfs#nick@synology:/ /mnt/nas/ fuse delay_connect,defaults,idmap=user,IdentityFile=/home/nick/.ssh/id_ed25519,port=22,uid=1000,gid=1000,allow_other 0 0"
   - name: Enable user_allow_other in fuse.conf
     ansible.builtin.lineinfile:
       path: /etc/fuse.conf
       regexp: '^#?\s*user_allow_other'
       line: user_allow_other
   - name: Mount /mnt/nas
     ansible.builtin.command: mount /mnt/nas

Keepalived

Template keepalived.conf with per-host values and copy over the check scripts, which make sure each VIP comes up on whichever node is running the matching container.

   - name: Copy keepalived.conf
     template:
      src: keepalived.conf.j2
      dest: /etc/keepalived/keepalived.conf
      owner: root
      group: root
      mode: '0644'

   - name: Copy keepalived scripts
     copy:
       src: "{{ item }}"
       dest: /etc/keepalived/
       owner: root
       group: root
       mode: '0700'
     loop: "{{ lookup('fileglob', 'scripts/*.sh', wantlist=True) }}"

   - name: Restart keepalived
     ansible.builtin.service:
       name: keepalived
       state: restarted

ansible/roles/keepalived/tasks/main.yml

#global_defs {
#    enable_script_security
#}

vrrp_script chk_pihole {
    script "/etc/keepalived/check_pihole.sh"
    interval 5
    timeout 3
    fall 2
    rise 2
    user root
}

vrrp_script chk_snmpexporter {
    script "/etc/keepalived/check_snmpexporter.sh"
    interval 5
    timeout 3
    fall 2
    rise 2
    user root
}


vrrp_script chk_mqtt {
    script "/etc/keepalived/check_mqtt.sh"
    interval 5
    timeout 3
    fall 2
    rise 2
    user root
}

vrrp_script chk_traefik {
    script "/etc/keepalived/check_traefik.sh"
    interval 5
    timeout 3
    fall 2
    rise 2
    user root
}


vrrp_instance pihole {
    state {{ keepalived_state }}
    interface eth0
    virtual_router_id 51
    priority {{ keepalived_priority }}
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass MySecret
    }
    virtual_ipaddress {
        10.0.0.101
    }

    track_script {
        chk_pihole
    }
}

vrrp_instance snmpexporter {
    state {{ keepalived_state }}
    interface eth0
    virtual_router_id 53
    priority {{ keepalived_priority }}
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass MySecret
    }
    virtual_ipaddress {
        10.0.0.103
    }
    track_script {
        chk_snmpexporter
    }
}

vrrp_instance mqtt {
    state {{ keepalived_state }}
    interface eth0
    virtual_router_id 54
    priority {{ keepalived_priority }}
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass MySecret
    }
    virtual_ipaddress {
        10.0.0.104
    }
    track_script {
        chk_mqtt
    }
}

vrrp_instance traefik {
    state {{ keepalived_state }}
    interface eth0
    virtual_router_id 55
    priority {{ keepalived_priority }}
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass MySecret
    }
    virtual_ipaddress {
        10.0.0.105
    }
    track_script {
        chk_traefik
    }
}

ansible/roles/keepalived/templates/keepalived.conf.j2

If this script is successful then traefik is running on this node and should own the VIP

#!/bin/bash

# Check if traefik container is running
if docker ps --filter "name=traefik" --filter "status=running" | grep -q traefik_traefik; then
    exit 0
else
    exit 1
fi

ansible/roles/keepalived/files/scripts/check_traefik.sh
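
The other check scripts follow the same idea, so one parameterised script could cover them all. What follows is a sketch rather than the scripts actually in use: the file name check_container.sh and the container name pihole_pihole are assumptions, based on the stack_service naming seen in traefik_traefik.

```shell
#!/bin/bash
# Hypothetical generic check, saved as e.g. /etc/keepalived/check_container.sh.
# The name to look for (assumed to follow the stack_service pattern) is $1.

# Succeeds when a running container whose name contains $1 exists on this node.
container_running() {
    local name="$1"
    docker ps --filter "name=${name}" --filter "status=running" --format '{{.Names}}' \
        | grep -q "${name}"
}

# Only act when called with an argument, as keepalived would do.
if [ "$#" -gt 0 ]; then
    container_running "$1"
    exit $?
fi
```

In keepalived.conf the script line would then pass the name as an argument, for example script "/etc/keepalived/check_container.sh pihole_pihole".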

Set the state and priority for the nodes

keepalived_state: MASTER
keepalived_priority: 100

ansible/host_vars/sn1.yml

keepalived_state: BACKUP
keepalived_priority: 90

ansible/host_vars/sn2.yml
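
To see which node currently owns a VIP, you can check whether the address is configured on eth0. A minimal sketch, using the addresses from the template above. The function name holds_vip is made up.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when address $1 is configured on
# interface $2 (default eth0) of this node.
holds_vip() {
    local vip="$1" iface="${2:-eth0}"
    # The trailing slash stops 10.0.0.10 from matching 10.0.0.105.
    ip -4 -o addr show dev "${iface}" | grep -qF "inet ${vip}/"
}

# Example: list the VIPs held by this node.
# for vip in 10.0.0.101 10.0.0.103 10.0.0.104 10.0.0.105; do
#     holds_vip "${vip}" && echo "${vip} is on $(hostname)"
# done
```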

Sudo

Add my user to sudo and set a timeout of 1 hour

- name: Raspberry Pi Sudo Playbook
  hosts: pibuild
  become: true

  tasks:
    - name: Ensure 'nick' user exists and is in the sudo group
      user:
        name: nick
        groups: sudo
        append: yes

    - name: Configure sudo access for 'nick' with password required + 1hr timeout
      copy:
        dest: /etc/sudoers.d/nick
        content: |
          nick ALL=(ALL) ALL
          Defaults:nick timestamp_timeout=60
        owner: root
        group: root
        mode: '0440'
        validate: 'visudo -cf %s'

    - name: Remove any existing NOPASSWD rule for 'nick'
      file:
        path: /etc/sudoers.d/nick_nopasswd
        state: absent

ansible/playbooks/sudo.yml

After running this playbook, sudo on the hosts requires a password, so Ansible needs the become password on every subsequent run. Pass --ask-become-pass to be prompted for it:

ansible-playbook /mnt/nas/ansible/playbooks/playbook_boot.yml --ask-become-pass

