Rebuilding my homelab (Part 3): configuration with Ansible
Once the virtual machines are provisioned, I use Ansible to configure them. I moved from Puppet to Ansible because it's lightweight, agentless, and widely adopted, which makes it easier to automate and maintain for a smaller homelab setup.
All my Ansible playbooks, roles, and inventory are stored in a monorepo, structured like this:
```
.
├── ansible.cfg
├── inventory/
│   ├── group_vars/     # Group-wide variables (e.g., all monitoring nodes)
│   ├── host_vars/      # Per-host overrides
│   ├── hosts/          # Inventory files (static or dynamic)
│   └── recovery_vars/  # Used for restoring nodes
├── playbooks/
│   ├── general/
│   ├── recovery/
│   └── site.yaml       # Main entry point
├── roles/
│   ├── common/
│   ├── gitea/
│   ├── prometheus/
│   ├── grafana/
│   ├── vault/
│   └── ...
└── renovate.json       # Keeps versioning fresh
```
Each role is self-contained and reusable. The main playbook (site.yaml) orchestrates which roles run on which nodes by group.
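The shape of site.yaml is roughly a list of plays, one per group, each applying its role with a matching tag. A simplified sketch (the exact group and tag names here are illustrative):

```yaml
# playbooks/site.yaml: simplified sketch, group/role names are illustrative
- hosts: all
  become: true
  roles:
    - { role: common, tags: ['common'] }

- hosts: prometheus
  become: true
  roles:
    - { role: prometheus, tags: ['prometheus'] }

- hosts: grafana
  become: true
  roles:
    - { role: grafana, tags: ['grafana'] }
```

Because every play is tagged, a single role can be deployed in isolation with `--tags`, which is exactly what the CI workflows below rely on.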
GitOps Workflow with Gitea Actions
To make Ansible changes automatic and repeatable, I’ve set up Gitea Actions here too. Each role has its own workflow file that gets triggered when either:
- The role code is updated
- Its group variables are changed
This ensures that only the relevant role gets deployed, avoiding unnecessary reconfigurations across the fleet. Example: if I change something in the Prometheus role or its vars, only Prometheus hosts are reconfigured.
Here’s a simplified workflow example:
```yaml
on:
  push:
    paths:
      - 'roles/prometheus/**'
      - 'inventory/group_vars/prometheus.yaml'

jobs:
  Deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python and Ansible
        run: |
          pip3 install ansible yamllint

      - name: Lint
        run: yamllint .

      - name: Set up SSH
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" > ~/.ssh/id_rsa
          chmod 600 ~/.ssh/id_rsa

      - name: Run playbook
        run: ansible-playbook playbooks/site.yaml --tags 'prometheus' -b -l 'prometheus'
```
Manual Role Triggers for Host-Specific Vars
One current limitation: changes to host-specific variables (in host_vars/) don't trigger any workflow. For now, I run the affected playbooks manually when needed, which is rare since most configuration is managed at the group level.
Eventually, I might improve this by creating a matrix-based workflow or adding logic to detect host var changes and trigger the correct role, but it’s not critical at the moment.
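One way that detection could work (a hypothetical sketch, not something running today) is a workflow that diffs the pushed commit, derives the affected hostnames from the changed host_vars paths, and limits the play to just those hosts:

```yaml
# Hypothetical workflow sketch: derive hosts from changed host_vars files
on:
  push:
    paths:
      - 'inventory/host_vars/**'

jobs:
  deploy-changed-hosts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2   # need the previous commit to diff against

      - name: Derive changed hosts from host_vars paths
        id: hosts
        run: |
          hosts=$(git diff --name-only HEAD~1 -- inventory/host_vars/ \
            | xargs -n1 basename | sed 's/\.ya\?ml$//' | sort -u | paste -sd, -)
          echo "list=$hosts" >> "$GITHUB_OUTPUT"

      - name: Run playbook against changed hosts only
        run: ansible-playbook playbooks/site.yaml -b -l "${{ steps.hosts.outputs.list }}"
```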
Recovery Logic
Despite all the automation in place, it’s handy to have some playbooks to recover certain nodes that have backups — like gitea, vault, and jenkins. These services all have backup scripts that run every night and get shipped to my NAS. To handle this, I’ve created dedicated Ansible recovery playbooks that can bring up these key services independently when needed.
These playbooks live under:
```
playbooks/
├── general/
├── recovery/           # Recovery-specific flows
│   ├── gitea.yaml
│   ├── jenkins.yaml
│   └── vault.yaml
```
Each recovery playbook focuses on reapplying just the essential role(s) needed to restore a given service from scratch. These are designed to run even when the GitOps pipeline isn’t available — for example, if gitea itself is down and CI is blocked.
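A recovery playbook might look roughly like this (a simplified sketch; the NAS hostname, backup paths, and data directory are illustrative and would differ per service):

```yaml
# playbooks/recovery/gitea.yaml: simplified sketch, paths/hostnames are illustrative
- hosts: gitea
  become: true
  vars_files:
    - ../../inventory/recovery_vars/gitea.yaml
  roles:
    - gitea                                   # reinstall and configure the service itself
  tasks:
    - name: Pull the latest backup archive from the NAS
      ansible.builtin.command:
        cmd: rsync -a backup-nas:/backups/gitea/latest.tar.gz /tmp/gitea-restore.tar.gz

    - name: Unpack the backup over the data directory
      ansible.builtin.unarchive:
        src: /tmp/gitea-restore.tar.gz
        dest: /var/lib/gitea
        remote_src: true

    - name: Restart the service with the restored data
      ansible.builtin.service:
        name: gitea
        state: restarted
```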
These playbooks are:
- Standalone: They don’t rely on the CI pipeline
- Minimal: Only what’s needed to bring the service back online
- Idempotent: Can be safely rerun if something fails mid-recovery
They’re especially useful during:
- Bare-metal reinstallations
- Hardware migrations
- Disaster recovery testing