Infrastructure as Code: From Manual Provisioning to Ansible + Terraform
After three years of provisioning VMs by hand over SSH and editing docker-compose files directly on the hosts, I finally committed to Infrastructure as Code. April 1, 2024: first commit to the infrastructure repository.
The Problem
Since my first homelab post in November 2020, I’d accumulated a collection of snowflake servers—each one unique, manually configured, and completely undocumented.
The cost:
- ❌ 4-hour recovery times for failed VMs
- ❌ Deployment anxiety (one wrong click = broken service)
- ❌ Zero reproducibility
- ❌ Tribal knowledge locked in my head
The wake-up call: A hard drive failure on my GitLab VM. I had no backup of the VM configuration. Was it 8GB or 16GB of RAM? What VLAN was it on? Where were the mount points?
The Solution
Every component of my infrastructure is now defined in code. No more clicking through UIs. No more manual SSH sessions.
Two-Layer Architecture
Terraform: Manages external/immutable infrastructure.
- Cloudflare DNS records and Tunnel
Ansible: Manages mutable state and VM lifecycle.
- Proxmox VMs (provisioning from cloud-init templates)
- Docker containers
- File system configurations (NFS mounts)
- Service orchestration
Why both? Terraform manages resources that live outside the homelab, such as Cloudflare. Ansible handles everything on Proxmox: the VMs themselves and whatever runs inside them.
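As an illustration of the "mutable state" side, an NFS mount managed by Ansible might look like the sketch below. The server name, export, and mount point are hypothetical placeholders, not values from my setup, and the task assumes the ansible.posix collection is installed.

```yaml
# Sketch only: src and path are hypothetical placeholders.
- name: Mount NFS share for container data
  ansible.posix.mount:
    src: "nas.example.internal:/export/containers"  # hypothetical NFS export
    path: /mnt/containers                           # hypothetical mount point
    fstype: nfs
    opts: defaults
    state: mounted  # mounts now and records the entry in /etc/fstab
```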
Implementation
VM Provisioning with Ansible
Before, I clicked through Proxmox UI screens and hoped I wrote down what I did.
Now, VM specs are defined in YAML:
# vars/vm_specs.yml
vms:
  - name: ruby
    vmId: 920
    target_node: pve5
    cores: 8
    memory: 16384
    disk:
      scsi0: 'nvme-thin:200'
    net:
      net0: 'virtio,bridge=vmbr0,tag=50'
    ipconfig:
      ipconfig0: 'ip=192.168.50.20/24,gw=192.168.50.1'
    tags: ['ansible', 'gitlab-runner']
    clone: 'ubuntu-24.04-server-cloudinit-template'
# roles/hypervisor/tasks/provision_vm.yml
- name: Provision VM from cloud-init template
  community.general.proxmox_kvm:
    api_host: "{{ ansible_host }}"
    node: "{{ inventory_hostname }}"
    name: "{{ vm_name }}"
    vmid: "{{ vm_id }}"
    clone: "{{ vm_clone }}"
    cores: "{{ vm_cores }}"
    memory: "{{ vm_memory }}"
    net: "{{ vm_net }}"
    state: "{{ vm_state }}"
Benefits:
- ✅ Reproducible (spin up identical VMs from templates)
- ✅ Version controlled (VM specs in Git)
- ✅ Self-documenting (vm_specs.yml IS the documentation)
- ✅ Idempotent (run multiple times safely)
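The spec file uses keys like `name` and `vmId`, while the provisioning task expects `vm_name`, `vm_id`, and so on. One way to bridge the two is a loop that includes the task file once per entry. This is a minimal sketch, not necessarily how the repo does it: the file name `main.yml`, the loop variable `vm`, the `when` filter on `target_node`, and the hard-coded `vm_state` are all assumptions.

```yaml
# roles/hypervisor/tasks/main.yml (assumed file name)
- name: Provision each VM listed in vars/vm_specs.yml
  ansible.builtin.include_tasks: provision_vm.yml
  loop: "{{ vms }}"
  loop_control:
    loop_var: vm              # assumed name, avoids clashing with the default "item"
    label: "{{ vm.name }}"    # keeps the log output short
  # Assumption: the play targets hypervisor nodes, so each node only
  # provisions the VMs assigned to it.
  when: vm.target_node == inventory_hostname
  vars:
    vm_name: "{{ vm.name }}"
    vm_id: "{{ vm.vmId }}"
    vm_clone: "{{ vm.clone }}"
    vm_cores: "{{ vm.cores }}"
    vm_memory: "{{ vm.memory }}"
    vm_net: "{{ vm.net }}"
    vm_state: present         # assumed default
```

With this shape, adding a VM means appending one entry to `vms` and rerunning the playbook.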
Application Deployment
Before: SSH in, manually install packages, hope nothing breaks.
After:
# roles/platform_one/tasks/gitlab.yml
- name: Ensure GitLab directory structure
  file:
    path: "{{ container_data }}/ruby/gitlab"
    state: directory
- name: Template GitLab docker-compose
  template:
    src: gitlab-docker-compose.yml.j2
    dest: "{{ container_data }}/ruby/gitlab/docker-compose.yml"
  notify: restart gitlab
- name: Deploy GitLab container
  community.docker.docker_compose_v2:
    project_src: "{{ container_data }}/ruby/gitlab"
    state: present
Run once: ansible-playbook main_playbook.yml --tags gitlab --limit ruby
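The template task notifies a handler named `restart gitlab`, which is not shown above. A minimal sketch of what such a handler could look like follows. The file location is the conventional one for a role and is an assumption, as is the choice to restart through the same compose module.

```yaml
# roles/platform_one/handlers/main.yml (assumed location)
- name: restart gitlab   # must match the name used in "notify"
  community.docker.docker_compose_v2:
    project_src: "{{ container_data }}/ruby/gitlab"
    state: restarted     # restarts the project's services
```

Because handlers only fire when the notifying task reports a change, GitLab would restart only when the rendered docker-compose.yml actually differs from what is on disk.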
Terraform → Ansible Integration
Terraform outputs (for example, the MinIO endpoint and Vault path below) flow into Ansible via a generated vars file:
# terraform/outputs.tf
resource "local_file" "ansible_vars" {
  filename = "${path.root}/../../vars/tf_ansible_vars_file.yml"
  content = yamlencode({
    minio_endpoint = minio_s3_bucket.backups.endpoint
    vault_addr     = vault_auth_backend.oidc.path
  })
}
Ansible consumes this automatically:
# playbook.yml
- hosts: all
  vars_files:
    - vars/tf_ansible_vars_file.yml
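Since the vars file only exists after a Terraform run, it can help to fail early with a clear message when a value is missing. The guard below is a sketch of that idea and an addition of mine, not something taken from the repo. The variable names come from the `yamlencode` call above.

```yaml
# Sketch: place near the top of the play, before anything uses the values.
- name: Check that Terraform-generated variables are present
  ansible.builtin.assert:
    that:
      - minio_endpoint is defined
      - vault_addr is defined
    fail_msg: "Missing Terraform outputs. Run terraform apply to regenerate vars/tf_ansible_vars_file.yml."
```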
The Results
Full environment rebuild:
$ terraform -chdir=terraform/1-infrastructure apply
$ ansible-playbook main_playbook.yml
That’s it. Every VM, every service, every configuration restored from code.
This was the foundation. Ansible came first: when the infrastructure repo was born on April 1, 2024, it handled everything, including VM provisioning, Docker deployments, and configuration management. Terraform joined later, in December 2024. What followed was rapid iteration.