<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://matheja.me/feed.xml" rel="self" type="application/atom+xml" /><link href="https://matheja.me/" rel="alternate" type="text/html" /><updated>2026-04-07T09:10:07+02:00</updated><id>https://matheja.me/feed.xml</id><title type="html">Ben Matheja</title><subtitle>Senior Product Manager @Porsche 
| Cloud-Native Platforms | SAFe ART | GenAI Integration | Operational Excellence | posts are my own opinion
</subtitle><author><name>Ben Matheja</name><email>ben@matheja.me</email></author><entry><title type="html">Breaking Data Gatekeeping: Extracting Wächtersbach Müllkalender to ICS</title><link href="https://matheja.me/2025/12/29/waechtersbach-muellkalender-ics-extraction.html" rel="alternate" type="text/html" title="Breaking Data Gatekeeping: Extracting Wächtersbach Müllkalender to ICS" /><published>2025-12-29T10:00:00+01:00</published><updated>2025-12-29T10:00:00+01:00</updated><id>https://matheja.me/2025/12/29/waechtersbach-muellkalender-ics-extraction</id><content type="html" xml:base="https://matheja.me/2025/12/29/waechtersbach-muellkalender-ics-extraction.html"><![CDATA[<p>Our waste collection schedules are locked behind PDFs and a proprietary app that doesn’t allow exporting. I used GenAI to transform the visual calendar into ICS files ready for import into Outlook, Google Calendar, and the like.
<!--more--></p>

<h2 id="the-problem">The Problem</h2>

<p>Wächtersbach’s waste calendar: PDF-only distribution. No structured data. No API. Manual transcription required.</p>

<p>Our family organizes everything with Google Calendar. I don’t want another single-purpose app just to check when the next garbage collection is due.</p>

<p>The <a href="https://www.stadt-waechtersbach.de/wohnen/wohnen-bauen/muell-und-abfallentsorgung/">official page</a> offers static PDFs—hostile to calendar automation.</p>

<h2 id="failed-approaches">Failed Approaches</h2>

<p><strong>Structured PDF parsing:</strong> Traditional tools failed. The PDF isn’t tabular data; it’s a visual calendar layout built from images. That is close to the worst case: it defeats what conventional OCR pipelines are built for.</p>

<h2 id="the-solution-llm-powered-extraction">The Solution: LLM-Powered Extraction</h2>

<p>I used Gemini (fast tier) with a targeted prompt and attached a screenshot of the PDF:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"You are an expert OCR.
Your task is to extract Garbage Tours out of a PDF containing visual elements. 
The expectation is to provide a structured file containing the tours and the dates. 
Do not add complexity to the file like recurring dates, 
just print the ones which are in the visual elements.

Respond only with this file"
</code></pre></div></div>

<p><strong>Workflow:</strong></p>
<ol>
  <li>Gemini extracts dates → <a href="/assets/garbage/muellkalender-transformed-v2.json">structured JSON</a></li>
  <li>Python script generates ICS files per waste type (sketched below)</li>
  <li>Validation script compares output versions across runs for accuracy</li>
</ol>
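
<p>A minimal sketch of step 2, assuming the extracted JSON carries <code class="language-plaintext highlighter-rouge">type</code>, <code class="language-plaintext highlighter-rouge">tour</code>, and <code class="language-plaintext highlighter-rouge">dates</code> fields (the exact schema may differ):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json

# Assumed shape: [{"type": "Bio", "tour": "12", "dates": ["2025-01-07", ...]}, ...]
with open("muellkalender-transformed-v2.json") as f:
    entries = json.load(f)

for entry in entries:
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//muellkalender//DE"]
    for date in entry["dates"]:
        stamp = date.replace("-", "")  # ICS all-day events use YYYYMMDD
        lines += [
            "BEGIN:VEVENT",
            f"UID:{entry['type']}-{entry['tour']}-{stamp}",
            f"DTSTART;VALUE=DATE:{stamp}",
            f"SUMMARY:{entry['type']} Tour {entry['tour']}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    with open(f"{entry['type']}_Tour_{entry['tour']}.ics", "w", newline="") as out:
        out.write("\r\n".join(lines) + "\r\n")
</code></pre></div></div>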

<hr />

<h2 id="download-calendars">Download Calendars</h2>

<p><strong>Not sure which tour you need?</strong> Check the <a href="https://www.stadt-waechtersbach.de/wohnen/wohnen-bauen/muell-und-abfallentsorgung/downloads/abfallkalender-2025-neu.pdf?cid=j4x">official tour map</a> to find your street’s tour number.</p>

<p><strong>Bio Waste:</strong></p>
<ul>
  <li><a href="/assets/garbage/calenders_v2/Bio_Tour_12.ics">Tour 12</a></li>
  <li><a href="/assets/garbage/calenders_v2/Bio_Tours_34.ics">Tours 3+4</a></li>
</ul>

<p><strong>Paper (PPK):</strong></p>
<ul>
  <li><a href="/assets/garbage/calenders_v2/PPK_Tour_5.ics">Tour 5</a></li>
  <li><a href="/assets/garbage/calenders_v2/PPK_Tours_12.ics">Tours 1+2</a></li>
  <li><a href="/assets/garbage/calenders_v2/PPK_Tours_34.ics">Tours 3+4</a></li>
</ul>

<p><strong>Recycling (DSD):</strong></p>
<ul>
  <li><a href="/assets/garbage/calenders_v2/DSD_Tours_12.ics">Tours 1+2</a></li>
</ul>

<p><strong>Residual Waste (RM):</strong></p>
<ul>
  <li><a href="/assets/garbage/calenders_v2/RM_Tours_12.ics">Tours 1+2</a></li>
  <li><a href="/assets/garbage/calenders_v2/RM_Tours_34.ics">Tours 3+4</a></li>
</ul>]]></content><author><name>Ben Matheja</name></author><category term="python" /><category term="automation" /><category term="data" /><category term="productivity" /><summary type="html"><![CDATA[Our waste collection schedules are locked behind PDFs and a proprietary app that doesn’t allow exporting. I used GenAI to transform the visual calendar into ICS files ready for import into Outlook, Google Calendar, and the like.]]></summary></entry><entry><title type="html">Homelab: The Next Iteration</title><link href="https://matheja.me/2025/12/19/homelab-next-iteration.html" rel="alternate" type="text/html" title="Homelab: The Next Iteration" /><published>2025-12-19T12:00:00+01:00</published><updated>2025-12-19T12:00:00+01:00</updated><id>https://matheja.me/2025/12/19/homelab-next-iteration</id><content type="html" xml:base="https://matheja.me/2025/12/19/homelab-next-iteration.html"><![CDATA[<p>The architecture started with a Fujitsu PRIMERGY TX120 S3 and matured when the Fujitsu PRIMERGY RX2530 M2 (1U, dual CPU) became the primary hypervisor. After that, everything accelerated: Talos nodes, GPU passthrough, a new hypervisor class, and a deployment model that finally scales.</p>

<!--more-->

<h3 id="phase-1-core-infrastructure-and-virtualization">Phase 1: Core Infrastructure and Virtualization</h3>

<p>With the Fujitsu server anchoring the rack, the fleet expanded with two <strong>HP EliteDesk 600 G3</strong> units. <strong>Talos Linux</strong> was installed on them to establish a programmable infrastructure fabric. One was later recommissioned to host a Windows 11 Pro VM, a project that forced a deep dive into <strong>GPU passthrough</strong> on Proxmox. Mastering this was a critical success: GPU acceleration became available inside the virtualized perimeter.</p>

<p>This is also where the baseline architecture locked in: DNS, TLS, and identity.</p>

<ul>
  <li><strong>DNS:</strong> <strong>ExternalDNS</strong> automates record management, but the zone is only resolvable from inside. Names look public. Reach is not.</li>
  <li><strong>Edge + TLS:</strong> <strong>Traefik</strong> is configured for <strong>Let’s Encrypt</strong> using the Cloudflare DNS-01 challenge (API token-based). If it has a hostname, it has HTTPS.</li>
  <li><strong>SSO:</strong> <strong>Authentik</strong> sits in front of everything that matters. Once it’s set up, it turns access control from a scattered set of configs into a single system.</li>
  <li><strong>Selective Exposure:</strong> A <strong>Cloudflare Tunnel Operator</strong> exposes only what should be public. This blog is served from the homelab through this tunnel.</li>
</ul>

<h3 id="phase-2-the-industrialization-of-compute">Phase 2: The Industrialization of Compute</h3>

<p>The EliteDesk experiment revealed the operational limitations of small-form-factor systems under sustained load. The next logical step was to architect a resilient, standardized solution. The design brief was clear: maximize the price-to-performance-to-wattage ratio.</p>

<p>The result is a pair of custom-built 4U servers, the new backbone of the lab.</p>

<ul>
  <li><strong>Chassis:</strong> Inter-Tech 4408 4U. A raw, functional enclosure.</li>
  <li><strong>Processing:</strong> 11th Gen Intel Core i5 and i7 CPUs.</li>
  <li><strong>Motherboard:</strong> Standard ATX boards for maximum compatibility and cost-effectiveness.</li>
  <li><strong>Storage:</strong> Dedicated RAID controllers for data redundancy and performance.</li>
</ul>

<p>These machines are the primary hypervisors, both running <strong>Proxmox VE</strong>. They represent a fundamental shift from ad-hoc expansion to a replicable server architecture.</p>

<h3 id="phase-3-a-new-deployment-doctrine">Phase 3: A New Deployment Doctrine</h3>

<p>The deployment model has undergone a significant strategic shift, moving away from inefficient and rigid structures.</p>

<ul>
  <li><strong>Initial State:</strong> 1 VM per container. Simple, but resource-intensive and unscalable.</li>
  <li><strong>Interim State:</strong> 1 VM for multiple containers, grouped by role. This improved density but introduced scalability bottlenecks and single points of failure.</li>
  <li><strong>Current Doctrine:</strong> A cluster of three purpose-built VMs, deployed and configured via a single, unified <strong>Ansible role</strong>. This provides a balance of isolation, resource management, and automated scalability.</li>
</ul>

<h3 id="phase-4-the-automation-singularity">Phase 4: The Automation Singularity</h3>

<p>The inflection point was the total adoption of Infrastructure as Code (IaC). Manual configuration was deprecated.</p>

<p>The lab now has a single planning surface: a GitLab-hosted repository containing the Ansible and Terraform code. It has crossed the 3000-commit mark. That number matters because it signals intent: iteration, review, rollback, history. The homelab stopped being “machines I own” and became “a system I can recreate.”</p>

<p>The workflow is codified and automated.</p>

<ul>
  <li><strong>Proxmox VM Provisioning:</strong> VMs are no longer created by hand; they are stamped out from standardized templates (see the sketch after this list). Consistency and speed are enforced through automation.</li>
  <li><strong>Talos Deployment via Terraform:</strong> The entire Talos Linux environment is defined in <strong>Terraform</strong> and managed via <strong>GitLab CI/CD</strong>. A <code class="language-plaintext highlighter-rouge">git push</code> to the repository triggers a pipeline that configures the entire cluster. This is GitOps executed.</li>
  <li><strong>Automated Dependency Management:</strong> <strong>RenovateBot</strong> is integrated into the workflow, automatically scanning for updates, creating merge requests, and keeping the entire software stack current.</li>
  <li><strong>Secrets Handling:</strong> <strong>HashiCorp Vault</strong> is the secrets backbone. Credentials are no longer scattered across shell histories and ad-hoc files; they are stored, accessed, and rotated intentionally.</li>
</ul>
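
<p>Everything funnels into the same API call in the end. A sketch of the clone operation via <code class="language-plaintext highlighter-rouge">proxmoxer</code>, where the node name, VM IDs, and token are placeholders; the real flow runs through Terraform and Ansible rather than ad-hoc scripts:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from proxmoxer import ProxmoxAPI  # pip install proxmoxer

# Placeholder credentials; the real values live in Vault.
prox = ProxmoxAPI(
    "pve.example.internal",
    user="automation@pam",
    token_name="provisioner",
    token_value="...",
    verify_ssl=True,
)

# Full-clone VM 9000 (the standardized template) into a new worker VM.
prox.nodes("pve1").qemu(9000).clone.post(
    newid=120,
    name="talos-worker-01",
    full=1,
)
</code></pre></div></div>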

<p>This trifecta of templating, Terraform, and automated updates was the catalyst. The lab transformed from a manually curated system into a self-provisioning, self-updating organism. The velocity of experimentation and deployment increased by an order of magnitude.</p>

<h3 id="the-kubernetes-question-storage-as-the-final-frontier">The Kubernetes Question: Storage as the Final Frontier</h3>

<p>While the infrastructure is robust, mass adoption of <strong>Kubernetes</strong> is pending. The critical dependency is a resilient, distributed storage solution. Without it, stateful workloads in Kubernetes are a liability. The current focus is on architecting a storage backend that meets the required performance and reliability specifications.</p>

<p>However, a lean Kubernetes control plane is operational, serving critical ingress and service discovery functions.</p>

<ul>
  <li><strong>CNI:</strong> <strong>Cilium</strong> provides networking, observability, and security.</li>
</ul>

<p>Ingress, DNS automation, TLS issuance, and SSO are part of the baseline architecture (see Phase 1). Kubernetes plugs into that foundation; it doesn’t reinvent it.</p>

<h3 id="up-next-media-models-and-acceleration">Up Next: Media, Models, and Acceleration</h3>

<p>The next steps are pragmatic. They target consolidation, capability, and GPU acceleration.</p>

<ul>
  <li><strong>Photos:</strong> Consolidate the photo estate into <strong>Immich</strong>. One system of record.</li>
  <li><strong>GPU Node:</strong> Build a dedicated GPU-enabled node. The exact model is secondary; 16GB VRAM is not. Candidates are an RTX 4060 Ti (16GB) or a future 50-series equivalent with the same memory tier.</li>
  <li><strong>Local Inference:</strong> Use that node to host smaller local LLMs for the workloads that benefit from proximity, privacy, and low latency.</li>
  <li><strong>Accelerated Workflows:</strong> Run audio tooling (“audiomuse”) and use the GPU for Immich’s accelerated tagging.</li>
</ul>

<p>The homelab is no longer a collection of hardware. It is an engineered system. The next iteration will solve the storage challenge and unlock the full potential of the Kubernetes ecosystem. The blueprint is drawn. Execution is in progress.</p>]]></content><author><name>Ben Matheja</name></author><category term="development" /><category term="homelab" /><category term="proxmox" /><category term="ansible" /><category term="gitlab" /><category term="terraform" /><category term="renovate" /><category term="vault" /><category term="talos" /><category term="kubernetes" /><category term="cilium" /><category term="cloudflare" /><category term="traefik" /><category term="externaldns" /><category term="letsencrypt" /><category term="authentik" /><summary type="html"><![CDATA[In January I published a short rundown of the major changes since 2020. This post is the follow-up. The architecture started with a Fujitsu PRIMERGY TX120 S3 and matured when the Fujitsu PRIMERGY RX2530 M2 (1U, dual CPU) became the primary hypervisor. After that, everything accelerated&#58; Talos nodes, GPU passthrough, a new hypervisor class, and a deployment model that finally scales.]]></summary></entry><entry><title type="html">Replacing Root Tokens with SSO: HashiCorp Vault + Authentik OIDC</title><link href="https://matheja.me/2025/11/20/vault-oidc-authentik-sso.html" rel="alternate" type="text/html" title="Replacing Root Tokens with SSO: HashiCorp Vault + Authentik OIDC" /><published>2025-11-20T11:20:00+01:00</published><updated>2025-11-20T11:20:00+01:00</updated><id>https://matheja.me/2025/11/20/vault-oidc-authentik-sso</id><content type="html" xml:base="https://matheja.me/2025/11/20/vault-oidc-authentik-sso.html"><![CDATA[<p>If you’re using Vault’s root token for daily operations, you’re doing it wrong. I was too.
Then I accidentally committed my <code class="language-plaintext highlighter-rouge">.vault-token</code> file to GitLab.
<!--more--></p>

<h2 id="the-problem">The Problem</h2>

<p>My original Vault workflow:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">VAULT_TOKEN</span><span class="o">=</span><span class="s2">"hvs.CAESIF..."</span>
vault kv get secret/my-app
</code></pre></div></div>

<p>Why this is terrible:</p>
<ul>
  <li>❌ Root token never expires (unlimited attack window)</li>
  <li>❌ No audit trail (all operations appear as “root”)</li>
  <li>❌ Single point of failure (anyone with this token has god-mode)</li>
  <li>❌ No MFA (compromised laptop = compromised infrastructure)</li>
  <li>❌ Can’t revoke selectively (revoking root token locks everyone out)</li>
</ul>

<p>The wake-up call: I accidentally committed <code class="language-plaintext highlighter-rouge">.vault-token</code> to GitLab. Panic-revoking the root token meant:</p>
<ol>
  <li>Regenerating via Terraform</li>
  <li>Updating every automation script</li>
  <li>Breaking every service that depended on Vault (all of them)</li>
</ol>

<h2 id="the-solution">The Solution</h2>

<p>Replace root tokens with federated authentication. Users log in once through Authentik SSO, and Vault grants permissions based on group membership.</p>

<h3 id="architecture">Architecture</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>User/Script
    │
    ▼
HashiCorp Vault (OIDC Client)
    │
    │ Redirect
    ▼
Authentik (OIDC Provider)
    │
    │ Groups: vault-admins, vault-readonly
    ▼
User Identity
</code></pre></div></div>

<p>Why Authentik? I already had it running for GitLab, Grafana, and other services. Adding Vault to the same identity provider gives:</p>
<ul>
  <li>✅ Single source of truth for identities</li>
  <li>✅ Centralized MFA enforcement (TOTP, WebAuthn)</li>
  <li>✅ Group-based RBAC (manage access in one place)</li>
  <li>✅ Audit trail (who accessed what, when, from where)</li>
  <li>✅ Token expiration (force re-authentication every 8 hours)</li>
</ul>

<h2 id="implementation-with-terraform">Implementation with Terraform</h2>

<h3 id="1-enable-oidc-auth-method">1. Enable OIDC Auth Method</h3>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># terraform/modules/vault/auth-methods.tf</span>
<span class="nx">resource</span> <span class="s2">"vault_jwt_auth_backend"</span> <span class="s2">"authentik"</span> <span class="p">{</span>
  <span class="nx">path</span>               <span class="o">=</span> <span class="s2">"oidc"</span>
  <span class="nx">type</span>               <span class="o">=</span> <span class="s2">"oidc"</span>
  <span class="nx">oidc_discovery_url</span> <span class="o">=</span> <span class="s2">"https://auth.mittbachweg.de/application/o/vault/"</span>
  <span class="nx">oidc_client_id</span>     <span class="o">=</span> <span class="nx">var</span><span class="p">.</span><span class="nx">authentik_client_id</span>
  <span class="nx">oidc_client_secret</span> <span class="o">=</span> <span class="nx">var</span><span class="p">.</span><span class="nx">authentik_client_secret</span>
  <span class="nx">default_role</span>       <span class="o">=</span> <span class="s2">"default"</span>
  
  <span class="nx">tune</span> <span class="p">{</span>
    <span class="nx">default_lease_ttl</span> <span class="o">=</span> <span class="s2">"8h"</span>
    <span class="nx">max_lease_ttl</span>     <span class="o">=</span> <span class="s2">"24h"</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Key parameters:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">oidc_discovery_url</code>: Authentik’s OIDC discovery endpoint</li>
  <li><code class="language-plaintext highlighter-rouge">default_lease_ttl</code>: Tokens expire after 8 hours (force re-authentication)</li>
</ul>

<h3 id="2-configure-oidc-roles">2. Configure OIDC Roles</h3>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">resource</span> <span class="s2">"vault_jwt_auth_backend_role"</span> <span class="s2">"default"</span> <span class="p">{</span>
  <span class="nx">backend</span>         <span class="o">=</span> <span class="nx">vault_jwt_auth_backend</span><span class="p">.</span><span class="nx">authentik</span><span class="p">.</span><span class="nx">path</span>
  <span class="nx">role_name</span>       <span class="o">=</span> <span class="s2">"default"</span>
  <span class="nx">bound_audiences</span> <span class="o">=</span> <span class="p">[</span><span class="nx">var</span><span class="p">.</span><span class="nx">authentik_client_id</span><span class="p">]</span>
  <span class="nx">user_claim</span>      <span class="o">=</span> <span class="s2">"sub"</span>
  <span class="nx">role_type</span>       <span class="o">=</span> <span class="s2">"oidc"</span>
  
  <span class="nx">token_policies</span> <span class="o">=</span> <span class="p">[</span>
    <span class="s2">"default"</span><span class="p">,</span>
    <span class="s2">"read-own-token"</span>
  <span class="p">]</span>
  
  <span class="nx">groups_claim</span> <span class="o">=</span> <span class="s2">"groups"</span>
  <span class="nx">allowed_redirect_uris</span> <span class="o">=</span> <span class="p">[</span>
    <span class="s2">"http://localhost:8250/oidc/callback"</span><span class="p">,</span>
    <span class="s2">"https://vault.one.mittbachweg.de/ui/vault/auth/oidc/oidc/callback"</span>
  <span class="p">]</span>
  
  <span class="nx">claim_mappings</span> <span class="o">=</span> <span class="p">{</span>
    <span class="nx">preferred_username</span> <span class="o">=</span> <span class="s2">"username"</span>
    <span class="nx">email</span>              <span class="o">=</span> <span class="s2">"email"</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Role breakdown:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">bound_audiences</code>: Only accept tokens for this Vault instance (prevents token replay)</li>
  <li><code class="language-plaintext highlighter-rouge">groups_claim</code>: Extract Authentik group memberships from JWT</li>
  <li><code class="language-plaintext highlighter-rouge">allowed_redirect_uris</code>: Whitelist for OAuth2 redirect (CLI and Web UI)</li>
</ul>

<h3 id="3-group-based-policies">3. Group-Based Policies</h3>

<p>Map Authentik groups to Vault policies:</p>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Admin group gets full access</span>
<span class="nx">resource</span> <span class="s2">"vault_identity_group"</span> <span class="s2">"vault_admins"</span> <span class="p">{</span>
  <span class="nx">name</span>     <span class="o">=</span> <span class="s2">"vault-admins"</span>
  <span class="nx">type</span>     <span class="o">=</span> <span class="s2">"external"</span>
  <span class="nx">policies</span> <span class="o">=</span> <span class="p">[</span>
    <span class="s2">"admin"</span><span class="p">,</span>
    <span class="s2">"create-tokens"</span><span class="p">,</span>
    <span class="s2">"manage-auth"</span>
  <span class="p">]</span>
<span class="p">}</span>

<span class="c1"># Readonly group gets limited access</span>
<span class="nx">resource</span> <span class="s2">"vault_identity_group"</span> <span class="s2">"vault_readonly"</span> <span class="p">{</span>
  <span class="nx">name</span>     <span class="o">=</span> <span class="s2">"vault-readonly"</span>
  <span class="nx">type</span>     <span class="o">=</span> <span class="s2">"external"</span>
  <span class="nx">policies</span> <span class="o">=</span> <span class="p">[</span>
    <span class="s2">"read-secrets"</span><span class="p">,</span>
    <span class="s2">"read-own-token"</span>
  <span class="p">]</span>
<span class="p">}</span>

<span class="c1"># Bind to Authentik groups</span>
<span class="nx">resource</span> <span class="s2">"vault_identity_group_alias"</span> <span class="s2">"vault_admins"</span> <span class="p">{</span>
  <span class="nx">name</span>           <span class="o">=</span> <span class="s2">"vault-admins"</span>
  <span class="nx">mount_accessor</span> <span class="o">=</span> <span class="nx">vault_jwt_auth_backend</span><span class="p">.</span><span class="nx">authentik</span><span class="p">.</span><span class="nx">accessor</span>
  <span class="nx">canonical_id</span>   <span class="o">=</span> <span class="nx">vault_identity_group</span><span class="p">.</span><span class="nx">vault_admins</span><span class="p">.</span><span class="nx">id</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Policy example (admin.hcl):</p>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Full access to secrets</span>
<span class="nx">path</span> <span class="s2">"secret/*"</span> <span class="p">{</span>
  <span class="nx">capabilities</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"create"</span><span class="p">,</span> <span class="s2">"read"</span><span class="p">,</span> <span class="s2">"update"</span><span class="p">,</span> <span class="s2">"delete"</span><span class="p">,</span> <span class="s2">"list"</span><span class="p">]</span>
<span class="p">}</span>

<span class="c1"># Manage auth methods</span>
<span class="nx">path</span> <span class="s2">"auth/*"</span> <span class="p">{</span>
  <span class="nx">capabilities</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"create"</span><span class="p">,</span> <span class="s2">"read"</span><span class="p">,</span> <span class="s2">"update"</span><span class="p">,</span> <span class="s2">"delete"</span><span class="p">,</span> <span class="s2">"list"</span><span class="p">,</span> <span class="s2">"sudo"</span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>

<h2 id="authentik-configuration">Authentik Configuration</h2>

<p>Create an OAuth2/OpenID provider in Authentik:</p>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># OAuth2 Provider for Vault</span>
<span class="nx">resource</span> <span class="s2">"authentik_provider_oauth2"</span> <span class="s2">"vault"</span> <span class="p">{</span>
  <span class="nx">name</span>               <span class="o">=</span> <span class="s2">"Vault OIDC Provider"</span>
  <span class="nx">client_id</span>          <span class="o">=</span> <span class="s2">"vault"</span>
  <span class="nx">client_secret</span>      <span class="o">=</span> <span class="nx">random_password</span><span class="p">.</span><span class="nx">vault_oauth_secret</span><span class="p">.</span><span class="nx">result</span>
  <span class="nx">authorization_flow</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">authentik_flow</span><span class="p">.</span><span class="nx">default-provider-authorization-implicit-consent</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">invalidation_flow</span>  <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">authentik_flow</span><span class="p">.</span><span class="nx">default-invalidation-flow</span><span class="p">.</span><span class="nx">id</span>

  <span class="c1"># Redirect URIs for Vault UI, API, and CLI</span>
  <span class="nx">allowed_redirect_uris</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span>
      <span class="nx">matching_mode</span> <span class="o">=</span> <span class="s2">"strict"</span>
      <span class="nx">url</span>           <span class="o">=</span> <span class="s2">"https://vault.one.mittbachweg.de/ui/vault/auth/oidc/oidc/callback"</span>
    <span class="p">},</span>
    <span class="p">{</span>
      <span class="nx">matching_mode</span> <span class="o">=</span> <span class="s2">"strict"</span>
      <span class="nx">url</span>           <span class="o">=</span> <span class="s2">"https://vault.one.mittbachweg.de/oidc/callback"</span>
    <span class="p">},</span>
    <span class="p">{</span>
      <span class="nx">matching_mode</span> <span class="o">=</span> <span class="s2">"strict"</span>
      <span class="nx">url</span>           <span class="o">=</span> <span class="s2">"http://localhost:8250/oidc/callback"</span>
    <span class="p">},</span>
  <span class="p">]</span>

  <span class="c1"># Property mappings for OAuth scopes</span>
  <span class="nx">property_mappings</span> <span class="o">=</span> <span class="p">[</span>
    <span class="nx">data</span><span class="p">.</span><span class="nx">authentik_property_mapping_provider_scope</span><span class="p">.</span><span class="nx">scope-email</span><span class="p">.</span><span class="nx">id</span><span class="p">,</span>
    <span class="nx">data</span><span class="p">.</span><span class="nx">authentik_property_mapping_provider_scope</span><span class="p">.</span><span class="nx">scope-profile</span><span class="p">.</span><span class="nx">id</span><span class="p">,</span>
    <span class="nx">data</span><span class="p">.</span><span class="nx">authentik_property_mapping_provider_scope</span><span class="p">.</span><span class="nx">scope-openid</span><span class="p">.</span><span class="nx">id</span><span class="p">,</span>
  <span class="p">]</span>

  <span class="c1"># Additional configuration for proper OIDC flow</span>
  <span class="nx">signing_key</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">authentik_certificate_key_pair</span><span class="p">.</span><span class="nx">default</span><span class="p">.</span><span class="nx">id</span>
<span class="p">}</span>

<span class="c1"># Vault OIDC Application</span>
<span class="nx">resource</span> <span class="s2">"authentik_application"</span> <span class="s2">"vault"</span> <span class="p">{</span>
  <span class="nx">name</span>               <span class="o">=</span> <span class="s2">"HashiCorp Vault"</span>
  <span class="nx">slug</span>               <span class="o">=</span> <span class="s2">"vault"</span>
  <span class="nx">protocol_provider</span>  <span class="o">=</span> <span class="nx">authentik_provider_oauth2</span><span class="p">.</span><span class="nx">vault</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">meta_description</span>   <span class="o">=</span> <span class="s2">"OIDC authentication for HashiCorp Vault"</span>
  <span class="nx">meta_publisher</span>     <span class="o">=</span> <span class="s2">"Mittbachweg Infrastructure"</span>
  <span class="nx">policy_engine_mode</span> <span class="o">=</span> <span class="s2">"any"</span>
<span class="p">}</span>
</code></pre></div></div>

<h2 id="usage">Usage</h2>

<h3 id="cli-login">CLI Login</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>vault login <span class="nt">-method</span><span class="o">=</span>oidc

<span class="c"># Browser opens for Authentik login</span>
<span class="c"># After success:</span>
Success! You are now authenticated.

<span class="nv">$ </span>vault token lookup
Key                 Value
<span class="nt">---</span>                 <span class="nt">-----</span>
entity_id           8a7b9c1d-2e3f-4a5b-6c7d-8e9f0a1b2c3d
expire_time         2024-02-15T19:20:00Z
policies            <span class="o">[</span>default read-secrets]
</code></pre></div></div>

<h3 id="web-ui-login">Web UI Login</h3>

<p>Navigate to <code class="language-plaintext highlighter-rouge">https://vault.one.mittbachweg.de</code> → Select “OIDC” method → Redirects to Authentik</p>

<h3 id="ansible-integration">Ansible Integration</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># playbooks/vault_lookup.yml</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Fetch secret from Vault</span>
  <span class="na">set_fact</span><span class="pi">:</span>
    <span class="na">db_password</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">lookup('community.hashi_vault.hashi_vault_read',</span>
      <span class="s">'secret=ansible/data/database',</span>
      <span class="s">auth_method='oidc',</span>
      <span class="s">url='https://vault.one.mittbachweg.de'</span>
    <span class="s">).password</span><span class="nv"> </span><span class="s">}}"</span>
</code></pre></div></div>

<p>For automation, use service accounts with AppRole auth:</p>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">resource</span> <span class="s2">"vault_auth_backend"</span> <span class="s2">"approle"</span> <span class="p">{</span>
  <span class="nx">type</span> <span class="o">=</span> <span class="s2">"approle"</span>
<span class="p">}</span>

<span class="nx">resource</span> <span class="s2">"vault_approle_auth_backend_role"</span> <span class="s2">"ansible"</span> <span class="p">{</span>
  <span class="nx">backend</span>        <span class="o">=</span> <span class="nx">vault_auth_backend</span><span class="p">.</span><span class="nx">approle</span><span class="p">.</span><span class="nx">path</span>
  <span class="nx">role_name</span>      <span class="o">=</span> <span class="s2">"ansible"</span>
  <span class="nx">token_policies</span> <span class="o">=</span> <span class="p">[</span><span class="s2">"read-secrets"</span><span class="p">]</span>
  <span class="nx">token_ttl</span>      <span class="o">=</span> <span class="mi">3600</span>
<span class="p">}</span>
</code></pre></div></div>
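
<p>Consuming that role from automation then looks roughly like this; a minimal sketch with the <code class="language-plaintext highlighter-rouge">hvac</code> client, where the environment variable names are illustrative and injected by CI:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os

import hvac  # pip install hvac

# ROLE_ID/SECRET_ID are illustrative names; supply them as CI variables.
client = hvac.Client(url=os.environ["VAULT_ADDR"])
client.auth.approle.login(
    role_id=os.environ["ROLE_ID"],
    secret_id=os.environ["SECRET_ID"],
)

# The resulting token carries only the read-secrets policy and expires after token_ttl.
secret = client.secrets.kv.v2.read_secret_version(mount_point="ansible", path="database")
print(secret["data"]["data"]["password"])
</code></pre></div></div>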

<h2 id="lessons-learned">Lessons Learned</h2>

<p>What worked:</p>
<ul>
  <li>✅ OIDC login eliminated root token dependency</li>
  <li>✅ Group-based RBAC simplified permission management</li>
  <li>✅ MFA enforcement via Authentik blocked unauthorized access</li>
  <li>✅ Token expiration forced regular re-authentication</li>
</ul>

<p>What didn’t:</p>
<ul>
  <li>❌ Initial 8-hour token TTL was too short for long-running tasks (extended to 24h max)</li>
  <li>❌ Forgot to configure service accounts for CI/CD (added AppRole auth)</li>
  <li>❌ Debugging OIDC callback issues required checking Vault AND Authentik logs</li>
</ul>

<p>The security improvement is worth the complexity. No more root tokens in plaintext files.</p>]]></content><author><name>Ben Matheja</name></author><category term="homelab" /><category term="vault" /><category term="security" /><category term="oidc" /><category term="authentik" /><summary type="html"><![CDATA[If you’re using Vault’s root token for daily operations, you’re doing it wrong. I was too. Then I accidentally committed my .vault-token file to GitLab.]]></summary></entry><entry><title type="html">Modular Backup Architecture: From Centralized to Application-Level with Restic</title><link href="https://matheja.me/2025/08/15/modular-backup-architecture.html" rel="alternate" type="text/html" title="Modular Backup Architecture: From Centralized to Application-Level with Restic" /><published>2025-08-15T09:00:00+02:00</published><updated>2025-08-15T09:00:00+02:00</updated><id>https://matheja.me/2025/08/15/modular-backup-architecture</id><content type="html" xml:base="https://matheja.me/2025/08/15/modular-backup-architecture.html"><![CDATA[<p>After launching Platform One in July with 6 applications across 3 VMs, my centralized backup strategy immediately broke.
Stackback couldn’t discover volumes across different Docker Compose contexts.
<!--more--></p>

<h2 id="the-problem">The Problem</h2>

<p>My initial architecture:</p>
<ul>
  <li>Single Stackback (Restic wrapper) container</li>
  <li>Two shared S3 buckets</li>
  <li>Centralized backup configuration via Ansible</li>
</ul>

<p>The catch: Stackback relies on Docker labels (<code class="language-plaintext highlighter-rouge">stack-back.volumes=true</code>) to auto-discover backup targets.
When your backup container runs in a separate <code class="language-plaintext highlighter-rouge">docker-compose.yml</code> from your application stacks, it can’t see what it’s supposed to back up.</p>
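
<p>Under the hood, stack-back asks the Docker API for labeled containers within its own Compose project. A rough sketch of that discovery pattern (illustrative, not stack-back’s actual code; the container name is an assumption):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import docker  # pip install docker

client = docker.from_env()

# Compose stamps every container with its project name; discovery is scoped to it,
# so a backup container in a separate compose file finds nothing to back up.
me = client.containers.get("stackback")
project = me.labels["com.docker.compose.project"]

targets = client.containers.list(filters={
    "label": [f"com.docker.compose.project={project}", "stack-back.volumes=true"],
})
for c in targets:
    volumes = [m["Name"] for m in c.attrs["Mounts"] if m["Type"] == "volume"]
    print(c.name, volumes)
</code></pre></div></div>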

<p>Real-world impact:</p>
<ul>
  <li>❌ Inconsistent backup coverage (some volumes discovered, others missed)</li>
  <li>❌ No visibility into PostgreSQL backups</li>
  <li>❌ Single point of failure for credentials</li>
  <li>❌ Resource contention when all backups ran simultaneously</li>
</ul>

<h2 id="the-solution">The Solution</h2>

<p>Each application stack gets its own dedicated backup container.</p>

<h3 id="architecture-evolution">Architecture Evolution</h3>

<p>Before:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># centralized-stackback/docker-compose.yml</span>
<span class="na">services</span><span class="pi">:</span>
  <span class="na">stackback</span><span class="pi">:</span>
    <span class="na">volumes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">/var/run/docker.sock:/var/run/docker.sock</span>
    <span class="na">environment</span><span class="pi">:</span>
      <span class="na">RESTIC_REPOSITORY</span><span class="pi">:</span> <span class="s">s3:minio.internal/shared-bucket</span>
</code></pre></div></div>

<p>After:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># mattermost/docker-compose.yml</span>
<span class="na">services</span><span class="pi">:</span>
  <span class="na">mattermost</span><span class="pi">:</span>
    <span class="na">labels</span><span class="pi">:</span>
      <span class="na">stack-back.volumes</span><span class="pi">:</span> <span class="s2">"</span><span class="s">true"</span>
      <span class="na">stack-back.postgres</span><span class="pi">:</span> <span class="s2">"</span><span class="s">true"</span>
  
  <span class="na">stackback</span><span class="pi">:</span>
    <span class="na">image</span><span class="pi">:</span> <span class="s">ghcr.io/lawndoc/stack-back:latest</span>
    <span class="na">volumes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">/var/run/docker.sock:/var/run/docker.sock</span>
    <span class="na">environment</span><span class="pi">:</span>
      <span class="na">RESTIC_REPOSITORY</span><span class="pi">:</span> <span class="s">s3:minio.internal/mattermost-backup-bucket</span>
      <span class="na">RESTIC_PASSWORD</span><span class="pi">:</span> <span class="s">${RESTIC_PASSWORD_MATTERMOST}</span>
      <span class="na">BACKUP_CRON</span><span class="pi">:</span> <span class="s2">"</span><span class="s">0</span><span class="nv"> </span><span class="s">2</span><span class="nv"> </span><span class="s">*</span><span class="nv"> </span><span class="s">*</span><span class="nv"> </span><span class="s">*"</span>
</code></pre></div></div>

<h3 id="key-design-decisions">Key Design Decisions</h3>

<p><strong>1. Dedicated S3 Buckets Per Application</strong></p>

<p>Using Terraform’s MinIO provider:</p>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># terraform/modules/minio/main.tf</span>
<span class="nx">resource</span> <span class="s2">"minio_s3_bucket"</span> <span class="s2">"stackback_per_app"</span> <span class="p">{</span>
  <span class="nx">for_each</span> <span class="o">=</span> <span class="nx">var</span><span class="p">.</span><span class="nx">applications</span>
  <span class="nx">bucket</span>   <span class="o">=</span> <span class="s2">"restic-stackback-${each.key}-bucket"</span>
  <span class="nx">acl</span>      <span class="o">=</span> <span class="s2">"private"</span>
<span class="p">}</span>

<span class="nx">resource</span> <span class="s2">"minio_ilm_policy"</span> <span class="s2">"stackback_lifecycle"</span> <span class="p">{</span>
  <span class="nx">for_each</span> <span class="o">=</span> <span class="nx">minio_s3_bucket</span><span class="p">.</span><span class="nx">stackback_per_app</span>
  <span class="nx">bucket</span>   <span class="o">=</span> <span class="nx">each</span><span class="p">.</span><span class="nx">value</span><span class="p">.</span><span class="nx">bucket</span>
  
  <span class="nx">rule</span> <span class="p">{</span>
    <span class="nx">id</span>         <span class="o">=</span> <span class="s2">"delete-old-backups"</span>
    <span class="nx">expiration</span> <span class="p">{</span>
      <span class="nx">days</span> <span class="o">=</span> <span class="mi">30</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Why 30 days? GitLab backups alone ballooned in size with every run. Automated lifecycle policies prevent the “set-and-forget-until-disk-full” trap.</p>

<p><strong>2. IAM Credential Isolation</strong></p>

<p>Each application receives unique S3 credentials:</p>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">resource</span> <span class="s2">"minio_iam_user"</span> <span class="s2">"stackback_per_app"</span> <span class="p">{</span>
  <span class="nx">for_each</span> <span class="o">=</span> <span class="nx">var</span><span class="p">.</span><span class="nx">applications</span>
  <span class="nx">name</span>     <span class="o">=</span> <span class="s2">"restic-${each.key}-user"</span>
<span class="p">}</span>

<span class="nx">resource</span> <span class="s2">"minio_iam_policy"</span> <span class="s2">"stackback_per_app"</span> <span class="p">{</span>
  <span class="nx">policy</span> <span class="o">=</span> <span class="nx">jsonencode</span><span class="p">({</span>
    <span class="nx">Statement</span> <span class="o">=</span> <span class="p">[{</span>
      <span class="nx">Effect</span>   <span class="o">=</span> <span class="s2">"Allow"</span>
      <span class="nx">Action</span>   <span class="o">=</span> <span class="p">[</span><span class="s2">"s3:*"</span><span class="p">]</span>
      <span class="nx">Resource</span> <span class="o">=</span> <span class="p">[</span>
        <span class="s2">"arn:aws:s3:::restic-stackback-${each.key}-bucket/*"</span>
      <span class="p">]</span>
    <span class="p">}]</span>
  <span class="p">})</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Security win: A compromised application can only access its own backup bucket.</p>

<p><strong>3. Staggered Backup Schedules</strong></p>

<p>Running all backups simultaneously caused I/O storms on NFS storage. Solution: offset schedules.</p>

<table>
  <thead>
    <tr>
      <th>Application</th>
      <th>Schedule</th>
      <th>Resource Group</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Mattermost</td>
      <td>2:00 AM</td>
      <td>Green</td>
    </tr>
    <tr>
      <td>N8N</td>
      <td>2:20 AM</td>
      <td>Green</td>
    </tr>
    <tr>
      <td>Vault</td>
      <td>2:40 AM</td>
      <td>Green</td>
    </tr>
    <tr>
      <td>Linkwarden</td>
      <td>2:00 AM</td>
      <td>Blue</td>
    </tr>
    <tr>
      <td>Solidtime</td>
      <td>2:20 AM</td>
      <td>Blue</td>
    </tr>
  </tbody>
</table>
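
<p>The offsets don’t need to be hand-maintained. A small sketch of the arithmetic behind the table above (group membership hard-coded for brevity):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Stagger BACKUP_CRON values in 20-minute steps per resource group, starting 02:00.
groups = {
    "green": ["mattermost", "n8n", "vault"],
    "blue": ["linkwarden", "solidtime"],
}

schedules = {}
for apps in groups.values():
    for slot, app in enumerate(apps):
        minute = (slot * 20) % 60
        hour = 2 + (slot * 20) // 60
        schedules[app] = f"{minute} {hour} * * *"

print(schedules["vault"])  # prints: 40 2 * * *
</code></pre></div></div>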

<p><strong>4. Vault Integration for Secrets</strong></p>

<p>Backup credentials stored in HashiCorp Vault:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># ansible/roles/platform_one/templates/stackback.env.j2</span>
<span class="s">RESTIC_REPOSITORY=s3:https://{{ minio_endpoint }}/{{ backup_bucket }}</span>
<span class="s">RESTIC_PASSWORD={{ lookup('community.hashi_vault.hashi_vault_read',</span> 
  <span class="s">'secret=ansible/data/stackback_{{ app_name }}').password }}</span>
<span class="s">AWS_ACCESS_KEY_ID={{ lookup('community.hashi_vault.hashi_vault_read',</span> 
  <span class="s">'secret=ansible/data/stackback_{{ app_name }}').access_key }}</span>
</code></pre></div></div>

<h2 id="implementation-with-ansible">Implementation with Ansible</h2>

<p>Dynamic template generation per application:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># roles/platform_one/tasks/deploy_application.yml</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Generate stackback environment file</span>
  <span class="na">template</span><span class="pi">:</span>
    <span class="na">src</span><span class="pi">:</span> <span class="s">stackback.env.j2</span>
    <span class="na">dest</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">container_data</span><span class="nv"> </span><span class="s">}}/{{</span><span class="nv"> </span><span class="s">app_name</span><span class="nv"> </span><span class="s">}}/stackback.env"</span>
    <span class="na">mode</span><span class="pi">:</span> <span class="s1">'</span><span class="s">0600'</span>
  <span class="na">vars</span><span class="pi">:</span>
    <span class="na">backup_bucket</span><span class="pi">:</span> <span class="s2">"</span><span class="s">restic-stackback-{{</span><span class="nv"> </span><span class="s">vm_name</span><span class="nv"> </span><span class="s">}}-{{</span><span class="nv"> </span><span class="s">app_name</span><span class="nv"> </span><span class="s">}}-bucket"</span>
    <span class="na">backup_schedule</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">applications[app_name].backup_schedule</span><span class="nv"> </span><span class="s">|</span><span class="nv"> </span><span class="s">default('0</span><span class="nv"> </span><span class="s">2</span><span class="nv"> </span><span class="s">*</span><span class="nv"> </span><span class="s">*</span><span class="nv"> </span><span class="s">*')</span><span class="nv"> </span><span class="s">}}"</span>
  <span class="na">when</span><span class="pi">:</span> <span class="s">applications[app_name].backup_enabled | default(false)</span>
</code></pre></div></div>

<p>Docker Compose integration:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># templates/docker-compose.yml.j2</span>
<span class="pi">{</span><span class="err">%</span> <span class="nv">if app.backup_enabled | default(false) %</span><span class="pi">}</span>
  <span class="na">stackback</span><span class="pi">:</span>
    <span class="na">image</span><span class="pi">:</span> <span class="s">ghcr.io/mittbachweg/stack-back:2024.11.1</span>
    <span class="na">container_name</span><span class="pi">:</span> <span class="pi">{{</span> <span class="nv">app_name</span> <span class="pi">}}</span><span class="s">_stackback</span>
    <span class="na">env_file</span><span class="pi">:</span> <span class="s">./stackback.env</span>
    <span class="na">volumes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">/var/run/docker.sock:/var/run/docker.sock:ro</span>
    <span class="na">restart</span><span class="pi">:</span> <span class="s">unless-stopped</span>
<span class="pi">{</span><span class="err">%</span> <span class="nv">endif %</span><span class="pi">}</span>
</code></pre></div></div>

<h2 id="lessons-learned">Lessons Learned</h2>

<p><strong>What worked:</strong></p>
<ul>
  <li>✅ Application-level isolation caught backup failures early</li>
  <li>✅ Staggered schedules eliminated I/O contention</li>
  <li>✅ Lifecycle policies prevented storage exhaustion</li>
  <li>✅ Vault integration centralized credential management</li>
</ul>

<p><strong>What didn’t:</strong></p>
<ul>
  <li>❌ Initial 7-day retention was too short (extended to 30 days)</li>
  <li>❌ Forgot to monitor backup success (added Prometheus metrics)</li>
  <li>❌ Manual Restic repository initialization (automated via Ansible)</li>
</ul>

<p>The modular approach trades simplicity for reliability. Worth it.</p>]]></content><author><name>Ben Matheja</name></author><category term="homelab" /><category term="backup" /><category term="restic" /><category term="ansible" /><category term="infrastructure" /><summary type="html"><![CDATA[After launching Platform One in July with 6 applications across 3 VMs, my centralized backup strategy immediately broke. Stackback couldn’t discover volumes across different Docker Compose contexts.]]></summary></entry><entry><title type="html">Cloudflare: DNS Migration and Tunnel Integration</title><link href="https://matheja.me/2025/01/13/dns-migration-cloudflare.html" rel="alternate" type="text/html" title="Cloudflare: DNS Migration and Tunnel Integration" /><published>2025-01-13T16:30:00+01:00</published><updated>2025-01-13T16:30:00+01:00</updated><id>https://matheja.me/2025/01/13/dns-migration-cloudflare</id><content type="html" xml:base="https://matheja.me/2025/01/13/dns-migration-cloudflare.html"><![CDATA[<p>Cloudflare Tunnels enabled external access from behind Carrier Grade NAT or Dual-Stack Lite. Moving DNS from Route53 to Cloudflare was the logical next step.
<!--more--></p>

<h2 id="the-context">The Context</h2>

<p>I buy all my domains at Netcup (which I can warmly recommend) and delegate the nameservers to wherever I want them managed.
For years, that was AWS Route53 for <code class="language-plaintext highlighter-rouge">mittbachweg.de</code> and <code class="language-plaintext highlighter-rouge">benmatheja.de</code>.</p>

<p>Route53 worked well:</p>
<ul>
  <li>Reliable DNS resolution</li>
  <li>LetsEncrypt DNS-01 challenges via Traefik (automated cert issuance)</li>
  <li>Terraform-managed records</li>
</ul>

<p>But it wasn’t free. ~€12/year for basic DNS hosting.</p>

<p>The real trigger: Cloudflare Tunnels, which enabled external access from behind my ISP’s Carrier Grade NAT for the first time (more on that below). Once I was using Cloudflare for tunnels, managing DNS in two places didn’t make sense.</p>

<hr />

<h2 id="the-solution">The Solution</h2>

<p>Cloudflare’s free plan includes:</p>
<ul>
  <li>Unlimited DNS queries</li>
  <li>Unlimited DNS records</li>
  <li>DDoS protection</li>
  <li>Universal SSL for all subdomains</li>
  <li>Cloudflare Tunnels support</li>
  <li>Terraform provider</li>
</ul>

<p>Annual cost: $0</p>

<hr />

<h2 id="migration">Migration</h2>

<h3 id="export-route53-records">Export Route53 Records</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aws route53 list-resource-record-sets <span class="se">\</span>
  <span class="nt">--hosted-zone-id</span> Z1234EXAMPLE <span class="se">\</span>
  <span class="nt">--output</span> json <span class="o">&gt;</span> mittbachweg-zone.json
</code></pre></div></div>

<p>Discovered: 47 DNS records across both domains (A, AAAA, CNAME, MX, TXT).</p>
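
<p>Hand-porting 47 records invites typos, so the export can be scripted straight into provider resources. A rough sketch (the resource naming scheme and TTL fallback are assumptions, not the exact migration script):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json

# Emit cloudflare_record resources from the Route53 export; review before applying.
with open("mittbachweg-zone.json") as f:
    record_sets = json.load(f)["ResourceRecordSets"]

for rs in record_sets:
    if rs["Type"] not in {"A", "AAAA", "CNAME", "MX", "TXT"}:
        continue  # NS/SOA stay with Cloudflare itself
    # Route53 escapes wildcards in record names as \052.
    name = rs["Name"].rstrip(".").replace("\\052", "*")
    for i, rr in enumerate(rs.get("ResourceRecords", [])):
        label = f"{name.replace('.', '_').replace('*', 'wildcard')}_{rs['Type'].lower()}_{i}"
        print(f'resource "cloudflare_record" "{label}" {{')
        print("  zone_id = cloudflare_zone.mittbachweg.id")
        print(f'  name    = "{name}"')
        print(f'  type    = "{rs["Type"]}"')
        print(f'  value   = {json.dumps(rr["Value"])}')
        print(f'  ttl     = {rs.get("TTL", 300)}')
        print("}\n")
</code></pre></div></div>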

<h3 id="terraform-cloudflare-provider">Terraform Cloudflare Provider</h3>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># terraform/3-security/cloudflare.tf</span>
<span class="nx">terraform</span> <span class="p">{</span>
  <span class="nx">required_providers</span> <span class="p">{</span>
    <span class="nx">cloudflare</span> <span class="o">=</span> <span class="p">{</span>
      <span class="nx">source</span>  <span class="o">=</span> <span class="s2">"cloudflare/cloudflare"</span>
      <span class="nx">version</span> <span class="o">=</span> <span class="s2">"~&gt; 4.43.0"</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>

<span class="nx">provider</span> <span class="s2">"cloudflare"</span> <span class="p">{</span>
  <span class="nx">api_token</span> <span class="o">=</span> <span class="nx">var</span><span class="p">.</span><span class="nx">cloudflare_api_token</span>
<span class="p">}</span>

<span class="nx">resource</span> <span class="s2">"cloudflare_zone"</span> <span class="s2">"mittbachweg"</span> <span class="p">{</span>
  <span class="nx">zone</span> <span class="o">=</span> <span class="s2">"mittbachweg.de"</span>
  <span class="nx">plan</span> <span class="o">=</span> <span class="s2">"free"</span>
  <span class="nx">jump_start</span> <span class="o">=</span> <span class="kc">false</span>
<span class="p">}</span>

<span class="nx">resource</span> <span class="s2">"cloudflare_record"</span> <span class="s2">"gitlab"</span> <span class="p">{</span>
  <span class="nx">zone_id</span> <span class="o">=</span> <span class="nx">cloudflare_zone</span><span class="p">.</span><span class="nx">mittbachweg</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">name</span>    <span class="o">=</span> <span class="s2">"git"</span>
  <span class="nx">type</span>    <span class="o">=</span> <span class="s2">"A"</span>
  <span class="nx">value</span>   <span class="o">=</span> <span class="s2">"192.168.1.10"</span>
  <span class="nx">ttl</span>     <span class="o">=</span> <span class="mi">1</span>
<span class="p">}</span>

<span class="nx">resource</span> <span class="s2">"cloudflare_record"</span> <span class="s2">"wildcard_platform_one"</span> <span class="p">{</span>
  <span class="nx">zone_id</span> <span class="o">=</span> <span class="nx">cloudflare_zone</span><span class="p">.</span><span class="nx">mittbachweg</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">name</span>    <span class="o">=</span> <span class="s2">"*.one"</span>
  <span class="nx">type</span>    <span class="o">=</span> <span class="s2">"CNAME"</span>
  <span class="nx">value</span>   <span class="o">=</span> <span class="s2">"ruby.mittbachweg.de"</span>
  <span class="nx">ttl</span>     <span class="o">=</span> <span class="mi">300</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="ansible-integration">Ansible Integration</h3>

<p>Instead of managing DNS via Terraform for every service, I integrated Cloudflare DNS into Ansible roles:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># roles/platform_one/tasks/dns.yml</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Register Cloudflare DNS record for application</span>
  <span class="na">community.general.cloudflare_dns</span><span class="pi">:</span>
    <span class="na">zone</span><span class="pi">:</span> <span class="s">mittbachweg.de</span>
    <span class="na">record</span><span class="pi">:</span> <span class="s2">"</span><span class="s">.one"</span>
    <span class="na">type</span><span class="pi">:</span> <span class="s">A</span>
    <span class="na">value</span><span class="pi">:</span> <span class="s2">"</span><span class="s">"</span>
    <span class="na">proxied</span><span class="pi">:</span> <span class="s">yes</span>
    <span class="na">account_email</span><span class="pi">:</span> <span class="s2">"</span><span class="s">"</span>
    <span class="na">account_api_token</span><span class="pi">:</span> <span class="s2">"</span><span class="s">"</span>
    <span class="na">state</span><span class="pi">:</span> <span class="s">present</span>
  <span class="na">when</span><span class="pi">:</span> <span class="s">dns_record | default(false)</span>
</code></pre></div></div>

<p><strong>Benefits:</strong></p>

<ul>
  <li>DNS records created automatically when deploying services</li>
  <li>No manual Terraform updates for each app</li>
  <li>DNS lifecycle tied to application deployment</li>
</ul>

<h3 id="phase-4-nameserver-cutover">Phase 4: Nameserver Cutover</h3>

<p>The critical migration moment:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># 1. Verify Cloudflare DNS is fully populated</span>
dig @eva.ns.cloudflare.com git.mittbachweg.de

<span class="c"># Verify Cloudflare DNS populated</span>
dig @eva.ns.cloudflare.com git.mittbachweg.de

<span class="c"># Lower Route53 TTL to 60 seconds</span>
aws route53 change-resource-record-sets <span class="nt">--change-batch</span> <span class="s1">'...'</span>

<span class="c"># Update nameservers at Netcup</span>
<span class="c"># Old: ns-1234.awsdns-12.org</span>
<span class="c"># New: eva.ns.cloudflare.com, walt.ns.cloudflare.com</span>

<span class="c"># Monitor propagation</span>
watch <span class="nt">-n</span> 5 <span class="s1">'dig NS mittbachweg.de +short'</span>
</code></pre></div></div>

<p>Result: Zero downtime. Full propagation in ~45 minutes.</p>

<h2 id="cloudflare-tunnel-the-real-motivation">Cloudflare Tunnel: The Real Motivation</h2>

<p>Port forwarding was never an option for me. My ISP uses Carrier Grade NAT (DS-Lite), which means I don’t have a public IPv4 address. Traditional port forwarding simply doesn’t work in this setup.</p>

<p>Even if I could forward ports, the security implications always turned me off. Exposing services directly to the internet invites constant scanning, brute force attempts, and potential DDoS attacks.</p>

<p>Cloudflare Tunnels changed everything. For the first time, I could reliably expose services from home without:</p>
<ul>
  <li>❌ Public IP address</li>
  <li>❌ Port forwarding rules</li>
  <li>❌ Firewall holes</li>
  <li>❌ Direct exposure to internet threats</li>
</ul>

<p>The architecture:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Internet → Cloudflare Edge → Encrypted Tunnel → Traefik → Services
</code></pre></div></div>

<p>Cloudflare establishes an outbound connection from my homelab to their edge network. All inbound traffic is proxied through Cloudflare’s DDoS protection and WAF. My actual infrastructure remains completely hidden.</p>

<h3 id="terraform-configuration">Terraform Configuration</h3>

<p>Tunnel setup:</p>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">resource</span> <span class="s2">"cloudflare_tunnel"</span> <span class="s2">"app_platform"</span> <span class="p">{</span>
  <span class="nx">account_id</span> <span class="o">=</span> <span class="nx">var</span><span class="p">.</span><span class="nx">cloudflare_account_id</span>
  <span class="nx">name</span>       <span class="o">=</span> <span class="s2">"app-platform-tunnel"</span>
  <span class="nx">secret</span>     <span class="o">=</span> <span class="nx">base64encode</span><span class="p">(</span><span class="nx">random_password</span><span class="p">.</span><span class="nx">tunnel_secret</span><span class="p">.</span><span class="nx">result</span><span class="p">)</span>
<span class="p">}</span>

<span class="nx">resource</span> <span class="s2">"cloudflare_tunnel_config"</span> <span class="s2">"app_platform"</span> <span class="p">{</span>
  <span class="nx">tunnel_id</span>  <span class="o">=</span> <span class="nx">cloudflare_tunnel</span><span class="p">.</span><span class="nx">app_platform</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">account_id</span> <span class="o">=</span> <span class="nx">var</span><span class="p">.</span><span class="nx">cloudflare_account_id</span>
  
  <span class="nx">config</span> <span class="p">{</span>
    <span class="nx">ingress_rule</span> <span class="p">{</span>
      <span class="nx">hostname</span> <span class="o">=</span> <span class="s2">"git.mittbachweg.de"</span>
      <span class="nx">service</span>  <span class="o">=</span> <span class="s2">"https://ruby.internal:443"</span>
    <span class="p">}</span>
    
    <span class="nx">ingress_rule</span> <span class="p">{</span>
      <span class="nx">hostname</span> <span class="o">=</span> <span class="s2">"*.one.mittbachweg.de"</span>
      <span class="nx">service</span>  <span class="o">=</span> <span class="s2">"https://ruby.internal:443"</span>
    <span class="p">}</span>
    
    <span class="nx">ingress_rule</span> <span class="p">{</span>
      <span class="nx">service</span> <span class="o">=</span> <span class="s2">"http_status:404"</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>

<span class="nx">resource</span> <span class="s2">"cloudflare_record"</span> <span class="s2">"tunnel_gitlab"</span> <span class="p">{</span>
  <span class="nx">zone_id</span> <span class="o">=</span> <span class="nx">cloudflare_zone</span><span class="p">.</span><span class="nx">mittbachweg</span><span class="p">.</span><span class="nx">id</span>
  <span class="nx">name</span>    <span class="o">=</span> <span class="s2">"git"</span>
  <span class="nx">type</span>    <span class="o">=</span> <span class="s2">"CNAME"</span>
  <span class="nx">value</span>   <span class="o">=</span> <span class="s2">"${cloudflare_tunnel.app_platform.id}.cfargotunnel.com"</span>
  <span class="nx">proxied</span> <span class="o">=</span> <span class="kc">true</span>
<span class="p">}</span>
</code></pre></div></div>
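
<p>Rolling this out is a plain Terraform run; a sketch of the loop I’d expect (standard commands, no extra flags):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># preview tunnel, ingress config and CNAME record changes</span>
terraform plan

<span class="c"># create the tunnel and wire git.mittbachweg.de to it</span>
terraform apply
</code></pre></div></div>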

<h3 id="deployment">Deployment</h3>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># roles/cloudflare_tunnel/tasks/main.yml</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Install cloudflared</span>
  <span class="na">apt</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">cloudflared</span>

<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Configure tunnel credentials</span>
  <span class="na">copy</span><span class="pi">:</span>
    <span class="na">content</span><span class="pi">:</span> <span class="s">\"{{ tunnel_credentials | to_json }}\"</span>
    <span class="na">dest</span><span class="pi">:</span> <span class="s">/etc/cloudflared/credentials.json</span>
    <span class="na">mode</span><span class="pi">:</span> <span class="s1">'</span><span class="s">0600'</span>

<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Start tunnel service</span>
  <span class="na">systemd</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">cloudflared-tunnel@app-platform</span>
    <span class="na">enabled</span><span class="pi">:</span> <span class="s">yes</span>
</code></pre></div></div>
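
<p>After the role has run, the connector can be checked on the host itself; a sketch (unit name as templated above):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># verify the systemd unit is enabled and running</span>
systemctl status cloudflared-tunnel@app-platform

<span class="c"># follow the connector logs while it registers with the edge</span>
journalctl -u cloudflared-tunnel@app-platform -f
</code></pre></div></div>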

<h3 id="why-this-matters">Why This Matters</h3>

<p>Cloudflare Tunnels solved a problem I couldn’t fix any other way:</p>
<ul>
  <li><strong>Carrier Grade NAT / DualStack Lite</strong>: No public IPv4, port forwarding impossible</li>
  <li><strong>Security</strong>: Zero exposed ports, all traffic proxied through Cloudflare, with the option to add policies like GeoIP-based blocking or an additional SSO layer</li>
  <li><strong>Reliability</strong>: Persistent outbound connection, no inbound firewall rules needed</li>
  <li><strong>DDoS protection</strong>: Cloudflare’s network absorbs attacks before they reach my connection</li>
</ul>

<p>This wasn’t a migration from port forwarding. It was the first time I could expose homelab services to the internet reliably and securely.</p>

<h2 id="what-i-learned">What I Learned</h2>

<h3 id="ansible-vs-terraform-for-dns-management">Ansible vs. Terraform for DNS Management</h3>

<p>I used Ansible for DNS record creation during service deployments and VM setups. This caused problems.</p>

<p><strong>The issue:</strong> Ansible doesn’t track state. When you remove a DNS record from your playbook, Ansible doesn’t know to delete it from Cloudflare. The record stays there until you manually remove it.</p>

<p>This led to:</p>
<ul>
  <li><strong>Duplicate records:</strong> Deploying the same service twice created multiple DNS entries</li>
  <li><strong>Stale records:</strong> Removing services left orphaned DNS records in Cloudflare</li>
  <li><strong>Manual cleanup:</strong> Had to manually edit DNS records in Cloudflare’s UI to fix duplicates</li>
</ul>

<p><strong>The fix:</strong> Move DNS records to Terraform where possible.</p>

<p>Terraform tracks state. When you remove a record from your Terraform code:</p>
<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Delete this resource</span>
<span class="nx">resource</span> <span class="s2">"cloudflare_record"</span> <span class="s2">"old_service"</span> <span class="p">{</span>
  <span class="c1"># ...</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">terraform apply</code>, and Terraform automatically removes it from Cloudflare.</p>
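
<p>The workflow for retiring a record is worth spelling out; a short sketch using standard Terraform commands:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># list the DNS records Terraform currently tracks in state</span>
terraform state list | grep cloudflare_record

<span class="c"># after deleting the resource block, the plan shows one resource to destroy</span>
terraform plan
terraform apply
</code></pre></div></div>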

<p><strong>Current approach:</strong></p>
<ul>
  <li><strong>Terraform:</strong> Static DNS records (zones, subdomains, tunnel CNAMEs) - the majority</li>
  <li><strong>Ansible:</strong> Dynamic records only when deployment requires it (rare)</li>
</ul>

<p>Terraform’s state management eliminated the duplicate/stale record problem entirely.
Nearly everything is now configured via Terraform, and LetsEncrypt DNS-01 challenges still work via Traefik.</p>

<p>The migration to Cloudflare made sense for consistency with Tunnel and eliminated AWS entirely.</p>]]></content><author><name>Ben Matheja</name></author><category term="homelab" /><category term="cloudflare" /><category term="dns" /><category term="terraform" /><category term="tunnel" /><category term="aws" /><summary type="html"><![CDATA[Cloudflare Tunnels enabled external access from behind Carrier Grade NAT or Dual-Stack Lite. Moving DNS from Route53 to Cloudflare was the logical next step.]]></summary></entry><entry><title type="html">Reflecting on Progress: My Homelab Journey Since 2020</title><link href="https://matheja.me/2025/01/02/homelab-update.html" rel="alternate" type="text/html" title="Reflecting on Progress: My Homelab Journey Since 2020" /><published>2025-01-02T12:09:00+01:00</published><updated>2025-01-02T12:09:00+01:00</updated><id>https://matheja.me/2025/01/02/homelab-update</id><content type="html" xml:base="https://matheja.me/2025/01/02/homelab-update.html"><![CDATA[<p>A lot has happened since my last update in 2020. Here’s a rundown of the major changes:</p>

<!--more-->

<ul>
  <li>Home Renovation and Automation: We moved and renovated a house, which led to automating shutters with Shelly 2.5s.</li>
  <li>Proxmox Node Setup: I turned my TX120 S3 running Ubuntu into my first Proxmox Node. This setup, featuring Home Assistant, a self-hosted VPN VM, and a GitLab Instance, served me well for months.</li>
  <li>Deploying a Real Server: I acquired an RX2530 M2 rack-mounted server. Setting up VMs on it taught me about IPMI and the key differences between servers and consumer PCs (Noise, Heat, Power Consumption, Redundancy).</li>
  <li>HP EliteDesk Addition: I added two HP EliteDesk 600 G3 units to leverage integrated GPUs for transcoding and overall lab management.</li>
  <li>Central Media Server: Recently, I got an ITX system with an Intel i5-11400, now my central media server due to its superior Intel UHD 730 transcoding capabilities.</li>
</ul>

<p>This journey has been one of learning, experimenting, and optimizing. Each step brought new insights and capabilities. Stay tuned for more updates as I continue to refine my setup and explore new technologies!</p>]]></content><author><name>Ben Matheja</name></author><category term="development" /><category term="homelab" /><category term="proxmox" /><category term="ansible" /><category term="gitlab" /><summary type="html"><![CDATA[A lot has happened since my last update in 2020. Here’s a rundown of the major changes:]]></summary></entry><entry><title type="html">Infrastructure as Code: From Manual Provisioning to Ansible + Terraform</title><link href="https://matheja.me/2024/04/15/infrastructure-as-code-journey.html" rel="alternate" type="text/html" title="Infrastructure as Code: From Manual Provisioning to Ansible + Terraform" /><published>2024-04-15T10:15:00+02:00</published><updated>2024-04-15T10:15:00+02:00</updated><id>https://matheja.me/2024/04/15/infrastructure-as-code-journey</id><content type="html" xml:base="https://matheja.me/2024/04/15/infrastructure-as-code-journey.html"><![CDATA[<p>After three years of manually provisioning VMs through ssh and adjusting docker-compose files on the hosts, I finally committed to Infrastructure as Code.
April 1, 2024: First commit to the infrastructure repository.
<!--more--></p>

<h2 id="the-problem">The Problem</h2>

<p>Since my first homelab post in November 2020, I’d accumulated a collection of snowflake servers—each one unique, manually configured, and completely undocumented.</p>

<p><strong>The cost:</strong></p>
<ul>
  <li>❌ 4-hour recovery times for failed VMs</li>
  <li>❌ Deployment anxiety (one wrong click = broken service)</li>
  <li>❌ Zero reproducibility</li>
  <li>❌ Tribal knowledge locked in my head</li>
</ul>

<p>The wake-up call: A hard drive failure on my GitLab VM. I had no backup of the VM configuration.
Was it 8GB or 16GB of RAM? What VLAN was it on? Where were the mount points?</p>

<h2 id="the-solution">The Solution</h2>

<p>Every component of my infrastructure is now defined in code. No more clicking through UIs. No more manual SSH sessions.</p>

<h3 id="two-layer-architecture">Two-Layer Architecture</h3>

<p><strong>Terraform:</strong> Manages external/immutable infrastructure.</p>
<ul>
  <li>Cloudflare DNS records and Tunnel</li>
</ul>

<p><strong>Ansible:</strong> Manages mutable state and VM lifecycle.</p>
<ul>
  <li>Proxmox VMs (provisioning from cloud-init templates)</li>
  <li>Docker containers</li>
  <li>File system configurations (NFS mounts)</li>
  <li>Service orchestration</li>
</ul>

<p>Why both? Terraform manages external resources. Ansible handles everything on Proxmox VMs.</p>

<h2 id="implementation">Implementation</h2>

<h3 id="vm-provisioning-with-ansible">VM Provisioning with Ansible</h3>

<p>Before, I clicked through Proxmox UI screens and hoped I wrote down what I did.</p>

<p>Now, VM specs are defined in YAML:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># vars/vm_specs.yml</span>
<span class="na">vms</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">ruby</span>
    <span class="na">vmId</span><span class="pi">:</span> <span class="m">920</span>
    <span class="na">target_node</span><span class="pi">:</span> <span class="s">pve5</span>
    <span class="na">cores</span><span class="pi">:</span> <span class="m">8</span>
    <span class="na">memory</span><span class="pi">:</span> <span class="m">16384</span>
    <span class="na">disk</span><span class="pi">:</span>
      <span class="na">scsi0</span><span class="pi">:</span> <span class="s1">'</span><span class="s">nvme-thin:200'</span>
    <span class="na">net</span><span class="pi">:</span>
      <span class="na">net0</span><span class="pi">:</span> <span class="s1">'</span><span class="s">virtio,bridge=vmbr0,tag=50'</span>
    <span class="na">ipconfig</span><span class="pi">:</span>
      <span class="na">ipconfig0</span><span class="pi">:</span> <span class="s1">'</span><span class="s">ip=192.168.50.20/24,gw=192.168.50.1'</span>
    <span class="na">tags</span><span class="pi">:</span> <span class="pi">[</span><span class="s1">'</span><span class="s">ansible'</span><span class="pi">,</span> <span class="s1">'</span><span class="s">gitlab-runner'</span><span class="pi">]</span>
    <span class="na">clone</span><span class="pi">:</span> <span class="s1">'</span><span class="s">ubuntu-24.04-server-cloudinit-template'</span>
</code></pre></div></div>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># roles/hypervisor/tasks/provision_vm.yml</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Provision VM from cloud-init template</span>
  <span class="na">community.general.proxmox_kvm</span><span class="pi">:</span>
    <span class="na">api_host</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">ansible_host</span><span class="nv"> </span><span class="s">}}"</span>
    <span class="na">node</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">inventory_hostname</span><span class="nv"> </span><span class="s">}}"</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">vm_name</span><span class="nv"> </span><span class="s">}}"</span>
    <span class="na">vmid</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">vm_id</span><span class="nv"> </span><span class="s">}}"</span>
    <span class="na">clone</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">vm_clone</span><span class="nv"> </span><span class="s">}}"</span>
    <span class="na">cores</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">vm_cores</span><span class="nv"> </span><span class="s">}}"</span>
    <span class="na">memory</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">vm_memory</span><span class="nv"> </span><span class="s">}}"</span>
    <span class="na">net</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">vm_net</span><span class="nv"> </span><span class="s">}}"</span>
    <span class="na">state</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">vm_state</span><span class="nv"> </span><span class="s">}}"</span>
</code></pre></div></div>
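
<p>Provisioning is then a single playbook run. A minimal sketch (the tag and node names here are assumptions, not necessarily what the repo uses):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># create or converge every VM defined in vars/vm_specs.yml</span>
ansible-playbook main_playbook.yml --tags hypervisor --limit pve5
</code></pre></div></div>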

<p>Benefits:</p>
<ul>
  <li>✅ Reproducible (spin up identical VMs from templates)</li>
  <li>✅ Version controlled (VM specs in Git)</li>
  <li>✅ Self-documenting (vm_specs.yml IS the documentation)</li>
  <li>✅ Idempotent (run multiple times safely)</li>
</ul>

<h3 id="application-deployment">Application Deployment</h3>

<p>Before: SSH in, manually install packages, hope nothing breaks.</p>

<p>After:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># roles/platform_one/tasks/gitlab.yml</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Ensure GitLab directory structure</span>
  <span class="na">file</span><span class="pi">:</span>
    <span class="na">path</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">container_data</span><span class="nv"> </span><span class="s">}}/ruby/gitlab"</span>
    <span class="na">state</span><span class="pi">:</span> <span class="s">directory</span>

<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Template GitLab docker-compose</span>
  <span class="na">template</span><span class="pi">:</span>
    <span class="na">src</span><span class="pi">:</span> <span class="s">gitlab-docker-compose.yml.j2</span>
    <span class="na">dest</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">container_data</span><span class="nv"> </span><span class="s">}}/ruby/gitlab/docker-compose.yml"</span>
  <span class="na">notify</span><span class="pi">:</span> <span class="s">restart gitlab</span>

<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Deploy GitLab container</span>
  <span class="na">community.docker.docker_compose_v2</span><span class="pi">:</span>
    <span class="na">project_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">{{</span><span class="nv"> </span><span class="s">container_data</span><span class="nv"> </span><span class="s">}}/ruby/gitlab"</span>
    <span class="na">state</span><span class="pi">:</span> <span class="s">present</span>
</code></pre></div></div>

<p>Run once: <code class="language-plaintext highlighter-rouge">ansible-playbook main_playbook.yml --tags gitlab --limit ruby</code></p>

<h3 id="terraform--ansible-integration">Terraform → Ansible Integration</h3>

<p>Terraform outputs (VM IPs, bucket names) flow into Ansible via a generated vars file:</p>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># terraform/outputs.tf</span>
<span class="nx">resource</span> <span class="s2">"local_file"</span> <span class="s2">"ansible_vars"</span> <span class="p">{</span>
  <span class="nx">filename</span> <span class="o">=</span> <span class="s2">"${path.root}/../../vars/tf_ansible_vars_file.yml"</span>
  <span class="nx">content</span>  <span class="o">=</span> <span class="nx">yamlencode</span><span class="p">({</span>
    <span class="nx">minio_endpoint</span> <span class="o">=</span> <span class="nx">minio_s3_bucket</span><span class="p">.</span><span class="nx">backups</span><span class="p">.</span><span class="nx">endpoint</span>
    <span class="nx">vault_addr</span>     <span class="o">=</span> <span class="nx">vault_auth_backend</span><span class="p">.</span><span class="nx">oidc</span><span class="p">.</span><span class="nx">path</span>
  <span class="p">})</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Ansible consumes this automatically:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># playbook.yml</span>
<span class="pi">-</span> <span class="na">hosts</span><span class="pi">:</span> <span class="s">all</span>
  <span class="na">vars_files</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">vars/tf_ansible_vars_file.yml</span>
</code></pre></div></div>

<h2 id="the-results">The Results</h2>

<p>Full environment rebuild:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">cd </span>terraform/1-infrastructure <span class="o">&amp;&amp;</span> terraform apply
<span class="nv">$ </span>ansible-playbook main_playbook.yml
</code></pre></div></div>

<p>That’s it. Every VM, every service, every configuration restored from code.</p>

<p>This was the foundation. Terraform came later in December 2024. For now, Ansible handled everything—VM provisioning, Docker deployments, configuration management. The infrastructure repo was born on April 1, 2024. What followed was rapid iteration.</p>]]></content><author><name>Ben Matheja</name></author><category term="homelab" /><category term="ansible" /><category term="terraform" /><category term="proxmox" /><category term="devops" /><summary type="html"><![CDATA[After three years of manually provisioning VMs through ssh and adjusting docker-compose files on the hosts, I finally committed to Infrastructure as Code. April 1, 2024: First commit to the infrastructure repository.]]></summary></entry><entry><title type="html">Homelab - Part 1: Overview</title><link href="https://matheja.me/2020/11/07/homelab-part1.html" rel="alternate" type="text/html" title="Homelab - Part 1: Overview" /><published>2020-11-07T09:31:00+01:00</published><updated>2020-11-07T09:31:00+01:00</updated><id>https://matheja.me/2020/11/07/homelab-part1</id><content type="html" xml:base="https://matheja.me/2020/11/07/homelab-part1.html"><![CDATA[<p>Due to Covid-19 I took on the task to build something new and exciting.
It was time to rebuild and extend the services I’m hosting in my local network.
As a quick premise: everything I’m running locally is only exposed to clients within the same network.
<!--more--></p>

<h2 id="status-quo">Status Quo</h2>

<p>Needing to run at least the Unifi-Controller within my network to manage my Unifi devices, I had previously used an old Raspberry Pi 2.
The RPi 2 mounted my Synology NAS via NFS, as I was aware of the file system issues that come with relying on the internal SD card alone.
Still, the setup was painfully slow and not extensible.</p>

<h2 id="overview">Overview</h2>

<p>With containers and orchestration tools such as <code class="language-plaintext highlighter-rouge">docker-compose</code> you can bring up entire stacks within seconds and manage them painlessly.
I procured a used <a href="https://sp.ts.fujitsu.com/dmsp/Publications/public/ds-py-tx120-s3.pdf">Fujitsu TX-120 S3</a> with a single Xeon E3-1220 (3.1 GHz), 8 GB RAM and 300 GB SAS drives for under 200€.
Then I installed Ubuntu 18.04 on it to use it as my primary docker-compose host.</p>

<p>Sidenote: The TX-120 is placed in the hall of our flat and I’m a bit annoyed by the continuous, noisy writes of the disks.
So I’ll opt to switch the SAS drives to 2.5” SSDs in the future.
But still, I’m really amazed by the size of the case. You’ll easily find a spot for a server this size. And as a bonus, it features 2 usable network interfaces out of the box.</p>

<p>Here is an overview of the current setup.</p>

<pre><code class="language-asci">                          +--------------------------------------+
                          |                       TX120 S3 (io)  |
                          |                       docker-compose |
                          |                                      |
+-----------+             | +-----------+        +-------------+ |
|           |             | |           |        |             | |
| Quaysi.de +---------------&gt;  Traefik  +--------&gt;  Services   | |
|           |             | |           |        |             | |
+-----------+             | +-----------+        +-------------+ |
                          |                                      |
                          +--------------------------------------+

</code></pre>

<p>Everything is accessible and exposed via HTTPS under the quaysi.de domain.
Each service has its own subdomain, e.g. unifi.quaysi.de.</p>

<h3 id="traefik-domain-and-certificates">Traefik, Domain and Certificates</h3>

<p><a href="https://traefik.io/">Traefik</a> is configured as the central edge router handling incoming requests and forwarding them to the services.
If you have ever set up local services, you have probably encountered the issues around untrusted certificates.
With the help of <a href="https://aws.amazon.com/route53/">Amazon Route 53</a>, <a href="https://traefik.io/">Traefik</a> and the LetsEncrypt resolver it’s possible to work around that.</p>

<ul>
  <li>Set public records in a Route 53 hosted zone for your internal services, e.g. quaysi.de pointing to 192.168.1.9, 192.168.1.10 and 192.168.1.11</li>
  <li>Configure Traefik to use the <a href="https://doc.traefik.io/traefik/user-guides/docker-compose/acme-dns/">DNS-Challenge</a> and provision AWS credentials, as in the sketch below</li>
</ul>
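
<p>A minimal sketch of the Traefik side (Traefik v2 flag syntax; the resolver name, email, image tag and credential values are placeholders, and in practice this belongs in a docker-compose.yml):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># run Traefik with the Route 53 DNS-challenge resolver</span>
docker run -d --name traefik \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -e AWS_ACCESS_KEY_ID=... -e AWS_SECRET_ACCESS_KEY=... \
  -e AWS_REGION=... -e AWS_HOSTED_ZONE_ID=... \
  -p 443:443 traefik:v2.3 \
  --providers.docker=true \
  --entrypoints.websecure.address=:443 \
  --certificatesresolvers.le.acme.email=admin@quaysi.de \
  --certificatesresolvers.le.acme.storage=/acme.json \
  --certificatesresolvers.le.acme.dnschallenge.provider=route53
</code></pre></div></div>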

<p>The effect: all your internal services receive a trusted LetsEncrypt certificate, and those annoying “untrusted connection” warnings disappear, since everything is provisioned automatically.</p>

<p>This was the first part of the series about my homelab. As always love to hear your feedback about.</p>]]></content><author><name>Ben Matheja</name></author><category term="development" /><category term="traefik" /><category term="container" /><category term="docker" /><category term="unix" /><category term="homelab" /><category term="route53" /><category term="letsencrypt" /><summary type="html"><![CDATA[Due to Covid-19 I took on the task to build something new and exciting. It was time to rebuild and extend services i’m hosting in my local network. As a quick premise: Everything i’m running locally is only exposed to clients within the same network.]]></summary></entry><entry><title type="html">Stop Trying to Make Hard Work Easy</title><link href="https://matheja.me/2020/08/20/stop-trying-to-make-hard-work-easy.html" rel="alternate" type="text/html" title="Stop Trying to Make Hard Work Easy" /><published>2020-08-20T15:37:00+02:00</published><updated>2020-08-20T15:37:00+02:00</updated><id>https://matheja.me/2020/08/20/stop-trying-to-make-hard-work-easy</id><content type="html" xml:base="https://matheja.me/2020/08/20/stop-trying-to-make-hard-work-easy.html"><![CDATA[<p>Today I want to recommend an interesting article from Nir Eyal.
<!--more--></p>

<p><a href="https://superorganizers.substack.com/p/stop-trying-to-make-hard-work-easy">Stop Trying to Make Hard Work Easy by Nir Eyal</a></p>

<p>Nir Eyal highlights that the number one barrier to getting our work done is distraction. Quite interestingly for me, he elaborates on the opposite of distraction, which is not focus but traction.</p>

<p>From his perspective, the main challenge in remaining productive, i.e. gaining traction, is mastering our triggers. They may be internal (our discomfort when pushed to focus on a given task without allowing ourselves any distraction) as well as external (disturbances like a message on our phone).</p>]]></content><author><name>Ben Matheja</name></author><category term="productivity" /><category term="focus" /><category term="newwork" /><summary type="html"><![CDATA[Today I want to recommend an interesting article from Nir Eyal.]]></summary></entry><entry><title type="html">Journey to Serverless - Migrated my Todoist Integration to Lambda</title><link href="https://matheja.me/2020/06/15/transform-todoist-integration-to-serverless.html" rel="alternate" type="text/html" title="Journey to Serverless - Migrated my Todoist Integration to Lambda" /><published>2020-06-15T11:00:00+02:00</published><updated>2020-06-15T11:00:00+02:00</updated><id>https://matheja.me/2020/06/15/transform-todoist-integration-to-serverless</id><content type="html" xml:base="https://matheja.me/2020/06/15/transform-todoist-integration-to-serverless.html"><![CDATA[<p>I migrated my <a href="https://github.com/BenMatheja/todoist-serverless-lambda">Todoist Webhook Integration</a> from a self-hosted version towards <a href="https://aws.amazon.com/lambda/">AWS Lambda</a>.</p>

<p>Here is why!
<!--more--></p>
<h2 id="why">Why</h2>
<p>My old approach had a lot of shortcomings.</p>

<p>The way I was developing the app and the way it ran in production were not the same.
There was no real CI/CD process involved, and the packaging of the app really differed from local development. This caused issues whenever changes had to be made.</p>

<p>The simple function consisted of too many moving parts, which also introduced more complexity to the overall system.
I used an <em>apt-get installed</em> Nginx as a reverse proxy. There was <a href="https://github.com/benoitc/gunicorn">Gunicorn</a> with its own configuration files and, last but not least, the Python app itself.</p>

<p>The durability of the app was not convincing. From a user’s perspective, the performance got worse the longer the application had been running. This resulted in dropped events and made the service unreliable.
I remember someone talking about self-driving cars saying “if it doesn’t work in all circumstances - there is no use for it”. To be honest, the complexity of the <a href="https://github.com/BenMatheja/todoist-serverless-lambda">Todoist Webhook Integration</a> is trivial compared to the challenges of writing software for self-driving cars, but the argument still holds.</p>

<p>The handling of credentials was far from optimal. In the old app, they were simply placed in a settings.py file on the machine.</p>

<h2 id="what-i-did">What I did</h2>

<p>I used both <a href="https://github.com/Miserlou/Zappa">Zappa</a> and <a href="https://www.serverless.com/">Serverless</a> for setting up the AWS stack and configuring the app to run properly on <a href="https://aws.amazon.com/lambda/">AWS Lambda</a>. Serverless seemed more mature to me, which is why I’m still using it.</p>
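
<p>For reference, a minimal sketch of the Serverless deploy flow (the stage name is an assumption; the actual function and events are defined in the repo’s serverless.yml):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># install the framework, then package and deploy the function to AWS</span>
npm install -g serverless
serverless deploy --stage prod
</code></pre></div></div>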

<p>The integration runs within the free tier with no hassle. I configured <a href="https://developer.todoist.com/sync/v8/#webhooks">Todoist Webhooks</a> to fire whenever an item on my list is marked as <em>completed</em>.</p>

<p>I reduced the moving parts that need maintaining: package the app, make sure it runs, deploy it, test it on Lambda, and you’re done.</p>

<p>I fought a bit with GitHub Actions to set up CI/CD, but I can say that it now deploys to AWS whenever something is pushed to master. This is a real relief, knowing that whenever I commit changes to the repository, the app will get deployed in the same (working) way.
So you can say I’m pretty happy with the status quo of the app.</p>

<h2 id="future">Future</h2>
<ul>
  <li>Let GitHub Actions run integration tests after the deployment to dev (e.g. check that the endpoint responds as expected, using newman or something else). If successful, stage to prod</li>
</ul>]]></content><author><name>Ben Matheja</name></author><category term="aws" /><category term="lambda" /><category term="todoist" /><category term="python" /><category term="development" /><summary type="html"><![CDATA[I migrated my Todoist Webhook Integration from a self-hosted version towards AWS Lambda. Here is why!]]></summary></entry></feed>