vault backup: 2026-01-14 21:12:55

Affected files:
.obsidian/workspace.json
2 Personal/Home Lab/Pangolin Installation.md
This commit is contained in:
2026-01-14 21:12:55 +01:00
parent 224efa0b6b
commit 9188e43301
2 changed files with 948 additions and 13 deletions


@@ -27,12 +27,12 @@
"state": {
"type": "markdown",
"state": {
"file": "2 Personal/Home Lab/Homelab.md",
"file": "2 Personal/Home Lab/Pangolin Installation.md",
"mode": "source",
"source": false
},
"icon": "lucide-file",
"title": "Homelab"
"title": "Pangolin Installation"
}
},
{
@@ -92,7 +92,7 @@
}
}
],
"currentTab": 3
"currentTab": 1
}
],
"direction": "vertical"
@@ -317,7 +317,8 @@
"vantage-obsidian:Vantage - Advanced search builder": false,
"templater-obsidian:Templater": false,
"obsidian-git:Open Git source control": false,
"markdown-importer:Open format converter": false
"markdown-importer:Open format converter": false,
"periodic-notes:Open today": false
}
},
"floating": {
@@ -350,12 +351,14 @@
"id": "a0f509d11e0796d0",
"type": "leaf",
"state": {
"type": "release-notes",
"type": "markdown",
"state": {
"currentVersion": "1.11.4"
"file": "2 Personal/Home Lab/NAS/Backup Strategy.md",
"mode": "source",
"source": false
},
"icon": "lucide-book-up",
"title": "Release Notes 1.11.4"
"icon": "lucide-file",
"title": "Backup Strategy"
}
}
],
@@ -364,7 +367,7 @@
],
"direction": "vertical",
"x": 0,
"y": 44,
"y": 57,
"width": 900,
"height": 777,
"maximize": false,
@@ -372,12 +375,14 @@
}
]
},
"active": "45138afa5cf89635",
"active": "a136dcfcf5de7dd5",
"lastOpenFiles": [
"2 Personal/Home Lab/Homelab.md",
"2 Personal/Home Lab/Pangolin Installation.md",
"0 Journal/0 Daily/2026-01-07.md",
"2 Personal/Home Lab/NAS/Backup Strategy.md",
"2 Personal/Lists/Media/Bücher.md",
"2 Personal/Home Lab/Homelab Architecture.excalidraw.md",
"0 Journal/0 Daily/2026-01-07.md",
"2 Personal/Home Lab/Drawing 2026-01-09 15.01.17.excalidraw.md",
"99 Work/0 OneSec/OneSecNotes/Handover Planning.md",
"OneNote/Listen/Bücher.md",
@@ -398,8 +403,6 @@
"0 Journal/0 Daily/2025-12-09.md",
"0 Journal/Meetings/OneSec Cofounder Verhandlung.md",
"0 Journal/Meetings/Luca Radojevic - Meeting 1.md",
"0 Journal/0 Daily/2025-11-11.md",
"99 Work/0 OneSec/OneSecThoughts/Patent Analyse.md",
"Attachments/Pasted image 20251202214228.png",
"2 Personal/1 Skills/AI",
"2 Personal/Home Lab/Baerhalten",


@@ -0,0 +1,932 @@
# Pangolin + Gerbil + Traefik + Newt + Cloudflare Tunnel (cloudflared) — Deterministic Setup Notes
This document is a **README-style, long-term memory** of what was built and debugged in this chat session (Jan 14, 2026).
Goal: In **two years**, you should be able to reread this and immediately understand:
- what the architecture is
- how traffic flows end-to-end
- why certain configs exist
- what failure modes looked like and how they were diagnosed/fixed
---
## 0) What problem this stack solves
You want to expose **private services** (e.g. `glance`, `n8n`, etc.) via **friendly hostnames** on your domain, while:
- avoiding inbound port-forwarding (ideally)
- keeping access controlled (Pangolin policy/auth)
- keeping the architecture understandable and debuggable
- using a remote connector (Newt) from environments like LXC/VM/homelab networks
In practice you ended up with:
- **Pangolin** as control plane (UI + API + policy + config generator)
- **Gerbil** as data plane / “edge node” (WireGuard/UDP side + reachability endpoint)
- **Traefik** as reverse proxy in front of Gerbil (routing + Badger plugin)
- **Newt** as remote agent that connects a site back to Pangolin (WebSocket)
- **cloudflared** as Cloudflare Tunnel client (public HTTPS entry -> private origin)
- **Cloudflare DNS** for hostnames, and a special DNS-only UDP endpoint for Gerbil
---
## 1) Architecture overview
### 1.1 High-level components
- **Cloudflare (edge)**
  - DNS for `*.frusetik.com`
  - HTTP(S) reverse proxy to your tunnel (when proxied)
  - Cloudflare Tunnel “ingress rules” define where each hostname goes
- **cloudflared (tunnel client)**
  - Runs as a container (in your case it lived in another compose project, e.g. `n8n-compose`)
  - Maintains outbound tunnel connections to Cloudflare edge
  - For each hostname, forwards requests to an **origin service** (e.g. `http://gerbil:80` or `https://gerbil:443`)
- **Pangolin (control plane container)**
  - API + internal server + Next.js UI
  - Generates a Traefik dynamic config at:
    - `http://pangolin:3001/api/v1/traefik-config`
  - Tracks Newt connections and publishes routes to Traefik
- **Traefik (reverse proxy container)**
  - Pulls dynamic routes from Pangolin via **HTTP provider**
  - Also loads a local file provider config (`/etc/traefik/dynamic_config.yml`)
  - Runs with the **Badger plugin** (used by Pangolin for auth/policy enforcement)
  - In your deployment: Traefik runs **inside the same network namespace as Gerbil**
    - `network_mode: service:gerbil`
- **Gerbil (data plane container)**
  - Has UDP ports published to the host:
    - `51820/udp`, `21820/udp` (WireGuard-ish / overlay traffic)
  - Has a reachable endpoint (`--reachableAt=http://gerbil:3004`)
  - Pulls remote config from Pangolin: `--remoteConfig=http://pangolin:3001/api/v1/`
  - Hosts the Traefik ports implicitly because Traefik shares its namespace
- **Newt (remote agent, runs outside docker in your case)**
  - A client connecting **outbound** to Pangolin
  - Lets Pangolin route requests to services in a remote site
  - On the remote site, your test indicated the service name `glance` existed and was reachable internally (200 OK)
---
## 2) Mermaid diagrams
### 2.1 Request path (public hostname -> private service)
```mermaid
sequenceDiagram
autonumber
participant U as User Browser
participant CF as Cloudflare Edge
participant T as cloudflared (Tunnel Client)
participant G as Gerbil Namespace
participant TR as Traefik
participant P as Pangolin API/UI
participant N as Newt (remote)
participant S as Service (e.g. Glance)
U->>CF: HTTPS GET https://glance.frusetik.com
CF->>T: Forward via Tunnel (ingress rule)
T->>G: Origin request (http://gerbil:80 OR https://gerbil:443)
G->>TR: Request enters Traefik router
TR->>P: (Badger plugin / policy decisions)
TR->>N: Route to remote site address (generated by Pangolin)
N->>S: Forward to local service (e.g. http://glance:8080)
S-->>U: Response (via reverse path)
```
### 2.2 Docker network / namespace relationship
```mermaid
graph TD
subgraph DockerNetwork["Docker network: pangolin (bridge)"]
P[pangolin container<br/>ports: none published]
G[gerbil container<br/>UDP published: 51820/udp, 21820/udp]
TR[traefik container<br/>network_mode: service:gerbil]
P --- G
P --- TR
end
subgraph OtherCompose["Other compose project (e.g. n8n-compose)"]
CFd[cloudflared container]
end
CFd -. must join .-> DockerNetwork
TR -->|HTTP provider| P
G -->|remoteConfig| P
```
### 2.3 Control plane (config generation)
```mermaid
flowchart LR
P[Pangolin] -->|/api/v1/traefik-config| TR[Traefik HTTP Provider]
TR -->|dynamic routers/services| TR
P -->|tracks Newt connections| P
N[Newt Client] -->|WebSocket| P
```
---
## 3) Deterministic deployment (what runs where)
### 3.1 VPS docker-compose (Pangolin/Gerbil/Traefik)
You had a compose like this (simplified to highlight what mattered):
- network name: `pangolin`
- Traefik shares Gerbil network namespace: `network_mode: service:gerbil`
```yaml
services:
  pangolin:
    image: fosrl/pangolin:latest
    container_name: pangolin
    restart: unless-stopped
    volumes:
      - ./config:/app/config
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3001/api/v1/"]
      interval: "3s"
      timeout: "3s"
      retries: 15
  gerbil:
    image: fosrl/gerbil:latest
    container_name: gerbil
    restart: unless-stopped
    depends_on:
      pangolin:
        condition: service_healthy
    command:
      - --reachableAt=http://gerbil:3004
      - --generateAndSaveKeyTo=/var/config/key
      - --remoteConfig=http://pangolin:3001/api/v1/
    volumes:
      - ./config/:/var/config
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    ports:
      - 51820:51820/udp
      - 21820:21820/udp
  traefik:
    image: traefik:v3.4.0
    container_name: traefik
    restart: unless-stopped
    network_mode: service:gerbil
    depends_on:
      pangolin:
        condition: service_healthy
    command:
      - --configFile=/etc/traefik/traefik_config.yml
    volumes:
      - ./config/traefik:/etc/traefik:ro
      - ./config/letsencrypt:/letsencrypt
      - ./config/traefik/logs:/var/log/traefik
networks:
  default:
    driver: bridge
    name: pangolin
```
**Key implication:** Traefik and Gerbil share the same IP and exposed ports (because of `network_mode: service:gerbil`).
---
## 4) Traefik configuration (as actually used)
### 4.1 Static Traefik config (`traefik/traefik_config.yml`)
This is the “bootstrap” config. It defines:
- how Traefik loads config
- entrypoints (ports)
- plugin definitions
- logging
- ACME (if enabled). Note that you later saw the ACME provider running again, which indicates that the config had changed or that a different file was being loaded.
Your baseline file (annotated):
```yaml
api:
  insecure: true
  dashboard: true
  # Dashboard is OK only if Traefik is NOT public.
providers:
  http:
    endpoint: "http://pangolin:3001/api/v1/traefik-config"
    pollInterval: "5s"
    # Pangolin generates dynamic routers/services here.
  file:
    filename: "/etc/traefik/dynamic_config.yml"
    # Local static routers/services/middlewares if needed.
experimental:
  plugins:
    badger:
      moduleName: "github.com/fosrl/badger"
      version: "v1.3.0"
log:
  level: "INFO"
  format: "common"
entryPoints:
  web:
    address: ":80"
  # websecure:
  #   address: ":443"
  # Only enable if Traefik terminates TLS itself.
serversTransport:
  insecureSkipVerify: true
ping:
  entryPoint: "web"
```
### 4.2 Dynamic file provider (`traefik/dynamic_config.yml`)
This was your “manual fallback” for Pangolin UI routing and Badger middleware.
**Important lesson from the debugging:**
If Cloudflare Tunnel forwards to `https://gerbil:443`, but your routers only listen on the `web` entrypoint, you will get **404** from Traefik (no matching router on that entrypoint). This happened for `pangolin.frusetik.com`.
So the entrypoint choice must be consistent across:
- what cloudflared connects to (HTTP:80 or HTTPS:443)
- what Traefik is configured to listen on
- what routers are bound to (`web` vs `websecure`)
---
## 5) Pangolin-generated Traefik config (observed)
You tested:
```bash
docker run --rm --network pangolin curlimages/curl:8.6.0 \
curl -i http://pangolin:3001/api/v1/traefik-config | head -n 30
```
It returned JSON with routers/services such as:
- ``Host(`glance.frusetik.com`)``
- service backend like `http://100.89.128.4:40650`
- **sometimes including**:
  - `entryPoints: ["websecure"]`
  - `tls: { certResolver: "letsencrypt" }`
You also confirmed:
```bash
curl -s http://pangolin:3001/api/v1/traefik-config | grep -E 'websecure|certResolver|letsencrypt'
```
This matters because it can silently reintroduce TLS expectations even when you intended Traefik to be “internal HTTP only”.
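The `grep` check above can be wrapped in a small helper so that the result is a clear yes/no. This is a minimal sketch: `has_tls_expectations` is a hypothetical name (not part of Pangolin or Traefik), and it simply looks for the same three marker strings in whatever config is piped into it.

```bash
# has_tls_expectations: reads a Traefik dynamic config from stdin.
# Hypothetical helper; exits 0 if any TLS marker string appears, 1 otherwise.
has_tls_expectations() {
  grep -Eq 'websecure|certResolver|letsencrypt'
}

# Usage from a shell that can reach Pangolin (i.e. inside the pangolin network):
#   curl -s http://pangolin:3001/api/v1/traefik-config | has_tls_expectations \
#     && echo "Pangolin is emitting TLS routers" \
#     || echo "HTTP-only config"
```

A string match is deliberately crude; it answers "does anything TLS-related appear at all", which is the question that mattered during debugging.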
---
## 6) Cloudflare configuration model (what matters)
### 6.1 Two kinds of DNS records
- **Normal HTTP hostnames** (`pangolin.frusetik.com`, `glance.frusetik.com`)
  - Often proxied (orange cloud)
  - Terminate TLS at Cloudflare edge
  - Forward through Tunnel to your origin service
- **UDP endpoint for Gerbil**
  - `pangolin-udp.frusetik.com`
  - MUST be **DNS-only (grey cloud)** because Cloudflare cannot proxy UDP in the normal way.
  - Points directly to VPS public IP
This was encoded in your Pangolin config:
```yaml
gerbil:
  base_endpoint: "pangolin-udp.frusetik.com"
  # This MUST be DNS-only (grey cloud) pointing directly to VPS IP.
```
### 6.2 cloudflared ingress rules
Your cloudflared logs showed configs like:
```json
{
"ingress":[
{"hostname":"pangolin.frusetik.com","service":"http://gerbil:80"},
{"hostname":"glance.frusetik.com","service":"https://gerbil:443","originRequest":{"noTLSVerify":true}},
{"service":"http_status:404"}
]
}
```
**Critical requirement:** cloudflared must be able to resolve the origin hostnames (e.g. `gerbil`).
That only happens if cloudflared is on the same docker network as Gerbil (here: network `pangolin`) **or** you use a reachable IP.
---
## 7) The failure modes you hit (and what they meant)
### 7.1 `ERR_NAME_NOT_RESOLVED` / `curl: (6) Could not resolve host`
You saw:
- `curl -I https://glance.frusetik.com` failing to resolve
- browser: `ERR_NAME_NOT_RESOLVED`
Then you ran:
```bash
nslookup glance.frusetik.com
dig glance.frusetik.com +short
```
and got Cloudflare IPs (`188.114.x.x`). That means **public DNS was fine**.
**What actually happened:** you had **Tailscale enabled**, and it was interfering with name resolution on your client. Once Tailscale was disabled, the hostname resolved and the error changed from a DNS failure to a redirect loop.
### 7.2 `ERR_TOO_MANY_REDIRECTS`
Once name resolution was OK, you got redirect loops.
This usually happens when:
- Cloudflare is HTTPS at edge
- origin redirects HTTP -> HTTPS (or vice versa)
- origin expects TLS but you connect over HTTP
- or you forward `pangolin.frusetik.com` to a different scheme than `glance.frusetik.com` and the auth redirect chain bounces forever
You observed a 302 from `glance` to Pangolin auth:
```
location: https://pangolin.frusetik.com/auth/resource/<uuid>?redirect=https%3A%2F%2Fglance.frusetik.com%2F
```
That is expected for Pangolin-protected resources, **as long as Pangolin itself is reachable without mismatch**.
### 7.3 Cloudflare 502 Bad Gateway
Cloudflare 502 means: edge could not reach your origin successfully (or origin errored).
Your definitive smoking gun was in cloudflared logs:
```
dial tcp: lookup gerbil on 127.0.0.11:53: no such host
```
That means:
- cloudflared container DNS cannot resolve `gerbil`
- therefore cloudflared cannot connect to origin
- therefore Cloudflare returns 502
**Why this happened:** cloudflared was running in a different compose project / docker network, and **was not attached to the `pangolin` network**, so Docker DNS had no idea what `gerbil` is.
### 7.4 `TRAEFIK DEFAULT CERT` and 404 from Traefik on 443
You ran:
```bash
curl -vkI --connect-to pangolin.frusetik.com:443:gerbil:443 https://pangolin.frusetik.com/
```
and got:
- certificate: `TRAEFIK DEFAULT CERT` (self-signed)
- response: HTTP/2 404
Interpretation:
- you were successfully speaking TLS to Traefik on 443
- but Traefik had **no router matching Host(`pangolin.frusetik.com`) on that entrypoint**
- hence 404
This is the “entrypoint mismatch” problem:
- routers in `dynamic_config.yml` were on `web` (80)
- your tunnel/origin was using `https://...:443`
### 7.5 Newt systemd service failing with `status=217/USER`
In the LXC container, your unit logs showed:
```
Failed to determine user credentials: No such process
Failed at step USER spawning /usr/local/bin/newt
status=217/USER
```
This is deterministic: your systemd service file specified a `User=` that did not exist.
Fix is either:
- create that user, or
- remove `User=...` and run as root, or
- set `User=root`
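Before restarting the service, the `User=` value can be checked directly. The following is a minimal sketch; `check_unit_user` is a hypothetical helper, and the unit path in the usage line is an assumption (adjust it to wherever your Newt unit actually lives).

```bash
# check_unit_user UNIT_FILE
# Hypothetical helper: verifies that the User= named in a systemd unit exists.
check_unit_user() {
  unit_file=$1
  # Take the last User= assignment in the file, if any.
  user=$(sed -n 's/^User=//p' "$unit_file" | tail -n 1)
  if [ -z "$user" ]; then
    echo "no User= set (service runs as root)"
    return 0
  fi
  if id -u "$user" >/dev/null 2>&1; then
    echo "user '$user' exists"
  else
    echo "user '$user' is missing (expect status=217/USER)"
    return 1
  fi
}

# Usage (path is an assumption):
#   check_unit_user /etc/systemd/system/newt.service
```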
You also observed that the newt config path was:
- `~/.config/newt-client/config.json`
- and there was **no** `.newt` folder (which is fine; paths differ by versions/builds)
---
## 8) Deterministic debugging checklist (the exact style you used)
### 8.1 Check Pangolin health inside Docker network
```bash
docker run --rm --network pangolin curlimages/curl:8.6.0 \
curl -i http://pangolin:3001/api/v1/ | head -n 30
```
Expect:
- `200 OK`
- `{"message":"Healthy"}`
### 8.2 Check Pangolin-generated Traefik config
```bash
docker run --rm --network pangolin curlimages/curl:8.6.0 \
curl -s http://pangolin:3001/api/v1/traefik-config | head -n 5
```
Then search for a hostname:
```bash
docker run --rm --network pangolin curlimages/curl:8.6.0 \
sh -lc "curl -s http://pangolin:3001/api/v1/traefik-config | grep -n 'glance.frusetik.com' || true"
```
### 8.3 Verify internal routing at the Gerbil/Traefik layer
HTTP test (port 80):
```bash
docker run --rm --network pangolin curlimages/curl:8.6.0 \
curl -i -H "Host: glance.frusetik.com" http://gerbil:80 | head -n 30
```
HTTPS test (port 443):
```bash
docker run --rm --network pangolin curlimages/curl:8.6.0 \
curl -vkI --connect-to glance.frusetik.com:443:gerbil:443 https://glance.frusetik.com/
```
### 8.4 Verify cloudflared can resolve the origin (`gerbil`)
From cloudflared container:
```bash
docker exec -it <cloudflared_container> sh -lc 'getent hosts gerbil || nslookup gerbil || cat /etc/resolv.conf'
```
If that fails with “no such host”, cloudflared is **not on the pangolin network**.
### 8.5 Verify Docker networks
```bash
docker network ls | grep pangolin
docker network inspect pangolin | sed -n '1,120p'
docker ps --format 'table {{.Names}}\t{{.Networks}}' | grep -E 'pangolin|gerbil|traefik|cloudflared'
```
---
## 9) “Two years later” rules of thumb (the invariants)
### Invariant A — origin scheme and Traefik entrypoints must match
Pick one model:
**Model 1 (simplest): Cloudflare terminates TLS, origin is HTTP**
- cloudflared forwards to `http://gerbil:80`
- Traefik routers use `entryPoints: [web]`
- no internal HTTP->HTTPS redirects
- Traefik does not need ACME
**Model 2: Traefik terminates TLS, origin is HTTPS**
- cloudflared forwards to `https://gerbil:443`
- Traefik routers use `entryPoints: [websecure]`
- Traefik must have certs (ACME or provided)
- Cloudflare origin cert verification must be configured (or disabled)
Most of the pain today was caused by being halfway between these models.
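The pairing rule behind both models can be written down as a tiny lookup. This is only a sketch: `expected_entrypoint` is a hypothetical helper, and the entrypoint names `web` and `websecure` are the ones used in this document's Traefik configs.

```bash
# expected_entrypoint ORIGIN_URL
# Hypothetical helper: maps a cloudflared origin URL to the Traefik entrypoint
# that the matching routers must be bound to.
expected_entrypoint() {
  case $1 in
    http://*)  echo "web" ;;
    https://*) echo "websecure" ;;
    *)         echo "unknown"; return 1 ;;
  esac
}

# Usage:
#   expected_entrypoint http://gerbil:80     # Model 1
#   expected_entrypoint https://gerbil:443   # Model 2
```

If the entrypoint printed here does not appear on the router for that hostname, you are between the two models and should expect a 404 or a redirect loop.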
### Invariant B — cloudflared must reach origin *by DNS or IP*
If you configure origin as `http://gerbil:80`, cloudflared must be able to resolve `gerbil`.
That only works if cloudflared is attached to the same Docker network, or you route to a fixed IP.
### Invariant C — UDP endpoint is DNS-only
`pangolin-udp.frusetik.com` must remain grey-cloud and point directly at the VPS public IP.
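One way to verify this invariant is to compare what public DNS returns with the VPS address. A minimal sketch, assuming you know the VPS public IP; `is_dns_only` is a hypothetical helper and `203.0.113.10` is a placeholder documentation address, not the real VPS IP.

```bash
# is_dns_only RESOLVED_IPS VPS_IP
# Hypothetical helper: succeeds only when the sole resolved address is the VPS IP.
# A proxied (orange cloud) record resolves to Cloudflare addresses instead.
is_dns_only() {
  [ -n "$2" ] && [ "$1" = "$2" ]
}

# Usage:
#   is_dns_only "$(dig pangolin-udp.frusetik.com +short)" "203.0.113.10" \
#     && echo "grey cloud OK" \
#     || echo "record is proxied or points elsewhere"
```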
### Invariant D — Pangolin can generate TLS expectations
If Pangolin outputs routers with `websecure` + `certResolver`, Traefik will behave accordingly.
If you intend “internal HTTP only”, ensure Pangolin is configured to stop emitting TLS fields.
---
## 10) Practical recommendations going forward
1. **Choose Model 1 unless you have a strong reason otherwise.**
Cloudflare handles TLS. Origin stays HTTP. Fewer moving parts. Fewer redirect loops.
2. **Run cloudflared on the same compose / docker network as Pangolin/Gerbil**
or explicitly attach it:
```bash
docker network connect pangolin <cloudflared_container>
```
3. **Avoid mixing multiple overlay DNS systems on the client** (Tailscale + system DNS)
If something suddenly becomes `ERR_NAME_NOT_RESOLVED`, validate with:
```bash
dig glance.frusetik.com +short
```
4. **Systemd services in LXC:** ensure `User=` exists
`status=217/USER` is always “bad/missing user”.
---
## 11) Appendix: observed log snippets and what they meant
### Pangolin seeing Newt connect
You saw entries like:
- “Client added to tracking - NEWT ID: ...”
- “WebSocket connection established”
That confirmed Newt ↔ Pangolin connectivity.
### Pangolin warning about Docker socket
```
Newt <id> does not have Docker socket access
```
This is only relevant if Pangolin expects the remote site to expose Docker metadata via Newt.
It does not prevent basic HTTP service forwarding.
### cloudflared definitive failure message
```
dial tcp: lookup gerbil on 127.0.0.11:53: no such host
```
This means: cloudflared container cannot resolve Docker service name `gerbil`.
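The symptom-to-cause mapping from this session can be collected in one triage helper. This is a sketch, not an exhaustive diagnostic: `diagnose` is a hypothetical function, and it only recognises the exact symptoms recorded in these notes.

```bash
# diagnose SYMPTOM
# Hypothetical triage helper: maps a log line or browser error seen in this
# session to the cause that was eventually identified.
diagnose() {
  case $1 in
    *"lookup gerbil"*"no such host"*)
      echo "cloudflared is not attached to the pangolin Docker network" ;;
    *"TRAEFIK DEFAULT CERT"*)
      echo "no router matches the host on the websecure entrypoint" ;;
    *"status=217/USER"*)
      echo "the systemd unit names a User= that does not exist" ;;
    *ERR_TOO_MANY_REDIRECTS*)
      echo "origin scheme and Traefik entrypoints are mismatched" ;;
    *ERR_NAME_NOT_RESOLVED*)
      echo "client-side DNS problem; compare with dig (Tailscale was the cause here)" ;;
    *)
      echo "unknown symptom"; return 1 ;;
  esac
}

# Usage:
#   diagnose "dial tcp: lookup gerbil on 127.0.0.11:53: no such host"
```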
---
## 12) Minimal “golden config” target (recommended)
If you want the most deterministic future-proof setup:
- Cloudflare edge HTTPS
- cloudflared origin: `http://gerbil:80`
- Traefik only `web` entrypoint
- No Traefik ACME
- Badger plugin enabled
- cloudflared container attached to `pangolin` docker network
That configuration has the fewest interacting TLS layers.
---
# Zero-Trust HomeLab Infrastructure with Pangolin, Traefik and Cloudflare
## Introduction
This guide documents the architecture and deployment steps used to host a **zero-trust homelab** using [Pangolin](https://pangolin.net), [Traefik](https://traefik.io), [Cloudflare Tunnel](https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/), and [Nginx](https://nginx.org/). The objective is to create a secure environment where web applications (e.g. Glance, Immich, Jellyfin) are only reachable after authentication and without exposing ports directly to the internet. The guide was written in January 2026 after several iterations of troubleshooting and configuration. It explains the reasoning behind each step so that future you can understand why certain decisions were made.
The overall architecture has four core components:
1. **Pangolin control plane**: orchestrates users, sites and resources and pushes routing rules to Traefik; includes a REST API, WebSocket server and authentication system.
2. **Gerbil tunnel manager**: manages WireGuard tunnels between edge networks and the central server.
3. **Newt clients**: lightweight agents running on remote nodes (LXC/VMs) to create WireGuard tunnels and proxy services through Gerbil.
4. **Traefik reverse proxy with Badger plugin**: Traefik routes HTTP requests and enforces authentication using the Badger middleware plugin; Badger is specifically designed to work with Pangolin.
In addition, **Cloudflare Tunnel** provides public ingress into the private network without opening ports on your VPS. **Nginx** continues serving other public websites on ports 80 and 443.
The guide assumes you are comfortable with Docker, Docker Compose and Linux firewall management, and that you have a VPS running Ubuntu or similar.
## High-level Architecture
At a high level, the flow of a client request looks like this:
```mermaid
flowchart LR
User[Browser Client] -->|HTTPS| CloudflareEdge[Cloudflare Edge]
CloudflareEdge -->|QUIC or WebSocket via Tunnel| Cloudflared[cloudflared Connector]
Cloudflared -->|HTTP or HTTPS| Traefik[Traefik]
Traefik -->|Badger Plugin Auth Check| PangolinAPI[Pangolin API]
PangolinAPI -->|Auth Decision| Traefik
Traefik -->|Proxy| AppService[Internal Service via Gerbil Newt Tunnel]
subgraph VPS
Traefik -.->|Internal HTTP 80 and HTTPS 443| Nginx[Nginx]
Nginx -->|Public HTTP HTTPS| otherApps[Other Web Apps]
end
```
1. A user browses to `https://glance.frusetik.com`. DNS resolves the hostname to Cloudflare. The browser establishes an HTTPS connection to Cloudflare.
2. Cloudflare forwards the request through the tunnel that the **cloudflared** agent on your VPS maintains outbound to the Cloudflare edge (QUIC/WebSocket). The agent then forwards the request to Traefik via HTTP or HTTPS.
3. Traefik processes the request and invokes the **Badger middleware**, which enforces authentication by consulting the Pangolin API. Badger intercepts the request and checks the `p_session_token` cookie; if there is no valid session, it redirects to Pangolin's login page. Pangolin verifies the user and returns a decision.
4. After successful authentication, Traefik proxies the request to the backend service. For services hosted behind a Newt client, the connection flows over a WireGuard tunnel managed by Gerbil.
### Component roles
- **Pangolin Control Plane**: Maintains configuration for sites, users, organizations, domains and resources; exposes a web UI, REST API and WebSocket server. It orchestrates Newt clients and pushes routing rules to Traefik via its HTTP provider.
- **Gerbil Tunnel Manager**: Maintains peer keys and orchestrates WireGuard tunnels. It exposes UDP ports 51820 and 21820 on the VPS for tunnel connectivity.
- **Newt Edge Client**: Runs on each remote node (e.g. in your Proxmox LXC). It connects to the Pangolin API over WebSocket and to Gerbil via WireGuard to expose internal services.
- **Traefik Reverse Proxy**: Serves as the ingress router. It reads its dynamic configuration from Pangolin via the HTTP provider and enforces authentication via the Badger plugin. It listens on internal ports 80 (HTTP) and 443 (HTTPS) but does not publish those to the host; the Cloudflare tunnel connects directly to it.
- **Badger Middleware**: A Traefik plugin that acts as an authentication bouncer. It is automatically installed with Pangolin and ensures that only authenticated requests are allowed through.
- **Nginx**: Continues to listen on ports 80/443 on the VPS for your other public services. It proxies requests unrelated to Pangolin and is configured separately.
- **Cloudflare Tunnel**: Provides a secure tunnel between Cloudflare's edge network and your VPS. It runs as a Docker container (`cloudflared`). Multiple hostnames can point through the same tunnel. Cloudflare resolves your subdomains and forwards traffic to Traefik via the tunnel. No public ports are opened on the VPS for Pangolin or its resources.
## Directory Structure
All Pangolin-related files reside in `/home/claudio/infra/pangolin`. The structure at the end of this configuration looks like this:
```
/home/claudio/infra/pangolin
├── docker-compose.yml # Compose definition for pangolin, gerbil, traefik
├── config/
│   ├── config.yml # Pangolin configuration (domains, secret, etc.)
│   ├── traefik/
│   │   ├── traefik_config.yml # Static Traefik config (providers, entrypoints, ACME)
│   │   └── dynamic_config.yml # Optional custom routes (we keep minimal)
│   └── letsencrypt/ # Stores acme.json for Traefik certificates
└── data/ # Optional persistent data for pangolin (database)
```
## Step-by-Step Installation
### 1. Prepare the VPS
1. **Install Docker and Docker Compose** on the VPS (Ubuntu). Ensure you are using Compose v2 (integrated in `docker compose`).
2. **Firewall configuration** (UFW):
- Open UDP 51820 and 21820 for WireGuard tunnels.
- Keep ports 80 and 443 open for Nginx (public websites). If you prefer a completely closed web port and rely solely on Cloudflare tunnel, you can remove these rules later.
- Ensure SSH (TCP 22) remains open.
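The firewall policy above can be expressed as `ufw` commands. The sketch below only prints them so they can be reviewed before anything is applied; `print_ufw_rules` is a hypothetical helper, and the port list assumes you keep 80/443 open for Nginx and SSH on the default port 22.

```bash
# print_ufw_rules: emit the ufw commands for the firewall policy above.
# Hypothetical helper; it only prints, so the rules can be reviewed first.
print_ufw_rules() {
  for rule in 51820/udp 21820/udp 80/tcp 443/tcp 22/tcp; do
    echo "ufw allow $rule"
  done
}

# Review the output, then apply it with:
#   print_ufw_rules | sudo sh
```

If you later rely solely on the Cloudflare tunnel for web traffic, drop `80/tcp` and `443/tcp` from the list.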
### 2. Configure DNS and Cloudflare
1. **DNS records**: Create a wildcard DNS record for your domain pointing to your Cloudflare tunnel (or CNAME to the tunnel). For example:
   - `*.frusetik.com` → proxied (orange cloud) to your Cloudflare tunnel.
   - `pangolin-udp.frusetik.com` → DNS-only (grey cloud) A record pointing to your VPS's public IP; Gerbil uses this for WireGuard clients.
2. **Cloudflare Tunnel**: In the Cloudflare Zero Trust dashboard, create or reuse a tunnel. Assign the connector a token and run cloudflared in Docker on the VPS:
```bash
docker run -d \
--name cloudflared \
--restart unless-stopped \
-e TUNNEL_TOKEN=<your-token> \
cloudflare/cloudflared:latest tunnel run
```
Or define `cloudflared` as a service in your existing `docker-compose.yml` with the token environment variable.
3. **Public hostnames**: For each application (e.g. `pangolin.frusetik.com`, `glance.frusetik.com`), create a **public hostname** entry under your tunnel. Initially set the service to `http://gerbil:80`; we later switched to `https://gerbil:443` when using internal TLS. For each hostname with an HTTPS origin, set **No TLS Verify** to `true` until certificate issues are resolved.
### 3. Create the Compose file
Create `docker-compose.yml` in `/home/claudio/infra/pangolin` with the following services:
#### 3.1 Pangolin Service
```yaml
services:
  pangolin:
    image: fosrl/pangolin:latest
    container_name: pangolin
    restart: unless-stopped
    volumes:
      - ./config:/app/config
      - ./data:/app/data # Persist database
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3001/api/v1/"]
      interval: "3s"
      timeout: "3s"
      retries: 15
```
This exposes Pangolin's services on ports 3000 (REST API), 3001 (internal API for Traefik config) and 3002 (Next.js UI) **inside the container network** only. None of these ports are published on the host.
#### 3.2 Gerbil Service
```yaml
  gerbil:
    image: fosrl/gerbil:latest
    container_name: gerbil
    restart: unless-stopped
    depends_on:
      pangolin:
        condition: service_healthy
    command:
      - --reachableAt=http://gerbil:3004
      - --generateAndSaveKeyTo=/var/config/key
      - --remoteConfig=http://pangolin:3001/api/v1/
    volumes:
      - ./config:/var/config
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    ports:
      - 51820:51820/udp
      - 21820:21820/udp
```
Gerbil publishes UDP 51820 and 21820 on the host. It does **not** publish 80/443; those ports are served by Traefik inside Gerbil's network namespace and are reachable only from within the Docker network.
#### 3.3 Traefik Service
We had two alternative configurations during troubleshooting:
1. **HTTP-only internal** (simpler but requires disabling Pangolin's redirect):
```yaml
  traefik:
    image: traefik:v3.4.0
    container_name: traefik
    restart: unless-stopped
    network_mode: service:gerbil # Share Gerbil's network namespace (see the note below)
    depends_on:
      pangolin:
        condition: service_healthy
    command:
      - --configFile=/etc/traefik/traefik_config.yml
    volumes:
      - ./config/traefik:/etc/traefik:ro
      - ./config/traefik/logs:/var/log/traefik
networks:
  default:
    name: pangolin
```
With this configuration, only the `web` (port 80) entrypoint is defined in `traefik_config.yml`, and certificatesResolvers are **disabled** to avoid Let's Encrypt redirect loops. cloudflared connects to `http://gerbil:80`. Traefik still enforces authentication via Badger but never redirects to HTTPS. This is ideal when you terminate TLS at Cloudflare and want to avoid ACME complications.
2. **HTTPS internal with DNS-01** (more secure but complex):
We later enabled the `websecure` entrypoint and configured an ACME DNS-01 resolver using Cloudflare. This required adding a `certificatesResolvers` block in the static config and passing `CF_DNS_API_TOKEN` to the container. The Cloudflare tunnel hostnames then point to `https://gerbil:443`. Traefik issues certificates for your `*.frusetik.com` hostnames and serves them internally. Without a valid trust store inside cloudflared, we needed to set `noTLSVerify` to avoid TLS validation errors. In practice the HTTP-only option proved simpler.
**Note:** `network_mode: service:gerbil` is used in the official install, causing Traefik to share Gerbil's network namespace. We retained it but discovered that connecting cloudflared to the **pangolin** network is essential so that it can resolve the `gerbil` hostname. After adding the cloudflared container to the `pangolin` network (`docker network connect pangolin cloudflared`), the DNS lookup succeeded.
### 4. Configure Traefik
Create `config/traefik/traefik_config.yml`. The following example shows the **HTTP-only** static config with ACME disabled. Comments explain why certain features are commented out.
```yaml
api:
  insecure: true
  dashboard: true
providers:
  http:
    endpoint: "http://pangolin:3001/api/v1/traefik-config"
    pollInterval: "5s"
    # Traefik polls Pangolin for dynamic config (routers, services, middlewares).
  file:
    filename: "/etc/traefik/dynamic_config.yml"
    # Optional static overrides (kept minimal).
experimental:
  plugins:
    badger:
      moduleName: "github.com/fosrl/badger"
      version: "v1.3.1" # Pin the version you tested
log:
  level: "INFO"
  format: "common"
# ACME is disabled because the origin ports 80/443 are not publicly reachable.
# Uncomment the certificatesResolvers section and configure DNS-01 if you want
# Traefik to issue certificates and serve HTTPS internally.
entryPoints:
  web:
    address: ":80"
    # Traefik listens on port 80 inside the Docker network (not published on host).
  # websecure is commented out for HTTP-only deployment. See the alternative
  # configuration above if enabling internal HTTPS.
  # websecure:
  #   address: ":443"
  #   transport:
  #     respondingTimeouts:
  #       readTimeout: "30m"
  #   http:
  #     tls:
  #       certResolver: "letsencrypt"
serversTransport:
  insecureSkipVerify: true
  # Accept self-signed certificates when Traefik proxies HTTPS backends (e.g., newt client services).
ping:
  entryPoint: "web"
  # Health endpoint for Traefik (internal only).
```
`config/traefik/dynamic_config.yml` can be almost empty except for the Badger middleware, because Pangolin dynamically generates routers and services for each resource:
```yaml
http:
  middlewares:
    badger:
      plugin:
        badger:
          disableForwardAuth: true # Traefik uses Pangolin to authenticate
```
### 5. Configure Pangolin
The file `config/config.yml` defines the Pangolin control plane. A minimal configuration looks like this:
```yaml
app:
  dashboard_url: "https://pangolin.frusetik.com"
domains:
  pangolin_ui:
    base_domain: "pangolin.frusetik.com"
  root:
    base_domain: "frusetik.com"
server:
  secret: "<random-secret>"
# Gerbil expects clients to connect to the VPS for WireGuard tunnels. Use a
# DNS-only (non-proxied) record that points to your VPS public IP.
gerbil:
  base_endpoint: "pangolin-udp.frusetik.com"
flags:
  require_email_verification: false
  disable_signup_without_invite: true
  disable_user_create_org: true
```
- `dashboard_url` should point to the hostname where you access the Pangolin UI.
- `domains` defines base domains that Pangolin can manage. We added `frusetik.com` so that Pangolin can create public resources directly under `*.frusetik.com`. Without this, Pangolin would prefix hostnames with `pangolin.`.
- `server.secret` must be a long random string; generate it with `openssl rand -base64 48`.
- `gerbil.base_endpoint` must resolve to your VPS (not proxied by Cloudflare) to allow UDP WireGuard connections.
- The flags disable self-registration and email verification for a closed system.
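Two of these settings can be checked from a shell before starting the stack. The following is a minimal sketch, assuming `openssl` and `dig` are installed; `gen_secret` and `resolves_to` are hypothetical helper names, and `203.0.113.10` is a placeholder for your VPS public IP.

```bash
# Print a fresh value for server.secret (48 random bytes, base64-encoded).
gen_secret() {
  openssl rand -base64 48 | tr -d '\n'
}

# Succeed if hostname "$1" has an A record equal to IP "$2".
# A Cloudflare-proxied record returns Cloudflare addresses instead of the VPS IP.
resolves_to() {
  dig +short "$1" A | grep -qxF -- "$2"
}

# Example (203.0.113.10 is a placeholder for the VPS public IP):
#   gen_secret
#   resolves_to pangolin-udp.frusetik.com 203.0.113.10 && echo "DNS-only record OK"
```

If `resolves_to` fails for the `gerbil.base_endpoint` hostname, the record is probably still proxied, and WireGuard clients will not be able to reach the VPS over UDP.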
### 6. Start the stack
Run the stack from the Pangolin directory:
```bash
cd /home/claudio/infra/pangolin
sudo docker compose pull
sudo docker compose up -d
```
Check logs with `docker compose logs -f pangolin` and `docker compose logs -f traefik` to confirm that the services start without errors. Gerbil will complain about the missing Docker socket; this warning is safe to ignore when running outside of Swarm or when you are not auto-publishing containers.
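Instead of watching the logs by hand, the startup check can be scripted. This is a rough sketch with two hypothetical helpers, `retry` and `is_running`; the container names in the example are assumptions, so compare them with the output of `docker compose ps` first.

```bash
# Run a command up to "$1" times, sleeping "$2" seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Succeed if the container named "$1" is in the running state.
is_running() {
  [ "$(docker inspect --format '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

# Example (container names are assumptions):
#   for c in pangolin gerbil traefik; do
#     retry 10 3 is_running "$c" || echo "$c did not start"
#   done
```

A running container is not necessarily a healthy one, so the logs remain the authoritative check when something misbehaves.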
### 7. Connect Newt client
For each remote node (e.g. LXC container) that you want to expose through Pangolin:
1. Download the `newt` client from the Pangolin UI (Sites → Create site → Add a new site). It provides a command line including a token.
2. Copy the binary to the target node and run the command once to initialize the connection. The client writes state to `~/.config/newt-client/config.json`.
3. To run `newt` persistently, create a `systemd` unit or background script. Example systemd service:
```ini
[Unit]
Description=Pangolin Newt Client
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
ExecStart=/usr/local/bin/newt
Restart=always
RestartSec=3
[Install]
WantedBy=multi-user.target
```
If your LXC does not use systemd, run `nohup newt &` and disown it.
4. Verify in the Pangolin UI that your Site is “Connected”.
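The systemd unit from step 3 can be installed with a small script. This sketch writes the same unit through a hypothetical `write_newt_unit` function; the file name `newt.service` is an assumption, and the `systemctl` commands in the example need root on the target node.

```bash
# Write the Newt unit file into directory "$1" (normally /etc/systemd/system).
write_newt_unit() {
  cat > "$1/newt.service" <<'EOF'
[Unit]
Description=Pangolin Newt Client
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
ExecStart=/usr/local/bin/newt
Restart=always
RestartSec=3
[Install]
WantedBy=multi-user.target
EOF
}

# Example (run as root on the target node):
#   write_newt_unit /etc/systemd/system
#   systemctl daemon-reload
#   systemctl enable --now newt.service
#   systemctl status newt.service
```

Passing the directory as an argument makes it possible to write the file to a scratch location first and review it before installing.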
### 8. Create Resources
Once sites are connected, you can publish services. In the Pangolin dashboard, navigate to **Resources** and choose **Public Resource**. Fill in:
- **Hostname**: `glance.frusetik.com` (or `immich.frusetik.com`, etc.).
- **Site**: the site where your service runs.
- **Target**: the internal URL of the service as seen from the Newt client (e.g. `http://glance:8080`). Test connectivity from the newt host using `curl` before publishing.
Pangolin pushes a new router and service into Traefik via its HTTP provider. Traefik routes based on the Host header and proxies to the Newt client via Gerbil.
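The connectivity test from the Newt host can be wrapped in a small function. This is a minimal sketch assuming `curl` is available on that host; `probe` is a hypothetical helper name, and the URL in the example is the target from the list above.

```bash
# Print the HTTP status code for URL "$1" and succeed on any 2xx or 3xx answer.
# curl reports 000 when it cannot connect at all.
probe() {
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")"
  printf '%s -> %s\n' "$1" "$code"
  case "$code" in
    2??|3??) return 0 ;;
    *) return 1 ;;
  esac
}

# Example, run on the Newt host:
#   probe http://glance:8080
```

A result of `000` usually points at a wrong hostname or port, while a 4xx or 5xx code means the service answered but rejected or failed the request.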
### 9. Troubleshooting Tips
- **Cloudflared cannot resolve `gerbil`:** If Cloudflared logs show `dial tcp: lookup gerbil on 127.0.0.11:53: no such host`, the cloudflared container isn't attached to the `pangolin` network. Run `docker network connect pangolin <cloudflared-container>` and restart the container. After connecting, `nslookup gerbil` inside the cloudflared container should resolve to the Gerbil container IP.
- **Infinite redirect (ERR_TOO_MANY_REDIRECTS):** This occurs when Traefik redirects HTTP to HTTPS internally and Cloudflare is connecting via HTTP, causing a loop. To fix, either (a) disable the redirect and run HTTP-only; or (b) enable internal HTTPS (websecure) and have cloudflared connect via `https://gerbil:443`.
- **502 Host error:** Cloudflare returns a 502 when the tunnel cannot reach your origin. Check `docker logs cloudflared` for DNS lookup errors or connection refused. Most often this means the service is misconfigured or cloudflared is not on the right network.
- **Certificate issues:** When using internal HTTPS, Traefik's default certificate is self-signed. Cloudflared will reject it unless you set `noTLSVerify` or supply a CA bundle via `originRequest.caPool`. A more robust solution is to issue real certificates via ACME DNS-01 and mount them in Traefik.
- **Nginx conflict:** Traefik and Nginx can coexist as long as Traefik's ports are not published on the host. Nginx still serves your other web apps on 80/443.
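Several of the symptoms above can be spotted by scanning the cloudflared logs. The sketch below uses a hypothetical `diagnose` function that reads log lines on stdin; the `no such host` and `connection refused` patterns come from the tips above, while matching `x509` for certificate errors is an assumption about how cloudflared words them.

```bash
# Read log lines on stdin and print a hint for each known symptom.
diagnose() {
  while IFS= read -r line; do
    case "$line" in
      *"no such host"*)
        echo "DNS: attach cloudflared to the pangolin network" ;;
      *"connection refused"*)
        echo "Origin: target service is down or listening on another port" ;;
      *"x509"*)
        echo "TLS: origin certificate not trusted; set noTLSVerify or caPool" ;;
    esac
  done
}

# Example (container name is an assumption):
#   docker logs cloudflared 2>&1 | diagnose | sort -u
```

The hints are only a starting point; the full log line still carries the hostname and port that failed.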
## Conclusion
This readme captures the final architecture and troubleshooting process used to deploy Pangolin on a VPS with Cloudflare Tunnel. The key lessons learned are:
- Attach the Cloudflare tunnel container to the same network as Traefik/Gerbil so that it can resolve service names. Use `docker network connect` to add existing containers to a network.
- Decide early whether you want internal HTTPS. Running HTTP-only behind Cloudflare simplifies the configuration and avoids Let's Encrypt complications, but you must disable Traefik's redirects. Running HTTPS internally requires DNS-01 ACME and possibly trust store adjustments for cloudflared.
- Keep Nginx separate for non-Pangolin services. Use DNS and Cloudflare to route only specific hostnames into Pangolin.
- Use Pangolin's `Domains` configuration to manage which base domains it can serve. Without adding `frusetik.com`, Pangolin will prepend its own subdomain.
- Persist Pangolin's data (`/app/data`) to avoid losing resources and state when restarting containers.
By following these steps you can host internal services securely without exposing ports, while granting friends and family access through a single login. In two years, when you revisit this setup, the diagrams and commentary here should refresh your understanding of each component's role and the reasoning behind the configuration choices.