Pangolin + Gerbil + Traefik + Newt + Cloudflare Tunnel (cloudflared) — Deterministic Setup Notes

This document is a README-style, long-term memory of what was built and debugged in this chat session (Jan 14, 2026).
Goal: In two years, you should be able to reread this and immediately understand:

  • what the architecture is
  • how traffic flows end-to-end
  • why certain configs exist
  • what failure modes looked like and how they were diagnosed/fixed

0) What problem this stack solves

You want to expose private services (e.g. glance, n8n, etc.) via friendly hostnames on your domain, while:

  • avoiding inbound port-forwarding (ideally)
  • keeping access controlled (Pangolin policy/auth)
  • keeping the architecture understandable and debuggable
  • using a remote connector (Newt) from environments like LXC/VM/homelab networks

In practice you ended up with:

  • Pangolin as control plane (UI + API + policy + config generator)
  • Gerbil as data plane / “edge node” (WireGuard/UDP side + reachability endpoint)
  • Traefik as reverse proxy in front of Gerbil (routing + Badger plugin)
  • Newt as remote agent that connects a site back to Pangolin (WebSocket)
  • cloudflared as Cloudflare Tunnel client (public HTTPS entry -> private origin)
  • Cloudflare DNS for hostnames, and a special DNS-only UDP endpoint for Gerbil

1) Architecture overview

1.1 High-level components

  • Cloudflare (edge)

    • DNS for *.frusetik.com
    • HTTP(S) reverse proxy to your tunnel (when proxied)
    • Cloudflare Tunnel “ingress rules” define where each hostname goes
  • cloudflared (tunnel client)

    • Runs as a container (in your case it lived in another compose project, e.g. n8n-compose)
    • Maintains outbound tunnel connections to Cloudflare edge
    • For each hostname, forwards requests to an origin service (e.g. http://gerbil:80 or https://gerbil:443)
  • Pangolin (control plane container)

    • API + internal server + Next.js UI
    • Generates a Traefik dynamic config at:
      • http://pangolin:3001/api/v1/traefik-config
    • Tracks Newt connections and publishes routes to Traefik
  • Traefik (reverse proxy container)

    • Pulls dynamic routes from Pangolin via HTTP provider
    • Also loads a local file provider config (/etc/traefik/dynamic_config.yml)
    • Runs with the Badger plugin (used by Pangolin for auth/policy enforcement)
    • In your deployment: Traefik runs inside the same network namespace as Gerbil
      • network_mode: service:gerbil
  • Gerbil (data plane container)

    • Has UDP ports published to the host:
      • 51820/udp, 21820/udp (WireGuard-ish / overlay traffic)
    • Has a reachable endpoint (--reachableAt=http://gerbil:3004)
    • Pulls remote config from Pangolin: --remoteConfig=http://pangolin:3001/api/v1/
    • Hosts the Traefik ports implicitly because Traefik shares its namespace
  • Newt (remote agent, runs outside docker in your case)

    • A client connecting outbound to Pangolin
    • Lets Pangolin route requests to services in a remote site
    • On the remote site, your test indicated the service name glance existed and was reachable internally (200 OK)

2) Mermaid diagrams

2.1 Request path (public hostname -> private service)

sequenceDiagram
  autonumber
  participant U as User Browser
  participant CF as Cloudflare Edge
  participant T as cloudflared (Tunnel Client)
  participant G as Gerbil Namespace
  participant TR as Traefik
  participant P as Pangolin API/UI
  participant N as Newt (remote)
  participant S as Service (e.g. Glance)

  U->>CF: HTTPS GET https://glance.frusetik.com
  CF->>T: Forward via Tunnel (ingress rule)
  T->>G: Origin request (http://gerbil:80 OR https://gerbil:443)
  G->>TR: Request enters Traefik router
  TR->>P: (Badger plugin / policy decisions)
  TR->>N: Route to remote site address (generated by Pangolin)
  N->>S: Forward to local service (e.g. http://glance:8080)
  S-->>U: Response (via reverse path)

2.2 Docker network / namespace relationship

graph TD
  subgraph DockerNetwork["Docker network: pangolin (bridge)"]
    P[pangolin container<br/>ports: none published]
    G[gerbil container<br/>UDP published: 51820/udp, 21820/udp]
    TR[traefik container<br/>network_mode: service:gerbil]
    P --- G
    P --- TR
  end

  subgraph OtherCompose["Other compose project (e.g. n8n-compose)"]
    CFd[cloudflared container]
  end

  CFd -. must join .-> DockerNetwork
  TR -->|HTTP provider| P
  G -->|remoteConfig| P

2.3 Control plane (config generation)

flowchart LR
  P[Pangolin] -->|/api/v1/traefik-config| TR[Traefik HTTP Provider]
  TR -->|dynamic routers/services| TR
  P -->|tracks Newt connections| P
  N[Newt Client] -->|WebSocket| P

3) Deterministic deployment (what runs where)

3.1 VPS docker-compose (Pangolin/Gerbil/Traefik)

You had a compose like this (simplified to highlight what mattered):

  • network name: pangolin
  • Traefik shares Gerbil network namespace: network_mode: service:gerbil
services:
  pangolin:
    image: fosrl/pangolin:latest
    container_name: pangolin
    restart: unless-stopped
    volumes:
      - ./config:/app/config
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3001/api/v1/"]
      interval: "3s"
      timeout: "3s"
      retries: 15

  gerbil:
    image: fosrl/gerbil:latest
    container_name: gerbil
    restart: unless-stopped
    depends_on:
      pangolin:
        condition: service_healthy
    command:
      - --reachableAt=http://gerbil:3004
      - --generateAndSaveKeyTo=/var/config/key
      - --remoteConfig=http://pangolin:3001/api/v1/
    volumes:
      - ./config/:/var/config
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    ports:
      - 51820:51820/udp
      - 21820:21820/udp

  traefik:
    image: traefik:v3.4.0
    container_name: traefik
    restart: unless-stopped
    network_mode: service:gerbil
    depends_on:
      pangolin:
        condition: service_healthy
    command:
      - --configFile=/etc/traefik/traefik_config.yml
    volumes:
      - ./config/traefik:/etc/traefik:ro
      - ./config/letsencrypt:/letsencrypt
      - ./config/traefik/logs:/var/log/traefik

networks:
  default:
    driver: bridge
    name: pangolin

Key implication: Traefik and Gerbil share the same IP and exposed ports (because of network_mode: service:gerbil).


4) Traefik configuration (as actually used)

4.1 Static Traefik config (traefik/traefik_config.yml)

This is the “bootstrap” config. It defines:

  • how Traefik loads config
  • entrypoints (ports)
  • plugin definitions
  • logging
  • ACME (if enabled) — note that later you saw ACME provider running again, which indicates the config changed or a different file was being used.

Your baseline file (annotated):

api:
  insecure: true
  dashboard: true
  # Dashboard is OK only if Traefik is NOT public.

providers:
  http:
    endpoint: "http://pangolin:3001/api/v1/traefik-config"
    pollInterval: "5s"
    # Pangolin generates dynamic routers/services here.
  file:
    filename: "/etc/traefik/dynamic_config.yml"
    # Local static routers/services/middlewares if needed.

experimental:
  plugins:
    badger:
      moduleName: "github.com/fosrl/badger"
      version: "v1.3.0"

log:
  level: "INFO"
  format: "common"

entryPoints:
  web:
    address: ":80"

  # websecure:
  #   address: ":443"
  # Only enable if Traefik terminates TLS itself.

serversTransport:
  insecureSkipVerify: true

ping:
  entryPoint: "web"

4.2 Dynamic file provider (traefik/dynamic_config.yml)

This was your “manual fallback” for Pangolin UI routing and Badger middleware.

Important lesson from the debugging:
If Cloudflare Tunnel forwards to https://gerbil:443, but your routers only listen on the web entrypoint, you will get 404 from Traefik (no matching router on that entrypoint). This happened for pangolin.frusetik.com.

So the entrypoint choice must be consistent across:

  • what cloudflared connects to (HTTP:80 or HTTPS:443)
  • what Traefik is configured to listen on
  • what routers are bound to (web vs websecure)
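The consistency rule above can be sketched as a small shell check. This is a minimal sketch, assuming the entrypoint names web and websecure from the static config and that origins use the default port for their scheme; the function names are made up for illustration and are not part of Traefik or cloudflared.

```shell
# Map a cloudflared origin URL to the Traefik entrypoint its router must use.
# Assumption: http origins land on "web" (:80), https origins on "websecure" (:443).
entrypoint_for_origin() {
  case "$1" in
    http://*)  echo "web" ;;
    https://*) echo "websecure" ;;
    *)         echo "unknown"; return 1 ;;
  esac
}

# Succeed only when the origin scheme and the router entrypoint agree.
check_origin_matches_router() {
  origin="$1"
  router_entrypoint="$2"
  expected="$(entrypoint_for_origin "$origin")" || return 2
  [ "$expected" = "$router_entrypoint" ]
}
```

For example, check_origin_matches_router https://gerbil:443 web fails, which is the mismatch that produced the 404 for pangolin.frusetik.com.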

5) Pangolin-generated Traefik config (observed)

You tested:

docker run --rm --network pangolin curlimages/curl:8.6.0 \
  curl -i http://pangolin:3001/api/v1/traefik-config | head -n 30

It returned JSON with routers/services such as:

  • Host(`glance.frusetik.com`)
  • service backend like http://100.89.128.4:40650
  • sometimes including:
    • entryPoints: ["websecure"]
    • tls: { certResolver: "letsencrypt" }

You also confirmed:

curl -s http://pangolin:3001/api/v1/traefik-config | grep -E 'websecure|certResolver|letsencrypt'

This matters because it can silently reintroduce TLS expectations even when you intended Traefik to be “internal HTTP only”.
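To make that check repeatable, the grep can be wrapped in a tiny filter. This is a minimal sketch, assuming the generated config is piped in as text; the function name tls_expectations and its two output labels are invented for illustration.

```shell
# Read Pangolin's generated Traefik config on stdin and report whether any
# router asks for the websecure entrypoint or a certificate resolver.
tls_expectations() {
  if grep -Eq 'websecure|certResolver'; then
    echo "tls-expected"
  else
    echo "http-only"
  fi
}
```

It can be fed from the same throwaway curl container used above, for example by piping the output of curl -s http://pangolin:3001/api/v1/traefik-config into tls_expectations.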


6) Cloudflare configuration model (what matters)

6.1 Two kinds of DNS records

  • Normal HTTP hostnames (pangolin.frusetik.com, glance.frusetik.com)

    • Often proxied (orange cloud)
    • Terminate TLS at Cloudflare edge
    • Forward through Tunnel to your origin service
  • UDP endpoint for Gerbil

    • pangolin-udp.frusetik.com
    • MUST be DNS-only (grey cloud) because Cloudflare cannot proxy UDP in the normal way.
    • Points directly to VPS public IP

This was encoded in your Pangolin config:

gerbil:
  base_endpoint: "pangolin-udp.frusetik.com"
  # This MUST be DNS-only (grey cloud) pointing directly to VPS IP.

6.2 cloudflared ingress rules

Your cloudflared logs showed configs like:

{
  "ingress":[
    {"hostname":"pangolin.frusetik.com","service":"http://gerbil:80"},
    {"hostname":"glance.frusetik.com","service":"https://gerbil:443","originRequest":{"noTLSVerify":true}},
    {"service":"http_status:404"}
  ]
}

Critical requirement: cloudflared must be able to resolve the origin hostnames (e.g. gerbil).
That only happens if cloudflared is on the same docker network as Gerbil (here: network pangolin) or you use a reachable IP.


7) The failure modes you hit (and what they meant)

7.1 ERR_NAME_NOT_RESOLVED / curl: (6) Could not resolve host

You saw:

  • curl -I https://glance.frusetik.com failing to resolve
  • browser: ERR_NAME_NOT_RESOLVED

Then you ran:

nslookup glance.frusetik.com
dig glance.frusetik.com +short

and got Cloudflare IPs (188.114.x.x). That means public DNS was fine.

What actually happened: Tailscale was enabled on the client and was interfering with name resolution there. Once Tailscale was disabled, the hostname resolved and the error changed from ERR_NAME_NOT_RESOLVED to a redirect loop.

7.2 ERR_TOO_MANY_REDIRECTS

Once name resolution was OK, you got redirect loops.

This usually happens when:

  • Cloudflare is HTTPS at edge
  • origin redirects HTTP -> HTTPS (or vice versa)
  • origin expects TLS but you connect over HTTP
  • or you forward pangolin.frusetik.com to a different scheme than glance.frusetik.com and the auth redirect chain bounces forever

You observed a 302 from glance to Pangolin auth:

location: https://pangolin.frusetik.com/auth/resource/<uuid>?redirect=https%3A%2F%2Fglance.frusetik.com%2F

That is expected for Pangolin-protected resources, as long as Pangolin itself is reachable without mismatch.
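To tell the expected auth redirect apart from other redirects, the response headers can be classified. This is a minimal sketch, assuming headers in the form printed by curl -sI and the /auth/resource/ path seen in the location header above; the function name and the output labels are made up for illustration.

```shell
# Read HTTP response headers on stdin (as printed by `curl -sI`) and classify
# the first response: Pangolin auth redirect, some other redirect, or a status.
classify_pangolin_response() {
  awk '
    { sub(/\r$/, "") }                          # headers arrive with CRLF endings
    NR == 1 { status = $2 }                     # e.g. "HTTP/2 302"
    tolower($1) == "location:" { location = $2 }
    END {
      if (status == "302" && location ~ /\/auth\/resource\//) print "auth-redirect"
      else if (status ~ /^3/) print "other-redirect"
      else print "status-" status
    }'
}
```

A single auth-redirect from glance.frusetik.com is normal; a loop only exists if the Pangolin login URL itself keeps answering with further redirects.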

7.3 Cloudflare 502 Bad Gateway

Cloudflare 502 means: edge could not reach your origin successfully (or origin errored).

Your definitive smoking gun was in cloudflared logs:

dial tcp: lookup gerbil on 127.0.0.11:53: no such host

That means:

  • cloudflared container DNS cannot resolve gerbil
  • therefore cloudflared cannot connect to origin
  • therefore Cloudflare returns 502

Why this happened: cloudflared was running in a different compose project / docker network, and was not attached to the pangolin network, so Docker DNS had no idea what gerbil is.
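The log triage can be sketched as a small filter. This is a minimal sketch: the "no such host" text comes from the log line above, while matching on "connection refused" is an assumption about how a closed origin port would show up; the function name and labels are invented for illustration.

```shell
# Read cloudflared log text on stdin and name the most likely origin problem.
diagnose_origin_error() {
  log="$(cat)"
  case "$log" in
    *"no such host"*)       echo "dns" ;;      # origin name does not resolve
    *"connection refused"*) echo "refused" ;;  # name resolves, nothing listens
    *)                      echo "unknown" ;;
  esac
}
```

Typical use would be piping docker logs <cloudflared_container> 2>&1 into diagnose_origin_error; a result of dns points at the missing network attachment described here.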

7.4 TRAEFIK DEFAULT CERT and 404 from Traefik on 443

You ran:

curl -vkI --connect-to pangolin.frusetik.com:443:gerbil:443 https://pangolin.frusetik.com/

and got:

  • certificate: TRAEFIK DEFAULT CERT (self-signed)
  • response: HTTP/2 404

Interpretation:

  • you were successfully speaking TLS to Traefik on 443
  • but Traefik had no router matching Host(`pangolin.frusetik.com`) on that entrypoint
  • hence 404

This is the “entrypoint mismatch” problem:

  • routers in dynamic_config.yml were on web (80)
  • your tunnel/origin was using https://...:443

7.5 Newt systemd service failing with status=217/USER

In the LXC container, your unit logs showed:

Failed to determine user credentials: No such process
Failed at step USER spawning /usr/local/bin/newt
status=217/USER

This is deterministic: your systemd service file specified a User= that did not exist.
Fix is either:

  • create that user, or
  • remove User=... and run as root, or
  • set User=root
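Before restarting the service, the unit file can be checked for this exact problem. This is a minimal sketch, assuming a plain unit file with at most one effective User= line; the function names are made up for illustration, and the unit path is whatever you installed (it is not recorded in these notes).

```shell
# Print the last User= value found in a systemd unit file (empty if none).
unit_user() {
  sed -n 's/^[[:space:]]*User=[[:space:]]*//p' "$1" | tail -n 1
}

# Report whether the unit's User= can be resolved on this machine.
check_unit_user() {
  user="$(unit_user "$1")"
  if [ -z "$user" ]; then
    echo "unset"      # no User= line: systemd runs the service as root
  elif id -u "$user" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "missing"    # the case that produced status=217/USER
    return 1
  fi
}
```

Run it as check_unit_user <path-to-newt-unit-file>; a result of missing means one of the three fixes above is still needed.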

You also observed that the newt config path was:

  • ~/.config/newt-client/config.json
  • and there was no .newt folder (which is fine; paths differ by versions/builds)

8) Deterministic debugging checklist (the exact style you used)

8.1 Check Pangolin health inside Docker network

docker run --rm --network pangolin curlimages/curl:8.6.0 \
  curl -i http://pangolin:3001/api/v1/ | head -n 30

Expect:

  • 200 OK
  • {"message":"Healthy"}

8.2 Check Pangolin-generated Traefik config

docker run --rm --network pangolin curlimages/curl:8.6.0 \
  curl -s http://pangolin:3001/api/v1/traefik-config | head -n 5

Then search for a hostname:

docker run --rm --network pangolin curlimages/curl:8.6.0 \
  sh -lc "curl -s http://pangolin:3001/api/v1/traefik-config | grep -n 'glance.frusetik.com' || true"

8.3 Verify internal routing at the Gerbil/Traefik layer

HTTP test (port 80):

docker run --rm --network pangolin curlimages/curl:8.6.0 \
  curl -i -H "Host: glance.frusetik.com" http://gerbil:80 | head -n 30

HTTPS test (port 443):

docker run --rm --network pangolin curlimages/curl:8.6.0 \
  curl -vkI --connect-to glance.frusetik.com:443:gerbil:443 https://glance.frusetik.com/

8.4 Verify cloudflared can resolve the origin (gerbil)

From cloudflared container:

docker exec -it <cloudflared_container> sh -lc 'getent hosts gerbil || nslookup gerbil || cat /etc/resolv.conf'

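If that fails with “no such host”, cloudflared is not on the pangolin network. If docker exec itself fails because the image has no shell (the official cloudflare/cloudflared image is minimal and may not ship one), run the same lookup from a throwaway container that shares cloudflared's network namespace, for example docker run --rm --network container:<cloudflared_container> busybox nslookup gerbil.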

8.5 Verify Docker networks

docker network ls | grep pangolin
docker network inspect pangolin | sed -n '1,120p'
docker ps --format 'table {{.Names}}\t{{.Networks}}' | grep -E 'pangolin|gerbil|traefik|cloudflared'

9) “Two years later” rules of thumb (the invariants)

Invariant A — origin scheme and Traefik entrypoints must match

Pick one model:

Model 1 (simplest): Cloudflare terminates TLS, origin is HTTP

  • cloudflared forwards to http://gerbil:80
  • Traefik routers use entryPoints: [web]
  • no internal HTTP->HTTPS redirects
  • Traefik does not need ACME

Model 2: Traefik terminates TLS, origin is HTTPS

  • cloudflared forwards to https://gerbil:443
  • Traefik routers use entryPoints: [websecure]
  • Traefik must have certs (ACME or provided)
  • Cloudflare origin cert verification must be configured (or disabled)

Most of the pain today was caused by being halfway between these models.
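A quick way to spot the halfway state is to list the origin schemes used by the cloudflared ingress rules. This is a minimal sketch, assuming the ingress config is available as JSON text like the log excerpt in section 6.2; the function names are made up for illustration, and the grep-based parsing is deliberately crude (use a real JSON parser for anything beyond a sanity check).

```shell
# Read cloudflared ingress JSON on stdin; print each distinct origin scheme once.
# Catch-all entries such as http_status:404 are ignored because they have no "://".
ingress_schemes() {
  grep -oE '"service"[[:space:]]*:[[:space:]]*"https?://' \
    | grep -oE 'https?' \
    | sort -u
}

# Succeed when all hostnames use the same origin scheme (Model 1 or Model 2).
ingress_is_consistent() {
  count="$(ingress_schemes | wc -l | tr -d ' ')"
  [ "$count" -le 1 ]
}
```

Fed with the ingress block from section 6.2, ingress_schemes prints both http and https, which is the mixed state that caused the redirect and 404 problems.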

Invariant B — cloudflared must reach origin by DNS or IP

If you configure origin as http://gerbil:80, cloudflared must be able to resolve gerbil. That only works if cloudflared is attached to the same Docker network, or you route to a fixed IP.

Invariant C — UDP endpoint is DNS-only

pangolin-udp.frusetik.com must remain grey-cloud and point directly at the VPS public IP.

Invariant D — Pangolin can generate TLS expectations

If Pangolin outputs routers with websecure + certResolver, Traefik will behave accordingly. If you intend “internal HTTP only”, ensure Pangolin is configured to stop emitting TLS fields.


10) Practical recommendations going forward

  1. Choose Model 1 unless you have a strong reason otherwise.
    Cloudflare handles TLS. Origin stays HTTP. Fewer moving parts. Fewer redirect loops.

  2. Run cloudflared on the same compose / docker network as Pangolin/Gerbil
    or explicitly attach it:

    docker network connect pangolin <cloudflared_container>
    
  3. Avoid mixing multiple overlay DNS systems on the client (Tailscale + system DNS)
    If something suddenly becomes ERR_NAME_NOT_RESOLVED, validate with:

    dig glance.frusetik.com +short
    
  4. Systemd services in LXC: ensure User= exists
    status=217/USER means systemd could not resolve or switch to the configured user, which in practice is a bad or missing User= value.


11) Appendix: observed log snippets and what they meant

Pangolin seeing Newt connect

You saw entries like:

  • “Client added to tracking - NEWT ID: ...”
  • “WebSocket connection established”

That confirmed Newt ↔ Pangolin connectivity.

Pangolin warning about Docker socket

Newt <id> does not have Docker socket access

This is only relevant if Pangolin expects the remote site to expose Docker metadata via Newt. It does not prevent basic HTTP service forwarding.

cloudflared definitive failure message

dial tcp: lookup gerbil on 127.0.0.11:53: no such host

This means: cloudflared container cannot resolve Docker service name gerbil.


If you want the most deterministic future-proof setup:

  • Cloudflare edge HTTPS
  • cloudflared origin: http://gerbil:80
  • Traefik only web entrypoint
  • No Traefik ACME
  • Badger plugin enabled
  • cloudflared container attached to pangolin docker network

That configuration has the fewest interacting TLS layers.


Zero-Trust HomeLab Infrastructure with Pangolin, Traefik and Cloudflare

Introduction

This guide documents the architecture and deployment steps used to host a zero-trust homelab using Pangolin, Traefik, Cloudflare Tunnel, and Nginx. The objective is to create a secure environment where web applications (e.g. Glance, Immich, Jellyfin) are reachable only after authentication and without exposing ports directly to the internet. The guide was written in January 2026 after several iterations of troubleshooting and configuration. It explains the reasoning behind each step so that future-you can understand why certain decisions were made.

The overall architecture has four core components:

  1. Pangolin control plane: orchestrates users, sites and resources and serves routing rules to Traefik; includes a REST API, WebSocket server and authentication system.
  2. Gerbil tunnel manager: manages WireGuard tunnels between edge networks and the central server.
  3. Newt clients: lightweight agents running on remote nodes (LXC/VMs) that create WireGuard tunnels and proxy services through Gerbil.
  4. Traefik reverse proxy with Badger plugin: Traefik routes HTTP requests and enforces authentication using the Badger middleware plugin; Badger is specifically designed to work with Pangolin.

In addition, Cloudflare Tunnel provides public ingress into the private network without opening ports on your VPS. Nginx continues serving other public websites on ports 80 and 443.

The guide assumes you are comfortable with Docker, Docker Compose and Linux firewall management, and that you have a VPS running Ubuntu or similar.

High-level Architecture

At a high level, the flow of a client request looks like this:

flowchart LR
    User[Browser Client] -->|HTTPS| CloudflareEdge[Cloudflare Edge]
    CloudflareEdge -->|QUIC or HTTP/2 via Tunnel| Cloudflared[cloudflared Connector]
    Cloudflared -->|HTTP or HTTPS| Traefik[Traefik]
    Traefik -->|Badger Plugin Auth Check| PangolinAPI[Pangolin API]
    PangolinAPI -->|Auth Decision| Traefik
    Traefik -->|Proxy| AppService[Internal Service via Gerbil Newt Tunnel]

    subgraph VPS
        Traefik -.->|Internal HTTP 80 and HTTPS 443| Nginx[Nginx]
        Nginx -->|Public HTTP HTTPS| otherApps[Other Web Apps]
    end
  1. A user browses to https://glance.frusetik.com. DNS resolves the hostname to Cloudflare. The browser establishes an HTTPS connection to Cloudflare.
  2. Cloudflare forwards the request through the tunnel that the cloudflared agent on your VPS keeps open with outbound QUIC or HTTP/2 connections. The agent then forwards the request to Traefik via HTTP or HTTPS.
  3. Traefik processes the request. The Badger middleware intercepts it and checks the p_session_token cookie against the Pangolin API. If the cookie is missing or invalid, Badger redirects to Pangolin's login page; otherwise Pangolin verifies the session and returns a decision.
  4. After successful authentication, Traefik proxies the request to the backend service. For services hosted behind a Newt client, the connection flows over a WireGuard tunnel managed by Gerbil.

Component roles

  • Pangolin (Control Plane): Maintains configuration for sites, users, organizations, domains and resources; exposes a web UI, REST API and WebSocket server. It orchestrates Newt clients and serves routing rules that Traefik polls through its HTTP provider.
  • Gerbil (Tunnel Manager): Maintains peer keys and orchestrates WireGuard tunnels. It exposes UDP ports 51820 and 21820 on the VPS for tunnel connectivity.
  • Newt (Edge Client): Runs on each remote node (e.g. in your Proxmox LXC). It connects to the Pangolin API over WebSocket and to Gerbil via WireGuard to expose internal services.
  • Traefik (Reverse Proxy): Serves as the ingress router. It reads its dynamic configuration from Pangolin via the HTTP provider and enforces authentication via the Badger plugin. It listens on internal ports 80 (HTTP) and 443 (HTTPS) but does not publish those to the host; cloudflared connects to it directly over the Docker network.
  • Badger (Middleware): A Traefik plugin that acts as an authentication bouncer. It is automatically installed with Pangolin and ensures that only authenticated requests are allowed through.
  • Nginx: Continues to listen on ports 80/443 on the VPS for your other public services. It proxies requests unrelated to Pangolin and is configured separately.
  • Cloudflare Tunnel: Provides a secure tunnel between Cloudflare's edge network and your VPS. It runs as a Docker container (cloudflared). Multiple hostnames can point through the same tunnel. Cloudflare resolves your subdomains and forwards traffic to Traefik via the tunnel. No public ports are opened on the VPS for Pangolin or its resources.

Directory Structure

All Pangolin-related files reside in /home/claudio/infra/pangolin. The structure at the end of this configuration looks like this:

/home/claudio/infra/pangolin
├── docker-compose.yml        # Compose definition for pangolin, gerbil, traefik
├── config/
│   ├── config.yml            # Pangolin configuration (domains, secret, etc.)
│   ├── traefik/
│   │   ├── traefik_config.yml # Static Traefik config (providers, entrypoints, ACME)
│   │   └── dynamic_config.yml # Optional custom routes (we keep minimal)
│   └── letsencrypt/          # Stores acme.json for Traefik certificates
└── data/                     # Optional persistent data for pangolin (database)

Step-by-Step Installation

1. Prepare the VPS

  1. Install Docker and Docker Compose on the VPS (Ubuntu). Ensure you are using Compose v2 (integrated in docker compose).
  2. Firewall configuration (UFW):
    • Open UDP 51820 and 21820 for WireGuard tunnels.
    • Keep ports 80 and 443 open for Nginx (public websites). If you prefer a completely closed web port and rely solely on Cloudflare tunnel, you can remove these rules later.
    • Ensure SSH (TCP 22) remains open.
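The firewall rules above can be sketched as UFW commands. This is a minimal sketch, assuming UFW is installed and the commands run as root; the function name and the UFW override variable (for example UFW=echo to preview, or UFW="sudo ufw") are made up for illustration.

```shell
# Open the ports this guide relies on. Set UFW=echo to preview the commands.
open_pangolin_ports() {
  ufw_cmd="${UFW:-ufw}"
  # Keep SSH reachable before touching anything else.
  $ufw_cmd allow 22/tcp || return 1
  # WireGuard tunnel ports published by Gerbil.
  $ufw_cmd allow 51820/udp || return 1
  $ufw_cmd allow 21820/udp || return 1
  # Public web ports for Nginx; drop these two if you rely solely on the tunnel.
  $ufw_cmd allow 80/tcp || return 1
  $ufw_cmd allow 443/tcp || return 1
}
```

Calling open_pangolin_ports applies the rules; check the result afterwards with ufw status.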

2. Configure DNS and Cloudflare

  1. DNS records: Create a wildcard DNS record for your domain pointing to your Cloudflare tunnel (or CNAME to the tunnel). For example:

    • *.frusetik.com → proxied (orange cloud) to your Cloudflare tunnel.
    • pangolin-udp.frusetik.com → DNS-only (grey cloud) A record pointing to your VPS's public IP; Gerbil uses this for WireGuard clients.
  2. Cloudflare Tunnel: In the Cloudflare Zero Trust dashboard, create or reuse a tunnel. Assign the connector a token and run cloudflared in Docker on the VPS:

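    # Join the pangolin network (created by the compose stack in step 6) so
    # that cloudflared can resolve the origin hostname gerbil.
    docker run -d \
      --name cloudflared \
      --restart unless-stopped \
      --network pangolin \
      -e TUNNEL_TOKEN=<your-token> \
      cloudflare/cloudflared:latest tunnel run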
    

    Or define cloudflared as a service in your existing docker-compose.yml with the token environment variable.

  3. Public hostnames: For each application (e.g. pangolin.frusetik.com, glance.frusetik.com), create a public hostname entry under your tunnel. Initially set the service to http://gerbil:80; we later switched to https://gerbil:443 while testing internal TLS. For HTTPS origins, set No TLS Verify to true until Traefik serves a certificate that cloudflared trusts.

3. Create the Compose file

Create docker-compose.yml in /home/claudio/infra/pangolin with the following services:

3.1 Pangolin Service

services:
  pangolin:
    image: fosrl/pangolin:latest
    container_name: pangolin
    restart: unless-stopped
    volumes:
      - ./config:/app/config
      - ./data:/app/data  # Persist database
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3001/api/v1/"]
      interval: "3s"
      timeout: "3s"
      retries: 15

This exposes Pangolin's API on ports 3000 (REST), 3001 (internal API for Traefik config) and 3002 (Next.js UI) inside the container network only. None of these ports are published on the host.

3.2 Gerbil Service

  gerbil:
    image: fosrl/gerbil:latest
    container_name: gerbil
    restart: unless-stopped
    depends_on:
      pangolin:
        condition: service_healthy
    command:
      - --reachableAt=http://gerbil:3004
      - --generateAndSaveKeyTo=/var/config/key
      - --remoteConfig=http://pangolin:3001/api/v1/
    volumes:
      - ./config:/var/config
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    ports:
      - 51820:51820/udp
      - 21820:21820/udp

Gerbil publishes UDP 51820 and 21820 on the host. Ports 80/443 are not published; because Traefik shares Gerbil's network namespace, they are reachable only inside the Docker network, at the hostname gerbil.

3.3 Traefik Service

We had two alternative configurations during troubleshooting:

  1. HTTP-only internal (simpler but requires disabling Pangolin's redirect):

      traefik:
        image: traefik:v3.4.0
        container_name: traefik
        restart: unless-stopped
        # Share Gerbil's network namespace so that http://gerbil:80 reaches Traefik.
        network_mode: service:gerbil
        depends_on:
          pangolin:
            condition: service_healthy
        command:
          - --configFile=/etc/traefik/traefik_config.yml
        volumes:
          - ./config/traefik:/etc/traefik:ro
          - ./config/traefik/logs:/var/log/traefik
    networks:
      default:
        name: pangolin
    

    With this configuration, only the web (port 80) entrypoint is defined in traefik_config.yml, and certificatesResolvers are disabled to avoid Let's Encrypt redirect loops. cloudflared connects to http://gerbil:80. Traefik still enforces authentication via Badger but never redirects to HTTPS. This is ideal when you terminate TLS at Cloudflare and want to avoid ACME complications.

  2. HTTPS internal with DNS-01 (more secure but complex):

    We later enabled the websecure entrypoint and configured an ACME DNS-01 resolver using Cloudflare. This required adding a certificatesResolvers block in the static config and passing CF_DNS_API_TOKEN to the container. The tunnel hostnames then pointed to https://gerbil:443. Traefik issues certificates for your *.frusetik.com hostnames and serves them internally. Without a valid trust store inside cloudflared, we needed to set noTLSVerify to avoid TLS validation errors. In practice the HTTP-only option proved simpler.

Note: network_mode: service:gerbil is used in the official install, causing Traefik to share Gerbil's network namespace. We retained it but discovered that connecting cloudflared to the pangolin network is essential so that it can resolve the gerbil hostname. After adding the cloudflared container to the pangolin network (docker network connect pangolin cloudflared), the DNS lookup succeeded.
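Because docker network connect fails when the container is already attached, an idempotent wrapper is convenient. This is a minimal sketch, assuming the docker CLI is available; the function name and the DOCKER override variable (used only to substitute a stub for dry runs) are made up for illustration.

```shell
# Attach a container to a network only if it is not attached yet, then restart
# it so the tunnel reconnects with working DNS for the origin name.
ensure_on_network() {
  network="$1"
  container="$2"
  docker_cmd="${DOCKER:-docker}"
  # List the names of all networks the container is currently attached to.
  networks="$($docker_cmd inspect -f '{{range $name, $cfg := .NetworkSettings.Networks}}{{$name}} {{end}}' "$container")" || return 1
  case " $networks " in
    *" $network "*)
      echo "already attached" ;;
    *)
      $docker_cmd network connect "$network" "$container" \
        && $docker_cmd restart "$container" >/dev/null \
        && echo "attached" ;;
  esac
}
```

With the names used in this guide the call is ensure_on_network pangolin cloudflared. An attachment made this way is lost if the container is recreated, so declaring the network in the cloudflared compose project is the more durable option.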

4. Configure Traefik

Create config/traefik/traefik_config.yml. The following example shows the HTTP-only static config with ACME disabled. Comments explain why certain features are commented out.

api:
  insecure: true
  dashboard: true

providers:
  http:
    endpoint: "http://pangolin:3001/api/v1/traefik-config"
    pollInterval: "5s"
    # Traefik polls Pangolin for dynamic config (routers, services, middlewares).
  file:
    filename: "/etc/traefik/dynamic_config.yml"
    # Optional static overrides (kept minimal).

experimental:
  plugins:
    badger:
      moduleName: "github.com/fosrl/badger"
      version: "v1.3.1"  # Pin the version you tested

log:
  level: "INFO"
  format: "common"

# ACME is disabled because the origin ports 80/443 are not publicly reachable.
# Uncomment the certificatesResolvers section and configure DNS-01 if you want
# Traefik to issue certificates and serve HTTPS internally.

entryPoints:
  web:
    address: ":80"
    # Traefik listens on port 80 inside the Docker network (not published on host).

  # websecure is commented out for HTTP-only deployment.  See the alternative
  # configuration above if enabling internal HTTPS.
  # websecure:
  #   address: ":443"
  #   transport:
  #     respondingTimeouts:
  #       readTimeout: "30m"
  #   http:
  #     tls:
  #       certResolver: "letsencrypt"

serversTransport:
  insecureSkipVerify: true
  # Accept self-signed certificates when Traefik proxies HTTPS backends (e.g., newt client services).

ping:
  entryPoint: "web"
  # Health endpoint for Traefik (internal only).

config/traefik/dynamic_config.yml can be almost empty except for the Badger middleware, because Pangolin dynamically generates routers and services for each resource:

http:
  middlewares:
    badger:
      plugin:
        badger:
          disableForwardAuth: true  # Traefik uses Pangolin to authenticate

5. Configure Pangolin

The file config/config.yml defines the Pangolin control plane. A minimal configuration looks like this:

app:
  dashboard_url: "https://pangolin.frusetik.com"

domains:
  pangolin_ui:
    base_domain: "pangolin.frusetik.com"
  root:
    base_domain: "frusetik.com"

server:
  secret: "<random-secret>"

# Gerbil expects clients to connect to the VPS for WireGuard tunnels.  Use a
# DNS-only (non-proxied) record that points to your VPS public IP.
gerbil:
  base_endpoint: "pangolin-udp.frusetik.com"

flags:
  require_email_verification: false
  disable_signup_without_invite: true
  disable_user_create_org: true

  • dashboard_url should point to the hostname where you access the Pangolin UI.
  • domains defines base domains that Pangolin can manage. We added frusetik.com so that Pangolin can create public resources directly under *.frusetik.com. Without this, Pangolin would place hostnames under the pangolin. subdomain.
  • server.secret must be a long random string; generate it with openssl rand -base64 48.
  • gerbil.base_endpoint must resolve to your VPS (not proxied by Cloudflare) to allow UDP WireGuard connections.
  • The flags disable self-registration and email verification for a closed system.

6. Start the stack

Run the stack from the Pangolin directory:

cd /home/claudio/infra/pangolin
sudo docker compose pull
sudo docker compose up -d

Check logs with docker compose logs -f pangolin and docker compose logs -f traefik to ensure services start without errors. Note that the logs may complain about missing Docker socket access (in this setup it was Pangolin reporting that Newt does not have Docker socket access); this warning is safe to ignore when running outside of Swarm or when you are not auto-publishing containers.

7. Connect Newt client

For each remote node (e.g. LXC container) that you want to expose through Pangolin:

  1. Download the newt client from the Pangolin UI (Sites → Create site → Add a new site). It provides a command line including a token.

  2. Copy the binary to the target node and run the command once to initialize the connection. The client writes state to ~/.config/newt-client/config.json.

  3. To run newt persistently, create a systemd unit or background script. Example systemd service:

    [Unit]
    Description=Pangolin Newt Client
    After=network-online.target
    Wants=network-online.target
    
    [Service]
    Type=simple
    ExecStart=/usr/local/bin/newt
    Restart=always
    RestartSec=3
    
    [Install]
    WantedBy=multi-user.target
    

    If your LXC does not use systemd, run nohup newt & and disown it.

  4. Verify in the Pangolin UI that your Site is “Connected”.

8. Create Resources

Once sites are connected, you can publish services. In the Pangolin dashboard, navigate to Resources and choose Public Resource. Fill in:

  • Hostname: glance.frusetik.com (or immich.frusetik.com, etc.).
  • Site: the site where your service runs.
  • Target: the internal URL of the service as seen from the Newt client (e.g. http://glance:8080). Test connectivity from the newt host using curl before publishing.

Pangolin adds a new router and service to the config that Traefik polls through its HTTP provider. Traefik routes based on the Host header and proxies to the Newt client via Gerbil.

9. Troubleshooting Tips

  • Cloudflared cannot resolve gerbil: If cloudflared logs show dial tcp: lookup gerbil on 127.0.0.11:53: no such host, the cloudflared container isn't attached to the pangolin network. Run docker network connect pangolin <cloudflared-container> and restart the container. After connecting, nslookup gerbil inside the cloudflared container should resolve to the Gerbil container IP.
  • Infinite redirect (ERR_TOO_MANY_REDIRECTS): This occurs when Traefik redirects HTTP to HTTPS internally and cloudflared is connecting via HTTP, causing a loop. To fix, either (a) disable the redirect and run HTTP-only; or (b) enable internal HTTPS (websecure) and have cloudflared connect via https://gerbil:443.
  • 502 Host error: Cloudflare returns a 502 when the tunnel cannot reach your origin. Check docker logs cloudflared for DNS lookup errors or connection refused. Most often this means the origin service is misconfigured or cloudflared is not on the right network.
  • Certificate issues: When using internal HTTPS, Traefik's default certificate is self-signed. Cloudflared will reject it unless you set noTLSVerify or supply a CA bundle via originRequest.caPool. A more robust solution is to issue real certificates via ACME DNS-01 and mount them in Traefik.
  • Nginx conflict: Traefik and Nginx can coexist as long as Traefik's ports are not published on the host. Nginx still serves your other web apps on 80/443.

Conclusion

This readme captures the final architecture and troubleshooting process used to deploy Pangolin on a VPS with Cloudflare Tunnel. The key lessons learned are:

  • Attach the Cloudflare tunnel container to the same network as Traefik/Gerbil so that it can resolve service names. Use docker network connect to add existing containers to a network.
  • Decide early whether you want internal HTTPS. Running HTTP-only behind Cloudflare simplifies the configuration and avoids Let's Encrypt complications, but you must disable Traefik's redirects. Running HTTPS internally requires DNS-01 ACME and possibly trust store adjustments for cloudflared.
  • Keep Nginx separate for non-Pangolin services. Use DNS and Cloudflare to route only specific hostnames into Pangolin.
  • Use Pangolin's Domains configuration to manage which base domains it can serve. Without adding frusetik.com, Pangolin will prepend its own subdomain.
  • Persist Pangolin's data (/app/data) to avoid losing resources and state when restarting containers.

By following these steps you can host internal services securely without exposing ports, while granting friends and family access through a single login. In two years, when you revisit this setup, the diagrams and commentary here should refresh your understanding of each component's role and the reasoning behind the configuration choices.