Update Homelab Todo List
@@ -1,84 +1,205 @@
# Homelab Todo List

Prioritized list of things Claudio wants to do with his homelab. Last updated: 2026-04-01.
Curated list of open homelab work across task history, memory, and second-brain notes. Last updated: 2026-04-15.

## Backup & Restore
## Current operating priorities

- [ ] Buy a 4-bay NAS for backup at parents' place ← **NEW 2026-04-03**
- [ ] Regular backup for NAS at parents' place
- [ ] Proxmox backup
- [ ] Paperless backup (and public access)
- [ ] Backup test script: verify restores actually work
- [ ] Kopia/Time Machine backup for Claudio's + Alena's machines (dotfiles, etc.)
- [ ] Backup system across entire lab (3-2-1 rule: 3 copies, 2 media, 1 off-site)
1. **Backup foundation first**
2. **Simplify hosting and homelab boundaries**
3. **Stabilize access and edge architecture**
4. **Migrate or delete deliberately, not ad hoc**
5. **Document every meaningful infra change**

## Hosting & Apps
## Now

- [ ] Immich: test thoroughly and validate for production use (see [[Immich Testing Plan]])
  - [ ] Automatic phone backup (iOS)
  - [ ] Immich library + database backup/restore
  - [ ] Public sharing guest experience
  - [ ] 1-week stability run
### 1. Define and implement the backup foundation
- [ ] Define the backup policy for each critical system: Synology, Proxmox, VPS, Gitea, Joplin, Immich, Paperless, config files
  - Next action: create one backup matrix with source, method, frequency, retention, restore target, and off-site destination
- [ ] Set up Proxmox backup server
  - Candidate target: **goodolddell**
  - Next action: decide whether goodolddell should host Proxmox Backup Server or remain a generic restic/utility host
- [ ] Decide where Proxmox backups should land first
  - Options currently implied by notes: goodolddell, Synology, or both
  - Next action: choose primary landing zone and secondary/off-site path
- [ ] Set up lab-wide backup strategy following 3-2-1
  - Next action: map current copies vs missing copies for each critical service
- [ ] Add backup verification, not just backup jobs
  - Next action: define one monthly restore drill and one automated verification check
- [ ] Write a backup test script / restore validation workflow
  - Next action: start with one service, likely Gitea or Immich
- [ ] Buy a 4-bay NAS for backup at parents' place
  - Blocker: hardware purchase decision
- [ ] Define regular backup flow to parents' NAS
  - Depends on: backup matrix + parents' NAS target design
- [ ] Set up Kopia/Time Machine backup for Claudio's and Alena's machines
  - Next action: choose destination and retention policy
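The backup matrix and the 3-2-1 mapping from the items above could be prototyped as plain data before any tooling is chosen. The following is only a sketch: the class names, the `medium` labels, and the example values are assumptions made for illustration and do not come from the notes. It also assumes the live data counts as one of the three copies.

```python
from dataclasses import dataclass, field


@dataclass
class BackupCopy:
    """One copy of a service's data (hypothetical schema)."""
    location: str          # e.g. "goodolddell", "Synology"
    medium: str            # e.g. "ssd", "hdd", "cloud" (labels are assumptions)
    offsite: bool = False


@dataclass
class MatrixRow:
    """One backup-matrix row; fields mirror the columns named in the list above."""
    source: str
    method: str
    frequency: str
    retention: str
    restore_target: str
    copies: list[BackupCopy] = field(default_factory=list)


def gaps_321(row: MatrixRow) -> list[str]:
    """Return the 3-2-1 requirements this row does not meet yet."""
    gaps = []
    if len(row.copies) < 3:
        gaps.append(f"needs 3 copies, has {len(row.copies)}")
    media = {copy.medium for copy in row.copies}
    if len(media) < 2:
        gaps.append(f"needs 2 media types, has {len(media)}")
    if not any(copy.offsite for copy in row.copies):
        gaps.append("needs 1 off-site copy")
    return gaps


# Placeholder row; method, frequency, and retention are invented for the example.
example = MatrixRow(
    source="Gitea",
    method="restic",
    frequency="daily",
    retention="30 days",
    restore_target="Proxmox",
    copies=[BackupCopy("VPS", "ssd"), BackupCopy("goodolddell", "hdd")],
)
print(gaps_321(example))
```

Running `gaps_321` over every row gives the "current copies vs missing copies" map in one pass, which is why the copies are modelled per service rather than per host.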

## Infrastructure Cleanup
### 2. Simplify the homelab and hosting architecture
- [ ] Simplify hosting and homelab structure because too many things are mixed together
  - Goal: each service should have one clear host, one clear access path, one clear backup path, and one clear reason to exist where it is
  - Next action: create a service inventory table with columns: service, host, purpose, audience, access path, backup path, migration status
- [ ] Decide what belongs on VPS vs Proxmox vs Synology vs goodolddell
  - Next action: classify each service as edge/public, production internal, backup/infra, or experimental
- [ ] Review whether Proxmox should become the central app platform
  - Existing concern: avoid turning it into an unclear catch-all host
- [ ] Keep the VPS minimal
  - Existing note direction: public edge and only truly necessary public components
- [ ] Keep goodolddell focused
  - Candidate role: backup and always-on infra, not random leftover app host
- [ ] Give Orik passwordless access to its own machine only
  - Goal: Orik should be able to operate its own host without interactive password prompts
  - Constraint: do **not** grant write-capable access to other machines on the network
  - Next action: design a least-privilege access model for the local host vs all remote hosts before changing SSH/sudo setup
- [ ] Ensure Orik does not have write access to other machines on the network
  - Next action: separate local-machine automation privileges from remote-machine credentials and confirm remote access should be read-only or absent by default
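The "one clear host, one clear access path, one clear backup path" goal earlier in this section lends itself to a mechanical check once the service inventory exists. A minimal sketch, assuming the inventory is kept as records with list-valued fields; the `Service` class and the example entry are hypothetical and not taken from the real inventory.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """One inventory row (hypothetical shape; lists capture today's reality)."""
    service: str
    hosts: list[str]
    access_paths: list[str]
    backup_paths: list[str]


def ambiguities(svc: Service) -> list[str]:
    """Report every dimension where the service does not have exactly one entry."""
    problems = []
    checks = (
        ("host", svc.hosts),
        ("access path", svc.access_paths),
        ("backup path", svc.backup_paths),
    )
    for label, values in checks:
        if len(values) != 1:
            problems.append(
                f"{svc.service}: expected exactly one {label}, found {len(values)}"
            )
    return problems


# Invented entry: deployed on two hosts and not backed up anywhere.
print(ambiguities(Service("example-app", ["host-a", "host-b"], ["tunnel"], [])))
```

Storing lists instead of single values is deliberate: the inventory should be able to describe the current overlap before it is cleaned up.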

- [ ] Move Gitea + Joplin from VPS to Proxmox ← **NEW 2026-04-03**
- [ ] Find out if network traffic is limited/throttled through VPS ← **NEW 2026-04-03**
- [ ] Buy a second VPS instance? ← **NEW 2026-04-03**
- [ ] Evaluate Pangolin + Authentik vs Cloudflare Access (free tier): do we need both, or is Cloudflare enough?
- [ ] Clean up VPS: consolidate from many reverse proxies (pangolin, nginx, caddy, traefik, dokku, cloudflare?) to one proven stack
- [ ] Version control VPS setup (docker files + config files in git)
- [ ] Fix SSH keys: use a single key or a few keys instead of many
- [ ] Set up `info@frusetik.com` email account + SMTP for all self-hosted apps (Immich, etc.)
### 3. Finish the remote access simplification
- [ ] Adopt one default admin lane
  - Recommended target from existing notes: **Tailscale**
  - Next action: confirm Tailscale is the default admin path and mark ZeroTier as deprecated unless proven needed
- [ ] Adopt one default user/app access lane
  - Recommended target from existing notes: **Cloudflare Tunnel / reverse proxy**
  - Next action: list which services are user-facing vs admin-only
- [ ] Evaluate Pangolin + Authentik vs Cloudflare Access free tier
  - Next action: write down what problem Authentik is solving today that Cloudflare alone does not
- [ ] Remove overlapping access paths from the critical path
  - Next action: document one primary access path per service
- [ ] Invite family/friends only after service access model is clear
  - Depends on: service inventory + access policy

## Monitoring & Documentation
### 4. Stabilize the VPS and edge stack
- [ ] Check whether VPS network traffic is limited or throttled
  - Next action: inspect Netcup plan limits and current usage
- [ ] Decide whether a second VPS is actually needed
  - Next action: answer only after traffic/memory constraints are measured
- [ ] Clean up the VPS reverse-proxy sprawl
  - Goal: converge from multiple overlapping edge tools to one proven stack
  - Next action: inventory all currently running proxy/edge/auth components on the VPS
- [ ] Version-control VPS setup
  - Next action: put docker compose files, env templates, and key config under git
- [ ] Add swap to VPS to reduce OOM risk
  - Source: Immich public outage incident on 2026-04-03
- [ ] Add memory monitoring and alerting on VPS
  - Source: Immich public outage incident on 2026-04-03
- [ ] Consider Traefik health-check / config-refresh resilience measure
  - Source: Immich public outage incident on 2026-04-03
- [ ] Fix SSH key sprawl
  - Next action: reduce to one primary key or a very small set
- [ ] Set up `info@frusetik.com` + SMTP for self-hosted apps
  - Next action: decide provider and which apps should send mail first

- [ ] Glance / Uptime Kuma page showing all hosted services status
- [ ] Documentation for everything hosted
- [ ] Monthly maintenance reminder + checklist
## Next

## Access & Networking
### 5. Migrate or remove services deliberately
- [ ] Move Gitea from VPS to Proxmox
  - Preconditions: backup, restore plan, target host decision, access path
- [ ] Move Joplin from VPS to Proxmox
  - Preconditions: backup, restore plan, target host decision, access path
- [ ] Delete Immich from **goodolddell**
  - Intent: remove outdated or misplaced deployment from the Dell machine
  - Preconditions: confirm no required data or active path still depends on it
  - Next action: verify whether Immich on goodolddell is unused/stale, then remove container, volumes, and residual config deliberately
- [ ] Review whether Cloudflare Tunnel management should move off VPS
  - Next action: decide if VPS remains public edge only, or if edge shifts elsewhere

- [ ] One admin VPN network (evaluate: ZeroTier vs Tailscale vs Pangolin private)
- [ ] Invite people (family, friends) to appropriate services
### 6. Validate Immich before declaring it production
- [ ] Test automatic phone backup on Claudio's iPhone
- [ ] Test Immich library + database backup and restore
- [ ] Test public sharing guest experience
- [ ] Run a 1-week stability check
- [ ] Keep Immich **not protected by Pangolin auth** if mobile app backup depends on direct access
- [ ] Verify off-site destination for Immich backups
  - Likely target: parents' NAS or equivalent off-site storage

## Network Infrastructure
### 7. Cover the missing app backup gaps
- [ ] Paperless backup
  - Next action: document data path, DB path, and restore steps
- [ ] Decide Paperless public access policy
  - Next action: determine whether it should be public at all or Tailscale-only
- [ ] Define backup rotation for PostgreSQL-backed services
  - Existing note source: Self-Hosting backup section
- [ ] Define config-file backup for infrastructure
  - Includes compose files, tunnel/proxy config, auth config, DNS-related config

- [ ] Define IP ranges properly (e.g., 10.0.0.0/24 for lab, 10.0.1.0/24 for prod, 10.0.2.0/24 for DMZ)
- [ ] Set up VLANs: separate prod, dev/staging, IoT, guests
- [ ] Document VLAN/subnet map and which services live where
- [ ] Firewall rules between VLANs (default deny, explicit allow)
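Before any switch or firewall is touched, the subnet plan and the default-deny policy from the items above can be sanity-checked on paper. The sketch below uses the three example ranges from the list. The zone names, the single allow rule, and the decision to always permit same-zone traffic are assumptions for illustration; it models the policy only and does not configure a real firewall.

```python
import ipaddress

# Example ranges from the list above; the zone names are shorthand for this sketch.
SUBNETS = {
    "lab": ipaddress.ip_network("10.0.0.0/24"),
    "prod": ipaddress.ip_network("10.0.1.0/24"),
    "dmz": ipaddress.ip_network("10.0.2.0/24"),
}

# Hypothetical explicit-allow list of (source zone, destination zone) pairs.
ALLOWED = {("lab", "prod")}


def overlapping_pairs(subnets):
    """Return every pair of zone names whose address ranges overlap."""
    names = sorted(subnets)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if subnets[a].overlaps(subnets[b])
    ]


def zone_of(address, subnets=SUBNETS):
    """Return the zone containing the address, or None if it is unmapped."""
    ip = ipaddress.ip_address(address)
    for name, network in subnets.items():
        if ip in network:
            return name
    return None


def is_allowed(src, dst, subnets=SUBNETS, allowed=ALLOWED):
    """Default deny: permit same-zone traffic or an explicitly allowed zone pair."""
    src_zone, dst_zone = zone_of(src, subnets), zone_of(dst, subnets)
    if src_zone is None or dst_zone is None:
        return False  # unmapped addresses are denied
    return src_zone == dst_zone or (src_zone, dst_zone) in allowed
```

For example, `is_allowed("10.0.2.5", "10.0.1.7")` is `False` under these placeholder rules because no DMZ-to-prod pair is listed.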
## Later

## Automation & Maintenance
### 8. Monitoring, maintenance, and observability
- [ ] Set up Uptime Kuma or similar monitoring tool
  - Requirement from Claudio: add **read access for Orik**
  - Next action: choose tool and hosting location, then define how Orik should access it safely
- [ ] Build a single service status page
  - Candidate: Glance or Uptime Kuma
- [ ] Add automated health checks + alerts
- [ ] Build the monthly maintenance checklist
- [ ] Set the monthly maintenance reminder
- [ ] Keep maintenance under 1 hour/month through automation where possible

- [ ] Max 1h/month maintenance target: automate as much as possible
- [ ] Monthly maintenance reminder + checklist (Orik helps build)
- [ ] Automated backup verification (not just "ran", but "actually restorable")
- [ ] Automated health checks + alerts
### 9. Network architecture cleanup
- [ ] Define IP ranges properly
- [ ] Set up VLANs for prod, dev/staging, IoT, and guests
- [ ] Document VLAN/subnet map
- [ ] Add inter-VLAN firewall rules with default deny and explicit allow

## Environments
### 10. Environment clarity
- [ ] Define what is production, testing, and staging today
- [ ] Keep dev/staging separate from production
- [ ] Establish naming conventions for hosts, services, and environments

- [ ] Proper distinction between production, development, and staging
- [ ] Dev/staging on separate VLAN from production
- [ ] Clear naming conventions for which services are which environment
### 11. Documentation discipline
- [ ] Document everything hosted
- [ ] Keep service inventory current with host, access path, backup method, and owner
- [ ] Record architecture changes in the second brain as they happen

## Notes
## Suggested sequencing

### Priority direction
Backup foundation first, then hosting apps, then cleanup and monitoring.
### Phase A: make the platform safe
- backup matrix
- decide where Proxmox Backup Server lives
- Proxmox backup
- VPS stabilization (swap, monitoring, proxy inventory)
- one primary access path per service

### VPN evaluation criteria
- Ease of setup + maintenance
- Works on all devices (Claudio's + Alena's)
- Integrates with existing Cloudflare/Pangolin setup
- Performance on mobile
### Phase B: reduce ambiguity
- service inventory with host + purpose + access path + backup path
- confirm Tailscale for admin
- confirm Cloudflare for user-facing apps
- deprecate ZeroTier unless needed
- decide Pangolin/Auth vs Cloudflare role clearly
- decide what goodolddell is for, and remove misplaced services

### Monthly maintenance checklist (to build)
- Verify backups ran successfully
- Check disk usage on all nodes
- Review logs for errors
- Test at least one restore
- Update Docker images / system packages
- Check SSL certificate expiration
- Verify VPN connectivity
- Review access logs for anomalies
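Some checklist items, such as the disk usage check, are easy to automate and would help with the 1 hour/month target. The sketch below covers only that item for local mount points. The 80% threshold and the function names are assumptions, and checking "all nodes" would still need a way to run it on each host.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return used space on the filesystem containing `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # pseudo-filesystems can report a zero size
    return 100.0 * usage.used / usage.total


def disks_over_threshold(paths, threshold=80.0):
    """Map each path at or above the threshold to its rounded usage percentage."""
    flagged = {}
    for path in paths:
        percent = disk_usage_percent(path)
        if percent >= threshold:
            flagged[path] = round(percent, 1)
    return flagged


if __name__ == "__main__":
    # "/" is only an example mount point; list the real ones per node.
    print(disks_over_threshold(["/"]))
```

An empty result means nothing needs attention, which keeps the monthly review of this item to a glance.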
### Phase C: migrate deliberately
- Gitea migration
- Joplin migration
- Immich production validation
- Paperless backup/access decision
- monitoring deployment

### Phase D: operational polish
- status page
- maintenance checklist and reminder
- VLAN and firewall cleanup
- full documentation coverage

## Open decisions Claudio still needs to make
- Should **goodolddell** become the Proxmox Backup Server, or stay a simpler backup/restic host?
- Which backup target should be primary for Proxmox and service backups: goodolddell, Synology, or both?
- Buy the parents' 4-bay NAS now or later?
- Is Proxmox meant to be the main long-term app host, or mainly test/transition infrastructure?
- Does Authentik stay in the near-term critical path, or should Cloudflare carry more of the access burden for now?
- Is a second VPS actually needed, or is the current VPS just under-instrumented?
- Where should Uptime Kuma live, and what exact read-only access should Orik have?

## Source consolidation notes
This list was curated from:
- `TASKS.md` homelab task state
- `02_Projects/Home Lab.md`
- `02_Projects/Home Lab Plan.md`
- `04_Topics/Self-Hosting.md`
- `05_Resources/Home Lab Architecture.md`
- `05_Resources/Home Lab Inventory.md`
- `05_Resources/Home Lab Incidents.md`
- `06_Decisions/Home Lab Principles.md`
- `06_Decisions/Remote Access V1 Proposal.md`
- `02_Projects/Immich Testing Plan.md`
- workspace memory notes referencing homelab follow-ups
- Claudio additions on 2026-04-15: delete Immich from goodolddell, set up Proxmox Backup Server, set up Uptime Kuma or equivalent with read access for Orik, simplify hosting/homelab boundaries