Infrastructure

The home lab.

Everything I build runs on hardware I own. The home lab is the substrate — two AMD Strix Halo machines linked over Thunderbolt 5, plus a Mac Mini for daily dev work. AWS is for things that need to survive a power cut.

192GB
Combined RAM
160GB
VRAM
8
Always-on models
9+
AWS accounts
FORGE GATEWAY 33 models · 392.3 GB VRAM snapshot just now
22 text · 5 vlm · 1 embeddings · 1 whisper · 1 face · 1 rerank · 1 ocr · 1 tts
  • qwen3.6-chat-35b-a3b text 22 GB
  • qwen3.6-permissive-35b-a3b text 21 GB
  • gemma-4-permissive-31b text 18 GB
  • gemma-4-permissive-8b-e4b vlm 6 GB
  • qwen3-permissive-vl-8b vlm 6 GB
  • qwen3.5-permissive-35b-a3b text 21 GB
  • qwen3-code-30b-a3b text 18 GB
  • qwen3-code-30b-a3b-furnace text 18 GB
  • devstral2-code-24b-furnace text 14 GB
  • qwen3-coder-next-code-80b text 48 GB
  • qwen3-coder-next-permissive-80b text 46 GB
  • qwen3.6-vl-27b vlm 36 GB
  • qwen3-4b text 2.3 GB
  • qwen3-embed-8b embeddings 8 GB
  • qwen3-vl-8b vlm 5.4 GB
  • qwen3-vl-30b-a3b vlm 20 GB
  • nic-style text 6.2 GB
  • whisper-large-v3-turbo whisper 1.5 GB
  • insightface-buffalo face 0.33 GB
  • reranker rerank 0.1 GB
Nodes 4 machines · click for detail
Furnace
primary
Primary compute & gateway
Hardware GMKTec EVO-X2
Specs AMD Strix Halo · 128GB RAM · 96GB VRAM · 1.9TB NVMe + 932GB SSD
  • Forge LLM gateway (port 8642)
  • llama-swap on :8099 (fronts all Furnace LLM traffic)
  • qwen3-next-chat-80b (primary reasoning, preloaded)
  • qwen3-vl-30b-a3b (vision)
  • qwen3-embed-8b (embeddings, preloaded)
  • Whisper (transcription)
  • InsightFace (face detection)
  • Florence-2 OCR
  • Chatterbox TTS
  • Smithy image + video gen (port 8766)
  • PostgreSQL 16 (ARIA + Nexus, ~191 tables)
  • Caddy + CoreDNS
  • Prometheus + Grafana
Crucible
satellite
Satellite compute
Hardware GMKTec EVO-X2
Specs AMD Strix Halo · 64GB RAM · 64GB VRAM · Thunderbolt 5 to Furnace (40 Gbps, 0.12 ms)
  • nexus-scaler (autoscaler for bulk workers)
  • RPC worker (Llama 70B tensor layers)
  • crucible-router (abliterated Qwen3 30B-A3B, port 8090)
  • crucible-postproc (CodeFormer + Real-ESRGAN, port 8770)
  • sd-cli overflow for image + video gen
Anvil
dev
Dev environment
Hardware M4 Mac Mini
Specs Apple Silicon M4 · 16GB unified memory
  • Claude Code workstation
  • ARIA Relay daemon (iMessage bridge)
  • Photo + media sync agents (osxphotos, iMessage DB copy)
  • Local builds for niclydon.io / niclydon.com
Bellows
edge
Out-of-band management
Hardware GL.iNet GL-RM1 KVM
Specs Remote KVM · Ethernet + HDMI capture · power-cycle relay
  • Furnace remote console (video + keyboard)
  • Power cycle / BIOS access
  • Tailscale-reachable OOB
Stack 5 layers
Compute
  • AMD Strix Halo ×2 (Furnace + Crucible)
  • M4 Mac Mini (Anvil)
  • Thunderbolt 5 interconnect (40 Gbps)
Inference
  • llama.cpp (primary runtime)
  • vLLM (batch + bulk)
  • Forge — OpenAI-compatible gateway over everything
  • 8 always-on models, ~65GB VRAM baseline
Data
  • PostgreSQL 16 (ARIA + Nexus, ~191 tables)
  • SQLite (request logging, edge caches)
  • S3 cold storage (media archives)
Network
  • Tailscale tailnet (zero-trust)
  • CoreDNS for *.niclydon.io
  • Caddy with wildcard TLS
  • Cloudflare for public DNS + edge
Cloud
  • AWS Organization with 9+ accounts
  • Centralized secrets in AWS Secrets Manager
  • Vercel for marketing sites