Infrastructure

The home lab.

Everything I build runs on hardware I own. The home lab is the substrate — two AMD Strix Halo machines linked over Thunderbolt 5, plus a Mac Mini for daily dev work. AWS is for things that need to survive a power cut.

192GB

Combined RAM

160GB

VRAM

Always-on models

AWS accounts

FORGE GATEWAY 28 models / 323.7 GB VRAM snapshot just now

17 text / 4 vlm / 2 whisper / 1 embeddings / 1 face / 1 rerank / 1 ocr / 1 tts

gemma-4-chat-12b text 16 GB
gemma-4-chat-26b-a4b text 17 GB
qwen3.6-chat-35b-a3b text 22 GB
q3.6-permissive-kimi-35b-a3b text 29 GB
qwen3-permissive-vl-8b vlm 6 GB
qwen3-code-30b-a3b text 18 GB
qwen3-code-30b-a3b-furnace text 18 GB
qwen3-coder-next-code-80b text 48 GB
qwen3-coder-next-permissive-80b text 46 GB
gemma-4-vl-12b vlm 16 GB
qwen3.6-vl-27b vlm 36 GB
qwen3-embed-8b embeddings 8 GB
qwen3-vl-8b vlm 6 GB
nic-style text 6.2 GB
whisper-large-v3-turbo whisper 1.5 GB
parakeet-tdt-0.6b-v3 whisper 1 GB
insightface-buffalo face 0.33 GB
reranker rerank 1.2 GB
florence2-ocr ocr 0.5 GB
chatterbox-tts tts 1 GB

Nodes 4 machines / click for detail

Furnace

primary

Primary compute & gateway

Hardware GMKTec EVO-X2

Specs AMD Strix Halo · 128GB RAM · 96GB VRAM · 1.9TB NVMe + 932GB SSD

Forge LLM gateway (port 8642)
llama-swap on :8099 (fronts all Furnace LLM traffic)
qwen3-next-chat-80b (primary reasoning, preloaded)
qwen3-vl-30b-a3b (vision)
qwen3-embed-8b (embeddings, preloaded)
Whisper (transcription)
InsightFace (face detection)
Florence-2 OCR
Chatterbox TTS
Smithy image + video gen (port 8766)
PostgreSQL 16 (ARIA + Nexus, ~191 tables)
Caddy + CoreDNS
Prometheus + Grafana

Crucible

satellite

Satellite compute

Hardware GMKTec EVO-X2

Specs AMD Strix Halo · 64GB RAM · 64GB VRAM · Thunderbolt 5 to Furnace (40 Gbps, 0.12 ms)

nexus-scaler (autoscaler for bulk workers)
RPC worker (Llama 70B tensor layers)
crucible-router (abliterated Qwen3 30B-A3B, port 8090)
crucible-postproc (CodeFormer + Real-ESRGAN, port 8770)
sd-cli overflow for image + video gen

Anvil

dev

Dev environment

Hardware M4 Mac Mini

Specs Apple Silicon M4 · 16GB unified memory

Claude Code workstation
ARIA Relay daemon (iMessage bridge)
Photo + media sync agents (osxphotos, iMessage DB copy)
Local builds for niclydon.io / niclydon.com

Bellows

edge

Out-of-band management

Hardware GL.iNet GL-RM1 KVM

Specs Remote KVM · Ethernet + HDMI capture · power-cycle relay

Furnace remote console (video + keyboard)
Power cycle / BIOS access
Tailscale-reachable OOB

Stack 5 layers

Compute

AMD Strix Halo ×2 (Furnace + Crucible)
M4 Mac Mini (Anvil)
Thunderbolt 5 interconnect (40 Gbps)

Inference

llama.cpp (primary runtime)
vLLM (batch + bulk)
Forge — OpenAI-compatible gateway over everything
8 always-on models, ~65GB VRAM baseline

Data

PostgreSQL 16 (ARIA + Nexus, ~191 tables)
SQLite (request logging, edge caches)
S3 cold storage (media archives)

Network

Tailscale tailnet (zero-trust)
CoreDNS for *.niclydon.io
Caddy with wildcard TLS
Cloudflare for public DNS + edge

Cloud

AWS Organization with 9+ accounts
Centralized secrets in AWS Secrets Manager
Vercel for marketing sites