
Tongs

Unrestricted permissive chat with visible thinking

Status: Active
Primary Stack: Next.js 16 · React 19 · TypeScript
Depends On: Forge

Sometimes I want to reason with a model without the context window filled with every prior conversation and every proactive-system tool wired in. Tongs is that — clean permissive chat, visible thinking, nothing else.

[Image: Tongs schematic — a pair of blacksmithing tongs gripping a glowing amber ember]

There’s a mode of working with a model that ARIA can’t provide — no memory, no tools, no prior-conversation context, no proactive-system wiring. Just the model, a prompt, and its thinking visible inline. That mode existed before Tongs as a Python CLI called sec_bench, then as a TypeScript terminal app built on Ink called rig. Tongs is the third iteration: same ethos, now a DM-style web chat at tongs.niclydon.io. The name is the blacksmithing tool you grip the hot metal with — the literal interface to Forge.

Next.js 16 App Router + React 19 + Tailwind v4 on the frontend, assistant-ui primitives for the transcript, and Vercel AI SDK v6 for the streaming plumbing. The provider is @ai-sdk/openai pointed at Forge via a custom baseURL, so any model in the Forge catalog is reachable — but the default is the abliterated qwen3.5-permissive-35b-a3b build, which responds to anything without the usual corporate-model refusals. Tailnet-only on Furnace. No database, no memory, no tool registry — the whole point is that each conversation starts empty and ends on refresh.
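The provider wiring is roughly this shape — a sketch only, with a placeholder endpoint and the key treated as a dummy, since the real values live in the deployment's config (`createOpenAI` and `.chat()` per the `@ai-sdk/openai` docs):

```typescript
import { createOpenAI } from "@ai-sdk/openai";

// Illustrative values: the real Forge endpoint and credentials are
// configured server-side; treat both literals as placeholders.
const forge = createOpenAI({
  baseURL: "https://forge.example/v1", // Forge's OpenAI-compatible API
  apiKey: "unused-on-tailnet",         // local gateways typically ignore the key
});

// Any model in the Forge catalog is reachable by id through the same provider.
const model = forge.chat("qwen3.5-permissive-35b-a3b");
```

Because the provider is just an OpenAI-compatible client with a swapped `baseURL`, switching models is a one-string change rather than new plumbing.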

The most interesting sharp edge was making <think>…</think> render as actual thinking UI. Forge inline-wraps Qwen3-family reasoning as <think> tags by default (opted in via chat_template_kwargs.enable_thinking=true), and the AI SDK will happily stream those tags as plain content — which shows up in the transcript as the model rambling to itself before answering. The fix is a custom fetch shim around createOpenAI that injects the thinking flag on outbound calls, plus extractReasoningMiddleware({ tagName: "think" }) on the returned LanguageModel, which splits the wrapped reasoning back into proper reasoning-start / reasoning-delta / reasoning-end SSE parts. The existing <ReasoningPart> in assistant-ui renders the pulsing “thinking → thought” UI without any other change. Lineage lesson from sec_bench and rig: the interface keeps getting lighter, and letting the model do more of the work is the right direction.
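The request-side half of that fix can be sketched as a fetch wrapper that injects the flag into each outbound JSON body. This is a minimal illustration, not the Tongs source — the function and type names are mine, and the real shim is passed as the `fetch` option to `createOpenAI`:

```typescript
// Loosely-typed stand-in for the fetch signature the AI SDK provider accepts.
type FetchLike = (
  url: string,
  init?: { method?: string; body?: string; headers?: Record<string, string> }
) => Promise<unknown>;

// Wrap a base fetch so every outbound chat-completions payload carries
// chat_template_kwargs.enable_thinking = true, opting Forge into inline
// <think> reasoning for Qwen3-family models.
function withThinkingFlag(baseFetch: FetchLike): FetchLike {
  return (url, init) => {
    if (init?.body) {
      const payload = JSON.parse(init.body);
      payload.chat_template_kwargs = {
        ...(payload.chat_template_kwargs ?? {}),
        enable_thinking: true,
      };
      return baseFetch(url, { ...init, body: JSON.stringify(payload) });
    }
    return baseFetch(url, init);
  };
}
```

On the response side, the AI SDK's `wrapLanguageModel({ model, middleware: extractReasoningMiddleware({ tagName: "think" }) })` is the piece that splits the streamed `<think>` content back into reasoning parts, as described above.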

Next.js 16 · React 19 · TypeScript · Tailwind v4 · Vercel AI SDK · assistant-ui
tongs.niclydon.io (tailnet) ↗