Logbook

A personal activity engine summarized locally on Apple Silicon

Status: Active
Primary Stack: Python · FastAPI · MLX

I pointed the same local Gemma 4 instance at both my screen and my chest and got a unified timeline of my day on a $599 Mac mini — zero cloud calls.

Datasheet schematic of multiple sensor streams converging into a single coalesced timeline

Logbook is the answer to a privacy-shaped question: can I get a unified, queryable record of my day from multiple sensors without a single byte leaving my tailnet? It replaced a cloud-based predecessor, which made per-frame API calls, with one local Gemma 4 multimodal instance running on a 16 GB Mac mini. The narrative that drove the build was deliberately concrete: point the same model at both my screen and the camera on my chest, and have it produce one time-ordered timeline of what I actually did. It was my entry for the dev.to Gemma 4 Challenge, which gave it a hard deadline and forced the architecture to stop being clever and start shipping.

The unit is an “observation”: one moment from one sensor, summarized by Gemma 4 in either frame-extraction or native-video mode. Four production sensors now feed the pipeline, including Mac screen recordings every 90 seconds and two streams from a wearable camera. All of them route through a single Gemma 4 E4B 4-bit MLX instance and land as observation.event.v1 envelopes in one bronze table in Nexus’ Postgres. The same daemon that serves Logbook also absorbed Forge’s vision-language backend, so one process and one ~6 GB model load do double duty. A 15-minute coalescence job rolls cross-source observations into a single paragraph per window. Everything is redacted before it persists.
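The windowing half of that coalescence job can be sketched in a few lines. This is a minimal illustration, not Logbook's actual code: the `Observation` fields, the source names, and the `coalesce` helper are assumptions made for the example (only the `observation.event.v1` schema name and the 15-minute window come from the description above), and the step where the model writes one paragraph per window is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone

WINDOW_MINUTES = 15  # window size taken from the text


@dataclass(frozen=True)
class Observation:
    # Field names are hypothetical; only the schema name appears in the text.
    schema: str
    source: str
    observed_at: datetime
    summary: str


def window_start(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its 15-minute window."""
    floored_minute = ts.minute - ts.minute % WINDOW_MINUTES
    return ts.replace(minute=floored_minute, second=0, microsecond=0)


def coalesce(observations):
    """Group observations from all sources into time-ordered windows."""
    buckets = defaultdict(list)
    for obs in observations:
        buckets[window_start(obs.observed_at)].append(obs)
    # Windows in chronological order, observations sorted within each window.
    return {
        start: sorted(group, key=lambda o: o.observed_at)
        for start, group in sorted(buckets.items())
    }


if __name__ == "__main__":
    utc = timezone.utc
    sample = [
        Observation("observation.event.v1", "screen", datetime(2025, 1, 1, 9, 16, tzinfo=utc), "editing code"),
        Observation("observation.event.v1", "camera", datetime(2025, 1, 1, 9, 14, tzinfo=utc), "at the desk"),
        Observation("observation.event.v1", "screen", datetime(2025, 1, 1, 9, 3, tzinfo=utc), "reading docs"),
    ]
    for start, group in coalesce(sample).items():
        print(start.isoformat(), [o.source for o in group])
```

Flooring to a fixed window boundary keeps the grouping independent of which sensor reported first, so screen and camera observations from the same quarter hour end up in the same bucket.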

The instructive bug was the day the model wasn’t actually watching. The native-video pass was calling the chat template without the video path, so the formatted prompt never got a <video> placeholder and the bytes never participated in attention. The model was confidently describing clips it had never seen. I caught it by A/B-ing two identical clips and noticing that the output hashes matched pre-fix and diverged after a one-branch fix. The first post-fix summary, “The screen displays a code editor with several files open…”, marked the first time it had genuinely looked. There was also an NVMe staging redesign once a spinning disk turned out to be the real bottleneck on the media path, the kind of thing you only find when the model is finally fast enough to expose the disk.
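A hash comparison of that kind is cheap to keep around as a regression check. The sketch below is an illustration under stated assumptions, not the check Logbook actually runs: it assumes deterministic decoding and two requests that differ only in the video they carry, and `summarize`, the two stand-in functions, and the clip paths are all hypothetical.

```python
import hashlib
from typing import Callable


def digest(text: str) -> str:
    """Stable fingerprint of a model summary."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def model_ignores_input(
    summarize: Callable[[str], str], clip_a: str, clip_b: str
) -> bool:
    """Return True when two different inputs produce byte-identical summaries.

    With deterministic decoding, identical output for different video
    inputs suggests the video never reached the model.
    """
    return digest(summarize(clip_a)) == digest(summarize(clip_b))


# Hypothetical stand-ins for the real summarization call.
def blind_summarize(clip_path: str) -> str:
    # Mimics the bug: the output does not depend on the clip at all.
    return "A generic description that ignores the clip."


def sighted_summarize(clip_path: str) -> str:
    # Mimics the fixed path: the output depends on the clip.
    return f"Summary derived from {clip_path}"


if __name__ == "__main__":
    print(model_ignores_input(blind_summarize, "a.mp4", "b.mp4"))
    print(model_ignores_input(sighted_summarize, "a.mp4", "b.mp4"))
```

Comparing hashes rather than reading the summaries is the point: a model that ignores its input can still produce fluent, plausible text, so the signal has to come from whether the output changes when the input does.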

Python · FastAPI · MLX · Gemma 4 · PostgreSQL · Swift