zero-tvm

Phi-3-mini decoding in the browser on 10 hand-written WGSL kernels (27 files, 3,078 LOC). 228 dispatches/token at ~40 tok/s on an M2 Pro, 22% behind WebLLM's 85 TVM-autotuned kernels on identical weights.
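
To put those figures in per-dispatch terms, here is a rough back-of-envelope sketch. It uses only the numbers quoted above (~40 tok/s, 228 dispatches/token, 22%). The helper names `decodeBudget` and `impliedBaseline` are hypothetical, and reading "22% behind" as "throughput is 22% lower than WebLLM's" is an assumption, not something the project states.

```typescript
// Back-of-envelope decode budget from the figures quoted above.
// Inputs come from the text; derived values are plain arithmetic, not measurements.
interface DecodeBudget {
  msPerToken: number;          // wall-clock time available per decoded token
  dispatchesPerSecond: number; // compute dispatches issued per second
  avgMsPerDispatch: number;    // average time per dispatch, all overhead included
}

function decodeBudget(tokensPerSecond: number, dispatchesPerToken: number): DecodeBudget {
  const msPerToken = 1000 / tokensPerSecond;
  return {
    msPerToken,
    dispatchesPerSecond: tokensPerSecond * dispatchesPerToken,
    avgMsPerDispatch: msPerToken / dispatchesPerToken,
  };
}

// ASSUMPTION: "22% behind" means throughput is 22% lower than the baseline,
// so baseline = ours / (1 - 0.22). Other readings would give a different number.
function impliedBaseline(tokensPerSecond: number, fractionBehind: number): number {
  return tokensPerSecond / (1 - fractionBehind);
}

const budget = decodeBudget(40, 228);
const baseline = impliedBaseline(40, 0.22);
console.log(budget, baseline.toFixed(1));
```

Under those assumptions this works out to about 25 ms per token, 9,120 dispatches per second, and roughly 0.11 ms per dispatch on average, with an implied WebLLM baseline near 51 tok/s. Since the 40 tok/s figure is itself approximate, treat all of these as order-of-magnitude estimates.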

WGSL · 50 commits synced · last 2026-05-15