barisgunaydin.com
zero-tvm
Phi-3-mini decoding in a browser using 10 hand-written WGSL kernels (27 files, 3,078 LOC). It issues 228 dispatches per token and reaches ~40 tok/s on an M2 Pro, 22% behind WebLLM's 85 TVM-autotuned kernels on identical weights.
WGSL · 50 commits synced · last 2026-05-15