mirly

latency · pillar 01

From last syllable
to first token,
in <150ms.

Multi-second lag is the #1 'AI tell' interviewers cite. Mirly runs the entire pipeline an order of magnitude faster than the incumbent — with documented budgets per stage, reproducible benchmarks, and raw data published.

Mirly delivers a first useful token in 127ms p50 and 189ms p95, measured end-to-end from the last syllable of the interviewer’s question to the first pixel rendered on the candidate’s screen. Final Round AI on the same machine measured 1,810ms p50 — a 14× gap.

pipeline budget

Every millisecond, accounted for.

stage · budget · implementation
Audio frame ready · 0ms · getDisplayMedia + getUserMedia capture
STT partial token · 0ms · whisper.cpp on Apple Silicon
Question detection · 0ms · speech_final + heuristic match
LLM cache hit · 0ms · Groq Llama 3.3 instant skeleton
Render · 0ms · CSS opacity transition, no layout
total · 0ms · p50, end-to-end, opt-in telemetry
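The per-stage accounting can be sketched as a simple budget check. The numbers below are illustrative placeholders, not Mirly's published per-stage figures; only the <150ms end-to-end target comes from this page:

```python
# Hypothetical per-stage latency budgets in milliseconds. Placeholder
# values for illustration only — not Mirly's published figures.
budgets_ms = {
    "audio_frame_ready": 5,    # getDisplayMedia + getUserMedia capture
    "stt_partial_token": 45,   # whisper.cpp on Apple Silicon
    "question_detection": 10,  # speech_final + heuristic match
    "llm_first_token": 55,     # Groq Llama 3.3 instant skeleton
    "render": 12,              # CSS opacity transition, no layout
}

# The stage budgets must sum to less than the end-to-end p50 target.
total_ms = sum(budgets_ms.values())
print(f"end-to-end budget: {total_ms}ms")
assert total_ms < 150
```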

benchmark

5–14× faster than the rest.

Same machine, same audio source, same question. Warm p50 in milliseconds — last syllable to first visible token, frame-counted at 60fps.

Mirly · 0ms
Parakeet AI · 0ms
LockedIn AI · 0ms
Verve AI · 0ms
Cluely · 0ms
Sensei AI · 0ms
Final Round AI · 0ms
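Because the measurements are frame-counted at 60fps, each frame spans 1000/60 ≈ 16.7ms. A minimal conversion sketch:

```python
FRAME_MS = 1000 / 60  # one frame at 60fps ≈ 16.67ms

def frames_to_ms(frames: int) -> float:
    """Convert a 60fps screen-recording frame count to milliseconds."""
    return frames * FRAME_MS

# A first token visible 8 frames after the last syllable is ~133ms of latency.
latency = frames_to_ms(8)
```

Frame counting bounds the measurement resolution at one frame, so any reading carries ±16.7ms of quantization error.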

methodology

Reproducible. Auditable.

Published benchmarks should be re-runnable by anyone with the same hardware. Ours are. Raw WAV, screen-recordings, and frame-count spreadsheet at github.com/mirly/latency-benchmark-2026.

Hardware
MacBook Air M2 · 16GB · macOS 14.5 · plugged in, single-app foreground
Network
Gigabit ethernet, London — deliberately worst-case for US-East vendors
Audio source
Pre-recorded 16kHz mono WAV, played into system audio via BlackHole
Question
"Tell me about a time you led a contentious technical decision" — same across all tools
Metric
Last syllable of question → first visible token, 60fps frame count
Runs
10 per tool — 1 cold + 9 warm; p50 = median(warm)
Date
2026-05-15 · vendor versions tabulated below
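The 1-cold + 9-warm protocol maps directly onto a median over the warm runs. A minimal sketch with made-up run times:

```python
from statistics import median

def warm_p50(runs_ms: list) -> float:
    """Drop the first (cold) run, then take the median of the warm runs."""
    _cold, *warm = runs_ms
    return median(warm)

# Hypothetical run times for one tool: index 0 is the cold run,
# indices 1-9 are the warm runs.
runs = [412.0, 130.0, 125.0, 127.0, 131.0, 129.0, 126.0, 128.0, 133.0, 124.0]
p50 = warm_p50(runs)
```

Discarding the cold run keeps one-off costs (model load, connection setup) out of the steady-state figure, and the median is robust to a single warm-run outlier.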

Full teardown: /blog/latency-teardown-6-copilots

Feel the difference.