Twelve ASR Backends, One UI
Soniox, Volcengine, ElevenLabs, Mistral AI, Gladia, Deepgram, AssemblyAI, Cloudflare Workers AI, SiliconFlow, Groq, local OpenAI-compatible, and local whisper.cpp — three execution modes in one app.
Capture system audio. Transcribe with twelve ASR backends. Review with AI — correction, summaries, chat, mind maps. All local-first.
Real-time transcription, AI review, and more — all in one desktop app

Premium UI overhaul, eight color themes, AI streaming chat, real-time waveform, and more
Upload audio or video files and transcribe them offline, that is, as a batch job rather than a live stream. Ten cloud ASR engines are supported (Soniox, Volcengine, ElevenLabs, Mistral, Gladia, Deepgram, AssemblyAI, Cloudflare, SiliconFlow, and Groq), with speaker diarization and word-level timestamps.
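As a rough illustration of how word-level timestamps and diarization labels can be combined, the sketch below groups consecutive words from the same speaker into turns. The `TimedWord` and `SpeakerTurn` shapes and the `groupBySpeaker` helper are hypothetical names chosen for this example; the actual fields returned by each engine differ and are not described here.

```typescript
// Hypothetical, engine-neutral shape for one recognized word.
interface TimedWord {
  text: string;
  startMs: number;
  endMs: number;
  speaker?: string; // diarization label, if the engine supplied one
}

// A run of consecutive words attributed to the same speaker.
interface SpeakerTurn {
  speaker: string;
  startMs: number;
  endMs: number;
  text: string;
}

// Merge consecutive words with the same speaker label into turns.
// Words without a label are grouped under "unknown" (an assumption of this sketch).
function groupBySpeaker(words: TimedWord[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const w of words) {
    const speaker = w.speaker ?? "unknown";
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speaker === speaker) {
      last.text += " " + w.text;
      last.endMs = w.endMs;
    } else {
      turns.push({ speaker, startMs: w.startMs, endMs: w.endMs, text: w.text });
    }
  }
  return turns;
}
```

Each turn keeps the start time of its first word and the end time of its last, which is enough to render a speaker-labelled transcript with clickable timestamps.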
New ASR provider: Whisper-based transcription via Cloudflare Workers AI. Low cost with a generous free tier, plus a VAD filter and anti-hallucination measures.
Gladia Solaria-1 real-time streaming with sub-300ms latency and support for 100+ languages. An embedded proxy handles session init and authentication.
Streaming correction text now persists across tab switches, with a real-time progress display (character count, elapsed time) and improved AI analysis status tracking.
AI post-processing now automatically uses the corrected transcript when one is available. The preference is configurable (Auto / Always Original / Always Corrected), with real-time status banners.
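The three-way preference can be sketched as a small selection function. All names here (`TranscriptPreference`, `selectTranscript`, `fellBack`) are hypothetical, and the fallback to the original text when "Always Corrected" is chosen but no correction exists yet is an assumption of this sketch, not documented behavior.

```typescript
// Hypothetical names for illustration only.
type TranscriptPreference = "auto" | "always-original" | "always-corrected";

interface TranscriptVersions {
  original: string;
  corrected?: string; // absent until AI correction has produced a result
}

interface TranscriptSelection {
  text: string;
  source: "original" | "corrected";
  fellBack: boolean; // true when the preferred version was not available
}

function selectTranscript(
  versions: TranscriptVersions,
  preference: TranscriptPreference,
): TranscriptSelection {
  const corrected = versions.corrected;
  const hasCorrected = corrected !== undefined && corrected.trim().length > 0;

  if (preference === "always-original") {
    return { text: versions.original, source: "original", fellBack: false };
  }
  if (hasCorrected) {
    // "auto" and "always-corrected" both use the corrected text when it exists.
    return { text: corrected as string, source: "corrected", fellBack: false };
  }
  // Assumption: with no corrected text, use the original and flag the
  // fallback only if the user explicitly asked for the corrected version.
  return {
    text: versions.original,
    source: "original",
    fellBack: preference === "always-corrected",
  };
}
```

A status banner could then be driven by `source` and `fellBack`, for example to tell the user that a summary was generated from the original transcript because no correction was available.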
Provider selection reordered to: Soniox, Volcengine, ElevenLabs, Mistral AI, Gladia, Deepgram, AssemblyAI, Cloudflare, SiliconFlow, Groq, Local OpenAI, whisper.cpp.
Expanded test suite with 314 tests across 32 files, including AI correction, hypothesis buffer, PCM/WAV encoding, and all previous coverage areas.
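For readers unfamiliar with the PCM/WAV encoding step mentioned above, here is a minimal sketch of wrapping floating-point samples in a 16-bit PCM WAV container. The function name `encodeWav` and the 16 kHz mono defaults are assumptions for illustration; the app's own encoder may be structured differently.

```typescript
// Encode float samples in [-1, 1] as a 16-bit little-endian PCM WAV file.
// For multi-channel audio the samples are assumed to be interleaved.
function encodeWav(samples: Float32Array, sampleRate = 16000, channels = 1): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize); // 44-byte canonical header
  const view = new DataView(buffer);

  const writeAscii = (offset: number, s: string): void => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * channels * bytesPerSample, true); // byte rate
  view.setUint16(32, channels * bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp, then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return new Uint8Array(buffer);
}
```

A unit test for such an encoder typically checks the header magic strings, the declared data size, and the clamping of out-of-range samples.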
WindowedBatchTranscriptionProvider base class extracted — shared logic for interval-based retranscription, silence detection, and hypothesis buffer management.
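To make the shared pieces more concrete, the sketch below shows one plausible shape for two of them: an RMS-based silence check and a hypothesis buffer that commits a word only once two consecutive retranscriptions agree on it. This "local agreement" strategy, the `isSilent` and `HypothesisBuffer` names, and the 0.01 threshold are assumptions for illustration; the source does not say how the base class actually implements either feature.

```typescript
// Treat a chunk as silent when its RMS energy falls below a threshold.
// The default threshold is an arbitrary illustrative value.
function isSilent(samples: Float32Array, rmsThreshold = 0.01): boolean {
  if (samples.length === 0) return true;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) sumSquares += samples[i] * samples[i];
  return Math.sqrt(sumSquares / samples.length) < rmsThreshold;
}

// Commits words that are identical, position by position, in two
// consecutive hypotheses for the same audio window. Simplification:
// every hypothesis is assumed to start at the same point in the audio.
class HypothesisBuffer {
  private committed: string[] = [];
  private previous: string[] = [];

  // Feed the latest full-window hypothesis; returns the newly committed words.
  update(hypothesis: string): string[] {
    const words = hypothesis.trim().split(/\s+/).filter((w) => w.length > 0);
    const newlyCommitted: string[] = [];
    // Compare only beyond what is already committed.
    for (let i = this.committed.length; i < words.length && i < this.previous.length; i++) {
      if (words[i] !== this.previous[i]) break;
      newlyCommitted.push(words[i]);
    }
    this.committed.push(...newlyCommitted);
    this.previous = words;
    return newlyCommitted;
  }

  // Stable text that will not change on later updates.
  committedText(): string {
    return this.committed.join(" ");
  }

  // Tentative tail of the latest hypothesis, still subject to revision.
  pendingText(): string {
    return this.previous.slice(this.committed.length).join(" ");
  }
}
```

In a windowed provider, each retranscription interval would skip silent chunks, send the accumulated window to the batch endpoint, and pass the result to `update`, showing `committedText()` as final and `pendingText()` as provisional.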
Main process tray menu, dialog titles, and system notifications now respect the user's language setting.