BridgeVoice

Privacy-first voice dictation for builders. On-device Whisper transcription, universal text injection, and zero cloud dependency.

Talk instead of type. BridgeVoice is a desktop voice dictation app that turns your speech into text and pastes it directly into whichever app has focus: your editor, terminal, browser, or anything else.

Transcription happens on your machine using Whisper, an open-source speech recognition model from OpenAI. Your audio never leaves your device, and it works offline.

Key Features

On-device transcription — Whisper runs locally on your machine. Nothing is sent to the cloud.
Universal text injection — Transcribed text is pasted into the currently focused app. Works with every desktop application.
Sub-500ms latency — From the moment you stop speaking to text appearing on screen.
Offline support — No internet connection required for local transcription.
Cloud transcription — Optional Groq-powered transcription for 99+ languages (requires BridgeMind Pro).
Push-to-Talk and Toggle modes — Hold a key to record, or press once to start and again to stop.
Custom dictionary — Automatic text replacements for technical terms (e.g., "web hook" → "WebHook").
Transcription history — Every transcription is saved locally with timestamps, word counts, and duration.

Installation

macOS

Download BridgeVoice from bridgemind.ai.

Apple Silicon (ARM64) — Metal GPU acceleration for fast transcription
Intel (x86_64) — CPU transcription

Windows

Download the installer from bridgemind.ai.

Linux (Experimental)

Available as AppImage and .deb packages.

Getting Started

Download and install BridgeVoice from bridgemind.ai.
Grant microphone permissions when prompted.
Download a Whisper model — Open Settings and choose a model size. Start with "Base" for a balance of speed and accuracy.
Set your hotkey — Configure a Push-to-Talk key in Settings (e.g., Right Option).
Start dictating — Hold your hotkey, speak, and release. Text appears in your focused app.
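
The difference between Push-to-Talk and Toggle mode can be pictured as a small state machine driven by hotkey events. The sketch below is illustrative only: the Recorder class, the Mode enum, and the on_key_down/on_key_up handlers are hypothetical names, not BridgeVoice's actual code, and it assumes the operating system's key-repeat events have already been filtered out.

```python
from enum import Enum


class Mode(Enum):
    PUSH_TO_TALK = "push_to_talk"
    TOGGLE = "toggle"


class Recorder:
    """Tracks whether recording is active for a given hotkey mode (hypothetical)."""

    def __init__(self, mode: Mode) -> None:
        self.mode = mode
        self.recording = False

    def on_key_down(self) -> None:
        if self.mode is Mode.PUSH_TO_TALK:
            # Holding the key starts recording.
            self.recording = True
        else:
            # In Toggle mode each press flips the state.
            self.recording = not self.recording

    def on_key_up(self) -> None:
        if self.mode is Mode.PUSH_TO_TALK:
            # Releasing the key stops recording.
            self.recording = False
        # Toggle mode ignores key releases.
```

In Push-to-Talk mode the recording state follows the key position, while in Toggle mode only presses matter, so a press-and-release pair starts recording and a second pair stops it.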

Transcription Modes

Local (Whisper)

Audio is processed entirely on your machine — nothing is sent to the cloud.

Model Sizes

Model	Size	Speed	Accuracy	Best For
Tiny	75 MB	Fastest	Basic	Quick notes, commands
Base	142 MB	Fast	Good	General dictation
Small	466 MB	Moderate	Better	Longer dictation
Medium	1.5 GB	Slower	Great	Detailed transcription
Large	3.1 GB	Slowest	Best	Maximum accuracy
Distil-Large	~1.5 GB	Fast	Great	Best speed-to-accuracy ratio

On Apple Silicon Macs, BridgeVoice uses Metal GPU acceleration, which makes transcription roughly 10x faster than CPU-only processing.

Cloud

Optional Groq-powered cloud transcription via the BridgeMind API. Requires a BridgeMind Pro subscription.

99+ languages with automatic language detection
Audio is uploaded securely over HTTPS

Switch between Local and Cloud modes in the BridgeVoice dashboard.

Text injection delivers each transcription to the focused app in three steps:

Transcribed text is copied to the system clipboard
The paste shortcut (Cmd+V on macOS, Ctrl+V on Windows) is simulated
Text appears in the focused application

Because it relies only on the clipboard and the standard paste shortcut, this approach works across desktop apps that accept pasted text, including editors, terminals, browsers, and chat apps. Full Unicode support is included.
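
The clipboard-then-paste flow can be sketched as follows. This is a minimal illustration, not BridgeVoice's implementation: inject_text and paste_shortcut are hypothetical names, and the clipboard writer and key sender are passed in as plain callables so that no particular clipboard or input-simulation library is assumed.

```python
import sys
from typing import Callable


def paste_shortcut(platform: str = sys.platform) -> tuple[str, str]:
    """Return the (modifier, key) pair for the paste shortcut."""
    # Cmd+V on macOS; Ctrl+V otherwise. Using Ctrl+V on every non-macOS
    # platform is an assumption: the docs only name macOS and Windows.
    return ("cmd", "v") if platform == "darwin" else ("ctrl", "v")


def inject_text(
    text: str,
    set_clipboard: Callable[[str], None],
    send_keys: Callable[[str, str], None],
    platform: str = sys.platform,
) -> None:
    """Copy text to the clipboard, then simulate the paste shortcut."""
    set_clipboard(text)                       # step 1: clipboard
    modifier, key = paste_shortcut(platform)  # step 2: pick the shortcut
    send_keys(modifier, key)                  # step 3: focused app pastes
```

Passing the two side effects in as callables keeps the sketch testable with simple fakes, for example a list's append method standing in for the clipboard.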

BridgeVoice includes a compact always-on-top widget that floats over your other windows:

State	Appearance
Idle	Small pill with BridgeVoice logo
Listening	Expanded with real-time audio visualization (7 frequency bands)
Processing	Loading indicator while transcription runs

Double-click the widget to toggle recording. Drag it anywhere on screen.

Custom Dictionary

Create automatic text replacements for terms that Whisper frequently gets wrong:

Spoken	Replaced With
"web hook"	"WebHook"
"next js"	"Next.js"
"typescript"	"TypeScript"
"bridge mind"	"BridgeMind"

Add entries in Settings → Dictionary. You can also quick-add from the transcription history.
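
One way to picture dictionary replacement is as a case-insensitive, whole-phrase substitution applied to the transcribed text. The sketch below is an assumption about how such a feature could work, not a description of BridgeVoice's internals; apply_dictionary is a hypothetical helper name.

```python
import re


def apply_dictionary(text: str, replacements: dict[str, str]) -> str:
    """Replace each spoken phrase with its dictionary entry, ignoring case."""
    if not replacements:
        return text
    # Try longer phrases first so a long entry wins over a shorter one it contains.
    phrases = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    # Lowercased lookup so "Web Hook" and "web hook" map to the same entry.
    lookup = {spoken.lower(): written for spoken, written in replacements.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


DICTIONARY = {
    "web hook": "WebHook",
    "next js": "Next.js",
    "typescript": "TypeScript",
    "bridge mind": "BridgeMind",
}
```

With the entries from the table, apply_dictionary("add a web hook to the next js app", DICTIONARY) returns "add a WebHook to the Next.js app". The word-boundary anchors keep a phrase from matching inside a longer word.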

Transcription History

Every transcription is saved locally with metadata:

Text — Full transcription content
Timestamp — When the transcription occurred
Word count — Number of words transcribed
Duration — How long the recording lasted
Source — Local (Whisper) or Cloud (Groq)
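
As an illustration of the metadata above, a history entry could be modeled as a small record. The HistoryEntry class and its field names are hypothetical and do not reflect BridgeVoice's actual storage format; the word count here assumes whitespace-separated words, which would not suit languages written without spaces.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class HistoryEntry:
    """One saved transcription (hypothetical shape)."""

    text: str                # full transcription content
    timestamp: datetime      # when the transcription occurred
    duration_seconds: float  # how long the recording lasted
    source: str              # "local" (Whisper) or "cloud" (Groq)

    @property
    def word_count(self) -> int:
        # Derived from the text by splitting on whitespace.
        return len(self.text.split())
```

Deriving the word count from the text, rather than storing it separately, keeps the two from drifting apart in this sketch.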

Statistics

The dashboard tracks your usage over time:

Total words transcribed
Total speaking time
Session count
Words per minute average
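
The words-per-minute figure can be derived from the other totals. The sketch below assumes the average is computed as total words divided by total speaking time, which may differ from how the dashboard actually calculates it; summarize and words_per_minute are hypothetical names.

```python
def words_per_minute(total_words: int, total_seconds: float) -> float:
    """Overall speaking rate; returns 0.0 when no time has been recorded."""
    if total_seconds <= 0:
        return 0.0
    return total_words / (total_seconds / 60.0)


def summarize(sessions: list[tuple[int, float]]) -> dict[str, float]:
    """Aggregate (word_count, duration_seconds) pairs into dashboard totals."""
    total_words = sum(words for words, _ in sessions)
    total_seconds = sum(seconds for _, seconds in sessions)
    return {
        "total_words": total_words,
        "total_seconds": total_seconds,
        "sessions": len(sessions),
        "words_per_minute": words_per_minute(total_words, total_seconds),
    }
```

For two one-minute sessions of 150 and 50 words, the totals are 200 words over 120 seconds, which gives 100 words per minute.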

Subscription Tiers

Feature	Free	Pro
On-device transcription	Yes	Yes
All Whisper model sizes	Yes	Yes
Push-to-Talk / Toggle	Yes	Yes
Custom dictionary	Yes	Yes
Transcription history	Yes	Yes
Cloud transcription (Groq)	—	Yes
99+ language support	—	Yes
AI text polish (coming soon)	—	Yes
Cross-device sync (coming soon)	—	Yes

System Requirements

Requirement	Minimum
macOS	macOS 11 (Big Sur) or later
Windows	Windows 10 or later
Linux	Ubuntu 20.04+ or equivalent (experimental)
RAM	4 GB (8 GB recommended for Large models)
Disk	200 MB + model size (75 MB – 3.1 GB)
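
The disk requirement is simple arithmetic: 200 MB for the app plus the size of each downloaded model. The sketch below uses the sizes from the Model Sizes table, converting gigabytes to megabytes at 1 GB = 1000 MB (an assumption) and treating the approximate Distil-Large figure as 1500 MB; disk_required_mb is a hypothetical helper.

```python
APP_MB = 200  # base install size from the requirements table

# Model sizes in MB, taken from the Model Sizes table (GB converted at 1000 MB).
MODEL_MB = {
    "tiny": 75,
    "base": 142,
    "small": 466,
    "medium": 1500,
    "large": 3100,
    "distil-large": 1500,  # listed as approximately 1.5 GB
}


def disk_required_mb(*models: str) -> int:
    """Estimate disk space for the app plus the named models."""
    return APP_MB + sum(MODEL_MB[name] for name in models)
```

For example, installing only the Base model needs about 200 + 142 = 342 MB, while keeping both Tiny and Large needs about 200 + 75 + 3100 = 3375 MB.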