BridgeVoice
Privacy-first voice dictation for builders. On-device Whisper transcription, universal text injection, and zero cloud dependency.
Talk instead of type. BridgeVoice is a privacy-first desktop voice dictation app that turns your speech into text and pastes it directly into whatever app you're focused on — your editor, terminal, browser, or anything else.
Transcription happens on your machine using Whisper, an open-source speech recognition model from OpenAI. Your audio never leaves your device, and it works offline.
Key Features
- On-device transcription — Whisper runs locally on your machine. Nothing is sent to the cloud.
- Universal text injection — Transcribed text is pasted into the currently focused app. Works with every desktop application.
- Sub-500ms latency — From the moment you stop speaking to text appearing on screen.
- Offline support — No internet connection required for local transcription.
- Cloud transcription — Optional Groq-powered transcription for 99+ languages (requires BridgeMind Pro).
- Push-to-Talk and Toggle modes — Hold a key to record, or press once to start and again to stop.
- Custom dictionary — Automatic text replacements for technical terms (e.g., "web hook" → "WebHook").
- Transcription history — Every transcription is saved locally with timestamps, word counts, and duration.
Installation
macOS
Download BridgeVoice from bridgemind.ai.
- Apple Silicon (ARM64) — Metal GPU acceleration for fast transcription
- Intel (x86_64) — CPU transcription
Windows
Download the installer from bridgemind.ai.
Linux (Experimental)
Available as AppImage and .deb packages.
Getting Started
- Download and install BridgeVoice from bridgemind.ai.
- Grant microphone permissions when prompted.
- Download a Whisper model — Open Settings and choose a model size. Start with "Base" for a balance of speed and accuracy.
- Set your hotkey — Configure a Push-to-Talk key in Settings (e.g., Right Option).
- Start dictating — Hold your hotkey, speak, and release. Text appears in your focused app.
Transcription Modes
Local (Whisper)
Audio is processed entirely on your machine — nothing is sent to the cloud.
Model Sizes
| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| Tiny | 75 MB | Fastest | Basic | Quick notes, commands |
| Base | 142 MB | Fast | Good | General dictation |
| Small | 466 MB | Moderate | Better | Longer dictation |
| Medium | 1.5 GB | Slower | Great | Detailed transcription |
| Large | 3.1 GB | Slowest | Best | Maximum accuracy |
| Distil-Large | ~1.5 GB | Fast | Great | Best speed-to-accuracy ratio |
On Apple Silicon Macs, BridgeVoice uses Metal GPU acceleration, which makes transcription roughly 10x faster than CPU-only processing.
Cloud
Optional Groq-powered cloud transcription, accessed through the BridgeMind API. Requires a BridgeMind Pro subscription.
- 99+ languages with automatic language detection
- Audio is uploaded securely over HTTPS
Switch between Local and Cloud modes in the BridgeVoice dashboard.
Recording Modes
Push-to-Talk
Hold your configured hotkey to record. Release to stop recording and trigger transcription. This is the default mode and works well for short dictation bursts.
Toggle Recording
Press your hotkey once to start recording, press again to stop. Better for longer dictation sessions where holding a key is uncomfortable.
Configure your preferred mode and hotkey in Settings → Recording.
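The difference between the two modes comes down to how hotkey press and release events change the recording state. The following is a minimal sketch of that logic, not BridgeVoice's actual implementation; the `Mode` and `Recorder` names are hypothetical, and key-repeat events are ignored for simplicity.

```python
from enum import Enum


class Mode(Enum):
    PUSH_TO_TALK = "push_to_talk"
    TOGGLE = "toggle"


class Recorder:
    """Hypothetical helper: tracks recording state from hotkey events."""

    def __init__(self, mode: Mode = Mode.PUSH_TO_TALK) -> None:
        self.mode = mode
        self.recording = False

    def on_key_down(self) -> None:
        if self.mode is Mode.PUSH_TO_TALK:
            # Holding the key starts recording.
            self.recording = True
        else:
            # In toggle mode each press flips the state.
            self.recording = not self.recording

    def on_key_up(self) -> None:
        if self.mode is Mode.PUSH_TO_TALK:
            # Releasing the key stops recording (transcription would start here).
            self.recording = False
        # In toggle mode a key release changes nothing.
```

In Push-to-Talk mode the recording lasts exactly as long as the key is held; in Toggle mode the key can be released immediately and recording continues until the next press.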
Text Injection
After transcription, BridgeVoice injects the text into your currently focused application using a clipboard-and-paste method:
- Transcribed text is copied to the system clipboard
- A keyboard shortcut (Cmd+V on macOS, Ctrl+V on Windows) is simulated
- Text appears in the focused application
This approach works universally across all desktop apps — editors, terminals, browsers, chat apps, and more. Full Unicode support is included.
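The clipboard-and-paste flow can be sketched as follows. This is an illustration under stated assumptions, not BridgeVoice's code: `inject_text`, `paste_shortcut`, and the two callback parameters are hypothetical names, and using Ctrl+V on platforms other than macOS (including Linux) is an assumption, since only the macOS and Windows shortcuts are documented above.

```python
import sys
from typing import Callable, Tuple


def paste_shortcut(platform: str = sys.platform) -> Tuple[str, str]:
    """Return the (modifier, key) pair used for paste on a platform."""
    # Cmd+V on macOS; Ctrl+V elsewhere (assumed for anything but macOS/Windows).
    return ("cmd", "v") if platform == "darwin" else ("ctrl", "v")


def inject_text(
    text: str,
    set_clipboard: Callable[[str], None],
    send_keys: Callable[[str, str], None],
    platform: str = sys.platform,
) -> None:
    """Copy text to the clipboard, then simulate the paste shortcut."""
    # Step 1: place the transcription on the system clipboard.
    set_clipboard(text)
    # Step 2: simulate the paste shortcut so the focused app receives the text.
    modifier, key = paste_shortcut(platform)
    send_keys(modifier, key)
```

The clipboard writer and key simulator are passed in as callbacks so the flow can be exercised without touching the real clipboard; a real application would supply platform-specific implementations.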
Widget
BridgeVoice includes a compact always-on-top widget that floats over your other windows:
| State | Appearance |
|---|---|
| Idle | Small pill with BridgeVoice logo |
| Listening | Expanded with real-time audio visualization (7 frequency bands) |
| Processing | Loading indicator while transcription runs |
Double-click the widget to toggle recording. Drag it anywhere on screen.
Custom Dictionary
Create automatic text replacements for terms that Whisper frequently gets wrong:
| Spoken | Replaced With |
|---|---|
| "web hook" | "WebHook" |
| "next js" | "Next.js" |
| "typescript" | "TypeScript" |
| "bridge mind" | "BridgeMind" |
Add entries in Settings → Dictionary. You can also quick-add from the transcription history.
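One way such replacements could work is a case-insensitive, whole-phrase substitution applied to the transcribed text. The sketch below uses the entries from the table above; `apply_dictionary` is a hypothetical name, and the matching rules (case-insensitive, word boundaries, longest phrase first) are assumptions rather than documented behavior.

```python
import re
from typing import Dict

REPLACEMENTS: Dict[str, str] = {
    "web hook": "WebHook",
    "next js": "Next.js",
    "typescript": "TypeScript",
    "bridge mind": "BridgeMind",
}


def apply_dictionary(text: str, replacements: Dict[str, str] = REPLACEMENTS) -> str:
    """Replace each spoken phrase with its dictionary entry."""
    if not replacements:
        return text
    # Try longer phrases first so overlapping entries prefer the longest match.
    phrases = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    # Look up matches by their lowercase form so "Typescript" also matches.
    lookup = {phrase.lower(): value for phrase, value in replacements.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)
```

For example, `apply_dictionary("add a web hook to the next js app")` returns `"add a WebHook to the Next.js app"`.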
Transcription History
Every transcription is saved locally with metadata:
- Text — Full transcription content
- Timestamp — When the transcription occurred
- Word count — Number of words transcribed
- Duration — How long the recording lasted
- Source — Local (Whisper) or Cloud (Groq)
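A history entry with these fields could be modeled as a small record type. This is a sketch only: the `TranscriptionRecord` name, the field names, and the choice to derive the word count by splitting on whitespace are assumptions, not the app's actual storage schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class TranscriptionRecord:
    """Hypothetical shape of one saved transcription."""

    text: str                 # full transcription content
    duration_seconds: float   # how long the recording lasted
    source: str               # "local" (Whisper) or "cloud" (Groq)
    # When the transcription occurred; defaults to the current UTC time.
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def word_count(self) -> int:
        # Assumption: words are whitespace-separated tokens.
        return len(self.text.split())
```

Deriving the word count from the text, rather than storing it separately, keeps the two values from drifting apart.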
Statistics
The dashboard tracks your usage over time:
- Total words transcribed
- Total speaking time
- Session count
- Words per minute average
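These figures can all be derived from per-session word counts and durations. The sketch below assumes the words-per-minute average is total words divided by total speaking time; the dashboard may compute it differently (for example, as a mean of per-session rates), and `usage_stats` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def usage_stats(sessions: List[Tuple[int, float]]) -> Dict[str, float]:
    """Aggregate (word_count, duration_seconds) pairs into usage totals."""
    total_words = sum(words for words, _ in sessions)
    total_seconds = sum(seconds for _, seconds in sessions)
    # Guard against division by zero when nothing has been recorded yet.
    wpm = total_words / (total_seconds / 60) if total_seconds > 0 else 0.0
    return {
        "total_words": total_words,
        "total_speaking_seconds": total_seconds,
        "session_count": len(sessions),
        "words_per_minute": wpm,
    }
```

For two sessions of 120 words in 60 seconds and 60 words in 30 seconds, this gives 180 total words, 90 seconds of speaking time, and an average of 120 words per minute.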
Subscription Tiers
| Feature | Free | Pro |
|---|---|---|
| On-device transcription | Yes | Yes |
| All Whisper model sizes | Yes | Yes |
| Push-to-Talk / Toggle | Yes | Yes |
| Custom dictionary | Yes | Yes |
| Transcription history | Yes | Yes |
| Cloud transcription (Groq) | — | Yes |
| 99+ language support | — | Yes |
| AI text polish (coming soon) | — | Yes |
| Cross-device sync (coming soon) | — | Yes |
System Requirements
| Platform | Minimum |
|---|---|
| macOS | macOS 11 (Big Sur) or later |
| Windows | Windows 10 or later |
| Linux | Ubuntu 20.04+ or equivalent (experimental) |
| RAM | 4 GB minimum (8 GB recommended for Large models) |
| Disk | 200 MB + model size (75 MB – 3.1 GB) |
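The disk requirement is simple arithmetic: 200 MB for the app plus the size of the chosen model. A small sketch using the sizes from the Model Sizes table, assuming decimal units (1 GB = 1000 MB) and treating the approximate Distil-Large figure as 1500 MB; the names here are hypothetical.

```python
from typing import Dict

APP_SIZE_MB = 200

# Sizes from the Model Sizes table, converted to MB (1 GB = 1000 MB assumed).
MODEL_SIZES_MB: Dict[str, int] = {
    "tiny": 75,
    "base": 142,
    "small": 466,
    "medium": 1500,
    "large": 3100,
    "distil-large": 1500,  # listed as "~1.5 GB", so this value is approximate
}


def required_disk_mb(model: str) -> int:
    """Return the app size plus the size of one downloaded model, in MB."""
    return APP_SIZE_MB + MODEL_SIZES_MB[model]
```

With the recommended Base model this comes to 342 MB; with Large it is about 3.3 GB. Each additional downloaded model adds its own size on top.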