Voice Input for Claude Code on Windows
Push to talk. Audio stays on your PC. Works with any Claude Code auth method.
Claude Code shipped a native /voice slash command in v2.1.69 (March 2026), but it works only with claude.ai accounts and streams audio to Anthropic's servers. Whisperstream is a $29 push-to-talk dictation app for Windows. It runs on your PC, types into Claude Code's prompt for you, and works with any auth method including API key, Bedrock, Vertex, and Foundry.
At a glance
Claude Code's native /voice slash command requires v2.1.69 or later (March 2026), works only with claude.ai accounts, and streams audio to Anthropic's servers. Whisperstream is a $29 push-to-talk dictation app for Windows that runs on your PC, types into Claude Code's prompt for you, and works with any auth method (claude.ai, API key, Bedrock, Vertex, and Foundry).
- Auth methods: Whisperstream works with any; /voice works with claude.ai only
- Where audio goes: stays on your PC (Whisperstream) vs streamed to Anthropic (/voice)
- Pricing: $29 once (Whisperstream) vs free with claude.ai Pro at $20/mo (/voice)
/voice only runs on claude.ai accounts
Anthropic's voice-dictation docs are explicit: “The speech-to-text service is only available when you authenticate with a Claude.ai account, and is not available when Claude Code is configured to use an Anthropic API key directly, Amazon Bedrock, Google Vertex AI, or Microsoft Foundry.” If your team runs Claude Code on a raw API key, in Bedrock, in Vertex, or in Microsoft Foundry, the /voice command will not start.
Whisperstream sidesteps the restriction by working a layer below Claude Code's auth. It pastes a transcript into whatever text field has focus, the same way you would type. It does not call Anthropic's API; it does not know how Claude Code authenticates. Whichever auth path you use, Whisperstream still works.
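As a rough illustration, the auth rule quoted above can be written as a lookup. This is a minimal sketch, not Claude Code's actual logic: the `AuthMethod` enum and both function names are hypothetical, and the only fact encoded is the one from Anthropic's docs, that /voice needs a claude.ai account.

```python
from enum import Enum


class AuthMethod(Enum):
    """Auth paths named in Anthropic's voice-dictation docs (labels are illustrative)."""
    CLAUDE_AI = "claude.ai account"
    API_KEY = "Anthropic API key"
    BEDROCK = "Amazon Bedrock"
    VERTEX = "Google Vertex AI"
    FOUNDRY = "Microsoft Foundry"


def native_voice_available(auth: AuthMethod) -> bool:
    # Per the quoted docs, only claude.ai authentication enables /voice.
    return auth is AuthMethod.CLAUDE_AI


def dictation_options(auth: AuthMethod) -> list[str]:
    # A paste-based tool never touches Claude Code's auth, so it is always listed.
    options = ["clipboard-paste dictation"]
    if native_voice_available(auth):
        options.append("/voice")
    return options
```

For a Bedrock or API-key setup, `dictation_options` returns only the paste-based option, which is the situation this page is written for.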
Side by side
Sources
- Anthropic: Claude Code voice dictation docs (auth restrictions)
- Anthropic: Claude Code voice dictation docs (cloud transcription)
- Anthropic: Claude Code data usage (consumer retention)
- Anthropic: Claude.ai pricing
- Anthropic: Claude Code voice dictation docs (platforms, WSL)
- Claude Code LICENSE.md (proprietary, source-available)
The TUI accepts clipboard paste
Claude Code's terminal UI is selective about how text is delivered to its prompt. The Anthropic issue tracker has a recent report (issue #51725) where another dictation tool found its keystroke-simulated input rejected by Claude Code's TUI while clipboard paste worked. The reporter's own words: “Claude Code's TUI seems to reject [keystroke simulation]. Clipboard-based paste works fine, suggesting the TUI only accepts bracketed-paste input.”
Whisperstream uses clipboard paste. Audio gets transcribed locally, the transcript lands on your clipboard, and a Ctrl+V keystroke pastes it into whichever window has focus. That is the same delivery method the issue identifies as working. The page you are reading is partly a way to point at that primary-source confirmation so you do not have to test it yourself.
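For readers unfamiliar with bracketed paste: when an application enables it, the terminal wraps pasted text in the escape sequences ESC[200~ and ESC[201~, so the application can tell a paste from typed keys. The sketch below shows that framing. It is an illustration only: the function names are hypothetical, and it does not reproduce Claude Code's input handling, which the issue reporter could only infer from behavior.

```python
# Standard bracketed-paste markers sent by the terminal around pasted text.
PASTE_START = "\x1b[200~"
PASTE_END = "\x1b[201~"


def wrap_bracketed_paste(text: str) -> str:
    """Return what a terminal in bracketed-paste mode delivers for a paste."""
    return f"{PASTE_START}{text}{PASTE_END}"


def split_input(stream: str) -> list[tuple[str, str]]:
    """Split raw terminal input into ("typed", ...) and ("pasted", ...) chunks."""
    chunks: list[tuple[str, str]] = []
    pos = 0
    while pos < len(stream):
        start = stream.find(PASTE_START, pos)
        if start == -1:
            # No further paste markers: the rest arrived as ordinary keys.
            chunks.append(("typed", stream[pos:]))
            break
        if start > pos:
            chunks.append(("typed", stream[pos:start]))
        body_start = start + len(PASTE_START)
        end = stream.find(PASTE_END, body_start)
        if end == -1:
            # Unterminated paste: treat the remainder as pasted text.
            chunks.append(("pasted", stream[body_start:]))
            break
        chunks.append(("pasted", stream[body_start:end]))
        pos = end + len(PASTE_END)
    return chunks
```

A dictated prompt delivered with Ctrl+V would show up as a single "pasted" chunk, whereas simulated keystrokes would arrive as "typed" input with no markers.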
Where the audio goes
Per Anthropic's docs, “voice dictation streams your recorded audio to Anthropic's servers for transcription. Audio is not processed locally.” That audio falls under Claude Code's general consumer data retention, which the docs describe as up to five years for accounts that allow data use for model improvement and 30 days otherwise.
Whisperstream transcribes on your PC. The audio frame buffer lives in memory; the ASR model (NVIDIA Parakeet TDT v3) runs on-CPU via ONNX Runtime; the transcript goes to your clipboard and gets pasted into the focused window. Nothing leaves the machine unless you decide to send it.
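The in-memory part of that pipeline can be sketched as a small push-to-talk buffer. This is an illustration under stated assumptions, not Whisperstream's code: the class name, the frame cap, and the choice to drop the oldest frames once the cap is reached are all hypothetical. The transcription and paste steps are left out.

```python
from collections import deque


class PushToTalkBuffer:
    """Holds raw audio frames in memory only while the hotkey is held (hypothetical)."""

    def __init__(self, max_frames: int = 3000):
        # Assumed cap; a deque with maxlen silently drops the oldest frames when full.
        self._frames: deque[bytes] = deque(maxlen=max_frames)
        self._recording = False

    def press(self) -> None:
        # Hotkey down: start a fresh recording.
        self._frames.clear()
        self._recording = True

    def feed(self, frame: bytes) -> None:
        # Called for every captured frame; ignored unless the hotkey is held.
        if self._recording:
            self._frames.append(frame)

    def release(self) -> bytes:
        # Hotkey up: return the captured audio and wipe the buffer.
        self._recording = False
        audio = b"".join(self._frames)
        self._frames.clear()
        return audio
```

The bytes returned by `release()` would then go to the local speech model, and nothing is written to disk or sent over the network along the way.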
When /voice is the right answer
There are real cases where Anthropic's native /voice is the right answer, not Whisperstream.
- You authenticate Claude Code with a claude.ai Pro or Max account and only ever use Claude Code, not Cursor or Windsurf or VS Code. /voice is built in.
- You work primarily on macOS or Linux. Whisperstream is Windows-only today.
- You want the cloud round-trip on purpose, for example to keep an audit trail of voice prompts on a managed Anthropic account.
- You are comfortable with up to five years of audio retention if your account opts in to model improvement.
For a broader Windows survey beyond Claude Code, see our Wispr Flow alternatives roundup or the offline dictation roundup. If none of the cases above apply to you, here is how setup looks in about five minutes.
Set up Whisperstream for Claude Code
Install Whisperstream
Download the installer from this page and run it. Whisperstream runs on Windows 10 or 11 x64 with about 4 GB of free RAM; no GPU required. The first launch downloads the speech model, about 600 MB, which takes a few minutes. After that, everything runs offline.
Set your push-to-talk hotkey
Open Whisperstream's settings and pick a hotkey. The default is Right Shift, which most users keep.
Open Claude Code in your terminal
Windows Terminal, PowerShell, WezTerm, or Alacritty: any terminal works. Whisperstream pastes into whatever window has focus.
Put the cursor in Claude Code's prompt
The same place you would type a prompt today.
Hold the hotkey and speak, then release
Whisperstream pastes the transcript into Claude Code's prompt. Press Enter to submit, the same as if you had typed it.
The same five steps work for Cursor, Windsurf, VS Code chat, and any other editor that takes typed text. That is the point.
Works with the rest of your stack
Whisperstream pastes text into whatever window has focus. Anything you can type into, you can dictate into.
- Cursor: Push to talk into Cursor's chat panel. No plugin, no integration, the same hotkey.
- Windsurf: Same setup, same hotkey. Windsurf's chat is a text field; Whisperstream types into it.
- VS Code: Works with GitHub Copilot chat, Continue, and any extension that takes typed prompts.
- Codex, Aider, OpenCode: Whisperstream is auth-agnostic and app-agnostic. If your agent reads a prompt, you can dictate into it.
See all features for the broader picture of how Whisperstream fits into a Windows dev setup.
No Typing, Just Speaking. Fully Local.
Private dictation for Windows. No cloud processing. No subscription.