TL;DR: Voice typing for terminal stopped being a gimmick the day AI agents moved in: prompts to Claude Code, Codex, and Gemini CLI are long English sentences, and speaking them is faster than typing them. Infina types raw on-device dictation into any Mac terminal for $99 one-time (at the time of writing), with a 7-day refund. And unlike every dictation app that chains you to a hotkey for each dictation, Infina runs hands-free: sit back two feet from the desk, eat lunch, say "type run the test suite again and fix whatever fails", say "send", say "open Warp", and keep going without touching a key.

Why voice typing for terminal suddenly makes sense

For thirty years the terminal was the worst possible place to dictate. Commands were dense flags and paths: tar -xzvf, grep -rn, chmod 755. No speech engine on earth wants that job.

Then the terminal became where AI agents live. Claude Code, Codex, and Gemini CLI all run in a terminal window, and what you feed them all day is not flags. It is English.

"Refactor the auth middleware but keep the session cookie names the same, and update the tests to match." That is a terminal command now. It is 18 words, you will say fifty prompts like it before lunch, and your voice produces them roughly three times faster than your fingers do.

So the real search behind terminal voice input on a Mac is not "how do I dictate ls -la". It is "how do I stop typing paragraphs to my agents". That is exactly the job Infina was built for. If Claude Code is your daily driver, we wrote a dedicated guide to voice typing for Claude Code, and Gemini CLI users have their own guide too.

What a terminal needs from dictation (and what usually breaks)

Terminals are plain text boxes with sharp edges. Dictation tools built for email tend to fail here in two ways.

Autocorrect mangling. A "helpful" dictation layer that capitalizes sentences, swaps words for prettier ones, or fixes your "mistakes" is a menace in a shell. When you say a branch name, a file path, or "npm run build", you need exactly those characters.

Latency and cloud round-trips. If every utterance travels to a server and back, dictation feels like typing over a bad SSH connection.
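To make the first failure mode concrete, here is a minimal Python sketch of the kind of prose clean-up a dictation layer built for email might apply. The `polish` function is hypothetical and written only for this illustration; it is not taken from Infina or from any real dictation product.

```python
def polish(transcript: str) -> str:
    """Hypothetical email-style clean-up: capitalize the first letter, end with a period."""
    text = transcript.strip()
    if not text:
        return text
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text


def raw(transcript: str) -> str:
    """Raw mode: pass the recognized words through untouched."""
    return transcript


spoken = "npm run build"
print(repr(polish(spoken)))  # 'Npm run build.'
print(repr(raw(spoken)))     # 'npm run build'
```

In an email, the polished version is the nicer sentence. In a shell, the trailing period alone changes the script name to `build.`, which is no longer the script you meant.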

Infina's base product is raw on-device dictation by design:

  • Raw output. Speech-to-text runs on your Mac's Neural Engine and what you said is what gets typed. No autocorrect layer rewriting your words. Agents do not care about polish, and shells actively punish it.
  • On-device and offline. Transcription happens entirely on your Mac by default; your audio never leaves the device, and it works on a plane.
  • OS-level typing. Infina types at the operating system level, so it works in Terminal.app, iTerm2, Warp, Ghostty, the VS Code terminal, and any other terminal you run. There is no per-app plugin to install.
  • 95%+ accuracy for clear speech, which for English prompt sentences is the whole job.

Nothing runs until you send it

The reasonable fear about speech to text in a terminal: what if it mishears me and runs something destructive?

It cannot. Dictation types text. That is all it does.

Infina puts your words into the prompt line and stops. Nothing executes until you press Enter yourself, or, in hands-free mode, until you deliberately say "send". A misheard word just sits there on the line, where you can read it, fix it, or clear it, exactly like a typo.

You keep the same review step you have always had. You just stop paying the typing cost before it.
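The safety model described above can be pictured as a line buffer with a separate submit step. The sketch below is a toy Python model for illustration, not Infina's code; the `PromptLine` class and its method names are hypothetical.

```python
class PromptLine:
    """Toy model of a terminal prompt line: text accumulates, nothing runs until send()."""

    def __init__(self):
        self.buffer = ""      # what is currently sitting on the prompt line
        self.submitted = []   # lines that were actually sent

    def type_text(self, text):
        """Dictation only appends characters; it never submits."""
        self.buffer += text

    def clear(self):
        """Discard a misheard line before it is ever sent."""
        self.buffer = ""

    def send(self):
        """The deliberate step, equivalent to pressing Enter."""
        line, self.buffer = self.buffer, ""
        self.submitted.append(line)
        return line


line = PromptLine()
line.type_text("run the best suite again")  # "test" misheard as "best"
assert line.submitted == []                 # nothing has been sent yet
line.clear()                                # fix it like any typo
line.type_text("run the test suite again")
line.send()
print(line.submitted)  # ['run the test suite again']
```

The point of the model is that `type_text` has no path to `submitted`: only the explicit `send` step moves a line from "sitting there" to "executed".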

How to dictate terminal commands with Infina

Setup is two minutes: install Infina, grant the microphone and accessibility permissions it asks for, and open your terminal.

Push-to-talk (the everyday mode). Hold Option (⌥), speak, release. Your words appear at the cursor in the focused terminal. This works everywhere, all the time, in any app.

Hands-free mode (the terminal superpower). Double-tap Cmd (⌘) to toggle it on (it ships off by default). Now you do not touch the keyboard at all:

  1. Say a sentence that starts with "type": "type explain what this stack trace means and propose a fix". Infina types it into the focused terminal.
  2. Say "send". Infina presses Enter and the agent gets to work.
  3. Say "open Terminal" or "open Warp" to jump to another window, and queue the next instruction the same way.

Say "new line" when a long prompt needs structure. That small vocabulary, "type", "send", "new line", "open [app]", covers the entire agent workflow.
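The four-phrase vocabulary above is small enough to sketch as a tiny parser. This is a hedged Python illustration of how such utterances could be mapped to actions; the `parse_utterance` function and the action names are assumptions made for this example and say nothing about how Infina is actually implemented.

```python
from typing import Tuple


def parse_utterance(utterance: str) -> Tuple[str, str]:
    """Map one spoken phrase to an (action, argument) pair.

    Hypothetical action names: TYPE, SEND, NEWLINE, OPEN_APP, IGNORE.
    """
    words = utterance.strip()
    lowered = words.lower()

    if lowered == "send":
        return ("SEND", "")
    if lowered == "new line":
        return ("NEWLINE", "")
    if lowered.startswith("type "):
        # Everything after the trigger word is typed verbatim.
        return ("TYPE", words[len("type "):])
    if lowered.startswith("open "):
        return ("OPEN_APP", words[len("open "):])
    # Speech without a trigger word is not acted on in this sketch.
    return ("IGNORE", words)


for phrase in [
    "type explain what this stack trace means and propose a fix",
    "send",
    "open Warp",
    "new line",
    "what should I have for lunch",
]:
    print(parse_utterance(phrase))
```

Treating untriggered speech as `IGNORE` is a design choice of this sketch: it keeps ordinary conversation in the room from landing on the prompt line.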

The hands-free loop across multiple terminals

Here is where voice control of a terminal goes from convenience to multiplier.

An agent session leaves you idle while it works. The known fix is running two or three sessions on different tasks, but feeding them by keyboard means endless Cmd-Tab, click, type, Enter. Most people give up and babysit one.

By voice, the rotation is one continuous motion. Session one finishes a refactor, so you say "type looks right, now update the tests", then "send". You say "open Warp", where session two wants a decision: "type go with option two but keep the old config keys", then "send". You never sat down.

Every other dictation app stops at typing text and leaves the triggering, the Enter key, and the window switching on your hands. Infina completes the whole prompt, send, switch-app loop by voice, from 2 to 3 feet away. That distance is a designed-for feature, not a lucky side effect.

We go deep on this workflow in our hands-free Claude Code guide; the same loop drives any CLI agent.

Honest limits

  • English only in the base product. The optional $10/month cloud add-on adds more languages, plus sharper cloud transcription and LLM-polished output for the emails and docs you dictate outside the terminal.
  • Mac only, Apple Silicon required for the on-device models.
  • Raw output is a feature in a terminal and a limitation in prose. If you want polished writing everywhere, that is what the add-on is for, on top of an app you own instead of a subscription you rent forever.
  • Hands-free is our newest feature and labeled experimental. It likes a reasonably quiet room; push-to-talk is the mature fallback that always works.
  • Dictation shines on English sentences. For dense one-liners full of flags and pipes, your fingers are still the right tool, and both can share the same prompt line.

FAQ

Can you use voice typing in a Mac terminal? Yes. Infina types at the OS level, so dictation lands in Terminal.app, iTerm2, Warp, or any other terminal exactly like keyboard input. Hold Option to dictate, or use hands-free mode and just say "type" plus your words.

Is it safe to dictate into a command line? Yes, because dictation only types text. Nothing executes until you press Enter or deliberately say "send" in hands-free mode, so a misheard word just sits on the prompt line like a typo, waiting for your review.

Does terminal voice input send my audio to the cloud? Not with Infina's base product. Transcription runs entirely on your Mac's Neural Engine, works offline, and your audio never leaves the device. Cloud processing exists only as an optional $10/month add-on.

Can I dictate commands like git commit or npm run build? Yes, short commands come through fine, and dictating a commit message is a classic win. Where voice really pays off is the long English prompts you feed CLI agents like Claude Code, Codex, and Gemini CLI, which are full sentences your voice produces much faster than your hands.

What is the best voice input for terminal AI agents? The criteria that matter: raw output that never autocorrects your words, on-device speed, support for every terminal, and a hands-free loop that also sends and switches windows. Infina is the only dictation app we know of that completes that full loop by voice; see pricing for the $99 one-time license.

How accurate is speech to text for terminal use? Infina hits 95%+ for clear English speech, and agent prompts are plain English, the easiest case. For unusual jargon-heavy vocabularies, the optional cloud add-on brings bigger models that are sharper on names and technical terms.

The bottom line

The terminal turned into a place where you write paragraphs, and your keyboard did not get any faster. Voice typing for terminal closes that gap: raw, on-device dictation straight into the prompt line of whatever terminal you already use, with nothing running until you send it.

And when you are ready to stop touching the keyboard entirely, the hands-free loop, "type", "send", "open Warp", runs multiple agent sessions from across the room. No other dictation app closes that loop.

Infina is $99 once at the time of writing, every 1.x update included, risk-free for 7 days. Your prompts are the work now; say them.