Voice Typing for Codex: Dictate Prompts, Send, and Switch by Voice

TL;DR: Codex has no built-in voice input as of July 4, 2026, and its prompts are long natural-language task descriptions, exactly the kind of text that is faster to speak than to type. Infina adds voice typing for Codex two ways: hold Option and dictate on-device, or go fully hands-free. Other dictation apps chain you to the keyboard with a hotkey for every single dictation; with Infina you sit back two feet from the desk, eat lunch, and run the entire loop by voice: say "type" plus your prompt, say "send", say "open Terminal" to check the next agent. It is $99 once, no subscription, with a 7-day money-back guarantee.

Codex ships without voice input

OpenAI's Codex is a coding agent you drive with plain English. It runs in the terminal as Codex CLI, inside VS Code and Cursor via the IDE extension, and in the cloud on the web.

None of those surfaces has a built-in microphone button, as of July 4, 2026. You describe the task, and you describe it by typing.

That is the gap. Codex removed most of the typing from coding, then handed you a different typing job: writing paragraphs of instructions, corrections, and reviews all day.

Voice typing for Codex closes that gap. Any dictation tool that types into the focused app can fill the text box; only one runs the whole prompt, send, and switch-app loop without your hands.

Why Codex prompts are made for speaking

A good Codex prompt is not a command; it is a briefing. Goal, context, constraints: "migrate the payment webhooks to the new queue, the handlers live in src/webhooks, keep the retry logic, and do not touch the Stripe client."

That is 24 words. Most people type around 40 words a minute and speak around 150, so it takes about 36 seconds to type and under 10 to speak. Across a day of agent-driving, the difference can add up to hours.
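
The arithmetic behind those rates can be sketched in a few lines of Python. The 40 and 150 words-per-minute figures come from the text; the function names and the sample daily word counts are hypothetical, chosen only for illustration.

```python
TYPING_WPM = 40     # typing rate quoted in the text
SPEAKING_WPM = 150  # speaking rate quoted in the text


def minutes_to_enter(words: int, words_per_minute: float) -> float:
    """Minutes needed to enter `words` words at a steady rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return words / words_per_minute


def minutes_saved(words: int) -> float:
    """Minutes saved by speaking instead of typing `words` words."""
    return minutes_to_enter(words, TYPING_WPM) - minutes_to_enter(words, SPEAKING_WPM)


if __name__ == "__main__":
    # Hypothetical daily prompt volumes, in words (assumptions, not measurements).
    for words in (100, 1000, 5000):
        typed = minutes_to_enter(words, TYPING_WPM)
        spoken = minutes_to_enter(words, SPEAKING_WPM)
        print(f"{words:>5} words: {typed:6.1f} min typed, "
              f"{spoken:5.1f} min spoken, {minutes_saved(words):6.1f} min saved")
```

At these rates, 1,000 words of prompts take 25 minutes to type and about 6.7 minutes to speak, and the saving passes an hour at roughly 3,300 words of prompts a day.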

Speaking also changes what you write. Long prompts stop feeling expensive, so you stop compressing your intent into terse commands the agent misreads.

And here is the part that makes raw dictation the right tool: Codex does not care about your commas. A spoken "uh so add rate limiting to the public endpoints and um keep the middleware pattern we already use" produces essentially the same plan as a polished version.

That is why Infina's base product is raw on-device dictation with fast rule-based formatting, built for AI prompting, not for publishing prose. The full argument for speaking to agents is in our guide to dictating prompts to AI.

Workflow 1: hold Option, talk to Codex

The simple way to dictate prompts to Codex:

1. Focus the Codex CLI in your terminal, or the Codex chat box in your IDE.
2. Hold Option (⌥) and speak the task.
3. Release. The text lands at your cursor.
4. Press Enter to send.

Infina types at the OS level, so it works in Terminal.app, iTerm, Warp, the VS Code terminal, and the IDE extension's input box alike. A terminal is just another text field to it; we cover the general case in voice typing for the terminal.

Transcription runs entirely on your Mac, on an on-device speech model on the Apple Neural Engine. It works offline, and your audio never leaves your device.

Voice typing for Codex, fully hands-free

Push-to-talk still tethers you to the desk. Every prompt starts with a finger on Option and ends with a finger on Enter, which is how every mainstream dictation app works too: a hotkey per dictation, forever.

Infina's hands-free mode cuts the tether:

1. Double-tap Cmd (⌘) to switch hands-free mode on. Listening runs on-device, so nothing is recorded or sent anywhere while it waits.
2. Speak a sentence that starts with "type": "type refactor the auth middleware and add tests for the expired-token path." Infina types it into Codex.
3. Say "send". Infina presses Enter.
4. Say "open Terminal" or "open Cursor" to move to the next session, and repeat.
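
The three spoken commands in these steps form a small grammar, which can be sketched as a parser. This is an illustration only, not Infina's implementation: it assumes a speech recognizer has already turned each utterance into text, and `Action` and `parse_utterance` are hypothetical names. Ignoring any speech that matches no command is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str           # "type", "send", "open", or "ignore"
    argument: str = ""  # text to type, or name of the app to open


def parse_utterance(utterance: str) -> Action:
    """Map one transcribed utterance to an action (hypothetical grammar)."""
    text = utterance.strip()

    # "send" on its own presses Enter; tolerate trailing punctuation.
    if text.lower().rstrip(".!?") == "send":
        return Action("send")

    # Split off the first word; the rest is the command's argument.
    head, _, rest = text.partition(" ")
    rest = rest.strip()

    if head.lower() == "type" and rest:
        return Action("type", rest)                # dictate the prompt verbatim
    if head.lower() == "open" and rest:
        return Action("open", rest.rstrip(".!?"))  # switch to the named app

    # Anything else is treated as background speech in this sketch.
    return Action("ignore")


if __name__ == "__main__":
    for spoken in ("type refactor the auth middleware",
                   "send",
                   "open Terminal",
                   "what's for lunch"):
        print(repr(spoken), "->", parse_utterance(spoken))
```

Requiring the leading "type" keyword is what lets ordinary conversation near the microphone fall through to "ignore" instead of landing in the prompt.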

It works from 2 to 3 feet away. Lean back with your lunch, watch Codex work through the plan, and queue the next instruction without touching anything.

Honest notes: hands-free is our newest surface, ships off by default, and is labeled experimental. It is English-only in the base product, and push-to-talk is the fallback that always works.

Keeping Codex and Claude Code busy at once

The workflow that sells this to agent power users is parallel agents. Codex in one terminal tab, Claude Code in another, maybe Gemini CLI in a third.

Hands-free, the loop looks like this:

1. Review Codex's diff. Say "type" plus the correction, then "send".
2. Say "open Terminal" to land in the Claude Code session, dictate its next task, and say "send".
3. Glance back at Codex while the second agent works.

Typing 40 words a minute, one person babysits one agent. Speaking 150, one person keeps three shipping. The same loop drives Gemini CLI by voice too.

What about other dictation apps?

Any dictation tool can handle the typing half. macOS built-in dictation is free and types into a terminal fine, and it is worth trying first.

But every mainstream option, free or subscription, stops at text. You trigger with a hotkey, you press Enter, you Cmd-Tab between sessions. The dictation apps that charge $15 a month for polished output are solving the wrong problem for Codex work: agents need your intent fast and raw, not your prose formatted.

Infina is built the other way around: raw, instant, on-device dictation is the base, and the full hands-free loop is the moat. If you also want polished dictation for emails and docs, the optional cloud add-on ($10/month, cancel anytime) adds sharper cloud transcription and LLM-polished cleanup that beats the subscription apps at their own game.

One purchase: $99 one-time as of July 4, 2026, every 1.x update included, 7-day money-back guarantee. Details on pricing.

FAQ

Does OpenAI Codex have built-in voice input? No. As of July 4, 2026, the Codex CLI, IDE extension, and web surface all take typed text with no built-in dictation. You add voice with a system-level dictation tool like Infina, which types into whatever app is focused.

Can I talk to Codex in the terminal and in my IDE? Yes. Infina types at the OS level, so the same Option-hold dictation works in the Codex CLI, in the VS Code or Cursor extension's chat box, and in a browser tab. If the cursor is in the text field, your words land there.

Do I need to speak punctuation when dictating prompts to Codex? No. Codex understands conversational, unpunctuated speech, filler words included. That is why raw dictation is the right default for AI prompting: you need speed and fidelity to your intent, not typographic polish.

Can I send a Codex prompt without touching the keyboard? With Infina's hands-free mode, yes. Speak a sentence starting with "type", then say "send" and Infina presses Enter for you. Say "open Terminal" to switch sessions by voice. Mainstream dictation apps stop at typing the text.

Does Infina's dictation work offline? Yes. By default transcription runs entirely on your Mac (Apple Silicon required), so it works with no internet and your audio never leaves your device. Cloud processing exists only as an optional $10/month add-on.

What does Infina cost for Codex users? $99 one-time as of July 4, 2026, with a 7-day no-questions money-back guarantee and every 1.x update included. No subscription. The optional cloud add-on for polished output and more languages is $10/month.

The bottom line

Codex turned coding into briefing an agent, and briefing is a speaking job, not a typing job. As of July 4, 2026, Codex gives you no microphone, so the text box is yours to fill.

Fill it more than three times faster by voice. Start with Option-hold dictation, and when you are running Codex next to Claude Code, switch on hands-free and run the whole prompt, send, and switch loop from two feet back.

That loop is what Infina was built for: $99 once, on-device by default, risk-free for 7 days.