Typing Is the Bottleneck of the AI Era

TL;DR: AI writes the code and the documents now; your remaining job is describing what you want, and typing caps that channel at a commonly cited 40 to 80 words per minute while speech runs 130 plus. Infina removes the cap and then the keyboard itself: from a couple of feet away, say "type" plus your prompt and it gets typed, say "send" and Enter is pressed, say "open Cursor" or "open Claude Code" and you are briefing the next agent, no key touched anywhere in the loop. No other dictation app completes that prompt, send, and switch-apps cycle hands-free in plain English. $99 once as of July 2026, on-device by default, 7-day refund.

The bottleneck argument is arithmetic plus a job description, and both are laid out below for you to check.

The job changed underneath you

Five years ago, the limiting factor of knowledge work was production: writing the code, drafting the document, building the deck. Hands on keyboard was the work.

Now AI tools produce the artifact. Claude Code writes the migration, Cursor writes the component, a chat model writes the first draft. What is left for the human is the part the model cannot do: knowing what you want and describing it precisely.

That description is text. Lots of it. Context, constraints, examples, corrections, follow-ups. On a heavy day, a person directing AI tools produces thousands of words of pure instruction.

And nearly everyone still produces those words the slow way: typing them.

Why typing is the bottleneck

A bottleneck is the narrowest point in a pipeline. Walk the AI pipeline and look at the widths.

The model generates faster than you can read. The terminal executes instantly. The one stage that crawls is you, pushing intent through your fingers.

The arithmetic is open; check it yourself. Commonly cited averages put typing near 40 words per minute, with practiced typists at 60 to 80. Conversational speech is commonly cited at 130 to 160 words per minute. For an average typist the ratio lands around three to one or better; for a practiced typist it is closer to two to one. Either way, speech is the wider pipe. We compare the measurements in depth in dictation vs typing speed.
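If you would rather see the ratio than take it on faith, a few lines of Python are enough. This is a minimal sketch that uses only the commonly cited ranges quoted above; the names `TYPING_WPM`, `SPEECH_WPM`, and `speed_ratio` are ours for illustration, and the figures are not measurements of your own speed.

```python
# Commonly cited ranges quoted in the text, in words per minute.
# These are illustrative inputs, not measurements of any one person.
TYPING_WPM = {"average": 40, "practiced (low)": 60, "practiced (high)": 80}
SPEECH_WPM = {"low": 130, "high": 160}


def speed_ratio(speech_wpm: float, typing_wpm: float) -> float:
    """Return how many times faster speech is than typing."""
    return speech_wpm / typing_wpm


if __name__ == "__main__":
    for label, typing in TYPING_WPM.items():
        slow_talker = speed_ratio(SPEECH_WPM["low"], typing)
        fast_talker = speed_ratio(SPEECH_WPM["high"], typing)
        print(f"{label} typist at {typing} wpm: "
              f"{slow_talker:.2f}x to {fast_talker:.2f}x")
```

Swap in your own typing and speaking speeds to get your personal ratio.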

Now put a workload through it. Thirty prompts a day at 150 words each is 4,500 words of instruction:

Typed at 60 words per minute: about 75 minutes.
Spoken at 140 words per minute: about 32 minutes.

Same prompts, same intent, a little over 40 minutes of difference every single day. Over a working year that is well over 100 extra hours spent pushing words through the narrow pipe.
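The workload arithmetic can be reproduced in a short script. This is a sketch under stated assumptions: the prompt count, prompt length, and speeds are the ones used above, while the 250 working days per year is our own assumption, added only to turn the daily figure into a yearly one.

```python
def minutes_needed(words: int, words_per_minute: float) -> float:
    """Time in minutes to produce `words` at a given speed."""
    return words / words_per_minute


# Inputs taken from the worked example in the text.
PROMPTS_PER_DAY = 30
WORDS_PER_PROMPT = 150
TYPED_WPM = 60
SPOKEN_WPM = 140

# Assumption for illustration only; the text does not give a day count.
WORKING_DAYS_PER_YEAR = 250

daily_words = PROMPTS_PER_DAY * WORDS_PER_PROMPT
typed_minutes = minutes_needed(daily_words, TYPED_WPM)
spoken_minutes = minutes_needed(daily_words, SPOKEN_WPM)
saved_minutes_per_day = typed_minutes - spoken_minutes
saved_hours_per_year = saved_minutes_per_day * WORKING_DAYS_PER_YEAR / 60

if __name__ == "__main__":
    print(f"Words per day:    {daily_words}")
    print(f"Typed:            {typed_minutes:.0f} min")
    print(f"Spoken:           {spoken_minutes:.0f} min")
    print(f"Saved per day:    {saved_minutes_per_day:.0f} min")
    print(f"Saved per year:   {saved_hours_per_year:.0f} h")
```

Change the four inputs to match your own prompt log; the conclusion only holds to the extent that your numbers resemble these.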

The hidden cost is worse than the visible one

The minutes are the measurable loss. The behavioral loss is bigger.

When words are expensive, you ration them. Typed prompts get compressed: context gets dropped, constraints get skipped, "you know what I mean" gets assumed. The model does not know what you mean, so it guesses, and you pay in retry rounds.

When words are cheap, you spend them. Spoken prompts naturally carry more context because ten extra seconds of talking buys three extra sentences of specification. Better-specified prompts mean fewer round trips, which compounds on top of the raw speed gain. The full productivity picture, including where the gains are real and where they are hype, is in speaking vs typing productivity.

There is also a parallelism cost. If your hands are pinned to one keyboard writing one prompt, you run one agent at a time. People who direct several AI sessions at once, the workflow behind vibe coding by voice, need an input channel that moves at the speed of switching attention, not the speed of typing.

The counterarguments, taken honestly

The thesis has real objections. Here they are, without a strawman.

"Editing still needs a keyboard." True, and voice does not pretend otherwise. Renaming a variable, tweaking a line, navigating code: keyboard work, and it should stay keyboard work. But be honest about the split in an AI-era day: the bulk of your fresh word count is describing intent, not surgically editing. Voice takes the bulk; the keyboard keeps the scalpel. Nothing is taken away from you.

"I can't talk to my computer in an open-plan office." Also true. Dictating in shared space has a social cost, and hands-free operation from across the room wants a quiet-ish room besides. Voice input fits home offices, private offices, and remote work, which happens to be where much of AI-heavy work already lives. In a shared space, push-to-talk at a normal speaking volume is viable; hands-free is for the room where you can think out loud.

"My typing is fast enough." Maybe. At 80 words per minute you are still under the commonly cited floor of conversational speech, and you are typing with your posture, wrists, and attention chained to the desk. The bottleneck is narrower for you, but it is still the narrowest stage in the pipeline.

"Dictation output is messy." For publishing, raw dictation needs cleanup, agreed. For prompting, mess barely matters: large language models parse loose spoken phrasing effortlessly, and raw transcription preserves your exact words instead of rewriting them. And if you want polished prose for email and documents, we do not concede that either: Infina's optional $10/month cloud add-on (7-day trial, cancel anytime) uses large cloud models via our cloud AI providers (Together AI and Groq) to beat the $15/month subscription apps at their own polish game, on top of an app you own.

What removing the bottleneck looks like

Step one is dictation itself. Infina's base product is raw, on-device dictation: hold Option, speak, release, and your words are typed into whatever app is focused. Transcription runs on your Mac's Neural Engine, works offline, and by default no audio or transcripts are stored. That alone moves you from typing speed to speaking speed.

Step two is removing the hands entirely, because push-to-talk still leaves a keyboard residue: hold the key, press Enter, Cmd-Tab to the next window, hundreds of times a day.

Infina's hands-free mode (double-tap Cmd to toggle it on; it ships off by default and is labeled experimental) closes the loop:

Say "type" plus your words: "type add retry logic to the upload endpoint and write a test for the timeout case." Infina types it. "Type" is the trigger; no hotkey exists in this loop.
Say "send". Enter is pressed.
Say "open Claude Code", or "open Cursor", or "open Notes". The app switches.
Repeat, from a couple of feet away, while the previous agent is still working.
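To make the shape of that loop concrete, here is a toy sketch of how a transcribed utterance could be mapped to one of the three actions. It is purely illustrative and is not Infina's implementation: `Action` and `parse_utterance` are hypothetical names, and the sketch assumes the trigger word is simply the first word of the utterance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical action derived from one spoken utterance."""
    kind: str            # "type", "send", "open", or "ignore"
    argument: str = ""   # text to type, or name of the app to open


def parse_utterance(utterance: str) -> Action:
    """Map a transcribed utterance to an action by its first word.

    Assumption: the first word is the trigger, as in the examples above.
    Anything that does not match a trigger is ignored.
    """
    parts = utterance.strip().split(maxsplit=1)
    if not parts:
        return Action("ignore")
    # Normalize case and trailing punctuation a transcriber might add.
    trigger = parts[0].lower().rstrip(".,!?")
    rest = parts[1].strip() if len(parts) > 1 else ""

    if trigger == "type" and rest:
        return Action("type", rest)
    if trigger == "send" and not rest:
        return Action("send")
    if trigger == "open" and rest:
        return Action("open", rest)
    return Action("ignore")


if __name__ == "__main__":
    for spoken in ["type add retry logic to the upload endpoint",
                   "send",
                   "open Claude Code",
                   "what was I saying"]:
        print(spoken, "->", parse_utterance(spoken))
```

The design point the sketch illustrates is that ordinary speech falls through to "ignore", so only the three trigger words cause anything to happen.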

That last detail is the point: the bottleneck is not just speed, it is serialization. When prompting, sending, and switching are all spoken, you brief agent two while agent one runs. Shipping speed follows prompt throughput, an argument we push further in prompt faster, ship faster.

Scope stated honestly: Mac only, Apple Silicon required for the on-device models, English-only base product (the cloud add-on adds more languages), and hands-free is the experimental layer while push-to-talk is the mature everyday path.

The economics of unblocking

Infina is $99 one-time as of July 2026, every 1.x update included, no subscription, and no trial; there is a 7-day no-questions money-back guarantee instead. Details on pricing.

Run it against the arithmetic above. If voice returns even 20 minutes a day, that is more than three hours over two working weeks. Value your time at about $30 an hour or more and a one-time $99 is repaid in the first couple of weeks of use, and everything after that is kept time. That is the whole business case, and every number in it is yours to audit.
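The payback claim depends on what an hour of your time is worth, which only you can supply. The sketch below makes that dependency explicit; the $99 price and the 20 minutes a day come from the text, while the hourly values and the function names `payback_days` and `break_even_hourly_value` are our assumptions for illustration.

```python
PRICE_USD = 99.0              # one-time price stated in the text
MINUTES_SAVED_PER_DAY = 20.0  # the conservative figure used in the text


def payback_days(hourly_value_usd: float,
                 minutes_saved_per_day: float = MINUTES_SAVED_PER_DAY,
                 price_usd: float = PRICE_USD) -> float:
    """Working days until the saved time is worth the purchase price."""
    value_per_day = hourly_value_usd * minutes_saved_per_day / 60
    return price_usd / value_per_day


def break_even_hourly_value(working_days: float,
                            minutes_saved_per_day: float = MINUTES_SAVED_PER_DAY,
                            price_usd: float = PRICE_USD) -> float:
    """Hourly value at which the price is repaid in `working_days`."""
    hours_saved = working_days * minutes_saved_per_day / 60
    return price_usd / hours_saved


if __name__ == "__main__":
    # Hourly values below are hypothetical inputs, not claims about anyone.
    for hourly in (20.0, 30.0, 60.0):
        print(f"${hourly:.0f}/h -> repaid in {payback_days(hourly):.1f} working days")
    print(f"Break-even for 10 working days: "
          f"${break_even_hourly_value(10):.2f}/h")
```

If your own saved minutes or hourly value are lower than these inputs, the payback period stretches accordingly.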

FAQ

What does "typing is the bottleneck" actually mean? In AI-era work, the model produces the artifact and the human produces the description. Typing caps that description channel at a commonly cited 40 to 80 words per minute while speech runs 130 plus, making your keyboard the slowest stage in the pipeline.

Is speaking really three times faster than typing? The ratio comes from open arithmetic on commonly cited ranges: around 40 words per minute average typing (60 to 80 for practiced typists) versus 130 to 160 words per minute for conversational speech. Your personal ratio may be two-to-one or four-to-one; measure your own numbers against your own prompt log.

Doesn't coding still require typing? Editing code does, and voice should not replace that. The bottleneck claim is about fresh word production: prompts, instructions, and descriptions, which dominate an AI-heavy day and are faster spoken.

What about open-plan offices? Honest limit: dictation has a social cost in shared space, and hands-free wants a quiet-ish room. Voice input fits home and private offices best; push-to-talk at normal speaking volume is the workable middle ground elsewhere.

Does Infina send my prompts to the cloud? Not by default. Transcription runs on-device on Apple Silicon, works offline, and no audio or transcripts are stored by default. Cloud processing is strictly an optional $10/month add-on.

How much does Infina cost? $99 one-time as of July 2026 with every 1.x update included and a 7-day no-questions-asked refund. No subscription for the core app; the optional cloud add-on is $10/month with its own 7-day trial.

The bottom line

The AI era moved the work from producing artifacts to describing them, and almost nobody upgraded the description channel. Typing is the bottleneck: the arithmetic is open, the daily cost is tens of minutes, and the hidden cost is compressed prompts and serialized agents.

The fix is not typing harder. It is speaking your instructions raw, and then taking your hands out of the loop entirely: type, send, open the next app, all by voice, from across the room.

Infina is the Mac app built to do exactly that, on-device by default, for $99 once as of July 2026. If the bottleneck does not visibly open up in your first week, the refund is one email.