TL;DR: Parakeet is NVIDIA's open speech recognition model family, and it is the engine inside Infina. It runs entirely on your Mac's Neural Engine, so dictation is instant, works offline, and your audio never leaves your machine, all for $99 once as of July 2026. And Infina does something no Whisper wrapper does: it completes the whole loop hands-free from a couple of feet away. Say "type" plus your words and they get typed, say "send" and it presses Enter, say "open Notes" or "open Cursor" and you are dictating somewhere else, without touching the keyboard once.

This article is a plain-English explainer of the model we bet the product on, and you can check every claim against NVIDIA's own pages.

What is Parakeet? A plain-English answer

Parakeet is a family of speech-to-text models built and released by NVIDIA. You give it audio, it gives you back written words, with punctuation and capitalization handled automatically.

Three facts matter, all from NVIDIA's official model card as checked on July 4, 2026:

  • It is open. Parakeet TDT 0.6B is released under a CC-BY-4.0 license, free for commercial use. Anyone can ship it, which is exactly the openness that made Whisper famous, one model generation later.
  • It is accurate. NVIDIA reports a 6.05 percent average word error rate on the Hugging Face Open ASR leaderboard, which places it among the best open speech models available. It also handles tricky cases like spoken numbers well.
  • It is fast. The model transcribes audio far faster than real time. For dictation that means the text is essentially waiting for you when you stop talking.
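For readers who want to see what "word error rate" measures, here is a minimal sketch in Python. It counts the word-level substitutions, deletions, and insertions needed to turn the transcript into the reference, then divides by the number of reference words. The function name and the lowercase-and-split normalization are assumptions made for illustration; the leaderboard applies its own text normalization, so this will not reproduce NVIDIA's published figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of six reference words.
score = word_error_rate(
    "open notes and type hello world",
    "open notes and type hello word",
)
print(f"{score:.1%}")
```

Read this way, a 6.05 percent word error rate means roughly six word-level mistakes per hundred words spoken, averaged over the leaderboard's test sets.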

The "0.6B" means about 600 million parameters. That is a deliberately compact size: big enough to hear English extremely well, small enough to run on a laptop chip without melting it.
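To put "compact" into numbers, the sketch below estimates how much space 600 million weights take at a few common numeric precisions. The precisions listed are assumptions for illustration; this article does not state which precision Infina ships, and the figures cover the weights only, not the working memory used while transcribing.

```python
PARAMS = 600_000_000  # "about 600 million parameters"

# Bytes per weight at common precisions (illustrative, not Infina's actual format).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weights_size_gb(params: int, dtype: str) -> float:
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weights_size_gb(PARAMS, dtype):.1f} GB")
```

Even at full 32-bit precision the weights come to about 2.4 GB, and at 16-bit about 1.2 GB, which is why a model of this size fits comfortably on a laptop.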

The "TDT" part is the interesting bit for dictation. It stands for token-and-duration transducer, and the plain-English version is this: instead of listening to a whole chunk of audio and then writing a paragraph (the way Whisper works), a transducer emits words continuously as sound streams in, and it can skip ahead through silence instead of plodding through it. It behaves like a stenographer, not a translator.
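The "skip ahead" idea can be sketched as a toy decoding loop. At each audio frame the model predicts a token and a duration, and the decoder jumps forward by that duration instead of visiting every frame. Everything here is hypothetical and simplified: the real model predicts subword tokens from learned networks, while this sketch uses whole words and a scripted stand-in predictor, so treat it as an illustration of the control flow only, not NVIDIA's implementation.

```python
from typing import Callable, Dict, List, Tuple

BLANK = ""  # toy stand-in for the "nothing to emit here" token


def greedy_tdt_decode(
    num_frames: int,
    predict: Callable[[int, List[str]], Tuple[str, int]],
    max_symbols_per_frame: int = 10,
) -> List[str]:
    """Emit tokens greedily, advancing by the predicted duration each step."""
    tokens: List[str] = []
    t = 0
    symbols_here = 0
    while t < num_frames:
        token, duration = predict(t, tokens)
        if token != BLANK:
            tokens.append(token)
        # Duration 0 with a real token means "stay on this frame and try again".
        if duration == 0 and token != BLANK and symbols_here < max_symbols_per_frame:
            symbols_here += 1
            continue
        # Otherwise always move forward by at least one frame so the loop ends.
        t += max(duration, 1)
        symbols_here = 0
    return tokens


def scripted_predictor(script: Dict[int, Tuple[str, int]]):
    """Hypothetical predictor that replays fixed (token, duration) pairs."""
    visited: List[int] = []

    def predict(t: int, history: List[str]) -> Tuple[str, int]:
        visited.append(t)
        return script.get(t, (BLANK, 1))

    return predict, visited


# Seven frames: a word, four frames of silence skipped in one jump, another word.
predict, visited = scripted_predictor({0: ("hello", 2), 2: (BLANK, 4), 6: ("world", 1)})
print(greedy_tdt_decode(7, predict))
print(visited)
```

In this toy run the decoder looks at only three of the seven frames, which is the intuition behind skipping through silence rather than stepping through it frame by frame.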

We wrote a companion piece, Whisper dictation on Mac, about the model Parakeet succeeds. Short version: Whisper is still the multilingual champion for transcribing recordings, and Parakeet is the model you want racing your voice in real time.

Why on-device beats the cloud round-trip

Most dictation apps send your voice to a server. Every sentence you speak gets recorded, uploaded, queued, transcribed on someone else's machine, and downloaded back as text.

That round trip costs you twice.

It costs you time. Even on good Wi-Fi, the network adds a delay to every single utterance, and on hotel Wi-Fi or a tethered train it adds a lot. On-device transcription has no network in the loop at all. The audio goes from your microphone to a chip inside your Mac to text at your cursor.

It costs you privacy. Your voice is biometric data, and what you dictate is the raw feed of your working life. Audio that never leaves your Mac cannot be retained, breached, or used for training, no matter what any policy says. We go deeper on this in our pieces on on-device dictation on Mac and on what makes a dictation app actually private.

There is a third win: offline just works. Planes, dead zones, network outages: none of it matters, because nothing needed the internet in the first place.

The Neural Engine: the AI chip your Mac already has

Every Apple Silicon Mac ships with a Neural Engine, a block of the chip built specifically to run AI models efficiently. Most apps never touch it.

Infina runs Parakeet there. That choice, Parakeet on the Neural Engine rather than on the CPU, is why on-device dictation does not mean fans spinning and a hot lap. The dedicated AI silicon does the heavy lifting while your CPU stays free for your actual work.

One clarification, because the NVIDIA name confuses people: you do not need an NVIDIA GPU. NVIDIA built and open-sourced the model; Infina runs it on Apple's hardware. The only requirement is an Apple Silicon Mac.

How Infina ships Parakeet dictation on your Mac

Here is the concrete package, honestly stated.

Fully on-device by default. Infina runs NVIDIA Parakeet TDT 0.6B on the Apple Neural Engine. Your audio never leaves your Mac, dictation works with no internet connection, and privacy mode is on out of the box, so nothing is stored server-side. Accuracy is strong: 95%+ for clear English speech.

Two ways to dictate. Hold Option and talk, release, and your words land at the cursor: classic push-to-talk. Or double-tap Cmd to toggle hands-free mode, sit back, and run the loop by voice alone: "type summarize this thread and draft a reply", then "send" to press Enter, then "open Claude Code" and keep prompting. Hands-free is experimental and off by default, and we label it that way in the app. But no other dictation app completes that prompt, send, switch-app loop hands-free in plain English.
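The hands-free loop described above can be pictured as a small mapping from a transcribed utterance to an action. The sketch below is a hypothetical illustration in Python, not Infina's actual implementation: the Action type, the parse_utterance function, and the action names are all invented here, and a real app would still need to perform the typing, key press, or app launch through macOS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str          # "type", "press_enter", "open_app", or "ignore"
    payload: str = ""  # words to type, or the app name to open


def parse_utterance(utterance: str) -> Action:
    """Map one transcribed utterance to an action (illustrative rules only)."""
    text = utterance.strip()
    # The recognizer adds punctuation, so drop trailing marks before matching.
    bare = text.rstrip(".!?,").strip()
    keyword, _, rest = bare.partition(" ")
    keyword = keyword.lower()

    if keyword == "send" and not rest:
        return Action("press_enter")
    if keyword == "type" and rest:
        # Keep the speaker's original punctuation in the dictated words.
        return Action("type", text.partition(" ")[2].strip())
    if keyword == "open" and rest:
        return Action("open_app", rest.strip())
    return Action("ignore")


for spoken in [
    "type summarize this thread and draft a reply",
    "Send.",
    "open Claude Code",
]:
    print(parse_utterance(spoken))
```

Treating anything that does not start with a known keyword as "ignore" is one possible design choice: it keeps background conversation from being typed while hands-free mode is on.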

Built for AI prompters. The base product outputs raw dictation on purpose. If you spend your day prompting Claude Code, Codex, or Cursor, the model on the other end does not care about comma style; it cares that you can speak a thousand-word prompt in the time typing would give you a hundred.

Honest limits. Mac only. Apple Silicon required. The base model is English-only. And raw output is raw: fine for prompts, not for publishing prose.

The optional cloud add-on covers the last two gaps, language coverage and polished output. For $10/month (7-day free trial, cancel anytime), our cloud AI providers (Together AI and Groq) add sharper cloud transcription, polished LLM cleanup, and more languages. That is the same feature set that $15/month subscription dictation apps charge for indefinitely. With Infina it is optional, on top of an app you own outright, and the app reverts to fully on-device the moment you cancel.

The license itself is $99 one time as of July 2026, with every 1.x update included. There is no free trial; there is a 7-day no-questions money-back guarantee instead. Full details on the pricing page.

Parakeet vs Whisper, in one honest paragraph

Whisper is older, bigger in its accurate variants, and speaks dozens of languages; it remains the right call for transcribing multilingual recordings. Parakeet is compact, English-focused in the version we ship, and built as a streaming transducer, so it is faster and more responsive for live dictation on consumer hardware. The market is voting the same way: even apps with Whisper in their name have started offering Parakeet alongside it, as of July 4, 2026. The full comparison lives in our Whisper explainer.

FAQ

What is NVIDIA Parakeet, in one sentence? It is NVIDIA's open family of speech-to-text models, and the compact TDT 0.6B version that Infina ships turns English speech into punctuated text far faster than real time, per NVIDIA's own model card as of July 4, 2026.

Do I need an NVIDIA graphics card for Parakeet dictation? No. NVIDIA built and open-sourced the model, but Infina runs it on the Neural Engine inside every Apple Silicon Mac. If you have an M-series Mac, you already own the hardware.

Does Parakeet dictation work offline? Yes, completely. The model lives on your Mac, so Infina's default dictation works on a plane with Wi-Fi off. Only the optional cloud add-on needs a connection.

Is Parakeet more accurate than Whisper? On English, NVIDIA's model card reports leaderboard-topping accuracy for a model of its size, and it is dramatically faster for live use. Whisper still wins on language coverage. For dictation into AI tools, which is Infina's job, Parakeet's speed-plus-accuracy combination is the better trade.

Does Infina support languages other than English? The base on-device model is English-only, and we say so plainly. The optional $10/month cloud add-on adds multiple languages through our cloud AI providers (Together AI and Groq), with a 7-day free trial.

How much does Infina cost? $99 one time as of July 2026, no subscription, with a 7-day no-questions money-back guarantee. The cloud add-on is an optional $10/month on top.

The bottom line

Every dictation app is a wrapper around a speech model, so the model choice is the product choice. We chose Parakeet because it is the current generation: open like Whisper was, but built for streaming speech, compact enough for the Neural Engine, and accurate enough that the cloud stops being worth the round trip.

The result is dictation that is instant, offline, and private by default, with a hands-free loop on top that no other dictation app completes: type, send, switch apps, all by voice, from a couple of feet away.

One $99 purchase as of July 2026, a 7-day no-questions refund, and the model behind it is one you can read about on NVIDIA's own pages. That is the whole pitch.