
Your Voice Is in the Room.
So Is the Vendor.

The Otter.ai lawsuit reveals what AI meeting bots actually do with your conversations, your voice, and your data.

Brian Galvan, Founder & Engineer, SimplyTalk
Published May 13, 2026

In September 2024, an AI researcher named Alex Bilzerian wrapped up a Zoom call with a venture capital firm. Minutes later, Otter.ai sent him a transcript of the conversation. Included in that transcript were hours of additional conversation recorded after the meeting had formally ended, during which the investors openly discussed what Bilzerian described to The Washington Post as "strategic failures and cooked metrics."

The bot never stopped listening.

That incident, along with similar ones, became the foundation of a federal class action now consolidated in San Jose: In re Otter.AI Privacy Litigation. The consolidated case combines four separate lawsuits filed between August and September 2025, all documenting the same pattern. A motion-to-dismiss hearing is scheduled for May 20, 2026.

Whatever the courts decide, the case provides a clear picture of what AI meeting tools are actually doing in the rooms where professionals work.

What the Lawsuits Allege

The complaints describe specific behaviors that go beyond what most users expect from a "notetaker" tool. According to the filings, Otter:

Joins meetings automatically when a user links their calendar, often without the host's direct action, and frequently without notifying other participants that a recording device has entered the room.
Records audio, generates transcripts, captures screenshots, and stores all of it on Otter's servers, creating a persistent copy of conversations that participants may have expected to be ephemeral.
Uses recorded conversations to train its models. The consent for this usage comes from a checkbox the original host clicked during signup, not from the other people whose voices were captured.
Emails partial transcripts to all calendar invitees, including people who never attended the meeting, using those emails to encourage non-users to create accounts.
Captures biometric voiceprint data without adequate disclosure. In Illinois, this falls under the Biometric Information Privacy Act (BIPA), which carries its own statutory penalties per violation.

The plaintiffs are pursuing claims under the federal Electronic Communications Privacy Act, California's Invasion of Privacy Act (CIPA), and Illinois's Biometric Information Privacy Act. CIPA alone allows $5,000 in statutory damages per violation. With 25 million registered users and over a billion meetings transcribed since 2016, the financial exposure is significant.
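To make the scale concrete, here is a back-of-envelope sketch in Python. The $5,000 figure is the CIPA amount cited above. The violation counts are hypothetical inputs chosen only for illustration, not numbers from the filings, and how a court would actually count violations is an open question.

```python
# CIPA statutory damages per violation, as cited in the article.
CIPA_DAMAGES_PER_VIOLATION_USD = 5_000


def cipa_exposure(violations: int) -> int:
    """Return total statutory exposure in USD for a given violation count."""
    if violations < 0:
        raise ValueError("violations must be non-negative")
    return violations * CIPA_DAMAGES_PER_VIOLATION_USD


# Hypothetical violation counts, for illustration only.
for n in (1_000, 100_000, 1_000_000):
    print(f"{n:>9,} violations -> ${cipa_exposure(n):,}")
```

Even at one million hypothetical violations, a small fraction of the meetings described above, the multiplication reaches $5 billion.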

Otter denies the allegations and intends to defend the case. None of the claims have been proven in court. But the factual pattern described in the filings is instructive regardless of the outcome.

The Input Layer Is the New Attack Surface

For most of the past two decades, professional security advice focused on the same threat vectors: passwords, phishing, endpoint malware, network perimeters. That advice was sound and remains relevant.

But a new category of risk has emerged underneath all of it: the input layer.

Your voice during a meeting. Your draft in a notes application. Your queries to an AI assistant. Your transcripts and dictation output. The raw material of knowledge work, the content that used to exist only in your mind and on your local hardware, is now being captured, transmitted, stored, and in many cases used to train models by vendors most professionals have never audited.

Three characteristics make the input layer fundamentally different from the attack surfaces most professionals already manage:

It is invisible. A phishing email is something you can identify and choose not to click. An AI notetaker silently joining your calendar invitations operates below the threshold of awareness. The Ohio State IT department issued a campus-wide advisory in August 2025 telling faculty and staff to evaluate Otter independently because the institution had no way to track who was using it or which meetings it was joining.
It is permanent. Unlike a compromised password that you can reset, a captured conversation cannot be retracted. Audio recorded today can be stored, retained, reprocessed, and fed into model training indefinitely.
It crosses jurisdictions instantly. A single video call between participants in California, Illinois, and Texas simultaneously implicates three different consent frameworks. AI tools do not ask which state each participant is in. They record under whichever regime is most permissive, or, as alleged in these complaints, under none of them.
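The jurisdiction problem can be sketched as a small Python check. This is a simplified illustration, not how any vendor works: the state-to-rule table is an assumption that reduces each state's recording law to "one-party" or "all-party" consent, and real statutes carry exceptions this ignores. The names `CONSENT_BY_STATE`, `strictest_regime`, and `may_record` are hypothetical.

```python
from enum import Enum


class Consent(Enum):
    ONE_PARTY = 1  # one participant's consent is enough
    ALL_PARTY = 2  # every participant must consent


# Simplified, illustrative mapping (an assumption for this sketch).
CONSENT_BY_STATE = {
    "CA": Consent.ALL_PARTY,
    "IL": Consent.ALL_PARTY,
    "TX": Consent.ONE_PARTY,
}


def strictest_regime(participant_states):
    """Return the strictest rule on the call; unknown states default to all-party."""
    regimes = [CONSENT_BY_STATE.get(s, Consent.ALL_PARTY) for s in participant_states]
    return max(regimes, key=lambda r: r.value, default=Consent.ALL_PARTY)


def may_record(participant_states, consenting):
    """consenting is a list of booleans parallel to participant_states."""
    if strictest_regime(participant_states) is Consent.ALL_PARTY:
        return all(consenting)
    return any(consenting)


# One host consented; the other two participants were never asked.
print(may_record(["CA", "IL", "TX"], [True, False, False]))
```

The design choice worth noticing is the default: a cautious tool applies the strictest rule present on the call and treats unknown locations as all-party. The complaints allege the opposite behavior.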

The Difference Between a Promise and an Architecture

There is a marketing version of privacy and an architectural version. They produce very different outcomes, and the gap between them is growing.

The marketing version is a privacy policy. It describes how your data is used and offers opt-outs. It depends on the vendor's word, the vendor's security practices, the vendor's subcontractor relationships, and the vendor's continued solvency. It is, at its core, a promise.

The architectural version is a system where your data cannot be compromised because it never leaves your machine. There is no transcript sitting on a server. No voiceprint in a database. No behavioral log feeding a vendor's analytics pipeline. The architecture makes collection impossible, not merely impermissible.

The Otter litigation joins a growing list of cases, from Clearview AI to Amazon Alexa's COPPA settlement to the parallel Fireflies.ai litigation, that are teaching professionals to recognize this distinction. A privacy policy is a promise. Local processing is a structural guarantee.

Three Questions for Your Current Toolset

If you do knowledge work, the practical question is not whether to use AI. It is which tools process which categories of data, and on whose hardware. Three questions are worth answering for every tool already in your workflow:

Where does your input go? When you speak, type, or upload, does the data leave your device? If so, where does it land, and who has access to it?
What happens to it after the immediate task? Is your input retained? Used to train models? Shared with third parties or subprocessors you have never evaluated?
Can you independently verify the answers? A privacy policy is a claim. A local-only architecture is a claim you can verify by monitoring your own network traffic.
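For the third question, a minimal Python sketch of what "verify" can mean in practice. It assumes you have already collected the remote endpoints a tool connects to, for example from an OS utility such as lsof or netstat, or from a packet capture. The function names and the sample connection list are hypothetical; the only library used is Python's standard ipaddress module.

```python
import ipaddress


def is_local_endpoint(host: str) -> bool:
    """True only if host is a loopback IP literal, i.e. traffic stays on this machine."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Hostnames and malformed values are treated as remote, to stay conservative.
        return False


def remote_endpoints(connections):
    """Filter (host, port) pairs down to those that leave the machine."""
    return [(host, port) for host, port in connections if not is_local_endpoint(host)]


# Hypothetical observations for one application.
observed = [("127.0.0.1", 5000), ("::1", 8080), ("203.0.113.7", 443)]
print(remote_endpoints(observed))
```

A tool that claims local-only processing should produce an empty list while it is handling your audio. Anything else is a question for the vendor.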

Most professionals will discover that the tools handling their most sensitive work (capturing voice, drafting documents, processing client information) are the ones with the weakest answers. That is the gap worth closing first.

How SimplyTalk Handles This Differently

SimplyTalk was built around a single architectural decision: your voice never leaves your machine.

Speech-to-text processing does not require a cloud server. It does not require an account. It does not require telemetry, analytics, or a vendor pipeline. SimplyTalk runs NVIDIA Parakeet and Moonshine AI models directly on your hardware. Audio is captured in memory, transcribed locally, and discarded. Nothing is saved. Nothing is transmitted.
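The capture, transcribe, discard flow can be sketched in a few lines of Python. This illustrates the pattern only and is not SimplyTalk's implementation: `transcribe_and_discard` is a hypothetical name, the `transcribe` callable stands in for whatever local model is used, and Python itself cannot guarantee that no other copy of the bytes exists elsewhere in memory.

```python
from typing import Callable


def transcribe_and_discard(audio: bytearray, transcribe: Callable[[memoryview], str]) -> str:
    """Transcribe an in-memory buffer, then wipe it whether or not transcription succeeds."""
    try:
        # The memoryview lets the model read the buffer without making a copy.
        with memoryview(audio) as view:
            return transcribe(view)
    finally:
        # The view is released by now, so the buffer can be overwritten and emptied.
        audio[:] = bytes(len(audio))
        audio.clear()


def fake_transcribe(view: memoryview) -> str:
    """Placeholder for a local speech-to-text model (hypothetical)."""
    return f"{len(view)} bytes heard"


buffer = bytearray(b"\x01\x02\x03\x04")
text = transcribe_and_discard(buffer, fake_transcribe)
print(text, len(buffer))
```

The function never writes to disk or opens a socket, and the buffer is empty once it returns. Only the text survives the call.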

There is no Otter-style email arriving an hour after a sensitive conversation, because there is no server that could generate one. There is no training data pipeline, because your audio is never stored. There is no cross-jurisdictional consent problem, because no data crosses any boundary.

That is not a feature. It is the absence of an entire category of risk.

The professionals reading about this litigation and asking "what else am I exposed to?" are asking the right question. The answer, for speech-to-text, is straightforward: if the tool keeps your voice local, the risk disappears at the architectural level. If it sends your voice to a server, you are trusting a promise.
