How the category is changing

The conversation app that remembers people, not just recordings.

Picture the same face at the fire three nights running. The first night you learn their name. The second, you remember it. By the third, you know how they talk and what they tend to promise. That isn't a feature. It's just what it means to know someone — and almost every transcription tool throws it away.

Most tools label the voices in one recording “Speaker 1” and “Speaker 2,” and the next time those same two people talk, they start over from zero. The recording remembers. The person is forgotten the instant you stop recording. You can read back every word that was said and still have no thread connecting the colleague in Tuesday's standup to the same colleague in Thursday's.

Per-recording speaker labels are a dead end.

Diarization — splitting audio into who spoke when — is now standard. Most tools do it well enough within a single file. The problem is the boundary. The labels are scoped to the recording and they die with it. “Speaker 2” in one meeting has no relationship to “Speaker 2” in the next, even when it's plainly the same person in both. Every recording is an island.

That scoping decision quietly caps how useful any of this can be, because the interesting questions about a person are longitudinal. How many of Sarah's commitments actually land? Does this vendor tend to hedge when money comes up? Is the thing he said today consistent with the thing he said last quarter? You cannot ask any of those if the system forgets who Sarah is between recordings. Single-recording diarization can tell you who is talking right now. It can never tell you who someone is.
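A small Python sketch can make the scoping problem concrete. It is an illustration only, built on invented toy data: the recording names, the `identity_of` mapping, and the `follow_through_by_person` helper are all hypothetical and say nothing about how any real product stores its data. The point it shows is that a longitudinal question such as "how many of Sarah's commitments land?" can only be answered once per-recording labels are resolved to a persistent person.

```python
from collections import defaultdict

# Hypothetical toy records: (recording_id, per-recording label, commitment kept?)
commitments = [
    ("tuesday-standup", "Speaker 2", True),
    ("thursday-standup", "Speaker 1", False),
    ("friday-review", "Speaker 2", True),
]

# Hypothetical mapping from a per-recording label to a persistent identity.
# Note that "Speaker 2" means a different person in different recordings,
# so the raw label cannot be used as a key across files.
identity_of = {
    ("tuesday-standup", "Speaker 2"): "sarah",
    ("thursday-standup", "Speaker 1"): "sarah",
    ("friday-review", "Speaker 2"): "marcus",
}


def follow_through_by_person(commitments, identity_of):
    """Return the fraction of commitments kept, grouped by persistent identity."""
    kept = defaultdict(int)
    total = defaultdict(int)
    for recording_id, label, was_kept in commitments:
        person = identity_of.get((recording_id, label))
        if person is None:
            continue  # unrecognized voice: no cross-recording claim is possible
        total[person] += 1
        kept[person] += int(was_kept)
    return {person: kept[person] / total[person] for person in total}


rates = follow_through_by_person(commitments, identity_of)
print(rates)
```

Without the `identity_of` step, grouping by the raw label would merge Tuesday's "Speaker 2" with Friday's "Speaker 2", who in this toy data are two different people.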

Speaker memory is the missing primitive.

Bonfiyah treats a speaker as a durable identity that persists across your entire library, recognized by the sound of their voice from one recording to the next. Tag a voice once as “Sarah from product,” and the next time Sarah speaks — in a different meeting, on a different day, in a different project — Bonfiyah recognizes her and carries everything it already knows forward. She is one identity, not a fresh stranger in every file.

This is the load-bearing piece the rest of the intelligence stands on. Cross-recording reasoning is only possible once the system agrees that the Sarah in March and the Sarah in May are the same Sarah. Get that wrong and every conclusion built on top of it inherits the error.

So let me be precise about how the recognition works, because it matters. Identity is matched on the voice itself, on its acoustic signature, not on anything anyone said. Bonfiyah never reads the words of a conversation to guess who is speaking or to put a name to a voice. Voice is the only signal we trust for identity, because text is far too easy to misread, and getting someone's identity wrong is exactly the kind of mistake that cascades through everything downstream.
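For readers who want a picture of what voice-only matching can look like, here is a minimal Python sketch. It is not Bonfiyah's implementation. It assumes each speech segment has already been reduced to a fixed-length voice embedding by some speaker-embedding model, and that a new segment is compared against enrolled voices by cosine similarity. The names `match_speaker` and `enrolled`, the three-number embeddings, and the 0.75 threshold are all invented for the example. What it does reflect from the text above is that the function only ever sees voice data, never transcript words.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_speaker(embedding, enrolled, threshold=0.75):
    """Return the enrolled name whose voice is most similar, or None.

    `threshold` is a hypothetical cutoff: below it the voice is treated
    as unknown rather than forced onto the nearest existing identity.
    """
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy embeddings, made up for illustration (real ones have far more dimensions).
enrolled = {
    "Sarah from product": [0.9, 0.1, 0.3],
    "Marcus": [0.1, 0.8, 0.5],
}
new_segment = [0.85, 0.15, 0.35]      # close to Sarah's enrolled voice
unknown_segment = [-0.5, 0.2, -0.7]   # unlike anyone enrolled

print(match_speaker(new_segment, enrolled))
print(match_speaker(unknown_segment, enrolled))
```

Returning `None` for a poor match is a deliberate choice in this sketch: since a wrong identity cascades into everything built on it, leaving a voice untagged is the safer failure.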

What it makes possible.

Once a person persists across recordings, a whole class of features stops being science fiction. People Memory keeps a living profile for each person you talk to — their role, the projects they touch, how their commitments tend to go, the threads still open between you. It builds up automatically from the conversations you have already had, not from a contact card you maintain by hand.

Pre-Brief can only catch you up before a meeting because it knows who you are meeting. Walking in to talk to Marcus, it pulls what Marcus promised last time, what is still open with Marcus, and what has changed since the two of you last spoke. None of that exists without a stable Marcus to hang it on.

Speaker Themes notices how each person tends to talk over time — the topics they return to, the way they hedge, the rhythm of how they commit. Those patterns only become visible across many conversations, never inside one. Each of these is downstream of the same idea: a person is the unit of memory, and a recording is just one moment in their longer story with you.

Why competitors don't do this.

It is not an oversight; it is a positioning choice. The meeting-bot category — Otter and its neighbors — is built around the call. A bot dials into a video meeting, transcribes that session, hands you that summary, and the unit of everything is the single meeting. The whole model is per-meeting, often per-seat, and a per-meeting model has no natural place to put a person who shows up across many meetings over months. The person keeps falling through the gaps between sessions.

Bonfiyah is built around the person and the room, not the call. In-person conversations, repeated over time, with the same handful of people you actually work and live with. That framing makes durable speaker identity the obvious center of gravity rather than an afterthought bolted onto a transcript. The app is a universal Apple app that runs on iPhone, iPad, and Mac, kept in step by iCloud sync, so the same person stays the same person no matter which device you happened to record on.

A word on consent and control.

Recognizing a voice across recordings is powerful, which means it has to be handled carefully. Speaker identities are yours — they live in your library, and the consent tooling rides alongside them in every tier: two-party-consent surfacing, verbal-consent capture, and an exportable log. Knowing who someone is and respecting how they agreed to be recorded are two halves of the same responsibility, and Bonfiyah treats them that way. The app surfaces the rule; it never tells you a recording is legal, only what the rule is where you are.

And the recognition is private by design. Real-time transcription runs on-device; audio leaves your iPhone only for the optional cloud-transcription pass you control. We do not train AI on your transcripts. The point of remembering people is to serve you across your own conversations — not to turn those conversations into someone else's training data.

Try it across a week.

The value of speaker memory doesn't show up in one recording — it shows up across several. Install Bonfiyah, record a handful of conversations with the same few people over a week, and watch the profiles fill in on their own. The Pro AI layer is free to start, so give it a few days and let People Memory and Pre-Brief gather something to work with. By the second meeting with the same person, the app already knows them better than your notes do.

The fire remembers the faces, not just the nights. So does Bonfiyah.

— Richard
