The Social Experience

You don't need PhotoSpeak to join the conversation

Someone shares a photo. It arrives in your group chat as a rich story card — faces tagged, location, weather, the whole story. You tap it, record a voice reply, and your memory becomes part of that photo forever. No app to install. No sign-up. Just your voice.

The Experience

A photo arrives in your group chat

This is what drops into your Telegram group. Not just a photo — a story card packed with context. Faces, location, weather, elevation, the story behind the moment. Everything the annotator captured, rendered into something you can tap and explore.

Voice Threads

Three people. Three memories. One photo.

The photo starts a conversation. Everyone who was there — or anyone who has a memory — adds their voice. Not a comment. Not a text. Their actual voice, with all the laughter and personality that text can never capture.

Three people. Three memories. All transcribed, all searchable, all embedded in the photo — forever.

Your Part

Zero setup. Just your voice.

See it arrive

Someone in your family group shares a photo. It shows up as a rich story card — faces tagged, location pinned, weather recorded, the whole story written out. You didn't do any of this. It just arrived like that.

Tap and talk

Record a voice reply right in Telegram. Share a memory, correct a detail, crack a joke about Uncle Dave's sunburn. Your voice, your personality, your version of what happened.

It sticks forever

Your voice clip is transcribed, compressed, and embedded directly into the photo file itself. Not in an app. Not in a database. In the actual image. Copy it, back it up, pass it down — your voice travels with the photo.

Seven threads per photo

Each photo has room for seven voice threads. One person might use three of them. Two people might go back and forth. The whole family might pile in. However it plays out, that's the photo's story.

Why Voice

A laugh. A pause. The way Nana says your name.

Text tells you what someone said. Voice tells you how they felt. The hesitation before a punchline. The crack in someone's voice when they remember. The way your dad says "mate" differently than anyone else on earth.

You can transcribe what someone says. You can't transcribe who they are. That's what voice preserves.

Every voice clip is automatically transcribed so it's searchable and readable. But the audio is the real treasure — compressed with Opus, embedded in the photo's metadata, and portable forever.

Under the Hood

Everything travels with the photo

Opus Audio

Voice clips are compressed with the Opus codec at 16 kbps, which works out to roughly 120 KB per minute of audio. Opus is built for speech, so every pause and every laugh comes through clearly.
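
For a sense of scale, here is a rough sketch of the arithmetic behind that bitrate. It assumes a constant 16 kbps stream and ignores container overhead, so real files will differ a little; the function name is made up for illustration.

```python
def opus_clip_size_bytes(duration_seconds: float, bitrate_bps: int = 16_000) -> int:
    """Approximate audio payload size for a constant-bitrate clip.

    Assumption: constant bitrate, no container or metadata overhead.
    """
    # bits per second -> bytes per second, then scale by duration
    return int(duration_seconds * bitrate_bps / 8)


if __name__ == "__main__":
    for seconds in (15, 60, 180):
        size = opus_clip_size_bytes(seconds)
        print(f"{seconds:>3} s of speech is about {size / 1000:.0f} kB")
```

At this rate a one-minute memory is about 120 kB, small next to a typical multi-megabyte photo.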

XMP Metadata

Threads embedded in the photo's XMP data. Copy the file, email it, back it up — the voices come with it. No sidecar files. No database.
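
To make the idea concrete, here is a minimal sketch of how a voice thread could be packed into an XMP/RDF packet and read back. The `ps` namespace URI and the `VoiceThreads`, `speaker`, `transcript`, and `audio` property names are assumptions for illustration, not PhotoSpeak's actual schema, and the sketch stops short of writing the packet into an image file.

```python
import base64
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PS_NS = "https://example.com/ns/photospeak/1.0/"  # hypothetical namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ps", PS_NS)


def build_xmp(threads):
    """Serialize (speaker, transcript, opus_bytes) tuples into an RDF/XML string."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""})
    voice = ET.SubElement(desc, f"{{{PS_NS}}}VoiceThreads")  # hypothetical property
    seq = ET.SubElement(voice, f"{{{RDF_NS}}}Seq")  # ordered list of threads
    for speaker, transcript, audio in threads:
        li = ET.SubElement(seq, f"{{{RDF_NS}}}li", {f"{{{RDF_NS}}}parseType": "Resource"})
        ET.SubElement(li, f"{{{PS_NS}}}speaker").text = speaker
        ET.SubElement(li, f"{{{PS_NS}}}transcript").text = transcript
        # Binary audio has to be text-encoded to live inside XML
        ET.SubElement(li, f"{{{PS_NS}}}audio").text = base64.b64encode(audio).decode("ascii")
    return ET.tostring(rdf, encoding="unicode")


def read_threads(xmp):
    """Recover the (speaker, transcript, opus_bytes) tuples from the packet."""
    root = ET.fromstring(xmp)
    return [
        (
            li.findtext(f"{{{PS_NS}}}speaker"),
            li.findtext(f"{{{PS_NS}}}transcript"),
            base64.b64decode(li.findtext(f"{{{PS_NS}}}audio")),
        )
        for li in root.iter(f"{{{RDF_NS}}}li")
    ]


if __name__ == "__main__":
    packet = build_xmp([("Nana", "That was the day the dog ate the cake.", b"fake-opus-bytes")])
    print(read_threads(packet))
```

Because the audio sits inside the metadata as text, anything that copies the file byte for byte carries the voices along with it.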

Auto-Transcribed

Every voice clip is transcribed automatically. Search by what someone said, not just what's in the photo. "Find the one where Dad talks about the dog."
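
A search like that can be pictured as a simple filter over transcripts. The sketch below is one possible approach under stated assumptions: the `VoiceClip` record and `search_clips` function are hypothetical names, and matching is plain case-insensitive keyword lookup rather than whatever PhotoSpeak actually uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VoiceClip:
    photo: str        # file name of the photo the clip is attached to
    speaker: str      # who recorded the clip
    transcript: str   # automatic transcription of the audio


def search_clips(clips: List[VoiceClip], query: str, speaker: Optional[str] = None) -> List[VoiceClip]:
    """Return clips whose transcript contains every query word, optionally by one speaker."""
    words = query.lower().split()
    return [
        clip
        for clip in clips
        if all(word in clip.transcript.lower() for word in words)
        and (speaker is None or clip.speaker.lower() == speaker.lower())
    ]


if __name__ == "__main__":
    clips = [
        VoiceClip("beach.jpg", "Dad", "The dog ran straight into the sea."),
        VoiceClip("beach.jpg", "Nana", "Uncle Dave got so sunburnt."),
        VoiceClip("picnic.jpg", "Mum", "We forgot the dog bowl."),
    ]
    for hit in search_clips(clips, "dog", speaker="Dad"):
        print(hit.photo, "-", hit.transcript)
```

Filtering on the speaker as well as the words is what turns "the dog" into "the one where Dad talks about the dog".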

No App Required

Reply through Telegram — the app you already have. No sign-up, no download, no account. Just open the group chat and talk.