Someone shares a photo. It arrives in your group chat as a rich story card — faces tagged, location, weather, the whole story. You tap it, record a voice reply, and your memory becomes part of that photo forever. No app to install. No sign-up. Just your voice.
This is what drops into your Telegram group. Not just a photo — a story card packed with context. Faces, location, weather, elevation, the story behind the moment. Everything the annotator captured, rendered into something you can tap and explore.
The photo starts a conversation. Everyone who was there — or anyone who has a memory — adds their voice. Not a comment. Not a text. Their actual voice, with all the laughter and personality that text can never capture.
Three people. Three memories. All transcribed, all searchable, all embedded in the photo — forever.
Someone in your family group shares a photo. It shows up as a rich story card — faces tagged, location pinned, weather recorded, the whole story written out. You didn't do any of this. It just arrived like that.
Record a voice reply right in Telegram. Share a memory, correct a detail, crack a joke about Uncle Dave's sunburn. Your voice, your personality, your version of what happened.
Your voice clip is transcribed, compressed, and embedded directly into the photo file itself. Not in an app. Not in a database. In the actual image. Copy it, back it up, pass it down — your voice travels with the photo.
Each photo gets seven voice threads. One person might use three of them. Two people might go back and forth. The whole family might pile in. However it plays out — that's the photo's story.
Text tells you what someone said. Voice tells you how they felt. The hesitation before a punchline. The crack in someone's voice when they remember. The way your dad says "mate" differently than anyone else on earth.
You can transcribe what someone says. You can't transcribe who they are. That's what voice preserves.
Every voice clip is automatically transcribed so it's searchable and readable. But the audio is the real treasure — compressed with Opus, embedded in the photo's metadata, and portable forever.
Voice clips compressed with the Opus codec at 16 kbps. Roughly 120 KB per minute of audio. Optimised for speech: every pause, every laugh, crystal clear.
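For the curious, the size claim is simple arithmetic. The Python sketch below assumes a constant 16 kbps stream and ignores container overhead, so real files will differ slightly. The function name is illustrative only and not part of any product API.

```python
def opus_size_bytes(seconds: float, bitrate_bps: int = 16_000) -> int:
    """Approximate audio payload size at a constant bitrate.

    Assumption: constant bitrate, no container or metadata overhead.
    """
    # bits per second * seconds = total bits; divide by 8 for bytes
    return round(seconds * bitrate_bps / 8)


one_minute = opus_size_bytes(60)
three_minutes = opus_size_bytes(180)
print(f"1 min is about {one_minute / 1000:.0f} kB, "
      f"3 min is about {three_minutes / 1000:.0f} kB")
```

At this bitrate a one-minute memory costs about as much space as a small thumbnail image.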
Threads embedded in the photo's XMP data. Copy the file, email it, back it up — the voices come with it. No sidecar files. No database.
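One way to picture this is as a small XMP (RDF/XML) packet that carries each clip's speaker, transcript, and base64-encoded audio. The Python sketch below is an illustration under assumptions: the `voice` namespace URI, the field names, and the `build_voice_xmp` helper are all hypothetical, and the actual layout used by the product is not described here. It only builds the packet text. Writing it into an image file would need an XMP-aware tool, which is not shown.

```python
import base64
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VOICE_NS = "https://example.com/ns/voice/1.0/"  # hypothetical namespace
MAX_THREADS = 7  # from the text: each photo gets seven voice threads

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("voice", VOICE_NS)


def build_voice_xmp(threads):
    """Build an RDF/XML string from a list of thread dicts.

    Each dict needs 'speaker', 'transcript', and 'opus_bytes' keys
    (hypothetical field names chosen for this sketch).
    """
    if len(threads) > MAX_THREADS:
        raise ValueError(f"a photo holds at most {MAX_THREADS} voice threads")

    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF_NS}}}Description")
    desc.set(f"{{{RDF_NS}}}about", "")
    prop = ET.SubElement(desc, f"{{{VOICE_NS}}}threads")
    seq = ET.SubElement(prop, f"{{{RDF_NS}}}Seq")  # ordered list of threads

    for thread in threads:
        item = ET.SubElement(seq, f"{{{RDF_NS}}}li")
        item.set(f"{{{RDF_NS}}}parseType", "Resource")
        ET.SubElement(item, f"{{{VOICE_NS}}}speaker").text = thread["speaker"]
        ET.SubElement(item, f"{{{VOICE_NS}}}transcript").text = thread["transcript"]
        # Binary audio has to be text-encoded to live inside XML.
        audio = base64.b64encode(thread["opus_bytes"]).decode("ascii")
        ET.SubElement(item, f"{{{VOICE_NS}}}audio").text = audio

    return ET.tostring(rdf, encoding="unicode")


packet = build_voice_xmp([
    {
        "speaker": "Dad",
        "transcript": "That was the day the dog ate the sandwiches.",
        "opus_bytes": b"\x00\x01\x02",  # placeholder bytes, not real Opus audio
    }
])
print(packet)
```

Because the transcript and audio sit side by side in the same packet, anything that copies the file's metadata copies both.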
Every voice clip is transcribed automatically. Search by what someone said, not just what's in the photo. "Find the one where Dad talks about the dog."
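As a rough idea of what searching by transcript can look like, here is a minimal Python sketch that matches photos whose clips contain every word of a query. The file names, sample transcripts, and the `search_transcripts` helper are made up for illustration. A real search would likely handle word forms and typos more gracefully than this exact-word match does.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def search_transcripts(photos, query):
    """Return file names of photos where one clip contains every query word.

    photos maps a file name to the list of transcripts embedded in it.
    """
    wanted = tokenize(query)
    if not wanted:
        return []
    hits = []
    for filename, transcripts in photos.items():
        if any(wanted <= tokenize(t) for t in transcripts):
            hits.append(filename)
    return hits


# Hypothetical sample data
photos = {
    "beach.jpg": ["Dad would not stop talking about the dog chasing seagulls."],
    "hike.jpg": ["We got to the top just before the rain."],
}
print(search_transcripts(photos, "dad dog"))  # ['beach.jpg']
```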
Reply through Telegram — the app you already have. No sign-up, no download, no account. Just open the group chat and talk.
The best stories aren't written. They're told.