Automated call minutes: from recording to a structured document

Published on 2026-04-29

Distributed teams spend a lot of time on calls. They discuss tasks, make decisions, and assign actions, and all of it evaporates as soon as the conversation ends. One person jots something in a notebook, another relies on memory. Two days later, half of the agreements are lost.

The obvious solution is to record calls. But the recording itself doesn’t solve the problem: nobody will listen to a one-hour call just to find one instruction.

We built a pipeline that covers the whole path, from the recording to finished minutes.


How it works

Recording the conversation

The system supports two sources: SIP telephony via Asterisk and custom video conferences based on Jitsi Meet.

In both cases the conversation is recorded in stereo, with each participant on a separate channel. This gives an unexpectedly cheap way to solve speaker attribution: there is no need to synchronize separate files or set up diarization manually.
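
Since everything downstream relies on the two-channel layout, it can be worth checking a recording before sending it on. A minimal sketch using Python's standard wave module; the function names and the assumption that recordings are stored as WAV files are ours for illustration only:

```python
import wave


def channel_count(path: str) -> int:
    """Return the number of audio channels in a WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels()


def is_stereo_wav(path: str) -> bool:
    """True if the recording has exactly two channels (one per participant)."""
    return channel_count(path) == 2
```

A mono file here usually means the recorder mixed both sides together, in which case channel-based attribution will not work and diarization would be needed instead.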

ElevenLabs receives the stereo file with the use_multi_channel parameter enabled, and the output is a clean dialogue in which every utterance is labeled by channel. No additional steps.
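
To illustrate what channel labels make possible, here is a minimal sketch that interleaves per-channel words into speaker turns. The input shape (a list of per-channel transcripts, each with a channel_index and a list of words carrying text and start) is an assumption modeled on a typical multichannel STT response and should be checked against the actual API output; the speaker names and sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def merge_channels(transcripts: List[dict], speakers: Dict[int, str]) -> List[Tuple[str, str]]:
    """Interleave per-channel words into (speaker, utterance) turns by start time."""
    words = []
    for transcript in transcripts:
        channel = transcript["channel_index"]
        for word in transcript["words"]:
            words.append((word["start"], channel, word["text"]))
    words.sort(key=lambda w: w[0])  # chronological order across both channels

    turns: List[Tuple[str, List[str]]] = []
    for _, channel, text in words:
        speaker = speakers.get(channel, f"Channel {channel}")
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(text)  # same speaker keeps talking
        else:
            turns.append((speaker, [text]))  # speaker changed: start a new turn
    return [(speaker, " ".join(parts)) for speaker, parts in turns]


# Hypothetical sample data in the assumed shape.
sample = [
    {"channel_index": 0, "words": [
        {"text": "Can", "start": 0.0}, {"text": "you", "start": 0.3},
        {"text": "ship", "start": 0.5}, {"text": "Friday?", "start": 0.8}]},
    {"channel_index": 1, "words": [
        {"text": "Yes,", "start": 1.5}, {"text": "Friday", "start": 1.8},
        {"text": "works.", "start": 2.1}]},
]
for speaker, text in merge_channels(sample, {0: "Caller", 1: "Agent"}):
    print(f"{speaker}: {text}")
```

Because attribution comes from the channel index rather than from voice characteristics, the mapping from channel to participant only has to be known once, at recording time.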


Speech recognition

As soon as the call ends, the recording is automatically sent for transcription.

We used ElevenLabs Scribe, but the pipeline is easily adaptable to any STT service or a local model — depending on privacy and budget requirements.
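
One way to keep the STT step swappable is to hide each service behind the same small interface. The sketch below is an assumption about how that could look, not a description of our code: the registry, the function names, and the dummy backend are all hypothetical, and a real backend would wrap a cloud API call or a local model.

```python
from typing import Callable, Dict

# A transcriber takes a path to an audio file and returns the transcript text.
Transcriber = Callable[[str], str]

_BACKENDS: Dict[str, Transcriber] = {}


def register(name: str, backend: Transcriber) -> None:
    """Make a transcription backend available under a short name."""
    _BACKENDS[name] = backend


def transcribe(path: str, backend: str) -> str:
    """Run the chosen backend; the rest of the pipeline never sees which one."""
    if backend not in _BACKENDS:
        raise ValueError(f"unknown STT backend: {backend!r}")
    return _BACKENDS[backend](path)


# Placeholder backend for illustration only.
register("dummy", lambda path: f"[transcript of {path}]")
```

With this shape, switching from a cloud service to a local model becomes a configuration change rather than a code change.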


Analysis and structuring

The finished transcript goes to Claude or a local LLM with a simple prompt: here is a conversation between two people, identify the discussion topics, record action items, and structure them by topic.
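
A minimal sketch of how such a prompt could be assembled from a speaker-labeled dialogue. The exact wording of the instructions and the function name are assumptions for illustration; the resulting string would then be passed to whichever LLM client is in use.

```python
from typing import List, Tuple

# Paraphrase of the instructions described above; the wording is illustrative.
INSTRUCTIONS = (
    "Below is a conversation between two people. "
    "Identify the discussion topics, record the action items, "
    "and structure the result by topic."
)


def build_prompt(dialogue: List[Tuple[str, str]]) -> str:
    """Combine the instructions with a speaker-labeled transcript."""
    lines = [f"{speaker}: {text}" for speaker, text in dialogue]
    return INSTRUCTIONS + "\n\nTranscript:\n" + "\n".join(lines)


print(build_prompt([("Caller", "Can you ship Friday?"), ("Agent", "Yes, Friday works.")]))
```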

The model handles this well — a one-hour conversation is compressed into a compact document that clearly shows who said what, what was discussed, and what needs to be done.


Delivering the result

The finished minutes are sent to wherever the team already works: email, a CRM, or a knowledge base.

From there it can automatically flow into to-do lists, tasks, or internal documentation. No manual copying.
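
For the email route, the message can be built with Python's standard library alone. This is a sketch under assumptions: the function name, subject line, and addresses are hypothetical, and CRM or knowledge-base delivery would go through those systems' own APIs instead.

```python
from email.message import EmailMessage
from typing import List


def build_minutes_email(subject: str, minutes: str, sender: str, recipients: List[str]) -> EmailMessage:
    """Wrap the finished minutes in a plain-text email message."""
    message = EmailMessage()
    message["Subject"] = subject
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message.set_content(minutes)
    return message
```

The returned message can be handed to smtplib, for example with smtplib.SMTP(host).send_message(message), where host is the address of your mail server.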


What this provides

A one-hour call turns into a one-page document with topics, decisions, and action items — and appears in inboxes before participants even close their laptops.

No need to take notes during the call, ask colleagues again the next day, or listen to recordings searching for a single phrase. All agreements are recorded, structured, and available immediately.

The system works in the background and requires no actions from call participants — just talk as usual.


On deployment

The pipeline is configured quickly.

If Asterisk or Jitsi is already part of the infrastructure, integration takes minimal time.

The choice of STT and LLM is flexible: you can use cloud services or a fully local stack if the data must not leave your perimeter.

We built this for our own distributed team, and it is now hard to remember how we ever worked without it.
