// Python Dev
How to integrate an LLM with memory: store the facts yourself
Published on 2026-05-14
LLMs reason very well. Their memory is poor.
Ask an AI assistant about something you mentioned earlier in a long dialogue, and it might get confused, mix up details, or simply drift off and start inventing facts. The longer the context, the less predictable the answers. You start probing it, and the facts drift further and further away. You'd think huge context windows would help, but no! And if you build a product that includes an LLM, this becomes dangerous: when a support assistant starts glitching, it doesn't do your reputation any favors. So what can you do?
The solution is simple: don’t trust the model’s memory. Store the facts yourself.
How it works
User data — profile, history, any structured facts — lives in a database. Yes, a boring SQL or NoSQL database. When a request comes in, the relevant data is retrieved and sent into the context window together with the prompt. The model always sees exactly what it needs — no more, no less.
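A minimal sketch of this flow, assuming a hypothetical `user_facts` key-value table (the schema, table name, and sample facts are illustrative, not from the article):

```python
import sqlite3

# Hypothetical schema: one (user_id, key, value) row per stored fact.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_facts (user_id TEXT, key TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO user_facts VALUES (?, ?, ?)",
    [("u1", "name", "Alice"), ("u1", "plan", "Pro"), ("u1", "timezone", "UTC+3")],
)

def build_context(user_id: str, question: str) -> str:
    """Pull the user's stored facts and prepend them to the prompt."""
    rows = conn.execute(
        "SELECT key, value FROM user_facts WHERE user_id = ?", (user_id,)
    ).fetchall()
    facts = "\n".join(f"- {key}: {value}" for key, value in rows)
    return f"Known facts about the user:\n{facts}\n\nQuestion: {question}"

prompt = build_context("u1", "Which plan am I on?")
# The resulting string goes to the LLM as-is; the model never has to
# "remember" the plan — it reads it from the prompt every time.
```

The point is that the database, not the model, is the source of truth: every request rebuilds the context from scratch, so stale or hallucinated facts can't accumulate.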
What you gain
Full control over the context. You decide what the model knows at each step. No need to worry whether it will remember something from the session before last, mix up users, or give an answer that contradicts what was said earlier.
The same applies to conversation history. Instead of feeding the model the entire chat, store messages in your database and load only the last few — that’s enough for the model to understand the conversation context. If the task gets more complex, you can go further and set up RAG search over the history. But honestly, for most cases “the last N messages from the table” deliver 90% of the result — without a dissertation on vector databases.
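The "last N messages from the table" approach can be sketched like this (the `messages` schema and chat IDs are assumptions for illustration):

```python
import sqlite3

# Hypothetical schema: an append-only message log, one row per turn.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, chat_id TEXT, role TEXT, text TEXT)"
)
for i in range(1, 11):
    conn.execute(
        "INSERT INTO messages (chat_id, role, text) VALUES (?, ?, ?)",
        ("chat1", "user" if i % 2 else "assistant", f"message {i}"),
    )

def last_messages(chat_id: str, n: int = 5) -> list[tuple[str, str]]:
    """Fetch the newest n messages, then flip them back into chronological order."""
    rows = conn.execute(
        "SELECT role, text FROM messages WHERE chat_id = ? ORDER BY id DESC LIMIT ?",
        (chat_id, n),
    ).fetchall()
    return rows[::-1]

history = last_messages("chat1", 3)
# history == [("assistant", "message 8"), ("user", "message 9"), ("assistant", "message 10")]
```

`ORDER BY id DESC LIMIT n` plus a reverse is the whole trick: the model gets a short, chronological tail of the conversation instead of the entire chat.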
And a nice bonus: you send only relevant data. Fewer tokens per request — a noticeable saving at any scale.
The main principle
An LLM is an engine for reasoning, not a data storage system. Once you separate these two responsibilities, the model starts working noticeably better. Not because the model changed — but because it works with quality input.
👉 Garbage in, garbage out. Clean structured data in — surprisingly good AI out.