n8n is not automation. It’s an orchestrator.
Published on 2026-05-26
Most people use n8n to move data from one place to another: a webhook arrives and a task is created, a form is filled out and the entry goes to the CRM. That’s useful, no doubt, but it’s a bit like using a good knife only to open envelopes.
n8n becomes interesting when it’s no longer seen as an executor and is used as a conductor. It doesn’t do anything itself — but it knows whom to call, in what order, and what to pass along. Each orchestra member plays its part; n8n simply holds the score and makes sure no one comes in at the wrong time.
I’ll show a concrete example — fortunately I have a live one.
Problem: monitoring that no one sets up
Small teams almost never set up proper server monitoring, and not because they don’t see the point: the alternative looks like Zabbix or Prometheus. These are enterprise-grade tools with their own ecosystems, their own specialists, and their own configuration pain, and Prometheus only pays off when someone on the team knows what to do with it and why. For the three servers of a small product it’s like hiring a chef to make a sandwich: technically a solution, but somewhat disproportionate.
So monitoring gets postponed until better times, and those better times usually arrive when something breaks and it turns out downtime is expensive: an online store is down for two hours at night — lost orders, angry customers in the morning, and time spent investigating instead of doing normal work.
Honest observation
When something does fail, a team without a DevOps engineer does exactly the same thing every time: someone copies the logs and drops them into ChatGPT with the question “here’s the error, what does it mean and how do I fix it?”. And you know what, it works. It just happens manually, after the customer has already called, with time lost hunting for the right lines among hundreds of log entries.
For one of my projects I automated exactly this process. I didn’t “build a monitoring system” in any serious sense; I simply removed the manual work from what was already happening. A signal arrives, the logs are collected, the analysis is prepared, and by the time you open the notification the answer is already in Telegram.
Architecture: who does what
The system consists of four components, each doing exactly what it was created for and not trying to do anything extra.
Grafana + Loki watch the log stream in real time: Loki stores the logs, Grafana evaluates them, and when a critical event appears, for example a series of HTTP 5xx errors, an alert fires.
n8n — the orchestrator. Receives a webhook from Grafana and controls all further logic: where to go, what to fetch, whom to hand off to.
LLM via OpenRouter — the analyst. Receives an array of logs and returns a human-readable diagnosis with a recommendation.
Telegram — just the delivery channel; the analysis result is sent to the responsible person.
Grafana alert → webhook → n8n → request logs from Loki → LLM → Telegram
There is no agent on the server and no direct access to the infrastructure: the system only reads logs and tells you what to do, so the server remains under your full control.
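The chain above can be sketched as one function in which every step is injected, which is roughly the role n8n plays. This is only an illustration: the names `handleAlert`, `fetchLogs`, `analyze`, and `notify` are hypothetical, and in the real workflow each step is an n8n node rather than a function.

```javascript
// Hypothetical sketch of the orchestration. The conductor calls each step
// in order and passes results along, but does none of the work itself.
async function handleAlert(alert, { fetchLogs, analyze, notify }) {
  const logs = await fetchLogs(alert);           // Loki, reached through Grafana
  const diagnosis = await analyze(alert, logs);  // LLM via OpenRouter
  await notify(diagnosis);                       // Telegram
  return diagnosis;
}

// Stub steps stand in for the real services.
const stubSteps = {
  fetchLogs: async () => "sample log line",
  analyze: async (alert, logs) => `Alert "${alert.message}" with logs: ${logs}`,
  notify: async (text) => console.log(text),
};

handleAlert({ message: "5xx spike" }, stubSteps);
```

Because the steps are passed in, any one of them can be swapped (another model, another messenger) without touching the order of calls.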
Technical details
Alert in Grafana
In Grafana Alerting you create a rule based on a LogQL query to Loki, and there’s a non-obvious point with filtering. If you search simply for the text “error”, the alert will trigger on any URL where this word appears — and that happens more often than you’d like. It’s more reliable to match the HTTP status code by its position in the nginx log line:
count_over_time({service_name="your-service"} |~ "\" 5[0-9][0-9] " [1m])
The quote before the number is the closing quote of the request field, the one right after HTTP/1.1 in the nginx log line, so the status code can’t be confused with digits in the request URL. It’s a small detail, but without it the system generates false positives and quickly becomes annoying.
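A quick way to see the difference is to run both patterns over a few sample lines. The log lines below are invented for illustration and assume nginx’s default combined log format, where the status code follows the quoted request. JavaScript’s regex engine is used here instead of Loki’s, but a pattern this simple should behave the same in both.

```javascript
// Invented sample lines in nginx's default "combined" format.
const lines = [
  '203.0.113.7 - - [26/May/2026:02:14:07 +0000] "GET /api/orders HTTP/1.1" 502 157 "-" "curl/8.5.0"',
  '203.0.113.7 - - [26/May/2026:02:14:09 +0000] "GET /docs/error-codes/500 HTTP/1.1" 200 5120 "-" "curl/8.5.0"',
  '203.0.113.7 - - [26/May/2026:02:14:11 +0000] "GET /items/503 HTTP/1.1" 200 981 "-" "curl/8.5.0"',
];

const naive = /error/;                // matches the word anywhere, including URLs
const byPosition = /" 5[0-9][0-9] /;  // same pattern as in the LogQL filter

const naiveHits = lines.filter((line) => naive.test(line));
const statusHits = lines.filter((line) => byPosition.test(line));

console.log(naiveHits);   // only the healthy 200 response to /docs/error-codes/500
console.log(statusHits);  // only the real 502
```

The naive filter flags a successful request and misses the actual failure; the positional one does the opposite.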
The alert rule is routed to a Contact Point of type Webhook whose URL is that of your n8n workflow.
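On the n8n side, the first step is usually to pull a few fields out of the webhook body. The sketch below is an assumption, not the original workflow: the field names (`status`, `title`, `message`, `alerts[].labels`, `alerts[].startsAt`) follow Grafana’s webhook notification format as I understand it, the `service_name` label is assumed to be carried over from the query, and `summarizeAlert` is a hypothetical helper. Check a real payload from your Grafana version before relying on it.

```javascript
// Hypothetical helper: reduces a Grafana webhook body to what later steps need.
// Field names are assumptions; verify them against an actual payload.
function summarizeAlert(payload) {
  const alerts = Array.isArray(payload.alerts) ? payload.alerts : [];
  const firing = alerts.filter((alert) => alert.status === "firing");
  const services = firing.map((alert) => alert.labels?.service_name).filter(Boolean);
  const startTimes = firing.map((alert) => alert.startsAt).filter(Boolean).sort();
  return {
    status: payload.status ?? "unknown",
    title: payload.title ?? "",
    message: payload.message ?? "",
    services: [...new Set(services)],   // unique services among firing alerts
    startedAt: startTimes[0] ?? null,   // earliest start (ISO strings sort lexically)
  };
}

// In an n8n Code node this might be called as (assuming the Webhook node
// puts the request body under `body`):
//   return [{ json: summarizeAlert($input.first().json.body) }];
```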
Fetching logs in n8n
If Loki is running in the same Docker environment as Grafana and isn’t exposed externally — no problem, Grafana can proxy requests to its datasources. From n8n a normal HTTP GET request is made via the Grafana API:
GET https://your-grafana.com/api/datasources/proxy/uid/loki/loki/api/v1/query_range
With parameters:
query {service_name="your-service"} |~ "\" 5[0-9][0-9] "
start {{ Math.floor(Date.now() / 1000) - 900 }}
end {{ Math.floor(Date.now() / 1000) }}
limit 200
Authorization is a Bearer token from a Grafana Service Account with the Viewer role, nothing more. The 900 seconds are the 15 minutes leading up to the alert, which is enough to catch the chain of events that led to the incident.
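For readers who want to test the request outside n8n, the same URL can be assembled in plain JavaScript. This is a minimal sketch: `buildLokiQueryUrl` is a hypothetical helper, `grafana.example.com` is a placeholder host, and the datasource UID `loki` is taken from the path shown above and may differ in your setup.

```javascript
// Hypothetical helper that builds the query_range URL for Loki behind
// Grafana's datasource proxy. Host and UID are placeholders.
function buildLokiQueryUrl({ baseUrl, datasourceUid, query, nowMs = Date.now(), windowSeconds = 900, limit = 200 }) {
  const end = Math.floor(nowMs / 1000);  // Unix time in seconds
  const start = end - windowSeconds;     // 900 s = 15 minutes back
  const path = `/api/datasources/proxy/uid/${encodeURIComponent(datasourceUid)}/loki/api/v1/query_range`;
  const url = new URL(path, baseUrl);
  url.search = new URLSearchParams({
    query,
    start: String(start),
    end: String(end),
    limit: String(limit),
  }).toString();
  return url.toString();
}

const lokiUrl = buildLokiQueryUrl({
  baseUrl: "https://grafana.example.com",
  datasourceUid: "loki",
  // "\\" in a JS string literal is a single backslash in the LogQL query.
  query: '{service_name="your-service"} |~ "\\" 5[0-9][0-9] "',
  nowMs: 1780000000000,  // fixed clock so the output is reproducible
});
console.log(lokiUrl);
// The request itself would also carry: Authorization: Bearer <service account token>
```

URLSearchParams takes care of escaping the quotes, braces, and pipe in the LogQL expression, which is where hand-built URLs tend to go wrong.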
Preparing logs for the LLM
Loki returns a nested JSON with streams and values, and before sending this to the LLM, a Code node in n8n unpacks everything into readable text:
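// Loki's query_range response: data.result is an array of streams,
// each holding `values` as [timestamp, line] pairs.
const result = $input.first().json;
const streams = result.data?.result ?? [];
const lines = [];
for (const stream of streams) {
  // Only the log line is needed; tolerate a stream without values.
  for (const [, line] of stream.values ?? []) {
    lines.push(line);
  }
}
// Return a single item with all lines joined into one text block for the prompt.
return [{ json: { logs: lines.join("\n") } }];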
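One possible refinement, which is not part of the original workflow: Loki groups entries by stream, so concatenating streams one after another can scramble the timeline when several streams match. The sketch below merges all entries, sorts them by timestamp, and keeps only the most recent lines so the prompt stays bounded. The function name `flattenLokiResponse` and the `maxLines` cap are hypothetical.

```javascript
// Hypothetical variant: merge all streams, order by time, keep the newest lines.
function flattenLokiResponse(response, maxLines = 200) {
  const streams = response?.data?.result ?? [];
  const entries = [];
  for (const stream of streams) {
    for (const [ts, line] of stream.values ?? []) {
      // Timestamps arrive as nanosecond strings; BigInt avoids precision loss.
      entries.push({ ts: BigInt(ts), line });
    }
  }
  entries.sort((a, b) => (a.ts < b.ts ? -1 : a.ts > b.ts ? 1 : 0));
  const kept = maxLines > 0 ? entries.slice(-maxLines) : [];
  return kept.map((entry) => entry.line).join("\n");
}

// Made-up response with two streams whose entries interleave in time.
const sampleResponse = {
  data: {
    resultType: "streams",
    result: [
      { stream: { service_name: "your-service" }, values: [["1780000002000000000", "third"], ["1780000000000000000", "first"]] },
      { stream: { service_name: "your-service", pod: "b" }, values: [["1780000001000000000", "second"]] },
    ],
  },
};

console.log(flattenLokiResponse(sampleResponse));
// first
// second
// third
```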
Request to the LLM
OpenRouter gives you access to many models through a single API, and Claude Haiku works well for log analysis: the task doesn’t require deep reasoning, it requires accuracy in following instructions. The prompt is intentionally restrictive:
You are a server monitoring assistant. Analyze these logs and identify
what went wrong and how to fix it. Stick strictly to what's in the logs —
do not invent facts or guess beyond the evidence.
Alert message:
{alert message from Grafana}
Logs:
{logs for 15 minutes before the alert}
The “do not invent facts” condition is essential here — without it the model starts generating plausible but false explanations, which is worse than raw logs.
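Put together, the request body for OpenRouter’s OpenAI-compatible chat completions endpoint might look like the sketch below. Several things here are assumptions rather than the original setup: the helper name `buildLlmRequest`, the model slug `anthropic/claude-3-haiku`, the `temperature: 0` setting, and the split into a system message (instructions) and a user message (alert and logs), where the original uses a single prompt.

```javascript
// Instructions taken from the prompt in the article.
const INSTRUCTIONS = [
  "You are a server monitoring assistant. Analyze these logs and identify",
  "what went wrong and how to fix it. Stick strictly to what's in the logs;",
  "do not invent facts or guess beyond the evidence.",
].join(" ");

// Hypothetical helper: builds the JSON body for
// POST https://openrouter.ai/api/v1/chat/completions
// (sent with an "Authorization: Bearer <OpenRouter API key>" header).
function buildLlmRequest({ alertMessage, logs, model = "anthropic/claude-3-haiku" }) {
  return {
    model,           // assumed slug; use whichever model you prefer
    temperature: 0,  // assumption: low temperature for more repeatable answers
    messages: [
      { role: "system", content: INSTRUCTIONS },
      { role: "user", content: `Alert message:\n${alertMessage}\n\nLogs:\n${logs}` },
    ],
  };
}

const requestBody = buildLlmRequest({
  alertMessage: "5xx errors on your-service",
  logs: '"GET /api/orders HTTP/1.1" 502 157',
});
console.log(JSON.stringify(requestBody, null, 2));
```

Keeping the instructions in a fixed system message means the only variable parts are the alert text and the logs, which makes the prompt easier to audit when an answer looks off.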
Why this works
Each tool stays in its lane and doesn’t try to do someone else’s job. Grafana doesn’t analyze logs — it monitors and alerts. Loki doesn’t try to be an analyst — it stores and returns data on request. The LLM doesn’t manage infrastructure — it compresses noise into a signal, and that’s something it does well. n8n doesn’t do any of the above — it knows whom to call and in what order, and that’s its value.
This is the right way to use tools: not trying to do everything with one solution, but precisely distributing responsibility to those who handle it best.
Instead of a conclusion
You can assemble such a system in a few hours without being a DevOps engineer — not because it’s trivial, but because the complexity is distributed to the right places and each component does exactly what it was created for.
AI here doesn’t replace an engineer; it removes manual work from a process that was already happening — just slowly and after the fact. The value isn’t that the LLM is smart, but that the system reacts before the problem becomes expensive.
Boring, proven tools, assembled in the right order, often solve the problem faster and cheaper than a specialized enterprise solution. This isn’t a compromise — it’s systems thinking.