How to analyze thousands of reviews on Wildberries using LAG: a step-by-step walkthrough

For popular products on Wildberries, the number of reviews easily runs into the thousands. Reading them manually is slow, tedious, and inefficient.

Real reasons for returns, systemic product issues, recurring complaints, and strengths are hidden in the reviews. The task is to quickly extract the essence, not drown in text.

Why a “brute-force” analysis doesn’t work

A typical approach is to load all reviews into an LLM and ask for a summary. In practice this yields poor results: repeating patterns are lost, important details get blurred, and the outcome is too general.

The reason is simple: the model can’t handle a large amount of heterogeneous text in a single pass. A different approach is needed.

Solution architecture

The process is built in stages:

Review collection
Preliminary classification
Splitting into groups
Local group analysis (Map phase)
Final aggregation (Reduce phase)
Generating the final report

Essentially this is analogous to MapReduce: each group is analyzed independently, then the results are combined. Let’s examine each stage in more detail.

1. Review collection

Reviews are collected via browser automation based on Playwright with interception of XHR requests. This is faster than HTML parsing and more resilient to layout changes — API structure changes less often than page markup.

2. Preliminary classification

Before sending to the LLM, reviews are divided into positive and negative — by product rating and simple keyword filters. This reduces load on the model and improves accuracy: the model works with homogeneous data rather than a mixed stream.

3. Splitting into groups

Reviews are split into groups of 20–30 items. With larger volumes (100 and up) the model struggles to maintain context, error rates and generalizations increase. Small groups provide stable and predictable results.

4. Local group analysis (Map phase)

Each group is processed through the LLM with the same prompt:

You are a product analyst. Your task is to process product reviews and produce an aggregated analysis.

Instructions:
- Separate pros and cons of each review.
- Group repeating pros and cons.
- At the end, write a brief summary of 2–3 sentences: what customers like most often, what most often causes dissatisfaction.

Input data:
[list of reviews]

The output for each group is a list of key pros, a list of cons, and a short summary.

5. Final aggregation (Reduce phase)

All group results are combined and re-sent to the LLM. The goal of this step is to remove duplicates, merge similar formulations, and highlight the main problems and advantages. Without this step the final report will be overloaded with repetitions.

6. Final report

The output is a structured document:

Brief summary — 3–5 sentences about the product as a whole
Key pros — grouped and supported by mention frequency
Key cons — likewise

Additionally: several headline variants for the product card are generated, from which the best is selected in a separate request.

Example fragment of a real output:

14-liter air fryer — cooks quickly, but not always evenly
The Libhof AFZ-14 air fryer has a fairly large capacity — 14 liters, which is convenient for cooking large portions, and various automatic programs with touch control. In use it cooks dishes quickly, usually tasty and juicy, and the dehydrator function allows making healthy snacks. The device is equipped with a rotisserie and a removable lid, which expands cooking possibilities and makes cleaning easier.
However, there are issues with evenness of cooking in operation, which is noticeable on some dishes. Sometimes there is an odor on first use, and the manual and recipes leave questions. Build quality and комплектация vary between units — unfortunately, there are occasional defects and incomplete accessory sets. Overall, it is a useful appliance but requires care when selecting and operating.
#multigrill #electricgrill #homecooking
Electric air fryer for home 14 liters, multi-oven — 9 037 ~~27 273~~ RUB.

Performance

Processing one product takes about 1 minute:

~25 seconds — review collection (browser + XHR)
~35 seconds — processing via LLM

Data volume: 1000–2000 reviews, ~60–65 thousand tokens for the full cycle.

Tech stack

Playwright — data collection
Simple filtering logic — preliminary classification
OpenRouter — working with the LLM
Parallel group processing — speeds up the Map phase

Orchestration can be implemented as a backend service or via no-code tools — for example, n8n.

Common implementation mistakes

Groups that are too large — with >100 reviews the model begins to generalize and lose details
Different prompts for different groups — results become incompatible at the aggregation stage
Lack of final aggregation — without the Reduce phase the report turns into an unreadable mess
Trying to do everything in one request — the most common mistake where it all begins

Limitations

The approach is not perfect: quality depends on the chosen model, each cycle costs tokens, and no one guarantees 100% accuracy of formulations. At the same time key problems and patterns are consistently revealed — noise is effectively filtered out at the grouping stage.

Conclusion

Analyzing thousands of Wildberries reviews through an LLM in a brute-force manner is a lottery. Stage-by-stage processing with grouping and aggregation gives a fundamentally different quality: a stable and repeatable result. Run it on one product twice — you get the same conclusions. Run it on a thousand products — you get comparable reports that can be compared with each other.

This is what distinguishes the tool from an experiment: not a one-off lucky output, but a predictable process that scales to any volume and turns thousands of reviews into structured analytics in minutes.

For business, it’s a way to quickly identify product issues, improve the product card, and reduce returns. For the buyer, it’s a way to understand pros and cons without reading hundreds of comments manually.

Analyses and analysis examples: https://t.me/wildberris_pp

Why a “brute-force” analysis doesn’t work

Solution architecture

1. Review collection

2. Preliminary classification

3. Splitting into groups

4. Local group analysis (Map phase)

5. Final aggregation (Reduce phase)

6. Final report

Performance

Tech stack

Common implementation mistakes

Limitations

Conclusion

Другие статьи Python Dev

How to set up queue dialing via SIP and Python

Проекты Python Dev

Telegram bot for voice pranks

Automatic energy consumption control system

Automatic management of a Telegram channel network for a travel agent

Need help?