Telegram Bot for Order Picking Verification from Invoice and Product Photos
2026-04-04
The client had an operational issue in order picking: some orders were assembled incorrectly, which led to returns, extra shipping, and additional team effort.
The business goal was straightforward: automate picking verification before shipment. The employee takes a photo of the invoice and photos of products on the table (with visible SKUs), and the system automatically checks whether actual picking matches the document.
What mattered for the business
- Reduce the share of incorrect pickings before shipping to the customer.
- Lower costs of returns and reshipments.
- Speed up quality control without assigning a dedicated checker to every order.
- Make verification transparent and standardized across the whole shift.
- Get a conclusion quickly in the same channel employees already use.
The key idea: replace manual visual checking with automated verification. OpenAI parsed the invoice separately and parsed products/tags from photos separately, then returned a structured conclusion on whether the picking was correct.
If you have a similar task (picking control, invoice/product photos, pre-shipment verification) and need to reduce operational losses, you can send details via the brief form.
What was implemented
- A workflow was built to accept a batch of photos: invoice + products with visible SKUs.
- OpenAI-based analysis was configured with a task-specific prompt for picking verification.
- A split extraction approach was implemented: invoice parsing and product photo parsing as separate stages.
- A structured JSON response was produced with the conclusion: picking is correct or incorrect.
- A Telegram bot was implemented to return a clear verification status for each photo set.
- Error scenarios were handled: low photo quality, unreadable SKUs, incomplete photo set.
Practical outcome
In day-to-day operations, picking verification became faster and more consistent: employees sent photos, and the bot returned a verification conclusion.
- Pre-shipment checks stopped relying only on the attention of a specific employee.
- A single decision format was introduced: “picking correct / incorrect” with structured details.
- The risk of missed errors that lead to returns and extra logistics was reduced.
- The team got a tool that fits the existing Telegram workflow without a separate UI.
Based on working statistics, about 90% of checks returned “picking correct.” The remaining 10% were usually also correct, but required attention due to OCR mistakes or missing readable SKUs in photos. As a result, employee attention shifted from rechecking all orders to targeted recheck of only those 10%.
Figure 1. Telegram bot in a real flow: a gallery is sent, and the bot returns verification results per photo.
OpenAI cost profile
Before packaging the solution into the Telegram bot, actual processing cost was measured in a test environment. This made it possible to provide the client with a per-photo cost estimate in advance and forecast budget for production volume.
Figure 2. Test OpenAI spend: 92 requests, $0.51. This was used to estimate per-photo processing cost before production rollout.
Why Telegram was chosen
The team already had an active Telegram workflow, so a separate web interface at the start would add unnecessary overhead. Verification was embedded directly into the existing process without extra onboarding or system switching.
Technical details
Below is the implementation outline that proved practical for scaling to similar document/visual verification tasks.
1. Delivery sequence: n8n first, Telegram bot second
- The first stage was a standalone workflow built and tested in n8n.
- After accuracy and logic were confirmed with the client, the same flow was packaged into a Telegram bot.
- This approach validated the business hypothesis quickly without front-loading bot interface complexity.
Figure 3. n8n test bench where the logic was validated before packaging into the Telegram bot.
2. Processing flow
- An employee sends a batch of photos to the bot: invoice + products on a table with visible SKUs.
- Images are sent to OpenAI with a prompt designed for picking verification.
- The model extracts invoice data and product/tag data separately.
- A document-vs-actual comparison is performed by SKU and positions.
- A structured JSON conclusion is returned for picking correctness.
- The bot returns verification status to the employee per photo set/order.
3. What was validated separately
- Invoice readability and sufficient data for line extraction.
- SKU readability on product photos (tags, labels, shooting angle).
- Completeness of photo set for a specific order.
- Empty or low-confidence recognition routed to manual review.
4. User roles and Telegram command access
- Access control was implemented with user whitelists.
- Administrators could manage whitelists (add/remove/update access).
- Administrative statistics were available through slash commands in Telegram.
- Regular employees did not have access to admin commands or statistics.
- The
/helpcommand was role-aware: admins saw full command set, employees saw only operational commands.
5. Two bot environments: test and production
- Two bots were maintained in parallel: a test bot and a production bot.
- Each bot had its own environment with separate settings and tokens.
- Changes were applied to the test bot first and confirmed with the client.
- After confirmation, the same code was deployed to production.
Why this is good in practice: it reduces risk of production incidents, makes releases predictable, and enables customer sign-off on real scenarios before rollout. For business operations, this means fewer outages, fewer rollback events, and a more stable picking process.
6. Practical OCR and visual-verification notes
- In this case, photo quality affected results less than expected; the critical factor was that text should not be rotated by 90/180 degrees.
- For reliable comparison, SKUs on product photos should be visible separately and not overlap.
- The best outcome came from a hybrid approach: automated verification plus manual recheck of disputed cases.
7. Separate challenge: Telegram photo gallery handling
A significant implementation challenge was gallery handling: Telegram does not provide a reliable way to know upfront how many images are in a specific gallery, and there is no explicit “gallery complete” signal at the moment each individual image arrives.
This complicates concurrent processing: photos must be grouped correctly, premature analysis of incomplete sets must be avoided, and unnecessary locks for other incoming orders must be avoided at the same time. In production, this required dedicated buffering and synchronization logic before final verification was triggered.
8. Two-pass experiment (via n8n)
A separate two-pass approach was tested in n8n: first, OCR-only extraction produced JSON for invoice and products; then a second stage analyzed that JSON for picking correctness.
This performed noticeably worse. In practice, when the model saw raw photos directly, it had better context and produced more accurate one-pass conclusions. The final production path therefore used one-pass flow: “photos -> analysis -> conclusion.”
9. What to keep in mind when replicating this in another project
- Define a photo capture standard for staff first (angle, light, SKU visibility).
- Design error handling and resubmission flow explicitly, not just the happy path.
- Track metrics from day one: wrong-picking rate, manual recheck share, and verification time per order.
If needed, the next iteration can add bot UI screenshots, a sample JSON schema, and fixed before/after metrics as a compact case-study package for proposals.