Overview
An end-to-end sales reporting system, running for two areas — Sumbar and Sumut (West and North Sumatra). Two n8n workflows do the work: one syncs the day’s sales into BigQuery every night, the other turns that data into an AI-written report every morning. By 07:00 WIB, DeepSeek has read the last seven days, written the insight and the action plan, and the workflow has rendered a PDF, archived it to Drive, and emailed it to that area’s leadership — CEO, GM, consultant, and the area sales manager — before anyone has opened a spreadsheet. The same pair of workflows runs per area; only the data source and the recipient list change.
Problem
For a long time, all of this lived in a single spreadsheet — and that was fragile in two ways. First, the data wasn’t clean: older years used a different date format and had no transaction IDs at all, so there was no reliable way to tell rows apart or put them in order. Second, scale — Sumbar alone had already passed 44,000 rows by the time I built this, and a spreadsheet that size stops being a dependable system of record. Keeping the sales history there for the long run wasn’t safe, so I moved it into BigQuery as the backup and source of truth.
Then there was the reporting itself. Writing the daily report used to be my job, by hand, every morning. The numbers were the easy part; the real work was the interpretation — turning a week of sales into a short “here’s what it means and here’s what to do.” That doesn’t scale, and it’s the first thing to slip on a busy morning. A report that’s late or skipped isn’t one executives can rely on.
Stage 1 — Ingestion (nightly, 23:59 WIB)
A second workflow keeps BigQuery current so the morning report always has something trustworthy to read. It runs at the end of each day and loads the day’s sales from the source Google Sheet — but never blindly.

- Watermark, not reload — before reading, it asks BigQuery for the latest transaction ID already stored, then loads only the rows newer than that. The sync is incremental and idempotent: run it twice and there are still no duplicates.
- Normalize the mess — hand-entered data is never clean, so the transform step fixes mixed date formats, collapses customer types into consistent categories, coerces blank numbers to zero, and even survives a column header that’s a single blank space. Every row is stamped with its source and ingest time.
- Bootstrap or delta — the first run loads the whole table; every run after that carries only the new transactions.
Stage 2 — Reporting (daily, 07:00 WIB)
- Pull & guard — reads the trailing seven days of an area’s sales from BigQuery, with one deliberate check first: did yesterday’s data land? If it didn’t, the workflow refuses to send a confident-but-wrong report and emails an alert instead — my cue to go check why ingestion failed. Trust comes before automation.
- Narrate with AI — DeepSeek reads the aggregated numbers and writes the part I used to write by hand: the insight and a concrete action plan, not a restatement of the figures.
- Render & deliver — the narrative and data become an HTML report, converted to PDF, archived to Google Drive, and emailed to that area’s leadership via Gmail.
Results
The report now writes, explains, and delivers itself before 08:00 — no spreadsheet touched, no morning carved out. The interpretation that used to depend on my time is consistent and on time, every day, and it lands in front of the CEO, GM, consultant, and area sales manager as a PDF they can read and forward. And when the data is missing, I’m the one who finds out first — through the alert — instead of a stakeholder finding out through a blank report. Because it’s built as a template, standing the report up for a new area is just a matter of pointing it at that area’s data and recipients — Sumbar and Sumut are live, and the next region is a copy away.
Tools
Need a dashboard like this?
Tell me what you're trying to see — I'll handle the pipeline behind it.
Get in Touch