Worked example · Ask Your Data Workflow

Six analysts.
Fourteen days a month.
Returned.

How a mid-market logistics business stopped writing monthly variance commentary by hand — and what the team does with the fourteen days a month they got back.

~120
analyst-days returned per year
>80%
finance team adoption, month 2
6 weeks
from kickoff to production
The context

Dashboards that showed everything.
Explained nothing.

The client was an Australian mid-market logistics business, roughly $180M in annual revenue. Power BI had been in production for five years. Fabric had been licensed eighteen months earlier. Dashboards were everywhere.

The problem was not the dashboards. The problem was that the finance team — six analysts across FP&A, management accounting, and commercial analytics — was spending the first two weeks of every reporting cycle writing variance commentary for the monthly board pack.

Each analyst took a portion of the business (regions, product lines, customer segments), opened the relevant Power BI report, wrote a few paragraphs explaining what moved and why, and emailed it to the Head of Finance for review. The Head of Finance then compiled, edited, re-edited, and delivered the pack to the CFO. The CFO would ask for more detail on two or three lines. The analysts would rewrite.

Fourteen days of the month were gone before anyone had done any actual finance work.

Why "more dashboards" wasn't the answer

The instinct in most businesses at this point is to build more reports. More slicers, more drill-through, more automated variance tables. The finance team had already been down that road: they had excellent reports. They didn't need better numbers — they needed better explanations.

The second instinct is to turn on Copilot. But out-of-the-box Copilot against the existing semantic model produced exactly the kind of confident-but-occasionally-wrong output you can't put in front of a board. Measures were ambiguous. Intercompany was handled inconsistently across reports. Currency normalisation was applied in some places and not others.

You cannot get trustworthy AI commentary on an untrustworthy model. That was the honest diagnosis.

"We don't need more reports. We need the system to stop making us write the same commentary every month."

What we built

A six-week Ask Your Data Workflow engagement, running through the standard six-stage Data Disruption method. Two Decision Engineers on our side, the Head of Finance and a senior analyst on theirs.

The build had three layers:

Layer one — the semantic model. We rebuilt the core finance measures with Claude Code and the Fabric MCP, resolving the intercompany, currency, and mix inconsistencies. Every new measure shipped with a unit test and documentation. The existing Power BI reports kept working; the underlying measures were now consistent.
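To make "every new measure shipped with a unit test" concrete: the real tests ran against DAX measures in Fabric, but the idea can be sketched in Python with hypothetical fixture rows and hypothetical measure logic (the row shape, rates, and function name below are illustrative, not the client's schema).

```python
# Illustrative sketch only: mirrors the unit-test-per-measure idea
# with made-up fixture data, not the actual Fabric/DAX implementation.

def fx_normalised_revenue(rows, rates):
    """External revenue converted to AUD at the period rate, intercompany excluded."""
    return sum(
        r["amount"] * rates[r["currency"]]
        for r in rows
        if not r["intercompany"]  # the inconsistency the rebuild resolved
    )

# Fixture: two external rows, one intercompany row that must be excluded.
rows = [
    {"amount": 100.0, "currency": "AUD", "intercompany": False},
    {"amount": 50.0,  "currency": "USD", "intercompany": False},
    {"amount": 999.0, "currency": "AUD", "intercompany": True},
]
rates = {"AUD": 1.0, "USD": 1.5}

assert fx_normalised_revenue(rows, rates) == 175.0  # 100*1.0 + 50*1.5
```

The point of the pattern is the fixture: each measure gets a tiny dataset where the correct answer is obvious by hand, so a regression in intercompany or currency handling fails loudly before it reaches a report.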

Layer two — the business glossary. A structured definition layer the AI workflow grounds against. Forty-seven terms, co-authored with the finance team in a single workshop, kept in the client's own repository so they own it. Revenue, margin, utilisation, cost recovery, intercompany, FX-normalised — every term defined in the language the finance team actually uses.

Layer three — the workflow itself. Azure OpenAI running inside the client's own tenant, grounded on the model and the glossary. After each monthly data refresh, it drafts first-pass variance commentary per business unit — what moved, by how much, against what baseline, with source rows cited on every claim. The output isn't pushed anywhere automatically; analysts open it, review it, edit where needed, and sign off.
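The drafting step can be sketched as a function over variance rows from the semantic model, with the model call stubbed out (in the real workflow it is an Azure OpenAI completion inside the client tenant; the row fields and function names below are assumptions for illustration). The two properties that matter are visible in the sketch: every claim carries its source row, and the result is always a draft awaiting human sign-off.

```python
# Minimal sketch of the drafting step, under assumptions: variance rows come
# from the rebuilt semantic model, and `draft_line` stands in for the LLM call.

def draft_commentary(unit, variances, draft_line):
    """First-pass commentary: one claim per variance, source row cited on each."""
    lines = []
    for v in variances:
        text = draft_line(unit, v)                       # model drafts the sentence
        lines.append(f"{text} [source: {v['row_id']}]")  # every claim cites its row
    return {
        "unit": unit,
        "status": "draft",  # never auto-published: an analyst must review and sign off
        "lines": lines,
    }

# Deterministic stub in place of the real model call, for illustration.
def stub_draft(unit, v):
    return (f"{v['measure']} moved {v['delta_pct']:+.1f}% "
            f"vs {v['baseline']} in {unit}")

out = draft_commentary(
    "QLD linehaul",
    [{"row_id": "fct_pl:1042", "measure": "Margin",
      "delta_pct": -3.2, "baseline": "budget"}],
    stub_draft,
)
```

Citing the source row on every sentence is what turns "the AI said so" into something an analyst can verify in minutes rather than re-derive from scratch.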

The 200-question eval set

The piece that made leadership willing to trust it. Before go-live, we co-authored a 200-question evaluation set with the finance team: questions the workflow must answer correctly, edge cases it must handle, and traps it must not fall into ("what's our revenue including intercompany?" — correct answer: refuse and ask for clarification).

We ran the full eval set before every release. The agreed launch threshold was 95% correct on the full set, 100% correct on the twenty-three "board-pack-critical" questions. We hit it in week five. The client now reruns the eval set monthly, and we review any failures under the Decisions Desk retainer.
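The release gate itself is simple arithmetic, which is part of why leadership trusted it. A hedged sketch of the two-threshold check (question IDs and result shapes are illustrative):

```python
# Sketch of the agreed launch gate: >=95% on the full 200-question set,
# 100% on the board-pack-critical subset. Shapes are illustrative.

def release_gate(results, critical_ids, full_threshold=0.95):
    """results: {question_id: passed_bool}. Returns (ok, full_rate, critical_rate)."""
    full_rate = sum(results.values()) / len(results)
    critical = [results[q] for q in critical_ids]
    critical_rate = sum(critical) / len(critical)
    # A single critical failure blocks release, regardless of the overall rate.
    ok = full_rate >= full_threshold and critical_rate == 1.0
    return ok, full_rate, critical_rate

# Tiny illustrative run: one miss outside the critical set is tolerated.
results = {f"q{i}": True for i in range(200)}
results["q150"] = False  # non-critical miss -> 199/200 = 99.5%, still passes
ok, full, crit = release_gate(results, critical_ids=[f"q{i}" for i in range(23)])
assert ok
```

The asymmetry is the design choice: a near-miss on a routine question is a tuning item; a miss on a board-pack-critical question is a launch blocker.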

What changed

Month one after launch, analysts used it cautiously — generating drafts, then rewriting most of them. Month two, they started trusting it for the straightforward variance sections and focusing their time on the harder narrative work. By month three, first-pass commentary was generated in under an hour, edited by analysts in another hour or two, reviewed and approved by the Head of Finance the same day.

Fourteen days a month of analyst time came back. Not saved in the abstract — redirected to actual commercial analysis, margin investigation, pricing work. The kind of thing finance teams always say they'd do if they had the time.

The workflow adoption rate was above 80% of the finance team by month two. The Head of Finance reported to the CFO that the board pack was both faster to produce and higher quality — because the humans were spending their energy on the judgment calls, not the description.

engagement.log
client_size: ~$180M AUD revenue
industry: logistics / 3PL
existing_stack: Power BI Premium, Microsoft Fabric
buying_team: CFO · Head of Finance · Head of Data
offer: Ask Your Data Workflow
duration: 6 weeks
dd_team: 1 × Decision Engineer (lead) · 1 × Decision Engineer (red-team)
client_team: Head of Finance + senior analyst, part-time across the engagement
stack_delivered: Azure OpenAI (inside client tenant) + Fabric semantic model + 47-term business glossary
eval_set: 200 questions, 95% threshold, 100% on board-pack-critical subset
handover: full IP transfer · documentation · monthly eval rerun process
outcome_analyst_time: ~120 analyst-days returned per year
outcome_adoption: >80% finance team, month 2
outcome_cycle_time: first-pass commentary: 10 days → <1 day
trust_model: AI-drafted, human-owned, source-cited
ongoing: Decisions Desk retainer for monthly eval + tuning

Show us the decision
that still takes too long.

A free 45-minute call. Bring a workflow, a reporting pain, or a trust issue — we'll tell you quickly whether this is a real fit.

Start with your own decision