320 Stores. One Number Each. Finally One They Could Trust.

How a European Specialty Retailer Replaced Estate-Wide Spreadsheet Guesswork with Store-Level Revenue Forecasts, Confidence Bands, and a Planning Cadence That Actually Held.

9.8% Chain MAPE — Down from 24.3% on Existing Spreadsheet Model

320 Stores Forecast Individually, Weekly, 12 Weeks Forward

€6.2M Estimated Annual Benefit from Improved Staffing, Stock Allocation & Planning Efficiency

CLIENT CONTEXT

Challenge at a Glance

Industry	Specialty Retail — Health, Wellbeing & Personal Care
Estate	320 stores across Western Europe — high street, retail park, and shopping centre formats
Store Range	Weekly revenues from €20k (small market-town) to €340k (flagship city-centre)
Planning Cycle	Weekly forecasts over a 12-week horizon, refreshed each Monday for operations and buying teams
Existing Tool	Excel-based spreadsheet model: a weighted average of the same week in the prior two years, plus manual adjustments for seasonality and known events
Goal	A reliable store-level revenue forecast with quantified uncertainty, usable by store managers, the buying team, and the CFO — from the same model

For context: industry benchmarks commonly place traditional spreadsheet or simple trend approaches to retail forecasting at MAPE values in the 20–35% range, and ML models targeting store-level weekly revenue have been reported to cut MAPE by 50–60% against those baselines. At 24.3% MAPE across 320 stores, the client’s forecast was too imprecise to reliably inform staffing rotas, stock replenishment, or short-term capital allocation.

THE PROBLEM

The Forecast Existed. Nobody Planned From It.

Every Monday, a central planning analyst updated a master Excel file. Each store’s forecast for the coming 12 weeks was calculated as a weighted average of the same week in the prior two years, adjusted manually for known events — bank holidays, local competitor openings, store refits. The file was then emailed to regional managers, who routinely ignored it.
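
The spreadsheet logic described above can be sketched in a few lines. This is a minimal illustration, not the client's actual workbook: the function name `baseline_forecast`, the 60/40 weighting, and the multiplicative `event_adjustment` are all assumptions, since the source states only that a weighted average of the same week in the prior two years was adjusted manually.

```python
def baseline_forecast(last_year, two_years_ago, weight_recent=0.6, event_adjustment=1.0):
    """Weighted average of the same week in the two prior years, times a manual multiplier.

    weight_recent and event_adjustment are hypothetical parameters for illustration.
    """
    if not 0.0 <= weight_recent <= 1.0:
        raise ValueError("weight_recent must be between 0 and 1")
    blended = weight_recent * last_year + (1.0 - weight_recent) * two_years_ago
    return blended * event_adjustment


# Illustrative store-week: EUR 100k and EUR 90k in the same week of the prior
# two years, with a planner-entered +5% adjustment for a bank holiday.
forecast = baseline_forecast(100_000, 90_000, event_adjustment=1.05)
```

Nothing in this calculation can react to a driver that was absent in both prior years, which is the weakness the rest of the case study addresses.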

The reason was not hostility to planning. It was that the forecast had been wrong too many times at too high a cost. A flagship city-centre store had been understaffed on three consecutive peak Saturdays because the forecast missed a local event driving footfall. A regional distribution centre had over-allocated seasonal product stock to fifteen stores before a forecasted peak that never arrived, triggering a markdown event. The forecast was producing decisions — just consistently the wrong ones.

No store-level confidence band existed. Every forecast was a single point estimate. Planners had no way to distinguish a forecast where the model was highly certain from one where actual revenue could swing 30% in either direction. All 320 stores were treated as equally predictable.
Low-volume stores were structurally harder to forecast and carried the widest percentage errors, yet received the same planning assumptions as high-volume flagships. The cash impact also differed by an order of magnitude: a 25% error on a €22k store is a miss of about €5.5k, while the same error rate on a €275k store is a miss of about €69k that lands on the P&L.
The model had no external variables. Seasonality was captured only crudely, through prior-year averaging. Local factors such as weather, nearby events, competitor proximity, and shopping centre footfall were absent entirely. The forecast was purely backward-looking: it could reflect what had happened before, but not what was about to happen differently.
There was no error-tracking discipline. Forecast versus actual was not reviewed systematically. Nobody knew that the chain MAPE was 24.3% because it had never been measured. The forecast was updated weekly and then forgotten.
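
For readers unfamiliar with the metric, MAPE (mean absolute percentage error) is the average of |actual − forecast| / |actual| across the weeks being scored. The sketch below shows one common way to compute it. The weekly figures are invented for illustration, and skipping zero-revenue weeks is an assumption made here, not something the source specifies.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent, skipping zero-actual weeks."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("need at least one non-zero actual")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Invented weekly revenue (EUR) for one small store: actual vs. forecast.
actuals = [22_000, 20_000, 25_000, 24_000]
forecasts = [16_500, 22_000, 25_000, 21_600]
store_mape = mape(actuals, forecasts)  # weekly errors of 25%, 10%, 0%, 10%
```

A chain-level figure such as the 24.3% quoted here would be an average of this kind over all stores and weeks, though the exact weighting used by the client is not stated.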

THE APPROACH

One Model Per Store. External Signals. Quantified Uncertainty at Every Point.

Graphite Note built a store-level revenue forecasting system across all 320 locations. Each store received its own model, trained on its own historical revenue series but informed by a shared set of external signals. The key design decision was to produce not just a point forecast but a full predictive distribution — from which 80% and 90% confidence bands are derived automatically every week.

This distinction matters operationally. A confidence band is not a range of guesses. It is a statistically grounded statement about where actual revenue is likely to fall given everything the model knows. A wide band on a low-volume store in an untested location tells a planner to hold additional buffer stock. A tight band on a flagship with stable trading history tells a buyer to commit inventory with confidence. Both decisions come from the same number, read correctly.
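
One way to picture how bands fall out of a predictive distribution: if the model can produce many plausible revenue outcomes for a given store-week, the 80% band is simply the range between the 10th and 90th percentiles of those outcomes. The sketch below assumes the distribution is available as samples and uses simulated normal draws as a stand-in; the source does not say how Graphite Note represents the distribution, so treat the sampling and the helper names as hypothetical.

```python
import random


def quantile(sorted_values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def central_band(samples, level):
    """Central interval containing `level` of the predictive samples."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    return quantile(ordered, tail), quantile(ordered, 1.0 - tail)


# Stand-in predictive draws for one store-week (EUR); a real model would supply these.
rng = random.Random(42)
draws = [rng.gauss(100_000, 8_000) for _ in range(5_000)]
band_80 = central_band(draws, 0.80)
band_90 = central_band(draws, 0.90)
```

Because both bands come from the same set of draws, the 90% band always contains the 80% band.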

Data Inputs Across 320 Stores

Internal transactional data: 3 years of weekly store-level revenue, transactions, average basket size, and promotional event flags.
Store attribute data: Format type (high street, retail park, shopping centre), gross floor area, years trading, distance to nearest competitor, catchment population.
Calendar and event features: Bank holidays, school half-term weeks, local authority event calendars, Black Friday and gifting season indicators.
External economic signals: European consumer confidence indices, regional retail footfall data, and household spending indicators by market.
Weather data: Weekly average temperature and precipitation by region — material for specialty retail where seasonal product transitions drive significant volume swings.

Figure 1. Store-level revenue forecast with 80% and 90% confidence bands: illustrative 12-week view for a single store. The actual revenue line stays within the 80% band in 9 of 12 weeks (75%), broadly consistent with the nominal 80% coverage over so short a window. The widening band in weeks 6–7 reflects a promotional event with historically high variance.
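
The calibration claim in the caption can be checked with a simple coverage count: the share of weeks in which actual revenue landed inside the band. The sketch below uses invented numbers arranged to reproduce the 9-of-12 pattern. It is an illustration of the check, not the client's data, and `empirical_coverage` is a hypothetical helper name.

```python
def empirical_coverage(actuals, lower_bounds, upper_bounds):
    """Fraction of weeks in which actual revenue fell inside the band."""
    if not actuals or not (len(actuals) == len(lower_bounds) == len(upper_bounds)):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(lo <= a <= hi for a, lo, hi in zip(actuals, lower_bounds, upper_bounds))
    return hits / len(actuals)


# Invented 12-week example (EUR thousands): 9 weeks inside the band, 3 above it.
lower = [90] * 12
upper = [110] * 12
actual = [100] * 9 + [120] * 3
coverage = empirical_coverage(actual, lower, upper)
```

Twelve weeks for one store is a small sample, so calibration is better judged by running the same count across many store-weeks and comparing it with the nominal 80%.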

RESULTS

MAPE Cut by 60%. Across Every Store Tier.

Across all 320 stores, the Graphite Note model reduced chain-average MAPE from 24.3% to 9.8%, a relative reduction in forecast error of roughly 60%. The improvement held across all store tiers, including the historically difficult low-volume estate, where the spreadsheet model had produced errors exceeding 30%.

Figure 2. MAPE comparison by store tier — baseline spreadsheet model vs. Graphite Note. The largest absolute improvement is in low-volume stores (31.4% → 13.8%), which are structurally harder to forecast and had been systematically underserved by the prior approach.
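
The headline percentages follow directly from the MAPE figures quoted above. A short check, using only numbers stated in this case study (the function name is illustrative):

```python
def relative_reduction(before, after):
    """Share of the original error that was removed."""
    return (before - after) / before


chain = relative_reduction(24.3, 9.8)        # about 0.597, i.e. roughly 60%
low_volume = relative_reduction(31.4, 13.8)  # about 0.561
low_volume_points = 31.4 - 13.8              # about 17.6 percentage points
```

The chain-wide reduction is the larger one in relative terms, while the low-volume tier sees the larger drop in percentage points (about 17.6 against 14.5 for the chain).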

The confidence band system enabled a new planning discipline: stores with band widths above a threshold were flagged automatically each Monday for additional review, while stores with tight bands were cleared for automated replenishment decisions. This created a triage system that focused human planning effort where uncertainty was highest.

Figure 3. Store forecast risk profile across the 320-store estate. High-volume stores (green) cluster below the ±10% target band width. Low-volume stores (red) carry structurally wider uncertainty — not a model failure but a business reality the confidence band now makes visible and actionable.
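
The Monday triage rule can be expressed as a threshold on relative band width. The sketch below is a guess at the shape of that rule, not the deployed logic: the store IDs and figures are invented, and the 10% threshold is borrowed from the ±10% target shown in Figure 3 rather than from a stated flagging threshold.

```python
def relative_band_width(lower, upper, point_forecast):
    """Half-width of the band as a fraction of the point forecast (0.10 means +/-10%)."""
    return (upper - lower) / (2.0 * point_forecast)


def triage(store_forecasts, threshold=0.10):
    """Split stores into a manual-review list and an automated-replenishment list."""
    review, automated = [], []
    for store_id, (lower, point, upper) in sorted(store_forecasts.items()):
        width = relative_band_width(lower, upper, point)
        (review if width > threshold else automated).append(store_id)
    return review, automated


# Invented stores with (lower, point forecast, upper) for the 80% band, in EUR.
monday_forecasts = {
    "flagship-001": (258_000, 275_000, 292_000),  # about +/-6%
    "market-town-214": (17_000, 22_000, 27_000),  # about +/-23%
}
review, automated = triage(monday_forecasts)
```

Expressing width relative to the point forecast keeps a €22k store and a €275k store comparable on the same threshold.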

What the Business Could Now Do Differently

Decision Area	Without Store Forecast	With Graphite Note Forecast
Weekly staffing rotas	Manager intuition + prior-year pattern. Frequent over- and understaffing on seasonal peaks.	Model forecast + confidence band feeds directly into rota planning tool. Buffer headcount triggered automatically when band is wide.
Stock replenishment	Central buying team allocated based on chain-average trend. High-volume stores frequently stocked out; low-volume stores accumulated excess.	Store-level forecast drives store-level replenishment targets. Allocation calibrated to each store’s predicted demand, not a chain average.
Promotional timing	Promotions planned on fixed calendar dates. Uplift expectations were guesswork; post-event review was not systematic.	Forecast provides pre-promotion baseline. Post-event variance against forecast quantifies true promotional uplift, informing future spend.
CFO / board reporting	Weekly revenue roll-up compared to budget. Deviations explained post hoc with qualitative commentary.	12-week forward revenue forecast with confidence bands gives CFO a statistically grounded view of likely revenue range, not just one number.
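
The promotional-timing row rests on a simple calculation: uplift is actual revenue minus the pre-promotion baseline forecast. A minimal sketch follows, with invented figures and a hypothetical helper name. Comparing the actual against the upper band edge is an added assumption, used here to flag whether the uplift exceeds normal week-to-week variation.

```python
def promotional_uplift(actual, baseline, band_upper):
    """Uplift of actual revenue over the pre-promotion baseline forecast."""
    uplift = actual - baseline
    return {
        "uplift_eur": uplift,
        "uplift_pct": uplift / baseline,
        # True when the result sits above the band, i.e. beyond expected noise.
        "outside_band": actual > band_upper,
    }


# Invented promotion week: EUR 100k baseline, EUR 110k upper band edge, EUR 118k actual.
result = promotional_uplift(actual=118_000, baseline=100_000, band_upper=110_000)
```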

Financial Impact

Quantifying the financial benefit of forecast accuracy improvement requires translating percentage-point MAPE gains into operational cost reductions. Across the 320-store estate, the primary value pools are staffing efficiency, stock holding cost reduction, and avoided markdown on misallocated inventory.

Based on the client’s average weekly payroll cost per store, a 1% improvement in staffing allocation efficiency across 320 locations represents approximately €1.4M annually. Improved stock allocation, which reduces both overstock markdown and lost sales from stockouts, accounts for a further estimated €3.1M. Reduced planning time (fewer manual overrides, fewer post-hoc correction emails, fewer emergency replenishment orders) contributes a conservatively estimated €1.7M in reduced operational overhead. Total estimated annual benefit: €6.2M, against an engagement cost recovered within the first quarter of deployment.
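
The three value pools above add up as follows. The figures are those quoted in this section; the dictionary keys are labels chosen for illustration.

```python
# Estimated annual benefit by value pool, in EUR (figures as quoted in the text).
benefit_pools = {
    "staffing_efficiency": 1_400_000,
    "stock_allocation": 3_100_000,
    "planning_overhead": 1_700_000,
}

total_benefit = sum(benefit_pools.values())
shares = {name: value / total_benefit for name, value in benefit_pools.items()}
```

Stock allocation accounts for half of the total, with staffing and planning overhead making up the remainder.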

CLIENT VOICE

“We had a forecast before. We just didn’t trust it, so we didn’t use it. What changed wasn’t just the accuracy — it was that for the first time, the forecast told us how confident to be in it. That’s what made it plannable. Store managers started making decisions from it within the first month.” — Director of Commercial Planning, European Specialty Retailer

WHY GRAPHITE NOTE

Building a store-level forecasting system for a 320-store estate is not a data engineering problem. It is a modelling problem with a change management layer. Graphite Note designs models that produce outputs in the format planners actually need — a point forecast, a confidence band, and a clear signal about which stores require human attention this week. The rest runs automatically.

Store-Level ML Forecasting	Individual models per store, trained on store-specific history and informed by shared external signals. No chain-average smoothing that masks local dynamics.
Probabilistic Output	Every forecast includes 80% and 90% confidence bands, updated weekly. Planners know not just what revenue is expected but how much it could vary.
External Signal Integration	Consumer confidence, footfall indices, weather, and event calendars incorporated as model features — capturing drivers the prior-year average could never see.
Triage & Exception Flagging	Stores with wide confidence bands or high recent forecast error are flagged automatically each week, directing planning resource to where it is most needed.
Forecast vs. Actual Tracking	Weekly MAPE by store, region, and format. Systematic error monitoring prevents forecast quality from drifting undetected over time.
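
Forecast-versus-actual tracking of the kind listed in the last row amounts to grouping percentage errors by whichever dimension is being reviewed. The sketch below assumes weekly records held as plain dictionaries. The field names, store IDs, and figures are hypothetical and do not describe Graphite Note's data model.

```python
from collections import defaultdict


def mape_by_group(records, key):
    """Return {group: MAPE in percent}, grouping records by the given field."""
    errors = defaultdict(list)
    for row in records:
        if row["actual"] == 0:
            continue  # percentage error is undefined for zero-revenue weeks
        errors[row[key]].append(abs(row["actual"] - row["forecast"]) / abs(row["actual"]))
    return {group: 100.0 * sum(vals) / len(vals) for group, vals in errors.items()}


# Invented weekly records (EUR).
records = [
    {"store": "S001", "format": "high street", "actual": 100_000, "forecast": 90_000},
    {"store": "S001", "format": "high street", "actual": 120_000, "forecast": 126_000},
    {"store": "S002", "format": "retail park", "actual": 40_000, "forecast": 50_000},
]
by_store = mape_by_group(records, "store")
by_format = mape_by_group(records, "format")
```

Running the same function with a region field would give the regional view. A rising value for any group week over week is the drift signal the tracking is meant to surface.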
