Illustrative scenario — not a specific client.

The Personalisation Investment That Ran on Noise

When the demo wore off

This retailer did what a lot of mid-sized chains do: they bought a packaged recommendation engine and upgraded their loyalty programme so they could push “personalised” offers and show shoppers “customers like you also bought” on tablets next to the fitting room. The vendor’s demo was slick: clean sample data, crisp segments, response curves that looked too good to be true and, in hindsight, were. Go-live hit the date. Marketing ran the programmes. A year later, email open rates and in-store response looked almost identical to the year before any of it was switched on. On the shop floor, staff had stopped defending the on-screen suggestions when customers asked why a random product appeared; they would shrug and say “the system.”

Head office assumed the models needed retraining or more cloud spend. The brief for the diagnostic was narrower: before tuning algorithms, check whether the data feeding them resembled the tidy world of the vendor demo.

What the files actually showed

Loyalty was full of ghosts

The loyalty file was the first place to look, and it was messier than anyone wanted to admit in steering meetings. The same shopper could exist as a card number from five years ago, a newer card issued after they lost the first one, a web account under a different email, and a dusty CRM row from an old campaign. A rough dedup exercise on name, phone, and address suggested that around thirty-eight percent of “members” were probably duplicates, stale shells, or both. The recommendation engine did not know that. It treated every ID as a separate person, which meant spending behaviour was split across ghosts. A loyal customer could look like three light buyers.
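For readers who want the mechanics, below is a minimal sketch of the kind of blunt matching that surfaces duplicates like these. The file name, column names, and normalisation rules are illustrative assumptions, not the retailer’s actual schema; a production match run would add fuzzy comparison and a manual review queue rather than relying on exact keys.

```python
import csv
import re
from collections import defaultdict

def match_key(member):
    """Collapse name, phone, and address into a blunt exact-match key.

    Illustrative normalisation only: lowercase, strip punctuation,
    keep the last ten digits of the phone number.
    """
    name = re.sub(r"[^a-z]", "", member["name"].lower())
    phone = re.sub(r"\D", "", member["phone"])[-10:]
    address = re.sub(r"[^a-z0-9]", "", member["address"].lower())
    return (name, phone, address)

def duplicate_share(members):
    """Share of member rows whose key collides with another row's."""
    groups = defaultdict(list)
    for m in members:
        groups[match_key(m)].append(m["member_id"])
    extras = sum(len(ids) - 1 for ids in groups.values())
    return extras / len(members) if members else 0.0

# Hypothetical extract; the file and column names are assumptions.
with open("loyalty_members.csv", newline="") as f:
    members = list(csv.DictReader(f))
print(f"Estimated duplicate share: {duplicate_share(members):.0%}")
```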

The till was mostly anonymous

In-store behaviour was the bigger hole. Scanning a loyalty card at the till was optional. When Saturday queues stacked up, staff skipped the ask. A few stores had never turned on the prompt to swipe before payment. When the diagnostic measured a representative month, under half of in-store sales volume tied back to a loyalty ID — even though membership numbers on paper looked healthy. Online orders were almost always identified. So the people who shopped both channels were systematically under-represented in the behaviour the model learned from: half their pattern lived in anonymous till lines.
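The measurement itself is simple once one decision is made: weight by sales value rather than line count, so a few large anonymous baskets cannot hide behind many small identified ones. A minimal sketch, with assumed field names:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TillLine:
    store_id: str
    net_value: float
    loyalty_id: Optional[str]  # None when nobody scanned a card

def identified_share(lines):
    """Value-weighted share of till lines tied to a loyalty ID."""
    total = sum(l.net_value for l in lines)
    identified = sum(l.net_value for l in lines if l.loyalty_id)
    return identified / total if total else 0.0

# Toy data standing in for a month of till lines.
month = [
    TillLine("S01", 42.50, "L-1001"),
    TillLine("S01", 18.00, None),
    TillLine("S02", 97.20, None),
]
print(f"Identified sales share: {identified_share(month):.0%}")
```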

Product codes did not match the engine

Product data did not rescue the situation. The engine ingested a nightly file of proper SKUs from merchandising. The till still rang sales on local PLUs, bundles, and promotional packs that did not map cleanly to those SKUs. A meaningful share of lines sat on parent or generic codes. The model was not generating random recommendations; it was doing its job on blunt product attributes, so “similar items” often made sense to the database but not to the person standing in the aisle.
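The fix is equally unglamorous in code. The sketch below assumes a hand-maintained table from till codes to leaf SKUs and measures how much of the till the engine can actually resolve; the codes and the table shape are illustrative, not the retailer’s:

```python
# Hypothetical mapping from till codes (local PLUs, bundles, promo
# packs) to the leaf SKUs the recommendation engine actually knows.
CODE_TO_SKUS = {
    "PLU-0042": ["SKU-10817"],               # local PLU -> one real SKU
    "BUNDLE-07": ["SKU-10901", "SKU-10944"], # promo pack -> components
}

def resolve(till_code):
    """Return the leaf SKUs for a till code, or [] when unmapped."""
    return CODE_TO_SKUS.get(till_code, [])

def unresolved_share(till_codes):
    """Share of till lines the engine sees only as blunt codes."""
    misses = sum(1 for code in till_codes if not resolve(code))
    return misses / len(till_codes) if till_codes else 0.0

lines = ["PLU-0042", "BUNDLE-07", "PLU-9999"]  # PLU-9999 has no mapping
print(f"Unresolved till lines: {unresolved_share(lines):.0%}")
```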

The uncomfortable part for leadership was that the software was probably working as sold. Nobody had drawn a line in the sand on data quality before signing. There was no single owner for “who is this customer across channels,” no metric anyone tracked for identified sales at the till, and no gate that said: we do not turn personalisation on until duplicates and linkage rates hit agreed numbers.
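Such a gate does not need to be sophisticated to be useful. A sketch of what one can look like, with threshold values that are pure placeholders rather than recommendations:

```python
# Placeholder thresholds; the real numbers belong to the business and
# should be written down before signing, not after go-live.
GATES = {
    "max_duplicate_share": 0.05,   # loyalty file duplicates
    "min_identified_share": 0.80,  # in-store sales tied to a loyalty ID
    "min_sku_resolution": 0.95,    # till lines resolving to leaf SKUs
}

def personalisation_allowed(metrics):
    """Return True only when every input metric clears its gate."""
    return (
        metrics["duplicate_share"] <= GATES["max_duplicate_share"]
        and metrics["identified_share"] >= GATES["min_identified_share"]
        and metrics["sku_resolution"] >= GATES["min_sku_resolution"]
    )

# Toy inputs, not the retailer's measurements.
print(personalisation_allowed(
    {"duplicate_share": 0.38, "identified_share": 0.45, "sku_resolution": 0.70}
))  # False
```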

What happened next

What happened next was deliberately boring. A named owner was given the golden record: merge rules, which fields win when two profiles collide, and a schedule for cleaning the file that did not depend on a heroic analyst. Till procedures changed: loyalty before payment, simplified enough that staff would actually use it, and a weekly identified-sales percentage by store so weak stores could not hide. Merchandising and IT maintained a mapping table for bundles and promos so lines could resolve to real leaf SKUs wherever possible. New personalisation campaigns paused until the business agreed the inputs had crossed thresholds they’d written down. The roadmap said plainly: several months to stabilise the foundation, then revisit what the engine could do.
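To make “which fields win” concrete, here is a minimal sketch of survivorship rules for two colliding profiles. The field names and the rules themselves are assumptions for illustration; in practice they were argued over and written down field by field:

```python
from datetime import date

def merge_profiles(a, b):
    """Merge two colliding loyalty profiles into one golden record.

    Assumed survivorship rules:
      - the record updated less recently keeps its member_id as anchor
      - for email and phone, the more recently updated record wins
        when it has a value, otherwise the older value survives
      - last_purchase is simply the latest of the two dates
    """
    newer, older = (a, b) if a["updated"] >= b["updated"] else (b, a)
    return {
        "member_id": older["member_id"],
        "email": newer["email"] or older["email"],
        "phone": newer["phone"] or older["phone"],
        "last_purchase": max(a["last_purchase"], b["last_purchase"]),
        "updated": newer["updated"],
    }

# A plastic card profile and a web account that turn out to be one person.
card = {"member_id": "L-1001", "email": "", "phone": "07700 900123",
        "last_purchase": date(2023, 3, 2), "updated": date(2023, 3, 2)}
web = {"member_id": "L-2417", "email": "jo@example.com", "phone": "",
       "last_purchase": date(2024, 1, 9), "updated": date(2024, 1, 9)}
print(merge_profiles(card, web))
```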

That is slower than buying another module. It is the only approach that makes the first module honest.