Banking and financial services data strategy — credit risk and IFRS 9 data foundations, customer and KYC data under POPIA and FICA, fraud and financial crime analytics inputs, and regulatory reporting lineage. Independent advisory for banks and lenders in South Africa and internationally.
A bank’s product is trust — and trust runs on data defensibility: can leadership, the regulator, and the auditor reconstruct how a number was produced, from which systems, on what definitions, under whose accountability?
In practice, banks and lenders accumulate technical debt in data faster than in software. Cores, channels, risk engines, and finance each optimise for their own workflows. The result is not “no data” but contested data: multiple versions of exposure, identity, and performance; extracts that no longer match the production book; and models that inherit those fractures silently.
Independent data strategy advisory for banking starts where those fractures show up as risk: credit decisions, customer onboarding friction, fraud operations overload, and regulatory or board reporting that cannot be traced cleanly to systems of record.
The pages below are the references for each domain. Illustrative diagnostic walkthroughs (what an engagement surfaces before anyone buys a system or retrains a model) sit under Banking examples — each scenario links back to the relevant domain page here.
Loan origination, servicing, collections, and finance often sit on different platforms, or on the same platform with inconsistent historical conventions. Staging under IFRS 9, loss given default (LGD) inputs, collateral valuations, and write-off treatments each depend on data that is consistent over time and consistent in definition. When exposures are restated quietly between systems, the expected credit loss (ECL) engine is the last place the problem becomes visible, not the first. The governance moves that matter are laid out in credit risk data governance.
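As an illustration of what "restated quietly between systems" looks like when it is tested, the sketch below compares per-loan exposure from two extracts taken for the same reporting date. The extract shape (a dict of loan ID to exposure), the function name, and the tolerance are assumptions for the example, not a prescribed control.

```python
from decimal import Decimal

def reconcile_exposures(servicing, finance, tolerance=Decimal("0.01")):
    """Compare per-loan exposure between two extracts for one reporting date.

    Both inputs are dicts of loan_id -> Decimal exposure (hypothetical shape).
    Returns (loan_id, servicing_value, finance_value) for every break.
    """
    breaks = []
    for loan_id in sorted(set(servicing) | set(finance)):
        s = servicing.get(loan_id)
        f = finance.get(loan_id)
        # A loan missing on either side is a break, as is a difference above tolerance.
        if s is None or f is None or abs(s - f) > tolerance:
            breaks.append((loan_id, s, f))
    return breaks

# Illustrative data only.
servicing = {"L001": Decimal("1000.00"), "L002": Decimal("250.50"), "L003": Decimal("75.00")}
finance = {"L001": Decimal("1000.00"), "L002": Decimal("245.50"), "L004": Decimal("10.00")}
breaks = reconcile_exposures(servicing, finance)
for loan_id, s, f in breaks:
    print(loan_id, s, f)
```

Running a check like this before the ECL calculation, rather than after, is what moves the break from the last place it is visible to the first.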
Digital onboarding, branch capture, agent networks, and third-party introducers each create customer artefacts. Without an authoritative party identity model and retention rules aligned to POPIA purpose limitation, the organisation holds more copies of personal information — not a clearer record. FICA accountability requires evidence of process, not only a populated field in a CRM. For how identity and evidence are usually governed end to end, see banking customer data management.
Transaction monitoring and anomaly detection depend on timely, complete transactions with correct merchant category codes (MCCs), counterparties, and device or channel attributes. When operational teams “fix” data in spreadsheets or back-office tools without feeding the corrections back through golden copy rules, the model sees ghosts: either too many alerts or too few. Fraud and financial crime data ties those operational facts to sustainable investigation workload and model confidence.
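A simple completeness profile makes the attribute dependency concrete. The sketch below reports, per attribute, the share of transactions where the value is absent or blank. The field names (`mcc`, `counterparty`, `channel`, `device_id`) are hypothetical, and a real profile would also test validity and timeliness, not only presence.

```python
REQUIRED_ATTRIBUTES = ("mcc", "counterparty", "channel", "device_id")  # hypothetical field names

def missing_rate(transactions, attributes=REQUIRED_ATTRIBUTES):
    """Return, per attribute, the share of records where it is absent or blank."""
    total = len(transactions)
    rates = {}
    for attr in attributes:
        missing = sum(1 for t in transactions if not str(t.get(attr) or "").strip())
        rates[attr] = missing / total if total else 0.0
    return rates

# Illustrative records only.
sample = [
    {"mcc": "5411", "counterparty": "ACME", "channel": "pos", "device_id": "d1"},
    {"mcc": "", "counterparty": "ACME", "channel": "app", "device_id": None},
    {"mcc": "5812", "counterparty": " ", "channel": "app"},
    {"mcc": "4111", "counterparty": "Metro", "channel": "web", "device_id": "d9"},
]
rates = missing_rate(sample)
print(rates)
```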
Prudential and management reporting often evolved as separate pipelines from overlapping sources. The question “which definition of NPL is in this return?” can consume weeks because nobody owns the mapping from report line to system field — and the mapping drifted when a product system was upgraded. Regulatory reporting data lineage is the place to settle definitions, ownership, and reconciliation discipline before the next reporting cycle hardens the wrong answer.
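One way to picture an owned mapping from report line to system field is as a small registry that can be checked against the live source schema after every system change. Everything named below (the `LineMapping` structure, the report line labels, system and field names, owner titles) is hypothetical and only illustrates the idea.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LineMapping:
    report_line: str    # line in the return (hypothetical labels below)
    source_system: str
    source_field: str
    definition: str
    owner: str

def find_drift(mappings, live_schema):
    """Flag mappings whose source field is gone from the live schema or that have no owner.

    live_schema: dict of system name -> set of field names (assumed input).
    """
    issues = []
    for m in mappings:
        if m.source_field not in live_schema.get(m.source_system, set()):
            issues.append((m.report_line, "field missing in source"))
        if not m.owner.strip():
            issues.append((m.report_line, "no owner"))
    return issues

mappings = [
    LineMapping("RETURN_X.line_10", "loans_core", "days_past_due", "NPL: 90+ days past due", "Head of Credit Risk"),
    LineMapping("RETURN_X.line_11", "loans_core", "arrears_flag", "Arrears indicator", "Finance Reporting"),
    LineMapping("RETURN_X.line_12", "loans_core", "npl_flag", "NPL flag from core", ""),
]
# Schema after a hypothetical upgrade that dropped arrears_flag.
live_schema = {"loans_core": {"days_past_due", "npl_flag"}}
issues = find_drift(mappings, live_schema)
print(issues)
```

The point of the design is that a product system upgrade produces a visible exception against a named owner, instead of a silently drifted return.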
When committees stop trusting staging migrations or coverage ratios, the debate becomes political rather than analytical. Often the root cause is data lineage and ownership, not the mathematics of the model — the same gap credit risk data governance is written to close.
Duplicate KYC requests, unexplained delays, and conflicting contact data are symptoms of fragmented customer data — they also carry reputational and conduct exposure. That is rarely fixed with a new onboarding portal alone; it needs clarity on identity and evidence, as in banking customer data management.
Alert handling cost scales with noise. Data governance at the transaction reference layer is a direct lever on unit cost per case — the topic of fraud and financial crime data.
Findings rarely say “spreadsheet wrong.” They say controls absent, definitions undocumented, or reconciliations not performed. All three are data governance gaps that regulatory reporting data lineage addresses upstream of the next audit or regulatory dialogue.
Who owns the lending book snapshot used for accounting and regulatory capital — and how is it locked for each reporting date?
What is the authoritative customer record for identity, consent, and FICA evidence — and how is it reconciled across channels?
Which transaction attributes feed fraud and AML models — and who is accountable when they are wrong?
Where is the mapping from each material regulatory line to system-of-record fields — and who signs off when it changes?
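On the first question above, one possible way to evidence that a lending book snapshot is locked for a reporting date is to record a fingerprint of its contents at sign-off and recompute it whenever the snapshot is reused. The sketch below is a minimal illustration using a SHA-256 digest; the row shape and function name are assumptions, and in practice the fingerprint would be stored with the owner's sign-off record.

```python
import hashlib
import json

def snapshot_fingerprint(rows, reporting_date):
    """Return a SHA-256 hex digest over a canonical serialisation of the snapshot."""
    canonical = json.dumps(
        {"reporting_date": reporting_date, "rows": sorted(rows, key=lambda r: r["loan_id"])},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Illustrative rows only; exposures kept as strings to avoid float formatting differences.
book = [{"loan_id": "L002", "exposure": "250.50"}, {"loan_id": "L001", "exposure": "1000.00"}]
locked = snapshot_fingerprint(book, "2024-12-31")

# Row order does not change the fingerprint, but any restated value does.
same_book_reordered = list(reversed(book))
restated = [dict(book[0], exposure="245.50"), book[1]]
print(snapshot_fingerprint(same_book_reordered, "2024-12-31") == locked)
print(snapshot_fingerprint(restated, "2024-12-31") == locked)
```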
Credit decisioning, collections prioritisation, fraud scoring, and conversational servicing all depend on governed inputs. See AI Readiness for the executive frame: model performance is a downstream symptom of data ownership and lineage. In banking, that starts with the four domains above — credit data, customer data, fraud-relevant transaction data, and reporting lineage — not with the algorithm choice.
Walkthroughs that pair with the guides above: Banking examples.