Regulatory Reporting Data Lineage for Banking in South Africa

The real problem in regulatory reporting is seldom that a return cannot be submitted. Most South African financial institutions can get the submission out. The problem is whether the institution can explain, under pressure, where the numbers came from, what changed, who approved the definitions, and why the return agrees, or fails to agree, with finance, risk, treasury, and source system records.

That is why data lineage for regulatory reporting in South African banking is no longer a technical housekeeping topic. It is an executive control issue. Prudential returns, SARB interactions, internal audit reviews, external audit queries, and board risk committees all depend on the same foundation: traceable data, agreed definitions, disciplined reconciliations, and clear ownership.

For banks, insurers with banking exposures, mutual banks, and other regulated financial institutions, weak lineage creates a familiar pattern. Finance produces one view. Risk produces another. Treasury has its own classification. Regulatory reporting teams spend the month-end cycle explaining differences manually. Senior executives receive late escalations on items that should have been controlled at source.

This article sets out what leadership teams should expect from a fit-for-purpose regulatory reporting data foundation.

The return is only the final artefact

A prudential return is the visible output. The control environment behind it matters more.

Executives often see the submitted return, the sign-off pack, and perhaps a reconciliation summary. What they do not always see is the chain of data decisions beneath the numbers: product mappings, counterparty classifications, currency conversions, impairment inputs, maturity buckets, consolidation adjustments, manual overlays, and late corrections.

In a South African bank, a single regulatory line item may depend on data from core banking platforms, lending systems, collateral registers, finance ledgers, risk engines, treasury systems, spreadsheets, and manual journals. If that chain is not documented, the institution is relying on staff memory.

That is risky for three reasons.

First, experienced reporting staff leave or rotate. Knowledge that sits in one analyst’s workbook is not a control.

Second, regulatory interpretations evolve. When the Prudential Authority asks for clarification, the institution must be able to show the calculation logic that applied at the time of submission.

Third, operating conditions are not always stable. Load-shedding, system downtime, delayed batch processes, and data warehouse failures can disrupt reporting timetables. If lineage is unclear, contingency processes become guesswork.

Good reporting governance starts by treating the return as the end of a controlled data supply chain, not as an isolated finance deliverable.

What data lineage should show

Data lineage is the documented path from original record to reported number. For regulatory reporting, it must be practical enough for finance and risk teams to use, not just technical enough for IT architects to admire.

At minimum, lineage for a prudential return should answer six questions:

  1. Where does the data originate?
    For example, is an exposure sourced from a lending platform, a treasury book, a card system, or a manually maintained register?

  2. What happens to it before reporting?
    This includes enrichment, classification, aggregation, exclusions, currency translation, and adjustments.

  3. Which rule or definition is applied?
    A maturity bucket, related-party flag, arrears category, collateral type, or sector classification should not depend on individual interpretation.

  4. Who owns the data element?
    Ownership must sit with an accountable business function, not only with the system team that stores the field.

  5. Where is the control point?
    The institution should know where completeness, validity, and reasonableness are checked.

  6. What changed since the last reporting cycle?
    New products, system changes, mapping updates, and manual amendments must be visible.

This is not about documenting every field in every system at the same level of detail. That would be expensive and unmanageable. The focus should be on critical data elements: those that materially affect regulatory returns, capital calculations, liquidity measures, large exposure reporting, credit risk classification, or executive attestations.
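The six questions above can be captured as a simple, structured lineage record per critical data element. The sketch below is illustrative only: the field names, the example element, and the reference to a "Reporting definition v3" are assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, field

@dataclass
class LineageRecord:
    """Documented lineage for one critical data element in a prudential return."""
    element: str                  # the reported data element
    source: str                   # 1. where the data originates
    transformations: list[str]    # 2. what happens to it before reporting
    rule_reference: str           # 3. which rule or definition is applied
    owner: str                    # 4. accountable business function
    control_point: str            # 5. where completeness and validity are checked
    changes_this_cycle: list[str] = field(default_factory=list)  # 6. what changed

# Illustrative entry for a single element; all values are hypothetical.
record = LineageRecord(
    element="Exposure maturity bucket",
    source="Lending platform",
    transformations=["currency translation", "aggregation by counterparty"],
    rule_reference="Reporting definition v3, effective 2024-01 (illustrative)",
    owner="Credit risk",
    control_point="Month-end completeness check against the general ledger",
)
```

Even a lightweight register of this shape, maintained only for critical data elements, turns the six questions into something that can be reviewed, versioned, and audited.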

For a broader view of how this fits into banking data strategy, see Zorinthia’s banking data strategy advisory hub.

Definitions fail before reports fail

Many reporting disputes start with a definition that was never properly agreed.

Consider a commercial property exposure. Finance may classify it by product type. Credit risk may classify it by counterparty and collateral. Treasury may focus on funding tenor. Regulatory reporting may need a specific treatment based on the applicable prudential instruction. Each view may be valid for its own purpose, but the return needs one controlled reporting definition.

A regulatory reporting data dictionary should therefore be more than a glossary. It should define:

  • the business meaning of each critical data element;
  • the permitted values or classifications;
  • the source of authority for the definition;
  • the system of record;
  • the owner responsible for correctness;
  • the rule used when source systems disagree;
  • the effective date of definition changes.

The ownership point is important. A data dictionary without accountable owners becomes a reference document that slowly goes stale. If “counterparty sector” is material to a prudential return, someone must be authorised to decide how it is defined, how exceptions are handled, and when the definition changes.

In many institutions, IT is unfairly expected to resolve these issues. IT can manage platforms, access, integration, and technical controls. It should not decide the regulatory meaning of a restructured loan, a connected counterparty, or an exposure class. Those are business and regulatory interpretation decisions, usually involving finance, risk, credit, treasury, and compliance.
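The seven dictionary properties listed above can also be made machine-checkable, so that a reported value is validated against the permitted classifications rather than left to individual interpretation. The sketch below is a minimal illustration: the element name, the permitted sector values, and the conflict rule are hypothetical examples, not a standard taxonomy.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class DictionaryEntry:
    """One entry in a regulatory reporting data dictionary."""
    element: str
    business_meaning: str
    permitted_values: tuple[str, ...]
    authority: str            # source of authority for the definition
    system_of_record: str
    owner: str                # accountable for correctness
    conflict_rule: str        # rule used when source systems disagree
    effective_from: date      # effective date of the current definition

    def validate(self, value: str) -> bool:
        """Check a reported value against the permitted classifications."""
        return value in self.permitted_values

# Hypothetical entry for "counterparty sector"; values are illustrative only.
sector = DictionaryEntry(
    element="counterparty_sector",
    business_meaning="Economic sector of the counterparty for prudential reporting",
    permitted_values=("sovereign", "bank", "corporate", "retail"),
    authority="Applicable prudential reporting instruction (illustrative)",
    system_of_record="Core banking platform",
    owner="Credit risk",
    conflict_rule="System of record prevails; exceptions go to the governance forum",
    effective_from=date(2024, 1, 1),
)

assert sector.validate("bank")
assert not sector.validate("unclassified")
```

The point of the `effective_from` field is that definition changes become dated events rather than silent edits, which is exactly what a regulator's "what applied at the time of submission" question requires.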

Reconciliations are the control system

Reconciliations are often treated as month-end tasks. In regulatory reporting, they should be designed as a control system.

A useful reconciliation does not merely prove that two totals differ. It explains the difference in categories that management can act on. For example:

  • timing differences between source systems and the general ledger;
  • valid scope differences between financial reporting and prudential reporting;
  • mapping differences caused by product or portfolio classification;
  • manual adjustments approved for regulatory treatment;
  • data quality issues requiring source correction.

A bank preparing prudential returns may reconcile lending balances from product systems to the general ledger, then reconcile the regulatory population to the adjusted ledger view. If the difference is explained only by a spreadsheet note saying “classification adjustment”, the control is weak. The adjustment should have a reason, owner, approval record, and repeatable treatment.

Strong reconciliations also reduce executive noise. Instead of bringing unresolved breaks to a CFO or CRO two days before submission, the reporting team can escalate only items that exceed thresholds, indicate control failure, or require policy interpretation.
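The escalation discipline described above can be sketched as a simple rule: a break is escalated only if it is unexplained, unapproved, or above a materiality threshold. The break categories follow the list earlier in this section; the threshold amount, owners, and break descriptions are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Break categories from the reconciliation design above.
CATEGORIES = {"timing", "scope", "mapping", "manual_adjustment", "data_quality"}
ESCALATION_THRESHOLD = 1_000_000  # illustrative materiality threshold, not a policy figure

@dataclass
class Break:
    description: str
    amount: float
    category: Optional[str] = None   # None means the break is unexplained
    owner: Optional[str] = None
    approved: bool = False

def needs_escalation(b: Break) -> bool:
    """Escalate breaks that are unexplained, unowned, unapproved, or material."""
    if b.category not in CATEGORIES or b.owner is None or not b.approved:
        return True
    return abs(b.amount) > ESCALATION_THRESHOLD

# Hypothetical month-end breaks.
breaks = [
    Break("Late batch from lending system", 250_000, "timing", "Finance ops", True),
    Break("Classification adjustment", 4_200_000, "mapping", "Credit risk", True),
    Break("Unreconciled ledger difference", 80_000),  # no category, owner, or approval
]

escalated = [b.description for b in breaks if needs_escalation(b)]
# Only the material break and the unexplained break reach the executive.
```

The first break is categorised, owned, approved, and below threshold, so it stays with the reporting team; the other two are exactly the items a CFO or CRO should see before submission.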

For a practical illustration of this problem, see the regulatory reporting reconciliation scenario.

Ownership must cross functional boundaries

Regulatory reporting sits across finance, risk, treasury, compliance, operations, and technology. That is why ownership fails when it is assigned only inside the reporting team.

A workable model separates four responsibilities.

Business data owners are accountable for the correctness of critical data in their domain. Credit may own borrower risk attributes. Treasury may own instrument and liquidity attributes. Finance may own ledger mapping and chart of accounts treatment.

Regulatory reporting owners are accountable for the return, interpretation of reporting requirements, submission process, and sign-off pack.

Technology custodians are accountable for systems, data movement, access control, availability, and technical integrity.

Governance forums resolve conflicts, approve definition changes, and prioritise remediation where issues cross departments.

This does not require a large bureaucracy. In a mid-sized South African institution, a monthly data quality and reporting control forum may be enough if it has the right authority. The problem is not the number of meetings. The problem is unresolved accountability.

Executive committees should be wary of reporting structures where every issue is described as “a data problem”. That phrase often hides a decision that no one has been empowered to make.

POPIA and evidence cannot be afterthoughts

Regulatory reporting data often includes or derives from personal and confidential information: customer identifiers, account behaviour, arrears status, income data, collateral details, and related-party information. POPIA does not prevent regulated reporting, but it does require disciplined handling of personal information.

That affects lineage in practical ways.

Access to detailed source data should be limited to people who need it for a defined purpose. Reporting packs should avoid unnecessary personal information where aggregated evidence is sufficient. Extracts used for reconciliations should be stored securely and retained according to policy, not kept indefinitely in personal drives. Where manual files are used, the institution should know who created them, when, from which source, and for what reporting cycle.

Auditability matters as much as privacy. If a regulator, internal audit team, or external assurance provider asks how a number was produced, the institution should be able to retrieve the relevant evidence without reconstructing the answer from scratch.

This is particularly important where spreadsheet-based processes remain part of the environment. Spreadsheets are not automatically unacceptable. Uncontrolled spreadsheets are. Version control, locked formula areas, review evidence, and clear source references can make a major difference while longer-term automation is being considered.
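The evidence requirements above (who created an extract, when, from which source, and for which reporting cycle) amount to a small metadata record per file. The sketch below is one way to hold that record; the five-year retention period and the file details are illustrative assumptions, not POPIA or policy figures.

```python
from dataclasses import dataclass
from datetime import date, timedelta

RETENTION_YEARS = 5  # illustrative retention period; actual periods come from policy

@dataclass(frozen=True)
class ExtractRecord:
    """Minimum metadata for a manual extract used in a reporting cycle."""
    file_name: str
    created_by: str
    created_on: date
    source_system: str
    reporting_cycle: str   # e.g. "2024-06"

    def retention_expiry(self) -> date:
        """Date after which the extract should be disposed of per policy."""
        return self.created_on + timedelta(days=365 * RETENTION_YEARS)

# Hypothetical reconciliation extract.
extract = ExtractRecord(
    file_name="lending_recon_2024_06.xlsx",
    created_by="reporting.analyst",
    created_on=date(2024, 7, 5),
    source_system="Lending platform",
    reporting_cycle="2024-06",
)
```

A register of records like this is what lets an institution answer an audit or regulator query about a prior cycle without reconstructing the evidence from personal drives.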

Start with the highest-risk return, not the largest project

Many institutions delay lineage work because they imagine a multi-year enterprise mapping exercise. That is usually the wrong starting point.

A more practical approach is to select one high-risk return, schedule, or reporting area and build a controlled model around it. Good candidates include returns with repeated audit findings, frequent manual adjustments, unexplained breaks, late sign-offs, or high management anxiety.

A focused diagnostic should cover:

  • the critical data elements used in the selected return;
  • the current source systems and manual inputs;
  • definitions and interpretation points;
  • reconciliations and unresolved breaks;
  • owners and approvers;
  • known data quality issues;
  • evidence retained for prior submissions;
  • system dependencies and operational risks, including outage scenarios.

The output should not be a theoretical architecture diagram. It should be an agreed control map, a prioritised remediation list, and a decision record for definitions that require executive or governance approval.

This approach suits the reality of many South African financial institutions. Data maturity is uneven. Legacy systems remain important. Budgets are constrained. Reporting teams are already stretched. The aim is not perfection; it is visible control over the data that matters most.

The executive test

Executives do not need to inspect every mapping table. They do need confidence that the institution can defend its numbers.

A CFO, CRO, or regulatory reporting executive should be able to ask:

  • Which prudential returns have documented lineage for their critical data elements?
  • Where do we rely on manual adjustments, and who approves them?
  • Which definitions are disputed or inconsistently applied?
  • Which reconciliations produce recurring breaks?
  • What evidence would we provide if challenged on last quarter’s submission?
  • Which data issues create the greatest regulatory, financial, or reputational risk?

If these questions produce vague answers, the institution does not have a reporting problem only at month-end. It has a governance problem in the data supply chain.

The next step is straightforward: choose one material prudential reporting area, trace the data from source to submission, and identify where definitions, reconciliations, ownership, and evidence are weakest. That exercise will show leadership whether the current reporting process is controlled — or merely dependent on effort, memory, and late-cycle intervention.