Bank Statement Narration Pattern Classification India

Terra Insight Reconciliation Infrastructure

Content authored by practitioners with experience at Amazon India, Intuit QuickBooks, and the Tata Group.

Published 12 June 2026

Domain expertise

TDS Reconciliation · GST Input Credit · Platform Settlements · NACH Batch Matching · Bank Reconciliation · Form 26AS Matching · ERP Integrations · Enterprise Finance Ops

Reviewed by

Navin Krishnan

Managing Director & Founder — Terra Insight

Ex Amazon India · Intuit QuickBooks · Tata nexarc

ISO 27001:2022 · Patent Pending · Incorporated 2024

Knowledge Card

Problem

Indian bank statement narrations are not standardised. The same NEFT credit can appear as 'NEFT CR:[UTR]/...' in one bank, 'NEFT-[UTR]-...' in another, and 'TRANSFER FROM ... UTR [UTR]' in a third. Without a classification library, reconciliation engines either miss the UTR or assign the transaction to the wrong family, breaking downstream auto-match.

How It's Resolved

An 18-family classification model anchored on short tokens (NEFT, RTGS, IMPS, UPI/, NACH, ECS, CHQ, CHRG, INT, TDS, CASH) routes each narration to one family before secondary extraction runs. Direction (debit or credit) splits same-anchor families like NEFT-In versus NEFT-Out. Ambiguous prefixes are resolved by a priority-ordered rule list that evaluates the most specific token first. Bank-specific delimiters (forward slash, pipe, dash) are normalised to a single canonical separator before UTR and counterparty extraction.

Configuration

Anchor token library with 20 to 25 entries, bank-specific delimiter normalisation profile (HDFC, ICICI, SBI, Axis, Kotak, Yes, IndusInd, PSU banks), direction-based disambiguation, aggregator counterparty allowlist for PG-Settlement routing, quarterly unmatched-narration review process.

Output

Each statement row tagged with one of 18 transaction families, a clean extracted match key (UTR, UPI ref, NACH batch ID, or cheque number), and a canonical counterparty string ready for fuzzy matching against the sub-ledger.

Indian bank statement narrations carry the entire reconciliation signal — UTR, counterparty, reference number, payment type — packed into a single free-text field. Unlike SWIFT MT940 or ISO 20022 CAMT.053 formats, the narration string has no formal schema. Every bank writes it differently. Treasury teams that try to write one regex per bank end up with dozens of brittle rules and high exception volumes. Teams that build a families-first classification library — anchor token, then extraction — get far higher hit rates with a fraction of the rules. This guide documents 18 narration families common across HDFC, ICICI, SBI, Axis, Kotak, Yes, IndusInd, and the major PSU banks, with the anchor tokens and match-key extraction logic that make them reusable across reconciliation, audit, and treasury workflows.

Why Narration Classification Matters

A typical mid-market Indian company processes 8,000 to 25,000 bank statement lines per month across four to six banks. Of these, roughly 60 to 70 percent are recurring families — payroll NEFT, supplier RTGS, customer UPI, NACH collections, bank charges, interest — that follow predictable structures. The remaining 30 to 40 percent are one-offs that need human review. The cost of reconciliation is dominated by misclassified routine transactions, not by genuine exceptions.

Without classification, every narration goes through a single fuzzy matching engine that tries to extract everything at once. The engine wastes cycles on bank charges (which never match an invoice) and gets confused by aggregator settlements (which look like NEFT but are bulk credits). With classification, the engine routes each line to the right downstream handler: invoice matching for customer credits, GL booking for charges, batch explosion for NACH, settlement-report join for aggregator credits.
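The routing step described above can be sketched as a lookup from family to handler. This is a minimal illustration only: the handler names and the exact family-to-handler mapping are assumptions derived from the handlers named in the paragraph, not a documented product API.

```python
# Hypothetical mapping from transaction family to downstream handler.
# Family names follow this article; handler names are illustrative only.
HANDLER_BY_FAMILY = {
    "NEFT-In": "invoice_matching",
    "RTGS-In": "invoice_matching",
    "Bank-Charges": "gl_booking",
    "NACH-Debit": "batch_explosion",
    "NACH-Credit": "batch_explosion",
    "PG-Settlement": "settlement_report_join",
}


def route(family: str) -> str:
    """Return the handler for a family; unknown families go to manual review."""
    return HANDLER_BY_FAMILY.get(family, "manual_review")


print(route("Bank-Charges"))   # gl_booking
print(route("PG-Settlement"))  # settlement_report_join
print(route("Unclassified"))   # manual_review
```

Keeping the mapping as data rather than branching logic means a new family only needs a new dictionary entry.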

A well-tuned classification library at a mid-market reconciliation deployment typically lifts auto-match rates from the high 50s to the high 80s — the same direction of movement reported in the public 51 percent to 88 percent benchmark — by removing noise from the matching pool.

The 18-Family Model

The classification library uses 18 transaction families grouped into six functional categories: electronic transfers (NEFT, RTGS, IMPS), unified payments (UPI variants), mandate-based collections (NACH, ECS), paper instruments (cheques), bank-generated entries (charges, interest, TDS), and cash. Each family has a short anchor token, a direction (debit, credit, or both), the typical banks that use the format, and a canonical match key that drives downstream matching.

Quick-Reference Family Table

Family	Anchor tokens	Direction	Typical banks	Match key
NEFT-In	NEFT CR, NEFT-, NEFT/CR	Credit	All scheduled commercial banks	UTR (16 char)
NEFT-Out	NEFT DR, NEFT-OUT, NEFT/DR	Debit	All scheduled commercial banks	UTR (16 char)
RTGS-In	RTGS CR, RTGS-, RTGS/CR	Credit	All RTGS member banks	UTR (22 char)
RTGS-Out	RTGS DR, RTGS-OUT, RTGS/DR	Debit	All RTGS member banks	UTR (22 char)
IMPS-In	IMPS CR, IMPS-, IMPS/CR, NFS	Credit	All IMPS member banks	RRN (12 char)
IMPS-Out	IMPS DR, IMPS-OUT, IMPS/DR	Debit	All IMPS member banks	RRN (12 char)
UPI-Coll	UPI/COLLECT, UPI-COL	Credit	All UPI member banks	UPI ref (12 char)
UPI-Pay	UPI/PAY, UPI/P2P, UPI-OUT	Debit	All UPI member banks	UPI ref (12 char)
UPI-P2M	UPI/P2M, UPI/MER	Credit	Merchant accounts	UPI ref + VPA
NACH-Debit	NACH/DR, NACH-COLL, ACH-D	Credit (to corporate)	Sponsor banks	NACH batch ID + UMRN
NACH-Credit	NACH/CR, ACH-C, SAL/NACH	Debit (from corporate)	Sponsor banks	NACH batch ID
ECS-Debit	ECS/DR, ECS-COLL	Credit	Legacy ECS users	ECS batch + mandate
Cheque-CR	CHQ DEP, CLG CR, CHQ/CR	Credit	All banks	Cheque number + clearing date
Cheque-DR	CHQ DR, CLG DR, CHQ/PAID	Debit	All banks	Cheque number
Bank-Charges	CHRG, CHG, SVC CHARGE, COMM	Debit	All banks	Charge code + period
Interest-Credit	INT CR, INT/SB, INT-FD	Credit	All banks	Period + product code
TDS-Debit	TDS U/S, TDS-194A, TDS/INT	Debit	All banks	Section + period
Cash-Deposit	CASH DEP, CDM CR, CSH/DR	Credit (or debit for withdrawal)	All banks	Branch code + slip number
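As a rough sketch of how the table can drive classification, the snippet below encodes a small subset of the anchors as data and matches them by prefix and direction. The prefix-only matching and the `classify` function are simplifying assumptions; a production library would also handle anchors that appear mid-narration.

```python
from typing import Optional

# Subset of the anchor table above: (anchor prefix, direction, family).
# More specific prefixes are listed before shorter ones that they contain.
ANCHORS = [
    ("NEFT CR", "credit", "NEFT-In"),
    ("NEFT DR", "debit", "NEFT-Out"),
    ("NEFT-OUT", "debit", "NEFT-Out"),
    ("NEFT-", "credit", "NEFT-In"),
    ("IMPS CR", "credit", "IMPS-In"),
    ("IMPS DR", "debit", "IMPS-Out"),
    ("CHRG", "debit", "Bank-Charges"),
    ("INT CR", "credit", "Interest-Credit"),
]


def classify(narration: str, direction: str) -> Optional[str]:
    """Return the first family whose anchor prefix and direction both match."""
    text = narration.strip().upper()
    for prefix, required_direction, family in ANCHORS:
        if text.startswith(prefix) and direction == required_direction:
            return family
    return None


print(classify("NEFT-ICIC261670000123|XYZ TRADERS", "credit"))  # NEFT-In
print(classify("CHRG/NEFT/MAY26", "debit"))                      # Bank-Charges
print(classify("MISC ADJUSTMENT", "credit"))                     # None
```

The sample narrations are hypothetical; only the anchor tokens and family names come from the table.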

Electronic Transfers — NEFT, RTGS, IMPS

NEFT and RTGS narrations both carry a UTR, but the lengths differ: an NEFT UTR is typically 16 characters and an RTGS UTR is 22. The 16-character layout is a 4-character bank code, 2-digit year, 3-digit day of year, and 7-digit sequence. The anchor token splits the family by direction (CR or DR suffix) and the secondary regex extracts the UTR by position. A typical narration:

NEFT CR:HDFC261671234567/ABC MANUFACTURING LTD/INV-2026-0391

The classification step matches the NEFT anchor and the CR: direction marker; the extraction step pulls HDFC261671234567 as the UTR and ABC MANUFACTURING LTD as the counterparty.
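A minimal sketch of the extraction step for this slash-delimited shape follows. The regex is an assumption: it accepts four letters followed by 12 to 18 digits so that it tolerates different reference lengths, and it should be tuned against real statements (RTGS references in particular may not be purely numeric after the bank code). The sample narration and the `parse_neft_credit` name are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class NeftFields(NamedTuple):
    utr: str
    counterparty: str
    reference: str


# Assumed shape: "NEFT CR:<UTR>/<counterparty>/<free-text reference>".
NEFT_CR = re.compile(
    r"^NEFT CR:(?P<utr>[A-Z]{4}\d{12,18})/(?P<counterparty>[^/]+)/(?P<reference>.*)$"
)


def parse_neft_credit(narration: str) -> Optional[NeftFields]:
    """Split a slash-delimited NEFT credit narration into its three fields."""
    match = NEFT_CR.match(narration.strip())
    if match is None:
        return None
    return NeftFields(
        utr=match["utr"],
        counterparty=match["counterparty"].strip(),
        reference=match["reference"].strip(),
    )


fields = parse_neft_credit("NEFT CR:ICIC261670000123/XYZ TRADERS PVT LTD/INV-0042")
print(fields.utr)           # ICIC261670000123
print(fields.counterparty)  # XYZ TRADERS PVT LTD
print(fields.reference)     # INV-0042
```

Returning `None` on a non-match lets the caller send the line to exception review instead of guessing a UTR.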

IMPS uses a 12-digit RRN (Retrieval Reference Number), not a UTR. The anchor token is IMPS, and the extraction looks for a 12-digit number after the direction marker. IMPS narrations are typically shorter than NEFT because the underlying NPCI message carries less structured data.
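One way to sketch the RRN extraction is a regex that accepts only a standalone run of exactly 12 digits, so that longer account numbers are not truncated into a false RRN. The `extract_rrn` helper and the sample narrations are hypothetical.

```python
import re
from typing import Optional

# Exactly 12 digits, not preceded or followed by another digit.
RRN = re.compile(r"(?<!\d)\d{12}(?!\d)")


def extract_rrn(narration: str) -> Optional[str]:
    """Return the first standalone 12-digit run, or None if there is none."""
    match = RRN.search(narration)
    return match.group(0) if match else None


print(extract_rrn("IMPS CR/616712345678/R KUMAR"))  # 616712345678
print(extract_rrn("IMPS CR/1234567890123/R KUMAR"))  # None (13 digits)
```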

Unified Payments — UPI Variants

UPI is split into three families because the downstream handling differs sharply. UPI-Coll covers collect requests initiated by the corporate. UPI-Pay covers outbound payments. UPI-P2M covers merchant inbound collections, which is where aggregator settlements live.

UPI-P2M narrations include the VPA in the format merchant@psp. The VPA suffix is the key disambiguation signal: @razorpay, @paytm, @cashfree, @payu indicate aggregator collections that need a settlement-report join. Direct merchant VPAs (merchantname@hdfcbank, merchantname@ybl) indicate single-order collections that match directly to invoices.
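The VPA-suffix check can be sketched as follows. The handle set repeats the four aggregator suffixes named above; the VPA regex, the route labels, and the sample narrations are assumptions for illustration.

```python
import re
from typing import Optional

# Aggregator handles named in the text; extend from your own settlement data.
AGGREGATOR_HANDLES = {"razorpay", "paytm", "cashfree", "payu"}

# Loose VPA shape: local part, "@", PSP handle.
VPA = re.compile(r"([A-Za-z0-9._-]+)@([A-Za-z0-9]+)")


def p2m_route(narration: str) -> Optional[str]:
    """Pick a downstream handler from the PSP handle of the first VPA found."""
    match = VPA.search(narration)
    if match is None:
        return None
    handle = match.group(2).lower()
    if handle in AGGREGATOR_HANDLES:
        return "settlement_report_join"
    return "invoice_matching"


print(p2m_route("UPI/P2M/616712345678/acmestore@razorpay"))  # settlement_report_join
print(p2m_route("UPI/P2M/616712345678/acmestore@hdfcbank"))  # invoice_matching
```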

Mandate-Based Collections — NACH, ECS

NACH-Debit (a credit to the corporate’s account from collected mandates) appears as a single batch line. The narration carries the batch reference and the sponsor bank code but does not break down individual mandates. The match key is the NACH batch ID plus the UMRN (Unique Mandate Reference Number) from the NPCI settlement report — a join the reconciliation engine must perform externally.

NACH-Credit (a debit from the corporate’s account for outbound mandates like salary, vendor payouts, or refunds) follows the same single-line pattern. The classification step routes the line to the NACH explosion handler instead of the invoice matcher.
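A rough sketch of the explosion step: given the batch ID from the narration and the rows of a settlement report, select the mandates in that batch and confirm they sum to the statement amount. The report field names (`batch_id`, `umrn`, `amount`) and the sample values are hypothetical; real settlement report layouts differ by sponsor bank.

```python
from decimal import Decimal
from typing import Dict, List, Tuple


def explode_batch(
    batch_id: str, statement_amount: Decimal, report: List[Dict]
) -> Tuple[List[Dict], bool]:
    """Return the report rows for a batch and whether they sum to the statement line."""
    rows = [row for row in report if row["batch_id"] == batch_id]
    total = sum((row["amount"] for row in rows), Decimal("0"))
    return rows, total == statement_amount


report = [
    {"batch_id": "B001", "umrn": "UMRN-A", "amount": Decimal("1500.00")},
    {"batch_id": "B001", "umrn": "UMRN-B", "amount": Decimal("2500.00")},
    {"batch_id": "B002", "umrn": "UMRN-C", "amount": Decimal("900.00")},
]
rows, balanced = explode_batch("B001", Decimal("4000.00"), report)
print(len(rows), balanced)  # 2 True
```

An unbalanced batch is a useful exception signal in its own right, since it usually points to returned or rejected mandates.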

ECS is the legacy predecessor to NACH and is being phased out, but a small number of mandates still settle through ECS rails. The anchor ECS plus the direction marker classifies the family.

Paper Instruments — Cheque

Cheque entries carry the 6-digit cheque number plus clearing date. The anchor tokens include CHQ, CLG (clearing), and PAID. The match key is the cheque number for both inward and outward, joined against the cheque issue register (for outward) or the receipts log (for inward).

Bank-Generated Entries — Charges, Interest, TDS

Bank charges are anchored by CHRG, CHG, SVC CHARGE, or COMM. The narration carries a charge code (NEFT charge, RTGS charge, account maintenance, SMS alerts) and a period. GST at 18 percent on charges is either embedded in the same line or appears as a separate GST ON CHRG debit. The classification step routes the entry to the bank-charges GL and flags the GST component for ITC matching against the bank’s monthly tax invoice.
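Where GST is embedded in the same line, the base charge and the tax component can be separated arithmetically. The sketch below assumes the debit is GST-inclusive at 18 percent and rounds to paise; the function name and the sample amount are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Tuple

GST_RATE = Decimal("0.18")


def split_inclusive_charge(gross: Decimal) -> Tuple[Decimal, Decimal]:
    """Split a GST-inclusive charge into (base charge, GST component)."""
    base = (gross / (1 + GST_RATE)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return base, gross - base


base, gst = split_inclusive_charge(Decimal("118.00"))
print(base, gst)  # 100.00 18.00
```

Computing GST as the remainder keeps base plus GST equal to the statement debit even after rounding.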

Interest credits use anchors like INT CR, INT/SB, or INT-FD. The narration carries the period and product code (savings bank, fixed deposit, sweep deposit). Under Section 194A, TDS at 10 percent is deducted by the bank if annual interest exceeds the threshold, and appears as a separate TDS debit.
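A simple plausibility check can pair an interest credit with its TDS debit. This sketch assumes the interest line shows the gross amount, that the threshold has already been crossed, and that a one-rupee tolerance absorbs rounding; all three are assumptions to verify against your bank's practice.

```python
from decimal import Decimal

TDS_RATE_194A = Decimal("0.10")


def tds_matches(
    interest_credit: Decimal, tds_debit: Decimal, tolerance: Decimal = Decimal("1.00")
) -> bool:
    """Check whether a TDS debit is about 10 percent of the interest credit."""
    expected = interest_credit * TDS_RATE_194A
    return abs(expected - tds_debit) <= tolerance


print(tds_matches(Decimal("52000.00"), Decimal("5200.00")))  # True
print(tds_matches(Decimal("52000.00"), Decimal("520.00")))   # False
```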

TDS debits use anchors like TDS U/S, TDS-194A, or TDS/INT. The classification step routes the entry to the TDS reconciliation workflow that reconciles bank-deducted TDS against Form 26AS credits.

Cash

Cash deposits and withdrawals appear with anchors CASH DEP, CDM CR (cash deposit machine), or CSH/DR. The match key is the branch code plus slip number, joined against the petty-cash or cashier register.

Resolving Ambiguity Between Similar Prefixes

Three ambiguity patterns recur and need explicit rules:

TRF / TRANSFER: used by SBI and some PSU banks for any transfer (NEFT, RTGS, internal sweep, branch transfer). Disambiguation: the token after TRANSFER is FROM for inward, TO for outward, and BY for internal sweep. Combined with the direction (debit or credit) on the row, the family resolves uniquely.
NEFT versus NEFT-OUT versus NEFT-RETURNED: some banks render NEFT returns (failed credits sent back) with the same NEFT prefix as fresh credits. Disambiguation: a RETURN, RTN, or REVERSED token in the narration routes the line to a separate NEFT-Return family, outside the core 18, that should not auto-match against open invoices.
NACH versus salary NACH versus vendor NACH: the underlying rail is the same, but the GL treatment differs. Disambiguation: a SAL token in the narration routes the line to payroll reconciliation; a VEND or SUP token routes it to vendor payouts; absence of either keeps it in generic NACH-Credit.

A practical implementation evaluates ambiguity rules in priority order: most specific token wins. This avoids the need for backtracking and keeps the rule library auditable.
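The priority-ordered evaluation can be sketched as a flat rule list where the first match wins. The rules below encode the three patterns above; the regexes, the sub-family labels (`NACH-Payroll`, `NACH-Vendor`, `Transfer-In`, `Transfer-Out`, `Internal-Sweep`), and the sample narrations are hypothetical.

```python
import re
from typing import Optional

# (regex, required direction or None for either, family), most specific first.
RULES = [
    (r"^NEFT.*\b(RETURN|RTN|REVERSED)\b", None, "NEFT-Return"),
    (r"^NACH.*\bSAL\b", "debit", "NACH-Payroll"),
    (r"^NACH.*\b(VEND|SUP)\b", "debit", "NACH-Vendor"),
    (r"^NACH", "debit", "NACH-Credit"),
    (r"^(TRF|TRANSFER) FROM\b", "credit", "Transfer-In"),
    (r"^(TRF|TRANSFER) TO\b", "debit", "Transfer-Out"),
    (r"^(TRF|TRANSFER) BY\b", None, "Internal-Sweep"),
    (r"^NEFT", "credit", "NEFT-In"),
    (r"^NEFT", "debit", "NEFT-Out"),
]
COMPILED = [(re.compile(pattern), direction, family) for pattern, direction, family in RULES]


def resolve(narration: str, direction: str) -> Optional[str]:
    """Return the family of the first rule that matches; no backtracking."""
    text = narration.strip().upper()
    for pattern, required_direction, family in COMPILED:
        if required_direction in (None, direction) and pattern.search(text):
            return family
    return None


print(resolve("NEFT CR:ICIC261670000123/XYZ TRADERS/RETURN", "credit"))  # NEFT-Return
print(resolve("NACH/CR/SAL/MAY26", "debit"))                              # NACH-Payroll
print(resolve("TRANSFER FROM XYZ TRADERS", "credit"))                     # Transfer-In
```

Because the list is evaluated top to bottom, an auditor can read the rule order directly to see why a narration landed in a given family.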

Worked Rupee Example

A mid-sized D2C brand processes 14,200 narrations across HDFC, ICICI, and Yes Bank in May 2026. Before classification, the auto-match rate is 56 percent — 8,000 lines match automatically, 6,200 hit exceptions and take an average of 4.5 minutes each to resolve. At a fully-loaded analyst cost of ₹520 per hour, monthly exception cost is roughly ₹2.42 lakh.

After deploying an 18-family classification library: auto-match lifts to 87 percent — 12,350 lines match automatically, 1,850 hit exceptions. Monthly exception cost falls to roughly ₹72,000. The ₹1.7 lakh monthly saving annualises to above ₹20 lakh on this single dimension, before counting audit-prep time savings.
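The arithmetic behind these figures can be reproduced in a few lines. The function name is illustrative; the inputs are the ones stated in the example.

```python
def monthly_exception_cost(exception_lines: int, minutes_per_line: float, hourly_cost: float) -> float:
    """Analyst cost of resolving exceptions for one month, in rupees."""
    return exception_lines * minutes_per_line / 60 * hourly_cost


before = monthly_exception_cost(6200, 4.5, 520)
after = monthly_exception_cost(1850, 4.5, 520)
saving = before - after

print(f"{before:,.0f}")       # 241,800
print(f"{after:,.0f}")        # 72,150
print(f"{saving:,.0f}")       # 169,650
print(f"{saving * 12:,.0f}")  # 2,035,800
```

The rounded figures in the text (₹2.42 lakh, ₹72,000, ₹1.7 lakh, above ₹20 lakh) follow from these values.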

To model the exception cost for your own bank statement profile, use the Three-Way Match Exception Cost Calculator — input your monthly volumes, exception rate, and analyst cost to see the year-one saving from narration classification plus matching automation.

Maintaining the Library Over Time

Banks update narration formats every 18 to 24 months — typically alongside core banking upgrades. A quarterly review of the top 20 unmatched narration patterns is sufficient to catch most drift. Major regulatory events (ISO 20022 migration milestones, RBI tokenisation changes, RuPay credit on UPI) trigger out-of-cycle updates. The Reserve Bank of India publishes the underlying payment system specifications that drive these format changes — bookmark its master directions on payment and settlement systems.
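One way to run the quarterly review is to mask the variable parts of each unmatched narration and count the resulting shapes. The sketch below masks digit runs only, which is an assumption; counterparty names may also need masking in practice. The sample narrations are hypothetical.

```python
import re
from collections import Counter
from typing import List, Tuple


def pattern_of(narration: str) -> str:
    """Reduce a narration to its shape by replacing each digit run with '#'."""
    return re.sub(r"\d+", "#", narration.strip().upper())


def top_unmatched_patterns(narrations: List[str], n: int = 20) -> List[Tuple[str, int]]:
    """Return the n most frequent narration shapes with their counts."""
    return Counter(pattern_of(text) for text in narrations).most_common(n)


unmatched = ["BIL/001234/ACME", "BIL/009999/ACME", "XFER 77"]
print(top_unmatched_patterns(unmatched, n=1))  # [('BIL/#/ACME', 2)]
```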

For teams setting up classification from scratch, the bank-by-bank guides cover HDFC, ICICI, and SBI specifics. Combine those with the cross-bank classification model in this article to ship a reusable library across your reconciliation stack.

For enterprise finance teams scoping a deployment, bank reconciliation software for India should ship with a pre-built 18-family classification library tuned for the major Indian banks, plus a no-code rule editor so finance teams can add or amend bank-specific rules without engineering tickets. A full-stack reconciliation software implementation covers narration classification as one capability within end-to-end bank, ledger, and statutory reconciliation. See the guides on HDFC bank reconciliation, ICICI bank reconciliation, and MT940 bank statement reconciliation for bank-specific and format-specific deep dives that complement this classification model.

Primary reference: Reserve Bank of India — where guidelines for enterprise current accounts and statement standards in India are published.

Frequently Asked Questions

Why are bank statement narrations not standardised in India?

RBI mandates the underlying payment systems (NEFT, RTGS, IMPS, UPI, NACH) and the data they carry, but does not prescribe the exact text format banks must use in customer statement narrations. Each bank decides how to render the payment system message in its core banking system. HDFC prefixes NEFT credits with 'NEFT CR:' and forward-slash delimited fields; SBI uses 'TRANSFER FROM' and dash separators; ICICI uses 'NEFT-' and pipe delimiters. The underlying UTR is identical, but the surrounding text varies. A narration classification library normalises across these bank-specific conventions.
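Delimiter normalisation can be sketched as a per-bank profile. The profile below uses the three conventions mentioned in this answer, with forward slash as the canonical separator; the function name and sample narration are hypothetical. Note that a dash profile will also split dashes inside invoice references, so dash-delimited banks usually need a more careful rule than this.

```python
# Per-bank field delimiter, taken from the conventions described above.
DELIMITER_BY_BANK = {"HDFC": "/", "ICICI": "|", "SBI": "-"}
CANONICAL = "/"


def normalise(narration: str, bank: str) -> str:
    """Rewrite a narration so its fields are separated by the canonical delimiter."""
    delimiter = DELIMITER_BY_BANK.get(bank, CANONICAL)
    return CANONICAL.join(part.strip() for part in narration.split(delimiter))


print(normalise("NEFT-ICIC261670000123 | XYZ TRADERS | INV 42", "ICICI"))
# NEFT-ICIC261670000123/XYZ TRADERS/INV 42
```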

What is an anchor token in narration parsing?

An anchor token is a short, high-confidence string that uniquely identifies a transaction family in a narration. Examples: 'NEFT' anchors all NEFT transactions, 'UPI/' anchors UPI, 'CHRG' anchors bank service charges. Anchor tokens are matched first, then secondary regex extracts the UTR, counterparty, and reference. A well-tuned anchor library uses 20 to 25 tokens to cover above 95 percent of narrations across the major Indian banks.

How do you resolve ambiguity between similar narration prefixes?

Two common cases illustrate the approach. First, 'TRF' can mean inward transfer, outward transfer, or internal sweep depending on the bank; this is resolved by debit/credit direction plus the next token. Second, 'UPI/' covers P2P, P2M, and aggregator collection; this is resolved by checking the VPA suffix (bank handles for P2P, @razorpay or @paytm for aggregators) and the amount band. Ambiguity rules should be encoded as a priority-ordered list, with the most specific rule evaluated first.

Do payment gateway aggregator settlements appear in the same narration as direct UPI?

No. Razorpay, PayU, Cashfree, and Stripe India settlements appear as bulk NEFT or IMPS credits with the aggregator's legal entity name in the counterparty field — for example, 'NEFT CR:[UTR]/RAZORPAY SOFTWARE PVT LTD/SETTLEMENT'. The narration does not list the underlying merchant transactions. Reconciliation requires the aggregator's settlement report (CSV or API) to explode the single credit into the underlying orders. The narration classification library should route aggregator settlements to a dedicated 'PG-Settlement' family rather than treating them as generic NEFT credits.
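The PG-Settlement routing can be sketched as a second pass over lines already classified as inbound NEFT or IMPS. The allowlist below uses brand-name prefixes taken from the aggregators named above; matching on prefixes rather than full legal entity names is an assumption, and the list should be built from the names that actually appear on your statements.

```python
# Brand prefixes for the aggregators named in the text (illustrative allowlist).
AGGREGATOR_PREFIXES = ("RAZORPAY", "PAYU", "CASHFREE", "STRIPE")
INBOUND_FAMILIES = {"NEFT-In", "IMPS-In"}


def refine_family(family: str, counterparty: str) -> str:
    """Re-route inbound transfers from allowlisted aggregators to PG-Settlement."""
    name = counterparty.strip().upper()
    if family in INBOUND_FAMILIES and name.startswith(AGGREGATOR_PREFIXES):
        return "PG-Settlement"
    return family


print(refine_family("NEFT-In", "RAZORPAY SOFTWARE PVT LTD"))  # PG-Settlement
print(refine_family("NEFT-In", "ABC MANUFACTURING LTD"))      # NEFT-In
```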

How often should a narration library be updated?

Banks change narration formats roughly once every 18 to 24 months, typically alongside core banking system upgrades. A practical cadence is a quarterly review of unmatched narrations: the top 20 unmatched narration patterns from the previous quarter become candidates for new rule additions. Major events — RBI mandate changes, ISO 20022 migration milestones, new payment products like RuPay Credit on UPI — trigger an out-of-cycle update.
