The Challenge: A Finance Function Drowning in Manual Reporting
The finance function at this institution — $800 billion AUM, operations in 40 countries — ran on a model that hadn't changed in fifteen years. Every month-end close required 200+ analysts to manually extract data from twelve separate source systems, write variance commentary in Excel, and compile regulatory reports by hand. The close cycle ran 12 business days. Regulators wanted it faster. The CFO wanted it cheaper.
Three specific pain points drove this engagement. First, management reporting commentary: every month, analysts spent an average of 4 hours each writing free-text explanations of budget variances. The explanations were inconsistent in quality, style, and depth. A senior analyst's commentary looked nothing like a junior's. This created downstream noise for the executive team who had to reconcile 80 versions of the same narrative.
Second, regulatory filings. The bank filed 47 regulatory reports monthly across jurisdictions — Basel III capital adequacy, liquidity coverage ratio, net stable funding ratio, and more. Each had bespoke templates, specific data sourcing rules, and strict validation requirements. Errors cost money. In 2023, a formatting error in an NSFR filing triggered a regulatory enquiry that consumed 300 hours of compliance resource to resolve.
Third, the intercompany reconciliation process. With 40 legal entities, intercompany eliminations at close involved matching 180,000+ transaction pairs across systems that didn't speak to each other cleanly. Reconciliation analysts spent 60% of their time on exception management — chasing mismatches, emailing operations teams, and manually adjusting journals. The other 40% of their time was spent documenting what they'd done.
The Brief
Reduce month-end close cycle from 12 to 5 business days. Automate tier-1 variance commentary. Cut regulatory filing error rate to zero. The timeline: 12 weeks to production-ready deployment across 300 finance users.
Our Approach: Three Claude Agents, One Finance Platform
We don't build chatbots and call them AI. For this engagement, we designed three purpose-built Claude AI agents — each targeting one of the bank's core finance pain points — with a shared governance layer and audit trail built for a regulated environment. Every design decision assumed that a regulator would eventually examine these AI systems.
Agent 1: Variance Commentary Engine
The Variance Commentary Engine connects to the bank's Oracle Hyperion EPM via a custom MCP server. At month-end close, when Hyperion signals that actuals are loaded, the agent retrieves actual vs. budget vs. prior-year data for every cost centre and P&L line, enriched with context from the bank's internal narrative database — a curated library of approved phrases, prior commentary, and entity-specific business context documents.
The agent produces first-draft variance commentary in the bank's house style: concise, structured, jargon-appropriate for the target audience (executive committee vs. board vs. regulator). Commentary is graded by confidence score. Variances where Claude has high confidence and rich context are auto-approved. Variances that exceed threshold materiality or where the agent flags uncertainty are routed for human review. In month three, auto-approval rate reached 73%.
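The routing rule can be sketched in a few lines. This is an illustrative reconstruction, not the bank's actual implementation: the threshold values, field names, and the materiality rule are assumptions.

```python
from dataclasses import dataclass

# Hypothetical values: in the deployment these are tunable configuration,
# not hard-coded constants.
AUTO_APPROVE_CONFIDENCE = 0.85   # minimum confidence for auto-approval
MATERIALITY_THRESHOLD = 250_000  # variances above this always get a reviewer

@dataclass
class VarianceItem:
    cost_centre: str
    variance_amount: float   # actual minus budget
    confidence: float        # model-reported confidence in the draft commentary

def route(item: VarianceItem) -> str:
    """Return 'auto_approve' or 'human_review' for a drafted commentary item."""
    if abs(item.variance_amount) > MATERIALITY_THRESHOLD:
        return "human_review"   # material variances are never auto-approved
    if item.confidence < AUTO_APPROVE_CONFIDENCE:
        return "human_review"   # low-confidence drafts are escalated
    return "auto_approve"
```

The key design choice is that materiality overrides confidence: a highly confident draft on a large variance still goes to a human.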
Agent 2: Regulatory Filing Validator
The Regulatory Filing Validator runs as a quality gate before any regulatory submission. Analysts complete filings using existing tools, then submit to the agent before sign-off. The agent cross-checks every data point in the filing against source system data via MCP connections to the bank's data warehouse, validates calculation logic against regulatory templates, flags any data lineage breaks, and produces a structured sign-off pack with evidence of validation.
For the NSFR filing specifically, the agent catches the class of error that caused the 2023 incident — line items populated from the wrong data source — because it validates data provenance, not just data values. In six months of production, the agent processed 282 regulatory filings and flagged 14 errors before submission. All 14 were corrected before filing. No incidents.
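Provenance validation — checking where a number came from, not just what it is — can be sketched as follows. The line identifiers, system names, and tolerance are hypothetical; the real validator works from the bank's regulatory templates and data lineage metadata.

```python
# Assumed mapping from filing line to its approved source system.
APPROVED_SOURCES = {
    "NSFR_line_12": "liquidity_warehouse",
    "NSFR_line_13": "liquidity_warehouse",
    "CAP_line_04": "capital_ledger",
}

def validate_line(line_id, reported_value, source_system, source_value, tolerance=0.01):
    """Return a list of findings for one filing line (empty list = clean)."""
    findings = []
    expected = APPROVED_SOURCES.get(line_id)
    if expected is None:
        findings.append(f"{line_id}: no approved data source on record")
    elif source_system != expected:
        # The class of error behind the 2023 incident: a plausible value
        # populated from the wrong system.
        findings.append(f"{line_id}: populated from {source_system}, expected {expected}")
    if abs(reported_value - source_value) > tolerance:
        findings.append(f"{line_id}: value {reported_value} differs from source {source_value}")
    return findings
```

A value-only check would pass a line sourced from the wrong system as long as the numbers happened to agree; the provenance check catches it regardless.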
Agent 3: Intercompany Reconciliation Assistant
The Intercompany Reconciliation Assistant connects to both the bank's general ledger and its intercompany netting platform via MCP, ingests the full exception population each morning, and prioritises the queue by materiality, aging, and entity risk profile. For each exception, the agent researches probable root cause — comparing transaction-level detail, examining prior period resolution patterns, and drafting a proposed resolution path with supporting evidence.
Reconciliation analysts receive exceptions pre-diagnosed rather than raw. Average time to resolve a tier-1 exception fell from 47 minutes to 11 minutes. The agent handles tier-3 exceptions (low value, recurring patterns) end-to-end, escalating only when resolution requires a journal entry above a defined threshold.
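The morning prioritisation pass — materiality, aging, entity risk — might look like the sketch below. The weights and the 1-to-5 entity risk scale are assumptions for illustration, not the bank's tuned values.

```python
def priority_score(amount: float, age_days: int, entity_risk: int) -> float:
    """Higher score = worked first. entity_risk assumed on a 1-5 scale."""
    materiality = min(abs(amount) / 1_000_000, 1.0)  # cap contribution at $1M+
    aging = min(age_days / 30, 1.0)                  # cap at one month
    risk = entity_risk / 5
    # Illustrative weighting: materiality dominates, then aging, then risk.
    return 0.5 * materiality + 0.3 * aging + 0.2 * risk

def prioritise(exceptions):
    """Sort (id, amount, age_days, entity_risk) tuples, highest priority first."""
    return sorted(exceptions, key=lambda e: priority_score(e[1], e[2], e[3]), reverse=True)
```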
Technical Architecture: Built for a Regulated Bank
Deploying Claude in a top-20 global bank requires more than API keys and good prompts. The architecture had to satisfy the bank's group CISO, pass model risk management review, meet local regulatory AI governance requirements in three jurisdictions, and integrate with a 15-year-old EPM system that the IT team hadn't touched in five years.
Data Handling and Privacy
All Claude API calls are routed through Claude Enterprise, configured with zero data retention. Financial data never leaves the bank's network perimeter in plaintext. We deployed the MCP server layer within the bank's private cloud, so the only data that reaches Anthropic's API is the constructed prompt — stripped of customer PII and structured to minimise sensitive data exposure. The bank's data classification framework was mapped to a prompt construction policy: certain data types (customer account numbers, specific trader names) are excluded from prompts by design, replaced with anonymised tokens resolved client-side.
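The token substitution resolved client-side can be sketched minimally. The account-number pattern and token format here are assumptions; the production policy is driven by the bank's data classification framework, not a single regex.

```python
import re

ACCOUNT_RE = re.compile(r"\b\d{8,12}\b")  # assumed account-number shape

def tokenise(text: str):
    """Replace account numbers with opaque tokens; return (masked_text, mapping)."""
    mapping = {}
    def replace(match):
        token = f"<ACCT_{len(mapping) + 1}>"
        mapping[token] = match.group(0)  # kept client-side, never sent to the API
        return token
    return ACCOUNT_RE.sub(replace, text), mapping

def detokenise(text: str, mapping) -> str:
    """Resolve tokens back to real values inside the bank's perimeter."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text
```

Only the masked text is sent in the prompt; the mapping never leaves the bank's network, so the model works with placeholders it cannot resolve.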
Audit Logging and Explainability
Every Claude interaction is logged with full input/output capture to an immutable audit store. Logs are structured to support both internal model risk review and potential regulatory examination. For the Variance Commentary Engine, every piece of auto-approved commentary is traceable to the exact data points and source documents used. Model risk management can pull a full lineage report for any commentary item within 30 seconds.
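One common way to make an append-only log tamper-evident is to hash-chain each record to its predecessor; the sketch below assumes that approach, and the field names are illustrative rather than the bank's actual schema.

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(prev_hash: str, agent: str, prompt: str, output: str, sources: list) -> dict:
    """Build one audit record linked to the previous record's hash."""
    body = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "prompt": prompt,      # full input capture
        "output": output,      # full output capture
        "sources": sources,    # data points / documents used (lineage)
        "prev": prev_hash,     # chains this record to its predecessor
    }
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}
```

Because each record embeds the previous record's hash, altering any historical entry breaks the chain from that point forward, which is what makes retrospective tampering detectable on review.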
Human-in-the-Loop Controls
Every agent has confidence thresholds that the finance director can tune without engineering involvement. When Claude's confidence score falls below the threshold, the item routes to a named human approver with a structured review pack. This isn't a safety net — it's the operating model. The agents are designed to augment the team's judgment, not replace it.
On Model Risk Management
The bank's model risk team reviewed every agent as a model under their existing MRM framework. We provided technical documentation covering: training data and architecture (Anthropic's published documentation), use-case validation approach, ongoing performance monitoring plan, and escalation procedures. Validation completed in 6 weeks — faster than any prior third-party model review.
The 12-Week Deployment Timeline
Discovery & Architecture Design
Four days of stakeholder workshops with finance operations, technology, compliance, and model risk. Mapped all data flows, documented current-state process in detail, agreed governance framework and sign-off matrix. Delivered architecture design document and MRM pre-engagement briefing.
MCP Server Development
Built secure MCP connectors for Oracle Hyperion, the intercompany netting platform, and the regulatory data warehouse. Each connector implements read-only access by default, with write access requiring an explicit human approval step. Penetration test completed by the bank's internal security team before any agent work began.
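The read-only-by-default rule can be sketched independently of any particular MCP SDK: write-capable tools are simply not exposed unless a human approval token is present on the request. Tool names and the token mechanism are illustrative assumptions.

```python
# Hypothetical tool registries for one connector.
READ_TOOLS = {"get_trial_balance", "get_filing_data"}
WRITE_TOOLS = {"post_adjustment_journal"}

def allowed_tools(approval_token=None):
    """Expose read tools always; write tools only with an explicit approval."""
    tools = set(READ_TOOLS)
    if approval_token is not None:  # set only by a named human approver
        tools |= WRITE_TOOLS
    return tools
```

Gating tool registration, rather than filtering calls after the fact, means the agent cannot even attempt a write without the approval step having happened first.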
Agent Development & Prompt Engineering
Built and tested all three agents against sanitised historical data. Extensive prompt engineering work — particularly for the Variance Commentary Engine — to ensure house style compliance, accuracy on edge cases, and appropriate confidence calibration. 200+ test cases run before UAT.
User Acceptance Testing & Training
UAT with 30 finance analysts using real month-end data (prior period). Three training sessions: a 2-hour hands-on workshop for power users, a 45-minute manager briefing, and on-demand video walkthroughs. All users trained before go-live.
Parallel Run
Ran all three agents in parallel with existing manual processes for one full close cycle. Compared outputs, measured accuracy, gathered analyst feedback, tuned thresholds. No production decision was made based on agent output alone during this period.
Production Go-Live
Full production deployment across 300 finance users. On-site hypercare support for the first close cycle. Performance metrics tracked daily. Close cycle completed in 4.5 business days — within the 5-day target. Zero incidents.
Results After Six Months in Production
Six months in, the finance director's summary was direct: "We have fewer errors, faster close, and my team is doing work that actually requires their judgment — not copying and pasting numbers between spreadsheets." That's the outcome we design for.
The close cycle runs consistently at 4.5 to 5 business days, down from 12. That's a reduction of roughly 60% in elapsed time. The CFO report now lands on executive committee desks three days earlier. In a bank where capital allocation decisions are made monthly, that's not a marginal improvement — it's a structural change in how quickly the institution can respond to trading conditions.
The $2M annual saving comes primarily from redirected analyst capacity. The bank hasn't reduced headcount — that was never the objective. Instead, 200 analysts have freed up roughly 30% of their month-end bandwidth. That capacity has been redirected to value-added analysis: scenario modelling, competitor benchmarking, investor narrative preparation. Tasks that the CFO wanted done but couldn't resource.
The Regulatory Filing Validator has processed every monthly filing since go-live without a single post-submission correction. The model risk team recently cited it as a case study in responsible AI deployment in their annual report to the board.
The CFO's View
"We went into this with one objective: faster close, no new compliance risk. We got a reduction of more than 60% in cycle time and our cleanest regulatory filing year on record. The implementation team understood both the technology and what it means to operate inside a regulated institution. That combination is rare."
What This Tells You About Claude in Financial Services
If you're a CFO or CTO at a financial institution reading this, the first question is usually: how do we know Claude's outputs are accurate? The answer is: you build the validation infrastructure. The Regulatory Filing Validator doesn't trust Claude's outputs — it validates them against source data. The Variance Commentary Engine doesn't approve commentary automatically — it assigns confidence scores and routes uncertain items to humans. The architecture is designed assuming the model will occasionally be wrong, because all models are occasionally wrong.
The second question is about security and data privacy. Our Claude for Financial Services implementation guide covers this in detail. The short answer: Claude Enterprise, zero retention, private network, MCP servers inside the perimeter, and a data handling policy that satisfies group CISO review. We've done this. We know what it takes.
The third question is about regulatory approval. The agents in this deployment are not making regulated decisions. They're producing draft analysis that humans approve. That's a meaningful distinction. Our Claude Security & Governance service includes a full MRM engagement framework if you're navigating model risk management for the first time.
If you're considering a similar deployment, our Claude Enterprise Implementation service is the right starting point. We run a structured discovery process, produce an architecture design, and handle everything from MCP server development to user training. The full case studies library includes examples from legal, healthcare, and manufacturing as well.
Deploying Claude in Financial Services?
We've built production Claude deployments inside regulated financial institutions. Book a 30-minute strategy call with a Claude Certified Architect to discuss your use case.