Optimization of Cross-Border Payment Costs Using Reinforcement Learning and Network Flow Models

Rajesh Bhaskarla; Harish Pulluri; Mohammed Nazeer

doi:10.63913/ftij.v2i1.24

Published: May 26, 2026

Keywords:

Cross-Border Payments, Payment Routing, Minimum-Cost Flow, Reinforcement Learning, FX Spread

Rajesh Bhaskarla, Dept. of EEE, Anurag University, Hyderabad, India

Harish Pulluri, Dept. of EEE, Anurag University, Hyderabad, India

Mohammed Nazeer, Dept. of EEE, Anurag University, Hyderabad, India

Cross-border payments exhibit persistent inefficiencies driven by multi-rail fragmentation, time-varying FX spreads, congestion, and operational outages. This study proposes a hybrid optimization architecture that couples a constrained minimum-cost flow solver with a reinforcement learning (RL) policy that applies bounded adaptive adjustments to feasible routing and liquidity plans. Experiments were conducted on 120,000 transactions across 18 corridors under 24 stress scenarios, each with 50 stochastic seeds, yielding 1,200 independent runs. The proposed RL+Flow method achieved the lowest mean cost at 1.84 USD per transaction, outperforming flow-only (2.11 USD), RL-only (2.24 USD), and cheapest-rail routing (2.36 USD). Cost decomposition showed a 14.8% reduction in FX-spread costs and a 31.6% reduction in delay penalties relative to flow-only, while maintaining conservative fee profiles. Service quality improved concurrently: late settlement rates fell to 1.9%, versus 4.8% (flow-only), 6.1% (RL-only), and 7.4% (cheapest-rail), and failure rates under outage-focused regimes fell to 0.42%, versus 0.75%, 0.88%, and 1.17% for the same three baselines. Corridor analysis indicated larger incremental savings in dense corridors, reaching an average −11.9% cost change versus flow-only, compared with −6.2% in sparse long-tail corridors. Ablation results showed that removing the flow warm-start increased mean cost to 2.06 USD and nearly doubled the failure rate to 0.78%, while removing reliability filtering raised the failure rate to 0.91%. Overall, the results indicate that feasibility-first flow optimization combined with risk-aware adaptive RL yields robust cost and reliability gains under realistic nonstationary conditions.
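To make the minimum-cost flow component concrete, the following is a minimal sketch of routing a payment volume across capacity-limited rails, using the textbook successive-shortest-paths method with Bellman-Ford. It is not the authors' solver: the four-node network, the rail names, the capacities, and the integer unit costs (standing in for combined fee and FX-spread cost) are all hypothetical, and the RL adjustment layer is not modelled here.

```python
def min_cost_flow(n, edges, source, sink, demand):
    """Route up to `demand` units from source to sink at minimum total cost.

    edges: list of (u, v, capacity, unit_cost) with integer costs.
    Returns (units_sent, total_cost, flow_per_edge).
    """
    # Residual graph: each entry is [to, residual_cap, cost, index_of_reverse_edge].
    graph = [[] for _ in range(n)]
    handles = []  # (node, index) locating each original forward edge
    for u, v, cap, cost in edges:
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])
        handles.append((u, len(graph[u]) - 1))

    sent, total_cost = 0, 0
    while sent < demand:
        # Bellman-Ford: cheapest augmenting path in the residual graph.
        dist = [float("inf")] * n
        prev = [None] * n  # (previous node, edge index) on the cheapest path
        dist[source] = 0
        for _ in range(n - 1):
            updated = False
            for u in range(n):
                if dist[u] == float("inf"):
                    continue
                for i, (v, cap, cost, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, i)
                        updated = True
            if not updated:
                break
        if prev[sink] is None:
            break  # no residual path left: remaining demand is infeasible

        # Find the bottleneck capacity along the path, then push flow.
        push = demand - sent
        v = sink
        while v != source:
            u, i = prev[v]
            push = min(push, graph[u][i][1])
            v = u
        v = sink
        while v != source:
            u, i = prev[v]
            graph[u][i][1] -= push
            graph[v][graph[u][i][3]][1] += push
            v = u
        sent += push
        total_cost += push * dist[sink]

    flows = [cap - graph[u][i][1] for (u, i), (_, _, cap, _) in zip(handles, edges)]
    return sent, total_cost, flows


# Hypothetical toy corridor: 0 = sender, 1 = rail A, 2 = rail B, 3 = receiver.
# Costs are illustrative integer cost units per unit of volume.
EDGES = [
    (0, 1, 60, 2),   # sender -> rail A (cheaper, limited capacity)
    (0, 2, 100, 3),  # sender -> rail B (costlier, more capacity)
    (1, 3, 60, 1),   # rail A -> receiver
    (2, 3, 100, 1),  # rail B -> receiver
]

sent, cost, flows = min_cost_flow(4, EDGES, source=0, sink=3, demand=100)
print(sent, cost, flows)
```

In this toy instance the cheaper rail is filled to its capacity before the remainder spills onto the costlier one. If demand exceeds total capacity, the function returns the partial volume it could route, which is one simple way to surface infeasibility before any adaptive adjustment is applied.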

Bhaskarla, R., Pulluri, H., & Nazeer, M. (2026). Optimization of Cross-Border Payment Costs Using Reinforcement Learning and Network Flow Models. Fintech Innovation Journal, 2(1), 71–93. https://doi.org/10.63913/ftij.v2i1.24

Distributed Under Creative Commons CC-BY 4.0

Issue

Vol. 2 No. 1 (2026): Regular Issue: February 2026

Section

Articles
