Causal Modeling for Fraud Detection: Enhancing Financial Security with Interpretable AI

Luqing Ren

doi:10.71222/jaw0kn39

Authors

Luqing Ren, Columbia University, New York, NY, USA

Keywords:

causal inference, fraud detection, explainable AI, financial security, propensity score matching, causal discovery

Abstract

Financial fraud poses a significant threat to the stability of modern economic systems. Traditional machine learning approaches to fraud detection, which are primarily correlation-based, remain limited in precision, interpretability, and adaptability against the constantly evolving strategies of fraudsters. This study introduces a causal inference framework for fraud detection that identifies and quantifies the causal relationships among transaction attributes, user behaviors, and fraudulent outcomes. The framework has three components: causal discovery algorithms (PC and FCI), robust effect estimation techniques (e.g., propensity score matching, PSM, and double/debiased machine learning, DML), and an interpretable rule-extraction module that translates causal patterns into actionable insights. Experiments were conducted on two real-world datasets: a credit card transaction dataset (284,807 records, 32% fraud rate) and an insurance claims dataset (350,000 cases, 8% fraud rate). The proposed model consistently outperforms leading correlation-based methods (AdaBoost, GBDT, XGBoost, and LightGBM), with an average 9-percentage-point gain in overall accuracy, a 2% increase in F1 score (up to 11%), a 5% boost in AUPRC, and a 13.3% improvement in MCC. A key finding is a 47% higher fraud risk associated with atypical location changes combined with large-value transactions; making such relationships explicit addresses the "black-box" limitations of conventional models. Robustness analyses further confirm the model's resilience to confounding influences such as seasonal fluctuations and demographic shifts, underscoring its adaptability to emerging fraud patterns. By integrating causal inference with interpretable artificial intelligence, this research advances fraud detection toward more precise, transparent, and regulatory-compliant financial risk management.
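To make the effect-estimation step concrete, the following is a minimal sketch of propensity score matching on toy data. It is not the implementation used in the study. The data, the function names, and the single binary confounder (`large_amount`) are all hypothetical, and the propensity score is estimated as a simple per-stratum frequency rather than with a fitted model. The sketch illustrates how a raw difference in fraud rates can overstate the effect of a location change when large transactions are both more likely to involve a location change and more likely to be fraudulent.

```python
from collections import defaultdict

# Each row is (large_amount, location_change, fraud); all values are 0 or 1.
# large_amount is the confounder, location_change the "treatment".


def make_rows(x, t, n, n_fraud):
    """Build n hypothetical rows for stratum x and treatment t, n_fraud fraudulent."""
    return [(x, t, 1 if i < n_fraud else 0) for i in range(n)]


# Hypothetical toy data: location changes are far more common on large transactions.
rows = (
    make_rows(0, 1, 20, 4) + make_rows(0, 0, 80, 8)
    + make_rows(1, 1, 80, 40) + make_rows(1, 0, 20, 8)
)


def naive_difference(rows):
    """Raw difference in fraud rate between treated and untreated rows."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def propensity_by_stratum(rows):
    """Estimate e(x) = P(treated | x) as the treated share within each stratum."""
    counts = defaultdict(lambda: [0, 0])  # x -> [n_treated, n_total]
    for x, t, _ in rows:
        counts[x][0] += t
        counts[x][1] += 1
    return {x: n_t / n for x, (n_t, n) in counts.items()}


def att_by_matching(rows):
    """Average treatment effect on the treated via nearest-score matching.

    Each treated row is compared with the mean outcome of all untreated rows
    that share the nearest propensity score (matching with replacement).
    """
    score = propensity_by_stratum(rows)
    controls = defaultdict(list)  # propensity score -> untreated outcomes
    for x, t, y in rows:
        if t == 0:
            controls[score[x]].append(y)
    diffs = []
    for x, t, y in rows:
        if t == 1:
            nearest = min(controls, key=lambda s: abs(s - score[x]))
            matched = controls[nearest]
            diffs.append(y - sum(matched) / len(matched))
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    print("naive difference:", round(naive_difference(rows), 3))
    print("matched ATT:     ", round(att_by_matching(rows), 3))
```

On this toy data the raw difference in fraud rates is 0.28, while the matched estimate is 0.10, because matching compares each location-change transaction only with untreated transactions of similar propensity. With many continuous covariates, the per-stratum frequency would typically be replaced by a fitted classifier such as logistic regression.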
