Causally Grounded LLM Attribution Agents for High-Dynamic Logistics Systems: Design and Experimental Validation

Sixuan Li

doi:10.71222/c3keh831

Authors

Sixuan Li, McCallum Business School, Bentley University, Waltham, United States


Keywords:

causal attribution, causal graphs, logistics analytics, interpretable AI, language models, distribution shift

Abstract

High-dynamic logistics systems frequently generate anomalies driven by interacting operational mechanisms such as demand surges, driver shortages, and exogenous shocks. Large language models (LLMs) can turn heterogeneous telemetry into natural-language explanations for operator diagnosis, but unconstrained language reasoning remains unreliable for root-cause attribution in systems with structured dependencies. To address this, we propose a causally grounded attribution agent architecture that integrates four components: a streaming state-preparation layer, a structural causal graph (SCG) that constrains admissible cause-effect paths, a quantitative attribution core, and an LLM reasoning layer. The framework converts grounded evidence into reliable explanations and intervention suggestions. We validate the core components on a controlled synthetic benchmark. The SCG-aligned model achieves a macro F1 score of 0.753 on the in-distribution test set and remains robust under distribution shift, outperforming random forest and ungrounded heuristic baselines. A graph misspecification study confirms that the SCG supplies structural information beyond mere regularization: removing a single causal edge significantly reduces accuracy. Finally, an LLM evaluation across multiple grounding configurations shows that full causal grounding improves attribution accuracy by 20 to 35 percentage points, with smaller models benefiting disproportionately. The study contributes a causally grounded agent architecture and a replicable cross-tier evaluation framework for LLM-based causal reasoning, laying the groundwork for future validation on production telemetry and for downstream operational impact assessments.
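
To illustrate what it can mean for an SCG to constrain admissible cause-effect paths, the following is a minimal sketch rather than the paper's implementation. It assumes the graph is stored as an adjacency mapping from cause to effects, and that some upstream scorer (not shown) has already assigned a score to each candidate cause. The node names, the `admissible_causes` and `filter_candidates` helpers, and the example scores are all hypothetical.

```python
from collections import deque

# Hypothetical structural causal graph: each key is a cause, each value lists
# its direct effects. Node names are illustrative, not from the paper.
SCG = {
    "weather_shock": ["demand_surge", "driver_shortage"],
    "demand_surge": ["order_backlog"],
    "driver_shortage": ["order_backlog"],
    "order_backlog": ["delivery_delay"],
    "delivery_delay": [],
    "app_outage": ["order_drop"],
    "order_drop": [],
}


def admissible_causes(graph, effect):
    """Return every node that has a directed path to `effect` in `graph`."""
    # Build the reversed adjacency (effect -> direct causes).
    parents = {node: [] for node in graph}
    for cause, effects in graph.items():
        for child in effects:
            parents[child].append(cause)

    # Breadth-first search upstream from the observed effect.
    seen = set()
    queue = deque([effect])
    while queue:
        node = queue.popleft()
        for parent in parents[node]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def filter_candidates(graph, effect, scored_candidates):
    """Drop candidates with no causal path to `effect`; rank the rest by score."""
    allowed = admissible_causes(graph, effect)
    kept = {c: s for c, s in scored_candidates.items() if c in allowed}
    return sorted(kept, key=kept.get, reverse=True)


if __name__ == "__main__":
    # Assumed scores from an ungrounded scorer; "app_outage" scores highest
    # but has no path to "delivery_delay", so the graph rules it out.
    scores = {"app_outage": 0.9, "driver_shortage": 0.6, "demand_surge": 0.4}
    print(filter_candidates(SCG, "delivery_delay", scores))
```

In this toy setup the highest-scoring candidate is discarded because the graph contains no path from it to the observed anomaly. The sketch also shows why a misspecified graph matters: deleting the `driver_shortage` to `order_backlog` edge would remove `driver_shortage` from the admissible set entirely.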
