Evaluation and Analysis of Chart Reasoning Accuracy in Multimodal Large Language Models: An Empirical Study on Influencing Factors

Authors

  • Ziyi Jiang, Computer Information Tech, Northern Arizona University, Flagstaff, AZ, USA
  • Minghui Wang, Software Engineering, Peking University, Beijing, China

Keywords:

multimodal large language models, chart reasoning, visual understanding, data visualization

Abstract

This study presents an empirical evaluation of chart reasoning in multimodal large language models (MLLMs), examining the factors that influence accuracy across visualization types. We test six MLLMs, including GPT-4V, LLaVA, and BLIP-2, on their ability to interpret statistical charts, graphs, and data visualizations. The evaluation uses a curated dataset of 2,400 charts spanning bar graphs, line plots, scatter plots, pie charts, and complex multi-panel visualizations, each annotated with ground-truth reasoning tasks. Performance varies significantly with chart complexity, data density, the presence of textual annotations, and visual design elements. Statistical analysis shows that model accuracy decreases substantially as data point density increases (correlation coefficient: -0.73) and as visual complexity increases. The study identifies optimal configurations for different chart types and provides actionable insights for improving MLLM deployment in data analysis applications. Our findings contribute to understanding the limitations of multimodal AI and to establishing benchmarks for future chart comprehension research.
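
As a rough illustration of the kind of statistic reported above, the following sketch computes a correlation coefficient between per-chart data point density and answer correctness. The abstract does not say which coefficient was used; Pearson's r is assumed here. The function name `pearson_r` and the `records` values are hypothetical and made up for illustration, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-chart records: (number of data points, 1 if the model
# answered correctly else 0). Illustrative values only, not study data.
records = [(5, 1), (8, 1), (12, 1), (20, 0), (35, 1), (50, 0), (80, 0), (120, 0)]
densities = [d for d, _ in records]
correct = [c for _, c in records]

r = pearson_r(densities, correct)
print(f"r = {r:.2f}")  # negative when accuracy drops as density rises
```

A negative value of r indicates that denser charts tend to be answered less accurately. With a binary correctness column this is the point-biserial special case of Pearson's r; a rank-based coefficient such as Spearman's could be substituted if the relationship is not expected to be linear.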

Published

04 July 2025

How to Cite

Jiang, Z., & Wang, M. (2025). Evaluation and Analysis of Chart Reasoning Accuracy in Multimodal Large Language Models: An Empirical Study on Influencing Factors. Pinnacle Academic Press Proceedings Series, 3, 43-58. http://pinnaclepubs.com/index.php/PAPPS/article/view/173