Privacy-Preserving Federated Learning Framework for Multi-Institutional Healthcare Data Analytics with Differential Privacy and Homomorphic Encryption

Xiaotong Shi

doi:10.71222/v9nck083

Authors

Xiaotong Shi, Business Analytics & Data Engineering, Columbia University School of Engineering, New York, NY, USA

Keywords:

federated learning, healthcare privacy, differential privacy, homomorphic encryption

Abstract

Healthcare data analytics across multiple institutions faces significant privacy challenges arising from regulatory requirements and the sensitivity of patient data. This paper presents a privacy-preserving federated learning framework designed for multi-institutional healthcare data analytics that integrates differential privacy with homomorphic encryption. The framework addresses limitations of existing approaches through adaptive privacy budget allocation and secure gradient aggregation protocols tailored to healthcare environments. The system architecture has four primary components: local training nodes with privacy protection modules, secure aggregation servers, communication orchestrators, and privacy management systems. The differential privacy component injects noise with epsilon values tuned between 0.5 and 1.2, while homomorphic encryption secures gradient aggregation across participating institutions. Experimental evaluation on diverse healthcare datasets containing over 2.5 million patient records shows model accuracy retention exceeding 94% while the stated privacy guarantees are maintained. Training converges within 85-120 rounds, with computational overhead below 15% relative to centralized approaches. The framework scales to networks of up to 20 healthcare entities. In the privacy-utility trade-off evaluation, it outperforms existing federated learning approaches in healthcare contexts. Compliance verification indicates adherence to HIPAA and GDPR requirements, supporting the framework's feasibility for real-world healthcare deployments and for collaborative medical research.
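The abstract does not specify the noise mechanism or the aggregation code, so the following is only a minimal sketch of one common way such a pipeline can be arranged: each institution clips its local update, adds Gaussian noise calibrated to a privacy budget, and the server averages the noisy updates. The function names (`clip_update`, `gaussian_sigma`, `privatize`, `aggregate`), the clipping norm, the delta value, and the sample updates are all hypothetical. The noise formula is the classic Gaussian-mechanism bound, which is stated for epsilon below 1, so it would not cover the upper part of the 0.5 to 1.2 range reported above. Aggregation is done in plaintext here as a stand-in for the homomorphically encrypted aggregation the paper describes.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale the update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale from the classic Gaussian mechanism (assumes epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(update, epsilon, delta, max_norm, rng):
    """Clip a local update, then add per-coordinate Gaussian noise.

    Assumption: sensitivity is taken to be the clipping norm.
    """
    clipped = clip_update(update, max_norm)
    sigma = gaussian_sigma(epsilon, delta, max_norm)
    return [x + rng.gauss(0.0, sigma) for x in clipped]


def aggregate(updates):
    """Plaintext coordinate-wise mean; a placeholder for encrypted aggregation."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


# Hypothetical updates from three institutions (illustrative values only).
rng = random.Random(0)
local_updates = [[0.8, -0.3, 1.5], [0.2, 0.1, -0.4], [3.0, 4.0, 0.0]]
noisy_updates = [
    privatize(u, epsilon=0.5, delta=1e-5, max_norm=1.0, rng=rng)
    for u in local_updates
]
global_update = aggregate(noisy_updates)
print(global_update)
```

In this sketch a smaller epsilon yields a larger noise scale, which is the privacy-utility trade-off the abstract refers to. With only three participants the averaged update is dominated by noise; averaging over more institutions or larger local batches reduces that effect.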
