Graph-Enhanced Clinical Knowledge Injection for Secure and Robust Medical LLM Reasoning

Tobias D. Greene; Zixuanan Wan; Ananya M. Chopra; Kiran Chatterjee

Authors

Tobias D. Greene, Department of Computer Science, George Mason University, Fairfax, VA, USA.
Zixuanan Wan, Department of Computer Science, Colorado State University, Fort Collins, CO, USA.
Ananya M. Chopra, Department of Computer Science, University of North Texas, Denton, TX, USA.
Kiran Chatterjee, School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA.

Keywords:

graph-enhanced reasoning, clinical knowledge injection, medical large language models, adversarial robustness, knowledge graphs, secure AI, healthcare AI governance

Abstract

Large language models have demonstrated remarkable capabilities in natural language understanding and generation, yet their deployment in high-stakes medical contexts remains fraught with challenges related to factual accuracy, adversarial vulnerability, and systemic bias. This paper proposes a graph-enhanced clinical knowledge injection framework that systematically integrates structured biomedical ontologies and relational knowledge graphs into the reasoning pipeline of medical large language models. By embedding graph-based representations of clinical entities, relationships, and hierarchical dependencies, the proposed architecture augments model outputs with verifiable domain constraints and causal pathways, thereby improving both security and robustness. We examine the architectural trade-offs between expressivity and computational overhead, the role of graph neural network layers in preserving semantic integrity, and the implications for adversarial robustness when knowledge graphs serve as external verifiers. The framework is situated within a broader governance perspective that addresses data provenance, fairness across demographic groups, and sustainability of large-scale inference. Through cross-domain comparisons with existing retrieval-augmented generation and fine-tuning approaches, we highlight the structural advantages of graph-enhanced injection for mitigating hallucinations and resisting malicious perturbations. The paper concludes with a forward-looking discussion on the deployment of such systems in clinical decision support, the need for continuous validation against evolving medical knowledge, and the policy infrastructure required to ensure equitable access and accountability.
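
The idea of a knowledge graph serving as an external verifier can be illustrated with a minimal sketch. This is not the authors' implementation: the class ToyClinicalGraph, the helper screen_claims, the relation names, and the OPPOSITE table are all hypothetical, and the three triples are a toy stand-in for a real biomedical ontology. The sketch assumes that claims have already been extracted from model output as (head, relation, tail) triples, and it omits the graph neural network layers the abstract mentions.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Assumption: these relation pairs are treated as mutually exclusive
# for the same (head, tail) pair. A real ontology would define this.
OPPOSITE: Dict[str, str] = {
    "treats": "contraindicated_in",
    "contraindicated_in": "treats",
}


class ToyClinicalGraph:
    """Hypothetical in-memory knowledge graph used as an external verifier."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self.triples: Set[Triple] = set(triples)

    def verify(self, claim: Triple) -> str:
        """Return 'supported', 'contradicted', or 'unknown' for one claim."""
        head, relation, tail = claim
        if claim in self.triples:
            return "supported"
        opposite = OPPOSITE.get(relation)
        if opposite is not None and (head, opposite, tail) in self.triples:
            return "contradicted"
        # The graph is silent on this claim; it is neither confirmed nor refuted.
        return "unknown"


def screen_claims(graph: ToyClinicalGraph,
                  claims: Iterable[Triple]) -> List[Tuple[Triple, str]]:
    """Label every claim extracted from a model answer."""
    return [(claim, graph.verify(claim)) for claim in claims]


# Illustrative toy facts only; not taken from the paper.
graph = ToyClinicalGraph([
    ("metformin", "treats", "type_2_diabetes"),
    ("isotretinoin", "contraindicated_in", "pregnancy"),
    ("warfarin", "interacts_with", "aspirin"),
])

claims = [
    ("metformin", "treats", "type_2_diabetes"),
    ("isotretinoin", "treats", "pregnancy"),
    ("aspirin", "treats", "migraine"),
]

for claim, status in screen_claims(graph, claims):
    print(status, claim)
```

In a design like this, a "contradicted" verdict could block or flag an answer, while "unknown" would only signal that the graph cannot confirm the claim, which matters because any knowledge graph is incomplete relative to evolving medical knowledge.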

