Knowledge Graph-Enhanced Deep Learning Model (H-SYSTEM) for Hypertensive Intracerebral Hemorrhage: Model Development and Validation

J Med Internet Res. 2025 Jun 12;27:e66055. doi: 10.2196/66055.

ABSTRACT

BACKGROUND: Although much progress has been made in artificial intelligence (AI), several challenges remain substantial obstacles to the development and translation of AI systems into clinical practice. Even large language models, which show excellent performance on various tasks, have progressed slowly in clinical practice tasks. Providing precise and explainable treatment plans with personalized details remains a big challenge for AI systems due to both the highly specialized medical knowledge required and patients’ complicated conditions.

OBJECTIVE: This study aimed to develop an explainable and efficient decision support system named H-SYSTEM to assist neurosurgeons in diagnosing and treating patients with hypertensive intracerebral hemorrhage. The system was designed to address the limitations of existing AI systems by integrating a medical domain knowledge graph to enhance decision-making accuracy and explainability.

METHODS: The H-SYSTEM consists of 3 main modules: the key named entity recognition (NER) module, the semantic analysis and representation module, and the reasoning module. Furthermore, we constructed a medical domain knowledge graph for hypertensive intracerebral hemorrhage, named HKG, which served as an external knowledge brain of the H-SYSTEM to enhance its text recognition and automated decision-making capability. The HKG was used to guide the training of the semantic analysis and representation module and the reasoning module, making the output of the H-SYSTEM more explainable. To assess the performance of the H-SYSTEM, we compared it with doctors and different large language models.
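As a rough illustration of how a knowledge graph can make a pipeline's output explainable, the following minimal Python sketch chains a stand-in NER step to a stand-in reasoning step that walks graph triples. It is not the H-SYSTEM implementation: the graph `TOY_KG`, the relation names, the placeholder labels such as `finding_a` and `plan_1`, and every function name are hypothetical, the NER step is plain vocabulary matching rather than a neural model, and the semantic analysis and representation module is omitted entirely.

```python
from dataclasses import dataclass

# Hypothetical toy knowledge graph as (head, relation, tail) triples.
# All labels are placeholders and carry no medical meaning; they are
# NOT taken from the HKG described in the abstract.
TOY_KG = [
    ("finding_a", "indicates", "diagnosis_x"),
    ("diagnosis_x", "treated_by", "plan_1"),
    ("finding_b", "indicates", "diagnosis_y"),
    ("diagnosis_y", "treated_by", "plan_2"),
]


@dataclass
class Recommendation:
    plan: str
    evidence: list  # the triples that justify the plan


def recognize_entities(text, vocabulary):
    """Stand-in for the NER module: simple vocabulary matching on tokens."""
    return [token for token in text.lower().split() if token in vocabulary]


def reason(entities, kg):
    """Stand-in for the reasoning module: follow two-hop paths in the graph."""
    recommendations = []
    for entity in entities:
        for h1, r1, t1 in kg:
            if h1 != entity or r1 != "indicates":
                continue
            for h2, r2, t2 in kg:
                if h2 == t1 and r2 == "treated_by":
                    path = [(h1, r1, t1), (h2, r2, t2)]
                    recommendations.append(Recommendation(t2, path))
    return recommendations


def run_pipeline(text, kg=TOY_KG):
    """Run entity recognition, then graph-based reasoning, on one record."""
    vocabulary = {head for head, relation, _ in kg if relation == "indicates"}
    return reason(recognize_entities(text, vocabulary), kg)
```

Each `Recommendation` carries the triples that led to it, so a reader can trace a suggested plan back to the recognized entity. This is the sense in which a graph-backed output can be inspected, under the assumptions of this toy setup.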

RESULTS: The outputs based on HKG showed reliable performance as compared with neurosurgical doctors, with an overall accuracy of 94.87%. The bidirectional encoder representations from transformers, inflated dilated convolutional neural network, bidirectional long short-term memory, and conditional random fields (BERT-IDCNN-BiLSTM-CRF) model was used as the key NER module of the H-SYSTEM due to its fast convergence and efficient extraction of key named entities; it achieved the highest performance among 7 key NER models (precision=92.03, recall=90.22, and F1-score=91.11), significantly outperforming the others. The H-SYSTEM achieved an overall accuracy of 91.74% in treatment plans, showing significant consistency with the gold standard (P<.05), with diagnostic measures achieving 88.18% accuracy, 97.03% area under the curve (AUC), and a κ of 0.874; surgical therapy achieving 98.53% accuracy, 98.53% AUC, and a κ of 0.971; and rescue therapies achieving 89.50% accuracy, 94.67% AUC, and a κ of 0.923 (all P<.05). Furthermore, the H-SYSTEM showed high reliability and efficiency when compared to doctors and ChatGPT, achieving statistically higher accuracy (95.26% vs 91.48%, P<.05). Additionally, the H-SYSTEM achieved a total accuracy of 92.22% (ranging from 91.14% to 95.35%) in treatment plans for 605 additional patients from 6 different medical centers.
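For readers less familiar with the agreement metrics reported above, the following short Python sketch shows the standard definitions of the F1-score (harmonic mean of precision and recall), accuracy, and Cohen's κ. The function names and the toy labels in the test data are illustrative assumptions, not material from the study. Applying the harmonic-mean formula to the reported precision (92.03) and recall (90.22) gives roughly 91.12, close to the reported F1-score of 91.11; the small gap may reflect rounding or the averaging scheme used, which the abstract does not specify.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two label lists, corrected for chance."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = accuracy(y_true, y_pred)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(
        (list(y_true).count(label) / n) * (list(y_pred).count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)
```

A κ near 1 indicates agreement well beyond what chance alone would produce, which is why it is reported alongside raw accuracy.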

CONCLUSIONS: The H-SYSTEM showed significantly high efficiency and generalization capacity in processing electronic medical records, and it provided explainable and elaborate treatment plans. Therefore, it has the potential to provide neurosurgeons with rapid and reliable decision support, especially in emergency conditions. The knowledge graph-enhanced deep-learning model exhibited excellent performance in the clinical practice tasks.

PMID:40505141 | DOI:10.2196/66055

By Nevin Manimala