Categories
Nevin Manimala Statistics

Multigranularity Label Prediction Model for Automatic International Classification of Diseases Coding in Clinical Text

J Comput Biol. 2023 Jul 31. doi: 10.1089/cmb.2023.0096. Online ahead of print.

ABSTRACT

International Classification of Diseases (ICD) serves as the foundation for generating comparable global disease statistics across regions and over time. The process of ICD coding involves assigning codes to diseases based on clinical notes, which can describe a patient’s condition in a standard way. However, this process is complicated by the vast number of codes and the intricate taxonomy of ICD codes, which are hierarchically organized into various levels, including chapter, category, subcategory, and its subdivisions. Many existing studies focus solely on predicting subcategory codes, ignoring the hierarchical relationships among codes. To address this limitation, we propose a multitask learning model that trains multiple classifiers for different code levels, while also capturing the relations between coarser and finer-grained labels through a reinforcement mechanism. Our approach is evaluated on both English and Chinese benchmark dataset, and we demonstrate that our method achieves competitive performance with baseline models, particularly in terms of macro-F1 results. These findings suggest that our approach effectively leverages the hierarchical structure of ICD codes to improve disease code prediction accuracy. Analysis of attention mechanism shows that multigranularity attention of our model captures crucial feature of input text on different granularity levels, which can provide reasonable explanations for the prediction results.

PMID:37523219 | DOI:10.1089/cmb.2023.0096

By Nevin Manimala

Portfolio Website for Nevin Manimala