J Am Med Inform Assoc. 2021 May 6:ocab077. doi: 10.1093/jamia/ocab077. Online ahead of print.
ABSTRACT
OBJECTIVE: We introduce Medical evidence Dependency (MD)-informed attention, a novel neuro-symbolic model for understanding free-text clinical trial publications with generalizability and interpretability.
MATERIALS AND METHODS: We trained one head in the multi-head self-attention model to attend to the Medical evidence Ddependency (MD) and to pass linguistic and domain knowledge on to later layers (MDinformed). This MD-informed attention model was integrated into BioBERT and tested on 2 public machine reading comprehension benchmarks for clinical trial publications: Evidence Inference 2.0 and PubMedQA. We also curated a small set of recently published articles reporting randomized controlled trials on COVID-19 (coronavirus disease 2019) following the Evidence Inference 2.0 guidelines to evaluate the model’s robustness to unseen data.
RESULTS: The integration of MD-informed attention head improves BioBERT substantially in both benchmark tasks-as large as an increase of +30% in the F1 score-and achieves the new state-of-the-art performance on the Evidence Inference 2.0. It achieves 84% and 82% in overall accuracy and F1 score, respectively, on the unseen COVID-19 data.
CONCLUSIONS: MD-informed attention empowers neural reading comprehension models with interpretability and generalizability via reusable domain knowledge. Its compositionality can benefit any transformer-based architecture for machine reading comprehension of free-text medical evidence.
PMID:33956981 | DOI:10.1093/jamia/ocab077