N Biotechnol. 2023 Sep 4:S1871-6784(23)00044-4. doi: 10.1016/j.nbt.2023.09.001. Online ahead of print.
Scientific information extraction is fundamental for research and innovation, but it is currently a mostly manual, time-consuming process. Text mining tools (TMTs) enable automated, accurate and quick information extraction from text, but there is little precedent for their use in the biomaterials field. Here, we compare the ability of various TMTs to extract useful information from biomaterials abstracts. Focusing on the biocompatibility of polydioxanone, a biodegradable polymer for which there are relatively few scientific publications, we tested several tools ranging from machine learning approaches and statistical text analysis to MeSH indexing and domain-specific semantic tools for Named Entity Recognition. We evaluated their output alongside a manual review of systematic reviews and meta-analyses. The findings show that TMTs can be highly efficient and powerful for mapping biomaterials texts and can rapidly yield up-to-date information. In this case study, they made it possible to identify dominant themes, track the evolution of specific terms and topics, and learn about key medical applications in the biomaterials literature over the years. The analysis also shows that heterogeneous polymer naming is a significant challenge in mining biomedical literature that has yet to be tackled. This research showcases the potential value of using Natural Language Processing and domain-specific tools to extract and organize biomaterials data.
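
To illustrate the polymer naming challenge and the idea of following a term over the years, the following is a minimal sketch in Python. It is not one of the tools evaluated in the study. The synonym list, the sample records, and the function names `normalize` and `abstracts_per_year` are all hypothetical and chosen only for illustration. The sketch maps several name variants of one polymer onto a single canonical form and then counts, per publication year, how many abstracts mention it.

```python
import re
from collections import Counter

# Hypothetical synonym table (an assumption, not taken from the study).
# A real project would need a curated, much larger list.
POLYMER_SYNONYMS = {
    "polydioxanone": [
        "polydioxanone",
        "poly(p-dioxanone)",
        "poly-p-dioxanone",
        "PDO",
        "PDS",
    ],
}


def build_pattern(variants):
    """Compile one regex matching any variant as a whole token."""
    # Longest variants first so they win over shorter overlapping ones.
    escaped = sorted((re.escape(v) for v in variants), key=len, reverse=True)
    # Case-insensitive matching is a simplification: short abbreviations
    # such as "PDS" may collide with unrelated terms in real text.
    return re.compile(
        r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE
    )


PATTERNS = {name: build_pattern(v) for name, v in POLYMER_SYNONYMS.items()}


def normalize(text):
    """Replace every known variant with its canonical polymer name."""
    for canonical, pattern in PATTERNS.items():
        text = pattern.sub(canonical, text)
    return text


def abstracts_per_year(records, canonical):
    """Count abstracts per year that mention the canonical polymer.

    `records` is an iterable of (year, abstract_text) pairs.
    """
    counts = Counter()
    for year, abstract in records:
        if canonical in normalize(abstract):
            counts[year] += 1
    return dict(sorted(counts.items()))


# Invented sample data, for demonstration only.
records = [
    (2019, "PDO sutures showed good biocompatibility."),
    (2019, "Poly(p-dioxanone) scaffolds degraded within months."),
    (2021, "PDS plates were compared with titanium mesh."),
    (2021, "Polylactide films were tested in vitro."),
]
yearly_counts = abstracts_per_year(records, "polydioxanone")
```

Counting abstracts rather than individual mentions avoids double counting when a text gives both the full name and its abbreviation, as in "polydioxanone (PDO)". Without the normalization step, three of the four sample abstracts would be counted under three different names.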