Nurs Res. 2025 Jul 14. doi: 10.1097/NNR.0000000000000848. Online ahead of print.
ABSTRACT
BACKGROUND: Computable phenotyping is a data science method that systematically synthesizes clinical attributes, such as a disease, condition, or patient cohort, enabling a database to be queried for entries matching these characteristics. Developing computable phenotypes will enhance current clinical and research efforts and is foundational for effective nurse scholar participation in future data science endeavors, such as artificial intelligence (AI) and machine learning (ML) research.
OBJECTIVE: (a) Present a foundational, disease-agnostic framework for systematic computable phenotype construction; (b) demonstrate the framework's use by exploring the following question: “Does early pubertal timing increase the risk of developing type II diabetes in males?”; and (c) outline the methodologic utility and limitations of computable phenotyping for nursing research.
METHODS: A proof-of-concept pilot project explored computable phenotype research utility by querying the TriNetX© de-identified health record database. Various computable phenotypes were constructed to retrieve complete case frequency counts of specific health records for children experiencing puberty. These retrieved records allowed for quantifying type 2 diabetes (T2D) risk by comparing children diagnosed with precocious puberty (medically diagnosed early puberty) to those without an abnormal puberty diagnosis. A translational science lens informed the extraction and synthesis of the underlying scientific and operational principles relevant to systematic computable phenotyping.
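The counting-and-comparison logic described in METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the TriNetX© interface. The `Record` type, the helper functions, the choice of ICD-10-CM code prefixes (E30.1 for precocious puberty, E11 for T2D), and every record in `toy_records` are hypothetical assumptions made only for illustration; the resulting ratio is not a study result.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified health record (not a TriNetX schema)."""
    sex: str
    diagnoses: frozenset          # diagnosis code strings
    age_at_t2d: Optional[int] = None


# A computable phenotype is modeled here as a predicate over one record.
Phenotype = Callable[[Record], bool]


def has_code(prefix: str) -> Phenotype:
    """True when any diagnosis code starts with the given prefix."""
    return lambda r: any(code.startswith(prefix) for code in r.diagnoses)


def all_of(*criteria: Phenotype) -> Phenotype:
    """Combine criteria with logical AND."""
    return lambda r: all(c(r) for c in criteria)


def negate(criterion: Phenotype) -> Phenotype:
    """Invert a criterion."""
    return lambda r: not criterion(r)


def count(records: Sequence[Record], phenotype: Phenotype) -> int:
    """Frequency count of records matching a phenotype."""
    return sum(1 for r in records if phenotype(r))


def risk_ratio(records, exposed: Phenotype, unexposed: Phenotype,
               outcome: Phenotype) -> float:
    """Risk of the outcome in the exposed group over the unexposed group."""
    n_exp, n_unexp = count(records, exposed), count(records, unexposed)
    cases_exp = count(records, all_of(exposed, outcome))
    cases_unexp = count(records, all_of(unexposed, outcome))
    if n_exp == 0 or n_unexp == 0 or cases_unexp == 0:
        raise ValueError("risk ratio is undefined for these counts")
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


# Assumed code prefixes; the abstract does not state which codes were used.
male = lambda r: r.sex == "M"
precocious = all_of(male, has_code("E30.1"))
no_precocious = all_of(male, negate(has_code("E30.1")))
t2d_at_14_to_18 = all_of(
    has_code("E11"),
    lambda r: r.age_at_t2d is not None and 14 <= r.age_at_t2d <= 18,
)

# Synthetic records invented for this sketch only.
toy_records = [
    Record("M", frozenset({"E30.1", "E11.9"}), 15),
    Record("M", frozenset({"E30.1", "E11.9"}), 16),
    Record("M", frozenset({"E30.1"})),
    Record("M", frozenset({"E30.1"})),
    Record("M", frozenset({"E11.9"}), 17),
    Record("M", frozenset()),
    Record("M", frozenset()),
    Record("M", frozenset()),
    Record("F", frozenset({"E30.1"})),
]

rr = risk_ratio(toy_records, precocious, no_precocious, t2d_at_14_to_18)
print(f"Toy risk ratio: {rr:.1f}")
```

Modeling each phenotype as a composable predicate keeps the inclusion criteria explicit and reusable: the same criteria that define a cohort for counting can later define a feature for a statistical model.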
RESULTS: A six-step, disease-agnostic, computable phenotyping framework is synthesized for nurse researchers and clinicians to leverage “big data” applications in their work. The puberty case example, which illustrates foundational use of the framework, suggests that males with precocious puberty may be six times more likely to develop T2D when 14-18 years old than those without diagnosed early puberty. The framework provides a foundation for sophisticated statistical analyses, such as leveraging computable phenotypes in multivariate modeling and machine learning algorithms.
DISCUSSION: The six-step, computable phenotype framework will introduce nurse scholars and clinicians to leveraging data science principles in real-world interfaces. Applications using the framework can include generating and testing epidemiologic hypotheses, identifying participants for research with specific clinical attributes, deploying statistical models for health care monitoring and decision-making, and participating in future research on AI and ML algorithms. The puberty case example generates foundational evidence to justify future puberty research.
PMID:40680284 | DOI:10.1097/NNR.0000000000000848