Eur J Epidemiol. 2026 Feb 21. doi: 10.1007/s10654-025-01342-6. Online ahead of print.
ABSTRACT
Ease of access to big data and automated analysis tools can facilitate the rapid generation of poorly designed epidemiological studies, which collectively pose a risk to the quality of medical literature. Member organizations of the TriNetX network have the ability to mass-produce retrospective cohort studies at speed using the federated data network’s statistical power and streamlined analytics pipeline. This exploratory meta-research study collated 13 published TriNetX-based retrospective cohort studies that claim to have used a design that is, in fact, impossible on the platform: the setting of a pseudo-index event. Of these, 8 studies described their analysis as being conducted on the platform alone, making their description of the index event impossible. When we queried 7 different generative artificial intelligence (AI) tools for advice on how to set an index event on TriNetX, 6 tools suggested at least one strategy that cannot be implemented on the platform. We argue that, unlike previously documented errors in TriNetX-based studies, the reporting of impossible index event designs in the identified publications likely constitutes either distortion of the reported methods or the uncritical adoption of false AI-generated methodological advice. In an age of accelerating and increasingly automated medical research, editors and peer reviewers must be informed of the limitations of emerging epidemiological datasets and analytic tools.
PMID:41721987 | DOI:10.1007/s10654-025-01342-6