
Evaluating In-Context Learning in Large Language Models for Molecular Property Regression

J Comput Chem. 2026 Jan 15;47(2):e70308. doi: 10.1002/jcc.70308.

ABSTRACT

Large language models (LLMs) demonstrate strong performance on natural language tasks, but their capacity for genuine in-context learning (ICL) in scientific regression remains unclear. We systematically assessed seven LLMs on molecular property prediction using a controlled framework of 56 transformed tasks that isolate shortcut learning and are designed to induce functional out-of-distribution (OOD) behavior. LLMs performed nearly perfectly on raw molecular weight prediction by exploiting shortcut cues but deteriorated under nonlinear transformations of the target, whereas machine learning (ML) baselines remained more robust, yielding a performance crossover. Meta-analysis revealed that distributional descriptors and structure-activity landscape indices (SALI) predict whether a given task favors LLMs or ML baselines, providing a framework for selecting between LLM- and ML-based approaches in chemistry.

PMID:41538780 | DOI:10.1002/jcc.70308
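
To make the two ingredients of the study concrete, here is a minimal sketch of (a) building nonlinearly transformed regression targets from molecular weight and (b) computing the SALI matrix, SALI_ij = |A_i - A_j| / (1 - sim(i, j)) (Guha and Van Drie). The abstract does not enumerate the 56 tasks or the paper's exact protocol, so the particular transforms (log, square, sine), the Morgan fingerprint settings, and the Tanimoto similarity below are illustrative assumptions, not the authors' method.

```python
"""Illustrative sketch only: the paper's actual task transforms,
fingerprints, and similarity measure are not specified in the abstract."""
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem, Descriptors


def transformed_targets(smiles_list):
    # Raw molecular weight plus hypothetical nonlinear transforms
    # intended to break simple shortcut cues in the prompt.
    mw = np.array([Descriptors.MolWt(Chem.MolFromSmiles(s))
                   for s in smiles_list])
    return {
        "raw": mw,
        "log": np.log(mw),            # MolWt > 0, so log is safe
        "square": mw ** 2,
        "sine": np.sin(mw / 50.0),    # illustrative oscillatory target
    }


def sali_matrix(smiles_list, activities):
    # Structure-Activity Landscape Index:
    #   SALI_ij = |A_i - A_j| / (1 - Tanimoto(i, j))
    # Assumed here: Morgan fingerprints (radius 2, 2048 bits).
    mols = [Chem.MolFromSmiles(s) for s in smiles_list]
    fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=2048)
           for m in mols]
    n = len(mols)
    sali = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            sim = DataStructs.TanimotoSimilarity(fps[i], fps[j])
            if sim < 1.0:  # identical pairs stay 0 to avoid div-by-zero
                sali[i, j] = sali[j, i] = (
                    abs(activities[i] - activities[j]) / (1.0 - sim)
                )
    return sali


if __name__ == "__main__":
    smiles = ["CCO", "CCCO", "c1ccccc1O"]
    targets = transformed_targets(smiles)
    sali = sali_matrix(smiles, targets["raw"])
    print(targets["log"])
    print(sali)
```

High SALI values flag "activity cliffs" (similar structures, very different targets); under the paper's framing, such distribution-level statistics of a task can be screened before deciding whether to hand it to an LLM prompt or a conventional ML regressor.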
