Nevin Manimala Statistics

Modeling language evolution using a spin glass approach

Phys Rev E. 2026 Mar;113(3-1):034312. doi: 10.1103/52k8-zz47.

ABSTRACT

The evolution of natural languages poses a riddle to any theoretical perspective based on efficiency considerations. If languages are already optimally effective means of organization and communication of thought, then why do they change? And if they are driven to become optimally effective in the future, then why do they change so slowly, and why do they diversify, rather than converge toward an optimum? We look here at the hypothesis that disorder, rather than efficiency, may play a dominant role. Most traditional approaches to studying diachronic language dynamics emphasize lexical data, but it would seem that a crucial contribution to the effectiveness of a thought-coding device is given by its core generative structure, i.e., its syntax. Based on the reduction of syntax to a set of binary syntactic parameters, we introduce here a model of natural language change in which diachronic dynamics can stem from disordered interactions among parameters, even in the idealized limit of identical external inputs. We show in which region of "phase space" such dynamics show the glassy features that are observed in natural language across time. In particular, binary syntactic vectors remain trapped in glassy metastable (i.e., tending toward stability) states when the degree of asymmetry in the disordered interactions is below a critical value, consistent with studies of spin glasses with asymmetric interactions. We further show that an added Hopfield-type memory term would indeed, if strong enough, stabilize syntactic configurations even above the critical value, but at the cost of losing the multiplicity of stable states. Finally, using a notion of linguistic distance in syntactic state space we show that a phylogenetic signal may remain among related languages, despite their gradually divergent syntax, exactly as recently pointed out for real-world languages.
These statistical results appear to generalize beyond the dataset of 94 syntactic parameters across 58 languages used in this study.
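
To make the ingredients named in the abstract concrete, here is a minimal Python sketch of this kind of model. It is not the authors' implementation. Everything below is an assumption chosen for illustration: Gaussian couplings mixed from a symmetric and an antisymmetric part with weight `k`, zero-temperature asynchronous updates, a single stored pattern `xi` for the Hopfield-type term with strength `gamma`, and a normalized Hamming distance as the "linguistic distance". The default `n=94` simply echoes the number of syntactic parameters mentioned above.

```python
import numpy as np

def make_couplings(n, k, rng):
    """Disordered couplings with asymmetry k (assumed form, not the paper's).

    k = 0 gives a symmetric matrix; larger k adds more antisymmetric weight.
    """
    g = rng.standard_normal((n, n))
    h = rng.standard_normal((n, n))
    sym = (g + g.T) / np.sqrt(2.0)    # symmetric part, unit off-diagonal variance
    anti = (h - h.T) / np.sqrt(2.0)   # antisymmetric part, unit off-diagonal variance
    J = (sym + k * anti) / np.sqrt(n * (1.0 + k * k))
    np.fill_diagonal(J, 0.0)          # no self-interaction
    return J

def sweep(s, J, rng):
    """One asynchronous zero-temperature sweep over all parameters.

    Each binary parameter aligns with its local field; returns the flip count.
    """
    flips = 0
    for i in rng.permutation(len(s)):
        new = 1 if J[i] @ s >= 0 else -1
        if new != s[i]:
            s[i] = new
            flips += 1
    return flips

def hamming(a, b):
    """Fraction of binary parameters on which two syntactic vectors differ."""
    return float(np.mean(a != b))

def run(n=94, k=0.3, gamma=0.0, sweeps=200, seed=0):
    """Evolve a syntactic vector starting from the stored pattern xi.

    gamma scales a hypothetical Hopfield-type memory term that favors xi.
    Returns the final state, the stored pattern, and the last sweep's flip count.
    """
    rng = np.random.default_rng(seed)
    xi = rng.choice([-1, 1], size=n)
    memory = gamma * (np.outer(xi, xi) - np.eye(n)) / n  # zero diagonal
    J = make_couplings(n, k, rng) + memory
    s = xi.copy()
    flips = 0
    for _ in range(sweeps):
        flips = sweep(s, J, rng)
    return s, xi, flips

if __name__ == "__main__":
    for k in (0.0, 0.5, 2.0):
        s, xi, flips = run(k=k)
        print(f"k={k}: distance from start={hamming(s, xi):.2f}, "
              f"flips in last sweep={flips}")
```

A sweep with zero flips means the vector has reached a fixed point of these toy dynamics. Scanning `k` and `gamma` and recording the flip count and the Hamming distance between independently evolved copies is one way to explore the regimes the abstract describes, though the critical values in this sketch need not match those reported in the paper.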

PMID:41998971 | DOI:10.1103/52k8-zz47

By Nevin Manimala
