Synthetic Data Generation for Classifying Electrophysiological and Morpho-Electrophysiological Neurons from Mouse Visual Cortex

Neuroinformatics. 2025 Dec 27;24(1):2. doi: 10.1007/s12021-025-09761-2.

ABSTRACT

Accurate classification of neuronal cell types is essential for understanding brain organization, but multimodal neuron datasets are scarce and strongly imbalanced across subclasses. We present a benchmark of synthetic data augmentation methods for predicting electrophysiology-defined neuronal classes (e-types) in the Allen Cell Types mouse visual cortex dataset. Two supervised tasks were evaluated over the same 17 e-type labels: prediction from electrophysiology features alone (E→e-type) and prediction from combined morphology plus electrophysiology features (M + E→e-type). We established real-data baselines across multiple classifier families under a unified preprocessing pipeline, then augmented only the training sets using matched per-class grids with Synthetic Minority Over-sampling Technique (SMOTE) and deep generative models: Variational Autoencoders (VAE), Generative Adversarial Networks (GAN), masked autoregressive normalizing flows, and Denoising Diffusion Probabilistic Models (DDPM). Augmentation produced substantial generalization gains when applied in the native high-dimensional feature space, whereas introducing dimensionality reduction largely suppressed these benefits. SMOTE delivered the most robust and consistent improvements across tasks and augmentation levels. To assess biological realism, we introduced a fidelity framework combining feature-wise distribution comparisons, statistical concordance tests, and distance-based measures that compare synthetic-to-real variability against the natural variability between real classes. Most synthetic datasets stayed within biological diversity bounds, with deviations concentrated in the rarest subclasses. These results provide practical guidance on selecting and validating synthetic augmentation for neuronal subtype classification.
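The training-only augmentation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it is a NumPy-only approximation of the core SMOTE idea (interpolating between a minority-class sample and one of its nearest same-class neighbours). The function names `smote_like` and `augment_training_set`, the `target_per_class` parameter, and the toy data are all hypothetical, chosen only for illustration. The test split is never touched, in line with the abstract's statement that only the training sets were augmented.

```python
import numpy as np

def smote_like(X, n_new, k=5, rng=None):
    """Create n_new synthetic rows by interpolating between a sampled row
    and one of its k nearest same-class neighbours (the core SMOTE idea)."""
    rng = np.random.default_rng(rng)
    n = len(X)
    if n < 2 or n_new <= 0:
        return np.empty((0, X.shape[1]))
    k = min(k, n - 1)
    # Pairwise Euclidean distances within the class; exclude self-matches.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)                   # seed samples
    nb = neighbours[base, rng.integers(0, k, size=n_new)]   # one neighbour each
    gap = rng.random((n_new, 1))                            # interpolation weight in [0, 1)
    return X[base] + gap * (X[nb] - X[base])

def augment_training_set(X_train, y_train, target_per_class, k=5, seed=0):
    """Top up every class in the TRAINING split to target_per_class rows.
    Classes already at or above the target are left unchanged."""
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [X_train], [y_train]
    for label in np.unique(y_train):
        X_c = X_train[y_train == label]
        X_new = smote_like(X_c, target_per_class - len(X_c), k=k, rng=rng)
        X_parts.append(X_new)
        y_parts.append(np.full(len(X_new), label))
    return np.vstack(X_parts), np.concatenate(y_parts)

# Toy, made-up data (not from the Allen Cell Types dataset):
# one common class with 40 rows and one rare class with 6 rows.
rng = np.random.default_rng(42)
X_train = np.vstack([rng.normal(0, 1, (40, 6)), rng.normal(3, 1, (6, 6))])
y_train = np.array([0] * 40 + [1] * 6)

X_aug, y_aug = augment_training_set(X_train, y_train, target_per_class=40)
print(np.bincount(y_aug))  # class counts after augmentation
```

Because each synthetic row is a convex combination of two real rows from the same class, it stays inside the per-feature range of that class. With very rare classes there are few pairs to interpolate between, so the synthetic rows can cover only a narrow part of the feature space.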

PMID:41455019 | DOI:10.1007/s12021-025-09761-2

By Nevin Manimala