
Deep learning-based environmental source separation and sound enhancement: Advancements for cochlear implant and normal hearing listeners

J Acoust Soc Am. 2026 Apr 1;159(4):3448-3463. doi: 10.1121/10.0042760.

ABSTRACT

Humans perceive non-linguistic sounds (NLSs) by associating auditory events with corresponding physical sources in a complex acoustic environment. However, previous studies have shown that cochlear implant (CI) users, vs normal hearing (NH) listeners, can face more severe challenges in identifying and tracking NLSs. For CI listeners, this leads to limited autonomy, environmental awareness, safety, contextual navigation, and daily engagement with individuals, society, and environmental situations. In earlier work, we studied NLS classification among CI and NH listeners and proposed an NLS enhancement solution to benefit CI/NH listeners. Building on this foundation, we propose here an experimental framework to investigate competing environmental sounds or NLS perception among CI and NH listeners. We introduce a two-source mixture model featuring “target” and “interference” source characteristics and develop an experimental setup for listener evaluation in three conditions: (i) mixed-baseline, (ii) source separation (SS) using the SUccessive DOwnsampling and Resampling of Multi-Resolution Features network, and (iii) source separation with non-linguistic sound enhancement (SSE) achieved by cascading SS output with our previously developed NLS enhancement technique. CI and NH listener evaluations were based on subjective ratings and a forced-choice preference test using three perceptual measures: (i) interference, (ii) audio quality, and (iii) distortion. Our study shows a statistically significant improvement in interference reduction, with CI listeners demonstrating reduction for “nature” sounds with “category-matched” interference [F(2,21) = 4.935, p = 0.0175], and NH listeners exhibiting reductions across all NLS categories, with F-values ranging from [F(2,135) = 8.481, p = 0.000339] to [F(2,135) = 32.37, p = 3.29 × 10⁻¹²]. The pairwise forced-choice test revealed preferences for SSE-processed “nature” and “domestic noises” among both CI and NH listeners.
Our proposed experimental framework addresses key challenges in competing environmental sound perception among CI and NH listeners: (1) evaluation of SS for interference-characterized NLS mixtures, (2) evaluation of an environmental sound or NLS enhancement framework to improve perceptual outcomes with speech-targeted CI processing, and (3) perceptual measures to characterize NH and CI listener experience.
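
The p-values reported above can be sanity-checked from the F statistics alone. The sketch below is not from the paper: it assumes the reported values are ordinary upper-tail F-test p-values, and it relies on the closed-form survival function of the F distribution, which holds only when the numerator has 2 degrees of freedom (as in all three reported tests). The function name `f_test_p_value` and the row labels are hypothetical.

```python
def f_test_p_value(f_stat: float, df_den: int) -> float:
    """Upper-tail p-value for an F statistic with 2 numerator degrees of freedom.

    For df_num = 2 the F survival function reduces to
    (1 + 2 * F / df_den) ** (-df_den / 2). This shortcut is NOT valid
    for any other numerator degrees of freedom.
    """
    return (1.0 + 2.0 * f_stat / df_den) ** (-df_den / 2.0)


# (label, F statistic, denominator df, p-value as reported in the abstract)
# Labels are illustrative; the abstract does not name the NH categories.
reported = [
    ("CI, nature, category-matched interference", 4.935, 21, 0.0175),
    ("NH, smallest reported F", 8.481, 135, 0.000339),
    ("NH, largest reported F", 32.37, 135, 3.29e-12),
]

for label, f_stat, df_den, p_reported in reported:
    p = f_test_p_value(f_stat, df_den)
    print(f"{label}: F(2,{df_den}) = {f_stat} -> p = {p:.3g} (reported {p_reported:.3g})")
```

Small differences from the reported values are expected, because the F statistics in the abstract are themselves rounded.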

PMID:42007671 | DOI:10.1121/10.0042760

By Nevin Manimala
