Contribution of RMS-Level-Based Speech Segments to Target Speech Decoding Under Noisy Conditions
Abstract
Human listeners can recognize target speech streams in complex auditory scenes. Cortical activity can robustly track the amplitude fluctuations of target speech, with auditory attentional modulation, under a range of signal-to-masker ratios (SMRs). The root-mean-square (RMS) level of the speech signal is a crucial acoustic cue for target speech perception. However, most studies have analyzed neural tracking with intact speech temporal envelopes, ignoring the characteristic decoding features of different RMS-level-specific speech segments. This study aimed to explore the contributions of high- and middle-RMS-level segments to target speech decoding in noisy conditions based on electroencephalogram (EEG) signals. The target stimulus was mixed with a competing speaker at five SMRs (i.e., 6, 3, 0, -3, and -6 dB), and the temporal response function (TRF) was then used to analyze the relationship between neural responses and high-/middle-RMS-level segments. Experimental results showed that target and ignored speech streams had significantly different TRF responses under conditions with the high- or middle-RMS-level segments. In addition, the high- and middle-RMS-level segments elicited TRF responses with different morphological distributions. These results suggest that distinct models could be used for different RMS-level-specific speech segments to better decode target speech from the corresponding EEG signals.
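The stimulus preparation described in the abstract (mixing a target with a competing speaker at a given SMR, then separating speech into RMS-level-based segments) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the frame length, and the 0 dB and -10 dB thresholds relative to the whole-signal RMS are all assumptions made for illustration; the abstract does not state how the high- and middle-RMS-level segments were delimited.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a 1-D signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def mix_at_smr(target, masker, smr_db):
    """Scale the masker so that 20*log10(rms(target)/rms(masker)) equals
    smr_db, then add it to the target. Returns (mixture, masker_gain)."""
    gain = rms(target) / (rms(masker) * 10 ** (smr_db / 20))
    return target + gain * masker, gain


def label_rms_segments(signal, frame_len, high_db=0.0, middle_db=-10.0):
    """Label non-overlapping frames by their level relative to the
    whole-signal RMS. The thresholds are hypothetical defaults:
    'high' at or above high_db, 'middle' from middle_db up to high_db,
    'low' otherwise."""
    ref = rms(signal)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    frame_rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Floor the frame RMS to avoid log10(0) on silent frames.
    rel_db = 20 * np.log10(np.maximum(frame_rms, 1e-12) / ref)
    labels = np.full(n_frames, "low", dtype=object)
    labels[rel_db >= middle_db] = "middle"
    labels[rel_db >= high_db] = "high"
    return labels, rel_db


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.standard_normal(16000)   # stand-in for target speech
    masker = rng.standard_normal(16000)   # stand-in for competing speaker
    for smr in (6, 3, 0, -3, -6):         # the five SMRs from the abstract
        mixture, gain = mix_at_smr(target, masker, smr)
        print(smr, round(gain, 3))
```

In a pipeline of this kind, the frame labels would be used to mask the speech envelope so that a separate TRF could be estimated for each segment class; that estimation step is not shown here.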
isca-archive.org