Unlike conventional CNNs, SincNet takes the raw speech waveform as input. This paper leverages SincNet in a vanilla transfer learning (VTL) setup. Out-domain data is ...
(a) In Stage 1, SincNet is trained for frame-level speaker identification using out-domain data. (b) In Stage 2, we adopt the trained SincNet as feature ...
We used SincNet trained on out-domain data for vanilla transfer learning (VTL). We propose to leverage out-domain data in speaker diarization through SincNet-.
This study proposes a model-based approach for robust speaker clustering using i-vectors and employs cosine K-means and movMF speaker clustering as baseline ...
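For readers unfamiliar with the cosine K-means baseline named in this snippet, the following is a minimal sketch of the general technique, not the study's implementation. It assumes the embeddings (for example i-vectors) are the non-zero rows of a NumPy array; the function name `cosine_kmeans`, the farthest-first initialization, and the fixed iteration count are illustrative choices.

```python
import numpy as np

def cosine_kmeans(x, k, n_iter=20):
    """Cluster the rows of x into k groups by cosine similarity (illustrative sketch)."""
    # Unit-normalize rows so that dot products equal cosine similarities.
    # Assumes no row is the zero vector.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)

    # Farthest-first initialization (an assumption, chosen for determinism):
    # start from row 0, then repeatedly add the row least similar to the
    # centroids picked so far.
    centroids = [x[0]]
    for _ in range(1, k):
        best_sim = np.max(x @ np.array(centroids).T, axis=1)
        centroids.append(x[np.argmin(best_sim)])
    centroids = np.array(centroids)

    for _ in range(n_iter):
        # Assign each row to the centroid with the highest cosine similarity.
        labels = np.argmax(x @ centroids.T, axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) == 0:
                continue  # keep the previous centroid for an empty cluster
            total = members.sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0:
                # Project the mean direction back onto the unit sphere.
                centroids[j] = total / norm

    labels = np.argmax(x @ centroids.T, axis=1)
    return labels, centroids
```

Called as `labels, centroids = cosine_kmeans(embeddings, k)`, it returns one cluster index per row. The only differences from ordinary K-means are the similarity used for assignment and the re-normalization of each centroid.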
Speech Processing: Speaker Recognition and Characterization. Paper Title: TRANSFER LEARNING USING RAW WAVEFORM SINCNET FOR ROBUST SPEAKER DIARIZATION.
This paper proposes a novel CNN architecture, called SincNet, that encourages the first convolutional layer to discover more meaningful filters, ...
Paper Title: TRANSFER LEARNING USING RAW WAVEFORM SINCNET FOR ROBUST SPEAKER DIARIZATION ; Authors: Harishchandra Dubey; University of Texas at Dallas ; Abhijeet ...
The idea behind using SincNet filters on the raw speech waveform is to extract more distinguishing frequency-related features in the initial convolution layers ...
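To make the idea of a sinc-based first layer concrete, here is a minimal NumPy sketch of a single band-pass kernel built as the difference of two low-pass sinc filters and applied directly to a waveform. In SincNet the two cutoff frequencies of each filter are learned; in this sketch they are fixed arguments, and the function names, the kernel size of 251, the 16 kHz sample rate, and the Hamming window are assumptions made for illustration.

```python
import numpy as np

def sinc_bandpass(f_low_hz, f_high_hz, kernel_size=251, sample_rate=16000):
    """Windowed band-pass FIR kernel parameterized only by its two cutoffs."""
    # Symmetric time axis in samples, centered on zero (kernel_size should be odd).
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    # Cutoff frequencies normalized to cycles per sample.
    f1 = f_low_hz / sample_rate
    f2 = f_high_hz / sample_rate
    # np.sinc(x) is sin(pi*x)/(pi*x), so 2*f*np.sinc(2*f*n) is an ideal
    # low-pass response with cutoff f. Subtracting two of them leaves the band.
    kernel = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    # Taper with a Hamming window to reduce ripple from truncating the sinc.
    return kernel * np.hamming(kernel_size)

def apply_filter(waveform, kernel):
    """Convolve a raw waveform with the kernel, keeping the input length."""
    return np.convolve(waveform, kernel, mode="same")

# Usage: a 1 kHz tone passes a 500-1500 Hz filter, a 4 kHz tone is attenuated.
t = np.arange(16000) / 16000
kernel = sinc_bandpass(500, 1500)
passed = apply_filter(np.sin(2 * np.pi * 1000 * t), kernel)
blocked = apply_filter(np.sin(2 * np.pi * 4000 * t), kernel)
```

Each such filter is described by two numbers rather than one weight per tap, which is what ties the first layer to frequency-related features.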
Abstract—This paper presents an approach to the speaker diarization problem based on speech local waveform analysis. We assume that the recorded sound scene ...
Transfer Learning Using Raw Waveform Sincnet for Robust Speaker Diarization · Harishchandra Dubey · Abhijeet Sangwan · John H. L. Hansen ...