Improved Meta-Learning Training for Speaker Verification

Chen, Yafeng; Guo, Wu; Gu, Bin

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2103.15421 (eess)

[Submitted on 29 Mar 2021 (v1), last revised 3 Aug 2023 (this version, v2)]

Abstract: Meta-learning has recently become a research hotspot in speaker verification (SV). In this paper, we introduce two methods to improve meta-learning training for SV. In the first method, a backbone embedding network is first jointly trained with the conventional cross-entropy loss and the prototypical networks (PN) loss. Then, inspired by speaker adaptive training in speech recognition, additional transformation coefficients are trained with only the PN loss; these coefficients are used to modify the original backbone embedding network during x-vector extraction. In the second method, the random erasing data augmentation technique is applied to all support samples in each episode to construct positive pairs, and a contrastive loss between the augmented and the original support samples is added to the training objective. Experiments are carried out on the SITW and VOiCES databases. Both methods obtain consistent improvements over existing meta-learning training frameworks, and combining them yields further improvements on both databases.
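To make the two ingredients named in the abstract more concrete, here is a minimal sketch of an episodic prototypical-network loss and a simplified random erasing step. It is an illustration under stated assumptions, not the authors' implementation: the function names, the squared Euclidean distance, and the zero-filled contiguous block of frames are all hypothetical choices, and the transformation coefficients and contrastive loss are not modeled.

```python
import math
import random


def prototype(embeddings):
    """Return the element-wise mean of a list of equal-length vectors."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed choice of metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def prototypical_loss(support, query):
    """Episode loss in the style of prototypical networks.

    support: dict mapping speaker id -> list of support embeddings
    query:   list of (speaker id, embedding) pairs
    Each query is scored against every speaker prototype; the loss is the
    mean negative log-softmax of the true speaker over negative distances.
    """
    protos = {spk: prototype(embs) for spk, embs in support.items()}
    total = 0.0
    for spk, emb in query:
        logits = {s: -sq_dist(emb, p) for s, p in protos.items()}
        m = max(logits.values())  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += log_z - logits[spk]
    return total / len(query)


def random_erase(features, max_frames, rng):
    """Return a copy of a frames-by-bins feature matrix with one random
    contiguous block of frames set to zero (simplified random erasing)."""
    n = len(features)
    width = rng.randint(1, min(max_frames, n))
    start = rng.randint(0, n - width)
    return [
        [0.0] * len(frame) if start <= t < start + width else list(frame)
        for t, frame in enumerate(features)
    ]


if __name__ == "__main__":
    support = {
        "spk_a": [[0.0, 0.0], [0.0, 2.0]],
        "spk_b": [[10.0, 10.0], [10.0, 12.0]],
    }
    query = [("spk_a", [0.0, 1.0]), ("spk_b", [10.0, 11.0])]
    print("episode loss:", prototypical_loss(support, query))

    rng = random.Random(0)
    feats = [[1.0, 1.0, 1.0] for _ in range(8)]
    print("augmented:", random_erase(feats, max_frames=3, rng=rng))
```

In an episode, the original support features and their erased copies would form the positive pairs for the contrastive term described in the abstract; how that term is defined and weighted is left open here because the abstract does not specify it.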

Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2103.15421 [eess.AS]
	(or arXiv:2103.15421v2 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2103.15421

Submission history

From: Yafeng Chen
[v1] Mon, 29 Mar 2021 08:37:27 UTC (829 KB)
[v2] Thu, 3 Aug 2023 03:09:53 UTC (174 KB)
