I-vector Transformation Using Conditional Generative Adversarial Networks for Short Utterance Speaker Verification

Jiacen Zhang; Nakamasa Inoue; Koichi Shinoda

doi:10.21437/Interspeech.2018-1680

Publication Information

Title

English:	I-vector Transformation Using Conditional Generative Adversarial Networks for Short Utterance Speaker Verification

Author

Japanese:	ZHANG Jiacen, 井上中順, 篠田浩一.
English:	Jiacen Zhang, Nakamasa Inoue, Koichi Shinoda.

Language

English

Journal/Book name

English:	Proc. Interspeech 2018

Volume, Number, Page

pp. 3613-3617

Published date

Sept. 4, 2018

Publisher

English:	ISCA

Conference name

English:	Interspeech 2018

Conference site

Japanese:	ハイデラバード
English:	Hyderabad

Official URL

https://www.isca-speech.org/archive/Interspeech_2018/abstracts/1680.html

DOI

https://doi.org/10.21437/Interspeech.2018-1680

Abstract

I-vector-based text-independent speaker verification (SV) systems often perform poorly on short utterances, because the biased phonetic distribution in a short utterance makes the extracted i-vector unreliable. This paper proposes an i-vector compensation method using a generative adversarial network (GAN): the generator network is trained to produce a compensated i-vector from a short-utterance i-vector, and the discriminator network is trained to determine whether an i-vector was produced by the generator or extracted from a long utterance. Additionally, we assign two other learning tasks to the GAN to stabilize its training and to make the generated i-vector more speaker-specific. Speaker verification experiments on the NIST SRE 2008 “10sec-10sec” condition show that our method reduces the equal error rate by 11.3% compared with the conventional i-vector and PLDA system.
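
The adversarial setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the linear generator, the logistic discriminator, the use of the short-utterance i-vector as the conditioning input, the binary cross-entropy losses (with a non-saturating generator loss), and all names and toy values. The two additional learning tasks are not specified in the abstract and are left out.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(p, target):
    """Binary cross-entropy for one probability p and a 0/1 target."""
    eps = 1e-12  # guards against log(0)
    return -(target * math.log(p + eps) + (1.0 - target) * math.log(1.0 - p + eps))


def generate(short_ivec, g_weights, g_bias):
    """Hypothetical linear generator: short-utterance i-vector -> compensated i-vector."""
    return [sum(w * x for w, x in zip(row, short_ivec)) + b
            for row, b in zip(g_weights, g_bias)]


def discriminate(ivec, short_ivec, d_weights, d_bias):
    """Hypothetical logistic discriminator, conditioned on the short-utterance i-vector.

    Returns the estimated probability that `ivec` was extracted from a long
    utterance rather than produced by the generator.
    """
    features = list(ivec) + list(short_ivec)
    return sigmoid(sum(w * x for w, x in zip(d_weights, features)) + d_bias)


def adversarial_losses(short_ivec, long_ivec, g_weights, g_bias, d_weights, d_bias):
    """Compute the discriminator and generator losses for one training pair."""
    fake = generate(short_ivec, g_weights, g_bias)
    p_real = discriminate(long_ivec, short_ivec, d_weights, d_bias)
    p_fake = discriminate(fake, short_ivec, d_weights, d_bias)
    # Discriminator: label long-utterance i-vectors 1 and generated ones 0.
    d_loss = bce(p_real, 1.0) + bce(p_fake, 0.0)
    # Generator: try to make the discriminator label its output 1.
    g_loss = bce(p_fake, 1.0)
    return d_loss, g_loss


# Toy 3-dimensional "i-vectors" (real i-vectors have hundreds of dimensions).
short_ivec = [0.2, -0.1, 0.4]
long_ivec = [0.3, 0.0, 0.5]
g_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity map
g_bias = [0.0, 0.0, 0.0]
d_weights = [0.0] * 6  # untrained discriminator: always outputs 0.5
d_bias = 0.0

d_loss, g_loss = adversarial_losses(
    short_ivec, long_ivec, g_weights, g_bias, d_weights, d_bias
)
print(f"discriminator loss: {d_loss:.4f}, generator loss: {g_loss:.4f}")
```

In an actual training loop the two losses would be minimized alternately with respect to the discriminator and generator parameters; the sketch only evaluates them once for a single pair of i-vectors.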
