Generative Adversarial Network Based i-Vector Transformation for Short Utterance Speaker Verification

Jiacen Zhang; Nakamasa Inoue; Koichi Shinoda

Publication Information

Title

Japanese:
English:	Generative Adversarial Network Based i-Vector Transformation for Short Utterance Speaker Verification

Author

Japanese:	ZHANG Jiacen, 井上中順, 篠田浩一.
English:	Jiacen Zhang, Nakamasa Inoue, Koichi Shinoda.

Language

English

Journal/Book name

Japanese:	2018年秋季研究発表会講演論文集
English:	ASJ 2018 Autumn Meeting

Volume, Number, Page

pp. 1345-1346

Published date

Aug. 29, 2018

Publisher

Japanese:	一般社団法人日本音響学会
English:	Acoustical Society of Japan

Conference name

Japanese:	日本音響学会 2018年秋季研究発表会
English:	2018 Autumn Meeting of the Acoustical Society of Japan

Conference site

Japanese:	大分
English:	Oita

Official URL

http://www.asj.gr.jp/annualmeeting/index.html

Abstract

i-Vector based text-independent speaker verification (SV) systems often perform poorly on short utterances, because the biased phonetic distribution in a short utterance makes the extracted i-vector unreliable. This paper proposes an i-vector compensation method using a generative adversarial network (GAN). Its generator network is trained to transform an unreliable i-vector extracted from a short utterance into a reliable one of the kind that can only be extracted from a long utterance, and its discriminator network is trained to determine whether an i-vector comes from the generator or from a long utterance. Additionally, we assign two other learning tasks to the GAN to stabilize its training and to make the generated i-vector more speaker-specific. Speaker verification experiments conducted on the NIST SRE 2008 “short2-10sec” and “10sec-10sec” conditions show that our method can help reduce the average equal error rate of the conventional i-vector and PLDA system.
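
For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how an equal error rate (EER) can be estimated from verification scores. It is not taken from the paper: the function name equal_error_rate, the threshold sweep over observed scores, and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER and its threshold from two lists of trial scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    The EER is approximated at the observed score where the false
    acceptance and false rejection rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # False rejection rate: same-speaker trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: different-speaker trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, made up for illustration only (not results from the paper).
target = [2.1, 1.7, 0.9, 0.4, 1.2]
nontarget = [-1.0, 0.5, -0.3, 0.1, 1.0]
eer, threshold = equal_error_rate(target, nontarget)
```

With these made-up scores, one of five target trials is rejected and one of five non-target trials is accepted at the selected threshold, so the two error rates coincide. A lower EER indicates a better verification system.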
