
Article abstract

Chen Cunbao, Zhao Li. Speaker identification based on GMM with embedded TDNN [J]. Technical Acoustics (声学技术), 2010, (3): 292-296

Speaker identification based on GMM with embedded TDNN

Submitted: 2009-05-12 Revised: 2009-08-29

DOI：

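Chinese keywords: speaker recognition; Gaussian mixture model (GMM); time delay neural network (TDNN); embedding

English keywords: speaker identification; Gaussian mixture model; time delay neural network; embedded

Funding: National Natural Science Foundation of China (60872073, 60975017); Natural Science Foundation of Jiangsu Province (BK2008291)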

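Author	Affiliation	E-mail
Chen Cunbao	School of Information Science and Engineering, Southeast University, Nanjing 210096	chencunbao@gmail.com
Zhao Li	School of Information Science and Engineering, Southeast University, Nanjing 210096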


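Abstract (translated from Chinese):

This paper proposes a method for embedding a time delay neural network (TDNN) in a Gaussian mixture model (GMM). It combines the respective strengths of the TDNN, a discriminative method, and the GMM, a generative method. The TDNN exploits the temporal information in the set of feature vectors, and the transformation performed by the network makes the maximum likelihood (ML) method, which requires assuming that the variables are independent, more reasonable. The two models are trained as a whole under the maximum likelihood criterion. During training, the parameters of the GMM and of the neural network are updated alternately. Experimental results show that the proposed model achieves a higher recognition rate than the baseline system at every signal-to-noise ratio tested, with gains of up to 21%.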

English abstract:

This paper proposes a modified Gaussian mixture model (GMM) with an embedded time delay neural network (TDNN). It integrates the merits of the GMM, which is a generative model, and the TDNN, which is a discriminative model. The TDNN captures the temporal information in the feature sets, and by transforming the feature vectors it makes the independence hypothesis required by maximum likelihood more reasonable. The GMM and the TDNN are trained as a whole by maximum likelihood. During training, the parameters of the GMM and the TDNN are updated alternately. Experiments show that the proposed system improves the accuracy rate over the baseline GMM at all SNRs, by up to 21%.
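The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every detail below is an assumption made for illustration: a single time-delay layer with a tanh activation and a context of two frames on each side, a diagonal-covariance GMM, a variance floor, and all function and parameter names (`tdnn_layer`, `em_step`, `identify`, `var_floor`). Only the GMM half of the alternating update is shown, as one EM step with the network output held fixed. The abstract does not say how the network parameters are updated or constrained, so that step is left out.

```python
import numpy as np

def tdnn_layer(frames, W, b, context=2):
    """Hypothetical time-delay layer: splice each frame with `context`
    neighbours on both sides, then apply tanh(spliced @ W + b)."""
    T, _ = frames.shape
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    # Row t holds frames t-context .. t+context side by side.
    spliced = np.hstack([padded[k:k + T] for k in range(2 * context + 1)])
    return np.tanh(spliced @ W + b)

def gmm_log_prob(X, weights, means, variances):
    """log(w_k * N(x_t | mu_k, diag(var_k))) for every frame t and component k."""
    diff = X[:, None, :] - means[None, :, :]
    log_norm = -0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
    log_exp = -0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    return np.log(weights)[None, :] + log_norm[None, :] + log_exp

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def avg_log_likelihood(X, gmm):
    """Average per-frame log-likelihood of X under gmm = (weights, means, variances)."""
    return float(np.mean(logsumexp(gmm_log_prob(X, *gmm), axis=1)))

def em_step(X, gmm, var_floor=1e-3):
    """One EM update of the GMM parameters with the network output X held fixed."""
    lp = gmm_log_prob(X, *gmm)
    resp = np.exp(lp - logsumexp(lp, axis=1)[:, None])  # responsibilities, (T, K)
    Nk = resp.sum(axis=0) + 1e-10                        # soft counts per component
    weights = Nk / Nk.sum()
    means = (resp.T @ X) / Nk[:, None]
    variances = (resp.T @ X ** 2) / Nk[:, None] - means ** 2
    return weights, means, np.maximum(variances, var_floor)

def identify(frames, speakers):
    """Return the speaker whose (W, b, gmm) gives the highest average log-likelihood."""
    scores = {name: avg_log_likelihood(tdnn_layer(frames, W, b), gmm)
              for name, (W, b, gmm) in speakers.items()}
    return max(scores, key=scores.get)
```

In this sketch each speaker is represented by a tuple `(W, b, gmm)`, and identification picks the speaker whose model assigns the highest average log-likelihood to the transformed test frames. A full alternating scheme would interleave calls to `em_step` with an update of `W` and `b` under the same likelihood criterion.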
