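Article abstract
Mao Wei, Zeng Qingning, Long Chao. Application of dual-mini microphone array speech enhancement algorithm in speaker recognition [J]. Technical Acoustics (声学技术), 2018, 37(3): 253-260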
Application of dual-mini microphone array speech enhancement algorithm in speaker recognition (original title: 双微阵列语音增强算法在说话人识别中的应用)
Received: 2017-06-21  Revised: 2017-08-18
DOI:10.16300/j.cnki.1000-3630.2018.03.011
Keywords: dual-mini array; speech enhancement; coherence filtering; minimum variance distortionless response; modified Wiener filter; speaker recognition
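Funding: National Natural Science Foundation of China (61461011); 2016 Director's Fund Project of a Key Laboratory of the Ministry of Education (CRKL160107); Guilin University of Electronic Technology Graduate Research Innovation Projects (2017YJCX16, 2017YJCX20)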
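Author  Affiliation  E-mail
Mao Wei (毛维)  School of Information and Communication, Guilin University of Electronic Technology, Guilin, Guangxi 541004
Zeng Qingning (曾庆宁)  School of Information and Communication, Guilin University of Electronic Technology, Guilin, Guangxi 541004
Long Chao (龙超)  School of Information and Communication, Guilin University of Electronic Technology, Guilin, Guangxi 541004  bishe006@163.com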
Abstract:
      Speaker recognition performance drops markedly in complex noise environments. To address this, a dual-mini microphone array speech enhancement algorithm is proposed for the front end of a speaker recognition system. The algorithm combines coherence filtering with a frequency-domain broadband minimum variance distortionless response (MVDR) beamformer followed by a modified Wiener post-filter. First, the coherence function between two adjacent channels of the dual-mini microphone array is computed, and the inter-channel coherence is used for initial noise suppression. Next, the frequency-domain broadband MVDR beamformer preserves the signal from the target source direction and suppresses interference from other directions, after which the modified Wiener filter removes residual noise and improves speech quality. Finally, Mel frequency cepstral coefficients (MFCC) and Gammatone filter-bank frequency cepstral coefficients (GFCC) are extracted from the enhanced speech and used for speaker recognition. In the simulations, the data were collected binaurally with an acoustic artificial head. The experimental results show that the algorithm achieves good enhancement in complex noise environments and effectively improves the recognition rate of the speaker recognition system.
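      The abstract does not give the formulas behind the coherence and MVDR stages, so the following is only a generic textbook sketch of those two building blocks, not the authors' implementation. It assumes NumPy; the function names `msc` and `mvdr_weights`, the frame length, the hop size, and the diagonal loading factor are all hypothetical choices made for illustration. The modified Wiener post-filter and the mapping from coherence to a suppression gain are not specified in the abstract and are left out.

```python
import numpy as np


def msc(x1, x2, frame_len=256, hop=128):
    """Magnitude-squared coherence per frequency bin, averaged over frames.

    Values near 1 indicate a coherent (directional) source; values near 0
    indicate uncorrelated noise between the two channels.
    """
    window = np.hanning(frame_len)
    starts = range(0, len(x1) - frame_len + 1, hop)
    X1 = np.array([np.fft.rfft(window * x1[s:s + frame_len]) for s in starts])
    X2 = np.array([np.fft.rfft(window * x2[s:s + frame_len]) for s in starts])
    p11 = np.mean(np.abs(X1) ** 2, axis=0)   # auto-spectrum, channel 1
    p22 = np.mean(np.abs(X2) ** 2, axis=0)   # auto-spectrum, channel 2
    p12 = np.mean(X1 * np.conj(X2), axis=0)  # cross-spectrum
    return np.abs(p12) ** 2 / (p11 * p22 + 1e-12)


def mvdr_weights(R, d, loading=1e-6):
    """MVDR weights w = R^-1 d / (d^H R^-1 d) for a single frequency bin.

    R is the (Hermitian) spatial covariance matrix and d the steering
    vector toward the target. The small diagonal loading is an assumed
    regularization, not something stated in the source.
    """
    n = R.shape[0]
    R_loaded = R + loading * np.trace(R).real / n * np.eye(n)
    r_inv_d = np.linalg.solve(R_loaded, d)
    return r_inv_d / (np.conj(d) @ r_inv_d)


# Illustrative use with synthetic data (all values are made up).
rng = np.random.default_rng(0)
speech_like = rng.standard_normal(8000)
coherent = msc(speech_like, speech_like)            # identical channels
incoherent = msc(rng.standard_normal(8000), rng.standard_normal(8000))

d = np.array([1.0, np.exp(-1j * 0.3)])              # assumed steering vector
R = np.array([[2.0, 0.5 + 0.2j], [0.5 - 0.2j, 1.5]])
w = mvdr_weights(R, d)
target_response = np.conj(w) @ d                    # distortionless constraint
```

      In a broadband frequency-domain beamformer, `mvdr_weights` would be evaluated once per frequency bin with that bin's covariance matrix and steering vector; the distortionless constraint means the response toward the target direction stays at unity in every bin.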