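Article abstract
Wang Long, Yang Jun'an, Chen Lei, Lin Wei. Recurrent neural network based Chinese language modeling method[J]. Technical Acoustics (声学技术), 2015, 34(5): 431-436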
Recurrent neural network based Chinese language modeling method
Received: 2014-10-22  Revised: 2015-02-09
DOI:10.16300/j.cnki.1000-3630.2015.05.010
Keywords: speech recognition  recurrent neural network  language model  model combination
Funding: Supported by the National Natural Science Foundation of China (60872113) and the Anhui Provincial Natural Science Foundation (1208085MF94, 1308085QF99).
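Authors, affiliations, and e-mail:
Wang Long: Electronic Engineering Institute of PLA, Hefei 230037, Anhui, China; Key Laboratory of Electronic Restriction of Anhui Province, Hefei 230037, Anhui, China; longwang0927@126.com
Yang Jun'an: Electronic Engineering Institute of PLA, Hefei 230037, Anhui, China; Key Laboratory of Electronic Restriction of Anhui Province, Hefei 230037, Anhui, China
Chen Lei: Electronic Engineering Institute of PLA, Hefei 230037, Anhui, China; Key Laboratory of Electronic Restriction of Anhui Province, Hefei 230037, Anhui, China
Lin Wei: Anhui USTC iFlytek Co., Ltd., Hefei 230037, Anhui, China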
Abstract:
      Language models are an important component of speech recognition systems, and the n-gram model is currently the mainstream approach. However, n-gram models have two shortcomings that limit their performance: they describe long-distance information within a sentence poorly, and they suffer from data sparsity. To address these shortcomings, researchers have proposed recurrent neural network (RNN) modeling, which has achieved good results in English language modeling. In this paper, the RNN modeling method is applied to Chinese language modeling according to the characteristics of the Chinese language, and a model combination method is proposed to combine the advantages of the two models. Experimental results show that, compared with the conventional n-gram language model, the RNN-trained Chinese language model has lower perplexity (PPL) and gives a lower system error rate in speech recognition on the Chinese telephone channel; after the two language models are combined, the recognition error rate is lower still.
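The abstract relies on two quantities that are easy to make concrete: perplexity (PPL) and a combined model score. The sketch below is illustrative only. The abstract does not say how the two models are combined, so linear interpolation with a weight `lam` is an assumption here, and the function names and per-word probabilities are hypothetical toy values, not results from the paper.

```python
import math


def perplexity(word_probs):
    """Perplexity of a word sequence, given each word's conditional probability."""
    if not word_probs:
        raise ValueError("need at least one probability")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


def interpolate(p_rnn, p_ngram, lam=0.5):
    """Linear interpolation of two models' per-word probabilities.

    Assumed combination scheme; the abstract does not specify the actual method.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(p_rnn) != len(p_ngram):
        raise ValueError("probability lists must have the same length")
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_rnn, p_ngram)]


# Hypothetical per-word probabilities for one four-word sentence.
p_ngram = [0.10, 0.02, 0.30, 0.05]
p_rnn = [0.08, 0.10, 0.20, 0.04]
p_mix = interpolate(p_rnn, p_ngram, lam=0.5)

print("n-gram PPL:", perplexity(p_ngram))
print("RNN PPL:", perplexity(p_rnn))
print("combined PPL:", perplexity(p_mix))
```

Because the logarithm is concave, the interpolated model's perplexity can never exceed the larger of the two individual perplexities on the same text. In practice the weight `lam` would be tuned on held-out data.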