Yan Yonghong (颜永红)
Yan Yonghong: Research and Work Experience
 
 
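Date of birth: March 16, 1967. Place of birth: Wuxi, Jiangsu. Education: postgraduate.
Position/title: Research Professor (doctoral supervisor). Specialty: speech signal and information processing.
Assistant to the Director, Institute of Acoustics, Chinese Academy of Sciences; Deputy Director, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences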
 
 
 
 
Résumé
 
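1985.09 – 1990.06     Department of Electronic Engineering, Tsinghua University; Bachelor of Engineering
1990.06 – 1992.08     Beijing Xinghe Electronics Company (北京星河电子公司); systems engineer and leader of the speaker-independent speech recognition project group
1992.09 – 1995.06     Oregon Graduate Institute (OGI), USA; Ph.D. in Computer Science and Engineering
1995.08 – 1996.06     Research Scientist, Department of Computer Science, OGI
1996.07 – 1998.07     Assistant Professor, Department of Computer Science, OGI
1998.08 – 2004.06     Associate Professor, Department of Electrical & Computer Science, Oregon Health & Science University (merged with OGI in 2001)
1998.12 – 2001.04     Intel Corporation (USA): Chair of Intel's worldwide human-computer interface academic committee; Director and Chief Scientist of the Intel China Research Center; chief human-computer interface architect, Intel Microprocessor Research Lab
2002.02 – present     Research Professor and doctoral supervisor, Institute of Acoustics, Chinese Academy of Sciences
2010.01 – present     Editorial board member, Journal of Computer Science and Technology
2010.01 – present     Associate Editor-in-Chief, 《声学学报》 (Acta Acustica)
2012.06 – present     Associate Editor-in-Chief, 《应用声学》 (Journal of Applied Acoustics)
2013.1 –              Editorial board member, 《声学技术》 (Technical Acoustics)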
 
Honors
 
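2000       Intel Inventor Award
2002       Selected for the Chinese Academy of Sciences "Hundred Talents Program"; in the final evaluation that the Academy conducted in 2006 of the 109 "Hundred Talents Program" recipients selected in 2002, rated excellent and ranked first
2007       National-level candidate, New Century Hundred-Thousand-Ten-Thousand Talents Project
2009       Recipient of the National Science Fund for Distinguished Young Scholars, National Natural Science Foundation of China
2010       Recipient of the Ma Dayou Acoustics Award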
 
 
 
Major Research Achievements
 
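Over the years he has led and completed more than 40 research projects in China and abroad, including a National Natural Science Foundation Distinguished Young Scholars project, a Natural Science Foundation key project, a Natural Science Foundation international cooperation project, national 973/863 program projects, a Ministry of Science and Technology support-program project, Chinese Academy of Sciences key-support and pilot projects, and a number of industry-funded projects. He has published more than 300 academic papers and holds more than 40 granted national invention patents.
While working at the Center for Spoken Language Research of the Oregon Graduate Institute (OGI), which together with the speech groups at CMU and MIT was one of the three largest spoken-language research organizations in the United States in the 1990s, he took on several projects funded by the U.S. National Science Foundation and the Department of Defense (DARPA), and was the principal author of the speech recognition portion of the OGI speech toolkit, one of the more influential toolkits in the speech field in the 1990s. The algorithms he proposed, which combine multiple recognizer front ends, multiple features, and back-end information fusion, had considerable influence on language identification research. His systems placed first in two consecutive language identification evaluations, in 1995 and 1996, organized by the U.S. Department of Defense and the National Institute of Standards and Technology.
From late 1998 to 2001 he worked at Intel Corporation, serving successively as principal engineer, chief human-computer interface architect, Director and Chief Scientist of the Intel China Research Center, and Chair of Intel's worldwide human-computer interface academic committee. He led development of the signal processing portion of Intel's performance library (Intel Performance Primitives, IPP), which is widely used in x86-based software. He received the company's Division Recognition Award several times and the Intel Inventor Award in 2000.
In January 2002 he joined the Institute of Acoustics, Chinese Academy of Sciences, was selected for the Academy's "Hundred Talents Program" the same year, and set about building a new speech laboratory. In 2006 the Academy carried out its final evaluation of the 109 "Hundred Talents Program" recipients selected in 2002; Yan Yonghong was rated excellent and ranked first. Over more than ten years the laboratory's active research funding and its staff have each grown more than tenfold. It has taken on national projects under the 863 and 973 programs, the National Natural Science Foundation, and the Academy's Knowledge Innovation Program, with particular emphasis on practical applications of keyword spotting, speaker recognition, language identification, and audio watermarking. Since its founding, the laboratory has contributed new work in auditory perception, noise reduction, search algorithms, discriminative training, pronunciation assessment models, music retrieval, and speaker/language recognition; it has published more than 300 papers, filed more than 40 invention patent applications, and placed first in several domestic and international speech technology evaluations. It is now the only provincial/ministerial-level key laboratory in China in the field of speech acoustics. Working with outside partners, the laboratory has brought speech products with independent intellectual property to market, ending the dominance of foreign companies in the Chinese speech recognition market.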
 
 
Selected SCI-Indexed Papers
 
1. C. Cao, M. Li, X. Wu, H. Suo, J. Liu, and Y. Yan, "Automatic Singing Performance Evaluation for Untrained Singers," IEICE Trans. on Information and Systems, vol. E92-D, iss. 8, pp. 1596–1600, 2009.
2. C. Liu, F. Pan, F. Ge, B. Dong, H. Suo, and Y. Yan, "An LVCSR Based Reading Miscue Detection System Using Knowledge of Reference and Error Patterns," IEICE Transactions on Information and Systems, vol. E92-D, iss. 9, 2009.
3. X. Wu, M. Li, H. Suo, and Y. Yan, "Melody Track Selection Using Discriminative Language Model," IEICE Trans. on Information and Systems, vol. 91, iss. 6, pp. 1838–1840, 2008.
4. H. Zhang, Q. Fu, and Y. Yan, "Speech Enhancement Using Improved Adaptive Null-Forming in Frequency Domain with Postfilter," IEICE Trans. on Fundamentals of Electronics, Communications and Computer Sciences, vol. E91-A, iss. 12, 2008.
5. D. Wang, Q. Fu, L. Yang, P. Yu, Y. Yan, and J. Feng, "Automatic voice assessment method based on a human auditory model" (in Chinese), Acta Physica Sinica (物理学报), vol. 57, iss. 7, pp. 290–296, 2008.
6. Q. Fu and P. Murphy, "Robust Glottal Source Estimation Based on Joint Source-filter Model Optimization," IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, vol. 14, iss. 2, pp. 492–501, 2006.
7. H. Zhang, Q. Fu, and Y. Yan, "Speech Enhancement Using Compact Microphone Array and Applications in Distant Speech Acquisition," Chinese Journal of Electronics (English), pp. 481–486, 2009.
8. X. Wang, J. Zhang, and Y. Yan, "Discrimination between pathological and normal voices using GMM-SVM approach," Journal of Voice, vol. 25, iss. 1, pp. 38–43, 2011.
9. Q. Zhang, J. Pan, and Y. Yan, "Development of a Mandarin-English Bilingual Speech Recognition System with Unified Acoustic Models," Journal of Information Science and Engineering, vol. 26, pp. 1491–1507, 2010.
10. X. Zhang, H. Suo, Q. Zhao, and Y. Yan, "Using a kind of novel phonotactic information for SVM based speaker recognition," IEICE Transactions on Information and Systems, vol. E92-D, iss. 4, pp. 746–749, 2009.
11. X. Wu and Y. Yan, "Speaker Adaptation Using Constrained Transformation," IEEE Trans on Speech and Audio Processing, vol. 12, iss. 2, pp. 168–174, 2004.
12. C. Liu and Y. Yan, "Robust State Clustering Using Phonetic Decision Trees," Speech Communication, vol. 42, iss. 3-4, pp. 391–408, 2004.
13. J. Yang, K. Sha, W. Gan, Y. Yan, and J. Tian, "A Simplified Algorithm for Impedance Calculation of Arbitrarily Shaped Radiators," Chinese Physics Letters, 2005.
14. J. Yang, K. Tan, W. Gan, M. Er, and Y. Yan, "Beamwidth control in parametric acoustic array," Japanese Journal of Applied Physics, vol. 44, iss. 9A, 2005.
15. X. Zhang, X. Xiao, H. Wang, J. Zhang, and Y. Yan, "Multi-Class Maximum A Posteriori Linear Regression for Speaker Verification," Chinese Journal of Electronics, vol. 19, iss. 4, pp. 641–645, 2010.
16. Y. Zhou, J. Li, Y. Sun, J. Zhang, Y. Yan, and M. Akagi, "A hybrid speech emotion recognition system based on spectral and prosodic features," IEICE TRANSACTIONS on Information and Systems, vol. E93-D, iss. 10, pp. 2813–2821, 2010.
17. J. Yang, X. Zhang, H. Suo, L. Lu, J. Zhang, and Y. Yan, "Maximum a Posteriori Linear Regression for Language Recognition," Expert Systems With Applications, vol. 39, iss. 4, pp. 4287–4291, 2012.
18. J. Yang, X. Zhang, H. Suo, J. Wang, and Y. Yan, "Language Recognition with Language Total Variability," Chinese Journal of Electronics, vol. 21, iss. 1, pp. 97–101, 2012.
19. J. Yang, X. Zhang, H. Suo, L. Lu, J. Zhang, and Y. Yan, "Low-dimensional Representation of Gaussian Mixture Model Supervector For Language Recognition," Eurasip Journal on Advances in Signal Processing, iss. 47, pp. 1–7, 2012.
20. K. Li, Q. Fu, and Y. Yan, "Speech Enhancement Using Robust Generalized Sidelobe Canceller with Multi-channel Post-Filtering in Adverse Environments," Chinese Journal of Electronics, vol. 40, iss. 1, pp. 85–90, 2012.
21. X. Zhao, Y. Guo, J. Liu, and Y. Yan, "Logarithmic adaptive quantization projection for audio watermarking," IEICE Transactions on Information and Systems, vol. E95-D, iss. 5, pp. 1436–1445, 2012.
22. Y. Feng, V. L. Gracco, and L. Max, "Integration of auditory and somatosensory error signals in the neural control of speech movements," Journal of Neurophysiology, vol. 106, pp. 667–679, 2011.
23. K. Li, Y. Guo, Q. Fu, J. Li, and Y. Yan, "Two-Microphone Noise Reduction Using Spatial Information-Based Spectral Amplitude Estimation," IEICE Transactions on Information and Systems, vol. E95-D, iss. 5, 2012.
24. Y. Li, W. Xu, and Y. Yan, "A Novel Similarity Measure to Induce Semantic Classes and Its Application for Language Model Adaptation in a Dialogue System," Journal of Computer Science and Technology, pp. 443–450, 2012.
25. S. Cai, Y. Xiao, J. Pan, Q. Zhao, and Y. Yan, "Noise Robust Feature Scheme for Automatic Speech Recognition Based on Auditory Perceptual Mechanisms," IEICE Transactions on Information and Systems, vol. E95-D, iss. 6, pp. 1610–1618, 2012.
26. C. Liang, X. Zhang, and Y. Yan, "Discriminative Decision Function Based Scoring Method Used in Speaker Verification," Chinese Journal of Electronics, vol. 21, iss. 4, pp. 692–696, 2012.
27. C. Liang, L. Yang, Q. Zhao, and Y. Yan, "Factor Analysis of Neighborhood Preserving Embedding for Speaker Verification," IEICE Transactions on Information and Systems, vol. E95-D, iss. 10, pp. 2572–2576, 2012.
28. F. Ge, C. Liu, J. Shao, F. Pan, B. Dong, and Y. Yan, "Effective Acoustic Modeling for Pronunciation Quality Scoring of Strongly Accented Mandarin Speech," IEICE Trans. Inf. Syst., vol. E91-D, iss. 10, pp. 2485–2492, 2008.
29. X. Zhang, P. Lv, H. Suo, Q. Zhao, and Y. Yan, "Robust Speaker Clustering Using Affinity Propagation," IEICE Trans. Inf. Syst., vol. E91-D, iss. 11, pp. 2739–2741, 2008.
30. Y. Sun, Y. Zhou, Q. Zhao, P. Zhang, F. Pan, and Y. Yan, "Enhancing the robustness of the posterior-based confidence measures using entropy information for speech recognition," IEICE Trans. on Information and Systems, vol. E93-D, iss. 9, pp. 2431–2439, 2010.
31. Y. Sun, Y. Zhou, Q. Zhao, and Y. Yan, "Acoustic Feature Optimization based on F-Ratio for Robust Speech Recognition," IEICE Transactions on Information and Systems, vol. E93-D, iss. 9, pp. 2417–2430, 2010.
32. Y. Feng, G. J. Hao, S. A. Xue, and L. Max, "Detecting anticipatory effects in speech articulation by means of spectral coefficient analyses," Speech Communication, vol. 53, iss. 6, pp. 842–854, 2011.
33. J. Gao, Q. Zhao, and Y. Yan, "Towards precise and robust automatic synchronization of live speech and its transcripts," Speech Communication, vol. 53, iss. 4, pp. 508–523, 2011.
34. J. Gao, J. Shao, Q. Zhao, and Y. Yan, "Efficient System Combination for Chinese Spoken Term Detection," Chinese Journal of Electronics, vol. 19, iss. 3, pp. 457–462, 2010.
35. J. Li, L. Yang, J. Zhang, Y. Yan, Y. Hu, M. Akagi, and P. C. Loizou, "Comparative intelligibility investigation of single-channel noise-reduction algorithms for Chinese, Japanese, and English," Journal of the Acoustical Society of America, vol. 129, iss. 5, pp. 3291–3301, 2011.
36. J. Li, S. Sakamoto, S. Hongo, M. Akagi, and Y. Suzuki, "Two-stage binaural speech enhancement with Wiener filter for high-quality speech communication," Speech Communication, vol. 53, iss. 5, pp. 677–689, 2011.
37. Z. Liu, J. Shao, P. Zhang, Q. Zhao, Y. Yan, and J. Feng, "A study of tone recognition in spontaneous spoken Mandarin" (in Chinese), Acta Physica Sinica (物理学报), vol. 56, iss. 12, pp. 7064–7068, 2007.
38. P. Lv and Y. Yan, "Rapid Adaptation Algorithm Based on Regression Analysis for Speech Recognition," Chinese Journal of Electronics, pp. 69–73, 2006.
39. L. Yang, J. Zhang, and Y. Yan, "Effect of the temporal fine structure in different frequency bands on Mandarin tone perception," IEICE Trans. on Information and Systems, vol. E91-D, iss. 28, pp. 371–374, 2008.
40. L. Yang, J. Zhang, and Y. Yan, "An improved cochlear implant strategy incorporating frequency modulation information," Chinese Journal of Electronics, vol. 17, iss. 2, pp. 273–278, 2008.
41. J. Shao, T. Li, Q. Zhang, Q. Zhao, and Y. Yan, "A One-Pass Real-Time Decoder Using Memory-Efficient State Network," IEICE TRANSACTIONS on Information and Systems, vol. 91, iss. 3, p. 529, 2008.
42. J. Shao, Q. Zhao, P. Zhang, Z. Liu, and Y. Yan, "Fast fuzzy keyword spotting using syllable confusion network," Chinese Journal of Electronics, vol. 17, iss. 2, pp. 265–269, 2008.
43. H. Suo, M. Li, P. Lv, and Y. Yan, "Automatic Language Identification with Discriminative Language Characterization Based on SVM," IEICE - Transactions on Information and Systems, vol. E91-D, iss. 3, pp. 567–575, 2008.
44. D. Ying, Y. Yan, J. Dang, and F. K. Soong, "Voice Activity Detection Based On An Unsupervised Learning Framework," IEEE Trans. on Audio, Speech, and Language Processing, vol. 19, iss. 8, pp. 2624–2633, 2011.
45. Q. Zhang, J. Pan, L. Yang, J. Shao, and Y. Yan, "Development of a Mandarin-English Bilingual Speech Recognition System for Real World Music Retrieval," IEICE - Transactions on Information and Systems, vol. E91-D, iss. 3, pp. 514–521, 2008.
46. Y. Zhou, H. Suo, J. Li, and Y. Yan, "Harmonic Structure Features for Robust Speaker Diarization," ETRI Journal, vol. 34, iss. 4, pp. 583–590, 2012.
 
 
 
Selected Granted Patents
 
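No. | Title | Patent application number | Year granted
1 | A method for echo cancellation and speech detection in a dialogue-interaction front end | ZL 02148685.9 | 2006
2 | A voice-transformation method based on digital signal processing | ZL 03137014.4 | 2006
3 | Sub-band adaptive valley-point noise reduction system and method | ZL 2004 1 0006563.8 | 2007
4 | An adaptive valley-point noise reduction method and system | ZL 2004 1 0006564.2 | 2007
5 | A speech recognition system | ZL 2004 1 0070139.X | 2006
6 | Speech recognition system | ZL 2004 1 0070140.2 | 2006
7 | A speech decoding method based on confusion networks | ZL 2004 1 0090801.8 | 2008
8 | A portable digital mobile communication device and its voice-control method and system | ZL 2003 8 0101122.X | 2008
9 | A speech endpoint detection method for speech recognition systems | ZL 2004 1 0090802.2 | 2009
10 | A pronunciation assessment method based on speech recognition and speech analysis | ZL 2004 1 0074445.0 | 2009
11 | A voiced-speech detection method based on harmonic features | ZL200510089956.4 | 2010
12 | Speech endpoint detection method based on energy and harmonics | ZL200510089957.9 | 2010
13 | A transform-domain digital audio mixing method | ZL200410088428.2 | 2010
14 | SMS question-answering system based on content analysis, and its implementation method | ZL200510093640.2 | 2010
15 | An energy-based note segmentation method | ZL200510117698.6 | 2010
16 | Adaptation method for telephone speech recognition | ZL200610089253.6 | 2010
17 | A self-synchronizing audio watermarking method | ZL200510064334.6 | 2010
18 | Endpoint detection method and device based on a floating window, and speech recognition system | ZL200410083807.2 | 2010
19 | Speech tagging method and system, and speech recognition method and system based on speech tagging | ZL200410083836.6 | 2010
20 | A speaker clustering method based on message passing | ZL 2007 1 0178363.4 | 2011
21 | An automatic method for analyzing the harmonics-to-noise ratio of the voice | ZL 2007 1 0178362.X | 2011
22 | Chinese-English bilingual speech recognition method based on phoneme confusion | ZL 2008 1 0110555.6 | 2011
23 | A directional speech enhancement method based on a compact microphone array | ZL 2008 1 0112195.3 | 2011
24 | A fast vocal tract length normalization method suitable for online use | ZL 2008 1 0097981.0 | 2011
25 | Fast confidence computation method for pronunciation quality assessment systems | ZL 2008 1 0240811.3 | 2011
26 | A digital audio watermarking algorithm for copyright management | ZL 2010 1 0621150.6 | 2012
27 | Noise spectrum estimation and voice activity detection method based on unsupervised learning | ZL 2010 1 0178166.4 | 2012
28 | A singing scoring system and method | ZL 2007 1 0177034.8 | 2012
29 | A speech emotion feature computation method for speech emotion recognition | ZL 201010272971.3 | 2012
30 | A method for estimating and correcting sampling-rate differences | ZL 2009 1 0088731.5 | 2012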