Article abstract
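Xiong Tian, Zhang Tianqi, Wen Bin, Wu Chao. Singing voice separation method based on robust principal component analysis and MFCC repeating structure [J]. Technical Acoustics (声学技术), 2023, 42(6): 794-803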
Singing voice separation method based on robust principal component analysis and multi-repetition MFCC structure
Received: 2022-05-06  Revised: 2022-06-16
DOI:10.16300/j.cnki.1000-3630.2023.06.013
Keywords: robust principal component analysis (RPCA)  Mel frequency cepstrum coefficient (MFCC)  singing voice and accompaniment separation  repeating structure
Funding: National Natural Science Foundation of China (61671095, 61702065, 61701067, 61771085)
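Author  Affiliation  E-mail
Xiong Tian  School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065  1194347981@qq.com
Zhang Tianqi  School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
Wen Bin  School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
Wu Chao  School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065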
Abstract:
      Any single conventional method separates the singing voice incompletely. To address this, the paper proposes a two-step singing voice and accompaniment separation model based on robust principal component analysis (RPCA) and a Mel frequency cepstrum coefficient (MFCC) repeating structure. The model mitigates two weaknesses: the incomplete separation of the singing voice by RPCA on its own, and the poor low-frequency separation of the MFCC repeating-structure method on its own. First, RPCA decomposes the mixed music signal into a low-rank matrix and a sparse matrix. MFCC features are then extracted from each matrix, a similarity operation on these features yields a similarity matrix, and an MFCC repeating-structure model is built, from which the masking matrices associated with the low-rank and sparse matrices are obtained. Finally, the background music and the singing voice are recovered from the masking matrices and the inverse Fourier transform. In experiments on a public dataset, the proposed algorithm improves the average signal-to-interference ratio (SIR) by up to nearly 7 dB over the comparison algorithms.
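The first stage of the pipeline, splitting a matrix into a low-rank part plus a sparse part, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the commonly used inexact augmented Lagrange multiplier (IALM) solver for RPCA, and the names `rpca_ialm` and `cosine_similarity_matrix`, the default parameters, and the choice of cosine similarity for the "similarity operation" are all illustrative assumptions, since the abstract specifies none of them.

```python
import numpy as np


def soft_threshold(X, tau):
    """Element-wise shrinkage: move every entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def rpca_ialm(M, lam=None, tol=1e-6, max_iter=500):
    """Split a nonzero matrix M into low-rank L plus sparse S (M ~ L + S).

    Hypothetical helper: solves min ||L||_* + lam * ||S||_1 subject to
    L + S = M with the inexact ALM scheme. Defaults are common choices,
    not values taken from the paper.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))  # usual default weight on the sparse term
    norm_M = np.linalg.norm(M, "fro")
    spectral = np.linalg.norm(M, 2)  # largest singular value of M
    Y = M / max(spectral, np.abs(M).max() / lam)  # scaled dual variable
    mu = 1.25 / spectral  # penalty parameter, increased every iteration
    mu_max = mu * 1e7
    rho = 1.5
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: shrink the singular values by 1/mu.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # Sparse update: shrink the entries by lam/mu.
        S = soft_threshold(M - L + Y / mu, lam / mu)
        # Dual update on the residual of the constraint L + S = M.
        R = M - L - S
        Y = Y + mu * R
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(R, "fro") <= tol * norm_M:
            break
    return L, S


def cosine_similarity_matrix(features):
    """Frame-by-frame cosine similarity; columns of `features` are frames.

    Cosine similarity is an assumption here; the abstract only mentions a
    "similarity operation" on the MFCC features.
    """
    F = np.asarray(features, dtype=float)
    norms = np.linalg.norm(F, axis=0, keepdims=True)
    Fn = F / np.maximum(norms, 1e-12)  # guard against all-zero frames
    return Fn.T @ Fn
```

In a separation setting, `M` would presumably be the magnitude spectrogram of the mixture, with `L` capturing the repetitive accompaniment and `S` the sparser vocal component. MFCC features computed from each part could then be passed to `cosine_similarity_matrix` to locate repeating frames before the masks are built.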