
Novel full reference perceptual quality metric for audio-visual asynchrony (Cited by: 2)
Abstract: A full-reference model built on co-inertia analysis is proposed for evaluating the perceptual quality of audio-visual asynchrony. A standard synchronization procedure first determines the time offset between the audio and video under test. Audiovisual content is divided into three categories: clean speech, non-speech, and speech with background sound; the clean-speech category is further split into two subcategories according to whether a speaker appears in the video. Multidimensional features are selected separately for each category, and co-inertia analysis extracts from them the most correlated audio-video feature mapping together with its degree of correlation. A correlation curve computed on the reference sequence then yields the mapping from synchronization error to perceptual quality. Experiments show that the model's scores correlate well with subjective test results.
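The co-inertia step described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the actual audio/video features, the per-category feature selection, and the final mapping to quality scores are omitted, and the latent-signal demo data is invented for the example.

```python
import numpy as np

def coinertia_axes(X, Y):
    """First pair of co-inertia axes for two time-aligned feature tables.

    X: (n, p) audio features, Y: (n, q) video features, one row per frame.
    Returns unit loading vectors (a, b) and the correlation r of the
    scores obtained by projecting each table onto its axis.
    """
    Xc = X - X.mean(axis=0)            # column-center both tables
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / X.shape[0]         # p x q cross-covariance matrix
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    a, b = U[:, 0], Vt[0, :]           # directions of maximal co-inertia
    r = np.corrcoef(Xc @ a, Yc @ b)[0, 1]
    return a, b, r

# Toy demo (invented data): two feature streams driven by a shared latent signal.
rng = np.random.default_rng(0)
t = rng.standard_normal(500)           # shared "event" signal over 500 frames
X = np.outer(t, rng.standard_normal(8)) + 0.1 * rng.standard_normal((500, 8))
Y = np.outer(t, rng.standard_normal(5)) + 0.1 * rng.standard_normal((500, 5))

_, _, r_sync = coinertia_axes(X, Y)                         # aligned streams
_, _, r_shift = coinertia_axes(X, np.roll(Y, 50, axis=0))   # desynchronized by 50 frames
```

When the streams are aligned, the score correlation `r` is high; shifting the video against the audio makes it drop. A correlation-versus-offset curve of this kind is what underlies the model's mapping from synchronization error to perceptual quality.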
Source: Journal on Communications (《通信学报》, EI, CSCD, Peking University Core Journals), 2012, No. 2, pp. 182-190 (9 pages).
Funding: Supported by the National Science and Technology Major Project (2010ZX03004-003).
Keywords: signal processing technique; audiovisual quality assessment; co-inertia analysis; synchrony

References (13)

  • 1RIMELL A, OWEN A. The effect of focused attention on audio-visual quality perception with applications in multi-modal codec design[A]. ICASSP 2000[C]. Istanbul, Turkey, 2000. 2377-2380.
  • 2HANDS D S. A basic multimedia quality model[J]. IEEE Transactions on Multimedia, 2004, 6(6): 806-816.
  • 3STEINMETZ R. Human perception of jitter and media synchronization[J]. IEEE Journal on Selected Areas in Communications, 1996, 14(1): 61-72.
  • 4NISHIBORI K, TAKEUCHI Y, MATSUMOTO T, et al. Finding the correspondence of audiovisual events by object manipulation[J]. Electronics and Communications, 2009, 92(5): 1-13.
  • 5BREDIN H, CHOLLET G. Audiovisual speech synchrony measure: application to biometrics[J]. EURASIP Journal on Advances in Signal Processing, 2007, (3): 1-11.
  • 6GILLET O, ESSID S, RICHARD G. On the correlation of automatic audio and visual segmentations of music videos[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2007, 17(3): 347-355.
  • 7LIU Y Y, SATO Y. Recovery of audio-to-video synchronization through analysis of cross-modality correlation[J]. Pattern Recognition Letters, 2010, 31(8): 696-701.
  • 8ENRIQUE A R, BREDIN H, GARCIA M C, et al. Audio-visual speech asynchrony detection using co-inertia analysis and coupled hidden Markov models[J]. Pattern Analysis & Applications, 2009, 9(12): 271-284.
  • 9EVENO N, BESACIER L. Co-inertia analysis for "liveness" test in audio-visual biometrics[A]. Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis[C]. Zagreb, Croatia, 2005. 257-261.
  • 10KUMAR K, NAVRATIL J, MARCHERET E, et al. Audio-visual speech synchronization detection using a bimodal linear prediction model[A]. 2009 IEEE Conference on Computer Vision and Pattern Recognition[C]. 2009. 53-59.

Co-cited references (25)

  • 1吴张顺, 张珣. Research and implementation of video encoding and storage based on FFmpeg[J]. Journal of Hangzhou Dianzi University (Natural Sciences), 2006, 26(3): 30-34. (Cited by: 33)
  • 2马燕, 李存, 李晓勇, 刘海涛. Design and implementation of a multimedia player based on the ARM platform[J]. Computer Engineering, 2006, 32(24): 221-222. (Cited by: 7)
  • 3MI Faraj, J Bigun. Synergy of lip-motion and acoustic features in biometric speech and speaker recognition[J]. IEEE Transactions on Computers, 2007, 56(9): 1169-1175.
  • 4S Kumagai, K Doman, et al. Detection of inconsistency between subject and speaker based on the co-occurrence of lip motion and voice towards speech scene extraction from news videos[A]. IEEE International Symposium on Multimedia[C]. California: IEEE, 2011. 311-318.
  • 5M Slaney, M Covell. FaceSync: a linear operator for measuring synchronization of video facial images and audio tracks[A]. Neural Information Processing Systems[C]. Denver: NIPSF, 2000. 814-820.
  • 6N Eveno, L Besacier. A speaker independent "liveness" test for audio-visual biometrics[A]. Ninth European Conference on Speech Communication and Technology[C]. Lisbon: ISCA, 2005. 3081-3084.
  • 7G Chollet, R Landais, et al. Some experiments in audio-visual speech processing[A]. Non-Linear Speech Processing 2007[C]. Paris: ISCA, 2007. 28-56.
  • 8A Sayo, Y Kajikawa, et al. Biometrics authentication method using lip motion in utterance[A]. 8th International Conference on Information, Communications and Signal Processing[C]. Singapore: IEEE, 2011. 1-5.
  • 9AA EL-Sallam, AS Mian. Correlation based speech-video synchronization[J]. Pattern Recognition Letters, 2011, 32(6): 780-786.
  • 10B Goswami, C Chan, et al. Speaker authentication using video-based lip information[A]. IEEE International Conference on Acoustics, Speech, and Signal Processing[C]. Prague: IEEE, 2011. 1908-1910.

Citing documents (2)

Secondary citing documents (11)
