Abstract: Driven by the demand for immersive communication, region-of-interest (ROI) based high efficiency video coding (HEVC) approaches for conferencing scenarios have recently become increasingly important. However, no objective metric has been developed specifically for efficiently evaluating the perceived visual quality of video conferencing coding. This paper therefore proposes a novel objective quality assessment method, the Gaussian mixture model based peak signal-to-noise ratio (GMM-PSNR), for perceptual video conferencing coding. First, eye tracking experiments, together with a real-time technique for face and facial feature extraction, are introduced. In the experiments, the importance of the background, face, and facial feature regions is identified and then quantified based on eye fixation points over test videos. Next, assuming that the distribution of eye fixation points obeys a Gaussian mixture model, the expectation-maximization (EM) algorithm is used to generate an importance weight map for each frame of video conferencing coding, in light of a new term, eye fixation points/pixel (efp/p). According to the generated weight map, GMM-PSNR is developed for quality assessment by assigning different weights to the distortion of each pixel in the video frame. Finally, experiments investigate the correlation of the proposed GMM-PSNR and other conventional objective metrics with subjective quality metrics. The experimental results show the effectiveness of GMM-PSNR.
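The weighting idea in this abstract can be illustrated with a small sketch. The abstract does not give the exact GMM-PSNR formula, so the code below is only an assumption: it evaluates an isotropic Gaussian mixture at every pixel to obtain a weight map and then computes a PSNR from the weight-normalized squared error. The function names, the isotropic components, and the mixture parameters are hypothetical; in the paper the mixture is fitted to eye fixation points with EM, which is not reproduced here.

```python
import math

def gmm_weight_map(height, width, components):
    """Evaluate an isotropic 2-D Gaussian mixture at every pixel.

    components: list of (mixing_weight, center_row, center_col, sigma).
    These are hypothetical stand-ins for parameters that EM would fit
    to eye fixation points.
    """
    weights = []
    for r in range(height):
        row = []
        for c in range(width):
            density = 0.0
            for pi_k, mu_r, mu_c, sigma in components:
                d2 = (r - mu_r) ** 2 + (c - mu_c) ** 2
                norm = 2 * math.pi * sigma ** 2
                density += pi_k * math.exp(-d2 / (2 * sigma ** 2)) / norm
            row.append(density)
        weights.append(row)
    return weights

def weighted_psnr(reference, distorted, weights, peak=255.0):
    """PSNR computed from a weight-normalized mean squared error (assumed form)."""
    num = 0.0
    den = 0.0
    for ref_row, dis_row, w_row in zip(reference, distorted, weights):
        for ref, dis, w in zip(ref_row, dis_row, w_row):
            num += w * (ref - dis) ** 2
            den += w
    wmse = num / den
    if wmse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / wmse)
```

With a single component centered on the frame, the same pixel error lowers the score more when it falls at the center than when it falls in a corner, which is the behavior the weight map is meant to capture. With uniform weights the function reduces to ordinary PSNR.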
Funding: Partially supported by the Research Grants Council of the Hong Kong SAR, China (Project CUHK 415712) and the Ministry of Education Academic Research Fund (AcRF) Tier 2 in Singapore under Grant No. T208B1218.
Abstract: Quality assessment is essential for testing, optimizing, benchmarking, monitoring, and inspecting related systems and services, and it also plays an essential role in the design of virtually all visual signal processing and communication algorithms, as well as in various related decision-making processes. In this paper, we first provide an overview of recently derived quality assessment approaches for traditional visual signals (i.e., 2D images/videos), highlighting new trends such as machine learning approaches. Meanwhile, with the ongoing development of devices and multimedia services, newly emerged visual signals (e.g., mobile/3D videos) are becoming more and more popular. This work therefore also reviews recent progress in quality metrics for these newly emerged forms of visual signals, which include scalable and mobile videos, High Dynamic Range (HDR) images, image segmentation results, 3D images/videos, and retargeted images.
Funding: Supported by Innovate UK, which is a part of UK Research & Innovation, under the Knowledge Transfer Partnership (KTP) program (Project No. 11433), and by the Grand Information Technology Research Center Program through the Institute of Information & Communications Technology and Planning & Evaluation (IITP), funded by the Ministry of Science and ICT (MSIT), Korea (IITP-2020-2020-0-01612).
Abstract: With the advent of services such as telemedicine and telesurgery, continuous quality monitoring for these services has become a challenge for network operators. Quality standards for such services are application-specific, as medical imagery is quite different from general-purpose images and videos. This paper presents a novel full-reference objective video quality metric that focuses on estimating the quality of wireless capsule endoscopy (WCE) videos containing bleeding regions. The research focuses on bleeding regions in the gastrointestinal tract, as bleeding is one of the major reasons behind several diseases within the tract. The method jointly estimates the diagnostic as well as the perceptual quality of WCE videos, and its predictions are in high correlation with the subjective differential mean opinion scores (DMOS). The proposed method combines motion quality estimates, bleeding-region quality estimates based on a support vector machine (SVM), and perceptual quality estimates using the pristine and impaired WCE videos. Our method, Quality Index for Bleeding Regions in Capsule Endoscopy (QI-BRiCE) videos, is one of a kind, and the results show high correlation in terms of Pearson's linear correlation coefficient (PLCC) and Spearman's rank order correlation coefficient (SROCC). An F-test is also provided in the results section to demonstrate the statistical significance of the proposed method.
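PLCC and SROCC, the two agreement measures named in this abstract, can be computed as in the following minimal sketch. It uses only the standard definitions (Pearson correlation of the raw scores, and Pearson correlation of average ranks for Spearman); the function names and the sample scores in the usage note are illustrative and do not come from the paper.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank order correlation coefficient (SROCC)."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, `spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])` is 1 because the relation is monotonic, while `pearson` on the same lists is below 1 because it is not linear. This is why both coefficients are usually reported together.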
Funding: Supported in part by ZTE Industry-University-Institute Cooperation Funds.
Abstract: With the rapid development of immersive multimedia technologies, 360-degree video services have quickly gained popularity, and ensuring sufficient spatial presence for end users viewing 360-degree videos has become a new challenge. In this regard, accurately measuring users' sense of spatial presence is of fundamental importance for video service providers seeking to improve their service quality. Unfortunately, there is so far no efficient evaluation model for measuring the sense of spatial presence for 360-degree videos. In this paper, we first design an assessment framework to clarify the factors that influence spatial presence. Parameters of both the 360-degree videos and the head-mounted display devices are considered in this framework. Well-designed subjective experiments are then conducted to investigate the impact of the various influencing factors on the sense of presence. Based on the subjective ratings, we propose a spatial presence assessment model that can be easily deployed in 360-degree video applications. To the best of our knowledge, this is the first attempt in the literature to establish a quantitative spatial presence assessment model using technical parameters that are easily extracted. Experimental results demonstrate that the proposed model can reliably predict the sense of spatial presence.
Funding: Supported by the High Level Talent Research Project in Huaqiao University (14BS214).
Abstract: Objective assessment of network video quality is challenging because video quality is degraded by various factors, including transmission and compression. To improve objective assessment, a method based on a Mamdani fuzzy inference system is proposed. First, six quality parameters are introduced, and all of them are input to a fuzzy logic controller system. Second, the outputs are used as the inputs of another fuzzy logic controller system, which infers the objective quality of the network video. Last, the performance of the proposed method is validated on four videos under different network environments and compared with other methods. The experimental results show that the proposed method improves the agreement between subjective and objective assessment.
Funding: Supported by Innovate UK, which is a part of UK Research & Innovation, and Pangea Connected Ltd., under the Knowledge Transfer Partnership (KTP) program (Project No. 11433).
Abstract: Video compression in medical video streaming is one of the key technologies associated with mobile healthcare. Seamless delivery of medical video streams over a resource-constrained network emphasizes the need for a video codec that requires minimum bitrates and maintains high perceptual quality. This paper presents a comparative study between High Efficiency Video Coding (HEVC) and its potential successor Versatile Video Coding (VVC) in the context of healthcare. A large-scale subjective experiment comprising twenty-four non-expert participants is presented for eight different test conditions in Full High Definition (FHD) videos. The presented analysis highlights the impact of compression artefacts on the perceptual quality of HEVC and VVC processed videos. Our results and findings show that VVC clearly outperforms HEVC in terms of achieving higher compression while maintaining high quality in FHD videos. VVC requires up to 40% less bitrate for encoding an FHD video at excellent perceptual quality. We have provided rate-quality curves for both encoders and a degree of overlap across both codecs in terms of perceptual quality. Overall, there is a 71% degree of overlap in terms of quality between VVC and HEVC compressed videos for eight different test conditions.
Funding: Project (No. YJCB2003017MU) supported by Huawei Technology Fund, China.
Abstract: A new no-reference blocking artifact metric for B-DCT compressed video is presented in this paper. We first present a new definition of blocking artifact and a new method for measuring perceptual blocking artifacts based on the HVS, taking into account its luminance masking and activity masking characteristics. Then, we propose a new concept of a blocking artifact cluster and an algorithm for clustering blocking artifacts. Considering eye movement and fixation, we select the several clusters with the most serious blocking artifacts and use the average of their blocking artifacts to assess the total blocking artifact of B-DCT reconstructed video. Experimental results illustrating the performance of the proposed method are presented and evaluated.
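As a rough illustration of what a no-reference blocking measure looks at, the sketch below compares luminance steps that straddle block boundaries with steps inside blocks. This is a basic boundary-discontinuity ratio, not the metric proposed in the abstract: it omits the HVS luminance and activity masking, the clustering step, and the fixation-based cluster selection. The function name, the 8-pixel block size, and the horizontal-only scan are assumptions made for brevity.

```python
def blockiness(frame, block=8):
    """Ratio of the mean absolute luminance step across vertical block
    boundaries to the mean step at all other horizontal positions.

    frame: list of rows of luminance values, wider than `block` pixels.
    The 8-pixel block size is an assumed default.
    """
    boundary_steps = []
    interior_steps = []
    for row in frame:
        for c in range(len(row) - 1):
            step = abs(row[c + 1] - row[c])
            if (c + 1) % block == 0:
                boundary_steps.append(step)  # step straddles a block edge
            else:
                interior_steps.append(step)
    mean_boundary = sum(boundary_steps) / len(boundary_steps)
    mean_interior = sum(interior_steps) / len(interior_steps)
    # Small constant avoids division by zero on perfectly flat blocks.
    return mean_boundary / (mean_interior + 1e-6)
```

A ratio near 1 means block edges are no more visible than ordinary texture; values far above 1 indicate discontinuities concentrated on the block grid.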
Abstract: Medical video repositories play important roles in many health-related areas such as medical imaging, medical research and education, medical diagnostics, and the training of medical professionals. Given the increasing availability of digital video data, indexing, annotating, and retrieving the information are crucial. Since performing these processes is both computationally expensive and time-consuming, automated systems are needed. In this paper, we present a medical video segmentation and retrieval research initiative. We describe the key components of the system, including a video segmentation engine, an image retrieval engine, and an image quality assessment module. The aim of this research is to provide an online tool for indexing, browsing, and retrieving neurosurgical videotapes. This tool will allow people to retrieve the specific information they are interested in from a long videotape instead of looking through the entire content.
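Abstract: Videos captured in real-world scenes contain various unknown distortion types and have no reference videos, so assessing their quality is a highly challenging task. In recent years, researchers have incorporated prior knowledge of the human visual system into quality assessment. Building on this, a no-reference video quality assessment method that accounts for background distortion is proposed. While taking video content into account, the method markedly increases sensitivity to information loss in the video background and gives full consideration to background features in the feature extraction stage. Next, a channel mining technique combined with a gating mechanism is introduced to efficiently integrate high- and low-dimensional features, so that the feature channels focus more precisely on background distortion details. Finally, a temporal modeling module builds a time-dimension model of the features, and linear regression produces an objective quantitative score of video quality. Experiments were conducted on the public datasets KoNViD-1k, LIVE-Qualcomm, and CVD2014 using the Spearman rank order correlation coefficient (SROCC), the Pearson linear correlation coefficient (PLCC), and the root mean squared error (RMSE) as evaluation metrics. The results show that the method correlates highly with human subjective perception and has a small prediction error, which improves the accuracy and reliability of video quality assessment and more closely models human intuitive judgment of video quality.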
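Abstract: Six degrees of freedom (6DoF) video allows users to experience a scene immersively from all directions and from arbitrary viewpoints, and it is the development direction of the next-generation immersive video industry. Windowed 6DoF video, in which some degrees of freedom are restricted, has become a research hotspot in recent years. This paper proposes a subjective database and an objective quality assessment method for windowed 6DoF synthesized video. For the subjective database, a windowed 6DoF synthesized video subjective quality database, Windowed-6DoF, is constructed, containing two types of interaction-path discomfort distortion, four types of rendering distortion, and four types of compression distortion, and subjective quality tests and an analysis of the results are carried out. For objective quality assessment, a no-reference objective quality assessment method for windowed 6DoF synthesized video that fuses multi-level features is designed. Chebyshev moments are used to extract low-level shape features on temporal slices of the video; a Resnet-50 network is used to extract high-level semantic features in the temporal and spatial domains, followed by dimensionality reduction; finally, a random forest fuses the low-level shape features and the high-level semantic features and is trained to obtain the objective quality assessment model for windowed 6DoF synthesized video. Test results on the proposed Windowed-6DoF database and the public IRCCyN/IVC DIBR database show that the Pearson linear correlation coefficients of the scores predicted by the proposed method reach 0.9327 and 0.8581, respectively, in good agreement with the subjective scores.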
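Abstract: Background: Existing medical image compression techniques are optimized for mean squared error, which does not fully reflect human subjective perception of medical image quality and leaves a gap relative to the preservation of structural features required for clinical diagnosis. Objective: To propose a low-loss compression coding algorithm oriented to the fine features of medical images, aiming to reduce transmission bandwidth without degrading subjective quality. Methods: CT image sequences from 14 orthopedic surgery cases at the Chinese PLA General Hospital were collected. First, the structural similarity index measure (SSIM) was reconstructed based on key visual features of medical images such as luminance, contrast, and detail texture, with luminance factor α=1.15 and contrast/structure factors β=γ=0.95. Then, based on a linear distortion model and the law of large numbers, a relation between SSIM and mean squared error was established. Next, 1/SSIM was taken as the image distortion measure to construct an SSIM distortion metric suitable for rate-distortion optimization. On this basis, the distortion measure was minimized under a target rate constraint, establishing an SSIM-based rate-distortion optimization framework. Finally, on the x264 platform, the proposed method was compared with the standard encoder to verify its advantage in rate-distortion performance. Results: Compared with the standard x264 encoder, the method achieved average rate-distortion gains of -5.2% under constant quantization parameter and -4.8% under constant quality factor. In terms of subjective quality, the SSIM of the images before and after encoding was >0.95 in all cases, the bitrate decreased by 372 kbps on average, and the encoding time complexity did not increase. Conclusion: The proposed method maintains high perceptual quality of medical images while keeping computational complexity under control, providing a better compression coding scheme for medical image transmission.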
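The distortion measure described in the Methods can be sketched as follows. The abstract does not state how the factors α, β, and γ enter the index, so this example assumes they are the exponents of the luminance, contrast, and structure terms in the general SSIM form, computed over a single window. The Lagrangian cost `distortion + lam * bits` is the usual way to express minimizing distortion under a rate constraint and is likewise an assumption, as are the function names and the clipping of the structure term.

```python
import math

def weighted_ssim(x, y, alpha=1.15, beta=0.95, gamma=0.95, peak=255.0):
    """Single-window SSIM with exponents on its three terms (assumed form).

    x, y: flat lists of non-negative pixel values of equal length.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x, std_y = math.sqrt(var_x), math.sqrt(var_y)
    # Standard SSIM stabilizing constants.
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    c3 = c2 / 2
    luminance = (2 * mean_x * mean_y + c1) / (mean_x ** 2 + mean_y ** 2 + c1)
    contrast = (2 * std_x * std_y + c2) / (var_x + var_y + c2)
    structure = (cov + c3) / (std_x * std_y + c3)
    # Clip so the fractional power stays defined for anti-correlated windows.
    structure = max(structure, 1e-6)
    return luminance ** alpha * contrast ** beta * structure ** gamma

def rd_cost(x, y, bits, lam):
    """Lagrangian cost with 1/SSIM as the distortion term (assumed form)."""
    return 1.0 / weighted_ssim(x, y) + lam * bits
```

An encoder using such a cost would, for each candidate coding mode, reconstruct the block, evaluate `rd_cost` against the original, and keep the mode with the lowest value. How the paper integrates this into x264 is not described in the abstract.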