Although ray tracing produces high-fidelity, realistic images, it is computationally burdensome on systems that require a high rendering rate. Perception-driven rendering techniques reduce rendering costs by generating images whose residual noise and distortion are generally acceptable to the human visual system. In this paper, we introduce a perception-entropy-driven temporal reuse method to accelerate real-time ray tracing. We first build a just noticeable difference (JND) model to represent the uncertainty of ray samples and image-space masking effects. Then, we expand the shading gradient through gradient max-pooling and gradient filtering to enlarge the visual receptive field. Finally, we dynamically optimize reusable time segments to improve the accuracy of temporal reuse. Compared with Monte Carlo ray tracing, our algorithm increases frames per second (fps) by 1.93× to 2.96× at 8 to 16 samples per pixel, significantly accelerating Monte Carlo ray tracing while maintaining visual quality.
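The gradient max-pooling step mentioned above can be illustrated with a minimal sketch. The abstract does not give the window size, the stride, or how the pooled gradient feeds the reuse decision, so the square window, the `radius` parameter, and the function name `max_pool_gradient` are assumptions made purely for illustration.

```python
def max_pool_gradient(grad, radius=1):
    """Replace each value with the maximum over a (2*radius+1)^2 window.

    A large shading gradient therefore also marks its neighbors,
    which widens the region treated as "changing". Stride is 1 and
    the window is clipped at the image border (both assumptions).
    """
    h, w = len(grad), len(grad[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                grad[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = max(window)
    return out


# Toy 4x4 gradient map with one strong and one weak response.
grad = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.1],
]
pooled = max_pool_gradient(grad, radius=1)
```

In this toy map the strong response at the second row and column spreads to its eight neighbors, while pixels farther away keep their own local maximum.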
A novel adaptive digital image watermarking algorithm is proposed. Fuzzy c-means clustering (FCM) is used to classify the original image blocks into two classes based on several characteristic parameters of the human visual system (HVS): one class is suited for embedding a digital watermark and the other is not. Only the suitable blocks are then selected for embedding. The watermark is embedded in the middle-frequency part of the host image using the HVS together with the discrete cosine transform (DCT). The maximal watermark strength is fixed according to frequency masking. At the same time, to improve performance, the watermark is modulated into a fractal modulation array. The simulation results show that the hidden watermark can be extracted effectively, that the algorithm is robust to common signal distortion and geometric distortion, and that the quality of the watermarked image is preserved.
An effective blind digital watermarking algorithm based on neural networks in the wavelet domain is presented. First, the host image is decomposed by a wavelet transform. The significant wavelet coefficients are selected according to human visual system (HVS) characteristics, and watermark bits are added to them. Neural networks are then trained to learn the characteristics of the embedded watermark in relation to these coefficients. Because of the learning and adaptive capabilities of neural networks, the trained networks recover the watermark from the watermarked image almost exactly. Experimental results and comparisons with other techniques demonstrate the effectiveness of the new algorithm.
Because there is a lack of methodology for assessing the performance of defogging algorithms, and the existing assessment methods have limitations, three new assessment methods are proposed. The first uses a synthetic foggy image, simulated with an image degradation model, to assess the defogging algorithm in a full-reference way; in this method, the absolute difference is computed between the synthetic image with and without fog. The other two assess the defogging algorithm in a no-reference way, either by computing the fog density of a gray-level image or by constructing an assessment system for color images based on human visual perception. For these methods, an assessment function is defined so that algorithm performance can be evaluated from the function value. Comparisons of defogging algorithms demonstrate the effectiveness and reliability of the proposed methods.
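The full-reference idea above can be sketched as follows. The abstract does not state which degradation model is used, so this sketch assumes the commonly used atmospheric scattering form I = J·t + A·(1 − t) with a constant transmission t and airlight A. The function names and the use of the mean absolute difference as the score are likewise illustrative assumptions.

```python
def add_fog(clear, transmission, airlight=1.0):
    """Synthesize fog with I = J*t + A*(1 - t) (assumed degradation model)."""
    return [
        [j * transmission + airlight * (1.0 - transmission) for j in row]
        for row in clear
    ]


def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference between two equally sized images."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


# Intensities in [0, 1]; the clear image is the known reference.
clear = [[0.2, 0.4], [0.6, 0.8]]
foggy = add_fog(clear, transmission=0.5)

# A defogging result is scored against the fog-free reference:
# an untouched foggy image scores poorly, a perfect restoration scores 0.
score_untouched = mean_abs_diff(clear, foggy)
score_perfect = mean_abs_diff(clear, clear)
```

Because the fog-free image is known by construction, any defogging output can be compared against it directly, which is what makes this variant full-reference.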
To overcome the shortcomings of the Lee image enhancement algorithm and its improvement based on the logarithmic image processing (LIP) model, this paper proposes what we believe to be an effective image enhancement algorithm. The algorithm introduces fuzzy entropy and makes full use of neighborhood information, fuzzy information, and human visual characteristics. To enhance an image, it first carries out a fuzzy 3-partition of the histogram into a dark region, an intermediate region, and a bright region. It then extracts the statistical characteristics of the three regions and adaptively selects the parameter α according to the statistical characteristics of the image’s gray-scale values. It also adds a nonlinear transform, which broadens the applicability of the algorithm. Finally, the causes of the gray-scale value overcorrection that occurs in traditional image enhancement algorithms are analyzed and solutions are proposed. The simulation results show that the algorithm can effectively suppress image noise, enhance contrast and visual effect, sharpen edges, and adjust the dynamic range.
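For readers unfamiliar with the LIP model referred to above, the sketch below shows its two basic operations in their standard textbook form: LIP addition f ⊕ g = f + g − fg/M and LIP scalar multiplication α ⊗ f = M − M(1 − f/M)^α. The abstract does not spell these out, and the enhancement algorithm itself (fuzzy partition, adaptive choice of α) is not reproduced here. The bound `M = 256` and the function names are assumptions.

```python
M = 256.0  # upper bound of the gray-tone range (assumed 8-bit images)


def lip_add(f, g):
    """LIP addition: stays below M for inputs in [0, M)."""
    return f + g - f * g / M


def lip_scale(alpha, f):
    """LIP scalar multiplication by a real factor alpha."""
    return M - M * (1.0 - f / M) ** alpha


# Adding a value to itself equals scaling by 2 in the LIP sense,
# and the result never leaves the valid range [0, M).
doubled_by_add = lip_add(200.0, 200.0)
doubled_by_scale = lip_scale(2.0, 200.0)
```

Ordinary addition of 200 + 200 would overflow an 8-bit range, whereas the LIP sum stays bounded, which is the property that makes the model attractive for enhancement.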
A Robust Adaptive Video Encoder (RAVE) based on a human visual model is proposed. The encoder combines the best features of Fine Granularity Scalable (FGS) coding, frame-dropping coding, video redundancy coding, and the human visual model. According to the packet loss and available bandwidth of the network, the encoder adjusts the output bit rate by jointly adapting the quantization step size guided by the human visual model, rate shaping, and periodic key-frame insertion. The proposed encoder is implemented on top of an MPEG-4 encoder and compared with a conventional FGS algorithm. RAVE is shown to be an efficient, robust video encoder that provides improved visual quality for the receiver while consuming equal or fewer network resources. The results are confirmed by subjective tests and simulations.
This letter proposes a new image quality measure, Modulate Quality based on Fixation Points (MQFP), built on a Human Visual System (HVS) model. Unlike earlier HVS-based quality assessment, the new measure emphasizes modeling the jumping phenomenon of human sight rather than human visual perception. In other words, it models the HVS using fixation points and stay-frequency instead of the Contrast Sensitivity Function (CSF) and similar tools, which model the visual perception of the HVS. Experiments on images with various frequency distortions indicate that the new measure correlates better with subjective judgment than the earlier HVS-based measures and is robust.
Visual odometry is critical in visual simultaneous localization and mapping for robot navigation. However, the pose estimation performance of most current visual odometry algorithms degrades in scenes with unevenly distributed features because dense features receive excessive weight. Herein, a new human visual attention mechanism for point-and-line stereo visual odometry, called point-line-weight-mechanism visual odometry (PLWM-VO), is proposed to describe scene features in a global and balanced manner. A weight-adaptive model based on region partition and region growth is generated for the human visual attention mechanism, in which sufficient attention is assigned to position-distinctive objects (sparse features in the environment). Furthermore, the sum of absolute differences algorithm is used to improve the accuracy of initialization for line features. Compared with the state-of-the-art method (ORB-VO), PLWM-VO shows a 36.79% reduction in the absolute trajectory error on the KITTI and EuRoC datasets. Although PLWM-VO takes more time than ORB-VO, online test results indicate that it satisfies the real-time demand. The proposed algorithm not only significantly improves the environmental adaptability of visual odometry but also quantitatively demonstrates the superiority of the human visual attention mechanism.
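The sum of absolute differences (SAD) named above is a generic patch-matching cost. The sketch below applies it to a one-dimensional stereo search along a scanline. How PLWM-VO actually uses SAD for line-feature initialization is not described in the abstract, so the search procedure, the parameters `half` and `max_disp`, and the function names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally long patches."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_disparity(left_row, right_row, x, half, max_disp):
    """Find the shift d that best aligns the patch around left_row[x]
    with right_row, searching d = 0..max_disp (illustrative search only)."""
    patch = left_row[x - half : x + half + 1]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        start = x - d - half
        if start < 0:  # candidate patch would leave the image
            break
        cost = sad(patch, right_row[start : start + 2 * half + 1])
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost


# The bright blob sits two pixels further left in the right image.
left = [10, 10, 10, 80, 90, 80, 10, 10]
right = [10, 80, 90, 80, 10, 10, 10, 10]
disparity, cost = best_disparity(left, right, x=4, half=1, max_disp=3)
```

SAD is cheap because it needs only subtractions and additions, which is one reason it is a common choice when matching must run in real time.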
To further improve the efficiency of video compression, we introduce perceptual characteristics of the Human Visual System (HVS) into video coding and propose a novel rate control algorithm for H.264/AVC based on a human visual saliency model. First, we modify Itti's saliency model. Second, the target bits of each frame are allocated according to the correlation of the salient regions between the current and previous frames, and the complexity of each macroblock (MB) is adjusted using its saliency value and its Mean Absolute Difference (MAD) value. Finally, the algorithm is implemented in JVT JM12.2. Simulation results show that, compared with the traditional rate control algorithm, the proposed one reduces the coding bit rate and improves the subjective quality of the reconstructed video, especially in visually salient regions. It is well suited to wireless video transmission.
The key to wavelet-based denoising techniques is how to manipulate the wavelet coefficients. Borrowing the idea of Inclusive-OR from circuit design, this paper proposes a new algorithm called the wavelet domain Inclusive-OR denoising algorithm (WDIDA), which distinguishes wavelet coefficients belonging to the image from those belonging to noise by considering their phases and modulus maxima simultaneously. With this algorithm, the denoising effect is improved and the computation time is reduced. Furthermore, to enhance the edges of the image without magnifying noise, a nonlinear contrast enhancement algorithm is presented according to human visual properties. Compared with traditional enhancement algorithms, the proposed algorithm reduces noise better while preserving edges and improving the visual quality of images.
Image dehazing is still an open research topic that has been undergoing considerable development, especially with the renewed interest in machine learning-based methods. A major challenge for existing dehazing methods is the estimation of transmittance, which is the key element of haze-affected imaging models. Conventional methods rely on a set of assumptions that reduce the solution search space. However, the accumulation of these assumptions tends to restrict the solutions to particular cases that cannot account for the reality of the observed image. In this paper, we reduce the number of simplifying hypotheses in order to attain a more plausible and realistic solution by exploiting a priori knowledge of the ground truth. The proposed method relies on pixel information between the ground truth and the hazy image to reduce these assumptions. This is achieved by using the ground truth and the hazy image to find geometric pixel information through a guided Convolutional Neural Network (CNN) with a Parallax Attention Mechanism (PAM). It uses the differential pixel-based variance to estimate transmittance. The pixel variance uses local and global patches between the assumed ground truth and the hazy image to refine the transmission map. The transmission map is further improved using improved Markov random field (MRF) energy functions. We used different images to test the proposed algorithm. The entropy values of the proposed method were 7.43 and 7.39, increases of 4.35% and 5.42%, respectively, over the best existing results. The improvement is similar for other performance quality metrics, which supports its superiority over existing methods in terms of key image quality evaluation metrics. The drawback of the proposed approach, an over-reliance on real ground truth images, is also investigated. The proposed method shows more detail and hence yields better images than the existing state-of-the-art methods.
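The entropy figures quoted above are most naturally read as the Shannon entropy of the gray-level histogram, measured in bits, which is at most 8 for an 8-bit image. The abstract does not define the metric, so that reading is an assumption, and the function name `image_entropy` is illustrative.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy (bits) of the gray-level distribution of a flat pixel list."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# A constant image carries no information; four equally likely
# gray levels carry exactly 2 bits per pixel.
flat = [128] * 16
four_levels = [0, 85, 170, 255] * 4
entropy_flat = image_entropy(flat)
entropy_four = image_entropy(four_levels)
```

Under this reading, a higher entropy after dehazing indicates that the restored image uses a richer spread of gray levels, which is commonly taken as a sign of recovered detail.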
An improved Fine Granular Scalability (FGS) coding method based on human visual characteristics is proposed in this letter. The method adjusts the FGS coding frame rate according to an evaluation of the video sequences so as to improve the coding efficiency and the subjectively perceived quality of the reconstructed images. Finally, a fine granular joint source-channel coding scheme is proposed based on this source coding method; it not only uses network resources efficiently but also guarantees reliable transmission of video information.
While quality assessment is essential for testing, optimizing, benchmarking, monitoring, and inspecting related systems and services, it also plays an essential role in the design of virtually all visual signal processing and communication algorithms, as well as various related decision-making processes. In this paper, we first provide an overview of recently derived quality assessment approaches for traditional visual signals (i.e., 2D images/videos), highlighting new trends (such as machine learning approaches). Meanwhile, with the ongoing development of devices and multimedia services, newly emerged visual signals (e.g., mobile/3D videos) are becoming increasingly popular. This work therefore also reviews recent progress in quality metrics for these newly emerged forms of visual signals, which include scalable and mobile videos, High Dynamic Range (HDR) images, image segmentation results, 3D images/videos, and retargeted images.
This work presents a robust and rotationally invariant shape descriptor, namely perception pronouncement (called p2), to mathematically model eye fixations. p2 takes two criteria into account: the local consideration of surface curvature and the global consideration of view-independent visibility. Unlike existing works, which often compute the intrinsic surface property of visibility in image space, a novel approach is proposed to approximate the attribute in object space using the Gauss map and ray tracing. With the presented shape descriptor, mesh saliency detection, which refers to reasoning about which regions or points of a surface are important, is more sensible, especially when 3D models fall into two categories: (1) models that possess significant interior/exterior structures; (2) models that contain regions where the contrast in visibility is high. For models outside these categories, the saliencies achieved by our approach are comparable to or even better than those of state-of-the-art methods.
Existing H.264/AVC rate control schemes rarely include perceptual considerations. As a result, the improvements in visual quality are hardly comparable to those in peak signal-to-noise ratio (PSNR). In this paper, we propose a perceptual importance analysis scheme to accurately abstract the spatial and temporal perceptual characteristics of video content. We then perform bit allocation at the macroblock (MB) level by adopting a perceptual mode decision scheme, which adaptively updates the Lagrangian multiplier for mode decision according to the perceptual importance of each MB. Simulation results show that the proposed scheme can efficiently reduce bit rates without visual quality degradation.
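Lagrangian mode decision picks, for each macroblock, the coding mode that minimizes the cost J = D + λ·R, where D is distortion and R is the bit cost. The sketch below shows how scaling λ by perceptual importance shifts that choice. The update rule `base_lambda / importance` is a hypothetical stand-in, since the abstract does not give the actual formula, and the mode names and numbers are made up for illustration.

```python
def choose_mode(candidates, lam):
    """Return the candidate with the smallest rate-distortion cost D + lam * R."""
    return min(candidates, key=lambda m: m["distortion"] + lam * m["rate"])


def perceptual_lambda(base_lambda, importance):
    """Hypothetical update: important MBs get a smaller multiplier,
    so distortion weighs more heavily than bits in their cost."""
    return base_lambda / importance


# Illustrative (made-up) distortion and rate figures for one macroblock.
modes = [
    {"name": "SKIP", "distortion": 120.0, "rate": 1.0},
    {"name": "INTER16x16", "distortion": 60.0, "rate": 20.0},
    {"name": "INTRA4x4", "distortion": 20.0, "rate": 60.0},
]

chosen = {
    importance: choose_mode(modes, perceptual_lambda(4.0, importance))["name"]
    for importance in (0.5, 2.0, 8.0)
}
```

With these numbers, the least important macroblock is skipped, the moderately important one is coded with the cheaper inter mode, and the most important one receives the costly but low-distortion mode, so bits move toward the regions viewers are most likely to notice.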
Vision-simulated imagery, the process of generating images that mimic the human visual system, is a valuable tool with a wide spectrum of possible applications, including visual acuity measurements, personalized planning of corrective lenses and surgeries, vision-correcting displays, vision-related hardware development, and extended reality discomfort reduction. A critical property of human vision is that it is imperfect because of highly influential wavefront aberrations that vary from person to person. This study provides an overview of the existing computational image generation techniques that properly simulate human vision in the presence of wavefront aberrations. These algorithms typically apply ray tracing with a detailed description of the simulated eye or use the point-spread function of the eye to perform convolution on the input image. Based on the description of the vision simulation techniques, several of their characteristic features are evaluated, and some potential application areas and research directions are outlined.
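The point-spread-function route described above amounts to a 2D convolution of the input image with the eye's PSF. The sketch below uses a small, made-up PSF and zero padding at the borders; a real simulation would derive the PSF from measured wavefront aberrations, which is beyond what the abstract specifies. The function name `convolve2d` and the kernel values are assumptions.

```python
def convolve2d(image, psf):
    """Direct 2D convolution with zero padding; output has the input's size."""
    h, w = len(image), len(image[0])
    kh, kw = len(psf), len(psf[0])
    cy, cx = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # Convolution flips the kernel relative to correlation.
                    yy, xx = y + cy - j, x + cx - i
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * psf[j][i]
            out[y][x] = acc
    return out


# Made-up, slightly asymmetric PSF that sums to 1 (preserves total energy).
psf = [
    [0.0, 0.1, 0.0],
    [0.1, 0.5, 0.2],
    [0.0, 0.1, 0.0],
]

# A single bright point: its blurred image is the PSF itself.
point = [[0.0] * 5 for _ in range(5)]
point[2][2] = 1.0
blurred = convolve2d(point, psf)
```

Convolving a point source reproduces the PSF centered on that point, which is a quick sanity check for any implementation of this approach.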
Objective image quality assessment (IQA), which can automatically and efficiently predict the perceived quality of images, plays an important role in various visual communication systems. The human eye is the ultimate evaluator of visual experience, so modeling the human visual system (HVS) is a core issue for objective IQA and visual experience optimization. Traditional models based on black-box fitting have low interpretability and are difficult to use for guiding experience optimization, while models based on physiological simulation are hard to integrate into practical visual communication services because of their high computational complexity. To bridge the gap between signal distortion and visual experience, we propose a novel perceptual no-reference (NR) IQA algorithm based on structural computational modeling of the HVS. Following the mechanism of the human brain, we divide visual signal processing into a low-level visual layer, a middle-level visual layer, and a high-level visual layer, which perform pixel information processing, primitive information processing, and global image information processing, respectively. Natural scene statistics (NSS) based features, deep features, and free-energy based features are extracted from these three layers. Support vector regression (SVR) is employed to aggregate the features into the final quality prediction. Extensive experimental comparisons on three widely used benchmark IQA databases (LIVE, CSIQ, and TID2013) demonstrate that the proposed metric is highly competitive with or outperforms state-of-the-art NR IQA measures.
In this paper we propose a semi-fragile watermarking scheme that can be used for image authentication. The original image is decomposed by an l-level discrete wavelet transform. An approximate wavelet coefficient matrix of the original image and real-valued chaotic sequences are then used to generate a content-based and secure watermark. The watermark is embedded into the original image using the HVS. Tamper detection can identify the tampered region of the received watermarked image. Experimental results are given.
The technology for image-to-image style transfer (a prevalent image processing task) has developed rapidly. The purpose of style transfer is to extract a texture from the source image domain and transfer it to the target image domain using a deep neural network. However, existing methods typically have a large computational cost. To achieve efficient style transfer, we introduce a novel Ghost module into the GANILLA architecture to produce more feature maps from cheap operations. We then use an attention mechanism to transform images with various styles. We optimize the original generative adversarial network (GAN) by using more efficient calculation methods for image-to-illustration translation. The experimental results show that the proposed method produces results that agree with human visual perception while maintaining image quality. Moreover, the proposed method overcomes the high computational cost and resource consumption of style transfer. Subjective and objective evaluation indicators both show that the proposed method outperforms existing methods.
Traditionally, fractal image compression suffers from lengthy encoding times measured in hours. In this paper, a flexible classification technique that incorporates characteristics of the human visual system is proposed. This yields a corresponding adaptive algorithm that can cut the encoding time down to the order of seconds. Experimental results suggest that the algorithm balances overall encoding performance efficiently, that is, it achieves a higher speed and a better PSNR gain.
Funding (perception-entropy-driven ray tracing): Supported by the National Natural Science Foundation of China (No. U19A2063) and the Jilin Provincial Science & Technology Development Program of China (No. 20230201080GX).
Funding (FCM-based adaptive watermarking): Supported by the National Natural Science Foundation of China (10571127) and the Doctoral Foundation of the Ministry of Education of China (20040610004).
Funding (neural-network wavelet watermarking): Supported by the National Natural Science Foundation of China (60473015).
Funding (defogging assessment): Projects (91220301, 61175064, 61273314) supported by the National Natural Science Foundation of China; Project (126648) supported by the Postdoctoral Science Foundation of Central South University, China; Project (2012170301) supported by the New Teacher Fund for School of Information Science and Engineering, Central South University, China.
Funding (fuzzy-entropy image enhancement): Supported by the National Natural Science Foundation of China (61472324).
Funding (RAVE): Supported by the Innovation Fund of China (00C26224210641).
Funding (MQFP): Supported by the National Natural Science Foundation of China (No. 60372068) and the Guangdong Province Science Foundation (No. 011628).
Funding (PLWM-VO): Supported by the Tianjin Municipal Natural Science Foundation of China (Grant No. 19JCJQJC61600) and the Hebei Provincial Natural Science Foundation of China (Grant Nos. F2020202051, F2020202053).
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61070080; 973 Sub-Program Projects under Grant No. 2009CB320906; National Science and Technology Major Special Projects under Grant No. 2010ZX03004-003; S&T Planning Project of Hubei Provincial Department of Education under Grant No. Q20112805; H&S Planning Project of Hubei Provincial Department of Education under Grant No. 2011jyte142; Science Foundation of Hubei Province under Grant No. 2010CDB05103.
Abstract: To further improve the efficiency of video compression, we introduce the perceptual characteristics of the Human Visual System (HVS) into video coding and propose a novel rate control algorithm based on a human visual saliency model in H.264/AVC. First, we modify Itti's saliency model. Second, the target bits of each frame are allocated through the correlation of saliency regions between the current and previous frames, and the complexity of each macroblock (MB) is modified through its saliency value and its Mean Absolute Difference (MAD) value. Finally, the algorithm is implemented in JVT JM12.2. Simulation results show that, compared with the traditional rate control algorithm, the proposed one reduces the coding bit rate and improves the subjective quality of the reconstructed video, especially in visually salient regions. It is well suited to wireless video transmission.
Abstract: The key to wavelet-based denoising techniques is how to manipulate the wavelet coefficients. Borrowing the idea of Inclusive-OR from circuit design, this paper proposes a new algorithm called the wavelet domain Inclusive-OR denoising algorithm (WDIDA), which distinguishes wavelet coefficients belonging to the image from those belonging to noise by considering their phases and modulus maxima simultaneously. With this algorithm, the denoising effect is improved and the computation time is reduced. Furthermore, to enhance the edges of the image without magnifying noise, a nonlinear contrast enhancement algorithm is presented according to human visual properties. Compared with traditional enhancement algorithms, the proposed algorithm reduces noise better, preserves edges, and improves the visual quality of images.
Funding: This work was funded by the Deanship of Scientific Research at Jouf University under Grant No. DSR-2021-02-0398.
Abstract: Image dehazing is still an open research topic that has been undergoing a lot of development, especially with the renewed interest in machine learning-based methods. A major challenge of the existing dehazing methods is the estimation of transmittance, which is the key element of haze-affected imaging models. Conventional methods are based on a set of assumptions that reduce the solution search space. However, the multiplication of these assumptions tends to restrict the solutions to particular cases that cannot account for the reality of the observed image. In this paper, we reduce the number of simplifying hypotheses in order to attain a more plausible and realistic solution by exploiting a priori knowledge of the ground truth. The proposed method relies on pixel information between the ground truth and the hazy image to reduce these assumptions. This is achieved by using the ground truth and the hazy image to find the geometric-pixel information through a guided Convolutional Neural Network (CNN) with a Parallax Attention Mechanism (PAM). It uses the differential pixel-based variance to estimate transmittance. The pixel variance uses local and global patches between the assumed ground truth and the hazy image to refine the transmission map. The transmission map is further improved using improved Markov random field (MRF) energy functions. We used different images to test the proposed algorithm. The entropy values of the proposed method were 7.43 and 7.39, increases of 4.35% and 5.42%, respectively, compared to the best existing results. The increment is similar in other performance quality metrics, which validates its superiority over other existing methods in terms of key image quality evaluation metrics. The proposed approach’s drawback, an over-reliance on real ground truth images, is also investigated. The proposed method shows more detail and hence yields better images than the existing state-of-the-art methods.
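The abstract reports entropy values without defining the metric. Assuming it is the usual Shannon entropy of the gray-level histogram (an assumption, not stated in the source), a minimal sketch looks like this. The function name `image_entropy` is hypothetical.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy, in bits, of the gray levels in a flat pixel sequence."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image has no pixels")
    # H = -sum(p * log2(p)) over the gray levels that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Two equally likely gray levels carry exactly one bit per pixel.
print(image_entropy([0, 0, 255, 255]))
```

Under this definition an 8-bit image can reach at most 8 bits, attained when all 256 gray levels are equally frequent, so values near 7.4 would indicate a histogram that uses most of the available range.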
Funding: Supported by the National Natural Science Foundation of China (No. 90104013) and the 863 Project (2001AA121061).
Abstract: This letter proposes an improved FGS (Fine Granular Scalability) coding method based on human visual characteristics. The method adjusts the FGS coding frame rate according to an evaluation of the video sequence so as to improve the coding efficiency and the subjective perceived quality of the reconstructed images. Finally, a fine granular joint source-channel coding scheme is proposed based on this source coding method, which not only utilizes network resources efficiently but also guarantees reliable transmission of video information.
Funding: Partially supported by the Research Grants Council of the Hong Kong SAR, China (Project CUHK 415712) and the Ministry of Education Academic Research Fund (AcRF) Tier 2 in Singapore under Grant No. T208B1218.
Abstract: While quality assessment is essential for testing, optimizing, benchmarking, monitoring, and inspecting related systems and services, it also plays an essential role in the design of virtually all visual signal processing and communication algorithms, as well as various related decision-making processes. In this paper, we first provide an overview of recently derived quality assessment approaches for traditional visual signals (i.e., 2D images/videos), highlighting new trends (such as machine learning approaches). Meanwhile, with the ongoing development of devices and multimedia services, newly emerged visual signals (e.g., mobile/3D videos) are becoming more and more popular. We therefore also review recent progress in quality metrics for these newly emerged forms of visual signals, which include scalable and mobile videos, High Dynamic Range (HDR) images, image segmentation results, 3D images/videos, and retargeted images.
Funding: Supported by the China Scholarship Council (201206230015), the China NSFC Key Project (61133009), and the National 973 Program of China (2011CB302203).
Abstract: This work presents a robust and rotationally invariant shape descriptor, namely perception pronouncement (called p2), to mathematically model eye fixations. p2 takes two criteria into account: the local consideration of surface curvature and the global consideration of view-independent visibility. Unlike existing works that often compute the intrinsic surface property of visibility in image space, a novel approach is proposed to approximate the attribute in object space using the Gauss map and ray tracing. With the presented shape descriptor, mesh saliency detection, which refers to reasoning about which regions or points of a surface are important, is more sensible, especially when 3D models fall into two categories: (1) the models possess significant interior/exterior structures; (2) the models contain regions where the contrast in visibility is high. For models outside these categories, the saliencies achieved by our approach are comparable to or even better than those of state-of-the-art methods.
Funding: Project supported by the Shanghai Key Laboratory of Digital Media Processing and Transmissions, China.
Abstract: The existing H.264/AVC rate control schemes rarely include perceptual considerations. As a result, the improvements in visual quality are hardly comparable to those in peak signal-to-noise ratio (PSNR). In this paper, we propose a perceptual importance analysis scheme to accurately abstract the spatial and temporal perceptual characteristics of video content. We then perform bit allocation at the macroblock (MB) level by adopting a perceptual mode decision scheme, which adaptively updates the Lagrangian multiplier for mode decision according to the perceptual importance of each MB. Simulation results show that the proposed scheme can efficiently reduce bit rates without visual quality degradation.
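The mode decision described above can be illustrated with the standard Lagrangian rate-distortion cost J = D + λR, where the mode with the lowest J is chosen. The abstract does not give the rule used to update the multiplier, so `perceptual_lambda`, its `strength` parameter, and the candidate numbers below are purely hypothetical. The sketch only shows the general idea that a perceptually important MB gets a smaller λ and therefore a lower-distortion mode.

```python
def choose_mode(candidates, lam):
    """Pick the (name, distortion, rate_bits) tuple minimising J = D + lam * R."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])


def perceptual_lambda(base_lambda, importance, strength=0.5):
    """Hypothetical scaling of the Lagrangian multiplier (not from the paper).

    importance lies in [0, 1]; 0.5 leaves base_lambda unchanged, higher
    importance lowers lambda so that distortion weighs more in the decision.
    """
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must be in [0, 1]")
    return base_lambda * (1.0 + strength * (1.0 - 2.0 * importance))


# Made-up candidates for one macroblock: (mode, distortion, rate in bits).
modes = [("SKIP", 100.0, 2), ("INTER", 40.0, 30), ("INTRA", 10.0, 80)]

for importance in (0.0, 1.0):
    lam = perceptual_lambda(1.0, importance)
    print(importance, lam, choose_mode(modes, lam)[0])
```

With these numbers the unimportant MB settles for the cheaper INTER mode, while the important MB spends more bits on INTRA to reduce distortion.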
Abstract: Vision-simulated imagery, the process of generating images that mimic the human visual system, is a valuable tool with a wide spectrum of possible applications, including visual acuity measurements, personalized planning of corrective lenses and surgeries, vision-correcting displays, vision-related hardware development, and extended reality discomfort reduction. A critical property of human vision is that it is imperfect because of highly influential wavefront aberrations that vary from person to person. This study provides an overview of the existing computational image generation techniques that properly simulate human vision in the presence of wavefront aberrations. These algorithms typically apply ray tracing with a detailed description of the simulated eye or utilize the point-spread function of the eye to perform convolution on the input image. Based on the description of the vision simulation techniques, several of their characteristic features are evaluated, and some potential application areas and research directions are outlined.
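The convolution route mentioned above can be sketched as follows, assuming a single spatially invariant point-spread function (PSF) applied to a grayscale image. The function name `convolve2d`, the edge clamping, and the toy PSFs are assumptions made for this example. A measured PSF of an eye would be derived from its wavefront aberrations and may vary across the visual field.

```python
def convolve2d(image, psf):
    """Convolve a 2D grayscale image with a PSF, normalising the PSF to sum 1.

    Borders are handled by clamping coordinates to the nearest edge pixel.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(psf), len(psf[0])
    if kh % 2 == 0 or kw % 2 == 0:
        raise ValueError("PSF dimensions must be odd")
    total = sum(map(sum, psf))
    if total <= 0:
        raise ValueError("PSF must have a positive sum")
    cy, cx = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # True convolution flips the kernel: offset (j-cy, i-cx)
                    # in the PSF reads the image at the opposite offset.
                    yy = min(max(y - (j - cy), 0), h - 1)
                    xx = min(max(x - (i - cx), 0), w - 1)
                    acc += image[yy][xx] * psf[j][i]
            out[y][x] = acc / total
    return out


# A single bright point blurred by a 3x3 box PSF spreads over its neighbours.
point = [[0.0] * 5 for _ in range(5)]
point[2][2] = 9.0
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
blurred = convolve2d(point, box)
print(blurred[2])
```

Normalising the PSF keeps overall brightness unchanged, so the blur redistributes light rather than adding or removing it.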
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61831015 and 61901260) and the Key Research and Development Program of China (No. 2019YFB1405902).
Abstract: Objective image quality assessment (IQA), which can automatically and efficiently predict the perceived quality of images, plays an important role in various visual communication systems. The human eye is the ultimate evaluator of visual experience, so modeling the human visual system (HVS) is a core issue for objective IQA and visual experience optimization. The traditional model based on black-box fitting has low interpretability and can hardly guide experience optimization effectively, while the model based on physiological simulation is hard to integrate into practical visual communication services due to its high computational complexity. To bridge the gap between signal distortion and visual experience, in this paper we propose a novel perceptual no-reference (NR) IQA algorithm based on structural computational modeling of the HVS. According to the mechanism of the human brain, we divide visual signal processing into a low-level visual layer, a middle-level visual layer, and a high-level visual layer, which conduct pixel information processing, primitive information processing, and global image information processing, respectively. Natural scene statistics (NSS) based features, deep features, and free-energy based features are extracted from these three layers. Support vector regression (SVR) is employed to aggregate the features into the final quality prediction. Extensive experimental comparisons on three widely used benchmark IQA databases (LIVE, CSIQ, and TID2013) demonstrate that our proposed metric is highly competitive with or outperforms the state-of-the-art NR IQA measures.
Abstract: In this paper, we propose a semi-fragile watermarking scheme that can be used for image authentication. An l-level discrete wavelet transform is first applied to the original image. An approximate wavelet coefficient matrix of the original image and real-valued chaotic sequences are then used to generate a content-based and secure watermark. The watermark is embedded into the original image using the HVS. Tamper detection can identify the tampered region of the received watermarked image. Experimental results are given.
Funding: This work was funded by the China Postdoctoral Science Foundation (No. 2019M661319), the Heilongjiang Postdoctoral Scientific Research Developmental Foundation (No. LBH-Q17042), the Fundamental Research Funds for the Central Universities (3072020CFQ0602, 3072020CF0604, 3072020CFP0601), and 2019 Industrial Internet Innovation and Development Engineering (KY1060020002, KY10600200008).
Abstract: The technology for image-to-image style transfer (a prevalent image processing task) has developed rapidly. The purpose of style transfer is to extract a texture from the source image domain and transfer it to the target image domain using a deep neural network. However, the existing methods typically have a large computational cost. To achieve efficient style transfer, we introduce a novel Ghost module into the GANILLA architecture to produce more feature maps from cheap operations. We then utilize an attention mechanism to transform images with various styles. We optimize the original generative adversarial network (GAN) by using more efficient calculation methods for image-to-illustration translation. The experimental results show that the output of our proposed method is consistent with human vision and maintains image quality. Moreover, our proposed method overcomes the high computational cost and high computational resource consumption of style transfer. Comparisons on subjective and objective evaluation indicators show that our proposed method outperforms existing methods.
Abstract: Traditionally, fractal image compression suffers from lengthy encoding times measured in hours. In this paper, a flexible classification technique is proposed that incorporates characteristics of the human visual system. This yields a corresponding adaptive algorithm that can cut the encoding time down to the order of seconds. Experimental results suggest that the algorithm balances overall encoding performance efficiently, that is, it achieves a higher speed and a better PSNR gain.