Visual degradation of captured images caused by rain streaks in rainy weather can adversely affect the performance of many outdoor vision systems. Hence, it is necessary to address the problem of eliminating rain streaks from a single rainy image. In this work, a deep convolutional neural network (CNN) based method, called Rain-Removal Net (R2N), is introduced to solve the single-image de-raining problem. First, we decompose the rainy image into its high-frequency detail layer and low-frequency base layer. Then, we feed the high-frequency detail layer into a carefully designed CNN architecture to learn the mapping between it and its corresponding de-rained high-frequency detail layer. The CNN architecture consists of four convolution layers and four deconvolution layers, as well as three skip connections. Experiments on synthetic and real-world rainy images show that our architecture outperforms the compared state-of-the-art de-raining models with respect to both the quality of de-rained images and computational efficiency.
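The decomposition step described above can be sketched with a simple low-pass filter: the base layer is a blurred copy of the image and the detail layer is the residual. The abstract does not name the exact low-pass operator, so the Gaussian filter and its sigma below are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def decompose(image, sigma=3.0):
    """Split an image into a low-frequency base layer and a
    high-frequency detail layer via Gaussian low-pass filtering.
    The Gaussian choice is illustrative, not taken from the paper."""
    base = gaussian_filter(image.astype(np.float64), sigma=sigma)
    detail = image - base          # residual carries rain streaks and edges
    return base, detail

# The two layers sum back to the original image exactly.
img = np.random.rand(32, 32)
base, detail = decompose(img)
reconstructed = base + detail
```

Because the detail layer is defined as a residual, no information is lost by the split; the network only ever has to operate on the sparse, streak-dominated component.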
In this letter, we present a novel integrated feature that incorporates traditional parameters, and adopt a parallel cascading network, Haze Net, for enhancing image quality. Our unified feature is a complete integration whose role is to directly describe the effects of haze. In Haze Net, we design two separate structures, a backbone network and an auxiliary network, to extract feature maps. The backbone network is responsible for extracting high-level feature maps, while the low-level features learned by the auxiliary network can be interpreted as fine-grained features. After cascading the two features of different granularity, final performance is effectively improved. Extensive experimental results on both synthetic datasets and real-world images prove the superiority of the proposed method and demonstrate more favorable performance compared with existing state-of-the-art methods.
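For context, haze formation in this literature is conventionally described by the atmospheric scattering model I(x) = J(x)t(x) + A(1 − t(x)), which is also how synthetic training hazy images are usually generated. A toy synthesis under this standard model (not code from the letter):

```python
import numpy as np

def add_haze(J, t, A=0.9):
    """Synthesize a hazy observation I from clear scene radiance J
    using the standard atmospheric scattering model:
        I(x) = J(x) * t(x) + A * (1 - t(x)),
    where t is per-pixel transmission and A the atmospheric light."""
    return J * t + A * (1.0 - t)

J = np.full((4, 4), 0.2)       # dark clear scene
t = np.full((4, 4), 0.5)       # uniform medium transmission
I = add_haze(J, t, A=1.0)      # hazy observation
```

Dehazing methods invert this equation, which is why estimating t and A well matters so much.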
Single image super-resolution (SISR) is a fundamentally challenging problem because a low-resolution (LR) image can correspond to a set of high-resolution (HR) images, most of which are not the desired output. Recently, SISR has been achieved by deep learning-based methods. By constructing a very deep super-resolution convolutional neural network (VDSRCNN), LR images can be improved to HR images. This study mainly achieves two objectives: image super-resolution (ISR) and image deblurring with VDSRCNN. First, by analyzing ISR, we modify different training parameters to test the performance of VDSRCNN. Second, we add motion-blurred images to the training set to optimize the performance of VDSRCNN. Finally, we use image quality indexes to evaluate the difference between images produced by classical methods and by VDSRCNN. The results indicate that the optimized VDSRCNN performs better in generating HR images from LR images.
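The motion-blur augmentation mentioned above amounts to convolving training images with a linear blur kernel before adding them to the training set. A minimal horizontal-blur sketch; the kernel length is an assumed parameter, not a value from the study:

```python
import numpy as np
from scipy.ndimage import convolve

def motion_blur(image, length=5):
    """Apply horizontal linear motion blur by convolving with a
    normalized 1 x length averaging kernel (a common motion model)."""
    kernel = np.ones((1, length)) / length
    return convolve(image.astype(np.float64), kernel, mode='nearest')

img = np.zeros((5, 9))
img[2, 4] = 1.0                        # single bright pixel
blurred = motion_blur(img, length=5)   # energy spread over 5 horizontal pixels
```

Training on such degraded copies lets one network handle both upscaling and deblurring.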
Outdoor haze has an adverse impact on outdoor image quality, including contrast loss and poor visibility. In this paper, a novel dehazing algorithm based on a decomposition strategy is proposed. It combines the advantages of the two-dimensional variational mode decomposition (2DVMD) algorithm and the dark channel prior. The original hazy image is adaptively decomposed into low-frequency and high-frequency images according to the image frequency band by using the 2DVMD algorithm. The low-frequency image is dehazed by using an improved dark channel prior and then fused with the high-frequency image. Furthermore, we optimize the atmospheric light and transmittance estimation method to obtain a defogging effect with richer details and stronger contrast. Experimental results show that the proposed algorithm performs better than existing state-of-the-art algorithms.
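For reference, the dark channel that this family of methods builds on is the local minimum over color channels and a spatial neighborhood, J_dark(x) = min over y in Ω(x) of min over c of J_c(y), and transmission is then estimated as t = 1 − ω·dark(I/A). A minimal sketch of this standard formulation; the window size and ω are typical defaults, not values from the paper:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(image, window=15):
    """Dark channel: per-pixel minimum over RGB channels, then a
    local minimum filter over a window x window patch."""
    min_rgb = image.min(axis=2)                  # min over channels
    return minimum_filter(min_rgb, size=window)  # min over patch

def transmission(image, A, omega=0.95, window=15):
    """Estimate transmission t(x) = 1 - omega * dark(I / A)."""
    return 1.0 - omega * dark_channel(image / A, window)

hazy = np.full((16, 16, 3), 0.8)     # uniformly bright (hazy) image
t = transmission(hazy, A=1.0)        # low transmission everywhere
```

A bright, low-saturation image yields a high dark channel and hence low estimated transmission, which is exactly the haze cue the prior exploits.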
Single Image Super-Resolution (SISR) technology aims to reconstruct a clear, high-resolution image with more information from an input low-resolution image that is blurry and contains less information. This technology has significant research value and is widely used in fields such as medical imaging, satellite image processing, and security surveillance. Despite significant progress in existing research, challenges remain in reconstructing clear and complex texture details, with issues such as edge blurring and artifacts still present, and the visual perception effect still needs further enhancement. Therefore, this study proposes a Pyramid Separable Channel Attention Network (PSCAN) for the SISR task. This method designs a convolutional backbone network composed of Pyramid Separable Channel Attention blocks to effectively extract and fuse multi-scale features, which expands the model's receptive field, reduces resolution loss, and enhances the model's ability to reconstruct texture details. Additionally, an innovative artifact loss function is designed to better distinguish between artifacts and real edge details, reducing artifacts in the reconstructed images. We conducted comprehensive ablation and comparative experiments on an Arabidopsis root image dataset and several public datasets. The experimental results show that the proposed PSCAN method achieves the best performance in both subjective visual effects and objective evaluation metrics, with improvements of 0.84 dB in Peak Signal-to-Noise Ratio (PSNR) and 0.017 in Structural Similarity Index (SSIM). This demonstrates that the method can effectively preserve high-frequency texture details, reduce artifacts, and generalize well.
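The PSNR gain quoted above is measured in decibels, defined as PSNR = 10·log10(MAX²/MSE). A minimal reference implementation (not code from the paper):

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(MAX^2 / MSE)."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float('inf')      # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

a = np.zeros((8, 8))
b = np.full((8, 8), 16.0)        # constant error of 16 -> MSE = 256
value = psnr(a, b)               # about 24.05 dB
```

On this logarithmic scale, an improvement of 0.84 dB corresponds to a noticeable reduction in mean squared error.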
The numerous photos captured by low-cost Internet of Things (IoT) sensors are frequently affected by meteorological factors, especially rainfall, which causes white streaks of varying sizes on the image, destroying the image texture and degrading the performance of outdoor computer vision systems. Existing methods rely on training with pairs of images, which makes it difficult to cover all scenes and leads to domain gaps. In addition, their network structures use deep learning to map rainy images to rain-free images, failing to use prior knowledge effectively. To solve these problems, we introduce a single-image deraining model for edge computing that combines prior knowledge of rain patterns with the learning capability of a neural network. Specifically, the algorithm first uses the Residue Channel Prior to filter out the rain textural features, and then uses a Feature Fusion Module to fuse the original image with the background feature information. This yields a pre-processed image that is fed into Half Instance Net (HINet) to recover a high-quality rain-free image with a clear and accurate structure; the model does not rely on any rainfall assumptions. Experimental results on synthetic and real-world datasets show that the average peak signal-to-noise ratio of the model decreases by 0.37 dB on the synthetic dataset and increases by 0.43 dB on the real-world dataset, demonstrating that the combined model reduces the gap between synthetic data and natural rain scenes, improves the generalization ability of the deraining network, and alleviates overfitting.
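The Residue Channel Prior exploits the fact that rain streaks are roughly achromatic: in a commonly used formulation, the per-pixel difference between the maximum and minimum color channels largely cancels the streaks. A sketch of that formulation (the paper's exact variant may differ):

```python
import numpy as np

def residue_channel(image):
    """Residue channel: max over RGB minus min over RGB at each pixel.
    Achromatic components (such as white rain streaks) add equally to
    all channels and therefore cancel in the difference."""
    return image.max(axis=2) - image.min(axis=2)

scene = np.zeros((4, 4, 3))
scene[..., 0] = 0.6            # reddish, rain-free background
rainy = scene + 0.3            # achromatic rain added equally to every channel
```

Here `residue_channel(rainy)` and `residue_channel(scene)` are identical, illustrating why the residue channel is a useful rain-invariant feature.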
Recent advances in Super-Resolution (SR) image reconstruction using Convolutional Neural Networks (CNNs) have encountered significant challenges in effectively modeling the complex mapping between Low-Resolution (LR) and High-Resolution (HR) images. While Generative Adversarial Networks (GANs) have been explored as a potential solution to enhance SR performance, these models often suffer from prolonged training and inference times and may fail to preserve intricate texture details in the reconstructed images. In response to these limitations, we propose a novel fusion network architecture, termed CLustering and Generative Adversarial Network (CL-GAN), designed to concurrently learn and integrate the features of clustered image segments and low-resolution inputs, thereby enhancing the SR reconstruction process. The CL-GAN framework comprises two primary components: a local network that emphasizes feature extraction from clustered image regions, and a global network built upon a GAN framework to model global image characteristics. To further improve texture recovery, we incorporate dense connection mechanisms within both the local and global networks, facilitating the preservation of fine-grained details in the generated SR images. Extensive experiments conducted on publicly available datasets demonstrate that the proposed CL-GAN framework outperforms existing state-of-the-art methods, delivering superior SR images with enhanced detail fidelity and visual quality.
Advances in mobile cameras have made it easier to capture ultra-high resolution (UHR) portraits. However, existing face reconstruction methods lack specific adaptations for UHR input (e.g., 4096×4096), leading to under-use of the high-frequency details that are crucial for achieving photorealistic rendering. Our method supports 4096×4096 UHR input and utilizes a divide-and-conquer approach for end-to-end 4K albedo, micro-normal, and specular texture reconstruction at the original resolution. We employ a two-stage strategy to capture both global distributions and local high-frequency details, effectively mitigating the mosaic and seam artifacts common in patch-based prediction. Additionally, we innovatively apply hash encoding to facial UV coordinates to boost the model's ability to learn regional high-frequency feature distributions. Our method can be easily incorporated into state-of-the-art facial geometry reconstruction pipelines, significantly improving texture reconstruction quality and facilitating artistic creation workflows.
The quality of photos is highly susceptible to severe weather such as heavy rain, which can also degrade the performance of various visual tasks like object detection. Rain removal is a challenging problem because rain streaks have different appearances even within one image: regions where rain accumulates appear foggy or misty, while individual rain streaks can be clearly seen in areas where rain is less heavy. We propose removing these various rain effects using a hybrid multiscale-loss-guided multiple feature fusion de-raining network (MSGMFFNet). Specifically, to deal with rain streaks, our method generates a rain streak attention map, while preprocessing with gamma correction and contrast enhancement addresses the problem of rain accumulation. Using these tools, the model can restore a result with abundant details. Furthermore, a hybrid multiscale loss combining L1 loss and edge loss guides the training process to attend to both edge and content information. Comprehensive experiments conducted on both synthetic and real-world datasets demonstrate the effectiveness of our method.
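The preprocessing named above, gamma correction followed by contrast enhancement, can be sketched as follows. The specific gamma value and the min-max contrast stretch are illustrative choices, not parameters from the paper:

```python
import numpy as np

def preprocess(image, gamma=0.7):
    """Brighten with gamma correction (out = in ** gamma, gamma < 1),
    then apply a simple min-max contrast stretch back to [0, 1]."""
    corrected = np.clip(image, 0.0, 1.0) ** gamma
    lo, hi = corrected.min(), corrected.max()
    if hi > lo:
        corrected = (corrected - lo) / (hi - lo)   # stretch to full range
    return corrected

foggy = np.linspace(0.4, 0.6, 16).reshape(4, 4)    # low-contrast, misty patch
enhanced = preprocess(foggy)
```

Such a stretch makes the veiled, rain-accumulation regions span the full intensity range before the network sees them.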
We synthesize animations from a single image by globally transferring the fluid motion of a video example. Given a target image of a fluid scene, an alpha matte is required to extract the fluid region. Our method then adjusts a user-specified video example to produce fluid motion suitable for the extracted fluid region. Employing a fluid video database, the flow field of the target image is obtained by warping the optical flow of a video frame whose scene is visually similar to the target image according to their scene correspondences, which assigns fluid orientation and speed automatically. Results show that our method succeeds in preserving large fluid features in the synthesized animations. In comparison to existing approaches, our method can be used to create flow animations of higher quality.
A steadily growing range of applications can benefit from facial reconstruction techniques, leading to an increasing demand for high-quality 3D face models. Although the nose is an important expressive part of the human face, it has received less attention than other expressive regions in the face reconstruction literature. When existing reconstruction methods are applied to facial images, the reconstructed nose models are often inconsistent with the desired shape and expression. In this paper, we propose a coarse-to-fine 3D nose reconstruction and correction pipeline to build a nose model from a single image, in which 3D and 2D nose curve correspondences are adaptively updated and refined. We first correct the reconstruction result coarsely using constraints from sparse 3D-2D landmark correspondences, and then heuristically update a dense 3D-2D curve correspondence based on the coarsely corrected result. A final refinement step corrects the shape based on the updated dense 3D-2D curve constraints. Experimental results show the advantages of our method for 3D nose reconstruction over existing methods.
An improved single-image dehazing method based on the dark channel prior and the wavelet transform is proposed. The method employs the wavelet transform and a guided filter, instead of the soft matting procedure, to estimate and refine the depth map of hazy images. Moreover, a contrast enhancement method based on the just noticeable difference (JND) and a quadratic function is adopted to enhance the contrast of the dehazed image, since the scene radiance is usually not as bright as the atmospheric light and the dehazed image therefore looks dim. The experimental results show that the proposed approach can effectively enhance hazy images and is well suited for implementation in surveillance and obstacle detection systems.
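The guided filter used here for refinement has a closed form: within each window the output is a locally linear transform of the guide, q = mean(a)·I + mean(b) with a = cov(I, p)/(var(I) + ε). A compact sketch of the standard algorithm; the radius and ε below are illustrative parameters, not values from the paper:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, radius=8, eps=1e-3):
    """Edge-preserving smoothing of p guided by I (He et al. style):
        a = cov(I, p) / (var(I) + eps),  b = mean(p) - a * mean(I),
        q = mean(a) * I + mean(b)."""
    size = 2 * radius + 1
    mean = lambda x: uniform_filter(x, size=size)   # box-filter averages
    mean_I, mean_p = mean(I), mean(p)
    cov_Ip = mean(I * p) - mean_I * mean_p
    var_I = mean(I * I) - mean_I * mean_I
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return mean(a) * I + mean(b)

guide = np.random.rand(32, 32)
refined = guided_filter(guide, guide.copy())   # smooths while following edges of the guide
```

Unlike soft matting, this runs in linear time, which is the main reason such methods adopt it for depth-map refinement.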
Cities are in constant change, and city managers aim to keep an updated digital model of the city for city governance. Many images are uploaded daily to image sharing platforms (such as Flickr, Twitter, etc.). These images feature only a rough localization and no orientation information. Nevertheless, they can help to populate an active collaborative database of street images usable to maintain a city 3D model, provided their localization and orientation become known. Based on these images, we propose the Data Gathering system for image Pose Estimation (DGPE), which helps to find the pose (position and orientation) of the camera used to shoot them with better accuracy than the GPS localization alone that may be embedded in the image header. DGPE uses both visual and semantic information existing in a single image, processed by a fully automatic chain composed of three main layers: a data retrieval and preprocessing layer, a features extraction layer, and a decision making layer. In this article, we present the whole system in detail and compare its detection results with a state-of-the-art method. Finally, we show the obtained localization, and often orientation, results, combining both semantic and visual information processing on 47 images. Using only the image content and associated metadata, our multilayer system succeeds in 26% of our test cases in finding a better localization and orientation of the original photo. The use of semantic information found on social media, such as comments, hashtags, etc., doubles the success rate to 59%, as it reduces the search area and thus makes the visual search more accurate.
Significant advancements have been achieved in the field of Single Image Super-Resolution (SISR) through the utilization of Convolutional Neural Networks (CNNs) to attain state-of-the-art performance. Recent efforts have explored the incorporation of Transformers to augment network performance in SISR. However, the high computational cost of Transformers makes them less suitable for deployment on lightweight devices. Moreover, the majority of enhancements for CNNs rely predominantly on small spatial convolutions, thereby neglecting the potential advantages of large kernel convolution. In this paper, the authors propose a Multi-Perception Large Kernel convNet (MPLKN) which delves into the exploration of large kernel convolution. Specifically, the authors architect a Multi-Perception Large Kernel (MPLK) module aimed at extracting multi-scale features and employ a stepwise feature fusion strategy to seamlessly integrate these features. In addition, to enhance the network's capacity for nonlinear spatial information processing, the authors design a Spatial-Channel Gated Feed-forward Network (SCGFN) capable of adapting to feature interactions across both spatial and channel dimensions. Experimental results demonstrate that MPLKN outperforms other lightweight image super-resolution models while maintaining a minimal number of parameters and FLOPs.
The dissolution kinetics of sodium carbonate is investigated with an image analysis method at the single-particle level. The dissolution experiments are carried out in an aqueous solution under a series of controlled temperatures and pH values. The selected sodium carbonate particles are all spherical, with the same mass and diameter. The dissolution process is quantified by measuring the particle diameter from dissolution images, and the concentration of dissolved sodium carbonate in the solvent is calculated from the measured particle diameter. Both a surface reaction model and a mass transport model are implemented to determine the dissolution mechanism and quantify the dissolution rate constant at each experimental condition. According to the fitting results with the two models, the dissolution process at increasing temperature is controlled by the mass transport of dissolved sodium carbonate travelling from the particle surface into the solvent, whereas the dissolution process at increasing pH is controlled by the chemical reaction on the particle surface. Furthermore, the dissolution rate constant for each single spherical sodium carbonate particle is quantified; the results show that it increases significantly with rising temperature but, conversely, decreases with increasing pH.
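Computing concentration from the measured diameter is straightforward geometry: the dissolved mass equals the particle density times the lost spherical volume. A sketch of that calculation; the density used is that of anhydrous sodium carbonate (about 2.54 g/cm³) and the solvent volume is an illustrative value, since the abstract gives neither constant:

```python
import math

def dissolved_concentration(d0_mm, d_mm, solvent_volume_ml,
                            density_g_per_cm3=2.54):
    """Concentration (g/L) of dissolved material, computed from initial
    and current particle diameters (mm) of a spherical particle:
        m_dissolved = rho * pi/6 * (d0^3 - d^3)."""
    to_cm = 0.1
    v0 = math.pi / 6.0 * (d0_mm * to_cm) ** 3          # initial volume, cm^3
    v = math.pi / 6.0 * (d_mm * to_cm) ** 3            # current volume, cm^3
    dissolved_g = density_g_per_cm3 * (v0 - v)
    return dissolved_g / (solvent_volume_ml / 1000.0)  # grams per liter

# A 2 mm particle fully dissolved into 100 mL of solvent:
c = dissolved_concentration(d0_mm=2.0, d_mm=0.0, solvent_volume_ml=100.0)
```

Tracking this concentration over time is what allows fitting the surface reaction and mass transport models.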
In the field of imaging, ever higher image resolution is required, yet there is always a trade-off between the sensitivity and the resolution of the seeker in an infrared guidance system. This work uses the rosette scanning mode for physical compressive imaging in order to improve the resolution of the image as much as possible under the high-sensitivity infrared rosette point-scanning mode, and to complete the missing information that is not scanned. Using an optical lens instead of a traditional optical reflection system is effective, as it reduces losses in optical path transmission. At the same time, a deep learning neural network is used for control: an infrared single-pixel imaging system that integrates a sparse algorithm and a recovery algorithm through improved generative adversarial networks is trained. Experiments on an infrared aerial target dataset show that, when the input is a sparse image after rosette sampling, the system can realize single-pixel recovery imaging of the infrared image, improving the resolution of the image while ensuring high sensitivity.
In recent years, deep learning has been introduced into the field of single-pixel imaging (SPI), garnering significant attention. However, conventional networks still exhibit limitations in preserving image details. To address this issue, we integrate Large Kernel Convolution (LKconv) into the U-Net framework, proposing an enhanced network structure named the U-LKconv network, which significantly enhances the capability to recover image details even under low sampling conditions.
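In single-pixel imaging the detector records one inner product per illumination pattern, y_i = ⟨Φ_i, x⟩, and reconstruction networks such as the one above learn to invert this measurement. A minimal simulation of the forward measurement step; random binary patterns and a 25% sampling ratio are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16 * 16                   # scene with 256 pixels, flattened
m = 64                        # number of patterns: 25% sampling ratio
x = rng.random(n)             # unknown scene
Phi = rng.integers(0, 2, size=(m, n)).astype(np.float64)  # binary patterns

y = Phi @ x                   # one scalar detector reading per pattern
sampling_ratio = m / n        # the "low sampling conditions" regime
```

The fewer rows in Phi, the harder the inversion, which is why detail preservation at low sampling ratios is the key benchmark.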
Single image super-resolution (SISR) is an important research topic in the fields of computer vision and image processing. With the rapid development of deep neural networks, many different image super-resolution models have emerged. Compared to traditional SISR methods, deep learning-based methods can complete the super-resolution task from a single image. Moreover, compared with SISR methods using traditional convolutional neural networks, SISR based on generative adversarial networks (GANs) has achieved the most advanced visual performance. In this review, we first explore the challenges faced by SISR and introduce some common datasets and evaluation metrics. Then, we review the improved network structures and loss functions of GAN-based perceptual SISR. Subsequently, the advantages and disadvantages of different networks are analyzed through multiple comparative experiments. Finally, we summarize the paper and look forward to future development trends of GAN-based perceptual SISR.
Sparse representation has attracted extensive attention and performed well on image super-resolution (SR) over the last decade. However, many current image SR methods face a contradiction between detail recovery and artifact suppression. We propose a multi-resolution dictionary learning (MRDL) model to resolve this contradiction, and give a fast single-image SR method based on the MRDL model. To obtain the MRDL model, we first extract multi-scale patches using our proposed adaptive patch partition method (APPM), which divides images into patches of different sizes according to their detail richness. Then, multi-resolution dictionary pairs, which contain structural primitives at various resolutions, are trained from these multi-scale patches. Owing to the MRDL strategy, our SR algorithm not only recovers details well, with less jagging and noise, but also significantly improves computational efficiency. Experimental results validate that our algorithm performs better than other SR methods in both evaluation metrics and visual perception.
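The abstract does not give APPM's exact splitting criterion; a plausible sketch uses local variance as the "detail richness" measure, recursively subdividing patches whose variance exceeds a threshold, quadtree-style. The threshold and minimum size here are assumptions:

```python
import numpy as np

def partition(image, y=0, x=0, size=None, var_thresh=0.01, min_size=4):
    """Quadtree-style adaptive partition: split a square patch into four
    while its pixel variance (a proxy for detail richness) exceeds
    var_thresh. Returns a list of (y, x, size) patches."""
    if size is None:
        size = image.shape[0]
    patch = image[y:y + size, x:x + size]
    if size <= min_size or patch.var() <= var_thresh:
        return [(y, x, size)]           # smooth or minimal patch: keep whole
    half = size // 2
    return (partition(image, y, x, half, var_thresh, min_size)
            + partition(image, y, x + half, half, var_thresh, min_size)
            + partition(image, y + half, x, half, var_thresh, min_size)
            + partition(image, y + half, x + half, half, var_thresh, min_size))

flat = np.zeros((16, 16))               # no detail: stays one 16x16 patch
patches_flat = partition(flat)
```

Smooth regions end up as a few large patches while textured regions yield many small ones, which is the behavior the multi-resolution dictionaries exploit.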
Although there have been great breakthroughs in the accuracy and speed of super-resolution (SR) reconstruction of a single image using convolutional neural networks, an important problem remains unresolved: how to restore finer texture details during super-resolution reconstruction. This paper proposes an Enhanced Laplacian Pyramid Generative Adversarial Network (ELSRGAN), based on the Laplacian pyramid, to capture the high-frequency details of the image. By combining Laplacian pyramids and generative adversarial networks, super-resolution images can be reconstructed progressively, making model applications more flexible. To solve the problem of vanishing gradients, we introduce the Residual-in-Residual Dense Block (RRDB) as the basic network unit: network capacity benefits from its dense connections, which capture more visual features for better reconstruction, and BN layers are removed to increase calculation speed and reduce computational complexity. In addition, a content loss driven by perceptual similarity is used instead of one driven by spatial similarity, thereby enhancing the visual effect of the super-resolution image and making it more consistent with human visual perception. Extensive qualitative and quantitative evaluation on baseline datasets shows that the proposed algorithm achieves a higher mean sort score (MSS) than state-of-the-art methods and has better visual perception.
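A Laplacian pyramid stores, at each level, the detail lost by downsampling, so the image can be rebuilt exactly level by level; this is what enables the progressive reconstruction described above. A minimal sketch of the standard construction (blur sigma and interpolation order are illustrative choices):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def build_laplacian_pyramid(image, levels=3):
    """Each level keeps current - upsample(downsample(current)); the
    final entry is the coarsest residual, so reconstruction is exact."""
    pyramid = []
    current = image.astype(np.float64)
    for _ in range(levels):
        blurred = gaussian_filter(current, sigma=1.0)
        down = blurred[::2, ::2]           # decimate by 2
        up = zoom(down, 2, order=1)        # bilinear upsample back
        pyramid.append(current - up)       # high-frequency band
        current = down
    pyramid.append(current)                # coarsest base image
    return pyramid

def reconstruct(pyramid):
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = zoom(current, 2, order=1) + band
    return current

img = np.random.rand(32, 32)
pyr = build_laplacian_pyramid(img)
restored = reconstruct(pyr)                # matches img exactly
```

A pyramid-based generator predicts one such band per stage, refining the upscaled image from coarse to fine.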
Funding (Rain-Removal Net, R2N): This work was supported by the National Natural Science Foundation of China (Grant No. 61673222), the Jiangsu Universities Natural Science Research Project (Grant No. 13KJA510001), and the Major Program of the National Social Science Fund of China (Grant No. 17ZDA092).
Funding (Haze Net): Supported by the National Natural Science Foundation of China (No. 61561030) and the Gansu Provincial Finance Department (No. 214138).
Funding (2DVMD-based dehazing): Supported by the National Defense Technology Advance Research Project of China (004040204).
Funding (PSCAN): Supported by the Beijing Municipal Science and Technology Project (No. Z221100007122003).
Funding: supported by the National Natural Science Foundation of China under Grants No. 41975183 and No. 41875184, and by a grant from the State Key Laboratory of Resources and Environmental Information System.
Abstract: The numerous photos captured by low-cost Internet of Things (IoT) sensors are frequently affected by meteorological factors, especially rainfall, which leaves white streaks of varying sizes on the image, destroying the image texture and ruining the performance of outdoor computer vision systems. Existing methods train on pairs of images, which makes it difficult to cover all scenes and leads to domain gaps. In addition, the network structures rely on deep learning to map rainy images to rain-free images, failing to use prior knowledge effectively. To solve these problems, we introduce a single-image derain model for edge computing that combines prior knowledge of rain patterns with the learning capability of a neural network. Specifically, the algorithm first uses the Residue Channel Prior to filter out rainfall texture features, then uses a Feature Fusion Module to fuse the original image with the background feature information. The resulting pre-processed image is fed into Half Instance Net (HINet) to recover a high-quality rain-free image with a clear and accurate structure, and the model does not rely on any rainfall assumptions. Experimental results on synthetic and real-world datasets show that the average peak signal-to-noise ratio of the model decreases by 0.37 dB on the synthetic dataset and increases by 0.43 dB on the real-world dataset, demonstrating that the combined model reduces the gap between synthetic data and natural rain scenes, improves the generalization ability of the derain network, and alleviates the overfitting problem.
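The Residue Channel Prior that the pipeline starts from can be made concrete: because rain streaks are nearly achromatic, the per-pixel difference between the maximum and minimum colour channels suppresses them. A minimal sketch (the function name and array layout are assumptions, not the authors' code):

```python
import numpy as np

def residue_channel(img):
    # R(x) = max over colour channels minus min over colour channels.
    # Rain streaks are nearly achromatic (equal R, G, B), so their
    # residue is close to zero and the residue channel is largely
    # rain-free, exposing the chromatic background structure.
    return img.max(axis=2) - img.min(axis=2)
```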
Funding: supported by the Science and Technology Project of Qinghai Province (No. 2022-ZJ-701) and the High Performance Computing Center of Qinghai University (China).
Abstract: Recent advances in Super-Resolution (SR) image reconstruction using Convolutional Neural Networks (CNNs) have encountered significant challenges in effectively modeling the complex mapping between Low-Resolution (LR) and High-Resolution (HR) images. While Generative Adversarial Networks (GANs) have been explored as a potential solution to enhance SR performance, these models often suffer from prolonged training and inference times, and may fail to preserve intricate texture details in the reconstructed images. In response to these limitations, we propose a novel fusion network architecture, termed CLustering and Generative Adversarial Network (CL-GAN), designed to concurrently learn and integrate the features of clustered image segments and low-resolution inputs, thereby enhancing the SR reconstruction process. The CL-GAN framework comprises two primary components: a local network that emphasizes feature extraction from clustered image regions, and a global network built upon a GAN framework to model global image characteristics. To further improve texture recovery, we incorporate dense connection mechanisms within both the local and global networks, facilitating the preservation of fine-grained details in the generated SR images. Extensive experiments conducted on publicly available datasets demonstrate that the proposed CL-GAN framework outperforms existing state-of-the-art methods, delivering superior SR images with enhanced detail fidelity and visual quality.
Funding: supported by the National Key R&D Program of China (2024YDLN0011) and the Key R&D Program of Zhejiang Province (2023C01039).
Abstract: Advances in mobile cameras have made it easier to capture ultra-high-resolution (UHR) portraits. However, existing face reconstruction methods lack specific adaptations for UHR input (e.g., 4096×4096), leading to under-use of the high-frequency details that are crucial for achieving photorealistic rendering. Our method supports 4096×4096 UHR input and utilizes a divide-and-conquer approach for end-to-end 4K albedo, micro-normal, and specular texture reconstruction at the original resolution. We employ a two-stage strategy to capture both global distributions and local high-frequency details, effectively mitigating the mosaic and seam artifacts common in patch-based prediction. Additionally, we innovatively apply hash encoding to facial U-V coordinates to boost the model's ability to learn regional high-frequency feature distributions. Our method can be easily incorporated into state-of-the-art facial geometry reconstruction pipelines, significantly improving texture reconstruction quality and facilitating artistic creation workflows.
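The hash encoding of facial U-V coordinates can be illustrated with an Instant-NGP-style spatial hash, sketched below under the assumption of integer grid coordinates. The prime constant and table size are illustrative; the paper's exact encoding may differ:

```python
import numpy as np

def hash_uv(uv, table_size=2 ** 14):
    # Spatial hash of integer U-V grid coordinates: XOR of the
    # coordinates multiplied by per-dimension constants (1 and a
    # large prime), reduced modulo the hash-table size. Each bucket
    # indexes a learnable feature vector in the encoding table.
    u = uv[..., 0].astype(np.int64)
    v = uv[..., 1].astype(np.int64)
    return ((u * 1) ^ (v * 2654435761)) % table_size
```

The hash is deterministic, so the same U-V location always retrieves the same learned feature, while the table stays far smaller than a dense per-texel grid at 4K resolution.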
Funding: this work was supported in part by the National Key R&D Program of China under No. 2017YFB1003000, the National Natural Science Foundation of China under No. 61872047 and No. 61720106007, the Beijing Nova Program under No. Z201100006820124, the Beijing Natural Science Foundation (L191004), and the 111 Project (B18008).
Abstract: The quality of photos is highly susceptible to severe weather such as heavy rain, which can also degrade the performance of visual tasks like object detection. Rain removal is a challenging problem because rain streaks have different appearances even within one image: regions where rain accumulates appear foggy or misty, while rain streaks can be clearly seen in areas where rain is less heavy. We propose removing these various rain effects using a hybrid multiscale loss guided multiple feature fusion de-raining network (MSGMFFNet). Specifically, to deal with rain streaks, our method generates a rain streak attention map, while gamma correction and contrast enhancement are applied during preprocessing to address rain accumulation. Using these tools, the model can restore a result with abundant details. Furthermore, a hybrid multiscale loss combining L1 loss and edge loss guides the training process to attend to edge and content information. Comprehensive experiments conducted on both synthetic and real-world datasets demonstrate the effectiveness of our method.
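The two preprocessing operations named above, gamma correction and contrast enhancement, can be sketched as follows. This is a minimal version assuming images normalized to [0, 1]; the parameter values are illustrative, not the paper's:

```python
import numpy as np

def gamma_correct(img, gamma=0.7):
    # A gamma below 1 brightens dark regions, helping reveal content
    # obscured by misty rain accumulation.
    return np.clip(img, 0.0, 1.0) ** gamma

def stretch_contrast(img, low=1, high=99):
    # Percentile stretch: map the [p_low, p_high] intensity range
    # onto [0, 1], clipping outliers.
    lo, hi = np.percentile(img, [low, high])
    return np.clip((img - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
```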
Funding: project supported by the National Basic Research Program (973) of China (No. 2011CB302203) and the Innovation Program of the Science and Technology Commission of Shanghai Municipality, China (No. 10511501200).
Abstract: We synthesize animations from a single image by globally transferring the fluid motion of a video example. Given a target image of a fluid scene, an alpha matte is used to extract the fluid region. Our method then adjusts a user-specified video example to produce fluid motion suitable for the extracted region. Employing a fluid video database, the flow field of the target image is obtained by warping the optical flow of a video frame whose scene is visually similar to the target image, according to their scene correspondences, which assigns fluid orientation and speed automatically. Results show that our method succeeds in preserving large fluid features in the synthesized animations. Compared with existing approaches, our method creates flow animations of higher quality.
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 61972342, 61602402, and 61902334), Zhejiang Provincial Basic Public Welfare Research (Grant No. LGG19F020001), Shenzhen Fundamental Research (General Project) (Grant No. JCYJ20190814112007258), and the Royal Society (Grant No. IES\R1\180126).
Abstract: There is a steadily growing range of applications that can benefit from facial reconstruction techniques, leading to an increasing demand for high-quality 3D face models. While the nose is an important expressive part of the human face, it has received less attention than other expressive regions in the face reconstruction literature. When existing reconstruction methods are applied to facial images, the reconstructed nose models are often inconsistent with the desired shape and expression. In this paper, we propose a coarse-to-fine 3D nose reconstruction and correction pipeline that builds a nose model from a single image, in which 3D and 2D nose curve correspondences are adaptively updated and refined. We first correct the reconstruction result coarsely using constraints from sparse 3D-2D landmark correspondences, and then heuristically update a dense 3D-2D curve correspondence based on the coarsely corrected result. A final refinement step corrects the shape based on the updated dense 3D-2D curve constraints. Experimental results show the advantages of our method for 3D nose reconstruction over existing methods.
Funding: supported by the National Natural Science Foundation of China (61075013) and the Joint Funds of the Civil Aviation (61139003).
Abstract: An improved single-image dehazing method based on the dark channel prior and the wavelet transform is proposed. The method employs the wavelet transform and a guided filter, instead of the soft matting procedure, to estimate and refine the depth map of hazy images. Moreover, a contrast enhancement method based on the just noticeable difference (JND) and a quadratic function is adopted to enhance the contrast of the dehazed image, since the scene radiance is usually not as bright as the atmospheric light and the dehazed image looks dim. The experimental results show that the proposed approach can effectively enhance hazy images and is well suited for surveillance and obstacle detection systems.
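Once the transmission/depth map is refined, the scene radiance is conventionally recovered from the standard haze model I = J·t + A·(1 − t). A minimal sketch of that recovery step, which both dark-channel methods in this collection share (the lower bound `t0` is the usual safeguard against division by near-zero transmission; values are illustrative):

```python
import numpy as np

def recover_radiance(I, A, t, t0=0.1):
    # Invert the haze model I = J*t + A*(1 - t):
    #   J(x) = (I(x) - A) / max(t(x), t0) + A
    # Clamping t from below avoids amplifying noise where the
    # estimated transmission is close to zero.
    t = np.maximum(t, t0)[..., None]
    return (I - A) / t + A
```

With t = 1 everywhere (no haze) the recovery is the identity, which matches the observation that the raw dehazed output tends to look dim and benefits from the JND-based contrast enhancement described above.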
Abstract: Cities are in constant change, and city managers aim to keep an updated digital model of the city for governance. Many images are uploaded daily to image-sharing platforms (such as Flickr, Twitter, etc.). These images carry only a rough localization and no orientation information. Nevertheless, they can help populate an active collaborative database of street images usable for maintaining a city 3D model, provided their localization and orientation are known. Based on these images, we propose the Data Gathering system for image Pose Estimation (DGPE), which helps find the pose (position and orientation) of the camera used to shoot them with better accuracy than the GPS localization alone that may be embedded in the image header. DGPE uses both visual and semantic information from a single image, processed by a fully automatic chain composed of three main layers: a data retrieval and preprocessing layer, a feature extraction layer, and a decision-making layer. In this article, we present the whole system and compare its detection results with a state-of-the-art method. Finally, we show the localization, and often orientation, results obtained by combining semantic and visual information processing on 47 images. Using only the image content and associated metadata, our multilayer system finds a better localization and orientation of the original photo in 26% of our test cases. Adding semantic information found on social media, such as comments and hashtags, doubles the success rate to 59% by reducing the search area and thus making the visual search more accurate.
Abstract: Significant advancements have been achieved in the field of Single Image Super-Resolution (SISR) through the utilization of Convolutional Neural Networks (CNNs) to attain state-of-the-art performance. Recent efforts have explored the incorporation of Transformers to augment network performance in SISR. However, the high computational cost of Transformers makes them less suitable for deployment on lightweight devices. Moreover, the majority of enhancements for CNNs rely predominantly on small spatial convolutions, thereby neglecting the potential advantages of large kernel convolution. In this paper, the authors propose a Multi-Perception Large Kernel convNet (MPLKN) which delves into the exploration of large kernel convolution. Specifically, the authors have architected a Multi-Perception Large Kernel (MPLK) module aimed at extracting multi-scale features, and employ a stepwise feature fusion strategy to seamlessly integrate these features. In addition, to enhance the network's capacity for nonlinear spatial information processing, the authors have designed a Spatial-Channel Gated Feed-forward Network (SCGFN) that is capable of adapting to feature interactions across both spatial and channel dimensions. Experimental results demonstrate that MPLKN outperforms other lightweight image super-resolution models while maintaining a minimal number of parameters and FLOPs.
基金the Institute of Particle and Science Engineering,University of Leeds and Procter&Gamble Newcastle Innovation Centre(UK)for partially funding the project
Abstract: The dissolution kinetics of sodium carbonate is investigated with an image analysis method at the single-particle level. The dissolution experiments are carried out in an aqueous solution under a series of controlled temperatures and pH values. The selected sodium carbonate particles are all spherical, with the same mass and diameter. The dissolution process is quantified by measuring the particle diameter from dissolution images, and the concentration of dissolved sodium carbonate in the solvent is calculated from the measured diameter. Both a surface reaction model and a mass transport model are implemented to determine the dissolution mechanism and quantify the dissolution rate constant at each experimental condition. According to the fitting results with the two models, dissolution at increasing temperature is controlled by the mass transport of dissolved sodium carbonate travelling from the particle surface into the solvent, whereas dissolution at increasing pH is controlled by the chemical reaction on the particle surface. Furthermore, the dissolution rate constant of a single spherical sodium carbonate particle is quantified; the results show that it increases significantly with rising temperature but, conversely, decreases with increasing pH.
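The concentration calculation from the measured diameter follows directly from the shrinking-sphere geometry: the mass lost by the particle, divided by the solvent volume. A minimal sketch (variable names and units are assumptions; the paper's exact calculation may include solubility or density corrections):

```python
from math import pi

def dissolved_concentration(d0, d, rho, volume):
    # Mass lost by a shrinking spherical particle of density rho,
    # divided by the solvent volume:
    #   c = rho * (pi/6) * (d0^3 - d^3) / V
    # d0: initial diameter [m], d: current diameter [m],
    # rho: particle density [kg/m^3], volume: solvent volume [m^3].
    return rho * (pi / 6.0) * (d0 ** 3 - d ** 3) / volume
```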
Funding: supported by the Fundamental Research Funds for the Central Universities (No. 3072022CF0802).
Abstract: In the field of imaging, ever higher image resolution is required, yet there is always a trade-off between the sensitivity and resolution of the seeker in an infrared guidance system. This work uses the rosette scanning mode for physical compressive imaging, in order to improve image resolution as much as possible under the high-sensitivity infrared rosette point-scanning mode and to complete the missing information that is not scanned. An optical lens is used instead of the traditional reflective optical system, which reduces losses in optical path transmission. At the same time, a deep learning neural network is used for control: an infrared single-pixel imaging system that integrates the sparse sampling and recovery algorithms through an improved generative adversarial network is trained. Experiments on an infrared aerial target dataset show that, when the input is a sparse image after rosette sampling, the system can realize single-pixel recovery imaging of the infrared image, improving image resolution while ensuring high sensitivity.
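The underlying single-pixel measurement model is worth making explicit: each detector reading is the inner product of the scene with one sampling pattern. A minimal sketch (the function name and the dense-matrix formulation are illustrative; the actual system uses rosette-scan patterns rather than arbitrary masks):

```python
import numpy as np

def spi_measure(scene, patterns):
    # Single-pixel imaging forward model: each measurement is the
    # inner product of the scene with one illumination pattern,
    #   y_k = <P_k, X>,
    # i.e. y = Phi @ x with Phi the flattened pattern matrix.
    return patterns.reshape(len(patterns), -1) @ scene.ravel()
```

Recovery algorithms (here, the improved GAN) then invert this linear model from far fewer measurements than pixels.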
Abstract: In recent years, deep learning has been introduced into the field of single-pixel imaging (SPI), garnering significant attention. However, conventional networks still exhibit limitations in preserving image details. To address this issue, we integrate Large Kernel Convolution (LKconv) into the U-Net framework, proposing an enhanced network structure named the U-LKconv network, which significantly enhances the capability to recover image details even under low sampling conditions.
基金The authors are highly thankful to the Development Research Center of Guangxi Relatively Sparse-populated Minorities(ID:GXRKJSZ201901)to the Natural Science Foundation of Guangxi Province(No.2018GXNSFAA281164)This research was financially supported by the project of outstanding thousand young teachers’training in higher education institutions of Guangxi,Guangxi Colleges and Universities Key Laboratory Breeding Base of System Control and Information Processing.
Abstract: Single image super-resolution (SISR) is an important research topic in the fields of computer vision and image processing. With the rapid development of deep neural networks, many image super-resolution models have emerged. Compared with some traditional SISR methods, deep learning-based methods can complete super-resolution tasks from a single image. Moreover, compared with SISR methods using traditional convolutional neural networks, SISR based on generative adversarial networks (GANs) has achieved the most advanced visual performance. In this review, we first explore the challenges faced by SISR and introduce some common datasets and evaluation metrics. Then, we review the improved network structures and loss functions of GAN-based perceptual SISR. Subsequently, the advantages and disadvantages of different networks are analyzed through multiple comparative experiments. Finally, we summarize the paper and look forward to future development trends of GAN-based perceptual SISR.
Abstract: Sparse representation has attracted extensive attention and performed well on image super-resolution (SR) in the last decade. However, many current image SR methods face a contradiction between detail recovery and artifact suppression. We propose a multi-resolution dictionary learning (MRDL) model to resolve this contradiction, and present a fast single-image SR method based on the MRDL model. To obtain the MRDL model, we first extract multi-scale patches using our proposed adaptive patch partition method (APPM), which divides images into patches of different sizes according to their detail richness. Then, multi-resolution dictionary pairs, which contain structural primitives of various resolutions, are trained from these multi-scale patches. Owing to the MRDL strategy, our SR algorithm not only recovers details well, with fewer jagged edges and less noise, but also significantly improves computational efficiency. Experimental results validate that our algorithm performs better than other SR methods in both evaluation metrics and visual perception.
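The idea behind APPM, namely smaller patches where detail is rich, can be sketched with a simple variance rule. The threshold and patch sizes below are illustrative assumptions, not the paper's tuned values:

```python
import numpy as np

def choose_patch_size(block, var_thresh=0.01, small=6, large=12):
    # Detail-rich (high-variance) regions get smaller patches so the
    # dictionary can resolve fine structure; flat regions get larger
    # patches, reducing the number of sparse-coding problems solved.
    return small if block.var() > var_thresh else large
```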
Funding: this work was supported in part by the National Science Foundation of China under Grant 61572526.
Abstract: Although there have been great breakthroughs in the accuracy and speed of single-image super-resolution (SR) reconstruction using convolutional neural networks, an important problem remains unresolved: how to restore finer texture details during reconstruction. This paper proposes an Enhanced Laplacian Pyramid Generative Adversarial Network (ELSRGAN), based on the Laplacian pyramid, to capture the high-frequency details of the image. By combining Laplacian pyramids and generative adversarial networks, super-resolution images can be reconstructed progressively, making the model more flexible to apply. To address the vanishing-gradient problem, we introduce the Residual-in-Residual Dense Block (RRDB) as the basic network unit: network capacity benefits from dense connections, which capture more visual features for better reconstruction, and BN layers are removed to increase calculation speed and reduce computational complexity. In addition, a content loss driven by perceptual similarity is used instead of one driven by spatial similarity, enhancing the visual effect of the super-resolution image and making it more consistent with human visual perception. Extensive qualitative and quantitative evaluation on the baseline datasets shows that the proposed algorithm achieves a higher mean sort score (MSS) than any state-of-the-art method and better visual perception.
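The RRDB's residual-in-residual structure can be sketched independently of any deep learning framework: each inner block's output is scaled by a factor β before the skip addition, and the whole group carries a second, outer scaled skip. Below, β = 0.2 follows the value commonly used in ESRGAN-style RRDBs, and the blocks are plain callables standing in for dense convolutional blocks (an illustrative sketch, not the paper's network code):

```python
import numpy as np

def residual_in_residual(x, blocks, beta=0.2):
    # Inner skips: each block's output is scaled by beta and added
    # back to its input. Outer skip: the whole group's change from x
    # is also scaled by beta before being added to x. The scaled
    # skips keep gradients flowing, countering vanishing gradients.
    y = x
    for block in blocks:
        y = y + beta * block(y)
    return x + beta * (y - x)
```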