In recent years, three-dimensional reconstruction technologies that employ multiple cameras have continued to evolve significantly, enabling remote collaboration among users in extended reality (XR) environments. In addition, methods for deploying multiple cameras for motion capture of users (e.g., performers) are widely used in computer graphics. As the need to minimize and optimize the number of cameras grows to reduce costs, various technologies and research approaches focused on Optimal Camera Placement (OCP) are continually being proposed. However, as most existing studies assume homogeneous camera setups, there is a growing demand for studies on heterogeneous camera setups. For instance, technical demands keep emerging in scenarios with minimal camera configurations, especially regarding cost factors, the physical placement of cameras given the spatial structure, and image capture strategies for heterogeneous cameras, such as high-resolution RGB cameras and depth cameras. In this study, we propose a pre-visualization and simulation method for the optimal placement of heterogeneous cameras in XR environments, accounting for both the specifications of heterogeneous cameras (e.g., field of view) and the physical configuration (e.g., wall configuration) of real-world spaces. The proposed method performs a visibility analysis of cameras by considering each camera’s field-of-view volume, resolution, and unique characteristics, along with physical-space constraints. This approach enables the optimal position and rotation of each camera to be recommended, along with the minimum number of cameras required. In our experiments with heterogeneous camera combinations, the proposed method achieved 81.7% to 82.7% coverage of the target visual information using only 2 to 3 cameras. In contrast, setups with a single (homogeneous) camera type required 11 cameras for 81.6% coverage. Accordingly, we found that camera deployment resources can be reduced with the proposed approach.
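The abstract does not say how cameras are chosen once visibility has been analyzed. As a rough illustration only, the sketch below assumes each candidate camera pose has already been reduced to the set of target sample points it can see, and applies a plain greedy set-cover heuristic until a coverage goal is met. The function name, the candidate names, and the greedy strategy itself are hypothetical and are not taken from the paper.

```python
def greedy_camera_selection(candidates, num_targets, required_coverage):
    """Pick cameras greedily until the covered fraction reaches the goal.

    candidates: dict mapping a (hypothetical) camera-pose name to the set of
                target point ids visible from that pose.
    num_targets: total number of target sample points.
    required_coverage: fraction in [0, 1] that must be covered.
    Returns (chosen names in pick order, achieved coverage fraction).
    """
    covered = set()
    chosen = []
    remaining = dict(candidates)
    while remaining and len(covered) / num_targets < required_coverage:
        # Choose the pose that adds the most not-yet-covered points.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # no remaining pose adds anything new
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, len(covered) / num_targets


# Toy visibility sets for four assumed candidate poses over 10 target points.
candidates = {
    "rgb_wide": {0, 1, 2, 3, 4, 5},
    "depth_a": {4, 5, 6, 7},
    "depth_b": {8},
    "rgb_narrow": {0, 1},
}
chosen, coverage = greedy_camera_selection(candidates, 10, 0.8)
print(chosen, coverage)
```

Greedy selection is not guaranteed to find the true minimum number of cameras; it is shown here only because it makes the trade-off between camera count and coverage easy to see.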
To address the limited scale adaptation of autonomous driving object detection algorithms in low-illumination environments and their shortcomings in handling target occlusion, this paper proposes YOLO-LKSDS, an autonomous driving detection model. First, the Contrast-Limited Adaptive Histogram Equalisation (CLAHE) image enhancement algorithm is improved to increase image contrast and enhance the detailed features of the target. Then, on the basis of the YOLOv5 model, the K-means++ clustering algorithm is introduced to obtain suitable anchor boxes, and SPPELAN spatial pyramid pooling is improved to enhance the accuracy and robustness of the model for multi-scale target detection. Finally, an improved SEAM (Separated and Enhancement Attention Module) attention mechanism is combined with the DIoU-NMS algorithm to optimize the model’s performance in occluded and dense scenes. Compared with the original model, the improved YOLO-LKSDS model achieves a 13.3% improvement in accuracy, a 1.7% improvement in mAP, and 240,000 fewer parameters on the BDD100K dataset. To validate the generalization of the improved algorithm, we also ran experiments on the KITTI dataset, where accuracy improves by 21.1%, recall by 36.6%, and mAP50 by 29.5% relative to YOLOv5. Deployment of the algorithm is verified on an edge computing platform, where the average detection speed reaches 24.4 FPS while power consumption remains below 9 W, demonstrating high real-time capability and energy efficiency.
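For readers unfamiliar with DIoU-NMS, the sketch below shows the standard formulation: the usual IoU is penalized by the squared distance between box centers divided by the squared diagonal of the smallest enclosing box, and a lower-scoring box is suppressed only when this DIoU value reaches the threshold. It is a generic pure-Python illustration, not the paper's implementation; the function names and the 0.5 threshold are assumptions.

```python
def iou_and_diou(a, b):
    """Return (IoU, DIoU) for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Intersection area (zero if the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2
    diou = iou - d2 / c2 if c2 > 0 else iou
    return iou, diou


def diou_nms(boxes, scores, threshold=0.5):
    """Return indices of kept boxes, highest score first (threshold is an assumed default)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only boxes whose DIoU with the selected box is below the threshold.
        order = [i for i in order if iou_and_diou(boxes[best], boxes[i])[1] < threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))
```

Because the center-distance penalty lowers the score of boxes whose centers are far apart, two heavily overlapping detections of different, partially occluded objects are less likely to be merged than under plain IoU-based NMS.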
To address the challenges of high-precision optical surface defect detection, we propose a novel design for a wide-field and broadband light field camera in this work. The proposed system can achieve a 50° field of view and operates at both visible and near-infrared wavelengths. Using the principles of light field imaging, the proposed design enables 3D reconstruction of optical surfaces, thus enabling vertical surface height measurements with enhanced accuracy. Using Zemax-based simulations, we evaluate the system’s modulation transfer function, its optical aberrations, and its tolerance to shape variations through Zernike coefficient adjustments. The results demonstrate that this camera can achieve the required spatial resolution while also maintaining high imaging quality and thus offers a promising solution for advanced optical surface defect inspection.
Low-light image enhancement aims to improve the visibility of severely degraded images captured under insufficient illumination, alleviating the adverse effects of illumination degradation on image quality. Traditional Retinex-based approaches, inspired by human visual perception of brightness and color, decompose an image into illumination and reflectance components to restore fine details. However, their limited capacity for handling noise and complex lighting conditions often leads to distortions and artifacts in the enhanced results, particularly under extreme low-light scenarios. Although deep learning methods built upon Retinex theory have recently advanced the field, most still suffer from insufficient interpretability and sub-optimal enhancement performance. This paper presents RetinexWT, a novel framework that tightly integrates classical Retinex theory with modern deep learning. Following Retinex principles, RetinexWT employs wavelet transforms to estimate illumination maps for brightness adjustment. A detail-recovery module that synergistically combines Vision Transformer (ViT) and wavelet transforms is then introduced to guide the restoration of lost details, thereby improving overall image quality. Within the framework, wavelet decomposition splits input features into high-frequency and low-frequency components, enabling scale-specific processing of global illumination/color cues and fine textures. Furthermore, a gating mechanism selectively fuses down-sampled and up-sampled features, while an attention-based fusion strategy enhances model interpretability. Extensive experiments on the LOL dataset demonstrate that RetinexWT surpasses existing Retinex-oriented deep learning methods, achieving an average Peak Signal-to-Noise Ratio (PSNR) improvement of 0.22 dB over the current state of the art (SOTA), thereby confirming its superiority in low-light image enhancement. Code is available at https://github.com/CHEN-hJ516/RetinexWT (accessed on 14 October 2025).
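Several abstracts in this listing report results in PSNR. As a reminder of what the number means, the sketch below computes PSNR from the mean squared error between a reference image and an enhanced image, using the standard definition 10·log10(MAX² / MSE). It is a minimal pure-Python version operating on nested lists of 8-bit gray values; the function name and the 255 peak value are assumptions for illustration and are unrelated to the RetinexWT code base.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized gray images."""
    ref = [p for row in reference for p in row]
    out = [p for row in enhanced for p in row]
    if len(ref) != len(out) or not ref:
        raise ValueError("images must be non-empty and have the same size")
    mse = sum((r - o) ** 2 for r, o in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [[0, 255], [128, 64]]
enhanced = [[1, 254], [129, 63]]  # every pixel is off by one gray level
print(round(psnr(reference, enhanced), 2))
```

Because PSNR is logarithmic in the error, a gain of 0.22 dB corresponds to a reduction in mean squared error of roughly 5%.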
Images taken in dim environments frequently exhibit issues such as insufficient brightness, noise, color shifts, and loss of detail. These problems pose significant challenges to dark image enhancement tasks. Current approaches, while effective in global illumination modeling, often struggle to simultaneously suppress noise and preserve structural details, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels introduces additional challenges to accurate enhancement. In response to these difficulties, we introduce a single-stage framework, M2ATNet, built on a multi-scale multi-attention and Transformer architecture. First, to address the problems of texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), which is applied separately to the luminance and color channels to enhance the structural and texture modeling capabilities. Second, to solve the misalignment of the luminance and color channels, we introduce the multi-channel feature fusion Transformer (CFFT) module, which effectively recovers dark details and corrects color shifts through cross-channel alignment and deep feature interaction. To guide the model to learn more stably and efficiently, we also fuse multiple types of loss functions to form a hybrid loss term. We extensively evaluate the proposed method on various standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluation in terms of numerical metrics and visual quality demonstrates that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles played by the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
Background: Studies have shown that heart rate variability (HRV) is a predictor of the prognosis of cardiovascular diseases. Contact heartbeat monitoring equipment is widely used, especially in hospitals, and benefits from the rapidity and accuracy of the detection of physiological health indicators. However, long-term contact with equipment has many adverse effects. The purpose of this study was to improve the accuracy of HRV detection via noncontact equipment, thus enabling HRV to be assessed in various scenarios. Methods: A novel deep learning approach was proposed for measuring heartbeats through camera videos. First, we performed facial segmentation and divided the face into 16 grid cells with different light balance scores. After the trend was filtered with a Hamming window, a transformer-based neural network was used to further filter the signal. Finally, heart rate (HR) and HRV were estimated. Results: We used 1 million synthetic data points for pretraining and a public dataset in combination with a dataset that we constructed for task training. The final results were obtained on a test dataset that we constructed. The accuracy for HR with a low light balance score (0.867-0.983) was greater than that with a high score (0.667-0.750). Our method had higher accuracy in estimating HR than traditional filtering methods (0.167-0.417) and state-of-the-art neural network filtering methods (0.783-0.917). Compared with the estimates from two neural networks, our method yielded the lowest root mean square error for time-domain HRV and the highest correlation index score for frequency-domain HRV. Conclusions: Light balance, large sample training, and two-stage training can improve the accuracy of HRV estimation.
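The abstract does not state which HRV indices were computed. As an illustration of what time-domain HRV estimation typically involves once inter-beat (RR) intervals have been extracted from the filtered signal, the sketch below derives mean heart rate together with two common indices, SDNN and RMSSD. The choice of these two indices, the function name, and the sample intervals are assumptions and are not taken from the study.

```python
import math


def hr_and_time_domain_hrv(rr_ms):
    """Return (heart rate in bpm, SDNN in ms, RMSSD in ms) from RR intervals in ms."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    mean_rr = sum(rr_ms) / len(rr_ms)
    heart_rate = 60000.0 / mean_rr  # 60,000 ms per minute
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (len(rr_ms) - 1))
    # RMSSD: root mean square of successive RR differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return heart_rate, sdnn, rmssd


hr, sdnn, rmssd = hr_and_time_domain_hrv([800, 810, 790, 800])
print(round(hr, 1), round(sdnn, 2), round(rmssd, 2))
```

Both indices depend on millisecond-level differences between beats, which is why small timing errors in a camera-derived pulse signal affect HRV much more strongly than they affect mean heart rate.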
LiDAR and camera are two of the most common sensors used in the fields of robot perception, autonomous driving, augmented reality, and virtual reality, where these sensors are widely used to perform various tasks such as odometry estimation and 3D reconstruction. Fusing the information from these two sensors can significantly increase the robustness and accuracy of these perception tasks. The extrinsic calibration between cameras and LiDAR is a fundamental prerequisite for multimodal systems. Recently, extensive studies have been conducted on the calibration of extrinsic parameters. Although several calibration methods facilitate sensor fusion, a comprehensive summary for researchers and, especially, non-expert users is lacking. Thus, we present an overview of extrinsic calibration and discuss diverse calibration methods from the perspective of calibration system design. Based on the calibration information sources, this study classifies these methods as target-based or targetless. For each type of calibration method, further classification was performed according to the diverse types of features or constraints used in the calibration process, and their detailed implementations and key characteristics were introduced. Thereafter, calibration-accuracy evaluation methods are presented. Finally, we comprehensively compare the advantages and disadvantages of each calibration method and suggest directions for practical applications and future research.
Recently, a multitude of techniques that fuse deep learning with Retinex theory have been utilized in the field of low-light image enhancement, yielding remarkable outcomes. Due to the intricate nature of imaging scenarios, including fluctuating noise levels and unpredictable environmental elements, these techniques do not fully resolve these challenges. We introduce an innovative strategy that builds upon Retinex theory and integrates a novel deep network architecture, merging the Convolutional Block Attention Module (CBAM) with the Transformer. Our model is capable of detecting more prominent features across both channel and spatial domains. We have conducted extensive experiments across several datasets, namely LOLv1, LOLv2-real, and LOLv2-sync. The results show that our approach surpasses other methods when evaluated against critical metrics such as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). Moreover, we have visually assessed images enhanced by various techniques and utilized visual metrics like LPIPS for comparison, and the experimental data clearly demonstrate that our approach excels visually over other methods as well.
Due to the limitations of spatial bandwidth product and data transmission bandwidth, the field of view, resolution, and imaging speed constrain each other in an optical imaging system. Here, a fast-zoom and high-resolution sparse compound-eye camera (CEC) based on dual-end collaborative optimization is proposed, which provides a cost-effective way to break through the trade-off among the field of view, resolution, and imaging speed. At the optical end, a sparse CEC based on liquid lenses is designed, which can realize large-field-of-view imaging in real time and fast zooming within 5 ms. At the computational end, a disturbed degradation model driven super-resolution network (DDMDSR-Net) is proposed to deal with complex image degradation issues in actual imaging situations, achieving high-robustness and high-fidelity resolution enhancement. Based on the proposed dual-end collaborative optimization framework, the angular resolution of the CEC can be enhanced from 71.6" to 26.0", which provides a way to realize high-resolution imaging for array cameras without high optical hardware complexity or high data transmission bandwidth. Experiments verify the advantages of the CEC based on dual-end collaborative optimization in high-fidelity reconstruction of real scene images, kilometer-level long-distance detection, and dynamic imaging and precise recognition of targets of interest.
Observatories typically deploy all-sky cameras for monitoring cloud cover and weather conditions. However, many of these cameras lack scientific-grade sensors, resulting in limited photometric precision, which makes calculating the sky area visibility distribution via extinction measurement challenging. To address this issue, we propose the Photometry-Free Sky Area Visibility Estimation (PFSAVE) method. This method uses the standard magnitude of the faintest star observed within a given sky area to estimate visibility. By employing a per-transformation refitting optimization strategy, we achieve a high-precision coordinate transformation model with an accuracy of 0.42 pixels. HEALPix segmentation is also introduced to achieve high spatial resolution. Comprehensive analysis based on real all-sky images demonstrates that our method exhibits higher accuracy than the extinction-based method. Our method supports both manual and robotic dynamic scheduling, especially under partially cloudy conditions.
Low-light image enhancement has been one of the most active research areas in computer vision in recent years. In the low-light image enhancement process, loss of image details and an increase in noise occur inevitably, degrading the quality of enhanced images. To alleviate this problem, a low-light image enhancement model based on Retinex theory, called RetinexNet, was proposed in this study. The model was composed of an image decomposition module and a brightness enhancement module. In the decomposition module, a convolutional block attention module (CBAM) was incorporated to enhance the feature representation capacity of the network, focusing on crucial features and suppressing irrelevant ones. A multi-feature fusion denoising module was designed within the brightness enhancement module, circumventing the issue of feature loss during downsampling. The proposed model outperforms existing algorithms in terms of PSNR and SSIM on the publicly available datasets LOL and MIT-Adobe FiveK, and gives superior results in terms of NIQE on the publicly available dataset LIME.
This paper presents a high-speed and robust dual-band infrared thermal camera based on an ARM CPU. The system consists of a low-resolution long-wavelength infrared detector, a digital temperature and humidity sensor, and a CMOS sensor. In view of the significant contrast between face and background in thermal infrared images, this paper explores a suitable accuracy-latency tradeoff for thermal face detection and proposes a tiny, lightweight detector named YOLO-Fastest-IR. Four YOLO-Fastest-IR models (IR0 to IR3) with different scales are designed based on YOLO-Fastest. To train and evaluate these lightweight models, a multi-user low-resolution thermal face database (RGBT-MLTF) was collected, and the four networks were trained. Experiments demonstrate that the lightweight convolutional neural network performs well in thermal infrared face detection tasks. The proposed algorithm outperforms existing face detection methods in both positioning accuracy and speed, making it more suitable for deployment on mobile platforms or embedded devices. After obtaining the region of interest (ROI) in the infrared (IR) image, the RGB camera is guided by the thermal infrared face detection results to achieve fine positioning of the RGB face. Experimental results show that YOLO-Fastest-IR achieves a frame rate of 92.9 FPS on a Raspberry Pi 4B and successfully detects 97.4% of faces in the RGBT-MLTF test set. Ultimately, an infrared temperature measurement system with low cost, strong robustness, and high real-time performance was integrated, achieving a temperature measurement accuracy of 0.3 °C.
Enhancing low-light images with color distortion and uneven multi-light source distribution presents challenges. Most advanced methods for low-light image enhancement are based on the Retinex model using deep learning. Retinexformer introduces channel self-attention mechanisms in the IG-MSA. However, it fails to effectively capture long-range spatial dependencies, leaving room for improvement. Based on the Retinexformer deep learning framework, we designed the Retinexformer+ network. The “+” signifies our advancements in extracting long-range spatial dependencies. We introduced multi-scale dilated convolutions in illumination estimation to expand the receptive field. These convolutions effectively capture the weakening semantic dependency between pixels as distance increases. In illumination restoration, we used Unet++ with multi-level skip connections to better integrate semantic information at different scales. The designed Illumination Fusion Dual Self-Attention (IF-DSA) module embeds multi-scale dilated convolutions to achieve spatial self-attention. This module captures long-range spatial semantic relationships within acceptable computational complexity. Experimental results on the Low-Light (LOL) dataset show that Retinexformer+ outperforms other State-Of-The-Art (SOTA) methods in both quantitative and qualitative evaluations, with the computational complexity increased to an acceptable 51.63 GFLOPs. On the LOL_v1 dataset, Retinexformer+ shows an increase of 1.15 in Peak Signal-to-Noise Ratio (PSNR) and a decrease of 0.39 in Root Mean Square Error (RMSE). On the LOL_v2_real dataset, the PSNR increases by 0.42 and the RMSE decreases by 0.18. Experimental results on the Exdark dataset show that Retinexformer+ can effectively enhance real-scene images and maintain their semantic information.
Visible-infrared object detection leverages the day-night stable object perception capability of infrared images to enhance detection robustness in low-light environments by fusing the complementary information of visible and infrared images. However, the inherent differences in the imaging mechanisms of visible and infrared modalities make effective cross-modal fusion challenging. Furthermore, constrained by the physical characteristics of sensors and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion of infrared and visible images through a carefully designed multiscale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation in infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships of cross-modal features on a global scale, thereby enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
Under low-illumination conditions, the quality of image signals deteriorates significantly, typically characterized by a peak signal-to-noise ratio (PSNR) below 10 dB, which severely limits the usability of the images. Supervised methods, which use paired low-light and normal-light images as training sets, can raise the PSNR to around 20 dB, significantly improving image quality. However, such data is challenging to obtain. In recent years, unsupervised low-light image enhancement (LIE) methods based on the Retinex framework have been proposed, but they generally lag behind supervised methods by 5–10 dB in performance. In this paper, we introduce the Denoising-Distilled Retinex (DDR) method, an unsupervised approach that integrates denoising priors into a Retinex-based training framework. By explicitly incorporating denoising, the DDR method effectively addresses the challenges of noise and artifacts in low-light images, thereby enhancing the performance of the Retinex framework. The model achieved a PSNR of 19.82 dB on the LOL dataset, which is comparable to the performance of supervised methods. Furthermore, by applying knowledge distillation, the DDR method optimizes the model for real-time processing of low-light images, achieving a processing speed of 199.7 fps without incurring additional computational costs. While the DDR method has demonstrated superior performance in terms of image quality and processing speed, there is still room for improvement in terms of robustness across different color spaces and under highly resource-constrained conditions. Future research will focus on enhancing the model’s generalizability and adaptability to address these challenges. Our rigorous testing on public datasets further substantiates the DDR method’s state-of-the-art performance in both image quality and processing speed.
Infrared and visible light image fusion technology integrates feature information from two different modalities into a fused image to obtain more comprehensive information. However, in low-light scenarios, the illumination degradation of visible light images makes it difficult for existing fusion methods to extract texture detail information from the scene. In such cases, relying solely on the target saliency information provided by infrared images is far from sufficient. To address this challenge, this paper proposes a lightweight infrared and visible light image fusion method based on low-light enhancement, named LLE-Fuse. The method is based on an improvement of the MobileOne Block, using the Edge-MobileOne Block embedded with the Sobel operator to perform feature extraction and downsampling on the source images. The intermediate features obtained at different scales are then fused by a cross-modal attention fusion module. In addition, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is used for image enhancement of both infrared and visible light images, guiding the network model to learn low-light enhancement capabilities through an enhancement loss. Upon completion of network training, the Edge-MobileOne Block is optimized into a direct connection structure similar to MobileNetV1 through structural reparameterization, effectively reducing computational resource consumption. Finally, after extensive experimental comparisons, our method achieved improvements of 4.6%, 40.5%, 156.9%, 9.2%, and 98.6% in the evaluation metrics Standard Deviation (SD), Visual Information Fidelity (VIF), Entropy (EN), and Spatial Frequency (SF), respectively, compared to the best results of the compared algorithms, while only being 1.5 ms/it slower in computation speed than the fastest method.
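The abstract mentions that the Edge-MobileOne Block embeds the Sobel operator but gives no implementation detail. The sketch below shows only the classical Sobel filter itself: two fixed 3×3 kernels estimate the horizontal and vertical intensity gradients, and their magnitude highlights edges. It is a plain pure-Python illustration on a nested-list gray image and makes no claim about how the operator is wired into the network; the function name and the zero-valued border handling are assumptions.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude map; border pixels are left at 0.0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# A 4x4 image with a vertical edge between the second and third columns.
edge_image = [[0, 0, 10, 10] for _ in range(4)]
for row in sobel_magnitude(edge_image):
    print(row)
```

Because the kernels are fixed rather than learned, an edge branch of this kind adds gradient information without adding trainable parameters, which is consistent with the lightweight goal stated in the abstract.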
This research addresses the critical challenge of enhancing satellite images captured under low-light conditions, which suffer from severely degraded quality, including a lack of detail, poor contrast, and low usability. Overcoming this limitation is essential for maximizing the value of satellite imagery in downstream computer vision tasks (e.g., spacecraft on-orbit connection, spacecraft surface repair, space debris capture) that rely on clear visual information. Our key novelty lies in an unsupervised generative adversarial network featuring two main contributions: (1) an improved U-Net (IU-Net) generator with multi-scale feature fusion in the contracting path for richer semantic feature extraction, and (2) a Global Illumination Attention Module (GIA) at the end of the contracting path to couple local and global information, significantly improving detail recovery and illumination adjustment. The proposed algorithm operates in an unsupervised manner. It is trained and evaluated on our self-constructed, unpaired Spacecraft Dataset for Detection, Enforcement, and Parts Recognition (SDDEP), designed specifically for low-light enhancement tasks. Extensive experiments demonstrate that our method outperforms the baseline EnlightenGAN, achieving improvements of 2.7% in structural similarity (SSIM), 4.7% in peak signal-to-noise ratio (PSNR), 6.3% in learned perceptual image patch similarity (LPIPS), and 53.2% in DeltaE 2000. Qualitatively, the enhanced images exhibit higher overall and local brightness, improved contrast, and more natural visual effects.
Low-light images often have defects such as low visibility, low contrast, high noise, and high color distortion compared with well-exposed images. If the low-light region of an image is enhanced directly, the noise will inevitably blur the whole image. Besides, according to the retina-and-cortex (Retinex) theory of color vision, the reflectivity of different image regions may differ, limiting the enhancement performance of applying uniform operations to the entire image. Therefore, we design a Hierarchical Flow Learning (HFL) framework, which consists of a Hierarchical Image Network (HIN) and a normalized invertible Flow Learning Network (FLN). HIN can extract hierarchical structural features from low-light images, while FLN maps the distribution of normally exposed images to a Gaussian distribution using the learned hierarchical features of low-light images. In subsequent testing, the reversibility of FLN allows enhanced low-light images to be inferred. Specifically, the HIN extracts as much image information as possible from three scales (local, regional, and global) using a Triple-branch Hierarchical Fusion Module (THFM) and a Dual-Dconv Cross Fusion Module (DCFM). The THFM aggregates regional and global features to enhance the overall brightness and quality of low-light images by perceiving and extracting more structure information, whereas the DCFM uses the properties of the activation function and local features to enhance images at the pixel level to reduce noise and improve contrast. In addition, the model was trained using a negative log-likelihood loss function. Qualitative and quantitative experimental results demonstrate that our HFL can better handle many quality degradation types in low-light images compared with state-of-the-art solutions. The HFL model enhances low-light images with better visibility, less noise, and improved contrast, making it suitable for practical scenarios such as autonomous driving, medical imaging, and nighttime surveillance. On the LOL-v1 benchmark dataset, HFL outperforms the compared methods, reaching PSNR = 27.26 dB, SSIM = 0.93, and LPIPS = 0.10. The source code of HFL is available at https://github.com/Smile-QT/HFL-for-LIE.
This study aimed to identify the physiological mechanisms enabling low-N-tolerant maize cultivar to maintain higher photosynthesis and yield under low-N,low-light,and combined stress.In a three-year field trial of low...This study aimed to identify the physiological mechanisms enabling low-N-tolerant maize cultivar to maintain higher photosynthesis and yield under low-N,low-light,and combined stress.In a three-year field trial of low-N-tolerant and low-N-sensitive maize cultivars under two N fertilization(normal N:240 kg N ha^(−1);low-N:150 kg N ha^(−1))and two light conditions(normal light;low-light:35%light reduction),the tolerant cultivar showed higher net photosynthetic rate than the sensitive one.Random Forest analysis and Structural Equation Modeling identified PSI donor-side limitation(elevated Y_(ND))as the key photosynthetic constraint.The tolerant cultivar maintained higher D1 and PsaA protein levels and preferentially allocated photosynthetic N to electron transport.This strategy reduced Y_(ND)and sustained photosystem stability,thus improving carboxylation efficiency and resulting in higher photosynthesis.展开更多
Funding: Supported by the 2024 Research Fund of University of Ulsan.
Abstract: In recent years, three-dimensional reconstruction technologies that employ multiple cameras have continued to evolve significantly, enabling remote collaboration among users in Extended Reality (XR) environments. In addition, methods for deploying multiple cameras for motion capture of users (e.g., performers) are widely used in computer graphics. As the need to minimize and optimize the number of cameras grows to reduce costs, various technologies and research approaches focused on Optimal Camera Placement (OCP) are continually being proposed. However, as most existing studies assume homogeneous camera setups, there is a growing demand for studies on heterogeneous camera setups. For instance, technical demands keep emerging in scenarios with minimal camera configurations, especially regarding cost factors, the physical placement of cameras given the spatial structure, and image capture strategies for heterogeneous cameras, such as high-resolution RGB cameras and depth cameras. In this study, we propose a pre-visualization and simulation method for the optimal placement of heterogeneous cameras in XR environments, accounting for both the specifications of heterogeneous cameras (e.g., field of view) and the physical configuration (e.g., wall configuration) in real-world spaces. The proposed method performs a visibility analysis of cameras by considering each camera’s field-of-view volume, resolution, and unique characteristics, along with physical space constraints. This approach enables the optimal position and rotation of each camera to be recommended, along with the minimum number of cameras required. In our study of heterogeneous camera combinations, the proposed method achieved 81.7% to 82.7% coverage of the target visual information using only 2 to 3 cameras. In contrast, a single-type (homogeneous) camera setup required 11 cameras for 81.6% coverage. Accordingly, we found that camera deployment resources can be reduced with the proposed approaches.
Funding: Supported by the Key R&D Program of Shaanxi Province (No. 2025CYYBXM-078).
Abstract: To address the poor scale adaptation of autonomous driving object detection algorithms in low-illumination environments and their shortcomings in handling occluded targets, this paper proposes the YOLO-LKSDS autonomous driving detection model. First, the Contrast-Limited Adaptive Histogram Equalisation (CLAHE) image enhancement algorithm is improved to increase image contrast and enhance the detailed features of the target. Then, on the basis of the YOLOv5 model, the K-means++ clustering algorithm is introduced to obtain suitable anchor boxes, and SPPELAN spatial pyramid pooling is improved to enhance the accuracy and robustness of the model for multi-scale target detection. Finally, an improved SEAM (Separated and Enhancement Attention Module) attention mechanism is combined with the DIoU-NMS algorithm to optimize the model’s performance when dealing with occlusion and dense scenes. Compared with the original model, the improved YOLO-LKSDS model achieves a 13.3% improvement in accuracy, a 1.7% improvement in mAP, and 240,000 fewer parameters on the BDD100K dataset. To validate the generalization of the improved algorithm, we selected the KITTI dataset for experimentation, which shows that YOLOv5’s accuracy improves by 21.1%, recall by 36.6%, and mAP50 by 29.5%. The deployment of the algorithm is verified on an edge computing platform, where the average detection speed reaches 24.4 FPS while power consumption remains below 9 W, demonstrating high real-time capability and energy efficiency.
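The DIoU-NMS step named in this abstract can be illustrated with a minimal sketch. It assumes the standard Distance-IoU definition (IoU minus the squared centre distance divided by the squared diagonal of the smallest enclosing box); the function names, the box format `(x1, y1, x2, y2)`, and the 0.5 threshold are illustrative assumptions, not details taken from the paper.

```python
def diou(box_a, box_b):
    """Distance-IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centres.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box when its DIoU with a kept box exceeds threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep
```

Because the centre-distance penalty lowers the score of boxes whose centres are far apart, two heavily overlapping boxes around different occluded objects are less likely to suppress each other than under plain IoU-based NMS.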
Funding: Supported by the Jilin Science and Technology Development Plan (20240101029JJ) for the study of synchronized high-speed detection of surface shape and defects in the grinding stage of complex surfaces (KLMSZZ202305); by the National Major Research Instrument Development Project (62127901) for the high-precision wide dynamic large aperture optical inspection system for fine astronomical observation; by the National Key R&D Program (2022YFB3403405) for ultrasmooth manufacturing technology of large diameter complex curved surfaces; and by the International Cooperation Project (2025010157) for research on the key technology of rapid synchronous detection of surface shape and subsurface defects in the grinding stage of large diameter complex surfaces. The Key Laboratory of Optical System Advanced Manufacturing Technology, Chinese Academy of Sciences (2022KLOMT02-04) also supported this study.
Abstract: To address the challenges of high-precision optical surface defect detection, we propose a novel design for a wide-field and broadband light field camera in this work. The proposed system can achieve a 50° field of view and operates at both visible and near-infrared wavelengths. Using the principles of light field imaging, the proposed design enables 3D reconstruction of optical surfaces, thus enabling vertical surface height measurements with enhanced accuracy. Using Zemax-based simulations, we evaluate the system’s modulation transfer function, its optical aberrations, and its tolerance to shape variations through Zernike coefficient adjustments. The results demonstrate that this camera can achieve the required spatial resolution while also maintaining high imaging quality and thus offers a promising solution for advanced optical surface defect inspection.
Funding: Supported in part by the National Natural Science Foundation of China [Grant number 62471075] and the Major Science and Technology Project Grant of the Chongqing Municipal Education Commission [Grant number KJZD-M202301901].
Abstract: Low-light image enhancement aims to improve the visibility of severely degraded images captured under insufficient illumination, alleviating the adverse effects of illumination degradation on image quality. Traditional Retinex-based approaches, inspired by human visual perception of brightness and color, decompose an image into illumination and reflectance components to restore fine details. However, their limited capacity for handling noise and complex lighting conditions often leads to distortions and artifacts in the enhanced results, particularly under extreme low-light scenarios. Although deep learning methods built upon Retinex theory have recently advanced the field, most still suffer from insufficient interpretability and sub-optimal enhancement performance. This paper presents RetinexWT, a novel framework that tightly integrates classical Retinex theory with modern deep learning. Following Retinex principles, RetinexWT employs wavelet transforms to estimate illumination maps for brightness adjustment. A detail-recovery module that synergistically combines Vision Transformer (ViT) and wavelet transforms is then introduced to guide the restoration of lost details, thereby improving overall image quality. Within the framework, wavelet decomposition splits input features into high-frequency and low-frequency components, enabling scale-specific processing of global illumination/color cues and fine textures. Furthermore, a gating mechanism selectively fuses down-sampled and up-sampled features, while an attention-based fusion strategy enhances model interpretability. Extensive experiments on the LOL dataset demonstrate that RetinexWT surpasses existing Retinex-oriented deep learning methods, achieving an average Peak Signal-to-Noise Ratio (PSNR) improvement of 0.22 dB over the current state of the art (SOTA), thereby confirming its superiority in low-light image enhancement. Code is available at https://github.com/CHEN-hJ516/RetinexWT (accessed on 14 October 2025).
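The split into low-frequency and high-frequency components described in this abstract can be illustrated with a one-level Haar transform. This is only a sketch: the abstract does not state which wavelet RetinexWT uses, and the example works on a 1D row of values rather than 2D feature maps. The function names `haar_1d` and `inverse_haar_1d` are hypothetical.

```python
import math


def haar_1d(signal):
    """One-level orthonormal Haar transform of an even-length sequence.

    Returns (low, high): scaled pairwise sums (smooth, illumination-like
    content) and scaled pairwise differences (edges and fine texture).
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def inverse_haar_1d(low, high):
    """Reconstruct the original sequence from its Haar coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.extend([(a + d) / s, (a - d) / s])
    return out
```

The transform is invertible, so a network can process the two bands separately (for example, adjusting brightness on the low band and restoring texture on the high band) and then recombine them without losing information.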
Funding: Funded by the National Natural Science Foundation of China, grant numbers 52374156 and 62476005.
Abstract: Images taken in dim environments frequently exhibit issues like insufficient brightness, noise, color shifts, and loss of detail. These problems pose significant challenges to dark image enhancement tasks. Current approaches, while effective in global illumination modeling, often struggle to simultaneously suppress noise and preserve structural details, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels introduces additional challenges to accurate enhancement. In response to these difficulties, we introduce a single-stage framework, M2ATNet, using multi-scale multi-attention and a Transformer architecture. First, to address the problems of texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), which is applied separately to the luminance and color channels to enhance the structural and texture modeling capabilities. Second, to solve the non-alignment problem of the luminance and color channels, we introduce the multi-channel feature fusion Transformer (CFFT) module, which effectively recovers dark details and corrects color shifts through cross-channel alignment and deep feature interaction. To guide the model to learn more stably and efficiently, we also fuse multiple types of loss functions to form a hybrid loss term. We extensively evaluate the proposed method on various standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluation in terms of numerical metrics and visual quality demonstrates that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles played by the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
Funding: National Natural Science Foundation of China, Grant/Award Number: 72204169; Department of Science and Technology of Sichuan Province, Grant/Award Number: 2021YFS0393.
Abstract: Background: Studies have shown that heart rate variability (HRV) is a predictor of the prognosis of cardiovascular diseases. Contact heartbeat monitoring equipment is widely used, especially in hospitals, and benefits from the rapidity and accuracy of the detection of physiological health indicators. However, long-term contact with equipment has many adverse effects. The purpose of this study was to improve the accuracy of HRV detection via noncontact equipment, thus enabling HRV to be assessed in various scenarios. Methods: A novel deep learning approach was proposed for measuring heartbeats through camera videos. First, we performed facial segmentation and divided the face into 16 grid cells with different light balance scores. After the trend was filtered by the Hamming window, a transformer-based neural network was used to further filter the signal. Finally, heart rate (HR) and HRV were estimated. Results: We used 1 million synthetic data points for pretraining and a public dataset in combination with a dataset that we constructed for task training. The final results were obtained on a test dataset that we constructed. The accuracy for HR with a low light balance score (0.867-0.983) was greater than that with a high score (0.667-0.750). Our method had higher accuracy in estimating HR than traditional filtering methods (0.167-0.417) and state-of-the-art neural network filtering methods (0.783-0.917). Compared with the estimates of two neural networks, the root mean square error of the time-domain HRV estimated by our method was the lowest, and the correlation index score of its frequency-domain HRV was the highest. Conclusions: Light balance, large sample training, and two-stage training can improve the accuracy of HRV estimation.
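For readers unfamiliar with time-domain HRV, the sketch below computes heart rate and two widely used time-domain indices, SDNN and RMSSD, from a list of inter-beat intervals in milliseconds. The abstract does not say which HRV indices the study reports, so the choice of SDNN and RMSSD, the function names, and the input format are all assumptions made for illustration.

```python
import math


def heart_rate_bpm(ibi_ms):
    """Mean heart rate in beats per minute from inter-beat intervals (ms)."""
    return 60000.0 / (sum(ibi_ms) / len(ibi_ms))


def sdnn(ibi_ms):
    """Sample standard deviation of the inter-beat intervals (ms)."""
    mean = sum(ibi_ms) / len(ibi_ms)
    return math.sqrt(sum((x - mean) ** 2 for x in ibi_ms) / (len(ibi_ms) - 1))


def rmssd(ibi_ms):
    """Root mean square of successive interval differences (ms)."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

Both indices depend on beat-to-beat timing, which is why small errors in the filtered pulse signal affect HRV estimates more than they affect the mean heart rate.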
Funding: Supported by the Beijing Natural Science Foundation (Grant No. L241012) and the National Natural Science Foundation of China (Grant No. 62572468).
Abstract: LiDAR and cameras are two of the most common sensors used in robot perception, autonomous driving, augmented reality, and virtual reality, where they are widely used to perform tasks such as odometry estimation and 3D reconstruction. Fusing the information from these two sensors can significantly increase the robustness and accuracy of these perception tasks. The extrinsic calibration between cameras and LiDAR is a fundamental prerequisite for multimodal systems. Recently, extensive studies have been conducted on the calibration of extrinsic parameters. Although several calibration methods facilitate sensor fusion, a comprehensive summary for researchers and, especially, non-expert users is lacking. Thus, we present an overview of extrinsic calibration and discuss diverse calibration methods from the perspective of calibration system design. Based on the calibration information sources, this study classifies these methods as target-based or targetless. Each type of calibration method is further classified according to the types of features or constraints used in the calibration process, and the detailed implementations and key characteristics are introduced. Thereafter, calibration-accuracy evaluation methods are presented. Finally, we comprehensively compare the advantages and disadvantages of each calibration method and suggest directions for practical applications and future research.
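The role of the extrinsic parameters can be shown with a small sketch that maps a LiDAR point into camera pixel coordinates. It assumes an ideal pinhole camera without lens distortion; the function name `project_lidar_point` and all parameter names are hypothetical and not taken from the survey.

```python
def project_lidar_point(point, rotation, translation, fx, fy, cx, cy):
    """Project a 3D LiDAR point into pixel coordinates.

    rotation (3x3 nested list) and translation (length 3) are the extrinsics
    that map LiDAR coordinates into the camera frame; fx, fy, cx, cy are
    pinhole intrinsics. Returns None for points not in front of the camera.
    """
    # Rigid transform into the camera frame: p_cam = R * p_lidar + t.
    x, y, z = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if z <= 0:
        return None
    # Perspective division followed by the intrinsic mapping.
    return (fx * x / z + cx, fy * y / z + cy)
```

An error in `rotation` or `translation` shifts every projected point, which is why the reprojection error of known correspondences is a common way to evaluate calibration accuracy.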
Abstract: Recently, many techniques that fuse deep learning with Retinex theory have been applied to low-light image enhancement, yielding remarkable outcomes. However, due to the intricate nature of imaging scenarios, including fluctuating noise levels and unpredictable environmental elements, these techniques do not fully resolve the challenges involved. We introduce a strategy that builds upon Retinex theory and integrates a novel deep network architecture, merging the Convolutional Block Attention Module (CBAM) with the Transformer. Our model is capable of detecting more prominent features across both channel and spatial domains. We have conducted extensive experiments across several datasets, namely LOLv1, LOLv2-real, and LOLv2-sync. The results show that our approach surpasses other methods on critical metrics such as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). Moreover, we have visually assessed images enhanced by various techniques and used perceptual metrics such as LPIPS for comparison, and the experimental data demonstrate that our approach also excels visually over other methods.
Funding: Financial support from the National Natural Science Foundation of China (Grant Nos. U23A20368 and 62175006) and the Academic Excellence Foundation of BUAA for PhD Students.
Abstract: Due to the limitations of spatial bandwidth product and data transmission bandwidth, the field of view, resolution, and imaging speed constrain each other in an optical imaging system. Here, a fast-zoom and high-resolution sparse compound-eye camera (CEC) based on dual-end collaborative optimization is proposed, which provides a cost-effective way to break through the trade-off among the field of view, resolution, and imaging speed. In the optical end, a sparse CEC based on liquid lenses is designed, which can realize large-field-of-view imaging in real time and fast zooming within 5 ms. In the computational end, a disturbed degradation model driven super-resolution network (DDMDSR-Net) is proposed to deal with complex image degradation issues in actual imaging situations, achieving high-robustness and high-fidelity resolution enhancement. Based on the proposed dual-end collaborative optimization framework, the angular resolution of the CEC can be enhanced from 71.6″ to 26.0″, which provides a way to realize high-resolution imaging for array cameras without high optical hardware complexity or data transmission bandwidth. Experiments verify the advantages of the CEC based on dual-end collaborative optimization in high-fidelity reconstruction of real scene images, kilometer-level long-distance detection, and dynamic imaging and precise recognition of targets of interest.
Funding: Supported by the Natural Science Foundation of Jilin Province (20210101468JC), the Chinese Academy of Sciences and Local Government Cooperation Project (2023SYHZ0027, 23SH04), and the National Natural Science Foundation of China (12273063 and 12203078).
Abstract: Observatories typically deploy all-sky cameras for monitoring cloud cover and weather conditions. However, many of these cameras lack scientific-grade sensors, resulting in limited photometric precision, which makes calculating the sky area visibility distribution via extinction measurement challenging. To address this issue, we propose the Photometry-Free Sky Area Visibility Estimation (PFSAVE) method. This method uses the standard magnitude of the faintest star observed within a given sky area to estimate visibility. By employing a per-transformation refitting optimization strategy, we achieve a high-precision coordinate transformation model with an accuracy of 0.42 pixels. HEALPix segmentation is also introduced to achieve high spatial resolution. Comprehensive analysis based on real all-sky images demonstrates that our method exhibits higher accuracy than the extinction-based method. Our method supports both manual and robotic dynamic scheduling, especially under partially cloudy conditions.
Abstract: Low-light image enhancement has been one of the most active research areas in computer vision in recent years. In the low-light image enhancement process, loss of image details and an increase in noise occur inevitably, degrading the quality of enhanced images. To alleviate this problem, a low-light image enhancement model called RetinexNet, based on Retinex theory, was proposed in this study. The model was composed of an image decomposition module and a brightness enhancement module. In the decomposition module, a convolutional block attention module (CBAM) was incorporated to enhance the feature representation capacity of the network, focusing on crucial features and suppressing irrelevant ones. A multifeature fusion denoising module was designed within the brightness enhancement module, circumventing the issue of feature loss during downsampling. The proposed model outperforms existing algorithms in terms of PSNR and SSIM on the publicly available datasets LOL and MIT-Adobe FiveK, and gives superior results in terms of NIQE on the publicly available dataset LIME.
Funding: Supported by the Fundamental Research Funds for the Central Universities (2024300443) and the Natural Science Foundation of Jiangsu Province (BK20241224).
Abstract: This paper presents a high-speed and robust dual-band infrared thermal camera based on an ARM CPU. The system consists of a low-resolution long-wavelength infrared detector, a digital temperature and humidity sensor, and a CMOS sensor. In view of the significant contrast between face and background in thermal infrared images, this paper explores a suitable accuracy-latency tradeoff for thermal face detection and proposes a tiny, lightweight detector named YOLO-Fastest-IR. Four YOLO-Fastest-IR models (IR0 to IR3) with different scales are designed based on YOLO-Fastest. To train and evaluate these lightweight models, a multi-user low-resolution thermal face database (RGBT-MLTF) was collected, and the four networks were trained. Experiments demonstrate that the lightweight convolutional neural network performs well in thermal infrared face detection tasks. The proposed algorithm outperforms existing face detection methods in both positioning accuracy and speed, making it more suitable for deployment on mobile platforms or embedded devices. After obtaining the region of interest (ROI) in the infrared (IR) image, the RGB camera is guided by the thermal infrared face detection results to achieve fine positioning of the RGB face. Experimental results show that YOLO-Fastest-IR achieves a frame rate of 92.9 FPS on a Raspberry Pi 4B and successfully detects 97.4% of faces in the RGBT-MLTF test set. Ultimately, an infrared temperature measurement system with low cost, strong robustness, and high real-time performance was integrated, achieving a temperature measurement accuracy of 0.3 °C.
Funding: Supported by the Key Laboratory of Forensic Science and Technology at College of Sichuan Province (2023YB04).
Abstract: Enhancing low-light images with color distortion and uneven multi-light source distribution presents challenges. Most advanced methods for low-light image enhancement are based on the Retinex model using deep learning. Retinexformer introduces channel self-attention mechanisms in the IG-MSA. However, it fails to effectively capture long-range spatial dependencies, leaving room for improvement. Based on the Retinexformer deep learning framework, we designed the Retinexformer+ network. The “+” signifies our advancements in extracting long-range spatial dependencies. We introduced multi-scale dilated convolutions in illumination estimation to expand the receptive field. These convolutions effectively capture the weakening semantic dependency between pixels as distance increases. In illumination restoration, we used Unet++ with multi-level skip connections to better integrate semantic information at different scales. The designed Illumination Fusion Dual Self-Attention (IF-DSA) module embeds multi-scale dilated convolutions to achieve spatial self-attention. This module captures long-range spatial semantic relationships within acceptable computational complexity. Experimental results on the Low-Light (LOL) dataset show that Retinexformer+ outperforms other State-Of-The-Art (SOTA) methods in both quantitative and qualitative evaluations, with the computational complexity increased to an acceptable 51.63 GFLOPS. On the LOL_v1 dataset, Retinexformer+ shows an increase of 1.15 in Peak Signal-to-Noise Ratio (PSNR) and a decrease of 0.39 in Root Mean Square Error (RMSE). On the LOL_v2_real dataset, the PSNR increases by 0.42 and the RMSE decreases by 0.18. Experimental results on the Exdark dataset show that Retinexformer+ can effectively enhance real-scene images and maintain their semantic information.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62302086), the Natural Science Foundation of Liaoning Province (Grant No. 2023-MSBA-070), and the Fundamental Research Funds for the Central Universities (Grant No. N2317005).
Abstract: Visible-infrared object detection leverages the day-night stable object perception capability of infrared images to enhance detection robustness in low-light environments by fusing the complementary information of visible and infrared images. However, the inherent differences in the imaging mechanisms of visible and infrared modalities make effective cross-modal fusion challenging. Furthermore, constrained by the physical characteristics of sensors and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion of infrared and visible images through a carefully designed multiscale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation in infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships of cross-modal features on a global scale, thereby enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
Funding: Supported by the Guangxi Natural Science Foundation (Grant No. 2024GXNSFAA010484) and the National Natural Science Foundation of China (No. 62466013).
Abstract: Under low-illumination conditions, the quality of image signals deteriorates significantly, typically characterized by a peak signal-to-noise ratio (PSNR) below 10 dB, which severely limits the usability of the images. Supervised methods, which use paired high-light and low-light images as training sets, can raise the PSNR to around 20 dB, significantly improving image quality. However, such data is challenging to obtain. In recent years, unsupervised low-light image enhancement (LIE) methods based on the Retinex framework have been proposed, but they generally lag behind supervised methods by 5–10 dB in performance. In this paper, we introduce the Denoising-Distilled Retine (DDR) method, an unsupervised approach that integrates denoising priors into a Retinex-based training framework. By explicitly incorporating denoising, the DDR method effectively addresses the challenges of noise and artifacts in low-light images, thereby enhancing the performance of the Retinex framework. The model achieved a PSNR of 19.82 dB on the LOL dataset, which is comparable to the performance of supervised methods. Furthermore, by applying knowledge distillation, the DDR method optimizes the model for real-time processing of low-light images, achieving a processing speed of 199.7 fps without incurring additional computational costs. While the DDR method has demonstrated superior performance in terms of image quality and processing speed, there is still room for improvement in robustness across different color spaces and under highly resource-constrained conditions. Future research will focus on enhancing the model’s generalizability and adaptability to address these challenges. Our rigorous testing on public datasets further substantiates the DDR method’s state-of-the-art performance in both image quality and processing speed.
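The PSNR figures quoted in this abstract follow the standard definition, PSNR = 10 log10(MAX^2 / MSE). The sketch below computes it for two images supplied as flat lists of pixel values; the function name `psnr`, the flat-list input, and the default 8-bit peak value of 255 are assumptions for illustration, not details from the paper.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, the 5–10 dB gap mentioned above corresponds to a mean squared error roughly 3 to 10 times larger.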
Funding: This research was sponsored by the Xinjiang Uygur Autonomous Region Tianshan Talent Programme Project (2023TCLJ02) and the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C349).
Abstract: Infrared and visible light image fusion technology integrates feature information from two different modalities into a fused image to obtain more comprehensive information. However, in low-light scenarios, the illumination degradation of visible light images makes it difficult for existing fusion methods to extract texture detail information from the scene. In such cases, relying solely on the target saliency information provided by infrared images is far from sufficient. To address this challenge, this paper proposes a lightweight infrared and visible light image fusion method based on low-light enhancement, named LLE-Fuse. The method improves on the MobileOne Block, using an Edge-MobileOne Block embedded with the Sobel operator to perform feature extraction and downsampling on the source images. The intermediate features obtained at different scales are then fused by a cross-modal attention fusion module. In addition, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is used for image enhancement of both infrared and visible light images, guiding the network model to learn low-light enhancement capabilities through an enhancement loss. Upon completion of network training, the Edge-MobileOne Block is optimized into a direct connection structure similar to MobileNetV1 through structural reparameterization, effectively reducing computational resource consumption. Finally, after extensive experimental comparisons, our method achieved improvements of 4.6%, 40.5%, 156.9%, 9.2%, and 98.6% in the evaluation metrics Standard Deviation (SD), Visual Information Fidelity (VIF), Entropy (EN), and Spatial Frequency (SF), respectively, compared to the best results of the compared algorithms, while being only 1.5 ms/it slower in computation speed than the fastest method.
Funding: Supported by the Anhui Province University Key Science and Technology Project (2024AH053415) and the Anhui Province University Major Science and Technology Project (2024AH040229).
Abstract: This research addresses the critical challenge of enhancing satellite images captured under low-light conditions, which suffer from severely degraded quality, including a lack of detail, poor contrast, and low usability. Overcoming this limitation is essential for maximizing the value of satellite imagery in downstream computer vision tasks (e.g., spacecraft on-orbit connection, spacecraft surface repair, space debris capture) that rely on clear visual information. Our key novelty lies in an unsupervised generative adversarial network featuring two main contributions: (1) an improved U-Net (IU-Net) generator with multi-scale feature fusion in the contracting path for richer semantic feature extraction, and (2) a Global Illumination Attention Module (GIA) at the end of the contracting path to couple local and global information, significantly improving detail recovery and illumination adjustment. The proposed algorithm operates in an unsupervised manner. It is trained and evaluated on our self-constructed, unpaired Spacecraft Dataset for Detection, Enforcement, and Parts Recognition (SDDEP), designed specifically for low-light enhancement tasks. Extensive experiments demonstrate that our method outperforms the baseline EnlightenGAN, achieving improvements of 2.7% in structural similarity (SSIM), 4.7% in peak signal-to-noise ratio (PSNR), 6.3% in learned perceptual image patch similarity (LPIPS), and 53.2% in DeltaE 2000. Qualitatively, the enhanced images exhibit higher overall and local brightness, improved contrast, and more natural visual effects.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61971078, 61501070), the Scientific Research Foundation of Chongqing University of Technology (Grant No. 0121230236), and the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJ202301165).
Abstract: Low-light images often have defects such as low visibility, low contrast, high noise, and high color distortion compared with well-exposed images. If the low-light region of an image is enhanced directly, the noise will inevitably blur the whole image. Besides, according to the retina-and-cortex (Retinex) theory of color vision, the reflectivity of different image regions may differ, limiting the enhancement performance of applying uniform operations to the entire image. Therefore, we design a Hierarchical Flow Learning (HFL) framework, which consists of a Hierarchical Image Network (HIN) and a normalized invertible Flow Learning Network (FLN). HIN can extract hierarchical structural features from low-light images, while FLN maps the distribution of normally exposed images to a Gaussian distribution using the learned hierarchical features of low-light images. In subsequent testing, the reversibility of FLN allows enhanced low-light images to be inferred. Specifically, the HIN extracts as much image information as possible from three scales (local, regional, and global) using a Triple-branch Hierarchical Fusion Module (THFM) and a Dual-Dconv Cross Fusion Module (DCFM). The THFM aggregates regional and global features to enhance the overall brightness and quality of low-light images by perceiving and extracting more structure information, whereas the DCFM uses the properties of the activation function and local features to enhance images at the pixel level to reduce noise and improve contrast. In addition, the model was trained using a negative log-likelihood loss function. Qualitative and quantitative experimental results demonstrate that our HFL can better handle many quality degradation types in low-light images compared with state-of-the-art solutions. The HFL model enhances low-light images with better visibility, less noise, and improved contrast, making it suitable for practical scenarios such as autonomous driving, medical imaging, and nighttime surveillance. On the LOL-v1 benchmark dataset, HFL outperforms these solutions with PSNR = 27.26 dB, SSIM = 0.93, and LPIPS = 0.10. The source code of HFL is available at https://github.com/Smile-QT/HFL-for-LIE.
Funding: Supported by the Key Program of Natural Science Foundation of Sichuan Province (2022NSFSC0013) and the National Key Research and Development Program of China (2022YFD1901603, 2023YFD2301902).
Abstract: This study aimed to identify the physiological mechanisms that enable a low-N-tolerant maize cultivar to maintain higher photosynthesis and yield under low-N, low-light, and combined stress. In a three-year field trial of low-N-tolerant and low-N-sensitive maize cultivars under two N fertilization levels (normal N: 240 kg N ha^(−1); low-N: 150 kg N ha^(−1)) and two light conditions (normal light; low-light: 35% light reduction), the tolerant cultivar showed a higher net photosynthetic rate than the sensitive one. Random Forest analysis and Structural Equation Modeling identified PSI donor-side limitation (elevated Y_(ND)) as the key photosynthetic constraint. The tolerant cultivar maintained higher D1 and PsaA protein levels and preferentially allocated photosynthetic N to electron transport. This strategy reduced Y_(ND) and sustained photosystem stability, thus improving carboxylation efficiency and resulting in higher photosynthesis.