Aiming at the shortcomings of automatic driving target detection algorithms in scale adaptation under low-illumination environments and in target occlusion handling, this paper proposes the YOLO-LKSDS automatic driving detection model. Firstly, the Contrast-Limited Adaptive Histogram Equalisation (CLAHE) image enhancement algorithm is improved to increase image contrast and enhance the detailed features of the target; then, on the basis of the YOLOv5 model, the K-means++ clustering algorithm is introduced to obtain suitable anchor boxes, and the SPPELAN spatial pyramid pooling module is improved to enhance the accuracy and robustness of the model for multi-scale target detection. Finally, an improved SEAM (Separated and Enhancement Attention Module) attention mechanism is combined with the DIoU-NMS algorithm to optimize the model's performance in occluded and dense scenes. Compared with the original model, the improved YOLO-LKSDS model achieves a 13.3% improvement in accuracy, a 1.7% improvement in mAP, and 240,000 fewer parameters on the BDD100K dataset. To validate the generalization of the improved algorithm, we also ran experiments on the KITTI dataset, where accuracy improves by 21.1%, recall by 36.6%, and mAP50 by 29.5% over YOLOv5. Deployment was verified on an edge computing platform, where the average detection speed reaches 24.4 FPS while power consumption remains below 9 W, demonstrating high real-time capability and energy efficiency.
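The CLAHE baseline this paper improves on, and the K-means++ anchor step, can both be sketched in a few lines. The following is a minimal illustration using OpenCV and scikit-learn, not the paper's modified version; the clip limit, tile grid, anchor count, filename, and the synthetic box sizes are illustrative assumptions.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def clahe_enhance(bgr, clip_limit=2.0, grid=(8, 8)):
    """Baseline CLAHE on the L channel in LAB space, so colors are preserved."""
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l_eq = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=grid).apply(l)
    return cv2.cvtColor(cv2.merge((l_eq, a, b)), cv2.COLOR_LAB2BGR)

enhanced = clahe_enhance(cv2.imread("night_scene.jpg"))  # hypothetical input

# K-means++ (scikit-learn's default init) over labelled box sizes yields the
# anchor boxes; random data stands in for real (width, height) labels here.
wh = np.abs(np.random.randn(500, 2)) * 60 + 10
anchors = KMeans(n_clusters=9, init="k-means++", n_init=10).fit(wh).cluster_centers_
```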
Images taken in dim environments frequently exhibit issues such as insufficient brightness, noise, color shifts, and loss of detail. These problems pose significant challenges to dark image enhancement tasks. Current approaches, while effective in global illumination modeling, often struggle to simultaneously suppress noise and preserve structural details, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels introduces additional challenges to accurate enhancement. In response to these difficulties, we introduce M2ATNet, a single-stage framework built on a multi-scale multi-attention and Transformer architecture. First, to address texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), which is applied separately to the luminance and color channels to enhance structural and texture modeling. Second, to solve the non-alignment of the luminance and color channels, we introduce the multi-channel feature fusion Transformer (CFFT) module, which effectively recovers dark details and corrects color shifts through cross-channel alignment and deep feature interaction. To guide the model to learn more stably and efficiently, we also fuse multiple types of loss functions into a hybrid loss term. We extensively evaluate the proposed method on standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluation in terms of numerical metrics and visual quality demonstrates that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles of the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
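The abstract does not enumerate the fused loss terms, so the sketch below combines three common choices for enhancement networks (reconstruction, gradient/edge, and color-angle terms) with illustrative weights, purely to show the shape of such a hybrid objective in PyTorch.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(pred, target, w=(1.0, 0.5, 0.1)):
    """Hybrid of reconstruction, edge, and color terms (weights illustrative).
    pred/target are (N, 3, H, W) image batches in [0, 1]."""
    l1 = F.l1_loss(pred, target)
    # Edge term: penalize differences between horizontal/vertical gradients
    def grads(x):
        return x[..., :, 1:] - x[..., :, :-1], x[..., 1:, :] - x[..., :-1, :]
    (px, py), (tx, ty) = grads(pred), grads(target)
    edge = F.l1_loss(px, tx) + F.l1_loss(py, ty)
    # Color term: 1 - cosine similarity between per-pixel RGB vectors
    color = (1 - F.cosine_similarity(pred, target, dim=1)).mean()
    return w[0] * l1 + w[1] * edge + w[2] * color
```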
Due to the limitations of the space-bandwidth product and data transmission bandwidth, the field of view, resolution, and imaging speed constrain each other in an optical imaging system. Here, a fast-zoom and high-resolution sparse compound-eye camera (CEC) based on dual-end collaborative optimization is proposed, which provides a cost-effective way to break the trade-off among field of view, resolution, and imaging speed. At the optical end, a sparse CEC based on liquid lenses is designed, which realizes large-field-of-view imaging in real time and fast zooming within 5 ms. At the computational end, a disturbed-degradation-model-driven super-resolution network (DDMDSR-Net) is proposed to deal with the complex image degradation of actual imaging situations, achieving high-robustness and high-fidelity resolution enhancement. Based on the proposed dual-end collaborative optimization framework, the angular resolution of the CEC is enhanced from 71.6″ to 26.0″, offering a route to high-resolution imaging for array cameras without high optical hardware complexity or data transmission bandwidth. Experiments verify the advantages of the dual-end-optimized CEC in high-fidelity reconstruction of real scenes, kilometer-level long-distance detection, and dynamic imaging and precise recognition of targets of interest.
Observatories typically deploy all-sky cameras to monitor cloud cover and weather conditions. However, many of these cameras lack scientific-grade sensors, resulting in limited photometric precision, which makes calculating the sky-area visibility distribution via extinction measurement challenging. To address this issue, we propose the Photometry-Free Sky Area Visibility Estimation (PFSAVE) method, which uses the standard magnitude of the faintest star observed within a given sky area to estimate visibility. By employing a per-transformation refitting optimization strategy, we achieve a high-precision coordinate transformation model with an accuracy of 0.42 pixels. HEALPix segmentation is also introduced to achieve high spatial resolution. Comprehensive analysis of real all-sky images demonstrates that our method is more accurate than the extinction-based method. Our method supports both manual and robotic dynamic scheduling, especially under partially cloudy conditions.
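The core of the faintest-star idea reduces to a per-cell maximum over the catalog magnitudes of stars actually detected in a frame. A minimal sketch with healpy follows; the nside, function name, and input arrays are assumptions, not the paper's configuration.

```python
import numpy as np
import healpy as hp

def sky_limiting_magnitude(ra_deg, dec_deg, mags, nside=8):
    """Limiting magnitude per HEALPix cell: the faintest detected star there.
    ra_deg/dec_deg/mags describe stars matched in the all-sky frame."""
    pix = hp.ang2pix(nside, ra_deg, dec_deg, lonlat=True)
    limit = np.full(hp.nside2npix(nside), np.nan)
    for p, m in zip(pix, mags):
        # Larger magnitude means a fainter star, hence better visibility
        if np.isnan(limit[p]) or m > limit[p]:
            limit[p] = m
    return limit
```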
Low-light image enhancement has been one of the most active research areas in computer vision in recent years. During enhancement, loss of image detail and increased noise inevitably occur, degrading the quality of enhanced images. To alleviate this problem, a low-light image enhancement model called RetinexNet, based on Retinex theory, was proposed in this study. The model was composed of an image decomposition module and a brightness enhancement module. In the decomposition module, a convolutional block attention module (CBAM) was incorporated to strengthen the feature representation capacity of the network, focusing on crucial features and suppressing irrelevant ones. A multi-feature fusion denoising module was designed within the brightness enhancement module, circumventing feature loss during downsampling. The proposed model outperforms existing algorithms in PSNR and SSIM on the publicly available LOL and MIT-Adobe FiveK datasets, and gives superior NIQE results on the publicly available LIME dataset.
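CBAM itself is a standard module (Woo et al.); a compact PyTorch rendering of the channel-then-spatial attention incorporated in the decomposition module might look as follows, with the reduction ratio and kernel size set to the usual defaults rather than this paper's settings.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Minimal CBAM: channel attention followed by spatial attention."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False))
        self.spatial = nn.Conv2d(2, 1, kernel_size,
                                 padding=kernel_size // 2, bias=False)

    def forward(self, x):
        # Channel attention: shared MLP over avg- and max-pooled descriptors
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention: conv over channel-wise average and max maps
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], 1)
        return x * torch.sigmoid(self.spatial(s))
```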
This paper presents a high-speed and robust dual-band infrared thermal camera based on an ARM CPU. The system consists of a low-resolution long-wavelength infrared detector, a digital temperature and humidity sensor, and a CMOS sensor. In view of the significant contrast between face and background in thermal infrared images, this paper explores a suitable accuracy-latency tradeoff for thermal face detection and proposes a tiny, lightweight detector named YOLO-Fastest-IR. Four YOLO-Fastest-IR models (IR0 to IR3) with different scales are designed based on YOLO-Fastest. To train and evaluate these lightweight models, a multi-user low-resolution thermal face database (RGBT-MLTF) was collected, and the four networks were trained on it. Experiments demonstrate that the lightweight convolutional neural network performs well in thermal infrared face detection. The proposed algorithm outperforms existing face detection methods in both positioning accuracy and speed, making it well suited for deployment on mobile platforms or embedded devices. After the region of interest (ROI) is obtained in the infrared (IR) image, the RGB camera is guided by the thermal face detection results to achieve fine positioning of the RGB face. Experimental results show that YOLO-Fastest-IR achieves a frame rate of 92.9 FPS on a Raspberry Pi 4B and successfully detects 97.4% of faces in the RGBT-MLTF test set. Ultimately, an infrared temperature measurement system with low cost, strong robustness, and high real-time performance was integrated, achieving a temperature measurement accuracy of 0.3 ℃.
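The abstract does not detail how the IR detection guides the RGB camera; one plausible realization, assumed here purely for illustration, is a fixed pre-calibrated homography between the two sensors through which an IR face box is warped into RGB coordinates.

```python
import cv2
import numpy as np

# Hypothetical 3x3 homography mapping IR pixel coords to RGB pixel coords.
# In practice it would be estimated once with cv2.findHomography from
# matched calibration points between the two cameras.
H = np.array([[4.1, 0.0, 12.0],
              [0.0, 4.1, 20.0],
              [0.0, 0.0, 1.0]])

def ir_box_to_rgb(box):
    """Map an IR face box (x, y, w, h) into the RGB frame."""
    x, y, w, h = box
    corners = np.float32([[x, y], [x + w, y], [x, y + h],
                          [x + w, y + h]]).reshape(-1, 1, 2)
    mapped = cv2.perspectiveTransform(corners, H).reshape(-1, 2)
    x0, y0 = mapped.min(axis=0)
    x1, y1 = mapped.max(axis=0)
    return int(x0), int(y0), int(x1 - x0), int(y1 - y0)
```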
Recently, a multitude of techniques that fuse deep learning with Retinex theory have been applied to low-light image enhancement, yielding remarkable outcomes. Yet because of the intricate nature of imaging scenarios, including fluctuating noise levels and unpredictable environmental elements, these techniques do not fully resolve the underlying challenges. We introduce an innovative strategy that builds upon Retinex theory and integrates a novel deep network architecture merging the Convolutional Block Attention Module (CBAM) with the Transformer. Our model detects more prominent features across both channel and spatial domains. We have conducted extensive experiments on several datasets, namely LOLv1, LOLv2-real, and LOLv2-sync. The results show that our approach surpasses other methods on critical metrics such as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). Moreover, we have visually assessed images enhanced by various techniques and compared them with perceptual metrics such as LPIPS, and the experimental data clearly demonstrate that our approach also excels visually.
Enhancing low-light images with color distortion and uneven multi-light-source distribution is challenging. Most advanced methods for low-light image enhancement are based on the Retinex model and deep learning. Retinexformer introduces channel self-attention in its IG-MSA module; however, it fails to effectively capture long-range spatial dependencies, leaving room for improvement. Building on the Retinexformer framework, we designed the Retinexformer+ network; the “+” signifies our advancements in extracting long-range spatial dependencies. We introduced multi-scale dilated convolutions in illumination estimation to expand the receptive field. These convolutions effectively capture the semantic dependency between pixels, which weakens as distance increases. In illumination restoration, we used Unet++ with multi-level skip connections to better integrate semantic information at different scales. The designed Illumination Fusion Dual Self-Attention (IF-DSA) module embeds multi-scale dilated convolutions to achieve spatial self-attention, capturing long-range spatial semantic relationships at acceptable computational complexity. Experimental results on the Low-Light (LOL) dataset show that Retinexformer+ outperforms other State-Of-The-Art (SOTA) methods in both quantitative and qualitative evaluations, with computational complexity increased to an acceptable 51.63 GFLOPs. On the LOL_v1 dataset, Retinexformer+ improves Peak Signal-to-Noise Ratio (PSNR) by 1.15 and reduces Root Mean Square Error (RMSE) by 0.39. On the LOL_v2_real dataset, PSNR increases by 0.42 and RMSE decreases by 0.18. Experimental results on the Exdark dataset show that Retinexformer+ can effectively enhance real-scene images while preserving their semantic information.
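An ASPP-style block of parallel dilated 3x3 convolutions is one way to realize the multi-scale dilated convolutions described above; the dilation rates and the 1x1 fusion conv below are illustrative choices, not Retinexformer+'s exact design.

```python
import torch
import torch.nn as nn

class MultiScaleDilated(nn.Module):
    """Parallel 3x3 convs with growing dilation rates, fused by a 1x1 conv.
    With padding equal to the dilation rate, each branch preserves H x W."""
    def __init__(self, channels, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates)
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)

    def forward(self, x):
        # Concatenate receptive fields of several sizes, then mix channels
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```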
Visible-infrared object detection leverages the day-night stable perception capability of infrared images, fusing the complementary information of visible and infrared images to improve detection robustness in low-light environments. However, the inherent differences between the imaging mechanisms of the visible and infrared modalities make effective cross-modal fusion challenging. Furthermore, constrained by the physical characteristics of sensors and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion through a carefully designed multiscale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation of infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships of cross-modal features at a global scale, enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in both daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
Under low-illumination conditions, image quality deteriorates significantly, typically to a peak signal-to-noise ratio (PSNR) below 10 dB, which severely limits the usability of the images. Supervised methods, which train on paired low/normal-light images, can raise PSNR to around 20 dB, markedly improving image quality; however, such paired data is difficult to obtain. In recent years, unsupervised low-light image enhancement (LIE) methods based on the Retinex framework have been proposed, but they generally lag supervised methods by 5–10 dB. In this paper, we introduce the Denoising-Distilled Retinex (DDR) method, an unsupervised approach that integrates denoising priors into a Retinex-based training framework. By explicitly incorporating denoising, DDR effectively addresses noise and artifacts in low-light images, thereby strengthening the Retinex framework. The model achieves a PSNR of 19.82 dB on the LOL dataset, comparable to supervised methods. Furthermore, by applying knowledge distillation, DDR is optimized for real-time processing of low-light images, reaching 199.7 fps without incurring additional computational costs. While DDR demonstrates superior image quality and processing speed, there is still room for improvement in robustness across color spaces and under highly resource-constrained conditions; future research will focus on the model's generalizability and adaptability. Rigorous testing on public datasets further substantiates DDR's state-of-the-art performance in both image quality and processing speed.
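The abstract does not specify the distillation objective, so the sketch below shows a minimal output-distillation step in which a small student mimics the frozen enhanced output of a larger teacher; the loss form, smoothness prior, and weight are assumptions.

```python
import torch
import torch.nn.functional as F

def distill_step(student, teacher, low_img, alpha=0.7):
    """One distillation step on a batch of low-light images (N, 3, H, W)."""
    with torch.no_grad():
        target = teacher(low_img)          # frozen teacher's enhanced output
    out = student(low_img)
    # Pixel fidelity to the teacher, plus a total-variation smoothness prior
    mimic = F.l1_loss(out, target)
    tv = (out[..., 1:, :] - out[..., :-1, :]).abs().mean() + \
         (out[..., :, 1:] - out[..., :, :-1]).abs().mean()
    return alpha * mimic + (1 - alpha) * tv
```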
Infrared and visible light image fusion integrates feature information from two different modalities into one fused image to obtain more comprehensive information. In low-light scenarios, however, the illumination degradation of visible light images makes it difficult for existing fusion methods to extract texture details from the scene, and relying solely on the target saliency information provided by infrared images is far from sufficient. To address this challenge, this paper proposes a lightweight infrared and visible light image fusion method based on low-light enhancement, named LLE-Fuse. The method improves on the MobileOne Block, using an Edge-MobileOne Block embedded with the Sobel operator to perform feature extraction and downsampling on the source images. The resulting intermediate features at different scales are fused by a cross-modal attention fusion module. In addition, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is used to enhance both the infrared and visible light images, guiding the network to learn low-light enhancement capabilities through an enhancement loss. Once network training is complete, the Edge-MobileOne Block is converted through structural reparameterization into a direct-connection structure similar to MobileNetV1, effectively reducing computational resource consumption. Finally, in extensive experimental comparisons, our method achieved improvements of 4.6%, 40.5%, 156.9%, 9.2%, and 98.6% in the evaluation metrics Standard Deviation (SD), Visual Information Fidelity (VIF), Entropy (EN), and Spatial Frequency (SF), respectively, over the best results of the compared algorithms, while being only 1.5 ms/it slower in computation speed than the fastest method.
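Structural reparameterization rests on folding training-time branches into a single inference-time convolution. Its basic ingredient, fusing a BatchNorm into the preceding conv, is sketched below; merging the parallel Sobel/identity branches of an Edge-MobileOne Block would then sum the fused kernels and biases. This is a generic sketch of the technique, not this paper's code.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold y = BN(conv(x)) into a single conv with bias.
    BN(u) = gamma * (u - mean) / sqrt(var + eps) + beta, so the fused
    kernel is W * scale and the fused bias is (b - mean) * scale + beta."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      conv.stride, conv.padding, conv.dilation, conv.groups,
                      bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.copy_((bias - bn.running_mean) * scale + bn.bias)
    return fused
```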
Low-light images often have defects such as low visibility, low contrast, high noise, and strong color distortion compared with well-exposed images. If the low-light region of an image is enhanced directly, the amplified noise inevitably blurs the whole image. Besides, according to the retina-and-cortex (Retinex) theory of color vision, the reflectivity of different image regions may differ, limiting the enhancement achievable by applying uniform operations to the entire image. We therefore design a Hierarchical Flow Learning (HFL) framework, which consists of a Hierarchical Image Network (HIN) and a normalized invertible Flow Learning Network (FLN). HIN extracts hierarchical structural features from low-light images, while FLN maps the distribution of normally exposed images to a Gaussian distribution using the learned hierarchical features of low-light images. At test time, the reversibility of FLN allows inferring enhanced low-light images. Specifically, HIN extracts as much image information as possible at three scales, local, regional, and global, using a Triple-branch Hierarchical Fusion Module (THFM) and a Dual-Dconv Cross Fusion Module (DCFM). The THFM aggregates regional and global features to enhance the overall brightness and quality of low-light images by perceiving and extracting more structural information, whereas the DCFM uses the properties of the activation function and local features to enhance images at the pixel level, reducing noise and improving contrast. The model is trained with a negative log-likelihood loss function. Qualitative and quantitative results demonstrate that HFL handles many quality degradation types in low-light images better than state-of-the-art solutions, outperforming them with PSNR = 27.26 dB, SSIM = 0.93, and LPIPS = 0.10 on the benchmark dataset LOL-v1. HFL enhances low-light images with better visibility, less noise, and improved contrast, suiting practical scenarios such as autonomous driving, medical imaging, and nighttime surveillance. The source code of HFL is available at https://github.com/Smile-QT/HFL-for-LIE.
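For an invertible flow that maps an image x to a latent z with log-determinant log|det(dz/dx)|, the negative log-likelihood loss mentioned above takes a standard form under a unit-Gaussian base distribution. A sketch, with tensor shapes assumed rather than taken from the paper:

```python
import math
import torch

def flow_nll(z: torch.Tensor, log_det: torch.Tensor) -> torch.Tensor:
    """NLL under a standard Gaussian base distribution.
    z: latent batch (N, ...) produced by the flow;
    log_det: per-sample log|det(dz/dx)|, shape (N,).
    -log p(x) = 0.5 * ||z||^2 + 0.5 * d * log(2*pi) - log|det(dz/dx)|."""
    n, d = z.shape[0], z[0].numel()
    log_pz = -0.5 * (z.reshape(n, -1) ** 2).sum(dim=1) \
             - 0.5 * d * math.log(2 * math.pi)
    return -(log_pz + log_det).mean()
```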
This research addresses the critical challenge of enhancing satellite images captured under low-light conditions, which suffer from severely degraded quality, including lack of detail, poor contrast, and low usability. Overcoming this limitation is essential for maximizing the value of satellite imagery in downstream computer vision tasks (e.g., spacecraft on-orbit connection, spacecraft surface repair, space debris capture) that rely on clear visual information. Our key novelty lies in an unsupervised generative adversarial network featuring two main contributions: (1) an improved U-Net (IU-Net) generator with multi-scale feature fusion in the contracting path for richer semantic feature extraction, and (2) a Global Illumination Attention Module (GIA) at the end of the contracting path to couple local and global information, significantly improving detail recovery and illumination adjustment. The proposed algorithm operates in an unsupervised manner. It is trained and evaluated on our self-constructed, unpaired Spacecraft Dataset for Detection, Enforcement, and Parts Recognition (SDDEP), designed specifically for low-light enhancement tasks. Extensive experiments demonstrate that our method outperforms the baseline EnlightenGAN, achieving improvements of 2.7% in structural similarity (SSIM), 4.7% in peak signal-to-noise ratio (PSNR), 6.3% in learned perceptual image patch similarity (LPIPS), and 53.2% in DeltaE 2000. Qualitatively, the enhanced images exhibit higher overall and local brightness, improved contrast, and more natural visual effects.
This study aimed to identify the physiological mechanisms enabling a low-N-tolerant maize cultivar to maintain higher photosynthesis and yield under low-N, low-light, and combined stress. In a three-year field trial of low-N-tolerant and low-N-sensitive maize cultivars under two N fertilization levels (normal N: 240 kg N ha^(-1); low N: 150 kg N ha^(-1)) and two light conditions (normal light; low light: 35% light reduction), the tolerant cultivar showed a higher net photosynthetic rate than the sensitive one. Random Forest analysis and Structural Equation Modeling identified PSI donor-side limitation (elevated Y_ND) as the key photosynthetic constraint. The tolerant cultivar maintained higher D1 and PsaA protein levels and preferentially allocated photosynthetic N to electron transport. This strategy reduced Y_ND and sustained photosystem stability, thus improving carboxylation efficiency and resulting in higher photosynthesis.
Photomechanics is a crucial branch of solid mechanics. The localization of point targets is a fundamental problem in optical experimental mechanics, with extensive applications in unmanned aerial vehicle missions. Localizing moving targets is crucial for analyzing their motion characteristics and dynamic properties, and reconstructing point trajectories from asynchronous cameras is a significant challenge: it encompasses two coupled sub-problems, trajectory reconstruction and camera synchronization, and existing methods typically address only one of them. This paper proposes a 3D trajectory reconstruction method for point targets based on asynchronous cameras that solves both sub-problems simultaneously. Firstly, we extend the trajectory intersection method to asynchronous cameras, removing the limitation of traditional triangulation that the cameras be synchronized. Secondly, we develop models of camera temporal information and target motion based on imaging mechanisms and target dynamics, and optimize their parameters simultaneously to achieve trajectory reconstruction without accurate time parameters. Thirdly, we optimize the camera rotations alongside the camera time information and target motion parameters, using tighter and more continuous constraints on the moving points; this significantly improves reconstruction accuracy, especially when the camera rotations are inaccurate. Finally, simulated and real-world experimental results demonstrate the feasibility and accuracy of the proposed method. The real-world results indicate a localization error of 112.95 m at observation distances of 15-20 km.
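A toy planar analogue conveys the coupled estimation: bearing-only observations from two cameras with an unknown relative clock offset, jointly fit with a polynomial trajectory by nonlinear least squares. Everything below is synthetic and far simpler than the paper's full 3D model with rotation refinement.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
cams = np.array([[0.0, 0.0], [800.0, 0.0]])     # known camera positions
true_coef = np.array([[100.0, 40.0, -1.0],      # x(t) = 100 + 40t - t^2
                      [500.0, 10.0, 0.5]])      # y(t) = 500 + 10t + 0.5t^2
true_off = np.array([0.0, 0.35])                # camera 2 starts 0.35 s late

def pos(coef, t):
    return coef @ np.stack([np.ones_like(t), t, t * t])   # (2, n)

ts = [np.linspace(0, 10, 40)] * 2
bearings = []
for c, off, t in zip(cams, true_off, ts):
    dx, dy = pos(true_coef, t + off) - c[:, None]
    bearings.append(np.arctan2(dy, dx))

def residuals(p):
    # Camera 1's offset is fixed at zero to remove the gauge freedom
    coef, off = p[:6].reshape(2, 3), np.array([0.0, p[6]])
    res = []
    for c, o, t, b in zip(cams, off, ts, bearings):
        dx, dy = pos(coef, t + o) - c[:, None]
        res.append(np.arctan2(dy, dx) - b)
    return np.concatenate(res)

p0 = np.concatenate([true_coef.ravel() + rng.normal(0, 2, 6), [0.0]])
fit = least_squares(residuals, p0)
print(fit.x[6])   # recovered clock offset of camera 2; should approach 0.35
```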
Understanding the development of joints and fractures in rock masses is important for ensuring drilling stability and blasting effectiveness. Traditional manual observation techniques for identifying and extracting fracture characteristics have proven inefficient and prone to subjective interpretation. Moreover, conventional image processing algorithms and classical deep learning models often have difficulty identifying fracture areas accurately, producing unclear contours. This study proposes an intelligent method for detecting internal fractures in mine rock masses to address these challenges. The proposed approach captures a nodal fracture map within the targeted blast area and integrates channel and spatial attention mechanisms into the ResUnet (RU) model. The channel attention mechanism dynamically recalibrates the importance of each feature channel, and the spatial attention mechanism enhances feature representation in key areas while minimizing background noise, improving segmentation accuracy. A dynamic serpentine convolution module is also introduced, which adaptively adjusts the shape and orientation of the convolution kernel based on the local structure of the input feature map. Furthermore, the method automatically extracts and quantifies borehole nodal fracture information by fitting sinusoidal curves to the boundaries of the fracture contours using the least squares method. Compared with other advanced deep learning models, our enhanced RU demonstrates superior performance across evaluation metrics including accuracy, pixel accuracy (PA), and intersection over union (IoU). Unlike traditional manual extraction, our intelligent detection approach provides considerable time and cost savings, with an average error rate of approximately 4%, and has the potential to greatly improve the efficiency of geological surveys of borehole fractures.
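The sinusoid fitting step has a closed linear form: a plane cutting a cylindrical borehole unrolls to depth(θ) = c + a·cosθ + b·sinθ, so the least squares fit reduces to a single lstsq call. A sketch with synthetic boundary points (the conversion from amplitude and phase to dip angle and dip direction is only indicated):

```python
import numpy as np

def fit_fracture_sinusoid(theta, depth):
    """Fit depth(theta) = c + a*cos(theta) + b*sin(theta) to boundary points
    of a fracture trace in an unrolled borehole image."""
    A = np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])
    (c, a, b), *_ = np.linalg.lstsq(A, depth, rcond=None)
    amp = np.hypot(a, b)        # amplitude relates to fracture dip
    phase = np.arctan2(b, a)    # phase gives dip direction around the hole
    return c, amp, phase

# Synthetic trace: mean depth 3.0 m, amplitude 0.8, phase 1.1 rad, plus noise
theta = np.linspace(0, 2 * np.pi, 50)
depth = 3.0 + 0.8 * np.cos(theta - 1.1) + np.random.normal(0, 0.02, 50)
print(fit_fracture_sinusoid(theta, depth))   # ~ (3.0, 0.8, 1.1)
```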
This study presents a drone-based aerial imaging method for automated rice seedling detection and counting in paddy fields. Using a drone equipped with a high-resolution camera, images are captured 14 days post-sowing at a consistent altitude of six meters, employing autonomous flight for uniform data acquisition. The approach effectively addresses the distinct growth patterns of both single and clustered rice seedlings at this early stage. The methodology follows a two-step process: first, the GoogleNet deep learning network identifies the locations and center points of rice plants; then, the U-Net deep learning network performs classification and counting of individual plants and clusters. This combination of deep learning models achieved a 90% accuracy rate in classifying and counting both single and clustered seedlings. To validate the method's effectiveness, results were compared against traditional manual counting by agricultural experts. The comparison revealed minimal discrepancies, with a variance of only 2–4 clumps per square meter, confirming the reliability of the proposed method. This automated approach provides an efficient, accurate, and scalable solution for monitoring seedling growth, enabling farmers to optimize fertilizer and pesticide application, improve resource allocation, and enhance overall crop management, ultimately contributing to increased agricultural productivity.
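The abstract leaves the counting post-processing unspecified; one common recipe after U-Net segmentation, assumed here purely for illustration (it is not the paper's classification network), is connected-component analysis with an area-based estimate of plants per clump.

```python
import cv2
import numpy as np

def count_seedlings(mask: np.ndarray, single_area: float, min_area: int = 20):
    """Count plants from a binary segmentation mask.
    Each connected blob is either one seedling or a clump whose plant count
    is estimated from its area; `single_area` (median pixel area of one
    seedling) is an assumed calibration input."""
    n, _, stats, _ = cv2.connectedComponentsWithStats(mask.astype(np.uint8))
    total = 0
    for area in stats[1:, cv2.CC_STAT_AREA]:   # skip background label 0
        if area < min_area:
            continue                           # discard noise speckles
        total += max(1, round(area / single_area))
    return total
```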
Closed thoracic drainage can be performed with a steel-needle-guided chest tube to treat pleural effusion or pneumothorax. However, the puncture procedure during surgery is invisible, increasing the risk of surgical failure, so a visualization system for closed thoracic drainage is needed. Augmented reality (AR) technology can assist in visualizing the internal anatomical structure and determining the insertion point on the body surface. The structure of the currently used steel-needle-guided chest tube was modified by integrating an ultrafine-diameter camera to provide real-time visualization of the puncture process. In simulation experiments, the overall registration error of the AR method was measured to be within (3.59±0.53) mm, indicating its potential for clinical application. The ultrafine-diameter camera module and the improved steel-needle-guided chest tube reflect the position of the needle tip in the human body in real time. A comparative experiment showed that video guidance improves the safety of the puncture process compared with the traditional method. Finally, a qualitative evaluation of the system's usability was conducted through a questionnaire. This system facilitates visualization of the closed thoracic drainage puncture procedure and provides an implementation scheme that enhances the accuracy and safety of the operative step, which helps reduce the learning curve and improve doctors' proficiency.
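How such an "overall registration error" is computed is not stated in the abstract; a common convention, assumed here, is the mean residual distance of fiducial points after a best rigid (Kabsch) alignment of measured points onto their reference positions.

```python
import numpy as np

def registration_error(src: np.ndarray, dst: np.ndarray) -> float:
    """Mean residual distance (e.g., in mm) after the best rigid alignment
    of measured fiducials `src` onto reference positions `dst`, both (N, 3).
    Rotation via the Kabsch SVD method, with a reflection guard."""
    sc, dc = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - sc).T @ (dst - dc))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:       # flip the smallest axis to avoid reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    aligned = (src - sc) @ R.T + dc
    return float(np.linalg.norm(aligned - dst, axis=1).mean())
```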