Low-light image enhancement (LLIE) remains challenging due to underexposure, color distortion, and amplified noise introduced during illumination correction. Existing deep learning-based methods typically apply uniform enhancement across the entire image, which overlooks scene semantics and often leads to texture degradation or unnatural color reproduction. To overcome these limitations, we propose a Semantic-Guided Visual Mamba Network (SGVMNet) that unifies semantic reasoning, state-space modeling, and mixture-of-experts routing for adaptive illumination correction. SGVMNet comprises three key components: (1) a semantic modulation module (SMM) that extracts scene-aware semantic priors from pretrained multimodal models, namely Large Language and Vision Assistant (LLaVA) and Contrastive Language-Image Pretraining (CLIP), and injects them hierarchically into the feature stream; (2) a Mixture-of-Experts State-Space Feature Enhancement Module (MoE-SSMFEM) that dynamically selects informative channels and activates specialized state-space experts for efficient global-local illumination modeling; and (3) a Text-Guided Mixture Mamba Block (TGMB) that fuses semantic priors and visual features through bidirectional state propagation. Experimental results demonstrate that on the low-light (LOL) dataset, SGVMNet outperforms other state-of-the-art methods in both quantitative and qualitative evaluations while maintaining low computational complexity and fast inference speed. On LOLv2-Syn, SGVMNet achieves 26.512 dB PSNR and 0.935 SSIM, outperforming RetinexFormer by 0.61 dB. On LOLv1, SGVMNet attains 26.50 dB PSNR and 0.863 SSIM. Experiments on multiple unpaired real-world datasets further validate the superiority of SGVMNet, showing that the model not only exhibits strong cross-scene generalization ability but also effectively preserves semantic consistency and visual naturalness.
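The PSNR figures quoted in this abstract follow the standard definition, PSNR = 10·log10(MAX² / MSE). The sketch below is only an illustration of that formula, not the authors' evaluation code; the function name `psnr`, the flat-list image representation, and the default `max_value=255.0` are assumptions made for brevity.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are flat lists of pixel intensities (an assumption for brevity;
    real pipelines usually operate on arrays).
    """
    if len(reference) != len(enhanced) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(psnr([100, 100, 100, 100], [110, 110, 110, 110]))
```

Because PSNR is logarithmic in the mean squared error, a 0.61 dB gain on a single image pair corresponds to roughly a 13% lower MSE.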
To stabilize the image sequence of a video module, a digital image stabilization method using an inertial measurement system is proposed. Real-time attitude information of the camera, obtained by a high-precision attitude sensor, is used to estimate the image motion vector and then to compensate the image, thereby stabilizing the image sequence. Experiments demonstrate that this method achieves high image stabilization precision, and its video output rate of up to 16 frame/s fully meets real-time requirements.
The accumulation of snow and ice on PV modules can reduce power generation efficiency for prolonged periods. It is therefore important to develop an intelligent system that accurately assesses the extent of snow and ice coverage on PV modules. To address this issue, the article proposes an ice and snow recognition algorithm that segments the ice and snow areas within the collected images. The algorithm also analyzes the morphological characteristics of ice and snow coverage on PV modules to establish a residual ice and snow recognition process, which uses both the external ellipse method and the pixel statistical method to refine the identification. The proposed algorithm is validated through extensive testing on images of isolated and continuous snow areas. The results demonstrate the algorithm's accuracy and reliability in identifying and quantifying residual snow and ice on PV modules. This is significant for PV power plants, as it enables predictions of power generation efficiency and facilitates efficient PV maintenance during winter conditions with snow and ice. By proactively managing snow and ice coverage, PV power plants can optimize energy production and minimize downtime, ensuring a sustainable and reliable renewable energy supply.
Unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain. However, advanced approaches typically employ a multi-generator mechanism to model different domain mappings, which results in inefficient training of neural networks and mode collapse, limiting the diversity of generated images. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that uses a single generator to perform multi-modal image translation. First, a domain code is introduced to explicitly control the different generation tasks. Second, the squeeze-and-excitation (SE) mechanism and a feature attention (FA) module are incorporated. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. Qualitative and quantitative experiments on multiple unpaired benchmark image translation datasets demonstrate the benefits of the proposed method over existing technologies. Overall, the experimental results show that the proposed method is versatile and scalable.
Medical image classification plays an important role in the medical field, and deep learning has become an important and powerful technique for it. In this article, we propose a simplified inception module based Hadamard attention (SI + HA) mechanism for medical image classification. Specifically, we propose a new attention mechanism, the Hadamard attention mechanism, which improves the accuracy of medical image classification without greatly increasing the complexity of the model. Meanwhile, we adopt a simplified inception module to improve the utilization of parameters. We evaluate the proposed method on two medical image datasets. On the BreakHis dataset, the AUCs of our method reach 98.74%, 98.38%, 98.61% and 97.67% under magnification factors of 40×, 100×, 200× and 400×, respectively, and the accuracies reach 95.67%, 94.17%, 94.53% and 94.12%, respectively. On the KIMIA Path 960 dataset, the AUC and accuracy of our method reach 99.91% and 99.03%, respectively. The method outperforms currently popular methods and can significantly improve the effectiveness of medical image classification.
Semantic segmentation methods based on CNNs have made great progress, but shortcomings remain when they are applied to remote sensing image segmentation; for example, a small receptive field cannot effectively capture global context. To solve this problem, this paper proposes a hybrid model based on ResNet50 and Swin Transformer that directly captures long-range dependencies and fuses features through a Cross Feature Modulation Module (CFMM). On two publicly available datasets, Vaihingen and Potsdam, the model achieves mIoU of 70.27% and 76.63%, respectively. Thus, CFM-UNet maintains high segmentation performance compared with other competitive networks.
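The mIoU metric reported for Vaihingen and Potsdam is conventionally computed per class as TP / (TP + FP + FN) and then averaged. The following is a minimal sketch of that convention, not the evaluation script used in the paper; the function name `mean_iou` and the row/column layout of the confusion matrix are assumptions.

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    Assumed layout: rows are ground-truth classes, columns are predictions.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # class c pixels that were missed
        fp = sum(confusion[r][c] for r in range(n)) - tp   # pixels wrongly labeled as c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both ground truth and prediction
            ious.append(tp / denom)
    if not ious:
        raise ValueError("confusion matrix contains no samples")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    print(mean_iou([[3, 1], [1, 5]]))
```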
Unmanned aerial vehicle (UAV) images captured under low-light conditions often suffer from noise and uneven illumination. To address these issues, we propose a low-light image enhancement algorithm for UAV images, which is inspired by the Retinex theory and guided by a light weighted map. First, we propose a new network for reflectance component processing to suppress the noise in images. Second, we construct an illumination enhancement module that uses a light weighted map to guide the enhancement process. Finally, the processed reflectance and illumination components are recombined to obtain the enhancement results. Experimental results show that our method suppresses noise while enhancing image brightness and prevents over-enhancement in bright regions. Code and data are available at https://gitee.com/baixiaotong2/uav-images.git.
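Retinex theory models an observed image as the product of reflectance and illumination, I = R · L, and enhancement recombines the two after adjusting L. The toy sketch below illustrates only that decomposition and recombination for one RGB pixel. It is not the network from this abstract: the max-channel illumination estimate, the gamma adjustment, and the name `retinex_enhance_pixel` are all stand-in assumptions, and no weighted map or denoising is modeled.

```python
def retinex_enhance_pixel(rgb, gamma=0.5, eps=1e-6):
    """Brighten one RGB pixel with values in [0, 1] using I = R * L.

    Illumination L is estimated as the largest channel value (a common toy
    estimate), reflectance is R = I / L, and the output is R * L**gamma.
    """
    illumination = max(max(rgb), eps)            # avoid division by zero on black pixels
    reflectance = [c / illumination for c in rgb]
    adjusted = illumination ** gamma             # gamma < 1 lifts dark illumination
    return [r * adjusted for r in reflectance]


if __name__ == "__main__":
    print(retinex_enhance_pixel((0.04, 0.02, 0.01)))  # dark pixel is brightened
    print(retinex_enhance_pixel((1.0, 1.0, 1.0)))     # fully bright pixel is unchanged
```

With this gamma adjustment a pixel whose illumination is already 1 is left unchanged, which loosely mirrors the stated goal of avoiding over-enhancement in bright regions.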
If the degree distribution is chosen carefully, irregular low-density parity-check (LDPC) codes can outperform regular ones. An image transmission system is proposed that combines regular and irregular LDPC codes with 16QAM/64QAM modulation to improve both efficiency and reliability. Simulation results show that LDPC codes are good coding schemes over fading channels in image communication, with lower system complexity. Moreover, irregular codes obtain a coding gain of about 0.7 dB over regular ones at a BER of 10^-4. Irregular LDPC codes are therefore more suitable for image transmission than regular codes.
Current adaptive steganography algorithms have difficulty resisting scaling attacks, and the one existing method that resists scaling attacks applies only to nearest neighbor interpolation. This paper therefore proposes an image steganography algorithm based on quantization index modulation that resists both scaling attacks and statistical detection. For a spatial image, the watermarking algorithm based on quantization index modulation is first used to extract the embedded domain. Then, the embedding distortion function of the new embedded domain is constructed based on S-UNIWARD steganography, and minimum distortion coding is used to embed the secret messages. Finally, according to the embedding modification amplitude of the secret messages in the new embedded domain, the quantization index modulation algorithm is applied to realize the final embedding of the secret messages in the original embedded domain. The experimental results show that the proposed algorithm is robust to three common interpolation attacks: nearest neighbor, bilinear, and bicubic interpolation. Compared with the classical steganography algorithm S-UNIWARD, the average correct extraction rate of embedded messages increases from 50% to over 93% after a scaling attack with a factor of 0.5 using bicubic interpolation. The proposed algorithm also has higher detection resistance than the original watermarking algorithm based on quantization index modulation.
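Quantization index modulation (QIM), on which this algorithm builds, embeds a bit by snapping a host value onto one of two interleaved quantization lattices. The sketch below shows plain scalar QIM only, not the paper's scheme with its distortion function and minimum distortion coding; the names `qim_embed` and `qim_extract` and the step size `delta=8.0` are illustrative assumptions.

```python
def qim_embed(value, bit, delta=8.0):
    """Quantize `value` onto the lattice associated with `bit` (0 or 1).

    Lattice 0 is {k * delta}; lattice 1 is the same lattice shifted by delta / 2.
    """
    offset = bit * delta / 2.0
    return round((value - offset) / delta) * delta + offset


def qim_extract(value, delta=8.0):
    """Return the bit whose lattice has a point closest to `value`."""
    d0 = abs(value - qim_embed(value, 0, delta))
    d1 = abs(value - qim_embed(value, 1, delta))
    return 0 if d0 <= d1 else 1


if __name__ == "__main__":
    marked = qim_embed(13.2, 1)
    # The bit survives a perturbation smaller than delta / 4.
    print(marked, qim_extract(marked + 1.5))
```

The step size trades robustness for distortion: embedding moves the host value by at most `delta / 2`, and extraction tolerates perturbations below `delta / 4`.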
With the rapid development of digital communication and the widespread use of the Internet of Things, multi-view image compression has attracted increasing attention as a fundamental technology for image data communication. Multi-view image compression aims to improve compression efficiency by leveraging correlations between images. However, the requirement of synchronization and inter-image communication at the encoder side poses significant challenges, especially for constrained devices. In this study, we introduce a novel distributed image compression model based on the attention mechanism to address the challenges associated with the availability of side information only during decoding. Our model integrates an encoder network, a quantization module, and a decoder network to ensure both high compression performance and high-quality image reconstruction. The encoder uses a deep Convolutional Neural Network (CNN) to extract high-level features from the input image, which then pass through the quantization module for further compression before undergoing lossless entropy coding. The decoder consists of three main components that allow us to fully exploit the information within and between images on the decoder side. First, we introduce a channel-spatial attention module to capture and refine information within individual image feature maps. Second, we employ a semi-coupled convolution module to extract both shared and specific information in images. Finally, a cross-attention module is employed to fuse mutual information extracted from side information. The effectiveness of our model is validated on various datasets, including KITTI Stereo and Cityscapes. The results highlight the superior compression capabilities of our method, which surpasses state-of-the-art techniques.
Multimodal image fusion plays an important role in image analysis and applications. Multimodal medical image fusion combines contrast features from two or more input imaging modalities to represent fused information in a single image. One of the critical clinical applications of medical image fusion is to fuse anatomical and functional modalities for rapid diagnosis of malignant tissues. This paper proposes a multimodal medical image fusion network (MMIF-Net) based on multiscale hybrid attention. The method first decomposes the original image to obtain the low-rank and significant parts. Then, to utilize the features at different scales, we add a multiscale mechanism that uses three filters of different sizes to extract the features in the encoding network. A hybrid attention module is also introduced to obtain more image details. Finally, the fused images are reconstructed by the decoding network. We conducted experiments with clinical brain images from computed tomography and magnetic resonance imaging. The experimental results show that the proposed method works better than other advanced fusion methods.
With the rapid development of the new energy automotive industry, enhancing lithium battery performance and production efficiency has become critical. This article explores the application of artificial intelligence technology in the lithium battery module PACK line, analyzing how it optimizes the production process and improves production efficiency, and predicts future development trends. The PACK line is an important link in battery manufacturing, involving complex processes such as cell sorting, welding, assembly and testing. The application of AI technology in image recognition, data analysis and predictive maintenance provides new solutions for the intelligent upgrading of the PACK line. This article describes the process of the PACK line in detail, analyzes the challenges at the current level of technology, and reviews application cases of AI technology in the manufacturing industry. The study aims to provide theoretical and practical guidance for the intelligent development of lithium battery module PACK lines, discussing the integration of AI technology, its actual performance, technical challenges, and solutions. AI technology is expected to play a greater role in the PACK line, and future research will focus on improving the adaptability of models, developing efficient algorithms, and integrating them further into the production line.
Medical image fusion technology is crucial for improving the detection accuracy and treatment efficiency of diseases, but existing fusion methods suffer from blurred texture details, low contrast, and an inability to fully extract fused image information. To address these issues, a multimodal medical image fusion method based on mask optimization and a parallel attention mechanism is proposed. First, the entire image is converted into a binary mask, and a contour feature map is constructed to maximize the contour feature information of the image, together with a triple-path network for extracting and optimizing image texture detail features. Second, a contrast enhancement module and a detail preservation module are proposed to enhance the overall brightness and texture details of the image. Next, a parallel attention mechanism is constructed using channel features and spatial feature changes to fuse images and enhance the salient information of the fused images. Finally, a decoupling network composed of residual networks is set up to optimize the information between the fused image and the source image so as to reduce information loss in the fused image. Compared with nine advanced methods proposed in recent years, our method improves seven objective evaluation indicators by 6% to 31%, indicating that it obtains fusion results with clearer texture details, higher contrast, and smaller pixel differences between the fused image and the source image. It is superior to the other comparison algorithms in both subjective and objective indicators.
The application of transformer networks and feature fusion models in medical image segmentation has attracted considerable attention in the academic community. Nevertheless, two main obstacles persist: (1) the limitations of the Transformer network in dealing with locally detailed features, and (2) the considerable loss of feature information in current feature fusion modules. To solve these issues, this study first presents a refined feature extraction approach, employing a double-branch feature extraction network to capture complex multi-scale local and global information from images. Subsequently, we propose a low-loss feature fusion method, the Multi-branch Feature Fusion Enhancement Module (MFFEM), which realizes effective feature fusion with minimal loss. In addition, the cross-layer cross-attention fusion module (CLCA) is adopted to further achieve adequate feature fusion by enhancing the interaction between encoders and decoders of various scales. Finally, the feasibility of our method was verified on the Synapse and ACDC datasets, demonstrating its competitiveness. The average DSC (%) was 83.62 and 91.99, respectively, and the average HD95 (mm) was reduced to 19.55 and 1.15, respectively.
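The DSC values reported for Synapse and ACDC refer to the Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), expressed as a percentage. The sketch below shows that formula for two binary masks; it is an illustration rather than the authors' evaluation code, and the name `dice_coefficient`, the flat 0/1 mask representation, and the empty-mask convention are assumptions.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement (a convention)
    return 2.0 * intersection / total


if __name__ == "__main__":
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))
```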
Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information in medical imaging and faces the challenge of losing local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address these issues. First, based on SAM, a local multi-scale feature encoder is introduced to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce the quadratic complexity and noise interference. Subsequently, a cross-branch balancing adapter is designed to balance the local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to obtain a smaller input image size and to mitigate overlapping in patch embeddings, the size of the input image is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, which includes feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates the information from small-sized medical images into SAM, enhancing its suitability for clinical deployment. The proposed model demonstrates an average improvement ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets compared to eight other representative image segmentation models. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
The key difficulty in restoring a blurred image is estimating its point spread function (PSF). In this paper, the PSF is modelled based on the modulation transfer function (MTF). The first step is calculating the image MTF. In the traditional slanted-edge method, a sub-block is manually extracted from the original image and its MTF is taken as the result for the whole image. However, manual extraction is inefficient and leads to inaccurate results. Given this, an automatic MTF computation algorithm is proposed, which extracts and screens all the effective sub-blocks and calculates their average MTF as the final result. Then, a two-dimensional MTF restoration model is constructed by multiplying the horizontal and vertical MTFs, and it is combined with conventional image restoration methods to restore the blurred image. Experimental results indicate that the proposed method implements fast and accurate MTF computation and that the MTF model significantly improves the performance of conventional restoration methods.
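Constructing a two-dimensional MTF by multiplying the horizontal and vertical MTFs amounts to a separable model, MTF(u, v) = MTF_h(u) · MTF_v(v). The sketch below illustrates that outer product on sampled curves. It is an assumption-laden illustration rather than the paper's implementation: the name `separable_mtf_2d`, the sample values, and the row/column orientation are all hypothetical.

```python
def separable_mtf_2d(mtf_horizontal, mtf_vertical):
    """Build a 2D MTF grid as the outer product of two sampled 1D MTF curves.

    Assumed orientation: rows index vertical frequency, columns index
    horizontal frequency.
    """
    return [[mv * mh for mh in mtf_horizontal] for mv in mtf_vertical]


if __name__ == "__main__":
    # Hypothetical 1D curves sampled at increasing spatial frequency.
    horizontal = [1.0, 0.5, 0.25]
    vertical = [1.0, 0.8]
    for row in separable_mtf_2d(horizontal, vertical):
        print(row)
```

The resulting grid could then serve as the frequency response in a conventional restoration filter, which is the role the abstract assigns to the 2D model.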
Objective: To study the effect of automatic tube current modulation on abdominal CT image quality and radiation dose. Methods: A total of 70 patients who underwent abdominal CT examination in our hospital from August 2019 to August 2020 were divided into a control group (35 cases) and a study group (35 cases) according to a random number table. CT scans were performed in the control group with conventional parameters, while CT scans in the study group used automatic tube current modulation. The CT image quality and radiation dose of patients in the two groups were compared. Results: The image quality of the study group was better than that of the control group (P < 0.05). When NI ≥ 8, the radiation dose in the study group was lower than that in the control group (P < 0.05). Conclusion: Automatic tube current modulation can obtain better image quality in CT examination and is of considerable value in reducing radiation dose. It is worthy of wider application.
In this paper, we investigated phase modulation-based computational ghost imaging. Numerical simulations show that the range of the random phase affects the quality of the reconstructed image. Moreover, compared with amplitude modulation-based computational ghost imaging schemes, introducing random phase modulation into the computational ghost imaging scheme can significantly improve the spatial resolution of the reconstructed image and also extend the field of view.
Low-light image enhancement methods have limitations in addressing issues such as color distortion, lack of vibrancy, and uneven light distribution, and they often require paired training data. To address these issues, we propose a two-stage unsupervised low-light image enhancement algorithm called Retinex and Exposure Fusion Network (RFNet), which overcomes the problems of over-enhancement of the high dynamic range and under-enhancement of the low dynamic range in existing enhancement algorithms. By training with unpaired low-light and regular-light images, the algorithm can better manage the challenges posed by complex environments in real-world scenarios. In the first stage, we design a multi-scale feature extraction module based on Retinex theory, capable of extracting details and structural information at different scales to generate high-quality illumination and reflection images. In the second stage, an exposure image generator is designed through the camera response mechanism function to acquire exposure images containing more dark features, and the generated images are fused with the original input images to complete the low-light image enhancement. Experiments show the effectiveness and rationality of each module designed in this paper. The method reconstructs the details of contrast and color distribution, outperforms current state-of-the-art methods in both qualitative and quantitative metrics, and shows excellent performance in the real world.
Side scan sonar (SSS) is an important means of detecting and locating seafloor targets. Autonomous underwater vehicles (AUVs) carrying SSS stay near the seafloor to obtain high-resolution images and provide the outline of the target for observers. The target feature information of an SSS image is similar to the background information, and a small target has less pixel information; therefore, accurately identifying and locating small targets in SSS images is challenging. We collect SSS images of iron metal balls (with a diameter of 1 m) and rocks to address the problem of target misclassification. Thus, the dataset contains two types of targets, namely 'ball' and 'rock'. To enable AUVs to accurately and automatically identify small underwater targets in SSS images, this study designs a multisize parallel convolution module embedded in the state-of-the-art Yolo5. An attention mechanism transformer and a convolutional block attention module are also introduced to compare their contributions to small target detection accuracy. The performance of the proposed method is further evaluated by taking the lightweight networks Mobilenet3 and Shufflenet2 as the backbone network of Yolo5. This study focuses on the performance of convolutional neural networks for the detection of small targets in SSS images, while a comparison experiment using traditional HOG+SVM is carried out to highlight the neural network's ability. This study aims to improve the detection accuracy while ensuring model efficiency to meet the real-time working requirements of AUV target detection.
Funding (listed with the abstract on snow and ice recognition for PV modules): supported by the Key Research and Development Projects in Shaanxi Province (Program No. 2021GY-306), the Innovation Capability Support Program of Shaanxi (Program No. 2022KJXX-41), and the Key Scientific and Technological Projects of Xi'an (Program No. 2022JH-RGZN-0005).
Funding: The National Natural Science Foundation of China (No. 61976080), the Academic Degrees & Graduate Education Reform Project of Henan Province (No. 2021SJGLX195Y), the Teaching Reform Research and Practice Project of Henan Undergraduate Universities (No. 2022SYJXLX008), and the Key Project on Research and Practice of Henan University Graduate Education and Teaching Reform (No. YJSJG2023XJ006).
Abstract: Unsupervised multi-modal image translation is an emerging area of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain. However, existing advanced approaches employ a multi-generator mechanism to model different domain mappings, which results in inefficient network training and mode collapse and limits the diversity of the generated images. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that uses a single generator to perform multi-modal image translation. First, a domain code is introduced to explicitly control the different generation tasks. Second, the squeeze-and-excitation (SE) mechanism and a feature attention (FA) module are incorporated. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. Qualitative and quantitative experiments on multiple unpaired benchmark image translation datasets demonstrate the benefits of the proposed method over existing techniques. Overall, the experimental results show that the proposed method is versatile and scalable.
Abstract: Medical image classification plays an important role in the medical field, and deep learning-based methods have become a powerful technique for it. In this article, we propose a simplified inception module based Hadamard attention (SI + HA) mechanism for medical image classification. Specifically, we propose a new attention mechanism, the Hadamard attention mechanism, which improves the accuracy of medical image classification without greatly increasing the complexity of the model. We also adopt a simplified inception module to improve parameter utilization. We evaluate the proposed method on two medical image datasets. On the BreakHis dataset, the AUCs of our method reach 98.74%, 98.38%, 98.61% and 97.67% at magnification factors of 40×, 100×, 200× and 400×, respectively, and the accuracies reach 95.67%, 94.17%, 94.53% and 94.12% at the same magnification factors. On the KIMIA Path 960 dataset, the AUC and accuracy of our method reach 99.91% and 99.03%, respectively. The method is superior to currently popular methods and can significantly improve the effectiveness of medical image classification.
Funding: Young Innovative Talents Project of Guangdong Ordinary Universities (No. 2022KQNCX225) and School-level Teaching and Research Project of Guangzhou City Polytechnic (No. 2022xky046).
Abstract: Semantic segmentation methods based on CNNs have made great progress, but they still have shortcomings when applied to remote sensing image segmentation, such as a small receptive field that cannot effectively capture global context. To solve this problem, this paper proposes a hybrid model based on ResNet50 and Swin Transformer that directly captures long-range dependence and fuses features through a Cross Feature Modulation Module (CFMM). On two publicly available datasets, Vaihingen and Potsdam, the model achieves mIoU of 70.27% and 76.63%, respectively. Thus, the proposed CFM-UNet maintains high segmentation performance compared with other competitive networks.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62201454 and 62306235) and the Xi’an Science and Technology Program of Xi’an Science and Technology Bureau (No. 23SFSF0004).
Abstract: Unmanned aerial vehicle (UAV) images captured under low-light conditions often suffer from noise and uneven illumination. To address these issues, we propose a low-light image enhancement algorithm for UAV images, which is inspired by the Retinex theory and guided by a light weighted map. First, we propose a new network for reflectance component processing to suppress the noise in images. Second, we construct an illumination enhancement module that uses a light weighted map to guide the enhancement process. Finally, the processed reflectance and illumination components are recombined to obtain the enhancement results. Experimental results show that our method can suppress the noise in images while enhancing image brightness, and prevent over-enhancement in bright regions. Code and data are available at https://gitee.com/baixiaotong2/uav-images.git.
Abstract: If the degree distribution is chosen carefully, irregular low-density parity-check (LDPC) codes can outperform regular ones. An image transmission system is proposed that combines regular and irregular LDPC codes with 16QAM/64QAM modulation to improve both efficiency and reliability. Simulation results show that LDPC codes are good coding schemes for image communication over fading channels, with lower system complexity. Moreover, irregular codes obtain a coding gain of about 0.7 dB over regular ones at a BER of 10^-4. Irregular LDPC codes are therefore more suitable for image transmission than regular codes.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61379151, 61401512, 61572052, U1636219), the National Key Research and Development Program of China (Nos. 2016YFB0801303, 2016QY01W0105), and the Key Technologies Research and Development Program of Henan Province (No. 162102210032).
Abstract: Current adaptive steganography algorithms have difficulty resisting scaling attacks, and the one existing method that resists scaling attacks works only for nearest neighbor interpolation. This paper therefore proposes an image steganography algorithm based on quantization index modulation that resists both scaling attacks and statistical detection. For a spatial image, the watermarking algorithm based on quantization index modulation is used to extract the embedded domain. Then, the embedding distortion function of the new embedded domain is constructed based on S-UNIWARD steganography, and minimum distortion coding is used to embed the secret messages. Finally, according to the embedding modification amplitude of the secret messages in the new embedded domain, the quantization index modulation algorithm is applied to realize the final embedding of the secret messages in the original embedded domain. The experimental results show that the proposed algorithm is robust to three common interpolation attacks: nearest neighbor interpolation, bilinear interpolation, and bicubic interpolation. The average correct extraction rate of embedded messages increases from 50% to over 93% after a 0.5-fold scaling attack using bicubic interpolation, compared with the classical steganography algorithm S-UNIWARD. The proposed algorithm also has higher detection resistance than the original watermarking algorithm based on quantization index modulation.
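For readers unfamiliar with quantization index modulation, the basic scalar form can be sketched as below. This is the generic textbook scheme, not the paper's algorithm (which adds an S-UNIWARD-based distortion function and minimum distortion coding). The step size `delta` and the sample values are hypothetical.

```python
import numpy as np

def qim_embed(x, bits, delta):
    """Embed one bit per sample by quantizing onto one of two shifted lattices.

    Bit 0 uses multiples of delta; bit 1 uses multiples of delta offset by delta/2.
    """
    x = np.asarray(x, dtype=float)
    bits = np.asarray(bits, dtype=int)
    offset = bits * (delta / 2.0)
    return np.round((x - offset) / delta) * delta + offset

def qim_extract(y, delta):
    """Recover each bit by choosing the lattice whose nearest point is closer."""
    y = np.asarray(y, dtype=float)
    nearest0 = np.round(y / delta) * delta
    nearest1 = np.round((y - delta / 2.0) / delta) * delta + delta / 2.0
    return (np.abs(y - nearest1) < np.abs(y - nearest0)).astype(int)

# Hypothetical pixel values and message bits.
samples = np.array([10.2, 57.9, 128.0, 200.4])
message = np.array([0, 1, 1, 0])
delta = 8.0

marked = qim_embed(samples, message, delta)
recovered_clean = qim_extract(marked, delta)
# Perturbations smaller than delta/4 do not change the decoded bits.
recovered_noisy = qim_extract(marked + 1.5, delta)
```

A larger `delta` tolerates stronger distortion but changes the cover values more, which is the robustness versus detectability trade-off the abstract refers to.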
Funding: Supported by the National Natural Science Foundation of China (Key Program) (No. 11932013) and the Tianjin Science and Technology Plan Project (No. 22PTZWHZ00040).
Abstract: With the rapid development of digital communication and the widespread use of the Internet of Things, multi-view image compression has attracted increasing attention as a fundamental technology for image data communication. Multi-view image compression aims to improve compression efficiency by leveraging correlations between images. However, the requirement for synchronization and inter-image communication at the encoder side poses significant challenges, especially for constrained devices. In this study, we introduce a novel distributed image compression model based on the attention mechanism to address the challenges that arise when side information is available only during decoding. Our model integrates an encoder network, a quantization module, and a decoder network to ensure both high compression performance and high-quality image reconstruction. The encoder uses a deep Convolutional Neural Network (CNN) to extract high-level features from the input image, which then pass through the quantization module for further compression before undergoing lossless entropy coding. The decoder consists of three main components that allow it to fully exploit the information within and between images. First, a channel-spatial attention module captures and refines information within individual image feature maps. Second, a semi-coupled convolution module extracts both shared and specific information in images. Finally, a cross-attention module fuses mutual information extracted from the side information. The effectiveness of our model is validated on various datasets, including KITTI Stereo and Cityscapes. The results show that our method surpasses state-of-the-art techniques in compression performance.
Funding: Supported by the Qingdao Huanghai University School-Level Scientific Research Project (2023KJ14), the Undergraduate Teaching Reform Research Project of Shandong Provincial Department of Education (M2022328), the National Natural Science Foundation of China under Grant (42472324), and the Qingdao Postdoctoral Foundation under Grant (QDBSH202402049).
Abstract: Multimodal image fusion plays an important role in image analysis and applications. Multimodal medical image fusion combines contrast features from two or more input imaging modalities to represent the fused information in a single image. One of its critical clinical applications is fusing anatomical and functional modalities for rapid diagnosis of malignant tissues. This paper proposes a multimodal medical image fusion network (MMIF-Net) based on multiscale hybrid attention. The method first decomposes the original image to obtain the low-rank and significant parts. Then, to utilize the features at different scales, we add a multiscale mechanism that uses three filters of different sizes to extract features in the encoding network. A hybrid attention module is also introduced to obtain more image details. Finally, the fused images are reconstructed by the decoding network. We conducted experiments with clinical brain computed tomography/magnetic resonance images. The experimental results show that the proposed method works better than other advanced fusion methods.
Abstract: With the rapid development of the new energy automotive industry, improving lithium battery performance and production efficiency has become critical. This article explores the application of artificial intelligence technology in the lithium battery module PACK line, analyzing how it optimizes the production process and improves production efficiency, and predicts future development trends. The PACK line is an important link in battery manufacturing, involving complex processes such as cell sorting, welding, assembly and testing. The application of AI technology in image recognition, data analysis and predictive maintenance provides new solutions for the intelligent upgrading of the PACK line. This article describes the process of the PACK line in detail, analyzes the challenges at the current level of technology, and reviews application cases of AI technology in the manufacturing industry. The study aims to provide theoretical and practical guidance for the intelligent development of lithium battery module PACK lines, discussing the integration of AI technology, its actual performance, technical challenges, and solutions. AI technology is expected to play a greater role in the PACK line, and future research will focus on improving the adaptability of models, developing efficient algorithms, and integrating them further into the production line.
Funding: Supported by the Gansu Natural Science Foundation Programme (No. 24JRRA231), the National Natural Science Foundation of China (No. 62061023), and Gansu Provincial Education, Science and Technology Innovation and Industry (No. 2021CYZC-04).
Abstract: Medical image fusion technology is crucial for improving the detection accuracy and treatment efficiency of diseases, but existing fusion methods suffer from blurred texture details, low contrast, and an inability to fully extract the information in the fused image. To address these issues, a multimodal medical image fusion method based on mask optimization and a parallel attention mechanism is proposed. First, the entire image is converted into a binary mask, and a contour feature map is constructed to maximize the contour feature information of the image, together with a triple path network for extracting and optimizing image texture detail features. Second, a contrast enhancement module and a detail preservation module are proposed to enhance the overall brightness and texture details of the image. Next, a parallel attention mechanism is constructed using channel features and spatial feature changes to fuse the images and enhance the salient information of the fused images. Finally, a decoupling network composed of residual networks is set up to optimize the information between the fused image and the source image so as to reduce information loss in the fused image. Compared with nine advanced methods proposed in recent years, the seven objective evaluation indicators of our method improve by 6% to 31%, indicating that the method obtains fusion results with clearer texture details, higher contrast, and smaller pixel differences between the fused image and the source image. It is superior to the comparison algorithms in both subjective and objective indicators.
Funding: Funded by the Henan Science and Technology research project (222103810042), and supported by the open project of the scientific research platform of the grain information processing center of Henan University of Technology (KFJJ-2021-108), the innovative funds plan of Henan University of Technology (2021ZKCJ14), and the Henan University of Technology Youth Backbone Teacher Program.
Abstract: The application of Transformer networks and feature fusion models in medical image segmentation has attracted considerable attention in the academic community. Nevertheless, two main obstacles persist: (1) the limitations of the Transformer network in handling locally detailed features, and (2) the considerable loss of feature information in current feature fusion modules. To solve these issues, this study first presents a refined feature extraction approach that employs a double-branch feature extraction network to capture complex multi-scale local and global information from images. We then propose a low-loss feature fusion method, the Multi-branch Feature Fusion Enhancement Module (MFFEM), which achieves effective feature fusion with minimal loss. In addition, a cross-layer cross-attention fusion module (CLCA) is adopted to further improve feature fusion by enhancing the interaction between encoders and decoders at various scales. Finally, the feasibility of our method is verified on the Synapse and ACDC datasets, demonstrating its competitiveness. The average DSC is 83.62% and 91.99%, respectively, and the average HD95 is reduced to 19.55 mm and 1.15 mm, respectively.
Funding: Supported by the Natural Science Foundation Programme of Gansu Province (No. 24JRRA231), the National Natural Science Foundation of China (No. 62061023), and the Gansu Provincial Science and Technology Plan Key Research and Development Program Project (No. 24YFFA024).
Abstract: Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information in medical imaging and loses local multi-scale information in the encoding phase. To address these issues, this paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM). First, a local multi-scale feature encoder is added to SAM to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. A lightweight multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder to reduce the quadratic complexity and noise interference. Next, a cross-branch balancing adapter is designed to balance the local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to allow a smaller input image size and to mitigate overlapping in patch embeddings, the input image is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, which includes feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates the information from small-sized medical images into SAM, enhancing its suitability for clinical deployment. Compared with eight other representative image segmentation models, the proposed model shows an average improvement ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2012AA12A305), the National Key Technology R&D Program of the Ministry of Science and Technology (No. 2013BAH03B01), the Fundamental Research Funds for the Central Universities of China (No. 2042015kf0059), and the China Postdoctoral Science Foundation (No. 2015M582277).
Abstract: The key difficulty in restoring a blurred image is estimating its point spread function (PSF). In this paper, the PSF is modelled based on the modulation transfer function (MTF). The first step is to calculate the image MTF. In the traditional slanted-edge method, a sub-block is manually extracted from the original image and its MTF is taken as the result for the whole image. However, manual extraction is inefficient and leads to inaccurate results. An automatic MTF computation algorithm is therefore proposed, which extracts and screens all the effective sub-blocks and takes their average MTF as the final result. Then, a two-dimensional MTF restoration model is constructed by multiplying the horizontal and vertical MTFs, and it is combined with conventional image restoration methods to restore the blurred image. Experimental results indicate that the proposed method implements fast and accurate MTF computation and that the MTF model significantly improves the performance of conventional restoration methods.
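The separable two-dimensional MTF model mentioned above (a product of the horizontal and vertical MTFs) can be sketched as an outer product of two one-dimensional curves. This is a minimal illustration, not the authors' code: the Gaussian-shaped curves, the frequency grid, and the function name are assumptions made only for this example.

```python
import numpy as np

def separable_mtf_2d(mtf_h, mtf_v):
    """Build a 2D MTF as the product of 1D horizontal and vertical MTF curves.

    Rows index vertical frequency and columns index horizontal frequency,
    so entry [i, j] equals mtf_v[i] * mtf_h[j].
    """
    mtf_h = np.asarray(mtf_h, dtype=float)
    mtf_v = np.asarray(mtf_v, dtype=float)
    return np.outer(mtf_v, mtf_h)

# Hypothetical 1D MTF curves sampled from 0 to 0.5 cycles/pixel.
freqs = np.linspace(0.0, 0.5, 6)
mtf_h = np.exp(-(freqs / 0.3) ** 2)
mtf_v = np.exp(-(freqs / 0.2) ** 2)

mtf_2d = separable_mtf_2d(mtf_h, mtf_v)
```

In practice the two curves would come from the averaged slanted-edge measurements the abstract describes rather than from an analytic shape.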
Abstract: Objective: To study the effect of automatic tube current modulation on abdominal CT image quality and radiation dose. Methods: A total of 70 patients who underwent abdominal CT examination in our hospital from August 2019 to August 2020 were collected and divided into a control group (35 cases) and a study group (35 cases) according to a random number table. CT scans were performed with conventional parameters in the control group and with automatic tube current modulation in the study group. The CT image quality and radiation dose of the two groups were compared. Results: The image quality of the study group was better than that of the control group (P < 0.05). When NI ≥ 8, the radiation dose in the study group was lower than that in the control group (P < 0.05). Conclusion: Automatic tube current modulation yields better image quality in CT examination and significantly reduces radiation dose. It is worth applying in clinical practice.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 11305020), the Science and Technology Research Projects of the Education Department of Jilin Province, China (Grant No. 2016-354), and the Science and Technology Development Project of Jilin Province, China (Grant No. 20180520165JH).
Abstract: In this paper, we investigate phase modulation-based computational ghost imaging. Numerical simulations show that the range of the random phase affects the quality of the reconstructed image. In addition, compared with amplitude modulation-based computational ghost imaging schemes, introducing random phase modulation significantly improves the spatial resolution of the reconstructed image and extends the field of view.
Funding: Supported by the National Key Research and Development Program Topics (Grant No. 2021YFB4000905), the National Natural Science Foundation of China (Grant Nos. 62101432 and 62102309), and in part by the Shaanxi Natural Science Fundamental Research Program Project (No. 2022JM-508).
Abstract: Low-light image enhancement methods have limitations in addressing color distortion, lack of vibrancy, and uneven light distribution, and they often require paired training data. To address these issues, we propose a two-stage unsupervised low-light image enhancement algorithm called Retinex and Exposure Fusion Network (RFNet), which overcomes the over-enhancement of the high dynamic range and under-enhancement of the low dynamic range seen in existing enhancement algorithms. By training with unpaired low-light and regular-light images, the algorithm can better handle the challenges of complex environments in real-world scenarios. In the first stage, we design a multi-scale feature extraction module based on Retinex theory that extracts details and structural information at different scales to generate high-quality illumination and reflection images. In the second stage, an exposure image generator is designed using the camera response mechanism function to acquire exposure images containing more dark features, and the generated images are fused with the original input images to complete the low-light image enhancement. Experiments show the effectiveness and rationality of each module designed in this paper. The method reconstructs the details of contrast and color distribution, outperforms current state-of-the-art methods in both qualitative and quantitative metrics, and performs well on real-world images.
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFC0301400).
Abstract: Side scan sonar (SSS) is an important means of detecting and locating seafloor targets. Autonomous underwater vehicles (AUVs) carrying SSS stay near the seafloor to obtain high-resolution images and provide the outline of the target for observers. The target feature information in an SSS image is similar to the background information, and a small target has little pixel information; therefore, accurately identifying and locating small targets in SSS images is challenging. To address the problem of target misclassification, we collect SSS images of iron balls (1 m in diameter) and rocks, so the dataset contains two types of targets, ‘ball’ and ‘rock’. To enable AUVs to accurately and automatically identify small underwater targets in SSS images, this study designs a multisize parallel convolution module embedded in the state-of-the-art Yolo5. A transformer attention mechanism and a convolutional block attention module are also introduced to compare their contributions to small target detection accuracy. The performance of the proposed method is further evaluated by using the lightweight networks Mobilenet3 and Shufflenet2 as the backbone network of Yolo5. This study focuses on the performance of convolutional neural networks for detecting small targets in SSS images, and a comparison experiment using traditional HOG+SVM is carried out to highlight the neural network’s ability. The study aims to improve detection accuracy while ensuring model efficiency to meet the real-time requirements of AUV target detection.