Visual data mining is one of important approach of data mining techniques. Most of them are based on computer graphic techniques but few of them exploit image-processing techniques. This paper proposes an image proces...Visual data mining is one of important approach of data mining techniques. Most of them are based on computer graphic techniques but few of them exploit image-processing techniques. This paper proposes an image processing method, named RNAM (resemble neighborhood averaging method), to facilitate visual data mining, which is used to post-process the data mining result-image and help users to discover significant features and useful patterns effectively. The experiments show that the method is intuitive, easily-understanding and effectiveness. It provides a new approach for visual data mining.展开更多
Organoids possess immense potential for unraveling the intricate functions of human tissues and facilitating preclinical disease treatment.Their applications span from high-throughput drug screening to the modeling of...Organoids possess immense potential for unraveling the intricate functions of human tissues and facilitating preclinical disease treatment.Their applications span from high-throughput drug screening to the modeling of complex diseases,with some even achieving clinical translation.Changes in the overall size,shape,boundary,and other morphological features of organoids provide a noninvasive method for assessing organoid drug sensitivity.However,the precise segmentation of organoids in bright-field microscopy images is made difficult by the complexity of the organoid morphology and interference,including overlapping organoids,bubbles,dust particles,and cell fragments.This paper introduces the precision organoid segmentation technique(POST),which is a deep-learning algorithm for segmenting challenging organoids under simple bright-field imaging conditions.Unlike existing methods,POST accurately segments each organoid and eliminates various artifacts encountered during organoid culturing and imaging.Furthermore,it is sensitive to and aligns with measurements of organoid activity in drug sensitivity experiments.POST is expected to be a valuable tool for drug screening using organoids owing to its capability of automatically and rapidly eliminating interfering substances and thereby streamlining the organoid analysis and drug screening process.展开更多
Although Transformer-based image restoration methods have demonstrated impressive performance,existing Transformers still insufficiently exploit multiscale information.Previous non-Transformer-based studies have shown...Although Transformer-based image restoration methods have demonstrated impressive performance,existing Transformers still insufficiently exploit multiscale information.Previous non-Transformer-based studies have shown that incorporating multiscale features is crucial for improving restoration results.In this paper,we propose a multiscale Transformer(MST)that captures cross-scale attention among tokens,thereby effectively leveraging the multiscale patch recurrence prior of natural images.Furthermore,we introduce a channel-gate feed-forward network(CGFN)to enhance inter-channel information aggregation and reduce channel redundancy.To simultaneously utilise global,local and multiscale features,we design a multitype feature integration block(MFIB).Extensive experiments on both image super-resolution and HEVC compressed video artefact reduction demonstrate that the proposed MST achieves state-of-the-art performance.Ablation studies further verify the effectiveness of each proposed module.展开更多
The existence of absorption and reflection of light underwater leads to problems such as color distortion and blue-green bias in underwater images.In this study,a depthwise separable convolution-based generative adver...The existence of absorption and reflection of light underwater leads to problems such as color distortion and blue-green bias in underwater images.In this study,a depthwise separable convolution-based generative adversarial network(GAN)algorithm was proposed.Taking GAN as the basic framework,it combined a depthwise separable convolution module,attention mechanism,and reconstructed convolution module to realize the enhancement of underwater degraded images.Multi-scale features were captured by the depthwise separable convolution module,and the attention mechanism was utilized to enhance attention to important features.The reconstructed convolution module further extracts and fuses local and global features.Experimental results showed that the algorithm performs well in improving the color bias and blurring of underwater images,with PSNR reaching 27.835,SSIM reaching 0.883,UIQM reaching 3.205,and UCIQE reaching 0.713.The enhanced image outperforms the comparison algorithm in both subjective and objective metrics.展开更多
Near-infrared image sensors are widely used in fields such as material identification,machine vision,and autonomous driving.Lead sulfide colloidal quantum dot-based infrared photodiodes can be integrated with sil⁃icon...Near-infrared image sensors are widely used in fields such as material identification,machine vision,and autonomous driving.Lead sulfide colloidal quantum dot-based infrared photodiodes can be integrated with sil⁃icon-based readout circuits in a single step.Based on this,we propose a photodiode based on an n-i-p structure,which removes the buffer layer and further simplifies the manufacturing process of quantum dot image sensors,thus reducing manufacturing costs.Additionally,for the noise complexity in quantum dot image sensors when capturing images,traditional denoising and non-uniformity methods often do not achieve optimal denoising re⁃sults.For the noise and stripe-type non-uniformity commonly encountered in infrared quantum dot detector imag⁃es,a network architecture has been developed that incorporates multiple key modules.This network combines channel attention and spatial attention mechanisms,dynamically adjusting the importance of feature maps to en⁃hance the ability to distinguish between noise and details.Meanwhile,the residual dense feature fusion module further improves the network's ability to process complex image structures through hierarchical feature extraction and fusion.Furthermore,the pyramid pooling module effectively captures information at different scales,improv⁃ing the network's multi-scale feature representation ability.Through the collaborative effect of these modules,the network can better handle various mixed noise and image non-uniformity issues.Experimental results show that it outperforms the traditional U-Net network in denoising and image correction tasks.展开更多
In the image fusion field,fusing infrared images(IRIs)and visible images(VIs)excelled is a key area.The differences between IRIs and VIs make it challenging to fuse both types into a high-quality image.Accordingly,eff...In the image fusion field,fusing infrared images(IRIs)and visible images(VIs)excelled is a key area.The differences between IRIs and VIs make it challenging to fuse both types into a high-quality image.Accordingly,efficiently combining the advantages of both images while overcoming their shortcomings is necessary.To handle this challenge,we developed an end-to-end IRI andVI fusionmethod based on frequency decomposition and enhancement.By applying concepts from frequency domain analysis,we used the layering mechanism to better capture the salient thermal targets from the IRIs and the rich textural information from the VIs,respectively,significantly boosting the image fusion quality and effectiveness.In addition,the backbone network combined Restormer Blocks and Dense Blocks;Restormer blocks utilize global attention to extract shallow features.Meanwhile,Dense Blocks ensure the integration between shallow and deep features,thereby avoiding the loss of shallow attributes.Extensive experiments on TNO and MSRS datasets demonstrated that the suggested method achieved state-of-the-art(SOTA)performance in various metrics:Entropy(EN),Mutual Information(MI),Standard Deviation(SD),The Structural Similarity Index Measure(SSIM),Fusion quality(Qabf),MI of the pixel(FMI_(pixel)),and modified Visual Information Fidelity(VIF_(m)).展开更多
Microscopy imaging is fundamental in analyzing bacterial morphology and dynamics,offering critical insights into bacterial physiology and pathogenicity.Image segmentation techniques enable quantitative analysis of bac...Microscopy imaging is fundamental in analyzing bacterial morphology and dynamics,offering critical insights into bacterial physiology and pathogenicity.Image segmentation techniques enable quantitative analysis of bacterial structures,facilitating precise measurement of morphological variations and population behaviors at single-cell resolution.This paper reviews advancements in bacterial image segmentation,emphasizing the shift from traditional thresholding and watershed methods to deep learning-driven approaches.Convolutional neural networks(CNNs),U-Net architectures,and three-dimensional(3D)frameworks excel at segmenting dense biofilms and resolving antibiotic-induced morphological changes.These methods combine automated feature extraction with physics-informed postprocessing.Despite progress,challenges persist in computational efficiency,cross-species generalizability,and integration with multimodal experimental workflows.Future progress will depend on improving model robustness across species and imaging modalities,integrating multimodal data for phenotype-function mapping,and developing standard pipelines that link computational tools with clinical diagnostics.These innovations will expand microbial phenotyping beyond structural analysis,enabling deeper insights into bacterial physiology and ecological interactions.展开更多
Clouds are one of the leading causes of sun shading,which reduces the direct horizontal irradiance and curtails the photovoltaic(PV)power.It is critical to estimate cloud cover to accurately predict PV generation with...Clouds are one of the leading causes of sun shading,which reduces the direct horizontal irradiance and curtails the photovoltaic(PV)power.It is critical to estimate cloud cover to accurately predict PV generation within a very short horizon(second/minute).To achieve the precise forecasting of cloud cover,an image preprocessing method based on total-sky images is proposed to remove the interference and address the image edge distortion issue.An optimal threshold estimation method is further designed to achieve higher cloud identification precision.Considering the cloud's meteorological properties,a random hypersurface model(RHM)based on the Gaussian mixture probability hypothesis density(GM-PHD)filter is applied to track the cloud.The GM-PHD can track the rotation and diffusion of clouds,which helps to estimate sun-cloud collision.Furthermore,a hybrid autoregressive integrated moving average(ARIMA)and backpropagation(BP)neural network-based model is applied for intra-hour PV power forecasting.The experiment results demonstrate that the proposed cloud-tracking-based PV power forecasting model can capture the ramp behavior of PV power,improving forecasting precision.展开更多
Background:Brain volume measurement serves as a critical approach for assessing brain health status.Considering the close biological connection between the eyes and brain,this study aims to investigate the feasibility...Background:Brain volume measurement serves as a critical approach for assessing brain health status.Considering the close biological connection between the eyes and brain,this study aims to investigate the feasibility of estimating brain volume through retinal fundus imaging integrated with clinical metadata,and to offer a cost-effective approach for assessing brain health.Methods:Based on clinical information,retinal fundus images,and neuroimaging data derived from a multicenter,population-based cohort study,the Kai Luan Study,we proposed a cross-modal correlation representation(CMCR)network to elucidate the intricate co-degenerative relationships between the eyes and brain for 755 subjects.Specifically,individual clinical information,which has been followed up for as long as 12 years,was encoded as a prompt to enhance the accuracy of brain volume estimation.Independent internal validation and external validation were performed to assess the robustness of the proposed model.Root mean square error(RMSE),peak signal-tonoise ratio(PSNR),and structural similarity index measure(SSIM)metrics were employed to quantitatively evaluate the quality of synthetic brain images derived from retinal imaging data.Results:The proposed framework yielded average RMSE,PSNR,and SSIM values of 98.23,35.78 d B,and 0.64,respectively,which significantly outperformed 5 other methods:multi-channel Variational Autoencoder(mcVAE),Pixelto-Pixel(Pixel2pixel),transformer-based U-Net(Trans UNet),multi-scale transformer network(MT-Net),and residual vision transformer(ResViT).The two-(2D)and three-dimensional(3D)visualization results showed that the shape and texture of the synthetic brain images generated by the proposed method most closely resembled those of actual brain images.Thus,the CMCR framework accurately captured the latent structural correlations between the fundus and the brain.The average difference between predicted and actual brain volumes was 61.36 cm~3,with a relative error of 4.54%.When all of the clinical information(including age and sex,daily habits,cardiovascular factors,metabolic factors,and inflammatory factors)was encoded,the difference was decreased to 53.89 cm~3,with a relative error of 3.98%.Based on the synthesized brain magnetic resonance images from retinal fundus images,the volumes of brain tissues could be estimated with high accuracy.Conclusion:This study provides an innovative,accurate,and cost-effective approach to characterize brain health status through readily accessible retinal fundus images.展开更多
Medical image segmentation is of critical importance in the domain of contemporary medical imaging.However,U-Net and its variants exhibit limitations in capturing complex nonlinear patterns and global contextual infor...Medical image segmentation is of critical importance in the domain of contemporary medical imaging.However,U-Net and its variants exhibit limitations in capturing complex nonlinear patterns and global contextual information.Although the subsequent U-KAN model enhances nonlinear representation capabilities,it still faces challenges such as gradient vanishing during deep network training and spatial detail loss during feature downsampling,resulting in insufficient segmentation accuracy for edge structures and minute lesions.To address these challenges,this paper proposes the RE-UKAN model,which innovatively improves upon U-KAN.Firstly,a residual network is introduced into the encoder to effectively mitigate gradient vanishing through cross-layer identity mappings,thus enhancing modelling capabilities for complex pathological structures.Secondly,Efficient Local Attention(ELA)is integrated to suppress spatial detail loss during downsampling,thereby improving the perception of edge structures and minute lesions.Experimental results on four public datasets demonstrate that RE-UKAN outperforms existing medical image segmentation methods across multiple evaluation metrics,with particularly outstanding performance on the TN-SCUI 2020 dataset,achieving IoU of 88.18%and Dice of 93.57%.Compared to the baseline model,it achieves improvements of 3.05%and 1.72%,respectively.These results fully demonstrate RE-UKAN’s superior detail retention capability and boundary recognition accuracy in complex medical image segmentation tasks,providing a reliable solution for clinical precision segmentation.展开更多
The historical image of Ouyang Xiu constructed during the Song Dynasty evolved from a multifaceted portrayal that balanced his political and literary achievements into a singular cultural symbol.In the Northern Song D...The historical image of Ouyang Xiu constructed during the Song Dynasty evolved from a multifaceted portrayal that balanced his political and literary achievements into a singular cultural symbol.In the Northern Song Dynasty,writings by Ouyang Xiu's family and epitaphs by his colleagues crafted a balanced narrative emphasizing both his official duties and literary merits,thus constructing a dual image of him as a principled remonstrator and a literary master.In the Southern Song Dynasty,official historiography gradually eroded his complex persona as a political reformer by selectively trimming political disputes and emphasizing his literary lineage,ultimately establishing him as a cultural exemplar beyond factional strife.Throughout this evolution of historical writing,Ouyang Xiu's sharpness as a remonstrator was gradually obscured in historical texts,while his image as a literary master,revered by all,became firmly established.The reshaping of Ouyang Xiu's image in historical writings across the Northern and Southern Song dynasties not only reflects the logic of selecting scholar-official role models under the influence of official ideology but also reveals the inherent pattern whereby individual distinctiveness fades into symbolic construction in historical writing.展开更多
Digital watermarking technology plays an important role in detecting malicious tampering and protecting image copyright.However,in practical applications,this technology faces various problems such as severe image dis...Digital watermarking technology plays an important role in detecting malicious tampering and protecting image copyright.However,in practical applications,this technology faces various problems such as severe image distortion,inaccurate localization of the tampered regions,and difficulty in recovering content.Given these shortcomings,a fragile image watermarking algorithm for tampering blind-detection and content self-recovery is proposed.The multi-feature watermarking authentication code(AC)is constructed using texture feature of local binary patterns(LBP),direct coefficient of discrete cosine transform(DCT)and contrast feature of gray level co-occurrence matrix(GLCM)for detecting the tampered region,and the recovery code(RC)is designed according to the average grayscale value of pixels in image blocks for recovering the tampered content.Optimal pixel adjustment process(OPAP)and least significant bit(LSB)algorithms are used to embed the recovery code and authentication code into the image in a staggered manner.When detecting the integrity of the image,the authentication code comparison method and threshold judgment method are used to perform two rounds of tampering detection on the image and blindly recover the tampered content.Experimental results show that this algorithm has good transparency,strong and blind detection,and self-recovery performance against four types of malicious attacks and some conventional signal processing operations.When resisting copy-paste,text addition,cropping and vector quantization under the tampering rate(TR)10%,the average tampering detection rate is up to 94.09%,and the peak signal-to-noise ratio(PSNR)of the watermarked image and the recovered image are both greater than 41.47 and 40.31 dB,which demonstrates its excellent advantages compared with other related algorithms in recent years.展开更多
Compact size,high brightness,and wide field of view(FOV)are key requirements for long-wave infrared imagers used in military surveillance or night navigation.However,to meet the imaging requirements of high resolution...Compact size,high brightness,and wide field of view(FOV)are key requirements for long-wave infrared imagers used in military surveillance or night navigation.However,to meet the imaging requirements of high resolution and wide FOV,infrared optical systems often adopt complex optical lens groups,which will increase the size and weight of the optical system.In this paper,a strategy based on wavefront coding(WFC)is proposed to design a compact wide-FOV infrared imager.A cubic phase mask is inserted into the pupil plane of the infrared imager to correct the aberration.The simulated results show that,the WFC infrared imager has good imaging quality in a wide FOV of±16°.In addition,the WFC infrared imager achieves compactness with its 40 mm×40 mm×40 mm size.A fast focal ratio of 1 combined with an entrance pupil diameter of 25 mm ensures brightness.This work is of significance for designing a compact wide-FOV infrared imager.展开更多
Single-pixel imaging(SPI)receives widespread attention due to its superior anti-interference capabilities,and image segmentation technology can effectively facilitate its recognition and information extraction.However...Single-pixel imaging(SPI)receives widespread attention due to its superior anti-interference capabilities,and image segmentation technology can effectively facilitate its recognition and information extraction.However,the complexity of the target scene and plenty of imaging time in SPI make it challenging to achieve high-quality and concise segmentation.In this paper,we investigate the image-free intricate scene semantic segmentation in SPI.Using“learned”illumination patterns allows for the full extraction of the object's spatial information,thereby enabling pixel-level segmentation results through the decoding of the received measurements.Simulation and experimentation show that,in the absence of image reconstruction,the mean intersection over union(MIoU)of segmented image can reach higher than 85%,and the Dice coefficient(DICE)close to 90%even at the sampling ratio of 5%.Our approach may be favorable to applications in medical image segmentation and autonomous driving field.展开更多
A large-scale view of the magnetospheric cusp is expected to be obtained by the Soft X-ray Imager(SXI)onboard the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE).However,it is challenging to trace the three-d...A large-scale view of the magnetospheric cusp is expected to be obtained by the Soft X-ray Imager(SXI)onboard the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE).However,it is challenging to trace the three-dimensional cusp boundary from a two-dimensional X-ray image because the detected X-ray signals will be integrated along the line of sight.In this work,a global magnetohydrodynamic code was used to simulate the X-ray images and photon count images,assuming an interplanetary magnetic field with a pure Bz component.The assumption of an elliptic cusp boundary at a given altitude was used to trace the equatorward and poleward boundaries of the cusp from a simulated X-ray image.The average discrepancy was less than 0.1 RE.To reduce the influence of instrument effects and cosmic X-ray backgrounds,image denoising was considered before applying the method above to SXI photon count images.The cusp boundaries were reasonably reconstructed from the noisy X-ray image.展开更多
Images taken in dim environments frequently exhibit issues like insufficient brightness,noise,color shifts,and loss of detail.These problems pose significant challenges to dark image enhancement tasks.Current approach...Images taken in dim environments frequently exhibit issues like insufficient brightness,noise,color shifts,and loss of detail.These problems pose significant challenges to dark image enhancement tasks.Current approaches,while effective in global illumination modeling,often struggle to simultaneously suppress noise and preserve structural details,especially under heterogeneous lighting.Furthermore,misalignment between luminance and color channels introduces additional challenges to accurate enhancement.In response to the aforementioned difficulties,we introduce a single-stage framework,M2ATNet,using the multi-scale multi-attention and Transformer architecture.First,to address the problems of texture blurring and residual noise,we design a multi-scale multi-attention denoising module(MMAD),which is applied separately to the luminance and color channels to enhance the structural and texture modeling capabilities.Secondly,to solve the non-alignment problem of the luminance and color channels,we introduce the multi-channel feature fusion Transformer(CFFT)module,which effectively recovers the dark details and corrects the color shifts through cross-channel alignment and deep feature interaction.To guide the model to learn more stably and efficiently,we also fuse multiple types of loss functions to form a hybrid loss term.We extensively evaluate the proposed method on various standard datasets,including LOL-v1,LOL-v2,DICM,LIME,and NPE.Evaluation in terms of numerical metrics and visual quality demonstrate that M2ATNet consistently outperforms existing advanced approaches.Ablation studies further confirm the critical roles played by the MMAD and CFFT modules to detail preservation and visual fidelity under challenging illumination-deficient environments.展开更多
Over the years,Generative Adversarial Networks(GANs)have revolutionized the medical imaging industry for applications such as image synthesis,denoising,super resolution,data augmentation,and cross-modality translation...Over the years,Generative Adversarial Networks(GANs)have revolutionized the medical imaging industry for applications such as image synthesis,denoising,super resolution,data augmentation,and cross-modality translation.The objective of this review is to evaluate the advances,relevances,and limitations of GANs in medical imaging.An organised literature review was conducted following the guidelines of PRISMA(Preferred Reporting Items for Systematic Reviews and Meta-Analyses).The literature considered included peer-reviewed papers published between 2020 and 2025 across databases including PubMed,IEEE Xplore,and Scopus.The studies related to applications of GAN architectures in medical imaging with reported experimental outcomes and published in English in reputable journals and conferences were considered for the review.Thesis,white papers,communication letters,and non-English articles were not included for the same.CLAIM based quality assessment criteria were applied to the included studies to assess the quality.The study classifies diverse GAN architectures,summarizing their clinical applications,technical performances,and their implementation hardships.Key findings reveal the increasing applications of GANs for enhancing diagnostic accuracy,reducing data scarcity through synthetic data generation,and supporting modality translation.However,concerns such as limited generalizability,lack of clinical validation,and regulatory constraints persist.This review provides a comprehensive study of the prevailing scenario of GANs in medical imaging and highlights crucial research gaps and future directions.Though GANs hold transformative capability for medical imaging,their integration into clinical use demands further validation,interpretability,and regulatory alignment.展开更多
This paper proposes a tamper detection technique for semi-fragile watermarking using Quantizationbased Discrete Cosine Transform(DCT)for tamper localization.In this study,the proposed embedding strategy is investigate...This paper proposes a tamper detection technique for semi-fragile watermarking using Quantizationbased Discrete Cosine Transform(DCT)for tamper localization.In this study,the proposed embedding strategy is investigated by experimental tests over the diagonal order of the DCT coefficients.The cover image is divided into non-overlapping blocks of size 8×8 pixels.The DCT is applied to each block,and the coefficients are arranged using a zig-zag pattern within the block.In this study,the low-frequency coefficients are selected to examine the impact of the imperceptibility score and tamper detection accuracy.High accuracy of tamper detection can be achieved by checking the surrounding blocks to determine whether the corresponding block has been tampered with.The proposed tamper detection is tested under various malicious,incidental,and hybrid attacks(both incidental and malicious attacks).The experimental results demonstrate that the proposed technique achieves a Peak-Signal-to-Noise Ratio(PSNR)value of 41.2318 dB,an average Structural Similarity Index Measure(SSIM)value of 0.9768.The proposed scheme is also evaluated against malicious attacks such as copy-move,object deletion,object manipulation,and collage attacks.The proposed scheme can detect the malicious attack localization under various tampering rates.In addition,the proposed scheme can still detect tampered pixels under a hybrid attack,such as a combination ofmalicious and incidental attacks,with an average accuracy of 96.44%.展开更多
基金Supported by the National Natural Science Foun-dation of China (60173051) ,the Teaching and Research Award Pro-gramfor Outstanding Young Teachers in Higher Education Institu-tions of Ministry of Education of China ,and Liaoning Province HigherEducation Research Foundation (20040206)
文摘Visual data mining is one of important approach of data mining techniques. Most of them are based on computer graphic techniques but few of them exploit image-processing techniques. This paper proposes an image processing method, named RNAM (resemble neighborhood averaging method), to facilitate visual data mining, which is used to post-process the data mining result-image and help users to discover significant features and useful patterns effectively. The experiments show that the method is intuitive, easily-understanding and effectiveness. It provides a new approach for visual data mining.
基金supported by the National Key R&D Program of China(No.2022YFC2504403)the National Natural Science Foundation of China(No.62172202)+1 种基金the Experiment Project of China Manned Space Program(No.HYZHXM01019)the Fundamental Research Funds for the Central Universities from Southeast University(No.3207032101C3)。
文摘Organoids possess immense potential for unraveling the intricate functions of human tissues and facilitating preclinical disease treatment.Their applications span from high-throughput drug screening to the modeling of complex diseases,with some even achieving clinical translation.Changes in the overall size,shape,boundary,and other morphological features of organoids provide a noninvasive method for assessing organoid drug sensitivity.However,the precise segmentation of organoids in bright-field microscopy images is made difficult by the complexity of the organoid morphology and interference,including overlapping organoids,bubbles,dust particles,and cell fragments.This paper introduces the precision organoid segmentation technique(POST),which is a deep-learning algorithm for segmenting challenging organoids under simple bright-field imaging conditions.Unlike existing methods,POST accurately segments each organoid and eliminates various artifacts encountered during organoid culturing and imaging.Furthermore,it is sensitive to and aligns with measurements of organoid activity in drug sensitivity experiments.POST is expected to be a valuable tool for drug screening using organoids owing to its capability of automatically and rapidly eliminating interfering substances and thereby streamlining the organoid analysis and drug screening process.
基金supported in part by the National Natural Science Foundation of China under Grants 62101346 and 62301330the Guangdong Basic and Applied Basic Research Foundation under Grants 2021A1515011702 and 2022A1515110101+1 种基金the Shenzhen Science and Technology Programme under Grants JCYJ20240813141358076 and 20231121103807001the Guangdong Provincial Key Laboratory under Grant 2023B1212060076.
文摘Although Transformer-based image restoration methods have demonstrated impressive performance,existing Transformers still insufficiently exploit multiscale information.Previous non-Transformer-based studies have shown that incorporating multiscale features is crucial for improving restoration results.In this paper,we propose a multiscale Transformer(MST)that captures cross-scale attention among tokens,thereby effectively leveraging the multiscale patch recurrence prior of natural images.Furthermore,we introduce a channel-gate feed-forward network(CGFN)to enhance inter-channel information aggregation and reduce channel redundancy.To simultaneously utilise global,local and multiscale features,we design a multitype feature integration block(MFIB).Extensive experiments on both image super-resolution and HEVC compressed video artefact reduction demonstrate that the proposed MST achieves state-of-the-art performance.Ablation studies further verify the effectiveness of each proposed module.
文摘The existence of absorption and reflection of light underwater leads to problems such as color distortion and blue-green bias in underwater images.In this study,a depthwise separable convolution-based generative adversarial network(GAN)algorithm was proposed.Taking GAN as the basic framework,it combined a depthwise separable convolution module,attention mechanism,and reconstructed convolution module to realize the enhancement of underwater degraded images.Multi-scale features were captured by the depthwise separable convolution module,and the attention mechanism was utilized to enhance attention to important features.The reconstructed convolution module further extracts and fuses local and global features.Experimental results showed that the algorithm performs well in improving the color bias and blurring of underwater images,with PSNR reaching 27.835,SSIM reaching 0.883,UIQM reaching 3.205,and UCIQE reaching 0.713.The enhanced image outperforms the comparison algorithm in both subjective and objective metrics.
基金Supported by the National key research and development program in the 14th five year plan 2021YFA1200700)the National Natural Science Foundation of China(62535018,62431025,62561160113)the Natural Science Foundation of Shanghai(23ZR1473400).
文摘Near-infrared image sensors are widely used in fields such as material identification,machine vision,and autonomous driving.Lead sulfide colloidal quantum dot-based infrared photodiodes can be integrated with sil⁃icon-based readout circuits in a single step.Based on this,we propose a photodiode based on an n-i-p structure,which removes the buffer layer and further simplifies the manufacturing process of quantum dot image sensors,thus reducing manufacturing costs.Additionally,for the noise complexity in quantum dot image sensors when capturing images,traditional denoising and non-uniformity methods often do not achieve optimal denoising re⁃sults.For the noise and stripe-type non-uniformity commonly encountered in infrared quantum dot detector imag⁃es,a network architecture has been developed that incorporates multiple key modules.This network combines channel attention and spatial attention mechanisms,dynamically adjusting the importance of feature maps to en⁃hance the ability to distinguish between noise and details.Meanwhile,the residual dense feature fusion module further improves the network's ability to process complex image structures through hierarchical feature extraction and fusion.Furthermore,the pyramid pooling module effectively captures information at different scales,improv⁃ing the network's multi-scale feature representation ability.Through the collaborative effect of these modules,the network can better handle various mixed noise and image non-uniformity issues.Experimental results show that it outperforms the traditional U-Net network in denoising and image correction tasks.
基金funded by Anhui Province University Key Science and Technology Project(2024AH053415)Anhui Province University Major Science and Technology Project(2024AH040229)+3 种基金Talent Research Initiation Fund Project of Tongling University(2024tlxyrc019)Tongling University School-Level Scientific Research Project(2024tlxyptZD07)TheUniversity Synergy Innovation Programof Anhui Province(GXXT-2023-050)Tongling City Science and Technology Major Special Project(Unveiling and Commanding Model)(200401JB004).
文摘In the image fusion field,fusing infrared images(IRIs)and visible images(VIs)excelled is a key area.The differences between IRIs and VIs make it challenging to fuse both types into a high-quality image.Accordingly,efficiently combining the advantages of both images while overcoming their shortcomings is necessary.To handle this challenge,we developed an end-to-end IRI andVI fusionmethod based on frequency decomposition and enhancement.By applying concepts from frequency domain analysis,we used the layering mechanism to better capture the salient thermal targets from the IRIs and the rich textural information from the VIs,respectively,significantly boosting the image fusion quality and effectiveness.In addition,the backbone network combined Restormer Blocks and Dense Blocks;Restormer blocks utilize global attention to extract shallow features.Meanwhile,Dense Blocks ensure the integration between shallow and deep features,thereby avoiding the loss of shallow attributes.Extensive experiments on TNO and MSRS datasets demonstrated that the suggested method achieved state-of-the-art(SOTA)performance in various metrics:Entropy(EN),Mutual Information(MI),Standard Deviation(SD),The Structural Similarity Index Measure(SSIM),Fusion quality(Qabf),MI of the pixel(FMI_(pixel)),and modified Visual Information Fidelity(VIF_(m)).
基金financially supported by the Open Project Program of Wuhan National Laboratory for Optoelectronics(No.2022WNLOKF009)the National Natural Science Foundation of China(No.62475216)+2 种基金the Key Research and Development Program of Shaanxi(No.2024GH-ZDXM-37)the Fujian Provincial Natural Science Foundation of China(No.2024J01060)the Startup Program of XMU,and the Fundamental Research Funds for the Central Universities.
文摘Microscopy imaging is fundamental in analyzing bacterial morphology and dynamics,offering critical insights into bacterial physiology and pathogenicity.Image segmentation techniques enable quantitative analysis of bacterial structures,facilitating precise measurement of morphological variations and population behaviors at single-cell resolution.This paper reviews advancements in bacterial image segmentation,emphasizing the shift from traditional thresholding and watershed methods to deep learning-driven approaches.Convolutional neural networks(CNNs),U-Net architectures,and three-dimensional(3D)frameworks excel at segmenting dense biofilms and resolving antibiotic-induced morphological changes.These methods combine automated feature extraction with physics-informed postprocessing.Despite progress,challenges persist in computational efficiency,cross-species generalizability,and integration with multimodal experimental workflows.Future progress will depend on improving model robustness across species and imaging modalities,integrating multimodal data for phenotype-function mapping,and developing standard pipelines that link computational tools with clinical diagnostics.These innovations will expand microbial phenotyping beyond structural analysis,enabling deeper insights into bacterial physiology and ecological interactions.
基金supported by National Natural Science Foundation of China(U1909201,62206062).
文摘Clouds are one of the leading causes of sun shading,which reduces the direct horizontal irradiance and curtails the photovoltaic(PV)power.It is critical to estimate cloud cover to accurately predict PV generation within a very short horizon(second/minute).To achieve the precise forecasting of cloud cover,an image preprocessing method based on total-sky images is proposed to remove the interference and address the image edge distortion issue.An optimal threshold estimation method is further designed to achieve higher cloud identification precision.Considering the cloud's meteorological properties,a random hypersurface model(RHM)based on the Gaussian mixture probability hypothesis density(GM-PHD)filter is applied to track the cloud.The GM-PHD can track the rotation and diffusion of clouds,which helps to estimate sun-cloud collision.Furthermore,a hybrid autoregressive integrated moving average(ARIMA)and backpropagation(BP)neural network-based model is applied for intra-hour PV power forecasting.The experiment results demonstrate that the proposed cloud-tracking-based PV power forecasting model can capture the ramp behavior of PV power,improving forecasting precision.
基金supported by the National Natural Science Foundation of China(62522119 and 62372358)the Beijing Natural Science Foundation(7242267)+2 种基金the Beijing Scholars Program([2015]160)the Natural Science Basic Research Program of Shaanxi(2023-JC-QN-0719)the Guangdong Basic and Applied Basic Research Foundation(2022A1515110453)。
文摘Background:Brain volume measurement serves as a critical approach for assessing brain health status.Considering the close biological connection between the eyes and brain,this study aims to investigate the feasibility of estimating brain volume through retinal fundus imaging integrated with clinical metadata,and to offer a cost-effective approach for assessing brain health.Methods:Based on clinical information,retinal fundus images,and neuroimaging data derived from a multicenter,population-based cohort study,the Kai Luan Study,we proposed a cross-modal correlation representation(CMCR)network to elucidate the intricate co-degenerative relationships between the eyes and brain for 755 subjects.Specifically,individual clinical information,which has been followed up for as long as 12 years,was encoded as a prompt to enhance the accuracy of brain volume estimation.Independent internal validation and external validation were performed to assess the robustness of the proposed model.Root mean square error(RMSE),peak signal-tonoise ratio(PSNR),and structural similarity index measure(SSIM)metrics were employed to quantitatively evaluate the quality of synthetic brain images derived from retinal imaging data.Results:The proposed framework yielded average RMSE,PSNR,and SSIM values of 98.23,35.78 d B,and 0.64,respectively,which significantly outperformed 5 other methods:multi-channel Variational Autoencoder(mcVAE),Pixelto-Pixel(Pixel2pixel),transformer-based U-Net(Trans UNet),multi-scale transformer network(MT-Net),and residual vision transformer(ResViT).The two-(2D)and three-dimensional(3D)visualization results showed that the shape and texture of the synthetic brain images generated by the proposed method most closely resembled those of actual brain images.Thus,the CMCR framework accurately captured the latent structural correlations between the fundus and the brain.The average difference between predicted and actual brain volumes was 61.36 cm~3,with a relative error of 4.54%.When all of the clinical information(including age and sex,daily habits,cardiovascular factors,metabolic factors,and inflammatory factors)was encoded,the difference was decreased to 53.89 cm~3,with a relative error of 3.98%.Based on the synthesized brain magnetic resonance images from retinal fundus images,the volumes of brain tissues could be estimated with high accuracy.Conclusion:This study provides an innovative,accurate,and cost-effective approach to characterize brain health status through readily accessible retinal fundus images.
文摘Medical image segmentation is of critical importance in the domain of contemporary medical imaging.However,U-Net and its variants exhibit limitations in capturing complex nonlinear patterns and global contextual information.Although the subsequent U-KAN model enhances nonlinear representation capabilities,it still faces challenges such as gradient vanishing during deep network training and spatial detail loss during feature downsampling,resulting in insufficient segmentation accuracy for edge structures and minute lesions.To address these challenges,this paper proposes the RE-UKAN model,which innovatively improves upon U-KAN.Firstly,a residual network is introduced into the encoder to effectively mitigate gradient vanishing through cross-layer identity mappings,thus enhancing modelling capabilities for complex pathological structures.Secondly,Efficient Local Attention(ELA)is integrated to suppress spatial detail loss during downsampling,thereby improving the perception of edge structures and minute lesions.Experimental results on four public datasets demonstrate that RE-UKAN outperforms existing medical image segmentation methods across multiple evaluation metrics,with particularly outstanding performance on the TN-SCUI 2020 dataset,achieving IoU of 88.18%and Dice of 93.57%.Compared to the baseline model,it achieves improvements of 3.05%and 1.72%,respectively.These results fully demonstrate RE-UKAN’s superior detail retention capability and boundary recognition accuracy in complex medical image segmentation tasks,providing a reliable solution for clinical precision segmentation.
基金an initial outcome of the Research on the Interactive Relationship Between Biographies and Epitaphs in Ancient China,a project(ID:24BZW023)supported by the National Social Science Fund of China。
文摘The historical image of Ouyang Xiu constructed during the Song Dynasty evolved from a multifaceted portrayal that balanced his political and literary achievements into a singular cultural symbol.In the Northern Song Dynasty,writings by Ouyang Xiu's family and epitaphs by his colleagues crafted a balanced narrative emphasizing both his official duties and literary merits,thus constructing a dual image of him as a principled remonstrator and a literary master.In the Southern Song Dynasty,official historiography gradually eroded his complex persona as a political reformer by selectively trimming political disputes and emphasizing his literary lineage,ultimately establishing him as a cultural exemplar beyond factional strife.Throughout this evolution of historical writing,Ouyang Xiu's sharpness as a remonstrator was gradually obscured in historical texts,while his image as a literary master,revered by all,became firmly established.The reshaping of Ouyang Xiu's image in historical writings across the Northern and Southern Song dynasties not only reflects the logic of selecting scholar-official role models under the influence of official ideology but also reveals the inherent pattern whereby individual distinctiveness fades into symbolic construction in historical writing.
基金supported by Postgraduate Research&Practice Innovation Program of Jiangsu Province,China(Grant No.SJCX24_1332)Jiangsu Province Education Science Planning Project in 2024(Grant No.B-b/2024/01/122)High-Level Talent Scientific Research Foundation of Jinling Institute of Technology,China(Grant No.jit-b-201918).
文摘Digital watermarking technology plays an important role in detecting malicious tampering and protecting image copyright.However,in practical applications,this technology faces various problems such as severe image distortion,inaccurate localization of the tampered regions,and difficulty in recovering content.Given these shortcomings,a fragile image watermarking algorithm for tampering blind-detection and content self-recovery is proposed.The multi-feature watermarking authentication code(AC)is constructed using texture feature of local binary patterns(LBP),direct coefficient of discrete cosine transform(DCT)and contrast feature of gray level co-occurrence matrix(GLCM)for detecting the tampered region,and the recovery code(RC)is designed according to the average grayscale value of pixels in image blocks for recovering the tampered content.Optimal pixel adjustment process(OPAP)and least significant bit(LSB)algorithms are used to embed the recovery code and authentication code into the image in a staggered manner.When detecting the integrity of the image,the authentication code comparison method and threshold judgment method are used to perform two rounds of tampering detection on the image and blindly recover the tampered content.Experimental results show that this algorithm has good transparency,strong and blind detection,and self-recovery performance against four types of malicious attacks and some conventional signal processing operations.When resisting copy-paste,text addition,cropping and vector quantization under the tampering rate(TR)10%,the average tampering detection rate is up to 94.09%,and the peak signal-to-noise ratio(PSNR)of the watermarked image and the recovered image are both greater than 41.47 and 40.31 dB,which demonstrates its excellent advantages compared with other related algorithms in recent years.
文摘Compact size,high brightness,and wide field of view(FOV)are key requirements for long-wave infrared imagers used in military surveillance or night navigation.However,to meet the imaging requirements of high resolution and wide FOV,infrared optical systems often adopt complex optical lens groups,which will increase the size and weight of the optical system.In this paper,a strategy based on wavefront coding(WFC)is proposed to design a compact wide-FOV infrared imager.A cubic phase mask is inserted into the pupil plane of the infrared imager to correct the aberration.The simulated results show that,the WFC infrared imager has good imaging quality in a wide FOV of±16°.In addition,the WFC infrared imager achieves compactness with its 40 mm×40 mm×40 mm size.A fast focal ratio of 1 combined with an entrance pupil diameter of 25 mm ensures brightness.This work is of significance for designing a compact wide-FOV infrared imager.
基金Project supported by the Fundamental Research Funds for the Central Universities of China(Grant No.531118010757)。
文摘Single-pixel imaging(SPI)receives widespread attention due to its superior anti-interference capabilities,and image segmentation technology can effectively facilitate its recognition and information extraction.However,the complexity of the target scene and plenty of imaging time in SPI make it challenging to achieve high-quality and concise segmentation.In this paper,we investigate the image-free intricate scene semantic segmentation in SPI.Using“learned”illumination patterns allows for the full extraction of the object's spatial information,thereby enabling pixel-level segmentation results through the decoding of the received measurements.Simulation and experimentation show that,in the absence of image reconstruction,the mean intersection over union(MIoU)of segmented image can reach higher than 85%,and the Dice coefficient(DICE)close to 90%even at the sampling ratio of 5%.Our approach may be favorable to applications in medical image segmentation and autonomous driving field.
基金funded by the National Natural Science Foundation of China(NNSFC)under Grant Numbers 42322408,42188101,and 42441809Additional support was provided by the Climbing Program of the National Space Science Center(NSSC,Grant No.E4PD3005)as well as the Specialized Research Fund for State Key Laboratories of China.
文摘A large-scale view of the magnetospheric cusp is expected to be obtained by the Soft X-ray Imager(SXI)onboard the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE).However,it is challenging to trace the three-dimensional cusp boundary from a two-dimensional X-ray image because the detected X-ray signals will be integrated along the line of sight.In this work,a global magnetohydrodynamic code was used to simulate the X-ray images and photon count images,assuming an interplanetary magnetic field with a pure Bz component.The assumption of an elliptic cusp boundary at a given altitude was used to trace the equatorward and poleward boundaries of the cusp from a simulated X-ray image.The average discrepancy was less than 0.1 RE.To reduce the influence of instrument effects and cosmic X-ray backgrounds,image denoising was considered before applying the method above to SXI photon count images.The cusp boundaries were reasonably reconstructed from the noisy X-ray image.
基金funded by the National Natural Science Foundation of China,grant numbers 52374156 and 62476005。
文摘Images taken in dim environments frequently exhibit issues like insufficient brightness,noise,color shifts,and loss of detail.These problems pose significant challenges to dark image enhancement tasks.Current approaches,while effective in global illumination modeling,often struggle to simultaneously suppress noise and preserve structural details,especially under heterogeneous lighting.Furthermore,misalignment between luminance and color channels introduces additional challenges to accurate enhancement.In response to the aforementioned difficulties,we introduce a single-stage framework,M2ATNet,using the multi-scale multi-attention and Transformer architecture.First,to address the problems of texture blurring and residual noise,we design a multi-scale multi-attention denoising module(MMAD),which is applied separately to the luminance and color channels to enhance the structural and texture modeling capabilities.Secondly,to solve the non-alignment problem of the luminance and color channels,we introduce the multi-channel feature fusion Transformer(CFFT)module,which effectively recovers the dark details and corrects the color shifts through cross-channel alignment and deep feature interaction.To guide the model to learn more stably and efficiently,we also fuse multiple types of loss functions to form a hybrid loss term.We extensively evaluate the proposed method on various standard datasets,including LOL-v1,LOL-v2,DICM,LIME,and NPE.Evaluation in terms of numerical metrics and visual quality demonstrate that M2ATNet consistently outperforms existing advanced approaches.Ablation studies further confirm the critical roles played by the MMAD and CFFT modules to detail preservation and visual fidelity under challenging illumination-deficient environments.
基金supported by Deanship of Research and Graduate Studies at King Khalid University for funding this work through Large Research Project under grant number RGP2/540/46.
文摘Over the years,Generative Adversarial Networks(GANs)have revolutionized the medical imaging industry for applications such as image synthesis,denoising,super resolution,data augmentation,and cross-modality translation.The objective of this review is to evaluate the advances,relevances,and limitations of GANs in medical imaging.An organised literature review was conducted following the guidelines of PRISMA(Preferred Reporting Items for Systematic Reviews and Meta-Analyses).The literature considered included peer-reviewed papers published between 2020 and 2025 across databases including PubMed,IEEE Xplore,and Scopus.The studies related to applications of GAN architectures in medical imaging with reported experimental outcomes and published in English in reputable journals and conferences were considered for the review.Thesis,white papers,communication letters,and non-English articles were not included for the same.CLAIM based quality assessment criteria were applied to the included studies to assess the quality.The study classifies diverse GAN architectures,summarizing their clinical applications,technical performances,and their implementation hardships.Key findings reveal the increasing applications of GANs for enhancing diagnostic accuracy,reducing data scarcity through synthetic data generation,and supporting modality translation.However,concerns such as limited generalizability,lack of clinical validation,and regulatory constraints persist.This review provides a comprehensive study of the prevailing scenario of GANs in medical imaging and highlights crucial research gaps and future directions.Though GANs hold transformative capability for medical imaging,their integration into clinical use demands further validation,interpretability,and regulatory alignment.
基金funded by Ministry of Higher Education Malaysia through Universiti Malaysia Pahang Al-Sultan Abdullah under Internal Research Grant(RDU233003).
文摘This paper proposes a tamper detection technique for semi-fragile watermarking using Quantizationbased Discrete Cosine Transform(DCT)for tamper localization.In this study,the proposed embedding strategy is investigated by experimental tests over the diagonal order of the DCT coefficients.The cover image is divided into non-overlapping blocks of size 8×8 pixels.The DCT is applied to each block,and the coefficients are arranged using a zig-zag pattern within the block.In this study,the low-frequency coefficients are selected to examine the impact of the imperceptibility score and tamper detection accuracy.High accuracy of tamper detection can be achieved by checking the surrounding blocks to determine whether the corresponding block has been tampered with.The proposed tamper detection is tested under various malicious,incidental,and hybrid attacks(both incidental and malicious attacks).The experimental results demonstrate that the proposed technique achieves a Peak-Signal-to-Noise Ratio(PSNR)value of 41.2318 dB,an average Structural Similarity Index Measure(SSIM)value of 0.9768.The proposed scheme is also evaluated against malicious attacks such as copy-move,object deletion,object manipulation,and collage attacks.The proposed scheme can detect the malicious attack localization under various tampering rates.In addition,the proposed scheme can still detect tampered pixels under a hybrid attack,such as a combination ofmalicious and incidental attacks,with an average accuracy of 96.44%.