Journal Articles
7,768 articles found
BDMFuse: Multi-scale network fusion for infrared and visible images based on base and detail features
1
Authors: SI Hai-Ping, ZHAO Wen-Rui, LI Ting-Ting, LI Fei-Tao, Fernando Bacao, SUN Chang-Xia, LI Yan-Ling. 《红外与毫米波学报》 (Journal of Infrared and Millimeter Waves), PKU Core, 2025, Issue 2, pp. 289-298 (10 pages)
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image. Because this extraction may leave some information uncaptured, a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines the low-frequency, high-frequency and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Keywords: infrared image; visible image; image fusion; encoder-decoder; multi-scale features
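The base/detail/compensation split described in the abstract can be illustrated with a small PyTorch sketch. This is a minimal, hypothetical rendering of the idea only: the module name BDMFuseLite, the kernel sizes, and the element-wise-mean fusion are assumptions, not the authors' implementation (the paper fuses features with an attention strategy).

```python
import torch
import torch.nn as nn

def conv_block(cin, cout, k):
    return nn.Sequential(nn.Conv2d(cin, cout, k, padding=k // 2), nn.ReLU(inplace=True))

class BDMFuseLite(nn.Module):
    """Toy base/detail/compensation encoder with a shared decoder (illustrative only)."""
    def __init__(self, ch: int = 16):
        super().__init__()
        self.base = conv_block(1, ch, 7)      # large kernel -> low-frequency structure
        self.detail = conv_block(1, ch, 3)    # small kernel -> high-frequency texture
        self.comp = conv_block(1, ch, 3)      # compensation branch for missed information
        self.decoder = nn.Sequential(conv_block(3 * ch, ch, 3), nn.Conv2d(ch, 1, 3, padding=1))

    def encode(self, x):
        return torch.cat([self.base(x), self.detail(x), self.comp(x)], dim=1)

    def forward(self, ir, vis):
        # Element-wise mean stands in for the paper's attention-based fusion strategy.
        fused_feats = 0.5 * (self.encode(ir) + self.encode(vis))
        return self.decoder(fused_feats)

fused = BDMFuseLite()(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
print(fused.shape)   # torch.Size([1, 1, 64, 64])
```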
MA-VoxelMorph: Multi-scale attention-based VoxelMorph for nonrigid registration of thoracoabdominal CT images
2
Authors: Qing Huang, Lei Ren, Tingwei Quan, Minglei Yang, Hongmei Yuan, Kai Cao. Journal of Innovative Optical Health Sciences, 2025, Issue 1, pp. 135-151 (17 pages)
This paper aims to develop a nonrigid registration method for preoperative and intraoperative thoracoabdominal CT images in computer-assisted interventional surgeries, for accurate tumor localization and enhanced tissue visualization. However, fine-structure registration of complex thoracoabdominal organs and large-deformation registration caused by respiratory motion are challenging. To deal with this problem, we propose a 3D multi-scale attention VoxelMorph (MA-VoxelMorph) registration network. To alleviate the large-deformation problem, a multi-scale axial attention mechanism is utilized, using residual dilated pyramid pooling for multi-scale feature extraction and position-aware axial attention to capture long-distance dependencies between pixels. To further improve large-deformation and fine-structure registration results, a multi-scale context channel attention mechanism is employed, utilizing content information from adjacent encoding layers. Our method was evaluated on four public lung datasets (DIR-Lab, Creatis, Learn2Reg, OASIS) and a local dataset. Results proved that the proposed method achieved better registration performance than current state-of-the-art methods, especially in handling the registration of large deformations and fine structures. It also proved to be fast in 3D image registration, taking about 1.5 s, faster than most methods. Qualitative and quantitative assessments showed that the proposed MA-VoxelMorph has the potential to realize precise and fast tumor localization in clinical interventional surgeries.
Keywords: thoracoabdominal CT; image registration; large deformation; fine structure; multi-scale attention mechanism
Multi-Scale Vision Transformer with Dynamic Multi-Loss Function for Medical Image Retrieval and Classification
3
Authors: Omar Alqahtani, Mohamed Ghouse, Asfia Sabahath, Omer Bin Hussain, Arshiya Begum. Computers, Materials & Continua, 2025, Issue 5, pp. 2221-2244 (24 pages)
This paper introduces a novel method for medical image retrieval and classification by integrating a multi-scale encoding mechanism with Vision Transformer (ViT) architectures and a dynamic multi-loss function. The multi-scale encoding significantly enhances the model's ability to capture both fine-grained and global features, while the dynamic loss function adapts during training to optimize classification accuracy and retrieval performance. Our approach was evaluated on the ISIC-2018 and ChestX-ray14 datasets, yielding notable improvements. Specifically, on the ISIC-2018 dataset, our method achieves an F1-score improvement of +4.84% compared to the standard ViT, with a precision increase of +5.46% for melanoma (MEL). On the ChestX-ray14 dataset, the method delivers an F1-score improvement of +5.3% over the conventional ViT, with precision gains of +5.0% for pneumonia (PNEU) and +5.4% for fibrosis (FIB). Experimental results demonstrate that our approach outperforms traditional CNN-based models and existing ViT variants, particularly in retrieving relevant medical cases and enhancing diagnostic accuracy. These findings highlight the potential of the proposed method for large-scale medical image analysis, offering improved tools for clinical decision-making through superior classification and case comparison.
Keywords: medical image retrieval; vision transformer; multi-scale encoding; multi-loss function; ISIC-2018; ChestX-ray14
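One generic way to realize a loss that dynamically balances classification against retrieval during training is learned uncertainty weighting over a cross-entropy term and a triplet term. The sketch below uses that generic scheme purely as an illustration; it is an assumption, not the paper's actual dynamic multi-loss formulation, and the DynamicMultiLoss name is hypothetical.

```python
import torch
import torch.nn as nn

class DynamicMultiLoss(nn.Module):
    """Learned weighting of a classification loss and a retrieval loss (illustrative stand-in)."""
    def __init__(self):
        super().__init__()
        self.log_var_cls = nn.Parameter(torch.zeros(()))   # learnable balance terms
        self.log_var_ret = nn.Parameter(torch.zeros(()))
        self.ce = nn.CrossEntropyLoss()
        self.triplet = nn.TripletMarginLoss(margin=0.3)

    def forward(self, logits, labels, anchor, positive, negative):
        l_cls = self.ce(logits, labels)                     # classification objective
        l_ret = self.triplet(anchor, positive, negative)    # retrieval objective
        # Each term is scaled by its learned (log-)variance, with a regularizing penalty.
        return (torch.exp(-self.log_var_cls) * l_cls + self.log_var_cls +
                torch.exp(-self.log_var_ret) * l_ret + self.log_var_ret)

loss_fn = DynamicMultiLoss()
loss = loss_fn(torch.randn(8, 7), torch.randint(0, 7, (8,)),
               torch.randn(8, 128), torch.randn(8, 128), torch.randn(8, 128))
loss.backward()
```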
M2ANet: Multi-branch and multi-scale attention network for medical image segmentation
4
Authors: Wei Xue, Chuanghui Chen, Xuan Qi, Jian Qin, Zhen Tang, Yongsheng He. Chinese Physics B, 2025, Issue 8, pp. 547-559 (13 pages)
Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they can easily lose contours and textures in segmentation results. The transformer model, in contrast, can effectively capture long-range dependencies in the image, and combining the CNN and the transformer can extract both local details and global contextual features. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. In the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce the information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and to alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which adaptive average pooling and position encoding enhance contextual features, and multi-head attention is then introduced to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet through comparative experiments on four benchmark medical image segmentation datasets, particularly in the context of preserving contours and textures.
Keywords: medical image segmentation; convolutional neural network; multi-branch attention; multi-scale feature fusion
Magnetic Resonance Image Super-Resolution Based on GAN and Multi-Scale Residual Dense Attention Network
5
Authors: GUAN Chunling, YU Suping, XU Wujun, FAN Hong. Journal of Donghua University (English Edition), 2025, Issue 4, pp. 435-441 (7 pages)
The application of image super-resolution (SR) has brought significant assistance to the medical field, aiding doctors in making more precise diagnoses. However, relying solely on a convolutional neural network (CNN) for image SR may lead to issues such as blurry details and excessive smoothness. To address these limitations, we proposed an algorithm based on the generative adversarial network (GAN) framework. In the generator network, convolutions of three different sizes connected by a residual dense structure are used to extract detailed features, and an attention mechanism combining channel and spatial information is applied to concentrate computing power on crucial areas. In the discriminator network, using InstanceNorm to normalize tensors speeds up the training process while retaining feature information. The experimental results demonstrate that our algorithm achieves a higher peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) than other methods, resulting in improved visual quality.
Keywords: magnetic resonance (MR); image super-resolution (SR); attention mechanism; generative adversarial network (GAN); multi-scale convolution
Experiments on image data augmentation techniques for geological rock type classification with convolutional neural networks (Cited by 1)
6
Authors: Afshin Tatar, Manouchehr Haghighi, Abbas Zeinijahromi. Journal of Rock Mechanics and Geotechnical Engineering, 2025, Issue 1, pp. 106-125 (20 pages)
The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. The study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities, achieving an F1-score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, an improvement over the baseline results. Moreover, the weighted average F1-score across all classes and techniques is 0.9886, indicating an enhancement. Conversely, methods like Distort lead to decreased accuracy and F1-score, with an F1-score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance compared to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results. The findings of this study can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis for both scientific research and industrial applications.
Keywords: deep learning (DL); image analysis; image data augmentation; convolutional neural networks (CNNs); geological image analysis; rock classification; rock thin section (RTS) images
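A comparison of classic augmentations of the kind the abstract describes can be wired up with Pillow and torchvision. The sketch below is a minimal illustration under stated assumptions: the chosen transforms, the image size, and the rts_sample.png path are placeholders, not the paper's exact augmentation set or data layout.

```python
from PIL import Image, ImageOps
import torchvision.transforms as T

# Classic augmentation variants of the kind compared in the paper (illustrative choices).
augmentations = {
    "baseline": T.Compose([T.Resize((224, 224)), T.ToTensor()]),
    "equalize": T.Compose([T.Resize((224, 224)),
                           T.Lambda(lambda img: ImageOps.equalize(img)),  # histogram equalization
                           T.ToTensor()]),
    "flip_rotate": T.Compose([T.Resize((224, 224)),
                              T.RandomHorizontalFlip(p=0.5),
                              T.RandomRotation(degrees=15),
                              T.ToTensor()]),
}

img = Image.open("rts_sample.png").convert("RGB")   # placeholder thin-section image path
tensors = {name: tf(img) for name, tf in augmentations.items()}
for name, x in tensors.items():
    print(name, tuple(x.shape))   # each variant would feed the same CNN for comparison
```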
A Custom Medical Image De-identification System Based on Data Privacy
7
Authors: ZHANG Jingchen, WANG Jiayang, ZHAO Yuanzhi, ZHOU Wei, LUO Wei, QIAN Qing. 《数据与计算发展前沿(中英文)》, 2025, Issue 3, pp. 122-135 (14 pages)
[Objective] Medical imaging data has great value, but it contains a significant amount of sensitive information about patients. At present, laws and regulations regarding the de-identification of medical imaging data are not clearly defined around the world. This study aims to develop a tool that meets compliance-driven desensitization requirements tailored to diverse research needs. [Methods] To enhance the security of medical image data, we designed and implemented a DICOM-format medical image de-identification system on the Windows operating system. [Results] Our custom de-identification system is adaptable to the legal standards of different countries and can accommodate specific research demands. The system offers both web-based online and desktop offline de-identification capabilities, enabling customization of de-identification rules and facilitating batch processing to improve efficiency. [Conclusions] This medical image de-identification system robustly strengthens the stewardship of sensitive medical data, aligning with data-security protection requirements while facilitating the sharing and utilization of medical image data. This approach unlocks the intrinsic value inherent in such datasets.
Keywords: de-identification system; medical image; data privacy; DICOM; data sharing
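Rule-driven DICOM de-identification of the kind described above can be sketched with the pydicom package. The tag list, folder names, and batch layout below are placeholders for illustration, not the authors' configurable rule set.

```python
from pathlib import Path
import pydicom

# Example rule set: attributes to blank out (a placeholder, not the paper's rules).
TAGS_TO_BLANK = ["PatientName", "PatientID", "PatientBirthDate",
                 "InstitutionName", "ReferringPhysicianName"]

def deidentify(src: Path, dst: Path) -> None:
    ds = pydicom.dcmread(src)
    for tag in TAGS_TO_BLANK:
        if tag in ds:
            ds.data_element(tag).value = ""   # blank sensitive attributes
    ds.remove_private_tags()                  # drop vendor-specific private tags
    ds.save_as(dst)

# Batch processing over a folder of studies (layout assumed for illustration).
Path("deidentified").mkdir(exist_ok=True)
for path in Path("incoming_dicom").glob("*.dcm"):
    deidentify(path, Path("deidentified") / path.name)
```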
A Novel Data-Annotated Label Collection and Deep-Learning Based Medical Image Segmentation in Reversible Data Hiding Domain
8
Authors: Lord Amoah, Jinwei Wang, Bernard-Marie Onzo. Computer Modeling in Engineering & Sciences, 2025, Issue 5, pp. 1635-1660 (26 pages)
Medical image segmentation, i.e., labeling structures of interest in medical images, is crucial for disease diagnosis and treatment in radiology. In reversible data hiding in medical images (RDHMI), segmentation consists of only two regions: the focal and nonfocal regions. The focal region mainly contains information for diagnosis, while the nonfocal region serves as the monochrome background. The traditional segmentation methods currently utilized in RDHMI are inaccurate for complex medical images, and manual segmentation is time-consuming, poorly reproducible, and operator-dependent. Implementing state-of-the-art deep learning (DL) models would bring key benefits, but the lack of domain-specific labels for existing medical datasets makes it impossible. To address this problem, this study provides labels for existing medical datasets based on a hybrid segmentation approach to facilitate the implementation of DL segmentation models in this domain. First, an initial segmentation based on a 3×3 kernel is performed to analyze identified contour pixels before classifying pixels into focal and nonfocal regions. Then, several human expert raters evaluate and classify the generated labels into accurate and inaccurate labels. The inaccurate labels undergo manual segmentation by medical practitioners and are scored based on a hierarchical voting scheme before being assigned to the proposed dataset. To ensure reliability and integrity in the proposed dataset, we evaluate the accurate automated labels against labels manually segmented by medical practitioners using five assessment metrics: dice coefficient, Jaccard index, precision, recall, and accuracy. The experimental results show that labels in the proposed dataset are consistent with the subjective judgment of human experts, with an average accuracy score of 94% and dice coefficient scores between 90% and 99%. The study further proposes a ResNet-UNet with concatenated spatial and channel squeeze-and-excitation (scSE) architecture for semantic segmentation to validate and illustrate the usefulness of the proposed dataset. The results demonstrate the superior performance of the proposed architecture in accurately separating the focal and nonfocal regions compared to state-of-the-art architectures. Dataset information is released at the following URL: https://www.kaggle.com/lordamoah/datasets (accessed on 31 March 2025).
Keywords: reversible data hiding; medical image segmentation; medical image dataset; deep learning
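The scSE block attached to ResNet-UNet in the abstract combines a channel squeeze-and-excitation gate with a spatial gate. Below is a compact PyTorch sketch of the standard scSE block (Roy et al.), not the authors' full architecture; note that the standard block sums the two gated outputs, whereas the abstract says "concatenated", so the exact combination may differ.

```python
import torch
import torch.nn as nn

class SCSEBlock(nn.Module):
    """Concurrent spatial and channel squeeze-and-excitation (scSE), illustrative."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel SE: global pooling -> bottleneck MLP -> per-channel gate.
        self.cse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        # Spatial SE: 1x1 conv -> per-pixel gate.
        self.sse = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.cse(x) + x * self.sse(x)   # recalibrate along channels and space

feat = torch.rand(1, 64, 32, 32)
print(SCSEBlock(64)(feat).shape)   # torch.Size([1, 64, 32, 32])
```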
Pre-trained SAM as data augmentation for image segmentation
9
Authors: Junjun Wu, Yunbo Rao, Shaoning Zeng, Bob Zhang. CAAI Transactions on Intelligence Technology, 2025, Issue 1, pp. 268-282 (15 pages)
Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple transformations of images. Later, in order to increase the diversity and complexity of data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods require a large amount of computation for training or searching. In this paper, a novel training-free method that utilizes the pre-trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM-DA) is proposed to generate augmented annotations for images. Without the need for training, it obtains prompt boxes from the original annotations and then feeds the boxes to the pre-trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than by simple manipulations, without incurring the huge computation of training a data augmentation model. Comparative experiments are conducted on three datasets: an in-house dataset, ADE20K and COCO2017. On the in-house dataset, namely the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM prove to be promising not only in image segmentation but also in data augmentation.
Keywords: data augmentation; image segmentation; large model; segment anything model
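The box-prompt workflow described above can be sketched with the public segment_anything package. The checkpoint path, image paths, and the way a box is derived from an existing mask below are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np
import cv2
from segment_anything import SamPredictor, sam_model_registry

# Load a pre-trained SAM checkpoint (model type and path are placeholders).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

def augment_annotation(image_rgb: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Derive a prompt box from an existing annotation and let SAM re-segment it."""
    ys, xs = np.where(mask > 0)
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])   # x0, y0, x1, y1
    predictor.set_image(image_rgb)
    masks, _, _ = predictor.predict(box=box, multimask_output=False)
    return masks[0]   # SAM-refined mask used as the augmented annotation

image = cv2.cvtColor(cv2.imread("plot_0001.png"), cv2.COLOR_BGR2RGB)   # placeholder paths
old_mask = cv2.imread("plot_0001_mask.png", cv2.IMREAD_GRAYSCALE)
print(augment_annotation(image, old_mask).shape)
```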
General Improvement of Image Interpolation-Based Data Hiding Methods Using Multiple-Based Number Conversion
10
Authors: Da-Chun Wu, Bing-Han. Computer Modeling in Engineering & Sciences, 2025, Issue 7, pp. 535-580 (46 pages)
Data hiding methods involve embedding secret messages into cover objects to enable covert communication in a way that is difficult to detect. In data hiding methods based on image interpolation, the image size is reduced and then enlarged through interpolation, followed by the embedding of secret data into the newly generated pixels. A general improvement approach for embedding secret messages is proposed. The approach may be regarded as a general model for enhancing the data embedding capacity of various existing image interpolation-based data hiding methods. This enhancement is achieved by expanding the range of pixel values available for embedding secret messages, removing the limitation of many existing methods, where the range is restricted to powers of two to facilitate the direct embedding of bit-based messages. The improvement is accomplished through the application of multiple-based number conversion to the secret message data. The method converts the message bits into a multiple-based number and uses an algorithm to embed each digit of this number into an individual pixel, thereby enhancing the message embedding efficiency, as proved by a theorem derived in this study. The proposed improvement method has been tested through experiments on three well-known image interpolation-based data hiding methods. The results show that the proposed method can enhance the three data embedding rates by approximately 14%, 13%, and 10%, respectively, create stego-images with good quality, and resist RS steganalysis attacks. These experimental results indicate that using the multiple-based number conversion technique to improve the three interpolation-based methods increases the number of message bits embedded in the images. For many other image interpolation-based data hiding methods that use power-of-two pixel-value ranges for message embedding, the proposed improvement method is also expected to be effective in enhancing their data embedding capabilities.
Keywords: data hiding; image interpolation; interpolation-based hiding methods; steganography; multiple-based number conversion
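The core trick, converting a bit string into a mixed-radix ("multiple-based") number whose i-th digit fits the capacity of the i-th interpolated pixel, can be sketched as below. The per-pixel base list is a made-up example, not taken from the paper, and the embedding of each digit into an actual pixel is omitted.

```python
def bits_to_mixed_radix(bits: str, bases: list[int]) -> list[int]:
    """Convert a binary message into digits of a mixed-radix number.

    bases[i] is the number of values pixel i can carry (need not be a power of two).
    """
    value = int(bits, 2)                      # interpret the message as one integer
    digits = []
    for base in bases:
        value, digit = divmod(value, base)    # least-significant digit first
        digits.append(digit)
    if value != 0:
        raise ValueError("message too long for the given pixel capacities")
    return digits

def mixed_radix_to_bits(digits: list[int], bases: list[int], nbits: int) -> str:
    """Inverse conversion used at extraction time."""
    value = 0
    for digit, base in zip(reversed(digits), reversed(bases)):
        value = value * base + digit
    return format(value, f"0{nbits}b")

bases = [5, 7, 6, 9, 4, 8]                    # hypothetical per-pixel embedding capacities
msg = "1011011001"
digits = bits_to_mixed_radix(msg, bases)
print(digits, mixed_radix_to_bits(digits, bases, len(msg)) == msg)
```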
Enhancing Medical Image Classification with BSDA-Mamba: Integrating Bayesian Random Semantic Data Augmentation and Residual Connections
11
Authors: Honglin Wang, Yaohua Xu, Cheng Zhu. Computers, Materials & Continua, 2025, Issue 6, pp. 4999-5018 (20 pages)
Medical image classification is crucial in disease diagnosis, treatment planning, and clinical decision-making. We introduce a novel medical image classification approach that integrates Bayesian Random Semantic Data Augmentation (BSDA) with a Vision Mamba-based model for medical image classification (MedMamba), enhanced by residual connection blocks; we name the model BSDA-Mamba. BSDA augments medical image data semantically, enhancing the model's generalization ability and classification performance. MedMamba, a deep learning-based state space model, excels at capturing long-range dependencies in medical images. By incorporating residual connections, BSDA-Mamba further improves feature extraction capabilities. Through comprehensive experiments on eight medical image datasets, we demonstrate that BSDA-Mamba outperforms existing models in accuracy, area under the curve, and F1-score. Our results highlight BSDA-Mamba's potential as a reliable tool for medical image analysis, particularly in handling diverse imaging modalities from X-rays to MRI. The open-sourcing of our model's code and datasets will facilitate the reproduction and extension of our work.
Keywords: deep learning; medical image classification; data augmentation; visual state space model
A Lightweight Convolutional Neural Network with Hierarchical Multi-Scale Feature Fusion for Image Classification (Cited by 2)
12
Authors: Adama Dembele, Ronald Waweru Mwangi, Ananda Omutokoh Kube. Journal of Computer and Communications, 2024, Issue 2, pp. 173-200 (28 pages)
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
Keywords: MobileNet; image classification; lightweight convolutional neural network; depthwise dilated separable convolution; hierarchical multi-scale feature fusion
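A depthwise dilated separable convolution of the kind the DDSC layer builds on pairs a dilated depthwise 3x3 with a pointwise 1x1. The PyTorch sketch below illustrates that building block only; the layer ordering, normalization, and dilation rate are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DDSConv(nn.Module):
    """Depthwise dilated separable convolution: dilated depthwise 3x3 + pointwise 1x1."""
    def __init__(self, in_ch: int, out_ch: int, dilation: int = 2):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=dilation,
                                   dilation=dilation, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.rand(1, 32, 56, 56)
print(DDSConv(32, 64)(x).shape)   # torch.Size([1, 64, 56, 56]) -- resolution preserved
```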
Underwater Image Enhancement Based on Multi-scale Adversarial Network
13
Authors: ZENG Jun-yang, SI Zhan-jun. 《印刷与数字媒体技术研究》, CAS, PKU Core, 2024, Issue 5, pp. 70-77 (8 pages)
In this study, an underwater image enhancement method based on a multi-scale adversarial network was proposed to solve the problems of detail blur and color distortion in underwater images. Firstly, the local features of each layer were enhanced into global features by the proposed residual dense block, which ensured that the generated images retain more details. Secondly, a multi-scale structure was adopted to extract multi-scale semantic features of the original images. Finally, the features obtained from the dual channels were fused by an adaptive fusion module for further optimization. The discriminator adopted the structure of the Markov discriminator. In addition, by constructing mean square error, structural similarity, and perceptual color loss functions, the generated image is kept consistent with the reference image in structure, color, and content. The experimental results showed that the proposed algorithm produces a good deblurring effect on underwater images and effectively alleviates underwater color bias. In both subjective and objective evaluation indexes, the experimental results of the proposed algorithm are better than those of the comparison algorithms.
Keywords: underwater image enhancement; generative adversarial network; multi-scale feature extraction; residual dense block
Image Tamper Detection and Multi-Scale Self-Recovery Using Reference Embedding with Multi-Rate Data Protection (Cited by 1)
14
Authors: Navid Daneshmandpour, Habibollah Danyali, Mohammad Sadegh Helfroush. China Communications (SCIE, CSCD), 2019, Issue 11, pp. 154-166 (13 pages)
This paper proposes a multi-scale self-recovery (MSSR) approach to protect images against content forgery. The main idea is to provide more resistance against image tampering while enabling the recovery process in a multi-scale quality manner. In the proposed approach, the reference data is composed of several parts, and each part is protected by a channel coding rate according to its importance. The first part, which is used to reconstruct a rough approximation of the original image, is highly protected in order to resist higher tampering rates. Other parts are protected with lower rates according to their importance, leading to a lower tolerable tampering rate (TTR) but higher quality of the recovered images. The proposed MSSR approach is an efficient solution to the main disadvantage of current methods, which either recover a tampered image only at low tampering rates or fail when the tampering rate is above the TTR value. Simulation results on 10,000 test images demonstrate the efficiency of the multi-scale self-recovery feature of the proposed approach in comparison with existing methods.
Keywords: tamper detection; image recovery; multi-scale self-recovery; tolerable tampering rate
An Advanced Image Processing Technique for Backscatter-Electron Data by Scanning Electron Microscopy for Microscale Rock Exploration (Cited by 2)
15
Authors: Zhaoliang Hou, Kunfeng Qiu, Tong Zhou, Yiwei Cai. Journal of Earth Science (SCIE, CAS, CSCD), 2024, Issue 1, pp. 301-305 (5 pages)
Backscatter electron analysis from scanning electron microscopes (BSE-SEM) produces high-resolution image data of both rock samples and thin sections, showing detailed structural and geochemical (mineralogical) information. This allows an in-depth exploration of the rock microstructures and the coupled chemical characteristics in the BSE-SEM image using image processing techniques. Although image processing is a powerful tool for revealing the more subtle data "hidden" in a picture, it is not a commonly employed method in geoscientific microstructural analysis. Here, we briefly introduce the general principles of image processing and further discuss its application to studying rock microstructures using BSE-SEM image data.
Keywords: image processing; rock microstructures; electron-based imaging; data mining
Multi-Scale Feature Fusion Network for Accurate Detection of Cervical Abnormal Cells
16
Authors: Chuanyun Xu, Die Hu, Yang Zhang, Shuaiye Huang, Yisha Sun, Gang Li. Computers, Materials & Continua, 2025, Issue 4, pp. 559-574 (16 pages)
Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer. However, this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size. Pathologists often refer to surrounding cells to identify abnormalities. To emulate this slide-examination behavior, this study proposes a Multi-Scale Feature Fusion Network (MSFF-Net) for detecting cervical abnormal cells. MSFF-Net employs a Cross-Scale Pooling Model (CSPM) to effectively capture diverse features and contextual information, ranging from local details to the overall structure. Additionally, a Multi-Scale Fusion Attention (MSFA) module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales. To handle the complex environment of cervical cell images, such as cell adhesion and overlapping, the Inner-CIoU loss function is utilized to measure the overlap between bounding boxes more precisely, thereby improving detection accuracy in such scenarios. Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision (mAP) of 63.2%, outperforming state-of-the-art methods while maintaining a relatively small number of parameters (26.8 M). This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells, contributing to more accurate and efficient cervical cancer screening.
Keywords: cervical abnormal cells; image detection; multi-scale feature fusion; contextual information
CLIP-IML: A novel approach for CLIP-based image manipulation localization
17
Authors: Xue-Yang Hou, Yilihamu Yaermaimaiti, Shuo-Qi Cheng. Journal of Electronic Science and Technology, 2025, Issue 3, pp. 56-70 (15 pages)
Existing image manipulation localization (IML) techniques require large, densely annotated sets of forged images. This requirement greatly increases labeling costs and limits a model's ability to handle manipulation types that are novel or absent from the training data. To address these issues, we present CLIP-IML, an IML framework that leverages contrastive language-image pre-training (CLIP). A lightweight feature-reconstruction module transforms CLIP token sequences into spatial tensors, after which a compact feature-pyramid network and a multi-scale fusion decoder work together to capture information from fine to coarse levels. We evaluated CLIP-IML on ten public datasets that cover copy-move, splicing, removal, and artificial intelligence (AI)-generated forgeries. The framework raises the average F1-score by 7.85% relative to the strongest recent baselines and secures either first- or second-place performance on every dataset. Ablation studies show that CLIP pre-training, higher-resolution inputs, and the multi-scale decoder each make complementary contributions. Under six common post-processing perturbations, as well as the compression pipelines used by Facebook, Weibo, and WeChat, the performance decline never exceeds 2.2%, confirming strong practical robustness. Moreover, CLIP-IML requires only a few thousand annotated images for training, which markedly reduces data-collection and labeling effort compared with previous methods. All of these results indicate that CLIP-IML generalizes well to image tampering localization across a wide range of tampering scenarios.
Keywords: image manipulation localization; multi-scale feature; pre-trained model; vision-language model; Vision Transformer
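The feature-reconstruction step, turning a CLIP patch-token sequence back into a spatial tensor that a feature-pyramid network can consume, amounts to dropping the class token and reshaping the remaining tokens onto the patch grid. The sketch below assumes a ViT-B/16-style grid and is an illustration, not the authors' exact module.

```python
import torch

def tokens_to_spatial(tokens: torch.Tensor, grid: int = 14) -> torch.Tensor:
    """Reshape CLIP ViT tokens (B, 1+N, C) into a spatial map (B, C, H, W).

    Assumes a class token at index 0 and N = grid*grid patch tokens,
    e.g. a 224x224 input with 16x16 patches gives grid = 14.
    """
    patch_tokens = tokens[:, 1:, :]                      # drop the [CLS] token
    b, n, c = patch_tokens.shape
    assert n == grid * grid, "token count must match the patch grid"
    return patch_tokens.transpose(1, 2).reshape(b, c, grid, grid)

tokens = torch.randn(2, 1 + 14 * 14, 768)                # mock CLIP ViT-B/16 token sequence
print(tokens_to_spatial(tokens).shape)                   # torch.Size([2, 768, 14, 14])
```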
AMSFuse: Adaptive Multi-Scale Feature Fusion Network for Diabetic Retinopathy Classification
18
Authors: Chengzhang Zhu, Ahmed Alasri, Tao Xu, Yalong Xiao, Abdulrahman Noman, Raeed Alsabri, Xuanchu Duan, Monir Abdullah. Computers, Materials & Continua, 2025, Issue 3, pp. 5153-5167 (15 pages)
Globally, diabetic retinopathy (DR) is the primary cause of blindness, affecting millions of people worldwide. This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment. Deep learning-based automated diagnosis of diabetic retinopathy can facilitate early detection and treatment. However, traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level. On the other hand, models that focus on global semantic-level information might overlook critical, subtle local pathological features. To address this issue, we propose an adaptive multi-scale feature fusion network, called AMSFuse, which can adaptively combine multi-scale global and local features without compromising their individual representations. Specifically, our model incorporates global features for extracting high-level contextual information from retinal images. Concurrently, local features capture fine-grained details, such as microaneurysms, hemorrhages, and exudates, which are critical for DR diagnosis. These global and local features are adaptively fused using a fusion block, followed by an Integrated Attention Mechanism (IAM) that refines the fused features by emphasizing relevant regions, thereby enhancing classification accuracy for DR classification. Our model achieves 86.3% accuracy on the APTOS dataset and 96.6% on the RFMiD dataset, both of which are comparable to state-of-the-art methods.
Keywords: diabetic retinopathy; multi-scale feature fusion; global features; local features; integrated attention mechanism; retinal images
A U-Net Based Method for Radio Astronomical Image Deconvolution
19
Authors: Xinghui Zhou, Qianyun Yun, Hui Deng, Yangfan Xie, Yijun Xu, Feng Wang, Ying Mei. Research in Astronomy and Astrophysics, 2025, Issue 10, pp. 1-10 (10 pages)
Deconvolution in radio interferometry faces challenges due to incomplete sampling of the visibilities in the spatial-frequency domain caused by a limited number of antenna baselines, resulting in an ill-posed inverse problem. Reconstructing dirty images into clean ones is crucial for subsequent scientific analysis. To address these challenges, we propose a U-Net based method that extracts high-level information from the dirty image and reconstructs a clean image by effectively reducing artifacts and sidelobes. The U-Net architecture, consisting of an encoder-decoder structure with skip connections, facilitates the flow of information and preserves spatial details. Using simulated data of radio galaxies, we train our model and evaluate its performance on the testing set. Compared with the CLEAN method and the visibility- and image-conditioned denoising diffusion probabilistic model, our proposed model can effectively reconstruct both extended sources and faint point sources, with higher values of the structural similarity index measure and the peak signal-to-noise ratio. Furthermore, we investigate the impact of noise on the model performance, demonstrating its robustness under varying noise levels.
Keywords: techniques: image processing; techniques: interferometric; radio continuum: galaxies; methods: data analysis
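The encoder-decoder-with-skip-connections idea the abstract relies on can be illustrated with a minimal two-level U-Net in PyTorch. Depth, channel counts, and the direct dirty-to-clean mapping below are assumptions for illustration and are far smaller than any model one would actually train on interferometric data.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Two-level U-Net: encoder, bottleneck, decoder with one skip connection."""
    def __init__(self):
        super().__init__()
        self.enc = block(1, 32)
        self.down = nn.MaxPool2d(2)
        self.mid = block(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec = block(64, 32)                       # 64 = upsampled 32 + skip 32
        self.out = nn.Conv2d(32, 1, 1)                 # predicted clean image

    def forward(self, dirty):
        e = self.enc(dirty)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([self.up(m), e], dim=1))   # skip connection preserves detail
        return self.out(d)

dirty = torch.rand(1, 1, 128, 128)                     # mock dirty image
print(TinyUNet()(dirty).shape)                         # torch.Size([1, 1, 128, 128])
```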
HiFAST: An H I Data Calibration and Imaging Pipeline for FAST. III. Standing Wave Removal
20
Authors: Chen Xu, Jie Wang, Yingjie Jing, Fujia Li, Hengqian Gan, Ziming Liu, Tiantian Liang, Qingze Chen, Zerui Liu, Zhipeng Hou, Hao Hu, Huijie Hu, Shijie Huang, Peng Jiang, Chuan-Peng Zhang, Yan Zhu. Research in Astronomy and Astrophysics, 2025, Issue 1, pp. 247-261 (15 pages)
The standing waves present in radio telescope data are primarily due to reflections among the instruments, and they significantly degrade the spectral quality of the Five-hundred-meter Aperture Spherical radio Telescope (FAST). Eliminating these standing waves for FAST is challenging given the constant changes in their phases and amplitudes. Over a ten-second period, the phases shift by 18° while the amplitudes fluctuate by 6 mK. Thus, we developed the fast Fourier transform (FFT) filter method to eliminate these standing waves for every individual spectrum. The FFT filter can decrease the rms from 3.2 to 1.15 times the theoretical estimate. Compared to other methods, such as sine fitting and running median, the FFT filter achieves a median rms of approximately 1.2 times the theoretical expectation and the smallest scatter, at 12%. Additionally, the FFT filter method avoids the flux-loss issue encountered with some other methods. The FFT is also efficient in detecting harmonic radio frequency interference (RFI). In the FAST data, we identified three distinct types of harmonic RFI, each with amplitudes exceeding 100 mK and intrinsic frequency periods of 8.1, 0.5, and 0.37 MHz, respectively. The FFT filter, proven to be the most effective method, is integrated into the H I data calibration and imaging pipeline for FAST (HiFAST, https://hifast.readthedocs.io).
Keywords: methods: data analysis; techniques: image processing; galaxies: ISM; radio lines: galaxies
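Per-spectrum FFT filtering of the kind described above can be illustrated with NumPy: transform the spectrum, suppress the Fourier modes corresponding to the standing-wave ripple, and transform back. The channel width, ripple period, and notch width below are placeholders for a synthetic demonstration, not FAST's actual configuration or the pipeline's implementation.

```python
import numpy as np

def fft_filter(spectrum: np.ndarray, channel_width_mhz: float,
               ripple_period_mhz: float, notch_half_width: int = 2) -> np.ndarray:
    """Suppress a periodic standing-wave ripple in a single spectrum via an FFT notch."""
    n = spectrum.size
    fft = np.fft.rfft(spectrum)
    # Fourier bin corresponding to a ripple with the given period in frequency.
    ripple_bin = int(round(n * channel_width_mhz / ripple_period_mhz))
    lo, hi = max(ripple_bin - notch_half_width, 1), ripple_bin + notch_half_width + 1
    fft[lo:hi] = 0.0                      # notch out the standing-wave modes
    return np.fft.irfft(fft, n)

# Synthetic example: 6 mK standing wave with ~1.1 MHz period plus noise.
rng = np.random.default_rng(0)
freq = np.arange(4096) * 0.00762          # channel width in MHz (placeholder)
spec = 0.006 * np.sin(2 * np.pi * freq / 1.1) + rng.normal(0, 0.002, freq.size)
clean = fft_filter(spec, channel_width_mhz=0.00762, ripple_period_mhz=1.1)
print(f"rms before: {spec.std():.4f} K, after: {clean.std():.4f} K")
```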