Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method by including an improved multi-scale and multi...Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method by including an improved multi-scale and multi-direction Harris algorithm and a novel compound feature. Multi-scale circle Gaussian combined invariant moments and multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensor images with noise and illumination changes. Extensive experimental studies prove that our proposed method is capable of receiving stable and even distribution of key points as well as obtaining robust and accurate correspondence matches. It is a promising scheme in multi-source remote sensing image registration.展开更多
An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyram...An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyramid network(FPN)structure of the original YOLOv8 mode is replaced by the generalized-FPN(GFPN)structure in GiraffeDet to realize the"cross-layer"and"cross-scale"adaptive feature fusion,to enrich the semantic information and spatial information on the feature map to improve the target detection ability of the model.Secondly,a pyramid-pool module of multi atrous spatial pyramid pooling(MASPP)is designed by using the idea of atrous convolution and feature pyramid structure to extract multi-scale features,so as to improve the processing ability of the model for multi-scale objects.The experimental results show that the detection accuracy of the improved YOLOv8 model on DIOR dataset is 92%and mean average precision(mAP)is 87.9%,respectively 3.5%and 1.7%higher than those of the original model.It is proved the detection and classification ability of the proposed model on multi-dimensional optical remote sensing target has been improved.展开更多
In the article“A Lightweight Approach for Skin Lesion Detection through Optimal Features Fusion”by Khadija Manzoor,Fiaz Majeed,Ansar Siddique,Talha Meraj,Hafiz Tayyab Rauf,Mohammed A.El-Meligy,Mohamed Sharaf,Abd Ela...In the article“A Lightweight Approach for Skin Lesion Detection through Optimal Features Fusion”by Khadija Manzoor,Fiaz Majeed,Ansar Siddique,Talha Meraj,Hafiz Tayyab Rauf,Mohammed A.El-Meligy,Mohamed Sharaf,Abd Elatty E.Abd Elgawad Computers,Materials&Continua,2022,Vol.70,No.1,pp.1617–1630.DOI:10.32604/cmc.2022.018621,URL:https://www.techscience.com/cmc/v70n1/44361,there was an error regarding the affiliation for the author Hafiz Tayyab Rauf.Instead of“Centre for Smart Systems,AI and Cybersecurity,Staffordshire University,Stoke-on-Trent,UK”,the affiliation should be“Independent Researcher,Bradford,BD80HS,UK”.展开更多
Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules.In the production process,defect samples occur infrequently and exhibit random shapes and sizes,which makes it cha...Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules.In the production process,defect samples occur infrequently and exhibit random shapes and sizes,which makes it challenging to collect defective samples.Additionally,the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions.This paper proposes a novel Lightweight Multiscale Feature Fusion network(LMFF)to address these challenges.The network comprises a feature extraction network,a multi-scale feature fusion module(MFF),and a segmentation network.Specifically,a feature extraction network is proposed to obtain multi-scale feature outputs,and a multi-scale feature fusion module(MFF)is used to fuse multi-scale feature information effectively.In order to capture finer-grained multi-scale information from the fusion features,we propose a multi-scale attention module(MSA)in the segmentation network to enhance the network’s ability for small target detection.Moreover,depthwise separable convolutions are introduced to construct depthwise separable residual blocks(DSR)to reduce the model’s parameter number.Finally,to validate the proposed method’s defect segmentation and localization performance,we constructed three solar cell defect detection datasets:SolarCells,SolarCells-S,and PVEL-S.SolarCells and SolarCells-S are monocrystalline silicon datasets,and PVEL-S is a polycrystalline silicon dataset.Experimental results show that the IOU of our method on these three datasets can reach 68.5%,51.0%,and 92.7%,respectively,and the F1-Score can reach 81.3%,67.5%,and 96.2%,respectively,which surpasses other commonly usedmethods and verifies the effectiveness of our LMFF network.展开更多
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method f...The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.展开更多
Feature fusion is an important technique in medical image classification that can improve diagnostic accuracy by integrating complementary information from multiple sources.Recently,Deep Learning(DL)has been widely us...Feature fusion is an important technique in medical image classification that can improve diagnostic accuracy by integrating complementary information from multiple sources.Recently,Deep Learning(DL)has been widely used in pulmonary disease diagnosis,such as pneumonia and tuberculosis.However,traditional feature fusion methods often suffer from feature disparity,information loss,redundancy,and increased complexity,hindering the further extension of DL algorithms.To solve this problem,we propose a Graph-Convolution Fusion Network with Self-Supervised Feature Alignment(Self-FAGCFN)to address the limitations of traditional feature fusion methods in deep learning-based medical image classification for respiratory diseases such as pneumonia and tuberculosis.The network integrates Convolutional Neural Networks(CNNs)for robust feature extraction from two-dimensional grid structures and Graph Convolutional Networks(GCNs)within a Graph Neural Network branch to capture features based on graph structure,focusing on significant node representations.Additionally,an Attention-Embedding Ensemble Block is included to capture critical features from GCN outputs.To ensure effective feature alignment between pre-and post-fusion stages,we introduce a feature alignment loss that minimizes disparities.Moreover,to address the limitations of proposed methods,such as inappropriate centroid discrepancies during feature alignment and class imbalance in the dataset,we develop a Feature-Centroid Fusion(FCF)strategy and a Multi-Level Feature-Centroid Update(MLFCU)algorithm,respectively.Extensive experiments on public datasets LungVision and Chest-Xray demonstrate that the Self-FAGCFN model significantly outperforms existing methods in diagnosing pneumonia and tuberculosis,highlighting its potential for practical medical applications.展开更多
Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure p...Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.展开更多
Biometric characteristics are playing a vital role in security for the last few years.Human gait classification in video sequences is an important biometrics attribute and is used for security purposes.A new framework...Biometric characteristics are playing a vital role in security for the last few years.Human gait classification in video sequences is an important biometrics attribute and is used for security purposes.A new framework for human gait classification in video sequences using deep learning(DL)fusion assisted and posterior probability-based moth flames optimization(MFO)is proposed.In the first step,the video frames are resized and finetuned by two pre-trained lightweight DL models,EfficientNetB0 and MobileNetV2.Both models are selected based on the top-5 accuracy and less number of parameters.Later,both models are trained through deep transfer learning and extracted deep features fused using a voting scheme.In the last step,the authors develop a posterior probabilitybased MFO feature selection algorithm to select the best features.The selected features are classified using several supervised learning methods.The CASIA-B publicly available dataset has been employed for the experimental process.On this dataset,the authors selected six angles such as 0°,18°,90°,108°,162°,and 180°and obtained an average accuracy of 96.9%,95.7%,86.8%,90.0%,95.1%,and 99.7%.Results demonstrate comparable improvement in accuracy and significantly minimize the computational time with recent state-of-the-art techniques.展开更多
In order to address the challenges encountered in visual navigation for asteroid landing using traditional point features,such as significant recognition and extraction errors,low computational efficiency,and limited ...In order to address the challenges encountered in visual navigation for asteroid landing using traditional point features,such as significant recognition and extraction errors,low computational efficiency,and limited navigation accuracy,a novel approach for multi-type fusion visual navigation is proposed.This method aims to overcome the limitations of single-type features and enhance navigation accuracy.Analytical criteria for selecting multi-type features are introduced,which simultaneously improve computational efficiency and system navigation accuracy.Concerning pose estimation,both absolute and relative pose estimation methods based on multi-type feature fusion are proposed,and multi-type feature normalization is established,which significantly improves system navigation accuracy and lays the groundwork for flexible application of joint absolute-relative estimation.The feasibility and effectiveness of the proposed method are validated through simulation experiments through 4769 Castalia.展开更多
Multi-source data fusion provides high-precision spatial situational awareness essential for analyzing granular urban social activities.This study used Shanghai’s catering industry as a case study,leveraging electron...Multi-source data fusion provides high-precision spatial situational awareness essential for analyzing granular urban social activities.This study used Shanghai’s catering industry as a case study,leveraging electronic reviews and consumer data sourced from third-party restaurant platforms collected in 2021.By performing weighted processing on two-dimensional point-of-interest(POI)data,clustering hotspots of high-dimensional restaurant data were identified.A hierarchical network of restaurant hotspots was constructed following the Central Place Theory(CPT)framework,while the Geo-Informatic Tupu method was employed to resolve the challenges posed by network deformation in multi-scale processes.These findings suggest the necessity of enhancing the spatial balance of Shanghai’s urban centers by moderately increasing the number and service capacity of suburban centers at the urban periphery.Such measures would contribute to a more optimized urban structure and facilitate the outward dispersion of comfort-oriented facilities such as the restaurant industry.At a finer spatial scale,the distribution of restaurant hotspots demonstrates a polycentric and symmetric spatial pattern,with a developmental trend radiating outward along the city’s ring roads.This trend can be attributed to the efforts of restaurants to establish connections with other urban functional spaces,leading to the reconfiguration of urban spaces,expansion of restaurant-dedicated land use,and the reorganization of associated commercial activities.The results validate the existence of a polycentric urban structure in Shanghai but also highlight the instability of the restaurant hotspot network during cross-scale transitions.展开更多
Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images.Obtaining class-specific precise representations at different scales is a key aspect of feat...Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images.Obtaining class-specific precise representations at different scales is a key aspect of feature representation.However,existing methods often rely on the single-scale deep feature,neglecting shallow and deeper layer features,which poses challenges when predicting objects of varying scales within the same image.Although some studies have explored multi-scale features,they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales.To address these issues,we propose a two-stage,three-branch Transformer-based framework.The first stage incorporates multi-scale image feature extraction and hierarchical scale attention.This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales,improving the model’s generalization to diverse object scales.The second stage includes a global feature enhancement module and a region selection module.The global feature enhancement module strengthens interconnections between different image regions,mitigating the issue of incomplete represen-tations,while the region selection module models the cross-modal relationships between image features and labels.Together,these components enable the efficient acquisition of class-specific precise feature representations.Extensive experiments on public datasets,including COCO2014,VOC2007,and VOC2012,demonstrate the effectiveness of our proposed method.Our approach achieves consistent performance gains of 0.3%,0.4%,and 0.2%over state-of-the-art methods on the three datasets,respectively.These results validate the reliability and superiority of our approach for multi-label image classification.展开更多
With the acceleration of intelligent transformation of energy system,the monitoring of equipment operation status and optimization of production process in thermal power plants face the challenge of multi-source heter...With the acceleration of intelligent transformation of energy system,the monitoring of equipment operation status and optimization of production process in thermal power plants face the challenge of multi-source heterogeneous data integration.In view of the heterogeneous characteristics of physical sensor data,including temperature,vibration and pressure that generated by boilers,steam turbines and other key equipment and real-time working condition data of SCADA system,this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning.By constructing a multi-level fusion architecture,the platform adopts dynamic weight allocation strategy and 5D digital twin model to realize the collaborative analysis of physical sensor data,simulation calculation results and expert knowledge.The data fusion module combines Kalman filter,wavelet transform and Bayesian estimation method to solve the problem of data time series alignment and dimension difference.Simulation results show that the data fusion accuracy can be improved to more than 98%,and the calculation delay can be controlled within 500 ms.The data analysis module integrates Dymola simulation model and AERMOD pollutant diffusion model,supports the cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring,system response time is less than 2 seconds,and data consistency verification accuracy reaches 99.5%.展开更多
A heart attack disrupts the normal flow of blood to the heart muscle,potentially causing severe damage or death if not treated promptly.It can lead to long-term health complications,reduce quality of life,and signific...A heart attack disrupts the normal flow of blood to the heart muscle,potentially causing severe damage or death if not treated promptly.It can lead to long-term health complications,reduce quality of life,and significantly impact daily activities and overall well-being.Despite the growing popularity of deep learning,several drawbacks persist,such as complexity and the limitation of single-model learning.In this paper,we introduce a residual learning-based feature fusion technique to achieve high accuracy in differentiating abnormal cardiac rhythms heart sound.Combining MobileNet with DenseNet201 for feature fusion leverages MobileNet lightweight,efficient architecture with DenseNet201,dense connections,resulting in enhanced feature extraction and improved model performance with reduced computational cost.To further enhance the fusion,we employed residual learning to optimize the hierarchical features of heart abnormal sounds during training.The experimental results demonstrate that the proposed fusion method achieved an accuracy of 95.67%on the benchmark PhysioNet-2016 Spectrogram dataset.To further validate the performance,we applied it to the BreakHis dataset with a magnification level of 100X.The results indicate that the model maintains robust performance on the second dataset,achieving an accuracy of 96.55%.it highlights its consistent performance,making it a suitable for various applications.展开更多
Taking the Ming Tombs Forest Farm in Beijing as the research object,this research applied multi-source data fusion and GIS heat-map overlay analysis techniques,systematically collected bird observation point data from...Taking the Ming Tombs Forest Farm in Beijing as the research object,this research applied multi-source data fusion and GIS heat-map overlay analysis techniques,systematically collected bird observation point data from the Global Biodiversity Information Facility(GBIF),population distribution data from the Oak Ridge National Laboratory(ORNL)in the United States,as well as information on the composition of tree species in suitable forest areas for birds and the forest geographical information of the Ming Tombs Forest Farm,which is based on literature research and field investigations.By using GIS technology,spatial processing was carried out on bird observation points and population distribution data to identify suitable bird-watching areas in different seasons.Then,according to the suitability value range,these areas were classified into different grades(from unsuitable to highly suitable).The research findings indicated that there was significant spatial heterogeneity in the bird-watching suitability of the Ming Tombs Forest Farm.The north side of the reservoir was generally a core area with high suitability in all seasons.The deep-aged broad-leaved mixed forests supported the overlapping co-existence of the ecological niches of various bird species,such as the Zosterops simplex and Urocissa erythrorhyncha.In contrast,the shallow forest-edge coniferous pure forests and mixed forests were more suitable for specialized species like Carduelis sinica.The southern urban area and the core area of the mausoleums had relatively low suitability due to ecological fragmentation or human interference.Based on these results,this paper proposed a three-level protection framework of“core area conservation—buffer zone management—isolation zone construction”and a spatio-temporal coordinated human-bird co-existence strategy.It was also suggested that the human-bird co-existence space could be optimized through measures such as constructing sound and light buffer interfaces,restoring ecological corridors,and integrating cultural heritage elements.This research provided an operational technical approach and decision-making support for the scientific planning of bird-watching sites and the coordination of ecological protection and tourism development.展开更多
The detection of Ochotona Curzoniae serves as a fundamental component for estimating the population size of this species and for analyzing the dynamics of its population fluctuations.In natural environments,the pixels...The detection of Ochotona Curzoniae serves as a fundamental component for estimating the population size of this species and for analyzing the dynamics of its population fluctuations.In natural environments,the pixels representing Ochotona Curzoniae constitute a small fraction of the total pixels,and their distinguishing features are often subtle,complicating the target detection process.To effectively extract the characteristics of these small targets,a feature fusion approach that utilizes up-sampling and channel integration from various layers within a CNN can significantly enhance the representation of target features,ultimately improving detection accuracy.However,the top-down fusion of features from different layers may lead to information duplication and semantic bias,resulting in redundancy and high-frequency noise.To address the challenges of information redundancy and high-frequency noise during the feature fusion process in CNN,we have developed a target detection model for Ochotona Curzoniae.This model is based on a spatial-channel reconfiguration convolutional(SCConv)attentional mechanism and feature fusion(FFBCA),integrated with the Faster R-CNN framework.It consists of a feature extraction network,an attention mechanism-based feature fusion module,and a jump residual connection fusion module.Initially,we designed a dual attention mechanism feature fusion module that employs spatial-channel reconstruction convolution.In the spatial dimension,the attention mechanism adopts a separation-reconstruction approach,calculating a weight matrix for the spatial information within the feature map through group normalization.This process directs the model to concentrate on feature information assigned varying weights,thereby reducing redundancy during feature fusion.In the channel dimension,the attention mechanism utilizes a partition-transpose-fusion method,segmenting the input feature map into high-noise and low-noise components based on the variance of the feature information.The high-noise segment is processed through a low-pass filter constructed from pointwise convolution(PWC)to eliminate some high-frequency noise,while the low-noise segment employs a bottleneck structure with global average pooling(GAP)to generate a weight matrix that emphasizes the significance of channel dimension feature information.This approach diminishes the model’s focus on low-weight feature information,thereby preserving low-frequency semantic information while reducing information redundancy.Furthermore,we have developed a novel feature extraction network,ResNeXt-S,by integrating the Sim attention mechanism into ResNeXt50.This configuration assigns three-dimensional attention weights to each position within the feature map,thereby enhancing the local feature information of small targets while reducing background noise.Finally,we constructed a jump residual connection fusion module to minimize the loss of high-level semantic information during the feature fusion process.Experiments on Ochotona Curzoniae target detection on the Ochotona Curzoniae dataset show that the detection accuracy of the model in this paper is 92.3%,which is higher than that of FSSD512(84.6%),TDFSSD512(81.3%),FPN(86.5%),FFBAM(88.5%),Faster R-CNN(89.6%),and SSD512(88.6%)detection accuracies.展开更多
This study proposes a learner profile framework based on multi-feature fusion,aiming to enhance the precision of personalized learning recommendations by integrating learners’static attributes(e.g.,demographic data a...This study proposes a learner profile framework based on multi-feature fusion,aiming to enhance the precision of personalized learning recommendations by integrating learners’static attributes(e.g.,demographic data and historical academic performance)with dynamic behavioral patterns(e.g.,real-time interactions and evolving interests over time).The research employs Term Frequency-Inverse Document Frequency(TF-IDF)for semantic feature extraction,integrates the Analytic Hierarchy Process(AHP)for feature weighting,and introduces a time decay function inspired by Newton’s law of cooling to dynamically model changes in learners’interests.Empirical results demonstrate that this framework effectively captures the dynamic evolution of learners’behaviors and provides context-aware learning resource recommendations.The study introduces a novel paradigm for learner modeling in educational technology,combining methodological innovation with a scalable technical architecture,thereby laying a foundation for the development of adaptive learning systems.展开更多
Ensuring that autonomous vehicles maintain high precision and rapid response capabilities in complex and dynamic driving environments is a critical challenge in the field of autonomous driving.This study aims to enhan...Ensuring that autonomous vehicles maintain high precision and rapid response capabilities in complex and dynamic driving environments is a critical challenge in the field of autonomous driving.This study aims to enhance the learning efficiency ofmulti-sensor feature fusion in autonomous driving tasks,thereby improving the safety and responsiveness of the system.To achieve this goal,we propose an innovative multi-sensor feature fusion model that integrates three distinct modalities:visual,radar,and lidar data.The model optimizes the feature fusion process through the introduction of two novel mechanisms:Sparse Channel Pooling(SCP)and Residual Triplet-Attention(RTA).Firstly,the SCP mechanism enables the model to adaptively filter out salient feature channels while eliminating the interference of redundant features.This enhances the model’s emphasis on critical features essential for decisionmaking and strengthens its robustness to environmental variability.Secondly,the RTA mechanism addresses the issue of feature misalignment across different modalities by effectively aligning key cross-modal features.This alignment reduces the computational overhead associated with redundant features and enhances the overall efficiency of the system.Furthermore,this study incorporates a reinforcement learning module designed to optimize strategies within a continuous action space.By integrating thismodulewith the feature fusion learning process,the entire system is capable of learning efficient driving strategies in an end-to-end manner within the CARLA autonomous driving simulator.Experimental results demonstrate that the proposedmodel significantly enhances the perception and decision-making accuracy of the autonomous driving system in complex traffic scenarios while maintaining real-time responsiveness.This work provides a novel perspective and technical pathway for the application of multi-sensor data fusion in autonomous driving.展开更多
Ransomware attacks pose a significant threat to critical infrastructures,demanding robust detection mechanisms.This study introduces a hybrid model that combines vision transformer(ViT)and one-dimensional convolutiona...Ransomware attacks pose a significant threat to critical infrastructures,demanding robust detection mechanisms.This study introduces a hybrid model that combines vision transformer(ViT)and one-dimensional convolutional neural network(1DCNN)architectures to enhance ransomware detection capabilities.Addressing common challenges in ransomware detection,particularly dataset class imbalance,the synthetic minority oversampling technique(SMOTE)is employed to generate synthetic samples for minority class,thereby improving detection accuracy.The integration of ViT and 1DCNN through feature fusion enables the model to capture both global contextual and local sequential features,resulting in comprehensive ransomware classification.Tested on the UNSW-NB15 dataset,the proposed ViT-1DCNN model achieved 98%detection accuracy with precision,recall,and F1-score metrics surpassing conventional methods.This approach not only reduces false positives and negatives but also offers scalability and robustness for real-world cybersecurity applications.The results demonstrate the model’s potential as an effective tool for proactive ransomware detection,especially in environments where evolving threats require adaptable and high-accuracy solutions.展开更多
Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells an...Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size.Pathologists often refer to surrounding cells to identify abnormalities.To emulate this slide examination behavior,this study proposes a Multi-Scale Feature Fusion Network(MSFF-Net)for detecting cervical abnormal cells.MSFF-Net employs a Cross-Scale Pooling Model(CSPM)to effectively capture diverse features and contextual information,ranging from local details to the overall structure.Additionally,a Multi-Scale Fusion Attention(MSFA)module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales.To handle the complex environment of cervical cell images,such as cell adhesion and overlapping,the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes,thereby improving detection accuracy in such scenarios.Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision(mAP)of 63.2%,outperforming state-of-the-art methods while maintaining a relatively small number of parameters(26.8 M).This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells,contributing to more accurate and efficient cervical cancer screening.展开更多
With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of...With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of multimodal approaches for fake news detection has gained significant attention.To solve the problems existing in previous multi-modal fake news detection algorithms,such as insufficient feature extraction and insufficient use of semantic relations between modes,this paper proposes the MFFFND-Co(Multimodal Feature Fusion Fake News Detection with Co-Attention Block)model.First,the model deeply explores the textual content,image content,and frequency domain features.Then,it employs a Co-Attention mechanism for cross-modal fusion.Additionally,a semantic consistency detectionmodule is designed to quantify semantic deviations,thereby enhancing the performance of fake news detection.Experimentally verified on two commonly used datasets,Twitter and Weibo,the model achieved F1 scores of 90.0% and 94.0%,respectively,significantly outperforming the pre-modified MFFFND(Multimodal Feature Fusion Fake News Detection with Attention Block)model and surpassing other baseline models.This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.展开更多
基金supported by National Nature Science Foundation of China (Nos. 61462046 and 61762052)Natural Science Foundation of Jiangxi Province (Nos. 20161BAB202049 and 20161BAB204172)+2 种基金the Bidding Project of the Key Laboratory of Watershed Ecology and Geographical Environment Monitoring, NASG (Nos. WE2016003, WE2016013 and WE2016015)the Science and Technology Research Projects of Jiangxi Province Education Department (Nos. GJJ160741, GJJ170632 and GJJ170633)the Art Planning Project of Jiangxi Province (Nos. YG2016250 and YG2017381)
文摘Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method by including an improved multi-scale and multi-direction Harris algorithm and a novel compound feature. Multi-scale circle Gaussian combined invariant moments and multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensor images with noise and illumination changes. Extensive experimental studies prove that our proposed method is capable of receiving stable and even distribution of key points as well as obtaining robust and accurate correspondence matches. It is a promising scheme in multi-source remote sensing image registration.
基金supported by the National Natural Science Foundation of China(No.62241109)the Tianjin Science and Technology Commissioner Project(No.20YDTPJC01110)。
文摘An improved model based on you only look once version 8(YOLOv8)is proposed to solve the problem of low detection accuracy due to the diversity of object sizes in optical remote sensing images.Firstly,the feature pyramid network(FPN)structure of the original YOLOv8 mode is replaced by the generalized-FPN(GFPN)structure in GiraffeDet to realize the"cross-layer"and"cross-scale"adaptive feature fusion,to enrich the semantic information and spatial information on the feature map to improve the target detection ability of the model.Secondly,a pyramid-pool module of multi atrous spatial pyramid pooling(MASPP)is designed by using the idea of atrous convolution and feature pyramid structure to extract multi-scale features,so as to improve the processing ability of the model for multi-scale objects.The experimental results show that the detection accuracy of the improved YOLOv8 model on DIOR dataset is 92%and mean average precision(mAP)is 87.9%,respectively 3.5%and 1.7%higher than those of the original model.It is proved the detection and classification ability of the proposed model on multi-dimensional optical remote sensing target has been improved.
文摘In the article“A Lightweight Approach for Skin Lesion Detection through Optimal Features Fusion”by Khadija Manzoor,Fiaz Majeed,Ansar Siddique,Talha Meraj,Hafiz Tayyab Rauf,Mohammed A.El-Meligy,Mohamed Sharaf,Abd Elatty E.Abd Elgawad Computers,Materials&Continua,2022,Vol.70,No.1,pp.1617–1630.DOI:10.32604/cmc.2022.018621,URL:https://www.techscience.com/cmc/v70n1/44361,there was an error regarding the affiliation for the author Hafiz Tayyab Rauf.Instead of“Centre for Smart Systems,AI and Cybersecurity,Staffordshire University,Stoke-on-Trent,UK”,the affiliation should be“Independent Researcher,Bradford,BD80HS,UK”.
基金supported in part by the National Natural Science Foundation of China under Grants 62463002,62062021 and 62473033in part by the Guiyang Scientific Plan Project[2023]48–11,in part by QKHZYD[2023]010 Guizhou Province Science and Technology Innovation Base Construction Project“Key Laboratory Construction of Intelligent Mountain Agricultural Equipment”.
文摘Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules.In the production process,defect samples occur infrequently and exhibit random shapes and sizes,which makes it challenging to collect defective samples.Additionally,the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions.This paper proposes a novel Lightweight Multiscale Feature Fusion network(LMFF)to address these challenges.The network comprises a feature extraction network,a multi-scale feature fusion module(MFF),and a segmentation network.Specifically,a feature extraction network is proposed to obtain multi-scale feature outputs,and a multi-scale feature fusion module(MFF)is used to fuse multi-scale feature information effectively.In order to capture finer-grained multi-scale information from the fusion features,we propose a multi-scale attention module(MSA)in the segmentation network to enhance the network’s ability for small target detection.Moreover,depthwise separable convolutions are introduced to construct depthwise separable residual blocks(DSR)to reduce the model’s parameter number.Finally,to validate the proposed method’s defect segmentation and localization performance,we constructed three solar cell defect detection datasets:SolarCells,SolarCells-S,and PVEL-S.SolarCells and SolarCells-S are monocrystalline silicon datasets,and PVEL-S is a polycrystalline silicon dataset.Experimental results show that the IOU of our method on these three datasets can reach 68.5%,51.0%,and 92.7%,respectively,and the F1-Score can reach 81.3%,67.5%,and 96.2%,respectively,which surpasses other commonly usedmethods and verifies the effectiveness of our LMFF network.
基金Supported by the Henan Province Key Research and Development Project(231111211300)the Central Government of Henan Province Guides Local Science and Technology Development Funds(Z20231811005)+2 种基金Henan Province Key Research and Development Project(231111110100)Henan Provincial Outstanding Foreign Scientist Studio(GZS2024006)Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan(Application and Overcoming Technical Barriers)(242103810028)。
文摘The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
基金supported by the National Natural Science Foundation of China(62276092,62303167)the Postdoctoral Fellowship Program(Grade C)of China Postdoctoral Science Foundation(GZC20230707)+3 种基金the Key Science and Technology Program of Henan Province,China(242102211051,242102211042,212102310084)Key Scientiffc Research Projects of Colleges and Universities in Henan Province,China(25A520009)the China Postdoctoral Science Foundation(2024M760808)the Henan Province medical science and technology research plan joint construction project(LHGJ2024069).
文摘Feature fusion is an important technique in medical image classification that can improve diagnostic accuracy by integrating complementary information from multiple sources.Recently,Deep Learning(DL)has been widely used in pulmonary disease diagnosis,such as pneumonia and tuberculosis.However,traditional feature fusion methods often suffer from feature disparity,information loss,redundancy,and increased complexity,hindering the further extension of DL algorithms.To solve this problem,we propose a Graph-Convolution Fusion Network with Self-Supervised Feature Alignment(Self-FAGCFN)to address the limitations of traditional feature fusion methods in deep learning-based medical image classification for respiratory diseases such as pneumonia and tuberculosis.The network integrates Convolutional Neural Networks(CNNs)for robust feature extraction from two-dimensional grid structures and Graph Convolutional Networks(GCNs)within a Graph Neural Network branch to capture features based on graph structure,focusing on significant node representations.Additionally,an Attention-Embedding Ensemble Block is included to capture critical features from GCN outputs.To ensure effective feature alignment between pre-and post-fusion stages,we introduce a feature alignment loss that minimizes disparities.Moreover,to address the limitations of proposed methods,such as inappropriate centroid discrepancies during feature alignment and class imbalance in the dataset,we develop a Feature-Centroid Fusion(FCF)strategy and a Multi-Level Feature-Centroid Update(MLFCU)algorithm,respectively.Extensive experiments on public datasets LungVision and Chest-Xray demonstrate that the Self-FAGCFN model significantly outperforms existing methods in diagnosing pneumonia and tuberculosis,highlighting its potential for practical medical applications.
基金supported by the National Natural Science Foundation of China(No.62376287)the International Science and Technology Innovation Joint Base of Machine Vision and Medical Image Processing in Hunan Province(2021CB1013)the Natural Science Foundation of Hunan Province(Nos.2022JJ30762,2023JJ70016).
文摘Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.
基金King Saud University,Grant/Award Number:RSP2024R157。
文摘Biometric characteristics are playing a vital role in security for the last few years.Human gait classification in video sequences is an important biometrics attribute and is used for security purposes.A new framework for human gait classification in video sequences using deep learning(DL)fusion assisted and posterior probability-based moth flames optimization(MFO)is proposed.In the first step,the video frames are resized and finetuned by two pre-trained lightweight DL models,EfficientNetB0 and MobileNetV2.Both models are selected based on the top-5 accuracy and less number of parameters.Later,both models are trained through deep transfer learning and extracted deep features fused using a voting scheme.In the last step,the authors develop a posterior probabilitybased MFO feature selection algorithm to select the best features.The selected features are classified using several supervised learning methods.The CASIA-B publicly available dataset has been employed for the experimental process.On this dataset,the authors selected six angles such as 0°,18°,90°,108°,162°,and 180°and obtained an average accuracy of 96.9%,95.7%,86.8%,90.0%,95.1%,and 99.7%.Results demonstrate comparable improvement in accuracy and significantly minimize the computational time with recent state-of-the-art techniques.
基金supported by the National Natural Science Foundation of China(No.U2037602)。
文摘In order to address the challenges encountered in visual navigation for asteroid landing using traditional point features,such as significant recognition and extraction errors,low computational efficiency,and limited navigation accuracy,a novel approach for multi-type fusion visual navigation is proposed.This method aims to overcome the limitations of single-type features and enhance navigation accuracy.Analytical criteria for selecting multi-type features are introduced,which simultaneously improve computational efficiency and system navigation accuracy.Concerning pose estimation,both absolute and relative pose estimation methods based on multi-type feature fusion are proposed,and multi-type feature normalization is established,which significantly improves system navigation accuracy and lays the groundwork for flexible application of joint absolute-relative estimation.The feasibility and effectiveness of the proposed method are validated through simulation experiments through 4769 Castalia.
基金Under the auspices of the Key Program of National Natural Science Foundation of China(No.42030409)。
文摘Multi-source data fusion provides high-precision spatial situational awareness essential for analyzing granular urban social activities.This study used Shanghai’s catering industry as a case study,leveraging electronic reviews and consumer data sourced from third-party restaurant platforms collected in 2021.By performing weighted processing on two-dimensional point-of-interest(POI)data,clustering hotspots of high-dimensional restaurant data were identified.A hierarchical network of restaurant hotspots was constructed following the Central Place Theory(CPT)framework,while the Geo-Informatic Tupu method was employed to resolve the challenges posed by network deformation in multi-scale processes.These findings suggest the necessity of enhancing the spatial balance of Shanghai’s urban centers by moderately increasing the number and service capacity of suburban centers at the urban periphery.Such measures would contribute to a more optimized urban structure and facilitate the outward dispersion of comfort-oriented facilities such as the restaurant industry.At a finer spatial scale,the distribution of restaurant hotspots demonstrates a polycentric and symmetric spatial pattern,with a developmental trend radiating outward along the city’s ring roads.This trend can be attributed to the efforts of restaurants to establish connections with other urban functional spaces,leading to the reconfiguration of urban spaces,expansion of restaurant-dedicated land use,and the reorganization of associated commercial activities.The results validate the existence of a polycentric urban structure in Shanghai but also highlight the instability of the restaurant hotspot network during cross-scale transitions.
基金supported by the National Natural Science Foundation of China(62302167,62477013)Natural Science Foundation of Shanghai(No.24ZR1456100)+1 种基金Science and Technology Commission of Shanghai Municipality(No.24DZ2305900)the Shanghai Municipal Special Fund for Promoting High-Quality Development of Industries(2211106).
文摘Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images.Obtaining class-specific precise representations at different scales is a key aspect of feature representation.However,existing methods often rely on the single-scale deep feature,neglecting shallow and deeper layer features,which poses challenges when predicting objects of varying scales within the same image.Although some studies have explored multi-scale features,they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales.To address these issues,we propose a two-stage,three-branch Transformer-based framework.The first stage incorporates multi-scale image feature extraction and hierarchical scale attention.This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales,improving the model’s generalization to diverse object scales.The second stage includes a global feature enhancement module and a region selection module.The global feature enhancement module strengthens interconnections between different image regions,mitigating the issue of incomplete represen-tations,while the region selection module models the cross-modal relationships between image features and labels.Together,these components enable the efficient acquisition of class-specific precise feature representations.Extensive experiments on public datasets,including COCO2014,VOC2007,and VOC2012,demonstrate the effectiveness of our proposed method.Our approach achieves consistent performance gains of 0.3%,0.4%,and 0.2%over state-of-the-art methods on the three datasets,respectively.These results validate the reliability and superiority of our approach for multi-label image classification.
文摘With the acceleration of intelligent transformation of energy system,the monitoring of equipment operation status and optimization of production process in thermal power plants face the challenge of multi-source heterogeneous data integration.In view of the heterogeneous characteristics of physical sensor data,including temperature,vibration and pressure that generated by boilers,steam turbines and other key equipment and real-time working condition data of SCADA system,this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning.By constructing a multi-level fusion architecture,the platform adopts dynamic weight allocation strategy and 5D digital twin model to realize the collaborative analysis of physical sensor data,simulation calculation results and expert knowledge.The data fusion module combines Kalman filter,wavelet transform and Bayesian estimation method to solve the problem of data time series alignment and dimension difference.Simulation results show that the data fusion accuracy can be improved to more than 98%,and the calculation delay can be controlled within 500 ms.The data analysis module integrates Dymola simulation model and AERMOD pollutant diffusion model,supports the cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring,system response time is less than 2 seconds,and data consistency verification accuracy reaches 99.5%.
文摘A heart attack disrupts the normal flow of blood to the heart muscle,potentially causing severe damage or death if not treated promptly.It can lead to long-term health complications,reduce quality of life,and significantly impact daily activities and overall well-being.Despite the growing popularity of deep learning,several drawbacks persist,such as complexity and the limitation of single-model learning.In this paper,we introduce a residual learning-based feature fusion technique to achieve high accuracy in differentiating abnormal cardiac rhythms heart sound.Combining MobileNet with DenseNet201 for feature fusion leverages MobileNet lightweight,efficient architecture with DenseNet201,dense connections,resulting in enhanced feature extraction and improved model performance with reduced computational cost.To further enhance the fusion,we employed residual learning to optimize the hierarchical features of heart abnormal sounds during training.The experimental results demonstrate that the proposed fusion method achieved an accuracy of 95.67%on the benchmark PhysioNet-2016 Spectrogram dataset.To further validate the performance,we applied it to the BreakHis dataset with a magnification level of 100X.The results indicate that the model maintains robust performance on the second dataset,achieving an accuracy of 96.55%.it highlights its consistent performance,making it a suitable for various applications.
基金Sponsored by Beijing Youth Innovation Talent Support Program for Urban Greening and Landscaping——The 2024 Special Project for Promoting High-Quality Development of Beijing’s Landscaping through Scientific and Technological Innovation(KJCXQT202410).
文摘Taking the Ming Tombs Forest Farm in Beijing as the research object,this research applied multi-source data fusion and GIS heat-map overlay analysis techniques,systematically collected bird observation point data from the Global Biodiversity Information Facility(GBIF),population distribution data from the Oak Ridge National Laboratory(ORNL)in the United States,as well as information on the composition of tree species in suitable forest areas for birds and the forest geographical information of the Ming Tombs Forest Farm,which is based on literature research and field investigations.By using GIS technology,spatial processing was carried out on bird observation points and population distribution data to identify suitable bird-watching areas in different seasons.Then,according to the suitability value range,these areas were classified into different grades(from unsuitable to highly suitable).The research findings indicated that there was significant spatial heterogeneity in the bird-watching suitability of the Ming Tombs Forest Farm.The north side of the reservoir was generally a core area with high suitability in all seasons.The deep-aged broad-leaved mixed forests supported the overlapping co-existence of the ecological niches of various bird species,such as the Zosterops simplex and Urocissa erythrorhyncha.In contrast,the shallow forest-edge coniferous pure forests and mixed forests were more suitable for specialized species like Carduelis sinica.The southern urban area and the core area of the mausoleums had relatively low suitability due to ecological fragmentation or human interference.Based on these results,this paper proposed a three-level protection framework of“core area conservation—buffer zone management—isolation zone construction”and a spatio-temporal coordinated human-bird co-existence strategy.It was also suggested that the human-bird co-existence space could be optimized through measures such as constructing sound and light buffer interfaces,restoring ecological corridors,and integrating cultural heritage elements.This research provided an operational technical approach and decision-making support for the scientific planning of bird-watching sites and the coordination of ecological protection and tourism development.
基金funded by the National Natural Science Foundation of China(Grant Nos.62161019,62061024).
文摘The detection of Ochotona Curzoniae serves as a fundamental component for estimating the population size of this species and for analyzing the dynamics of its population fluctuations.In natural environments,the pixels representing Ochotona Curzoniae constitute a small fraction of the total pixels,and their distinguishing features are often subtle,complicating the target detection process.To effectively extract the characteristics of these small targets,a feature fusion approach that utilizes up-sampling and channel integration from various layers within a CNN can significantly enhance the representation of target features,ultimately improving detection accuracy.However,the top-down fusion of features from different layers may lead to information duplication and semantic bias,resulting in redundancy and high-frequency noise.To address the challenges of information redundancy and high-frequency noise during the feature fusion process in CNN,we have developed a target detection model for Ochotona Curzoniae.This model is based on a spatial-channel reconfiguration convolutional(SCConv)attentional mechanism and feature fusion(FFBCA),integrated with the Faster R-CNN framework.It consists of a feature extraction network,an attention mechanism-based feature fusion module,and a jump residual connection fusion module.Initially,we designed a dual attention mechanism feature fusion module that employs spatial-channel reconstruction convolution.In the spatial dimension,the attention mechanism adopts a separation-reconstruction approach,calculating a weight matrix for the spatial information within the feature map through group normalization.This process directs the model to concentrate on feature information assigned varying weights,thereby reducing redundancy during feature fusion.In the channel dimension,the attention mechanism utilizes a partition-transpose-fusion method,segmenting the input feature map into high-noise and low-noise components based on the variance of the feature information.The high-noise segment is processed through a low-pass filter constructed from pointwise convolution(PWC)to eliminate some high-frequency noise,while the low-noise segment employs a bottleneck structure with global average pooling(GAP)to generate a weight matrix that emphasizes the significance of channel dimension feature information.This approach diminishes the model’s focus on low-weight feature information,thereby preserving low-frequency semantic information while reducing information redundancy.Furthermore,we have developed a novel feature extraction network,ResNeXt-S,by integrating the Sim attention mechanism into ResNeXt50.This configuration assigns three-dimensional attention weights to each position within the feature map,thereby enhancing the local feature information of small targets while reducing background noise.Finally,we constructed a jump residual connection fusion module to minimize the loss of high-level semantic information during the feature fusion process.Experiments on Ochotona Curzoniae target detection on the Ochotona Curzoniae dataset show that the detection accuracy of the model in this paper is 92.3%,which is higher than that of FSSD512(84.6%),TDFSSD512(81.3%),FPN(86.5%),FFBAM(88.5%),Faster R-CNN(89.6%),and SSD512(88.6%)detection accuracies.
基金This work is supported by the Ministry of Education of Humanities and Social Science projects in China(No.20YJCZH124)Guangdong Province Education and Teaching Reform Project No.640:Research on the Teaching Practice and Application of Online Peer Assessment Methods in the Context of Artificial Intelligence.
文摘This study proposes a learner profile framework based on multi-feature fusion,aiming to enhance the precision of personalized learning recommendations by integrating learners’static attributes(e.g.,demographic data and historical academic performance)with dynamic behavioral patterns(e.g.,real-time interactions and evolving interests over time).The research employs Term Frequency-Inverse Document Frequency(TF-IDF)for semantic feature extraction,integrates the Analytic Hierarchy Process(AHP)for feature weighting,and introduces a time decay function inspired by Newton’s law of cooling to dynamically model changes in learners’interests.Empirical results demonstrate that this framework effectively captures the dynamic evolution of learners’behaviors and provides context-aware learning resource recommendations.The study introduces a novel paradigm for learner modeling in educational technology,combining methodological innovation with a scalable technical architecture,thereby laying a foundation for the development of adaptive learning systems.
文摘Ensuring that autonomous vehicles maintain high precision and rapid response capabilities in complex and dynamic driving environments is a critical challenge in the field of autonomous driving.This study aims to enhance the learning efficiency ofmulti-sensor feature fusion in autonomous driving tasks,thereby improving the safety and responsiveness of the system.To achieve this goal,we propose an innovative multi-sensor feature fusion model that integrates three distinct modalities:visual,radar,and lidar data.The model optimizes the feature fusion process through the introduction of two novel mechanisms:Sparse Channel Pooling(SCP)and Residual Triplet-Attention(RTA).Firstly,the SCP mechanism enables the model to adaptively filter out salient feature channels while eliminating the interference of redundant features.This enhances the model’s emphasis on critical features essential for decisionmaking and strengthens its robustness to environmental variability.Secondly,the RTA mechanism addresses the issue of feature misalignment across different modalities by effectively aligning key cross-modal features.This alignment reduces the computational overhead associated with redundant features and enhances the overall efficiency of the system.Furthermore,this study incorporates a reinforcement learning module designed to optimize strategies within a continuous action space.By integrating thismodulewith the feature fusion learning process,the entire system is capable of learning efficient driving strategies in an end-to-end manner within the CARLA autonomous driving simulator.Experimental results demonstrate that the proposedmodel significantly enhances the perception and decision-making accuracy of the autonomous driving system in complex traffic scenarios while maintaining real-time responsiveness.This work provides a novel perspective and technical pathway for the application of multi-sensor data fusion in autonomous driving.
文摘Ransomware attacks pose a significant threat to critical infrastructures,demanding robust detection mechanisms.This study introduces a hybrid model that combines vision transformer(ViT)and one-dimensional convolutional neural network(1DCNN)architectures to enhance ransomware detection capabilities.Addressing common challenges in ransomware detection,particularly dataset class imbalance,the synthetic minority oversampling technique(SMOTE)is employed to generate synthetic samples for minority class,thereby improving detection accuracy.The integration of ViT and 1DCNN through feature fusion enables the model to capture both global contextual and local sequential features,resulting in comprehensive ransomware classification.Tested on the UNSW-NB15 dataset,the proposed ViT-1DCNN model achieved 98%detection accuracy with precision,recall,and F1-score metrics surpassing conventional methods.This approach not only reduces false positives and negatives but also offers scalability and robustness for real-world cybersecurity applications.The results demonstrate the model’s potential as an effective tool for proactive ransomware detection,especially in environments where evolving threats require adaptable and high-accuracy solutions.
基金funded by the China Chongqing Municipal Science and Technology Bureau,grant numbers 2024TIAD-CYKJCXX0121,2024NSCQ-LZX0135Chongqing Municipal Commission of Housing and Urban-Rural Development,grant number CKZ2024-87+3 种基金the Chongqing University of Technology graduate education high-quality development project,grant number gzlsz202401the Chongqing University of Technology-Chongqing LINGLUE Technology Co.,Ltd.,Electronic Information(Artificial Intelligence)graduate joint training basethe Postgraduate Education and Teaching Reform Research Project in Chongqing,grant number yjg213116the Chongqing University of Technology-CISDI Chongqing Information Technology Co.,Ltd.,Computer Technology graduate joint training base.
文摘Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size.Pathologists often refer to surrounding cells to identify abnormalities.To emulate this slide examination behavior,this study proposes a Multi-Scale Feature Fusion Network(MSFF-Net)for detecting cervical abnormal cells.MSFF-Net employs a Cross-Scale Pooling Model(CSPM)to effectively capture diverse features and contextual information,ranging from local details to the overall structure.Additionally,a Multi-Scale Fusion Attention(MSFA)module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales.To handle the complex environment of cervical cell images,such as cell adhesion and overlapping,the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes,thereby improving detection accuracy in such scenarios.Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision(mAP)of 63.2%,outperforming state-of-the-art methods while maintaining a relatively small number of parameters(26.8 M).This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells,contributing to more accurate and efficient cervical cancer screening.
基金supported by Communication University of China(HG23035)partly supported by the Fundamental Research Funds for the Central Universities(CUC230A013).
文摘With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of multimodal approaches for fake news detection has gained significant attention.To solve the problems existing in previous multi-modal fake news detection algorithms,such as insufficient feature extraction and insufficient use of semantic relations between modes,this paper proposes the MFFFND-Co(Multimodal Feature Fusion Fake News Detection with Co-Attention Block)model.First,the model deeply explores the textual content,image content,and frequency domain features.Then,it employs a Co-Attention mechanism for cross-modal fusion.Additionally,a semantic consistency detectionmodule is designed to quantify semantic deviations,thereby enhancing the performance of fake news detection.Experimentally verified on two commonly used datasets,Twitter and Weibo,the model achieved F1 scores of 90.0% and 94.0%,respectively,significantly outperforming the pre-modified MFFFND(Multimodal Feature Fusion Fake News Detection with Attention Block)model and surpassing other baseline models.This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.