A machine learning-based depression recognition model integrating spirit-expression features from traditional Chinese medicine
1
Authors: Minghui Yao, Rongrong Zhu, Peng Qian, Huilin Liu, Xirong Sun, Limin Gao, Fufeng Li. Source: Digital Chinese Medicine, 2026, No. 1, pp. 68-79 (12 pages)
Objective: To develop a depression recognition model by integrating the spirit-expression diagnostic framework of traditional Chinese medicine (TCM) with machine learning algorithms. The proposed model seeks to establish a TCM-informed tool for early depression screening, thereby bridging traditional diagnostic principles with modern computational approaches. Methods: The study included patients with depression who visited the Shanghai Pudong New Area Mental Health Center from October 1, 2022 to October 1, 2023, with students and teachers from Shanghai University of Traditional Chinese Medicine during the same period serving as the healthy control group. Videos of 3–10 s were captured using a Xiaomi Pad 5, and the TCM spirit and expression categories were determined by TCM experts (a category was assigned when at least 3 of 5 experts agreed). Basic information, facial images, and interview information were collected through a portable TCM intelligent analysis and diagnosis device, and facial diagnosis features were extracted using the OpenCV computer vision library. Parametric and non-parametric tests were used to analyze the baseline data, TCM spirit and expression features, and facial diagnosis feature parameters of the two groups, and to compare the differences in TCM spirit and expression and in facial features. Five machine learning algorithms, namely extreme gradient boosting (XGBoost), decision tree (DT), Bernoulli naive Bayes (BernoulliNB), support vector machine (SVM), and k-nearest neighbor (KNN) classification, were used to construct a depression recognition model based on the fusion of TCM spirit and expression features. Model performance was evaluated using metrics such as accuracy, precision, and the area under the receiver operating characteristic (ROC) curve (AUC). The model results were explained using SHapley Additive exPlanations (SHAP). Results: A total of 93 depression patients and 87 healthy individuals were ultimately included in this study. There was no statistically significant difference in baseline characteristics between the two groups (P>0.05). The differences in TCM spirit and expression characteristics and in facial features between the two groups were as follows. (i) Quantispirit facial analysis revealed that depression patients exhibited significantly reduced facial spirit and luminance compared with healthy controls (P<0.05), with characteristic features such as sad expressions, facial erythema, and changes in lip color ranging from erythematous to cyanotic. (ii) Depressed patients exhibited significantly lower values in facial complexion L, lip L, and a values, and gloss index, but higher values in facial complexion a and b, lip b, low gloss index, and matte index (all P<0.05). (iii) The XGBoost-based depression recognition model, integrating the TCM "spirit-expression" diagnostic framework, achieved an accuracy of 98.61% and significantly outperformed four benchmark algorithms (DT, BernoulliNB, SVM, and KNN; P<0.01). (iv) The SHAP visualization results show that, in the recognition model constructed by the XGBoost algorithm, the complexion b value, categories of facial spirit, high gloss index, low gloss index, categories of facial expression, and texture features contribute significantly to the model. Conclusion: This study demonstrates that integrating TCM spirit-expression diagnostic features with machine learning enables the construction of a high-precision depression detection model, offering a novel paradigm for objective depression diagnosis.
Keywords: Traditional Chinese medicine; spirit; expression; feature fusion; depression; recognition model
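The expert-labelling rule mentioned in the abstract above (a spirit or expression category is accepted when at least 3 of 5 experts agree) can be illustrated with a minimal sketch. The function name `consensus_label` and the label strings are hypothetical and are not taken from the paper; the sketch only shows the agreement threshold, not the authors' actual annotation pipeline.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(votes: Sequence[str], min_agree: int = 3) -> Optional[str]:
    """Return the label chosen by at least `min_agree` raters, else None.

    Hypothetical helper: the label names and the None fallback are
    assumptions made for illustration only.
    """
    if not votes:
        return None
    # most_common(1) gives the single most frequent label and its count.
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agree else None


# Three of five raters agree, so the label is accepted.
accepted = consensus_label(["reduced", "reduced", "normal", "reduced", "normal"])
# No label reaches three votes, so the sample has no consensus.
rejected = consensus_label(["reduced", "reduced", "normal", "normal", "absent"])
print(accepted, rejected)
```

With five raters and a threshold of three, at most one label can reach the threshold, so ties need no special handling in this setting.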
Boruta-LSTMAE:Feature-Enhanced Depth Image Denoising for 3D Recognition
2
Authors: Fawad Salam Khan, Noman Hasany, Muzammil Ahmad Khan, Shayan Abbas, Sajjad Ahmed, Muhammad Zorain, Wai Yie Leong, Susama Bagchi, Sanjoy Kumar Debnath. Source: Computers, Materials & Continua, 2026, No. 4, pp. 2181-2206 (26 pages)
Noise in depth images obtained with RGB-D sensors arises from a combination of hardware limitations and environmental factors, and it degrades downstream computer vision results. Common image denoising techniques based on spatial and frequency filtering tend to remove significant image details along with the noise. This paper presents a novel denoising model that combines Boruta-driven feature selection with a Long Short-Term Memory Autoencoder (LSTMAE). The Boruta algorithm identifies the most useful depth features, which preserves spatial structure integrity and reduces redundancy. An LSTMAE then processes these selected features and models depth pixel sequences to generate robust, noise-resistant representations. The encoder compresses the input data into a latent space, and the decoder reconstructs the clean image from it. Experiments on a benchmark data set show that the suggested technique attains a PSNR of 45 dB and an SSIM of 0.90, which is 10 dB higher than the performance of conventional convolutional autoencoders and 15 times higher than that of the wavelet-based models. Moreover, the feature selection step decreases the input dimensionality by 40%, resulting in a 37.5% reduction in training time and a real-time inference rate of 200 FPS. The Boruta-LSTMAE framework therefore offers a highly efficient and scalable system for depth image denoising, with high potential for close-range 3D systems such as robotic manipulation and gesture-based interfaces.
Keywords: Boruta; LSTM autoencoder; feature fusion; denoising; 3D object recognition; depth images
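The abstract above reports denoising quality as PSNR in dB. As a reading aid, the standard definition, PSNR = 10·log10(MAX² / MSE), can be sketched as follows. The function name `psnr`, the flat pixel lists, and the 8-bit peak value of 255 are assumptions for illustration; the paper's own evaluation code and depth-value range are not given in the abstract.

```python
import math
from typing import Sequence


def psnr(clean: Sequence[float], denoised: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are passed as flat sequences of pixel values; `max_value` is the
    largest possible pixel value (255 is an assumption for 8-bit data).
    """
    if len(clean) != len(denoised) or not clean:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((c - d) ** 2 for c, d in zip(clean, denoised)) / len(clean)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by 1, so MSE = 1 and PSNR = 20 * log10(255).
print(round(psnr([0, 0, 0, 0], [1, 1, 1, 1]), 2))  # 48.13
```

Because PSNR is logarithmic in the mean squared error, a 10 dB gain corresponds to a tenfold reduction in MSE at the same peak value.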
AFI:Blackbox Backdoor Detection Method Based on Adaptive Feature Injection
3
Authors: Simin Tang, Zhiyong Zhang, Junyan Pan, Gaoyuan Quan, Weiguo Wang, Junchang Jing. Source: Computers, Materials & Continua, 2026, No. 4, pp. 1890-1908 (19 pages)
At inference time, deep neural networks are susceptible to backdoor attacks, which can produce attacker-controlled outputs when inputs contain carefully crafted triggers. Existing defense methods often focus on specific attack types or incur high costs, such as data cleaning or model fine-tuning. In contrast, we argue that it is possible to achieve effective and generalizable defense without removing triggers or incurring high model-cleaning costs. From the attacker's perspective and based on characteristics of vulnerable neuron activation anomalies, we propose an Adaptive Feature Injection (AFI) method for black-box backdoor detection. AFI employs a pre-trained image encoder to extract multi-level deep features and constructs a dynamic weight fusion mechanism for precise identification and interception of poisoned samples. Specifically, we select the control samples with the largest feature differences from the clean dataset via feature-space analysis, and generate blended sample pairs with the test sample using dynamic linear interpolation. The detection statistic is computed by measuring the divergence G(x) in model output responses. We systematically evaluate the effectiveness of AFI against representative backdoor attacks, including BadNets, Blend, WaNet, and IAB, on three benchmark datasets: MNIST, CIFAR-10, and ImageNet. Experimental results show that AFI can effectively detect poisoned samples, achieving average detection rates of 95.20%, 94.15%, and 86.49% on these datasets, respectively. Compared with existing methods, AFI demonstrates strong cross-domain generalization ability and robustness to unknown attacks.
Keywords: Deep learning; backdoor attacks; universal detection; feature fusion; backward reasoning
AdvYOLO:An Improved Cross-Conv-Block Feature Fusion-Based YOLO Network for Transferable Adversarial Attacks on ORSIs Object Detection
4
Authors: Leyu Dai, Jindong Wang, Ming Zhou, Song Guo, Hengwei Zhang. Source: Computers, Materials & Continua, 2026, No. 4, pp. 767-792 (26 pages)
In recent years, with the rapid advancement of artificial intelligence, object detection algorithms have made significant strides in accuracy and computational efficiency. Notably, research and applications of Anchor-Free models have opened new avenues for real-time target detection in optical remote sensing images (ORSIs). However, in the realm of adversarial attacks, developing adversarial techniques tailored to Anchor-Free models remains challenging. Adversarial examples generated based on Anchor-Based models often exhibit poor transferability to these new model architectures. Furthermore, the growing diversity of Anchor-Free models poses additional hurdles to achieving robust transferability of adversarial attacks. This study presents an improved cross-conv-block feature fusion You Only Look Once (YOLO) architecture, engineered to facilitate the extraction of more comprehensive semantic features during the backpropagation process. To address the asymmetry between densely distributed objects in ORSIs and the corresponding detector outputs, a novel dense bounding box attack strategy is proposed. This approach leverages a dense target bounding box loss in the calculation of adversarial loss functions. Furthermore, by integrating translation-invariant (TI) and momentum-iteration (MI) adversarial methodologies, the proposed framework significantly improves the transferability of adversarial attacks. Experimental results demonstrate that our method achieves superior adversarial attack performance, with adversarial transferability rates (ATR) of 67.53% on the NWPU VHR-10 dataset and 90.71% on the HRSC2016 dataset. Compared to ensemble adversarial attack and cascaded adversarial attack approaches, our method generates adversarial examples in an average of 0.64 s, representing an approximately 14.5% improvement in efficiency under equivalent conditions.
Keywords: Remote sensing; object detection; transferable adversarial attack; feature fusion; cross-conv-block
Enhanced Multi-Scale Feature Extraction Lightweight Network for Remote Sensing Object Detection
5
Authors: Xiang Luo, Yuxuan Peng, Renghong Xie, Peng Li, Yuwen Qian. Source: Computers, Materials & Continua, 2026, No. 3, pp. 2097-2118 (22 pages)
Deep learning has made significant progress in the field of oriented object detection for remote sensing images. However, existing methods still face challenges when dealing with difficult tasks such as multi-scale targets, complex backgrounds, and small objects in remote sensing. Keeping the model lightweight to meet resource constraints in remote sensing scenarios while improving task performance remains a research hotspot. Therefore, we propose an enhanced multi-scale feature extraction lightweight network, EM-YOLO, based on the YOLOv8s architecture and specifically optimized for the large target scale variations, diverse orientations, and numerous small objects found in remote sensing images. Our contributions are as follows. First, a dynamic snake convolution (DSC) is introduced into the backbone network to enhance the model's feature extraction capability for oriented targets. Second, an innovative focusing-diffusion module is designed in the feature fusion neck to effectively integrate multi-scale feature information. Finally, we introduce the Layer-Adaptive Sparsity for magnitude-based Pruning (LASP) method to perform lightweight network pruning so that tasks can be completed better in resource-constrained scenarios. Experimental results on the lightweight platform Orin demonstrate that the proposed method significantly outperforms the original YOLOv8s model in oriented remote sensing object detection tasks, and achieves comparable or superior performance to state-of-the-art methods on three authoritative remote sensing datasets (DOTA v1.0, DOTA v1.5, and HRSC2016).
Keywords: Deep learning; object detection; feature extraction; feature fusion; remote sensing
Steel Surface Anomaly Detection Using 3D Depth and 2D RGB Features
6
Authors: Zheng Wangguandong, Lu Ping, Deng Fangwei, Huang Shijun, Xia Siyu. Source: ZTE Communications, 2026, No. 1, pp. 81-87 (7 pages)
The detection of steel surface anomalies has become an industrial challenge due to variations in production equipment, processes, and characteristics. To alleviate the problem, this paper proposes a detection and localization method combining 3D depth and 2D RGB features. The framework comprises three stages: defect classification, defect location, and warpage judgment. The first stage uses a data-efficient image Transformer model, the second stage utilizes reverse knowledge distillation, and the third stage performs feature fusion using 3D depth and 2D RGB features. Experimental results show that the proposed algorithm achieves relatively high accuracy and feasibility, and can be effectively used in industrial scenarios.
Keywords: anomaly detection; anomaly localization; feature fusion; reverse distillation
Federated Semi-Supervised Learning Based on Feature Space Fusion
7
Authors: Zhe Ding, Hao Yi, Wenrui Xie, Ming Zhang, Yuxuan Xiao, Qixu Wang, Qing Chen, Zhiguang Qin, Dajiang Chen. Source: Computers, Materials & Continua, 2026, No. 5, pp. 2062-2076 (15 pages)
Federated semi-supervised learning (FSSL) has garnered substantial attention for enabling collaborative global model training across multiple clients to address the scarcity of labeled data and to preserve data privacy. However, FSSL is plagued by formidable challenges stemming from cross-client data heterogeneity, as existing methods fail to achieve effective fusion of feature subspaces across distinct clients. To address this issue, we propose a novel FSSL framework, named FedSPQR, which is explicitly tailored for the label-at-server scenario. On the server side, FedSPQR adopts a subspace clustering and fusion method based on the Grassmann manifold to construct a unified global feature space, which is further leveraged to refine the global model. On the client side, the pre-established global feature space acts as a benchmark for aligning the local feature subspaces. Based on the aligned local feature subspaces, integrating self-supervised learning with knowledge distillation facilitates effective local learning to alleviate local bias caused by data heterogeneity. Extensive experiments on two standard public benchmarks confirm that FedSPQR outperforms state-of-the-art (SOTA) baselines by a significant margin.
Keywords: Federated semi-supervised learning; feature space fusion; knowledge distillation
Research on Camouflage Target Detection Method Based on Edge Guidance and Multi-Scale Feature Fusion
8
Authors: Tianze Yu, Jianxun Zhang, Hongji Chen. Source: Computers, Materials & Continua, 2026, No. 4, pp. 1676-1697 (22 pages)
Camouflaged Object Detection (COD) aims to identify objects that share highly similar patterns (such as texture, intensity, and color) with their surrounding environment. Due to their intrinsic resemblance to the background, camouflaged objects often exhibit vague boundaries and varying scales, making it challenging to accurately locate targets and delineate their indistinct edges. To address this, we propose a novel camouflaged object detection network called Edge-Guided and Multi-scale Fusion Network (EGMFNet), which leverages edge-guided multi-scale integration for enhanced performance. The model incorporates two innovative components: a Multi-scale Fusion Module (MSFM) and an Edge-Guided Attention Module (EGA). These designs exploit multi-scale features to uncover subtle cues between candidate objects and the background while emphasizing camouflaged object boundaries. Moreover, recognizing the rich contextual information in fused features, we introduce a Dual-Branch Global Context Module (DGCM) to refine features using extensive global context, thereby generating more informative representations. Experimental results on four benchmark datasets demonstrate that EGMFNet outperforms state-of-the-art methods across five evaluation metrics. Specifically, on COD10K, our EGMFNet-P improves F_β by 4.8 points and reduces mean absolute error (MAE) by 0.006 compared with ZoomNeXt; on NC4K, it achieves a 3.6-point increase in F_β. On CAMO and CHAMELEON, it obtains 4.5-point increases in F_β, respectively. These consistent gains substantiate the superiority and robustness of EGMFNet.
Keywords: Camouflaged object detection; multi-scale feature fusion; edge-guided; image segmentation
Attention Mechanisms and FFM Feature Fusion Module-Based Modification of the Deep Neural Network for Detection of Structural Cracks
9
Authors: Tao Jin, Zhekun Shou, Hongchao Liu, Yuchun Shao. Source: Computer Modeling in Engineering & Sciences, 2026, No. 2, pp. 345-366 (22 pages)
This research centers on structural health monitoring of bridges, a critical transportation infrastructure. Owing to the cumulative action of heavy vehicle loads, environmental variations, and material aging, bridge components are prone to cracks and other defects, severely compromising structural safety and service life. Traditional inspection methods relying on manual visual assessment or vehicle-mounted sensors suffer from low efficiency, strong subjectivity, and high costs, while conventional image processing techniques and early deep learning models (e.g., UNet, Faster R-CNN) still perform inadequately in complex environments (e.g., varying illumination, noise, false cracks) due to poor perception of fine cracks and multi-scale features, limiting practical application. To address these challenges, this paper proposes CACNN-Net (CBAM-Augmented CNN), a novel dual-encoder architecture that couples a CNN for local detail extraction with a CBAM-Transformer for global context modeling. A key contribution is the dedicated Feature Fusion Module (FFM), which strategically integrates multi-scale features and focuses attention on crack regions while suppressing irrelevant noise. Experiments on bridge crack datasets demonstrate that CACNN-Net achieves a precision of 77.6%, a recall of 79.4%, and an mIoU of 62.7%. These results significantly outperform several typical models (e.g., UNet-ResNet34, Deeplabv3), confirming its superior accuracy and robust generalization. The work provides a high-precision automated solution for bridge crack detection and a novel network design paradigm for structural surface defect identification in complex scenarios. Future research may integrate physical features such as depth information to advance intelligent infrastructure maintenance and digital twin management.
Keywords: Bridge crack diseases; structural health monitoring; convolutional neural network; feature fusion
Multi-scale feature fused stacked autoencoder and its application for soft sensor modeling (cited by: 1)
10
Authors: Zhi Li, Yuchong Xia, Jian Long, Chensheng Liu, Longfei Zhang. Source: Chinese Journal of Chemical Engineering, 2025, No. 5, pp. 241-254 (14 pages)
Deep learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty. Owing to its outstanding ability for high-level feature extraction, the stacked autoencoder (SAE) has been widely used to improve the model accuracy of soft sensors. However, as the number of network layers increases, the SAE may encounter serious information loss, which affects the modeling performance of soft sensors. Besides, there are typically very few labeled samples in the data set, which is challenging for traditional neural networks. In this paper, a multi-scale feature fused stacked autoencoder (MFF-SAE) is suggested for feature representation related to hierarchical output, where stacked autoencoder, mutual information (MI), and multi-scale feature fusion (MFF) strategies are integrated. Based on correlation analysis between output and input variables, critical hidden variables are extracted from the original variables in each autoencoder's input layer and are given correspondingly varying weights. Besides, an integration strategy based on multi-scale feature fusion is adopted to mitigate the impact of information loss as the network deepens. Then, the MFF-SAE method is designed and stacked to form deep networks. Two practical industrial processes are utilized to evaluate the performance of MFF-SAE. Simulation results indicate that, in comparison to other cutting-edge techniques, the proposed method may considerably enhance the accuracy of soft sensor modeling, reducing the root mean square error (RMSE) by 71.8%, 17.1% and 64.7%, 15.1%, respectively.
Keywords: Multi-scale feature fusion; Soft sensors; Stacked autoencoders; Computational chemistry; Chemical processes; Parameter estimation
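The abstract above summarizes its results as percentage reductions in root mean square error (RMSE). A minimal sketch of how such a figure is typically computed is given below. The function names and the toy numbers are hypothetical and are not taken from the paper, which does not state in the abstract which baselines the percentages refer to.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline_rmse: float, new_rmse: float) -> float:
    """Percentage by which `new_rmse` is lower than `baseline_rmse`."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Toy example: the baseline model is off by 2 everywhere, the new one by 0.5.
targets = [10.0, 12.0, 14.0, 16.0]
baseline = rmse(targets, [12.0, 14.0, 16.0, 18.0])  # 2.0
improved = rmse(targets, [10.5, 12.5, 14.5, 16.5])  # 0.5
print(relative_reduction(baseline, improved))  # 75.0
```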
A Lightweight Multiscale Feature Fusion Network for Solar Cell Defect Detection
11
Authors: Xiaoyun Chen, Lanyao Zhang, Xiaoling Chen, Yigang Cen, Linna Zhang, Fugui Zhang. Source: Computers, Materials & Continua (SCIE, EI), 2025, No. 1, pp. 521-542 (22 pages)
Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules. In the production process, defect samples occur infrequently and exhibit random shapes and sizes, which makes it challenging to collect defective samples. Additionally, the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions. This paper proposes a novel Lightweight Multiscale Feature Fusion network (LMFF) to address these challenges. The network comprises a feature extraction network, a multi-scale feature fusion module (MFF), and a segmentation network. Specifically, a feature extraction network is proposed to obtain multi-scale feature outputs, and a multi-scale feature fusion module (MFF) is used to fuse multi-scale feature information effectively. In order to capture finer-grained multi-scale information from the fusion features, we propose a multi-scale attention module (MSA) in the segmentation network to enhance the network's ability for small target detection. Moreover, depthwise separable convolutions are introduced to construct depthwise separable residual blocks (DSR) to reduce the model's parameter number. Finally, to validate the proposed method's defect segmentation and localization performance, we constructed three solar cell defect detection datasets: SolarCells, SolarCells-S, and PVEL-S. SolarCells and SolarCells-S are monocrystalline silicon datasets, and PVEL-S is a polycrystalline silicon dataset. Experimental results show that the IoU of our method on these three datasets can reach 68.5%, 51.0%, and 92.7%, respectively, and the F1-Score can reach 81.3%, 67.5%, and 96.2%, respectively, which surpasses other commonly used methods and verifies the effectiveness of our LMFF network.
Keywords: Defect segmentation; multi-scale feature fusion; multi-scale attention; depthwise separable residual block
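The abstract above reports segmentation quality as IoU and F1-Score. The sketch below shows how both are commonly computed from a predicted and a ground-truth binary mask. The function name `iou_and_f1`, the flat 0/1 mask encoding, and the convention for two empty masks are assumptions for illustration, not the paper's evaluation code.

```python
from typing import Sequence, Tuple


def iou_and_f1(pred: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Intersection over union and F1 for two flat binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    union = tp + fp + fn
    if union == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


# tp = 2, fp = 1, fn = 1, so IoU = 2/4 and F1 = 4/6.
iou, f1 = iou_and_f1([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
print(iou, round(f1, 4))  # 0.5 0.6667
```

For a single pair of masks the two metrics are linked by F1 = 2·IoU / (1 + IoU). The reported pairs (68.5% and 81.3%, 51.0% and 67.5%, 92.7% and 96.2%) are numerically consistent with this identity, although the abstract does not say how the scores were aggregated.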
Multi-scale feature fusion optical remote sensing target detection method (cited by: 1)
12
Authors: BAI Liang, DING Xuewen, LIU Ying, CHANG Limei. Source: Optoelectronics Letters, 2025, No. 4, pp. 226-233 (8 pages)
An improved model based on you only look once version 8 (YOLOv8) is proposed to solve the problem of low detection accuracy caused by the diversity of object sizes in optical remote sensing images. Firstly, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information on the feature map and improving the target detection ability of the model. Secondly, a multi atrous spatial pyramid pooling (MASPP) module is designed using the ideas of atrous convolution and the feature pyramid structure to extract multi-scale features, so as to improve the model's ability to process multi-scale objects. The experimental results show that the detection accuracy of the improved YOLOv8 model on the DIOR dataset is 92% and the mean average precision (mAP) is 87.9%, which are 3.5% and 1.7% higher than those of the original model, respectively. This shows that the detection and classification ability of the proposed model on multi-dimensional optical remote sensing targets has been improved.
Keywords: multi-scale feature fusion; optical remote sensing; feature map; improve target detection ability; optical remote sensing images; target detection; feature fusion; semantic information; spatial information
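The MASPP module in the abstract above builds on atrous (dilated) convolution, which widens the receptive field by spacing kernel taps apart without adding weights. The 1D sketch below illustrates only that idea. The function name `dilated_conv1d`, the kernel, and the dilation rates are hypothetical, and the code is not a reconstruction of the paper's MASPP design.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int = 1) -> List[float]:
    """'Valid' 1D atrous convolution (cross-correlation, as in deep learning).

    Kernel taps are spaced `dilation` samples apart, so a kernel of size k
    covers (k - 1) * dilation + 1 input samples.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # same three weights at every rate

# Parallel branches at several rates see the input at different scales,
# which is the general idea behind spatial-pyramid-style modules.
branches = {rate: dilated_conv1d(signal, kernel, rate) for rate in (1, 2, 3)}
print(branches)
# {1: [-2, -2, -2, -2, -2], 2: [-4, -4, -4], 3: [-6]}
```

With the same three weights, the receptive field grows from 3 samples at rate 1 to 5 at rate 2 and 7 at rate 3.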
Multi-Scale Feature Fusion and Advanced Representation Learning for Multi Label Image Classification
13
Authors: Naikang Zhong, Xiao Lin, Wen Du, Jin Shi. Source: Computers, Materials & Continua, 2025, No. 3, pp. 5285-5306 (22 pages)
Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images. Obtaining class-specific precise representations at different scales is a key aspect of feature representation. However, existing methods often rely on the single-scale deep feature, neglecting shallow and deeper layer features, which poses challenges when predicting objects of varying scales within the same image. Although some studies have explored multi-scale features, they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales. To address these issues, we propose a two-stage, three-branch Transformer-based framework. The first stage incorporates multi-scale image feature extraction and hierarchical scale attention. This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales, improving the model's generalization to diverse object scales. The second stage includes a global feature enhancement module and a region selection module. The global feature enhancement module strengthens interconnections between different image regions, mitigating the issue of incomplete representations, while the region selection module models the cross-modal relationships between image features and labels. Together, these components enable the efficient acquisition of class-specific precise feature representations. Extensive experiments on public datasets, including COCO2014, VOC2007, and VOC2012, demonstrate the effectiveness of our proposed method. Our approach achieves consistent performance gains of 0.3%, 0.4%, and 0.2% over state-of-the-art methods on the three datasets, respectively. These results validate the reliability and superiority of our approach for multi-label image classification.
Keywords: Image classification; multi-label; multi scale; attention mechanisms; feature fusion
Detection of Abnormal Cardiac Rhythms Using Feature Fusion Technique with Heart Sound Spectrograms
14
Authors: Saif Ur Rehman Khan, Zia Khan. Source: Journal of Bionic Engineering, 2025, No. 4, pp. 2030-2049 (20 pages)
A heart attack disrupts the normal flow of blood to the heart muscle, potentially causing severe damage or death if not treated promptly. It can lead to long-term health complications, reduce quality of life, and significantly impact daily activities and overall well-being. Despite the growing popularity of deep learning, several drawbacks persist, such as complexity and the limitation of single-model learning. In this paper, we introduce a residual learning-based feature fusion technique to achieve high accuracy in differentiating abnormal cardiac rhythms from heart sounds. Combining MobileNet with DenseNet201 for feature fusion pairs MobileNet's lightweight, efficient architecture with DenseNet201's dense connections, resulting in enhanced feature extraction and improved model performance at reduced computational cost. To further enhance the fusion, we employed residual learning to optimize the hierarchical features of abnormal heart sounds during training. The experimental results demonstrate that the proposed fusion method achieved an accuracy of 95.67% on the benchmark PhysioNet-2016 Spectrogram dataset. To further validate the performance, we applied it to the BreakHis dataset at a magnification level of 100X. The results indicate that the model maintains robust performance on the second dataset, achieving an accuracy of 96.55%. This consistent performance makes it suitable for various applications.
Keywords: Cardiac rhythms; feature fusion; residual learning; BreakHis; spectrogram sound
Fusing Geometric and Temporal Deep Features for High-Precision Arabic Sign Language Recognition
15
Authors: Yazeed Alkharijah, Shehzad Khalid, Syed Muhammad Usman, Amina Jameel, Danish Hamid. Source: Computer Modeling in Engineering & Sciences, 2025, No. 7, pp. 1113-1141 (29 pages)
Arabic Sign Language (ArSL) recognition plays a vital role in enhancing communication for the Deaf and Hard of Hearing (DHH) community. Researchers have proposed multiple methods for automated recognition of ArSL; however, these methods face multiple challenges, including high gesture variability, occlusions, limited signer diversity, and the scarcity of large annotated datasets. Existing methods, often relying solely on either skeletal data or video-based features, struggle with generalization and robustness, especially in dynamic and real-world conditions. This paper proposes a novel multimodal ensemble classification framework that integrates geometric features derived from 3D skeletal joint distances and angles with temporal features extracted from RGB videos using the Inflated 3D ConvNet (I3D). By fusing these complementary modalities at the feature level and applying a majority-voting ensemble of XGBoost, Random Forest, and Support Vector Machine (SVM) classifiers, the framework robustly captures both spatial configurations and motion dynamics of sign gestures. Feature selection using the Pearson Correlation Coefficient further enhances efficiency by reducing redundancy. Extensive experiments on the ArabSign dataset, which includes RGB videos and corresponding skeletal data, demonstrate that the proposed approach significantly outperforms state-of-the-art methods, achieving an average F1-score of 97% and improving recognition accuracy by more than 7% over previous best methods. This work not only advances the technical state of the art in ArSL recognition but also provides a scalable, real-time solution for practical deployment in educational, social, and assistive communication technologies. Although this study addresses Arabic Sign Language, the proposed framework can be extended to other sign languages, creating possibilities for worldwide applicability in sign language recognition tasks.
Keywords: Arabic sign language recognition; multimodal feature fusion; ensemble classification; skeletal data; inflated 3D ConvNet (I3D)
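As a rough illustration of two steps named in the abstract above (Pearson-correlation feature selection and majority voting), the following standard-library Python sketch may help. It is not the authors' implementation: the helper names `pearson`, `drop_redundant`, and `majority_vote`, the 0.9 threshold, and the toy data are all assumptions. The abstract also does not say whether correlation is measured between features or against the class label; the sketch assumes feature-to-feature redundancy.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treated as uncorrelated (assumption)
    return cov / (sx * sy)


def drop_redundant(columns, threshold=0.9):
    """Greedily keep a feature column only if its |r| with every
    already-kept column is below `threshold` (hypothetical value)."""
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def majority_vote(predictions):
    """predictions: one list of labels per classifier.
    Returns the most common label for each sample; on a tie,
    the label seen first (earliest classifier) wins."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy data: column 1 is an exact multiple of column 0, so it is dropped.
features = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2]]
kept_columns = drop_redundant(features)

# Three hypothetical classifiers voting on three samples.
votes = [["a", "b", "c"], ["a", "c", "c"], ["b", "b", "c"]]
fused_labels = majority_vote(votes)
```

In a real pipeline the three label lists would come from trained classifiers such as those named in the abstract; here they are hard-coded purely to show the voting step.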
An Ochotona Curzoniae Object Detection Model Based on Feature Fusion with SCConv Attention Mechanism
16
Authors: Haiyan Chen, Rong Li. Computers, Materials & Continua, 2025, No. 9, pp. 5693-5712 (20 pages)
The detection of Ochotona Curzoniae serves as a fundamental component for estimating the population size of this species and for analyzing the dynamics of its population fluctuations. In natural environments, the pixels representing Ochotona Curzoniae constitute a small fraction of the total pixels, and their distinguishing features are often subtle, complicating the target detection process. To effectively extract the characteristics of these small targets, a feature fusion approach that utilizes up-sampling and channel integration from various layers within a CNN can significantly enhance the representation of target features, ultimately improving detection accuracy. However, the top-down fusion of features from different layers may lead to information duplication and semantic bias, resulting in redundancy and high-frequency noise. To address the challenges of information redundancy and high-frequency noise during the feature fusion process in CNN, we have developed a target detection model for Ochotona Curzoniae. This model is based on a spatial-channel reconfiguration convolutional (SCConv) attentional mechanism and feature fusion (FFBCA), integrated with the Faster R-CNN framework. It consists of a feature extraction network, an attention mechanism-based feature fusion module, and a jump residual connection fusion module. Initially, we designed a dual attention mechanism feature fusion module that employs spatial-channel reconstruction convolution. In the spatial dimension, the attention mechanism adopts a separation-reconstruction approach, calculating a weight matrix for the spatial information within the feature map through group normalization. This process directs the model to concentrate on feature information assigned varying weights, thereby reducing redundancy during feature fusion. In the channel dimension, the attention mechanism utilizes a partition-transpose-fusion method, segmenting the input feature map into high-noise and low-noise components based on the variance of the feature information. The high-noise segment is processed through a low-pass filter constructed from pointwise convolution (PWC) to eliminate some high-frequency noise, while the low-noise segment employs a bottleneck structure with global average pooling (GAP) to generate a weight matrix that emphasizes the significance of channel dimension feature information. This approach diminishes the model’s focus on low-weight feature information, thereby preserving low-frequency semantic information while reducing information redundancy. Furthermore, we have developed a novel feature extraction network, ResNeXt-S, by integrating the Sim attention mechanism into ResNeXt50. This configuration assigns three-dimensional attention weights to each position within the feature map, thereby enhancing the local feature information of small targets while reducing background noise. Finally, we constructed a jump residual connection fusion module to minimize the loss of high-level semantic information during the feature fusion process. Experiments on Ochotona Curzoniae target detection on the Ochotona Curzoniae dataset show that the detection accuracy of the model in this paper is 92.3%, which is higher than that of FSSD512 (84.6%), TDFSSD512 (81.3%), FPN (86.5%), FFBAM (88.5%), Faster R-CNN (89.6%), and SSD512 (88.6%) detection accuracies.
Keywords: Ochotona curzoniae; target detection; SCConv; attention; feature fusion
Oversampling-Enhanced Feature Fusion-Based Hybrid ViT-1DCNN Model for Ransomware Cyber Attack Detection
17
Authors: Muhammad Armghan Latif, Zohaib Mushtaq, Saifur Rahman, Saad Arif, Salim Nasar Faraj Mursal, Muhammad Irfan, Haris Aziz. Computer Modeling in Engineering & Sciences, 2025, No. 2, pp. 1667-1695 (29 pages)
Ransomware attacks pose a significant threat to critical infrastructures, demanding robust detection mechanisms. This study introduces a hybrid model that combines vision transformer (ViT) and one-dimensional convolutional neural network (1DCNN) architectures to enhance ransomware detection capabilities. Addressing common challenges in ransomware detection, particularly dataset class imbalance, the synthetic minority oversampling technique (SMOTE) is employed to generate synthetic samples for minority class, thereby improving detection accuracy. The integration of ViT and 1DCNN through feature fusion enables the model to capture both global contextual and local sequential features, resulting in comprehensive ransomware classification. Tested on the UNSW-NB15 dataset, the proposed ViT-1DCNN model achieved 98% detection accuracy with precision, recall, and F1-score metrics surpassing conventional methods. This approach not only reduces false positives and negatives but also offers scalability and robustness for real-world cybersecurity applications. The results demonstrate the model’s potential as an effective tool for proactive ransomware detection, especially in environments where evolving threats require adaptable and high-accuracy solutions.
Keywords: Ransomware attacks; cybersecurity; vision transformer; convolutional neural network; feature fusion; encryption; threat detection
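The oversampling step mentioned in the abstract above can be sketched in a few lines. SMOTE creates a synthetic minority sample by picking a minority point, choosing one of its nearest minority neighbours, and interpolating between the two. The function name `smote_like`, the parameters `k` and `seed`, and the toy points below are assumptions for illustration; the sketch uses only the standard library and is not the authors' pipeline or a library API.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic points, each lying on the segment between
    a minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: all other minority samples, nearest first.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class: the four corners of the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority_points, n_new=5)
```

Because every synthetic point is a convex combination of two existing minority points, it stays inside the region the minority class already occupies, which is the property that distinguishes this approach from simple duplication.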
BAHGRF³: Human gait recognition in the indoor environment using deep learning features fusion assisted framework and posterior probability moth flame optimisation
18
Authors: Muhammad Abrar Ahmad Khan, Muhammad Attique Khan, Ateeq Ur Rehman, Ahmed Ibrahim Alzahrani, Nasser Alalwan, Deepak Gupta, Saima Ahmed Rahin, Yudong Zhang. CAAI Transactions on Intelligence Technology, 2025, No. 2, pp. 387-401 (15 pages)
Biometric characteristics have played a vital role in security over the last few years. Human gait classification in video sequences is an important biometric attribute and is used for security purposes. A new framework for human gait classification in video sequences is proposed, based on deep learning (DL) feature fusion and posterior probability-based moth flame optimization (MFO). In the first step, the video frames are resized and used to fine-tune two pre-trained lightweight DL models, EfficientNetB0 and MobileNetV2. Both models are selected based on their top-5 accuracy and their small number of parameters. Later, both models are trained through deep transfer learning, and the extracted deep features are fused using a voting scheme. In the last step, the authors develop a posterior probability-based MFO feature selection algorithm to select the best features. The selected features are classified using several supervised learning methods. The publicly available CASIA-B dataset has been employed for the experimental process. On this dataset, the authors selected six angles (0°, 18°, 90°, 108°, 162°, and 180°) and obtained average accuracies of 96.9%, 95.7%, 86.8%, 90.0%, 95.1%, and 99.7%, respectively. Results demonstrate comparable improvement in accuracy and significantly reduced computational time relative to recent state-of-the-art techniques.
Keywords: deep learning; feature fusion; feature optimization; gait classification; indoor environment; machine learning
End-to-End Audio Pattern Recognition Network for Overcoming Feature Limitations in Human-Machine Interaction
19
Authors: Zijian Sun, Yaqian Li, Haoran Liu, Haibin Li, Wenming Zhang. Computers, Materials & Continua, 2025, No. 5, pp. 3187-3210 (24 pages)
In recent years, audio pattern recognition has emerged as a key area of research, driven by its applications in human-computer interaction, robotics, and healthcare. Traditional methods, which rely heavily on handcrafted features such as Mel filters, often suffer from information loss and limited feature representation capabilities. To address these limitations, this study proposes an innovative end-to-end audio pattern recognition framework that directly processes raw audio signals, preserving original information and extracting effective classification features. The proposed framework utilizes a dual-branch architecture: a global refinement module that retains channel and temporal details and a multi-scale embedding module that captures high-level semantic information. Additionally, a guided fusion module integrates complementary features from both branches, ensuring a comprehensive representation of audio data. Specifically, the multi-scale audio context embedding module is designed to effectively extract spatiotemporal dependencies, while the global refinement module aggregates multi-scale channel and temporal cues for enhanced modeling. The guided fusion module leverages these features to achieve efficient integration of complementary information, resulting in improved classification accuracy. Experimental results demonstrate the model’s superior performance on multiple datasets, including ESC-50, UrbanSound8K, RAVDESS, and CREMA-D, with classification accuracies of 93.25%, 90.91%, 92.36%, and 70.50%, respectively. These results highlight the robustness and effectiveness of the proposed framework, which significantly outperforms existing approaches. By addressing critical challenges such as information loss and limited feature representation, this work provides new insights and methodologies for advancing audio classification and multimodal interaction systems.
Keywords: Audio pattern recognition; raw audio; end-to-end network; feature fusion
Low-Light Image Enhancement Based on Wavelet Local and Global Feature Fusion Network
20
Authors: Shun Song, Xiangqian Jiang, Dawei Zhao. Journal of Contemporary Educational Research, 2025, No. 11, pp. 209-214 (6 pages)
A wavelet-based local and global feature fusion network (LAGN) is proposed for low-light image enhancement, aiming to enhance image details and restore colors in dark areas. This study focuses on addressing three key issues in low-light image enhancement: enhancing low-light images using LAGN to preserve image details and colors; extracting image edge information via wavelet transform to enhance image details; and extracting local and global features of images through convolutional neural networks and Transformer to improve image contrast. Comparisons with state-of-the-art methods on two datasets verify that LAGN achieves the best performance in terms of details, brightness, and contrast.
Keywords: Image enhancement; feature fusion; wavelet transform; convolutional neural network (CNN); Transformer
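To make the "edge information via wavelet transform" idea in the abstract above concrete, here is a minimal one-level Haar transform on a single row of pixel intensities. The abstract does not name the wavelet family, so the choice of Haar is an assumption, as are the function names `haar_step` and `haar_inverse` and the sample row. The detail coefficients are zero wherever neighbouring pixels match and nonzero where intensity jumps, which is why they act as a simple edge signal.

```python
from math import sqrt


def haar_step(signal):
    """One level of the orthonormal 1-D Haar transform.
    `signal` must have even length. Returns (approximation, detail)."""
    s = sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]  # low-pass: local average
    detail = [(a - b) / s for a, b in pairs]  # high-pass: local difference
    return approx, detail


def haar_inverse(approx, detail):
    """Invert `haar_step`, recovering the original samples."""
    s = sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


# A dark region followed by a bright region; the edge falls inside pair 2.
row = [10.0, 10.0, 10.0, 50.0, 50.0, 50.0]
approx, detail = haar_step(row)
restored = haar_inverse(approx, detail)
```

For a 2-D image the same step would be applied along rows and then along columns, giving one approximation band and three detail bands; that extension is omitted here to keep the sketch short.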