159,651 articles found
Robust Symmetry Prediction with Multi-Modal Feature Fusion for Partial Shapes
1
Authors: Junhua Xi, Kouquan Zheng, Yifan Zhong, Longjiang Li, Zhiping Cai, Jinjing Chen. Source: Intelligent Automation & Soft Computing (SCIE), 2023, Issue 3, pp. 3099-3111, 13 pages.
In geometry processing, symmetry research benefits from the global geometric features of complete shapes, but the shape of an object captured in real-world applications is often incomplete due to limited sensor resolution, a single viewpoint, and occlusion. Unlike existing works that predict symmetry from the complete shape, we propose a learning approach for symmetry prediction based on a single RGB-D image. Instead of directly predicting the symmetry from incomplete shapes, our method consists of two modules: the multi-modal feature fusion module and the detection-by-reconstruction module. First, we build a channel-transformer network (CTN) to extract cross-fusion features from the RGB-D image as the multi-modal feature fusion module, which helps us aggregate features from the color and the depth separately. Then, our self-reconstruction network based on a 3D variational auto-encoder (3D-VAE) takes the global geometric features as input, followed by a symmetry prediction network that detects the symmetry. Experiments on three public datasets (ShapeNet, YCB, and ScanNet) demonstrate that our method produces reliable and accurate results.
Keywords: symmetry prediction; multi-modal feature fusion; partial shapes
Cryptomining Malware Detection Based on Edge Computing-Oriented Multi-Modal Features Deep Learning (cited by: 2)
2
Authors: Wenjuan Lian, Guoqing Nie, Yanyan Kang, Bin Jia, Yang Zhang. Source: China Communications (SCIE, CSCD), 2022, Issue 2, pp. 174-185, 12 pages.
In recent years, as cryptocurrency prices have risen, the amount of malicious cryptomining software has increased significantly. With its powerful spreading ability, cryptomining malware can occupy our resources without our knowledge, harm our interests, and damage more legitimate assets. Although current traditional rule-based malware detection methods have a low false alarm rate, they have a relatively low detection rate when faced with a large volume of emerging malware. Common machine learning-based or deep learning-based methods have some ability to learn and detect unknown malware, but the characteristics they learn are single and independent and cannot be learned adaptively. To address these problems, we propose a deep learning model with multiple inputs of multi-modal features, which can simultaneously accept digital features and image features on different dimensions. The model comprises, in turn, parallel learning of three sub-models and ensemble learning of another specific sub-model. The four sub-models can be processed in parallel on different devices and can be further applied to edge computing environments. The model can adaptively learn multi-modal features and output prediction results. The detection rate of our model is as high as 97.01% and the false alarm rate is only 0.63%. The experimental results demonstrate the advantage and effectiveness of the proposed method.
Keywords: cryptomining malware; multi-modal; ensemble learning; deep learning; edge computing
Test method of laser paint removal based on multi-modal feature fusion (cited by: 2)
3
Authors: HUANG Hai-peng, HAO Ben-tian, YE De-jun, GAO Hao, LI Liang. Source: Journal of Central South University (SCIE, EI, CAS, CSCD), 2022, Issue 10, pp. 3385-3398, 14 pages.
Laser cleaning is a highly nonlinear physical process, and its single-modal (e.g., acoustic or vision) detection suffers from poor performance and low inter-information utilization. In this study, a multi-modal feature fusion network model was constructed based on a laser paint removal experiment. The alignment of heterogeneous data from different modalities was achieved by combining piecewise aggregate approximation and the Gramian angular field. Moreover, an attention mechanism was introduced to optimize the dual-path network and the dense connection network, enabling the sampling characteristics to be extracted and integrated. Consequently, multi-modal discriminant detection of laser paint removal was realized. According to the experimental results, the verification accuracy of the constructed model on the experimental dataset was 99.17%, which is 5.77% higher than the best single-modal detection result for laser paint removal. Optimizing the feature extraction network with the attention mechanism increased the model accuracy by 3.3%. The results verify the improved classification performance of the constructed multi-modal feature fusion model in detecting laser paint removal, the effective integration of acoustic data and visual image data, and the accurate detection of laser paint removal.
Keywords: laser cleaning; multi-modal fusion; image processing; deep learning
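The abstract above names piecewise aggregate approximation (PAA) and the Gramian angular field as the tools for aligning a 1D signal with image data. The following is a minimal sketch of those two textbook transforms, not the authors' pipeline: the function names, the equal-width segmentation, and the choice of the summation variant of the field (rather than the difference variant) are assumptions made for illustration.

```python
import math


def paa(series, n_segments):
    """Piecewise aggregate approximation: mean of each equal-width segment."""
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("this sketch needs len(series) divisible by n_segments")
    width = n // n_segments
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(n_segments)]


def rescale(series):
    """Min-max rescale to [-1, 1] so that arccos is defined for every value."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]


def gasf(series):
    """Gramian angular summation field: G[i][j] = cos(phi_i + phi_j)."""
    # Clamp guards against tiny floating-point overshoot outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in rescale(series)]
    return [[math.cos(a + b) for b in phi] for a in phi]


# Hypothetical 8-sample signal, shortened to 4 points and turned into a 4x4 image.
signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
reduced = paa(signal, 4)
image = gasf(reduced)
```

PAA shortens the sequence so that the resulting square matrix has a manageable size; the field then encodes pairwise temporal relations as pixel values that an image network can consume.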
Adaptive multi-modal feature fusion for far and hard object detection
4
Authors: LI Yang, GE Hongwei. Source: Journal of Measurement Science and Instrumentation (CAS, CSCD), 2021, Issue 2, pp. 232-241, 10 pages.
To address the difficulty of detecting far and hard objects caused by the sparseness and insufficient semantic information of LiDAR point clouds, a 3D object detection network with adaptive multi-modal data fusion is proposed, which makes use of multi-neighborhood voxel information and image information. First, an improved ResNet is designed that maintains the structural information of far and hard objects in low-resolution feature maps, making it more suitable for the detection task. Meanwhile, the semantics of each image feature map are enhanced by semantic information from all subsequent feature maps. Second, multi-neighborhood context information with different receptive field sizes is extracted to compensate for the sparseness of the point cloud, which improves the ability of voxel features to represent the spatial structure and semantic information of objects. Finally, a multi-modal feature adaptive fusion strategy is proposed that uses learnable weights to express the contribution of different modal features to the detection task, and voxel attention further enhances the fused feature expression of effective target objects. Experimental results on the KITTI benchmark show that this method outperforms VoxelNet by remarkable margins, increasing the AP by 8.78% and 5.49% on the medium and hard difficulty levels. Moreover, our method achieves better detection performance than many mainstream multi-modal methods, e.g., improving the AP by 1% over MVX-Net on the medium and hard difficulty levels.
Keywords: 3D object detection; adaptive fusion; multi-modal data fusion; attention mechanism; multi-neighborhood features
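The fusion strategy in the abstract above uses learnable weights to express how much each modality contributes. One common way to realize that idea is a softmax over per-modality scores followed by a weighted sum; the sketch below illustrates only this general pattern. The function names, the softmax normalization, and the fixed scores are assumptions; in a trained network the scores would be parameters updated by gradient descent, and the paper's actual fusion layer may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features, logits):
    """Weighted sum of equal-length feature vectors, one vector per modality."""
    if len(features) != len(logits):
        raise ValueError("need exactly one score per modality")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all modality features must have the same length")
    weights = softmax(logits)
    return [sum(w * f[k] for w, f in zip(weights, features))
            for k in range(dim)]


# Hypothetical voxel and image features; equal scores give equal weights.
voxel_feat = [1.0, 2.0, 3.0]
image_feat = [3.0, 2.0, 1.0]
fused = fuse([voxel_feat, image_feat], logits=[0.0, 0.0])
```

Raising one score relative to the other shifts the fused vector toward that modality, which is the behavior the learnable weights are meant to capture.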
Video-Based Deception Detection with Non-Contact Heart Rate Monitoring and Multi-Modal Feature Selection
5
Authors: Yanfeng Li, Jincheng Bian, Yiqun Gao, Rencheng Song. Source: Journal of Beijing Institute of Technology (EI, CAS), 2024, Issue 3, pp. 175-185, 11 pages.
Deception detection plays a crucial role in criminal investigation. Videos contain a wealth of information regarding apparent and physiological changes in individuals, and thus can serve as an effective means of deception detection. In this paper, we investigate video-based deception detection considering both apparent visual features, such as eye gaze, head pose, and facial action units (AUs), and non-contact heart rate detected by the remote photoplethysmography (rPPG) technique. Multiple wrapper-based feature selection methods combined with K-nearest neighbor (KNN) and support vector machine (SVM) classifiers are employed to screen the most effective features for deception detection. We evaluate the performance of the proposed method on both a self-collected physiological-assisted visual deception detection (PV3D) dataset and the public Bag-of-Lies (BOL) dataset. Experimental results demonstrate that the SVM classifier with symbiotic organisms search (SOS) feature selection yields the best overall performance, with an area under the curve (AUC) of 83.27% and accuracy (ACC) of 83.33% for PV3D, and an AUC of 71.18% and ACC of 70.33% for BOL. This demonstrates the stability and effectiveness of the proposed method in video-based deception detection tasks.
Keywords: deception detection; apparent visual features; remote photoplethysmography; non-contact heart rate; feature selection
A Multi-Modal Feature Fusion Method Enhanced by Dynamic Sample Graphs for Predicting Drug Responses
6
Authors: Haochen Zhao, Bowei Li, Chenliang Xie, Guihua Duan. Source: Big Data Mining and Analytics, 2026, Issue 2, pp. 580-595, 16 pages.
The complexity of cancer frequently results in diverse therapeutic responses among patients with the same cancer type undergoing identical treatments. Additionally, the development of anti-cancer drugs faces significant challenges due to extended timelines and high attrition rates during the process. Advances in machine learning, combined with the availability of extensive drug research databases, have facilitated the emergence of computational approaches designed to predict drug responses more accurately. However, the noise and uncertainty introduced by incomplete records across different biological data sources pose challenges, ultimately constraining the predictive capabilities of these models. To overcome these limitations, this study introduces a novel multi-modal learning-based drug response prediction framework, DSGPred. By constructing dynamic sample graphs, DSGPred accounts for missing modality types and quantities within individual samples, enabling a more granular understanding of data completeness. The framework integrates multi-modal heterogeneous graph convolutional networks with advanced fusion modules to deeply capture and synthesize diverse drug and cell line features. Additionally, DSGPred employs an interaction-focused feature extraction module to learn dual interaction modes, thereby enhancing the richness of drug-cell line interaction embeddings. Experimental evaluations on the benchmark and independent datasets indicate that DSGPred consistently surpasses existing methods in predictive performance. Furthermore, tests involving previously unseen drugs and cell lines validate DSGPred’s generalization ability. Practical applicability is further underscored through case studies, highlighting its utility in real-world scenarios and offering robust predictions and insights for drug response prediction and personalized therapy design. The code for DSGPred is available at https://github.com/zhc940702/DSGPred/.
Keywords: drug response prediction; multi-modal fusion; deep learning
Tomato Growth Height Prediction Method by Phenotypic Feature Extraction Using Multi-modal Data
7
Authors: GONG Yu, WANG Ling, ZHAO Rongqiang, YOU Haibo, ZHOU Mo, LIU Jie. Source: 智慧农业(中英文), 2025, Issue 1, pp. 97-110, 14 pages.
[Objective] Accurate prediction of tomato growth height is crucial for optimizing production environments in smart farming. However, current prediction methods predominantly rely on empirical, mechanistic, or learning-based models that utilize either image data or environmental data. These methods fail to fully leverage multi-modal data to capture the diverse aspects of plant growth comprehensively. [Methods] To address this limitation, a two-stage phenotypic feature extraction (PFE) model based on the deep learning algorithms of recurrent neural network (RNN) and long short-term memory (LSTM) was developed. The model integrated environment and plant information to provide a holistic understanding of the growth process and employed phenotypic and temporal feature extractors to comprehensively capture both types of features. This enabled a deeper understanding of the interaction between tomato plants and their environment, ultimately leading to highly accurate predictions of growth height. [Results and Discussions] The experimental results showed the model's effectiveness: when predicting the next two days based on the past five days, the PFE-based RNN and LSTM models achieved mean absolute percentage errors (MAPE) of 0.81% and 0.40%, respectively, which were significantly lower than the 8.00% MAPE of the large language model (LLM) and the 6.72% MAPE of the Transformer-based model. In longer-term predictions (4 days ahead from 10 days of data, and 12 days ahead from 30 days of data), the PFE-RNN model continued to outperform the other two baseline models, with MAPE of 2.66% and 14.05%, respectively. [Conclusions] The proposed method, which leverages phenotypic-temporal collaboration, shows great potential for intelligent, data-driven management of tomato cultivation, making it a promising approach for enhancing the efficiency and precision of smart tomato planting management.
Keywords: tomato growth prediction; deep learning; phenotypic feature extraction; multi-modal data; recurrent neural network; long short-term memory; large language model
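The results above are reported as mean absolute percentage error (MAPE). For readers comparing the figures, here is a minimal sketch of the standard MAPE definition; the function name and the sample heights are hypothetical and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so this sketch rejects that case.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical plant heights in centimetres: each forecast is off by 1%.
heights_cm = [100.0, 200.0]
forecast_cm = [99.0, 202.0]
error_percent = mape(heights_cm, forecast_cm)
```

Because each error is divided by the true value, MAPE is scale-free, which makes it convenient for comparing forecasts across plants of different sizes.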
Global-local feature optimization based RGB-IR fusion object detection on drone view (cited by: 1)
8
Authors: Zhaodong CHEN, Hongbing JI, Yongquan ZHANG. Source: Chinese Journal of Aeronautics, 2026, Issue 1, pp. 436-453, 18 pages.
Visible and infrared (RGB-IR) fusion object detection plays an important role in security, disaster relief, etc. In recent years, deep-learning-based RGB-IR fusion detection methods have developed rapidly, but they still struggle with the complex and changing scenarios captured by drones, mainly for two reasons: (A) RGB-IR fusion detectors are susceptible to inferior inputs that degrade performance and stability. (B) RGB-IR fusion detectors are susceptible to redundant features that reduce accuracy and efficiency. In this paper, an innovative RGB-IR fusion detection framework based on global-local feature optimization, named GLFDet, is proposed to improve the detection performance and efficiency for drone-captured objects. The key components of GLFDet include a Global Feature Optimization (GFO) module, a Local Feature Optimization (LFO) module, and a Channel Separation Fusion (CSF) module. Specifically, GFO calculates the information content of the input image in the frequency domain and optimizes the features holistically. Then, LFO dynamically selects high-value features and filters out low-value features before fusion, which significantly improves the efficiency of fusion. Finally, CSF fuses the RGB and IR features across the corresponding channels, which avoids rearranging the channel relationships and enhances model stability. Extensive experimental results show that the proposed method achieves the best performance on three popular RGB-IR datasets: DroneVehicle, VEDAI, and LLVIP. In addition, GLFDet is more lightweight than other comparable models, making it more appealing for edge devices such as drones. The code is available at https://github.com/laochen330/GLFDet.
Keywords: object detection; deep learning; RGB-IR fusion; drones; global feature; local feature
GaitMAFF: Adaptive Multi-Modal Fusion of Skeleton Maps and Silhouettes for Robust Gait Recognition in Complex Scenarios
9
Authors: Zhongbin Luo, Zhaoyang Guan, Wenxing You, Yunteng Wang, Yanqiu Bi. Source: Computers, Materials & Continua, 2026, Issue 5, pp. 540-558, 19 pages.
Gait recognition is a key biometric for long-distance identification, yet its performance is severely degraded by real-world challenges such as varying clothing, carrying conditions, and changing viewpoints. While combining silhouette and skeleton data is a promising direction, effectively fusing these heterogeneous modalities and adaptively weighting their contributions in response to diverse conditions remains a central problem. This paper introduces GaitMAFF, a novel Multi-modal Adaptive Feature Fusion Network, to address this challenge. Our approach first transforms discrete skeleton joints into a dense Skeleton Map representation to align with silhouettes, then employs an attention-based module to dynamically learn the fusion weights between the two modalities. These fused features are processed by a powerful spatio-temporal backbone with Weighted Global-Local Feature Fusion Modules (WFFM) to learn a discriminative representation. Extensive experiments on the challenging CCPG and Gait3D datasets show that GaitMAFF achieves state-of-the-art performance, with an average Rank-1 accuracy of 84.6% on CCPG and 58.7% on Gait3D. These results demonstrate that our adaptive fusion strategy effectively integrates complementary multimodal information, significantly enhancing gait recognition robustness and accuracy in complex scenes and providing a practical solution for real-world applications.
Keywords: gait recognition; multi-modal fusion; adaptive feature fusion; skeleton map; silhouette
Interpretable Feature Learning and Band Gap Prediction for Titanium-based Semiconductors
10
Authors: YUAN Binxia, YANG Shen’ao, LIU Yuhao, QIAN Hong, ZHU Rui. Source: 材料导报 (PKU Core), 2026, Issue 7, pp. 184-191, 8 pages.
Titanium-based semiconductors are known for their high chemical stability and suitable band gap widths. However, conventional experimental screening methods are inefficient due to the wide variety of materials. To speed up the selection process, this work focuses on interpretable feature learning and band gap prediction for titanium-based semiconductors. First, titanium compounds were selected from the Materials Project database by machine learning, and elemental features were extracted using the Magpie descriptors. Then, principal component analysis (PCA) was applied to reduce the data dimensionality, creating a representative dataset. Meanwhile, heatmaps and SHAP (SHapley Additive exPlanations) methods were used to demonstrate the influence of key features such as electronegativity, covalent radius, period number, and unit cell volume on the band gap, clarifying the relationship between the material’s properties and performance. After comparing different machine learning models, including Random Forest (RF), Support Vector Machines (SVM), Linear Regression (LR), and Gradient Boosting Regression (GBR), RF was found to be the most accurate for band gap prediction. Finally, the model performance was improved through parameter tuning, showing high accuracy. These findings provide strong data support and design guidance for the development of materials in fields such as photocatalysis and solar cells.
Keywords: titanium-based semiconductors; band gap; feature extraction; prediction; random forest
Efficient Arabic Essay Scoring with Hybrid Models: Feature Selection, Data Optimization, and Performance Trade-Offs
11
Authors: Mohamed Ezz, Meshrif Alruily, Ayman Mohamed Mostafa, Alaa S. Alaerjan, Bader Aldughayfiq, Hisham Allahem, Abdulaziz Shehab. Source: Computers, Materials & Continua, 2026, Issue 1, pp. 2274-2301, 28 pages.
Automated essay scoring (AES) systems have gained significant importance in educational settings, offering a scalable, efficient, and objective method for evaluating student essays. However, developing AES systems for Arabic poses distinct challenges due to the language’s complex morphology, diglossia, and the scarcity of annotated datasets. This paper presents a hybrid approach to Arabic AES that combines text-based, vector-based, and embedding-based similarity measures to improve essay scoring accuracy while minimizing the training data required. Using a large Arabic essay dataset categorized into thematic groups, the study conducted four experiments to evaluate the impact of feature selection, data size, and model performance. Experiment 1 established a baseline using a non-machine learning approach, selecting top-N correlated features to predict essay scores. The subsequent experiments employed 5-fold cross-validation. Experiment 2 showed that combining embedding-based, text-based, and vector-based features in a Random Forest (RF) model achieved an R² of 88.92% and an accuracy of 83.3% within a 0.5-point tolerance. Experiment 3 further refined the feature selection process, demonstrating that 19 correlated features yielded optimal results, improving R² to 88.95%. In Experiment 4, an optimal data efficiency training approach was introduced, where training data portions increased from 5% to 50%. The study found that using just 10% of the data achieved near-peak performance, with an R² of 85.49%, emphasizing an effective trade-off between performance and computational costs. These findings highlight the potential of the hybrid approach for developing scalable Arabic AES systems, especially in low-resource environments, addressing linguistic challenges while ensuring efficient data usage.
Keywords: automated essay scoring; text-based features; vector-based features; embedding-based features; feature selection; optimal data efficiency
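The abstract above reports two metrics: the coefficient of determination R² and accuracy within a 0.5-point tolerance. The sketch below shows how such metrics are commonly computed. The function names, the inclusive treatment of the tolerance boundary, and the sample scores are assumptions for illustration; the paper's exact definitions are not given in the abstract.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


def tolerance_accuracy(y_true, y_pred, tol=0.5):
    """Share of predictions within tol of the true score (boundary included)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


# Hypothetical essay scores and model predictions.
scores = [1.0, 2.0, 3.0, 4.0]
preds = [1.5, 2.0, 2.0, 4.0]
fit = r_squared(scores, preds)
within_half_point = tolerance_accuracy(scores, preds)
```

The two metrics answer different questions: R² measures how much score variance the model explains, while tolerance accuracy counts how often a prediction would be acceptable to a grader who allows a half-point deviation.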
Regression-Based Face Pose Estimation with Deep Multi-modal Feature Loss
12
Authors: Yanqiu Wu, Chaoqun Hong, Liang Chen, Zhiqiang Zeng. Source: 国际计算机前沿大会会议论文集, 2020, Issue 1, pp. 534-549, 16 pages.
Image-based face pose estimation tries to estimate the facial direction from 2D images. It provides important information for many face recognition applications. However, it is a difficult task due to complex conditions and appearances. Deep learning methods used in this field have the disadvantage of ignoring the natural structures of human faces. To solve this problem, a framework is proposed in this paper to estimate face poses with regression, which is based on deep learning and multi-modal feature loss (M2FL). Unlike current loss functions that use only a single type of feature, the descriptive power is improved by combining multiple image features. To achieve this, hypergraph-based manifold regularization is applied. In this way, the loss of face pose estimation is reduced. Experimental results on commonly used benchmark datasets demonstrate the performance of M2FL.
Keywords: face pose estimation; deep learning; multi-modal features
Enhanced Multi-Scale Feature Extraction Lightweight Network for Remote Sensing Object Detection
13
Authors: Xiang Luo, Yuxuan Peng, Renghong Xie, Peng Li, Yuwen Qian. Source: Computers, Materials & Continua, 2026, Issue 3, pp. 2097-2118, 22 pages.
Deep learning has made significant progress in the field of oriented object detection for remote sensing images. However, existing methods still face challenges when dealing with difficult tasks such as multi-scale targets, complex backgrounds, and small objects in remote sensing. Keeping the model lightweight to address resource constraints in remote sensing scenarios while improving task performance remains a research hotspot. Therefore, we propose EM-YOLO, an enhanced multi-scale feature extraction lightweight network based on the YOLOv8s architecture, specifically optimized for the large target scale variations, diverse orientations, and numerous small objects found in remote sensing images. Our innovations lie in two main aspects. First, a dynamic snake convolution (DSC) is introduced into the backbone network to enhance the model’s feature extraction capability for oriented targets. Second, an innovative focusing-diffusion module is designed in the feature fusion neck to effectively integrate multi-scale feature information. In addition, we introduce the Layer-Adaptive Sparsity for magnitude-based Pruning (LASP) method to perform lightweight network pruning so that tasks can be completed better in resource-constrained scenarios. Experimental results on the lightweight platform Orin demonstrate that the proposed method significantly outperforms the original YOLOv8s model in oriented remote sensing object detection tasks, and achieves comparable or superior performance to state-of-the-art methods on three authoritative remote sensing datasets (DOTA v1.0, DOTA v1.5, and HRSC2016).
Keywords: deep learning; object detection; feature extraction; feature fusion; remote sensing
A Fine-Grained Recognition Model based on Discriminative Region Localization and Efficient Second-Order Feature Encoding
14
Authors: Xiaorui Zhang, Yingying Wang, Wei Sun, Shiyu Zhou, Haoming Zhang, Pengpai Wang. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 946-965, 20 pages.
Discriminative region localization and efficient feature encoding are crucial for fine-grained object recognition. However, existing data augmentation methods struggle to accurately locate discriminative regions when faced with complex backgrounds, small target objects, and limited training data, leading to poor recognition. Fine-grained images exhibit “small inter-class differences,” and while second-order feature encoding enhances discrimination, it often requires dual Convolutional Neural Networks (CNN), increasing training time and complexity. This study proposes a model integrating discriminative region localization and efficient second-order feature encoding. By ranking feature map channels via a fully connected layer, it selects high-importance channels to generate an enhanced map, accurately locating discriminative regions. Cropping and erasing augmentations further refine recognition. To improve efficiency, a novel second-order feature encoding module generates an attention map from the fourth convolutional group of Residual Network 50 layers (ResNet-50) and multiplies it with features from the fifth group, producing second-order features while reducing dimensionality and training time. Experiments on the Caltech-University of California, San Diego Birds-200-2011 (CUB-200-2011), Stanford Car, and Fine-Grained Visual Classification of Aircraft (FGVC Aircraft) datasets show state-of-the-art accuracy of 88.9%, 94.7%, and 93.3%, respectively.
Keywords: fine-grained recognition; feature encoding; data augmentation; second-order feature; discriminative regions
Layered Feature Engineering for E-Commerce Purchase Prediction: A Hierarchical Evaluation on Taobao User Behavior Datasets
15
Authors: Liqiu Suo, Lin Xia, Yoona Chung, Eunchan Kim. Source: Computers, Materials & Continua, 2026, Issue 4, pp. 1865-1889, 25 pages.
Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features. This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three layers: Basic, Conversion & Stability (efficiency and volatility across actions), and Advanced Interactions & Activity (cross-behavior synergies and intensity). Using real Taobao (Alibaba’s primary e-commerce platform) logs (57,976 records for 10,203 users; 25 November–03 December 2017), we conducted a hierarchical, layer-wise evaluation that holds data splits and hyperparameters fixed while varying only the feature set to quantify each layer’s marginal contribution. Across logistic regression (LR), decision tree, random forest, XGBoost, and CatBoost models with stratified 5-fold cross-validation, the performance improved monotonically from Basic to Conversion & Stability to Advanced features. With LR, F1 increased from 0.613 (Basic) to 0.962 (Advanced); boosted models achieved high discrimination (an AUC score of 0.995) and an F1 score of up to 0.983. Calibration and precision–recall analyses indicated strong ranking quality and acknowledged potential dataset and period biases given the short (9-day) window. By making feature contributions measurable and reproducible, the framework complements model-centric advances and offers a transparent blueprint for production-grade behavioral modeling. The code and processed artifacts are publicly available, and future work will extend the validation to longer, seasonal datasets and hybrid approaches that combine automated feature learning with domain-driven design.
Keywords: hierarchical feature engineering; purchase prediction; user behavior dataset; feature importance; e-commerce platform; Taobao
Multi-modal data analysis for autism spectrum disorder in children: State of the art and trends
16
作者 Lukai Pang Xiaoke Zhao +4 位作者 Lulu Zhao Jianqing Li Fengyi Kuo Hongxing Wang Chengyu Liu 《EngMedicine》 2026年第1期47-56,共10页
Autism spectrum disorder(AsD)is a highly heterogeneous neurodevelopmental disorder.Early diagnosis and intervention are crucial for improving outcomes.Traditional single-modality diagnostic methods are subjective,limi... Autism spectrum disorder(AsD)is a highly heterogeneous neurodevelopmental disorder.Early diagnosis and intervention are crucial for improving outcomes.Traditional single-modality diagnostic methods are subjective,limited,and struggle to reveal the underlying pathological mechanisms.In contrast,multimodal data analysis integrates behavioral,physiological,and neuroimaging information with advanced machine-learning and deeplearning algorithms to overcome these limitations.In this review,we surveyed the recent pediatric AsD literature,highlighting artificial intelligence-driven diagnostic techniques,multimodal data fusion strategies,and emerging trends in ASD assessment.We surveyed studies that integrated two or more modalities and summarized the fusion levels,learning paradigms,tasks,datasets,and metrics.Multimodal approaches outperform singlemodality baselines in classification,severity estimation,and subtyping by leveraging complementary information and reducing modality-specific biases.Multimodal approaches significantly enhance diagnostic accuracy and comprehensiveness,enabling early screening of AsD,symptom subtyping,severity assessment,and personalized interventions.Advances in multimodal fusion techniques have promoted progress in precision medicine for the treatment of ASD. 展开更多
Keywords: autism spectrum disorder; multi-modal data; machine learning; early screening; symptom subtyping
MDGET-MER: Multi-Level Dynamic Gating and Emotion Transfer for Multi-Modal Emotion Recognition
17
Authors: Musheng Chen, Qiang Wen, Xiaohong Qiu, Junhua Wu, Wenqing Fu. Computers, Materials & Continua, 2026, No. 3, pp. 872-893 (22 pages)
In multi-modal emotion recognition, excessive reliance on historical context often impedes the detection of emotional shifts, while modality heterogeneity and unimodal noise limit recognition performance. Existing methods struggle to dynamically adjust cross-modal complementary strength to optimize fusion quality and lack effective mechanisms to model the dynamic evolution of emotions. To address these issues, we propose a multi-level dynamic gating and emotion transfer framework for multi-modal emotion recognition. A dynamic gating mechanism is applied across unimodal encoding, cross-modal alignment, and emotion transfer modeling, substantially improving noise robustness and feature alignment. First, we construct a unimodal encoder based on gated recurrent units and feature-selection gating to suppress intra-modal noise and enhance contextual representation. Second, we design a gated-attention cross-modal encoder that dynamically calibrates the complementary contributions of visual and audio modalities to the dominant textual features and eliminates redundant information. Finally, we introduce a gated enhanced emotion transfer module that explicitly models the temporal dependence of emotional evolution in dialogues via transfer gating and optimizes continuity modeling with a comparative learning loss. Experimental results demonstrate that the proposed method outperforms state-of-the-art models on the public MELD and IEMOCAP datasets.
Keywords: multi-modal emotion recognition; dynamic gating; emotion transfer module; cross-modal dynamic alignment; noise robustness
Detecting Anomalies in FinTech: A Graph Neural Network and Feature Selection Perspective
18
Authors: Vinh Truong Hoang, Nghia Dinh, Viet-Tuan Le, Kiet Tran-Trung, Bay Nguyen Van, Kittikhun Meethongjan. Computers, Materials & Continua, 2026, No. 1, pp. 207-246 (40 pages)
The Financial Technology (FinTech) sector has witnessed rapid growth, resulting in increasingly complex and high-volume digital transactions. Although this expansion improves efficiency and accessibility, it also introduces significant vulnerabilities, including fraud, money laundering, and market manipulation. Traditional anomaly detection techniques often fail to capture the relational and dynamic characteristics of financial data. Graph Neural Networks (GNNs), capable of modeling intricate interdependencies among entities, have emerged as a powerful framework for detecting subtle and sophisticated anomalies. However, the high dimensionality and inherent noise of FinTech datasets demand robust feature selection strategies to improve model scalability, performance, and interpretability. This paper presents a comprehensive survey of GNN-based approaches for anomaly detection in FinTech, with an emphasis on the synergistic role of feature selection. We examine the theoretical foundations of GNNs, review state-of-the-art feature selection techniques, analyze their integration with GNNs, and categorize prevalent anomaly types in FinTech applications. In addition, we discuss practical implementation challenges, highlight representative case studies, and propose future research directions to advance the field of graph-based anomaly detection in financial systems.
Keywords: GNN; security; e-commerce; FinTech; abnormal detection; feature selection
Towards Real-Time Multi-Person Pose Estimation via Feature Selection and Sharpening Mechanisms
19
Authors: Chengang Dong, Yongkang Ding, Jianwei Hu. Computer Modeling in Engineering & Sciences, 2026, No. 3, pp. 888-908 (21 pages)
Real-time multi-person pose estimation (MPE) built upon neural network architectures aims to simultaneously detect multiple human instances and regress joint coordinates in dynamic scenes. However, due to factors such as high model complexity and limited expression of keypoint information, both the efficiency and accuracy of real-time MPE remain to be improved. To mitigate the adverse impacts caused by the aforementioned issues, this work develops FSEM-Pose, a real-time MPE model rooted in the YOLOv10 framework. In detail, first, FSEM-Pose upgrades the backbone module of the baseline network by introducing the Feature Shuffling-Convolution (FS-Conv), which effectively reduces the backbone size while maximizing the retention of spatial information from the input image. Second, FSEM-Pose incorporates a Feature Saliency Enhancement Module (FSEM) to strengthen the feature encoding of human keypoints, thereby improving the accuracy of pose estimation. Finally, FSEM-Pose further enhances inference efficiency via a lightweight optimization of the head using shared convolutional layers. Our method achieves competitive results across multiple accuracy and efficiency metrics on the MS COCO 2017 and CrowdPose datasets. While being lightweight in design, it improves average precision (AP) by 2.1% and 2.5%, respectively.
Keywords: pose estimation; feature sharpening; lightweight; YOLOv10
LP-YOLO: Enhanced Smoke and Fire Detection via Self-Attention and Feature Pyramid Integration
20
Authors: Qing Long, Bing Yi, Haiqiao Liu, Zhiling Peng, Xiang Liu. Computers, Materials & Continua, 2026, No. 3, pp. 1490-1509 (20 pages)
Accurate detection of smoke and fire sources is critical for early fire warning and environmental monitoring. However, conventional detection approaches are highly susceptible to noise, illumination variations, and complex environmental conditions, which often reduce detection accuracy and real-time performance. To address these limitations, we propose Lightweight and Precise YOLO (LP-YOLO), a high-precision detection framework that integrates a self-attention mechanism with a feature pyramid, built upon YOLOv8. First, to overcome the restricted receptive field and parameter redundancy of conventional Convolutional Neural Networks (CNNs), we design an enhanced backbone based on Wavelet Convolutions (WTConv), which expands the receptive field through multi-frequency convolutional processing. Second, a Bidirectional Feature Pyramid Network (BiFPN) is employed to achieve bidirectional feature fusion, enhancing the representation of smoke features across scales. Third, to mitigate the challenge of ambiguous object boundaries, we introduce the Frequency-aware Feature Fusion (FreqFusion) module, in which the Adaptive Low-Pass Filter (ALPF) reduces intra-class inconsistencies, the offset generator refines boundary localization, and the Adaptive High-Pass Filter (AHPF) recovers high-frequency details lost during down-sampling. Experimental evaluations demonstrate that LP-YOLO significantly outperforms the baseline YOLOv8, achieving an improvement of 9.3% in mAP@50 and 9.2% in F1-score. Moreover, the model is 56.6% and 32.4% smaller than YOLOv7-tiny and EfficientDet, respectively, while maintaining real-time inference speed at 238 frames per second (FPS). Validation on multiple benchmark datasets, including D-Fire, FIRESENSE, and BoWFire, further confirms its robustness and generalization ability, with detection accuracy consistently exceeding 82%. These results highlight the potential of LP-YOLO as a practical solution with high accuracy, robustness, and real-time performance for smoke and fire source detection.
Keywords: deep learning; smoke detection; feature pyramid; boundary refinement