Coal dust explosions are severe safety accidents in coal mine production,posing significant threats to life and property.Predicting the maximum explosion pressure(Pm)of coal dust using deep learning models can effecti...Coal dust explosions are severe safety accidents in coal mine production,posing significant threats to life and property.Predicting the maximum explosion pressure(Pm)of coal dust using deep learning models can effectively assess potential risks and provide a scientific basis for preventing coal dust explosions.In this study,a 20-L explosion sphere apparatus was used to test the maximum explosion pressure of coal dust under seven different particle sizes and ten mass concentrations(Cdust),resulting in a dataset of 70 experimental groups.Through Spearman correlation analysis and random forest feature selection methods,particle size(D_(10),D_(20),D_(50))and mass concentration(Cdust)were identified as critical feature parameters from the ten initial parameters of the coal dust samples.Based on this,a hybrid Long Short-Term Memory(LSTM)network model incorporating a Multi-Head Attention Mechanism and the Sparrow Search Algorithm(SSA)was proposed to predict the maximum explosion pressure of coal dust.The results demonstrate that the SSA-LSTM-Multi-Head Attention model excels in predicting the maximum explosion pressure of coal dust.The four evaluation metrics indicate that the model achieved a coefficient of determination(R^(2)),root mean square error(RMSE),mean absolute percentage error(MAPE),and mean absolute error(MAE)of 0.9841,0.0030,0.0074,and 0.0049,respectively,in the training set.In the testing set,these values were 0.9743,0.0087,0.0108,and 0.0069,respectively.Compared to artificial neural networks(ANN),random forest(RF),support vector machines(SVM),particle swarm optimized-SVM(PSO-SVM)neural networks,and the traditional single-model LSTM,the SSA-LSTM-Multi-Head Attention model demonstrated superior generalization capability and prediction accuracy.The findings of this study not only advance the application of deep learning in coal dust explosion prediction but also provide robust technical support for the prevention and risk assessment of coal dust explosions.展开更多
Sarcasm detection is a complex and challenging task,particularly in the context of Chinese social media,where it exhibits strong contextual dependencies and cultural specificity.To address the limitations of existing ...Sarcasm detection is a complex and challenging task,particularly in the context of Chinese social media,where it exhibits strong contextual dependencies and cultural specificity.To address the limitations of existing methods in capturing the implicit semantics and contextual associations in sarcastic expressions,this paper proposes an event-aware model for Chinese sarcasm detection,leveraging a multi-head attention(MHA)mechanism and contrastive learning(CL)strategies.The proposed model employs a dual-path Bidirectional Encoder Representations from Transformers(BERT)encoder to process comment text and event context separately and integrates an MHA mechanism to facilitate deep interactions between the two,thereby capturing multidimensional semantic associations.Additionally,a CL strategy is introduced to enhance feature representation capabilities,further improving the model’s performance in handling class imbalance and complex contextual scenarios.The model achieves state-of-the-art performance on the Chinese sarcasm dataset,with significant improvements in accuracy(79.55%),F1-score(84.22%),and an area under the curve(AUC,84.35%).展开更多
Abnormal network traffic, as a frequent security risk, requires a series of techniques to categorize and detect it. Existing network traffic anomaly detection still faces challenges: the inability to fully extract loc...Abnormal network traffic, as a frequent security risk, requires a series of techniques to categorize and detect it. Existing network traffic anomaly detection still faces challenges: the inability to fully extract local and global features, as well as the lack of effective mechanisms to capture complex interactions between features;Additionally, when increasing the receptive field to obtain deeper feature representations, the reliance on increasing network depth leads to a significant increase in computational resource consumption, affecting the efficiency and performance of detection. Based on these issues, firstly, this paper proposes a network traffic anomaly detection model based on parallel dilated convolution and residual learning (Res-PDC). To better explore the interactive relationships between features, the traffic samples are converted into two-dimensional matrix. A module combining parallel dilated convolutions and residual learning (res-pdc) was designed to extract local and global features of traffic at different scales. By utilizing res-pdc modules with different dilation rates, we can effectively capture spatial features at different scales and explore feature dependencies spanning wider regions without increasing computational resources. Secondly, to focus and integrate the information in different feature subspaces, further enhance and extract the interactions among the features, multi-head attention is added to Res-PDC, resulting in the final model: multi-head attention enhanced parallel dilated convolution and residual learning (MHA-Res-PDC) for network traffic anomaly detection. Finally, comparisons with other machine learning and deep learning algorithms are conducted on the NSL-KDD and CIC-IDS-2018 datasets. The experimental results demonstrate that the proposed method in this paper can effectively improve the detection performance.展开更多
Safety maintenance of power equipment is of great importance in power grids,in which image-processing-based defect recognition is supposed to classify abnormal conditions during daily inspection.However,owing to the b...Safety maintenance of power equipment is of great importance in power grids,in which image-processing-based defect recognition is supposed to classify abnormal conditions during daily inspection.However,owing to the blurred features of defect images,the current defect recognition algorithm has poor fine-grained recognition ability.Visual attention can achieve fine-grained recognition with its abil-ity to model long-range dependencies while introducing extra computational complexity,especially for multi-head attention in vision transformer structures.Under these circumstances,this paper proposes a self-reduction multi-head attention module that can reduce computational complexity and be easily combined with a Convolutional Neural Network(CNN).In this manner,local and global fea-tures can be calculated simultaneously in our proposed structure,aiming to improve the defect recognition performance.Specifically,the proposed self-reduction multi-head attention can reduce redundant parameters,thereby solving the problem of limited computational resources.Experimental results were obtained based on the defect dataset collected from the substation.The results demonstrated the efficiency and superiority of the proposed method over other advanced algorithms.展开更多
As the group-buying model shows significant progress in attracting new users,enhancing user engagement,and increasing platform profitability,providing personalized recommendations for group-buying users has emerged as...As the group-buying model shows significant progress in attracting new users,enhancing user engagement,and increasing platform profitability,providing personalized recommendations for group-buying users has emerged as a new challenge in the field of recommendation systems.This paper introduces a group-buying recommendation model based on multi-head attention mechanisms and multi-task learning,termed the Multi-head Attention Mechanisms and Multi-task Learning Group-Buying Recommendation(MAMGBR)model,specifically designed to optimize group-buying recommendations on e-commerce platforms.The core dataset of this study comes from the Chinese maternal and infant e-commerce platform“Beibei,”encompassing approximately 430,000 successful groupbuying actions and over 120,000 users.Themodel focuses on twomain tasks:recommending items for group organizers(Task Ⅰ)and recommending participants for a given group-buying event(Task Ⅱ).In model evaluation,MAMGBR achieves an MRR@10 of 0.7696 for Task I,marking a 20.23%improvement over baseline models.Furthermore,in Task II,where complex interaction patterns prevail,MAMGBR utilizes auxiliary loss functions to effectively model the multifaceted roles of users,items,and participants,leading to a 24.08%increase in MRR@100 under a 1:99 sample ratio.Experimental results show that compared to benchmark models,such as NGCF and EATNN,MAMGBR’s integration ofmulti-head attentionmechanisms,expert networks,and gating mechanisms enables more accurate modeling of user preferences and social associations within group-buying scenarios,significantly enhancing recommendation accuracy and platform group-buying success rates.展开更多
Aiming at the problem of low convergence efficiency of traditional multi-UAV path planning algorithms in unknown complex environments,this paper proposes a deep reinforcement learning algorithm incorporating the atten...Aiming at the problem of low convergence efficiency of traditional multi-UAV path planning algorithms in unknown complex environments,this paper proposes a deep reinforcement learning algorithm incorporating the attention mechanism.The method is based on the Soft Actor-Critic(SAC)framework,which introduces a multi-attention mechanism in the Critic network,dynamically learns the dependency relationship between intelligences,and realizes key information screening and conflict avoidance.An environment with multiple random obstacles is designed to simulate complex emergent situations.The results show that the proposed algorithm significantly improves the mission success rate and average reward,significantly extends the survival time and exploration range of the UAVs,and verifies the effectiveness of the attention mechanism in enhancing the efficiency,robustness,and long-term planning capability of multi-UAV collaboration,as compared to the baseline method that does not use attention.展开更多
The primary objective of Chinese spelling correction(CSC)is to detect and correct erroneous characters in Chinese text,which can result from various factors,such as inaccuracies in pinyin representation,character rese...The primary objective of Chinese spelling correction(CSC)is to detect and correct erroneous characters in Chinese text,which can result from various factors,such as inaccuracies in pinyin representation,character resemblance,and semantic discrepancies.However,existing methods often struggle to fully address these types of errors,impacting the overall correction accuracy.This paper introduces a multi-modal feature encoder designed to efficiently extract features from three distinct modalities:pinyin,semantics,and character morphology.Unlike previous methods that rely on direct fusion or fixed-weight summation to integrate multi-modal information,our approach employs a multi-head attention mechanism to focuse more on relevant modal information while dis-regarding less pertinent data.To prevent issues such as gradient explosion or vanishing,the model incorporates a residual connection of the original text vector for fine-tuning.This approach ensures robust model performance by maintaining essential linguistic details throughout the correction process.Experimental evaluations on the SIGHAN benchmark dataset demonstrate that the pro-posed model outperforms baseline approaches across various metrics and datasets,confirming its effectiveness and feasibility.展开更多
Due to the time-varying topology and possible disturbances in a conflict environment,it is still challenging to maintain the mission performance of flying Ad hoc networks(FANET),which limits the application of Unmanne...Due to the time-varying topology and possible disturbances in a conflict environment,it is still challenging to maintain the mission performance of flying Ad hoc networks(FANET),which limits the application of Unmanned Aerial Vehicle(UAV)swarms in harsh environments.This paper proposes an intelligent framework to quickly recover the cooperative coveragemission by aggregating the historical spatio-temporal network with the attention mechanism.The mission resilience metric is introduced in conjunction with connectivity and coverage status information to simplify the optimization model.A spatio-temporal node pooling method is proposed to ensure all node location features can be updated after destruction by capturing the temporal network structure.Combined with the corresponding Laplacian matrix as the hyperparameter,a recovery algorithm based on the multi-head attention graph network is designed to achieve rapid recovery.Simulation results showed that the proposed framework can facilitate rapid recovery of the connectivity and coverage more effectively compared to the existing studies.The results demonstrate that the average connectivity and coverage results is improved by 17.92%and 16.96%,respectively compared with the state-of-the-art model.Furthermore,by the ablation study,the contributions of each different improvement are compared.The proposed model can be used to support resilient network design for real-time mission execution.展开更多
To fully make use of information from different representation subspaces,a multi-head attention-based long short-term memory(LSTM)model is proposed in this study for speech emotion recognition(SER).The proposed model ...To fully make use of information from different representation subspaces,a multi-head attention-based long short-term memory(LSTM)model is proposed in this study for speech emotion recognition(SER).The proposed model uses frame-level features and takes the temporal information of emotion speech as the input of the LSTM layer.Here,a multi-head time-dimension attention(MHTA)layer was employed to linearly project the output of the LSTM layer into different subspaces for the reduced-dimension context vectors.To provide relative vital information from other dimensions,the output of MHTA,the output of feature-dimension attention,and the last time-step output of LSTM were utilized to form multiple context vectors as the input of the fully connected layer.To improve the performance of multiple vectors,feature-dimension attention was employed for the all-time output of the first LSTM layer.The proposed model was evaluated on the eNTERFACE and GEMEP corpora,respectively.The results indicate that the proposed model outperforms LSTM by 14.6%and 10.5%for eNTERFACE and GEMEP,respectively,proving the effectiveness of the proposed model in SER tasks.展开更多
Fraud cases have been a risk in society and people’s property security has been greatly threatened.In recent studies,many promising algorithms have been developed for social media offensive text recognition as well a...Fraud cases have been a risk in society and people’s property security has been greatly threatened.In recent studies,many promising algorithms have been developed for social media offensive text recognition as well as sentiment analysis.These algorithms are also suitable for fraudulent phone text recognition.Compared to these tasks,the semantics of fraudulent words are more complex and more difficult to distinguish.Recurrent Neural Networks(RNN),the variants ofRNN,ConvolutionalNeuralNetworks(CNN),and hybrid neural networks to extract text features are used by most text classification research.However,a single network or a simple network combination cannot obtain rich characteristic knowledge of fraudulent phone texts relatively.Therefore,a new model is proposed in this paper.In the fraudulent phone text,the knowledge that can be learned by the model includes the sequence structure of sentences,the correlation between words,the correlation of contextual semantics,the feature of keywords in sentences,etc.The new model combines a bidirectional Long-Short Term Memory Neural Network(BiLSTM)or a bidirectional Gate Recurrent United(BiGRU)and a Multi-Head attention mechanism module with convolution.A normalization layer is added after the output of the final hidden layer.BiLSTM or BiGRU is used to build the encoding and decoding layer.Multi-head attention mechanism module with convolution(MHAC)enhances the ability of the model to learn global interaction information and multi-granularity local interaction information in fraudulent sentences.A fraudulent phone text dataset is produced by us in this paper.The THUCNews data sets and fraudulent phone text data sets are used in experiments.Experiment results show that compared with the baseline model,the proposed model(LMHACL)has the best experiment results in terms of Accuracy,Precision,Recall,and F1 score on the two data sets.And the performance indexes on fraudulent phone text data sets are all above 0.94.展开更多
Cardiovascular disease is the leading cause of death globally.This disease causes loss of heart muscles and is also responsible for the death of heart cells,sometimes damaging their functionality.A person’s life may ...Cardiovascular disease is the leading cause of death globally.This disease causes loss of heart muscles and is also responsible for the death of heart cells,sometimes damaging their functionality.A person’s life may depend on receiving timely assistance as soon as possible.Thus,minimizing the death ratio can be achieved by early detection of heart attack(HA)symptoms.In the United States alone,an estimated 610,000 people die fromheart attacks each year,accounting for one in every four fatalities.However,by identifying and reporting heart attack symptoms early on,it is possible to reduce damage and save many lives significantly.Our objective is to devise an algorithm aimed at helping individuals,particularly elderly individuals living independently,to safeguard their lives.To address these challenges,we employ deep learning techniques.We have utilized a vision transformer(ViT)to address this problem.However,it has a significant overhead cost due to its memory consumption and computational complexity because of scaling dot-product attention.Also,since transformer performance typically relies on large-scale or adequate data,adapting ViT for smaller datasets is more challenging.In response,we propose a three-in-one steam model,theMulti-Head Attention Vision Hybrid(MHAVH).Thismodel integrates a real-time posture recognition framework to identify chest pain postures indicative of heart attacks using transfer learning techniques,such as ResNet-50 and VGG-16,renowned for their robust feature extraction capabilities.By incorporatingmultiple heads into the vision transformer to generate additional metrics and enhance heart-detection capabilities,we leverage a 2019 posture-based dataset comprising RGB images,a novel creation by the author that marks the first dataset tailored for posture-based heart attack detection.Given the limited online data availability,we segmented this dataset into gender categories(male and female)and conducted testing on both segmented and original datasets.The training accuracy of our model reached an impressive 99.77%.Upon testing,the accuracy for male and female datasets was recorded at 92.87%and 75.47%,respectively.The combined dataset accuracy is 93.96%,showcasing a commendable performance overall.Our proposed approach demonstrates versatility in accommodating small and large datasets,offering promising prospects for real-world applications.展开更多
The present study examines the impact of short-term public opinion sentiment on the secondary market,with a focus on the potential for such sentiment to cause dramatic stock price fluctuations and increase investment ...The present study examines the impact of short-term public opinion sentiment on the secondary market,with a focus on the potential for such sentiment to cause dramatic stock price fluctuations and increase investment risk.The quantification of investment sentiment indicators and the persistent analysis of their impact has been a complex and significant area of research.In this paper,a structured multi-head attention stock index prediction method based adaptive public opinion sentiment vector is proposed.The proposedmethod utilizes an innovative approach to transform numerous investor comments on social platforms over time into public opinion sentiment vectors expressing complex sentiments.It then analyzes the continuous impact of these vectors on the market through the use of aggregating techniques and public opinion data via a structured multi-head attention mechanism.The experimental results demonstrate that the public opinion sentiment vector can provide more comprehensive feedback on market sentiment than traditional sentiment polarity analysis.Furthermore,the multi-head attention mechanism is shown to improve prediction accuracy through attention convergence on each type of input information separately.Themean absolute percentage error(MAPE)of the proposedmethod is 0.463%,a reduction of 0.294% compared to the benchmark attention algorithm.Additionally,the market backtesting results indicate that the return was 24.560%,an improvement of 8.202% compared to the benchmark algorithm.These results suggest that themarket trading strategy based on thismethod has the potential to improve trading profits.展开更多
Automatic extraction of the patient’s health information from the unstructured data concerning the discharge summary remains challenging.Discharge summary related documents contain various aspects of the patient heal...Automatic extraction of the patient’s health information from the unstructured data concerning the discharge summary remains challenging.Discharge summary related documents contain various aspects of the patient health condition to examine the quality of treatment and thereby help improve decision-making in the medical field.Using a sentiment dictionary and feature engineering,the researchers primarily mine semantic text features.However,choosing and designing features requires a lot of manpower.The proposed approach is an unsupervised deep learning model that learns a set of clusters embedded in the latent space.A composite model including Active Learning(AL),Convolutional Neural Network(CNN),BiGRU,and Multi-Attention,called ACBMA in this research,is designed to measure the quality of treatment based on discharge summaries text sentiment detection.CNN is utilized for extracting the set of local features of text vectors.Then BiGRU network was utilized to extract the text’s global features to solve the issues that a single CNN cannot obtain global semantic information and the traditional Recurrent Neural Network(RNN)gradient disappearance.Experiments prove that the ACBMA method can demonstrate the effectiveness of the suggested method,achieve comparable results to state-of-arts methods in sentiment detection,and outperform them with accurate benchmarks.Finally,several algorithm studies ultimately determined that the ACBMA method is more precise for discharge summaries sentiment analysis.展开更多
Accurate traffic prediction is crucial for an intelligent traffic system (ITS). However, the excessive non-linearity and complexity of the spatial-temporal correlation in traffic flow severely limit the prediction acc...Accurate traffic prediction is crucial for an intelligent traffic system (ITS). However, the excessive non-linearity and complexity of the spatial-temporal correlation in traffic flow severely limit the prediction accuracy of most existing models, which simply stack temporal and spatial modules and fail to capture spatial-temporal features effectively. To improve the prediction accuracy, a multi-head attention spatial-temporal graph neural network (MSTNet) is proposed in this paper. First, the traffic data is decomposed into unique time spans that conform to positive rules, and valuable traffic node attributes are mined through an adaptive graph structure. Second, time and spatial features are captured using a multi-head attention spatial-temporal module. Finally, a multi-step prediction module is used to achieve future traffic condition prediction. Numerical experiments were conducted on an open-source dataset, and the results demonstrate that MSTNet performs well in spatial-temporal feature extraction and achieves more positive forecasting results than the baseline methods.展开更多
The majority of existing graph-network-based few-shot models focus on a node-similarity update mode.The lack of adequate information intensies the risk of overtraining.In this paper,we propose a novel Multihead Attent...The majority of existing graph-network-based few-shot models focus on a node-similarity update mode.The lack of adequate information intensies the risk of overtraining.In this paper,we propose a novel Multihead Attention Graph Network to excavate discriminative relation and fulll effective information propagation.For edge update,the node-level attention is used to evaluate the similarities between the two nodes and the distributionlevel attention extracts more in-deep global relation.The cooperation between those two parts provides a discriminative and comprehensive expression for edge feature.For node update,we embrace the label-level attention to soften the noise of irrelevant nodes and optimize the update direction.Our proposed model is veried through extensive experiments on two few-shot benchmark MiniImageNet and CIFAR-FS dataset.The results suggest that our method has a strong capability of noise immunity and quick convergence.The classication accuracy outperforms most state-of-the-art approaches.展开更多
Due to its high-temperature and high-pressure operating environment,food/feed puffing machines are prone to faults such as cavity blockage and cutter wear.This paper presents the design of a fault diagnosis system for...Due to its high-temperature and high-pressure operating environment,food/feed puffing machines are prone to faults such as cavity blockage and cutter wear.This paper presents the design of a fault diagnosis system for puffing machines(food/feed processing equipment that expands or puffs agricultural products),based on a convolutional neural network and a multi-head attention mechanism model,which incorporates Bayesian optimization.The system combines multi-source information fusion,capturing patterns and characteristics associated with fault states by monitoring various sources of information,such as temperature,noise signals,main motor current and vibration signals from key components.Hyperparameters are optimized through Bayesian optimization to obtain the optimal parameter model.The integration of convolutional neural networks and multi-head attention mechanisms enables the simultaneous capture of both local and global information,thereby enhancing data comprehension.Experimental results demonstrate that the system successfully diagnoses puffing machine faults,achieving an average recognition accuracy of 98.8%across various operating conditions,highlighting its high accuracy,generalization ability and robustness.展开更多
Brain tumors pose a singularly formidable threat in contemporary healthcare due to their diverse histological profiles and unpredictable clinical behavior.Their spectrum ranges from slow-growing benign tumors to highl...Brain tumors pose a singularly formidable threat in contemporary healthcare due to their diverse histological profiles and unpredictable clinical behavior.Their spectrum ranges from slow-growing benign tumors to highly aggressive malignancies in sensitive anatomical locations.This necessitates an intensified focus on their path-ophysiology and demands precise characterization for patient-specific therapeutic solutions.Techniques to correctly identify brain tumors using artificial intelligence are often employed for addressing segmentation and detection tasks;however,the lack of generalizable results hinders medical practitioners from incorporating them into the diagnostic process.Predominantly reliant on Magnetic Resonance Imaging,research on other imaging methods like Positron Emission Tomography&Computed Tomography,is scarce due to a dearth of open-access datasets.Our study proposes a robust MBTC-Net framework by leveraging EfficientNetV2B0 for extracting high-dimensional feature maps,followed by reshaping into sequences and applying multi-head attention to capture contextual dependencies.After reintroducing the attention output into a spatial structure,we perform average pooling before transitioning to dense layers,enhanced with batch normalization and dropout.The model is fine-tuned with the Adamax optimizer to classify various kinds of brain tumors using softmax from T1-weighted,T1 Contrast-Enhanced,&T2-weighted MRI sequences and CT scans.To reduce the risk of overfitting,measures such as stratified 5-fold cross-validation have been extensively implemented across 3 open-access Kaggle datasets,obtaining 97.54%(15-class),97.97%(6-class),and 99.34%(2-class)accuracies,respectively.We have also applied Grad-CAM to decipher and visually analyze the predictions made by this framework.This research underscores the need for multimodal training of CT scans and MRI sequences for deploying a sturdy framework in real-time environments and advancing the well-being of patients.展开更多
基金funded by the Research on Intelligent Mining Geological Model and Ventilation Model for Extremely Thin Coal Seam in Heilongjiang Province,China(2021ZXJ02A03)the Demonstration of Intelligent Mining for Comprehensive Mining Face in Extremely Thin Coal Seam in Heilongjiang Province,China(2021ZXJ02A04)the Natural Science Foundation of Heilongjiang Province,China(LH2024E112).
文摘Coal dust explosions are severe safety accidents in coal mine production,posing significant threats to life and property.Predicting the maximum explosion pressure(Pm)of coal dust using deep learning models can effectively assess potential risks and provide a scientific basis for preventing coal dust explosions.In this study,a 20-L explosion sphere apparatus was used to test the maximum explosion pressure of coal dust under seven different particle sizes and ten mass concentrations(Cdust),resulting in a dataset of 70 experimental groups.Through Spearman correlation analysis and random forest feature selection methods,particle size(D_(10),D_(20),D_(50))and mass concentration(Cdust)were identified as critical feature parameters from the ten initial parameters of the coal dust samples.Based on this,a hybrid Long Short-Term Memory(LSTM)network model incorporating a Multi-Head Attention Mechanism and the Sparrow Search Algorithm(SSA)was proposed to predict the maximum explosion pressure of coal dust.The results demonstrate that the SSA-LSTM-Multi-Head Attention model excels in predicting the maximum explosion pressure of coal dust.The four evaluation metrics indicate that the model achieved a coefficient of determination(R^(2)),root mean square error(RMSE),mean absolute percentage error(MAPE),and mean absolute error(MAE)of 0.9841,0.0030,0.0074,and 0.0049,respectively,in the training set.In the testing set,these values were 0.9743,0.0087,0.0108,and 0.0069,respectively.Compared to artificial neural networks(ANN),random forest(RF),support vector machines(SVM),particle swarm optimized-SVM(PSO-SVM)neural networks,and the traditional single-model LSTM,the SSA-LSTM-Multi-Head Attention model demonstrated superior generalization capability and prediction accuracy.The findings of this study not only advance the application of deep learning in coal dust explosion prediction but also provide robust technical support for the prevention and risk assessment of coal dust explosions.
基金granted by Qin Xin Talents Cultivation Program(No.QXTCP C202115),Beijing Information Science&Technology Universitythe Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing Fund(No.GJJ-23),National Social Science Foundation,China(No.21BTQ079).
文摘Sarcasm detection is a complex and challenging task,particularly in the context of Chinese social media,where it exhibits strong contextual dependencies and cultural specificity.To address the limitations of existing methods in capturing the implicit semantics and contextual associations in sarcastic expressions,this paper proposes an event-aware model for Chinese sarcasm detection,leveraging a multi-head attention(MHA)mechanism and contrastive learning(CL)strategies.The proposed model employs a dual-path Bidirectional Encoder Representations from Transformers(BERT)encoder to process comment text and event context separately and integrates an MHA mechanism to facilitate deep interactions between the two,thereby capturing multidimensional semantic associations.Additionally,a CL strategy is introduced to enhance feature representation capabilities,further improving the model’s performance in handling class imbalance and complex contextual scenarios.The model achieves state-of-the-art performance on the Chinese sarcasm dataset,with significant improvements in accuracy(79.55%),F1-score(84.22%),and an area under the curve(AUC,84.35%).
基金supported by the Xiamen Science and Technology Subsidy Project(No.2023CXY0318).
文摘Abnormal network traffic, as a frequent security risk, requires a series of techniques to categorize and detect it. Existing network traffic anomaly detection still faces challenges: the inability to fully extract local and global features, as well as the lack of effective mechanisms to capture complex interactions between features;Additionally, when increasing the receptive field to obtain deeper feature representations, the reliance on increasing network depth leads to a significant increase in computational resource consumption, affecting the efficiency and performance of detection. Based on these issues, firstly, this paper proposes a network traffic anomaly detection model based on parallel dilated convolution and residual learning (Res-PDC). To better explore the interactive relationships between features, the traffic samples are converted into two-dimensional matrix. A module combining parallel dilated convolutions and residual learning (res-pdc) was designed to extract local and global features of traffic at different scales. By utilizing res-pdc modules with different dilation rates, we can effectively capture spatial features at different scales and explore feature dependencies spanning wider regions without increasing computational resources. Secondly, to focus and integrate the information in different feature subspaces, further enhance and extract the interactions among the features, multi-head attention is added to Res-PDC, resulting in the final model: multi-head attention enhanced parallel dilated convolution and residual learning (MHA-Res-PDC) for network traffic anomaly detection. Finally, comparisons with other machine learning and deep learning algorithms are conducted on the NSL-KDD and CIC-IDS-2018 datasets. The experimental results demonstrate that the proposed method in this paper can effectively improve the detection performance.
基金supported in part by Major Program of the National Natural Science Foundation of China under Grant 62127803.
文摘Safety maintenance of power equipment is of great importance in power grids,in which image-processing-based defect recognition is supposed to classify abnormal conditions during daily inspection.However,owing to the blurred features of defect images,the current defect recognition algorithm has poor fine-grained recognition ability.Visual attention can achieve fine-grained recognition with its abil-ity to model long-range dependencies while introducing extra computational complexity,especially for multi-head attention in vision transformer structures.Under these circumstances,this paper proposes a self-reduction multi-head attention module that can reduce computational complexity and be easily combined with a Convolutional Neural Network(CNN).In this manner,local and global fea-tures can be calculated simultaneously in our proposed structure,aiming to improve the defect recognition performance.Specifically,the proposed self-reduction multi-head attention can reduce redundant parameters,thereby solving the problem of limited computational resources.Experimental results were obtained based on the defect dataset collected from the substation.The results demonstrated the efficiency and superiority of the proposed method over other advanced algorithms.
基金supported by the Key Research and Development Program of Heilongjiang Province(No.2022ZX01A35).
文摘As the group-buying model shows significant progress in attracting new users,enhancing user engagement,and increasing platform profitability,providing personalized recommendations for group-buying users has emerged as a new challenge in the field of recommendation systems.This paper introduces a group-buying recommendation model based on multi-head attention mechanisms and multi-task learning,termed the Multi-head Attention Mechanisms and Multi-task Learning Group-Buying Recommendation(MAMGBR)model,specifically designed to optimize group-buying recommendations on e-commerce platforms.The core dataset of this study comes from the Chinese maternal and infant e-commerce platform“Beibei,”encompassing approximately 430,000 successful groupbuying actions and over 120,000 users.Themodel focuses on twomain tasks:recommending items for group organizers(Task Ⅰ)and recommending participants for a given group-buying event(Task Ⅱ).In model evaluation,MAMGBR achieves an MRR@10 of 0.7696 for Task I,marking a 20.23%improvement over baseline models.Furthermore,in Task II,where complex interaction patterns prevail,MAMGBR utilizes auxiliary loss functions to effectively model the multifaceted roles of users,items,and participants,leading to a 24.08%increase in MRR@100 under a 1:99 sample ratio.Experimental results show that compared to benchmark models,such as NGCF and EATNN,MAMGBR’s integration ofmulti-head attentionmechanisms,expert networks,and gating mechanisms enables more accurate modeling of user preferences and social associations within group-buying scenarios,significantly enhancing recommendation accuracy and platform group-buying success rates.
文摘Aiming at the problem of low convergence efficiency of traditional multi-UAV path planning algorithms in unknown complex environments,this paper proposes a deep reinforcement learning algorithm incorporating the attention mechanism.The method is based on the Soft Actor-Critic(SAC)framework,which introduces a multi-attention mechanism in the Critic network,dynamically learns the dependency relationship between intelligences,and realizes key information screening and conflict avoidance.An environment with multiple random obstacles is designed to simulate complex emergent situations.The results show that the proposed algorithm significantly improves the mission success rate and average reward,significantly extends the survival time and exploration range of the UAVs,and verifies the effectiveness of the attention mechanism in enhancing the efficiency,robustness,and long-term planning capability of multi-UAV collaboration,as compared to the baseline method that does not use attention.
基金Supported by the National Natural Science Foundation of China(No.61472256,61170277)the Hujiang Foundation(No.A14006).
文摘The primary objective of Chinese spelling correction(CSC)is to detect and correct erroneous characters in Chinese text,which can result from various factors,such as inaccuracies in pinyin representation,character resemblance,and semantic discrepancies.However,existing methods often struggle to fully address these types of errors,impacting the overall correction accuracy.This paper introduces a multi-modal feature encoder designed to efficiently extract features from three distinct modalities:pinyin,semantics,and character morphology.Unlike previous methods that rely on direct fusion or fixed-weight summation to integrate multi-modal information,our approach employs a multi-head attention mechanism to focuse more on relevant modal information while dis-regarding less pertinent data.To prevent issues such as gradient explosion or vanishing,the model incorporates a residual connection of the original text vector for fine-tuning.This approach ensures robust model performance by maintaining essential linguistic details throughout the correction process.Experimental evaluations on the SIGHAN benchmark dataset demonstrate that the pro-posed model outperforms baseline approaches across various metrics and datasets,confirming its effectiveness and feasibility.
基金the National Natural Science Foundation of China(NNSFC)(Grant Nos.72001213 and 72301292)the National Social Science Fund of China(Grant No.19BGL297)the Basic Research Program of Natural Science in Shaanxi Province(Grant No.2021JQ-369).
文摘Due to the time-varying topology and possible disturbances in a conflict environment,it is still challenging to maintain the mission performance of flying Ad hoc networks(FANET),which limits the application of Unmanned Aerial Vehicle(UAV)swarms in harsh environments.This paper proposes an intelligent framework to quickly recover the cooperative coveragemission by aggregating the historical spatio-temporal network with the attention mechanism.The mission resilience metric is introduced in conjunction with connectivity and coverage status information to simplify the optimization model.A spatio-temporal node pooling method is proposed to ensure all node location features can be updated after destruction by capturing the temporal network structure.Combined with the corresponding Laplacian matrix as the hyperparameter,a recovery algorithm based on the multi-head attention graph network is designed to achieve rapid recovery.Simulation results showed that the proposed framework can facilitate rapid recovery of the connectivity and coverage more effectively compared to the existing studies.The results demonstrate that the average connectivity and coverage results is improved by 17.92%and 16.96%,respectively compared with the state-of-the-art model.Furthermore,by the ablation study,the contributions of each different improvement are compared.The proposed model can be used to support resilient network design for real-time mission execution.
基金The National Natural Science Foundation of China(No.61571106,61633013,61673108,81871444).
文摘To fully make use of information from different representation subspaces,a multi-head attention-based long short-term memory(LSTM)model is proposed in this study for speech emotion recognition(SER).The proposed model uses frame-level features and takes the temporal information of emotion speech as the input of the LSTM layer.Here,a multi-head time-dimension attention(MHTA)layer was employed to linearly project the output of the LSTM layer into different subspaces for the reduced-dimension context vectors.To provide relative vital information from other dimensions,the output of MHTA,the output of feature-dimension attention,and the last time-step output of LSTM were utilized to form multiple context vectors as the input of the fully connected layer.To improve the performance of multiple vectors,feature-dimension attention was employed for the all-time output of the first LSTM layer.The proposed model was evaluated on the eNTERFACE and GEMEP corpora,respectively.The results indicate that the proposed model outperforms LSTM by 14.6%and 10.5%for eNTERFACE and GEMEP,respectively,proving the effectiveness of the proposed model in SER tasks.
基金This researchwas funded by the Major Science and Technology Innovation Project of Shandong Province in China(2019JZZY010120).
文摘Fraud cases have been a risk in society and people’s property security has been greatly threatened.In recent studies,many promising algorithms have been developed for social media offensive text recognition as well as sentiment analysis.These algorithms are also suitable for fraudulent phone text recognition.Compared to these tasks,the semantics of fraudulent words are more complex and more difficult to distinguish.Recurrent Neural Networks(RNN),the variants ofRNN,ConvolutionalNeuralNetworks(CNN),and hybrid neural networks to extract text features are used by most text classification research.However,a single network or a simple network combination cannot obtain rich characteristic knowledge of fraudulent phone texts relatively.Therefore,a new model is proposed in this paper.In the fraudulent phone text,the knowledge that can be learned by the model includes the sequence structure of sentences,the correlation between words,the correlation of contextual semantics,the feature of keywords in sentences,etc.The new model combines a bidirectional Long-Short Term Memory Neural Network(BiLSTM)or a bidirectional Gate Recurrent United(BiGRU)and a Multi-Head attention mechanism module with convolution.A normalization layer is added after the output of the final hidden layer.BiLSTM or BiGRU is used to build the encoding and decoding layer.Multi-head attention mechanism module with convolution(MHAC)enhances the ability of the model to learn global interaction information and multi-granularity local interaction information in fraudulent sentences.A fraudulent phone text dataset is produced by us in this paper.The THUCNews data sets and fraudulent phone text data sets are used in experiments.Experiment results show that compared with the baseline model,the proposed model(LMHACL)has the best experiment results in terms of Accuracy,Precision,Recall,and F1 score on the two data sets.And the performance indexes on fraudulent phone text data sets are all above 0.94.
基金Researchers Supporting Project Number(RSPD2024R576),King Saud University,Riyadh,Saudi Arabia。
文摘Cardiovascular disease is the leading cause of death globally.This disease causes loss of heart muscles and is also responsible for the death of heart cells,sometimes damaging their functionality.A person’s life may depend on receiving timely assistance as soon as possible.Thus,minimizing the death ratio can be achieved by early detection of heart attack(HA)symptoms.In the United States alone,an estimated 610,000 people die fromheart attacks each year,accounting for one in every four fatalities.However,by identifying and reporting heart attack symptoms early on,it is possible to reduce damage and save many lives significantly.Our objective is to devise an algorithm aimed at helping individuals,particularly elderly individuals living independently,to safeguard their lives.To address these challenges,we employ deep learning techniques.We have utilized a vision transformer(ViT)to address this problem.However,it has a significant overhead cost due to its memory consumption and computational complexity because of scaling dot-product attention.Also,since transformer performance typically relies on large-scale or adequate data,adapting ViT for smaller datasets is more challenging.In response,we propose a three-in-one steam model,theMulti-Head Attention Vision Hybrid(MHAVH).Thismodel integrates a real-time posture recognition framework to identify chest pain postures indicative of heart attacks using transfer learning techniques,such as ResNet-50 and VGG-16,renowned for their robust feature extraction capabilities.By incorporatingmultiple heads into the vision transformer to generate additional metrics and enhance heart-detection capabilities,we leverage a 2019 posture-based dataset comprising RGB images,a novel creation by the author that marks the first dataset tailored for posture-based heart attack detection.Given the limited online data availability,we segmented this dataset into gender categories(male and female)and conducted testing on both segmented and original datasets.The training accuracy of our model reached an impressive 99.77%.Upon testing,the accuracy for male and female datasets was recorded at 92.87%and 75.47%,respectively.The combined dataset accuracy is 93.96%,showcasing a commendable performance overall.Our proposed approach demonstrates versatility in accommodating small and large datasets,offering promising prospects for real-world applications.
基金funded by the Major Humanities and Social Sciences Research Projects in Zhejiang higher education institutions,grant number 2023QN082,awarded to Cheng ZhaoThe National Natural Science Foundation of China also provided funding,grant number 61902349,awarded to Cheng Zhao.
文摘The present study examines the impact of short-term public opinion sentiment on the secondary market,with a focus on the potential for such sentiment to cause dramatic stock price fluctuations and increase investment risk.The quantification of investment sentiment indicators and the persistent analysis of their impact has been a complex and significant area of research.In this paper,a structured multi-head attention stock index prediction method based adaptive public opinion sentiment vector is proposed.The proposedmethod utilizes an innovative approach to transform numerous investor comments on social platforms over time into public opinion sentiment vectors expressing complex sentiments.It then analyzes the continuous impact of these vectors on the market through the use of aggregating techniques and public opinion data via a structured multi-head attention mechanism.The experimental results demonstrate that the public opinion sentiment vector can provide more comprehensive feedback on market sentiment than traditional sentiment polarity analysis.Furthermore,the multi-head attention mechanism is shown to improve prediction accuracy through attention convergence on each type of input information separately.Themean absolute percentage error(MAPE)of the proposedmethod is 0.463%,a reduction of 0.294% compared to the benchmark attention algorithm.Additionally,the market backtesting results indicate that the return was 24.560%,an improvement of 8.202% compared to the benchmark algorithm.These results suggest that themarket trading strategy based on thismethod has the potential to improve trading profits.
基金This work was supported by the National Natural Science Foundation of China(Grant No.U1811262).
文摘Automatic extraction of the patient’s health information from the unstructured data concerning the discharge summary remains challenging.Discharge summary related documents contain various aspects of the patient health condition to examine the quality of treatment and thereby help improve decision-making in the medical field.Using a sentiment dictionary and feature engineering,the researchers primarily mine semantic text features.However,choosing and designing features requires a lot of manpower.The proposed approach is an unsupervised deep learning model that learns a set of clusters embedded in the latent space.A composite model including Active Learning(AL),Convolutional Neural Network(CNN),BiGRU,and Multi-Attention,called ACBMA in this research,is designed to measure the quality of treatment based on discharge summaries text sentiment detection.CNN is utilized for extracting the set of local features of text vectors.Then BiGRU network was utilized to extract the text’s global features to solve the issues that a single CNN cannot obtain global semantic information and the traditional Recurrent Neural Network(RNN)gradient disappearance.Experiments prove that the ACBMA method can demonstrate the effectiveness of the suggested method,achieve comparable results to state-of-arts methods in sentiment detection,and outperform them with accurate benchmarks.Finally,several algorithm studies ultimately determined that the ACBMA method is more precise for discharge summaries sentiment analysis.
文摘Accurate traffic prediction is crucial for an intelligent traffic system (ITS). However, the excessive non-linearity and complexity of the spatial-temporal correlation in traffic flow severely limit the prediction accuracy of most existing models, which simply stack temporal and spatial modules and fail to capture spatial-temporal features effectively. To improve the prediction accuracy, a multi-head attention spatial-temporal graph neural network (MSTNet) is proposed in this paper. First, the traffic data is decomposed into unique time spans that conform to positive rules, and valuable traffic node attributes are mined through an adaptive graph structure. Second, time and spatial features are captured using a multi-head attention spatial-temporal module. Finally, a multi-step prediction module is used to achieve future traffic condition prediction. Numerical experiments were conducted on an open-source dataset, and the results demonstrate that MSTNet performs well in spatial-temporal feature extraction and achieves more positive forecasting results than the baseline methods.
基金supported in part by the Natural Science Foundation of China under Grant 61972169 and U1536203in part by the National key research and developm program of China(2016QY01W0200)in part by the Major Scientic and Technological Project of Hubei Province(2018AAA068 and 2019AAA051).
文摘The majority of existing graph-network-based few-shot models focus on a node-similarity update mode.The lack of adequate information intensies the risk of overtraining.In this paper,we propose a novel Multihead Attention Graph Network to excavate discriminative relation and fulll effective information propagation.For edge update,the node-level attention is used to evaluate the similarities between the two nodes and the distributionlevel attention extracts more in-deep global relation.The cooperation between those two parts provides a discriminative and comprehensive expression for edge feature.For node update,we embrace the label-level attention to soften the noise of irrelevant nodes and optimize the update direction.Our proposed model is veried through extensive experiments on two few-shot benchmark MiniImageNet and CIFAR-FS dataset.The results suggest that our method has a strong capability of noise immunity and quick convergence.The classication accuracy outperforms most state-of-the-art approaches.
基金sponsored by the Belt and Road Innovative Cooperation Project(BZ2022003).
文摘Due to its high-temperature and high-pressure operating environment,food/feed puffing machines are prone to faults such as cavity blockage and cutter wear.This paper presents the design of a fault diagnosis system for puffing machines(food/feed processing equipment that expands or puffs agricultural products),based on a convolutional neural network and a multi-head attention mechanism model,which incorporates Bayesian optimization.The system combines multi-source information fusion,capturing patterns and characteristics associated with fault states by monitoring various sources of information,such as temperature,noise signals,main motor current and vibration signals from key components.Hyperparameters are optimized through Bayesian optimization to obtain the optimal parameter model.The integration of convolutional neural networks and multi-head attention mechanisms enables the simultaneous capture of both local and global information,thereby enhancing data comprehension.Experimental results demonstrate that the system successfully diagnoses puffing machine faults,achieving an average recognition accuracy of 98.8%across various operating conditions,highlighting its high accuracy,generalization ability and robustness.
文摘Brain tumors pose a singularly formidable threat in contemporary healthcare due to their diverse histological profiles and unpredictable clinical behavior.Their spectrum ranges from slow-growing benign tumors to highly aggressive malignancies in sensitive anatomical locations.This necessitates an intensified focus on their path-ophysiology and demands precise characterization for patient-specific therapeutic solutions.Techniques to correctly identify brain tumors using artificial intelligence are often employed for addressing segmentation and detection tasks;however,the lack of generalizable results hinders medical practitioners from incorporating them into the diagnostic process.Predominantly reliant on Magnetic Resonance Imaging,research on other imaging methods like Positron Emission Tomography&Computed Tomography,is scarce due to a dearth of open-access datasets.Our study proposes a robust MBTC-Net framework by leveraging EfficientNetV2B0 for extracting high-dimensional feature maps,followed by reshaping into sequences and applying multi-head attention to capture contextual dependencies.After reintroducing the attention output into a spatial structure,we perform average pooling before transitioning to dense layers,enhanced with batch normalization and dropout.The model is fine-tuned with the Adamax optimizer to classify various kinds of brain tumors using softmax from T1-weighted,T1 Contrast-Enhanced,&T2-weighted MRI sequences and CT scans.To reduce the risk of overfitting,measures such as stratified 5-fold cross-validation have been extensively implemented across 3 open-access Kaggle datasets,obtaining 97.54%(15-class),97.97%(6-class),and 99.34%(2-class)accuracies,respectively.We have also applied Grad-CAM to decipher and visually analyze the predictions made by this framework.This research underscores the need for multimodal training of CT scans and MRI sequences for deploying a sturdy framework in real-time environments and advancing the well-being of patients.