Funding: supported by the National Natural Science Foundation of China (No. 52474435) and the China Baowu Low Carbon Metallurgy Innovation Foundation (BWLCF202307).
Abstract: Accurate forecasting of blast furnace gas (BFG) production is an essential prerequisite for reasonable energy scheduling and management to reduce carbon emissions. The coupled forecasting of BFG generation and consumption dynamics was taken as the research object. A multi-task learning (MTL) method for BFG forecasting was proposed, which integrated a coupling correlation coefficient (CCC) and an inverted transformer structure. The CCC method enhances key information extraction by establishing relationships between multiple prediction targets and relevant factors, while MTL effectively captures the inherent correlations between BFG generation and consumption. Finally, a real-world case study was conducted to compare the proposed model with four benchmark models. Results indicated a 33.37% reduction in average mean absolute percentage error, reaching 1.92%, with a computational time of 76 s. A sensitivity analysis of hyperparameters such as the learning rate, batch size, and number of units in the long short-term memory layer highlights the importance of hyperparameter tuning.
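The headline error metric above, mean absolute percentage error (MAPE), can be illustrated with a minimal sketch; the generation values and forecasts below are hypothetical, not data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical BFG generation values and model forecasts.
actual = [310.0, 298.5, 305.2, 312.8]
forecast = [307.1, 301.4, 303.9, 309.6]
print(round(mape(actual, forecast), 2))  # 0.84
```

The reported 1.92% figure is this quantity averaged over the study's test horizon.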
Funding: supported by the Fundamental Research Funds for the Central Universities (2024JKF13) and the Beijing Municipal Education Commission General Program of Science and Technology (No. KM202414019003).
Abstract: With the widespread use of SMS (Short Message Service), the proliferation of malicious SMS has emerged as a pressing societal issue. While deep learning-based text classifiers offer promise, they often exhibit suboptimal performance in fine-grained detection tasks, primarily due to imbalanced datasets and insufficient model representation capabilities. To address this challenge, this paper proposes an LLM-enhanced graph fusion dual-stream Transformer model for fine-grained Chinese malicious SMS detection. During the data processing stage, Large Language Models (LLMs) are employed for data augmentation, mitigating dataset imbalance. In the data input stage, both word-level and character-level features are utilized as model inputs, enriching the feature set and preventing information loss. A dual-stream Transformer serves as the backbone network in the representation learning stage, complemented by a graph-based feature fusion mechanism. At the output stage, both a supervised cross-entropy classification loss and a supervised contrastive learning loss are used as multi-task optimization objectives, further enhancing the model's feature representation. Experimental results demonstrate that the proposed method significantly outperforms baselines on a publicly available Chinese malicious SMS dataset.
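As an illustrative aside (not the paper's code), inverse-frequency class weights are one common way to counter the dataset imbalance that motivates the LLM-based augmentation; the label distribution below is hypothetical.

```python
from collections import Counter

# Toy fine-grained SMS labels with a heavy "normal" majority (hypothetical).
labels = ["fraud", "fraud", "gambling", "normal", "normal", "normal", "normal", "normal"]
counts = Counter(labels)
n, k = len(labels), len(counts)

# weight(c) = n / (k * count(c)): rare classes receive larger weights,
# so a weighted cross-entropy loss penalizes their misclassification more.
weights = {c: n / (k * counts[c]) for c in counts}
print(weights)
```

Such weights are typically passed into the cross-entropy loss; augmentation, as in the paper, attacks the same problem at the data level instead.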
Funding: supported by the People's Public Security University of China central basic scientific research business program (No. 2021JKF206).
Abstract: Traffic characterization (e.g., chat, video) and application identification (e.g., FTP, Facebook) are two of the most crucial tasks in encrypted network traffic classification. These two activities are typically carried out separately by existing systems using separate models, significantly adding to the difficulty of network administration. Convolutional Neural Networks (CNNs) and Transformers are deep learning-based approaches to network traffic classification. A CNN is good at extracting local features while ignoring long-distance information in the network traffic sequence, whereas a Transformer can capture long-distance feature dependencies while ignoring local details. Based on these characteristics, a multi-task learning model that combines a Transformer and a 1D-CNN for encrypted traffic classification (MTC) is proposed. To compensate for the Transformer's lack of local detail feature extraction capability and the 1D-CNN's tendency to ignore long-distance correlation information when processing traffic sequences, the model uses a parallel structure in which the features generated by the Transformer block and the 1D-CNN block are fused with each other by a feature fusion block. This structure improves the feature representation of both blocks and allows the model to perform well on both long and short sequences. The model handles multiple tasks simultaneously, which lowers the cost of training. Experiments reveal that on the ISCX VPN-nonVPN dataset, the model achieves an average F1 score of 98.25% and an average recall of 98.30% for application identification, and an average F1 score of 97.94% and an average recall of 97.54% for traffic characterization. When advanced models on the same dataset are chosen for comparison, the model produces the best results. To demonstrate generalization, we applied MTC to the CICIDS2017 dataset, where it also achieved good results.
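The averaged F1 and recall figures quoted above are per-class (macro) averages; a minimal sketch of how such figures are computed, using toy labels rather than the ISCX data:

```python
def macro_f1_recall(y_true, y_pred, classes):
    """Macro-averaged F1 and recall over the given classes."""
    f1s, recs = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
        recs.append(rec)
    return sum(f1s) / len(classes), sum(recs) / len(classes)

# Hypothetical two-class traffic-characterization labels.
y_true = ["chat", "chat", "video", "video", "video", "chat"]
y_pred = ["chat", "video", "video", "video", "video", "chat"]
f1, rec = macro_f1_recall(y_true, y_pred, ["chat", "video"])
```

Macro averaging weights every class equally, which matters when traffic types are imbalanced.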
Funding: This work was supported by National Research Foundation of Korea (NRF) Grants (Nos. 2018R1A5A7059549 and 2020R1A2C1014037) and by an Institute of Information & Communications Technology Planning & Evaluation (IITP) Grant (No. 2020-0-01373) funded by the Korea government (MSIT: Ministry of Science and Information & Communication Technology).
Abstract: Much as humans focus solely on object movement to understand actions, directing a deep learning model's attention to the core contexts within videos is crucial for improving video comprehension. In a recent study, the Video Masked Auto-Encoder (VideoMAE) employed a pre-training approach with a high ratio of tube masking and reconstruction, effectively mitigating the spatial bias caused by temporal redundancy in full video frames. This steers the model's focus toward detailed temporal contexts. However, as VideoMAE still relies on full video frames during the action recognition stage, it may exhibit a progressive shift in attention towards spatial contexts, deteriorating its ability to capture the main spatio-temporal contexts. To address this issue, we propose an attention-directing module named the Transformer Encoder Attention Module (TEAM). The proposed module effectively directs the model's attention to the core characteristics within each video, inherently mitigating spatial bias. TEAM first identifies the core features among the features extracted from each video. It then discerns the specific parts of the video where those features are located, encouraging the model to focus more on these informative parts. Consequently, during the action recognition stage, TEAM effectively shifts VideoMAE's attention from spatial contexts towards the core spatio-temporal contexts. This attention-shifting mechanism alleviates the spatial bias in the model and simultaneously enhances its ability to capture precise video contexts. We conduct extensive experiments to explore the optimal configuration that enables TEAM to fulfill its intended design purpose and facilitates its seamless integration with the VideoMAE framework. The integrated model, i.e., VideoMAE + TEAM, outperforms the existing VideoMAE by a significant margin on Something-Something-V2 (71.3% vs. 70.3%). Moreover, qualitative comparisons demonstrate that TEAM encourages the model to disregard insignificant features and focus more on the essential video features, capturing more detailed spatio-temporal contexts within the video.
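The high-ratio tube masking that VideoMAE's pre-training relies on can be sketched as follows; the 14x14 patch grid, 16 frames, and 90% ratio are illustrative assumptions, not values taken from this paper.

```python
import numpy as np

# Tube masking: one 2D spatial mask is sampled and repeated across all
# frames, so the same patch positions are hidden in every frame.
rng = np.random.default_rng(0)
frames, patches, ratio = 16, 196, 0.9   # 16 frames, 14x14 patches, 90% masked
n_masked = int(patches * ratio)

spatial = np.zeros(patches, dtype=bool)
spatial[rng.choice(patches, n_masked, replace=False)] = True
mask = np.tile(spatial, (frames, 1))    # shape (frames, patches)
```

Repeating one spatial pattern through time prevents the model from trivially copying a visible patch from a neighbouring frame, which forces it to learn temporal context.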
Funding: This research is supported financially by the Natural Science Foundation of China (Grant Nos. 51505234, 51405241, and 51575283).
Abstract: With the rapid development of mechanical equipment, the field of mechanical health monitoring has entered the era of big data. Deep learning has achieved great success in processing large volumes of image and speech data owing to its powerful modeling capabilities, and this has also influenced the field of mechanical fault diagnosis. Therefore, considering the characteristics of motor vibration signals (nonstationary and difficult to process) and mechanical 'big data', and combining them with deep learning, a motor fault diagnosis method based on a stacked de-noising auto-encoder (SDAE) is proposed. The frequency-domain signals obtained by the Fourier transform are used as input to the network. This method extracts features adaptively and without supervision, removing the dependence of traditional machine learning methods on hand-crafted features. A supervised fine-tuning of the model is then carried out by backpropagation. An asynchronous motor in a Drivetrain Dynamics Simulator system was taken as the research object; the effectiveness of the proposed method was verified on a large amount of data, and the network outputs were visualized. The results show that the SDAE method is more efficient and more intelligent.
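The preprocessing step described above (feeding Fourier-transformed vibration signals to the network) can be sketched with a naive, stdlib-only DFT; the signal is synthetic, not motor data.

```python
import math

def dft_magnitude(signal):
    """Naive DFT magnitude spectrum (first half of the bins), stdlib only."""
    n = len(signal)
    mags = []
    for k in range(n // 2):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im) / n)
    return mags

# A 64-sample sine at frequency bin 5: the spectrum peaks at index 5.
x = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
spec = dft_magnitude(x)
print(spec.index(max(spec)))  # 5
```

In practice an FFT (e.g., `numpy.fft.rfft`) would replace this O(n^2) loop; the point is that periodic fault signatures concentrate into a few frequency bins, giving the auto-encoder a more compact input than the raw time series.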
Funding: Pre-Research Project of the National Natural Science Foundation of China supported by Southeast University (No. XJ0605227).
Abstract: A number recognition system with a graphical user interface (GUI) is implemented on an embedded development platform using the fuzzy pattern recognition method. The application programming interface (API) of uC/OS-II is used to implement multi-task concurrency and communication among tasks. Handwriting input is implemented by improving the interface provided by the platform, and fuzzy pattern recognition based on fuzzy theory is used to analyze the handwritten input. A prototype system for testing is implemented; it can receive and analyze user input from both the keyboard and the touch screen. The experimental results show that the embedded fuzzy recognition system, which integrates two modes of fuzzy recognition, can retain a high recognition rate while reducing hardware requirements.
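A minimal sketch of the maximum-membership idea behind fuzzy pattern recognition: each class has a prototype feature vector, and an input is assigned to the class with the highest fuzzy membership. The prototypes, features, and the 1-minus-normalized-distance membership function below are toy stand-ins, not the system's actual templates.

```python
def membership(x, prototype):
    """Fuzzy membership as 1 - mean absolute difference; features in [0, 1]."""
    d = sum(abs(a - b) for a, b in zip(x, prototype)) / len(x)
    return 1.0 - min(d, 1.0)

# Toy 3-feature prototypes for two digit classes (hypothetical).
prototypes = {0: [0.9, 0.1, 0.9], 1: [0.1, 0.9, 0.1]}
sample = [0.8, 0.2, 0.85]

# Maximum-membership decision rule.
best = max(prototypes, key=lambda c: membership(sample, prototypes[c]))
print(best)  # 0
```

The appeal on embedded hardware is that this rule needs only a few arithmetic operations per class, with no floating-point-heavy model inference.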
Abstract: Driver steering intention prediction provides an augmented solution to the design of an onboard collaboration mechanism between the human driver and an intelligent vehicle. In this study, a multi-task sequential learning framework is developed to predict future steering torques and steering postures based on upper-limb neuromuscular electromyography signals. The joint representation learning of driving postures and steering intention provides an in-depth understanding and accurate modelling of steering behaviours. Regarding different testing scenarios, two driving modes, namely both-hand and single-right-hand modes, are studied. For each driving mode, three different driving postures are further evaluated. Next, a multi-task time-series transformer network (MTS-Trans) is developed to predict the future steering torques and driving postures based on the multivariate sequential input and the self-attention mechanism. To evaluate the multi-task learning performance and information-sharing characteristics within the network, four distinct two-branch network architectures are compared. Empirical validation is conducted through a driving simulator-based experiment encompassing 21 participants. The proposed model achieves accurate predictions of future steering torque as well as driving posture recognition for both the both-hand and single-hand driving modes. These findings hold significant promise for the advancement of driver steering assistance systems, fostering mutual comprehension and synergy between human drivers and intelligent vehicles.
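The self-attention mechanism at the heart of MTS-Trans can be sketched in a few lines; the shapes and random weights are illustrative, not the paper's configuration.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a sequence x."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise step affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))                      # 6 time steps, 4 EMG channels (toy)
wq, wk, wv = (rng.standard_normal((4, 4)) for _ in range(3))
out, attn = self_attention(x, wq, wk, wv)
```

Each output step is a weighted mixture of every input step, which is what lets the network relate muscle activity early in a window to a steering torque predicted later in it.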
Funding: supported by the National Natural Science Foundation of China (Grant No. 62106101) and the Natural Science Foundation of Jiangsu Province (Grant No. BK20210180).
Abstract: Transformers have dominated the field of natural language processing and have recently made an impact in the area of computer vision. In the field of medical image analysis, transformers have also been successfully applied to full-stack clinical applications, including image synthesis/reconstruction, registration, segmentation, detection, and diagnosis. This paper aims to promote awareness of the applications of transformers in medical image analysis. Specifically, we first provide an overview of the core concepts of the attention mechanism built into transformers and other basic components. Second, we review various transformer architectures tailored for medical image applications and discuss their limitations. Within this review, we investigate key challenges, including the use of transformers in different learning paradigms, improving model efficiency, and coupling with other techniques. We hope this review will provide a comprehensive picture of transformers to readers with an interest in medical image analysis.
Funding: supported by the National Natural Science Foundation of China (Grant No. 20210333).
Abstract: Artificial intelligence (AI) can potentially improve the reliability of transformer protection by fusing multiple features. However, owing to the scarcity of inrush-current and internal-fault data, existing methods suffer from poor generalizability. In this paper, a denoising-classification neural network (DCNN) is proposed, which integrates a convolutional auto-encoder (CAE) and a convolutional neural network (CNN) and is used to develop a reliable transformer protection scheme by identifying the exciting voltage-differential current curve (VICur). In the DCNN, the CAE shares its encoder with the CNN, which combines the encoder with a classifier. Through the interaction of the CAE reconstruction process and the CNN classification process, the CAE treats the saturated features of the VICur as noise and removes them accurately. Consequently, it guides the CNN to focus on the unsaturated features of the VICur. The unsaturated part of the VICur approximates an ellipse, which significantly differentiates a healthy transformer from a faulty one. Therefore, the unsaturated features extracted by the CNN help to decrease the data ergodicity requirement of AI and improve generalizability. Finally, a CNN trained by the DCNN is used to develop a protection scheme. PSCAD simulations and dynamic model experiments verify its superior performance.
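The weight-sharing arrangement described above (the CAE and the CNN using one encoder) can be sketched structurally; dense layers stand in for the convolutional ones, and all shapes and data are illustrative, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# One shared encoder feeds two heads: a decoder (reconstruction branch,
# as in the CAE) and a classifier (healthy vs. faulty, as in the CNN).
enc = rng.standard_normal((8, 3))    # shared encoder weights: 8 features -> 3-d code
dec = rng.standard_normal((3, 8))    # reconstruction head
clf = rng.standard_normal((3, 2))    # classification head

x = rng.standard_normal((5, 8))      # 5 hypothetical VICur feature vectors
z = np.tanh(x @ enc)                 # both branches consume the same code z
recon, logits = z @ dec, z @ clf
```

Because gradients from both the reconstruction loss and the classification loss flow into `enc`, the denoising objective shapes the very features the classifier sees, which is the mechanism the DCNN exploits.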