Cyberbullying on social media poses significant psychological risks, yet most detection systems oversimplify the task by focusing on binary classification, ignoring nuanced categories such as passive-aggressive remarks or indirect slurs. To address this gap, we propose a hybrid framework combining Term Frequency-Inverse Document Frequency (TF-IDF), word-to-vector (Word2Vec), and Bidirectional Encoder Representations from Transformers (BERT)-based models for multi-class cyberbullying detection. Our approach integrates TF-IDF for lexical specificity and Word2Vec for semantic relationships, fused with BERT's contextual embeddings to capture syntactic and semantic complexities. We evaluate the framework on a publicly available dataset of 47,000 annotated social media posts across five cyberbullying categories: age, ethnicity, gender, religion, and indirect aggression. Among the BERT variants tested, BERT Base Uncased achieved the highest performance, with 93% accuracy (standard deviation of ±1% across 5-fold cross-validation) and an average AUC of 0.96, outperforming standalone TF-IDF (78%) and Word2Vec (82%) models. Notably, it achieved near-perfect AUC scores (0.99) for age- and ethnicity-based bullying. A comparative analysis with state-of-the-art benchmarks, including Generative Pre-trained Transformer 2 (GPT-2) and Text-to-Text Transfer Transformer (T5) models, highlights BERT's superiority in handling ambiguous language. This work advances cyberbullying detection by demonstrating how hybrid feature extraction and transformer models improve multi-class classification, offering a scalable solution for moderating nuanced harmful content.
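The hybrid feature idea in the cyberbullying abstract above can be illustrated with a minimal sketch. It assumes that fusion means simple concatenation of a TF-IDF vector with dense embedding vectors; the abstract does not state the fusion operator, so this is an assumption. The function names, the smoothed IDF formula, and the placeholder embedding values are hypothetical and stand in for real Word2Vec and BERT outputs.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) with one TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (an assumed variant; other definitions are common).
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        length = len(toks) or 1  # guard against empty documents
        vectors.append([counts[tok] / length * idf[tok] for tok in vocab])
    return vocab, vectors


def fuse(sparse_vec, *dense_vecs):
    """Concatenate a TF-IDF vector with any number of dense embeddings."""
    fused = list(sparse_vec)
    for vec in dense_vecs:
        fused.extend(vec)
    return fused


docs = ["you are so old", "nice photo", "old people are slow"]
vocab, vecs = tfidf_vectors(docs)

# Placeholder dense vectors (hypothetical values, not real model outputs).
word2vec_like = [0.1, -0.2]
bert_like = [0.3, 0.0, 0.5]

fused = fuse(vecs[0], word2vec_like, bert_like)
print(len(vocab), len(fused))
```

In this toy corpus the token "so" appears in one document and "old" in two, so "so" receives the larger TF-IDF weight in the first document. The fused vector would then be passed to a multi-class classifier.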
AlphaPanda (AlphaFold2 [1]-inspired protein-specific antibody design in a diffusional manner) is an advanced algorithm for designing the complementarity-determining regions (CDRs) of an antibody targeting a specific epitope, combining transformer [2] models, 3DCNN [3], and diffusion [4] generative models.
In this paper, the small-signal modeling of the Indium Phosphide High Electron Mobility Transistor (InP HEMT) based on the Transformer neural network model is investigated. The AC S-parameters of the HEMT device are trained and validated using the Transformer model. In the proposed model, eight Transformer encoder layers are connected in series, and each encoder layer consists of a multi-head attention layer and a feed-forward neural network layer. The experimental results show that the measured and modeled S-parameters of the HEMT device match well in the frequency range of 0.5-40 GHz, with errors across frequency below 1%. Compared with other models, the proposed model achieves good accuracy, which verifies its effectiveness.
In dynamic 5G network environments, user mobility and heterogeneous network topologies pose dual challenges to improving the performance of mobile edge caching. Existing studies often overlook the dynamic nature of user locations and the potential of device-to-device (D2D) cooperative caching, limiting the reduction of transmission latency. To address this issue, this paper proposes a joint optimization scheme for edge caching that integrates user mobility prediction with deep reinforcement learning. First, a Transformer-based geolocation prediction model is designed, leveraging multi-head attention mechanisms to capture correlations in historical user trajectories for accurate future location prediction. Then, within a three-tier heterogeneous network, we formulate a latency minimization problem under a D2D cooperative caching architecture and develop a mobility-aware Deep Q-Network (DQN) caching strategy. This strategy takes predicted location information as state input and dynamically adjusts the content distribution across small base stations (SBSs) and mobile users (MUs) to reduce end-to-end delay in multi-hop content retrieval. Simulation results show that the proposed DQN-based method outperforms other baseline strategies across various metrics, achieving a 17.2% reduction in transmission delay compared to DQN methods without mobility integration, thus validating the effectiveness of the joint optimization of location prediction and caching decisions.
The identification of ore grades is a critical step in mineral resource exploration and mining. Prompt gamma neutron activation analysis (PGNAA) technology employs gamma rays generated by the nuclear reactions between neutrons and samples to achieve the qualitative and quantitative detection of sample components. In this study, we present a novel method for identifying copper grade by combining the vision transformer (ViT) model with the PGNAA technique. First, a Monte Carlo simulation is employed to determine the optimal sizes of the neutron moderator and thermal neutron absorption material and the dimensions of the device. Subsequently, based on the parameters obtained through optimization, a PGNAA copper ore measurement model is established. The gamma spectrum of the copper ore is analyzed using the ViT model, whose hyperparameters are optimized using a grid search. To ensure the reliability of the identification results, the test results are obtained through five repeated tenfold cross-validations. Long short-term memory and convolutional neural network models are compared with the ViT method. The results indicate that the ViT method is efficient in identifying copper ore grades, with average accuracy, precision, recall, F1 score, and F1(-) score values of 0.9795, 0.9637, 0.9614, 0.9625, and 0.9942, respectively. When identifying associated minerals, the ViT model can identify Pb, Zn, Fe, and Co minerals with identification accuracies of 0.9215, 0.9396, 0.9966, and 0.8311, respectively.
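The precision, recall, and F1 values reported in the copper-grade abstract above are related by the standard harmonic-mean definition. The sketch below uses the usual textbook formulas; the function names and the toy confusion counts are hypothetical, and the abstract does not say how the averages over cross-validation folds were formed.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion counts for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)


# Toy counts (hypothetical): 8 true positives, 2 false positives, 2 false negatives.
print(precision_recall_f1(8, 2, 2))

# Harmonic mean of the averaged precision and recall quoted in the abstract.
print(round(f1_from(0.9637, 0.9614), 4))
```

The harmonic mean of the quoted average precision and recall rounds to 0.9625, which agrees with the reported F1 score. In general an average of per-fold F1 scores need not equal the F1 of the averaged precision and recall, so the agreement here is a consistency check rather than a derivation.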
Deep learning (DL) has become a crucial technique for predicting the El Niño-Southern Oscillation (ENSO) and evaluating its predictability. While various DL-based models have been developed for ENSO predictions, many fail to capture the coherent multivariate evolution within the coupled ocean-atmosphere system of the tropical Pacific. To address this limitation and represent ENSO-related ocean-atmosphere interactions more accurately, a novel three-dimensional (3D) multivariate prediction model was proposed based on a Transformer architecture that incorporates a spatiotemporal self-attention mechanism. This model, named 3D-Geoformer, offers several advantages, enabling accurate ENSO predictions up to one and a half years in advance. Furthermore, an integrated gradient method was introduced into the model to identify the sources of predictability for sea surface temperature (SST) variability in the eastern equatorial Pacific. Results reveal that the 3D-Geoformer effectively captures ENSO-related precursors during the evolution of ENSO events, particularly the thermocline feedback processes and ocean temperature anomaly pathways on and off the equator. By extending DL-based ENSO predictions from one-dimensional Niño time series to 3D multivariate fields, the 3D-Geoformer represents a significant advancement in ENSO prediction. This study provides details on the model formulation, analysis procedures, sensitivity experiments, and illustrative examples, offering practical guidance for the application of the model in ENSO research.
Short-term traffic flow prediction plays a crucial role in the planning of intelligent transportation systems. Monitoring devices on urban road networks now generate large amounts of traffic flow data, which contain road network traffic information with high application value. In this study, an improved spatio-temporal attention transformer model (ISTA-transformer model) is proposed to provide a more accurate method for predicting multi-step short-term traffic flow based on monitoring data. By embedding a temporal attention layer and a spatial attention layer, the model learns the relationship between traffic flows at different time intervals and different geographic locations, and achieves more accurate multi-step short-term flow prediction. Finally, we validate the model with monitoring data spanning 15 days from 620 monitoring points in Qingdao, China. Across the four prediction time steps, the MAPE (Mean Absolute Percentage Error) values of the ISTA-transformer's predictions are 0.22, 0.29, 0.37, and 0.38, respectively, and its prediction accuracy is usually better than that of six baseline models (Transformer, GRU, CNN, LSTM, Seq2Seq, and LightGBM), which indicates that the proposed model better accounts for the prediction results across time steps in multi-step prediction.
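The per-step MAPE values quoted in the traffic-flow abstract above can be computed as in the following minimal sketch. It assumes MAPE is reported as a fraction rather than a percentage, which the abstract does not state. The function names and the sample flow values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # A zero in `actual` raises ZeroDivisionError; MAPE is undefined there.
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def multi_step_mape(actual_steps, predicted_steps):
    """Compute one MAPE value per forecast horizon."""
    return [mape(a, p) for a, p in zip(actual_steps, predicted_steps)]


# Toy flows (hypothetical) for two horizons at two monitoring points.
actual_steps = [[100.0, 200.0], [100.0, 200.0]]
predicted_steps = [[110.0, 180.0], [120.0, 160.0]]
print(multi_step_mape(actual_steps, predicted_steps))
```

Reporting one value per horizon, as the abstract does, shows how the error grows as the forecast reaches further ahead.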
Enhancing low-light images with color distortion and uneven multi-light source distribution presents challenges. Most advanced methods for low-light image enhancement are based on the Retinex model using deep learning. Retinexformer introduces channel self-attention mechanisms in the IG-MSA. However, it fails to effectively capture long-range spatial dependencies, leaving room for improvement. Based on the Retinexformer deep learning framework, we designed the Retinexformer+ network. The "+" signifies our advancements in extracting long-range spatial dependencies. We introduced multi-scale dilated convolutions in illumination estimation to expand the receptive field. These convolutions effectively capture the weakening semantic dependency between pixels as distance increases. In illumination restoration, we used UNet++ with multi-level skip connections to better integrate semantic information at different scales. The designed Illumination Fusion Dual Self-Attention (IF-DSA) module embeds multi-scale dilated convolutions to achieve spatial self-attention. This module captures long-range spatial semantic relationships within acceptable computational complexity. Experimental results on the Low-Light (LOL) dataset show that Retinexformer+ outperforms other State-Of-The-Art (SOTA) methods in both quantitative and qualitative evaluations, with the computational complexity increased to an acceptable 51.63 GFLOPs. On the LOL_v1 dataset, Retinexformer+ shows an increase of 1.15 in Peak Signal-to-Noise Ratio (PSNR) and a decrease of 0.39 in Root Mean Square Error (RMSE). On the LOL_v2_real dataset, the PSNR increases by 0.42 and the RMSE decreases by 0.18. Experimental results on the ExDark dataset show that Retinexformer+ can effectively enhance real-scene images and maintain their semantic information.
The rise of social media platforms has revolutionized communication, enabling the exchange of vast amounts of data through text, audio, images, and videos. These platforms have become critical for sharing opinions and insights, influencing daily habits, and driving business, political, and economic decisions. Text posts are particularly significant, and natural language processing (NLP) has emerged as a powerful tool for analyzing such data. While traditional NLP methods have been effective for structured media, social media content poses unique challenges due to its informal and diverse nature. This has spurred the development of new techniques tailored for processing and extracting insights from unstructured user-generated text. One key application of NLP is the summarization of user comments to manage overwhelming content volumes. Abstractive summarization has proven highly effective in generating concise, human-like summaries, offering clear overviews of key themes and sentiments. This enhances understanding and engagement while reducing cognitive effort for users. For businesses, summarization provides actionable insights into customer preferences and feedback, enabling faster trend analysis, improved responsiveness, and strategic adaptability. By distilling complex data into manageable insights, summarization plays a vital role in improving user experiences and empowering informed decision-making in a data-driven landscape. This paper proposes a new implementation framework by fine-tuning and parameterizing Transformer Large Language Models to manage and maintain linguistic and semantic components in abstractive summary generation. The system excels in transforming large volumes of data into meaningful summaries, as evidenced by its strong performance across metrics like fluency, consistency, readability, and semantic coherence.
Photovoltaic (PV) systems are environmentally friendly, generate green energy, and receive support from policies and organizations. However, weather fluctuations make large-scale PV power integration and management challenging despite the economic benefits. Existing PV forecasting techniques (sequential and convolutional neural networks (CNN)) are sensitive to environmental conditions, reducing energy distribution system performance. To handle these issues, this article proposes an efficient, weather-resilient convolutional-transformer-based network (CT-NET) for accurate and efficient PV power forecasting. The network consists of three main modules. First, the acquired PV generation data are forwarded to the pre-processing module for data refinement. Next, to carry out data encoding, a CNN-based multi-head attention (MHA) module is developed in which a single MHA is used to decode the encoded data. The encoder module is mainly composed of 1D convolutional and MHA layers, which extract local as well as contextual features, while the decoder part includes MHA and feedforward layers to generate the final prediction. Finally, the performance of the proposed network is evaluated using standard error metrics, including the mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). An ablation study and comparative analysis with several competitive state-of-the-art approaches revealed a lower error rate in terms of MSE (0.0471), RMSE (0.2167), and MAPE (0.6135) over publicly available benchmark data. In addition, it is demonstrated that our proposed model is less complex, with the lowest number of parameters (0.0135 M), size (0.106 MB), and inference time (2 ms/step), suggesting that it is easy to integrate into the smart grid.
Micro-expressions are spontaneous, unconscious movements that reveal true emotions. Accurate facial movement information and network training methods are crucial for micro-expression recognition. However, most existing micro-expression recognition techniques focus on modeling a single category of micro-expression images and on neural network structure. To address the low recognition rate and weak model generalization ability in micro-expression recognition, a micro-expression recognition algorithm based on a graph convolution network (GCN) and the Transformer model is proposed. First, action unit (AU) features are extracted, and the facial muscle nodes in the neighborhood are divided into three subsets for recognition. Then, a graph convolution layer is used to learn the dependency structure among AU nodes for micro-expression classification. Finally, the multiple attentional features of each facial action are enriched with the Transformer model to include more sequence information before calculating the overall correlation of each region. The proposed method is validated on the CASME II and CAS(ME)^2 datasets, and the recognition rate reached 69.85%.
Timed abstract state machine (TASM) is a formal specification language used to specify and simulate the behavior of real-time systems. Formal verification of a TASM model can be carried out through model checking by translating the model into UPPAAL. First, the translational semantics from TASM to UPPAAL is presented using the ATLAS transformation language (ATL). Second, the implementation of the proposed model transformation tool, TASM2UPPAAL, is provided. Finally, a case study is given to illustrate the automatic transformation from a TASM model to a UPPAAL model.
The dependence of transformer performance on material properties was investigated using two laboratory-processed 0.23 mm thick grain-oriented electrical steels, domain-refined with electrolytically etched grooves and having different magnetic properties. The iron loss at 1.7 T, 50 Hz and the flux density at 800 A/m of material A were 0.73 W/kg and 1.89 T, respectively; those of material B were 0.83 W/kg and 1.88 T. Model stacked and wound transformer core experiments using the tested materials exhibited performance that closely reflected the material characteristics. In a three-phase stacked core with step-lap joints excited to 1.7 T, 50 Hz, the core loss, the exciting current, and the noise level were 0.86 W/kg, 0.74 A, and 52 dB, respectively, with material A, and 0.97 W/kg, 1.0 A, and 54 dB with material B. The building factors for the core losses of the two materials were almost the same in both core configurations. The effect of higher harmonics on transformer performance was also investigated.
The oil industry is an important part of a country's economy. The price of crude oil is influenced by a wide range of variables. Therefore, how accurately countries can predict its behavior and which predictors to employ are two main questions. In this view, we propose utilizing deep learning and ensemble learning techniques to boost crude oil price forecasting performance. The suggested method is based on a deep learning snapshot ensemble of the Transformer model. To examine the superiority of the proposed model, this paper compares the proposed deep learning ensemble model against different machine learning and statistical models for daily Organization of the Petroleum Exporting Countries (OPEC) oil price forecasting. Experimental results demonstrated that the proposed method outperforms statistical and machine learning methods. More precisely, according to the mean square error metric, the proposed snapshot ensemble of Transformer method achieved relative improvements in forecasting performance over the autoregressive integrated moving average (ARIMA)(1,1,1), ARIMA(0,1,1), autoregressive moving average (ARMA)(0,1), vector autoregression (VAR), random walk (RW), support vector machine (SVM), and random forests (RF) models of 99.94%, 99.62%, 99.87%, 99.65%, 7.55%, 98.38%, and 99.35%, respectively.
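The relative improvements quoted in the crude-oil abstract above can be read through the following minimal sketch. It assumes the common definition of relative improvement as the percentage reduction in mean square error with respect to the baseline; the abstract does not give the formula, so this is an assumption. The function names and the toy error values are hypothetical.

```python
def relative_improvement(baseline_mse, model_mse):
    """Percentage reduction in MSE relative to a baseline (assumed definition)."""
    if baseline_mse <= 0:
        raise ValueError("baseline MSE must be positive")
    return (baseline_mse - model_mse) / baseline_mse * 100.0


def remaining_error_ratio(improvement_pct):
    """Fraction of the baseline MSE that remains after a given improvement."""
    return 1.0 - improvement_pct / 100.0


# Toy values (hypothetical): baseline MSE 2.0, proposed model MSE 0.5.
print(relative_improvement(2.0, 0.5))

# Under this definition, a 99.94% improvement leaves 0.06% of the baseline MSE.
print(remaining_error_ratio(99.94))
```

Read this way, the 7.55% figure against the random walk means the two models have errors of the same order, while the figures near 99% mean the baseline error is several hundred times larger.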
Recent advances in low-cost cameras have facilitated surveillance in various developing towns in India. The video obtained from such surveillance is of low quality. Still, counting vehicles in such videos is necessary to avoid traffic congestion and allows drivers to plan their routes more precisely. On the other hand, detecting vehicles in such low-quality videos is highly challenging with vision-based methodologies. In this research, an attempt is made to use low-quality videos to describe traffic in Salem town in India, which has largely not been attempted in available sources. In this work, the Detection Transformer (DETR) model is used for object (vehicle) detection. Vehicles are predicted in a rush-hour traffic video using a set of loss functions that perform bipartite matching between predicted and ground-truth objects. Every frame in the traffic footage carries a date and time, which are detected and retrieved using Tesseract Optical Character Recognition. The date and time extracted from the input image are combined with the number of detected objects obtained from the DETR model, which yields a timestamped vehicle report. A Transformer Timeseries Prediction Model (TTPM) is proposed to predict future vehicle density; the regular NLP layers have been removed and the temporal encoding layer has been modified. The proposed TTPM outperforms existing models, with an RMSE of 4.313 and an MAE of 3.812.
Architecture analysis and design language (AADL) is an architecture description language standard for embedded real-time systems, and it is widely used in safety-critical applications. Model transformation is one of the methods for facilitating verification and analysis. A synchronous subset of AADL and a general methodology for translating the AADL subset into timed abstract state machine (TASM) were studied. Based on the ATLAS transformation language (ATL) framework, the associated translation tool AADL2TASM was implemented by defining the meta-models of both AADL and TASM and the ATL transformation rules. A case study with property verification of the AADL model was also presented to validate the tool.
Automated and accurate movie genre classification is crucial for content organization, recommendation systems, and audience targeting in the film industry. Although most existing approaches focus on audiovisual features such as trailers and posters, text-based classification remains underexplored despite its accessibility and semantic richness. This paper introduces the Genre Attention Model (GAM), a deep learning architecture that integrates transformer models with a hierarchical attention mechanism to extract and leverage contextual information from movie plots for multi-label genre classification. To assess its effectiveness, we evaluate multiple transformer-based models, including Bidirectional Encoder Representations from Transformers (BERT), A Lite BERT (ALBERT), Distilled BERT (DistilBERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), XLNet, and Decoding-enhanced BERT with Disentangled Attention (DeBERTa). Experimental results demonstrate the superior performance of the DeBERTa-based GAM, which employs a two-tier hierarchical attention mechanism: word-level attention highlights key terms, while sentence-level attention captures critical narrative segments, ensuring a refined and interpretable representation of movie plots. Evaluated on three benchmark datasets (Trailers12K, Large Movie Trailer Dataset-9 (LMTD-9), and MovieLens37K), GAM achieves micro-average precision scores of 83.63%, 83.32%, and 83.34%, respectively, surpassing state-of-the-art models. Additionally, GAM is computationally efficient, requiring just 6.10 Giga Floating Point Operations Per Second (GFLOPS), making it a scalable and cost-effective solution. These results highlight the growing potential of text-based deep learning models in genre classification and GAM's effectiveness in improving predictive accuracy while maintaining computational efficiency. With its robust performance, GAM offers a versatile and scalable framework for content recommendation, film indexing, and media analytics, providing an interpretable alternative to traditional audiovisual-based classification techniques.
The convolutional neural network (CNN) method based on DeepLabv3+ has several problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size for feature extraction, a lack of semantic information, a large decoder upsampling factor, and insufficient detail retention. To address these problems, a hierarchical feature fusion network (HFFNet) was proposed. First, a combination of transformer and CNN architectures was employed for feature extraction from images of varying resolutions. The extracted features were processed independently. Subsequently, the features from the transformer and CNN were fused under the guidance of features from different sources. This fusion process assisted in restoring information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module was designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results showed that HFFNet had superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and its cross-linking index was better than that of DeepLabv3+ and other competing methods, showing strong generalization ability.
This study proposes a virtual healthcare assistant framework designed to provide support in multiple languages for efficient and accurate healthcare assistance. The system employs a transformer model to process sophisticated, multilingual user inputs and gain improved contextual understanding compared to conventional models, including long short-term memory (LSTM) models. In contrast to LSTMs, which process information sequentially and may struggle with long-range dependencies, transformers utilize self-attention to learn relationships among all parts of the input in parallel. This enables them to perform more accurately across languages and contexts, making them well suited for applications such as translation, summarization, and conversation. Comparative evaluations revealed the superiority of the transformer model (accuracy rate: 85%) over the LSTM model (accuracy rate: 65%). The experiments revealed several advantages of the transformer architecture over the LSTM model, such as more effective self-attention, parallel processing, and contextual understanding for better multilingual compatibility. Additionally, our prediction model proved effective for disease diagnosis, with an accuracy of 85% or greater in identifying the relationship between symptoms and diseases across different demographics. The system provides translation support from English to other languages, with the highest Bilingual Evaluation Understudy score for English to French (0.7), followed by English to Hindi (0.6). The lowest Bilingual Evaluation Understudy score was found for English to Telugu (0.39). This virtual assistant can also perform symptom analysis and disease prediction, with output given in the preferred language of the user.
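The contrast with sequential LSTM processing drawn in the healthcare-assistant abstract above rests on self-attention, in which every token attends to every other token in one step. The following is a minimal sketch of scaled dot-product self-attention, not the system described in the abstract. It assumes no learned projections, so queries, keys, and values are all the raw token vectors. The function names and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    tokens: list of equal-length vectors. Returns (outputs, weights).
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this token to every token, including itself.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all value vectors.
        outputs.append([sum(w * v[i] for w, v in zip(row, tokens)) for i in range(dim)])
    return outputs, weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights is computed independently of the others, which is why the rows can be evaluated in parallel. A recurrent model must instead pass state from one position to the next.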
Hepatology encompasses various aspects, such as metabolic-associated fatty liver disease, viral hepatitis, alcoholic liver disease, liver cirrhosis, liver failure, liver tumors, and liver transplantation. The global epidemiological situation of liver diseases is grave, posing a substantial threat to human health and quality of life. Characterized by high incidence and mortality rates, liver diseases have emerged as a prominent global public health concern. In recent years, the rapid advancement of artificial intelligence (AI), deep learning, and radiomics has transformed medical research and clinical practice, demonstrating considerable potential in hepatology. AI is capable of automatically detecting abnormal cells in liver tissue sections, enhancing the accuracy and efficiency of pathological diagnosis. Deep learning models are able to extract features from computed tomography and magnetic resonance imaging images to facilitate liver disease classification. Machine learning models are capable of integrating clinical data to forecast disease progression and treatment responses, thus supporting clinical decision-making for personalized medicine. Through the analysis of imaging data, laboratory results, and genomic information, AI can assist in diagnosis, forecast disease progression, and optimize treatment plans, thereby improving clinical outcomes for liver disease patients. This minireview intends to comprehensively summarize the state-of-the-art theories and applications of AI in hepatology, explore the opportunities and challenges it presents in clinical practice, basic research, and translational medicine, and propose future research directions to guide the advancement of hepatology and ultimately improve patient outcomes.
Funding (multi-class cyberbullying detection paper): funded by the Scientific Research Deanship at the University of Hail, Saudi Arabia, through Project Number RG-23092.
Funding: Supported by the Key Project of International Cooperation of Qilu University of Technology (Grant No.: QLUTGJHZ2018008), the Shandong Provincial Natural Science Foundation Committee, China (Grant No.: ZR2016HB54), and the Shandong Provincial Key Laboratory of Microbial Engineering (SME).
Abstract: AlphaPanda (AlphaFold2 [1] inspired protein-specific antibody design in a diffusional manner) is an advanced algorithm for designing the complementarity-determining regions (CDRs) of an antibody targeting a specific epitope, combining transformer [2] models, 3DCNN [3], and diffusion [4] generative models.
Funding: Supported by the National Natural Science Foundation of China (62201293, 62034003), the Open Foundation of the State Key Laboratory of Millimeter Waves (K202313), and the Jiangsu Province Youth Science and Technology Talent Support Project (JSTJ-2024-040).
Abstract: In this paper, the small-signal modeling of the indium phosphide high electron mobility transistor (InP HEMT) based on the Transformer neural network model is investigated. The AC S-parameters of the HEMT device are trained and validated using the Transformer model. In the proposed model, eight Transformer encoder layers are connected in series, and each encoder layer consists of a multi-head attention layer and a feed-forward neural network layer. The experimental results show that the measured and modeled S-parameters of the HEMT device match well in the frequency range of 0.5-40 GHz, with errors versus frequency of less than 1%. Compared with other models, the proposed model achieves good accuracy, which verifies its effectiveness.
Funding: Supported by the Liaoning Provincial Education Department Fund, grant number JYTZD2023083.
Abstract: In dynamic 5G network environments, user mobility and heterogeneous network topologies pose dual challenges to improving the performance of mobile edge caching. Existing studies often overlook the dynamic nature of user locations and the potential of device-to-device (D2D) cooperative caching, limiting the reduction of transmission latency. To address this issue, this paper proposes a joint optimization scheme for edge caching that integrates user mobility prediction with deep reinforcement learning. First, a Transformer-based geolocation prediction model is designed, leveraging multi-head attention mechanisms to capture correlations in historical user trajectories for accurate future location prediction. Then, within a three-tier heterogeneous network, we formulate a latency minimization problem under a D2D cooperative caching architecture and develop a mobility-aware Deep Q-Network (DQN) caching strategy. This strategy takes predicted location information as state input and dynamically adjusts the content distribution across small base stations (SBSs) and mobile users (MUs) to reduce end-to-end delay in multi-hop content retrieval. Simulation results show that the proposed DQN-based method outperforms other baseline strategies across various metrics, achieving a 17.2% reduction in transmission delay compared to DQN methods without mobility integration, thus validating the effectiveness of the joint optimization of location prediction and caching decisions.
Funding: Supported by the National Natural Science Foundation of China (Nos. U2BB2077 and 42374226), the Natural Science Foundation of Jiangxi Province (20232BAB201043 and 20232BCJ23006), and the Nuclear Energy Development Project of the National Defense Science and Industry Bureau (Nos. 20201192-01, 20201192-03).
Abstract: The identification of ore grades is a critical step in mineral resource exploration and mining. Prompt gamma neutron activation analysis (PGNAA) technology employs gamma rays generated by the nuclear reactions between neutrons and samples to achieve the qualitative and quantitative detection of sample components. In this study, we present a novel method for identifying copper grade by combining the vision transformer (ViT) model with the PGNAA technique. First, a Monte Carlo simulation is employed to determine the optimal sizes of the neutron moderator, thermal neutron absorption material, and dimensions of the device. Subsequently, based on the parameters obtained through optimization, a PGNAA copper ore measurement model is established. The gamma spectrum of the copper ore is analyzed using the ViT model. The hyperparameters of the ViT model are optimized using a grid search. To ensure the reliability of the identification results, the test results are obtained through five repeated tenfold cross-validations. Long short-term memory and convolutional neural network models are compared with the ViT method. These results indicate that the ViT method is efficient in identifying copper ore grades, with average accuracy, precision, recall, F_(1) score, and F_(1)(-) score values of 0.9795, 0.9637, 0.9614, 0.9625, and 0.9942, respectively. When identifying associated minerals, the ViT model can identify Pb, Zn, Fe, and Co minerals with identification accuracies of 0.9215, 0.9396, 0.9966, and 0.8311, respectively.
Funding: Supported by the Laoshan Laboratory (No. LSKJ202202402), the National Natural Science Foundation of China (No. 42030410), the Startup Foundation for Introducing Talent of Nanjing University of Information Science & Technology, and the Jiangsu Innovation Research Group (No. JSSCTD 202346); also supported by the China National Postdoctoral Program for Innovative Talents (No. BX20240169) and the China Postdoctoral Science Foundation (No. 2141062400101).
Abstract: Deep learning (DL) has become a crucial technique for predicting the El Niño-Southern Oscillation (ENSO) and evaluating its predictability. While various DL-based models have been developed for ENSO predictions, many fail to capture the coherent multivariate evolution within the coupled ocean-atmosphere system of the tropical Pacific. To address this limitation and represent ENSO-related ocean-atmosphere interactions more accurately, a novel three-dimensional (3D) multivariate prediction model was proposed based on a Transformer architecture, which incorporates a spatiotemporal self-attention mechanism. This model, named 3D-Geoformer, offers several advantages, enabling accurate ENSO predictions up to one and a half years in advance. Furthermore, an integrated gradient method was introduced into the model to identify the sources of predictability for sea surface temperature (SST) variability in the eastern equatorial Pacific. Results reveal that the 3D-Geoformer effectively captures ENSO-related precursors during the evolution of ENSO events, particularly the thermocline feedback processes and ocean temperature anomaly pathways on and off the equator. By extending DL-based ENSO predictions from one-dimensional Niño time series to 3D multivariate fields, the 3D-Geoformer represents a significant advancement in ENSO prediction. This study provides details of the model formulation, analysis procedures, sensitivity experiments, and illustrative examples, offering practical guidance for the application of the model in ENSO research.
Funding: Sponsored by the National Key Research and Development Program of China (Grant No. 2020YEB1600500).
Abstract: Short-term traffic flow prediction plays a crucial role in the planning of intelligent transportation systems. Nowadays, a large amount of traffic flow data is generated by the monitoring devices of urban road networks, and it contains road network traffic information with high application value. In this study, an improved spatio-temporal attention transformer model (ISTA-transformer model) is proposed to provide a more accurate method for predicting multi-step short-term traffic flow based on monitoring data. By embedding a temporal attention layer and a spatial attention layer, the model learns the relationship between traffic flows at different time intervals and different geographic locations, and achieves more accurate multi-step short-term flow prediction. Finally, we validate the superiority of the model with monitoring data spanning 15 days from 620 monitoring points in Qingdao, China. Over the four prediction time steps, the MAPE (Mean Absolute Percentage Error) values of the ISTA-transformer's prediction results are 0.22, 0.29, 0.37, and 0.38, respectively, and its prediction accuracy is usually better than that of six baseline models (Transformer, GRU, CNN, LSTM, Seq2Seq, and LightGBM), which indicates that the proposed model has a better ability to explain the prediction results across the time steps of multi-step prediction.
Funding: Supported by the Key Laboratory of Forensic Science and Technology at College of Sichuan Province (2023YB04).
Abstract: Enhancing low-light images with color distortion and uneven multi-light source distribution presents challenges. Most advanced methods for low-light image enhancement are based on the Retinex model using deep learning. Retinexformer introduces channel self-attention mechanisms in the IG-MSA. However, it fails to effectively capture long-range spatial dependencies, leaving room for improvement. Based on the Retinexformer deep learning framework, we designed the Retinexformer+ network. The “+” signifies our advancements in extracting long-range spatial dependencies. We introduced multi-scale dilated convolutions in illumination estimation to expand the receptive field. These convolutions effectively capture the weakening semantic dependency between pixels as distance increases. In illumination restoration, we used Unet++ with multi-level skip connections to better integrate semantic information at different scales. The designed Illumination Fusion Dual Self-Attention (IF-DSA) module embeds multi-scale dilated convolutions to achieve spatial self-attention. This module captures long-range spatial semantic relationships within acceptable computational complexity. Experimental results on the Low-Light (LOL) dataset show that Retinexformer+ outperforms other State-Of-The-Art (SOTA) methods in both quantitative and qualitative evaluations, with the computational complexity increased to an acceptable 51.63 GFLOPS. On the LOL_v1 dataset, Retinexformer+ shows an increase of 1.15 in Peak Signal-to-Noise Ratio (PSNR) and a decrease of 0.39 in Root Mean Square Error (RMSE). On the LOL_v2_real dataset, the PSNR increases by 0.42 and the RMSE decreases by 0.18. Experimental results on the Exdark dataset show that Retinexformer+ can effectively enhance real-scene images and maintain their semantic information.
Abstract: The rise of social media platforms has revolutionized communication, enabling the exchange of vast amounts of data through text, audio, images, and videos. These platforms have become critical for sharing opinions and insights, influencing daily habits, and driving business, political, and economic decisions. Text posts are particularly significant, and natural language processing (NLP) has emerged as a powerful tool for analyzing such data. While traditional NLP methods have been effective for structured media, social media content poses unique challenges due to its informal and diverse nature. This has spurred the development of new techniques tailored for processing and extracting insights from unstructured user-generated text. One key application of NLP is the summarization of user comments to manage overwhelming content volumes. Abstractive summarization has proven highly effective in generating concise, human-like summaries, offering clear overviews of key themes and sentiments. This enhances understanding and engagement while reducing cognitive effort for users. For businesses, summarization provides actionable insights into customer preferences and feedback, enabling faster trend analysis, improved responsiveness, and strategic adaptability. By distilling complex data into manageable insights, summarization plays a vital role in improving user experiences and empowering informed decision-making in a data-driven landscape. This paper proposes a new implementation framework by fine-tuning and parameterizing Transformer Large Language Models to manage and maintain linguistic and semantic components in abstractive summary generation. The system excels in transforming large volumes of data into meaningful summaries, as evidenced by its strong performance across metrics like fluency, consistency, readability, and semantic coherence.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2019M3F2A1073179).
Abstract: Photovoltaic (PV) systems are environmentally friendly, generate green energy, and receive support from policies and organizations. However, weather fluctuations make large-scale PV power integration and management challenging despite the economic benefits. Existing PV forecasting techniques (sequential and convolutional neural networks (CNN)) are sensitive to environmental conditions, reducing energy distribution system performance. To handle these issues, this article proposes an efficient, weather-resilient convolutional-transformer-based network (CT-NET) for accurate and efficient PV power forecasting. The network consists of three main modules. First, the acquired PV generation data are forwarded to the pre-processing module for data refinement. Next, to carry out data encoding, a CNN-based multi-head attention (MHA) module is developed, in which a single MHA is used to decode the encoded data. The encoder module is mainly composed of 1D convolutional and MHA layers, which extract local as well as contextual features, while the decoder part includes MHA and feedforward layers to generate the final prediction. Finally, the performance of the proposed network is evaluated using standard error metrics, including the mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). An ablation study and comparative analysis with several competitive state-of-the-art approaches revealed a lower error rate in terms of MSE (0.0471), RMSE (0.2167), and MAPE (0.6135) over publicly available benchmark data. In addition, it is demonstrated that our proposed model is less complex, with the lowest number of parameters (0.0135 M), size (0.106 MB), and inference time (2 ms/step), suggesting that it is easy to integrate into the smart grid.
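For readers unfamiliar with the three error metrics named above, the following minimal Python sketch shows their standard definitions. The function names and the toy values are illustrative assumptions, not taken from the paper, and the abstract does not state whether its MAPE figure is a fraction or a percentage; this sketch returns a percentage.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when any true value is zero (division by zero).
    """
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy values (e.g., generated PV power in kW).
y_true = [2.0, 4.0]
y_pred = [1.0, 5.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

RMSE is expressed in the same unit as the forecast quantity, whereas MAPE is scale-free, which is one reason forecasting papers commonly report both.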
Funding: Supported by the Shaanxi Province Key Research and Development Project (2021GY-280) and the National Natural Science Foundation of China (No. 61834005, 61772417, 61802304).
Abstract: Micro-expressions are spontaneous, unconscious movements that reveal true emotions. Accurate facial movement information and network training methods are crucial for micro-expression recognition. However, most existing micro-expression recognition technologies focus on modeling a single category of micro-expression images and on the neural network structure. Aiming at the problems of low recognition rate and weak model generalization ability in micro-expression recognition, a micro-expression recognition algorithm is proposed based on a graph convolution network (GCN) and the Transformer model. Firstly, action unit (AU) features are detected and extracted, and facial muscle nodes in the neighborhood are divided into three subsets for recognition. Then, a graph convolution layer is used to find the layout of dependencies between the AU nodes for micro-expression classification. Finally, multiple attentional features of each facial action are enriched with the Transformer model to include more sequence information before calculating the overall correlation of each region. The proposed method is validated on the CASME II and CAS(ME)^2 datasets, and the recognition rate reached 69.85%.
Funding: National Natural Science Foundation of China (No. 61073013, No. 90818024) and Aviation Science Foundation of China (No. 2010ZAO4001).
Abstract: Timed abstract state machine (TASM) is a formal specification language used to specify and simulate the behavior of real-time systems. Formal verification of a TASM model can be carried out through model checking by translating the model into UPPAAL. Firstly, the translational semantics from TASM to UPPAAL is presented through the ATLAS Transformation Language (ATL). Secondly, the implementation of the proposed model transformation tool TASM2UPPAAL is provided. Finally, a case study is given to illustrate the automatic transformation from a TASM model to a UPPAAL model.
Abstract: The dependence of transformer performance on the material properties was investigated using two laboratory-processed 0.23 mm thick grain-oriented electrical steels, domain-refined with electrolytically etched grooves and having different magnetic properties. The iron loss at 1.7 T, 50 Hz and the flux density at 800 A/m of material A were 0.73 W/kg and 1.89 T, respectively; and those of material B, 0.83 W/kg and 1.88 T. Model stacked and wound transformer core experiments using the tested materials exhibited performance that closely reflected the material characteristics. In a three-phase stacked core with step-lap joints excited to 1.7 T, 50 Hz, the core loss, the exciting current and the noise level were 0.86 W/kg, 0.74 A and 52 dB, respectively, with material A; and 0.97 W/kg, 1.0 A and 54 dB with material B. The building factors for the core losses of the two materials were almost the same in both core configurations. The effect of higher harmonics on transformer performance was also investigated.
Abstract: The oil industry is an important part of a country’s economy. The price of crude oil is influenced by a wide range of variables. Therefore, how accurately countries can predict its behavior and which predictors to employ are two main questions. In this view, we propose utilizing deep learning and ensemble learning techniques to boost crude oil price forecasting performance. The suggested method is based on a deep learning snapshot ensemble of the Transformer model. To examine the superiority of the proposed model, this paper compares the proposed deep learning ensemble model against different machine learning and statistical models for daily Organization of the Petroleum Exporting Countries (OPEC) oil price forecasting. Experimental results demonstrated that the proposed method outperforms statistical and machine learning methods. More precisely, according to the mean square error metric, the proposed snapshot ensemble of Transformer method achieved relative improvements in forecasting performance over the autoregressive integrated moving average models ARIMA(1,1,1) and ARIMA(0,1,1), the autoregressive moving average model ARMA(0,1), vector autoregression (VAR), random walk (RW), support vector machine (SVM), and random forests (RF) models of 99.94%, 99.62%, 99.87%, 99.65%, 7.55%, 98.38%, and 99.35%, respectively.
Abstract: Recent advances in low-cost cameras have facilitated surveillance in various developing towns in India. The videos obtained from such surveillance are of low quality. Still, counting vehicles in such videos is necessary to avoid traffic congestion and to allow drivers to plan their routes more precisely. On the other hand, detecting vehicles in such low-quality videos is highly challenging with vision-based methodologies. In this research, a meticulous attempt is made to use low-quality videos to describe traffic in Salem town in India, which most available sources have not attempted. In this work, the deep learning-based Detection Transformer (DETR) model is used for object (vehicle) detection. Here, vehicles are predicted in a rush-hour traffic video using a set of loss functions that perform bipartite matching between the predicted and ground-truth attributes. Every frame in the traffic footage carries a date and time, which are detected and retrieved using Tesseract Optical Character Recognition. The date and time extracted and recognized from the input image are combined with the number of recognized objects obtained from the DETR model. This yields a vehicle report with timestamps. A Transformer Timeseries Prediction Model (TTPM) is proposed to predict future vehicle density; here, the regular NLP layers have been removed and the temporal encoding layer has been modified. The proposed TTPM outperforms the existing models in error rate, with an RMSE of 4.313 and an MAE of 3.812.
Funding: National Natural Science Foundation of China (No. 61073013, No. 90818024) and Aviation Science Foundation of China (No. 2010ZAO4001).
Abstract: Architecture analysis and design language (AADL) is an architecture description language standard for embedded real-time systems, and it is widely used in safety-critical applications. Model transformation is one method for facilitating verification and analysis. A synchronous subset of AADL and a general methodology for translating the AADL subset into timed abstract state machine (TASM) were studied. Based on the ATLAS Transformation Language (ATL) framework, the associated translation tool AADL2TASM was implemented by defining the meta-models of both AADL and TASM, together with the ATL transformation rules. A case study with property verification of the AADL model was also presented to validate the tool.
Funding: The authors would like to thank the Deanship of Graduate Studies and Scientific Research at Qassim University for financial support (QU-APC-2025).
Abstract: Automated and accurate movie genre classification is crucial for content organization, recommendation systems, and audience targeting in the film industry. Although most existing approaches focus on audiovisual features such as trailers and posters, text-based classification remains underexplored despite its accessibility and semantic richness. This paper introduces the Genre Attention Model (GAM), a deep learning architecture that integrates transformer models with a hierarchical attention mechanism to extract and leverage contextual information from movie plots for multi-label genre classification. To assess its effectiveness, we evaluate multiple transformer-based models, including Bidirectional Encoder Representations from Transformers (BERT), A Lite BERT (ALBERT), Distilled BERT (DistilBERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), eXtreme Learning Network (XLNet), and Decoding-enhanced BERT with Disentangled Attention (DeBERTa). Experimental results demonstrate the superior performance of the DeBERTa-based GAM, which employs a two-tier hierarchical attention mechanism: word-level attention highlights key terms, while sentence-level attention captures critical narrative segments, ensuring a refined and interpretable representation of movie plots. Evaluated on three benchmark datasets, Trailers12K, Large Movie Trailer Dataset-9 (LMTD-9), and MovieLens37K, GAM achieves micro-average precision scores of 83.63%, 83.32%, and 83.34%, respectively, surpassing state-of-the-art models. Additionally, GAM is computationally efficient, requiring just 6.10 Giga Floating Point Operations Per Second (GFLOPS), making it a scalable and cost-effective solution. These results highlight the growing potential of text-based deep learning models in genre classification and GAM’s effectiveness in improving predictive accuracy while maintaining computational efficiency. With its robust performance, GAM offers a versatile and scalable framework for content recommendation, film indexing, and media analytics, providing an interpretable alternative to traditional audiovisual-based classification techniques.
Funding: Supported by the National Natural Science Foundation of China (No. 52374155) and the Anhui Provincial Natural Science Foundation (No. 2308085 MF218).
Abstract: The convolutional neural network (CNN) method based on DeepLabv3+ has some problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size for feature extraction, a lack of semantic information, high decoder magnification, and insufficient detail retention. A hierarchical feature fusion network (HFFNet) was proposed. Firstly, a combination of transformer and CNN architectures was employed for feature extraction from images of varying resolutions. The extracted features were processed independently. Subsequently, the features from the transformer and CNN were fused under the guidance of features from different sources. This fusion process assisted in restoring information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module was designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results showed that HFFNet had superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and its cross-linking index was better than that of DeepLabv3+ and other competing methods, showing strong generalization ability.
Abstract: This study proposes a virtual healthcare assistant framework designed to provide support in multiple languages for efficient and accurate healthcare assistance. The system employs a transformer model to process sophisticated, multilingual user inputs and gain improved contextual understanding compared to conventional models, including long short-term memory (LSTM) models. In contrast to LSTMs, which process information sequentially and may struggle with long-range dependencies, transformers utilize self-attention to learn relationships among every aspect of the input in parallel. This enables them to perform more accurately in various languages and contexts, making them well-suited for applications such as translation, summarization, and conversation. Comparative evaluations revealed the superiority of the transformer model (accuracy rate: 85%) over the LSTM model (accuracy rate: 65%). The experiments revealed several advantages of the transformer architecture over the LSTM model, such as more effective self-attention, the ability to process inputs in parallel, and contextual understanding for better multilingual compatibility. Additionally, our prediction model was effective for disease diagnosis, with an accuracy of 85% or greater in identifying the relationship between symptoms and diseases among different demographics. The system provides translation support from English to other languages, with the highest Bilingual Evaluation Understudy score for English to French (0.7), followed by English to Hindi (0.6). The lowest Bilingual Evaluation Understudy score was found for English to Telugu (0.39). This virtual assistant can also perform symptom analysis and disease prediction, with output given in the preferred language of the user.
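The contrast drawn above between sequential LSTM processing and parallel self-attention can be made concrete with a minimal, single-head sketch of scaled dot-product self-attention in plain Python. It is a simplification under stated assumptions, not the system described in the abstract: the learned query, key, and value projections are omitted (Q = K = V = the input), and the three two-dimensional token vectors are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x is a list of token vectors of equal dimension d. Every token's
    scores against all other tokens are computed independently of the
    other rows, which is what allows parallel execution in practice.
    """
    d = len(x[0])
    scores = [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        for q in x
    ]
    weights = [softmax(row) for row in scores]
    outputs = [
        [sum(w * v[j] for w, v in zip(w_row, x)) for j in range(d)]
        for w_row in weights
    ]
    return weights, outputs


# Hypothetical embeddings for a three-token input.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, outputs = self_attention(tokens)
```

Each row of `weights` is a probability distribution over all input positions, so the first token can draw on the last one directly, with no recurrent state carried through the intermediate steps.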
Abstract: Hepatology encompasses various aspects, such as metabolic-associated fatty liver disease, viral hepatitis, alcoholic liver disease, liver cirrhosis, liver failure, liver tumors, and liver transplantation. The global epidemiological situation of liver diseases is grave, posing a substantial threat to human health and quality of life. Characterized by high incidence and mortality rates, liver diseases have emerged as a prominent global public health concern. In recent years, the rapid advancement of artificial intelligence (AI), deep learning, and radiomics has transformed medical research and clinical practice, demonstrating considerable potential in hepatology. AI is capable of automatically detecting abnormal cells in liver tissue sections, enhancing the accuracy and efficiency of pathological diagnosis. Deep learning models are able to extract features from computed tomography and magnetic resonance imaging images to facilitate liver disease classification. Machine learning models are capable of integrating clinical data to forecast disease progression and treatment responses, thus supporting clinical decision-making for personalized medicine. Through the analysis of imaging data, laboratory results, and genomic information, AI can assist in diagnosis, forecast disease progression, and optimize treatment plans, thereby improving clinical outcomes for liver disease patients. This minireview intends to comprehensively summarize the state-of-the-art theories and applications of AI in hepatology, explore the opportunities and challenges it presents in clinical practice, basic research, and translational medicine, and propose future research directions to guide the advancement of hepatology and ultimately improve patient outcomes.