The rise of social media platforms has revolutionized communication, enabling the exchange of vast amounts of data through text, audio, images, and videos. These platforms have become critical for sharing opinions and insights, influencing daily habits, and driving business, political, and economic decisions. Text posts are particularly significant, and natural language processing (NLP) has emerged as a powerful tool for analyzing such data. While traditional NLP methods have been effective for structured media, social media content poses unique challenges due to its informal and diverse nature. This has spurred the development of new techniques tailored for processing and extracting insights from unstructured user-generated text. One key application of NLP is the summarization of user comments to manage overwhelming content volumes. Abstractive summarization has proven highly effective in generating concise, human-like summaries, offering clear overviews of key themes and sentiments. This enhances understanding and engagement while reducing cognitive effort for users. For businesses, summarization provides actionable insights into customer preferences and feedback, enabling faster trend analysis, improved responsiveness, and strategic adaptability. By distilling complex data into manageable insights, summarization plays a vital role in improving user experiences and empowering informed decision-making in a data-driven landscape. This paper proposes a new implementation framework that fine-tunes and parameterizes Transformer-based large language models to preserve linguistic and semantic components in abstractive summary generation. The system transforms large volumes of data into meaningful summaries, as evidenced by its strong performance across metrics such as fluency, consistency, readability, and semantic coherence.
Cyberbullying is a significant issue in the Arabic-speaking world, affecting children, organizations, and businesses. Various efforts have been made to combat this problem through models based on machine learning (ML) and deep learning (DL) approaches that use natural language processing (NLP) methods, and through the release of relevant datasets. However, most of these efforts have focused predominantly on English, leaving a substantial gap in addressing Arabic cyberbullying. Given the complexities of the Arabic language, transfer learning techniques and transformers present a promising approach to enhance the detection and classification of abusive content by leveraging large models pretrained on large datasets. Therefore, this study proposes a hybrid model using transformers trained on extensive Arabic datasets. It then fine-tunes the hybrid model on a newly curated Arabic cyberbullying dataset collected from social media platforms, in particular Twitter. Two hybrid transformer models are introduced: the first combines CAmelid Morphologically-aware pretrained Bidirectional Encoder Representations from Transformers (CAMeLBERT) with Arabic Generative Pre-trained Transformer 2 (AraGPT2), and the second combines Arabic BERT (AraBERT) with Cross-lingual Language Model-RoBERTa (XLM-R). Two strategies, namely feature fusion and ensemble voting, are employed to improve model accuracy. Experimental results, measured through precision, recall, F1-score, accuracy, and Area Under the Curve-Receiver Operating Characteristic (AUC-ROC), demonstrate that the combined CAMeLBERT and AraGPT2 model using feature fusion outperformed traditional DL models, such as Long Short-Term Memory (LSTM) and Bidirectional Long Short-Term Memory (BiLSTM), as well as other standalone Arabic transformer models.
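
The two combination strategies named in this abstract, feature fusion and ensemble voting, can be illustrated with a minimal sketch. It assumes fusion means concatenating the feature vectors of two encoders and voting means a simple majority over model predictions; the abstract does not specify either detail, and the function names, labels, and numbers below are hypothetical.

```python
from collections import Counter


def fuse_features(features_a, features_b):
    """Feature fusion (assumed here to be concatenation of two encoders' vectors)."""
    return list(features_a) + list(features_b)


def majority_vote(predictions):
    """Ensemble voting: return the label predicted by the most models.

    Ties are broken in favour of the label that appears first in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical per-tweet outputs (illustrative values, not from the paper).
encoder_a_features = [0.12, -0.40, 0.88]
encoder_b_features = [0.05, 0.31]
fused = fuse_features(encoder_a_features, encoder_b_features)
vote = majority_vote(["bullying", "not_bullying", "bullying"])
```

In a fusion setup the concatenated vector would feed a single classifier, whereas voting keeps the models separate and only combines their final labels.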
As the plasma current power in tokamak devices increases, substantial stray magnetic fields are generated around the equipment. These stray magnetic fields can disrupt the operation of electronic power devices, particularly transformers in switched-mode power supplies. Testing flyback converters with transformers under strong background magnetic fields highlights electromagnetic compatibility (EMC) issues for such switched-mode power supplies. This study uses finite element analysis software to simulate the electromagnetic environment of switched-mode power supply transformers and investigates how variations in different magnetic field parameters affect the performance of switched-mode power supplies under strong stray magnetic fields. The findings indicate that EMC issues are associated with transformer core saturation and can be alleviated through appropriate configurations of the core size, air gap, fillet radius, and installation direction. This study offers novel solutions for addressing EMC issues in high magnetic field environments.
X (formerly known as Twitter) is one of the most prominent social media platforms, enabling users to share short messages (tweets) with the public or their followers. It serves various purposes, from real-time news dissemination and political discourse to trend spotting and consumer engagement. X has emerged as a key space for understanding shifting brand perceptions, consumer preferences, and product-related sentiment in the fashion industry. However, the platform's informal, dynamic, and context-dependent language poses substantial challenges for sentiment analysis, particularly when attempting to detect sarcasm, slang, and nuanced emotional tones. This study introduces a hybrid deep learning framework that integrates Transformer encoders, recurrent neural networks (i.e., Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU)), and attention mechanisms to improve the accuracy of fashion-related sentiment classification. These methods were selected for their proven strength in capturing both contextual dependencies and sequential structures, which are essential for interpreting short-form text. The model was evaluated on a dataset of 20,000 fashion tweets. The experimental results demonstrate a classification accuracy of 92.25%, outperforming conventional models such as Logistic Regression, Linear Support Vector Machine (SVM), and even standalone LSTM by a margin of up to 8%. This improvement highlights the importance of hybrid architectures in handling noisy, informal social media data. The findings have strong implications for digital marketing and brand management, where timely sentiment detection is critical. Despite the promising results, challenges remain in the precise identification of negative sentiments, indicating that further work is needed to detect subtle and contextually embedded expressions.
Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of a patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, combining data from multiple modalities is difficult because of disparities in resolution, data collection methods, and noise levels. While traditional models such as Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities because they lack the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to align and integrate features across modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. First, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Model pruning, quantization, and knowledge distillation were applied to reduce the parameter count and increase inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, hardware-aware acceleration strategies were implemented, including the use of TensorRT and ONNX-based model compression, to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
In recent years, the Transformer has achieved remarkable results in computer vision, with its built-in attention layers effectively modeling global dependencies in images by transforming image features into tokens. However, Transformers often face high computational costs when processing large-scale image data, which limits their feasibility in real-time applications. To address this issue, we propose Token Masked Pose Transformers (TMPose), an efficient Transformer network for pose estimation. The network applies semantic-level masking to tokens and employs three different masking strategies to reduce computational complexity while optimizing model performance. Experimental results show that TMPose reduces computational complexity by 61.1% on the COCO validation dataset, with negligible loss in accuracy. Performance on the MPII dataset is also competitive. This research preserves pose-estimation accuracy while significantly reducing the demand for computational resources, providing new directions for further studies in this field.
Voltage transformers (VTs) are critical for metering and protection in electric railway traction power supply systems (TPSSs), so their measurement performance must be monitored promptly and reliably. This paper outlines a three-step method that uses only RMS data to evaluate VTs in TPSSs. First, a kernel principal component analysis approach is used to identify the VT exhibiting significant measurement deviations over time, mitigating the influence of stochastic fluctuations in traction loads. Second, a back propagation neural network is employed to continuously estimate the measurement deviations of the targeted VT. Third, a trend analysis method is developed to assess the evolution of the measurement performance of VTs. Case studies conducted on field data from an operational TPSS demonstrate the effectiveness of the proposed method in detecting VTs with measurement deviations exceeding 1% relative to their original accuracy levels. Additionally, the method accurately tracks deviation trends, enabling the identification of potential early-stage faults in VTs and helping prevent significant economic losses in TPSS operations.
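
The last two ideas in this abstract, a deviation threshold of 1% and a trend over time, can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not describe its trend analysis method, so an ordinary least-squares slope is swapped in as a stand-in, and the function names and sample values are hypothetical.

```python
def relative_deviation_percent(measured, reference):
    """Deviation of a measured RMS value from a reference value, in percent."""
    return (measured - reference) * 100 / reference


def linear_trend_slope(values):
    """Least-squares slope of values against their sample index (0, 1, 2, ...).

    Used here as a simple stand-in for a trend analysis step.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    denominator = sum((t - mean_t) ** 2 for t in range(n))
    return numerator / denominator


# Hypothetical deviation estimates (percent) at successive inspection times.
deviations = [0.2, 0.4, 0.6, 0.8]
slope = linear_trend_slope(deviations)           # growth per inspection step
exceeds_limit = [d > 1.0 for d in deviations]    # 1% threshold from the abstract
```

A positive slope on values that are still under the threshold is the kind of signal that would suggest an early-stage drift worth watching.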
This paper studies main transformer selection and layout schemes for new energy step-up substations. From the perspective of engineering design, it analyzes the principles of main transformer selection, the key parameters, and how they match the characteristics of new energy. It also explores layout methods and optimization strategies. Drawing on typical case studies, it proposes optimization suggestions for the design of main transformers in new energy step-up substations. The research shows that rational main transformer selection and well-planned layout schemes can better adapt to the characteristics of new energy projects while effectively improving land use efficiency and economic viability. This study can provide technical reference for the design of new energy projects.
Remote sensing Change Detection (CD) involves identifying changing regions of interest in bi-temporal remote sensing images. CD technology has developed rapidly in recent years thanks to the powerful learning ability of Convolutional Neural Networks (CNNs), which enables complex feature extraction. However, the local receptive fields of CNNs limit the modeling of long-range contextual relationships in semantic changes. Therefore, this work explores the potential of Siamese Transformers in CD tasks and proposes a general CD model, STCD, that relies on Swin Transformers. In the encoding process, pure Transformers without CNNs are used to model the long-range context of semantic tokens, reducing computational overhead and improving model efficiency compared to current methods. During decoding, a 3D convolution block extracts the changing features in the time series, and a deconvolution layer with axial attention generates the predicted change map. Extensive experiments on three binary CD datasets and one semantic CD dataset demonstrate that the proposed STCD model outperforms several popular benchmark methods in both performance and parameter count. Among the STCD variants, the F1-score of Base-STCD on the three binary CD datasets LEVIR, DSIFN, and SVCD reached 89.85%, 54.72%, and 93.75%, respectively, and the mF1-score and mIoU on the semantic CD dataset SECOND were 75.60% and 66.19%.
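
For readers unfamiliar with the metrics reported above, the following minimal sketch computes F1 and IoU for a binary change map, using the standard definitions F1 = 2TP / (2TP + FP + FN) and IoU = TP / (TP + FP + FN). The mF1 and mIoU figures in the abstract would be means of such per-class scores; the flattened masks below are made-up illustrative data, not values from the paper.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives for label 1."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return tp, fp, fn


def f1_score(pred, truth):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when there are no positives at all."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def iou(pred, truth):
    """IoU = TP / (TP + FP + FN); returns 0.0 when there are no positives at all."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


# Hypothetical flattened change masks (1 = changed pixel, 0 = unchanged).
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 1, 1, 0]
f1 = f1_score(predicted, reference)
overlap = iou(predicted, reference)
```

Because IoU penalizes every false positive and false negative against a smaller numerator, it is never higher than F1 on the same prediction, which is consistent with the mIoU being lower than the mF1 in the reported results.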
The development of autonomous vehicles has become one of the greatest research endeavors in recent years. These vehicles rely on many complex systems working in tandem to make decisions. For practical use and safety reasons, these systems must not only be accurate but also quickly detect changes in the surrounding environment. In autonomous vehicle research, the environment perception system is one of the key components of development. Environment perception systems allow the vehicle to understand its surroundings using cameras, light detection and ranging (LiDAR), and other sensor systems and modalities. Deep learning computer vision algorithms have been shown to be the strongest tool for translating camera data into accurate and safe traversability decisions regarding the environment surrounding a vehicle. For a vehicle to safely traverse an area in real time, these computer vision algorithms must be accurate and have low latency. While much research has studied autonomous driving in well-structured urban environments, limited research exists evaluating perception system improvements in off-road settings. This research investigates the adaptability of several existing deep learning architectures for semantic segmentation in off-road environments. Previous studies of two Convolutional Neural Network (CNN) architectures are included for comparison with a new evaluation of Vision Transformer (ViT) architectures for semantic segmentation. Our results demonstrate the viability of ViT architectures for off-road perception systems, with strong segmentation accuracy and lower inference time and memory footprint compared to previous results with CNN architectures.
The changing nature of malware poses a cybersecurity threat, resulting in significant financial losses each year. Traditional signature-based antivirus tools are ineffective against disguised variants, which they detect with low accuracy. This study introduces the Data-Efficient Image Transformer-Malware Classifier (DeiT-MC), a malware classification system that uses Data-Efficient Image Transformers. DeiT-MC treats malware samples as visual data and integrates a newly developed Hybrid GridBay Optimizer (HGBO) for hyperparameter optimization and better model performance under varying malware scenarios. With HGBO, DeiT-MC outperforms state-of-the-art techniques, with an accuracy of 94% on the MaleViS dataset and 92% on the MalNet-Image Tiny dataset. This work therefore presents DeiT-MC as a promising and robust solution for classifying malware families using image analysis techniques and visualization approaches.
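
Treating a malware sample as visual data generally means mapping its raw bytes to pixel intensities before passing the image to a vision model. The sketch below shows one common way to do this, laying bytes out row by row as a grayscale grid; it is an assumption for illustration and not necessarily the preprocessing used by DeiT-MC, and the function name and sample bytes are hypothetical.

```python
import math


def bytes_to_grayscale(data, width=None):
    """Lay out raw bytes as rows of 0-255 pixel values.

    If no width is given, a roughly square image is produced.
    The last row is zero-padded so every row has the same length.
    """
    if not data:
        return []
    if width is None:
        width = math.isqrt(len(data) - 1) + 1  # ceil(sqrt(len(data)))
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))
        rows.append(row)
    return rows


# Hypothetical 10-byte "binary" (illustrative only).
sample = bytes(range(10))
image = bytes_to_grayscale(sample, width=4)
```

In practice the resulting grid would be resized to the input resolution expected by the image transformer before training or inference.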
Transformer models have emerged as dominant networks for various computer vision tasks, often displacing Convolutional Neural Networks (CNNs). Transformers can model long-range dependencies by means of a self-attention mechanism. This study provides a comprehensive survey of recent transformer-based approaches in image and video applications, as well as diffusion models. We begin by discussing existing surveys of vision transformers and comparing them to this work. Then, we review the main components of a vanilla transformer network, including the self-attention mechanism, feed-forward network, and position encoding. In the main part of this survey, we review recent transformer-based models in three categories: Transformer for downstream tasks, Vision Transformer for Generation, and Vision Transformer for Segmentation. We also provide a comprehensive overview of recent transformer models for video tasks and diffusion models. We compare the performance of various hierarchical transformer networks for multiple tasks on popular benchmark datasets. Finally, we explore some future research directions for the field.
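
The self-attention mechanism mentioned in this abstract is usually written as Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. The following minimal sketch implements that formula in plain Python for a single head. As a simplifying assumption, it omits the learned query, key, and value projections of a real transformer layer and uses the token vectors directly; the toy tokens are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Two toy tokens; each attends to both, but more strongly to itself.
tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
```

Every token's output depends on all other tokens in one step, which is the property that lets transformers model long-range dependencies; the cost is that the score computation grows quadratically with the number of tokens.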
Dry-type cast resin distribution transformers (CRTs) are the second generation of air-cooled distribution transformers, in which oil is replaced by resin for electrical insulation. CRTs may be installed indoors, adjacent to or near residential areas, since they are clean and safe compared with conventional transformers. However, noise accompanies all types of transformers and is inevitable for CRTs too. Minimizing the noise level caused by these transformers has biological and ergonomic importance. The core is known to be the main source of transformer noise. In this paper, an experimental and numerical investigation is carried out on a large number of CRTs fabricated at IT Co (Iran Transfo Company) to evaluate the effect of the core's geometrical parameters on the overall sound level of the transformers. The noise level of each sample is measured according to the criteria of IEC 60651 and reported in decibels (dB). Numerical simulation is performed using a noncommercial version of ANSYS Workbench to extract the first six natural frequencies, reported in Hz, and the mode shapes of the CRT cores. Three novel non-dimensional variables for the geometry of the transformer core are introduced. Both experimental and numerical results show approximately similar responses to these variables. The correlation between natural frequencies and noise level is evaluated statistically. The Pearson coefficient shows a strong correlation between the first two natural frequencies and the noise level of CRTs. The results show that the noise level decreases as the first two natural frequencies increase and, conversely, increases as they decrease. Finally, the noise level is decomposed into two parts.
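
The statistical step described above relies on the Pearson correlation coefficient, r = cov(x, y) / (σ_x σ_y). A minimal sketch of that calculation follows; the frequency and noise values are hypothetical and chosen only to show a negative relationship of the kind the abstract reports, not taken from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Hypothetical data: first natural frequency (Hz) and noise level (dB) per core.
first_natural_frequency = [100.0, 120.0, 140.0, 160.0]
noise_level = [60.0, 58.0, 56.0, 54.0]
r = pearson_r(first_natural_frequency, noise_level)  # close to -1 for this made-up data
```

A coefficient near -1 corresponds to the reported behaviour, where noise falls as the natural frequency rises; values near 0 would indicate no linear relationship.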
Structured microgrids (SμGs) and flexible electronic large power transformers (FeLPTs) are emerging as two essential technologies for renewable energy integration, flexible power transmission, and active control. SμGs integrate renewable energy and storage to balance energy demand and supply as needed for a given system design. The flexibility of FeLPTs in processing, control, and reconfigurability offers flexible transmission for effective flow control and enables SμG connectivity while retaining multiscale system-level control. Early adopters of combined heat and power have demonstrated significant economic benefits while reducing environmental footprints. They also bring tremendous benefits to utility companies. With storage and active control capabilities, a 300-percent increase in bulk transmission and distribution lines is possible without having to increase capacity. SμGs and FeLPTs will also enable the utility industry to be better prepared for the emerging large increase in base load demand from electric transportation and data centers. This is a win-win-win situation for the consumer, the utilities (grid operators), and the environment. SμGs and FeLPTs provide value in power substation, energy surety, reliability, resiliency, and security. It is also shown that the initial cost associated with SμG and FeLPT deployment can be easily offset by reduced operating cost, which in turn reduces the total life-cycle cost by 33% to 67%.
Funding (Arabic cyberbullying detection study): This work was funded by the Deanship of Scientific Research at Northern Border University, Arar, Saudi Arabia, through project number “NBU-FFR-2025-1197-01”.
Funding (tokamak stray magnetic field EMC study): This work was supported by the Natural Science Foundation of Anhui Province (No. 228085ME142), the Comprehensive Research Facility for the Fusion Technology Program of China (No. 20180000527301001228), and the Open Fund of the Magnetic Confinement Fusion Laboratory of Anhui Province (No. 2024AMF04003).
Funding (multi-modal medical imaging study): This work was supported by the Deanship of Research and Graduate Studies at King Khalid University under Small Research Project grant number RGP1/139/45.
Abstract: Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography (CT), Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of a patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, combining data from multiple modalities is difficult because of disparities in resolution, data collection methods, and noise levels. While traditional models such as Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle with multi-modal complexity because they lack the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to align and integrate features across modalities. It is also resilient to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles of transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to improve scalability and efficiency. First, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Model pruning, quantization, and knowledge distillation were applied to reduce the parameter count and increase inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to reduce the substantial memory and processing requirements of traditional self-attention. For deployment, hardware-aware acceleration strategies were implemented, including TensorRT and ONNX-based model compression, to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, even in environments with limited resources. Future research directions include integrating non-imaging data to support personalized treatment and further improving computational efficiency for resource-limited environments. This study highlights the potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
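The cross-attention step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes plain scaled dot-product attention in which tokens from one modality act as queries over tokens from another; the toy feature vectors and the names `mri_tokens` and `pet_tokens` are hypothetical and are not taken from the paper, which does not give its exact formulation here.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries come from one modality,
    keys and values from another. Returns one fused vector per query."""
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return fused


# Hypothetical toy features: 2 MRI tokens attend over 3 PET tokens.
mri_tokens = [[1.0, 0.0], [0.0, 1.0]]
pet_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(mri_tokens, pet_tokens, pet_tokens)
```

Each fused vector is a weighted average of the PET tokens, with larger weights on the PET tokens most similar to the querying MRI token. A real system would add learned projection matrices and multiple heads.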
Funding: Supported in part by the Scientific Research Start-Up Fund of Zhejiang Sci-Tech University, under the project titled “(National Treasury) Development of a Digital Silk Museum System Based on Metaverse and AR” (Project No. 11121731282202-01).
Abstract: In recent years, Transformers have achieved remarkable results in computer vision: their built-in attention layers effectively model global dependencies in images by transforming image features into tokens. However, Transformers often incur high computational costs when processing large-scale image data, which limits their feasibility in real-time applications. To address this issue, we propose Token Masked Pose Transformers (TMPose), an efficient Transformer network for pose estimation. The network applies semantic-level masking to tokens and employs three different masking strategies to reduce computational complexity while optimizing model performance. Experimental results show that TMPose reduces computational complexity by 61.1% on the COCO validation dataset, with negligible loss in accuracy. Performance on the MPII dataset is also competitive. This research not only enhances the accuracy of pose estimation but also significantly reduces the demand for computational resources, providing new directions for further studies in this field.
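To show why masking tokens lowers attention cost, here is a minimal sketch that keeps only the highest-scoring tokens and counts the query-key pairs a self-attention layer would then process. The scoring values, the `keep_ratio` parameter, and the function names are hypothetical; this is not one of the three TMPose masking strategies, and the 61.1% figure above is not derived from it.

```python
def keep_top_tokens(tokens, scores, keep_ratio):
    """Keep the highest-scoring tokens, preserving their original order."""
    n_keep = max(1, round(len(tokens) * keep_ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept_indices = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept_indices]


def attention_pairs(n_tokens):
    """Number of query-key pairs in full self-attention over n_tokens."""
    return n_tokens * n_tokens


tokens = [f"patch_{i}" for i in range(8)]
scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.05, 0.6, 0.3]  # hypothetical relevance scores
kept = keep_top_tokens(tokens, scores, keep_ratio=0.5)

# Fraction of query-key pairs removed by masking half of the tokens.
saving = 1 - attention_pairs(len(kept)) / attention_pairs(len(tokens))
```

Because full self-attention scales quadratically with the token count, keeping half of the tokens in this toy case removes three quarters of the query-key pairs.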
Funding: Supported by the National Natural Science Foundation of China (No. 52107125), the Applied Basic Research Project of Sichuan Province (No. 2022NSFSC0250), and Chengdu Guojia Electrical Engineering Co., Ltd. (No. KYL202312-0043).
Abstract: Voltage transformers (VTs) are critical for metering and protection in electric railway traction power supply systems (TPSSs), so their measurement performance must be monitored promptly and reliably. This paper outlines a three-step method for evaluating VTs in TPSSs that uses RMS data only. First, a kernel principal component analysis approach is used to identify the VT exhibiting significant measurement deviations over time, mitigating the influence of stochastic fluctuations in traction loads. Second, a backpropagation neural network is employed to continuously estimate the measurement deviations of the targeted VT. Third, a trend analysis method is developed to assess how the measurement performance of VTs evolves. Case studies on field data from an operational TPSS demonstrate the effectiveness of the proposed method in detecting VTs with measurement deviations exceeding 1% relative to their original accuracy levels. Additionally, the method accurately tracks deviation trends, enabling the identification of potential early-stage faults in VTs and helping prevent significant economic losses in TPSS operations.
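The abstract above does not specify its trend analysis method. As a minimal illustration of the third step only, the sketch below fits an ordinary least-squares line to a series of estimated deviations and extrapolates when it would reach a limit. The data, the `limit` value, and the function names are hypothetical assumptions, and a linear fit is a stand-in rather than the authors' method.

```python
def linear_trend(values):
    """Ordinary least-squares slope and intercept of values against sample index."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def samples_until(values, limit):
    """Estimated number of further samples before the fitted line reaches limit.
    Returns None when the fitted trend is flat or falling."""
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    current = intercept + slope * (len(values) - 1)
    return max(0.0, (limit - current) / slope)


# Hypothetical estimated measurement deviations of one VT, in percent.
deviation_pct = [0.10, 0.15, 0.20, 0.25, 0.30]
slope, intercept = linear_trend(deviation_pct)
remaining = samples_until(deviation_pct, limit=1.0)
```

With this toy series the deviation grows by about 0.05 percentage points per sample, so the fitted line reaches the hypothetical 1.0% limit after roughly 14 more samples.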
Abstract: This paper studies main transformer selection and layout schemes for new energy step-up substations. From an engineering design perspective, it analyzes the principles of main transformer selection, the key parameters, and how they match the characteristics of new energy sources. It also explores layout methods and optimization strategies. Drawing on typical case studies, it proposes optimization suggestions for the design of main transformers in new energy step-up substations. The research shows that rational main transformer selection and a sound layout scheme can better adapt to the characteristics of new energy projects while effectively improving land use efficiency and economic viability. This study can serve as a practical technical reference for the design of new energy projects.
Funding: Supported by the Military Commission Science and Technology Committee Leading Fund [grant number 18-163-00-TS-004-080-01].
Abstract: Remote sensing Change Detection (CD) involves identifying changed regions of interest in bi-temporal remote sensing images. CD technology has developed rapidly in recent years thanks to the powerful learning ability of Convolutional Neural Networks (CNNs), which enables complex feature extraction. However, the local receptive fields of CNNs limit the modeling of long-range contextual relationships in semantic changes. Therefore, this work explores the potential of Siamese Transformers in CD tasks and proposes a general CD model, STCD, based on Swin Transformers. In the encoding process, pure Transformers without CNNs are used to model the long-range context of semantic tokens, reducing computational overhead and improving model efficiency compared with current methods. During decoding, a 3D convolution block extracts the changing features in the time series, and the predicted change map is generated in a deconvolution layer with axial attention. Extensive experiments on three binary CD datasets and one semantic CD dataset demonstrate that the proposed STCD model outperforms several popular benchmark methods in both performance and required parameters. Among the STCD variants, the F1-Score of Base-STCD on the three binary CD datasets LEVIR, DSIFN, and SVCD reached 89.85%, 54.72%, and 93.75%, respectively, and the mF1-Score and mIoU on the semantic CD dataset SECOND were 75.60% and 66.19%.
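For readers unfamiliar with the metrics reported above, here is a minimal sketch of how F1-Score and IoU are computed for a binary change map from true positives (TP), false positives (FP), and false negatives (FN). The toy pixel labels are hypothetical. The mF1-Score and mIoU quoted for the semantic dataset are averages of such per-class values, which this binary sketch does not cover.

```python
def f1_and_iou(pred, truth):
    """F1-Score and IoU of the 'changed' class for flat lists of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp + fp + fn == 0:
        # No changed pixels predicted or present; the scores are undefined, report 0.
        return 0.0, 0.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return f1, iou


# Hypothetical flattened change maps: 1 = changed pixel, 0 = unchanged pixel.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
f1, iou = f1_and_iou(pred, truth)
```

The two metrics are linked by F1 = 2 * IoU / (1 + IoU), so F1 is never lower than IoU on the same prediction.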
Abstract: The development of autonomous vehicles has become one of the greatest research endeavors in recent years. These vehicles rely on many complex systems working in tandem to make decisions. For practical use and safety reasons, these systems must not only be accurate but also detect changes in the surrounding environment quickly. The environment perception system, which allows the vehicle to understand its surroundings, is one of the key components of autonomous vehicle development. It relies on cameras and light detection and ranging (LiDAR), along with other sensor systems and modalities. Deep learning computer vision algorithms have been shown to be the strongest tool for translating camera data into accurate and safe traversability decisions about the environment surrounding a vehicle. For a vehicle to traverse an area safely in real time, these algorithms must be accurate and have low latency. While much research has studied autonomous driving in well-structured urban environments, limited research has evaluated perception system improvements in off-road settings. This research investigates the adaptability of several existing deep learning architectures for semantic segmentation in off-road environments. Previous studies of two Convolutional Neural Network (CNN) architectures are included for comparison with a new evaluation of Vision Transformer (ViT) architectures for semantic segmentation. Our results demonstrate the viability of ViT architectures for off-road perception systems, with strong segmentation accuracy, shorter inference time, and a smaller memory footprint compared with previous results for CNN architectures.
Abstract: The changing nature of malware poses a cybersecurity threat, resulting in significant financial losses each year. Traditional signature-based antivirus tools are ineffective against disguised variants, which they detect with low accuracy. This study introduces the Data-Efficient Image Transformer-Malware Classifier (DeiT-MC), a malware classification system built on Data-Efficient Image Transformers. DeiT-MC treats malware samples as visual data and integrates a newly developed Hybrid GridBay Optimizer (HGBO) for hyperparameter optimization and better model performance under varying malware scenarios. With HGBO, DeiT-MC outperforms state-of-the-art techniques, reaching an accuracy of 94% on the MaleViS dataset and 92% on the MalNet-Image Tiny dataset. This work therefore presents DeiT-MC as a promising and robust solution for classifying malware families using image analysis and visualization approaches.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 61502162, 61702175, and 61772184; in part by the Fund of the State Key Laboratory of Geo-information Engineering under Grant SKLGIE2016-M-4-2; in part by the Hunan Natural Science Foundation of China under Grant 2018JJ2059; in part by the Key R&D Project of Hunan Province of China under Grant 2018GK2014; in part by the Open Fund of the State Key Laboratory of Integrated Services Networks under Grant ISN17-14; and by the Chinese Scholarship Council (CSC) through the College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, under Grant CSC No. 2018GXZ020784.
Abstract: Transformer models have emerged as dominant networks for various computer vision tasks, overtaking Convolutional Neural Networks (CNNs). Transformers can model long-range dependencies through a self-attention mechanism. This study provides a comprehensive survey of recent transformer-based approaches in image and video applications, as well as diffusion models. We begin by discussing existing surveys of vision transformers and comparing them with this work. We then review the main components of a vanilla transformer network, including the self-attention mechanism, feed-forward network, and position encoding. In the main part of the survey, we review recent transformer-based models in three categories: Transformer for downstream tasks, Vision Transformer for Generation, and Vision Transformer for Segmentation. We also provide a comprehensive overview of recent transformer models for video tasks and diffusion models. We compare the performance of various hierarchical transformer networks for multiple tasks on popular benchmark datasets. Finally, we explore future research directions to further advance the field.
Abstract: Dry-type cast resin distribution transformers (CRTs) are the second generation of air-cooled distribution transformers, in which resin replaces oil for electrical insulation. CRTs may be installed indoors, adjacent to or near residential areas, since they are clean and safe compared with conventional transformers. However, noise is intrinsic to all types of transformers and is unavoidable for CRTs as well. Minimizing the noise level of these transformers is important for biological and ergonomic reasons. The transformer core is known to be the main source of noise. In this paper, an experimental and numerical investigation is carried out on a large number of CRTs fabricated at IT Co (Iran Transfo Company) to evaluate the effect of the core's geometrical parameters on the overall sound level of the transformers. The noise level of each sample is measured according to the criteria of IEC 60651 and reported in decibels (dB). Numerical simulation is performed with a noncommercial version of ANSYS Workbench to extract the first six natural frequencies, reported in Hz, and the mode shapes of the CRT cores. Three novel non-dimensional variables for the geometry of the transformer core are introduced. Both experimental and numerical results show approximately similar responses to these variables. The correlation between natural frequencies and noise level is evaluated statistically. The Pearson coefficient shows a strong correlation between the first two natural frequencies and the noise level of CRTs. The results show that the noise level decreases as the first two natural frequencies increase and, conversely, increases as they decrease. Finally, the noise level is decomposed into two parts.
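The statistical step in the abstract above relies on the Pearson correlation coefficient. The minimal sketch below shows how such a coefficient is computed. The frequency and noise values are hypothetical, chosen only to show a negative relationship of the kind the abstract describes; they are not measurements from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical first natural frequencies (Hz) and noise levels (dB) of four cores.
first_natural_freq_hz = [100.0, 120.0, 140.0, 160.0]
noise_level_db = [60.0, 58.0, 57.0, 54.0]
r = pearson_r(first_natural_freq_hz, noise_level_db)
```

A coefficient close to -1, as in this toy data, means noise falls almost linearly as the natural frequency rises. A value near 0 would indicate no linear relationship.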
Abstract: Structured microgrids (SμGs) and flexible electronic large power transformers (FeLPTs) are emerging as two essential technologies for renewable energy integration, flexible power transmission, and active control. SμGs integrate renewable energy and storage to balance energy demand and supply as needed for a given system design. The flexibility of FeLPTs in processing, control, and reconfigurability enables flexible transmission with effective flow control and supports SμG connectivity while retaining multiscale system-level control. Early adopters of combined heat and power have demonstrated significant economic benefits while reducing environmental footprints. These technologies also bring substantial benefits to utility companies. With storage and active control capabilities, a 300-percent increase in bulk transmission and distribution lines is possible without having to increase capacity. SμGs and FeLPTs will also leave the utility industry better prepared for the emerging large increase in base load demand from electric transportation and data centers. This is a win-win-win situation for the consumer, the utilities (grid operators), and the environment. SμGs and FeLPTs provide value in power substations, energy surety, reliability, resiliency, and security. It is also shown that the initial cost of SμG and FeLPT deployment can be easily offset by reduced operating costs, which in turn reduces the total life-cycle cost by 33% to 67%.