Abstract: Full ceramic bearings are mission-critical components in oil-free environments, such as food processing, semiconductor manufacturing, and medical applications. Developing effective fault diagnosis methods for these bearings is essential to ensuring operational reliability and preventing costly failures. Traditional supervised deep learning approaches have demonstrated promise in fault detection, but their dependence on large labeled datasets poses significant challenges in industrial settings where fault-labeled data is scarce. This paper introduces a few-shot learning approach for full ceramic bearing fault diagnosis that leverages the pre-trained GPT-2 model. Large language models (LLMs) like GPT-2, pre-trained on diverse textual data, exhibit remarkable transfer learning and few-shot learning capabilities, making them well suited to applications with limited labeled data. In this study, acoustic emission (AE) signals from bearings were processed using empirical mode decomposition (EMD), and the extracted AE features were converted into structured text for fine-tuning GPT-2 as a fault classifier. To enhance performance, we incorporated a modified loss function and a softmax activation based on cosine similarity, ensuring better generalization in fault identification. Experimental evaluations on a laboratory-collected full ceramic bearing dataset demonstrated that the proposed approach achieved high diagnostic accuracy with as few as five labeled samples, outperforming conventional methods such as k-nearest neighbor (KNN), the large memory storage and retrieval (LAMSTAR) neural network, a deep neural network (DNN), a recurrent neural network (RNN), a long short-term memory (LSTM) network, and model-agnostic meta-learning (MAML). The results highlight LLMs' potential to revolutionize fault diagnosis, enabling faster deployment, reduced reliance on extensive labeled datasets, and improved adaptability in industrial monitoring systems.
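The abstract above mentions a softmax activation combined with cosine similarity for few-shot fault classification. The authors' exact formulation is not given; as a minimal numpy sketch, a cosine-similarity softmax head over hypothetical class-prototype vectors (standing in for learned fault-class representations) might look like this:

```python
import numpy as np

def cosine_softmax(embedding, prototypes, tau=10.0):
    """Classify an embedding by a softmax over cosine similarities
    to per-class prototype vectors (tau is a temperature scale)."""
    e = embedding / np.linalg.norm(embedding)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = tau * (p @ e)                 # cosine similarity per class
    z = np.exp(logits - logits.max())      # numerically stable softmax
    return z / z.sum()

# Hypothetical 4-class prototypes (e.g. mean embeddings of 5 support samples each)
rng = np.random.default_rng(0)
protos = rng.normal(size=(4, 8))
query = protos[2] + 0.05 * rng.normal(size=8)  # a query near class 2
probs = cosine_softmax(query, protos)
assert probs.argmax() == 2
```

Because cosine similarity normalizes away vector magnitude, such a head tends to generalize better than a plain dot-product softmax when only a handful of labeled samples define each class.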
Funding: Supported by the MSIT (Ministry of Science and ICT), Republic of Korea, under the ITRC (Information Technology Research Centre) support program (IITP-2024-RS-2024-00437191), supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
Abstract: Dialogue State Tracking (DST) is a critical component of task-oriented spoken dialogue systems (SDS), tasked with maintaining an accurate representation of the conversational state by predicting slots and their corresponding values. Recent advances leverage large language models (LLMs) with prompt-based tuning to improve tracking accuracy and efficiency. However, these approaches often incur substantial computational and memory overheads and typically address slot extraction implicitly within prompts, without explicitly modeling the complex dependencies between slots and values. In this work, we propose PUGG, a novel DST framework that constructs schema-driven prompts to fine-tune GPT-2 and uses its tokenizer to implement a memory encoder. PUGG explicitly extracts slot values via GPT-2 and employs graph attention networks (GATs) to model and reason over the intricate relationships between slots and their associated values. We evaluate PUGG on four publicly available datasets, where it achieves state-of-the-art performance across multiple evaluation metrics, highlighting its robustness and generalizability in diverse conversational scenarios. Our results indicate that the integration of GPT-2 substantially reduces model complexity and memory consumption by streamlining key processes. Moreover, prompt tuning enhances the model's flexibility and precision in extracting relevant slot-value pairs, while the incorporation of GATs facilitates effective relational reasoning, leading to improved dialogue state representations.
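PUGG's GAT component is described only at a high level in the abstract. As an illustrative sketch, a standard single-head graph attention layer (not the paper's exact architecture) aggregating slot/value node features over an adjacency matrix could look like this:

```python
import numpy as np

def gat_layer(H, A, W, a):
    """Single-head graph attention layer (standard GAT-style).
    H: (n, f) node features; A: (n, n) adjacency (1 = edge, incl. self-loops);
    W: (f, f2) projection; a: (2*f2,) attention vector."""
    Z = H @ W                                        # project node features
    n = Z.shape[0]
    E = np.zeros((n, n))
    for i in range(n):                               # e_ij = LeakyReLU(a^T [z_i || z_j])
        for j in range(n):
            s = a @ np.concatenate([Z[i], Z[j]])
            E[i, j] = s if s > 0 else 0.2 * s        # LeakyReLU, slope 0.2
    E = np.where(A > 0, E, -1e9)                     # mask out non-edges
    alpha = np.exp(E - E.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True) # row-wise softmax
    return alpha @ Z                                 # attention-weighted aggregation

# Toy graph: 3 nodes (e.g. a slot linked to two candidate values)
rng = np.random.default_rng(1)
H = rng.normal(size=(3, 4))
A = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
out = gat_layer(H, A, rng.normal(size=(4, 4)), rng.normal(size=8))
assert out.shape == (3, 4)
```

Masking non-edges to a large negative value before the softmax is the usual trick that restricts each node's attention to its graph neighborhood.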
Abstract: Image caption generation is a key task in computer vision, aiming to predict relevant textual information from a given input image so as to accurately understand and express the image's content. This paper proposes a model inspired by the mean teacher algorithm, built on a distinctive dual-branch network architecture. To improve accuracy and stability, a position-wise feed-forward block is introduced into each branch. For image feature extraction, Contrastive Language-Image Pre-training (CLIP) is used to obtain multi-level image features and better capture the image's semantic information. In the caption generation stage, a mapping network converts image features into textual representations, and GPT-2 is then used to improve prediction accuracy and semantic coherence. To validate the model, extensive training and testing were conducted on the Microsoft Common Objects in Context (MSCOCO) and Flickr30k image captioning datasets. The results show that the proposed model performs well on both datasets, confirming its effectiveness and practicality for image caption generation. This work offers new ideas and methods for the field, with significant theoretical and practical implications.
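The mapping network that bridges CLIP image features and GPT-2 is not specified in detail above. A minimal two-layer MLP sketch of the idea, with hypothetical dimensions (a 512-d CLIP embedding mapped to 10 prefix embeddings of GPT-2's 768-d hidden size), might be:

```python
import numpy as np

def mapping_network(clip_feat, W1, b1, W2, b2, prefix_len, gpt_dim):
    """Map a CLIP image embedding to `prefix_len` pseudo-token embeddings
    that can be fed to GPT-2 as a caption-conditioning prefix."""
    h = np.tanh(clip_feat @ W1 + b1)      # hidden layer
    out = h @ W2 + b2                     # flat (prefix_len * gpt_dim,) vector
    return out.reshape(prefix_len, gpt_dim)

# Hypothetical dimensions: CLIP 512-d -> 10 prefix tokens of 768-d each
rng = np.random.default_rng(2)
clip_feat = rng.normal(size=512)
W1, b1 = rng.normal(size=(512, 256)) * 0.02, np.zeros(256)
W2, b2 = rng.normal(size=(256, 10 * 768)) * 0.02, np.zeros(10 * 768)
prefix = mapping_network(clip_feat, W1, b1, W2, b2, 10, 768)
assert prefix.shape == (10, 768)
```

In a trained system these prefix embeddings would be concatenated in front of the caption token embeddings, so GPT-2 generates text conditioned on the image.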
Funding: Supported by the Ministry of Science, Technological Development and Innovation of the Republic of Serbia; these results are part of Grant No. 451-03-66/2024-03/200132 with the University of Kragujevac, Faculty of Technical Sciences Cacak.
Abstract: With the increasing use of web applications, challenges in the field of cybersecurity are becoming more complex. This paper explores the application of fine-tuned large language models (LLMs) for the automatic generation of synthetic attacks, including XSS (cross-site scripting), SQL injections, and command injections. A web application has been developed that allows penetration testers to quickly generate high-quality payloads without the need for in-depth knowledge of artificial intelligence. The fine-tuned language model demonstrates the capability to produce synthetic payloads that closely resemble real-world attacks. This approach not only improves the model's precision and dependability but also serves as a practical resource for cybersecurity professionals seeking to strengthen the security of web applications. The methodology and structured implementation underscore the importance and potential of advanced language models in cybersecurity, illustrating their effectiveness in generating high-quality synthetic data for penetration-testing purposes. The results demonstrate that this approach enables the identification of vulnerabilities that traditional methods may not uncover, providing deeper insight into potential threats and enhancing overall security measures. The performance evaluation of the model showed satisfactory results, and further hyperparameter optimization could improve accuracy and generalization. This research represents a significant step forward in improving web application security and opens new opportunities for the use of LLMs in security testing, contributing to the development of more effective cybersecurity strategies.
Abstract: This paper explores how well the GPT-2 model learns the characteristics of classical Chinese. GPT-2 was trained and tested on a small classical Chinese dataset, and the generated text was compared with classical Chinese produced by other methods such as LSTM and sequence-to-sequence models. Four different models were constructed and two different generation methods were used to produce samples; generated samples were randomly selected, with two samples saved and categorized for each combination of model and generation method. Evaluation of the generated samples shows that the GPT-2 model is superior.
Funding: This work is fully supported by the Ministry of Science and Technology, Taiwan, Republic of China, under Grant Nos. MOST 110-2622-E-390-001 and MOST 109-2622-E-390-002-CC3.
Abstract: This paper introduces a novel transform method that produces newly generated programs through a code transform model, the second-generation Generative Pre-trained Transformer (GPT-2), significantly improving program execution performance. In addition, a theoretical statistical estimate gives the minimum number of generated programs required, which guarantees that the best one can be found among them. The proposed approach can help voice assistant machines resolve the problem of inefficient execution of application code. Beyond GPT-2, this study develops a variational Simhash algorithm to check the code similarity between a sample program and a newly generated program, and devises a piecewise longest common subsequence algorithm to examine the conformity of the two programs' execution outputs. The code similarity check removes redundant generated programs, and the output conformity check identifies the best-performing generated program. In addition to text, the proposed approach can also process other media, including images, sounds, and movies. As a result, the newly generated program significantly outperforms the sample program: the number of code lines is reduced by 27.21%, and the program execution time is shortened by 24.62%.
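The piecewise longest common subsequence algorithm used for the conformity check is not spelled out in the abstract. The classic dynamic-programming LCS it builds on, applied to program output sequences, can be sketched as follows (the normalized conformity score here is an illustrative choice, not the authors' exact metric):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two sequences,
    computed with the standard O(len(a) * len(b)) DP table."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n]

def conformity(out1, out2):
    """Illustrative conformity score: LCS length between two programs'
    output sequences, normalized by the longer sequence."""
    return lcs_len(out1, out2) / max(len(out1), len(out2), 1)

assert lcs_len("ABCBDAB", "BDCABA") == 4        # classic textbook example
assert conformity(["1", "2", "3"], ["1", "3"]) == 2 / 3
```

A score of 1.0 would mean one output sequence is entirely contained, in order, within the other, so two functionally equivalent programs should score near 1.0 on matching test inputs.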
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62273272 and 61873277; in part by the China Postdoctoral Science Foundation under Grant 2020M673446; in part by the Key Research and Development Program of Shaanxi Province under Grant 2023-YBGY-243; and in part by the Youth Innovation Team of Shaanxi Universities.
Abstract: Currently, video captioning models based on an encoder-decoder mainly rely on a single video input source. The content of video captions is limited because few studies have employed external corpus information to guide caption generation, which hinders accurate description and understanding of video content. To address this issue, a novel video captioning method guided by a sentence retrieval generation network (ED-SRG) is proposed in this paper. First, a ResNeXt network model, an efficient convolutional network for online video understanding (ECO) model, and a long short-term memory (LSTM) network model are integrated to construct an encoder-decoder, which extracts the 2D features, 3D features, and object features of the video data, respectively. These features are decoded into textual sentences that conform to the video content and serve as queries for sentence retrieval. Then, a sentence-transformer network model is employed to retrieve sentences from an external corpus that are semantically similar to those textual sentences, and candidate sentences are screened through similarity measurement. Finally, a novel GPT-2 network model is constructed based on the GPT-2 network structure. The model introduces a designed random selector that randomly selects predicted words with high probability in the corpus, guiding the generation of textual sentences that more closely match natural human language. The proposed method is compared with several existing works in experiments. The results show that the BLEU-4, CIDEr, ROUGE_L, and METEOR scores improve by 3.1%, 1.3%, 0.3%, and 1.5% on the public MSVD dataset, and by 1.3%, 0.5%, 0.2%, and 1.9% on the public MSR-VTT dataset, respectively. The proposed method thus generates video captions with richer semantics than several state-of-the-art approaches.
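The "random selector" is described as randomly selecting predicted words with high probability rather than always taking the argmax. A common way to realize that idea is top-k sampling, sketched here as a generic technique (not necessarily the paper's exact selector):

```python
import numpy as np

def random_selector(probs, vocab, k=3, rng=None):
    """Pick the next word at random among the k highest-probability
    candidates (renormalized), instead of always taking the argmax."""
    rng = rng or np.random.default_rng()
    top = np.argsort(probs)[-k:]           # indices of the k best words
    p = probs[top] / probs[top].sum()      # renormalize over the top-k
    return vocab[rng.choice(top, p=p)]

# Hypothetical next-word distribution over a toy vocabulary
vocab = np.array(["the", "cat", "sits", "on", "a", "mat"])
probs = np.array([0.05, 0.40, 0.25, 0.10, 0.05, 0.15])
word = random_selector(probs, vocab, k=3, rng=np.random.default_rng(0))
assert word in {"cat", "sits", "mat"}      # always one of the top-3 candidates
```

Restricting the draw to the top-k candidates keeps the output fluent while the residual randomness makes the generated sentences read less mechanically than pure greedy decoding.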