Editors Yang Wang,Xi'an Jiaotong University Dongbo Shi,Shanghai Jiaotong University Ye Sun,University College London Zhesi Shen,National Science Library,CAS Topic of the Special Issue What are the top questions to...Editors Yang Wang,Xi'an Jiaotong University Dongbo Shi,Shanghai Jiaotong University Ye Sun,University College London Zhesi Shen,National Science Library,CAS Topic of the Special Issue What are the top questions towards better science and innovation and the required data to answer these questions?展开更多
Editors Yang Wang,Xi'an Jiaotong University Dongbo Shi,Shanghai Jiaotong University Ye Sun,University College London Zhesi Shen,National Science Library,CASTopic of the Special Issue What are the top questions tow...Editors Yang Wang,Xi'an Jiaotong University Dongbo Shi,Shanghai Jiaotong University Ye Sun,University College London Zhesi Shen,National Science Library,CASTopic of the Special Issue What are the top questions towards better science and innovation and the required data to answer these questions?展开更多
Medical visual question answering(MedVQA)faces unique challenges due to the high precision required for images and the specialized nature of the questions.These challenges include insufficient feature extraction capab...Medical visual question answering(MedVQA)faces unique challenges due to the high precision required for images and the specialized nature of the questions.These challenges include insufficient feature extraction capabilities,a lack of textual priors,and incomplete information fusion and interaction.This paper proposes an enhanced bootstrapping language-image pre-training(BLIP)model for MedVQA based on multimodal feature augmentation and triple-path collaborative attention(FCA-BLIP)to address these issues.First,FCA-BLIP employs a unified bootstrap multimodal model architecture that integrates ResNet and bidirectional encoder representations from Transformer(BERT)models to enhance feature extraction capabilities.It enables a more precise analysis of the details in images and questions.Next,the pre-trained BLIP model is used to extract features from image-text sample pairs.The model can understand the semantic relationships and shared information between images and text.Finally,a novel attention structure is developed to fuse the multimodal feature vectors,thereby improving the alignment accuracy between modalities.Experimental results demonstrate that the proposed method performs well in clinical visual question-answering tasks.For the MedVQA task of staging diabetic macular edema in fundus imaging,the proposed method outperforms the existing major models in several performance metrics.展开更多
The consultation intention of emergency decision-makers in urban rail transit(URT)is input into the emergency knowledge base in the form of domain questions to obtain emergency decision support services.This approach ...The consultation intention of emergency decision-makers in urban rail transit(URT)is input into the emergency knowledge base in the form of domain questions to obtain emergency decision support services.This approach facilitates the rapid collection of complete knowledge and rules to form effective decisions.However,the current structured degree of the URT emergency knowledge base remains low,and the domain questions lack labeled datasets,resulting in a large deviation between the consultation outcomes and the intended objectives.To address this issue,this paper proposes a question intention recognition model for the URT emergency domain,leveraging knowledge graph(KG)and data enhancement technology.First,a structured storage of emergency cases and emergency plans is realized based on KG.Subsequently,a comprehensive question template is developed,and the labeled dataset of emergency domain questions in URT is generated through the KG.Lastly,data enhancement is applied by prompt learning and the NLP Chinese Data Augmentation(NLPCDA)tool,and the intention recognition model combining Generalized Auto-regression Pre-training for Language Understanding(XLNet)and Recurrent Convolutional Neural Network for Text Classification(TextRCNN)is constructed.Word embeddings are generated by XLNet,context information is further captured using Bidirectional Long Short-Term Memory Neural Network(BiLSTM),and salient features are extracted with Convolutional Neural Network(CNN).Experimental results demonstrate that the proposed model can enhance the clarity of classification and the identification of domain questions,thereby providing supportive knowledge for emergency decision-making in URT.展开更多
Visual question answering(VQA)is a multimodal task,involving a deep understanding of the image scene and the question’s meaning and capturing the relevant correlations between both modalities to infer the appropriate...Visual question answering(VQA)is a multimodal task,involving a deep understanding of the image scene and the question’s meaning and capturing the relevant correlations between both modalities to infer the appropriate answer.In this paper,we propose a VQA system intended to answer yes/no questions about real-world images,in Arabic.To support a robust VQA system,we work in two directions:(1)Using deep neural networks to semantically represent the given image and question in a fine-grainedmanner,namely ResNet-152 and Gated Recurrent Units(GRU).(2)Studying the role of the utilizedmultimodal bilinear pooling fusion technique in the trade-o.between the model complexity and the overall model performance.Some fusion techniques could significantly increase the model complexity,which seriously limits their applicability for VQA models.So far,there is no evidence of how efficient these multimodal bilinear pooling fusion techniques are for VQA systems dedicated to yes/no questions.Hence,a comparative analysis is conducted between eight bilinear pooling fusion techniques,in terms of their ability to reduce themodel complexity and improve themodel performance in this case of VQA systems.Experiments indicate that these multimodal bilinear pooling fusion techniques have improved the VQA model’s performance,until reaching the best performance of 89.25%.Further,experiments have proven that the number of answers in the developed VQA system is a critical factor that a.ects the effectiveness of these multimodal bilinear pooling techniques in achieving their main objective of reducing the model complexity.The Multimodal Local Perception Bilinear Pooling(MLPB)technique has shown the best balance between the model complexity and its performance,for VQA systems designed to answer yes/no questions.展开更多
In the context of power generation companies, vast amounts of specialized data and expert knowledge have been accumulated. However, challenges such as data silos and fragmented knowledge hinder the effective utilizati...In the context of power generation companies, vast amounts of specialized data and expert knowledge have been accumulated. However, challenges such as data silos and fragmented knowledge hinder the effective utilization of this information. This study proposes a novel framework for intelligent Question-and-Answer (Q&A) systems based on Retrieval-Augmented Generation (RAG) to address these issues. The system efficiently acquires domain-specific knowledge by leveraging external databases, including Relational Databases (RDBs) and graph databases, without additional fine-tuning for Large Language Models (LLMs). Crucially, the framework integrates a Dynamic Knowledge Base Updating Mechanism (DKBUM) and a Weighted Context-Aware Similarity (WCAS) method to enhance retrieval accuracy and mitigate inherent limitations of LLMs, such as hallucinations and lack of specialization. Additionally, the proposed DKBUM dynamically adjusts knowledge weights within the database, ensuring that the most recent and relevant information is utilized, while WCAS refines the alignment between queries and knowledge items by enhanced context understanding. Experimental validation demonstrates that the system can generate timely, accurate, and context-sensitive responses, making it a robust solution for managing complex business logic in specialized industries.展开更多
To improve question answering (QA) performance based on real-world web data sets,a new set of question classes and a general answer re-ranking model are defined.With pre-defined dictionary and grammatical analysis,t...To improve question answering (QA) performance based on real-world web data sets,a new set of question classes and a general answer re-ranking model are defined.With pre-defined dictionary and grammatical analysis,the question classifier draws both semantic and grammatical information into information retrieval and machine learning methods in the form of various training features,including the question word,the main verb of the question,the dependency structure,the position of the main auxiliary verb,the main noun of the question,the top hypernym of the main noun,etc.Then the QA query results are re-ranked by question class information.Experiments show that the questions in real-world web data sets can be accurately classified by the classifier,and the QA results after re-ranking can be obviously improved.It is proved that with both semantic and grammatical information,applications such as QA, built upon real-world web data sets, can be improved,thus showing better performance.展开更多
With recent advancements in robotic surgery,notable strides have been made in visual question answering(VQA).Existing VQA systems typically generate textual answers to questions but fail to indicate the location of th...With recent advancements in robotic surgery,notable strides have been made in visual question answering(VQA).Existing VQA systems typically generate textual answers to questions but fail to indicate the location of the relevant content within the image.This limitation restricts the interpretative capacity of the VQA models and their abil-ity to explore specific image regions.To address this issue,this study proposes a grounded VQA model for robotic surgery,capable of localizing a specific region during answer prediction.Drawing inspiration from prompt learning in language models,a dual-modality prompt model was developed to enhance precise multimodal information interactions.Specifically,two complementary prompters were introduced to effectively integrate visual and textual prompts into the encoding process of the model.A visual complementary prompter merges visual prompt knowl-edge with visual information features to guide accurate localization.The textual complementary prompter aligns vis-ual information with textual prompt knowledge and textual information,guiding textual information towards a more accurate inference of the answer.Additionally,a multiple iterative fusion strategy was adopted for comprehensive answer reasoning,to ensure high-quality generation of textual and grounded answers.The experimental results vali-date the effectiveness of the model,demonstrating its superiority over existing methods on the EndoVis-18 and End-oVis-17 datasets.展开更多
Editors Yang Wang,Xi'an Jiaotong University Dongbo Shi,Shanghai Jiaotong University Ye Sun,University College London Zhesi Shen,National Science Library,CAS Topic of the Special Issue What are the top questions to...Editors Yang Wang,Xi'an Jiaotong University Dongbo Shi,Shanghai Jiaotong University Ye Sun,University College London Zhesi Shen,National Science Library,CAS Topic of the Special Issue What are the top questions towards better science and innovation and the required data to answer these questions?展开更多
Determining clinical questions is fundamental to the development of clinical practice guidelines(CPGs),which bridges the initial phases and the final recommendations.It is essential for evidence retrieval and the form...Determining clinical questions is fundamental to the development of clinical practice guidelines(CPGs),which bridges the initial phases and the final recommendations.It is essential for evidence retrieval and the formulation of recommendations.The scientific rigor and precision in determination of clinical questions directly influence the future implementation and applicability of guidelines.In 2020,the World Federation of Acupuncture-Moxibustion Societies initiated the project of clinical practice guideline on acupuncture and moxibustion for adult major depressive disorder(mild-moderate degree)to address clinical and medical decision-making issues in acupuncture treatment for adult mild to moderate major depressive disorder.This CPG provides systematic recommendations based on clinical evidence,patient values,and other factors,aiding decision-makers,clinicians,and patients in selecting appropriate interventions.This paper discusses and analyzes the determination process of clinical questions,and the related issues during the development of this guideline,aiming to provide a reference for determining clinical questions and developing CPGs in the field of acupuncture and exploring more scientific tools and methods for determining clinical questions in future CPGs.展开更多
In the field of natural language processing(NLP),there have been various pre-training language models in recent years,with question answering systems gaining significant attention.However,as algorithms,data,and comput...In the field of natural language processing(NLP),there have been various pre-training language models in recent years,with question answering systems gaining significant attention.However,as algorithms,data,and computing power advance,the issue of increasingly larger models and a growing number of parameters has surfaced.Consequently,model training has become more costly and less efficient.To enhance the efficiency and accuracy of the training process while reducing themodel volume,this paper proposes a first-order pruningmodel PAL-BERT based on the ALBERT model according to the characteristics of question-answering(QA)system and language model.Firstly,a first-order network pruning method based on the ALBERT model is designed,and the PAL-BERT model is formed.Then,the parameter optimization strategy of the PAL-BERT model is formulated,and the Mish function was used as an activation function instead of ReLU to improve the performance.Finally,after comparison experiments with traditional deep learning models TextCNN and BiLSTM,it is confirmed that PALBERT is a pruning model compression method that can significantly reduce training time and optimize training efficiency.Compared with traditional models,PAL-BERT significantly improves the NLP task’s performance.展开更多
Recent advancements in natural language processing have given rise to numerous pre-training language models in question-answering systems.However,with the constant evolution of algorithms,data,and computing power,the ...Recent advancements in natural language processing have given rise to numerous pre-training language models in question-answering systems.However,with the constant evolution of algorithms,data,and computing power,the increasing size and complexity of these models have led to increased training costs and reduced efficiency.This study aims to minimize the inference time of such models while maintaining computational performance.It also proposes a novel Distillation model for PAL-BERT(DPAL-BERT),specifically,employs knowledge distillation,using the PAL-BERT model as the teacher model to train two student models:DPAL-BERT-Bi and DPAL-BERTC.This research enhances the dataset through techniques such as masking,replacement,and n-gram sampling to optimize knowledge transfer.The experimental results showed that the distilled models greatly outperform models trained from scratch.In addition,although the distilled models exhibit a slight decrease in performance compared to PAL-BERT,they significantly reduce inference time to just 0.25%of the original.This demonstrates the effectiveness of the proposed approach in balancing model performance and efficiency.展开更多
The weapon and equipment operational requirement analysis(WEORA) is a necessary condition to win a future war,among which the acquisition of knowledge about weapons and equipment is a great challenge. The main challen...The weapon and equipment operational requirement analysis(WEORA) is a necessary condition to win a future war,among which the acquisition of knowledge about weapons and equipment is a great challenge. The main challenge is that the existing weapons and equipment data fails to carry out structured knowledge representation, and knowledge navigation based on natural language cannot efficiently support the WEORA. To solve above problem, this research proposes a method based on question answering(QA) of weapons and equipment knowledge graph(WEKG) to construct and navigate the knowledge related to weapons and equipment in the WEORA. This method firstly constructs the WEKG, and builds a neutral network-based QA system over the WEKG by means of semantic parsing for knowledge navigation. Finally, the method is evaluated and a chatbot on the QA system is developed for the WEORA. Our proposed method has good performance in the accuracy and efficiency of searching target knowledge, and can well assist the WEORA.展开更多
Background External knowledge representations play an essential role in knowledge-based visual question and answering to better understand complex scenarios in the open world.Recent entity-relationship embedding appro...Background External knowledge representations play an essential role in knowledge-based visual question and answering to better understand complex scenarios in the open world.Recent entity-relationship embedding approaches are deficient in representing some complex relations,resulting in a lack of topic-related knowledge and redundancy in topic-irrelevant information.Methods To this end,we propose MKEAH:Multimodal Knowledge Extraction and Accumulation on Hyperplanes.To ensure that the lengths of the feature vectors projected onto the hyperplane compare equally and to filter out sufficient topic-irrelevant information,two losses are proposed to learn the triplet representations from the complementary views:range loss and orthogonal loss.To interpret the capability of extracting topic-related knowledge,we present the Topic Similarity(TS)between topic and entity-relations.Results Experimental results demonstrate the effectiveness of hyperplane embedding for knowledge representation in knowledge-based visual question answering.Our model outperformed state-of-the-art methods by 2.12%and 3.24%on two challenging knowledge-request datasets:OK-VQA and KRVQA,respectively.Conclusions The obvious advantages of our model in TS show that using hyperplane embedding to represent multimodal knowledge can improve its ability to extract topic-related knowledge.展开更多
In literature,differences in the description of female and male characters have been noticeable for a long time.In this study,the novel Jane Eyre is uses as a material to investigate whether there are in fact any sign...In literature,differences in the description of female and male characters have been noticeable for a long time.In this study,the novel Jane Eyre is uses as a material to investigate whether there are in fact any significant differences in the questions Mr.Rochester and Jane use and how the questions function to portray these two main characters.展开更多
A new ontology-based question expansion (OBQE) method is proposed for question similarity calculation in a frequently asked question (FAQ) answering system. Traditional question similarity calculation methods use ...A new ontology-based question expansion (OBQE) method is proposed for question similarity calculation in a frequently asked question (FAQ) answering system. Traditional question similarity calculation methods use "word" to compose question vector, that the semantic relations between words are ignored. OBQE takes the relation as an important part. The process of the new system is:① to build two-layered domain ontology referring to WordNet and domain corpse;② to expand question trunks into domain cases;③ to use domain case composed vector to calculate question similarity. The experimental result shows that the performance of question similarity calculation with OBQE is being improved.展开更多
文摘Editors Yang Wang,Xi'an Jiaotong University Dongbo Shi,Shanghai Jiaotong University Ye Sun,University College London Zhesi Shen,National Science Library,CAS Topic of the Special Issue What are the top questions towards better science and innovation and the required data to answer these questions?
文摘Editors Yang Wang,Xi'an Jiaotong University Dongbo Shi,Shanghai Jiaotong University Ye Sun,University College London Zhesi Shen,National Science Library,CASTopic of the Special Issue What are the top questions towards better science and innovation and the required data to answer these questions?
基金Supported by the Program for Liaoning Excellent Talents in University(No.LR15045)the Liaoning Provincial Science and Technology Department Applied Basic Research Plan(No.101300243).
文摘Medical visual question answering(MedVQA)faces unique challenges due to the high precision required for images and the specialized nature of the questions.These challenges include insufficient feature extraction capabilities,a lack of textual priors,and incomplete information fusion and interaction.This paper proposes an enhanced bootstrapping language-image pre-training(BLIP)model for MedVQA based on multimodal feature augmentation and triple-path collaborative attention(FCA-BLIP)to address these issues.First,FCA-BLIP employs a unified bootstrap multimodal model architecture that integrates ResNet and bidirectional encoder representations from Transformer(BERT)models to enhance feature extraction capabilities.It enables a more precise analysis of the details in images and questions.Next,the pre-trained BLIP model is used to extract features from image-text sample pairs.The model can understand the semantic relationships and shared information between images and text.Finally,a novel attention structure is developed to fuse the multimodal feature vectors,thereby improving the alignment accuracy between modalities.Experimental results demonstrate that the proposed method performs well in clinical visual question-answering tasks.For the MedVQA task of staging diabetic macular edema in fundus imaging,the proposed method outperforms the existing major models in several performance metrics.
基金supported in part by the National Natural Science Foundation of China.The funding numbers 62433005,62272036,62132003,and 62173167.
文摘The consultation intention of emergency decision-makers in urban rail transit(URT)is input into the emergency knowledge base in the form of domain questions to obtain emergency decision support services.This approach facilitates the rapid collection of complete knowledge and rules to form effective decisions.However,the current structured degree of the URT emergency knowledge base remains low,and the domain questions lack labeled datasets,resulting in a large deviation between the consultation outcomes and the intended objectives.To address this issue,this paper proposes a question intention recognition model for the URT emergency domain,leveraging knowledge graph(KG)and data enhancement technology.First,a structured storage of emergency cases and emergency plans is realized based on KG.Subsequently,a comprehensive question template is developed,and the labeled dataset of emergency domain questions in URT is generated through the KG.Lastly,data enhancement is applied by prompt learning and the NLP Chinese Data Augmentation(NLPCDA)tool,and the intention recognition model combining Generalized Auto-regression Pre-training for Language Understanding(XLNet)and Recurrent Convolutional Neural Network for Text Classification(TextRCNN)is constructed.Word embeddings are generated by XLNet,context information is further captured using Bidirectional Long Short-Term Memory Neural Network(BiLSTM),and salient features are extracted with Convolutional Neural Network(CNN).Experimental results demonstrate that the proposed model can enhance the clarity of classification and the identification of domain questions,thereby providing supportive knowledge for emergency decision-making in URT.
文摘Visual question answering(VQA)is a multimodal task,involving a deep understanding of the image scene and the question’s meaning and capturing the relevant correlations between both modalities to infer the appropriate answer.In this paper,we propose a VQA system intended to answer yes/no questions about real-world images,in Arabic.To support a robust VQA system,we work in two directions:(1)Using deep neural networks to semantically represent the given image and question in a fine-grainedmanner,namely ResNet-152 and Gated Recurrent Units(GRU).(2)Studying the role of the utilizedmultimodal bilinear pooling fusion technique in the trade-o.between the model complexity and the overall model performance.Some fusion techniques could significantly increase the model complexity,which seriously limits their applicability for VQA models.So far,there is no evidence of how efficient these multimodal bilinear pooling fusion techniques are for VQA systems dedicated to yes/no questions.Hence,a comparative analysis is conducted between eight bilinear pooling fusion techniques,in terms of their ability to reduce themodel complexity and improve themodel performance in this case of VQA systems.Experiments indicate that these multimodal bilinear pooling fusion techniques have improved the VQA model’s performance,until reaching the best performance of 89.25%.Further,experiments have proven that the number of answers in the developed VQA system is a critical factor that a.ects the effectiveness of these multimodal bilinear pooling techniques in achieving their main objective of reducing the model complexity.The Multimodal Local Perception Bilinear Pooling(MLPB)technique has shown the best balance between the model complexity and its performance,for VQA systems designed to answer yes/no questions.
文摘In the context of power generation companies, vast amounts of specialized data and expert knowledge have been accumulated. However, challenges such as data silos and fragmented knowledge hinder the effective utilization of this information. This study proposes a novel framework for intelligent Question-and-Answer (Q&A) systems based on Retrieval-Augmented Generation (RAG) to address these issues. The system efficiently acquires domain-specific knowledge by leveraging external databases, including Relational Databases (RDBs) and graph databases, without additional fine-tuning for Large Language Models (LLMs). Crucially, the framework integrates a Dynamic Knowledge Base Updating Mechanism (DKBUM) and a Weighted Context-Aware Similarity (WCAS) method to enhance retrieval accuracy and mitigate inherent limitations of LLMs, such as hallucinations and lack of specialization. Additionally, the proposed DKBUM dynamically adjusts knowledge weights within the database, ensuring that the most recent and relevant information is utilized, while WCAS refines the alignment between queries and knowledge items by enhanced context understanding. Experimental validation demonstrates that the system can generate timely, accurate, and context-sensitive responses, making it a robust solution for managing complex business logic in specialized industries.
基金Microsoft Research Asia Internet Services in Academic Research Fund(No.FY07-RES-OPP-116)the Science and Technology Development Program of Tianjin(No.06YFGZGX05900)
文摘To improve question answering (QA) performance based on real-world web data sets,a new set of question classes and a general answer re-ranking model are defined.With pre-defined dictionary and grammatical analysis,the question classifier draws both semantic and grammatical information into information retrieval and machine learning methods in the form of various training features,including the question word,the main verb of the question,the dependency structure,the position of the main auxiliary verb,the main noun of the question,the top hypernym of the main noun,etc.Then the QA query results are re-ranked by question class information.Experiments show that the questions in real-world web data sets can be accurately classified by the classifier,and the QA results after re-ranking can be obviously improved.It is proved that with both semantic and grammatical information,applications such as QA, built upon real-world web data sets, can be improved,thus showing better performance.
基金supported in part by the National Key Research and Development Program of China,No.2021ZD0112400National Natural Science Foundation of China,No.U1908214+5 种基金Program for Innovative Research Team at the University of Liaoning Province,No.LT2020015the Support Plan for Key Field Innovation Team of Dalian,No.2021RT06the Support Plan for Leading Innovation Team of Dalian University,No.XLJ202010Program for the Liaoning Province Doctoral Research Starting Fund,No.2022-BS-336Key Laboratory of Advanced Design and Intelligent Computing(Dalian University),and Ministry of Education,No.ADIC2022003Interdisciplinary Project of Dalian University,No.DLUXK-2023-QN-015.
文摘With recent advancements in robotic surgery,notable strides have been made in visual question answering(VQA).Existing VQA systems typically generate textual answers to questions but fail to indicate the location of the relevant content within the image.This limitation restricts the interpretative capacity of the VQA models and their abil-ity to explore specific image regions.To address this issue,this study proposes a grounded VQA model for robotic surgery,capable of localizing a specific region during answer prediction.Drawing inspiration from prompt learning in language models,a dual-modality prompt model was developed to enhance precise multimodal information interactions.Specifically,two complementary prompters were introduced to effectively integrate visual and textual prompts into the encoding process of the model.A visual complementary prompter merges visual prompt knowl-edge with visual information features to guide accurate localization.The textual complementary prompter aligns vis-ual information with textual prompt knowledge and textual information,guiding textual information towards a more accurate inference of the answer.Additionally,a multiple iterative fusion strategy was adopted for comprehensive answer reasoning,to ensure high-quality generation of textual and grounded answers.The experimental results vali-date the effectiveness of the model,demonstrating its superiority over existing methods on the EndoVis-18 and End-oVis-17 datasets.
文摘Editors Yang Wang,Xi'an Jiaotong University Dongbo Shi,Shanghai Jiaotong University Ye Sun,University College London Zhesi Shen,National Science Library,CAS Topic of the Special Issue What are the top questions towards better science and innovation and the required data to answer these questions?
基金Supported by National Key Research and Development Program:2017YFC1703606Shenzhen Science and Technology Program:JCYJ20210324120804012+1 种基金2021 Luohu Soft Science Research Program Project:LX202101022022 Luohu Soft Science Research Program Project:LX202202128。
文摘Determining clinical questions is fundamental to the development of clinical practice guidelines(CPGs),which bridges the initial phases and the final recommendations.It is essential for evidence retrieval and the formulation of recommendations.The scientific rigor and precision in determination of clinical questions directly influence the future implementation and applicability of guidelines.In 2020,the World Federation of Acupuncture-Moxibustion Societies initiated the project of clinical practice guideline on acupuncture and moxibustion for adult major depressive disorder(mild-moderate degree)to address clinical and medical decision-making issues in acupuncture treatment for adult mild to moderate major depressive disorder.This CPG provides systematic recommendations based on clinical evidence,patient values,and other factors,aiding decision-makers,clinicians,and patients in selecting appropriate interventions.This paper discusses and analyzes the determination process of clinical questions,and the related issues during the development of this guideline,aiming to provide a reference for determining clinical questions and developing CPGs in the field of acupuncture and exploring more scientific tools and methods for determining clinical questions in future CPGs.
基金Supported by Sichuan Science and Technology Program(2021YFQ0003,2023YFSY0026,2023YFH0004).
文摘In the field of natural language processing(NLP),there have been various pre-training language models in recent years,with question answering systems gaining significant attention.However,as algorithms,data,and computing power advance,the issue of increasingly larger models and a growing number of parameters has surfaced.Consequently,model training has become more costly and less efficient.To enhance the efficiency and accuracy of the training process while reducing themodel volume,this paper proposes a first-order pruningmodel PAL-BERT based on the ALBERT model according to the characteristics of question-answering(QA)system and language model.Firstly,a first-order network pruning method based on the ALBERT model is designed,and the PAL-BERT model is formed.Then,the parameter optimization strategy of the PAL-BERT model is formulated,and the Mish function was used as an activation function instead of ReLU to improve the performance.Finally,after comparison experiments with traditional deep learning models TextCNN and BiLSTM,it is confirmed that PALBERT is a pruning model compression method that can significantly reduce training time and optimize training efficiency.Compared with traditional models,PAL-BERT significantly improves the NLP task’s performance.
基金supported by Sichuan Science and Technology Program(2023YFSY0026,2023YFH0004).
文摘Recent advancements in natural language processing have given rise to numerous pre-training language models in question-answering systems.However,with the constant evolution of algorithms,data,and computing power,the increasing size and complexity of these models have led to increased training costs and reduced efficiency.This study aims to minimize the inference time of such models while maintaining computational performance.It also proposes a novel Distillation model for PAL-BERT(DPAL-BERT),specifically,employs knowledge distillation,using the PAL-BERT model as the teacher model to train two student models:DPAL-BERT-Bi and DPAL-BERTC.This research enhances the dataset through techniques such as masking,replacement,and n-gram sampling to optimize knowledge transfer.The experimental results showed that the distilled models greatly outperform models trained from scratch.In addition,although the distilled models exhibit a slight decrease in performance compared to PAL-BERT,they significantly reduce inference time to just 0.25%of the original.This demonstrates the effectiveness of the proposed approach in balancing model performance and efficiency.
文摘The weapon and equipment operational requirement analysis(WEORA) is a necessary condition to win a future war,among which the acquisition of knowledge about weapons and equipment is a great challenge. The main challenge is that the existing weapons and equipment data fails to carry out structured knowledge representation, and knowledge navigation based on natural language cannot efficiently support the WEORA. To solve above problem, this research proposes a method based on question answering(QA) of weapons and equipment knowledge graph(WEKG) to construct and navigate the knowledge related to weapons and equipment in the WEORA. This method firstly constructs the WEKG, and builds a neutral network-based QA system over the WEKG by means of semantic parsing for knowledge navigation. Finally, the method is evaluated and a chatbot on the QA system is developed for the WEORA. Our proposed method has good performance in the accuracy and efficiency of searching target knowledge, and can well assist the WEORA.
基金Supported by National Nature Science Foudation of China(61976160,61906137,61976158,62076184,62076182)Shanghai Science and Technology Plan Project(21DZ1204800)。
文摘Background External knowledge representations play an essential role in knowledge-based visual question and answering to better understand complex scenarios in the open world.Recent entity-relationship embedding approaches are deficient in representing some complex relations,resulting in a lack of topic-related knowledge and redundancy in topic-irrelevant information.Methods To this end,we propose MKEAH:Multimodal Knowledge Extraction and Accumulation on Hyperplanes.To ensure that the lengths of the feature vectors projected onto the hyperplane compare equally and to filter out sufficient topic-irrelevant information,two losses are proposed to learn the triplet representations from the complementary views:range loss and orthogonal loss.To interpret the capability of extracting topic-related knowledge,we present the Topic Similarity(TS)between topic and entity-relations.Results Experimental results demonstrate the effectiveness of hyperplane embedding for knowledge representation in knowledge-based visual question answering.Our model outperformed state-of-the-art methods by 2.12%and 3.24%on two challenging knowledge-request datasets:OK-VQA and KRVQA,respectively.Conclusions The obvious advantages of our model in TS show that using hyperplane embedding to represent multimodal knowledge can improve its ability to extract topic-related knowledge.
文摘In literature,differences in the description of female and male characters have been noticeable for a long time.In this study,the novel Jane Eyre is uses as a material to investigate whether there are in fact any significant differences in the questions Mr.Rochester and Jane use and how the questions function to portray these two main characters.
文摘A new ontology-based question expansion (OBQE) method is proposed for question similarity calculation in a frequently asked question (FAQ) answering system. Traditional question similarity calculation methods use "word" to compose question vector, that the semantic relations between words are ignored. OBQE takes the relation as an important part. The process of the new system is:① to build two-layered domain ontology referring to WordNet and domain corpse;② to expand question trunks into domain cases;③ to use domain case composed vector to calculate question similarity. The experimental result shows that the performance of question similarity calculation with OBQE is being improved.