Abstract: To address incomplete characterization of text relations, weak guidance from latent representations, and low generation quality in controllable long-text generation, this paper proposes a new model, GSPT-CVAE (Graph Structured Processing, Single Vector, and Potential Attention Computing Transformer-Based Conditioned Variational Autoencoder). The model obtains a more comprehensive representation of textual relations by applying graph-structured processing to the input text. It then merges the resulting vector sequence, by weighted combination, into a single vector that serves as an effective latent representation. When the latent representation guides text generation, the model combines traditional embedding with latent attention computation so that the latent representation can fully steer the generated text, improving the controllability and effectiveness of generation. The experimental results show that the model has strong representation learning ability and learns rich, useful representations of textual relations. The model also achieves satisfactory effectiveness and controllability and can generate long texts that match the given constraints. It reaches a ROUGE-1 F1 score of 0.243, a ROUGE-2 F1 score of 0.041, a ROUGE-L F1 score of 0.22, and a PPL-Word score of 34.303, which gives GSPT-CVAE a certain advantage over the baseline model. The paper also compares the model with state-of-the-art generative models such as T5, GPT-4, and Llama2, and the experimental results show that GSPT-CVAE is competitive with them.
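The ROUGE-1 and ROUGE-2 F1 scores quoted above measure n-gram overlap between a generated text and a reference. The following is a minimal sketch of that computation, not the paper's evaluation code: the function names `ngrams` and `rouge_n_f1` are hypothetical, and whitespace tokenization with lowercasing is an assumption (published ROUGE toolkits usually add stemming and other normalization).

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 from clipped n-gram overlap.

    Assumption: lowercased whitespace tokens, no stemming.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n_f1(generated, reference, n=1))
    print(rouge_n_f1(generated, reference, n=2))
```

Because ROUGE-2 requires adjacent word pairs to match, it is typically much lower than ROUGE-1 for long generated texts, which is consistent with the gap between 0.243 and 0.041 reported above.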
Funding: the Special Project of the Shanghai Municipal Commission of Economy and Information Technology for Promoting High-Quality Industrial Development (No. 2024-GZL-RGZN-02011); the Shanghai City Digital Transformation Project (No. 202301002); and the Project of Shanghai Shenkang Hospital Development Center (No. SHDC22023214).
Abstract: Surgical site infections (SSIs) are the most common healthcare-related infections in patients with lung cancer. Constructing a lung cancer SSI risk prediction model requires extracting relevant risk factors from lung cancer case texts, which involves two types of text structuring tasks: attribute discrimination and attribute extraction. This article proposes a joint model for these two tasks, Multi-BGLC, which uses bidirectional encoder representations from transformers (BERT) as the encoder and fine-tunes a decoder composed of a graph convolutional neural network (GCNN), long short-term memory (LSTM), and a conditional random field (CRF) on cancer case data. The GCNN is used for attribute discrimination, whereas the LSTM and CRF are used for attribute extraction. The experiment verified the effectiveness and accuracy of the model compared with other baseline models.
Abstract: We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization of German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform abstractive text summarization in the healthcare field. The research hypothesis was that large language models could perform high-quality abstractive text summarization on German technical healthcare texts even if the model is not specifically trained in that language. Through experiments, the research questions explore the performance of transformer language models in dealing with complex syntax constructs, the difference in performance between models trained in English and German, and the impact of translating the source text to English before summarization. We evaluated four PLM-based approaches (GPT-3, a translation-based approach also using GPT-3, a German language model, and a domain-specific biomedical model). The evaluation considered informativeness, using 3 types of metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and the quality of the results, which was evaluated manually on 5 aspects. The results show that text summarization models could be used in the German healthcare domain and that domain-independent language models achieved the best results. The study shows that text summarization models can simplify the search for pre-existing German knowledge in various domains.
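The abstract does not say which three ROUGE variants were used. As one illustration of a ROUGE-style informativeness metric, here is a minimal sketch of ROUGE-L, which scores a summary by its longest common subsequence (LCS) with a reference. The names `lcs_length` and `rouge_l_f1` are hypothetical, whitespace tokenization is an assumption, and the plain F1 combination is a simplification of the weighted F-measure used by some ROUGE implementations.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as a plain F1 over LCS-based precision and recall.

    Assumption: lowercased whitespace tokens, equal weight on
    precision and recall.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge_l_f1("the cat sat on the mat", "the cat lay on the mat"))
```

Unlike ROUGE-N, the LCS does not require matched words to be adjacent, only in the same order, so ROUGE-L tolerates insertions between matching words.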
Abstract: On January 14, Heimtextil kicked off the new trade fair year with over 3,000 exhibitors from 65 countries. With steady growth, the leading trade fair for home and contract textiles and textile design is strongly positioned. This makes it a reliable platform for international participants. At the opening, architect and designer Patricia Urquiola presented her installation 'among-us' at Heimtextil.
Abstract: To promote behavioral change among adolescents in Zambia, the National HIV/AIDS/STI/TB Council, in collaboration with UNICEF, developed the Zambia U-Report platform. This platform provides young people with improved access to information on various Sexual Reproductive Health topics through Short Messaging Service (SMS) messages. Over the years, the platform has accumulated millions of incoming and outgoing messages, which need to be categorized into key thematic areas for better tracking of sexual reproductive health knowledge gaps among young people. The current manual categorization of these text messages is inefficient and time-consuming, and this study aims to automate the process for improved analysis using text-mining techniques. Firstly, the study investigates the current text message categorization process and identifies a list of categories adopted by counselors over time, which are then used to build and train a categorization model. Secondly, the study presents a proof-of-concept tool that automates the categorization of U-Report messages into key thematic areas using the developed categorization model. Finally, it compares the performance and effectiveness of the proof-of-concept tool against the manual system. The study used a dataset comprising 206,625 text messages. The current process would take roughly 2.82 years to categorize this dataset, whereas the trained SVM model would require only 6.4 minutes while achieving an accuracy of 70.4%, demonstrating that the automated method is significantly faster, more scalable, and more consistent than the current manual categorization. These advantages make the SVM model a more efficient and effective tool for categorizing large unstructured text datasets. These results and the proof-of-concept tool demonstrate the potential for enhancing the efficiency and accuracy of message categorization on the Zambia U-Report platform and other similar text-message-based platforms.
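To put the reported figures side by side, the short calculation below converts both durations to minutes and derives the implied speedup and per-message rates. It is only a back-of-the-envelope sketch: the abstract does not state how the 2.82-year estimate was derived, so treating it as continuous calendar time (365-day years, 24 hours a day) is an assumption, and all variable names are hypothetical.

```python
# Figures reported in the abstract.
MESSAGES = 206_625
MANUAL_YEARS = 2.82
SVM_MINUTES = 6.4

# Assumption: a year is 365 days of continuous time. If the 2.82 years
# refer to working hours only, the per-message figures change accordingly.
MINUTES_PER_YEAR = 365 * 24 * 60

manual_minutes = MANUAL_YEARS * MINUTES_PER_YEAR
speedup = manual_minutes / SVM_MINUTES
svm_messages_per_second = MESSAGES / (SVM_MINUTES * 60)
manual_minutes_per_message = manual_minutes / MESSAGES

if __name__ == "__main__":
    print(f"speedup: {speedup:,.0f}x")
    print(f"SVM throughput: {svm_messages_per_second:.1f} messages/s")
    print(f"manual effort: {manual_minutes_per_message:.2f} min/message")
```

Under this assumption the automated run is on the order of 2 × 10^5 times faster. That speed comes with the reported 70.4% accuracy, so roughly three in ten messages would still be assigned a category that differs from the manual label.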
Abstract: Large language models (LLMs), such as ChatGPT developed by OpenAI, represent a significant advancement in artificial intelligence (AI), designed to understand, generate, and interpret human language by analyzing extensive text data. Their potential integration into clinical settings offers a promising avenue that could transform clinical diagnosis and decision-making processes in the future (Thirunavukarasu et al., 2023). This article aims to provide an in-depth analysis of LLMs’ current and potential impact on clinical practices. Their ability to generate differential diagnosis lists underscores their potential as invaluable tools in medical practice and education (Hirosawa et al., 2023; Koga et al., 2023).
Abstract: The application of legal texts in the context of digital television is a process that relies on several normative instruments, ranging from international treaties, such as those of the ITU (International Telecommunication Union), to national regulations defining the obligations of audiovisual operators and the modalities of consumer support. Many countries have introduced specific laws and regulations to organize the gradual switch-off of analog broadcasting and encourage the adoption of new digital standards. Consequently, the digitization of Guinea’s broadcasting network cannot be carried out without taking into account the legal framework: allocation of resources and broadcasting players. An examination of analog and digital broadcasting according to the regulatory texts shows the relationships between the different communication management structures. As for digital broadcasting, we note the appearance of a new service, the multiplex.
Funding: partially supported by the National Natural Science Foundation of China under Grant No. 62372121; the Ministry of Education Humanities and Social Science Project under Grant No. 20YJAZH118; the National Key Research and Development Program of China under Grant No. 2020YFB1005804; and the MOE Project at the Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies.
Abstract: With the rapid development of web technology, Social Networks (SNs) have become one of the most popular platforms for users to exchange views and express their emotions. More and more people comment on trending topics in SNs, producing a large volume of text that contains emotions. Textual Emotion Cause Extraction (TECE) aims to automatically extract the causes of a given emotion in texts and is an important research issue in natural language processing. It differs from the earlier tasks of emotion recognition and emotion classification: rather than stopping at shallow emotion classification of text, it traces the source of the emotion. In this paper, we provide a survey of TECE. First, we introduce the development and classification of TECE. Then, we discuss the existing methods and key factors for TECE. Finally, we enumerate the challenges and development trends for TECE.
Abstract: The present study explores the importance of developing metaphorical thinking skills in students within the framework of English as a Foreign Language (EFL) reading courses at the tertiary educational level. Metaphorical thinking is viewed as the ability to envisage the world figuratively, perceive associatively, and express oneself creatively. It is crucial for recognizing metaphors in texts, interpreting the complex images they evoke, and generating new metaphors. It is especially needed in the current era of clip thinking and fragmented information processing, when students often approach content superficially rather than comprehensively, leading to decreased cognitive activity and a diminished capacity to understand literature. To foster metaphorical thinking, the paper suggests building a text associative-semantic field focusing on metaphors. Due to its hierarchical structure, which can be envisioned as a dense nucleus surrounded by a central region of synonyms and further enveloped by a periphery of more loosely associated linguistic units, the text associative-semantic field is seen as a potent solution for facilitating improved visualization and more holistic comprehension of information, allowing students to expand their vocabulary and strengthen associative connections. Notably, the study highlights the analysis of metaphors of emotional states, as they contribute significantly to a more profound interpretation of the text, understanding the writer’s unique style, deepening the students’ engagement with the book, and expanding their emotional experiences.
Funding: sponsored by the Humanities and Social Sciences Project of the Ministry of Education under Grant No. 24YJCZH443; the Shanghai Philosophy and Social Science Planning Project under Grant No. 2024EYY015; and the Shanghai Municipal Philosophy and Social Sciences Planning Project under Grant No. 2024EYY011.
Abstract: This study investigates translation strategies for Chinese cultural terms in academic texts through a case study of Chapter 7 from “Jade Myth Belief and Chinese Spirit”. Using a qualitative research approach based on a cultural context framework and a cognitive model, the study analyzes translation challenges and solutions in rendering cultural terms related to jade mythology and archaeological concepts. The research identifies three primary translation strategies: transliteration with annotation, domestication with explanation, and cognitive-based translation. The findings reveal that effective translation requires a balance between maintaining academic precision and preserving cultural authenticity. The study demonstrates that successful translation of cultural terms in academic contexts demands a sophisticated understanding of both source and target cultural contexts, along with careful consideration of the academic audience’s needs. This research contributes to the field by providing practical insights for translators working with Chinese cultural texts in academic settings and by proposing an approach to handling complex cultural terminology.