Covert timing channels (CTCs) exploit network resources to establish hidden communication pathways, posing significant risks to data security and policy compliance. Detecting such hidden and dangerous threats therefore remains a major security challenge. This paper proposes LinguTimeX, a new framework that combines natural language processing with artificial intelligence and explainable artificial intelligence (AI), not only to detect CTCs but also to provide insight into the decision process. LinguTimeX performs multidimensional feature extraction by fusing linguistic attributes with temporal network patterns to identify covert channels precisely. LinguTimeX demonstrates strong effectiveness in detecting CTCs across multiple languages, namely English, Arabic, and Chinese. Specifically, the LSTM and RNN models achieved F1 scores of 90% on the English dataset, 89% on the Arabic dataset, and 88% on the Chinese dataset, showing superior performance and an ability to generalize across languages. This highlights their robustness in detecting CTCs within security systems, regardless of the language or cultural context of the data. By comparison, the DeepForest model produced F1 scores ranging from 86% to 87% across the same datasets, further confirming its effectiveness in CTC detection. Although other algorithms also showed reasonable accuracy, the LSTM and RNN models consistently outperformed them in multilingual settings, suggesting that deep learning models might be better suited to this particular problem.
Sign language datasets are essential to sign language recognition and translation (SLRT). Current public sign language datasets are small and lack diversity, so they do not meet the practical application requirements of SLRT. However, building a large-scale and diverse sign language dataset is difficult because sign language data on the Internet is scarce, and some of the data collected for such a dataset falls below the required quality standard. This paper proposes a two information streams transformer (TIST) model to judge whether sign language data is of acceptable quality. To verify that TIST effectively improves sign language recognition (SLR), we construct two datasets, a screened dataset and an unscreened dataset. In this experiment, the visual alignment constraint (VAC) model serves as the baseline. The experimental results show that the screened dataset achieves a better word error rate (WER) than the unscreened dataset.
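For readers unfamiliar with the metric named in the abstract above, word error rate is conventionally the word-level edit distance between a reference sequence and a recognized sequence, divided by the reference length, so lower is better. The sketch below is a generic illustration under that standard definition, not the paper's evaluation code; the function name `word_error_rate` and the sample gloss strings are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical gloss sequences: one substitution in a four-word reference.
wer = word_error_rate("I GO SCHOOL TOMORROW", "I GO HOME TOMORROW")
```

Because the denominator is the reference length, a hypothesis with many inserted words can yield a value above 1.0.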
Model evaluation using benchmark datasets is an important method for measuring the capability of large language models (LLMs) in specific domains, and it is mainly used to assess their knowledge and reasoning abilities. To better assess the capability of LLMs in the agricultural domain, Agri-Eval was proposed as a benchmark for the knowledge and reasoning ability of LLMs in agriculture. The assessment dataset used in Agri-Eval covers seven major disciplines in the agricultural domain (crop science, horticulture, plant protection, animal husbandry, forest science, aquaculture science, and grass science) and contains a total of 2283 questions. Among domestic general-purpose LLMs, DeepSeek R1 performed best, with an accuracy rate of 75.49%. Among international general-purpose LLMs, Gemini 2.0 pro exp 0205 stood out as the top performer, achieving an accuracy rate of 74.28%. As a vertical LLM for agriculture, Shennong V2.0 outperformed all the LLMs in China, and its answer accuracy on agricultural knowledge exceeded that of all existing general-purpose LLMs. Agri-Eval helps LLM developers comprehensively evaluate a model's capability in agriculture through a variety of tasks and tests, promoting the development of LLMs in the field.
This study demonstrates a novel integration of large language models, machine learning, and multicriteria decision-making to investigate self-moderation in small online communities, a topic under-explored compared to user behavior and platform-driven moderation on social media. The proposed methodological framework (1) utilizes large language models for social media post analysis and categorization, (2) employs k-means clustering for content characterization, and (3) incorporates the TODIM (Tomada de Decisão Interativa Multicritério) method to determine moderation strategies based on expert judgments. In general, the fully integrated framework leverages the strengths of these intelligent systems in a more systematic evaluation of large-scale decision problems. When applied in social media moderation, this approach promotes nuanced and context-sensitive self-moderation by taking into account factors such as cultural background and geographic location. The application of this framework is demonstrated within Facebook groups. Eight distinct content clusters encompassing safety, harassment, diversity, and misinformation are identified. Analysis revealed a preference for content removal across all clusters, suggesting a cautious approach towards potentially harmful content. However, the framework also highlights the use of other moderation actions, like account suspension, depending on the content category. These findings contribute to the growing body of research on self-moderation and offer valuable insights for creating safer and more inclusive online spaces within smaller communities.
This interview examines the theoretical foundations, pedagogical applications, developmental trajectory, and future directions of the xu-argument. Professor Wang Chuming offers a comprehensive account of the xu-argument, clarifying its theoretical framework, the learning mechanisms underlying xu, and its interface with international theories of second language acquisition (SLA). From the perspective of the xu-argument, he proposes novel interpretations of core issues in SLA. Drawing on the development of the xu-argument, Wang further discusses the essence, directions, and methodology of innovation in SLA theory. He emphasizes that theoretical advances must capture and illuminate underlying natural laws, arguing that innovative approaches are typically rooted in deep reflection on common sense. He also calls for theoretical innovation in SLA in the Chinese context, advocating a robust research paradigm that shifts from local observation to global theoretical generalization, thereby promoting bottom-up theoretical development. In closing, he highlights the promising prospects for SLA theory in the era of artificial intelligence.
Professional learning communities (PLCs) offer essential contextual support for the development of foreign language teachers in higher education. The book Building Professional Learning Communities of Foreign Language Teachers in Higher Education, co-authored by Wen et al. (2021), systematically examines the necessity, practical measures, theoretical construction, and outcomes associated with building PLCs for university foreign language teachers. Notably, it introduces a research methodology rooted in local teacher education practices with Chinese characteristics: the dialectical research paradigm (DRP). This review introduces the content of the book, evaluates its contributions to teacher development, and explores its implications for future practice and research.
This study examines the predictive roles of foreign language classroom anxiety (FLCA), foreign language enjoyment (FLE), and foreign language boredom (FLB) in English achievement among Chinese senior high school students. Despite extensive research on anxiety in language learning, less attention has been given to boredom, and the combined effects of these three emotions on English achievement remain under-explored, particularly among high school students in China. To address these gaps, a sample of 142 students from Guangzhou was surveyed using questionnaires to assess their emotional experiences and English achievement. The research found that FLE exhibited a positive correlation with academic performance, while FLCA and FLB showed negative associations. Notably, FLE was the most significant predictor of English achievement, followed by FLCA and FLB. Gender differences were observed, with male students reporting significantly higher levels of environmental enjoyment, while female students experienced significantly greater communication anxiety. On this basis, this paper offers suggestions on how to enhance senior high school students’ FLE while mitigating FLCA and FLB, thereby promoting more effective and sustained English learning.
We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization on German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform abstractive text summarization in the healthcare field. The research hypothesis was that large language models could perform high-quality abstractive text summarization on German technical healthcare texts, even if the model is not specifically trained in that language. Through experiments, the research questions explore the performance of transformer language models in dealing with complex syntax constructs, the difference in performance between models trained in English and German, and the impact of translating the source text to English before conducting the summarization. We evaluated four PLM approaches (GPT-3, a translation-based approach also utilizing GPT-3, a German language model, and a domain-specific bio-medical model). The evaluation considered informativeness, using three types of metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and the quality of the results, which was manually evaluated on five aspects. The results show that text summarization models could be used in the German healthcare domain and that domain-independent language models achieved the best results. The study proves that text summarization models can simplify the search for pre-existing German knowledge in various domains.
Text clustering is an important task because of its vital role in NLP-related tasks. However, existing research on clustering is mainly based on the English language, with limited work on low-resource languages such as Urdu. Text clustering in low-resource languages faces many obstacles, including limited annotated collections and strong linguistic diversity. The primary aim of this paper is twofold: (1) to introduce a clustering dataset named UNC-2025, comprising 100k Urdu news documents, and (2) to provide a detailed empirical benchmark of Large Language Model (LLM)-enhanced clustering methods for Urdu text. We explicitly evaluate the behavior of 11 multilingual and Urdu-specific embeddings with 3 different clustering algorithms, and we assess performance using a set of internal and external validity measures. We find that the best configuration, the mBERT embedding with the HDBSCAN algorithm, attains new state-of-the-art performance with an external validity score of 0.95. This LLM-based method sets a strong new benchmark for Urdu text clustering. Importantly, the results confirm the strength and scalability of LLM-generated embeddings in capturing the fine, subtle semantics needed to discover topics in low-resource settings, and they open the door to novel NLP applications in underrepresented languages.
With the widespread application of large language models (LLMs) in natural language processing and code generation, traditional High-Level Language Programming courses are facing unprecedented challenges and opportunities. As a core programming language for computer science majors, the C language remains irreplaceable due to its foundational nature and engineering adaptability. This paper, based on the rapid development of large model technologies, proposes a systematic reform design for C language teaching, focusing on teaching objectives, content structure, teaching methods, and evaluation systems. The article suggests a teaching framework centered on “human-computer collaborative programming,” integrating prompt training, AI-assisted debugging, and code generation analysis, aiming to enhance students’ problem modeling ability, programming expression skills, and AI collaboration literacy.
Duangsamorn Wattanapathitiwong, usually called by her Chinese name Wang Ximei these days, never expected a Chinese television drama to lead her to a life in China, a marriage rooted in cross-cultural understanding, and a profession that now bridges two nations. From a university student in Thailand puzzled by Chinese dialogue to a Thai language lecturer in China influencing the next generation of Thailand-China communicators, Wang’s journey is a story of resilience, romance, and responsibility.
This review paper explores advanced methods to prompt Large Language Models (LLMs) into generating objectionable or unintended behaviors through adversarial prompt injection attacks. We examine a series of novel projects like HOUYI, Robustly Aligned LLM (RA-LLM), StruQ, and Virtual Prompt Injection that compel LLMs to produce affirmative responses to harmful queries. Several new benchmarks, such as PromptBench, AdvBench, AttackEval, INJECAGENT, and Robustness Suite, have been created to evaluate the performance and resilience of LLMs against these adversarial attacks. Results show significant success rates in misleading models like Vicuna-7B, LLaMA-2-7B-Chat, GPT-3.5, and GPT-4. The review highlights limitations in existing defense mechanisms and proposes future directions for enhancing LLM alignment and safety protocols, including the concept of LLM SELF DEFENSE. Our study emphasizes the need for improved robustness in LLMs, which will potentially shape the future of Artificial Intelligence (AI) driven applications and security protocols. Understanding the vulnerabilities of LLMs is crucial for developing effective defenses against adversarial prompt injection attacks. This paper proposes a systemic classification framework that discusses various types of prompt injection attacks and defenses. We also survey a broad spectrum of state-of-the-art attack methods (such as HouYi and Virtual Prompt Injection) alongside advanced defense mechanisms (like RA-LLM, StruQ, and LLM Self-Defense), providing critical insights into vulnerabilities and robustness. Finally, we integrate and compare results from multiple recent benchmarks, including PromptBench, INJECAGENT, and BIPIA.
Fundamental physics often confronts complex symbolic problems with few guiding exemplars or established principles. While artificial intelligence (AI) offers promise, its typical need for vast datasets to learn from hinders its use in these information-scarce frontiers. We introduce learning at criticality (LaC), a reinforcement learning scheme that tunes large language models (LLMs) to a sharp learning transition, addressing this information scarcity. At this transition, LLMs achieve peak generalization from minimal data, exemplified by 7-digit base-7 addition, a test of nontrivial arithmetic reasoning. To elucidate this peak, we analyze a minimal concept-network model designed to capture the essence of how LLMs might link tokens. Trained on a single exemplar, this model also undergoes a sharp learning transition. This transition exhibits hallmarks of a second-order phase transition, notably power-law distributed solution path lengths. At this critical point, the system maximizes a “critical thinking pattern” crucial for generalization, enabled by the underlying scale-free exploration. This suggests LLMs reach peak performance by operating at criticality, where such explorative dynamics enable the extraction of underlying operational rules. We demonstrate LaC in quantum field theory: an 8B-parameter LLM, tuned to its critical point by LaC using a few exemplars of symbolic Matsubara sums, solves unseen, higher-order problems, significantly outperforming far larger models. LaC thus leverages critical phenomena, a physical principle, to empower AI for complex, data-sparse challenges in fundamental physics.
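To make the arithmetic task named in the abstract above concrete, the following sketch adds two base-7 digit strings column by column with carries, which is the operational rule a model would have to extract. It only illustrates the task; it is not the paper's training or evaluation code, and the function name `add_base7` and the sample operands are hypothetical.

```python
def add_base7(a: str, b: str) -> str:
    """Add two base-7 digit strings right to left, carrying whenever a column reaches 7."""
    if not a or not b or any(ch not in "0123456" for ch in a + b):
        raise ValueError("operands must be non-empty strings of digits 0-6")
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    digits = []
    for da, db in zip(reversed(a), reversed(b)):
        # divmod splits the column sum into the outgoing carry and the digit to write.
        carry, digit = divmod(int(da) + int(db) + carry, 7)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


# Hypothetical 7-digit operands; a carry propagates through every column.
total = add_base7("6666666", "0000001")
```

The result can be cross-checked against Python's built-in base conversion, for example `int(add_base7(x, y), 7) == int(x, 7) + int(y, 7)`.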
This paper empirically studies the effects of attitudes towards Mandarin on Mandarin variation, and finds that both emotional and value attitudes towards Mandarin can effectively suppress Mandarin variation. Further analysis finds that the language attitudes of local residents have a stronger overall impact on Mandarin variation, and that language attitudes in small cities have a stronger impact on the variation of Mandarin.
This mixed-methods study presents a needs analysis to investigate the workplace English language needs of medical students in China who are learning and using English as non-native speakers, the circumstances in which the various language skills are required, and stakeholders’ perceived workplace preparedness in the light of language-related instructional provision during medical training. A leading university in China was chosen as the study case. Altogether, 294 online questionnaires were collected from undergraduate medical students, graduate medical students and recent graduates working as physicians, and 33 semi-structured individual interviews were conducted with undergraduate medical students, graduate medical students, recent graduates working as physicians, medical teachers, English for Medical Purposes (EMP) teachers, program leaders and English-speaking patients. Results showed that in addition to physicians experiencing pressure to publish scientific articles internationally, participants attached greater importance to physicians’ oral English communication ability, especially in undertaking clinical consultations in English, working with medical interpreters or acting as ad hoc interpreters. The participants also reported a lack of relevant EMP courses or training available at this university. Given these communicative events that physicians face in China, EMP courses need to include training in these specific areas.
This paper examines the application of drama-based pedagogy in EFL classrooms, demonstrating how script analysis, role-playing, and improvisation can effectively enhance students’ integrated language skills. The study highlights the unique advantages of dramatic texts for pronunciation training, subtext interpretation, and cultural understanding, while providing practical teaching methods including conflict scene selection and stage direction adaptation. Findings indicate that drama techniques reduce learning anxiety, boost motivation, and create authentic language contexts, serving as an effective bridge between literary study and language practice.
The rapid development of generative artificial intelligence (GenAI) is profoundly changing the form and paradigm of foreign language education. GenAI technology, represented by DeepSeek, provides technical support for personalized, immersive, and intelligent foreign language teaching by virtue of its natural language processing, multimodal content generation, and cross-cultural simulation capabilities. From the three dimensions of “teaching reconstruction,” “learning innovation,” and “education upgrading,” this paper systematically analyzes the internal mechanism by which GenAI empowers foreign language education and reveals its unique value in language knowledge transmission, skill training, and cultural understanding. At the same time, considering that GenAI in foreign language education may lead to language model errors, cultural misinterpretations, technological dependence, and data privacy risks, the paper proposes coping strategies such as building an advanced literacy system, establishing a human-AI collaborative ecosystem, and implementing a transparent regulatory framework for algorithms. These measures aim to ensure the high-quality development of technology-integrated foreign language education, providing both theoretical support and practical pathways for cultivating globally competent talents with intercultural communication skills and digital literacy.
The natural language processing (NLP) domain has witnessed significant advancements with the emergence of transformer-based models, which have reshaped the text understanding and generation landscape. While their capabilities are well recognized, there remains a limited systematic synthesis of how these models perform across tasks, scale efficiently, adapt to domains, and address ethical challenges. Therefore, the aim of this paper was to analyze the performance of transformer-based models across various NLP tasks, their scalability, domain adaptation, and the ethical implications of such models. This meta-analysis paper synthesizes findings from 25 peer-reviewed studies on NLP transformer-based models, adhering to the PRISMA framework. Relevant papers were sourced from electronic databases, including IEEE Xplore, Springer, ACM Digital Library, Elsevier, PubMed, and Google Scholar. The findings highlight the superior performance of transformers over conventional approaches, attributed to self-attention mechanisms and pre-trained language representations. Despite these advantages, challenges such as high computational costs, data bias, and hallucination persist. The study provides new perspectives by underscoring the necessity for future research to optimize transformer architectures for efficiency, address ethical AI concerns, and enhance generalization across languages. This paper contributes valuable insights into the current trends, limitations, and potential improvements in transformer-based models for NLP.
Accessible communication based on sign language recognition (SLR) is the key to emergency medical assistance for the hearing-impaired community. Balancing the capture of both local and global information in SLR for emergency medicine poses a significant challenge. To address this, we propose a novel approach based on the inter-learning of visual features between global and local information. Specifically, our method enhances the perception capabilities of the visual feature extractor by strategically leveraging the strengths of convolutional neural networks (CNNs), which are adept at capturing local features, and visual transformers, which perform well at perceiving global features. Furthermore, to mitigate the overfitting caused by the limited availability of sign language data for emergency medical applications, we introduce an enhanced short temporal module for data augmentation through additional subsequences. Experimental results on three publicly available sign language datasets demonstrate the efficacy of the proposed approach.
Funding (LinguTimeX covert timing channel paper): This study is financed by the European Union-NextGenerationEU, through the National Recovery and Resilience Plan of the Republic of Bulgaria, Project No. BG-RRP-2.013-0001.
Funding (TIST sign language data quality paper): Supported by the National Language Commission project on sign language data specifications for artificial intelligence applications and test standards for language service translation systems (No. ZDI145-70).
基金funded by the Office of the Vice-President for Research and Development of Cebu Technological University.
文摘This study demonstrates a novel integration of large language models,machine learning,and multicriteria decision-making to investigate self-moderation in small online communities,a topic under-explored compared to user behavior and platform-driven moderation on social media.The proposed methodological framework(1)utilizes large language models for social media post analysis and categorization,(2)employs k-means clustering for content characterization,and(3)incorporates the TODIM(Tomada de Decisão Interativa Multicritério)method to determine moderation strategies based on expert judgments.In general,the fully integrated framework leverages the strengths of these intelligent systems in a more systematic evaluation of large-scale decision problems.When applied in social media moderation,this approach promotes nuanced and context-sensitive self-moderation by taking into account factors such as cultural background and geographic location.The application of this framework is demonstrated within Facebook groups.Eight distinct content clusters encompassing safety,harassment,diversity,and misinformation are identified.Analysis revealed a preference for content removal across all clusters,suggesting a cautious approach towards potentially harmful content.However,the framework also highlights the use of other moderation actions,like account suspension,depending on the content category.These findings contribute to the growing body of research on self-moderation and offer valuable insights for creating safer and more inclusive online spaces within smaller communities.
Abstract: This interview examines the theoretical foundations, pedagogical applications, developmental trajectory, and future directions of the xu-argument. Professor Wang Chuming offers a comprehensive account of the xu-argument, clarifying its theoretical framework, the learning mechanisms underlying xu, and its interface with international theories of second language acquisition (SLA). From the perspective of the xu-argument, he proposes novel interpretations of core issues in SLA. Drawing on the development of the xu-argument, Wang further discusses the essence, directions, and methodology of innovation in SLA theory. He emphasizes that theoretical advances must capture and illuminate underlying natural laws, arguing that innovative approaches are typically rooted in deep reflection on common sense. He also calls for theoretical innovation in SLA in the Chinese context, advocating a robust research paradigm that shifts from local observation to global theoretical generalization, thereby promoting bottom-up theoretical development. In closing, he highlights the promising prospects for SLA theory in the era of artificial intelligence.
Abstract: Professional learning communities (PLCs) offer essential contextual support for the development of foreign language teachers in higher education. The book Building Professional Learning Communities of Foreign Language Teachers in Higher Education, co-authored by Wen et al. (2021), systematically examines the necessity, practical measures, theoretical construction, and outcomes associated with building PLCs for university foreign language teachers. Notably, it introduces a research methodology rooted in local teacher education practices with Chinese characteristics: the dialectical research paradigm (DRP). This review introduces the content of the book, evaluates its contributions to teacher development, and explores its implications for future practice and research.
Abstract: This study examines the predictive roles of foreign language classroom anxiety (FLCA), foreign language enjoyment (FLE), and foreign language boredom (FLB) in English achievement among Chinese senior high school students. Despite extensive research on anxiety in language learning, less attention has been given to boredom, and the combined effects of these three emotions on English achievement remain under-explored, particularly among high school students in China. To address these gaps, a sample of 142 students from Guangzhou was surveyed using questionnaires to assess their emotional experiences and English achievement. The research found that FLE exhibited a positive correlation with academic performance, while FLCA and FLB showed negative associations. Notably, FLE was the most significant predictor of English achievement, followed by FLCA and FLB. Gender differences were observed, with male students reporting significantly higher levels of environmental enjoyment, while female students experienced significantly greater communication anxiety. On this basis, this paper offers suggestions on how to enhance senior high school students’ FLE while mitigating FLCA and FLB, thereby promoting more effective and sustained English learning.
Abstract: We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization on German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform the task of abstractive text summarization in the healthcare field. The research hypothesis was that large language models could perform high-quality abstractive text summarization on German technical healthcare texts, even if the model is not specifically trained in that language. Through experiments, the research questions explore the performance of transformer language models in dealing with complex syntactic constructs, the difference in performance between models trained in English and German, and the impact of translating the source text to English before conducting the summarization. We evaluated four PLMs (GPT-3, a translation-based approach also utilizing GPT-3, a German language model, and a domain-specific biomedical model approach). The evaluation considered informativeness, using 3 types of metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and the quality of the results, which was manually evaluated on 5 aspects. The results show that text summarization models could be used in the German healthcare domain and that domain-independent language models achieved the best results. The study proves that text summarization models can simplify the search for pre-existing German knowledge in various domains.
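To make the ROUGE-based informativeness measure in the abstract above more concrete, here is a minimal sketch of unigram recall (ROUGE-1 recall). It is an illustration only: the abstract does not say which three ROUGE variants were used or how the texts were tokenized, so the whitespace tokenization, the lowercasing, and the function name `rouge1_recall` are all assumptions.

```python
from collections import Counter


def rouge1_recall(reference: str, candidate: str) -> float:
    """Fraction of reference unigrams that also appear in the candidate.

    Assumption: tokens are lowercase whitespace-separated words; real
    ROUGE toolkits apply their own tokenization and optional stemming.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clip each match by the candidate count so repeated words are not over-credited.
    overlap = sum(min(count, cand_counts[token]) for token, count in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0


# Hypothetical German example sentences, not taken from the study.
score = rouge1_recall("der Patient hat Fieber", "Patient hat Fieber")
```

In this hypothetical example three of the four reference words are recovered by the candidate summary, so the recall is 0.75.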
Funding: Chang Gung University and Chang Gung Memorial Hospital under project number NERPD4Q0021.
Abstract: Text clustering is an important task because of its vital role in NLP-related tasks. However, existing research on clustering is mainly based on the English language, with limited work on low-resource languages such as Urdu. Text clustering in low-resource languages is hampered by limited annotated collections and strong linguistic diversity. The primary aim of this paper is twofold: (1) to introduce a clustering dataset named UNC-2025, comprising 100k Urdu news documents, and (2) to provide a detailed empirical benchmark of Large Language Model (LLM)-enhanced clustering methods for Urdu text. We explicitly evaluate the behavior of 11 multilingual and Urdu-specific embeddings on 3 different clustering algorithms. We carefully evaluate performance using a set of internal and external validity measures. We find that the best configuration, the mBERT embedding with the HDBSCAN algorithm, attains new state-of-the-art performance with an external validity score of 0.95. This LLM-based method sets a strong new benchmark for Urdu text clustering. Importantly, the results confirm the strength and scalability of LLM-generated embeddings in capturing the fine, subtle semantics needed to discover topics in low-resource settings, and open the door to novel NLP applications in underrepresented languages.
Funding: Education and Teaching Research Project of Beijing University of Technology (ER2024KCB08).
Abstract: With the widespread application of large language models (LLMs) in natural language processing and code generation, traditional High-Level Language Programming courses are facing unprecedented challenges and opportunities. As a core programming language for computer science majors, C remains irreplaceable due to its foundational nature and engineering adaptability. Against the background of the rapid development of large model technologies, this paper proposes a systematic reform design for C language teaching, focusing on teaching objectives, content structure, teaching methods, and evaluation systems. The article suggests a teaching framework centered on “human-computer collaborative programming,” integrating prompt training, AI-assisted debugging, and code generation analysis, aiming to enhance students’ problem modeling ability, programming expression skills, and AI collaboration literacy.
Abstract: Duangsamorn Wattanapathitiwong, usually called by her Chinese name Wang Ximei these days, never expected a Chinese television drama to lead her to a life in China, a marriage rooted in cross-cultural understanding, and a profession that now bridges two nations. From a university student in Thailand puzzled by Chinese dialogue to a Thai language lecturer in China influencing the next generation of Thailand-China communicators, Wang’s journey is a story of resilience, romance, and responsibility.
Abstract: This review paper explores advanced methods to prompt Large Language Models (LLMs) into generating objectionable or unintended behaviors through adversarial prompt injection attacks. We examine a series of novel projects like HOUYI, Robustly Aligned LLM (RA-LLM), StruQ, and Virtual Prompt Injection that compel LLMs to produce affirmative responses to harmful queries. Several new benchmarks, such as PromptBench, AdvBench, AttackEval, INJECAGENT, and Robustness Suite, have been created to evaluate the performance and resilience of LLMs against these adversarial attacks. Results show significant success rates in misleading models like Vicuna-7B, LLaMA-2-7B-Chat, GPT-3.5, and GPT-4. The review highlights limitations in existing defense mechanisms and proposes future directions for enhancing LLM alignment and safety protocols, including the concept of LLM SELF DEFENSE. Our study emphasizes the need for improved robustness in LLMs, which will potentially shape the future of Artificial Intelligence (AI)-driven applications and security protocols. Understanding the vulnerabilities of LLMs is crucial for developing effective defenses against adversarial prompt injection attacks. This paper proposes a systematic classification framework that discusses various types of prompt injection attacks and defenses. We also survey a broad spectrum of state-of-the-art attack methods (such as HouYi and Virtual Prompt Injection) alongside advanced defense mechanisms (like RA-LLM, StruQ, and LLM Self-Defense), providing critical insights into vulnerabilities and robustness. We also integrate and compare results from multiple recent benchmarks, including PromptBench, INJECAGENT, and BIPIA.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2024YFA1408604 for K.C. and X.C.), the National Natural Science Foundation of China (Grant Nos. 12047503, 12447103 for K.C. and X.C., 12325501 for P.Z., and 12275263 for Y.D. and S.H.), the Innovation Program for Quantum Science and Technology (Grant No. 2021ZD0301900 for Y.D. and S.H.), and the Natural Science Foundation of Fujian Province of China (Grant No. 2023J02032 for Y.D. and S.H.).
Abstract: Fundamental physics often confronts complex symbolic problems with few guiding exemplars or established principles. While artificial intelligence (AI) offers promise, its typical need for vast datasets to learn from hinders its use in these information-scarce frontiers. We introduce learning at criticality (LaC), a reinforcement learning scheme that tunes large language models (LLMs) to a sharp learning transition, addressing this information scarcity. At this transition, LLMs achieve peak generalization from minimal data, exemplified by 7-digit base-7 addition, a test of nontrivial arithmetic reasoning. To elucidate this peak, we analyze a minimal concept-network model designed to capture the essence of how LLMs might link tokens. Trained on a single exemplar, this model also undergoes a sharp learning transition. This transition exhibits hallmarks of a second-order phase transition, notably power-law distributed solution path lengths. At this critical point, the system maximizes a “critical thinking pattern” crucial for generalization, enabled by the underlying scale-free exploration. This suggests LLMs reach peak performance by operating at criticality, where such explorative dynamics enable the extraction of underlying operational rules. We demonstrate LaC in quantum field theory: an 8B-parameter LLM, tuned to its critical point by LaC using a few exemplars of symbolic Matsubara sums, solves unseen, higher-order problems, significantly outperforming far larger models. LaC thus leverages critical phenomena, a physical principle, to empower AI for complex, data-sparse challenges in fundamental physics.
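For readers unfamiliar with the 7-digit base-7 addition task mentioned in the abstract above, the following sketch shows the digit-by-digit carry procedure that a correct solver has to reproduce. It is only an illustration of the arithmetic: the function name `add_base7` and the string representation are assumptions, and nothing here reflects how the paper's models are trained or evaluated.

```python
def add_base7(a: str, b: str) -> str:
    """Add two non-negative base-7 numbers given as digit strings."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter operand with leading zeros
    carry = 0
    digits = []
    # Work from the least significant digit to the most significant one.
    for da, db in zip(reversed(a), reversed(b)):
        carry, digit = divmod(int(da) + int(db) + carry, 7)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


# Hypothetical 7-digit operands chosen for illustration.
result = add_base7("1234560", "0654321")
```

The result can be cross-checked against Python's built-in base conversion, for example `int(result, 7) == int("1234560", 7) + int("0654321", 7)`.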
Funding: Funded by the 2024 Youth Project of Philosophy and Social Sciences Planning in Guangdong Province, “Research on the Relationship between Mandarin and Economic Development in Cantonese Speaking Areas” (GD24YZY03).
Abstract: This paper empirically studies the effects of attitudes towards Mandarin on Mandarin variation, and finds that both emotional and value attitudes towards Mandarin can effectively suppress Mandarin variation. Further analysis finds that the language attitudes of local residents have a stronger overall impact on Mandarin variation, and that language attitudes in small cities have a stronger impact on the variation of Mandarin.
Abstract: This mixed-methods study presents a needs analysis to investigate the workplace English language needs of medical students in China who are learning and using English as non-native speakers, the circumstances in which the various language skills are required, and stakeholders’ perceived workplace preparedness in the light of language-related instructional provision during medical training. A leading university in China was chosen as the study case. Altogether, 294 online questionnaires were collected from undergraduate medical students, graduate medical students, and recent graduates working as physicians, and 33 semi-structured individual interviews were conducted with undergraduate medical students, graduate medical students, recent graduates working as physicians, medical teachers, English for Medical Purposes (EMP) teachers, program leaders, and English-speaking patients. Results showed that in addition to physicians experiencing pressure to publish scientific articles internationally, participants attached greater importance to physicians’ oral English communication ability, especially in undertaking clinical consultations in English, working with medical interpreters, or acting as ad hoc interpreters. The participants also reported a lack of relevant EMP courses or training available at this university. Given these communicative events that physicians face in China, EMP courses need to include training in these specific areas.
Abstract: This paper examines the application of drama-based pedagogy in EFL classrooms, demonstrating how script analysis, role-playing, and improvisation can effectively enhance students’ integrated language skills. The study highlights the unique advantages of dramatic texts for pronunciation training, subtext interpretation, and cultural understanding, while providing practical teaching methods including conflict scene selection and stage direction adaptation. Findings indicate that drama techniques reduce learning anxiety, boost motivation, and create authentic language contexts, serving as an effective bridge between literary study and language practice.
Abstract: The rapid development of generative artificial intelligence (GenAI) is profoundly changing the form and paradigm of foreign language education. GenAI technology, represented by DeepSeek, provides technical support for the personalization, immersion, and intelligence of foreign language teaching by virtue of its natural language processing, multimodal content generation, and cross-cultural simulation capabilities. From the three dimensions of “teaching reconstruction,” “learning innovation,” and “education upgrading,” this paper systematically analyzes the internal mechanism by which GenAI empowers foreign language education and reveals its unique value in language knowledge transmission, skill training, and cultural understanding. At the same time, considering that GenAI may lead to language model errors, cultural misinterpretations, technological dependence, and data privacy risks in foreign language education, the paper proposes coping strategies such as building an advanced literacy system, establishing a human-AI collaborative ecosystem, and implementing a transparent regulatory framework for algorithms. These measures aim to ensure the high-quality development of technology-integrated foreign language education, providing both theoretical support and practical pathways for cultivating globally competent talents with intercultural communication skills and digital literacy.
Abstract: The natural language processing (NLP) domain has witnessed significant advancements with the emergence of transformer-based models, which have reshaped the text understanding and generation landscape. While their capabilities are well recognized, there remains a limited systematic synthesis of how these models perform across tasks, scale efficiently, adapt to domains, and address ethical challenges. Therefore, the aim of this paper was to analyze the performance of transformer-based models across various NLP tasks, their scalability, domain adaptation, and the ethical implications of such models. This meta-analysis paper synthesizes findings from 25 peer-reviewed studies on NLP transformer-based models, adhering to the PRISMA framework. Relevant papers were sourced from electronic databases, including IEEE Xplore, Springer, ACM Digital Library, Elsevier, PubMed, and Google Scholar. The findings highlight the superior performance of transformers over conventional approaches, attributed to self-attention mechanisms and pre-trained language representations. Despite these advantages, challenges such as high computational costs, data bias, and hallucination persist. The study provides new perspectives by underscoring the necessity for future research to optimize transformer architectures for efficiency, address ethical AI concerns, and enhance generalization across languages. This paper contributes valuable insights into the current trends, limitations, and potential improvements in transformer-based models for NLP.
Funding: Supported by the National Natural Science Foundation of China (No. 62376197), the Tianjin Science and Technology Program (No. 23JCYBJC00360), and the Tianjin Health Research Project (No. TJWJ2025MS045).
Abstract: Accessible communication based on sign language recognition (SLR) is the key to emergency medical assistance for the hearing-impaired community. Balancing the capture of both local and global information in SLR for emergency medicine poses a significant challenge. To address this, we propose a novel approach based on the inter-learning of visual features between global and local information. Specifically, our method enhances the perception capabilities of the visual feature extractor by strategically leveraging the strengths of convolutional neural networks (CNNs), which are adept at capturing local features, and visual transformers, which perform well at perceiving global features. Furthermore, to mitigate the overfitting caused by the limited availability of sign language data for emergency medical applications, we introduce an enhanced short temporal module for data augmentation through additional subsequences. Experimental results on three publicly available sign language datasets demonstrate the efficacy of the proposed approach.