Relation extraction is a pivotal task within the field of natural language processing, boasting numerous real-world applications. Existing research predominantly centers on monolingual relation extraction or cross-lingual enhancement for relation extraction. However, there exists a notable gap in understanding relation extraction within mix-lingual (or code-switching) scenarios. In these scenarios, individuals blend content from different languages within sentences, generating mix-lingual content. The effectiveness of existing relation extraction models in such scenarios remains largely unexplored due to the absence of dedicated datasets. To address this gap, we introduce the Mix-Lingual Relation Extraction (MixRE) task and construct a human-annotated dataset, MixRED, to support this task. Additionally, we propose a hierarchical training approach for the mix-lingual scenario named Mix-Lingual Training (MixTrain), designed to enhance the performance of large language models (LLMs) when capturing relational dependencies from mix-lingual content spanning different semantic levels. Our experiments involve evaluating state-of-the-art supervised models and LLMs on the constructed dataset, with results indicating that MixTrain notably improves model performance. Moreover, we investigate the effectiveness of using mix-lingual content as a tool to transfer learned relational dependencies across different languages. Additionally, we delve into factors influencing model performance for both supervised models and LLMs in the novel MixRE task.