Funding: Funded by the National Natural Science Foundation of China (Grant Nos. 62402258, 62176053, 62376130); the Shandong Provincial Natural Science Foundation (No. ZR2024QF099); the Program of New Twenty Policies for Universities of Jinan (No. 202333008); the Pilot Project for Integrated Innovation of Science, Education, and Industry of Qilu University of Technology (Shandong Academy of Sciences) (2024ZDZX08); the National Natural Science Foundation of China (Grant No. 62102192); the fellowship of China Postdoctoral Science Foundation (2022M710071); the Innovation and Entrepreneurship Program of Jiangsu Province (JSSCBS20210530); and a Turing AI Fellowship funded by UK Research and Innovation (EP/V020579/1, EP/V020579/2).
Abstract: Given a controversial target, such as “nuclear energy”, information-seeking argument mining aims to identify argumentative text from diverse sources. The main challenge in this task is threefold: the insufficiency of contextual information on targets, cross-domain adaptation across varying targets, and implicit argumentative information within the argument. Current approaches primarily address the first two challenges by improving the integration of target-related semantic information with arguments, while there has been little work on modeling all three aspects. To address these challenges, and inspired by the potential capability of the neural topic model for mining the local and global topic information contained in the dataset, we propose a novel topic-enhanced information-seeking argument mining approach that leverages the mutual interaction between the neural topic model and the language model. Specifically, (i) the global topic information is extracted from the corpora to encapsulate the common knowledge across different targets, which supports cross-domain adaptation; (ii) to capture the contextual information on targets, the target is augmented by target-aware subtopics derived from the global topic-word distribution; (iii) to capture the implicit argumentative information within the argument, the local topic information is captured by minimizing the similarity between its local topic distribution and its semantic representation through mutual learning. Experimental results show the superiority of the proposed model compared to the state-of-the-art baselines in both in-domain and cross-domain scenarios.
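Step (ii) of the abstract, augmenting a target with subtopics drawn from a global topic-word distribution, can be illustrated with a small sketch. This is not the authors' implementation: the function name `target_aware_subtopics`, the toy vocabulary, the toy probabilities, and the rule "pick the topic with the most probability mass on the target words, then take its top remaining words" are all assumptions made for illustration.

```python
def target_aware_subtopics(target, vocab, topic_word, n_words=2):
    """Pick subtopic words for a target from a topic-word matrix.

    topic_word[k][i] is the (assumed) probability of vocab[i] under topic k.
    Hypothetical selection rule: choose the topic that puts the most mass on
    the target's own words, then return its highest-probability words that
    are not already part of the target.
    """
    target_words = target.lower().split()
    target_ids = [vocab.index(w) for w in target_words if w in vocab]

    # Score each topic by the probability mass it assigns to the target words.
    scores = [sum(row[i] for i in target_ids) for row in topic_word]
    best_topic = max(range(len(topic_word)), key=lambda k: scores[k])

    # Rank the vocabulary under that topic, highest probability first.
    ranked = sorted(range(len(vocab)),
                    key=lambda i: topic_word[best_topic][i],
                    reverse=True)
    subtopics = [vocab[i] for i in ranked if vocab[i] not in target_words]
    return subtopics[:n_words]


# Toy data (invented for this sketch): two topics over a seven-word vocabulary.
vocab = ["nuclear", "energy", "reactor", "waste", "school", "uniform", "students"]
topic_word = [
    [0.30, 0.25, 0.20, 0.15, 0.04, 0.03, 0.03],  # an "energy"-like topic
    [0.02, 0.03, 0.02, 0.03, 0.30, 0.35, 0.25],  # a "school"-like topic
]

subtopics = target_aware_subtopics("nuclear energy", vocab, topic_word)
augmented_target = "nuclear energy " + " ".join(subtopics)
print(augmented_target)
```

In this toy setup the short target gains extra context words before it is paired with a candidate sentence. A real system would take the topic-word distribution from a trained neural topic model rather than a hand-written matrix.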
Funding: Supported by research grants from the Xi’an Jiaotong-Liverpool University Research Development Funding (RDF-22-01-053), the Open Project of Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Anhui University (MMC202414), and Duke Kunshan University.
Abstract: Legal argumentation analysis represents a critical yet underexplored domain in computational linguistics, particularly for cross-lingual contexts where high-quality parallel corpora with argumentative annotations remain scarce. This paper introduces the Hong Kong Legal Judgments Argumentative Corpus (HKLJ-Arg), a novel bilingual Chinese-English parallel corpus comprising 557 professionally translated paragraph pairs extracted from 15 legal judgments spanning 2012-2025. Our corpus uniquely leverages Hong Kong’s bilingual legal framework to provide high-fidelity parallel translations with comprehensive argumentative structure annotations, including claims, premises, examples, and their logical relationships. The annotation methodology employs a prompt-engineered approach using large language models, validated through expert evaluation achieving Cohen’s κ of 0.79 for component identification and 0.72 for relationship classification. We further present the Argumentation Preservation Assessment Framework (APAF), a systematic methodology for evaluating how effectively machine translation systems preserve argumentative structures across linguistic boundaries. Statistical analysis reveals diverse argumentative patterns across constitutional, criminal, and civil law domains, with systematic preservation of logical structures across languages. This resource addresses a critical gap in legal natural language processing by providing the first large-scale bilingual corpus specifically designed for cross-lingual legal argumentation mining, translation quality assessment, and comparative legal reasoning analysis.
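The agreement figures above are Cohen’s κ values, which correct raw agreement for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The following is a minimal sketch of that standard formula. The label sequences are invented for illustration and are not taken from HKLJ-Arg, and the assumption that the two "raters" are the model-produced annotation and an expert is mine, not a detail stated in the abstract.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal frequencies, summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_expected == 1.0:
        # Both raters used one identical label everywhere; kappa is undefined,
        # and this sketch reports perfect agreement by convention.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Invented toy annotations for six argument components.
model_labels = ["claim", "premise", "premise", "example", "claim", "premise"]
expert_labels = ["claim", "premise", "claim", "example", "claim", "premise"]

kappa = cohens_kappa(model_labels, expert_labels)
print(f"Cohen's kappa: {kappa:.3f}")
```

Here the raters agree on five of six items (p_o = 5/6), chance agreement is 13/36, and κ works out to 17/23, noticeably lower than the raw agreement rate.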
Funding: Supported by the National Key Research and Development Plan (No. 2018YFC0830600), in cooperation with the China Justice Big Data Institute, which provided the judgement documents and employed professional annotators. The competition is also sponsored by Beijing Thunisoft Information Technology Co., Ltd., and supported by both the CAIL and SMP organizers.
Abstract: In this paper we present the results of the Interactive Argument-Pair Extraction in Judgement Document Challenge held by both the Chinese AI and Law Challenge (CAIL) and the Chinese National Social Media Processing Conference (SMP), and introduce the related data set, SMP-CAIL2020-Argmine. The task challenged participants to choose the correct argument among five candidates proposed by the defense to refute or acknowledge the given argument made by the plaintiff, given the full context recorded in the judgement documents of both parties. We received entries from 63 competing teams, 38 of which scored higher than the provided baseline model (BERT) in the first phase and entered the second phase. The best performing system in the two phases achieved accuracy of 0.856 and 0.905, respectively. In this paper, we present the results of the competition and a summary of the systems, highlighting commonalities and innovations among participating systems. The SMP-CAIL2020-Argmine data set and baseline models have already been released.