Funding: Supported by research grants from the Xi’an Jiaotong-Liverpool University Research Development Funding (RDF-22-01-053), the Open Project of the Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Anhui University (MMC202414), and Duke Kunshan University.
Abstract: Legal argumentation analysis is a critical yet underexplored domain in computational linguistics, particularly in cross-lingual contexts, where high-quality parallel corpora with argumentative annotations remain scarce. This paper introduces the Hong Kong Legal Judgments Argumentative Corpus (HKLJ-Arg), a novel bilingual Chinese-English parallel corpus comprising 557 professionally translated paragraph pairs extracted from 15 legal judgments spanning 2012-2025. The corpus leverages Hong Kong’s bilingual legal framework to provide high-fidelity parallel translations with comprehensive annotations of argumentative structure, including claims, premises, examples, and their logical relationships. The annotation methodology uses a prompt-engineered approach based on large language models, validated through expert evaluation that achieved a Cohen’s κ of 0.79 for component identification and 0.72 for relationship classification. We further present the Argumentation Preservation Assessment Framework (APAF), a systematic methodology for evaluating how effectively machine translation systems preserve argumentative structures across linguistic boundaries. Statistical analysis reveals diverse argumentative patterns across the constitutional, criminal, and civil law domains, with systematic preservation of logical structures across languages. This resource addresses a critical gap in legal natural language processing by providing the first large-scale bilingual corpus specifically designed for cross-lingual legal argumentation mining, translation quality assessment, and comparative legal reasoning analysis.
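For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's κ is computed for two annotators who label the same items. The function name `cohens_kappa` and the label sequences are hypothetical illustrations; they are not taken from HKLJ-Arg and do not reproduce the authors' evaluation code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical component labels (assumption, for illustration only).
model = ["claim", "premise", "premise", "example", "claim", "premise"]
expert = ["claim", "premise", "claim", "example", "claim", "premise"]
kappa = cohens_kappa(model, expert)
print(f"kappa = {kappa:.3f}")
```

Because κ subtracts the agreement expected by chance, it is lower than the raw agreement rate (here 5 of 6 items) whenever the label distribution makes chance agreement likely.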
Funding: Supported by the National Key Research and Development Plan (No. 2018YFC0830600) and carried out in cooperation with the China Justice Big Data Institute, which provided the judgement documents and employed the professional annotators. The competition was also sponsored by Beijing Thunisoft Information Technology Co., Ltd., and supported by the CAIL and SMP organizers.
Abstract: In this paper we present the results of the Interactive Argument-Pair Extraction in Judgement Document Challenge, held jointly by the Chinese AI and Law Challenge (CAIL) and the Chinese National Social Media Processing Conference (SMP), and introduce the related data set, SMP-CAIL2020-Argmine. The task challenged participants to choose, from five candidate arguments proposed by the defense, the one that refutes or acknowledges a given argument made by the plaintiff, given the full context recorded in the judgement documents for both parties. We received entries from 63 competing teams, 38 of which scored higher than the provided baseline model (BERT) in the first phase and entered the second phase. The best-performing system achieved an accuracy of 0.856 in the first phase and 0.905 in the second. We present the results of the competition and a summary of the systems, highlighting commonalities and innovations among them. The SMP-CAIL2020-Argmine data set and baseline models have already been released.
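The five-way selection task and its accuracy metric can be sketched as follows. This is an illustrative sketch only: the helper names `pick_candidate` and `accuracy` and the scores are hypothetical, and nothing here reflects the actual SMP-CAIL2020-Argmine data format or the baseline's implementation.

```python
def pick_candidate(scores):
    """Return the index of the highest-scoring candidate argument."""
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(all_scores, gold):
    """Fraction of instances where the top-scored candidate is the gold one."""
    if len(all_scores) != len(gold) or not gold:
        raise ValueError("need one gold index per scored instance")
    correct = sum(pick_candidate(s) == g for s, g in zip(all_scores, gold))
    return correct / len(gold)


# Hypothetical model scores: one row per plaintiff argument,
# one column per defense candidate (five candidates each).
all_scores = [
    [0.10, 0.70, 0.05, 0.10, 0.05],
    [0.30, 0.20, 0.20, 0.20, 0.10],
    [0.10, 0.10, 0.10, 0.20, 0.50],
    [0.25, 0.30, 0.15, 0.20, 0.10],
]
gold = [1, 0, 4, 0]  # hypothetical index of the correct candidate
acc = accuracy(all_scores, gold)
print(f"accuracy = {acc:.2f}")
```

With five candidates per instance, uniform random guessing yields an expected accuracy of 0.2, which is a useful reference point when reading the reported scores.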