Funding: National Natural Science Foundation of China (Grant Nos. 62376166, 62306188, 61876113); National Key R&D Program of China (No. 2022YFC3303504).
Abstract: Discourse relation classification is a fundamental task in discourse analysis, essential for understanding the structure and connections of texts. Implicit discourse relation classification aims to determine the relation between adjacent sentences and is very challenging because it lacks both explicit discourse connectives as linguistic cues and sufficient annotated training data. In this paper, we propose a discriminative instance selection method that constructs synthetic implicit discourse relation data from easy-to-collect explicit discourse relations. An expanded instance consists of an argument pair and its sense label. We introduce the argument pair type classification task, which distinguishes implicit from explicit argument pairs and selects the explicit argument pairs most similar to natural implicit argument pairs for data expansion. We also propose a simple label-smoothing technique to assign robust sense labels to the selected argument pairs. We evaluate our method on PDTB 2.0 and PDTB 3.0. The results show that our method consistently improves the performance of the baseline model and achieves results competitive with state-of-the-art models.
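To make the data-expansion idea concrete, below is a minimal Python sketch of discriminative instance selection combined with label smoothing. It is not the authors' implementation; the `type_classifier` object, its `prob_implicit` method, the keep ratio, and the smoothing factor are all hypothetical placeholders.

```python
# Minimal sketch: score explicit argument pairs with an argument-pair-type
# classifier, keep the most implicit-like ones, and attach smoothed sense labels.
def select_and_label(explicit_pairs, type_classifier, num_senses,
                     keep_ratio=0.3, epsilon=0.1):
    """explicit_pairs: list of (arg1, arg2, sense_id) taken from explicit data."""
    # Score each explicit pair by how "implicit-like" the classifier finds it.
    scored = [(type_classifier.prob_implicit(a1, a2), a1, a2, sense)
              for a1, a2, sense in explicit_pairs]
    scored.sort(key=lambda item: item[0], reverse=True)

    # Keep only the pairs most similar to natural implicit pairs.
    kept = scored[:int(len(scored) * keep_ratio)]

    expanded = []
    for _, a1, a2, sense in kept:
        # Label smoothing: spread mass epsilon over all senses so the synthetic
        # label is softer than a hard one-hot target.
        target = [epsilon / num_senses] * num_senses
        target[sense] += 1.0 - epsilon
        expanded.append((a1, a2, target))
    return expanded
```

The smoothed target keeps a little probability mass on every sense, reflecting that a sense label inherited from an explicit pair is less reliable once the connective is removed.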
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61836007, 61773276) and the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions.
Abstract: The discourse analysis task, which focuses on understanding the semantics of long text spans, has received increasing attention in recent years. As a critical component of discourse analysis, discourse relation recognition aims to identify the rhetorical relations between adjacent discourse units (e.g., clauses, sentences, and sentence groups), called arguments, in a document. Previous works focused on capturing the semantic interactions between arguments to recognize their discourse relations, ignoring important textual information in the surrounding contexts. In many cases, however, the texts of the two arguments alone are not enough to identify their rhetorical relation, and more contextual clues must be mined. In this paper, we propose a method that converts the RST-style discourse trees in the training set into dependency-based trees and trains a contextual evidence selector on these transformed structures. In this way, the selector learns to automatically pick critical textual information from the context (i.e., evidence) to assist in discriminating the relations between arguments. We then encode each argument concatenated with its corresponding evidence to obtain enhanced argument representations. Finally, we combine the original and enhanced argument representations to recognize their relations. In addition, we introduce auxiliary tasks to guide the training of the evidence selector and strengthen its selection ability. Experimental results on the Chinese CDTB dataset show that our method outperforms several state-of-the-art baselines in both micro and macro F1 scores.
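The combination step described above can be sketched roughly as follows. This is an illustrative PyTorch module, not the paper's released code; the choice of encoder, the hidden size, and the input format (pre-tokenized tensors for each argument, with and without appended evidence) are assumptions.

```python
import torch
import torch.nn as nn

class EvidenceEnhancedClassifier(nn.Module):
    """Illustrative sketch: combine plain and evidence-enhanced argument
    encodings before relation classification. `encoder` is any text encoder
    mapping token-id tensors to vectors of size `hidden`."""

    def __init__(self, encoder, hidden, num_relations):
        super().__init__()
        self.encoder = encoder
        # Original pair + enhanced pair -> 4 * hidden after concatenation.
        self.classifier = nn.Linear(4 * hidden, num_relations)

    def forward(self, arg1, arg2, arg1_with_evi, arg2_with_evi):
        # Plain argument representations.
        h1, h2 = self.encoder(arg1), self.encoder(arg2)
        # Representations of each argument concatenated with its selected evidence.
        e1, e2 = self.encoder(arg1_with_evi), self.encoder(arg2_with_evi)
        combined = torch.cat([h1, h2, e1, e2], dim=-1)
        return self.classifier(combined)
```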
Funding: Project supported by the National Natural Science Foundation of China (No. 61672440), the Natural Science Foundation of Fujian Province, China (No. 2016J05161), the Research Fund of the State Key Laboratory for Novel Software Technology in Nanjing University, China (No. KFKT2015B11), the Scientific Research Project of the National Language Committee of China (No. YB135-49), and the Fundamental Research Funds for the Central Universities, China (No. ZK1024).
Abstract: A lack of labeled corpora obstructs research progress on implicit discourse relation recognition (DRR) for Chinese, while labeled discourse corpora are available in other languages, such as English. In this paper, we propose a cross-lingual implicit DRR framework that exploits an available English corpus for the Chinese DRR task. We use machine translation to generate Chinese instances from a labeled English discourse corpus, so that each instance has two independent views: a Chinese view and an English view. We then train a Chinese classifier and an English classifier in a co-training fashion, which exploits unlabeled Chinese data to achieve better implicit DRR for Chinese. Experimental results demonstrate the effectiveness of our method.
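A bare-bones co-training loop over the two views might look like the sketch below. The classifier interface (`fit`, `predict`, `confidence`), the number of rounds, and the top-k selection are hypothetical choices made for illustration, not details taken from the paper.

```python
# Co-training sketch over Chinese/English views of the same instances.
def co_train(zh_clf, en_clf, labeled, unlabeled, rounds=5, top_k=100):
    """labeled: list of ((zh_text, en_text), relation); unlabeled: list of (zh_text, en_text)."""
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    for _ in range(rounds):
        # Retrain each view's classifier on the current labeled set.
        zh_clf.fit([(zh, y) for (zh, _), y in labeled])
        en_clf.fit([(en, y) for (_, en), y in labeled])

        # Each classifier labels the pool from its own view; its most confident
        # predictions are added to the shared labeled set for the next round.
        new_examples = []
        for clf, view in ((zh_clf, 0), (en_clf, 1)):
            ranked = sorted(unlabeled, key=lambda pair: clf.confidence(pair[view]),
                            reverse=True)[:top_k]
            new_examples += [(pair, clf.predict(pair[view])) for pair in ranked]

        labeled += new_examples
        chosen = {pair for pair, _ in new_examples}
        unlabeled = [pair for pair in unlabeled if pair not in chosen]
    return zh_clf, en_clf
```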
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61672368, 61373097, 61672367, 61331011), the Research Foundation of the Ministry of Education and China Mobile (MCM20150602), and the Natural Science Foundation of Jiangsu (BK20151222).
Abstract: We study implicit discourse relation detection, one of the most challenging tasks in the field of discourse analysis. We focus on ambiguous implicit discourse relations, an imperceptible linguistic phenomenon that is therefore difficult to identify and eliminate. In this paper, we first define a novel task named implicit discourse relation disambiguation (IDRD). Second, we propose a focus-sensitive relation disambiguation model that affirms a relation as truly correct when it is triggered by focal sentence constituents. In addition, we develop a topic-driven focus identification method and a relation search system (RSS) to support the relation disambiguation. Finally, we improve current relation detection systems by incorporating the disambiguation model. Experiments on the Penn Discourse Treebank (PDTB) show promising improvements.
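As a rough illustration only, the focus-sensitive check could be organized as below; `identify_focus` and `relation_search` are hypothetical stand-ins for the paper's topic-driven focus identification method and relation search system (RSS), and the control flow is a simplification of the described model.

```python
# Simplified sketch of focus-sensitive disambiguation: a candidate relation is
# affirmed only if it is still supported when attention is restricted to the
# focal constituents of the two arguments.
def disambiguate(arg1, arg2, candidate_relations, identify_focus, relation_search):
    """Return the subset of candidate relations triggered by focal constituents."""
    focus1 = identify_focus(arg1)   # topic-driven focus identification
    focus2 = identify_focus(arg2)
    affirmed = []
    for rel in candidate_relations:
        # Query a relation search component with only the focal constituents;
        # keep the relation if the focus-restricted evidence supports it.
        if relation_search(focus1, focus2, rel):
            affirmed.append(rel)
    return affirmed
```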
Funding: The National Natural Science Foundation of China (Grant Nos. 61876113 and 61876112), the Beijing Natural Science Foundation (Grant No. 4192017), the Support Project of High-level Teachers in Beijing Municipal Universities in the Period of the 13th Five-year Plan (Grant No. CIT&TCD20170322), and the Capacity Building for Sci-Tech Innovation-Fundamental Scientific Research Funds.
Abstract: Neural-network-based deep learning methods aim to learn representations of data and have produced state-of-the-art results in many natural language processing (NLP) tasks. Discourse parsing is an important research topic in discourse analysis, aiming to infer the discourse structure and model the coherence of a given text. This survey covers text-level discourse parsing, shallow discourse parsing, and coherence assessment. We first introduce the basic concepts and traditional approaches, and then focus on recent advances in discourse-structure-oriented representation learning. We also discuss the trend of discourse-structure-aware representation learning, which exploits discourse structures or discourse objectives to learn representations of sentences and documents, either for specific applications or for general purposes. Finally, we present a brief summary of the progress and discuss several future directions.