Funding: Supported by the Science and Technology Innovation Plan of Beijing Institute of Technology (2013)
Abstract: This paper proposes a new method for constructing the Chinese sentential semantic structure. The method uses features including predicates, relations between predicates and basic arguments, relations between words, and case types to train CRF++ and dependency parser models. On the data set of the Beijing Forest Studio-Chinese Tagged Corpus (BFS-CTC), the proposed method achieves a precision of 73.63% in an open test. This result shows that constructing the sentential semantic structure through formalized computer processing is feasible. The predicate, topic, and comment features extracted with the method can be applied directly in Chinese information processing, promoting the development of Chinese semantic analysis. The method makes sentential semantic analysis based on large-scale data possible. It is a tool for expanding the corpus and has both theoretical research value and practical application value.
Funding: Supported by the Humanities and Social Sciences Fund of Guangdong Province during the 13th Five-Year Plan (GD16YWW03), the Humanities and Social Sciences Fund of the Ministry of Education (16YJC740038), and the National Social Science Fund of China (17CYY003).
Abstract: This study conducted an eye-tracking experiment on the processing of different patterns of familiar Chinese metonymy in sentential contexts. It analyzes five eye-tracking measures of metonymy processing. The results indicate that different patterns of metonymy are processed differently under a sentential-context condition, resulting in prototype effects. The main finding is that Spatial Part & Whole metonymy is more prototypical than the other three patterns of metonymy, i.e., Container and Contained, Location and Located, and Entity and Adjacent Entity, and that the effect of metonymy pattern on processing is stable and observable. The study concludes that contextual information facilitates the processing of non-prototypical metonymy but restrains the processing of prototypical metonymy.
Abstract: Text classification techniques mostly rely on single-term analysis of the document data set, while many concepts, especially specific ones, are usually conveyed by sets of terms. To build a more accurate text classifier, more informative features, such as words that frequently co-occur in the same sentence and their weights, are particularly important in such scenarios. In this paper, we propose a novel approach to text classification using sentential frequent itemsets, a concept from association rule mining. The approach views a sentence rather than a document as a transaction and uses a variable precision rough set based method to evaluate each sentential frequent itemset's contribution to the classification. Experiments on the Reuters and newsgroup corpora validate the practicability of the proposed system.
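The idea of treating each sentence as a transaction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the naive sentence splitter, the brute-force enumeration (rather than an Apriori-style miner), and the `min_support` and `max_size` parameters are all assumptions made for illustration, and the variable precision rough set weighting step is not covered.

```python
from collections import Counter
from itertools import combinations


def sentence_transactions(document):
    """Split a document into sentences; each sentence becomes a set of lowercase words.

    Hypothetical helper: splits on '.', '!' and '?' only, with no real tokenization.
    """
    normalized = document.replace("!", ".").replace("?", ".")
    sentences = [s for s in normalized.split(".") if s.strip()]
    return [frozenset(s.lower().split()) for s in sentences]


def frequent_itemsets(transactions, min_support=2, max_size=2):
    """Return word sets of size 1..max_size that occur in at least min_support sentences.

    Brute-force counting, fine for a toy example but not for a real corpus.
    """
    counts = Counter()
    for words in transactions:
        for size in range(1, max_size + 1):
            # Sorting gives every itemset a single canonical tuple form.
            for itemset in combinations(sorted(words), size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


doc = "Oil prices rose sharply. Crude oil prices fell. The market reacted."
transactions = sentence_transactions(doc)
itemsets = frequent_itemsets(transactions, min_support=2, max_size=2)
print(itemsets)
```

In this toy document, "oil" and "prices" co-occur in two sentences, so the pair is kept as a sentential frequent itemset. Counting at the document level would record only that both words appear somewhere in the document, not that they appear together in the same sentence.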