Abstract
Motivated by work on information extraction, this paper exploits the characteristics of event-related documents. After analyzing a range of text features, it computes sentence similarity by combining word, semantic, and word-string information, and then clusters sentences on that basis. The result is a sentence clustering method based on feature selection, intended to provide a better basic resource for extracting information about the facets of an event. Experiments show that using multiple features clearly improves sentence clustering.
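The combination of features described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weights, the threshold, the use of bigram Jaccard overlap for word-string information, and the single-pass clustering step are all assumptions made for illustration, and the semantic feature is left out because it would need a lexical resource that the abstract does not name.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two bag-of-words vectors (vector space model)."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def ngrams(tokens, n=2):
    """Set of word n-grams in a tokenized sentence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def string_overlap(a, b, n=2):
    """Jaccard overlap of word n-grams (assumed stand-in for word-string information)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def similarity(a, b, w_word=0.6, w_string=0.4):
    """Weighted sum of the two features; the weights are hypothetical."""
    return w_word * cosine(a, b) + w_string * string_overlap(a, b)


def cluster(sentences, threshold=0.5):
    """Single-pass clustering: a sentence joins the first cluster whose seed is similar enough."""
    clusters = []
    for s in sentences:
        for c in clusters:
            if similarity(s, c[0]) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


sentences = [
    ["the", "quake", "hit", "the", "city"],
    ["the", "quake", "hit", "the", "town"],
    ["rescue", "teams", "arrived", "today"],
]
clusters = cluster(sentences)
```

With these toy sentences, the first two share most words and bigrams and fall into one cluster, while the third starts its own. A semantic score could be added as a third weighted term in `similarity` once a suitable resource is chosen.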
Source
《现代计算机》
2007, No. 5, pp. 23-25 (3 pages)
Modern Computer
Keywords
Information Extraction
Sentence Similarity
Sentence Clustering
Vector Space Model