Abstract
To improve the automation and accuracy of concept extraction for Chinese domain ontologies, this paper proposes an automatic concept extraction method that integrates multiple strategies with dynamic weights. Based on the characteristics of Chinese domain ontology concepts, candidate concepts are filtered using rule templates acquired through automatic learning. The improved DR&DC, TF-IDF, and NC-Value strategies are then fused to rank the candidate concepts by their degree of domain membership, and concepts whose final weight exceeds a threshold are added to the final concept set. Experiments demonstrate the feasibility and effectiveness of the method.
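The fusion and thresholding step can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the dynamic weights are computed or how the three strategies are improved, so the function names, the min-max normalization, the fixed example weights, and the threshold below are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one strategy's scores to [0, 1] so strategies are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates scored the same: this strategy carries no ranking signal.
        return {term: 0.0 for term in scores}
    return {term: (s - lo) / (hi - lo) for term, s in scores.items()}


def fuse_and_filter(
    strategy_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Combine per-strategy scores with a weighted average and keep
    candidates whose fused score exceeds the threshold.

    Assumption: every strategy scores the same set of candidate terms,
    and `weights` holds one non-negative weight per strategy. How the
    weights are chosen (the "dynamic" part) is outside this sketch.
    """
    normalized = {name: min_max_normalize(s) for name, s in strategy_scores.items()}
    total_weight = sum(weights[name] for name in normalized)
    candidates = next(iter(normalized.values())).keys()

    fused = {
        term: sum(weights[name] * normalized[name][term] for name in normalized)
        / total_weight
        for term in candidates
    }
    kept = [(term, score) for term, score in fused.items() if score > threshold]
    # Highest domain-membership score first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical scores for three candidate terms under three strategies.
    scores = {
        "dr_dc": {"ontology": 1.0, "table": 0.0, "concept": 0.5},
        "tf_idf": {"ontology": 10.0, "table": 0.0, "concept": 5.0},
        "nc_value": {"ontology": 2.0, "table": 0.0, "concept": 1.0},
    }
    weights = {"dr_dc": 1.0, "tf_idf": 1.0, "nc_value": 1.0}
    print(fuse_and_filter(scores, weights, threshold=0.4))
```

Normalizing each strategy before fusion keeps a strategy with a large numeric range, such as raw TF-IDF, from dominating the combined ranking; any other rescaling could be swapped in.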
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
2014, Issue 21, pp. 152-156 (5 pages)
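Funding
Science and Technology Research Project of the Xinjiang Uygur Autonomous Region (No. 200931103)
Natural Science Foundation of Xinjiang University (No. XY110121)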
Keywords
dynamic weight
ontology learning
multi-strategy
concept extraction