Abstract
To reduce the cost of ontology construction, a general ontology learning framework (GOLF) is proposed that builds ontologies automatically by mining web documents from different professional domains. The key technologies of GOLF are discussed: domain concept extraction, extraction of semantic relationships between concepts, and automatic taxonomy construction. The algorithms are tested experimentally, and ontology evaluation methods are also discussed. The experimental results show that the method achieves better performance. Because it integrates several machine learning algorithms, the method is less affected by ambiguity and identifies domain concepts and semantic relations more accurately. Because it uses the general-purpose ontologies WordNet and HowNet as its corpus, it is applicable across different domains. In addition, because source documents are retrieved from the web on demand, GOLF can produce up-to-date ontologies.
Funding
The National Basic Research Program of China (973 Program) (No. 2003CB317000); the Natural Science Foundation of Zhejiang Province (No. Y105625).