Research and Discussion on Information Extraction Technology (信息抽取技术研究与探讨)

Cited by: 1

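Abstract: This paper outlines the background and basic concepts of information extraction technology. It describes in detail four key techniques studied in information extraction: named entity recognition, entity relation extraction, coreference resolution, and event detection. Information extraction methods are then classified by the model they use, and the advantages, drawbacks, and research difficulties of each class are noted. Finally, the paper analyzes the current state of research and application in information extraction in China and abroad, and further describes the development trends of the technology.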

Authors: Wu Shouqin (伍守芹), Li Xiaoyun (李晓昀)

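Affiliations: Hunan Hengyang Radio and TV University (湖南衡阳广播电视大学); School of Computer Science and Technology, University of South China (南华大学计算机科学与技术学院)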

Source: Journal of Fujian Computer (《福建电脑》), 2010, No. 4, pp. 55, 65 (2 pages)

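Keywords: information extraction; natural language processing; hidden Markov model; maximum entropy model; conditional random field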

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
