Abstract
Text categorization is an important task in data mining and an important way of obtaining information resources. Analysis of a large collection of Chinese texts with known topics shows that the words in the title and in certain paragraphs matter more than others for describing the subject of a text. Accordingly, in the neural-network-based text categorizer, the network structure is placed in strict correspondence with the title and paragraph structure of the document. The training algorithms for the neural network, namely feedforward propagation and backward correction, are described in some detail, and fairly detailed calculations are given for their main steps. Tests of the categorizer indicate that the parameters of the neural network model are relatively simple to set and that its text classification performance is good.
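The abstract names two training phases, forward propagation and backward correction, without giving formulas. The sketch below illustrates those two phases on a generic one-hidden-layer sigmoid network in plain Python. It is not the paper's model: the class `TinyMLP`, the four-element feature layout (title and body term weights for two hypothetical topics), the toy samples, the squared-error loss, and the learning rate are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """Hypothetical one-hidden-layer network; not the architecture from the paper."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        # Forward propagation: input -> hidden activations -> output activations.
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sigmoid(sum(w * hj for w, hj in zip(row, h)) + b)
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Backward correction: error terms for squared error with sigmoid units.
        d_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
        d_hid = [hj * (1.0 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        # Gradient-descent updates, output layer first, then hidden layer.
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * dk * hj
            self.b2[k] -= lr * dk
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj
        return 0.5 * sum((yk - tk) ** 2 for yk, tk in zip(y, target))


# Assumed feature layout: [title weight topic A, body weight topic A,
#                          title weight topic B, body weight topic B]
samples = [
    ([1.0, 0.5, 0.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.2, 0.0, 0.1], [1.0, 0.0]),
    ([0.0, 0.0, 1.0, 0.5], [0.0, 1.0]),
    ([0.0, 0.1, 1.0, 0.3], [0.0, 1.0]),
]

net = TinyMLP(n_in=4, n_hidden=3, n_out=2)
losses = []
for epoch in range(500):
    losses.append(sum(net.train_step(x, t) for x, t in samples))

print("first epoch loss:", losses[0])
print("last epoch loss:", losses[-1])
```

The hidden-layer error terms are computed before any weights change, so the correction uses the same weights that produced the forward pass.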
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2006, No. 12, pp. 193-196 (4 pages)
Computer Engineering and Applications
Funding
Supported by the National High Technology Research and Development Program of China (863 Program), Grant No. 2001AA135080