Abstract
The training algorithm for hidden Markov models (HMMs) is a local search algorithm and is sensitive to its initial values. When an HMM is trained from random parameters, as in the traditional approach, training often gets trapped in a local optimum, and the resulting model performs poorly on Web mining. Genetic algorithms (GA) have strong global search ability but tend to converge prematurely and slowly. Simulated annealing (SA) has strong local search ability but wanders randomly and lacks global search ability. Combining the characteristics of the two, this paper proposes a hybrid simulated annealing-genetic algorithm (SGA) that optimizes the initial HMM parameters and so compensates for the sensitivity of the Baum-Welch algorithm to its initial parameters. The SGA parameters are chosen by experiment, and SGA is combined with Baum-Welch during Web mining. The Web mining experiments show that recall (REC) and precision (PRE) improve markedly for all five extracted fields.
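The abstract does not give the SGA operators, so the following is only a minimal sketch of the general idea, under stated assumptions: a discrete-observation HMM, fitness equal to the log-likelihood from a scaled forward pass, row-wise crossover, multiplicative mutation, and a Metropolis acceptance rule with geometric cooling. The names `sga_init`, `random_hmm`, `log_likelihood` and all default values are hypothetical and are not taken from the paper. The returned parameters would then serve as the starting point for Baum-Welch, which is not shown.

```python
import math
import random

def random_row(n, rng):
    """Return a strictly positive random probability vector of length n."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

def random_hmm(n_states, n_symbols, rng):
    """One individual: (pi, A, B), every row summing to 1."""
    pi = random_row(n_states, rng)
    A = [random_row(n_states, rng) for _ in range(n_states)]
    B = [random_row(n_symbols, rng) for _ in range(n_states)]
    return pi, A, B

def log_likelihood(hmm, obs):
    """Scaled forward algorithm: log P(obs | hmm) for one symbol sequence."""
    pi, A, B = hmm
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    ll = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        ll += math.log(scale)
        alpha = [a / scale for a in alpha]
    return ll

def crossover(p1, p2, rng):
    """Assumed operator: take each row from one parent, so rows stay stochastic."""
    pi = list(rng.choice((p1[0], p2[0])))
    A = [list(rng.choice((r1, r2))) for r1, r2 in zip(p1[1], p2[1])]
    B = [list(rng.choice((r1, r2))) for r1, r2 in zip(p1[2], p2[2])]
    return pi, A, B

def mutate_row(row, rate, rng):
    """With probability `rate`, perturb a row multiplicatively and renormalize."""
    if rng.random() >= rate:
        return row
    w = [x * math.exp(rng.gauss(0, 0.3)) for x in row]
    s = sum(w)
    return [x / s for x in w]

def mutate(hmm, rate, rng):
    pi, A, B = hmm
    return (mutate_row(pi, rate, rng),
            [mutate_row(r, rate, rng) for r in A],
            [mutate_row(r, rate, rng) for r in B])

def sga_init(sequences, n_states, n_symbols, pop_size=20, generations=30,
             t0=5.0, cooling=0.9, mutation_rate=0.2, seed=0):
    """Search for good initial HMM parameters (hypothetical hybrid SA-GA)."""
    rng = random.Random(seed)

    def fit(h):
        return sum(log_likelihood(h, s) for s in sequences)

    pop = [random_hmm(n_states, n_symbols, rng) for _ in range(pop_size)]
    scores = [fit(h) for h in pop]
    best_score = max(scores)
    best = pop[scores.index(best_score)]
    temp = t0
    for _ in range(generations):
        for k in range(pop_size):
            # GA part: pick a mate by binary tournament, then cross and mutate.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            mate = pop[a] if scores[a] >= scores[b] else pop[b]
            child = mutate(crossover(pop[k], mate, rng), mutation_rate, rng)
            c_score = fit(child)
            delta = c_score - scores[k]
            # SA part: Metropolis rule; worse children survive more often
            # while the temperature is high.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                pop[k], scores[k] = child, c_score
            if c_score > best_score:
                best, best_score = child, c_score
        temp *= cooling  # geometric cooling schedule (assumed)
    return best, best_score
```

In this sketch the annealing temperature only controls whether a child replaces its parent. Other ways of combining the two algorithms are equally plausible readings of the abstract, for example running SA as a local refinement of each GA individual.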
Source
《计算机技术与发展》
2012, No. 3, pp. 106-109 (4 pages)
Computer Technology and Development
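Funding
Supported by the Hunan Provincial Education Scientific Research Fund (10C1176)
Supported by the Hunan Provincial Education Scientific Research Fund, 2011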