Study on Dairy Cow Disease Prediction Model Based on Machine Learning Algorithm

Cited by: 4
Abstract 【Objective】This study evaluated the performance of six machine learning (ML) algorithms for building dairy cow disease prediction models, together with the importance of the predictor variables.【Method】Records from 944 lactating cows, collected from December 2020 to November 2021, were used to train and validate the models, with production and behavior information as predictors and disease information as the output variable. Daily milk production, rumination, activity, parity, and lactation days served as input variables. Six ML algorithms were evaluated: Decision Tree (DT) C5.0, CHAID, Artificial Neural Network (ANN), Random Forests (RF), Bayesian Networks (BN), and Logistic Regression (LR). The study also assessed the importance of the predictors and how much model performance improved once parity and lactation days were included as predictors. Sensitivity and specificity were used to evaluate model performance, and the importance of each input variable to model prediction was assessed by weight ranking.【Result】The DT C5.0 algorithm had a sensitivity above 85% and a specificity above 90%, making it the best-performing model. RF had an overall sensitivity of 56.8%, and its prediction performance was relatively stable across the different classes of cows. ANN, BN, and DT CHAID performed better on diseases with larger sample sizes, reaching up to 74.4%. LR correctly identified fewer than 40.0% of sick cows and classified most of them as healthy. Milk production was the most important predictor for RF, ANN, and LR, while lactation days was the most important predictor for DT C5.0, CHAID, and BN. After parity and lactation days were included, the sensitivity of model prediction increased by 9.8% on average.【Conclusion】ML algorithms show great potential for predicting dairy cow diseases, and DT C5.0 is the more suitable of the algorithms tested. Milk production and lactation days were relatively important variables in the disease prediction models, and including parity and lactation days as predictors can improve prediction accuracy.
Authors LI Shangru; SONG Jiamei; ZHANG Chengrui; SUN Yukun; ZHANG Yonggen (College of Animal Science and Technology, Northeast Agricultural University, Harbin 150030, China)
Source China Animal Husbandry & Veterinary Medicine (《中国畜牧兽医》), indexed in CAS and the Peking University Core Journals list, 2022, Issue 7, pp. 2534-2546 (13 pages)
Funding China Agriculture Research System (国家现代农业产业技术体系).
Keywords dairy cow; machine learning; disease prediction
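
The abstract reports model performance as sensitivity and specificity. As a minimal sketch of how those two measures are computed for one class against all others, the Python below uses made-up labels; the function name `sensitivity_specificity` and the class names `healthy`, `disease_a`, and `disease_b` are assumptions for illustration and do not come from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Return (sensitivity, specificity) for `positive` versus all other classes."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive case correctly flagged
            else:
                fn += 1  # positive case missed
        else:
            if pred == positive:
                fp += 1  # other class wrongly flagged as positive
            else:
                tn += 1  # other class correctly not flagged
    # Sensitivity: share of true positives among all actual positives.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of true negatives among all actual negatives.
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical labels, invented for illustration only (not data from the study).
y_true = ["healthy", "healthy", "disease_a", "disease_a",
          "disease_b", "healthy", "disease_a", "healthy"]
y_pred = ["healthy", "disease_a", "disease_a", "healthy",
          "disease_b", "healthy", "disease_a", "healthy"]

sens_a, spec_a = sensitivity_specificity(y_true, y_pred, "disease_a")
print(f"disease_a: sensitivity={sens_a:.3f}, specificity={spec_a:.3f}")
```

A model that labels most sick cows as healthy, as the abstract describes for LR, would show low sensitivity for the disease classes even when its specificity stays high, which is why the two measures are reported together.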