Abstract
This paper proposes a data classification algorithm based on statistical analysis. First, information extracted from the training set is used to model the support set selection problem, yielding a small support set with good separation ability. Next, a mixed integer programming model computes the optimal weights and classification threshold, and each sample is classified by comparing its weighted sum of patterns with that threshold. Finally, the algorithm is evaluated on real-world data. The experimental results show that the algorithm both improves classification accuracy and reduces the computation time required for classification.
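The decision rule described in the abstract (compare a weighted sum of patterns with a threshold) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual mixed integer programming model, so everything here is an assumption made for illustration: the binary pattern indicators, the function names, the toy data, and above all the exhaustive search over a small integer weight grid, which merely stands in for a real MIP solver.

```python
from itertools import product


def weighted_score(features, weights):
    """Weighted sum of the binary pattern indicators of one sample."""
    return sum(w * f for w, f in zip(weights, features))


def classify(features, weights, threshold):
    """Return class 1 if the weighted sum reaches the threshold, else class 0."""
    return 1 if weighted_score(features, weights) >= threshold else 0


def fit_by_enumeration(samples, labels, weight_grid=(-1, 0, 1)):
    """Search a small integer weight grid and candidate thresholds for the
    combination with the fewest training errors.

    Hypothetical stand-in for the MIP model mentioned in the abstract;
    only practical for a handful of patterns.
    """
    best = None
    n_patterns = len(samples[0])
    for weights in product(weight_grid, repeat=n_patterns):
        # Only thresholds equal to an observed score can change the labelling.
        candidate_thresholds = sorted({weighted_score(s, weights) for s in samples})
        for threshold in candidate_thresholds:
            errors = sum(
                classify(s, weights, threshold) != y
                for s, y in zip(samples, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, weights, threshold)
    return best


# Toy data (assumed): each tuple marks which of three patterns cover a sample.
samples = [(1, 0, 0), (1, 1, 0), (0, 0, 1), (0, 1, 1)]
labels = [1, 1, 0, 0]

errors, weights, threshold = fit_by_enumeration(samples, labels)
predictions = [classify(s, weights, threshold) for s in samples]
print(errors, weights, threshold, predictions)
```

For realistic problem sizes the enumeration would be replaced by an actual mixed integer programming solver; the classification step itself would stay the same.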
Authors
黄飞
吴锦华
HUANG Fei; WU Jin-hua (School of General Education and Foreign Languages, Anhui Institute of Information Technology, Wuhu 241000, China)
Source
《西安文理学院学报(自然科学版)》
2020, No. 4, pp. 57-61 (5 pages)
Journal of Xi’an University (Natural Science Edition)
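Funding
Key Natural Science Research Project of Universities, Anhui Provincial Department of Education (KJ2019A1289)
Anhui Provincial Major Teaching Research Project (2017jyxm0947)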
Keywords
statistical analysis
classification
optimization model