The Application of a Detection Algorithm for Similar and Duplicate Big Data Records in an Item Bank

Abstract: To improve the automatic detection of duplicate information in item banks, a big data detection algorithm for similar and duplicate records is proposed for item bank construction. Big data analysis methods are used to build a distribution model of similar and duplicate records in the item bank data and to obtain the distribution interval of duplicate records in random links. A hierarchical-relation in-degree set feature monitoring method is then used to analyze the feature structure of the similar and duplicate records. Based on the statistical features obtained, the similar and duplicate records are fused with a spatial grid clustering method, and, from the fused result, the records are detected in a spatial coordinate system. Simulation results show that the proposed algorithm detects similar and duplicate records in item bank data with a low error rate and a small time overhead.
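The abstract does not give the feature definitions or clustering details, so the following is only a minimal sketch of the general idea of grid-based detection of near-duplicate items, not the authors' algorithm. Everything in it is an assumption made for illustration: the function names (`shingles`, `jaccard`, `find_similar_duplicates`), the two coordinates used to place a record in the plane (normalized text length and number of distinct character bigrams), the cell size, and the Jaccard threshold. Records are bucketed into grid cells, and only records in the same or adjacent cells are compared.

```python
from collections import defaultdict
from itertools import product


def shingles(text, k=2):
    """Return the set of character k-grams of the lowercased, whitespace-free text."""
    s = "".join(text.lower().split())
    if len(s) < k:
        return {s} if s else set()
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def find_similar_duplicates(records, cell=8, threshold=0.6):
    """Return sorted index pairs (i, j), i < j, of records judged similar.

    Assumed (hypothetical) coordinates for each record:
      x = length of the normalized text, y = number of distinct bigrams.
    Only records in the same or a neighbouring grid cell are compared.
    """
    feats = [shingles(r) for r in records]
    grid = defaultdict(list)
    for idx, (rec, feat) in enumerate(zip(records, feats)):
        x = len("".join(rec.lower().split()))
        y = len(feat)
        grid[(x // cell, y // cell)].append(idx)

    pairs = set()
    for (gx, gy), members in list(grid.items()):
        # Visit the cell itself and its 8 neighbours.
        for dx, dy in product((-1, 0, 1), repeat=2):
            neighbours = grid.get((gx + dx, gy + dy), ())
            for i in members:
                for j in neighbours:
                    if i < j and jaccard(feats[i], feats[j]) >= threshold:
                        pairs.add((i, j))
    return sorted(pairs)
```

Restricting comparisons to neighbouring cells is what keeps the time overhead down in a scheme like this: items whose coordinates differ widely are never compared. The trade-off is that a pair whose coordinates differ by more than one cell is missed, so the cell size has to be chosen with the expected variation between duplicates in mind.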
Authors: HU Xiaoqin (胡小琴); PAN Jinfeng (潘锦锋) (Software Department, Quanzhou University of Information Engineering, Quanzhou 362000, China)
Source: Journal of Chengdu Technological University (《成都工业学院学报》), 2023, No. 1, pp. 66-69 (4 pages)
Funding: Fujian Provincial Education and Research Project for Young and Middle-aged Teachers (JAT190930).
Keywords: big data similarity; duplicate records; detection algorithm; design of test question bank; data clustering