An improved method for differential item functioning detection in cognitive diagnosis models: An application of the Wald statistic based on the observed information matrix

Cited by: 14

Abstract

Hou, de la Torre, and Nandakumar (2014) proposed using the Wald statistic to test for DIF, but their results showed excessively inflated Type I error rates. This study proposes an improved Wald statistic computed from the observed information matrix. The results show that: (1) the improved Wald statistic computed from the observed information matrix has good Type I error control in DIF detection, especially when items are highly discriminating, which resolves the Type I error inflation reported in earlier research; (2) the power of the Wald statistic computed from the observed information matrix increases as the sample size and the DIF size increase.

In cognitive diagnostic models (CDMs), differential item functioning (DIF) means that the probability of success on an item differs between groups for examinees with the same attribute mastery pattern. Detecting DIF is an important step in ensuring that results from CDMs are fair and valid for all groups. Hou et al. (2014) proposed that the Wald statistic can be used to detect DIF in CDMs. Unfortunately, their results revealed that the Wald statistic based on the information matrix estimation method developed by de la Torre (2009, 2011) yielded inflated Type I error rates. Li and Wang (2015), in turn, found that under the same conditions the Type I error rates of the Wald statistic were slightly inflated when MCMC algorithms were used. In this study, we propose an improved Wald statistic based on the observed information matrix for DIF assessment. As a general demonstration, we take the log-linear cognitive diagnosis model (LCDM; Henson et al., 2009) as an example.

To allow comparison with previous studies (e.g., Hou et al., 2014; Li & Wang, 2015), the simulation study followed the design used by Hou et al. (2014), except that the observed or the cross-product (XPD) information matrix was used to compute the Wald statistic. The test length was set to 30, the number of attributes to 5, and the maximum number of attributes required by an item to 3. Binary item response data were generated from the DINA model. Three sets of true item parameter values were considered for the reference group ($g_j = s_j = 0.1$, $0.2$, or $0.3$). Two DIF sizes (0.05 and 0.10) and two types of DIF (uniform and nonuniform) were manipulated. Two sample sizes were considered, 500 and 1,000. Each condition was replicated 1,000 times, and the estimation code was written in R (R Core Team, 2015).

The simulation results showed that: (1) for relatively discriminating items, the Wald statistic had accurate Type I error control when the observed information matrix was used in its computation; however, when the slip and guessing parameters were large ($s_j = g_j = 0.3$), the Type I error control was slightly conservative. (2) When the XPD information matrix was used to compute the Wald statistic, the Type I error control was conservative; that is, the observed information matrix performed better than the XPD information matrix. (3) The number of attributes required for success on an item did not have a notable impact on the Type I error control of the Wald statistic, irrespective of whether the observed or the XPD information matrix was used. (4) The power of the Wald statistic for detecting DIF increased as the sample size increased. We conclude that, for the DINA model, the improved Wald statistic asymptotically follows a chi-square distribution with 2 degrees of freedom. The improved Wald statistic is a useful and powerful tool for DIF detection in CDMs.
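The Wald test described in the abstract can be illustrated with a minimal sketch. It is not the authors' R code. It assumes the item parameters are ordered as (g_R, s_R, g_F, s_F) for the reference and focal groups, and that a 4x4 covariance matrix for these estimates is already available, for example from the inverse of the observed information matrix. The function name `wald_dif_dina` and the numbers in the usage note are hypothetical. The null hypothesis is that guessing and slip are equal across groups, so the statistic is referred to a chi-square distribution with 2 degrees of freedom, as stated for the DINA model.

```python
import math


def wald_dif_dina(beta, cov):
    """Wald statistic for H0: g_R = g_F and s_R = s_F for a single item.

    beta: [g_R, s_R, g_F, s_F], the item parameter estimates (assumed order).
    cov:  4x4 covariance matrix of beta in the same order, e.g. the relevant
          block of the inverse observed information matrix (assumption).
    Returns (W, p) where p is the chi-square p-value with 2 df.
    """
    # Contrasts R @ beta with R = [[1, 0, -1, 0], [0, 1, 0, -1]].
    d = [beta[0] - beta[2], beta[1] - beta[3]]

    # 2x2 covariance of the contrasts, R @ cov @ R^T, written out by hand.
    c = [[cov[a][b] - cov[a][b + 2] - cov[a + 2][b] + cov[a + 2][b + 2]
          for b in range(2)] for a in range(2)]

    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    if c[0][0] <= 0 or det <= 0:
        raise ValueError("contrast covariance matrix is not positive definite")

    # Quadratic form d^T c^{-1} d using the closed-form 2x2 inverse.
    w = (c[1][1] * d[0] ** 2
         - (c[0][1] + c[1][0]) * d[0] * d[1]
         + c[0][0] * d[1] ** 2) / det

    # For 2 degrees of freedom the chi-square survival function is exp(-w / 2).
    p = math.exp(-w / 2.0)
    return w, p


# Hypothetical illustration: uniform DIF of 0.10 on the guessing parameter,
# with made-up variances of 0.001 and no covariances.
example_cov = [[0.001 if i == j else 0.0 for j in range(4)] for i in range(4)]
example_w, example_p = wald_dif_dina([0.2, 0.2, 0.3, 0.2], example_cov)
```

With these made-up inputs the statistic is approximately 5 and the p-value is about 0.082, so the item would not be flagged at the 0.05 level. The choice of information matrix (observed or XPD) enters only through `cov`, which is why the two variants compared in the abstract can share the same test computation.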

Authors: Liu Yanlou, Xin Tao, Li Lingqing, Tian Wei, Liu Xiaoxiao

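Affiliations: Institute of Developmental Psychology, Beijing Normal University; Collaborative Innovation Center of Assessment toward Basic Education Quality; School of Teacher Education, Taishan University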

Source: Acta Psychologica Sinica (心理学报), indexed in CSSCI, CSCD, and the Peking University Core Journals list; 2016, Issue 5, pp. 588-598 (11 pages)

Funding: National Natural Science Foundation of China, General Program (31371047); Fundamental Research Funds for the Central Universities (SKZZX2013028)

Keywords: Wald statistic; differential item functioning; cognitive diagnosis model; observed information matrix; empirical cross-product information matrix

Classification code: B841 [Philosophy and Religion: Basic Psychology]

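References (2)

1. Wang Zhuoran, Bian Yufang, Guo Lei. The impact of differential item functioning on the estimation accuracy of cognitive diagnostic tests [J]. Psychological Exploration (心理学探新), 2015, 35(3): 272-278. Cited by: 2
2. Wang Zhuoran, Guo Lei, Bian Yufang. A comparison of differential item functioning detection methods in cognitive diagnostic tests [J]. Acta Psychologica Sinica (心理学报), 2014, 46(12): 1923-1932. Cited by: 9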

Secondary references (23)

1. Chen Ping, Xin Tao. (2011). Item replenishment in cognitive diagnostic computerized adaptive testing: The DINA model as an example. Doctoral dissertation, Beijing Normal University.
2. Brennan, R. L. (2006). Educational measurement (4th ed.). Westport, CT: American Council on Education and Praeger Publishers.
3. de la Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of Educational and Behavioral Statistics, 34(1), 115-130.
4. de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333-353.
5. de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179-199.
6. Flier, H., Mellenbergh, G. J., Ader, H. J., & Wijn, M. (1984). An iterative item bias detection method. Journal of Educational Measurement, 21(2), 131-145.
7. Haertel, E. H. (1989). Using restricted latent class models to map the skill structure of achievement items. Journal of Educational Measurement, 26(4), 301-321.
8. Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. Test Validity, 129-145.
9. Hou, L. (2013). Differential item functioning assessment in cognitive diagnostic modeling: Applying the Wald test to investigate DIF in the generalized DINA model framework. Journal of Educational Measurement, 51(1), 98-125.
10. Hou, L., de la Torre, J., & Nandakumar, R. (2014). Differential item functioning assessment in cognitive diagnostic modeling: Application of the Wald test to investigate DIF in the DINA model. Journal of Educational Measurement, 51(1), 98-125.
