Complex survey designs often involve unequal selection probabilities of clusters or of units within clusters. When models are estimated for complex survey data, scaled weights are incorporated into the likelihood, producing a pseudo-likelihood. In a 3-level weighted analysis for a binary outcome, we implemented two methods for scaling the sampling weights in the National Health Survey of Pakistan (NHSP). For the NHSP, with health care utilization as the binary outcome, we found age, gender, household (HH) goods, urban/rural status, community development index, province and marital status to be significant predictors of health care utilization (p-value < 0.05). The variance of the random intercepts using scaling method 1 is estimated as 0.0961 (standard error 0.0339) at the PSU level and 0.2726 (standard error 0.0995) at the household level. Both estimates are significantly different from zero (p-value < 0.05) and indicate considerable heterogeneity in health care utilization across households and PSUs. The NHSP data analysis showed that all three analyses, weighted (two scaling methods) and unweighted, converged to almost identical results, with few exceptions. This may have occurred because of the large number of 3rd- and 2nd-level clusters and the relatively small ICC. We performed a simulation study to assess the effect of varying prevalence and intra-class correlation coefficients (ICCs) on the bias of fixed-effect parameters and variance components in a multilevel pseudo maximum likelihood (weighted) analysis. The simulation results showed that the performance of the scaled weighted estimators is satisfactory for both scaling methods. Incorporating simulation into the analysis of complex multilevel surveys allows the integrity of the results to be tested and is recommended as good practice.
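The abstract does not say which two scaling methods were used. As a minimal sketch, assuming they resemble the two scalings most often applied in multilevel pseudo-likelihood analyses, the Python below rescales within-cluster weights so that they sum either to the effective cluster size (labeled method 1 here) or to the actual cluster size (labeled method 2 here). The function name `scale_weights`, the method numbering, and the toy data are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def scale_weights(weights, clusters, method):
    """Rescale within-cluster sampling weights (illustrative sketch).

    method 1 (assumed): scaled weights in each cluster sum to the
        effective cluster size, (sum w)^2 / sum w^2.
    method 2 (assumed): scaled weights in each cluster sum to the
        actual cluster size n_j.
    """
    if method not in (1, 2):
        raise ValueError("method must be 1 or 2")

    # Accumulate per-cluster totals in one pass.
    sum_w = defaultdict(float)
    sum_w2 = defaultdict(float)
    count = defaultdict(int)
    for w, c in zip(weights, clusters):
        sum_w[c] += w
        sum_w2[c] += w * w
        count[c] += 1

    scaled = []
    for w, c in zip(weights, clusters):
        if method == 1:
            factor = sum_w[c] / sum_w2[c]
        else:
            factor = count[c] / sum_w[c]
        scaled.append(w * factor)
    return scaled


# Toy data: five units in two clusters (hypothetical values).
weights = [1.0, 2.0, 3.0, 2.0, 2.0]
clusters = ["a", "a", "a", "b", "b"]
m1 = scale_weights(weights, clusters, method=1)
m2 = scale_weights(weights, clusters, method=2)
```

When weights are equal within a cluster, as in cluster "b", the two scalings coincide. They differ only where within-cluster weights vary, which is one reason weighted and unweighted fits can end up close to each other.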
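To address the inaccurate pseudo-labels that arise when conventional semi-supervised detection models pay too little attention to localization information, a mutual-consistency-based label filtering (MCBF) strategy is proposed. In the pseudo-label filtering stage, the strategy compares the IoU (Intersection over Union, a measure of the overlap between two bounding boxes) of pseudo-labels across consecutive training rounds to assess positional change, combines this with the average confidence over the two rounds to evaluate pseudo-label stability, and filters by threshold. In the pseudo-label generation stage, it analyzes the mutual consistency of position information and confidence between adjacent rounds, so that the network accounts for both class confidence and location. In addition, a new data augmentation strategy suited to strip steel defect detection, Copy-Fill-Smooth, is designed and effectively improves detection performance. On the NEU dataset and a private strip steel dataset, AP@50 reaches 68.8% and 66.1% respectively, showing a clear advantage over other semi-supervised detection models in strip steel defect detection.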
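The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the matching rule (best same-class IoU against the previous round), the threshold values, the function names and the sample boxes are all assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def filter_stable(prev, curr, iou_thr=0.7, conf_thr=0.8):
    """Keep current-round pseudo-labels that are stable across two rounds.

    prev, curr: lists of (box, confidence, label) tuples.
    A label is kept when its best same-class match in the previous round
    has IoU >= iou_thr and the mean of the two confidences >= conf_thr.
    Both thresholds are hypothetical defaults.
    """
    kept = []
    for box, conf, label in curr:
        best_iou, best_conf = 0.0, 0.0
        for pbox, pconf, plabel in prev:
            if plabel != label:
                continue
            overlap = iou(box, pbox)
            if overlap > best_iou:
                best_iou, best_conf = overlap, pconf
        if best_iou >= iou_thr and (conf + best_conf) / 2 >= conf_thr:
            kept.append((box, conf, label))
    return kept


# Hypothetical pseudo-labels from two consecutive training rounds.
prev = [((10, 10, 50, 50), 0.9, "scratch"), ((100, 100, 140, 140), 0.6, "patch")]
curr = [((12, 12, 52, 52), 0.85, "scratch"), ((120, 120, 160, 160), 0.9, "patch")]
stable = filter_stable(prev, curr)
```

In this toy case the "scratch" box barely moves between rounds and is kept, while the "patch" box shifts substantially and is dropped even though its current confidence is high. This is the kind of position-aware rejection that a confidence-only filter would miss.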
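Defect detection on unstructured roads is important for road traffic safety, but the annotated datasets it requires are limited. To address the lack of annotated data for unstructured roads and the limited ability of existing models to learn from unlabeled data, a semi-supervised object detection algorithm, MAM (Multi-Augmentation with Memory), is proposed. First, a caching mechanism is introduced to store the box-regression position information of unlabeled images and pseudo-labeled images, avoiding the computation wasted on subsequent matching. Second, a mixed data augmentation strategy is designed that feeds cached pseudo-labeled images together with unlabeled images into the student model, strengthening the model's generalization to new data and giving the images a more balanced scale distribution. MAM is not tied to a particular object detection model; it also preserves bounding-box consistency better and avoids computing a consistency loss. Experimental results show that MAM outperforms other fully supervised and semi-supervised learning algorithms. On the self-built unstructured-road defect dataset Defect, at labeling ratios of 10%, 20% and 30%, MAM improves mean average precision (mAP) over Soft Teacher by 6.8, 11.1 and 6.0 percentage points respectively. On the self-built unstructured-road pothole dataset Pothole, at labeling ratios of 15% and 30%, MAM improves mAP over Soft Teacher by 5.8 and 4.3 percentage points respectively.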