Funding: Funded by the Deanship of Research and Graduate Studies at King Khalid University through small group research under grant number RGP1/278/45.
Abstract: This paper introduces a novel method for medical image retrieval and classification by integrating a multi-scale encoding mechanism with Vision Transformer (ViT) architectures and a dynamic multi-loss function. The multi-scale encoding significantly enhances the model's ability to capture both fine-grained and global features, while the dynamic loss function adapts during training to optimize classification accuracy and retrieval performance. Our approach was evaluated on the ISIC-2018 and ChestX-ray14 datasets, yielding notable improvements. Specifically, on the ISIC-2018 dataset, our method achieves an F1-score improvement of +4.84% compared to the standard ViT, with a precision increase of +5.46% for melanoma (MEL). On the ChestX-ray14 dataset, the method delivers an F1-score improvement of 5.3% over the conventional ViT, with precision gains of +5.0% for pneumonia (PNEU) and +5.4% for fibrosis (FIB). Experimental results demonstrate that our approach outperforms traditional CNN-based models and existing ViT variants, particularly in retrieving relevant medical cases and enhancing diagnostic accuracy. These findings highlight the potential of the proposed method for large-scale medical image analysis, offering improved tools for clinical decision-making through superior classification and case comparison.
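As a rough illustration of how the dynamic multi-loss described above might be realized, the PyTorch sketch below combines a cross-entropy classification loss with a triplet retrieval loss under learnable uncertainty-style weights. The class name `DynamicMultiLoss`, the triplet margin, and the weighting rule are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class DynamicMultiLoss(nn.Module):
    """Combine a classification loss and a retrieval (triplet) loss with
    weights that adapt during training via learnable log-variances.
    Illustrative sketch only; the paper's exact weighting rule may differ."""

    def __init__(self, margin: float = 0.3):
        super().__init__()
        self.ce = nn.CrossEntropyLoss()
        self.triplet = nn.TripletMarginLoss(margin=margin)
        # One learnable log-variance per task (hypothetical parametrization).
        self.log_var_cls = nn.Parameter(torch.zeros(1))
        self.log_var_ret = nn.Parameter(torch.zeros(1))

    def forward(self, logits, labels, anchor, positive, negative):
        l_cls = self.ce(logits, labels)
        l_ret = self.triplet(anchor, positive, negative)
        # Uncertainty-style weighting: each task loss is scaled by
        # exp(-log_var) and regularized by the log_var term itself.
        loss = (torch.exp(-self.log_var_cls) * l_cls + self.log_var_cls
                + torch.exp(-self.log_var_ret) * l_ret + self.log_var_ret)
        return loss.squeeze()
```

During training, the two log-variance parameters shift the balance between the classification and retrieval objectives automatically, which is one common way to let a multi-loss "adapt" without hand-tuned weights.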
Abstract: Existing named entity recognition (NER) models based on bidirectional long short-term memory (BiLSTM) networks struggle to fully understand the overall semantics of a text and to capture complex entity relations. To address this, an NER model based on global information fusion and multi-dimensional relation perception is proposed. First, BERT (Bidirectional Encoder Representations from Transformers) is used to obtain vector representations of the input sequence, and a BiLSTM further learns the sequence's contextual information. Second, a global information fusion mechanism consisting of a gradient stabilization layer and a feature fusion module is proposed: the former keeps gradient propagation stable and refines the representation of the input sequence, while the latter fuses the forward and backward BiLSTM representations to obtain more comprehensive features. Next, a multi-dimensional relation perception structure is built to learn word associations in different subspaces and capture complex entity relations within a document. In addition, an adaptive focal loss function dynamically adjusts the weights of different entity classes, improving recognition of minority-class entities. Finally, the proposed model is compared with 11 baseline models on 7 public datasets; experimental results show that its F1 score exceeds all compared models, indicating strong overall performance.
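A minimal sketch of one component mentioned above, the adaptive focal loss: the per-class weights below are derived from running class frequencies so minority entity types are up-weighted. The class name `AdaptiveFocalLoss`, the momentum-based frequency estimate, and the inverse-frequency weighting are illustrative assumptions rather than the paper's exact definition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveFocalLoss(nn.Module):
    """Focal loss whose per-class weights track running class frequencies
    seen during training, so rare entity types receive larger weights.
    The update rule here is an illustrative choice."""

    def __init__(self, num_classes: int, gamma: float = 2.0, momentum: float = 0.9):
        super().__init__()
        self.gamma = gamma
        self.momentum = momentum
        # Running estimate of class frequencies (hypothetical bookkeeping).
        self.register_buffer("class_freq", torch.ones(num_classes) / num_classes)

    def forward(self, logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # logits: (N, C), labels: (N,)
        num_classes = logits.size(-1)
        with torch.no_grad():
            batch_freq = torch.bincount(labels, minlength=num_classes).float()
            batch_freq = batch_freq / batch_freq.sum().clamp(min=1.0)
            self.class_freq.mul_(self.momentum).add_(batch_freq, alpha=1 - self.momentum)
        # Inverse-frequency class weights, normalized to mean 1.
        alpha = 1.0 / self.class_freq.clamp(min=1e-6)
        alpha = alpha / alpha.mean()
        log_p = F.log_softmax(logits, dim=-1)
        log_p_t = log_p.gather(1, labels.unsqueeze(1)).squeeze(1)
        p_t = log_p_t.exp()
        # Standard focal modulation (1 - p_t)^gamma on the weighted NLL.
        loss = -alpha[labels] * (1.0 - p_t).pow(self.gamma) * log_p_t
        return loss.mean()
```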
Abstract: During traffic sign detection, weather and illumination conditions cause false and missed detections. To address this, a traffic sign detection algorithm that fuses spatial information is proposed. First, coordinate convolution is used in the network to make it more sensitive to coordinate position information. Second, a coordinate attention mechanism is added to the backbone feature extractor to better attend to spatial position information at the fusion points. In the feature fusion stage, a multi-scale weighted fusion network and pyramid pooling are used, with weighted computation and skip connections, to strengthen the fusion of semantic information between low-level and high-level features. Finally, the SIoU bounding-box regression loss (Scalable Intersection over Union Loss) is used to improve target localization accuracy. Experiments on the CCTSDB2021 and GTSDB datasets show that the method reaches a mean Average Precision (mAP) of 84.9% and 98.5% on the two datasets, respectively, a clear improvement over mainstream detection models and gains of 5.39 and 1.67 percentage points over the original model, improving traffic sign detection accuracy.
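As a concrete illustration of the coordinate-convolution step mentioned above, the sketch below concatenates normalized x/y coordinate channels to a feature map before a standard convolution (the CoordConv idea). The class name `CoordConv2d`, the channel layout, and the example shapes are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class CoordConv2d(nn.Module):
    """Convolution preceded by concatenating normalized x/y coordinate
    channels, giving the filters explicit access to spatial position."""

    def __init__(self, in_channels: int, out_channels: int,
                 kernel_size: int = 3, padding: int = 1):
        super().__init__()
        # Two extra input channels hold the coordinate maps.
        self.conv = nn.Conv2d(in_channels + 2, out_channels, kernel_size, padding=padding)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        # Coordinate grids scaled to [-1, 1] along each axis.
        ys = torch.linspace(-1.0, 1.0, steps=h, device=x.device).view(1, 1, h, 1).expand(b, 1, h, w)
        xs = torch.linspace(-1.0, 1.0, steps=w, device=x.device).view(1, 1, 1, w).expand(b, 1, h, w)
        return self.conv(torch.cat([x, xs, ys], dim=1))

# Example usage on a hypothetical 80x80 backbone feature map.
feat = torch.randn(2, 64, 80, 80)
out = CoordConv2d(64, 128)(feat)  # shape: (2, 128, 80, 80)
```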