BACKGROUND Esophageal cancer is the sixth most common cancer worldwide and has a high mortality rate. Early detection of esophageal abnormalities can improve patient survival. Esophageal cancer progresses from esophagitis to non-dysplastic Barrett's esophagus, then dysplastic Barrett's esophagus, and eventually esophageal adenocarcinoma (EAC). This study explored the use of deep learning for the pathological classification and staging of EAC to improve diagnostic accuracy and efficiency.
AIM To explore the application of deep learning models, particularly Wave-Vision Transformer (Wave-ViT), in the pathological classification and staging of esophageal cancer to enhance diagnostic accuracy and efficiency.
METHODS We applied several deep learning models, including a multi-layer perceptron, a residual network, a transformer, and Wave-ViT, to a dataset of clinically validated esophageal pathology images. The models were trained to identify pathological features and to assist in the classification and staging of esophageal cancer. The models were compared on accuracy, computational complexity, and efficiency.
RESULTS The Wave-ViT model achieved the highest accuracy at 88.97%, surpassing the transformer (87.65%), the residual network (85.44%), and the multi-layer perceptron (81.17%). Wave-ViT also showed low computational complexity and a substantially smaller parameter count, which makes it well suited to real-time clinical applications.
CONCLUSION Deep learning technology, particularly the Frequency-Domain Transformer model, shows promise in improving the precision of pathological classification and staging of EAC. Applying the Frequency-Domain Transformer model increases the automation of the diagnostic process and may support early detection and treatment of EAC. Future research may further explore the potential of this model in broader medical image analysis applications, particularly in precision medicine.
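The abstract does not describe how Wave-ViT works internally. The model name and the phrase "Frequency-Domain Transformer" suggest a wavelet-style decomposition of image features, so the following is only a generic sketch of that idea, not the study's implementation: one level of a 2D Haar transform on a small grayscale patch, in plain Python. The function names `haar_dwt2` and `haar_idwt2` and the sub-band names are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]


def haar_dwt2(image: Grid) -> Tuple[Grid, Grid, Grid, Grid]:
    """One level of a 2D Haar transform (illustrative assumption, not Wave-ViT itself).

    Returns four half-resolution sub-bands computed from each 2x2 block:
    average, top-minus-bottom, left-minus-right, and diagonal difference.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")
    avg, row_diff, col_diff, diag = [], [], [], []
    for i in range(0, height, 2):
        avg_row, rd_row, cd_row, dg_row = [], [], [], []
        for j in range(0, width, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            avg_row.append((a + b + c + d) / 2)
            rd_row.append((a + b - c - d) / 2)
            cd_row.append((a - b + c - d) / 2)
            dg_row.append((a - b - c + d) / 2)
        avg.append(avg_row)
        row_diff.append(rd_row)
        col_diff.append(cd_row)
        diag.append(dg_row)
    return avg, row_diff, col_diff, diag


def haar_idwt2(avg: Grid, row_diff: Grid, col_diff: Grid, diag: Grid) -> Grid:
    """Invert haar_dwt2, rebuilding the full-resolution grid from the four sub-bands."""
    out: Grid = []
    for i in range(len(avg)):
        top, bottom = [], []
        for j in range(len(avg[0])):
            s, r = avg[i][j], row_diff[i][j]
            c, g = col_diff[i][j], diag[i][j]
            top.extend([(s + r + c + g) / 2, (s + r - c - g) / 2])
            bottom.extend([(s - r + c - g) / 2, (s - r - c + g) / 2])
        out.append(top)
        out.append(bottom)
    return out


patch = [[1.0, 2.0], [3.0, 4.0]]
bands = haar_dwt2(patch)
restored = haar_idwt2(*bands)
```

Each sub-band has half the height and width of the input, and the inverse recovers the patch exactly, which is why a transform of this kind can reduce resolution without discarding information.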
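To address the lack of effective supervision at the feature-fusion level in algorithms that fuse millimeter-wave radar and camera data, this paper proposes a multimodal-fusion 3D object detection algorithm with lidar supervision (Radar and Camera Fusion Based on Lidar Supervision, LRCFusion). The algorithm first extracts features separately from the camera, lidar, and millimeter-wave radar data. It then applies knowledge distillation, using the lidar features as a teacher model to supervise the millimeter-wave radar features and thereby improve their representational quality. Next, an attention mechanism fuses the millimeter-wave radar and camera features, and a point-cloud-based 3D object detection method performs object detection and 3D anchor-box prediction on the fused features. Finally, the predicted 3D anchor boxes are used to update the pre-fusion 3D reference points. Compared with the baseline algorithm, the proposed algorithm improves average precision by 1.2% and the normalized detection score by 1%.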
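The distillation step can be sketched as follows, assuming the lidar (teacher) and radar (student) features have already been projected onto the same grid and that the supervision signal is a plain mean squared error. Both are assumptions for illustration: the abstract does not specify the loss or the feature layout, and the function names are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def distillation_loss(student: Grid, teacher: Grid) -> float:
    """Mean squared error between student (radar) and teacher (lidar) features."""
    total, count = 0.0, 0
    for s_row, t_row in zip(student, teacher):
        for s, t in zip(s_row, t_row):
            total += (s - t) ** 2
            count += 1
    return total / count


def distillation_step(student: Grid, teacher: Grid, lr: float) -> Grid:
    """One gradient-descent step on the student features; the teacher stays frozen.

    In a real network the gradient would flow into the radar encoder's weights.
    Updating the features directly is a simplification for this sketch.
    """
    n = sum(len(row) for row in student)
    return [
        [s - lr * 2 * (s - t) / n for s, t in zip(s_row, t_row)]
        for s_row, t_row in zip(student, teacher)
    ]


# Hypothetical 2x2 feature grids.
radar_features = [[0.0, 1.0], [2.0, 0.0]]
lidar_features = [[1.0, 1.0], [2.0, 2.0]]

loss_before = distillation_loss(radar_features, lidar_features)
updated = distillation_step(radar_features, lidar_features, lr=1.0)
loss_after = distillation_loss(updated, lidar_features)
```

Only the student is updated. The lidar branch acts as a fixed target, which is consistent with a method that uses lidar for supervision while fusing only radar and camera features for detection.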
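Exploiting thermal infrared characteristics, infrared stereo vision methods for perceiving pedestrians in road scenes can effectively detect pedestrians and other targets at night and in haze, improving driving safety. Because infrared images contain little texture detail, traditional dense binocular stereo matching algorithms perform poorly on them. To address this, the paper first extracts regions of interest (ROI) based on the brightness and edge features of targets in infrared images. It then extracts and matches image feature points within each ROI and computes an initial sparse depth map. Finally, exploiting the fact that depth varies little across a target's surface, it combines the ROIs with the initial depth map to estimate a semi-dense depth map. An experimental system was built to verify the method. The experimental results show that, within the system's field of view of about 120°, the relative error of depth perception for pedestrians and other targets is better than 1.5% within 15 m and better than 3% within 30 m.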
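A minimal sketch of the sparse-to-semi-dense idea, assuming a rectified stereo pair so that depth follows the standard pinhole relation Z = f * B / d (focal length f in pixels, baseline B in meters, disparity d in pixels), and assuming each ROI is filled with the median depth of its matched points. The abstract does not state the fill rule or the camera parameters; the function names and all numbers below are hypothetical.

```python
from statistics import median
from typing import Dict, List, Tuple

Point = Tuple[int, int, float]   # (x, y, depth in meters)
Roi = Tuple[int, int, int, int]  # (x0, y0, x1, y1), x1 and y1 exclusive


def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a matched feature point in a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def fill_roi_depth(roi: Roi, sparse_points: List[Point]) -> Dict[Tuple[int, int], float]:
    """Assign the median depth of the matched points inside the ROI to every ROI pixel.

    This relies on the assumption that depth varies little across one target.
    """
    x0, y0, x1, y1 = roi
    inside = [d for x, y, d in sparse_points if x0 <= x < x1 and y0 <= y < y1]
    if not inside:
        return {}
    roi_depth = median(inside)
    return {(x, y): roi_depth for y in range(y0, y1) for x in range(x0, x1)}


def relative_error(estimated_m: float, true_m: float) -> float:
    """Relative depth error, the quantity the reported 1.5% and 3% bounds refer to."""
    return abs(estimated_m - true_m) / true_m


# Hypothetical camera: 800 px focal length, 0.5 m baseline, 40 px disparity.
z = depth_from_disparity(800.0, 0.5, 40.0)

# Three matched points inside the ROI and one outside it.
points = [(2, 1, 10.0), (3, 2, 10.5), (3, 1, 11.0), (9, 9, 30.0)]
semi_dense = fill_roi_depth((2, 1, 4, 3), points)
```

Using the median rather than the mean keeps a single mismatched feature point from shifting the depth assigned to the whole target.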