BACKGROUND Esophageal cancer is the sixth most common cancer worldwide, with a high mortality rate. Early detection of esophageal abnormalities can improve patient survival rates. The progression of esophageal cancer follows a sequence from esophagitis to non-dysplastic Barrett’s esophagus, dysplastic Barrett’s esophagus, and eventually esophageal adenocarcinoma (EAC). This study explored the application of deep learning technology in the pathological classification and staging of EAC to enhance diagnostic accuracy and efficiency.
AIM To explore the application of deep learning models, particularly Wave-Vision Transformer (Wave-ViT), in the pathological classification and staging of esophageal cancer to enhance diagnostic accuracy and efficiency.
METHODS We applied several deep learning models, including a multi-layer perceptron, a residual network, a transformer, and Wave-ViT, to a dataset of clinically validated esophageal pathology images. The models were trained to identify pathological features and to assist in the classification and staging of esophageal cancer. The models were compared on accuracy, computational complexity, and efficiency.
RESULTS The Wave-ViT model achieved the highest accuracy at 88.97%, surpassing the transformer (87.65%), the residual network (85.44%), and the multi-layer perceptron (81.17%). Wave-ViT also exhibited low computational complexity with a significantly reduced parameter size, making it highly efficient for real-time clinical applications.
CONCLUSION Deep learning technology, particularly the Frequency-Domain Transformer model, shows promise in improving the precision of pathological classification and staging of EAC. Applying the Frequency-Domain Transformer model enhances the automation of the diagnostic process and may support early detection and treatment of EAC. Future research may further explore the potential of this model in broader medical image analysis applications, particularly in the field of precision medicine.
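The accuracy comparison reported in RESULTS can be checked with a few lines of arithmetic. The sketch below is only an illustration: the accuracy figures are taken from the abstract, while the dictionary layout and the function name `accuracy_margins` are assumptions introduced here. Parameter counts are not given in the abstract, so they are left out.

```python
# Reported test accuracies (percent), copied from the RESULTS section.
REPORTED_ACCURACY = {
    "Wave-ViT": 88.97,
    "Transformer": 87.65,
    "Residual network": 85.44,
    "Multi-layer perceptron": 81.17,
}


def accuracy_margins(accuracies):
    """Rank models by accuracy and compute each model's gap to the best one.

    Returns (ranked, margins): `ranked` is a list of (name, accuracy) pairs
    sorted from best to worst; `margins` maps every non-best model to the
    best accuracy minus its own, in percentage points, rounded to 2 decimals.
    """
    ranked = sorted(accuracies.items(), key=lambda item: item[1], reverse=True)
    best_name, best_acc = ranked[0]
    margins = {
        name: round(best_acc - acc, 2)
        for name, acc in ranked
        if name != best_name
    }
    return ranked, margins


ranked, margins = accuracy_margins(REPORTED_ACCURACY)
for name, gap in margins.items():
    print(f"{ranked[0][0]} leads {name} by {gap:.2f} percentage points")
```

On the reported figures, Wave-ViT leads the transformer by 1.32 percentage points, the residual network by 3.53, and the multi-layer perceptron by 7.80.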
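To address the lack of effective supervision at the feature-fusion level in algorithms that fuse millimeter-wave radar and vision sensors, this paper proposes a multimodal fusion 3D object detection algorithm that introduces lidar supervision (Radar and Camera Fusion Based on Lidar Supervision, LRCFusion). The algorithm first extracts features separately from the vision sensor, lidar, and millimeter-wave radar data. It then applies knowledge distillation, using the lidar features as a teacher model to supervise the millimeter-wave radar features and thereby improve their representational quality. Next, an attention mechanism fuses the millimeter-wave radar and vision features, and a point-cloud-based 3D object detection method performs object detection and 3D anchor box prediction on the fused features. Finally, the predicted 3D anchor boxes are used to update the pre-fusion 3D reference points. Compared with the baseline algorithm, the proposed algorithm improves average precision by 1.2% and the normalized detection score by 1%.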
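Two steps of the LRCFusion pipeline, lidar-to-radar feature distillation and attention-based radar-camera fusion, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract specifies neither the distillation loss nor the attention variant, so a mean-squared-error feature loss and single-head scaled dot-product attention are assumed here, and all function names and toy feature values are hypothetical.

```python
import math


def feature_distillation_loss(student, teacher):
    """Mean squared error between student (radar) and teacher (lidar) features.

    Both arguments are lists of feature vectors with matching shapes.
    MSE is an assumed choice; the abstract does not name the loss.
    """
    if len(student) != len(teacher) or not student:
        raise ValueError("student and teacher need the same non-zero length")
    total, count = 0.0, 0
    for s_vec, t_vec in zip(student, teacher):
        if len(s_vec) != len(t_vec):
            raise ValueError("feature vectors must have matching dimensions")
        for s, t in zip(s_vec, t_vec):
            total += (s - t) ** 2
            count += 1
    return total / count


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention.

    Each query (here: a camera feature) is replaced by a weighted average
    of the values (here: radar features), weighted by query-key similarity.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


# Toy features (hypothetical values, two 2-D vectors per modality).
radar_feats = [[0.0, 1.0], [1.0, 0.0]]
lidar_feats = [[0.0, 2.0], [1.0, 0.0]]
camera_feats = [[1.0, 0.0]]

distill_loss = feature_distillation_loss(radar_feats, lidar_feats)
fused_feats = cross_attention(camera_feats, radar_feats, radar_feats)
print(distill_loss, fused_feats)
```

In a training loop, a loss of this kind would be added to the detection loss so that the radar branch is pushed toward the lidar representation, while lidar itself would presumably be needed only during training.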