Colorectal cancer is among the most common cancers and has the second-highest mortality rate. Polyps are precursor lesions of colorectal cancer, and their early detection and removal can effectively reduce patient mortality. However, an endoscopy generates a large number of images, which greatly increases doctors' workload, and long-term mechanical screening of endoscopy images also leads to a high misdiagnosis rate. To address the heavy dependence of computer-aided diagnosis models on computational power in the polyp detection task, we propose a lightweight model, coordinate attention-YOLOv5-Lite-Prune, based on the YOLOv5 algorithm. This differs from the state-of-the-art methods in existing research, which apply object detection models or their variants (such as faster region-based convolutional neural networks, YOLOv3, YOLOv4, and the single shot multibox detector) directly to the prediction task without any lightweight processing. The innovations of our model are as follows. First, the lightweight EfficientNetLite network is introduced as the new feature extraction network. Second, depthwise separable convolution and improved modules combining it with different attention mechanisms replace the standard convolution in the detection head. Then, the α-intersection over union loss function is applied to improve the precision and convergence speed of the model. Finally, the model size is compressed with a pruning algorithm. Our model effectively reduces the parameter count and computational complexity without significant accuracy loss. Therefore, it can be deployed on an embedded deep learning platform and detect polyps at more than 30 frames per second, freeing deep learning models from the limitation of relying on high-performance servers.
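The savings from replacing a standard convolution with a depthwise separable one can be illustrated by counting parameters. A standard k × k convolution learns one k × k filter per input–output channel pair, while the depthwise separable version factors this into a per-channel spatial filter followed by a 1 × 1 pointwise convolution. A minimal sketch (the channel sizes below are illustrative, not taken from the paper's architecture):

```python
def conv_params(k, c_in, c_out):
    # standard convolution: a k x k filter for every (input, output) channel pair
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    # depthwise stage: one k x k filter per input channel
    # pointwise stage: a 1 x 1 convolution that mixes channels
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 128, 256            # example layer sizes (assumed)
std = conv_params(k, c_in, c_out)        # 294912 parameters
dws = dw_separable_params(k, c_in, c_out)  # 33920 parameters
print(std, dws, round(std / dws, 1))     # roughly 8.7x fewer parameters
```

The reduction factor approaches 1/c_out + 1/k², so for a 3 × 3 kernel the separable form needs close to one ninth of the parameters, which is the main source of the model's lighter footprint.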
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 81971767, 62103263, and 62103267) and the Shanghai Science and Technology Commission (Nos. 19142203800, 19441913800, and 19441910600).
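The α-intersection over union loss mentioned in the abstract generalizes the standard IoU loss by raising the IoU to a power α, which up-weights gradients for high-overlap boxes and speeds convergence. A minimal sketch of the basic power form, 1 − IoU^α, for axis-aligned boxes (the paper's exact variant may include additional penalty terms, and α = 3 is a commonly used default, not a value confirmed by the abstract):

```python
def iou(box_a, box_b):
    # boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def alpha_iou_loss(box_a, box_b, alpha=3.0):
    # power-IoU loss: 1 - IoU**alpha; alpha > 1 sharpens the loss near IoU = 1
    return 1.0 - iou(box_a, box_b) ** alpha

pred = (0.0, 0.0, 2.0, 2.0)  # predicted box (hypothetical)
gt = (1.0, 1.0, 3.0, 3.0)    # ground-truth box (hypothetical)
print(iou(pred, gt), alpha_iou_loss(pred, gt))  # IoU = 1/7
```

With α = 1 the loss reduces to the ordinary 1 − IoU, so the family contains the standard loss as a special case.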