Convolutional neural networks (CNNs) with an encoder-decoder structure are popular in medical image segmentation because of their excellent local feature extraction, but they are limited in capturing global features. The transformer extracts global information well, yet adapting it to small medical datasets is challenging and its computational cost can be heavy. In this work, a serial and parallel network is proposed for accurate 3D medical image segmentation by combining CNN and transformer and promoting feature interactions across semantic levels. The core components of the proposed method are the cross-window self-attention based transformer (CWST) module and the multi-scale local enhanced (MLE) module. The CWST module enhances global context understanding by partitioning 3D images into non-overlapping windows and computing sparse global attention between windows. The MLE module selectively fuses features by computing voxel attention between different branch features and uses convolution to strengthen dense local information. Experiments on prostate, atrium, and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as the Dice similarity coefficient, Intersection over Union, 95% Hausdorff distance, and average symmetric surface distance.
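As an illustration of the window-partitioning step that window-based attention relies on, the sketch below splits a 3D feature volume into non-overlapping windows so that attention can then be computed between them; the tensor layout (B, D, H, W, C) and the window size of 4 are assumptions for illustration, not the paper's configuration.

```python
# Hypothetical sketch: partition a 3D feature volume into non-overlapping windows
# before window-based self-attention. Layout and window size are assumptions.
import torch

def partition_windows_3d(x: torch.Tensor, ws: int = 4) -> torch.Tensor:
    """x: (B, D, H, W, C) with D, H, W divisible by ws -> (num_windows * B, ws**3, C)."""
    B, D, H, W, C = x.shape
    x = x.view(B, D // ws, ws, H // ws, ws, W // ws, ws, C)
    return x.permute(0, 1, 3, 5, 2, 4, 6, 7).reshape(-1, ws ** 3, C)

vol = torch.randn(1, 16, 16, 16, 32)    # toy feature volume
windows = partition_windows_3d(vol)     # (64, 64, 32): 64 windows of 4^3 voxels each
```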
The diagnostic interpretation of dermoscopic images is a complex task because it is very difficult to distinguish skin lesions from normal skin. Accurate detection of potential abnormalities is therefore required for patient monitoring and effective treatment. In this work, a Two-Tier Segmentation (TTS) system is designed that combines unsupervised and supervised techniques for skin lesion segmentation. It comprises preprocessing by a median filter, TTS by Colour K-Means Clustering (CKMC) for initial segmentation, and a Faster Region-based Convolutional Neural Network (FR-CNN) for refined segmentation. The CKMC approach is evaluated using different numbers of clusters (k = 3, 5, 7, and 9). An inception network with batch normalization is employed to segment melanoma regions effectively. Different loss functions such as Mean Absolute Error (MAE), Cross Entropy Loss (CEL), and Dice Loss (DL) are utilized for performance evaluation of the TTS system, and the anchor box technique is employed to detect the melanoma region effectively. The TTS system is evaluated using 200 dermoscopic images from the PH2 database, and the segmentation accuracies are analyzed in terms of Pixel Accuracy (PA) and Jaccard Index (JI). Results show that, using DL in the FR-CNN with seven clusters in CKMC, the TTS system achieves 90.19% PA with 0.8048 JI for skin lesion segmentation, outperforming CEL and MAE.
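A minimal sketch of the colour-clustering idea behind the CKMC initial segmentation is given below: every pixel is clustered by colour and the resulting label map can then be refined by the supervised stage. The value k = 7 and the RGB pixel layout are assumptions for illustration only.

```python
# Hypothetical colour K-means clustering for an initial lesion label map.
import numpy as np
from sklearn.cluster import KMeans

def colour_kmeans_labels(rgb: np.ndarray, k: int = 7) -> np.ndarray:
    """rgb: (H, W, 3) uint8 image -> (H, W) cluster-label map."""
    h, w, _ = rgb.shape
    pixels = rgb.reshape(-1, 3).astype(np.float32)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(pixels)
    return labels.reshape(h, w)

image = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)  # stand-in image
label_map = colour_kmeans_labels(image, k=7)  # candidate regions for FR-CNN refinement
```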
Liver tumor segmentation from computed tomography (CT) images is an essential task for the diagnosis and treatment of liver cancer. However, it is difficult owing to the variability of appearances, fuzzy boundaries, and the heterogeneous densities, shapes, and sizes of lesions. In this paper, an automatic method based on convolutional neural networks (CNNs) is presented to segment lesions from CT images. CNNs are deep learning models whose convolutional filters can learn hierarchical features from data. We compared the CNN model to popular machine learning algorithms: AdaBoost, Random Forests (RF), and support vector machine (SVM). These classifiers were trained on handcrafted features containing mean, variance, and contextual features. Experimental evaluation was performed on 30 portal-phase enhanced CT images using leave-one-out cross validation. The average Dice Similarity Coefficient (DSC), precision, and recall achieved were 80.06% ± 1.63%, 82.67% ± 1.43%, and 84.34% ± 1.61%, respectively. The results show that the CNN method performs better than the other methods and is promising for liver tumor segmentation.
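The overlap metrics reported here can be computed directly from binary masks; the short sketch below shows one common formulation of DSC, precision, and recall. Thresholding predictions into binary masks beforehand is an assumption.

```python
# Hypothetical computation of Dice, precision, and recall from binary masks.
import numpy as np

def dice_precision_recall(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    dice = 2 * tp / (pred.sum() + gt.sum() + eps)
    precision = tp / (pred.sum() + eps)
    recall = tp / (gt.sum() + eps)
    return dice, precision, recall
```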
High-throughput maize phenotyping at both organ and plant levels plays a key role in molecular breeding for increasing crop yields. Although the rapid development of light detection and ranging (LiDAR) provides a new way to characterize three-dimensional (3D) plant structure, there is a need to develop robust algorithms for extracting 3D phenotypic traits from LiDAR data to assist in gene identification and selection. Accurate 3D phenotyping in field environments remains challenging, owing to difficulties in segmentation of organs and individual plants in field terrestrial LiDAR data. We describe a two-stage method that combines both convolutional neural networks (CNNs) and morphological characteristics to segment stems and leaves of individual maize plants in field environments. It initially extracts stem points using the PointCNN model and obtains stem instances by fitting 3D cylinders to the points. It then segments the field LiDAR point cloud into individual plants using local point densities and 3D morphological structures of maize plants. The method was tested using 40 samples from field observations and showed high accuracy in the segmentation of both organs (F-score = 0.8207) and plants (F-score = 0.9909). The effectiveness of terrestrial LiDAR for phenotyping at the organ level (including leaf area and stem position) and individual plant level (including individual height and crown width) in field environments was evaluated. The accuracies of derived stem position (position error = 0.0141 m), plant height (R^2 > 0.99), crown width (R^2 > 0.90), and leaf area (R^2 > 0.85) allow investigating plant structural and functional phenotypes in a high-throughput way. This CNN-based solution overcomes the major challenges in organ-level phenotypic trait extraction associated with organ segmentation, and potentially contributes to studies of plant phenomics and precision agriculture.
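As a small illustration of the plant-level traits mentioned above, the sketch below derives height and crown width from one segmented plant's points; representing a plant as an (N, 3) array of metre-scale coordinates is an assumption, and this is not the paper's actual trait-extraction code.

```python
# Hypothetical plant-level trait extraction from one segmented plant's points.
import numpy as np

def plant_traits(points: np.ndarray):
    """points: (N, 3) x/y/z coordinates in metres for one segmented plant."""
    height = float(points[:, 2].max() - points[:, 2].min())
    extents = points[:, :2].max(axis=0) - points[:, :2].min(axis=0)
    crown_width = float(extents.max())   # crude crown width: largest horizontal extent
    return height, crown_width
```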
To overcome the computational burden of processing three-dimensional (3D) medical scans and the lack of spatial information in two-dimensional (2D) medical scans, a novel segmentation method was proposed that integrates the segmentation results of three densely connected 2D convolutional neural networks (2D-CNNs). In order to combine the low-level and high-level features, we added densely connected blocks to the network structure so that the low-level features are not lost as the network deepens during learning. Further, to resolve the problem of the blurred boundary of the glioma edema area, we superimposed and fused the T2-weighted fluid-attenuated inversion recovery (FLAIR) modal image and the T2-weighted (T2) modal image to enhance the edema section. For the loss function of network training, we improved the cross-entropy loss function to effectively avoid network over-fitting. On the Multimodal Brain Tumor Image Segmentation Challenge (BraTS) datasets, our method achieves Dice similarity coefficient values of 0.84, 0.82, and 0.83 on the BraTS2018 training set; 0.82, 0.85, and 0.83 on the BraTS2018 validation set; and 0.81, 0.78, and 0.83 on the BraTS2013 testing set for whole tumors, tumor cores, and enhancing cores, respectively. Experimental results showed that the proposed method achieves promising accuracy and fast processing, demonstrating good potential for clinical medicine.
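The FLAIR/T2 superposition described above can be read as a weighted sum of intensity-normalised modalities; the sketch below is one hedged interpretation, and the equal weights and min-max normalisation are assumptions rather than the paper's exact fusion rule.

```python
# Hypothetical FLAIR/T2 fusion by weighted superposition to emphasise edema.
import numpy as np

def fuse_flair_t2(flair: np.ndarray, t2: np.ndarray, w: float = 0.5) -> np.ndarray:
    def norm(x):
        return (x - x.min()) / (x.max() - x.min() + 1e-8)
    return w * norm(flair) + (1.0 - w) * norm(t2)
```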
Magnetic Resonance Imaging (MRI) is an important diagnostic technique for the early detection of brain tumors, and the classification of brain tumors from MRI images is a challenging research problem because of their different shapes, locations, and image intensities. For successful classification, a segmentation method is required to separate the tumor, after which important features are extracted from the segmented tumor and used to classify it. In this work, an efficient multilevel segmentation method is developed that combines optimal thresholding and watershed segmentation followed by a morphological operation to separate the tumor. A Convolutional Neural Network (CNN) is then applied for feature extraction and, finally, a Kernel Support Vector Machine (KSVM) is utilized for the resulting classification, which is justified by our experimental evaluation. Experimental results show that the proposed method effectively detects and classifies tumors as cancerous or non-cancerous with promising accuracy.
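A compact sketch of the multilevel idea, optimal (Otsu) thresholding followed by marker-based watershed and a morphological opening, is given below; the marker heuristic and structuring-element size are assumptions, not the paper's exact settings.

```python
# Hypothetical thresholding + watershed + morphology pipeline for a 2D MRI slice.
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu
from skimage.segmentation import watershed
from skimage.morphology import opening, disk

def threshold_watershed(mri_slice: np.ndarray) -> np.ndarray:
    mask = mri_slice > threshold_otsu(mri_slice)          # optimal thresholding
    distance = ndi.distance_transform_edt(mask)
    markers, _ = ndi.label(distance > 0.5 * distance.max())
    labels = watershed(-distance, markers, mask=mask)     # watershed segmentation
    return opening(labels > 0, disk(3))                   # morphological cleanup
```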
In intelligent perception and diagnosis with medical equipment, the visual and morphological changes in retinal vessels are closely related to the severity of cardiovascular diseases (e.g., diabetes and hypertension), and intelligent auxiliary diagnosis of these diseases depends on the accuracy of the retinal vessel segmentation results. To address this challenge, we design a Dual-Branch-UNet framework, which comprises a dual-branch encoder structure for feature extraction built on the traditional U-Net model for medical image segmentation. More explicitly, we utilize a novel parallel encoder made up of various convolutional modules to enhance the encoder portion of the original U-Net. Image features are then combined at each layer to produce richer semantic data, and the model's capacity is adapted to various input images. Meanwhile, in the downsampling stage, we abandon pooling and perform downsampling by convolution, controlling the step size for information fusion. We also employ an attention module in the decoder stage to filter image noise and lessen the response of irrelevant features. Experiments are verified and compared on the DRIVE and ARIA datasets for retinal vessel segmentation. The proposed Dual-Branch-UNet proves superior to five other typical state-of-the-art methods.
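The pooling-free downsampling mentioned above can be realised with a strided convolution; the fragment below is a minimal sketch, and the channel counts, kernel size, and normalisation are assumptions.

```python
# Hypothetical strided-convolution downsampling block replacing pooling.
import torch.nn as nn

downsample = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),  # stride sets the step size
    nn.BatchNorm2d(128),
    nn.ReLU(inplace=True),
)
```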
In actual traffic scenarios, precise recognition of traffic participants, such as vehicles and pedestrians, is crucial for intelligent transportation. This study proposes an improved algorithm built on Mask-RCNN to enhance the ability of autonomous driving systems to recognize traffic participants. The algorithm incorporates long short-term memory networks and a fused attention module (GSAM, GCT, and Spatial Attention Module) to enhance its capability to process both global and local information. Additionally, to increase the network's initial operation stability, the original activation function was replaced with the Gaussian error linear unit. Experiments were conducted using the publicly available Cityscapes dataset. Comparing the test results, the revised algorithm outperformed the original algorithm in terms of AP50, AP75, and other metrics by 8.7% and 9.6% for target detection and 12.5% and 13.3% for segmentation.
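The activation swap described above amounts to replacing every ReLU with a GELU; the recursive helper below is one hedged way to do this on a PyTorch model, and applying it uniformly across the whole backbone is an assumption about scope.

```python
# Hypothetical in-place replacement of ReLU activations with GELU.
import torch.nn as nn

def relu_to_gelu(module: nn.Module) -> nn.Module:
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.GELU())
        else:
            relu_to_gelu(child)
    return module
```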
This paper concerns the problem of real-time object segmentation for a picking system. A region proposal method inspired by the human glance, based on a convolutional neural network, is proposed to select promising regions so that most processing is reserved for these regions, which significantly improves the speed of object segmentation. By combining the CNN-based region proposal method with a superpixel method, category and location information can be used to segment objects while image redundancy is significantly reduced. This considerably reduces processing time and enables real-time operation. Experiments show that the proposed method can segment the target object of interest in real time on an ordinary laptop.
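The superpixel stage can be illustrated by computing SLIC superpixels only inside a proposed region of interest, as sketched below; the ROI format and the SLIC parameters are assumptions chosen for illustration.

```python
# Hypothetical SLIC superpixels restricted to a proposed region of interest.
import numpy as np
from skimage.segmentation import slic

def superpixels_in_roi(image: np.ndarray, roi) -> np.ndarray:
    """image: (H, W, 3); roi: (y0, y1, x0, x1) from the region-proposal stage."""
    y0, y1, x0, x1 = roi
    crop = image[y0:y1, x0:x1]
    return slic(crop, n_segments=100, compactness=10, start_label=1)
```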
Coronary artery disease (CAD) has become a significant cause of heart attack, especially among those 40 years old or younger, and there is a need to develop new technologies and methods to deal with this disease. Many researchers have proposed image processing-based solutions for CAD diagnosis, but achieving highly accurate results for angiogram segmentation is still a challenge, and several different types of angiograms are adopted for CAD diagnosis. This paper proposes an approach for image segmentation using Convolutional Neural Networks (CNN) for diagnosing coronary artery disease and achieving state-of-the-art results. We collected 2D X-ray images from the hospital and applied the proposed model to them. Image augmentation was performed in this research, as it is essential for increasing the size of the dataset. The images were also enhanced using noise removal techniques before being fed to the CNN model for segmentation to achieve high accuracy. Different settings of the network architecture achieved different accuracies, the highest being 97.61%. Compared with other models, the proposed method has proven superior, achieving state-of-the-art results.
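The preprocessing steps mentioned above, noise removal followed by simple augmentation, are sketched below; the median filter size and the particular flips/rotations are assumptions rather than the paper's exact pipeline.

```python
# Hypothetical noise removal and flip/rotation augmentation for angiogram images.
import numpy as np
from scipy.ndimage import median_filter

def preprocess_and_augment(angiogram: np.ndarray):
    denoised = median_filter(angiogram, size=3)       # simple noise removal
    return [denoised, np.fliplr(denoised), np.flipud(denoised), np.rot90(denoised)]
```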
With the rising frequency and severity of wildfires across the globe, researchers have been actively searching for a reliable solution for early-stage forest fire detection. In recent years, Convolutional Neural Networks (CNNs) have demonstrated outstanding performance in computer vision-based object detection tasks, including forest fire detection. Using CNNs to detect forest fires by segmenting both flame and smoke pixels not only provides early and accurate detection but also additional information such as the size, spread, location, and movement of the fire. However, CNN-based segmentation networks are computationally demanding and can be difficult to deploy onboard lightweight mobile platforms such as an Uncrewed Aerial Vehicle (UAV). To address this issue, this paper proposes a new efficient upsampling technique based on transposed convolution to make segmentation CNNs lighter. The proposed technique, named Reversed Depthwise Separable Transposed Convolution (RDSTC), achieved F1-scores of 0.78 for smoke and 0.74 for flame, outperforming U-Net networks with bilinear upsampling, transposed convolution, and CARAFE upsampling. Additionally, a Multi-signature Fire Detection Network (MsFireD-Net) is proposed, having 93% fewer parameters and 94% fewer computations than the RDSTC U-Net. Despite being such a lightweight and efficient network, MsFireD-Net demonstrates strong results against the other U-Net-based networks.
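A depthwise-separable transposed convolution of the kind the RDSTC idea builds on can be sketched as a pointwise projection followed by a grouped transposed convolution; the ordering of the two operations and the kernel/stride choices below are assumptions, since the exact "reversed" arrangement is not spelled out here.

```python
# Hypothetical depthwise-separable transposed convolution for lightweight upsampling.
import torch.nn as nn

class SeparableTransposedUp(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.depthwise_up = nn.ConvTranspose2d(out_ch, out_ch, kernel_size=2,
                                               stride=2, groups=out_ch)

    def forward(self, x):
        return self.depthwise_up(self.pointwise(x))   # 2x spatial upsampling
```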
Wind turbine blades are prone to failure due to high tip speed, rain, dust, and so on. A surface condition detection approach based on wind turbine blade aerodynamic noise is proposed. On the experimental measurement data, variational mode decomposition filtering and Mel spectrogram drawing are conducted first. The Mel spectrogram is divided into two halves based on frequency characteristics and then fed into the convolutional neural network. Considering the complexity of the real environment, Gaussian white noise is superimposed on the original signal and the output results are assessed using score coefficients. The surfaces of wind turbine blades are classified into four types: standard, attachments, polishing, and serrated trailing edge. The proposed method is evaluated, and the detection accuracy in complicated background conditions is found to be 99.59%. In addition to supporting discrimination by the trained models, proper score coefficients also permit the screening of unknown types.
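The Mel-spectrogram front end and the frequency split described above are sketched below; the sampling rate, FFT size, Mel-band count, and the midpoint split are assumptions for illustration.

```python
# Hypothetical Mel-spectrogram computation and split into two frequency halves.
import numpy as np
import librosa

def mel_halves(signal: np.ndarray, sr: int = 44100, n_mels: int = 128):
    mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_fft=2048, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    return mel_db[: n_mels // 2], mel_db[n_mels // 2 :]   # low/high halves for the CNN
```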
Attention mechanisms combined with convolutional neural networks (CNNs) achieve promising performance for magnetic resonance imaging (MRI) image segmentation; however, these methods learn attention weights only from a single scale, resulting in incomplete attention learning. A novel method named completed attention convolutional neural network (CACNN) is proposed for MRI image segmentation. Specifically, the channel-wise attention block (CWAB) and the pixel-wise attention block (PWAB) are designed to learn attention weights at the channel and pixel levels. As a result, completed attention weights are obtained, which benefits discriminative feature learning. The method is verified on two widely used datasets (HVSMR and MRBrainS), and the experimental results demonstrate that the proposed method achieves better results than state-of-the-art methods.
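A minimal channel-wise attention block in the spirit of the CWAB is sketched below using a squeeze-and-excitation style gate; the squeeze-and-excitation form and the reduction ratio are assumptions, not the paper's exact design.

```python
# Hypothetical channel-wise attention block (squeeze-and-excitation style).
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.gate(x).view(x.size(0), -1, 1, 1)
        return x * w          # reweight channels before pixel-wise attention
```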
Recent advancements in vision technology have had a significant impact on our ability to identify multiple objects and understand complex scenes. Various technologies, such as augmented reality-driven scene integration, robotic navigation, autonomous driving, and guided tour systems, rely heavily on this type of scene comprehension. This paper presents a novel segmentation approach based on the UNet network model, aimed at recognizing multiple objects within an image. The methodology begins with the acquisition and preprocessing of the image, followed by segmentation using the fine-tuned UNet architecture. Afterward, an annotation tool is used to accurately label the segmented regions. Upon labeling, significant features are extracted from these segmented objects, encompassing KAZE features, energy-based edge detection, frequency-based features, and blob characteristics. For the classification stage, a convolutional neural network (CNN) is employed. This comprehensive methodology provides a robust framework for accurate and efficient recognition of multiple objects in images. Experimental results on complex object datasets such as MSRC-v2 and PASCAL-VOC12 are documented: the PASCAL-VOC12 dataset achieved an accuracy of 95%, while the MSRC-v2 dataset achieved an accuracy of 89%. The evaluation performed on these diverse datasets highlights a notably impressive level of performance.
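The KAZE feature-extraction step can be illustrated with OpenCV as below; running it on a grayscale crop of each segmented region is an assumption about how the features feed the classifier.

```python
# Hypothetical KAZE keypoint/descriptor extraction for a segmented region.
import cv2
import numpy as np

def kaze_descriptors(gray_region: np.ndarray):
    """gray_region: (H, W) uint8 grayscale crop of a segmented object."""
    kaze = cv2.KAZE_create()
    keypoints, descriptors = kaze.detectAndCompute(gray_region, None)
    return keypoints, descriptors
```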
Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because they cannot effectively capture global information from images, CNNs can easily lose contours and textures in segmentation results. The transformer model, by contrast, can effectively capture long-range dependencies in the image, and combining the CNN and the transformer can effectively extract both local details and global contextual features. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. In the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce the information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which adaptive average pooling and position encoding enhance contextual features, and multi-head attention is then introduced to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet through comparative experiments on four benchmark medical image segmentation datasets, particularly in the context of preserving contours and textures.
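One hedged reading of the fusion module, adaptive average pooling to a fixed grid, a learned position encoding, then multi-head self-attention over the pooled tokens, is sketched below; the grid size, channel width, and head count are assumptions.

```python
# Hypothetical pooled-token fusion with position encoding and multi-head attention.
import torch
import torch.nn as nn

class PooledAttentionFusion(nn.Module):
    def __init__(self, channels: int = 64, grid: int = 8, heads: int = 4):
        super().__init__()
        self.grid = grid
        self.pool = nn.AdaptiveAvgPool2d(grid)                          # adaptive average pooling
        self.pos = nn.Parameter(torch.zeros(grid * grid, 1, channels))  # learned position encoding
        self.attn = nn.MultiheadAttention(channels, heads)              # multi-head attention

    def forward(self, x):                                               # x: (B, C, H, W)
        b, c = x.shape[:2]
        tokens = self.pool(x).flatten(2).permute(2, 0, 1) + self.pos    # (L, B, C) tokens
        fused, _ = self.attn(tokens, tokens, tokens)
        return fused.permute(1, 2, 0).reshape(b, c, self.grid, self.grid)
```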
Semantic segmentation of eye images is a complex task with important applications in human–computer interaction, cognitive science, and neuroscience. Real-time, accurate, and robust segmentation algorithms are crucial for computationally limited portable devices such as those used for augmented reality and virtual reality. With the rapid advances in deep learning, many network models have been developed specifically for eye image segmentation. Some methods divide the segmentation process into multiple stages to miniaturize model parameters while enhancing the output through post-processing to improve segmentation accuracy; these approaches significantly increase inference time. Other networks adopt more complex encoding and decoding modules to achieve end-to-end output, which requires substantial computation. Balancing a model's size, accuracy, and computational complexity is therefore essential. To address these challenges, we propose a lightweight asymmetric UNet architecture and a projection loss function. We utilize ResNet 3-layer blocks to enhance feature extraction efficiency in the encoding stage. In the decoding stage, we employ regular convolutions and skip connections to upscale the feature maps from the latent space to the original image size, balancing model size and segmentation accuracy. In addition, we leverage the geometric features of the eye region and design a projection loss function to further improve segmentation accuracy without adding any inference cost. We validate our approach on the OpenEDS2019 dataset for virtual reality and achieve state-of-the-art performance with 95.33% mean intersection over union (mIoU). Our model has only 0.63M parameters and runs at 350 FPS, which are 68% and 200% of the state-of-the-art model RITNet, respectively.
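The mIoU figure reported above can be computed from predicted and ground-truth label maps as in the short sketch below; averaging only over classes present in the union is an assumption about the evaluation convention.

```python
# Hypothetical mean-IoU computation over semantic label maps.
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```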
Scleral vessels on the surface of the human eye can provide valuable information about potential diseases or dysfunctions of specific organs, and vessel segmentation is a key step in characterizing the scleral vessels. However, accurate segmentation of blood vessels in scleral images is a challenging task due to the intricate texture, tenuous structure, and erratic network of the scleral vessels. In this work, we propose a CNN-Transformer hybrid network named SVSNet for automatic scleral vessel segmentation. Following the typical U-shape encoder-decoder architecture, the SVSNet integrates a Sobel edge detection module to provide an edge prior and further combines an Atrous Spatial Pyramid Pooling module to enhance its ability to extract vessels of various sizes. At the end of the encoding path, a vision Transformer module is incorporated to capture the global context and improve the continuity of the vessel network. To validate the effectiveness of the proposed SVSNet, comparative experiments are conducted on two public scleral image datasets, and the results show that the SVSNet outperforms other state-of-the-art models. Further experiments on three public retinal image datasets demonstrate that the SVSNet can be easily applied to other vessel datasets with good generalization capability.
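A compact Atrous Spatial Pyramid Pooling block like the module named above is sketched below; the dilation rates and channel counts are assumptions, and the paper's module may differ in detail.

```python
# Hypothetical ASPP block: parallel dilated convolutions fused by a 1x1 projection.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates]
        )
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```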
BACKGROUND: Artificial intelligence, such as convolutional neural networks (CNNs), has been used in the interpretation of images and the diagnosis of hepatocellular cancer (HCC) and liver masses. CNN, a machine-learning algorithm similar to deep learning, has demonstrated its capability to recognise specific features that can detect pathological lesions. AIM: To assess the use of CNNs in examining HCC and liver mass images for the diagnosis of cancer, and to evaluate the accuracy and performance of CNNs. METHODS: The databases PubMed, EMBASE, and the Web of Science and research books were systematically searched using related keywords. Studies analysing pathological anatomy, cellular, and radiological images of HCC or liver masses using CNNs were identified according to the study protocol to detect cancer, differentiate cancer from other lesions, or stage the lesion. The data were extracted as per a predefined extraction protocol. The accuracy and performance of the CNNs in detecting cancer or early stages of cancer were analysed. The primary outcomes of the study were analysing the type of cancer or liver mass and identifying the type of images that showed optimum accuracy in cancer detection. RESULTS: A total of 11 studies that met the selection criteria and were consistent with the aims of the study were identified. The studies demonstrated the ability to differentiate liver masses or differentiate HCC from other lesions (n = 6), HCC from cirrhosis or development of new tumours (n = 3), and HCC nuclei grading or segmentation (n = 2). The CNNs showed satisfactory levels of accuracy. The studies were aimed at detecting lesions (n = 4), classification (n = 5), and segmentation (n = 2). Several methods were used to assess the accuracy of the CNN models. CONCLUSION: The role of CNNs in analysing images and as tools for the early detection of HCC or liver masses has been demonstrated in these studies. While a few limitations have been identified, overall there was an optimal level of accuracy of the CNNs used in the segmentation and classification of liver cancer images.
This paper presents a handwritten document recognition system based on the convolutional neural network technique. Handwritten document recognition is rapidly attracting the attention of researchers due to its promise as an assistive technology for visually impaired users; it is also helpful for automatic data entry systems. For the proposed system, a dataset of English-language handwritten character images was prepared. The system has been trained on a large set of sample data and tested on sample images of user-defined handwritten documents, and multiple experiments yielded very good recognition results. The proposed system first performs image pre-processing to prepare the data for training with a convolutional neural network. After this processing, the input document is segmented using line, word, and character segmentation, achieving character segmentation accuracy of up to 86%. The segmented characters are then sent to a convolutional neural network for recognition. The recognition and segmentation technique proposed in this paper provides the most acceptable, accurate results on the given dataset, reaching an accuracy of up to 93% during convolutional neural network training and decreasing slightly to 90.42% for validation.
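The line-segmentation step can be illustrated with a horizontal projection profile, as sketched below; the binarisation convention (1 = ink) and the zero-gap heuristic are assumptions rather than the paper's exact procedure.

```python
# Hypothetical line segmentation of a binarised document via horizontal projection.
import numpy as np

def segment_lines(binary_doc: np.ndarray):
    """binary_doc: (H, W) array with 1 = ink. Returns (start_row, end_row) per line."""
    has_ink = binary_doc.sum(axis=1) > 0
    lines, start = [], None
    for row, ink in enumerate(has_ink):
        if ink and start is None:
            start = row
        elif not ink and start is not None:
            lines.append((start, row))
            start = None
    if start is not None:
        lines.append((start, len(has_ink)))
    return lines
```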
Objective: In tongue diagnosis, the location, color, and distribution of spots can be used to speculate on the affected viscera and the severity of the heat evil. This work focuses on artificial intelligence (AI) image analysis methods to study spotted tongue recognition in traditional Chinese medicine (TCM). Methods: A model for spotted tongue recognition and extraction is designed based on the principles of image deep learning and instance segmentation. The model includes multiscale feature map generation, region proposal searching, and target region recognition. First, a deep convolutional network is used to build multiscale low- and high-abstraction feature maps, after which a target candidate box generation algorithm and a selection strategy are used to select high-quality target candidate regions. Finally, a classification network classifies the target regions and calculates the target region pixels, yielding the region segmentation of the spotted tongue. Under non-standard illumination conditions, various tongue images were taken with mobile phones, and experiments were conducted. Results: The spotted tongue recognition achieved an area under the curve (AUC) of 92.40%, an accuracy of 84.30% with a sensitivity of 88.20%, a specificity of 94.19%, a recall of 88.20%, a regional pixel accuracy (PA) of 73.00%, a mean pixel accuracy (mPA) of 73.00%, an intersection over union (IoU) of 60.00%, and a mean intersection over union (mIoU) of 56.00%. Conclusion: The results verify that the model is suitable for application in a TCM tongue diagnosis system. Spotted tongue recognition via a multiscale convolutional neural network (CNN) would help to improve spot classification and the accurate extraction of spot-area pixels, as well as provide a practical method for intelligent tongue diagnosis in TCM.