This systematic review comprehensively examines and compares deep learning methods for brain tumor segmentation and classification using MRI and other imaging modalities, focusing on recent trends from 2022 to 2025. The primary objective is to evaluate methodological advancements, model performance, dataset usage, and existing challenges in developing clinically robust AI systems. We included peer-reviewed journal articles and high-impact conference papers published between 2022 and 2025, written in English, that proposed or evaluated deep learning methods for brain tumor segmentation and/or classification. Excluded were non-open-access publications, books, and non-English articles. A structured search was conducted across Scopus, Google Scholar, Wiley, and Taylor & Francis, with the last search performed in August 2025. Risk of bias was not formally quantified but was considered during full-text screening based on dataset diversity, validation methods, and availability of performance metrics. We used narrative synthesis and tabular benchmarking to compare performance metrics (e.g., accuracy, Dice score) across model types (CNN, Transformer, Hybrid), imaging modalities, and datasets. A total of 49 studies were included (43 journal articles and 6 conference papers). These studies spanned over 9 public datasets (e.g., BraTS, Figshare, REMBRANDT, MOLAB) and utilized a range of imaging modalities, predominantly MRI. Hybrid models, especially ResViT and UNetFormer, consistently achieved high performance, with classification accuracy exceeding 98% and segmentation Dice scores above 0.90 across multiple studies. Transformers and hybrid architectures showed increasing adoption post-2023. Many studies lacked external validation and were evaluated only on a few benchmark datasets, raising concerns about generalizability and dataset bias. Few studies addressed clinical interpretability or uncertainty quantification. Despite promising results, particularly for hybrid deep learning models, widespread clinical adoption remains limited due to lack of validation, interpretability concerns, and real-world deployment barriers.
Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, error-prone, and inconsistent, emphasizing the need for a reliable, automated inspection system. Leveraging both object detection and image segmentation approaches, this research proposes a vision-based solution for detecting the various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After applying multiple constraints and enhancing the images through preprocessing and augmentation, a dataset of 3300 annotated RGB-D images was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall equilibrium, inference latency (target ≥ 30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models due to the latter’s increased latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (Recall, Accuracy, F1-score, and Precision). YOLOv11 demonstrated balanced performance, with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, at an average inference time of 25 ms per frame (≈ 40 FPS), which is real-time performance. Leveraging these results, a YOLOv11-based Windows application was deployed in a real-time assembly line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application can precisely measure socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line and improves real-time inspection capability. The implementation is a significant step forward for intelligent manufacturing in the Industry 4.0 paradigm, providing a scalable, efficient, and accurate approach to automated inspection and dimensional verification.
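The real-time claim in the abstract above rests on a simple conversion from per-frame latency to throughput. The sketch below checks the reported 25 ms figure against the stated 30 FPS target; the function names are hypothetical and are not taken from the paper's application.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-frame inference latency (milliseconds) to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def meets_realtime_target(latency_ms: float, target_fps: float = 30.0) -> bool:
    """Return True when the latency is low enough to sustain the target frame rate."""
    return frames_per_second(latency_ms) >= target_fps


# Reported YOLOv11 latency: 25 ms per frame.
print(frames_per_second(25.0))       # 40.0
print(meets_realtime_target(25.0))   # True
```

A model averaging 50 ms per frame would reach only 20 FPS and would miss the same target.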
Segmenting breast ultrasound images remains challenging because of speckle noise, operator dependency, and variation in image quality. This paper presents the UltraSegNet architecture, which addresses these challenges through three key technical innovations: (1) a modified ResNet-50 backbone with sequential 3×3 convolutions that preserves the fine anatomical details needed for finding lesion boundaries; (2) a computationally efficient regional attention mechanism that works on high-resolution features without a transformer’s extra memory cost; and (3) an adaptive feature fusion strategy that adapts local and global features based on how the image is being used. Extensive evaluation on two distinct datasets demonstrates UltraSegNet’s superior performance. On the BUSI dataset, it obtains a precision of 0.915, a recall of 0.908, and an F1 score of 0.911. On the UDAIT dataset, it achieves robust performance, with a precision of 0.901 and a recall of 0.894. Importantly, these improvements are achieved at clinically feasible computation times, taking 235 ms per image on standard GPU hardware. Notably, UltraSegNet performs particularly well on difficult small lesions (less than 10 mm), achieving a detection accuracy of 0.891. This is a substantial improvement over traditional methods, which struggle with small-scale features: standard models achieve only 0.63–0.71 accuracy. Improved small-lesion detection is particularly important for early-stage breast cancer identification. The results demonstrate that UltraSegNet can be practically deployed in clinical workflows to improve breast cancer screening accuracy.
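The F1 score quoted for the BUSI dataset is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. The sketch below is a generic implementation of that formula, not code from the UltraSegNet paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported BUSI values: precision 0.915, recall 0.908.
print(round(f1_score(0.915, 0.908), 3))  # 0.911
```

The result agrees with the reported F1 score of 0.911.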
BACKGROUND Upper gastrointestinal (UGI) diseases present diagnostic challenges during endoscopy due to visual similarities, indistinct boundaries, and observer variability, which can lead to missed diagnoses and delayed treatment. Automated segmentation using deep learning (DL) models offers the potential to assist endoscopists, improve diagnostic accuracy, and reduce workload. However, multi-class UGI disease segmentation remains underexplored, with limited annotated datasets and insufficient focus on clinical validation. This study hypothesizes that comparative analysis of different DL architectures can identify models suitable for clinical application, providing actionable insights to reduce diagnostic errors and support clinical decision-making in endoscopic practice. AIM To evaluate 17 state-of-the-art DL models for multi-class UGI disease segmentation, emphasizing clinical translation and real-world applicability. METHODS This study evaluated 17 DL models spanning convolutional neural network (CNN)-, transformer-, and Mamba-based architectures using a self-collected dataset from two hospitals in Macao and Xiangyang (3313 images, 9 classes) and the public EDD2020 dataset (386 images, 5 classes). Models were assessed for segmentation performance and performance-efficiency trade-off. Statistical analyses were conducted to examine performance differences across architectures. Generalization capability was measured through a cross-dataset evaluation (training models on the self-collected dataset and testing on the EDD2020 dataset). RESULTS Swin-UMamba achieved the highest segmentation performance across both datasets [intersection over union (IoU): 89.06% ± 0.20% self-collected, 77.53% ± 0.32% EDD2020], followed by SegFormer (IoU: 88.94% ± 0.38% self-collected, 77.20% ± 0.98% EDD2020) and ConvNeXt + UPerNet (IoU: 88.48% ± 0.09% self-collected, 76.90% ± 0.61% EDD2020). Statistical analyses showed no significant differences between paradigms, though hierarchical architectures with pre-trained encoders consistently outperformed simpler designs. SegFormer achieved the best balance of accuracy and computational efficiency, with a performance-efficiency trade-off score of 92.02%, making it suitable for real-time clinical use. Cross-dataset evaluation revealed significant performance drops, with generalization retention rates of 64.78% to 71.52%. Transformer-based models, particularly pyramid vision transformer v2 + efficient multi-scale convolutional decoding (IoU: 63.35% ± 1.44%), generalized better than CNN- and Mamba-based models. CONCLUSION Hierarchical architectures like Swin-UMamba and SegFormer show promise for UGI disease segmentation, reducing missed diagnoses and improving workflows, but robust clinical validation is crucial for real-world deployment.
Remote sensing image segmentation has a wide range of applications in land cover classification, urban building recognition, crop monitoring, and other fields. In recent years, with the rapid development of deep learning, deep-learning-based remote sensing image segmentation models have emerged and produced a large body of research. This article reviews the latest achievements in deep-learning-based remote sensing image segmentation and explores future development directions. First, the basic concepts, characteristics, classification, tasks, and commonly used datasets of remote sensing images are presented. Second, the deep-learning-based segmentation models are classified and summarized, and the principles, characteristics, and applications of the various models are presented. Then, the key technologies involved in deep learning remote sensing image segmentation are introduced. Finally, the future development directions and application prospects of remote sensing image segmentation are discussed. The review can provide reference and inspiration for further research on remote sensing image segmentation.
Medical image segmentation, i.e., labeling structures of interest in medical images, is crucial for disease diagnosis and treatment in radiology. In reversible data hiding in medical images (RDHMI), segmentation consists of only two regions: the focal and nonfocal regions. The focal region mainly contains information for diagnosis, while the nonfocal region serves as the monochrome background. The traditional segmentation methods currently used in RDHMI are inaccurate for complex medical images, and manual segmentation is time-consuming, poorly reproducible, and operator-dependent. State-of-the-art deep learning (DL) models would bring key benefits, but the lack of domain-specific labels for existing medical datasets makes their use impossible. To address this problem, this study provides labels for existing medical datasets based on a hybrid segmentation approach, to facilitate the implementation of DL segmentation models in this domain. First, an initial segmentation based on a 3×3 kernel is performed to analyze identified contour pixels before classifying pixels into focal and nonfocal regions. Then, several human expert raters evaluate the generated labels and classify them as accurate or inaccurate. The inaccurate labels undergo manual segmentation by medical practitioners and are scored based on a hierarchical voting scheme before being assigned to the proposed dataset. To ensure reliability and integrity in the proposed dataset, we evaluate the accurate automated labels against labels manually segmented by medical practitioners using five assessment metrics: Dice coefficient, Jaccard index, precision, recall, and accuracy. The experimental results show that labels in the proposed dataset are consistent with the subjective judgment of human experts, with an average accuracy score of 94% and Dice coefficient scores between 90% and 99%. The study further proposes a ResNet-UNet with concatenated spatial and channel squeeze and excitation (scSE) architecture for semantic segmentation to validate and illustrate the usefulness of the proposed dataset. The results demonstrate the superior performance of the proposed architecture in accurately separating the focal and nonfocal regions compared to state-of-the-art architectures. Dataset information is released under the following URL: https://www.kaggle.com/lordamoah/datasets (accessed on 31 March 2025).
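The five assessment metrics named in the abstract above all derive from the same four pixel counts when two binary masks are compared. The following is a minimal sketch using the standard definitions; the tiny masks and the function names are illustrative assumptions, with 1 standing for the focal region and 0 for the nonfocal region.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives between two binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def mask_metrics(pred, truth):
    """Dice, Jaccard, precision, recall, and accuracy for binary masks."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0,
        "jaccard": tp / (tp + fp + fn) if (tp + fp + fn) else 1.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical 2x3 masks: automated label vs. manual label.
automated = [[1, 1, 0],
             [0, 1, 0]]
manual = [[1, 0, 0],
          [0, 1, 1]]
print(mask_metrics(automated, manual))
```

For these masks the counts are 2 true positives, 1 false positive, 1 false negative, and 2 true negatives, giving a Dice coefficient of 2/3 and a Jaccard index of 1/2, consistent with the identity J = D / (2 − D).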
Automated prostate cancer detection in magnetic resonance imaging (MRI) scans is of significant importance for cancer patient management. Most existing computer-aided diagnosis systems adopt segmentation methods, while object detection approaches have recently shown promising results. The authors have (1) carefully compared the performance of the most developed segmentation and object detection methods in localising prostate imaging reporting and data system (PIRADS)-labelled prostate lesions on MRI scans; (2) proposed an additional customised set of lesion-level localisation sensitivity and precision metrics; and (3) proposed efficient ways to ensemble the segmentation and object detection methods for improved performance. The ground-truth (GT)-perspective lesion-level sensitivity and the prediction-perspective lesion-level precision are reported, to quantify the ratios of true positive voxels detected by the algorithms over the number of voxels in the GT-labelled regions and in the predicted regions, respectively. The two networks are trained independently on data from 549 clinical patients with PIRADS-V2 as GT labels, and tested on 161 internal and 100 external MRI scans. At the lesion level, nnDetection outperforms nnUNet for detecting both PIRADS ≥ 3 and PIRADS ≥ 4 lesions in the majority of cases. For example, at an average of 3 false positive predictions per patient, nnDetection achieves a greater Intersection-over-Union (IoU)-based sensitivity than nnUNet for detecting PIRADS ≥ 3 lesions: 80.78% ± 1.50% versus 60.40% ± 1.64% (p < 0.01). At the voxel level, nnUNet is in general superior or comparable to nnDetection. The proposed ensemble methods achieve improved or comparable lesion-level accuracy in all tested clinical scenarios. For example, at 3 false positives, the lesion-wise ensemble method achieves 82.24% ± 1.43% sensitivity versus 80.78% ± 1.50% (nnDetection) and 60.40% ± 1.64% (nnUNet) for detecting PIRADS ≥ 3 lesions. Consistent conclusions are also drawn from results on the external data set.
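The two lesion-level ratios described in the abstract above share a numerator (true positive voxels) and differ only in the denominator. The sketch below is one reading of that description, assuming each lesion is represented as a set of voxel coordinates; the function names and the example lesion are hypothetical and not taken from the paper.

```python
def gt_perspective_sensitivity(gt_voxels: set, pred_voxels: set) -> float:
    """True positive voxels divided by the voxels in the GT-labelled region."""
    if not gt_voxels:
        raise ValueError("ground-truth region is empty")
    return len(gt_voxels & pred_voxels) / len(gt_voxels)


def prediction_perspective_precision(gt_voxels: set, pred_voxels: set) -> float:
    """True positive voxels divided by the voxels in the predicted region."""
    if not pred_voxels:
        raise ValueError("predicted region is empty")
    return len(gt_voxels & pred_voxels) / len(pred_voxels)


# Hypothetical lesion: 4 GT voxels, 3 predicted voxels, 2 shared.
gt = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
pred = {(0, 0, 1), (0, 1, 1), (0, 2, 1)}
print(gt_perspective_sensitivity(gt, pred))        # 0.5
print(prediction_perspective_precision(gt, pred))
```

Here the prediction covers half of the ground-truth lesion, while two thirds of the predicted voxels fall inside it.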
Deep learning-based methods have become alternatives to traditional numerical weather prediction systems, offering faster computation and the ability to utilize large historical datasets. However, the application of deep learning to medium-range regional weather forecasting with limited data remains a significant challenge. In this work, three key solutions are proposed: (1) motivated by the need to improve model performance in data-scarce regional forecasting scenarios, the authors apply semantic segmentation models to better capture spatiotemporal features and improve prediction accuracy; (2) recognizing the challenge of overfitting and the inability of traditional noise-based data augmentation methods to effectively enhance model robustness, a novel learnable Gaussian noise mechanism is introduced that allows the model to adaptively optimize perturbations for different locations, ensuring more effective learning; and (3) to address the issue of error accumulation in autoregressive prediction, as well as the learning difficulty and the lack of intermediate data utilization in one-shot prediction, the authors propose a cascade prediction approach that resolves these problems while significantly improving model forecasting performance. The method achieves a competitive result in the East China Regional AI Medium Range Weather Forecasting Competition. Ablation experiments further validate the effectiveness of each component, highlighting their contributions to enhancing prediction performance.
Deep learning has shown impressive performance in various vision tasks such as image classification, object detection, and semantic segmentation. In particular, recent advances in deep learning techniques have brought encouraging performance to fine-grained image classification, which aims to distinguish subordinate-level categories such as bird species or dog breeds. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we review four types of deep-learning-based fine-grained image classification approaches: general convolutional neural networks (CNNs), part-detection-based, ensemble-of-networks-based, and visual-attention-based approaches. Deep-learning-based semantic segmentation approaches are also covered, with region-proposal-based and fully-convolutional-network-based approaches introduced in turn.
Deep learning is widely used for lesion segmentation in medical images due to its breakthrough performance. Loss functions are critical in a deep learning pipeline and play an important role in segmentation performance. Dice loss is the most commonly used loss function in medical image segmentation, but it also has some disadvantages. In this paper, we discuss the advantages and disadvantages of the Dice loss function and group its extensions according to the purpose of each improvement. The performance of some extensions is compared according to core references. Because different loss functions perform differently on different tasks, automatic loss function selection is a potential future direction.
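For readers unfamiliar with the loss discussed in the abstract above, the following is a minimal sketch of the commonly used soft Dice loss for a binary mask, computed over flattened per-pixel probabilities. The smoothing constant `eps` and the function name are assumptions for illustration; the extensions surveyed in the paper modify this basic form in various ways.

```python
def soft_dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss: 1 - (2 * sum(p * t) + eps) / (sum(p) + sum(t) + eps).

    probs   -- predicted foreground probabilities in [0, 1], flattened
    targets -- ground-truth labels (0 or 1), flattened
    eps     -- small constant that avoids division by zero on empty masks
    """
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


# A perfect prediction gives a loss of ~0, a fully wrong one a loss of ~1.
print(soft_dice_loss([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0]))
print(soft_dice_loss([0.0, 1.0], [1, 0]))
```

Because the loss is a ratio of overlap to total foreground mass, true negatives do not enter it, which is one reason it is popular for small lesions on large backgrounds.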
Many existing techniques to acquire dual-energy X-ray absorptiometry (DXA) images are unable to accurately distinguish between bone and soft tissue. For the most part, this failure stems from bone shape variability, noise and low contrast in DXA images, inconsistent X-ray beam penetration producing shadowing effects, and person-to-person variations. This work explores the feasibility of using state-of-the-art deep learning semantic segmentation models, namely fully convolutional networks (FCNs), SegNet, and U-Net, to distinguish femur bone from soft tissue. We investigated the performance of deep learning algorithms with reference to some of our previously applied conventional image segmentation techniques (i.e., a decision-tree-based method using a pixel label decision tree [PLDT] and another method using Otsu’s thresholding) for femur DXA images, and we measured accuracy based on the average Jaccard index, sensitivity, and specificity. Deep learning models using SegNet, U-Net, and an FCN achieved average segmentation accuracies of 95.8%, 95.1%, and 97.6%, respectively, compared to PLDT (91.4%) and Otsu’s thresholding (72.6%). We therefore conclude that an FCN outperforms the other deep learning and conventional techniques when segmenting femur bone from soft tissue in DXA images. Accurate femur segmentation improves bone mineral density computation, which in turn improves the diagnosis of osteoporosis.
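Otsu's thresholding, the weakest baseline in the abstract above, picks a single global intensity threshold by maximising the between-class variance of the two resulting pixel groups. The sketch below is a textbook implementation on a flat list of 8-bit intensities, not the authors' pipeline; the sample pixel values are made up.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximises between-class variance.

    Pixels with value <= t form one class; pixels with value > t form the other.
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0:
            continue          # no pixels at or below t yet
        if weight_high == 0:
            break             # every pixel is at or below t
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical intensities: four dark (soft tissue) and four bright (bone) pixels.
pixels = [10, 12, 11, 13, 200, 205, 210, 198]
t = otsu_threshold(pixels)
bone_mask = [value > t for value in pixels]
print(t)  # 13
```

A single global threshold works on cleanly bimodal data like this example, but the noise, low contrast, and shadowing the abstract describes blur the two modes, which is consistent with the method's low reported accuracy.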
In the shape analysis community, decomposing a 3D shape into meaningful parts has become a topic of interest. 3D model segmentation is widely used in tasks such as shape deformation, partial shape matching, skeleton extraction, shape correspondence, shape annotation, and texture mapping. Numerous approaches have attempted to provide better segmentation solutions; however, the majority of previous techniques used handcrafted features, which usually focus on a particular attribute of 3D objects and so are difficult to generalize. In this paper, we propose a three-stage approach that uses a multi-view recurrent neural network to automatically segment a 3D shape into visually meaningful sub-meshes. The first stage involves normalizing and scaling a 3D model to fit within the unit sphere and rendering the object into different views. Different viewpoints, however, may not be associated with one another, and a 3D region could yield entirely different outcomes depending on the viewpoint. To address this, we run each view through a CNN (with shared weights) and a Bolster block to create a probability boundary map. The Bolster block models the region relationships between different views, which helps to improve and refine the data. In stage two, the feature maps generated in the previous step are correlated using a recurrent neural network to obtain compatible fine-detail responses for each view. Finally, a fully connected layer is used to return coherent edges, which are then back-projected onto the 3D object to produce the final segmentation. Experiments on the Princeton Segmentation Benchmark dataset show that the proposed method is effective for mesh segmentation tasks.
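The first stage described in the abstract above, fitting a model inside the unit sphere before rendering views, can be sketched as follows. This version assumes the mesh is centred on its vertex centroid and scaled by the farthest vertex; the abstract does not say which centre the authors use, so that choice and the function name are assumptions.

```python
import math


def normalize_to_unit_sphere(vertices):
    """Translate vertices to their centroid and scale so the farthest lies on the unit sphere."""
    n = len(vertices)
    cx = sum(v[0] for v in vertices) / n
    cy = sum(v[1] for v in vertices) / n
    cz = sum(v[2] for v in vertices) / n
    centered = [(x - cx, y - cy, z - cz) for x, y, z in vertices]

    radius = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if radius == 0:
        return centered  # degenerate case: all vertices coincide
    return [(x / radius, y / radius, z / radius) for x, y, z in centered]


# Hypothetical tetrahedron vertices.
vertices = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2)]
normalized = normalize_to_unit_sphere(vertices)
norms = [math.sqrt(x * x + y * y + z * z) for x, y, z in normalized]
print(max(norms))
```

After normalization every vertex lies within distance 1 of the origin, so a fixed set of camera positions on a surrounding sphere frames any model consistently.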
Automatic segmentation of the liver and hepatic lesions from abdominal 3D computed tomography (CT) images is a fundamental task in computer-assisted liver surgery planning. However, due to complex backgrounds, ambiguous boundaries, heterogeneous appearances, and highly varied shapes of the liver, accurate liver segmentation and tumor detection are still challenging problems. To address these difficulties, we propose an automatic segmentation framework based on a 3D U-net with dense connections and globally optimized refinement. First, a deep U-net architecture with dense connections is trained to learn the probability map of the liver. The probability map then enters the refinement step as the initial surface and prior shape. The segmentation of liver tumors is based on a similar network architecture and uses the liver segmentation results. To reduce the influence of surrounding tissues with intensity and texture similar to the tumor region, I × liverlabel is used as the network input for liver tumor segmentation during training, which improves segmentation accuracy. The proposed method is fully automatic and requires no user interaction. Both qualitative and quantitative results show that the proposed approach is efficient and accurate for liver volume estimation in clinical application. The high correlation between the automatic results and the manual references shows that the proposed method can be good enough to replace the time-consuming and non-reproducible manual segmentation method.
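The masked input mentioned in the abstract above (the image multiplied by the liver label) can be read as an element-wise product that zeroes every voxel outside the liver. The sketch below shows that reading on a single 2D slice; the interpretation, the function name, and the sample intensities are assumptions, since the abstract gives no further detail.

```python
def mask_with_liver_label(image, liver_label):
    """Element-wise product of an image slice and a binary liver mask.

    Voxels outside the liver (label 0) become 0, so the tumor network
    only sees intensities from inside the segmented liver.
    """
    return [
        [pixel * label for pixel, label in zip(image_row, label_row)]
        for image_row, label_row in zip(image, liver_label)
    ]


# Hypothetical 3x3 CT slice and binary liver mask.
ct_slice = [[120, 80, 95],
            [60, 130, 140],
            [75, 90, 110]]
liver = [[0, 0, 0],
         [0, 1, 1],
         [0, 1, 1]]
print(mask_with_liver_label(ct_slice, liver))
# [[0, 0, 0], [0, 130, 140], [0, 90, 110]]
```

One caveat of this reading: if raw CT intensities can legitimately be 0 inside the liver, masked-out voxels become indistinguishable from them, so an intensity offset or a separate mask channel may be preferable.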
Image segmentation is crucial for various research areas. Many computer vision applications depend on segmenting images to understand the scene, such as autonomous driving, surveillance systems, robotics, and medical imaging. With the recent advances in deep learning (DL) and its confounding results in image segmentation, more attention has been drawn to its use in medical image segmentation. This article introduces a survey of the state-of-the-art deep convolution neural network (CNN) models and mechanisms utilized in image segmentation. First, segmentation models are categorized based on their model architecture and primary working principle. Then, CNN categories are described, and various models are discussed within each category. Compared with other existing surveys, several applications with multiple architectural adaptations are discussed within each category. A comparative summary is included to give the reader insights into utilized architectures in different applications and datasets. This study focuses on medical image segmentation applications, where the most widely used architectures are illustrated, and other promising models are suggested that have proven their success in different domains. Finally, the present work discusses current limitations and solutions along with future trends in the field.
Background: In medical image analysis, the diagnosis of skin lesions remains a challenging task. Skin lesion is a common type of skin cancer that exists worldwide. Dermoscopy is one of the latest technologies used for the diagnosis of skin cancer. Challenges: Many computerized methods have been introduced in the literature to classify skin cancers. However, challenges remain, such as imbalanced datasets, low-contrast lesions, and the extraction of irrelevant or redundant features. Proposed Work: In this study, a new technique is proposed based on a combined conventional and deep learning framework. The proposed framework consists of two major tasks: lesion segmentation and classification. In the lesion segmentation task, contrast is first improved by fusing two filtering techniques, and a color transformation is then performed to improve discrimination of the lesion area. Subsequently, the best channel is selected and the lesion map is computed, which is then converted into binary form using a thresholding function. In the lesion classification task, two pre-trained CNN models were modified and trained using transfer learning. Deep features were extracted from both models and fused using canonical correlation analysis. The fusion process also introduced a few redundant features, which lowered classification accuracy. A new technique called maximum entropy score-based selection (MESbS) is proposed as a solution to this issue. The features selected through this approach are fed into a cubic support vector machine (C-SVM) for the final classification. Results: The experimental process was conducted on two datasets: ISIC 2017 and HAM10000. The ISIC 2017 dataset was used for the lesion segmentation task, whereas the HAM10000 dataset was used for the classification task. The achieved accuracy for the two datasets was 95.6% and 96.7%, respectively, which was higher than that of existing techniques.
An automated system is proposed for the detection and classification of GI abnormalities. The proposed method operates as two pipeline procedures: (a) segmentation of the bleeding infection region and (b) classification of GI abnormalities by deep learning. First, the bleeding region is segmented using a hybrid approach. A threshold is applied to each channel extracted from the original RGB image. All channels are then merged through mutual information and pixel-based techniques, yielding the segmented image. In the classification task, texture and deep learning features are extracted. The transfer learning (TL) approach is used to extract deep features, and the Local Binary Pattern (LBP) method is used for texture features. An entropy-based feature selection approach is then applied to select the best features from both the deep learning and texture vectors. The selected optimal features are combined with a serial-based technique, and the resulting vector is fed to the Ensemble Learning Classifier. The experimental process is evaluated on two datasets: Private and KVASIR. The accuracy achieved is 99.8% for the private dataset and 86.4% for the KVASIR dataset. The results confirm that the proposed method is effective in detecting and classifying GI abnormalities and exceeds the compared methods.
Although deep learning methods have been widely applied in medical image lesion segmentation, it is still challenging to apply them for segmenting ischemic stroke lesions, which are different from brain tumors in lesion characteristics, segmentation difficulty, algorithm maturity, and segmentation accuracy. Three main stages are used to describe the manifestations of stroke. For acute ischemic stroke, the size of the lesions is similar to that of brain tumors, and the current deep learning methods have been able to achieve a high segmentation accuracy. For sub-acute and chronic ischemic stroke, the segmentation results of mainstream deep learning algorithms are still unsatisfactory, as lesions in these stages are small and diffuse. By using three scientific search engines including CNKI, Web of Science, and Google Scholar, this paper aims to comprehensively understand the state-of-the-art deep learning algorithms applied to segmenting ischemic stroke lesions. For the first time, this paper discusses the current situation, challenges, and development directions of deep learning algorithms applied to ischemic stroke lesion segmentation in different stages. In the future, a system that can directly identify different stroke stages and automatically select the suitable network architecture for the stroke lesion segmentation needs to be proposed.
Accurate segmentation of CT images of liver tumors is an important adjunct for the diagnosis and treatment of liver diseases. In recent years, thanks to great improvements in hardware, many deep learning based methods have been proposed for automatic liver segmentation. Among them are the plain neural networks headed by FCN and the residual neural networks headed by ResNet, both of which have many variations. They have achieved notable results in medical image segmentation. In this paper, we first select five representative structures, i.e., FCN, U-Net, SegNet, ResNet and DenseNet, to investigate their performance on liver segmentation. Since the original ResNet and DenseNet cannot perform image segmentation directly, we make some adjustments so that they can perform liver segmentation. Our experimental results show that DenseNet performs the best on liver segmentation, followed by ResNet. Both perform much better than SegNet, U-Net, and FCN. Among SegNet, U-Net, and FCN, U-Net performs the best, followed by SegNet. FCN performs the worst.
Microstructural classification is typically done manually by human experts, which gives rise to uncertainties due to subjectivity and reduces overall efficiency. A high-throughput characterization is proposed based on deep learning, rapid acquisition technology, and mathematical statistics for the recognition, segmentation, and quantification of microstructure in weathering steel. The segmentation results showed that this method was accurate and efficient, and the segmentation of inclusions and the pearlite phase achieved accuracies of 89.95% and 90.86%, respectively. The time required for batch processing by MIPAR software involving thresholding segmentation, morphological processing, and small area deletion was 1.05 s for a single image. By comparison, our system required only 0.102 s, which is about ten times faster than the commercial software. The quantification results were extracted from large volumes of sequential image data (150 mm², 62,216 images, 1024×1024 pixels), which ensures comprehensive statistics. Microstructure information, such as the three-dimensional density distribution and the frequency of the minimum spatial distance of inclusions on the 150 mm² sample surface, was quantified by extracting the coordinates and sizes of individual features. The method provides a refined characterization of two-dimensional structures and spatial information that is unattainable manually or with software. This will be useful for understanding the properties or behaviors of weathering steel and for reducing the reliance on physical testing.
Objective To propose two novel methods based on deep learning for computer-aided tongue diagnosis, including tongue image segmentation and tongue color classification, improving their diagnostic accuracy. Methods LabelMe was used to label the tongue mask and the Snake model to optimize the labeling results. A new dataset was constructed for tongue image segmentation. Tongue color was marked to build a classified dataset for network training. In this research, the Inception + Atrous Spatial Pyramid Pooling (ASPP) + UNet (IAUNet) method was proposed for tongue image segmentation, based on the existing UNet, Inception, and atrous convolution. Moreover, the Tongue Color Classification Net (TCCNet) was constructed with reference to ResNet, Inception, and Triplet-Loss. Several important measurement indexes were selected to evaluate and compare the effects of the novel and existing methods for tongue segmentation and tongue color classification. IAUNet was compared with existing mainstream methods such as UNet and DeepLabV3+ for tongue segmentation. TCCNet for tongue color classification was compared with VGG16 and GoogLeNet. Results IAUNet can accurately segment the tongue from original images. The results showed that the Mean Intersection over Union (MIoU) of IAUNet reached 96.30%, and its Mean Pixel Accuracy (MPA), mean Average Precision (mAP), F1-Score, G-Score, and Area Under Curve (AUC) reached 97.86%, 99.18%, 96.71%, 96.82%, and 99.71%, respectively, suggesting that IAUNet produced better segmentation than other methods, with fewer parameters. Triplet-Loss was applied in the proposed TCCNet to separate different embedded colors. The experiment yielded ideal results, with the F1-Score and mAP of TCCNet reaching 88.86% and 93.49%, respectively. Conclusion IAUNet based on deep learning for tongue segmentation is better than traditional ones. IAUNet can not only produce ideal tongue segmentation, but also has better effects than those of PSPNet, SegNet, UNet, and DeepLabV3+, the traditional networks. As for tongue color classification, the proposed network, TCCNet, had better F1-Score and mAP values as compared with other neural networks such as VGG16 and GoogLeNet.
Abstract: This systematic review aims to comprehensively examine and compare deep learning methods for brain tumor segmentation and classification using MRI and other imaging modalities, focusing on recent trends from 2022 to 2025. The primary objective is to evaluate methodological advancements, model performance, dataset usage, and existing challenges in developing clinically robust AI systems. We included peer-reviewed journal articles and high-impact conference papers published between 2022 and 2025, written in English, that proposed or evaluated deep learning methods for brain tumor segmentation and/or classification. Excluded were non-open-access publications, books, and non-English articles. A structured search was conducted across Scopus, Google Scholar, Wiley, and Taylor & Francis, with the last search performed in August 2025. Risk of bias was not formally quantified but was considered during full-text screening based on dataset diversity, validation methods, and availability of performance metrics. We used narrative synthesis and tabular benchmarking to compare performance metrics (e.g., accuracy, Dice score) across model types (CNN, Transformer, Hybrid), imaging modalities, and datasets. A total of 49 studies were included (43 journal articles and 6 conference papers). These studies spanned over 9 public datasets (e.g., BraTS, Figshare, REMBRANDT, MOLAB) and utilized a range of imaging modalities, predominantly MRI. Hybrid models, especially ResViT and UNetFormer, consistently achieved high performance, with classification accuracy exceeding 98% and segmentation Dice scores above 0.90 across multiple studies. Transformers and hybrid architectures showed increasing adoption post-2023. Many studies lacked external validation and were evaluated only on a few benchmark datasets, raising concerns about generalizability and dataset bias. Few studies addressed clinical interpretability or uncertainty quantification. Despite promising results, particularly for hybrid deep learning models, widespread clinical adoption remains limited due to lack of validation, interpretability concerns, and real-world deployment barriers.
Funding: National Science and Technology Council, the Republic of China, under grants NSTC 113-2221-E-194-011-MY3, and Research Center on Artificial Intelligence and Sustainability, National Chung Cheng University, under the research project grant titled “Generative Digital Twin System Design for Sustainable Smart City Development in Taiwan.”
Abstract: Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, prone to errors, and inconsistent, emphasizing the need for a reliable and automated inspection system. Leveraging both object detection and image segmentation approaches, this research proposes a vision-based solution for the detection of various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After applying multiple constraints and enhancing them through preprocessing and augmentation, a dataset consisting of 3300 annotated RGB-D images was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall equilibrium, inference latency (target ≥ 30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models due to the latter’s increased latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (Recall, Accuracy, F1-score, and Precision). YOLOv11 demonstrated balanced excellence with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, with an average inference time of 25 ms per frame (≈ 40 FPS), demonstrating real-time performance. Leveraging these results, a YOLOv11-based Windows application was successfully deployed in a real-time assembly line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application is capable of precisely measuring socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This makes specification-level quality control possible directly on the assembly line, which improves real-time inspection capability. The implementation is a significant step forward for intelligent manufacturing in the Industry 4.0 paradigm. It provides a scalable, efficient, and accurate way to perform automated inspection and dimensional verification.
Funding: Funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R435), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Segmenting a breast ultrasound image is still challenging due to the presence of speckle noise, dependency on the operator, and the variation of image quality. This paper presents the UltraSegNet architecture, which addresses these challenges through three key technical innovations: (1) a modified ResNet-50 backbone with sequential 3×3 convolutions that preserves the fine anatomical details needed for finding lesion boundaries; (2) a computationally efficient regional attention mechanism that works on high-resolution features without a transformer’s extra memory; and (3) an adaptive feature fusion strategy that adjusts local and global features based on how the image is being used. Extensive evaluation on two distinct datasets demonstrates UltraSegNet’s superior performance: on the BUSI dataset, it obtains a precision of 0.915, a recall of 0.908, and an F1 score of 0.911. On the UDAIT dataset, it achieves robust performance across the board, with a precision of 0.901 and a recall of 0.894. Importantly, these improvements are achieved at clinically feasible computation times, taking 235 ms per image on standard GPU hardware. Notably, UltraSegNet performs particularly well on difficult small lesions (less than 10 mm), achieving a detection accuracy of 0.891. This is a substantial improvement over traditional methods that struggle with small-scale features, as standard models achieve only 0.63–0.71 accuracy. This improvement in small lesion detection is particularly crucial for early-stage breast cancer identification. Results from this work demonstrate that UltraSegNet can be practically deployed in clinical workflows to improve breast cancer screening accuracy.
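As a quick consistency check on the BUSI figures quoted above, the F1 score can be recomputed as the harmonic mean of precision and recall. This is only a sketch: the helper name `f1_score` is introduced here for illustration and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the BUSI dataset in the abstract.
busi_f1 = f1_score(0.915, 0.908)
print(round(busi_f1, 3))  # 0.911
```

The recomputed value agrees with the reported F1 score of 0.911 to three decimal places.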
Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation, No. 2021B1515130003; the Key Research and Development Plan of Hubei Province, No. 2022BCE034; the Natural Science Foundation of Hubei Province, No. 2024AFB1054.
Abstract: BACKGROUND Upper gastrointestinal (UGI) diseases present diagnostic challenges during endoscopy due to visual similarities, indistinct boundaries, and observer variability, which can lead to missed diagnoses and delayed treatment. Automated segmentation using deep learning (DL) models offers the potential to assist endoscopists, improve diagnostic accuracy, and reduce workload. However, multi-class UGI disease segmentation remains underexplored, with limited annotated datasets and insufficient focus on clinical validation. This study hypothesizes that comparative analysis of different DL architectures can identify models suitable for clinical application, providing actionable insights to reduce diagnostic errors and support clinical decision-making in endoscopic practice. AIM To evaluate 17 state-of-the-art DL models for multi-class UGI disease segmentation, emphasizing clinical translation and real-world applicability. METHODS This study evaluated 17 DL models spanning convolutional neural network (CNN)-, transformer-, and Mamba-based architectures using a self-collected dataset from two hospitals in Macao and Xiangyang (3313 images, 9 classes) and the public EDD2020 dataset (386 images, 5 classes). Models were assessed for segmentation performance and performance-efficiency trade-off. Statistical analyses were conducted to examine performance differences across architectures. Generalization capability was measured through a cross-dataset evaluation (training models on the self-collected dataset and testing on the EDD2020 dataset). RESULTS Swin-UMamba achieved the highest segmentation performance across both datasets [intersection over union (IoU): 89.06% ± 0.20% self-collected, 77.53% ± 0.32% EDD2020], followed by SegFormer (IoU: 88.94% ± 0.38% self-collected, 77.20% ± 0.98% EDD2020) and ConvNeXt + UPerNet (IoU: 88.48% ± 0.09% self-collected, 76.90% ± 0.61% EDD2020). Statistical analyses showed no significant differences between paradigms, though hierarchical architectures with pre-trained encoders consistently outperformed simpler designs. SegFormer achieved the best balance of accuracy and computational efficiency with a performance-efficiency trade-off score of 92.02%, making it suitable for real-time clinical use. Cross-dataset evaluation revealed significant performance drops, with generalization retention rates of 64.78% to 71.52%. Transformer-based models, particularly pyramid vision transformer v2 + efficient multi-scale convolutional decoding (IoU: 63.35% ± 1.44%), generalized better than CNN- and Mamba-based models. CONCLUSION Hierarchical architectures like Swin-UMamba and SegFormer show promise for UGI disease segmentation, reducing missed diagnoses and improving workflows, but robust clinical validation is crucial for real-world deployment.
Abstract: Remote sensing image segmentation has a wide range of applications in land cover classification, urban building recognition, crop monitoring, and other fields. In recent years, with the booming development of deep learning, remote sensing image segmentation models based on deep learning have gradually emerged and produced a large number of scientific research achievements. This article reviews the latest achievements in deep learning based remote sensing image segmentation and explores future development directions. First, the basic concepts, characteristics, classification, tasks, and commonly used datasets of remote sensing images are presented. Second, the segmentation models based on deep learning are classified and summarized, and the principles, characteristics, and applications of the various models are presented. Then, the key technologies involved in deep learning remote sensing image segmentation are introduced. Finally, the future development directions and application prospects of remote sensing image segmentation are discussed. By reviewing the latest research achievements from the perspective of deep learning, this article can provide reference and inspiration for research on remote sensing image segmentation.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62072250, 61772281, 61702235, U1636117, U1804263, 62172435, 61872203 and 61802212); the Zhongyuan Science and Technology Innovation Leading Talent Project of China (Grant No. 214200510019); the Suqian Municipal Science and Technology Plan Project in 2020 (S202015); the Plan for Scientific Talent of Henan Province (Grant No. 2018JR0018); the Opening Project of Guangdong Provincial Key Laboratory of Information Security Technology (Grant No. 2020B1212060078); the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) Fund.
Abstract: Medical image segmentation, i.e., labeling structures of interest in medical images, is crucial for disease diagnosis and treatment in radiology. In reversible data hiding in medical images (RDHMI), segmentation consists of only two regions: the focal and nonfocal regions. The focal region mainly contains information for diagnosis, while the nonfocal region serves as the monochrome background. The traditional segmentation methods currently used in RDHMI are inaccurate for complex medical images, and manual segmentation is time-consuming, poorly reproducible, and operator-dependent. Implementing state-of-the-art deep learning (DL) models would bring key benefits, but the lack of domain-specific labels for existing medical datasets makes this impossible. To address this problem, this study provides labels for existing medical datasets based on a hybrid segmentation approach to facilitate the implementation of DL segmentation models in this domain. First, an initial segmentation based on a 3×3 kernel is performed to analyze identified contour pixels before classifying pixels into focal and nonfocal regions. Then, several human expert raters evaluate and classify the generated labels into accurate and inaccurate labels. The inaccurate labels undergo manual segmentation by medical practitioners and are scored based on a hierarchical voting scheme before being assigned to the proposed dataset. To ensure reliability and integrity in the proposed dataset, we evaluate the accurate automated labels against labels manually segmented by medical practitioners using five assessment metrics: Dice coefficient, Jaccard index, precision, recall, and accuracy. The experimental results show that labels in the proposed dataset are consistent with the subjective judgment of human experts, with an average accuracy score of 94% and Dice coefficient scores between 90% and 99%. The study further proposes a ResNet-UNet with concatenated spatial and channel squeeze and excitation (scSE) architecture for semantic segmentation to validate and illustrate the usefulness of the proposed dataset. The results demonstrate the superior performance of the proposed architecture in accurately separating the focal and nonfocal regions compared to state-of-the-art architectures. Dataset information is released under the following URL: https://www.kaggle.com/lordamoah/datasets (accessed on 31 March 2025).
Funding: National Natural Science Foundation of China, Grant/Award Number: 62303275; International Alliance for Cancer Early Detection, Grant/Award Numbers: C28070/A30912, C73666/A31378; Wellcome/EPSRC Centre for Interventional and Surgical Sciences, Grant/Award Number: 203145Z/16/Z.
Abstract: Automated prostate cancer detection in magnetic resonance imaging (MRI) scans is of significant importance for cancer patient management. Most existing computer-aided diagnosis systems adopt segmentation methods, while object detection approaches have recently shown promising results. The authors have (1) carefully compared the performance of well-developed segmentation and object detection methods in localising prostate imaging reporting and data system (PIRADS)-labelled prostate lesions on MRI scans; (2) proposed an additional customised set of lesion-level localisation sensitivity and precision; (3) proposed efficient ways to ensemble the segmentation and object detection methods for improved performance. The ground-truth (GT)-perspective lesion-level sensitivity and prediction-perspective lesion-level precision are reported, to quantify the ratios of true positive voxels detected by the algorithms over the number of voxels in the GT-labelled regions and predicted regions, respectively. The two networks are trained independently on data from 549 clinical patients with PIRADS-V2 as GT labels, and tested on 161 internal and 100 external MRI scans. At the lesion level, nnDetection outperforms nnUNet for detecting both PIRADS ≥ 3 and PIRADS ≥ 4 lesions in the majority of cases. For example, at an average of 3 false positive predictions per patient, nnDetection achieves a greater Intersection-over-Union (IoU)-based sensitivity than nnUNet for detecting PIRADS ≥ 3 lesions, being 80.78% ± 1.50% versus 60.40% ± 1.64% (p < 0.01). At the voxel level, nnUNet is in general superior or comparable to nnDetection. The proposed ensemble methods achieve improved or comparable lesion-level accuracy in all tested clinical scenarios. For example, at 3 false positives, the lesion-wise ensemble method achieves 82.24% ± 1.43% sensitivity versus 80.78% ± 1.50% (nnDetection) and 60.40% ± 1.64% (nnUNet) for detecting PIRADS ≥ 3 lesions. Consistent conclusions are also drawn from results on the external data set.
Funding: Supported by the National Natural Science Foundation of China [grant number 62376217]; the Young Elite Scientists Sponsorship Program by CAST [grant number 2023QNRC001]; the Joint Research Project for Meteorological Capacity Improvement [grant number 24NLTSZ003].
Abstract: Deep learning-based methods have become alternatives to traditional numerical weather prediction systems, offering faster computation and the ability to utilize large historical datasets. However, the application of deep learning to medium-range regional weather forecasting with limited data remains a significant challenge. In this work, three key solutions are proposed: (1) motivated by the need to improve model performance in data-scarce regional forecasting scenarios, the authors apply semantic segmentation models to better capture spatiotemporal features and improve prediction accuracy; (2) recognizing the challenge of overfitting and the inability of traditional noise-based data augmentation methods to effectively enhance model robustness, a novel learnable Gaussian noise mechanism is introduced that allows the model to adaptively optimize perturbations for different locations, ensuring more effective learning; and (3) to address the issue of error accumulation in autoregressive prediction, as well as the learning difficulty and the lack of intermediate data utilization in one-shot prediction, the authors propose a cascade prediction approach that resolves these problems while significantly improving forecasting performance. The method achieves a competitive result in the East China Regional AI Medium Range Weather Forecasting Competition. Ablation experiments further validate the effectiveness of each component, highlighting their contributions to enhancing prediction performance.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61373121 and 61328205); the Program for Sichuan Provincial Science Fund for Distinguished Young Scholars (No. 13QNJJ0149); the Fundamental Research Funds for the Central Universities; the China Scholarship Council (No. 201507000032).
Abstract: Deep learning technology has shown impressive performance in various vision tasks such as image classification, object detection and semantic segmentation. In particular, recent advances in deep learning techniques bring encouraging performance to fine-grained image classification, which aims to distinguish subordinate-level categories, such as bird species or dog breeds. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we review four types of deep learning based fine-grained image classification approaches, including the general convolutional neural networks (CNNs), part detection based, ensemble of networks based and visual attention based fine-grained image classification approaches. Besides, the deep learning based semantic segmentation approaches are also covered in this paper. The region proposal based and fully convolutional networks based approaches for semantic segmentation are introduced respectively.
Abstract: Deep learning is widely used for lesion segmentation in medical images due to its breakthrough performance. Loss functions are critical in a deep learning pipeline, and they play important roles in segmentation performance. Dice loss is the most commonly used loss function in medical image segmentation, but it also has some disadvantages. In this paper, we discuss the advantages and disadvantages of the Dice loss function, and group the extensions of the Dice loss according to the purpose of each improvement. The performances of some extensions are compared according to core references. Because different loss functions perform differently in different tasks, automatic loss function selection is a potential direction for the future.
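For readers unfamiliar with the Dice loss discussed above, the following is a minimal NumPy sketch of the standard soft Dice formulation, 1 − 2|P∩T| / (|P| + |T|). The function names and the smoothing term `eps` are assumptions made for illustration; the reviewed extensions modify this baseline in ways not detailed in the abstract.

```python
import numpy as np


def dice_coefficient(pred, target, eps=1e-7):
    """Soft Dice overlap between a predicted probability map and a binary mask."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    intersection = np.sum(pred * target)
    # eps (an assumed smoothing constant) avoids division by zero for empty masks.
    return (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)


def dice_loss(pred, target, eps=1e-7):
    """Dice loss: 0 for perfect overlap, close to 1 for no overlap."""
    return 1.0 - dice_coefficient(pred, target, eps)


# Two predicted foreground pixels, one of which matches the single true pixel.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
print(dice_coefficient(pred, target))  # approximately 2/3
```

Because the score is a ratio of overlaps rather than a per-pixel average, it is less dominated by the background class than pixel-wise losses, which is one commonly cited reason for its popularity in lesion segmentation.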
Funding: Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT [NRF-2017R1E1A1A01077717].
Abstract: Many existing techniques to acquire dual-energy X-ray absorptiometry (DXA) images are unable to accurately distinguish between bone and soft tissue. For the most part, this failure stems from bone shape variability, noise and low contrast in DXA images, inconsistent X-ray beam penetration producing shadowing effects, and person-to-person variations. This work explores the feasibility of using state-of-the-art deep learning semantic segmentation models, namely fully convolutional networks (FCNs), SegNet, and U-Net, to distinguish femur bone from soft tissue. We investigated the performance of deep learning algorithms with reference to some of our previously applied conventional image segmentation techniques (i.e., a decision-tree-based method using a pixel label decision tree [PLDT] and another method using Otsu’s thresholding) for femur DXA images, and we measured accuracy based on the average Jaccard index, sensitivity, and specificity. Deep learning models using SegNet, U-Net, and an FCN achieved average segmentation accuracies of 95.8%, 95.1%, and 97.6%, respectively, compared to PLDT (91.4%) and Otsu’s thresholding (72.6%). Thus we conclude that an FCN outperforms other deep learning and conventional techniques when segmenting femur bone from soft tissue in DXA images. Accurate femur segmentation improves bone mineral density computation, which in turn enhances the diagnosis of osteoporosis.
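The three evaluation measures named above can be computed from a pair of binary masks as sketched below. This is a generic illustration with a hypothetical helper name (`segmentation_metrics`), treating bone as the positive class; it is not the authors' evaluation code, and how they averaged the measures across images is not stated in the abstract.

```python
import numpy as np


def segmentation_metrics(pred, truth):
    """Jaccard index, sensitivity, and specificity for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))    # bone predicted as bone
    fp = int(np.sum(pred & ~truth))   # soft tissue predicted as bone
    fn = int(np.sum(~pred & truth))   # bone predicted as soft tissue
    tn = int(np.sum(~pred & ~truth))  # soft tissue predicted as soft tissue

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "jaccard": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


truth = np.array([[1, 1, 0, 0]])
pred = np.array([[1, 0, 1, 0]])
print(segmentation_metrics(pred, truth))
```

In this toy case there is one pixel of each outcome type, so the Jaccard index is 1/3 while sensitivity and specificity are both 0.5.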
Funding: Supported by the National Natural Science Foundation of China (61671397).
Abstract: In the shape analysis community, decomposing a 3D shape into meaningful parts has become a topic of interest. 3D model segmentation is widely used in tasks such as shape deformation, partial shape matching, skeleton extraction, shape correspondence, shape annotation and texture mapping. Numerous approaches have attempted to provide better segmentation solutions; however, the majority of the previous techniques used handcrafted features, which usually focus on a particular attribute of 3D objects and so are difficult to generalize. In this paper, we propose a three-stage approach that uses a multi-view recurrent neural network to automatically segment a 3D shape into visually meaningful sub-meshes. The first stage involves normalizing and scaling a 3D model to fit within the unit sphere and rendering the object into different views. Different viewpoints, however, are not inherently correlated, and the same 3D region can produce quite different outcomes depending on the viewpoint. To address this, we run each view through a CNN (with shared weights) and a Bolster block in order to create a probability boundary map. The Bolster block models the area relationships between different views, which helps to improve and refine the data. In stage two, the feature maps generated in the previous step are correlated using a recurrent neural network to obtain compatible fine-detail responses for each view. Finally, a fully connected layer is used to return coherent edges, which are then back-projected onto the 3D object to produce the final segmentation. Experiments on the Princeton Segmentation Benchmark dataset show that our proposed method is effective for mesh segmentation tasks.
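The first-stage normalization described above (fitting a model inside the unit sphere) can be sketched as follows. The abstract does not say how the model is centered, so this sketch assumes centering at the vertex centroid and scaling by the largest vertex distance; the function name is hypothetical.

```python
import numpy as np


def normalize_to_unit_sphere(vertices):
    """Center an (N, 3) vertex array and scale it to fit inside the unit sphere."""
    v = np.asarray(vertices, dtype=np.float64)
    centered = v - v.mean(axis=0)  # assumption: centroid is moved to the origin
    radius = np.linalg.norm(centered, axis=1).max()
    if radius == 0:
        return centered  # degenerate model: all vertices coincide
    return centered / radius


# A 2 x 4 x 6 box given by its eight corner vertices.
corners = np.array([[x, y, z] for x in (0, 2) for y in (0, 4) for z in (0, 6)])
unit = normalize_to_unit_sphere(corners)
print(np.linalg.norm(unit, axis=1).max())  # farthest vertex lies on the sphere (distance ~1)
```

After this step every model occupies a comparable volume, so a fixed set of camera positions can be reused to render the views for all shapes.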
Funding: Supported by the National Natural Science Foundation of China (12090020, 12090025); Zhejiang Provincial Natural Science Foundation of China (LSD19H180005).
Abstract: Automatic segmentation of the liver and hepatic lesions from abdominal 3D computed tomography (CT) images is a fundamental task in computer-assisted liver surgery planning. However, due to complex backgrounds, ambiguous boundaries, heterogeneous appearances and highly varied shapes of the liver, accurate liver segmentation and tumor detection are still challenging problems. To address these difficulties, we propose an automatic segmentation framework based on 3D U-net with dense connections and globally optimized refinement. First, a deep U-net architecture with dense connections is trained to learn the probability map of the liver. Then the probability map goes into the following refinement step as the initial surface and prior shape. The segmentation of liver tumors is based on a similar network architecture, with the help of the liver segmentation results. In order to reduce the influence of surrounding tissues whose intensity and texture are similar to those of the tumor region, during the training procedure the product I × liver label (the image multiplied by the liver label) is the input of the network for the segmentation of liver tumors. By doing this, the accuracy of segmentation can be improved. The proposed method is fully automatic without any user interaction. Both qualitative and quantitative results reveal that the proposed approach is efficient and accurate for liver volume estimation in clinical application. The high correlation between the automatic and manual references shows that the proposed method can be good enough to replace the time-consuming and non-reproducible manual segmentation method.
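The "I × liver label" input described above amounts to an element-wise product of the CT volume with the binary liver mask. A minimal NumPy sketch is given below; the function name and the convention that any positive label value counts as liver are assumptions for illustration.

```python
import numpy as np


def mask_with_liver(volume, liver_label):
    """Keep CT intensities inside the liver mask and zero out everything else."""
    volume = np.asarray(volume, dtype=np.float32)
    # Assumption: any positive label value marks a liver voxel.
    mask = (np.asarray(liver_label) > 0).astype(np.float32)
    if volume.shape != mask.shape:
        raise ValueError("volume and liver_label must have the same shape")
    return volume * mask


# Toy 1 x 2 x 3 volume with two voxels labelled as liver.
ct = np.array([[[120.0, 80.0, 60.0], [40.0, 100.0, 90.0]]])
label = np.array([[[1, 0, 0], [0, 1, 0]]])
print(mask_with_liver(ct, label))
```

Only the voxels inside the mask keep their intensities, so the tumor network never sees neighbouring organs with tumor-like intensity and texture.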
Funding: Supported by the Information Technology Industry Development Agency (ITIDA), Egypt (Project No. CFP181).
Abstract: Image segmentation is crucial for various research areas. Many computer vision applications depend on segmenting images to understand the scene, such as autonomous driving, surveillance systems, robotics, and medical imaging. With the recent advances in deep learning (DL) and its confounding results in image segmentation, more attention has been drawn to its use in medical image segmentation. This article introduces a survey of the state-of-the-art deep convolution neural network (CNN) models and mechanisms utilized in image segmentation. First, segmentation models are categorized based on their model architecture and primary working principle. Then, CNN categories are described, and various models are discussed within each category. Compared with other existing surveys, several applications with multiple architectural adaptations are discussed within each category. A comparative summary is included to give the reader insights into utilized architectures in different applications and datasets. This study focuses on medical image segmentation applications, where the most widely used architectures are illustrated, and other promising models are suggested that have proven their success in different domains. Finally, the present work discusses current limitations and solutions along with future trends in the field.
Abstract: Background: In medical image analysis, the diagnosis of skin lesions remains a challenging task. Skin lesion is a common type of skin cancer that exists worldwide. Dermoscopy is one of the latest technologies used for the diagnosis of skin cancer. Challenges: Many computerized methods have been introduced in the literature to classify skin cancers. However, challenges remain, such as imbalanced datasets, low-contrast lesions, and the extraction of irrelevant or redundant features. Proposed Work: In this study, a new technique is proposed based on a conventional and deep learning framework. The proposed framework consists of two major tasks: lesion segmentation and classification. In the lesion segmentation task, contrast is initially improved by the fusion of two filtering techniques, and a color transformation is then performed to improve color discrimination of the lesion area. Subsequently, the best channel is selected and the lesion map is computed, which is further converted into a binary form using a thresholding function. In the lesion classification task, two pre-trained CNN models were modified and trained using transfer learning. Deep features were extracted from both models and fused using canonical correlation analysis. During the fusion process, a few redundant features were also added, lowering classification accuracy. A new technique called maximum entropy score-based selection (MESbS) is proposed as a solution to this issue. The features selected through this approach are fed into a cubic support vector machine (C-SVM) for the final classification. Results: The experimental process was conducted on two datasets: ISIC 2017 and HAM10000. The ISIC 2017 dataset was used for the lesion segmentation task, whereas the HAM10000 dataset was used for the classification task. The achieved accuracy for both datasets was 95.6% and 96.7%, respectively, which was higher than that of the existing techniques.
Funding: This research was financially supported in part by the Ministry of Trade, Industry and Energy (MOTIE) and Korea Institute for Advancement of Technology (KIAT) through the International Cooperative R&D program (Project No. P0016038), and in part by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2021-2016-0-00312) supervised by the IITP (Institute for Information & communications Technology Planning & Evaluation).
Abstract: An automated system is proposed for the detection and classification of gastrointestinal (GI) abnormalities. The proposed method operates under two pipeline procedures: (a) segmentation of the bleeding infection region and (b) classification of GI abnormalities by deep learning. First, the bleeding region is segmented using a hybrid approach: a threshold is applied to each channel extracted from the original RGB image, and all channels are then merged through mutual information and pixel-based techniques to produce the segmented image. In the classification task, texture and deep learning features are extracted. The transfer learning (TL) approach is used for the extraction of deep features, and the Local Binary Pattern (LBP) method is used for texture features. Next, an entropy-based feature selection approach is applied to select the best features from both the deep learning and texture vectors. The selected optimal features are combined with a serial-based technique, and the resulting vector is fed to the Ensemble Learning Classifier. The experimental process is evaluated on two datasets: Private and KVASIR. The accuracy achieved is 99.8% for the private dataset and 86.4% for the KVASIR dataset. These results indicate that the proposed method is effective in detecting and classifying GI abnormalities and outperforms the compared methods.
Abstract: Although deep learning methods have been widely applied in medical image lesion segmentation, it is still challenging to apply them to segmenting ischemic stroke lesions, which differ from brain tumors in lesion characteristics, segmentation difficulty, algorithm maturity, and segmentation accuracy. Three main stages are used to describe the manifestations of stroke. For acute ischemic stroke, the size of the lesions is similar to that of brain tumors, and current deep learning methods have been able to achieve high segmentation accuracy. For sub-acute and chronic ischemic stroke, the segmentation results of mainstream deep learning algorithms are still unsatisfactory, as lesions in these stages are small and diffuse. Using three scientific search engines (CNKI, Web of Science, and Google Scholar), this paper aims to comprehensively understand the state-of-the-art deep learning algorithms applied to segmenting ischemic stroke lesions. For the first time, this paper discusses the current situation, challenges, and development directions of deep learning algorithms applied to ischemic stroke lesion segmentation in different stages. In the future, a system needs to be proposed that can directly identify different stroke stages and automatically select a suitable network architecture for stroke lesion segmentation.
Funding: This research has been partially supported by the National Science Foundation under grant IIS-1115417, the National Natural Science Foundation of China under grants 61728205 and 61876217, the "double first-class" international cooperation and development scientific research project of Changsha University of Science and Technology (No. 2018IC25), and the Science and Technology Development Project of Suzhou under grants SZS201609 and SYG201707.
Abstract: Accurate segmentation of CT images of liver tumors is an important aid to the diagnosis and treatment of liver diseases. In recent years, owing to great improvements in hardware, many deep learning based methods have been proposed for automatic liver segmentation. Among them are plain neural networks, headed by FCN, and residual neural networks, headed by ResNet, both of which have many variations. They have achieved notable results in medical image segmentation. In this paper, we first select five representative structures, i.e., FCN, U-Net, SegNet, ResNet, and DenseNet, to investigate their performance on liver segmentation. Since the original ResNet and DenseNet cannot perform image segmentation directly, we adjust them to perform liver segmentation. Our experimental results show that DenseNet performs the best on liver segmentation, followed by ResNet. Both perform much better than SegNet, U-Net, and FCN. Among SegNet, U-Net, and FCN, U-Net performs the best, followed by SegNet; FCN performs the worst.
Funding: Supported by the National Key Research and Development Program of China (No. 2017YFB0702303).
Abstract: Microstructural classification is typically done manually by human experts, which gives rise to uncertainties due to subjectivity and reduces overall efficiency. A high-throughput characterization method is proposed based on deep learning, rapid acquisition technology, and mathematical statistics for the recognition, segmentation, and quantification of microstructure in weathering steel. The segmentation results showed that this method was accurate and efficient, and the segmentation of inclusions and the pearlite phase achieved accuracies of 89.95% and 90.86%, respectively. The time required for batch processing by MIPAR software, involving thresholding segmentation, morphological processing, and small area deletion, was 1.05 s for a single image. By comparison, our system required only 0.102 s, which is about ten times faster than the commercial software. The quantification results were extracted from large volumes of sequential image data (150 mm², 62,216 images, 1024×1024 pixels), which ensures comprehensive statistics. Microstructure information, such as the three-dimensional density distribution and the frequency of the minimum spatial distance of inclusions on the 150 mm² sample surface, was quantified by extracting the coordinates and sizes of individual features. A refined characterization method is provided for two-dimensional structures and spatial information that is unattainable manually or with software. This will be useful for understanding the properties or behaviors of weathering steel and for reducing reliance on physical testing.
Funding: Scientific Research Project of the Education Department of Hunan Province (20C1435); Open Fund Project for Computer Science and Technology of Hunan University of Chinese Medicine (2018JK05).
Abstract: Objective: To propose two novel deep learning methods for computer-aided tongue diagnosis, covering tongue image segmentation and tongue color classification, and to improve their diagnostic accuracy. Methods: LabelMe was used to label the tongue mask, and the Snake model was used to optimize the labeling results. A new dataset was constructed for tongue image segmentation. Tongue color was marked to build a classification dataset for network training. In this research, the Inception + Atrous Spatial Pyramid Pooling (ASPP) + UNet (IAUNet) method was proposed for tongue image segmentation, based on the existing UNet, Inception, and atrous convolution. Moreover, the Tongue Color Classification Net (TCCNet) was constructed with reference to ResNet, Inception, and Triplet-Loss. Several important measurement indexes were selected to evaluate and compare the novel and existing methods for tongue segmentation and tongue color classification. IAUNet was compared with existing mainstream methods such as UNet and DeepLabV3+ for tongue segmentation. TCCNet for tongue color classification was compared with VGG16 and GoogLeNet. Results: IAUNet can accurately segment the tongue from original images. The Mean Intersection over Union (MIoU) of IAUNet reached 96.30%, and its Mean Pixel Accuracy (MPA), mean Average Precision (mAP), F1-Score, G-Score, and Area Under Curve (AUC) reached 97.86%, 99.18%, 96.71%, 96.82%, and 99.71%, respectively, suggesting that IAUNet produced better segmentation than other methods, with fewer parameters. Triplet-Loss was applied in the proposed TCCNet to separate different embedded colors. The experiment yielded ideal results, with the F1-Score and mAP of TCCNet reaching 88.86% and 93.49%, respectively. Conclusion: The deep learning based IAUNet not only produces ideal tongue segmentation but also outperforms the traditional networks PSPNet, SegNet, UNet, and DeepLabV3+. For tongue color classification, the proposed network, TCCNet, achieved better F1-Score and mAP values than other neural networks such as VGG16 and GoogLeNet.
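The MIoU and MPA figures quoted in the abstract above are standard segmentation metrics. As a minimal sketch, assuming the usual confusion-matrix definitions (the paper's exact evaluation protocol is not given in the abstract), they can be computed as follows. The function names and the toy labels are illustrative only.

```python
def confusion_matrix(truth, pred, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true class k, predicted otherwise
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, true class otherwise
        denom = tp + fp + fn
        if denom:  # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious)


def mean_pixel_accuracy(cm):
    """Mean over classes of the per-class recall TP / (TP + FN)."""
    accs = [cm[k][k] / sum(cm[k]) for k in range(len(cm)) if sum(cm[k])]
    return sum(accs) / len(accs)


# Flattened toy masks: 0 = background, 1 = tongue.
truth = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
cm = confusion_matrix(truth, pred, num_classes=2)
miou = mean_iou(cm)
mpa = mean_pixel_accuracy(cm)
```

For these toy masks the confusion matrix is [[2, 1], [1, 2]], so each class has an IoU of 2/4 and a pixel accuracy of 2/3.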