In recent years, fungal diseases affecting grape crops have attracted significant attention. Currently, the assessment of black rot severity mainly depends on the ratio of lesion area to leaf surface area. However, effectively and accurately segmenting leaf lesions presents considerable challenges. Existing grape leaf lesion segmentation models have several limitations, such as a large number of parameters, long training durations, and limited precision in extracting small lesions and boundary details. To address these issues, we propose SFC_DeepLabv3+, an enhanced lesion segmentation method based on DeepLabv3+ that incorporates Strip Pooling, Content-Guided Fusion, and a Convolutional Block Attention Module. This approach uses the lightweight MobileNetv2 backbone to replace the original Xception, incorporates a lightweight convolutional block attention module, and introduces a content-guided feature fusion module to improve the detection accuracy of small lesions and blurred boundaries. Experimental results show that the enhanced model achieves a mean Intersection over Union (mIoU) of 90.98%, a mean Pixel Accuracy (mPA) of 94.33%, and a precision of 95.84%. This represents relative gains of 2.22%, 1.78%, and 0.89%, respectively, compared to the original model. Additionally, its complexity is significantly reduced without sacrificing performance: the parameter count is reduced to 6.27 M, a decrease of 88.5% compared to the original model, and the number of floating-point operations (GFLOPs) drops from 83.62 G to 29.00 G, a reduction of 65.1%. Frames Per Second (FPS) increases from 63.7 to 74.3, an improvement of 16.7%. Compared to other models, the improved architecture shows faster convergence and superior segmentation accuracy, making it highly suitable for applications in resource-constrained environments.
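For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how mIoU and mPA are commonly computed from a per-class confusion matrix. It is not taken from the paper: the function name `miou_and_mpa`, the matrix layout, and the toy pixel counts are all assumptions made for illustration.

```python
def miou_and_mpa(confusion):
    """Return (mIoU, mPA) for a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and whose
    predicted class is j. Classes that never occur are skipped.
    """
    n = len(confusion)
    ious, accs = [], []
    for c in range(n):
        tp = confusion[c][c]
        true_total = sum(confusion[c])                       # TP + FN
        pred_total = sum(confusion[r][c] for r in range(n))  # TP + FP
        union = true_total + pred_total - tp                 # TP + FP + FN
        if union > 0:
            ious.append(tp / union)
        if true_total > 0:
            accs.append(tp / true_total)
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Toy two-class example (background, lesion); the counts are made up.
toy = [[90, 10],
       [5, 45]]
miou, mpa = miou_and_mpa(toy)
```

Averaging per class rather than per pixel keeps a small lesion class from being swamped by the much larger background class, which is why these metrics are popular for lesion segmentation.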
Research has been conducted to reduce resource consumption in 3D medical image segmentation for diverse resource-constrained environments. However, decreasing the number of parameters to enhance computational efficiency can also lead to performance degradation. Moreover, these methods face challenges in balancing global and local features, increasing the risk of errors in multi-scale segmentation. This issue is particularly pronounced when segmenting small and complex structures within the human body. To address this problem, we propose a multi-stage hierarchical architecture composed of a detector and a segmentor. The detector extracts regions of interest (ROIs) in a 3D image, while the segmentor performs segmentation in the extracted ROI. Removing unnecessary areas in the detector allows the segmentation to be performed on a more compact input. The segmentor is designed with multiple stages, where each stage uses a different input size. It implements a stage-skipping mechanism that deactivates certain stages based on the initial input size. This approach avoids unnecessary computation when segmenting the essential regions, reducing computational overhead. The proposed framework preserves segmentation performance while reducing resource consumption, enabling segmentation even in resource-constrained environments.
Brain tumor segmentation from Magnetic Resonance Imaging (MRI) supports neurologists and radiologists in analyzing tumors and developing personalized treatment plans, making it a crucial yet challenging task. Supervised models such as 3D U-Net perform well in this domain, but their accuracy significantly improves with appropriate preprocessing. This paper demonstrates the effectiveness of preprocessing in brain tumor segmentation by applying a pre-segmentation step based on the Generalized Gaussian Mixture Model (GGMM) to T1 contrast-enhanced MRI scans from the BraTS 2020 dataset. The Expectation-Maximization (EM) algorithm is employed to estimate parameters for four tissue classes, generating a new pre-segmented channel that enhances the training and performance of the 3D U-Net model. The proposed GGMM+3D U-Net framework achieved a Dice coefficient of 0.88 for whole tumor segmentation, outperforming both the standard multiscale 3D U-Net (0.84) and MMU-Net (0.85). It also delivered higher Intersection over Union (IoU) scores compared to models trained without preprocessing or with simpler GMM-based segmentation. These results, supported by qualitative visualizations, suggest that GGMM-based preprocessing should be integrated into brain tumor segmentation pipelines to optimize performance.
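The Dice coefficient and IoU reported above are overlap measures between a predicted mask and a ground-truth mask. The sketch below shows the standard definitions for binary masks. The function name `dice_and_iou`, the flat 0/1 mask representation, and the sample masks are assumptions for illustration and are not drawn from the paper.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equal-length flat sequences of 0/1 labels."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    p_sum = sum(1 for p in pred if p)
    t_sum = sum(1 for t in target if t)
    if p_sum + t_sum == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0, 1.0
    union = p_sum + t_sum - inter
    return 2 * inter / (p_sum + t_sum), inter / union


# Made-up five-voxel masks: two voxels overlap, three are set in each mask.
pred_mask = [1, 1, 0, 0, 1]
true_mask = [1, 0, 0, 1, 1]
dice, iou = dice_and_iou(pred_mask, true_mask)
```

For a single pair of masks the two scores are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.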
Accurate and efficient brain tumor segmentation is essential for early diagnosis, treatment planning, and clinical decision-making. However, the complex structure of brain anatomy and the heterogeneous nature of tumors present significant challenges for precise anomaly detection. While U-Net-based architectures have demonstrated strong performance in medical image segmentation, there remains room for improvement in feature extraction and localization accuracy. In this study, we propose a novel hybrid model designed to enhance 3D brain tumor segmentation. The architecture incorporates a 3D ResNet encoder, known for mitigating the vanishing gradient problem, and a 3D U-Net decoder. Additionally, to enhance the model’s generalization ability, a Squeeze-and-Excitation attention mechanism is integrated. We introduce Gabor filter banks into the encoder to further strengthen the model’s ability to extract robust and transformation-invariant features from the complex and irregular shapes typical in medical imaging. This approach, which is not well explored in current U-Net-based segmentation frameworks, provides a unique advantage by enhancing texture-aware feature representation. Specifically, Gabor filters help extract distinctive low-level texture features, reducing the effects of texture interference and facilitating faster convergence during the early stages of training. Our model achieved Dice scores of 0.881, 0.846, and 0.819 for Whole Tumor (WT), Tumor Core (TC), and Enhancing Tumor (ET), respectively, on the BraTS 2020 dataset. Cross-validation on the BraTS 2021 dataset further confirmed the model’s robustness, yielding Dice scores of 0.887 for WT, 0.856 for TC, and 0.824 for ET. The proposed model outperforms several existing state-of-the-art models, particularly in accurately identifying small and complex tumor regions. Extensive evaluations suggest that integrating advanced preprocessing with an attention-augmented hybrid architecture offers significant potential for reliable and clinically valuable brain tumor segmentation.
3D object recognition is a challenging task for intelligent and robotic systems in industrial and home indoor environments. It is critical for such systems to recognize and segment the 3D object instances that they encounter frequently. The computer vision, graphics, and machine learning fields have all given the problem considerable attention. Traditionally, 3D segmentation was done with hand-crafted features and designed approaches that did not achieve acceptable performance and could not be generalized to large-scale data. Following their great success in 2D computer vision, deep learning approaches have recently become the preferred method for 3D segmentation. However, the task of instance segmentation is currently less explored. In this paper, we propose a novel deep learning approach for efficient 3D instance segmentation using red, green, blue and depth (RGB-D) data. The 2D Mask region-based convolutional neural network (Mask R-CNN) model with a point-based rendering module is adapted to integrate depth information in order to recognize and segment 3D instances of objects. To generate 3D point cloud coordinates (x, y, z), the segmented 2D pixels (u, v) of recognized object regions in the RGB image are matched to the corresponding (u, v) points of the depth image. Moreover, we conducted experiments and analysis to evaluate the proposed method from various viewpoints and distances. The experiments show that the proposed 3D object recognition and instance segmentation are sufficient to support object handling in robotic and intelligent systems.
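The step from segmented (u, v) pixels to (x, y, z) points can be illustrated with the usual pinhole back-projection. This is a hedged sketch, not the paper's implementation: it assumes the depth image is already registered to the RGB image, that depth is stored in metres with 0 meaning "no reading", and that the camera intrinsics `fx`, `fy`, `cx`, `cy` are known. All names and numbers are hypothetical.

```python
def backproject(mask_pixels, depth, fx, fy, cx, cy):
    """Lift segmented pixels (u, v) to camera-frame points (x, y, z).

    mask_pixels: iterable of (u, v) with u = column and v = row.
    depth: 2D list indexed as depth[v][u]; values <= 0 are treated as invalid.
    """
    points = []
    for u, v in mask_pixels:
        z = depth[v][u]
        if z <= 0:
            continue  # skip pixels with no valid depth reading
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points.append((x, y, z))
    return points


# Tiny made-up 2x2 depth image and intrinsics.
toy_depth = [[1.0, 2.0],
             [0.0, 4.0]]
toy_mask = [(0, 0), (1, 0), (0, 1), (1, 1)]
cloud = backproject(toy_mask, toy_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Applying the function only to pixels inside an instance mask yields one point cloud per recognized object.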
To address the Unmanned Aerial Vehicle (UAV) formation collision avoidance problem in Three-Dimensional (3-D) low-altitude environments with dense and varied obstacles, this study proposes a fluid-based path planning framework named the Formation Interfered Fluid Dynamical System (FIFDS) with a Moderate Evasive Maneuver Strategy (MEMS). First, the UAV formation collision avoidance problem, including quantifiable performance indexes, is formulated. Second, inspired by the phenomenon of fluids continuously flowing while bypassing objects, the FIFDS for multiple UAVs is presented, which contains a Parallel Streamline Tracking (PST) method for formation keeping and the traditional IFDS for collision avoidance. Third, to rationally balance flight safety and collision avoidance cost, MEMS is proposed to generate moderate evasive maneuvers that match the collision risks. The 3-D dynamic collision regions, which incorporate both time and distance safety information, are modeled for collision prediction. Then, the moderate evasive maneuver principle is refined, providing criteria for the maneuver amplitude and direction. On this basis, an analytical parameter mapping mechanism is designed to optimize the IFDS parameters online. Finally, the performance of the proposed method is validated by comparative simulation results and real flight experiments using fixed-wing UAVs.
In the study of autonomous driving, understanding the road scene is key to improving driving safety. Semantic segmentation divides an image at the pixel level into areas associated with semantic categories, helping vehicles perceive the surrounding road environment and thereby improving driving safety. Deeplabv3+ is a currently popular semantic segmentation model. However, in its segmentation results small targets are often missed and similar objects are easily misjudged, which leads to rough segmentation boundaries and reduced semantic accuracy. To address this issue, this study builds on the Deeplabv3+ network structure and combines it with attention mechanisms to increase the weight of the segmentation area, proposing an improved Deeplabv3+ with fused attention for road scene semantic segmentation. First, a parallel pair of position attention and channel attention modules is introduced at the Deeplabv3+ encoder to capture more spatial context information and high-level semantic information. Then, at the decoder, an attention mechanism is introduced to restore spatial detail information, and the data are normalized to accelerate model convergence. Models with different attention mechanisms are compared and tested on the CamVid and Cityscapes datasets. The experimental results show that the mean Intersection over Union (mIoU) of the improved model on the two datasets is higher by 6.88% and 2.58%, respectively, than that of Deeplabv3+. The method does not significantly increase the network's computation or complexity and achieves a good balance of speed and accuracy.
Numerous automatic fabric defect detection algorithms have been proposed. Traditional machine vision algorithms set separate parameters for different textures and defects and rely on manually designed features to complete the detection. To overcome these limitations, deep learning-based algorithms can extract more complex image features and perform better in image classification and object detection. This paper proposes a pixel-level defect segmentation methodology using DeepLabv3+, a classical semantic segmentation network. Three DeepLabv3+ networks are constructed, based on ResNet-18, ResNet-50, and MobileNetv2, and are trained and tested on datasets built from captured and publicly available images. The experimental results show that the three DeepLabv3+ networks perform similarly on the four indicators used (Precision, Recall, F1-score, and Accuracy), demonstrating that they achieve defect detection and semantic segmentation and providing new ideas and technical support for fabric defect detection.
A novel technique for three-dimensional (3D) reconstruction, segmentation, display, and analysis of series of image slices was developed to perform simultaneous measurement of the structure and function of biomedical samples. The supported inputs include wide-field microscopic optical sections processed by deconvolution, cryo-electron microscope slices processed by Fourier-Bessel synthesis and electron tomography (ET), and series of computed tomography (CT) images. The paper presents the 3D reconstruction, segmentation, display, and analysis results for pollen spores, chaperonin, virus, head, cervical bone, tibia, and carpus. It also puts forward some potential applications of the new technique in the biomedical realm.
To overcome the shortcomings of 1D and 2D Otsu thresholding techniques, the 3D Otsu method has been developed. Among all Otsu methods, the 3D Otsu technique provides the best threshold values for multilevel thresholding. In this paper, to improve the quality of segmented images, a simple and effective multilevel thresholding method is introduced. The proposed approach focuses on preserving edge detail by combining 3D Otsu thresholding with image fusion. The advantages of the presented scheme include higher-quality outcomes, better preservation of tiny details and boundaries, and reduced execution time as the number of threshold levels rises. The fusion approach depends on the differences between pixel intensity values within a small local region of an image; it aims to improve localized information after the thresholding process. Fusing images based on local contrast can improve image segmentation performance by minimizing the loss of local contrast, details, and gray-level distributions. Results show that the proposed method yields more promising segmentation results than the conventional 1D Otsu, 2D Otsu, and 3D Otsu methods, as is evident from objective and subjective evaluations.
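As background for the comparison above, the following is a minimal sketch of the classic 1D Otsu baseline, which picks the gray level that maximizes between-class variance of the histogram. It is not the paper's 3D Otsu or fusion scheme. The function name `otsu_threshold` and the toy histogram are assumptions for illustration.

```python
def otsu_threshold(hist):
    """Return the threshold t maximizing between-class variance.

    hist[g] is the number of pixels with gray level g. Pixels with
    level <= t form class 0; the remaining pixels form class 1.
    """
    total = sum(hist)
    sum_all = sum(g * h for g, h in enumerate(hist))
    w0 = 0        # pixel count of class 0
    sum0 = 0.0    # intensity sum of class 0
    best_t, best_var = 0, -1.0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        if w0 == 0:
            continue          # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break             # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to the variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Made-up bimodal histogram over 8 gray levels: dark peak at 0-1, bright at 6-7.
toy_hist = [10, 10, 0, 0, 0, 0, 10, 10]
threshold = otsu_threshold(toy_hist)
```

Because ties keep the first maximizer, the toy histogram gives a threshold at the upper edge of the dark peak. The 2D and 3D variants extend the same criterion to histograms that also include neighborhood statistics.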
3D modeling of geological bodies based on 3D seismic data is used to define the shape and volume of the bodies, which can then be directly applied to reservoir prediction, reserve estimation, and exploration. However, multiple attributes are not used effectively in 3D modeling. To solve this problem, we propose a novel method for building 3D models of geological anomalies based on segmentation with multiattribute fusion. First, we divide the seismic attributes into edge-based and region-based seismic attributes. Then, a segmentation model incorporating the edge-based and region-based models is constructed within a level-set-based framework. Finally, the marching cubes algorithm is adopted to extract the zero level set from the segmentation results and build the 3D model of the geological anomaly. By combining the edge-based and region-based attributes in the segmentation model, we satisfy the independence requirement and avoid the problem that a single seismic attribute provides insufficient data to capture the boundaries of geological anomalies. We apply the proposed method to seismic data from the Sichuan Basin in southwestern China and obtain 3D models of caves and channels. Compared with 3D models obtained from single seismic attributes, the results are in better agreement with reality.
AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. To address the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function of the 2D fully convolutional network were modified. This network, however, ignores the correlations between corresponding positions in spatially adjacent images. Thus, we proposed a three-dimensional (3D) fully convolutional network for segmentation of retinal OCT images. RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy is 99.56%, the Kappa coefficient is 98.47%, and the F1 score for retinal fluid is 95.50%. CONCLUSION: OCT image segmentation algorithms based on deep learning are primarily founded on 2D convolutional networks. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors and can provide doctors with more accurate diagnostic data.
With the widespread application of deep learning in computer vision, using medical imaging technology to assist doctors in making diagnoses has great practical and research significance. To address the shortcomings of the traditional U-Net model in 3D spatial information extraction, over-fitting, and limited fusion of semantic information, an improved medical image segmentation model is used to achieve more accurate segmentation of medical images. In this model, we make full use of the residual network (ResNet) to address the over-fitting problem. To process and aggregate data at different scales, the inception network is used instead of the traditional convolutional layer, and dilated convolution is used to increase the receptive field. A conditional random field (CRF) performs contour refinement. Compared with the traditional 3D U-Net network, the segmentation accuracy of the improved model on liver and tumor images increases by 2.89% and 7.66%, respectively. As one part of an image processing pipeline, the method in this paper can not only be used for medical image segmentation but also lay the foundation for subsequent 3D image reconstruction.
AIM: To explore a more accurate quantitative diagnosis method for diabetic macular edema (DME) by displaying detailed 3D morphometry beyond the gold-standard quantification indicator, central retinal thickness (CRT), and to apply it in the follow-up of DME patients. METHODS: Optical coherence tomography (OCT) scans of 229 eyes from 160 patients were collected. We manually annotated cystoid macular edema (CME), subretinal fluid (SRF), and the fovea as ground truths. Deep convolutional neural networks (DCNNs), including U-Net, sASPP, HRNetV2-W48, and HRNetV2-W48+Object-Contextual Representation (OCR), were constructed for fluid (CME+SRF) segmentation and fovea detection, respectively. Based on their outputs, thickness maps of CME, SRF, and retina were generated and divided by the Early Treatment Diabetic Retinopathy Study (ETDRS) grid. RESULTS: In fluid segmentation, with the best-performing DCNN and loss function, the Dice similarity coefficients (DSC) of segmentation reached 0.78 (CME), 0.82 (SRF), and 0.95 (retina). In fovea detection, the average deviation between the predicted fovea and the ground truth was 145.7±117.8 μm. The generated macular edema thickness maps are able to reveal, through intuitive morphometry and fluid volume, center-involved DME that is missed by the traditional definition of CRT>250 μm. Thickness maps could also help to reveal fluid above or below the fovea center that is ignored or underestimated by a single OCT B-scan. CONCLUSION: Compared to the traditional unidimensional indicator, CRT, 3D macular edema thickness maps display more intuitive morphometry and detailed statistics of DME, supporting more accurate diagnosis and follow-up of DME patients.
This work presents an efficient method for volume rendering of glioma tumors from segmented 2D MRI datasets with interactive user control, replacing the manual segmentation required by state-of-the-art methods. Gliomas, which evolve from the cerebral supportive cells, are the most common primary brain tumors. For clinical follow-up, evaluation of the preoperative tumor volume is essential. Tumor portions were automatically segmented from 2D MR images using morphological filtering techniques. These segmented tumor slices were propagated and modeled with the software package. The 3D modeled tumor consists of the gray-level values of the original image with the exact tumor boundary. Axial slices of FLAIR and T2-weighted images were used for extracting tumors. Volumetric assessment of tumor volume with manual segmentation of its outlines is time-consuming and prone to error. These defects are overcome in this method. We verified the performance of the method on several sets of MRI scans. For verification purposes, the 3D modeling was also done from segmented 2D slices with the medical software package 3D DOCTOR. The results were validated against the ground truth models by a radiologist.
3D segmentation is the process of dividing point cloud data into several homogeneous regions such that points in the same region share the same attributes. Segmentation of point cloud data is challenging because of substantial redundancy, fluctuating sample density, and the lack of apparent organization. The research area has a wide range of robotics applications, including intelligent vehicles, autonomous mapping, and navigation. Researchers have introduced a variety of methodologies and algorithms. As a prevailing AI method, deep learning has been successfully applied to a spectrum of 2D vision domains. However, owing to the specific problems of processing point clouds with deep neural networks, deep learning on point clouds is still in its initial stages. This study examines many of the strategies that have been proposed for 3D instance and semantic segmentation and gives a complete assessment of current developments in deep learning-based 3D segmentation. The benefits, drawbacks, and design mechanisms of these approaches are studied and discussed. The study evaluates the competitiveness of various segmentation algorithms on several publicly accessible datasets, and covers the most often used pipelines, their advantages and limits, insightful findings, and intriguing future research directions.
Nighttime navigation faces challenges from limited data and interference, especially when satellite signals are unavailable. Leveraging lunar polarized light, polarization navigation offers a promising solution for nighttime autonomous navigation. Current algorithms, however, are limited by the requirement for known horizontal attitudes, which restricts their applications. This study introduces an autonomous 3-D attitude determination method to overcome this limitation. Our approach uses the Angle of Polarization (AOP) at night to extract neutral points from the AOP pattern. This allows polarization meridian plane information to be calculated for attitude determination. Subsequently, we present an optimized Polarization TRIAD (Pol-TRIAD) algorithm to acquire the 3-D attitude. The proposed method outperforms existing approaches in outdoor experiments by achieving lower Root Mean Square Error (RMSE). For one baseline attitude, it improves pitch by 31.7%, roll by 21.7%, and yaw by 2.6%, while for the attitude with a larger tilt angle, the improvements are 64.4%, 30.4%, and 9.1%, respectively.
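For context, the classic TRIAD algorithm that Pol-TRIAD builds on recovers a rotation matrix from two non-parallel directions observed in both the body frame and a reference frame. The sketch below shows only that textbook version, not the paper's optimized variant. The function names and test vectors are hypothetical, and how the two directions are obtained from the polarization pattern is outside its scope.

```python
import math


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    norm = math.sqrt(sum(x * x for x in a))
    return tuple(x / norm for x in a)


def _triad(v1, v2):
    """Orthonormal triad built from two non-parallel vectors."""
    t1 = _unit(v1)
    t2 = _unit(_cross(v1, v2))
    t3 = _cross(t1, t2)
    return t1, t2, t3


def triad_attitude(b1, b2, r1, r2):
    """Return the 3x3 matrix A with body = A @ reference.

    b1, b2 are the two directions measured in the body frame;
    r1, r2 are the same directions expressed in the reference frame.
    """
    tb = _triad(b1, b2)
    tr = _triad(r1, r2)
    # A = sum over k of (body triad vector k) * (reference triad vector k)^T
    return [[sum(tb[k][i] * tr[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Made-up check: the body frame is the reference frame yawed by 90 degrees,
# so the reference x axis appears along body -y and reference y along body +x.
A = triad_attitude(b1=(0, -1, 0), b2=(1, 0, 0), r1=(1, 0, 0), r2=(0, 1, 0))
```

TRIAD reproduces the first vector pair exactly and uses the second pair only to fix the rotation about it, so the more trusted measurement is normally passed first.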
Funding (SFC_DeepLabv3+ grape leaf lesion segmentation study): Supported by the Zhejiang A&F University Research Development Fund (Talent Initiation Project No. 2021LFR048) and the 2023 University-Enterprise Joint Research Program (Grant No. LHYFZ2302) of the Modern Agricultural and Forestry Artificial Intelligence Industry Academy.
Funding (GGMM+3D U-Net brain tumor segmentation study): Princess Nourah Bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R826), Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia; Northern Border University, Saudi Arabia, through project number (NBU-CRP-2025-2933).
Funding: The National Science and Technology Council (NSTC) of the Republic of China, Taiwan, financially supported this research under Contract No. NSTC 112-2637-M-131-001.
Abstract: Accurate and efficient brain tumor segmentation is essential for early diagnosis, treatment planning, and clinical decision-making. However, the complex structure of brain anatomy and the heterogeneous nature of tumors present significant challenges for precise anomaly detection. While U-Net-based architectures have demonstrated strong performance in medical image segmentation, there remains room for improvement in feature extraction and localization accuracy. In this study, we propose a novel hybrid model designed to enhance 3D brain tumor segmentation. The architecture incorporates a 3D ResNet encoder, known for mitigating the vanishing gradient problem, and a 3D U-Net decoder. Additionally, to enhance the model’s generalization ability, a Squeeze-and-Excitation attention mechanism is integrated. We introduce Gabor filter banks into the encoder to further strengthen the model’s ability to extract robust and transformation-invariant features from the complex and irregular shapes typical in medical imaging. This approach, which is not well explored in current U-Net-based segmentation frameworks, provides a unique advantage by enhancing texture-aware feature representation. Specifically, Gabor filters help extract distinctive low-level texture features, reducing the effects of texture interference and facilitating faster convergence during the early stages of training. Our model achieved Dice scores of 0.881, 0.846, and 0.819 for Whole Tumor (WT), Tumor Core (TC), and Enhancing Tumor (ET), respectively, on the BraTS 2020 dataset. Cross-validation on the BraTS 2021 dataset further confirmed the model’s robustness, yielding Dice scores of 0.887 for WT, 0.856 for TC, and 0.824 for ET. The proposed model outperforms several existing state-of-the-art models, particularly in accurately identifying small and complex tumor regions. Extensive evaluations suggest that integrating advanced preprocessing with an attention-augmented hybrid architecture offers significant potential for reliable and clinically valuable brain tumor segmentation.
Abstract: 3D object recognition is a challenging task for intelligent and robotic systems in industrial and home indoor environments. It is critical for such systems to recognize and segment the 3D object instances that they encounter frequently. The computer vision, graphics, and machine learning fields have all given the problem considerable attention. Traditionally, 3D segmentation relied on hand-crafted features and designed approaches that did not achieve acceptable performance and could not be generalized to large-scale data. Deep learning approaches have lately become the preferred method for 3D segmentation, owing to their great success in 2D computer vision. However, the task of instance segmentation is currently less explored. In this paper, we propose a novel deep learning approach for efficient 3D instance segmentation using red, green, blue, and depth (RGB-D) data. The 2D region-based convolutional neural network (Mask R-CNN) deep learning model with a point-based rendering module is adapted to integrate depth information in order to recognize and segment 3D instances of objects. To generate 3D point cloud coordinates (x, y, z), segmented 2D pixels (u, v) of recognized object regions in the RGB image are merged with the (u, v) points of the depth image. Moreover, we conducted experiments and analysis to evaluate the proposed method from various viewpoints and distances. The experiments show that the proposed 3D object recognition and instance segmentation are sufficiently effective to support object handling in robotic and intelligent systems.
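The step of turning segmented pixels (u, v) plus depth into 3D coordinates (x, y, z) can be illustrated with the standard pinhole back-projection. This is a sketch under stated assumptions rather than the paper's implementation: the abstract does not give the camera model, so the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the function names, and the rule that a depth of 0 means "no measurement" are all hypothetical.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Map one pixel (u, v) with metric depth to camera-frame (x, y, z).

    Assumes an ideal pinhole camera with focal lengths fx, fy and
    principal point (cx, cy), all expressed in pixels.
    """
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


def mask_to_points(mask_pixels, depth_image, fx, fy, cx, cy):
    """Convert the (u, v) pixels of one segmented instance into 3D points."""
    points = []
    for u, v in mask_pixels:
        d = depth_image[v][u]  # row index is v, column index is u
        if d > 0:  # assumed convention: 0 marks a missing depth reading
            points.append(back_project(u, v, d, fx, fy, cx, cy))
    return points


# Toy 2x2 depth image (metres) and a three-pixel instance mask.
depth_image = [[1.0, 0.0],
               [2.0, 2.0]]
mask_pixels = [(0, 0), (1, 0), (1, 1)]
points = mask_to_points(mask_pixels, depth_image, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
print(points)
```

The sketch assumes the RGB and depth images are already registered to the same pixel grid; with a real sensor that alignment has to be done first.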
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 61175084, 61673042 and 62203046) and the China Postdoctoral Science Foundation (No. 2022M713006).
Abstract: To address the Unmanned Aerial Vehicle (UAV) formation collision avoidance problem in Three-Dimensional (3-D) low-altitude environments with dense and varied obstacles, this study proposes a fluid-based path planning framework named the Formation Interfered Fluid Dynamical System (FIFDS) with a Moderate Evasive Maneuver Strategy (MEMS). First, the UAV formation collision avoidance problem, including quantifiable performance indexes, is formulated. Second, inspired by the phenomenon of fluids continuously flowing while bypassing objects, the FIFDS for multiple UAVs is presented, which contains a Parallel Streamline Tracking (PST) method for formation keeping and the traditional IFDS for collision avoidance. Third, to rationally balance flight safety and collision avoidance cost, MEMS is proposed to generate moderate evasive maneuvers that match the collision risks. The 3-D dynamic collision regions, which comprehensively contain the time and distance safety information, are modeled for collision prediction. Then, the moderate evasive maneuver principle is refined, which provides criteria for the maneuver amplitude and direction. On this basis, an analytical parameter mapping mechanism is designed to optimize IFDS parameters online. Finally, the performance of the proposed method is validated by comparative simulation results and real flight experiments using fixed-wing UAVs.
Funding: National Natural Science Foundation of China (Nos. 61941109, 62061023); Distinguished Young Scholars of Gansu Province of China (No. 21JR7RA345).
Abstract: In the study of automatic driving, understanding the road scene is key to improving driving safety. Semantic segmentation divides the image at the pixel level into areas associated with semantic categories, helping vehicles perceive the surrounding road environment and thereby improving driving safety. Deeplabv3+ is a currently popular semantic segmentation model. However, it tends to miss small targets and misjudge similar objects, which leads to rough segmentation boundaries and reduces semantic accuracy. To address this issue, this study builds on the Deeplabv3+ network structure and combines it with an attention mechanism to increase the weight of the segmentation area, proposing an improved Deeplabv3+ road scene semantic segmentation method that fuses attention mechanisms. First, a parallel position attention module and channel attention module are introduced at the Deeplabv3+ encoding end to capture more spatial context information and high-level semantic information. Then, at the decoding end, an attention mechanism is introduced to restore spatial detail information, and the data are normalized to accelerate model convergence. Models with different attention mechanisms are compared and tested on the CamVid and Cityscapes datasets. The experimental results show that the mean Intersection over Union (mIoU) of the improved model on the two datasets is boosted by 6.88% and 2.58%, respectively, compared with Deeplabv3+. The method does not significantly increase the network's computation or complexity, and achieves a good balance of speed and accuracy.
Funding: Supported by the National Natural Science Foundation of China (61876106); Shanghai Local Capacity-Building Project (19030501200).
Abstract: Numerous automatic fabric defect detection algorithms have been proposed. Traditional machine vision algorithms set separate parameters for different textures and defects and rely on manually designed features to complete the detection. To overcome these limitations, deep learning-based algorithms can extract more complex image features and perform better in image classification and object detection. This paper proposes a pixel-level defect segmentation methodology using DeepLabv3+, a classical semantic segmentation network. Three DeepLabv3+ networks are constructed, based on ResNet-18, ResNet-50 and MobileNetv2, and are trained and tested on datasets built from captured or publicly available images. The experimental results show that the three DeepLabv3+ networks perform similarly on the four indicators used (Precision, Recall, F1-score and Accuracy), demonstrating that they achieve defect detection and semantic segmentation and providing new ideas and technical support for fabric defect detection.
Abstract: A novel technique for three-dimensional (3D) reconstruction, segmentation, display and analysis of serial image slices was developed to perform simultaneous measurement of the structure and function of biomedical samples. The image series include microscopic wide-field optical sections obtained by deconvolution, cryo-electron microscope slices obtained by Fourier-Bessel synthesis and electron tomography (ET), and computed tomography (CT) series. The paper presents the 3D reconstruction, segmentation, display and analysis results for pollen spore, chaperonin, virus, head, cervical bone, tibia and carpus. It also puts forward some potential applications of the new technique in the biomedical realm.
Abstract: The 3D Otsu method was developed to overcome the shortcomings of the 1D and 2D Otsu thresholding techniques. Among all Otsu methods, the 3D Otsu technique provides the best threshold values for multilevel thresholding. In this paper, a simple and effective multilevel thresholding method is introduced to improve the quality of segmented images. The proposed approach focuses on preserving edge detail by combining 3D Otsu thresholding with image fusion. The advantages of the presented scheme include higher-quality outcomes, better preservation of tiny details and boundaries, and reduced execution time as the number of threshold levels rises. The fusion approach depends on the differences between pixel intensity values within a small local region of an image; it aims to improve localized information after the thresholding process. Fusing images based on local contrast can improve segmentation performance by minimizing the loss of local contrast, details and gray-level distributions. Results show that the proposed method yields more promising segmentation results than the conventional 1D Otsu, 2D Otsu and 3D Otsu methods, as evident from the objective and subjective evaluations.
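For orientation, the classic 1D Otsu criterion that the 2D and 3D variants extend picks the gray level maximizing the between-class variance of the histogram. The sketch below shows only that single-threshold 1D baseline, not the 3D Otsu or fusion scheme of the paper; the function name `otsu_threshold` and the convention that pixels less than or equal to the threshold form the background class are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance (1D Otsu).

    `pixels` is a flat list of integer intensities in [0, levels).
    Pixels <= t form class 0, pixels > t form class 1 (assumed convention).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0 so far
    sum0 = 0  # intensity sum of class 0 so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Toy bimodal image: half dark (10), half bright (200).
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
print(t, sum(binary))
```

When several gray levels give the same variance, the strict `>` comparison keeps the lowest one, which is why the toy example settles on the dark mode's value.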
Funding: Supported by the National Natural Science Foundation of China (No. 41604107) and the Scientific Research Staring Foundation of University of Electronic Science and Technology of China (No. ZYGX2015KYQD049).
Abstract: 3D modeling of geological bodies based on 3D seismic data is used to define the shape and volume of the bodies, which can then be directly applied to reservoir prediction, reserve estimation, and exploration. However, multiple attributes are not used effectively in 3D modeling. To solve this problem, we propose a novel method for building 3D models of geological anomalies based on segmentation with multiattribute fusion. First, we divide the seismic attributes into edge-based and region-based seismic attributes. Then, a segmentation model incorporating the edge-based and region-based models is constructed within a level-set-based framework. Finally, the marching cubes algorithm is adopted to extract the zero level set from the segmentation results and build the 3D model of the geological anomaly. By combining the edge-based and region-based attributes in the segmentation model, we satisfy the independence requirement and avoid the problem that a single seismic attribute provides insufficient data to capture the boundaries of geological anomalies. We apply the proposed method to seismic data from the Sichuan Basin in southwestern China and obtain 3D models of caves and channels. Compared with 3D models obtained from single seismic attributes, the results agree better with reality.
Funding: Supported by National Science Foundation of China (No. 81800878); Interdisciplinary Program of Shanghai Jiao Tong University (No. YG2017QN24); Key Technological Research Projects of Songjiang District (No. 18sjkjgg24); Bethune Langmu Ophthalmological Research Fund for Young and Middle-aged People (No. BJ-LM2018002J).
Abstract: AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. To address the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function of the 2D fully convolutional network were modified. This network, however, ignores the spatial correlations between corresponding positions in adjacent images. Thus, we proposed a three-dimensional (3D) fully convolutional network for segmentation of retinal OCT images. RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy is 99.56%, the Kappa coefficient is 98.47%, and the F1 score for retinal fluid is 95.50%. CONCLUSION: OCT image segmentation algorithms based on deep learning are primarily founded on 2D convolutional networks. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors, and can provide doctors with more accurate diagnostic data.
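The three evaluation measures named in the RESULTS (accuracy, Kappa coefficient, F1 score) can all be derived from a pixel-level confusion matrix. The following is a sketch using the usual definitions, with Cohen's kappa assumed to be the Kappa coefficient meant here; the function name `accuracy_kappa_f1`, the matrix layout, and the toy counts are illustrative assumptions, not data from the study.

```python
def accuracy_kappa_f1(confusion, positive=1):
    """Compute overall accuracy, Cohen's kappa and the F1 score of one class.

    confusion[i][j] = number of pixels whose true class is i and whose
    predicted class is j. `positive` selects the class scored by F1.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement = overall accuracy.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement from the row (true) and column (predicted) marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)  # undefined if expected == 1

    tp = confusion[positive][positive]
    fp = sum(confusion[i][positive] for i in range(n)) - tp
    fn = sum(confusion[positive]) - tp
    f1 = 2 * tp / (2 * tp + fp + fn)
    return observed, kappa, f1


# Toy two-class example (class 1 = fluid): 100 pixels in total.
confusion = [[40, 10],
             [10, 40]]
acc, kappa, f1 = accuracy_kappa_f1(confusion)
print(acc, kappa, f1)
```

Kappa discounts the agreement expected by chance, which is why it is often reported next to accuracy when one class (here, background) dominates the image.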
Abstract: With the widespread application of deep learning in computer vision, using medical image technology to assist doctors in making diagnoses has great practical and research significance. To address the shortcomings of the traditional U-Net model in 3D spatial information extraction, over-fitting, and the low degree of semantic information fusion, an improved medical image segmentation model is used to achieve more accurate segmentation of medical images. In this model, we make full use of the residual network (ResNet) to solve the over-fitting problem. To process and aggregate data at different scales, the inception network is used instead of the traditional convolutional layer, and dilated convolution is used to increase the receptive field. The conditional random field (CRF) performs the contour refinement. Compared with the traditional 3D U-Net network, the improved model increases the segmentation accuracy of liver and tumor images by 2.89% and 7.66%, respectively. As part of the image processing pipeline, the method in this paper can be used not only for medical image segmentation but also to lay the foundation for subsequent 3D image reconstruction.
Abstract: AIM: To explore a more accurate quantitative diagnosis method for diabetic macular edema (DME) by displaying detailed 3D morphometry beyond the gold-standard quantification indicator, central retinal thickness (CRT), and to apply it in the follow-up of DME patients. METHODS: Optical coherence tomography (OCT) scans of 229 eyes from 160 patients were collected. We manually annotated cystoid macular edema (CME), subretinal fluid (SRF) and fovea as ground truths. Deep convolution neural networks (DCNNs) were constructed, including U-Net, sASPP, HRNetV2-W48, and HRNetV2-W48+Object-Contextual Representation (OCR), for fluid (CME+SRF) segmentation and fovea detection respectively, based on which the thickness maps of CME, SRF and retina were generated and divided by the Early Treatment Diabetic Retinopathy Study (ETDRS) grid. RESULTS: In fluid segmentation, with the best DCNN and loss function, the dice similarity coefficients (DSC) of segmentation reached 0.78 (CME), 0.82 (SRF), and 0.95 (retina). In fovea detection, the average deviation between the predicted fovea and the ground truth reached 145.7±117.8 μm. The generated macular edema thickness maps are able to discover center-involved DME by intuitive morphometry and fluid volume, which is ignored by the traditional definition of CRT>250 μm. Thickness maps could also help to discover fluid above or below the fovea center that is ignored or underestimated by a single OCT B-scan. CONCLUSION: Compared to the traditional unidimensional indicator CRT, 3D macular edema thickness maps are able to display more intuitive morphometry and detailed statistics of DME, supporting more accurate diagnoses and follow-up of DME patients.
Abstract: This work presents an efficient method for volume rendering of glioma tumors from segmented 2D MRI datasets with interactive user control, replacing the manual segmentation required in state-of-the-art methods. The most common primary brain tumors are gliomas, evolving from the cerebral supportive cells. For clinical follow-up, the evaluation of the preoperative tumor volume is essential. Tumor portions were automatically segmented from 2D MR images using morphological filtering techniques. These segmented tumor slices were propagated and modeled with the software package. The 3D modeled tumor consists of gray level values of the original image with the exact tumor boundary. Axial slices of FLAIR and T2 weighted images were used for extracting tumors. Volumetric assessment of a tumor with manual segmentation of its outlines is a time-consuming process and is prone to error. These defects are overcome in this method. We verified the performance of our method on several sets of MRI scans. The 3D modeling was also done using segmented 2D slices with the help of a medical software package called 3D DOCTOR for verification purposes. The results were validated against the ground truth models by the radiologist.
Funding: This research was supported by the BB21 plus funded by Busan Metropolitan City and Busan Institute for Talent and Lifelong Education (BIT) and a grant from Tongmyong University Innovated University Research Park (I-URP) funded by Busan Metropolitan City, Republic of Korea.
Abstract: 3D segmentation is the process of dividing point cloud data into several homogeneous areas, such that points in the same region share the same attributes. Segmentation is challenging with point cloud data due to substantial redundancy, fluctuating sample density and lack of apparent organization. The research area has a wide range of robotics applications, including intelligent vehicles, autonomous mapping and navigation. A number of researchers have introduced various methodologies and algorithms. Deep learning has been successfully applied to a spectrum of 2D vision domains as a prevailing AI method. However, due to the specific problems of processing point clouds with deep neural networks, deep learning on point clouds is still in its initial stages. This study examines many strategies that have been proposed for 3D instance and semantic segmentation and gives a complete assessment of current developments in deep learning-based 3D segmentation. The benefits, drawbacks, and design mechanisms of these approaches are studied and discussed. This study evaluates the competitiveness of various segmentation algorithms on publicly accessible datasets, as well as the most often used pipelines, their advantages and limits, insightful findings and intriguing future research directions.
Funding: Supported in part by the National Key Research and Development Program of China (Nos. 2020YFA0711200, 2022YFB4701301), in part by the Defense Industrial Technology Development Program, China (No. JCKY2021601B016), in part by the Fundamental Research Funds for the Central Universities, China (No. YWF-23-JC-07), and in part by the National Natural Science Foundation of China (No. 62425302).
Abstract: Nighttime navigation faces challenges from limited data and interference, especially when satellite signals are unavailable. Leveraging lunar polarized light, polarization navigation offers a promising solution for nighttime autonomous navigation. Current algorithms, however, are limited by the requirement for known horizontal attitudes, restricting applications. This study introduces an autonomous 3-D attitude determination method to overcome this limitation. Our approach utilizes the Angle of Polarization (AOP) at night to extract neutral points from the AOP pattern. This allows for the calculation of polarization meridian plane information for attitude determination. Subsequently, we present an optimized Polarization TRIAD (Pol-TRIAD) algorithm to acquire the 3-D attitude. The proposed method outperforms the existing approaches in outdoor experiments by achieving lower Root Mean Square Error (RMSE). For one baseline attitude, it improves pitch by 31.7%, roll by 21.7%, and yaw by 2.6%, while for the attitude with a larger tilt angle, the improvements are 64.4%, 30.4%, and 9.1%, respectively.
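As background for the Pol-TRIAD step, the classic TRIAD algorithm recovers a rotation matrix from two non-parallel vector observations known in both the body frame and the reference frame. The sketch below implements only that textbook TRIAD, not the optimized Pol-TRIAD of the paper, and how the polarization measurements are turned into the two vectors is not covered; all function names and the test vectors are hypothetical.

```python
import math


def _unit(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _triad_frame(v1, v2):
    """Build the orthonormal triad (t1, t2, t3) from two non-parallel vectors."""
    t1 = _unit(v1)
    t2 = _unit(_cross(v1, v2))
    t3 = _cross(t1, t2)
    return [t1, t2, t3]


def triad(b1, b2, r1, r2):
    """Return the 3x3 rotation R (nested lists) such that R applied to r gives b.

    b1, b2 are the observations in the body frame; r1, r2 are the same
    directions in the reference frame. The first pair is trusted most.
    """
    tb = _triad_frame(b1, b2)
    tr = _triad_frame(r1, r2)
    # R = sum over k of tb[k] * tr[k]^T (outer products of matching axes).
    return [[sum(tb[k][i] * tr[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Toy check: body frame rotated 90 degrees about z relative to the reference.
r1, r2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
b1, b2 = [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]
R = triad(b1, b2, r1, r2)
print(R)
```

With noise-free inputs the result is exact; with noisy inputs TRIAD matches the first vector pair exactly and uses the second only to fix the rotation about it, so the more accurate observation is normally passed first.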