Accurate segmentation of breast cancer in mammogram images plays a critical role in early diagnosis and treatment planning. As research in this domain continues to expand, various segmentation techniques have been proposed across classical image processing, machine learning (ML), deep learning (DL), and hybrid/ensemble models. This study conducts a systematic literature review using the PRISMA methodology, analyzing 57 selected articles to explore how these methods have evolved and been applied. The review highlights the strengths and limitations of each approach, identifies commonly used public datasets, and observes emerging trends in model integration and clinical relevance. By synthesizing current findings, this work provides a structured overview of segmentation strategies and outlines key considerations for developing more adaptable and explainable tools for breast cancer detection. Overall, our synthesis suggests that classical and ML methods are suitable for limited labels and computing resources, while DL models are preferable when pixel-level annotations and resources are available, and hybrid pipelines are most appropriate when fine-grained clinical precision is required.
The northern segment of the North-South Seismic Belt is characterized by intense crustal deformation, well-developed active tectonics, and frequent occurrences of strong earthquakes. Therefore, conducting a Probabilistic Seismic Hazard Analysis (PSHA) for this region is of significant importance for supporting seismic fortification in major engineering projects and formulating disaster prevention and mitigation policies. In this study, a composite seismic source model was constructed by integrating data on historical earthquakes, active faults, and paleoseismicity. Furthermore, a logic tree framework was employed to quantify epistemic uncertainties, enabling a systematic seismic hazard assessment of the region. To more accurately characterize the spatial heterogeneity of seismic activity, improvements were made to both the Circular Spatial Smoothing Model (CSSM) with a fixed radius and the Adaptive Spatial Smoothing Model (ASSM), with full consideration given to the spatiotemporal completeness of historical earthquake magnitudes. Regarding the CSSM, for scenarios involving small sample sizes in earthquake catalogs, the cross-validation method proposed in this study demonstrated higher robustness than the maximum likelihood method in determining the optimal correlation distance. Performance evaluation results indicate that while both models effectively characterize seismic activity, the ASSM exhibits superior overall predictive performance compared to the CSSM, owing to its ability to adaptively adjust the smoothing radius according to seismic density. Significant discrepancies were observed in the Peak Ground Acceleration (PGA) results calculated with a 10% probability of exceedance in 50 years across different combinations of seismic source models. The single spatially smoothed point-source model yielded a maximum PGA of approximately 0.52 g, with high-value areas concentrated near historical epicenters, thereby significantly underestimating the hazard associated with major fault zones. When combined with the simple fault-source model, the maximum PGA increased to 0.8 g, with high-value zones exhibiting a zonal distribution along faults; however, the risk remained underestimated for faults with low slip rates that are nevertheless approaching their recurrence cycles. Following the introduction of the time-dependent characteristic fault-source model, local PGA values for faults in the middle-to-late stages of their recurrence cycles increased by a factor of 2 to 7 compared to the single model. These results demonstrate that the characteristic fault-source model reasonably delineates the time-dependence of large earthquake recurrence, thereby providing a more accurate assessment of imminent seismic risks. By comprehensively applying the improved spatially smoothed point-source model, the simple fault-source model, and the characteristic fault-source model, the following faults within the region were identified as having high seismic hazard: the Huangxianggou, Zhangxian, and Tianshui segments of the Xiqinling northern edge fault; the Maqin-Maqu segment of the Dongkunlun fault; the Longriqu fault; the Maoergai fault; the Elashan fault; the Riyueshan fault; the eastern segment of the Lenglongling fault; the Maxianshan segment of the Maxianshan northern margin fault; and the Maomaoshan-Jinqianghe segment of the Laohushan-Maomaoshan fault. As these faults are located within seismic gaps or are approaching the recurrence periods of large earthquakes, they should be prioritized for current and future seismic monitoring as well as disaster prevention and mitigation efforts.
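The abstract does not give the smoothing formulas; as a rough illustration of the kind of Gaussian-kernel seismicity smoothing that fixed-radius point-source models such as the CSSM build on (the grid spacing and correlation distance below are made-up values, and the real models also weight by catalog completeness), a minimal sketch:

```python
import numpy as np

def smooth_rates(counts, cell_km, c_km):
    """Gaussian-kernel smoothing of gridded earthquake counts
    (Frankel-style): each cell's smoothed rate is a distance-weighted
    average of all cells' counts,
        n_i = sum_j N_j * K_ij / sum_j K_ij,  K_ij = exp(-d_ij^2 / c^2),
    where c is the correlation distance."""
    ny, nx = counts.shape
    ys, xs = np.mgrid[0:ny, 0:nx]
    coords = np.column_stack([ys.ravel(), xs.ravel()]) * cell_km
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    k = np.exp(-d2 / c_km**2)
    return (k @ counts.ravel() / k.sum(axis=1)).reshape(ny, nx)
```

Choosing `c_km` is exactly the correlation-distance selection problem the study addresses with cross-validation instead of maximum likelihood.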
Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information in medical imaging and faces the challenge of losing local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address the issues above. Firstly, based on the SAM, a local multi-scale feature encoder is introduced to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce the quadratic complexity and noise interference. Subsequently, a cross-branch balancing adapter is designed to balance the local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to obtain a smaller input image size and to mitigate overlapping in patch embeddings, the size of the input image is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, which includes feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates the information from small-sized medical images into SAM, enhancing its suitability for clinical deployment. The proposed model demonstrates an average enhancement ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets compared to eight other representative image segmentation models. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
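One way to see why the 1024×1024 to 256×256 reduction matters: with a ViT patch embedding (patch size 16, as in SAM's ViT-B backbone; an assumption here, not stated in the abstract), shrinking the input side by 4× cuts the token grid by 16×. A minimal sketch of non-overlapping patch extraction:

```python
import numpy as np

def to_patches(img, p):
    """Split an HxW image into non-overlapping pxp patches,
    returning an array of shape (H//p * W//p, p*p): one flattened
    token per patch, as a ViT patch-embedding layer would consume."""
    h, w = img.shape
    assert h % p == 0 and w % p == 0
    t = img.reshape(h // p, p, w // p, p).swapaxes(1, 2)
    return t.reshape(-1, p * p)

# token count drops 16x when the input side shrinks 4x
n_1024 = to_patches(np.zeros((1024, 1024)), 16).shape[0]  # 4096 tokens
n_256 = to_patches(np.zeros((256, 256)), 16).shape[0]     # 256 tokens
```

Since self-attention cost grows quadratically in token count, the smaller grid is a large saving, which is also the complexity the MHPM targets.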
The quantitative analysis of dispersed phases (bubbles, droplets, and particles) in multiphase flow systems represents a persistent technological challenge in petroleum engineering applications, including CO2-enhanced oil recovery, foam flooding, and unconventional reservoir development. Current characterization methods remain constrained by labor-intensive manual workflows and limited dynamic analysis capabilities, particularly for processing large-scale microscopy data and video sequences that capture critical transient behavior such as gas cluster migration and droplet coalescence. These limitations hinder the establishment of robust correlations between pore-scale flow patterns and reservoir-scale production performance. This study introduces a novel computer vision framework that integrates foundation models with lightweight neural networks to address these industry challenges. Leveraging the segment anything model's zero-shot learning capability, we developed an automated workflow that achieves an efficiency improvement of approximately 29 times in bubble labeling compared to manual methods while maintaining less than 2% deviation from expert annotations. Engineering-oriented optimization ensures lightweight deployment with 94% segmentation accuracy, while the integrated quantification system precisely resolves gas saturation, shape factors, and interfacial dynamics, parameters critical for optimizing gas injection strategies and predicting phase redistribution patterns. Validated through microfluidic gas-liquid displacement experiments for discontinuous phase segmentation accuracy, this methodology enables precise bubble morphology quantification with broad application potential in multiphase systems, including emulsion droplet dynamics characterization and particle transport behavior analysis. This work bridges the critical gap between pore-scale dynamics characterization and reservoir-scale simulation requirements, providing a foundational framework for intelligent flow diagnostics and predictive modeling in next-generation digital oilfield systems.
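The quantification parameters mentioned above can be made concrete. A hedged sketch, not the paper's implementation: gas saturation as the bubble-pixel fraction of a binary segmentation mask, and a circularity-style shape factor 4πA/P² (1.0 for a perfect circle), with the perimeter roughly approximated by counting 4-connected boundary pixels:

```python
import numpy as np

def gas_saturation(mask):
    """Fraction of the field of view occupied by segmented bubbles."""
    return mask.sum() / mask.size

def circularity(mask):
    """Shape factor 4*pi*A / P^2; the perimeter P is estimated by
    counting mask pixels with at least one background 4-neighbor."""
    area = mask.sum()
    pad = np.pad(mask, 1)
    interior = (pad[:-2, 1:-1] & pad[2:, 1:-1]
                & pad[1:-1, :-2] & pad[1:-1, 2:])
    perimeter = (mask & ~interior).sum()
    return 4 * np.pi * area / perimeter**2
```

A square patch scores well below 1.0 while a round bubble approaches it, which is how such a factor separates elongated gas clusters from spherical bubbles.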
Efficient segmentation of oiled pixels in optical remotely sensed images is the precondition of optical identification and classification of different spilled oils, which remains one of the keys to optical remote sensing of oil spills. Optical remotely sensed images of oil spills are inherently multidimensional and embedded with a complex knowledge framework. This complexity often hinders the effectiveness of mechanistic algorithms across varied scenarios. Although optical remote-sensing theory for oil spills has advanced, the scarcity of curated datasets and the difficulty of collecting them limit their usefulness for training deep learning models. This study introduces a data expansion strategy that utilizes the Segment Anything Model (SAM), effectively bridging the gap between traditional mechanism algorithms and emergent self-adaptive deep learning models. Optical dimension reduction is achieved through standardized preprocessing steps that address the decipherable properties of the input image. After preprocessing, SAM can swiftly and accurately segment spilled oil in images. The unified AI-based workflow significantly accelerates labeled-dataset creation and has proven effective for both rapid emergency intelligence during spill incidents and the rapid mapping and classification of oil footprints across China’s coastal waters. Our results show that coupling a remote sensing mechanism with a foundation model enables near-real-time, large-scale monitoring of complex surface slicks and offers guidance for the next generation of detection and quantification algorithms.
Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple transformations of images. Later, in order to increase the diversity and complexity of data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods require substantial computation for training or searching. In this paper, a novel training-free method that utilises the pre-trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM-DA) is proposed to generate augmented annotations for images. Without the need for training, it obtains prompt boxes from the original annotations and then feeds the boxes to the pre-trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than by simple manipulations, without incurring the huge computation of training a data augmentation model. Multiple comparative experiments are conducted on three datasets: an in-house dataset, ADE20K, and COCO2017. On the in-house dataset, namely the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM prove promising not only in image segmentation but also in data augmentation.
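PTSAM-DA's exact box-derivation procedure is not spelled out in the abstract; one plausible minimal sketch is to take the tight bounding box of each binary annotation mask (the `pad` margin is a hypothetical parameter, not from the paper) and pass it to SAM as a box prompt in XYXY format:

```python
import numpy as np

def box_prompt(mask, pad=2):
    """Tight bounding box around a binary annotation mask, expanded
    by `pad` pixels and clipped to the image, in the (x0, y0, x1, y1)
    format SAM's box prompts use. The resulting box would then be fed
    to a SamPredictor, e.g. predictor.predict(box=box_prompt(mask)),
    to regenerate a (possibly improved) mask."""
    ys, xs = np.nonzero(mask)
    h, w = mask.shape
    return np.array([max(xs.min() - pad, 0), max(ys.min() - pad, 0),
                     min(xs.max() + pad, w - 1), min(ys.max() + pad, h - 1)])
```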
In this paper, we introduce an innovative method for computer-aided design (CAD) segmentation by concatenating meshes and CAD models. Many previous CAD segmentation methods have achieved impressive performance using single representations, such as meshes, CAD, and point clouds. However, existing methods cannot effectively combine different three-dimensional model types for the direct conversion, alignment, and integrity maintenance of geometric and topological information. Hence, we propose an integration approach that combines the geometric accuracy of CAD data with the flexibility of mesh representations, as well as introduce a unique hybrid representation that combines CAD and mesh models to enhance segmentation accuracy. To combine these two model types, our hybrid system utilizes advanced neural-network techniques to convert CAD models into mesh models. For complex CAD models, model segmentation is crucial for model retrieval and reuse; in partial retrieval, it aims to segment a complex CAD model into several simple components. The first component of our hybrid system involves advanced mesh-labeling algorithms that harness the digitization of CAD properties to mesh models. The second component integrates labelled face features for CAD segmentation by leveraging the abundant multisemantic information embedded in CAD models. This combination of mesh and CAD not only refines the accuracy of boundary delineation but also provides a comprehensive understanding of the underlying object semantics. This study uses the Fusion 360 Gallery dataset. Experimental results indicate that our hybrid method can segment these models with higher accuracy than other methods that use single representations.
Segmenting skin lesions is critical for early skin cancer detection. Existing CNN and Transformer-based methods face challenges such as high computational complexity and limited adaptability to variations in lesion sizes. To overcome these limitations, we introduce MSAMamba-UNet, a lightweight model that integrates two novel architectures: Multi-Scale Mamba (MSMamba) and the Adaptive Dynamic Gating Block (ADGB). MSMamba utilizes multi-scale decomposition and a parallel hierarchical structure to enhance the delineation of irregular lesion boundaries and sensitivity to small targets. ADGB dynamically selects convolutional kernels with varying receptive fields based on input features, improving the model’s capacity to accommodate diverse lesion textures and scales. Additionally, we introduce a Mix Attention Fusion Block (MAF) to enhance shallow feature representation by integrating parallel channel and pixel attention mechanisms. Extensive evaluation of MSAMamba-UNet on the ISIC 2016, ISIC 2017, and ISIC 2018 datasets demonstrates competitive segmentation accuracy with only 0.056 M parameters and 0.069 GFLOPs. Our experiments revealed that MSAMamba-UNet achieved IoU scores of 85.53%, 85.47%, and 82.22%, as well as DSC scores of 92.20%, 92.17%, and 90.24%, respectively. These results underscore the lightweight design and effectiveness of MSAMamba-UNet.
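IoU and DSC, the two metrics reported above, are standard and can be computed from binary masks as follows (a generic sketch, not the paper's evaluation code):

```python
import numpy as np

def iou(pred, gt):
    """Intersection over union of two binary masks: |A ∩ B| / |A ∪ B|."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

def dice(pred, gt):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    inter = np.logical_and(pred, gt).sum()
    return 2 * inter / (pred.sum() + gt.sum())
```

The two are monotonically related (Dice = 2·IoU / (1 + IoU)), which is why papers often report both together.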
Microdispersion technology is crucial for a variety of applications in both the chemical and biomedical fields. The precise and rapid characterization of microdroplets and microbubbles is essential for research as well as for optimizing and controlling industrial processes. Traditional methods often rely on time-consuming manual analysis. Although some deep learning-based computer vision methods have been proposed for automated identification and characterization, these approaches often rely on supervised learning, which requires labeled data for model training. This dependency on labeled data can be time-consuming and expensive, especially when working with large and complex datasets. To address these challenges, we propose Micro Flow SAM, an innovative, motion-prompted, annotation-free, and training-free instance segmentation approach. By utilizing the motion of microdroplets and microbubbles as prompts, our method directs large-scale vision models to perform accurate instance segmentation without the need for annotated data or model training. This approach eliminates the need for human intervention in data labeling and reduces computational costs, significantly streamlining the data analysis process. We demonstrate the effectiveness of Micro Flow SAM across 12 diverse datasets, achieving outstanding segmentation results that are competitive with traditional methods. This novel approach not only accelerates the analysis process but also establishes a foundation for efficient process control and optimization in microfluidic applications. Micro Flow SAM represents a breakthrough in reducing the complexities and resource demands of instance segmentation, enabling faster insights and advancements in the microdispersion field.
Objective This study aimed to explore a novel method that integrates segmentation-guided classification and diffusion-model augmentation to realize the automatic classification of tibial plateau fractures (TPFs). Methods YOLOv8n-cls was used to construct a baseline model on data from 3781 patients from the Orthopedic Trauma Center of Wuhan Union Hospital. Additionally, a segmentation-guided classification approach was proposed. To enhance the dataset, a diffusion model was further employed for data augmentation. Results The novel method that integrated segmentation-guided classification and diffusion-model augmentation significantly improved the accuracy and robustness of fracture classification. The average accuracy of classification for TPFs rose from 0.844 to 0.896. The comprehensive performance of the dual-stream model was also significantly enhanced after many rounds of training, with both the macro-area under the curve (AUC) and the micro-AUC increasing from 0.94 to 0.97. By utilizing diffusion-model augmentation and segmentation map integration, the model demonstrated superior efficacy in identifying Schatzker I, achieving an accuracy of 0.880. It yielded an accuracy of 0.898 for Schatzker II and III and 0.913 for Schatzker IV; for Schatzker V and VI, the accuracy was 0.887; and for intercondylar ridge fractures, the accuracy was 0.923. Conclusion The dual-stream attention-based classification network, which has been verified by many experiments, exhibited great potential in predicting the classification of TPFs. This method facilitates automatic TPF assessment and may assist surgeons in the rapid formulation of surgical plans.
Background: Traditional imaging approaches to keratoconus (KCN) have thus far failed to produce a standardized approach for diagnosis. While many diagnostic modalities and metrics exist, none have proven robust enough to be considered a gold standard. This study aims to introduce novel metrics to differentiate between KCN and healthy corneas using three-dimensional (3D) measurements of surface area and volume. Methods: This retrospective observational study examined KCN patients along with healthy control patients between the ages of 20 and 79 years old at the University of Maryland, Baltimore. The selected patients underwent a nine-line raster scan anterior segment optical coherence tomography (AS-OCT). ImageJ was used to determine the central 6 mm of each image, and each corneal image was then divided into six 1 mm segments. Free-D software was then used to render the nine different images into a 3D model to calculate corneal surface area and volume. A two-tailed Mann-Whitney test was used to assess statistical significance when comparing these subsets. Results: Thirty-three eyes with KCN, along with 33 healthy controls, were enrolled. There were statistically significant differences between the healthy and KCN groups in anterior corneal surface area (13.927 vs. 13.991 mm², P=0.046), posterior corneal surface area (14.045 vs. 14.173 mm², P<0.001), and volume (8.430 vs. 7.773 mm³, P<0.001) within the central 6 mm. Conclusions: 3D corneal models derived from AS-OCT can be used to measure anterior corneal surface area, posterior corneal surface area, and corneal volume. All three parameters are statistically different between corneas with KCN and healthy corneas. Further study and application of these parameters may yield new methodologies for the detection of KCN.
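The group comparison above uses a two-tailed Mann-Whitney test, which with SciPy is a one-liner (the sample values below are hypothetical illustrations, not the study's data):

```python
from scipy.stats import mannwhitneyu

# hypothetical anterior surface-area samples (mm^2) for two small groups
healthy = [13.90, 13.92, 13.95, 13.93, 13.91]
kcn = [14.10, 14.15, 14.20, 14.12, 14.18]

# two-tailed rank-based test, appropriate when normality is not assumed;
# SciPy computes an exact p-value for small samples without ties
stat, p = mannwhitneyu(healthy, kcn, alternative="two-sided")
```

The rank-based test is a sensible choice here because with 33 eyes per group the surface-area and volume distributions need not be normal.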
Detailed individual tree crown segmentation is highly relevant for the detection and monitoring of Fraxinus excelsior L. trees affected by ash dieback, a major threat to common ash populations across Europe. In this study, both fine and coarse crown segmentation methods were applied to close-range multispectral UAV imagery. The fine tree crown segmentation method utilized a novel unsupervised machine learning approach based on a blended NIR-NDVI image, whereas the coarse segmentation relied on the segment anything model (SAM). Both methods successfully delineated tree crown outlines; however, only the fine segmentation accurately captured internal canopy gaps. Despite these structural differences, mean NDVI values calculated per tree crown revealed no significant differences between the two approaches, indicating that coarse segmentation is sufficient for mean vegetation index assessments. Nevertheless, the fine segmentation revealed increased heterogeneity in NDVI values in more severely damaged trees, underscoring its value for detailed structural and health analyses. Furthermore, the fine segmentation workflow proved transferable to both individual UAV images and orthophotos from broader UAV surveys. For applications focused on structural integrity and spatial variation in canopy health, the fine segmentation approach is recommended.
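NDVI, on which both the blended input image and the per-crown comparison rely, is the standard normalized difference of the NIR and red bands; a minimal sketch of computing it and averaging it over one segmented crown:

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized difference vegetation index, (NIR - red) / (NIR + red),
    ranging from -1 to 1; eps guards against division by zero."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)

def mean_crown_ndvi(nir, red, crown_mask):
    """Mean NDVI over the pixels of one segmented tree crown mask."""
    return ndvi(nir, red)[crown_mask].mean()
```

This also makes the study's finding intuitive: a mean over a crown mask changes little whether or not internal canopy gaps are excluded, whereas the per-pixel spread (heterogeneity) does.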
AIM: To construct an intelligent segmentation scheme for precise localization of central serous chorioretinopathy (CSC) leakage points, thereby enabling ophthalmologists to deliver accurate laser treatment without navigational laser equipment. METHODS: A dataset with dual labels (point-level and pixel-level) was first established based on fundus fluorescein angiography (FFA) images of CSC and subsequently divided into training (102 images), validation (40 images), and test (40 images) datasets. An intelligent segmentation method was then developed, based on the You Only Look Once version 8 Pose Estimation (YOLOv8-Pose) model and the segment anything model (SAM), to segment CSC leakage points. Next, the YOLOv8-Pose model was trained for 200 epochs, and the best-performing model was selected to form the optimal combination with SAM. Additionally, five classic U-Net series models [i.e., U-Net, recurrent residual U-Net (R2U-Net), attention U-Net (AttU-Net), recurrent residual attention U-Net (R2AttU-Net), and nested U-Net (UNet++)] were initialized with three random seeds and trained for 200 epochs, resulting in a total of 15 baseline models for comparison. Finally, based on metrics including the Dice similarity coefficient (DICE), intersection over union (IoU), precision, recall, the precision-recall (PR) curve, and the receiver operating characteristic (ROC) curve, the proposed method was compared with the baseline models through quantitative and qualitative experiments for leakage point segmentation, thereby demonstrating its effectiveness. RESULTS: With increasing training epochs, the mAP50-95, recall, and precision of the YOLOv8-Pose model increased significantly and tended to stabilize, and the model achieved a preliminary localization success rate of 90% (i.e., 36 images) for CSC leakage points in the 40 test images. Using manually expert-annotated pixel-level labels as the ground truth, the proposed method achieved a DICE of 57.13%, an IoU of 45.31%, a precision of 45.91%, a recall of 93.57%, an area under the PR curve (AUC-PR) of 0.78, and an area under the ROC curve (AUC-ROC) of 0.97, enabling more accurate segmentation of CSC leakage points. CONCLUSION: By combining the precise localization capability of the YOLOv8-Pose model with the robust and flexible segmentation ability of SAM, the proposed method not only demonstrates the effectiveness of the YOLOv8-Pose model in detecting keypoint coordinates of CSC leakage points from the perspective of application innovation but also establishes a novel approach for accurate segmentation of CSC leakage points through a “detect-then-segment” strategy, thereby providing a potential auxiliary means for the automatic and precise real-time localization of leakage points during traditional laser photocoagulation for CSC.
This systematic review aims to comprehensively examine and compare deep learning methods for brain tumor segmentation and classification using MRI and other imaging modalities, focusing on recent trends from 2022 to 2025. The primary objective is to evaluate methodological advancements, model performance, dataset usage, and existing challenges in developing clinically robust AI systems. We included peer-reviewed journal articles and high-impact conference papers published between 2022 and 2025, written in English, that proposed or evaluated deep learning methods for brain tumor segmentation and/or classification. Excluded were non-open-access publications, books, and non-English articles. A structured search was conducted across Scopus, Google Scholar, Wiley, and Taylor & Francis, with the last search performed in August 2025. Risk of bias was not formally quantified but was considered during full-text screening based on dataset diversity, validation methods, and availability of performance metrics. We used narrative synthesis and tabular benchmarking to compare performance metrics (e.g., accuracy, Dice score) across model types (CNN, Transformer, Hybrid), imaging modalities, and datasets. A total of 49 studies were included (43 journal articles and 6 conference papers). These studies spanned over 9 public datasets (e.g., BraTS, Figshare, REMBRANDT, MOLAB) and utilized a range of imaging modalities, predominantly MRI. Hybrid models, especially ResViT and UNetFormer, consistently achieved high performance, with classification accuracy exceeding 98% and segmentation Dice scores above 0.90 across multiple studies. Transformers and hybrid architectures showed increasing adoption post-2023. Many studies lacked external validation and were evaluated only on a few benchmark datasets, raising concerns about generalizability and dataset bias. Few studies addressed clinical interpretability or uncertainty quantification. Despite promising results, particularly for hybrid deep learning models, widespread clinical adoption remains limited due to a lack of validation, interpretability concerns, and real-world deployment barriers.
Automatic segmentation of landslides from remote sensing imagery is challenging because traditional machine learning and early CNN-based models often fail to generalize across heterogeneous landscapes, where segmentation maps contain sparse and fragmented landslide regions under diverse geographical conditions. To address these issues, we propose a lightweight dual-stream siamese deep learning framework that integrates optical and topographical data fusion with an adaptive decoder, guided multimodal fusion, and deep supervision. The framework is built upon the synergistic combination of cross-attention, gated fusion, and sub-pixel upsampling within a unified dual-stream architecture specifically optimized for landslide segmentation, enabling efficient context modeling and robust feature exchange between modalities. The decoder captures long-range context at deeper levels using lightweight cross-attention and refines spatial details at shallower levels through attention-gated skip fusion, enabling precise boundary delineation and fewer false positives. The gated fusion further enhances multimodal integration of optical and topographical cues, and the deep supervision stabilizes training and improves generalization. Moreover, to mitigate checkerboard artifacts, a learnable sub-pixel upsampling is devised to replace the traditional transposed convolution. Despite its compact design with fewer parameters, the model consistently outperforms state-of-the-art baselines. Experiments on two benchmark datasets, Landslide4Sense and Bijie, confirm the effectiveness of the framework. On the Bijie dataset, it achieves an F1-score of 0.9110 and an intersection over union (IoU) of 0.8839. These results highlight its potential for accurate large-scale landslide inventory mapping and real-time disaster response. The implementation is publicly available at https://github.com/mishaown/DiGATe-UNet-LandSlide-Segmentation (accessed on 3 November 2025).
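Sub-pixel upsampling of the kind mentioned above is, at its core, a depth-to-space rearrangement (PixelShuffle) applied after a learned convolution; the rearrangement itself can be sketched in NumPy (the learned convolution is omitted, and this is a generic sketch, not the repository's code):

```python
import numpy as np

def depth_to_space(x, r):
    """Sub-pixel upsampling (PixelShuffle): rearrange a (C*r*r, H, W)
    feature map into (C, H*r, W*r). Because every output pixel comes
    from its own channel group rather than from overlapping kernel
    strides, this avoids the checkerboard artifacts of transposed
    convolution."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)   # -> (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)
```

The "learnable" part is the preceding convolution that produces the C·r² channels; the rearrangement itself has no parameters.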
A new two-step framework is proposed for image segmentation. In the first step, the gray-value distribution of the given image is reshaped to have larger inter-class variance and less intra-class variance. In the second step, discriminant-based or clustering-based methods are performed on the reformed distribution. We focus on a typical clustering method, the Gaussian mixture model (GMM), and its variant to demonstrate the feasibility of the framework. Because the first step is independent of the second, it can be integrated into pixel-based and histogram-based methods to improve their segmentation quality. Experiments on artificial and real images show that the framework achieves effective and robust segmentation results.
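The distribution-reshaping step is the paper's contribution and is not reproduced here; the second, clustering step can be sketched with a standard two-component GMM fitted to gray values (the bimodal data below is synthetic, not from the paper):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# synthetic bimodal "image": 4000 dark background pixels around
# gray value 60 and 1000 bright object pixels around 180
pixels = np.concatenate([rng.normal(60, 8, 4000),
                         rng.normal(180, 10, 1000)]).reshape(-1, 1)

# fit a 2-component GMM to the gray-value distribution and assign
# each pixel to the component with the highest posterior probability
gmm = GaussianMixture(n_components=2, random_state=0).fit(pixels)
labels = gmm.predict(pixels)

# the component with the larger mean is the bright (object) class
bright = int(np.argmax(gmm.means_.ravel()))
mask = labels == bright
```

On a well-separated histogram like this one the GMM recovers the two classes almost perfectly; the paper's reshaping step aims to push real histograms toward exactly this well-separated regime before clustering.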
Funding: Funded by BK21 FOUR (Fostering Outstanding Universities for Research) (No. 5199990914048).
Abstract: Accurate segmentation of breast cancer in mammogram images plays a critical role in early diagnosis and treatment planning. As research in this domain continues to expand, various segmentation techniques have been proposed across classical image processing, machine learning (ML), deep learning (DL), and hybrid/ensemble models. This study conducts a systematic literature review using the PRISMA methodology, analyzing 57 selected articles to explore how these methods have evolved and been applied. The review highlights the strengths and limitations of each approach, identifies commonly used public datasets, and observes emerging trends in model integration and clinical relevance. By synthesizing current findings, this work provides a structured overview of segmentation strategies and outlines key considerations for developing more adaptable and explainable tools for breast cancer detection. Overall, our synthesis suggests that classical and ML methods are suitable when labels and computing resources are limited, that DL models are preferable when pixel-level annotations and resources are available, and that hybrid pipelines are most appropriate when fine-grained clinical precision is required.
Funding: Supported by the National Key R&D Program of China (No. 2022YFC3003502).
Abstract: The northern segment of the North-South Seismic Belt is characterized by intense crustal deformation, well-developed active tectonics, and frequent strong earthquakes. Conducting a Probabilistic Seismic Hazard Analysis (PSHA) for this region is therefore important for supporting seismic fortification of major engineering projects and for formulating disaster prevention and mitigation policies. In this study, a composite seismic source model was constructed by integrating data on historical earthquakes, active faults, and paleoseismicity, and a logic tree framework was employed to quantify epistemic uncertainties, enabling a systematic seismic hazard assessment of the region. To characterize the spatial heterogeneity of seismic activity more accurately, improvements were made to both the fixed-radius Circular Spatial Smoothing Model (CSSM) and the Adaptive Spatial Smoothing Model (ASSM), with full consideration of the spatiotemporal completeness of historical earthquake magnitudes. For the CSSM, in scenarios with small earthquake catalogs, the cross-validation method proposed in this study proved more robust than the maximum likelihood method for determining the optimal correlation distance. Performance evaluation indicates that while both models effectively characterize seismic activity, the ASSM achieves better overall predictive performance than the CSSM, owing to its ability to adapt the smoothing radius to seismic density. Significant discrepancies were observed in the Peak Ground Acceleration (PGA) results, computed for a 10% probability of exceedance in 50 years, across different combinations of seismic source models. The spatially smoothed point-source model alone yielded a maximum PGA of approximately 0.52 g, with high values concentrated near historical epicenters, thereby significantly underestimating the hazard associated with major fault zones. When combined with the simple fault-source model, the maximum PGA increased to 0.8 g, with high-value zones distributed along faults; however, the hazard remained underestimated for faults with low slip rates that are nevertheless approaching their recurrence cycles. After introducing the time-dependent characteristic fault-source model, local PGA values for faults in the middle-to-late stages of their recurrence cycles increased by a factor of 2 to 7 relative to the single model. These results demonstrate that the characteristic fault-source model reasonably captures the time dependence of large-earthquake recurrence, providing a more accurate assessment of imminent seismic risks. By jointly applying the improved spatially smoothed point-source model, the simple fault-source model, and the characteristic fault-source model, the following faults within the region were identified as having high seismic hazard: the Huangxianggou, Zhangxian, and Tianshui segments of the Xiqinling northern edge fault; the Maqin-Maqu segment of the Dongkunlun fault; the Longriqu fault; the Maoergai fault; the Elashan fault; the Riyueshan fault; the eastern segment of the Lenglongling fault; the Maxianshan segment of the Maxianshan northern margin fault; and the Maomaoshan-Jinqianghe segment of the Laohushan-Maomaoshan fault. As these faults lie within seismic gaps or are approaching the recurrence periods of large earthquakes, they should be prioritized in current and future seismic monitoring and in disaster prevention and mitigation efforts.
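The cross-validation idea described above, choosing the correlation distance that best explains held-out earthquakes, can be illustrated with a toy sketch. The Gaussian kernel form, the leave-one-out score, and the epicenter coordinates below are illustrative assumptions, not the study's actual implementation:

```python
import math

def smoothed_rate(target, events, c):
    """Normalized 2-D Gaussian-kernel seismicity density at `target`,
    estimated from the other epicenters with correlation distance c."""
    k = sum(math.exp(-((target[0] - e[0]) ** 2 + (target[1] - e[1]) ** 2) / c ** 2)
            for e in events)
    return k / (math.pi * c ** 2)  # integral of exp(-r^2/c^2) over the plane

def loo_log_likelihood(events, c):
    """Leave-one-out score: each held-out epicenter should fall where the
    remaining events predict high activity."""
    total = 0.0
    for i, ev in enumerate(events):
        others = events[:i] + events[i + 1:]
        total += math.log(smoothed_rate(ev, others, c) + 1e-12)
    return total

def best_correlation_distance(events, candidates):
    return max(candidates, key=lambda c: loo_log_likelihood(events, c))

# Two tight epicenter clusters: a small correlation distance should score best.
epicenters = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0),
              (5.0, 5.0), (5.1, 4.9), (4.9, 5.1)]
c_opt = best_correlation_distance(epicenters, [0.5, 2.0, 10.0])
```

The design point is that the score is computed only on earthquakes withheld from the smoothing, so an over-wide radius that smears rate away from the clusters is penalized, which is what gives cross-validation its robustness for small catalogs.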
Funding: Supported by the Natural Science Foundation Programme of Gansu Province (No. 24JRRA231), the National Natural Science Foundation of China (No. 62061023), and the Gansu Provincial Science and Technology Plan Key Research and Development Program Project (No. 24YFFA024).
Abstract: Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information in medical imaging and loses local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address these issues. First, a local multi-scale feature encoder is introduced on top of SAM to improve feature representation within local receptive fields, supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce quadratic complexity and noise interference. Subsequently, a cross-branch balancing adapter is designed to balance local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to support smaller inputs and to mitigate overlap in patch embeddings, the input image size is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, comprising feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates information from small medical images into SAM, enhancing its suitability for clinical deployment. The proposed model demonstrates an average improvement ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets compared with eight other representative image segmentation models. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
Funding: Supported by the Sichuan Province Outstanding Young Scientist Fund (Grant No. 2025NSFJQ0009) and the Sichuan Regional Innovation Cooperation Fund (Grant No. 2025YFHZ0270).
Abstract: The quantitative analysis of dispersed phases (bubbles, droplets, and particles) in multiphase flow systems represents a persistent technological challenge in petroleum engineering applications, including CO2-enhanced oil recovery, foam flooding, and unconventional reservoir development. Current characterization methods remain constrained by labor-intensive manual workflows and limited dynamic analysis capabilities, particularly for processing large-scale microscopy data and video sequences that capture critical transient behavior such as gas cluster migration and droplet coalescence. These limitations hinder the establishment of robust correlations between pore-scale flow patterns and reservoir-scale production performance. This study introduces a novel computer vision framework that integrates foundation models with lightweight neural networks to address these industry challenges. Leveraging the segment anything model's zero-shot learning capability, we developed an automated workflow that improves bubble-labeling efficiency approximately 29-fold compared with manual methods while maintaining less than 2% deviation from expert annotations. Engineering-oriented optimization ensures lightweight deployment with 94% segmentation accuracy, while the integrated quantification system precisely resolves gas saturation, shape factors, and interfacial dynamics, parameters that are critical for optimizing gas injection strategies and predicting phase redistribution patterns. Validated through microfluidic gas-liquid displacement experiments for discontinuous-phase segmentation accuracy, this methodology enables precise bubble morphology quantification with broad application potential in multiphase systems, including emulsion droplet dynamics characterization and particle transport behavior analysis. This work bridges the critical gap between pore-scale dynamics characterization and reservoir-scale simulation requirements, providing a foundational framework for intelligent flow diagnostics and predictive modeling in next-generation digital oilfield systems.
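One of the quantities mentioned above, the shape factor, is commonly defined as the circularity 4πA/P², which equals 1 for a perfect circle and decreases for elongated bubbles. The sketch below computes it for a polygonal bubble outline; the exact shape-factor definition used in the study is not specified, so this standard definition is an assumption:

```python
import math

def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula."""
    n = len(pts)
    return abs(sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
                   for i in range(n))) / 2

def perimeter(pts):
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))

def shape_factor(pts):
    """Circularity 4*pi*A/P^2: 1 for a circle, < 1 for elongated shapes."""
    p = perimeter(pts)
    return 4 * math.pi * polygon_area(pts) / (p * p)

# A regular 64-gon approximates a circle; a square is noticeably less circular.
circleish = [(math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
             for k in range(64)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

In practice the polygon vertices would come from the segmented bubble contour, and gas saturation follows from summing segmented bubble areas over the field of view.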
Funding: Supported by the National Natural Science Foundation of China under contract No. 42371380, the National Key Research and Development Program of China under contract No. 2023YFC2811800, and the Fundamental Research Funds for the Central Universities under contract No. 0904-14380035.
Abstract: Efficient segmentation of oiled pixels in optical remotely sensed images is the precondition for optical identification and classification of different spilled oils, and it remains one of the keys to optical remote sensing of oil spills. Optical remotely sensed images of oil spills are inherently multidimensional and embedded within a complex knowledge framework. This complexity often hinders the effectiveness of mechanistic algorithms across varied scenarios. Although optical remote-sensing theory for oil spills has advanced, the scarcity of curated datasets and the difficulty of collecting them limit their usefulness for training deep learning models. This study introduces a data expansion strategy that utilizes the Segment Anything Model (SAM), effectively bridging the gap between traditional mechanism algorithms and emergent self-adaptive deep learning models. Optical dimension reduction is achieved through standardized preprocessing steps that address the decipherable properties of the input image. After preprocessing, SAM can swiftly and accurately segment spilled oil in images. The unified AI-based workflow significantly accelerates labeled-dataset creation and has proven effective both for rapid emergency intelligence during spill incidents and for the rapid mapping and classification of oil footprints across China's coastal waters. Our results show that coupling a remote sensing mechanism with a foundation model enables near-real-time, large-scale monitoring of complex surface slicks and offers guidance for the next generation of detection and quantification algorithms.
Funding: Supported by the Natural Science Foundation of Zhejiang Province (Grant No. LY23F020025), the Science and Technology Commissioner Program of Huzhou (Grant No. 2023GZ42), and the Sichuan Provincial Science and Technology Support Program (Grant Nos. 2023ZHCG0005 and 2023ZHCG0008).
Abstract: Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple transformations of images. Later, to increase the diversity and complexity of data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods require substantial computation for training or searching. In this paper, a novel training-free method that utilises the pre-trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM-DA) is proposed to generate augmented annotations for images. Without any training, it obtains prompt boxes from the original annotations and then feeds the boxes to the pre-trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than by simple manipulations, without incurring the heavy computation of training a data augmentation model. Comparative experiments are conducted on three datasets: an in-house dataset, ADE20K, and COCO2017. On the in-house dataset, namely the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM prove promising not only for image segmentation but also for data augmentation.
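The first step above, deriving a prompt box from an existing annotation, amounts to taking a tight bounding box around the annotated region. A minimal sketch, in which the function name, padding, and polygon annotation format are illustrative assumptions rather than the paper's API:

```python
def prompt_box_from_polygon(polygon, pad=2):
    """Tight axis-aligned box around an annotated polygon, slightly padded,
    in the (x_min, y_min, x_max, y_max) form that SAM accepts as a box prompt."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return (min(xs) - pad, min(ys) - pad, max(xs) + pad, max(ys) + pad)

# An annotated agricultural plot as a polygon of (x, y) pixel coordinates.
plot = [(10, 12), (40, 15), (38, 44), (12, 40)]
box = prompt_box_from_polygon(plot)
# `box` would then be handed to the pre-trained SAM predictor to regenerate
# a refined mask for the same object (the SAM call itself is omitted here).
```

Because only the box, not the original mask, is passed to SAM, the regenerated annotation can differ in boundary detail from the manual one, which is precisely what supplies the augmentation diversity.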
Funding: Supported by the National Key Research and Development Program of China (2024YFB3311703), the National Natural Science Foundation of China (61932003), and the Beijing Science and Technology Plan Project (Z221100006322003).
Abstract: In this paper, we introduce an innovative method for computer-aided design (CAD) segmentation that concatenates meshes and CAD models. Many previous CAD segmentation methods have achieved impressive performance using single representations, such as meshes, CAD models, and point clouds. However, existing methods cannot effectively combine different three-dimensional model types for direct conversion, alignment, and maintenance of the integrity of geometric and topological information. Hence, we propose an integration approach that combines the geometric accuracy of CAD data with the flexibility of mesh representations, and we introduce a unique hybrid representation that combines CAD and mesh models to enhance segmentation accuracy. To combine the two model types, our hybrid system utilizes advanced neural network techniques to convert CAD models into mesh models. For complex CAD models, segmentation is crucial for model retrieval and reuse; in partial retrieval, it aims to decompose a complex CAD model into several simple components. The first component of our hybrid system involves advanced mesh-labeling algorithms that harness the digitization of CAD properties to mesh models. The second component integrates labelled face features for CAD segmentation by leveraging the abundant multisemantic information embedded in CAD models. This combination of mesh and CAD not only refines the accuracy of boundary delineation but also provides a comprehensive understanding of the underlying object semantics. The study uses the Fusion 360 Gallery dataset. Experimental results indicate that our hybrid method segments these models with higher accuracy than other methods that use single representations.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62201201 and the Foundation of Henan Educational Committee under Grant 242102211042.
Abstract: Segmenting skin lesions is critical for early skin cancer detection. Existing CNN- and Transformer-based methods face challenges such as high computational complexity and limited adaptability to variations in lesion size. To overcome these limitations, we introduce MSAMamba-UNet, a lightweight model that integrates two novel architectures: Multi-Scale Mamba (MSMamba) and the Adaptive Dynamic Gating Block (ADGB). MSMamba utilizes multi-scale decomposition and a parallel hierarchical structure to enhance the delineation of irregular lesion boundaries and sensitivity to small targets. ADGB dynamically selects convolutional kernels with varying receptive fields based on input features, improving the model's capacity to accommodate diverse lesion textures and scales. Additionally, we introduce a Mix Attention Fusion Block (MAF) to enhance shallow feature representation by integrating parallel channel and pixel attention mechanisms. Extensive evaluation of MSAMamba-UNet on the ISIC 2016, ISIC 2017, and ISIC 2018 datasets demonstrates competitive segmentation accuracy with only 0.056 M parameters and 0.069 GFLOPs. In our experiments, MSAMamba-UNet achieved IoU scores of 85.53%, 85.47%, and 82.22%, and DSC scores of 92.20%, 92.17%, and 90.24%, respectively. These results underscore the lightweight design and effectiveness of MSAMamba-UNet.
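The IoU and DSC figures reported above follow the standard overlap definitions: IoU = |P∩G| / |P∪G| and Dice = 2|P∩G| / (|P|+|G|). A minimal sketch on flat binary masks (the masks below are synthetic examples, not ISIC data):

```python
def iou_dice(pred, gt):
    """IoU and Dice coefficient from two flat binary masks of equal length."""
    inter = sum(p and g for p, g in zip(pred, gt))   # |P ∩ G|
    p_sum, g_sum = sum(pred), sum(gt)
    union = p_sum + g_sum - inter                     # |P ∪ G|
    iou = inter / union if union else 1.0
    dice = 2 * inter / (p_sum + g_sum) if (p_sum + g_sum) else 1.0
    return iou, dice

pred = [1, 1, 1, 0, 0, 0, 1, 0]
gt   = [1, 1, 0, 0, 0, 1, 1, 0]
iou, dice = iou_dice(pred, gt)
```

Note the fixed relationship Dice = 2·IoU / (1 + IoU), which is why papers reporting both metrics always show Dice above IoU on the same predictions.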
Funding: The authors acknowledge financial support from the National Natural Science Foundation of China (21991104).
Abstract: Microdispersion technology is crucial for a variety of applications in both the chemical and biomedical fields. Precise and rapid characterization of microdroplets and microbubbles is essential for research as well as for optimizing and controlling industrial processes. Traditional methods often rely on time-consuming manual analysis. Although some deep learning-based computer vision methods have been proposed for automated identification and characterization, these approaches often rely on supervised learning, which requires labeled data for model training. This dependency on labeled data can be time-consuming and expensive, especially when working with large and complex datasets. To address these challenges, we propose Micro Flow SAM, an innovative, motion-prompted, annotation-free, and training-free instance segmentation approach. By utilizing the motion of microdroplets and microbubbles as prompts, our method directs large-scale vision models to perform accurate instance segmentation without the need for annotated data or model training. This approach eliminates human intervention in data labeling and reduces computational costs, significantly streamlining the data analysis process. We demonstrate the effectiveness of Micro Flow SAM across 12 diverse datasets, achieving segmentation results competitive with traditional methods. This approach not only accelerates the analysis process but also establishes a foundation for efficient process control and optimization in microfluidic applications. Micro Flow SAM reduces the complexity and resource demands of instance segmentation, enabling faster insights and advancements in the microdispersion field.
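One simple way to turn motion into a prompt, sketched below, is frame differencing: pixels that change between consecutive video frames mark a moving droplet, and their bounding box can serve as a box prompt for a segmentation foundation model. The abstract does not state how Micro Flow SAM extracts motion, so this differencing approach, the threshold, and the toy frames are all assumptions for illustration:

```python
def motion_prompt_box(frame_prev, frame_curr, thresh=10):
    """Bounding box (x_min, y_min, x_max, y_max) of pixels whose intensity
    changed between consecutive frames; None if nothing moved."""
    moved = [(x, y)
             for y, (row_p, row_c) in enumerate(zip(frame_prev, frame_curr))
             for x, (a, b) in enumerate(zip(row_p, row_c))
             if abs(a - b) > thresh]
    if not moved:
        return None
    xs = [p[0] for p in moved]
    ys = [p[1] for p in moved]
    return (min(xs), min(ys), max(xs), max(ys))

# A bright 'bubble' shifts one pixel to the right between two 4x6 frames.
prev = [[0] * 6 for _ in range(4)]
curr = [[0] * 6 for _ in range(4)]
prev[1][1] = prev[1][2] = 200
curr[1][2] = curr[1][3] = 200
```

The box produced this way would then prompt the vision model, which is what makes the pipeline annotation-free: no human ever draws the box.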
Funding: Supported by the National Natural Science Foundation of China (Nos. 81974355 and 82172524), the Key Research and Development Program of Hubei Province (No. 2021BEA161), the National Innovation Platform Development Program (No. 2020021105012440), the Open Project Funding of the Hubei Key Laboratory of Big Data Intelligent Analysis and Application, Hubei University (No. 2024BDIAA03), and the Free Innovation Preliminary Research Fund of Wuhan Union Hospital (No. 2024XHYN047).
Abstract: Objective: This study aimed to explore a novel method that integrates segmentation-guided classification and diffusion-model augmentation to realize automatic classification of tibial plateau fractures (TPFs). Methods: YOLOv8n-cls was used to construct a baseline model on data from 3781 patients from the Orthopedic Trauma Center of Wuhan Union Hospital, and a segmentation-guided classification approach was proposed. To enhance the dataset, a diffusion model was further employed for data augmentation. Results: The novel method integrating segmentation-guided classification and diffusion-model augmentation significantly improved the accuracy and robustness of fracture classification. The average classification accuracy for TPFs rose from 0.844 to 0.896. The comprehensive performance of the dual-stream model was also significantly enhanced after many rounds of training, with both the macro-area under the curve (AUC) and the micro-AUC increasing from 0.94 to 0.97. By utilizing diffusion-model augmentation and segmentation map integration, the model demonstrated superior efficacy in identifying Schatzker Ⅰ, achieving an accuracy of 0.880. It yielded an accuracy of 0.898 for Schatzker Ⅱ and Ⅲ and 0.913 for Schatzker Ⅳ; for Schatzker Ⅴ and Ⅵ, the accuracy was 0.887; and for intercondylar ridge fracture, the accuracy was 0.923. Conclusion: The dual-stream attention-based classification network, verified by extensive experiments, exhibited great potential in predicting the classification of TPFs. This method facilitates automatic TPF assessment and may assist surgeons in the rapid formulation of surgical plans.
Abstract: Background: Traditional imaging approaches to keratoconus (KCN) have thus far failed to produce a standardized approach for diagnosis. While many diagnostic modalities and metrics exist, none has proven robust enough to be considered a gold standard. This study introduces novel metrics to differentiate between KCN and healthy corneas using three-dimensional (3D) measurements of surface area and volume. Methods: This retrospective observational study examined KCN patients along with healthy control patients between the ages of 20 and 79 years at the University of Maryland, Baltimore. The selected patients underwent nine-line raster scan anterior segment optical coherence tomography (AS-OCT). ImageJ was used to determine the central 6 mm of each image, and each corneal image was then divided into six 1 mm segments. Free-D software was then used to render the nine images into a 3D model to calculate corneal surface area and volume. A two-tailed Mann-Whitney test was used to assess statistical significance when comparing these subsets. Results: Thirty-three eyes with KCN, along with 33 healthy controls, were enrolled. There were statistically significant differences between the healthy and KCN groups in anterior corneal surface area (13.927 vs. 13.991 mm^(2), P=0.046), posterior corneal surface area (14.045 vs. 14.173 mm^(2), P<0.001), and volume (8.430 vs. 7.773 mm^(3), P<0.001) within the central 6 mm. Conclusions: 3D corneal models derived from AS-OCT can be used to measure anterior corneal surface area, posterior corneal surface area, and corneal volume. All three parameters differ statistically between corneas with KCN and healthy corneas. Further study and application of these parameters may yield new methodologies for the detection of KCN.
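Once a 3D corneal model is triangulated, its surface area is just the sum of triangle areas, each half the magnitude of the cross product of two edge vectors. The abstract does not describe Free-D's internal computation, so the sketch below is the generic triangle-mesh calculation on a toy flat patch, not the software's actual method:

```python
import math

def triangle_area(a, b, c):
    """Area of a 3D triangle: half the cross-product magnitude of two edges."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def surface_area(vertices, faces):
    """Total area of a triangulated surface patch."""
    return sum(triangle_area(vertices[i], vertices[j], vertices[k])
               for i, j, k in faces)

# A unit-square patch (in mm) split into two triangles: total area 1.0 mm^2.
verts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
faces = [(0, 1, 2), (0, 2, 3)]
```

For a real cornea the vertices would come from the nine registered AS-OCT raster slices, and the curvature of the surface is what makes the KCN area measurably larger than a flat reference of the same footprint.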
Funding: This study was conducted within the project FraxVir ("Detection, characterisation and analyses of the occurrence of viruses and ash dieback in special stands of Fraxinus excelsior, a supplementary study to the FraxForFuture demonstration project") and receives funding via the Waldklimafonds (WKF), funded by the German Federal Ministry of Food and Agriculture (BMEL) and the Federal Ministry for the Environment, Nature Conservation, Nuclear Safety and Consumer Protection (BMUV), administered by the Agency for Renewable Resources (FNR), under grant agreement 2220WK40A4.
Abstract: Detailed individual tree crown segmentation is highly relevant for the detection and monitoring of Fraxinus excelsior L. trees affected by ash dieback, a major threat to common ash populations across Europe. In this study, both fine and coarse crown segmentation methods were applied to close-range multispectral UAV imagery. The fine tree crown segmentation method used a novel unsupervised machine learning approach based on a blended NIR-NDVI image, whereas the coarse segmentation relied on the segment anything model (SAM). Both methods successfully delineated tree crown outlines; however, only the fine segmentation accurately captured internal canopy gaps. Despite these structural differences, mean NDVI values calculated per tree crown revealed no significant differences between the two approaches, indicating that coarse segmentation is sufficient for mean vegetation index assessments. Nevertheless, the fine segmentation revealed increased heterogeneity of NDVI values in more severely damaged trees, underscoring its value for detailed structural and health analyses. Furthermore, the fine segmentation workflow proved transferable to both individual UAV images and orthophotos from broader UAV surveys. For applications focused on structural integrity and spatial variation in canopy health, the fine segmentation approach is recommended.
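The per-crown statistics above rest on the standard NDVI definition, (NIR − Red) / (NIR + Red), with the mean over a crown's pixels serving as a health proxy and the spread as the heterogeneity signal. A minimal sketch with synthetic reflectance values (the numbers are illustrative, not from the study's imagery):

```python
from statistics import mean, pstdev

def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    return (nir - red) / (nir + red) if (nir + red) else 0.0

def crown_ndvi_stats(nir_px, red_px):
    """Mean NDVI (vitality proxy) and its standard deviation (heterogeneity)
    over the pixels of one segmented tree crown."""
    vals = [ndvi(n, r) for n, r in zip(nir_px, red_px)]
    return mean(vals), pstdev(vals)

# A uniformly vigorous crown vs. a crown with patchy dieback damage.
healthy = crown_ndvi_stats([0.50, 0.52, 0.51], [0.10, 0.11, 0.10])
damaged = crown_ndvi_stats([0.50, 0.20, 0.45], [0.10, 0.18, 0.30])
```

This illustrates the study's key observation: two segmentations that disagree on canopy gaps can still agree on the mean, while the standard deviation separates patchily damaged crowns from uniform ones.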
Funding: Supported by the Shenzhen Science and Technology Program (No. JCYJ20240813152704006), the National Natural Science Foundation of China (No. 62401259), the Fundamental Research Funds for the Central Universities (No. NZ2024036), the Postdoctoral Fellowship Program of CPSF (No. GZC20242228), and the High Performance Computing Platform of Nanjing University of Aeronautics and Astronautics.
Abstract: AIM: To construct an intelligent segmentation scheme for precise localization of central serous chorioretinopathy (CSC) leakage points, thereby enabling ophthalmologists to deliver accurate laser treatment without navigational laser equipment. METHODS: A dataset with dual labels (point-level and pixel-level) was first established based on fundus fluorescein angiography (FFA) images of CSC and subsequently divided into training (102 images), validation (40 images), and test (40 images) sets. An intelligent segmentation method was then developed, based on the You Only Look Once version 8 Pose Estimation (YOLOv8-Pose) model and the segment anything model (SAM), to segment CSC leakage points. Next, the YOLOv8-Pose model was trained for 200 epochs, and the best-performing model was selected to form the optimal combination with SAM. Additionally, five classic U-Net series models [i.e., U-Net, recurrent residual U-Net (R2U-Net), attention U-Net (AttU-Net), recurrent residual attention U-Net (R2AttU-Net), and nested U-Net (UNet^(++))] were initialized with three random seeds and trained for 200 epochs, yielding a total of 15 baseline models for comparison. Finally, based on metrics including the Dice similarity coefficient (DICE), intersection over union (IoU), precision, recall, the precision-recall (PR) curve, and the receiver operating characteristic (ROC) curve, the proposed method was compared with the baseline models through quantitative and qualitative experiments on leakage point segmentation, demonstrating its effectiveness. RESULTS: As training epochs increased, the mAP50-95, recall, and precision of the YOLOv8-Pose model rose significantly and then stabilized, and the model achieved a preliminary localization success rate of 90% (i.e., 36 images) for CSC leakage points in the 40 test images. Using manually expert-annotated pixel-level labels as the ground truth, the proposed method achieved a DICE of 57.13%, an IoU of 45.31%, a precision of 45.91%, a recall of 93.57%, an area under the PR curve (AUC-PR) of 0.78, and an area under the ROC curve (AUC-ROC) of 0.97, enabling more accurate segmentation of CSC leakage points. CONCLUSION: By combining the precise localization capability of the YOLOv8-Pose model with the robust and flexible segmentation ability of SAM, the proposed method not only demonstrates the effectiveness of the YOLOv8-Pose model in detecting keypoint coordinates of CSC leakage points but also establishes a novel approach for accurate segmentation of CSC leakage points through a "detect-then-segment" strategy, thereby providing a potential auxiliary means for the automatic and precise real-time localization of leakage points during traditional laser photocoagulation for CSC.
Abstract: This systematic review comprehensively examines and compares deep learning methods for brain tumor segmentation and classification using MRI and other imaging modalities, focusing on trends from 2022 to 2025. The primary objective is to evaluate methodological advancements, model performance, dataset usage, and existing challenges in developing clinically robust AI systems. We included peer-reviewed journal articles and high-impact conference papers published between 2022 and 2025, written in English, that proposed or evaluated deep learning methods for brain tumor segmentation and/or classification. Non-open-access publications, books, and non-English articles were excluded. A structured search was conducted across Scopus, Google Scholar, Wiley, and Taylor & Francis, with the last search performed in August 2025. Risk of bias was not formally quantified but was considered during full-text screening based on dataset diversity, validation methods, and the availability of performance metrics. We used narrative synthesis and tabular benchmarking to compare performance metrics (e.g., accuracy, Dice score) across model types (CNN, Transformer, hybrid), imaging modalities, and datasets. A total of 49 studies were included (43 journal articles and 6 conference papers). These studies spanned more than 9 public datasets (e.g., BraTS, Figshare, REMBRANDT, MOLAB) and utilized a range of imaging modalities, predominantly MRI. Hybrid models, especially ResViT and UNetFormer, consistently achieved high performance, with classification accuracy exceeding 98% and segmentation Dice scores above 0.90 across multiple studies. Transformers and hybrid architectures have seen increasing adoption since 2023. Many studies lacked external validation and were evaluated on only a few benchmark datasets, raising concerns about generalizability and dataset bias, and few addressed clinical interpretability or uncertainty quantification. Despite promising results, particularly for hybrid deep learning models, widespread clinical adoption remains limited due to the lack of validation, interpretability concerns, and real-world deployment barriers.
Funding: Funded by the National Natural Science Foundation of China (Grant No. 62262045), the Fundamental Research Funds for the Central Universities (Grant No. 2023CDJYGRH-YB11), and the Open Funding of the SUGON Industrial Control and Security Center (Grant No. CUIT-SICSC-2025-03).
Abstract: Automatic segmentation of landslides from remote sensing imagery is challenging because traditional machine learning and early CNN-based models often fail to generalize across heterogeneous landscapes, where segmentation maps contain sparse and fragmented landslide regions under diverse geographical conditions. To address these issues, we propose a lightweight dual-stream Siamese deep learning framework that integrates optical and topographical data fusion with an adaptive decoder, guided multimodal fusion, and deep supervision. The framework is built upon the synergistic combination of cross-attention, gated fusion, and sub-pixel upsampling within a unified dual-stream architecture specifically optimized for landslide segmentation, enabling efficient context modeling and robust feature exchange between modalities. The decoder captures long-range context at deeper levels using lightweight cross-attention and refines spatial details at shallower levels through attention-gated skip fusion, enabling precise boundary delineation and fewer false positives. The gated fusion further enhances multimodal integration of optical and topographical cues, and the deep supervision stabilizes training and improves generalization. Moreover, to mitigate checkerboard artifacts, a learnable sub-pixel upsampling module is devised to replace the traditional transposed convolution. Despite its compact design with fewer parameters, the model consistently outperforms state-of-the-art baselines. Experiments on two benchmark datasets, Landslide4Sense and Bijie, confirm the effectiveness of the framework. On the Bijie dataset, it achieves an F1-score of 0.9110 and an intersection over union (IoU) of 0.8839. These results highlight its potential for accurate large-scale landslide inventory mapping and real-time disaster response. The implementation is publicly available at https://github.com/mishaown/DiGATe-UNet-LandSlide-Segmentation (accessed on 3 November 2025).
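The sub-pixel upsampling mentioned above rearranges r² low-resolution channels into one r-times-larger map, so every output pixel is computed by the same learned convolution, which is what avoids the uneven-overlap checkerboard pattern of transposed convolution. The pure-Python sketch below shows only the rearrangement step (the "pixel shuffle") on nested lists; in the actual model this would follow a learnable convolution inside a deep learning framework, and the channel ordering is the conventional one, assumed here rather than taken from the paper:

```python
def pixel_shuffle(channels, r):
    """Rearrange r*r feature maps of size H x W into one (H*r) x (W*r) map.
    Channel index c maps to the sub-pixel offset (dy, dx) = divmod(c, r)."""
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for c, fmap in enumerate(channels):
        dy, dx = divmod(c, r)
        for y in range(h):
            for x in range(w):
                out[y * r + dy][x * r + dx] = fmap[y][x]
    return out

# Four 1x1 channels interleave into a single 2x2 map.
chs = [[[1]], [[2]], [[3]], [[4]]]
up = pixel_shuffle(chs, 2)
```

Because each output position is filled exactly once, upsampling artifacts can only come from what the preceding convolution learns, not from the resampling geometry itself.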
Funding: Supported by the National Natural Science Foundation of China (60505004, 60773061).
Abstract: A new two-step framework is proposed for image segmentation. In the first step, the gray-value distribution of the given image is reshaped to have larger inter-class variance and smaller intra-class variance. In the second step, discriminant-based or clustering-based methods are performed on the reshaped distribution. We focus on typical clustering methods, the Gaussian mixture model (GMM) and its variant, to demonstrate the feasibility of the framework. Because the first step is independent of the second, the framework can be integrated into pixel-based and histogram-based methods to improve their segmentation quality. Experiments on artificial and real images show that the framework achieves effective and robust segmentation results.
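The classic discriminant-based method alluded to in the second step is Otsu's criterion, which picks the gray-level threshold maximizing the between-class variance w0·w1·(μ0 − μ1)², exactly the inter-class variance the first step is designed to enlarge. A minimal histogram-based sketch (the 8-level histogram is a synthetic bimodal example, not from the paper's experiments):

```python
def otsu_threshold(hist):
    """Gray level t maximizing between-class variance w0*w1*(mu0-mu1)^2,
    where class 0 holds levels <= t. `hist` is a list of pixel counts."""
    total = sum(hist)
    weighted_total = sum(g * h for g, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = cum = 0.0
    for t in range(len(hist) - 1):
        w0 += hist[t]
        cum += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = cum / w0, (weighted_total - cum) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Bimodal 8-level histogram: dark mode at levels 1-2, bright mode at 5-6.
hist = [0, 10, 8, 1, 1, 9, 11, 0]
t = otsu_threshold(hist)
```

The framework's first step would transform the gray values so the two modes of such a histogram separate further, after which this discriminant step (or a GMM fit) becomes both easier and more stable.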