Fractures are critical to subsurface activities such as oil and gas extraction, geothermal energy production, and carbon storage. Hydraulic fracturing, a technique that enhances fluid production, creates complex fracture networks within rock formations containing natural discontinuities. Accurately distinguishing between hydraulically induced fractures and pre-existing discontinuities is essential for understanding hydraulic fracture mechanisms. However, this remains challenging due to the interconnected nature of fractures in three-dimensional (3D) space. Manual segmentation, while adaptive, is both labor-intensive and subjective, making it impractical for large-scale 3D datasets. This study introduces a deep learning-based progressive cross-sectional segmentation method to automate the classification of 3D fracture volumes. The proposed method was applied to a 3D hydraulic fracture network in a Montney cube sample, successfully segmenting natural fractures, parted bedding planes, and hydraulic fractures with minimal user intervention. The automated approach achieves a 99.6% reduction in manual image processing workload while maintaining high segmentation accuracy, with test accuracy exceeding 98% and F1-score over 84%. This approach generalizes well to Brazilian disc samples with different fracture patterns, achieving consistently high accuracy in distinguishing between bedding and non-bedding fractures. This automated fracture segmentation method offers an effective tool for enhanced quantitative characterization of fracture networks, which would contribute to a deeper understanding of hydraulic fracturing processes.
Segmenting vegetation in color images is a complex task, especially when the background and lighting conditions of the environment are uncontrolled. This paper proposes a vegetation segmentation algorithm that combines a supervised and an unsupervised learning method to segment healthy and diseased plant images from the background. During the training stage, a Self-Organizing Map (SOM) neural network is applied to create different color groups from a set of images containing vegetation, acquired from a tomato greenhouse. The color groups are labeled as vegetation or non-vegetation and then used to create two color histogram models, one for each class. In the online mode, input images are segmented by a Bayesian classifier using the two histogram models. In uncontrolled environments, this algorithm segments images containing plants’ foliage qualitatively better than a color index technique, eliminating the background while preserving important color information. As future work, this segmentation method will be applied to disease diagnosis of tomato plants in greenhouses.
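The online stage described in the abstract above (a Bayesian classifier over two color histogram models) can be illustrated with a minimal sketch. It is not the authors' implementation: the SOM color grouping is replaced here by directly labeled sample pixels, and the bin count, prior, smoothing constant, and all function names are assumptions chosen for illustration.

```python
from collections import Counter

BINS = 8  # assumed quantization: 8 levels per RGB channel


def quantize(pixel, bins=BINS):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse histogram bin."""
    step = 256 // bins
    return tuple(channel // step for channel in pixel)


def build_histogram(pixels, bins=BINS):
    """Return a normalized color histogram mapping bin -> probability."""
    counts = Counter(quantize(p, bins) for p in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}


def is_vegetation(pixel, veg_hist, bg_hist, prior_veg=0.5, eps=1e-6):
    """Bayes decision: compare P(color | class) * P(class) for both classes.

    eps is an assumed floor for colors never seen during training.
    """
    key = quantize(pixel)
    p_veg = veg_hist.get(key, eps) * prior_veg
    p_bg = bg_hist.get(key, eps) * (1.0 - prior_veg)
    return p_veg > p_bg


# Hypothetical labeled training pixels (stand-ins for the SOM color groups).
veg_samples = [(40, 160, 50), (60, 180, 70), (30, 140, 40), (150, 170, 40)]
bg_samples = [(120, 100, 90), (200, 200, 210), (90, 80, 70), (30, 30, 30)]
veg_hist = build_histogram(veg_samples)
bg_hist = build_histogram(bg_samples)

# A tiny 2x2 "image" segmented pixel by pixel.
image = [[(50, 170, 60), (210, 205, 215)],
         [(125, 105, 95), (35, 165, 45)]]
mask = [[is_vegetation(px, veg_hist, bg_hist) for px in row] for row in image]
```

Coarse bins keep the two histograms small and let unseen but similar colors fall into a trained bin; a real system would tune the bin count and the class prior on validation images.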
Acting as a pilot of the Square Kilometer Array (SKA), the Five-hundred-meter Aperture Spherical Telescope (FAST) project puts forward many innovative ideas, among which the design of the active main reflector shows fascinating potential. The main spherical reflector is to be composed of thousands of small spherical panels, which can be adjusted to fit a paraboloid of revolution in real time. For both construction and performance, the rms of the fit must be optimized, so appropriate dimensional limits for the panels need to be determined. This paper addresses how to divide the spherical reflector mathematically. The advantages and drawbacks of various segmenting methods are discussed and an optimum one is suggested.
Segmenting Arabic handwriting has been a research subject in the field of Arabic character recognition for more than 25 years. The majority of reported segmentation techniques share a critical shortcoming: over-segmentation. The aim of segmentation is to produce the letters (segments) of a handwritten word. Over-segmentation occurs when a resulting letter (segment) is made of more than one piece (stroke) instead of one. Our objective is to overcome this problem by using an Artificial Neural Network (ANN) to verify the resulting segment. We propose a set of heuristic-based rules to assemble strokes in order to report the precise segmented letters. Preprocessing phases that include normalization and feature extraction are a prerequisite for recognition and verification by the ANN system. In our previous work [1], we achieved a segmentation success rate of 86%, but without recognition. In this work, our experimental results confirmed a segmentation success rate of no less than 95%.
The wide availability, low radiation dose and short acquisition time of Cone-Beam CT (CBCT) scans make them an attractive source of data for compiling databases of anatomical structures. However, CBCT has higher noise and lower contrast than helical slice CT, which makes segmentation more challenging, and the optimal methods are not yet known. This paper evaluates several methods of segmenting airway geometries (nares, nasal cavities and pharynx) from typical dental quality head and neck CBCT data. The nasal cavity has narrow and intricate passages and is separated from the paranasal sinuses by thin walls, making it susceptible to either over- or under-segmentation. The upper airway was split into two: the nasal cavity and the pharyngeal region (nasopharynx to larynx). Each part was segmented using global thresholding, multi-step level-set, and region competition methods (the latter using thresholding, clustering and classification initialisation and edge attraction techniques). The segmented 3D surfaces were evaluated against a reference manual segmentation using distance-, overlap- and volume-based metrics. Global thresholding, multi-step level-set, and region competition all gave satisfactory results for the lower part of the airway (nasopharynx to larynx). Edge attraction failed completely. A semi-automatic region-growing segmentation with multi-thresholding (or classification) initialization offered the best quality segmentation. With some minimal manual editing, it resulted in an accurate upper airway model, as judged by the similarity and volumetric indices, while being the least time consuming of the semi-automatic methods and relying the least on the operator’s expertise.
Visual attention mechanisms allow humans to extract relevant and important information from raw input percepts. Many applications in robotics and computer vision have modeled human visual attention mechanisms using a bottom-up, data-centric approach. In contrast, recent studies in cognitive science highlight advantages of a top-down approach to attention mechanisms, especially in applications involving goal-directed search. In this paper, we propose a top-down approach for extracting salient objects/regions of space. The top-down methodology first isolates different objects in an unorganized point cloud, then compares each object for uniqueness. A measure of saliency is defined using the properties of geodesic distance on the object’s surface. Our method works on 3D point cloud data and identifies salient objects of high curvature and unique silhouette. These, being the most distinctive features of a scene, are robust to clutter, occlusions and viewpoint changes. We provide the details of the proposed method and initial experimental results.
Segmentation of vegetation remote sensing images can minimize the interference of background, thus achieving efficient monitoring and analysis of vegetation information. The segmentation of vegetation poses a significant challenge due to the inherently complex environmental conditions. Currently, there is a growing trend of using spectral sensing combined with deep learning for field vegetation segmentation to cope with complex environments. However, two major constraints remain: the equipment required for field spectral data collection is costly, and field datasets are scarce because data annotation is time-consuming and labor-intensive. To address these challenges, we propose a weakly supervised approach for field vegetation segmentation that uses spectral reconstruction (SR) techniques as the foundation and draws on the theory of the vegetation index (VI). Specifically, to reduce the cost of data acquisition, we propose SRCNet and SRANet, based on convolution and attention structures respectively, to reconstruct multispectral images of fields. Then, borrowing from the VI principle, we aggregate the reconstructed data to establish the connection between spectral bands, obtaining more salient vegetation information. Finally, we employ the adaptation strategy to segment the fused feature map using a weakly supervised method, which obtains a field vegetation segmentation result without manual labeling. Our segmentation method achieves a Mean Intersection over Union (MIoU) of 0.853 on real field datasets, which outperforms the existing methods. In addition, we have open-sourced a dataset of unmanned aerial vehicle (UAV) RGB-multispectral images, comprising 2358 pairs of samples, to improve the richness of remote sensing agricultural data. The code and data are available at egment_SR,and.
Organoids possess immense potential for unraveling the intricate functions of human tissues and facilitating preclinical disease treatment. Their applications span from high-throughput drug screening to the modeling of complex diseases, with some even achieving clinical translation. Changes in the overall size, shape, boundary, and other morphological features of organoids provide a noninvasive method for assessing organoid drug sensitivity. However, the precise segmentation of organoids in bright-field microscopy images is made difficult by the complexity of the organoid morphology and interference, including overlapping organoids, bubbles, dust particles, and cell fragments. This paper introduces the precision organoid segmentation technique (POST), which is a deep-learning algorithm for segmenting challenging organoids under simple bright-field imaging conditions. Unlike existing methods, POST accurately segments each organoid and eliminates various artifacts encountered during organoid culturing and imaging. Furthermore, it is sensitive to and aligns with measurements of organoid activity in drug sensitivity experiments. POST is expected to be a valuable tool for drug screening using organoids owing to its capability of automatically and rapidly eliminating interfering substances and thereby streamlining the organoid analysis and drug screening process.
Because a single monolithic mirror cannot be manufactured at the 10-meter scale, segmented mirrors have become indispensable tools in modern astronomical research. However, to match the imaging performance of a monolithic counterpart, the sub-mirrors must maintain precise co-phasing. Piston error critically degrades segmented mirror imaging quality, necessitating efficient and precise detection. Conventional circular-aperture diffraction with a two-wavelength algorithm is susceptible to decentration errors, and traditional convolutional neural networks (CNNs) struggle to capture global features under large-range piston errors because of their restricted local receptive fields. To address these limitations, this paper proposes a method that integrates extended Young’s interference principles with a Vision Transformer (ViT) to detect piston error. The method suppresses decentration error interference through two symmetrically arranged apertures and extends the measurement range to ±7.95 μm via a two-wavelength (589 nm/600 nm) algorithm. It exploits ViT’s self-attention mechanism to model global characteristics of interference fringes. Unlike CNNs constrained by local convolutional kernels, the ViT significantly improves sensitivity to interferogram periodicity. The simulation results demonstrate that the proposed method achieves a measurement accuracy of 5 nm (0.0083λ0) across the range of ±7.95 μm, while maintaining an accuracy exceeding 95% in the presence of Gaussian noise (SNR≥15 dB), Poisson noise (λ≥9 photons/pixel), and sub-mirror gap error (Egap≤0.2) interference. Moreover, the detection speed shows significant improvement compared to the cross-correlation algorithm. This study establishes an accurate, robust framework for segmented mirror error detection, advancing high-precision astronomical observation.
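As background for the two-wavelength idea mentioned in the abstract above, the sketch below computes the textbook synthetic (beat) wavelength for the quoted pair of 589 nm and 600 nm, and expresses the quoted 5 nm accuracy in units of a wavelength. This is generic interferometry arithmetic, not the paper's algorithm: the abstract does not state how the ±7.95 μm range is derived, and taking λ0 = 600 nm is an inference from the quoted ratio 0.0083λ0. The function name is hypothetical.

```python
def synthetic_wavelength_nm(lambda1_nm, lambda2_nm):
    """Textbook two-wavelength beat: lambda1 * lambda2 / |lambda1 - lambda2|."""
    if lambda1_nm == lambda2_nm:
        raise ValueError("the two wavelengths must differ")
    return lambda1_nm * lambda2_nm / abs(lambda1_nm - lambda2_nm)


# Wavelength pair quoted in the abstract.
synthetic = synthetic_wavelength_nm(589.0, 600.0)

# Quoted accuracy of 5 nm expressed in waves, assuming lambda0 = 600 nm
# (an inference from the abstract's "0.0083 lambda0", not a stated value).
accuracy_in_waves = 5.0 / 600.0

print(f"synthetic wavelength: {synthetic / 1000:.2f} um")
print(f"accuracy: {accuracy_in_waves:.4f} waves")
```

The synthetic wavelength is tens of micrometers, far longer than either optical wavelength, which is why combining two close wavelengths extends the unambiguous measurement range.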
Detailed individual tree crown segmentation is highly relevant for the detection and monitoring of Fraxinus excelsior L. trees affected by ash dieback, a major threat to common ash populations across Europe. In this study, both fine and coarse crown segmentation methods were applied to close-range multispectral UAV imagery. The fine tree crown segmentation method utilized a novel unsupervised machine learning approach based on a blended NIR-NDVI image, whereas the coarse segmentation relied on the segment anything model (SAM). Both methods successfully delineated tree crown outlines; however, only the fine segmentation accurately captured internal canopy gaps. Despite these structural differences, mean NDVI values calculated per tree crown revealed no significant differences between the two approaches, indicating that coarse segmentation is sufficient for mean vegetation index assessments. Nevertheless, the fine segmentation revealed increased heterogeneity in NDVI values in more severely damaged trees, underscoring its value for detailed structural and health analyses. Furthermore, the fine segmentation workflow proved transferable to both individual UAV images and orthophotos from broader UAV surveys. For applications focused on structural integrity and spatial variation in canopy health, the fine segmentation approach is recommended.
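The per-crown NDVI statistics discussed in the abstract above can be sketched as follows, using the standard definition NDVI = (NIR - Red) / (NIR + Red). The band values, the mask, and the function names are hypothetical illustrations; the study's own workflow (blended NIR-NDVI image, SAM) is not reproduced here.

```python
from statistics import mean, pstdev


def ndvi(nir, red):
    """Standard NDVI for one pixel; returns 0.0 when both bands are zero."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def crown_ndvi_stats(nir_band, red_band, crown_mask):
    """Mean and population standard deviation of NDVI inside a crown mask."""
    values = [
        ndvi(n, r)
        for nir_row, red_row, mask_row in zip(nir_band, red_band, crown_mask)
        for n, r, inside in zip(nir_row, red_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("crown mask selects no pixels")
    return mean(values), pstdev(values)


# Hypothetical 2x2 reflectance bands and a crown mask that excludes one gap pixel.
nir_band = [[0.8, 0.6], [0.5, 0.4]]
red_band = [[0.2, 0.2], [0.5, 0.1]]
crown_mask = [[1, 1], [0, 1]]

crown_mean, crown_spread = crown_ndvi_stats(nir_band, red_band, crown_mask)
```

The mean summarizes overall crown vigor, while the spread is one simple way to express the within-crown heterogeneity that the abstract associates with more severely damaged trees.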
AIM: To construct an intelligent segmentation scheme for precise localization of central serous chorioretinopathy (CSC) leakage points, thereby enabling ophthalmologists to deliver accurate laser treatment without navigational laser equipment. METHODS: A dataset with dual labels (point-level and pixel-level) was first established based on fundus fluorescein angiography (FFA) images of CSC and subsequently divided into training (102 images), validation (40 images), and test (40 images) datasets. An intelligent segmentation method was then developed, based on the You Only Look Once version 8 Pose Estimation (YOLOv8-Pose) model and segment anything model (SAM), to segment CSC leakage points. Next, the YOLOv8-Pose model was trained for 200 epochs, and the best-performing model was selected to form the optimal combination with SAM. Additionally, five classic U-Net series models [i.e., U-Net, recurrent residual U-Net (R2U-Net), attention U-Net (AttU-Net), recurrent residual attention U-Net (R2AttUNet), and nested U-Net (UNet++)] were initialized with three random seeds and trained for 200 epochs, resulting in a total of 15 baseline models for comparison. Finally, based on metrics including Dice similarity coefficient (DICE), intersection over union (IoU), precision, recall, precision-recall (PR) curve, and receiver operating characteristic (ROC) curve, the proposed method was compared with baseline models through quantitative and qualitative experiments for leakage point segmentation, thereby demonstrating its effectiveness. RESULTS: With the increase of training epochs, the mAP50-95, recall, and precision of the YOLOv8-Pose model increased significantly and then stabilized, and the model achieved a preliminary localization success rate of 90% (i.e., 36 images) for CSC leakage points in the 40 test images. Using pixel-level labels manually annotated by experts as the ground truth, the proposed method achieved a DICE of 57.13%, an IoU of 45.31%, a precision of 45.91%, a recall of 93.57%, an area under the PR curve (AUC-PR) of 0.78 and an area under the ROC curve (AUC-ROC) of 0.97, which enables more accurate segmentation of CSC leakage points. CONCLUSION: By combining the precise localization capability of the YOLOv8-Pose model with the robust and flexible segmentation ability of SAM, the proposed method demonstrates the effectiveness of the YOLOv8-Pose model in detecting keypoint coordinates of CSC leakage points and establishes a novel approach for accurate segmentation of CSC leakage points through the “detect-then-segment” strategy, thereby providing a potential auxiliary means for the automatic and precise real-time localization of leakage points during traditional laser photocoagulation for CSC.
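The overlap metrics reported in the abstract above (DICE, IoU, precision, recall) follow standard pixel-count definitions, which the sketch below illustrates on a pair of binary masks. The masks and function names are hypothetical. Note that the identity Dice = 2·IoU / (1 + IoU) holds for a single mask pair only; metrics averaged over a test set, as in the study, need not satisfy it.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives pixel-wise."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def segmentation_metrics(pred, truth):
    """Return Dice, IoU, precision and recall for two binary masks."""
    tp, fp, fn = confusion_counts(pred, truth)
    union = tp + fp + fn
    return {
        # Two empty masks agree perfectly by convention.
        "dice": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
        "iou": tp / union if union else 1.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical masks: the prediction covers the whole target plus two extra pixels.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[1, 1, 1, 1],
        [0, 1, 1, 0]]

metrics = segmentation_metrics(pred, truth)
```

In this toy case the prediction over-covers the target, giving perfect recall with lower precision, the same qualitative pattern as the high-recall, moderate-precision figures quoted in the abstract.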
Background: Laparoscopic anatomic hepatectomy of segment 7 (LAH-S7) is a challenging surgery. In this study we aimed to investigate surgical and oncological outcomes of various approaches of LAH-S7 in patients with hepatocellular carcinoma (HCC). A particular focus was placed on identifying the Glissonean pedicle of segment 7 (G7) and the intersegmental plane. Given the scarcity of comprehensive reviews or comparative studies on clinical outcomes, we also sought to analyze the experiences and advantages associated with different approaches in relation to the anatomic variations of G7. Methods: The clinical data of 124 patients who underwent LAH-S7 for HCC across seven tertiary referral medical centers in China were retrospectively analyzed. Three surgical approaches were categorized based on the procedures used for G7 identification: the indocyanine green (ICG) fluorescence positive staining approach (IFPA), the Glissonean approach (GA), and the hepatic vein-guided approach (HVGA). Subsequently, the postoperative short-term results and oncological outcomes of the three different approaches were compared. Results: The distribution of surgical approaches among the patients was as follows: IFPA in 16 (12.9%), GA in 62 (50.0%), and HVGA in 46 (37.1%) patients. Complications were observed in 27 (21.8%) patients. The 1-, 3-, and 5-year overall survival (OS) rates were 99.1%, 89.2%, and 84.7%, respectively. The 1-, 3-, and 5-year recurrence-free survival (RFS) rates were 99.0%, 84.7%, and 69.3%, respectively. The OS and RFS rates were comparable across the three approaches. Conclusions: Following a standardized surgical procedure, LAH-S7 is demonstrated to be safe and yields favorable oncological outcomes. Surgeons performing LAH-S7 should select the appropriate surgical approach based on the anatomical characteristics and variations of G7.
Strontianite-rich carbonatite, containing over 30 vol% carbonate minerals predominantly composed of strontianite (SrCO3), is identified in the Zhengjialiangzi ore segment of the Muluozhai rare earth element (REE) deposit, western Sichuan Province, China. It exhibits a unique mineral assemblage dominated by strontianite, fluorite, bastnäsite, barite, calcite and dolomite, distinguishing it from conventional calcio-, magnesio-, ferro-, or natro-carbonatites. The rock shows extreme enrichment in REEs (ΣREE = 47335-64367 ppm), with strong LREE/HREE fractionation [(La/Yb)N = 1151-2119] and notably high concentrations of high-value critical REEs (e.g., Pr, Nd, Tb, Dy), 5-10 times greater than those in local calcite-dominated carbonatites. Trace element patterns indicate significant enrichment in REEs, Sr, and Ba, along with depletion in high-field-strength elements (HFSEs; e.g., Nb, Ta, Zr, Hf). In-situ Sr isotopes of strontianite [(⁸⁷Sr/⁸⁶Sr)i = 0.706190-0.707305] indicate an enriched mantle source (EMI-EMII). Sr enrichment is attributed to initial mantle source enrichment and extensive fractional crystallization, possibly accompanied by minor wall-rock assimilation. We propose that the strontianite-rich carbonatite formed from a highly evolved, Sr- and REE-rich carbonatitic magma that intruded into shallow structural breccias, followed by rapid cooling. Its formation is associated with a continuous melt-fluid evolutionary process that is characteristic of carbonatitic systems.
Weakly Supervised Semantic Segmentation (WSSS), which relies only on image-level labels, has attracted significant attention for its cost-effectiveness and scalability. Existing methods mainly enhance inter-class distinctions and employ data augmentation to mitigate semantic ambiguity and reduce spurious activations. However, they often neglect the complex contextual dependencies among image patches, resulting in incomplete local representations and limited segmentation accuracy. To address these issues, we propose the Context Patch Fusion with Class Token Enhancement (CPF-CTE) framework, which exploits contextual relations among patches to enrich feature representations and improve segmentation. At its core, the Contextual-Fusion Bidirectional Long Short-Term Memory (CF-BiLSTM) module captures spatial dependencies between patches and enables bidirectional information flow, yielding a more comprehensive understanding of spatial correlations. This strengthens feature learning and segmentation robustness. Moreover, we introduce learnable class tokens that dynamically encode and refine class-specific semantics, enhancing discriminative capability. By effectively integrating spatial and semantic cues, CPF-CTE produces richer and more accurate representations of image content. Extensive experiments on PASCAL VOC 2012 and MS COCO 2014 validate that CPF-CTE consistently surpasses prior WSSS methods.
Microscopy imaging is fundamental in analyzing bacterial morphology and dynamics, offering critical insights into bacterial physiology and pathogenicity. Image segmentation techniques enable quantitative analysis of bacterial structures, facilitating precise measurement of morphological variations and population behaviors at single-cell resolution. This paper reviews advancements in bacterial image segmentation, emphasizing the shift from traditional thresholding and watershed methods to deep learning-driven approaches. Convolutional neural networks (CNNs), U-Net architectures, and three-dimensional (3D) frameworks excel at segmenting dense biofilms and resolving antibiotic-induced morphological changes. These methods combine automated feature extraction with physics-informed postprocessing. Despite progress, challenges persist in computational efficiency, cross-species generalizability, and integration with multimodal experimental workflows. Future progress will depend on improving model robustness across species and imaging modalities, integrating multimodal data for phenotype-function mapping, and developing standard pipelines that link computational tools with clinical diagnostics. These innovations will expand microbial phenotyping beyond structural analysis, enabling deeper insights into bacterial physiology and ecological interactions.
Advanced traffic monitoring systems encounter substantial challenges in vehicle detection and classification due to the limitations of conventional methods, which often demand extensive computational resources and struggle with diverse data acquisition techniques. This research presents a novel approach for vehicle classification and recognition in aerial image sequences, integrating multiple advanced techniques to enhance detection accuracy. The proposed model begins with preprocessing using Multiscale Retinex (MSR) to enhance image quality, followed by Expectation-Maximization (EM) Segmentation for precise foreground object identification. Vehicle detection is performed using the state-of-the-art YOLOv10 framework, while feature extraction incorporates Maximally Stable Extremal Regions (MSER), Dense Scale-Invariant Feature Transform (Dense SIFT), and Zernike Moments Features to capture distinct object characteristics. The extracted features are then refined through a Hybrid Swarm-based Optimization algorithm, which selects the features used for classification. The final classification is conducted using a Vision Transformer, leveraging its robust learning capabilities for enhanced accuracy. Experimental evaluations on benchmark datasets, including UAVDT and the Unmanned Aerial Vehicle Intruder Dataset (UAVID), demonstrate the superiority of the proposed approach, achieving an accuracy of 94.40% on UAVDT and 93.57% on UAVID. The results highlight the efficacy of the model in enhancing vehicle detection and classification in aerial imagery, offering a statistically validated improvement over existing approaches for intelligent traffic monitoring systems.
Salient object detection (SOD) models struggle to simultaneously preserve global structure, maintain sharp object boundaries, and sustain computational efficiency in complex scenes. In this study, we propose SPSALNet, a task-driven two-stage (macro–micro) architecture that restructures the SOD process around superpixel representations. The proposed approach follows a “split-and-enhance” principle, introduced to our knowledge for the first time in the SOD literature, which hierarchically classifies superpixels and then applies targeted refinement only to ambiguous or error-prone regions. At the macro stage, the image is partitioned into content-adaptive superpixel regions, and each superpixel is represented by a high-dimensional region-level feature vector. These representations define a regional decomposition problem in which superpixels are assigned to three classes: background, object interior, and transition regions. Superpixel tokens interact with a global feature vector from a deep network backbone through a cross-attention module and are projected into an enriched embedding space that jointly encodes local topology and global context. At the micro stage, the model employs a U-Net-based refinement process that allocates computational resources only to ambiguous transition regions. The image and distance–similarity maps derived from superpixels are processed through a dual-encoder pathway. Subsequently, channel-aware fusion blocks adaptively combine information from these two sources, producing sharper and more stable object boundaries. Experimental results show that SPSALNet achieves high accuracy at lower computational cost than recent competing methods. On the PASCAL-S and DUT-OMRON datasets, SPSALNet exhibits a clear performance advantage across all key metrics, and it ranks first on accuracy-oriented measures on HKU-IS. On the challenging DUT-OMRON benchmark, SPSALNet reaches an MAE of 0.034. Across all datasets, it preserves object boundaries and regional structure in a stable and competitive manner.
Brain tumors require precise segmentation for diagnosis and treatment planning due to their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation technology reduces the burden on medical staff and provides quantitative information, existing methodologies and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges and maximize the performance of brain tumor segmentation, this research introduces a novel SwinUNETR-based model that integrates a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), with a powerful SwinUNETR encoder. The HCAD decoder block utilizes hierarchical features and channel-specific attention mechanisms to further fuse information at different scales transmitted from the encoder and to preserve spatial details throughout the reconstruction phase. Rigorous evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieved superior segmentation accuracy on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET) compared to baseline models. In particular, ablation studies clarified the rationale and contribution of the model design and verified the effectiveness of the proposed HCAD decoder block. The results of this study are expected to contribute to the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
High-resolution remote sensing images (HRSIs) are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies. However, their significant scale changes and wealth of spatial details pose challenges for semantic segmentation. While convolutional neural networks (CNNs) excel at capturing local features, they are limited in modeling long-range dependencies. Conversely, transformers utilize multihead self-attention to integrate global context effectively, but this approach often incurs a high computational cost. This paper proposes a global-local multiscale context network (GLMCNet) to extract both global and local multiscale contextual information from HRSIs. A detail-enhanced filtering module (DEFM) is proposed at the end of the encoder to refine the encoder outputs further, thereby enhancing the key details extracted by the encoder and effectively suppressing redundant information. In addition, a global-local multiscale transformer block (GLMTB) is proposed in the decoding stage to enable the modeling of rich multiscale global and local information. We also design a stair fusion mechanism to transmit deep semantic information from deep to shallow layers progressively. Finally, we propose the semantic awareness enhancement module (SAEM), which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention. Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method. Specifically, our method achieved a mean Intersection over Union (mIoU) of 86.89% on the ISPRS Potsdam dataset and 84.34% on the ISPRS Vaihingen dataset, outperforming existing models such as ABCNet and BANet.
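The mean Intersection over Union (mIoU) figure quoted in the abstract above is conventionally computed from a class confusion matrix, as the following sketch shows. The matrix values and the function name are hypothetical, and the averaging convention (skipping classes absent from both prediction and ground truth) is an assumption; benchmark toolkits may differ in such details.

```python
def mean_iou(confusion):
    """Mean per-class IoU from a square confusion matrix.

    confusion[i][j] is the number of pixels of true class i predicted as class j.
    Classes that appear in neither the ground truth nor the prediction are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # class c missed
        fp = sum(confusion[r][c] for r in range(n)) - tp   # others labeled as c
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical two-class confusion matrix (rows: truth, columns: prediction).
confusion = [[8, 2],
             [1, 9]]
score = mean_iou(confusion)
```

Here class 0 has IoU 8/11 and class 1 has IoU 9/12, so the mean is about 0.739; averaging per class rather than per pixel keeps rare classes from being swamped by dominant ones.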
This systematic review aims to comprehensively examine and compare deep learning methods for brain tumor segmentation and classification using MRI and other imaging modalities, focusing on recent trends from 2022 to 2025. The primary objective is to evaluate methodological advancements, model performance, dataset usage, and existing challenges in developing clinically robust AI systems. We included peer-reviewed journal articles and high-impact conference papers published between 2022 and 2025, written in English, that proposed or evaluated deep learning methods for brain tumor segmentation and/or classification. Excluded were non-open-access publications, books, and non-English articles. A structured search was conducted across Scopus, Google Scholar, Wiley, and Taylor & Francis, with the last search performed in August 2025. Risk of bias was not formally quantified but considered during full-text screening based on dataset diversity, validation methods, and availability of performance metrics. We used narrative synthesis and tabular benchmarking to compare performance metrics (e.g., accuracy, Dice score) across model types (CNN, Transformer, Hybrid), imaging modalities, and datasets. A total of 49 studies were included (43 journal articles and 6 conference papers). These studies spanned over 9 public datasets (e.g., BraTS, Figshare, REMBRANDT, MOLAB) and utilized a range of imaging modalities, predominantly MRI. Hybrid models, especially ResViT and UNetFormer, consistently achieved high performance, with classification accuracy exceeding 98% and segmentation Dice scores above 0.90 across multiple studies. Transformers and hybrid architectures showed increasing adoption post-2023. Many studies lacked external validation and were evaluated only on a few benchmark datasets, raising concerns about generalizability and dataset bias. Few studies addressed clinical interpretability or uncertainty quantification. Despite promising results, particularly for hybrid deep learning models, widespread clinical adoption remains limited due to lack of validation, interpretability concerns, and real-world deployment barriers.
Funding: Supported through the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grants 341275, CRDPJ 543894-19, NSERC/Energi Simulation Industrial Research Chair program.
Abstract: Fractures are critical to subsurface activities such as oil and gas extraction, geothermal energy production, and carbon storage. Hydraulic fracturing, a technique that enhances fluid production, creates complex fracture networks within rock formations containing natural discontinuities. Accurately distinguishing between hydraulically induced fractures and pre-existing discontinuities is essential for understanding hydraulic fracture mechanisms. However, this remains challenging due to the interconnected nature of fractures in three-dimensional (3D) space. Manual segmentation, while adaptive, is both labor-intensive and subjective, making it impractical for large-scale 3D datasets. This study introduces a deep learning-based progressive cross-sectional segmentation method to automate the classification of 3D fracture volumes. The proposed method was applied to a 3D hydraulic fracture network in a Montney cube sample, successfully segmenting natural fractures, parted bedding planes, and hydraulic fractures with minimal user intervention. The automated approach achieves a 99.6% reduction in manual image processing workload while maintaining high segmentation accuracy, with test accuracy exceeding 98% and F1-score over 84%. This approach generalizes well to Brazilian disc samples with different fracture patterns, achieving consistently high accuracy in distinguishing between bedding and non-bedding fractures. This automated fracture segmentation method offers an effective tool for enhanced quantitative characterization of fracture networks, which would contribute to a deeper understanding of hydraulic fracturing processes.
Abstract: Segmenting vegetation in color images is a complex task, especially when the background and lighting conditions of the environment are uncontrolled. This paper proposes a vegetation segmentation algorithm that combines a supervised and an unsupervised learning method to segment healthy and diseased plant images from the background. During the training stage, a Self-Organizing Map (SOM) neural network is applied to create different color groups from a set of images containing vegetation, acquired from a tomato greenhouse. The color groups are labeled as vegetation and non-vegetation and then used to create two color histogram models, one for each class. In the online mode, input images are segmented by a Bayesian classifier using the two histogram models. In uncontrolled environments, this algorithm segments images containing plant foliage qualitatively better than a color index technique, eliminating the background while preserving important color information. As future work, this segmentation method will be applied to disease diagnosis of tomato plants in greenhouses.
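The online classification step described above can be illustrated with a minimal sketch. It assumes that labeled color groups are already available (the SOM training stage is not reproduced), and the bin count, prior, smoothing constant, and all function names are hypothetical choices for illustration, not details taken from the paper.

```python
from collections import Counter

def quantize(pixel, bins=4):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse histogram bin."""
    step = 256 // bins
    return tuple(channel // step for channel in pixel)

def build_histogram(pixels, bins=4):
    """Normalized color histogram (bin -> probability) from labeled pixels."""
    counts = Counter(quantize(p, bins) for p in pixels)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}

def is_vegetation(pixel, veg_hist, bg_hist, prior_veg=0.5, bins=4, eps=1e-6):
    """Bayes decision: compare P(color|veg)P(veg) with P(color|bg)P(bg).

    eps stands in for the likelihood of colors never seen during training.
    """
    b = quantize(pixel, bins)
    p_veg = veg_hist.get(b, eps) * prior_veg
    p_bg = bg_hist.get(b, eps) * (1.0 - prior_veg)
    return p_veg > p_bg

# Toy training pixels (hypothetical values standing in for labeled color groups).
veg_pixels = [(30, 160, 40), (50, 180, 60), (20, 140, 30)]
bg_pixels = [(200, 200, 200), (120, 90, 70), (220, 210, 190)]
veg_hist = build_histogram(veg_pixels)
bg_hist = build_histogram(bg_pixels)

# Classify one greenish pixel and one light-gray pixel.
mask = [is_vegetation(p, veg_hist, bg_hist) for p in [(40, 170, 50), (210, 205, 195)]]
```

A coarse histogram keeps the likelihood tables small; a real system would likely use finer bins and a prior estimated from the training images.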
Abstract: Acting as a pilot of the Square Kilometer Array (SKA), the Five hundred meter Aperture Spherical Telescope (FAST) project puts forward many innovative ideas, among which the design of the active main reflector shows fascinating potential. The main spherical reflector is to be composed of thousands of small spherical panels, which can be adjusted to fit a paraboloid of revolution in real time. For both construction and performance, the rms of the fit must be optimized, so appropriate dimensional limits for the panels need to be determined. This paper addresses how to divide the spherical reflector mathematically. The advantages and drawbacks of various segmenting methods are discussed and an optimum one is suggested.
Abstract: Segmenting Arabic handwriting has been a subject of research in the field of Arabic character recognition for more than 25 years. The majority of reported segmentation techniques share a critical shortcoming: over-segmentation. The aim of segmentation is to produce the letters (segments) of a handwritten word. Over-segmentation occurs when a resulting letter (segment) is made of more than one piece (stroke) instead of one. Our objective is to overcome this problem by using an Artificial Neural Network (ANN) to verify the resulting segment. We propose a set of heuristic-based rules to assemble strokes in order to report the precise segmented letters. Preprocessing phases that include normalization and feature extraction are required as a prerequisite step for the ANN system for recognition and verification. In our previous work [1], we achieved a segmentation success rate of 86% but without recognition. In this work, our experimental results confirmed a segmentation success rate of no less than 95%.
Abstract: The wide availability, low radiation dose and short acquisition time of Cone-Beam CT (CBCT) scans make them an attractive source of data for compiling databases of anatomical structures. However, CBCT has higher noise and lower contrast than helical slice CT, which makes segmentation more challenging, and the optimal methods are not yet known. This paper evaluates several methods of segmenting airway geometries (nares, nasal cavities and pharynx) from typical dental quality head and neck CBCT data. The nasal cavity has narrow and intricate passages and is separated from the paranasal sinuses by thin walls, making it susceptible to either over- or under-segmentation. The upper airway was split into two: the nasal cavity and the pharyngeal region (nasopharynx to larynx). Each part was segmented using global thresholding, multi-step level-set, and region competition methods (the latter using thresholding, clustering and classification initialisation and edge attraction techniques). The segmented 3D surfaces were evaluated against a reference manual segmentation using distance-, overlap- and volume-based metrics. Global thresholding, multi-step level-set, and region competition all gave satisfactory results for the lower part of the airway (nasopharynx to larynx). Edge attraction failed completely. A semi-automatic region-growing segmentation with multi-thresholding (or classification) initialization offered the best quality segmentation. With some minimal manual editing, it resulted in an accurate upper airway model, as judged by the similarity and volumetric indices, while being the least time consuming of the semi-automatic methods, and relying the least on the operator’s expertise.
Abstract: Visual attention mechanisms allow humans to extract relevant and important information from raw input percepts. Many applications in robotics and computer vision have modeled human visual attention mechanisms using a bottom-up, data-centric approach. In contrast, recent studies in cognitive science highlight advantages of a top-down approach to the attention mechanisms, especially in applications involving goal-directed search. In this paper, we propose a top-down approach for extracting salient objects/regions of space. The top-down methodology first isolates different objects in an unorganized point cloud, and then compares the objects for uniqueness. A measure of saliency using the properties of geodesic distance on the object’s surface is defined. Our method works on 3D point cloud data, and identifies salient objects of high curvature and unique silhouette. These, being the most distinctive features of a scene, are robust to clutter, occlusions and viewpoint changes. We provide the details of the proposed method and initial experimental results.
Funding: Supported by the National Key R&D Program of China (2024YFD2001100, 2024YFE0214300); the National Natural Science Foundation of China (No. 62162008); Guizhou Provincial Science and Technology Projects ([2024]002, CXTD[2023]027); the Guizhou Province Youth Science and Technology Talent Project ([2024]317); the Guiyang Guian Science and Technology Talent Training Project ([2024]2-15); and the Talent Introduction Program of Guizhou University under Grant No. (2021)89.
Abstract: Segmentation of vegetation remote sensing images can minimize the interference of background, thus achieving efficient monitoring and analysis of vegetation information. The segmentation of vegetation poses a significant challenge due to the inherently complex environmental conditions. Currently, there is a growing trend of using spectral sensing combined with deep learning for field vegetation segmentation to cope with complex environments. However, two major constraints remain: the equipment required for field spectral data collection is costly, and field datasets are scarce because data annotation is time-consuming and labor-intensive. To address these challenges, we propose a weakly supervised approach for field vegetation segmentation by using spectral reconstruction (SR) techniques as the foundation and drawing on the theory of vegetation index (VI). Specifically, to reduce the cost of data acquisition, we propose SRCNet and SRANet, based on convolution and attention structures respectively, to reconstruct multispectral images of fields. Then, borrowing from the VI principle, we aggregate the reconstructed data to establish the connection of spectral bands, obtaining more salient vegetation information. Finally, we employ the adaptation strategy to segment the fused feature map using a weakly supervised method, which does not require manual labeling to obtain a field vegetation segmentation result. Our segmentation method can achieve a Mean Intersection over Union (MIoU) of 0.853 on real field datasets, which outperforms the existing methods. In addition, we have open-sourced a dataset of unmanned aerial vehicle (UAV) RGB-multispectral images, comprising 2358 pairs of samples, to improve the richness of remote sensing agricultural data. The code and data are available at egment_SR,and.
Funding: Supported by the National Key R&D Program of China (No. 2022YFC2504403); the National Natural Science Foundation of China (No. 62172202); the Experiment Project of China Manned Space Program (No. HYZHXM01019); and the Fundamental Research Funds for the Central Universities from Southeast University (No. 3207032101C3).
Abstract: Organoids possess immense potential for unraveling the intricate functions of human tissues and facilitating preclinical disease treatment. Their applications span from high-throughput drug screening to the modeling of complex diseases, with some even achieving clinical translation. Changes in the overall size, shape, boundary, and other morphological features of organoids provide a noninvasive method for assessing organoid drug sensitivity. However, the precise segmentation of organoids in bright-field microscopy images is made difficult by the complexity of the organoid morphology and interference, including overlapping organoids, bubbles, dust particles, and cell fragments. This paper introduces the precision organoid segmentation technique (POST), which is a deep-learning algorithm for segmenting challenging organoids under simple bright-field imaging conditions. Unlike existing methods, POST accurately segments each organoid and eliminates various artifacts encountered during organoid culturing and imaging. Furthermore, it is sensitive to and aligns with measurements of organoid activity in drug sensitivity experiments. POST is expected to be a valuable tool for drug screening using organoids owing to its capability of automatically and rapidly eliminating interfering substances and thereby streamlining the organoid analysis and drug screening process.
Abstract: Because a single monolithic mirror cannot be manufactured at the 10-meter scale, segmented mirrors have become indispensable tools in modern astronomical research. However, to match the imaging performance of the monolithic counterpart, the sub-mirrors must maintain precise co-phasing. Piston error critically degrades segmented mirror imaging quality, necessitating efficient and precise detection. Two limitations motivate this work: the conventional circular-aperture diffraction with two-wavelength algorithm is susceptible to decentration errors, and traditional convolutional neural networks (CNNs) struggle to capture global features under large-range piston errors due to their restricted local receptive fields. To address them, this paper proposes a method that integrates extended Young’s interference principles with a Vision Transformer (ViT) to detect piston error. The method suppresses decentration error interference through two symmetrically arranged apertures and extends the measurement range to ±7.95 μm via a two-wavelength (589 nm/600 nm) algorithm. It exploits ViT’s self-attention mechanism to model global characteristics of interference fringes. Unlike CNNs constrained by local convolutional kernels, the ViT significantly improves sensitivity to interferogram periodicity. The simulation results demonstrate that the proposed method achieves a measurement accuracy of 5 nm (0.0083λ0) across the range of ±7.95 μm, while maintaining an accuracy exceeding 95% in the presence of Gaussian noise (SNR ≥ 15 dB), Poisson noise (λ ≥ 9 photons/pixel), and sub-mirror gap error (Egap ≤ 0.2) interference. Moreover, the detection speed shows significant improvement compared to the cross-correlation algorithm. This study establishes an accurate, robust framework for segmented mirror error detection, advancing high-precision astronomical observation.
Funding: This study was conducted within the project FraxVir “Detection, characterisation and analyses of the occurrence of viruses and ash dieback in special stands of Fraxinus excelsior - a supplementary study to the FraxForFuture demonstration project” and receives funding via the Waldklimafonds (WKF) funded by the German Federal Ministry of Food and Agriculture (BMEL) and Federal Ministry for the Environment, Nature Conservation, Nuclear Safety and Consumer Protection (BMUV), administrated by the Agency for Renewable Resources (FNR) under grant agreement 2220WK40A4.
Abstract: Detailed individual tree crown segmentation is highly relevant for the detection and monitoring of Fraxinus excelsior L. trees affected by ash dieback, a major threat to common ash populations across Europe. In this study, both fine and coarse crown segmentation methods were applied to close-range multispectral UAV imagery. The fine tree crown segmentation method utilized a novel unsupervised machine learning approach based on a blended NIR-NDVI image, whereas the coarse segmentation relied on the segment anything model (SAM). Both methods successfully delineated tree crown outlines; however, only the fine segmentation accurately captured internal canopy gaps. Despite these structural differences, mean NDVI values calculated per tree crown revealed no significant differences between the two approaches, indicating that coarse segmentation is sufficient for mean vegetation index assessments. Nevertheless, the fine segmentation revealed increased heterogeneity in NDVI values in more severely damaged trees, underscoring its value for detailed structural and health analyses. Furthermore, the fine segmentation workflow proved transferable to both individual UAV images and orthophotos from broader UAV surveys. For applications focused on structural integrity and spatial variation in canopy health, the fine segmentation approach is recommended.
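The per-crown NDVI comparison mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the standard definition NDVI = (NIR - Red) / (NIR + Red), represents bands and crown masks as nested lists, and all names and pixel values are hypothetical rather than taken from the study.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total else 0.0

def crown_ndvi_stats(nir_img, red_img, mask):
    """Mean and population variance of NDVI over pixels where mask is truthy."""
    values = [
        ndvi(n, r)
        for nir_row, red_row, mask_row in zip(nir_img, red_img, mask)
        for n, r, m in zip(nir_row, red_row, mask_row)
        if m
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

# Toy 2x2 reflectance bands (hypothetical values).
nir_img = [[0.8, 0.6], [0.5, 0.4]]
red_img = [[0.2, 0.2], [0.5, 0.1]]

coarse_mask = [[1, 1], [1, 1]]  # whole crown outline
fine_mask = [[1, 1], [0, 1]]    # same crown with one canopy-gap pixel excluded

coarse_mean, coarse_var = crown_ndvi_stats(nir_img, red_img, coarse_mask)
fine_mean, fine_var = crown_ndvi_stats(nir_img, red_img, fine_mask)
```

Returning the variance alongside the mean gives a simple handle on within-crown heterogeneity, which is the quantity the abstract reports as differing between segmentation approaches.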
Funding: Supported by the Shenzhen Science and Technology Program (No. JCYJ20240813152704006); the National Natural Science Foundation of China (No. 62401259); the Fundamental Research Funds for the Central Universities (No. NZ2024036); the Postdoctoral Fellowship Program of CPSF (No. GZC20242228); and the High Performance Computing Platform of Nanjing University of Aeronautics and Astronautics.
Abstract: AIM: To construct an intelligent segmentation scheme for precise localization of central serous chorioretinopathy (CSC) leakage points, thereby enabling ophthalmologists to deliver accurate laser treatment without navigational laser equipment. METHODS: A dataset with dual labels (point-level and pixel-level) was first established based on fundus fluorescein angiography (FFA) images of CSC and subsequently divided into training (102 images), validation (40 images), and test (40 images) datasets. An intelligent segmentation method was then developed, based on the You Only Look Once version 8 Pose Estimation (YOLOv8-Pose) model and segment anything model (SAM), to segment CSC leakage points. Next, the YOLOv8-Pose model was trained for 200 epochs, and the best-performing model was selected to form the optimal combination with SAM. Additionally, five classic U-Net variants [i.e., U-Net, recurrent residual U-Net (R2U-Net), attention U-Net (AttU-Net), recurrent residual attention U-Net (R2AttUNet), and nested U-Net (UNet++)] were initialized with three random seeds and trained for 200 epochs, resulting in a total of 15 baseline models for comparison. Finally, based on metrics including Dice similarity coefficient (DICE), intersection over union (IoU), precision, recall, precision-recall (PR) curve, and receiver operating characteristic (ROC) curve, the proposed method was compared with baseline models through quantitative and qualitative experiments for leakage point segmentation, thereby demonstrating its effectiveness. RESULTS: As the number of training epochs increased, the mAP50-95, recall, and precision of the YOLOv8-Pose model rose significantly and then stabilized, and the model achieved a preliminary localization success rate of 90% (i.e., 36 images) for CSC leakage points in 40 test images. Using expert-annotated pixel-level labels as the ground truth, the proposed method achieved a DICE of 57.13%, an IoU of 45.31%, a precision of 45.91%, a recall of 93.57%, an area under the PR curve (AUC-PR) of 0.78 and an area under the ROC curve (AUC-ROC) of 0.97, which enables more accurate segmentation of CSC leakage points. CONCLUSION: By combining the precise localization capability of the YOLOv8-Pose model with the robust and flexible segmentation ability of SAM, the proposed method not only demonstrates the effectiveness of the YOLOv8-Pose model in detecting keypoint coordinates of CSC leakage points from the perspective of application innovation but also establishes a novel approach for accurate segmentation of CSC leakage points through the “detect-then-segment” strategy, thereby providing a potential auxiliary means for the automatic and precise real-time localization of leakage points during traditional laser photocoagulation for CSC.
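For readers unfamiliar with the overlap metrics reported above, the following minimal sketch computes DICE, IoU, precision, and recall for a pair of binary masks using their standard definitions. The function names and the toy masks are hypothetical; the study's own evaluation code and averaging scheme are not given in the abstract.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives pixel-wise."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn

def segmentation_metrics(pred, truth):
    """Standard overlap metrics for binary masks given as nested lists of 0/1."""
    tp, fp, fn = confusion_counts(pred, truth)
    union = tp + fp + fn
    return {
        # Two empty masks are treated as a perfect match.
        "dice": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
        "iou": tp / union if union else 1.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Toy example: the prediction covers the whole target but spills beyond it,
# giving high recall and lower precision.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[1, 1, 1, 1],
        [0, 1, 1, 1]]
metrics = segmentation_metrics(pred, truth)
```

Note that metrics averaged per image, as is common in segmentation benchmarks, generally do not satisfy DICE = 2PR/(P + R) on the averaged precision and recall, which holds only for a single pair of masks.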
Funding: Supported by grants from the Scientific Research Fund of Education Department of Yunnan Province (2023J767); the National Natural Science Foundation of China (82272963 and 82472718); the Health Research Project of Hunan Provincial Health Commission (W20242019); the Hunan Provincial Health High-Level Talent Scientific Research Project (R2023096); the Hunan Provincial Department of Science and Technology Health Industry Joint Fund (2024JJ9479); the Guangdong Province Basic and Applied Basic Research Foundation Project-Guangdong Province Natural Science Foundation (2024A1515220154); the "Leading Goose" Project of the Science and Technology Department of Zhejiang Province (2024C03049); the Major Project of Health Science and Technology Program of Zhejiang Province (WKJ-ZJ-2407); and the National Key Research and Development Program (2024YFB331170204).
Abstract: Background: Laparoscopic anatomic hepatectomy of segment 7 (LAH-S7) is a challenging surgery. In this study, we aimed to investigate surgical and oncological outcomes of various approaches of LAH-S7 in patients with hepatocellular carcinoma (HCC). A particular focus was placed on identifying the Glissonean pedicle of segment 7 (G7) and the intersegmental plane. Given the scarcity of comprehensive reviews or comparative studies on clinical outcomes, we also sought to analyze the experiences and advantages associated with different approaches in relation to the anatomic variations of G7. Methods: The clinical data of 124 patients who underwent LAH-S7 for HCC across seven tertiary referral medical centers in China were retrospectively analyzed. Three surgical approaches were categorized based on the procedures used for G7 identification: the indocyanine green (ICG) fluorescence positive staining approach (IFPA), the Glissonean approach (GA), and the hepatic vein-guided approach (HVGA). Subsequently, the postoperative short-term results and oncological outcomes of the three different approaches were compared. Results: The distribution of surgical approaches among the patients was as follows: IFPA in 16 (12.9%), GA in 62 (50.0%), and HVGA in 46 (37.1%) patients. Complications were observed in 27 (21.8%) patients. The 1-, 3-, and 5-year overall survival (OS) rates were 99.1%, 89.2%, and 84.7%, respectively. The 1-, 3-, and 5-year recurrence-free survival (RFS) rates were 99.0%, 84.7%, and 69.3%, respectively. The OS and RFS rates were comparable across the three approaches. Conclusions: Following a standardized surgical procedure, LAH-S7 is demonstrated to be safe and yields favorable oncological outcomes. Surgeons performing LAH-S7 should select the appropriate surgical approach based on the anatomical characteristics and variations of G7.
Funding: The National Natural Science Foundation of China (Grant No. 42203073 and 41472072); the Basic Scientific Research Fund of the Institute of Geology, CAGS (Grant No. J2317); and the Sichuan Science and Technology Program (Grant No. 2023NSFSC0272).
Abstract: Strontianite-rich carbonatite, containing over 30 vol% carbonate minerals predominantly composed of strontianite (SrCO3), is identified in the Zhengjialiangzi ore segment of the Muluozhai rare earth element (REE) deposit, western Sichuan Province, China. It exhibits a unique mineral assemblage dominated by strontianite, fluorite, bastnäsite, barite, calcite and dolomite, distinguishing it from conventional calcio-, magnesio-, ferro-, or natro-carbonatites. The rock shows extreme enrichment in REEs (ΣREE = 47335-64367 ppm), with strong LREE/HREE fractionation [(La/Yb)N = 1151-2119] and notably high concentrations of high-value critical REEs (e.g., Pr, Nd, Tb, Dy), 5-10 times greater than those in local calcite-dominated carbonatites. Trace element patterns indicate significant enrichment in REEs, Sr, and Ba, along with depletion in high-field-strength elements (HFSEs; e.g., Nb, Ta, Zr, Hf). In-situ Sr isotopes of strontianite [(⁸⁷Sr/⁸⁶Sr)i = 0.706190-0.707305] indicate an enriched mantle source (EMI-EMII). Sr enrichment is attributed to initial mantle source enrichment and extensive fractional crystallization, possibly accompanied by minor wall-rock assimilation. We propose that the strontianite-rich carbonatite formed from a highly evolved, Sr- and REE-rich carbonatitic magma that intruded into shallow structural breccias, followed by rapid cooling. Its formation is associated with a continuous melt-fluid evolutionary process that is characteristic of carbonatitic systems.
Abstract: Weakly Supervised Semantic Segmentation (WSSS), which relies only on image-level labels, has attracted significant attention for its cost-effectiveness and scalability. Existing methods mainly enhance inter-class distinctions and employ data augmentation to mitigate semantic ambiguity and reduce spurious activations. However, they often neglect the complex contextual dependencies among image patches, resulting in incomplete local representations and limited segmentation accuracy. To address these issues, we propose the Context Patch Fusion with Class Token Enhancement (CPF-CTE) framework, which exploits contextual relations among patches to enrich feature representations and improve segmentation. At its core, the Contextual-Fusion Bidirectional Long Short-Term Memory (CF-BiLSTM) module captures spatial dependencies between patches and enables bidirectional information flow, yielding a more comprehensive understanding of spatial correlations. This strengthens feature learning and segmentation robustness. Moreover, we introduce learnable class tokens that dynamically encode and refine class-specific semantics, enhancing discriminative capability. By effectively integrating spatial and semantic cues, CPF-CTE produces richer and more accurate representations of image content. Extensive experiments on PASCAL VOC 2012 and MS COCO 2014 validate that CPF-CTE consistently surpasses prior WSSS methods.
Funding: Financially supported by the Open Project Program of Wuhan National Laboratory for Optoelectronics (No. 2022WNLOKF009); the National Natural Science Foundation of China (No. 62475216); the Key Research and Development Program of Shaanxi (No. 2024GH-ZDXM-37); the Fujian Provincial Natural Science Foundation of China (No. 2024J01060); the Startup Program of XMU; and the Fundamental Research Funds for the Central Universities.
Abstract: Microscopy imaging is fundamental in analyzing bacterial morphology and dynamics, offering critical insights into bacterial physiology and pathogenicity. Image segmentation techniques enable quantitative analysis of bacterial structures, facilitating precise measurement of morphological variations and population behaviors at single-cell resolution. This paper reviews advancements in bacterial image segmentation, emphasizing the shift from traditional thresholding and watershed methods to deep learning-driven approaches. Convolutional neural networks (CNNs), U-Net architectures, and three-dimensional (3D) frameworks excel at segmenting dense biofilms and resolving antibiotic-induced morphological changes. These methods combine automated feature extraction with physics-informed postprocessing. Despite progress, challenges persist in computational efficiency, cross-species generalizability, and integration with multimodal experimental workflows. Future progress will depend on improving model robustness across species and imaging modalities, integrating multimodal data for phenotype-function mapping, and developing standard pipelines that link computational tools with clinical diagnostics. These innovations will expand microbial phenotyping beyond structural analysis, enabling deeper insights into bacterial physiology and ecological interactions.
Abstract: Advanced traffic monitoring systems encounter substantial challenges in vehicle detection and classification due to the limitations of conventional methods, which often demand extensive computational resources and struggle with diverse data acquisition techniques. This research presents a novel approach for vehicle classification and recognition in aerial image sequences, integrating multiple advanced techniques to enhance detection accuracy. The proposed model begins with preprocessing using Multiscale Retinex (MSR) to enhance image quality, followed by Expectation-Maximization (EM) Segmentation for precise foreground object identification. Vehicle detection is performed using the state-of-the-art YOLOv10 framework, while feature extraction incorporates Maximally Stable Extremal Regions (MSER), Dense Scale-Invariant Feature Transform (Dense SIFT), and Zernike Moments Features to capture distinct object characteristics. Feature optimization is further refined through a Hybrid Swarm-based Optimization algorithm, ensuring optimal feature selection for improved classification performance. The final classification is conducted using a Vision Transformer, leveraging its robust learning capabilities for enhanced accuracy. Experimental evaluations on benchmark datasets, including UAVDT and the Unmanned Aerial Vehicle Intruder Dataset (UAVID), demonstrate the superiority of the proposed approach, achieving an accuracy of 94.40% on UAVDT and 93.57% on UAVID. The results highlight the efficacy of the model in significantly enhancing vehicle detection and classification in aerial imagery, outperforming existing methodologies and offering a statistically validated improvement for intelligent traffic monitoring systems.
Abstract: Salient object detection (SOD) models struggle to simultaneously preserve global structure, maintain sharp object boundaries, and sustain computational efficiency in complex scenes. In this study, we propose SPSALNet, a task-driven two-stage (macro–micro) architecture that restructures the SOD process around superpixel representations. In the proposed approach, a “split-and-enhance” principle, introduced to our knowledge for the first time in the SOD literature, hierarchically classifies superpixels and then applies targeted refinement only to ambiguous or error-prone regions. At the macro stage, the image is partitioned into content-adaptive superpixel regions, and each superpixel is represented by a high-dimensional region-level feature vector. These representations define a regional decomposition problem in which superpixels are assigned to three classes: background, object interior, and transition regions. Superpixel tokens interact with a global feature vector from a deep network backbone through a cross-attention module and are projected into an enriched embedding space that jointly encodes local topology and global context. At the micro stage, the model employs a U-Net-based refinement process that allocates computational resources only to ambiguous transition regions. The image and distance–similarity maps derived from superpixels are processed through a dual-encoder pathway. Subsequently, channel-aware fusion blocks adaptively combine information from these two sources, producing sharper and more stable object boundaries. Experimental results show that SPSALNet achieves high accuracy with lower computational cost compared to recent competing methods. On the PASCAL-S and DUT-OMRON datasets, SPSALNet exhibits a clear performance advantage across all key metrics, and it ranks first on accuracy-oriented measures on HKU-IS. On the challenging DUT-OMRON benchmark, SPSALNet reaches a MAE of 0.034. Across all datasets, it preserves object boundaries and regional structure in a stable and competitive manner.
Funding: Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Metaverse Support Program to Nurture the Best Talents (IITP-2024-RS-2023-00254529) grant funded by the Korea government (MSIT).
Abstract: Brain tumors require precise segmentation for diagnosis and treatment planning because of their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation reduces the burden on medical staff and provides quantitative information, existing methods and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges and improve brain tumor segmentation performance, this research introduces a novel SwinUNETR-based model that integrates a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), with a powerful SwinUNETR encoder. The HCAD decoder block uses hierarchical features and channel-specific attention mechanisms to fuse information at different scales transmitted from the encoder and to preserve spatial details throughout the reconstruction phase. Rigorous evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieves superior segmentation accuracy on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET) compared to baseline models. Ablation studies clarify the rationale and contribution of the model design and verify the effectiveness of the proposed HCAD decoder block. The results of this study are expected to contribute greatly to the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
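As a reminder of what the Dice score mentioned above measures, here is a minimal sketch for a single binary mask, such as one tumor subregion. It assumes the standard definition, 2|A∩B| / (|A| + |B|). The function name `dice_score`, the empty-mask convention, and the toy masks are assumptions for illustration, not the authors' evaluation code.

```python
import numpy as np


def dice_score(pred, target) -> float:
    """Dice coefficient between two binary masks of identical shape.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / denom)


# Toy masks with made-up values (not data from the paper).
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
dice = dice_score(pred, target)
print(f"Dice = {dice:.4f}")
```

In a multi-region setting, this would be evaluated once per subregion mask (for example WT, TC, and ET) and reported separately or averaged.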
Funding: Provided by the Science Research Project of Hebei Education Department under Grant No. BJK2024115.
Abstract: High-resolution remote sensing images (HRSIs) are now an essential data source for gathering surface information, owing to advances in remote sensing data capture technologies. However, their significant scale changes and wealth of spatial detail pose challenges for semantic segmentation. While convolutional neural networks (CNNs) excel at capturing local features, they are limited in modeling long-range dependencies. Conversely, transformers use multihead self-attention to integrate global context effectively, but this approach often incurs a high computational cost. This paper proposes a global-local multiscale context network (GLMCNet) to extract both global and local multiscale contextual information from HRSIs. A detail-enhanced filtering module (DEFM) is placed at the end of the encoder to further refine the encoder outputs, enhancing the key details extracted by the encoder and effectively suppressing redundant information. In addition, a global-local multiscale transformer block (GLMTB) is proposed in the decoding stage to model rich multiscale global and local information. We also design a stair fusion mechanism that progressively transmits semantic information from deep to shallow layers. Finally, we propose the semantic awareness enhancement module (SAEM), which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention. Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method. Specifically, our method achieved a mean Intersection over Union (mIoU) of 86.89% on the ISPRS Potsdam dataset and 84.34% on the ISPRS Vaihingen dataset, outperforming existing models such as ABCNet and BANet.
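The mIoU metric reported above can be illustrated with a minimal sketch that builds a confusion matrix from integer label maps and averages the per-class IoU. The function name `mean_iou`, the choice to skip classes absent from both maps, and the toy labels are assumptions for illustration. The abstract does not state how the authors handle ignored classes or aggregate over images.

```python
import numpy as np


def mean_iou(pred, target, num_classes: int) -> float:
    """Mean Intersection over Union from integer label maps.

    Hypothetical helper for illustration; not taken from the GLMCNet paper.
    """
    pred = np.asarray(pred).astype(np.int64).ravel()
    target = np.asarray(target).astype(np.int64).ravel()
    if pred.shape != target.shape:
        raise ValueError("label maps must contain the same number of pixels")
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(
        target * num_classes + pred, minlength=num_classes * num_classes
    ).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    # Assumed convention: classes absent from both maps are left out of the mean.
    valid = union > 0
    return float(np.mean(intersection[valid] / union[valid]))


# Toy labels with made-up values (not data from the paper).
target = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 1, 1, 1, 2, 0])
miou = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {miou:.4f}")
```

For a dataset-level score, the confusion matrix is usually accumulated over all images before the per-class IoU is computed, rather than averaging per-image values.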
Abstract: This systematic review aims to comprehensively examine and compare deep learning methods for brain tumor segmentation and classification using MRI and other imaging modalities, focusing on recent trends from 2022 to 2025. The primary objective is to evaluate methodological advancements, model performance, dataset usage, and existing challenges in developing clinically robust AI systems. We included peer-reviewed journal articles and high-impact conference papers published between 2022 and 2025, written in English, that proposed or evaluated deep learning methods for brain tumor segmentation and/or classification. Excluded were non-open-access publications, books, and non-English articles. A structured search was conducted across Scopus, Google Scholar, Wiley, and Taylor & Francis, with the last search performed in August 2025. Risk of bias was not formally quantified but was considered during full-text screening based on dataset diversity, validation methods, and availability of performance metrics. We used narrative synthesis and tabular benchmarking to compare performance metrics (e.g., accuracy, Dice score) across model types (CNN, Transformer, Hybrid), imaging modalities, and datasets. A total of 49 studies were included (43 journal articles and 6 conference papers). These studies spanned over 9 public datasets (e.g., BraTS, Figshare, REMBRANDT, MOLAB) and used a range of imaging modalities, predominantly MRI. Hybrid models, especially ResViT and UNetFormer, consistently achieved high performance, with classification accuracy exceeding 98% and segmentation Dice scores above 0.90 across multiple studies. Transformers and hybrid architectures showed increasing adoption after 2023. Many studies lacked external validation and were evaluated on only a few benchmark datasets, raising concerns about generalizability and dataset bias. Few studies addressed clinical interpretability or uncertainty quantification. Despite promising results, particularly for hybrid deep learning models, widespread clinical adoption remains limited due to lack of validation, interpretability concerns, and real-world deployment barriers.