The quantitative analysis of dispersed phases (bubbles, droplets, and particles) in multiphase flow systems represents a persistent technological challenge in petroleum engineering applications, including CO2-enhanced oil recovery, foam flooding, and unconventional reservoir development. Current characterization methods remain constrained by labor-intensive manual workflows and limited dynamic analysis capabilities, particularly for processing large-scale microscopy data and video sequences that capture critical transient behavior such as gas cluster migration and droplet coalescence. These limitations hinder the establishment of robust correlations between pore-scale flow patterns and reservoir-scale production performance. This study introduces a novel computer vision framework that integrates foundation models with lightweight neural networks to address these industry challenges. Leveraging the segment anything model's zero-shot learning capability, we developed an automated workflow that achieves an approximately 29-fold efficiency improvement in bubble labeling compared to manual methods while maintaining less than 2% deviation from expert annotations. Engineering-oriented optimization ensures lightweight deployment with 94% segmentation accuracy, while the integrated quantification system precisely resolves gas saturation, shape factors, and interfacial dynamics, parameters critical for optimizing gas injection strategies and predicting phase redistribution patterns. Validated through microfluidic gas-liquid displacement experiments for discontinuous-phase segmentation accuracy, this methodology enables precise bubble morphology quantification with broad application potential in multiphase systems, including emulsion droplet dynamics characterization and particle transport behavior analysis. This work bridges the critical gap between pore-scale dynamics characterization and reservoir-scale simulation requirements, providing a foundational framework for intelligent flow diagnostics and predictive modeling in next-generation digital oilfield systems.
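In the 2D image plane, two of the quantities this abstract names reduce to simple ratios over the segmented bubble masks: areal gas saturation is the gas-pixel fraction, and a common shape factor is the circularity 4πA/P². A minimal sketch under those assumptions (the function names are illustrative, not the authors' code):

```python
import math

def gas_saturation(mask):
    """Areal gas saturation: gas pixels / total pixels in a 2D binary mask."""
    total = sum(len(row) for row in mask)
    gas = sum(sum(row) for row in mask)
    return gas / total

def circularity(area, perimeter):
    """Shape factor 4*pi*A/P^2: 1.0 for a circle, < 1 for irregular bubbles."""
    return 4.0 * math.pi * area / perimeter ** 2

# A 4x4 mask with 4 gas pixels -> saturation 4/16 = 0.25.
mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
print(gas_saturation(mask))  # 0.25

# An ideal circle of radius r: A = pi*r^2, P = 2*pi*r -> circularity 1.0.
r = 3.0
print(circularity(math.pi * r ** 2, 2 * math.pi * r))
```

Deviations of circularity below 1 flag elongated or deforming bubbles, which is what makes it useful for tracking interfacial dynamics frame to frame.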
Efficient segmentation of oiled pixels in optical remotely sensed images is the precondition of optical identification and classification of different spilled oils, and it remains one of the keys to optical remote sensing of oil spills. Optical remotely sensed images of oil spills are inherently multidimensional and embedded with a complex knowledge framework. This complexity often hinders the effectiveness of mechanistic algorithms across varied scenarios. Although optical remote-sensing theory for oil spills has advanced, the scarcity of curated datasets and the difficulty of collecting them limit their usefulness for training deep learning models. This study introduces a data expansion strategy that utilizes the Segment Anything Model (SAM), effectively bridging the gap between traditional mechanism algorithms and emergent self-adaptive deep learning models. Optical dimension reduction is achieved through standardized preprocessing that addresses the decipherable properties of the input image. After preprocessing, SAM can swiftly and accurately segment spilled oil in images. The unified AI-based workflow significantly accelerates labeled-dataset creation and has proven effective for both rapid emergency intelligence during spill incidents and the rapid mapping and classification of oil footprints across China's coastal waters. Our results show that coupling a remote sensing mechanism with a foundation model enables near-real-time, large-scale monitoring of complex surface slicks and offers guidance for the next generation of detection and quantification algorithms.
Large-scale unsupervised semantic segmentation (LUSS) is a sophisticated process that aims to segment similar areas within an image without relying on labeled training data. While existing methodologies have made substantial progress in this area, there is ample scope for enhancement. We thus introduce the PASS-SAM model, a comprehensive solution that amalgamates the benefits of various models to improve segmentation performance.
Automating the identification, localization, and monitoring of roadway assets distributed widely across the roadway network is critical for the traffic management system, as it can efficiently provide up-to-date information in support of transportation asset management (TAM). Collecting videos with vehicle-mounted cameras and processing the data with computer vision (CV)-based deep learning methods is garnering increased attention from transportation agencies. While promising, challenges arise from the lack of high-quality annotations for roadway assets in images, difficulties in identifying these assets, and limited existing solutions. The segment anything model (SAM), a visual foundation model, demonstrates robust zero-shot capability for general image segmentation under various prompts. This study evaluates SAM's applicability and efficiency in extracting roadway assets from images. Specifically, it examines the impacts of model size and prompt quality on SAM's performance in segmenting roadway assets. Five state-of-the-art semantic segmentation models are trained and compared with SAM. Results show that a lightweight SAM with human-rendered prompts outperforms the five semantic segmentation models. Based on the evaluation results, future work will explore incorporating SAM into transportation asset management applications, promoting collaboration between human experts and artificial intelligence.
Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information in medical imaging and faces the challenge of losing local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address these issues. First, building on SAM, a local multi-scale feature encoder is introduced to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce quadratic complexity and noise interference. Subsequently, a cross-branch balancing adapter is designed to balance the local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to allow a smaller input image size and to mitigate overlap in patch embeddings, the input size is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, comprising feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates the information from small-sized medical images into SAM, enhancing its suitability for clinical deployment. The proposed model demonstrates an average improvement ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets compared with eight other representative image segmentation models. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
AIM: To construct an intelligent segmentation scheme for precise localization of central serous chorioretinopathy (CSC) leakage points, thereby enabling ophthalmologists to deliver accurate laser treatment without navigational laser equipment. METHODS: A dataset with dual labels (point-level and pixel-level) was first established based on fundus fluorescein angiography (FFA) images of CSC and subsequently divided into training (102 images), validation (40 images), and test (40 images) datasets. An intelligent segmentation method was then developed, based on the You Only Look Once version 8 Pose Estimation (YOLOv8-Pose) model and the segment anything model (SAM), to segment CSC leakage points. Next, the YOLOv8-Pose model was trained for 200 epochs, and the best-performing model was selected to form the optimal combination with SAM. Additionally, five classic U-Net-series models [i.e., U-Net, recurrent residual U-Net (R2U-Net), attention U-Net (AttU-Net), recurrent residual attention U-Net (R2AttU-Net), and nested U-Net (UNet++)] were initialized with three random seeds and trained for 200 epochs, yielding 15 baseline models for comparison. Finally, based on metrics including the Dice similarity coefficient (DICE), intersection over union (IoU), precision, recall, the precision-recall (PR) curve, and the receiver operating characteristic (ROC) curve, the proposed method was compared with the baseline models through quantitative and qualitative experiments on leakage point segmentation, thereby demonstrating its effectiveness. RESULTS: With increasing training epochs, the mAP50-95, recall, and precision of the YOLOv8-Pose model rose significantly and tended to stabilize, and the model achieved a preliminary localization success rate of 90% (i.e., 36 images) for CSC leakage points in the 40 test images. Using manually expert-annotated pixel-level labels as the ground truth, the proposed method achieved a DICE of 57.13%, an IoU of 45.31%, a precision of 45.91%, a recall of 93.57%, an area under the PR curve (AUC-PR) of 0.78, and an area under the ROC curve (AUC-ROC) of 0.97, enabling more accurate segmentation of CSC leakage points. CONCLUSION: By combining the precise localization capability of the YOLOv8-Pose model with the robust and flexible segmentation ability of SAM, the proposed method not only demonstrates the effectiveness of the YOLOv8-Pose model in detecting keypoint coordinates of CSC leakage points from the perspective of application innovation but also establishes a novel approach for accurate segmentation of CSC leakage points through the "detect-then-segment" strategy, thereby providing a potential auxiliary means for the automatic and precise real-time localization of leakage points during traditional laser photocoagulation for CSC.
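The four overlap metrics reported above all derive from the same confusion counts between a predicted and a ground-truth binary mask. A minimal pure-Python sketch of those standard definitions (illustrative only, not the authors' evaluation code):

```python
def mask_metrics(pred, truth):
    """Dice, IoU, precision and recall from two flat binary masks (0/1 lists)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

pred  = [1, 1, 1, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0]
print(mask_metrics(pred, truth))
# dice ~0.667, iou 0.5, precision ~0.667, recall ~0.667
```

Note the asymmetry these formulas encode: a method that over-segments around true leakage points, like the one above (recall 93.57% vs. precision 45.91%), is penalized more on precision, Dice, and IoU than on recall.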
Detailed individual tree crown segmentation is highly relevant for the detection and monitoring of Fraxinus excelsior L. trees affected by ash dieback, a major threat to common ash populations across Europe. In this study, both fine and coarse crown segmentation methods were applied to close-range multispectral UAV imagery. The fine tree crown segmentation method utilized a novel unsupervised machine learning approach based on a blended NIR-NDVI image, whereas the coarse segmentation relied on the segment anything model (SAM). Both methods successfully delineated tree crown outlines; however, only the fine segmentation accurately captured internal canopy gaps. Despite these structural differences, mean NDVI values calculated per tree crown revealed no significant differences between the two approaches, indicating that coarse segmentation is sufficient for mean vegetation index assessments. Nevertheless, the fine segmentation revealed increased heterogeneity in NDVI values in more severely damaged trees, underscoring its value for detailed structural and health analyses. Furthermore, the fine segmentation workflow proved transferable to both individual UAV images and orthophotos from broader UAV surveys. For applications focused on structural integrity and spatial variation in canopy health, the fine segmentation approach is recommended.
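The per-crown NDVI values compared between the two segmentation approaches follow the standard definition NDVI = (NIR − Red)/(NIR + Red), averaged over the pixels inside each delineated crown. A small illustrative sketch (variable names are ours, not the study's pipeline; bands are flat lists of reflectances):

```python
def ndvi(nir, red):
    """Per-pixel normalized difference vegetation index."""
    return (nir - red) / (nir + red)

def mean_crown_ndvi(nir_band, red_band, crown_mask):
    """Mean NDVI over the pixels flagged 1 in a crown mask (flat lists)."""
    vals = [ndvi(n, r) for n, r, m in zip(nir_band, red_band, crown_mask) if m]
    return sum(vals) / len(vals)

nir  = [0.8, 0.7, 0.2, 0.9]
red  = [0.2, 0.3, 0.2, 0.1]
mask = [1, 1, 0, 1]  # third pixel is a canopy gap, excluded by a fine mask
print(mean_crown_ndvi(nir, red, mask))  # mean of 0.6, 0.4, 0.8 -> ~0.6
```

This also makes the study's finding intuitive: a coarse crown mask that includes gap pixels shifts individual values but can leave the crown mean nearly unchanged, while the spread of per-pixel NDVI (the heterogeneity signal) is far more sensitive to which pixels are kept.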
Water leakage inspection in tunnels is a critical engineering job that has attracted increasing concern. Leakage area detection via manual inspection is time-consuming and can produce unreliable findings, so automated techniques are needed to increase reliability and efficiency. Pre-trained foundational segmentation models for large datasets have attracted great interest recently. This paper proposes a novel SAM-based network for accurate automated water leakage inspection. The contributions of this paper include the efficient adaptation of the SAM (Segment Anything Model) for shield tunnel water leakage segmentation and the demonstration of its application effect through data experiments. The Tunnel SAM Adapter achieves satisfactory performance, reaching 76.2% mIoU and 77.5% Dice. Experimental results demonstrate that our approach has advantages over peer studies and helps guarantee the integrity and safety of these vital assets while streamlining tunnel maintenance.
Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple transformations of images. Later, to increase the diversity and complexity of data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods require massive computation for training or searching. In this paper, a novel training-free method that utilises the pre-trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM-DA) is proposed to generate augmented annotations for images. Without any training, it obtains prompt boxes from the original annotations and then feeds the boxes to the pre-trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than by simple manipulations, without incurring the huge computation of training a data augmentation model. Comparative experiments are conducted on three datasets: an in-house dataset, ADE20K, and COCO2017. On the in-house dataset, namely the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM prove promising not only for image segmentation but also for data augmentation.
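The "prompt boxes from the original annotations" step amounts to taking the tight bounding box of each annotated region; the real pipeline then passes such boxes to a pretrained SAM as box prompts. A pure-Python sketch of the box-extraction half only (our illustration of the idea, not PTSAM-DA's code):

```python
def mask_to_box(mask):
    """Tight (x_min, y_min, x_max, y_max) bounding box of a 2D binary mask,
    the XYXY layout commonly used for box prompts; None if the mask is empty."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))

mask = [[0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
print(mask_to_box(mask))  # (1, 1, 3, 2)
```

Because the box discards the annotation's exact outline, SAM is free to redraw the boundary from image evidence, which is precisely where the "diverse and improved" annotations come from.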
Existing sandstone rock structure evaluation methods rely on visual inspection, with low efficiency, only semi-quantitative analysis of roundness, and an inability to perform classified statistics in particle size analysis. This study presents an intelligent evaluation method for sandstone rock structure based on the Segment Anything Model (SAM). By developing a lightweight SAM fine-tuning method with rank-decomposition matrix adapters, a multispectral rock particle segmentation model named CoreSAM is constructed, which achieves rock particle edge extraction and type identification. Building upon this, we propose a comprehensive quantitative evaluation system for rock structure, assessing parameters including particle size, sorting, roundness, particle contact, and cementation types. The experimental results demonstrate that CoreSAM outperforms existing methods in rock particle segmentation accuracy while showing excellent generalization across different image types such as CT scans and core photographs. The proposed method enables full-sample, classified particle size analysis and quantitative characterization of parameters like roundness, advancing reservoir evaluation towards more precise, quantitative, intuitive, and comprehensive development.
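Two of the quantitative parameters above have simple closed forms once per-grain areas are available from the segmentation: an equivalent circular diameter per grain, and a sorting coefficient summarizing the spread of those diameters. The sketch below uses the classical Trask sorting coefficient So = sqrt(Q1/Q3) (quartile diameters, coarser over finer); whether CoreSAM uses this exact measure is our assumption, and the percentile rule is deliberately crude:

```python
import math

def equivalent_diameter(area):
    """Diameter of the circle with the same area as a segmented grain."""
    return 2.0 * math.sqrt(area / math.pi)

def trask_sorting(diameters):
    """Trask sorting coefficient So = sqrt(Q1/Q3), with Q1 the coarser (75th
    percentile) and Q3 the finer (25th percentile) diameter; So = 1 means
    perfectly sorted (uniform) grains, larger So means poorer sorting."""
    d = sorted(diameters)

    def pct(p):  # simple nearest-rank percentile, adequate for a sketch
        return d[min(len(d) - 1, int(p / 100 * len(d)))]

    return math.sqrt(pct(75) / pct(25))

areas = [math.pi * r * r for r in (1.0, 1.0, 1.0, 1.0)]  # four equal grains
ds = [equivalent_diameter(a) for a in areas]
print(ds[0])              # 2.0 (radius-1 grain)
print(trask_sorting(ds))  # 1.0 for perfectly uniform grains
```

The "full-sample, classified" claim then corresponds to running such statistics per particle type rather than over the pooled grain population.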
Dear Editor, This letter presents techniques to simplify dataset generation for instance segmentation of raw meat products, a critical step toward automating food production lines. Accurate segmentation is essential for addressing challenges such as occlusions, indistinct edges, and stacked configurations, which demand large, diverse datasets. To meet these demands, we propose two complementary approaches: a semi-automatic annotation interface using tools such as the segment anything model (SAM) and GrabCut, and a synthetic data generation pipeline leveraging 3D-scanned models. These methods reduce reliance on real meat, mitigate food waste, and improve scalability. Experimental results demonstrate that incorporating synthetic data enhances segmentation model performance and, when combined with real data, further boosts accuracy, paving the way for more efficient automation in the food industry.
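The synthetic-data idea above, compositing renders of scanned products onto backgrounds so every sample ships with an exact annotation for free, can be sketched at its simplest as mask-guided pasting. A toy illustration with 2D lists standing in for rendered views (the actual pipeline works from 3D-scanned models, and these names are ours):

```python
def composite(background, obj, obj_mask, top, left):
    """Paste obj onto background where obj_mask is 1; return the composited
    image plus the instance annotation mask obtained for free."""
    img = [row[:] for row in background]
    ann = [[0] * len(row) for row in background]
    for dy, mrow in enumerate(obj_mask):
        for dx, m in enumerate(mrow):
            if m:
                img[top + dy][left + dx] = obj[dy][dx]
                ann[top + dy][left + dx] = 1
    return img, ann

bg = [[0] * 4 for _ in range(4)]
obj = [[7, 7], [7, 7]]
obj_mask = [[1, 0], [1, 1]]  # non-rectangular object silhouette
img, ann = composite(bg, obj, obj_mask, 1, 1)
print(img)  # [[0, 0, 0, 0], [0, 7, 0, 0], [0, 7, 7, 0], [0, 0, 0, 0]]
print(ann)  # [[0, 0, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
```

Pasting several objects with overlapping placements is what lets such pipelines generate the occlusions and stacked configurations that are hard to annotate by hand.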
"Fairy circles" are a spatially self-organized structure in coastal salt-marsh vegetation ecosystems, with important effects on the productivity, stability, and resilience of salt-marsh wetlands. UAV imagery is a key data source for high-precision identification of fairy-circle locations and for interpreting their spatio-temporal evolution trends and patterns, but fairy-circle pixels differ only slightly from background pixels in color and shape, so intelligently and accurately identifying fairy-circle pixels in 2D imagery and grouping the identified pixels into individual fairy circles remains a technical challenge. This paper proposes a UAV-image fairy-circle segmentation and classification method that combines the Segment Anything Model (SAM) visual segmentation model with random-forest machine learning, enabling the identification and extraction of individual fairy circles. First, Sørensen-Dice coefficient (Dice) and Intersection over Union (IOU) metrics are constructed to select a pre-trained model from SAM and optimize its parameters, achieving fully automatic image segmentation that yields attribute-free segmentation masks/classes. Then, red, green, and blue (RGB) channel information and 2D spatial coordinates are used to match the segmentation masks with the original image, feature indicators are constructed for each mask, and the features are analyzed and selected according to out-of-bag (OOB) error reduction and feature distribution patterns. Finally, the selected features are used to train a random-forest model, enabling automatic identification and classification of fairy-circle vegetation, ordinary vegetation, and bare flats. Experimental results show that the proposed method achieves an average correct extraction rate of 96.1% and an average false extraction rate of 9.5%, providing methodological and technical support for accurately characterizing the spatio-temporal patterns of fairy circles and for processing coastal UAV remote-sensing imagery.
Recently, Meta AI Research released a general, promptable segment anything model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B). Without a doubt, the emergence of SAM will yield significant benefits for a wide array of practical image segmentation applications. In this study, we conduct a series of intriguing investigations into the performance of SAM across various applications, particularly in the fields of natural images, agriculture, manufacturing, remote sensing, and healthcare. We analyze and discuss the benefits and limitations of SAM, while also presenting an outlook on its future development in segmentation tasks. By doing so, we aim to give a comprehensive understanding of SAM's practical applications. This work is expected to provide insights that facilitate future research activities toward generic segmentation. Source code is publicly available at https://github.com/LiuTingWed/SAM-Not-Perfect.
Funding: Supported by the Sichuan Province Outstanding Young Scientist Fund (Grant No. 2025NSFJQ0009) and the Sichuan Regional Innovation Cooperation Fund (Grant No. 2025YFHZ0270).
Funding: The National Natural Science Foundation of China under contract No. 42371380; the National Key Research and Development Program of China under contract No. 2023YFC2811800; the Fundamental Research Funds for the Central Universities under contract No. 0904-14380035.
Funding: The authors would like to express their deepest gratitude for the financial support from the National Science Foundation (No. ECCS-2026357) and the Rural Equitable and Accessible Transportation (REAT) Center, a Tier-1 University Transportation Center funded by the United States Department of Transportation (No. 69A3552348321).
Funding: Supported by the Natural Science Foundation Programme of Gansu Province (No. 24JRRA231), the National Natural Science Foundation of China (No. 62061023), and the Gansu Provincial Science and Technology Plan Key Research and Development Program Project (No. 24YFFA024).
Funding: Supported by the Shenzhen Science and Technology Program (No. JCYJ20240813152704006), the National Natural Science Foundation of China (No. 62401259), the Fundamental Research Funds for the Central Universities (No. NZ2024036), the Postdoctoral Fellowship Program of CPSF (No. GZC20242228), and the High Performance Computing Platform of Nanjing University of Aeronautics and Astronautics.
Funding: This study was conducted within the project FraxVir "Detection, characterisation and analyses of the occurrence of viruses and ash dieback in special stands of Fraxinus excelsior - a supplementary study to the FraxForFuture demonstration project" and receives funding via the Waldklimafonds (WKF), funded by the German Federal Ministry of Food and Agriculture (BMEL) and the Federal Ministry for the Environment, Nature Conservation, Nuclear Safety and Consumer Protection (BMUV), administrated by the Agency for Renewable Resources (FNR) under grant agreement 2220WK40A4.
Abstract: Detailed individual tree crown segmentation is highly relevant for the detection and monitoring of Fraxinus excelsior L. trees affected by ash dieback, a major threat to common ash populations across Europe. In this study, both fine and coarse crown segmentation methods were applied to close-range multispectral UAV imagery. The fine tree crown segmentation method used a novel unsupervised machine learning approach based on a blended NIR-NDVI image, whereas the coarse segmentation relied on the segment anything model (SAM). Both methods successfully delineated tree crown outlines; however, only the fine segmentation accurately captured internal canopy gaps. Despite these structural differences, mean NDVI values calculated per tree crown revealed no significant differences between the two approaches, indicating that coarse segmentation is sufficient for mean vegetation index assessments. Nevertheless, the fine segmentation revealed increased heterogeneity of NDVI values in more severely damaged trees, underscoring its value for detailed structural and health analyses. Furthermore, the fine segmentation workflow proved transferable to both individual UAV images and orthophotos from broader UAV surveys. For applications focused on structural integrity and spatial variation in canopy health, the fine segmentation approach is recommended.
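The per-crown comparison described above reduces to computing NDVI from the NIR and red bands and averaging it inside each crown mask. A minimal sketch, with hypothetical toy reflectance arrays in place of the UAV bands:

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray, eps: float = 1e-9) -> np.ndarray:
    """Per-pixel normalized difference vegetation index; eps avoids division by zero."""
    return (nir - red) / (nir + red + eps)

def mean_ndvi_per_crown(nir, red, crown_masks):
    """Mean NDVI inside each boolean crown mask (one value per delineated crown)."""
    index = ndvi(nir, red)
    return [float(index[m].mean()) for m in crown_masks]

# Hypothetical 2x2 reflectance bands and a single crown covering all pixels.
nir = np.full((2, 2), 0.8)
red = np.full((2, 2), 0.2)
crown = np.ones((2, 2), dtype=bool)
means = mean_ndvi_per_crown(nir, red, [crown])
```

With a fine segmentation, the masks exclude internal canopy gaps, so the per-crown NDVI distribution (not just the mean) becomes informative for damage assessment.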
Funding: Funded by the National Natural Science Foundation of China (Nos. 62171114, 52222810) and the Fundamental Research Funds for the Central Universities (No. DUT22RC(3)099).
Abstract: Water leakage inspection in tunnels is a critical engineering job that has attracted increasing concern. Leakage area detection via manual inspection is time-consuming and can produce unreliable findings, so automated techniques should be developed to increase reliability and efficiency. Pre-trained foundational segmentation models trained on large datasets have attracted great interest recently. This paper proposes a novel SAM-based network for accurate automated water leakage inspection. The contributions of this paper include the efficient adaptation of SAM (Segment Anything Model) for shield tunnel water leakage segmentation and a demonstration of its effectiveness through data experiments. The Tunnel SAM Adapter delivers satisfactory performance, achieving 76.2% mIoU and 77.5% Dice. Experimental results demonstrate that our approach has advantages over peer studies and helps guarantee the integrity and safety of these vital assets while streamlining tunnel maintenance.
Funding: Natural Science Foundation of Zhejiang Province, Grant/Award Number: LY23F020025; Science and Technology Commissioner Program of Huzhou, Grant/Award Number: 2023GZ42; Sichuan Provincial Science and Technology Support Program, Grant/Award Numbers: 2023ZHCG0005, 2023ZHCG0008.
Abstract: Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple transformations of images. Later, to increase the diversity and complexity of data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods required substantial computation for training or searching. In this paper, a novel training-free method that uses the pre-trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM-DA) is proposed to generate augmented annotations for images. Without any training, it derives prompt boxes from the original annotations and feeds the boxes to the pre-trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than by simple manipulations, without incurring the heavy computation of training a data augmentation model. Comparative experiments are conducted on three datasets: an in-house dataset, ADE20K, and COCO2017. On the in-house dataset, the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM prove promising not only for image segmentation but also for data augmentation.
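The prompt-box step of PTSAM-DA, deriving a box from an existing pixel annotation before handing it to SAM, can be sketched as below. This is a minimal illustration with a toy mask; the SAM predictor call itself is omitted, and only the XYXY box format commonly used for SAM box prompts is assumed:

```python
import numpy as np

def mask_to_prompt_box(mask: np.ndarray) -> tuple:
    """Tight (x_min, y_min, x_max, y_max) bounding box of a binary annotation mask,
    suitable as a box prompt for a pre-trained segmentation model such as SAM."""
    ys, xs = np.nonzero(mask)          # row (y) and column (x) indices of foreground pixels
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Toy annotation: a 3x2 foreground blob inside a 6x6 image.
mask = np.zeros((6, 6), dtype=bool)
mask[2:5, 1:3] = True
box = mask_to_prompt_box(mask)
# The box would then be fed to the pre-trained SAM to regenerate a diverse,
# refined mask that replaces or augments the original annotation.
```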
Funding: Supported by the National Natural Science Foundation of China (42372175, 72088101), the PetroChina Science and Technology Project (2023DJ84), and the Basic Research Cooperation Project between China National Petroleum Corporation and Peking University.
Abstract: Existing sandstone rock structure evaluation methods rely on visual inspection, with low efficiency, only semi-quantitative analysis of roundness, and an inability to perform classified statistics in particle size analysis. This study presents an intelligent evaluation method for sandstone rock structure based on the Segment Anything Model (SAM). By developing a lightweight SAM fine-tuning method with rank-decomposition matrix adapters, a multispectral rock particle segmentation model named CoreSAM is constructed, which achieves rock particle edge extraction and type identification. Building upon this, we propose a comprehensive quantitative evaluation system for rock structure, assessing parameters including particle size, sorting, roundness, particle contact, and cementation type. Experimental results demonstrate that CoreSAM outperforms existing methods in rock particle segmentation accuracy while showing excellent generalization across different image types such as CT scans and core photographs. The proposed method enables full-sample, classified particle size analysis and quantitative characterization of parameters such as roundness, advancing reservoir evaluation toward more precise, quantitative, intuitive, and comprehensive development.
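Rank-decomposition matrix adapters of the kind used to fine-tune CoreSAM follow the LoRA pattern: a frozen pre-trained weight W is augmented with a low-rank update B·A, so only r·(d_in + d_out) parameters are trained instead of d_in·d_out. A NumPy sketch with illustrative dimensions (not CoreSAM's actual ones):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 256, 256, 8              # illustrative dims; rank r is much smaller than d

W = rng.standard_normal((d_out, d_in))    # frozen pre-trained weight (never updated)
A = rng.standard_normal((r, d_in)) * 0.01 # trainable down-projection
B = np.zeros((d_out, r))                  # trainable up-projection; zero-init so the
                                          # adapter initially leaves W's output unchanged

def adapted_forward(x: np.ndarray) -> np.ndarray:
    """Forward pass with the low-rank adapter added to the frozen weight."""
    return W @ x + B @ (A @ x)

trainable = A.size + B.size               # parameters actually updated in fine-tuning
full = W.size                             # parameters a full fine-tune would touch
x = rng.standard_normal(d_in)
y = adapted_forward(x)
```

Here only 4096 of 65536 parameters are trained, which is what keeps the fine-tuned model lightweight enough for deployment.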
Funding: Supported by the European Union's Horizon Europe research and innovation programme, project AGILEHAND (Smart Grading, Handling and Packaging Solutions for Soft and Deformable Products in Agile and Reconfigurable Lines) (101092043).
Abstract: Dear Editor, This letter presents techniques to simplify dataset generation for instance segmentation of raw meat products, a critical step toward automating food production lines. Accurate segmentation is essential for addressing challenges such as occlusions, indistinct edges, and stacked configurations, which demand large, diverse datasets. To meet these demands, we propose two complementary approaches: a semi-automatic annotation interface using tools such as the segment anything model (SAM) and GrabCut, and a synthetic data generation pipeline leveraging 3D-scanned models. These methods reduce reliance on real meat, mitigate food waste, and improve scalability. Experimental results demonstrate that incorporating synthetic data enhances segmentation model performance and, when combined with real data, further boosts accuracy, paving the way for more efficient automation in the food industry.
Abstract: "Fairy circles" are a spatially self-organized structure in coastal salt marsh vegetation ecosystems that strongly influences the productivity, stability, and resilience of salt marsh wetlands. UAV imagery is an important data source for high-precision identification of fairy circle locations and for interpreting their spatiotemporal evolution, but fairy circle pixels differ only slightly from background pixels in colour and shape, so intelligently and accurately identifying fairy circle pixels in two-dimensional imagery and grouping the identified pixels into individual fairy circles remains a technical challenge. This paper proposes a UAV image fairy circle segmentation and classification method that combines the Segment Anything Model (SAM) vision segmentation model with random forest machine learning, enabling the identification and extraction of individual fairy circles. First, Sørensen-Dice coefficient (Dice) and intersection over union (IoU) metrics are constructed to select a pre-trained SAM model and optimize its parameters, achieving fully automatic image segmentation that yields segmentation masks/classes without attribute information. Then, the red, green, and blue (RGB) channel information and two-dimensional spatial coordinates are used to match the segmentation masks to the original image and to construct feature indicators for each mask, and the features are analyzed and filtered according to the reduction of out-of-bag (OOB) error and the feature distributions. Finally, the selected features are used to train a random forest model, achieving automatic identification and classification of fairy circle vegetation, ordinary vegetation, and bare flats. Experimental results show that the proposed method achieves an average correct extraction rate of 96.1% and an average false extraction rate of 9.5% for fairy circles, providing methodological and technical support for accurately characterizing fairy circle spatiotemporal patterns and for processing coastal UAV remote sensing imagery.
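The feature-construction step described above, matching SAM masks back to the RGB image to build per-segment features for the random forest, can be sketched as below. The image and mask are toy placeholders, and the specific feature set (mean colour, area, centroid) is an illustrative choice rather than the paper's exact indicators:

```python
import numpy as np

def segment_features(rgb: np.ndarray, masks: list) -> np.ndarray:
    """One feature row per segmentation mask: mean R, G, B, mask area (pixels),
    and centroid (row, col), for downstream random-forest classification."""
    rows = []
    for m in masks:
        ys, xs = np.nonzero(m)
        mean_rgb = rgb[m].mean(axis=0)   # per-channel mean colour inside the mask
        rows.append([*mean_rgb, m.sum(), ys.mean(), xs.mean()])
    return np.array(rows)

# Toy 4x4 RGB image with a uniform 2x2 patch, and one mask covering that patch.
rgb = np.zeros((4, 4, 3))
rgb[:2, :2] = [0.9, 0.5, 0.1]
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
feats = segment_features(rgb, [mask])
```

Each row of `feats` would then be a training sample for the random forest, labelled as fairy circle vegetation, ordinary vegetation, or bare flat.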
Funding: Supported by Mitacs, CFI-JELF, and NSERC Discovery grants.
Abstract: Recently, Meta AI Research introduced a general, promptable segment anything model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B). Without a doubt, the emergence of SAM will yield significant benefits for a wide array of practical image segmentation applications. In this study, we conduct a series of intriguing investigations into the performance of SAM across various applications, particularly in the fields of natural images, agriculture, manufacturing, remote sensing, and healthcare. We analyze and discuss the benefits and limitations of SAM, while also presenting an outlook on its future development in segmentation tasks. In doing so, we aim to give a comprehensive understanding of SAM's practical applications. This work is expected to provide insights that facilitate future research toward generic segmentation. Source code is publicly available at https://github.com/LiuTingWed/SAM-Not-Perfect.