15 articles found
A medical image segmentation model based on SAM with an integrated local multi-scale feature encoder
1
Authors: DI Jing, ZHU Yunlong, LIANG Chan. Journal of Measurement Science and Instrumentation, 2025, Issue 3, pp. 359-370 (12 pages)
Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information in medical imaging and faces the challenge of losing local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address these issues. First, based on SAM, a local multi-scale feature encoder is introduced to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce the quadratic complexity and noise interference. Subsequently, a cross-branch balancing adapter is designed to balance the local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to allow a smaller input image size and to mitigate overlapping in patch embeddings, the input image is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, which includes feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates the information from small-sized medical images into SAM, enhancing its suitability for clinical deployment. Compared with eight other representative image segmentation models, the proposed model shows an average improvement ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
Keywords: segment anything model (SAM); medical image segmentation; encoder; decoder; multiaxial Hadamard product module (MHPM); cross-branch balancing adapter
Intelligent evaluation of sandstone rock structure based on a visual large model
2
Authors: REN Yili, ZENG Changmin, LI Xin, LIU Xi, HU Yanxu, SU Qianxiao, WANG Xiaoming, LIN Zhiwei, ZHOU Yixiao, ZHENG Zilu, HU Huiying, YANG Yanning, HUI Fang. Petroleum Exploration and Development, 2025, Issue 2, pp. 548-558 (11 pages)
Existing sandstone rock structure evaluation methods rely on visual inspection, with low efficiency, semi-quantitative analysis of roundness, and inability to perform classified statistics in particle size analysis. This study presents an intelligent evaluation method for sandstone rock structure based on the Segment Anything Model (SAM). By developing a lightweight SAM fine-tuning method with rank-decomposition matrix adapters, a multispectral rock particle segmentation model named CoreSAM is constructed, which achieves rock particle edge extraction and type identification. Building upon this, we propose a comprehensive quantitative evaluation system for rock structure, assessing parameters including particle size, sorting, roundness, particle contact, and cementation types. The experimental results demonstrate that CoreSAM outperforms existing methods in rock particle segmentation accuracy while showing excellent generalization across different image types such as CT scans and core photographs. The proposed method enables full-sample, classified particle size analysis and quantitative characterization of parameters such as roundness, advancing reservoir evaluation towards more precise, quantitative, intuitive, and comprehensive development.
Keywords: sandstone; rock structure; intelligent evaluation; Segment anything model; fine-tuning; particle edge extraction; type identification
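The "rank-decomposition matrix adapters" named in the abstract above suggest a low-rank adaptation scheme. The abstract does not describe CoreSAM's actual design, so the following is only a minimal sketch of the general idea, assuming a frozen weight matrix `w` plus a trainable low-rank update built from two small matrices `a` and `b`. The function names and the `alpha` scaling factor are illustrative assumptions, not the authors' code.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def low_rank_forward(w, a, b, x, alpha=1.0):
    """Return w @ x + (alpha / r) * b @ (a @ x).

    w: frozen weights, shape (d_out, d_in)
    a: trainable down-projection, shape (r, d_in)
    b: trainable up-projection, shape (d_out, r)
    """
    rank = len(a)
    base = matvec(w, x)                # frozen path
    update = matvec(b, matvec(a, x))   # low-rank trainable path
    scale = alpha / rank
    return [y + scale * u for y, u in zip(base, update)]


def adapter_param_count(d_in, d_out, rank):
    """Number of trainable values in the two adapter matrices."""
    return rank * (d_in + d_out)


w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 1.0]]            # rank 1
b_zero = [[0.0], [0.0]]     # zero-initialized: adapter has no effect yet
b_trained = [[1.0], [2.0]]
print(low_rank_forward(w, a, b_zero, [1.0, 2.0]))
print(low_rank_forward(w, a, b_trained, [1.0, 2.0]))
print(adapter_param_count(768, 768, 4), "trainable values vs", 768 * 768, "frozen")
```

With `b` initialized to zero the adapted layer reproduces the frozen layer exactly, which is why this style of fine-tuning can start from the pre-trained behavior while training only a small number of parameters.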
Application of an improved YOLOv8 algorithm to Camellia oleifera fruit sorting
3
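Authors: 刘姜毅, 高自成, 刘怀粤, 尹浇钦, 罗媛尹. 《林业工程学报》 (Journal of Forestry Engineering), PKU Core, 2025, Issue 1, pp. 120-127 (8 pages)
Existing Camellia oleifera fruit sorting systems rely on algorithms such as YOLO, whose object detection and instance segmentation are not robust on small and densely packed samples, so the robotic arm often grasps branches and leaves, grips insecurely, or drops fruit. Most systems use object recognition only, which cannot recover the exact contour of each fruit and therefore cannot classify fruit by size. To address this, the study proposes the YOWNet model for the small-target, high-density recognition task of Camellia oleifera fruit sorting. First, an automatic edge-annotation script is developed: it calls the zero-shot Segment Anything framework to extract regions of interest from the existing annotated detection boxes and converts them automatically into edge annotations. Second, to improve recognition of small targets, the study replaces existing convolution modules that have fixed receptive fields with a three-dimensional attention dynamic convolution module, designed for the characteristics of Camellia oleifera fruit, to capture key information in the feature maps. Finally, the Wise-IoU loss function, a bounding-box loss based on a dynamic non-monotonic focusing mechanism, is used to improve box regression accuracy. The overall network is named YOWNet. Ablation experiments against YOLOv8 on Camellia oleifera fruit show that YOWNet identifies fruit instances quickly and accurately; on a private dataset, accuracy and Box_loss reach 89.90% and 0.523, respectively.
Keywords: Camellia oleifera fruit; three-dimensional dynamic convolution; instance segmentation; YOLOv8; Segment anything model; Wise-IoU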
A SAM-based image-processing method for calculating rockfill gradation and its validation
4
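Authors: 张振伟, 蔡可天, 高轩, 贺一轩, 王建, 鲁洋. 《水力发电》 (Water Power), 2025, Issue 2, pp. 80-86 (7 pages)
Gradation testing of rockfill is an important part of quality control during rockfill dam construction. The traditional approach is manual on-site sieving, which covers few samples, is inefficient, and interferes with construction. This paper proposes an image-processing method for calculating rockfill gradation. It uses the Segment Anything Model (SAM), a general-purpose image segmentation foundation model recently open-sourced by Meta AI, to segment dam rockfill images automatically, and proposes morphological geometric parameters such as rock aspect ratio and area ratio to extract rock particle targets from the images. A rockfill morphology database and a rockfill instance segmentation database are also built to analyze parameter values and to validate the image-based gradation calculation. Validation tests show that the method effectively identifies rock particles in images, recognizes gradation curves automatically, and quickly computes gradation indices such as the coefficient of curvature and the coefficient of uniformity. The correlation between the gradation computed by this method and that measured by actual sieving reaches 0.94, with a mean absolute error of about 5%, so the method can effectively assist in measuring rockfill particle gradation during construction and support compaction quality control of rockfill dams.
Keywords: rockfill; gradation; Segment anything model (SAM); image recognition; rapid detection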
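The gradation indices named in the abstract above can be illustrated with the standard soil-mechanics definitions, Cu = D60 / D10 for the coefficient of uniformity and Cc = D30² / (D10 · D60) for the coefficient of curvature. The sketch below assumes a gradation curve is already available as sorted particle sizes with cumulative percent passing, and it interpolates linearly for simplicity. The function names and the sample curve are hypothetical, and the paper's own procedure may differ.

```python
def size_at_percent(sizes, passing, percent):
    """Linearly interpolate the particle size at which `percent` passes.

    sizes:   particle sizes in ascending order (e.g. mm)
    passing: cumulative percent passing at each size, ascending
    """
    points = list(zip(sizes, passing))
    for (s0, p0), (s1, p1) in zip(points, points[1:]):
        if p0 <= percent <= p1 and p1 > p0:
            return s0 + (s1 - s0) * (percent - p0) / (p1 - p0)
    raise ValueError("percent lies outside the gradation curve")


def gradation_indices(sizes, passing):
    """Return (Cu, Cc) computed from D10, D30 and D60."""
    d10 = size_at_percent(sizes, passing, 10)
    d30 = size_at_percent(sizes, passing, 30)
    d60 = size_at_percent(sizes, passing, 60)
    cu = d60 / d10                 # coefficient of uniformity
    cc = d30 ** 2 / (d10 * d60)    # coefficient of curvature
    return cu, cc


# Hypothetical curve: sizes in mm, cumulative percent passing.
sizes = [5, 10, 20, 40, 80, 100]
passing = [0, 10, 30, 60, 90, 100]
print(gradation_indices(sizes, passing))
```

Gradation curves are often interpolated on a logarithmic size axis in practice; linear interpolation is used here only to keep the sketch short.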
SAY-SOD: A large-model-optimized small-object detection framework for high-resolution remote sensing images
5
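Authors: 曾文龙, 贾海涛, 周昊哲, 程卓尔. 《网络安全与数据治理》 (Cyber Security and Data Governance), 2025, Issue S1, pp. 90-97 (8 pages)
As remote sensing technology develops, small-object detection in remote sensing images faces challenges such as complex backgrounds, small target sizes, and limited pixel information, and traditional detection algorithms show limitations in this area. This paper proposes a small-object detection framework based on the SAM large model and an improved YOLOv8. First, SAM extracts and segments regions of interest from the original remote sensing images; the segmented images then undergo multi-scale enhancement to make small objects more salient. The enhanced images, together with the index and location information of the original images, form a dataset used to train the improved YOLOv8 model. The improvements include an optimized feature pyramid network, an added attention mechanism, and a redesigned loss function. Experimental results show that the SAY-SOD framework effectively improves the accuracy and robustness of small-object detection in remote sensing images with complex backgrounds, and performs particularly well under varying scales and backgrounds.
Keywords: remote sensing images; small-object detection; Segment anything model; YOLOv8; feature pyramid network; data augmentation; attention mechanism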
Optimizing zero-shot text-based segmentation of remote sensing imagery using SAM and Grounding DINO
6
Authors: Mohanad Diab, Polychronis Kolokoussis, Maria Antonia Brovelli. Artificial Intelligence in Geosciences, 2025, Issue 1, pp. 14-24 (11 pages)
The use of AI technologies in remote sensing (RS) tasks has been the focus of many individuals in both the professional and academic domains. Having more accessible interfaces and tools that allow people with little or no experience to interact intuitively with RS data of multiple formats is a potential provided by this integration. However, the use of AI and AI agents to help automate RS-related tasks is still in its infancy, with some frameworks and interfaces built on top of well-known vision language models (VLM) such as GPT-4, the segment anything model (SAM), and Grounding DINO. These tools show promise and draw guidelines on the potential and limitations of existing solutions concerning the use of said models. In this work, state-of-the-art AI foundation models (FM) are reviewed and used in a multi-modal manner to ingest RS imagery input and perform zero-shot object detection using natural language. The natural language input is used to define the classes or labels the model should look for; then both inputs are fed to the pipeline. The pipeline presented in this work makes up for the shortcomings of the general-knowledge FMs by stacking pre-processing and post-processing applications on top of the FMs; these applications include tiling, to produce uniform patches of the original image for faster detection, and outlier rejection of redundant bounding boxes using statistical and machine learning methods. The pipeline was tested with UAV, aerial, and satellite images taken over multiple areas. The accuracy of the semantic segmentation improved from the original 64% to approximately 80%-99% by utilizing the pipeline and techniques proposed in this work. GitHub Repository: MohanadDiab/LangRS.
Keywords: Foundation models; Multi-modal models; Vision language models; Semantic segmentation; Segment anything model; Earth observation; Remote sensing
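The tiling step mentioned in the abstract above can be sketched as a function that splits an image extent into fixed-size windows. This is a simplified illustration and not the LangRS implementation: the function name is hypothetical, windows do not overlap, and tiles at the right and bottom edges are clipped to the image bounds rather than padded.

```python
def tile_windows(width, height, tile):
    """Return (x0, y0, x1, y1) pixel windows covering a width x height image.

    Windows are listed row by row; edge windows are clipped, so they can be
    smaller than `tile` when the image size is not a multiple of it.
    """
    if tile <= 0:
        raise ValueError("tile size must be positive")
    windows = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            windows.append((x, y, min(x + tile, width), min(y + tile, height)))
    return windows


for window in tile_windows(5, 4, 2):
    print(window)
```

Each window can then be cropped from the image and passed to a detector independently, with the window's `(x0, y0)` offset added back to any detected boxes to restore full-image coordinates.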
PASS-SAM:Integration of Segment Anything Model for Large-Scale Unsupervised Semantic Segmentation
7
Authors: Yin Tang, Rui Chen, Gensheng Pei, Qiong Wang. Computational Visual Media, 2025, Issue 3, pp. 669-674 (6 pages)
Large-scale unsupervised semantic segmentation (LUSS) is a sophisticated process that aims to segment similar areas within an image without relying on labeled training data. While existing methodologies have made substantial progress in this area, there is ample scope for enhancement. We thus introduce the PASS-SAM model, a comprehensive solution that amalgamates the benefits of various models to improve segmentation performance.
Keywords: segmentation performance; amalgamates benefits; various models; segment anything model; pass sam model; segment similar areas; large scale unsupervised semantic segmentation
Pre-trained SAM as data augmentation for image segmentation
8
Authors: Junjun Wu, Yunbo Rao, Shaoning Zeng, Bob Zhang. CAAI Transactions on Intelligence Technology, 2025, Issue 1, pp. 268-282 (15 pages)
Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple transformations of images. Later, in order to increase the diversity and complexity of data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods require a large amount of computation for training or searching. In this paper, a novel training-free method that utilises the pre-trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM-DA) is proposed to generate augmented annotations for images. Without the need for training, it obtains prompt boxes from the original annotations and then feeds the boxes to the pre-trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than by simple manipulations, without incurring the heavy computation of training a data augmentation model. Multiple comparative experiments are conducted on three datasets: an in-house dataset, ADE20K, and COCO2017. On the in-house dataset, the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM are shown to be promising not only in image segmentation but also in data augmentation.
Keywords: data augmentation; image segmentation; large model; segment anything model
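The step "obtains prompt boxes from the original annotations" in the abstract above can be illustrated by deriving a bounding box from a binary annotation mask. The abstract does not state how PTSAM-DA builds its boxes, so this is only a sketch under the assumption that each annotation is a 2D mask of 0/1 values; the function name and the inclusive (x_min, y_min, x_max, y_max) convention are illustrative choices. Passing the box to SAM is deliberately left out.

```python
def mask_to_box(mask):
    """Return the tight box (x_min, y_min, x_max, y_max) around nonzero pixels.

    mask: list of rows, each a list of 0/1 values. Coordinates are inclusive.
    Returns None when the mask contains no foreground pixel.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return (cols[0], rows[0], cols[-1], rows[-1])


annotation = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
print(mask_to_box(annotation))
```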
Exposed-rock ratio on lift surfaces of rock-filled concrete dams based on SAM and ImageJ image processing (cited by: 4)
9
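Authors: 安宇, 徐小蓉, 尹志刚, 金峰, 张喜喜. 《水资源与水工程学报》 (Journal of Water Resources and Water Engineering), CSCD, PKU Core, 2024, Issue 1, pp. 154-161 (8 pages)
Exposed rocks on the lift surfaces of rock-filled concrete dams provide important interlocking between the upper and lower layers, and their projected area ratio is an important index for evaluating interlayer shear performance. This study uses the segment anything model (SAM), a recent model from Meta AI, to segment exposed rocks on lift surfaces automatically, then uses ImageJ to post-process the SAM output and perform image calculations: smoothing, a difference algorithm, median filtering, and similar methods are applied to delineate the exposed rocks accurately, and the exposed-rock ratio of the surface is computed after binarization. The results show that SAM pre-segmentation identifies about 90% of the exposed rocks, that secondary processing in ImageJ effectively improves recognition of small rocks, and that the error relative to manual annotation is within ±3%. In engineering applications at two reservoirs in Guizhou Province, the pouring surfaces were partitioned into zones for preprocessing. The exposed-rock ratio differed considerably among zones near the upstream side, the middle, and the downstream side; most computed ratios were between 10% and 30%, and the ratio was lower in the transport corridor used to bring rock into the pour block. The findings can serve as a reference for research on the interlayer shear behavior of rock-filled concrete structures and on the safety and stability of dams during impoundment.
Keywords: rock-filled concrete dam; segment anything model (SAM); image processing; exposed-rock ratio; interlayer shear performance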
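The final step in the abstract above, computing the exposed-rock ratio after binarization, amounts to an area fraction. The sketch below assumes a grayscale image in which rock pixels are brighter than a fixed threshold; the threshold value, the brightness assumption, and the function name are hypothetical, and the ImageJ workflow in the paper is not reproduced here.

```python
def exposed_rock_ratio(gray, threshold=128):
    """Binarize a grayscale image and return the fraction of rock pixels.

    gray: list of rows of intensity values (0-255).
    Pixels with intensity >= threshold are counted as exposed rock.
    """
    pixels = [value for row in gray for value in row]
    if not pixels:
        raise ValueError("image is empty")
    rock = sum(1 for value in pixels if value >= threshold)
    return rock / len(pixels)


surface = [
    [200, 10],
    [130, 90],
]
print(exposed_rock_ratio(surface))
```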
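Extraction of salt-marsh vegetation "fairy circles" from UAV imagery by combining the SAM vision segmentation model with random forest machine learning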
10
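Authors: 周若彤, 谭凯, 杨建儒, 韩江涛, 张卫国. 《海洋学报》 (Haiyang Xuebao), CAS, CSCD, PKU Core, 2024, Issue 5, pp. 116-126 (11 pages)
"Fairy circles" are a spatially self-organized structure in coastal salt-marsh vegetation ecosystems and strongly influence the productivity, stability, and resilience of salt-marsh wetlands. UAV imagery is an important data source for locating fairy circles with high accuracy and for interpreting their spatiotemporal evolution. However, fairy-circle pixels differ little from background pixels in color and shape, so accurately and automatically identifying fairy-circle pixels in two-dimensional images and grouping the identified pixels into individual fairy circles remains technically difficult. This paper proposes a method for segmenting and classifying fairy circles in UAV imagery that combines the Segment Anything Model (SAM) with random forest machine learning, enabling identification and extraction of individual fairy circles. First, using the Sørensen-Dice coefficient (Dice) and Intersection over Union (IoU) as evaluation metrics, a pre-trained model is selected from SAM and its parameters are optimized to achieve fully automatic image segmentation, producing segmentation masks (segments) without attribute information. Next, the red, green, and blue (RGB) channel values and two-dimensional spatial coordinates are used to match the masks to the original image and to construct feature indices for each mask; the features are analyzed and selected according to the reduction in out-of-bag (OOB) error and their distributions. Finally, the selected features are used to train a random forest model that automatically identifies and classifies fairy-circle vegetation, ordinary vegetation, and bare tidal flat. Experiments show an average correct extraction rate of 96.1% and an average false extraction rate of 9.5% for fairy circles, providing methodological and technical support for accurately characterizing the spatiotemporal pattern of fairy circles and for processing coastal UAV remote sensing images.
Keywords: salt-marsh vegetation; fairy circles; segment anything model (SAM); UAV imagery; machine learning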
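The two evaluation metrics named in the abstract above have standard definitions: Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B| for a predicted mask A and a reference mask B. The sketch below computes both for binary masks given as nested lists; the function name and the convention of returning 1.0 for two empty masks are assumptions made for this example.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks of identical shape."""
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(p) + sum(t)        # |A| + |B|
    union = total - intersection   # |A ∪ B|
    if union == 0:
        return 1.0, 1.0            # both masks empty: treated as a perfect match
    return 2 * intersection / total, intersection / union


pred = [[1, 1], [0, 0]]
truth = [[1, 0], [1, 0]]
print(dice_and_iou(pred, truth))
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank candidate segmentations of a single image in the same order.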
A method for estimating building height from street-view images
11
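Authors: 戈士博, 刘纪平, 王勇, 车向红. 《遥感信息》 (Remote Sensing Information), CSCD, PKU Core, 2024, Issue 3, pp. 1-6 (6 pages)
Building height is basic data for three-dimensional city modeling, but existing building-height estimation studies mostly use remote sensing data such as LiDAR and SAR. With the rapid development of computing and the internet, street-view data, being easy and cheap to collect, has become an emerging data source for building-height estimation. This paper proposes a method for estimating building height from street-view images. First, the segment anything model is used to extract the pixel height of buildings in an image. Then, image metadata and electronic map data are used to obtain the distance between the building and the camera and the image focal length; the pinhole camera model is improved according to the geometric relationship between the street-view image and the physical building, yielding the height estimation method. Finally, Mapillary street-view images of Beijing and Berlin are used for experimental validation. The results show that, compared with the unmodified model, the improved pinhole camera model clearly increases estimation accuracy: RMSE falls by 11.31 m and R² rises by 0.4, indicating practical value.
Keywords: street-view images; building height estimation; pinhole camera model; segment anything model; Mapillary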
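The abstract above builds on the pinhole camera model, in which similar triangles give object height = pixel height × distance / focal length, with the focal length expressed in pixels. The sketch below shows only this basic, unimproved relation; the paper's modification is not described in the abstract and is not reproduced. The function name and the sample numbers are hypothetical.

```python
def pinhole_height(pixel_height, distance_m, focal_length_px):
    """Estimate real-world height in meters from the basic pinhole model.

    pixel_height:    height of the building in the image, in pixels
    distance_m:      camera-to-building distance, in meters
    focal_length_px: focal length expressed in pixels
    """
    if focal_length_px <= 0:
        raise ValueError("focal length must be positive")
    return pixel_height * distance_m / focal_length_px


# Hypothetical values: a facade 300 px tall, 50 m away, focal length 1000 px.
print(pinhole_height(300, 50.0, 1000.0))
```

The basic relation assumes the optical axis is perpendicular to the facade and that the whole building is visible in the frame; deviations from those conditions are a plausible motivation for a corrected model.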
Material-SAM:Adapting SAM for Material XCT
12
Authors: Xuelong Wu, Junsheng Wang, Zhongyao Li, Yisheng Miao, Chengpeng Xue, Yuling Lang, Decai Kong, Xiaoying Ma, Haibao Qiao. Computers, Materials & Continua, SCIE, EI, 2024, Issue 3, pp. 3703-3720 (18 pages)
X-ray Computed Tomography (XCT) enables non-destructive acquisition of the internal structure of materials, and image segmentation plays a crucial role in analyzing material XCT images. This paper proposes an image segmentation method based on the Segment Anything Model (SAM). We constructed a dataset of XCT images of carbides in nickel-based single crystal superalloys and preprocessed the images using median filtering, histogram equalization, and gamma correction. Subsequently, SAM was fine-tuned to adapt to the task of material XCT image segmentation, resulting in Material-SAM. We compared the performance of threshold segmentation, SAM, the U-Net model, and Material-SAM. Our method achieved 88.45% Class Pixel Accuracy (CPA) and 88.77% Dice Similarity Coefficient (DSC) on the test set, outperforming SAM by 5.25% and 8.81%, respectively, and achieving the best results among the compared methods. Material-SAM also demonstrated lower input requirements than SAM, as it required only three reference points to complete the segmentation task, one-fifth of the number SAM requires. Material-SAM exhibited promising results, highlighting its potential as a novel method for material XCT image segmentation.
Keywords: Segment anything model; X-ray computed tomography; U-Net; Ni-based superalloys; foundation models
Tunnel SAM adapter:Adapting segment anything model for tunnel water leakage inspection
13
Authors: Junxin Chen, Xiaojie Yu, Shichang Liu, Tao Chen, Wei Wang, Gwanggil Jeon, Benguo He. Geohazard Mechanics, 2024, Issue 1, pp. 29-36 (8 pages)
Water leakage inspection in tunnels is a critical engineering job that has attracted increasing concern. Leakage area detection via manual inspection is time-consuming and may produce unreliable findings, so automated techniques are needed to increase reliability and efficiency. Foundation segmentation models pre-trained on large datasets have attracted great interest recently. This paper proposes a novel SAM-based network for accurate automated water leakage inspection. The contributions of this paper include the efficient adaptation of SAM (Segment Anything Model) for shield tunnel water leakage segmentation and the demonstration of its effectiveness through data experiments. Tunnel SAM Adapter has satisfactory performance, achieving 76.2% mIoU and 77.5% Dice. Experimental results demonstrate that our approach has advantages over peer studies and helps safeguard the integrity and safety of these vital assets while streamlining tunnel maintenance.
Keywords: Water leakage segmentation; Segment anything model; SAM-Adapter; Smart engineering
Segment Anything Is Not Always Perfect: An Investigation of SAM on Different Real-world Applications (cited by: 3)
14
Authors: Wei Ji, Jingjing Li, Qi Bi, Tingwei Liu, Wenbo Li, Li Cheng. Machine Intelligence Research, EI, CSCD, 2024, Issue 4, pp. 617-630 (14 pages)
Recently, Meta AI Research introduced a general, promptable segment anything model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B). Without a doubt, the emergence of SAM will yield significant benefits for a wide array of practical image segmentation applications. In this study, we conduct a series of investigations into the performance of SAM across various applications, particularly in the fields of natural images, agriculture, manufacturing, remote sensing, and healthcare. We analyze and discuss the benefits and limitations of SAM, while also presenting an outlook on its future development in segmentation tasks. By doing so, we aim to give a comprehensive understanding of SAM's practical applications. This work is expected to provide insights that facilitate future research activities toward generic segmentation. Source code is publicly available at https://github.com/LiuTingWed/SAM-Not-Perfect.
Keywords: Segment anything model (SAM); visual perception; segmentation; foundational model; computer vision
TV-SAM:Increasing Zero-Shot Segmentation Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation
15
Authors: Zekun Jiang, Dongjie Cheng, Ziyuan Qin, Jun Gao, Qicheng Lao, Abdullaev Bakhrom Ismoilovich, Urazboev Gayrat, Yuldashov Elyorbek, Bekchanov Habibullo, Defu Tang, Linjing Wei, Kang Li, Le Zhang. Big Data Mining and Analytics, CSCD, 2024, Issue 4, pp. 1199-1211 (13 pages)
This study presents a novel multimodal medical image zero-shot segmentation algorithm, the text-visual-prompt segment anything model (TV-SAM), which requires no manual annotations. TV-SAM integrates the large language model GPT-4, the vision language model GLIP, and SAM to autonomously generate descriptive text prompts and visual bounding box prompts from medical images, thereby enhancing SAM's capability for zero-shot segmentation. Comprehensive evaluations on seven public datasets encompassing eight imaging modalities demonstrate that TV-SAM can effectively segment unseen targets across various modalities without additional training. TV-SAM significantly outperforms SAM AUTO (p<0.01) and GSAM (p<0.05), closely matches the performance of SAM BBOX with gold-standard bounding box prompts (p=0.07), and surpasses state-of-the-art methods on specific datasets such as ISIC (0.853 versus 0.802) and WBC (0.968 versus 0.883). The study indicates that TV-SAM serves as an effective multimodal medical image zero-shot segmentation algorithm, highlighting the significant contribution of GPT-4 to zero-shot segmentation. By integrating foundational models such as GPT-4, GLIP, and SAM, the ability to address complex problems in specialized domains can be enhanced.
Keywords: large language model; vision language model; segment anything model; medical image segmentation; zero-shot segmentation; GPT-4