Journal Articles
63,598 articles found
1. Intelligent Semantic Segmentation with Vision Transformers for Aerial Vehicle Monitoring
Authors: Moneerah Alotaibi. Computers, Materials & Continua, 2026, Issue 1, pp. 1629-1648 (20 pages)
Advanced traffic monitoring systems encounter substantial challenges in vehicle detection and classification due to the limitations of conventional methods, which often demand extensive computational resources and struggle with diverse data acquisition techniques. This research presents a novel approach for vehicle classification and recognition in aerial image sequences, integrating multiple advanced techniques to enhance detection accuracy. The proposed model begins with preprocessing using Multiscale Retinex (MSR) to enhance image quality, followed by Expectation-Maximization (EM) segmentation for precise foreground object identification. Vehicle detection is performed using the state-of-the-art YOLOv10 framework, while feature extraction incorporates Maximally Stable Extremal Regions (MSER), Dense Scale-Invariant Feature Transform (Dense SIFT), and Zernike Moments features to capture distinct object characteristics. Feature optimization is further refined through a hybrid swarm-based optimization algorithm, ensuring optimal feature selection for improved classification performance. The final classification is conducted using a Vision Transformer, leveraging its robust learning capabilities for enhanced accuracy. Experimental evaluations on benchmark datasets, including UAVDT and the Unmanned Aerial Vehicle Intruder Dataset (UAVID), demonstrate the superiority of the proposed approach, achieving an accuracy of 94.40% on UAVDT and 93.57% on UAVID. The results highlight the efficacy of the model in significantly enhancing vehicle detection and classification in aerial imagery, outperforming existing methodologies and offering a statistically validated improvement for intelligent traffic monitoring systems compared to existing approaches.
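The preprocessing stage of this pipeline is the easiest piece to make concrete. Below is a minimal Multiscale Retinex (MSR) sketch in Python with OpenCV and NumPy; the Gaussian scales (15, 80, 250), equal weights, and min-max rescaling are common illustrative defaults, not values reported by the paper.

```python
# A minimal sketch of Multiscale Retinex (MSR) preprocessing, assuming OpenCV and NumPy.
import cv2
import numpy as np

def multiscale_retinex(image: np.ndarray, sigmas=(15, 80, 250)) -> np.ndarray:
    """Enhance an image by averaging single-scale Retinex outputs over several Gaussian scales."""
    img = image.astype(np.float64) + 1.0                     # avoid log(0)
    msr = np.zeros_like(img)
    for sigma in sigmas:
        blurred = cv2.GaussianBlur(img, (0, 0), sigma)       # surround estimate at this scale
        msr += np.log(img) - np.log(blurred)                 # single-scale Retinex
    msr /= len(sigmas)                                       # equal-weight combination
    msr = (msr - msr.min()) / (msr.max() - msr.min() + 1e-8) # stretch back to displayable range
    return (msr * 255).astype(np.uint8)

# usage: enhanced = multiscale_retinex(cv2.imread("frame.jpg"))
```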
Keywords: machine learning; semantic segmentation; remote sensors; deep learning; object monitoring system
2. Leci: Learnable Evolutionary Category Intermediates for Unsupervised Domain Adaptive Segmentation (Cited by: 1)
Authors: Qiming ZHANG, Yufei XU, Jing ZHANG, Dacheng TAO. Artificial Intelligence Science and Engineering, 2025, Issue 1, pp. 37-51 (15 pages)
To avoid the laborious annotation process for dense prediction tasks like semantic segmentation, unsupervised domain adaptation (UDA) methods have been proposed to leverage the abundant annotations from a source domain, such as a virtual world (e.g., 3D games), and adapt models to the target domain (the real world) by narrowing the domain discrepancies. However, because of the large domain gap, directly aligning two distinct domains without considering the intermediates leads to inefficient alignment and inferior adaptation. To address this issue, we propose a novel learnable evolutionary Category Intermediates (CIs) guided UDA model named Leci, which enables information transfer between the two domains via two processes, i.e., Distilling and Blending. Starting from a random initialization, the CIs learn shared category-wise semantics automatically from the two domains in the Distilling process. Then, the learned semantics in the CIs are sent back to blend the domain features through a residual attentive fusion (RAF) module, such that the category-wise features of both domains shift towards each other. As the CIs progressively and consistently learn from the varying feature distributions during training, they evolve to guide the model towards category-wise feature alignment. Experiments on both the GTA5 and SYNTHIA datasets demonstrate Leci's superiority over prior representative methods.
Keywords: unsupervised domain adaptation; semantic segmentation; deep learning
3. Stochastic Augmented-Based Dual-Teaching for Semi-Supervised Medical Image Segmentation
Authors: Hengyang Liu, Yang Yuan, Pengcheng Ren, Chengyun Song, Fen Luo. Computers, Materials & Continua (SCIE, EI), 2025, Issue 1, pp. 543-560 (18 pages)
Existing semi-supervised medical image segmentation algorithms use copy-paste data augmentation to correct the labeled-unlabeled data distribution mismatch. However, current copy-paste methods have three limitations: (1) training the model solely with copy-paste mixed pictures from labeled and unlabeled input loses a lot of labeled information; (2) low-quality pseudo-labels can cause confirmation bias in pseudo-supervised learning on unlabeled data; (3) the segmentation performance in low-contrast and local regions is less than optimal. We design a Stochastic Augmentation-Based Dual-Teaching Auxiliary Training Strategy (SADT), which enhances feature diversity and learns high-quality features to overcome these problems. To be more precise, SADT trains the Student Network by using pseudo-label-based training from Teacher Network 1 and supervised learning with labeled data, which prevents the loss of rare labeled data. We introduce a bi-directional copy-paste mask with progressive high-entropy filtering to reduce data distribution disparities and mitigate confirmation bias in pseudo-supervision. For the mixed images, Deep-Shallow Spatial Contrastive Learning (DSSCL) is proposed in the feature spaces of Teacher Network 2 and the Student Network to improve the segmentation capabilities in low-contrast and local areas. In this procedure, the features retrieved by the Student Network are subjected to a random feature perturbation technique. On two openly available datasets, extensive trials show that our proposed SADT performs much better than state-of-the-art semi-supervised medical segmentation techniques. Using only 10% of the labeled data for training, SADT was able to acquire a Dice score of 90.10% on the ACDC (Automatic Cardiac Diagnosis Challenge) dataset.
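For readers unfamiliar with the mixing strategy, here is a generic bi-directional copy-paste sketch in NumPy. The random rectangular mask and the 50% area ratio are assumptions for illustration; the paper's progressive high-entropy filtering is not reproduced.

```python
# A generic sketch of bi-directional copy-paste mixing between a labeled and an
# unlabeled image, assuming NumPy arrays of identical (H, W, C) shape.
import numpy as np

def random_box_mask(h, w, ratio=0.5, rng=np.random.default_rng()):
    """Binary mask with a random rectangle covering roughly `ratio` of the image area."""
    bh, bw = int(h * ratio ** 0.5), int(w * ratio ** 0.5)
    top = int(rng.integers(0, h - bh + 1))
    left = int(rng.integers(0, w - bw + 1))
    mask = np.zeros((h, w), dtype=np.float32)
    mask[top:top + bh, left:left + bw] = 1.0
    return mask

def bidirectional_copy_paste(labeled_img, unlabeled_img, ratio=0.5):
    """Paste a labeled crop onto the unlabeled image and vice versa."""
    h, w = labeled_img.shape[:2]
    m = random_box_mask(h, w, ratio)[..., None]              # broadcast over channels
    mixed_in = m * labeled_img + (1 - m) * unlabeled_img     # labeled patch into unlabeled scene
    mixed_out = m * unlabeled_img + (1 - m) * labeled_img    # unlabeled patch into labeled scene
    return mixed_in, mixed_out, m
```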
Keywords: semi-supervised; medical image segmentation; contrastive learning; stochastic augmentation
4. Semi-Supervised Instrument Segmentation for Endoscopic Spinal Surgery
Authors: Wenxin Chen, Xingguang Duan, Ye Yuan, Pu Chen, Tengfei Cui, Changsheng Li. CAAI Transactions on Intelligence Technology, 2025, Issue 6, pp. 1633-1645 (13 pages)
Segmentation tasks require extensive annotation work, which is time-consuming and labour-intensive. How to make full use of unlabelled data to assist in training deep learning models has been a research hotspot in recent years. This paper takes instrument segmentation in endoscopic surgery as the background to explore how to use unlabelled data for semi-supervised learning more reasonably and effectively. An adaptive gradient correction method based on the degree of perturbation is proposed to improve segmentation accuracy. This paper integrates the recently popular Segment Anything Model (SAM) with semi-supervised learning, taking full advantage of the large model to enhance the zero-shot ability of the model. Experimental results demonstrate the superior performance of the proposed segmentation strategy compared to traditional semi-supervised segmentation methods, achieving a 2.56% improvement in mean intersection over union (mIoU). The visual segmentation results show that the incorporation of SAM significantly enhances our method, resulting in more accurate segmentation boundaries.
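Since the headline result is a 2.56% mIoU gain, a compact reference implementation of the metric may help. This NumPy sketch assumes integer label maps of equal shape; it is not the evaluation code used in the paper.

```python
# A small sketch of the mean intersection-over-union (mIoU) metric.
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """mIoU averaged over classes that appear in either the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:          # class absent from both maps: skip rather than count as 1
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else 0.0
```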
Keywords: deep learning; image segmentation; intelligent robots; robotics
5. Positional Information is a Strong Supervision for Volumetric Medical Image Segmentation
Authors: ZHAO Yinjie, HOU Runping, ZENG Wanqin, QIN Yulei, SHEN Tianle, XU Zhiyong, FU Xiaolong, SHEN Hongbin. Journal of Shanghai Jiaotong University (Science), 2025, Issue 1, pp. 121-129 (9 pages)
Medical image segmentation is a crucial preliminary step for a number of downstream diagnosis tasks. As deep convolutional neural networks successfully promote the development of computer vision, it is possible to make medical image segmentation a semi-automatic procedure by applying deep convolutional neural networks to finding the contours of regions of interest, which are then revised by radiologists. However, supervised learning necessitates large annotated datasets, which are difficult to acquire, especially for medical images. Self-supervised learning is able to take advantage of unlabeled data and provide good initialization to be fine-tuned for downstream tasks with limited annotations. Considering that most self-supervised learning methods, especially contrastive learning methods, are tailored to natural image classification and entail expensive GPU resources, we propose a novel and simple pretext-based self-supervised learning method that exploits the value of positional information in volumetric medical images. Specifically, we regard spatial coordinates as pseudo labels and pretrain the model by predicting the positions of randomly sampled 2D slices in volumetric medical images. Experiments on four semantic segmentation datasets demonstrate the superiority of our method over other self-supervised learning methods in both semi-supervised learning and transfer learning settings. Code is available at https://github.com/alienzyj/PPos.
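The pretext task is simple enough to sketch. The snippet below samples a 2D slice from a 3D volume and uses its normalized position as the regression pseudo-label; the depth-first array layout and the regression formulation are assumptions for illustration, not the authors' released code.

```python
# A minimal sketch of the positional pretext task: sample 2D slices and use their
# normalized axial position as the pseudo-label, assuming a (D, H, W) NumPy volume.
import numpy as np

def sample_slice_with_position(volume: np.ndarray, rng=np.random.default_rng()):
    """Return a random 2D slice and its normalized position in [0, 1]."""
    depth = volume.shape[0]
    idx = int(rng.integers(0, depth))
    slice_2d = volume[idx]                       # the input to the 2D encoder
    position = idx / max(depth - 1, 1)           # pseudo-label: where the slice sits in the volume
    return slice_2d, position

# During pretraining, a small regression head predicts `position` from `slice_2d`
# (e.g., with an MSE loss) before the encoder is fine-tuned for segmentation.
```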
Keywords: self-supervised learning; medical image analysis; semantic segmentation
6. Research on indoor visual localization based on semantic segmentation and adaptive weighting
Authors: TAO Sili, QIN Danyang, YANG Jiaqiang, BIE Haoze. High Technology Letters, 2025, Issue 3, pp. 300-308 (9 pages)
Indoor visual localization relies heavily on image retrieval to ascertain location information. However, the widespread presence and high consistency of floor patterns across different images often lead to the extraction of numerous repetitive features, thereby reducing the accuracy of image retrieval. This article proposes an indoor visual localization method based on semantic segmentation and adaptive weight fusion to address the issue of ground texture interference with retrieval results. During the positioning process, an indoor semantic segmentation model is established. Semantic segmentation technology is applied to accurately delineate the ground portion of the images. Feature extraction is performed on both the original database and the ground-segmented database. The vector of locally aggregated descriptors (VLAD) algorithm is then used to convert image features into a fixed-length feature representation, which improves the efficiency of image retrieval. Simultaneously, a method for adaptive weight optimization in similarity calculation is proposed, using adaptive weights to compute similarity for different regional features, thereby improving the accuracy of image retrieval. The experimental results indicate that this method significantly reduces ground interference and effectively utilizes ground information, thereby improving the accuracy of image retrieval.
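As a reference for the retrieval step, here is a compact VLAD aggregation sketch in NumPy following the common recipe (residual accumulation, intra-normalization, global L2 normalization); the paper's exact VLAD variant and cluster count are not specified here.

```python
# A compact sketch of VLAD aggregation, assuming local descriptors of shape (N, D)
# and pre-computed cluster centres of shape (K, D).
import numpy as np

def vlad(descriptors: np.ndarray, centres: np.ndarray) -> np.ndarray:
    """Aggregate local descriptors into a fixed-length K*D VLAD vector."""
    # hard-assign each descriptor to its nearest centre
    d2 = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(-1)   # (N, K)
    assign = d2.argmin(axis=1)
    k, dim = centres.shape
    v = np.zeros((k, dim), dtype=np.float64)
    for i in range(k):
        members = descriptors[assign == i]
        if len(members):
            v[i] = (members - centres[i]).sum(axis=0)        # residual accumulation
    v /= (np.linalg.norm(v, axis=1, keepdims=True) + 1e-12)  # per-cluster (intra) normalization
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)                   # global L2 normalization
```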
Keywords: indoor localization; image retrieval; semantic segmentation; adaptive weight
7. Selective Multiple Classifiers for Weakly Supervised Semantic Segmentation
Authors: Zilin Guo, Dongyue Wu, Changxin Gao, Nong Sang. CAAI Transactions on Intelligence Technology, 2025, Issue 6, pp. 1688-1702 (15 pages)
Existing weakly supervised semantic segmentation (WSSS) methods based on image-level labels always rely on class activation maps (CAMs), which measure the relationships between features and classifiers. However, CAMs only focus on the most discriminative regions of images, resulting in poor coverage performance. We attribute this to the deficiency in the recognition ability of a single classifier and the negative impacts caused by magnitudes during the CAM normalisation process. To address these issues, we propose to construct selective multiple classifiers (SMC). During the training process, we extract multiple prototypes for each class and store them in the corresponding memory bank. These prototypes are divided into foreground and background prototypes, with the former used to identify foreground objects and the latter aimed at preventing the false activation of background pixels. In the inference stage, multiple prototypes are adaptively selected from the memory bank for each image as SMC. Subsequently, CAMs are generated by measuring the angle between SMC and features. We enhance the recognition ability of classifiers by adaptively constructing multiple classifiers for each image, while relying only on angle measurement to generate CAMs alleviates the suppression phenomenon caused by magnitudes. Furthermore, SMC can be integrated into other WSSS approaches to help generate better CAMs. Extensive experiments conducted on standard WSSS benchmarks such as PASCAL VOC 2012 and MS COCO 2014 demonstrate the superiority of our proposed method.
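The angle-only CAM generation can be sketched as a cosine similarity between prototypes and dense features. The PyTorch snippet below is illustrative; the tensor shapes and the ReLU/max normalization are assumptions, not the exact SMC procedure.

```python
# A sketch of generating CAMs from angle (cosine) similarity between selected
# prototypes and dense features, ignoring feature magnitudes.
import torch
import torch.nn.functional as F

def cosine_cam(features: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    """features: (C, H, W) dense features; prototypes: (K, C) per-class prototypes.
    Returns (K, H, W) activation maps based only on angles."""
    c, h, w = features.shape
    feats = F.normalize(features.reshape(c, -1), dim=0)       # unit-norm feature per location
    protos = F.normalize(prototypes, dim=1)                   # unit-norm prototypes
    cams = (protos @ feats).view(-1, h, w)                    # cosine similarity per location
    cams = torch.relu(cams)                                   # keep positive evidence only
    return cams / (cams.amax(dim=(1, 2), keepdim=True) + 1e-8)
```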
Keywords: image segmentation; multiple classifiers; weakly supervised learning
8. Visual Perception and Adaptive Scene Analysis with Autonomous Panoptic Segmentation
Authors: Darthy Rabecka V, Britto Pari J, Man-Fai Leung. Computers, Materials & Continua, 2025, Issue 10, pp. 827-853 (27 pages)
Techniques in deep learning have significantly boosted the accuracy and productivity of computer vision segmentation tasks. This article offers an architecture for semantic, instance, and panoptic segmentation using EfficientNet-B7 and Bidirectional Feature Pyramid Networks (Bi-FPN). When implemented in place of the EfficientNet-B5 backbone, EfficientNet-B7 strengthens the model's feature extraction capabilities and is far more appropriate for real-world applications. By ensuring superior multi-scale feature fusion, Bi-FPN integration enhances the segmentation of complex objects across various urban environments. The proposed design is examined on rigorous datasets, encompassing Cityscapes, COCO (Common Objects in Context), KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute), and the Indian Driving Dataset (IDD), which replicate numerous real-world driving conditions. During extensive training, validation, and testing, the model showcases major gains in segmentation accuracy and surpasses state-of-the-art performance in semantic, instance, and panoptic segmentation tasks. Outperforming present methods, the recommended approach generates noteworthy gains in Panoptic Quality: +0.4% on Cityscapes, +0.2% on COCO, +1.7% on KITTI, and +0.4% on IDD. These results show how efficient it is across various driving circumstances and datasets. This study emphasizes the potential of EfficientNet-B7 and Bi-FPN to provide dependable, high-precision segmentation in computer vision applications, primarily autonomous driving. The results suggest that this framework efficiently tackles the constraints of practical situations while delivering a robust solution for high-performance segmentation tasks.
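For context on the reported gains, the Panoptic Quality (PQ) metric follows the standard definition sketched below; it assumes segment matching (IoU > 0.5) has already been done and is not specific to this paper.

```python
# A sketch of the Panoptic Quality (PQ) metric: PQ = (sum of matched IoUs) / (TP + 0.5*FP + 0.5*FN),
# which also factors as segmentation quality (SQ) times recognition quality (RQ).
def panoptic_quality(matched_ious, num_fp, num_fn):
    """matched_ious: IoUs of true-positive segment pairs (each > 0.5)."""
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        return 0.0
    return sum(matched_ious) / denom

# e.g. panoptic_quality([0.9, 0.8, 0.75], num_fp=1, num_fn=2) ≈ 0.544
```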
Keywords: panoptic segmentation; multi-scale features; EfficientNet-B7; Feature Pyramid Network
9. Semi-supervised cardiac magnetic resonance image segmentation based on domain generalization
Authors: SHAO Hong, HOU Jinyang, CUI Wencheng. High Technology Letters, 2025, Issue 1, pp. 41-52 (12 pages)
In the realm of medical image segmentation, particularly in cardiac magnetic resonance imaging (MRI), achieving robust performance with limited annotated data is a significant challenge. Performance often degrades when faced with testing scenarios from unknown domains. To address this problem, this paper proposes a novel semi-supervised approach for cardiac magnetic resonance image segmentation, aiming to enhance predictive capabilities and domain generalization (DG). This paper establishes an MT-like model utilizing pseudo-labeling and consistency regularization from semi-supervised learning, and integrates uncertainty estimation to improve the accuracy of pseudo-labels. Additionally, to tackle the challenge of domain generalization, a data manipulation strategy is introduced, extracting spatial and content-related information from images across different domains and enriching the dataset with a multi-domain perspective. The method is meticulously evaluated on the publicly available cardiac magnetic resonance imaging dataset M&Ms, validating its effectiveness. Comparative analyses against various methods highlight the outstanding performance of this approach, demonstrating its capability to segment cardiac magnetic resonance images in previously unseen domains even with limited annotated data.
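A minimal PyTorch sketch of an MT-style update with uncertainty-filtered pseudo-labels is shown below, assuming "MT" refers to the mean-teacher framework; the EMA decay and confidence threshold are illustrative values, not the paper's settings.

```python
# A minimal sketch of a mean-teacher style EMA update and confidence-filtered
# pseudo-labels for unlabeled pixels.
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, decay: float = 0.99):
    """Teacher parameters track an exponential moving average of the student's parameters."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)

@torch.no_grad()
def confident_pseudo_labels(teacher_logits: torch.Tensor, threshold: float = 0.9):
    """Keep only pixels where the teacher's softmax confidence exceeds `threshold`."""
    probs = teacher_logits.softmax(dim=1)        # (B, C, H, W)
    conf, labels = probs.max(dim=1)              # per-pixel confidence and predicted class
    mask = conf > threshold                      # low-confidence pixels are ignored in the loss
    return labels, mask
```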
Keywords: semi-supervised; domain generalization (DG); cardiac magnetic resonance image segmentation
10. CPEWS: Contextual Prototype-Based End-to-End Weakly Supervised Semantic Segmentation
Authors: Xiaoyan Shao, Jiaqi Han, Lingling Li, Xuezhuan Zhao, Jingjing Yan. Computers, Materials & Continua, 2025, Issue 4, pp. 595-617 (23 pages)
The primary challenge in weakly supervised semantic segmentation is effectively leveraging weak annotations while minimizing the performance gap compared to fully supervised methods. End-to-end model designs have gained significant attention for improving training efficiency. Most current algorithms rely on Convolutional Neural Networks (CNNs) for feature extraction. Although CNNs are proficient at capturing local features, they often struggle with global context, leading to incomplete and false Class Activation Mapping (CAM). To address these limitations, this work proposes a Contextual Prototype-Based End-to-End Weakly Supervised Semantic Segmentation (CPEWS) model, which improves feature extraction by utilizing the Vision Transformer (ViT). By incorporating its intermediate feature layers to preserve semantic information, this work introduces the Intermediate Supervised Module (ISM) to supervise the final layer's output, reducing boundary ambiguity and mitigating issues related to incomplete activation. Additionally, the Contextual Prototype Module (CPM) generates class-specific prototypes, while the proposed Prototype Discrimination Loss (LPDL) and Superclass Suppression Loss (LSSL) guide the network's training, effectively addressing false activation without the need for extra supervision. The proposed CPEWS model achieves state-of-the-art performance in end-to-end weakly supervised semantic segmentation without additional supervision. On the PASCAL VOC 2012 dataset, the Mean Intersection over Union (MIoU) reaches 69.8% on the validation set and 72.6% on the test set. Compared with ToCo (pretrained with ImageNet-1k weights), the MIoU on the test set is 2.1% higher. In addition, the MIoU reaches 41.4% on the validation set of the MS COCO 2014 dataset.
Keywords: end-to-end weakly supervised semantic segmentation; vision transformer; contextual prototype; class activation map
11. Multi-Consistency Training for Semi-Supervised Medical Image Segmentation
Authors: WU Changxue, ZHANG Wenxi, HAN Jiaozhi, WANG Hongyu. Journal of Shanghai Jiaotong University (Science), 2025, Issue 4, pp. 800-814 (15 pages)
Medical image segmentation is a crucial task in clinical applications. However, obtaining labeled data for medical images is often challenging. This has led to the appeal of semi-supervised learning (SSL), a technique adept at leveraging a modest amount of labeled data. Nonetheless, most prevailing SSL segmentation methods for medical images either rely on a single consistency training method or directly fine-tune SSL methods designed for natural images. In this paper, we propose an innovative semi-supervised method called multi-consistency training (MCT) for medical image segmentation. Our approach transcends the constraints of prior methodologies by considering consistency from a dual perspective: output consistency across different up-sampling methods, and output consistency of the same data within the same network under various perturbations to the intermediate features. We design distinct semi-supervised loss regression methods for these two types of consistency. To enhance the application of our MCT model, we also develop a dedicated decoder as the core of our neural network. Thorough experiments were conducted on a polyp dataset and a dental dataset, rigorously compared against other SSL methods. Experimental results demonstrate the superiority of our approach, achieving higher segmentation accuracy. Moreover, comprehensive ablation studies and insightful discussion substantiate the efficacy of our approach in navigating the intricacies of medical image segmentation.
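The two consistency terms can be sketched directly. In the PyTorch snippet below, the decoder handles, MSE-on-softmax losses, and Gaussian feature noise are assumptions for illustration rather than the paper's exact formulation.

```python
# A sketch of the two consistency terms: agreement between two up-sampling heads,
# and agreement of the same features with and without perturbation.
import torch
import torch.nn.functional as F

def upsample_consistency(logits_bilinear: torch.Tensor, logits_transposed: torch.Tensor):
    """Outputs of two different up-sampling decoders should agree (MSE on softmax)."""
    return F.mse_loss(logits_bilinear.softmax(dim=1), logits_transposed.softmax(dim=1))

def perturbation_consistency(decoder, features: torch.Tensor, noise_std: float = 0.1):
    """The same intermediate features, clean vs. noise-perturbed, should decode consistently."""
    clean = decoder(features)
    perturbed = decoder(features + noise_std * torch.randn_like(features))
    return F.mse_loss(perturbed.softmax(dim=1), clean.detach().softmax(dim=1))
```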
Keywords: semi-supervised learning (SSL); multi-consistency training (MCT); medical image segmentation; intermediate feature perturbation
12. GLMCNet: A Global-Local Multiscale Context Network for High-Resolution Remote Sensing Image Semantic Segmentation
Authors: Yanting Zhang, Qiyue Liu, Chuanzhao Tian, Xuewen Li, Na Yang, Feng Zhang, Hongyue Zhang. Computers, Materials & Continua, 2026, Issue 1, pp. 2086-2110 (25 pages)
High-resolution remote sensing images (HRSIs) are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies. However, their significant scale changes and wealth of spatial details pose challenges for semantic segmentation. While convolutional neural networks (CNNs) excel at capturing local features, they are limited in modeling long-range dependencies. Conversely, transformers utilize multi-head self-attention to integrate global context effectively, but this approach often incurs a high computational cost. This paper proposes a global-local multiscale context network (GLMCNet) to extract both global and local multiscale contextual information from HRSIs. A detail-enhanced filtering module (DEFM) is proposed at the end of the encoder to further refine the encoder outputs, thereby enhancing the key details extracted by the encoder and effectively suppressing redundant information. In addition, a global-local multiscale transformer block (GLMTB) is proposed in the decoding stage to enable the modeling of rich multiscale global and local information. We also design a stair fusion mechanism to transmit deep semantic information progressively from deep to shallow layers. Finally, we propose the semantic awareness enhancement module (SAEM), which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention. Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method. Specifically, our method achieved a mean Intersection over Union (mIoU) of 86.89% on the ISPRS Potsdam dataset and 84.34% on the ISPRS Vaihingen dataset, outperforming existing models such as ABCNet and BANet.
Keywords: multiscale context; attention mechanism; remote sensing images; semantic segmentation
13. Deep Learning for Brain Tumor Segmentation and Classification: A Systematic Review of Methods and Trends
Authors: Ameer Hamza, Robertas Damaševicius. Computers, Materials & Continua, 2026, Issue 1, pp. 132-172 (41 pages)
This systematic review aims to comprehensively examine and compare deep learning methods for brain tumor segmentation and classification using MRI and other imaging modalities, focusing on recent trends from 2022 to 2025. The primary objective is to evaluate methodological advancements, model performance, dataset usage, and existing challenges in developing clinically robust AI systems. We included peer-reviewed journal articles and high-impact conference papers published between 2022 and 2025, written in English, that proposed or evaluated deep learning methods for brain tumor segmentation and/or classification. Excluded were non-open-access publications, books, and non-English articles. A structured search was conducted across Scopus, Google Scholar, Wiley, and Taylor & Francis, with the last search performed in August 2025. Risk of bias was not formally quantified but was considered during full-text screening based on dataset diversity, validation methods, and availability of performance metrics. We used narrative synthesis and tabular benchmarking to compare performance metrics (e.g., accuracy, Dice score) across model types (CNN, Transformer, Hybrid), imaging modalities, and datasets. A total of 49 studies were included (43 journal articles and 6 conference papers). These studies spanned more than 9 public datasets (e.g., BraTS, Figshare, REMBRANDT, MOLAB) and utilized a range of imaging modalities, predominantly MRI. Hybrid models, especially ResViT and UNetFormer, consistently achieved high performance, with classification accuracy exceeding 98% and segmentation Dice scores above 0.90 across multiple studies. Transformers and hybrid architectures showed increasing adoption post-2023. Many studies lacked external validation and were evaluated only on a few benchmark datasets, raising concerns about generalizability and dataset bias. Few studies addressed clinical interpretability or uncertainty quantification. Despite promising results, particularly for hybrid deep learning models, widespread clinical adoption remains limited due to a lack of validation, interpretability concerns, and real-world deployment barriers.
Keywords: brain tumor segmentation; brain tumor classification; deep learning; vision transformers; hybrid models
14. SwinHCAD: A Robust Multi-Modality Segmentation Model for Brain Tumors Using Transformer and Channel-Wise Attention
Authors: Seyong Jin, Muhammad Fayaz, L. Minh Dang, Hyoung-Kyu Song, Hyeonjoon Moon. Computers, Materials & Continua, 2026, Issue 1, pp. 511-533 (23 pages)
Brain tumors require precise segmentation for diagnosis and treatment planning due to their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation technology reduces the burden on medical staff and provides quantitative information, existing methodologies and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges and maximize segmentation performance, this research introduces a novel SwinUNETR-based model by integrating a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), into a powerful SwinUNETR encoder. The HCAD decoder block utilizes hierarchical features and channel-specific attention mechanisms to further fuse information at different scales transmitted from the encoder and to preserve spatial details throughout the reconstruction phase. Rigorous evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieves superior segmentation accuracy on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET) compared to baseline models. In particular, the rationale and contribution of the model design were clarified through ablation studies that verify the effectiveness of the proposed HCAD decoder block. The results of this study are expected to contribute greatly to enhancing the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
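Of the two reported metrics, Dice is straightforward to restate; HD95 additionally requires boundary distance computation and is omitted here. The sketch assumes binary NumPy masks for each tumor subregion (WT, TC, ET).

```python
# A small sketch of the per-region Dice score used for evaluation.
import numpy as np

def dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|), in [0, 1], for a single binary region mask."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float((2.0 * inter + eps) / (pred.sum() + target.sum() + eps))
```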
Keywords: attention mechanism; brain tumor segmentation; channel-wise attention decoder; deep learning; medical imaging; MRI; transformer; U-Net
15. Deep Learning-Based Toolkit Inspection: Object Detection and Segmentation in Assembly Lines
Authors: Arvind Mukundan, Riya Karmakar, Devansh Gupta, Hsiang-Chen Wang. Computers, Materials & Continua, 2026, Issue 1, pp. 1255-1277 (23 pages)
Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, prone to errors, and lacking in consistency, emphasizing the need for a reliable and automated inspection system. Leveraging both object detection and image segmentation approaches, this research proposes a vision-based solution for detecting various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After applying multiple constraints and enhancing the data through preprocessing and augmentation, a dataset consisting of 3300 annotated RGB-D photos was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall equilibrium, inference latency (target ≥ 30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models due to the latter's increased latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (recall, accuracy, F1-score, and precision). YOLOv11 demonstrated balanced excellence with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, with an average inference time of 25 ms per frame (≈40 FPS), demonstrating real-time performance. Leveraging these results, a YOLOv11-based Windows application was successfully deployed in a real-time assembly-line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application can precisely measure socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line, improving real-time inspection capability. The implementation is a significant step forward for intelligent manufacturing in the Industry 4.0 paradigm, providing a scalable, efficient, and accurate way to perform automated inspection and dimensional verification tasks.
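The dimension-measurement step can be approximated with standard OpenCV calls. The sketch below fits a rotated rectangle to the largest contour of a binary segmentation mask and converts its sides to millimetres; the pixel-to-millimetre calibration and the helper name are hypothetical, not taken from the paper.

```python
# A sketch of measuring a socket's dimensions from a binary segmentation mask,
# assuming a known pixel-to-millimetre scale from camera calibration.
import cv2
import numpy as np

def socket_dimensions_mm(mask: np.ndarray, mm_per_pixel: float):
    """Fit a rotated rectangle to the largest contour and convert its sides to millimetres."""
    contours, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    (_, _), (w_px, h_px), _ = cv2.minAreaRect(largest)          # rotated bounding box in pixels
    return sorted((w_px * mm_per_pixel, h_px * mm_per_pixel))   # (short side, long side) in mm
```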
Keywords: tool detection; image segmentation; object detection; assembly line automation; Industry 4.0; Intel RealSense; deep learning; toolkit verification; RGB-D imaging; quality assurance
16. Fusion of Infrared and Visible Light Images Based on Region Segmentation (Cited by: 12)
Authors: 刘坤, 郭雷, 李晖晖, 陈敬松. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2009, Issue 1, pp. 75-80 (6 pages)
This article proposes a novel method to fuse infrared and visible light images based on region segmentation. Region segmentation is used to determine important regions and background information in the input image. The non-subsampled contourlet transform (NSCT) provides a flexible multiresolution, local and directional image expansion, as well as a sparse representation for two-dimensional (2-D) piecewise smooth signals such as images, and then different fusion rules are applied to fuse the NSCT coefficients...
Keywords: image processing; image fusion; non-subsampled contourlet transform; region segmentation; infrared imaging
17. Color-texture based unsupervised segmentation using JSEG with fuzzy connectedness (Cited by: 2)
Authors: Zheng Yuanjie, Yang Jie, Zhou Yue, Wang Yuzhong. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2006, Issue 1, pp. 213-219 (7 pages)
Color quantization is bound to lose spatial information of the color distribution. If too much necessary spatial distribution information of color is lost in JSEG, it is difficult or even impossible for JSEG to segment images correctly. Enlightened by segmentation based on fuzzy theories, a soft class-map is constructed to solve that problem. The definitions of values and other related quantities are adjusted according to the soft class-map. With more detailed values obtained from the soft class-map, more color distribution information is preserved. Experiments on a synthetic image and many other color images illustrate that JSEG with a soft class-map can efficiently solve the problem that within a region there may exist gradual color variation in a smooth transition. It is a more robust method, especially for images which haven't been heavily blurred near the boundaries of underlying regions.
Keywords: unsupervised segmentation; color segmentation; color texture segmentation; fuzzy method
18. Supervised and Semi-supervised Methods for Abdominal Organ Segmentation: A Review (Cited by: 4)
Authors: Isaac Baffour Senkyire, Zhe Liu. International Journal of Automation and Computing (EI, CSCD), 2021, Issue 6, pp. 887-914 (28 pages)
Abdominal organ segmentation is the segregation of a single or multiple abdominal organ(s) into semantic image segments of pixels identified by homogeneous features such as color, texture, and intensity. The condition of the abdominal organ(s) is mostly connected with greater morbidity and mortality. Most patients often have asymptomatic abdominal conditions and symptoms, which are often recognized late; hence the abdomen has been the third most common cause of damage to the human body. That notwithstanding, there may be improved outcomes where the condition of an abdominal organ is detected earlier. Over the years, supervised and semi-supervised machine learning methods have been used to segment abdominal organ(s) in order to detect the organ(s) condition. The supervised methods perform well when the training data used represent the target data, but these methods require large amounts of manually annotated data and have adaptation problems. The semi-supervised methods are fast but record poorer performance than the supervised methods if assumptions about the data fail to hold. Current state-of-the-art methods of supervised segmentation are largely based on deep learning techniques due to their good accuracy and success in real-world applications. However, because deep learning requires a large amount of training data for automatic feature extraction, it can hardly be used when annotated data are scarce. As regards the semi-supervised methods of segmentation, self-training and graph-based techniques have attracted much research attention. Self-training can be used with any classifier but does not have a mechanism to rectify mistakes early. Graph-based techniques thrive on their convexity, scalability, and effectiveness in application, but have an out-of-sample problem. In this review paper, a study has been carried out on supervised and semi-supervised methods of performing abdominal organ segmentation. Current approaches are surveyed, connections and gaps are identified, and prospective future research opportunities are enumerated.
Keywords: abdominal organ; supervised segmentation; semi-supervised segmentation; evaluation metrics; image segmentation; machine learning
19. Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview (Cited by: 5)
Authors: Wenqi Ren, Yang Tang, Qiyu Sun, Chaoqiang Zhao, Qing-Long Han. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 5, pp. 1106-1126 (21 pages)
Visual semantic segmentation aims at separating a visual sample into diverse blocks with specific semantic attributes and identifying the category for each block, and it plays a crucial role in environmental perception. Conventional learning-based visual semantic segmentation approaches count heavily on large-scale training data with dense annotations and consistently fail to estimate accurate semantic labels for unseen categories. This obstruction spurs a craze for studying visual semantic segmentation with the assistance of few/zero-shot learning. The emergence and rapid progress of few/zero-shot visual semantic segmentation make it possible to learn unseen categories from a few labeled or even zero labeled samples, which advances the extension to practical applications. Therefore, this paper focuses on the recently published few/zero-shot visual semantic segmentation methods, varying from 2D to 3D space, and explores the commonalities and discrepancies of technical settlements under different segmentation circumstances. Specifically, the preliminaries on few/zero-shot visual semantic segmentation, including the problem definitions, typical datasets, and technical remedies, are briefly reviewed and discussed. Moreover, three typical instantiations are involved to uncover the interactions of few/zero-shot learning with visual semantic segmentation, including image semantic segmentation, video object segmentation, and 3D segmentation. Finally, the future challenges of few/zero-shot visual semantic segmentation are discussed.
Keywords: visual; segmentation; separating
20. Visual inspection of aircraft skin: Automated pixel-level defect detection by instance segmentation (Cited by: 17)
Authors: Meng DING, Boer WU, Juan XU, Abdul Nasser KASULE, Hongfu ZUO. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2022, Issue 10, pp. 254-264 (11 pages)
Skin defect inspection is one of the most significant tasks in the conventional process of aircraft inspection. This paper proposes a vision-based method for pixel-level defect detection, which is based on the Mask Scoring R-CNN. First, an attention mechanism and a feature fusion module are introduced to improve feature representation. Second, a new classifier head, consisting of four convolutional layers and a fully connected layer, is proposed to reduce the influence of information around the defect area. Third, to evaluate the proposed method, a dataset of aircraft skin defects was constructed, containing 276 images with a resolution of 960×720 pixels. Experimental results show that the proposed classifier head improves the detection and segmentation accuracy for aircraft skin defect inspection more effectively than the attention mechanism and feature fusion module. Compared with Mask R-CNN and Mask Scoring R-CNN, the proposed method increased the segmentation precision by approximately 21% and 19.59%, respectively. These results demonstrate that the proposed method performs favorably against the other two methods of pixel-level aircraft skin defect detection.
Keywords: aircraft skin; automatic non-destructive testing; defect inspection; instance segmentation; machine vision