945 articles found
Bilateral Dual-Residual Real-Time Semantic Segmentation Network
1
Authors: Shijie Xiang, Dong Zhou, Dan Tian, Zihao Wang. 《Computers, Materials & Continua》, 2025, No. 4, pp. 497-515 (19 pages)
Real-time semantic segmentation tasks place stringent demands on network inference speed, often requiring a reduction in network depth to decrease computational load. However, shallow networks tend to exhibit degradation in feature extraction completeness and inference accuracy. Therefore, balancing high performance with real-time requirements has become a critical issue in the study of real-time semantic segmentation. To address these challenges, this paper proposes a lightweight bilateral dual-residual network. By introducing a novel residual structure combined with feature extraction and fusion modules, the proposed network significantly enhances representational capacity while reducing computational costs. Specifically, an improved compound residual structure is designed to optimize the efficiency of information propagation and feature extraction. Furthermore, the proposed feature extraction and fusion module enables the network to better capture multi-scale information in images, improving the ability to detect both detailed and global semantic features. Experimental results on the publicly available Cityscapes dataset demonstrate that the proposed lightweight dual-branch network achieves outstanding performance while maintaining low computational complexity. In particular, the network achieved a mean Intersection over Union (mIoU) of 78.4% on the Cityscapes validation set, surpassing many existing semantic segmentation models. Additionally, in terms of inference speed, the network reached 74.5 frames per second when tested on an NVIDIA GeForce RTX 3090 GPU, significantly improving real-time performance.
Keywords: real-time; residual structure; semantic segmentation; feature fusion
BSDNet: Semantic Information Distillation-Based for Bilateral-Branch Real-Time Semantic Segmentation on Street Scene Image
2
Authors: Huan Zeng, Jianxun Zhang, Hongji Chen, Xinwei Zhu. 《Computers, Materials & Continua》, 2025, No. 11, pp. 3879-3896 (18 pages)
Semantic segmentation in street scenes is a crucial technology for autonomous driving to analyze the surrounding environment. In street scenes, issues such as high image resolution caused by large viewpoints and differences in object scales lead to a decline in real-time performance and difficulties in multi-scale feature extraction. To address this, we propose a bilateral-branch real-time semantic segmentation method based on semantic information distillation (BSDNet) for street scene images. The BSDNet consists of a Feature Conversion Convolutional Block (FCB), a Semantic Information Distillation Module (SIDM), and a Deep Aggregation Atrous Convolution Pyramid Pooling (DASP). FCB reduces the semantic gap between the backbone and the semantic branch. SIDM extracts high-quality semantic information from the Transformer branch to reduce computational costs. DASP aggregates information lost in atrous convolutions, effectively capturing multi-scale objects. Extensive experiments conducted on Cityscapes, CamVid, and ADE20K, achieving an accuracy of 81.7% Mean Intersection over Union (mIoU) at 70.6 Frames Per Second (FPS) on Cityscapes, demonstrate that our method achieves a better balance between accuracy and inference speed.
Keywords: street scene understanding; real-time semantic segmentation; knowledge distillation; multi-scale feature extraction
DuFNet: Dual Flow Network of Real-Time Semantic Segmentation for Unmanned Driving Application of Internet of Things (cited by: 1)
3
Authors: Tao Duan, Yue Liu, Jingze Li, Zhichao Lian, Qianmu Li. 《Computer Modeling in Engineering & Sciences》 (SCIE, EI), 2023, No. 7, pp. 223-239 (17 pages)
The application of unmanned driving in the Internet of Things is one of the concrete manifestations of the application of artificial intelligence technology. Image semantic segmentation can help the unmanned driving system by achieving road accessibility analysis. Semantic segmentation is also a challenging technology for image understanding and scene parsing. In this paper, we focus on the challenging task of real-time semantic segmentation and propose a novel fast architecture named DuFNet. Starting from the existing work of the Bilateral Segmentation Network (BiSeNet), DuFNet proposes a novel Semantic Information Flow (SIF) structure for context information and a novel Fringe Information Flow (FIF) structure for spatial information. We also propose two kinds of SIF, with cascaded and paralleled structures, respectively. The SIF encodes the input stage by stage in the ResNet18 backbone and provides context information for the feature fusion module. Features from earlier stages usually contain rich low-level details, whereas features from later stages contain high-level semantics. The multiple convolutions embedded in the Parallel SIF aggregate the corresponding features among different stages and generate a powerful global context representation with less computational cost. The FIF consists of a pooling layer and an upsampling operator followed by a projection convolution layer. This concise component provides more spatial details for the network. Compared with BiSeNet, our work achieved faster speed and comparable performance, with 72.34% mIoU accuracy and 78 FPS on the Cityscapes dataset based on the ResNet18 backbone.
Keywords: real-time semantic segmentation; convolutional neural network; feature fusion; unmanned driving; fringe information flow
Triple-Branch Asymmetric Network for Real-time Semantic Segmentation of Road Scenes (cited by: 2)
4
Authors: Yazhi Zhang, Xuguang Zhang, Hui Yu. 《Instrumentation》, 2024, No. 2, pp. 72-82 (11 pages)
As the field of autonomous driving evolves, real-time semantic segmentation has become a crucial part of computer vision tasks. However, most existing methods use lightweight convolution to reduce the computational effort, resulting in lower accuracy. To address this problem, we construct TBANet, a network with an encoder-decoder structure for efficient feature extraction. In the encoder part, the TBA module is designed to extract details and the ETBA module is used to learn semantic representations in a high-dimensional space. In the decoder part, we design a combination of multiple upsampling methods to aggregate features with less computational overhead. We validate the efficiency of TBANet on the Cityscapes dataset. It achieves 75.1% mean Intersection over Union (mIoU) with only 2.07 million parameters and can reach 90.3 Frames Per Second (FPS).
Keywords: encoder-decoder architecture; lightweight convolution; real-time semantic segmentation
Lightweight deep network and projection loss for eye semantic segmentation
5
Authors: Qinjie Wang, Tengfei Wang, Lizhuang Yang, Hai Li. 《中国科学技术大学学报》 (PKU Core), 2025, No. 7, pp. 59-68, 58, I0002 (12 pages)
Semantic segmentation of eye images is a complex task with important applications in human–computer interaction, cognitive science, and neuroscience. Achieving real-time, accurate, and robust segmentation algorithms is crucial for computationally limited portable devices such as those used for augmented reality and virtual reality. With the rapid advancements in deep learning, many network models have been developed specifically for eye image segmentation. Some methods divide the segmentation process into multiple stages to achieve model parameter miniaturization while enhancing output through post-processing techniques to improve segmentation accuracy. These approaches significantly increase the inference time. Other networks adopt more complex encoding and decoding modules to achieve end-to-end output, which requires substantial computation. Therefore, balancing the model’s size, accuracy, and computational complexity is essential. To address these challenges, we propose a lightweight asymmetric UNet architecture and a projection loss function. We utilize ResNet-3 layer blocks to enhance feature extraction efficiency in the encoding stage. In the decoding stage, we employ regular convolutions and skip connections to upscale the feature maps from the latent space to the original image size, balancing the model size and segmentation accuracy. In addition, we leverage the geometric features of the eye region and design a projection loss function to further improve the segmentation accuracy without adding any inference computational cost. We validate our approach on the OpenEDS2019 dataset for virtual reality and achieve state-of-the-art performance with 95.33% mean intersection over union (mIoU). Our model has only 0.63M parameters and 350 FPS, which are 68% and 200% of the state-of-the-art model RITNet, respectively.
Keywords: lightweight deep network; projection loss; real-time semantic segmentation; convolutional neural networks; end-to-end
A 3D semantic segmentation network for accurate neuronal soma segmentation
6
Authors: Li Ma, Qi Zhong, Yezi Wang, Xiaoquan Yang, Qian Du. 《Journal of Innovative Optical Health Sciences》, 2025, No. 1, pp. 67-83 (17 pages)
Neuronal soma segmentation plays a crucial role in neuroscience applications. However, fine structures, such as boundaries, small-volume neuronal somata and fibers, are commonly present in cell images, which poses a challenge for accurate segmentation. In this paper, we propose a 3D semantic segmentation network for neuronal soma segmentation to address this issue. Using an encoding-decoding structure, we introduce a Multi-Scale feature extraction and Adaptive Weighting fusion module (MSAW) after each encoding block. The MSAW module can not only emphasize the fine structures via an upsampling strategy, but also provide pixel-wise weights to measure the importance of the multi-scale features. Additionally, a dynamic convolution is employed instead of normal convolution to better adapt the network to input data with different distributions. The proposed MSAW-based semantic segmentation network (MSAW-Net) was evaluated on three neuronal soma images from mouse brain and one neuronal soma image from macaque brain, demonstrating the efficiency of the proposed method. It achieved an F1 score of 91.8% on the Fezf2-2A-CreER dataset, 97.1% on the LSL-H2B-GFP dataset, 82.8% on the Thy1-EGFP-Mline dataset, and 86.9% on the macaque dataset, achieving improvements over the 3D U-Net model of 3.1%, 3.3%, 3.9%, and 2.3%, respectively.
Keywords: neuronal soma segmentation; semantic segmentation network; multi-scale feature extraction; adaptive weighting fusion
KD-SegNet: Efficient Semantic Segmentation Network with Knowledge Distillation Based on Monocular Camera
7
Authors: Thai-Viet Dang, Nhu-Nghia Bui, Phan Xuan Tan. 《Computers, Materials & Continua》, 2025, No. 2, pp. 2001-2026 (26 pages)
Due to the necessity for lightweight and efficient network models, deploying semantic segmentation models on mobile robots (MRs) is a formidable task. The fundamental limitation of the problem lies in the training performance, the ability to effectively exploit the dataset, and the ability to adapt to complex environments when deploying the model. By utilizing knowledge distillation techniques, the article strives to overcome the above challenges by inheriting the advantages of both the teacher model and the student model. More precisely, the ResNet152-PSP-Net model’s characteristics are utilized to train the ResNet18-PSP-Net model. Pyramid pooling blocks are utilized to decode multi-scale feature maps, creating a complete semantic map inference. The student model not only preserves the strong segmentation performance of the teacher model but also improves the inference speed of the prediction results. The proposed method exhibits a clear advantage over conventional convolutional neural network (CNN) models, as evident from the conducted experiments. Furthermore, the proposed model also shows remarkable improvement in processing speed when compared with lightweight models such as MobileNetV2 and EfficientNet, based on latency and throughput parameters. The proposed KD-SegNet model obtains an accuracy of 96.3% and a mIoU (mean Intersection over Union) of 77%, outperforming existing models by more than 15% on the same training dataset. The suggested method has an average training time that is only 0.51 times less than that of models in the same field, while still achieving comparable segmentation performance. Hence, the semantic segmentation frames are collected, forming the motion trajectory for the system in the environment. Overall, this architecture shows great promise for the development of knowledge-based systems for MR navigation.
Keywords: mobile robot navigation; semantic segmentation; knowledge distillation; pyramid scene parsing; fully convolutional networks
Axial Assembled Correspondence Network for Few-Shot Semantic Segmentation (cited by: 3)
8
Authors: Yu Liu, Bin Jiang, Jiaming Xu. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2023, No. 3, pp. 711-721 (11 pages)
Few-shot semantic segmentation aims at training a model that can segment novel classes in a query image with only a few densely annotated support exemplars. It remains a challenge because of large intra-class variations between the support and query images. Existing approaches utilize 4D convolutions to mine semantic correspondence between the support and query images. However, they still suffer from heavy computation, sparse correspondence, and large memory. We propose the axial assembled correspondence network (AACNet) to alleviate these issues. The key point of AACNet is the proposed axial assembled 4D kernel, which constructs the basic block for the semantic correspondence encoder (SCE). Furthermore, we propose the deblurring equations to provide more robust correspondence for the aforementioned SCE and design a novel fusion module to mix correspondences in a learnable manner. Experiments on PASCAL-5^i reveal that our AACNet achieves a mean intersection-over-union score of 65.9% for 1-shot segmentation and 70.6% for 5-shot segmentation, surpassing the state-of-the-art method by 5.8% and 5.0%, respectively.
Keywords: artificial intelligence; computer vision; deep convolutional neural network; few-shot semantic segmentation
Semantic Pneumonia Segmentation and Classification for Covid-19 Using Deep Learning Network
9
Authors: M.M. Lotfy, Hazem M. El-Bakry, M.M. Elgayar, Shaker El-Sappagh, G. Abdallah M.I, A.A. Soliman, Kyung Sup Kwak. 《Computers, Materials & Continua》 (SCIE, EI), 2022, No. 10, pp. 1141-1158 (18 pages)
Early detection of the Covid-19 disease is essential due to its higher rate of infection affecting tens of millions of people, and its high number of deaths also by 7%. For that purpose, a proposed model of several stages was developed. The first stage is optimizing the images using dynamic adaptive histogram equalization, performing a semantic segmentation using DeepLabv3Plus, then augmenting the data by flipping it horizontally, rotating it, then flipping it vertically. The second stage builds a custom convolutional neural network model using several pre-trained ImageNet models. Finally, the model compares the pre-trained data to the new output, while repeatedly trimming the best-performing models to reduce complexity and improve memory efficiency. Several experiments were done using different techniques and parameters. Accordingly, the proposed model achieved an average accuracy of 99.6% and an area under the curve of 0.996 in Covid-19 detection. This paper discusses how to train a customized intelligent convolutional neural network using various parameters on a set of chest X-rays with an accuracy of 99.6%.
Keywords: SARS-COV2; COVID-19; pneumonia; deep learning network; semantic segmentation; smart classification
SGT-Net: A Transformer-Based Stratified Graph Convolutional Network for 3D Point Cloud Semantic Segmentation
10
Authors: Suyi Liu, Jianning Chi, Chengdong Wu, Fang Xu, Xiaosheng Yu. 《Computers, Materials & Continua》 (SCIE, EI), 2024, No. 6, pp. 4471-4489 (19 pages)
In recent years, semantic segmentation on 3D point cloud data has attracted much attention. Unlike 2D images, where pixels are distributed regularly in the image domain, 3D point clouds in non-Euclidean space are irregular and inherently sparse. Therefore, it is very difficult to extract long-range contexts and effectively aggregate local features for semantic segmentation in 3D point cloud space. Most current methods focus on either local feature aggregation or long-range context dependency, but fail to directly establish a global-local feature extractor to complete the point cloud semantic segmentation tasks. In this paper, we propose a Transformer-based stratified graph convolutional network (SGT-Net), which enlarges the effective receptive field and builds direct long-range dependency. Specifically, we first propose a novel dense-sparse sampling strategy that provides dense local vertices and sparse long-distance vertices for the subsequent graph convolutional network (GCN). Secondly, we propose a multi-key self-attention mechanism based on the Transformer to further augment the weights of crucial neighboring relationships and enlarge the effective receptive field. In addition, to further improve the efficiency of the network, we propose a similarity measurement module to determine whether the neighborhood near the center point is effective. We demonstrate the validity and superiority of our method on the S3DIS and ShapeNet datasets. Through ablation experiments and segmentation visualization, we verify that the SGT model can improve the performance of point cloud semantic segmentation.
Keywords: 3D point cloud; semantic segmentation; long-range contexts; global-local feature; graph convolutional network; dense-sparse sampling strategy
A Survey on Deep Learning-based Fine-grained Object Classification and Semantic Segmentation (cited by: 47)
11
Authors: Bo Zhao, Jiashi Feng, Xiao Wu, Shuicheng Yan. 《International Journal of Automation and computing》 (EI, CSCD), 2017, No. 2, pp. 119-135 (17 pages)
Deep learning technology has shown impressive performance in various vision tasks such as image classification, object detection and semantic segmentation. In particular, recent advances in deep learning techniques bring encouraging performance to fine-grained image classification, which aims to distinguish subordinate-level categories, such as bird species or dog breeds. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we review four types of deep learning based fine-grained image classification approaches, including the general convolutional neural networks (CNNs), part detection based, ensemble of networks based and visual attention based fine-grained image classification approaches. Besides, the deep learning based semantic segmentation approaches are also covered in this paper. The region proposal based and fully convolutional networks based approaches for semantic segmentation are introduced respectively.
Keywords: deep learning; fine-grained image classification; semantic segmentation; convolutional neural network (CNN); recurrent neural network (RNN)
Multi-task Learning of Semantic Segmentation and Height Estimation for Multi-modal Remote Sensing Images (cited by: 4)
12
Authors: Mengyu WANG, Zhiyuan YAN, Yingchao FENG, Wenhui DIAO, Xian SUN. 《Journal of Geodesy and Geoinformation Science》 (CSCD), 2023, No. 4, pp. 27-39 (13 pages)
Deep learning based methods have been successfully applied to semantic segmentation of optical remote sensing images. However, as more and more remote sensing data become available, it is a new challenge to comprehensively utilize multi-modal remote sensing data to break through the performance bottleneck of single-modal interpretation. In addition, semantic segmentation and height estimation in remote sensing data are two strongly correlated tasks, but existing methods usually study the individual tasks separately, which leads to high computational resource overhead. To this end, we propose a Multi-Task learning framework for Multi-Modal remote sensing images (MM_MT). Specifically, we design a Cross-Modal Feature Fusion (CMFF) method, which aggregates complementary information from different modalities to improve the accuracy of semantic segmentation and height estimation. Besides, a dual-stream multi-task learning method is introduced for Joint Semantic Segmentation and Height Estimation (JSSHE), extracting common features in a shared network to save time and resources, and then learning task-specific features in two task branches. Experimental results on the public multi-modal remote sensing image dataset Potsdam show that, compared to training the two tasks independently, multi-task learning saves 20% of training time and achieves competitive performance, with an mIoU of 83.02% for semantic segmentation and an accuracy of 95.26% for height estimation.
Keywords: multi-modal; multi-task; semantic segmentation; height estimation; convolutional neural network
Image Semantic Segmentation Approach for Studying Human Behavior on Image Data (cited by: 1)
13
Authors: ZHENG Zhan, CHEN Da, HUANG Yanrong. 《Wuhan University Journal of Natural Sciences》 (CAS, CSCD), 2024, No. 2, pp. 145-153 (9 pages)
Image semantic segmentation is an essential technique for studying human behavior through image data. This paper proposes an image semantic segmentation method for human behavior research. Firstly, an end-to-end convolutional neural network architecture is proposed, which consists of a depth-separable jump-connected fully convolutional network and a conditional random field network. Then, jump-connected convolution is used to classify each pixel in the image, and an image semantic segmentation method based on the convolutional neural network is proposed. Next, a conditional random field network is used to improve the image segmentation of human behavior, and linear and nonlinear modeling methods based on conditional random field image semantic segmentation are proposed. Finally, using the proposed image segmentation network, the input entrepreneurial image data are semantically segmented to obtain the contour features of the person, and images from the medical field are also segmented. The experimental results show that the image semantic segmentation method is effective. It is a new way to use image data to study human behavior and can be extended to other research areas.
Keywords: human behavior research; image semantic segmentation; hop-connected full convolution network; conditional random field network; deep learning
MAAUNet: Exploration of U-shaped encoding and decoding structure for semantic segmentation of medical image (cited by: 1)
14
Authors: SHAO Shuo, GE Hongwei. 《Journal of Measurement Science and Instrumentation》 (CAS, CSCD), 2022, No. 4, pp. 418-429 (12 pages)
In view of the problems faced by medical image semantic segmentation, namely multi-scale changes of segmentation targets, noise interference, rough segmentation results and a slow training process, a multi-scale residual aggregation U-shaped attention network structure, MAAUNet (MultiRes aggregation attention UNet), is proposed based on MultiResUNet. Firstly, aggregate connection is introduced, starting from the original feature aggregation at the same level. Skip connection is redesigned to aggregate features of different semantic scales at the decoder subnet, which further addresses the semantic gaps that may exist between skip connections. Secondly, after the multi-scale convolution module, a convolution block attention module is added to focus and integrate features in the two attention directions of channel and space, adaptively optimizing the intermediate feature map. Finally, the original convolution block is improved. The convolution channels are expanded with a series convolution structure to complement each other and extract richer spatial features. Residual connections are retained and the convolution block is turned into a multi-channel convolution block, so that the model extracts multi-scale spatial features. The experimental results show that MAAUNet is strongly competitive on challenging datasets, and shows good segmentation performance and stability in dealing with multi-scale input and noise interference.
Keywords: U-shaped attention network structure of MAAUNet; convolutional neural network; encoding-decoding structure; attention mechanism; medical image semantic segmentation
A Remote Sensing Image Semantic Segmentation Method by Combining Deformable Convolution with Conditional Random Fields (cited by: 13)
15
Authors: Zongcheng ZUO, Wen ZHANG, Dongying ZHANG. 《Journal of Geodesy and Geoinformation Science》, 2020, No. 3, pp. 39-49 (11 pages)
Currently, deep convolutional neural networks have made great progress in the field of semantic segmentation. Because of the fixed convolution kernel geometry, standard convolutional neural networks are limited in their ability to simulate geometric transformations. Therefore, a deformable convolution is introduced to enhance the adaptability of convolutional networks to spatial transformation. In addition, deep convolutional neural networks cannot adequately segment local objects at the output layer because of the pooling layers in the network architecture. To overcome this shortcoming, the rough prediction segmentation results of the neural network output layer are processed by fully connected conditional random fields to improve the ability of image segmentation. The proposed method can easily be trained end-to-end using standard backpropagation algorithms. Finally, the proposed method is tested on the ISPRS dataset. The results show that the proposed method can effectively overcome the influence of the complex structure of the segmentation object and obtain state-of-the-art accuracy on the ISPRS Vaihingen 2D semantic labeling dataset.
Keywords: high-resolution remote sensing image; semantic segmentation; deformable convolution network; conditional random fields
Bird’s-Eye View Semantic Segmentation and Voxel Semantic Segmentation Based on Frustum Voxel Modeling and Monocular Camera
16
Authors: 秦超, 王亚飞, 张宇超, 殷承良. 《Journal of Shanghai Jiaotong university(Science)》 (EI), 2023, No. 1, pp. 100-113 (14 pages)
The semantic segmentation of a bird’s-eye view (BEV) is crucial for environment perception in autonomous driving, which includes the static elements of the scene, such as drivable areas, and dynamic elements such as cars. This paper proposes an end-to-end deep learning architecture based on 3D convolution to predict the semantic segmentation of a BEV, as well as voxel semantic segmentation, from monocular images. The voxelization of scenes and feature transformation from the perspective space to camera space are the key approaches of this model to boost the prediction accuracy. The effectiveness of the proposed method was demonstrated by training and evaluating the model on the NuScenes dataset. A comparison with other state-of-the-art methods showed that the proposed approach outperformed other approaches in the semantic segmentation of a BEV. It also implements voxel semantic segmentation, which cannot be achieved by the state-of-the-art methods.
Keywords: semantic segmentation; voxel semantic segmentation; deep learning; convolution neural network; bird’s-eye view (BEV)
Multidimensional attention and multiscale upsampling for semantic segmentation
17
Authors: LU Zhongda, ZHANG Chunda, WANG Lijing, XU Fengxia. 《Journal of Measurement Science and Instrumentation》 (CAS, CSCD), 2022, No. 1, pp. 68-78 (11 pages)
Semantic segmentation is a pixel-level classification task, and contextual information has an important impact on segmentation performance. In order to capture richer contextual information, we adopt ResNet as the backbone network and design an encoder-decoder architecture based on a multidimensional attention (MDA) module and a multiscale upsampling (MSU) module. The MDA module calculates the attention matrices of the three dimensions to capture the dependency of each position, and adaptively captures the image features. The MSU module adopts parallel branches to capture the multiscale features of the images, and multiscale feature aggregation can enhance contextual information. A series of experiments demonstrate the validity of the model on the Cityscapes and CamVid datasets.
Keywords: semantic segmentation; attention mechanism; multiscale feature; convolutional neural network (CNN); residual network (ResNet)
MsFireD-Net: A lightweight and efficient convolutional neural network for flame and smoke segmentation (cited by: 1)
18
作者 F.M.Anim Hossain Youmin Zhang 《Journal of Automation and Intelligence》 2023年第3期130-138,共9页
With the rising frequency and severity of wildfires across the globe, researchers have been actively searching for a reliable solution for early-stage forest fire detection. In recent years, Convolutional Neural Networks (CNNs) have demonstrated outstanding performances in computer vision-based object detection tasks, including forest fire detection. Using CNNs to detect forest fires by segmenting both flame and smoke pixels not only can provide early and accurate detection but also additional information such as the size, spread, location, and movement of the fire. However, CNN-based segmentation networks are computationally demanding and can be difficult to incorporate onboard lightweight mobile platforms, such as an Uncrewed Aerial Vehicle (UAV). To address this issue, this paper has proposed a new efficient upsampling technique based on transposed convolution to make segmentation CNNs lighter. This proposed technique, named Reversed Depthwise Separable Transposed Convolution (RDSTC), achieved F1-scores of 0.78 for smoke and 0.74 for flame, outperforming U-Net networks with bilinear upsampling, transposed convolution, and CARAFE upsampling. Additionally, a Multi-signature Fire Detection Network (MsFireD-Net) has been proposed in this paper, having 93% fewer parameters and 94% fewer computations than the RDSTC U-Net. Despite being such a lightweight and efficient network, MsFireD-Net has demonstrated strong results against the other U-Net-based networks.
Keywords: Forest fire detection; Convolutional neural network; semantic segmentation; UAV; Efficient upsampling
A Multi-Scale Network with the Encoder-Decoder Structure for CMR Segmentation (cited by: 1)
19
Authors: Chaoyang Xia, Jing Peng, Zongqing Ma, Xiaojie Li. Journal of Information Hiding and Privacy Protection, 2019, No. 3, pp. 109-117 (9 pages)
Cardiomyopathy is one of the most serious public health threats. Precise structural and functional cardiac measurement is an essential step for clinical diagnosis and follow-up treatment planning. Cardiologists are often required to draw endocardial and epicardial contours of the left ventricle (LV) manually during routine clinical diagnosis or treatment planning. This task is time-consuming and error-prone. Therefore, it is necessary to develop a fully automated end-to-end semantic segmentation method on cardiac magnetic resonance (CMR) imaging datasets. However, due to the low image quality and the deformation caused by heartbeat, there is no effective tool for the fully automated end-to-end cardiac segmentation task. In this work, we propose a multi-scale segmentation network (MSSN) for left ventricle segmentation. It can effectively learn myocardium and blood pool structure representations from 2D short-axis CMR image slices in a multi-scale way. Specifically, our method employs both parallel and serial dilated convolution layers with different dilation rates to capture multi-scale semantic features. Moreover, we design graduated up-sampling layers with subpixel layers as the decoder to reconstruct lost spatial information and produce accurate segmentation masks. We validated our method using 164 T1 Mapping CMR images and showed that it outperforms the advanced convolutional neural network (CNN) models. In validation, we achieved a Dice Similarity Coefficient (DSC) of 78.96%.
Keywords: Cardiac magnetic resonance imaging; multi-scale; semantic segmentation; convolutional neural networks
An improved pulse coupled neural networks model for semantic IoT
20
Authors: Rong Ma, Zhen Zhang, Yide Ma, Xiping Hu, Edith C. H. Ngai, Victor C. M. Leung. Digital Communications and Networks (SCIE, CSCD), 2024, No. 3, pp. 557-567 (11 pages)
In recent years, the Internet of Things (IoT) has gradually developed applications such as collecting sensory data and building intelligent services, which has led to an explosion in mobile data traffic. Meanwhile, with the rapid development of artificial intelligence, semantic communication has attracted great attention as a new communication paradigm. For IoT devices, however, processing image information efficiently in real time is an essential task for the rapid transmission of semantic information. With the increase of model parameters in deep learning methods, the model inference time in sensor devices continues to increase. In contrast, the Pulse Coupled Neural Network (PCNN) has fewer parameters, making it more suitable for processing real-time scene tasks such as image segmentation, which lays the foundation for real-time, effective, and accurate image transmission. However, the parameters of PCNN are determined by trial and error, which limits its application. To overcome this limitation, an Improved Pulse Coupled Neural Networks (IPCNN) model is proposed in this work. The IPCNN constructs the connection between the static properties of the input image and the dynamic properties of the neurons, and all its parameters are set adaptively, which avoids the inconvenience of manual setting in traditional methods and improves the adaptability of parameters to different types of images. Experimental segmentation results demonstrate the validity and efficiency of the proposed self-adaptive parameter setting method of IPCNN on the gray images and natural images from the Matlab and Berkeley Segmentation Datasets. The IPCNN method achieves a better segmentation result without training, providing a new solution for the real-time transmission of image semantic information.
Keywords: Internet of things (IoT); semantic information; real-time application; Improved pulse coupled neural network; Image segmentation