Journal literature: 1,492 articles found
A survey on semantic communications: Technologies, solutions, applications and challenges (cited by: 3)
1
Authors: Yating Liu, Xiaojie Wang, Zhaolong Ning, MengChu Zhou, Lei Guo, Behrouz Jedari. Source: Digital Communications and Networks (SCIE, CSCD), 2024, No. 3, pp. 528-545 (18 pages)
Semantic Communication (SC) has emerged as a novel communication paradigm that provides a receiver with meaningful information extracted from the source to maximize information transmission throughput in wireless networks, beyond the theoretical capacity limit. Despite the extensive research on SC, there is a lack of comprehensive surveys on technologies, solutions, applications, and challenges for SC. In this article, the development of SC is first reviewed and its characteristics, architecture, and advantages are summarized. Next, key technologies such as semantic extraction, semantic encoding, and semantic segmentation are discussed and their corresponding solutions in terms of efficiency, robustness, adaptability, and reliability are summarized. Applications of SC to UAV communication, remote image sensing and fusion, intelligent transportation, and healthcare are also presented and their strategies are summarized. Finally, some challenges and future research directions are presented to provide guidance for further research of SC.
Keywords: semantic communication semantic coding semantic extraction semantic communication framework semantic communication applications
Blockchain-based knowledge-aware semantic communications for remote driving image transmission
2
Authors: Yangfei Lin, Tutomu Murase, Yusheng Ji, Wugedele Bao, Lei Zhong, Jie Li. Source: Digital Communications and Networks, 2025, No. 2, pp. 317-325 (9 pages)
Remote driving, an emergent technology enabling remote operations of vehicles, presents a significant challenge in transmitting large volumes of image data to a central server. This requirement outpaces the capacity of traditional communication methods. To tackle this, we propose a novel framework using semantic communications, through a region of interest semantic segmentation method, to reduce the communication costs by transmitting meaningful semantic information rather than bit-wise data. To solve the knowledge base inconsistencies inherent in semantic communications, we introduce a blockchain-based edge-assisted system for managing diverse and geographically varied semantic segmentation knowledge bases. This system not only ensures the security of data through the tamper-resistant nature of blockchain but also leverages edge computing for efficient management. Additionally, the implementation of blockchain sharding handles differentiated knowledge bases for various tasks, thus boosting overall blockchain efficiency. Experimental results show a great reduction in latency by sharding and an increase in model accuracy, confirming our framework's effectiveness.
Keywords: semantic communication Remote driving semantic segmentation Blockchain Knowledge base management
Facial Video Semantic Coding for Semantic Communication
3
Authors: Du Qiyuan, Duan Yiping, Tao Xiaoming. Source: China Communications, 2025, No. 6, pp. 83-100 (18 pages)
Multimedia semantic communication has been receiving increasing attention due to its significant enhancement of communication efficiency. Semantic coding, which is oriented towards extracting and encoding the key semantics of video for transmission, is a key aspect in the framework of multimedia semantic communication. In this paper, we propose a facial video semantic coding method with low bitrate based on the temporal continuity of video semantics. At the sender’s end, we selectively transmit facial keypoints and deformation information, allocating distinct bitrates to different keypoints across frames. Compressive techniques involving sampling and quantization are employed to reduce the bitrate while retaining facial key semantic information. At the receiver’s end, a GAN-based generative network is utilized for reconstruction, effectively mitigating block artifacts and buffering problems present in traditional codec algorithms under low bitrates. The performance of the proposed approach is validated on multiple datasets, such as VoxCeleb and TalkingHead-1kH, employing metrics such as LPIPS, DISTS, and AKD for assessment. Experimental results demonstrate significant advantages over traditional codec methods, achieving up to approximately 10-fold bitrate reduction in prolonged, stable head pose scenarios across diverse conversational video settings.
Keywords: facial video semantic coding semantic communications talking head video compression
A Semantic Evaluation Framework for Medical Report Generation Using Large Language Models
4
Authors: Haider Ali, Rashadul Islam Sumon, Abdul Rehman Khalid, Kounen Fathima, Hee Cheol Kim. Source: Computers, Materials & Continua, 2025, No. 9, pp. 5445-5462 (18 pages)
Artificial intelligence is reshaping radiology by enabling automated report generation, yet evaluating the clinical accuracy and relevance of these reports is a challenging task, as traditional natural language generation metrics like BLEU and ROUGE prioritize lexical overlap over clinical relevance. To address this gap, we propose a novel semantic assessment framework for evaluating the accuracy of artificial intelligence-generated radiology reports against ground truth references. We trained the R2GenRL model on 5229 image–report pairs from the Indiana University chest X-ray dataset and generated a benchmark dataset on test data from the Indiana University chest X-ray and MIMIC-CXR datasets. These datasets were selected for their public availability, large scale, and comprehensive coverage of diverse clinical cases in chest radiography, enabling robust evaluation and comparison with prior work. Results demonstrate that the Mistral model, particularly with task-oriented prompting, achieves superior performance (up to 91.9% accuracy), surpassing other models and closely aligning with established metrics like BERTScore-F1 (88.1%) and CLIP-Score (88.7%). Statistical analyses, including paired t-tests (p < 0.01) and analysis of variance (p < 0.05), confirm significant improvements driven by structured prompting. Failure case analysis reveals limitations, such as over-reliance on lexical similarity, underscoring the need for domain-specific fine-tuning. This framework advances the evaluation of artificial intelligence-driven (AI-driven) radiology report generation, offering a robust, clinically relevant metric for assessing semantic accuracy and paving the way for more reliable automated systems in medical imaging.
Keywords: semantic assessment AI-generated radiology reports large language models prompt engineering semantic score evaluation
Discrete and Topological Correspondence Theory for Modal Meet-Implication Logic and Modal Meet-Semilattice Logic in Filter Semantics
5
Authors: Fei Liang, Zhiguang Zhao. Source: 《逻辑学研究》 (Studies in Logic), 2025, No. 3, pp. 25-66 (42 pages)
In the present paper, we give a systematic study of the discrete correspondence theory and topological correspondence theory of modal meet-implication logic and modal meet-semilattice logic, in the semantics provided in [21]. The special features of the present paper include the following three points: the first one is that the semantic structure used is based on a semilattice rather than an ordinary partial order; the second one is that the propositional variables are interpreted as filters rather than upsets, and the nominals, which are the “first-order counterparts” of propositional variables, are interpreted as principal filters rather than principal upsets; the third one is that in topological correspondence theory, the collection of admissible valuations is not closed under taking disjunction, which makes the proof of the topological Ackermann lemma different from existing settings.
Keywords: topological correspondence theory SEMILATTICE modal meet implication logic modal meet semilattice logic discrete correspondence theory semantic structure propositional variables filter semantics
Ten Challenges in Semantic Communications
6
Authors: Qin Zhijin, Ying Jingkai, Xin Gangtao, Fan Pingyi, Feng Wei, Ge Ning, Tao Xiaoming. Source: China Communications, 2025, No. 6, pp. 24-43 (20 pages)
In recent years, deep learning-based semantic communications have shown great potential to enhance the performance of communication systems. This has led to the belief that semantic communications represent a breakthrough beyond the Shannon paradigm and will play an essential role in future communications. To narrow the gap between current research and future vision, after an overview of semantic communications, this article presents and discusses ten fundamental and critical challenges in today’s semantic communication field. These challenges are divided into theory foundation, system design, and practical implementation. Challenges related to the theory foundation, including semantic capacity, entropy, and rate-distortion, are discussed first. Then, the system design challenges encompassing architecture, knowledge base, joint semantic-channel coding, tailored transmission scheme, and impairment are posed. The last two challenges, associated with the practical implementation, lie in cross-layer optimization for networks and standardization. For each challenge, efforts to date and thoughtful insights are provided.
Keywords: cross-layer optimization semantic communication semantic theory STANDARDIZATION
Entropy-Bottleneck-Based Privacy Protection Mechanism for Semantic Communication
7
Authors: Kaiyang Han, Xiaoqiang Jia, Yangfei Lin, Tsutomu Yoshinaga, Yalong Li, Jiale Wu. Source: Computers, Materials & Continua, 2025, No. 5, pp. 2971-2988 (18 pages)
With the rapid development of artificial intelligence and the Internet of Things, along with the growing demand for privacy-preserving transmission, the need for efficient and secure communication systems has become increasingly urgent. Traditional communication methods transmit data at the bit level without considering its semantic significance, leading to redundant transmission overhead and reduced efficiency. Semantic communication addresses this issue by extracting and transmitting only the most meaningful semantic information, thereby improving bandwidth efficiency. However, despite reducing the volume of data, it remains vulnerable to privacy risks, as semantic features may still expose sensitive information. To address this, we propose an entropy-bottleneck-based privacy protection mechanism for semantic communication. Our approach uses semantic segmentation to partition images into regions of interest (ROI) and regions of non-interest (RONI) based on the receiver’s needs, enabling differentiated semantic transmission. By focusing transmission on ROIs, bandwidth usage is optimized, and non-essential data is minimized. The entropy bottleneck model probabilistically encodes the semantic information into a compact bit stream, reducing correlation between the transmitted content and the original data, thus enhancing privacy protection. The proposed framework is systematically evaluated in terms of compression efficiency, semantic fidelity, and privacy preservation. Through comparative experiments with traditional and state-of-the-art methods, we demonstrate that the approach significantly reduces data transmission, maintains the quality of semantically important regions, and ensures robust privacy protection.
Keywords: semantic communication privacy protection semantic segmentation entropy-based compression
CAMSNet: Few-Shot Semantic Segmentation via Class Activation Map and Self-Cross Attention Block
8
Authors: Jingjing Yan, Xuyang Zhuang, Xuezhuan Zhao, Xiaoyan Shao, Jiaqi Han. Source: Computers, Materials & Continua, 2025, No. 3, pp. 5363-5386 (24 pages)
The key to the success of few-shot semantic segmentation (FSS) depends on the efficient use of a limited annotated support set to accurately segment novel classes in the query set. Due to the few samples in the support set, FSS faces challenges such as intra-class differences, background (BG) mismatches between query and support sets, and ambiguous segmentation between the foreground (FG) and BG in the query set. To address these issues, this paper proposes a multi-module network called CAMSNet, which includes four modules: the General Information Module (GIM), the Class Activation Map Aggregation (CAMA) module, the Self-Cross Attention (SCA) Block, and the Feature Fusion Module (FFM). In CAMSNet, the GIM employs an improved triplet loss, which concatenates word embedding vectors and support prototypes as anchors, and uses local support features of FG and BG as positive and negative samples to help solve the problem of intra-class differences. Then, for the first time, the Class Activation Map (CAM) from Weakly Supervised Semantic Segmentation (WSSS) is applied to FSS within the CAMA module. This method replaces the traditional use of cosine similarity to locate query information. Subsequently, the SCA Block processes the support and query features aggregated by the CAMA module, significantly enhancing the understanding of input information, leading to more accurate predictions and effectively addressing BG mismatch and ambiguous FG-BG segmentation. Finally, the FFM combines general class information with the enhanced query information to achieve accurate segmentation of the query image. Extensive experiments on PASCAL-5i and COCO-20i demonstrate that CAMSNet yields superior performance and sets a new state of the art.
Keywords: Few-shot semantic segmentation semantic segmentation meta learning
A Multi-Level Semantic Constraint Approach for Highway Tunnel Scene Twin Modeling (cited by: 1)
9
Authors: LI Yufei, XIE Yakun, CHEN Mingzhen, ZHAO Yaoji, TU Jiaxing, HU Ya. Source: Journal of Geodesy and Geoinformation Science, 2025, No. 2, pp. 37-56 (20 pages)
As a key node of the modern transportation network, the information-based management of road tunnels is crucial to ensuring operational safety and traffic efficiency. However, existing tunnel vehicle modeling methods generally suffer from problems such as insufficient 3D scene description capability and low dynamic update efficiency, which make it difficult to meet the demand for real-time, accurate management. For this reason, this paper proposes a vehicle twin modeling method for road tunnels. Starting from actual management needs, the approach supports multi-level dynamic modeling of vehicle type, size, and color by constructing a vehicle model library that can be flexibly invoked. At the same time, semantic constraint rules covering geometric layout, behavioral attributes, and spatial relationships are designed to ensure that the virtual model matches the real one with a high degree of similarity. Precise virtual-real mapping of the dynamic vehicle status in the tunnel is realized by integrating real-time monitoring data with the semantic constraints. Finally, a prototype system is constructed and case experiments are conducted in selected case areas, combining real-time monitoring data to realize dynamic updating and three-dimensional visualization of vehicle states in tunnels. The experiments show that the proposed method runs smoothly, with an average rendering efficiency of 17.70 ms, while guaranteeing modeling accuracy (composite similarity of 0.867), which significantly improves the real-time performance and intuitiveness of tunnel management. The research results provide reliable technical support for intelligent operation and emergency response of road tunnels, and offer new ideas for digital twin modeling of complex scenes.
Keywords: highway tunnel twin modeling multi-level semantic constraints tunnel vehicles multidimensional modeling
MG-SLAM: RGB-D SLAM Based on Semantic Segmentation for Dynamic Environment in the Internet of Vehicles (cited by: 1)
10
Authors: Fengju Zhang, Kai Zhu. Source: Computers, Materials & Continua, 2025, No. 2, pp. 2353-2372 (20 pages)
The Internet of Vehicles (IoV) has become an important direction in the field of intelligent transportation, in which vehicle positioning is a crucial part. SLAM (Simultaneous Localization and Mapping) technology plays a crucial role in vehicle localization and navigation. Traditional SLAM systems are designed for use in static environments, and they can result in poor performance in terms of accuracy and robustness when used in dynamic environments where objects are in constant movement. To address this issue, a new real-time visual SLAM system called MG-SLAM has been developed. Based on ORB-SLAM2, MG-SLAM incorporates a dynamic target detection process that enables the detection of both known and unknown moving objects. In this process, a separate semantic segmentation thread is required to segment dynamic target instances, and the Mask R-CNN algorithm is applied on the Graphics Processing Unit (GPU) to accelerate segmentation. To reduce computational cost, only key frames are segmented to identify known dynamic objects. Additionally, a multi-view geometry method is adopted to detect unknown moving objects. The results demonstrate that MG-SLAM achieves higher precision, with an improvement from 0.2730 m to 0.0135 m. Moreover, the processing time required by MG-SLAM is significantly reduced compared to other dynamic scene SLAM algorithms, which illustrates its efficacy in locating objects in dynamic scenes.
Keywords: Visual SLAM dynamic scene semantic segmentation GPU acceleration key segmentation frame
Semantic Segmentation of Lumbar Vertebrae Using Meijering U-Net (MU-Net) on Spine Magnetic Resonance Images
11
Authors: Lakshmi S V V, Shiloah Elizabeth Darmanayagam, Sunil Retmin Raj Cyril. Source: Computer Modeling in Engineering & Sciences (SCIE, EI), 2025, No. 1, pp. 733-757 (25 pages)
Lower back pain is one of the most common medical problems in the world and it is experienced by a huge percentage of people everywhere. Due to its ability to produce a detailed view of the soft tissues, including the spinal cord, nerves, intervertebral discs, and vertebrae, Magnetic Resonance Imaging is thought to be the most effective method for imaging the spine. The semantic segmentation of vertebrae plays a major role in the diagnostic process of lumbar diseases. It is difficult to semantically partition the vertebrae in Magnetic Resonance Images from the surrounding variety of tissues, including muscles, ligaments, and intervertebral discs. U-Net is a powerful deep-learning architecture to handle the challenges of medical image analysis tasks and achieves high segmentation accuracy. This work proposes a modified U-Net architecture, namely MU-Net, consisting of the Meijering convolutional layer that incorporates the Meijering filter to perform the semantic segmentation of lumbar vertebrae L1 to L5 and sacral vertebra S1. Pseudo-colour mask images were generated and used as ground truth for training the model. The work has been carried out on 1312 images expanded from T1-weighted mid-sagittal MRI images of 515 patients in the Lumbar Spine MRI Dataset publicly available from Mendeley Data. The proposed MU-Net model for the semantic segmentation of the lumbar vertebrae gives better performance, with 98.79% pixel accuracy (PA), 98.66% dice similarity coefficient (DSC), 97.36% Jaccard coefficient, and 92.55% mean Intersection over Union (mean IoU) on the mentioned dataset.
Keywords: Computer aided diagnosis (CAD) magnetic resonance imaging (MRI) semantic segmentation lumbar vertebrae deep learning U-Net model
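The overlap metrics named in the MU-Net abstract (pixel accuracy, Dice similarity coefficient, Jaccard coefficient) have standard definitions that a small example can make concrete. The following is a minimal Python sketch for a single binary class, not the authors' evaluation code; the function name `binary_segmentation_metrics` and the toy masks are assumptions introduced for illustration.

```python
from typing import Dict, List

Mask = List[List[int]]  # 2D grid of 0/1 labels (hypothetical toy format)


def binary_segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Pixel accuracy, Dice and Jaccard (IoU) for one binary class.

    Assumes both masks are non-empty and have the same shape.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # predicted foreground, truly foreground
            elif p:
                fp += 1          # predicted foreground, truly background
            elif t:
                fn += 1          # missed foreground pixel
            else:
                tn += 1          # correctly predicted background
    total = tp + fp + fn + tn
    union = tp + fp + fn
    # If neither mask contains foreground, treat the overlap as perfect.
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    jaccard = tp / union if union else 1.0
    return {
        "pixel_accuracy": (tp + tn) / total,
        "dice": dice,
        "jaccard": jaccard,
    }


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
metrics = binary_segmentation_metrics(pred, truth)
print(metrics)
```

Mean IoU is usually obtained by computing the Jaccard value per class and averaging over classes, which is why it can differ from a single aggregate Jaccard figure.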
LiDAR-Visual SLAM with Integrated Semantic and Texture Information for Enhanced Ecological Monitoring Vehicle Localization
12
Authors: Yiqing Lu, Liutao Zhao, Qiankun Zhao. Source: Computers, Materials & Continua (SCIE, EI), 2025, No. 1, pp. 1401-1416 (16 pages)
Ecological monitoring vehicles are equipped with a range of sensors and monitoring devices designed to gather data on ecological and environmental factors. These vehicles are crucial in various fields, including environmental science research, ecological and environmental monitoring projects, disaster response, and emergency management. A key method employed in these vehicles for achieving high-precision positioning is LiDAR (light detection and ranging)-Visual Simultaneous Localization and Mapping (SLAM). However, maintaining high-precision localization in complex scenarios, such as degraded environments or when dynamic objects are present, remains a significant challenge. To address this issue, we integrate both semantic and texture information from LiDAR and cameras to enhance the robustness and efficiency of data registration. Specifically, semantic information simplifies the modeling of scene elements, reducing the reliance on dense point clouds, which can be less efficient. Meanwhile, visual texture information complements LiDAR-Visual localization by providing additional contextual details. By incorporating semantic and texture details from paired images and point clouds, we significantly improve the quality of data association, thereby increasing the success rate of localization. This approach not only enhances the operational capabilities of ecological monitoring vehicles in complex environments but also contributes to improving the overall efficiency and effectiveness of ecological monitoring and environmental protection efforts.
Keywords: LiDAR-Visual simultaneous localization and mapping integrated semantic texture information
An Analysis of OpenSeeD for Video Semantic Labeling
13
Authors: Jenny Zhu. Source: Journal of Computer and Communications, 2025, No. 1, pp. 59-71 (13 pages)
Semantic segmentation is a core task in computer vision that allows AI models to interact with and understand their surrounding environment. Similarly to how humans subconsciously segment scenes, this ability is crucial for scene understanding. However, a challenge many semantic learning models face is the lack of data. Existing video datasets are limited to short, low-resolution videos that are not representative of real-world examples. Thus, one of our key contributions is a customized semantic segmentation version of the Walking Tours Dataset that features hour-long, high-resolution, real-world data from tours of different cities. Additionally, we evaluate the performance of the open-vocabulary semantic model OpenSeeD on our own custom dataset and discuss future implications.
Keywords: semantic Segmentation Detection LABELING OpenSeeD Open-Vocabulary Walking Tours Dataset VIDEOS
EffNet-CNN: A Semantic Model for Image Mining & Content-Based Image Retrieval
14
Authors: Rajendran Thanikachalam, Anandhavalli Muniasamy, Ashwag Alasmari, Rajendran Thavasimuthu. Source: Computer Modeling in Engineering & Sciences, 2025, No. 5, pp. 1971-2000 (30 pages)
Content-Based Image Retrieval (CBIR) and image mining are becoming increasingly important fields of study in computer vision due to their wide range of applications in healthcare, security, and various other domains. The image retrieval system mainly relies on the efficiency and accuracy of the classification models. This research addresses the challenge of enhancing the image retrieval system by developing a novel approach, EfficientNet-Convolutional Neural Network (EffNet-CNN). The key objective of this research is to evaluate the proposed EffNet-CNN model’s performance in image classification, image mining, and CBIR. The novelty of the proposed EffNet-CNN model includes the integration of different techniques and modifications. The model includes the Mahalanobis distance metric for feature matching, which enhances the similarity measurements. The model extends the EfficientNet architecture by incorporating additional convolutional layers, batch normalization, dropout, and pooling layers for improved hierarchical feature extraction. It also includes systematic hyperparameter optimization using SGD, performance evaluation on three datasets, and data normalization to improve feature representations. The EffNet-CNN is assessed using precision, accuracy, F-measure, and recall metrics across the MS-COCO, CIFAR-10, and CIFAR-100 datasets. The model achieved accuracy values ranging from 90.60% to 95.90% for the MS-COCO dataset, 96.8% to 98.3% for the CIFAR-10 dataset, and 92.9% to 98.6% for the CIFAR-100 dataset. A comparison of the EffNet-CNN model’s results with other models reveals the proposed model’s superior performance. The results highlight the potential of the proposed EffNet-CNN model for image classification and its usefulness in image mining and CBIR.
Keywords: Image mining CBIR semantic features EffNet-CNN image retrieval
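The EffNet-CNN abstract mentions the Mahalanobis distance for feature matching without giving details. As a hedged illustration of the general technique only (not the authors' implementation), the sketch below computes the textbook definition sqrt((u - v)^T S^-1 (u - v)) in plain Python. The helper `solve`, the function `mahalanobis`, and the toy covariance matrix are assumptions introduced here; the linear system is solved directly so that no matrix inverse is formed.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is square and non-singular (e.g. a valid covariance matrix).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def mahalanobis(u: Vector, v: Vector, cov: Matrix) -> float:
    """Mahalanobis distance between feature vectors u and v under covariance cov."""
    diff = [a - b for a, b in zip(u, v)]
    weighted = solve(cov, diff)  # equals cov^-1 @ diff
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


# With an identity covariance the distance reduces to the Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
print(mahalanobis([0.0, 0.0], [3.0, 4.0], identity))
```

In a retrieval setting, a query feature vector would be compared against each gallery vector and the results ranked by ascending distance; how the covariance is estimated in the paper is not stated in the abstract.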
Software Defect Prediction Based on Semantic Views of Metrics: Clustering Analysis and Model Performance Analysis
15
Authors: Baishun Zhou, Haijiao Zhao, Yuxin Wen, Gangyi Ding, Ying Xing, Xinyang Lin, Lei Xiao. Source: Computers, Materials & Continua, 2025, No. 9, pp. 5201-5221 (21 pages)
In recent years, with the rapid development of software systems, the continuous expansion of software scale and the increasing complexity of systems have led to the emergence of a growing number of software metrics. Defect prediction methods based on software metric elements highly rely on software metric data. However, redundant software metric data is not conducive to efficient defect prediction, posing severe challenges to current software defect prediction tasks. To address these issues, this paper focuses on the rational clustering of software metric data. Firstly, multiple software projects are evaluated to determine the preset number of clusters for software metrics, and various clustering methods are employed to cluster the metric elements. Subsequently, a co-occurrence matrix is designed to comprehensively quantify the number of times that metrics appear in the same category. Based on the comprehensive results, the software metric data are divided into two semantic views containing different metrics, thereby analyzing the semantic information behind the software metrics. On this basis, this paper also conducts an in-depth analysis of the impact of different semantic views of metrics on defect prediction results, as well as the performance of various classification models under these semantic views. Experiments show that the joint use of the two semantic views can significantly improve the performance of models in software defect prediction, providing a new understanding and approach at the semantic view level for defect prediction research based on software metrics.
Keywords: Software defect prediction software engineering semantic views CLUSTERING INTERPRETABILITY
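The co-occurrence matrix described in the defect-prediction abstract counts how often two metrics land in the same cluster across several clustering runs. The sketch below is one plausible reading of that idea in Python, not the paper's procedure; the function name `co_occurrence`, the metric names, and the cluster labels are all hypothetical.

```python
from typing import Dict, List


def co_occurrence(clusterings: List[Dict[str, int]], metrics: List[str]) -> List[List[int]]:
    """Count, for every pair of metrics, how many clusterings group them together.

    Each clustering maps a metric name to its cluster label. The diagonal is
    left at zero because a metric trivially shares a cluster with itself.
    """
    counts = [[0] * len(metrics) for _ in metrics]
    for labels in clusterings:
        for i, first in enumerate(metrics):
            for j, second in enumerate(metrics):
                if i != j and labels[first] == labels[second]:
                    counts[i][j] += 1
    return counts


# Hypothetical metric names and labels from three clustering methods.
metrics = ["loc", "cyclomatic", "coupling"]
clusterings = [
    {"loc": 0, "cyclomatic": 0, "coupling": 1},
    {"loc": 1, "cyclomatic": 1, "coupling": 1},
    {"loc": 0, "cyclomatic": 1, "coupling": 1},
]
for name, row in zip(metrics, co_occurrence(clusterings, metrics)):
    print(name, row)
```

Pairs with high counts are metrics that most clustering methods agree belong together, which is the kind of evidence one could use to split the metrics into views.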
Remote sensing image semantic segmentation algorithm based on improved DeepLabv3+
16
Authors: SONG Xirui, GE Hongwei, LI Ting. Source: Journal of Measurement Science and Instrumentation, 2025, No. 2, pp. 205-215 (11 pages)
The convolutional neural network (CNN) method based on DeepLabv3+ has some problems in the semantic segmentation task of high-resolution remote sensing images, such as a fixed receptive field size for feature extraction, lack of semantic information, high decoder magnification, and insufficient detail retention ability. A hierarchical feature fusion network (HFFNet) was proposed. Firstly, a combination of transformer and CNN architectures was employed for feature extraction from images of varying resolutions. The extracted features were processed independently. Subsequently, the features from the transformer and CNN were fused under the guidance of features from different sources. This fusion process assisted in restoring information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module was designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results showed that HFFNet had superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and its cross-linking index was better than DeepLabv3+ and other competing methods, showing strong generalization ability.
Keywords: semantic segmentation high-resolution remote sensing image deep learning transformer model attention mechanism feature fusion ENCODER DECODER
MNTSCC: A VMamba-Based Nonlinear Joint Source-Channel Coding for Semantic Communications
17
作者 Chao Li Chen Wang +2 位作者 Caichang Ding Yonghao Liao Zhiwei Ye 《Computers, Materials & Continua》 2025年第11期3129-3149,共21页
Deep learning-based semantic communication has achieved remarkable progress with CNNs and Transformers. However, CNNs exhibit constrained performance in high-resolution image transmission, while Transformers incur high computational cost due to quadratic complexity. Recently, VMamba, a novel state space model with linear complexity and exceptional long-range dependency modeling capabilities, has shown great potential in computer vision tasks. Inspired by this, we propose MNTSCC, an efficient VMamba-based nonlinear joint source-channel coding (JSCC) model for wireless image transmission. Specifically, MNTSCC comprises a VMamba-based nonlinear transform module, an MCAM entropy model, and a JSCC module. In the encoding stage, the input image is first encoded into a latent representation via the nonlinear transform module, which is then processed by the MCAM for source distribution modeling. The JSCC module then optimizes transmission efficiency by adaptively assigning a transmission rate to the latent representation according to the estimated entropy values. The proposed MCAM enhances the channel-wise autoregressive entropy model with attention mechanisms, which enables the entropy model to capture both global and local information within latent features, yielding more accurate entropy estimation and improved rate-distortion performance. Additionally, to further enhance the robustness of the system under varying signal-to-noise ratio (SNR) conditions, we incorporate an SNR adaptive net (SAnet) into the JSCC module, which dynamically adjusts the encoding strategy by integrating SNR information with latent features, thereby improving SNR adaptability. Experimental results across diverse resolution datasets demonstrate that the proposed method achieves superior image transmission performance compared to existing CNN- and Transformer-based semantic communication models, while maintaining competitive computational efficiency. In particular, under an Additive White Gaussian Noise (AWGN) channel with SNR = 10 dB and a channel bandwidth ratio (CBR) of 1/16, MNTSCC consistently outperforms NTSCC, achieving a 1.72 dB Peak Signal-to-Noise Ratio (PSNR) gain on the Kodak24 dataset, 0.79 dB on CLIC2022, and 2.54 dB on CIFAR-10, while reducing computational cost by 32.23%. The code is available at https://github.com/WanChen10/MNTSCC (accessed on 09 July 2025).
Keywords: semantic communication; VMamba; wireless image transmission; joint source-channel coding; channel adaptation; nonlinear transformation
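The MNTSCC abstract above reports its gains as PSNR at a fixed channel bandwidth ratio (CBR). As a minimal sketch of how these two quantities are commonly computed, assuming 8-bit pixel values and the usual JSCC convention that CBR is the number of channel symbols divided by the number of source values (the abstract itself defines neither), one could write the following. The function names `psnr` and `channel_symbols` are hypothetical and are not taken from the MNTSCC code.

```python
import math


def psnr(reference, received, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumes pixel values lie in [0, max_value]; 255 corresponds to 8-bit images.
    """
    if len(reference) != len(received):
        raise ValueError("images must contain the same number of values")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def channel_symbols(height, width, channels, cbr):
    """Number of channel symbols available for one image at a given CBR.

    Assumes CBR = (channel symbols) / (source values), an assumption about
    the convention rather than a statement from the abstract.
    """
    return int(height * width * channels * cbr)


if __name__ == "__main__":
    # A 32 x 32 RGB image (the CIFAR-10 size) at CBR = 1/16.
    print(channel_symbols(32, 32, 3, 1 / 16))
    # Every value off by one grey level.
    print(round(psnr([0, 0, 0, 0], [1, 1, 1, 1]), 2))
```

Under these assumptions, a PSNR gain of 1.72 dB at the same CBR means lower reconstruction error for the same channel budget.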
Improved SE-UNet network-based semantic segmentation and extraction of hidden geological significance in geological maps
18
Authors: Kai Ma, Jun-jie Liu, Si-qi Lu, Ze-hua Huang, Miao Tian, Jun-yuan Deng, Zhong Xie, Qin-jun Qiu. China Geology, 2025, No. 4, pp. 643-660 (18 pages)
Automatic segmentation and recognition of content and element information in color geological maps is of great significance for researchers analyzing the distribution of mineral resources and predicting disaster information. This article focuses on color planar raster geological maps (geological maps include planar geological maps, columnar maps, and profiles). While existing deep learning approaches are often used to segment general images, their performance on color geological maps is limited by complex elements, diverse regional features, and complicated backgrounds. To address this issue, a color geological map segmentation model named GeoMSeg is proposed that combines the Felz clustering algorithm with an improved SE-UNet deep learning network. First, a symmetrical encoder-decoder backbone network based on UNet is constructed, and the channel attention mechanism SENet is incorporated to strengthen the network's capacity for feature representation, enabling the model to extract map information purposefully. The SE-UNet network extracts features from the geological map and produces coarse segmentation results. Second, the Felz clustering algorithm is used for superpixel pre-segmentation of the geological maps. The coarse segmentation results are then refined based on the superpixel pre-segmentation results to obtain the final segmentation results. This study applies GeoMSeg to the constructed dataset, and the experimental results show that the proposed algorithm outperforms other mainstream map segmentation models, with an accuracy of 91.89% and an MIoU of 71.91%.
Keywords: Geological map; UNet model; Image segmentation; semantic segmentation; Pixel pre-segmentation; Clustering algorithm; Attention mechanism; Deep learning; Artificial intelligence; Geological survey engineering
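The GeoMSeg abstract above says that coarse network predictions are refined using a superpixel pre-segmentation, but it does not state the refinement rule. One common choice, shown here purely as an assumption, is a majority vote: every pixel in a superpixel takes the label that occurs most often among the coarse predictions inside that superpixel. The function name `refine_with_superpixels` is hypothetical, and the images are represented as flat lists for simplicity.

```python
from collections import Counter, defaultdict


def refine_with_superpixels(coarse_labels, superpixel_ids):
    """Majority-vote refinement (an assumed rule, not the paper's stated method).

    coarse_labels:  per-pixel class labels from the segmentation network.
    superpixel_ids: per-pixel superpixel index from the pre-segmentation.
    Both are flat sequences of equal length.
    """
    if len(coarse_labels) != len(superpixel_ids):
        raise ValueError("inputs must have the same number of pixels")

    # Count how often each class label appears inside each superpixel.
    votes = defaultdict(Counter)
    for label, sp in zip(coarse_labels, superpixel_ids):
        votes[sp][label] += 1

    # The most frequent label wins for the whole superpixel.
    winner = {sp: counts.most_common(1)[0][0] for sp, counts in votes.items()}
    return [winner[sp] for sp in superpixel_ids]


if __name__ == "__main__":
    coarse = [1, 1, 2, 3, 3, 3]       # one stray "2" inside the first region
    superpixels = [0, 0, 0, 1, 1, 1]  # two superpixels of three pixels each
    print(refine_with_superpixels(coarse, superpixels))
```

A vote of this kind snaps region boundaries to the superpixel edges and removes isolated misclassified pixels, which is one plausible reading of "refined and modified" in the abstract.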
CG-FCLNet: Category-Guided Feature Collaborative Learning Network for Semantic Segmentation of Remote Sensing Images
19
Authors: Min Yao, Guangjie Hu, Yaozu Zhang. Computers, Materials & Continua, 2025, No. 5, pp. 2751-2771 (21 pages)
Semantic segmentation of remote sensing images is a critical research area in the field of remote sensing. Despite the success of Convolutional Neural Networks (CNNs), they often fail to capture inter-layer feature relationships and fully leverage contextual information, leading to the loss of important details. Additionally, due to significant intra-class variation and small inter-class differences in remote sensing images, CNNs may experience class confusion. To address these issues, we propose a novel Category-Guided Feature Collaborative Learning Network (CG-FCLNet), which enables fine-grained feature extraction and adaptive fusion. Specifically, we design a Feature Collaborative Learning Module (FCLM) to facilitate the tight interaction of multi-scale features. We also introduce a Scale-Aware Fusion Module (SAFM), which iteratively fuses features from different layers using a spatial attention mechanism, enabling deeper feature fusion. Furthermore, we design a Category-Guided Module (CGM) to extract category-aware information that guides feature fusion, ensuring that the fused features more accurately reflect the semantic information of each category, thereby improving detailed segmentation. The experimental results show that CG-FCLNet achieves a Mean Intersection over Union (mIoU) of 83.46%, an mF1 of 90.87%, and an Overall Accuracy (OA) of 91.34% on the Vaihingen dataset. On the Potsdam dataset, it achieves an mIoU of 86.54%, an mF1 of 92.65%, and an OA of 91.29%. These results highlight the superior performance of CG-FCLNet compared to existing state-of-the-art methods.
Keywords: semantic segmentation; remote sensing; feature context interaction; attention module; category-guided module
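The CG-FCLNet abstract above reports mIoU, mF1, and OA. As a minimal sketch of the standard definitions, assuming all three are derived from a per-pixel confusion matrix and that classes absent from both the ground truth and the predictions are skipped when averaging (the abstract does not describe its evaluation protocol), the metrics can be computed as follows. The function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def segmentation_scores(matrix):
    """Return overall accuracy, mean IoU, and mean F1 from a confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))

    ious, f1s = [], []
    for i in range(n):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                        # true class i, predicted other
        fp = sum(matrix[j][i] for j in range(n)) - tp   # predicted i, true class other
        if tp + fp + fn == 0:
            continue  # class never appears; skipped by assumption
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))

    return {
        "OA": correct / total,
        "mIoU": sum(ious) / len(ious),
        "mF1": sum(f1s) / len(f1s),
    }


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    prediction = [0, 1, 1, 1]
    print(segmentation_scores(confusion_matrix(truth, prediction, 2)))
```

Because IoU penalizes false positives and false negatives more heavily than F1 does, mIoU is typically the lowest of the three figures, which matches the ordering of the numbers reported in the abstract.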