Global semantic structures of two large semantic networks, HowNet and WordNet, are analyzed. It is found that they are both complex networks with features of small-world and scale-free, but with special properties. Ex...Global semantic structures of two large semantic networks, HowNet and WordNet, are analyzed. It is found that they are both complex networks with features of small-world and scale-free, but with special properties. Exponents of power law degree distribution of these two networks are between 1.0 and 2. 0, different from most scale-free networks which have exponents near 3.0. Coefficients of degree correlation are lower than 0, similar to biological networks. The BA (Barabasi-Albert) model and other similar models cannot explain their dynamics. Relations between clustering coefficient and node degree obey scaling law, which suggests that there exist self-similar hierarchical structures in networks. The results suggest that structures of semantic networks are influenced by the ways we learn semantic knowledge such as aggregation and metaphor.展开更多
The presentation method of the mechanical motion scheme must support thewhole process of conceptual design. To meet the requirement, a semantic network method is selectedto represent process level, action level, mecha...The presentation method of the mechanical motion scheme must support thewhole process of conceptual design. To meet the requirement, a semantic network method is selectedto represent process level, action level, mechanism level and relationships among them. Computeraided motion cycle chart exploration can be realized by the representation and revision of timecoordination of mechanism actions and their effect on the design scheme. The uncertain reasoningtechnology based on semantic network is applied in the mechanism types selection of the needledriving mechanism of industrial sewing mechanism, and the application indicated it is correct,useful and advance.展开更多
Abstract: It was discussed that the way to reflect the internal relations between judgment and identification, the two most fundamental ways of thinking or cognition operations, during the course of the semantic netw...Abstract: It was discussed that the way to reflect the internal relations between judgment and identification, the two most fundamental ways of thinking or cognition operations, during the course of the semantic network knowledge representation processing. A new extended Petri net is defined based on qualitative mapping, which strengths the expressive ability of the feature of thinking and the mode of action of brain. A model of semantic network knowledge representation based on new Petri net is given. Semantic network knowledge has a more efficient representation and reasoning mechanism. This model not only can reflect the characteristics of associative memory in semantic network knowledge representation, but also can use Petri net to express the criterion changes and its change law of recognition judgment, especially the cognitive operation of thinking based on extraction and integration of sensory characteristics to well express the thinking transition course from quantitative change to qualitative change of human cognition.展开更多
Based on the definition of component ontology, an effective component classification mechanism and a facet named component relationship are proposed. Then an application domain oriented, hierarchical component organiz...Based on the definition of component ontology, an effective component classification mechanism and a facet named component relationship are proposed. Then an application domain oriented, hierarchical component organization model is established. At last a hierarchical component semantic network (HCSN) described by ontology interchange language(OIL) is presented and then its function is described. Using HCSN and cooperating with other components retrieving algorithms based on component description, other components information and their assembly or composite modes related to the key component can be found. Based on HCSN, component directory library is catalogued and a prototype system is constructed. The prototype system proves that component library organization based on this model gives guarantee to the reliability of component assembly during program mining.展开更多
Geothermal resources are efficient,renewable and clean energy sources,and their reservoirs are usually closely associated with high-temperature regions of the land surface.Current exploration methods primarily involve...Geothermal resources are efficient,renewable and clean energy sources,and their reservoirs are usually closely associated with high-temperature regions of the land surface.Current exploration methods primarily involve migrating traditional geological techniques,which fail to fully use the unique features of geothermal radiation characteristics.Thermal infrared remote-sensing imaging technology can capture and present areas with distinctive surface thermal radiation features,providing considerable significance as a guide for localization prior tofield exploration.In this study,we propose a deep learningebased method for intelligently identifying and segmenting geothermal radiation sources from thermal infrared remote-sensing images,including data preparation and model training.To improve the localization drift and anomalous interference caused by the high complexity of the Earth's surface environment,this study uses a surface temperature retrieval algorithm to calculate the land surface temperature in the research area.The retrieval results are used to train the semantic segmentation model.In addition,a pixel-level geothermal spatial segmentation network(PGSSNet)is proposed to suppress the diffuse thermal radiation and reduce the broad and blurred white areas of images to exact locations.Once the training is completed,the model directly segments and extracts the actual range of thermal radiation sources from subsequent thermal infrared remote-sensing images without temperature retrieval and/or manual calibration.展开更多
High-resolution remote sensing images(HRSIs)are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies.However,their significant scale changes an...High-resolution remote sensing images(HRSIs)are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies.However,their significant scale changes and wealth of spatial details pose challenges for semantic segmentation.While convolutional neural networks(CNNs)excel at capturing local features,they are limited in modeling long-range dependencies.Conversely,transformers utilize multihead self-attention to integrate global context effectively,but this approach often incurs a high computational cost.This paper proposes a global-local multiscale context network(GLMCNet)to extract both global and local multiscale contextual information from HRSIs.A detail-enhanced filtering module(DEFM)is proposed at the end of the encoder to refine the encoder outputs further,thereby enhancing the key details extracted by the encoder and effectively suppressing redundant information.In addition,a global-local multiscale transformer block(GLMTB)is proposed in the decoding stage to enable the modeling of rich multiscale global and local information.We also design a stair fusion mechanism to transmit deep semantic information from deep to shallow layers progressively.Finally,we propose the semantic awareness enhancement module(SAEM),which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention.Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method.Specifically,our method achieved a mean Intersection over Union(mIoU)of 86.89%on the ISPRS Potsdam dataset and 84.34%on the ISPRS Vaihingen dataset,outperforming existing models such as ABCNet and BANet.展开更多
The Internet of Vehicles (IoV) has become an important direction in the field of intelligent transportation, in which vehicle positioning is a crucial part. SLAM (Simultaneous Localization and Mapping) technology play...The Internet of Vehicles (IoV) has become an important direction in the field of intelligent transportation, in which vehicle positioning is a crucial part. SLAM (Simultaneous Localization and Mapping) technology plays a crucial role in vehicle localization and navigation. Traditional Simultaneous Localization and Mapping (SLAM) systems are designed for use in static environments, and they can result in poor performance in terms of accuracy and robustness when used in dynamic environments where objects are in constant movement. To address this issue, a new real-time visual SLAM system called MG-SLAM has been developed. Based on ORB-SLAM2, MG-SLAM incorporates a dynamic target detection process that enables the detection of both known and unknown moving objects. In this process, a separate semantic segmentation thread is required to segment dynamic target instances, and the Mask R-CNN algorithm is applied on the Graphics Processing Unit (GPU) to accelerate segmentation. To reduce computational cost, only key frames are segmented to identify known dynamic objects. Additionally, a multi-view geometry method is adopted to detect unknown moving objects. The results demonstrate that MG-SLAM achieves higher precision, with an improvement from 0.2730 m to 0.0135 m in precision. Moreover, the processing time required by MG-SLAM is significantly reduced compared to other dynamic scene SLAM algorithms, which illustrates its efficacy in locating objects in dynamic scenes.展开更多
Lower back pain is one of the most common medical problems in the world and it is experienced by a huge percentage of people everywhere.Due to its ability to produce a detailed view of the soft tissues,including the s...Lower back pain is one of the most common medical problems in the world and it is experienced by a huge percentage of people everywhere.Due to its ability to produce a detailed view of the soft tissues,including the spinal cord,nerves,intervertebral discs,and vertebrae,Magnetic Resonance Imaging is thought to be the most effective method for imaging the spine.The semantic segmentation of vertebrae plays a major role in the diagnostic process of lumbar diseases.It is difficult to semantically partition the vertebrae in Magnetic Resonance Images from the surrounding variety of tissues,including muscles,ligaments,and intervertebral discs.U-Net is a powerful deep-learning architecture to handle the challenges of medical image analysis tasks and achieves high segmentation accuracy.This work proposes a modified U-Net architecture namely MU-Net,consisting of the Meijering convolutional layer that incorporates the Meijering filter to perform the semantic segmentation of lumbar vertebrae L1 to L5 and sacral vertebra S1.Pseudo-colour mask images were generated and used as ground truth for training the model.The work has been carried out on 1312 images expanded from T1-weighted mid-sagittal MRI images of 515 patients in the Lumbar Spine MRI Dataset publicly available from Mendeley Data.The proposed MU-Net model for the semantic segmentation of the lumbar vertebrae gives better performance with 98.79%of pixel accuracy(PA),98.66%of dice similarity coefficient(DSC),97.36%of Jaccard coefficient,and 92.55%mean Intersection over Union(mean IoU)metrics using the mentioned dataset.展开更多
Semantic segmentation of eye images is a complex task with important applications in human–computer interaction,cognitive science,and neuroscience.Achieving real-time,accurate,and robust segmentation algorithms is cr...Semantic segmentation of eye images is a complex task with important applications in human–computer interaction,cognitive science,and neuroscience.Achieving real-time,accurate,and robust segmentation algorithms is crucial for computationally limited portable devices such as augmented reality and virtual reality.With the rapid advancements in deep learning,many network models have been developed specifically for eye image segmentation.Some methods divide the segmentation process into multiple stages to achieve model parameter miniaturization while enhancing output through post processing techniques to improve segmentation accuracy.These approaches significantly increase the inference time.Other networks adopt more complex encoding and decoding modules to achieve end-to-end output,which requires substantial computation.Therefore,balancing the model’s size,accuracy,and computational complexity is essential.To address these challenges,we propose a lightweight asymmetric UNet architecture and a projection loss function.We utilize ResNet-3 layer blocks to enhance feature extraction efficiency in the encoding stage.In the decoding stage,we employ regular convolutions and skip connections to upscale the feature maps from the latent space to the original image size,balancing the model size and segmentation accuracy.In addition,we leverage the geometric features of the eye region and design a projection loss function to further improve the segmentation accuracy without adding any additional inference computational cost.We validate our approach on the OpenEDS2019 dataset for virtual reality and achieve state-of-the-art performance with 95.33%mean intersection over union(mIoU).Our model has only 0.63M parameters and 350 FPS,which are 68%and 200%of the state-of-the-art model RITNet,respectively.展开更多
Few-shot image semantic segmentation aims to achieve pixel-level classification for novel classes using only a few labeled examples.The method first trains the segmentation model on base classes,and then adapts it to ...Few-shot image semantic segmentation aims to achieve pixel-level classification for novel classes using only a few labeled examples.The method first trains the segmentation model on base classes,and then adapts it to novel classes.Although existing methods have achieved remarkable performance in few-shot image semantic segmentation,they still face the following challenges.Traditional methods typically rely on mask average pooling to generate single-category prototype vectors and perform feature matching via metric learning,but they exhibit significant limitations in modeling inter-category relationships and addressing complex background interference.Inspired by the analogy-based transfer mechanisms in cognitive psychology,we propose a Generalized Prototype Network(GPNet)to enhance the model's generalization ability for unseen categories and improve robustness in feature matching.GPNet consists of two key modules.The first is a generalized prototype enhancement module,which explores potential inter-category relationships to construct more discriminative category prototype representations.The second is a multi-scale feature alignment module,which dynamically aligns support and query features across multiple scales using an attention mechanism,thus mitigating background interference in complex scenarios.Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art approaches on several few-shot semantic segmentation benchmarks,validating its effectiveness and generalization capabilities.展开更多
Content-Based Image Retrieval(CBIR)and image mining are becoming more important study fields in computer vision due to their wide range of applications in healthcare,security,and various domains.The image retrieval sy...Content-Based Image Retrieval(CBIR)and image mining are becoming more important study fields in computer vision due to their wide range of applications in healthcare,security,and various domains.The image retrieval system mainly relies on the efficiency and accuracy of the classification models.This research addresses the challenge of enhancing the image retrieval system by developing a novel approach,EfficientNet-Convolutional Neural Network(EffNet-CNN).The key objective of this research is to evaluate the proposed EffNet-CNN model’s performance in image classification,image mining,and CBIR.The novelty of the proposed EffNet-CNN model includes the integration of different techniques and modifications.The model includes the Mahalanobis distance metric for feature matching,which enhances the similarity measurements.The model extends EfficientNet architecture by incorporating additional convolutional layers,batch normalization,dropout,and pooling layers for improved hierarchical feature extraction.A systematic hyperparameter optimization using SGD,performance evaluation with three datasets,and data normalization for improving feature representations.The EffNet-CNN is assessed utilizing precision,accuracy,F-measure,and recall metrics across MS-COCO,CIFAR-10 and 100 datasets.The model achieved accuracy values ranging from 90.60%to 95.90%for the MS-COCO dataset,96.8%to 98.3%for the CIFAR-10 dataset and 92.9%to 98.6%for the CIFAR-100 dataset.A validation of the EffNet-CNN model’s results with other models reveals the proposed model’s superior performance.The results highlight the potential of the EffNet-CNN model proposed for image classification and its usefulness in image mining and CBIR.展开更多
Semantic segmentation of remote sensing images is a critical research area in the field of remote sensing.Despite the success of Convolutional Neural Networks(CNNs),they often fail to capture inter-layer feature relat...Semantic segmentation of remote sensing images is a critical research area in the field of remote sensing.Despite the success of Convolutional Neural Networks(CNNs),they often fail to capture inter-layer feature relationships and fully leverage contextual information,leading to the loss of important details.Additionally,due to significant intraclass variation and small inter-class differences in remote sensing images,CNNs may experience class confusion.To address these issues,we propose a novel Category-Guided Feature Collaborative Learning Network(CG-FCLNet),which enables fine-grained feature extraction and adaptive fusion.Specifically,we design a Feature Collaborative Learning Module(FCLM)to facilitate the tight interaction of multi-scale features.We also introduce a Scale-Aware Fusion Module(SAFM),which iteratively fuses features from different layers using a spatial attention mechanism,enabling deeper feature fusion.Furthermore,we design a Category-Guided Module(CGM)to extract category-aware information that guides feature fusion,ensuring that the fused featuresmore accurately reflect the semantic information of each category,thereby improving detailed segmentation.The experimental results show that CG-FCLNet achieves a Mean Intersection over Union(mIoU)of 83.46%,an mF1 of 90.87%,and an Overall Accuracy(OA)of 91.34% on the Vaihingen dataset.On the Potsdam dataset,it achieves a mIoU of 86.54%,an mF1 of 92.65%,and an OA of 91.29%.These results highlight the superior performance of CG-FCLNet compared to existing state-of-the-art methods.展开更多
Automatic segmentation and recognition of content and element information in color geological map are of great significance for researchers to analyze the distribution of mineral resources and predict disaster informa...Automatic segmentation and recognition of content and element information in color geological map are of great significance for researchers to analyze the distribution of mineral resources and predict disaster information.This article focuses on color planar raster geological map(geological maps include planar geological maps,columnar maps,and profiles).While existing deep learning approaches are often used to segment general images,their performance is limited due to complex elements,diverse regional features,and complicated backgrounds for color geological map in the domain of geoscience.To address the issue,a color geological map segmentation model is proposed that combines the Felz clustering algorithm and an improved SE-UNet deep learning network(named GeoMSeg).Firstly,a symmetrical encoder-decoder structure backbone network based on UNet is constructed,and the channel attention mechanism SENet has been incorporated to augment the network’s capacity for feature representation,enabling the model to purposefully extract map information.The SE-UNet network is employed for feature extraction from the geological map and obtain coarse segmentation results.Secondly,the Felz clustering algorithm is used for super pixel pre-segmentation of geological maps.The coarse segmentation results are refined and modified based on the super pixel pre-segmentation results to obtain the final segmentation results.This study applies GeoMSeg to the constructed dataset,and the experimental results show that the algorithm proposed in this paper has superior performance compared to other mainstream map segmentation models,with an accuracy of 91.89%and a MIoU of 71.91%.展开更多
Real-time semantic segmentation tasks place stringent demands on network inference speed,often requiring a reduction in network depth to decrease computational load.However,shallow networks tend to exhibit degradation...Real-time semantic segmentation tasks place stringent demands on network inference speed,often requiring a reduction in network depth to decrease computational load.However,shallow networks tend to exhibit degradation in feature extraction completeness and inference accuracy.Therefore,balancing high performance with real-time requirements has become a critical issue in the study of real-time semantic segmentation.To address these challenges,this paper proposes a lightweight bilateral dual-residual network.By introducing a novel residual structure combined with feature extraction and fusion modules,the proposed network significantly enhances representational capacity while reducing computational costs.Specifically,an improved compound residual structure is designed to optimize the efficiency of information propagation and feature extraction.Furthermore,the proposed feature extraction and fusion module enables the network to better capture multi-scale information in images,improving the ability to detect both detailed and global semantic features.Experimental results on the publicly available Cityscapes dataset demonstrate that the proposed lightweight dual-branch network achieves outstanding performance while maintaining low computational complexity.In particular,the network achieved a mean Intersection over Union(mIoU)of 78.4%on the Cityscapes validation set,surpassing many existing semantic segmentation models.Additionally,in terms of inference speed,the network reached 74.5 frames per second when tested on an NVIDIA GeForce RTX 3090 GPU,significantly improving real-time performance.展开更多
Due to the necessity for lightweight and efficient network models, deploying semantic segmentation models on mobile robots (MRs) is a formidable task. The fundamental limitation of the problem lies in the training per...Due to the necessity for lightweight and efficient network models, deploying semantic segmentation models on mobile robots (MRs) is a formidable task. The fundamental limitation of the problem lies in the training performance, the ability to effectively exploit the dataset, and the ability to adapt to complex environments when deploying the model. By utilizing the knowledge distillation techniques, the article strives to overcome the above challenges with the inheritance of the advantages of both the teacher model and the student model. More precisely, the ResNet152-PSP-Net model’s characteristics are utilized to train the ResNet18-PSP-Net model. Pyramid pooling blocks are utilized to decode multi-scale feature maps, creating a complete semantic map inference. The student model not only preserves the strong segmentation performance from the teacher model but also improves the inference speed of the prediction results. The proposed method exhibits a clear advantage over conventional convolutional neural network (CNN) models, as evident from the conducted experiments. Furthermore, the proposed model also shows remarkable improvement in processing speed when compared with light-weight models such as MobileNetV2 and EfficientNet based on latency and throughput parameters. The proposed KD-SegNet model obtains an accuracy of 96.3% and a mIoU (mean Intersection over Union) of 77%, outperforming the performance of existing models by more than 15% on the same training dataset. The suggested method has an average training time that is only 0.51 times less than same field models, while still achieving comparable segmentation performance. Hence, the semantic segmentation frames are collected, forming the motion trajectory for the system in the environment. Overall, this architecture shows great promise for the development of knowledge-based systems for MR’s navigation.展开更多
The key to the success of few-shot semantic segmentation(FSS)depends on the efficient use of limited annotated support set to accurately segment novel classes in the query set.Due to the few samples in the support set...The key to the success of few-shot semantic segmentation(FSS)depends on the efficient use of limited annotated support set to accurately segment novel classes in the query set.Due to the few samples in the support set,FSS faces challenges such as intra-class differences,background(BG)mismatches between query and support sets,and ambiguous segmentation between the foreground(FG)and BG in the query set.To address these issues,The paper propose a multi-module network called CAMSNet,which includes four modules:the General Information Module(GIM),the Class Activation Map Aggregation(CAMA)module,the Self-Cross Attention(SCA)Block,and the Feature Fusion Module(FFM).In CAMSNet,The GIM employs an improved triplet loss,which concatenates word embedding vectors and support prototypes as anchors,and uses local support features of FG and BG as positive and negative samples to help solve the problem of intra-class differences.Then for the first time,the Class Activation Map(CAM)from the Weakly Supervised Semantic Segmentation(WSSS)is applied to FSS within the CAMA module.This method replaces the traditional use of cosine similarity to locate query information.Subsequently,the SCA Block processes the support and query features aggregated by the CAMA module,significantly enhancing the understanding of input information,leading to more accurate predictions and effectively addressing BG mismatch and ambiguous FG-BG segmentation.Finally,The FFM combines general class information with the enhanced query information to achieve accurate segmentation of the query image.Extensive Experiments on PASCAL and COCO demonstrate that-5i-20ithe CAMSNet yields superior performance and set a state-of-the-art.展开更多
Semantic segmentation in street scenes is a crucial technology for autonomous driving to analyze the surrounding environment.In street scenes,issues such as high image resolution caused by a large viewpoints and diffe...Semantic segmentation in street scenes is a crucial technology for autonomous driving to analyze the surrounding environment.In street scenes,issues such as high image resolution caused by a large viewpoints and differences in object scales lead to a decline in real-time performance and difficulties in multi-scale feature extraction.To address this,we propose a bilateral-branch real-time semantic segmentationmethod based on semantic information distillation(BSDNet)for street scene images.The BSDNet consists of a Feature Conversion Convolutional Block(FCB),a Semantic Information Distillation Module(SIDM),and a Deep Aggregation Atrous Convolution Pyramid Pooling(DASP).FCB reduces the semantic gap between the backbone and the semantic branch.SIDM extracts high-quality semantic information fromthe Transformer branch to reduce computational costs.DASP aggregates information lost in atrous convolutions,effectively capturingmulti-scale objects.Extensive experiments conducted on Cityscapes,CamVid,and ADE20K,achieving an accuracy of 81.7% Mean Intersection over Union(mIoU)at 70.6 Frames Per Second(FPS)on Cityscapes,demonstrate that our method achieves a better balance between accuracy and inference speed.展开更多
Neuronal soma segmentation plays a crucial role in neuroscience applications.However,the fine structure,such as boundaries,small-volume neuronal somata and fibers,are commonly present in cell images,which pose a chall...Neuronal soma segmentation plays a crucial role in neuroscience applications.However,the fine structure,such as boundaries,small-volume neuronal somata and fibers,are commonly present in cell images,which pose a challenge for accurate segmentation.In this paper,we propose a 3D semantic segmentation network for neuronal soma segmentation to address this issue.Using an encoding-decoding structure,we introduce a Multi-Scale feature extraction and Adaptive Weighting fusion module(MSAW)after each encoding block.The MSAW module can not only emphasize the fine structures via an upsampling strategy,but also provide pixel-wise weights to measure the importance of the multi-scale features.Additionally,a dynamic convolution instead of normal convolution is employed to better adapt the network to input data with different distributions.The proposed MSAW-based semantic segmentation network(MSAW-Net)was evaluated on three neuronal soma images from mouse brain and one neuronal soma image from macaque brain,demonstrating the efficiency of the proposed method.It achieved an F1 score of 91.8%on Fezf2-2A-CreER dataset,97.1%on LSL-H2B-GFP dataset,82.8%on Thy1-EGFP-Mline dataset,and 86.9%on macaque dataset,achieving improvements over the 3D U-Net model by 3.1%,3.3%,3.9%,and 2.3%,respectively.展开更多
Due to self-occlusion and high degree of freedom,estimating 3D hand pose from a single RGB image is a great challenging problem.Graph convolutional networks(GCNs)use graphs to describe the physical connection relation...Due to self-occlusion and high degree of freedom,estimating 3D hand pose from a single RGB image is a great challenging problem.Graph convolutional networks(GCNs)use graphs to describe the physical connection relationships between hand joints and improve the accuracy of 3D hand pose regression.However,GCNs cannot effectively describe the relationships between non-adjacent hand joints.Recently,hypergraph convolutional networks(HGCNs)have received much attention as they can describe multi-dimensional relationships between nodes through hyperedges;therefore,this paper proposes a framework for 3D hand pose estimation based on HGCN,which can better extract correlated relationships between adjacent and non-adjacent hand joints.To overcome the shortcomings of predefined hypergraph structures,a kind of dynamic hypergraph convolutional network is proposed,in which hyperedges are constructed dynamically based on hand joint feature similarity.To better explore the local semantic relationships between nodes,a kind of semantic dynamic hypergraph convolution is proposed.The proposed method is evaluated on publicly available benchmark datasets.Qualitative and quantitative experimental results both show that the proposed HGCN and improved methods for 3D hand pose estimation are better than GCN,and achieve state-of-the-art performance compared with existing methods.展开更多
This paper presents an intelligent patrol and security robot integrating 2D LiDAR and RGB-D vision sensors to achieve semantic simultaneous localization and mapping(SLAM),real-time object recognition,and dynamic obsta...This paper presents an intelligent patrol and security robot integrating 2D LiDAR and RGB-D vision sensors to achieve semantic simultaneous localization and mapping(SLAM),real-time object recognition,and dynamic obstacle avoidance.The system employs the YOLOv7 deep-learning framework for semantic detection and SLAM for localization and mapping,fusing geometric and visual data to build a high-fidelity 2D semantic map.This map enables the robot to identify and project object information for improved situational awareness.Experimental results show that object recognition reached 95.4%mAP@0.5.Semantic completeness increased from 68.7%(single view)to 94.1%(multi-view)with an average position error of 3.1 cm.During navigation,the robot achieved 98.0%reliability,avoided moving obstacles in 90.0%of encounters,and replanned paths in 0.42 s on average.The integration of LiDAR-based SLAMwith deep-learning–driven semantic perception establishes a robust foundation for intelligent,adaptive,and safe robotic navigation in dynamic environments.展开更多
基金The National Natural Science Foundation of China(No.60275016).
文摘Global semantic structures of two large semantic networks, HowNet and WordNet, are analyzed. It is found that they are both complex networks with features of small-world and scale-free, but with special properties. Exponents of power law degree distribution of these two networks are between 1.0 and 2. 0, different from most scale-free networks which have exponents near 3.0. Coefficients of degree correlation are lower than 0, similar to biological networks. The BA (Barabasi-Albert) model and other similar models cannot explain their dynamics. Relations between clustering coefficient and node degree obey scaling law, which suggests that there exist self-similar hierarchical structures in networks. The results suggest that structures of semantic networks are influenced by the ways we learn semantic knowledge such as aggregation and metaphor.
基金This Project is supported by National Natural Science Foundation of China(No.59875058).
文摘The presentation method of the mechanical motion scheme must support thewhole process of conceptual design. To meet the requirement, a semantic network method is selectedto represent process level, action level, mechanism level and relationships among them. Computeraided motion cycle chart exploration can be realized by the representation and revision of timecoordination of mechanism actions and their effect on the design scheme. The uncertain reasoningtechnology based on semantic network is applied in the mechanism types selection of the needledriving mechanism of industrial sewing mechanism, and the application indicated it is correct,useful and advance.
文摘Abstract: It was discussed that the way to reflect the internal relations between judgment and identification, the two most fundamental ways of thinking or cognition operations, during the course of the semantic network knowledge representation processing. A new extended Petri net is defined based on qualitative mapping, which strengths the expressive ability of the feature of thinking and the mode of action of brain. A model of semantic network knowledge representation based on new Petri net is given. Semantic network knowledge has a more efficient representation and reasoning mechanism. This model not only can reflect the characteristics of associative memory in semantic network knowledge representation, but also can use Petri net to express the criterion changes and its change law of recognition judgment, especially the cognitive operation of thinking based on extraction and integration of sensory characteristics to well express the thinking transition course from quantitative change to qualitative change of human cognition.
文摘Based on the definition of component ontology, an effective component classification mechanism and a facet named component relationship are proposed. Then an application domain oriented, hierarchical component organization model is established. At last a hierarchical component semantic network (HCSN) described by ontology interchange language(OIL) is presented and then its function is described. Using HCSN and cooperating with other components retrieving algorithms based on component description, other components information and their assembly or composite modes related to the key component can be found. Based on HCSN, component directory library is catalogued and a prototype system is constructed. The prototype system proves that component library organization based on this model gives guarantee to the reliability of component assembly during program mining.
基金supported by National Natural Science Foundation(NNSF)of China under Grant 41927801.
文摘Geothermal resources are efficient,renewable and clean energy sources,and their reservoirs are usually closely associated with high-temperature regions of the land surface.Current exploration methods primarily involve migrating traditional geological techniques,which fail to fully use the unique features of geothermal radiation characteristics.Thermal infrared remote-sensing imaging technology can capture and present areas with distinctive surface thermal radiation features,providing considerable significance as a guide for localization prior tofield exploration.In this study,we propose a deep learningebased method for intelligently identifying and segmenting geothermal radiation sources from thermal infrared remote-sensing images,including data preparation and model training.To improve the localization drift and anomalous interference caused by the high complexity of the Earth's surface environment,this study uses a surface temperature retrieval algorithm to calculate the land surface temperature in the research area.The retrieval results are used to train the semantic segmentation model.In addition,a pixel-level geothermal spatial segmentation network(PGSSNet)is proposed to suppress the diffuse thermal radiation and reduce the broad and blurred white areas of images to exact locations.Once the training is completed,the model directly segments and extracts the actual range of thermal radiation sources from subsequent thermal infrared remote-sensing images without temperature retrieval and/or manual calibration.
基金provided by the Science Research Project of Hebei Education Department under grant No.BJK2024115.
文摘High-resolution remote sensing images(HRSIs)are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies.However,their significant scale changes and wealth of spatial details pose challenges for semantic segmentation.While convolutional neural networks(CNNs)excel at capturing local features,they are limited in modeling long-range dependencies.Conversely,transformers utilize multihead self-attention to integrate global context effectively,but this approach often incurs a high computational cost.This paper proposes a global-local multiscale context network(GLMCNet)to extract both global and local multiscale contextual information from HRSIs.A detail-enhanced filtering module(DEFM)is proposed at the end of the encoder to refine the encoder outputs further,thereby enhancing the key details extracted by the encoder and effectively suppressing redundant information.In addition,a global-local multiscale transformer block(GLMTB)is proposed in the decoding stage to enable the modeling of rich multiscale global and local information.We also design a stair fusion mechanism to transmit deep semantic information from deep to shallow layers progressively.Finally,we propose the semantic awareness enhancement module(SAEM),which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention.Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method.Specifically,our method achieved a mean Intersection over Union(mIoU)of 86.89%on the ISPRS Potsdam dataset and 84.34%on the ISPRS Vaihingen dataset,outperforming existing models such as ABCNet and BANet.
基金funded by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China(grant number 22KJD440001)Changzhou Science&Technology Program(grant number CJ20220232).
文摘The Internet of Vehicles (IoV) has become an important direction in the field of intelligent transportation, in which vehicle positioning is a crucial part. SLAM (Simultaneous Localization and Mapping) technology plays a crucial role in vehicle localization and navigation. Traditional Simultaneous Localization and Mapping (SLAM) systems are designed for use in static environments, and they can result in poor performance in terms of accuracy and robustness when used in dynamic environments where objects are in constant movement. To address this issue, a new real-time visual SLAM system called MG-SLAM has been developed. Based on ORB-SLAM2, MG-SLAM incorporates a dynamic target detection process that enables the detection of both known and unknown moving objects. In this process, a separate semantic segmentation thread is required to segment dynamic target instances, and the Mask R-CNN algorithm is applied on the Graphics Processing Unit (GPU) to accelerate segmentation. To reduce computational cost, only key frames are segmented to identify known dynamic objects. Additionally, a multi-view geometry method is adopted to detect unknown moving objects. The results demonstrate that MG-SLAM achieves higher precision, with an improvement from 0.2730 m to 0.0135 m in precision. Moreover, the processing time required by MG-SLAM is significantly reduced compared to other dynamic scene SLAM algorithms, which illustrates its efficacy in locating objects in dynamic scenes.
文摘Lower back pain is one of the most common medical problems in the world and it is experienced by a huge percentage of people everywhere.Due to its ability to produce a detailed view of the soft tissues,including the spinal cord,nerves,intervertebral discs,and vertebrae,Magnetic Resonance Imaging is thought to be the most effective method for imaging the spine.The semantic segmentation of vertebrae plays a major role in the diagnostic process of lumbar diseases.It is difficult to semantically partition the vertebrae in Magnetic Resonance Images from the surrounding variety of tissues,including muscles,ligaments,and intervertebral discs.U-Net is a powerful deep-learning architecture to handle the challenges of medical image analysis tasks and achieves high segmentation accuracy.This work proposes a modified U-Net architecture namely MU-Net,consisting of the Meijering convolutional layer that incorporates the Meijering filter to perform the semantic segmentation of lumbar vertebrae L1 to L5 and sacral vertebra S1.Pseudo-colour mask images were generated and used as ground truth for training the model.The work has been carried out on 1312 images expanded from T1-weighted mid-sagittal MRI images of 515 patients in the Lumbar Spine MRI Dataset publicly available from Mendeley Data.The proposed MU-Net model for the semantic segmentation of the lumbar vertebrae gives better performance with 98.79%of pixel accuracy(PA),98.66%of dice similarity coefficient(DSC),97.36%of Jaccard coefficient,and 92.55%mean Intersection over Union(mean IoU)metrics using the mentioned dataset.
基金supported by the HFIPS Director’s Foundation(YZJJ202207-TS),the National Natural Science Foundation of China(82371931)the Natural Science Foundation of Anhui Province(2008085MC69)+3 种基金the Natural Science Foundation of Hefei City(2021033)the General Scientific Research Project of Anhui Provincial Health Commission(AHWJ2021b150)the Collaborative Innovation Program of Hefei Science Center,CAS(2021HSC-CIP013)the Anhui Province Key Research and Development Project(202204295107020004).
文摘Semantic segmentation of eye images is a complex task with important applications in human–computer interaction,cognitive science,and neuroscience.Achieving real-time,accurate,and robust segmentation algorithms is crucial for computationally limited portable devices such as augmented reality and virtual reality.With the rapid advancements in deep learning,many network models have been developed specifically for eye image segmentation.Some methods divide the segmentation process into multiple stages to achieve model parameter miniaturization while enhancing output through post processing techniques to improve segmentation accuracy.These approaches significantly increase the inference time.Other networks adopt more complex encoding and decoding modules to achieve end-to-end output,which requires substantial computation.Therefore,balancing the model’s size,accuracy,and computational complexity is essential.To address these challenges,we propose a lightweight asymmetric UNet architecture and a projection loss function.We utilize ResNet-3 layer blocks to enhance feature extraction efficiency in the encoding stage.In the decoding stage,we employ regular convolutions and skip connections to upscale the feature maps from the latent space to the original image size,balancing the model size and segmentation accuracy.In addition,we leverage the geometric features of the eye region and design a projection loss function to further improve the segmentation accuracy without adding any additional inference computational cost.We validate our approach on the OpenEDS2019 dataset for virtual reality and achieve state-of-the-art performance with 95.33%mean intersection over union(mIoU).Our model has only 0.63M parameters and 350 FPS,which are 68%and 200%of the state-of-the-art model RITNet,respectively.
文摘Few-shot image semantic segmentation aims to achieve pixel-level classification for novel classes using only a few labeled examples.The method first trains the segmentation model on base classes,and then adapts it to novel classes.Although existing methods have achieved remarkable performance in few-shot image semantic segmentation,they still face the following challenges.Traditional methods typically rely on mask average pooling to generate single-category prototype vectors and perform feature matching via metric learning,but they exhibit significant limitations in modeling inter-category relationships and addressing complex background interference.Inspired by the analogy-based transfer mechanisms in cognitive psychology,we propose a Generalized Prototype Network(GPNet)to enhance the model's generalization ability for unseen categories and improve robustness in feature matching.GPNet consists of two key modules.The first is a generalized prototype enhancement module,which explores potential inter-category relationships to construct more discriminative category prototype representations.The second is a multi-scale feature alignment module,which dynamically aligns support and query features across multiple scales using an attention mechanism,thus mitigating background interference in complex scenarios.Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art approaches on several few-shot semantic segmentation benchmarks,validating its effectiveness and generalization capabilities.
基金The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University,Kingdom of Saudi Arabia,for funding this work through the Small Research Group Project under Grant Number RGP.1/316/45.
文摘Content-Based Image Retrieval(CBIR)and image mining are becoming more important study fields in computer vision due to their wide range of applications in healthcare,security,and various domains.The image retrieval system mainly relies on the efficiency and accuracy of the classification models.This research addresses the challenge of enhancing the image retrieval system by developing a novel approach,EfficientNet-Convolutional Neural Network(EffNet-CNN).The key objective of this research is to evaluate the proposed EffNet-CNN model’s performance in image classification,image mining,and CBIR.The novelty of the proposed EffNet-CNN model includes the integration of different techniques and modifications.The model includes the Mahalanobis distance metric for feature matching,which enhances the similarity measurements.The model extends EfficientNet architecture by incorporating additional convolutional layers,batch normalization,dropout,and pooling layers for improved hierarchical feature extraction.A systematic hyperparameter optimization using SGD,performance evaluation with three datasets,and data normalization for improving feature representations.The EffNet-CNN is assessed utilizing precision,accuracy,F-measure,and recall metrics across MS-COCO,CIFAR-10 and 100 datasets.The model achieved accuracy values ranging from 90.60%to 95.90%for the MS-COCO dataset,96.8%to 98.3%for the CIFAR-10 dataset and 92.9%to 98.6%for the CIFAR-100 dataset.A validation of the EffNet-CNN model’s results with other models reveals the proposed model’s superior performance.The results highlight the potential of the EffNet-CNN model proposed for image classification and its usefulness in image mining and CBIR.
基金funded by National Natural Science Foundation of China(61603245).
文摘Semantic segmentation of remote sensing images is a critical research area in the field of remote sensing.Despite the success of Convolutional Neural Networks(CNNs),they often fail to capture inter-layer feature relationships and fully leverage contextual information,leading to the loss of important details.Additionally,due to significant intraclass variation and small inter-class differences in remote sensing images,CNNs may experience class confusion.To address these issues,we propose a novel Category-Guided Feature Collaborative Learning Network(CG-FCLNet),which enables fine-grained feature extraction and adaptive fusion.Specifically,we design a Feature Collaborative Learning Module(FCLM)to facilitate the tight interaction of multi-scale features.We also introduce a Scale-Aware Fusion Module(SAFM),which iteratively fuses features from different layers using a spatial attention mechanism,enabling deeper feature fusion.Furthermore,we design a Category-Guided Module(CGM)to extract category-aware information that guides feature fusion,ensuring that the fused featuresmore accurately reflect the semantic information of each category,thereby improving detailed segmentation.The experimental results show that CG-FCLNet achieves a Mean Intersection over Union(mIoU)of 83.46%,an mF1 of 90.87%,and an Overall Accuracy(OA)of 91.34% on the Vaihingen dataset.On the Potsdam dataset,it achieves a mIoU of 86.54%,an mF1 of 92.65%,and an OA of 91.29%.These results highlight the superior performance of CG-FCLNet compared to existing state-of-the-art methods.
基金financially supported by the Natural Science Foundation of China(42301492)the Open Fund of Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering(2022SDSJ04,2024SDSJ03)+1 种基金the Opening Fund of Key Laboratory of Geological Survey and Evaluation of Ministry of Education(GLAB 2023ZR01,GLAB2024ZR08)the Fundamental Research Funds for the Central Universities.
文摘Automatic segmentation and recognition of content and element information in color geological map are of great significance for researchers to analyze the distribution of mineral resources and predict disaster information.This article focuses on color planar raster geological map(geological maps include planar geological maps,columnar maps,and profiles).While existing deep learning approaches are often used to segment general images,their performance is limited due to complex elements,diverse regional features,and complicated backgrounds for color geological map in the domain of geoscience.To address the issue,a color geological map segmentation model is proposed that combines the Felz clustering algorithm and an improved SE-UNet deep learning network(named GeoMSeg).Firstly,a symmetrical encoder-decoder structure backbone network based on UNet is constructed,and the channel attention mechanism SENet has been incorporated to augment the network’s capacity for feature representation,enabling the model to purposefully extract map information.The SE-UNet network is employed for feature extraction from the geological map and obtain coarse segmentation results.Secondly,the Felz clustering algorithm is used for super pixel pre-segmentation of geological maps.The coarse segmentation results are refined and modified based on the super pixel pre-segmentation results to obtain the final segmentation results.This study applies GeoMSeg to the constructed dataset,and the experimental results show that the algorithm proposed in this paper has superior performance compared to other mainstream map segmentation models,with an accuracy of 91.89%and a MIoU of 71.91%.
文摘Real-time semantic segmentation tasks place stringent demands on network inference speed,often requiring a reduction in network depth to decrease computational load.However,shallow networks tend to exhibit degradation in feature extraction completeness and inference accuracy.Therefore,balancing high performance with real-time requirements has become a critical issue in the study of real-time semantic segmentation.To address these challenges,this paper proposes a lightweight bilateral dual-residual network.By introducing a novel residual structure combined with feature extraction and fusion modules,the proposed network significantly enhances representational capacity while reducing computational costs.Specifically,an improved compound residual structure is designed to optimize the efficiency of information propagation and feature extraction.Furthermore,the proposed feature extraction and fusion module enables the network to better capture multi-scale information in images,improving the ability to detect both detailed and global semantic features.Experimental results on the publicly available Cityscapes dataset demonstrate that the proposed lightweight dual-branch network achieves outstanding performance while maintaining low computational complexity.In particular,the network achieved a mean Intersection over Union(mIoU)of 78.4%on the Cityscapes validation set,surpassing many existing semantic segmentation models.Additionally,in terms of inference speed,the network reached 74.5 frames per second when tested on an NVIDIA GeForce RTX 3090 GPU,significantly improving real-time performance.
基金funded by Hanoi University of Science and Technology(HUST)under project number T2023-PC-008.
文摘Due to the necessity for lightweight and efficient network models, deploying semantic segmentation models on mobile robots (MRs) is a formidable task. The fundamental limitation of the problem lies in the training performance, the ability to effectively exploit the dataset, and the ability to adapt to complex environments when deploying the model. By utilizing the knowledge distillation techniques, the article strives to overcome the above challenges with the inheritance of the advantages of both the teacher model and the student model. More precisely, the ResNet152-PSP-Net model’s characteristics are utilized to train the ResNet18-PSP-Net model. Pyramid pooling blocks are utilized to decode multi-scale feature maps, creating a complete semantic map inference. The student model not only preserves the strong segmentation performance from the teacher model but also improves the inference speed of the prediction results. The proposed method exhibits a clear advantage over conventional convolutional neural network (CNN) models, as evident from the conducted experiments. Furthermore, the proposed model also shows remarkable improvement in processing speed when compared with light-weight models such as MobileNetV2 and EfficientNet based on latency and throughput parameters. The proposed KD-SegNet model obtains an accuracy of 96.3% and a mIoU (mean Intersection over Union) of 77%, outperforming the performance of existing models by more than 15% on the same training dataset. The suggested method has an average training time that is only 0.51 times less than same field models, while still achieving comparable segmentation performance. Hence, the semantic segmentation frames are collected, forming the motion trajectory for the system in the environment. Overall, this architecture shows great promise for the development of knowledge-based systems for MR’s navigation.
基金supported by funding from the following sources:National Natural Science Foundation of China(U1904119)Research Programs of Henan Science and Technology Department(232102210033,232102210054)+3 种基金Chongqing Natural Science Foundation(CSTB2023NSCQ-MSX0070)Henan Province Key Research and Development Project(231111212000)Aviation Science Foundation(20230001055002)supported by Henan Center for Outstanding Overseas Scientists(GZS2022011).
文摘The key to the success of few-shot semantic segmentation(FSS)depends on the efficient use of limited annotated support set to accurately segment novel classes in the query set.Due to the few samples in the support set,FSS faces challenges such as intra-class differences,background(BG)mismatches between query and support sets,and ambiguous segmentation between the foreground(FG)and BG in the query set.To address these issues,The paper propose a multi-module network called CAMSNet,which includes four modules:the General Information Module(GIM),the Class Activation Map Aggregation(CAMA)module,the Self-Cross Attention(SCA)Block,and the Feature Fusion Module(FFM).In CAMSNet,The GIM employs an improved triplet loss,which concatenates word embedding vectors and support prototypes as anchors,and uses local support features of FG and BG as positive and negative samples to help solve the problem of intra-class differences.Then for the first time,the Class Activation Map(CAM)from the Weakly Supervised Semantic Segmentation(WSSS)is applied to FSS within the CAMA module.This method replaces the traditional use of cosine similarity to locate query information.Subsequently,the SCA Block processes the support and query features aggregated by the CAMA module,significantly enhancing the understanding of input information,leading to more accurate predictions and effectively addressing BG mismatch and ambiguous FG-BG segmentation.Finally,The FFM combines general class information with the enhanced query information to achieve accurate segmentation of the query image.Extensive Experiments on PASCAL and COCO demonstrate that-5i-20ithe CAMSNet yields superior performance and set a state-of-the-art.
基金supported in part by the National Natural Science Foundation of China[Grant number 62471075]the Major Science and Technology Project Grant of the Chongqing Municipal Education Commission[Grant number KJZD-M202301901]Graduate Innovation Fund of Chongqing[gzlcx20253235].
文摘Semantic segmentation in street scenes is a crucial technology for autonomous driving to analyze the surrounding environment.In street scenes,issues such as high image resolution caused by a large viewpoints and differences in object scales lead to a decline in real-time performance and difficulties in multi-scale feature extraction.To address this,we propose a bilateral-branch real-time semantic segmentationmethod based on semantic information distillation(BSDNet)for street scene images.The BSDNet consists of a Feature Conversion Convolutional Block(FCB),a Semantic Information Distillation Module(SIDM),and a Deep Aggregation Atrous Convolution Pyramid Pooling(DASP).FCB reduces the semantic gap between the backbone and the semantic branch.SIDM extracts high-quality semantic information fromthe Transformer branch to reduce computational costs.DASP aggregates information lost in atrous convolutions,effectively capturingmulti-scale objects.Extensive experiments conducted on Cityscapes,CamVid,and ADE20K,achieving an accuracy of 81.7% Mean Intersection over Union(mIoU)at 70.6 Frames Per Second(FPS)on Cityscapes,demonstrate that our method achieves a better balance between accuracy and inference speed.
基金supported by the STI2030-Major-Projects(No.2021ZD0200104)the National Natural Science Foundations of China under Grant 61771437.
文摘Neuronal soma segmentation plays a crucial role in neuroscience applications.However,the fine structure,such as boundaries,small-volume neuronal somata and fibers,are commonly present in cell images,which pose a challenge for accurate segmentation.In this paper,we propose a 3D semantic segmentation network for neuronal soma segmentation to address this issue.Using an encoding-decoding structure,we introduce a Multi-Scale feature extraction and Adaptive Weighting fusion module(MSAW)after each encoding block.The MSAW module can not only emphasize the fine structures via an upsampling strategy,but also provide pixel-wise weights to measure the importance of the multi-scale features.Additionally,a dynamic convolution instead of normal convolution is employed to better adapt the network to input data with different distributions.The proposed MSAW-based semantic segmentation network(MSAW-Net)was evaluated on three neuronal soma images from mouse brain and one neuronal soma image from macaque brain,demonstrating the efficiency of the proposed method.It achieved an F1 score of 91.8%on Fezf2-2A-CreER dataset,97.1%on LSL-H2B-GFP dataset,82.8%on Thy1-EGFP-Mline dataset,and 86.9%on macaque dataset,achieving improvements over the 3D U-Net model by 3.1%,3.3%,3.9%,and 2.3%,respectively.
基金the National Key Research and Development Program of China(No.2021ZD0111902)the National Natural Science Foundation of China(Nos.62172022 and U21B2038)。
文摘Due to self-occlusion and high degree of freedom,estimating 3D hand pose from a single RGB image is a great challenging problem.Graph convolutional networks(GCNs)use graphs to describe the physical connection relationships between hand joints and improve the accuracy of 3D hand pose regression.However,GCNs cannot effectively describe the relationships between non-adjacent hand joints.Recently,hypergraph convolutional networks(HGCNs)have received much attention as they can describe multi-dimensional relationships between nodes through hyperedges;therefore,this paper proposes a framework for 3D hand pose estimation based on HGCN,which can better extract correlated relationships between adjacent and non-adjacent hand joints.To overcome the shortcomings of predefined hypergraph structures,a kind of dynamic hypergraph convolutional network is proposed,in which hyperedges are constructed dynamically based on hand joint feature similarity.To better explore the local semantic relationships between nodes,a kind of semantic dynamic hypergraph convolution is proposed.The proposed method is evaluated on publicly available benchmark datasets.Qualitative and quantitative experimental results both show that the proposed HGCN and improved methods for 3D hand pose estimation are better than GCN,and achieve state-of-the-art performance compared with existing methods.
基金supported by the National Science and Technology Council of under Grant NSTC 114-2221-E-130-007.
文摘This paper presents an intelligent patrol and security robot integrating 2D LiDAR and RGB-D vision sensors to achieve semantic simultaneous localization and mapping(SLAM),real-time object recognition,and dynamic obstacle avoidance.The system employs the YOLOv7 deep-learning framework for semantic detection and SLAM for localization and mapping,fusing geometric and visual data to build a high-fidelity 2D semantic map.This map enables the robot to identify and project object information for improved situational awareness.Experimental results show that object recognition reached 95.4%mAP@0.5.Semantic completeness increased from 68.7%(single view)to 94.1%(multi-view)with an average position error of 3.1 cm.During navigation,the robot achieved 98.0%reliability,avoided moving obstacles in 90.0%of encounters,and replanned paths in 0.42 s on average.The integration of LiDAR-based SLAMwith deep-learning–driven semantic perception establishes a robust foundation for intelligent,adaptive,and safe robotic navigation in dynamic environments.