In the present paper, we give a systematic study of the discrete correspondence theory and topological correspondence theory of modal meet-implication logic and modal meet-semilattice logic, in the semantics provided in [21]. The special features of the present paper include the following three points: first, the semantic structure used is based on a semilattice rather than an ordinary partial order; second, the propositional variables are interpreted as filters rather than upsets, and the nominals, the "first-order counterparts" of propositional variables, are interpreted as principal filters rather than principal upsets; third, in topological correspondence theory, the collection of admissible valuations is not closed under taking disjunction, which makes the proof of the topological Ackermann lemma different from existing settings.
This conceptual study proposes a pedagogical framework that integrates Generative Artificial Intelligence tools (AIGC) and Chain-of-Thought (CoT) reasoning, grounded in the cognitive apprenticeship model, for the Pragmatics and Translation course within Master of Translation and Interpreting (MTI) programs. A key feature involves CoT reasoning exercises, which require students to articulate their step-by-step translation reasoning. This explicates cognitive processes and enhances pragmatic awareness, translation strategy development, and critical reflection on linguistic choices and context. Hypothetical activities exemplify its application, including comparative analysis of AI and human translations to examine pragmatic nuances, and guided exercises where students analyze or critique the reasoning traces generated by Large Language Models (LLMs). Ethically grounded, the framework positions AI as a supportive tool, thereby ensuring human translators retain the central decision-making role and promoting critical evaluation of machine-generated suggestions. Potential challenges, such as AI biases, ethical concerns, and overreliance, are addressed through strategies including bias-awareness discussions, rigorous accuracy verification, and a strong emphasis on human accountability. Future research will involve piloting the framework to empirically evaluate its impact on learners' pragmatic competence and translation skills, followed by iterative refinements to advance evidence-based translation pedagogy.
V. Translatology and Pragmatics. Pragmatics is the study of language usage. It is a study of those relations between language and context that are grammaticalized, or encoded in the structure of a language. It includes the study of deixis, presupposition and speech acts. It is to be noted that there is also a close relation between translatology and pragmatics. The following instances afford an illustration of this point.
The close relationship between semantics and pragmatics has made it difficult to set a clear boundary between them. Discussion of the relationship can date back to Morris, whose semiotic trichotomy was taken up by Carnap. Leech, Levinson, Bach and Huang also tried to make a distinction between the two, but no consensus has been reached.
This article is a tentative study of the relationship between semantics and pragmatics through a detailed analysis of Leech's classical questions about the meaning of X. It distinguishes the different aspects of meaning that the two branches of linguistics respectively focus upon. Both semantics and pragmatics are studies of the meanings of language. Semantics studies meaning within the system of language, while pragmatics studies meaning with the speaker involved, that is, from the social angle. Pragmatics is based on the knowledge of semantics. Given that neither semantics nor pragmatics alone can solve the mystery of the meaning of language, it may not be wise to draw a clear cut between them.
Remote sensing data plays an important role in natural disaster management. However, with the increase in the variety and quantity of remote sensors, the problem of "knowledge barriers" arises when data users in the disaster field retrieve remote sensing data. To address this problem, this paper proposes an ontology- and rule-based retrieval (ORR) method for disaster remote sensing data. The method introduces ontology technology to express earthquake disaster and remote sensing knowledge and, on this basis, realizes task-suitability reasoning over earthquake disaster remote sensing data, mining the semantic relationships between remote sensing metadata and disasters. A prototype system was built according to the ORR method and compared with the traditional method; using the ORR method to retrieve disaster remote sensing data reduces the knowledge requirements placed on data users during retrieval and improves data retrieval efficiency.
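The ontology-plus-rule idea above can be pictured with a toy sketch. Everything here — the sensor names, property names, and the suitability rule with its resolution threshold — is invented for illustration; the actual ORR method reasons over a full earthquake-disaster ontology rather than a flat set of triples.

```python
# Knowledge as (subject, predicate, object) triples; all names are made up.
facts = {
    ("Sentinel-2", "hasResolution_m", 10),
    ("GF-3", "hasResolution_m", 3),
    ("Sentinel-2", "observes", "SurfaceDamage"),
    ("GF-3", "observes", "SurfaceDamage"),
    ("Earthquake", "requiresObservation", "SurfaceDamage"),
}

def suitable_sensors(disaster, max_resolution_m):
    """Rule: a sensor suits a disaster task if it observes a phenomenon
    the disaster requires and its spatial resolution is fine enough."""
    needed = {o for s, p, o in facts
              if s == disaster and p == "requiresObservation"}
    return sorted(
        s for s, p, o in facts
        if p == "observes" and o in needed
        and any(s2 == s and p2 == "hasResolution_m" and v <= max_resolution_m
                for s2, p2, v in facts)
    )

print(suitable_sensors("Earthquake", max_resolution_m=5))  # ['GF-3']
```

Loosening the threshold admits coarser sensors as well, which is the kind of task-suitability inference the abstract describes.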
The advent of self-attention mechanisms within Transformer models has significantly propelled the advancement of deep learning algorithms, yielding outstanding achievements across diverse domains. Nonetheless, self-attention mechanisms falter when applied to datasets with intricate semantic content and extensive dependency structures. In response, this paper introduces a Diffusion Sampling and Label-Driven Co-attention Neural Network (DSLD), which adopts a diffusion sampling method to capture more comprehensive semantic information from the data. Additionally, the model leverages the joint correlation information of labels and data to inform the computation of text representations, correcting semantic representation biases in the data and increasing the accuracy of the semantic representation. Ultimately, the model computes the corresponding classification results by synthesizing these rich semantic representations. Experiments on seven benchmark datasets show that the proposed model achieves competitive results compared to state-of-the-art methods.
As a key node of the modern transportation network, the informationized management of road tunnels is crucial to ensuring operational safety and traffic efficiency. However, existing tunnel vehicle modeling methods generally suffer from insufficient 3D scene description capability and low dynamic update efficiency, making it difficult to meet the demand for real-time, accurate management. To this end, this paper proposes a vehicle twin modeling method for road tunnels. Starting from actual management needs, the approach supports multi-level dynamic modeling covering vehicle type, size, and color by constructing a vehicle model library that can be flexibly invoked; at the same time, semantic constraint rules covering geometric layout, behavioral attributes, and spatial relationships are designed to ensure that the virtual model matches the real one with a high degree of similarity. Finally, a prototype system is constructed and case experiments are conducted in selected case areas, combining real-time monitoring data with the semantic constraints to realize dynamic updating and three-dimensional visualization of vehicle states in tunnels. The experiments show that the proposed method runs smoothly with an average rendering efficiency of 17.70 ms while guaranteeing modeling accuracy (composite similarity of 0.867), significantly improving the real-time performance and intuitiveness of tunnel management. The results provide reliable technical support for intelligent operation and emergency response of road tunnels, and offer new ideas for digital twin modeling of complex scenes.
The Internet of Vehicles (IoV) has become an important direction in the field of intelligent transportation, in which vehicle positioning is a crucial part. Simultaneous Localization and Mapping (SLAM) technology plays a crucial role in vehicle localization and navigation. Traditional SLAM systems are designed for static environments and can suffer poor accuracy and robustness in dynamic environments where objects are in constant motion. To address this issue, a new real-time visual SLAM system called MG-SLAM has been developed. Based on ORB-SLAM2, MG-SLAM incorporates a dynamic target detection process that enables the detection of both known and unknown moving objects. In this process, a separate semantic segmentation thread segments dynamic target instances, and the Mask R-CNN algorithm is run on the Graphics Processing Unit (GPU) to accelerate segmentation. To reduce computational cost, only key frames are segmented to identify known dynamic objects. Additionally, a multi-view geometry method is adopted to detect unknown moving objects. The results demonstrate that MG-SLAM achieves higher precision, improving from 0.2730 m to 0.0135 m. Moreover, the processing time required by MG-SLAM is significantly reduced compared to other dynamic-scene SLAM algorithms, which illustrates its efficacy in locating objects in dynamic scenes.
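One step a dynamic-scene SLAM pipeline of this kind performs is discarding feature points that land on segmented moving objects before pose estimation. The sketch below shows that filtering step alone, with a made-up mask and keypoints; it is not MG-SLAM code, where the mask would come from Mask R-CNN on key frames.

```python
def filter_static_keypoints(keypoints, dynamic_mask):
    """Keep only keypoints that do not fall on a dynamic object.
    keypoints: (row, col) pairs; dynamic_mask: 2D list, 1 = dynamic."""
    return [(r, c) for r, c in keypoints if dynamic_mask[r][c] == 0]

mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],  # a moving object occupies the upper-right region
    [0, 0, 0, 0],
]
kps = [(0, 0), (0, 3), (1, 2), (2, 1)]
print(filter_static_keypoints(kps, mask))  # [(0, 0), (2, 1)]
```

The surviving static points are then the only ones fed to feature matching and pose optimization, which is what restores accuracy in dynamic scenes.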
Ecological monitoring vehicles are equipped with a range of sensors and monitoring devices designed to gather data on ecological and environmental factors. These vehicles are crucial in various fields, including environmental science research, ecological and environmental monitoring projects, disaster response, and emergency management. A key method employed in these vehicles for achieving high-precision positioning is LiDAR (light detection and ranging)-Visual Simultaneous Localization and Mapping (SLAM). However, maintaining high-precision localization in complex scenarios, such as degraded environments or when dynamic objects are present, remains a significant challenge. To address this issue, we integrate both semantic and texture information from LiDAR and cameras to enhance the robustness and efficiency of data registration. Specifically, semantic information simplifies the modeling of scene elements, reducing the reliance on dense point clouds, which can be less efficient. Meanwhile, visual texture information complements LiDAR-Visual localization by providing additional contextual details. By incorporating semantic and texture details from paired images and point clouds, we significantly improve the quality of data association, thereby increasing the success rate of localization. This approach not only enhances the operational capabilities of ecological monitoring vehicles in complex environments but also contributes to improving the overall efficiency and effectiveness of ecological monitoring and environmental protection efforts.
To effectively address the complexity of the environment, information uncertainty, and variability among decision-makers in the event of an enterprise emergency, a multi-granularity binary semantic-based emergency decision-making method is proposed. First, decision-makers use their preferred multi-granularity non-uniform linguistic scales combined with binary semantics to represent the evaluation information for the key influencing factors. Second, the weights are determined based on the proposed method. Finally, the method's effectiveness is validated using a case study of a fire incident at a chemical company.
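The "binary semantics" mentioned above usually refers to the 2-tuple linguistic model, in which a numeric rating is carried as a (linguistic label, symbolic translation) pair so that aggregation loses no information. The sketch below assumes a five-label scale and toy decision-maker weights; neither comes from the paper.

```python
LABELS = ["very poor", "poor", "fair", "good", "very good"]  # assumed 5-label scale

def to_two_tuple(beta):
    """Translate beta in [0, len(LABELS)-1] into a (label, alpha) pair,
    where alpha is the symbolic translation to the nearest label."""
    i = int(round(beta))
    return LABELS[i], round(beta - i, 6)

def to_numeric(label, alpha):
    """Inverse translation back to a numeric value."""
    return LABELS.index(label) + alpha

# Aggregate three decision-makers' ratings by weighted average, then
# express the result losslessly as a 2-tuple.
ratings = [3.0, 4.0, 2.5]
weights = [0.5, 0.3, 0.2]
beta = sum(r * w for r, w in zip(ratings, weights))
print(to_two_tuple(beta))  # ('good', 0.2)
```

The round trip through `to_numeric` recovers the aggregated value exactly, which is the point of the 2-tuple representation.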
Lower back pain is one of the most common medical problems in the world, experienced by a huge percentage of people everywhere. Due to its ability to produce a detailed view of the soft tissues, including the spinal cord, nerves, intervertebral discs, and vertebrae, Magnetic Resonance Imaging (MRI) is considered the most effective method for imaging the spine. The semantic segmentation of vertebrae plays a major role in the diagnostic process of lumbar diseases. It is difficult to semantically partition the vertebrae in Magnetic Resonance Images from the surrounding variety of tissues, including muscles, ligaments, and intervertebral discs. U-Net is a powerful deep-learning architecture for the challenges of medical image analysis tasks and achieves high segmentation accuracy. This work proposes a modified U-Net architecture, MU-Net, consisting of a Meijering convolutional layer that incorporates the Meijering filter, to perform semantic segmentation of lumbar vertebrae L1 to L5 and sacral vertebra S1. Pseudo-colour mask images were generated and used as ground truth for training the model. The work was carried out on 1312 images expanded from the T1-weighted mid-sagittal MRI images of 515 patients in the Lumbar Spine MRI Dataset, publicly available from Mendeley Data. The proposed MU-Net model gives better performance, with 98.79% pixel accuracy (PA), 98.66% dice similarity coefficient (DSC), 97.36% Jaccard coefficient, and 92.55% mean Intersection over Union (mean IoU) on the mentioned dataset.
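For reference, the reported metrics can all be computed from a pixel-level confusion matrix, as sketched below for a binary mask; the tiny masks are invented, and real evaluation would run over full-resolution multi-class masks.

```python
def confusion(pred, truth):
    """Count true/false positives/negatives over flattened binary masks."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    tn = sum(p == 0 and t == 0 for p, t in zip(pred, truth))
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))
    return tp, tn, fp, fn

def pixel_accuracy(pred, truth):
    tp, tn, fp, fn = confusion(pred, truth)
    return (tp + tn) / (tp + tn + fp + fn)

def dice(pred, truth):
    tp, _, fp, fn = confusion(pred, truth)
    return 2 * tp / (2 * tp + fp + fn)

def iou(pred, truth):  # the Jaccard coefficient
    tp, _, fp, fn = confusion(pred, truth)
    return tp / (tp + fp + fn)

pred  = [1, 1, 0, 0, 1, 0]  # toy predicted mask
truth = [1, 0, 0, 0, 1, 1]  # toy ground-truth mask
print(pixel_accuracy(pred, truth), dice(pred, truth), iou(pred, truth))
```

Mean IoU, as reported in the abstract, would average `iou` over all vertebra classes.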
Semantic segmentation is a core task in computer vision that allows AI models to interact with and understand their surrounding environment. Similar to how humans subconsciously segment scenes, this ability is crucial for scene understanding. However, a challenge many semantic learning models face is the lack of data. Existing video datasets are limited to short, low-resolution videos that are not representative of real-world examples. Thus, one of our key contributions is a customized semantic segmentation version of the Walking Tours Dataset that features hour-long, high-resolution, real-world data from tours of different cities. Additionally, we evaluate the performance of the open-vocabulary semantic segmentation model OpenSeeD on our custom dataset and discuss future implications.
In recent years, deep learning-based semantic communications have shown great potential to enhance the performance of communication systems. This has led to the belief that semantic communications represent a breakthrough beyond the Shannon paradigm and will play an essential role in future communications. To narrow the gap between current research and this future vision, after an overview of semantic communications, this article presents and discusses ten fundamental and critical challenges in today's semantic communication field. These challenges are divided into theoretical foundations, system design, and practical implementation. Challenges related to the theoretical foundations, including semantic capacity, entropy, and rate-distortion, are discussed first. Then, the system design challenges, encompassing architecture, knowledge base, joint semantic-channel coding, tailored transmission schemes, and impairment, are posed. The last two challenges, associated with practical implementation, lie in cross-layer optimization for networks and standardization. For each challenge, efforts to date and thoughtful insights are provided.
With the rapid development of artificial intelligence and the Internet of Things, along with the growing demand for privacy-preserving transmission, the need for efficient and secure communication systems has become increasingly urgent. Traditional communication methods transmit data at the bit level without considering its semantic significance, leading to redundant transmission overhead and reduced efficiency. Semantic communication addresses this issue by extracting and transmitting only the most meaningful semantic information, thereby improving bandwidth efficiency. However, despite reducing the volume of data, it remains vulnerable to privacy risks, as semantic features may still expose sensitive information. To address this, we propose an entropy-bottleneck-based privacy protection mechanism for semantic communication. Our approach uses semantic segmentation to partition images into regions of interest (ROI) and regions of non-interest (RONI) based on the receiver's needs, enabling differentiated semantic transmission. By focusing transmission on ROIs, bandwidth usage is optimized and non-essential data is minimized. The entropy bottleneck model probabilistically encodes the semantic information into a compact bit stream, reducing the correlation between the transmitted content and the original data, thus enhancing privacy protection. The proposed framework is systematically evaluated in terms of compression efficiency, semantic fidelity, and privacy preservation. Through comparative experiments with traditional and state-of-the-art methods, we demonstrate that the approach significantly reduces data transmission, maintains the quality of semantically important regions, and ensures robust privacy protection.
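A minimal sketch of the ROI/RONI split described above, assuming a per-pixel label map: ROI pixels are kept exact while RONI pixels are coarsely quantized. The label set, quantization step, and pixel values are all illustrative; the actual mechanism encodes semantics through an entropy bottleneck rather than simple uniform quantization.

```python
ROI_LABELS = {"person", "vehicle"}  # assumed receiver interests

def split_roi(pixels, labels):
    """Partition pixel values by whether their label is of interest."""
    roi, roni = [], []
    for value, label in zip(pixels, labels):
        (roi if label in ROI_LABELS else roni).append(value)
    return roi, roni

def coarse_quantize(values, step=64):
    """Heavy uniform quantization for non-essential regions."""
    return [(v // step) * step for v in values]

pixels = [200, 13, 77, 140]
labels = ["person", "sky", "road", "vehicle"]
roi, roni = split_roi(pixels, labels)
payload = {"roi": roi, "roni": coarse_quantize(roni)}
print(payload)  # {'roi': [200, 140], 'roni': [0, 64]}
```

Only the compact payload is transmitted, which captures the bandwidth-versus-fidelity trade the abstract describes.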
Remote driving, an emergent technology enabling remote operation of vehicles, presents a significant challenge in transmitting large volumes of image data to a central server. This requirement outpaces the capacity of traditional communication methods. To tackle this, we propose a novel framework using semantic communications, through a region-of-interest semantic segmentation method, to reduce communication costs by transmitting meaningful semantic information rather than bit-wise data. To solve the knowledge base inconsistencies inherent in semantic communications, we introduce a blockchain-based, edge-assisted system for managing diverse and geographically varied semantic segmentation knowledge bases. This system not only ensures the security of data through the tamper-resistant nature of blockchain but also leverages edge computing for efficient management. Additionally, the implementation of blockchain sharding handles differentiated knowledge bases for various tasks, thus boosting overall blockchain efficiency. Experimental results show a great reduction in latency from sharding and an increase in model accuracy, confirming the framework's effectiveness.
Multimedia semantic communication has been receiving increasing attention due to its significant enhancement of communication efficiency. Semantic coding, which is oriented towards extracting and encoding the key semantics of video for transmission, is a key aspect of the multimedia semantic communication framework. In this paper, we propose a low-bitrate facial video semantic coding method based on the temporal continuity of video semantics. At the sender's end, we selectively transmit facial keypoints and deformation information, allocating distinct bitrates to different keypoints across frames. Compression techniques involving sampling and quantization are employed to reduce the bitrate while retaining the facial key semantic information. At the receiver's end, a GAN-based generative network is utilized for reconstruction, effectively mitigating the block artifacts and buffering problems present in traditional codec algorithms at low bitrates. The performance of the proposed approach is validated on multiple datasets, such as VoxCeleb and TalkingHead-1kH, employing metrics such as LPIPS, DISTS, and AKD for assessment. Experimental results demonstrate significant advantages over traditional codec methods, achieving up to approximately a 10-fold bitrate reduction in prolonged, stable head-pose scenarios across diverse conversational video settings.
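Allocating "distinct bitrates to different keypoints" can be pictured as uniform quantization with per-keypoint bit budgets, as in the hedged sketch below; the coordinate values and bit depths are assumptions, not values from the paper.

```python
def quantize(value, lo, hi, bits):
    """Uniformly quantize a value in [lo, hi] to an integer code."""
    levels = (1 << bits) - 1
    return round((value - lo) / (hi - lo) * levels)

def dequantize(code, lo, hi, bits):
    """Map an integer code back to an approximate value in [lo, hi]."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)

# A stable jaw keypoint gets a small bit budget; a fast-moving lip
# corner gets a larger one (the coordinates are invented).
jaw_code = quantize(0.31, 0.0, 1.0, bits=4)     # coarse: 4 bits/frame
lip_code = quantize(0.3137, 0.0, 1.0, bits=10)  # fine: 10 bits/frame
print(dequantize(jaw_code, 0.0, 1.0, 4), dequantize(lip_code, 0.0, 1.0, 10))
```

The reconstruction error shrinks with the bit budget, which is what lets slowly varying keypoints be sent cheaply without visible loss.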
The key to the success of few-shot semantic segmentation (FSS) depends on the efficient use of a limited annotated support set to accurately segment novel classes in the query set. Because of the few samples in the support set, FSS faces challenges such as intra-class differences, background (BG) mismatches between the query and support sets, and ambiguous segmentation between the foreground (FG) and BG in the query set. To address these issues, this paper proposes a multi-module network called CAMSNet, which includes four modules: the General Information Module (GIM), the Class Activation Map Aggregation (CAMA) module, the Self-Cross Attention (SCA) Block, and the Feature Fusion Module (FFM). In CAMSNet, the GIM employs an improved triplet loss, which concatenates word embedding vectors and support prototypes as anchors and uses local support features of the FG and BG as positive and negative samples, to help solve the problem of intra-class differences. Then, for the first time, the Class Activation Map (CAM) from Weakly Supervised Semantic Segmentation (WSSS) is applied to FSS within the CAMA module; this replaces the traditional use of cosine similarity to locate query information. Subsequently, the SCA Block processes the support and query features aggregated by the CAMA module, significantly enhancing the understanding of the input information, leading to more accurate predictions and effectively addressing BG mismatch and ambiguous FG-BG segmentation. Finally, the FFM combines general class information with the enhanced query information to achieve accurate segmentation of the query image. Extensive experiments on PASCAL-5i and COCO-20i demonstrate that CAMSNet yields superior performance and sets a new state of the art.
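The GIM's improved triplet loss can be sketched in miniature: concatenate a word embedding with a support prototype to form the anchor, then apply the standard margin-based triplet objective. All vectors and the margin below are toy values, not the paper's.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.2):
    """Standard margin-based triplet loss: pull the positive in,
    push the negative away by at least `margin`."""
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)

word_embedding = [0.2, 0.1]     # toy class-word embedding
support_prototype = [0.5, 0.4]  # toy support prototype
anchor = word_embedding + support_prototype  # concatenation, as in the GIM
fg_feature = [0.25, 0.12, 0.48, 0.41]  # positive: local FG support feature
bg_feature = [0.9, 0.8, 0.1, 0.0]      # negative: local BG support feature
loss = triplet_loss(anchor, fg_feature, bg_feature)
print(round(loss, 3))
```

A smaller margin would already be satisfied here and drive the loss to zero; training minimizes this quantity over many sampled triplets.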
Flipped is a book written by American author Wendelin Van Draanen. It is a novel about young teenagers and was adapted into the famous film of the same name in 2010. The thesis employs speech act theory, as pioneered by John Austin and further developed by John Searle, to investigate the influence of dialogue on characterization and plot development in Flipped. By exploring the speech acts presented in dialogues between characters, the author deciphers the underlying intentions embodied in the dialogues and demonstrates the importance of speech acts in dialogue in revealing the characters, driving the development of the plot, and expressing the theme of the text.
Artificial intelligence is reshaping radiology by enabling automated report generation, yet evaluating the clinical accuracy and relevance of these reports is a challenging task, as traditional natural language generation metrics like BLEU and ROUGE prioritize lexical overlap over clinical relevance. To address this gap, we propose a novel semantic assessment framework for evaluating the accuracy of artificial intelligence-generated radiology reports against ground-truth references. We trained the R2GenRL model on 5229 image-report pairs from the Indiana University chest X-ray dataset and generated a benchmark dataset from the test data of the Indiana University chest X-ray and MIMIC-CXR datasets. These datasets were selected for their public availability, large scale, and comprehensive coverage of diverse clinical cases in chest radiography, enabling robust evaluation and comparison with prior work. Results demonstrate that the Mistral model, particularly with task-oriented prompting, achieves superior performance (up to 91.9% accuracy), surpassing other models and closely aligning with established metrics like BERTScore-F1 (88.1%) and CLIP-Score (88.7%). Statistical analyses, including paired t-tests (p<0.01) and analysis of variance (p<0.05), confirm significant improvements driven by structured prompting. Failure-case analysis reveals limitations, such as over-reliance on lexical similarity, underscoring the need for domain-specific fine-tuning. This framework advances the evaluation of AI-driven radiology report generation, offering a robust, clinically relevant metric for assessing semantic accuracy and paving the way for more reliable automated systems in medical imaging.
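The paired t-test cited above compares two evaluation settings over the same set of reports. A minimal pure-Python version of the t statistic, on invented scores, looks like this:

```python
import math

def paired_t(xs, ys):
    """Paired t statistic: t = mean(d) / (sd(d) / sqrt(n))."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Invented per-report accuracy scores for two settings.
prompted = [0.91, 0.88, 0.93, 0.90, 0.89]  # task-oriented prompting
baseline = [0.85, 0.84, 0.88, 0.86, 0.83]
t = paired_t(prompted, baseline)
print(round(t, 2))  # compared against the t distribution with n-1 dof
```

With n = 5 pairs the statistic is referred to a t distribution with 4 degrees of freedom; in practice `scipy.stats.ttest_rel` returns both t and the p-value directly.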
Funding: supported by the Chinese Ministry of Education Humanities and Social Science Project (23YJC72040003), the Key Project of the Chinese Ministry of Education (22JJD720021), and the Natural Science Foundation of Shandong Province, China (project number ZR2023QF021).
Funding: supported by the National Key Research and Development Program of China (2020YFC1512304).
Funding: Supported by the Communication University of China (CUC230A013) and the Fundamental Research Funds for the Central Universities.
Abstract: The advent of self-attention mechanisms within Transformer models has significantly propelled the advancement of deep learning algorithms, yielding outstanding achievements across diverse domains. Nonetheless, self-attention mechanisms falter when applied to datasets with intricate semantic content and extensive dependency structures. In response, this paper introduces a Diffusion Sampling and Label-Driven Co-attention Neural Network (DSLD), which adopts a diffusion sampling method to capture more comprehensive semantic information from the data. Additionally, the model leverages the joint correlation information of labels and data in computing the text representation, correcting semantic representation biases in the data and increasing the accuracy of the semantic representation. Ultimately, the model computes the corresponding classification results by synthesizing these rich semantic representations. Experiments on seven benchmark datasets show that our proposed model achieves competitive results compared to state-of-the-art methods.
Funding: National Natural Science Foundation of China (Nos. 42301473, 42271424, 42171397); Chinese Postdoctoral Innovation Talents Support Program (No. BX20230299); China Postdoctoral Science Foundation (No. 2023M742884); Natural Science Foundation of Sichuan Province (Nos. 24NSFSC2264, 2025ZNSFSC0322); Key Research and Development Project of Sichuan Province (No. 24ZDYF0633).
Abstract: As a key node of the modern transportation network, the informatized management of road tunnels is crucial for ensuring operational safety and traffic efficiency. However, existing tunnel vehicle modeling methods generally suffer from insufficient 3D scene description capability and low dynamic update efficiency, making it difficult to meet the demands of real-time, accurate management. This paper therefore proposes a vehicle twin modeling method for road tunnels. Starting from actual management needs, the approach supports multi-level dynamic modeling, from vehicle type and size to color, by constructing a flexibly invocable vehicle model library; at the same time, semantic constraint rules covering geometric layout, behavioral attributes, and spatial relationships are designed to ensure that the virtual model matches the real vehicle with a high degree of similarity. Finally, a prototype system is constructed and case experiments are conducted in selected areas, integrating real-time monitoring data with the semantic constraints to realize dynamic updating, precise virtual-real mapping, and three-dimensional visualization of vehicle states in tunnels. The experiments show that the proposed method runs smoothly, with an average rendering time of 17.70 ms, while guaranteeing modeling accuracy (composite similarity of 0.867), significantly improving the real-time performance and intuitiveness of tunnel management. The results provide reliable technical support for intelligent operation and emergency response of road tunnels and offer new ideas for digital twin modeling of complex scenes.
Funding: Funded by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (grant number 22KJD440001) and the Changzhou Science & Technology Program (grant number CJ20220232).
Abstract: The Internet of Vehicles (IoV) has become an important direction in the field of intelligent transportation, in which vehicle positioning is a crucial part. Simultaneous Localization and Mapping (SLAM) technology plays a crucial role in vehicle localization and navigation. Traditional SLAM systems are designed for static environments and can suffer poor accuracy and robustness in dynamic environments where objects are in constant movement. To address this issue, a new real-time visual SLAM system called MG-SLAM has been developed. Built on ORB-SLAM2, MG-SLAM incorporates a dynamic target detection process that enables the detection of both known and unknown moving objects. In this process, a separate semantic segmentation thread segments dynamic target instances, with the Mask R-CNN algorithm run on the Graphics Processing Unit (GPU) to accelerate segmentation. To reduce computational cost, only keyframes are segmented to identify known dynamic objects; additionally, a multi-view geometry method is adopted to detect unknown moving objects. The results demonstrate that MG-SLAM achieves higher precision, improving from 0.2730 m to 0.0135 m. Moreover, the processing time required by MG-SLAM is significantly reduced compared with other dynamic-scene SLAM algorithms, illustrating its efficacy in locating objects in dynamic scenes.
Funding: Supported by the project "GEF9874: Strengthening Coordinated Approaches to Reduce Invasive Alien Species (IAS) Threats to Globally Significant Agrobiodiversity and Agroecosystems in China" and by the Excellent Talent Training Funding Project in Dongcheng District, Beijing (project number 2024-dchrcpyzz-9).
Abstract: Ecological monitoring vehicles are equipped with a range of sensors and monitoring devices designed to gather data on ecological and environmental factors. These vehicles are crucial in various fields, including environmental science research, ecological and environmental monitoring projects, disaster response, and emergency management. A key method employed in these vehicles for achieving high-precision positioning is LiDAR (light detection and ranging)-visual Simultaneous Localization and Mapping (SLAM). However, maintaining high-precision localization in complex scenarios, such as degraded environments or in the presence of dynamic objects, remains a significant challenge. To address this issue, we integrate both semantic and texture information from LiDAR and cameras to enhance the robustness and efficiency of data registration. Specifically, semantic information simplifies the modeling of scene elements, reducing the reliance on dense point clouds, which can be less efficient. Meanwhile, visual texture information complements LiDAR-visual localization by providing additional contextual detail. By incorporating semantic and texture details from paired images and point clouds, we significantly improve the quality of data association, thereby increasing the success rate of localization. This approach not only enhances the operational capabilities of ecological monitoring vehicles in complex environments but also contributes to the overall efficiency and effectiveness of ecological monitoring and environmental protection efforts.
Abstract: To effectively address environmental complexity, information uncertainty, and variability among decision-makers in an enterprise emergency, a multi-granularity binary-semantic emergency decision-making method is proposed. First, decision-makers use their preferred multi-granularity non-uniform linguistic scales, combined with binary semantics, to represent evaluations of the key influencing factors. Second, the weights are determined by the proposed method. Finally, the method's effectiveness is validated through a case study of a fire incident at a chemical company.
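The "binary semantic" representation above appears to be what the wider literature calls the 2-tuple linguistic model, in which a numerical score is expressed losslessly as a linguistic label plus a small symbolic translation. A minimal sketch under that assumption (the five-label scale is illustrative, not the paper's actual non-uniform scales):

```python
def to_two_tuple(beta, labels):
    """Convert a score beta in [0, len(labels)-1] into a 2-tuple
    (label, alpha) with symbolic translation alpha in [-0.5, 0.5)."""
    i = int(beta + 0.5)          # index of the nearest label
    return labels[i], beta - i

def from_two_tuple(label, alpha, labels):
    """Inverse transform: recover the numerical score without information loss."""
    return labels.index(label) + alpha

SCALE = ["none", "low", "medium", "high", "perfect"]  # hypothetical 5-label scale
print(to_two_tuple(2.7, SCALE))  # label 'high' with alpha of about -0.3
```

The symbolic translation is what lets evaluations from scales of different granularities be mapped into a common numerical domain, aggregated, and mapped back without rounding away information.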
Abstract: Lower back pain is one of the most common medical problems in the world, experienced by a huge percentage of people everywhere. Owing to its ability to produce detailed views of soft tissues, including the spinal cord, nerves, intervertebral discs, and vertebrae, Magnetic Resonance Imaging (MRI) is considered the most effective method for imaging the spine. Semantic segmentation of the vertebrae plays a major role in diagnosing lumbar diseases, yet it is difficult to separate the vertebrae in MR images from the surrounding variety of tissues, including muscles, ligaments, and intervertebral discs. U-Net is a powerful deep-learning architecture for medical image analysis that achieves high segmentation accuracy. This work proposes a modified U-Net architecture, MU-Net, whose Meijering convolutional layer incorporates the Meijering ridge filter to perform semantic segmentation of lumbar vertebrae L1 to L5 and sacral vertebra S1. Pseudo-colour mask images were generated and used as ground truth for training the model. The work was carried out on 1312 images expanded from T1-weighted mid-sagittal MRI images of 515 patients in the Lumbar Spine MRI Dataset, publicly available from Mendeley Data. The proposed MU-Net model gives better performance, with 98.79% pixel accuracy (PA), 98.66% dice similarity coefficient (DSC), 97.36% Jaccard coefficient, and 92.55% mean Intersection over Union (mean IoU) on this dataset.
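The Meijering filter underlying the proposed layer is a ridge ("neuriteness") detector built from modified Hessian eigenvalues. A simplified single-scale NumPy sketch of that idea (no Gaussian pre-smoothing, and purely illustrative: the actual MU-Net layer embeds this filtering inside a trained convolutional network):

```python
import numpy as np

def hessian_eigenvalues(img):
    # Second derivatives via finite differences (no Gaussian pre-smoothing).
    dyy = np.gradient(np.gradient(img, axis=0), axis=0)
    dxx = np.gradient(np.gradient(img, axis=1), axis=1)
    dxy = np.gradient(np.gradient(img, axis=0), axis=1)
    half_tr = (dxx + dyy) / 2
    disc = np.sqrt(np.maximum(half_tr**2 - (dxx * dyy - dxy**2), 0))
    return half_tr + disc, half_tr - disc            # lam1 >= lam2 pointwise

def meijering_response(img, alpha=-1/3):
    lam1, lam2 = hessian_eigenvalues(img)
    m1, m2 = lam1 + alpha * lam2, lam2 + alpha * lam1  # modified eigenvalues
    lam = np.where(np.abs(m1) > np.abs(m2), m1, m2)    # keep the dominant one
    lmin = lam.min()
    if lmin >= 0:                                      # no ridge-like structure at all
        return np.zeros_like(img, dtype=float)
    return np.where(lam < 0, lam / lmin, 0.0)          # bright ridges -> high response

# A bright horizontal line on a dark background responds strongly along the line.
img = np.zeros((9, 9))
img[4, :] = 1.0
resp = meijering_response(img)
```

The appeal for vertebra segmentation is that such a layer makes elongated bright structures (cortical bone boundaries) explicit before the U-Net encoder ever sees them.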
Abstract: Semantic segmentation is a core task in computer vision that allows AI models to understand and interact with their surrounding environment. Just as humans subconsciously segment scenes, this ability is crucial for scene understanding. However, a challenge many semantic learning models face is the lack of data: existing video datasets are limited to short, low-resolution clips that are not representative of real-world examples. Thus, one of our key contributions is a customized semantic segmentation version of the Walking Tours Dataset, featuring hour-long, high-resolution, real-world footage from tours of different cities. Additionally, we evaluate the performance of the open-vocabulary segmentation model OpenSeeD on our custom dataset and discuss future implications.
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2021YFA1000500(4), and in part by the Natural Science Foundation of China (NSFC) under Grants 62293484, U22B2001, 62425110, 62227801, and 62442106.
Abstract: In recent years, deep learning-based semantic communications have shown great potential to enhance the performance of communication systems. This has led to the belief that semantic communications represent a breakthrough beyond the Shannon paradigm and will play an essential role in future communications. To narrow the gap between current research and this future vision, after an overview of semantic communications, this article presents and discusses ten fundamental and critical challenges in today's semantic communication field, divided into theory foundation, system design, and practical implementation. Challenges related to the theory foundation, including semantic capacity, entropy, and rate-distortion, are discussed first. Then, the system design challenges encompassing architecture, knowledge base, joint semantic-channel coding, tailored transmission schemes, and impairment are posed. The last two challenges, associated with practical implementation, lie in cross-layer optimization for networks and standardization. For each challenge, efforts to date and thoughtful insights are provided.
Funding: Supported in part by the Innovation and Entrepreneurship Training Program for Chinese College Students (No. 202410128019), in part by JST ASPIRE Grant Number JPMJAP2325, and in part by the Support Center for Advanced Telecommunications Technology Research (SCAT).
Abstract: With the rapid development of artificial intelligence and the Internet of Things, along with the growing demand for privacy-preserving transmission, the need for efficient and secure communication systems has become increasingly urgent. Traditional communication methods transmit data at the bit level without considering its semantic significance, leading to redundant transmission overhead and reduced efficiency. Semantic communication addresses this issue by extracting and transmitting only the most meaningful semantic information, thereby improving bandwidth efficiency. However, despite reducing the volume of data, it remains vulnerable to privacy risks, as semantic features may still expose sensitive information. To address this, we propose an entropy-bottleneck-based privacy protection mechanism for semantic communication. Our approach uses semantic segmentation to partition images into regions of interest (ROI) and regions of non-interest (RONI) based on the receiver's needs, enabling differentiated semantic transmission. By focusing transmission on ROIs, bandwidth usage is optimized and non-essential data is minimized. The entropy bottleneck model probabilistically encodes the semantic information into a compact bit stream, reducing the correlation between the transmitted content and the original data and thus enhancing privacy protection. The proposed framework is systematically evaluated in terms of compression efficiency, semantic fidelity, and privacy preservation. Comparative experiments with traditional and state-of-the-art methods demonstrate that the approach significantly reduces data transmission, maintains the quality of semantically important regions, and ensures robust privacy protection.
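The ROI/RONI differentiation can be illustrated with a toy fixed-step quantiser (purely illustrative: the paper's actual mechanism uses a learned entropy-bottleneck encoder, not uniform quantisation): ROI pixels are kept intact, while RONI pixels are coarsened so that fewer bits describe the regions the receiver does not need.

```python
import numpy as np

def roi_differentiated_encode(img, roi_mask, roni_bits=2):
    """Keep ROI pixels exact; quantise RONI pixels to 2**roni_bits levels.
    img: uint8 array; roi_mask: boolean array of the same shape."""
    step = 256 // (2 ** roni_bits)
    coarse = (img // step) * step + step // 2     # midpoint of each quantisation bin
    return np.where(roi_mask, img, coarse).astype(np.uint8)

img  = np.array([[ 10, 200],
                 [100,  50]], dtype=np.uint8)
mask = np.array([[True, False],
                 [False, True]])                  # top-left / bottom-right are ROI
out = roi_differentiated_encode(img, mask)        # ROI survives, RONI is coarsened
```

Collapsing RONI pixels onto a handful of shared levels is also what weakens the statistical link between the transmitted stream and the original scene, which is the intuition behind the privacy claim.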
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 62062031; in part by MIC/SCOPE #JP235006102; in part by JST ASPIRE Grant Number JPMJAP2325; in part by ROIS NII Open Collaborative Research under Grant 24S0601; and in part by collaborative research with Toyota Motor Corporation, Japan.
Abstract: Remote driving, an emergent technology enabling remote operation of vehicles, presents a significant challenge in transmitting large volumes of image data to a central server, a requirement that outpaces the capacity of traditional communication methods. To tackle this, we propose a novel framework using semantic communications, through a region-of-interest semantic segmentation method, to reduce communication costs by transmitting meaningful semantic information rather than bit-wise data. To solve the knowledge-base inconsistencies inherent in semantic communications, we introduce a blockchain-based, edge-assisted system for managing diverse and geographically varied semantic segmentation knowledge bases. This system not only ensures data security through the tamper-resistant nature of blockchain but also leverages edge computing for efficient management. Additionally, blockchain sharding handles differentiated knowledge bases for various tasks, boosting overall blockchain efficiency. Experimental results show a great reduction in latency from sharding and an increase in model accuracy, confirming our framework's effectiveness.
Funding: Supported by the National Natural Science Foundation of China (Nos. NSFC 61925105, 62322109, 62171257, and U22B2001); the Xplorer Prize in Information and Electronics technologies; and the Tsinghua University (Department of Electronic Engineering)-Nantong Research Institute for Advanced Communication Technologies Joint Research Center for Space, Air, Ground and Sea Cooperative Communication Network Technology.
Abstract: Multimedia semantic communication has been receiving increasing attention due to its significant enhancement of communication efficiency. Semantic coding, which extracts and encodes the key semantics of video for transmission, is a central component of the multimedia semantic communication framework. In this paper, we propose a low-bitrate facial video semantic coding method based on the temporal continuity of video semantics. At the sender's end, we selectively transmit facial keypoints and deformation information, allocating distinct bitrates to different keypoints across frames. Compression techniques involving sampling and quantization reduce the bitrate while retaining the key facial semantic information. At the receiver's end, a GAN-based generative network is utilized for reconstruction, effectively mitigating the block artifacts and buffering problems of traditional codecs at low bitrates. The performance of the proposed approach is validated on multiple datasets, such as VoxCeleb and TalkingHead-1kH, using metrics including LPIPS, DISTS, and AKD. Experimental results demonstrate significant advantages over traditional codec methods, achieving up to approximately 10-fold bitrate reduction in prolonged, stable head-pose scenarios across diverse conversational video settings.
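The sampling-and-quantization step can be sketched as follows (parameters are hypothetical, and this uniform version does not model the paper's per-keypoint bitrate allocation): keypoint coordinates within a frame are mapped to fixed-width integer codes, and the per-frame payload is compared against sending raw 32-bit floats.

```python
import numpy as np

def quantise_keypoints(kps, bits=8, frame_size=256.0):
    """Uniformly quantise (x, y) keypoints in [0, frame_size) to `bits`-bit codes."""
    scale = (2 ** bits - 1) / (frame_size - 1)
    codes = np.round(kps * scale).astype(np.uint16)
    return codes, codes / scale                 # integer codes and dequantised coords

kps = np.array([[12.3, 200.7], [63.5, 31.2]])   # two facial keypoints (toy values)
codes, recon = quantise_keypoints(kps)

n_kp = 68                                       # e.g. a standard 68-point face layout
raw_bits   = n_kp * 2 * 32                      # 32-bit float per coordinate
coded_bits = n_kp * 2 * 8                       # 8-bit code per coordinate
# 4x fewer bits per frame, before any temporal prediction or entropy coding
```

Because head pose changes slowly in stable conversational video, real schemes additionally transmit only keypoint deltas between frames, which is where the order-of-magnitude bitrate savings come from.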
Funding: Supported by the National Natural Science Foundation of China (U1904119); the Research Programs of the Henan Science and Technology Department (232102210033, 232102210054); the Chongqing Natural Science Foundation (CSTB2023NSCQ-MSX0070); the Henan Province Key Research and Development Project (231111212000); the Aviation Science Foundation (20230001055002); and the Henan Center for Outstanding Overseas Scientists (GZS2022011).
Abstract: The key to successful few-shot semantic segmentation (FSS) is the efficient use of a limited annotated support set to accurately segment novel classes in the query set. Because the support set contains few samples, FSS faces challenges such as intra-class differences, background (BG) mismatches between the query and support sets, and ambiguous segmentation between the foreground (FG) and BG in the query set. To address these issues, this paper proposes a multi-module network called CAMSNet, comprising four modules: the General Information Module (GIM), the Class Activation Map Aggregation (CAMA) module, the Self-Cross Attention (SCA) Block, and the Feature Fusion Module (FFM). In CAMSNet, the GIM employs an improved triplet loss that concatenates word embedding vectors with support prototypes as anchors and uses local FG and BG support features as positive and negative samples, helping to address intra-class differences. Then, for the first time, the Class Activation Map (CAM) from Weakly Supervised Semantic Segmentation (WSSS) is applied to FSS within the CAMA module, replacing the traditional use of cosine similarity to locate query information. Subsequently, the SCA Block processes the support and query features aggregated by the CAMA module, significantly enhancing the understanding of the input and leading to more accurate predictions, effectively addressing BG mismatch and ambiguous FG-BG segmentation. Finally, the FFM combines general class information with the enhanced query information to achieve accurate segmentation of the query image. Extensive experiments on PASCAL-5^i and COCO-20^i demonstrate that CAMSNet yields superior performance and sets a new state of the art.
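The GIM's anchor construction can be sketched as follows (vector shapes and the margin value are assumptions for illustration; the paper's "improved triplet loss" may differ in detail): the anchor concatenates a class word embedding with the support prototype, and local FG/BG features serve as the positive and negative samples of a standard hinge-style triplet loss.

```python
import numpy as np

def gim_triplet_loss(word_emb, proto, pos_feat, neg_feat, margin=0.5):
    """Hinge triplet loss with a concatenated (word embedding, prototype) anchor."""
    anchor = np.concatenate([word_emb, proto])
    d_pos = np.linalg.norm(anchor - pos_feat)   # pull local FG features toward the anchor
    d_neg = np.linalg.norm(anchor - neg_feat)   # push local BG features away from it
    return max(0.0, d_pos - d_neg + margin)

word_vec = np.array([1.0, 0.0])                 # toy class word embedding
proto    = np.array([0.0, 1.0])                 # toy support prototype
fg = np.array([1.0, 0.0, 0.0, 1.0])            # FG feature already matching the anchor
bg = np.zeros(4)                                # distant BG feature
```

Including the word embedding in the anchor injects class-level semantics that do not vary between support images, which is the lever against intra-class appearance differences.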
Abstract: Flipped is a novel by American author Wendelin Van Draanen about young teenagers; it was adapted into the well-known film of the same name in 2010. The thesis employs speech act theory, as pioneered by J. L. Austin and further developed by John Searle, to investigate the influence of dialogue on characterization and plot development in Flipped. By analyzing the speech acts in dialogues between characters, the author deciphers the underlying intentions embodied in the dialogues and demonstrates the importance of speech acts in revealing character, driving the plot, and expressing the theme of the text.
Funding: Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) Innovative Human Resource Development for Local Intellectualization program grant funded by the Korea government (MSIT) (IITP-2024-RS-2024-00436773).
Abstract: Artificial intelligence is reshaping radiology by enabling automated report generation, yet evaluating the clinical accuracy and relevance of these reports is challenging, as traditional natural language generation metrics like BLEU and ROUGE prioritize lexical overlap over clinical relevance. To address this gap, we propose a novel semantic assessment framework for evaluating the accuracy of AI-generated radiology reports against ground-truth references. We trained the R2GenRL model on 5229 image-report pairs from the Indiana University chest X-ray dataset and generated a benchmark dataset from the test data of the Indiana University chest X-ray and MIMIC-CXR datasets. These datasets were selected for their public availability, large scale, and comprehensive coverage of diverse clinical cases in chest radiography, enabling robust evaluation and comparison with prior work. Results demonstrate that the Mistral model, particularly with task-oriented prompting, achieves superior performance (up to 91.9% accuracy), surpassing other models and aligning closely with established metrics such as BERTScore-F1 (88.1%) and CLIP-Score (88.7%). Statistical analyses, including paired t-tests (p<0.01) and analysis of variance (p<0.05), confirm significant improvements driven by structured prompting. Failure-case analysis reveals limitations, such as over-reliance on lexical similarity, underscoring the need for domain-specific fine-tuning. This framework advances the evaluation of AI-driven radiology report generation, offering a robust, clinically relevant metric for assessing semantic accuracy and paving the way for more reliable automated systems in medical imaging.
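The over-reliance on lexical similarity that the failure analysis highlights is easy to demonstrate with a toy term-overlap metric (illustrative only; the framework itself relies on LLM-based semantic judgment, not this baseline): two clinically opposite findings can still score highly.

```python
import math
from collections import Counter

def cosine_sim(a: str, b: str) -> float:
    """Toy lexical-overlap baseline: cosine similarity of term-count vectors.
    Real semantic scoring would use contextual embeddings (e.g. BERTScore)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

# Opposite clinical meanings, yet two of three tokens overlap:
score = cosine_sim("pneumothorax is present", "pneumothorax is absent")
```

Here the two reports disagree on the only fact that matters, yet the lexical score is about 0.67, which is exactly the failure mode that motivates clinically grounded, semantics-aware evaluation.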