Carbon dot (CD)-based composites have shown impressive performance in information encryption and sensing. However, simultaneously implementing multi-mode luminescence and room-temperature phosphorescence (RTP) detection in a single system remains a great challenge because of the formidable synthesis. Herein, a multifunctional composite, Eu&CDs@p RHO, has been designed by a co-assembly strategy and prepared via facile calcination and impregnation treatment. Eu&CDs@p RHO exhibits intense fluorescence (FL) and RTP from two individual luminous centers: Eu^(3+) in the free pores and CDs in the interrupted structure of the RHO zeolite. Unique four-mode color outputs, namely pink (Eu^(3+), ex. 254 nm), light violet (CDs, ex. 365 nm), blue (CDs, 254 nm off), and green (CDs, 365 nm off), can be realized; on this basis, a preliminary application in advanced information encoding has been demonstrated. Given the free pores of the matrix and the stable RTP of the confined CDs in water, visual RTP detection of Fe^(3+) ions is achieved with a detection limit as low as 9.8 μmol/L. This work opens up a new perspective for the strategic amalgamation of luminous guests with porous zeolites to construct advanced functional materials.
Prostate cancer (PCa) is characterized by high incidence and a propensity for metastasis, presenting significant challenges in clinical diagnosis and treatment. Tumor microenvironment (TME)-responsive nanomaterials offer a promising route to imaging-guided precision therapy. Because tumor-derived alkaline phosphatase (ALP) is over-expressed in metastatic PCa, there is a strong opportunity to develop a theranostic system that responds to ALP in the TME. Herein, a self-assembled, ALP-responsive aggregation-induced emission luminogen (AIEgen) nanoprobe, AMNF, was designed to enhance the diagnosis and treatment of metastatic PCa. The nanoprobe self-aggregated in the presence of ALP, resulting in aggregation-induced fluorescence as well as enhanced accumulation and prolonged retention at the tumor site. In terms of detection, the fluorescence (FL)/computed tomography (CT)/magnetic resonance (MR) multi-mode imaging effect of the nanoprobe was significantly improved after aggregation, enabling precise diagnosis through the combination of multiple imaging modes. Enhanced CT/MR imaging can assist preoperative tumor diagnosis, and enhanced FL imaging can provide “intraoperative visual navigation”, showing potential value in clinical tumor detection and surgical guidance. In terms of treatment, AMNF showed strong absorption in the near-infrared region after aggregation, which improved the photothermal treatment effect. Overall, our work developed an effective aggregation-enhanced theranostic strategy for ALP-related cancers.
Low-carbon smart parks achieve self-balanced carbon emission and absorption through the cooperative scheduling of direct current (DC)-based distributed photovoltaics, energy storage units, and loads. Direct current power line communication (DC-PLC) enables real-time data transmission over DC power lines. With traffic adaptation, DC-PLC can be integrated with complementary media such as 5G to reduce transmission delay and improve reliability. However, traffic adaptation for DC-PLC and 5G integration still faces challenges such as the coupling between traffic admission control and traffic partition, the curse of dimensionality, and the neglect of extreme event occurrence. To address these challenges, we propose a deep reinforcement learning (DRL)-based delay-sensitive and reliable traffic adaptation algorithm (DSRTA) that minimizes the total queuing delay under constraints on traffic admission control, queuing delay, and the occurrence probability of extreme events. DSRTA jointly optimizes traffic admission control and traffic partition and enables learning-based intelligent traffic adaptation. The long-term constraints are incorporated into both the state and the bound of the drift-plus-penalty to achieve delay awareness and enforce reliability guarantees. Simulation results show that DSRTA achieves lower queuing delay and a more reliable quality of service (QoS) guarantee than other state-of-the-art algorithms.
The all-wheel drive (AWD) hybrid system is a research focus for high-performance new energy vehicles because it can meet the demands of dynamic performance and passing ability. Simultaneously optimizing the power and economy of hybrid vehicles remains an open issue. A unique multi-mode coupling (MMC) AWD hybrid system is presented that realizes both distributed and centralized driving of the front and rear axles, achieving vectored distribution and full utilization of the system power between the axles. Based on the parameters of a benchmark hybrid vehicle model, the best model-predictive control-based energy management strategy is proposed. First, the drive system model was built after analyzing the MMC-AWD’s drive modes. Next, three fundamental strategies were established to address power distribution adjustment and battery state of charge (SOC) maintenance when the SOC changed, followed by the design of a road driving force observer. Then, the energy consumption rate in the average time domain was processed before designing the minimum fuel consumption controller based on the equivalent fuel consumption coefficient. Finally, the advantage of the MMC-AWD was confirmed by comparing its dynamic performance and economy with those of the BYD Song PLUS DMI-AWD. The findings indicate that, relative to the comparison hybrid system, the MMC-AWD’s acceleration capacity increases by 5.26% and 7.92% at road adhesion coefficients of 0.8 and 0.6, respectively. When the road adhesion coefficient is 0.8, 0.6, and 0.4, the maximum climbing ability increases by 14.22%, 12.88%, and 4.55%, respectively. As a result, the dynamic performance is greatly enhanced, and the fuel savings rate per 100 km reaches 12.06%, indicating good economy as well. The proposed control strategies for the new hybrid AWD vehicle can optimize power and economy simultaneously.
This work evaluates the viability of a cutting-edge flexible wing prototype actuated by Shape Memory Alloy (SMA) wire actuators. Such flexible wings have garnered significant interest for their potential to enhance aerodynamic efficiency by mitigating noise and delaying flow separation. SMA actuators are particularly advantageous due to their superior power-to-weight ratio and adaptive response, making them increasingly favored in morphing aircraft applications. Our methodology begins with a detailed description of the fishbone camber morphing wing rib structure, followed by the construction of a multi-mode morphing wing segment from an assembly of 3D-printed ribs. The actuation capacity and efficiency of the SMA wire actuators were tested comprehensively to establish their operational parameters. Subsequent experiments focused on the bi-directional and reciprocating morphing performance of the fishbone wing rib, which incorporates SMA wires on its upper and lower sides. These experiments confirmed the segment’s multi-mode morphing abilities. Aerodynamic assessments demonstrated that our design substantially improves the Lift-to-Drag ratio (L/D) compared with conventional rigid wings. Finally, two phases of flight tests demonstrated, respectively, the feasibility of SMA as an aircraft actuator and the validity of flexible wing structures for adjusting the aircraft attitude.
In recent years, efficiently and accurately identifying multi-modal fake news has become more challenging. First, multi-modal data provides more evidence, but not all of it is equally important. Second, social structure information has proven effective in fake news detection, so combining it while reducing noise is critical. Unfortunately, existing approaches fail to handle these problems. This paper proposes a multi-modal fake news detection framework based on Text-modal Dominance and fusing Multiple Multi-modal Cues (TD-MMC), which utilizes three valuable multi-modal cues: text-modal importance, text-image complementarity, and text-image inconsistency. TD-MMC is dominated by textual content and assisted by image information, while using social network information to enhance the text representation. To reduce interference from irrelevant social structure information, we use a unidirectional cross-modal attention mechanism to selectively learn the social structure’s features. A cross-modal attention mechanism is adopted to obtain text-image cross-modal features while retaining textual features to reduce the loss of important information. In addition, TD-MMC employs a new multi-modal loss to improve the model’s generalization ability. Extensive experiments have been conducted on two public real-world English and Chinese datasets, and the results show that our proposed model outperforms the state-of-the-art methods on classification evaluation metrics.
Lanthanide-doped double halide perovskites have attracted increasing interest due to their distinctive upconversion and near-infrared (NIR) luminescence characteristics. Here, erbium ion (Er^(3+))-doped Cs_(2)(Na/Ag)BiCl_(6) microcrystals (MCs) were synthesized and shown to be among the most promising candidates for optical thermometry. The enhancement of both the white light from self-trapped exciton emission and the NIR emission from Er^(3+) ions in Cs_(2)AgBiCl_(6) microcrystals is caused by lattice distortion due to Na^(+) ion doping. Fluorescence intensity ratio and lifetime methods provide self-referenced and sensitive thermometry under 405 and/or 980 nm laser excitation at temperatures from 80 to 480 K. In addition, maximum relative and absolute sensitivities of 3.62%/K and 27//K can be achieved across the low-to-high temperature range under 980 and 405 nm laser co-excitation. Based on the experimental analysis, Er^(3+)-doped Cs_(2)(Na/Ag)BiCl_(6) double perovskite is considered an ideal self-calibrating thermometric material owing to its good long-term stability and multi-mode excitation and detection.
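As an illustration of how a fluorescence intensity ratio (FIR) yields relative and absolute sensitivities, the sketch below uses the common Boltzmann model for two thermally coupled emitting levels. This model is an assumption here, since the abstract does not state which levels or fitting model the authors used. The energy gap `delta_e_cm` and the `prefactor` are hypothetical illustration values, not data from the paper.

```python
import math

K_B_CM = 0.695  # Boltzmann constant in cm^-1 per kelvin (rounded)

def fir(temperature_k, delta_e_cm, prefactor=1.0):
    """Boltzmann-type intensity ratio of two thermally coupled levels.

    FIR(T) = B * exp(-dE / (k_B * T)); B and dE are hypothetical inputs.
    """
    return prefactor * math.exp(-delta_e_cm / (K_B_CM * temperature_k))

def relative_sensitivity(temperature_k, delta_e_cm):
    """S_r = (1/FIR) * dFIR/dT = dE / (k_B * T^2), in units of 1/K."""
    return delta_e_cm / (K_B_CM * temperature_k ** 2)

def absolute_sensitivity(temperature_k, delta_e_cm, prefactor=1.0):
    """S_a = dFIR/dT = FIR * S_r, in units of 1/K."""
    return fir(temperature_k, delta_e_cm, prefactor) * relative_sensitivity(
        temperature_k, delta_e_cm
    )

if __name__ == "__main__":
    gap = 700.0  # hypothetical energy gap in cm^-1, for illustration only
    for t in (80.0, 300.0, 480.0):
        print(t, fir(t, gap), 100.0 * relative_sensitivity(t, gap))  # S_r in %/K
```

Under this model the relative sensitivity falls as 1/T², so it is largest at the cold end of a measurement range, while the absolute sensitivity also depends on how large the ratio itself is at that temperature.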
Objective: To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods: Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Further performance assessment was conducted through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results: A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, and F1 score = 0.933). The AWMDF model outperformed the variants in the model ablation and data ablation experiments as well as various traditional machine learning models. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion: The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
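The reported metrics can be cross-checked, and the idea of adaptive decision-level weighting illustrated, with a short sketch. The F1 function is the standard harmonic mean of precision and recall. The `fuse` function is only a hypothetical stand-in for adaptive weighting (softmax over per-modality scores) and is not the AWMDF architecture, whose details the abstract does not give.

```python
import math

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(modality_probs, modality_scores):
    """Weighted average of per-modality probabilities.

    Weights are softmax(modality_scores); in a trained model the scores
    would be learned. This is an assumed, simplified fusion rule.
    """
    weights = softmax(modality_scores)
    return sum(w * p for w, p in zip(weights, modality_probs))

if __name__ == "__main__":
    # Precision and recall as reported in the abstract.
    print(round(f1_score(0.937, 0.929), 3))  # 0.933, matching the reported F1
    # Hypothetical probabilities from four modalities with hypothetical scores.
    print(fuse([0.9, 0.6, 0.7, 0.8], [1.2, 0.1, 0.3, 0.8]))
```

With equal scores, `fuse` reduces to plain averaging, which is one of the baseline strategies the abstract compares against.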
Multi-modal Named Entity Recognition (MNER) aims to better identify meaningful textual entities by integrating information from images. Previous work has focused on extracting visual semantics at a fine-grained level, or on obtaining entity-related external knowledge from knowledge bases or Large Language Models (LLMs). However, these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches. In this paper, we present MMAVK, a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion, which leverages the Multi-modal Large Language Model (MLLM) as an implicit knowledge base. It also extracts vision-based auxiliary knowledge from the image for more accurate and effective recognition. Specifically, we propose vision-based auxiliary knowledge generation, which uses target-specific prompts to guide the MLLM to extract external knowledge derived exclusively from images to aid entity recognition, thus avoiding the redundant recognition and cognitive confusion caused by processing image-text pairs simultaneously. Furthermore, we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word embedding produced by the transformer-based encoder. Extensive experimental results demonstrate that MMAVK outperforms or equals the state-of-the-art methods on the two classical MNER datasets, even when the large models employed have significantly fewer parameters than other baselines.
Multi-modal knowledge graph completion (MMKGC) aims to complete missing entities or relations in multi-modal knowledge graphs, thereby discovering previously unknown triples. Due to the continuous growth of data and knowledge and the limitations of data sources, the visual knowledge within knowledge graphs is generally of low quality, and some entities lack the visual modality entirely. Nevertheless, previous MMKGC studies have primarily focused on facilitating modality interaction and fusion while neglecting the problems of low modality quality and missing modalities. Mainstream MMKGC models only use pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion, which inevitably suffers from problems such as error propagation and increased uncertainty. To address these problems, we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation (MMCSD). Specifically, we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality. Moreover, we design multi-level visual semantic extraction and entity description generation, thereby further extracting entity semantics from structural triples and visual images. Meanwhile, we train a variational multi-modal auto-encoder and utilize a pre-trained multi-modal language model to complement the missing visual features. We conducted experiments on FB15K-237 and DB13K, and the results show that MMCSD can effectively perform MMKGC and achieve state-of-the-art performance.
Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of a patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, combining data from multiple modalities is difficult because of disparities in resolution, data collection methods, and noise levels. While traditional models such as Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle with multi-modal complexity because they lack the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. First, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Methods such as model pruning, quantization, and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, researchers have implemented hardware-aware acceleration strategies, including the use of TensorRT and ONNX-based model compression, to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
To address the challenge of missing modal information in entity alignment and to mitigate the information loss or bias arising from modal heterogeneity during fusion, while also capturing shared information across modalities, this paper proposes a Multi-modal Pre-synergistic Entity Alignment model based on Cross-modal Mutual Information Strategy Optimization (MPSEA). The model first employs independent encoders to process multi-modal features, including text, images, and numerical values. Next, a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information. This pre-fusion strategy enables unified perception of heterogeneous modalities at the model’s initial stage, reducing discrepancies during the fusion process. Finally, using cross-modal deep perception reinforcement learning, the model achieves adaptive multilevel feature fusion between modalities, which supports learning more effective alignment strategies. Extensive experiments on multiple public datasets show that, compared with existing state-of-the-art methods, MPSEA achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset, and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset. These results confirm the effectiveness of the proposed model.
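For readers unfamiliar with the two metrics quoted above, Hits@1 and MRR can be computed from the rank at which each test entity's correct counterpart appears. The sketch below uses the standard definitions; the example ranks are made up for illustration and are unrelated to the paper's results.

```python
def hits_at_k(ranks, k=1):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

if __name__ == "__main__":
    # Hypothetical ranks of the true aligned entity for four queries.
    example_ranks = [1, 2, 5, 1]
    print(hits_at_k(example_ranks, k=1))        # 0.5
    print(mean_reciprocal_rank(example_ranks))  # approximately 0.675
```

Hits@1 only rewards exact top-ranked matches, whereas MRR also gives partial credit when the correct entity is ranked second or lower.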
Traditional Chinese medicine (TCM) demonstrates distinctive advantages in disease prevention and treatment. However, analyzing its biological mechanisms through the modern medical research paradigm of “single drug, single target” presents significant challenges due to its holistic approach. Network pharmacology and its core theory of network targets connect drugs and diseases from a holistic and systematic perspective based on biological networks, overcoming the limitations of reductionist research models and showing considerable value in TCM research. Recent integration of network target computational and experimental methods with artificial intelligence (AI) and multi-modal multi-omics technologies has substantially enhanced network pharmacology methodology. The advancement in computational and experimental techniques provides complementary support for network target theory in decoding TCM principles. This review, centered on network targets, examines the progress of network target methods combined with AI in predicting disease molecular mechanisms and drug-target relationships, alongside the application of multi-modal multi-omics technologies in analyzing TCM formulae, syndromes, and toxicity. Looking forward, network target theory is expected to incorporate emerging technologies while developing novel approaches aligned with its unique characteristics, potentially leading to significant breakthroughs in TCM research and advancing scientific understanding and innovation in TCM.
As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly used to detect 3D environmental information, significantly advancing perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular and binocular detection according to the input type. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to how they process point cloud features. Additionally, multimodal fusion algorithms are divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods according to the types of sensors involved. Furthermore, the paper identifies five key future research directions in this field, aiming to provide insights for researchers working on multimodal fusion-based object detection algorithms and to encourage broader attention to their research and application.
The primary objective of Chinese spelling correction (CSC) is to detect and correct erroneous characters in Chinese text, which can result from various factors, such as inaccuracies in pinyin representation, character resemblance, and semantic discrepancies. However, existing methods often struggle to fully address these types of errors, which reduces overall correction accuracy. This paper introduces a multi-modal feature encoder designed to efficiently extract features from three distinct modalities: pinyin, semantics, and character morphology. Unlike previous methods that rely on direct fusion or fixed-weight summation to integrate multi-modal information, our approach employs a multi-head attention mechanism to focus more on relevant modal information while disregarding less pertinent data. To prevent issues such as gradient explosion or vanishing, the model incorporates a residual connection of the original text vector for fine-tuning. This approach ensures robust model performance by maintaining essential linguistic details throughout the correction process. Experimental evaluations on the SIGHAN benchmark dataset demonstrate that the proposed model outperforms baseline approaches across various metrics and datasets, confirming its effectiveness and feasibility.
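The contrast between fixed-weight summation and attention-based fusion with a residual connection can be sketched as follows. This is a single-head simplification of the multi-head attention the abstract mentions, with no learned projection matrices. The function names and toy vectors are hypothetical and are not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attention_fuse(text_vec, modal_vecs):
    """Fuse modality vectors with scaled dot-product attention.

    The original text vector acts as the query; each modality vector is
    both key and value. The attended vector is added back to the text
    vector (residual connection). Returns (fused_vector, weights).
    """
    scale = math.sqrt(len(text_vec))
    weights = softmax([dot(text_vec, m) / scale for m in modal_vecs])
    attended = [
        sum(w * m[i] for w, m in zip(weights, modal_vecs))
        for i in range(len(text_vec))
    ]
    fused = [t + a for t, a in zip(text_vec, attended)]
    return fused, weights

if __name__ == "__main__":
    text = [1.0, 0.0]
    # Hypothetical pinyin, semantic, and glyph feature vectors.
    pinyin, semantic, glyph = [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]
    fused, weights = attention_fuse(text, [pinyin, semantic, glyph])
    print(weights)  # the modality most similar to the text gets the largest weight
    print(fused)
```

Because the weights depend on the input, a modality that agrees with the text vector contributes more for that input, which a fixed-weight sum cannot do; the residual term keeps the original text representation intact regardless of the weights.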
Funding (Eu&CDs@p RHO composite study): supported by the National Natural Science Foundation of China (No. 22288101) and the 111 Project (No. B17020).
Funding (AMNF nanoprobe study): supported by the Natural Science Foundation of Jilin Province (No. SKL202302002); the Key Research and Development Project of the Jilin Provincial Science and Technology Department (No. 20210204142YY); the Science and Technology Development Program of Jilin Province (No. 2020122256JC); the Beijing Kechuang Medical Development Foundation Fund of China (No. KC2023-JX-0186BQ079); and the Talent Reserve Program (TRP) of the First Hospital of Jilin University (No. JDYY-TRP-2024007).
Funding (DSRTA traffic adaptation study): supported by the Science and Technology Project of State Grid Corporation of China under grant 52094021N010 (5400-202199534A-0-5-ZN).
Funding (MMC-AWD hybrid system study): supported by the Hebei Provincial Natural Science Foundation of China (Grant Nos. E2020203174, E2020203078); the S&T Program of Hebei Province of China (Grant No. 226Z2202G); and the Science Research Project of Hebei Provincial Education Department of China (Grant No. ZD2022029).
Funding: Co-supported by the National Key R&D Program of China (No. 2022YFB3402200), the National Natural Science Foundation of China (Nos. 12372123, 12272305 and 12372156), the Key Project of NSFC, China (Nos. 92271205, 12032018 and 12220101002), the Fundamental Research Funds for the Central Universities of China (No. G2022KY0606), and the Basic Research Program of China (No. JCKY2022603C016).
Abstract: This work evaluates the viability of a cutting-edge flexible wing prototype actuated by Shape Memory Alloy (SMA) wire actuators. Such flexible wings have garnered significant interest for their potential to enhance aerodynamic efficiency by mitigating noise and delaying flow separation. SMA actuators are particularly advantageous due to their superior power-to-weight ratio and adaptive response, making them increasingly favored in morphing aircraft applications. Our methodology begins with a detailed delineation of the fishbone camber morphing wing rib structure, followed by the construction of a multi-mode morphing wing segment through 3D-printed rib assembly. Comprehensive testing of the SMA wire actuators’ actuation capacity and efficiency was conducted to establish their operational parameters. Subsequent experimental analyses focused on the bi-directional and reciprocating morphing performance of the fishbone wing rib, which incorporates SMA wires on the upper and lower sides. These experiments confirmed the segment’s multi-mode morphing abilities. Aerodynamic assessments demonstrated that our design substantially improves the Lift-to-Drag ratio (L/D) compared to conventional rigid wings. Finally, two phases of flight tests demonstrated the feasibility of SMA as an aircraft actuator and the validity of flexible wing structures for adjusting the aircraft attitude, respectively.
Funding: This research was funded by the General Project of Philosophy and Social Science of Heilongjiang Province, Grant Number: 20SHB080.
Abstract: In recent years, efficiently and accurately identifying multi-modal fake news has become more challenging. First, multi-modal data provides more evidence, but not all of it is equally important. Second, social structure information has proven effective in fake news detection, and how to combine it while reducing noise is critical. Unfortunately, existing approaches fail to handle these problems. This paper proposes a multi-modal fake news detection framework based on Text-modal Dominance and fusing Multiple Multi-modal Cues (TD-MMC), which utilizes three valuable multi-modal clues: text-modal importance, text-image complementarity, and text-image inconsistency. TD-MMC is dominated by textual content and assisted by image information, while using social network information to enhance the text representation. To reduce interference from irrelevant social structure information, we use a unidirectional cross-modal attention mechanism to selectively learn the social structure’s features. A cross-modal attention mechanism is adopted to obtain text-image cross-modal features while retaining textual features to reduce the loss of important information. In addition, TD-MMC employs a new multi-modal loss to improve the model’s generalization ability. Extensive experiments have been conducted on two public real-world English and Chinese datasets, and the results show that our proposed model outperforms the state-of-the-art methods on classification evaluation metrics.
Funding: Project supported by the Heilongjiang Provincial Key Laboratory of Micro-nano Sensitive Devices and Systems, the Basic Research Project for Outstanding Young Teachers of Heilongjiang Province (YQJH2023128), and the Cultivation Project of Double First-class Initiative Discipline by Heilongjiang Province (LJGXCG2022-061).
Abstract: Lanthanum-doped double halide perovskite has attracted increasing interest due to its distinctive upconversion and near-infrared (NIR) luminous characteristics. Here, erbium ion (Er^(3+)) doped Cs_(2)(Na/Ag)BiCl_(6) microcrystals (MCs) were synthesized and proved to be one of the most promising candidates for optical thermometry. The enhancement of both the white light from self-trapped exciton emission and the NIR emission from the Er^(3+) ion in Cs_(2)AgBiCl_(6) microcrystals is caused by lattice distortion due to Na^(+) ion doping. Fluorescence intensity ratio and lifetime methods provide self-referenced and sensitive thermometry under 405 and/or 980 nm laser excitation at temperatures from 80 to 480 K. In addition, maximum relative and absolute sensitivities of 3.62%/K and 27//K can be achieved in the low to high temperature range under 980 and 405 nm laser co-excitation. Based on the experimental analysis, Er^(3+) doped Cs_(2)(Na/Ag)BiCl_(6) double perovskite is considered an ideal self-calibrating thermometric material due to its good long-term stability and multi-mode function of excitation and detection.
Funding: Construction Program of the Key Discipline of State Administration of Traditional Chinese Medicine of China (ZYYZDXK-2023069); Research Project of Shanghai Municipal Health Commission (2024QN018); Shanghai University of Traditional Chinese Medicine Science and Technology Development Program (23KFL005).
Abstract: Objective: To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods: Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using metrics including accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Further performance assessment was conducted through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results: A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, and F1 score = 0.933). Compared with model ablation, data ablation experiments, and various traditional machine learning models, the AWMDF model demonstrated superior performance. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion: The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
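The contrast between averaging and adaptive weighting at the decision level can be illustrated with a minimal sketch. This is not the AWMDF model itself: the abstract does not say how its weights are computed, so the softmax-over-logits weighting, the function names `softmax` and `fuse`, and the example probabilities are all assumptions for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

def fuse(modality_probs, weight_logits=None):
    """Fuse per-modality class probabilities at the decision level.

    modality_probs: array of shape (n_modalities, n_classes).
    weight_logits:  optional array of shape (n_modalities,). In a trained
                    model these would be learned; here they are supplied
                    by hand (hypothetical values). If None, plain averaging
                    is used.
    """
    probs = np.asarray(modality_probs, dtype=float)
    if weight_logits is None:
        weights = np.full(len(probs), 1.0 / len(probs))  # simple averaging
    else:
        weights = softmax(np.asarray(weight_logits, dtype=float))
    return weights @ probs

# Two hypothetical modalities that disagree on a binary prediction.
probs = [[0.6, 0.4],
         [0.2, 0.8]]
averaged = fuse(probs)
adaptive = fuse(probs, weight_logits=[np.log(3.0), 0.0])  # weights 0.75 / 0.25
```

With equal weights the second modality dominates the fused prediction. Raising the first modality's weight shifts the result toward its prediction, which is the behaviour an adaptive scheme would be expected to learn from data.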
Funding: Funded by Research Project, grant number BHQ090003000X03.
Abstract: Multi-modal Named Entity Recognition (MNER) aims to better identify meaningful textual entities by integrating information from images. Previous work has focused on extracting visual semantics at a fine-grained level, or on obtaining entity-related external knowledge from knowledge bases or Large Language Models (LLMs). However, these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches. In this paper, we present MMAVK, a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion, which aims to leverage the Multi-modal Large Language Model (MLLM) as an implicit knowledge base. It also extracts vision-based auxiliary knowledge from the image for more accurate and effective recognition. Specifically, we propose vision-based auxiliary knowledge generation, which guides the MLLM to extract external knowledge exclusively derived from images to aid entity recognition by designing target-specific prompts, thus avoiding redundant recognition and cognitive confusion caused by the simultaneous processing of image-text pairs. Furthermore, we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word embedding produced by the transformer-based encoder. Extensive experimental results demonstrate that MMAVK outperforms or equals the state-of-the-art methods on the two classical MNER datasets, even when the large models employed have significantly fewer parameters than other baselines.
Funding: Funded by Research Project, grant number BHQ090003000X03.
Abstract: Multi-modal knowledge graph completion (MMKGC) aims to complete missing entities or relations in multi-modal knowledge graphs, thereby discovering more previously unknown triples. Due to the continuous growth of data and knowledge and the limitations of data sources, the visual knowledge within knowledge graphs is generally of low quality, and some entities lack the visual modality altogether. Nevertheless, previous studies of MMKGC have primarily focused on how to facilitate modality interaction and fusion while neglecting the problems of low modality quality and missing modalities. In this case, mainstream MMKGC models only use pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion, which inevitably suffers from problems such as error propagation and increased uncertainty. To address these problems, we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation (MMCSD). Specifically, we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality. Moreover, we design multi-level visual semantic extraction and entity description generation, thereby further extracting entity semantics from structural triples and visual images. Meanwhile, we train a variational multi-modal auto-encoder and utilize a pre-trained multi-modal language model to complement the missing visual features. We conducted experiments on FB15K-237 and DB13K, and the results showed that MMCSD can effectively perform MMKGC and achieve state-of-the-art performance.
Funding: Supported by the Deanship of Research and Graduate Studies at King Khalid University under Small Research Project grant number RGP1/139/45.
Abstract: Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of the patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, the amalgamation of data from multiple modalities presents difficulties due to disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities, lacking the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across various modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. Initially, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Methods such as model pruning, quantization, and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, researchers have implemented hardware-aware acceleration strategies, including the use of TensorRT and ONNX-based model compression, to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
Funding: Partially supported by the National Natural Science Foundation of China under Grants 62471493 and 62402257 (for conceptualization and investigation); partially supported by the Natural Science Foundation of Shandong Province, China under Grants ZR2023LZH017, ZR2024MF066, and 2023QF025 (for formal analysis and validation); partially supported by the Open Foundation of Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences) under Grant 2023ZD010 (for methodology and model design); partially supported by the Russian Science Foundation (RSF) Project under Grant 22-71-10095-P (for validation and results verification).
Abstract: To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising from modal heterogeneity during fusion, while also capturing shared information across modalities, this paper proposes a Multi-modal Pre-synergistic Entity Alignment model based on Cross-modal Mutual Information Strategy Optimization (MPSEA). The model first employs independent encoders to process multi-modal features, including text, images, and numerical values. Next, a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information. This pre-fusion strategy enables unified perception of heterogeneous modalities at the model’s initial stage, reducing discrepancies during the fusion process. Finally, using cross-modal deep perception reinforcement learning, the model achieves adaptive multilevel feature fusion between modalities, supporting the learning of more effective alignment strategies. Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset, and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset, compared to existing state-of-the-art methods. These results confirm the effectiveness of the proposed model.
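The two metrics quoted above, Hits@1 and MRR, are standard ranking measures for entity alignment. The sketch below shows how they are usually computed from a similarity matrix. The function names and the toy scores are illustrative assumptions, and the paper's own evaluation protocol (for example, how ties are handled) may differ.

```python
def ranks_from_scores(scores, gold):
    """Rank of the gold candidate for each source entity.

    scores[i][j] is the similarity between source entity i and candidate j;
    gold[i] is the index of the correct candidate for source entity i.
    Rank = 1 + number of candidates scoring strictly higher than the gold one.
    """
    return [1 + sum(s > row[g] for s in row) for row, g in zip(scores, gold)]

def hits_at_k(ranks, k=1):
    """Fraction of queries whose gold candidate is ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Toy similarity matrix (hypothetical values) for three source entities.
scores = [[0.9, 0.1, 0.2],
          [0.3, 0.8, 0.5],
          [0.6, 0.7, 0.2]]
gold = [0, 2, 0]

ranks = ranks_from_scores(scores, gold)   # [1, 2, 2]
hits1 = hits_at_k(ranks, k=1)             # 1/3
mrr = mean_reciprocal_rank(ranks)         # 2/3
```

Hits@1 only rewards exact top-ranked matches, while MRR also gives partial credit when the correct entity is ranked close to the top, which is why the two are typically reported together.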
Abstract: Traditional Chinese medicine (TCM) demonstrates distinctive advantages in disease prevention and treatment. However, analyzing its biological mechanisms through the modern medical research paradigm of “single drug, single target” presents significant challenges due to its holistic approach. Network pharmacology and its core theory of network targets connect drugs and diseases from a holistic and systematic perspective based on biological networks, overcoming the limitations of reductionist research models and showing considerable value in TCM research. Recent integration of network target computational and experimental methods with artificial intelligence (AI) and multi-modal multi-omics technologies has substantially enhanced network pharmacology methodology. The advancement in computational and experimental techniques provides complementary support for network target theory in decoding TCM principles. This review, centered on network targets, examines the progress of network target methods combined with AI in predicting disease molecular mechanisms and drug-target relationships, alongside the application of multi-modal multi-omics technologies in analyzing TCM formulae, syndromes, and toxicity. Looking forward, network target theory is expected to incorporate emerging technologies while developing novel approaches aligned with its unique characteristics, potentially leading to significant breakthroughs in TCM research and advancing scientific understanding and innovation in TCM.
Funding: Funded by the Yangtze River Delta Science and Technology Innovation Community Joint Research Project (2023CSJGG1600), the Natural Science Foundation of Anhui Province (2208085MF173), and the Wuhu “ChiZhu Light” Major Science and Technology Project (2023ZD01, 2023ZD03).
Abstract: As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly being used to detect 3D environmental information, significantly advancing the development of perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular detection and binocular detection according to input type. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to the technical pathway used to process point cloud features. Additionally, multimodal fusion algorithms are divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods according to the types of sensors involved. Furthermore, the paper identifies five key future research directions in this field, aiming to provide insights for researchers engaged in multimodal fusion-based object detection algorithms and to encourage broader attention to the research and application of multimodal fusion-based object detection.
Funding: Supported by the National Natural Science Foundation of China (No. 61472256, 61170277) and the Hujiang Foundation (No. A14006).
Abstract: The primary objective of Chinese spelling correction (CSC) is to detect and correct erroneous characters in Chinese text, which can result from various factors, such as inaccuracies in pinyin representation, character resemblance, and semantic discrepancies. However, existing methods often struggle to fully address these types of errors, impacting the overall correction accuracy. This paper introduces a multi-modal feature encoder designed to efficiently extract features from three distinct modalities: pinyin, semantics, and character morphology. Unlike previous methods that rely on direct fusion or fixed-weight summation to integrate multi-modal information, our approach employs a multi-head attention mechanism to focus more on relevant modal information while disregarding less pertinent data. To prevent issues such as gradient explosion or vanishing, the model incorporates a residual connection of the original text vector for fine-tuning. This approach ensures robust model performance by maintaining essential linguistic details throughout the correction process. Experimental evaluations on the SIGHAN benchmark dataset demonstrate that the proposed model outperforms baseline approaches across various metrics and datasets, confirming its effectiveness and feasibility.
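The idea of attending over modality features and adding a residual connection back to the original text vector can be sketched as follows. This is a simplified, single-head scaled dot-product version without learned projections, not the paper's encoder; a multi-head variant would split the feature dimension into several heads and apply the same step to each. The function name `attend_over_modalities` and the toy vectors are assumptions for illustration.

```python
import numpy as np

def attend_over_modalities(text_vec, modal_feats):
    """Attention-weighted fusion of modality features with a residual link.

    text_vec:    (d,) original text vector, used as the attention query.
    modal_feats: (m, d) one feature vector per modality
                 (e.g. pinyin, semantics, character morphology).
    Returns the fused vector and the attention weights over modalities.
    """
    text_vec = np.asarray(text_vec, dtype=float)
    modal_feats = np.asarray(modal_feats, dtype=float)
    d = text_vec.shape[0]
    # Scaled dot-product scores between the query and each modality.
    scores = modal_feats @ text_vec / np.sqrt(d)
    # Softmax turns scores into weights that sum to 1.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    context = weights @ modal_feats
    # Residual connection: keep the original text vector in the output.
    return text_vec + context, weights

# Hypothetical 2-D features for three modalities.
text_vec = np.array([1.0, 0.0])
modal_feats = np.array([[1.0, 0.0],
                        [0.0, 1.0],
                        [0.0, 0.0]])
fused, weights = attend_over_modalities(text_vec, modal_feats)
```

Unlike fixed-weight summation, the weights here depend on the input, so the modality most aligned with the text vector receives the largest share. The residual term keeps the original text information available even when the attention weights are poorly calibrated.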