At present,most experimental teaching systems lack guidance of an operator,and thus users often do not know what to do during an experiment.The user load is therefore increased,and the learning efficiency of the stude...At present,most experimental teaching systems lack guidance of an operator,and thus users often do not know what to do during an experiment.The user load is therefore increased,and the learning efficiency of the students is decreased.To solve the problem of insufficient system interactivity and guidance,an experimental navigation system based on multi-mode fusion is proposed in this paper.The system first obtains user information by sensing the hardware devices,intelligently perceives the user intention and progress of the experiment according to the information acquired,and finally carries out a multi-modal intelligent navigation process for users.As an innovative aspect of this study,an intelligent multi-mode navigation system is used to guide users in conducting experiments,thereby reducing the user load and enabling the users to effectively complete their experiments.The results prove that this system can guide users in completing their experiments,and can effectively reduce the user load during the interaction process and improve the efficiency.展开更多
Objective To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data.Methods Clinical indicators,echocar...Objective To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data.Methods Clinical indicators,echocardiographic data,traditional Chinese medicine(TCM)tongue manifestations,and facial features were collected from patients who underwent coro-nary computed tomography angiography(CTA)in the Cardiac Care Unit(CCU)of Shanghai Tenth People's Hospital between May 1,2023 and May 1,2024.An adaptive weighted multi-modal data fusion(AWMDF)model based on deep learning was constructed to predict the severity of coronary artery stenosis.The model was evaluated using metrics including accura-cy,precision,recall,F1 score,and the area under the receiver operating characteristic(ROC)curve(AUC).Further performance assessment was conducted through comparisons with six ensemble machine learning methods,data ablation,model component ablation,and various decision-level fusion strategies.Results A total of 158 patients were included in the study.The AWMDF model achieved ex-cellent predictive performance(AUC=0.973,accuracy=0.937,precision=0.937,recall=0.929,and F1 score=0.933).Compared with model ablation,data ablation experiments,and various traditional machine learning models,the AWMDF model demonstrated superior per-formance.Moreover,the adaptive weighting strategy outperformed alternative approaches,including simple weighting,averaging,voting,and fixed-weight schemes.Conclusion The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.展开更多
As the number and complexity of sensors in autonomous vehicles continue to rise,multimodal fusionbased object detection algorithms are increasingly being used to detect 3D environmental information,significantly advan...As the number and complexity of sensors in autonomous vehicles continue to rise,multimodal fusionbased object detection algorithms are increasingly being used to detect 3D environmental information,significantly advancing the development of perception technology in autonomous driving.To further promote the development of fusion algorithms and improve detection performance,this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms.Starting fromsingle-modal sensor detection,the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds.For image-based detection methods,they are categorized into monocular detection and binocular detection based on different input types.For point cloud-based detection methods,they are classified into projection-based,voxel-based,point cluster-based,pillar-based,and graph structure-based approaches based on the technical pathways for processing point cloud features.Additionally,multimodal fusion algorithms are divided into Camera-LiDAR fusion,Camera-Radar fusion,Camera-LiDAR-Radar fusion,and other sensor fusion methods based on the types of sensors involved.Furthermore,the paper identifies five key future research directions in this field,aiming to provide insights for researchers engaged in multimodal fusion-based object detection algorithms and to encourage broader attention to the research and application of multimodal fusion-based object detection.展开更多
Multi-modal Named Entity Recognition(MNER)aims to better identify meaningful textual entities by integrating information from images.Previous work has focused on extracting visual semantics at a fine-grained level,or ...Multi-modal Named Entity Recognition(MNER)aims to better identify meaningful textual entities by integrating information from images.Previous work has focused on extracting visual semantics at a fine-grained level,or obtaining entity related external knowledge from knowledge bases or Large Language Models(LLMs).However,these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches.In this paper,we present MMAVK,a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion,which aims to leverage the Multi-modal Large Language Model(MLLM)as an implicit knowledge base.It also extracts vision-based auxiliary knowledge from the image formore accurate and effective recognition.Specifically,we propose vision-based auxiliary knowledge generation,which guides the MLLM to extract external knowledge exclusively derived from images to aid entity recognition by designing target-specific prompts,thus avoiding redundant recognition and cognitive confusion caused by the simultaneous processing of image-text pairs.Furthermore,we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word-embedding embedded from the transformerbased encoder.Extensive experimental results demonstrate that MMAVK outperforms or equals the state-of-the-art methods on the two classical MNER datasets,even when the largemodels employed have significantly fewer parameters than other baselines.展开更多
To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising frommodal heterogeneity during fusion,while also capturing shared information acrossmodalities...To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising frommodal heterogeneity during fusion,while also capturing shared information acrossmodalities,this paper proposes a Multi-modal Pre-synergistic Entity Alignmentmodel based on Cross-modalMutual Information Strategy Optimization(MPSEA).The model first employs independent encoders to process multi-modal features,including text,images,and numerical values.Next,a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information.This pre-fusion strategy enables unified perception of heterogeneous modalities at the model’s initial stage,reducing discrepancies during the fusion process.Finally,using cross-modal deep perception reinforcement learning,the model achieves adaptive multilevel feature fusion between modalities,supporting learningmore effective alignment strategies.Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset,and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset,compared to existing state-of-the-art methods.These results confirm the effectiveness of the proposed model.展开更多
To address the difficulties in fusing multi-mode sensor data for complex industrial machinery, an adaptive deep coupling convolutional auto-encoder (ADCCAE) fusion method was proposed. First, the multi-mode features e...To address the difficulties in fusing multi-mode sensor data for complex industrial machinery, an adaptive deep coupling convolutional auto-encoder (ADCCAE) fusion method was proposed. First, the multi-mode features extracted synchronously by the CCAE were stacked and fed to the multi-channel convolution layers for fusion. Then, the fused data was passed to all connection layers for compression and fed to the Softmax module for classification. Finally, the coupling loss function coefficients and the network parameters were optimized through an adaptive approach using the gray wolf optimization (GWO) algorithm. Experimental comparisons showed that the proposed ADCCAE fusion model was superior to existing models for multi-mode data fusion.展开更多
Carbon dots(CDs)-based composites have shown impressive performance in fields of information encryption and sensing,however,a great challenge is to simultaneously implement multi-mode luminescence and room-temperature...Carbon dots(CDs)-based composites have shown impressive performance in fields of information encryption and sensing,however,a great challenge is to simultaneously implement multi-mode luminescence and room-temperature phosphorescence(RTP)detection in single system due to the formidable synthesis.Herein,a multifunctional composite of Eu&CDs@p RHO has been designed by co-assembly strategy and prepared via a facile calcination and impregnation treatment.Eu&CDs@p RHO exhibits intense fluorescence(FL)and RTP coming from two individual luminous centers,Eu3+in the free pores and CDs in the interrupted structure of RHO zeolite.Unique four-mode color outputs including pink(Eu^(3+),ex.254 nm),light violet(CDs,ex.365 nm),blue(CDs,254 nm off),and green(CDs,365 nm off)could be realized,on the basis of it,a preliminary application of advanced information encoding has been demonstrated.Given the free pores of matrix and stable RTP in water of confined CDs,a visual RTP detection of Fe^(3+)ions is achieved with the detection limit as low as 9.8μmol/L.This work has opened up a new perspective for the strategic amalgamation of luminous vips with porous zeolite to construct the advanced functional materials.展开更多
Prostate cancer(PCa)is characterized by high incidence and propensity for easy metastasis,presenting significant challenges in clinical diagnosis and treatment.Tumor microenvironment(TME)-responsive nanomaterials prov...Prostate cancer(PCa)is characterized by high incidence and propensity for easy metastasis,presenting significant challenges in clinical diagnosis and treatment.Tumor microenvironment(TME)-responsive nanomaterials provide a promising prospect for imaging-guided precision therapy.Considering that tumor-derived alkaline phosphatase(ALP)is over-expressed in metastatic PCa,it makes a great chance to develop a theranostics system with ALP responsive in the TME.Herein,an ALP-responsive aggregationinduced emission luminogens(AIEgens)nanoprobe AMNF self-assembly was designed for enhancing the diagnosis and treatment of metastatic PCa.The nanoprobe exhibited self-aggregation in the presence of ALP resulted in aggregation-induced fluorescence,and enhanced accumulation and prolonged retention period at the tumor site.In terms of detection,the fluorescence(FL)/computed tomography(CT)/magnetic resonance(MR)multi-mode imaging effect of nanoprobe was significantly improved post-aggregation,enabling precise diagnosis through the amalgamation of multiple imaging modes.Enhanced CT/MR imaging can achieve assist preoperative tumor diagnosis,and enhanced FL imaging technology can achieve“intraoperative visual navigation”,showing its potential application value in clinical tumor detection and surgical guidance.In terms of treatment,AMNF showed strong absorption in the near infrared region after aggregation,which improved the photothermal treatment effect.Overall,our work developed an effective aggregation-enhanced theranostic strategy for ALP-related cancers.展开更多
Low-carbon smart parks achieve selfbalanced carbon emission and absorption through the cooperative scheduling of direct current(DC)-based distributed photovoltaic,energy storage units,and loads.Direct current power li...Low-carbon smart parks achieve selfbalanced carbon emission and absorption through the cooperative scheduling of direct current(DC)-based distributed photovoltaic,energy storage units,and loads.Direct current power line communication(DC-PLC)enables real-time data transmission on DC power lines.With traffic adaptation,DC-PLC can be integrated with other complementary media such as 5G to reduce transmission delay and improve reliability.However,traffic adaptation for DC-PLC and 5G integration still faces the challenges such as coupling between traffic admission control and traffic partition,dimensionality curse,and the ignorance of extreme event occurrence.To address these challenges,we propose a deep reinforcement learning(DRL)-based delay sensitive and reliable traffic adaptation algorithm(DSRTA)to minimize the total queuing delay under the constraints of traffic admission control,queuing delay,and extreme events occurrence probability.DSRTA jointly optimizes traffic admission control and traffic partition,and enables learning-based intelligent traffic adaptation.The long-term constraints are incorporated into both state and bound of drift-pluspenalty to achieve delay awareness and enforce reliability guarantee.Simulation results show that DSRTA has lower queuing delay and more reliable quality of service(QoS)guarantee than other state-of-the-art algorithms.展开更多
Fault diagnosis of rolling bearings is crucial for ensuring the stable operation of mechanical equipment and production safety in industrial environments.However,due to the nonlinearity and non-stationarity of collect...Fault diagnosis of rolling bearings is crucial for ensuring the stable operation of mechanical equipment and production safety in industrial environments.However,due to the nonlinearity and non-stationarity of collected vibration signals,single-modal methods struggle to capture fault features fully.This paper proposes a rolling bearing fault diagnosis method based on multi-modal information fusion.The method first employs the Hippopotamus Optimization Algorithm(HO)to optimize the number of modes in Variational Mode Decomposition(VMD)to achieve optimal modal decomposition performance.It combines Convolutional Neural Networks(CNN)and Gated Recurrent Units(GRU)to extract temporal features from one-dimensional time-series signals.Meanwhile,the Markovian Transition Field(MTF)is used to transform one-dimensional signals into two-dimensional images for spatial feature mining.Through visualization techniques,the effectiveness of generated images from different parameter combinations is compared to determine the optimal parameter configuration.A multi-modal network(GSTCN)is constructed by integrating Swin-Transformer and the Convolutional Block Attention Module(CBAM),where the attention module is utilized to enhance fault features.Finally,the fault features extracted from different modalities are deeply fused and fed into a fully connected layer to complete fault classification.Experimental results show that the GSTCN model achieves an average diagnostic accuracy of 99.5%across three datasets,significantly outperforming existing comparison methods.This demonstrates that the proposed model has high diagnostic precision and good generalization ability,providing an efficient and reliable solution for rolling bearing fault diagnosis.展开更多
Multimodal Sentiment Analysis(SA)is gaining popularity due to its broad application potential.The existing studies have focused on the SA of single modalities,such as texts or photos,posing challenges in effectively h...Multimodal Sentiment Analysis(SA)is gaining popularity due to its broad application potential.The existing studies have focused on the SA of single modalities,such as texts or photos,posing challenges in effectively handling social media data with multiple modalities.Moreover,most multimodal research has concentrated on merely combining the two modalities rather than exploring their complex correlations,leading to unsatisfactory sentiment classification results.Motivated by this,we propose a new visualtextual sentiment classification model named Multi-Model Fusion(MMF),which uses a mixed fusion framework for SA to effectively capture the essential information and the intrinsic relationship between the visual and textual content.The proposed model comprises three deep neural networks.Two different neural networks are proposed to extract the most emotionally relevant aspects of image and text data.Thus,more discriminative features are gathered for accurate sentiment classification.Then,a multichannel joint fusion modelwith a self-attention technique is proposed to exploit the intrinsic correlation between visual and textual characteristics and obtain emotionally rich information for joint sentiment classification.Finally,the results of the three classifiers are integrated using a decision fusion scheme to improve the robustness and generalizability of the proposed model.An interpretable visual-textual sentiment classification model is further developed using the Local Interpretable Model-agnostic Explanation model(LIME)to ensure the model’s explainability and resilience.The proposed MMF model has been tested on four real-world sentiment datasets,achieving(99.78%)accuracy on Binary_Getty(BG),(99.12%)on Binary_iStock(BIS),(95.70%)on Twitter,and(79.06%)on the Multi-View Sentiment Analysis(MVSA)dataset.These results demonstrate the superior performance of our MMF model compared to single-model approaches and current state-of-the-art techniques based on model evaluation criteria.展开更多
A dual-phase synergistic enhancement method was adopted to strengthen the Al-Mn-Mg-Sc-Zr alloy fabricated by laser powder bed fusion(LPBF)by leveraging the unique advantages of Er and TiB_(2).Spherical powders of 0.5w...A dual-phase synergistic enhancement method was adopted to strengthen the Al-Mn-Mg-Sc-Zr alloy fabricated by laser powder bed fusion(LPBF)by leveraging the unique advantages of Er and TiB_(2).Spherical powders of 0.5wt%Er-1wt%TiB_(2)/Al-Mn-Mg-Sc-Zr nanocomposite were prepared using vacuum homogenization technique,and the density of samples prepared through the LPBF process reached 99.8%.The strengthening and toughening mechanisms of Er-TiB_(2)were investigated.The results show that Al_(3)Er diffraction peaks are detected by X-ray diffraction analysis,and texture strength decreases according to electron backscatter diffraction results.The added Er and TiB_(2)nano-reinforcing phases act as heterogeneous nucleation sites during the LPBF forming process,hindering grain growth and effectively refining the grains.After incorporating the Er-TiB_(2)dual-phase nano-reinforcing phases,the tensile strength and elongation at break of the LPBF-deposited samples reach 550 MPa and 18.7%,which are 13.4%and 26.4%higher than those of the matrix material,respectively.展开更多
BACKGROUND Lumbar interbody fusion(LIF)is the primary treatment for lumbar degenerative diseases.Elderly patients are prone to anxiety and depression after undergoing surgery,which affects their postoperative recovery...BACKGROUND Lumbar interbody fusion(LIF)is the primary treatment for lumbar degenerative diseases.Elderly patients are prone to anxiety and depression after undergoing surgery,which affects their postoperative recovery speed and quality of life.Effective prevention of anxiety and depression in elderly patients has become an urgent problem.AIM To investigate the trajectory of anxiety and depression levels in elderly patients after LIF,and the influencing factors.METHODS Random sampling was used to select 239 elderly patients who underwent LIF from January 2020 to December 2024 in Shenzhen Pingle Orthopedic Hospital.General information and surgery-related indices were recorded,and participants completed measures of psychological status,lumbar spine dysfunction,and quality of life.A latent class growth model was used to analyze the post-LIF trajectory of anxiety and depression levels,and unordered multi-categorical logistic regression was used to analyze the influencing factors.RESULTS Three trajectories of change in anxiety level were identified:Increasing anxiety(n=26,10.88%),decreasing anxiety(n=27,11.30%),and stable anxiety(n=186,77.82%).Likewise,three trajectories of change in depression level were identified:Increasing depression(n=30,12.55%),decreasing depression(n=26,10.88%),and stable depression(n=183,76.57%).Regression analysis showed that having no partner,female sex,elevated Oswestry dysfunction index(ODI)scores,and reduced 36-Item Short Form Health Survey scores all contributed to increased anxiety levels,whereas female sex,postoperative opioid use,and elevated ODI scores all contributed to increased depression levels.CONCLUSION During clinical observation,combining factors to predict anxiety and depression in post-LIF elderly patients enables timely intervention,quickens recovery,and enhances quality of life.展开更多
BACKGROUND Salvage of the infected long stem revision total knee arthroplasty is challenging due to the presence of well-fixed ingrown or cemented stems.Reconstructive options are limited.Above knee amputation(AKA)is ...BACKGROUND Salvage of the infected long stem revision total knee arthroplasty is challenging due to the presence of well-fixed ingrown or cemented stems.Reconstructive options are limited.Above knee amputation(AKA)is often recommended.We present a surgical technique that was successfully used on four such patients to convert them to a knee fusion(KF)using a cephalomedullary nail.CASE SUMMARY Four patients with infected long stem revision knee replacements that refused AKA had a single stage removal of their infected revision total knee followed by a KF.They were all treated with a statically locked antegrade cephalomedullary fusion nail,augmented with antibiotic impregnated bone cement.All patients had successful limb salvage and were ambulatory with assistive devices at the time of last follow-up.All were infection free at an average follow-up of 25.5 months(range 16-31).CONCLUSION Single stage cephalomedullary nailing can result in a successful KF in patients with infected long stem revision total knees.展开更多
Impact craters are important for understanding the evolution of lunar geologic and surface erosion rates,among other functions.However,the morphological characteristics of these micro impact craters are not obvious an...Impact craters are important for understanding the evolution of lunar geologic and surface erosion rates,among other functions.However,the morphological characteristics of these micro impact craters are not obvious and they are numerous,resulting in low detection accuracy by deep learning models.Therefore,we proposed a new multi-scale fusion crater detection algorithm(MSF-CDA)based on the YOLO11 to improve the accuracy of lunar impact crater detection,especially for small craters with a diameter of<1 km.Using the images taken by the LROC(Lunar Reconnaissance Orbiter Camera)at the Chang’e-4(CE-4)landing area,we constructed three separate datasets for craters with diameters of 0-70 m,70-140 m,and>140 m.We then trained three submodels separately with these three datasets.Additionally,we designed a slicing-amplifying-slicing strategy to enhance the ability to extract features from small craters.To handle redundant predictions,we proposed a new Non-Maximum Suppression with Area Filtering method to fuse the results in overlapping targets within the multi-scale submodels.Finally,our new MSF-CDA method achieved high detection performance,with the Precision,Recall,and F1 score having values of 0.991,0.987,and 0.989,respectively,perfectly addressing the problems induced by the lesser features and sample imbalance of small craters.Our MSF-CDA can provide strong data support for more in-depth study of the geological evolution of the lunar surface and finer geological age estimations.This strategy can also be used to detect other small objects with lesser features and sample imbalance problems.We detected approximately 500,000 impact craters in an area of approximately 214 km2 around the CE-4 landing area.By statistically analyzing the new data,we updated the distribution function of the number and diameter of impact craters.Finally,we identified the most suitable lighting conditions for detecting impact crater targets by analyzing the effect of different lighting conditions on the detection accuracy.展开更多
This paper considers the problem of time varying congestion pricing to determine optimal time-varying tolls at peak periods for a queuing network with the interactions between buses and private cars.Through the combin...This paper considers the problem of time varying congestion pricing to determine optimal time-varying tolls at peak periods for a queuing network with the interactions between buses and private cars.Through the combined applications of the space-time expanded network(STEN) and the conventional network equilibrium modeling techniques,a multi-class,multi-mode and multi-criteria traffic network equilibrium model is developed.Travelers of different classes have distinctive value of times(VOTs),and travelers from the same class perceive their travel disutility or generalized costs on a route according to different weights of travel time and travel costs.Moreover,the symmetric cost function model is extended to deal with the interactions between buses and private cars.It is found that there exists a uniform(anonymous) link toll pattern which can drive a multi-class,multi-mode and multi-criteria user equilibrium flow pattern to a system optimum when the system's objective function is measured in terms of money.It is also found that the marginal cost pricing models with a symmetric travel cost function do not reflect the interactions between traffic flows of different road sections,and the obtained congestion pricing toll is smaller than the real value.展开更多
Multi-modal fusion technology gradually become a fundamental task in many fields,such as autonomous driving,smart healthcare,sentiment analysis,and human-computer interaction.It is rapidly becoming the dominant resear...Multi-modal fusion technology gradually become a fundamental task in many fields,such as autonomous driving,smart healthcare,sentiment analysis,and human-computer interaction.It is rapidly becoming the dominant research due to its powerful perception and judgment capabilities.Under complex scenes,multi-modal fusion technology utilizes the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions.However,achieving outstanding performance is challenging because of equipment performance limitations,missing information,and data noise.This paper comprehensively reviews existing methods based onmulti-modal fusion techniques and completes a detailed and in-depth analysis.According to the data fusion stage,multi-modal fusion has four primary methods:early fusion,deep fusion,late fusion,and hybrid fusion.The paper surveys the three majormulti-modal fusion technologies that can significantly enhance the effect of data fusion and further explore the applications of multi-modal fusion technology in various fields.Finally,it discusses the challenges and explores potential research opportunities.Multi-modal tasks still need intensive study because of data heterogeneity and quality.Preserving complementary information and eliminating redundant information between modalities is critical in multi-modal technology.Invalid data fusion methods may introduce extra noise and lead to worse results.This paper provides a comprehensive and detailed summary in response to these challenges.展开更多
基金the the National Key R&D Program of China(No.2018YFB1004901)the Independent Innovation Team Project of Jinan City(No.2019GXRC013).
文摘At present,most experimental teaching systems lack guidance of an operator,and thus users often do not know what to do during an experiment.The user load is therefore increased,and the learning efficiency of the students is decreased.To solve the problem of insufficient system interactivity and guidance,an experimental navigation system based on multi-mode fusion is proposed in this paper.The system first obtains user information by sensing the hardware devices,intelligently perceives the user intention and progress of the experiment according to the information acquired,and finally carries out a multi-modal intelligent navigation process for users.As an innovative aspect of this study,an intelligent multi-mode navigation system is used to guide users in conducting experiments,thereby reducing the user load and enabling the users to effectively complete their experiments.The results prove that this system can guide users in completing their experiments,and can effectively reduce the user load during the interaction process and improve the efficiency.
基金Construction Program of the Key Discipline of State Administration of Traditional Chinese Medicine of China(ZYYZDXK-2023069)Research Project of Shanghai Municipal Health Commission (2024QN018)Shanghai University of Traditional Chinese Medicine Science and Technology Development Program (23KFL005)。
文摘Objective To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data.Methods Clinical indicators,echocardiographic data,traditional Chinese medicine(TCM)tongue manifestations,and facial features were collected from patients who underwent coro-nary computed tomography angiography(CTA)in the Cardiac Care Unit(CCU)of Shanghai Tenth People's Hospital between May 1,2023 and May 1,2024.An adaptive weighted multi-modal data fusion(AWMDF)model based on deep learning was constructed to predict the severity of coronary artery stenosis.The model was evaluated using metrics including accura-cy,precision,recall,F1 score,and the area under the receiver operating characteristic(ROC)curve(AUC).Further performance assessment was conducted through comparisons with six ensemble machine learning methods,data ablation,model component ablation,and various decision-level fusion strategies.Results A total of 158 patients were included in the study.The AWMDF model achieved ex-cellent predictive performance(AUC=0.973,accuracy=0.937,precision=0.937,recall=0.929,and F1 score=0.933).Compared with model ablation,data ablation experiments,and various traditional machine learning models,the AWMDF model demonstrated superior per-formance.Moreover,the adaptive weighting strategy outperformed alternative approaches,including simple weighting,averaging,voting,and fixed-weight schemes.Conclusion The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
基金funded by the Yangtze River Delta Science and Technology Innovation Community Joint Research Project(2023CSJGG1600)the Natural Science Foundation of Anhui Province(2208085MF173)Wuhu“ChiZhu Light”Major Science and Technology Project(2023ZD01,2023ZD03).
文摘As the number and complexity of sensors in autonomous vehicles continue to rise,multimodal fusionbased object detection algorithms are increasingly being used to detect 3D environmental information,significantly advancing the development of perception technology in autonomous driving.To further promote the development of fusion algorithms and improve detection performance,this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms.Starting fromsingle-modal sensor detection,the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds.For image-based detection methods,they are categorized into monocular detection and binocular detection based on different input types.For point cloud-based detection methods,they are classified into projection-based,voxel-based,point cluster-based,pillar-based,and graph structure-based approaches based on the technical pathways for processing point cloud features.Additionally,multimodal fusion algorithms are divided into Camera-LiDAR fusion,Camera-Radar fusion,Camera-LiDAR-Radar fusion,and other sensor fusion methods based on the types of sensors involved.Furthermore,the paper identifies five key future research directions in this field,aiming to provide insights for researchers engaged in multimodal fusion-based object detection algorithms and to encourage broader attention to the research and application of multimodal fusion-based object detection.
基金funded by Research Project,grant number BHQ090003000X03.
文摘Multi-modal Named Entity Recognition(MNER)aims to better identify meaningful textual entities by integrating information from images.Previous work has focused on extracting visual semantics at a fine-grained level,or obtaining entity related external knowledge from knowledge bases or Large Language Models(LLMs).However,these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches.In this paper,we present MMAVK,a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion,which aims to leverage the Multi-modal Large Language Model(MLLM)as an implicit knowledge base.It also extracts vision-based auxiliary knowledge from the image formore accurate and effective recognition.Specifically,we propose vision-based auxiliary knowledge generation,which guides the MLLM to extract external knowledge exclusively derived from images to aid entity recognition by designing target-specific prompts,thus avoiding redundant recognition and cognitive confusion caused by the simultaneous processing of image-text pairs.Furthermore,we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word-embedding embedded from the transformerbased encoder.Extensive experimental results demonstrate that MMAVK outperforms or equals the state-of-the-art methods on the two classical MNER datasets,even when the largemodels employed have significantly fewer parameters than other baselines.
基金partially supported by the National Natural Science Foundation of China under Grants 62471493 and 62402257(for conceptualization and investigation)partially supported by the Natural Science Foundation of Shandong Province,China under Grants ZR2023LZH017,ZR2024MF066,and 2023QF025(for formal analysis and validation)+1 种基金partially supported by the Open Foundation of Key Laboratory of Computing Power Network and Information Security,Ministry of Education,Qilu University of Technology(Shandong Academy of Sciences)under Grant 2023ZD010(for methodology and model design)partially supported by the Russian Science Foundation(RSF)Project under Grant 22-71-10095-P(for validation and results verification).
文摘To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising frommodal heterogeneity during fusion,while also capturing shared information acrossmodalities,this paper proposes a Multi-modal Pre-synergistic Entity Alignmentmodel based on Cross-modalMutual Information Strategy Optimization(MPSEA).The model first employs independent encoders to process multi-modal features,including text,images,and numerical values.Next,a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information.This pre-fusion strategy enables unified perception of heterogeneous modalities at the model’s initial stage,reducing discrepancies during the fusion process.Finally,using cross-modal deep perception reinforcement learning,the model achieves adaptive multilevel feature fusion between modalities,supporting learningmore effective alignment strategies.Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset,and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset,compared to existing state-of-the-art methods.These results confirm the effectiveness of the proposed model.
文摘To address the difficulties in fusing multi-mode sensor data for complex industrial machinery, an adaptive deep coupling convolutional auto-encoder (ADCCAE) fusion method was proposed. First, the multi-mode features extracted synchronously by the CCAE were stacked and fed to the multi-channel convolution layers for fusion. Then, the fused data was passed to all connection layers for compression and fed to the Softmax module for classification. Finally, the coupling loss function coefficients and the network parameters were optimized through an adaptive approach using the gray wolf optimization (GWO) algorithm. Experimental comparisons showed that the proposed ADCCAE fusion model was superior to existing models for multi-mode data fusion.
基金supported by the National Natural Science Foundation of China(No.22288101)the 111 Project(No.B17020)。
文摘Carbon dots(CDs)-based composites have shown impressive performance in fields of information encryption and sensing,however,a great challenge is to simultaneously implement multi-mode luminescence and room-temperature phosphorescence(RTP)detection in single system due to the formidable synthesis.Herein,a multifunctional composite of Eu&CDs@p RHO has been designed by co-assembly strategy and prepared via a facile calcination and impregnation treatment.Eu&CDs@p RHO exhibits intense fluorescence(FL)and RTP coming from two individual luminous centers,Eu3+in the free pores and CDs in the interrupted structure of RHO zeolite.Unique four-mode color outputs including pink(Eu^(3+),ex.254 nm),light violet(CDs,ex.365 nm),blue(CDs,254 nm off),and green(CDs,365 nm off)could be realized,on the basis of it,a preliminary application of advanced information encoding has been demonstrated.Given the free pores of matrix and stable RTP in water of confined CDs,a visual RTP detection of Fe^(3+)ions is achieved with the detection limit as low as 9.8μmol/L.This work has opened up a new perspective for the strategic amalgamation of luminous vips with porous zeolite to construct the advanced functional materials.
基金supported by Natural Science Foundation of Jilin Province(No.SKL202302002)Key Research and Development project of Jilin Provincial Science and Technology Department(No.20210204142YY)+2 种基金The Science and Technology Development Program of Jilin Province(No.2020122256JC)Beijing Kechuang Medical Development Foundation Fund of China(No.KC2023-JX-0186BQ079)Talent Reserve Program(TRP),the First Hospital of Jilin University(No.JDYY-TRP-2024007)。
文摘Prostate cancer(PCa)is characterized by high incidence and propensity for easy metastasis,presenting significant challenges in clinical diagnosis and treatment.Tumor microenvironment(TME)-responsive nanomaterials provide a promising prospect for imaging-guided precision therapy.Considering that tumor-derived alkaline phosphatase(ALP)is over-expressed in metastatic PCa,it makes a great chance to develop a theranostics system with ALP responsive in the TME.Herein,an ALP-responsive aggregationinduced emission luminogens(AIEgens)nanoprobe AMNF self-assembly was designed for enhancing the diagnosis and treatment of metastatic PCa.The nanoprobe exhibited self-aggregation in the presence of ALP resulted in aggregation-induced fluorescence,and enhanced accumulation and prolonged retention period at the tumor site.In terms of detection,the fluorescence(FL)/computed tomography(CT)/magnetic resonance(MR)multi-mode imaging effect of nanoprobe was significantly improved post-aggregation,enabling precise diagnosis through the amalgamation of multiple imaging modes.Enhanced CT/MR imaging can achieve assist preoperative tumor diagnosis,and enhanced FL imaging technology can achieve“intraoperative visual navigation”,showing its potential application value in clinical tumor detection and surgical guidance.In terms of treatment,AMNF showed strong absorption in the near infrared region after aggregation,which improved the photothermal treatment effect.Overall,our work developed an effective aggregation-enhanced theranostic strategy for ALP-related cancers.
基金supported by the Science and Technology Project of State Grid Corporation of China under grant 52094021N010(5400-202199534A-0-5-ZN)。
文摘Low-carbon smart parks achieve selfbalanced carbon emission and absorption through the cooperative scheduling of direct current(DC)-based distributed photovoltaic,energy storage units,and loads.Direct current power line communication(DC-PLC)enables real-time data transmission on DC power lines.With traffic adaptation,DC-PLC can be integrated with other complementary media such as 5G to reduce transmission delay and improve reliability.However,traffic adaptation for DC-PLC and 5G integration still faces the challenges such as coupling between traffic admission control and traffic partition,dimensionality curse,and the ignorance of extreme event occurrence.To address these challenges,we propose a deep reinforcement learning(DRL)-based delay sensitive and reliable traffic adaptation algorithm(DSRTA)to minimize the total queuing delay under the constraints of traffic admission control,queuing delay,and extreme events occurrence probability.DSRTA jointly optimizes traffic admission control and traffic partition,and enables learning-based intelligent traffic adaptation.The long-term constraints are incorporated into both state and bound of drift-pluspenalty to achieve delay awareness and enforce reliability guarantee.Simulation results show that DSRTA has lower queuing delay and more reliable quality of service(QoS)guarantee than other state-of-the-art algorithms.
基金funded by the Jilin Provincial Department of Science and Technology,grant number 20230101208JC.
文摘Fault diagnosis of rolling bearings is crucial for ensuring the stable operation of mechanical equipment and production safety in industrial environments.However,due to the nonlinearity and non-stationarity of collected vibration signals,single-modal methods struggle to capture fault features fully.This paper proposes a rolling bearing fault diagnosis method based on multi-modal information fusion.The method first employs the Hippopotamus Optimization Algorithm(HO)to optimize the number of modes in Variational Mode Decomposition(VMD)to achieve optimal modal decomposition performance.It combines Convolutional Neural Networks(CNN)and Gated Recurrent Units(GRU)to extract temporal features from one-dimensional time-series signals.Meanwhile,the Markovian Transition Field(MTF)is used to transform one-dimensional signals into two-dimensional images for spatial feature mining.Through visualization techniques,the effectiveness of generated images from different parameter combinations is compared to determine the optimal parameter configuration.A multi-modal network(GSTCN)is constructed by integrating Swin-Transformer and the Convolutional Block Attention Module(CBAM),where the attention module is utilized to enhance fault features.Finally,the fault features extracted from different modalities are deeply fused and fed into a fully connected layer to complete fault classification.Experimental results show that the GSTCN model achieves an average diagnostic accuracy of 99.5%across three datasets,significantly outperforming existing comparison methods.This demonstrates that the proposed model has high diagnostic precision and good generalization ability,providing an efficient and reliable solution for rolling bearing fault diagnosis.
文摘Multimodal Sentiment Analysis(SA)is gaining popularity due to its broad application potential.The existing studies have focused on the SA of single modalities,such as texts or photos,posing challenges in effectively handling social media data with multiple modalities.Moreover,most multimodal research has concentrated on merely combining the two modalities rather than exploring their complex correlations,leading to unsatisfactory sentiment classification results.Motivated by this,we propose a new visualtextual sentiment classification model named Multi-Model Fusion(MMF),which uses a mixed fusion framework for SA to effectively capture the essential information and the intrinsic relationship between the visual and textual content.The proposed model comprises three deep neural networks.Two different neural networks are proposed to extract the most emotionally relevant aspects of image and text data.Thus,more discriminative features are gathered for accurate sentiment classification.Then,a multichannel joint fusion modelwith a self-attention technique is proposed to exploit the intrinsic correlation between visual and textual characteristics and obtain emotionally rich information for joint sentiment classification.Finally,the results of the three classifiers are integrated using a decision fusion scheme to improve the robustness and generalizability of the proposed model.An interpretable visual-textual sentiment classification model is further developed using the Local Interpretable Model-agnostic Explanation model(LIME)to ensure the model’s explainability and resilience.The proposed MMF model has been tested on four real-world sentiment datasets,achieving(99.78%)accuracy on Binary_Getty(BG),(99.12%)on Binary_iStock(BIS),(95.70%)on Twitter,and(79.06%)on the Multi-View Sentiment Analysis(MVSA)dataset.These results demonstrate the superior performance of our MMF model compared to single-model approaches and current state-of-the-art techniques based on model evaluation criteria.
基金Shaanxi Province Qin Chuangyuan“Scientist+Engineer”Team Construction Project(2022KXJ-071)2022 Qin Chuangyuan Achievement Transformation Incubation Capacity Improvement Project(2022JH-ZHFHTS-0012)+8 种基金Shaanxi Province Key Research and Development Plan-“Two Chains”Integration Key Project-Qin Chuangyuan General Window Industrial Cluster Project(2023QCY-LL-02)Xixian New Area Science and Technology Plan(2022-YXYJ-003,2022-XXCY-010)2024 Scientific Research Project of Shaanxi National Defense Industry Vocational and Technical College(Gfy24-07)Shaanxi Vocational and Technical Education Association 2024 Vocational Education Teaching Reform Research Topic(2024SZX354)National Natural Science Foundation of China(U24A20115)2024 Shaanxi Provincial Education Department Service Local Special Scientific Research Program Project-Industrialization Cultivation Project(24JC005,24JC063)Shaanxi Province“14th Five-Year Plan”Education Science Plan,2024 Project(SGH24Y3181)National Key Research and Development Program of China(2023YFB4606400)Longmen Laboratory Frontier Exploration Topics Project(LMQYTSKT003)。
文摘A dual-phase synergistic enhancement method was adopted to strengthen the Al-Mn-Mg-Sc-Zr alloy fabricated by laser powder bed fusion(LPBF)by leveraging the unique advantages of Er and TiB_(2).Spherical powders of 0.5wt%Er-1wt%TiB_(2)/Al-Mn-Mg-Sc-Zr nanocomposite were prepared using vacuum homogenization technique,and the density of samples prepared through the LPBF process reached 99.8%.The strengthening and toughening mechanisms of Er-TiB_(2)were investigated.The results show that Al_(3)Er diffraction peaks are detected by X-ray diffraction analysis,and texture strength decreases according to electron backscatter diffraction results.The added Er and TiB_(2)nano-reinforcing phases act as heterogeneous nucleation sites during the LPBF forming process,hindering grain growth and effectively refining the grains.After incorporating the Er-TiB_(2)dual-phase nano-reinforcing phases,the tensile strength and elongation at break of the LPBF-deposited samples reach 550 MPa and 18.7%,which are 13.4%and 26.4%higher than those of the matrix material,respectively.
基金Supported by the Scientific Research Projects of the Health System in Pingshan District,No.2023122.
文摘BACKGROUND Lumbar interbody fusion(LIF)is the primary treatment for lumbar degenerative diseases.Elderly patients are prone to anxiety and depression after undergoing surgery,which affects their postoperative recovery speed and quality of life.Effective prevention of anxiety and depression in elderly patients has become an urgent problem.AIM To investigate the trajectory of anxiety and depression levels in elderly patients after LIF,and the influencing factors.METHODS Random sampling was used to select 239 elderly patients who underwent LIF from January 2020 to December 2024 in Shenzhen Pingle Orthopedic Hospital.General information and surgery-related indices were recorded,and participants completed measures of psychological status,lumbar spine dysfunction,and quality of life.A latent class growth model was used to analyze the post-LIF trajectory of anxiety and depression levels,and unordered multi-categorical logistic regression was used to analyze the influencing factors.RESULTS Three trajectories of change in anxiety level were identified:Increasing anxiety(n=26,10.88%),decreasing anxiety(n=27,11.30%),and stable anxiety(n=186,77.82%).Likewise,three trajectories of change in depression level were identified:Increasing depression(n=30,12.55%),decreasing depression(n=26,10.88%),and stable depression(n=183,76.57%).Regression analysis showed that having no partner,female sex,elevated Oswestry dysfunction index(ODI)scores,and reduced 36-Item Short Form Health Survey scores all contributed to increased anxiety levels,whereas female sex,postoperative opioid use,and elevated ODI scores all contributed to increased depression levels.CONCLUSION During clinical observation,combining factors to predict anxiety and depression in post-LIF elderly patients enables timely intervention,quickens recovery,and enhances quality of life.
文摘BACKGROUND Salvage of the infected long stem revision total knee arthroplasty is challenging due to the presence of well-fixed ingrown or cemented stems.Reconstructive options are limited.Above knee amputation(AKA)is often recommended.We present a surgical technique that was successfully used on four such patients to convert them to a knee fusion(KF)using a cephalomedullary nail.CASE SUMMARY Four patients with infected long stem revision knee replacements that refused AKA had a single stage removal of their infected revision total knee followed by a KF.They were all treated with a statically locked antegrade cephalomedullary fusion nail,augmented with antibiotic impregnated bone cement.All patients had successful limb salvage and were ambulatory with assistive devices at the time of last follow-up.All were infection free at an average follow-up of 25.5 months(range 16-31).CONCLUSION Single stage cephalomedullary nailing can result in a successful KF in patients with infected long stem revision total knees.
基金the National Key Research and Development Program of China(Grant No.2022YFF0711400)which provided valuable financial support and resources for my research and made it possible for me to deeply explore the unknown mysteries in the field of lunar geologythe National Space Science Data Center Youth Open Project(Grant No.NSSDC2302001),which has not only facilitated the smooth progress of my research,but has also built a platform for me to communicate and cooperate with experts in the field.
文摘Impact craters are important for understanding the evolution of lunar geologic and surface erosion rates,among other functions.However,the morphological characteristics of these micro impact craters are not obvious and they are numerous,resulting in low detection accuracy by deep learning models.Therefore,we proposed a new multi-scale fusion crater detection algorithm(MSF-CDA)based on the YOLO11 to improve the accuracy of lunar impact crater detection,especially for small craters with a diameter of<1 km.Using the images taken by the LROC(Lunar Reconnaissance Orbiter Camera)at the Chang’e-4(CE-4)landing area,we constructed three separate datasets for craters with diameters of 0-70 m,70-140 m,and>140 m.We then trained three submodels separately with these three datasets.Additionally,we designed a slicing-amplifying-slicing strategy to enhance the ability to extract features from small craters.To handle redundant predictions,we proposed a new Non-Maximum Suppression with Area Filtering method to fuse the results in overlapping targets within the multi-scale submodels.Finally,our new MSF-CDA method achieved high detection performance,with the Precision,Recall,and F1 score having values of 0.991,0.987,and 0.989,respectively,perfectly addressing the problems induced by the lesser features and sample imbalance of small craters.Our MSF-CDA can provide strong data support for more in-depth study of the geological evolution of the lunar surface and finer geological age estimations.This strategy can also be used to detect other small objects with lesser features and sample imbalance problems.We detected approximately 500,000 impact craters in an area of approximately 214 km2 around the CE-4 landing area.By statistically analyzing the new data,we updated the distribution function of the number and diameter of impact craters.Finally,we identified the most suitable lighting conditions for detecting impact crater targets by analyzing the effect of different lighting conditions on the detection accuracy.
基金The National High Technology Research and Development Program of China (863 Program) (No. 2007AA11Z202)the National Key Technology R & D Program of China during the 11th Five-Year Plan Period(No. 2006BAJ18B03)the Fundamental Research Funds for the Central Universities (No. DUT10RC(3) 112)
文摘This paper considers the problem of time varying congestion pricing to determine optimal time-varying tolls at peak periods for a queuing network with the interactions between buses and private cars.Through the combined applications of the space-time expanded network(STEN) and the conventional network equilibrium modeling techniques,a multi-class,multi-mode and multi-criteria traffic network equilibrium model is developed.Travelers of different classes have distinctive value of times(VOTs),and travelers from the same class perceive their travel disutility or generalized costs on a route according to different weights of travel time and travel costs.Moreover,the symmetric cost function model is extended to deal with the interactions between buses and private cars.It is found that there exists a uniform(anonymous) link toll pattern which can drive a multi-class,multi-mode and multi-criteria user equilibrium flow pattern to a system optimum when the system's objective function is measured in terms of money.It is also found that the marginal cost pricing models with a symmetric travel cost function do not reflect the interactions between traffic flows of different road sections,and the obtained congestion pricing toll is smaller than the real value.
基金supported by the Natural Science Foundation of Liaoning Province(Grant No.2023-MSBA-070)the National Natural Science Foundation of China(Grant No.62302086).
文摘Multi-modal fusion technology gradually become a fundamental task in many fields,such as autonomous driving,smart healthcare,sentiment analysis,and human-computer interaction.It is rapidly becoming the dominant research due to its powerful perception and judgment capabilities.Under complex scenes,multi-modal fusion technology utilizes the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions.However,achieving outstanding performance is challenging because of equipment performance limitations,missing information,and data noise.This paper comprehensively reviews existing methods based onmulti-modal fusion techniques and completes a detailed and in-depth analysis.According to the data fusion stage,multi-modal fusion has four primary methods:early fusion,deep fusion,late fusion,and hybrid fusion.The paper surveys the three majormulti-modal fusion technologies that can significantly enhance the effect of data fusion and further explore the applications of multi-modal fusion technology in various fields.Finally,it discusses the challenges and explores potential research opportunities.Multi-modal tasks still need intensive study because of data heterogeneity and quality.Preserving complementary information and eliminating redundant information between modalities is critical in multi-modal technology.Invalid data fusion methods may introduce extra noise and lead to worse results.This paper provides a comprehensive and detailed summary in response to these challenges.