Recent advances in spatially resolved transcriptomic technologies have enabled unprecedented opportunities to elucidate tissue architecture and function in situ. Spatial transcriptomics can provide multimodal and complementary information simultaneously, including gene expression profiles, spatial locations, and histology images. However, most existing methods have limitations in efficiently utilizing spatial information and matched high-resolution histology images. To fully leverage the multi-modal information, we propose a SPAtially embedded Deep Attentional graph Clustering (SpaDAC) method to identify spatial domains while reconstructing denoised gene expression profiles. This method can efficiently learn low-dimensional embeddings for spatial transcriptomics data by constructing multi-view graph modules to capture both spatial-location connectivity and morphological connectivity. Benchmark results demonstrate that SpaDAC outperforms other algorithms on several recent spatial transcriptomics datasets. SpaDAC is a valuable tool for spatial domain detection, facilitating the comprehension of tissue architecture and the cellular microenvironment. The source code of SpaDAC is freely available at GitHub (https://github.com/huoyuying/SpaDAC.git).
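As an illustration of the multi-view graph idea described above, the sketch below builds two nearest-neighbour graph views for spatial transcriptomics spots, one from spatial coordinates and one from histology-derived features. All inputs are synthetic and the variable names are hypothetical; SpaDAC's actual graph construction is defined in the authors' repository and may differ.

```python
# Minimal sketch of multi-view graph construction for spatial transcriptomics.
import numpy as np
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
n_spots = 100
coords = rng.uniform(0, 1, size=(n_spots, 2))    # spot locations (synthetic)
morph_feats = rng.normal(size=(n_spots, 32))     # histology-image features (synthetic)

# View 1: spatial-location graph (k nearest neighbours in physical space).
A_spatial = kneighbors_graph(coords, n_neighbors=6, mode="connectivity")

# View 2: morphological graph (k nearest neighbours in image-feature space).
A_morph = kneighbors_graph(morph_feats, n_neighbors=6, metric="cosine",
                           mode="connectivity")

# Symmetrise both views so edges are undirected before feeding a graph encoder.
A_spatial = A_spatial.maximum(A_spatial.T)
A_morph = A_morph.maximum(A_morph.T)
print(A_spatial.shape, A_spatial.nnz, A_morph.nnz)
```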
To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising from modal heterogeneity during fusion, while also capturing shared information across modalities, this paper proposes a Multi-modal Pre-synergistic Entity Alignment model based on Cross-modal Mutual Information Strategy Optimization (MPSEA). The model first employs independent encoders to process multi-modal features, including text, images, and numerical values. Next, a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information. This pre-fusion strategy enables unified perception of heterogeneous modalities at the model's initial stage, reducing discrepancies during the fusion process. Finally, using cross-modal deep perception reinforcement learning, the model achieves adaptive multilevel feature fusion between modalities, supporting the learning of more effective alignment strategies. Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset, and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset, compared to existing state-of-the-art methods. These results confirm the effectiveness of the proposed model.
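The abstract does not spell out the pre-synergistic fusion module, so the following is only a minimal, hypothetical sketch of the idea of folding graph-structural and visual features into the text representation before the main fusion stage; the class name, dimensions, and gating choice are assumptions, not MPSEA's actual design.

```python
import torch
import torch.nn as nn

class PreSynergisticFusion(nn.Module):
    """Toy pre-fusion: graph-structure and visual features are projected into
    the text space and folded in as preparatory context before main fusion."""
    def __init__(self, d_text=128, d_struct=64, d_vis=256):
        super().__init__()
        self.proj_struct = nn.Linear(d_struct, d_text)
        self.proj_vis = nn.Linear(d_vis, d_text)
        self.gate = nn.Linear(3 * d_text, d_text)

    def forward(self, text, struct, vis):
        s = self.proj_struct(struct)
        v = self.proj_vis(vis)
        mixed = torch.cat([text, s, v], dim=-1)
        # Gated residual: text keeps its identity, other modalities act as context.
        return text + torch.tanh(self.gate(mixed))

fusion = PreSynergisticFusion()
out = fusion(torch.randn(4, 128), torch.randn(4, 64), torch.randn(4, 256))
print(out.shape)  # torch.Size([4, 128])
```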
With the growing advancement of wireless communication technologies, WiFi-based human sensing has gained increasing attention as a non-intrusive and device-free solution. Among the available signal types, Channel State Information (CSI) offers fine-grained temporal, frequency, and spatial insights into multipath propagation, making it a crucial data source for human-centric sensing. Recently, the integration of deep learning has significantly improved the robustness and automation of feature extraction from CSI in complex environments. This paper provides a comprehensive review of deep learning-enhanced human sensing based on CSI. We first outline mainstream CSI acquisition tools and their hardware specifications, then provide a detailed discussion of preprocessing methods such as denoising, time-frequency transformation, data segmentation, and augmentation. Subsequently, we categorize deep learning approaches according to sensing tasks, namely detection, localization, and recognition, and highlight representative models across application scenarios. Finally, we examine key challenges including domain generalization, multi-user interference, and limited data availability, and we propose future research directions involving lightweight model deployment, multimodal data fusion, and semantic-level sensing.
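The preprocessing steps surveyed here (denoising, time-frequency transformation, and segmentation) can be illustrated on a synthetic CSI amplitude stream; the sampling rate, filter order, and window sizes below are placeholder values rather than recommendations from any particular CSI toolkit.

```python
# Illustrative CSI preprocessing pipeline on synthetic data.
import numpy as np
from scipy.signal import butter, filtfilt, stft

fs = 100                        # assumed CSI sampling rate in Hz
t = np.arange(0, 10, 1 / fs)
# Synthetic CSI amplitude for one subcarrier: slow body motion plus noise.
csi_amp = np.sin(2 * np.pi * 1.5 * t) + 0.3 * np.random.randn(t.size)

# 1) Denoise with a low-pass Butterworth filter (human motion is low-frequency).
b, a = butter(4, 10, btype="low", fs=fs)
csi_denoised = filtfilt(b, a, csi_amp)

# 2) Time-frequency transform (STFT) as input to a CNN/Transformer recogniser.
freqs, times, Zxx = stft(csi_denoised, fs=fs, nperseg=128, noverlap=64)
spectrogram = np.abs(Zxx)

# 3) Segment into fixed-length windows for training.
win = 2 * fs
segments = [csi_denoised[i:i + win] for i in range(0, csi_denoised.size - win, win)]
print(spectrogram.shape, len(segments))
```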
With the increase in the elderly population and growing health care costs, the role of service robots in aiding the disabled and the elderly is becoming important. Many researchers around the world have paid much attention to healthcare robots and rehabilitation robots. To achieve natural and harmonious communication between the user and a service robot, the information perception and feedback abilities and the interaction ability of service robots are becoming increasingly important across many key issues.
This paper focuses on multi-modal Information Perception (IP) for Soft Robotic Hands (SRHs) using Machine Learning (ML) algorithms. A flexible Optical Fiber-based Curvature Sensor (OFCS) is fabricated, consisting of a Light-Emitting Diode (LED), a photosensitive detector, and an optical fiber. Bending the roughened optical fiber lowers the transmitted light intensity, which reflects the curvature of the soft finger. Together with the curvature and pressure information, multi-modal IP is performed to improve the recognition accuracy. Recognition of gesture, object shape, size, and weight is implemented with multiple ML approaches, including the Supervised Learning Algorithms (SLAs) of K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Logistic Regression (LR), and the unSupervised Learning Algorithm (un-SLA) of K-Means Clustering (KMC). Moreover, Optical Sensor Information (OSI), Pressure Sensor Information (PSI), and Double-Sensor Information (DSI) are adopted to compare the recognition accuracies. The experimental results demonstrate that the proposed sensors and recognition approaches are feasible and effective. The recognition accuracies obtained using the above ML algorithms and three modes of sensor information are higher than 85 percent for almost all combinations. Moreover, DSI is more accurate than single-modal sensor information, and the KNN algorithm with DSI outperforms the other combinations in recognition accuracy.
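The OSI/PSI/DSI comparison can be mimicked with a few lines of scikit-learn: train the same KNN classifier on each sensor modality alone and on the concatenated features. The data below are random stand-ins for the paper's curvature and pressure measurements, so the absolute accuracies are meaningless; only the workflow is illustrative.

```python
# Toy single-sensor vs double-sensor comparison with KNN on synthetic data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)
n, n_classes = 300, 4
labels = rng.integers(0, n_classes, size=n)                # e.g. object shapes
optical = labels[:, None] + rng.normal(0, 1.0, (n, 5))     # curvature features
pressure = labels[:, None] + rng.normal(0, 1.2, (n, 3))    # pressure features

def knn_accuracy(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    return KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr).score(X_te, y_te)

print("OSI only:", knn_accuracy(optical, labels))
print("PSI only:", knn_accuracy(pressure, labels))
print("DSI (fused):", knn_accuracy(np.hstack([optical, pressure]), labels))
```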
With the rapid development of the economy, air pollution caused by industrial expansion has caused serious harm to human health and social development. Therefore, establishing an effective air pollution concentration prediction system is of great scientific and practical significance for accurate and reliable predictions. This paper proposes a combined point-interval prediction system for pollutant concentration prediction by leveraging a neural network, a meta-heuristic optimization algorithm, and fuzzy theory. Fuzzy information granulation technology is used in data preprocessing to transform numerical sequences into fuzzy particles for comprehensive feature extraction. The golden jackal optimization algorithm is employed in the optimization stage to fine-tune model hyperparameters. In the prediction stage, an ensemble learning method combines training results from multiple models to obtain final point predictions, while quantile regression and kernel density estimation methods are used for interval predictions on the test set. Experimental results demonstrate that the combined model achieves a high goodness-of-fit coefficient of determination (R²) of 99.3% and a maximum difference in mean absolute percentage error (MAPE) from the benchmark model of 12.6%. This suggests that the integrated learning system proposed in this paper can provide more accurate deterministic predictions as well as reliable uncertainty analysis compared to traditional models, offering a practical reference for air quality early warning.
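As a simplified illustration of the interval-prediction stage, the sketch below uses quantile regression to produce a 90% prediction interval around a point forecast on a synthetic series; the fuzzy granulation, golden jackal optimization, and ensemble stages of the actual system are omitted, and the model choice here is an assumption.

```python
# Quantile-regression prediction intervals on a synthetic pollutant series.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.2, 500)   # stand-in concentration series

lower = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(X, y)
upper = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(X, y)
point = GradientBoostingRegressor().fit(X, y)    # default squared-error loss

x_new = np.array([[2.5]])
print("point forecast:", point.predict(x_new)[0])
print("90% interval:", lower.predict(x_new)[0], "to", upper.predict(x_new)[0])
```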
Objective To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using metrics including accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Further performance assessment was conducted through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, and F1 score = 0.933). Compared with model ablation, data ablation experiments, and various traditional machine learning models, the AWMDF model demonstrated superior performance. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
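A toy version of decision-level adaptive weighting is sketched below: each modality produces its own class logits and a learnable softmax-normalized weight vector combines them. This is only the generic idea, not the authors' AWMDF architecture, and the modality count and dimensions are invented.

```python
import torch
import torch.nn as nn

class AdaptiveWeightedFusion(nn.Module):
    """Toy decision-level fusion: per-modality predictions are combined with
    learnable weights normalised by a softmax."""
    def __init__(self, n_modalities=4):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(n_modalities))

    def forward(self, modality_logits):              # list of (batch, n_classes)
        stacked = torch.stack(modality_logits, dim=0)
        w = torch.softmax(self.weights, dim=0).view(-1, 1, 1)
        return (w * stacked).sum(dim=0)

fusion = AdaptiveWeightedFusion()
logits = [torch.randn(8, 3) for _ in range(4)]   # e.g. clinical, echo, tongue, face
print(fusion(logits).shape)                      # torch.Size([8, 3])
```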
Ecological monitoring vehicles are equipped with a range of sensors and monitoring devices designed to gather data on ecological and environmental factors. These vehicles are crucial in various fields, including environmental science research, ecological and environmental monitoring projects, disaster response, and emergency management. A key method employed in these vehicles for achieving high-precision positioning is LiDAR (light detection and ranging)-Visual Simultaneous Localization and Mapping (SLAM). However, maintaining high-precision localization in complex scenarios, such as degraded environments or when dynamic objects are present, remains a significant challenge. To address this issue, we integrate both semantic and texture information from LiDAR and cameras to enhance the robustness and efficiency of data registration. Specifically, semantic information simplifies the modeling of scene elements, reducing the reliance on dense point clouds, which can be less efficient. Meanwhile, visual texture information complements LiDAR-Visual localization by providing additional contextual details. By incorporating semantic and texture details from paired images and point clouds, we significantly improve the quality of data association, thereby increasing the success rate of localization. This approach not only enhances the operational capabilities of ecological monitoring vehicles in complex environments but also contributes to improving the overall efficiency and effectiveness of ecological monitoring and environmental protection efforts.
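One common way to attach image semantics and texture to LiDAR points, in the spirit of the data-association step described above, is to project the points through a pinhole camera model and sample the image at the projected pixels. The intrinsics, extrinsics, and semantic label image below are placeholders, not the paper's calibration or pipeline.

```python
# Minimal sketch: label LiDAR points with image semantics via pinhole projection.
import numpy as np

K = np.array([[700.0, 0.0, 320.0],     # assumed camera intrinsics
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
T_cam_lidar = np.eye(4)                # assumed LiDAR-to-camera extrinsics

points = np.random.uniform([-5, -2, 2], [5, 2, 30], size=(1000, 3))  # LiDAR frame
semantic_img = np.random.randint(0, 10, size=(480, 640))             # class labels

pts_h = np.hstack([points, np.ones((points.shape[0], 1))])
pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
uvw = (K @ pts_cam.T).T
uv = uvw[:, :2] / uvw[:, 2:3]

in_view = (uvw[:, 2] > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < 640) & \
          (uv[:, 1] >= 0) & (uv[:, 1] < 480)
u, v = uv[in_view].astype(int).T
point_labels = semantic_img[v, u]      # semantic label attached to each visible point
print(point_labels.shape)
```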
Advanced geological prediction is a crucial means to ensure safety and efficiency in tunnel construction. However, different advanced geological forecasting methods have their own limitations, resulting in poor detection accuracy. Using multiple methods to carry out a comprehensive evaluation can effectively improve the accuracy of advanced geological prediction results. In this study, geological information is combined with the detection results of geophysical methods, including transient electromagnetic, induced polarization, and tunnel seismic prediction, to establish a comprehensive analysis method for adverse geology. First, the possible main adverse geological problems are determined according to the geological information. Subsequently, various physical parameters of the rock mass in front of the tunnel face can be derived on the basis of multisource geophysical data. Finally, based on the analysis results of the geological information, the multisource data fusion algorithm is used to determine the type, location, and scale of adverse geology. Advanced geological prediction results that can provide effective guidance for tunnel construction can then be obtained.
Multi-modal Named Entity Recognition (MNER) aims to better identify meaningful textual entities by integrating information from images. Previous work has focused on extracting visual semantics at a fine-grained level, or on obtaining entity-related external knowledge from knowledge bases or Large Language Models (LLMs). However, these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches. In this paper, we present MMAVK, a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion, which aims to leverage the Multi-modal Large Language Model (MLLM) as an implicit knowledge base. It also extracts vision-based auxiliary knowledge from the image for more accurate and effective recognition. Specifically, we propose vision-based auxiliary knowledge generation, which guides the MLLM to extract external knowledge exclusively derived from images to aid entity recognition by designing target-specific prompts, thus avoiding the redundant recognition and cognitive confusion caused by the simultaneous processing of image-text pairs. Furthermore, we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word embedding produced by the transformer-based encoder. Extensive experimental results demonstrate that MMAVK outperforms or equals state-of-the-art methods on the two classical MNER datasets, even when the large models employed have significantly fewer parameters than those of other baselines.
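The word-level fusion mechanism can be sketched as a single cross-attention step in which each word embedding queries the embeddings of the MLLM-generated auxiliary knowledge; the dimensions and single-layer design below are assumptions for illustration and may differ from MMAVK's implementation.

```python
import torch
import torch.nn as nn

# Toy word-level fusion: each word embedding attends over auxiliary-knowledge
# embeddings derived from the image, then a residual keeps word semantics.
d_model, n_words, n_knowledge = 256, 20, 8
word_emb = torch.randn(1, n_words, d_model)            # from the text encoder
knowledge_emb = torch.randn(1, n_knowledge, d_model)   # from MLLM-generated knowledge

attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=4, batch_first=True)
fused, weights = attn(query=word_emb, key=knowledge_emb, value=knowledge_emb)

word_emb_fused = word_emb + fused                      # residual connection
print(word_emb_fused.shape, weights.shape)             # (1, 20, 256) (1, 20, 8)
```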
Multi-modal knowledge graph completion (MMKGC) aims to complete missing entities or relations in multi-modal knowledge graphs, thereby discovering more previously unknown triples. Due to the continuous growth of data and knowledge and the limitations of data sources, the visual knowledge within the knowledge graphs is generally of low quality, and some entities suffer from the issue of missing visual modality. Nevertheless, previous studies of MMKGC have primarily focused on how to facilitate modality interaction and fusion while neglecting the problems of low modality quality and missing modalities. In this case, mainstream MMKGC models only use pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion, which inevitably suffers from problems such as error propagation and increased uncertainty. To address these problems, we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation (MMCSD). Specifically, we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality. Moreover, we design multi-level visual semantic extraction and entity description generation, thereby further extracting entity semantics from structural triples and visual images. Meanwhile, we train a variational multi-modal auto-encoder and utilize a pre-trained multi-modal language model to complement the missing visual features. We conducted experiments on FB15K-237 and DB13K, and the results showed that MMCSD can effectively perform MMKGC and achieve state-of-the-art performance.
Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of a patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, the amalgamation of data from multiple modalities presents difficulties due to disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities, lacking the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across various modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. Initially, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Methods such as model pruning, quantization, and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, hardware-aware acceleration strategies, including the use of TensorRT and ONNX-based model compression, were implemented to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
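Among the deployment optimizations mentioned (pruning, quantization, distillation), post-training dynamic quantization is easy to show in a few lines of PyTorch; the toy encoder below stands in for the paper's model, whose architecture and sizes are not given in the abstract.

```python
import torch
import torch.nn as nn

# Illustrative post-training dynamic quantization of a small transformer encoder.
encoder_layer = nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)
model = nn.TransformerEncoder(encoder_layer, num_layers=2)

# Quantize the linear layers to int8 for faster CPU/edge inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

tokens = torch.randn(1, 64, 256)   # e.g. fused multi-modal feature tokens
with torch.no_grad():
    out_fp32 = model(tokens)
    out_int8 = quantized(tokens)
print(out_fp32.shape, out_int8.shape)   # both torch.Size([1, 64, 256])
```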
Objective: This study evaluates the impact of handshake and information support on patients' outcomes during laparoscopic cholecystectomy. It examines the effects on their physiological and psychological responses and overall satisfaction with nursing care. Methods: A total of 84 patients scheduled for laparoscopic cholecystectomy were selected through convenience sampling and randomly assigned to either the control group or the intervention group using a random number table. Each group consisted of 42 patients. The control group received standard surgical nursing care. In addition to standard care, the intervention group received handshake and information support from the circulating nurse before anesthesia induction. Vital signs were recorded before surgery and before anesthesia induction. Anxiety levels were measured using the State-Trait Anxiety Inventory (STAI) and the State-Anxiety Inventory (S-AI), while nursing satisfaction was assessed using a numerical rating scale. Results: No significant differences were found between the two groups in systolic and diastolic blood pressures before surgery and anesthesia induction (P > 0.05). However, there was a significant difference in heart rate before anesthesia induction. Conclusion: Providing handshake and information support before anesthesia induction effectively reduces stress, alleviates anxiety, and enhances comfort and satisfaction among patients undergoing laparoscopic cholecystectomy.
This study analyzes the User Interface (UI) and User Experience (UX) of information systems that provide local government information. The systems analyzed are the Local Administrative Comprehensive Information Disclosure System (Zheripan), the Integrated Local Financial Disclosure System (Qinching Online), and the Local Regulations Information System (12348 Zhejiang Legal Network). The Local Administrative Comprehensive Information Disclosure System offers public service and personnel information, while the Integrated Local Financial Disclosure System provides financial information, and the Local Regulations Information System offers legal information as its main content. The analysis framework utilized three elements: objective data, psychological factors, and heuristic evaluation. The results of the first objective data analysis show that approximately 70% of visits to Zheripan and Qinching Online are through search, and the time spent on the homepage is short. In contrast, about 70% of visits to the 12348 Zhejiang Legal Network are direct visits, with users browsing multiple pages with a clear purpose. In terms of data provision methods, Zheripan provides two types of data in three formats, Qinching Online offers 28 types of data in five formats, and the 12348 Zhejiang Legal Network provides one type of information in a single format. The second psychological factor analysis found that all three websites had a number of menus suitable for short-term cognitive capacity. However, only one of the sites had a layout that considered the user's eye movement. Finally, the heuristic evaluation revealed that most of the evaluation criteria were not met. While the design is relatively simple and follows standards, feedback for users, error prevention, and help options were lacking. Moreover, user-specific usability was low, and the systems remained at the information-providing level. Based on these findings, both short-term and long-term improvement measures for creating an interactive system beyond simple information disclosure are proposed.
Nowadays, spatiotemporal information, positioning, and navigation services have become critical components of new infrastructure. Precise positioning technology is indispensable for determining spatiotemporal information and providing navigation services.
Background: For nursing students, gathering social information is essential for understanding healthcare and social issues and developing critical thinking and decision-making skills. However, the choice of information sources varies by age and individual habits. With the widespread use of the internet, there are notable differences between younger and older generations in their reliance on the internet versus traditional media sources like newspapers and television. Given the wide age range and diverse backgrounds of nursing students, understanding generational differences in information-gathering methods is important for implementing effective education. Purpose: The purpose of this study is to identify how nursing students in different age groups obtain social information and to examine media usage trends by age group. Additionally, we aim to use the findings to provide insights into effective information dissemination methods in nursing education. Results: The results showed that nursing students in their teens to forties, regardless of gender, primarily relied on the internet as their main information source, with television playing a secondary role. In contrast, students in their fifties tended to obtain information more often from newspapers and television than from the internet. This highlights an age-related difference in preferred information sources, with older students showing a greater reliance on traditional media. Conclusions: This study demonstrates that nursing students use different information-gathering methods based on their age, suggesting a need to customize information dissemination strategies in nursing education. Digital media may be more effective for younger students, while traditional media or printed materials might better serve older students. Educational institutions should consider these generational differences in media usage and adopt strategies that meet the diverse needs of their student populations.
As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly being used to detect 3D environmental information, significantly advancing the development of perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular and binocular detection based on different input types. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to the technical pathways for processing point cloud features. Additionally, multimodal fusion algorithms are divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods based on the types of sensors involved. Furthermore, the paper identifies five key future research directions in this field, aiming to provide insights for researchers engaged in multimodal fusion-based object detection algorithms and to encourage broader attention to the research and application of multimodal fusion-based object detection.
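As a concrete illustration of the voxel-based family of point cloud methods mentioned above, the snippet below bins a synthetic point cloud into a regular grid and computes a mean feature per occupied voxel; the grid extents, resolution, and mean-pooling choice are arbitrary examples rather than any specific detector's settings.

```python
# Minimal voxelisation sketch for point-cloud-based detection.
import numpy as np

points = np.random.uniform([-40, -40, -3], [40, 40, 1], size=(20000, 3))
voxel_size = np.array([0.4, 0.4, 0.5])
origin = np.array([-40.0, -40.0, -3.0])

voxel_idx = np.floor((points - origin) / voxel_size).astype(np.int64)
# One integer key per voxel so that points in the same cell group together.
keys = voxel_idx[:, 0] * 1_000_000 + voxel_idx[:, 1] * 1_000 + voxel_idx[:, 2]
unique_keys, inverse = np.unique(keys, return_inverse=True)

# Mean point per occupied voxel (a crude stand-in for a learned voxel encoder).
sums = np.zeros((unique_keys.size, 3))
np.add.at(sums, inverse, points)
counts = np.bincount(inverse)[:, None]
voxel_features = sums / counts
print(f"{unique_keys.size} occupied voxels from {points.shape[0]} points")
```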
Background: The availability of essential medicines and medical supplies is crucial for effectively delivering healthcare services. In Zambia, the Logistics Management Information System (LMIS) is a key tool for managing the supply chain of these commodities. This study aimed to evaluate the effectiveness of LMIS in ensuring the availability of essential medicines and medical supplies in public hospitals in the Copperbelt Province of Zambia. Materials and Methods: From February to April 2022, a cross-sectional study was conducted in 12 public hospitals across the Copperbelt Province. Data were collected using structured questionnaires, checklists, and stock control cards. The study assessed LMIS availability, training, and knowledge among pharmacy personnel, as well as data accuracy, product availability, and order fill rates. Descriptive statistics were used to analyse the data. Results: All surveyed hospitals had LMIS implemented and were using eLMIS as the primary LMIS. Only 47% and 48% of pharmacy personnel received training in eLMIS and the Essential Medicines Logistics Improvement Program (EMLIP), respectively. Most personnel demonstrated good knowledge of LMIS, with 77.7% able to log in to eLMIS Facility Edition, 76.6% able to locate stock control cards in the system, and 78.7% able to perform transactions. However, data accuracy from physical and electronic records varied from 0% to 60%, and product availability ranged from 50% to 80%. Order fill rates from the Zambia Medicines and Medical Supplies Agency (ZAMMSA) were consistently below 30%. Discrepancies were observed between physical stock counts and eLMIS records. Conclusion: This study found that most hospitals in the Copperbelt Province of Zambia have implemented an LMIS. While LMIS implementation is high, challenges such as low training levels, data inaccuracies, low product availability, and low order fill rates persist. Addressing these issues requires a comprehensive approach, including capacity building, data quality improvement, supply chain coordination, and investment in infrastructure and human resources. Strengthening LMIS effectiveness is crucial for improving healthcare delivery and patient outcomes in Zambia.
BACKGROUND Stress ulcers are common complications in critically ill patients, with a higher incidence observed in older patients following gastrointestinal surgery. This study aimed to develop and evaluate the effectiveness of a multi-modal intervention protocol to prevent stress ulcers in this high-risk population. AIM To assess the impact of a multi-modal intervention on preventing stress ulcers in older intensive care unit (ICU) patients postoperatively. METHODS A randomized controlled trial involving critically ill patients (aged ≥ 65 years) admitted to the ICU after gastrointestinal surgery was conducted. Patients were randomly assigned to either the intervention group, which received a multimodal stress ulcer prevention protocol, or the control group, which received standard care. The primary outcome measure was the incidence of stress ulcers. The secondary outcomes included ulcer healing time, complication rates, and length of hospital stay. RESULTS A total of 200 patients (100 in each group) were included in this study. The intervention group exhibited a significantly lower incidence of stress ulcers than the control group (15% vs 30%, P < 0.01). Additionally, the intervention group demonstrated shorter ulcer healing times (mean 5.2 vs 7.8 days, P < 0.05), lower complication rates (10% vs 22%, P < 0.05), and reduced length of hospital stay (mean 12.3 vs 15.7 days, P < 0.05). CONCLUSION This multi-modal intervention protocol significantly reduced the incidence of stress ulcers and improved clinical outcomes in critically ill older patients after gastrointestinal surgery. This comprehensive approach may provide a valuable strategy for managing high-risk populations in intensive care settings.
基金supported by National Natural Science Foundation of China(62003028).X.L.was supported by a Scholarship from the China Scholarship Council.
文摘Recent advances in spatially resolved transcriptomic technologies have enabled unprecedented opportunities to elucidate tissue architecture and function in situ.Spatial transcriptomics can provide multimodal and complementary information simultaneously,including gene expression profiles,spatial locations,and histology images.However,most existing methods have limitations in efficiently utilizing spatial information and matched high-resolution histology images.To fully leverage the multi-modal information,we propose a SPAtially embedded Deep Attentional graph Clustering(SpaDAC)method to identify spatial domains while reconstructing denoised gene expression profiles.This method can efficiently learn the low-dimensional embeddings for spatial transcriptomics data by constructing multi-view graph modules to capture both spatial location connectives and morphological connectives.Benchmark results demonstrate that SpaDAC outperforms other algorithms on several recent spatial transcriptomics datasets.SpaDAC is a valuable tool for spatial domain detection,facilitating the comprehension of tissue architecture and cellular microenvironment.The source code of SpaDAC is freely available at Github(https://github.com/huoyuying/SpaDAC.git).
基金partially supported by the National Natural Science Foundation of China under Grants 62471493 and 62402257(for conceptualization and investigation)partially supported by the Natural Science Foundation of Shandong Province,China under Grants ZR2023LZH017,ZR2024MF066,and 2023QF025(for formal analysis and validation)+1 种基金partially supported by the Open Foundation of Key Laboratory of Computing Power Network and Information Security,Ministry of Education,Qilu University of Technology(Shandong Academy of Sciences)under Grant 2023ZD010(for methodology and model design)partially supported by the Russian Science Foundation(RSF)Project under Grant 22-71-10095-P(for validation and results verification).
文摘To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising frommodal heterogeneity during fusion,while also capturing shared information acrossmodalities,this paper proposes a Multi-modal Pre-synergistic Entity Alignmentmodel based on Cross-modalMutual Information Strategy Optimization(MPSEA).The model first employs independent encoders to process multi-modal features,including text,images,and numerical values.Next,a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information.This pre-fusion strategy enables unified perception of heterogeneous modalities at the model’s initial stage,reducing discrepancies during the fusion process.Finally,using cross-modal deep perception reinforcement learning,the model achieves adaptive multilevel feature fusion between modalities,supporting learningmore effective alignment strategies.Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset,and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset,compared to existing state-of-the-art methods.These results confirm the effectiveness of the proposed model.
基金supported by National Natural Science Foundation of China(NSFC)under grant U23A20310.
文摘With the growing advancement of wireless communication technologies,WiFi-based human sensing has gained increasing attention as a non-intrusive and device-free solution.Among the available signal types,Channel State Information(CSI)offers fine-grained temporal,frequency,and spatial insights into multipath propagation,making it a crucial data source for human-centric sensing.Recently,the integration of deep learning has significantly improved the robustness and automation of feature extraction from CSI in complex environments.This paper provides a comprehensive review of deep learning-enhanced human sensing based on CSI.We first outline mainstream CSI acquisition tools and their hardware specifications,then provide a detailed discussion of preprocessing methods such as denoising,time–frequency transformation,data segmentation,and augmentation.Subsequently,we categorize deep learning approaches according to sensing tasks—namely detection,localization,and recognition—and highlight representative models across application scenarios.Finally,we examine key challenges including domain generalization,multi-user interference,and limited data availability,and we propose future research directions involving lightweight model deployment,multimodal data fusion,and semantic-level sensing.
文摘With the increasing of the elderly population and the growing hearth care cost, the role of service robots in aiding the disabled and the elderly is becoming important. Many researchers in the world have paid much attention to heaRthcare robots and rehabilitation robots. To get natural and harmonious communication between the user and a service robot, the information perception/feedback ability, and interaction ability for service robots become more important in many key issues.
基金support provided by the National Natural Science Foundation of China (Nos. 61803267 and 61572328)the China Postdoctoral Science Foundation (No.2017M622757)+1 种基金the Beijing Science and Technology program (No.Z171100000817007)the National Science Foundation of China (NSFC) and the German Re-search Foundation (DFG) in the project Cross Modal Learning,NSFC 61621136008/DFG TRR-169
文摘This paper focuses on multi-modal Information Perception(IP)for Soft Robotic Hands(SRHs)using Machine Learning(ML)algorithms.A flexible Optical Fiber-based Curvature Sensor(OFCS)is fabricated,consisting of a Light-Emitting Diode(LED),photosensitive detector,and optical fiber.Bending the roughened optical fiber generates lower light intensity,which reflecting the curvature of the soft finger.Together with the curvature and pressure information,multi-modal IP is performed to improve the recognition accuracy.Recognitions of gesture,object shape,size,and weight are implemented with multiple ML approaches,including the Supervised Learning Algorithms(SLAs)of K-Nearest Neighbor(KNN),Support Vector Machine(SVM),Logistic Regression(LR),and the unSupervised Learning Algorithm(un-SLA)of K-Means Clustering(KMC).Moreover,Optical Sensor Information(OSI),Pressure Sensor Information(PSI),and Double-Sensor Information(DSI)are adopted to compare the recognition accuracies.The experiment results demonstrate that the proposed sensors and recognition approaches are feasible and effective.The recognition accuracies obtained using the above ML algorithms and three modes of sensor information are higer than 85 percent for almost all combinations.Moreover,DSI is more accurate when compared to single modal sensor information and the KNN algorithm with a DSI outperforms the other combinations in recognition accuracy.
基金supported by General Scientific Research Funding of the Science and Technology Development Fund(FDCT)in Macao(No.0150/2022/A)the Faculty Research Grants of Macao University of Science and Technology(No.FRG-22-074-FIE).
文摘With the rapid development of economy,air pollution caused by industrial expansion has caused serious harm to human health and social development.Therefore,establishing an effective air pollution concentration prediction system is of great scientific and practical significance for accurate and reliable predictions.This paper proposes a combination of pointinterval prediction system for pollutant concentration prediction by leveraging neural network,meta-heuristic optimization algorithm,and fuzzy theory.Fuzzy information granulation technology is used in data preprocessing to transform numerical sequences into fuzzy particles for comprehensive feature extraction.The golden Jackal optimization algorithm is employed in the optimization stage to fine-tune model hyperparameters.In the prediction stage,an ensemble learning method combines training results frommultiplemodels to obtain final point predictions while also utilizing quantile regression and kernel density estimation methods for interval predictions on the test set.Experimental results demonstrate that the combined model achieves a high goodness of fit coefficient of determination(R^(2))at 99.3% and a maximum difference between prediction accuracy mean absolute percentage error(MAPE)and benchmark model at 12.6%.This suggests that the integrated learning system proposed in this paper can provide more accurate deterministic predictions as well as reliable uncertainty analysis compared to traditionalmodels,offering practical reference for air quality early warning.
基金Construction Program of the Key Discipline of State Administration of Traditional Chinese Medicine of China(ZYYZDXK-2023069)Research Project of Shanghai Municipal Health Commission (2024QN018)Shanghai University of Traditional Chinese Medicine Science and Technology Development Program (23KFL005)。
文摘Objective To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data.Methods Clinical indicators,echocardiographic data,traditional Chinese medicine(TCM)tongue manifestations,and facial features were collected from patients who underwent coro-nary computed tomography angiography(CTA)in the Cardiac Care Unit(CCU)of Shanghai Tenth People's Hospital between May 1,2023 and May 1,2024.An adaptive weighted multi-modal data fusion(AWMDF)model based on deep learning was constructed to predict the severity of coronary artery stenosis.The model was evaluated using metrics including accura-cy,precision,recall,F1 score,and the area under the receiver operating characteristic(ROC)curve(AUC).Further performance assessment was conducted through comparisons with six ensemble machine learning methods,data ablation,model component ablation,and various decision-level fusion strategies.Results A total of 158 patients were included in the study.The AWMDF model achieved ex-cellent predictive performance(AUC=0.973,accuracy=0.937,precision=0.937,recall=0.929,and F1 score=0.933).Compared with model ablation,data ablation experiments,and various traditional machine learning models,the AWMDF model demonstrated superior per-formance.Moreover,the adaptive weighting strategy outperformed alternative approaches,including simple weighting,averaging,voting,and fixed-weight schemes.Conclusion The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
基金supported by the project“GEF9874:Strengthening Coordinated Approaches to Reduce Invasive Alien Species(lAS)Threats to Globally Significant Agrobiodiversity and Agroecosystems in China”funding from the Excellent Talent Training Funding Project in Dongcheng District,Beijing,with project number 2024-dchrcpyzz-9.
文摘Ecological monitoring vehicles are equipped with a range of sensors and monitoring devices designed to gather data on ecological and environmental factors.These vehicles are crucial in various fields,including environmental science research,ecological and environmental monitoring projects,disaster response,and emergency management.A key method employed in these vehicles for achieving high-precision positioning is LiDAR(lightlaser detection and ranging)-Visual Simultaneous Localization and Mapping(SLAM).However,maintaining highprecision localization in complex scenarios,such as degraded environments or when dynamic objects are present,remains a significant challenge.To address this issue,we integrate both semantic and texture information from LiDAR and cameras to enhance the robustness and efficiency of data registration.Specifically,semantic information simplifies the modeling of scene elements,reducing the reliance on dense point clouds,which can be less efficient.Meanwhile,visual texture information complements LiDAR-Visual localization by providing additional contextual details.By incorporating semantic and texture details frompaired images and point clouds,we significantly improve the quality of data association,thereby increasing the success rate of localization.This approach not only enhances the operational capabilities of ecological monitoring vehicles in complex environments but also contributes to improving the overall efficiency and effectiveness of ecological monitoring and environmental protection efforts.
基金National Natural Science Foundation of China(grant numbers 42293351,41877239,51422904 and 51379112).
文摘Advanced geological prediction is a crucial means to ensure safety and efficiency in tunnel construction.However,diff erent advanced geological forecasting methods have their own limitations,resulting in poor detection accuracy.Using multiple methods to carry out a comprehensive evaluation can eff ectively improve the accuracy of advanced geological prediction results.In this study,geological information is combined with the detection results of geophysical methods,including transient electromagnetic,induced polarization,and tunnel seismic prediction,to establish a comprehensive analysis method of adverse geology.First,the possible main adverse geological problems are determined according to the geological information.Subsequently,various physical parameters of the rock mass in front of the tunnel face can then be derived on the basis of multisource geophysical data.Finally,based on the analysis results of geological information,the multisource data fusion algorithm is used to determine the type,location,and scale of adverse geology.The advanced geological prediction results that can provide eff ective guidance for tunnel construction can then be obtained.
基金funded by Research Project,grant number BHQ090003000X03.
文摘Multi-modal Named Entity Recognition(MNER)aims to better identify meaningful textual entities by integrating information from images.Previous work has focused on extracting visual semantics at a fine-grained level,or obtaining entity related external knowledge from knowledge bases or Large Language Models(LLMs).However,these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches.In this paper,we present MMAVK,a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion,which aims to leverage the Multi-modal Large Language Model(MLLM)as an implicit knowledge base.It also extracts vision-based auxiliary knowledge from the image formore accurate and effective recognition.Specifically,we propose vision-based auxiliary knowledge generation,which guides the MLLM to extract external knowledge exclusively derived from images to aid entity recognition by designing target-specific prompts,thus avoiding redundant recognition and cognitive confusion caused by the simultaneous processing of image-text pairs.Furthermore,we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word-embedding embedded from the transformerbased encoder.Extensive experimental results demonstrate that MMAVK outperforms or equals the state-of-the-art methods on the two classical MNER datasets,even when the largemodels employed have significantly fewer parameters than other baselines.
基金funded by Research Project,grant number BHQ090003000X03。
文摘Multi-modal knowledge graph completion(MMKGC)aims to complete missing entities or relations in multi-modal knowledge graphs,thereby discovering more previously unknown triples.Due to the continuous growth of data and knowledge and the limitations of data sources,the visual knowledge within the knowledge graphs is generally of low quality,and some entities suffer from the issue of missing visual modality.Nevertheless,previous studies of MMKGC have primarily focused on how to facilitate modality interaction and fusion while neglecting the problems of low modality quality and modality missing.In this case,mainstream MMKGC models only use pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion,which inevitably suffers from problems such as error propagation and increased uncertainty.To address these problems,we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation(MMCSD).Specifically,we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality.Moreover,we design multi-level visual semantic extraction and entity description generation,thereby further extracting entity semantics from structural triples and visual images.Meanwhile,we train a variational multi-modal auto-encoder and utilize a pre-trained multi-modal language model to complement the missing visual features.We conducted experiments on FB15K-237 and DB13K,and the results showed that MMCSD can effectively perform MMKGC and achieve state-of-the-art performance.
基金supported by the Deanship of Research and Graduate Studies at King Khalid University under Small Research Project grant number RGP1/139/45.
文摘Integrating multiple medical imaging techniques,including Magnetic Resonance Imaging(MRI),Computed Tomography,Positron Emission Tomography(PET),and ultrasound,provides a comprehensive view of the patient health status.Each of these methods contributes unique diagnostic insights,enhancing the overall assessment of patient condition.Nevertheless,the amalgamation of data from multiple modalities presents difficulties due to disparities in resolution,data collection methods,and noise levels.While traditional models like Convolutional Neural Networks(CNNs)excel in single-modality tasks,they struggle to handle multi-modal complexities,lacking the capacity to model global relationships.This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system.The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across various modalities.Additionally,it shows resilience to variations in noise and image quality,making it adaptable for real-time clinical use.To address the computational hurdles linked to transformer models,particularly in real-time clinical applications in resource-constrained environments,several optimization techniques have been integrated to boost scalability and efficiency.Initially,a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness.Methods such as model pruning,quantization,and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed.Furthermore,efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations.For further deployment optimization,researchers have implemented hardware-aware acceleration strategies,including the use of TensorRT and ONNX-based model compression,to ensure efficient execution on edge devices.These optimizations allow the approach to function effectively in real-time clinical settings,ensuring viability even in environments with limited resources.Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments.This study highlights the transformative potential of transformer models in multi-modal medical imaging,offering improvements in diagnostic accuracy and patient care outcomes.
文摘Objective: This study evaluates the impact of handshake and information support on patients’ outcomes during laparoscopic cholecystectomy. It examines the effects on their physiological and psychological responses and overall satisfaction with nursing care. Methods: A total of 84 patients scheduled for laparoscopic cholecystectomy were selected through convenient sampling and randomly assigned to either the control group or the intervention group using a random number table. Each group consisted of 42 patients. The control group received standard surgical nursing care. In addition to standard care, the intervention group received handshake and information support from the circulating nurse before anesthesia induction. Vital signs were recorded before surgery and before anesthesia induction. Anxiety levels were measured using the State-Trait Anxiety Inventory (STAI) and the State-Anxiety Inventory (S-AI), while nursing satisfaction was assessed using a numerical rating scale. Results: No significant differences were found between the two groups in systolic and diastolic blood pressures before surgery and anesthesia induction (P > 0.05). However, there was a significant difference in heart rate before anesthesia induction (P Conclusion: Providing handshake and information support before anesthesia induction effectively reduces stress, alleviates anxiety, and enhances comfort and satisfaction among patients undergoing laparoscopic cholecystectomy.
文摘This study analyzes the User Interface(UI)and User Experience(UX)of information systems that provide local government information.The systems analyzed are the Local Administrative Comprehensive Information Disclosure System(Zheripan),the Integrated Local Financial Disclosure System(Qinching Online),and the Local Regulations Information System(12348 Zhejiang Legal Network).The Local Administrative Comprehensive Information Disclosure System offers public service and personnel information,while the Integrated Local Financial Disclosure System provides financial information,and the Local Regulations Information System offers legal information as its main content.The analysis framework utilized three elements:objective data,psychological factors,and heuristic evaluation.The results of the first objective data analysis show that approximately 70%of visits to Zheripan and Qinching Online are through search,and the time spent on the homepage is short.In contrast,about 70%of visits to the 12348 Zhejiang Legal Network are direct visits,with users browsing multiple pages with a clear purpose.In terms of data provision methods,Zheripan provides two types of data in three formats,Qinching Online offers 28 types of data in five formats,and 12348 Zhejiang Legal Network provides one type of information in a single format.The second psychological factor analysis found that all three websites had a number of menus suitable for short-term cognitive capacity.However,only one of the sites had a layout that considered the user’s eye movement.Finally,the heuristic evaluation revealed that most of the evaluation criteria were not met.While the design is relatively simple and follows standards,feedback for users,error prevention,and help options were lacking.Moreover,the user-specific usability was low,and the systems remained at the information-providing level.Based on these findings,both short-term and long-term improvement measures for creating an interactive system beyond simple information disclosure are proposed.
文摘Nowadays,spatiotemporal information,positioning,and navigation services have become critical components of new infrastructure.Precise positioning technology is indispensable for determining spatiotemporal information and providing navigation services.
文摘Background: For nursing students, gathering social information is essential for understanding healthcare and social issues and developing critical thinking and decision-making skills. However, the choice of information sources varies by age and individual habits. With the widespread use of the internet, there are notable differences between younger and older generations in their reliance on the internet versus traditional media sources like newspapers and television. Given the wide age range and diverse backgrounds of nursing students, understanding generational differences in information-gathering methods is important for implementing effective education. Purpose: The purpose of this study is to identify how nursing students in different age groups obtain social information and to examine media usage trends by age group. Additionally, we aim to use the findings to provide insights into effective information dissemination methods in nursing education. Results: The results showed that nursing students in their teens to forties, regardless of gender, primarily relied on the internet as their main information source, with television playing a secondary role. In contrast, students in their fifties tended to obtain information more often from newspapers and television than from the internet. This highlights an age-related difference in preferred information sources, with older students showing a greater reliance on traditional media. Conclusions: This study demonstrates that nursing students use different information-gathering methods based on their age, suggesting a need to custo-mize information dissemination strategies in nursing education. Digital media may be more effective for younger students, while traditional media or printed materials might better serve older students. Educational institutions should consider these generational differences in media usage and adopt strategies that meet the diverse needs of their student populations.
Funding: Funded by the Yangtze River Delta Science and Technology Innovation Community Joint Research Project (2023CSJGG1600), the Natural Science Foundation of Anhui Province (2208085MF173), and the Wuhu "ChiZhu Light" Major Science and Technology Project (2023ZD01, 2023ZD03).
Abstract: As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly being used to detect 3D environmental information, significantly advancing perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular and binocular detection according to their input types. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to how they process point cloud features. Multimodal fusion algorithms are further divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods based on the types of sensors involved. Finally, the paper identifies five key future research directions in this field, aiming to provide insights for researchers working on multimodal fusion-based object detection and to encourage broader attention to its research and application.
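To make the Camera-LiDAR fusion category more concrete, the following is a minimal, hypothetical sketch of point-level (early) fusion: LiDAR points are projected into the image plane with assumed calibration matrices `K` (intrinsics) and `T` (LiDAR-to-camera extrinsics), and each visible point is decorated with the RGB value at its projected pixel. The function name, arguments, and calibration conventions are illustrative assumptions and do not reproduce the interface of any method surveyed in the paper.

```python
# Illustrative sketch only: point-level Camera-LiDAR fusion under an assumed
# pinhole camera model with known intrinsics K and extrinsics T.
import numpy as np

def fuse_lidar_with_image(points_xyz, image, K, T):
    """Append image colors to LiDAR points that project inside the image.

    points_xyz: (N, 3) LiDAR points in the LiDAR frame.
    image:      (H, W, 3) RGB image.
    K:          (3, 3) camera intrinsic matrix.
    T:          (4, 4) homogeneous LiDAR-to-camera transform.
    Returns an (M, 6) array of [x, y, z, r, g, b] for visible points.
    """
    n = points_xyz.shape[0]
    # Transform points into the camera frame.
    homo = np.hstack([points_xyz, np.ones((n, 1))])            # (N, 4)
    cam = (T @ homo.T).T[:, :3]                                 # (N, 3)
    cam = cam[cam[:, 2] > 0]                                    # keep points in front of the camera
    # Project with the pinhole model: u = fx*X/Z + cx, v = fy*Y/Z + cy.
    uvw = (K @ cam.T).T
    uv = (uvw[:, :2] / uvw[:, 2:3]).round().astype(int)
    h, w = image.shape[:2]
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    colors = image[uv[inside, 1], uv[inside, 0]]                # sample RGB at (v, u)
    return np.hstack([cam[inside], colors.astype(float)])
```

In practice, surveyed methods fuse learned features rather than raw RGB values and differ in whether fusion happens at the point, proposal, or decision level; this sketch only illustrates the geometric alignment step that point-level approaches share.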
Abstract: Background: The availability of essential medicines and medical supplies is crucial for effectively delivering healthcare services. In Zambia, the Logistics Management Information System (LMIS) is a key tool for managing the supply chain of these commodities. This study aimed to evaluate the effectiveness of LMIS in ensuring the availability of essential medicines and medical supplies in public hospitals in the Copperbelt Province of Zambia. Materials and Methods: From February to April 2022, a cross-sectional study was conducted in 12 public hospitals across the Copperbelt Province. Data were collected using structured questionnaires, checklists, and stock control cards. The study assessed LMIS availability, training, and knowledge among pharmacy personnel, as well as data accuracy, product availability, and order fill rates. Descriptive statistics were used to analyse the data. Results: All surveyed hospitals had implemented LMIS and used eLMIS as the primary system. Only 47% and 48% of pharmacy personnel had received training in eLMIS and the Essential Medicines Logistics Improvement Program (EMLIP), respectively. Most personnel demonstrated good knowledge of LMIS, with 77.7% able to log in to eLMIS Facility Edition, 76.6% able to locate stock control cards in the system, and 78.7% able to perform transactions. However, data accuracy between physical and electronic records varied from 0% to 60%, and product availability ranged from 50% to 80%. Order fill rates from the Zambia Medicines and Medical Supplies Agency (ZAMMSA) were consistently below 30%, and discrepancies were observed between physical stock counts and eLMIS records. Conclusion: Most hospitals in the Copperbelt Province of Zambia have implemented LMIS, but challenges such as low training levels, data inaccuracies, low product availability, and poor order fill rates persist. Addressing these issues requires a comprehensive approach, including capacity building, data quality improvement, supply chain coordination, and investment in infrastructure and human resources. Strengthening LMIS effectiveness is crucial for improving healthcare delivery and patient outcomes in Zambia.
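As a rough illustration of how the reported supply-chain indicators can be computed, the snippet below sketches an order fill rate (quantity received versus quantity ordered) and a record accuracy measure (agreement between physical counts and eLMIS balances). The variable names and exact indicator definitions are assumptions made for illustration; the study's data collection instruments may define these indicators differently.

```python
# Hypothetical sketch of two LMIS indicators mentioned in the abstract;
# the definitions used in the study itself may differ.

def order_fill_rate(quantity_ordered, quantity_received):
    """Share of the ordered quantity actually supplied, as a percentage."""
    if quantity_ordered == 0:
        return 0.0
    return 100.0 * quantity_received / quantity_ordered

def record_accuracy(physical_counts, elmis_balances):
    """Percentage of items whose physical count matches the eLMIS balance."""
    pairs = list(zip(physical_counts, elmis_balances))
    if not pairs:
        return 0.0
    matches = sum(1 for physical, electronic in pairs if physical == electronic)
    return 100.0 * matches / len(pairs)

# Example: 120 units received against 400 ordered gives a 30% fill rate;
# 3 of 5 stock cards agreeing with eLMIS gives 60% record accuracy.
print(order_fill_rate(400, 120))                            # 30.0
print(record_accuracy([10, 5, 0, 8, 2], [10, 5, 3, 8, 7]))  # 60.0
```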
Abstract: BACKGROUND Stress ulcers are common complications in critically ill patients, with a higher incidence observed in older patients following gastrointestinal surgery. This study aimed to develop and evaluate the effectiveness of a multi-modal intervention protocol to prevent stress ulcers in this high-risk population. AIM To assess the impact of a multi-modal intervention on preventing stress ulcers in older intensive care unit (ICU) patients postoperatively. METHODS A randomized controlled trial was conducted involving critically ill patients (aged ≥ 65 years) admitted to the ICU after gastrointestinal surgery. Patients were randomly assigned to either the intervention group, which received a multi-modal stress ulcer prevention protocol, or the control group, which received standard care. The primary outcome measure was the incidence of stress ulcers. Secondary outcomes included ulcer healing time, complication rates, and length of hospital stay. RESULTS A total of 200 patients (100 in each group) were included in this study. The intervention group exhibited a significantly lower incidence of stress ulcers than the control group (15% vs 30%, P < 0.01). Additionally, the intervention group demonstrated shorter ulcer healing times (mean 5.2 vs 7.8 days, P < 0.05), lower complication rates (10% vs 22%, P < 0.05), and a reduced length of hospital stay (mean 12.3 vs 15.7 days, P < 0.05). CONCLUSION This multi-modal intervention protocol significantly reduced the incidence of stress ulcers and improved clinical outcomes in critically ill older patients after gastrointestinal surgery. This comprehensive approach may provide a valuable strategy for managing high-risk populations in intensive care settings.
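To show how the headline incidence figures relate to the size of the treatment effect, the following is a minimal sketch that derives the relative risk, absolute risk reduction, and number needed to treat from the reported 15 versus 30 ulcers per 100 patients. These derived quantities are not reported in the abstract; the arithmetic below is purely illustrative.

```python
# Illustrative arithmetic on the reported incidence figures (15 vs 30 stress
# ulcers per 100 patients); these derived measures are not part of the study's
# reported results.
events_intervention, n_intervention = 15, 100
events_control, n_control = 30, 100

risk_intervention = events_intervention / n_intervention    # 0.15
risk_control = events_control / n_control                    # 0.30

relative_risk = risk_intervention / risk_control              # 0.50
absolute_risk_reduction = risk_control - risk_intervention    # 0.15
number_needed_to_treat = 1 / absolute_risk_reduction          # about 6.7 patients

print(f"RR = {relative_risk:.2f}, ARR = {absolute_risk_reduction:.0%}, "
      f"NNT = {number_needed_to_treat:.1f}")
```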