Central nervous system (CNS) axons fail to regenerate following brain or spinal cord injury (SCI), which typically leads to permanent neurological deficits. Peripheral nervous system axons, however, can regenerate following injury. Understanding the mechanisms that underlie this difference is key to developing treatments for CNS neurological diseases and injuries characterized by axonal damage. To initiate repair after peripheral nerve injury, dorsal root ganglion (DRG) neurons mobilize a pro-regenerative gene expression program, which facilitates axon outgrowth.
Objective: To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods: Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Performance was further assessed through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results: A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, and F1 score = 0.933). It outperformed its ablated variants (model component ablation and data ablation) as well as various traditional machine learning models. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion: The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
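The F1 score reported above is the harmonic mean of precision and recall, so it can be cross-checked from the other two figures. The following is a minimal sketch of that arithmetic only; the function name `f1_from_precision_recall` is hypothetical and is not taken from the study.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        # Degenerate case: no true positives at all.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract above.
reported_precision = 0.937
reported_recall = 0.929

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 3))  # prints 0.933
```

The recomputed value agrees with the reported F1 score of 0.933 to three decimal places.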
Multi-modal Named Entity Recognition (MNER) aims to better identify meaningful textual entities by integrating information from images. Previous work has focused on extracting visual semantics at a fine-grained level, or on obtaining entity-related external knowledge from knowledge bases or Large Language Models (LLMs). However, these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches. In this paper, we present MMAVK, a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion, which leverages the Multi-modal Large Language Model (MLLM) as an implicit knowledge base. It also extracts vision-based auxiliary knowledge from the image for more accurate and effective recognition. Specifically, we propose vision-based auxiliary knowledge generation, which uses target-specific prompts to guide the MLLM to extract external knowledge derived exclusively from images to aid entity recognition, thus avoiding the redundant recognition and cognitive confusion caused by processing image-text pairs simultaneously. Furthermore, we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word embedding produced by the transformer-based encoder. Extensive experimental results demonstrate that MMAVK matches or outperforms the state-of-the-art methods on the two classical MNER datasets, even though the large models it employs have significantly fewer parameters than those of other baselines.
Multi-modal knowledge graph completion (MMKGC) aims to complete missing entities or relations in multi-modal knowledge graphs, thereby discovering more previously unknown triples. Due to the continuous growth of data and knowledge and the limitations of data sources, the visual knowledge within the knowledge graphs is generally of low quality, and some entities lack a visual modality altogether. Nevertheless, previous studies of MMKGC have primarily focused on how to facilitate modality interaction and fusion while neglecting the problems of low modality quality and missing modalities. In this case, mainstream MMKGC models only use pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion, which inevitably suffers from problems such as error propagation and increased uncertainty. To address these problems, we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation (MMCSD). Specifically, we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality. Moreover, we design multi-level visual semantic extraction and entity description generation, thereby further extracting entity semantics from structural triples and visual images. Meanwhile, we train a variational multi-modal auto-encoder and utilize a pre-trained multi-modal language model to complement the missing visual features. We conducted experiments on FB15K-237 and DB13K, and the results showed that MMCSD can effectively perform MMKGC and achieve state-of-the-art performance.
Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of the patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, combining data from multiple modalities is difficult because of disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities because they lack the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. First, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Model pruning, quantization, and knowledge distillation were applied to reduce the parameter count and increase inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, hardware-aware acceleration strategies, including the use of TensorRT and ONNX-based model compression, were implemented to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising from modal heterogeneity during fusion, while also capturing shared information across modalities, this paper proposes a Multi-modal Pre-synergistic Entity Alignment model based on Cross-modal Mutual Information Strategy Optimization (MPSEA). The model first employs independent encoders to process multi-modal features, including text, images, and numerical values. Next, a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information. This pre-fusion strategy enables unified perception of heterogeneous modalities at the model's initial stage, reducing discrepancies during the fusion process. Finally, using cross-modal deep perception reinforcement learning, the model achieves adaptive multilevel feature fusion between modalities, which supports learning more effective alignment strategies. Extensive experiments on multiple public datasets show that, compared with existing state-of-the-art methods, the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset, and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset. These results confirm the effectiveness of the proposed model.
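For readers unfamiliar with the two ranking metrics quoted above, the sketch below shows how Hits@k and mean reciprocal rank (MRR) are conventionally computed from the rank assigned to each correct alignment. It is an illustration of the standard definitions only, not the evaluation code of the paper; the function names and the `example_ranks` values are hypothetical.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct entity is ranked within the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all queries (ranks are 1-based)."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Hypothetical 1-based ranks of the correct entity for five test queries.
example_ranks = [1, 3, 1, 2, 10]

hits1 = hits_at_k(example_ranks, k=1)      # 2 of 5 queries ranked first
mrr = mean_reciprocal_rank(example_ranks)  # (1 + 1/3 + 1 + 1/2 + 1/10) / 5
print(f"Hits@1 = {hits1:.3f}, MRR = {mrr:.3f}")
```

Both metrics lie between 0 and 1 and are higher when correct entities are ranked nearer the top; Hits@1 counts only exact top-ranked matches, while MRR also gives partial credit to near misses.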
As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly being used to detect 3D environmental information, significantly advancing the development of perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular and binocular detection according to input type. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to how they process point cloud features. Additionally, multimodal fusion algorithms are divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods according to the types of sensors involved. Furthermore, the paper identifies five key future research directions in this field, aiming to provide insights for researchers working on multimodal fusion-based object detection algorithms and to encourage broader attention to the research and application of these methods.
BACKGROUND Stress ulcers are common complications in critically ill patients, with a higher incidence observed in older patients following gastrointestinal surgery. This study aimed to develop and evaluate the effectiveness of a multi-modal intervention protocol to prevent stress ulcers in this high-risk population. AIM To assess the impact of a multi-modal intervention on preventing stress ulcers in older intensive care unit (ICU) patients postoperatively. METHODS A randomized controlled trial involving critically ill patients (aged ≥ 65 years) admitted to the ICU after gastrointestinal surgery was conducted. Patients were randomly assigned to either the intervention group, which received a multi-modal stress ulcer prevention protocol, or the control group, which received standard care. The primary outcome measure was the incidence of stress ulcers. The secondary outcomes included ulcer healing time, complication rates, and length of hospital stay. RESULTS A total of 200 patients (100 in each group) were included in this study. The intervention group exhibited a significantly lower incidence of stress ulcers than the control group (15% vs 30%, P < 0.01). Additionally, the intervention group demonstrated shorter ulcer healing times (mean 5.2 vs 7.8 days, P < 0.05), lower complication rates (10% vs 22%, P < 0.05), and reduced length of hospital stay (mean 12.3 vs 15.7 days, P < 0.05). CONCLUSION This multi-modal intervention protocol significantly reduced the incidence of stress ulcers and improved clinical outcomes in critically ill older patients after gastrointestinal surgery. This comprehensive approach may provide a valuable strategy for managing high-risk populations in intensive care settings.
With the advent of the next-generation Air Traffic Control (ATC) system, there is growing interest in using Artificial Intelligence (AI) techniques to enhance Situation Awareness (SA) for ATC Controllers (ATCOs), i.e., Intelligent SA (ISA). However, the existing AI-based SA approaches often rely on unimodal data and lack a comprehensive description and benchmark of the ISA tasks utilizing multi-modal data for real-time ATC environments. To address this gap, by analyzing the situation awareness procedure of the ATCOs, the ISA task is refined to the processing of the two primary elements, i.e., spoken instructions and flight trajectories. Subsequently, the ISA is further formulated into Controlling Intent Understanding (CIU) and Flight Trajectory Prediction (FTP) tasks. For the CIU task, an innovative automatic speech recognition and understanding framework is designed to extract the controlling intent from unstructured and continuous ATC communications. For the FTP task, single- and multi-horizon FTP approaches are investigated to support the high-precision prediction of the situation evolution. A total of 32 unimodal/multi-modal advanced methods with extensive evaluation metrics are introduced to conduct the benchmarks on the real-world multi-modal ATC situation dataset. Experimental results demonstrate the effectiveness of AI-based techniques in enhancing ISA for the ATC environment.
The multi-modal characteristics of mineral particles play a pivotal role in enhancing classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, the existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting the classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. First, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from the two types of polarized images, which ensures feature alignment. Next, a cross-polarized intra-modal feature fusion module refines the spatiotemporal features extracted from the cross-polarized sequence images. Finally, the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision. Quantitative and qualitative experimental results indicate that, compared with the current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in mineral particle multi-modal feature learning and on four classification evaluation metrics. It also demonstrates better stability than the existing models.
Acute Bilirubin Encephalopathy (ABE) is a significant threat to neonates, leading to disability and high mortality rates. Detecting and treating ABE promptly is important to prevent further complications and long-term issues. Recent studies have explored ABE diagnosis. However, they often face limitations in classification due to reliance on a single modality of Magnetic Resonance Imaging (MRI). To tackle this problem, the authors propose a Tri-M2MT model for precise ABE detection using tri-modality MRI scans. The scans include T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), and apparent diffusion coefficient maps to obtain in-depth information. Initially, the tri-modality MRI scans are collected and preprocessed using an Advanced Gaussian Filter for noise reduction and Z-score normalisation for data standardisation. An Advanced Capsule Network is then used to extract relevant features, with the Snake Optimization Algorithm selecting optimal features based on feature correlation, with the aim of minimising complexity and enhancing detection accuracy. Furthermore, a multi-transformer approach is used to fuse features and identify feature correlations effectively. Finally, accurate ABE diagnosis is achieved through a SoftMax layer. The performance of the proposed Tri-M2MT model is evaluated across various metrics, including accuracy, specificity, sensitivity, F1-score, and ROC curve analysis, and the proposed methodology provides better performance than existing methodologies.
Objective: To explore the effectiveness of multi-modal teaching based on an online case library in the education of gene methylation combined with spiral computed tomography (CT) screening for pulmonary ground-glass opacity (GGO) nodules. Methods: From October 2023 to April 2024, 66 medical imaging students were selected and randomly divided into a control group and an observation group, each with 33 students. The control group received traditional lecture-based teaching, while the observation group was taught using a multi-modal teaching approach based on an online case library. Performance on assessments and teaching quality were compared between the two groups. Results: The observation group achieved higher scores in theoretical and practical knowledge than the control group (P < 0.05). Additionally, the teaching quality scores were significantly higher in the observation group (P < 0.05). Conclusion: Implementing multi-modal teaching based on an online case library for pulmonary GGO nodule screening with gene methylation combined with spiral CT can enhance students' knowledge acquisition, improve teaching quality, and have significant clinical application value.
Personalized outfit recommendation has emerged as a hot research topic in the fashion domain. However, existing recommendation methods do not fully exploit user style preferences. Typically, users prefer particular styles, such as casual or athletic, and consider attributes like color and texture when selecting outfits. To achieve personalized outfit recommendations in line with user style preferences, this paper proposes a personal style guided outfit recommendation method with multi-modal fashion compatibility modeling, termed PSGNet. Firstly, a style classifier is designed to categorize fashion images of various clothing types and attributes into distinct style categories. Secondly, a personal style prediction module extracts user style preferences by analyzing historical data. Then, to address the limitations of single-modal representations and enhance fashion compatibility, both fashion images and text data are leveraged to extract multi-modal features. Finally, PSGNet integrates these components through Bayesian personalized ranking (BPR) to unify personal style and fashion compatibility, where the former serves as personal style features and guides the output of the outfit recommendation tailored to the target user. Extensive experiments on large-scale datasets demonstrate that the proposed model is efficient for personalized outfit recommendation.
Ensuring the provision of accessible, affordable, and high-quality public services to all individuals aligns with one of the paramount aims of the United Nations' Sustainable Development Goals (SDGs). In the face of escalating urbanization and a dwindling rural populace in China, reconstructing rural settlements to enhance public service accessibility has become a fundamental strategy for achieving the SDGs in rural areas. However, few studies have examined the optimal methods for rural settlement reconstruction that ensure accessible and equitable public services while considering multiple existing facilities and service provisions. This paper focuses on rural settlement reconstruction in the context of the SDGs, employing an inverted MCLP-CC (maximal coverage location problem for complementary coverage) model to identify optimal rural settlements and a rank-based method for their relocation. Conducted in Changyuan, a county-level city in Henan Province, China, this study used the MH3SFCA (modified Huff 3-step floating catchment area) and the spatial Lorenz curve method and observed significant enhancements in both accessibility and equity following rural settlement reconstruction. Remarkably, these improvements were achieved without the addition of new facilities: for educational, medical, and cultural and sports facilities, respectively, accessibility increased by 44.21%, 4.97%, and 3.11%; Gini coefficients decreased by 19.53%, 1.64%, and 3.18%; and Ricci-Schutz coefficients decreased by 21.09%, 2.09%, and 4.33%. These results indicate that rural settlement reconstruction can bolster the accessibility and equity of public services by leveraging existing facilities. This paper provides a new framework for stakeholders to better reconstruct rural settlements and promote sustainable development in rural areas in China.
Since the 1970s, a series of international and national sources have supported the principle of accessibility, which has slowly become a statutory norm and a legislative obligation. Each country has implemented accessibility through its own distinct policy. But beyond making a place or an activity accessible, informing people about what is accessible is equally important, and this has not really taken off. Indeed, for disabled people, the difficulty lies not only with access to places and the use of resources, but also with the visibility of these resources. This means that information concerning accessibility has to be disclosed and provided effectively to disabled people, those involved with them, and the relevant institutions. In different countries all over the world, many labels and pictograms have been created for this purpose and give information relating to accessibility. Using a socio-historical approach, we will present and analyze the different types of icons, symbols, pictograms, and labels that have been put in place around the world and in France: what are they used for, and for whom are they made? We will show that they are pointers that reflect, firstly, the diversity and range within the target group concerned by accessibility, and secondly, the evolution of accessibility as a dynamic and ecological principle.
BACKGROUND Hepatocellular carcinoma (HCC) is notorious for its aggressive progression and dismal prognosis, with chromatin accessibility dynamics emerging as pivotal yet poorly understood drivers. AIM To dissect how multilayered chromatin regulation sustains oncogenic transcription and tumor-stroma crosstalk in HCC, using combined multiomics single-cell analysis. METHODS We integrated single-cell RNA sequencing and paired single-cell assay for transposase-accessible chromatin with sequencing data of HCC samples, complemented by bulk RNA sequencing validation across The Cancer Genome Atlas, Liver Cancer Institute, and GSE25907 cohorts. Cell type-specific chromatin architectures were resolved via ArchR, with regulatory hubs identified through peak-to-gene linkages and coaccessibility networks. Functional validation employed A485-mediated histone 3 lysine 27 acetylation suppression and small interfering RNA targeting DGAT1. RESULTS Malignant hepatocytes exhibited expanded chromatin accessibility profiles, characterized by increased numbers of accessible peaks and larger physical regions despite reduced peak intensity. Enhancer-like peaks were enriched in malignant regulation, forming long-range hubs. Eighteen enhancer-like peak-related genes showed tumor-specific overexpression and diagnostic accuracy, correlating with poor prognosis. Intercellular coaccessibility analysis revealed tumor-stroma symbiosis via shared chromatin states. Pharmacological histone 3 lysine 27 acetylation inhibition paradoxically downregulated DGAT1, the hub gene most strongly regulated by chromatin accessibility. DGAT1 knockdown suppressed cell proliferation. CONCLUSION Multilayered chromatin reprogramming sustains HCC progression through tumor-stroma crosstalk and DGAT1-related oncogenic transcription, defining targetable epigenetic vulnerabilities.
[Objective] Accurate prediction of tomato growth height is crucial for optimizing production environments in smart farming. However, current prediction methods predominantly rely on empirical, mechanistic, or learning-based models that utilize either image data or environmental data. These methods fail to fully leverage multi-modal data to capture the diverse aspects of plant growth comprehensively. [Methods] To address this limitation, a two-stage phenotypic feature extraction (PFE) model based on the deep learning algorithms of recurrent neural network (RNN) and long short-term memory (LSTM) was developed. The model integrated environmental and plant information to provide a holistic understanding of the growth process and employed phenotypic and temporal feature extractors to capture both types of features comprehensively. This enabled a deeper understanding of the interaction between tomato plants and their environment, ultimately leading to highly accurate predictions of growth height. [Results and Discussions] The experimental results showed the model's effectiveness: when predicting the next two days based on the past five days, the PFE-based RNN and LSTM models achieved mean absolute percentage errors (MAPE) of 0.81% and 0.40%, respectively, which were significantly lower than the 8.00% MAPE of the large language model (LLM) and the 6.72% MAPE of the Transformer-based model. In longer-term predictions (4 days ahead from the past 10 days, and 12 days ahead from the past 30 days), the PFE-RNN model continued to outperform the other two baseline models, with MAPE of 2.66% and 14.05%, respectively. [Conclusions] The proposed method, which leverages phenotypic-temporal collaboration, shows great potential for intelligent, data-driven management of tomato cultivation, making it a promising approach for enhancing the efficiency and precision of smart tomato planting management.
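The comparisons above are all stated in terms of mean absolute percentage error. As a reminder of what that metric measures, here is a minimal sketch of the standard MAPE definition; the function name `mape` and the height values are hypothetical and are not data from the study.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, since each error is
    expressed relative to the actual value.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical plant heights in centimetres over a two-day horizon.
actual_heights = [100.0, 200.0]
predicted_heights = [99.0, 202.0]

error = mape(actual_heights, predicted_heights)
print(f"MAPE = {error:.2f}%")  # each prediction is off by 1% of the actual value
```

Because each error is scaled by the true value, MAPE lets predictions for short and tall plants be compared on the same percentage scale.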
Objective: The scarcity of healthcare resources and inadequate access to medical services in rural and remote areas are pervasive challenges many countries face, particularly in the developing world. Telemedicine, with its capacity to overcome geographical barriers and provide patients with real-time medical services, has shown considerable potential in addressing these issues, attracting widespread attention. Compact medical communities and family doctor systems play important roles in improving healthcare accessibility. However, despite the critical nature of patients' perceptions of healthcare accessibility, research in this domain is sparse. This study aimed to explore the impact of telemedicine on rural residents' perceived healthcare accessibility in China, analyze the mechanisms underpinning this relationship, and elucidate the roles of compact medical communities and the family doctor system. Methods: Survey data from 3311 rural residents were analyzed using a probit model, instrumental variables, and subgroup regression analyses to ascertain causal effects, perform heterogeneity analysis, examine mechanisms, and ascertain the robustness of the findings. Results: Telemedicine significantly enhanced rural residents' perceived healthcare accessibility, with particularly notable benefits for those in sparsely populated areas, in regions with high-speed internet access, within the purview of compact healthcare consortiums, and with access to family doctor services. Furthermore, telemedicine improved rural residents' perceived healthcare accessibility by encouraging the use of primary care services. Conclusion: Telemedicine in China has played a significant role in improving the perceived healthcare accessibility among rural residents and aiding in the reduction of disparities in accessibility across different demographic groups. This is consistent with the broader objective of achieving universal health coverage. However, the efficacy of telemedicine in enhancing healthcare accessibility is contingent upon certain preconditions. Policymakers must confront local infrastructure challenges, particularly regarding internet connectivity, when expanding telemedicine services to ensure their effective operation. The synergistic interaction observed between telemedicine, the family doctor system, and compact medical communities highlights the importance of integrating telemedicine into existing healthcare systems. Such integration could enhance collaboration with current healthcare frameworks, ensuring the provision of safe, accessible, and affordable healthcare services, and promoting the health and well-being of local populations.
Funding: Supported by the Canada Foundation for Innovation (Project #44220); the Natural Sciences and Engineering Research Council of Canada (RGPIN-2024-03986); the Michael Smith Foundation for Health Research BC; the financial support of Health Canada, through the Canada Brain Research Fund, an innovative partnership between the Government of Canada (through Health Canada) and Brain Canada Foundation; the Azrieli Foundation; and a Canadian Institutes of Health Research (CIHR) Canada Graduate Scholarship–Master’s Award.
Abstract: Central nervous system (CNS) axons fail to regenerate following brain or spinal cord injury (SCI), which typically leads to permanent neurological deficits. Peripheral nervous system axons, however, can regenerate following injury. Understanding the mechanisms that underlie this difference is key to developing treatments for CNS neurological diseases and injuries characterized by axonal damage. To initiate repair after peripheral nerve injury, dorsal root ganglion (DRG) neurons mobilize a pro-regenerative gene expression program, which facilitates axon outgrowth.
Funding: Construction Program of the Key Discipline of State Administration of Traditional Chinese Medicine of China (ZYYZDXK-2023069); Research Project of Shanghai Municipal Health Commission (2024QN018); Shanghai University of Traditional Chinese Medicine Science and Technology Development Program (23KFL005).
Abstract: Objective To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Further performance assessment was conducted through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, and F1 score = 0.933). The AWMDF model outperformed the variants tested in the model ablation and data ablation experiments as well as various traditional machine learning models. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
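To make the idea of decision-level fusion with adaptive weights concrete, here is a minimal sketch. The abstract does not say how AWMDF computes its weights, so everything below is an assumption for illustration: the function names `softmax`, `fuse_predictions`, and `f1_score` are hypothetical, and the per-modality confidence scores are passed in by hand rather than learned.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(modality_probs, modality_scores):
    """Weighted average of per-modality class probabilities.

    modality_probs: one probability vector per modality (hypothetical inputs).
    modality_scores: one confidence score per modality (assumption: a trained
    model would learn these; here they are supplied directly).
    """
    weights = softmax(modality_scores)
    n_classes = len(modality_probs[0])
    fused = [0.0] * n_classes
    for weight, probs in zip(weights, modality_probs):
        for k, p in enumerate(probs):
            fused[k] += weight * p
    return fused


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Two modalities with equal scores reduce to plain averaging.
    print(fuse_predictions([[0.9, 0.1], [0.2, 0.8]], [0.0, 0.0]))
    # F1 recomputed from the precision and recall reported in the abstract.
    print(round(f1_score(0.937, 0.929), 3))
```

With equal scores the fusion reduces to the simple averaging baseline mentioned in the abstract; unequal scores shift weight toward the more trusted modality. As a consistency check, the F1 score recomputed from the reported precision (0.937) and recall (0.929) rounds to the reported 0.933.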
Funding: Funded by Research Project, grant number BHQ090003000X03.
Abstract: Multi-modal Named Entity Recognition (MNER) aims to better identify meaningful textual entities by integrating information from images. Previous work has focused on extracting visual semantics at a fine-grained level, or on obtaining entity-related external knowledge from knowledge bases or Large Language Models (LLMs). However, these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches. In this paper, we present MMAVK, a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion, which leverages the Multi-modal Large Language Model (MLLM) as an implicit knowledge base. It also extracts vision-based auxiliary knowledge from the image for more accurate and effective recognition. Specifically, we propose vision-based auxiliary knowledge generation, which uses target-specific prompts to guide the MLLM to extract external knowledge derived exclusively from images to aid entity recognition, thus avoiding the redundant recognition and cognitive confusion caused by processing image-text pairs simultaneously. Furthermore, we employ a word-level multi-modal fusion mechanism to fuse the extracted external knowledge with each word embedding produced by the transformer-based encoder. Extensive experimental results demonstrate that MMAVK outperforms or equals the state-of-the-art methods on the two classical MNER datasets, even when the large models employed have significantly fewer parameters than other baselines.
Funding: Funded by Research Project, grant number BHQ090003000X03.
Abstract: Multi-modal knowledge graph completion (MMKGC) aims to complete missing entities or relations in multi-modal knowledge graphs, thereby discovering previously unknown triples. Due to the continuous growth of data and knowledge and the limitations of data sources, the visual knowledge within knowledge graphs is generally of low quality, and some entities lack the visual modality altogether. Nevertheless, previous studies of MMKGC have primarily focused on how to facilitate modality interaction and fusion while neglecting the problems of low modality quality and missing modalities. In this case, mainstream MMKGC models only use pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion, which inevitably suffers from problems such as error propagation and increased uncertainty. To address these problems, we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation (MMCSD). Specifically, we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality. Moreover, we design multi-level visual semantic extraction and entity description generation, thereby further extracting entity semantics from structural triples and visual images. Meanwhile, we train a variational multi-modal auto-encoder and utilize a pre-trained multi-modal language model to complement the missing visual features. We conducted experiments on FB15K-237 and DB13K, and the results showed that MMCSD can effectively perform MMKGC and achieve state-of-the-art performance.
Funding: Supported by the Deanship of Research and Graduate Studies at King Khalid University under Small Research Project grant number RGP1/139/45.
Abstract: Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of a patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, combining data from multiple modalities is difficult because of disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities and lack the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across various modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. First, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Methods such as model pruning, quantization, and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, hardware-aware acceleration strategies, including the use of TensorRT and ONNX-based model compression, have been implemented to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
Funding: Partially supported by the National Natural Science Foundation of China under Grants 62471493 and 62402257 (for conceptualization and investigation); the Natural Science Foundation of Shandong Province, China under Grants ZR2023LZH017, ZR2024MF066, and 2023QF025 (for formal analysis and validation); the Open Foundation of Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences) under Grant 2023ZD010 (for methodology and model design); and the Russian Science Foundation (RSF) Project under Grant 22-71-10095-P (for validation and results verification).
Abstract: To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising from modal heterogeneity during fusion, while also capturing shared information across modalities, this paper proposes a Multi-modal Pre-synergistic Entity Alignment model based on Cross-modal Mutual Information Strategy Optimization (MPSEA). The model first employs independent encoders to process multi-modal features, including text, images, and numerical values. Next, a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information. This pre-fusion strategy enables unified perception of heterogeneous modalities at the model’s initial stage, reducing discrepancies during the fusion process. Finally, using cross-modal deep perception reinforcement learning, the model achieves adaptive multilevel feature fusion between modalities, which supports learning more effective alignment strategies. Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset, and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset, compared to existing state-of-the-art methods. These results confirm the effectiveness of the proposed model.
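For readers unfamiliar with the two reported metrics, the following sketch shows the standard definitions of Hits@k and mean reciprocal rank (MRR). It assumes each test entity has already been assigned the 1-based rank of its correct counterpart in the candidate list; the function names and the example ranks are hypothetical and are not taken from the paper.

```python
def hits_at_k(ranks, k=1):
    """Fraction of queries whose correct match is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    # Hypothetical ranks of the correct entity for four alignment queries.
    example_ranks = [1, 3, 2, 1]
    print(hits_at_k(example_ranks, k=1))
    print(mean_reciprocal_rank(example_ranks))
```

Hits@1 rewards only exact top-ranked matches, while MRR also gives partial credit to near misses, which is why the two gains reported for a dataset can differ.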
Funding: Funded by the Yangtze River Delta Science and Technology Innovation Community Joint Research Project (2023CSJGG1600); the Natural Science Foundation of Anhui Province (2208085MF173); and the Wuhu “ChiZhu Light” Major Science and Technology Project (2023ZD01, 2023ZD03).
Abstract: As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly being used to detect 3D environmental information, significantly advancing the development of perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. Image-based detection methods are categorized into monocular detection and binocular detection according to input type. Point cloud-based detection methods are classified into projection-based, voxel-based, point cluster-based, pillar-based, and graph structure-based approaches according to the technical pathway used to process point cloud features. Additionally, multimodal fusion algorithms are divided into Camera-LiDAR fusion, Camera-Radar fusion, Camera-LiDAR-Radar fusion, and other sensor fusion methods according to the types of sensors involved. Furthermore, the paper identifies five key future research directions in this field, aiming to provide insights for researchers engaged in multimodal fusion-based object detection algorithms and to encourage broader attention to the research and application of multimodal fusion-based object detection.
Abstract: BACKGROUND Stress ulcers are common complications in critically ill patients, with a higher incidence observed in older patients following gastrointestinal surgery. This study aimed to develop and evaluate the effectiveness of a multi-modal intervention protocol to prevent stress ulcers in this high-risk population. AIM To assess the impact of a multi-modal intervention on preventing stress ulcers in older intensive care unit (ICU) patients postoperatively. METHODS A randomized controlled trial involving critically ill patients (aged ≥ 65 years) admitted to the ICU after gastrointestinal surgery was conducted. Patients were randomly assigned to either the intervention group, which received a multi-modal stress ulcer prevention protocol, or the control group, which received standard care. The primary outcome measure was the incidence of stress ulcers. The secondary outcomes included ulcer healing time, complication rates, and length of hospital stay. RESULTS A total of 200 patients (100 in each group) were included in this study. The intervention group exhibited a significantly lower incidence of stress ulcers than the control group (15% vs 30%, P < 0.01). Additionally, the intervention group demonstrated shorter ulcer healing times (mean 5.2 vs 7.8 days, P < 0.05), lower complication rates (10% vs 22%, P < 0.05), and reduced length of hospital stay (mean 12.3 vs 15.7 days, P < 0.05). CONCLUSION This multi-modal intervention protocol significantly reduced the incidence of stress ulcers and improved clinical outcomes in critically ill older patients after gastrointestinal surgery. This comprehensive approach may provide a valuable strategy for managing high-risk populations in intensive care settings.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62371323, 62401380, U2433217, U2333209, and U20A20161); the Natural Science Foundation of Sichuan Province, China (No. 2025ZNSFSC1476); the Sichuan Science and Technology Program, China (Nos. 2024YFG0010 and 2024ZDZX0046); the Institutional Research Fund from Sichuan University (No. 2024SCUQJTX030); and the Open Fund of Key Laboratory of Flight Techniques and Flight Safety, CAAC (No. GY2024-01A).
Abstract: With the advent of the next-generation Air Traffic Control (ATC) system, there is growing interest in using Artificial Intelligence (AI) techniques to enhance Situation Awareness (SA) for ATC Controllers (ATCOs), i.e., Intelligent SA (ISA). However, existing AI-based SA approaches often rely on unimodal data, and a comprehensive description and benchmark of ISA tasks that use multi-modal data for real-time ATC environments is lacking. To address this gap, the situation awareness procedure of the ATCOs is analyzed and the ISA task is refined to the processing of two primary elements, i.e., spoken instructions and flight trajectories. Subsequently, the ISA is further formulated into Controlling Intent Understanding (CIU) and Flight Trajectory Prediction (FTP) tasks. For the CIU task, an innovative automatic speech recognition and understanding framework is designed to extract the controlling intent from unstructured and continuous ATC communications. For the FTP task, single- and multi-horizon FTP approaches are investigated to support high-precision prediction of the situation evolution. A total of 32 advanced unimodal/multi-modal methods with extensive evaluation metrics are introduced to conduct the benchmarks on a real-world multi-modal ATC situation dataset. Experimental results demonstrate the effectiveness of AI-based techniques in enhancing ISA for the ATC environment.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62071315 and 62271336).
Abstract: The multi-modal characteristics of mineral particles play a pivotal role in enhancing classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, the existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting the classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. First, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from the two types of polarized images and ensure feature alignment. Next, a cross-polarized intra-modal feature fusion module refines the spatiotemporal features from the extracted features of the cross-polarized sequence images. Finally, the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision. Quantitative and qualitative experimental results indicate that, compared with the current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in mineral particle multi-modal feature learning and on four classification evaluation metrics. It also demonstrates better stability than the existing models.
Abstract: Acute Bilirubin Encephalopathy (ABE) is a significant threat to neonates, leading to disability and high mortality rates. Detecting and treating ABE promptly is important to prevent further complications and long-term issues. Recent studies have explored ABE diagnosis. However, they often face limitations in classification due to reliance on a single modality of Magnetic Resonance Imaging (MRI). To tackle this problem, the authors propose a Tri-M2MT model for precise ABE detection using tri-modality MRI scans. The scans include T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), and apparent diffusion coefficient maps to obtain in-depth information. Initially, the tri-modality MRI scans are collected and preprocessed using an Advanced Gaussian Filter for noise reduction and Z-score normalisation for data standardisation. An Advanced Capsule Network is then used to extract relevant features, with the Snake Optimization Algorithm selecting optimal features based on feature correlation, with the aim of minimising complexity and enhancing detection accuracy. Furthermore, a multi-transformer approach is used to fuse features and identify feature correlations effectively. Finally, accurate ABE diagnosis is achieved through a SoftMax layer. The performance of the proposed Tri-M2MT model is evaluated across various metrics, including accuracy, specificity, sensitivity, F1-score, and ROC curve analysis, and the proposed methodology provides better performance compared to existing methodologies.
Funding: Supported by the Autonomous Region Industry-Education Integration Project “Application of DNA Methylation Combined with Spiral CT in the Screening of Pulmonary Ground-Glass Nodules and AI Recognition Systems in Teaching Practice” (Project No. 2023210016) and the “Open Project of the State Key Laboratory of High Incidence Diseases in Central Asia” (Project No. SKL-HIDCA-2021-28).
Abstract: Objective: To explore the effectiveness of multi-modal teaching based on an online case library in the education of gene methylation combined with spiral computed tomography (CT) screening for pulmonary ground-glass opacity (GGO) nodules. Methods: From October 2023 to April 2024, 66 medical imaging students were selected and randomly divided into a control group and an observation group, each with 33 students. The control group received traditional lecture-based teaching, while the observation group was taught using a multi-modal teaching approach based on an online case library. Performance on assessments and teaching quality were compared between the two groups. Results: The observation group achieved higher scores in theoretical and practical knowledge compared to the control group (P < 0.05). Additionally, the teaching quality scores were significantly higher in the observation group (P < 0.05). Conclusion: Implementing multi-modal teaching based on an online case library for pulmonary GGO nodule screening with gene methylation combined with spiral CT can enhance students’ knowledge acquisition, improve teaching quality, and have significant clinical application value.
Funding: Shanghai Frontier Science Research Center for Modern Textiles, Donghua University, China; Open Project of Henan Key Laboratory of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry, China (No. IM202303); National Key Research and Development Program of China (No. 2019YFB1706300).
Abstract: Personalized outfit recommendation has emerged as a hot research topic in the fashion domain. However, existing recommendation methods do not fully exploit user style preferences. Typically, users prefer particular styles, such as casual and athletic styles, and consider attributes like color and texture when selecting outfits. To achieve personalized outfit recommendations in line with user style preferences, this paper proposes a personal style guided outfit recommendation method with multi-modal fashion compatibility modeling, termed PSGNet. Firstly, a style classifier is designed to categorize fashion images of various clothing types and attributes into distinct style categories. Secondly, a personal style prediction module extracts user style preferences by analyzing historical data. Then, to address the limitations of single-modal representations and enhance fashion compatibility, both fashion images and text data are leveraged to extract multi-modal features. Finally, PSGNet integrates these components through Bayesian personalized ranking (BPR) to unify personal style and fashion compatibility, where the former serves as personal style features and guides the output of the personalized outfit recommendation tailored to the target user. Extensive experiments on large-scale datasets demonstrate that the proposed model is efficient for personalized outfit recommendation.
Funding: Funded by the National Natural Science Foundation of China (Grants No. 42371433, U2443214); the National Key Project of High-Resolution Earth Observation System of China (Grant No. 80Y50G19900122/23); and the Foundation of Key Laboratory of Soil and Water Conservation on the Loess Plateau of Ministry of Water Resources (Grant No. WSCLP202301).
Abstract: Ensuring the provision of accessible, affordable, and high-quality public services to all individuals aligns with one of the paramount aims of the United Nations’ Sustainable Development Goals (SDGs). In the face of escalating urbanization and a dwindling rural populace in China, reconstructing rural settlements to enhance public service accessibility has become a fundamental strategy for achieving the SDGs in rural areas. However, few studies have examined the optimal methods for rural settlement reconstruction that ensure accessible and equitable public services while considering multiple existing facilities and service provisions. This paper focuses on rural settlement reconstruction in the context of the SDGs, employing an inverted MCLP-CC (maximal coverage location problem for complementary coverage) model to identify optimal rural settlements and a rank-based method for their relocation. Conducted in Changyuan, a county-level city in Henan Province, China, this study used the MH3SFCA (modified Huff 3-step floating catchment area) and the spatial Lorenz curve method and observed significant enhancements in both accessibility and equity following rural settlement reconstruction. Remarkably, these improvements were achieved without the addition of new facilities: accessibility increased by 44.21%, 4.97%, and 3.11%; Gini coefficients decreased by 19.53%, 1.64%, and 3.18%; and Ricci-Schutz coefficients decreased by 21.09%, 2.09%, and 4.33% for educational, medical, and cultural and sports facilities, respectively. These results indicate that rural settlement reconstruction can bolster the accessibility and equity of public services by leveraging existing facilities. This paper provides a new framework for stakeholders to better reconstruct rural settlements and promote sustainable development in rural areas in China.
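The two equity measures named in this abstract can be illustrated with a short sketch. It uses the plain, unweighted textbook definitions of the Gini coefficient and the Ricci-Schutz (Hoover) coefficient applied to a list of accessibility scores. This is an assumption for illustration: the paper derives its values from a spatial Lorenz curve, which may weight settlements by population, and the function names and sample values below are hypothetical.

```python
def gini(values):
    """Unweighted Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    # Rank-weighted sum over the sorted values, ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def ricci_schutz(values):
    """Ricci-Schutz (Hoover) coefficient: the share of the total that would
    have to be redistributed to make all values equal."""
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    mean = total / n
    return 0.5 * sum(abs(x - mean) for x in values) / total


if __name__ == "__main__":
    # Hypothetical accessibility scores for four settlements.
    scores = [0.2, 0.4, 0.4, 1.0]
    print(gini(scores), ricci_schutz(scores))
```

Both measures are 0 when every settlement has the same accessibility and grow as access concentrates in fewer settlements, so the percentage decreases reported above correspond to a more even distribution.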
Abstract: Since the 1970s, a series of international and national sources have supported the principle of accessibility, which has slowly become a statutory norm and a legislative obligation. Each country has implemented accessibility through its own policy. But in addition to making a place or an activity accessible, informing people about what is accessible is also very important, and this has not really taken off. Indeed, for disabled people, the difficulty lies not only in access to places and the use of resources, but also in the visibility of these resources. This means that information concerning accessibility has to be disclosed and provided effectively to disabled people, those involved with them, and the relevant institutions. In different countries all over the world, many labels and pictograms have been created for this purpose and give information relating to accessibility. Using a socio-historical approach, we will present and analyze the different types of icons, symbols, pictograms, and labels that have been put in place around the world and in France: what are they used for, and for whom are they made? We will show that they are pointers which reflect, firstly, the diversity and range within the target group concerned by accessibility and, secondly, the evolution of accessibility as a dynamic and ecological principle.
Funding: Supported by the Science and Technology Planning Project of Guangzhou, No. 2024A03J0102; the Natural Science Foundation of Guangdong Province for Distinguished Young Scholar, No. 2022B1515020024; the National Natural Science Foundation of China, No. 82070574; and the Key Research and Development Program of Guangzhou, No. 2023B03J1298.
Abstract: BACKGROUND Hepatocellular carcinoma (HCC) is notorious for its aggressive progression and dismal prognosis, with chromatin accessibility dynamics emerging as pivotal yet poorly understood drivers. AIM To dissect how multilayered chromatin regulation sustains oncogenic transcription and tumor-stroma crosstalk in HCC, we combined multiomics single-cell analyses. METHODS We integrated single-cell RNA sequencing and paired single-cell assay for transposase-accessible chromatin with sequencing data of HCC samples, complemented by bulk RNA sequencing validation across The Cancer Genome Atlas, Liver Cancer Institute, and GSE25907 cohorts. Cell type-specific chromatin architectures were resolved via ArchR, with regulatory hubs identified through peak-to-gene linkages and coaccessibility networks. Functional validation employed A485-mediated histone 3 lysine 27 acetylation suppression and small interfering RNA targeting DGAT1. RESULTS Malignant hepatocytes exhibited expanded chromatin accessibility profiles, characterized by increased numbers of accessible peaks and larger physical regions despite reduced peak intensity. Enhancer-like peaks were enriched in malignant regulation, forming long-range hubs. Eighteen enhancer-like peak-related genes showed tumor-specific overexpression and diagnostic accuracy, correlating with poor prognosis. Intercellular coaccessibility analysis revealed tumor-stroma symbiosis via shared chromatin states. Pharmacological histone 3 lysine 27 acetylation inhibition paradoxically downregulated DGAT1, the hub gene most strongly regulated by chromatin accessibility. DGAT1 knockdown suppressed cell proliferation. CONCLUSION Multilayered chromatin reprogramming sustains HCC progression through tumor-stroma crosstalk and DGAT1-related oncogenic transcription, defining targetable epigenetic vulnerabilities.
Abstract: [Objective] Accurate prediction of tomato growth height is crucial for optimizing production environments in smart farming. However, current prediction methods predominantly rely on empirical, mechanistic, or learning-based models that utilize either image data or environmental data. These methods fail to fully leverage multi-modal data to capture the diverse aspects of plant growth comprehensively. [Methods] To address this limitation, a two-stage phenotypic feature extraction (PFE) model based on the deep learning algorithms recurrent neural network (RNN) and long short-term memory (LSTM) was developed. The model integrated environment and plant information to provide a holistic understanding of the growth process, and employed phenotypic and temporal feature extractors to comprehensively capture both types of features. This enabled a deeper understanding of the interaction between tomato plants and their environment, ultimately leading to highly accurate predictions of growth height. [Results and Discussions] The experimental results showed the model's effectiveness: when predicting the next two days based on the past five days, the PFE-based RNN and LSTM models achieved mean absolute percentage errors (MAPE) of 0.81% and 0.40%, respectively, which were significantly lower than the 8.00% MAPE of the large language model (LLM) and the 6.72% MAPE of the Transformer-based model. In longer-term predictions, the 10-day prediction for 4 days ahead and the 30-day prediction for 12 days ahead, the PFE-RNN model continued to outperform the other two baseline models, with MAPE of 2.66% and 14.05%, respectively. [Conclusions] The proposed method, which leverages phenotypic-temporal collaboration, shows great potential for intelligent, data-driven management of tomato cultivation, making it a promising approach for enhancing the efficiency and precision of smart tomato planting management.
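The evaluation setup described here (predict the next two days from the past five, scored by MAPE) can be sketched in a few lines. This is only an illustration of the standard MAPE formula and of sliding-window slicing; the function names and the height series are hypothetical, and the sketch says nothing about how the PFE model itself produces predictions.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def make_windows(series, n_past, n_future):
    """Slice a series into (past, future) pairs, e.g. 5 past days and 2 future days."""
    windows = []
    for start in range(len(series) - n_past - n_future + 1):
        past = series[start:start + n_past]
        future = series[start + n_past:start + n_past + n_future]
        windows.append((past, future))
    return windows


if __name__ == "__main__":
    # Hypothetical daily plant heights in centimetres.
    heights = [50.0, 51.2, 52.5, 53.9, 55.0, 56.4, 57.7, 59.1]
    for past, future in make_windows(heights, n_past=5, n_future=2):
        # Naive baseline for illustration only: repeat the last observed height.
        forecast = [past[-1]] * len(future)
        print(past, future, mape(future, forecast))
```

Because MAPE is a relative error, the values reported for different forecast horizons in the abstract can be compared directly even though the plants grow taller over time.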
Funding: Supported by the China National Health Development Research Center Study on Total Health Insurance Package Payment; the National Office for Philosophy and Social Sciences, National Social Science Fund of China (17ZDA121); and the Tsinghua University Dushi Program (2024Z11DSZ001).
Abstract: Objective: The scarcity of healthcare resources and inadequate access to medical services in rural and remote areas are pervasive challenges that many countries face, particularly in the developing world. Telemedicine, with its capacity to overcome geographical barriers and provide patients with real-time medical services, has shown considerable potential in addressing these issues and has attracted widespread attention. Compact medical communities and family doctor systems play important roles in improving healthcare accessibility. However, despite the critical nature of patients' perceptions of healthcare accessibility, research in this domain is sparse. This study aimed to explore the impact of telemedicine on rural residents' perceived healthcare accessibility in China, analyze the mechanisms underpinning this relationship, and elucidate the roles of compact medical communities and the family doctor system. Methods: Survey data from 3311 rural residents were analyzed using a probit model, instrumental variables, and subgroup regression analyses to ascertain causal effects, perform heterogeneity analysis, examine mechanisms, and test the robustness of the findings. Results: Telemedicine significantly enhanced rural residents' perceived healthcare accessibility, with particularly notable benefits for those in sparsely populated areas, in regions with high-speed internet access, within the purview of compact healthcare consortiums, and with access to family doctor services. Furthermore, telemedicine improved rural residents' perceived healthcare accessibility by encouraging the use of primary care services. Conclusion: Telemedicine in China has played a significant role in improving perceived healthcare accessibility among rural residents and in reducing disparities in accessibility across different demographic groups. This is consistent with the broader objective of achieving universal health coverage. However, the efficacy of telemedicine in enhancing healthcare accessibility is contingent upon certain preconditions. Policymakers must confront local infrastructure challenges, particularly regarding internet connectivity, when expanding telemedicine services to ensure their effective operation. The synergistic interaction observed between telemedicine, the family doctor system, and compact medical communities highlights the importance of integrating telemedicine into existing healthcare systems. Such integration could enhance collaboration with current healthcare frameworks, ensuring the provision of safe, accessible, and affordable healthcare services and promoting the health and well-being of local populations.