Lead-free double perovskites have gained recognition as top luminescent materials due to their environmental friendliness, high chemical stability, structural adjustability, and excellent photoelectric properties. However, the poor modulation of emission restricts their applications, and it is highly desirable to explore stable and efficient double perovskites with multimode luminescence and adjustable spectra for multifunctional photoelectric applications. Herein, rare earth ion Ln^(3+) (Er^(3+) and Ho^(3+))-doped Cs_(2)NaYCl_(6):Sb^(3+) crystals were synthesized by a simple solvothermal route. X-ray diffraction (XRD) patterns, energy-dispersive spectroscopy (EDS), X-ray photoelectron spectroscopy (XPS), and elemental mapping images demonstrate that the Sb^(3+), Er^(3+), and Ho^(3+) ions have been homogeneously incorporated into the Cs_(2)NaYCl_(6) crystals. As anticipated, the emission spectra of Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) are composed of two bands. One broad blue band derives from self-trapped excitons (STE) in [SbCl_(6)]^(3-) octahedra, while another group of emission peaks stems from the f-f transitions of Ln^(3+) ions. The emission colors of Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) phosphors can be tuned over a wide range by modulating the doping concentrations of Ln^(3+) ions. The efficient energy transfer from STE to Ln^(3+) is the key to achieving efficient and tunable emission from Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) samples. Interestingly, Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) can also exhibit the characteristic up-conversion luminescence of Ln^(3+) under near-infrared (NIR) excitation in addition to the down-conversion luminescence, revealing that the materials may have potential applicability in multimode anti-counterfeiting and information encryption. Furthermore, the light-emitting diodes (LEDs) assembled from Cs_(2)NaYCl_(6):Sb^(3+) and Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) phosphors display dazzling blue, green, and red emissions under a forward bias current, which indicates that the as-obtained double perovskite materials may have great potential in solid-state lighting and optoelectronic devices.
This work introduces special states for light in multimode fibers featuring strongly enhanced or reduced correlations between output fields in the presence of environmental temperature fluctuations. Using an experimentally measured multi-temperature transmission matrix, a set of temperature principal modes that exhibit resilience to disturbances caused by temperature fluctuations can be generated. Reversing this concept also allows the construction of temperature anti-principal modes, with output profiles more susceptible to temperature influences than the unmodulated wavefront. Despite changes in the length of the multimode fiber within the temperature-fluctuating region, the proposed approach remains capable of robustly controlling the temperature response within the fiber. To illustrate the practicality of the proposed special states, a learning-empowered fiber specklegram temperature sensor based on temperature anti-principal mode sensitization is proposed. This sensor outperforms traditional approaches in terms of resolution and accuracy. These novel states are anticipated to have wide-ranging applications in fiber communication, sensing, imaging, and spectroscopy, and to serve as a source of inspiration for the discovery of other novel states.
We propose and demonstrate an ultra-compact 1310/1550 nm wavelength multiplexer/demultiplexer assisted by a subwavelength grating (SWG), designed using the particle swarm optimization (PSO) algorithm on a silicon-on-insulator (SOI) platform. The demultiplexing function for the 1310 nm and 1550 nm wavelengths is implemented through the self-imaging effect of a multimode interference (MMI) coupler. Three parallel SWG-based slots are then inserted into the MMI section so that the effective refractive index of the modes can be engineered and thus the beat length can be adjusted. Importantly, these three SWG slots significantly reduce the length of the device, which is much shorter than traditional MMI-based wavelength demultiplexers. Finally, the PSO algorithm is used to optimize the equivalent refractive index and width of the SWG within a certain range to achieve the best performance of the wavelength demultiplexer. It has been verified that the device footprint is only 2 × 30.68 μm^(2), and 1 dB bandwidths larger than 120 nm are obtained at the 1310 nm and 1550 nm wavelengths. Meanwhile, the transmission spectrum shows that the insertion loss (IL) values are below 0.47 dB at both wavelengths, with extinction ratio (ER) values above 12.65 dB. This inverse design approach has proved to be efficient in increasing bandwidth and reducing device length.
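The PSO step described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the abstract does not give the swarm size, coefficients, parameter ranges, or figure of merit, so `toy_objective` is a hypothetical quadratic stand-in for what would, in a real design loop, be an electromagnetic simulation of the device.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimal particle swarm optimizer (illustrative, not the paper's code)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Start particles at random positions inside the bounds, at rest.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_cost = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_cost.__getitem__)
    global_best = personal_best[best_idx][:]
    global_cost = personal_cost[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own
                # best position, and pull toward the swarm's best position.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + c_social * r2 * (global_best[d] - positions[i][d])
                )
                # Clamp so parameters stay inside the allowed range.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            cost = objective(positions[i])
            if cost < personal_cost[i]:
                personal_best[i], personal_cost[i] = positions[i][:], cost
                if cost < global_cost:
                    global_best, global_cost = positions[i][:], cost
    return global_best, global_cost


def toy_objective(params):
    """Hypothetical stand-in for a simulated device figure of merit."""
    equivalent_index, swg_width = params
    return (equivalent_index - 2.5) ** 2 + (swg_width - 0.3) ** 2


# Hypothetical search ranges for (equivalent index, SWG width in micrometres).
best_params, best_cost = pso_minimize(toy_objective, [(1.5, 3.5), (0.1, 0.5)])
```

The fixed seed only makes the sketch reproducible; the target values 2.5 and 0.3 are arbitrary and carry no physical meaning.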
Arrhythmias occur frequently in clinical practice, but accurately distinguishing subtle rhythm abnormalities remains an ongoing difficulty in ECG-based studies. A review of existing studies suggests that two main factors contribute to this problem: the uneven distribution of arrhythmia classes and the limited expressiveness of the features learned by current models. To overcome these limitations, this study proposes a dual-path multimodal framework, termed DM-EHC (Dual-Path Multimodal ECG Heartbeat Classifier), for ECG-based heartbeat classification. The proposed framework links 1D ECG temporal features with 2D time–frequency features. With these two paths, the model can process more dimensions of feature information. The MIT-BIH arrhythmia database was selected as the baseline dataset for the experiments. Experimental results show that the proposed method outperforms single modalities and performs better for certain specific types of arrhythmias. The model achieved mean precision, recall, and F1 score of 95.14%, 92.26%, and 93.65%, respectively. These results indicate that the framework is robust and has potential value in automated arrhythmia classification.
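For readers unfamiliar with the reported metrics, the following sketch shows one common way to obtain mean precision, recall, and F1 from a confusion matrix. It assumes macro averaging (an unweighted mean over classes); the abstract does not say which averaging scheme was used, and the function name and the example matrix are hypothetical.

```python
def macro_metrics(confusion):
    """Macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    n_classes = len(confusion)
    precisions, recalls, f1_scores = [], [], []
    for k in range(n_classes):
        true_positive = confusion[k][k]
        predicted_as_k = sum(confusion[i][k] for i in range(n_classes))
        actually_k = sum(confusion[k])
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return (sum(precisions) / n_classes,
            sum(recalls) / n_classes,
            sum(f1_scores) / n_classes)


# Hypothetical two-class example (not data from the study).
example_confusion = [[8, 2],
                     [1, 9]]
mean_precision, mean_recall, mean_f1 = macro_metrics(example_confusion)
```

Because rare arrhythmia classes count as much as common ones under macro averaging, this scheme is often preferred when class distributions are uneven, which is the imbalance problem the abstract highlights.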
Hepatocellular carcinoma presents with three distinct immune phenotypes, including immune-desert, immune-excluded, and immune-inflamed, indicating various treatment responses and prognostic outcomes. The clinical application of multi-omics parameters is still restricted by the expensive and less accessible assays, although they accurately reflect immune status. A comprehensive evaluation framework based on “easy-to-obtain” multi-model clinical parameters is urgently required, incorporating clinical features to establish baseline patient profiles and disease staging; routine blood tests assessing systemic metabolic and functional status; immune cell subsets quantifying subcluster dynamics; imaging features delineating tumor morphology, spatial configuration, and perilesional anatomical relationships; and immunohistochemical markers positioning qualitative and quantitative detection of tumor antigens at the cellular and molecular level. This integrated phenomic approach aims to improve prognostic stratification and clinical decision-making in hepatocellular carcinoma management conveniently and practically.
Business Process Modelling (BPM) is essential for analyzing, improving, and automating the flow of information within organizations, but traditional approaches based on manual interpretation are slow, error-prone, and require a high level of expertise. This article proposes an innovative alternative solution that overcomes these limitations by automatically generating comprehensive Business Process Modelling and Notation (BPMN) diagrams solely from verbal descriptions of the processes to be modeled, utilizing Large Language Models (LLMs) and multimodal Artificial Intelligence (AI). Experimental results, based on video recordings of process explanations provided by an expert from an organization (in this case, the Commercial Courts of a public justice administration), demonstrate that the proposed methodology successfully enables the automatic generation of complete and accurate BPMN diagrams, leading to significant improvements in the speed, accuracy, and accessibility of process modeling. This research makes a substantial contribution to the field of business process modeling, as its methodology is groundbreaking in its use of LLMs and multimodal AI capabilities to handle different types of source material (text and video), combining several tools to minimize the number of queries and reduce the complexity of the prompts required for the automatic generation of successful BPMN diagrams.
To ensure the safe and stable operation of rotating machinery, intelligent fault diagnosis methods hold significant research value. However, existing diagnostic approaches largely rely on manual feature extraction and expert experience, which limits their adaptability under variable operating conditions and strong noise environments, severely affecting the generalization capability of diagnostic models. To address this issue, this study proposes a multimodal fusion fault diagnosis framework based on Mel-spectrograms and automated machine learning (AutoML). The framework first extracts fault-sensitive Mel time–frequency features from acoustic signals and fuses them with statistical features of vibration signals to construct complementary fault representations. On this basis, automated machine learning techniques are introduced to enable end-to-end diagnostic workflow construction and optimal model configuration acquisition. Finally, diagnostic decisions are achieved by automatically integrating the predictions of multiple high-performance base models. Experimental results on a centrifugal pump vibration and acoustic dataset demonstrate that the proposed framework achieves high diagnostic accuracy under noise-free conditions and maintains strong robustness under noisy interference, validating its efficiency, scalability, and practical value for rotating machinery fault diagnosis.
From the perspective of Multimodal Metaphor Theory, the architectural scenes in Ne Zha 2 embody highly condensed cultural connotations. Through the synergy of vision, soundscape, and dialect, the film constructs a metaphorical chain of “human order-ethnic oppression-theocratic structure” via the three core architectural spaces. As core signifiers, buildings drive the plot, shape characters, and convey values. The study reveals that animation activates traditional architecture’s metaphorical potential through cross-modal mapping, endowing historical symbols with contemporary vitality and providing a paradigm for the creative transformation of traditional culture.
The diagnostic efficacy of contemporary bioimaging technologies remains constrained by inherent limitations of conventional imaging agents, including suboptimal sensitivity, off-target biodistribution, and inherent cytotoxicity. These limitations have catalyzed the development of intelligent stimuli-responsive block copolymer-based bioimaging agents, which are engineered to dynamically respond to endogenous biochemical cues (e.g., pH gradients, redox potential, enzyme activity, hypoxic environment) or exogenous physical triggers (e.g., photoirradiation, thermal gradients, ultrasound (US)/magnetic stimuli). Through spatiotemporally controlled structural transformations, stimuli-responsive block copolymers enable precise contrast targeting, activatable signal amplification, and theranostic integration, thereby substantially enhancing the signal-to-noise ratio of bioimaging and diagnostic specificity. This mini-review systematically examines molecular engineering principles for designing pH-, redox-, enzyme-, light-, thermo-, and US/magnetic-responsive polymers, with emphasis on the structure-property relationships governing imaging performance modulation. Furthermore, we critically analyze emerging strategies for optical imaging, US synergies, and magnetic resonance imaging (MRI). Multimodal bioimaging is also discussed, as it could overcome the inherent trade-offs between resolution, penetration depth, and functional specificity in single-modal approaches. By elucidating mechanistic insights and translational challenges, this mini-review aims to establish a design framework for stimuli-responsive block copolymer-based, high-fidelity bioimaging agents and to accelerate their clinical translation in precise diagnosis and therapy.
It remains difficult to automate the creation and validation of Unified Modeling Language (UML) diagrams due to unstructured requirements, limited automated pipelines, and the lack of reliable evaluation methods. This study introduces a cohesive architecture that combines requirement development, UML synthesis, and multimodal validation. First, LLaMA-3.2-1B-Instruct is used to generate user-focused requirements. Then, DeepSeek-R1-Distill-Qwen-32B applies its reasoning skills to transform these requirements into PlantUML code. Using this dual-LLM pipeline, we constructed a synthetic dataset of 11,997 UML diagrams spanning six major diagram families. Rendering analysis showed that 89.5% of the generated diagrams compile correctly, while invalid cases were detected automatically. To assess quality, we employed a multimodal scoring method that combines Qwen2.5-VL-3B, LLaMA-3.2-11B-Vision-Instruct, and Aya-Vision-8B, with weights based on MMMU performance. A study with 94 experts revealed strong alignment between automatic and manual evaluations, yielding a Pearson correlation of r = 0.82 and a Fleiss’ Kappa of 0.78. This indicates a high degree of concordance between automated metrics and human judgment. Overall, the results demonstrate that our scoring system is effective and that the proposed generation pipeline produces UML diagrams that are both syntactically correct and semantically coherent. More broadly, the system provides a scalable and reproducible foundation for future work in AI-driven software modeling and multimodal verification.
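The weighted multimodal scoring and its comparison with human ratings can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's code: the weights, the per-diagram scores, and the helper names `weighted_score` and `pearson_r` are hypothetical, and the abstract does not specify how MMMU performance is turned into weights (simple normalization is assumed here).

```python
import math


def weighted_score(scores, weights):
    """Combine per-model scores using weights normalized to sum to 1."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical weights; the real ones would come from benchmark results.
weights = {
    "Qwen2.5-VL-3B": 0.5,
    "LLaMA-3.2-11B-Vision-Instruct": 0.3,
    "Aya-Vision-8B": 0.2,
}
# Hypothetical scores that the three judges assigned to one diagram.
diagram_scores = {
    "Qwen2.5-VL-3B": 0.9,
    "LLaMA-3.2-11B-Vision-Instruct": 0.7,
    "Aya-Vision-8B": 0.8,
}
combined = weighted_score(diagram_scores, weights)

# Agreement between automatic and human scores over several diagrams
# (both lists are made-up illustrative values).
automatic = [0.82, 0.40, 0.65, 0.91, 0.55]
human = [0.80, 0.35, 0.70, 0.95, 0.50]
agreement = pearson_r(automatic, human)
```

Pearson correlation captures how well the automatic score tracks the human one, whereas Fleiss' Kappa, also reported in the abstract, measures agreement among multiple raters on categorical labels and is not covered by this sketch.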
In Human–Robot Interaction (HRI), generating robot trajectories that accurately reflect user intentions while ensuring physical realism remains challenging, especially in unstructured environments. In this study, we develop a multimodal framework that integrates symbolic task reasoning with continuous trajectory generation. The approach employs transformer models and adversarial training to map high-level intent to robotic motion. Information from multiple data sources, such as voice traits, hand and body keypoints, visual observations, and recorded paths, is integrated simultaneously. These signals are mapped into a shared representation that supports interpretable reasoning while enabling smooth and realistic motion generation. Based on this design, two different learning strategies are investigated. In the first, grammar-constrained Linear Temporal Logic (LTL) expressions are created from multimodal human inputs. These expressions are subsequently decoded into robot trajectories. The second method generates trajectories directly from symbolic intent and linguistic data, bypassing an intermediate logical representation. Transformer encoders combine multiple types of information, and autoregressive transformer decoders generate motion sequences. Adding smoothness and speed limits during training increases the likelihood of physical feasibility. To improve the realism and stability of the generated trajectories, an adversarial discriminator is also included during training to guide them toward the distribution of actual robot motion. Tests on the NATSGLD dataset indicate that the complete system exhibits stable training behaviour and performance. In normalised coordinates, the logic-based pipeline has an Average Displacement Error (ADE) of 0.040 and a Final Displacement Error (FDE) of 0.036. The adversarial generator improves on this substantially, reducing ADE to 0.021 and FDE to 0.018. Visual examination confirms that the generated trajectories closely align with observed motion patterns while preserving smooth temporal dynamics.
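The two trajectory metrics reported above follow standard definitions, which the sketch below illustrates: ADE is the mean Euclidean distance between predicted and reference points over all time steps, and FDE is the distance at the final time step. The function name and the sample trajectories are hypothetical, and the abstract does not describe how the coordinates were normalised.

```python
import math


def displacement_errors(predicted, reference):
    """Return (ADE, FDE) for two trajectories of equal length.

    Each trajectory is a sequence of points, e.g. (x, y) or (x, y, z).
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Point-wise Euclidean distance at every time step.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    average_error = sum(distances) / len(distances)
    final_error = distances[-1]
    return average_error, final_error


# Hypothetical 2D example: the prediction drifts away from the reference.
predicted_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
reference_path = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted_path, reference_path)
```

In this example the error grows over time, so FDE exceeds ADE; a trajectory that corrects itself toward the goal gives the opposite ordering, as with the figures quoted in the abstract.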
Accurate detection of driver fatigue is essential for improving road safety. This study investigates the effectiveness of using multimodal physiological signals for fatigue detection while incorporating uncertainty quantification to enhance the reliability of predictions. Physiological signals, including Electrocardiogram (ECG), Galvanic Skin Response (GSR), and Electroencephalogram (EEG), were transformed into image representations and analyzed using pretrained deep neural networks. The extracted features were classified through a feedforward neural network, and prediction reliability was assessed using uncertainty quantification techniques such as Monte Carlo Dropout (MCD), model ensembles, and combined approaches. Evaluation metrics included standard measures (sensitivity, specificity, precision, and accuracy) along with uncertainty-aware metrics such as uncertainty sensitivity and uncertainty precision. Across all evaluations, ECG-based models consistently demonstrated strong performance. The findings indicate that combining multimodal physiological signals, Transfer Learning (TL), and uncertainty quantification can significantly improve both the accuracy and trustworthiness of fatigue detection systems. This approach supports the development of more reliable driver assistance technologies aimed at preventing fatigue-related accidents.
The problem of fake news detection (FND) is becoming increasingly important in the field of natural language processing (NLP) because of the rapid dissemination of misleading information on the web. Large language models (LLMs) such as GPT-4.Zero excel in natural language understanding tasks but can still struggle to distinguish between fact and fiction, particularly when applied in the wild. However, a key challenge of existing FND methods is that they only consider unimodal data (e.g., images), while more detailed multimodal data (e.g., user behaviour, temporal dynamics) is neglected, and the latter is crucial for full-context understanding. To overcome these limitations, we introduce M3-FND (Multimodal Misinformation Mitigation for False News Detection), a novel methodological framework that integrates LLMs with multimodal data sources to perform context-aware veracity assessments. Our method proposes a hybrid system that combines image-text alignment, user credibility profiling, and temporal pattern recognition, strengthened by a natural feedback loop that provides real-time feedback for correcting downstream errors. We use contextual reinforcement learning to schedule prompt updates and to update the classifier threshold based on the latest multimodal input, which enables the model to better adapt to changing misinformation attack strategies. M3-FND is tested on three diverse datasets, FakeNewsNet, Twitter15, and Weibo, which contain both text and visual social media content. Experiments show that M3-FND significantly outperforms conventional and LLM-based baselines in terms of accuracy, F1-score, and AUC on all benchmarks. Our results indicate the importance of employing multimodal cues and adaptive learning for effective and timely detection of fake news.
Gastrointestinal tumors require personalized treatment strategies due to their heterogeneity and complexity. Multimodal artificial intelligence (AI) addresses this challenge by integrating diverse data sources, including computed tomography (CT), magnetic resonance imaging (MRI), endoscopic imaging, and genomic profiles, to enable intelligent decision-making for individualized therapy. This approach leverages AI algorithms to fuse imaging, endoscopic, and omics data, facilitating comprehensive characterization of tumor biology, prediction of treatment response, and optimization of therapeutic strategies. By combining CT and MRI for structural assessment, endoscopic data for real-time visual inspection, and genomic information for molecular profiling, multimodal AI enhances the accuracy of patient stratification and treatment personalization. The clinical implementation of this technology demonstrates potential for improving patient outcomes, advancing precision oncology, and supporting individualized care in gastrointestinal cancers. Ultimately, multimodal AI serves as a transformative tool in oncology, bridging data integration with clinical application to effectively tailor therapies.
High-throughput transcriptomics has evolved from bulk RNA-seq to single-cell and spatial profiling, yet its clinical translation still depends on effective integration across diverse omics and data modalities. Emerging foundation models and multimodal learning frameworks are enabling scalable and transferable representations of cellular states, while advances in interpretability and real-world data integration are bridging the gap between discovery and clinical application. This paper outlines a concise roadmap for AI-driven, transcriptome-centered multi-omics integration in precision medicine (Figure 1).
For decades, the central dogma of oncology has been that a cancer’s identity is inextricably linked to its anatomical origin. This principle underpins the entire diagnostic and therapeutic framework, from histology-based classification to site-specific treatment guidelines. Yet this framework fails catastrophically for a substantial population of patients diagnosed with cancer of unknown primary (CUP). These patients present with metastatic disease, yet their primary tumors remain elusive despite exhaustive clinical workup [1]. CUP, accounting for 1%-3% of all cancer diagnoses, is an enigma with devastating consequences; the median overall survival is only 2-12 months [2-4]. The inability to pinpoint an origin forces clinicians to rely on broad-spectrum empirical chemotherapy, such as taxane-carboplatin regimens, which have limited efficacy, and excludes patients from the promise of targeted therapies and clinical trials [5]. CUP is not only a diagnostic challenge but also an indictment of the siloed approach to understanding malignancy: this cancer highlights the limitations of origin-based diagnostic frameworks. However, the confluence of high-dimensional biological data and advanced artificial intelligence (AI) is now poised to address this long-standing diagnostic limitation and to herald a new era not only for CUP but also for oncology as a whole (Figure 1).
The brain atlas, or parcellation, which delineates spatial partitions, organizes the brain's structure and function [1]. The spatial arrangements of highly heterogeneous landscapes represent specialized functional regions for investigating their interactions. Early efforts to parcellate the mammalian brain, using histological cytoarchitecture and myeloarchitecture, as well as recent in vivo magnetic resonance imaging (MRI) [2,3], have primarily involved cortical areas, subcortical structures, and cerebellar nuclei. Human brain parcellations primarily focus on grey matter (GM) and purposefully exclude white matter (WM), hindering the development of next-generation brain atlases.
A new tapered multimode interference (MMI)-based coherent lightwave combiner is reported. A comprehensive theoretical analysis of mode behaviors in the tapered MMI waveguide is presented, and the output characteristics of the tapered MMI combiners with various structures are demonstrated. The combiner is fabricated on a silicon-on-insulator (SOI) substrate. Due to its advantages of having no end-facet reflection, easy extension to a multi-port configuration, high tolerance for fabrication errors, and compact size, the tapered MMI is a good candidate for a coherent lightwave combiner to be used in large-scale photonic integrated circuits.
Funding (Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) double perovskite study): Project supported by the Natural Science Foundation of Hebei Province (B2023201108, B2024201076), the Science Fund for Creative Research Groups of the Natural Science Foundation of Hebei Province (B2021201038), the 333 Talent Project Fund of Hebei Province (C20221015), the National High-End Foreign Expert Recruitment Plan (G2022003007L), the Hebei Province Higher Education Science and Technology Research Project (JZX2023001), and the Hebei Province Innovation Capability Enhancement Plan Project (22567632H).
Funding (multimode fiber temperature principal modes study): Financial support from the National Natural Science Foundation of China (62075132 and 92050202) and the Natural Science Foundation of Shanghai (22ZR1443100).
Funding (SWG-assisted 1310/1550 nm multiplexer/demultiplexer study): Supported by the National Natural Science Foundation of China (No. 61505160), the Innovation Capability Support Program of Shaanxi (No. 2018KJXX-042), the Natural Science Basic Research Program of Shaanxi (No. 2019JM-084), the State Key Laboratory of Transient Optics and Photonics (No. SKLST202108), and the Graduate Innovation and Practical Ability Training Project of Xi’an Shiyou University (No. YCS22213190).
Funding: Supported by the Innovative Human Resource Development for Local Intellectualization program through the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. IITP-2026-2020-0-01741) and the research fund of Hanyang University (HY-2025-1110).
Abstract: Arrhythmias occur frequently in clinical practice, but accurately distinguishing subtle rhythm abnormalities remains a persistent difficulty in ECG-based studies. A review of existing studies suggests that two main factors contribute to this problem: the uneven distribution of arrhythmia classes and the limited expressiveness of features learned by current models. To overcome these limitations, this study proposes a dual-path multimodal framework, termed DM-EHC (Dual-Path Multimodal ECG Heartbeat Classifier), for ECG-based heartbeat classification. The proposed framework links 1D ECG temporal features with 2D time–frequency features. With these two paths, the model can process more dimensions of feature information. The MIT-BIH arrhythmia database was selected as the baseline dataset for the experiments. Experimental results show that the proposed method outperforms single modalities and performs better for certain specific types of arrhythmias. The model achieved mean precision, recall, and F1 score of 95.14%, 92.26%, and 93.65%, respectively. These results indicate that the framework is robust and has potential value in automated arrhythmia classification.
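The mean precision, recall, and F1 score quoted above are class-averaged metrics. As a minimal sketch, assuming the averages are unweighted (macro) means over heartbeat classes, which the abstract does not state, they could be computed as follows. The function name `macro_metrics` and the class labels in the usage note are illustrative and are not taken from the paper.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed classes."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

For example, `macro_metrics(["N", "N", "N", "V", "V", "S"], ["N", "N", "V", "V", "V", "N"])` gives each class equal weight regardless of how many beats it contains, so a rare class that is never predicted correctly pulls all three averages down. This is why macro averaging is a common choice when class distributions are uneven.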
Abstract: Hepatocellular carcinoma presents with three distinct immune phenotypes (immune-desert, immune-excluded, and immune-inflamed), which indicate different treatment responses and prognostic outcomes. Although multi-omics parameters accurately reflect immune status, their clinical application is still restricted by expensive and less accessible assays. A comprehensive evaluation framework based on “easy-to-obtain” multi-model clinical parameters is urgently required, incorporating clinical features to establish baseline patient profiles and disease staging; routine blood tests assessing systemic metabolic and functional status; immune cell subsets quantifying subcluster dynamics; imaging features delineating tumor morphology, spatial configuration, and perilesional anatomical relationships; and immunohistochemical markers providing qualitative and quantitative detection of tumor antigens at the cellular and molecular level. This integrated phenomic approach aims to improve prognostic stratification and clinical decision-making in hepatocellular carcinoma management in a convenient and practical way.
Funding: Funded by Fundación CajaCanarias and Fundación Bancaria “la Caixa”, grant number 2023DIG11.
Abstract: Business Process Modelling (BPM) is essential for analyzing, improving, and automating the flow of information within organizations, but traditional approaches based on manual interpretation are slow, error-prone, and require a high level of expertise. This article proposes an alternative that overcomes these limitations by automatically generating comprehensive Business Process Model and Notation (BPMN) diagrams solely from verbal descriptions of the processes to be modeled, using Large Language Models (LLMs) and multimodal Artificial Intelligence (AI). Experimental results, based on video recordings of process explanations provided by an expert from an organization (in this case, the Commercial Courts of a public justice administration), demonstrate that the proposed methodology enables the automatic generation of complete and accurate BPMN diagrams, leading to significant improvements in the speed, accuracy, and accessibility of process modeling. The methodology contributes to the field of business process modeling by using LLMs and multimodal AI capabilities to handle different types of source material (text and video), combining several tools to minimize the number of queries and reduce the complexity of the prompts required for the automatic generation of successful BPMN diagrams.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 52475102 and 52205101, in part by the Guangdong Basic and Applied Basic Research Foundation under Grant 2023A1515240021, in part by the Young Talent Support Project of Guangzhou Association for Science and Technology (QT-2024-28), and in part by the Youth Development Initiative of Guangdong Association for Science and Technology (SKXRC2025254).
Abstract: Intelligent fault diagnosis methods hold significant research value for ensuring the safe and stable operation of rotating machinery. However, existing diagnostic approaches largely rely on manual feature extraction and expert experience, which limits their adaptability under variable operating conditions and strong noise and severely affects the generalization capability of diagnostic models. To address this issue, this study proposes a multimodal fusion fault diagnosis framework based on Mel-spectrograms and automated machine learning (AutoML). The framework first extracts fault-sensitive Mel time–frequency features from acoustic signals and fuses them with statistical features of vibration signals to construct complementary fault representations. On this basis, automated machine learning techniques are introduced to enable end-to-end construction of the diagnostic workflow and acquisition of the optimal model configuration. Finally, diagnostic decisions are made by automatically integrating the predictions of multiple high-performance base models. Experimental results on a centrifugal pump vibration and acoustic dataset demonstrate that the proposed framework achieves high diagnostic accuracy under noise-free conditions and maintains strong robustness under noisy interference, validating its efficiency, scalability, and practical value for rotating machinery fault diagnosis.
Abstract: From the perspective of Multimodal Metaphor Theory, the architectural scenes in Ne Zha 2 embody highly condensed cultural connotations. Through the synergy of vision, soundscape, and dialect, the film constructs a metaphorical chain of “human order-ethnic oppression-theocratic structure” via its three core architectural spaces. As core signifiers, buildings drive the plot, shape characters, and convey values. The study reveals that animation activates the metaphorical potential of traditional architecture through cross-modal mapping, endowing historical symbols with contemporary vitality and providing a paradigm for the creative transformation of traditional culture.
Funding: Supported by the National Natural Science Foundation of China (Nos. 22208218, 22078196, and 22278268), the Natural Science Foundation of Shanghai (No. 22ZR1460400), the Collaborative Innovation Center of Fragrance Flavour and Cosmetics, and the Collaborative Innovation Project of Shanghai Institute of Technology (No. XTCX2023-07).
Abstract: The diagnostic efficacy of contemporary bioimaging technologies remains constrained by inherent limitations of conventional imaging agents, including suboptimal sensitivity, off-target biodistribution, and inherent cytotoxicity. These limitations have catalyzed the development of intelligent bioimaging agents based on stimuli-responsive block copolymers, which are engineered to respond dynamically to endogenous biochemical cues (e.g., pH gradients, redox potential, enzyme activity, hypoxic environments) or exogenous physical triggers (e.g., photoirradiation, thermal gradients, ultrasound (US)/magnetic stimuli). Through spatiotemporally controlled structural transformations, stimuli-responsive block copolymers enable precise contrast targeting, activatable signal amplification, and theranostic integration, thereby substantially enhancing the signal-to-noise ratio of bioimaging and diagnostic specificity. This mini-review therefore systematically examines molecular engineering principles for designing pH-, redox-, enzyme-, light-, thermo-, and US/magnetic-responsive polymers, with emphasis on the structure-property relationships that govern the modulation of imaging performance. Furthermore, we critically analyze emerging strategies for optical imaging, US synergies, and magnetic resonance imaging (MRI). Multimodal bioimaging is also discussed, as it could overcome the inherent trade-offs between resolution, penetration depth, and functional specificity in single-modal approaches. By elucidating mechanistic insights and translational challenges, this mini-review aims to establish a design framework for high-fidelity bioimaging agents based on stimuli-responsive block copolymers and to accelerate their clinical translation in precise diagnosis and therapy.
Funding: Supported by the DH2025-TN07-07 project conducted at the Thai Nguyen University of Information and Communication Technology, Thai Nguyen, Vietnam, with additional support from the AI in Software Engineering Lab.
Abstract: It remains difficult to automate the creation and validation of Unified Modeling Language (UML) diagrams due to unstructured requirements, limited automated pipelines, and the lack of reliable evaluation methods. This study introduces a cohesive architecture that combines requirement development, UML synthesis, and multimodal validation. First, LLaMA-3.2-1B-Instruct is used to generate user-focused requirements. Then, DeepSeek-R1-Distill-Qwen-32B applies its reasoning skills to transform these requirements into PlantUML code. Using this dual-LLM pipeline, we constructed a synthetic dataset of 11,997 UML diagrams spanning six major diagram families. Rendering analysis showed that 89.5% of the generated diagrams compile correctly, while invalid cases were detected automatically. To assess quality, we employed a multimodal scoring method that combines Qwen2.5-VL-3B, LLaMA-3.2-11B-Vision-Instruct, and Aya-Vision-8B, with weights based on MMMU performance. A study with 94 experts revealed strong alignment between automatic and manual evaluations, yielding a Pearson correlation of r = 0.82 and a Fleiss' Kappa of 0.78. This indicates a high degree of concordance between automated metrics and human judgment. Overall, the results demonstrate that our scoring system is effective and that the proposed generation pipeline produces UML diagrams that are both syntactically correct and semantically coherent. More broadly, the system provides a scalable and reproducible foundation for future work in AI-driven software modeling and multimodal verification.
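The scoring method in this abstract combines several vision-language judges with benchmark-derived weights and then checks agreement with human ratings through a Pearson correlation. The following is a minimal sketch of those two computations, assuming a simple normalized weighted average; the abstract does not give the exact combination rule. The judge names (`judge_a`, `judge_b`, `judge_c`) and all numeric values in the usage note are hypothetical placeholders, not the paper's weights or data.

```python
import math


def weighted_score(scores, weights):
    """Combine per-judge scores using weights normalized to sum to 1."""
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

A hypothetical call such as `weighted_score({"judge_a": 0.8, "judge_b": 0.6, "judge_c": 0.7}, {"judge_a": 3.0, "judge_b": 1.0, "judge_c": 1.0})` yields one automatic score per diagram. Passing the list of automatic scores and the list of mean expert scores to `pearson_r` gives the kind of agreement statistic the abstract reports.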
Funding: The authors extend their appreciation to Prince Sattam bin Abdulaziz University for funding this research work through project number PSAU/2024/01/32082.
Abstract: In Human–Robot Interaction (HRI), generating robot trajectories that accurately reflect user intentions while ensuring physical realism remains challenging, especially in unstructured environments. In this study, we develop a multimodal framework that integrates symbolic task reasoning with continuous trajectory generation. The approach employs transformer models and adversarial training to map high-level intent to robotic motion. Information from multiple data sources, such as voice traits, hand and body keypoints, visual observations, and recorded paths, is integrated simultaneously. These signals are mapped into a shared representation that supports interpretable reasoning while enabling smooth and realistic motion generation. Based on this design, two learning strategies are investigated. In the first, grammar-constrained Linear Temporal Logic (LTL) expressions are created from multimodal human inputs and subsequently decoded into robot trajectories. The second generates trajectories directly from symbolic intent and linguistic data, bypassing an intermediate logical representation. Transformer encoders combine the multiple types of information, and autoregressive transformer decoders generate motion sequences. Adding smoothness and speed limits during training increases the likelihood of physical feasibility. To improve the realism and stability of the generated trajectories, an adversarial discriminator is also included during training to guide them toward the distribution of actual robot motion. Tests on the NATSGLD dataset indicate that the complete system exhibits stable training behaviour and performance. In normalised coordinates, the logic-based pipeline has an Average Displacement Error (ADE) of 0.040 and a Final Displacement Error (FDE) of 0.036. The adversarial generator improves on this substantially, reducing ADE to 0.021 and FDE to 0.018. Visual examination confirms that the generated trajectories closely align with observed motion patterns while preserving smooth temporal dynamics.
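ADE and FDE, the two metrics reported in this abstract, are commonly defined as the mean point-wise Euclidean distance over a trajectory and the distance at its final time step. The sketch below assumes those common definitions and equal-length trajectories of coordinate tuples; the paper's exact normalisation is not described in the abstract, and the function name `displacement_errors` is illustrative.

```python
import math


def displacement_errors(predicted, reference):
    """Return (ADE, FDE) for two trajectories given as sequences of points."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between corresponding waypoints at each time step.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    ade = sum(distances) / len(distances)  # average over all time steps
    fde = distances[-1]                    # error at the final time step only
    return ade, fde
```

For instance, comparing `[(0, 0), (1, 0), (2, 0)]` against `[(0, 0), (1, 1), (2, 2)]` gives per-step distances of 0, 1, and 2, so ADE is 1.0 and FDE is 2.0.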
Funding: Supported by the Australian Research Council Discovery Projects funding scheme (DP190102181, DP210101465).
Abstract: Accurate detection of driver fatigue is essential for improving road safety. This study investigates the effectiveness of using multimodal physiological signals for fatigue detection while incorporating uncertainty quantification to enhance the reliability of predictions. Physiological signals, including Electrocardiogram (ECG), Galvanic Skin Response (GSR), and Electroencephalogram (EEG), were transformed into image representations and analyzed using pretrained deep neural networks. The extracted features were classified through a feedforward neural network, and prediction reliability was assessed using uncertainty quantification techniques such as Monte Carlo Dropout (MCD), model ensembles, and combined approaches. Evaluation metrics included standard measures (sensitivity, specificity, precision, and accuracy) along with uncertainty-aware metrics such as uncertainty sensitivity and uncertainty precision. Across all evaluations, ECG-based models consistently demonstrated strong performance. The findings indicate that combining multimodal physiological signals, Transfer Learning (TL), and uncertainty quantification can significantly improve both the accuracy and trustworthiness of fatigue detection systems. This approach supports the development of more reliable driver assistance technologies aimed at preventing fatigue-related accidents.
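Monte Carlo Dropout, one of the uncertainty techniques named in this abstract, keeps dropout active at prediction time and summarizes the spread of repeated stochastic forward passes. The toy sketch below illustrates the idea on a single logistic unit with dropout applied to its inputs. It is an assumption-laden stand-in and not the study's pretrained networks or feedforward classifier; `mc_dropout_predict` and all of its parameters are hypothetical names.

```python
import math
import random


def mc_dropout_predict(features, weights, bias, drop_p=0.5, passes=100, seed=0):
    """Return (mean probability, standard deviation) over stochastic passes."""
    if not 0.0 <= drop_p < 1.0:
        raise ValueError("drop_p must be in [0, 1)")
    rng = random.Random(seed)
    keep = 1.0 - drop_p
    probs = []
    for _ in range(passes):
        z = bias
        for x, w in zip(features, weights):
            if rng.random() < keep:
                # Inverted dropout: rescale kept inputs so the expectation matches.
                z += (x / keep) * w
        probs.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid output for this pass
    mean = sum(probs) / passes
    variance = sum((p - mean) ** 2 for p in probs) / passes
    return mean, math.sqrt(variance)
```

The mean serves as the prediction and the standard deviation as a simple uncertainty estimate: with `drop_p=0.0` every pass is identical and the spread is zero, while larger dropout rates produce a wider spread for inputs the model is sensitive to.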
Abstract: The problem of fake news detection (FND) is becoming increasingly important in the field of natural language processing (NLP) because of the rapid dissemination of misleading information on the web. Large language models (LLMs) such as GPT-4.Zero excel in natural language understanding tasks but can still struggle to distinguish between fact and fiction, particularly when applied in the wild. Moreover, a key limitation of existing FND methods is that they only consider unimodal data (e.g., images), while more detailed multimodal data (e.g., user behaviour, temporal dynamics) is neglected, even though the latter is crucial for full-context understanding. To overcome these limitations, we introduce M3-FND (Multimodal Misinformation Mitigation for False News Detection), a novel methodological framework that integrates LLMs with multimodal data sources to perform context-aware veracity assessments. Our method proposes a hybrid system that combines image-text alignment, user credibility profiling, and temporal pattern recognition, and is further strengthened by a natural feedback loop that provides real-time feedback for correcting downstream errors. We use contextual reinforcement learning to schedule prompt updates and to update the classifier threshold based on the latest multimodal input, which enables the model to better adapt to changing misinformation attack strategies. M3-FND is tested on three diverse datasets, FakeNewsNet, Twitter15, and Weibo, which contain both text and visual social media content. Experiments show that M3-FND significantly outperforms conventional and LLM-based baselines in terms of accuracy, F1-score, and AUC on all benchmarks. Our results indicate the importance of employing multimodal cues and adaptive learning for effective and timely detection of fake news.
Funding: Supported by Xuhui District Health Commission, No. SHXH202214.
Abstract: Gastrointestinal tumors require personalized treatment strategies due to their heterogeneity and complexity. Multimodal artificial intelligence (AI) addresses this challenge by integrating diverse data sources, including computed tomography (CT), magnetic resonance imaging (MRI), endoscopic imaging, and genomic profiles, to enable intelligent decision-making for individualized therapy. This approach leverages AI algorithms to fuse imaging, endoscopic, and omics data, facilitating comprehensive characterization of tumor biology, prediction of treatment response, and optimization of therapeutic strategies. By combining CT and MRI for structural assessment, endoscopic data for real-time visual inspection, and genomic information for molecular profiling, multimodal AI enhances the accuracy of patient stratification and treatment personalization. The clinical implementation of this technology demonstrates potential for improving patient outcomes, advancing precision oncology, and supporting individualized care in gastrointestinal cancers. Ultimately, multimodal AI serves as a transformative tool in oncology, bridging data integration with clinical application to effectively tailor therapies.
Abstract: High-throughput transcriptomics has evolved from bulk RNA-seq to single-cell and spatial profiling, yet its clinical translation still depends on effective integration across diverse omics and data modalities. Emerging foundation models and multimodal learning frameworks are enabling scalable and transferable representations of cellular states, while advances in interpretability and real-world data integration are bridging the gap between discovery and clinical application. This paper outlines a concise roadmap for AI-driven, transcriptome-centered multi-omics integration in precision medicine (Figure 1).
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 32270688, 31801117, and 82430107 to X.L., and 32500589 to H.S.), the China Postdoctoral Science Foundation (Grant Nos. BX20240253 and 2024M762384 to H.S.), the Natural Science Foundation of Tianjin (Grant No. 24JCQNJC01280 to H.S.), and the Tianjin Key Medical Discipline (Specialty) Construction Project (Grant No. TJYXZDXK-3-003A).
Abstract: For decades, the central dogma of oncology has been that a cancer's identity is inextricably linked to its anatomical origin. This principle underpins the entire diagnostic and therapeutic framework, from histology-based classification to site-specific treatment guidelines. Yet this framework fails catastrophically for a substantial population of patients diagnosed with cancer of unknown primary (CUP). These patients present with metastatic disease, yet their primary tumors remain elusive despite exhaustive clinical workup [1]. CUP, accounting for 1%-3% of all cancer diagnoses, is an enigma with devastating consequences; the median overall survival is only 2-12 months [2-4]. The inability to pinpoint an origin forces clinicians to rely on broad-spectrum empirical chemotherapy, such as taxane-carboplatin regimens, which have limited efficacy and exclude patients from the promise of targeted therapies and clinical trials [5]. CUP is not only a diagnostic challenge but also an indictment of the siloed approach to understanding malignancy: this cancer highlights the limitations of origin-based diagnostic frameworks. However, the confluence of high-dimensional biological data and advanced artificial intelligence (AI) is now poised to address this long-standing diagnostic limitation and to herald a new era not only for CUP but also for oncology as a whole (Figure 1).
Funding: Supported by the National Natural Science Foundation of China (62473082, 82202250, 82121003, 62036003, and 62333003), the Fundamental Research Funds for the Central Universities (ZYGX2022YGRH008 and ZYGX2024XJ054), and the Medical-Engineering Cooperation Funds from the University of Electronic Science and Technology of China (ZYGX2021YGLH201).
Abstract: The brain atlas, or parcellation delineating spatial partitions, organizes the brain's structure and function [1]. The spatial arrangements of highly heterogeneous landscapes represent specialized functional regions for investigating their interactions. Early efforts to parcellate the mammalian brain, using histological cytoarchitecture and myeloarchitecture, as well as recent in vivo magnetic resonance imaging (MRI) [2,3], have primarily involved cortical areas, subcortical structures, and cerebellar nuclei. Human brain parcellations primarily focus on grey matter (GM) and purposefully exclude white matter (WM), hindering the development of next-generation brain atlases.
Funding: Supported by the National Key R&D Program of China (2023YFA1406200), the National Natural Science Foundation of China (T2521005, 12174144, 12474009, 12174146, and 124B2059), and the Special Construction Project Fund for Shandong Province Taishan Scholars.
Abstract: Multifunctional optical responsive materials have grown increasingly pivotal in addressing the escalating demands of sensing, detection, and anti-counterfeiting applications [1,2]. These materials exhibit distinct visible optical variations upon exposure to external stimuli, such as pressure, temperature, light, solvents, pH fluctuations, or mechanical force. Fluorescent sensing and anti-counterfeiting technologies leveraging these optical responses have emerged as highly promising solutions.
Abstract: A new tapered multimode interference (MMI)-based coherent lightwave combiner is reported. A comprehensive theoretical analysis of mode behaviors in the tapered MMI waveguide is presented, and the output characteristics of tapered MMI combiners with various structures are demonstrated. The combiner is fabricated on a silicon-on-insulator (SOI) substrate. Owing to its lack of end-facet reflection, easy extension to a multi-port configuration, high tolerance for fabrication errors, and compact size, the tapered MMI is a good candidate for a coherent lightwave combiner in large-scale photonic integrated circuits.