To address the challenge of achieving decentralized, scalable, and adaptive control for large-scale multiple unmanned aerial vehicle (multi-UAV) swarms in dynamic urban environments with obstacles and wind perturbations, we propose a hybrid framework integrating adaptive reinforcement learning (RL), multi-modal perception fusion, and enhanced pigeon flock optimization (PFO) with curiosity-driven exploration to enable robust autonomous and formation control. The framework leverages meta-learning to optimize RL policies for real-time adaptation, fuses sensor data for precise state estimation, and enhances PFO with learned leader-follower dynamics and exploration rewards to maintain cohesive formations and explore uncertain areas. For swarms of 10–30 UAVs, it achieves 34% faster convergence, a 61% reduction in stability root mean square error (RMSE), 88% fewer collisions, and 85.6%–92.3% success rates in target detection and encirclement, outperforming standard multi-agent RL, pure PFO, and single-modality RL. Three-dimensional trajectory visualizations confirm cohesive formations, collision-free maneuvers, and efficient exploration in urban search-and-rescue scenarios. Innovations include meta-RL for rapid adaptation, multi-modal fusion for robust perception, and curiosity-driven PFO for scalable, decentralized control, advancing real-world multi-UAV swarm autonomy and coordination.
Gait recognition is a key biometric for long-distance identification, yet its performance is severely degraded by real-world challenges such as varying clothing, carrying conditions, and changing viewpoints. While combining silhouette and skeleton data is a promising direction, effectively fusing these heterogeneous modalities and adaptively weighting their contributions in response to diverse conditions remains a central problem. This paper introduces GaitMAFF, a novel Multi-modal Adaptive Feature Fusion Network, to address this challenge. Our approach first transforms discrete skeleton joints into a dense SkeletonMap representation to align with silhouettes, then employs an attention-based module to dynamically learn the fusion weights between the two modalities. These fused features are processed by a powerful spatio-temporal backbone with Weighted Global-Local Feature Fusion Modules (WFFM) to learn a discriminative representation. Extensive experiments on the challenging CCPG and Gait3D datasets show that GaitMAFF achieves state-of-the-art performance, with an average Rank-1 accuracy of 84.6% on CCPG and 58.7% on Gait3D. These results demonstrate that our adaptive fusion strategy effectively integrates complementary multimodal information, significantly enhancing gait recognition robustness and accuracy in complex scenes and providing a practical solution for real-world applications.
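As an illustration of the adaptive weighting idea in the GaitMAFF abstract above, the following minimal Python sketch fuses two feature vectors with input-dependent softmax weights. It is not the authors' attention module: the function names, the dot-product scoring rule, and the `score_weights` vector are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(silhouette_feat, skeleton_feat, score_weights):
    """Fuse two equal-length feature vectors with input-dependent weights.

    ASSUMPTION: each modality is scored by a dot product with a (hypothetically
    learned) vector `score_weights`; the abstract does not specify the scoring.
    """
    s_sil = sum(w * x for w, x in zip(score_weights, silhouette_feat))
    s_ske = sum(w * x for w, x in zip(score_weights, skeleton_feat))
    # The two fusion weights are non-negative and sum to 1.
    a_sil, a_ske = softmax([s_sil, s_ske])
    fused = [a_sil * x + a_ske * y for x, y in zip(silhouette_feat, skeleton_feat)]
    return fused, (a_sil, a_ske)


if __name__ == "__main__":
    fused, weights = adaptive_fuse([2.0, 0.0], [0.0, 0.0], [1.0, 1.0])
    print("fusion weights:", weights)
    print("fused features:", fused)
```

Because the weights are recomputed from the inputs on every call, a modality whose features score higher under the current conditions contributes more to the fused vector, which is the behavior the abstract attributes to its attention-based module.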
With the convergence of sensor technology, artificial intelligence, and the Internet of Things, intelligent vibration monitoring systems are undergoing transformative development. This evolution imposes stringent demands on the miniaturization, low power consumption, high integration, and environmental adaptability of transducers. Graphene, renowned for its superlative physicochemical attributes, holds significant promise for application in micro- and nanoelectromechanical systems (M/NEMS). However, the inherent central symmetry of graphene restricts its utility in piezoelectric devices. Inspired by the sensilla trichoidea of spiders, a three-dimensional (3D) cilia-like monolayer graphene omnidirectional vibration transducer (CGVT) based on a stress-induced self-assembly mechanism is fabricated, demonstrating notable performance and high-temperature resistance. Furthermore, 3D vibration vector decoding is realized via an omnidirectional decoupling algorithm based on one-dimensional convolutional neural networks (1DCNN) to achieve precise discrimination of vibration directions. The 3D bionic vibration-sensing system incorporates a spider web structure into a bionic cilia MEMS chip through a gold wire bonding process, enabling the realization of three distinct mechanisms for vibration detection and recognition. In particular, these devices are manufactured using silicon-based semiconductor processing techniques and MEMS fabrication methodologies, leading to a substantial reduction in the dimensions of individual components compared to traditional counterparts.
Neuromorphic visual perception, by emulating the efficient information processing mechanisms of biological vision systems and integrating innovations in materials and device architectures, offers novel solutions for artificial intelligence sensing. For instance, the incorporation of low-dimensional materials (e.g., quantum dots, carbon nanotubes, and two-dimensional materials) optimizes device optoelectronic properties, while the synergistic design of organic semiconductors and oxide materials balances flexibility with complementary metal-oxide-semiconductor (CMOS) compatibility. Representative neuromorphic devices such as memristors and neuromorphic transistors address the data transmission latency and energy consumption bottlenecks of traditional vision systems via near-sensor and in-sensor architectures, offering a new paradigm for highly integrated, energy-efficient real-time perception. However, critical challenges, including device non-uniformity caused by material interface defects, system instability induced by memristor conductance drift, and environmental adaptability under complex illumination, remain barriers to scalable applications. This review comprehensively examines neuromorphic visual perception devices from the perspectives of device structure, operational mechanisms, materials, and applications. It explores the pivotal roles of memristors, electrolyte-gated transistors, and other neuromorphic devices in optical signal perception and information processing, with a focus on their implementations in visual perception tasks and future prospects.
This study extends the self-propelled particle (SPP) model by incorporating a limited vision cone and local density sensing. The results reveal that clusters can simultaneously exhibit velocity polarization and spatial cohesion within specific ranges of vision angle and density threshold. The dependence of the dynamical features, including the order parameter and density variation, on the threshold and vision cone is investigated. Furthermore, a critical threshold is identified, which governs the transition between ordered and disordered states and is closely linked to density fluctuations and noise intensity. The clustering results show that the model's behavior can be explained by a chasing mechanism responsible for cluster formation, density, and shape. These results may stimulate practical applications in swarm maneuvering.
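To make the vision-cone and order-parameter notions in the SPP abstract above concrete, here is a minimal Python sketch under stated assumptions. It uses a noise-free, Vicsek-style alignment rule in two dimensions; the abstract gives neither the update rule nor the density-sensing term, so `align_step`, `in_vision_cone`, and their parameters are hypothetical, and density sensing is omitted entirely.

```python
import math


def in_vision_cone(pos_i, heading_i, pos_j, radius, half_angle):
    """True if particle j lies within `radius` of i and inside i's vision cone."""
    dx, dy = pos_j[0] - pos_i[0], pos_j[1] - pos_i[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist > radius:
        return False
    bearing = math.atan2(dy, dx)
    # Wrap the angular offset into [-pi, pi) before comparing with the cone.
    diff = (bearing - heading_i + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= half_angle


def polarization(headings):
    """Velocity order parameter: 1 for aligned headings, near 0 for disorder."""
    n = len(headings)
    cx = sum(math.cos(h) for h in headings) / n
    cy = sum(math.sin(h) for h in headings) / n
    return math.hypot(cx, cy)


def align_step(positions, headings, radius, half_angle):
    """ASSUMED rule: each particle adopts the mean heading of itself and the
    neighbours it can see. No noise and no density sensing are modelled."""
    new_headings = []
    for i, (p, h) in enumerate(zip(positions, headings)):
        sx, sy = math.cos(h), math.sin(h)
        for j, q in enumerate(positions):
            if j != i and in_vision_cone(p, h, q, radius, half_angle):
                sx += math.cos(headings[j])
                sy += math.sin(headings[j])
        new_headings.append(math.atan2(sy, sx))
    return new_headings


if __name__ == "__main__":
    pos = [(0.0, 0.0), (1.0, 0.0)]
    head = [0.0, math.pi / 2]
    print("before:", polarization(head))
    head = align_step(pos, head, radius=2.0, half_angle=math.pi / 4)
    print("after: ", polarization(head))
```

In this two-particle demo only the rear particle sees the one ahead of it, so interactions are asymmetric. That asymmetry is the ingredient a limited vision cone adds to the standard SPP model, and it is a plausible basis for the chasing behavior the abstract mentions.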
Autism spectrum disorder (ASD) is a highly heterogeneous neurodevelopmental disorder. Early diagnosis and intervention are crucial for improving outcomes. Traditional single-modality diagnostic methods are subjective, limited, and struggle to reveal the underlying pathological mechanisms. In contrast, multimodal data analysis integrates behavioral, physiological, and neuroimaging information with advanced machine-learning and deep-learning algorithms to overcome these limitations. In this review, we surveyed the recent pediatric ASD literature, highlighting artificial intelligence-driven diagnostic techniques, multimodal data fusion strategies, and emerging trends in ASD assessment. We surveyed studies that integrated two or more modalities and summarized the fusion levels, learning paradigms, tasks, datasets, and metrics. Multimodal approaches outperform single-modality baselines in classification, severity estimation, and subtyping by leveraging complementary information and reducing modality-specific biases. Multimodal approaches significantly enhance diagnostic accuracy and comprehensiveness, enabling early screening of ASD, symptom subtyping, severity assessment, and personalized interventions. Advances in multimodal fusion techniques have promoted progress in precision medicine for the treatment of ASD.
What is spacetime? How do we perceive this medium? How can we fit it into our everyday linear lives? How can we situate ourselves within it in our post-industrial worldview, in an unsustainable world? This philosophical essay adopts a phenomenological method to interrogate the meaning of this fundamental dimension of reality. Spacetime is interpreted not merely as a physical structure but as a plastic field whose instability shapes inner and social life. Yet the contemporary human condition is marked by a profound alienation, much of which derives from a self-inflicted existential disorientation: I once chose exile and moved to a remote island in the Atlantic Ocean, becoming my own research material. In search of genuine contact with nature, the nonverbal appeared as a necessity. I turned to music as an archetypal language, in the Romantic sense of a medium offering pre-conceptual access to the real. I composed Light Atlas, a six-movement work aiming to capture the flight of seagulls and the eternal struggle between light and darkness. This led me back to physics, to my original question: the lived perception of spacetime.
In multi-modal emotion recognition, excessive reliance on historical context often impedes the detection of emotional shifts, while modality heterogeneity and unimodal noise limit recognition performance. Existing methods struggle to dynamically adjust cross-modal complementary strength to optimize fusion quality and lack effective mechanisms to model the dynamic evolution of emotions. To address these issues, we propose a multi-level dynamic gating and emotion transfer framework for multi-modal emotion recognition. A dynamic gating mechanism is applied across unimodal encoding, cross-modal alignment, and emotion transfer modeling, substantially improving noise robustness and feature alignment. First, we construct a unimodal encoder based on gated recurrent units and feature-selection gating to suppress intra-modal noise and enhance contextual representation. Second, we design a gated-attention cross-modal encoder that dynamically calibrates the complementary contributions of visual and audio modalities to the dominant textual features and eliminates redundant information. Finally, we introduce a gated enhanced emotion transfer module that explicitly models the temporal dependence of emotional evolution in dialogues via transfer gating and optimizes continuity modeling with a comparative learning loss. Experimental results demonstrate that the proposed method outperforms state-of-the-art models on the public MELD and IEMOCAP datasets.
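The gating idea in the abstract above, where auxiliary modalities are admitted into the dominant textual features only as far as a learned gate allows, can be sketched as follows. This is a generic element-wise sigmoid gate with a residual connection, not the authors' gated-attention encoder; `gated_fuse` and the scalar parameters `w_text`, `w_aux`, and `bias` are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_fuse(text_feat, aux_feat, w_text, w_aux, bias):
    """Add an auxiliary modality to text features through a per-element gate.

    ASSUMPTION: one scalar weight per modality and a shared bias stand in for
    the learned gate parameters, which the abstract does not describe.
    """
    fused = []
    for t, a in zip(text_feat, aux_feat):
        gate = sigmoid(w_text * t + w_aux * a + bias)  # 0 = block, 1 = pass
        fused.append(t + gate * a)  # text stays dominant via the residual path
    return fused


if __name__ == "__main__":
    text = [1.0, 2.0]
    audio = [0.5, -0.5]
    print("gate closed:", gated_fuse(text, audio, 0.0, 0.0, -50.0))
    print("gate half:  ", gated_fuse(text, audio, 0.0, 0.0, 0.0))
    print("gate open:  ", gated_fuse(text, audio, 0.0, 0.0, 50.0))
```

With the gate driven toward zero the output reduces to the text features alone, which is how a gate of this kind can suppress a noisy auxiliary channel.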
Objectives: Psychological resilience is a critical resource for vocational high school students navigating social biases and fostering mental well-being. This six-month longitudinal study investigated the developmental trajectories of discrimination perception, vocational identity, and psychological resilience in this population. It further examined the longitudinal mediating role of vocational identity in the relationship between discrimination perception and psychological resilience. Methods: A total of 526 students from five vocational high schools in Guangdong, China, were assessed via convenience sampling at two time points: baseline (T1, September 2023) and six-month follow-up (T2, March 2024). Measures of discrimination perception, psychological resilience, and vocational identity were administered. Data were analyzed using a cross-lagged panel model to test for bidirectional relationships. Results: Over the six-month period, students showed significant decreases in discrimination perception and vocational identity, but a significant increase in psychological resilience. The cross-lagged model revealed significant bidirectional relationships: discrimination perception and psychological resilience negatively predicted each other over time (β = −0.124, p < 0.01; β = −0.200, p < 0.001), while psychological resilience and vocational identity positively predicted each other (β = 0.084, p < 0.05; β = 0.076, p < 0.05). The mediation analysis revealed a dual-pathway mechanism. T1 discrimination perception exerted both a significant direct negative effect on T2 psychological resilience (β = −0.332, p < 0.001) and a significant indirect positive effect via T1 vocational identity (indirect effect = 0.020, 95% CI [0.001, 0.046]). This confirms a partial mediating role, indicating that vocational identity functions as a compensatory mechanism, transforming the experience of discrimination perception into a potential source of psychological resilience. Conclusions: For vocational high school students, perception of discrimination directly undermines psychological resilience, but also indirectly fosters it through the positive development of vocational identity. These findings highlight vocational identity as a pivotal mechanism in the complex relationship between social adversity and mental resilience.
As a cornerstone for applications such as autonomous driving, 3D urban perception is a burgeoning field of study. Enhancing the performance and robustness of these perception systems is crucial for ensuring the safety of next-generation autonomous vehicles. In this work, we introduce a novel neural scene representation called Street Detection Gaussians (SDGs), which redefines urban 3D perception through an integrated architecture unifying reconstruction and detection. At its core lies the dynamic Gaussian representation, where time-conditioned parameterization enables simultaneous modeling of static environments and dynamic objects through physically constrained Gaussian evolution. The framework’s radar-enhanced perception module learns cross-modal correlations between sparse radar data and dense visual features, resulting in a 22% reduction in occlusion errors compared to vision-only systems. A breakthrough differentiable rendering pipeline back-propagates semantic detection losses throughout the entire 3D reconstruction process, enabling the optimization of both geometric and semantic fidelity. Evaluated on the Waymo Open Dataset and the KITTI Dataset, the system achieves real-time performance (135 frames per second (FPS)), photorealistic quality (peak signal-to-noise ratio (PSNR) of 34.9 dB), and state-of-the-art detection accuracy (78.1% mean average precision (mAP)), demonstrating a 3.8× end-to-end improvement over existing hybrid approaches while enabling seamless integration with autonomous driving stacks.
Legged robots have considerable potential for traversing unstructured environments; nonetheless, their rigid structures often constrain adaptability and obstacle negotiation. This study presents a novel Soft Tri-Legged Robot (STLR) that improves movement and obstacle-avoidance skills by using a bio-inspired pneumatic artificial muscle (Bubble Artificial Muscles, BAMs) and a bio-inspired tactile sensor (TacTip). The STLR is actuated by BAMs, which are flexible, pneumatically driven actuators that provide fine control over forward, backward, and steering movements. Obstacle identification and avoidance are facilitated by the TacTip sensor, which delivers tactile input for traversing unstructured terrains. We delineate the mechanical features of the BAMs, assess the functionality of the robot's legs, and elaborate on the incorporation of the tactile sensing system. Experimental results demonstrate that the STLR can effectively achieve multi-directional flexible movement and obstacle avoidance through a cross-modal perception-actuation mechanism. This study highlights the promise of soft robotics for search and rescue, medical aid, and autonomous exploration, while delineating difficulties and opportunities for future improvements in functionality and efficiency.
Embodied intelligent systems integrate perception, control, and decision-making within physical agents, and have become a cornerstone of modern aerospace, autonomous driving, and cooperative robotic applications. When operating in uncertain and dynamic environments, such systems must address challenges arising from incomplete sensing, unpredictable maneuvers, communication constraints, disturbances, and evolving network structures.
With the growth of the elderly population and the rising cost of health care, the role of service robots in aiding the disabled and the elderly is becoming important. Many researchers around the world have paid much attention to healthcare robots and rehabilitation robots. To achieve natural and harmonious communication between the user and a service robot, the information perception/feedback ability and the interaction ability of service robots have become key issues.
This article examines the complex relationship between disease perception, negative emotions, and their impact on postoperative recovery in patients with perianal diseases. These conditions not only cause physical discomfort, but also carry a significant emotional burden, often exacerbated by social stigma. Psychological factors, including stress, anxiety, and depression, activate neuroendocrine pathways, such as the hypothalamic–pituitary–adrenal axis, disrupting the gut microbiota and leading to dysbiosis. This disruption can delay wound healing, prolong hospital stay, and intensify pain. Drawing on the findings of Hou et al., our article highlights the critical role of illness perception and negative emotions in shaping recovery outcomes. It advocates for a holistic approach that integrates psychological support and gut microbiota modulation, to enhance healing and improve overall patient outcomes.
Objectives: Diabetes remains a major global health challenge, including in China. Artificial intelligence (AI) has demonstrated considerable potential in improving diabetes management. This study aimed to assess healthcare providers’ perceptions regarding AI in diabetes care across China. Methods: A cross-sectional survey was conducted from November 12 to November 24, 2024. A total of 514 physicians and nurses were recruited by snowball sampling from healthcare providers across 30 cities or provinces in China. The self-developed questionnaire comprised five sections with 19 questions assessing medical workers’ demographic characteristics, AI-related experience and interest, awareness, attitudes, and concerns regarding AI in diabetes care. Statistical analysis was performed using the t-test, analysis of variance (ANOVA), and linear regression. Results: Among the respondents, 20.0% and 48.1% had participated in AI-related research and training, respectively, while 85.4% expressed moderate to high interest in AI training for diabetes care. Most respondents reported partial awareness of AI in diabetes care, and only 12.6% exhibited a comprehensive or substantial understanding. Attitudes toward AI in diabetes care were generally positive, with a mean score of 24.50 ± 3.38. Nurses demonstrated significantly higher scores than physicians (P < 0.05). Greater awareness, prior AI training experience, and higher interest in AI training in diabetes care were strongly associated with more positive attitudes (P < 0.05). Key concerns regarding AI included trust issues from AI-clinician inconsistencies (77.2%), increased workload and clinical workflow disruptions (63.4%), and incomplete legal and regulatory frameworks (60.3%). Only 34.2% of respondents expressed concerns about job displacement, indicating general confidence in their professional roles. Conclusions: While Chinese healthcare providers show moderate awareness of AI in diabetes care, their attitudes are generally positive, and they are considerably interested in future training. Tailored, role-specific AI training is essential for equitable and effective integration into clinical practice. Additionally, transparent, reliable, ethical AI models must be prioritized to alleviate practitioners’ concerns.
Objective: To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods: Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using metrics including accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Further performance assessment was conducted through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results: A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, and F1 score = 0.933). The AWMDF model outperformed the model-ablation and data-ablation variants as well as various traditional machine learning models. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion: The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
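Two quantities in the abstract above lend themselves to a short worked example: the F1 score, which is the harmonic mean of precision and recall, and decision-level fusion by weighted averaging. The Python sketch below checks that the reported precision and recall are consistent with the reported F1, then contrasts plain averaging with fixed-weight fusion. The per-modality probabilities and weights are invented for illustration, and `fuse_probabilities` is a hypothetical helper, not the AWMDF model, which learns its weights adaptively.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def fuse_probabilities(probs, weights):
    """Decision-level fusion: weighted average of per-modality probabilities.

    Equal weights reduce this to simple averaging.
    """
    total = sum(weights)
    return sum(p * w for p, w in zip(probs, weights)) / total


if __name__ == "__main__":
    # Reported values from the abstract: precision = 0.937, recall = 0.929.
    print(round(f1_score(0.937, 0.929), 3))  # 0.933, matching the reported F1

    # HYPOTHETICAL per-modality probabilities of severe stenosis
    # (e.g. clinical, tongue, facial) and illustrative fixed weights.
    probs = [0.9, 0.4, 0.6]
    print(fuse_probabilities(probs, [1, 1, 1]))        # simple averaging
    print(fuse_probabilities(probs, [0.6, 0.1, 0.3]))  # fixed-weight fusion
```

The fixed-weight call shows why the choice of weights matters: trusting the first modality more moves the fused probability toward its estimate, and an adaptive scheme would set those weights per input rather than once for all patients.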
Traditional Chinese medicine (TCM) demonstrates distinctive advantages in disease prevention and treatment. However, analyzing its biological mechanisms through the modern medical research paradigm of “single drug, single target” presents significant challenges due to its holistic approach. Network pharmacology and its core theory of network targets connect drugs and diseases from a holistic and systematic perspective based on biological networks, overcoming the limitations of reductionist research models and showing considerable value in TCM research. Recent integration of network target computational and experimental methods with artificial intelligence (AI) and multi-modal multi-omics technologies has substantially enhanced network pharmacology methodology. The advancement in computational and experimental techniques provides complementary support for network target theory in decoding TCM principles. This review, centered on network targets, examines the progress of network target methods combined with AI in predicting disease molecular mechanisms and drug-target relationships, alongside the application of multi-modal multi-omics technologies in analyzing TCM formulae, syndromes, and toxicity. Looking forward, network target theory is expected to incorporate emerging technologies while developing novel approaches aligned with its unique characteristics, potentially leading to significant breakthroughs in TCM research and advancing scientific understanding and innovation in TCM.
The multi-modal characteristics of mineral particles play a pivotal role in enhancing classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, the existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting the classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. Initially, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from two types of polarized images to ensure feature alignment. Subsequently, a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images. Ultimately, the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision. Quantitative and qualitative experimental results indicate that, compared with the current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in terms of mineral particle multi-modal feature learning and four classification evaluation metrics. It also demonstrates better stability than the existing models.
Spatial computing and augmented reality are advancing rapidly, with the goal of seamlessly blending virtual and physical worlds. However, traditional depth-sensing systems are bulky and energy-intensive, limiting their use in wearable devices. To overcome this, recent research by X. Liu et al. presents a compact binocular metalens-based depth perception system that integrates efficient edge detection through an advanced neural network. This system enables accurate, real-time depth mapping even in complex environments, enhancing potential applications in augmented reality, robotics, and autonomous systems.
With the advent of the next-generation Air Traffic Control (ATC) system, there is growing interest in using Artificial Intelligence (AI) techniques to enhance Situation Awareness (SA) for ATC Controllers (ATCOs), i.e., Intelligent SA (ISA). However, the existing AI-based SA approaches often rely on unimodal data and lack a comprehensive description and benchmark of the ISA tasks utilizing multi-modal data for real-time ATC environments. To address this gap, by analyzing the situation awareness procedure of the ATCOs, the ISA task is refined to the processing of the two primary elements, i.e., spoken instructions and flight trajectories. Subsequently, the ISA is further formulated into Controlling Intent Understanding (CIU) and Flight Trajectory Prediction (FTP) tasks. For the CIU task, an innovative automatic speech recognition and understanding framework is designed to extract the controlling intent from unstructured and continuous ATC communications. For the FTP task, the single- and multi-horizon FTP approaches are investigated to support the high-precision prediction of the situation evolution. A total of 32 unimodal/multi-modal advanced methods with extensive evaluation metrics are introduced to conduct the benchmarks on the real-world multi-modal ATC situation dataset. Experimental results demonstrate the effectiveness of AI-based techniques in enhancing ISA for the ATC environment.
Funding (multi-UAV swarm control study): Supported by the National Natural Science Foundation of China (No. 62350048).
Funding: Funded by the Natural Science Foundation of Chongqing Municipality, grant number CSTB2022NSCQ-MSX0503.
Abstract: Gait recognition is a key biometric for long-distance identification, yet its performance is severely degraded by real-world challenges such as varying clothing, carrying conditions, and changing viewpoints. While combining silhouette and skeleton data is a promising direction, effectively fusing these heterogeneous modalities and adaptively weighting their contributions in response to diverse conditions remains a central problem. This paper introduces GaitMAFF, a novel Multi-modal Adaptive Feature Fusion Network, to address this challenge. Our approach first transforms discrete skeleton joints into a dense SkeletonMap representation to align with silhouettes, then employs an attention-based module to dynamically learn the fusion weights between the two modalities. These fused features are processed by a powerful spatio-temporal backbone with Weighted Global-Local Feature Fusion Modules (WFFM) to learn a discriminative representation. Extensive experiments on the challenging CCPG and Gait3D datasets show that GaitMAFF achieves state-of-the-art performance, with an average Rank-1 accuracy of 84.6% on CCPG and 58.7% on Gait3D. These results demonstrate that our adaptive fusion strategy effectively integrates complementary multimodal information, significantly enhancing gait recognition robustness and accuracy in complex scenes and providing a practical solution for real-world applications.
Funding: Supported by the Deep Earth Probe and Mineral Resources Exploration-National Science and Technology Major Project (No. 2024ZD1003100) and the National Key R&D Program of China (Grant No. 2024YFC2813700).
Abstract: With the convergence of sensor technology, artificial intelligence, and the Internet of Things, intelligent vibration monitoring systems are undergoing transformative development. This evolution imposes stringent demands on the miniaturization, low power consumption, high integration, and environmental adaptability of transducers. Graphene, renowned for its superlative physicochemical attributes, holds significant promise for application in micro- and nanoelectromechanical systems (M/NEMS). However, the inherent central symmetry of graphene restricts its utility in piezoelectric devices. Inspired by the sensilla trichoidea of spiders, a three-dimensional (3D) cilia-like monolayer graphene omnidirectional vibration transducer (CGVT) based on a stress-induced self-assembly mechanism is fabricated, demonstrating notable performance and high-temperature resistance. Furthermore, 3D vibration vector decoding is realized via an omnidirectional decoupling algorithm based on one-dimensional convolutional neural networks (1DCNN) to achieve precise discrimination of vibration directions. The 3D bionic vibration-sensing system incorporates a spider web structure into a bionic cilia MEMS chip through a gold wire bonding process, enabling the realization of three distinct mechanisms for vibration detection and recognition. In particular, these devices are manufactured using silicon-based semiconductor processing techniques and MEMS fabrication methodologies, leading to a substantial reduction in the dimensions of individual components compared to traditional counterparts.
Funding: Supported by the Post-Moore Major Project of the National Natural Science Foundation of China (Grant No. 92364204), the Zhejiang Province program for introducing and cultivating leading innovation and entrepreneurship teams (Grant No. 2023R01011), the Zhejiang Provincial Natural Science Foundation of China (Grant No. LMS25F040005), and the Key R&D Program of Zhejiang (Grant No. 2024SSYS0042).
Abstract: Neuromorphic visual perception, by emulating the efficient information processing mechanisms of biological vision systems and integrating innovations in materials and device architectures, offers novel solutions for artificial intelligence sensing. For instance, the incorporation of low-dimensional materials (e.g., quantum dots, carbon nanotubes, and two-dimensional materials) optimizes device optoelectronic properties, while the synergistic design of organic semiconductors and oxide materials balances flexibility with complementary metal-oxide-semiconductor (CMOS) compatibility. Representative neuromorphic devices such as memristors and neuromorphic transistors address the data transmission latency and energy consumption bottlenecks of traditional vision systems via near-sensor and in-sensor architectures, offering a new paradigm for highly integrated, energy-efficient real-time perception. However, critical challenges, including device non-uniformity caused by material interface defects, system instability induced by memristor conductance drift, and environmental adaptability under complex illumination, remain barriers to scalable applications. This review comprehensively examines neuromorphic visual perception devices from the perspectives of device structure, operational mechanisms, materials, and applications. It explores the pivotal roles of memristors, electrolyte-gated transistors, and other neuromorphic devices in optical signal perception and information processing, with a focus on their implementations in visual perception tasks and future prospects.
Funding: Supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant No. KYCX240139) and funded by the Youth Independent Innovation Fund of PLA Army Engineering University (Grant No. KYJBJKQTZQ23006).
Abstract: This study extends the self-propelled particle (SPP) model by incorporating a limited vision cone and local density sensing. The results reveal that clusters can simultaneously exhibit velocity polarization and spatial cohesion within specific ranges of vision angle and density threshold. The dependence of the dynamical features, including the order parameter and density variation, on the threshold and vision cone is investigated. Furthermore, a critical threshold is identified, which governs the transition between ordered and disordered states and is closely linked to density fluctuations and noise intensity. The clustering results are explained by a chasing mechanism that is responsible for cluster formation, density, and shape. These results may stimulate practical applications in swarm maneuvering.
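As a rough illustration of what a limited vision cone means in an SPP model, the sketch below shows a noise-free, Vicsek-style alignment step in which a particle only aligns with neighbors that lie within a sensing radius and inside a cone around its own heading, together with the usual polarization order parameter. The function names, the parameters (`radius`, `half_angle`, `speed`), and the alignment rule itself are assumptions made for illustration; the abstract does not specify the update rule, and the density-sensing and noise terms are deliberately left out.

```python
import math

def sees(pos_i, heading_i, pos_j, radius, half_angle):
    """True if particle j lies within `radius` of i and inside i's vision cone.

    The cone is centered on i's heading and spans +/- half_angle radians.
    (Hypothetical helper; not taken from the paper.)
    """
    dx, dy = pos_j[0] - pos_i[0], pos_j[1] - pos_i[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return True
    if dist > radius:
        return False
    bearing = math.atan2(dy, dx)
    # Wrap the angle between heading and bearing into [-pi, pi].
    rel = math.atan2(math.sin(bearing - heading_i), math.cos(bearing - heading_i))
    return abs(rel) <= half_angle

def align_step(positions, headings, speed, radius, half_angle):
    """One noise-free alignment step: each particle adopts the mean heading
    of itself and the neighbors it can see, then moves at constant speed."""
    new_headings = []
    for i, (p, h) in enumerate(zip(positions, headings)):
        sx = sy = 0.0
        for j, q in enumerate(positions):
            if i == j or sees(p, h, q, radius, half_angle):
                sx += math.cos(headings[j])
                sy += math.sin(headings[j])
        new_headings.append(math.atan2(sy, sx))
    new_positions = [
        (x + speed * math.cos(t), y + speed * math.sin(t))
        for (x, y), t in zip(positions, new_headings)
    ]
    return new_positions, new_headings

def polarization(headings):
    """Velocity order parameter: 1 for perfect alignment, near 0 for disorder."""
    n = len(headings)
    cx = sum(math.cos(t) for t in headings) / n
    cy = sum(math.sin(t) for t in headings) / n
    return math.hypot(cx, cy)
```

Because the cone makes the interaction non-reciprocal (a leader ahead is seen by a follower, but not the other way round), even this simplified rule produces follow-the-leader behavior, which is one plausible reading of the chasing mechanism mentioned above.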
Funding: Supported by the National Key Research and Development Program of China (Research Grant Number: 2023YFC3603600).
Abstract: Autism spectrum disorder (ASD) is a highly heterogeneous neurodevelopmental disorder. Early diagnosis and intervention are crucial for improving outcomes. Traditional single-modality diagnostic methods are subjective, limited, and struggle to reveal the underlying pathological mechanisms. In contrast, multimodal data analysis integrates behavioral, physiological, and neuroimaging information with advanced machine-learning and deep-learning algorithms to overcome these limitations. In this review, we surveyed the recent pediatric ASD literature, highlighting artificial intelligence-driven diagnostic techniques, multimodal data fusion strategies, and emerging trends in ASD assessment. We surveyed studies that integrated two or more modalities and summarized the fusion levels, learning paradigms, tasks, datasets, and metrics. Multimodal approaches outperform single-modality baselines in classification, severity estimation, and subtyping by leveraging complementary information and reducing modality-specific biases. Multimodal approaches significantly enhance diagnostic accuracy and comprehensiveness, enabling early screening of ASD, symptom subtyping, severity assessment, and personalized interventions. Advances in multimodal fusion techniques have promoted progress in precision medicine for the treatment of ASD.
Abstract: What is spacetime? How do we perceive this medium? How can we fit it into our everyday linear lives? How can we situate ourselves within it in our post-industrial worldview, in an unsustainable world? This philosophical essay adopts a phenomenological method to interrogate the meaning of this fundamental dimension of reality. Spacetime is interpreted not merely as a physical structure but as a plastic field whose instability shapes inner and social life. Yet the contemporary human condition is marked by a profound alienation, much of which derives from a self-inflicted existential disorientation: I once chose exile and moved to a remote island in the Atlantic Ocean, becoming my own research material. In search of genuine contact with nature, the nonverbal appeared as a necessity. I turned to music as an archetypal language, in the Romantic sense of a medium offering pre-conceptual access to the real. I composed Light Atlas, a six-movement work aiming to capture the flight of seagulls and the eternal struggle between light and darkness. This led me back to physics, to my original question: the lived perception of spacetime.
Funding: Funded by the Fanying Special Program of the National Natural Science Foundation of China (grant number 62341307), the Scientific Research Project of Jiangxi Provincial Department of Education (grant number GJJ200839), and the Doctoral Startup Fund of Jiangxi University of Technology (grant number 205200100402).
Abstract: In multi-modal emotion recognition, excessive reliance on historical context often impedes the detection of emotional shifts, while modality heterogeneity and unimodal noise limit recognition performance. Existing methods struggle to dynamically adjust cross-modal complementary strength to optimize fusion quality and lack effective mechanisms to model the dynamic evolution of emotions. To address these issues, we propose a multi-level dynamic gating and emotion transfer framework for multi-modal emotion recognition. A dynamic gating mechanism is applied across unimodal encoding, cross-modal alignment, and emotion transfer modeling, substantially improving noise robustness and feature alignment. First, we construct a unimodal encoder based on gated recurrent units and feature-selection gating to suppress intra-modal noise and enhance contextual representation. Second, we design a gated-attention cross-modal encoder that dynamically calibrates the complementary contributions of visual and audio modalities to the dominant textual features and eliminates redundant information. Finally, we introduce a gated enhanced emotion transfer module that explicitly models the temporal dependence of emotional evolution in dialogues via transfer gating and optimizes continuity modeling with a comparative learning loss. Experimental results demonstrate that the proposed method outperforms state-of-the-art models on the public MELD and IEMOCAP datasets.
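The general idea of gating an auxiliary modality into a dominant one can be sketched in a few lines. The example below is a generic element-wise gated fusion, not the architecture from the abstract: the gate form, the parameter names (`w_text`, `w_aux`, `bias`), and the additive combination are all assumptions chosen for illustration. In a trained model the gate parameters would be learned; here they are passed in by hand.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fuse(text, aux, w_text, w_aux, bias):
    """Element-wise gated fusion of an auxiliary feature vector into text features.

    For each dimension: g = sigmoid(w_text * t + w_aux * a + bias), fused = t + g * a.
    A gate near 0 suppresses the auxiliary feature (e.g. when it is noisy);
    a gate near 1 lets it through unchanged.
    (Hypothetical parameterization; the paper's gating layers are not specified here.)
    """
    gates = [
        sigmoid(wt * t + wa * a + b)
        for t, a, wt, wa, b in zip(text, aux, w_text, w_aux, bias)
    ]
    fused = [t + g * a for t, a, g in zip(text, aux, gates)]
    return fused, gates
```

Calling `gated_fuse([1.0, 2.0], [0.5, -0.5], [0.0, 0.0], [0.0, 0.0], [-50.0, 50.0])` closes the gate on the first dimension and opens it on the second, so the first output stays at the text value while the second absorbs the auxiliary feature.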
Funding: Supported by the Guangdong Provincial Philosophy and Social Science “14th Five-Year Plan” Discipline Co-Construction Project (Grant No. GD22XJY14), the 2022 Guangdong Provincial Higher Education Teaching Reform Project (Grant No. Yue Jiao Gao [2023] 4), and Guangdong Polytechnic Normal University’s Project for Enhancing the Research Capacity of Doctoral Application Institution (Grant No. 22GPNUZDJS48).
Abstract: Objectives: Psychological resilience is a critical resource for vocational high school students navigating social biases and fostering mental well-being. This six-month longitudinal study investigated the developmental trajectories of discrimination perception, vocational identity, and psychological resilience in this population. It further examined the longitudinal mediating role of vocational identity in the relationship between discrimination perception and psychological resilience. Methods: A total of 526 students from five vocational high schools in Guangdong, China, were assessed via convenience sampling at two time points: baseline (T1, September 2023) and six-month follow-up (T2, March 2024). Measures of discrimination perception, psychological resilience, and vocational identity were administered. Data were analyzed using a cross-lagged panel model to test for bidirectional relationships. Results: Over the six-month period, students showed significant decreases in discrimination perception and vocational identity, but a significant increase in psychological resilience. The cross-lagged model revealed significant bidirectional relationships: discrimination perception and psychological resilience negatively predicted each other over time (β = −0.124, p < 0.01; β = −0.200, p < 0.001), while psychological resilience and vocational identity positively predicted each other (β = 0.084, p < 0.05; β = 0.076, p < 0.05). The mediation analysis revealed a dual-pathway mechanism. T1 discrimination perception exerted both a significant direct negative effect on T2 psychological resilience (β = −0.332, p < 0.001) and a significant indirect positive effect via T1 vocational identity (indirect effect = 0.020, 95% CI [0.001, 0.046]). This confirms a partial mediating role, indicating that vocational identity functions as a compensatory mechanism, transforming the experience of discrimination perception into a potential source of psychological resilience. Conclusions: For vocational high school students, perception of discrimination directly undermines psychological resilience, but also indirectly fosters it through the positive development of vocational identity. These findings highlight vocational identity as a pivotal mechanism in the complex relationship between social adversity and mental resilience.
Abstract: As a cornerstone for applications such as autonomous driving, 3D urban perception is a burgeoning field of study. Enhancing the performance and robustness of these perception systems is crucial for ensuring the safety of next-generation autonomous vehicles. In this work, we introduce a novel neural scene representation called Street Detection Gaussians (SDGs), which redefines urban 3D perception through an integrated architecture unifying reconstruction and detection. At its core lies the dynamic Gaussian representation, where time-conditioned parameterization enables simultaneous modeling of static environments and dynamic objects through physically constrained Gaussian evolution. The framework’s radar-enhanced perception module learns cross-modal correlations between sparse radar data and dense visual features, resulting in a 22% reduction in occlusion errors compared to vision-only systems. A breakthrough differentiable rendering pipeline back-propagates semantic detection losses throughout the entire 3D reconstruction process, enabling the optimization of both geometric and semantic fidelity. Evaluated on the Waymo Open Dataset and the KITTI Dataset, the system achieves real-time performance (135 Frames Per Second (FPS)), photorealistic quality (Peak Signal-to-Noise Ratio (PSNR) 34.9 dB), and state-of-the-art detection accuracy (78.1% Mean Average Precision (mAP)), demonstrating a 3.8× end-to-end improvement over existing hybrid approaches while enabling seamless integration with autonomous driving stacks.
Funding: The Natural Science Foundation of China (Project for Young Scientists: Grant No. 52105010; Regular Project: Grant No. 62173096), the Natural Science Foundation of Guangdong Province (Regular Project: Grant No. 2025A1515012124, Grant No. 2022A1515010327), and the Guangdong-Hong Kong-Macao Key Laboratory of Multi-scale Information Fusion and Collaborative Optimization Control Manufacturing Process.
Abstract: Legged robots have considerable potential for traversing unstructured situations; nonetheless, their inflexible frameworks often constrain adaptability and obstacle negotiation. This study presents a revolutionary Soft Tri-Legged Robot (STLR) that improves movement and obstacle-avoidance skills by using a bio-inspired pneumatic artificial muscle (Bubble Artificial Muscles, BAMs) and a bio-inspired tactile sensor (TacTip). The STLR is activated by BAMs, which are flexible, pneumatic-driven actuators that provide fine control over forward, backward, and steering movements. Obstacle identification and avoidance are facilitated by the TacTip sensor, which delivers tactile input for traversing unstructured terrains. We delineate the mechanical features of the BAMs, assess the functionality of the robot's legs, and elaborate on the incorporation of the tactile sensing system. Experimental results demonstrate that the STLR can effectively achieve multi-directional flexible movement and obstacle avoidance through a cross-modal perception-actuation mechanism. This study highlights the promise of soft robotics for search and rescue, medical aid, and autonomous exploration, while delineating difficulties and opportunities for future improvements in functionality and efficiency.
Abstract: Embodied intelligent systems integrate perception, control, and decision-making within physical agents, and have become a cornerstone of modern aerospace, autonomous driving, and cooperative robotic applications. When operating in uncertain and dynamic environments, such systems must address challenges arising from incomplete sensing, unpredictable maneuvers, communication constraints, disturbances, and evolving network structures.
Abstract: With the growth of the elderly population and rising health care costs, the role of service robots in aiding the disabled and the elderly is becoming important. Many researchers around the world have paid much attention to healthcare robots and rehabilitation robots. To achieve natural and harmonious communication between the user and a service robot, the robot's information perception/feedback ability and interaction ability have become key issues.
Abstract: This article examines the complex relationship between disease perception, negative emotions, and their impact on postoperative recovery in patients with perianal diseases. These conditions not only cause physical discomfort, but also carry a significant emotional burden, often exacerbated by social stigma. Psychological factors, including stress, anxiety, and depression, activate neuroendocrine pathways, such as the hypothalamic–pituitary–adrenal axis, disrupting the gut microbiota and leading to dysbiosis. This disruption can delay wound healing, prolong hospital stay, and intensify pain. Drawing on the findings of Hou et al., our article highlights the critical role of illness perception and negative emotions in shaping recovery outcomes. It advocates for a holistic approach that integrates psychological support and gut microbiota modulation to enhance healing and improve overall patient outcomes.
Funding: Supported by the Jiangsu Provincial Department of Science and Technology Social Development Project (No. BE2020787).
Abstract: Objectives Diabetes remains a major global health challenge in China. Artificial intelligence (AI) has demonstrated considerable potential in improving diabetes management. This study aimed to assess healthcare providers’ perceptions regarding AI in diabetes care across China. Methods A cross-sectional survey was conducted using snowball sampling from November 12 to November 24, 2024. A total of 514 physicians and nurses were recruited from healthcare providers across 30 cities or provinces in China. The self-developed questionnaire comprised five sections with 19 questions assessing medical workers’ demographic characteristics, AI-related experience and interest, awareness, attitudes, and concerns regarding AI in diabetes care. Statistical analysis was performed using t-test, analysis of variance (ANOVA), and linear regression. Results Among the respondents, 20.0% and 48.1% had participated in AI-related research and training, respectively, while 85.4% expressed moderate to high interest in AI training for diabetes care. Most respondents reported partial awareness of AI in diabetes care, and only 12.6% exhibited a comprehensive or substantial understanding. Attitudes toward AI in diabetes care were generally positive, with a mean score of 24.50 ± 3.38. Nurses demonstrated significantly higher scores than physicians (P < 0.05). Greater awareness, prior AI training experience, and higher interest in AI training in diabetes care were strongly associated with more positive attitudes (P < 0.05). Key concerns regarding AI included trust issues from AI-clinician inconsistencies (77.2%), increased workload and clinical workflow disruptions (63.4%), and incomplete legal and regulatory frameworks (60.3%). Only 34.2% of respondents expressed concerns about job displacement, indicating general confidence in their professional roles. Conclusions While Chinese healthcare providers show moderate awareness of AI in diabetes care, their attitudes are generally positive, and they are considerably interested in future training. Tailored, role-specific AI training is essential for equitable and effective integration into clinical practice. Additionally, transparent, reliable, ethical AI models must be prioritized to alleviate practitioners’ concerns.
Funding: Construction Program of the Key Discipline of State Administration of Traditional Chinese Medicine of China (ZYYZDXK-2023069), Research Project of Shanghai Municipal Health Commission (2024QN018), and Shanghai University of Traditional Chinese Medicine Science and Technology Development Program (23KFL005).
Abstract: Objective To develop a non-invasive predictive model for coronary artery stenosis severity based on adaptive multi-modal integration of traditional Chinese and western medicine data. Methods Clinical indicators, echocardiographic data, traditional Chinese medicine (TCM) tongue manifestations, and facial features were collected from patients who underwent coronary computed tomography angiography (CTA) in the Cardiac Care Unit (CCU) of Shanghai Tenth People's Hospital between May 1, 2023 and May 1, 2024. An adaptive weighted multi-modal data fusion (AWMDF) model based on deep learning was constructed to predict the severity of coronary artery stenosis. The model was evaluated using metrics including accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC). Further performance assessment was conducted through comparisons with six ensemble machine learning methods, data ablation, model component ablation, and various decision-level fusion strategies. Results A total of 158 patients were included in the study. The AWMDF model achieved excellent predictive performance (AUC = 0.973, accuracy = 0.937, precision = 0.937, recall = 0.929, and F1 score = 0.933). Compared with model ablation, data ablation experiments, and various traditional machine learning models, the AWMDF model demonstrated superior performance. Moreover, the adaptive weighting strategy outperformed alternative approaches, including simple weighting, averaging, voting, and fixed-weight schemes. Conclusion The AWMDF model demonstrates potential clinical value in the non-invasive prediction of coronary artery disease and could serve as a tool for clinical decision support.
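To make the difference between the decision-level fusion strategies named above concrete, the following sketch contrasts averaging, majority voting, and an adaptive weighting in which per-modality weights come from a softmax over relevance scores. It is only an illustration under stated assumptions: the abstract does not say how AWMDF derives its weights, so the `scores` input (imagined here as the output of some learned scoring network), the function names, and the toy probabilities are all hypothetical.

```python
import math
from collections import Counter

def softmax(scores):
    """Turn arbitrary per-modality scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_weighted(probs, weights):
    """Weighted sum of per-modality class-probability vectors."""
    n_classes = len(probs[0])
    return [sum(w * p[c] for w, p in zip(weights, probs)) for c in range(n_classes)]

def fuse_average(probs):
    """Fixed, equal weights for every modality."""
    return fuse_weighted(probs, [1.0 / len(probs)] * len(probs))

def fuse_vote(probs):
    """Majority vote over each modality's most likely class."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in probs]
    return Counter(votes).most_common(1)[0][0]

def argmax(vec):
    return max(range(len(vec)), key=vec.__getitem__)

# Toy example: four modalities, two classes (made-up numbers).
probs = [[0.1, 0.9], [0.7, 0.3], [0.6, 0.4], [0.7, 0.3]]
scores = [3.0, 0.0, 0.0, 0.0]  # hypothetical: the first modality is judged most reliable
adaptive = fuse_weighted(probs, softmax(scores))
```

In this toy case averaging and voting both follow the three weaker modalities, while the adaptive weights let the single confident modality dominate. Whether that is the right call depends entirely on how well the weights are learned, which is the part the sketch does not cover.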
Abstract: Traditional Chinese medicine (TCM) demonstrates distinctive advantages in disease prevention and treatment. However, analyzing its biological mechanisms through the modern medical research paradigm of “single drug, single target” presents significant challenges due to its holistic approach. Network pharmacology and its core theory of network targets connect drugs and diseases from a holistic and systematic perspective based on biological networks, overcoming the limitations of reductionist research models and showing considerable value in TCM research. Recent integration of network target computational and experimental methods with artificial intelligence (AI) and multi-modal multi-omics technologies has substantially enhanced network pharmacology methodology. The advancement in computational and experimental techniques provides complementary support for network target theory in decoding TCM principles. This review, centered on network targets, examines the progress of network target methods combined with AI in predicting disease molecular mechanisms and drug-target relationships, alongside the application of multi-modal multi-omics technologies in analyzing TCM formulae, syndromes, and toxicity. Looking forward, network target theory is expected to incorporate emerging technologies while developing novel approaches aligned with its unique characteristics, potentially leading to significant breakthroughs in TCM research and advancing scientific understanding and innovation in TCM.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62071315 and 62271336).
Abstract: The multi-modal characteristics of mineral particles play a pivotal role in enhancing the classification accuracy, which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation and utilization of its resources. However, the existing methods for classifying mineral particles do not fully utilize these multi-modal features, thereby limiting the classification accuracy. Furthermore, when conventional multi-modal image classification methods are applied to plane-polarized and cross-polarized sequence images of mineral particles, they encounter issues such as information loss, misaligned features, and challenges in spatiotemporal feature extraction. To address these challenges, we propose a multi-modal mineral particle polarization image classification network (MMGC-Net) for precise mineral particle classification. Initially, MMGC-Net employs a two-dimensional (2D) backbone network with shared parameters to extract features from two types of polarized images to ensure feature alignment. Subsequently, a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images. Ultimately, the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision. Quantitative and qualitative experimental results indicate that when compared with the current state-of-the-art multi-modal image classification methods, MMGC-Net demonstrates marked superiority in terms of mineral particle multi-modal feature learning and four classification evaluation metrics. It also demonstrates better stability than the existing models.
Funding: Financially supported by the POSCO-POSTECH-RIST Convergence Research Center program funded by POSCO, the National Research Foundation (NRF) grants (RS-2024-00462912, RS-2024-00416272, RS-2024-00337012, RS-2024-00408446) funded by the Ministry of Science and ICT (MSIT) of the Korean government, the Korea Evaluation Institute of Industrial Technology (KEIT) grant (No. 1415185027/20019169, Alchemist project) funded by the Ministry of Trade, Industry and Energy (MOTIE) of the Korean government, the Soseon Science fellowship funded by Community Chest of Korea, and the NRF PhD fellowship (RS-2023-00275565) funded by the Ministry of Education (MOE) of the Korean government.
Abstract: Spatial computing and augmented reality are advancing rapidly, with the goal of seamlessly blending virtual and physical worlds. However, traditional depth-sensing systems are bulky and energy-intensive, limiting their use in wearable devices. To overcome this, recent research by X. Liu et al. presents a compact binocular metalens-based depth perception system that integrates efficient edge detection through an advanced neural network. This system enables accurate, real-time depth mapping even in complex environments, enhancing potential applications in augmented reality, robotics, and autonomous systems.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62371323, 62401380, U2433217, U2333209, and U20A20161), the Natural Science Foundation of Sichuan Province, China (No. 2025ZNSFSC1476), the Sichuan Science and Technology Program, China (Nos. 2024YFG0010 and 2024ZDZX0046), the Institutional Research Fund from Sichuan University (No. 2024SCUQJTX030), and the Open Fund of Key Laboratory of Flight Techniques and Flight Safety, CAAC (No. GY2024-01A).
Abstract: With the advent of the next-generation Air Traffic Control (ATC) system, there is growing interest in using Artificial Intelligence (AI) techniques to enhance Situation Awareness (SA) for ATC Controllers (ATCOs), i.e., Intelligent SA (ISA). However, the existing AI-based SA approaches often rely on unimodal data and lack a comprehensive description and benchmark of the ISA tasks utilizing multi-modal data for real-time ATC environments. To address this gap, by analyzing the situation awareness procedure of the ATCOs, the ISA task is refined to the processing of the two primary elements, i.e., spoken instructions and flight trajectories. Subsequently, the ISA is further formulated into Controlling Intent Understanding (CIU) and Flight Trajectory Prediction (FTP) tasks. For the CIU task, an innovative automatic speech recognition and understanding framework is designed to extract the controlling intent from unstructured and continuous ATC communications. For the FTP task, the single- and multi-horizon FTP approaches are investigated to support the high-precision prediction of the situation evolution. A total of 32 unimodal/multi-modal advanced methods with extensive evaluation metrics are introduced to conduct the benchmarks on the real-world multi-modal ATC situation dataset. Experimental results demonstrate the effectiveness of AI-based techniques in enhancing ISA for the ATC environment.