Dear Editor, Sugarcane mosaic virus (SCMV) causes severe viral diseases in maize worldwide (Fuchs and Gruntzig, 1995), resulting in significant losses in grain and forage yield in susceptible cultivars of maize and related crops. The most promising solution is to cultivate resistant varieties, which contribute to sustainable crop production. Two epistatically interacting major SCMV resistance loci (Scmv1 and Scmv2) are required to confer complete resistance against SCMV in the resistant near-isogenic line F7RR/RR (the letters left of the slash refer to the genotype at Scmv2 on chromosome 3 and those on the right refer to the genotype at Scmv1 on chromosome 6, with R indicating a resistance allele and S a susceptibility allele) (Xing et al., 2006).
In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity and in reducing teacher workload compared with single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
A novel reinforced enzyme-induced carbonate precipitation (REICP) method was proposed to improve the mechanical properties of dispersive soil. Dispersive soils, which are highly susceptible to erosion caused by rainfall or seepage, pose significant environmental challenges. It is essential to focus on modifying dispersive soil using environmentally friendly methods. This study investigated the cohesion, internal friction angle, permeability, hydrostability, and microstructure of dispersive soil treated with enzyme-induced carbonate precipitation (EICP)-MgCl2-xanthan gum (REICP), using statistical analysis. A series of laboratory experiments was conducted, including direct shear tests, permeability experiments, mud ball tests, simulated rainfall tests, Fourier transform infrared spectroscopy (FTIR), X-ray diffraction (XRD), and scanning electron microscopy (SEM). The results showed that the combined treatment significantly enhanced the mechanical properties of dispersive soil. At the optimal ratio, cohesion increased by a factor of 2, and the permeability coefficient decreased by a factor of approximately 1.7×10^7. Additionally, the strength parameters gradually increased with curing time. Microstructural analyses indicated that calcite precipitation, pore filling, and ionic redistribution significantly improved the mechanical properties and hydrostability of the soil. Statistical analyses showed that EICP materials and xanthan gum increased soil cohesion, while magnesium chloride enhanced the internal friction angle and reduced porosity. This study integrates mechanical testing, statistical analysis, and microstructural evaluation to propose a sustainable and environmentally friendly method for improving dispersive soils. This approach reduces the use of chemical modifiers, minimizes environmental impacts, and demonstrates application potential in the stabilization of dispersive soils.
Unmanned Aerial Vehicles (UAVs) have become integral components in smart city infrastructures, supporting applications such as emergency response, surveillance, and data collection. However, the high mobility and dynamic topology of Flying Ad Hoc Networks (FANETs) present significant challenges for maintaining reliable, low-latency communication. Conventional geographic routing protocols often struggle in situations where link quality varies and mobility patterns are unpredictable. To overcome these limitations, this paper proposes an improved routing protocol based on reinforcement learning. This new approach integrates Q-learning with mechanisms that are both link-aware and mobility-aware. The proposed method optimizes the selection of relay nodes by using an adaptive reward function that takes into account energy consumption, delay, and link quality. Additionally, a Kalman filter is integrated to predict UAV mobility, improving the stability of communication links under dynamic network conditions. Simulation experiments were conducted using realistic scenarios, varying the number of UAVs to assess scalability. An analysis was conducted on key performance metrics, including the packet delivery ratio, end-to-end delay, and total energy consumption. The results demonstrate that the proposed approach significantly improves the packet delivery ratio by 12%–15% and reduces delay by up to 25.5% when compared to conventional GEO and QGEO protocols. However, this improvement comes at the cost of higher energy consumption due to additional computations and control overhead. Despite this trade-off, the proposed solution ensures reliable and efficient communication, making it well-suited for large-scale UAV networks operating in complex urban environments.
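The relay-selection idea in the abstract above can be illustrated with a minimal tabular Q-learning sketch. This is not the paper's protocol: the reward weights, the normalization of inputs to [0, 1], the function names, and the epsilon-greedy selection rule are all assumptions made for illustration, and the Kalman-filter mobility prediction is omitted.

```python
import random

def reward(link_quality, delay, energy, w_link=0.5, w_delay=0.3, w_energy=0.2):
    """Hypothetical weighted reward (weights are assumptions, not from the paper).

    All inputs are assumed normalized to [0, 1]; link quality is a benefit,
    delay and energy consumption are costs.
    """
    return w_link * link_quality - w_delay * delay - w_energy * energy

def q_update(q, state, action, r, next_state, next_actions, alpha=0.1, gamma=0.9):
    """Standard Q-learning update on a dict-backed table keyed by (state, action)."""
    best_next = max((q.get((next_state, a), 0.0) for a in next_actions), default=0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q[(state, action)]

def choose_relay(q, state, neighbors, epsilon=0.1, rng=random):
    """Epsilon-greedy choice of the next-hop relay among the current neighbors."""
    if rng.random() < epsilon:
        return rng.choice(neighbors)
    return max(neighbors, key=lambda n: q.get((state, n), 0.0))

if __name__ == "__main__":
    q_table = {}
    r = reward(link_quality=1.0, delay=0.0, energy=0.0)
    q_update(q_table, "uav1", "uav2", r, "uav2", ["uav3"])
    print(choose_relay(q_table, "uav1", ["uav2", "uav3"], epsilon=0.0))
```

Here the state is simply the identifier of the UAV currently holding the packet and the action is the neighbor chosen as relay; a real protocol would use a richer state.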
Biobased biodegradable plastics have gained increasing attention as sustainable alternatives to petroleum-based materials in food packaging, offering biodegradability, renewability, and reduced environmental impact. This review adopts a narrative review approach, integrating studies published between 2015 and 2025 from major databases to critically evaluate the recent advances, feasibility, and limitations of biobased biodegradable plastics in food packaging. Literature was thematically analyzed by material type and functional enhancement to assess their feasibility and limitations for sustainable packaging applications. Recent advances have focused on enhancing their mechanical, barrier, and functional properties through polymer blending, nanoparticle reinforcement, and incorporation of natural bioactive agents. Starch-based bioplastics, derived from renewable sources such as corn and cassava, have been improved by blending with polylactic acid (PLA) or polybutylene succinate (PBS) and reinforcing with nanocellulose or silica to enhance flexibility, strength, and thermal stability. Incorporating plant extracts and polyphenols has added antioxidant and antimicrobial functions. PLA-based films have benefited from nanoparticle fillers like zinc oxide and lignin nanoparticles, and the integration of bioactive compounds such as tea polyphenols and hop extract has enabled multifunctional, intelligent packaging with controlled release and UV protection. Polyhydroxyalkanoates (PHAs), produced microbially, have been functionalized with tannins, ferulic acid, and other natural agents to achieve high antioxidant, antibacterial, and UV-blocking performance, while multilayer coatings have improved moisture and gas resistance. PBS composites have been enhanced using nanofillers like silver or magnesium oxide and natural additives such as quercetin and essential oils, thereby improving durability and bioactivity. Emerging materials, including chitosan-, protein-, and polysaccharide-based films, show excellent film-forming ability and compatibility with natural antimicrobials; smart systems with pH-sensing and UV-shielding functions further extend food shelf life. Despite remaining challenges such as cost, moisture sensitivity, limited scalability, and potential competition with food resources, recent progress demonstrates that biobased biodegradable plastics hold strong potential to advance sustainable, high-performance food packaging, particularly when waste is valorized. Future research should focus on improving the cost-effectiveness, scalability, and moisture resistance of biobased biodegradable plastics, while advancing waste-derived feedstocks, multifunctional smart packaging, and comprehensive life cycle assessments to ensure sustainable and practical food packaging solutions.
Frequency hopping (FH) communication has good anti-fading, anti-jamming, and anti-eavesdropping capabilities, so it is one of the main ways to combat electronic jamming. To further improve the anti-jamming capability of FH communication, parameters such as the fixed frequency interval, hopping rate, and hopping frequency in conventional FH can be assigned time-varying characteristics. To set appropriate hopping parameters that improve system performance in an electromagnetic environment with various types of jamming, a heuristically accelerated Q-learning (HAQL) method is proposed in this paper. Firstly, a theoretical model for the parameter decision-making of the FH system is established, and the key parameters affecting the energy efficiency of the system are analyzed. Secondly, a Q-learning model for complex electromagnetic environments is proposed, which includes the setting of states, actions, and rewards, and a HAQL-based decision-making algorithm is put forward. Lastly, simulations are carried out under different jamming environments, and the results show that the average energy efficiency of the HAQL algorithm is higher than that of the SARSA algorithm, the ε-greedy QL algorithm, and the HQL-OSGM algorithm.
Muon scattering tomography (MST) is a powerful noninvasive imaging technique with significant applications in nuclear material detection and security screening. Traditional MST usually relies on the point of closest approach (PoCA) algorithm to reconstruct images from muon scattering data; however, PoCA often suffers from suboptimal image clarity and resolution. To overcome these challenges, we propose a novel approach that leverages reinforcement learning (RL) to enhance MST reconstruction, termed the μRL-enhanced method. By framing the MST optimization task as an RL problem, we developed an intelligent agent capable of dynamically adjusting the key PoCA parameters. The agent is trained using a multi-objective reward function that guides the optimization toward higher-quality reconstructions. Our experimental results show that the μRL-enhanced method significantly outperforms the traditional PoCA baseline across multiple benchmark metrics. Specifically, the proposed approach on average attains a 307% improvement in the intersection over union (IoU), a 79% increase in the structural similarity index measure (SSIM), and an 8.4% enhancement in the peak signal-to-noise ratio (PSNR) across four experiments. Furthermore, when benchmarked against the maximum likelihood scattering and displacement (MLSD) algorithm, the μRL-enhanced method offers modest gains in PSNR and IoU, together with a one-third increase in SSIM. These improvements demonstrate the enhanced reconstruction accuracy and structural fidelity of the μRL-enhanced method, highlighting its potential to advance MST technologies and their applications.
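For readers unfamiliar with the PoCA baseline mentioned in the abstract above, the geometric core can be sketched as follows. This is a generic textbook construction, not the authors' code: the incoming and outgoing muon tracks are treated as straight lines, the scattering point is taken as the midpoint of the shortest segment between them, and the function name and return convention are assumptions.

```python
import math

def poca(p_in, d_in, p_out, d_out, eps=1e-12):
    """Point of closest approach between an incoming and an outgoing track.

    Each track is a 3D line given by a point and a direction vector.
    Returns (midpoint, scattering_angle_rad), or None for parallel tracks.
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w0 = [a - b for a, b in zip(p_in, p_out)]
    a, b, c = dot(d_in, d_in), dot(d_in, d_out), dot(d_out, d_out)
    d, e = dot(d_in, w0), dot(d_out, w0)
    denom = a * c - b * b
    if abs(denom) < eps:  # parallel tracks: no unique closest point
        return None
    s = (b * e - c * d) / denom  # parameter along the incoming track
    t = (a * e - b * d) / denom  # parameter along the outgoing track
    c_in = [p + s * q for p, q in zip(p_in, d_in)]
    c_out = [p + t * q for p, q in zip(p_out, d_out)]
    midpoint = [(u + v) / 2 for u, v in zip(c_in, c_out)]
    cos_theta = max(-1.0, min(1.0, b / math.sqrt(a * c)))
    return midpoint, math.acos(cos_theta)

if __name__ == "__main__":
    # Vertical incoming muon deflected by 45 degrees at the origin.
    print(poca((0, 0, 1), (0, 0, -1), (0, 0, 0), (1, 0, -1)))
```

In a full reconstruction each muon's scattering angle would be accumulated into the voxel containing its PoCA point; which PoCA parameters the RL agent tunes is not specified in the abstract.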
Cooperative pursuit poses challenges across natural, social, and technical systems, particularly when decentralized, slow-speed pursuers attempt to capture a high-speed evader with limited observation. Most existing contributions place the focus on the greedy pursuit of the evader, overlooking potential collaborations among pursuers. To tackle this issue, a decision-making framework of multi-agent coordinated reciprocity formation pursuit (MACRFP) via deep reinforcement learning is introduced. This framework integrates the actor-critic algorithm with the coordinated reciprocity mechanism to enhance the capability of capturing a faster evader. Initially, a local perception model is created by utilizing a cellular network to simulate limitations caused by obstacles. Next, the formation coalition of pursuit is guided by the Cartesian Oval, enabling dispersed pursuers to create a siege against the faster evader. Furthermore, a coordinated reciprocity model based on the coordination graph and the attention-based graph neural networks is developed, addressing the global coordination problem by estimating a reciprocity coefficient to adjust agents' rewards. Numerical simulations demonstrate the emergence of cooperative behaviors in cooperative besiegement, target tracking, and intelligent interception during the pursuit, indicating that the proposed algorithm enhances the feasibility and effectiveness of capturing a fast-escaping target by integrating coordinated reciprocity and coalition formation.
Cellulose, the dominant natural polymer on Earth, features a distinct molecular structure with extraordinary mechanical properties and tunable characteristics, making it attractive for gel systems. Although significant progress has been made, challenges remain in fully leveraging the functional potential of cellulose gels and broadening their practical applications. This review systematically examines the properties of cellulose and cellulose gels, exploring novel reinforcement strategies (across molecular, supramolecular network, and macroscale structure levels) to enhance mechanical, electrical, and thermal performance, while coordinating these properties for practical implementations. These advancements are exemplified in emerging fields such as flexible robotics, electronic skins, flexible energy storage devices, and human-machine interaction systems. This article thoroughly investigates the fundamental characteristics, multi-scale design approaches, performance enhancement mechanisms, and cutting-edge implementations of cellulose-based gels across diverse domains. It provides a comprehensive overview of these advanced materials and offers strategic insights and recommendations for future research and innovation.
Wireless Sensor Networks (WSNs) play a crucial role in numerous Internet of Things (IoT) applications and next-generation communication systems, yet they continue to face challenges in balancing energy efficiency and reliable connectivity. This study proposes SAC-HTC (Soft Actor-Critic-based High-performance Topology Control), a deep reinforcement learning (DRL) method based on the Actor-Critic framework, implemented within a Software Defined Wireless Sensor Network (SDWSN) architecture. In this approach, sensor nodes periodically transmit state information, including coordinates, node degree, transmission power, and neighbor lists, to a centralized controller. The controller acts as the reinforcement learning (RL) agent, with the Actor generating decisions to adjust transmission ranges, while the Critic evaluates action values to reflect the overall network performance. The bidirectional Node-Controller feedback mechanism enables the controller to issue appropriate control commands to each node, ensuring the maintenance of the desired node degree, reducing energy consumption, and preserving network connectivity. The algorithm further incorporates soft entropy adjustment to balance exploration and exploitation, along with an off-policy mechanism for efficient data reuse, making it well-suited to the resource-constrained conditions of WSNs. Simulation results demonstrate that SAC-HTC not only outperforms traditional methods and several existing RL algorithms but also achieves faster convergence, optimized communication range control, global connectivity maintenance, and extended network lifetime. The key novelty of this research lies in the integration of the SAC method with the SDWSN architecture for WSN topology control, providing an adaptive, efficient, and highly promising mechanism for large-scale, dynamic, and high-performance sensor networks.
The increasing occurrence of corrosion-related damage in steel pipelines has led to the growing use of composite-based repair techniques as an efficient alternative to traditional replacement methods. Computer modeling and structural analysis were performed for the repair reinforcement of a steel pipeline with a composite bandage. A preliminary analysis of possible contact interaction schemes was implemented based on the theory of cylindrical shells, taking into account transverse shear deformations. The finite element method was used for a detailed study of the stress state of the composite bandage and the reinforced section of the pipeline. The limit state of the reinforced section was assessed based on the von Mises criterion for steel and the Tsai-Wu criterion for composites. The effectiveness of the repair was demonstrated on a pipeline whose wall thickness had decreased by 20% as a result of corrosion damage. At a nominal pressure of P = 6 MPa, the maximum normal stress in the weakened area reached 381 MPa. The installation of a composite bandage reduced this stress to 312 MPa, making the repaired section virtually as strong as the undamaged pipeline. Due to the linearity of the problem, the results obtained can be easily used to find critical internal pressure values.
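The closing remark about linearity in the abstract above can be made concrete with a short sketch. Because stress scales in proportion to internal pressure in a linear problem, a reference result (312 MPa at 6 MPa after repair) can be rescaled to any allowable stress. The 390 MPa limit used below is a purely hypothetical value chosen for illustration, and the paper itself assesses the limit state with the von Mises and Tsai-Wu criteria rather than a single normal-stress limit.

```python
def critical_pressure(p_ref_mpa, stress_ref_mpa, stress_limit_mpa):
    """Pressure at which the stress reaches the limit, assuming stress is proportional to pressure."""
    if p_ref_mpa <= 0 or stress_ref_mpa <= 0:
        raise ValueError("reference pressure and stress must be positive")
    return p_ref_mpa * stress_limit_mpa / stress_ref_mpa

def relative_reduction(before, after):
    """Fractional reduction of a quantity, e.g. stress before and after repair."""
    return (before - after) / before

if __name__ == "__main__":
    # Values from the abstract: 381 MPa (corroded) and 312 MPa (repaired) at P = 6 MPa.
    print(f"stress reduction: {relative_reduction(381.0, 312.0):.1%}")
    # Hypothetical allowable stress of 390 MPa (an assumption, not from the paper).
    print(f"critical pressure: {critical_pressure(6.0, 312.0, 390.0):.2f} MPa")
```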
This paper investigates the traffic offloading optimization challenge in Space-Air-Ground Integrated Networks (SAGIN) through a novel Recursive Multi-Agent Proximal Policy Optimization (RMAPPO) algorithm. The exponential growth of mobile devices and data traffic has substantially increased network congestion, particularly in urban areas and regions with limited terrestrial infrastructure. Our approach jointly optimizes unmanned aerial vehicle (UAV) trajectories and satellite-assisted offloading strategies to simultaneously maximize data throughput, minimize energy consumption, and maintain equitable resource distribution. The proposed RMAPPO framework incorporates recurrent neural networks (RNNs) to model temporal dependencies in UAV mobility patterns and utilizes a decentralized multi-agent reinforcement learning architecture to reduce communication overhead while improving system robustness. The proposed RMAPPO algorithm was evaluated through simulation experiments, with the results indicating that it significantly enhances the cumulative traffic offloading rate of nodes and reduces the energy consumption of UAVs.
The integration of human factors into artificial intelligence (AI) systems has emerged as a critical research frontier, particularly in reinforcement learning (RL), where human-AI interaction (HAII) presents both opportunities and challenges. As RL continues to demonstrate remarkable success in model-free and partially observable environments, its real-world deployment increasingly requires effective collaboration with human operators and stakeholders. This article systematically examines HAII techniques in RL through both theoretical analysis and practical case studies. We establish a conceptual framework built upon three fundamental pillars of effective human-AI collaboration: computational trust modeling, system usability, and decision understandability. Our comprehensive review organizes HAII methods into five key categories: (1) learning from human feedback, including various shaping approaches; (2) learning from human demonstration through inverse RL and imitation learning; (3) shared autonomy architectures for dynamic control allocation; (4) human-in-the-loop querying strategies for active learning; and (5) explainable RL techniques for interpretable policy generation. Recent state-of-the-art works are critically reviewed, with particular emphasis on advances incorporating large language models in human-AI interaction research. To illustrate some concepts, we present three detailed case studies: an empirical trust model for farmers adopting AI-driven agricultural management systems, the implementation of ethical constraints in robotic motion planning through human-guided RL, and an experimental investigation of human trust dynamics using a multi-armed bandit paradigm. These applications demonstrate how HAII principles can enhance RL systems' practical utility while bridging the gap between theoretical RL and real-world human-centered applications, ultimately contributing to more deployable and socially beneficial intelligent systems.
Ride-hailing electric vehicles are mobile resources with dispatch potential to improve resilience. However, they have not been well investigated because their charging and order-serving are affected or managed by both the power grid dispatching center and the ride-hailing platform. Effective pre-event strategies can improve the ability to prevent high-impact and low-probability (HILP) events and provide the foundation for measures in the response and restoration stages. First, this paper proposes a resilience reserve to expand the existing research on power system resilience. Second, this paper puts forward an interactive deep reinforcement learning method that considers the interests of both the power grid dispatching center and the ride-hailing platform. It improves the resilience reserve through order dispatch, orderly charging management of ride-hailing electric vehicles, and the pricing strategy of charging stations. Finally, this paper uses a practical example covering about 107.32 km² in the center of Chengdu to verify that the proposed method improves the resilience reserve of the power system without obviously damaging the interests of the ride-hailing platform.
Lithium-rich layered oxides (LRLOs) are promising cathode materials due to their high specific capacity, energy density, and operating voltage. However, their performance is hindered by the limited redox activity of transition metals, leading to oxygen redox instability, oxygen release, and capacity degradation. To address these issues, we propose an innovative lattice-oxygen modulation (LOM) strategy that incorporates Mn^(3+) and Ti^(4+) into the Li_(1.2)Cr_(0.3)Mn_(0.4)Ti_(0.1)O_(2) system, effectively mitigating Cr migration, stabilizing oxygen redox reactions, and reinforcing structural integrity. This results in improved electrochemical performance, as demonstrated by a 56.5 mAh g^(−1) increase in initial discharge capacity to 364.2 mAh g^(−1), with 71.3% capacity retention after 30 cycles, reflecting a 20.2% improvement in cycling stability. Density functional theory (DFT) calculations confirm enhanced Cr redox reversibility and reduced oxygen evolution, further strengthening structural stability. These synergistic effects highlight the pivotal role of the LOM strategy in optimizing both electrochemical performance and structural integrity, offering a scalable pathway to improve capacity and cycling stability in lithium-rich cathodes.
To address the poor mechanical performance and improve the tribological properties of self-lubricating polyphenylene sulfide/irradiation-treated polytetrafluoroethylene (PPS/i-PTFE) blends, carbon fibers with different aspect ratios (i.e., PSCF: 50, SCF: about 429) were introduced as reinforcement fillers. The results showed that hybridizing PSCF and SCF at certain mass ratios simultaneously enhanced the mechanical and tribological performance of the PPS/i-PTFE blend through the construction of a synergistic lubrication and mechanical interlocking network. Specifically, the flexural strength and modulus of PPS/i-PTFE were increased by 125.6% and 389.3%, respectively, while the friction coefficient and specific wear rate were decreased by 13.9% and 95%, respectively. Notably, the PPS composites possessed excellent integrated performance and were able to withstand sliding action under high PV (≥10 MPa·m/s) conditions, as assessed by a customized pin-on-disc tester. This work demonstrated that the formation of an intact lubricating film, combined with the enhanced thermal and mechanical properties, was favorable for improving the tribological properties of PPS-based composites, which makes them suitable for advanced engineering applications.
To address the issue of overly conservative offline reinforcement learning (RL) methods that limit the generalization of the policy in the out-of-distribution (OOD) region, this article designs a surrogate target for the OOD value function based on dataset distance and proposes a novel generalized Q-learning mechanism with distance regularization (GQDR). In theory, we not only prove the convergence of GQDR, but also ensure that the difference between the Q-value learned by GQDR and its true value is bounded. Furthermore, an offline generalized actor-critic method with distance regularization (OGACDR) is proposed by combining GQDR with the actor-critic learning framework. Two implementations of OGACDR, OGACDR-EXP and OGACDR-SQR, are introduced according to exponential (EXP) and open-square (SQR) distance weight functions, and it has been theoretically proved that OGACDR provides a safe policy improvement. Experimental results on Gym-MuJoCo continuous control tasks show that OGACDR can not only alleviate the overestimation and overconservatism of the Q-value function, but also outperform conservative offline RL baselines.
This study investigates the performance of high-strength cable bolts under impact loading conditions representative of rock bursts in underground environments. Although widely used, the dynamic behaviour of these cable bolts has received limited experimental attention, and their effectiveness in seismically active zones remains a subject of ongoing debate. To address this gap, a reverse pull-out test machine integrated with a drop hammer rig was employed. Tests were conducted on 70-t SUMO bulbed and non-bulbed cable bolts with encapsulation lengths of 300 and 450 mm, subjected to an impact energy of 14.52 kJ. Results indicate that non-bulbed cables, despite showing lower initial peak loads (average 218 vs. 328 kN for bulbed cables at 300 mm encapsulation), demonstrated superior energy absorption (average 11.26 vs. 8.75 kJ) and displacement capacity (average 48.40 vs. 36.25 mm). Increasing the encapsulation length for bulbed cables led to a reduction in initial peak load but improved displacement and energy absorption. The dominant failure mechanism was debonding at the cable-grout interface, characterised by frictional sliding and cable rotation. These findings provide new insights into the energy dissipation mechanisms of cables and support the development of more resilient ground support systems for dynamically active conditions.
To address the high costs and operational instability of distribution networks caused by the large-scale integration of distributed energy resources (DERs), such as photovoltaic (PV) systems, wind turbines (WT), and energy storage (ES) devices, and the increased grid load fluctuations and safety risks due to uncoordinated electric vehicle (EV) charging, this paper proposes a novel dual-scale hierarchical collaborative optimization strategy. This strategy decouples system-level economic dispatch from distributed EV agent control, effectively resolving the resource coordination conflicts that arise from the high computational complexity and poor scalability of existing centralized optimization, or from the reliance on local-information decision-making in fully decentralized frameworks. At the lower level, an EV charging and discharging model with a hybrid discrete-continuous action space is established and optimized using an improved Parameterized Deep Q-Network (PDQN) algorithm, which directly handles mode selection and power regulation while embedding physical constraints to ensure safety. At the upper level, microgrid (MG) operators adopt a dynamic pricing strategy optimized through Deep Reinforcement Learning (DRL) to maximize economic benefits and achieve peak-valley shaving. Simulation results show that the proposed strategy outperforms traditional methods, reducing the total operating cost of the MG by 21.6%, decreasing the peak-to-valley load difference by 33.7%, reducing the number of voltage limit violations by 88.9%, and lowering the average electricity cost for EV users by 15.2%. This method brings a win-win result for operators and users, providing a reliable and efficient scheduling solution for distribution networks with high renewable energy penetration rates.
Abstract: Dear Editor, Sugarcane mosaic virus (SCMV) causes severe viral diseases in maize worldwide (Fuchs and Gruntzig, 1995), resulting in significant losses in grain and forage yield in susceptible cultivars of maize and related crops. The most promising solution is to cultivate resistant varieties, which contribute to sustainable crop production. Two epistatically interacting major SCMV resistance loci (Scmv1 and Scmv2) are required to confer complete resistance against SCMV in the resistant near-isogenic line F7 RR/RR (the letters left of the slash refer to the genotype at Scmv2 on chromosome 3 and those on the right refer to the genotype at Scmv1 on chromosome 6, with R indicating a resistance allele and S a susceptibility allele) (Xing et al., 2006).
Abstract: In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity, and in reducing teacher workload, compared to single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 42407199), the Heilongjiang Provincial Natural Science Foundation of China (Grant No. PL2024D003), and the Fundamental Research Funds for the Central Universities (Grant No. 2572023CT17).
Abstract: A novel method, reinforced enzyme-induced carbonate precipitation (REICP), was proposed to improve the mechanical properties of dispersive soil. Dispersive soils, which are highly susceptible to erosion caused by rainfall or seepage, pose significant environmental challenges. It is essential to focus on modifying dispersive soil using environmentally friendly methods. This study investigated the cohesion, internal friction angle, permeability, hydrostability, and microstructure of dispersive soil treated with enzyme-induced carbonate precipitation (EICP)-MgCl2-xanthan gum (REICP), using statistical analysis. A series of laboratory experiments was conducted, including direct shear tests, permeability experiments, mud ball tests, simulated rainfall tests, Fourier transform infrared spectroscopy (FTIR), X-ray diffraction (XRD), and scanning electron microscopy (SEM). The results showed that the combined treatment significantly enhanced the mechanical properties of dispersive soil. At the optimal ratio, cohesion increased by a factor of 2, and the permeability coefficient decreased by approximately 1.7×10^(7) times. Additionally, the strength parameters gradually increased with curing time. Microstructural analyses indicated that calcite precipitation, pore filling, and ionic redistribution significantly improved the mechanical properties and hydrostability of the soil. Statistical analyses showed that EICP materials and xanthan gum increased soil cohesion, while magnesium chloride enhanced the internal friction angle and reduced porosity. This study integrates mechanical testing, statistical analysis, and microstructural evaluation to propose a sustainable and environmentally friendly method for improving dispersive soils. This approach reduces the use of chemical modifiers, minimizes environmental impacts, and demonstrates application potential in the stabilization of dispersive soils.
Funding: Funded by Hung Yen University of Technology and Education under grant number UTEHY.L.2025.62.
Abstract: Unmanned Aerial Vehicles (UAVs) have become integral components in smart city infrastructures, supporting applications such as emergency response, surveillance, and data collection. However, the high mobility and dynamic topology of Flying Ad Hoc Networks (FANETs) present significant challenges for maintaining reliable, low-latency communication. Conventional geographic routing protocols often struggle in situations where link quality varies and mobility patterns are unpredictable. To overcome these limitations, this paper proposes an improved routing protocol based on reinforcement learning. This new approach integrates Q-learning with mechanisms that are both link-aware and mobility-aware. The proposed method optimizes the selection of relay nodes by using an adaptive reward function that takes into account energy consumption, delay, and link quality. Additionally, a Kalman filter is integrated to predict UAV mobility, improving the stability of communication links under dynamic network conditions. Simulation experiments were conducted using realistic scenarios, varying the number of UAVs to assess scalability. Key performance metrics were analyzed, including the packet delivery ratio, end-to-end delay, and total energy consumption. The results demonstrate that the proposed approach improves the packet delivery ratio by 12%–15% and reduces delay by up to 25.5% when compared to conventional GEO and QGEO protocols. However, this improvement comes at the cost of higher energy consumption due to additional computations and control overhead. Despite this trade-off, the proposed solution ensures reliable and efficient communication, making it well-suited for large-scale UAV networks operating in complex urban environments.
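To make the relay-selection idea concrete, the following is a minimal sketch of a tabular Q-learning update driven by a reward that combines link quality, delay, and energy consumption. It is not the protocol from the paper: the weights `W_LINK`, `W_DELAY`, and `W_ENERGY`, the normalisation of inputs to [0, 1], the node names, and the helper functions are all assumptions introduced for illustration, and the Kalman-filter mobility prediction is omitted.

```python
from collections import defaultdict

# Hypothetical reward weights (assumed for illustration, not from the paper).
W_LINK, W_DELAY, W_ENERGY = 0.5, 0.3, 0.2

def reward(link_quality, delay, energy):
    """Reward rises with link quality and falls with delay and energy.
    All three inputs are assumed to be normalised to [0, 1]."""
    return W_LINK * link_quality - W_DELAY * delay - W_ENERGY * energy

def q_update(q, node, relay, r, next_relays, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update for forwarding from `node` to `relay`."""
    # Best value obtainable from the relay's own candidate next hops.
    best_next = max((q[(relay, n)] for n in next_relays), default=0.0)
    q[(node, relay)] += alpha * (r + gamma * best_next - q[(node, relay)])
    return q[(node, relay)]

def best_relay(q, node, candidates):
    """Greedy relay choice: the candidate with the highest Q-value."""
    return max(candidates, key=lambda n: q[(node, n)])

q = defaultdict(float)
# Relay B: good link, low delay, low energy. Relay C: the opposite.
q_update(q, "A", "B", reward(0.9, 0.2, 0.1), ["D"])
q_update(q, "A", "C", reward(0.4, 0.6, 0.5), ["D"])
print(best_relay(q, "A", ["B", "C"]))
```

In this toy run node A learns to prefer relay B after a single update per candidate; an adaptive scheme of the kind the abstract describes would presumably adjust the weights at run time rather than fix them.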
Abstract: Biobased biodegradable plastics have gained increasing attention as sustainable alternatives to petroleum-based materials in food packaging, offering biodegradability, renewability, and reduced environmental impact. This review adopts a narrative review approach, integrating studies published between 2015 and 2025 from major databases to critically evaluate the recent advances, feasibility, and limitations of biobased biodegradable plastics in food packaging. Literature was thematically analyzed by material type and functional enhancement to assess their feasibility and limitations for sustainable packaging applications. Recent advances have focused on enhancing their mechanical, barrier, and functional properties through polymer blending, nanoparticle reinforcement, and incorporation of natural bioactive agents. Starch-based bioplastics, derived from renewable sources such as corn and cassava, have been improved by blending with polylactic acid (PLA) or polybutylene succinate (PBS) and reinforcing with nanocellulose or silica to enhance flexibility, strength, and thermal stability. Incorporating plant extracts and polyphenols has added antioxidant and antimicrobial functions. PLA-based films have benefited from nanoparticle fillers like zinc oxide and lignin nanoparticles, and the integration of bioactive compounds such as tea polyphenols and hop extract has enabled multifunctional, intelligent packaging with controlled release and UV protection. Polyhydroxyalkanoates (PHAs), produced microbially, have been functionalized with tannins, ferulic acid, and other natural agents to achieve high antioxidant, antibacterial, and UV-blocking performance, while multilayer coatings have improved moisture and gas resistance. PBS composites have been enhanced using nanofillers like silver or magnesium oxide and natural additives such as quercetin and essential oils, thereby improving durability and bioactivity. Emerging materials, including chitosan-, protein-, and polysaccharide-based films, show excellent film-forming ability and compatibility with natural antimicrobials; smart systems with pH-sensing and UV-shielding functions further extend food shelf life. Despite remaining challenges such as cost, moisture sensitivity, limited scalability, and potential competition with food resources, recent progress demonstrates that biobased biodegradable plastics hold strong potential to advance sustainable, high-performance food packaging, particularly when waste is valorized. Future research should focus on improving the cost-effectiveness, scalability, and moisture resistance of biobased biodegradable plastics, while advancing waste-derived feedstocks, multifunctional smart packaging, and comprehensive life cycle assessments to ensure sustainable and practical food packaging solutions.
Funding: State Key Program of National Natural Science of China under grant no. U19B2016.
Abstract: Frequency hopping (FH) communication has good anti-fading, anti-jamming, and anti-eavesdropping capabilities, so it is one of the main ways to combat electronic jamming. To further improve the anti-jamming capability of FH communication, parameters such as the fixed frequency interval, hopping rate, and hopping frequency in conventional FH can be assigned time-varying characteristics. To set appropriate hopping parameters that improve system performance in an electromagnetic environment with various types of jamming, a heuristically accelerated Q-learning (HAQL) method is proposed in this paper. Firstly, a theoretical model for the parameter decision-making of the FH system is built, and the key parameters affecting the energy efficiency of the system are analyzed. Secondly, a Q-learning model for complex electromagnetic environments is proposed, which includes the setting of states, actions, and rewards, and a HAQL-based decision-making algorithm is put forward. Lastly, simulations are carried out under different jamming environments, and the results show that the average energy efficiency of the HAQL algorithm is higher than that of the SARSA algorithm, the ε-greedy QL algorithm, and the HQL-OSGM algorithm.
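As a rough illustration of how a heuristic can accelerate Q-learning, the sketch below shows one common form of heuristically accelerated action selection: the agent explores with probability epsilon and otherwise picks the action maximising Q(s, a) + ξ·H(s, a), where H is a heuristic bonus. The abstract does not state the paper's heuristic or its parameter encoding, so the function name `haql_action`, the weight `xi`, and the example Q and H values are all hypothetical.

```python
import random

def haql_action(q_row, h_row, xi=1.0, epsilon=0.1, rng=random):
    """Choose an action index for one state.

    With probability `epsilon` a random action is explored; otherwise the
    action maximising Q(s, a) + xi * H(s, a) is taken, where `h_row` holds
    the heuristic bonus for each action in this state.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    scores = [qv + xi * hv for qv, hv in zip(q_row, h_row)]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical values for three candidate hopping-parameter settings.
q_row = [0.2, 0.5, 0.4]   # learned Q-values
h_row = [0.0, 0.0, 0.3]   # heuristic favouring the third setting
print(haql_action(q_row, h_row, epsilon=0.0))  # purely greedy choice
```

With the heuristic switched off (all zeros) the rule reduces to ordinary ε-greedy selection, which is why the heuristic only biases exploration early on and does not change the Q-update itself.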
Funding: Supported by the National Natural Science Foundation of China (No. 12222502).
Abstract: Muon scattering tomography (MST) is a powerful noninvasive imaging technique with significant applications in nuclear material detection and security screening. Traditional MST usually relies on the point of closest approach (PoCA) algorithm to reconstruct images from muon scattering data; however, PoCA often suffers from suboptimal image clarity and resolution. To overcome these challenges, we propose a novel approach that leverages reinforcement learning (RL) to enhance MST reconstruction, termed the μRL-enhanced method. By framing the MST optimization task as an RL problem, we developed an intelligent agent capable of dynamically adjusting the key PoCA parameters. The agent is trained using a multi-objective reward function that guides the optimization toward higher-quality reconstructions. Our experimental results show that the μRL-enhanced method significantly outperforms the traditional PoCA baseline across multiple benchmark metrics. Specifically, the proposed approach on average attains a 307% improvement in the intersection over union (IoU), a 79% increase in the structural similarity index measure (SSIM), and an 8.4% enhancement in the peak signal-to-noise ratio (PSNR) across four experiments. Furthermore, when benchmarked against the maximum likelihood scattering and displacement (MLSD) algorithm, the μRL-enhanced method offers modest gains in PSNR and IoU, together with a one-third increase in SSIM. These improvements demonstrate the enhanced reconstruction accuracy and structural fidelity of the μRL-enhanced method, highlighting its potential to advance MST technologies and their applications.
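For readers unfamiliar with PoCA, the geometric core can be sketched as follows: the incoming and outgoing muon tracks are treated as straight lines, the scattering point is estimated as the midpoint of the shortest segment between them, and the scattering angle is the angle between their directions. This is the textbook construction only, not the paper's pipeline; the function `poca` and the example track coordinates are assumptions, and the tunable PoCA parameters the RL agent adjusts are not modelled.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poca(p_in, d_in, p_out, d_out, eps=1e-12):
    """Point of closest approach between two 3D lines.

    Each line is given by a point `p` and a direction `d`. Returns
    (midpoint, scattering_angle_rad), or None if the tracks are parallel.
    """
    w0 = tuple(a - b for a, b in zip(p_in, p_out))
    a, b, c = _dot(d_in, d_in), _dot(d_in, d_out), _dot(d_out, d_out)
    d, e = _dot(d_in, w0), _dot(d_out, w0)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None  # parallel tracks: no unique closest point
    t = (b * e - c * d) / denom   # parameter along the incoming line
    s = (a * e - b * d) / denom   # parameter along the outgoing line
    q_in = tuple(p + t * k for p, k in zip(p_in, d_in))
    q_out = tuple(p + s * k for p, k in zip(p_out, d_out))
    mid = tuple((u + v) / 2 for u, v in zip(q_in, q_out))
    cos_theta = max(-1.0, min(1.0, b / math.sqrt(a * c)))
    return mid, math.acos(cos_theta)

# Hypothetical muon: enters along -z, scatters at the origin, leaves tilted in x.
mid, theta = poca((0, 0, 10), (0, 0, -1), (1, 0, -10), (0.1, 0, -1))
print(mid, theta)
```

In a full reconstruction each muon's scattering angle would then be assigned to the voxel containing its PoCA point; how that assignment is weighted is exactly the kind of parameter choice the abstract says the agent tunes.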
Funding: Supported by the National Natural Science Foundation of China (72371052, 71871042).
Abstract: Cooperative pursuit poses challenges across natural, social, and technical systems, particularly when decentralized, slow-speed pursuers attempt to capture a high-speed evader with limited observation. Most existing contributions focus on the greedy pursuit of the evader, overlooking potential collaborations among pursuers. To tackle this issue, a decision-making framework of multi-agent coordinated reciprocity formation pursuit (MACRFP) via deep reinforcement learning is introduced. This framework integrates the actor-critic algorithm with the coordinated reciprocity mechanism to enhance the capability of capturing a faster evader. Initially, a local perception model is created by utilizing a cellular network to simulate limitations caused by obstacles. Next, the formation coalition of pursuit is guided by the Cartesian Oval, enabling dispersed pursuers to create a siege against the faster evader. Furthermore, a coordinated reciprocity model based on the coordination graph and attention-based graph neural networks is developed, addressing the global coordination problem by estimating a reciprocity coefficient to adjust agents' rewards. Numerical simulations demonstrate the emergence of cooperative behaviors in cooperative besiegement, target tracking, and intelligent interception during the pursuit, indicating that the proposed algorithm enhances the feasibility and effectiveness of capturing a fast-escaping target by integrating coordinated reciprocity and coalition formation.
Funding: The National Natural Science Foundation of China (Grant No. 32371823), the Liaoning Province Xingliao Talents Leading Talent Program (Grant No. XLYC2402043), and the Open Foundation of State Key Laboratory of Woody Oil Resources Utilization (Grant No. SKLN EFU202517).
Abstract: Cellulose, the dominant natural polymer on Earth, features a distinct molecular structure with extraordinary mechanical properties and tunable characteristics, making it attractive for gel systems. Although significant progress has been made, challenges remain in fully leveraging the functional potential of cellulose gels and broadening their practical applications. This review systematically examines the properties of cellulose and cellulose gels, exploring novel reinforcement strategies (across molecular, supramolecular network, and macroscale structure levels) to enhance mechanical, electrical, and thermal performance, while coordinating these properties for practical implementations. These advancements are exemplified in emerging fields such as flexible robotics, electronic skins, flexible energy storage devices, and human-machine interaction systems. This article thoroughly investigates the fundamental characteristics, multi-scale design approaches, performance enhancement mechanisms, and cutting-edge implementations of cellulose-based gels across diverse domains. It provides a comprehensive overview of these advanced materials and offers strategic insights and recommendations for future research and innovation.
Abstract: Wireless Sensor Networks (WSNs) play a crucial role in numerous Internet of Things (IoT) applications and next-generation communication systems, yet they continue to face challenges in balancing energy efficiency and reliable connectivity. This study proposes SAC-HTC (Soft Actor-Critic-based High-performance Topology Control), a deep reinforcement learning (DRL) method based on the Actor-Critic framework, implemented within a Software Defined Wireless Sensor Network (SDWSN) architecture. In this approach, sensor nodes periodically transmit state information, including coordinates, node degree, transmission power, and neighbor lists, to a centralized controller. The controller acts as the reinforcement learning (RL) agent, with the Actor generating decisions to adjust transmission ranges, while the Critic evaluates action values to reflect the overall network performance. The bidirectional Node-Controller feedback mechanism enables the controller to issue appropriate control commands to each node, ensuring the maintenance of the desired node degree, reducing energy consumption, and preserving network connectivity. The algorithm further incorporates soft entropy adjustment to balance exploration and exploitation, along with an off-policy mechanism for efficient data reuse, making it well-suited to the resource-constrained conditions of WSNs. Simulation results demonstrate that SAC-HTC not only outperforms traditional methods and several existing RL algorithms but also achieves faster convergence, optimized communication range control, global connectivity maintenance, and extended network lifetime. The key novelty of this research lies in the integration of the SAC method with the SDWSN architecture for WSN topology control, providing an adaptive, efficient, and highly promising mechanism for large-scale, dynamic, and high-performance sensor networks.
Abstract: The increasing occurrence of corrosion-related damage in steel pipelines has led to the growing use of composite-based repair techniques as an efficient alternative to traditional replacement methods. Computer modeling and structural analysis were performed for the repair reinforcement of a steel pipeline with a composite bandage. A preliminary analysis of possible contact interaction schemes was implemented based on the theory of cylindrical shells, taking into account transverse shear deformations. The finite element method was used for a detailed study of the stress state of the composite bandage and the reinforced section of the pipeline. The limit state of the reinforced section was assessed based on the von Mises criterion for steel and the Tsai-Wu criterion for composites. The effectiveness of the repair was demonstrated on a pipeline whose wall thickness had decreased by 20% as a result of corrosion damage. At a nominal pressure of P = 6 MPa, the maximum normal stress in the weakened area reached 381 MPa. The installation of a composite bandage reduced this stress to 312 MPa, making the repaired section virtually as strong as the undamaged pipeline. Because the problem is linear, the results can be scaled directly to find critical internal pressure values.
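The closing remark about linearity can be illustrated with a short calculation: if stress scales in proportion to internal pressure, the pressure at which a given allowable stress is reached follows from a single reference solution. The sketch below uses the two stresses quoted in the abstract at P = 6 MPa; the allowable stress of 400 MPa is a made-up placeholder, and comparing one normal stress against one limit is a simplification of the von Mises and Tsai-Wu checks the study applies.

```python
def critical_pressure(p_ref_mpa, stress_ref_mpa, stress_allow_mpa):
    """Pressure at which the stress reaches the allowable value,
    assuming stress is directly proportional to internal pressure."""
    return p_ref_mpa * stress_allow_mpa / stress_ref_mpa

P_REF = 6.0            # MPa, nominal pressure from the abstract
SIGMA_DAMAGED = 381.0  # MPa, weakened section without the bandage
SIGMA_REPAIRED = 312.0 # MPa, same section with the composite bandage
SIGMA_ALLOW = 400.0    # MPa, hypothetical allowable stress (assumption)

p_damaged = critical_pressure(P_REF, SIGMA_DAMAGED, SIGMA_ALLOW)
p_repaired = critical_pressure(P_REF, SIGMA_REPAIRED, SIGMA_ALLOW)
print(f"unrepaired: {p_damaged:.2f} MPa, repaired: {p_repaired:.2f} MPa")
```

Whatever allowable stress is chosen, the ratio of the two critical pressures is fixed at 381/312, so under this simplification the bandage raises the critical pressure by roughly 22%.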
Abstract: This paper investigates the traffic offloading optimization challenge in Space-Air-Ground Integrated Networks (SAGIN) through a novel Recursive Multi-Agent Proximal Policy Optimization (RMAPPO) algorithm. The exponential growth of mobile devices and data traffic has substantially increased network congestion, particularly in urban areas and regions with limited terrestrial infrastructure. Our approach jointly optimizes unmanned aerial vehicle (UAV) trajectories and satellite-assisted offloading strategies to simultaneously maximize data throughput, minimize energy consumption, and maintain equitable resource distribution. The proposed RMAPPO framework incorporates recurrent neural networks (RNNs) to model temporal dependencies in UAV mobility patterns and utilizes a decentralized multi-agent reinforcement learning architecture to reduce communication overhead while improving system robustness. The proposed RMAPPO algorithm was evaluated through simulation experiments, with the results indicating that it significantly enhances the cumulative traffic offloading rate of nodes and reduces the energy consumption of UAVs.
Funding: Funded by the U.S. Department of Education under Grant Number ED#P116S210005 and the National Science Foundation under Grant Numbers 2226936 and 2420405.
Abstract: The integration of human factors into artificial intelligence (AI) systems has emerged as a critical research frontier, particularly in reinforcement learning (RL), where human-AI interaction (HAII) presents both opportunities and challenges. As RL continues to demonstrate remarkable success in model-free and partially observable environments, its real-world deployment increasingly requires effective collaboration with human operators and stakeholders. This article systematically examines HAII techniques in RL through both theoretical analysis and practical case studies. We establish a conceptual framework built upon three fundamental pillars of effective human-AI collaboration: computational trust modeling, system usability, and decision understandability. Our comprehensive review organizes HAII methods into five key categories: (1) learning from human feedback, including various shaping approaches; (2) learning from human demonstration through inverse RL and imitation learning; (3) shared autonomy architectures for dynamic control allocation; (4) human-in-the-loop querying strategies for active learning; and (5) explainable RL techniques for interpretable policy generation. Recent state-of-the-art works are critically reviewed, with particular emphasis on advances incorporating large language models in human-AI interaction research. To illustrate some concepts, we present three detailed case studies: an empirical trust model for farmers adopting AI-driven agricultural management systems, the implementation of ethical constraints in robotic motion planning through human-guided RL, and an experimental investigation of human trust dynamics using a multi-armed bandit paradigm. These applications demonstrate how HAII principles can enhance RL systems' practical utility while bridging the gap between theoretical RL and real-world human-centered applications, ultimately contributing to more deployable and socially beneficial intelligent systems.
Abstract: Ride-hailing electric vehicles are mobile resources with dispatch potential to improve resilience. However, they have not been well investigated because their charging and order-serving are affected or managed by both the power grid dispatching center and the ride-hailing platform. Effective pre-strategies can improve the prevention ability for high-impact and low-probability (HILP) events and provide the foundation for measures in the response and restoration stages. First, this paper proposes a resilience reserve to expand the existing research on power system resilience. Second, this paper puts forward an interactive deep reinforcement learning method that considers the interests of both the power grid dispatching center and the ride-hailing platform. It improves the resilience reserve through order dispatch, orderly charging management of ride-hailing electric vehicles, and the pricing strategy of charging stations. Finally, this paper uses a practical example covering about 107.32 km² in the center of Chengdu to verify that the proposed method improves the resilience reserve of the power system without obviously damaging the interests of the ride-hailing platform.
Funding: Support from the National Key R&D Program of China (2022YFB3807200) and the Science and Technology Commission of Shanghai Municipality (25CL2902100).
Abstract: Lithium-rich layered oxides (LRLOs) are promising cathode materials due to their high specific capacity, energy density, and operating voltage. However, their performance is hindered by the limited redox activity of transition metals, leading to oxygen redox instability, oxygen release, and capacity degradation. To address these issues, we propose an innovative lattice-oxygen modulation (LOM) strategy that incorporates Mn^(3+) and Ti^(4+) into the Li_(1.2)Cr_(0.3)Mn_(0.4)Ti_(0.1)O_(2) system, effectively mitigating Cr migration, stabilizing oxygen redox reactions, and reinforcing structural integrity. This results in improved electrochemical performance, as demonstrated by a 56.5 mAh g^(−1) increase in initial discharge capacity to 364.2 mAh g^(−1), with 71.3% capacity retention after 30 cycles, reflecting a 20.2% improvement in cycling stability. Density functional theory (DFT) calculations confirm enhanced Cr redox reversibility and reduced oxygen evolution, further strengthening structural stability. These synergistic effects highlight the pivotal role of the LOM strategy in optimizing both electrochemical performance and structural integrity, offering a scalable pathway to improve capacity and cycling stability in lithium-rich cathodes.
Funding: Financially supported by the National Natural Science Foundation of China (No. 52103040), the China Postdoctoral Science Foundation (No. 2020M673217), and the Fundamental Research Funds for the Central Universities (No. 2023SCU12022).
Abstract: To address the poor mechanical performance and improve the tribological properties of self-lubricating polyphenylene sulfide/irradiation-treated polytetrafluoroethylene (PPS/i-PTFE) blends, carbon fibers with different aspect ratios (i.e., PSCF: 50, SCF: about 429) were introduced as reinforcement fillers. The results showed that hybridizing PSCF and SCF at certain mass ratios simultaneously enhanced the mechanical and tribological performance of the PPS/i-PTFE blend through the construction of a synergistic lubrication and mechanical interlocking network. Specifically, the flexural strength and modulus of PPS/i-PTFE were increased by 125.6% and 389.3%, and the friction coefficient and specific wear rate were decreased by 13.9% and 95%, respectively. Notably, the PPS composites possessed excellent integrated performance and were able to withstand sliding action under high PV (≥10 MPa·m/s) conditions, as assessed by a customized pin-on-disc tester. This work demonstrated that the formation of an intact lubricating film, combined with the enhanced thermal and mechanical properties, was favorable for improving the tribological properties of PPS-based composites, which makes them suitable for advanced engineering applications.
Funding: Supported by the National Natural Science Foundation of China (62373364, 62176259) and the Key Research and Development Program of Jiangsu Province (BE2022095).
Abstract: To address the issue of overly conservative offline reinforcement learning (RL) methods that limit the generalization of the policy in the out-of-distribution (OOD) region, this article designs a surrogate target for the OOD value function based on dataset distance and proposes a novel generalized Q-learning mechanism with distance regularization (GQDR). In theory, we not only prove the convergence of GQDR, but also ensure that the difference between the Q-value learned by GQDR and its true value is bounded. Furthermore, an offline generalized actor-critic method with distance regularization (OGACDR) is proposed by combining GQDR with the actor-critic learning framework. Two implementations of OGACDR, OGACDR-EXP and OGACDR-SQR, are introduced according to exponential (EXP) and open-square (SQR) distance weight functions, and it is theoretically proved that OGACDR provides a safe policy improvement. Experimental results on Gym-MuJoCo continuous control tasks show that OGACDR can not only alleviate the overestimation and overconservatism of the Q-value function, but also outperform conservative offline RL baselines.
Abstract: This study investigates the performance of high-strength cable bolts under impact loading conditions representative of rock bursts in underground environments. Although widely used, the dynamic behaviour of these cable bolts has received limited experimental attention, and their effectiveness in seismically active zones remains a subject of ongoing debate. To address this gap, a reverse pull-out test machine integrated with a drop hammer rig was employed. Tests were conducted on 70-t SUMO bulbed and non-bulbed cable bolts with encapsulation lengths of 300 and 450 mm, subjected to an impact energy of 14.52 kJ. Results indicate that non-bulbed cables, despite showing lower initial peak loads (average 218 vs. 328 kN for bulbed cables at 300 mm encapsulation), demonstrated superior energy absorption (average 11.26 vs. 8.75 kJ) and displacement capacity (average 48.40 vs. 36.25 mm). Increasing the encapsulation length for bulbed cables led to a reduction in initial peak load but improved displacement and energy absorption. The dominant failure mechanism was debonding at the cable-grout interface, characterised by frictional sliding and cable rotation. These findings provide new insights into the energy dissipation mechanisms of cables and support the development of more resilient ground support systems for dynamically active conditions.
Funding: Supported in part by the Research on Key Technologies for the Development of an Active Balancing Cooperative Control System for Distribution Networks and the National Natural Science Foundation of China under Grant 521532240029, Grant 62303006.
Abstract: To address the high costs and operational instability of distribution networks caused by the large-scale integration of distributed energy resources (DERs), such as photovoltaic (PV) systems, wind turbines (WT), and energy storage (ES) devices, and the increased grid load fluctuations and safety risks due to uncoordinated electric vehicle (EV) charging, this paper proposes a novel dual-scale hierarchical collaborative optimization strategy. This strategy decouples system-level economic dispatch from distributed EV agent control, effectively resolving the resource coordination conflicts that arise from the high computational complexity and poor scalability of existing centralized optimization, or from the reliance on local-information decision-making in fully decentralized frameworks. At the lower level, an EV charging and discharging model with a hybrid discrete-continuous action space is established and optimized using an improved Parameterized Deep Q-Network (PDQN) algorithm, which directly handles mode selection and power regulation while embedding physical constraints to ensure safety. At the upper level, microgrid (MG) operators adopt a dynamic pricing strategy optimized through Deep Reinforcement Learning (DRL) to maximize economic benefits and achieve peak-valley shaving. Simulation results show that the proposed strategy outperforms traditional methods, reducing the total operating cost of the MG by 21.6%, decreasing the peak-to-valley load difference by 33.7%, reducing the number of voltage limit violations by 88.9%, and lowering the average electricity cost for EV users by 15.2%. This method brings a win-win result for operators and users, providing a reliable and efficient scheduling solution for distribution networks with high renewable energy penetration rates.