To study the incentive mechanisms of cooperation, we propose a preference rewarding mechanism in the spatial prisoner’s dilemma game, which simultaneously considers reputational preference, other-regarding preference and the dynamic adjustment of vertex weight. The vertex weight of a player is adaptively adjusted according to a comparison of his own reputation with the average reputation of his immediate neighbors. Players are inclined to pay a personal cost to reward the cooperative neighbor with the greatest vertex weight. The vertex weight of a player is proportional to the preference rewards he can obtain from direct neighbors. We find that the preference rewarding mechanism significantly facilitates the evolution of cooperation, and the dynamic adjustment of vertex weight has a powerful effect on the emergence of cooperative behavior. To validate these effects, the strategy distribution and the average payoff and fitness of players are examined from a microscopic view.
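The adaptive vertex-weight rule described above (raise a player's weight when his reputation exceeds his neighbors' average, lower it otherwise) can be sketched as follows. This is an illustrative sketch, not the authors' code: the update step `delta` and the weight bounds `w_min`/`w_max` are assumptions.

```python
def update_vertex_weight(weight, reputation, neighbor_reputations,
                         delta=0.1, w_min=0.5, w_max=1.5):
    """Hypothetical update rule: raise the vertex weight when a player's
    reputation exceeds the average reputation of his immediate neighbors,
    and lower it otherwise. delta and the bounds are assumed parameters."""
    avg = sum(neighbor_reputations) / len(neighbor_reputations)
    if reputation > avg:
        weight += delta
    elif reputation < avg:
        weight -= delta
    # keep the weight within assumed bounds
    return min(max(weight, w_min), w_max)
```

Under this rule, a player whose reputation beats the local average accumulates weight and therefore, per the abstract, attracts more preference rewards from neighbors.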
This work aims to identify a method for the coordinator of the OU (operational unit) to train and gratify personnel through the use of a rewarding system. The continuous transformations of the Italian healthcare scene force operators to face ever-new needs and problems. Professionals cannot be considered merely as workers but as bearers of qualified intellectual, professional and cultural skills. Individual coordinators are required to be real leaders within their operational units and to use their managerial skills in achieving company objectives and in evaluating the personnel they manage. The main factor to which difficulties in staff management are related is motivation, defined as a state of mind, together with aspirations, needs and orientations, that pushes people to act with commitment, perseverance and determination. The need to better rationalize the available resources and to promote high-quality health care, improving safety, efficiency and appropriateness, has led the general management and the coordinator of the OU to use reward systems. With the introduction of this procedure, aimed at enhancing merit and encouraging virtuous behavior during the provision of health services, the public employment reform participates in the evolution of the regulatory framework and drives the change taking place in the world of work.
Network marketing is a trading technique that provides companies with the opportunity to increase sales. With the increasing number of Internet-based purchases, several threats are increasingly observed in this field, such as user privacy violations, company owner (CO) fraud, the changing of sold products’ information, and the scalability of selling networks. This study presents the concept of a blockchain-based market called ACR-MLM that functions based on the multi-level marketing (MLM) model, through which registered users receive anonymous and confidential rewards for their own and their subgroups’ sales. Applying a public blockchain as the ACR-MLM framework’s infrastructure solves existing problems in MLM-based markets, such as CO fraud (against the government or its users), user privacy violations (obtaining their real names or subgroup users), and scalability (when vast numbers of users have been registered). To provide confidentiality and scalability, hierarchical identity-based encryption (HIBE) was applied together with a functional encryption (FE) scheme. Finally, the security of ACR-MLM is analyzed using the random oracle (RO) model and then evaluated.
The CAS Institute of Modern Physics is a center of pure basic research concerning nuclear physics, accelerator physics and related technology. In recent years, it succeeded in constructing China’s first production line for manufacturing radiation-crosslinked (RC) wire and cable with the aid of international cooperation, achieving rewarding benefits from it.
BACKGROUND: Anhedonia, a hallmark symptom of major depressive disorder (MDD), is often resistant to common antidepressants. Preliminary evidence indicates that Pediococcus acidilactici (P. acidilactici) CCFM6432 may offer potential benefits in ameliorating this symptomatology in patients with MDD. AIM: To further assess the efficacy of P. acidilactici CCFM6432 in alleviating anhedonia in patients with MDD, using a combination of objective and subjective assessment tools. METHODS: Adult patients with MDD exhibiting anhedonic symptoms were enrolled and randomly assigned to two treatment groups: one receiving standard antidepressant therapy plus P. acidilactici CCFM6432, and the other receiving standard antidepressant treatment along with a placebo, for 30 days. Assessments were conducted at baseline and post-intervention using the Hamilton Depression Rating Scale (HAMD), the Temporal Experience of Pleasure Scale (TEPS), and synchronous electroencephalography (EEG) during a "Doors Guessing Task." Changes in both clinical outcomes and EEG biomarkers, specifically the stimulus-preceding negativity (SPN) and feedback-related negativity amplitudes, were analyzed. RESULTS: Of the 92 screened participants, 71 were enrolled and 55 completed the study (CCFM6432 group: n = 27; placebo group: n = 28). No baseline differences were noted between the groups in terms of demographics, clinical assessments, or EEG metrics. A mixed-design analysis of variance revealed that the CCFM6432 group showed significantly greater improvements in both HAMD and TEPS scores compared to the placebo group. Moreover, the CCFM6432 group demonstrated a significant increase in SPN amplitudes, which was inversely correlated with the improvements observed in HAMD scores. No such changes were observed in the placebo group. CONCLUSION: Adjunctive administration of P. acidilactici CCFM6432 not only augments the therapeutic efficacy of antidepressants but also significantly ameliorates the symptoms of anhedonia in MDD.
Robot navigation in complex crowd service scenarios, such as medical logistics and commercial guidance, requires a dynamic balance between safety and efficiency, while the traditional fixed reward mechanism lacks environmental adaptability and struggles to cope with the variability of crowd density and pedestrian motion patterns. This paper proposes a navigation method that integrates spatiotemporal risk field modeling and adaptive reward optimization, aiming to improve the robot’s decision-making ability in diverse crowd scenarios through dynamic risk assessment and nonlinear weight adjustment. We construct a spatiotemporal risk field model based on a Gaussian kernel function by combining crowd density, relative distance, and motion speed to quantify environmental complexity and realize crowd-density-sensitive risk assessment dynamically. We apply an exponential decay function to the reward design to address the linear conflict problem of fixed weights in multi-objective optimization. We adaptively adjust the weight allocation between safety constraints and navigation efficiency based on real-time risk values, prioritizing safety in highly dense areas and navigation efficiency in sparse areas. Experimental results show that our method improves the navigation success rate by 9.0% over state-of-the-art models in high-density scenarios, with a 10.7% reduction in the intrusion time ratio. Simulation comparisons validate the risk field model’s ability to capture risk superposition effects in dense scenarios and the suppression of near-field dangerous behaviors by the exponential decay mechanism. Our parametric optimization paradigm establishes an explicit mapping between navigation objectives and risk parameters through rigorous mathematical formalization, providing an interpretable approach for the safe deployment of service robots in dynamic environments.
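The exponential-decay weight adjustment described above can be sketched in a few lines. This is a hypothetical illustration consistent with the abstract, not the paper's actual formulation: the decay rate `k`, the weights summing to one, and the linear blending of the two reward terms are all assumptions.

```python
import math

def reward_weights(risk, k=2.0):
    """Map a normalized risk value in [0, 1] to (safety, efficiency)
    weights: the efficiency weight decays exponentially with risk,
    so safety dominates in dense (high-risk) areas. k is assumed."""
    w_eff = math.exp(-k * risk)   # efficiency dominates when risk is low
    w_safe = 1.0 - w_eff          # safety dominates when risk is high
    return w_safe, w_eff

def blended_reward(risk, r_safety, r_efficiency):
    """Combine safety and efficiency reward terms with risk-adaptive weights."""
    w_safe, w_eff = reward_weights(risk)
    return w_safe * r_safety + w_eff * r_efficiency
```

At zero risk the efficiency term receives full weight; as the real-time risk value grows, the weight shifts smoothly toward the safety term, avoiding the hard linear trade-off of fixed weights.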
For better flexibility and greater coverage, Unmanned Aerial Vehicles (UAVs) have been applied in Flying Mobile Edge Computing (F-MEC) systems to offer offloading services for User Equipment (UEs). This paper considers a disaster-affected scenario where UAVs take the role of MEC servers to provide computing resources for Disaster Relief Devices (DRDs). Considering the fairness of DRDs, a max-min problem is formulated to optimize the saved time by jointly designing the trajectories of the UAVs, the offloading policy and the serving time under the constraint of the UAVs' energy capacity. To solve this non-convex problem, we first model the service process as a Markov Decision Process (MDP) with the Reward Shaping (RS) technique, and then propose a Deep Reinforcement Learning (DRL) based algorithm to find the optimal solution of the MDP. Simulations show that the proposed RS-DRL algorithm is valid and effective, and outperforms the baseline algorithms.
Early life stress correlates with a higher prevalence of neurological disorders, including autism, attention-deficit/hyperactivity disorder, schizophrenia, depression, and Parkinson's disease. These conditions, primarily involving abnormal development and damage of the dopaminergic system, pose significant public health challenges. Microglia, the primary immune cells in the brain, are crucial in regulating neuronal circuit development and survival. From the embryonic stage to adulthood, microglia exhibit stage-specific gene expression profiles, transcriptome characteristics, and functional phenotypes, which enhance their susceptibility to early life stress. However, the role of microglia in mediating dopaminergic system disorders under early life stress remains poorly understood. This review presents an up-to-date overview of preclinical studies elucidating the impact of early life stress on microglia, leading to dopaminergic system disorders, along with the underlying mechanisms and therapeutic potential for neurodegenerative and neurodevelopmental conditions. Impaired microglial activity damages dopaminergic neurons by diminishing neurotrophic support (e.g., insulin-like growth factor-1) and hinders dopaminergic axon growth through defective phagocytosis and synaptic pruning. Blunted microglial immunoreactivity suppresses striatal dopaminergic circuit development and reduces neuronal transmission. Furthermore, inflammation and oxidative stress induced by activated microglia can directly damage dopaminergic neurons, inhibiting dopamine synthesis, reuptake, and receptor activity, while enhanced microglial phagocytosis inhibits dopamine axon extension. These long-lasting effects of microglial perturbations may be driven by early life stress–induced epigenetic reprogramming of microglia. Indirectly, early life stress may influence microglial function through various pathways, such as astrocytic activation, the hypothalamic–pituitary–adrenal axis, the gut–brain axis, and maternal immune signaling. Finally, various therapeutic strategies and molecular mechanisms for targeting microglia to restore the dopaminergic system are summarized and discussed. These strategies include classical antidepressants and antipsychotics, antibiotics and anti-inflammatory agents, and herbal-derived medicines. Further investigations combining pharmacological interventions and genetic strategies are essential to elucidate the causal role of microglial phenotypic and functional perturbations in the dopaminergic system disrupted by early life stress.
This paper investigates impulsive orbital attack-defense (AD) games under multiple constraints and victory conditions, involving three spacecraft: attacker, target, and defender. In the AD scenario, the attacker aims to breach the defender's interception to rendezvous with the target, while the defender seeks to protect the target by blocking or actively pursuing the attacker. Four different maneuvering constraints and five potential game outcomes are incorporated to model AD game problems more accurately and increase complexity, thereby reducing the effectiveness of traditional methods such as differential games and game-tree searches. To address these challenges, this study proposes a multi-agent deep reinforcement learning solution with variable reward functions. Two attack strategies, Direct attack (DA) and Bypass attack (BA), are developed for the attacker, each focusing on different mission priorities. Similarly, two defense strategies, Direct interdiction (DI) and Collinear interdiction (CI), are designed for the defender, each optimizing specific defensive actions through tailored reward functions. Each reward function incorporates both process rewards (e.g., distance and angle) and outcome rewards, derived from physical principles and validated via geometric analysis. Extensive simulations of the four strategy confrontations demonstrate average defensive success rates of 75% for DI vs. DA, 40% for DI vs. BA, 80% for CI vs. DA, and 70% for CI vs. BA. The results indicate that CI outperforms DI for defenders, while BA outperforms DA for attackers. Moreover, defenders achieve their objectives more effectively under identical maneuvering capabilities. Trajectory evolution analyses further illustrate the effectiveness of the proposed variable-reward-function-driven strategies. These strategies and analyses offer valuable guidance for practical orbital defense scenarios and lay a foundation for future multi-agent game research.
Combined inoculation with dark septate endophytes (DSEs) and arbuscular mycorrhizal fungi (AMF) has been shown to promote plant growth, yet the underlying plant-fungus interaction mechanisms remain unclear. To elucidate the nature of this symbiosis, it is crucial to explore carbon (C) transport from plants to fungi and nutrient exchange between them. In this study, a pot experiment was conducted with two phosphorus (P) fertilization levels (low and normal) and four fungal inoculation treatments (no inoculation, single inoculation with AMF or DSE, and co-inoculation with AMF and DSE). The ^(13)C isotope pulse labeling method was employed to quantify the photosynthetic C transferred from plants to the different fungi, shedding light on the mechanisms of nutrient exchange between plants and fungi. Soil and mycelium δ^(13)C, the soil C/N ratio, and the soil C/P ratio were higher at the low P level than at the normal P level. However, the soil microbial biomass C/P ratio was lower at the low P level, suggesting that the low P level was beneficial to soil C fixation and soil fungal P mineralization and transport. At the low P level, the P reward to plants from AMF and DSE increased significantly when the plants transferred the same amount of C to the fungi, and the two fungi synergistically promoted plant nutrient uptake and growth. At the normal P level, the root P content was significantly higher in the AMF-inoculated plants than in the DSE-inoculated plants, indicating that AMF contributed more than DSE to plant P uptake for the same amount of C received. Moreover, plants preferentially allocated more C to AMF. These findings indicate the presence of a source-sink balance between plant C allocation and fungal P contribution. Overall, AMF and DSE conferred a higher reward to plants at the low P level through functionally synergistic strategies.
The fiscal stimulus package continues to play the leading role in China’s economic prosperity. After several nervous months, China is finally breathing a sigh of relief as the powerful stimulus shifts the nation’s growth engine out of
The Ventral Tegmental Area (VTA) is a midbrain structure known to integrate aversive and rewarding stimuli, but little is known about the role of VTA glutamatergic (VGluT2) neurons in these functions. Direct activation of VGluT2 somata evokes rewarding behaviors, while activation of their downstream projections evokes aversive behaviors. To facilitate our understanding of these conflicting properties, we recorded calcium signals from VTA VGluT2+ neurons using fiber photometry in VGluT2-cre mice to investigate how this population is recruited by aversive and rewarding stimulation, in both unconditioned and conditioned protocols. Our results revealed that, as a population, VTA VGluT2+ neurons responded similarly to unconditioned aversive and unconditioned rewarding stimulation. During aversive and rewarding conditioning, the CS-evoked responses gradually increased across trials whilst the US-evoked response remained stable. Retrieval 24 h after conditioning, during which mice received only CS presentation, resulted in VTA VGluT2+ neurons strongly responding to the CS presentation and to the expected US, but only for aversive conditioning. To help understand these differences in terms of VTA VGluT2+ neuronal networks, the inputs and outputs of VTA VGluT2+ neurons were investigated using Cholera Toxin B (CTB) and rabies virus. Based on our results, we propose that the divergent VTA VGluT2+ neuronal responses to aversion and reward conditioning may be partly due to the existence of VTA VGluT2+ subpopulations characterized by their connectivity.
Autonomous unmanned aerial vehicle (UAV) manipulation is necessary for the defense department to execute tactical missions given by commanders on the future unmanned battlefield. A large amount of research has been devoted to improving the autonomous decision-making ability of UAVs in interactive environments, where finding the optimal maneuvering decision-making policy has become one of the key issues for enabling UAV intelligence. In this paper, we propose a maneuvering decision-making algorithm for autonomous air delivery based on deep reinforcement learning under the guidance of expert experience. Specifically, we refine the guidance-towards-area and guidance-towards-specific-point tasks of the air-delivery process based on traditional air-to-surface fire control methods. Moreover, we construct the UAV maneuvering decision-making model based on Markov decision processes (MDPs) and present a reward shaping method for both tasks using a potential-based function and expert-guided advice. The proposed algorithm accelerates the convergence of the maneuvering decision-making policy and increases the stability of the policy output during the later stage of training. The effectiveness of the proposed policy is illustrated by the training curves and extensive experimental results for the trained policy.
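Potential-based reward shaping, referenced above, adds a term F = γ·φ(s') − φ(s) to the environment reward, which is known to preserve the optimal policy. A minimal sketch follows; the potential function used here (negative distance to a target point on a 1-D state) is an illustrative assumption, not the paper's actual guidance potential.

```python
def shaped_reward(r, state, next_state, target, gamma=0.99):
    """Augment the environment reward r with the potential-based shaping
    term F = gamma * phi(next_state) - phi(state). The potential phi is
    an assumed example: closer to the target means higher potential."""
    def phi(s):
        return -abs(s - target)  # negative distance to the target
    return r + gamma * phi(next_state) - phi(state)
```

With this shaping, a transition that moves toward the target yields a positive bonus even when the environment reward is zero, which is what accelerates early convergence under sparse rewards.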
In public goods games, punishments and rewards have been shown to be effective mechanisms for maintaining individual cooperation. However, punishments and rewards are costly ways to incentivize cooperation, so the generation of costly penalties and rewards has been a complex problem in promoting the development of cooperation. In real society, specialized institutions exist that punish wrongdoers or reward good people, funded by collecting taxes. Motivated by this phenomenon, we propose a strong altruistic punishment or reward strategy in the public goods game. Through theoretical analysis and numerical calculation, we find that tax-based strong altruistic punishment (reward) has more evolutionary advantages than traditional strong altruistic punishment (reward) in maintaining cooperation, and that tax-based strong altruistic reward leads to a higher level of cooperation than tax-based strong altruistic punishment.
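A toy payoff model can make the tax-funded reward idea concrete. This sketch loosely follows the abstract's setup; the tax rate, the enhancement factor, and the equal redistribution of the tax pool to cooperators are illustrative assumptions, not the paper's actual model.

```python
def payoffs(strategies, r=3.0, c=1.0, tax=0.1):
    """Public goods game with a tax-funded reward pool (toy sketch).
    strategies: list of booleans, True = cooperate.
    Each cooperator contributes c; the pot is multiplied by r and shared
    equally; an institution taxes every payoff and pays the collected
    pool out as an equal reward to each cooperator."""
    n = len(strategies)
    n_coop = sum(strategies)
    pot_share = r * c * n_coop / n
    base = [pot_share - (c if s else 0.0) for s in strategies]
    pool = tax * sum(base)                    # tax collected by the institution
    bonus = pool / n_coop if n_coop else 0.0  # reward paid to each cooperator
    return [(1 - tax) * p + (bonus if s else 0.0)
            for p, s in zip(base, strategies)]
```

Even in this toy version a lone defector still out-earns cooperators (the classic dilemma), but the tax-funded bonus narrows the gap without any individual having to pay the punishment/reward cost directly.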
Cross-lingual image description, the task of generating image captions in a target language from images and descriptions in a source language, is addressed in this study through a novel approach that combines neural network models and semantic matching techniques. Experiments conducted on the Flickr8k and AraImg2k benchmark datasets, featuring images and descriptions in English and Arabic, showcase remarkable performance improvements over state-of-the-art methods. Our model, equipped with the Image & Cross-Language Semantic Matching module and the Target Language Domain Evaluation module, significantly enhances the semantic relevance of generated image descriptions. For English-to-Arabic and Arabic-to-English cross-language image description, our approach achieves CIDEr scores of 87.9% for English and 81.7% for Arabic, respectively, emphasizing the substantial contributions of our methodology. Comparative analyses with previous works further affirm the superior performance of our approach, and visual results underscore that our model generates image captions that are both semantically accurate and stylistically consistent with the target language. In summary, this study advances the field of cross-lingual image description, offering an effective solution for generating image captions across languages, with the potential to impact multilingual communication and accessibility. Future research directions include expanding to more languages and incorporating diverse visual and textual data sources.
Multi-agent reinforcement learning has recently been applied to solve pursuit problems. However, it suffers from a large number of time steps per training episode and thus always struggles to converge effectively, resulting in low rewards and an inability of agents to learn strategies. This paper proposes a deep reinforcement learning (DRL) training method that employs an ensemble segmented multi-reward function design to address this convergence problem. The ensemble reward function combines the advantages of two reward functions, which enhances the training effect of agents in long episodes. We then eliminate the non-monotonic behavior in the reward function introduced by the trigonometric functions in the traditional 2D polar coordinate observation representation. Experimental results demonstrate that this method outperforms the traditional single-reward-function mechanism in the pursuit scenario by enhancing agents' policy scores on the task. These ideas offer a solution to the convergence challenges faced by DRL models in long-episode pursuit problems, leading to improved model training performance.
By integrating deep neural networks with reinforcement learning, the Double Deep Q Network (DDQN) algorithm overcomes the limitations of Q-learning in handling continuous spaces and is widely applied in the path planning of mobile robots. However, the traditional DDQN algorithm suffers from sparse rewards and inefficient utilization of high-quality data. To address these problems, an improved DDQN algorithm based on average Q-value estimation and reward redistribution is proposed. First, to enhance the precision of the target Q-value, the average of multiple previously learned Q-values from the target Q network is used to replace the single Q-value from the current target Q network. Next, a reward redistribution mechanism is designed to overcome the sparse reward problem by adjusting the final reward of each action using the round reward from trajectory information. Additionally, a reward-prioritized experience selection method is introduced, which ranks experience samples according to reward values to ensure frequent utilization of high-quality data. Finally, simulation experiments are conducted to verify the effectiveness of the proposed algorithm in a fixed-position scenario and in random environments. The experimental results show that, compared to the traditional DDQN algorithm, the proposed algorithm achieves shorter average running time, higher average return and fewer average steps, with performance improved by 11.43% in the fixed scenario and 8.33% in random environments. It not only plans economical and safe paths but also significantly improves efficiency and generalization in path planning, making it suitable for widespread application in autonomous navigation and industrial automation.
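The average Q-value estimation step above (replacing the single target-network Q-value with an average of several previous estimates) can be sketched independently of any particular network. This is a schematic sketch under assumptions: the window size `k` is a guess, and the neural networks themselves are abstracted into scalar Q estimates.

```python
from collections import deque

class AveragedTargetQ:
    """Sketch of the averaged target-Q idea: keep the last k target-network
    Q estimates and use their mean in the Bellman target, smoothing out
    single-network estimation noise. k is an assumed hyperparameter."""

    def __init__(self, k=5):
        self.history = deque(maxlen=k)  # last k target-Q estimates

    def target(self, q_estimate, reward, gamma=0.99):
        self.history.append(q_estimate)
        avg_q = sum(self.history) / len(self.history)
        return reward + gamma * avg_q   # Bellman target with averaged Q
```

Averaging over a short window reduces the variance of the target, which is the stated motivation for preferring it over the single current target-network estimate.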
Due to the long-horizon issue, a substantial number of visits to the state space is required during the exploration phase of reinforcement learning (RL) to gather valuable information. Additionally, due to the challenge posed by sparse rewards, the planning phase of reinforcement learning consumes a considerable amount of time on repetitive and unproductive tasks before adequately accessing sparse reward signals. To address these challenges, this work proposes a space partitioning and reverse merging (SPaRM) framework based on reward-free exploration (RFE). The framework consists of two modules: space partitioning and reverse merging. The former partitions the entire state space into a specific number of subspaces to expedite the exploration phase; this work establishes its theoretical lower bound on sample complexity. The latter starts planning in reverse from near the target and gradually extends to the starting state, as opposed to the conventional practice of starting at the beginning. This facilitates the early involvement of the sparse reward at the target in the policy update process. This work designs two experimental environments: a complex maze and a set of randomly generated maps. Compared with two state-of-the-art (SOTA) algorithms, experimental results validate the effectiveness and superior performance of the proposed algorithm.
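The reverse-planning idea above (propagate from the target backward so the sparse goal signal reaches earlier states immediately) can be illustrated with a backward sweep on a grid. This is a hypothetical illustration only: a BFS from the goal over a 4-connected grid, standing in for the paper's reverse merging module, with the grid encoding as an assumption.

```python
from collections import deque

def distances_from_goal(grid, goal):
    """Backward sweep sketch: BFS outward from the goal over a grid of
    0 (free) and 1 (wall), returning a dict state -> steps-to-goal.
    Values near the start are filled last, mirroring reverse planning."""
    dist = {goal: 0}
    frontier = deque([goal])
    while frontier:
        x, y = frontier.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < len(grid) and 0 <= ny < len(grid[0])
                    and grid[nx][ny] == 0 and (nx, ny) not in dist):
                dist[(nx, ny)] = dist[(x, y)] + 1
                frontier.append((nx, ny))
    return dist
```

Because every expanded state already carries an exact distance-to-goal, a policy update can use this signal immediately instead of waiting for forward rollouts to stumble onto the sparse reward.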
Funding: the National Natural Science Foundation of China (Grant No. 62062049), the Social Science Project of the Ministry of Education of China (Grant No. 20YJCZH212), and the Natural Science Foundation of Gansu Province, China (Grant No. 20JR5RA390).
文摘To study the incentive mechanisms of cooperation, we propose a preference rewarding mechanism in the spatial prisoner’s dilemma game, which simultaneously considers reputational preference, other-regarding preference and the dynamic adjustment of vertex weight. The vertex weight of a player is adaptively adjusted according to the comparison result of his own reputation and the average reputation value of his immediate neighbors. Players are inclined to pay a personal cost to reward the cooperative neighbor with the greatest vertex weight. The vertex weight of a player is proportional to the preference rewards he can obtain from direct neighbors. We find that the preference rewarding mechanism significantly facilitates the evolution of cooperation, and the dynamic adjustment of vertex weight has powerful effect on the emergence of cooperative behavior. To validate multiple effects, strategy distribution and the average payoff and fitness of players are discussed in a microcosmic view.
文摘This work aims to identify a method by the coordinator of the OU(operational unit)for the training of gratified personnel through the use of a rewarding system.The continuous transformations that concern the Italian healthcare scene lead the operators to face always new needs and problems.Professionals can not only be considered as workers but bearers of qualified intellectual,professional and cultural skills.Individual coordinators are required to be real leaders within their operational units and to use their managerial skills in achieving company objectives and in evaluating the personnel they manage.The main factor to which difficulties in the management of staff are related concerns the motivation,defined as a state of mind together with aspirations,needs,orientations,that pushes people to act and to use a behavior characterized by commitment,perseverance and determination.The need to better rationalize the resources available,to promote high quality health care,improving safety,efficiency and appropriateness has led the general management and coordinator of the OU to use the reward systems.With the introduction of this procedure aimed at enhancing the merit and encouraging virtuous behavior during the provision of health services,the public employment reform participates in the evolution of the regulatory framework and it turns on the change that is taking place in the world of work.
文摘Network marketing is a trading technique that provides companies with the opportunity to increase sales.With the increasing number of Internet-based purchases,several threats are increasingly observed in this field,such as user privacy violations,company owner(CO)fraud,the changing of sold products’information,and the scalability of selling networks.This study presents the concept of a blockchain-based market called ACR-MLM that functions based on the multi-level marketing(MLM)model,through which registered users receive anonymous and confidential rewards for their own and their subgroups’sales.Applying a public blockchain as the ACR-MLM framework’s infrastructure solves existing problems in MLM-based markets,such as CO fraud(against the government or its users),user privacy violations(obtaining their real names or subgroup users),and scalability(when vast numbers of users have been registered).To provide confidentiality and scalability to the ACR-MLM framework,hierarchical identity-based encryption(HIBE)was applied with a functional encryption(FE)scheme.Finally,the security of ACR-MLM is analyzed using the random oracle(RO)model and then evaluated.
Abstract: The CAS Institute of Modern Physics is a center of pure basic research concerning nuclear physics, accelerator physics and related technology. In recent years, it succeeded in constructing China's first production line for manufacturing radiation-crosslinked (RC) wire and cable with the aid of international cooperation, achieving rewarding benefits from it.
Abstract: Ⅰ. THE SUGGESTION OF THE STRATEGIC MEASURE. Situated at the junction between the vast Eurasian landmass and the South Asian subcontinent, Yunnan Prov-
Fund: Supported by the Top Talent Support Program for Young and Middle-aged People of Wuxi Health Committee, No. BJ2023086, and the Wuxi Taihu Talent Project, No. WXTTP 2021.
Abstract: BACKGROUND: Anhedonia, a hallmark symptom of major depressive disorder (MDD), is often resistant to common antidepressants. Preliminary evidence indicates that Pediococcus acidilactici (P. acidilactici) CCFM6432 may offer potential benefits in ameliorating this symptomatology in patients with MDD. AIM: To further assess the efficacy of P. acidilactici CCFM6432 in alleviating anhedonia in patients with MDD, using a combination of objective and subjective assessment tools. METHODS: Adult patients with MDD exhibiting anhedonic symptoms were enrolled and randomly assigned to two treatment groups: one receiving standard antidepressant therapy plus P. acidilactici CCFM6432, and the other receiving standard antidepressant treatment along with a placebo, for 30 days. Assessments were conducted at baseline and post-intervention using the Hamilton Depression Rating Scale (HAMD), the Temporal Experience of Pleasure Scale (TEPS), and synchronous electroencephalography (EEG) during a "Doors Guessing Task." Changes in both clinical outcomes and EEG biomarkers, specifically the stimulus-preceding negativity (SPN) and feedback-related negativity amplitudes, were analyzed. RESULTS: Of the 92 screened participants, 71 were enrolled and 55 completed the study (CCFM6432 group: n=27; placebo group: n=28). No baseline differences were noted between the groups in terms of demographics, clinical assessments, or EEG metrics. A mixed-design analysis of variance revealed that the CCFM6432 group showed significantly greater improvements in both HAMD and TEPS scores compared with the placebo group. Moreover, the CCFM6432 group demonstrated a significant increase in SPN amplitudes, which was inversely correlated with the improvements observed in HAMD scores. No such changes were observed in the placebo group. CONCLUSION: Adjunctive administration of P. acidilactici CCFM6432 not only augments the therapeutic efficacy of antidepressants but also significantly ameliorates the symptoms of anhedonia in MDD.
Fund: Supported by the Sichuan Science and Technology Program (2025ZNSFSC0005).
Abstract: Robot navigation in complex crowd service scenarios, such as medical logistics and commercial guidance, requires a dynamic balance between safety and efficiency, while the traditional fixed reward mechanism lacks environmental adaptability and struggles to adapt to the variability of crowd density and pedestrian motion patterns. This paper proposes a navigation method that integrates spatiotemporal risk field modeling and adaptive reward optimization, aiming to improve the robot's decision-making ability in diverse crowd scenarios through dynamic risk assessment and nonlinear weight adjustment. We construct a spatiotemporal risk field model based on a Gaussian kernel function by combining crowd density, relative distance, and motion speed to quantify environmental complexity and realize crowd-density-sensitive risk assessment dynamically. We apply an exponential decay function to the reward design to address the linear conflict problem of fixed weights in multi-objective optimization. We adaptively adjust the weight allocation between safety constraints and navigation efficiency based on real-time risk values, prioritizing safety in highly dense areas and navigation efficiency in sparse areas. Experimental results show that our method improves the navigation success rate by 9.0% over state-of-the-art models in high-density scenarios, with a 10.7% reduction in the intrusion time ratio. Simulation comparisons validate the risk field model's ability to capture risk superposition effects in dense scenarios and the suppression of near-field dangerous behaviors by the exponential decay mechanism. Our parametric optimization paradigm establishes an explicit mapping between navigation objectives and risk parameters through rigorous mathematical formalization, providing an interpretable approach for the safe deployment of service robots in dynamic environments.
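The adaptive weighting described in this abstract (safety dominating in dense, risky areas, efficiency in sparse ones) can be sketched with an exponential decay of the efficiency share. This is a minimal illustration, not the paper's actual formulation; the function names and the decay rate `k` are assumptions.

```python
import math

def adaptive_weights(risk, k=2.0):
    """Split a unit weight budget between safety and efficiency.

    The efficiency share decays exponentially with the real-time risk
    value, so the safety share approaches 1 in high-risk (dense) areas.
    The decay rate k is an illustrative free parameter.
    """
    w_efficiency = math.exp(-k * risk)   # large when risk is low
    w_safety = 1.0 - w_efficiency        # dominates when risk is high
    return w_safety, w_efficiency

def blended_reward(risk, r_safety, r_efficiency, k=2.0):
    """Combine safety and efficiency rewards with risk-adaptive weights."""
    ws, we = adaptive_weights(risk, k)
    return ws * r_safety + we * r_efficiency
```

At zero risk the full weight goes to efficiency; as risk grows, the safety term smoothly takes over, avoiding the linear conflict of fixed weights.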
Fund: Supported by the Key Research and Development Program of Jiangsu Province (No. BE2020084-2) and the National Key Research and Development Program of China (No. 2020YFB1600104).
Abstract: For better flexibility and greater coverage areas, Unmanned Aerial Vehicles (UAVs) have been applied in Flying Mobile Edge Computing (F-MEC) systems to offer offloading services for User Equipment (UEs). This paper considers a disaster-affected scenario where UAVs undertake the role of MEC servers to provide computing resources for Disaster Relief Devices (DRDs). Considering the fairness of DRDs, a max-min problem is formulated to optimize the saved time by jointly designing the trajectory of the UAVs, the offloading policy and the serving time under the constraint of the UAVs' energy capacity. To solve the above non-convex problem, we first model the service process as a Markov Decision Process (MDP) with the Reward Shaping (RS) technique, and then propose a Deep Reinforcement Learning (DRL) based algorithm to find the optimal solution of the MDP. Simulations show that the proposed RS-DRL algorithm is valid and effective, and has better performance than the baseline algorithms.
Fund: Supported by the National Natural Science Foundation of China, Nos. 82304990 (to NY), 81973748 (to JC), and 82174278 (to JC); the National Key R&D Program of China, No. 2023YFE0209500 (to JC); the China Postdoctoral Science Foundation, No. 2023M732380 (to NY); the Guangzhou Key Laboratory of Formula-Pattern of Traditional Chinese Medicine, No. 202102010014 (to JC); the Huang Zhendong Research Fund for Traditional Chinese Medicine of Jinan University, No. 201911 (to JC); the National Innovation and Entrepreneurship Training Program for Undergraduates in China, No. 202310559128 (to NY and QM); and the Innovation and Entrepreneurship Training Program for Undergraduates at Jinan University, Nos. CX24380 and CX24381 (both to NY and QM).
Abstract: Early life stress correlates with a higher prevalence of neurological disorders, including autism, attention-deficit/hyperactivity disorder, schizophrenia, depression, and Parkinson's disease. These conditions, primarily involving abnormal development and damage of the dopaminergic system, pose significant public health challenges. Microglia, as the primary immune cells in the brain, are crucial in regulating neuronal circuit development and survival. From the embryonic stage to adulthood, microglia exhibit stage-specific gene expression profiles, transcriptome characteristics, and functional phenotypes, enhancing their susceptibility to early life stress. However, the role of microglia in mediating dopaminergic system disorders under early life stress conditions remains poorly understood. This review presents an up-to-date overview of preclinical studies elucidating the impact of early life stress on microglia, leading to dopaminergic system disorders, along with the underlying mechanisms and the therapeutic potential for neurodegenerative and neurodevelopmental conditions. Impaired microglial activity damages dopaminergic neurons by diminishing neurotrophic support (e.g., insulin-like growth factor-1) and hinders dopaminergic axon growth through defective phagocytosis and synaptic pruning. Furthermore, blunted microglial immunoreactivity suppresses striatal dopaminergic circuit development and reduces neuronal transmission. In addition, inflammation and oxidative stress induced by activated microglia can directly damage dopaminergic neurons, inhibiting dopamine synthesis, reuptake, and receptor activity. Enhanced microglial phagocytosis inhibits dopamine axon extension. These long-lasting effects of microglial perturbations may be driven by early life stress-induced epigenetic reprogramming of microglia. Indirectly, early life stress may influence microglial function through various pathways, such as astrocytic activation, the hypothalamic-pituitary-adrenal axis, the gut-brain axis, and maternal immune signaling. Finally, various therapeutic strategies and molecular mechanisms for targeting microglia to restore the dopaminergic system are summarized and discussed. These strategies include classical antidepressants and antipsychotics, antibiotics and anti-inflammatory agents, and herbal-derived medicines. Further investigations combining pharmacological interventions and genetic strategies are essential to elucidate the causal role of microglial phenotypic and functional perturbations in the dopaminergic system disrupted by early life stress.
Fund: Supported by the National Key R&D Program of China: Gravitational Wave Detection Project (Grant Nos. 2021YFC22026, 2021YFC2202601, 2021YFC2202603) and the National Natural Science Foundation of China (Grant Nos. 12172288 and 12472046).
Abstract: This paper investigates impulsive orbital attack-defense (AD) games under multiple constraints and victory conditions, involving three spacecraft: attacker, target, and defender. In the AD scenario, the attacker aims to breach the defender's interception to rendezvous with the target, while the defender seeks to protect the target by blocking or actively pursuing the attacker. Four different maneuvering constraints and five potential game outcomes are incorporated to model AD game problems more accurately and increase complexity, thereby reducing the effectiveness of traditional methods such as differential games and game-tree searches. To address these challenges, this study proposes a multi-agent deep reinforcement learning solution with variable reward functions. Two attack strategies, Direct attack (DA) and Bypass attack (BA), are developed for the attacker, each focusing on different mission priorities. Similarly, two defense strategies, Direct interdiction (DI) and Collinear interdiction (CI), are designed for the defender, each optimizing specific defensive actions through tailored reward functions. Each reward function incorporates both process rewards (e.g., distance and angle) and outcome rewards, derived from physical principles and validated via geometric analysis. Extensive simulations of four strategy confrontations demonstrate average defensive success rates of 75% for DI vs. DA, 40% for DI vs. BA, 80% for CI vs. DA, and 70% for CI vs. BA. Results indicate that CI outperforms DI for defenders, while BA outperforms DA for attackers. Moreover, defenders achieve their objectives more effectively under identical maneuvering capabilities. Trajectory evolution analyses further illustrate the effectiveness of the proposed variable-reward-function-driven strategies. These strategies and analyses offer valuable guidance for practical orbital defense scenarios and lay a foundation for future multi-agent game research.
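The reward structure described here, process rewards from geometry (distance, angle) plus a terminal outcome reward, can be sketched as below. The weights and constants are illustrative placeholders, not the paper's actual values.

```python
def variable_reward(distance, angle, done, success,
                    w_dist=0.01, w_ang=0.1, r_win=10.0, r_loss=-10.0):
    """Process reward that penalizes range and pointing error each step,
    plus a one-time outcome reward at episode end.

    All weights (w_dist, w_ang) and terminal values (r_win, r_loss)
    are assumed for illustration only.
    """
    reward = -w_dist * distance - w_ang * angle  # process component
    if done:
        reward += r_win if success else r_loss   # outcome component
    return reward
```

Shaping each strategy (DA, BA, DI, CI) then amounts to choosing different geometric terms and weights inside this template.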
Fund: Supported by the National Key Research and Development Program of China (No. 2022YFF 1303303) and the National Natural Science Foundation of China (No. 52394194).
Abstract: Combined inoculation with dark septate endophytes (DSEs) and arbuscular mycorrhizal fungi (AMF) has been shown to promote plant growth, yet the underlying plant-fungus interaction mechanisms remain unclear. To elucidate the nature of this symbiosis, it is crucial to explore carbon (C) transport from plants to fungi and nutrient exchange between them. In this study, a pot experiment was conducted with two phosphorus (P) fertilization levels (low and normal) and four fungal inoculation treatments (no inoculation, single inoculation with AMF or DSE, and co-inoculation with AMF and DSE). The ¹³C isotope pulse labeling method was employed to quantify the photosynthetic C transferred from plants to the different fungi, shedding light on the mechanisms of nutrient exchange between plants and fungi. Soil and mycelium δ¹³C, the soil C/N ratio, and the soil C/P ratio were higher at the low P level than at the normal P level. However, the soil microbial biomass C/P ratio was lower at the low P level, suggesting that the low P level was beneficial to soil C fixation and to soil fungal P mineralization and transport. At the low P level, the P reward to plants from AMF and DSE increased significantly when the plants transferred the same amount of C to the fungi, and the two fungi synergistically promoted plant nutrient uptake and growth. At the normal P level, root P content was significantly higher in AMF-inoculated plants than in DSE-inoculated plants, indicating that AMF contributed more than DSE to plant P uptake for the same amount of C received. Moreover, plants preferentially allocated more C to AMF. These findings indicate the presence of a source-sink balance between plant C allocation and fungal P contribution. Overall, AMF and DSE conferred a higher reward to plants at the low P level through functionally synergistic strategies.
Abstract: The fiscal stimulus package continues to play the leading role in China's economic prosperity. After several nervous months, China is finally breathing a sigh of relief as the powerful stimulus shifts the nation's growth engine out of
Fund: Supported by the National Natural Science Foundation of China (31630031, 81425010, 31471109, 31671116, and 31500861); the International Partnership Program of Chinese Academy of Sciences, China (172644KYS820170004); the Helmholtz-CAS Joint Research Grant (GJHZ1508); the Guangdong Provincial Key Laboratory of Brain Connectome and Behavior, China (2017B030301017); Shenzhen Governmental Grants, China (JCYJ20160429190927063, KQJSCX20160301144002, JCYJ20170413164535041, JCYJ20150401150223647, JCYJ20160429185854999, JSGG20160429190521240); the Research Instrument Development Project of the Chinese Academy of Sciences, China (YJKYYQ20170064); the Youth Innovation Promotion Association of Chinese Academy of Sciences (2017413); Shenzhen Municipal Funding, China (GJHZ20160229200136090); the Shenzhen Discipline Construction Project for Neurobiology, China (DRCSM[2016]1379); and the Ten Thousand Talent Program, Guangdong Special Support Program, China, and the Science and Technology Planning Project of Guangdong Province, China (2018B030331001).
Abstract: The Ventral Tegmental Area (VTA) is a midbrain structure known to integrate aversive and rewarding stimuli, but little is known about the role of VTA glutamatergic (VGluT2) neurons in these functions. Direct activation of VGluT2 somata evokes rewarding behaviors, while activation of their downstream projections evokes aversive behaviors. To facilitate our understanding of these conflicting properties, we recorded calcium signals from VTA VGluT2+ neurons using fiber photometry in VGluT2-cre mice to investigate how this population is recruited by aversive and rewarding stimulation, in both unconditioned and conditioned protocols. Our results revealed that, as a population, VTA VGluT2+ neurons responded similarly to unconditioned aversive and unconditioned rewarding stimulation. During aversive and rewarding conditioning, the CS-evoked responses gradually increased across trials whilst the US-evoked response remained stable. Retrieval 24 h after conditioning, during which mice received only CS presentation, resulted in VTA VGluT2+ neurons strongly responding to CS presentation and to the expected US, but only for aversive conditioning. To help understand these differences based on VTA VGluT2+ neuronal networks, the inputs and outputs of VTA VGluT2+ neurons were investigated using Cholera Toxin B (CTB) and rabies virus. Based on our results, we propose that the divergent VTA VGluT2+ neuronal responses to aversion and reward conditioning may be partly due to the existence of VTA VGluT2+ subpopulations characterized by their connectivity.
Fund: Supported by the Key Research and Development Program of Shaanxi (2022GXLH-02-09), the Aeronautical Science Foundation of China (20200051053001), and the Natural Science Basic Research Program of Shaanxi (2020JM-147).
Abstract: Autonomous unmanned aerial vehicle (UAV) manipulation is necessary for the defense department to execute tactical missions given by commanders in the future unmanned battlefield. A large amount of research has been devoted to improving the autonomous decision-making ability of UAVs in an interactive environment, where finding the optimal maneuvering decision-making policy has become one of the key issues for enabling the intelligence of UAVs. In this paper, we propose a maneuvering decision-making algorithm for autonomous air-delivery based on deep reinforcement learning under the guidance of expert experience. Specifically, we refine the guidance-towards-area and guidance-towards-specific-point tasks for the air-delivery process based on traditional air-to-surface fire control methods. Moreover, we construct the UAV maneuvering decision-making model based on Markov decision processes (MDPs). Specifically, we present a reward shaping method for the guidance-towards-area and guidance-towards-specific-point tasks using a potential-based function and expert-guided advice. The proposed algorithm accelerates the convergence of the maneuvering decision-making policy and increases the stability of the policy output during the later stage of the training process. The effectiveness of the proposed maneuvering decision-making policy is illustrated by the training-parameter curves and extensive experimental results from testing the trained policy.
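Potential-based reward shaping, as mentioned in this abstract, conventionally takes the form F(s, s') = gamma * Phi(s') - Phi(s), which is known to leave the optimal policy unchanged. A minimal sketch follows; the distance-based potential is a toy assumption for illustration, not the paper's expert-guided function.

```python
def shaped_reward(r, s, s_next, phi, gamma=0.99):
    """Add the potential-based shaping term gamma*phi(s') - phi(s)
    to the environment reward r (Ng-style shaping)."""
    return r + gamma * phi(s_next) - phi(s)

def phi(state, target=10.0):
    """Toy potential: negative distance to an assumed 1-D target,
    so moving toward the target yields positive shaping."""
    return -abs(target - state)
```

Because the shaping terms telescope along a trajectory, dense guidance is added without altering which policy is optimal.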
Fund: Supported by the National Natural Science Foundation of China (Grant No. 71961003).
Abstract: In public goods games, punishments and rewards have been shown to be effective mechanisms for maintaining individual cooperation. However, punishments and rewards are costly ways to incentivize cooperation. Therefore, the generation of costly penalties and rewards has been a complex problem in promoting the development of cooperation. In real society, specialized institutions exist to punish evil people or reward good people by collecting taxes. Motivated by this phenomenon, we propose a strong altruistic punishment or reward strategy in the public goods game. Through theoretical analysis and numerical calculation, we find that tax-based strong altruistic punishment (reward) has more evolutionary advantages than traditional strong altruistic punishment (reward) in maintaining cooperation, and that tax-based strong altruistic reward leads to a higher level of cooperation than tax-based strong altruistic punishment.
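A tax-funded punishment institution of the kind described above can be sketched in a one-shot public goods game: every player pays a flat tax, and the institution uses it to fine defectors. All parameter values (contribution, multiplier, tax, fine) are illustrative assumptions, not the paper's model.

```python
def public_goods_payoffs(strategies, c=1.0, r=3.0, tax=0.1, fine=0.5):
    """Payoffs in a one-shot public goods game with a tax-funded
    punishment institution (parameters are illustrative only).

    Cooperators ('C') contribute c; the pot is multiplied by r and
    shared equally. Every player pays a flat tax that funds an
    institution which fines each defector ('D')."""
    n = len(strategies)
    pot = r * c * strategies.count('C')
    payoffs = []
    for s in strategies:
        p = pot / n - tax        # equal share of the pot, minus the tax
        if s == 'C':
            p -= c               # cooperators bear the contribution cost
        else:
            p -= fine            # institution punishes defectors
        payoffs.append(p)
    return payoffs
```

Shifting the cost of sanctioning onto a tax in this way removes the second-order free-rider problem, since no individual pays the punishment cost directly.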
Abstract: Cross-lingual image description, the task of generating image captions in a target language from images and descriptions in a source language, is addressed in this study through a novel approach that combines neural network models and semantic matching techniques. Experiments conducted on the Flickr8k and AraImg2k benchmark datasets, featuring images and descriptions in English and Arabic, showcase remarkable performance improvements over state-of-the-art methods. Our model, equipped with the Image & Cross-Language Semantic Matching module and the Target Language Domain Evaluation module, significantly enhances the semantic relevance of generated image descriptions. For English-to-Arabic and Arabic-to-English cross-language image description, our approach achieves CIDEr scores of 87.9% for English and 81.7% for Arabic, emphasizing the substantial contributions of our methodology. Comparative analyses with previous works further affirm the superior performance of our approach, and visual results underscore that our model generates image captions that are both semantically accurate and stylistically consistent with the target language. In summary, this study advances the field of cross-lingual image description, offering an effective solution for generating image captions across languages, with the potential to impact multilingual communication and accessibility. Future research directions include expanding to more languages and incorporating diverse visual and textual data sources.
Fund: Supported by the National Natural Science Foundation of China (Nos. 61803260, 61673262 and 61175028).
Abstract: Multi-agent reinforcement learning has recently been applied to solve pursuit problems. However, it suffers from a large number of time steps per training episode, and thus always struggles to converge effectively, resulting in low rewards and an inability of agents to learn strategies. This paper proposes a deep reinforcement learning (DRL) training method that employs an ensemble segmented multi-reward function design approach to address the convergence problem mentioned above. The ensemble reward function combines the advantages of two reward functions, which enhances the training effect of agents in long episodes. Then, we eliminate the non-monotonic behavior in the reward function introduced by the trigonometric functions in the traditional 2D polar coordinate observation representation. Experimental results demonstrate that this method outperforms the traditional single reward function mechanism in the pursuit scenario by enhancing agents' policy scores on the task. These ideas offer a solution to the convergence challenges faced by DRL models in long-episode pursuit problems, leading to improved model training performance.
Fund: Funded by the National Natural Science Foundation of China (No. 62063006), the Guangxi Science and Technology Major Program (No. 2022AA05002), the Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region (No. 2022GXZDSY003), and the Central Leading Local Science and Technology Development Fund Project of Wuzhou (No. 202201001).
Abstract: By integrating deep neural networks with reinforcement learning, the Double Deep Q Network (DDQN) algorithm overcomes the limitations of Q-learning in handling continuous spaces and is widely applied in the path planning of mobile robots. However, the traditional DDQN algorithm suffers from sparse rewards and inefficient utilization of high-quality data. To address these problems, an improved DDQN algorithm based on average Q-value estimation and reward redistribution is proposed. First, to enhance the precision of the target Q-value, the average of multiple previously learned Q-values from the target Q network is used to replace the single Q-value from the current target Q network. Next, a reward redistribution mechanism is designed to overcome the sparse reward problem by adjusting the final reward of each action using the round reward from trajectory information. Additionally, a reward-prioritized experience selection method is introduced, which ranks experience samples according to reward values to ensure frequent utilization of high-quality data. Finally, simulation experiments are conducted to verify the effectiveness of the proposed algorithm in a fixed-position scenario and in random environments. The experimental results show that, compared with the traditional DDQN algorithm, the proposed algorithm achieves shorter average running time, higher average return and fewer average steps. The performance of the proposed algorithm is improved by 11.43% in the fixed scenario and by 8.33% in random environments. It not only plans economical and safe paths but also significantly improves efficiency and generalization in path planning, making it suitable for widespread application in autonomous navigation and industrial automation.
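The average Q-value estimation described above replaces a single target-network estimate with the mean of several previously learned ones. A simplified tabular stand-in is sketched below; the class name and the window size k are assumptions, and the paper's version operates on network outputs rather than a lookup table.

```python
from collections import deque

class AveragedTargetQ:
    """Keep the last k target-Q estimates per (state, action) and use
    their mean as the bootstrap target, smoothing single-estimate noise.
    A tabular illustration of the averaging idea only."""

    def __init__(self, k=3):
        self.k = k
        self.history = {}   # (state, action) -> deque of recent Q values

    def push(self, state, action, q_value):
        d = self.history.setdefault((state, action), deque(maxlen=self.k))
        d.append(q_value)

    def target(self, state, action):
        d = self.history.get((state, action))
        return sum(d) / len(d) if d else 0.0
```

Averaging over a sliding window reduces the variance of the bootstrap target, which is the claimed source of the improved target-Q precision.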
Fund: Supported by the International Partnership Program of Chinese Academy of Sciences (No. 184131KYSB20200033).
Abstract: Due to the long-horizon issue, a substantial number of visits to the state space is required during the exploration phase of reinforcement learning (RL) to gather valuable information. Additionally, due to the challenge posed by sparse rewards, the planning phase of reinforcement learning consumes a considerable amount of time on repetitive and unproductive tasks before adequately accessing sparse reward signals. To address these challenges, this work proposes a space partitioning and reverse merging (SPaRM) framework based on reward-free exploration (RFE). The framework consists of two parts: the space partitioning module and the reverse merging module. The former module partitions the entire state space into a specific number of subspaces to expedite the exploration phase; this work establishes its theoretical sample-complexity lower bound. The latter module starts planning in reverse from near the target and gradually extends to the starting state, as opposed to the conventional practice of starting at the beginning. This facilitates the early involvement of the sparse rewards at the target in the policy update process. This work designs two experimental environments: a complex maze and a set of randomly generated maps. Compared with two state-of-the-art (SOTA) algorithms, experimental results validate the effectiveness and superior performance of the proposed algorithm.
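Planning "in reverse from near the target" can be illustrated on a graph abstraction with a backward breadth-first sweep from the goal, which propagates the sparse goal signal toward the starting state. This is a toy analogue of the idea, not the SPaRM procedure itself; the graph and function names are assumptions.

```python
from collections import deque

def reverse_distances(neighbors, goal):
    """Breadth-first sweep outward from the goal: states nearest the
    sparse reward are labeled first, and the signal is extended step by
    step back toward the start (a graph-level illustration only)."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        s = queue.popleft()
        for t in neighbors(s):
            if t not in dist:
                dist[t] = dist[s] + 1   # one more step from the goal
                queue.append(t)
    return dist
```

In an RL setting the resulting distance labels can seed value estimates near the target early, instead of waiting for forward rollouts to stumble onto the reward.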