Offline policy evaluation, which evaluates and selects complex policies for decision-making using only offline datasets, is important in reinforcement learning. At present, model-based offline policy evaluation (MBOPE) is widely used because it is easy to implement and performs well. MBOPE directly approximates the unknown value of a given policy using the Monte Carlo method, given the estimated transition and reward functions of the environment. Usually, multiple models are trained, and then one of them is selected for use. However, a challenge remains in selecting an appropriate model from those trained. The authors first analyse the upper bound of the difference between the approximated value and the unknown true value. Theoretical results show that this difference is related to the trajectories generated by the given policy on the learnt model and to the prediction error of the transition and reward functions at these generated data points. Based on the theoretical results, a new criterion is proposed to tell which trained model is better suited for evaluating the given policy. Finally, the effectiveness of the proposed criterion is demonstrated on both benchmark and synthetic offline datasets.
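As a rough illustration of the Monte Carlo step described in this abstract, the sketch below rolls a policy out inside a learned model and averages the discounted returns. It is a minimal sketch under stated assumptions, not the authors' method: the function `mc_policy_value`, the toy policy, and the toy transition and reward functions are all hypothetical, and the model-selection criterion proposed in the paper is not reproduced here.

```python
import random


def mc_policy_value(policy, transition, reward, start_states,
                    gamma=0.99, horizon=100, seed=0):
    """Average discounted return of `policy` over rollouts in a learned model.

    `transition` and `reward` stand in for the estimated environment
    functions; one rollout is run from each state in `start_states`.
    """
    rng = random.Random(seed)
    returns = []
    for state in start_states:
        total, discount = 0.0, 1.0
        for _ in range(horizon):
            action = policy(state, rng)
            total += discount * reward(state, action)
            state = transition(state, action, rng)
            discount *= gamma
        returns.append(total)
    return sum(returns) / len(returns)


# Hypothetical toy model: every step yields reward 1 and moves the state by 1.
def toy_policy(state, rng):
    return 1


def toy_transition(state, action, rng):
    return state + action


def toy_reward(state, action):
    return 1.0


estimate = mc_policy_value(toy_policy, toy_transition, toy_reward,
                           start_states=[0, 5], gamma=0.5, horizon=3)
print(estimate)  # 1 + 0.5 + 0.25 = 1.75
```

With several trained models, the same routine could be called once per model; how to decide which estimate to trust is exactly the question the abstract addresses.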
To alleviate the extrapolation error and instability inherent in the Q-function learned directly by off-policy Q-learning (QL-style) on static datasets, this article uses on-policy state-action-reward-state-action (SARSA-style) learning to develop an offline reinforcement learning (RL) method termed robust offline Actor-Critic with on-policy regularized policy evaluation (OPRAC). With the help of SARSA-style bootstrap actions, a conservative on-policy Q-function and a penalty term for matching the on-policy and off-policy actions are jointly constructed to regularize the optimal Q-function of the off-policy QL-style. This naturally equips the off-policy QL-style policy evaluation with the intrinsic pessimistic conservatism of the on-policy SARSA-style, thus facilitating the acquisition of a stable estimated Q-function. Even with limited data sampling errors, the convergence of the Q-function learned by OPRAC and the controllability of the upper bound on the bias between the learned Q-function and its true Q-value can be theoretically guaranteed. In addition, the sub-optimality of the learned optimal policy stems merely from sampling errors. Experiments on the well-known D4RL Gym-MuJoCo benchmark demonstrate that OPRAC can rapidly learn robust and effective task-solving policies owing to the stable estimate of the Q-value, outperforming state-of-the-art offline RL methods by at least 15%.
This study explores the application of Bayesian econometrics in policy evaluation through theoretical analysis. The research first reviews the theoretical foundations of Bayesian methods, including the concepts of Bayesian inference, prior distributions, and posterior distributions. Through systematic analysis, the study constructs a theoretical framework for applying Bayesian methods in policy evaluation. The research finds that Bayesian methods have multiple theoretical advantages in policy evaluation: based on parameter uncertainty theory, Bayesian methods can better handle uncertainty in model parameters and provide more comprehensive estimates of policy effects; from the perspective of model selection theory, Bayesian model averaging can reduce model selection bias and enhance the robustness of evaluation results; according to causal inference theory, Bayesian causal inference methods provide new approaches for evaluating policy causal effects. The study also points out the complexities of applying Bayesian methods in policy evaluation, such as the selection of prior information and computational complexity. To address these complexities, the study proposes hybrid methods combining frequentist approaches and suggestions for developing computationally efficient algorithms. The research also discusses theoretical comparisons between Bayesian methods and other policy evaluation techniques, providing directions for future research.
This paper studies distributed policy evaluation in multi-agent reinforcement learning. Under cooperative settings, each agent only obtains a local reward, while all agents share a common environmental state. To optimize the global return as the sum of local returns, the agents exchange information with their neighbors through a communication network. The mean squared projected Bellman error minimization problem is reformulated as a constrained convex optimization problem with a consensus constraint; then, a distributed alternating direction method of multipliers (ADMM) algorithm is proposed to solve it. Furthermore, an inexact step for ADMM is used to achieve efficient computation at each iteration. The convergence of the proposed algorithm is established. Author note: yipeng@tongji.edu.cn. Li Li received the B.Sc. and M.Sc. degrees from Shenyang Agriculture University, China, in 1996 and 1999, respectively, and the Ph.D. degree from Shenyang Institute of Automation, Chinese Academy of Sciences, in 2003. She joined Tongji University, Shanghai, China, in 2003, and is now a professor at the Department of Control Science and Engineering. Her research interests are in data-driven modeling and optimization, and computational intelligence.
In reinforcement learning, policy evaluation aims to predict the long-term values of a state under a certain policy. Since high-dimensional representations are becoming more and more common in reinforcement learning, reducing the computational cost has become a significant problem for policy evaluation. Many recent works focus on adopting matrix sketching methods to accelerate least-squares temporal difference (TD) algorithms and quasi-Newton temporal difference algorithms. Among these sketching methods, the truncated incremental SVD shows better performance because it is stable and efficient. However, the convergence properties of the incremental SVD are still open. In this paper, we first show that conventional incremental SVD algorithms can have enormous approximation errors in the worst case. Then we propose a variant of incremental SVD with better theoretical guarantees by shrinking the singular values periodically. Moreover, we employ our improved incremental SVD to accelerate least-squares TD and quasi-Newton TD algorithms. The experimental results verify the correctness and effectiveness of our methods.
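For orientation, the least-squares TD baseline that the sketching methods aim to accelerate can be written in a few lines. The sketch below is plain LSTD(0) with two-dimensional features and a direct 2x2 solve; it is an illustrative assumption-laden example, not the paper's incremental-SVD variant, and the names `lstd`, `one_hot`, and the toy transitions are hypothetical.

```python
def lstd(transitions, phi, gamma, reg=1e-8):
    """LSTD(0) for 2-dimensional features.

    Solves A w = b with
      A = sum_t phi(s_t) (phi(s_t) - gamma * phi(s_{t+1}))^T + reg * I
      b = sum_t phi(s_t) * r_t
    so that the value estimate is V(s) = w . phi(s).
    """
    A = [[reg, 0.0], [0.0, reg]]
    b = [0.0, 0.0]
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        d = [f[0] - gamma * f_next[0], f[1] - gamma * f_next[1]]
        for i in range(2):
            b[i] += f[i] * r
            for j in range(2):
                A[i][j] += f[i] * d[j]
    # Cramer's rule for the 2x2 system.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    w0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    w1 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return [w0, w1]


def one_hot(state):
    """Hypothetical tabular features for a two-state chain."""
    return [1.0, 0.0] if state == 0 else [0.0, 1.0]


# Toy chain: state 0 -> state 1 with reward 0; state 1 loops with reward 1.
samples = [(0, 0.0, 1), (1, 1.0, 1)]
weights = lstd(samples, one_hot, gamma=0.5)
print(weights)  # close to [1.0, 2.0], the true values V(0) = 1 and V(1) = 2
```

The cost of forming and solving the system grows quickly with the feature dimension, which is the motivation the abstract gives for sketching the matrices involved.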
In this paper, we highlight some recent developments in a new route for evaluating macroeconomic policy effects, which are investigated under the potential outcomes framework. First, the paper begins with a brief introduction to the basic model setup in modern econometric analysis of program evaluation. Second, primary attention is given to causal effect estimation of macroeconomic policy with single time series data, together with some extensions to multiple time series data. Furthermore, we examine the connection of this new approach to traditional macroeconomic models for policy analysis and evaluation. Finally, we conclude by addressing some possible future research directions in statistics and econometrics.
When implementing open access, policy pioneers and flagship institutions alike have faced considerable challenges in meeting their own aims and achieving a recognized success. Legitimate authority, sufficient resources and the right timing are crucial, but the professionals charged with implementing policy still need several years to accomplish significant progress. This study defines a methodological standard for evaluating the first generation of open access policies. Evaluating implementation establishes evidence, enables reflection, and may foster the emergence of a second generation of open access policies. While the study is based on a small number of cases, these case studies cover most of the pioneer institutions, present the most significant issues and offer an international overview. Each case is reconstructed individually on the basis of public documents and background information, and supported by interviews with professionals responsible for open access implementation. This article presents the highlights from each case study. The results are utilized to indicate how a second generation of policies might define open access as a key component of digital research infrastructures that provide inputs and outputs for research, teaching and learning in real time.
Objective: To evaluate the effect of health care reform policy in China comprehensively and provide suggestions for its further implementation. Methods: Data on the effect of health care reform were obtained from the “China Health Statistics Yearbook” and the National Bureau of Statistics of China, and the indicators were selected by corrected item-total correlation (CITC) and Cronbach’s α reliability coefficient. Then, the selected indicators were calculated through the prospect theory model. Meanwhile, the gray relation analysis method was introduced to enlarge the differences between the advantages and disadvantages and make the comprehensive evaluation result more obvious. Results and Conclusion: The implementation of China’s health care reform has had a significant impact on China’s medical and health system. However, the effect of the policy diminishes as the total amount increases. Effective management can ensure that the policy continues to play its role.
Background: A milestone goal of the Healthy China Program (2019-2030) is to achieve 5-year cancer survival at 43.3% for all cancers combined by 2022. To assess the progress towards this target, we analyzed the updated survival for all cancers combined and 25 specific cancer types in China from 2019 to 2021. Methods: We conducted standardized data collection and quality control for cancer registries across 32 provincial-level regions in China, and included 6,410,940 newly diagnosed cancer patients from 281 cancer registries during 2008-2019, with follow-up data on vital status available until December 2021. We estimated the age-standardized 5-year relative survival overall and by site, age group, and period of diagnosis using the International Cancer Survival Standard Weights, and quantified the survival changes to assess the progress in cancer control. Results: In 2019-2021, the age-standardized 5-year relative survival for all cancers combined was 43.7% (95% confidence interval [CI], 43.6-43.7). The 5-year relative survival varied by cancer type, ranging from 8.5% (95% CI, 8.2-8.7) for pancreatic cancer to 92.9% (95% CI, 92.4-93.3) for thyroid cancer. Eight cancers had 5-year survival of over 60%, including cancers of the thyroid, breast, testis, bladder, prostate, kidney, uterus, and cervix. The 5-year relative survival was generally lower in males than in females. From 2008 to 2021, we observed significant survival improvements for cancers of the lung, prostate, bone, uterus, breast, cervix, nasopharynx, larynx, and bladder. The most significant improvement was in lung cancer. Conclusions: Progress in cancer control was evident in China. This highlights the importance of a comprehensive approach to control and prevent cancer.
Policy evaluation (PE) is a critical sub-problem in reinforcement learning, which estimates the value function for a given policy and can be used for policy improvement. However, there still exist some limitations in current PE methods, such as low sample efficiency and local convergence, especially on complex tasks. In this study, a novel PE algorithm called Least-Squares Truncated Temporal-Difference learning (LST2D) is proposed. In LST2D, an adaptive truncation mechanism is designed, which effectively takes advantage of the fast convergence property of Least-Squares Temporal Difference learning and the asymptotic convergence property of Temporal Difference learning (TD). Then, two feature pre-training methods are utilised to improve the approximation ability of LST2D. Furthermore, an Actor-Critic algorithm based on LST2D and pre-trained feature representations (ACLPF) is proposed, where LST2D is integrated into the critic network to improve learning-prediction efficiency. Comprehensive simulation studies were conducted on four robotic tasks, and the corresponding results illustrate the effectiveness of LST2D. The proposed ACLPF algorithm outperformed DQN, ACER and PPO in terms of sample efficiency and stability, which demonstrated that LST2D can be applied to online learning control problems by incorporating it into the actor-critic architecture.
Measuring the economic and social effects of the Northeast China Revitalization Strategy is critical to addressing regional sustainable development in China. To shed light on this issue, an integrated perspective was adopted and combined with the difference-in-differences method to measure the effects of the strategy on economic growth and social development in Northeast China. The findings suggest that the strategy has significantly improved regional economic growth and per-capita income by increasing gross domestic product (GDP) and GDP per capita by 25.70% and 46.00%, respectively. However, the strategy has significantly worsened employment in the region's secondary industry. In addition, the strategy has not significantly improved regional road infrastructure, education investment or social security, and has had no significant effect on mitigating regional disparity. Moreover, the policy effects are highly heterogeneous across cities based on city size and characteristics. Therefore, there is no simple answer regarding whether the Northeast China Revitalization Strategy has reached its original goals from an integrated perspective. The next phase of the strategy should emphasize improving research and development (R&D) and human capital investments based on urban heterogeneity to prevent conservative path-dependency and the lock-in of outdated technologies.
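The difference-in-differences idea named in this abstract reduces, in its simplest two-group, two-period form, to subtracting the control group's change from the treated group's change. The sketch below shows only that arithmetic; the function name `did_estimate` and all numbers are made up for illustration and are not taken from the study, which presumably relies on a regression specification with controls.

```python
def mean(values):
    return sum(values) / len(values)


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate.

    Returns (change in treated mean) - (change in control mean), which
    identifies the policy effect only under the parallel-trends assumption.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical outcome values for illustration only.
effect = did_estimate(
    treated_pre=[10.0, 12.0], treated_post=[15.0, 17.0],
    control_pre=[8.0, 10.0], control_post=[10.0, 12.0],
)
print(effect)  # (16 - 11) - (11 - 9) = 3.0
```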
Datong County has been developing rural e-commerce for nearly 5 years since it was selected in 2017 as a comprehensive demonstration county for e-commerce in rural areas. In this context, this paper analyzes the development status of e-commerce in the rural areas of Datong County and provides feasible suggestions for developing rural e-commerce in demonstration counties.
This study evaluated the impacts of food safety policies on Japan's Simultaneous Buy and Sell rice imports by measuring the tariff equivalents of food safety policies. To construct the estimation model, a Japanese consumer's utility function is introduced and developed with the consumer's preference parameters and elasticity of substitution. In the empirical part of the study, Japan's positive list system and rice traceability were analyzed and assessed as critical food safety policies. Results showed that after the implementation of the positive list system, consumers' preference for foreign rice and the substitution elasticity diminished. This decreasing tendency was quite similar to the results after the enforcement of rice traceability. The tariff equivalents of food safety policies on imported rice fluctuated around ¥50/kg from fiscal year 2000 to 2005 and decreased because of the global grain price hike after 2006. The tariff equivalents soared in 2010, induced by the traceability regulation, and then weakened during Japan's earthquake and tsunami in 2011. Subsequently, after the recovery from natural disasters, the tariff equivalents of food safety policies became higher. Therefore, food safety policies made imported rice less attractive, weakened the competitive power of rice exporting countries, and had statistically significant impacts on Japan's rice importation.
A main challenge of attribute-based access control (ABAC) is the handling of missing information. Several studies have shown that the way standard ABAC mechanisms, e.g. those based on XACML, handle missing information is flawed, making ABAC policies vulnerable to attribute-hiding attacks. Recent work has addressed the problem of missing information in ABAC by introducing the notion of extended evaluation, where the evaluation of a query considers all queries that can be obtained by extending the initial query. This method counters attribute-hiding attacks, but a naïve implementation is intractable, as it requires an evaluation of the whole query space. In this paper, we present a framework for the extended evaluation of ABAC policies. The framework relies on Binary Decision Diagram (BDD) data structures for the efficient computation of the extended evaluation of ABAC policies. We also introduce the notions of query constraints and attribute value power, to avoid evaluating queries that do not represent a valid state of the system and to identify which attribute values should be considered in the computation of the extended evaluation, respectively. We illustrate our framework using three real-world policies, which would be intractable with the original method but which are analyzed in seconds using our framework.
Addressing pollution caused by economic development, especially the overcapacity of polluting enterprises, is crucial for promoting sustainable economic growth. Targeted environmental policies are essential for strengthening environmental constraints on enterprises and enhancing the effectiveness of regulatory instruments. This study focused on the Environmental Credit Evaluation policy by examining its potential to improve capacity utilization and assessing its broader impact on heavily polluting enterprises. It constructed a time-varying difference-in-difference-in-differences model using panel data from 965 industrial enterprises from 2009 to 2019. The findings reveal that, in comparison with their non-heavily polluting counterparts, heavily polluting enterprises subject to the policy demonstrated significant improvements in capacity utilization. Heavily polluting enterprises that experienced a substantial increase in financing costs also exhibited a marked reduction in inefficient investment, without negatively affecting innovation investments or output.
Against the background of the special clean-up action of "Breaking the Five-only", this paper reviews the relevant domestic policies on science and education evaluation. Using the scientometric software CiteSpace and VOSviewer, it presents a visual analysis of domestic research related to the "Five-only" and "science and education evaluation", and describes the frontier hot spots and trends of science and education evaluation research in China. On this basis, the paper summarizes countermeasures and suggestions on how to "break" the "Five-only" and how to "establish" the "new system of science and education evaluation", in order to provide a reference for the sustainable and healthy development of science and education evaluation in China.
Funding: Supported in part by the National Natural Science Foundation of China (62176259, 62373364) and the Key Research and Development Program of Jiangsu Province (BE2022095).
Funding: The National Key Research and Development Program of Science and Technology, China (No. 2018YFB1305304); the Shanghai Science and Technology Pilot Project, China (No. 19511132100); the National Natural Science Foundation, China (No. 51475334); the Shanghai Sailing Program, China (No. 20YF1453000); the Fundamental Research Funds for the Central Universities, China (No. 22120200048).
Funding: The corresponding author, Weinan Zhang, was supported by the “New Generation of AI 2030” Major Project (2018AAA0100900) and the National Natural Science Foundation of China (Grant Nos. 62076161, 61772333, 61632017).
Funding: The National Natural Science Foundation of China (71631004, Key Project); the National Science Fund for Distinguished Young Scholars (71625001); the Basic Scientific Center Project of National Science Foundation of China: Econometrics and Quantitative Policy Evaluation (71988101); the Science Foundation of Ministry of Education of China (19YJA910003); China Scholarship Council Funded Project (201806315045).
文摘In this paper,we highlight some recent developments of a new route to evaluate macroeconomic policy effects,which are investigated under the framework with potential outcomes.First,this paper begins with a brief introduction of the basic model setup in modern econometric analysis of program evaluation.Secondly,primary attention goes to the focus on causal effect estimation of macroeconomic policy with single time series data together with some extensions to multiple time series data.Furthermore,we examine the connection of this new approach to traditional macroeconomic models for policy analysis and evaluation.Finally,we conclude by addressing some possible future research directions in statistics and econometrics.
Abstract: When implementing open access, policy pioneers and flagship institutions alike have faced considerable challenges in meeting their own aims and achieving recognized success. Legitimate authority, sufficient resources and the right timing are crucial, but the professionals charged with implementing policy still need several years to accomplish significant progress. This study defines a methodological standard for evaluating the first generation of open access policies. Evaluating implementation establishes evidence, enables reflection, and may foster the emergence of a second generation of open access policies. While the study is based on a small number of cases, these case studies cover most of the pioneer institutions, present the most significant issues and offer an international overview. Each case is reconstructed individually on the basis of public documents and background information, and supported by interviews with professionals responsible for open access implementation. This article presents the highlights from each case study. The results are used to indicate how a second generation of policies might define open access as a key component of digital research infrastructures that provide inputs and outputs for research, teaching and learning in real time.
Abstract: Objective: To comprehensively evaluate the effect of health care reform policy in China and provide suggestions for its further implementation. Methods: Data on the effect of health care reform were obtained from the “China Health Statistics Yearbook” and the National Bureau of Statistics of China, and the indicators were selected by corrected item-total correlation (CITC) and Cronbach's α reliability coefficient. The selected indicators were then calculated through the prospect theory model. Meanwhile, the gray relation analysis method was introduced to enlarge the differences between the advantages and disadvantages and make the comprehensive evaluation result more obvious. Results and Conclusion: The implementation of China's health care reform has a significant impact on China's medical and health system. However, the effect of the policy diminishes as the total amount increases. Effective management can ensure that the policy continues to play its role.
Funding: supported by the National Key R&D Program of China (grant numbers: 2022YFC3600805, 2020AAA0109500); the National Natural Science Foundation of China (grant number: 82188102); the R&D Program of Beijing Municipal Education Commission (grant number: KJZD20191002302); CAMS Initiative for Innovative Medicine (grant number: 2021-1-I2M-012); Shenzhen High-level Hospital Construction Fund, Sanming Project of Medicine in Shenzhen (grant number: SZSM202211011).
Abstract: Background: A milestone goal of the Healthy China Program (2019-2030) is to achieve a 5-year cancer survival of 43.3% for all cancers combined by 2022. To assess the progress towards this target, we analyzed the updated survival for all cancers combined and 25 specific cancer types in China from 2019 to 2021. Methods: We conducted standardized data collection and quality control for cancer registries across 32 provincial-level regions in China, and included 6,410,940 newly diagnosed cancer patients from 281 cancer registries during 2008-2019, with follow-up data on vital status available until December 2021. We estimated the age-standardized 5-year relative survival overall and by site, age group, and period of diagnosis using the International Cancer Survival Standard Weights, and quantified the survival changes to assess the progress in cancer control. Results: In 2019-2021, the age-standardized 5-year relative survival for all cancers combined was 43.7% (95% confidence interval [CI], 43.6-43.7). The 5-year relative survival varied by cancer type, ranging from 8.5% (95% CI, 8.2-8.7) for pancreatic cancer to 92.9% (95% CI, 92.4-93.3) for thyroid cancer. Eight cancers had a 5-year survival of over 60%: cancers of the thyroid, breast, testis, bladder, prostate, kidney, uterus, and cervix. The 5-year relative survival was generally lower in males than in females. From 2008 to 2021, we observed significant survival improvements for cancers of the lung, prostate, bone, uterus, breast, cervix, nasopharynx, larynx, and bladder. The most significant improvement was in lung cancer. Conclusions: Progress in cancer control was evident in China. This highlights the importance of a comprehensive approach to controlling and preventing cancer.
Funding: Joint Funds of the National Natural Science Foundation of China, Grant/Award Number: U21A20518; National Natural Science Foundation of China, Grant/Award Numbers: 62106279, 61903372.
Abstract: Policy evaluation (PE) is a critical sub-problem in reinforcement learning: it estimates the value function for a given policy and can be used for policy improvement. However, current PE methods still have limitations, such as low sample efficiency and local convergence, especially on complex tasks. In this study, a novel PE algorithm called Least-Squares Truncated Temporal-Difference learning (LST2D) is proposed. In LST2D, an adaptive truncation mechanism is designed that takes advantage of both the fast convergence of Least-Squares Temporal Difference learning and the asymptotic convergence of Temporal Difference (TD) learning. Two feature pre-training methods are then used to improve the approximation ability of LST2D. Furthermore, an Actor-Critic algorithm based on LST2D and pre-trained feature representations (ACLPF) is proposed, in which LST2D is integrated into the critic network to improve learning-prediction efficiency. Comprehensive simulation studies were conducted on four robotic tasks, and the results illustrate the effectiveness of LST2D. The proposed ACLPF algorithm outperformed DQN, ACER and PPO in terms of sample efficiency and stability, which demonstrates that LST2D can be applied to online learning control problems by incorporating it into the actor-critic architecture.
Funding: under the auspices of the Young- and Middle-aged Science and Technology Talent Support Program of Shenyang City (No. RC180221); Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA23070501); Open Research Project of Shouguang Facilities Agriculture Center in Institute of Applied Ecology (No. 2018SG-B-01); National Natural Science Foundation of China (No. 41971166, 41701142).
Abstract: Measuring the economic and social effects of the Northeast China Revitalization Strategy is critical to addressing regional sustainable development in China. To shed light on this issue, an integrated perspective was adopted and combined with the difference-in-differences method to measure the effects of the strategy on economic growth and social development in Northeast China. The findings suggest that the strategy has significantly improved regional economic growth and per-capita income, increasing the region's gross domestic product (GDP) and GDP per capita by 25.70% and 46.00%, respectively. However, the strategy has significantly worsened regional employment in the secondary industry. In addition, the strategy has not significantly improved regional road infrastructure, education investment or social security, and has had no significant effect on mitigating regional disparity. The policy effects are also highly heterogeneous across cities, depending on city size and characteristics. Therefore, there is no simple answer as to whether the Northeast China Revitalization Strategy has reached its original goals from an integrated perspective. The next phase of the strategy should emphasize improving research and development (R&D) and human capital investments based on urban heterogeneity to prevent conservative path-dependency and the lock-in of outdated technologies.
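The difference-in-differences method named in this abstract can be shown in its simplest two-group, two-period form. This is a minimal sketch, not the study's specification: the function name `did_estimate` and all numbers are made up for illustration and are not taken from the paper, which would typically use a regression with controls and fixed effects.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences estimate.

    Change in the treated group minus change in the control group.
    Valid only under the parallel-trends assumption.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical log-GDP values for a few cities (illustrative only).
treated_pre = [10.0, 10.2, 10.1]
treated_post = [10.6, 10.8, 10.7]
control_pre = [9.8, 10.0, 9.9]
control_post = [10.1, 10.3, 10.2]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
```

Here the treated group rises by about 0.6 and the control group by about 0.3, so the estimate attributes the remaining gap of roughly 0.3 log points to the policy.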
Abstract: Datong County has been developing for nearly 5 years since it was selected in 2017 as a comprehensive demonstration county for e-commerce entering rural areas. In this context, this paper analyzes the development status of e-commerce in the rural areas of Datong County and provides feasible suggestions for the development of rural e-commerce in demonstration counties.
Funding: This work was supported in part by the Fundamental Research Funds for the Central Universities, CUMT (Project No. 2017WA02).
Abstract: This study evaluated the impacts of food safety policies on Japan's Simultaneous Buy and Sell rice imports by measuring the tariff equivalents of food safety policies. To construct the estimation model, a Japanese consumer's utility function is introduced and developed with the consumer's preference parameters and elasticity of substitution. In the empirical part of the study, Japan's positive list system and rice traceability were analyzed and assessed as critical food safety policies. Results showed that after the implementation of the positive list system, consumers' preference for foreign rice and the substitution elasticity diminished. This decreasing tendency was quite similar to the results after the enforcement of rice traceability. The tariff equivalents of food safety policies on imported rice fluctuated around ¥50/kg from fiscal year 2000 to 2005 and decreased after 2006 because of the global grain price hike. The tariff equivalents soared in 2010, induced by the traceability regulation, and then weakened during Japan's earthquake and tsunami in 2011. Subsequently, after the recovery from natural disasters, the tariff equivalents of food safety policies became higher. Therefore, food safety policies made imported rice less attractive, weakened the competitive power of rice exporting countries, and had statistically significant impacts on Japan's rice imports.
Funding: This work is partially funded by the ITEA3 project APPSTACLE (15017) and the ECSEL project SECREDAS (783119).
Abstract: A main challenge of attribute-based access control (ABAC) is the handling of missing information. Several studies have shown that the way standard ABAC mechanisms, e.g. those based on XACML, handle missing information is flawed, making ABAC policies vulnerable to attribute-hiding attacks. Recent work has addressed the problem of missing information in ABAC by introducing the notion of extended evaluation, where the evaluation of a query considers all queries that can be obtained by extending the initial query. This method counters attribute-hiding attacks, but a naïve implementation is intractable, as it requires an evaluation of the whole query space. In this paper, we present a framework for the extended evaluation of ABAC policies. The framework relies on Binary Decision Diagram (BDD) data structures for the efficient computation of the extended evaluation of ABAC policies. We also introduce the notions of query constraints and attribute value power, respectively to avoid evaluating queries that do not represent a valid state of the system and to identify which attribute values should be considered in the computation of the extended evaluation. We illustrate our framework using three real-world policies, which would be intractable with the original method but which are analyzed in seconds using our framework.
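The naïve form of extended evaluation described in this abstract can be sketched by enumerating every extension of a query. This is only a toy illustration: `toy_policy`, the attribute names, and the decision labels are hypothetical, the semantics are far simpler than XACML, and the paper's BDD-based framework is not reproduced here.

```python
from itertools import combinations

PERMIT, DENY, NOT_APPLICABLE = "Permit", "Deny", "NotApplicable"


def toy_policy(query):
    """Hypothetical policy: contractors are denied, employees permitted."""
    if ("role", "contractor") in query:
        return DENY
    if ("role", "employee") in query:
        return PERMIT
    return NOT_APPLICABLE


def extended_evaluation(policy, query, universe):
    """Collect the decisions of every query that extends `query`.

    Naive enumeration: with m missing attribute values there are 2**m
    extensions, which is why this approach does not scale.
    """
    missing = sorted(universe - query)
    decisions = set()
    for r in range(len(missing) + 1):
        for extra in combinations(missing, r):
            decisions.add(policy(query | frozenset(extra)))
    return decisions


universe = frozenset({
    ("role", "employee"),
    ("role", "contractor"),
    ("dept", "hr"),
})
query = frozenset({("role", "employee")})

plain = toy_policy(query)
extended = extended_evaluation(toy_policy, query, universe)
```

The plain evaluation permits the request, but the extended evaluation also contains a deny, which signals that hiding the `contractor` attribute value would change the outcome. The loop visits all `2**m` extensions, so it is only practical for very small attribute universes.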
Funding: support from the Major Program of the National Social Science Foundation of China (No. 21&ZD109).
Abstract: Addressing pollution caused by economic development, especially the overcapacity of polluting enterprises, is crucial for promoting sustainable economic growth. Targeted environmental policies are essential for strengthening environmental constraints on enterprises and enhancing the effectiveness of regulatory instruments. This study focused on the Environmental Credit Evaluation policy, examining its potential to improve capacity utilization and assessing its broader impact on heavily polluting enterprises. It constructed a time-varying difference-in-difference-in-differences model using panel data from 965 industrial enterprises from 2009 to 2019. The findings reveal that, in comparison with their non-heavily polluting counterparts, heavily polluting enterprises subject to the policy demonstrated significant improvements in capacity utilization. Heavily polluting enterprises that experienced a substantial increase in financing costs also exhibited a marked reduction in inefficient investment, without negative effects on innovation investments or output.
Funding: This research is supported by the Major Projects of the National Social Science Foundation of China, "Research on the construction of science and education evaluation information cloud platform and intelligent service based on big data" (Grant No. 19ZDA348).
Abstract: Against the background of the special clean-up action of "Breaking the Five-only", this paper reviews the relevant domestic policies on science and education evaluation. Using the scientometric software CiteSpace and VOSviewer, it presents a visual analysis of domestic research on the "Five-only" and on "science and education evaluation", and sets out the frontier hot spots and trends of science and education evaluation research in China. On this basis, the paper summarizes countermeasures and suggestions on how to "break" the "Five-only" and how to "establish" the "new system of science and education evaluation", in order to provide a reference for the sustainable and healthy development of science and education evaluation in China.