Abstract: Aim: To find a more efficient learning method based on temporal-difference learning for delayed reinforcement learning tasks. Methods: A Q-learning algorithm based on truncated TD(λ), with an adaptive scheme for selecting the λ value, was presented for absorbing Markov decision processes and implemented on computers. Results and Conclusion: Simulations on shortest-path search problems show that using an adaptive λ in Q-learning based on TTD(λ) speeds up its convergence.
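As a rough illustration of the kind of algorithm described above, the sketch below combines tabular Q-learning with eligibility traces and a simple adaptive λ rule. The environment interface, the initial λ, and the adaptation rule (shrinking λ when the TD error is large) are assumptions made for the example, not details taken from the paper.

```python
import numpy as np

def q_lambda_adaptive(env, n_states, n_actions, episodes=500,
                      alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning with eligibility traces and an adaptive lambda.

    Hypothetical env interface: env.reset() -> state,
    env.step(a) -> (next_state, reward, done).
    The lambda rule below (decay lambda when the TD error is large) is only an
    illustrative adaptive scheme, not the paper's selection scheme.
    """
    Q = np.zeros((n_states, n_actions))
    lam = 0.7                                   # initial lambda
    for _ in range(episodes):
        e = np.zeros_like(Q)                    # eligibility traces
        s = env.reset()
        done = False
        while not done:
            a = (np.random.randint(n_actions) if np.random.rand() < eps
                 else int(np.argmax(Q[s])))
            s2, r, done = env.step(a)
            delta = r + (0.0 if done else gamma * np.max(Q[s2])) - Q[s, a]
            e[s, a] = 1.0                       # replacing trace
            Q += alpha * delta * e              # update all traced pairs
            e *= gamma * lam                    # decay traces (not cut at
                                                # exploratory actions: a simplification)
            lam = 1.0 / (1.0 + abs(delta))      # illustrative adaptive lambda
            s = s2
    return Q
```

For a shortest-path task, the learned table can then be turned into a greedy policy by taking the argmax over actions in each state.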
Abstract: Learning style is the most important variable affecting success in English learning. It can both bring a student's learning strengths into full play and compensate for weaknesses. The formation of learning style is related to external factors, including culture. Chinese culture differs greatly from American culture, and given these distinct cultural differences, the learning styles of Chinese and American students also show clear differences.
Abstract: Individual differences in foreign language learning have long been a concern of linguists and language teachers. Research on this subject has been carried out in schools, universities, and other educational institutions, and considerable progress has been made. There are many individual differences that affect the learning of foreign languages, such as intelligence, aptitude, motivation, personality, attitude, …
Abstract: Aim: To investigate a model-free, multi-step, average-reward reinforcement learning algorithm. Methods: By combining the R-learning algorithm with temporal-difference learning (TD(λ)) for average-reward problems, a novel incremental algorithm, called R(λ) learning, was proposed. Results and Conclusion: The proposed algorithm is a natural extension of Q(λ) learning, the multi-step discounted-reward reinforcement learning algorithm, to the average-reward case. Simulation results show that R(λ) learning with intermediate λ values yields a significant performance improvement over simple R-learning.
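The following sketch illustrates one plausible form of such an average-reward update: the discounted TD error is replaced by r - ρ + max_a Q(s', a) - Q(s, a), where ρ tracks the average reward, and eligibility traces are added as in Q(λ). The environment interface and the exact trace handling are assumptions for illustration only, not the paper's definition of R(λ).

```python
import numpy as np

def r_lambda(env, n_states, n_actions, steps=50_000,
             alpha=0.1, beta=0.01, lam=0.5, eps=0.1):
    """Sketch of average-reward R-learning with eligibility traces ("R(lambda)").

    Hypothetical env interface: env.reset() -> state, env.step(a) -> (next_state, reward).
    """
    Q = np.zeros((n_states, n_actions))
    e = np.zeros_like(Q)                            # eligibility traces
    rho = 0.0                                       # average-reward estimate
    s = env.reset()
    for _ in range(steps):
        greedy = np.random.rand() >= eps
        a = int(np.argmax(Q[s])) if greedy else np.random.randint(n_actions)
        s2, r = env.step(a)
        delta = r - rho + np.max(Q[s2]) - Q[s, a]   # average-reward TD error
        e[s, a] = 1.0                               # replacing trace
        Q += alpha * delta * e
        if greedy:
            rho += beta * (r - rho + np.max(Q[s2]) - np.max(Q[s]))
            e *= lam                                # no discount factor in the average-reward case
        else:
            e[:] = 0.0                              # cut traces after exploratory actions
        s = s2
    return Q, rho
```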
Funding: Supported by the National Natural Science Foundation (NNSF) of China (61603196, 61503079, 61520106009, 61533008), the Natural Science Foundation of Jiangsu Province of China (BK20150851), the China Postdoctoral Science Foundation (2015M581842), the Jiangsu Postdoctoral Science Foundation (1601259C), the Nanjing University of Posts and Telecommunications Science Foundation (NUPTSF) (NY215011), the Priority Academic Program Development of Jiangsu Higher Education Institutions, the open fund of the Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education (MCCSE2015B02), and the Research Innovation Program for College Graduates of Jiangsu Province (CXLX1309).
Abstract: The iterated prisoner's dilemma (IPD) is an ideal model for analyzing interactions between agents in complex networks. Since the success of tit-for-tat in Axelrod's tournament, it has attracted wide interest in the development of novel strategies. This paper studies a new adaptive strategy for the IPD in different complex networks, in which agents can learn and adapt their strategies through reinforcement learning. A temporal-difference learning method is applied to design the adaptive strategy and optimize the agents' decision-making process. Previous studies indicated that mutual cooperation is hard to achieve in the IPD. Therefore, three examples based on square-lattice and scale-free networks are provided to show two features of the adaptive strategy. First, mutual cooperation can be achieved by a group of adaptive agents on a scale-free network, and once the evolution has converged to mutual cooperation, it is unlikely to shift. Second, the adaptive strategy earns a better payoff than other strategies on the square-lattice network. Analytical properties are discussed to verify the evolutionary stability of the adaptive strategy.
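A minimal, single-agent illustration of such an adaptive IPD strategy is sketched below: the state is the pair of moves from the previous round, the agent updates action values with a one-step temporal-difference rule, and the opponent plays tit-for-tat. The payoff values and learning parameters are standard textbook choices, not necessarily those used in the paper's network experiments.

```python
import random

# Standard IPD payoffs for the row player: (my move, opponent move) -> my reward
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def td_ipd_agent(rounds=10_000, alpha=0.1, gamma=0.9, eps=0.1):
    """One adaptive agent learning against tit-for-tat via a one-step TD rule.

    State = (my previous move, opponent's previous move). This is an illustrative
    single-opponent setup; the paper's experiments involve many interacting agents
    on lattice and scale-free networks.
    """
    Q = {}                                        # (state, action) -> value
    state = ('C', 'C')                            # assume mutual cooperation at the start
    opp_memory = 'C'                              # tit-for-tat remembers my last move
    for _ in range(rounds):
        if random.random() < eps:
            action = random.choice(['C', 'D'])
        else:
            action = max(['C', 'D'], key=lambda a: Q.get((state, a), 0.0))
        opp_action = opp_memory                   # tit-for-tat copies my last move
        reward = PAYOFF[(action, opp_action)]
        next_state = (action, opp_action)
        best_next = max(Q.get((next_state, a), 0.0) for a in ['C', 'D'])
        td_error = reward + gamma * best_next - Q.get((state, action), 0.0)
        Q[(state, action)] = Q.get((state, action), 0.0) + alpha * td_error
        state, opp_memory = next_state, action
    return Q
```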
Abstract: Learners' beliefs about language learning are of critical importance to the success or failure of any student's efforts to master a foreign language. This paper reviews and summarizes some recent research on learner beliefs about language learning and discusses learner beliefs in the Chinese context. It is suggested that it is premature to characterize learner beliefs in the Chinese context or to conclude that cultural differences have a great influence on them; more research, using a wider variety of methodologies, needs to be conducted.
Abstract: A dynamic-learning-rate Gaussian mixture model (GMM) algorithm is proposed to address the slow adaptation of the GMM in moving-object detection for outdoor surveillance, especially in the presence of sudden illumination changes. The GMM is widely used for detecting objects in complex scenes in intelligent monitoring systems. To solve this problem, a Gaussian mixture model is built for each pixel in the video frame, and the GMM learning rate is adjusted dynamically according to the scene change measured by the frame difference. Experiments show that the proposed adaptive learning rate gives better results than a GMM with a fixed learning rate. The method was tested on a dataset, and tests under sudden natural-light changes show that it achieves higher accuracy and a lower false-alarm rate.
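A simplified sketch of the idea, built on OpenCV's standard MOG2 background subtractor, is shown below: the global difference between consecutive frames is used to raise the learning rate passed to apply() when the scene changes suddenly. The thresholds and the mapping from frame difference to learning rate are illustrative assumptions rather than the paper's exact rule.

```python
import cv2
import numpy as np

def detect_moving_objects(video_path):
    """Background subtraction with a per-frame (dynamic) GMM learning rate.

    Large sudden global changes (e.g., an illumination shift) trigger faster
    model adaptation; small changes keep the adaptation slow. The constants
    below are illustrative, not tuned values from the paper.
    """
    cap = cv2.VideoCapture(video_path)
    mog2 = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                              detectShadows=False)
    prev_gray = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev_gray is None:
            rate = -1.0                                  # let MOG2 pick its default rate
        else:
            diff = cv2.absdiff(gray, prev_gray).mean()   # global scene-change measure
            # small change -> slow adaptation; sudden change -> fast adaptation
            rate = float(np.clip(diff / 50.0, 0.002, 0.5))
        mask = mog2.apply(frame, learningRate=rate)      # foreground mask for this frame
        prev_gray = gray
        yield mask
    cap.release()
```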
Funding: This paper is funded by the Humanity and Social Science Foundation of Hohai University, Changzhou Campus.
Abstract: This paper explores, from an educational and psychological point of view, the language learning strategies of EFL college students in the Chinese context. The subjects are 106 non-English majors from Hohai University's Changzhou Campus, and two questionnaires were used to investigate their language learning strategies. The study finds that students use compensation strategies most frequently, metacognitive strategies less frequently, and social strategies the least. The findings also indicate that different strategies are emphasized by male and female students, and by arts students as opposed to science and engineering students.
Abstract: There is an apparent contrast between children's first language acquisition and adults' second language acquisition, mainly manifested in three aspects: age, the learning process, and motivation. This paper analyzes these three differences in detail and, in light of the current situation, uses the results of the analysis to draw pedagogical implications for second language teaching.
Abstract: The last few decades have seen a phenomenal increase in the quality, diversity and pervasiveness of computer games. The worldwide computer games market is estimated to be worth around USD 21bn annually, and is predicted to continue to grow rapidly. This paper reviews some of the recent developments in applying computational intelligence (CI) methods to games, points out some of the potential pitfalls, and suggests some fruitful directions for future research.
Funding: The corresponding author Weinan Zhang was supported by the "New Generation of AI 2030" Major Project (2018AAA0100900) and the National Natural Science Foundation of China (Grant Nos. 62076161, 61772333, 61632017).
Abstract: In reinforcement learning, policy evaluation aims to predict the long-term value of a state under a given policy. Since high-dimensional representations are becoming more and more common in reinforcement learning, reducing the computational cost has become a significant problem for policy evaluation. Many recent works adopt matrix sketching methods to accelerate least-squares temporal-difference (TD) algorithms and quasi-Newton temporal-difference algorithms. Among these sketching methods, truncated incremental SVD shows better performance because it is stable and efficient. However, the convergence properties of incremental SVD remain open. In this paper, we first show that conventional incremental SVD algorithms can have enormous approximation errors in the worst case. We then propose a variant of incremental SVD with better theoretical guarantees, obtained by shrinking the singular values periodically. Moreover, we employ our improved incremental SVD to accelerate least-squares TD and quasi-Newton TD algorithms. Experimental results verify the correctness and effectiveness of our methods.
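The abstract's key idea, periodically shrinking the retained singular values of an incrementally maintained low-rank factorization, can be illustrated with a frequent-directions-style sketch of a streaming row matrix, as below. This is an assumed, simplified reading of the method; the paper's actual update and its integration with least-squares TD may differ.

```python
import numpy as np

def sketched_rows(row_stream, k, shrink_every=100):
    """Maintain a rank-k sketch of a tall matrix from streaming rows,
    with periodic shrinking of the singular values.

    row_stream yields 1-D feature vectors (e.g., TD feature differences);
    the returned sketch B satisfies B^T B ~= A^T A for the full row matrix A.
    """
    B = None                                    # current (<= 2k) x d sketch
    for t, row in enumerate(row_stream, start=1):
        row = np.asarray(row, dtype=float)[None, :]
        B = row if B is None else np.vstack([B, row])
        if B.shape[0] >= 2 * k or t % shrink_every == 0:
            U, s, Vt = np.linalg.svd(B, full_matrices=False)
            if len(s) > k:
                delta = s[k] ** 2               # shrink by the (k+1)-th squared singular value
                s = np.sqrt(np.maximum(s[:k] ** 2 - delta, 0.0))
                Vt = Vt[:k]
            B = s[:, None] * Vt                 # re-form the compressed sketch
    return B
```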