22 articles found
Intelligent sequential multi-impulse collision avoidance method for non-cooperative spacecraft based on an improved search tree algorithm (Cited by: 1)
1
Authors: Xuyang CAO, Xin NING, Zheng WANG, Suyi LIU, Fei CHENG, Wenlong LI, Xiaobin LIAN. Chinese Journal of Aeronautics, 2025, No. 4, pp. 378-393 (16 pages).
The problem of collision avoidance for non-cooperative targets has received significant attention from researchers in recent years. Non-cooperative targets exhibit uncertain states and unpredictable behaviors, making collision avoidance significantly more challenging than that for space debris. Much existing research focuses on the continuous thrust model, whereas the impulsive maneuver model is more appropriate for long-duration and long-distance avoidance missions. Additionally, it is important to minimize the impact on the original mission while avoiding non-cooperative targets. Moreover, existing avoidance algorithms are computationally complex and time-consuming, especially given the limited computing capability of the on-board computer, posing challenges for practical engineering applications. To overcome these difficulties, this paper makes the following key contributions: (A) a turn-based (sequential decision-making) limited-area impulsive collision avoidance model considering the time delay of precision orbit determination is established for the first time; (B) a novel Selection Probability Learning Adaptive Search-depth Search Tree (SPL-ASST) algorithm is proposed for non-cooperative target avoidance, which improves decision-making efficiency by introducing an adaptive-search-depth mechanism and a neural network into the traditional Monte Carlo Tree Search (MCTS). Numerical simulations confirm the effectiveness and efficiency of the proposed method.
Keywords: Non-cooperative target; Collision avoidance; Limited motion area; Impulsive maneuver model; Search tree algorithm; Neural networks
Limiting theorems for the nodes in binary search trees (Cited by: 1)
2
Authors: LIU Jie, SU Chun, CHEN Yu. Science China Mathematics (indexed: SCIE), 2008, No. 1, pp. 101-114 (14 pages).
We consider three random variables X_n, Y_n and Z_n, which represent the numbers of nodes with 0, 1, and 2 children, respectively, in binary search trees of size n. The expectation and variance of these three random variables are obtained, and it is also shown, by applying the contraction method, that X_n, Y_n and Z_n are all asymptotically normal as n→∞.
Keywords: binary search tree; nodes; law of large numbers; contraction method; limiting distribution; 60F05; 05C80
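The three quantities in the abstract above can be made concrete with a small simulation. The following is a minimal sketch, assuming the usual random-permutation model (keys 0..n-1 inserted in uniformly random order); the helper names `build_bst`, `count_by_children`, and `sample_counts` are illustrative and do not come from the paper. It only counts nodes by number of children and does not reproduce the paper's limit theorems.

```python
import random


class Node:
    """A binary search tree node."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def build_bst(keys):
    """Insert keys one by one into an initially empty BST and return the root."""
    root = None
    for key in keys:
        if root is None:
            root = Node(key)
            continue
        node = root
        while True:
            if key < node.key:
                if node.left is None:
                    node.left = Node(key)
                    break
                node = node.left
            else:
                if node.right is None:
                    node.right = Node(key)
                    break
                node = node.right
    return root


def count_by_children(root):
    """Return (X, Y, Z): numbers of nodes with 0, 1, and 2 children."""
    counts = [0, 0, 0]
    stack = [root] if root is not None else []
    while stack:
        node = stack.pop()
        children = [c for c in (node.left, node.right) if c is not None]
        counts[len(children)] += 1
        stack.extend(children)
    return tuple(counts)


def sample_counts(n, seed=0):
    """Build one random BST of size n (assumed permutation model) and count."""
    rng = random.Random(seed)
    keys = list(range(n))
    rng.shuffle(keys)
    return count_by_children(build_bst(keys))
```

Two identities hold for every binary tree with n ≥ 1 nodes, which makes them a convenient sanity check on any sample: X_n + Y_n + Z_n = n and X_n = Z_n + 1.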
Research on the adaptive hybrid search tree anti-collision algorithm in RFID system (Cited by: 3)
3
Authors: 靳晓芳, Liu Mengxuan, Shao Min, Jin Libiao, Huang Xianglin. High Technology Letters (indexed: EI, CAS), 2016, No. 1, pp. 107-112 (6 pages).
Because tag collisions result in failed transmissions, tag anti-collision is a vital issue in radio frequency identification (RFID) systems. However, the decreases in communication time and increases in throughput achieved so far are very limited. To solve these problems, this paper presents a novel tag anti-collision scheme, namely adaptive hybrid search tree (AHST), which combines two algorithms: adaptive binary-tree disassembly (ABD), which has superior tag identification velocity, and the combination query tree (CQT), which has optimum performance in system throughput and search timeslots. Theoretical analysis and numerical simulations show that the proposed algorithm combines the advantages of both algorithms, improves the system throughput, and reduces the number of search timeslots dramatically.
Keywords: anti-collision; adaptive binary-tree disassembly (ABD); hybrid search tree; discrimination
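For readers unfamiliar with tree-based tag anti-collision, the sketch below simulates a plain query tree, the base scheme that query-tree variants build on. It is not AHST, ABD, or CQT from the paper above. It assumes distinct, equal-length binary tag IDs given as strings, and the function name `query_tree_identify` is illustrative.

```python
from collections import deque


def query_tree_identify(tag_ids):
    """Simulate a plain query tree reader.

    The reader broadcasts a bit-string prefix; every tag whose ID starts with
    that prefix replies. No reply is an idle slot, one reply identifies a tag,
    and several replies collide, so the prefix is split into prefix+'0' and
    prefix+'1'. Returns (identified_ids, slot_counts).
    """
    pending = deque([""])  # the empty prefix queries all tags
    identified = []
    slots = {"idle": 0, "success": 0, "collision": 0}
    while pending:
        prefix = pending.popleft()
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if not responders:
            slots["idle"] += 1
        elif len(responders) == 1:
            slots["success"] += 1
            identified.append(responders[0])
        else:
            slots["collision"] += 1
            pending.append(prefix + "0")
            pending.append(prefix + "1")
    return identified, slots


def throughput(slots):
    """Fraction of slots that identified a tag."""
    total = sum(slots.values())
    return slots["success"] / total if total else 0.0
```

Collision and idle slots are the overhead that schemes such as the one in the abstract aim to reduce; `throughput` gives the usual successes-per-slot measure for comparing variants.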
Blocking optimized SIMD tree search on modern processors (Cited by: 2)
4
Authors: 张倬, 陆宇凡, 沈文枫, 徐炜民, 郑衍衡. Journal of Shanghai University (English Edition) (indexed: CAS), 2011, No. 5, pp. 437-444 (8 pages).
Tree search is a widely used fundamental algorithm. Modern processors provide tremendous computing power by integrating multiple cores, each with a vector processing unit. This paper reviews some studies on exploiting the single instruction multiple data (SIMD) capacity of processors to improve the performance of tree search, and proposes several improvements to reported SIMD tree search algorithms. Based on a blocking tree structure, blocking for memory alignment and dynamic blocking prefetch are proposed to reduce the overhead of memory access. Furthermore, as a form of non-linear loop unrolling, search branch unwinding shows that the number of branches can exceed the data width of SIMD instructions in the SIMD search algorithm. The experiments suggest that the blocking optimized SIMD tree search algorithm can achieve a response speed 1.6 times faster than the un-optimized algorithm.
Keywords: single instruction multiple data (SIMD); tree search; binary search; streaming SIMD extensions (SSE); Cell broadband engine (BE)
Planning, monitoring and replanning techniques for handling abnormity in HTN-based planning and execution
5
Authors: KANG Kai, CHENG Kai, SHAO Tianhao, ZHANG Hongjun, ZHANG Ke. Journal of Systems Engineering and Electronics (indexed: SCIE, CSCD), 2024, No. 5, pp. 1264-1275 (12 pages).
A framework that integrates planning, monitoring and replanning techniques is proposed. It can devise the best solution based on the current state according to specific objectives and properly deal with the influence of abnormity on plan execution. The framework consists of three parts: the hierarchical task network (HTN) planner based on Monte Carlo tree search (MCTS), hybrid plan monitoring based on forward and backward, and norm-based replanning method selection. The HTN planner based on MCTS selects the optimal method for an HTN compound task through pre-exploration. Based on specific objectives, it can identify the best solution to the current problem. The hybrid plan monitoring can detect the influence of abnormity on the effect of an executed action and on the premise of an unexecuted action, thus triggering replanning. The norm-based replanning selection method can measure the difference between the expected state and the actual state, and then select the best replanning algorithm. The experimental results reveal that our method can effectively deal with the influence of abnormity on the implementation of the plan and achieve the target task in an optimal way.
Keywords: hierarchical task network; Monte Carlo tree search (MCTS); planning; execution; abnormity
A Physical Layer Network Coding Based Tag Anti-Collision Algorithm for RFID System (Cited by: 3)
6
Authors: Cuixiang Wang, Xing Shao, Yifan Meng, Jun Gao. Computers, Materials & Continua (indexed: SCIE, EI), 2021, No. 1, pp. 931-945 (15 pages).
In an RFID (Radio Frequency IDentification) system, when multiple tags are in the operating range of one reader and send their information to the reader simultaneously, the signals of these tags are superimposed in the air, which results in a collision and degrades the tag identification efficiency. To improve the identification efficiency of multiple tags under collision, a physical layer network coding based binary search tree algorithm (PNBA) is proposed in this paper. PNBA pushes the conflicting signal information of multiple tags, which is discarded by the traditional anti-collision algorithm, into a stack. In addition, PNBA exploits physical layer network coding to obtain unread tag information by decoding the conflicting information in the stack. Therefore, PNBA reduces the number of interactions between reader and tags, and improves the tag identification efficiency. Theoretical analysis and simulation results using MATLAB demonstrate that PNBA reduces the number of readings and improves RFID identification efficiency. In particular, when the number of tags to be identified is 100, the average number of readings required by PNBA is 83% lower than that of the basic binary search tree algorithm and 43% lower than that of the reverse binary search tree algorithm, and its reading efficiency reaches 0.93.
Keywords: Radio frequency identification (RFID); tag anti-collision algorithm; physical layer network coding; binary search tree algorithm
An intelligent task offloading algorithm (iTOA) for UAV edge computing network (Cited by: 8)
7
Authors: Jienan Chen, Siyu Chen, Siyu Luo, Qi Wang, Bin Cao, Xiaoqian Li. Digital Communications and Networks (indexed: SCIE), 2020, No. 4, pp. 433-443 (11 pages).
The Unmanned Aerial Vehicle (UAV) has emerged as a promising technology for supporting human activities such as target tracking, disaster rescue, and surveillance. However, these tasks require a large computation load of image or video processing, which imposes enormous pressure on the UAV computation platform. To solve this issue, in this work, we propose an intelligent Task Offloading Algorithm (iTOA) for UAV edge computing networks. Compared with existing methods, iTOA is able to perceive the network's environment intelligently to decide the offloading action based on deep Monte Carlo Tree Search (MCTS), the core algorithm of AlphaGo. MCTS simulates the offloading decision trajectories to acquire the best decision by maximizing the reward, such as lowest latency or power consumption. To accelerate the search convergence of MCTS, we also propose a splitting Deep Neural Network (sDNN) to supply the prior probability for MCTS. The sDNN is trained in a self-supervised learning manner. Here, the training data set is obtained from iTOA itself as its own teacher. Compared with game theory and greedy search-based methods, the proposed iTOA improves service latency performance by 33% and 60%, respectively.
Keywords: Unmanned aerial vehicles (UAVs); Mobile edge computing (MEC); Intelligent task offloading algorithm (iTOA); Monte Carlo tree search (MCTS); Deep reinforcement learning; Splitting deep neural network (sDNN)
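The abstract above says only that a neural network supplies prior probabilities to MCTS. One common way such priors enter the search is the PUCT selection rule used in AlphaGo-family systems, sketched below. Whether iTOA uses exactly this rule is an assumption; the function name `puct_select`, the constant `c_puct`, and the `+ 1` under the square root (an implementation convenience so that priors matter before any visit) are illustrative choices.

```python
import math


def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Return the index of the child action with the highest PUCT score.

    score_i = Q_i + c_puct * P_i * sqrt(N_total + 1) / (1 + N_i)
    where Q_i is the mean simulated reward of child i (0 if unvisited),
    P_i is its prior probability, and N_i is its visit count.
    """
    total_visits = sum(visit_counts)
    best_index, best_score = 0, -math.inf
    for i, (prior, visits, value_sum) in enumerate(
        zip(priors, visit_counts, value_sums)
    ):
        q = value_sum / visits if visits > 0 else 0.0
        u = c_puct * prior * math.sqrt(total_visits + 1) / (1 + visits)
        score = q + u
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

With no visits the rule follows the prior, and as visits accumulate the exploration bonus shrinks so the measured mean reward dominates. This is the sense in which a good prior can speed up search convergence.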
A geospatial service composition approach based on MCTS with temporal-difference learning
8
Authors: Zhuang Can, Guo Mingqiang, Xie Zhong. High Technology Letters (indexed: EI, CAS), 2021, No. 1, pp. 17-25 (9 pages).
With the complexity of the composition process and the rapid growth of candidate services, realizing optimal or near-optimal service composition is an urgent problem. Currently, the static service composition chain is rigid and cannot be easily adapted to the dynamic Web environment. To address these challenges, the geographic information service composition (GISC) problem is modeled as a sequential decision-making task. In addition, the Markov decision process (MDP), as a universal model for the planning problem of agents, is used to describe the GISC problem. Then, to achieve self-adaptivity and optimization in a dynamic environment, a novel approach that integrates Monte Carlo tree search (MCTS) and a temporal-difference (TD) learning algorithm is proposed. The concrete services of abstract services are determined with optimal policies and adaptive capability at runtime, based on the environment and the status of component services. A simulation experiment is performed to demonstrate the effectiveness and efficiency of the approach in terms of learning quality and performance.
Keywords: geospatial service composition; reinforcement learning (RL); Markov decision process (MDP); Monte Carlo tree search (MCTS); temporal-difference (TD) learning
Game Tree Search-based Impulsive Orbital Pursuit–Evasion Game with Limited Actions (Cited by: 2)
9
Authors: Wenyuan Xie, Liran Zhao, Zhaohui Dang. Space (Science & Technology), 2024, No. 1, pp. 631-643 (13 pages).
This paper focuses on the impulsive orbital pursuit–evasion game (OPEG) with limited action sets for the pursuer and evader. Initially, a mathematical model is developed by combining game theory and orbital dynamics, forming a finite-round impulsive OPEG problem. The problem is then formulated as a bilateral optimization problem, employing a minimum-maximum optimization index based on terminal distance. To tackle this problem, an algorithm based on game tree search is designed, enabling the determination of the optimal pursuit–evasion strategy with limited action sets. Additionally, we explore the influence of the initial pursuing orientation on OPEG. The optimal initial pursuit orientation is analytically derived using relative motion dynamics under uncontrolled conditions. Furthermore, considering factors such as the initial status of the pursuit spacecraft, initial relative distance, transfer time, and maneuverability, the impulsive OPEG problem with limited action sets is numerically solved using game tree search. The findings showcase the efficacy of game tree search in addressing impulsive OPEG problems with limited action sets. The study also demonstrates that the selection of the initial pursuing orientation at the start of the game plays a crucial role in increasing the success rate of pursuit. These findings have important implications for future practical engineering applications.
Keywords: limited action sets; bilateral optimization problem; game theory; orbital dynamics; impulsive orbital pursuit–evasion game (OPEG); mathematical model; game tree search; optimal pursuit–evasion strategy
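The min-max formulation over a finite number of rounds with limited action sets can be illustrated with a toy game tree search. The sketch below replaces orbital dynamics with one-dimensional straight-line kinematics, which is purely an assumption for illustration. The pursuer picks an impulse first in each round, the evader answers knowing that choice, and the payoff is the terminal distance. The function name `minimax_terminal_distance` and the state layout are hypothetical.

```python
def minimax_terminal_distance(p_pos, p_vel, e_pos, e_vel, rounds, actions):
    """Exhaustive game tree search for a finite-round pursuit-evasion toy game.

    Each round, both players add one impulse from `actions` to their velocity
    and then coast for one time unit. The pursuer minimizes and the evader
    maximizes the distance after the last round.
    Returns (game_value, pursuer_first_action).
    """
    if rounds == 0:
        return abs(p_pos - e_pos), None
    best_value, best_action = float("inf"), None
    for dp in actions:  # pursuer: minimize over its limited action set
        worst = float("-inf")
        for de in actions:  # evader: maximize, responding to dp
            new_p_vel = p_vel + dp
            new_e_vel = e_vel + de
            value, _ = minimax_terminal_distance(
                p_pos + new_p_vel, new_p_vel,
                e_pos + new_e_vel, new_e_vel,
                rounds - 1, actions,
            )
            worst = max(worst, value)
        if worst < best_value:
            best_value, best_action = worst, dp
    return best_value, best_action
```

The tree has |actions|^(2·rounds) leaves, so exhaustive search of this kind is only practical because the action sets and the number of rounds are small; pruning would be the natural next step for larger instances.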
Two-Dimensional Rectangular Stock Cutting Problem and Solution Methods
10
Authors: Zhao Hui, Yu Liang, Ning Tao, Xi Ping (School of Mechanical Engineering and Automation, Beijing University of Aeronautics and Astronautics, Beijing 100083, China; Manufacturing and Production). Computer Aided Drafting, Design and Manufacturing, 2001, No. 2, pp. 1-7 (7 pages).
Optimal layout of rectangular stock cutting is still in great demand from industry for diversified applications. This paper introduces four basic solution methods for the problem: linear programming, dynamic programming, tree search, and the heuristic approach. A prototype of application software is developed to verify the pros and cons of the various approaches.
Keywords: rectangular stock cutting; linear programming; dynamic programming; tree search; heuristic
Jiu fusion artificial intelligence (JFA): a two-stage reinforcement learning model with hierarchical neural networks and human knowledge for Tibetan Jiu chess
11
Authors: Xiali LI, Xiaoyu FAN, Junzhi YU, Zhicheng DONG, Xianmu CAIRANG, Ping LAN. Frontiers of Information Technology & Electronic Engineering, 2025, No. 10, pp. 1969-1983 (15 pages).
Tibetan Jiu chess, recognized as a national intangible cultural heritage, is a complex game comprising two distinct phases: the layout phase and the battle phase. Improving the performance of deep reinforcement learning (DRL) models for Tibetan Jiu chess is challenging, especially given the constraints of hardware resources. To address this, we propose a two-stage model called JFA, which incorporates hierarchical neural networks and knowledge-guided techniques. The model includes two sub-models: the strategic layout model (SLM) for the layout phase and the hierarchical battle model (HBM) for the battle phase. Both sub-models use similar network structures and employ parallel Monte Carlo tree search (MCTS) methods for independent self-play training. HBM is structured as a hierarchical neural network, with the upper network selecting movement and jump capturing actions and the lower network handling square capturing actions. Human knowledge-based auxiliary agents are introduced to assist SLM and HBM, simulating the entire game and providing reward signals based on square capturing or victory outcomes. Additionally, within the HBM, we propose two human knowledge-based pruning methods that prune parallel MCTS and capture actions in the lower network. In experiments against a layout model using the AlphaZero method, SLM achieves a 74% win rate, with the decision-making time reduced to approximately 1/147 of the time required by the AlphaZero model. SLM also won first place at the 2024 China National Computer Game Tournament. HBM achieves a 70% win rate when playing against other Tibetan Jiu chess models. When used together, SLM and HBM in JFA achieve an 81% win rate, comparable to the level of a human amateur 4-dan player. These results demonstrate that JFA effectively enhances artificial intelligence (AI) performance in Tibetan Jiu chess.
Keywords: Games; Reinforcement learning; Tibetan Jiu chess; Separate two-stage model; Self-play; Hierarchical neural network; Parallel Monte Carlo tree search
ADAPTIVE CGF COMMANDER BEHAVIOR MODELING THROUGH HTN GUIDED MONTE CARLO TREE SEARCH (Cited by: 7)
12
Authors: Xiao Xu, Mei Yang, Ge Li. Journal of Systems Science and Systems Engineering (indexed: SCIE, EI, CSCD), 2018, No. 2, pp. 231-249 (19 pages).
Improving the intelligence of virtual entities is an important issue in the construction of Computer Generated Forces (CGFs). Some traditional approaches try to achieve this by specifying how entities should react to predefined conditions, which is not suitable for complex and dynamic environments. This paper aims to apply Monte Carlo Tree Search (MCTS) to the behavior modeling of a CGF commander. Through look-ahead reasoning, the model generates adaptive decisions to direct all the troops in combat. Our main work is to formulate the tree model through state and action abstraction, and to extend its expansion process to handle simultaneous and durative moves. We also employ Hierarchical Task Network (HTN) planning to guide the search, thus enhancing the search efficiency. The final implementation is tested in an infantry combat simulation where a company commander needs to control three platoons to assault and clear enemies within defined areas. Comparative results from a series of experiments demonstrate that the HTN guided MCTS commander can outperform other commanders following fixed strategies.
Keywords: Monte Carlo tree search; Hierarchical task network; Computer generated force; Behavior modeling
Concurrent Manipulation of Expanded AVL Trees
13
Authors: 章寅, 许卓群. Journal of Computer Science & Technology (indexed: SCIE, EI, CSCD), 1998, No. 4, pp. 325-336 (12 pages).
The concurrent manipulation of an expanded AVL tree (EAVL tree) is considered in this paper. The presented system can support any number of concurrent processes that perform searching, insertion and deletion on the tree. Simulation results indicate the high performance of the system. Elaborate techniques are used to achieve such a system, which is unavailable based on any known algorithms. The methods developed in this paper may provide new insights into other problems in the area of concurrent search structure manipulation.
Keywords: AVL tree; data structure; binary search tree; concurrent algorithm; concurrency control; locking protocol
Fast Tree Search for A Triangular Lattice Model of Protein Folding
14
Authors: Xiaomei Li, Nengchao Wang. Genomics, Proteomics & Bioinformatics (indexed: SCIE, CAS, CSCD), 2004, No. 4, pp. 245-252 (8 pages).
Using a triangular lattice model to study the designability of protein folding, we overcame the parity problem of the previous cubic lattice model and enumerated all the sequences and compact structures on a simple two-dimensional triangular lattice model of size 4+5+6+5+4. We used two types of amino acids, hydrophobic and polar, to make up the sequences, and obtained 2^(23)+2^(12) different sequences, excluding the reverse symmetry sequences. The total number of distinct compact structures was 219,093, excluding reflection symmetry in the self-avoiding path of the length-24 triangular lattice model. Based on this model, we applied a fast search algorithm by constructing a cluster tree. The algorithm decreased the computation by computing the objective energy of non-leaf nodes. The parallel experiments showed that the fast tree search algorithm yielded an exponential speed-up in the model of size 4+5+6+5+4. Designability analysis was performed to understand the search result.
Keywords: triangular lattice model; protein folding; fast search tree; designability
VLSI implementation of MIMO detection for 802.11n using a novel adaptive tree search algorithm
15
Authors: 尧横, 鉴海防, 周立国, 石寅. Journal of Semiconductors (indexed: EI, CAS, CSCD), 2013, No. 10, pp. 107-113 (7 pages).
A 4×4 64-QAM multiple-input multiple-output (MIMO) detector is presented for application in an IEEE 802.11n wireless local area network. The detector implements a novel adaptive tree search (ATS) algorithm, and multiple ATS cores need to be instantiated to meet the wideband requirement of the 802.11n standard. Both the ATS algorithm and the architectural considerations are explained. The latency of the detector is 0.75 μs, and the detector has a gate count of 848 k with a total of 19 parallel ATS cores. Each ATS core runs at 67 MHz. Measurement results show that, compared with the floating-point ATS algorithm, the fixed-point implementation incurs a loss of 0.9 dB at a BER of 10^-3.
Keywords: multiple-input multiple-output; adaptive tree search; sphere decoder; fixed complexity sphere decoder; 802.11n
Reinforcement learning and A* search for the unit commitment problem
16
Authors: Patrick de Mars, Aidan O'Sullivan. Energy and AI, 2022, No. 3, pp. 172-181 (10 pages).
Previous research has combined model-free reinforcement learning with model-based tree search methods to solve the unit commitment problem with stochastic demand and renewables generation. This approach was limited to shallow search depths and suffered from significant variability in run time across problem instances with varying complexity. To mitigate these issues, we extend this methodology to more advanced search algorithms based on A* search. First, we develop a problem-specific heuristic based on priority list unit commitment methods and apply this in Guided A* search, reducing run time by up to 94% with negligible impact on operating costs. In addition, we address the run time variability issue by employing a novel anytime algorithm, Guided IDA*, replacing the fixed search depth parameter with a time budget constraint. We show that Guided IDA* mitigates the run time variability of previous guided tree search algorithms and enables further operating cost reductions of up to 1%.
Keywords: Unit commitment; Reinforcement learning; Tree search; Power systems
TibetanGoTinyNet: a lightweight U-Net style network for zero learning of Tibetan Go (Cited by: 1)
17
Authors: Xiali LI, Yanyin ZHANG, Licheng WU, Yandong CHEN, Junzhi YU. Frontiers of Information Technology & Electronic Engineering (indexed: SCIE, EI, CSCD), 2024, No. 7, pp. 924-937 (14 pages).
The game of Tibetan Go faces a scarcity of expert knowledge and research literature. Therefore, we study the zero learning model of Tibetan Go under limited computing power resources and propose TibetanGoTinyNet, a novel scale-invariant, U-Net style, two-headed output lightweight network. Lightweight convolutional neural networks and a capsule structure are applied to the encoder and decoder of TibetanGoTinyNet to reduce the computational burden and achieve better feature extraction results. Several autonomous self-attention mechanisms are integrated into TibetanGoTinyNet to capture the Tibetan Go board's spatial and global information and select important channels. The training data are generated entirely from self-play games. TibetanGoTinyNet achieves a 62%–78% winning rate against four other U-Net style models: Res-UNet, Res-UNet Attention, Ghost-UNet, and Ghost Capsule-UNet. It also achieves a 75% winning rate in the ablation experiments on the attention mechanism with embedded positional information. The model saves about 33% of the training time with a 45%–50% winning rate for different Monte–Carlo tree search (MCTS) simulation counts when migrated from 9×9 to 11×11 boards. Code for our model is available at https://github.com/paulzyy/TibetanGoTinyNet.
Keywords: Zero learning; Tibetan Go; U-Net; Self-attention mechanism; Capsule network; Monte-Carlo tree search
A Monte Carlo Neural Fictitious Self-Play approach to approximate Nash Equilibrium in imperfect-information dynamic games (Cited by: 5)
18
Authors: Li ZHANG, Yuxuan CHEN, Wei WANG, Ziliang HAN, Shijian Li, Zhijie PAN, Gang PAN. Frontiers of Computer Science (indexed: SCIE, EI, CSCD), 2021, No. 5, pp. 137-150 (14 pages).
Solving the optimization problem to approach a Nash Equilibrium point plays an important role in imperfect-information games, e.g., StarCraft and poker. Neural Fictitious Self-Play (NFSP) is an effective algorithm that learns an approximate Nash Equilibrium of imperfect-information games purely from self-play without prior domain knowledge. However, it needs to train a neural network in an off-policy manner to approximate the action values. For games with large search spaces, the training may suffer from unnecessary exploration and sometimes fails to converge. In this paper, we propose a new Neural Fictitious Self-Play algorithm that combines Monte Carlo tree search with NFSP, called MC-NFSP, to improve the performance in real-time zero-sum imperfect-information games. With experiments and empirical analysis, we demonstrate that the proposed MC-NFSP algorithm can approximate a Nash Equilibrium in games with large-scale search depth, whereas NFSP cannot. Furthermore, we develop an Asynchronous Neural Fictitious Self-Play framework (ANFSP). It uses an asynchronous and parallel architecture to collect game experience and improve both the training efficiency and policy quality. Experiments with games with hidden state information (Texas Hold'em) and FPS (first-person shooter) games demonstrate the effectiveness of our algorithms.
Keywords: approximate Nash Equilibrium; imperfect-information games; dynamic games; Monte Carlo tree search; Neural Fictitious Self-Play; reinforcement learning
Rich-text document styling restoration via reinforcement learning (Cited by: 1)
19
Authors: Hongwei LI, Yingpeng HU, Yixuan CAO, Ganbin ZHOU, Ping LUO. Frontiers of Computer Science (indexed: SCIE, EI, CSCD), 2021, No. 4, pp. 93-103 (11 pages).
Richly formatted documents, such as financial disclosures, scientific articles, and government regulations, exist widely on the Web. However, since most of these documents are only for public reading, the styling information inside them is usually missing, making them improper or even burdensome to display and edit in different formats and platforms. In this study we formulate the task of document styling restoration as an optimization problem, which aims to identify the styling settings on the document elements, e.g., lines, table cells, and text, so that rendering with the output styling settings results in a document in which each element holds (closely) the same position as in the original document. Considering that each styling setting is a decision, this problem can be transformed into a multi-step decision-making task over all the document elements, and then be solved by reinforcement learning. Specifically, Monte-Carlo Tree Search (MCTS) is leveraged to explore the different styling settings, and the policy function is learnt under the supervision of the delayed rewards. As a case study, we restore the styling information inside tables, where structural and functional data in the documents are usually presented. Experiments show that our best reinforcement method successfully restores the stylings in 87.65% of the tables, a 25.75% absolute improvement over the greedy method. We also discuss the tradeoff between the inference time and the restoration success rate, and argue that although the reinforcement methods cannot be used in real-time scenarios, they are suitable for offline tasks with high quality requirements. Finally, this model has been applied in a PDF parser to support cross-format display.
Keywords: styling restoration; Monte-Carlo tree search; reinforcement learning; richly formatted documents; tables
Multicommodity Flow Modeling for the Data Transmission Scheduling Problem in Navigation Satellite Systems (Cited by: 1)
20
Authors: Jungang Yan, Lining Xing, Chao Li, Zhongshan Zhang. Complex System Modeling and Simulation, 2021, No. 3, pp. 232-241 (10 pages).
Introducing InterSatellite Links (ISLs) is a major trend in new-generation Global Navigation Satellite Systems (GNSSs). Data transmission scheduling is a crucial problem in the study of ISL management. The existing research on intersatellite data transmission has not considered the capacities of ISL bandwidth. Thus, the current study is the first to describe the intersatellite data transmission scheduling problem with capacity restrictions in GNSSs. A model conversion strategy is designed to model this problem as a length-bounded single-path multicommodity flow problem. An integer programming model is constructed to minimize the maximal sum of flows on each intersatellite edge; this minimization is equivalent to minimizing the maximal occupied ISL bandwidth. An iterated tree search algorithm is proposed to solve the problem, and two ranking rules are designed to guide the search. Experiments based on the BeiDou satellite constellation are designed, and the results demonstrate the effectiveness of the proposed model and algorithm.
Keywords: intersatellite link; navigation satellite system; data transmission; multicommodity flow; tree search