Funding: Supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021MF049), and the Joint Fund of the Natural Science Foundation of Shandong Province (Grant Nos. ZR2022LLZ012 and ZR2021LLZ001).
Abstract: Quantum error correction, a technique that relies on the principle of redundancy to encode logical information into additional qubits to better protect the system from noise, is necessary for designing a viable quantum computer. The XYZ^(2) code is a new topological stabilizer code defined on a cellular lattice: it is implemented on a hexagonal lattice of qubits and encodes logical qubits with the help of stabilizer measurements of weight six and weight two. However, topological stabilizer codes in cellular-lattice quantum systems suffer from the detrimental effects of noise due to interaction with the environment, and several decoding approaches have been proposed to address this problem. Here, we propose a state-attention-based reinforcement learning decoder for XYZ^(2) codes, which enables the decoder to focus more accurately on the information related to the current decoding position. Under optimized conditions, the error-correction accuracy of our reinforcement learning decoder reaches 83.27% under the depolarizing noise model, and we measure thresholds of 0.18856 and 0.19043 for XYZ^(2) codes at code distances 3–7 and 7–11, respectively. Our study provides directions and ideas for applying decoding schemes that combine reinforcement learning with attention mechanisms to other topological quantum error-correcting codes.
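The "focus on the current decoding position" idea can be illustrated with a minimal scaled-dot-product attention sketch. This is not the paper's learned network; the feature vectors and dimensions are invented for illustration. It only shows how softmax weights let a decoder concentrate on the syndrome sites most relevant to the position being decoded:

```python
import math

def attention_weights(query, keys):
    """Scaled dot-product attention over syndrome-site features.

    query: feature vector for the current decoding position.
    keys:  one feature vector per syndrome site.
    Returns a softmax distribution over sites; a larger weight means the
    decoder attends more strongly to that site.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

A site whose features align with the current position's query receives the largest weight, which is the "focusing" behavior the abstract attributes to the state-attention decoder.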
Funding: Project supported by the Natural Science Foundation of Shandong Province, China (Grant Nos. ZR2021MF049, ZR2022LLZ012, and ZR2021LLZ001).
Abstract: Quantum error correction technology is an important method for eliminating errors that arise during the operation of quantum computers. To address the influence of errors on physical qubits, we propose an approximate error-correction scheme that performs dimension-mapping operations on surface codes. The scheme exploits the topological properties of error-correcting codes to map the surface code into three dimensions. Compared with previous error-correction schemes, the present three-dimensional surface code exhibits good scalability owing to its higher redundancy and more efficient error-correction capability. By reducing the number of ancilla qubits required for error correction, the approach saves measurement space and reduces resource-consumption costs. To improve decoding efficiency and handle the correlation between the surface-code stabilizers and the 3D space after dimension mapping, we employ a reinforcement learning (RL) decoder based on deep Q-learning, which identifies the optimal syndrome more quickly and achieves better thresholds through conditional optimization. Compared with minimum-weight perfect-matching decoding, the threshold of the RL-trained model reaches 0.78%, a 56% improvement, enabling large-scale fault-tolerant quantum computation.
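The core of deep Q-learning is the temporal-difference update; the paper's deep network replaces a Q-table with a neural net over syndrome states. A minimal tabular sketch (state and action names here are invented, not the paper's encoding) looks like this:

```python
def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.95):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    Q is a plain dict keyed by (state, action); unseen entries default
    to 0.0 so the sketch stays self-contained.
    """
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return Q[(state, action)]
```

In the decoder setting, a state would encode the current syndrome, an action a correcting Pauli operator, and the reward whether the syndrome was cleared without a logical error.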
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62372247 and 12441103, the open research fund of the National Mobile Communications Research Laboratory, Southeast University, under Grant No. 2025D01, and the Open Project of Guangxi Provincial Key Laboratory under Grant No. MIMS22-01.
Abstract: Recently, linear complementary dual (LCD) codes have garnered substantial interest within coding-theory research due to their diverse applications and favorable attributes. This paper focuses on the construction of binary and ternary LCD codes using curiosity-driven reinforcement learning (RL). By establishing reward functions and devising well-reasoned mappings from actions to states, it facilitates the successful synthesis of binary and ternary LCD codes. Experimental results indicate that LCD codes constructed with RL exhibit slightly superior error-correction performance compared with conventionally constructed LCD codes and with those developed via standard RL methodologies. The paper introduces novel binary and ternary LCD codes with enhanced minimum-distance bounds. Finally, it shows how random network distillation helps agents explore beyond local optima, enhancing overall model performance without compromising convergence.
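The RL agent itself is not sketched here, but the property its reward must certify can be checked directly. By Massey's criterion, a binary code whose generator matrix G has full row rank is LCD if and only if G·Gᵀ is invertible over GF(2). A self-contained check (the example matrices are invented, not codes from the paper):

```python
def is_lcd_binary(G):
    """Massey's criterion for the binary LCD property.

    G: full-row-rank generator matrix as a list of 0/1 rows.
    The code is LCD iff M = G * G^T is nonsingular over GF(2), tested
    here by Gaussian elimination mod 2 (full rank <=> invertible).
    """
    k, n = len(G), len(G[0])
    M = [[sum(G[i][t] & G[j][t] for t in range(n)) % 2 for j in range(k)]
         for i in range(k)]
    rank = 0
    for col in range(k):
        pivot = next((r for r in range(rank, k) if M[r][col]), None)
        if pivot is None:
            continue                      # no pivot in this column
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(k):
            if r != rank and M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank == k
```

For instance, the [3,1] repetition code is LCD (its generator 111 is not self-orthogonal), while the [2,1] code generated by 11 is not, since 11·11 = 0 over GF(2).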
Funding: National Natural Science Foundation of China (No. 69672007).
Abstract: On the basis of Gersho's asymptotic theory, the isodistortion principle of vector clustering is discussed, and a competitive and selective learning (CSL) method is proposed that avoids local optima and gives excellent results when applied to clustering for HMM models. By combining parallel, self-organizational hierarchical neural networks (PSHNN) to reclassify the scores output by the HMM for every form, the CSL speech recognition rate is noticeably elevated.
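The competitive part of such a method can be sketched as a winner-take-all codebook update. This is a generic illustration under invented data, and the paper's selective mechanism (reassigning under-used codewords to equalize distortion) is omitted for brevity:

```python
def competitive_update(codebook, x, lr=0.5):
    """One winner-take-all competitive-learning step.

    The codeword nearest to sample x (the 'winner') moves toward x by a
    fraction lr of the gap; all other codewords stay fixed.
    Returns the index of the winning codeword.
    """
    def dist2(c):
        return sum((ci - xi) ** 2 for ci, xi in zip(c, x))
    winner = min(range(len(codebook)), key=lambda i: dist2(codebook[i]))
    codebook[winner] = [ci + lr * (xi - ci)
                        for ci, xi in zip(codebook[winner], x)]
    return winner
```

Repeated over a training set with a decaying learning rate, this drives the codebook toward a local minimum of the clustering distortion, which is exactly the local-optimum trap the selective step is designed to escape.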
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61801119.
Abstract: Scalable video coding (SVC) has been widely used in video-on-demand (VOD) services to efficiently satisfy users' differing video-quality requirements and to dynamically adjust the video stream to time-variant wireless channels. Under the 5G network structure, we consider a cooperative caching scheme with SVC inside each cluster to economically utilize the limited caching storage. A novel multi-agent deep reinforcement learning (MADRL) framework is proposed to jointly optimize video access delay and user satisfaction, where an aggregation node is introduced to help individual agents obtain global observations and overall system rewards. Moreover, to cope with the large action space caused by the large numbers of videos and users, a dimension-decomposition method is embedded into the neural network of each agent, which greatly reduces the computational complexity and memory cost of the reinforcement learning. Experimental results show that: 1) the proposed value-decomposed dimensional network (VDDN) algorithm achieves an obvious performance gain over traditional MADRL; and 2) the VDDN algorithm can handle an extremely large action space and converges quickly with low computational complexity.
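The dimension-decomposition idea can be made concrete with a toy sketch (this is not the paper's VDDN network). Instead of scoring every joint action in a product space of size N^D, each action dimension keeps its own Q-vector and is maximized independently, so only N·D values are evaluated:

```python
def decomposed_argmax(q_per_dim):
    """Select a joint action by maximizing each action dimension on its own.

    q_per_dim: one list of Q-values per action dimension (e.g., one per
    cached video or per user). With D dimensions of size N each, the
    joint space holds N**D actions, but this evaluates only N*D values,
    which is the complexity saving that value decomposition exploits.
    """
    return [max(range(len(q)), key=lambda a: q[a]) for q in q_per_dim]
```

The trade-off is that per-dimension maximization is exact only when the joint value decomposes (approximately) additively across dimensions, which is the structural assumption value-decomposition methods make.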
Funding: This work was supported by the National Key Research and Development Program of China (2017YFB1002104), the National Natural Science Foundation of China (Grant No. U1811461), and the Innovation Program of the Institute of Computing Technology, CAS.
Abstract: Richly formatted documents, such as financial disclosures, scientific articles, and government regulations, exist widely on the Web. However, since most of these documents are intended only for public reading, the styling information inside them is usually missing, making them improper or even burdensome to display and edit across different formats and platforms. In this study we formulate the task of document styling restoration as an optimization problem that aims to identify the styling settings on document elements, e.g., lines, table cells, and text, so that rendering with the output styling settings produces a document in which each element holds (closely) the exact position of its counterpart in the original document. Considering that each styling setting is a decision, this problem can be transformed into a multi-step decision-making task over all the document elements and then solved by reinforcement learning. Specifically, Monte-Carlo Tree Search (MCTS) is leveraged to explore the different styling settings, and the policy function is learned under the supervision of delayed rewards. As a case study, we restore the styling information inside tables, where the structural and functional data in documents are usually presented. Experiments show that our best reinforcement method successfully restores the stylings in 87.65% of the tables, a 25.75% absolute improvement over the greedy method. We also discuss the tradeoff between inference time and restoration success rate, and argue that although the reinforcement methods cannot be used in real-time scenarios, they are suitable for offline tasks with high quality requirements. Finally, the model has been applied in a PDF parser to support cross-format display.
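The selection phase of MCTS is typically driven by an upper-confidence rule. The sketch below shows standard UCB1 child selection with an invented exploration constant; it is a generic building block, not the paper's tree policy:

```python
import math

def ucb_select(child_visits, child_values, total_visits, c=1.4):
    """UCB1 child selection for the MCTS selection phase.

    child_visits: visit count per child; child_values: accumulated value
    per child. Balances exploitation (mean value) against exploration
    (visit bonus); any unvisited child is returned immediately so every
    styling option gets tried at least once.
    """
    best, best_score = None, float("-inf")
    for i, (n, v) in enumerate(zip(child_visits, child_values)):
        if n == 0:
            return i                      # expand unvisited child first
        score = v / n + c * math.sqrt(math.log(total_visits) / n)
        if score > best_score:
            best, best_score = i, score
    return best
```

In the styling-restoration setting, each child would correspond to one candidate styling setting for the current document element, and the delayed reward would be the positional match of the rendered result.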
Funding: National Natural Science Foundation of China under Grant No. 69674023.
Abstract: A stable control scheme for a class of unknown nonlinear systems is presented. The control architecture is composed of two parts: a fuzzy sliding-mode controller (FSMC) drives the state to a designed switching hyperplane, and a reinforcement self-organizing fuzzy CPN (RSOFCPN), acting as a feedforward compensator, reduces the influence of system uncertainties. Simulation results demonstrate the effectiveness of the proposed control scheme.
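The sliding-mode part of such a scheme can be sketched for a second-order error dynamic. The gains c, k and boundary-layer width phi below are invented for illustration, and the paper's fuzzy rule base (which would adapt the control action near the surface) is not reproduced:

```python
def smc_control(e, e_dot, c=2.0, k=5.0, phi=0.1):
    """Boundary-layer sliding-mode control law.

    s = c*e + e_dot defines the switching hyperplane; a saturation
    function replaces sign(s) inside a layer of width phi to reduce
    chattering. Returns the control action u = -k * sat(s / phi).
    """
    s = c * e + e_dot
    sat = max(-1.0, min(1.0, s / phi))
    return -k * sat
```

Far from the surface the law acts like a relay of magnitude k driving s toward zero; inside the boundary layer it behaves linearly, which is where a fuzzy controller typically smooths and tunes the response.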
Abstract: The nonlinear solution of reinforced concrete structures, particularly the complete load-deflection response, requires tracing the equilibrium path and proper treatment of limit and bifurcation points. Ordinary solution techniques become unstable near limit points and also have problems in cases of snap-through and snap-back; thus they fail to predict the complete load-displacement response. The arc-length method serves this purpose well in principle, has received wide acceptance in finite element analysis, and has been used extensively. However, modifications to the basic idea are vital to meet the particular needs of the analysis. This paper reviews some of the developments of the method over the last two decades, with particular emphasis on the nonlinear finite element analysis of reinforced concrete structures.
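Why arc-length stepping passes a limit point where load control fails can be shown on a toy scalar equilibrium curve lambda = g(u) (the cubic below is invented; a real finite-element implementation instead solves a constrained Newton iteration at every increment):

```python
import math

def arc_length_trace(g, dg, u0=0.0, ds=0.05, steps=60):
    """Trace the equilibrium path lambda = g(u) by fixed arc-length steps.

    Each step advances a fixed arc length ds along the curve, so the load
    parameter lambda is free to rise or fall; a limit point, where
    d(lambda)/du = 0 and load-controlled Newton iteration diverges, is
    traversed without difficulty. Explicit-Euler stepping keeps the
    sketch short at the cost of a small arc-length error.
    """
    u, path = u0, [(u0, g(u0))]
    for _ in range(steps):
        du = ds / math.sqrt(1.0 + dg(u) ** 2)   # unit-speed step along the path
        u += du
        path.append((u, g(u)))
    return path

# Toy softening curve: lambda = u**3 - 3*u has a limit point at u = 1,
# where the load peaks locally at lambda = -2 before turning back up.
path = arc_length_trace(lambda u: u**3 - 3*u, lambda u: 3*u**2 - 3)
```

A load-controlled solver would stall at the limit load; here the traced lambda values descend to the limit point and then recover, reproducing the snap-through shape of a complete load-displacement response.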
Funding: Supported by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (NIPA-2013-H0301-13-2006) supervised by the NIPA (National IT Industry Promotion Agency), and by the Brain Research Program through the National Research Foundation of Korea funded by the Ministry of Science, ICT & Future Planning (2011-0019212).
Abstract: The hippocampus, which lies in the temporal lobe, plays an important role in spatial navigation, learning, and memory. Several studies have examined place-cell activity, spatial memory, prediction of future locations, and various learning paradigms. However, no attempt has focused on whether there exist neurons that contribute largely to both spatial memory and learning about reward. This paper proposes that there are neurons in the CA1 area of the rat hippocampus that can simultaneously engage in forming place memory and in reward learning. With a trained rat, a reward experiment was conducted in a modified 8-shaped maze with five stages, and spiking information was obtained from a CA1 neuron. The firing rate, the count of spikes per unit time, was calculated, and decoding was conducted with log-maximum-likelihood estimation (Log-MLE) using a Gaussian distribution model. Our outcomes provide evidence of neurons that play a part in both spatial memory and reward-related learning.
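Gaussian log-maximum-likelihood decoding of position from a firing rate can be sketched at toy scale. The positions, tuning means, and noise level below are invented numbers, not the paper's data:

```python
import math

def logml_decode(rate, positions, tuning_mean, tuning_std):
    """Decode the most likely position from one neuron's firing rate.

    Observation model: at position p the measured rate is Gaussian with
    mean tuning_mean[p] and standard deviation tuning_std. Returns the
    position maximizing the Gaussian log-likelihood (Log-MLE).
    """
    def loglik(p):
        mu = tuning_mean[p]
        return (-0.5 * ((rate - mu) / tuning_std) ** 2
                - math.log(tuning_std * math.sqrt(2 * math.pi)))
    return max(positions, key=loglik)
```

With a population of neurons, the per-neuron log-likelihoods would simply be summed before taking the maximum, which is how a place-cell ensemble sharpens the position estimate.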