Software projects are becoming larger and more complicated. Managing them relies on several software development methodologies. One of these is software version control, which is used in the majority of software projects worldwide. Although existing version control systems provide sufficient functionality in many situations, they lack support for the semantics and structure of source code. It is commonly believed that improving software version control can contribute substantially to software development. We present a solution that considers a structural model for matching source code that can be used in version control.
A binary tree can be represented by a code reflecting the traversal of the corresponding regular binary tree in a given monotonic order. A different coding scheme, based on the branches of a regular binary tree with n nodes, is proposed. It differs from the coding scheme generally used in that it makes no distinction between internal nodes and terminal nodes. A code of a regular binary tree with n nodes is formed by labeling the left branches with 0's and the right branches with 1's and then traversing these branches in pre-order. The root is always assumed to be on a left branch.
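The branch labeling described in this abstract can be illustrated with a minimal Python sketch. The `Node` class and the `branch_code` function are hypothetical names introduced here, and the sketch assumes that each node contributes exactly one bit, namely the label of the branch it hangs on, so a tree with n nodes yields an n-bit code. The paper's exact construction may differ in detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A binary tree node; internal and terminal nodes use the same type."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def branch_code(root: Node) -> str:
    """Emit one bit per node in pre-order: 0 for a left branch, 1 for a right branch."""
    bits = []
    # The root is assumed to sit on a left branch, so it starts with label "0".
    stack = [(root, "0")]
    while stack:
        node, bit = stack.pop()
        bits.append(bit)
        # Push the right child first so the left child is visited first (pre-order).
        if node.right is not None:
            stack.append((node.right, "1"))
        if node.left is not None:
            stack.append((node.left, "0"))
    return "".join(bits)


# A regular binary tree with 5 nodes: the root's left child is internal.
tree = Node(left=Node(left=Node(), right=Node()), right=Node())
print(branch_code(tree))  # 00011
```

Because the label depends only on the branch and not on whether the node has children, the sketch treats internal and terminal nodes uniformly, which matches the property the abstract emphasizes.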
This article proposes a high-speed, high-accuracy code clone detection method that combines tree-based and token-based methods. The existence of duplicated program code, called code clones, is one of the main factors that reduce the quality and maintainability of software. If a code fragment contains faults (bugs) and is copied and modified in other locations, all of the copies must be corrected. However, it is not easy to find all code clones in large and complex software. Much research effort has been devoted to code clone detection, and there are two main approaches: token-based and tree-based. The token-based method is fast and requires fewer resources, but it cannot detect all kinds of code clones. The tree-based method can detect all kinds of code clones, but it is slow and requires substantial computing resources. In this paper, a combination of the two methods is proposed to improve the efficiency and accuracy of code clone detection. First, candidate code clones are extracted by the fast and lightweight token-based method. The selected candidates are then checked more precisely with the tree-based method, which can find all kinds of code clones. A prototype system was developed. The system accepts source code and tokenizes it in the first step. The token-based method is then applied to the token sequence to find candidate code clones. After several candidates have been extracted, the selected source code is converted into an abstract syntax tree (AST) so that the tree-based method can be applied. Sample source code was used to evaluate the proposed method. The evaluation demonstrated improved efficiency and precision in code clone detection.
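The two-stage pipeline described in this abstract (a cheap token-based filter followed by a tree-based check on the surviving candidates) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' prototype: it targets Python source and uses the standard `tokenize`, `ast`, and `difflib` modules, and the function names, the similarity threshold of 0.8, and the exact-shape comparison in the second stage are all hypothetical choices made here.

```python
import ast
import io
import keyword
import tokenize
from difflib import SequenceMatcher
from itertools import combinations

# Token types that carry no structural information for this sketch.
SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def token_kinds(source):
    """Stage 1 input: token sequence with identifiers and literals abstracted."""
    kinds = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            kinds.append("ID")      # any identifier
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            kinds.append("LIT")     # any numeric or string literal
        else:
            kinds.append(tok.string)  # keywords and operators kept verbatim
    return kinds


def ast_shape(source):
    """Stage 2 input: the AST reduced to nested node-type names."""
    def shape(node):
        return (type(node).__name__,
                tuple(shape(child) for child in ast.iter_child_nodes(node)))
    return shape(ast.parse(source))


def detect_clones(fragments, threshold=0.8):
    """Return pairs of fragment names that pass both stages."""
    kinds = {name: token_kinds(src) for name, src in fragments.items()}
    # Stage 1: token-based filter selects candidate pairs.
    candidates = [
        (a, b) for a, b in combinations(sorted(fragments), 2)
        if SequenceMatcher(None, kinds[a], kinds[b]).ratio() >= threshold
    ]
    # Stage 2: only the candidates are parsed and compared as trees.
    return [(a, b) for a, b in candidates
            if ast_shape(fragments[a]) == ast_shape(fragments[b])]
```

The point of the design is that AST construction, the expensive step, runs only on pairs that the token stage has already flagged. A realistic detector would replace the pairwise comparison in stage 1 with hashing or suffix structures and would use a tolerant tree similarity measure in stage 2 rather than exact shape equality.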
Classification in handwritten Chinese character recognition can be viewed as a transformation in a discrete vector space. In this paper, from the viewpoint of discrete vector space transformation, a new 4-corner code classifier for handwritten Chinese characters, based on the decision tree inductive learning algorithm ID3, is presented. With a feature extraction controller, the classifier can reduce the number of extracted features and accelerate classification. Experimental results show that the 4-corner code classifier performs well in both recognition accuracy and speed.
Using a quantum computer to simulate fermionic systems requires fermion-to-qubit transformations. Usually, a lower Pauli weight of the transformation means shallower quantum circuits, so most existing transformations aim for a lower Pauli weight. However, in some cases the circuit depth depends not only on the Pauli weight but also on the coefficients of the Hamiltonian terms. To characterize the circuit depth of these algorithms, we propose a new metric called the weighted Pauli weight, which depends on both the Pauli weight and the coefficients of the Hamiltonian terms. To achieve a smaller weighted Pauli weight, we introduce a novel transformation, the Huffman-code-based ternary tree (HTT) transformation, which is built upon the classical Huffman code and tailored to different Hamiltonians. We tested various molecular Hamiltonians, and the results show that the weighted Pauli weight of the HTT transformation is smaller than that of commonly used mappings. At the same time, the HTT transformation maintains a relatively small Pauli weight. The mapping we designed reduces the circuit depth of certain Hamiltonian simulation algorithms, facilitating faster simulation of fermionic systems.
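For readers unfamiliar with the classical ingredient named in this abstract, the following minimal Python sketch builds an ordinary ternary Huffman code, in which heavier symbols sit closer to the root of a ternary tree. It is not the HTT transformation: how the paper derives weights from Hamiltonian coefficients and how tree nodes are turned into Pauli operators is not described in the abstract and is not modelled here. The function name `ternary_huffman` and the example weights are hypothetical, and symbols are assumed to be strings.

```python
import heapq
from itertools import count


def ternary_huffman(weights):
    """Return {symbol: codeword over the digits 0, 1, 2} for a dict of weights."""
    tie = count()  # tie-breaker so the heap never compares payloads
    heap = [(w, next(tie), sym) for sym, w in weights.items()]
    # Pad with one zero-weight dummy when the count is even, so that
    # every merge step combines exactly three nodes.
    if len(heap) % 2 == 0:
        heap.append((0, next(tie), None))
    heapq.heapify(heap)
    while len(heap) > 1:
        children = [heapq.heappop(heap) for _ in range(3)]
        total = sum(w for w, _, _ in children)
        # An internal node is stored as the list of its three children.
        heapq.heappush(heap, (total, next(tie), [c[2] for c in children]))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, list):
            for digit, child in enumerate(node):
                walk(child, prefix + str(digit))
        elif node is not None:  # skip the dummy leaf
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


codes = ternary_huffman({"a": 5, "b": 1, "c": 1, "d": 1, "e": 1})
print(sorted(len(c) for c in codes.values()))  # [1, 1, 2, 2, 2]
```

In this example the heaviest symbol `a` receives a one-digit codeword, which is the same intuition the abstract appeals to: items with large weights should sit at shallow depth in the tree.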
This paper studies algorithms for coding and decoding the Prufer code of a labeled tree. The algorithms in the literature for coding and decoding Prufer codes usually require superlinear time. Although linear-time algorithms exist for Prufer-like codes [1,2,3], they rely on integer sorting: the special range of the integers to be sorted is exploited to obtain a linear-time integer sorting algorithm, and the Prufer code problem is reduced to integer sorting. In this paper we consider the Prufer code problem from a different angle and in a more direct manner. We start from a naïve algorithm, improve it gradually, and finally obtain a very practical linear-time algorithm. The techniques used in this paper are of interest in their own right.
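To make the coding and decoding problem concrete, here is a minimal Python sketch of a well-known linear-time approach that avoids sorting by advancing a pointer over the node labels. It is offered as an illustration and is not claimed to be the algorithm of this paper. The function names are hypothetical, and the sketch assumes nodes are labelled 0 to n-1 with n at least 2.

```python
def prufer_encode(n, edges):
    """Return the Prufer code of a tree on nodes 0..n-1 in O(n) time."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Root the tree at n-1, which is never removed, and record parents.
    parent = [-1] * n
    seen = [False] * n
    seen[n - 1] = True
    stack = [n - 1]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                stack.append(v)
    degree = [len(a) for a in adj]
    ptr = 0
    while degree[ptr] != 1:
        ptr += 1
    leaf = ptr
    code = []
    for _ in range(n - 2):
        nxt = parent[leaf]          # the only remaining neighbour of the leaf
        code.append(nxt)
        degree[nxt] -= 1
        if degree[nxt] == 1 and nxt < ptr:
            leaf = nxt              # a new leaf smaller than the pointer
        else:
            ptr += 1                # otherwise scan forward; ptr never moves back
            while degree[ptr] != 1:
                ptr += 1
            leaf = ptr
    return code


def prufer_decode(code):
    """Rebuild the edge list of the tree from its Prufer code in O(n) time."""
    n = len(code) + 2
    degree = [1] * n
    for x in code:
        degree[x] += 1
    ptr = 0
    while degree[ptr] != 1:
        ptr += 1
    leaf = ptr
    edges = []
    for v in code:
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1 and v < ptr:
            leaf = v
        else:
            ptr += 1
            while degree[ptr] != 1:
                ptr += 1
            leaf = ptr
    edges.append((leaf, n - 1))     # the last two remaining nodes
    return edges


tree = [(0, 3), (1, 3), (2, 4), (3, 4), (4, 5)]
print(prufer_encode(6, tree))        # [3, 3, 4, 4]
print(prufer_decode([3, 3, 4, 4]))   # [(0, 3), (1, 3), (2, 4), (3, 4), (4, 5)]
```

Both functions run in linear time because the pointer `ptr` only moves forward, so the total work of all the inner scans is bounded by n.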
Funding (Huffman-code-based ternary tree transformation study): supported by the National Key Research and Development Program of China (Grant No. 2024YFB4504101), the National Natural Science Foundation of China (Grant No. 22303022), and the Anhui Province Innovation Plan for Science and Technology (Grant No. 202423r06050002).