Regular expressions are widely used within and even outside of computer science due to their expressiveness and flexibility.However, regular expressions have a quite compact and rather tolerant syntax that makes them ...Regular expressions are widely used within and even outside of computer science due to their expressiveness and flexibility.However, regular expressions have a quite compact and rather tolerant syntax that makes them hard to understand, hard to compose,and error-prone. Faulty regular expressions may cause failures of the applications that use them. Therefore, ensuring the correctness of regular expressions is a vital prerequisite for their use in practical applications. The importance and necessity of ensuring correct definitions of regular expressions have attracted extensive attention from researchers and practitioners, especially in recent years. In this study, we provide a review of the recent works for ensuring the correct usage of regular expressions. We classify those works into different categories, including the empirical study, test string generation, automatic synthesis and learning, static checking and verification,visual representation and explanation, and repairing. For each category, we review the main results, compare different approaches, and discuss their advantages and disadvantages. We also discuss some potential future research directions.展开更多
Regular expression matching is playing an important role in deep inspection. The rapid development of SDN and NFV makes the network more dynamic, bringing serious challenges to traditional deep inspection matching eng...Regular expression matching is playing an important role in deep inspection. The rapid development of SDN and NFV makes the network more dynamic, bringing serious challenges to traditional deep inspection matching engines. However, state-of-theart matching methods often require a significant amount of pre-processing time and hence are not suitable for this fast updating scenario. In this paper, a novel matching engine called BFA is proposed to achieve high-speed regular expression matching with fast pre-processing. Experiments demonstrate that BFA obtains 5 to 20 times more update abilities compared to existing regular expression matching methods, and scales well on multi-core platforms.展开更多
Nowadays, using Deterministic Finite Automata (DFA) or Non-deterministic Finite Automata (NFA) to parse regular expressions is the most popular way for Deep Packet Inspection (DPI), and the research about DPI focuses ...Nowadays, using Deterministic Finite Automata (DFA) or Non-deterministic Finite Automata (NFA) to parse regular expressions is the most popular way for Deep Packet Inspection (DPI), and the research about DPI focuses on the improvement of DFA to reduce memory. However, most of the existing literature ignores a special kind of "overlap-matching expression", which causes states explosion and takes quite a large part in the DPI rules. To solve this problem, in this paper a new mechanism is proposed based on bitmap. We start with a simple regular expression to describe "overlap-matching expressions" and state the problem. Then, after calculating the terrible number of exploded states for this kind of expressions, the procedure of Bitmap-based Soft Parallel Mechanism (BSPM) is described. Based on BSPM, we discuss all the different types of "overlap-matching ex- pressions" and give optimization suggestions of them separately. Finally, experiment results prove that BSPM can give an excellent performance on solving the problem stated above, and the optimization suggestions are also effective for the memory reduction on all types of "overlap-matching expressions".展开更多
In this paper, we consider the general quasi-differential expressions each of order n with complex coefficients and their formal adjoints on the interval (a,b). It is shown in direct sum spaces of functions defined on...In this paper, we consider the general quasi-differential expressions each of order n with complex coefficients and their formal adjoints on the interval (a,b). It is shown in direct sum spaces of functions defined on each of the separate intervals with the cases of one and two singular end-points and when all solutions of the equation and its adjoint are in (the limit circle case) that all well-posed extensions of the minimal operator have resolvents which are HilbertSchmidt integral operators and consequently have a wholly discrete spectrum. This implies that all the regularly solvable operators have all the standard essential spectra to be empty. These results extend those of formally symmetric expression studied in [1-10] and those of general quasi-differential expressions in [11-19].展开更多
提出一种基于确定的有穷状态自动机(deterministic finite automaton,简称DFA)的正则表达式压缩算法.首先,定义了膨胀率DR(distending rate)来描述正则表达式的膨胀特性.然后基于DR提出一种分片的算法RECCADR(regular expressions cut a...提出一种基于确定的有穷状态自动机(deterministic finite automaton,简称DFA)的正则表达式压缩算法.首先,定义了膨胀率DR(distending rate)来描述正则表达式的膨胀特性.然后基于DR提出一种分片的算法RECCADR(regular expressions cut and combine algorithm based on DR),有效地选择出导致DFA状态膨胀的片段并隔离,降低了单个正则表达式存储需求.同时,基于正则表达式的组合关系提出一种选择性分群算法REGADR(regular expressions group algorithm based on DR),在可以接受的存储需求总量下,通过选择性分群大幅度减少了状态机的个数,有效地降低了匹配算法的复杂性.展开更多
基金by National Natural Science Foundation of China(Nos.61872339,61502184 and 61925203).
文摘Regular expressions are widely used within and even outside of computer science due to their expressiveness and flexibility.However, regular expressions have a quite compact and rather tolerant syntax that makes them hard to understand, hard to compose,and error-prone. Faulty regular expressions may cause failures of the applications that use them. Therefore, ensuring the correctness of regular expressions is a vital prerequisite for their use in practical applications. The importance and necessity of ensuring correct definitions of regular expressions have attracted extensive attention from researchers and practitioners, especially in recent years. In this study, we provide a review of the recent works for ensuring the correct usage of regular expressions. We classify those works into different categories, including the empirical study, test string generation, automatic synthesis and learning, static checking and verification,visual representation and explanation, and repairing. For each category, we review the main results, compare different approaches, and discuss their advantages and disadvantages. We also discuss some potential future research directions.
基金supported by the National Key Technology R&D Program of China under Grant No. 2015BAK34B00the National Key Research and Development Program of China under Grant No. 2016YFB1000102
文摘Regular expression matching is playing an important role in deep inspection. The rapid development of SDN and NFV makes the network more dynamic, bringing serious challenges to traditional deep inspection matching engines. However, state-of-theart matching methods often require a significant amount of pre-processing time and hence are not suitable for this fast updating scenario. In this paper, a novel matching engine called BFA is proposed to achieve high-speed regular expression matching with fast pre-processing. Experiments demonstrate that BFA obtains 5 to 20 times more update abilities compared to existing regular expression matching methods, and scales well on multi-core platforms.
基金Supported by the National High Technology Development 863 Program of China (No. 2008AA01Z117)
文摘Nowadays, using Deterministic Finite Automata (DFA) or Non-deterministic Finite Automata (NFA) to parse regular expressions is the most popular way for Deep Packet Inspection (DPI), and the research about DPI focuses on the improvement of DFA to reduce memory. However, most of the existing literature ignores a special kind of "overlap-matching expression", which causes states explosion and takes quite a large part in the DPI rules. To solve this problem, in this paper a new mechanism is proposed based on bitmap. We start with a simple regular expression to describe "overlap-matching expressions" and state the problem. Then, after calculating the terrible number of exploded states for this kind of expressions, the procedure of Bitmap-based Soft Parallel Mechanism (BSPM) is described. Based on BSPM, we discuss all the different types of "overlap-matching ex- pressions" and give optimization suggestions of them separately. Finally, experiment results prove that BSPM can give an excellent performance on solving the problem stated above, and the optimization suggestions are also effective for the memory reduction on all types of "overlap-matching expressions".
文摘In this paper, we consider the general quasi-differential expressions each of order n with complex coefficients and their formal adjoints on the interval (a,b). It is shown in direct sum spaces of functions defined on each of the separate intervals with the cases of one and two singular end-points and when all solutions of the equation and its adjoint are in (the limit circle case) that all well-posed extensions of the minimal operator have resolvents which are HilbertSchmidt integral operators and consequently have a wholly discrete spectrum. This implies that all the regularly solvable operators have all the standard essential spectra to be empty. These results extend those of formally symmetric expression studied in [1-10] and those of general quasi-differential expressions in [11-19].
文摘提出一种基于确定的有穷状态自动机(deterministic finite automaton,简称DFA)的正则表达式压缩算法.首先,定义了膨胀率DR(distending rate)来描述正则表达式的膨胀特性.然后基于DR提出一种分片的算法RECCADR(regular expressions cut and combine algorithm based on DR),有效地选择出导致DFA状态膨胀的片段并隔离,降低了单个正则表达式存储需求.同时,基于正则表达式的组合关系提出一种选择性分群算法REGADR(regular expressions group algorithm based on DR),在可以接受的存储需求总量下,通过选择性分群大幅度减少了状态机的个数,有效地降低了匹配算法的复杂性.