This paper presents an efficient pattern matching algorithm (FSW). FSW improves the searching process for a pattern in a text. It scans the text with the help of four sliding windows. The windows are equal to the leng...This paper presents an efficient pattern matching algorithm (FSW). FSW improves the searching process for a pattern in a text. It scans the text with the help of four sliding windows. The windows are equal to the length of the pattern, allowing multiple alignments in the searching process. The text is divided into two parts;each part is scanned from both sides simultaneously using two sliding windows. The four windows slide in parallel in both parts of the text. The comparisons done between the text and the pattern are done from both of the pattern sides in parallel. The conducted experiments show that FSW achieves the best overall results in the number of attempts and the number of character comparisons compared to the pattern matching algorithms: Two Sliding Windows (TSW), Enhanced Two Sliding Windows algorithm (ETSW) and Berry-Ravindran algorithm (BR). The best time case is calculated and found to be??while the average case time complexity is??.展开更多
Pattern matching is a very important topic in computer science. It has been used in various applications such as information retrieval, virus scanning, DNA sequence analysis, data mining, machine learning, network sec...Pattern matching is a very important topic in computer science. It has been used in various applications such as information retrieval, virus scanning, DNA sequence analysis, data mining, machine learning, network security and pattern recognition. This paper has presented a new pattern matching algorithm—Enhanced ERS-A, which is an improvement over ERS-S algorithm. In ERS-A, two sliding windows are used to scan the text from the left and the right simultaneously. The proposed algorithm also scans the text from the left and the right simultaneously as well as making comparisons with the pattern from both sides simultaneously. The comparisons done between the text and the pattern are done from both sides in parallel. The shift technique used in the Enhanced ERS-A is the four consecutive characters in the text immediately following the pattern window. The experimental results show that the Enhanced ERS-A has enhanced the process of pattern matching by reducing the number of comparisons performed.展开更多
In the XML community, exact queries allow users to specify exactly what they want to check and/or retrieve in an XML document. When they are applied to a semi-structured document or to a document with an overly comple...In the XML community, exact queries allow users to specify exactly what they want to check and/or retrieve in an XML document. When they are applied to a semi-structured document or to a document with an overly complex model, the lack or the ignorance of the explicit document model (DTD—Document Type Definition, Schema, etc.) increases the risk of obtaining an empty result set when the query is too specific, or, too large result set when it is too vague (e.g. it contains wildcards such as “*”). The reason is that in both cases, users write queries according to the document model they have in mind;this can be very far from the one that can actually be extracted from the document. Opposed to exact queries, preference queries are more flexible and can be relaxed to expand the search space during their evaluations. Indeed, during their evaluation, certain constraints (the preferences they contain) can be relaxed if necessary to avoid precisely empty results;moreover, the returned answers can be filtered to retain only the best ones. This paper presents an algorithm for evaluating such queries inspired by the TreeMatch algorithm proposed by Yao et al. for exact queries. In the proposed algorithm, the best answers are obtained by using an adaptation of the Skyline operator (defined in relational databases) in the context of documents (trees) to incrementally filter into the partial solutions set, those which satisfy the maximum of preferential constraints. The only restriction imposed on documents is No-Self-Containment.展开更多
Pattern matching is a very important algorithm used in many applications such as search engine and DNA analysis. They are aiming to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Chan...Pattern matching is a very important algorithm used in many applications such as search engine and DNA analysis. They are aiming to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Changing Consecutive Characters (PMCCC) to make the searching pro- cess of the algorithm faster. PMCCC enhances the shift process that determines how the pattern moves in case of the occurrence of the mismatch between the pattern and the text. It enhances the Berry Ravindran (BR) shift function by using m consecutive characters where m is the pattern length. The formal basis and the algorithms are presented. The experimental results show that PMCCC made enhancements in searching process by reducing the number of comparisons and the number of attempts. Comparing the results of PMCCC with other related algorithms has shown significant enhancements in average number of comparisons and average number of attempts.展开更多
A new spectral matching algorithm is proposed by us- ing nonsubsampled contourlet transform and scale-invariant fea- ture transform. The nonsubsampled contourlet transform is used to decompose an image into a low freq...A new spectral matching algorithm is proposed by us- ing nonsubsampled contourlet transform and scale-invariant fea- ture transform. The nonsubsampled contourlet transform is used to decompose an image into a low frequency image and several high frequency images, and the scale-invariant feature transform is employed to extract feature points from the low frequency im- age. A proximity matrix is constructed for the feature points of two related images. By singular value decomposition of the proximity matrix, a matching matrix (or matching result) reflecting the match- ing degree among feature points is obtained. Experimental results indicate that the proposed algorithm can reduce time complexity and possess a higher accuracy.展开更多
String matching is seen as one of the essential problems in computer science. A variety of computer applications provide the string matching service for their end users. The remarkable boost in the number of data that...String matching is seen as one of the essential problems in computer science. A variety of computer applications provide the string matching service for their end users. The remarkable boost in the number of data that is created and kept by modern computational devices influences researchers to obtain even more powerful methods for coping with this problem. In this research, the Quick Search string matching algorithm are adopted to be implemented under the multi-core environment using OpenMP directive which can be employed to reduce the overall execution time of the program. English text, Proteins and DNA data types are utilized to examine the effect of parallelization and implementation of Quick Search string matching algorithm on multi-core based environment. Experimental outcomes reveal that the overall performance of the mentioned string matching algorithm has been improved, and the improvement in the execution time which has been obtained is considerable enough to recommend the multi-core environment as the suitable platform for parallelizing the Quick Search string matching algorithm.展开更多
Due to the huge size of patterns to be searched,multiple pattern searching remains a challenge to several newly-arising applications like network intrusion detection.In this paper,we present an attempt to design effic...Due to the huge size of patterns to be searched,multiple pattern searching remains a challenge to several newly-arising applications like network intrusion detection.In this paper,we present an attempt to design efficient multiple pattern searching algorithms on multi-core architectures.We observe an important feature which indicates that the multiple pattern matching time mainly depends on the number and minimal length of patterns.The multi-core algorithm proposed in this paper leverages this feature to decompose pattern set so that the parallel execution time is minimized.We formulate the problem as an optimal decomposition and scheduling of a pattern set,then propose a heuristic algorithm,which takes advantage of dynamic programming and greedy algorithmic techniques,to solve the optimization problem.Experimental results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.展开更多
提出一种基于确定的有穷状态自动机(deterministic finite automaton,简称DFA)的正则表达式压缩算法.首先,定义了膨胀率DR(distending rate)来描述正则表达式的膨胀特性.然后基于DR提出一种分片的算法RECCADR(regular expressions cut a...提出一种基于确定的有穷状态自动机(deterministic finite automaton,简称DFA)的正则表达式压缩算法.首先,定义了膨胀率DR(distending rate)来描述正则表达式的膨胀特性.然后基于DR提出一种分片的算法RECCADR(regular expressions cut and combine algorithm based on DR),有效地选择出导致DFA状态膨胀的片段并隔离,降低了单个正则表达式存储需求.同时,基于正则表达式的组合关系提出一种选择性分群算法REGADR(regular expressions group algorithm based on DR),在可以接受的存储需求总量下,通过选择性分群大幅度减少了状态机的个数,有效地降低了匹配算法的复杂性.展开更多
文摘This paper presents an efficient pattern matching algorithm (FSW). FSW improves the searching process for a pattern in a text. It scans the text with the help of four sliding windows. The windows are equal to the length of the pattern, allowing multiple alignments in the searching process. The text is divided into two parts;each part is scanned from both sides simultaneously using two sliding windows. The four windows slide in parallel in both parts of the text. The comparisons done between the text and the pattern are done from both of the pattern sides in parallel. The conducted experiments show that FSW achieves the best overall results in the number of attempts and the number of character comparisons compared to the pattern matching algorithms: Two Sliding Windows (TSW), Enhanced Two Sliding Windows algorithm (ETSW) and Berry-Ravindran algorithm (BR). The best time case is calculated and found to be??while the average case time complexity is??.
文摘Pattern matching is a very important topic in computer science. It has been used in various applications such as information retrieval, virus scanning, DNA sequence analysis, data mining, machine learning, network security and pattern recognition. This paper has presented a new pattern matching algorithm—Enhanced ERS-A, which is an improvement over ERS-S algorithm. In ERS-A, two sliding windows are used to scan the text from the left and the right simultaneously. The proposed algorithm also scans the text from the left and the right simultaneously as well as making comparisons with the pattern from both sides simultaneously. The comparisons done between the text and the pattern are done from both sides in parallel. The shift technique used in the Enhanced ERS-A is the four consecutive characters in the text immediately following the pattern window. The experimental results show that the Enhanced ERS-A has enhanced the process of pattern matching by reducing the number of comparisons performed.
文摘In the XML community, exact queries allow users to specify exactly what they want to check and/or retrieve in an XML document. When they are applied to a semi-structured document or to a document with an overly complex model, the lack or the ignorance of the explicit document model (DTD—Document Type Definition, Schema, etc.) increases the risk of obtaining an empty result set when the query is too specific, or, too large result set when it is too vague (e.g. it contains wildcards such as “*”). The reason is that in both cases, users write queries according to the document model they have in mind;this can be very far from the one that can actually be extracted from the document. Opposed to exact queries, preference queries are more flexible and can be relaxed to expand the search space during their evaluations. Indeed, during their evaluation, certain constraints (the preferences they contain) can be relaxed if necessary to avoid precisely empty results;moreover, the returned answers can be filtered to retain only the best ones. This paper presents an algorithm for evaluating such queries inspired by the TreeMatch algorithm proposed by Yao et al. for exact queries. In the proposed algorithm, the best answers are obtained by using an adaptation of the Skyline operator (defined in relational databases) in the context of documents (trees) to incrementally filter into the partial solutions set, those which satisfy the maximum of preferential constraints. The only restriction imposed on documents is No-Self-Containment.
文摘Pattern matching is a very important algorithm used in many applications such as search engine and DNA analysis. They are aiming to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Changing Consecutive Characters (PMCCC) to make the searching pro- cess of the algorithm faster. PMCCC enhances the shift process that determines how the pattern moves in case of the occurrence of the mismatch between the pattern and the text. It enhances the Berry Ravindran (BR) shift function by using m consecutive characters where m is the pattern length. The formal basis and the algorithms are presented. The experimental results show that PMCCC made enhancements in searching process by reducing the number of comparisons and the number of attempts. Comparing the results of PMCCC with other related algorithms has shown significant enhancements in average number of comparisons and average number of attempts.
基金supported by the National Natural Science Foundation of China (6117212711071002)+1 种基金the Specialized Research Fund for the Doctoral Program of Higher Education (20113401110006)the Innovative Research Team of 211 Project in Anhui University (KJTD007A)
文摘A new spectral matching algorithm is proposed by us- ing nonsubsampled contourlet transform and scale-invariant fea- ture transform. The nonsubsampled contourlet transform is used to decompose an image into a low frequency image and several high frequency images, and the scale-invariant feature transform is employed to extract feature points from the low frequency im- age. A proximity matrix is constructed for the feature points of two related images. By singular value decomposition of the proximity matrix, a matching matrix (or matching result) reflecting the match- ing degree among feature points is obtained. Experimental results indicate that the proposed algorithm can reduce time complexity and possess a higher accuracy.
文摘String matching is seen as one of the essential problems in computer science. A variety of computer applications provide the string matching service for their end users. The remarkable boost in the number of data that is created and kept by modern computational devices influences researchers to obtain even more powerful methods for coping with this problem. In this research, the Quick Search string matching algorithm are adopted to be implemented under the multi-core environment using OpenMP directive which can be employed to reduce the overall execution time of the program. English text, Proteins and DNA data types are utilized to examine the effect of parallelization and implementation of Quick Search string matching algorithm on multi-core based environment. Experimental outcomes reveal that the overall performance of the mentioned string matching algorithm has been improved, and the improvement in the execution time which has been obtained is considerable enough to recommend the multi-core environment as the suitable platform for parallelizing the Quick Search string matching algorithm.
基金supported by the National Natural Science Foundation of China under Grant Nos.60803030,60925009,60921002the National Basic Research 973 Program of China under Grant No.2011CB302502
文摘Due to the huge size of patterns to be searched,multiple pattern searching remains a challenge to several newly-arising applications like network intrusion detection.In this paper,we present an attempt to design efficient multiple pattern searching algorithms on multi-core architectures.We observe an important feature which indicates that the multiple pattern matching time mainly depends on the number and minimal length of patterns.The multi-core algorithm proposed in this paper leverages this feature to decompose pattern set so that the parallel execution time is minimized.We formulate the problem as an optimal decomposition and scheduling of a pattern set,then propose a heuristic algorithm,which takes advantage of dynamic programming and greedy algorithmic techniques,to solve the optimization problem.Experimental results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.
文摘提出一种基于确定的有穷状态自动机(deterministic finite automaton,简称DFA)的正则表达式压缩算法.首先,定义了膨胀率DR(distending rate)来描述正则表达式的膨胀特性.然后基于DR提出一种分片的算法RECCADR(regular expressions cut and combine algorithm based on DR),有效地选择出导致DFA状态膨胀的片段并隔离,降低了单个正则表达式存储需求.同时,基于正则表达式的组合关系提出一种选择性分群算法REGADR(regular expressions group algorithm based on DR),在可以接受的存储需求总量下,通过选择性分群大幅度减少了状态机的个数,有效地降低了匹配算法的复杂性.