Abstract: Based on a study of single pattern matching, the MBF algorithm is proposed by imitating the way a human searches for a string. The algorithm preprocesses the pattern using the idea of the Quick Search algorithm together with already-matched pattern prefix and suffix information. In the searching phase, the algorithm makes use of character usage frequency and the continue-skip idea. Experiments show that the MBF algorithm is more efficient than the other algorithms it was compared with.
Abstract: Pattern matching is an important technique used in many applications, such as search engines and DNA analysis, where the goal is to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Changing Consecutive Characters (PMCCC) to speed up the searching process. PMCCC enhances the shift process, which determines how the pattern moves when a mismatch occurs between the pattern and the text. It enhances the Berry Ravindran (BR) shift function by using m consecutive characters, where m is the pattern length. The formal basis and the algorithms are presented. The experimental results show that PMCCC improves the searching process by reducing both the number of comparisons and the number of attempts. Compared with other related algorithms, PMCCC shows significant improvements in the average number of comparisons and the average number of attempts.
Funding: Supported by the National Natural Science Foundation of China (No. 61801491).
Abstract: Multi-modal image matching is crucial in aerospace applications because it can fully exploit the complementary information contained in the large volume and diversity of remote sensing images. However, it remains a challenging task due to significant non-linear radiometric differences, geometric differences, and noise across different sensors. To improve the performance of heterologous image matching, this paper proposes a normalized self-similarity region descriptor to extract consistent structural information. We first construct a pointwise self-similarity region descriptor based on the Euclidean distance between adjacent image blocks to reflect the structural properties of multi-modal images. Then, a linear normalization approach is used to form the Modality Independent Region Descriptor (MIRD), which can effectively distinguish structural features such as points, lines, corners, and flat regions across multi-modal images. To further improve matching accuracy, the included-angle cosine similarity metric is adopted to exploit the directional vector information of the multi-dimensional feature descriptors. The experimental results show that the proposed MIRD achieves better matching accuracy and robustness than state-of-the-art methods across various multi-modal image matching tasks. MIRD effectively extracts consistent geometric structure features and suppresses the influence of SAR speckle noise by operating on non-local neighboring image blocks, so it can be applied to various multi-modal image matching tasks.
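The pipeline described in this abstract (block distances, linear normalization, cosine comparison) can be illustrated with a small sketch. This is not the authors' MIRD implementation: the block size, the four-neighbour offsets, the min-max form of the normalization, and all function names are assumptions chosen for illustration.

```python
import math

def patch(img, r, c, half):
    """Flatten the (2*half+1) x (2*half+1) block centred on (r, c)."""
    return [img[i][j]
            for i in range(r - half, r + half + 1)
            for j in range(c - half, c + half + 1)]

def self_similarity(img, r, c, half=1,
                    offsets=((-1, 0), (1, 0), (0, -1), (0, 1))):
    """Euclidean distance from the centre block to each neighbouring block.

    The offsets (up, down, left, right) are an assumed neighbourhood."""
    centre = patch(img, r, c, half)
    dists = []
    for dr, dc in offsets:
        other = patch(img, r + dr, c + dc, half)
        dists.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(centre, other))))
    return dists

def normalize(vec):
    """Linear (min-max) normalization to [0, 1]; a flat region maps to zeros."""
    lo, hi = min(vec), max(vec)
    if hi == lo:
        return [0.0] * len(vec)
    return [(v - lo) / (hi - lo) for v in vec]

def cosine_similarity(u, v):
    """Cosine of the included angle between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Because the descriptor depends only on differences between blocks, an image and its intensity-inverted copy produce the same normalized vector at a vertical edge, which hints at why self-similarity is attractive when modalities disagree on absolute intensity.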
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62276274, in part by the Natural Science Foundation of Shaanxi Province under Grant 2020JM-537, and in part by the Aeronautical Science Fund under Grant 201851U8012 (corresponding author: Xiaogang Yang).
Abstract: In recent years, many visual positioning algorithms based on computer vision have been proposed, and they have achieved good results. However, these algorithms serve a single function, cannot perceive the environment, generalize poorly, and suffer from mismatches, which reduces positioning accuracy. Therefore, this paper proposes a location algorithm that combines a target recognition algorithm with a depth feature matching algorithm to solve the problem of unmanned aerial vehicle (UAV) environment perception and multi-modal image-matching fusion location. The algorithm is based on the single-shot object detector built on a multi-level feature pyramid network (M2Det) and replaces the original visual geometry group (VGG) feature extraction network with the ResNet-101 network to improve the feature extraction capability of the model. By introducing a depth feature matching algorithm, the algorithm shares neural network weights and realizes a combined design for UAV target recognition and multi-modal image-matching fusion positioning. When the reference image and the real-time image are mismatched, the dynamic adaptive proportional constraint and random sample consensus algorithm (DAPC-RANSAC) is used to optimize the matching results and improve the rate of correct target matches. The proposed algorithm was compared and analyzed on a multi-modal registration data set to verify its superiority and feasibility. The results show that the proposed algorithm can effectively handle matching between multi-modal images (visible image–infrared image, infrared image–satellite image, visible image–satellite image) and remains stable and robust under changes in contrast, scale, brightness, blur, deformation, and other conditions. Finally, the effectiveness and practicability of the proposed algorithm were verified in an aerial test scene using an S1000 six-rotor UAV.
Abstract: This paper describes how data records can be matched across large datasets using a technique called the Identity Correlation Approach (ICA). The ICA technique is then compared with a string matching exercise. Both the string matching exercise and the ICA technique were employed for a big data project carried out by the CSO. The project, called the SESADP (Structure of Earnings Survey Administrative Data Project), involved linking the Irish Census dataset 2011 to a large Public Sector Dataset. The ICA technique provides a mathematical tool to link the datasets, and the matching rate for an exact match can be calculated before the matching process begins. Based on the number of variables and the size of the population, the matching rate is calculated in the ICA approach from the MRUI (Matching Rate for Unique Identifier) formula, and false positives are eliminated. No string matching is used in the ICA, so names are not required on the dataset, which makes the data more secure and ensures confidentiality. The SESADP was highly successful using the ICA technique. A comparison of the results obtained for the SESADP using the string matching exercise and the ICA is discussed here.
Abstract: Given a set U consisting of strings defined on an alphabet Σ, string cross pattern matching finds all the matches between every two strings in U. It is used in text processing tasks such as removing duplicate strings. This paper presents a fast string cross pattern matching algorithm based on extracting high-frequency strings. Compared with existing algorithms, including single-pattern and multi-pattern matching algorithms, this algorithm features both low time complexity and low space complexity. Because the Chinese alphabet is large and the average length of Chinese words is short, this algorithm is well suited to processing Chinese text, especially when the size of Σ is large and the number of strings is far greater than the maximum length of the strings in U.
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 61201226 and 61271096, the Natural Science Foundation of Shanghai under Grant No. 12ZR1433800, and the Specialized Research Fund for the Doctoral Program under Grant No. 20130072110054.
Abstract: A screen content coding (SCC) algorithm that uses a primary reference buffer (PRB) and a secondary reference buffer (SRB) for string matching and string copying is proposed. The PRB is typically the traditional reconstructed picture buffer, which provides reference string pixels for the current pixels being coded. The SRB stores a few recently and frequently referenced pixels for repeated reference by the current pixels being coded. In the encoder, the search for an optimal reference string is performed in both the PRB and the SRB, and either a PRB or an SRB string is selected as the optimal reference string on a string-by-string basis. Compared with the HM-16.4+SCM-40 reference software, the proposed SCC algorithm improves coding performance, measured as bit-distortion rate reduction, by an average of 4.19% in the all-intra configuration for the "text and graphics with motion" category of test sequences defined by the JCT-VC common test condition.
Abstract: Data centers are being distributed worldwide by cloud service providers (CSPs) to save energy costs through efficient workload allocation strategies. Many CSPs are challenged by the significant rise in user demand because of the extensive energy consumed during workload processing. Numerous research studies have examined distinct operating cost mitigation techniques for geo-distributed data centers (DCs). However, operating cost savings during workload processing that also take string-matching techniques in geo-distributed DCs into account remain unexplored. In this research, we propose a novel string matching-based geographical load balancing (SMGLB) technique to mitigate the operating cost of geo-distributed DCs. The primary goal of this study is to use a string-matching algorithm (i.e., Boyer Moore) to compare the contents of incoming workloads with those of documents that have already been processed in a data center. On a successful match, the global load balancer does not send the user's request to a data center for processing; instead, it displays the results of the previously processed workload to the user, which saves energy. If no match is found, the global load balancer allocates the incoming workload to a specific DC for processing, considering variable energy prices, the number of active servers, on-site green energy, and traces of incoming workload. The results of numerical evaluations show that SMGLB reduces the operating expenses of geo-distributed data centers more than existing workload distribution techniques.
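The match-then-route decision described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the SMGLB implementation: the matcher is the Horspool simplification of Boyer-Moore (bad-character rule only), the routing cost considers energy price alone, and `horspool_find`, `dispatch`, and the data layout are hypothetical names.

```python
def horspool_find(text, pattern):
    """Return the first index of pattern in text, or -1.

    Boyer-Moore-Horspool: a simplified Boyer-Moore that uses only a
    bad-character shift keyed on the last character of the window."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    # Shift for each character of pattern[:-1]; later occurrences overwrite
    # earlier ones, so the rightmost occurrence determines the shift.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        pos += shift.get(text[pos + m - 1], m)
    return -1

def dispatch(workload, processed, energy_price):
    """Decide how to serve an incoming workload.

    processed:    dict mapping an already processed document to its result
    energy_price: dict mapping a data-center name to its current price
                  (assumed to be the only routing criterion in this sketch)
    """
    for document, result in processed.items():
        if horspool_find(document, workload) != -1:
            return ("cache", result)          # reuse the earlier result
    cheapest = min(energy_price, key=energy_price.get)
    return ("process", cheapest)              # route to the cheapest DC
```

A fuller model would extend the routing step with the other factors the abstract lists (active servers, on-site green energy, workload traces); they are omitted here to keep the sketch short.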
Abstract: String matching is regarded as one of the essential problems in computer science, and a variety of computer applications provide a string matching service to their end users. The remarkable growth in the amount of data created and kept by modern computational devices pushes researchers to find even more powerful methods for this problem. In this research, the Quick Search string matching algorithm is implemented in a multi-core environment using OpenMP directives, which can be employed to reduce the overall execution time of the program. English text, protein, and DNA data types are used to examine the effect of parallelizing and implementing the Quick Search string matching algorithm in a multi-core environment. Experimental outcomes reveal that the overall performance of the algorithm improves, and the improvement in execution time is considerable enough to recommend the multi-core environment as a suitable platform for parallelizing the Quick Search string matching algorithm.
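For readers unfamiliar with Quick Search, a minimal sequential sketch is given below, followed by a chunking helper that shows how the text can be split into independent pieces. The chunking scheme (overlapping each piece by m - 1 characters) is an assumption about how such work is typically partitioned; the paper's OpenMP code is not reproduced here, and the function names are illustrative.

```python
def quick_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Quick Search bad-character table: the shift is decided by the
    # character just past the current window. Characters not in the
    # pattern allow a jump of m + 1.
    shift = {ch: m - i for i, ch in enumerate(pattern)}
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos + m >= n:          # no character past the window
            break
        pos += shift.get(text[pos + m], m + 1)
    return matches

def chunked_search(text, pattern, parts):
    """Search independent chunks of the text; parts is assumed to be >= 1.

    Each chunk is extended by m - 1 characters so that matches straddling
    a boundary are found exactly once. Each iteration is independent and
    could be handed to a separate worker."""
    n, m = len(text), len(pattern)
    if n == 0 or m == 0:
        return []
    size = -(-n // parts)         # ceiling division
    found = []
    for start in range(0, n, size):
        end = min(start + size, n)
        window = text[start:end + m - 1]
        found.extend(start + p for p in quick_search(window, pattern))
    return found
```

The loop in `chunked_search` runs sequentially here; it only demonstrates that the chunks do not depend on one another, which is the property a multi-core implementation relies on.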
Abstract: A new method for solving the tiling problem of surface reconstruction is proposed. The proposed method uses a snake algorithm to segment the original images; the contours are then transformed into strings using Freeman's code. A symbolic string matching technique is applied to establish a correspondence between two consecutive contours. The surface is composed of the pieces reconstructed from the corresponding points. Experimental results show that the proposed method yields good surface reconstruction quality and that its time complexity is proportional to mn, where m and n are the numbers of vertices of the two consecutive slices, respectively.
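The contour-to-string step can be sketched with an 8-direction Freeman chain code. The direction numbering below (0 = east, counter-clockwise, y increasing upward) is one common convention and is assumed here. The abstract does not name the string matching technique, so the Levenshtein distance is used purely as a stand-in that shares the stated O(mn) cost.

```python
# Map a unit step (dx, dy) to its Freeman direction code.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}

def freeman_code(points, closed=True):
    """Encode an ordered, 8-connected contour as a string of direction codes."""
    nxt = points[1:] + points[:1] if closed else points[1:]
    return "".join(
        str(DIRECTIONS[(x1 - x0, y1 - y0)])
        for (x0, y0), (x1, y1) in zip(points, nxt)
    )

def edit_distance(a, b):
    """Levenshtein distance by dynamic programming, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]
```

For example, a unit square traversed counter-clockwise from the origin encodes to the four axis-aligned codes, and two contours whose codes differ in one step are at edit distance 1.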
Abstract: Modern applications require large databases to be searched for regions that are similar to a given pattern. DNA sequence analysis, speech and text recognition, artificial intelligence, the Internet of Things, and many other applications depend heavily on pattern matching or similarity searches. In this paper, we discuss some of the string matching solutions developed in the past. Then, we present a novel mathematical model to search for a given pattern and its near approximations in the text.
Funding: Project supported by the Scientific and Technological Research Council of Türkiye (No. 117E142). Open access funding provided by the Scientific and Technological Research Council of Türkiye (TÜBİTAK).
Abstract: This study presents a parallel version of the string matching algorithms research tool (SMART) library, implemented on NVIDIA's compute unified device architecture (CUDA) platform, and uses general-purpose computing on graphics processing unit (GPGPU) programming concepts to enhance performance and gain insight into the parallel versions of these algorithms. We have developed the CUDA-enhanced SMART (CUSMART) library, which incorporates parallelized iterations of 64 string matching algorithms, leveraging the CUDA application programming interface. The performance of these algorithms has been assessed across various scenarios to ensure a comprehensive and impartial comparison, allowing for the identification of their strengths and weaknesses in specific application contexts. We have explored and established optimization techniques to gauge their influence on the performance of these algorithms. The results of this study highlight the potential of GPGPU computing in string matching applications through the scalability of algorithms, suggesting significant performance improvements. Furthermore, we have identified the best and worst performing algorithms in various scenarios.
Funding: Supported by the National Key R&D Program of China (No. 2022YFE0196100), the Guangxi Key Laboratory of Cryptography and Information Security (No. GCIS202116), the Fundamental Research Project of Shenzhen City (No. JCYJ20210324102012033), the Shenzhen Science and Technology Program (No. CJGJZD20210408092806017), and the National Natural Science Foundation of China (Nos. 12371321 and 12071460).
Abstract: In this paper, we investigate the stable matching problem with multiple preferences in bipartite graphs, where each agent has several preference lists over all available partners, one for each criterion. The problem requires that each matched agent have exactly one partner and that the obtained matching be stable for all criteria. As our main contribution, we present an integer linear programming (ILP) model for determining whether a globally stable matching exists in a bipartite graph, a problem that has been proved to be NP-hard. Since the time needed to solve ILPs may increase dramatically as instances grow, we develop a preprocessing technique that eliminates pairs that can never be part of any globally stable matching and thus accelerates the computation. We perform experiments on randomly generated preference lists and observe a significant speedup when the instance is preprocessed before the ILPs are solved. Because a perfect matching that is stable for all given criteria need not exist, we extend our ILP to an optimization version of the problem, which asks for a matching of maximum cardinality that is stable among all matched agents.
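The stability condition can be made concrete with a small checker. The sketch below is not the paper's ILP; it only verifies a candidate matching, assuming complete preference lists and the usual definition of a blocking pair (two agents who both prefer each other to their current partners, with being unmatched ranked last). Function and variable names are illustrative.

```python
def blocking_pairs(matching, left_prefs, right_prefs):
    """List the blocking pairs of a matching under one criterion.

    matching:    dict mapping a left agent to its right partner
    left_prefs:  dict mapping a left agent to its ordered list of right agents
    right_prefs: dict mapping a right agent to its ordered list of left agents
    """
    partner_of_right = {r: l for l, r in matching.items()}

    def prefers(order, new, current):
        # An unmatched agent prefers any partner to staying alone.
        return current is None or order.index(new) < order.index(current)

    pairs = []
    for l, order in left_prefs.items():
        for r in order:
            if matching.get(l) == r:
                continue
            if (prefers(order, r, matching.get(l))
                    and prefers(right_prefs[r], l, partner_of_right.get(r))):
                pairs.append((l, r))
    return pairs

def globally_stable(matching, criteria):
    """True if the matching has no blocking pair under any criterion.

    criteria: list of (left_prefs, right_prefs) tuples, one per criterion."""
    return all(not blocking_pairs(matching, lp, rp) for lp, rp in criteria)
```

With two agents per side and two criteria whose rankings are reversed, each perfect matching is stable for one criterion and blocked under the other, which is the situation that motivates the maximum-cardinality variant mentioned at the end of the abstract.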
Funding: Supported by the European Framework Program (FP7) (FP7-PEOPLE-2011-IRSES) and the National Sci-Tech Support Plan of China (2014BAH02F03).
Abstract: Multi-pattern matching with wildcards is the problem of finding all occurrences of the patterns in a pattern set {p^1, …, p^k} in a given text t. If the percentage of wildcards in the pattern set is not high, this problem can be solved using finite automata. We introduce a multi-pattern matching algorithm with a fixed number of wildcards to cope with a high percentage of wildcards in the patterns. In the proposed method, patterns are matched as bit patterns using a sliding window approach. The window is a bit window that slides along the given text, matching against stored bit patterns. The matching process is executed using bitwise operations. The experimental results demonstrate that the percentage of wildcard occurrences does not affect the proposed algorithm's performance and that the proposed algorithm is more efficient than algorithms based on the fast Fourier transform. The proposed algorithm is simple to implement and runs efficiently in O(n + d(n/σ)(m/w)) time, where n is the text length, d is the symbol distribution over the k patterns, m is the pattern length, and σ is the alphabet size.
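The abstract does not spell out the bit-window construction, so the following is not the paper's algorithm. It is a minimal sketch of the classic Shift-And technique for a single pattern, extended so that a wildcard position matches any character, to illustrate how matching can be carried out with bitwise operations alone. The function name and the choice of `?` as the wildcard symbol are assumptions.

```python
def shift_and_wildcard(text, pattern, wildcard="?"):
    """Return start indices where pattern occurs; wildcard matches any one character."""
    m = len(pattern)
    if m == 0:
        return []
    wild_mask = 0   # bit j set if pattern[j] is a wildcard
    masks = {}      # masks[ch] has bit j set if pattern[j] == ch
    for j, ch in enumerate(pattern):
        if ch == wildcard:
            wild_mask |= 1 << j
        else:
            masks[ch] = masks.get(ch, 0) | (1 << j)
    accept = 1 << (m - 1)
    state = 0       # bit j set if pattern[0..j] matches the text ending at the current position
    hits = []
    for i, ch in enumerate(text):
        # Extend every partial match by one character and start a new one at bit 0.
        state = ((state << 1) | 1) & (masks.get(ch, 0) | wild_mask)
        if state & accept:
            hits.append(i - m + 1)
    return hits
```

For example, `shift_and_wildcard("abcabdabc", "ab?")` returns `[0, 3, 6]`. A multi-pattern variant could run one such state per pattern or pack several patterns into one machine word, but how the paper organizes its stored bit patterns is not stated in the abstract.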
Funding: Supported by the National Natural Science Foundation of China (Nos. 60502032 and 60672068).
Abstract: By studying single pattern matching algorithms, five factors that affect the time complexity of such algorithms are analyzed. The five factors are: sorting the characters of the pattern string in increasing order of usage frequency; utilizing already-matched pattern suffix information; utilizing already-matched pattern prefix information; utilizing the position factor adopted from the Quick Search algorithm; and utilizing the continue-skip idea, which is first proposed in this paper. Combining all five factors, a new single pattern matching algorithm is implemented. Experiments show that the new algorithm is the most efficient of all the algorithms compared.
Abstract: Based on the study of single pattern matching, the MBF algorithm is proposed by imitating the way humans search for strings. The algorithm preprocesses the pattern using the idea of the Quick Search algorithm and the already-matched pattern prefix and suffix information. In the searching phase, the algorithm makes use of character usage frequency and the continue-skip idea. Experiments show that the MBF algorithm is more efficient than the other algorithms compared.
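The MBF algorithm itself is not specified in the abstract. As background for the Quick Search idea it builds on, here is a minimal sketch of Quick Search (Sunday's algorithm): after each attempt, the window is shifted according to the text character immediately to the right of the window. The function name is hypothetical, and the prefix/suffix, character-frequency, and continue-skip refinements mentioned in the abstract are not included.

```python
def quick_search(text, pattern):
    """Return all start indices of pattern in text using the Quick Search shift rule."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Shift for character ch: distance from its rightmost occurrence in the
    # pattern to one position past the pattern's end. Later occurrences
    # overwrite earlier ones, so the rightmost one wins.
    shift = {ch: m - j for j, ch in enumerate(pattern)}
    hits = []
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            hits.append(i)
        if i + m >= n:
            break  # no character to the right of the window
        # Characters absent from the pattern allow the maximal shift of m + 1.
        i += shift.get(text[i + m], m + 1)
    return hits
```

For example, `quick_search("abababab", "aba")` returns `[0, 2, 4]`.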
Abstract: Pattern matching algorithms are very important in many applications, such as search engines and DNA analysis. They aim to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Changing Consecutive Characters (PMCCC) to make the searching process faster. PMCCC enhances the shift process that determines how the pattern moves when a mismatch occurs between the pattern and the text. It enhances the Berry-Ravindran (BR) shift function by using m consecutive characters, where m is the pattern length. The formal basis and the algorithms are presented. The experimental results show that PMCCC improves the searching process by reducing the number of comparisons and the number of attempts. Comparing PMCCC with other related algorithms shows significant improvements in the average number of comparisons and the average number of attempts.
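The abstract does not give PMCCC's shift function, so the sketch below covers only the baseline it extends. It is a minimal illustration of the Berry-Ravindran shift rule, which determines the shift from the two text characters immediately to the right of the current window. The function name and the use of a dictionary plus a helper in place of the usual two-dimensional table are assumptions made for brevity.

```python
def berry_ravindran(text, pattern):
    """Return all start indices of pattern in text using the Berry-Ravindran shift."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Shift that aligns pattern[i], pattern[i + 1] with the two characters
    # after the window; the rightmost occurrence of a pair gives the smallest shift.
    pair_shift = {}
    for i in range(m - 1):
        pair_shift[(pattern[i], pattern[i + 1])] = m - i

    def shift(a, b):
        if a == pattern[-1]:
            return 1                      # last pattern character aligns with a
        if (a, b) in pair_shift:
            return pair_shift[(a, b)]     # a pattern pair aligns with (a, b)
        if b == pattern[0]:
            return m + 1                  # first pattern character aligns with b
        return m + 2                      # neither character can be reused

    hits = []
    j = 0
    while j <= n - m:
        if text[j:j + m] == pattern:
            hits.append(j)
        # None stands in for positions past the end of the text.
        a = text[j + m] if j + m < n else None
        b = text[j + m + 1] if j + m + 1 < n else None
        j += shift(a, b)
    return hits
```

For example, `berry_ravindran("abababab", "aba")` returns `[0, 2, 4]`. According to the abstract, PMCCC bases the shift on m consecutive characters rather than two.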