With the increasing enlargement of network scale and the rapid development of network techniques, large numbers of the network applications begin to appear. Packet capture plays an important role as one basic techniqu...With the increasing enlargement of network scale and the rapid development of network techniques, large numbers of the network applications begin to appear. Packet capture plays an important role as one basic technique used in each field of the network applications. In a high-speed network, the heavy traffic of network transmission challenges the packet capture techniques. This paper does an in-depth analysis on the traditional packet capture mechanisms in Linux, and then measures the performance bottleneck in the process of packet capture. The methods for improving the packet capture performance are presented and an optimized packet capture scheme is also designed and implemented. The test demonstrates that the new packet capture mechanism (Libpacket) can greatly improve the packet capture performance of the network application systems in a high-speed network.展开更多
The load balancing strategy of RSS used in the PF_RING capture method does not work well on multi-core processor platforms to achieve the disadvantage of the load balancing on the processor cores.This paper presents a...The load balancing strategy of RSS used in the PF_RING capture method does not work well on multi-core processor platforms to achieve the disadvantage of the load balancing on the processor cores.This paper presents a packet load balancing method based on FD and RSS.The basic idea of this method is to capture the packet with the 5 tuple filter matching,and then can not be classified packets and flow oriented filter matching,and finally can not be classified packets matching RSS.Design of experiments to test the packet capture performance and load balancing performance which the packet capture method of PF_RING using the combination of load balancing strategy based on FD+RSS and RSS,the results show that the data packet stream load balancing method based on FD+RSS can improve the performance of data packet capture and load balancing among multiple cores.展开更多
The demands of programmability have become more and more exigent as novel network services appear, such as E-commerce, social softwares, and online videos. Commodity multi-core CPUs have been widely applied in network...The demands of programmability have become more and more exigent as novel network services appear, such as E-commerce, social softwares, and online videos. Commodity multi-core CPUs have been widely applied in network packet processing to get high programmability and reduce the time-to-market. However,there is a great gap between the packet processing performance of commodity multi-core and that of the traditional packet processing hardware, e.g., NP(Network Process). Recently, optimization of the packet processing performance of commodity multi-cores has become a hot topic in industry and academia. In this paper, based on a detailed analysis of the packet processing procedure, firstly we identify two dominating overheads, namely the virtual-to-physical address translation and the packet buffer management. Secondly, we make a comprehensive survey on the current optimization methods. Thirdly, based on the survey, the heterogeneous architecture of the commodity multi-core + FPGA is proposed as a promising way to improve the packet processing performance.Fourthly, a novel Self-Described Buffer(SDB) management technology is introduced to eliminate the overheads of the allocation and deallocation of the packet buffers offloaded to FPGA. Then, an evaluation testbed, named PIOT(Packet I/O Testbed), is designed and implemented to evaluate the packet forwarding performance. I/O capacity of different commodity multi-core CPUs and the performance of optimization methods are assessed and compared based on PIOT. At last, the future work of packet processing optimization on multi-core CPUs is discussed.展开更多
Packet classification has been studied for decades; it classifies packets into specific flows based on a given rule set. As software-defined network was proposed, a recent trend of packet classification is to scale th...Packet classification has been studied for decades; it classifies packets into specific flows based on a given rule set. As software-defined network was proposed, a recent trend of packet classification is to scale the five-tuple model to multi-tuple. In general, packet classification on multiple fields is a complex problem. Although most existing software-based algorithms have been proved extraordinary in practice, they are only suitable for the classic five-tuple model and difficult to be scaled up. Meanwhile, hardware-specific solutions are inflexible and expensive, and some of them are power consuming. In this paper, we propose a universal multi-dimensional packet classification approach for multi-core systems. In our approach, novel data structures and four decomposition-based algorithms are designed to optimize the classification and updating of rules. For multi-field rules, a rule set is cut into several parts according to the number of fields. Each part works independently. In this way, the fields are searched in parallel and all the partial results are merged together at last. To demonstrate the feasibility of our approach, we implement a prototype and evaluate its throughput and latency. Experimental results show that our approach achieves a 40% higher throughput than that of other decomposed-based algorithms and a 43% lower latency of rule incremental update than that of the other algorithms on average. Furthermore, our approach saves 39% memory consumption on average and has a good scalability.展开更多
With the continuous improvement of transmission rate,high speed network links require good performance of packet capture.The multiprocessor platform has strong computational capabilities,and brings new chance for high...With the continuous improvement of transmission rate,high speed network links require good performance of packet capture.The multiprocessor platform has strong computational capabilities,and brings new chance for high rate packet capture.In this paper,we analyze the performance of common packet capture approaches that are based on general-purpose multiprocessor platform.The analysis contains two aspects:one is the maximum packet capture rate and throughput on multiprocessor platform,the other is central processing unit(CPU)load under the maximum capture rate.By comparing and analyzing the experimental result,we give the maximum packet capture rate and throughput of different capture approaches.Furthermore,we analyze the CPU load which is produced by two capture processes run on the multiprocessor platform simultaneously and make a comparison with the single capture process.展开更多
Packet classification is crucial to the implementation of advanced network services that require the capability to distinguish traffic in different flows, such as access control in firewalls and protocol analysis in i...Packet classification is crucial to the implementation of advanced network services that require the capability to distinguish traffic in different flows, such as access control in firewalls and protocol analysis in intrusion detection systems. This paper proposes a novel packet classification algorithm optimized for multi-core network processors. The proposed algorithm, AggreCuts, has an explicit worst-case search time with modest memory usage. The data structure of AggreCuts is flexible and well-adapted to different types of multi-core platforms. The algorithm on both Intel IXP2850 32-bit and Cavium OCTEON3860 64-bit multi-core platforms was implemented to evaluate the performance of AggreCuts. The experimental results show that AggreCuts outperforms the best-known existing algorithm in terms of memory usage and classification speed.展开更多
基金Sponsored by the National High Technology Development Program of China (Grant No. 2002AA142020).
文摘With the increasing enlargement of network scale and the rapid development of network techniques, large numbers of the network applications begin to appear. Packet capture plays an important role as one basic technique used in each field of the network applications. In a high-speed network, the heavy traffic of network transmission challenges the packet capture techniques. This paper does an in-depth analysis on the traditional packet capture mechanisms in Linux, and then measures the performance bottleneck in the process of packet capture. The methods for improving the packet capture performance are presented and an optimized packet capture scheme is also designed and implemented. The test demonstrates that the new packet capture mechanism (Libpacket) can greatly improve the packet capture performance of the network application systems in a high-speed network.
文摘The load balancing strategy of RSS used in the PF_RING capture method does not work well on multi-core processor platforms to achieve the disadvantage of the load balancing on the processor cores.This paper presents a packet load balancing method based on FD and RSS.The basic idea of this method is to capture the packet with the 5 tuple filter matching,and then can not be classified packets and flow oriented filter matching,and finally can not be classified packets matching RSS.Design of experiments to test the packet capture performance and load balancing performance which the packet capture method of PF_RING using the combination of load balancing strategy based on FD+RSS and RSS,the results show that the data packet stream load balancing method based on FD+RSS can improve the performance of data packet capture and load balancing among multiple cores.
基金supported by National High-tech R&D Program of China(863 Program)(Grant No.2015AA0156-03)National Natural Science Foundation of China(Grant No.61202483)
文摘The demands of programmability have become more and more exigent as novel network services appear, such as E-commerce, social softwares, and online videos. Commodity multi-core CPUs have been widely applied in network packet processing to get high programmability and reduce the time-to-market. However,there is a great gap between the packet processing performance of commodity multi-core and that of the traditional packet processing hardware, e.g., NP(Network Process). Recently, optimization of the packet processing performance of commodity multi-cores has become a hot topic in industry and academia. In this paper, based on a detailed analysis of the packet processing procedure, firstly we identify two dominating overheads, namely the virtual-to-physical address translation and the packet buffer management. Secondly, we make a comprehensive survey on the current optimization methods. Thirdly, based on the survey, the heterogeneous architecture of the commodity multi-core + FPGA is proposed as a promising way to improve the packet processing performance.Fourthly, a novel Self-Described Buffer(SDB) management technology is introduced to eliminate the overheads of the allocation and deallocation of the packet buffers offloaded to FPGA. Then, an evaluation testbed, named PIOT(Packet I/O Testbed), is designed and implemented to evaluate the packet forwarding performance. I/O capacity of different commodity multi-core CPUs and the performance of optimization methods are assessed and compared based on PIOT. At last, the future work of packet processing optimization on multi-core CPUs is discussed.
基金This work was supported by the National Basic Research 973 Program of China under Grant No. 2012CB315805 and the National Natural Science Foundation of China under Grant Nos. 61472130 and 61702174.
文摘Packet classification has been studied for decades; it classifies packets into specific flows based on a given rule set. As software-defined network was proposed, a recent trend of packet classification is to scale the five-tuple model to multi-tuple. In general, packet classification on multiple fields is a complex problem. Although most existing software-based algorithms have been proved extraordinary in practice, they are only suitable for the classic five-tuple model and difficult to be scaled up. Meanwhile, hardware-specific solutions are inflexible and expensive, and some of them are power consuming. In this paper, we propose a universal multi-dimensional packet classification approach for multi-core systems. In our approach, novel data structures and four decomposition-based algorithms are designed to optimize the classification and updating of rules. For multi-field rules, a rule set is cut into several parts according to the number of fields. Each part works independently. In this way, the fields are searched in parallel and all the partial results are merged together at last. To demonstrate the feasibility of our approach, we implement a prototype and evaluate its throughput and latency. Experimental results show that our approach achieves a 40% higher throughput than that of other decomposed-based algorithms and a 43% lower latency of rule incremental update than that of the other algorithms on average. Furthermore, our approach saves 39% memory consumption on average and has a good scalability.
基金supported by the Key Project in the National Science and Technology Pillar Program (No.2008BAH37B04)the Consolidated Project of IBM Institute of China and Beijing University of Posts and Telecommunications (No.JLP200906011-1).
文摘With the continuous improvement of transmission rate,high speed network links require good performance of packet capture.The multiprocessor platform has strong computational capabilities,and brings new chance for high rate packet capture.In this paper,we analyze the performance of common packet capture approaches that are based on general-purpose multiprocessor platform.The analysis contains two aspects:one is the maximum packet capture rate and throughput on multiprocessor platform,the other is central processing unit(CPU)load under the maximum capture rate.By comparing and analyzing the experimental result,we give the maximum packet capture rate and throughput of different capture approaches.Furthermore,we analyze the CPU load which is produced by two capture processes run on the multiprocessor platform simultaneously and make a comparison with the single capture process.
基金Supported by the National High-Tech Research and Development (863) Program of China (No. 2007AA01Z468)
文摘Packet classification is crucial to the implementation of advanced network services that require the capability to distinguish traffic in different flows, such as access control in firewalls and protocol analysis in intrusion detection systems. This paper proposes a novel packet classification algorithm optimized for multi-core network processors. The proposed algorithm, AggreCuts, has an explicit worst-case search time with modest memory usage. The data structure of AggreCuts is flexible and well-adapted to different types of multi-core platforms. The algorithm on both Intel IXP2850 32-bit and Cavium OCTEON3860 64-bit multi-core platforms was implemented to evaluate the performance of AggreCuts. The experimental results show that AggreCuts outperforms the best-known existing algorithm in terms of memory usage and classification speed.