Funding: Supported by the 111 Project of China under Grant No. B08004.
Abstract: Security and privacy issues are magnified by the velocity, volume, and variety of big data. User privacy is an even more sensitive topic that attracts wide attention. Although XcodeGhost, an iOS malware that emerged in late 2015, led to privacy leakage for a large number of users, only a few studies have examined XcodeGhost based on its source code. In this paper we describe observations from monitoring the network activity of more than 2.59 million iPhone users in a provincial area across 232 days. Our analysis reveals a number of interesting points. For example, we propose a decay model for the prevalence rate of XcodeGhost and find that more than 60% of devices are infected; that many popular applications, such as WeChat, Railway 12306, Didi Taxi, and Youku Video, are also infected; and that the duration and traffic volume of most XcodeGhost-related HTTP requests are similar to those of ordinary HTTP requests, which makes them difficult to detect. In addition, we propose a heuristic model, based on a fingerprint and its web knowledge, to identify infected applications. The identification results show the efficiency of this model.
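The abstract does not state the form of the decay model, so the following is only a minimal sketch under the assumption that the prevalence rate decays exponentially, rate(t) = r0 * exp(-lam * t). The function name, the parameters r0 and lam, and the data points are all hypothetical and synthetic; they are not taken from the paper's measurements.

```python
import math

def fit_exponential_decay(days, rates):
    """Fit rate(t) = r0 * exp(-lam * t) by least squares on log(rate).

    ASSUMPTION: exponential decay is chosen for illustration only; the
    abstract does not say which decay form the authors use.
    """
    n = len(days)
    logs = [math.log(r) for r in rates]
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    # log(rate) = log(r0) - lam * t, so r0 = exp(intercept) and lam = -slope.
    return math.exp(intercept), -slope

# Synthetic, noise-free observations generated from known parameters.
days = [0, 30, 60, 90, 120]
rates = [0.6 * math.exp(-0.01 * t) for t in days]

r0, lam = fit_exponential_decay(days, rates)
print(f"r0={r0:.3f}, lam={lam:.4f}")
```

With real measurements the fit would be noisy, and a different decay form (for example, a power law) might describe the data better.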
Funding: Supported by the 111 Project of China under Grant No. B08004.
Abstract: Top-k ranking of websites by traffic volume is important for Internet Service Providers (ISPs) to understand network status and optimize network resources. However, the ranking result always deviates substantially from the actual rank because of unknown web traffic, which cannot be identified accurately with current techniques. In this paper, we introduce a novel method to approximate the actual rank. The method associates unknown web traffic with websites according to statistical probabilities. We then construct a probabilistic top-k query model to rank websites. We conduct several experiments using real HTTP traffic traces collected from a commercial ISP covering an entire city in northern China. Experimental results show that the proposed techniques substantially reduce the deviation between the ground truth and the ranking results. In addition, we find that websites providing video services have a higher ratio of unknown IPs and a higher ratio of unknown traffic than websites providing text web pages. Specifically, more than 90% of the web traffic of the top-3 video websites is unknown. These findings help ISPs understand network status and deploy Content Delivery Networks (CDNs).
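The idea of associating unknown traffic with websites by probability can be illustrated with a simplified sketch that ranks sites by expected traffic volume. This is an assumption-laden simplification, not the paper's probabilistic top-k query model: the function name, the per-IP probability table, and all site names and byte counts are hypothetical.

```python
from collections import defaultdict

def expected_top_k(known_bytes, unknown_bytes_by_ip, site_prob_by_ip, k):
    """Rank sites by known volume plus probability-weighted unknown volume.

    ASSUMPTION: site_prob_by_ip[ip][site] is an estimated probability that
    traffic to `ip` belongs to `site`; how to estimate it is out of scope.
    """
    expected = defaultdict(float)
    for site, volume in known_bytes.items():
        expected[site] += volume
    for ip, volume in unknown_bytes_by_ip.items():
        # Split each unknown IP's volume across candidate sites.
        for site, prob in site_prob_by_ip.get(ip, {}).items():
            expected[site] += prob * volume
    return sorted(expected.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Hypothetical traffic: the video site looks small until unknown traffic
# is attributed to it.
known = {"video.example": 100.0, "news.example": 300.0}
unknown = {"10.0.0.1": 1000.0, "10.0.0.2": 200.0}
probs = {
    "10.0.0.1": {"video.example": 0.9, "news.example": 0.1},
    "10.0.0.2": {"news.example": 1.0},
}

ranking = expected_top_k(known, unknown, probs, k=2)
for site, volume in ranking:
    print(site, round(volume))
```

In this toy input the video site ranks second by known traffic alone but first once unknown traffic is attributed, which mirrors the kind of deviation the abstract describes.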
Funding: Supported by the Key Project in the National Science and Technology Pillar Program (No. 2008BAH37B04) and the Consolidated Project of IBM Institute of China and Beijing University of Posts and Telecommunications (No. JLP200906011-1).
Abstract: As transmission rates continue to improve, high-speed network links demand high-performance packet capture. Multiprocessor platforms have strong computational capabilities and open new opportunities for high-rate packet capture. In this paper, we analyze the performance of common packet capture approaches on a general-purpose multiprocessor platform. The analysis covers two aspects: the maximum packet capture rate and throughput on the multiprocessor platform, and the central processing unit (CPU) load at the maximum capture rate. By comparing and analyzing the experimental results, we give the maximum packet capture rate and throughput of the different capture approaches. Furthermore, we analyze the CPU load produced by two capture processes running simultaneously on the multiprocessor platform and compare it with that of a single capture process.
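To make the relation between packet capture rate and throughput concrete, the sketch below computes the theoretical upper bound on packets per second for a link. It assumes an Ethernet link (the abstract does not name the link type), where each frame carries 20 extra bytes on the wire for preamble, start-of-frame delimiter, and inter-frame gap. The function names and the 1 Gbit/s example are illustrative choices.

```python
# ASSUMPTION: Ethernet framing; 7-byte preamble + 1-byte start-of-frame
# delimiter + 12-byte inter-frame gap are sent for every frame.
ETHERNET_OVERHEAD_BYTES = 20

def max_packets_per_second(link_bps, frame_bytes):
    """Upper bound on the frame rate a link can carry at a given frame size."""
    return link_bps / ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)

def throughput_bps(packets_per_second, frame_bytes):
    """Frame-level throughput, excluding the per-frame wire overhead."""
    return packets_per_second * frame_bytes * 8

pps = max_packets_per_second(1_000_000_000, 64)   # 1 Gbit/s, 64-byte frames
bps = throughput_bps(pps, 64)
print(f"{pps:.0f} packets/s, {bps / 1e6:.1f} Mbit/s")
```

Small frames give the highest packet rate and therefore the heaviest per-packet load on a capture process, which is why the maximum capture rate and the maximum throughput are reported as separate figures.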
Funding: Supported by the Key Project in the National Science and Technology Pillar Program (No. 2008BAH37B04) and the 111 Project (No. B08004).
Abstract: Traditional intrusion detection systems suffer from high false positive and false negative rates. This paper analyzes in depth the differences between the statistical features of single flows and multiple flows on a database network, and presents a group of features that are easy to acquire and can be used to detect anomalies in database networks efficiently. Applying this group of features in the Fisher algorithm for anomaly detection dramatically reduces the false positive and false negative rates. At the same time, the model built on this group of features has low algorithmic complexity, good detection results, and strong generalization ability. Experimental results show that an anomaly detection model built on both single-flow and multi-flow features is more accurate than one built on single-flow features alone.
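Assuming the "Fisher algorithm" refers to Fisher's linear discriminant, a minimal two-feature sketch looks like the following. The two features (one single-flow statistic and one multi-flow statistic), the training points, and the midpoint threshold are hypothetical choices for illustration; the abstract does not list the actual features or the decision rule.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def mean_vector(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]

def scatter_matrix(rows, mean):
    """Sum of outer products of deviations from the class mean (2x2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_fisher(normal, anomalous):
    """Return projection w = Sw^-1 (m1 - m0) and a midpoint threshold."""
    m0, m1 = mean_vector(normal), mean_vector(anomalous)
    s0, s1 = scatter_matrix(normal, m0), scatter_matrix(anomalous, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Closed-form inverse of the 2x2 within-class scatter matrix.
    w = [
        (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
        (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det,
    ]
    # ASSUMPTION: threshold halfway between the projected class means.
    threshold = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, threshold

def is_anomalous(sample, w, threshold):
    return dot(w, sample) > threshold

# Hypothetical samples: (single-flow statistic, multi-flow statistic).
normal = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (2.0, 2.5)]
anomalous = [(6.0, 7.0), (7.0, 6.0), (6.5, 8.0), (8.0, 7.5)]

w, threshold = fit_fisher(normal, anomalous)
print(is_anomalous((1.2, 1.8), w, threshold))
print(is_anomalous((7.0, 7.0), w, threshold))
```

The projection reduces each sample to a single score, which is consistent with the low algorithmic complexity the abstract claims for the model; a real deployment would need more features and a threshold tuned on held-out data.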