Although many classical IP geolocation algorithms are suitable for rich-connected networks, their performance is seriously degraded in poor-connected networks, where the delay-distance correlation is weak. This paper tries to improve the performance of classical IP geolocation algorithms by finding rich-connected sub-networks inside poor-connected networks. First, a new delay-distance correlation model (the RTD-Corr model) is proposed. It relates delay-distance correlation to actual network factors such as the tortuosity of the network path and the ratio of propagation delay. Second, based on the RTD-Corr model and actual network characteristics, the paper discusses how to find rich-connected networks inside the China Internet, which is a typical real-world poor-connected network. We then find rich-connected sub-networks of the China Internet through a large-scale network measurement covering three major ISPs and thirty provinces. Finally, based on the sub-networks found, we modify two classical IP geolocation algorithms, and experiments in the China Internet show that their accuracy increases significantly.
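As a rough illustration of what "delay-distance correlation" means here, the sketch below computes a plain Pearson correlation between great-circle distance and measured round-trip delay for one probe. This is not the RTD-Corr model from the abstract, whose formulation is not given in this listing; the function names and the (lat, lon, rtt_ms) input layout are assumptions made for the example.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * r * math.asin(math.sqrt(min(1.0, a)))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)

def delay_distance_correlation(probe, measurements):
    """probe: (lat, lon); measurements: list of (lat, lon, rtt_ms) per landmark.

    Hypothetical helper: a value near 1 suggests delay tracks distance well
    (a "rich-connected" view from this probe); a value near 0 suggests it
    does not.
    """
    distances = [haversine_km(probe[0], probe[1], lat, lon) for lat, lon, _ in measurements]
    delays = [rtt for _, _, rtt in measurements]
    return pearson(distances, delays)
```

Computing this value per probe, per ISP, or per region is one simple way to compare sub-networks, although the paper's own selection criteria may differ.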
Existing IP geolocation algorithms based on delay similarity often rely on the principle that geographically adjacent IPs have similar delays. However, this principle is often invalid in the real Internet environment, which leads to unreliable geolocation results. To improve the accuracy and reliability of locating IPs in the real Internet, a street-level IP geolocation algorithm based on landmark clustering is proposed. First, we use probes to measure the known landmarks, obtain their delay vectors, and cluster the landmarks by these vectors. Second, the landmarks are clustered again by latitude and longitude, and the intersection of the two clustering results is taken to form training sets. Third, we train multiple neural networks to learn the mapping between delay and location in each training set. Finally, we select one of the neural networks for the target according to delay similarity and relative hop counts, and then geolocate the target with this network. Because it combines delay clustering with geographical-coordinate clustering, the proposed algorithm largely reduces the inconsistency between the two and strengthens the mapping between them. We evaluate the algorithm through a series of experiments in Hong Kong, Shanghai, Zhengzhou and New York. The experimental results show that the proposed algorithm achieves street-level IP geolocation and, compared with existing typical street-level geolocation algorithms, improves geolocation reliability significantly.
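The second step above, intersecting the delay-based and coordinate-based clusterings, can be sketched as follows. The abstract does not name the clustering algorithms, so the cluster labels are taken as given; the function name and the dictionary layout are assumptions for illustration only.

```python
from collections import defaultdict

def intersect_clusterings(delay_labels, geo_labels):
    """Group landmarks that share BOTH a delay cluster and a geographic cluster.

    delay_labels, geo_labels: dicts mapping landmark id -> cluster label.
    Returns a dict mapping (delay_label, geo_label) -> list of landmark ids.
    Each group is a candidate training set (hypothetical layout).
    """
    groups = defaultdict(list)
    for landmark, d_label in delay_labels.items():
        if landmark in geo_labels:  # skip landmarks missing from either clustering
            groups[(d_label, geo_labels[landmark])].append(landmark)
    return dict(groups)

# Example with made-up landmark ids and labels.
delay_labels = {"lm1": 0, "lm2": 0, "lm3": 1, "lm4": 1}
geo_labels = {"lm1": "A", "lm2": "B", "lm3": "B", "lm4": "B"}
training_sets = intersect_clusterings(delay_labels, geo_labels)
```

In this example "lm3" and "lm4" end up in the same training set because they agree in both clusterings, while "lm1" and "lm2" are separated even though their delays are similar.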
IP geolocation is essential for the territorial analysis of sensitive network entities, location-based services (LBS) and network fraud detection. It has important theoretical significance and application value. Measurement-based IP geolocation is a hot research topic. However, existing IP geolocation algorithms cannot effectively utilize the distance characteristics of delay or the connection relations between nodes, which results in high geolocation error. It is challenging to obtain the mapping between delay, node connection relations, and geographical location. Based on the idea of network representation learning, we propose a representation learning model for IP nodes (IP2vec for short) and apply it to street-level IP geolocation. The IP2vec model vectorizes nodes according to the connection relations and delay between them, so that the IP vectors reflect the distance and topological proximity between IP nodes. The street-level IP geolocation algorithm based on the IP2vec model proceeds as follows. First, we measure the landmarks and the target IP to obtain delay and path information and construct the network topology. Second, we use the IP2vec model to obtain the IP vectors from the network topology. Third, we train a neural network to fit the mapping between the vectors and the locations of the landmarks. Finally, the vector of the target IP is fed into the neural network to obtain the geographical location of the target IP. The algorithm can accurately infer the geographical locations of target IPs from the delay and topological proximity embedded in the IP vectors. Cross-validation experiments on 10023 target IPs in New York, Beijing, Hong Kong, and Zhengzhou demonstrate that the proposed algorithm achieves street-level geolocation. Compared with existing algorithms such as Hop-Hot, IP-geolocater and SLG, the mean geolocation error of the proposed algorithm is reduced by 33%, 39%, and 51%, respectively.
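The first step above, turning delay and path measurements into a network topology, might look like the sketch below. The IP2vec embedding itself is not specified in this listing and is not reproduced; the edge weighting (difference between consecutive hop RTTs, keeping the minimum observed) is an assumption chosen for the example, as are all names.

```python
from collections import defaultdict

def build_topology(paths):
    """Build an undirected weighted graph from traceroute-style paths.

    paths: list of hop lists [(ip, rtt_ms), ...], ordered from the probe outward.
    Returns graph[u][v] = smallest observed link delay in ms between u and v.
    Link delay is approximated as the RTT difference of consecutive hops,
    clamped at zero (an assumption; real measurements are noisier).
    """
    graph = defaultdict(dict)
    for hops in paths:
        for (u, rtt_u), (v, rtt_v) in zip(hops, hops[1:]):
            link = max(rtt_v - rtt_u, 0.0)
            best = min(link, graph[u].get(v, link))
            graph[u][v] = best
            graph[v][u] = best
    return dict(graph)

# Example with made-up node names and delays.
paths = [
    [("probe", 0.0), ("r1", 2.0), ("target", 5.0)],
    [("probe", 0.0), ("r1", 1.5), ("landmark", 4.0)],
]
topology = build_topology(paths)
```

A graph of this shape is the usual input to node-embedding methods, which is where a model such as IP2vec would take over.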
IP geolocation determines the geographical location of Internet hosts from their IP addresses. It is widely used in targeted advertising, online fraud detection, cyber-attack attribution and so on, and it has gained much more attention in recent years as more and more physical devices are connected to cyberspace. Most geolocation methods cannot guarantee geolocation accuracy for devices with few landmarks nearby. In this paper, we propose a novel geolocation approach that uses common routers as secondary landmarks (Common Routers-based Geolocation, CRG). We find a large number of common routers through topology discovery among web server landmarks. We use statistical learning to study the localized (delay, hop)-distance correlation and locate these common routers. We then determine the accurate positions of the common routers and convert them into secondary landmarks, which improves the feasibility of our geolocation system in areas where landmarks are sparsely distributed. Compared with one of the state-of-the-art geolocation methods, we improve geolocation accuracy and decrease the maximum geolocation error. At the end of the paper, we discuss why our method is effective and outline our future research.
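Finding common routers from topology discovery can be sketched as below. The abstract does not define precisely when a router counts as "common", so this example assumes a router qualifies if it appears on measured paths to at least `min_landmarks` distinct landmarks; that threshold and all names are hypothetical.

```python
from collections import defaultdict

def find_common_routers(paths_by_landmark, min_landmarks=2):
    """Return routers that appear on paths to several distinct landmarks.

    paths_by_landmark: dict mapping landmark id -> list of paths, where each
    path is a list of router identifiers ending at (or near) the landmark.
    Returns a dict mapping router -> set of landmarks whose paths traverse it.
    """
    seen = defaultdict(set)
    for landmark, paths in paths_by_landmark.items():
        for path in paths:
            for router in path:
                if router != landmark:  # the landmark itself is not a router
                    seen[router].add(landmark)
    return {r: lms for r, lms in seen.items() if len(lms) >= min_landmarks}

# Example with made-up identifiers.
paths_by_landmark = {
    "web1": [["r1", "r2", "web1"]],
    "web2": [["r1", "r3", "web2"]],
    "web3": [["r4", "r3", "web3"]],
}
common = find_common_routers(paths_by_landmark)
```

Routers selected this way would still need to be located, which the abstract does through a learned (delay, hop)-distance correlation, before they can serve as secondary landmarks.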
High-density, reliable street-level landmarks are one of the important foundations of street-level geolocation. However, existing methods cannot obtain enough street-level landmarks in a short period of time. In this paper, a street-level landmark acquisition method based on SVM (Support Vector Machine) classifiers is proposed. First, the port detection results of IPs with known services are vectorized, and the resulting vectors are used as input for SVM training. Then, the kernel function and penalty factor are tuned while training the SVM classifiers, and the optimal classifiers are obtained. After that, a classifier sequence is constructed and used to classify IPs with unknown services. Finally, according to the domain name corresponding to each IP, the relationship between the classified server IP and the organization name is established. Experimental results in the cities of Guangzhou and Wuhan in China show that the proposed method can serve as a supplement to existing typical methods: the number of street-level landmarks obtained increases substantially, and the median geolocation error using the evaluated landmarks is reduced by about 2 km.
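The vectorization step can be illustrated with a simple binary encoding of port detection results. The abstract does not state which ports are probed or how results are encoded, so the port list and the 0/1 encoding below are assumptions; the SVM training itself is not shown.

```python
# Hypothetical list of probed ports; the paper's actual list is not given here.
PROBED_PORTS = (21, 22, 25, 80, 443, 3306)

def vectorize_ports(open_ports, port_list=PROBED_PORTS):
    """Encode a port scan result as a fixed-length 0/1 feature vector.

    open_ports: iterable of port numbers found open on one IP.
    port_list: ordered ports that define the vector positions.
    """
    open_set = set(open_ports)
    return [1 if port in open_set else 0 for port in port_list]

# An IP with only web ports open.
web_like = vectorize_ports([80, 443])
```

Vectors of this form, labeled with the known service type of each IP, are the kind of input an SVM classifier with a tunable kernel and penalty factor would be trained on.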
Funding (RTD-Corr paper): Supported by the National Natural Science Foundation of China (61379151, 61274189, 61302159 and 61401512), the Excellent Youth Foundation of Henan Province of China (144100510001), and the Foundation of Science and Technology on Information Assurance Laboratory (KJ-14-108).
Funding (landmark-clustering paper): Supported by the National Key R&D Program of China 2016YFB0801303 (F.L. received the grant; the sponsor's website is https://service.most.gov.cn/), by the National Key R&D Program of China 2016QY01W0105 (X.L. received the grant; the sponsor's website is https://service.most.gov.cn/), by the National Natural Science Foundation of China U1636219 (X.L. received the grant; the sponsor's website is http://www.nsfc.gov.cn/), by the National Natural Science Foundation of China 61602508 (J.L. received the grant; the sponsor's website is http://www.nsfc.gov.cn/), by the National Natural Science Foundation of China 61772549 (F.L. received the grant; the sponsor's website is http://www.nsfc.gov.cn/), by the National Natural Science Foundation of China U1736214 (F.L. received the grant; the sponsor's website is http://www.nsfc.gov.cn/), by the National Natural Science Foundation of China U1804263 (X.L. received the grant; the sponsor's website is http://www.nsfc.gov.cn/), and by the Science and Technology Innovation Talent Project of Henan Province 184200510018 (X.L. received the grant; the sponsor's website is http://www.hnkjt.gov.cn/).
Funding (IP2vec paper): Supported by the National Natural Science Foundation of China (Grant Nos. U1804263, U1736214, 62172435) and the Zhongyuan Science and Technology Innovation Leading Talent Project (No. 214200510019).
Funding (SVM landmark-acquisition paper): The work presented in this paper is supported by the National Key R&D Program of China [Nos. 2016YFB0801303, 2016QY01W0105], the National Natural Science Foundation of China [Nos. U1636219, U1804263, 61602508, 61772549, U1736214, 61572052], and the Plan for Scientific Innovation Talent of Henan Province [No. 2018JR0018].