Funding: This work was supported in part by the National Natural Science Foundation of China (Nos. 61622101 and 61571020) and in part by the Natural Science Foundation (Nos. DMS-1521746 and DMS-1737795).
Abstract: Mobile big data collected by mobile network operators is of interest to many research communities and industries for its considerable value. However, such spatiotemporal information may pose a serious threat to subscribers' privacy. This work assesses subscriber privacy vulnerability in terms of user identifiability across two datasets, using a mobility representation with significantly reduced detail. We propose a semantic spatiotemporal representation for each subscriber based on geographic information, termed the daily habitat region, which approximates the subscriber's daily mobility coverage with far less information than the original mobility traces. The daily habitat region is obtained by extracting the convex hull of the user's daily spatiotemporal traces. User identification can then be formulated as matching the two records with the maximum similarity score between two sets of convex hulls, where the score is obtained from our proposed similarity measures based on cosine distance and a permutation hypothesis test. Experiments evaluate the proposed mobility representation and user identification algorithms, and demonstrate that subscribers' mobile privacy is under severe threat even with significantly reduced spatiotemporal information.
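The convex hull step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the planar (x, y) coordinates, the function names, and the feature vector (mean vertex position plus hull area) used for the cosine comparison are all assumptions made for illustration, and the permutation hypothesis test is not shown.

```python
from math import sqrt


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def hull_area(hull):
    """Polygon area by the shoelace formula."""
    n = len(hull)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def hull_features(hull):
    """Hypothetical descriptor: mean vertex x, mean vertex y, hull area."""
    n = len(hull)
    mean_x = sum(x for x, _ in hull) / n
    mean_y = sum(y for _, y in hull) / n
    return (mean_x, mean_y, hull_area(hull))


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# One day of (x, y) positions for a subscriber; (1, 1) lies inside the hull.
day_trace = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
habitat = convex_hull(day_trace)
print(habitat, hull_area(habitat))
```

In this sketch the interior point is discarded, so five trace points reduce to four hull vertices. That reduction is the sense in which a hull-based region carries less information than the raw trace.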
Funding: The European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No. 824019; the China Scholarship Council (CSC); the Fundamental Research Funds for Central Universities (Nos. 2020JJ014 and YY19SSK05).
Abstract: Identifying an unfamiliar caller's profession is important for protecting citizens' personal safety and property. Owing to the limited data protection of various popular online services in some countries, such as taxi hailing and takeout ordering, many users now receive an increasing number of phone calls from strangers. The situation may be aggravated when criminals pretend to be service delivery staff, threatening individual users as well as society. In addition, many people experience excessive digital marketing and fraudulent phone calls because of personal information leakage. However, previous work on malicious call detection focused only on binary classification, which does not work for identifying multiple professions. We observed that web service requests issued from users' mobile phones may reveal their application preferences, spatial and temporal patterns, and other profession-related information. This offers researchers and engineers a hint for identifying unfamiliar callers. In fact, some previous works already leveraged raw data from mobile phones (which includes sensitive information) for personality studies. However, accessing users' raw mobile phone data may violate increasingly strict private data protection policies and regulations (e.g., the General Data Protection Regulation). We observe that appropriate statistical methods offer an effective means of eliminating private information while preserving personal characteristics, thus enabling the identification of types of mobile phone callers without privacy concerns. In this paper, we develop CPFinder, a system that exploits privacy-preserving mobile data to automatically identify callers, who are divided into four categories: taxi drivers, delivery and takeout staff, telemarketers and fraudsters, and normal users (other professions).
Our evaluation on an anonymized dataset of 1,282 users over a period of 3 months in Shanghai shows that CPFinder can achieve accuracies of more than 75.0% and 92.4% for multiclass and binary classification, respectively.
Funding: Supported in part by the Guangxi Science and Technology Program (GuiKe 2021AB30019); the Sichuan Science and Technology Program (2022YFN0031, 2023YFN0022, and 2023YFS0381); the Hubei Key R&D Plan (2022BAA048); the Zhuhai Industry-University-Research Cooperation Project of China (ZH22017001210098PWC); and the Shanxi Science and Technology Program (202201150401020).
Abstract: Many existing efforts have taken advantage of large-scale spatial-temporal data to partition cities via constructed human interaction networks. However, few studies focus on communities emerging between adjacent cities in large urban agglomerations, which we call "cross-city" communities. In this study, we introduce a novel framework that detects cross-city communities in urban agglomerations at different scales, leveraging a large volume of fine-grained mobile signaling data and aiming to break the original administrative boundaries. Taking the Pearl River Delta (PRD) urban agglomeration in China as the study area, we investigate the existence of potential communities at three scales: city-group level, city level, and sub-city level. The partition results are expected to benefit transportation planning, urban zoning, and administrative boundary re-delineation. The results of our study highlight the necessity of considering cross-city communities and their scale effects when examining urban spatial interactions.
Funding: Supported by project PID2019-109152GB-I00, financed by Ministerio de Ciencia, Innovación y Universidades, Spain (MCIN/AEI/10.13039/501100011033), and by project UHU-1266216 (FEDER 2014-2020), financed by Junta de Andalucía and Universidad de Huelva.
Abstract: The impact of bike sharing systems (BSS) on urban mobility, and their study as part of the overall transport system in smart cities, has attracted significant academic interest in recent years. However, the lack of historical and standardized data in current service tools hinders the analysis and improvement of these platforms, for instance by reusing data-based technical solutions. The big data nature (in volume, variety, and velocity) of collecting BSS historical information must also be addressed in order to take an integrated perspective. This paper describes an integrated solution to this challenge by (1) proposing a unified station status concept for recording historical information, based on the identification, study, and unification of common relevant fields found in almost all BSS data warehouses, and (2) implementing a big data-inspired ETL infrastructure together with a storage optimization methodology, which not only allows the previously defined concepts to be accessed and collected but also overcomes existing big data challenges when storing BSS information. The system also consumes other relevant external information, such as weather factors, which is aggregated to enhance the stored knowledge with KPIs and statistics. The developed solution manages over seven years of data from twenty-seven BSS, serving both machine-to-machine and human-computer communication and enabling data-driven solutions.
Funding: National Natural Science Foundation of China, No. 41971236.
Abstract: Rural vitality is the life force expressed by a combination of endogenous dynamics and external influences. Exploring the complex relationship between rural functions, elements, and flows can help achieve sustainable rural development. This study constructed a theoretical framework guided by the four functions of production, living, ecology, and culture, with the support of mobile big data. Furthermore, the network centrality of villages was estimated to reflect the intensity of urban-rural social mobility ties. The results indicated marked spatial disparities in rural vitality, and the coupling of ecological and cultural vitality showed a high degree of coherence. Four rural vitality grades were identified: high level (38, 14.08%), medium-high level (66, 24.44%), medium-low level (110, 40.74%), and low level (56, 20.74%), covering 270 administrative village units. The flow intensity of social linkage elements was consistent with rural vitality, and the socioeconomic spillover effect of urban centers on neighboring villages was noticeable. Topographic complexity negatively affected the living function, mainly in the T1 and T2 terrain gradients; the rural ecological function was not fully correlated with urban adjacency, and proximity could lead to adverse effects such as urban sprawl and resource destruction. The contribution of this study is to explore the importance of "flow" by using mobile big data to refine the evaluation unit to the village scale. Accelerating the construction of network coverage and information interconnection, and promoting the flow of people, transportation, and information between urban and rural areas, are important ways to enhance rural vitality.
Funding: Supported by the National Key R&D Program of China (2022YFC3802600 and 2022YFC3802603) and the China National Forestry and Grassland Administration, Forestry and Grassland Science and Technology Youth Talent Project (2024132024).
Abstract: Urban greenspace has a profound impact on public health by purifying the air, blocking bacteria, and creating activity venues. Because people's locations differ, the greenspace exposure of different age groups changes over time. In this study, we combined the NDVI (normalized difference vegetation index) and GVI (green view index) green indices with mobile signaling big data to evaluate the greenspace exposure of 3 age groups in Shanghai at different times, using a dynamic assessment model for greenspace exposure. April 2021 and April 2022 were selected as the study periods, representing the non-lockdown period and the lockdown period, respectively. The results indicate that greenspace exposure changed slightly during the lockdown period. During lockdown, the NDVI exposure of the age groups 31 to 50 and 51 and above was higher than during non-lockdown, whereas the NDVI exposure of people aged 0 to 30 was lower than during non-lockdown. The GVI exposure of people aged 51 and above was lower than that of the other age groups. Whether under lockdown or not, the NDVI exposure from 8:00 to 17:00 was slightly higher than at other hours. The GVI exposure fluctuates steadily from 6:00 to 24:00. This study enriches the evaluation dimensions of urban greenspace exposure.
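For reference, NDVI is conventionally computed from near-infrared and red surface reflectance. The second expression below is only one common way to write a population-weighted exposure at hour t. It is an assumption for illustration, not the dynamic assessment model of the study, and the symbols P_i(t) (people located in grid cell i at hour t, e.g. from mobile signaling counts) and G_i (green index of cell i) are hypothetical.

```latex
\mathrm{NDVI} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{Red}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{Red}}},
\qquad
E(t) = \frac{\sum_{i} P_i(t)\, G_i}{\sum_{i} P_i(t)}
```

NDVI ranges from -1 to 1. Under the assumed form, E(t) changes over the day only because the population distribution P_i(t) shifts between cells, which is consistent with the time-of-day differences the abstract reports.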