Software crowdsourcing (SW CS) is an evolving software development paradigm in which crowds of people are asked to solve various problems through an open call (with the encouragement of prizes for the top solutions). Because of its dynamic nature, SW CS has been progressively accepted and adopted in the software industry. However, issues pertinent to the understanding of requirements among crowds of people and requirements engineers are yet to be clarified and explained. If the requirements are not clear to the development team, this significantly affects the quality of the software product. This study aims to identify the potential challenges faced by requirements engineers when conducting the SW CS-based requirements engineering (RE) process. Moreover, solutions to overcome these challenges are also identified. Qualitative data analysis is performed on interview data collected from software industry professionals. Consequently, 20 SW CS-based RE challenges and their proposed solutions are identified and grouped under seven categories. This study benefits academicians, researchers and practitioners by providing detailed SW CS-based RE challenges and corresponding solutions that could guide them to understand and effectively implement RE in SW CS.
The prevalence of smartphones and improvements in wireless networks promote the use of crowdsourced live streaming, where individual users act as live streaming sources to broadcast themselves online. Characterizing the performance of such systems and identifying their bottlenecks can shed light on system design and performance optimization. The TCP performance of a commercial crowdsourced live streaming system is examined by analyzing packet-level traces collected at streaming servers. TCP stalls that heavily hurt users' QoE have been identified. In particular, TCP stalls account for as much as 31.6% of the flow completion time for upload flows and result in abandonment of uploads on the corresponding channels. Stalls caused by timeout retransmissions are further dissected, and timeout retransmission characteristics are revealed to depend on the video encoding methods. These findings provide new insights into crowdsourced live streaming systems and can guide designers to improve TCP efficiency.
The knowledge garnered in environmental science plays a crucial part in informing decision-making in various fields, including agriculture, transportation, energy, public health and safety, and more. Understanding the basic processes in each of these fields relies greatly on progress being made in conceptual, observational and technological approaches. However, existing instruments for environmental observations are often limited as a result of technical and practical constraints. Current technologies, including remote sensing systems and ground-level measuring means, may suffer from obstacles such as low spatial representativity or a lack of precision when measuring near ground level. These constraints often limit the ability to carry out extensive meteorological observations and, as a result, the capacity to deepen the existing understanding of atmospheric phenomena and processes. Multi-system informatics and sensing technology have become increasingly distributed as they are embedded into our environment. As they become more widely deployed, these technologies create unprecedented data streams with extraordinary levels of coverage and immediacy, providing a growing opportunity to complement traditional observation techniques using the large volumes of data created. Commercial microwave links that comprise the data transfer infrastructure of cellular communication networks are an example of these types of systems. This viewpoint letter briefly reviews various works on the subject and presents aspects concerning the added value that may be obtained by integrating these new means, which are becoming available for the first time in this era, for studying and monitoring atmospheric phenomena.
The crowdsourcing-based WLAN indoor localization system has been widely promoted because it effectively reduces the workload of offline-phase data collection when constructing radio maps. However, inaccurate location annotation of the crowdsourced samples, invalid access points (APs) in the collected samples, uneven sample distribution, and diverse terminal devices can result in the construction of a wrong radio map. To address these problems, an effective WLAN indoor radio map construction scheme (WRMCS) based on crowdsourced samples is proposed. The WRMCS consists of 4 main modules: outlier detection, key AP selection, fingerprint interpolation, and terminal device calibration. Moreover, an online localization algorithm is put forward to estimate the position of the online test fingerprint. The simulation results show that the proposed scheme achieves higher localization accuracy than the peer schemes while remaining effective and robust.
A composite random variable is a product (or sum of products) of statistically distributed quantities. Such a variable can represent the solution to a multi-factor quantitative problem submitted to a large, diverse, independent, anonymous group of non-expert respondents (the "crowd"). The objective of this research is to examine the statistical distribution of solutions from a large crowd to a quantitative problem involving image analysis and object counting. Theoretical analysis by the author, covering a range of conditions and types of factor variables, predicts that composite random variables are distributed log-normally to an excellent approximation. If the factors in a problem are themselves distributed log-normally, then their product is rigorously log-normal. A crowdsourcing experiment devised by the author and implemented with the assistance of a BBC (British Broadcasting Corporation) television show yielded a sample of approximately 2000 responses consistent with a log-normal distribution. The sample mean was within ~12% of the true count. However, a Monte Carlo simulation (MCS) of the experiment, employing either normal or log-normal random variables as factors to model the processes by which a crowd of 1 million might arrive at their estimates, resulted in a visually perfect log-normal distribution with a mean response within ~5% of the true count. The results of this research suggest that a well-modeled MCS, by simulating a sample of responses from a large, rational, and incentivized crowd, can provide a more accurate solution to a quantitative problem than might be attainable by direct sampling of a smaller crowd or an uninformed crowd, irrespective of size, that guesses randomly.
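The claim that a product of log-normal factors is itself log-normal can be checked numerically. The following is a minimal Monte Carlo sketch, not the author's simulation: the three factors and their `(mu, sigma)` parameters are hypothetical, and the function name `simulate_composite` is introduced here only for illustration. If the product is log-normal, its logarithm should be normal with mean equal to the sum of the factor means and variance equal to the sum of the factor variances.

```python
import math
import random
import statistics

def simulate_composite(n_samples, factor_params, seed=0):
    """Draw n_samples products of independent log-normal factors.

    factor_params is a list of (mu, sigma) pairs describing the
    logarithm of each factor (hypothetical values, not from the paper).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        product = 1.0
        for mu, sigma in factor_params:
            product *= rng.lognormvariate(mu, sigma)
        samples.append(product)
    return samples

# Hypothetical factors of a counting problem (e.g. rows, columns, density).
factors = [(1.0, 0.3), (2.0, 0.2), (0.5, 0.4)]
samples = simulate_composite(20000, factors)
logs = [math.log(x) for x in samples]

# Theory: log of the product ~ Normal(sum of mu, sqrt(sum of sigma^2)).
expected_mu = sum(mu for mu, _ in factors)
expected_sigma = math.sqrt(sum(sigma ** 2 for _, sigma in factors))

observed_mu = statistics.fmean(logs)
observed_sigma = statistics.stdev(logs)
print(f"log-mean:  observed {observed_mu:.3f}, expected {expected_mu:.3f}")
print(f"log-stdev: observed {observed_sigma:.3f}, expected {expected_sigma:.3f}")
```

With non-log-normal factors (for example, normal factors with small relative spread) the same check would only hold approximately, which matches the abstract's wording of "to an excellent approximation".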
In crowdsourced federated learning, differential privacy is commonly used to prevent the aggregation server from recovering training data from the models uploaded by clients, thereby preserving privacy. However, improper privacy budget settings and perturbation methods will severely impact model performance. To balance privacy preservation and model performance, we propose a novel architecture for crowdsourced federated learning with personalized privacy preservation. In our architecture, to avoid poor model performance due to excessive privacy preservation requirements, we establish a two-stage dynamic game between the task requestor and clients to formulate the optimal privacy preservation strategy, allowing each client to independently control its privacy preservation level. Additionally, we design a differential privacy perturbation mechanism based on weight priorities. It divides the weights based on their relevance to local data, applying different levels of perturbation to different types of weights. Finally, we conduct experiments on the proposed perturbation mechanism, and the experimental results indicate that our approach can achieve better global model performance with the same privacy budget.
Map matching has been widely investigated in indoor pedestrian navigation to improve positioning accuracy and robustness. This paper proposes an accurate map matching algorithm based on activity detection and crowdsourced Wi-Fi (AiFiMatch). Firstly, by taking indoor road segments between activity-related locations as nodes, and the activity type from one road segment to another as a directed edge, the indoor floor plan is abstracted as a directed graph. Secondly, the smartphone's motion sensors are utilized to detect different activities based on a decision tree, and then the pedestrian's walking trajectory is divided into a sub-trajectory sequence according to location-related activities. Finally, the sub-trajectory sequence is matched to the directed graph of the indoor floor plan to position the pedestrian by using a Hidden Markov Model (HMM). Simultaneously, Wi-Fi fingerprints are bound to road segments based on timestamps. Through crowdsourcing, a radio map of indoor road segments is constructed. The radio map in turn improves the HMM-based map matching algorithm. AiFiMatch is evaluated in experiments using smartphones in a teaching building. Experimental results show that the pedestrian can be accurately tracked even without knowing the starting position, and that AiFiMatch is robust to a certain degree of step length and heading direction errors.
In crowdsourced mobile application testing, workers are often inexperienced in and unfamiliar with software testing. Meanwhile, workers edit test reports in descriptive natural language on mobile devices. Thus, these test reports generally lack important details and make it hard for developers to understand the bugs. To improve the quality of inspected test reports, we pose a new problem of test report augmentation, which leverages the additional useful information contained in duplicate test reports. In this paper, we propose a new framework named test report augmentation framework (TRAF) to address the problem. First, natural language processing (NLP) techniques are adopted to preprocess the crowdsourced test reports. Then, three strategies are proposed to augment the environments, inputs, and descriptions of the inspected test reports, respectively. Finally, we visualize the augmented test reports to help developers distinguish the added information. To evaluate TRAF, we conduct experiments over five industrial datasets with 757 crowdsourced test reports. Experimental results show that TRAF can recommend relevant inputs to augment the inspected test reports with 98.49% in terms of NDCG and 88.65% in terms of precision on average, and identify valuable sentences from the descriptions of duplicates to augment the inspected test reports with 83.58% in terms of precision, 77.76% in terms of recall, and 78.72% in terms of F-measure on average. Meanwhile, empirical evaluation also demonstrates that augmented test reports can help developers understand and fix bugs better.
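The evaluation measures named in the TRAF abstract (precision, recall, F-measure, NDCG) can be illustrated with a small sketch. This is a generic textbook formulation, assuming set-based precision and recall and a log2-discounted NDCG over graded relevance; the abstract does not state which exact variants the authors used, and the function names and sample data below are hypothetical.

```python
import math

def precision_recall_f1(recommended, relevant):
    """Set-based precision, recall and F1 for a list of recommended items."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def ndcg(gains):
    """Normalized discounted cumulative gain for gains given in ranked order."""
    def dcg(values):
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

# Hypothetical example: four recommended sentences, three truly useful ones.
p, r, f = precision_recall_f1(["s1", "s2", "s3", "s4"], ["s1", "s3", "s5"])
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"ndcg of a perfect ranking: {ndcg([1, 1, 0]):.2f}")
```

NDCG rewards placing relevant items near the top of the ranking, which is why it suits the input recommendation task, while precision and recall treat the selected sentences as an unordered set.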
Currently, mobile devices (e.g., smartphones) are equipped with multiple wireless interfaces and rich built-in functional sensors that possess powerful computation and communication capabilities, and enable numerous Mobile Crowdsourced Sensing (MCS) applications. Generally, an MCS system is composed of three components: a publisher of sensing tasks, crowd participants who complete the crowdsourced tasks for some kind of reward, and the crowdsourcing platform that facilitates the interaction between publishers and crowd participants. Incentives are a fundamental issue in MCS. This paper proposes an integrated incentive framework for MCS, which appropriately utilizes three widely used incentive methods: reverse auction, gamification, and reputation updating. Firstly, a reverse-auction-based two-round participant selection mechanism is proposed to incentivize crowds to actively participate and provide high-quality sensing data. Secondly, in order to avoid untruthful publisher feedback about sensing-data quality, a gamification-based verification mechanism is designed to evaluate the truthfulness of the publisher's feedback. Finally, the platform updates the reputation of both participants and publishers based on their corresponding behaviors. This integrated incentive mechanism can motivate participants to provide high-quality sensed contents, stimulate publishers to give truthful feedback, and make the platform profitable.
High-definition maps have become a vital cornerstone in the navigation of autonomous vehicles in complex traffic scenarios. Thus, the construction of high-definition maps has become crucial. Traditional methods relying on expensive mapping vehicles equipped with high-end sensor equipment are not suitable for mass map construction because of their high cost. Hence, this paper proposes a new method to create a high-definition road semantics map using multi-vehicle sensor data. The proposed method implements crowdsourced point-based visual SLAM to align and combine the local maps derived by multiple vehicles. This allows users to modify the extraction process by using a more sophisticated neural network, thus achieving a more accurate detection result than the traditional binarization method. The resulting map consists of road marking points suitable for autonomous vehicle navigation and path-planning tasks. Finally, the method is evaluated on the real-world KAIST urban dataset and the Shougang dataset to demonstrate the level of detail and accuracy of the proposed map, with a mapping error of 0.369 m under ideal conditions.
Volunteered geographic information (VGI) can be considered a subset of crowdsourced data (CSD), and its popularity has recently increased in a number of application areas. Disaster management is one of its key application areas, in which the benefits of VGI and CSD are potentially very high. However, quality issues such as credibility, reliability and relevance are limiting many of the advantages of utilising CSD. Credibility issues arise as CSD come from a variety of heterogeneous sources including both professionals and untrained citizens. VGI and CSD are also highly unstructured, and the quality and metadata are often undocumented. In the 2011 Australian floods, the general public and disaster management administrators used the Ushahidi Crowd-mapping platform to extensively communicate flood-related information including hazards, evacuations, emergency services, road closures and property damage. This study assessed the credibility of the Australian Broadcasting Corporation's Ushahidi CrowdMap dataset using a Naïve Bayesian network approach based on models commonly used in spam email detection systems. The results of the study reveal that the spam email detection approach is potentially useful for CSD credibility detection, with an accuracy of over 90% using a forced classification methodology.
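To make the spam-filter analogy concrete, the following is a minimal sketch of a multinomial naive Bayes text classifier with Laplace smoothing, the kind of model commonly used for spam detection. It is an illustration only and not the study's model: the class `NaiveBayesText`, the labels, and the four training messages are all hypothetical, and the real CrowdMap dataset and features are not reproduced here.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Tiny multinomial naive Bayes over whitespace-separated words."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus sum of log likelihoods (add-one smoothing).
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            for word in doc.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training reports; real data would come from the crowd map.
train_docs = [
    "road closed due to flooding at main street",
    "evacuation centre open at town hall",
    "win free prize click here",
    "free money click now",
]
train_labels = ["credible", "credible", "not_credible", "not_credible"]

model = NaiveBayesText().fit(train_docs, train_labels)
print(model.predict("flooding on main street road closed"))
print(model.predict("click here for free prize"))
```

A "forced classification" in this sketch corresponds to always returning the higher-scoring label, with no option to abstain on uncertain reports.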
Cross-domain routing in Integrated Heterogeneous Networks (Inte-HetNet) should ensure efficient and secure data transmission across different network domains by satisfying diverse routing requirements. However, current solutions face numerous challenges in continuously ensuring trustworthy routing, fulfilling diverse requirements, achieving reasonable resource allocation, and safeguarding against malicious behaviors of network operators. We propose CrowdRouting, a novel cross-domain routing scheme based on crowdsourcing, dedicated to establishing sustained trust in cross-domain routing, comprehensively considering and fulfilling various customized routing requirements, while ensuring reasonable resource allocation and effectively curbing malicious behavior of network operators. Concretely, CrowdRouting employs blockchain technology to verify the trustworthiness of border routers in different network domains, thereby establishing sustainable and trustworthy cross-domain routing based on sustained trust in these routers. In addition, CrowdRouting integrates a crowdsourcing mechanism into the auction for routing, achieving fair and impartial allocation of routing rights by flexibly embedding various customized routing requirements into each auction phase. Moreover, CrowdRouting leverages incentive mechanisms and routing settlement to encourage network domains to actively participate in cross-domain routing, thereby promoting optimal resource allocation and efficient utilization. Furthermore, CrowdRouting introduces a supervisory agency (e.g., undercover agent) to effectively suppress the malicious behavior of network operators through the game and interaction between the agent and the network operators. Through comprehensive experimental evaluations and comparisons with existing works, we demonstrate that CrowdRouting excels in providing trustworthy and fine-grained customized routing services, stimulating active participation in cross-domain routing, inhibiting malicious operator behavior, and maintaining reasonable resource allocation, outperforming baseline schemes in all of these respects.
Camera-equipped mobile devices are encouraging people to take more photos, and the development and growth of social networks is making it increasingly popular to share photos online. When objects appear in overlapping Fields Of View (FOV), this means that they are drawing much attention and thus indicates their popularity. Successfully discovering and locating these objects can be very useful for many applications, such as criminal investigations, event summaries, and crowdsourcing-based Geographical Information Systems (GIS). Existing methods require either prior knowledge of the environment or intentional photographing. In this paper, we propose a seamless approach called 'Spotlight', which performs passive localization using crowdsourced photos. Using a graph-based model, we combine object images across multiple camera views. Within each set of combined object images, a photographing map is built on which object localization is performed using plane geometry. We evaluate the system's localization accuracy using photos taken in various scenarios, with the results showing our approach to be effective for passive object localization and to achieve a high level of accuracy.
Crowdsourcing is an effective method to obtain large databases of manually-labeled images, which is especially important for image understanding with supervised machine learning algorithms. However, for several kinds of image labeling tasks, e.g., dog breed recognition, it is hard to achieve high-quality results. Further optimizing the crowdsourcing workflow therefore mainly involves task allocation and result inference. For task allocation, we design a two-round crowdsourcing framework, which contains a smart decision mechanism based on information entropy to determine whether to perform the second round of task allocation. Regarding result inference, after quantifying the similarity of all labels, two graphical models are proposed to describe the labeling process, and corresponding inference algorithms are designed to further improve the result quality of image labeling. Extensive experiments on real-world tasks in Crowdflower and synthetic datasets were conducted. The experimental results demonstrate the superiority of these methods in comparison with state-of-the-art methods.
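The entropy-based decision for the second round can be sketched as follows. This is a simplified illustration, assuming the decision compares the Shannon entropy of the first-round labels for one image against a fixed threshold; the threshold value `0.9`, the function names, and the sample labels are hypothetical, and the paper's actual mechanism may differ.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def needs_second_round(labels, threshold=0.9):
    """Request more labels when first-round workers disagree too much.

    The threshold is a hypothetical tuning parameter.
    """
    return label_entropy(labels) > threshold

# First-round answers for two hypothetical dog-breed images.
mostly_agree = ["husky", "husky", "husky", "malamute"]
evenly_split = ["husky", "malamute", "husky", "malamute"]

print(label_entropy(mostly_agree), needs_second_round(mostly_agree))
print(label_entropy(evenly_split), needs_second_round(evenly_split))
```

Low entropy means the workers largely agree, so spending budget on a second round is unlikely to change the inferred label; high entropy flags images where extra labels are most informative.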
Missing value imputation with crowdsourcing is a novel method in data cleaning to capture missing values that could hardly be filled with automatic approaches. However, the time cost and overhead in crowdsourcing are high. Therefore, we have to reduce cost while guaranteeing the accuracy of crowdsourced imputation. To achieve this optimization goal, we present COSSET+, a crowdsourced framework optimized by a knowledge base. We combine the advantages of both a knowledge-based filter and a crowdsourcing platform to capture missing values. Since the amount of crowd values will affect the cost of COSSET+, we aim to select only a subset of the missing values to be crowdsourced. We prove that the crowd value selection problem is NP-hard and develop an approximation algorithm for it. Extensive experimental results demonstrate the efficiency and effectiveness of the proposed approaches.
Crowdsourcing holds broad applications in information acquisition and dissemination, yet encounters challenges pertaining to data quality assessment and user reputation management. Reputation mechanisms stand as crucial solutions for appraising and updating participant reputation scores, thereby elevating the quality and dependability of crowdsourced data. However, these mechanisms face several challenges in traditional crowdsourcing systems: 1) platform security lacks robust guarantees and may be susceptible to attacks; 2) there exists a potential for large-scale privacy breaches; and 3) incentive mechanisms relying on reputation scores may encounter issues, as reputation updates hinge on task demander evaluations, occasionally lacking a dedicated reputation update module. This paper introduces a reputation update scheme tailored for crowdsourcing, with a focus on proficiently overseeing participant reputations and alleviating the impact of malicious activities on the sensing system. Here, the reputation update scheme is determined by an Empirical Cumulative distribution-based Outlier Detection method (ECOD). Our scheme embraces a blockchain-based crowdsourcing framework utilizing a homomorphic encryption method to ensure data transparency and tamper-resistance. Computation of user reputation scores relies on their behavioral history, actively discouraging undesirable conduct. Additionally, we introduce a dynamic weight incentive mechanism that mirrors alterations in participant reputation, enabling the system to allocate incentives based on user behavior and reputation. Our scheme undergoes evaluation on 11 datasets, revealing substantial enhancements in data credibility for crowdsourcing systems and a reduction in the influence of malicious behavior. This research not only presents a practical solution for crowdsourcing reputation management but also offers valuable insights for future research and applications, holding promise for fostering more reliable and high-quality data collection in crowdsourcing across diverse domains.
The proliferation of intelligent, connected Internet of Things (IoT) devices facilitates data collection. However, task workers may be reluctant to participate in data collection due to privacy concerns, and task requesters may be concerned about the validity of the collected data. Hence, it is vital to evaluate the quality of the data collected by the task workers while protecting privacy in spatial crowdsourcing (SC) data collection tasks with IoT. To this end, this paper proposes a privacy-preserving data reliability evaluation for SC in IoT, named PARE. First, we design a data uploading format using blockchain and the Paillier homomorphic cryptosystem, providing unchangeable and traceable data while overcoming privacy concerns. Secondly, based on the uploaded data, we propose a method to determine the approximate correct value region without knowing the exact value. Finally, we offer a data filtering mechanism based on the Paillier cryptosystem using this value region. The evaluation and analysis results show that PARE outperforms the existing solution in terms of performance and privacy protection.
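The property of the Paillier cryptosystem that makes such schemes possible is additive homomorphism: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. The toy sketch below illustrates only that property, using the common simplification g = n + 1 and deliberately tiny primes; it is not PARE's protocol, is not secure, and the function names are introduced here for illustration. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse of lambda mod n
    return n, (lam, mu)

def encrypt(n, m, rng):
    """Encrypt 0 <= m < n as (1 + m*n) * r^n mod n^2."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2

def decrypt(n, private_key, c):
    """Recover m as L(c^lambda mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n

def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return c1 * c2 % (n * n)

rng = random.Random(0)
n, private_key = keygen(1009, 1013)  # tiny demo primes, insecure on purpose

c_sum = add_encrypted(n, encrypt(n, 20, rng), encrypt(n, 22, rng))
total = decrypt(n, private_key, c_sum)
print(total)  # sum of the two hidden readings
```

In a data-collection setting, this lets an aggregator combine encrypted sensor readings without learning any individual value; only the key holder can decrypt the aggregate.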
Crowdsourcing technology is widely recognized for its effectiveness in task scheduling and resource allocation. While traditional methods for task allocation can help reduce costs and improve efficiency, they may encounter challenges when dealing with abnormal data flow nodes, leading to decreased allocation accuracy and efficiency. To address these issues, this study proposes a novel two-part invalid detection task allocation framework. In the first step, an anomaly detection model is developed using a dynamic self-attentive GAN to identify anomalous data. Compared to the baseline method, the model achieves an approximately 4% increase in the F1 value on the public dataset. In the second step of the framework, task allocation modeling is performed using a two-part graph matching method. This phase introduces a P-queue KM algorithm that implements a more efficient optimization strategy. The allocation efficiency is improved by approximately 23.83% compared to the baseline method. Empirical results confirm the effectiveness of the proposed framework in detecting abnormal data nodes, enhancing allocation precision, and achieving efficient allocation.
With the rapid development of mobile Internet, spatial crowdsourcing has become more and more popular. Spatial crowdsourcing consists of many different types of applications, such as spatial crowd-sensing services. Spatial crowd-sensing collects and analyzes traffic sensing data from clients like vehicles and traffic lights to construct intelligent traffic prediction models. Besides collecting sensing data, spatial crowdsourcing also includes spatial delivery services like DiDi and Uber. Appropriate task assignment and worker selection dominate the service quality for spatial crowdsourcing applications. Previous research conducted task assignments via traditional matching approaches or using simple network models. However, advanced mining methods are lacking to explore the relationship between workers, task publishers, and the spatio-temporal attributes in tasks. Therefore, in this paper, we propose a Deep Double Dueling Spatial-temporal Q Network (D3SQN) to adaptively learn the spatial-temporal relationship between tasks, task publishers, and workers in a dynamic environment to achieve optimal allocation. Specifically, D3SQN is revised through reinforcement learning by adding a spatial-temporal transformer that can estimate the expected state values and action advantages so as to improve the accuracy of task assignments. Extensive experiments are conducted over real data collected from DiDi and ELM, and the simulation results verify the effectiveness of our proposed models.
The potential of citizens as a source of geographical information has been recognized for many years. Such activity has grown recently due to the proliferation of inexpensive location-aware devices and an ability to share data over the internet. Recently, a series of major projects, often cast as citizen observatories, have helped explore and develop this potential for a wide range of applications. Here, some of the experiences and lessons gained from part of one such project, LandSense, which aimed to further the role of citizen science within Earth observation and help address environmental challenges, are shared. The key focus is on quality assurance of citizen-generated data on land use and land cover, especially to support analyses of remotely sensed data and products. Particular focus is directed to quality assurance checks on photographic image quality, privacy, polygon overlap, positional accuracy and offset, contributor agreement, and categorical accuracy. The discussion aims to provide good practice advice to aid future studies and help fulfil the full potential of citizens as a source of volunteered geographical information (VGI).
Funding: This research is funded by Taif University, TURSP-2020/115.
Funding: Supported by the National Basic Research Program of China (2012CB315801) and the National Natural Science Foundation of China (No. 6157060397).
Abstract: The prevalence of smartphones and the improvement of wireless networks promote the use of crowdsourced live streaming, where individual users act as live streaming sources to broadcast themselves online. Characterizing the performance of such systems and identifying their bottlenecks can shed light on system design and performance optimization. The TCP performance of a commercial crowdsourced live streaming system is examined by analyzing packet-level traces collected at streaming servers. TCP stalls that heavily hurt users' QoE are identified. In particular, TCP stalls account for as much as 31.6% of the flow completion time for upload flows and result in abandonment of uploads on the corresponding channels. Stalls caused by timeout retransmissions are further dissected, and timeout retransmission characteristics are shown to depend on the video encoding methods. These findings provide new insights into crowdsourced live streaming systems and can guide designers in improving TCP efficiency.
Abstract: The knowledge garnered in environmental science plays a crucial part in informing decision-making in various fields, including agriculture, transportation, energy, public health and safety, and more. Understanding the basic processes in each of these fields relies greatly on progress being made in conceptual, observational and technological approaches. However, existing instruments for environmental observations are often limited as a result of technical and practical constraints. Current technologies, including remote sensing systems and ground-level measuring means, may suffer from obstacles such as low spatial representativity or a lack of precision when measuring near ground level. These constraints often limit the ability to carry out extensive meteorological observations and, as a result, the capacity to deepen the existing understanding of atmospheric phenomena and processes. Multi-system informatics and sensing technology have become increasingly distributed as they are embedded into our environment. As they become more widely deployed, these technologies create unprecedented data streams with extraordinary levels of coverage and immediacy, providing a growing opportunity to complement traditional observation techniques using the large volumes of data created. Commercial microwave links that comprise the data transfer infrastructure of cellular communication networks are an example of these types of systems. This viewpoint letter briefly reviews various works on the subject and presents aspects concerning the added value that may be obtained as a result of the integration of these new means, which are becoming available for the first time in this era, for studying and monitoring atmospheric phenomena.
Funding: The National High Technology Research and Development Program of China (No. 2012AA120802); National Natural Science Foundation of China (No. 61771186); Postdoctoral Research Project of Heilongjiang Province (No. LBH-Q15121); Undergraduate University Project of Young Scientist Creative Talent of Heilongjiang Province (No. UNPYSCT-2017125).
Abstract: The crowdsourcing-based WLAN indoor localization system has been widely promoted because it effectively reduces the workload of offline-phase data collection when constructing radio maps. However, crowdsourced samples suffer from inaccurate location annotation, invalid access points (APs), uneven sample distribution, and diverse terminal devices, all of which can lead to an incorrect radio map. To address these problems, an effective WLAN indoor radio map construction scheme (WRMCS) based on crowdsourced samples is proposed. The WRMCS consists of four main modules: outlier detection, key AP selection, fingerprint interpolation, and terminal device calibration. Moreover, an online localization algorithm is put forward to estimate the position of the online test fingerprint. The simulation results show that the proposed scheme achieves higher localization accuracy than peer schemes while remaining effective and robust.
Abstract: A composite random variable is a product (or sum of products) of statistically distributed quantities. Such a variable can represent the solution to a multi-factor quantitative problem submitted to a large, diverse, independent, anonymous group of non-expert respondents (the “crowd”). The objective of this research is to examine the statistical distribution of solutions from a large crowd to a quantitative problem involving image analysis and object counting. Theoretical analysis by the author, covering a range of conditions and types of factor variables, predicts that composite random variables are distributed log-normally to an excellent approximation. If the factors in a problem are themselves distributed log-normally, then their product is rigorously log-normal. A crowdsourcing experiment devised by the author and implemented with the assistance of a BBC (British Broadcasting Corporation) television show yielded a sample of approximately 2000 responses consistent with a log-normal distribution. The sample mean was within ~12% of the true count. However, a Monte Carlo simulation (MCS) of the experiment, employing either normal or log-normal random variables as factors to model the processes by which a crowd of 1 million might arrive at their estimates, resulted in a visually perfect log-normal distribution with a mean response within ~5% of the true count. The results of this research suggest that a well-modeled MCS, by simulating a sample of responses from a large, rational, and incentivized crowd, can provide a more accurate solution to a quantitative problem than might be attainable by direct sampling of a smaller crowd or an uninformed crowd, irrespective of size, that guesses randomly.
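The log-normal claim in the abstract above can be illustrated with a small Monte Carlo sketch. This is not the author's simulation: the uniform factor range (0.5 to 1.5), the number of factors, the sample size, and the function names are all assumptions chosen for illustration. The idea is that if a product of independent positive factors is roughly log-normal, its logarithm should be far more symmetric than the product itself.

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: third central moment divided by sigma cubed."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


def simulate_products(n_samples, n_factors, seed=0):
    """Each sample is a product of independent uniform factors on (0.5, 1.5).

    The uniform factors are an illustrative assumption, not the factor
    model used in the study.
    """
    rng = random.Random(seed)
    return [
        math.prod(rng.uniform(0.5, 1.5) for _ in range(n_factors))
        for _ in range(n_samples)
    ]


products = simulate_products(n_samples=20_000, n_factors=6)
logs = [math.log(p) for p in products]

# A log-normal-like variable is right-skewed, while its log is close to symmetric.
print("skewness of products:     ", round(skewness(products), 3))
print("skewness of log(products):", round(skewness(logs), 3))
```

Replacing `rng.uniform(0.5, 1.5)` with `rng.lognormvariate(0.0, 0.3)` gives the case the abstract calls rigorously log-normal, where the log of the product is exactly normal.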
Funding: This work was supported by the National Natural Science Foundation of China (No. 62271072) and the Beijing Natural Science Foundation (No. 4232009).
Abstract: In crowdsourced federated learning, differential privacy is commonly used to prevent the aggregation server from recovering training data from the models uploaded by clients, thereby preserving privacy. However, improper privacy budget settings and perturbation methods severely impact model performance. To balance privacy preservation and model performance, we propose a novel architecture for crowdsourced federated learning with personalized privacy preservation. In our architecture, to avoid poor model performance caused by excessive privacy preservation requirements, we establish a two-stage dynamic game between the task requestor and clients to formulate the optimal privacy preservation strategy, allowing each client to independently control its privacy preservation level. Additionally, we design a differential privacy perturbation mechanism based on weight priorities. It divides the weights based on their relevance to local data and applies different levels of perturbation to different types of weights. Finally, we conduct experiments on the proposed perturbation mechanism, and the experimental results indicate that our approach achieves better global model performance with the same privacy budget.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61702288), the Natural Science Foundation of Tianjin in China (Grant No. 16JCQNJC00700), and the Fundamental Research Funds for the Central Universities.
Abstract: Map matching has been widely investigated in indoor pedestrian navigation to improve positioning accuracy and robustness. This paper proposes an accurate map matching algorithm based on activity detection and crowdsourced Wi-Fi (AiFiMatch). Firstly, by taking indoor road segments between activity-related locations as nodes, and the activity type from one road segment to another as a directed edge, the indoor floor plan is abstracted as a directed graph. Secondly, the smartphone's motion sensors are utilized to detect different activities based on a decision tree, and the pedestrian's walking trajectory is then divided into a sub-trajectory sequence according to location-related activities. Finally, the sub-trajectory sequence is matched to the directed graph of the indoor floor plan to position the pedestrian by using a Hidden Markov Model (HMM). Simultaneously, Wi-Fi fingerprints are bound to road segments based on timestamps. Through crowdsourcing, a radio map of indoor road segments is constructed. The radio map in turn improves the HMM-based map matching algorithm. AiFiMatch is evaluated by experiments using smartphones in a teaching building. Experimental results show that the pedestrian can be accurately tracked even without knowing the starting position and that AiFiMatch is robust to a certain degree of step length and heading direction errors.
Funding: This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 61370144, 61722202, 61403057, and 61772107) and the Jiangsu Prospective Project of Industry-University-Research (BY2015069-03). In addition, the authors thank the three graduate students who devoted their efforts to the data annotation.
Abstract: In crowdsourced mobile application testing, workers are often inexperienced in and unfamiliar with software testing. Meanwhile, workers edit test reports in descriptive natural language on mobile devices. Thus, these test reports generally lack important details and make it hard for developers to understand the bugs. To improve the quality of inspected test reports, we pose a new problem of test report augmentation by leveraging the additional useful information contained in duplicate test reports. In this paper, we propose a new framework named test report augmentation framework (TRAF) to address the problem. First, natural language processing (NLP) techniques are adopted to preprocess the crowdsourced test reports. Then, three strategies are proposed to augment the environments, inputs, and descriptions of the inspected test reports, respectively. Finally, we visualize the augmented test reports to help developers distinguish the added information. To evaluate TRAF, we conduct experiments over five industrial datasets with 757 crowdsourced test reports. Experimental results show that TRAF can recommend relevant inputs to augment the inspected test reports with 98.49% in terms of NDCG and 88.65% in terms of precision on average, and identify valuable sentences from the descriptions of duplicates to augment the inspected test reports with 83.58% in terms of precision, 77.76% in terms of recall, and 78.72% in terms of F-measure on average. Meanwhile, empirical evaluation also demonstrates that augmented test reports can help developers understand and fix bugs better.
Funding: Supported in part by the National Natural Science Foundation of China (No. 61171092) and in part by the Jiangsu Educational Bureau Project (No. 14KJA510004).
Abstract: Currently, mobile devices (e.g., smartphones) are equipped with multiple wireless interfaces and rich built-in functional sensors that possess powerful computation and communication capabilities and enable numerous Mobile Crowdsourced Sensing (MCS) applications. Generally, an MCS system is composed of three components: a publisher of sensing tasks, crowd participants who complete the crowdsourced tasks for some kind of reward, and the crowdsourcing platform that facilitates the interaction between publishers and crowd participants. Incentives are a fundamental issue in MCS. This paper proposes an integrated incentive framework for MCS, which appropriately utilizes three widely used incentive methods: reverse auction, gamification, and reputation updating. Firstly, a reverse-auction-based two-round participant selection mechanism is proposed to incentivize crowds to actively participate and provide high-quality sensing data. Secondly, in order to avoid untruthful publisher feedback about sensing-data quality, a gamification-based verification mechanism is designed to evaluate the truthfulness of the publisher's feedback. Finally, the platform updates the reputation of both participants and publishers based on their corresponding behaviors. This integrated incentive mechanism can motivate participants to provide high-quality sensed contents, stimulate publishers to give truthful feedback, and make the platform profitable.
Funding: This work was supported in part by the National Natural Science Foundation of China (U1864203, 61773234 and 52102464), in part by the China Postdoctoral Science Foundation (2019M660622), and in part by the International Science and Technology Cooperation Program of China (2019YFE0100200).
Abstract: High-definition maps have become a vital cornerstone in the navigation of autonomous vehicles in complex traffic scenarios. Thus, the construction of high-definition maps has become crucial. Traditional methods that rely on expensive mapping vehicles equipped with high-end sensors are not suitable for mass map construction because of their high cost. Hence, this paper proposes a new method to create a high-definition road semantics map using multi-vehicle sensor data. The proposed method implements crowdsourced point-based visual SLAM to align and combine the local maps derived by multiple vehicles. This allows users to modify the extraction process by using a more sophisticated neural network, thus achieving a more accurate detection result than the traditional binarization method. The resulting map consists of road marking points suitable for autonomous vehicle navigation and path-planning tasks. Finally, the method is evaluated on the real-world KAIST urban dataset and the Shougang dataset to demonstrate the level of detail and accuracy of the proposed map, with a mapping error of 0.369 m under ideal conditions.
Funding: The authors wish to acknowledge the Australian Government for supporting the research work through the Research Training Program (RTP), and Monique Potts, ABC Australia, for providing the 2011 Australian floods Ushahidi Crowdmap data.
Abstract: Volunteered geographic information (VGI) can be considered a subset of crowdsourced data (CSD), and its popularity has recently increased in a number of application areas. Disaster management is one of its key application areas, in which the benefits of VGI and CSD are potentially very high. However, quality issues such as credibility, reliability and relevance are limiting many of the advantages of utilising CSD. Credibility issues arise because CSD come from a variety of heterogeneous sources, including both professionals and untrained citizens. VGI and CSD are also highly unstructured, and the quality and metadata are often undocumented. In the 2011 Australian floods, the general public and disaster management administrators used the Ushahidi Crowd-mapping platform to extensively communicate flood-related information including hazards, evacuations, emergency services, road closures and property damage. This study assessed the credibility of the Australian Broadcasting Corporation's Ushahidi CrowdMap dataset using a Naïve Bayesian network approach based on models commonly used in spam email detection systems. The results of the study reveal that the spam email detection approach is potentially useful for CSD credibility detection, with an accuracy of over 90% using a forced classification methodology.
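As a rough illustration of the spam-filter style of classification mentioned in the abstract above, the following is a minimal bag-of-words Naive Bayes sketch with Laplace smoothing. It is not the study's model: the training sentences, the label names `credible` and `not_credible`, and the function names are invented for illustration, and the real Ushahidi dataset and feature set are not reproduced here.

```python
import math
from collections import Counter


def train(reports):
    """reports: list of (text, label). Returns per-label word counts and label counts."""
    word_counts = {}
    label_counts = Counter()
    for text, label in reports:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest Laplace-smoothed log-posterior."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_docs)  # log prior
        denom = sum(counts.values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / denom)  # smoothed likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training data (an assumption, not the 2011 flood reports).
training = [
    ("road closed by flood water near bridge", "credible"),
    ("evacuation centre open at town hall", "credible"),
    ("power lines down on main street", "credible"),
    ("win free prizes click here now", "not_credible"),
    ("click here for free flood photos prizes", "not_credible"),
]
model = train(training)
print(classify("bridge road closed", *model))
print(classify("free prizes click", *model))
```

Because `classify` always returns one of the known labels, every report is forced into a class, which is loosely analogous to the forced classification mentioned in the abstract.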
Funding: Supported in part by the National Natural Science Foundation of China under Grants U23A20300 and 62072351, in part by the Key Research Project of Shaanxi Natural Science Foundation under Grant 2023-JC-ZD-35, in part by the Concept Verification Funding of Hangzhou Institute of Technology of Xidian University under Grant GNYZ2024XX007, and in part by the 111 Project under Grant B16037.
Abstract: Cross-domain routing in Integrated Heterogeneous Networks (Inte-HetNet) should ensure efficient and secure data transmission across different network domains by satisfying diverse routing requirements. However, current solutions face numerous challenges in continuously ensuring trustworthy routing, fulfilling diverse requirements, achieving reasonable resource allocation, and safeguarding against malicious behaviors of network operators. We propose CrowdRouting, a novel cross-domain routing scheme based on crowdsourcing, dedicated to establishing sustained trust in cross-domain routing and fulfilling various customized routing requirements, while ensuring reasonable resource allocation and effectively curbing malicious behavior of network operators. Concretely, CrowdRouting employs blockchain technology to verify the trustworthiness of border routers in different network domains, thereby establishing sustainable and trustworthy cross-domain routing based on sustained trust in these routers. In addition, CrowdRouting integrates a crowdsourcing mechanism into the auction for routing, achieving fair and impartial allocation of routing rights by flexibly embedding various customized routing requirements into each auction phase. Moreover, CrowdRouting leverages incentive mechanisms and routing settlement to encourage network domains to actively participate in cross-domain routing, thereby promoting optimal resource allocation and efficient utilization. Furthermore, CrowdRouting introduces a supervisory agency (e.g., an undercover agent) to effectively suppress the malicious behavior of network operators through the game and interaction between the agent and the network operators. Through comprehensive experimental evaluations and comparisons with existing works, we demonstrate that CrowdRouting excels in providing trustworthy and fine-grained customized routing services, stimulating active participation in cross-domain routing, inhibiting malicious operator behavior, and maintaining reasonable resource allocation, outperforming baseline schemes in all of these respects.
Abstract: Camera-equipped mobile devices are encouraging people to take more photos, and the development and growth of social networks is making it increasingly popular to share photos online. When objects appear in overlapping Fields Of View (FOV), they are drawing much attention, which indicates their popularity. Successfully discovering and locating these objects can be very useful for many applications, such as criminal investigations, event summaries, and crowdsourcing-based Geographical Information Systems (GIS). Existing methods require either prior knowledge of the environment or intentional photographing. In this paper, we propose a seamless approach called 'Spotlight', which performs passive localization using crowdsourced photos. Using a graph-based model, we combine object images across multiple camera views. Within each set of combined object images, a photographing map is built on which object localization is performed using plane geometry. We evaluate the system's localization accuracy using photos taken in various scenarios, with the results showing our approach to be effective for passive object localization and to achieve a high level of accuracy.
Abstract: Crowdsourcing is an effective method to obtain large databases of manually labeled images, which is especially important for image understanding with supervised machine learning algorithms. However, for several kinds of image labeling tasks, e.g., dog breed recognition, it is hard to achieve high-quality results. Therefore, further optimizing the crowdsourcing workflow mainly involves task allocation and result inference. For task allocation, we design a two-round crowdsourcing framework, which contains a smart decision mechanism based on information entropy to determine whether to perform the second-round task allocation. Regarding result inference, after quantifying the similarity of all labels, two graphical models are proposed to describe the labeling process, and corresponding inference algorithms are designed to further improve the result quality of image labeling. Extensive experiments on real-world tasks in Crowdflower and on synthetic datasets were conducted. The experimental results demonstrate the superiority of these methods in comparison with state-of-the-art methods.
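The entropy-based decision for a second labeling round can be sketched as follows. The abstract does not give the actual decision rule, so the threshold value (0.8 bits), the function names, and the example labels are assumptions; the sketch only shows how Shannon entropy of first-round votes could flag images on which workers disagree.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def needs_second_round(labels, threshold=0.8):
    """Hypothetical rule: request more labels when first-round votes disagree.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return label_entropy(labels) > threshold


unanimous = ["husky"] * 5
split = ["husky", "malamute", "husky", "malamute"]

print(needs_second_round(unanimous))  # low entropy: workers agree
print(needs_second_round(split))      # high entropy: workers disagree
```

A fixed threshold is the simplest possible rule; a real system would likely tune it against labeling cost and the number of candidate classes.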
Abstract: Missing value imputation with crowdsourcing is a novel method in data cleaning to capture missing values that can hardly be filled with automatic approaches. However, the time cost and overhead of crowdsourcing are high. Therefore, we have to reduce the cost while guaranteeing the accuracy of crowdsourced imputation. To achieve this optimization goal, we present COSSET+, a crowdsourced framework optimized by a knowledge base. We combine the advantages of both a knowledge-based filter and a crowdsourcing platform to capture missing values. Since the number of crowd values affects the cost of COSSET+, we aim to select only part of the missing values to be crowdsourced. We prove that the crowd value selection problem is NP-hard and develop an approximation algorithm for it. Extensive experimental results demonstrate the efficiency and effectiveness of the proposed approaches.
Funding: This work is supported by the National Natural Science Foundation of China (Nos. U21A20463, 62172117, 61802383), the Research Project of Pazhou Lab for Excellent Young Scholars (No. PZL2021KF0024), and the Guangzhou Basic and Applied Basic Research Foundation (Nos. 202201010330, 202201020162, 202201020221).
Abstract: Crowdsourcing holds broad applications in information acquisition and dissemination, yet encounters challenges pertaining to data quality assessment and user reputation management. Reputation mechanisms stand as crucial solutions for appraising and updating participant reputation scores, thereby elevating the quality and dependability of crowdsourced data. However, these mechanisms face several challenges in traditional crowdsourcing systems: 1) platform security lacks robust guarantees and may be susceptible to attacks; 2) there exists a potential for large-scale privacy breaches; and 3) incentive mechanisms relying on reputation scores may encounter issues, as reputation updates hinge on task demander evaluations and a dedicated reputation update module is sometimes lacking. This paper introduces a reputation update scheme tailored for crowdsourcing, with a focus on proficiently overseeing participant reputations and alleviating the impact of malicious activities on the sensing system. Here, the reputation update scheme is determined by an Empirical Cumulative distribution-based Outlier Detection method (ECOD). Our scheme embraces a blockchain-based crowdsourcing framework utilizing a homomorphic encryption method to ensure data transparency and tamper-resistance. Computation of user reputation scores relies on their behavioral history, actively discouraging undesirable conduct. Additionally, we introduce a dynamic weight incentive mechanism that mirrors alterations in participant reputation, enabling the system to allocate incentives based on user behavior and reputation. Our scheme is evaluated on 11 datasets, revealing substantial enhancements in data credibility for crowdsourcing systems and a reduction in the influence of malicious behavior. This research not only presents a practical solution for crowdsourcing reputation management but also offers valuable insights for future research and applications, holding promise for fostering more reliable and high-quality data collection in crowdsourcing across diverse domains.
Funding: This work was supported by the National Natural Science Foundation of China under Grant 62233003 and the National Key Research and Development Program of China under Grant 2020YFB1708602.
Abstract: The proliferation of intelligent, connected Internet of Things (IoT) devices facilitates data collection. However, task workers may be reluctant to participate in data collection due to privacy concerns, and task requesters may be concerned about the validity of the collected data. Hence, it is vital to evaluate the quality of the data collected by the task workers while protecting privacy in spatial crowdsourcing (SC) data collection tasks with IoT. To this end, this paper proposes a privacy-preserving data reliability evaluation for SC in IoT, named PARE. First, we design a data uploading format using blockchain and the Paillier homomorphic cryptosystem, providing unchangeable and traceable data while overcoming privacy concerns. Secondly, based on the uploaded data, we propose a method to determine the approximate correct value region without knowing the exact value. Finally, we offer a data filtering mechanism based on the Paillier cryptosystem using this value region. The evaluation and analysis results show that PARE outperforms the existing solution in terms of performance and privacy protection.
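The additive homomorphism that makes the Paillier cryptosystem useful for this kind of aggregation can be shown with a toy sketch. This is textbook Paillier with the common generator choice g = n + 1, not PARE's upload format or filtering mechanism, which the abstract does not specify. The primes are far too small for real use, and the function names are assumptions. The modular inverse via `pow(x, -1, n)` needs Python 3.8 or later.

```python
import math
import random


def keygen(p=1789, q=1861):
    """Toy key generation from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # inverse of lam mod n; valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Encrypt 0 <= m < n as g^m * r^n mod n^2 with a random r coprime to n."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


n, private_key = keygen()
c_sum = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
print(decrypt(n, private_key, c_sum))  # sum of the two plaintexts
```

An aggregator holding only the public modulus `n` can therefore combine encrypted readings without seeing any individual value; only the key holder can decrypt the total.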
Funding: National Natural Science Foundation of China (62072392).
Abstract: Crowdsourcing technology is widely recognized for its effectiveness in task scheduling and resource allocation. While traditional methods for task allocation can help reduce costs and improve efficiency, they may encounter challenges when dealing with abnormal data flow nodes, leading to decreased allocation accuracy and efficiency. To address these issues, this study proposes a novel two-part invalid detection task allocation framework. In the first step, an anomaly detection model is developed using a dynamic self-attentive GAN to identify anomalous data. Compared to the baseline method, the model achieves an approximately 4% increase in the F1 value on the public dataset. In the second step, task allocation modeling is performed using a two-part graph matching method. This phase introduces a P-queue KM algorithm that implements a more efficient optimization strategy. The allocation efficiency is improved by approximately 23.83% compared to the baseline method. Empirical results confirm the effectiveness of the proposed framework in detecting abnormal data nodes, enhancing allocation precision, and achieving efficient allocation.
Funding: Supported in part by the Pioneer and Leading Goose R&D Program of Zhejiang Province under Grant 2022C01083 (Dr. Yu Li, https://zjnsf.kjt.zj.gov.cn/) and the Pioneer and Leading Goose R&D Program of Zhejiang Province under Grant 2023C01217 (Dr. Yu Li, https://zjnsf.kjt.zj.gov.cn/).
Abstract: With the rapid development of the mobile Internet, spatial crowdsourcing has become more and more popular. Spatial crowdsourcing covers many different types of applications, such as spatial crowd-sensing services, which collect and analyze traffic sensing data from clients like vehicles and traffic lights to construct intelligent traffic prediction models. Besides collecting sensing data, spatial crowdsourcing also includes spatial delivery services like DiDi and Uber. Appropriate task assignment and worker selection dominate the service quality of spatial crowdsourcing applications. Previous research conducted task assignment via traditional matching approaches or simple network models. However, advanced mining methods are lacking to explore the relationship between workers, task publishers, and the spatio-temporal attributes of tasks. Therefore, in this paper, we propose a Deep Double Dueling Spatial-temporal Q Network (D3SQN) to adaptively learn the spatial-temporal relationship between tasks, task publishers, and workers in a dynamic environment to achieve optimal allocation. Specifically, D3SQN is revised through reinforcement learning by adding a spatial-temporal transformer that can estimate the expected state values and action advantages so as to improve the accuracy of task assignments. Extensive experiments are conducted over real data collected from DiDi and ELM, and the simulation results verify the effectiveness of our proposed models.
Funding: Funded by the European Commission's Horizon 2020 program as part of the LandSense project [grant number 689812].
Abstract: The potential of citizens as a source of geographical information has been recognized for many years. Such activity has grown recently due to the proliferation of inexpensive location-aware devices and the ability to share data over the internet. Recently, a series of major projects, often cast as citizen observatories, have helped explore and develop this potential for a wide range of applications. Here, some of the experiences and learnings gained from part of one such project, LandSense, which aimed to further the role of citizen science within Earth observation and help address environmental challenges, are shared. The key focus is on quality assurance of citizen-generated data on land use and land cover, especially to support analyses of remotely sensed data and products. Particular focus is directed to quality assurance checks on photographic image quality, privacy, polygon overlap, positional accuracy and offset, contributor agreement, and categorical accuracy. The discussion aims to provide good practice advice to aid future studies and help fulfil the full potential of citizens as a source of volunteered geographical information (VGI).