Monocular vision-based navigation is an important capability for a home mobile robot. However, because of diverse disturbances, helping robots avoid obstacles, especially non-Manhattan obstacles, remains a major challenge. In indoor environments, many spatial right corners are projected into two-dimensional projections with special geometric configurations. These projections, each consisting of three lines, may make it possible to estimate the corners' position and orientation in 3D scenes. In this paper, we present a method that lets home robots avoid non-Manhattan obstacles in indoor environments using a monocular camera. The approach first detects non-Manhattan obstacles. By analyzing geometric features and constraints, it then estimates the posture difference between the orientation of the robot and that of the non-Manhattan obstacles. Finally, according to the convergence of the posture differences, the robot adjusts its orientation to stay aligned with the pose of the detected non-Manhattan obstacles, making it possible to avoid these obstacles by itself. Because it is based on geometric inference, the proposed approach requires no prior training and no knowledge of the camera's internal parameters, making it practical for robot navigation. Furthermore, the method is robust to calibration errors and image noise. We compared the corners of the estimated non-Manhattan obstacles against the ground truth and evaluated the convergence of the difference between the robot orientation and the posture of the non-Manhattan obstacles. The experimental results show that our method is capable of avoiding non-Manhattan obstacles, meeting the requirements for indoor robot navigation.
Clinical practice guidelines (CPGs) contain evidence-based and economically reasonable medical treatment processes. Making these processes executable in healthcare information systems can assist treatment. To this end, business process modeling technologies have been used to model medical treatment processes. However, medical treatment processes are usually flexible and knowledge-intensive. To reduce the modeling effort, we summarize several treatment patterns (i.e., frequent behaviors in medical treatment processes in CPGs) and represent them in three process modeling languages (i.e., BPMN, DMN, and CMMN). Based on the summarized treatment patterns, we propose a pattern-based integrated framework for modeling medical treatment processes. A modeling platform is implemented to support the use of treatment patterns, which validates the feasibility of our approach. An empirical analysis based on the coverage rates of the treatment patterns is discussed. Feedback from interviewed physicians in a Chinese hospital shows that executable medical treatment processes of CPGs provide a convenient way to obtain guidance, thus assisting the daily work of medical workers.
The 5G mobile Internet facilitates content generation for online communities and platforms through human-to-human collaboration. Wikipedia, a well-known online community, uses wiki technology to build an encyclopedia through collective intelligence and collaboration. Mainstream wiki systems adopt a centralized implementation, and although existing studies have optimized the efficiency of that implementation, these systems still suffer from many problems, such as opacity and distrust. In recent years, blockchain has brought a wave of enthusiasm and decentralization to system architecture while giving users a sense of trust and participation. We therefore propose DecWiki, a blockchain-enabled wiki framework for building a transparent, truthful, collaborative, and autonomous encyclopedia. After several participatory design iterations, we present DecWiki's detailed architecture and its implementation as a smart contract, and use the InterPlanetary File System to complement it with big data storage. We also use a trusted execution environment to secure sensitive information in the wireless scenario. Finally, the system overhead and the acceptance of the prototype are evaluated. Extensive experiments demonstrate its strong performance.
Exploring fine-grained but meaningful information in massive amounts of social media data is critical but challenging. To address this challenge, we propose TopicBubbler, a visual analytics system that supports cross-level, fine-grained exploration of social media data. To achieve this goal, we propose a new workflow. Following the workflow, we construct the fine-grained exploration view through the design of bubble-based word clouds. Each bubble contains two rings that display information at different levels and recommends six keywords computed by different algorithms. The view lets users collect information at different levels and perform fine-grained selection and exploration across levels based on the keyword recommendations. To let users explore temporal information and hierarchical structure, we also construct the Temporal View and the Hierarchical View, which show cross-level dynamic trends and an overview of the hierarchical structure. In addition, we use the storyline metaphor to let users consolidate the fragmented information extracted across levels and topics and ultimately present it as a complete story. Case studies on real-world data confirm the capability of TopicBubbler from different perspectives, including event mining across levels and topics and fine-grained mining of specific topics to capture events hidden beneath the surface.
activities. Experiments on a synthetic log of the non-secondary hypertension MTP and empirical findings demonstrate the effectiveness of our approach. The results show that process mining within our framework can automatically generate more accurate MTP models, and that the subprocess models based on treatment patterns make the models easy to understand.
The flourishing development of social media platforms, which cultivate user relationships to spread and share information, has provided a breeding ground for cyberbullying. Inferring how public opinion propagation evolves in an online bullying environment is of great significance to the governance of cyberbullying. In this paper, we propose a data-driven, agent-based Model for Public Opinion Propagation Simulation (MPOPS) in cyberbullying. First, we design a Public Opinion Propagation Environment for Opinion Fusion and Polarization (OFP-POPE) in a cyber-physical-social space. Second, we conduct agent-based fine-grained modeling based on the OFP-POPE. Third, we define and quantify user interaction behaviors and improve the Susceptible-Exposed-Infected-Removed (SEIR) model by treating these interaction behaviors as factors influencing public opinion propagation. Finally, we take the "2022 Tangshan restaurant attack" incident as an empirical case study and conduct simulation experiments on the MPOPS driven by real-world data. The experimental results show that the MPOPS outperforms the baseline models and can simulate the evolutionary trend of public opinion propagation in actual cyberbullying scenarios. In addition, the ablation experiment and parameter sensitivity analysis provide a reference for future cyberbullying intervention efforts.
Fault localization is an important and challenging task during software testing. Among the techniques studied in this field, program spectrum-based fault localization is a promising approach. To perform spectrum-based fault localization, a set of test oracles must be provided, and the effectiveness of fault localization depends heavily on the quality of those oracles. Moreover, effectiveness usually suffers when multiple faults are present simultaneously. Faced with multiple faults, developers find it difficult to determine when to stop the fault localization process. To address these issues, we propose an iterative fault localization process, i.e., an iterative process of selecting test cases for effective fault localization (IPSETFUL), to identify as many faults as possible in the program until the stopping criterion is satisfied. It is based on the concept lattice of program spectrum (CLPS) proposed in our previous work. Using the labeling approach of CLPS, program statements are categorized as dangerous, safe, or sensitive. To identify the faults, developers need to check the dangerous statements. Meanwhile, developers select from the original test suite a set of test cases covering the dangerous or sensitive statements, and a new CLPS is generated for the next iteration. The process is then repeated. The iteration ends when there are no failing tests in the test suite and all statements on the CLPS have become safe statements. We conduct an empirical study on several subject programs, and the results show that IPSETFUL can help identify most of the faults in a program with the given test suite. Moreover, it can save much of the effort of inspecting non-faulty program statements compared with existing spectrum-based fault localization techniques and the relevant state-of-the-art technique.
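For orientation, spectrum-based fault localization ranks statements by how strongly their execution correlates with failing tests. The sketch below uses the widely used Ochiai suspiciousness formula as a stand-in; it is not the CLPS labeling or the IPSETFUL test selection described in the abstract, and the function name and the input layout (one set of covered statement ids per test) are assumptions made for illustration.

```python
import math


def ochiai_scores(coverage, outcomes):
    """Return an Ochiai suspiciousness score per statement.

    coverage: list of sets, the statement ids executed by each test (assumed layout).
    outcomes: list of bools, True if the corresponding test passed.
    """
    total_failed = sum(1 for passed in outcomes if not passed)
    statements = set().union(*coverage)
    scores = {}
    for stmt in statements:
        # Count failing and passing tests that executed this statement.
        failed_cov = sum(
            1 for cov, passed in zip(coverage, outcomes) if stmt in cov and not passed
        )
        passed_cov = sum(
            1 for cov, passed in zip(coverage, outcomes) if stmt in cov and passed
        )
        denom = math.sqrt(total_failed * (failed_cov + passed_cov))
        scores[stmt] = failed_cov / denom if denom > 0 else 0.0
    return scores


# Toy spectrum: three tests over statements 1..3; only the first test passes.
coverage = [{1, 2}, {1, 3}, {1, 2, 3}]
outcomes = [True, False, False]
scores = ochiai_scores(coverage, outcomes)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy spectrum, statement 3 is executed only by failing tests, so it ranks first. An iterative scheme such as the one in the abstract would re-run a ranking or labeling step after each round of test selection.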
Performance variability, stemming from nondeterministic hardware and software behaviors or from deterministic behaviors such as measurement bias, is a well-known phenomenon of computer systems. It increases the difficulty of comparing computer performance metrics and is slated to become even more of a concern as interest in Big Data analytics increases. Conventional methods use various measures (such as the geometric mean) to quantify the performance of different benchmarks and compare computers without considering this variability, which may lead to wrong conclusions. In this paper, we propose three resampling methods for performance evaluation and comparison: a randomization test for a general performance comparison between two computers, bootstrapping confidence estimation, and an empirical distribution with a five-number summary for performance evaluation. The results show that for both PARSEC and high-variance BigDataBench benchmarks, 1) the randomization test substantially improves our chance of identifying a performance difference when the difference is not large; 2) bootstrapping confidence estimation provides an accurate confidence interval for the performance comparison measure (e.g., the ratio of geometric means); and 3) when the difference is very small, a single test is often not enough to reveal the nature of the computer performance because of the variability of computer systems. We further propose using the empirical distribution to evaluate computer performance and a five-number summary to summarize it. We use published SPEC 2006 results to investigate the sources of performance variation by predicting performance and relative variation for 8,236 machines. We achieve a correlation of 0.992 for predicted performance and a correlation of 0.5 between predicted and measured relative variation. Finally, we propose a novel biplotting technique to visualize the effectiveness of benchmarks and to cluster machines by behavior. We illustrate the results and conclusions through detailed Monte Carlo simulation studies and real examples.
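The two resampling ideas named in this abstract can be sketched as follows. This is a generic paired randomization test and a percentile bootstrap over benchmarks, not necessarily the exact procedures of the paper; the function names, the paired-score input layout, and the default round counts are assumptions for illustration.

```python
import math
import random


def geometric_mean(values):
    """Geometric mean of positive scores."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def randomization_test(a, b, rounds=10000, seed=0):
    """Two-sided paired randomization test on the log ratio of geometric means.

    a[i] and b[i] are scores of two machines on the same benchmark i (assumed layout).
    Under the null hypothesis the machine labels are exchangeable per benchmark,
    so each per-benchmark log ratio may have its sign flipped at random.
    """
    rng = random.Random(seed)
    log_ratios = [math.log(x) - math.log(y) for x, y in zip(a, b)]
    observed = abs(sum(log_ratios))
    hits = 0
    for _ in range(rounds):
        permuted = sum(d if rng.random() < 0.5 else -d for d in log_ratios)
        if abs(permuted) >= observed:
            hits += 1
    return hits / rounds  # estimated p-value


def bootstrap_ci(a, b, rounds=10000, level=0.95, seed=0):
    """Percentile bootstrap interval for the ratio of geometric means."""
    rng = random.Random(seed)
    n = len(a)
    ratios = []
    for _ in range(rounds):
        # Resample benchmarks with replacement, keeping the (a, b) pairing.
        idx = [rng.randrange(n) for _ in range(n)]
        ratios.append(
            geometric_mean([a[i] for i in idx]) / geometric_mean([b[i] for i in idx])
        )
    ratios.sort()
    lower = ratios[int((1 - level) / 2 * rounds)]
    upper = ratios[int((1 + level) / 2 * rounds) - 1]
    return lower, upper


# Toy data: machine A is exactly twice as fast as machine B on every benchmark.
b_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
a_scores = [2.0 * s for s in b_scores]
p_value = randomization_test(a_scores, b_scores, rounds=2000)
lower, upper = bootstrap_ci(a_scores, b_scores, rounds=2000)
```

With noisy real measurements the interval widens, which is the situation the abstract describes: when the difference between machines is small, a single comparison of geometric means can be misleading.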
Heterogeneous information network (HIN)-structured data provide an effective model for practical purposes in the real world. Network embedding is fundamental to supporting network-based analysis and prediction tasks. Currently popular network embedding methods usually fail to preserve the semantics of HINs effectively. In this study, we propose AGA2Vec, a generative adversarial model for HIN embedding that uses attention mechanisms and meta-paths. To capture the semantic information from multi-typed entities and relations in an HIN, we develop a weighted meta-path strategy to preserve the proximity of the HIN. We then use an autoencoder and a generative adversarial model to obtain robust representations of the HIN. The results of experiments on several real-world datasets show that the proposed approach outperforms state-of-the-art approaches for HIN embedding.
Support vector machines (SVMs) have been recognized as a powerful tool for linear classification. When combined with a sparsity-inducing nonconvex penalty, SVMs can perform classification and variable selection simultaneously. However, nonconvex penalized SVMs in general cannot be solved globally and efficiently because of their nondifferentiability, nonconvexity, and nonsmoothness. Existing solutions to nonconvex penalized SVMs typically solve the problem in a serial fashion and are therefore unable to fully use the parallel computing power of modern multi-core machines. Moreover, the fact that many real-world data are stored in a distributed manner urgently calls for a parallel and distributed solution. To address this challenge, we propose an efficient alternating direction method of multipliers (ADMM)-based algorithm that solves nonconvex penalized SVMs in a parallel and distributed way. We design several techniques to decrease the computation and synchronization cost of the proposed parallel algorithm. The time complexity analysis demonstrates the low time complexity of the algorithm, and its convergence is guaranteed. Experimental evaluations on four LIBSVM benchmark datasets demonstrate the efficiency of the proposed parallel algorithm.
Knowledge bases play an important role in machine understanding and have been widely used in applications such as search engines, recommendation systems, and question answering. However, most knowledge bases are incomplete, which can cause many downstream applications to perform poorly because they cannot find the corresponding facts in the knowledge bases. In this paper, we propose an extraction and verification framework to enrich knowledge bases. Specifically, starting from an existing knowledge base, we first extract new facts from the description texts of entities. Not all newly formed facts can be added directly to the knowledge base, because the extraction may introduce errors. We therefore propose a novel crowdsourcing-based verification step to verify the candidate facts. Finally, we apply this framework to the existing knowledge base CN-DBpedia and construct a new version, CN-DBpedia2, which additionally contains the high-confidence facts extracted from the description texts of entities.
An aggregate nearest neighbor (ANN) query returns a point of interest (POI) that minimizes an aggregate function over multiple query points. In this paper, we propose an efficient approach to ANN queries in road networks. Our approach consists of two phases: a searching phase and a pruning phase. First, we continuously compute the nearest neighbors (NNs) of each query point in a specific order to obtain candidate POIs, until all query points have found a common POI. Second, we filter out the unqualified POIs using a pruning strategy for the given aggregate function. The two-phase process is repeated until only one candidate POI remains, which is returned as the final result. In addition, we discuss partition strategies for query points and the approximate ANN query for the case where the number of query points is huge. Extensive experiments on real datasets demonstrate that the proposed approach significantly outperforms its competitors in most cases.
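To make the query definition concrete, the sketch below answers an ANN query by brute force: it runs Dijkstra's algorithm from every query point and evaluates the aggregate for every POI. It is a naive baseline that only illustrates what the query returns, not the two-phase search-and-prune approach of the abstract; the adjacency-list graph layout and the function names are assumptions.

```python
import heapq


def dijkstra(graph, source):
    """Shortest network distances from source; graph maps node -> [(neighbor, weight)]."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def aggregate_nearest_neighbor(graph, query_points, pois, aggregate=sum):
    """Return (poi, cost) minimizing aggregate(distance(q, poi) for q in query_points)."""
    tables = [dijkstra(graph, q) for q in query_points]
    best_poi, best_cost = None, float("inf")
    for poi in pois:
        cost = aggregate(t.get(poi, float("inf")) for t in tables)
        if cost < best_cost:
            best_poi, best_cost = poi, cost
    return best_poi, best_cost


# Toy road network: a path a - b - c - d - e with edge weights 1, 2, 1, 1.
graph = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("b", 2.0), ("d", 1.0)],
    "d": [("c", 1.0), ("e", 1.0)],
    "e": [("d", 1.0)],
}
by_sum = aggregate_nearest_neighbor(graph, ["a", "c", "e"], ["b", "c", "d"], sum)
by_max = aggregate_nearest_neighbor(graph, ["a", "c", "e"], ["b", "c", "d"], max)
```

The brute-force version computes a full shortest-path table per query point, which is the cost that incremental NN expansion and pruning aim to avoid.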
Time series clustering is widely applied in various areas. Existing research focuses mainly on distance measures between two time series, such as dynamic time warping (DTW)-based methods, edit-distance-based methods, and shapelet-based methods. In this work, we experimentally demonstrate, for the first time, that no single distance measure performs significantly better than the others on time series clustering datasets when spectral clustering is used. The question therefore arises of how to choose an appropriate measure for a given time series dataset. To answer this question, we propose an integration scheme that incorporates multiple distance measures using semi-supervised clustering. Our approach integrates all the measures by extracting valuable underlying information for the clustering. To the best of our knowledge, this work demonstrates for the first time that a constraint-based semi-supervised clustering method can enhance time series clustering by combining multiple distance measures. In tests on various time series clustering datasets, our method outperforms individual measures as well as typical integration approaches.
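As a reminder of what one of the named distance measures computes, here is a minimal sketch of the classic dynamic time warping distance with an absolute-difference local cost and no warping window. It illustrates a single measure only, not the semi-supervised integration scheme of the abstract; the function name is an assumption.

```python
def dtw_distance(x, y):
    """Classic O(len(x) * len(y)) dynamic time warping distance between two sequences."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # x advances, y stays on the same point
                cost[i][j - 1],      # y advances, x stays on the same point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A repeated sample is absorbed by the warping, so these two series match exactly.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
offset = dtw_distance([0, 0], [1, 1])
```

A pairwise matrix of such distances is the usual input to spectral clustering, which is the setting in which the abstract compares measures.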
Ground elevation estimation is vital for numerous applications in autonomous vehicles and intelligent robotics, including three-dimensional object detection, navigable space detection, point cloud matching for localization, and registration for mapping. However, most works regard the ground as a plane without height information, which causes inaccurate results in these applications. In this work, we propose GeeNet, a novel end-to-end, lightweight method that completes the ground in nearly real time and simultaneously estimates the ground elevation in a grid-based representation. GeeNet mixes two- and three-dimensional convolutions to keep the architecture lightweight while regressing ground elevation information for each cell of the grid. GeeNet is the first method to derive ground elevation estimation from semantic scene completion. We use the SemanticKITTI and SemanticPOSS datasets to validate GeeNet, demonstrating its qualitative and quantitative performance on ground elevation estimation and semantic scene completion of the point cloud. Moreover, the cross-dataset generalization capability of GeeNet is demonstrated experimentally. GeeNet achieves state-of-the-art performance in point cloud completion and ground elevation estimation, with a runtime of 0.88 ms.
Bitcoin has been popular for almost 10 years as a "secure and anonymous digital currency". However, several recent studies show that it provides only pseudonymity rather than real anonymity, and privacy has been one of the main concerns in systems similar to Bitcoin. Ring signatures are a good option for users who need better anonymity in cryptocurrency. The ring signature was first proposed by Rivest et al. based upon the discrete logarithm problem (DLP) assumption in 2006; it allows a user to sign a message anonymously on behalf of a group of users, even without their coordination. The size of a ring signature is one of its dominating parameters, and a constant-size ring signature (where the signature size is independent of the ring size) is highly desirable. Otherwise, when the ring is large, the resulting signature becomes impractical for power-limited devices or places a heavy burden on the communication network. Although the problem has been extensively studied, there are currently only two approaches to constant-size ring signatures. Achieving a practical constant-size ring signature has been a long-standing open problem since the primitive was introduced. In this work, we solve this open problem. We present a new constant-size ring signature scheme based on bilinear pairings and accumulators, which is provably secure in the random oracle (RO) model. To the best of our knowledge, it is the most practical ring signature scheme to date.
Users usually focus on application-level requirements, which are friendly and direct to them. However, no existing tools automate the path from application-level requirements to infrastructure provisioning and application deployment. Although some security issues are solved during the development phase, undiscovered vulnerabilities remain hidden threats to the application's security. Cyberspace mimic defense (CMD) technologies can help enhance the application's security despite the existence of such vulnerabilities. In this paper, the concept of SECurity-as-a-Service (SECaaS) is proposed with CMD technologies in cloud environments, and an experimental implementation is presented. We find that SECaaS greatly improves the application's security, meeting the user's security and performance requirements within budget. The experimental results show that SECaaS helps users focus on application-level requirements (monetary costs, required security level, etc.) and automates the process of application orchestration.
Recently there has been a burst of interest in blockchain and cryptocurrencies, which blend computer science, finance, and business in unprecedented ways. Multiple computer science areas are involved, including cryptography, distributed systems, programming languages, game theory, and system security techniques, and today's blockchain and cryptocurrency systems still face security, availability, scalability, and performance issues. This special section is an effort to encourage and promote research in this area from the computer architecture and software perspectives.
Funding (non-Manhattan obstacle avoidance abstract): Supported by the National Natural Science Foundation of China (61771146, 61375122), the National Thirteen 5-Year Plan for Science and Technology (2017YFC1703303), and in part by the Shanghai Science and Technology Development Funds (13dz2260200, 13511504300).
Funding (clinical practice guideline modeling abstract): Supported by the Chinese National Key Research and Development Program (No. 2017YFB1400604).
Funding (DecWiki abstract): Supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61932007.
Funding (TopicBubbler abstract): Supported by the Natural Science Foundation of China (NSFC No. 62202105), the Shanghai Municipal Science and Technology Major Project, China (2021SHZDZX0103), the General Program (No. 21ZR1403300), the Sailing Program, China (No. 21YF1402900), and ZJLab.
Funding (MTP process mining abstract): Chinese National Key Research and Development Program (No. 2017YFB1400604).
Funding (MPOPS abstract): Supported by the National Key Research and Development Program of China (No. 2021YFC3300202).
Abstract: Fault localization is an important and challenging task during software testing. Among the techniques studied in this field, program spectrum based fault localization is a promising approach. To perform spectrum based fault localization, a set of test oracles must be provided, and the effectiveness of fault localization depends highly on the quality of the test oracles. Moreover, effectiveness usually suffers when multiple simultaneous faults are present. Faced with multiple faults, it is difficult for developers to determine when to stop the fault localization process. To address these issues, we propose an iterative fault localization process, i.e., an iterative process of selecting test cases for effective fault localization (IPSETFUL), to identify as many faults as possible in the program until the stopping criterion is satisfied. It is based on a concept lattice of program spectrum (CLPS) proposed in our previous work. Based on the labeling approach of CLPS, program statements are categorized as dangerous statements, safe statements, and sensitive statements. To identify the faults, developers need to check the dangerous statements. Meanwhile, developers need to select a set of test cases covering the dangerous or sensitive statements from the original test suite, and a new CLPS is generated for the next iteration. The process then repeats in the same way. The iteration ends when there are no failing tests in the test suite and all statements on the CLPS have become safe statements. We conduct an empirical study on several subject programs, and the results show that IPSETFUL can help identify most of the faults in the program with the given test suite. Moreover, it saves much effort in inspecting non-faulty program statements compared with existing spectrum based fault localization techniques and the relevant state-of-the-art technique.
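To make the notion of a program spectrum concrete, here is a minimal sketch of a conventional spectrum based ranking using the widely known Ochiai suspiciousness formula. This is not IPSETFUL or the CLPS labeling approach; it only illustrates the kind of coverage-plus-outcome data that spectrum based techniques work from. The function name and the toy data are assumptions.

```python
import math


def ochiai(coverage, outcomes):
    """Rank statements by Ochiai suspiciousness.

    coverage : list of sets, the statement ids executed by each test
    outcomes : list of bools, True if the corresponding test passed
    """
    total_failed = sum(1 for ok in outcomes if not ok)
    statements = set().union(*coverage)
    scores = {}
    for stmt in statements:
        # ef / ep: failing / passing tests that execute this statement
        ef = sum(1 for cov, ok in zip(coverage, outcomes)
                 if stmt in cov and not ok)
        ep = sum(1 for cov, ok in zip(coverage, outcomes)
                 if stmt in cov and ok)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[stmt] = ef / denom if denom else 0.0
    return scores


# Toy spectrum: three tests over statements 1..3; the first test passes.
coverage = [{1, 2}, {1, 3}, {1, 2, 3}]
outcomes = [True, False, False]
scores = ochiai(coverage, outcomes)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy spectrum, statement 3 is executed only by failing tests, so it ranks first.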
Funding: This work was supported in part by the National High Technology Research and Development Program of China (2015AA015303), the National Natural Science Foundation of China (Grant No. 61672160), the Shanghai Science and Technology Development Funds (17511102200), and the National Science Foundation (NSF) (CCF-1017961, CCF-1422408, and CNS-1527318). We acknowledge the computing resources provided by the Louisiana Optical Network Initiative (LONI) HPC team. Finally, we appreciate invaluable comments from anonymous reviewers.
Abstract: Performance variability, stemming from nondeterministic hardware and software behaviors or from deterministic behaviors such as measurement bias, is a well-known phenomenon of computer systems. It increases the difficulty of comparing computer performance metrics and is slated to become even more of a concern as interest in Big Data analytics increases. Conventional methods use various measures (such as the geometric mean) to quantify the performance of different benchmarks and compare computers without considering this variability, which may lead to wrong conclusions. In this paper, we propose three resampling methods for performance evaluation and comparison: a randomization test for a general performance comparison between two computers, bootstrapping confidence estimation, and an empirical distribution and five-number summary for performance evaluation. The results show that for both PARSEC and high-variance BigDataBench benchmarks 1) the randomization test substantially improves our chance of identifying a difference between performance comparisons when the difference is not large; 2) bootstrapping confidence estimation provides an accurate confidence interval for the performance comparison measure (e.g., the ratio of geometric means); and 3) when the difference is very small, a single test is often not enough to reveal the nature of the computer performance due to the variability of computer systems. We further propose using the empirical distribution to evaluate computer performance and a five-number summary to summarize it. We use published SPEC 2006 results to investigate the sources of performance variation by predicting performance and relative variation for 8,236 machines. We achieve a correlation of 0.992 for predicted performance and a correlation of 0.5 between predicted and measured relative variation. Finally, we propose a novel biplotting technique to visualize the effectiveness of benchmarks and cluster machines by behavior. We illustrate the results and conclusions through detailed Monte Carlo simulation studies and real examples.
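The two resampling ideas named in this abstract can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' exact procedure: it treats each computer as a flat list of positive benchmark scores, pools and reshuffles them for the randomization test, and resamples each list independently for the bootstrap. Function names, trial counts, and the sample scores are hypothetical.

```python
import math
import random


def geometric_mean(xs):
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def randomization_test(a, b, trials=2000, seed=0):
    """Two-sided p-value for the log ratio of geometric means."""
    rng = random.Random(seed)
    observed = abs(math.log(geometric_mean(a) / geometric_mean(b)))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # random relabeling of the scores
        ra, rb = pooled[:len(a)], pooled[len(a):]
        stat = abs(math.log(geometric_mean(ra) / geometric_mean(rb)))
        if stat >= observed:
            hits += 1
    return hits / trials


def bootstrap_ci(a, b, trials=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the ratio of geometric means."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        ra = [rng.choice(a) for _ in a]  # resample with replacement
        rb = [rng.choice(b) for _ in b]
        ratios.append(geometric_mean(ra) / geometric_mean(rb))
    ratios.sort()
    lo = ratios[int(alpha / 2 * trials)]
    hi = ratios[int((1 - alpha / 2) * trials) - 1]
    return lo, hi


# Hypothetical benchmark scores for two machines (higher is better).
machine_a = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
machine_b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
p_value = randomization_test(machine_a, machine_b)
ci_low, ci_high = bootstrap_ci(machine_a, machine_b)
```

A small p-value together with a confidence interval that excludes 1 suggests a real difference between the machines; an interval straddling 1 suggests that the observed gap may be within run-to-run variability.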
Funding: This work was supported by the National Natural Science Foundation of China under Grant No. 61672161 and the Youth Research Fund of Shanghai Municipal Health and Family Planning Commission of China under Grant No. 2015Y0195.
Abstract: Heterogeneous information network (HIN)-structured data provide an effective model for practical purposes in the real world. Network embedding is fundamental for supporting network-based analysis and prediction tasks. Currently popular network embedding methods normally fail to preserve the semantics of HINs effectively. In this study, we propose AGA2Vec, a generative adversarial model for HIN embedding that uses attention mechanisms and meta-paths. To capture the semantic information from multi-typed entities and relations in an HIN, we develop a weighted meta-path strategy to preserve the proximity of the HIN. We then use an autoencoder and a generative adversarial model to obtain robust representations of the HIN. The results of experiments on several real-world datasets show that the proposed approach outperforms state-of-the-art approaches for HIN embedding.
Funding: Project supported by the Major State Research Development Program, China (No. 2016YFB0201305).
Abstract: Support vector machines (SVMs) have been recognized as a powerful tool for linear classification. When combined with a sparsity-inducing nonconvex penalty, SVMs can perform classification and variable selection simultaneously. However, nonconvex penalized SVMs in general cannot be solved globally and efficiently due to their nondifferentiability, nonconvexity, and nonsmoothness. Existing solutions to nonconvex penalized SVMs typically solve the problem in a serial fashion and are unable to fully use the parallel computing power of modern multi-core machines. On the other hand, the fact that many real-world data are stored in a distributed manner urgently calls for a parallel and distributed solution to nonconvex penalized SVMs. To address this challenge, we propose an efficient alternating direction method of multipliers (ADMM) based algorithm that solves nonconvex penalized SVMs in a parallel and distributed way. We design several techniques to decrease the computation and synchronization cost of the proposed parallel algorithm. The time complexity analysis demonstrates the low time complexity of the proposed parallel algorithm. Moreover, the convergence of the parallel algorithm is guaranteed. Experimental evaluations on four LIBSVM benchmark datasets demonstrate the efficiency of the proposed parallel algorithm.
Funding: National Key R&D Program of China under Grant No. 2017YFC1201203; sponsored by the Shanghai Sailing Program under Grant No. 19YF1402300 and the Initial Research Funds for Young Teachers of Donghua University under Grant No. 112-07-0053019.
Abstract: Knowledge bases play an important role in machine understanding and have been widely used in various applications, such as search engines, recommendation systems, and question answering. However, most knowledge bases are incomplete, which can cause many downstream applications to perform poorly because they cannot find the corresponding facts in the knowledge bases. In this paper, we propose an extraction and verification framework to enrich knowledge bases. Specifically, based on the existing knowledge base, we first extract new facts from the description texts of entities. However, not all newly formed facts can be added directly to the knowledge base, because the extraction may introduce errors. We therefore propose a novel crowdsourcing-based verification step to verify the candidate facts. Finally, we apply this framework to the existing knowledge base CN-DBpedia and construct a new version, CN-DBpedia2, which additionally contains the high-confidence facts extracted from the description texts of entities.
Funding: This research is supported in part by the Shanghai Natural Science Foundation of China under Grant No. 14ZR1403100, the Shanghai Science and Technology Development Funds of China under Grant Nos. 13dz2260200 and 13511504300, and the National Natural Science Foundation of China under Grant No. 61073001.
Abstract: An aggregate nearest neighbor (ANN) query returns a point of interest (POI) that minimizes an aggregate function for multiple query points. In this paper, we propose an efficient approach to tackle ANN queries in road networks. Our approach consists of two phases: a searching phase and a pruning phase. In particular, we first continuously compute the nearest neighbors (NNs) for each query point in a specific order to obtain the candidate POIs until all query points find a common POI. Second, we filter out the unqualified POIs based on the pruning strategy for the given aggregate function. The two-phase process is repeated until only one candidate POI remains, and that POI is returned as the final result. In addition, we discuss partition strategies for query points and the approximate ANN query for the case where the number of query points is huge. Extensive experiments using real datasets demonstrate that our proposed approach significantly outperforms its competitors in most cases.
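The definition of an ANN query in this abstract can be made concrete with a brute-force baseline: run Dijkstra's algorithm from every query point, aggregate the network distances per POI, and return the POI with the smallest aggregate. This sketch is not the two-phase search-and-prune approach the abstract proposes; the toy road network, node names, and function names are assumptions.

```python
import heapq


def dijkstra(graph, source):
    """Shortest network distances from source; graph maps node -> [(nbr, w)]."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def ann_query(graph, query_points, pois, aggregate=sum):
    """Return (poi, cost) minimizing aggregate(distance to each query point)."""
    per_query = [dijkstra(graph, q) for q in query_points]
    best, best_cost = None, float("inf")
    for p in pois:
        cost = aggregate(d.get(p, float("inf")) for d in per_query)
        if cost < best_cost:
            best, best_cost = p, cost
    return best, best_cost


def undirected(edges):
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


# Toy road network: two query points (q1, q2) and two POIs (p1, p2).
roads = undirected([("q1", "p1", 1.0), ("p1", "q2", 4.0),
                    ("q1", "p2", 3.0), ("p2", "q2", 3.0)])
by_sum = ann_query(roads, ["q1", "q2"], ["p1", "p2"], aggregate=sum)
by_max = ann_query(roads, ["q1", "q2"], ["p1", "p2"], aggregate=max)
```

On this network the two aggregate functions disagree: `sum` prefers `p1` (total distance 5), while `max` prefers `p2` (worst-case distance 3), which shows why the pruning strategy has to depend on the chosen aggregate function.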
Funding: The work was partially supported by the National Natural Science Foundation of China under Grant Nos. 61332013, 61272110, and 61370229, and the National Key Technology Research and Development Program of China under Grant No. 2013BAH72B01.
Abstract: Time series clustering is widely applied in various areas. Existing research focuses mainly on distance measures between two time series, such as dynamic time warping (DTW) based methods, edit-distance based methods, and shapelets-based methods. In this work, we experimentally demonstrate, for the first time, that no single distance measure performs significantly better than the others on time series clustering datasets when spectral clustering is used. As such, a question arises as to how to choose an appropriate measure for a given time series dataset. To answer this question, we propose an integration scheme that incorporates multiple distance measures using semi-supervised clustering. Our approach is able to integrate all the measures by extracting valuable underlying information for the clustering. To the best of our knowledge, this work demonstrates for the first time that a constraint-based semi-supervised clustering method is able to enhance time series clustering by combining multiple distance measures. Having tested it on various time series clustering datasets, we show that our method outperforms individual measures as well as typical integration approaches.
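As an illustration of one of the distance measures named in this abstract, here is a minimal sketch of the textbook dynamic time warping (DTW) recurrence with absolute difference as the local cost and no warping window. It is one candidate measure only, not the integration scheme the abstract proposes; the function name is an assumption.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] repeats a match
                                     cost[i][j - 1],      # b[j-1] repeats a match
                                     cost[i - 1][j - 1])  # one-to-one step
    return cost[n][m]


same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

The first pair differs only by a repeated sample, so warping aligns the two series at zero cost, whereas a point-by-point measure would not even be defined for series of different lengths.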
Funding: The National Natural Science Foundation of China (No. U2033209).
Abstract: Ground elevation estimation is vital for numerous applications in autonomous vehicles and intelligent robotics, including three-dimensional object detection, navigable space detection, point cloud matching for localization, and registration for mapping. However, most works regard the ground as a plane without height information, which causes inaccurate manipulation in these applications. In this work, we propose GeeNet, a novel end-to-end, lightweight method that completes the ground in nearly real time and simultaneously estimates the ground elevation in a grid-based representation. GeeNet mixes two- and three-dimensional convolutions to keep the architecture lightweight while regressing ground elevation information for each cell of the grid. For the first time, GeeNet achieves ground elevation estimation from semantic scene completion. We use the SemanticKITTI and SemanticPOSS datasets to validate the proposed GeeNet, demonstrating its qualitative and quantitative performance on ground elevation estimation and semantic scene completion of the point cloud. Moreover, the cross-dataset generalization capability of GeeNet is demonstrated experimentally. GeeNet achieves state-of-the-art performance in terms of point cloud completion and ground elevation estimation, with a runtime of 0.88 ms.
Funding: This work is supported in part by the National Key Research and Development Program of China under Grant No. 2017YFB0802000, the National Natural Science Foundation of China under Grant Nos. 61472084 and U1536205, the Shanghai Innovation Action Project under Grant No. 16DZ1100200, the Shanghai Science and Technology Development Funds under Grant No. 6JC1400801, and the Shandong Provincial Key Research and Development Program of China under Grant No. 2017CXG0701.
Abstract: Bitcoin has been popular for almost 10 years as a "secure and anonymous digital currency". However, several recent studies show that it provides only pseudonymity rather than real anonymity, and privacy has been one of the main concerns in Bitcoin-like systems. Ring signatures are a good option for users who need better anonymity in cryptocurrency. First proposed by Rivest et al. based on the discrete logarithm problem (DLP) assumption in 2006, a ring signature allows a user to sign a message anonymously on behalf of a group of users even without their coordination. The size of a ring signature is one of the dominating parameters, and a constant-size ring signature (where the signature size is independent of the ring size) is highly desirable. Otherwise, when the ring size is large, the resulting ring signature becomes unbearable for power-limited devices or places a heavy burden on the communication network. Although the topic has been extensively studied, there are currently only two approaches to constant-size ring signatures. Achieving a practical constant-size ring signature has been a long-standing open problem since its introduction. In this work, we solve this open problem. We present a new constant-size ring signature scheme based on bilinear pairings and accumulators, which is provably secure in the random oracle (RO) model. To the best of our knowledge, it is the most practical ring signature scheme to date.
Funding: National Key Research and Development Program of China (2017YFB0803202); Major Scientific Research Project of Zhejiang Lab (No. 2018FD0ZX01); National Core Electronic Devices, High-end Generic Chips and Basic Software Major Projects (2017ZX01030301); the National Natural Science Foundation of China (No. 61309020); and the National Natural Science Fund for Creative Research Groups Project (No. 61521003).
Abstract: Users usually focus on application-level requirements, which are friendly and direct to them. However, no existing tools automate the path from application-level requirements to infrastructure provisioning and application deployment. Although some security issues are solved during the development phase, undiscovered vulnerabilities remain hidden threats to the application's security. Cyberspace mimic defense (CMD) technologies can help enhance the application's security despite the existence of such vulnerabilities. In this paper, the concept of SECurity-as-a-Service (SECaaS) is proposed with CMD technologies in cloud environments, and an experiment on it is implemented. It is found that, through SECaaS, the application's security is greatly improved to meet the user's security and performance requirements within budget. The experimental results show that SECaaS can help users focus on application-level requirements (monetary costs, required security level, etc.) and automate the process of application orchestration.
Abstract: Recently there has been a burst of interest in blockchain and cryptocurrencies, which blend computer science, finance, and business in unprecedented ways. Multiple computer science areas are involved, including cryptography, distributed systems, programming languages, game theory, and system security techniques, and today's blockchain and cryptocurrency systems still face security, availability, scalability, and performance issues. This special section is an effort to encourage and promote research in this area from the computer architecture and software perspective.