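Abstract: To address insufficient fusion of multi-source behavior data and weak preference adaptation in knowledge recommendation for information security courses, a knowledge recommendation method based on bidirectional long short-term memory, multi-head attention, and fusion of student multi-source behavior data (BiLSTM-MA-FSBD) is proposed. First, students' multi-source behavior data are integrated, core behavioral features are extracted, and a fused feature system covering dynamic temporal and static associative features is constructed. Then, a BiLSTM network is designed to encode dependencies in behavior sequences, and the MA mechanism adaptively assigns behavior weights to infer learning preferences accurately. Finally, a three-level information security knowledge graph is constructed, dependencies between knowledge points are quantified, and personalized recommendations are made in combination with the preference matching degree. The results show that the recommendation precision of the BiLSTM-MA-FSBD method is 26.2 percentage points higher than that of the collaborative filtering (CF) method. The method adapts effectively to the teaching characteristics of information security courses and to students' personalized learning needs, providing a practical technical solution for accurate recommendation of course knowledge.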
Funding: Supported by the National Natural Science Foundation of China (Nos. 41571387, 41201375 and 41501440), the Tianjin Research Program of Application Foundation and Advanced Technology (No. 14JCQNJC07900), the Tianjin Science and Technology Planning Project (Nos. 15ZCZDSF00390 and 14TXGCCX00015), and the Opening Fund of Tianjin Engineering Research Center of Geospatial Information Technology, "Modeling and analysis of path graph in 3D indoor spatial environment".
Abstract: Due to limitations in geometric representation and semantic description, current pedestrian route analysis models are inadequate. To express the geometry of geographic entities in a micro-spatial environment accurately, the concept of a grid is presented, and grid-based methods for modeling geospatial objects are described. The semantic constitution of a building environment and the methods for modeling rooms, corridors, and staircases with grid objects are described. Based on the topological relationships between grid objects, a grid-based graph for a building environment is presented, and a corresponding route algorithm for pedestrians is proposed. The main advantages of the graph model proposed in this paper are as follows: 1) consideration of both semantic and geometric information, 2) consideration of both the need for accurate geometric representation of the micro-spatial environment and the efficiency of pedestrian route analysis, 3) applicability of the graph model to route analysis in both static and dynamic environments, and 4) ability of the multi-hierarchical route analysis to integrate multiple levels of pedestrian decision characteristics, from high to low, to determine the optimal path.
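As a rough illustration of route analysis on a grid-based graph, the sketch below treats each walkable grid cell as a node, links 4-neighbouring cells, and runs a breadth-first search. The function name `grid_route`, the 0/1 cell encoding, and the use of plain BFS are assumptions made for this example; the abstract does not specify the paper's actual algorithm or its semantic layers.

```python
from collections import deque

def grid_route(grid, start, goal):
    """Shortest path between two walkable cells using breadth-first search.

    grid: list of rows; 0 marks a walkable cell, 1 marks an obstacle
    (hypothetical encoding). start is assumed to be walkable.
    Returns a list of (row, col) cells, or None if goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# A tiny floor plan: the middle row is a wall with one opening on the right.
floor = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = grid_route(floor, (0, 0), (2, 0))
```

A weighted search (for example Dijkstra or A*) could replace BFS if cells carried different traversal costs, such as staircases versus corridors.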
Funding: Supported in part by the Fundamental Research Funds for the Central Universities under Grant No. 2013RC0114 and the 111 Project of China under Grant No. B08004.
Abstract: With increasingly complex website structures and continuously advancing web technologies, accurate recognition of user clicks from massive HTTP data, which is critical for web usage mining, becomes more difficult. In this paper, we propose a dependency graph model to describe the relationships between web requests. Based on this model, we design and implement a heuristic parallel algorithm to distinguish user clicks with the assistance of cloud computing technology. We evaluate the proposed algorithm on a large real-world dataset of 228.7 GB, collected from a mobile core network and covering more than three million users. The experimental results demonstrate that the proposed algorithm achieves higher accuracy than previous methods.
Funding: Supported by the Project of China Southern Power Grid Digital Grid Research Institute Co., Ltd. (210002KK52222026).
Abstract: By modeling the spatiotemporal data of the power grid, it is possible to better understand its operational status, identify potential issues and risks, and take timely measures to adjust and optimize the system. Compared to the bus-branch model, the node-breaker model provides higher granularity in describing grid components and can dynamically reflect changes in equipment status, thus improving the efficiency of grid dispatching and operation. This paper proposes a spatiotemporal data modeling method based on a graph database. It elaborates on constructing graph nodes, graph ontology models, and graph entity models from grid dispatch data, and describes the construction of the spatiotemporal node-breaker graph model and its transformation to the bus-branch model. Subsequently, by integrating spatiotemporal data attributes into the pre-built static grid graph model, a spatiotemporal evolving graph of the power grid is constructed. Furthermore, the concept of the “Power Grid One Graph” and its requirements in modern power systems are elucidated. Leveraging the constructed spatiotemporal node-breaker graph model and graph computing technology, the paper explores the feasibility of grid situational awareness. Finally, typical applications in an operational provincial grid are showcased, and potential scenarios for the proposed spatiotemporal graph model are discussed.
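The transformation from a node-breaker model to a bus-branch model can be pictured as merging all connectivity nodes joined by closed breakers into a single bus. The sketch below does this with a union-find structure; the function name `to_bus_branch`, the tuple layouts, and the node labels are hypothetical and are not taken from the paper, which works on a graph database rather than in-memory lists.

```python
def to_bus_branch(nodes, breakers, branches):
    """Collapse a node-breaker topology into buses and bus-level branches.

    nodes:    iterable of connectivity-node ids
    breakers: list of (node_a, node_b, closed) tuples (hypothetical layout)
    branches: list of (node_a, node_b) tuples for lines or transformers
    Returns (buses, bus_branches), where buses maps a representative node
    to the list of nodes merged into that bus.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the representative, halving the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Only closed breakers connect their two terminals electrically.
    for a, b, closed in breakers:
        if closed:
            parent[find(a)] = find(b)

    buses = {}
    for n in nodes:
        buses.setdefault(find(n), []).append(n)

    # Re-express each branch in terms of the buses at its two ends.
    bus_branches = [(find(a), find(b)) for a, b in branches]
    return buses, bus_branches

nodes = ["n1", "n2", "n3", "n4"]
breakers = [("n1", "n2", True), ("n3", "n4", False)]
branches = [("n2", "n3")]
buses, bus_branches = to_bus_branch(nodes, breakers, branches)
```

Rerunning the function after a breaker changes state gives the updated bus-branch view, which is one simple way to reflect equipment status changes over time.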
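Abstract: Graph-structured data are ubiquitous in scenarios such as social networks, transportation systems, and bioinformatics. Graph neural networks (GNNs) use a message-passing mechanism to iteratively aggregate neighbor information and perform well on tasks such as node classification, link prediction, and graph classification. However, as data scale keeps growing and application scenarios become more complex, GNNs face key challenges such as limited expressive power and insufficient generalization. In recent years, foundation models represented by large language models (LLMs) have developed rapidly and shown excellent generalization and reasoning abilities, bringing new inspiration to graph machine learning. On this basis, this study proposes the concept of the graph foundation model (GFM), which aims to obtain a general model that can be flexibly adapted to a variety of downstream tasks through pre-training on large-scale graph data. It also systematically reviews recent research on graph foundation models, groups existing methods into three categories according to their degree of reliance on GNNs and LLMs, surveys their research progress, and introduces the practical experience of the authors' team in related directions. Finally, the key challenges and prospects that graph foundation models may face are discussed, with the aim of providing a reference for continued innovation in graph machine learning.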
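To make the message-passing idea concrete, the sketch below performs one aggregation round in which every node replaces its feature vector with the mean of its own and its neighbours' vectors. This is a deliberately simplified illustration with no learned weights or nonlinearity; the function name `message_passing_step` and the toy graph are assumptions made for this example.

```python
def message_passing_step(features, adjacency):
    """One round of mean aggregation over each node and its neighbours.

    features:  dict mapping node -> list of floats (all the same length)
    adjacency: dict mapping node -> list of neighbouring nodes
    Returns a new dict of updated feature vectors; inputs are not modified.
    """
    updated = {}
    for node, neighbours in adjacency.items():
        # Messages are the neighbours' current features plus the node's own.
        messages = [features[n] for n in neighbours] + [features[node]]
        dim = len(features[node])
        updated[node] = [
            sum(m[i] for m in messages) / len(messages) for i in range(dim)
        ]
    return updated

features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
hidden = message_passing_step(features, adjacency)
```

Applying the step repeatedly lets information travel further across the graph, one hop per round; practical GNN layers add trainable transformations around this aggregation.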
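Abstract: As data security risks grow more complex in big data environments, existing data security audit techniques make fragmented use of features and lack extensibility, so they struggle to cover risks across the full data lifecycle, which limits risk detection effectiveness. Therefore, a graph-embedded data security audit scheme based on risk elements (RE-GDSA) is proposed. First, a security risk element space comprising data attributes D (data), user characteristics U (user), carrier environment C (carrier), and operation behaviors A (action) is constructed, giving a structured mapping of risk features over the full data lifecycle. Then, graph embedding is used to map risk elements to low-dimensional semantic vectors, and a cross-dimensional association model is built to achieve efficient risk detection. Effectiveness analysis and performance analysis verify the feasibility of the scheme.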
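Abstract: To effectively improve the resilience of distribution networks, an optimal scheduling method for multiple types of mobile emergency resources based on a hybrid data-model-driven approach is proposed. First, considering the impact of dynamically changing road conditions on the strategies of mobile energy storage systems (MESSs) and repair crews (RCs), a stochastic optimal scheduling model for multiple types of mobile emergency resources is built with the objective of minimizing the total loss cost of the coupled power-transportation network. Then, to solve for the optimal routing and scheduling strategies of MESSs and RCs accurately in real time, a hybrid data-model-driven method is proposed to solve the resulting complex nonlinear stochastic optimization model. In the data-driven part, a graph attention network multi-agent reinforcement learning algorithm is proposed to solve for the optimal routing strategies of MESSs and RCs under uncertainties such as road repair times in the transportation network and dynamic changes in the adjacency relations of mobile emergency resources. The proposed algorithm combines several improvement strategies with prioritized experience replay to improve sampling efficiency and training performance. In the model-driven part, second-order cone relaxation and the big-M method are used to formulate the optimal scheduling problem as a mixed-integer second-order cone programming model, which is solved for the optimal scheduling strategies of MESSs and RCs under variations in renewable energy output and distribution network load. Finally, the effectiveness, generalization ability, and scalability of the proposed method are verified on two coupled power-transportation networks of different scales.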
Funding: Supported in part by the National Key R&D Program of China (2017YFB1001804) and the Shanghai Science and Technology Innovation Action Plan Project (16511100900).
Abstract: To guarantee the correctness of business processes, data-flow errors must be considered as well as control-flow errors. Control-flow errors mainly concern deadlock, livelock, soundness, and so on. However, few methods exist for detecting data-flow errors. This paper defines Petri nets with data operations (PN-DO), which can model operations on data such as read, write, and delete. Based on PN-DO, we define several data-flow errors. We construct a reachability graph with data operations for each PN-DO and then propose a method to reduce the reachability graph. Based on the reduced reachability graph, data-flow errors can be detected rapidly. A case study is given to illustrate the effectiveness of our methods.
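The idea of exploring a reachability graph that carries data operations can be sketched as follows. Each state pairs a marking with the set of data items currently defined, and a "missing data" error is reported when an enabled transition reads an item that is not defined. The encoding is an assumption made for this example (1-safe nets, transitions as 6-tuples, reads checked before writes and deletes applied); it is not the paper's PN-DO definition and does not include its reduction method.

```python
from collections import deque

def find_missing_data(transitions, initial_marking):
    """Explore reachable (marking, defined-data) states of a 1-safe net.

    transitions: list of (name, pre, post, reads, writes, deletes), where
                 each of the last five fields is a list of place or data
                 names (hypothetical encoding).
    Returns (errors, state_count); errors is a set of (transition, item)
    pairs where the transition reads an item that is undefined.
    """
    start = (frozenset(initial_marking), frozenset())
    seen = {start}
    queue = deque([start])
    errors = set()
    while queue:
        marking, defined = queue.popleft()
        for name, pre, post, reads, writes, deletes in transitions:
            if not set(pre) <= marking:
                continue  # transition is not enabled in this marking
            # Reads are checked against the data defined before firing.
            for item in reads:
                if item not in defined:
                    errors.add((name, item))
            new_marking = frozenset((marking - set(pre)) | set(post))
            # Writes define items; deletes are applied last.
            new_defined = frozenset((defined | set(writes)) - set(deletes))
            state = (new_marking, new_defined)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return errors, len(seen)

# t1 writes x, t2 reads and then deletes x, t3 reads x after the delete.
transitions = [
    ("t1", ["p0"], ["p1"], [], ["x"], []),
    ("t2", ["p1"], ["p2"], ["x"], [], ["x"]),
    ("t3", ["p2"], ["p3"], ["x"], [], []),
]
errors, state_count = find_missing_data(transitions, ["p0"])
```

In this toy process the only reported error is the read of `x` by `t3`, because `t2` has already deleted it on the single path that reaches `t3`.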
Abstract: A construction project is not a standalone engineering maneuver. It is closely linked to the well-being of the local communities concerned. The renovation of downtown Beijing for the 2008 Olympics transformed much antique architecture and regional landscape. It was a world-recognized achievement in China's modern development and marked a major milestone in China's economic development. In the course of metro construction projects, there are substantial interwoven municipal structures influencing the success of the projects, including, among others, underground cables and ducts, sewage systems, the power consumption of construction works, traffic diversion, air pollution, expatriate business activities, and social security. Many US and UK project insurance companies are moving into Asia Pacific. They do re-insurance business on major construction guarantees, such as machinery damage, on-time project delivery, power consumption, and claims from contractors and communities. Environmental information, such as water quality, indoor and outdoor air quality, people inflow, and lift waiting time, plays a decisive role in whether a construction is fit to use. Big Data has been a buzzword since 2013, and its key competence is to provide real-time response to heuristic syndrome in order to make short-term predictions. This paper attempts to develop a conceptual model in big data for construction
Abstract: In this paper, a Graph-based semantic Data Model (GDM) is proposed with the primary objective of bridging the gap between the human perception of an enterprise and the need of computing infrastructure to organize information in a particular manner for efficient storage and retrieval. GDM is proposed as an alternative data model that combines the advantages of the relational model with the positive features of semantic data models. The proposed GDM offers a structural representation for interaction with the designer, making it easy to comprehend the complex relations among basic data items. GDM allows an entire database to be viewed as a graph (V, E) in a layered organization. Here, a graph is created in a bottom-up fashion, where V represents the basic instances of data or a functionally abstracted module, called a primary semantic group (PSG) or a secondary semantic group (SSG). An edge in the model implies a relationship among the secondary semantic groups. The contents of the lowest layer are the semantically grouped data values in the form of primary semantic groups. The SSGs are higher-level abstractions created by encapsulating various PSGs, SSGs, and basic data elements. This encapsulation continues generating secondary semantic groups until the designer judges them sufficient to describe the actual problem domain. GDM thus uses the standard abstractions available in a semantic data model with a structural representation in terms of a graph. The operations on the data model are formalized in the proposed graph algebra. A Graph Query Language (GQL) is also developed, maintaining similarity with the widely accepted, user-friendly SQL. Finally, the paper presents a methodology for making GDM compatible with a distributed environment, and a corresponding query processing technique for the distributed environment is suggested for completeness.