Journal articles: 99 results found
LRP:learned robust data partitioning for efficient processing of large dynamic queries
1
Authors: Pengju LIU, Pan CAI, Kai ZHONG, Cuiping LI, Hong CHEN. Frontiers of Computer Science, 2025, No. 9, pp. 43-60 (18 pages)
The interconnection between query processing and data partitioning is pivotal for accelerating massive data processing during query execution, primarily by minimizing the number of scanned block files. Existing partitioning techniques predominantly focus on query accesses to numeric columns when constructing partitions, often overlooking non-numeric columns and thus limiting optimization potential. Additionally, although these techniques create fine-grained partitions from representative queries to enhance system performance, they suffer notable performance declines when future queries fluctuate unpredictably. To tackle these issues, we introduce LRP, a learned robust partitioning system for dynamic query processing. LRP first proposes a method for data and query encoding that captures comprehensive column access patterns from historical queries. It then employs Multi-Layer Perceptron and Long Short-Term Memory networks to predict shifts in the distribution of historical queries. To create high-quality, robust partitions based on these predictions, LRP adopts a greedy beam search algorithm for optimal partition division and implements a data redundancy mechanism to share frequently accessed data across partitions. Experimental evaluations reveal that LRP yields partitions with more stable performance under incoming queries and significantly surpasses state-of-the-art partitioning methods.
Keywords: data partitioning; data encoding; query prediction; beam search; data redundancy
RDF partitioning for scalable SPARQL query processing
2
Authors: Xiaoyan WANG, Tao YANG, Jinchuan CHEN, Long HE, Xiaoyong DU. Frontiers of Computer Science (SCIE, EI, CSCD), 2015, No. 6, pp. 919-933 (15 pages)
The volume of RDF data has increased dramatically in recent years, and cloud computing platforms such as Hadoop are considered a good choice for processing queries over huge data sets because of their scalability. Previous work on evaluating SPARQL queries with Hadoop mainly focuses on reducing the number of joins through careful splitting of HDFS files and algorithms for generating Map/Reduce jobs. However, the way RDF data is partitioned can also affect system performance. Specifically, a good partitioning solution greatly reduces or even totally avoids cross-node joins, and significantly cuts the cost of query evaluation. Based on HadoopDB, this work processes SPARQL queries in a hybrid architecture, where Map/Reduce takes charge of the computing tasks, and RDF query engines such as RDF-3X store the data and execute join operations. Based on an analysis of query workloads, this work proposes a novel algorithm for automatically partitioning RDF data and an approximate solution for physically placing the partitions in order to reduce data redundancy. It also discusses how to make a good trade-off between query evaluation efficiency and data redundancy. All of the proposed approaches have been evaluated in extensive experiments over large RDF data sets.
Keywords: RDF data; data partitioning; SPARQL query
Semantic-based query processing for relational data integration (cited by: 1)
3
Authors: 苗壮, 张亚非, 王进鹏, 陆建江, 周波. Journal of Southeast University (English Edition) (EI, CAS), 2011, No. 1, pp. 22-25 (4 pages)
To solve the query processing correctness problem for semantic-based relational data integration, the semantics of SPARQL (SPARQL Protocol and RDF Query Language) queries is defined. In the course of query rewriting, all relevant tables are found and decomposed into minimal connectable units. The minimal connectable units are joined according to the semantic queries to produce semantically correct query plans. Algorithms for query rewriting and transformation are presented, and their computational complexity is discussed. In the worst case, the query decomposing algorithm finishes in O(n^2) time and the query rewriting algorithm requires O(nm) time. The performance of the algorithms is verified by experiments, and the experimental results show that when the query length is less than 8, the query processing algorithms provide satisfactory performance.
Keywords: data integration; relational database; SPARQL Protocol and RDF Query Language (SPARQL); minimal connectable unit; query processing
Data partitioning based on sampling for power load streams
4
Authors: 王永利, 徐宏炳, 董逸生, 钱江波, 刘学军. Journal of Southeast University (English Edition) (EI, CAS), 2005, No. 3, pp. 293-298 (6 pages)
A novel data stream partitioning method is proposed to resolve problems of range-aggregation continuous queries over parallel streams in the power industry. The first step of the method is to sample the data in parallel, which is implemented as an extended reservoir-sampling algorithm. A skip factor based on the change ratio of data values is introduced to describe the distribution characteristics of data values adaptively. The second step is to partition the data stream flux evenly, which is implemented with two alternative equal-depth histogram generating algorithms that fit different cases: one for incremental maintenance based on heuristics and the other for periodic updates that generate an approximate partition vector. Experimental results on real data show that the method is efficient, practical, and suitable for processing time-varying data streams.
Keywords: data streams; continuous queries; parallel processing; sampling; data partitioning
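The two steps named in this abstract, reservoir sampling followed by an equal-depth partition vector, can be illustrated with a minimal Python sketch. This is the textbook reservoir algorithm (Algorithm R), not the extended variant with a skip factor that the paper describes, and the function names `reservoir_sample` and `equal_depth_boundaries` are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Plain reservoir sampling: a uniform sample of up to k items from a stream."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def equal_depth_boundaries(sample, num_partitions):
    """Cut points that split the sorted sample into buckets of roughly equal count."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // num_partitions] for i in range(1, num_partitions)]
```

The cut points play the role of a partition vector: a value is routed to the first bucket whose upper cut point exceeds it, so each bucket receives about the same number of sampled values.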
A survey of RDF data management systems (cited by: 5)
5
Author: M. Tamer OZSU. Frontiers of Computer Science (SCIE, EI, CSCD), 2016, No. 3, pp. 418-432 (15 pages)
RDF is increasingly being used to encode data for the semantic web and data exchange. A large number of works address RDF data management following different approaches. In this paper we provide an overview of these works. This review considers centralized solutions (referred to as warehousing approaches), distributed solutions, and the techniques that have been developed for querying linked data. In each category, further classifications are provided to help readers understand the identifying characteristics of the different approaches.
Keywords: RDF; SPARQL; linked object data
System Ⅱ: A Native RDF Repository Based on the Hypergraph Representation for RDF Data Model (cited by: 2)
6
Authors: 吴刚, 李涓子, 胡建强, 王克宏. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2009, No. 4, pp. 652-664 (13 pages)
RDF is the data interchange layer for the Semantic Web. In order to manage the increasing amount of RDF data, an RDF repository should provide not only the necessary scalability and efficiency, but also sufficient inference capabilities. Though existing RDF repositories have made progress towards these goals, there is still ample room for improving overall performance. In this paper, we propose a native RDF repository, System Ⅱ, to pursue a better tradeoff among system scalability, query efficiency, and inference capabilities. System Ⅱ takes a hypergraph representation of RDF as the data model for its persistent storage, which effectively avoids the cost of data model transformation when accessing RDF data. Based on this native storage scheme, a set of efficient semantic query processing techniques is designed. First, several indices are built to accelerate RDF data access, including a value index, a labeling scheme for transitive closure computation, and three triple indices. Second, we propose a hybrid inference strategy under the pD* semantics to support inference for OWL-Lite with relatively low computational complexity. Finally, we extend the SPARQL algebra to explicitly express inference semantics in the logical query plan by defining some new algebra operators. In addition, MD5 hash values of URIs and a schema-level cache are introduced as practical implementation techniques. The results of performance evaluation on the LUBM benchmark and a real data set show that System Ⅱ has a better combined metric value than other comparable systems.
Keywords: RDF data management; query processing; index
Multidimensional Data Querying on Tree-Structured Overlay
7
Authors: XU Lizhen, WANG Shiyuan. Wuhan University Journal of Natural Sciences (CAS), 2006, No. 5, pp. 1367-1372 (6 pages)
Multidimensional data querying has attracted much interest in the database research community in recent years, yet many existing studies focus mainly on centralized systems. A solution for querying in a Peer-to-Peer (P2P) environment is proposed that achieves both low processing cost, in terms of the number of peers accessed and search messages sent, and balanced query loads among peers. The system is based on a balanced tree-structured P2P network. By partitioning the query space intelligently, the amount of query forwarding is effectively controlled, and the number of peers involved and search messages sent is also limited. Dynamic load balancing can be achieved during space partitioning and query resolution. Extensive experiments confirm the effectiveness and scalability of the algorithms on P2P networks.
Keywords: range query; skyline query; P2P; indexing; multi-dimensional data partition
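Application of an RDF heterogeneous data integration and query system and method to fund transaction analysis
8
Authors: 杨胜海, 龙集坤, 余芷谊. 《科学与信息化》, 2025, No. 8, pp. 166-168 (3 pages)
In practical fund transaction analysis, traditional data processing techniques are often inefficient and insufficiently accurate when handling large-scale, complex financial data, particularly as the methods used in online financial crime become increasingly covert and varied. Through a concrete case analysis, this paper examines the application of an RDF heterogeneous data integration and query system and method to fund transaction analysis, aiming to resolve difficulties in the analysis process and to markedly improve query efficiency and accuracy.
Keywords: RDF; heterogeneous data integration; query system and method; fund transaction analysis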
Optimization of RDF link traversal based query execution (cited by: 2)
9
Authors: 朱艳琴, 花岭. Journal of Southeast University (English Edition) (EI, CAS), 2013, No. 1, pp. 27-32 (6 pages)
Only some types of SPARQL (SPARQL Protocol and RDF Query Language) queries can be answered with the current resource description framework link traversal based query execution (RDF-LTE) approach. To address this problem, this paper uses concrete SPARQL queries to discuss how the execution order of the triple patterns affects query results and cost, and analyzes two properties of the web of linked data: missing backward links and missing contingency solutions. Three heuristic principles for logical query plan optimization are then proposed, namely the filtered basic graph pattern (FBGP) principle, the triple pattern chain principle, and the seed URIs principle. The three principles help to reduce the number of intermediate solutions and to increase the types of queries that can be answered. The effectiveness and feasibility of the proposed approach are evaluated. The experimental results show that more query results can be returned at lower cost, enabling users to exploit the full potential of the web of linked data.
Keywords: web of linked data; resource description framework link traversal based query execution (RDF-LTE); SPARQL query; query optimization
Tailored Partitioning for Healthcare Big Data: A Novel Technique for Efficient Data Management and Hash Retrieval in RDBMS Relational Architectures
10
Authors: Ehsan Soltanmohammadi, Neset Hikmet, Dilek Akgun. Journal of Data Analysis and Information Processing, 2025, No. 1, pp. 46-65 (20 pages)
Efficient data management in healthcare is essential for providing timely and accurate patient care, yet traditional partitioning methods in relational databases often struggle with the high volume, heterogeneity, and regulatory complexity of healthcare data. This research introduces a tailored partitioning strategy leveraging the MD5 hashing algorithm to enhance data insertion, query performance, and load balancing in healthcare systems. By applying a consistent hash function to patient IDs, our approach achieves uniform distribution of records across partitions, optimizing retrieval paths and reducing access latency while ensuring data integrity and compliance. We evaluated the method through experiments focusing on partitioning efficiency, scalability, and fault tolerance. The partitioning efficiency analysis compared our MD5-based approach with standard round-robin methods, measuring insertion times, query latency, and data distribution balance. Scalability tests assessed system performance across increasing dataset sizes and varying partition counts, while fault tolerance experiments examined data integrity and retrieval performance under simulated partition failures. The experimental results demonstrate that the MD5-based partitioning strategy significantly reduces query retrieval times by optimizing data access patterns, achieving up to X% better performance compared to round-robin methods. It also scales effectively with larger datasets, maintaining low latency and remaining resilient under failure scenarios. This approach offers a scalable, efficient, and fault-tolerant solution for healthcare systems, facilitating faster clinical decision-making and improved patient care in complex data environments.
Keywords: Healthcare Data Partitioning; Relational Database Management Systems (RDBMS); Big Data Management; Load Balance; Query Performance Improvement; Data Integrity and Fault Tolerance; Efficient Big Data in Healthcare; Dynamic Data Distribution; Healthcare Information Systems; Partitioning Algorithms; Performance Evaluation in Databases
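As a rough illustration of hashing patient IDs to partitions, the Python sketch below maps an ID to a partition by taking its MD5 digest modulo the partition count. The modulo step, the function name `partition_for`, and the synthetic IDs are assumptions made for illustration; the abstract does not state how the digest is mapped to a partition.

```python
import hashlib
from collections import Counter


def partition_for(patient_id: str, num_partitions: int) -> int:
    """Map a patient ID to a partition number via its MD5 digest (illustrative only)."""
    digest = hashlib.md5(patient_id.encode("utf-8")).digest()
    # Interpret the 16-byte digest as an integer and reduce it to a partition index.
    return int.from_bytes(digest, "big") % num_partitions


# Distribute 10,000 synthetic IDs over 8 partitions and count records per partition.
counts = Counter(partition_for(f"patient-{i}", 8) for i in range(10_000))
```

Because the mapping depends only on the ID, the same record is always routed to the same partition, which is what lets a lookup by patient ID touch a single partition. MD5 serves here purely as a distribution function, not as a security measure.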
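Research and implementation of cross-node RDF association path retrieval
11
Authors: 刘峰, 韩芳, 夏景隆, 陈锟, 魏天珂, 高帅. 《数据与计算发展前沿(中英文)》 (CSCD), 2024, No. 4, pp. 34-45 (12 pages)
[Objective] Cross-node association path retrieval is an important means of discovering associations among scientific data in large-scale distributed settings. Achieving efficient and accurate multi-node, multi-hop queries is a key technical challenge, and the corresponding solutions and techniques have broad application prospects. [Methods] This paper proposes a cross-node association path retrieval technique driven by RDF class relationships. The technique builds RDF class association relationships across distributed nodes, maps cross-node entity association retrieval onto RDF class association retrieval, and then uses the RDF class associations to guide the dynamic construction of SPARQL federated query statements, thereby retrieving linked data across nodes. [Results] Tests show that the proposed approach effectively improves the efficiency and quality of cross-node RDF association path retrieval and supports dynamic multi-hop queries over multiple data-source nodes in any association direction. [Conclusions] Cross-node association path retrieval driven by RDF class relationships provides an efficient and accurate solution for federated data querying in distributed environments and is expected to play an important role in complex network environments and big-data applications.
Keywords: RDF; scientific linked data; semantic association discovery; multi-hop query; cross-node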
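A Spark-based SPARQL query algorithm for large-scale RDF data (cited by: 1)
12
Authors: 崔家奇, 闫威. 《计算机应用与软件》 (PKU Core), 2020, No. 12, pp. 26-31, 45 (7 pages)
Massive RDF data is difficult to manage and query on a single machine. To address this problem, a Spark-based SPARQL query method, SSQ, is proposed that translates SPARQL queries into RDD operations on the Spark distributed platform. The data graph and the query graph are partitioned effectively to increase parallelism and reduce communication overhead between partitions. A predicate index shrinks the search space, and join optimization reduces the number of matching operations, improving query efficiency. The algorithm is implemented on a Spark cluster, tested on the synthetic LUBM dataset, and compared with existing methods. The results show that it executes complex SPARQL queries quickly and scales well.
Keywords: RDF data; SPARQL query; Spark distributed platform; balanced semantic partitioning; communication overhead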
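A survey of RDF data query processing techniques (cited by: 65)
13
Authors: 杜方, 陈跃国, 杜小勇. 《软件学报》 (EI, CSCD, PKU Core), 2013, No. 6, pp. 1222-1242 (21 pages)
With advances in research on the Semantic Web and information extraction, more and more RDF data has appeared on the Web, and managing massive RDF data has become a research focus in both academia and industry. This survey analyzes and compares RDF query processing methods in depth along two dimensions, the form of RDF datasets and the organization and storage of RDF data, and in terms of query formulation, query processing, and query optimization. On this basis, it proposes directions and challenges for future research.
Keywords: RDF; RDF data management; RDF query processing; query optimization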
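A highly scalable RDF data storage system (cited by: 9)
14
Authors: 袁平鹏, 刘谱, 张文娅, 吴步文. 《计算机研究与发展》 (EI, CSCD, PKU Core), 2012, No. 10, pp. 2131-2141 (11 pages)
Because the Resource Description Framework (RDF) is flexible and concise, it has been accepted as a standard for expressing metadata and for interlinking data on the World Wide Web. The volume of RDF data has grown rapidly in recent years, so systems that store it must be highly scalable. This paper presents TripleBit, a highly scalable RDF data storage system. To minimize storage consumption, it uses delta compression and variable-length integer encoding. It also stores data in chunks, which simplifies storage management, keeps the storage structure compact, and speeds up data reads. The system generates query plans dynamically using heuristic rules; during execution, a plan is adjusted according to intermediate results to maintain the best execution order. For multi-variable queries, a two-step execution strategy reduces the intermediate results produced during query processing. Compared with popular RDF storage systems, RDF-3X uses at least 40% more storage space than TripleBit, and TripleBit's query performance is several times better than that of RDF-3X and MonetDB.
Keywords: Resource Description Framework; semantic data storage; data encoding; query processing; query plan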
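The combination of delta compression and variable-length integer encoding mentioned in this abstract can be sketched generically in Python. The sketch uses the common 7-bits-per-byte varint scheme over a sorted list of non-negative IDs; it is not TripleBit's actual on-disk format, and all function names are hypothetical.

```python
def encode_varint(n):
    """Encode a non-negative integer using 7 payload bits per byte (low bits first)."""
    if n < 0:
        raise ValueError("varint encoding expects a non-negative integer")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_sorted_ids(ids):
    """Store each ID as the varint-encoded gap from its predecessor."""
    out = bytearray()
    prev = 0
    for x in ids:
        out += encode_varint(x - prev)  # raises if ids are not ascending
        prev = x
    return bytes(out)


def decode_sorted_ids(data):
    """Invert encode_sorted_ids: rebuild gaps, then accumulate them into IDs."""
    ids, current, shift, prev = [], 0, 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += current
            ids.append(prev)
            current, shift = 0, 0
    return ids
```

Sorted ID lists tend to have small gaps, and small gaps fit in one or two bytes, which is why the two techniques are usually paired.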
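An HBase-based RDF data storage model (cited by: 8)
15
Authors: 朱敏, 程佳, 柏文阳. 《计算机研究与发展》 (EI, CSCD, PKU Core), 2013, No. S1, pp. 23-31 (9 pages)
With the explosive growth of Semantic Web data, managing massive RDF data efficiently has become a key problem. Existing centralized relational RDF storage systems can no longer meet this need, and more and more researchers use distributed systems and parallel computing to manage massive RDF data. This paper proposes an RDF storage model based on the distributed database HBase. Following the OWL ontology definition file, data is divided by class, and the triples of each class are stored in two tables for that class, S_PO and O_PS. Query algorithms for the eight triple patterns and for basic graph patterns are implemented on this model, and partial inference is supported. The feasibility of the storage model and query algorithms is verified on a Hadoop cluster.
Keywords: Resource Description Framework; semantic data storage; SPARQL; basic graph pattern; query processing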
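To make the S_PO and O_PS idea concrete, here is a small in-memory Python sketch: one index keyed by subject, one keyed by object, and a `match` method that answers all eight triple patterns (each of subject, predicate, and object either bound or a variable). It uses plain dictionaries rather than HBase tables and ignores the per-class split described in the abstract; the class name `TripleStore` and its methods are hypothetical.

```python
from collections import defaultdict


class TripleStore:
    """Toy store with a subject-keyed (S_PO) and an object-keyed (O_PS) index."""

    def __init__(self):
        self.s_po = defaultdict(set)  # subject -> {(predicate, object)}
        self.o_ps = defaultdict(set)  # object  -> {(predicate, subject)}

    def add(self, s, p, o):
        self.s_po[s].add((p, o))
        self.o_ps[o].add((p, s))

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None marks a variable."""
        if s is not None:
            # Subject bound: look up by row key in S_PO.
            candidates = ((s, p2, o2) for p2, o2 in self.s_po.get(s, ()))
        elif o is not None:
            # Only object (and maybe predicate) bound: look up in O_PS.
            candidates = ((s2, p2, o) for p2, s2 in self.o_ps.get(o, ()))
        else:
            # Neither key bound: fall back to a full scan of S_PO.
            candidates = ((s2, p2, o2)
                          for s2, pos in self.s_po.items() for p2, o2 in pos)
        return sorted(t for t in candidates
                      if (p is None or t[1] == p) and (o is None or t[2] == o))
```

For example, after adding `("alice", "knows", "bob")` and `("carol", "knows", "bob")`, the call `match(p="knows", o="bob")` is served from the object-keyed index without scanning every subject.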
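A survey of distributed RDF data management (cited by: 15)
16
Authors: 邹磊, 彭鹏. 《计算机研究与发展》 (EI, CSCD, PKU Core), 2017, No. 6, pp. 1213-1224 (12 pages)
The Resource Description Framework (RDF), a model for presenting, sharing, and linking data on the Web, is widely used in many applications, and SPARQL (SPARQL Protocol and RDF Query Language) is the structured query language used to query and retrieve RDF data. As RDF datasets grow, SPARQL query processing over existing RDF databases has exceeded the capacity of a single machine, so high-performance distributed RDF databases are needed to process SPARQL queries efficiently. A large body of work discusses how to build distributed RDF data management systems. This survey reviews these approaches and groups them into three categories: approaches based on cloud computing platforms, approaches based on data partitioning, and federated systems. Cloud-based approaches manage RDF data on existing cloud platforms. Partitioning-based approaches first divide the RDF graph into subgraphs and then assign the subgraphs to different compute nodes. In federated systems, the data is already distributed across nodes and the data management system cannot control its distribution. Each category is discussed in depth to help readers understand the characteristics of the individual approaches.
Keywords: RDF data management; SPARQL query processing; distributed database systems; cloud computing; linked data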
A comparative study of four SPARQL query builders (cited by: 3)
17
Authors: 郭少友, 魏朋争, 洪娜, 李木子. 《情报科学》 (CSSCI, PKU Core), 2015, No. 3, pp. 80-84 (5 pages)
SPARQL query builders help users construct SPARQL queries semi-automatically. After summarizing the general types of SPARQL query builders, this paper briefly introduces four representative builders, Querymed, Viziquer, VQB, and Bio SPARQL, and compares them in terms of the supported ways of entering conditions, the supported SPARQL syntax, the number of data sources, and other aspects.
Keywords: SPARQL; query builder; data source; RDF; ontology
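A survey of NoSQL-based RDF data storage and query techniques (cited by: 22)
18
Authors: 王林彬, 黎建辉, 沈志宏. 《计算机应用研究》 (CSCD, PKU Core), 2015, No. 5, pp. 1281-1286 (6 pages)
With the development of the Semantic Web and the rapid growth of RDF (Resource Description Framework) data, using NoSQL databases to store and manage large-scale RDF data has become a current research focus. This paper introduces the categories of NoSQL databases and the characteristics of each, reviews the state of research on storage structure design and parallel query algorithms for RDF data in each category of NoSQL database, and compares the strengths and weaknesses of the different methods. Finally, it discusses the advantages of managing RDF with NoSQL databases, summarizes the shortcomings of existing research, and outlines future research directions.
Keywords: Resource Description Framework; NoSQL database; data model; storage structure design; parallel RDF query algorithm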
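Hadoop-based RDF data storage and query optimization (cited by: 15)
19
Authors: 徐德智, 刘扬, Sarfraz Ahmed. 《计算机应用研究》 (CSCD, PKU Core), 2017, No. 2, pp. 477-480, 486 (5 pages)
With the rapid growth of Resource Description Framework (RDF) data, storing and managing large-scale RDF data with distributed methods has become a current research focus. To store and query massive RDF data efficiently, this paper studies how RDF triples are stored and queried on the distributed platform Hadoop and proposes a new Hadoop-based optimization method for RDF data processing. The method optimizes queries over massive RDF data by adopting a hybrid HBase-based data layout and introducing an I/O cost model for MapReduce join queries. Experiments on the LUBM benchmark dataset show that the method effectively improves the efficiency of complex queries while preserving space efficiency.
Keywords: RDF; RDF data query; MapReduce; HBase; query optimization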
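KREAG: a keyword query method for RDF data based on the entity-triple association graph (cited by: 14)
20
Authors: 李慧颖, 瞿裕忠. 《计算机学报》 (EI, CSCD, PKU Core), 2011, No. 5, pp. 825-835 (11 pages)
The rapid growth of Semantic Web data has made RDF data querying an important research topic. Keyword queries do not require knowledge of the data schema or of a query language, so they suit ordinary users better. This paper proposes KREAG (Keyword query over RDF data based on Entity-triple Association Graph), a keyword query method for RDF data. To let users query attribute or relation names, RDF data is modeled as an entity-triple association graph with labeled vertices. The model guarantees that associations between entities in the RDF data become paths between vertices in the association graph and that all textual information is encapsulated in the vertex labels. On this basis, the keyword query problem is transformed into the problem of finding directed Steiner trees in the association graph. An approximation algorithm with a guaranteed approximation ratio of m, where m is the number of query keywords, provides fast query response. A suitable scoring scheme measures the relevance of query results and supports top-k queries. The time complexity of the algorithm is O(m·|V|), where |V| is the number of vertices in the entity-triple association graph. Experiments show that KREAG responds faster than other methods and answers keyword queries over RDF data effectively.
Keywords: keyword query; RDF data; top-k; entity; association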