3,868 articles found
Question classification in question answering based on real-world web data sets (Cited by: 1)
1
Authors: 袁晓洁, 于士涛, 师建兴, 陈秋双. Source: Journal of Southeast University (English Edition), 2008, No. 3, pp. 272-275 (4 pages). Indexed in: EI, CAS.
To improve question answering (QA) performance based on real-world web data sets, a new set of question classes and a general answer re-ranking model are defined. With a pre-defined dictionary and grammatical analysis, the question classifier draws both semantic and grammatical information into information retrieval and machine learning methods in the form of various training features, including the question word, the main verb of the question, the dependency structure, the position of the main auxiliary verb, the main noun of the question, the top hypernym of the main noun, etc. The QA query results are then re-ranked by question class information. Experiments show that the questions in real-world web data sets can be accurately classified by the classifier, and that the QA results are clearly improved after re-ranking. The results indicate that, with both semantic and grammatical information, applications such as QA built upon real-world web data sets can achieve better performance.
Keywords: question classification; question answering; real-world web data sets; question and answer web forums; re-ranking model
A Deep Web Data Integration System for Job Search (Cited by: 6)
2
Authors: LIU Wei, LI Xian, LING Yanyan, ZHANG Xiaoyu, MENG Xiaofeng. Source: Wuhan University Journal of Natural Sciences, 2006, No. 5, pp. 1197-1201 (5 pages). Indexed in: CAS.
With the rapid development of the Web, more and more Web databases are available for users to access. At the same time, job searchers often have difficulty first finding the right sources and then querying over them, so an integrated job search system over Web databases has become a Web application in high demand. Based on this consideration, we build a deep Web data integration system that gives users unified access to multiple job Web sites, acting as a job meta-search engine. In this paper, the architecture of the system is given first, and the key components of the system are then introduced.
Keywords: web database; web data integration; job website
Web Data Cube Construction in Multidimensional On-line Analytical Processing Environment
3
Authors: 朱焱. Source: Journal of Southwest Jiaotong University (English Edition), 2007, No. 1, pp. 1-7 (7 pages).
This paper investigates how to integrate Web data into a multidimensional data warehouse (cube) for comprehensive on-line analytical processing (OLAP) and decision making. An approach for Web data-based cube construction is proposed, which includes Web data modeling based on MIX (Metadata based Integration model for data X-change), the design of generic and specific mapping rules, and a transformation algorithm for mapping Web data to a multidimensional array. In addition, the structure and implementation of a prototype Web data-based cube are discussed.
Keywords: web data warehousing; web data-based cube; MOLAP
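Developing Web Applications with the Web Data Window DTC in PB8 (Cited by: 1)
4
Authors: 华铨平. Source: 《现代计算机》, 2003, No. 7, pp. 73-75, 79 (4 pages).
The three-tier or multi-tier architecture of browser/Web server + application server/database server has become the mainstream of today's application development technology. This paper focuses on the use of the Web Data Window DTC in PowerBuilder 8.0 and describes how thin-client technology is implemented.
Keywords: Web Data Window DTC; database; PowerBuilder 8.0; Web; data window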
A Framework of Web Data Integrated LBS Middleware
5
Authors: MENG Xiaofeng, YIN Shaoyi, XIAO Zhen. Source: Wuhan University Journal of Natural Sciences, 2006, No. 5, pp. 1187-1191 (5 pages). Indexed in: CAS.
In this paper, we propose a flexible location-based service (LBS) middleware framework to make the development and deployment of new location-based applications much easier. Considering the World Wide Web as a huge data source of location-related information, we integrate commonly used web data extraction techniques into the middleware framework, exposing a unified web data interface to the upper-layer applications to make them more attractive. In addition, the framework addresses some common LBS issues, including positioning, location modeling, location-dependent query processing, and privacy and security management.
Keywords: location-based service (LBS); middleware; web data extraction
Web Data Aggregation in MOLAP: Approach, Language, and Implementation
6
Authors: 朱焱, 唐慧佳, 马永强. Source: Journal of Southwest Jiaotong University (English Edition), 2007, No. 3, pp. 179-186 (8 pages).
This paper investigates Web data aggregation issues in multidimensional on-line analytical processing (MOLAP) and presents a rule-driven aggregation approach. The core of the approach is defining aggregate rules. To define the rules for reading warehouse data and computing aggregates, a rule definition language, the array aggregation language (AAL), is developed. This language treats an array as a function from indexes to values and provides syntax and semantics based on monads. External functions can be called in aggregation rules to specify array reading, writing, and aggregating. Based on the features of AAL, array operations are unified as function operations, which can be easily expressed and automatically evaluated. To implement the aggregation approach, a processor is built for computing aggregates over the base cube and materializing them in the data warehouse, and the component structure and working principle of the aggregation processor are introduced.
Keywords: web data aggregation; aggregation language; MOLAP; aggregation processor
Web Database Query Interface Annotation Based on User Collaboration
7
Authors: LIU Wei, LIN Can, MENG Xiaofeng. Source: Wuhan University Journal of Natural Sciences, 2006, No. 5, pp. 1403-1406 (4 pages). Indexed in: CAS.
A vision-based query interface annotation method is used to relate attributes and form elements in form-based web query interfaces; this method can reach an accuracy of 82%. A user participation method is then used to tune the result: users can answer "yes" or "no" for existing annotations, or manually annotate form elements. Mass feedback is added to the annotation algorithm to produce more accurate results. With this approach, query interface annotation can reach perfect accuracy.
Keywords: web database; data integration; data extraction
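Application of Web Data Mining in Distance Education
8
Authors: 白伟. Source: 《山西科技》, 2009, No. 2, pp. 54-55 (2 pages).
Web data mining is used to analyze distance education. Based on the individual differences among learners, the paper proposes a framework for a personalized distance learning system and the concept of personalized service. It mines the relevant information and builds a distance education system that combines intelligence and personalization, thereby improving the current state of distance education services.
Keywords: web data mining; distance education; personalized learning; personalized service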
On Structure-based Web Data Extraction: The Model, Method and Application
9
Authors: 俞方桦, 戴玮, 陈家训. Source: Journal of China Textile University (English Edition), 2000, No. 4, pp. 103-106 (4 pages). Indexed in: EI, CAS.
Web data extraction obtains valuable data from the tremendous information resources of the World Wide Web according to a pre-defined pattern. It processes and classifies the data on the Web. A formalization of the Web data extraction procedure is presented, together with a description of the crawling and extraction algorithm. Based on the formalization, an XML-based page structure description language, TIDL, is introduced, including the object model, the HTML object reference model, and the definition of tags. Finally, a Web data gathering and querying application based on Internet agent technology, named Web Integration Services Kit (WISK), is described.
Keywords: World Wide Web; web mining; data extraction; HTML; XML
The Optimization and Improvement of MapReduce in Web Data Mining
10
Authors: Jun Qu, Chang-Qing Yin, Shangwei Song. Source: Journal of Software Engineering and Applications, 2015, No. 8, pp. 395-406 (12 pages).
Extracting and mining social network information from massive Web data is of both theoretical and practical significance. However, a defining feature of this task is large-scale data processing, which remains a great challenge. MapReduce is a distributed programming model: by implementing just two functions, map and reduce, distributed tasks can be carried out. Nevertheless, this model does not directly support the processing of heterogeneous datasets, which are common on the Web. This article proposes a framework that extends the original MapReduce framework into one called Map-Reduce-Merge. It adds a merge phase that can efficiently handle heterogeneous data processing. In addition, some optimizations and improvements are made based on the features of Web data.
Keywords: cloud computing; web data; MapReduce; Map-Reduce-Merge
Audiovisual Art Event Classification and Outreach Based on Web Extracted Data
11
Authors: Andreas Giannakoulopoulos, Minas Pergantis, Aristeidis Lamprogeorgos, Stella Lampoura. Source: Journal of Software Engineering and Applications, 2025, No. 1, pp. 24-43 (20 pages).
The World Wide Web provides a wealth of information about everything, including contemporary audio and visual art events, which are discussed on media outlets, blogs, and specialized websites alike. This information may become a robust source of real-world data, which may form the basis of an objective data-driven analysis. In this study, a methodology for automatically collecting information about audio and visual art events from a large array of websites is presented in detail. The process uses cutting-edge Semantic Web, Web Search and Generative AI technologies to convert website documents into a collection of structured data. The value of the methodology is demonstrated by creating a large dataset concerning audiovisual events in Greece. The collected information includes event characteristics, estimated metrics based on their text descriptions, outreach metrics based on the media that reported them, and a multi-layered classification of these events based on their type, subjects and methods used. This dataset is openly provided to the general and academic public through a Web application. Moreover, each event's outreach is evaluated using these quantitative metrics, the results are analyzed with an emphasis on classification popularity, and useful conclusions are drawn concerning the importance of artistic subjects, methods, and media.
Keywords: web data extraction; art events; classification; artistic outreach; online media
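Deviations in Platform Interconnection in the Web 3.0 Era and Countermeasures
12
Authors: 叶明, 姚莹. Source: 《南京邮电大学学报(社会科学版)》, 2026, No. 1, pp. 72-83 (12 pages).
Platform interconnection is an inherent requirement of the Web 3.0 era. However, it still suffers from multiple deviations: closed competitive practices persist despite repeated prohibition, discriminatory interconnection is increasingly evident, and the scope and depth of interconnection need to be improved. A close examination attributes these problems to the conflicts of interest embedded in platform interconnection, gaps in norms and technology, and the limitations of campaign-style regulation. In view of this, the approach to promoting interconnection should be renewed, shifting from mandatory interconnection to following the market, while the approach to balancing data openness and data privacy protection should be clarified so as to ease conflicts of interest. In terms of norms and technology, the institutional rules for data as a factor of production should be systematically improved and technical support strengthened, so as to remove concerns about implementation. In addition, the campaign-style regulatory model should be abandoned and a regular, ongoing regulatory mechanism for platform interconnection established.
Keywords: Web 3.0; platform; platform governance; platform interconnection; data; data regulation; data factor system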
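Design of a Web-Based Industrial Robot Language System
13
Authors: 彭玲, 姜立标, 王蕊, 谢杨钟. Source: 《计算机技术与发展》, 2026, No. 3, pp. 53-58 (6 pages).
To address the problems of traditional industrial robot language systems, such as instructions that are hard to extend, cumbersome program editing, a low degree of visualization, and low work efficiency, this paper proposes a design for a Web-based industrial robot language system. A B/S architecture replaces the original C (teach pendant)/S (controller) architecture, so that, without a teach pendant, a personal device can access the robot controller over the Internet to control the robot. Drawing on ideas from children's programming, robot program teaching is simplified to visual drag-and-drop, which greatly reduces the learning cost. A local Web server is built in the robot control system to enable data exchange between the host computer and the robot control system. In addition, a layered robot language interpreter is designed to parse the robot language efficiently. Finally, the design is verified through operation on a six-axis robot control system. The results show that the design offers good portability, operability, and extensibility, that system programming efficiency is effectively improved, and that the end-to-end response latency is less than 150 ms.
Keywords: industrial robot language; Web; B/S architecture; visualization; data exchange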
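A DataPool-Based Method for Generating and Maintaining Web Test Data (Cited by: 2)
14
Authors: 黄陇, 李诺, 金茂忠, 刘超. Source: 《计算机科学》, 2006, No. 10, pp. 272-274 (3 pages). Indexed in: CSCD, PKU Core Journals.
In view of the characteristics of test data for Web applications, this paper proposes a DataPool-based method for generating and maintaining Web application test data. On the basis of a formal definition of DataPool and a clear description of its semantics, the DataPool editing view provides test data generation methods that correspond to the different types of input fields on the browser side, along with various maintenance functions. The DataPool browsing view supports selecting test data individually or in batches.
Keywords: DataPool; Web testing; test data; data maintenance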
Intelligent and Adaptive Web Data Extraction System Using Convolutional and Long Short-Term Memory Deep Learning Networks (Cited by: 5)
15
Authors: Sudhir Kumar Patnaik, C. Narendra Babu, Mukul Bhave. Source: Big Data Mining and Analytics, 2021, No. 4, pp. 279-297 (19 pages). Indexed in: EI.
Data, collected using advanced web scraping technologies, are crucial to the growth of e-commerce in today's world of highly demanding, hyper-personalized consumer experiences. However, core data extraction engines fail because they cannot adapt to dynamic changes in website content. This study investigates an intelligent and adaptive web data extraction system with convolutional and Long Short-Term Memory (LSTM) networks that enables automated web page detection using the You only look once (Yolo) algorithm and uses Tesseract LSTM to extract product details, which are detected as images from web pages. This state-of-the-art system does not need a core data extraction engine and can thus adapt to dynamic changes in website layout. Experiments conducted on real-world retail cases demonstrate an image detection accuracy (precision) and a character extraction accuracy (precision) of 97% and 99%, respectively. In addition, a mean average precision of 74% is obtained with an input dataset of 45 objects or images.
Keywords: adaptive web scraping; deep learning; Long Short-Term Memory (LSTM); web data extraction; You only look once (Yolo)
Integrating Multi-Source Web Records into Relational Database (Cited by: 1)
16
Authors: HUANG Jianbin, JI Hongbing, SUN Heli. Source: Wuhan University Journal of Natural Sciences, 2006, No. 5, pp. 1177-1181 (5 pages). Indexed in: CAS.
How to integrate heterogeneous semi-structured Web records into a relational database is an important and challenging research topic. An improved conditional random fields model is presented that combines learning from labeled samples with unlabeled database records in order to reduce the dependence on tediously hand-labeled training data. The proposed model is used to solve the problem of schema matching between the data source schema and the database schema. Experimental results on a large number of Web pages from diverse domains show the effectiveness of the approach.
Keywords: web data integration; schema matching; conditional random fields
A Dynamic XML-NS View Based Approach for the Extensible Integration of Web Data Sources
17
Authors: WU Wei, LU Zheng-ding, LI Rui-xuan, WANG Zhi-gang. Source: Wuhan University Journal of Natural Sciences, 2004, No. 5, pp. 647-651 (5 pages). Indexed in: EI, CAS.
We propose a three-step technique to achieve this purpose. First, we utilize a collection of XML namespaces organized into a hierarchical structure as a medium for expressing data semantics. Second, we define the format of the resource descriptor for the information source discovery scheme so that we can dynamically register and/or deregister Web data sources on the fly. Third, we employ an inverted-index mechanism to identify the subset of information sources that are relevant to a particular user query. We describe the design, architecture, and implementation of our approach, IWDS, and illustrate its use through case examples.
CLC number: TP 311.13. Foundation item: Supported by the National Key Technologies R&D Program of China (2002BA103A04). Biography: WU Wei (1975-), male, Ph.D. candidate; research direction: information integration, distributed computing.
Keywords: integration; heterogeneity; web data source; XML namespace
An Efficient Mechanism for Product Data Extraction from E-Commerce Websites
18
Authors: Malik Javed Akhtar, Zahur Ahmad, Rashid Amin, Sultan H. Almotiri, Mohammed A. Al Ghamdi, Hamza Aldabbas. Source: Computers, Materials & Continua, 2020, No. 12, pp. 2639-2663 (25 pages). Indexed in: SCIE, EI.
A large amount of data is present on the web that can be used for purposes such as product recommendation, price comparison, and demand forecasting for a particular product. Websites are designed for human understanding, not for machines. Making the data machine-readable therefore requires techniques to grab data from web pages. Researchers have addressed the problem using two approaches: knowledge engineering and machine learning. State-of-the-art knowledge engineering approaches use the structure of documents, visual cues, clustering of the attributes of data records, and text processing techniques to identify data records on a web page. Machine learning approaches use annotated pages to learn rules, which are then used to extract data from unseen web pages. The structure of web documents is continuously evolving, so new techniques are needed to handle the emerging requirements of web data extraction. In this paper, we present a novel, simple, and efficient technique to extract data from web pages using the visual styles and structure of documents. The proposed technique detects the Rich Data Region (RDR) using the query and correlative words of the query. The RDR is then divided into data records using style similarity. Noisy elements are removed using a Common Tag Sequence (CTS) and formatting entropy. The system is implemented in Java and runs on a dataset of real-world working websites. The effectiveness of the results is evaluated using precision, recall, and F-measure and compared with five existing systems. The comparison of the proposed technique with existing systems shows encouraging results.
Keywords: document object model; rich data region; common tag sequence; web data extraction; deep web mining
Automatic Data Extraction from Websites for Generating Aquatic Product Market Information
19
Authors: 袁红春, 陈莹, 孙越夫. Source: Journal of Donghua University (English Edition), 2006, No. 6, pp. 15-19 (5 pages). Indexed in: EI, CAS.
Massive web-based information resources have led to an increasing demand for effective automatic retrieval of target information for web applications. This paper introduces a web-based data extraction tool that deploys various algorithms to locate, extract, and filter tabular data from HTML pages and to transform them into new web-based representations. The tool has been applied in an aquaculture web application platform for extracting and generating aquatic product market information. The results show that the tool is very effective in extracting the required data from web pages.
Keywords: web data; table localization algorithm; distance algorithm; data filtering algorithm; data extraction tool
Creating customized data services from web pages
20
Authors: 季光, Wang Guiling, Han Yanbo. Source: High Technology Letters, 2013, No. 2, pp. 203-207 (5 pages). Indexed in: EI, CAS.
To extract structured data from a web page according to customized requirements, a user labels some DOM elements on the page with attribute names. The common features of the labeled elements are used to guide the user through the labeling process, minimizing user effort, and are also used to retrieve attribute values. To turn the attribute values into a structured result, the attribute pattern needs to be induced. For this purpose, a space-optimized suffix tree called an attribute tree is built to transform the document object model (DOM) tree into a simpler form while preserving its useful properties, such as attribute sequence order. The pattern is induced bottom-up on the attribute tree and is then used to build the structured result. Experiments show that the approach performs well in terms of precision, recall, and structural correctness.
Keywords: web data extraction; structured data; user labeling; customization; data service