期刊文献+
共找到2篇文章
< 1 >
每页显示 20 50 100
An Efficient Mechanism for Product Data Extraction from E-Commerce Websites
1
作者 Malik Javed Akhtar Zahur Ahmad +3 位作者 Rashid Amin Sultan H.Almotiri Mohammed A.Al Ghamdi Hamza Aldabbas 《Computers, Materials & Continua》 SCIE EI 2020年第12期2639-2663,共25页
A large amount of data is present on the web which can be used for useful purposes like a product recommendation,price comparison and demand forecasting for a particular product.Websites are designed for human underst... A large amount of data is present on the web which can be used for useful purposes like a product recommendation,price comparison and demand forecasting for a particular product.Websites are designed for human understanding and not for machines.Therefore,to make data machine-readable,it requires techniques to grab data from web pages.Researchers have addressed the problem using two approaches,i.e.,knowledge engineering and machine learning.State of the art knowledge engineering approaches use the structure of documents,visual cues,clustering of attributes of data records and text processing techniques to identify data records on a web page.Machine learning approaches use annotated pages to learn rules.These rules are used to extract data from unseen web pages.The structure of web documents is continuously evolving.Therefore,new techniques are needed to handle the emerging requirements of web data extraction.In this paper,we have presented a novel,simple and efficient technique to extract data from web pages using visual styles and structure of documents.The proposed technique detects Rich Data Region(RDR)using query and correlative words of the query.RDR is then divided into data records using style similarity.Noisy elements are removed using a Common Tag Sequence(CTS)and formatting entropy.The system is implemented using JAVA and runs on the dataset of real-world working websites.The effectiveness of results is evaluated using precision,recall,and F-measure and compared with five existing systems.A comparison of the proposed technique to existing systems has shown encouraging results. 展开更多
关键词 document object model rich data region common tag sequence web data extraction deep web mining
在线阅读 下载PDF
SecureWeb: Protecting Sensitive Information Through the Web Browser Extension with a Security Token 被引量:4
2
作者 Shuang Liang Yue Zhang +3 位作者 Bo Li Xiaojie Guo Chunfu Jia Zheli Liu 《Tsinghua Science and Technology》 SCIE EI CAS CSCD 2018年第5期526-538,共13页
The leakage of sensitive data occurs on a large scale and with increasingly serious impact. It may cause privacy disclosure or even property damage. Password leakage is one of the fundamental reasons for information l... The leakage of sensitive data occurs on a large scale and with increasingly serious impact. It may cause privacy disclosure or even property damage. Password leakage is one of the fundamental reasons for information leakage, and its importance is must be emphasized because users are likely to use the same passwords for different Web application accounts. Existing approaches use a password manager and encrypted Web application to protect passwords and other sensitive data; however, they may be compromised or lack accessibility. The paper presents SecureWeb, which is a secure, practical, and user-controllable framework for mitigating the leakage of sensitive data. SecureWeb protects users' passwords and aims to provide a unified protection solution to diverse sensitive data. The efficiency of the developed schemes is demonstrated and the results indicate that it has a low overhead and are of practical use. 展开更多
关键词 password manager data privacy format-preserving encryption Shadow document Object model(DOM)
原文传递
上一页 1 下一页 到第
使用帮助 返回顶部