8 articles found
The design and implementation of web mining in web sites security (Cited by: 2)
1
Authors: LI Jian, ZHANG Guo-yin, GU Guo-chang, LI Jian-li (College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China). Journal of Marine Science and Application, 2003, No. 1, pp. 81-86 (6 pages)
Backdoors or information leaks in Web servers can be detected by applying Web mining techniques to abnormal Web log and Web application log data, which enhances the security of Web servers and avoids the damage caused by illegal access. First, a system for discovering patterns of information leakage in CGI scripts from Web log data is proposed. Second, those patterns are provided to system administrators so that they can modify their code and enhance their Web site security. Two aspects are described. One is to combine the Web application log with the Web log to extract more information, so that Web data mining can be used to mine the Web log for information that the firewall and the Information Detection System cannot find. The other is to propose an operation module for the Web site to enhance its security. In clustering server sessions, a density-based clustering technique is used to reduce resource cost and obtain better efficiency.
Keywords: data mining; web log mining; web sites security; density-based clustering
Enhancing the usage pattern mining performance with temporal segmentation of QPop Increment in digital libraries
2
Authors: 曹三省, KLEIN R. Rody, 刘剑波. Journal of Zhejiang University-Science A (Applied Physics & Engineering); indexed in SCIE, EI, CAS, CSCD; 2005, No. 11, pp. 1290-1296 (7 pages)
The convergence of next-generation networks and the emergence of new media systems have made media-rich digital libraries popular in both application and research. Discovering the usage patterns of media content objects, where QPop Increment is the characteristic feature under study, is the basis of intelligent data migration scheduling, the key issue for these systems in managing the massive storage facilities in their backbones effectively. In this paper, a clustering algorithm based on temporal segmentation of QPop Increment is established to improve mining performance. We employed the standard C-Means algorithm as the clustering kernel and carried out the experimental mining process with segmented QPop Increments obtained in actual applications. The results indicate that the improved algorithm outperforms the basic one on important indices such as clustering cohesion. The experimental study is based on a Media Assets Library prototype developed for the advertainment movie production project for Olympics 2008, with the support of both the Humanistic Olympics Study Center in Beijing and the China State Administration of Radio, Film and TV.
Keywords: Media-rich; Digital library; Data migration; Media content; log mining; QPop
Web multimedia information retrieval using improved Bayesian algorithm (Cited by: 3)
3
Authors: 余铁军, 陈纯, 余铁民, 林怀忠. Journal of Zhejiang University Science; indexed in EI, CSCD; 2003, No. 4, pp. 415-420 (6 pages)
The main thrust of this paper is the application of a novel data mining approach to the log of user's feedback to improve Web multimedia information retrieval performance. A user space model was constructed based on data mining and then integrated into the original information space model to improve the accuracy of the new information space model. It can remove clutter and irrelevant text information and help to eliminate the mismatch between the page author's expression and the user's understanding and expectation. The user space model was also used to discover the relationship between high-level and low-level features for assigning weights. The authors proposed an improved Bayesian algorithm for data mining. Experiments showed that the authors' proposed algorithm was efficient.
Keywords: Relevant feedback; Web log mining; Improved Bayesian algorithm; User space model
Obtaining Profiles Based on Localized Non-negative Matrix Factorization (Cited by: 2)
4
Authors: JIANG Ji-xiang, XU Bao-wen, LU Jian-jiang, ZHOU Xiao-yu. Wuhan University Journal of Natural Sciences; indexed in EI, CAS; 2004, No. 5, pp. 580-584 (5 pages)
Nonnegative matrix factorization (NMF) is a method for obtaining parts-based features of information and forming typical profiles. However, the basis vectors that NMF produces are not orthogonal, so the parts-based features are usually redundant. In this paper, we propose two different approaches based on localized non-negative matrix factorization (LNMF) to obtain typical user session profiles and typical semantic profiles of junk mails. LNMF obtains basis vectors that are as orthogonal as possible, so it can yield accurate profiles. The experiments show that the approach based on LNMF obtains better profiles than the approach based on NMF.
CLC number: TP 391
Foundation item: Supported by the National Natural Science Foundation of China (60373066, 60303024), National Grand Fundamental Research 973 Program of China (2002CB312000), National Research Foundation for the Doctoral Program of Higher Education of China (20020286004).
Biography: Jiang Ji-xiang (1980-), male, Master candidate, research direction: data mining, knowledge representation on the Web.
Keywords: localized non-negative matrix factorization; profile; log mining; mail filtering
Testing and Evaluation for Web Usability Based on Extended Markov Chain Model (Cited by: 2)
5
Authors: MAO Cheng-ying, LU Yan-sheng. Wuhan University Journal of Natural Sciences; indexed in EI, CAS; 2004, No. 5, pp. 687-693 (7 pages)
With the increasing popularity and complexity of Web applications and the emergence of their new characteristics, the testing and maintenance of large, complex Web applications are becoming more complex and difficult. Web applications generally contain many pages and are used by an enormous number of users, and statistical testing is an effective way of ensuring their quality. Web usage can be accurately described by a Markov chain, which has been proved to be an ideal model for software statistical testing. In the extended Markov chain model (EMM), the results of unit testing can be reused in later stages, which is an important strategy for bottom-up integration testing; the other improvement of the EMM is the error type vector, which is treated as part of a page node. This paper also proposes an algorithm for generating test cases of usage paths. Finally, optional usage reliability evaluation methods and an incremental usability regression testing model for testing and evaluation are presented.
CLC number: TP311.5
Foundation item: Supported by the National Defence Research Project (No. 41315.9.2) and National Science and Technology Plan (2001BA102A04-02-03).
Biography: MAO Cheng-ying (1978-), male, Ph.D. candidate, research direction: software testing. Research direction: advanced database system, software testing, component technology and data mining.
Keywords: statistical testing; evaluation for Web usability; extended Markov chain model (EMM); Web log mining; reliability evaluation
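The abstract above states that Web usage can be described by a Markov chain. As a minimal sketch of that general idea only (not the authors' EMM, which also carries an error type vector), first-order transition probabilities can be estimated from page sequences extracted from a Web log. The function name `estimate_transitions` and the sample sessions are hypothetical and purely illustrative.

```python
from collections import defaultdict


def estimate_transitions(sessions):
    """Estimate first-order Markov transition probabilities between pages.

    `sessions` is a list of page sequences, one per user session.
    Returns {source_page: {target_page: probability}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for pages in sessions:
        # Count each observed step from one page to the next.
        for src, dst in zip(pages, pages[1:]):
            counts[src][dst] += 1

    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        # Normalize the counts of each source page into probabilities.
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


# Hypothetical sessions, assumed to be already reconstructed from a Web log.
sessions = [
    ["home", "list", "item"],
    ["home", "list", "home"],
    ["home", "item"],
]
transitions = estimate_transitions(sessions)
```

A usage model of this kind could then drive statistical test generation by sampling paths in proportion to the estimated probabilities; how the paper actually does this is not specified in the abstract.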
Fuzzy Clustering Method for Web User Based on Pages Classification (Cited by: 2)
6
Authors: ZHAN Li-qiang, LIU Da-xin. Wuhan University Journal of Natural Sciences; indexed in EI, CAS; 2004, No. 5, pp. 553-556 (4 pages)
This article proposes a new method for fuzzy clustering of Web users based on analysis of user interest characteristics. The method first defines fuzzy page categories according to the links on the index page of the site, then computes the fuzzy degree of cross pages by aggregating Web log data. After that, using the fuzzy comprehensive evaluation method, it constructs user interest vectors according to page viewing times and frequency of hits, and derives the fuzzy similarity matrix for the Web users from the interest vectors. Finally, it obtains the clustering result through the fuzzy clustering method. The experimental results show the effectiveness of the method.
CLC number: TP18; TP311; TP391
Foundation item: Supported by the Natural Science Foundation of Heilongjiang Province of China (F0304).
Biography: ZHAN Li-qiang (1966-), male, Lecturer, Ph.D., research direction: the theory and methods of data mining and the theory of databases.
Keywords: Web log mining; fuzzy similarity matrix; fuzzy comprehensive evaluation; fuzzy clustering
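The abstract above derives a fuzzy similarity matrix from user interest vectors but does not say which similarity measure is used. The sketch below assumes the common min/max ratio (the sum of element-wise minima divided by the sum of element-wise maxima); that choice, the function name `fuzzy_similarity`, and the sample vectors are assumptions for illustration, not the authors' method.

```python
def fuzzy_similarity(vectors):
    """Build a symmetric fuzzy similarity matrix from interest vectors.

    Assumed measure: sum(min(x, y)) / sum(max(x, y)) over vector components.
    All components are expected to be non-negative membership degrees.
    """
    n = len(vectors)
    matrix = [[1.0] * n for _ in range(n)]  # each user is fully similar to itself
    for i in range(n):
        for j in range(i + 1, n):
            num = sum(min(a, b) for a, b in zip(vectors[i], vectors[j]))
            den = sum(max(a, b) for a, b in zip(vectors[i], vectors[j]))
            # Two all-zero vectors are treated as identical.
            sim = num / den if den else 1.0
            matrix[i][j] = matrix[j][i] = sim
    return matrix


# Hypothetical interest vectors: one row per user, one column per page category.
interest = [
    [0.8, 0.2, 0.0],
    [0.6, 0.4, 0.0],
    [0.0, 0.1, 0.9],
]
similarity = fuzzy_similarity(interest)
```

With these sample vectors, the first two users come out far more similar to each other than either is to the third. A fuzzy clustering step would then operate on this matrix, for example by thresholding its transitive closure.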
Learning Query Ambiguity Models by Using Search Logs (Cited by: 1)
7
Authors: 宋睿华, 窦志成, 洪小文, 俞勇. Journal of Computer Science & Technology; indexed in SCIE, EI, CSCD; 2010, No. 4, pp. 728-738 (11 pages)
Identifying ambiguous queries is crucial to research on personalized Web search and search result diversity. Intuitively, query logs contain valuable information on how many intentions users have when issuing a query. However, previous work showed that user clicks alone are misleading in judging whether a query is ambiguous. In this paper, we address the problem of learning a query ambiguity model by using search logs. First, we propose enriching a query by mining the documents clicked by users and the relevant follow-up queries in a session. Second, we use a text classifier to map the documents and the queries into predefined categories. Third, we propose extracting features from the processed data. Finally, we apply a state-of-the-art algorithm, Support Vector Machine (SVM), to learn a query ambiguity classifier. Experimental results verify that the sole use of click-based features or session-based features performs worse than the previous work based on top retrieved documents. When we combine the two sets of features, our proposed approach achieves the best effectiveness, specifically 86% in terms of accuracy. It significantly improves on the click-based method by 5.6% and the session-based method by 4.6%.
Keywords: ambiguous query; log mining; query classification
Proactive planning of bandwidth resource using simulation-based what-if predictions for Web services in the cloud
8
Authors: Jianpeng HU, Linpeng HUANG, Tianqi SUN, Ying FAN, Wenqiang HU, Hao ZHONG. Frontiers of Computer Science; indexed in SCIE, EI, CSCD; 2021, No. 1, pp. 25-52 (28 pages)
Resource planning is becoming an increasingly important and timely problem for cloud users. As more Web services are moved to the cloud, minimizing network usage is often a key driver of cost control. Most existing approaches focus on resources such as CPU, memory, and disk I/O. In particular, CPU receives the most attention from researchers, but bandwidth is somewhat neglected. It is challenging to predict the network throughput of modern Web services because of diverse and complex responses, evolving Web services, and complex network transportation. In this paper, we propose a methodology of what-if analysis, named Log2Sim, to plan the bandwidth resource of Web services. Log2Sim uses a lightweight workload model to describe user behavior, an automated mining approach to obtain characteristics of workloads and responses from massive Web logs, and traffic-aware simulations to predict the impact on bandwidth consumption and response time in changing contexts. We use a real-life Web system and a classic benchmark to evaluate Log2Sim in multiple scenarios. The evaluation result shows that Log2Sim performs well in predicting bandwidth consumption: the average relative error is 2% for the benchmark and 8% for the real-life system. As for the response time, Log2Sim cannot produce accurate predictions for every single service request, but the simulation results always show similar trends in average response time as workloads increase in different changing contexts. It can provide sufficient information for the system administrator in proactive bandwidth planning.
Keywords: what-if analysis; bandwidth management; network simulation; Web service; log mining; resource planning; evolution; OPNET