453 articles found
A Fast Interactive Sequential Pattern Mining Algorithm (cited by: 1)
1
Authors: LU Jie-Ping, LIU Yue-bo, NI Wei-wei, LIU Tong-ming, SUN Zhi-hui. 《Wuhan University Journal of Natural Sciences》 (EI, CAS), 2006, No. 1, pp. 31-36 (6 pages)
To reduce the computational and spatial complexity of rerunning an algorithm for each sequential pattern query, this paper proposes a fast interactive sequential pattern mining algorithm (FISP) based on previously mined sequential patterns and projection databases. By building on the previously mined sequences, FISP reduces the number of frequent items in the projection databases constructed during mining. Furthermore, the number of iterations is greatly reduced by using a global threshold. Experimental results show that FISP outperforms PrefixSpan in interactive mining.
Keywords: data mining; sequential patterns; interactive mining; projection database
Application of Data Mining Method to Improve the Accuracy of Springback Prediction in Sheet Metal Forming
2
Authors: 许京荆, 张志伟, 吴益敏. 《Journal of Shanghai University(English Edition)》 (CAS), 2004, No. 3, pp. 348-353 (6 pages)
A new method was developed to improve the precision of springback prediction in sheet metal forming by combining the finite element method (FEM) with data mining (DM). First, a genetic algorithm (GA) was adopted to identify the material parameters. Then, following the even design idea, a suitable calculation scheme was chosen and FEM was used to calculate the springback. The computed results were compared with experimental data, the differences between them were taken as source data, and a new DM pattern recognition method, the hierarchical optimal map recognition method (HOMR), was applied to summarize the calculation rules in FEM. Finally, a mathematical model of the springback simulation was established. With this model, the calculation error of springback can be kept within 10% of the experimental results.
Keywords: springback prediction; pattern recognition; genetic algorithm; FEM; even design idea; HOMR; data mining
A Fast Algorithm for Mining Sequential Patterns from Large Databases
3
Authors: 陈宁, 陈安, 周龙骧, 刘鲁. 《Journal of Computer Science & Technology》 (SCIE, EI, CSCD), 2001, No. 4, pp. 359-370 (12 pages)
Mining sequential patterns from large databases has been recognized by many researchers as an attractive task in data mining and knowledge discovery. Previous algorithms scan the database many times, which is often impractical because the databases are very large. In this paper, the authors introduce an effective algorithm for mining sequential patterns from large databases. After the first pass, the algorithm does not use the original database at all to count the support of sequences. Instead, a tidlist structure generated in the previous pass is used for this purpose through set intersection operations, avoiding multiple scans of the database.
Keywords: data mining; knowledge discovery; sequential pattern; set operation
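The abstract above does not spell out the tidlist structure, so the following is only a minimal sketch of the general idea, under stated assumptions: one scan records, for every item, the sequences and positions where it occurs, and the support of a longer sequence is then obtained by intersecting those id-lists instead of rescanning the data. The names `build_id_lists`, `extend` and `support`, and the restriction to sequences of single items, are illustrative assumptions and not the paper's design.

```python
from bisect import bisect_right
from collections import defaultdict


def build_id_lists(sequences):
    """Single scan: map each item to {sequence id: sorted positions}."""
    id_lists = defaultdict(dict)
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            id_lists[item].setdefault(sid, []).append(pos)
    return id_lists


def extend(pattern_ends, item_positions):
    """Join a pattern's id-list with an item's id-list.

    pattern_ends maps sid -> earliest position at which the pattern can end.
    Only sequence ids present in both lists can support the longer pattern.
    """
    joined = {}
    for sid in pattern_ends.keys() & item_positions.keys():  # set intersection
        positions = item_positions[sid]
        i = bisect_right(positions, pattern_ends[sid])  # first later occurrence
        if i < len(positions):
            joined[sid] = positions[i]
    return joined


def support(pattern, id_lists):
    """Number of sequences containing `pattern` as a subsequence."""
    first, *rest = pattern
    ends = {sid: pos[0] for sid, pos in id_lists.get(first, {}).items()}
    for item in rest:
        ends = extend(ends, id_lists.get(item, {}))
    return len(ends)


# Hypothetical toy database of four sequences.
id_lists = build_id_lists(["abcb", "acb", "ba", "abc"])
print(support(("a", "b"), id_lists))       # 3
print(support(("a", "b", "c"), id_lists))  # 2
```

Tracking only the earliest end position per sequence is enough to decide whether a pattern occurs at all, which keeps each join proportional to the size of the smaller id-list.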
A Novel Incremental Mining Algorithm of Frequent Patterns for Web Usage Mining (cited by: 1)
4
Authors: DONG Yihong, ZHUANG Yueting, TAI Xiaoying. 《Wuhan University Journal of Natural Sciences》 (CAS), 2007, No. 5, pp. 777-782 (6 pages)
Because data warehouses change frequently, incremental data can invalidate previously mined knowledge. To maintain the discovered knowledge and patterns dynamically, this study presents IPARUC, a novel algorithm for updating global frequent patterns. IPARUC first applies a rapid clustering method to divide the database into n parts so that the data within each part are similar. Then, during insertion, the nodes of the tree are adjusted dynamically by "pruning and laying back" to keep them in descending order of frequency, so that nodes are shared as much as possible. Finally, the local frequent itemsets mined from each local dataset are merged into global frequent itemsets. Experiments show that IPARUC is more effective and efficient than the two methods it is compared with. The approach also has significant application potential in a prototype Web log analyzer for web usage mining, where it can help discover useful knowledge and support managers' decision making.
Keywords: incremental algorithm; association rule; frequent pattern tree; web usage mining
Mapping frequent spatio-temporal wind profile patterns using multi-dimensional sequential pattern mining
5
Authors: Norhakim Yusof, Raul Zurita-Milla. 《International Journal of Digital Earth》 (SCIE, EI), 2017, No. 3, pp. 238-256 (19 pages)
A holistic understanding of wind behaviour over space, time and height is essential for wind energy harvesting applications. This study presents a novel approach for mapping frequent wind profile patterns using multidimensional sequential pattern mining (MDSPM). The study is illustrated with a 24-year time series of European Centre for Medium-Range Weather Forecasts European Reanalysis-Interim gridded (0.125° × 0.125°) wind data for the Netherlands, sampled every 6 h at six height levels. The wind data were first transformed into two spatio-temporal sequence databases (for speed and direction, respectively). Then, the Linear time Closed Itemset Miner Sequence algorithm was used to extract the multidimensional sequential patterns, which were visualized using a 3D wind rose, a circular histogram and a geographical map. These patterns were further analysed to determine their wind shear coefficients and turbulence intensities as well as their spatial overlap with areas that currently have wind turbines. The analysis identified four frequent wind profile patterns. One of them is highly suitable for harvesting wind energy at a height of 128 m, and 68.97% of the geographical area covered by this pattern already contains wind turbines. The study shows that the proposed approach can efficiently extract meaningful patterns from complex spatio-temporal datasets.
Keywords: spatio-temporal data mining; multi-dimensional sequential pattern mining; wind shear coefficient; turbulence intensity; wind energy
From Sequential Pattern Mining to Structured Pattern Mining: A Pattern-Growth Approach (cited by: 18)
6
Authors: Jia-Wei Han, Jian Pei, Xi-Feng Yan. 《Journal of Computer Science & Technology》 (SCIE, EI, CSCD), 2004, No. 3, pp. 257-279 (23 pages)
Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem, since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential pattern mining method, and (ii) SPADE, a vertical format-based method; and (2) a pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for mining structured patterns. In this study, we give a systematic introduction to and presentation of the pattern-growth methodology and study its principles and extensions. We first introduce two interesting pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential pattern mining. Then we introduce gSpan for mining structured patterns using the same methodology. Their relative performance on large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including mining multi-level, multi-dimensional patterns and mining constraint-based patterns.
Keywords: data mining; sequential pattern mining; structured pattern mining; scalability; performance analysis
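To make the pattern-growth idea concrete, here is a minimal PrefixSpan-style sketch. It assumes each sequence is a plain list of single items (no itemsets, no pseudo-projection or other optimizations), so it illustrates the recursion rather than reproducing the published algorithm. The function name `prefixspan` and the toy data are illustrative.

```python
from collections import Counter


def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for sequences of single items."""
    results = {}

    def grow(prefix, projected):
        # Count each item at most once per projected suffix.
        counts = Counter()
        for suffix in projected:
            counts.update(set(suffix))
        for item, count in sorted(counts.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Projected database: what follows the first occurrence of item.
            next_db = [s[s.index(item) + 1:] for s in projected if item in s]
            grow(pattern, next_db)

    grow((), [list(s) for s in sequences])
    return results


# Hypothetical toy database.
for pattern, count in prefixspan(["abc", "ac", "bc"], min_support=2).items():
    print(pattern, count)
```

Each recursive call only looks at suffixes that already contain the current prefix, so no candidate sequences are generated and then tested against the full database.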
Finding frequent trajectories by clustering and sequential pattern mining (cited by: 4)
7
Authors: Arthur A. Shaw, N. P. Gopalan. 《Journal of Traffic and Transportation Engineering(English Edition)》, 2014, No. 6, pp. 393-403 (11 pages)
Data mining is a powerful emerging technology that helps to extract hidden information from huge volumes of historical data. This paper is concerned with finding the frequent trajectories of moving objects in spatio-temporal data using a novel method that combines clustering and sequential pattern mining. The algorithms logically split the trajectory span area into clusters and then apply the k-means algorithm over these clusters until the squared error is minimized. The new method applies a threshold to obtain active clusters and arranges them in descending order of the number of trajectories passing through them. From these active clusters, inter-cluster patterns are found by a sequential pattern mining technique. The process is repeated until all the active clusters are linked. The clusters thus linked in sequence are the frequent trajectories. Experiments on real datasets show that the proposed method is about five times better than existing ones. The results are compared with those of other algorithms, and their variation is analyzed by statistical methods. Further, significance tests are conducted with ANOVA to find an efficient threshold value for the optimum plot of frequent trajectories. The results are found to be superior to those of existing methods. With minor alterations, the approach may be relevant to finding alternate paths in busy networks (congestion control), finding the frequent paths of migratory birds, predicting the next level of pattern characteristics in time series data, and finding the frequent paths of balls in certain games.
Keywords: data mining; frequent trajectory; clustering; sequential pattern mining; statistical method
A New Algorithm for Mining Frequent Pattern (cited by: 2)
8
Authors: 李力, 靳蕃. 《Journal of Southwest Jiaotong University(English Edition)》, 2002, No. 1, pp. 10-20 (11 pages)
Mining frequent patterns in transaction databases, time series databases, and many other kinds of databases has been widely studied in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach, but candidate set generation is very costly. Han J. proposed a novel algorithm, FP-growth, that generates frequent patterns without candidate sets. Based on an analysis of FP-growth, this paper introduces the concept of an equivalent FP-tree and proposes an improved algorithm, denoted FP-growth*, which is much faster and easy to implement. FP-growth* adopts a modified FP-tree and header table structure, generates only a header table in each recursive operation, and projects the tree onto the original FP-tree. The two algorithms produce the same frequent pattern set for the same transaction database, but the performance study shows that FP-growth* is at least twice as fast as FP-growth.
Keywords: data mining; algorithm; frequent pattern set; FP-growth
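For readers unfamiliar with the FP-tree and header table mentioned in the abstract, the sketch below builds a standard FP-tree and extracts the conditional pattern base of one item. It shows the baseline structure only; the modified tree and header table of FP-growth* are not described in enough detail here to reproduce, and the names `Node`, `build_fp_tree` and `conditional_pattern_base` are illustrative assumptions.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count, and links to parent/children."""

    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Return (root, header), where header maps item -> list of its nodes."""
    freq = Counter(item for t in transactions for item in set(t))
    keep = {i for i, c in freq.items() if c >= min_support}
    root, header = Node(None, None), {}
    for t in transactions:
        # Order frequent items by descending frequency (ties by name) so
        # transactions with common items share a prefix path.
        items = sorted(set(t) & keep, key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = Node(item, node)
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


def conditional_pattern_base(item, header):
    """Prefix paths leading to `item`, each paired with that node's count."""
    base = []
    for node in header.get(item, []):
        path, parent = [], node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        base.append((path[::-1], node.count))
    return base


# Hypothetical toy transactions.
transactions = [["a", "b"], ["b", "c", "d"], ["a", "b", "c"], ["a", "b", "d"]]
root, header = build_fp_tree(transactions, min_support=2)
print(conditional_pattern_base("d", header))
```

FP-growth recurses on such conditional pattern bases, building a smaller tree for each item; that recursive tree construction is the step the abstract says FP-growth* avoids by generating only a header table per recursion.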
Improved Pattern Tree for Incremental Frequent-Pattern Mining (cited by: 1)
9
Authors: 周明, 王太勇. 《Transactions of Tianjin University》 (EI, CAS), 2010, No. 2, pp. 129-134 (6 pages)
By analyzing the existing prefix-tree data structure, an improved pattern tree was introduced for processing new transactions. It first stored transactions in a lexicographically ordered tree and then restructured the tree by sorting each path in frequency-descending order. When updating the improved pattern tree, there was no need to rescan the entire new database or build a new tree. A test was performed on the synthetic dataset T10I4D100K, with 100,000 transactions and 870 items. Experimental results show that the smaller the minimum support threshold, the greater the speed advantage of the improved pattern tree over CanTree for all datasets. As the minimum support threshold increased from 2% to 3.5%, the runtime decreased from 452.71 s to 186.26 s, while the runtime required by CanTree decreased from 1367.03 s to 432.19 s. When the database was updated, the execution time of the improved pattern tree consisted of the construction of the original improved pattern trees and the reconstruction of the initial tree; the runtime was about 15% lower than that of CanTree. As the number of transactions increased, the runtime of the improved pattern tree was about 25% shorter than that of FP-tree. The improved pattern tree also required less memory than CanTree.
Keywords: data mining; association rules; improved pattern tree; incremental mining
Quantum Algorithm for Mining Frequent Patterns for Association Rule Mining (cited by: 1)
10
Authors: Abdirahman Alasow, Marek Perkowski. 《Journal of Quantum Information Science》 (CAS), 2023, No. 1, pp. 1-23 (23 pages)
Generating maximum frequent patterns from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modify Grover's search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check the count against a minimum support threshold. The proposed algorithm increases the rate of correct solutions, since the search is restricted to a subspace. Furthermore, the algorithm reduces the number of qubits required by the design, which directly improves performance. The proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
Keywords: data mining; association rule mining; frequent pattern; Apriori algorithm; quantum counter; quantum comparator; Grover's search algorithm
Analyzing Sequential Patterns in Retail Databases
11
Author: Unil Yun. 《Journal of Computer Science & Technology》 (SCIE, EI, CSCD), 2007, No. 2, pp. 287-296 (10 pages)
Finding correlated sequential patterns in large sequence databases is an essential data mining task: a huge number of sequential patterns are usually mined, but it is hard to find those that are correlated. The data analysis needed depends on the requirements of the real application. In previous mining approaches, sequential patterns with weak affinity are found even with a high minimum support. In this paper, a new framework is suggested for mining weighted support affinity patterns, in which an objective measure, sequential ws-confidence, is developed to detect correlated sequential patterns with weighted support affinity. To prune weak affinity patterns efficiently, it is proved that the ws-confidence measure satisfies the anti-monotone and cross weighted support properties, which can be applied to eliminate sequential patterns with dissimilar weighted support levels. Based on the framework, a weighted support affinity pattern mining algorithm (WSMiner) is suggested. The performance study shows that WSMiner is efficient and scalable for mining weighted support affinity patterns.
Keywords: data mining; sequential pattern mining; sequential ws-confidence; weighted support affinity
Fast Discovering Frequent Patterns for Incremental XML Queries
12
Authors: PENG Dun-lu, QIU Yang. 《Wuhan University Journal of Natural Sciences》 (EI, CAS), 2004, No. 5, pp. 638-646 (9 pages)
It is nontrivial to maintain discovered frequent query patterns in a real XML-DBMS, because the transaction database of queries may be updated frequently, and such updates may not only invalidate some existing frequent query patterns but also generate new ones. In this paper, two incremental updating algorithms, FUX-QMiner and FUXQMiner, are proposed for efficiently maintaining discovered frequent query patterns and generating new frequent query patterns when new XML queries are added to the database. Experimental results from our implementation show that the proposed algorithms perform well. CLC number: TP 311. Foundation item: Supported by the Youthful Foundation for Scientific Research of University of Shanghai for Science and Technology. Biography: PENG Dun-lu (1974-), male, associate professor, Ph.D.; research direction: data mining, Web services and their applications, peer-to-peer computing.
Keywords: XML; frequent query pattern; incremental algorithm; data mining
Shape-Based Time Series Similarity Measure and Pattern Discovery Algorithm
13
Authors: Zeng Fanzi, Qiu Zhengding, Li Dongsheng, Yue Jianhai. 《Journal of Electronics(China)》, 2005, No. 2, pp. 142-148 (7 pages)
Pattern discovery from time series is of fundamental importance. Most pattern discovery algorithms for time series capture the values of the series based on some kind of similarity measure. Because they are affected by scale and baseline, value-based methods run into problems when the objective is to capture shape. A shape-based similarity measure, the Sh measure, is therefore proposed, and its properties are given with proofs. A time series shape pattern discovery algorithm based on the Sh measure is then put forward. The proposed algorithm terminates in a finite number of iterations with given computational and storage complexity. Finally, experiments on synthetic datasets and sunspot datasets demonstrate that the time series shape pattern algorithm is valid.
Keywords: shape similarity measure; pattern discovery algorithm; time series; data mining
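A Frequent Itemset Mining Algorithm Based on Frequent Pattern Trees and Deep Learning (cited by: 1)
14
Authors: 李洋, 李华. 《黑龙江工业学院学报(综合版)》, 2025, No. 1, pp. 94-98 (5 pages)
With the rapid growth of data volumes, mining valuable information from massive data has become especially important. Frequent itemset mining, a key area of data mining, aims to identify itemsets that occur frequently in a dataset; these itemsets reveal intrinsic relationships in the data and provide a basis for subsequent advanced analysis. However, traditional frequent itemset mining algorithms face accuracy and efficiency challenges on large-scale datasets. To address these problems, this study proposes a new frequent itemset mining algorithm that combines frequent pattern trees and deep learning. The algorithm first uses a deep belief network to extract high-level features from the data and then builds a frequent pattern tree on these features to mine frequent itemsets efficiently. Experimental results show that the algorithm performs well in both recall and precision, with recall as high as 97.56% and precision as high as 95.49%, indicating high accuracy and broad applicability in practice.
Keywords: frequent pattern tree; deep learning; frequent itemset; data mining; mining algorithm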
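Exploring Medication Patterns in Children Aged 1-14 Years with Atopic Dermatitis at a Tertiary Grade-A Hospital Based on Data Mining
15
Authors: 雷婷, 程云生, 殷方雄, 王媛媛. 《儿科药学杂志》, 2025, No. 5, pp. 26-30 (5 pages)
Objective: To use data mining to analyse current drug use and drug associations in children aged 1-14 years with atopic dermatitis in the dermatology department of a tertiary grade-A hospital, providing a basis for improving the rationality of clinical medication, exploring treatment regimens and supporting follow-up research. Methods: Outpatient records of children aged 1-14 years with atopic dermatitis who visited the dermatology department in 2024 were retrieved from the hospital's electronic medical record management platform; prescription information was recorded, and an Apriori association model was built to analyse associations in drug use. Results: A total of 1317 outpatient prescriptions were included, of which 428 (32.50%) were for moderate-to-severe atopic dermatitis. Association rule analysis of individual drugs showed that the rule "triamcinolone acetonide and econazole cream → loratadine syrup" ranked highest among two-drug combinations (support 35.53%, confidence 46.03%), and the rule "triamcinolone acetonide and econazole cream + loratadine syrup → urea and vitamin E cream" ranked highest among three-drug combinations (support 16.35%, confidence 43.89%). Association analysis of drug classes showed that the rule "topical glucocorticoids → oral anti-allergic drugs (antihistamines)" ranked highest among two-class combinations (support 74.47%, confidence 67.62%), and the rule "topical glucocorticoids + topical calcineurin inhibitors → oral anti-allergic drugs (antihistamines)" ranked highest among three-class combinations (support 23.41%, confidence 66.33%). Conclusion: Through association analysis of frequently used drugs, the study summarizes the drugs and combination regimens commonly used to treat atopic dermatitis in children aged 1-14 years at the authors' hospital. These fit the treatment characteristics of the disease, are largely consistent with clinical guideline recommendations, and can offer medical and pharmacy professionals reference directions and pathways for diagnosis and treatment. Management of the off-label use of some drugs still needs to be strengthened to ensure medication safety in children.
Keywords: data mining; Apriori algorithm; atopic dermatitis; medication patterns; association analysis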
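The support and confidence figures quoted in the abstract are the two standard measures attached to an association rule. As a minimal sketch, the function below computes them for a rule over a list of transactions. The abstract does not say which support convention the authors' software reports (some tools report the share of records containing the antecedent, others the share containing the whole rule), so both are returned. The function name `rule_metrics` and the toy prescriptions are hypothetical and are not data from the study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Records containing every item of the antecedent.
    n_antecedent = sum(antecedent <= set(t) for t in transactions)
    # Records containing every item of both sides of the rule.
    n_both = sum((antecedent | consequent) <= set(t) for t in transactions)
    return {
        "antecedent_support": n_antecedent / n,
        "rule_support": n_both / n,
        "confidence": n_both / n_antecedent if n_antecedent else 0.0,
    }


# Hypothetical prescriptions, each a set of drug labels.
prescriptions = [
    {"cream", "syrup"},
    {"cream"},
    {"cream", "syrup", "emollient"},
    {"syrup"},
]
print(rule_metrics(prescriptions, {"cream"}, {"syrup"}))
```

Confidence is the rule support divided by the antecedent support, so any two of the three values determine the third.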
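A Multimodal Communication Signal Synchronization Method Based on an Incremental Mining Algorithm for Sequential Big Data
16
Author: 杜婧子. 《计算技术与自动化》, 2025, No. 3, pp. 1-5 (5 pages)
In multimodal communication signal synchronization, the complexity of the signal features interferes with the subsequent frequency offset estimation, leading to relatively high TIE values in the synchronization results. To address this, a multimodal communication signal synchronization method based on an incremental mining algorithm for sequential big data is proposed. A channel model of the multimodal communication signal is established, and the incremental sequential big data mining algorithm is used to cluster the signals, from which the temporal features of the different clusters are extracted. Using these features, an M-th power operation is applied to the signal, and the corresponding frequency offset is estimated with an FFT. On this basis, the frequency offset is corrected through parallel acquisition, achieving synchronization of the multimodal communication signals. Experimental tests show that the method yields low time interval error (TIE) values and better synchronization, and has good application prospects in multimodal communication.
Keywords: signal synchronization; multimodal communication signals; incremental mining algorithm; sequential big data; communication signals; signal processing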
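Analysis of Medication Patterns of Traditional Chinese Medicine Injections for Blood Stasis Syndrome Based on the Apriori Algorithm (cited by: 2)
17
Authors: 何洁, 胡建新. 《临床合理用药》, 2025, No. 4, pp. 7-12 (6 pages)
Objective: To analyse the medication patterns of traditional Chinese medicine (TCM) injections in treating blood stasis syndrome and provide a reference for rational clinical use. Methods: Research literature on TCM injections for blood stasis syndrome was retrieved by computer from CNKI, the Wanfang Data Knowledge Service Platform, VIP and other databases, covering January 1980 to July 2023. The Apriori algorithm was used to analyse the types of injections, frequency of use, and properties, flavours and meridian tropism, and association rules were applied for data mining. Results: Sixty-seven TCM injections were used to treat blood stasis syndrome, of which Danhong injection was used most frequently. Among single herbs, Danshen was used most often (184 times, 35.32%); by efficacy, blood-activating and stasis-resolving herbs accounted for the largest share (397 times, 44.76%); properties were mainly warm and cold; flavours were mainly bitter, sweet and pungent; and meridian tropism was mainly the liver and heart meridians. Association rule analysis identified core herb pairs such as Danshen-Honghua. Conclusion: Data mining with association rules based on the Apriori algorithm revealed the medication patterns of TCM injections for blood stasis syndrome, which can serve as a reference for the clinical use of TCM injections in this syndrome.
Keywords: blood stasis syndrome; TCM injections; data mining; Apriori algorithm; medication patterns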
Research and Application on Web Information Retrieval Based on Improved FP-Growth Algorithm (cited by: 3)
18
Authors: JIAO Minghai, YAN Ping, JIANG Huiyan. 《Wuhan University Journal of Natural Sciences》 (CAS), 2006, No. 5, pp. 1065-1068 (4 pages)
A singly linked list called an aggregative chain is introduced into the algorithm, improving the structure of the FP-tree. The new FP-tree is a one-way tree in which each node keeps only a pointer to its parent. Route information for different nodes of the same item is compressed into aggregative chains, so that frequent patterns are produced from the aggregative chains without generating node links or conditional pattern bases. An example of Web keyword retrieval is given to analyze and verify the frequent pattern algorithm.
Keywords: data mining; FP-growth algorithm; frequent pattern; aggregative chains; information retrieval
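Simulation of Incremental Low-Complexity Mining of Large-Scale Sequential Big Data
19
Authors: 汪志, 陈霞. 《计算机仿真》, 2025, No. 4, pp. 234-238 (5 pages)
Large-scale sequential big data includes not only structured data but also large amounts of unstructured and semi-structured data; its variety and high velocity increase the complexity of data mining. To improve mining results after new data are added, an incremental mining optimization algorithm for large-scale sequential big data is proposed. Noise points in the sequence data are detected with an iterative correction method and the data are cleaned. A mutual-information-based feature selection algorithm reduces the data dimensionality and strengthens the features. An incremental fuzzy clustering mining algorithm is then optimized using the fuzzy similarity of the original data, which lowers mining complexity while mining the newly added data. Simulations show that the proposed method copes with dynamic changes in large-scale sequential datasets and effectively mines the target information.
Keywords: sequence data; incremental mining; data cleaning; mutual information feature selection; fuzzy clustering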
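Research on Computer Software Engineering Based on Data Mining Technology
20
Author: 郝亚茹. 《移动信息》, 2025, No. 7, pp. 458-460 (3 pages)
As software grows ever larger, computer software engineering faces many challenges, and data mining offers an effective way to address them. This paper designs a computer software engineering experiment based on data mining to analyse software engineering, aiming to explore the patterns, trends and associations that may exist in it. The experimental results provide a valuable reference and insights for computer software engineering.
Keywords: data mining; cluster analysis; association rule mining; sequential pattern mining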